This paper presents a reinforcement learning framework for optimizing energy pricing in peer-to-peer (P2P) energy systems. The framework aims to maximize the profit of every component of a microgrid: consumers, prosumers, the service provider, and a community battery. Experiments on the Pymgrid dataset show that the approach optimizes prices effectively while accounting for the interests of the different components and the effect of community battery capacity.
This paper introduces DaringFed, a novel dynamic Bayesian persuasion pricing mechanism for online federated learning (OFL) that addresses the challenge of two-sided incomplete information (TII) regarding resources. It formulates the interaction between the server and clients as a dynamic signaling and pricing allocation problem within a Bayesian persuasion game and shows that a unique Bayesian persuasion Nash equilibrium exists. Evaluations on real and synthetic datasets show that DaringFed improves accuracy and convergence speed and increases the server's utility.
This study introduces a reinforcement learning (RL) framework using Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC) to optimize the cleaning schedules of photovoltaic panels in arid regions. Applied to a case study in Abu Dhabi, the PPO-based framework demonstrated up to 13% cost savings compared to simulation optimization methods by dynamically adjusting cleaning intervals based on environmental conditions. The research highlights the potential of RL in enhancing the efficiency and reducing the operational costs of solar power generation.
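For readers unfamiliar with PPO, the sketch below shows the standard clipped surrogate objective that PPO maximizes at each policy update. It is a generic illustration, not the study's implementation: the function name `ppo_clipped_objective`, its argument names, and the default `clip_eps=0.2` are assumptions chosen for this example, and nothing here models panel soiling or cleaning costs.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (hypothetical helper).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(ln - lo)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Ratio of 2 with a positive advantage is capped at 1 + clip_eps.
    print(ppo_clipped_objective([math.log(2.0)], [0.0], [1.0]))
```

The clipping removes the incentive to move the policy far from the one that collected the data, which is one reason PPO is often preferred when each environment step (here, a simulated day of panel operation) is expensive.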
Researchers propose a spatio-temporal model for high-resolution wind forecasting in Saudi Arabia using Echo State Networks and stochastic partial differential equations. The model reduces spatial information via energy distance, captures dynamics with a sparse recurrent neural network, and reconstructs data using a non-stationary stochastic partial differential equation approach. The model achieves more accurate forecasts of wind speed and energy, potentially saving up to one million dollars annually compared to existing models.
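The "sparse recurrent neural network" in an Echo State Network is a fixed, randomly generated reservoir; only a linear readout is trained. The sketch below illustrates the reservoir state update alone, under several assumptions that are not from the paper: a scalar input, a leaky `tanh` update, and rescaling the weight matrix by its maximum absolute row sum (a simple sufficient condition for fading memory, used here instead of the more common spectral-radius scaling). The names `make_sparse_reservoir`, `reservoir_step`, and `run_reservoir` are hypothetical, and the readout training and the stochastic partial differential equation reconstruction are omitted.

```python
import math
import random


def make_sparse_reservoir(n, density=0.1, norm=0.9, seed=0):
    """Random sparse n x n matrix rescaled so its max absolute row sum is `norm`."""
    rng = random.Random(seed)
    w = [[rng.uniform(-1.0, 1.0) if rng.random() < density else 0.0
          for _ in range(n)] for _ in range(n)]
    max_row_sum = max(sum(abs(v) for v in row) for row in w)
    if max_row_sum == 0.0:  # degenerate all-zero draw: nothing to rescale
        return w
    scale = norm / max_row_sum
    return [[v * scale for v in row] for row in w]


def reservoir_step(w, w_in, state, x, leak=0.5):
    """One leaky update: h <- (1 - leak) * h + leak * tanh(W h + w_in * x)."""
    new_state = []
    for i, row in enumerate(w):
        pre = sum(wij * sj for wij, sj in zip(row, state)) + w_in[i] * x
        new_state.append((1.0 - leak) * state[i] + leak * math.tanh(pre))
    return new_state


def run_reservoir(w, w_in, inputs, state, leak=0.5):
    """Drive the reservoir with a scalar input sequence; return the final state."""
    for x in inputs:
        state = reservoir_step(w, w_in, state, x, leak)
    return state


if __name__ == "__main__":
    n = 20
    w = make_sparse_reservoir(n)
    rng = random.Random(1)
    w_in = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    inputs = [math.sin(0.1 * t) for t in range(200)]  # stand-in for a wind series
    a = run_reservoir(w, w_in, inputs, [0.0] * n)
    b = run_reservoir(w, w_in, inputs, [0.9] * n)
    # The two runs differ only in their initial state, which is forgotten over time.
    print(max(abs(p - q) for p, q in zip(a, b)))
```

Because the reservoir weights stay fixed, fitting such a model reduces to a linear regression from reservoir states to targets, which keeps training cheap even for long, high-resolution series.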
The paper introduces a novel actor-critic framework called Distillation Policy Optimization that combines on-policy and off-policy data for reinforcement learning. It incorporates variance reduction mechanisms like a unified advantage estimator (UAE) and a residual baseline. The empirical results demonstrate improved sample efficiency for on-policy algorithms, bridging the gap with off-policy methods.