From: Peer-to-peer energy trading optimization in energy communities using multi-agent deep reinforcement learning
Hyperparameter          Value
Train batch size        100
Tau (τ)                 0.005
Gamma (γ)               0.99
Critic learning rate    0.001
Actor learning rate
Policy delay            2
Policy noise            0.2
Noise clip              0.5
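
The hyperparameters listed above can be collected into a small configuration object. The following is a minimal sketch, not the authors' implementation. It assumes the usual roles these names play in TD3-style actor-critic training: tau as the soft target-update coefficient, policy noise and noise clip as the standard deviation and bound of target-policy smoothing noise, and policy delay as the number of critic updates per actor update. The class and function names (`Hyperparameters`, `soft_update`, `smoothing_noise`, `should_update_actor`) are hypothetical, and the actor learning rate is left unset because its value does not appear in the table.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hyperparameters:
    """Values copied from the table; field names are illustrative."""
    train_batch_size: int = 100
    tau: float = 0.005
    gamma: float = 0.99
    critic_lr: float = 0.001
    actor_lr: Optional[float] = None  # not given in the table, left unset
    policy_delay: int = 2
    policy_noise: float = 0.2
    noise_clip: float = 0.5


def soft_update(target: List[float], online: List[float], tau: float) -> List[float]:
    """Polyak averaging (assumed role of tau): target <- tau*online + (1-tau)*target."""
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]


def smoothing_noise(hp: Hyperparameters, rng: random.Random) -> float:
    """Gaussian noise with std policy_noise, clipped to [-noise_clip, noise_clip]."""
    eps = rng.gauss(0.0, hp.policy_noise)
    return max(-hp.noise_clip, min(hp.noise_clip, eps))


def should_update_actor(step: int, hp: Hyperparameters) -> bool:
    """Assumed role of policy delay: update the actor every policy_delay critic steps."""
    return step % hp.policy_delay == 0


if __name__ == "__main__":
    hp = Hyperparameters()
    rng = random.Random(0)
    print(soft_update([0.0, 1.0], [1.0, 1.0], hp.tau))
    print(smoothing_noise(hp, rng))
    print([s for s in range(6) if should_update_actor(s, hp)])
```

With tau = 0.005, each soft update moves the target parameters only 0.5% of the way toward the online parameters, so the targets change slowly between updates.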