Fig. 2
From: Towards reinforcement learning for vulnerability analysis in power-economic systems
Learning curve of the 15 agents over 200,000 training steps, showing the average return and its standard deviation, computed over 10 episodes