ANALYZING THE PERFORMANCE OF DEEP REINFORCEMENT LEARNING ALGORITHMS IN THE STARCRAFT 2 ENVIRONMENT

Authors

  • Sonja Trpovski
  • Saša Lalić

DOI:

https://doi.org/10.24867/02BE34Lalic

Keywords:

Starcraft 2, reinforcement learning, deep learning, A3C, Deep-Q learning

Abstract

This paper studies the performance of deep reinforcement learning algorithms on a subset of problems in the Starcraft 2 environment. The algorithms studied were A3C and Deep-Q Learning. Each algorithm was tested with a different set of training parameters, such as the number of skipped agent steps and the neural network learning rate. Both algorithms respond similarly to parameter changes on problems that do not require a large number of actions to reach an optimal solution; consequently, parameter values that skip a greater number of actions lead to better results given the same training time. Reducing the learning rate degrades the performance of both algorithms on all of the problems. Both algorithms achieved satisfactory results on problems that mostly involve the management of units; however, the results were considerably worse on tasks that included the construction of a base.
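
The "skipped agent steps" parameter in the abstract corresponds to PySC2's step_mul setting [5], which controls how many game steps the environment advances for every agent decision, while the learning rate is an optimizer setting of the neural network. The sketch below is illustrative only and is not the authors' implementation; the map name, the parameter grids, and the agent_step callback are assumptions.

```python
# Illustrative sketch only: maps the paper's two varied parameters onto
# PySC2 [5]. The map name, the parameter grids, and the agent_step
# callback are assumptions, not the authors' setup.
from pysc2.env import sc2_env
from pysc2.lib import features

STEP_MULS = [8, 16, 32]        # game steps advanced per agent decision (assumed grid)
LEARNING_RATES = [1e-3, 1e-4]  # assumed grid for the optimizer learning rate

def make_env(step_mul):
    """Create a mini-game environment; MoveToBeacon is one unit-management map."""
    return sc2_env.SC2Env(
        map_name="MoveToBeacon",
        players=[sc2_env.Agent(sc2_env.Race.terran)],
        agent_interface_format=features.AgentInterfaceFormat(
            feature_dimensions=features.Dimensions(screen=84, minimap=64)),
        step_mul=step_mul,  # larger value -> fewer agent decisions per game second
    )

def run_episode(env, agent_step):
    """Roll out one episode; agent_step maps an observation to a pysc2 FunctionCall."""
    timesteps = env.reset()
    total_reward = 0.0
    while not timesteps[0].last():
        action = agent_step(timesteps[0])
        timesteps = env.step([action])
        total_reward += timesteps[0].reward
    return total_reward
```

Under a fixed training-time budget, a larger step_mul lets the agent complete more episodes at a coarser temporal resolution, which is consistent with the abstract's observation that skipping more actions improves results on problems with short optimal action sequences.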

References

[1] A. Basel, P. G. Keerthana, “Asynchronous Advantage Actor-Critic Agent for Starcraft II”, 22.07.2018.
[2] Starcraft 2, Windows PC version, Blizzard Entertainment, 2010.
[3] O. Vinyals, T. Ewalds, S. Bartunov, P. Georgiev, “StarCraft II: A New Challenge for Reinforcement Learning”, 16.08.2017.
[4] S. Wender, I. Watson, “Applying Reinforcement Learning to Small Scale Combat in the Real-Time Strategy Game StarCraft: Broodwar”, 2012.
[5] PySC2: StarCraft II Learning Environment, https://github.com/deepmind/pysc2
[6] R. Ring, “Replicating DeepMind StarCraft II Reinforcement Learning Benchmark with Actor-Critic Methods”, 2018.
[7] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, “TensorFlow: Large-scale machine learning on heterogeneous systems”, 2015. Software available from tensorflow.org.
[8] NVIDIA cuDNN, https://developer.nvidia.com/cudnn
[9] L. Kaelbling, M. Littman, A. Moore, “Reinforcement Learning: A Survey”, Journal of Artificial Intelligence Research, vol. 4, pp. 237–285, 1996.
[10] A. Juliani, “Asynchronous Actor-Critic Agents (A3C)”, 17.12.2016.

Published

2019-03-10

Section

Electrotechnical and Computer Engineering