USING REINFORCEMENT LEARNING TO TRAIN AN AGENT FOR THE Obstacle Tower ENVIRONMENT

Authors

  • Predrag Njegovanović Author

DOI:

https://doi.org/10.24867/06BE06Njegovanovic

Keywords:

Reinforcement learning, three-dimensional environment, environment control, navigation, sparse rewards, neural network

Abstract

With artificial intelligence on the rise, reinforcement learning is a favorable field for new research. One problem that researchers have tried to solve over the last year or two is environment control, or navigation. This paper presents one solution to the problem of navigation and generalization in a three-dimensional environment with sparse rewards, based on an autonomous agent built with deep learning techniques. The agent's performance was evaluated by comparing it with human performance and with results already reported in the related scientific papers.

Published

2019-12-21

Section

Electrotechnical and Computer Engineering