USING REINFORCEMENT LEARNING TO TRAIN AN AGENT FOR THE CarRacing-v0 ENVIRONMENT

Authors

  • Novica Šarenac Author

DOI:

https://doi.org/10.24867/06BE05Sarenac

Keywords:

reinforcement learning, CarRacing-v0, Deep Q-Network, Advantage Actor Critic, Asynchronous Advantage Actor Critic

Abstract

This paper presents the training and evaluation of an agent for autonomous driving in the OpenAI Gym environment CarRacing-v0. The environment is a top-down view of a racing track. The agent is trained using reinforcement learning techniques. The algorithms are compared in terms of the results achieved in the environment, training time, and implementation details. The implemented and evaluated algorithms are Deep Q-Network (DQN), Advantage Actor Critic (A2C), and Asynchronous Advantage Actor Critic (A3C).

Published

2019-12-21

Issue

Section

Electrotechnical and Computer Engineering