USING REINFORCEMENT LEARNING TO TRAIN AN AGENT FOR THE CarRacing-v0 ENVIRONMENT
DOI:
https://doi.org/10.24867/06BE05Sarenac
Keywords:
reinforcement learning, CarRacing-v0, Deep Q-Network, Advantage Actor Critic, Asynchronous Advantage Actor Critic
Abstract
This paper presents the training and evaluation of an agent for autonomous driving in the OpenAI Gym environment CarRacing-v0. The environment provides a top-down view of a racing track. The agent is trained using reinforcement learning techniques. The algorithms are compared in terms of the results achieved in the environment, training time, and implementation details. The algorithms implemented and evaluated are Deep Q-Network (DQN), Advantage Actor Critic (A2C), and Asynchronous Advantage Actor Critic (A3C).
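As a rough illustration of how the value-based method (DQN) and the actor-critic methods (A2C, A3C) named above differ in what they learn from, the following minimal sketch computes a one-step DQN target and rollout-based advantage estimates. It is not taken from the paper: the function names, the discount factor `gamma`, and the use of plain Python lists are assumptions made for illustration only.

```python
from typing import List, Sequence


def dqn_target(reward: float, next_q_values: Sequence[float],
               done: bool, gamma: float = 0.99) -> float:
    """One-step DQN target: r + gamma * max over a' of Q(s', a').

    next_q_values are the (target-network) Q-values of the next state;
    on a terminal transition there is nothing to bootstrap from.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def discounted_returns(rewards: Sequence[float], bootstrap_value: float,
                       gamma: float = 0.99) -> List[float]:
    """Discounted returns over a rollout, bootstrapped from V(s_T)."""
    returns = []
    running = bootstrap_value
    # Walk the rollout backwards so each step reuses the next step's return.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(rewards: Sequence[float], values: Sequence[float],
               bootstrap_value: float, gamma: float = 0.99) -> List[float]:
    """Advantage estimates used by A2C/A3C: A_t = R_t - V(s_t)."""
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    return [ret - v for ret, v in zip(returns, values)]
```

In this sketch, A2C and A3C would share the same advantage computation and differ only in how rollouts are collected: synchronously from parallel environments, or asynchronously by independent workers.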
References
[1] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, D. Hassabis, "Human-level control through deep reinforcement learning"
[2] V. Mnih, A. P. Badia, M. Mirza, A. Graves, T. Harley, T. P. Lillicrap, D. Silver, K. Kavukcuoglu, "Asynchronous Methods for Deep Reinforcement Learning"
[3] https://gym.openai.com/envs/CarRacing-v0/ [accessed 7 September 2019]
[4] S. Ruder, "An overview of gradient descent optimization algorithms"
[5] D. P. Kingma, J. L. Ba, "Adam: A Method for Stochastic Optimization"
Published
2019-12-21
Section
Electrotechnical and Computer Engineering