SELECTION OF HYPERPARAMETERS OF DEEP REINFORCEMENT LEARNING ALGORITHMS USING A GENETIC ALGORITHM

Authors

  • Vasilije Pantić (Author)

DOI:

https://doi.org/10.24867/15BE28Pantic

Keywords:

Deep Reinforcement Learning, Genetic Algorithm

Abstract

This paper addresses the problem of robot walking in space using deep reinforcement learning algorithms whose hyperparameters are selected with a genetic algorithm.

References

[1] Schulman, John, et al. "Trust region policy optimization." International conference on machine learning. PMLR, 2015.
[2] Fletcher, Roger. "Conjugate gradient methods for indefinite systems." Numerical analysis. Springer, Berlin, Heidelberg, 1976. 73-89.
[3] Schulman, John, et al. "Proximal policy optimization algorithms." arXiv preprint arXiv:1707.06347 (2017).
[4] https://gym.openai.com/
[5] https://pybullet.org/wordpress/
[6] https://pytorch.org/
[7] https://github.com/reinai/HumanoidRobotWalk
[8] https://github.com/sovaso/GeneticAlgorithmForHumanoidRobotWalk
[9] Reynolds, Douglas A. "Gaussian mixture models." Encyclopedia of biometrics 741 (2009): 659-663.
[10] Gao, Bolin, and Lacra Pavel. "On the properties of the softmax function with application in game theory and reinforcement learning." arXiv preprint arXiv:1704.00805 (2017).

Published

2021-11-08

Issue

Section

Electrotechnical and Computer Engineering