Electrotechnical and Computer Engineering

Vol. 36 No. 11 (2021): Proceedings of the Faculty of Technical Sciences

SELECTION OF HYPERPARAMETERS OF DEEP REINFORCEMENT LEARNING ALGORITHMS USING A GENETIC ALGORITHM

  • Vasilije Pantić
DOI:
https://doi.org/10.24867/15BE28Pantic
Submitted
November 8, 2021
Published
November 8, 2021

Abstract

This paper addresses the problem of robot walking in space using deep reinforcement learning algorithms whose hyperparameters are selected by a genetic algorithm.
