DESIGN OF ADAPTIVE CONTROLLERS BY MEANS OF THE PPO ALGORITHM USING MATLAB

Authors

  • Вељко Радоjичић Author

DOI:

https://doi.org/10.24867/30BE21Radojicic

Keywords:

PPO, agent, controller

Abstract

This paper explores the use of the Proximal Policy Optimization (PPO) reinforcement learning algorithm for the control of continuous dynamic systems. A reinforcement learning agent was trained on relatively simple linear and nonlinear example systems using MATLAB’s Reinforcement Learning Designer application, while Simulink was used for simulation and for presenting the results.

References

[1] Richard S. Sutton, Andrew G. Barto, “Reinforcement Learning: An Introduction, Second Edition”, A Bradford Book, The MIT Press, Cambridge, Massachusetts, London, England, 2014-2015.
[2] John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov, “Proximal Policy Optimization Algorithms”, https://arxiv.org/abs/1707.06347, 2017.
[3] John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel, “High-Dimensional Continuous Control Using Generalized Advantage Estimation”, ICLR, 2016.
[4] Nai-Chieh Huang, Ping-Chun Hsieh, Kuo-Hao Ho, I-Chen Wu, “PPO-Clip Attains Global Optimality: Towards Deeper Understandings of Clipping”, Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan, https://arxiv.org/abs/2312.12065, 2024.
[5] Wouter van Heeswijk, “Policy Gradients In Reinforcement Learning Explained”, Medium, 2022.

Published

2025-04-04

Issue

Section

Electrotechnical and Computer Engineering