REINFORCEMENT LEARNING WITH SAFETY MECHANISMS – CASE STUDY: A SAFETY INTERRUPT

Authors

  • Miloš Pavlić Autor

DOI:

https://doi.org/10.24867/11BE02Pavlic

Keywords:

Reinforcement Learning, Neural Networks, Safe Interruptibility

Abstract

Previous research has proposed various rein­forcement learning algorithms. However, scientists suggest more research should be put into the safety mechanisms of these algorithms. There have been several efforts to turn these mechanisms into technical specifications to make direct progress in this field possible. The focus of this paper is to test reinforcement learning algorithms and their modifications (DQN, A2C, and SAC Discrete) for safe interruptibility. The environment the algorithms were trained in is proposed in AI Safety Gridworlds paper. The results show that the tested reinforcement learning algorithms are all safely interruptible with hyperparameter settings proposed in this paper. The only limitation is that training has to be monitored and stopped in the right moment for algorithms to be safely interruptible.

References

[1] LEIKE, Jan, et al. AI safety gridworlds. arXiv preprint arXiv:1711.09883, 2017.
[2] MNIH, Volodymyr, et al. Playing atari with deep reinforce-ment learning. arXiv preprint arXiv:1312.5602, 2013.
[3] SUTTON, Richard S., et al. Introduction to reinforcement learning. Cambridge: MIT press, 1998.
[4] WILLIAMS, Ronald J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 1992, 8.3-4: 229-256.
[5] ORSEAU, Laurent; ARMSTRONG, M. S. Safely interruptible agents. 2016.
[6] HUTTER, Marcus. Universal artificial intelligence: Sequential decisions based on algorithmic probability. Springer Science & Business Media, 2004.
[7] HADFIELD-MENELL, Dylan, et al. The off-switch game. arXiv preprint arXiv:1611.08219, 2016.
[8] RIEDL, Mark O.; HARRISON, Brent. Enter the matrix: A virtual world approach to safely interruptable autonomous systems. arXiv preprint arXiv:1703.10284, 2017.
[9] MNIH, Volodymyr, et al. Asynchronous methods for deep reinforcement learning. In: International conference on machine learning. 2016. p. 1928-1937.
[10] HAARNOJA, Tuomas, et al. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290, 2018.
[11] CHRISTODOULOU, Petros. Soft actor-critic for discrete action settings. arXiv preprint arXiv:1910.07207, 2019.
[12] IOFFE, Sergey; SZEGEDY, Christian. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.

Published

2020-12-22

Issue

Section

Electrotechnical and Computer Engineering