Abstract: Learning automata are reinforcement learners belonging to the class of policy iterators. They have already been shown to exhibit nice convergence properties in a wide range of discrete action game settings. Recently, a new formulation for a continuous action reinforcement learning automaton (CARLA) was proposed. In this paper, we study the behavior of these CARLA in continuous action games and propose a novel method for coordinated exploration of the joint-action space. Our method allows a team of independent learners, using CARLA, to find the optimal joint action in common interest settings. We first show that independent agents using CARLA will converge to a local optimum of the continuous action game. We then introduce a method for coordinated exploration which allows the team of agents to find the global optimum of the game. We validate our approach in a number of experiments.
Abdel Rodríguez, Peter Vrancx, Ricardo Grau & Ann Nowé. 2016. A reinforcement learning approach to coordinate exploration with limited communication in continuous action games. The Knowledge Engineering Review 31(1), 77–95. doi: 10.1017/S026988891500020X
This is a common assumption in control applications.
The original formulation is J(Λ) = ∫_Λ R(a, Λ) df(a), but for consistency we adapted this formulation to the notation used in this paper.
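As an illustration only (not taken from the paper), an expected-reward objective of the form J = ∫ R(a) df(a) can be approximated by Monte Carlo: average the reward over actions sampled from the automaton's action density f. The reward function and uniform action density below are hypothetical stand-ins.

```python
import random

def expected_reward(reward, sample_action, n=100_000):
    """Monte Carlo estimate of J = integral of R(a) df(a):
    average the reward over actions drawn from the density f."""
    return sum(reward(sample_action()) for _ in range(n)) / n

# Hypothetical example: actions drawn uniformly on [0, 1],
# reward peaked at a = 0.5 (exact value of J is 0.75).
reward = lambda a: 1.0 - abs(a - 0.5)
estimate = expected_reward(reward, random.random)
```

In the CARLA setting, the sampling density f would be the automaton's current (nonstationary) action distribution rather than a fixed uniform density.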
A conflicting interest version of ESRL also exists; however, since we only use common interest settings in this paper, we describe only the common interest version.
ESRL can deal perfectly well with games with stochastic payoffs, but we illustrate the idea with a deterministic example.