2016 Volume 31
RESEARCH ARTICLE   Open Access    

A reinforcement learning approach to coordinate exploration with limited communication in continuous action games

Abstract: Learning automata are reinforcement learners belonging to the class of policy iterators. They have already been shown to exhibit nice convergence properties in a wide range of discrete action game settings. Recently, a new formulation for continuous action reinforcement learning automata (CARLA) was proposed. In this paper, we study the behavior of these CARLA in continuous action games and propose a novel method for coordinated exploration of the joint-action space. Our method allows a team of independent learners, using CARLA, to find the optimal joint action in common interest settings. We first show that independent agents using CARLA will converge to a local optimum of the continuous action game. We then introduce a method for coordinated exploration which allows the team of agents to find the global optimum of the game. We validate our approach in a number of experiments.
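The single-agent learning scheme underlying the abstract can be sketched as follows: a CARLA-style learner maintains a probability density over a continuous action interval, samples an action from it, and reinforces a neighbourhood of that action in proportion to the received reward. This is an illustrative simplification on a discretized grid, not the paper's exact update rule; the function name `carla_run` and the parameters `spread` and `rate` are assumptions for the sketch.

```python
import numpy as np

def carla_run(reward_fn, n_steps=2000, n_bins=200, spread=0.02, rate=0.1, seed=0):
    """Simplified CARLA-style loop on the normalized action interval [0, 1].

    Maintains a discretized action density; after each pull, a Gaussian
    bump centred on the chosen action is added, scaled by the reward,
    and the density is renormalised.  Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, n_bins)
    density = np.ones(n_bins) / n_bins            # start from a uniform density

    for _ in range(n_steps):
        # sample an action via the inverse CDF of the current density
        cdf = np.cumsum(density)
        idx = min(np.searchsorted(cdf, rng.random()), n_bins - 1)
        action = grid[idx]
        reward = reward_fn(action)                # environment feedback in [0, 1]
        # reinforce a neighbourhood of the chosen action
        bump = np.exp(-0.5 * ((grid - action) / spread) ** 2)
        density = density + rate * reward * bump
        density /= density.sum()                  # keep it a valid distribution

    return grid[np.argmax(density)]               # current most likely action

# Single-agent example with a unimodal reward that peaks at a = 0.7.
best = carla_run(lambda a: np.exp(-50.0 * (a - 0.7) ** 2))
```

With a unimodal reward the density concentrates around the optimum; the paper's point is that with independent learners and multimodal joint rewards, such a scheme only reaches a local optimum, which is what the proposed coordinated exploration addresses.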
References

Bush, R. & Mosteller, F. 1955. Stochastic Models for Learning. Wiley.

Castelletti, A., Pianosi, F. & Restelli, M. 2012. Tree-based fitted Q-iteration for multi-objective Markov decision problems. In IJCNN, 1–8. IEEE.

Claus, C. & Boutilier, C. 1998. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the National Conference on Artificial Intelligence (AAAI-98), 746–752.

Hilgard, E. 1948. Theories of Learning. Appleton-Century-Crofts.

Hilgard, E. & Bower, B. 1966. Theories of Learning. Prentice Hall.

Howell, M. & Best, M. 2000. On-line PID tuning for engine idle-speed control using continuous action reinforcement learning automata. Control Engineering Practice 8(2), 147–154.

Howell, M., Frost, G., Gordon, T. & Wu, Q. 1997. Continuous action reinforcement learning applied to vehicle suspension control. Mechatronics 7(3), 263–276.

Kapetanakis, S., Kudenko, D. & Strens, M. 2003. Learning to coordinate using commitment sequences in cooperative multiagent systems. In Proceedings of the Third Symposium on Adaptive Agents and Multiagent Systems (AAMAS-03), 2004.

Parzen, E. 1960. Modern Probability Theory and Its Applications, Wiley Classics Edition. Wiley-Interscience.

Rodríguez, A., Grau, R. & Nowé, A. 2011. Continuous action reinforcement learning automata: performance and convergence. In Proceedings of the Third International Conference on Agents and Artificial Intelligence, Filipe, J. & Fred, A. (eds). SciTePress, 473–478.

Thathachar, M. & Sastry, P. 2004. Networks of Learning Automata: Techniques for Online Stochastic Optimization. Kluwer Academic Publishers.

Tsetlin, M. 1961. The behavior of finite automata in random media. Avtomatika i Telemekhanika 22, 1345–1354.

Tsetlin, M. 1962. The behavior of finite automata in random media. Avtomatika i Telemekhanika 22, 1210–1219.

Tsypkin, Y. 1971. Adaptation and Learning in Automatic Systems. Academic Press.

Tsypkin, Y. 1973. Foundations of the Theory of Learning Systems. Academic Press.

Veelen, M. & Spreij, P. 2009. Evolution in games with a continuous action space. Economic Theory 39(3), 355–376.

Verbeeck, K. 2004. Coordinated Exploration in Multi-Agent Reinforcement Learning. PhD thesis, Vrije Universiteit Brussel, Faculteit Wetenschappen, DINF, Computational Modeling Lab, September.

Vrabie, D., Pastravanu, O., Abu-Khalaf, M. & Lewis, F. 2009. Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica 45(2), 477–484.

• Cite this article

    Abdel Rodríguez, Peter Vrancx, Ricardo Grau & Ann Nowé. 2016. A reinforcement learning approach to coordinate exploration with limited communication in continuous action games. The Knowledge Engineering Review 31(1), 77–95. doi: 10.1017/S026988891500020X


Footnotes

    • This is a common assumption in control applications.

    • The original formulation is J(Λ) = ∫_Λ R(a, Λ) df(a), but for consistency we adapt this formulation to the notation used in this paper.

    • A conflicting-interest version of ESRL also exists; however, as we only use common interest settings in this paper, we describe only the common interest version.

    • ESRL can also handle games with stochastic payoffs, but we illustrate the idea on a deterministic example.

    • © Cambridge University Press, 2016