RESEARCH ARTICLE   Open Access    

Individual evolutionary learning with many agents

  • Corresponding authors: Jasmina Arifovic; John Ledyard
The Knowledge Engineering Review, 2012, 27(2): 239–254

Abstract: Individual Evolutionary Learning (IEL) is a learning model based on the evolution of a population of strategies of an individual agent. In prior work, IEL has been shown to be consistent with the behavior of human subjects in games with a small number of agents. In this paper, we examine the performance of IEL in games with many agents. We find IEL to be robust to this type of scaling. With the appropriate linear adjustment of the mechanism parameter, the convergence behavior of IEL in games induced by Groves–Ledyard mechanisms in quadratic environments is independent of the number of participants.

    • We thank Olena Kostyshyna for her very able research assistance. We also thank an anonymous referee for very helpful comments.

    • These will be defined more precisely later.

    • This section is intended mainly as a reminder to the reader of the formal structure of the problem. For more details, see Groves and Ledyard (1977) or Chen and Plott (1996).

    • Adaptive learning is defined in Milgrom and Roberts (1990) and includes best response, fictitious play, Bayesian learning, and others. The sufficient condition for convergence under adaptive learning is $\partial^2 V^i / \partial m^i \,\partial m^j \geqslant 0$.

    • We thank Paul Healy for the Gabay–Moulin reference. See Healy (2006) for a use of the theorem in the context of public good mechanism design. The diagonal condition is satisfied if $|\partial^2 W^i / \partial (m^i)^2| \gt \sum_{j \neq i} |\partial^2 W^i / \partial m^i \,\partial m^j|$.

    • This follows from the Gabay–Moulin condition since the left-hand side of the inequality goes to zero as $N \rightarrow \infty$, while the right-hand side is bounded away from zero.
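The diagonal condition in the footnotes above is straightforward to check numerically for any given matrix of cross-partials. The sketch below is ours, not the paper's: the function name and the toy matrix are purely illustrative, with entry $(i, j)$ standing in for $\partial^2 W^i / \partial m^i \,\partial m^j$.

```python
import numpy as np

def diagonally_dominant(H):
    """Return True if every row of H satisfies strict diagonal dominance:
    |H[i, i]| > sum over j != i of |H[i, j]|."""
    H = np.asarray(H, dtype=float)
    diag = np.abs(np.diag(H))
    off = np.abs(H).sum(axis=1) - diag  # row sums of off-diagonal magnitudes
    return bool(np.all(diag > off))

# Toy example: strong own-effects, weak cross-effects, so dominance holds.
H = np.array([[-2.0, 0.5, 0.5],
              [0.4, -2.0, 0.4],
              [0.3, 0.3, -2.0]])
print(diagonally_dominant(H))  # True
```

As the footnote notes, when the own second derivatives shrink with $N$ while the cross-effects do not, such a check will eventually fail for large enough $N$.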

    • It can be shown that $\lim_{k \rightarrow \infty} \, u^i(\hat{m}(kN\hat{\gamma})) \, - \, u^i(\hat{m}(kN\hat{\gamma})/\mu_{-i}) = 0$.

    • See Watkins (1989).

    • A number of applications in economics use the genetic algorithm (developed by Holland, 1970, 1974) to implement this idea. For example, see Arifovic (1996), Miller (1996), Marks (1998), Vriend (2000), and Lux and Schornstein (2005).

    • Our approach follows most closely Arifovic (1994), but there have been a number of other individual learning applications, for example, Marimon et al. (1990), Vriend (2000), Lux and Hommes (2008).

    • Since we use the same algorithm as in our simulations with N = 5, our description closely follows the behavioral model presented in Arifovic and Ledyard (2011).

    • J is a free parameter of IEL. In this paper we set J = 200.

    • ρ is a free parameter of the behavioral model. In this paper we set ρ = 0.033, exactly the same number we have used in our other IEL papers.

    • An alternative selection model is the probabilistic choice function $\pi(\theta^k) = e^{\lambda u^i(\theta^k)} / \sum_j e^{\lambda u^i(\theta^j)}$. We have found (see e.g. Arifovic & Ledyard, 2011) that, for all λ, the behavior this model predicts differs very little from that of our proportional selection rule. This is because the set A tends to become homogeneous fairly fast, at which point the selection rule is irrelevant.

    • This implies that, if a set contains negative forgone utilities, payoffs are normalized by adding to each payoff a constant equal, in absolute value, to the lowest payoff in the set.

    • See Camerer and Chong (2004).

    • Our subjects were given a ‘what-if’ calculator that they could use prior to and during the beginning of an experiment. Chen and Tang's subjects did not have access to such a tool.

    • This was the range of values used in both the Arifovic–Ledyard and Chen–Tang experiments.

    • It is worth pointing out that we use our learning algorithm to ‘locate’ equilibria. This is different from solving for an evolutionarily stable equilibrium, where one takes an existing equilibrium and asks whether it is stable with respect to best-response dynamics. We thank our anonymous referee for this remark.

    • The maximum number of periods for each run was set at tmax = 1000. If the convergence criterion is not fulfilled by that time, the run is terminated. All of our runs, for all γ's and N's, converged within 1000 periods.

    • Copyright © 2012 Cambridge University Press
Cite this article: Jasmina Arifovic, John Ledyard. 2012. Individual evolutionary learning with many agents. The Knowledge Engineering Review 27(2): 239–254. doi: 10.1017/S026988891200015X