RESEARCH ARTICLE   Open Access    

Individual evolutionary learning with many agents

  • Corresponding authors: Jasmina Arifovic; John Ledyard
The Knowledge Engineering Review, 2012, 27(2): 239–254

Abstract: Individual Evolutionary Learning (IEL) is a learning model based on the evolution of a population of strategies of an individual agent. In prior work, IEL has been shown to be consistent with the behavior of human subjects in games with a small number of agents. In this paper, we examine the performance of IEL in games with many agents. We find IEL to be robust to this type of scaling. With the appropriate linear adjustment of the mechanism parameter, the convergence behavior of IEL in games induced by Groves–Ledyard mechanisms in quadratic environments is independent of the number of participants.

    • We thank Olena Kostyshyna for her very able research assistance. We also thank an anonymous referee for very helpful comments.

    • These will be defined more precisely later.

    • This section is intended mainly as a reminder to the reader of the formal structure of the problem. For more details, see Groves and Ledyard (1977) or Chen and Plott (1996).

    • Adaptive learning is defined in Milgrom and Roberts (1990) and includes best response, fictitious play, Bayesian learning, and others. The sufficient condition for convergence under adaptive learning is $\partial^2 V^i / \partial m^i \,\partial m^j \geqslant 0$.

    • We thank Paul Healy for the Gabay–Moulin reference. See Healy (2006) for a use of the theorem in the context of public good mechanism design. The diagonal condition is satisfied if $|\partial^2 W^i / \partial (m^i)^2| \gt \sum_{j \neq i} |\partial^2 W^i / \partial m^i \,\partial m^j|$.

    • This follows from the Gabay–Moulin condition since the left-hand side of the inequality goes to zero as $N \rightarrow \infty$, while the right-hand side is bounded away from zero.
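The diagonal condition in the footnotes above is straightforward to check numerically for any given matrix of cross-partials. The sketch below is ours, not the paper's: the function name and the toy matrix are purely illustrative, with entry $(i, j)$ standing in for $\partial^2 W^i / \partial m^i \,\partial m^j$.

```python
import numpy as np

def diagonally_dominant(H):
    """Return True if every row of H satisfies strict diagonal dominance:
    |H[i, i]| > sum over j != i of |H[i, j]|."""
    H = np.asarray(H, dtype=float)
    diag = np.abs(np.diag(H))
    off = np.abs(H).sum(axis=1) - diag  # row sums of off-diagonal magnitudes
    return bool(np.all(diag > off))

# Toy example: strong own-effects, weak cross-effects, so dominance holds.
H = np.array([[-2.0, 0.5, 0.5],
              [0.4, -2.0, 0.4],
              [0.3, 0.3, -2.0]])
print(diagonally_dominant(H))  # True
```

As the footnote notes, when the own second derivatives shrink with $N$ while the cross-effects do not, such a check will eventually fail for large enough $N$.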

    • It can be shown that $\lim_{k \rightarrow \infty} \, u^i(\hat{m}(kN\hat{\gamma})) \, - \, u^i(\hat{m}(kN\hat{\gamma})/\mu_{-i}) = 0$.

    • See Watkins (1989).

    • A number of applications in economics use the genetic algorithm (developed by Holland, 1970, 1974) to implement this idea. For example, see Arifovic (1996), Miller (1996), Marks (1998), Vriend (2000), and Lux and Schornstein (2005).

    • Our approach follows most closely Arifovic (1994), but there have been a number of other individual learning applications, for example, Marimon et al. (1990), Vriend (2000), Lux and Hommes (2008).

    • Since we use the same algorithm as in our simulations with N = 5, our description closely follows the behavioral model presented in Arifovic and Ledyard (2011).

    • J is a free parameter of IEL. In this paper we set J = 200.

    • ρ is a free parameter of the behavioral model. In this paper we set ρ = 0.033, exactly the same number we have used in our other IEL papers.

    • An alternative selection model is the probabilistic choice function $\pi(\theta^k) = e^{\lambda u^i(\theta^k)} / \sum_j e^{\lambda u^i(\theta^j)}$. We have found (see e.g. Arifovic & Ledyard, 2011) that, for all λ, the behavior this model predicts differs very little from that of our proportional selection rule. This is because the set A tends to become homogeneous fairly fast, at which point the selection rule is irrelevant.

    • This implies that, if a set contains negative forgone utilities, payoffs are normalized by adding to each payoff a constant equal, in absolute value, to the lowest payoff in the set.

    • See Camerer and Chong (2004).

    • Our subjects were given a ‘what-if’ calculator that they could use prior to and during the beginning of an experiment. Chen and Tang's subjects did not have access to such a tool.

    • This was the range of values used in both the Arifovic–Ledyard and Chen–Tang experiments.

    • It is worth pointing out that we use our learning algorithm to ‘locate’ equilibria. This is different from solving for an evolutionarily stable equilibrium, where one takes an existing equilibrium and asks whether it is stable with respect to best-response dynamics. We thank our anonymous referee for this remark.

    • The maximum number of periods for each run was set at tmax = 1000. If the convergence criterion is not fulfilled by that time, the run is terminated. All of our runs, for all γ's and N's, converged within 1000 periods.

    • Copyright © 2012 Cambridge University Press
Cite this article: Jasmina Arifovic, John Ledyard. 2012. Individual evolutionary learning with many agents. The Knowledge Engineering Review 27(2): 239–254. doi: 10.1017/S026988891200015X