A utility-based analysis of equilibria in multi-objective normal-form games

Roxana Rădulescu; Patrick Mannion; Yijie Zhang; Diederik M. Roijers; Ann Nowé; Roxana Rădulescu; Patrick Mannion; Yijie Zhang; Diederik M. Roijers; Ann Nowé

doi:10.1017/S0269888920000351

2020 Volume 35

Article Contents

Next Previous

ORIGINAL RESEARCH Open Access

A utility-based analysis of equilibria in multi-objective normal-form games

¹Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 2, Brussels1050, Belgium, e-mails: roxana.radulescu@vub.be, ann.nowe@vub.be
²School of Computer Science, National University of Ireland Galway, GalwayH91 TK33, Ireland, e-mail: patrick.mannion@nuigalway.ie
³Universiteit van Amsterdam, Amsterdam, The Netherlands, e-mail: yijie.zhang@student.uva.nl
⁴Microsystems Technology, HU University of Applied Sciences Utrecht, Heidelberglaan 15, 3584CSUtrecht, The Netherlands, e-mail: diederik.yamamoto-roijers@hu.nl
This article extends an earlier unpublished paper (Rădulescu et al., 2019) that was originally presented at the Adaptive and Learning Agents Workshop 2019.

More Information

Received: 16 January 2020
Revised: 20 May 2020
Accepted: 20 May 2020
Published online: 30 June 2020
The Knowledge Engineering Review 35, Article number: e32 (2020) | Cite this article

Abstract

Abstract: In multi-objective multi-agent systems (MOMASs), agents explicitly consider the possible trade-offs between conflicting objective functions. We argue that compromises between competing objectives in MOMAS should be analyzed on the basis of the utility that these compromises have for the users of a system, where an agent’s utility function maps their payoff vectors to scalar utility values. This utility-based approach naturally leads to two different optimization criteria for agents in a MOMAS: expected scalarized returns (ESRs) and scalarized expected returns (SERs). In this article, we explore the differences between these two criteria using the framework of multi-objective normal-form games (MONFGs). We demonstrate that the choice of optimization criterion (ESR or SER) can radically alter the set of equilibria in a MONFG when nonlinear utility functions are used.
Rights and permissions
© The Author(s), 2020. Published by Cambridge University Press2020Cambridge University Press

References

Arifovic , J., Boitnott , J. F. & Duffy , J.2016. Learning correlated equilibria: an evolutionary approach. Journal of Economic Behavior & Organization 157, 171–190.

Google Scholar

Aumann , R. J.1974. Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics1(1), 67–96.

Google Scholar

Aumann , R. J.1987. Correlated equilibrium as an expression of bayesian rationality. Econometrica: Journal of the Econometric Society 1, 1–18.

Google Scholar

Bergstresser , K. and Yu , P.1977. Domination structures and multicriteria problems in n-person games. Theory and Decision8(1), 5–48.

Google Scholar

Blackwell , D.et al.1956. An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics6(1), 1–8.

Google Scholar

Borm , P., Tijs , S. & van den Aarssen , J.1990. Pareto equilibria in multi-objective games. Methods of Operations Research60, 303–312.

Google Scholar

Colby , M. & Tumer , K.2015. An evolutionary game theoretic analysis of difference evaluation functions. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, 1391–1398. ACM.

Google Scholar

Devlin , S. & Kudenko , D.2011. Theoretical considerations of potential-based reward shaping for multi-agent systems. In Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 225–232.

Google Scholar

Foster , D. P. & Vohra , R.1999. Regret in the on-line decision problem. Games and Economic Behavior29 (1–2), 7–35.

Google Scholar

Fudenberg , D. & Kreps , D. M.1993. Learning mixed equilibria. Games and Economic Behavior5 (3), 320–367. ISSN 0899-8256.

Google Scholar

Hart , S. & Schmeidler , D.1989. Existence of correlated equilibria. Mathematics of Operations Research14(1), 18–25.

Google Scholar

Igarashi , A. & Roijers , D. M.2017. Multi-criteria coalition formation games. In International Conference on Algorithmic DecisionTheory, 197–213. Springer.

Google Scholar

Jensen , J. L. W. V.et al.1906. Sur les fonctions convexes et les inégalités entre les valeurs moyennes. Actamathematica30, 175–193.

Google Scholar

Lozan , V. & Ungureanu , V.2013. Computing the pareto-nash equilibrium set in finite multi-objective mixed-strategy games. Computer Science Journal of Moldova, 21 (2).

Google Scholar

Lozovanu , D., Solomon , D. & Zelikovsky , A.2005. Multiobjective games and determining pareto-nashequilibria. Buletinul Academiei de Ştiinţe a Republicii Moldova. Matematica, (3), 115–122.

Google Scholar

Mannion , P., Devlin , S., Duggan , J. & Howley , E.2018. Reward shaping for knowledge-based multi-objective multi-agent reinforcement learning. The Knowledge Engineering Review33, e23.

Google Scholar

Mannion , P., Devlin , S., Mason , K., Duggan , J. & Howley , E.2017a. Policy invariance under reward transformations for multi-objective reinforcement learning. Neurocomputing263, 60–73.

Google Scholar

Mannion , P., Duggan , J. & Howley , E.2016a. An experimental review of reinforcement learning algorithms for adaptive traffic signal control. In Autonomic Road Transport Support Systems, McCluskey , L. T., Kotsialos , A., Müller , P. J., Klügl , F., Rana , O. & Schumann , R. (eds), 47–66. Springer International Publishing.

Google Scholar

Mannion , P., Duggan , J. & Howley , E.2017b. A theoretical and empirical analysis of reward transformations in multi-objective stochastic games. In Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2017b.

Google Scholar

Mannion , P., Mason , K., Devlin , S., Duggan , J. & Howley , E. 2016b. Multi-objective dynamic dispatch optimisation using multi-agent reinforcement learning. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2016b.

Google Scholar

Mossalam , H., Assael , Y. M., Roijers , D. M. & Whiteson , S.2016. Multi-objective deep reinforcement learning. In NIPS Workshop on Deep Reinforcement Learning.

Google Scholar

Nash , J.1950. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences36(1), 48–49. ISSN 0027-8424.

Google Scholar

Nash , J.1951. Non-cooperative games. Annals of Mathematics54(2), 286–295.

Google Scholar

Papadimitriou , C. H. & Roughgarden , T.2008. Computing correlated equilibria in multi-player games. Journal of the ACM (JACM)55(3), 14.

Google Scholar

Rădulescu , R., Legrand , M., Efthymiadis , K., Roijers , D. M. & Nowé , A.2018. Deep multi-agent reinforcement learning in a homogeneous open population. In Proceedings of the 30th Benelux Conference on Artificial Intelligence (BNAIC 2018), 177–191.

Google Scholar

Rădulescu , R., Mannion , P., Roijers , D. & Nowé , A.2019. Equilibria in multi-objective games: a utility-based perspective. In Adaptive and Learning Agents Workshop (at AAMAS 2019), May 2019.

Google Scholar

Rădulescu , R., Mannion , P., Roijers , D. M. and Nowé , A.2020. Multi-objective multi-agent decision making: a utility-based analysis and survey. Autonomous Agents and Multi-Agent Systems34 (10).

Google Scholar

Reymond , M., Patyn , C., Rădulescu , R., Deconinck , G. & Nowé , A.2018. Reinforcement learning for demand response of domestic household appliances. In Proceedings of the Adaptive and Learning Agents Workshop at FAIM 2018.

Google Scholar

Roijers , D. M.2016. Multi-Objective Decision-Theoretic Planning. PhD thesis, University of Amsterdam.

Google Scholar

Roijers , D. M., Steckelmacher , D. & Nowé , A.2018. Multi-objective reinforcement learning for the expected utility of the return. In Proceedings of the Adaptive and Learning Agents Workshop at FAIM 2018.

Google Scholar

Roijers , D. M., Vamplew , P., Whiteson , S. & Dazeley , R.2013. A survey of multi-objective sequential decision-making. Journal of Artificial Intelligence Research48, 67–113.

Google Scholar

Roijers , D. M. & Whiteson , S.2017. Multi-objective decision making. Synthesis Lectures on Artificial Intelligence and Machine Learning11(1), 1–129.

Google Scholar

Shapley , L. S. & Rigby , F. D.1959. Equilibrium points in games with vector payoffs. Naval Research Logistics Quarterly6 (1), 57–61.

Google Scholar

Talpert , V., Sobh , I., Kiran , B. R., Mannion , P., Yogamani , S., El-Sallab , A. & Perez , P.2019. Exploring applications of deep reinforcement learning for real-world autonomous driving systems. In International Conference on Computer Vision Theory and Applications (VISAPP), February 2019.

Google Scholar

Vamplew , P., Dazeley , R., Berry , A., Issabekov , R. & Dekker , E.2011. Empirical evaluation methods for multiobjective reinforcement learning algorithms. Machine Learning84 (1–2), 51–80.

Google Scholar

Van Moffaert , K. & Nowé , A.2014. Multi-objective reinforcement learning using sets of pareto dominating policies. The Journal of Machine Learning Research15(1), 3483–3512.

Google Scholar

Virtanen , P., Gommers , R., Oliphant , T. E., Haberland , M., Reddy , T., Cournapeau , D., Burovski , E., Peterson , P., Weckesser , W., Bright , J., van der Walt , S. J., Brett , M., Wilson , J., Jarrod Millman , K., Mayorov , N., Nelson , A. R. J., Jones , E., Kern , R., Larson , E., Carey , C., Polat , İ., Feng , Y., Moore , E. W., Vand erPlas, J., Laxalde , D., Perktold , J., Cimrman , R., Henriksen , I., Quintero , E. A., Harris , C. R., Archibald , A. M., Ribeiro , A. H., Pedregosa , F., van Mulbregt , P. & Contributors , S.2019. SciPy 1.0–Fundamental Algorithms for Scientific Computing in Python. arXiv e-prints, art. arXiv:1907.10121, July 2019.

Google Scholar

Voorneveld , M., Vermeulen , D. & Borm , P.1999. Axiomatizations of paretoequilibria in multicriteria games. Games and Economic Behavior280 (1), 146–154.

Google Scholar

Walraven , E. & Spaan , M. T. J.2016. Planning under uncertainty for aggregated electric vehicle charging with renewable energy supply. In Proceedings of the European Conference on Artificial Intelligence, 904–912.

Google Scholar

Watkins , C. J. C. H. Learning from Delayed Rewards. PhD thesis, King’s College, Cambridge, UK, 1989.

Google Scholar

Wierzbicki , A. P.1995. Multiple criteria games – theory and applications. Journal of Systems Engineering and Electronics60 (2), 65–81.

Google Scholar

Wiggers , A. J., Oliehoek , F. A. & Roijers , D. M.2016. Structure in the value function of two-player zero-sum games of incomplete information. In Proceedings of the Twenty-second European Conference on Artificial Intelligence, 1628–1629. IOS Press.

Google Scholar

Yliniemi , L., Agogino , A. K. & Tumer , K.2015. Simulation of the introduction of new technologies in air traffic management. Connection Science270 (3), 269–287.

Google Scholar

Yliniemi , L. & Tumer , K.2016. Multi-objective multiagent credit assignment in reinforcement learning and nsga-ii. Soft Computing200 (10), 3869–3887.

Google Scholar

Zhang , Y., Rădulescu , R., Mannion , P., Roijers , D. M. & Nowé , A.2020. Opponent modelling for reinforcement learning in multi-objective normal form games. In Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020), May 2020.

Google Scholar

Zinkevich , M., Greenwald , A. & Littman , M. L.2006. Cyclic equilibria in markov games. In Advances in Neural Information Processing Systems, 1641–1648.

Google Scholar

Zintgraf , L. M., Kanters , T. V., Roijers , D. M., Oliehoek , F. A. & Beau , P.2015. Quality assessment of MORL algorithms: a utility-based approach. In Benelearn 2015: Proceedings of the Twenty-Fourth Belgian-Dutch Conference on Machine Learning.

Google Scholar

Zintgraf , L. M., Roijers , D. M., Linders , S., Jonker , C. M. & Nowé , A.2018. Ordered preference elicitation strategies for supporting multi-objective decision making. In Proceedings of the 17th International Conference on Autonomous Agents and Multi-Agent Systems, 1477–1485. International Foundation for Autonomous Agents and Multiagent Systems.

Google Scholar

About this article

Cite this article

Roxana Rădulescu, Patrick Mannion, Yijie Zhang, Diederik M. Roijers, Ann Nowé. 2020. A utility-based analysis of equilibria in multi-objective normal-form games. The Knowledge Engineering Review. 35:351 doi: 10.1017/S0269888920000351

Roxana Rădulescu, Patrick Mannion, Yijie Zhang, Diederik M. Roijers, Ann Nowé. 2020. A utility-based analysis of equilibria in multi-objective normal-form games. The Knowledge Engineering Review. 35:351 doi: 10.1017/S0269888920000351

Download PDF

Article Metrics

Article views(254) PDF downloads(284)

{{lists.name}}

A utility-based analysis of equilibria in multi-objective normal-form games

Abstract

Rights and permissions

References

About this article

Cite this article

Article Metrics

Access History

Other Articles By Authors