|
Abdallah , S. & Lesser , V.2006. Learning the task allocation game, In Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multi-Agent Systems, AAMAS ’06, ACM Press, 850–857. |
|
Agogino , A. K. & Tumer , K.2004. Unifying temporal and structural credit assignment problems, In Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS ’04, IEEE, 980–987. |
|
Auer , P., Cesa-Bianchi , N., Freund , Y. & Schapire , R. E.2002. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing32(1), 48–77. |
|
Awerbuch , B. & Kleinberg , R. D.2004. Adaptive routing with end-to-end feedback: Distributed learning and geometric approaches. In Proceedings of the Thirty-Sixth Annual ACM Symposium on Theory of Computing, STOC ’04, ACM, 45–53. |
|
Bar-Gera , H.2010. Traffic assignment by paired alternative segments. Transportation Research Part B: Methodological44(8–9), 1022–1046. |
|
Bazzan , A. L. C.2009. Opportunities for multiagent systems and multiagent reinforcement learning in traffic control. Autonomous Agents and Multiagent Systems18(3), 342–375. |
|
Bazzan , A. L. C. & Klügl , F.2005. Case studies on the Braess paradox: simulating route recommendation and learning in abstract and microscopic models. Transportation Research C13(4), 299–319. |
|
Bazzan , A. L. C. & Klügl , F.2013. Introduction to Intelligent Systems inTraffic and Transportation, Vol. 7. Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan and Claypool. |
|
Beckmann , M., McGuire , C. B. & Winsten , C. B.1956. Studies in the Economics of Transportation, Yale University Press, New Haven. |
|
Bonifaci , V., Salek , M. & Schäfer , G.2011. Efficiency of restricted tolls in non-atomic network routing games. In Algorithmic Game Theory: Proceedings of the 4th International Symposium (SAGT 2011), Persiano , G. (ed). Springer, 302–313. |
|
Bowling , M.2005. Convergence and no-regret in multiagent learning. In L. K.Saul , Y.Weiss & L.Bottou , (eds.) Advances in Neural Information Processing Systems 17: Proceedings of the 2004 Conference, MIT Press, 209–216. |
|
Boyan , J. A. & Littman , M. L.1994. Packet routing in dynamically changing networks: A reinforcement learning approach. Advances in Neural Information Processing Systems6, 671–678. |
|
Braess , D.1968. Über ein Paradoxon aus der Verkehrsplanung. Unternehmensforschung12, 258. |
|
Buşoniu , L., Babuska , R. & De Schutter , B.2008. A comprehensive survey of multiagent reinforcement learning. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews38(2), 156–172. |
|
Chen , H., An , B., Sharon , G., Hanna , J. P., Stone , P., Miao , C. & Soh , Y. C.2018. DyETC: Dynamic electronic toll collection for traffic congestion alleviation. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), number February, AAAI Press, 757–765. |
|
Chen , P.-A. & Kempe , D.2008. Altruism, selfishness, and spite in traffic routing. In Proceedings of the 9th ACM Conference on Electronic Commerce (EC ’08), Riedl , J. & Sandholm , T. (eds.), ACM Press, 140–149. |
|
Claus , C. & Boutilier , C.1998. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the Fifteenth National Conference on Artificial Intelligence, 746–752. |
|
Colby , M., Duchow-Pressley , T., Chung , J. J. & Tumer , K.2016. Local approximation of difference evaluation functions. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2016), IFAAMAS, Singapore, 521–529. |
|
Cole , R., Dodis , Y. & Roughgarden , T.2003. Pricing network edges for heterogeneous selfish users. In Proceedings of the Thirty-fifth Annual ACM Symposium on Theory of Computing, STOC ’03, ACM, 521–530. |
|
Crites , R. H. & Barto , A. G.1998. Elevator group control using multiple reinforcement learning agents. Machine Learning33(2), 235–262. |
|
de Palma , A. & Lindsey , R.2011. Traffic congestion pricing methodologies and technologies. Transportation Research Part C: Emerging Technologies19(6), 1377–1399. |
|
Fehr , E. & Fischbacher , U.2003. The nature of human altruism. Nature425(6960), 785–791. |
|
Fleischer , L., Jain , K. & Mahdian , M.2004. Tolls for heterogeneous selfish users in multicommodity networks and generalized congestion games. In 45th Annual IEEE Symposium on Foundations of Computer Science, IEEE, Rome, Italy, 277–285. |
|
Foerster , J., Nardell , N., Farquhar , G., Afouras , T., Torr , P. H., Kohli , P. & Whiteson , S.2017. Stabilising experience replay for deep multi-agent reinforcement learning. In Proceedings of the 34th International Conference on Machine Learning, 70, PMLR, 1146–1155. |
|
Centre for Economics and Business Research2014. The Future Economic and Environmental Costs of Gridlock in 2030, Technical report, Centre for Economics and Business Research, London. |
|
Hernandez-Leal , P., Zhan , Y., Taylor , M. E., Sucar , L. E. & Munoz de Cote , E.2017. An exploration strategy for non-stationary opponents. Autonomous Agents and Multi-Agent Systems31(5), 971–1002. |
|
Hoefer , M. & Skopalik , A.2009. Altruism in atomic congestion games. In 17th Annual European Symposium on Algorithms, Fiat , A. & Sanders , P. (eds.), Springer, Berlin Heidelberg, 179–189. |
|
Hu , J. & Wellman , M. P.1998. Multiagent reinforcement learning: Theoretical framework and an algorithm. In Proceedings of the 15th International Conference on Machine Learning, Morgan Kaufmann, 242–250. |
|
Hu , J. & Wellman , M. P.2003. Nash q-learning for general-sum stochastic games. Journal of Machine Learning Research4, 1039–1069. |
|
Jayakrishnan , R., Cohen , M., Kim , J., Mahmassani , H. S. & Hu , T.-Y.1993. A Simulation-Based Framework for the Analysis of Traffic Networks Operating with Real-Time Information, Technical Report UCB-ITS-PRR-93-25, University of California, Berkeley. |
|
Kaisers , M. & Tuyls , K.2010. Frequency adjusted multi-agent q-learning. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, International Foundation for Autonomous Agents and Multiagent Systems, 309–316. |
|
Kobayashi , K. & Do , M.2005. The informational impacts of congestion tolls upon route traffic demands. Transportation Research A39(7–9), 651–670. |
|
Koutsoupias , E. & Papadimitriou , C.1999. Worst-case equilibria. In Proceedings of the 16th Annual Conference on Theoretical Aspects of Computer Science (STACS), Springer-Verlag, 404–413. |
|
Lanctot , M., Zambaldi , V., Gruslys , A., Lazaridou , A., Tuyls , K., Perolat , J., Silver , D. & Graepel , T.2017. A unified game-theoretic approach to multiagent reinforcement learning. In Advances in Neural Information Processing Systems, Guyon , I., Luxburg , U. V., Bengio , S., Wallach , H., Fergus , R., Vishwanathan , S. & Garnett , R. (eds.), 30, Curran Associates, Inc., 4190–4203. |
|
Lauer , M. & Riedmiller , M.2004. Reinforcement learning for stochastic cooperative multi-agent-systems. Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS 2004 3, 1516–1517. |
|
Laurent , G. J., Matignon , L. & Le Fort-Piat , N.2011. The world of independent learners is not markovian. International Journal of Knowledge-based and Intelligent Engineering Systems15(1), 55–64. |
|
LeBlanc , L. J., Morlok , E. K. & Pierskalla , W. P.1975. An efficient approach to solving the road network equilibrium traffic assignment problem. Transportation Research9(5), 309–318. |
|
Levy , N. & Ben-Elia , E.2016. Emergence of system optimum: A fair and altruistic agent-based route-choice model. Procedia Computer Science83, 928–933. |
|
Littman , M. L.1994. Markov games as a framework for multi-agent reinforcement learning. In Proceedings of the 11th International Conference on Machine Learning, ML, Morgan Kaufmann, 157–163. |
|
Littman , M. L.2001. Friend-or-Foe Q-learning in general-sum games. In Proceedings of the Eighteenth International Conference on Machine Learning (ICML’01). Morgan Kaufmann, 322–328. |
|
Lujak , M., Giordani , S. & Ossowski , S.2015. Route guidance: Bridging system and user optimization in traffic assignment. Neurocomputing151, 449–460. |
|
Malialis , K., Devlin , S. & Kudenko , D.2016. Resource abstraction for reinforcement learning in multiagent congestion problems. In Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, International Foundation for Autonomous Agents and Multiagent Systems, 503–511. |
|
Matignon , L., Laurent , G. J. & Le Fort-Piat , N.2012. Independent reinforcement learners in cooperative markov games: a survey regarding coordination problems. The Knowledge Engineering Review27(1), 1–31. |
|
McFadden , D.2001. Disaggregate behavioral travel demand’s RUM side. In Travel Behaviour Research: The Leading Edge, Hensher , D. A. (ed), Elsevier, 17–63. |
|
Meir , R. & Parkes , D.2018. Playing the wrong game: Bounding externalities in diverse populations of agents. In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018), IFAAMAS, Stockholm, 86–94. |
|
Meir , R. & Parkes , D. C.2016. When are marginal congestion tolls optimal? In Proceedings of the Ninth Workshop on Agents in Traffic and Transportation (ATT-2016), Bazzan , A. L. C., Klügl , F., Ossowski , S. & Vizzari , G. (eds). CEUR-WS.org, 8. |
|
Mirzaei , H., Sharon , G., Boyles , S., Givargis , T. & Stone , P.2018. Link-based parameterized micro-tolling scheme for optimal traffic management. In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems, AAMAS ’18, Dastani , M., Sukthankar , G., André , E. & Koenig , S. (eds). IFAAMAS, 2013–2015. |
|
Nash , J.1950. Non-Cooperative Games, PhD thesis, Princeton University. |
|
National Surface Transportation Infrastructure Financing Commission2009. Paying Our Way: A New Framework for Transportation Finance, Technical report, National Surface Transportation Infrastructure Financing Commission, Washington DC. |
|
Bureau of Public Roads1964. Traffic Assignment Manual, Technical report, US Department of Commerce, Washington, D. C. |
|
Omidshafiei , S., Pazis , J., Amato , C., How , J. P. & Vian , J.2017. Deep decentralized multi-task multi-agent reinforcement learning under partial observability. In Proceedings of the 34th International Conference on Machine Learning, 70, 4108–4122. |
|
Ortúzar , J. d. D. & Willumsen , L. G.2011. Modelling Transport, 4 edition, John Wiley & Sons. |
|
Pigou , A.1920. The Economics of Welfare, Palgrave Classics in Economics, Palgrave Macmillan. |
|
Proper , S. & Tumer , K.2012. Modeling difference rewards for multiagent learning. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2012), Conitzer, V., Winikoff, M., Padgham, L. & van der Hoek, W. (eds). IFAAMAS. |
|
Rădulescu , R., Vrancx , P. & Nowé , A.2017. Analysing congestion problems in multi-agent reinforcement learning. In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, International Foundation for Autonomous Agents and Multiagent Systems, 1705–1707. |
|
Ramos , G. de O.2018. Regret Minimisation and System-Efficiency in Route Choice, PhD thesis, Universidade Federal do Rio Grande do Sul, Porto Alegre. |
|
Ramos , G. de O. & Bazzan , A. L. C.2015. Towards the user equilibrium in traffic assignment using GRASP with path relinking. In Proceedings of the 2015 on Genetic and Evolutionary Computation Conference, GECCO ’15, ACM, 473–480. |
|
Ramos , G. de O. & Bazzan , A. L. C.2016. Efficient local search in traffic assignment. In 2016 IEEE Congress on Evolutionary Computation (CEC). IEEE, 1493–1500. |
|
Ramos , G. de O., da Silva , B. C. & Bazzan , A. L. C.2017. Learning to minimise regret in route choice. In Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2017), Das , S., Durfee , E., Larson , K. & Winikoff , M. (eds). IFAAMAS, 846–855. |
|
Ramos , G. de O., da Silva , B. C., Rădulescu , R. & Bazzan , A. L. C.2018. Learning system-efficient equilibria in route choice using tolls. In Proceedings of the Adaptive Learning Agents Workshop 2018 (ALA-18), Stockholm. |
|
Ramos , G. de O., Rădulescu , R., Nowé , A. & Tavares , A. R. 2020. Toll-based learning for minimising congestion under heterogeneous preferences. In Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020), An, B., Yorke-Smith, N., El Fallah Seghrouchni, A. & Sukthankar , G. (eds). IFAAMAS. |
|
Robbins , H.1952. Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society58(5), 527–535. |
|
Rosenthal , R. W.1973. A class of games possessing pure-strategy Nash equilibria. International Journal of Game Theory2, 65–67. |
|
Roughgarden , T.2005. Selfish Routing and the Price of Anarchy, MIT Press. |
|
Sandholm , T.2007. Perspectives on multiagent learning. Artificial Intelligence171(7), 382–391. |
|
Sen , S., Sekaran , M. & Hale , J.1994. Learning to coordinate without sharing information. In Proceedings of the Twelfth National Conference on Artificial Intelligence, 426–431. |
|
Sharon , G., Boyles , S. D., Alkoby , S. & Stone , P.2019. Marginal cost pricing with a fixed error factor in traffic networks. In Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2019), Agmon , N., Taylor , M., Elkind , E. & Veloso , M. (eds). IFAAMAS, Montreal, 1539–1546. |
|
Sharon , G., Hanna , J. P., Rambha , T., Levin , M. W., Albert , M., Boyles , S. D. & Stone , P.2017. Real-time adaptive tolling scheme for optimized social welfare in traffic networks. In Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2017), Das , S., Durfee , E., Larson , K. & Winikoff , M. (eds). IFAAMAS, 828–836. |
|
Stefanello , F. & Bazzan , A. L. C.2016. Traffic Assignment Problem – Extending Braess Paradox, Technical report, Universidade Federal do Rio Grande do Sul, Porto Alegre, RS. |
|
Sutton , R. & Barto , A.1998. Reinforcement Learning: An Introduction, MIT Press. |
|
Tan , M.1993. Multi-agent reinforcement learning: Independent vs. cooperative agents. In Proceedings of the Tenth International Conference on Machine Learning, 330–337. |
|
Tavares , A. R. & Bazzan , A. L.2014. An agent-based approach for road pricing: System-level performance and implications for drivers. Journal of the Brazilian Computer Society20(1), 15. |
|
Tesauro , G.1994. Td-gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation6(2), 215–219. |
|
Tuyls , K. & Weiss , G.2012. Multiagent learning: Basics, challenges, and prospects. AI Magazine33(3), 41–52. |
|
van Essen , M., Thomas , T., van Berkum , E. & Chorus , C.2016. From user equilibrium to system optimum: a literature review on the role of travel information, bounded rationality and non-selfish behaviour at the network and individual levels. Transport Reviews36(4), 527–548. |
|
Verbeeck , K., Nowé , A., Parent , J. & Tuyls , K.2007. Exploring selfish reinforcement learning in repeated games with stochastic rewards. Autonomous Agents and Multi-Agent Systems14(3), 239–269. |
|
Vrancx , P., Verbeeck , K. & Nowe , A.2008. Decentralized learning in markov games. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)38(4), 976–981. |
|
Vrancx , P., Verbeeck , K. & Nowé , A.2010. Learning to take turns. In Proceedings of the AAMAS 2010 Workshop on Adaptive Learning Agents and Multi-Agent Systems (ALA 2010), 1–7. |
|
Wardrop , J. G.1952. Some theoretical aspects of road traffic research. Proceedings of the Institution of Civil Engineers, Part II 1(36), 325–362. |
|
Watkins , C. J. C. H. & Dayan , P.1992. Q-learning. Machine Learning8(3), 279–292. |
|
Wolpert , D. H. & Tumer , K.1999. An introduction to Collective Intelligence, Technical report NASA-ARC-IC-99-63, NASA Ames Research Center. arXiv:cs/9908014 [cs.LG]. |
|
Wolpert , D. H. & Tumer , K.2002. Collective intelligence, data routing and Braess’ paradox. Journal of Artificial Intelligence Research16, 359–387. |
|
Yang , H., Meng , Q. & Lee , D.-H.2004. Trial-and-error implementation of marginal-cost pricing on networks in the absence of demand functions. Transportation Research Part B: Methodological38(6), 477–493. |
|
Ye , H., Yang , H. & Tan , Z.2015. Learning marginal-cost pricing via a trial-and-error procedure with day-to-day flow dynamics. Transportation Research Part B: Methodological81, 794–807. |
|
Yen , J. Y.1971. Finding the k shortest loopless paths in a network. Management Science17(11), 712–716. |
|
Youn , H., Gastner , M. T. & Jeong , H.2008. Price of anarchy in transportation networks: Efficiency and optimality control. Physical Review Letters101(12), 128701. |
|
Zhang , J., Pourazarm , S., Cassandras , C. G. & Paschalidis , I. C.2016. The price of anarchy in transportation networks by estimating user cost functions from actual traffic data. In 2016 IEEE 55th Conference on Decision and Control (CDC), IEEE, 789–794. |
|
Zinkevich , M.2003. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the Twentieth International Conference on Machine Learning, AAAI Press, 928–936. |