Boutilier C. 1996. Planning, learning and coordination in multiagent decision processes. In Proceedings of the 6th Conference on Theoretical Aspects of Rationality and Knowledge, 195–210.
Claus C. & Boutilier C. 1998. The dynamics of reinforcement learning in cooperative multiagent systems. In Proceedings of the 15th National Conference on Artificial Intelligence, 746–752. AAAI Press.
De Hauwere Y.-M., Vrancx P. & Nowé A. 2010. Learning multi-agent state space representations. In Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, 715–722.
De Hauwere Y.-M., Vrancx P. & Nowé A. 2011a. Adaptive state representations for multi-agent reinforcement learning. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence, 181–189.
De Hauwere Y.-M., Vrancx P. & Nowé A. 2011b. Solving delayed coordination problems in MAS (extended abstract). In Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 1115–1116.
De Hauwere Y.-M., Vrancx P. & Nowé A. 2011c. Solving sparse delayed coordination problems in multi-agent reinforcement learning. In Adaptive Agents and Multi-Agent Systems V, Lecture Notes in Artificial Intelligence 7113, 45–52. Springer-Verlag.
Devlin S. & Kudenko D. 2011. Theoretical considerations of potential-based reward shaping for multiagent systems. In Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems, Volume 1, 225–232.
Devlin S. & Kudenko D. (in press). Plan-based reward shaping for multi-agent reinforcement learning. Knowledge Engineering Review.
Greenwald A. & Hall K. 2003. Correlated-Q learning. In AAAI Spring Symposium, 242–249. AAAI Press.
Grzes M. & Kudenko D. 2008. Plan-based reward shaping for reinforcement learning. In 4th International IEEE Conference on Intelligent Systems, 2008 (IS’08), Volume 2, 10-22–10-29.
Hu J. & Wellman M. 2003. Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research 4, 1039–1069.
Kok J., ’t Hoen P., Bakker B. & Vlassis N. 2005. Utile coordination: learning interdependencies among cooperative agents. In Proceedings of the IEEE Symposium on Computational Intelligence and Games (CIG’05), 29–36.
Melo F. & Veloso M. 2009. Learning of coordination: exploiting sparse interactions in multiagent systems. In Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems, 773–780.
Melo F. & Veloso M. 2010. Local Multiagent Coordination in Decentralised MDPs with Sparse Interactions. Technical report CMU-CS-10-133, School of Computer Science, Carnegie Mellon University.
Ng A. Y., Harada D. & Russell S. 1999. Policy invariance under reward transformations: theory and application to reward shaping. In Proceedings of the 16th International Conference on Machine Learning, 278–287. Morgan Kaufmann.
Randløv J. & Alstrøm P. 1998. Learning to drive a bicycle using reinforcement learning and shaping. In Proceedings of the 15th International Conference on Machine Learning (ICML’98), 463–471. Morgan Kaufmann.
Tsitsiklis J. 1994. Asynchronous stochastic approximation and Q-learning. Machine Learning 16(3), 185–202.
Tumer K. & Khani N. 2009. Learning from actions not taken in multiagent systems. Advances in Complex Systems 12(4–5), 455–473.
Vrancx P., Verbeeck K. & Nowé A. 2008. Decentralized learning in Markov games. IEEE Transactions on Systems, Man and Cybernetics, Part B: Cybernetics 38(4), 976–981.
Watkins C. 1989. Learning from Delayed Rewards. PhD thesis, University of Cambridge.