2012 Volume 27
RESEARCH ARTICLE   Open Access    

A review of machine learning for automated planning

  • Abstract: Recent discoveries in automated planning are broadening the scope of planners, from toy problems to real applications. However, applying automated planners to real-world problems is far from simple. On the one hand, the definition of accurate action models for planning is still a bottleneck. On the other hand, off-the-shelf planners fail to scale up and to provide good solutions in many domains. In these problematic domains, planners can exploit domain-specific control knowledge to improve their performance in terms of both speed and quality of the solutions. However, the manual definition of control knowledge is quite difficult. This paper reviews recent techniques in machine learning for the automatic definition of planning knowledge. It is organized according to the target of the learning process: the automatic definition of planning action models and the automatic definition of planning control knowledge. In addition, the paper reviews advances in the related field of reinforcement learning.

  • Cite this article

    Sergio Jiménez, Tomás De La Rosa, Susana Fernández, Fernando Fernández, Daniel Borrajo. 2012. A review of machine learning for automated planning. The Knowledge Engineering Review 27(4), 433–467. doi: 10.1017/S026988891200001X


    • http://www.icaps-conference.org/index.php/Main/Competitions

    • Here, the concept of logic program is used in a narrower sense than is generally applied, that is, as a conjunction of disjunctions of literals where each disjunction contains at most one positive literal. This type of disjunction is called a Horn clause.
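    To make the restriction concrete, here is a minimal sketch (illustrative, not from the paper) that checks the Horn property, at most one positive literal per disjunction, over clauses represented as sets of literal strings, with a leading "~" marking negation:

    ```python
    # Illustrative sketch only: a clause is a set of literal strings, and a
    # literal starting with "~" is negative. A Horn clause is a disjunction
    # with at most one positive literal.

    def is_horn(clause):
        positives = [lit for lit in clause if not lit.startswith("~")]
        return len(positives) <= 1

    # ~parent(x,y) v ~parent(y,z) v grandparent(x,z): a definite (Horn) clause
    assert is_horn({"~parent(x,y)", "~parent(y,z)", "grandparent(x,z)"})
    # p v q has two positive literals, so it is not a Horn clause
    assert not is_horn({"p", "q"})
    ```

    Clauses with no positive literal (goal clauses) also count as Horn under this definition, which is why the check is `<= 1` rather than `== 1`.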

    • The concept of well-placed must be defined in the domain theory. A block is considered well-placed when it is on the right block (or the table) with respect to the goal, and all blocks beneath it are well-placed too.
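    The recursive definition above can be sketched as follows for a Blocksworld-style state; the state encoding and the names `on`, `goal_on`, and `well_placed` are hypothetical choices for this sketch, not from the paper:

    ```python
    # Hypothetical sketch of the "well-placed" predicate described above.
    # `on` maps each block to what it rests on ("table" or another block);
    # `goal_on` is the same mapping for the goal state.

    def well_placed(block, on, goal_on):
        """A block is well-placed when it rests on the object the goal
        requires, and everything beneath it is well-placed too."""
        if on.get(block) != goal_on.get(block):
            return False
        below = on[block]
        if below == "table":        # the table is trivially well-placed
            return True
        return well_placed(below, on, goal_on)

    # Example: goal is C on B on A on the table; currently B is on A on the
    # table, but C sits on the table, so A and B are well-placed and C is not.
    on      = {"A": "table", "B": "A", "C": "table"}
    goal_on = {"A": "table", "B": "A", "C": "B"}
    assert well_placed("B", on, goal_on)
    assert not well_placed("C", on, goal_on)
    ```

    The recursion bottoms out at the table, mirroring the footnote's requirement that every block beneath a well-placed block is itself well-placed.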

    • Copyright © Cambridge University Press 2012