2009 Volume 24
RESEARCH ARTICLE   Open Access    

Recent research advances in Reinforcement Learning in Spoken Dialogue Systems

  • Corresponding authors: Matthew Frampton ;  Oliver Lemon

The Knowledge Engineering Review 24 (2009) | doi: 10.1017/S0269888909990166

Abstract: This paper summarizes and analyzes the work of the different research groups who have recently made significant contributions in using Reinforcement Learning techniques to learn dialogue strategies for Spoken Dialogue Systems (SDSs). This use of stochastic planning and learning has become an important research area in the past 10 years, since it promises automatic, data-driven optimization of SDS behavior that was previously hand-coded by expert developers. We survey the most important developments in the field, compare and contrast the different approaches, and describe current open problems.

    • For in-depth discussions of technical details such as temporal difference learning, Monte Carlo learning, eligibility traces, and Q-values, we refer the reader to Sutton and Barto (1998).

    • Supervised Learning (SL) algorithms are machine-learning algorithms that generate a function mapping inputs to desired outputs.

    • A CL is a number between 0 and 1, based on acoustic measurements, that indicates how confident the system is that it has recognized the user's utterance correctly.

    • Multivariate linear regression (see page 1433 of Sheskin (2007)) models numerical data with a least-squares function that is a linear combination of the model parameters and depends on more than one independent variable. A least-squares function fits a model so that the sum of the squared residuals takes its least value, a residual being the difference between an observed value and the value predicted by the model.
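
      Such a least-squares fit can be sketched in a few lines of NumPy. The data below (two hypothetical dialogue features predicting a satisfaction score) are invented purely for illustration:

      ```python
      import numpy as np

      # Hypothetical data: 5 dialogues, 2 independent variables
      # (e.g. task completion, number of turns) and one dependent
      # variable (e.g. a user satisfaction score).
      X = np.array([[1.0, 20.0],
                    [1.0, 12.0],
                    [0.0, 30.0],
                    [1.0, 15.0],
                    [0.0, 25.0]])
      y = np.array([4.5, 4.8, 2.0, 4.6, 2.5])

      # Prepend an intercept column, then solve for the coefficients
      # that minimise the sum of squared residuals.
      A = np.hstack([np.ones((len(X), 1)), X])
      coef, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

      # A residual is the difference between an observed value and
      # the value predicted by the fitted model.
      residuals = y - A @ coef
      sse = float(np.sum(residuals ** 2))
      print(coef, sse)
      ```

      By construction, no other choice of coefficients for this design matrix yields a smaller sum of squared residuals.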

    • R2, the ‘coefficient of determination’ (see page 1230 of Sheskin (2007)), is the proportion of the variability in a data set that is accounted for by a statistical model. R2 = 1 indicates that the fitted model explains all of the variability; R2 = 0 indicates no ‘linear’ relationship between the dependent and independent variables; and R2 = 0.39 indicates that approximately 39% of the variation in the dependent variable can be explained by the independent variables, with the remaining 61% due to unknown variables or inherent variability.
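
      Concretely, R2 is one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch, using made-up observed and predicted values:

      ```python
      # Hypothetical observed values and model predictions.
      observed  = [2.0, 2.5, 4.5, 4.6, 4.8]
      predicted = [2.2, 2.4, 4.3, 4.7, 4.9]

      mean_obs = sum(observed) / len(observed)

      # Residual sum of squares: unexplained variability.
      ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
      # Total sum of squares: overall variability about the mean.
      ss_tot = sum((o - mean_obs) ** 2 for o in observed)

      r_squared = 1 - ss_res / ss_tot
      print(round(r_squared, 3))  # 0.984
      ```

      Here the model accounts for roughly 98% of the variability in the observed values.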

    • In an n-fold cross-validation, the data is first divided into n (usually equal-sized) portions, and then in each of n folds, a different one of these portions is used for testing, while the remainder of the data is used for training. Results are averaged across the n folds.
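
      The bookkeeping of n-fold cross-validation can be sketched as follows (the dataset and "score" are placeholders; a real use would train and evaluate a model in each fold):

      ```python
      # Hypothetical dataset of 12 items; 4-fold cross-validation.
      data = list(range(12))
      n = 4
      fold_size = len(data) // n

      scores = []
      for fold in range(n):
          # A different portion is held out for testing in each fold;
          # the remainder of the data is used for training.
          test = data[fold * fold_size:(fold + 1) * fold_size]
          train = data[:fold * fold_size] + data[(fold + 1) * fold_size:]
          assert set(test).isdisjoint(train)
          # Stand-in for a real evaluation score.
          scores.append(len(train))

      # Results are averaged across the n folds.
      print(sum(scores) / n)
      ```

      Every item is used for testing exactly once, and the averaged score is less sensitive to any single train/test split.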

    • A speech recognizer may provide a list, in order, of its top n hypotheses for a user utterance according to their CLs.

    • This is also a very important issue for evaluating the accuracy of user simulations.

    • If i = 1, then i − 1 is considered to be the final slot, and if i is the final slot, then i + 1 is considered to be the first slot; for example, ‘So you want to fly from Edinburgh to where?’.
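
      This circular neighbour convention can be sketched as follows (the function name and 1-indexed slot numbering are illustrative assumptions):

      ```python
      def neighbours(i, num_slots):
          """Return the (previous, next) slots of 1-indexed slot i,
          wrapping around at the ends of the slot list."""
          prev_slot = num_slots if i == 1 else i - 1
          next_slot = 1 if i == num_slots else i + 1
          return prev_slot, next_slot

      print(neighbours(1, 4))  # (4, 2): slot 1's predecessor is the final slot
      print(neighbours(4, 4))  # (3, 1): the final slot's successor is slot 1
      ```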

    • Utterances that cannot be handled by the system.

    • If there are no unfilled/unconfirmed slots to switch focus to, the strategy continues to follow DA2.

    • Copyright © 2009 Cambridge University Press
  • About this article
    Cite this article
    Matthew Frampton, Oliver Lemon. 2009. Recent research advances in Reinforcement Learning in Spoken Dialogue Systems. The Knowledge Engineering Review. 24:166 doi: 10.1017/S0269888909990166