The relative impact of student affect on performance models in a spoken dialogue tutoring system

  • Original Paper
User Modeling and User-Adapted Interaction

Abstract

We hypothesize that student affect is a useful predictor of spoken dialogue system performance, relative to other parameters. We test this hypothesis in the context of our spoken dialogue tutoring system, where student learning is the primary performance metric. We first present our system and corpora, which have been annotated with several student affective states, student correctness and discourse structure. We then discuss unigram and bigram parameters derived from these annotations. The unigram parameters represent each annotation type individually, as well as system-generic features. The bigram parameters represent annotation combinations, including student state sequences and student states in the discourse structure context. We then use these parameters to build learning models. First, we build simple models based on correlations between each of our parameters and learning. Our results suggest that our affect parameters are among our most useful predictors of learning, particularly in specific discourse structure contexts. Next, we use the PARADISE framework (multiple linear regression) to build complex learning models containing only the most useful subset of parameters. Our approach is a value-added one; we perform a number of model-building experiments, both with and without including our affect parameters, and then compare the performance of the models on the training and the test sets. Our results show that when included as inputs, our affect parameters are selected as predictors in most models, and many of these models show high generalizability in testing. Our results also show that overall, the affect-included models significantly outperform the affect-excluded models.
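
The value-added comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the column layout and the names `pearson_r`, `fit_ols`, `r_squared` and `compare_models` are hypothetical, and plain least squares stands in for the PARADISE model-building procedure, whose parameter-selection step is omitted here. The sketch first correlates a single parameter with a learning score, then fits one regression with the affect parameter and one without it, and scores both on a training split and a held-out split.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between one parameter and the learning score."""
    return float(np.corrcoef(x, y)[0, 1])

def fit_ols(X, y):
    """Ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r_squared(coef, X, y):
    """Fraction of variance in y explained by the fitted model on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    resid = y - A @ coef
    total = y - y.mean()
    return float(1.0 - (resid @ resid) / (total @ total))

def compare_models(X, y, affect_cols, n_train):
    """Fit affect-included and affect-excluded models; score each on train and test."""
    all_cols = list(range(X.shape[1]))
    other_cols = [c for c in all_cols if c not in affect_cols]
    results = {}
    for name, cols in (("affect_included", all_cols),
                       ("affect_excluded", other_cols)):
        X_tr, X_te = X[:n_train, cols], X[n_train:, cols]
        coef = fit_ols(X_tr, y[:n_train])
        results[name] = {"train_r2": r_squared(coef, X_tr, y[:n_train]),
                         "test_r2": r_squared(coef, X_te, y[n_train:])}
    return results

# Synthetic stand-in data (NOT the paper's corpus): 60 students, 3 parameters.
# Column 0 is a hypothetical affect parameter; columns 1-2 are other parameters.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 0.8 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.5, size=60)

print("r(affect, learning) =", pearson_r(X[:, 0], y))
print(compare_models(X, y, affect_cols=[0], n_train=40))
```

Because the affect-excluded model is nested in the affect-included one, the training fit of the larger model can never be worse, so only the held-out scores say anything about generalizability. That is why the comparison reports both splits.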

Author information

Corresponding author

Correspondence to Kate Forbes-Riley.

Additional information

Kate Forbes-Riley and Mihai Rotaru contributed equally to this work.

About this article

Cite this article

Forbes-Riley, K., Rotaru, M. & Litman, D.J. The relative impact of student affect on performance models in a spoken dialogue tutoring system. User Model User-Adap Inter 18, 11–43 (2008). https://doi.org/10.1007/s11257-007-9038-5

  • DOI: https://doi.org/10.1007/s11257-007-9038-5
