Grading the Severity of Mispronunciations in CAPT Based on Statistical Analysis and Computational Speech Perception

  • Regular Paper
  • Journal of Computer Science and Technology

Abstract

Computer-aided pronunciation training (CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second language (L2) learners’ speech. To further facilitate learning, we aim to develop a principle-based method for grading the severity of mispronunciations. This paper presents an approach to gradation that is motivated by auditory perception. We have developed a computational method for generating a perceptual distance (PD) between two spoken phonemes, which is used to compute auditory confusion in the native language (L1). PD is found to correlate well with the mispronunciations detected in a CAPT system for Chinese learners of English, i.e., L1 being Chinese (Mandarin and Cantonese) and L2 being US English. The results show that auditory confusion is indicative of pronunciation confusions in L2 learning. PD can also be used to grade the severity of errors (i.e., mispronunciations that confuse more distant phonemes are more severe) and accordingly to prioritize the order of corrective feedback generated for learners.
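
The prioritization idea in the abstract can be illustrated with a minimal sketch. It assumes that a PD value is already available for each phoneme pair. The `Mispronunciation` type, the `prioritize` function, the phoneme labels, and the numeric values in `ILLUSTRATIVE_PD` are all hypothetical placeholders introduced here for illustration; none of them is taken from the paper.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Mispronunciation(NamedTuple):
    """One detected error (hypothetical structure, not from the paper)."""
    word: str
    target: str    # phoneme the learner was expected to produce
    produced: str  # phoneme the detector recognized instead


# Placeholder PD values, invented for illustration only.
# Keys are unordered phoneme pairs, so lookup is symmetric.
ILLUSTRATIVE_PD: Dict[FrozenSet[str], float] = {
    frozenset(("n", "l")): 0.2,
    frozenset(("th", "f")): 0.4,
    frozenset(("r", "w")): 0.7,
}


def perceptual_distance(a: str, b: str,
                        table: Dict[FrozenSet[str], float]) -> float:
    """Look up the PD between two phonemes; identical phonemes have PD 0."""
    if a == b:
        return 0.0
    return table[frozenset((a, b))]


def prioritize(errors: List[Mispronunciation],
               table: Dict[FrozenSet[str], float]) -> List[Mispronunciation]:
    """Order errors so that larger PD (more severe) comes first."""
    return sorted(
        errors,
        key=lambda e: perceptual_distance(e.target, e.produced, table),
        reverse=True,
    )


errors = [
    Mispronunciation("night", "n", "l"),
    Mispronunciation("red", "r", "w"),
    Mispronunciation("think", "th", "f"),
]
ranked = prioritize(errors, ILLUSTRATIVE_PD)
for e in ranked:
    print(e.word, e.target, "->", e.produced)
```

Under these placeholder values, feedback on the largest-PD confusion would be presented first. In practice the table would be filled with distances computed by the paper's perceptual method rather than hand-picked numbers.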

Author information

Corresponding author

Correspondence to Jia Jia.

Additional information

This work is supported by the National Basic Research 973 Program of China under Grant No. 2013CB329304, the National Natural Science Foundation of China under Grant No. 61370023, and the Major Project of the National Social Science Foundation of China under Grant No. 13&ZD189. This work is also partially supported by the General Research Fund of the Hong Kong SAR Government under Project No. 415511 and the CUHK Teaching Development Grant.

About this article

Cite this article

Jia, J., Leung, WK., Wu, YH. et al. Grading the Severity of Mispronunciations in CAPT Based on Statistical Analysis and Computational Speech Perception. J. Comput. Sci. Technol. 29, 751–761 (2014). https://doi.org/10.1007/s11390-014-1465-2

  • DOI: https://doi.org/10.1007/s11390-014-1465-2
