The standard metric for evaluating automatic speech recognition (ASR) systems is the word error rate (WER). WER has proven very useful for stand-alone ASR systems. Nowadays, these systems are often embedded in complex natural language processing pipelines to perform tasks such as speech translation, human-machine dialogue, or information retrieval from speech. This exacerbates the need for the speech processing community to design a new evaluation metric that estimates the quality of automatic transcriptions within their larger applicative context. We introduce ATENE, a new measure for evaluating ASR in the context of named entity recognition, which uses a probabilistic model to estimate the risk that ASR errors will induce downstream errors in named entity detection. Our evaluation on the ETAPE data shows that ATENE correlates better than WER with performance in named entity recognition.
Cite as: Jannet, M.A.B., Galibert, O., Adda-Decker, M., Rosset, S. (2015) How to evaluate ASR output for named entity recognition? Proc. Interspeech 2015, 1289-1293, doi: 10.21437/Interspeech.2015-322
@inproceedings{jannet15_interspeech,
  author={Mohamed Ameur Ben Jannet and Olivier Galibert and Martine Adda-Decker and Sophie Rosset},
  title={{How to evaluate ASR output for named entity recognition?}},
  year=2015,
  booktitle={Proc. Interspeech 2015},
  pages={1289--1293},
  doi={10.21437/Interspeech.2015-322}
}