Pronunciation learning for named-entities through crowd-sourcing

Rutherford, Attapol T.; Peng, Fuchun; Beaufays, Françoise

doi:10.21437/Interspeech.2014-354

Obtaining good pronunciations for named-entities poses a challenge for automated speech recognition because named-entities are diverse in nature and origin, and new entities appear every day. In this paper, we investigate the feasibility of learning named-entity pronunciations using crowd-sourcing. By collecting audio samples through Mechanical Turk from speakers who are not linguistic experts, and learning from those samples, we can quickly derive pronunciations that are more accurate in speech recognition tests than manual pronunciations generated by linguistic experts. Compared to traditional approaches of generating pronunciations, this new approach proves to be cheap, fast, and quite accurate.

Cite as: Rutherford, A.T., Peng, F., Beaufays, F. (2014) Pronunciation learning for named-entities through crowd-sourcing. Proc. Interspeech 2014, 1448-1452, doi: 10.21437/Interspeech.2014-354

@inproceedings{rutherford14_interspeech,
  author={Attapol T. Rutherford and Fuchun Peng and Françoise Beaufays},
  title={{Pronunciation learning for named-entities through crowd-sourcing}},
  year=2014,
  booktitle={Proc. Interspeech 2014},
  pages={1448--1452},
  doi={10.21437/Interspeech.2014-354}
}