This paper presents results on recovering punctuation marks in speech transcriptions of a Portuguese broadcast news corpus. The approach is based on maximum entropy models and uses word, part-of-speech, time, and speaker information. The contribution of each feature type is analyzed individually. Results are reported separately for each focus condition, which makes it possible to compare performance on planned and spontaneous speech.
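As a rough illustration of how the four information sources named above might be combined, the sketch below builds a feature dictionary for the boundary after each word. It is not the authors' feature set: the `Token` fields, the feature names, and the `boundary_features` helper are all hypothetical, chosen only to show word, part-of-speech, time, and speaker cues side by side. Dictionaries of this kind could then be fed to any maximum entropy (multinomial logistic regression) classifier that predicts which punctuation mark, if any, follows the word.

```python
from typing import Dict, List, NamedTuple, Union


class Token(NamedTuple):
    """One recognized word with its annotations (hypothetical layout)."""
    word: str
    pos: str        # part-of-speech tag
    start: float    # start time in seconds
    end: float      # end time in seconds
    speaker: str    # speaker identifier


FeatureValue = Union[str, float, bool]


def boundary_features(tokens: List[Token], i: int) -> Dict[str, FeatureValue]:
    """Build features for the boundary that follows tokens[i]."""
    cur = tokens[i]
    feats: Dict[str, FeatureValue] = {
        "word": cur.word.lower(),   # lexical cue
        "pos": cur.pos,             # part-of-speech cue
    }
    if i + 1 < len(tokens):
        nxt = tokens[i + 1]
        feats["next_word"] = nxt.word.lower()
        feats["next_pos"] = nxt.pos
        # Time cue: silence between this word and the next one.
        feats["pause"] = round(nxt.start - cur.end, 3)
        # Speaker cue: a change of speaker often coincides with a sentence end.
        feats["speaker_change"] = nxt.speaker != cur.speaker
    else:
        # No following word: mark the end of the input instead.
        feats["end_of_input"] = True
    return feats


# Toy example with made-up timings and tags (assumption, not corpus data).
tokens = [
    Token("boa", "ADJ", 0.0, 0.3, "A"),
    Token("noite", "N", 0.3, 0.8, "A"),
    Token("hoje", "ADV", 1.5, 1.9, "B"),
]
features = [boundary_features(tokens, i) for i in range(len(tokens))]
```

Keeping each information source under its own feature names makes it straightforward to drop one group at a time and measure its individual contribution.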
Cite as: Batista, F., Caseiro, D., Mamede, N., Trancoso, I. (2007) Recovering punctuation marks for automatic speech recognition. Proc. Interspeech 2007, 2153-2156, doi: 10.21437/Interspeech.2007-581
@inproceedings{batista07_interspeech,
  author={Fernando Batista and Diamantino Caseiro and Nuno Mamede and Isabel Trancoso},
  title={{Recovering punctuation marks for automatic speech recognition}},
  year=2007,
  booktitle={Proc. Interspeech 2007},
  pages={2153--2156},
  doi={10.21437/Interspeech.2007-581}
}