ISCA Archive Interspeech 2004

Using part-of-speech for predicting phrase breaks

Ian Read, Stephen Cox

Predicting the location of phrase breaks within an utterance is an important task in text-to-speech synthesis, and can be done with reasonable accuracy using part-of-speech (POS) tags as features. However, it seems unlikely that the 40 or more different tags used by most taggers all contribute to this task, and in fact many may contribute noise. In this paper, we present an algorithm for reducing the standard Penn Treebank POS tag set for use in predicting phrase breaks. Using a best-first search, the algorithm considers possible groupings of tags, searching for the groupings that yield the highest overall performance. The reduced tag sets were evaluated with an n-gram model trained on POS sequences along with their associated junctures (break/non-break). The reduced tag set raised the model's performance on junctures correct from 90.38% to 92.43%, and reduced insertions from 2.89% to 1.83%.
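
The search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the move set, the stopping rule, or the model details, so everything below is an assumption made for illustration. Here a search state is a partition of the tag set, a move merges two groups, and the score is the held-out juncture accuracy of a toy bigram majority-vote model that stands in for the paper's n-gram model. The names `merge_neighbours`, `best_first_search`, `make_scorer`, and the tiny data set are all hypothetical.

```python
import heapq
from collections import Counter, defaultdict
from itertools import combinations, count


def merge_neighbours(partition):
    """Yield every partition obtained by merging two groups of `partition`."""
    for a, b in combinations(sorted(partition, key=sorted), 2):
        yield (partition - {a, b}) | {a | b}


def best_first_search(tags, score, max_expansions=50):
    """Best-first search over tag groupings, maximising `score` (assumed setup)."""
    start = frozenset(frozenset([t]) for t in tags)
    best, best_score = start, score(start)
    tie = count()  # tie-breaker so the heap never has to compare partitions
    frontier = [(-best_score, next(tie), start)]
    seen = {start}
    for _step in range(max_expansions):
        if not frontier:
            break
        # Always expand the highest-scoring grouping found so far.
        _neg, _order, state = heapq.heappop(frontier)
        for nxt in merge_neighbours(state):
            if nxt in seen:
                continue
            seen.add(nxt)
            s = score(nxt)
            if s > best_score:
                best, best_score = nxt, s
            heapq.heappush(frontier, (-s, next(tie), nxt))
    return best, best_score


def make_scorer(train, held_out):
    """Return score(partition): held-out accuracy of a toy bigram juncture model."""
    def score(partition):
        group = {t: g for g in partition for t in g}
        votes = defaultdict(Counter)
        for left, right, is_break in train:
            votes[group[left], group[right]][is_break] += 1
        correct = 0
        for left, right, is_break in held_out:
            v = votes.get((group[left], group[right]))
            # Unseen contexts default to "non-break" (an assumption).
            guess = v.most_common(1)[0][0] if v else False
            correct += guess == is_break
        return correct / len(held_out)
    return score


# Hypothetical junctures: (tag before, tag after, is there a break?)
train = [("NN", "CC", True), ("DT", "NN", False), ("VB", "DT", False)]
held_out = [("NNS", "CC", True), ("DT", "NNS", False)]
tags = ["CC", "DT", "NN", "NNS", "VB"]

best, best_score = best_first_search(tags, make_scorer(train, held_out))
print(sorted(sorted(g) for g in best), best_score)
```

On this toy data, NNS never appears in training, so the ungrouped tag set mispredicts the held-out break; merging NN and NNS lets the model share statistics between them, which is the kind of gain the tag-set reduction aims for.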


doi: 10.21437/Interspeech.2004-285

Cite as: Read, I., Cox, S. (2004) Using part-of-speech for predicting phrase breaks. Proc. Interspeech 2004, 741-744, doi: 10.21437/Interspeech.2004-285

@inproceedings{read04_interspeech,
  author={Ian Read and Stephen Cox},
  title={{Using part-of-speech for predicting phrase breaks}},
  year=2004,
  booktitle={Proc. Interspeech 2004},
  pages={741--744},
  doi={10.21437/Interspeech.2004-285}
}