Integration of speech recognition, text-to-speech synthesis, and talker verification into a hands-free audio/image teleconferencing system (humanet)

Berkley, D. A.; Flanagan, James L.

doi:10.21437/ICSLP.1990-228

This report describes the design and implementation of a digital teleconferencing system that integrates a number of speech technologies with image and data facilities. The aim is to provide a variety of sophisticated communication features that are easy to learn and use. The system, called HuMaNet (Human/Machine Network), is controlled entirely hands-free and interactively by natural speech. It combines speech recognition, text-to-speech synthesis, and talker verification with autodirective microphone arrays, image compression, and data and hypertext management to provide high-quality audio and image conferencing over basic-rate ISDN (Integrated Services Digital Network). The present public-switched transport capacity provides "2B+D": two 64 kbit/s circuit-switched channels (2B) and one 16 kbit/s packet-switched channel (D).
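
As a quick check on the "2B+D" figures quoted above, the following minimal sketch adds up the channel rates. The constant and function names are illustrative assumptions, not taken from the paper, and the abstract does not say how HuMaNet divides audio, image, and control traffic among the channels, so none is assumed here.

```python
# Channel budget for basic-rate ISDN "2B+D", using the rates quoted in the abstract.
# All names here are hypothetical and chosen for illustration only.
B_CHANNEL_KBPS = 64  # each circuit-switched B channel
D_CHANNEL_KBPS = 16  # the packet-switched D channel
NUM_B_CHANNELS = 2


def basic_rate_capacity_kbps(num_b=NUM_B_CHANNELS,
                             b_rate=B_CHANNEL_KBPS,
                             d_rate=D_CHANNEL_KBPS):
    """Return the combined rate of the B and D channels in kbit/s."""
    return num_b * b_rate + d_rate


# Circuit-switched capacity alone (the two B channels).
bearer_kbps = NUM_B_CHANNELS * B_CHANNEL_KBPS

print(f"2B   = {bearer_kbps} kbit/s")
print(f"2B+D = {basic_rate_capacity_kbps()} kbit/s")
```

The two B channels together carry 128 kbit/s, and the D channel brings the total to 144 kbit/s.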

Cite as: Berkley, D.A., Flanagan, J.L. (1990) Integration of speech recognition, text-to-speech synthesis, and talker verification into a hands-free audio/image teleconferencing system (humanet). Proc. First International Conference on Spoken Language Processing (ICSLP 1990), 861-864, doi: 10.21437/ICSLP.1990-228

@inproceedings{berkley90_icslp,
  author={D. A. Berkley and James L. Flanagan},
  title={{Integration of speech recognition, text-to-speech synthesis, and talker verification into a hands-free audio/image teleconferencing system (humanet)}},
  year=1990,
  booktitle={Proc. First International Conference on Spoken Language Processing (ICSLP 1990)},
  pages={861--864},
  doi={10.21437/ICSLP.1990-228}
}