This report describes the design and implementation of a digital teleconferencing system that integrates several speech technologies with image and data facilities. The aim is to provide a variety of sophisticated communication features that are easy to learn and use. The system, called HuMaNet (Human/Machine Network), is controlled interactively and entirely hands-free by natural speech. It combines speech recognition, text-to-speech synthesis, and talker verification with autodirective microphone arrays, image compression, and data and hypertext management to provide high-quality audio and image conferencing over basic-rate ISDN (Integrated Services Digital Network). The present public-switched transport capacity is "2B+D": two 64 kbit/s circuit-switched channels (2B) and one 16 kbit/s packet-switched channel (D).
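The 2B+D capacity quoted above can be worked through as a short sketch. The channel rates are taken from the abstract; the constant and function names are illustrative assumptions, and nothing here reflects how HuMaNet actually allocates audio, image, or data traffic across the channels.

```python
# Basic-rate ISDN "2B+D" channel rates, as stated in the abstract.
B_CHANNEL_BPS = 64_000   # one circuit-switched bearer (B) channel, bit/s
D_CHANNEL_BPS = 16_000   # one packet-switched (D) channel, bit/s
NUM_B_CHANNELS = 2       # the "2" in 2B+D


def basic_rate_capacity_bps(num_b=NUM_B_CHANNELS,
                            b_bps=B_CHANNEL_BPS,
                            d_bps=D_CHANNEL_BPS):
    """Return (circuit_switched, packet_switched, total) capacity in bit/s.

    Hypothetical helper for illustration only.
    """
    circuit = num_b * b_bps
    return circuit, d_bps, circuit + d_bps


if __name__ == "__main__":
    circuit, packet, total = basic_rate_capacity_bps()
    print(f"circuit-switched (2B): {circuit} bit/s")  # 128000
    print(f"packet-switched (D):   {packet} bit/s")   # 16000
    print(f"aggregate (2B+D):      {total} bit/s")    # 144000
```

The aggregate of 144 kbit/s is the full budget within which the audio, image, and data facilities must fit, which is why image compression appears among the integrated technologies.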
Cite as: Berkley, D.A., Flanagan, J.L. (1990) Integration of speech recognition, text-to-speech synthesis, and talker verification into a hands-free audio/image teleconferencing system (humanet). Proc. First International Conference on Spoken Language Processing (ICSLP 1990), 861-864, doi: 10.21437/ICSLP.1990-228
@inproceedings{berkley90_icslp, author={D. A. Berkley and James L. Flanagan}, title={{Integration of speech recognition, text-to-speech synthesis, and talker verification into a hands-free audio/image teleconferencing system (humanet)}}, year=1990, booktitle={Proc. First International Conference on Spoken Language Processing (ICSLP 1990)}, pages={861--864}, doi={10.21437/ICSLP.1990-228} }