Authors: Florian Hinterleitner
- Develops a set of five universal perceptual quality dimensions for TTS signals
- Introduces a test protocol for the assessment of the five dimensions in a listening test
- Investigates factors influencing the five perceptual quality dimensions
- Presents different approaches to instrumental quality assessment of synthetic speech
- Examines the integration of an instrumental quality assessment model into a TTS system for quality improvement
- Includes supplementary material: sn.pub/extras
Part of the book series: T-Labs Series in Telecommunication Services (TLABS)
Table of contents (8 chapters)
About this book
This book reviews research on the perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) absence of disturbances, and (v) calmness. It introduces a test protocol for efficiently identifying those dimensions in a listening test and examines several factors that influence them. It also introduces, reviews, and tests different techniques for the instrumental quality assessment of TTS signals. Finally, it examines the requirements for integrating an instrumental quality measure into a concatenative TTS system.
Authors and Affiliations
Quality and Usability Lab, Institute of Software Engineering and Theoretical Computer Science, Berlin Institute of Technology, Berlin, Germany
Florian Hinterleitner
Bibliographic Information
Book Title: Quality of Synthetic Speech
Book Subtitle: Perceptual Dimensions, Influencing Factors, and Instrumental Assessment
Authors: Florian Hinterleitner
Series Title: T-Labs Series in Telecommunication Services
DOI: https://doi.org/10.1007/978-981-10-3734-4
Publisher: Springer Singapore
eBook Packages: Engineering, Engineering (R0)
Copyright Information: Springer Nature Singapore Pte Ltd. 2017
Hardcover ISBN: 978-981-10-3733-7 (published 18 April 2017)
Softcover ISBN: 978-981-10-9953-3 (published 29 July 2018)
eBook ISBN: 978-981-10-3734-4 (published 07 April 2017)
Series ISSN: 2192-2810
Series E-ISSN: 2192-2829
Edition Number: 1
Number of Pages: XVI, 157
Number of Illustrations: 29 b/w illustrations
Topics: Signal, Image and Speech Processing, User Interfaces and Human Computer Interaction