Abstract
We present a novel, microphone-independent method for the visualization of speakers. To address the lack of microphone independence, we present two methods that reduce the influence of the recording conditions on the visualization. The first registers maps created from identical speakers recorded under different conditions, i.e., different microphones and distances, in two steps: dimension reduction followed by linear registration of the maps. The second is an extension of the Sammon mapping that performs a non-linear registration during the dimension reduction procedure. The proposed method surpasses the two-step registration approach, with a mapping error ranging from 17 % to 24 % and a grouping error close to zero.
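For orientation, the quantity that a classical Sammon mapping minimizes can be sketched in a few lines. This is only the standard Sammon stress, not the extension proposed in the paper, and it is not necessarily the "mapping error" reported above. The function names and the toy coordinates are hypothetical and chosen purely for illustration.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distance for every unordered pair (i, j) with i < j."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }


def sammon_stress(high_dim, low_dim):
    """Standard Sammon stress between points and their low-dimensional map.

    Assumes all high-dimensional points are distinct, so that no original
    distance is zero (otherwise the division below would fail).
    """
    d_high = pairwise_distances(high_dim)
    d_low = pairwise_distances(low_dim)
    scale = sum(d_high.values())
    # Squared distance errors, each weighted by the inverse original distance,
    # so that small original distances are preserved more strictly.
    error = sum((d_high[p] - d_low[p]) ** 2 / d_high[p] for p in d_high)
    return error / scale


# Toy data, made up for illustration: three points in 3-D and two 2-D maps.
high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
faithful = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]   # preserves all distances
distorted = [(0.0, 0.0), (5.0, 0.0), (0.0, 1.0)]  # moves the third point

print(sammon_stress(high, faithful))
print(sammon_stress(high, distorted))
```

A map that preserves every pairwise distance yields a stress of (numerically) zero, while any distortion increases it. A Sammon mapping searches for low-dimensional coordinates that make this value as small as possible.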
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Maier, A., Exner, J., Steidl, S., Batliner, A., Haderlein, T., Nöth, E. (2008). An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_49
DOI: https://doi.org/10.1007/978-3-540-87391-4_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87390-7
Online ISBN: 978-3-540-87391-4
eBook Packages: Computer Science (R0)