An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies

Maier, Andreas; Exner, Julian; Steidl, Stefan; Batliner, Anton; Haderlein, Tino; Nöth, Elmar

doi:10.1007/978-3-540-87391-4_49

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5246)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

We present a novel method for the visualization of speakers that is independent of the microphone. To address the lack of microphone independence, we present two methods that reduce the influence of the recording conditions on the visualization. The first registers maps created from identical speakers recorded under different conditions, i.e., with different microphones and at different distances, in two steps: dimension reduction followed by linear registration of the maps. The second is an extension of the Sammon mapping that performs a non-linear registration during the dimension reduction procedure. The extended Sammon mapping surpasses the two-step registration approach, with a mapping error ranging from 17 % to 24 % and a grouping error close to zero.
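The abstract gives no formulas, so the following is only a minimal sketch of the two-step baseline it describes, under stated assumptions. The stress function is the standard criterion from Sammon's 1969 paper. The affine least-squares fit is an assumed stand-in for the "linear registration" step, whose exact form the abstract does not specify. All function names are hypothetical, and the sketch does not implement the proposed extension of the Sammon mapping.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(high_dist, low_points):
    """Sammon's stress between given distances and a low-dimensional map.

    high_dist is an (n, n) matrix of distances in the original space;
    all off-diagonal entries are assumed to be strictly positive.
    """
    upper = np.triu_indices_from(high_dist, k=1)  # each pair i < j once
    d_high = high_dist[upper]
    d_low = pairwise_distances(low_points)[upper]
    return ((d_high - d_low) ** 2 / d_high).sum() / d_high.sum()


def register_linear(source_map, target_map):
    """Assumed registration: least-squares affine fit of source onto target.

    Row i of both maps is taken to be the same speaker recorded under
    two different conditions.
    """
    ones = np.ones((source_map.shape[0], 1))
    design = np.hstack([source_map, ones])  # append a translation column
    coeffs, *_ = np.linalg.lstsq(design, target_map, rcond=None)
    return design @ coeffs
```

In this sketch, a map would first be computed separately for each recording condition by minimizing `sammon_stress`, and `register_linear` would then bring one map into the coordinate frame of the other. The extension described in the abstract instead performs the registration non-linearly during the dimension reduction itself.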

Author information

Authors and Affiliations

Lehrstuhl für Mustererkennung (Informatik 5), Universität Erlangen-Nürnberg, Martensstraße 3, 91058, Erlangen, Germany
Andreas Maier, Julian Exner, Stefan Steidl, Anton Batliner, Tino Haderlein & Elmar Nöth

Editor information

Petr Sojka, Aleš Horák, Ivan Kopeček, Karel Pala

About this paper

Cite this paper

Maier, A., Exner, J., Steidl, S., Batliner, A., Haderlein, T., Nöth, E. (2008). An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_49

DOI: https://doi.org/10.1007/978-3-540-87391-4_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87390-7
Online ISBN: 978-3-540-87391-4
eBook Packages: Computer Science, Computer Science (R0)
