
An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies

  • Conference paper
Text, Speech and Dialogue (TSD 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5246)


Abstract

We present a novel method for the visualization of speakers that is independent of the microphone. Because existing visualizations lack microphone independence, we present two methods that reduce the influence of the recording conditions on the visualization. The first registers maps created from identical speakers recorded under different conditions, i.e., different microphones and distances, in two steps: dimension reduction followed by a linear registration of the maps. The second is an extension of the Sammon mapping that performs a non-linear registration during the dimension reduction procedure. The proposed extension surpasses the two-step registration approach, with a mapping error ranging from 17 % to 24 % and a grouping error close to zero.
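The two-step baseline described in the abstract (dimension reduction, then linear registration of the resulting maps) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the synthetic 2-D "speaker maps", and the use of a least-squares affine fit as the linear registration are all assumptions made for illustration. The stress function follows Sammon's standard error definition, and the non-linear registration of the proposed extension is not reproduced here because the abstract gives no details of it.

```python
import numpy as np


def pairwise_distances(points):
    """Euclidean distance matrix between all rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def sammon_stress(high_dist, low_points):
    """Sammon's error between target distances and a low-dimensional map.

    E = (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij over all pairs i < j.
    Assumes all off-diagonal target distances are non-zero.
    """
    upper = np.triu_indices(high_dist.shape[0], k=1)
    d_high = high_dist[upper]
    d_low = pairwise_distances(low_points)[upper]
    return ((d_high - d_low) ** 2 / d_high).sum() / d_high.sum()


def fit_linear_registration(source, target):
    """Least-squares affine map (A, b) such that source @ A + b ~ target."""
    design = np.hstack([source, np.ones((source.shape[0], 1))])
    params, *_ = np.linalg.lstsq(design, target, rcond=None)
    return params[:-1], params[-1]


def apply_registration(points, A, b):
    """Apply the affine map returned by fit_linear_registration."""
    return points @ A + b


# Hypothetical data: the same 10 speakers mapped under two recording
# conditions; the second map is a rotated, scaled and shifted copy.
rng = np.random.default_rng(0)
map_close = rng.normal(size=(10, 2))
theta = 0.5
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta), np.cos(theta)]])
map_far = 1.3 * map_close @ rotation + np.array([2.0, -1.0])

# Step 2 of the baseline: register the second map onto the first.
A, b = fit_linear_registration(map_far, map_close)
registered = apply_registration(map_far, A, b)
residual = np.abs(registered - map_close).max()
print("max registration residual:", residual)
```

In this toy setting the distortion between the two maps is itself affine, so a linear registration removes it almost exactly. Real recording conditions need not distort the map linearly, which is the case the non-linear registration in the abstract is meant to address.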



Editor information

Petr Sojka Aleš Horák Ivan Kopeček Karel Pala


Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maier, A., Exner, J., Steidl, S., Batliner, A., Haderlein, T., Nöth, E. (2008). An Extension to the Sammon Mapping for the Robust Visualization of Speaker Dependencies. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2008. Lecture Notes in Computer Science, vol 5246. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87391-4_49


  • DOI: https://doi.org/10.1007/978-3-540-87391-4_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87390-7

  • Online ISBN: 978-3-540-87391-4

  • eBook Packages: Computer Science, Computer Science (R0)
