Semantic speech recognition in the Basque context Part I: cross-lingual approaches

Barroso, Nora; López de Ipiña, Karmele; Barroso, Odei; Ezeiza, Aitzol; Hernández, Carmen; Graña, Manuel

doi:10.1007/s10772-011-9111-7

Published: 06 October 2011

Volume 15, pages 33–40, (2012)

International Journal of Speech Technology

Nora Barroso¹,
Karmele López de Ipiña²,
Odei Barroso¹,
Aitzol Ezeiza²,
Carmen Hernández² &
Manuel Graña²

Abstract

This work, divided into Parts I and II, describes the development of GorUP, a Semantic Speech Recognition System for the Basque context. Part I analyses cross-lingual approaches oriented to under-resourced languages, and Part II covers the development of the Language Identification system. To overcome the lack of resources, the development uses data optimization methods and Soft Computing methodologies suited to complex environments. Three languages coexist in this context: French, Spanish and Basque. Our main goal is the development of robust Automatic Speech Recognition (ASR) systems for Basque, but the variability across all three languages has to be analyzed. Basque speakers mix not only sounds but also words of the three languages in their speech, which results in a strong presence of cross-lingual elements. In addition, Basque is an agglutinative language with a special morpho-syntactic structure inside words, which may lead to intractable vocabularies. Our current work is oriented to Information Retrieval, mainly for small internet mass media. In these cases the available resources for Basque in general, and for this task in particular, are scarce and difficult to process because of the noisy environment. Thus, the methods employed in this development (an ontology-based approach or cross-lingual methodologies that profit from better-resourced languages) could suit the requirements of many under-resourced languages.

Author information

Authors and Affiliations

Irunweb Enterprise, Auzolan 2B – 2, Irun, 20303, Basque Country, Spain
Nora Barroso & Odei Barroso
Grupo de Inteligencia Computacional, University of the Basque Country UPV/EHU, Plaza de Europa 1, 20008, Donostia, Spain
Karmele López de Ipiña, Aitzol Ezeiza, Carmen Hernández & Manuel Graña

Corresponding author

Correspondence to Karmele López de Ipiña.

About this article

Cite this article

Barroso, N., López de Ipiña, K., Barroso, O. et al. Semantic speech recognition in the Basque context Part I: cross-lingual approaches. Int J Speech Technol 15, 33–40 (2012). https://doi.org/10.1007/s10772-011-9111-7

Received: 12 June 2011
Accepted: 19 August 2011
Published: 06 October 2011
Issue Date: March 2012
DOI: https://doi.org/10.1007/s10772-011-9111-7
