Hierarchical Expert Neural Network System for Speech Recognition

Rocha, Priscila; Silva, Washington; Barros, Allan

doi:10.1007/s40313-019-00459-w

Published: 11 March 2019

Volume 30, pages 347–359, (2019)

Journal of Control, Automation and Electrical Systems

Abstract

This work proposes a hierarchical architecture composed of a set of expert neural networks, based on the ensemble method with dynamic selection of classifiers, for application in speech recognition systems. To this end, 30 commands in Brazilian Portuguese were each encoded as a two-dimensional time matrix obtained by applying the discrete cosine transform to the mel-cepstral coefficients. These patterns were mapped by a nonlinear transformation to a high-dimensional space through a set of Gaussian radial basis functions (GRBFs) parameterized with the centroid and covariance of each class. Classification followed the dynamic classifier selection approach: multilayer perceptron and learning vector quantization configurations were analyzed as candidates for the multiple classifiers, each specialized in one subdivision of the full set of classes to be recognized. Given a new test pattern, the GRBF with the highest receptive-field value for the input feature vector indicates the class to which the pattern is nearest, and the pattern is then directed to the corresponding expert neural network, which provides the final classification result based on local accuracy.
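The routing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal covariance per class (the abstract only says "covariance"), uses toy two-dimensional feature vectors in place of the DCT-coded mel-cepstral matrices, and the command labels (`ligar`, `parar`, `subir`), the expert names, and the helper functions `fit_class_stats`, `grbf_activation`, and `route` are all hypothetical. The experts are plain strings here; in the paper they would be trained MLP or LVQ networks.

```python
import math


def fit_class_stats(samples_by_class, eps=1e-6):
    """Estimate a centroid and per-dimension variance for each class.

    Assumption: diagonal covariance, with a small eps added for stability.
    """
    stats = {}
    for label, samples in samples_by_class.items():
        n = len(samples)
        dim = len(samples[0])
        centroid = [sum(s[j] for s in samples) / n for j in range(dim)]
        variances = [
            sum((s[j] - centroid[j]) ** 2 for s in samples) / n + eps
            for j in range(dim)
        ]
        stats[label] = (centroid, variances)
    return stats


def grbf_activation(x, centroid, variances):
    """Gaussian receptive-field value of x for one class (1.0 at the centroid)."""
    d2 = sum((xi - ci) ** 2 / vi for xi, ci, vi in zip(x, centroid, variances))
    return math.exp(-0.5 * d2)


def route(x, stats, expert_of_class):
    """Return the class with the strongest GRBF response and its assigned expert."""
    activations = {
        label: grbf_activation(x, centroid, variances)
        for label, (centroid, variances) in stats.items()
    }
    nearest = max(activations, key=activations.get)
    return nearest, expert_of_class[nearest]


# Toy training vectors (hypothetical, not the paper's data).
train = {
    "ligar": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    "parar": [[3.0, 3.1], [2.9, 3.0], [3.2, 2.8]],
    "subir": [[-3.0, 3.0], [-3.1, 2.8], [-2.9, 3.2]],
}
# Hypothetical subdivision of the classes among two experts.
expert_of_class = {"ligar": "expert_A", "parar": "expert_A", "subir": "expert_B"}

stats = fit_class_stats(train)
nearest, expert = route([2.8, 3.0], stats, expert_of_class)
print(nearest, expert)
```

In this sketch the GRBF layer only decides which expert sees the pattern; the final label would come from that expert, which the abstract says is chosen on the basis of local accuracy.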

Author information

Authors and Affiliations

Federal University of Maranhão, Avenida dos Portugueses, 1966-Vila Bacanga, São Luís, MA, Brazil
Priscila Rocha & Allan Barros
Federal Institute of Maranhão, Avenida Getúlio Vargas, 4-Monte Castelo, São Luís, MA, Brazil
Washington Silva

Corresponding author

Correspondence to Priscila Rocha.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rocha, P., Silva, W. & Barros, A. Hierarchical Expert Neural Network System for Speech Recognition. J Control Autom Electr Syst 30, 347–359 (2019). https://doi.org/10.1007/s40313-019-00459-w

Received: 19 April 2018
Revised: 21 January 2019
Accepted: 04 March 2019
Published: 11 March 2019
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s40313-019-00459-w
