Published by De Gruyter, November 29, 2013

Forensic voice comparison by means of artificial neural networks

  • Kinga Sałapa, Agata Trawińska and Irena Roterman-Konieczna

Abstract

This article examines the effectiveness of artificial neural networks (ANNs) as a forensic voice comparison technique. The study considers two model types: feed-forward multilayer perceptron (MLP) and radial basis function (RBF) networks. Formant frequencies of the Polish vowel e (stressed or unstressed) in selected contexts were used as predictors. This choice follows an earlier investigation, which found that dynamic formant frequencies of vowels are powerful features for distinguishing voices. The results indicate that neural networks can help distinguish one speaker from others with very good accuracy, reaching 100%. MLP models should be given preference. The results also show that the choice of vowel e triad influences the correct classification rate. In addition, classification accuracy is higher when based on a single context than when similar input data are aggregated over several different contexts.
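As a rough illustration of the kind of model the abstract describes, the following minimal sketch trains a small feed-forward MLP to assign formant measurements to speakers. Everything in it is an assumption made for illustration and is not taken from the article: the three speakers, their mean F1/F2 values, the synthetic Gaussian data, the network size, and the training settings are all hypothetical, and only two static formants are used instead of the dynamic formant features studied by the authors.

```python
import math
import random

random.seed(0)

# Hypothetical mean (F1, F2) values in Hz for three invented speakers.
SPEAKER_MEANS = {0: (450.0, 1800.0), 1: (550.0, 2000.0), 2: (500.0, 1600.0)}

def make_samples(n_per_speaker):
    """Draw synthetic (F1, F2) pairs around each invented speaker's mean."""
    samples = []
    for label, (f1, f2) in SPEAKER_MEANS.items():
        for _ in range(n_per_speaker):
            samples.append(([random.gauss(f1, 15.0), random.gauss(f2, 40.0)], label))
    return samples

def fit_scaler(samples):
    """Per-feature mean and standard deviation, estimated on training data only."""
    n = len(samples)
    dims = range(len(samples[0][0]))
    means = [sum(x[j] for x, _ in samples) / n for j in dims]
    stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x, _ in samples) / n) for j in dims]
    return means, stds

def scale(samples, means, stds):
    """Standardize every feature vector with the given statistics."""
    return [([(v - m) / s for v, m, s in zip(x, means, stds)], y) for x, y in samples]

def softmax(z):
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

class MLP:
    """One tanh hidden layer, softmax output, trained by plain SGD."""

    def __init__(self, n_in, n_hidden, n_out):
        self.w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        p = softmax([sum(w * v for w, v in zip(row, h)) + b
                     for row, b in zip(self.w2, self.b2)])
        return h, p

    def train_step(self, x, y, lr):
        h, p = self.forward(x)
        # Cross-entropy gradient at the output pre-activations: p - onehot(y).
        d_out = [p[k] - (1.0 if k == y else 0.0) for k in range(len(p))]
        # Backpropagate through tanh (derivative 1 - h^2) before updating w2.
        d_hid = [(1.0 - h[j] ** 2) * sum(d_out[k] * self.w2[k][j] for k in range(len(p)))
                 for j in range(len(h))]
        for k in range(len(p)):
            for j in range(len(h)):
                self.w2[k][j] -= lr * d_out[k] * h[j]
            self.b2[k] -= lr * d_out[k]
        for j in range(len(h)):
            for i in range(len(x)):
                self.w1[j][i] -= lr * d_hid[j] * x[i]
            self.b1[j] -= lr * d_hid[j]

    def predict(self, x):
        _, p = self.forward(x)
        return p.index(max(p))

samples = make_samples(40)
random.shuffle(samples)
train_raw, held_out_raw = samples[:90], samples[90:]
means, stds = fit_scaler(train_raw)
train = scale(train_raw, means, stds)
held_out = scale(held_out_raw, means, stds)

net = MLP(n_in=2, n_hidden=6, n_out=3)
for _ in range(200):
    random.shuffle(train)
    for x, y in train:
        net.train_step(x, y, lr=0.05)

accuracy = sum(net.predict(x) == y for x, y in held_out) / len(held_out)
print(f"held-out accuracy: {accuracy:.2f}")
```

The synthetic speakers are deliberately well separated, so the accuracy printed here says nothing about performance on real recordings. The sketch only shows the mechanics: standardize the formant predictors, train the network on labelled tokens, and score it on tokens it has not seen.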


Corresponding author: Kinga Sałapa, Department of Bioinformatics and Telemedicine, Jagiellonian University Medical College, św. Łazarza 16, 31-530 Kraków, Poland, E-mail:

Special thanks to colleagues from the Institute of Forensic Research for sharing voice recordings.

Conflict of interest statement

Authors’ conflict of interest disclosure: The authors stated that there are no conflicts of interest regarding the publication of this article. Research funding played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.

Research funding: This work is supported by a scientific grant of the Jagiellonian University Medical College (K/ZDS/003959).

Employment or leadership: None declared.

Honorarium: None declared.

Received: 2013-10-19
Accepted: 2013-11-05
Published Online: 2013-11-29
Published in Print: 2013-12-01

©2013 by Walter de Gruyter Berlin Boston

Downloaded on 26.4.2024 from https://www.degruyter.com/document/doi/10.1515/bams-2013-0153/html