Abstract
Computer vision systems for monitoring people and collecting valuable demographics in a social environment will play an increasingly important role in enhancing the user's experience and can significantly improve the intelligibility of a human-computer interaction (HCI) system. For example, a robust gender classification system is expected to provide a basis for passive surveillance and for access to a smart building using demographic information, or can provide valuable consumer statistics in a public place. The option of an audio cue in addition to the visual cue promises a robust solution with high accuracy and ease of use in HCI systems.
This paper investigates the use of Support Vector Machines (SVMs) for gender classification. Both visual (thumbnail frontal face) and audio (features from speech data) cues were considered in designing the classifier, and the performance obtained with each cue was compared. The performance of the SVM was compared with that of two simple classifiers, namely the nearest prototype neighbor and the k-nearest neighbor, on all feature sets. The SVM outperformed the other two classifiers on all datasets. The best overall classification rates obtained using the SVM for the visual and speech data were 95.31% and 100%, respectively.
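The three-way comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic 2-D points standing in for face or speech features, the SVM is a plain linear one trained by subgradient descent on the hinge loss, and every function name and parameter value (`lam`, `lr`, `epochs`, `k`) is an assumption chosen for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def make_data(n_per_class, rng):
    """Synthetic stand-in data: class -1 near (-2, -2), class +1 near (2, 2)."""
    data = []
    for label, center in ((-1, -2.0), (1, 2.0)):
        for _ in range(n_per_class):
            x = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((x, label))
    rng.shuffle(data)
    return data

def nearest_prototype(train):
    """Assign each point to the class whose mean (prototype) is closer."""
    means = {}
    for label in (-1, 1):
        pts = [x for x, y in train if y == label]
        means[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    return lambda x: min(means, key=lambda lab: sq_dist(x, means[lab]))

def knn(train, k=3):
    """Majority vote among the k nearest training points (k odd, labels +/-1)."""
    def predict(x):
        nearest = sorted(train, key=lambda item: sq_dist(x, item[0]))[:k]
        return 1 if sum(y for _, y in nearest) > 0 else -1
    return predict

def linear_svm(train, lam=0.01, lr=0.1, epochs=200):
    """Linear SVM: full-batch subgradient descent on the L2-regularized hinge loss."""
    dim, n = len(train[0][0]), len(train)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wi for wi in w], 0.0
        for x, y in train:
            # Only points inside the margin contribute to the hinge subgradient.
            if y * (sum(wi * xi for wi, xi in zip(w, x)) + b) < 1:
                for j in range(dim):
                    gw[j] -= y * x[j] / n
                gb -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, gw)]
        b -= lr * gb
    return lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

def accuracy(predict, test):
    """Fraction of test points classified correctly."""
    return sum(predict(x) == y for x, y in test) / len(test)

rng = random.Random(0)  # fixed seed so the run is repeatable
train, test = make_data(100, rng), make_data(100, rng)
classifiers = (("nearest prototype", nearest_prototype),
               ("k-NN", knn),
               ("linear SVM", linear_svm))
results = {name: accuracy(fit(train), test) for name, fit in classifiers}
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
```

On data this well separated all three classifiers should score close to each other, so the sketch shows only the evaluation protocol (same training set, same held-out test set, one accuracy per classifier), not the performance gap reported in the abstract.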
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Walawalkar, L., Yeasin, M., Narasimhamurthy, A.M., Sharma, R. (2002). Support Vector Learning for Gender Classification Using Audio and Visual Cues: A Comparison. In: Lee, SW., Verri, A. (eds) Pattern Recognition with Support Vector Machines. SVM 2002. Lecture Notes in Computer Science, vol 2388. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45665-1_12
DOI: https://doi.org/10.1007/3-540-45665-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44016-1
Online ISBN: 978-3-540-45665-0
eBook Packages: Springer Book Archive