Gender Classification using Distance Classifier and Neural Network

Abstract—Face recognition systems with large training sets for personal identification normally attain good accuracy. In this paper we present a gender classification algorithm that uses only a small training set and yields good results even with one image per person. The process involves four stages: pre-processing, feature extraction, feature selection and classification. Gender classification has become an essential task in human-computer interaction (HCI). It is used in a vast number of applications, such as passive surveillance, access control in smart buildings (restricting access to certain areas based on gender) and supermarkets, gender-targeted advertising and security investigation. So far, detection of gender from facial features has been carried out using methods such as distance classifiers and neural networks. A radial basis function network is an artificial neural network that uses radial basis functions as activation functions; its output is a linear combination of radial basis functions of the inputs and neuron parameters. Radial basis function networks have many uses, including function approximation, time-series prediction, classification and system control.


I. INTRODUCTION
Gender refers to the range of physical, biological, mental and behavioral characteristics that differentiate masculinity from femininity. Classification means categorization; gender classification is thus a binary classification technique for distinguishing male from female. This technique has become an important task because it has many applications. So far the work of recognizing gender has been done by visual observation, but the emphasis has now shifted to computers, so that a machine can decide whether a given face is male or female.
The general definition of gender classification is the genetic or environmental process used to specify an individual's physical characteristics. Gender detection is one of the most interesting and challenging fields in biometrics. The recognition process can be more efficient if it is based on features that provide some information about the class to be detected. Generally it is defined as the genetic process by which the gender of an individual is determined from their facial features [1]. Most methods for gender classification are based on training processes using several samples per person. They include the Discrete Wavelet Transform, the Discrete Cosine Transform, neural networks and distance classifiers. Preparing multiple training image samples from different points of view or under different lighting conditions is usually difficult or even impossible [2]. The difficulty we face in automatic gender classification is the variation in image quality, geometry and photometry. Gender classification has been studied comparatively little; it has been of particular interest to psychologists, while demographic data collection makes use of automatic gender classification. Many types of methods are available for gender classification, such as the appearance-based (holistic) approach and the hybrid approach [7].
Many techniques have been used to classify facial images by gender. This paper works on a particular approach: features are extracted using the Discrete Cosine Transform, selected with the zigzag algorithm, and a distance classifier then measures the distance to decide whether an image belongs to the male or female group. Template matching can easily be expressed mathematically. Let x be the feature vector for the unknown input, and let m_1, m_2, ..., m_c be templates (i.e., perfect, noise-free feature vectors) for the c classes. Then the error in matching x against m_k is given by ||x - m_k||, where ||u|| is the norm of the vector u. A minimum-error classifier computes ||x - m_k|| for k = 1 to c and chooses the class for which this error is minimum. Since ||x - m_k|| is also the distance from x to m_k, we call this a minimum-distance classifier. Clearly, a template-matching system is a minimum-distance classifier.
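As a minimal sketch of the minimum-distance rule above (the template vectors here are illustrative values, not data from the paper):

```python
import numpy as np

def minimum_distance_classify(x, templates):
    """Return the index k minimising ||x - m_k|| over the class templates."""
    errors = [np.linalg.norm(x - m) for m in templates]
    return int(np.argmin(errors))

# Hypothetical 2-class example: m_1 (male) and m_2 (female) are
# made-up feature vectors used only to demonstrate the rule.
templates = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
label = minimum_distance_classify(np.array([0.9, 0.2]), templates)
```

In practice x and the templates m_k would be the DCT-based feature vectors described later, with one template per class computed from the training images.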

II. DISTANCE CLASSIFIER
III. RADIAL BASIS FUNCTION
The idea of Radial Basis Function (RBF) networks derives from the theory of function approximation. We have already seen how Multi-Layer Perceptron (MLP) networks with a hidden layer of sigmoid units can learn to approximate functions. RBF networks take a slightly different approach. Their main features are:
1. They are two-layer feed-forward networks.
2. The hidden nodes implement a set of radial basis functions (e.g. Gaussian functions).
3. The output nodes implement linear summation functions, as in an MLP.
4. The network training is divided into two stages: first the weights from the input to hidden layer are determined, and then the weights from the hidden to output layer.
5. The training/learning is very fast.
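The two-stage training above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the hidden-layer parameters (Gaussian centres and width) are fixed first, here simply by sampling training points, and the hidden-to-output weights are then solved as a linear least-squares problem. The toy data and all names are assumptions for illustration.

```python
import numpy as np

def gaussian_rbf(X, centers, sigma):
    # Hidden-layer activations: one Gaussian per centre, shape (n_samples, n_centres)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def train_rbf(X, y, centers, sigma):
    # Stage 1 (centres, width) is fixed beforehand; stage 2 solves the
    # linear hidden-to-output weights by least squares.
    H = gaussian_rbf(X, centers, sigma)
    W, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W

def predict_rbf(X, centers, sigma, W):
    # The output is a linear combination of the radial basis functions
    return gaussian_rbf(X, centers, sigma) @ W

# Toy 1-D function-approximation example (illustrative only)
X = np.linspace(0.0, 1.0, 20)[:, None]
y = np.sin(2.0 * np.pi * X[:, 0])
centers = X[::4]                       # every 4th training point as a centre
W = train_rbf(X, y, centers, sigma=0.2)
```

Because the second stage is an ordinary linear solve rather than gradient descent through the whole network, training is fast, which is the point made in item 5 above.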

The networks are very good at interpolation [4].

A. Radial Basis Function Networks
The basic architecture of an RBF network is a 3-layer network, as shown in Fig. The input layer is simply a fan-out layer and does no processing. The second, or hidden, layer performs a non-linear mapping from the input space into a (usually) higher-dimensional space in which the patterns become linearly separable.

1) Feature Extraction
Here we have used the Discrete Cosine Transform to extract features of the input image. Like other transforms, the Discrete Cosine Transform (DCT) attempts to decorrelate the image data. After decorrelation, each transform coefficient can be encoded independently without losing compression efficiency [5].

2) The Two-Dimensional DCT
The objective here is to study the efficacy of the DCT on images, which requires extending the one-dimensional ideas to a two-dimensional space. The 2-D DCT of an N x N block p is

D(i, j) = C(i) C(j) Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} p(x, y) cos[(2x+1)iπ / 2N] cos[(2y+1)jπ / 2N]   (1)

where C(u) = sqrt(1/N) for u = 0 and C(u) = sqrt(2/N) otherwise. Here p(x, y) is the (x, y)-th element of the image represented by the matrix p, and N is the size of the block that the DCT is applied to. The equation calculates one entry, the (i, j)-th, of the transformed image from the pixel values of the original image matrix [6].

Feature Selection
Here we have used the zigzag algorithm to select the highest coefficients from the feature set.
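A minimal sketch of the 2-D DCT described above, implemented directly from the definition (a naive O(N^4) loop, sufficient for small blocks; fast implementations use separable 1-D transforms):

```python
import numpy as np

def dct2_block(p):
    """Naive 2-D DCT-II of an N x N block p, directly following the definition."""
    N = p.shape[0]
    def C(u):
        # Normalisation factor: sqrt(1/N) for the DC term, sqrt(2/N) otherwise
        return np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)
    D = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (p[x, y]
                          * np.cos((2 * x + 1) * i * np.pi / (2 * N))
                          * np.cos((2 * y + 1) * j * np.pi / (2 * N)))
            D[i, j] = C(i) * C(j) * s
    return D

# For a constant block all energy lands in the DC coefficient D[0, 0],
# which illustrates the decorrelating behaviour mentioned above.
D = dct2_block(np.ones((4, 4)))
```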

3) Zigzag Algorithm
The DCT coefficients with high variance are mainly located in the upper-left corner of the DCT matrix. Accordingly, we scan the DCT coefficient matrix in a zigzag manner starting from the upper-left corner and subsequently convert it to a one-dimensional (1-D) vector. This is similar to sorting by importance: high-importance coefficients are located in the top-left corner of each block. When a total of 16 coefficients are selected from an image, only the 1st coefficient of each of 16 DCT blocks is selected. As the number of selected coefficients increases, so does the size of the feature vector. For a feature vector of size 32, the first 2 coefficients from each DCT block are selected, and in the same manner feature vectors of size 48, 64, 128 and 256 were created [8].

In the two-dimensional Discrete Wavelet Transform (DWT), the image is split into four sub-bands, namely HH1, HL1, LH1 and LL1, as shown in Figure. The HH1, HL1 and LH1 sub-bands represent the diagonal details, horizontal features and vertical structures of the image, respectively. The LL1 sub-band is the low-resolution residual consisting of low-frequency components, and it is this sub-band which is further split at higher levels of decomposition.
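The zigzag scan of the DCT coefficient matrix described above can be sketched as follows (a generic JPEG-style zigzag traversal, assumed to match the paper's scan order):

```python
import numpy as np

def zigzag_scan(block):
    """Traverse a square matrix in zigzag order, starting at the upper-left corner."""
    n = block.shape[0]
    # Sort indices by anti-diagonal (i + j), alternating the direction of
    # travel along each anti-diagonal, as in the JPEG zigzag pattern.
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else -p[0]))
    return np.array([block[i, j] for i, j in order])

coeffs = zigzag_scan(np.arange(16).reshape(4, 4))
# The earliest entries come from the top-left (high-importance) corner;
# keeping only the first k entries yields a k-dimensional feature vector.
```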
The signal is also decomposed simultaneously using a high-pass filter h. The outputs give the detail coefficients (from the high-pass filter) and the approximation coefficients (from the low-pass filter). It is important that the two filters are related to each other; they are known as a quadrature mirror filter pair. Since half the frequencies of the signal have now been removed, half the samples can be discarded according to Nyquist's rule. The filter outputs are then subsampled by 2 (in Mallat's notation, which is the opposite of the common one, g is the high-pass and h the low-pass filter). This decomposition halves the time resolution, since only half of each filter output characterizes the signal; however, each output has half the frequency band of the input, so the frequency resolution has been doubled.
With the subsampling operator ↓,

(y ↓ k)[n] = y[kn],

the low-pass output can be written more concisely as y_low = (x * g) ↓ 2. After obtaining the 2-D Discrete Wavelet Transform of the input image as the feature set, we select only the lower-frequency components from the lower quadrant.
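A minimal sketch of one decomposition level as defined above: the signal is convolved with a low-pass filter g and a high-pass filter h, then each output is subsampled by 2 as in (y ↓ 2)[n] = y[2n]. The Haar filter pair used here is an assumed choice; the paper does not specify which wavelet it uses.

```python
import numpy as np

def dwt_level(x, g, h):
    """One DWT level: filter with g and h, then keep every second sample (↓ 2)."""
    low = np.convolve(x, g)[::2]    # approximation coefficients (low-pass branch)
    high = np.convolve(x, h)[::2]   # detail coefficients (high-pass branch)
    return low, high

s = 1.0 / np.sqrt(2.0)
g = np.array([s, s])     # Haar low-pass filter
h = np.array([s, -s])    # Haar high-pass filter, its quadrature mirror
x = np.array([4.0, 6.0, 10.0, 12.0])
low, high = dwt_level(x, g, h)
```

Applied along the rows and then the columns of an image, the same filter bank produces the LL1, LH1, HL1 and HH1 sub-bands described earlier.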

VI. CONCLUSION
Gender classification becomes a real challenge for some images. We have implemented distance-classifier and neural-network methods for gender classification, using the FEI face image database. We tested the algorithm on 50 people, both male and female, and obtained accuracy of around 92% with the neural network versus 90% with the distance classifier. The neural network classifier takes more time to execute than the distance classifier.
In future we plan to extend the work to different networks and to carry out a comparative analysis among them.