Forensic Face Sketch Recognition

A forensic sketch is matched against a database of images. A professional forensic sketch artist draws a sketch of a suspect's face from the description given by a witness. The database holds a collection of sketches and images. This paper implements an efficient recognition system for forensic sketches. First, feature vectors of the face are extracted from both the sketches and the images using SIFT and MLBP descriptors, and nearest neighbor matching is applied to match the feature vectors of sketches and images. Second, the same SIFT and MLBP feature vectors are reduced in dimension using Linear Discriminant Analysis, in which multiple projections of slices of the feature vectors are made. The images in the database are first converted into pseudo-sketches, and the pseudo-sketches are then matched with the forensic sketch by minimum distance recognition. Two standard databases of sketches and images, CUHK and IIIT-D, are used. The recognition rates of the two approaches are compared.


I. INTRODUCTION
A wide range of techniques and algorithms have been proposed and implemented for face recognition under different poses, lighting conditions, and facial expressions. Sketch recognition, however, has received relatively little attention, even though it plays a very important role in law enforcement. In many crimes, such as murder, robbery, accidents, and sexual assault, the only witness is the victim. A professional sketch artist draws a sketch from the verbal description given by the victim. Such sketches are called forensic sketches. Once the sketch is completed, an automated recognition system matches it with the images in the database maintained by law enforcement officials. In this paper, a combined technique to match a forensic sketch with the database is proposed. A conventional face recognition system needs modification because sketches and images are two different modalities. The recognition task is challenging because a witness may not remember the criminal's appearance exactly, so the sketches are often incomplete and inaccurate. The remaining sections of this paper are organized as follows. Section II reviews related work. Section III explains SIFT and MLBP feature descriptor based Nearest Neighbor Matching. Section IV explains LFDA based Minimum Distance Matching. Section V gives the result analysis using Matlab simulation. Finally, Section VI concludes the paper.

II. RELATED WORK
A forensic face sketch recognition system requires a database of forensic sketches, and these are not easily available. Most face sketch recognition systems have been implemented for viewed sketches, that is, sketches drawn while looking at the person rather than from a verbal description alone, and only a small number of sketches were used in the database. Major research work in matching viewed sketches was done by Tang et al. [3], [2], [5], [6], [7]. In their approach, a digital image obtained from a sketch and a face are matched using a recognition algorithm. In the SIFT based approach, feature vectors are extracted from sketches and images and used for matching. Although the face and sketch modalities are different, the feature vectors extracted from an image and a sketch contain the information necessary for recognition [13]. In this paper, a similar SIFT based feature is used to match the sketches with images. In addition to SIFT, a Multi-scale Local Binary Pattern, a histogram based feature description, is also used. A similar matching algorithm is used by Liao [14]. Finally, both of these large feature vectors are arranged into many slices of smaller dimension using an efficient discriminant classifier, LFDA, proposed by Lei and Li [15].

III. SIFT AND MLBP BASED SKETCH RECOGNITION
A SIFT based representation of the test sketch and of the images in the database is computed. The feature vectors are computed for uniformly distributed sub-regions of the face, so the image is sampled on a regular grid. The sampling points are determined by two quantities, a patch size P and a displacement size d. The patch size P is the side of the square window over which the feature vector of an image is computed. The displacement size d is the patch displacement between adjacent samples, in pixels. The difference (P − d) gives the number of overlapping pixels between two adjacent patches. A P × P window slides across an image of size H × W. The number of sampling locations along the height is M = (H − P)/d + 1 and, similarly, the number along the width is N = (W − P)/d + 1. At each sampling location an image feature vector of size t is computed. These feature vectors are concatenated into a one-dimensional feature vector ɸ of size M × N × t. Nearest Neighbor sketch matching is then applied to the feature vectors.
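The sampling grid described above can be illustrated with a short sketch. It assumes the usual sliding-window count of (H − P)/d + 1 positions per axis, rounded down when the division is not exact; the function names and the example sizes are hypothetical and are not taken from the paper.

```python
def sampling_grid(height, width, patch, step):
    """Return (M, N, corners) for a patch x patch window moved by `step` pixels.

    Assumption: positions per axis = floor((size - patch) / step) + 1.
    """
    if step <= 0 or patch > height or patch > width:
        raise ValueError("patch must fit in the image and step must be positive")
    m = (height - patch) // step + 1  # sampling locations along the height
    n = (width - patch) // step + 1   # sampling locations along the width
    # Top-left corner (row, column) of every window, in raster-scan order.
    corners = [(r * step, c * step) for r in range(m) for c in range(n)]
    return m, n, corners


def concatenated_length(m, n, t):
    """Length of the concatenated feature vector when each patch yields t values."""
    return m * n * t


if __name__ == "__main__":
    m, n, corners = sampling_grid(height=200, width=250, patch=32, step=16)
    print(m, n, len(corners), concatenated_length(m, n, 128))
```

With a patch of 32 pixels and a displacement of 16 pixels, adjacent patches overlap by 16 pixels, which matches the (P − d) overlap stated in the text.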

A. Scale Invariant Feature Transform Feature Descriptor Algorithm
SIFT, the Scale-Invariant Feature Transform [11], is a feature descriptor algorithm published by David Lowe that extracts the necessary feature vectors from an image. The feature description vectors describe the important points of an image. These vectors should remain detectable under differences in scale, noise, background interference, and lighting. Such invariant vectors extracted from the test sketch are compared with the feature vectors of the images in the database. Vectors that remain invariant tend to lie in high contrast regions, such as the edges of an image. SIFT is one such feature vector extraction algorithm: the vectors it extracts are largely invariant to these variations, which reduces the errors they cause. The steps involved in the extraction are as follows.
Step 1: Building a Blurred Image. The original image is I(x, y), where x and y are the coordinates of the image I. Several octaves of the original image are generated. Every octave is half the size of the previous octave. Within an octave, the images are blurred using the Gaussian blur. B(x, y, σ) represents the blurred image and G(x, y, σ) represents the blur operator, where σ is the amount of blur. Mathematically, the blurred image is obtained by convolving the Gaussian blur G(x, y, σ) with the original image I(x, y):
B(x, y, σ) = G(x, y, σ) * I(x, y)    (1)
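A minimal sketch of Step 1 follows, written in plain Python on lists of rows. The kernel radius of 3σ, the replicated border, and the function names are illustrative assumptions, not details given in the paper.

```python
import math


def gaussian_kernel(sigma):
    """1-D Gaussian weights, truncated at 3*sigma (assumption) and normalized to sum to 1."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _convolve_rows(image, kernel):
    """Convolve every row with a symmetric kernel, replicating border pixels."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(w * row[min(max(x + k - radius, 0), n - 1)]
                for k, w in enumerate(kernel))
            for x in range(n)
        ])
    return out


def _transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """B = G * I, computed separably: blur the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    rows_blurred = _convolve_rows(image, kernel)
    return _transpose(_convolve_rows(_transpose(rows_blurred), kernel))


def next_octave(image):
    """Halve the image size by keeping every second row and column."""
    return [row[::2] for row in image[::2]]
```

Because the Gaussian is separable, blurring rows and then columns gives the same result as a full two-dimensional convolution at lower cost.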
Step 2: Detecting Keypoint Locations. Consecutive images in an octave are subtracted, and this process is repeated for all the octaves. The resulting difference of Gaussian images approximate the scale-invariant Laplacian of Gaussian and are useful for detecting keypoints. The difference of Gaussian image is D(x, y, σ):
D(x, y, σ) = B(x, y, kσ) − B(x, y, σ)    (2)
where B(x, y, kσ) is the image blurred at k times the scale of B(x, y, σ).
Step 4: Elimination of Low Contrast and Edge Regions. In this step, keypoints that have low contrast and keypoints that lie along edges are eliminated. The keypoints are first refined using a Taylor expansion. If the magnitude of the intensity at the current pixel of a difference of Gaussian image is less than a defined value, that keypoint is eliminated. Second, two mutually perpendicular gradients are calculated at each keypoint. In a flat region, both gradients are small. Along an edge, one gradient is large and the other is small. At a corner, both gradients are large. Keypoints at corners are retained, while keypoints on edges and in flat regions are rejected. This test can be carried out mathematically using the Hessian matrix.
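The difference of Gaussian and the two rejection tests can be sketched as follows. The contrast threshold of 0.03 and the curvature ratio of 10 are the values commonly quoted from Lowe's SIFT paper and are assumptions here, since this paper does not state its thresholds; the function names are hypothetical.

```python
def difference_of_gaussian(blur_k_sigma, blur_sigma):
    """D(x, y, sigma) = B(x, y, k*sigma) - B(x, y, sigma), pixel by pixel."""
    return [[a - b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(blur_k_sigma, blur_sigma)]


def keep_keypoint(dog, y, x, contrast_threshold=0.03, edge_ratio=10.0):
    """Return True if the keypoint at (y, x) passes the contrast and edge tests.

    (y, x) must not lie on the image border, because finite differences
    of the neighbouring pixels are used to build the 2x2 Hessian.
    """
    # Low-contrast test: reject weak responses.
    if abs(dog[y][x]) < contrast_threshold:
        return False
    # Second derivatives by finite differences.
    dxx = dog[y][x + 1] + dog[y][x - 1] - 2.0 * dog[y][x]
    dyy = dog[y + 1][x] + dog[y - 1][x] - 2.0 * dog[y][x]
    dxy = (dog[y + 1][x + 1] - dog[y + 1][x - 1]
           - dog[y - 1][x + 1] + dog[y - 1][x - 1]) / 4.0
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    # Non-positive determinant: curvatures differ in sign or one is zero (edge-like).
    if det <= 0:
        return False
    # Edge test: the ratio of principal curvatures must stay below edge_ratio.
    return (trace * trace) / det < ((edge_ratio + 1.0) ** 2) / edge_ratio
```

An isolated blob has two comparable curvatures and passes the test, whereas a straight ridge has one near-zero curvature and is rejected, which is the edge versus corner distinction described in Step 4.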
A histogram is an estimate of the probability distribution of the gradients around a keypoint. It identifies the most prominent gradient orientation. If the histogram has only one peak, that orientation is assigned to the keypoint. If there are several peaks above the 80 to 85% mark, each of them is converted into a new keypoint. A distinct feature vector of 128 numbers is then generated for each keypoint.
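The peak selection can be sketched as below. The 36-bin histogram (10 degrees per bin) follows Lowe's SIFT paper and is an assumption here; the 0.8 fraction corresponds to the lower end of the 80 to 85% mark mentioned in the text, and the function names are hypothetical.

```python
def orientation_histogram(gradients, bins=36):
    """Accumulate gradient magnitudes into orientation bins.

    gradients: iterable of (magnitude, angle_in_degrees) pairs.
    """
    hist = [0.0] * bins
    width = 360.0 / bins
    for magnitude, angle in gradients:
        hist[int((angle % 360.0) // width) % bins] += magnitude
    return hist


def dominant_orientations(hist, peak_fraction=0.8):
    """Return the bin-centre angle of every local peak within peak_fraction of the maximum."""
    top = max(hist)
    if top <= 0:
        return []
    n = len(hist)
    width = 360.0 / n
    peaks = [
        (i + 0.5) * width
        for i, value in enumerate(hist)
        # Circular local maximum that is also high enough.
        if value >= peak_fraction * top
        and value > hist[i - 1]
        and value >= hist[(i + 1) % n]
    ]
    # Degenerate case (completely flat histogram): fall back to the first maximum.
    return peaks or [(hist.index(top) + 0.5) * width]


if __name__ == "__main__":
    samples = [(1.0, 5), (2.0, 8), (2.5, 95), (0.5, 200)]
    print(dominant_orientations(orientation_histogram(samples)))
```

Each returned angle would give rise to its own keypoint at the same location, which is how a single location can produce more than one descriptor.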

B. Generation of Multiscale Local Binary Pattern Feature Vectors
An image is partitioned into sub-windows. In each window, Multiscale Local Binary Pattern histograms are obtained, normalized, and concatenated into one feature vector for that window. A 16 × 16 window of pixels is taken and split into sixteen 4 × 4 sub-windows. For each sub-window, a histogram of 8 equal parts, called bins, is generated, and the gradient orientations from the 4 × 4 sub-window are placed into the appropriate bins. This is repeated for all sixteen 4 × 4 sub-windows. The resulting 128 values (16 sub-windows × 8 bins) are normalized. These features carry sufficient information to determine the identity of an image and are well suited for sketch recognition.
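The 16 × 16 window layout described above can be sketched as follows. The sketch takes precomputed gradient magnitudes and orientations as input and uses L2 normalization; both choices, and the function name, are assumptions. It also omits the Gaussian weighting and bin interpolation used in full SIFT implementations.

```python
import math


def window_descriptor(magnitudes, angles):
    """Build a 128-value descriptor from a 16x16 window.

    magnitudes, angles: 16x16 lists of gradient magnitude and orientation
    (degrees). Each 4x4 sub-window contributes an 8-bin orientation histogram.
    """
    descriptor = []
    for top in range(0, 16, 4):
        for left in range(0, 16, 4):
            hist = [0.0] * 8  # 8 bins of 45 degrees each
            for y in range(top, top + 4):
                for x in range(left, left + 4):
                    bin_index = int((angles[y][x] % 360.0) // 45.0) % 8
                    hist[bin_index] += magnitudes[y][x]
            descriptor.extend(hist)
    # Normalize to unit length so the descriptor is less sensitive to contrast.
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor
```

Sixteen sub-windows with 8 bins each give the 128 values mentioned in the text.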

C. System Architecture of SIFT and MLBP
A sketch or its corresponding photo is pre-processed. Each image is split into patches of size P × P, and the SIFT feature vectors are computed for each patch. Because an image of size W × H has high resolution, many such features have to be computed for a single image. A window of size P × P slides across the image in raster scan order. M × N SIFT features are computed, as discussed in Section III. This yields a set of V 128-dimensional features for each image, where V = M × N.
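Once every sketch and photo has been reduced to one concatenated feature vector, the nearest neighbor matching mentioned in Section III can be sketched as follows. The Euclidean distance and the dictionary-based gallery are illustrative assumptions; the paper does not state which distance measure it uses.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(probe, gallery):
    """Return (identity, distance) of the gallery vector closest to the probe.

    gallery: dict mapping an identity label to its feature vector.
    """
    best_id = min(gallery, key=lambda ident: euclidean(probe, gallery[ident]))
    return best_id, euclidean(probe, gallery[best_id])


if __name__ == "__main__":
    gallery = {"subject_1": [0.0, 1.0, 0.0], "subject_2": [1.0, 0.0, 0.0]}
    print(nearest_neighbor([0.9, 0.2, 0.0], gallery))
```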

B. Face Sketch Synthesis System
The images in the database are first synthesized into sketches using the photo Eigen space synthesis system. Matching is then performed in a single modality, the sketch Eigen space, using minimum distance. This section deals with the photo-to-sketch transformation. A sketch differs from an image in texture and shape, so the two are separated and the Eigen transformation is applied separately to the shape and to the texture, as shown in Figure 10. The shape of the face is represented as a graph containing a set of fiducial points, shown in Figure 11. The shapes of the sketch and the image are taken to be linearly related, and the fiducial points of the sketch and the image are in the same positions. The texture transformation is based on the grayscale values around the fiducial points. The steps involved in the synthesis of a sketch are as follows:
1) Identify the reference set of fiducial points on the graph model of the image to extract the shape. The shape of the input photo is represented as a vector, and its texture is obtained.
2) Apply the transformation of shape and texture to obtain those of the sketch (Equation 4).
Figure 12: Face sketch synthesis results. (a) Face photo; (b) sketch synthesized using separate Eigen transformations on texture and shape; (c) real sketch.

C. Minimum Distance Matching
The Eigen vectors of the images and of the input sketch are computed and compared.
Step 1: Compute the photo Eigen space from the image sample matrix. Similarly, compute the sketch Eigen space from the sketch sample matrix.
Step 2: Use these Eigen spaces to compute the pseudo-sketch for each image in the database.
Step 3: Compute the Eigen weight vector of each pseudo-sketch in the sketch Eigen space. Similarly, compute the Eigen weight vector of the probe sketch in the same space. Finally, the pseudo-sketch whose weight vector has the minimum distance to that of the probe sketch is selected, and the probe sketch is classified with its corresponding face.
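The final comparison of Eigen weight vectors can be sketched as below. The sketch assumes that the sketch Eigen space is already available as a mean vector and a list of orthonormal Eigen vectors, and that the pseudo-sketches have already been synthesized; the projection formula, the Euclidean distance, and all names are illustrative assumptions rather than the paper's notation.

```python
def eigen_weights(vector, mean, eigenvectors):
    """Project a flattened sketch onto the Eigen space: w_i = e_i . (vector - mean)."""
    centered = [v - m for v, m in zip(vector, mean)]
    return [sum(e * c for e, c in zip(eig, centered)) for eig in eigenvectors]


def minimum_distance_match(probe_sketch, pseudo_sketches, mean, eigenvectors):
    """Return (identity, distance) of the pseudo-sketch closest to the probe.

    pseudo_sketches: dict mapping an identity label to its flattened pseudo-sketch.
    """
    probe_w = eigen_weights(probe_sketch, mean, eigenvectors)
    best_id, best_dist = None, float("inf")
    for identity, pseudo in pseudo_sketches.items():
        w = eigen_weights(pseudo, mean, eigenvectors)
        dist = sum((a - b) ** 2 for a, b in zip(probe_w, w)) ** 0.5
        if dist < best_dist:
            best_id, best_dist = identity, dist
    return best_id, best_dist
```

Because the distance is measured between weight vectors, components of the sketches that lie outside the retained Eigen vectors do not influence the match.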

V. RESULTS AND DISCUSSIONS
Two databases were used: the CUHK database, shown in Figure 14(a), and the IIIT-D sketch database, shown in Figure 14. The CUHK sketch dataset consists of sketch and image pairs with the same background and constant illumination. The IIIT-D dataset also consists of sketch and image pairs.

Result Analysis
A comparison of the SIFT descriptor approach and LFDA with Eigen transformation is carried out. The performance of the SIFT algorithm is lower on the IIIT-D database, because the SIFT feature extraction method is more sensitive to registration errors. The recognition rates of SIFT and MLBP without LFDA and with LFDA for the CUHK and IIIT-D databases are shown in Table 1.1. The proposed algorithm matches sketches to digital images effectively. The simulation was carried out in MATLAB 7.0.

VI. CONCLUSION
Recognizing face photographs from sketches is a challenging task. The sketches are inaccurate and incomplete, and matching them with digital face images is difficult because of differences in grey level and even in shape between the sketch and the image. The key contribution of the proposed combined algorithm is its use of SIFT and MLBP feature vectors of sketches and images, which gave good recognition rates. Discriminant analysis was then applied to these SIFT and MLBP vectors to reduce their dimensionality, and the performance was further improved by LFDA. A larger database needs to be collected to understand and analyze the complexity of forensic sketch recognition.