Digital Signature: A Novel Adaptative Image Segmentation Approach



Introduction
Partitioning or segmenting an entire image into distinct, recognizable regions is a central challenge in computer vision that has received increasing attention in recent years. It is remarkable that a task so simple for humans is so hard in computer vision, but this can be explained. To humans, an image is not just a random collection of pixels; it is a meaningful arrangement of regions and objects (Figure 1 shows a variety of images). Despite the variations among these images, humans have no problem interpreting them. We can agree about the different regions in the images and recognize the different objects. Human visual grouping was studied extensively by the Gestalt psychologists. Wertheimer (1938) pointed out the importance of perceptual grouping and organization in vision and listed several key factors that lead to human perceptual grouping: similarity, proximity, continuity, symmetry, parallelism, closure and familiarity.

Fig. 1. Some challenging images for a segmentation algorithm. Our goal is to develop a single grouping procedure which can deal with all these types of images. Image source: sample images in the MIT/CMU test set for frontal face detection (Sung & Poggio (1999)).

In computer vision, these factors have been used as guidelines for many grouping algorithms, and the most studied version of grouping is image segmentation. Image segmentation techniques can be classified into two families: (1) region-based and (2) contour-based approaches.
• Region-based approaches try to find partitions of the image pixels into sets corresponding to coherent image properties such as brightness, color and texture.
• Contour-based approaches usually start with a first stage of edge detection, followed by a linking process that seeks to exploit curvilinear continuity.
On the other hand, in order to distinguish good segmentations from bad ones, the classical Gestalt theory mentioned above has developed various principles of grouping (Palmer (1999); Wertheimer (1938)), such as proximity, similarity and good continuation. As Ren & Malik (2003) pointed out, the principle of good continuation states that a good segmentation should have smooth boundaries, while the principle of similarity is twofold:
1. Intra-region similarity: the elements in a region are similar. This includes similar brightness, similar texture, and low contour energy inside the region.
2. Inter-region (dis)similarity: the elements in different regions are dissimilar. This in turn includes dissimilar brightness, dissimilar texture, and high contour energy on region boundaries.
These classical principles of grouping have inspired many previous approaches to segmentation. However, the Gestalt principles distinguish competing segmentations only when everything else is equal, and many previous works have made ad hoc decisions about how to use and combine these cues. Even to this day, many of the computational issues of perceptual grouping remain unresolved.
In this work, we present an image descriptor based on self-similarities which is able to capture the general structure of an image. Computed descriptors are similar for images with the same layout, even if textures and colors are different. Similarly to Shechtman & Irani (2007), images are partitioned into smaller cells which, suitably compared with a patch located at the image center, yield a vector of values that describes local aspect correspondences. In this Chapter, we demonstrate the effectiveness of our approach on two main topics in computer vision research: facial expression recognition and hair/skin segmentation. The outline of the Chapter is as follows. Section 2 reviews previous segmentation approaches. The theoretical concept of the Digital Signature and its main features are discussed at a high level in Section 3. Section 4 presents experimental results for classification and segmentation in order to illustrate the effectiveness and usefulness of the proposed approach. Section 5 concludes this Chapter.

Related work
Many image segmentation methods have been proposed over the last several decades, and a variety of evaluation methods have been used to compare each new method with prior ones. These methods are fundamentally very different, and can be partitioned based on the diversity in segment types:
• Algorithms for extracting uniformly colored regions (Comaniciu & Meer (2002); Shi & Malik (2000)).
Thus, most of the methods require image features that characterize the regions to be segmented. In particular, texture and color have been used independently and extensively in the area. On the other hand, some algorithms can be categorized as unsupervised (Shi & Malik (2000)), while others require user interaction (Rother et al. (2004)). Some algorithms employ symmetry cues for image segmentation (Riklin-Raviv et al. (2006)), while others use high-level semantic cues provided by object classes (i.e., class-based segmentation, see Borenstein & Ullman (2002); Leibe & Schiele (2003); Levin & Weiss (2006)).
There are also variants in the segmentation tasks, ranging from segmentation of a single input image to simultaneous segmentation of a pair of images (Rother et al. (2006)) or of multiple images. Bagon et al. (2008) proposed a single unified approach to define and extract visually meaningful image segments, without any explicit modelling. Their approach defines a 'good image segment' as one which is 'easy to compose' (like a puzzle) using its own parts. It captures a wide range of segment types, from uniformly colored segments, through textured segments, to complex objects. Shechtman & Irani (2007) proposed a segmentation approach based on a 'local self-similarity descriptor'. It captures self-similarity of color, edges, repetitive patterns and complex textures in a single unified way. These self-similarity descriptors are estimated on a dense grid of points in image/video data, at multiple scales. Our Digital Signature is based on this work.

Digital signature
We present an image descriptor based on self-similarities which is able to capture the general structure of an image. Computed descriptors are similar for images with the same layout, even if textures and colors are different, similarly to Shechtman & Irani (2007). Images are partitioned into smaller cells which, suitably compared with a main patch located in the image, yield a vector of comparison results that describes local aspect correspondences. The Digital Signature (DS) descriptor is computed from a square image subdivided into n × n cells, where each cell corresponds to an image patch of m × m pixels. The number of cells and their pixel size affect how much an image structure is generalized. A low number of cells will not capture many structural details, while too many small cells will produce an overly detailed descriptor. The present approach will consider overlapping cells, which may be required to capture subtle structural details.
Once an image is partitioned, an m × m main patch located in the image (which does not have to correspond to a cell in the image partition) is compared with all partition cells. In order to achieve greater generalization, image patches are compared by computing the Sum of Squared Differences (SSD) between pixel values (or the Sum of Absolute Differences (SAD), which is computationally less expensive). Each comparison between a cell and the main patch is stored consecutively in a DS descriptor vector of n × n dimensions.

In a more general mathematical sense, a Digital Signature z for a proposed cell p is a function that stores the cell comparison falling into each of the disjoint categories (similar to a histogram's bins), whereas the graph of a digital signature is merely one way to represent a digital signature descriptor. Thus, if we let ncells be the total number of cells, Wcsize be the width of each cell and Hcsize be the height of each cell, the descriptor meets, for each channel, the conditions stated in Equation 1.

Such a description abstracts away color, contrast and texture. Images are described in terms of their general structure, similarly to Shechtman & Irani (2007). An image showing a white upper half and a black lower half will produce exactly the same descriptor as an image showing a black upper half and a white lower half. Local aspect correspondences are exactly the same: the upper half is different from the lower half. Rotations, however, are not considered. Digital Signature descriptors are especially useful to describe points defined by a scale salient point detector (like DoG or SURF (Bay & Tuytelaars (2006))). The DS descriptor is shown as a barcode for representation purposes.

Thus, given an input image (e.g. a scale salient point or a known region such as a detected mouth), a number of cells n and their pixel size m, DS is computed as follows:
1. The image is divided into different cells inside a template sized (n × m) × (n × m) pixels.
2. The template is partitioned into n × n cells, each of them sized m × m pixels.
3. Each cell is compared with every other template cell, and each result is consecutively stored in the n × n DS descriptor vector. Thus, each cell provides its own Digital Signature.
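The main-patch variant described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function names (`patch`, `ssd`, `digital_signature`), the choice of the top-left cell as main patch and the 4 × 4 toy images are all assumptions made for the example. It reproduces the white-over-black case from the text, where the two images yield the same SSD-based descriptor.

```python
def patch(image, top, left, size):
    """Return the size x size block of `image` whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in image[top:top + size]]


def ssd(a, b):
    """Sum of Squared Differences between two equally sized patches."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def digital_signature(template, n, m, main_top, main_left, compare=ssd):
    """Compare an m x m main patch with each of the n x n cells, row by row.

    `template` is a square 2-D list of (n * m) x (n * m) gray values.
    The main patch position is a free choice (hypothetical parameters).
    """
    main = patch(template, main_top, main_left, m)
    return [compare(main, patch(template, r * m, c * m, m))
            for r in range(n) for c in range(n)]


# Toy 4 x 4 templates, n = 2 cells per side, m = 2 pixels per cell.
white_over_black = [[255] * 4, [255] * 4, [0] * 4, [0] * 4]
black_over_white = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]

ds_a = digital_signature(white_over_black, n=2, m=2, main_top=0, main_left=0)
ds_b = digital_signature(black_over_white, n=2, m=2, main_top=0, main_left=0)
print(ds_a == ds_b)  # both images share the same layout
```

Because SSD squares each difference, the sign of the contrast is lost, which is why both layouts map to the same vector; passing a different `compare` function changes that behaviour.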
In order to illustrate the effectiveness and usefulness of the proposed approach, we present two different tests. In the first test, similar Digital Signatures from different images are used to classify mouths into smiling or non-smiling gestures. In the second test, different Digital Signatures from the same image are compared in order to obtain a good image segmentation (see Figure 2).

Facial expression recognition
After extensive research in Psychology, it is now known that emotions play a significant role in human decision making processes (Damasio (1994); Picard (1997)). The ability to show and interpret them is therefore also important for human-machine interaction. Perceptual User Interfaces (PUIs) use multiple input modalities to capitalize on all the communication cues, thus maximizing the bandwidth of communication between a user and a computer.
Examples of PUIs include assistants for the disabled, augmented reality, interactive entertainment, virtual environments and intelligent kiosks. Some facial expressions can be very subtle and difficult to recognize even between humans. Besides, in human-computer interaction the range of expressions displayed is typically reduced. In front of a computer, for example, subjects rarely display expressions of surprise or anger as accentuated as those they might display when interacting with another human. The human smile is a distinct facial configuration that could be recognized by a computer with greater precision and robustness. It is also a significantly useful facial expression, as it makes it possible to sense happiness or enjoyment and even approval (and also the lack of them) (Ekman & Friesen (1982)).

Smile detection research has produced less literature than facial expression recognition in general. Lip edge features and a perceptron were used in Ito et al. (2005). The lip zone is obviously the most important, since human smiles involve mainly the Zygomatic muscle pair, which raises the mouth ends. Edge features alone, however, may be insufficient. Smile detection was also tackled in the BROAFERENCE system to assess TV or multimedia content (Kowalik et al. (2005)). This system was based on tracking the positions of a number of mouth points and using them as features feeding a neural network classifier. Shinohara & Otsu (2004), in turn, used Higher-order Local Autocorrelation, achieving recognition rates near 98%. More recently, the comprehensive work of Whitehill et al. (2008) contends that there is a large performance gap between typical tests made in the literature and results obtained in real-life conditions. The authors conclude that the training set may be all-important, especially in terms of variability and size, which should be on the order of thousands of images.

The Sony Cybershot DSC T-200 digital camera has an ingenious "smile shutter" mode. Using proprietary algorithms, the camera automatically detects the smiling face and closes the shutter. To detect different degrees of smiling, smile detection sensitivity can be set to high, medium or low. Some reviews argue that "the technology is not still so much sensitive that it can capture minor facial changes. Your facial expression has to change considerably for the camera to realize that" (Entertainment.millionface.com: Smile detection technology in camera (2008)), or that "The camera's smile detection -which is one of its more novel features -is reported to be inaccurate and touchy" (Swik.net: Sonys Cyber-shot T200 gets its first review (2008)). Whatever the case, neither detection rates nor details of the algorithm are available, so it is difficult to compare results with this system. Canon also has a similar smile detection system.

This section describes different techniques that are applied to the smile detection problem in video streams and shows the benefits of our new approach.

Representation
In order to show the performance of our new approach, it will be compared with other image descriptors such as Local Binary Patterns and Principal Components Analysis. The Local Binary Pattern (LBP) is an image descriptor commonly used for classification and retrieval. Introduced by Ojala et al. (2002) for texture classification, LBPs are characterized by invariance to monotonic changes in illumination and by a low processing cost. Given a pixel, the LBP operator thresholds the circular neighborhood within a given distance by the pixel gray value, and labels the center pixel considering the result as a binary pattern. The basic version considers the pixel as the center of a 3 × 3 window and builds the binary pattern based on the eight neighbors of the center pixel, as shown in Figure 3. However, the LBP definition can easily be extended to any radius R, considering P neighbors (Ojala et al. (2002)):

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^{p}, \qquad s(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0 \end{cases} \qquad (2)$$

where g_c is the gray value of the center pixel and g_p is the gray value of its p-th neighbor on the circle of radius R.

Rotation invariance is achieved in the LBP-based representation by considering the local binary pattern as circular. The experience reported by Ojala et al. (2002) suggested that only a particular subset of local binary patterns is typically present in most of the pixels contained in real images. They refer to these patterns as uniform. Uniform patterns are characterized by the fact that they contain, at most, two bitwise transitions from 0 to 1 or vice versa. For example, 00000000, 00011100 and 10000011 are uniform patterns. In the experiments carried out by Ojala et al. (2002) with texture images, uniform patterns account for a bit less than 90% of all patterns when using the 3 × 3 neighborhood. More recently, LBPs have been used to describe facial appearance. Once the LBP image is obtained, most authors apply a histogram-based representation approach (Sébastien Marcel & Heusch (2007)).
However, as pointed out by some recent works, the histogram-based representation loses relative location information (Sébastien Marcel & Heusch (2007); Tao & Veldhuis (2007)); thus, LBP can also be used as a preprocessing method. Using LBP as a preprocessing method has the effect of emphasizing edges and noise. To reduce the influence of noise, Tao & Veldhuis (2007) recently proposed a modification of Equation 2. Instead of weighting the neighbors differently, their weights are all the same, obtaining the so-called Simplified LBPs (see Figure 3-d). Their approach has shown some benefits when applied to facial verification, because simplifying the weights makes the image more robust to illumination changes, with a maximum of nine different values per pixel. The total number of local patterns is largely reduced, so the image has a more constrained value domain.
In the experiments described below, both approaches will be investigated, i.e. using the histogram based approach, but also using Uniform LBP and Simplified LBP as a preprocessing step. In this case, only the center patch will be considered to obtain our descriptor.
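The three LBP variants discussed above (basic 3 × 3 code, uniformity test and Simplified LBP) can be illustrated with a short plain-Python sketch. The function names, the clockwise neighbor ordering starting at the top-left pixel and the sample window are assumptions made for this example; other orderings give different code values but the same uniformity and Simplified LBP results.

```python
def lbp_bits(window):
    """Threshold the eight neighbours of a 3 x 3 window by its centre value."""
    center = window[1][1]
    # Neighbours visited clockwise from the top-left pixel (assumed ordering).
    ring = [window[0][0], window[0][1], window[0][2], window[1][2],
            window[2][2], window[2][1], window[2][0], window[1][0]]
    return [1 if g >= center else 0 for g in ring]


def lbp_code(bits):
    """Basic LBP: neighbour p contributes with weight 2**p."""
    return sum(bit << p for p, bit in enumerate(bits))


def is_uniform(bits):
    """Uniform pattern: at most two 0/1 transitions when read circularly."""
    transitions = sum(bits[i] != bits[(i + 1) % len(bits)]
                      for i in range(len(bits)))
    return transitions <= 2


def simplified_lbp(bits):
    """Simplified LBP: all weights equal to 1, so the value lies in 0..8."""
    return sum(bits)


window = [[6, 5, 2],
          [7, 6, 1],
          [9, 8, 7]]
bits = lbp_bits(window)
print(bits, lbp_code(bits), is_uniform(bits), simplified_lbp(bits))
```

With equal weights the per-pixel value can only take nine different values (0 to 8), which matches the constrained value domain mentioned for Simplified LBP.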
Raw face images are high-dimensional. A classical technique applied for face representation to avoid the consequent processing overload is Principal Components Analysis (PCA) decomposition (Kirby & Sirovich (1990)). PCA decomposition is a method that reduces data dimensionality, without a significant loss of information, by performing a covariance analysis between factors. As such, it is suitable for high-dimensional data sets, such as face images. A normalized image of the target object, i.e. a face, is projected into the PCA space (see Figure 4). The appearance of the different individuals is then represented in a space of lower dimensionality by means of a number of the resulting coefficients, vi (Turk & Pentland (1991)).

We now discuss the central contribution of this section, the incorporation of the Digital Signature descriptor. Unlike in the general formulation, however, images containing smiling mouths require local brightness to be preserved: teeth are always brighter than the surrounding skin, and that must be captured by the descriptor. Thus, instead of using SSD, patches are compared using the Sum of Differences. Otherwise, a closed mouth would produce the same descriptor as a smiling mouth: lips are surrounded by differently colored skin, exactly as teeth are surrounded by differently colored lips. Figure 5 shows an example with an 11 × 11 cell partition, each cell sized 10 × 10 pixels. For this experiment, the DS is applied to classify mouths found by a face detector (Castrillón Santana et al. (2007)) into smiling or non-smiling gestures. Smiling mouths look similar regardless of skin color or the presence of facial hair. This generality can be registered by a self-similarity descriptor like DS. Thus, for this test, our Digital Signature descriptor is computed as follows:
1. The image is resized to a template sized (n × m) × (n × m) pixels.
2. The template is partitioned into n × n cells, each of them sized m × m pixels.
3. A main cell sized m × m pixels is captured from the template image.
4. The main cell is compared with each template cell, and each result is consecutively stored in the n × n DS descriptor vector.
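The four steps above can be sketched as follows, using a signed Sum of Differences so that brightness polarity is kept. This is a hedged illustration rather than the authors' code: the nearest-neighbor resizing, the centered position of the main cell and all function names are assumptions, since the text does not specify them.

```python
def resize_nearest(image, side):
    """Nearest-neighbour resize of a 2-D list to side x side pixels (assumed method)."""
    h, w = len(image), len(image[0])
    return [[image[r * h // side][c * w // side] for c in range(side)]
            for r in range(side)]


def patch(image, top, left, m):
    """Return the m x m block whose top-left corner is (top, left)."""
    return [row[left:left + m] for row in image[top:top + m]]


def sum_of_differences(a, b):
    """Signed sum of (a - b); the sign records which patch is brighter."""
    return sum(x - y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def smile_signature(image, n, m):
    side = n * m
    template = resize_nearest(image, side)       # step 1: resize to the template
    top = left = (side - m) // 2                  # centred main cell (assumption)
    main = patch(template, top, left, m)          # step 3: capture the main cell
    return [sum_of_differences(main, patch(template, r * m, c * m, m))
            for r in range(n) for c in range(n)]  # steps 2 and 4: cells, row by row


# Toy images: a bright centre on a dark surround, and the inverse.
bright_centre = [[50, 50, 50], [50, 200, 50], [50, 50, 50]]
dark_centre = [[200, 200, 200], [200, 50, 200], [200, 200, 200]]

sig_bright = smile_signature(bright_centre, n=3, m=2)
sig_dark = smile_signature(dark_centre, n=3, m=2)
print(sig_bright)
print(sig_dark)
```

In this toy case the two signatures are sign-flipped copies of each other, whereas an SSD or SAD comparison would make them identical; that difference is what lets the descriptor separate bright teeth from a closed mouth.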

Classification and face detection
Two more points need to be justified. The first one is the classification method. In order to tell whether two images (smiling or not smiling) have a similar structure, their corresponding DS descriptors can be compared by computing the SAD between both vectors. However, given that the present work aims at classifying mouth images into two categories, a Support Vector Machine approach (Burges (1998)) is used. SVMs are a set of related supervised learning methods used for classification and regression. They belong to a family of generalized linear classifiers. A property of SVMs is that they simultaneously minimize the empirical classification error and maximize the geometric margin; hence they are also known as maximum margin classifiers.

The second point is to explain how faces are extracted from videos. Several approaches have recently appeared for real-time face detection (Schneiderman & Kanade (2000); Viola & Jones (2004)), all of them aiming at making the problem less environment dependent. Focusing on live video stream processing, a face detector based on cue combination (Castrillón Santana et al. (2007)) outperforms well-known single-cue detectors such as Viola & Jones (2004). This approach provides a more reliable tool for real-time interaction in PUIs. The face detection system used to extract faces from video streams in this work (see Castrillón Santana et al. (2007) for details) integrates, among other cues, different classifiers based on the general object detection framework by Viola & Jones (2004), skin color, multilevel tracking, etc. In order to further minimize the influence of false alarms, the facial feature detector capabilities were extended, locating not only faces but also eyes, nose and mouth. This reduces the number of false alarms, because it is less probable that multiple detectors, i.e. the face and its inner features, are activated simultaneously by a false alarm. It is important to point out that each detected face is normalized to 59 × 65 pixels using the position of the eyes. The facial element detection procedure is only applied in those areas which bear evidence of containing a face. This is true for regions in the current frame where a face has been detected, or for areas with a detected face in the previous frame. Figure 6 shows the possibilities of the face detector.

Performance and evaluation
The DaFEx database (Battocchi & Pianesi (2004)) was used in this experiment. In this database, 8 professional actors showed 7 expressions (6 basic facial expressions + 1 neutral) on 3 intensity levels (low, medium, high), twice each. The frames of the 48 'happy' videos were extracted from the database sequences (see Figure 7) for testing. The 'happy' videos contain 12,988 images in total, of which 6,783 are smiling images and 6,205 are non-smiling images. These images have also been normalized according to eye positions, obtaining 59 × 65 samples, and have been annotated manually by a human. As briefly mentioned above, the experimental setup considers one possibility as input: the mouth. The input image is a grayscale image, and for representation purposes we have used the following approaches for the tests:
• PCA. A PCA space obtained from the original gray images of 59 × 65 pixels.
• ULBP Hist. A concatenation of histograms based on the gray image or the resulting ULBP image.

As can be seen in Table 1, the best results in almost every situation are achieved with our new approach. None of the LBP-based representations outperforms it. Although the Uniform LBP approach shows a larger improvement when normalized histograms are used, the Simplified LBP approach reported better results than Uniform LBP. As already stated in Tao & Veldhuis (2007), this preprocessing provides benefits in the context of facial analysis. The PCA-based representation achieves a better error rate than the LBP approaches. However, it does not keep a good balance between false positives and false negatives. PCA deserves an additional observation: increasing the dimension of the PCA space does not always yield better results. This is the main reason for keeping only 130 coefficients for the image descriptor.

Fig. 8. Results achieved by the Digital Signature approach for the high intensity test. The best results are achieved for n = 10 and m = 3.

When the DS descriptor is used, the test achieved the lowest error rate. For the Digital Signature approach, the error-rate behaviour is also quite similar to the behaviour obtained previously in the PCA and Image Value tests. Again, the lowest error rates were achieved by the grayscale image test. For DS, it is important to mention that overlap between cells is not considered. First, several tests without overlapping were made in order to find the optimum DS parameters (the number of cells and the cell size yielding the lowest error rate). For smile detection it was found that 10 × 10 cells of 3 × 3 pixels performed best (see Figure 8), thanks to the closeness between the size of the DS template (30 × 30 pixels) and the original size of the mouth capture (20 × 12 pixels). It is also shown that the worst results are achieved for configurations with fewer than 10 × 10 cells, because of the loss of information due to resizing in the normalization step. Beyond that number of cells, and for bigger cell sizes, the behaviour is irregular because the information extracted is not reliable: false information is introduced when the mouth is resized to fit the DS template. When images are upsampled, redundant and useless information is created. Unfortunately, when overlap was introduced, the achieved error rates were higher than without overlapping. The images used were too small for overlapping regions to be significant. Unlike what happens with the PCA approach, in this case there is a good balance between false positives and false negatives. Thus, the new approach obtains better results than other approaches that have been successfully used before for smile detection (Freire et al. (2009a;).
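The notions of error rate and of balance between false positives and false negatives used in this evaluation can be made concrete with a small sketch. The function name and the eight labels below are invented for illustration only; they are not results from the DaFEx experiment, and the sketch assumes that both classes are present in the ground truth.

```python
def error_rates(truth, predicted):
    """Error, false positive and false negative rates; True means 'smiling'.

    Assumes `truth` contains at least one smiling and one non-smiling sample.
    """
    pairs = list(zip(truth, predicted))
    false_pos = sum(1 for t, p in pairs if not t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)
    negatives = sum(1 for t in truth if not t)
    positives = len(truth) - negatives
    return {
        "error_rate": (false_pos + false_neg) / len(truth),
        "false_positive_rate": false_pos / negatives,
        "false_negative_rate": false_neg / positives,
    }


# Hypothetical labels: four smiling and four non-smiling samples.
truth = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, False, False, True, False]

rates = error_rates(truth, predicted)
print(rates)
```

A classifier is well balanced in the sense used above when the false positive and false negative rates are close to each other, independently of how low the overall error rate is.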

Image segmentation
Face recognition techniques have attracted much attention over the years and many algorithms have been developed. Since face recognition has many potential applications in computer vision and automatic access control systems, research on it has expanded rapidly, driven not only by engineers but also by neuroscientists. In particular, face segmentation is an essential step of a face recognition system, since most face classification techniques tend to work only with face images. Therefore, face segmentation has to correctly extract only the face part of a given large image. However, apart from facial features, hair style or clothing also reflects important personal traits. Clothing segmentation is widely used in many computer vision tasks, such as dressed people detection (Ioffe & Forsyth (2001); Ronfard et al. (2002)), identification, image editing, human sketches and portraits for graphics rendering (Chen et al. (2004;). However, existing segmentation methods suffer from variations in colors and styles, different lighting conditions, cluttered backgrounds and occlusions generated by poses or other objects. Most existing methods use clothing models (Chen et al. (2006); Lee & Cohen (2006); Sprague & Luo (2002)). Typically, clothing models are trained with tagged samples, and then the clothing is extracted from images by comparison with the models. For example, in Sprague & Luo (2002), a dress model and a shirt model were trained separately, and then clothing detection was performed on the pre-segmented image by selecting the better match against the trained models. Such methods can only handle a few clothing styles and rely on the accuracy of the pre-segmentation.
On the other hand, applications related to human hair have attracted increasing interest in recent years, since hair plays a significant role in the overall appearance of an individual. For these applications, hair segmentation is generally the first prerequisite step. However, to our knowledge, in most previous studies hair is assumed to be already segmented or manually labeled. Furthermore, besides the above hair-related applications, many computer vision tasks can also benefit from segmented hair. For instance, it provides an important clue for gender classification, since the hair styles of males and females are generally different. Hair can also facilitate age estimation, since hair distribution and color gradually change with age, especially for elderly people. Hair information can even contribute a lot to face recognition, considering that the hair style of a typical subject does not change abruptly or frequently. To sum up, more attention should be paid to automatic hair segmentation. The work of Yacoob & Davis (2006) is the only prior work we can find on hair detection. Their approach constructs a simple color model and uses it to recognize hair pixels. However, their detection only works with a controlled background and very little hair color variation. On the other hand, hair modelling, synthesis, and animation have already become active research topics in computer graphics (Kajiya & Kay (1989); Marschner et al. (2003); Moon & Marschner (2006); Paris et al. (2004); Wei et al. (2005)). In this section, without any predefined model, we propose a new method to segment parts of an image (e.g. clothing, hair style or skin) by using the Digital Signature technique.

The proposed method
Thus, given an input image, a number of cells n and their pixel size m, the proposed method is computed as follows (see Figure 9):
1. The facial detector described in Section 4.1.3 provides the eye positions. Thus, a region of interest can be estimated.
2. The image is divided into different cells inside a template sized (n × m) × (n × m) pixels.
3. The template is partitioned into n × n cells, each of them sized m × m pixels.
4. Taking the eye positions into account, it is possible to estimate a main cell position.
5. Each cell is compared with every other template cell, and each result is consecutively stored in the n × n DS descriptor vector. Thus, each cell provides its own Digital Signature (see Figure 2).
6. Each cell's DS is compared with the main cell's DS. This helps to decide whether or not the proposed cell is similar to the main cell.
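Steps 2, 3, 5 and 6 can be sketched as follows. This is a minimal sketch under stated assumptions: the SAD distance between signatures, the `threshold` parameter, the `main_index` argument (standing in for the main cell position estimated from the eyes in steps 1 and 4) and the function names are all hypothetical, since the text does not fix how two signatures are compared or how the decision is taken.

```python
def patch(image, top, left, m):
    """Return the m x m block whose top-left corner is (top, left)."""
    return [row[left:left + m] for row in image[top:top + m]]


def ssd(a, b):
    """Sum of Squared Differences between two equally sized patches."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def cell_signatures(template, n, m):
    """One Digital Signature per cell: its SSD against every template cell."""
    cells = [patch(template, r * m, c * m, m)
             for r in range(n) for c in range(n)]
    return [[ssd(cell, other) for other in cells] for cell in cells]


def segment(template, n, m, main_index, threshold):
    """Mark a cell True when its signature is within `threshold` (SAD) of the main cell's."""
    signatures = cell_signatures(template, n, m)
    main = signatures[main_index]
    labels = [sum(abs(a - b) for a, b in zip(sig, main)) <= threshold
              for sig in signatures]
    return [labels[r * n:(r + 1) * n] for r in range(n)]


# Toy 4 x 4 template: dark left half, bright right half; n = 2, m = 2.
template = [[10, 10, 200, 200] for _ in range(4)]
mask = segment(template, n=2, m=2, main_index=0, threshold=0)
print(mask)
```

Here the main cell is the top-left one, so the mask selects the left column of cells; with real images a non-zero threshold would be needed, and its value would have to be tuned.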

Results
Figure 10 displays segmentation results for a simple color frame. The Digital Signature succeeded in generating an accurate skin and background segmentation. It fails to generate an accurate hair segmentation because the hair and the clothing have a similar color in the example. A spatial aliasing effect can also be observed, due to the fact that the cells are square.

One way to reduce the spatial aliasing is to decrease the cell size, but this could be harmful for the method because it causes a loss of information in each cell. RGB color images were considered here for testing. In each segmentation, the value of the one free parameter, the cell size in Equation 1, was kept constant at 11 × 11, despite the different characteristics of the images. Figure 11 shows the results of applying the segmentation algorithm to two more images under unconstrained lighting. The overall result of this study was that the segmentation results were generally stable to perturbations of the cell size as long as the distance between the eyes remains similar between images.

Fig. 11. Some components of the partition applying a Digital Signature cell size of 11 × 11 (n = 11). Images source: askmen.com (Askmen.com (2011)).

Conclusions
In this Chapter, we have introduced a novel descriptor for segmenting an image into a regular grid of cells. We have argued that the regular grid confers a number of useful properties, such as a powerful representation approach for classification purposes. We have also demonstrated that, despite this topological constraint, we can achieve segmentation performance comparable with current algorithms.

For the first test, this Chapter described smile detection using different LBP approaches, as well as a PCA image representation, combined with SVM. The potential of the Digital Signature based representation for smile verification has been shown. The DS-based representation presented in this paper outperforms the other approaches, with an improvement of over 5%. Uniform LBP does not preserve the locality of statistical spatial patterns. This means that there is no gradual change between adjacent blocks preprocessed with Uniform LBP. Depending on the value of a pixel inside one of the blocks, the codification between two adjacent pixels can jump, for example, from pattern 2 to pattern 9. Translated to the space domain of the SVM, this means that similar points can end up too far apart. Simplified LBP keeps the locality of statistical spatial patterns. There is a gradual change between adjacent preprocessed pixels. Translated to the SVM's space domain, this gradual change means that similar points are closer in this space. Our future work will focus on the potential of the DS descriptor for generic applications such as image retrieval. In this paper we have developed a static smile classifier achieving, in some cases, a 93% success rate. Given this success rate, smile detection in video streams, where temporal coherence is implicit, will be studied in the short term, as a cue for recognizing the dynamics of the smile expression.

For the second test, we have analyzed how the Digital Signature cell descriptors can be used to segment an input image without any previous training. Once the eye positions are obtained, the template of the main cell can be found inside the image by comparing digital signatures. The model works very well under certain circumstances:
1. There is a variety of patterns between the main cell and the rest of the cells.
2. The lighting environment may be constrained or unconstrained.
3. The face/hair/clothes features are relatively small to fill holes (see the false negatives for the first dress in Figure 11).

The above insight opens the door for further research into the balance between the complexity of the model and the way in which it is used for inference. In order to reduce the aliasing effect, other geometrical shapes for the cells can also be tested. On the other hand, the cell size depends on image cues such as the image size, or on other heuristics depending on the segmentation process (e.g. for face segmentation, the distance between the eyes could be an important cue for the cell size).

Acknowledgments
We would like to thank Dr. Thomas B. Moeslund and Dr. Nasrollahi for their advice, criticism and encouragement during the development of this algorithm. This work has been partially supported by the Research Project TIN2008-06068, funded by the Ministerio de Ciencia y Educación (Government of Spain).