Face Detection and Tracking using OpenCV

Abstract: This paper discusses an application for automatic face detection and tracking on video streams from surveillance cameras in public or commercial places. In many situations it is useful to detect where people are looking, e.g. in exhibits, shopping malls, and public areas of buildings. A prototype face detection and tracking system is designed to work with web cameras and is based on the open source platforms Arduino and OpenCV. The system is based on the AdaBoost algorithm and extracts faces using Haar-like features. It can be used for security purposes to record visitors' faces as well as to detect and track them. A program is developed using OpenCV that can detect people's faces and track them from the web camera.


I. INTRODUCTION
The research purpose of computer vision is to simulate human vision directly by using a computer. Computer vision is a research field that tries to perceive and represent the 3D information of world objects. Its essence is to reconstruct the visual aspects of a 3D object by analyzing the 2D information extracted from it. The reconstruction and representation of 3D object surfaces not only provide theoretical benefits, but are also required by numerous applications.
Face detection is the process of analyzing an input image to determine the number, location, size, position and orientation of faces. Face detection is the basis for face tracking and face recognition, and its results directly affect the process and accuracy of face recognition. The common face detection methods are the knowledge-based approach, the statistics-based approach, and the integration approach, which combines different features or methods. The knowledge-based approach [Feng, 2004; Faizi, 2008] can achieve face detection for images with complex backgrounds to some extent and also obtains high detection speed, but it needs more integrated features to further enhance its adaptability. The statistics-based approach [Liang et al., 2002; Wang et al., 2008] detects faces by judging all possible areas of an image with a classifier; it treats the face region as a class of patterns and uses a large number of "face" and "non-face" training samples to construct the classifier.
This method has strong adaptability and robustness; however, its detection speed needs to be improved, because it must test all possible windows by exhaustive search and therefore has high computational complexity. The AdaBoost algorithm [Zhang, 2008; Guo & Wang, 2009] arose in recent years; it trains weak classifiers on the key category features and cascades them into a strong classifier for face detection. This method has real-time detection speed and high detection accuracy, but needs a long training time. The digital image of the face is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels [Lu et al., 1999]. Pixel values typically represent gray levels, colours, heights, opacities, etc. Note that digitization implies that a digital image is an approximation of a real scene. Recently there has been tremendous growth in the field of computer vision. The conversion of this huge amount of low-level information into usable high-level information is the subject of computer vision. It deals with the development of the theoretical and algorithmic basis [Jiang, 2007] by which useful information about the 3D world can be automatically extracted and analyzed from a single or multiple 2D images of the world, as shown in figure 1.

Figure 1: Typical Data Processing in Computer Vision
This paper describes a system that can detect and track a human face in real time using Haar-like features, where the detection algorithm is based on the wavelet transform. In computer vision, low-level processing involves image processing tasks in which the quality of the image is improved for the benefit of human observers and so that higher-level routines perform better [Viola & Jones, 2001]. Intermediate-level processing involves feature extraction and pattern detection tasks. High-level vision involves autonomous interpretation of scenes for pattern classification, recognition and identification of objects in the scenes, as well as any other information required for human understanding, as shown in figure 2. The statistics-based approach used in this paper detects faces by judging all possible areas of an image with a classifier; it treats the face region as a class of patterns and uses a large number of "face" and "non-face" training samples to construct the classifier. The method has strong adaptability and robustness. The program can draw a rectangle around the face area using the data obtained from the web camera video stream.

II. RELATED WORK
Face detection is used in biometrics, often as a part of (or together with) a facial recognition system. It is also used in video surveillance, human-computer interfaces and image database management. Some recent digital cameras use face detection for autofocus [DCRP Review: Canon PowerShot S5 IS]. Face detection is also useful for selecting regions of interest in photo slideshows that use a pan-and-scale Ken Burns effect. Face detection is gaining the interest of marketers. A webcam can be integrated into a television and detect any face that walks by. The system then estimates the race, gender, and age range of the face. Once the information is collected, a series of advertisements can be played that is specific to the detected race, gender and age. This paper shows a prototype or partial implementation of this type of work. Face detection is also being researched in the area of energy conservation [Energy Conservation].
A methodology for face recognition based on an information theory approach of coding and decoding the face image is discussed in [Sarala A. Dabhade & Mrunal S. Bewoor, 2012]. The proposed methodology connects two stages: face detection using a Haar-based cascade classifier and recognition using principal component analysis. Various face detection and recognition methods have been evaluated [Faizan Ahmad et al., 2013], and a solution for image detection and recognition is proposed as an initial step for video surveillance. An implementation of face recognition using principal component analysis (PCA) with four distance classifiers is proposed in [Hussein Rady, 2011]. A system that uses different distance measures for each image will perform better than a system that uses only one. The experiments show that PCA gave better results with the Euclidean distance classifier and the squared Euclidean distance classifier than with the City Block distance classifier, which in turn gives better results than the squared Chebyshev distance classifier. A structural face construction and detection system is presented in [Sankarakumar et al., 2013]. The proposed system accounts for different lighting conditions, rotated facial images, skin color, etc.

III. DESCRIPTION OF TOOLS
In this section the tools and methodology used to implement and evaluate face detection and tracking using OpenCV are detailed. The OpenCV library was originally written in C, and this C interface makes OpenCV portable to some specific platforms such as digital signal processors. Wrappers for languages such as C#, Python, Ruby and Java (using JavaCV) have been developed to encourage adoption by a wider audience [Zhang, 2008].

Computer Vision
The newer C++ interface seeks to reduce the number of lines of code necessary to implement vision functionality, as well as to reduce common programming errors such as memory leaks (through automatic data allocation and de-allocation) that can arise when using OpenCV in C, as shown in figure 4.
Most of the new developments and algorithms in OpenCV are now developed in the C++ interface [Bradski & Kaehler, 2009]. Unfortunately, it is much more difficult to provide wrappers in other languages for C++ code than for C code; therefore the other language wrappers generally lack some of the newer OpenCV 2.0 features. A CUDA-based GPU interface has been in progress since September 2010.

Processing Software
The Processing language is a text programming language specifically designed to generate and modify images. Processing strives to achieve a balance between clarity and advanced features. The system facilitates teaching many computer graphics and interaction techniques including vector/raster drawing, image processing, color models, mouse and keyboard events, network communication, and object-oriented programming. Libraries easily extend Processing's ability to generate sound, send/receive data in diverse formats, and import/export 2D and 3D file formats [Ben Fry & Casey Reas, 2007].
Processing is for writing software to make images, animations, and interactions. Processing is a dialect of a programming language called Java; the language syntax is almost identical, but Processing adds custom features related to graphics and interaction, as shown in figure 3. The graphic elements of Processing are related to PostScript (a foundation of PDF) and OpenGL (a 3D graphics specification). Because of these shared features, learning Processing is an entry-level step to programming in other languages and using different software tools.

Arduino
Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use hardware and software. The hardware consists of a simple open hardware design for the Arduino board with an Atmel AVR processor and on-board input/output support. The software consists of a standard programming language compiler and the boot loader that runs on the board.
Arduino can sense the environment by receiving input from a variety of sensors and can affect its surroundings by controlling lights, motors, and other actuators. The microcontroller on the board is programmed using the Arduino programming language (based on Wiring) and the Arduino development environment (based on Processing). Arduino projects can be stand-alone or they can communicate with software running on a computer (e.g. Flash, Processing, and MaxMSP). The board, shown in figure 5, can be built by hand or purchased preassembled; the software can be downloaded for free. The hardware reference designs (CAD files) are available under an open-source license.

IV. FACE DETECTION
In this section the base algorithm used to detect the face is discussed [Feng, 2004]. The AdaBoost algorithm is discussed first, followed by feature selection.

ADABOOST
In 1995, Freund and Schapire first introduced the AdaBoost algorithm [Faizi, 2008]. It was then widely used in pattern recognition.
ii) Calculate the weighted (w_i) training error for each hypothesis.
iii) Set: a_t = ())
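The boosting loop can be illustrated with a minimal sketch of discrete AdaBoost on one-dimensional toy data with threshold stumps as weak classifiers. The classifier weight used here, 0.5 * ln((1 - err) / err), is the common textbook form and is an assumption, as are the function names and the toy data; none of them are taken from the paper.

```python
import math

def train_adaboost(xs, ys, rounds=3):
    """Discrete AdaBoost with threshold stumps on 1-D data; labels are +1/-1."""
    n = len(xs)
    weights = [1.0 / n] * n                      # start from uniform sample weights
    ensemble = []                                # list of (alpha, threshold, polarity)
    thresholds = sorted(set(xs))
    for _ in range(rounds):
        best = None
        for thr in thresholds:
            for polarity in (1, -1):
                preds = [polarity if x >= thr else -polarity for x in xs]
                # weighted training error of this hypothesis
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, thr, polarity, preds)
        err, thr, polarity, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # assumed textbook classifier weight
        ensemble.append((alpha, thr, polarity))
        # raise the weight of misclassified samples, lower the rest, renormalize
        weights = [w * math.exp(-alpha * y * p)
                   for w, p, y in zip(weights, preds, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x >= thr else -pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = train_adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

On this toy set no single stump classifies all samples correctly, but the weighted vote of three stumps reproduces the training labels, which is the idea behind combining weak classifiers into a strong one.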

Feature Selection using Haar like Features
In the implementation of face detection, X_i contains a huge number of face features, and those features with a low error ϵ_i are selected to train the strong classifier. The AdaBoost algorithm achieves this automatically [Lu et al., 1999]. In each iteration, ϵ_i is calculated for each feature in X_i and the feature with the lowest error is selected. As a result, face detection can be very fast. As the next part shows, there are many Haar-like features, so it is hard to make use of all of them. "A" and "B" are two-rectangle features, "C" is a three-rectangle feature and "D" is a four-rectangle feature. At a size of 24 x 24, there are more than 180,000 rectangle features.
The tools required to implement face detection and tracking are:

Software Required
OpenCV 2.3.1 superpack for Windows, Arduino IDE 1.0 for Windows, Processing IDE for Windows.

Hardware Required
PC preferably running Windows 7 SP1, Arduino Uno or compatible board plus power source (5 V DC), two standard servos, webcam with USB interface, breadboard, jumper wires, and hobby wire to tie the pan/tilt servos and webcam together.
Figure 7 shows the experimental setup used. A breadboard is used to make the connections. The required connections are as given below.
SERVOS:
1. The yellow/signal wire for the pan (x axis) servo goes to digital pin 9.
2. The yellow/signal wire for the tilt (y axis) servo goes to digital pin 10.
3. The red/VCC wires of both servos go to the Arduino's 5V pin.
4. The black/GND wires of both servos go to the Arduino's GND pin.
WEBCAM: The webcam's USB cable goes to the PC. The code identifies it via a number representing the USB port it is connected to.
ARDUINO: The Arduino Uno is connected to the PC via USB. Take note of the COM port the USB is connected to. The COM port can be found from the Arduino Tools/Serial Port menu; the check mark next to the active USB port shows the COM port used to communicate with the Arduino.

VI. IMPLEMENTATION
After a classifier is trained, it can be applied to a region of interest (of the same size as used during training) in an input image. The classifier output is "1" if the region is likely to show the face and "0" otherwise. To search for the object in the whole image, one can move the search window across the image and check every location using the classifier. Here we use two different programs for face detection and tracking respectively. The algorithm used for both programs (Processing and Arduino) is detailed in this section.
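The window search described above can be sketched as follows. This is only an illustration under stated assumptions: the image is a plain list of rows, the window is square, and `bright_patch_classifier` is a hypothetical stand-in for a trained cascade, not the classifier used in the paper.

```python
def sliding_windows(img_w, img_h, win, step):
    """Yield the top-left corner of every win x win window inside the image."""
    for y in range(0, img_h - win + 1, step):
        for x in range(0, img_w - win + 1, step):
            yield x, y

def detect(image, win, step, classifier):
    """Return windows (x, y, win, win) for which the classifier outputs 1."""
    h, w = len(image), len(image[0])
    hits = []
    for x, y in sliding_windows(w, h, win, step):
        patch = [row[x:x + win] for row in image[y:y + win]]
        if classifier(patch) == 1:
            hits.append((x, y, win, win))
    return hits

def bright_patch_classifier(patch):
    """Stand-in classifier (hypothetical): 1 if the mean intensity exceeds 128."""
    total = sum(sum(row) for row in patch)
    return 1 if total / (len(patch) * len(patch[0])) > 128 else 0

# 6 x 6 test image with a bright 2 x 2 block at column 2, row 2.
image = [[0] * 6 for _ in range(6)]
for y in range(2, 4):
    for x in range(2, 4):
        image[y][x] = 255

print(detect(image, win=2, step=1, classifier=bright_patch_classifier))
```

Only the window that fully covers the bright block is reported. A practical detector would typically repeat this scan at several window sizes so that faces of different sizes can be found.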

Implementation of Hardware
The Arduino analyzes the serial input for commands and sets the servo positions accordingly. A command consists of two bytes: a servo ID and a servo position. If the Arduino receives a servo ID, it waits for another serial byte and then assigns the received position value to the servo identified by the servo ID. The Arduino Servo library is used to control the pan and tilt servos easily. A character variable keeps track of the characters that come in on the serial port.
a) The library Servo.h is used on the Arduino to control the servo motors, based on the data obtained from OpenCV through the COM port.
b) Depending on the difference found in step 8, appropriate control values are sent to the two servo motors for the pan-tilt movement of the camera.
c) Step b is kept in a continuous loop.
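The two-byte command handling can be modelled in a few lines. The sketch below is a Python model of the parsing logic only, not the Arduino program itself; the ID values, the neutral start angle of 90 degrees and the 0 to 180 degree clamp are assumptions made for illustration.

```python
class ServoCommandParser:
    """Models the two-byte protocol: a servo ID byte followed by a position byte."""

    def __init__(self, servo_ids):
        self.positions = {sid: 90 for sid in servo_ids}  # assumed neutral start
        self.pending_id = None                           # ID waiting for its position

    def feed(self, byte):
        """Consume one serial byte and update the servo state if a command completes."""
        if self.pending_id is None:
            if byte in self.positions:       # a known servo ID starts a command
                self.pending_id = byte
            return                           # any other byte is ignored
        # second byte of the command: clamp to an assumed 0-180 degree range
        self.positions[self.pending_id] = max(0, min(180, byte))
        self.pending_id = None

PAN_ID, TILT_ID = ord('p'), ord('t')   # hypothetical IDs, not taken from the paper
parser = ServoCommandParser([PAN_ID, TILT_ID])
for b in bytes([PAN_ID, 120, TILT_ID, 45]):
    parser.feed(b)
print(parser.positions[PAN_ID], parser.positions[TILT_ID])
```

Keeping the pending ID as explicit state mirrors the "wait for another serial byte" behaviour described above: the position byte is applied only after a valid ID has been seen.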

VII. RESULT AND ANALYSIS
The image of the face captured by the webcam with the help of Processing and OpenCV undergoes the steps described below.
Generate a rectangle class that keeps track of the face coordinates. Create an instance of the OpenCV library. Import the serial library, which is needed to communicate with the Arduino. Set the screen size parameters and the contrast/brightness values. Convert the image coming from the webcam to greyscale format. Find out if any faces were detected. If a face is found, find the midpoint of the first face in the frame. Manipulate these values to find the midpoint of the rectangle.
If the Y component of the face is below the middle of the screen, update the tilt position variable to lower the tilt servo. If the Y component of the face is above the middle of the screen, update the tilt position variable to raise the tilt servo. If the X component of the face is to the left of the middle of the screen, update the pan position variable to move the servo to the left. If the X component of the face is to the right of the middle of the screen, update the pan position variable to move the servo to the right. Update the servo positions by sending the serial command to the Arduino. The pan and tilt positions of the servo motors attached to the web camera are directly proportional to the X and Y coordinates of the face, taken from the midpoint of the rectangle and sent to the Arduino as serial commands. Figure 10 shows the result of the face detection and figure 11 shows the face detection as well as tracking.
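The pan/tilt update in these steps can be sketched as a small function. This is an illustration under assumptions: image Y coordinates grow downward, one step per frame is applied, angles are clamped to 0 to 180 degrees, and the direction signs and servo ID bytes are hypothetical because they depend on how the servos are mounted and addressed.

```python
def clamp_angle(angle):
    """Keep a servo angle inside an assumed 0-180 degree range."""
    return max(0, min(180, angle))

def update_servos(face, screen_w, screen_h, pan, tilt, step=1):
    """Return new (pan, tilt) angles given a face rectangle (x, y, w, h)."""
    x, y, w, h = face
    mid_x = x + w / 2                  # midpoint of the face rectangle
    mid_y = y + h / 2
    if mid_y > screen_h / 2:           # face below the middle: lower the tilt servo
        tilt -= step
    elif mid_y < screen_h / 2:         # face above the middle: raise the tilt servo
        tilt += step
    if mid_x < screen_w / 2:           # face left of the middle: pan left
        pan -= step
    elif mid_x > screen_w / 2:         # face right of the middle: pan right
        pan += step
    return clamp_angle(pan), clamp_angle(tilt)

def encode_commands(pan, tilt, pan_id=ord('p'), tilt_id=ord('t')):
    """Pack two (servo ID, position) commands; the ID bytes are hypothetical."""
    return bytes([pan_id, pan, tilt_id, tilt])

# Face in the upper-right part of a 320 x 240 frame.
pan, tilt = update_servos((200, 50, 40, 40), 320, 240, pan=90, tilt=90)
print(pan, tilt, list(encode_commands(pan, tilt)))
```

Moving by a fixed step per frame keeps the camera motion smooth; a larger `step` would re-centre the face faster at the cost of more jitter.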
Using this approach, the time taken to detect the face was found to be less than 1 second, which means that this setup can be used in real time. The detection efficiency was greatly improved by using OpenCV. The average frame rate was found to be 15 fps.

VIII. CONCLUSION
A prototype system for automatic face detection and tracking was successfully implemented and tested. The test results show that the detection method used in the paper can accurately detect and track a human face in real time. This paper shows the intersection of image processing and embedded systems: by using OpenCV and Arduino, real-time implementation is possible. Future Work: Along with face detection, face recognition may also be implemented.

Figure 2: Process Levels in Computer Vision
OpenCV was developed by Intel and is now supported by Willow Garage [Lu et al., 1999]. It is free for use under the open source BSD license. The library is cross-platform. It focuses mainly on real-time image processing. If the library finds Intel's Integrated Performance Primitives on the system [Open Source Computer Vision Library Reference Manual, Intel; Gary Bradski & Adrian Kaehler, O'Reilly, 2008], it will use these proprietary optimized routines to accelerate itself.

Figure 4: Structure Design of Processing

Figure 6: Haar-like Features Introduced in Viola's Paper
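A two-rectangle feature of the kind shown in Figure 6 can be evaluated in constant time from an integral image (summed-area table), which is the technique used in Viola and Jones' detector. The sketch below is a minimal illustration; the sign convention (right half minus left half) and the helper names are assumptions.

```python
def integral_image(img):
    """Return a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle feature: right half minus left half (w must be even)."""
    half = w // 2
    return rect_sum(ii, x + half, y, half, h) - rect_sum(ii, x, y, half, h)

# 4 x 4 image with a dark left half and a bright right half.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
ii = integral_image(image)
feature = two_rect_horizontal(ii, 0, 0, 4, 4)
print(feature)
```

Each rectangle sum needs only four table lookups regardless of its size, which is what makes evaluating a very large number of rectangle features per window practical.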

Figure 9: Processing Window with the Code

Figure 9: Output of Algorithm Showing the Face Detection
Figure 10: Face Detection with Camera