Detection of a Moving Car Based on Invariant Moments

Abstract: This paper presents a technique based on Invariant Moments for detecting moving cars in front of Autonomous Cars, with the aim of increasing safety and reducing road accidents and thus saving lives, one of the most important concerns in the context of Autonomous Cars. Object detection from a sequence of images is a preparatory step, yet a crucial task, for computer vision applications that extract information from a scene. In this work, a car database was constructed; background subtraction is used to define the Region of Interest (RoI), the invariant moments are then applied to the RoI and the information stored in the database is used to detect and recognize the moving car. The technique was tested and the experimental results showed that it detects moving objects successfully, with a detection rate above 87%, a precision of 97% and an FoM of 91%.


Introduction
The issue of raising road safety and preventing road traffic injuries has become a worldwide concern. In addition, with the number of accidents increasing day by day, it has become important to compensate for human errors. All of this could come to an end with Autonomous Cars. An Autonomous Car, or driverless car (sometimes called a self-driving car), is a robotic vehicle that operates independently through feedback returned by various sensors. The purpose of these vehicles is to reduce the risks, problems and costs that arise from human intervention. It is designed to travel between destinations without a human operator. One of the major functions of an Autonomous Car is therefore the detection of moving objects (cars) in front of it using computer vision technologies. Even after several years of research, detection and tracking of moving objects is still an open research issue. Until today, it remains a great challenge to achieve an accurate, robust and high-performance approach. Defining the object to be detected and tracked determines the difficulty level of this problem Shaikh et al. (2014), and it underpins the decisions that determine the path of Autonomous Cars and avoid accidents.

Detection of Moving Objects (Related Work)
Detection of a moving object is a critical process in computer vision applications, especially in video surveillance and Autonomous Cars. In this process, insignificant information in the scene is ignored and attention is focused on the moving objects. To achieve this target, many methods, algorithms and techniques have been proposed over the past years, based on either predictive or probabilistic mechanisms (Wang et al., 2011; Warnell et al., 2012; Mansour and Vetro, 2014; Thangarajah et al., 2016). Optical flow, frame difference and background subtraction are the three prime methods for detecting moving objects, in addition to many other methods that combine sophisticated techniques with these basic ones.

Optical Flow
Optical flow exploits the change in the optical flow properties of moving objects over time and is suitable for both dynamic and static backgrounds. However, because of its poor anti-noise performance and complex computation, it requires special hardware for real-time processing Yang et al. (2012).

Frame Difference Model (FDM)
In frame difference, or temporal difference, moving regions are extracted by thresholding the difference between the pixels of adjacent frames. The advantages of this method are its rapid background update, good adaptive performance and insensitivity to variations in lighting; its disadvantage is that it cannot detect moving objects that have a uniform interior color and a large size Yang et al. (2012).
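As an illustration only (not code from the paper), the frame-difference rule can be sketched as follows; the grey-level byte-array frame layout and the threshold parameter are assumptions.

```csharp
using System;

// Minimal sketch of frame differencing: pixels whose grey-level change
// between two adjacent frames exceeds a threshold are marked as moving.
public static class FrameDifference
{
    // previous, current: grey-level frames with values 0-255 (assumed layout).
    public static bool[,] MovingMask(byte[,] previous, byte[,] current, int threshold)
    {
        int rows = current.GetLength(0), cols = current.GetLength(1);
        var mask = new bool[rows, cols];
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
                mask[x, y] = Math.Abs(current[x, y] - previous[x, y]) > threshold;
        return mask;
    }
}
```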

Background Subtraction Model (BSM)
BSM is a method used to find moving objects; to achieve this, the current frame image is differenced with a background image and the result is compared with a threshold. If the difference is greater than the threshold, the pixels of the current image are classified as moving pixels; otherwise, they are not. To define the moving objects, this model relies completely on the background image. Constructing the background image correctly determines how slowly moving and temporarily motionless objects are handled. The resulting image of background subtraction contains noise if the background image does not contain the maximum number of stationary pixels, in which case the model is not able to detect object regions properly (Hossain and Das, 2014).
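A corresponding sketch of the basic BSM decision, which differs from frame differencing only in that the reference image is a fixed background rather than the previous frame (again an illustration, not the paper's code):

```csharp
using System;

// Minimal sketch of the basic background subtraction decision:
// a pixel is "moving" when it differs from the stored background by more than the threshold.
public static class BasicBackgroundSubtraction
{
    public static bool[,] ForegroundMask(byte[,] background, byte[,] currentFrame, int threshold)
    {
        int rows = currentFrame.GetLength(0), cols = currentFrame.GetLength(1);
        var moving = new bool[rows, cols];
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
                moving[x, y] = Math.Abs(currentFrame[x, y] - background[x, y]) > threshold;
        return moving;
    }
}
```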

Adaptive Background Subtraction Model (ABSM)
This method is similar to background subtraction but here the background image is adaptively updated over time. In this method, transforming the colored video to grey-scale video is not necessary. The background image is created initially and then modified according to changes in the surrounding environment with the help of a learning rate (α). This method takes considerable time to modify the background from time to time (Hossain and Das, 2014).
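A minimal sketch of the adaptive update with learning rate α follows; the running-average update rule shown here is a common choice for ABSM and is an illustrative assumption, not the paper's implementation.

```csharp
// Minimal sketch of the adaptive background update with a learning rate (alpha).
// The running background drifts toward each new frame; alpha = 0 keeps the
// background fixed, alpha = 1 replaces it entirely. Array layout is an assumption.
public static class AdaptiveBackground
{
    public static void Update(double[,] background, byte[,] currentFrame, double alpha)
    {
        int rows = background.GetLength(0), cols = background.GetLength(1);
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
                background[x, y] = (1.0 - alpha) * background[x, y] + alpha * currentFrame[x, y];
    }
}
```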
In addition, recent works have combined sophisticated techniques with basic background subtraction. To extract backgrounds from moving points, several methods use advanced statistical models (Huang and Chen, 2013a; 2013b; Cheng et al., 2015; Guo et al., 2013), while others use neural networks or outlier detection models (Zhou et al., 2012; Huang and Do, 2014). According to their main characteristics, Bouwmans (2014) further categorized these subtraction models into 17 groups.

Gaussian Mixture Models (GMM)
In this method, the components of the background image are modeled as a mixture of Gaussian distribution functions. Whether a pixel from the input image belongs to the foreground or the background is decided by these Gaussian distribution functions. This statistical determination is effective for minor changes in the background and, moreover, using mixtures of such functions makes the method multimodal Dong-Sun and Jinsan (2016). Thangarajah et al. (2016) presented a method to update the threshold of GMM-based background subtraction with regard to color distortion, illumination measures at the pixel level and similarity; a threshold was set automatically for moving object detection in video sequences. Srivastav et al. (2017) proposed an object detection technique, based on three-frame differencing and background subtraction, that reduces the hole problem produced by two-frame differencing.
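As a heavily simplified, illustrative sketch (not the paper's method or any cited author's exact algorithm), a per-pixel Gaussian mixture classifier in the Stauffer-Grimson style might look as follows; the number of components, learning rate and thresholds below are assumptions.

```csharp
using System;
using System.Linq;

// Highly simplified sketch of a per-pixel Gaussian mixture background model.
// Each pixel keeps K Gaussians (mean, variance, weight); an incoming grey value
// is background if it matches a component carrying a substantial weight.
public class PixelMixture
{
    const int K = 3;
    readonly double[] mean = new double[K];
    readonly double[] variance = Enumerable.Repeat(225.0, K).ToArray(); // initial sigma = 15
    readonly double[] weight = { 1.0 / K, 1.0 / K, 1.0 / K };
    const double Alpha = 0.01;       // learning rate (assumed value)
    const double MatchSigmas = 2.5;  // match threshold in standard deviations (assumed value)

    // Returns true if the value is classified as background, and updates the mixture.
    public bool ObserveIsBackground(double value)
    {
        int matched = -1;
        for (int k = 0; k < K; k++)
            if (Math.Abs(value - mean[k]) <= MatchSigmas * Math.Sqrt(variance[k])) { matched = k; break; }

        if (matched >= 0)
        {
            // Pull the matched component toward the observed value.
            mean[matched] = (1 - Alpha) * mean[matched] + Alpha * value;
            double d = value - mean[matched];
            variance[matched] = (1 - Alpha) * variance[matched] + Alpha * d * d;
        }
        else
        {
            // No match: replace the weakest component with one centred on the value.
            int weakest = Array.IndexOf(weight, weight.Min());
            mean[weakest] = value;
            variance[weakest] = 225.0;
            weight[weakest] = 0.05;
        }

        // Update and renormalise the weights.
        for (int k = 0; k < K; k++)
            weight[k] = (1 - Alpha) * weight[k] + (k == matched ? Alpha : 0.0);
        double sum = weight.Sum();
        for (int k = 0; k < K; k++) weight[k] /= sum;

        return matched >= 0 && weight[matched] > 0.25;
    }
}
```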
In this work, a database is constructed, containing features for a number of objects (cars) that are used for object detection. The technique is based on background subtraction to define the Region of Interest (RoI); the invariant moments are then applied to the RoI and, finally, the Sum of Square Error (SSE) between the calculated invariant moments and the invariant moments stored in the database as features is used to detect and recognize the car. The period between one image and the next in the same sequence (same video) is controlled by the user of the system.

The Proposed Technique for Moving Object Detection
In an Autonomous Car, the front view is acquired (as a video) using a camera mounted on the car, and a few processes are then applied in order to detect the moving object (car). Any video is a sequence of consecutive images, or frames, from which the object (car) can be detected. One of the important functions of an Autonomous Car is detecting and tracking the moving object (car) in front of it. In this work, it is assumed that the Autonomous Car is in the middle of the road and that just one moving car appears in the front view; it can move freely to the left, right, up and down according to traffic laws. Object detection in a video sequence refers to the process of separating a frame into background and moving objects according to features such as color, intensity, edges or motion, in order to locate and identify objects and estimate their velocity and location in the frame.
This section describes all the steps to detect and recognize the moving car. The method comprises four main tasks: constructing the car database, defining the region of interest, applying the invariant moments to each region of interest and computing the SSE.

Constructing Cars Database
A database containing images of specific roads has been prepared. These roads are recorded when they are empty. The images of the empty roads are used in background subtraction to extract the region of interest, as in (3.2). The database also contains features for several targets (cars), which represent object characteristics such as the invariant moments of the binary, grey and green images of the same car; these characteristics are used to detect and recognize the car.
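One possible way to organise such a database is sketched below; the type and member names are hypothetical and only illustrate what the paper says is stored (the empty-road background images and the invariant moments of the binary, grey and green images for each car).

```csharp
using System.Collections.Generic;

// Sketch of a car database record: one entry per known car, holding the seven
// Hu invariant moments of its binary, grey and green images. Names are assumptions.
public class CarRecord
{
    public string Name;       // identifier of the stored car
    public double[] HuGrey;   // seven invariant moments of the grey image
    public double[] HuBinary; // seven invariant moments of the binary image
    public double[] HuGreen;  // seven invariant moments of the green image
}

public class CarDatabase
{
    public byte[,] EmptyRoadBackground;              // empty-road image used for subtraction
    public List<CarRecord> Cars = new List<CarRecord>();
}
```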

Define Region of Interest
The Region of Interest is defined using background subtraction between two images. The two images used in the subtraction process are defined as the background image and the foreground image. The background image is retrieved from the database where it is stored, while the foreground image comes from the sequence of images obtained from the camera; the user controls the period of time between one image and the next in the same sequence.

The spatial information of a colored image consists of color information stored in three different components (color channels) for each pixel, which are considered as coordinates in some color space. Most of the present methods for background subtraction convert the spatial color information of the background and foreground images to a grey image with 0-255 intensity levels. The resulting grey image of background subtraction is called the Differenced Image (DI) and is given by Equation 1:

$S(x, y) = \left| B(x, y) - F(x, y) \right|$                  (1)

where S(x, y) refers to the pixel intensity of the DI at the x-th row and y-th column of the grey image, and B(x, y) and F(x, y) refer to the pixel intensities at the x-th row and y-th column of the background and foreground images, respectively. By setting a threshold Th, the Binary image is obtained (converting the image to black and white) as in Equation 2, thereby extracting the moving object region from the image:

$BI(x, y) = \begin{cases} 1, & S(x, y) > Th \\ 0, & \text{otherwise} \end{cases}$                  (2)

where BI(x, y) refers to the pixel intensity at the x-th row and y-th column of the binary image. The Binary image from Equation 2 is also converted to a Green image GR(x, y) of 0-255 intensity by taking, at the spatial coordinates of the non-zero values of the Binary image, the green component (channel) at the same spatial coordinates of the foreground image, as in Equation 3:

$GR(x, y) = \begin{cases} F(x, y)_{green}, & BI(x, y) \neq 0 \\ 0, & \text{otherwise} \end{cases}$                  (3)

where GR(x, y) refers to the pixel intensity at the x-th row and y-th column of the Green image and $F(x, y)_{green}$ denotes the pixel intensity at the x-th row and y-th column of the foreground image using just the green component.
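As a minimal sketch of Equations 1-3 (not the paper's implementation), the three images can be produced as follows; representing the images as byte arrays and passing the green channel as a separate array are assumptions made for illustration.

```csharp
using System;

// Sketch of Equations 1-3: differenced image, binary image and green image.
public static class RoiExtraction
{
    // Equation 1: S(x, y) = |B(x, y) - F(x, y)| on grey-level images.
    public static byte[,] DifferencedImage(byte[,] backgroundGrey, byte[,] foregroundGrey)
    {
        int rows = foregroundGrey.GetLength(0), cols = foregroundGrey.GetLength(1);
        var s = new byte[rows, cols];
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
                s[x, y] = (byte)Math.Abs(backgroundGrey[x, y] - foregroundGrey[x, y]);
        return s;
    }

    // Equation 2: BI(x, y) = 1 when S(x, y) > Th, else 0.
    public static byte[,] BinaryImage(byte[,] differenced, int threshold)
    {
        int rows = differenced.GetLength(0), cols = differenced.GetLength(1);
        var bi = new byte[rows, cols];
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
                bi[x, y] = (byte)(differenced[x, y] > threshold ? 1 : 0);
        return bi;
    }

    // Equation 3: GR(x, y) = green channel of F where BI is non-zero, else 0.
    public static byte[,] GreenImage(byte[,] binary, byte[,] foregroundGreenChannel)
    {
        int rows = binary.GetLength(0), cols = binary.GetLength(1);
        var gr = new byte[rows, cols];
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
                gr[x, y] = binary[x, y] != 0 ? foregroundGreenChannel[x, y] : (byte)0;
        return gr;
    }
}
```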

Applying the Invariant Moments
Pixel intensity is nothing but the pixel color value. Image moments can simply be described as functions of the image pixel intensities. Moments are described with respect to their order, in the same sense as powers ("raised to the power") in mathematics.
A set of moments represents the geometrical features of an image and describes the object shape. Via moment functions, the center of mass, the area of an image and orientation information can be found, and the properties of an image can be generated from its geometric moments Ong et al. (2014).
If an image is described by a 2D discrete intensity function I(x, y) with non-zero values in a finite part of the XOY plane, then the geometrical moments of order (p + q) are given by Equation 4, Favorskaya et al. (2013):

$m_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{N} x^{p} y^{q} I(x, y)$                  (4)

where M and N are the image dimensions and x, y are the region coordinates (pixel coordinates in digitized images). The central moments of order (p + q) are given by Equation 5:

$\mu_{pq} = \sum_{x=1}^{M} \sum_{y=1}^{N} (x - \bar{x})^{p} (y - \bar{y})^{q} I(x, y)$                  (5)

where $\mu_{pq}$ denotes the central moment and $(\bar{x}, \bar{y})$ is the center of gravity (centroid) of the object image. The central moments are invariant to translations and the centroid can be obtained using Equation 6, Favorskaya et al. (2013):

$\bar{x} = \frac{m_{10}}{m_{00}}, \quad \bar{y} = \frac{m_{01}}{m_{00}}$                  (6)

Scale invariance is achieved by normalizing the moments according to Equation 7:

$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}, \quad \gamma = \frac{p + q}{2} + 1, \quad p + q \geq 2$                  (7)

It is known that seven non-linear functions $\phi_1$-$\phi_7$ of the normalized moments form the invariant Hu moments Favorskaya et al. (2013); they are given in Equation 8:

$\phi_1 = \eta_{20} + \eta_{02}$
$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$
$\phi_3 = (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2$
$\phi_4 = (\eta_{30} + \eta_{12})^2 + (\eta_{21} + \eta_{03})^2$
$\phi_5 = (\eta_{30} - 3\eta_{12})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] + (3\eta_{21} - \eta_{03})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]$
$\phi_6 = (\eta_{20} - \eta_{02})\left[(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right] + 4\eta_{11}(\eta_{30} + \eta_{12})(\eta_{21} + \eta_{03})$
$\phi_7 = (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]$                  (8)
Equation 8 is applied to the Grey image S(x, y) and then to the Binary image BI(x, y) and then to the Green image GR(x, y) separately.
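The moment computation of Equations 4-8 can be sketched as follows; this is an illustrative implementation, not the paper's code, and the convention of using the row index as x and the column index as y is an assumption.

```csharp
using System;

// Sketch of Equations 4-8: geometric moments, central moments, normalised
// moments and the seven Hu invariant moments of a grey-level image.
public static class HuMoments
{
    public static double[] Compute(byte[,] image)
    {
        int rows = image.GetLength(0), cols = image.GetLength(1);

        // Equations 4 and 6: low-order raw moments and the centroid.
        double m00 = 0, m10 = 0, m01 = 0;
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
            {
                double v = image[x, y];
                m00 += v; m10 += x * v; m01 += y * v;
            }
        double xc = m10 / m00, yc = m01 / m00;

        // Equation 5: central moments up to order 3.
        var mu = new double[4, 4];
        for (int x = 0; x < rows; x++)
            for (int y = 0; y < cols; y++)
            {
                double v = image[x, y], dx = x - xc, dy = y - yc;
                for (int p = 0; p <= 3; p++)
                    for (int q = 0; q <= 3 - p; q++)
                        mu[p, q] += Math.Pow(dx, p) * Math.Pow(dy, q) * v;
            }

        // Equation 7: normalised moments eta_pq = mu_pq / mu_00^((p+q)/2 + 1).
        Func<int, int, double> eta = (pi, qi) => mu[pi, qi] / Math.Pow(mu[0, 0], (pi + qi) / 2.0 + 1.0);
        double n20 = eta(2, 0), n02 = eta(0, 2), n11 = eta(1, 1);
        double n30 = eta(3, 0), n03 = eta(0, 3), n21 = eta(2, 1), n12 = eta(1, 2);

        // Equation 8: the seven Hu invariants.
        var phi = new double[7];
        phi[0] = n20 + n02;
        phi[1] = Math.Pow(n20 - n02, 2) + 4 * n11 * n11;
        phi[2] = Math.Pow(n30 - 3 * n12, 2) + Math.Pow(3 * n21 - n03, 2);
        phi[3] = Math.Pow(n30 + n12, 2) + Math.Pow(n21 + n03, 2);
        phi[4] = (n30 - 3 * n12) * (n30 + n12) * (Math.Pow(n30 + n12, 2) - 3 * Math.Pow(n21 + n03, 2))
               + (3 * n21 - n03) * (n21 + n03) * (3 * Math.Pow(n30 + n12, 2) - Math.Pow(n21 + n03, 2));
        phi[5] = (n20 - n02) * (Math.Pow(n30 + n12, 2) - Math.Pow(n21 + n03, 2))
               + 4 * n11 * (n30 + n12) * (n21 + n03);
        phi[6] = (3 * n21 - n03) * (n30 + n12) * (Math.Pow(n30 + n12, 2) - 3 * Math.Pow(n21 + n03, 2))
               - (n30 - 3 * n12) * (n21 + n03) * (3 * Math.Pow(n30 + n12, 2) - Math.Pow(n21 + n03, 2));
        return phi;
    }
}
```

In the proposed technique this computation would be run three times per frame, once for each of the Grey, Binary and Green images.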

Computing the Sum of Square Error (SSE)
After applying the invariant moments to the object (car) in the image, the values of the seven invariant moments and the Sum of Square Error (SSE) are used to detect and recognize the car according to the information stored in the cars database.
SSE is a measure of the discrepancy between the invariant moments calculated in 3.3 and the invariant moments stored in the database for specific objects (cars). A small SSE indicates a tight fit between the object in the database and the object extracted from the sequence of images, and it is used as an optimality criterion in object selection. SSE is given in Equation 9:

$SSE = \sum_{i=1}^{7} \left( \phi_{i} - \phi_{i}^{l} \right)^{2}$                  (9)

where SSE is the Sum of Square Error, $\phi_{i}$ is the i-th invariant moment calculated for the input image and $\phi_{i}^{l}$ is the i-th invariant moment stored in the database for a specific car. SSE is applied to the Grey, Binary and Green images separately. The result of SSE on any one of the three image types (Grey, Binary and Green) can be used to detect and recognize the car, and the results of SSE on the three types together can also be used.
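A minimal sketch of Equation 9 and of selecting the stored car with the smallest SSE follows; the dictionary standing in for the database and the use of a single image type are illustrative assumptions.

```csharp
using System.Collections.Generic;

// Sketch of Equation 9 and of using it to pick the best-matching stored car.
public static class CarMatcher
{
    // Equation 9: SSE between the seven computed invariants and the stored ones.
    public static double SumOfSquareError(double[] computed, double[] stored)
    {
        double sse = 0;
        for (int i = 0; i < 7; i++)
        {
            double d = computed[i] - stored[i];
            sse += d * d;
        }
        return sse;
    }

    // The stored car with the smallest SSE (here on a single image type) is reported as the match.
    public static string BestMatch(Dictionary<string, double[]> storedCars, double[] computedMoments)
    {
        string best = null;
        double bestSse = double.MaxValue;
        foreach (var entry in storedCars)
        {
            double sse = SumOfSquareError(computedMoments, entry.Value);
            if (sse < bestSse) { bestSse = sse; best = entry.Key; }
        }
        return best;
    }
}
```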

Case Studies and Experimental Results
The technique described above was implemented in Visual Studio 2012 using the C# programming language, with a camera mounted on the car capturing the front view as a sequence of images. The system was tested using a set of up to 40 images as samples; Fig. 1 and 2 show the steps of the proposed technique for two samples taken from two sequences of images.

Figures 1 and 2 illustrate the implementation steps of the proposed technique for two different cars. Image (a) is the original image taken from a series of images, which is the input to the system. Image (b) is the grey image resulting from background subtraction. Images (c) and (d) are the results of converting image (b) to binary and green, respectively. Image (e) in Figures 1 and 2 shows the result of applying the invariant moments to images (b), (c) and (d) separately. Image (f) shows the result of computing the SSE between the values in (e) of Fig. 1 and Fig. 2 and the invariant moments stored in the database, which is used to detect and recognize the moving car as shown in (g).

The proposed technique was evaluated using three performance metrics, namely Precision, Recall and Figure of Merit (FoM). These metrics are based on the following parameters: False Positive (FP), True Positive (TP) and False Negative (FN). The Recall, or Detection Rate, given by Equation 10, measures the percentage of predicted true positives compared with the total number of actual positives in the ground truth:

$Recall = \frac{TP}{TP + FN}$                  (10)

Precision, defined by Equation 11, measures the percentage of correct detections compared with the total number of detections reported as positive:

$Precision = \frac{TP}{TP + FP}$                  (11)

Recall and Precision measure performance from very different perspectives. Thus, a weighted harmonic mean of Recall and Precision, called the Figure of Merit (FoM), provides a better joint performance evaluation and is defined by Equation 12:

$FoM = \frac{2 \times Recall \times Precision}{Recall + Precision}$                  (12)

Table 1 shows the performance of the proposed technique and four other techniques based on the above measures (Recall, Precision and FoM). The obtained results show that the proposed technique has higher Recall, Precision and FoM values than the other existing models, so the proposed technique is more robust than the compared algorithms in terms of FoM.
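For reference, the three measures in Equations 10-12 can be computed from the TP, FP and FN counts as in the following sketch (illustrative, not taken from the paper).

```csharp
// Sketch of Equations 10-12: Recall, Precision and Figure of Merit from
// true-positive, false-positive and false-negative counts.
public static class DetectionMetrics
{
    // Equation 10: Recall (Detection Rate).
    public static double Recall(int tp, int fn)
    {
        return (double)tp / (tp + fn);
    }

    // Equation 11: Precision.
    public static double Precision(int tp, int fp)
    {
        return (double)tp / (tp + fp);
    }

    // Equation 12: weighted harmonic mean of Recall and Precision.
    public static double FigureOfMerit(int tp, int fp, int fn)
    {
        double r = Recall(tp, fn), p = Precision(tp, fp);
        return 2.0 * r * p / (r + p);
    }
}
```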

Table 1: Performance comparison on various datasets for several algorithms