Comparative Analysis of Background Subtraction and CNN Algorithms for Mid-Block Traffic Data Collection and Classification

Classification of vehicles in the traffic stream is a prerequisite for planning and designing facilities for road users. Considering the importance and growing popularity of automated systems in this field, the aim of this article is to compare two algorithms, one using the Background Subtraction (BS) technique and the other using a Convolutional Neural Network (CNN), with a primary focus on an increased number of vehicle classifications. To check the reliability of these algorithms, their outputs were validated against data obtained from Kachkoot Toll Plaza, India. The results were analyzed using drop-line diagrams and confusion matrices. The overall efficiency of the CNN-based algorithm (0.98) was found to be better than that of the BS-based algorithm (0.95). The comparison presented in this paper will be useful for transportation professionals and agencies.


Introduction and Background
Vehicle classification forms the basis of almost all transportation engineering and planning work, for example, the structural and geometric design of pavements, the design of traffic regulatory and control devices, and the development of facilities for road users. Its accuracy affects the quality of research, which in turn affects policies in the transportation sector (Illahi and Mir, 2020). Grouping vehicles into classes is a way to distinguish them by their size, geometry, and intended application. Conventional approaches to vehicle counting and classification, which include manual recordings such as observation sheets and semi-automatic devices such as traffic counting pads, are still in use. However, these methods are tedious, highly error-prone, and resource-intensive. Automated vehicle detection techniques have been found to address these issues and are therefore advantageous over conventional methods. It is noteworthy that some popular traffic flow models have also been validated using automated techniques (Kanagaraj et al., 2015). Owing to such benefits, automated techniques have in recent years drawn the attention of researchers working on vehicle detection and classification. However, automated devices such as pneumatic sensor tubes and infrared/laser scanners (Brosnan et al., 2015) have been found to be less successful because they are expensive, provide unclassified results, or both. In contrast, computer vision (CV) techniques in combination with machine learning (ML) tools are gaining popularity for the detection and classification of vehicles in real-time traffic scenarios (Zaki et al., 2013).
Apart from ML tools, there are broadly three CV techniques for the detection of moving objects: BS, temporal difference, and optical flow, of which BS is the most popular (Daigavane et al., 2011). Zaki and Sayed (2018) found that CV techniques for vehicle detection and classification are highly sought after because of the ease with which data can be extracted and used for further analysis. CV techniques and ML tools have been applied efficiently under the umbrella of intelligent transportation systems. Saran and Sreelekha (2015) reviewed some important developments in ML tools such as neural networks (NN) and their links with conventional classification concepts such as probability estimation, the tradeoffs involved, and variable selection. They found that NN is a competitive alternative to traditional classifiers. Similarly, Weinblatt et al. (2013) analyzed and reported length-based vehicle classification schemes and appropriate length bin boundaries. It is evident from the literature that researchers have accepted the superiority of CV techniques and ML tools as vehicle classifiers. This has been demonstrated in a number of ways by identifying and classifying vehicles, pedestrians, the surrounding environment, etc. For example, in a study by Jocic et al. (2019), a CNN was used to identify pedestrians, bicyclists, traffic lights, and cars.
The combination of CV techniques and ML tools works on the basis of image processing algorithms. These algorithms take as input a sequence of images from a camera or, in the case of offline processing, a recorded video. Broadly, there are two types of vehicle detection and classification systems, distinguished by how they carry out the detection and classification processes (Zaki and Sayed, 2018). On the one hand, there is the two-stage system, in which the first stage detects the vehicles and the second stage further analyzes the detection results to classify them (Yan et al., 2013). On the other hand, detection and classification can be achieved simultaneously: the vehicle is detected and tagged, i.e. classified, within a single stage.
The following are some of the vehicle detection and classification works successfully carried out using CV techniques, ML tools, or both. Zaki et al. (2013) developed an automated system for classifying road users based on discriminating the shapes of the speed profiles of each road-user type. Zhao and Nevatia (2003) used frame differencing and BS to detect passenger cars using a Bayesian network. Similarly, image features for segmenting motion were put to the test by Daigavane et al. (2011). Buch et al. (2010) developed a model to detect and classify vehicles as well as road users. The model was tested in urban traffic conditions and validated against the reference Imagery Library for Intelligent Detection Systems (i-LIDS) datasets from the UK Home Office. Müller et al. (2001) reviewed NN classification algorithms and proposed a conceptual framework for their use and selection. Khanloo et al. (2012) developed a framework that used a variety of appearance features with their parameters using ML. Li et al. (2017) proposed a unified framework for the simultaneous detection of bicyclists and pedestrians using the Upper Body Multiple Potential Region (UB-MPR) and region CNN. Likewise, in a study by Wang et al. (2017), a vehicle classification system was developed using Faster R-CNN and then tested on an NVIDIA Jetson TK1 board. Saran and Sreelekha (2015) proposed a vehicle detection and classification algorithm using Histograms of Oriented Gradients (HOG) and geometric features of the vehicles. Likewise, Yan et al. (2013) developed a CV-based algorithm to detect bicycles. Zaki and Sayed (2018) presented a multi-step road-user classification system for shared space facilities.
Other methods and techniques used by various researchers are also available in the literature. For example, George et al. (2013) developed an algorithm for automatic identification and classification of vehicles using acoustic signals. Wei et al. (2013) proposed an algorithm for identifying traffic phases using K-means clustering and a level-of-service method. Rajab et al. (2016) used a single-element piezoelectric sensor placed diagonally across a traffic lane in combination with ML technology to classify vehicles. Morris and Trivedi (2006) demonstrated simultaneous tracking and classification of vehicles in a real-time highway traffic monitoring system using cooperating tracker and classifier modules. Based on the geometry of a bicycle, i.e. a frame (in the form of two triangles) and two wheels (in the form of ellipses), Lin and Young (2017) proposed a model to detect bicycles in the traffic stream. Hu et al. (2014) proposed a real-time multiple bicycle detection algorithm that extracts a feature called the multi-scale block local binary pattern (MBLBP). Harlow and Peng (2001) used "range sensors" to detect and classify vehicles.
A review of the literature makes it clear that vehicle detection and classification are important in transportation design and planning. Algorithms based on CV techniques and ML tools have gained superiority over traditional methods because they have proven to be more efficient and require fewer resources. Among these algorithms, the BS technique and NN were found to be the most popular among researchers. However, only a small number of vehicle classifications have been attained using these algorithms. To meet the needs of transportation planners and transport agencies, more classifications need to be attained in practice.
Considering the importance and growing popularity of CV techniques in combination with ML tools for automatic traffic data collection and classification, the aim of this study is to compare two of the most popular algorithms, with the primary focus on attaining a greater number of vehicle classifications. To achieve that, two algorithms, one based on the BS technique and the other based on CNN, were tailored to detect and classify eleven vehicle types within the same algorithm. These algorithms were then tested in the field, and the results were validated against the ground-truth data.
The remainder of the article is organized as follows. Section 2 presents and explains the two traffic classification methodologies. Section 3 describes the study area, the data acquired, and the vehicle classification adopted. Section 4 presents, explains, and validates the results obtained from the site. Finally, Section 5 provides a succinct conclusion regarding the efficiency of the two algorithms.

Traffic Classification Methodologies
In this study, two algorithms for vehicle detection and classification are compared. The algorithms are based on the BS technique (see Figure 1) and CNN (see Figure 4), respectively.

Traffic Classification Algorithm using BS
This approach consists of two fundamental stages: (1) vehicle detection using BS, and (2) vehicle classification using a neural network. Initially, 500 frames of the video were used to train the background subtractor, which then produced foreground masks of the subsequent frames. Because of thresholding, these foreground masks contain all detected moving objects as blobs of pure white color. Contours are then drawn around the detected blobs, and certain validations, such as a minimum object size in the frame, are applied to rule out obvious false positives. This produces a list of detected vehicles whose positions are tracked using the centroid of the detected contour, also known as the bounding box (see Figure 2).
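The detection stage described above can be sketched in plain Python as follows. This is only a minimal illustration and not the authors' implementation: a simple running-average background model stands in for the background subtractor, and every name and parameter here (update_background, foreground_mask, find_blobs, make_frame, alpha, threshold, min_area) is an assumption introduced for the example. The frames are small synthetic grayscale grids rather than real video, and the training frames are empty, which real footage would not be.

```python
from collections import deque

def update_background(background, frame, alpha=0.05):
    """Blend a new frame into the running-average background model."""
    return [[(1 - alpha) * b + alpha * p for b, p in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

def foreground_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background by more than threshold (1 = white)."""
    return [[1 if abs(p - b) > threshold else 0 for b, p in zip(brow, frow)]
            for brow, frow in zip(background, frame)]

def find_blobs(mask, min_area=4):
    """Return bounding box and centroid of each 4-connected blob with at least min_area pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            # Breadth-first flood fill collects all pixels of this blob.
            queue, pixels = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) < min_area:
                continue  # too small: treated as an obvious false positive
            ys = [p[0] for p in pixels]
            xs = [p[1] for p in pixels]
            blobs.append({
                "bbox": (min(xs), min(ys), max(xs), max(ys)),
                "centroid": (sum(xs) / len(xs), sum(ys) / len(ys)),
            })
    return blobs

def make_frame(rows, cols, value=50, box=None):
    """Build a synthetic grayscale frame, optionally with a bright rectangle (x0, y0, x1, y1)."""
    frame = [[value] * cols for _ in range(rows)]
    if box is not None:
        x0, y0, x1, y1 = box
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                frame[y][x] = 200
    return frame

# "Train" the background model on 500 empty frames, as in the text.
background = make_frame(8, 12)
for _ in range(500):
    background = update_background(background, make_frame(8, 12))

# A later frame with one bright 4x3 "vehicle" and one isolated noisy pixel.
frame = make_frame(8, 12, box=(3, 2, 6, 4))
frame[7][11] = 200
blobs = find_blobs(foreground_mask(background, frame))
print(blobs)
```

In this synthetic case the single noisy pixel is rejected by the minimum-area check, leaving one blob with bounding box (3, 2, 6, 4) and centroid (4.5, 3.0). The centroid is the quantity that would be handed to the tracking step.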

Traffic Classification Algorithm using CNN
This approach uses a pre-trained CNN-based system that detects and recognizes vehicles simultaneously within a frame. The system relies on a pre-built classification scheme and classifies the detected objects using the built-in labels on which the network has been trained. The process followed in this approach is presented in Figure 4. In the first stage, the frames are processed through the vehicle detection and recognition network, which returns the detected vehicles with their respective coordinates and bounding boxes. These bounding boxes are used to locate and store each object in memory (see Figure 5). When a vehicle crosses the separation boundary, the counter for that specific class is increased by one in that particular direction.
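The counting rule in the last sentence can be illustrated with a short Python sketch. It assumes, purely for illustration, that the separation boundary is a vertical line at a fixed x-coordinate and that each tracked vehicle is available as a class label plus the x-coordinate of its bounding-box centroid in consecutive frames. The function name count_crossings, the track format, and the direction names are hypothetical and are not taken from the paper.

```python
from collections import Counter

def count_crossings(tracks, boundary_x):
    """Count vehicles per (class, direction) when a centroid crosses boundary_x.

    tracks maps a track id to (label, xs), where xs holds the centroid
    x-coordinate of that vehicle in consecutive frames.
    """
    counts = Counter()
    for label, xs in tracks.values():
        for prev_x, curr_x in zip(xs, xs[1:]):
            if prev_x < boundary_x <= curr_x:
                counts[(label, "left_to_right")] += 1
                break  # count each tracked vehicle at most once
            if prev_x >= boundary_x > curr_x:
                counts[(label, "right_to_left")] += 1
                break
    return counts

# Hypothetical tracks: two vehicles cross the boundary at x = 100, one does not.
tracks = {
    1: ("car", [10, 40, 70, 110]),
    2: ("truck", [180, 140, 90, 60]),
    3: ("car", [20, 30, 45]),
}
counts = count_crossings(tracks, boundary_x=100)
print(dict(counts))
```

Keeping one counter per (class, direction) pair mirrors the text: a crossing increments only the counter of the detected class in the direction of travel.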

Required Adjustments
Camera calibration is a vital step in vehicle detection techniques that involve video analysis. From a side view, overlapping vehicles that pass the boundary at the same instant are either detected and counted as a single "vehicle" or missed entirely. This problem can be overcome by adjusting the camera angle and elevation, which makes it possible to detect and separately count vehicles crossing at the same instant on multi-lane/multi-carriageway roads. Keeping this in view, the camera was adjusted and calibrated as per the guidelines provided by Ismail et al. (2013).

Study Area and Vehicle Types
Kachkoot Toll Plaza, which is located on NH-44, Amliar in the Union Territory (UT) of Jammu & Kashmir, was taken as the study area. 24-hour videos and data entry sheets from July 22, 2019 to July 28, 2019 were obtained from the concerned data processing department. A total of eleven vehicle classifications were considered in this study: 2-wheelers, 3-wheelers, private passenger vehicles (PPVs), light commercial passenger vehicles (LCPVs), light commercial goods vehicles (LCGVs), tractor-trailers, mini-buses, ≥ 2-axle buses, 2-axle trucks, 3-axle trucks, and 4-6-axle trucks. Since no reliable data were available on slow-moving non-motorized road users such as bicycles, bullock-carts, and pedestrians, these were excluded from this study.

Results, Validation and Discussion
The one-week (24-hour) videos (as mentioned in Section 3) were run through both the BS-based and the CNN-based algorithms, and the results obtained were validated against the data entry sheets provided by the concerned data processing department at Kachkoot Toll Plaza. The results for each of the eleven vehicle classifications are presented in Figure 6. To validate the results, various metrics were computed from the confusion matrices (refer to Table 1 and Table 2). The computed metrics include overall efficiency (η), precision, recall/sensitivity (α) and specificity (refer to Equations 1-4),
where TP and TN are true positives and true negatives, respectively; F+ and F- are false positives and false negatives, respectively; X is an off-diagonal element of the confusion matrix; and the subscript c denotes the corresponding class in the confusion matrix.
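As an illustration of how such metrics follow from a multi-class confusion matrix, the sketch below uses the standard textbook definitions, which may differ in notation from Equations 1-4. It assumes that rows hold the actual class and columns the predicted class; the function names (class_metrics, overall_efficiency) and the 3-class matrix are hypothetical placeholders, not data from Table 1 or Table 2.

```python
def class_metrics(matrix, c):
    """Per-class precision, recall and specificity from a confusion matrix.

    Assumed convention: matrix[i][j] is the number of vehicles of actual
    class i that were predicted as class j. Assumes class c occurs at least
    once as both an actual and a predicted class (no zero denominators).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[c][c]                               # diagonal element
    fn = sum(matrix[c]) - tp                        # rest of row c
    fp = sum(matrix[i][c] for i in range(n)) - tp   # rest of column c
    tn = total - tp - fn - fp                       # everything else
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

def overall_efficiency(matrix):
    """Share of all vehicles that lie on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

# Placeholder 3-class matrix for illustration only.
example = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
metrics_class0 = class_metrics(example, 0)
eta = overall_efficiency(example)
print(metrics_class0, eta)
```

For class 0 of this placeholder matrix, TP = 8, F+ = 2, F- = 2 and TN = 18, which gives a precision and recall of 0.8 and a specificity of 0.9; the overall efficiency is 23/30.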
International Journal of Mathematical, Engineering and Management Sciences Vol. 5, No. 6, 1440-1451 https://doi.org/10.33889/IJMEMS.2020.5.6.107
The drop-line diagrams (Figure 6) show that, for all eleven vehicle classifications, the counts obtained from the CNN-based algorithm are closer to the toll data than those obtained from the BS-based algorithm. The confusion matrices (see Table 1 and Table 2) provide sub-metrics such as true positives (TP), true negatives (TN), false positives (F+) and false negatives (F-), which in turn are used to obtain metrics such as the precision, specificity, sensitivity and overall efficiency of an algorithm. These metrics were computed for both algorithms and plotted for all eleven vehicle classifications (see Figures 7-9). The plots show that the CNN-based algorithm is better in terms of precision, specificity and sensitivity for most of these vehicle classifications.
Also, considering the overall efficiency, the CNN-based algorithm performed better (η = 0.98) than the BS-based algorithm (η = 0.95) (see Figure 10).
Analyzing the results, some useful takeaways regarding the BS-based algorithm are as follows:
• If vehicles travelling in the same or opposite directions cross the green vertical band simultaneously, both the counting and the classification of the vehicles are adversely affected. This is mainly due to the vehicles overlapping in the frame.
• The overlapping issue can be dealt with by changing the view of the camera from side to elevated. This solves the problem of vehicle counting to an extent, but the accuracy of vehicle classification is reduced.
• Even a slight physical disturbance to the camera, due to environmental or human causes, makes the camera vibrate or shake, which creates noise in the foreground mask and therefore reduces the efficiency of vehicle detection.
• Comparatively, this algorithm of vehicle detection and classification is time-consuming. However, it requires less computational power.
Some valuable takeaways regarding the CNN-based algorithm are as follows:
• Vehicle detection, in this case, does not rely on a foreground mask and is unaffected by vibrations or shaking of the camera. Therefore, the accuracy of detecting and classifying vehicles is increased.
• Any suitable camera angle can be used for collecting data, provided that the vehicles are visible and distinguishable. Moreover, no background training is required in this case.
• This algorithm proves to be more efficient in terms of the computational power required on aggregate and the time taken for simultaneous detection and classification of vehicles.
• The detection and classification of vehicles are done simultaneously, and therefore the whole process becomes simpler and less tedious.

Conclusions
The aim of this study was to compare two algorithms for automatic detection and classification of vehicles on mid-block road sections, with a primary focus on an increased number of classifications. To that end, BS-based and CNN-based algorithms were tailored and then tested. The results from both algorithms were validated against the Kachkoot Toll data using drop-line diagrams and confusion matrices. The analysis showed that the CNN-based algorithm outperformed the BS-based algorithm in terms of precision, specificity and recall/sensitivity. The overall efficiency of the CNN-based algorithm (η = 0.98) was also found to be better than that of the BS-based algorithm (η = 0.95). Moreover, considering the overall processing time, the CNN-based algorithm requires less computational power on aggregate. Considering these merits, this algorithm will be helpful to transportation agencies and professionals.

Conflicts of Interest
The authors confirm that there is no conflict of interest to declare for this publication.