Motorsafe: An Android Application for Motorcyclists Using Decision Tree Algorithm

In Vietnam, a ban on mobile phone use while driving has long been in effect because the habit is highly dangerous to both drivers and other road users: it distracts drivers and results in many crashes. Motorsafe is a novel mobile application launched as a solution to mobile phone use while operating a vehicle. Based on data obtained from the smartphone's accelerometer sensor, a decision tree algorithm enables the device to recognize the app user's status. The application then performs a number of tasks that help the biker focus on operating the vehicle rather than on unnecessary phone interactions.


Introduction
According to the Traffic Police Department under the Ministry of Public Security, there are about 60 million motorbikes in circulation in Vietnam, accounting for 93% of all vehicles. Consequently, motorcyclists are involved in a high proportion (approximately 59%) of the road traffic collisions in the country, which cause ten thousand deaths and injuries each year; tragically, many of the victims are teenagers. Bikers are reported to frequently use mobile phones (today, mostly smartphones) while controlling their vehicles, which distracts them and reduces their ability to observe the road and control their speed. Unexpected events may therefore occur, and in most cases these can have catastrophic consequences.
In recent years, researchers and companies around the world have developed several countermeasures to reduce the number of traffic accidents and fatalities and to protect drivers. A variety of mobile phone apps have been designed to act as voluntary 'workload managers' that prevent distracted driving due to mobile phone use, but most of them are for cars [3][4][5][6]. A pioneer among apps that protect two-wheeler riders is S-Bike, which will come preinstalled on all 4G-enabled Samsung Galaxy J Series devices in the future and is designed to enable responsible and safe riding. S-Bike filters incoming calls but still allows calls from important people to be received after the motorcyclist brings the bike to a halt [2]. However, S-Bike has several major limitations. It cannot automatically switch between S-Bike mode and normal mode; the biker must do this manually. This is inconvenient and problematic because the smartphone remains in S-Bike mode after the biker halts, so he or she might miss important incoming calls after forgetting or failing to switch the phone back to normal mode. Furthermore, the application is only available on costly Samsung products, which means that it cannot be installed on other Android phones.
Therefore, in this study, we propose an effective application, named "Motorsafe", that can be installed on all Android smartphones to reduce crash rates associated with mobile phone use while driving. The application automatically recognizes the state of the user. For example, while the biker is detected as "On vehicle", the phone automatically changes to silent mode and all incoming calls from unimportant or unknown people are rejected. When the biker stops the bike, the phone automatically returns to normal mode.
The first version of this software was developed using the Google Activity Recognition API (GAR_API), with its machine learning classifiers, to recognize the user's activities. This approach has the advantage of offering readily available, well-studied functions that can be applied immediately with high accuracy. However, this advantage is also a drawback: the available functions have parameters that cannot be adjusted, and other features cannot be exploited when using this algorithm. In addition, the GAR_API algorithm is extremely complex and ill-suited to lower-configuration phones.
In this study, we developed a simpler algorithm, the Decision Tree Algorithm, which can be used on Android phones with a range of configurations, including budget smartphones. Our application can therefore reach a wider range of users. The algorithm detects the behavior of the user, "On vehicle" or "Not on vehicle", by analyzing the information obtained from the built-in accelerometer sensors of Android smartphones. A number of tasks are then performed based on the detected status:
• Automatically detecting the status of the user, changing the phone to silent mode when the user is riding a two-wheeler, and switching the phone back to normal mode when the user brings the motorbike to a halt.
• Rejecting all unimportant and unknown incoming calls and automatically sending a message to inform the caller of the user's "On vehicle" state.
• Allowing incoming calls from important people (VIPs) regardless of whether the user is on the vehicle or not. The list of VIPs can be erased, modified or replaced.
• Handling emergency calls from callers not in the VIP list, identified by repeated calling (3 calls within 3-5 minutes), in the same way as calls from the VIP list.
• Warning traffic participants if they are speeding.
• Detecting an accident and automatically sending a notification message, together with the exact location of the crash, to the biker's relatives or to the hotline of the nearest hospital.
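The call-handling rules in the list above can be sketched as a small decision routine. This is only an illustrative sketch, not the application's actual code: the class name `CallFilter`, the return strings, and the choice of 5 minutes as the upper bound of the "3-5 minutes" window are all assumptions made for the example.

```python
from collections import deque

REPEAT_CALLS = 3          # number of calls that marks a caller as urgent (from the text)
REPEAT_WINDOW_S = 5 * 60  # assumed: upper bound of the "3-5 minutes" window, in seconds


class CallFilter:
    """Hypothetical helper: decide whether an incoming call rings or is rejected."""

    def __init__(self, vips):
        self.vips = set(vips)
        self.recent = {}  # caller -> deque of recent call timestamps (seconds)

    def handle(self, caller, now, on_vehicle):
        # Record this call and forget calls that fall outside the time window.
        times = self.recent.setdefault(caller, deque())
        times.append(now)
        while now - times[0] > REPEAT_WINDOW_S:
            times.popleft()

        # Normal mode, or a VIP caller: the call is always let through.
        if not on_vehicle or caller in self.vips:
            return "ring"
        # Repeated calls in a short period are treated like a VIP call.
        if len(times) >= REPEAT_CALLS:
            return "ring"
        # Otherwise reject and notify the caller of the "On vehicle" state.
        return "reject_and_send_sms"
```

For instance, an unknown caller who rings three times within two minutes while the user is riding would be rejected twice and let through on the third attempt.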
Our application is available on Google Play [17].

Data collection and feature extraction
In this paper, the data were collected directly from the accelerometer sensor and then processed to determine the user's "On Driving" behavior from the motions of the vehicle.
The raw data stream from the accelerometer is the acceleration along each of the X, Y and Z axes, in units of g-force. Most existing accelerometers provide an interface for configuring the sampling frequency, so that the user can choose the most suitable sampling rate through experiments. After the raw data are collected from the accelerometer sensor, the next step is to pre-process them before performing any further statistical computations. One purpose of the pre-processing is to reduce the sensor noise.
To classify the users' activities, we used two different components of the raw acceleration data: static and dynamic. The static component of the acceleration is caused by the orientation of the sensor with respect to gravity, and it can be obtained by passing the raw signal through a low-pass filter (LPF). The dynamic component of the acceleration is caused directly by the movement of the accelerometer in the smartphone. The dynamic body acceleration (DBA) can be calculated by subtracting the static component of the acceleration from the corresponding raw accelerometer data. The overall dynamic body acceleration (ODBA) and its vectorial variant (VeDBA) represent an aggregated acceleration used to detect different states [7]. The dynamic body acceleration is therefore a good candidate for discriminating behaviors with high dynamic movements (such as walking or running) from behaviors with low dynamic movements (such as being on a vehicle or lying on a table).
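The dynamic body acceleration on each axis is obtained as

$$A_i = x[i] - y[i], \quad i = X, Y, Z$$

where $A_i$ is the dynamic acceleration data, $y[i]$ is the filtered acceleration data, and $x[i]$ is the raw acceleration data. The obtained DBA values are then used to calculate the overall dynamic body acceleration (ODBA) and the vectorial dynamic body acceleration (VeDBA), using their standard definitions:

$$\mathrm{ODBA} = |A_X| + |A_Y| + |A_Z|$$

$$\mathrm{VeDBA} = \sqrt{A_X^2 + A_Y^2 + A_Z^2}$$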
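The filtering and feature extraction described above can be sketched as follows. This is a minimal sketch under stated assumptions: the paper does not specify the low-pass filter design, so a first-order exponential smoothing filter with a hypothetical smoothing factor `ALPHA` is used here, and ODBA and VeDBA follow their common definitions (sum of absolute per-axis values, and vector magnitude, respectively). The function name `dynamic_features` is illustrative.

```python
import math

ALPHA = 0.1  # assumed smoothing factor; the filter design is not given in the paper


def dynamic_features(samples, alpha=ALPHA):
    """Return a list of (ODBA, VeDBA) pairs for (x, y, z) readings in g."""
    gravity = list(samples[0])  # start the static estimate at the first sample
    features = []
    for raw in samples:
        # Low-pass filter: slowly track the static (gravity) component per axis.
        gravity = [g + alpha * (r - g) for g, r in zip(gravity, raw)]
        # Dynamic body acceleration: raw value minus static component, per axis.
        dba = [r - g for r, g in zip(raw, gravity)]
        odba = sum(abs(a) for a in dba)
        vedba = math.sqrt(sum(a * a for a in dba))
        features.append((odba, vedba))
    return features
```

A phone lying still produces dynamic values near zero on every axis, whereas walking or vehicle vibration produces larger values; a smaller `alpha` makes the static estimate follow the raw signal more slowly.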

Data classification
The decision tree is one of the most popular machine learning algorithms and is used for both classification and regression problems [12][13]. The general motivation for deploying a decision tree is to create a model that can predict the class or value of the target variable by learning decision rules inferred from prior data (training data).
In the field of machine learning, the decision tree is a type of predictive model, that is, a mapping from observations about an object or phenomenon to a conclusion about its target value. Each internal node corresponds to a variable, and each connection between a node and its child node corresponds to a specific value of that variable. Each leaf node represents the predicted value of the target variable, given the values of the input variables along the path from the root node to that leaf [8].
There are many specific decision-tree algorithms, such as ID3, C4.5, CART, and CHAID. Since ROC analysis is increasingly recognized as an important tool for evaluating and comparing classifiers, we decided to use the receiver operating characteristic (ROC) curve to define the thresholds, with the aim of improving the quality of the decision tree algorithm in this study. In our research, we took into consideration two main modes: On Vehicle and Not-on Vehicle. We divided the Not-on Vehicle mode into high dynamic activities (denoted by Not-on Vehicle (1)) and low dynamic ones (denoted by Not-on Vehicle (2)). The proposed flowchart for classification is shown in Fig. 1.

Fig. 1. The decision tree algorithm flowchart
The feature VeDBAs was then fed into the flow (as shown in Figure 2). The VeDBAs values were compared with a predefined threshold A to differentiate between high and low dynamic activities. For high dynamic activities (such as walking and running), the output was marked "Not on vehicle (1)". For low dynamic activities, the system then compared VeDBAs with threshold B in order to classify the data into two classes, "On vehicle" and "Not on vehicle (2)".
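The two-threshold flow described above can be written as a very small decision tree. The sketch below is illustrative only: the numeric values of `THRESHOLD_A` and `THRESHOLD_B` are placeholders, not the thresholds found in this study, and the direction of the second comparison (a value above B means "On vehicle") is an assumption based on the idea that a moving vehicle vibrates more than a phone at rest.

```python
THRESHOLD_A = 1.0  # placeholder: separates high from low dynamic activity
THRESHOLD_B = 0.2  # placeholder: separates "On vehicle" from low-dynamic rest


def classify(vedbas, a=THRESHOLD_A, b=THRESHOLD_B):
    """Classify one VeDBAs value with two successive threshold tests."""
    # Root node: high dynamic activity (e.g. walking or running).
    if vedbas > a:
        return "Not on vehicle (1)"
    # Second node (assumed direction): moderate vibration suggests a moving vehicle.
    if vedbas > b:
        return "On vehicle"
    # Remaining leaf: low-dynamic state such as a phone lying on a table.
    return "Not on vehicle (2)"
```

Because each prediction needs at most two comparisons, the classifier is cheap enough to run continuously on low-end hardware, which matches the motivation given for choosing a decision tree.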

Results and Discussion
Firstly, the raw acceleration data were acquired from the acceleration sensor with a sampling frequency of 50 Hz. The VeDBA stream was then processed in real time in two-second data frames. Each data frame therefore consisted of 100 samples, which the algorithm converted into a single element (the sum of the 100 consecutive samples).
We then used the confusion matrix to evaluate the performance of the detection algorithm [10][11]. TP (true positive) and TN (true negative) are the correctly predicted cases, while FP (false positive) and FN (false negative) are the wrongly predicted cases. FP is equivalent to a type I error (accepting an event as positive when it is negative in reality), and FN is equivalent to a type II error (rejecting an event that is positive in reality and assigning it to the negative class).
The best possible prediction method produces a point in the upper left corner of the ROC space, that is, 100% of the true positive cases are found with 0% false positives. Random prediction results in a straight line that forms a 45-degree angle with the horizontal axis, running from the bottom left to the upper right. A model predicts better than random when its ROC curve lies above and to the left of this random line. The best cut point between the positive and negative classes was taken as the point at which a line parallel to the random line is tangent to the ROC curve. Finally, the optimal values of the thresholds were identified in this way. The thresholds are used to classify the three states as shown in Fig. 1.
In this paper, the performance of the system was measured by calculating the precision and specificity using the following equations:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$

Twelve volunteers were randomly selected from different groups of students at VNU University of Engineering and Technology, Hanoi, Vietnam. The raw acceleration data were acquired from the acceleration sensor at a rate of 50 samples per second. Each state was recorded for approximately 30 minutes by each volunteer (50% for training and 50% for testing). The experimental data were then analyzed, and the average performance of the system is shown in Table 1.
Our findings show the simplicity and effectiveness of the decision tree algorithm. The specificity and precision of the system were considerably high. In particular, the precision for "On vehicle" was high because only a few instances that were "Not on vehicle" in reality were predicted as "On vehicle" by the algorithm. This suggests that our algorithm is an effective solution for this task.
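The evaluation steps described above can be sketched in a few helper functions. This is a hedged illustration rather than the study's code: all function names are hypothetical, the rule "score greater than threshold means positive" is an assumption, and the tangent-point criterion is implemented through the equivalent rule of maximising TPR minus FPR (the point where a line parallel to the diagonal touches the ROC curve). Precision and specificity use their standard definitions, TP/(TP+FP) and TN/(TN+FP).

```python
def frame_sums(values, frame_len=100):
    """Sum consecutive non-overlapping frames (100 samples = 2 s at 50 Hz)."""
    n = len(values) // frame_len
    return [sum(values[k * frame_len:(k + 1) * frame_len]) for k in range(n)]


def confusion(scores, labels, threshold):
    """Count (TP, FP, TN, FN), assuming 'score > threshold' predicts positive."""
    tp = fp = tn = fn = 0
    for score, positive in zip(scores, labels):
        predicted = score > threshold
        if predicted and positive:
            tp += 1
        elif predicted and not positive:
            fp += 1
        elif positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_threshold(scores, labels):
    """Pick the threshold maximising TPR - FPR; both classes must be present."""
    best, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        j = tp / (tp + fn) - fp / (fp + tn)
        if j > best_j:
            best, best_j = t, j
    return best


def precision_specificity(tp, fp, tn, fn):
    """Precision = TP / (TP + FP); specificity = TN / (TN + FP)."""
    return tp / (tp + fp), tn / (tn + fp)
```

In use, the training half of each recording would supply `scores` (the per-frame sums) and `labels`, `best_threshold` would give a candidate cut point, and the test half would then be scored with `confusion` and `precision_specificity`.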
After being installed on the user's smartphone, the application needs to be activated only once. From then on, the software automatically detects the behavior of the user to determine whether he or she is driving. When the "On Vehicle" behavior is detected, that is, when the user starts driving, the app automatically changes the phone from normal mode to silent mode, and it switches the phone back to normal mode when the user stops the motorbike.
The Motorsafe software has multiple important features, as listed in the Introduction section. Figure 3 shows sample setting menus for some of these features. Our application is available on Google Play [17].

Conclusion
In this paper, we have developed a "full-stack" Android application, Motorsafe, with highly practical functions, which has been widely used. Firstly, the decision tree algorithm was used to classify human behavior based on analysis of the data obtained from the smartphone's accelerometer. The decision tree algorithm was chosen so that our software could be installed on low-cost smartphones, thereby reaching more users. Secondly, we have developed a number of services that are triggered after the "On Vehicle" behavior of the user is identified. The application is compatible with the many kinds of Android phones, with their various built-in sensors, that are available on the market. Motorsafe is a good example of cutting-edge mobile technologies that can be implemented in almost every aspect of our lives [13][14][15][16]. Currently, electroencephalogram-based brain-computer interface studies are very promising and have been used for control applications [18][19][20][21]. With the strong development of mobile technology, the electroencephalogram-based brain-mobile phone interface will be an interesting research topic.

Fig. 3. Setting the content of the pre-prepared message or quick responses to send to the caller (a), Setting the Emergency call (b), Setting the VIPs list (c)

Table 1. Average performance (mean ± standard deviation) of our software