Comparative Analysis Between Naïve Bayes Algorithm and Decision Tree Loss Rate from Fire Disaster Data in DKI Jakarta Province

Abstract


A. Introduction
The expansion of urban areas and the density of settlements in DKI Jakarta Province are directly related to the significant increase in population. Indirectly, these conditions have contributed to fire disasters. Fires occur every year in the DKI Jakarta region, especially during the dry season, and have resulted in property loss and fatalities [1].
In many urban areas, such as Jakarta, fire is one of the most frequent disasters. This study was undertaken to understand the fire patterns in DKI Jakarta Province in 2018. Its analysis of fire incidence data can help pinpoint regional patterns, primary causes, and distinctive features of local fires. Fires can cause enormous economic and social losses, including damage to property, loss of life, and an emotional toll on the victims. This study makes it possible to assess the social and economic effects of the fires that occurred in DKI Jakarta Province in 2018 [1].
This study can shed light on the typical fire causes by examining fire incidence data. This knowledge can be utilized to educate the public about fire safety precautions and assist government agencies in creating more potent fire prevention initiatives [2].
Studying fires helps us learn more about disasters in general. By analyzing fire incidence data, this study can shed light on the elements that affect the intensity and spread of fires. Future disaster prediction models may benefit from using this data to improve accuracy [3].
Sharing information about the possible risks of fire disasters is one of the measures taken to lower fire risk in DKI Jakarta Province [4]. The 2018 fire occurrence dataset for DKI Jakarta Province can be used for predictive analysis of the potential hazard of fire disasters. The Decision Tree and Naive Bayes algorithms were used to draw conclusions about and forecast fire disasters in the province. This study classified the magnitude of fire losses using both the Naive Bayes method and the Decision Tree method to determine which method is more suitable for categorizing fire disaster losses [5].
The purpose of this study is to categorize the severity of fire disaster losses using the Naive Bayes and Decision Tree methods and to compare the accuracy of the two methods. Overall, this research aims to categorize the severity of losses by the disaster object, the building, and the cause of the fire. The number of units, the number of personnel, the operation duration, and the response time are also categorized [6].
Data on fire incidents in DKI Jakarta Province, used as the sample, were collected from data.jakarta.go.id. After reviewing other websites, we chose this dataset because it contains all the required variables [7].

B. Research Method
Two algorithmic techniques, the Decision Tree and Naive Bayes algorithms, can be employed to analyze fire incident data in DKI Jakarta Province, draw conclusions, and make predictions about fire disasters. Although the two algorithms analyze data differently, both can offer useful information for understanding and reducing fire risk [8].
The Decision Tree algorithm builds a tree-shaped model of decisions. In fire disaster analysis, it can be used to pinpoint the variables that most affect the likelihood of fires, such as location, season, building type, and human factors like negligence. The tree makes the association between these variables and fire events easy to see. The results of a Decision Tree analysis can be used to plan more precise and detailed preventive action and to identify locations at higher risk of fire [9], [10].
Naive Bayes, by contrast, is an algorithm that uses Bayes' theorem to make predictions based on the probability that an event will occur. In predicting fire disasters, Naive Bayes can be used to estimate the probability of a fire from variables such as the weather, air temperature, wind patterns, and the history of previous fire incidents. Such a model can support early warning systems or fire risk prediction models that give authorities, firefighters, and the general public crucial information. Naive Bayes can also be used to identify the variables that matter most for the region's fire risk [10], [11], [12].
A more thorough assessment of the characteristics and variables influencing the occurrence of fires in DKI Jakarta Province can be made by combining the findings of the two algorithms. The Naive Bayes predictions can inform more effective fire disaster prevention and mitigation plans. Together, the two methods offer insights that may help communities and authorities plan for these risks and lessen the effects of fire disasters [13], [14], [15].
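For reference, the standard Naive Bayes rule can be written as below. The notation is an assumption introduced here for illustration (the text above does not define symbols): C denotes a loss class and x_1, ..., x_n denote the attribute values of one fire incident. The first line is Bayes' theorem; the second applies the "naive" assumption that attributes are conditionally independent given the class.

```latex
P(C \mid x_1, \dots, x_n) = \frac{P(C)\, P(x_1, \dots, x_n \mid C)}{P(x_1, \dots, x_n)}

\hat{C} = \arg\max_{C} \; P(C) \prod_{j=1}^{n} P(x_j \mid C)
```

Here P(C) is the prior probability of a class and P(x_j | C) is the likelihood of an attribute value given that class.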

C. Result and Discussion
Analyzing data with these two algorithms requires a comprehensive dataset whose variables are compatible with the techniques, together with a clear definition of the problem and how to solve it. After the data were imported into RStudio, the packages needed to run the algorithms were installed. The data were imported by broad category from https://data.jakarta.go.id. Records relevant to the analysis variables were selected by filtering for the year 2018. The data format was cleaned and corrected (preprocessing) before the Naive Bayes and Decision Tree methods were applied. Random data selection was used to conduct the study. Refer to Figure 1 for the procedures that were followed.

Figure 1. Preprocessing Data
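The preprocessing steps described above (select the 2018 records, drop incomplete rows, then select data at random) can be sketched as follows. This is a minimal Python illustration, not the R commands used in the study. The field names, the sample records, the 75% training fraction, and the seed are all hypothetical.

```python
import random

# Hypothetical records; field names and values are illustrative only and
# are not the actual columns of the data.jakarta.go.id dataset.
records = [
    {"year": 2018, "object": "residential", "cause": "electrical", "loss_class": "1-250 million"},
    {"year": 2018, "object": "industrial", "cause": "gas", "loss_class": "250-500 million"},
    {"year": 2017, "object": "residential", "cause": "electrical", "loss_class": "1-250 million"},
    {"year": 2018, "object": "vehicle", "cause": None, "loss_class": "1-250 million"},
    {"year": 2018, "object": "residential", "cause": "cigarette", "loss_class": "1-250 million"},
    {"year": 2018, "object": "other", "cause": "electrical", "loss_class": "250-500 million"},
]


def preprocess(rows, year):
    """Keep rows from the given year that have no missing values."""
    selected = [r for r in rows if r["year"] == year]
    return [r for r in selected if all(v is not None for v in r.values())]


def random_split(rows, train_fraction, seed):
    """Shuffle a copy of the rows and split it into training and testing sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


clean = preprocess(records, 2018)
# Assumption: a 75/25 split; the source does not state the proportion used.
train, test = random_split(clean, train_fraction=0.75, seed=42)
print(len(clean), len(train), len(test))  # prints: 4 3 1
```

Fixing the seed makes the random selection repeatable, which helps when the two classifiers are compared on the same split.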
The trial was conducted on randomly selected data. The data were then processed with the Naive Bayes and Decision Tree methods using the corresponding commands, yielding insight into the patterns and causes that affect fire incidence in DKI Jakarta Province. The command used to process the data with the Naive Bayes approach is shown in Figure 2.
Processing data with the Naive Bayes approach involves computing conditional class probabilities, that is, the probability of each attribute value given a particular class. These probabilities are the basis for the fire risk estimates derived from the available variables. The prior probability is determined first, as shown in (1.1).
(1.1)
The likelihood, marked with a bright blue highlight, is the probability that the independent variable X takes its observed value given the class. It shows how the independent variables relate to the classes. As a specific illustration, the likelihood values for the "building" variable are shown in (1.2). This step is a core component of the Naive Bayes approach used to process the fire data in DKI Jakarta Province, and it supports both prediction and a deeper understanding of fire risk.
(1.2)
The next step is prediction on the testing data, using the model built with the Naive Bayes approach. The testing data were not used in model training, so they serve as a reference for evaluating the model's performance and accuracy in classifying fire occurrences. The prediction output lists the predicted class for each testing sample, which shows how well the model classifies fire data from the available variables. Refer to Figure 4 for further information.
The prediction results can be used to estimate how likely a fire is to cause losses within a given nominal range. For example, the results indicate roughly 88 occurrences with possible losses between 1 million and 250 million, five events predicted to have losses between 250 million and 500 million, and so on. Refer to Figure 6 for how fire risk is distributed across the ranges of potential loss.
The confusion matrix, which compares the model's predictions with the actual classes, records the correctness of the results. Based on the confusion matrix, the model's accuracy is approximately 75%. This value summarizes how well the model performed in predicting fire disasters from data processed using the Naive Bayes approach.
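The Naive Bayes steps described above (prior, likelihood, prediction on held-out data, and accuracy from the confusion matrix) can be sketched on toy data as follows. This is an illustrative Python sketch, not the study's R commands. The attribute names, the class labels "low" and "high", and every sample are hypothetical, and the resulting accuracy has no relation to the reported 75%.

```python
from collections import Counter, defaultdict

# Hypothetical samples: attribute values and class labels are illustrative only.
train = [
    ({"building": "residential", "cause": "electrical"}, "low"),
    ({"building": "residential", "cause": "gas"}, "low"),
    ({"building": "industrial", "cause": "electrical"}, "high"),
    ({"building": "residential", "cause": "electrical"}, "low"),
    ({"building": "industrial", "cause": "gas"}, "high"),
    ({"building": "industrial", "cause": "electrical"}, "low"),
]
test = [
    ({"building": "residential", "cause": "electrical"}, "low"),
    ({"building": "industrial", "cause": "gas"}, "high"),
    ({"building": "industrial", "cause": "electrical"}, "low"),
    ({"building": "residential", "cause": "gas"}, "low"),
    ({"building": "industrial", "cause": "gas"}, "high"),
]


def fit(rows):
    """Estimate priors P(C) and likelihoods P(x_j | C) by relative frequency."""
    class_counts = Counter(label for _, label in rows)
    priors = {c: n / len(rows) for c, n in class_counts.items()}
    value_counts = defaultdict(Counter)  # (attribute, class) -> value counts
    for features, label in rows:
        for name, value in features.items():
            value_counts[(name, label)][value] += 1
    likelihoods = {
        key: {v: n / class_counts[key[1]] for v, n in counts.items()}
        for key, counts in value_counts.items()
    }
    return priors, likelihoods


def predict(features, priors, likelihoods):
    """Return the class with the largest prior-times-likelihood score."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for name, value in features.items():
            # A value never seen with this class gets probability 0 (no smoothing).
            score *= likelihoods.get((name, c), {}).get(value, 0.0)
        scores[c] = score
    return max(scores, key=scores.get)


priors, likelihoods = fit(train)
# Confusion matrix stored as counts of (actual, predicted) pairs.
confusion = Counter((actual, predict(x, priors, likelihoods)) for x, actual in test)
correct = sum(n for (actual, predicted), n in confusion.items() if actual == predicted)
accuracy = correct / len(test)
print(likelihoods[("building", "low")])
print(dict(confusion), accuracy)
```

Accuracy here is the share of testing samples on the diagonal of the confusion matrix, which is the quantity the paragraph above reports for the real data.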
The Decision Tree approach likewise relies on the data processing procedure and on dedicated commands, both of which are needed to understand and improve fire risk prediction in DKI Jakarta Province.
As shown in (1.3), the factors with the greatest impact on the magnitude of the disaster loss are the location of the incident, the number of personnel handling the disaster, the number of firefighting units, the firefighting response time, the length of the operation, and the cause of the fire.
(1.3)
Figure 7. Decision Tree Plot
According to the decision tree plot, the anticipated loss is between 1 million and 250 million dollars if the building area is less than 120. If the building area is greater than 120 but at most 425, the loss for industrial and residential structures is anticipated to remain between 1 million and 250 million dollars, whereas for other objects it is predicted to be between 250 million and 500 million dollars. Refer to Figure 7 for further information.
Based on the confusion matrix, the Decision Tree method achieved an accuracy of 78.1%, which indicates how well the model categorizes fire incidents from the processed data. The confusion matrix compares the model's predictions with the actual data and shows the numbers of true positives, true negatives, false positives, and false negatives. It indicates that the Decision Tree method can predict and classify fire events fairly accurately. Together, these results describe the Decision Tree algorithm's performance in processing and interpreting fire data in DKI Jakarta Province. As Figure 8 shows, this study also lays a foundation for developing a more precise and reliable fire risk prediction model in the future.
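The splits read from the decision tree plot can be written as nested rules, as in the sketch below. It encodes only the branches stated in the text. The function name, the object-type strings, and the handling of an area of exactly 120 are assumptions made for illustration, and branches the text does not describe return None.

```python
def predict_loss_range(building_area, object_type):
    """Encode the splits described for the decision tree plot as nested rules.

    Returns a loss-range label, or None for branches the text does not describe.
    """
    if building_area < 120:
        return "1-250 million"
    if building_area <= 425:
        # Assumption: an area of exactly 120 falls on this branch.
        if object_type in ("industrial", "residential"):
            return "1-250 million"
        return "250-500 million"
    return None  # areas above 425 are not described in the text


print(predict_loss_range(100, "vehicle"))      # prints: 1-250 million
print(predict_loss_range(300, "residential"))  # prints: 1-250 million
print(predict_loss_range(300, "vehicle"))      # prints: 250-500 million
```

Reading a fitted tree this way shows why the method is easy to interpret: each prediction follows one path of threshold and category checks from the root to a leaf.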

D. Conclusion
Based on accuracy, the Decision Tree method outperforms the Naive Bayes method. With an accuracy of 75% for the Naive Bayes algorithm and 78% for the Decision Tree algorithm, it can be concluded that the Decision Tree technique performs better than the Naive Bayes algorithm at classifying the severity of fire disaster losses.

E. Acknowledgment
We thank Universitas Multimedia Nusantara for its outstanding assistance, which was essential to completing this research project. We are grateful for its steadfast support and significant contribution, which helped us accomplish our goals.