A Machine Learning-Based Method for Predicting the Classification of Aircraft Damage

Efficient and accurate classification of aircraft damage is paramount in ensuring the safety and reliability of air transportation. This research uses a machine learning-based approach tailored to predict the classification of aircraft damage with high precision and reliability to achieve data-driven insights as input for the improvement of safety standards. Leveraging a diverse dataset encompassing various types and severities of damage instances, our methodology harnesses the power of machine learning algorithms to discern patterns and correlations within the data. The approach involves using extensive datasets consisting of various structural attributes, flight data, and environmental conditions. The Random Forest algorithm, Support Vector Machine, and Neural Networks methods used in the research are more accurate than traditional methods, providing detailed information on the factors contributing to damage severity. By using machine learning, maintenance schedules can be optimized and flight safety can be improved. This research is a significant step toward predictive maintenance, which is poised to improve safety standards in the aerospace industry.


Introduction
Safety is a top priority in aviation, ensuring safe take-offs and landings for every flight. Civil aviation is a complex system that involves many factors, including regulation, transnational nature, deregulation, financing and operation, and safety and security [1], [2]. A proactive approach to safety in aviation involves identifying hazards and risks before they materialize into incidents or accidents and taking necessary actions to reduce safety risks [3], [4], [5]. Data analytics plays a significant role in improving safety in the aviation industry by identifying risks, understanding operational issues, and enhancing safety. By leveraging predictive analytics and advanced safety analytics solutions, airlines can take proactive measures to prevent incidents and improve safety [6], [7], [8]. Machine learning can potentially bring significant benefits to the industry, improving safety, efficiency and customer experience in aviation. Its modelling technique uses training data to construct models and then predict outcomes [9], which differentiates it from data mining. As a self-learning process, it models data to make predictions, whereas data mining focuses on pattern discovery in large datasets [10].

Related Work
Machine learning is a useful tool for analyzing safety and risk in various fields, such as industrial risk assessment [11]. It is also used in analyzing and predicting factors in material development and manufacturing processes and can help measure sensitive factors to improve product quality [12]. Machine learning also aims to discover the critical parameters that can occur in the conditions of related cases, and then use this as input in the development of safer operations [13]. Several algorithms were utilized to predict aircraft crash severity: Support Vector Machine, Random Forest, Gradient Boosting Classifier, K-nearest neighbours Classifier, Logistic Regression and Artificial Neural Network. The dataset used was obtained from the National Transportation Safety Board (NTSB) [14]. Deep learning models, comprising recurrent and non-recurrent neural networks, predict crashes before they occur [15]. Machine learning algorithms show promise in predicting aircraft crash severity and can support proactive safety management [16].

Methodology
Using historical data, machine learning algorithms have demonstrated significant promise in predicting the severity of aircraft accidents. The methodology proposed involves several stages, as described in multiple research papers [17], [18], [19]. Various statistical modeling approaches are employed to create accident prediction models. These are increasingly sophisticated and crucial to safety studies. It is essential to establish a close connection between the developers of these models and those who apply them in practice to ensure the maximum benefit [20]. The research process consists of several stages (Figure 1). The National Transportation Safety Board (NTSB) provided aviation accident data, which includes information on incidents and accidents from 1982 to 2020.
By analysing numerical, categorical, and time-related variables, exploratory data analysis (EDA) can be used to identify key factors that impact customer satisfaction, such as flight punctuality, in-flight amenities, and customer service. Additionally, EDA can be a valuable tool for identifying patterns and trends in aviation accidents, which can help in developing strategies for preventing future accidents and improving safety measures.

Result
Exploratory Data Analysis (EDA) can be a useful tool in identifying patterns and factors that contribute to accidents.

Data Cleaning
Data cleaning is a crucial step in research because raw data is often incomplete, inconsistent, and noisy, which affects the accuracy and reliability of insights derived from it. The following steps were performed for data cleaning:
1. Create an empty DataFrame to store columns and their null values.
2. Use Simple Imputer to fill in the null spaces with the modal values for categorical data.
3. Use Simple Imputer to fill in the null spaces with the mean values for the numeric data.
4. Concatenate the cleaned DataFrames.
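The four cleaning steps can be sketched as follows. This is a minimal illustration and not the study's actual code: the toy DataFrame, its values, and the variable names are assumptions, and only the column naming style mimics the NTSB data.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical toy data; the real NTSB table has many more columns.
df = pd.DataFrame({
    "Weather.Condition": ["VMC", np.nan, "IMC", "VMC"],
    "Total.Fatal.Injuries": [0.0, 2.0, np.nan, 1.0],
})

# Step 1: a DataFrame that records each column and its number of nulls.
null_report = pd.DataFrame({
    "column": df.columns,
    "null_count": df.isna().sum().values,
})

# Split the columns by type so each group gets its own imputation strategy.
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.select_dtypes(exclude="number").columns

# Step 2: fill categorical nulls with the modal (most frequent) value.
# Casting to object gives the imputer plain strings and NaN.
cat_imputer = SimpleImputer(strategy="most_frequent")
cat_clean = pd.DataFrame(
    cat_imputer.fit_transform(df[cat_cols].astype(object)),
    columns=cat_cols, index=df.index,
)

# Step 3: fill numeric nulls with the column mean.
num_imputer = SimpleImputer(strategy="mean")
num_clean = pd.DataFrame(
    num_imputer.fit_transform(df[num_cols]),
    columns=num_cols, index=df.index,
)

# Step 4: concatenate the cleaned pieces back into one DataFrame.
clean_df = pd.concat([cat_clean, num_clean], axis=1)
print(null_report)
print(clean_df)
```

In a modelling pipeline the imputers would normally be fitted on the training split only and then applied to the test split, so that test-set statistics do not leak into training; whether the study did this is not stated.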

Heatmap
Heatmaps are a versatile tool in machine learning that can be used for feature selection, prediction, data visualization, and interpretability. Figure 2 shows the result of the correlation matrix calculation. The plot shows that Event.id is highly correlated with Year, which suggests that the event identifier carries information about the temporal aspects of the event, such as the year of occurrence. Injury.Severity is highly correlated with Total.Fatal.Injuries, which implies that injury severity tends to be higher as fatal injuries increase. Purpose.of.flight is highly correlated with Schedule, which suggests that the purpose of the flight can influence or determine its schedule. Aircraft.damage is highly correlated with Investigation.Type, which suggests that the severity or type of damage can influence the chosen investigation approach. Amateur.Built is highly correlated with Event.id, which implies that the event identifier may contain information about whether the incident involved an amateur-built aircraft.
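A correlation matrix of the kind plotted in a heatmap can be computed as in the sketch below. The encoding step is an assumption: the paper does not say how categorical columns were converted to numbers, so integer codes from `pd.factorize` are used here purely for illustration, and the toy values are invented.

```python
import pandas as pd

# Hypothetical rows; only the column names follow the paper.
df = pd.DataFrame({
    "Year": [1982, 1990, 2001, 2010, 2020],
    "Total.Fatal.Injuries": [0, 1, 0, 3, 2],
    "Injury.Severity": ["Non-Fatal", "Fatal", "Non-Fatal", "Fatal", "Fatal"],
})

encoded = df.copy()
for col in encoded.select_dtypes(exclude="number").columns:
    # factorize maps each distinct label to an integer code (assumed encoding).
    encoded[col] = pd.factorize(encoded[col])[0]

# Pairwise Pearson correlations; this matrix is what a heatmap visualizes.
corr = encoded.corr(method="pearson")
print(corr.round(2))
```

Pearson correlation on integer-coded nominal columns depends on how the codes happen to be ordered, so strong correlations involving identifier-like columns such as Event.id are best read with that caveat in mind.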
Once the heatmap has been plotted, the next step is to identify any data points that deviate significantly from the majority. This is done through outlier detection. However, it is important to keep in mind that this process can itself distort the statistical analysis, impact the accuracy of the model, and introduce bias when making decisions. Therefore, it is crucial to approach outlier detection with caution and take appropriate measures to minimize any negative effects on the analysis.
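The paper does not name the outlier detection method it used. As one common option, the sketch below flags values with the interquartile range (IQR) rule; the function name, the multiplier `k=1.5`, and the sample values are all assumptions made for illustration.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask that is True where a value lies outside
    [Q1 - k*IQR, Q3 + k*IQR]. The IQR rule is an assumed choice here."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)

# Hypothetical fatality counts with one extreme event.
fatalities = [0, 0, 1, 0, 2, 1, 0, 150]
mask = iqr_outlier_mask(fatalities)
print(mask)
```

Flagging rather than deleting keeps the decision reversible, which fits the caution advised above: extreme accidents may be rare but genuine observations.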

Random Sampling
Effective feature engineering is a crucial step in the process of creating machine learning models. It involves identifying and selecting the most relevant features from the available data, as well as creating new features that can uncover hidden patterns, capture complex relationships, handle data limitations, and enhance the predictive power of the model. By doing so, feature engineering can lead to better insights and more accurate predictions.
One important aspect of feature engineering is identifying highly correlated independent variables in the model. This can be done using the Variance Inflation Factor (VIF) value.
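For reference, the VIF of feature j is 1 / (1 - R_j^2), where R_j^2 is the coefficient of determination obtained by regressing feature j on all the other features. The sketch below computes it with NumPy alone; the function name and the synthetic features are assumptions, not the study's code or data.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return one VIF per column of X (rows are samples)."""
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    vifs = np.empty(n_features)
    for j in range(n_features):
        target = X[:, j]
        # Regress feature j on all other features plus an intercept.
        design = np.column_stack([np.ones(n_samples), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        ss_res = float(np.sum((target - design @ coef) ** 2))
        ss_tot = float(np.sum((target - target.mean()) ** 2))
        # VIF = 1 / (1 - R^2) = ss_tot / ss_res; infinite under perfect collinearity.
        vifs[j] = np.inf if ss_res == 0.0 else ss_tot / ss_res
    return vifs

# Synthetic features: x2 is nearly a copy of x1, x3 is unrelated.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vifs)
```

A common rule of thumb treats a VIF above roughly 5 or 10 as a sign of problematic multicollinearity; the threshold used in the study is not stated.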
Outliers can inflate the VIF values of correlated independent variables by making the correlation appear stronger than it actually is. Additionally, to perform an accurate performance analysis, we classified the results into two categories of damage:

Minor & Sustained
This category pertains to accidents that cause limited damage to specific areas of the aircraft, such as the engines, landing gear, or control surfaces, while the aircraft itself remains mostly intact. It may also encompass accidents that result in moderate damage but do not involve complete destruction or disintegration.

Destroyed
This category includes accidents where the aircraft is irreparable due to extensive damage, fire, explosion, structural failure or widespread wreckage.

Table 2 reveals that the logistic regression model has low recall and precision values, while its accuracy appears good only because class 0 dominates the data. Meanwhile, the decision tree model shows an increase in precision and recall scores, but class imbalance and misclassification problems still exist. Additionally, there is a risk of overfitting, where the model may have memorized the pattern rather than learning it. Similarly, the Random Forest Classifier demonstrated significant improvements in Precision, Accuracy, and Recall scores. Overfitting was seen to be a primary problem: the model performs well on training data but fails to perform well on unseen data. In our experiments, the KNN algorithm demonstrated poor performance.
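The effect of class imbalance on accuracy can be seen in a small worked example. The labels below are invented (90 cases of class 0 and 10 of class 1) and do not reproduce the values in the paper's tables; the helper name is also an assumption.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision and recall for labels in {0, 1}, with 1 as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical imbalanced test set: class 0 dominates.
y_true = [0] * 90 + [1] * 10
# A model that almost always predicts class 0:
# 2 false positives, 88 true negatives, 1 true positive, 9 false negatives.
y_pred = [1] * 2 + [0] * 88 + [1] * 1 + [0] * 9

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy comes out at 0.89 even though only one of the ten positive cases is found, which is why precision and recall are reported alongside it.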

Predictive Analysis
Checking for overfitting is of vital importance, and one way of doing so is by comparing the accuracy scores of the training and testing sets. If the training accuracy is noticeably higher than the testing accuracy, it could suggest that the model is overfitting to the training data.

Boosting is an ensemble technique that combines weak models to create a strong model. Each successive model corrects the errors of the previous ones; algorithms such as AdaBoost and Gradient Boosting Machines (GBM) adapt to minimize these errors, which makes boosting effective for overfitting and class imbalance issues. The boosting technique was applied to each model. The final boosting technique applied is Categorical Boosting. Figure 8 depicts the confusion matrix plot. After performing a series of processes on each model, the performance of each model is compared to choose the best one. Table 4 reports the recall rate, which measures the proportion of actual positive cases that the algorithm correctly identifies when predicting the severity of aircraft damage.

Performance Analysis of the Algorithms
The performance analysis is based on Figure 10. The ROC (Receiver Operating Characteristic) curve is a way of evaluating the performance of a binary classifier, which distinguishes between positive and negative instances, and the AUROC is the area under that curve. The ROC curve considers the balance between the true positive rate (sensitivity or recall) and the false positive rate (1 - specificity) at different classification thresholds. An AUROC of 0.8233 is a measure of the overall performance of the binary classifier. The higher the AUROC, the better the classifier is at achieving a high true positive rate while keeping the false positive rate low.
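The train-versus-test accuracy comparison described at the start of this section can be sketched with scikit-learn. Everything in the example is an assumption made for illustration: the data is synthetic rather than the NTSB set, the helper `accuracy_gap` is hypothetical, and the decision tree and gradient boosting models are stand-ins for the models in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the accident data (not the NTSB set).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5,
    weights=[0.85, 0.15], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0,
)

def accuracy_gap(model):
    """Fit the model and return (train accuracy, test accuracy, gap)."""
    model.fit(X_train, y_train)
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    return train_acc, test_acc, train_acc - test_acc

# An unconstrained tree can memorize the training set.
tree_train, tree_test, tree_gap = accuracy_gap(DecisionTreeClassifier(random_state=0))
# A boosted ensemble of shallow trees, for comparison.
gb_train, gb_test, gb_gap = accuracy_gap(GradientBoostingClassifier(random_state=0))

print(f"tree:     train={tree_train:.3f} test={tree_test:.3f} gap={tree_gap:.3f}")
print(f"boosting: train={gb_train:.3f} test={gb_test:.3f} gap={gb_gap:.3f}")
```

How large a gap counts as "noticeably higher" is a judgment call that the text does not quantify; comparing the gap across models is usually more informative than any fixed cut-off.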

Conclusion
In this paper, we propose a methodology for predicting the severity of aircraft accidents. We tested four different ML algorithms using eight different ML models. To achieve even better results, it is recommended to use more accurate evaluation techniques such as the Boosting technique. Additionally, algorithms can be enhanced through hyperparameter tuning procedures using Python libraries such as Scikit-learn. The proposed algorithm is useful for estimating the remaining engine life during the preventive maintenance of aircraft engines, and can be used with the machine learning algorithm described in this study.

EAI Endorsed Transactions on Internet of Things | Volume 10 | 2024

Figure 1. The steps involved in the prediction analysis process

A confusion matrix allows the visualization of how well a classification model has performed on a set of test data where the actual values are known. Confusion matrices are a useful tool for data analysts to evaluate the ML model's performance and identify which functions it performs well and which it does not.
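A binary confusion matrix, and the false positive rate derived from it, can be computed as in the sketch below. The labels are invented and the helper name is an assumption; rows are actual classes and columns are predicted classes, so the layout is [[TN, FP], [FN, TP]].

```python
import numpy as np

def confusion_matrix_2x2(y_true, y_pred):
    """Count outcomes for labels in {0, 1}: rows = actual, columns = predicted."""
    cm = np.zeros((2, 2), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        cm[actual, predicted] += 1
    return cm

# Hypothetical test labels and predictions.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

cm = confusion_matrix_2x2(y_true, y_pred)
tn, fp = cm[0]
fn, tp = cm[1]

# False positive rate: share of actual negatives predicted as positive.
false_positive_rate = fp / (fp + tn)
print(cm)
print(false_positive_rate)
```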

Figure 3. The results of applying adaptive boosting to the model

Figure 4. The results of applying Gradient boosting to the model

Figure 6 shows a decrease in False Positive Rate, indicating an improvement in the misclassification problem. True Negative continues to dominate the matrix and is the primary contributor to performance metrics.

Figure 5. The results of applying Extreme Gradient boosting to the model

Based on Figure 7, it can be seen that the False Positive Rate decreased from the previous confusion matrix. The misclassification problem seems to have

Figure 8. The results of applying Categorical boosting to the model

The ROC curve, summarized by the AUROC (Area Under the Receiver Operating Characteristic curve), is a widely used method for evaluating the performance of a classification model that predicts whether something belongs to a specific category or not. It quantifies how well a model can differentiate between positive and negative classes. The True Positive Rate (also known as Sensitivity or Recall) indicates the percentage of actual positive cases that the model correctly identifies as positive. On the other hand, the False Positive Rate represents the proportion of actual negative cases that the model misclassifies as positive. The AUROC value ranges from 0 to 1, where 0.5 signifies a random classifier, and 1 denotes a perfect classifier. The higher the AUROC score, the better the model's ability to distinguish between positive and negative classes. The AUROC value is calculated as the area beneath the ROC curve, and a larger area under the curve indicates better model performance. If the False Positive Rate (FPR) increases drastically after a certain point on the ROC curve, it indicates that the model's ability to differentiate between positive and negative classes decreases rapidly when the classification threshold is lowered.
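The relationship between thresholds, the true and false positive rates, and the area under the curve can be made concrete with a short NumPy sketch. The function names, labels, and scores are assumptions for illustration; the area is obtained with the trapezoidal rule.

```python
import numpy as np

def roc_points(y_true, scores):
    """Sweep the threshold from high to low and collect (FPR, TPR) points."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    fpr, tpr = [0.0], [0.0]  # a threshold above every score predicts no positives
    for threshold in np.sort(np.unique(scores))[::-1]:
        predicted_pos = scores >= threshold
        tpr.append(np.sum(predicted_pos & (y_true == 1)) / n_pos)
        fpr.append(np.sum(predicted_pos & (y_true == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)

def auroc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical labels and predicted probabilities for the positive class.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, scores)
area = auroc(fpr, tpr)

# Scores that separate the classes perfectly give an area of 1.
perfect_area = auroc(*roc_points(y_true, [0.1, 0.2, 0.8, 0.9]))
print(area, perfect_area)
```

Each step down in threshold either raises the TPR, raises the FPR, or both; a stretch of the curve where the FPR climbs quickly while the TPR barely moves corresponds to the situation described above.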

Figure 6. The results of applying Categorical Gradient boosting to the model

Table 1. VIF Value of Feature

Table 1. The Performance Metrics of Test Data

Table 2. Checking for Overfitting