Situation Assessment of Air Traffic Based on Complex Network Theory and Ensemble Learning

Liu, Fei; Li, Jiawei; Wen, Xiangxi; Wang, Yu; Tong, Rongjia; Liu, Shubin; Chen, Daxiong

doi:10.3390/app132111957

¹ Air Traffic Control and Navigation College, Air Force Engineering University, Xi’an 710051, China

² PLA Troops No. 93735, Tianjin 310700, China

³ PLA Troops No. 66137, Beijing 100032, China

⁴ PLA Troops No. 94188, Xi’an 710050, China

⁵ PLA Troops No. 94755, Zhangzhou 363000, China

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(21), 11957; https://doi.org/10.3390/app132111957

Submission received: 5 September 2023 / Revised: 17 October 2023 / Accepted: 23 October 2023 / Published: 1 November 2023

Abstract

With the rapid development of the air transportation industry, the air traffic situation is becoming more and more complicated. Determining the situation of air traffic is of great significance to ensure the safety and smoothness of air traffic. The strong subjectivity of assessment criteria, inaccurate assessment results and weak systemic assessment method are the main problems in air traffic situation assessment research. The aim of our research is to present an objective and accurate situation assessment method for air traffic systems. The paper presents a model to assess air traffic situation based on the complex network theory and ensemble learning. The air traffic weighted network model was introduced to systematically describe the real state of an air traffic system. Assessment criteria based on the complex network analysis method can systematically reflect the operational state of an air traffic weighted network system. We transformed the air traffic situation assessment into a binary classification, which makes situation assessment objective and accurate. Ensemble learning was introduced to improve the classification accuracy, which further improves the accuracy of the situation assessment model. The model was trained and tested on the dataset of the East China air traffic weighted network in 2019. Its average classification accuracy is 0.98. The recall and precision rates both exceed 0.95. Experiments have confirmed that the situation assessment model can accurately output air traffic situation value and situation level. Furthermore, the assessment results are consistent with the real operational situation of the air traffic in East China.

Keywords: air traffic network; situation assessment; complex network; ensemble learning

1. Introduction

In recent years, the air traffic transportation industry has developed rapidly. As an important part of the transportation system, air traffic transportation has greatly promoted economic development and cultural exchange among cities. According to statistics from the Civil Aviation Administration of China (2019) [1], there were more than 238 transport airports and 5521 regular flight routes in China before the outbreak of the COVID-19 epidemic. The industry-wide transport airlines completed 4,966,200 transport take-off flights. The airport in the capital, Beijing, has more than 300 fixed routes and more than 280 navigable airports. The Beijing-Shanghai route (A593) has 38,000 flights per year and carries more than 7.7 million passengers. Additionally, there were 875,902 abnormal flights. Abnormal flights have several causes, including congestion, equipment failure, and meteorological conditions.

The operational situation of the air traffic system has an important influence on air safety and transportation. We can judge the safety level of air traffic operations by assessing the air traffic situation [2,3,4]. When the air traffic situation is in a bad state, route congestion and flight delays seriously affect the stability and safety of air traffic, and air transport revenue declines [5]. Assessing the air traffic situation helps to find the risks and unstable factors that affect the safe and efficient operation of air traffic. It also allows us to predict the evolutionary trends of air traffic flow and optimize the allocation of airspace resources [6], and to achieve conflict detection and identification [7,8]. Therefore, it is very important to assess the air traffic situation. However, research on the situation assessment of air traffic systems is still in its infancy.

The air traffic system is a complex and gigantic system in which airports, routes, and aircraft interact with each other [9,10]. The role of each element must be considered in assessing the air traffic situation. Thus, we propose to construct an air traffic network model based on the complex network theory. The network comprises airports and routes, and it is weighted by operation data. Assessing the air traffic situation then amounts to assessing the operational situation of this air traffic network.

At present, there are three methods to assess the network situation. The first is based on probability statistics. The weight of each safety factor is usually determined by expert qualitative knowledge or existing research results. A comprehensive weight index is constructed to calculate the network security situation, using methods such as the analytic hierarchy process (AHP), the Bayes network, and the entropy method [11,12,13]. Lu and Zhuang [14] proposed a network security situation assessment method based on an attack analytic hierarchy process. It took the change in the weighted index values after the network was attacked as the security situation results. Yi and Guo [15] considered the special network security requirements of the IIoT and proposed a quantitative evaluation method of network security based on the AHP. Hu [16] combined support vector machines with an improved cuckoo algorithm to predict network security postures, and the method achieved a high accuracy on the data mining and knowledge discovery (KDD) data set. Pu [17] proposed a Bayes network evaluation model based on an attacker’s income, loss, cost, and risk-related indexes. These methods can effectively reflect the influence of expert experience or existing knowledge on a network security situation. Their calculation is simple and easy to implement. However, their strong subjectivity leads to inaccurate evaluation results, and they cannot adapt to dynamic changes in the network.

The second assessment method is based on information fusion and logical reasoning. It mainly filters various characteristics and information contained in security situation data by setting pre-rules. An information fusion method is adopted to synthesize information feedback values as situation evaluation results. The typical methods include clustering and D-S evidence theory. Wang and Yu [18] proposed a network security situation evaluation system based on a modified D-S evidence theory. Through the information fusion of internal factors and external threats, a multi-module evaluation system was constructed. Gao [19] carried out grey clustering analysis on network attacks. Network attacks were divided into three levels of harm: “strong”, “medium”, and “weak”. On this basis, the AHP method was used to build a security situation assessment model. Wen and Tang [20] considered the information fusion of multi-information sources and multi-level heterogeneous systems, and proposed a framework that uses comprehensive factors as weighted data for network security situation assessment. These methods can use data characteristics and information to analyze a network security state, and the assessment process is objective. However, they cannot effectively analyze the hidden information in the data when faced with a large amount of high-dimensional network data.

The third assessment method is to use traditional artificial intelligence models to evaluate the network security situation. The typical methods include the neural network model and the support vector machine (SVM) model. These methods are widely used in the field of air traffic congestion research. Takeichi [21] used artificial neural network modeling including an autoregressive property to predict the delays of a congested airport. Kim [22] proposed a recurrent neural network model to predict day-to-day delay status. Gui [23] counted and predicted the air traffic flow based on support vector regression and long short-term memory. Gopalakrishnan [24] compared the performance of three classes of models: the Markov jump linear system (MJLS), classification and regression trees (CART), and the linear regression (LR) and artificial neural network (ANN) architectures in air traffic delay networks. However, delay and congestion are only typical manifestations of a bad air traffic situation. The status and change in air traffic operations should be accounted for in real time. Systematic and scientific assessment of the air traffic situation is the key method to solve these problems.

In this paper, we turn the situation assessment of the air traffic system into the situation assessment of the air traffic network based on the complex network theory. Binary classification is used to quantitatively assess the situation of the air traffic network, and ensemble learning is used to build the assessment model in order to improve the classification accuracy.

2. Preliminaries

2.1. Construction of Air Traffic Network Model

In the undirected graph $G(V, E)$, $V$ represents all nodes in graph $G$, and $E$ represents edges between all nodes in graph $G$ [25]. In the air traffic operation network $G_{1}(V_{1}, E_{1}, \omega_{1})$, $V_{1}$ represents all nodes in graph $G_{1}$, $E_{1}$ represents edges between all nodes in graph $G_{1}$, and $\omega_{1}$ represents the edge weight. The nodes are airports and waypoints in $G_{1}(V_{1}, E_{1}, \omega_{1})$. Edges are routes. The edge weight is calculated by the operation data of air traffic.

Four typical operation indexes of air traffic were finally selected to characterize the operation situation of airports, waypoints, and routes. They were route saturation (RS), flight delay rate (FDR), meteorological conditions (MC), and military flights (MF). RS represents the ratio of the daily flow to the maximum capacity of routes. FDR represents the ratio of the number of delayed flights on the daily airports, waypoints, and routes to the number of normal flights. MC represents weather conditions for daily airports, waypoints, and routes. MF represents the ratio of the daily time of military activities affecting the operation of the airports, waypoints, and routes to the total time of normal operation. We sorted the MC into five categories including sunny, cloudy, rainy, foggy, and thunderstorm. The MC was presented in text format. In order to calculate these, we encoded the weather information in a 1–5 order.

On this basis, we used the AHP method to construct the edge weight of the air traffic network. Its process was simple, efficient, and applicable. According to the degree of influence on air traffic, the four indexes were ranked. RS is the core index reflecting the transport efficiency of the routes. It can indicate the operation situation of the routes. FDR is a supplement to RS in terms of delay. It can reflect the operational stability of the airports, waypoints, and routes. MC and MF are important factors that affect the normal flights. Compared with other indexes, MC and MF can indirectly reflect the operational situation of the airports, waypoints and routes. According to the statistics of the civil aviation administration, in recent years, military activities have had a greater impact than meteorological conditions. Therefore, the order of importance of these four indexes is $RS > FDR > MF > MC$.

The importance of the four indexes was compared by the scale method. The comparison results are shown in Table 1.

According to Table 1, the judgment matrix A is:

$$A = \begin{bmatrix} 1 & 3 & 5 & 7 \\ 1/3 & 1 & 3 & 5 \\ 1/5 & 1/3 & 1 & 3 \\ 1/7 & 1/5 & 1/3 & 1 \end{bmatrix}$$

Calculating the eigenvector $\omega$ corresponding to the maximum eigenvalue $\lambda_{\max}$, we perform normalization processing to obtain a weight vector:

$$W_{i} = \frac{\sqrt[n]{\prod_{j=1}^{n} a_{ij}}}{\sum_{i=1}^{n} \sqrt[n]{\prod_{j=1}^{n} a_{ij}}} \tag{1}$$

where $a_{ij}$ is the element of $A$, and $W = [W_{1}, W_{2}, \dots, W_{n}]^{T}$ is the eigenvector corresponding to the maximum eigenvalue of $A$.

According to Formula (1), the weight vector is calculated as follows:

$$W = \begin{bmatrix} W_{1} & W_{2} & W_{3} & W_{4} \end{bmatrix} = \begin{bmatrix} 0.5650 & 0.2622 & 0.1175 & 0.0553 \end{bmatrix} \tag{2}$$

Conducting a consistency test and calculating the maximum eigenvalue $\lambda_{\max}$:

$$\lambda_{\max} = \frac{1}{n} \sum_{i=1}^{n} \frac{\sum_{j=1}^{n} a_{ij} W_{j}}{W_{i}} = 4.1169 \tag{3}$$

The consistency index CI is:

$$CI = \frac{\lambda_{\max} - n}{n - 1} = 0.0389 \tag{4}$$

The consistency proportion CR is:

$$CR = \frac{(\lambda_{\max} - n) / (n - 1)}{RI} = 0.0438 \tag{5}$$

where RI is the random consistency index.

When n = 4, RI = 0.9. Since CR < 0.1, the judgment matrix satisfies the consistency test. The weight of each index is:

$W_{RS} = 0.5650$, $W_{FDR} = 0.2622$, $W_{MC} = 0.1175$, $W_{MF} = 0.0553$
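The AHP computation above can be reproduced numerically. The following is a minimal sketch, assuming the weight vector is taken as the principal eigenvector of A and found by power iteration (a stand-in chosen here for illustration, not necessarily the authors' procedure); the function names are hypothetical, and the consistency quantities follow Formulas (3)-(5) with RI = 0.9.

```python
# Judgment matrix A from the text (rows and columns follow the ranked indexes).
A = [
    [1, 3, 5, 7],
    [1 / 3, 1, 3, 5],
    [1 / 5, 1 / 3, 1, 3],
    [1 / 7, 1 / 5, 1 / 3, 1],
]


def principal_eigenvector(matrix, iterations=200):
    """Power iteration; returns the eigenvector normalized to sum to 1."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    return w


def consistency(matrix, w, random_index):
    """Return (lambda_max, CI, CR) following Formulas (3)-(5)."""
    n = len(matrix)
    lam = sum(
        sum(matrix[i][j] * w[j] for j in range(n)) / w[i] for i in range(n)
    ) / n
    ci = (lam - n) / (n - 1)
    return lam, ci, ci / random_index


weights = principal_eigenvector(A)
lambda_max, ci, cr = consistency(A, weights, random_index=0.9)
print([round(x, 4) for x in weights], round(lambda_max, 4), round(ci, 4), round(cr, 4))
```

A CR below 0.1 is the usual acceptance threshold for an AHP judgment matrix, which is the check the sketch makes possible.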

In order to eliminate the difference in the indexes’ magnitudes, the maximum and minimum method was used for normalization:

$$RS_{j} = \frac{RS_{j} - \min RS(e)}{\max RS(e) - \min RS(e)} \tag{6}$$

The remaining three indicators were treated in the same way. Finally, the comprehensive edge weight $w_{i}$ of the aviation network is:

$$w_{i} = 0.5650 \times RS_{i} + 0.2622 \times FDR_{i} + 0.1175 \times MC_{i} + 0.0553 \times MF_{i} \tag{7}$$

The larger the $w_{i}$, the more complex the operation of the edge, and the higher the operation risk degree.
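To make Formulas (6) and (7) concrete, here is a small sketch that normalizes each index across edges and combines them with the AHP weights. The raw values are hypothetical, the function names are illustrative, and returning 0 for a constant column is an assumed convention that the paper does not specify.

```python
# AHP weights from Formula (7), keyed by index name.
AHP_WEIGHTS = {"RS": 0.5650, "FDR": 0.2622, "MC": 0.1175, "MF": 0.0553}


def min_max(values):
    """Formula (6): rescale a list of raw values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero (assumed convention)
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def edge_weights(raw):
    """raw maps index name -> list of per-edge values; returns w_i per edge."""
    normalized = {name: min_max(col) for name, col in raw.items()}
    n_edges = len(next(iter(raw.values())))
    return [
        sum(AHP_WEIGHTS[name] * normalized[name][i] for name in AHP_WEIGHTS)
        for i in range(n_edges)
    ]


# Hypothetical raw values for three edges (MC is the 1-5 weather code).
raw = {
    "RS": [0.40, 0.85, 0.60],
    "FDR": [0.05, 0.30, 0.10],
    "MC": [1, 5, 2],
    "MF": [0.00, 0.20, 0.10],
}
w = edge_weights(raw)
print([round(x, 4) for x in w])
```

Because the four AHP weights sum to 1, every composite weight stays in [0, 1]; an edge that is worst on all four indexes receives the maximum weight.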

In this paper, we assess the air traffic situation in East China. The air traffic network in East China can be constructed.

In Figure 1, $G_{2}(V_{2}, E_{2}, \omega_{2})$ consists of 104 nodes and 195 edges. Route segments (edges) are connected by a series of airports and waypoints (nodes). The waypoints contain the navigation stations, crossing points, and reporting points. With the waypoints’ help, pilots can keep on the right track by receiving the exact position signal from the ground navaids. Each node coordinate is the actual geographic coordinate.

2.2. Situation Assessment Indexes of Air Traffic Network

In order to represent the status of the air traffic network, we chose 7 typical evaluation indexes in complex network analysis. The calculations of these indexes were based on $w_{i}$. Therefore, they can comprehensively evaluate the topology and operation characteristics of the air traffic network.

The “effective distance” was introduced to replace the actual topological distance in the weighted network [26]. “Effective distance” can be calculated by the following formula:

$$D_{ij} = 1 - \ln w_{ij} \tag{8}$$

where $w_{ij}$ is the weight of the connection between node i and node j. Evaluation indexes of the complex network are calculated by “effective distance”. They can reflect the actual operation characteristics of the air traffic network.

The larger the network edge weight, the smaller the effective distance. The network average effective distance ND is:

$$ND = \frac{1}{|E|} \sum_{i,j \in V} D_{ij} \tag{9}$$
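Formulas (8) and (9) can be illustrated with a few lines of code. This sketch assumes edge weights lie in (0, 1], so that every effective distance is at least 1, and it reads the sum in Formula (9) as running once over each edge; both points are assumptions, and the three-edge network is hypothetical.

```python
import math


def effective_distance(w_ij):
    """Formula (8): D_ij = 1 - ln(w_ij); requires w_ij > 0."""
    if w_ij <= 0:
        raise ValueError("edge weight must be positive for ln(w_ij)")
    return 1.0 - math.log(w_ij)


def average_effective_distance(edges):
    """Formula (9): mean of D_ij over the edge set.

    edges maps an undirected node pair (i, j) -> weight w_ij, one entry per edge.
    """
    return sum(effective_distance(w) for w in edges.values()) / len(edges)


# Hypothetical three-edge network with weights in (0, 1].
edges = {("A", "B"): 1.0, ("B", "C"): 0.5, ("A", "C"): 0.25}
distances = {pair: effective_distance(w) for pair, w in edges.items()}
nd = average_effective_distance(edges)
print(distances, nd)
```

Note that a weight of exactly 0 has no finite effective distance, so a real pipeline would need a floor or a rule for such edges.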

On this basis, the optimized network evaluation indexes were given as follows:

(a): Node Strength (NS)

The weighted node degree is the node strength.

$$\bar{s_{i}} = \frac{1}{N} \sum_{j=1}^{N} a_{ij} w_{ij} \tag{10}$$

The larger the NS value, the closer the link between the waypoint and the surrounding nodes. It can reflect the operation complexity of the current air traffic.
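As a quick illustration of Formula (10), the sketch below computes the node strength of each node from an adjacency matrix and a weight matrix. The three-node network and its weights are hypothetical.

```python
def node_strength(adj, weights):
    """Formula (10): s_i = (1/N) * sum_j a_ij * w_ij for each node i."""
    n = len(adj)
    return [
        sum(adj[i][j] * weights[i][j] for j in range(n)) / n for i in range(n)
    ]


# Hypothetical 3-node network: node 0 links to nodes 1 and 2.
adj = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
weights = [[0.0, 0.6, 0.3], [0.6, 0.0, 0.0], [0.3, 0.0, 0.0]]
strengths = node_strength(adj, weights)
print(strengths)
```

Node 0 carries both edges, so it has the largest strength, which matches the reading that a high NS marks a node closely tied to its surroundings.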

(b): Node (edge) Betweenness (NB/EB)

It reflects the centrality of nodes and edges in the whole network. The betweenness $B(v_{k})$ of node k refers to the proportion of the shortest effective distance passing through node k in the total effective distance:

$$B(v_{k}) = \frac{D_{ij}^{'}(v_{k})}{D_{ij}^{'}} \tag{11}$$

where $D_{ij}^{'}(v_{k})$ is the shortest effective distance between node i and node j through node k, and $D_{ij}^{'}$ is the shortest effective distance between node i and node j.

The edge betweenness is:

$$B(e_{l}) = \frac{D_{ij}^{''}(e_{l})}{D_{ij}^{'}} \tag{12}$$

where $D_{ij}^{''}(e_{l})$ is the shortest effective distance through edge l.

(c): Network Clustering Coefficient (NCC)

In the weighted network, the average clustering coefficient describes the aggregation characteristics of the network [19,20]. The weighted clustering coefficient of a single node can be expressed as follows:

$$c_{w}(i) = \frac{1}{s_{i}(k_{i} - 1)} \sum_{j,k} \frac{w_{ij} + w_{ik}}{2} \cdot a_{ij} a_{jk} a_{ki} \tag{13}$$

where $s_{i}$ is the node strength, $k_{i}$ is the node degree value, and $v_{j}$ and $v_{k}$ represent two adjacent nodes of $v_{i}$. $a_{ij}$ indicates the connection status of node pairs. When $v_{i}$ and $v_{j}$ are interconnected, $a_{ij} = 1$, otherwise $a_{ij} = 0$.

The larger the clustering coefficient of a single node, the more likely the node is at the core of the regional community. The average weighted clustering coefficient of the network is:

$$\bar{C} = \frac{1}{N} \sum_{i=1}^{N} c_{w}(i) \tag{14}$$

It can reflect the clustering level of the network.
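Formulas (13) and (14) can be sketched as follows. Two assumptions are made here and are not stated in the text: $s_{i}$ is taken as the plain sum of the weights on the edges of node i (without the 1/N factor of Formula (10)), and nodes with fewer than two neighbors are given a coefficient of 0. The four-node network is hypothetical.

```python
def weighted_clustering(adj, weights):
    """Formula (13) for every node, then the network average of Formula (14)."""
    n = len(adj)
    coeffs = []
    for i in range(n):
        k_i = sum(adj[i])
        # Assumption: s_i is the unnormalized sum of incident edge weights.
        s_i = sum(adj[i][j] * weights[i][j] for j in range(n))
        if k_i < 2 or s_i == 0:
            coeffs.append(0.0)  # no triangle possible (assumed convention)
            continue
        total = 0.0
        for j in range(n):
            for k in range(n):
                closed = adj[i][j] * adj[j][k] * adj[k][i]  # 1 only for a triangle
                total += closed * (weights[i][j] + weights[i][k]) / 2
        coeffs.append(total / (s_i * (k_i - 1)))
    return coeffs, sum(coeffs) / n


# Hypothetical network: triangle 0-1-2 plus a pendant node 3 attached to node 0.
adj = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
weights = [[0.5 * a for a in row] for row in adj]
coeffs, average = weighted_clustering(adj, weights)
print(coeffs, average)
```

With equal weights the result reduces to the ordinary clustering coefficient, which gives a simple sanity check on the implementation.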

(d): Network Efficiency (NE)

In the weighted network, the improved NE is the average of the reciprocal sum of the effective distances between all nodes:

$$NE = \frac{1}{N(N - 1)} \sum_{i \neq j} \frac{1}{D_{ij}} \tag{15}$$

where $D_{ij}$ is the effective distance between $v_{i}$ and $v_{j}$. It can reflect the difficulty of network information transmission. The smaller the NE, the smoother the information transmission, and the stronger the network robustness.
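A possible way to evaluate Formula (15) is sketched below. It assumes that, for nodes that are not directly connected, $D_{ij}$ is the shortest-path sum of edge effective distances (computed here with Floyd-Warshall), and that edge weights lie in (0, 1]; unreachable pairs contribute 0. These choices and the three-node path network are illustrative assumptions.

```python
import math


def shortest_effective_distances(n, edges):
    """All-pairs shortest paths (Floyd-Warshall) with 1 - ln(w_ij) as edge length."""
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for (i, j), w in edges.items():
        d = 1.0 - math.log(w)
        dist[i][j] = dist[j][i] = min(dist[i][j], d)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def network_efficiency(n, edges):
    """Formula (15): average of 1 / D_ij over all ordered node pairs i != j."""
    dist = shortest_effective_distances(n, edges)
    total = sum(1.0 / dist[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


# Hypothetical path network 0 - 1 - 2 with both edge weights equal to 1.
edges = {(0, 1): 1.0, (1, 2): 1.0}
ne = network_efficiency(3, edges)
print(ne)
```

In this example the adjacent pairs have distance 1 and the end nodes have distance 2, so the efficiency is the average of 1, 1 and 0.5 over both directions.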

(e): Network Density (ND)

In the weighted network, the improved ND is:

$$ND = \frac{\sum_{i}^{n} \sum_{j}^{n} a_{ij} w_{ij}}{2n} \tag{16}$$

where n is the total number of network nodes. The greater the ND, the higher the heterogeneity of the whole network.

(f): Network PageRank (NPR)

The PageRank index shows that the importance of network nodes depends not only on the number of adjacent nodes of the node, but also on the importance of adjacent nodes. In the weighted network, the definition is as follows:

$$PR(v_{i}) = \frac{1 - \rho}{N} + \rho \sum_{j \in I(i)} \frac{w_{ij} PR(j)}{W(j)} \tag{17}$$

where $(1 - \rho)$ is the probability of turning to other nodes, and $\rho$ is the damping coefficient, usually $\rho = 0.85$. $I(i)$ is the set of nodes connected to $v_{i}$, and $W(j)$ is the set of nodes connected to $v_{j}$.
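Formula (17) can be solved by fixed-point iteration, as in the sketch below. Because $W(j)$ appears as a denominator, the sketch treats it as the total weight of the edges at node j; this reading is an assumption, as are the function name and the three-node star network.

```python
def weighted_pagerank(weights, rho=0.85, iterations=100):
    """Iterate Formula (17) on a symmetric weight matrix (0 means no edge)."""
    n = len(weights)
    # Assumption: W(j) is the total weight of the edges attached to node j.
    total_weight = [sum(row) for row in weights]
    pr = [1.0 / n] * n
    for _ in range(iterations):
        pr = [
            (1 - rho) / n
            + rho * sum(
                weights[i][j] * pr[j] / total_weight[j]
                for j in range(n)
                if weights[i][j] > 0
            )
            for i in range(n)
        ]
    return pr


# Hypothetical star network: node 0 is linked to node 1 (0.8) and node 2 (0.2).
weights = [
    [0.0, 0.8, 0.2],
    [0.8, 0.0, 0.0],
    [0.2, 0.0, 0.0],
]
pr = weighted_pagerank(weights)
print([round(p, 4) for p in pr])
```

The hub receives the highest score, and of the two leaves the one on the heavier edge ranks higher, which is the behavior the weighted definition is meant to capture.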

According to the daily operation of the air traffic network, the comprehensive edge weight $w_{i}$ will change. Therefore, the above seven indexes can describe the air traffic operation. The air traffic network situation dataset can be built in the designated area and time.

3. Methods

The air traffic network situation dataset needs to be further processed to assess the overall situation of air traffic. In order to accurately determine the air traffic state, we transform the situation assessment into a binary classification. According to significantly good and poor air traffic operation status, samples are labeled. These samples are used as the training dataset to input the model to calculate the classification boundary. According to the distance from the sample to the classification boundary, the air traffic situation can be quantitatively described. Machine learning has inherent advantages in solving binary classification problems. However, a single machine-learning model has some problems, such as a low classification accuracy and unscientific classification standards. Therefore, the ensemble learning was used to build the situation assessment model. It can integrate the advantages of various machine-learning models to obtain a more accurate and scientific classification model.

3.1. Basic Principles of Situation Assessment

The purpose of transforming the situation assessment into binary classification is to assess the air traffic situation more scientifically. In addition, we can obtain the quantitative assessment value. Support vector classification (SVC) is the main tool to deal with the classification of low-dimensional data. It has the advantages of a high classification accuracy and fast calculation speed. It can calculate the distance from the sample to the classification hyperplane. The distance will be taken as the situation assessment value of each sample.

Support vector classification (SVC) can divide samples into two parts through a hyperplane. The classification hyperplane is a straight line in two-dimensional space and a surface in three-dimensional space. In two-dimensional space, the classification hyperplane can be expressed as:

$$g(x) = w^{T} \cdot x + b = 0 \tag{18}$$

where $x \in \mathbb{R}^{n}$, $T = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{N}, y_{N})\}$, $b \in \mathbb{R}$, and $\frac{2}{\|w\|}$ represents the distance between $H_{1}$ and $H_{2}$. The classification boundary interval of the sample is $[-1, 1]$. According to $g(x_{i}) > 0$ and $g(x_{i}) < 0$, the air traffic network situation is divided into two categories. $g(x_{i}) > 0$ indicates that the sample is positive. The air traffic situation represented by this sample is stable during this period. $g(x_{i}) < 0$ indicates that the sample is negative. The air traffic situation represented by this sample is risky and unstable. The situation assessment principle based on SVC is shown in Figure 2.

SVC can find an optimal hyperplane H that maximizes the margin between the two classes. The air traffic situation in different periods was divided into two categories by SVC. The margin between samples and the classification hyperplane H is different in Figure 2. It can be used to quantitatively assess the air traffic state in a certain time. Therefore, it is very important to obtain an accurate and reasonable classification hyperplane.
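The idea of using the distance to the hyperplane as a situation value can be sketched with scikit-learn. The six two-dimensional samples are hypothetical, and a linear kernel is used here only because it makes the geometry explicit: `decision_function` returns $g(x)$, and dividing by $\|w\|$ gives the signed geometric distance. With a non-linear kernel, `decision_function` still returns a signed score, but it is no longer a Euclidean distance in the input space.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical 2-D samples: label 1 = stable days, label 0 = risky days.
X = np.array([
    [2.0, 2.0], [3.0, 3.0], [2.5, 3.5],
    [-2.0, -2.0], [-3.0, -3.0], [-2.5, -3.5],
])
y = np.array([1, 1, 1, 0, 0, 0])

clf = SVC(kernel="linear", C=1.0).fit(X, y)

# decision_function returns g(x) = w^T x + b; dividing by ||w|| gives the
# signed geometric distance to the hyperplane for a linear kernel.
g = clf.decision_function(X)
signed_distance = g / np.linalg.norm(clf.coef_)
labels = (g > 0).astype(int)
print(signed_distance, labels)
```

The sign of the value gives the class and its magnitude gives a graded score, which is what allows a binary classifier to output a quantitative assessment.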

3.2. Ensemble Learning

A single classification learning method is usually difficult to fully mine for all the classification knowledge of datasets. It will cause problems such as the poor generalization of the classification model and insufficient classification ability. Ensemble learning can synthesize the advantages of various classification learning methods to obtain a more accurate and reasonable classification result. The basic process of ensemble learning is shown in Figure 3. The training database is input into classification learning models to obtain classification datasets. Classification datasets are integrated into an integrated dataset. Finally, the ensemble learning model is trained by the integrated dataset.

In the base classifier generation, the classification learners $C_{1}, C_{2}, \dots, C_{t}$ are trained on the training database $TD = \{x \mid x_{1}, x_{2}, \dots, x_{n}\}$. In base classifier integration, an integrated database $ED = \{C_{1}(x), C_{2}(x), \dots, C_{t}(x)\}$ can be obtained from the classification learners $C_{1}, C_{2}, \dots, C_{t}$. The ensemble learning model $C(x)$ can then be built by training on $ED$.

According to the base classifiers’ type [27,28,29], ensemble learning contains a homogeneous ensemble and heterogeneous ensemble. The heterogeneous ensemble has an advantage in improving the accuracy and generalization level of the classification model. Therefore, we choose four different base classifiers. According to reference [30,31], we can obtain a higher classification accuracy by choosing the k-nearest neighbor, Bayes, BP neural network, and SVM. And it can also ensure that the calculation time is in an acceptable range.

3.3. Situation Assessment Process

In Figure 4, the air traffic network situation training database is input into four basic classifiers and trained to obtain four kinds of classification parameter datasets in Phase One and Two. The ensemble dataset is constructed by the “number of sample points belonging to positive and negative classes”, “posterior probability”, “activation function value”, and “distance from sample to classification hyperplane”. In Phase Three, the ensemble learning model is established to assess the air traffic situation data. Finally, the air traffic situation in a certain time can be quantitatively assessed by this model. In Table 2, according to the distance from the sample to the classification hyperplane H, we classify the air traffic situation level.
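The three-phase process can be sketched with scikit-learn as follows. This is an illustrative sketch, not the authors' code: the data are synthetic (generated with `make_classification`, sized 191 + 90 to mirror the split described in the paper), `MLPClassifier` stands in for the BP neural network, the helper `ensemble_features` is a hypothetical name, and the meta-level SVC parameters are the ones reported in Section 4.2.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the seven-index situation dataset (not the paper's data).
X, y = make_classification(n_samples=281, n_features=7, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=90, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Phases One and Two: train the four heterogeneous base classifiers.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
bayes = GaussianNB().fit(X_train, y_train)
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0).fit(X_train, y_train)
svm = SVC(kernel="rbf").fit(X_train, y_train)


def ensemble_features(samples):
    """Stack the four base-classifier outputs into one feature row per sample."""
    _, idx = knn.kneighbors(samples)            # indices of the 5 nearest training samples
    positives = y_train[idx].sum(axis=1)        # how many of those neighbors are positive
    return np.column_stack([
        positives,
        bayes.predict_proba(samples)[:, 1],     # posterior probability of the positive class
        mlp.predict_proba(samples)[:, 1],       # network output for the positive class
        svm.decision_function(samples),         # signed score relative to the hyperplane
    ])


# Phase Three: train the meta-level SVC on the ensemble dataset.
meta = SVC(kernel="rbf", gamma=0.0083, C=1.0).fit(ensemble_features(X_train), y_train)
situation_value = meta.decision_function(ensemble_features(X_test))
accuracy = meta.score(ensemble_features(X_test), y_test)
print(accuracy)
```

For simplicity the meta-level model is trained on base-classifier outputs for the same samples the base classifiers were fitted on; a more careful design would build these features from out-of-fold predictions.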

4. Experimental Setup and Details

All experiments used the same computer configuration (Intel i7-7700 CPU, 16 GB RAM) and experimental platform (Python 3.7.1, Windows 10 64-bit). Air traffic operation data in East China in 2019 were provided by the Air Traffic Management Bureau of the CAAC.

4.1. Air Traffic Network Situation Dataset in East China

We selected 191 significantly good and significantly poor air traffic operation conditions in 2019 as the training dataset. A total of 90 samples were selected as the test dataset. In Table 3, we refer to the method in Section 2.1 to calculate the adjacency matrix of the air traffic weighted network in East China. Four typical operation indexes of air traffic in East China were used to calculate the edge weight $w_{i}$.

In Table 4, the adjacency matrix of the air traffic network was used to calculate seven typical evaluation indexes based on the method in Section 2.2. The air traffic network situation dataset in East China was obtained by calculating seven complex network evaluation indexes of each day. The training dataset consists of samples with significantly good and significantly poor air traffic operation conditions. These samples were labeled to calculate the classification boundary.

4.2. Ensemble Learning Model

The training dataset needed to be standardized. Then, it was input into four base classifiers.

(a): K-NN model

We adopted 10-fold cross validation. After the model parameters were optimized by the grid search method, the average classification accuracy score was 0.8257. The numbers of positive and negative samples among the five nearest neighbors of each sample were output, as shown in Table 5. KNeighborsClassifier() in sklearn was called to build this model, and kneighbors() was used to find the neighbors of each sample.

(b): Bayes model

We adopted 10-fold cross validation. After the model parameters were optimized by the grid search method, the average classification accuracy score was 0.7936. The posterior probability of each sample belonging to the positive and negative classes was output, as shown in Table 6. GaussianNB() in sklearn was called to build this model, and predict_proba() was used to calculate these probabilities for each sample.

(c): BP neural network model

We used 10-layer neural network training. After the model parameters were optimized by the grid search method, the average classification accuracy score was 0.8702. Each sample’s activation function output is shown in Table 7. TensorFlow was used to build this model, and tf.nn.softmax() was used to calculate the output corresponding to each sample.

(d): SVM model

We adopted 10-fold cross validation. After the model parameters were optimized by the grid search method, the average classification accuracy score was 0.8852. The distance between each sample and the classification hyperplane is shown in Table 8. The support vector classifier in sklearn, SVC(), was called to build this model, and decision_function() was used to calculate this distance.

According to the four models’ training results, the classification accuracies of the four base classifiers differ. The SVM and the BP neural network classify better than K-NN and Bayes. However, in a heterogeneous ensemble, the greater the difference between the base classifiers, the better the performance of the ensemble model. The ensemble database of the air traffic situation in East China was built from Table 5, Table 6, Table 7 and Table 8. In Table 9, the four classification results are combined into a new ensemble learning dataset, which was used to train the air traffic situation assessment model.

In this paper, we chose the SVC as the basic framework of the ensemble learning model. The grid search was used to search for optimal parameters. The 10-fold cross validation and mean square error were set as parameters. The main parameters of SVC were finally determined as follows: kernel = ‘rbf’, gamma = 0.0083, cache_size = 6000, C = 1.0. And we used the confusion matrix to compare the classification effect of the ensemble learning model and single classification model.

In Table 10, the ensemble learning model performed best on all indexes. Its average classification accuracy is up to 0.98, and recall and precision both exceed 0.95. The ensemble model’s classification ability has obvious advantages, which indicates that the ensemble learning method can effectively improve the accuracy of the air traffic network situation assessment model. According to the relationship between recall and precision, the PR graph was drawn in Figure 5. The PR curve of the ensemble model lies outside those of the other models, and its equilibrium point is the highest. This indicates that the ensemble learning model is superior to the others in balancing precision and recall. The PR curves of the SVM and the BP neural network cross, and the gap between their equilibrium points is small, which indicates that the classification performance of the SVM is close to that of the BP neural network. The Bayes model performed worse in precision and recall than the other classifiers.
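For reference, the accuracy, precision, and recall used in this comparison follow directly from the confusion matrix. The sketch below uses hypothetical labels and an illustrative function name; returning 0 when a denominator is empty is an assumed convention.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision and recall from the binary confusion matrix."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,  # assumed 0 when undefined
        "recall": tp / (tp + fn) if tp + fn else 0.0,     # assumed 0 when undefined
    }


# Hypothetical labels: 1 = stable situation, 0 = risky situation.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```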

4.3. Situation Assessment of Air Traffic Based on Ensemble Learning

The experiments show that the ensemble learning model is the better choice for building the air traffic situation assessment model. Therefore, we input the test dataset into the air traffic situation assessment model. In Figure 6, the distances from the 90 samples to the classification hyperplane were used to draw the change diagram of the air traffic situation in East China, and the quantitative assessment results of the model were compared with the actual operation state. The distances between the test samples and the classification hyperplane were output by decision_function(). The dotted red line divides the samples into positive and negative categories.

Within the positive region, the air traffic situation in East China is at a low risk level. Within the negative region, the air traffic situation in East China is at a higher risk level. According to the color distribution of the samples, the air traffic situation level is “safety” or “potential risk” on most days. For about 35 days, the air traffic situation level in East China was between “potential risk” and “general risk”. On 14 days, the situation level is in the negative category, belonging to “moderate risk” or “significant risk”. The air traffic situation of these samples has been confirmed to have great risks. The distance from the sample to the classification boundary can be used to further determine the risk severity level on that day. In addition, more samples marked in blue are found in the positive area. This indicates that the air traffic situation in East China was generally in a relatively safe state over the 90 days, and that the air traffic operation was relatively stable. In order to analyze the changing trend in the air traffic situation in East China, we drew a needle graph, which displays the distance from the sample to the classification hyperplane more intuitively. The fluctuation of the air traffic situation over the 90 days can be seen clearly.

In Figure 7, the 90-day air traffic situation changes periodically. Every 10–15 days, the air traffic situation exhibits more intensive fluctuations. As shown in the red circle, the air traffic situation is affected on the days when significant risk appeared. We examined the operation of the air traffic in East China on the days corresponding to the No. 5, No. 24, and No. 80 samples. We found that there were a lot of military flights and severe weather in East China on these days. Some airports and routes were temporarily closed, and flight delays were severe. This led to different levels of flight congestion on the following days. In addition, we also examined the air traffic operation state of some general and potential risk samples. We found that minor delays due to route changes, severe weather, and traffic control are common on these days. As the influencing factors were gradually eliminated, the air traffic operation state returned to stability and safety within a few days.

5. Conclusions

In this work, we proposed an air traffic situation assessment method based on the complex network theory and ensemble learning. The method meets the requirement for system-level evaluation and addresses the shortcomings of the index evaluation method. The air traffic weighted network model based on the complex network theory can fully characterize the overall characteristics of the air traffic system. Using the complex network analysis method, we can systematically evaluate the state of an air traffic system composed of airports, routes, waypoints, and operation data. On the basis of index evaluation, we introduced an artificial intelligence situation assessment method. We transformed the situation assessment into a binary classification to obtain more objective and quantitative assessment results. To improve on the classification effect of a single classification model, ensemble learning was introduced to construct the air traffic situation evaluation model, which can achieve a better classification effect in less calculation time. We took the air traffic situation data in East China as the experimental object. The results show that the situation assessment model worked well when we used it to assess the air traffic situation of the target samples. The situation assessment results can be output quantitatively, and comparison with the real operation shows that the model can effectively assess the air traffic operation state. The situation assessment model can also be used to study the changing trend in the air traffic situation and the reasons for these changes.

In the future, researchers can further increase the number of samples in the training set to obtain a more refined assessment model. In addition, the index selection for the air traffic weighted network analysis and the base classifier selection for ensemble learning can be optimized further.

Author Contributions

Conceptualization, J.L. and F.L.; methodology, J.L.; software, J.L.; validation, J.L., S.L. and X.W.; formal analysis, J.L. and F.L.; investigation, J.L., Y.W. and S.L.; resources, J.L., F.L. and R.T.; data curation, J.L. and Y.W.; writing—original draft preparation, J.L.; writing—review and editing, J.L., F.L. and X.W.; visualization, J.L., F.L. and S.L.; supervision, J.L. and F.L.; project administration, F.L., X.W. and D.C.; funding acquisition, F.L. and X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 7180122).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

All data included in this study are available upon request by contacting the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

Statistical Bulletin on the Development of China Civil Aviation Industry in 2019. Available online: https://www.caac.gov.cn/en/HYYJ/SJ/index_3.html (accessed on 10 October 2023).
Yi, L.; Min, R.; Kunjie, C.; Dan, L.; Ziqiang, Z.; Fan, L.; Bo, Y. Identifying and managing risks of ai-driven operations: A case study of automatic speech recognition for improving air traffic safety. Chin. J. Aeronaut. 2023, 36, 366–386.
Wang, X.; Ma, Y.; Wang, Y.; Jin, W.; Wang, X.; Tang, J.; Jia, C.; Yu, J. Traffic flow prediction via spatial temporal graph neural network. In Proceedings of the Web Conference, Taiwan, China, 20–24 April 2020; pp. 1082–1092.
Zhu, X.; Cai, K.; Cao, X. A semi-supervised learning method for air traffic complexity evaluation. In Proceedings of the 2017 Integrated Communications, Navigation and Surveillance Conference (ICNS), Herndon, VA, USA, 18–20 April 2017; pp. 1A3–1–1A3–11.
Rodríguez-Sanz, Á.; Andrada, L.R. Managing airport capacity and demand: An economic approach. IOP Conf. Ser. Mater. Sci. Eng. 2022, 1226, 012024.
Yuan, J.; Long, C. Construction of General Aviation Air Traffic Management Auxiliary Decision System Based on Track Evaluation. In Proceedings of the International Conference on Human-Computer Interaction, Lausanne, Switzerland, 23–25 April 2022; pp. 391–405.
Pérez-Castán, J.A.; Pérez-Sanz, L.; Serrano-Mira, L.; Saéz-Hernando, F.J.; Rodríguez Gauxachs, I.; Gómez-Comendador, V.F. Design of an atc tool for conflict detection based on machine learning techniques. Aerospace 2022, 9, 67.
Rytter, A.; Skorupski, J. The concept of initial air traffic situation assessment as a stage of medium-term conflict detection. Procedia Eng. 2017, 187, 420–424.
Delahaye, D.; García, A.; Lavandier, J.; Chaimatanan, S.; Soler, M. Air traffic complexity map based on linear dynamical systems. Aerospace 2022, 9, 230.
Tascón, D.C.; Díaz Olariaga, O. Air traffic forecast and its impact on runway capacity. A system dynamics approach. J. Air Transp. Manag. 2021, 90, 101946.
Hu, G.Y.; Qiao, P.L. Cloud belief rule base model for network security situation prediction. IEEE Commun. Lett. 2016, 20, 914–917.
Leau, Y.B.; Selvakumar, M. A cost-sensitive entropy-based network security situation assessment model. Adv. Sci. Lett. 2016, 22, 2865–2870.
Xing, J.; Zhang, Z. Prediction model of network security situation based on genetic algorithm and support vector machine. J. Intell. Fuzzy Syst. 2021, 3, 1–9.
Lu, S.; Zhuang, Y. A Network Security Situational Awareness Framework Based on Situation Fusion. In Proceedings of the International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage, Nanjing, China, 18–20 December 2020.
Yi, J.K.; Guo, L. AHP-Based Network Security Situation Assessment for Industrial Internet of Things. Electronics 2023, 12, 3458.
Hu, J.; Ma, D.; Liu, C.; Shi, Z.; Yan, H.; Hu, C. Network Security Situation Prediction Based on MR-SVM. IEEE Access 2019, 7, 130937–130945.
Pu, Z.Y. Network security situation analysis based on a dynamic Bayes network and phase space reconstruction. J. Supercomput. 2018, 1, 1–16.
Wang, C.D.; Yu, Z. Network security situation evaluation based on modified DS evidence theory. Wuhan Univ. J. Nat. Sci. 2014, 19, 409–416.
Gao, K.L.; Xu, R.Z.; Wang, Y.F. A study of hierarchical network security situation evaluation system for electric power enterprise based on Grey Clustering Analysis. In Proceedings of the International Conference on Computer Science and Service System, Nanjing, China, 27–29 June 2011; pp. 1990–1995.
Wen, Z.; Tang, J. Quantitative assessment for network security situation based on weighted factors. J. Comput. Methods Sci. Eng. 2016, 16, 821–833.
Takeichi, N.; Kaida, R.; Shimomura, A.; Yamauchi, T. Prediction of delay due to air traffic control by machine learning. In Proceedings of the AIAA Modeling and Simulation Technologies Conference, Grapevine, TX, USA, 9–13 January 2017; p. 1323.
Kim, Y.J.; Choi, S.; Briceno, S.; Mavris, D. A deep learning approach to flight delay prediction. In Proceedings of the 2016 IEEE/AIAA 35th Digital Avionics Systems Conference, Sacramento, CA, USA, 25–29 September 2016; pp. 1–6.
Gui, G.; Zhou, Z.; Wang, J.; Liu, F.; Sun, J. Machine learning aided air traffic flow analysis based on aviation big data. IEEE Trans. Veh. Technol. 2020, 69, 4817–4826.
Gopalakrishnan, K.; Balakrishnan, H. A comparative analysis of models for predicting delays in air traffic networks. ATM Semin. 2017, 6.
Berge, C. The theory of graphs and its applications. Bull. Math. Biophys. 1962, 24, 441–443.
Brockmann, D.; Helbing, D. The Hidden Geometry of Complex, Network-Driven Contagion Phenomena. Science 2013, 342, 1337–1342.
Krawczyk, B.; Minku, L.L.; Gama, J.; Stefanowski, J.; Woźniak, M. Ensemble learning for data stream analysis: A survey. Inf. Fusion 2017, 37, 132–156.
Gomes, H.M.; Barddal, J.P.; Enembreck, F.; Bifet, A. A survey on ensemble learning for data stream classification. ACM Comput. Surv. 2017, 50, 1–36.
Silva RA, D.; Canuto AM, D.P.; Barreto CA, D.S.; Xavier-Junior, J.C. Automatic recommendation method for classifier ensemble structure using meta-learning. IEEE Access 2021, 1, 99.
Yu, J.; Zhao, C.; Zheng, W.; Li, Y.; Chen, C. Android Malware Detection Using Ensemble Learning on Sensitive APIs. First EAI Int. Conf. Proc. 2020, 11, 126–140.
Mehmood, Z.; Asghar, S. Customizing svm as a base learner with adaboost ensemble to learn from multi-class problems: A hybrid approach adaboost-msvm. Knowl. Based Syst. 2021, 217, 106845.

Figure 1. The air traffic network in East China G_{2} (V_{2}, E_{2}, ω_{2}).

Figure 2. Classification principle of SVC.

Figure 3. Process of ensemble learning.

Figure 4. Process of air traffic situation assessment.

Figure 5. The PR curve of the ensemble model.

Figure 6. The distances between the test samples and the classification hyperplane.

Figure 7. The distance from the sample to the classification hyperplane.

Table 1. Comparison of importance.

CV	RS	FDR	MF	MC
RS	1	3	5	7
FDR	1/3	1	3	5
MF	1/5	1/3	1	3
MC	1/7	1/5	1/3	1
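The reciprocal 1, 3, 5, 7 structure of Table 1 is the form of pairwise comparison matrix used in the analytic hierarchy process (AHP). Assuming the matrix is meant to be reduced to one weight per criterion in that way, the weights can be approximated with the row geometric mean method and checked with a consistency ratio. This is an illustrative sketch only: the function names, the geometric mean approximation, and the random index value of 0.90 for a 4 × 4 matrix are assumptions, not details taken from the paper.

```python
from fractions import Fraction
from math import prod

# Pairwise comparison matrix copied from Table 1 (rows and columns: RS, FDR, MF, MC).
CRITERIA = ["RS", "FDR", "MF", "MC"]
MATRIX = [
    [1, 3, 5, 7],
    [Fraction(1, 3), 1, 3, 5],
    [Fraction(1, 5), Fraction(1, 3), 1, 3],
    [Fraction(1, 7), Fraction(1, 5), Fraction(1, 3), 1],
]


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(matrix)
    geo_means = [prod(float(x) for x in row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights, random_index=0.90):
    """Estimate the consistency ratio CR = CI / RI.

    lambda_max is estimated as the mean of (A w)_i / w_i.
    random_index=0.90 is the commonly tabulated value for n = 4 (assumption).
    """
    n = len(matrix)
    aw = [sum(float(a) * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(v / w for v, w in zip(aw, weights)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / random_index


if __name__ == "__main__":
    weights = ahp_weights(MATRIX)
    for name, w in zip(CRITERIA, weights):
        print(f"{name}: {w:.4f}")
    print(f"CR = {consistency_ratio(MATRIX, weights):.4f}")
```

A consistency ratio below 0.1 is conventionally taken to mean that the pairwise judgments are acceptably consistent.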

Table 2. Air traffic operation state.

Distance	Situation Level	Explanation
(1.5, 2.0]	Safety	The air traffic is stable and safe. There is no security risk.
(1.0, 1.5]	Potential Risk	The air traffic is basically stable. Air traffic robustness is weakened. The risk of unsafe incidents increases.
(0, 1.0]	General Risk	There are systemic risks in air traffic. The stability of air traffic operation decreases. A few unsafe incidents are happening.
(−1.0, 0]	Moderate Risk	The air traffic security situation is deteriorating. The probability of seriously unsafe incidents is greatly increased. The number of flight delays increases.
[−2.0, −1.0]	Significant Risk	The robustness and safety of air traffic are weak. Many unsafe incidents are happening. The normal operation order is seriously damaged.

Table 3. The weighted network adjacency matrix of specified date.

Waypoints	WYK_VOR	DOGAR	LADIX	CG	KALBA	···	PIMOL	LAMEN
WYK_VOR	0	0.502812	0	0	0	···	0	0
DOGAR	0.502812	0	0.536808	0	0	···	0	0
LADIX	0	0.536808	0	0.592898	0	···	0	0
CG	0	0	0.592898	0	0.710775	···	0	0
KALBA	0	0	0	0.710775	0	···	0	0
···	···	···	···	···	···	···	···	···
PIMOL	0	0	0	0	0	···	0	0
LAMEN	0	0	0	0	0	···	0	0

Table 4. The air traffic network situation dataset in East China.

Date	NS	NB	EB	ND	NCC	NE	NLD	NI	NPR	CLASS
1 January 2019	2.0394	0.0882	0.0678	1.3277	0.1493	0.4829	0.0382	0.5234	0.0484	1.0000
4 January 2019	2.0945	0.0702	0.0478	1.2428	0.1593	0.5942	0.0399	0.4694	0.0572	1.0000
17 January 2019	1.9049	0.0682	0.0434	1.4024	0.1481	0.3832	0.0418	0.4213	0.0609	1.0000
30 January 2019	2.3844	0.0723	0.0522	1.3521	0.1595	0.3287	0.0413	0.4382	0.0689	2.0000
2 February 2019	2.1837	0.0912	0.0528	1.2047	0.1466	0.5321	0.0523	0.3694	0.0523	1.0000
⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
9 December 2019	1.9489	0.1024	0.0346	1.4185	0.1503	0.5343	0.0619	0.4398	0.0442	1.0000
16 December 2019	2.0874	0.0507	0.0571	1.5105	0.1494	0.5034	0.0497	0.3957	0.0531	1.0000
29 December 2019	2.2049	0.0973	0.0519	1.4454	0.1446	0.4875	0.065	0.4529	0.0789	2.0000

Table 5. The number of positive and negative samples.

Sample	Positive	Negative
1	4	1
2	3	2
3	4	1
4	0	5
5	4	1
⋯	⋯	⋯
189	3	2
190	3	2
191	5	0

Table 6. The positive and negative prior probability.

Sample	Positive	Negative
1	1.00000000	7.46429951 × 10⁻¹³
2	1.00000000	7.07690019 × 10⁻²¹
3	1.00000000	5.88921171 × 10⁻²⁰
4	4.31728731 × 10⁻²	9.56827127 × 10⁻¹
5	9.99924099 × 10⁻¹	7.59010962 × 10⁻⁵
⋯	⋯	⋯
189	4.90876080 × 10⁻¹	9.12391981 × 10⁻¹
190	9.76546933 × 10⁻¹	2.34530674 × 10⁻²
191	9.80120565 × 10⁻¹	1.98794345 × 10⁻²

Table 7. The activation function.

Sample	Value
1	0.99478238
2	0.99806342
3	0.99982834
4	0.74293931
5	0.97827422
⋯	⋯
189	0.98293848
190	0.32949282
191	0.99993762

Table 8. The distance between each sample and the classification hyperplane.

Sample	Distance
1	1.16488012
2	1.42125133
3	1.59446007
4	−0.44328792
5	0.92691649
⋯	⋯
189	1.38729788
190	1.09898215
191	1.49192777

Table 9. Ensemble learning dataset.

Sample	Positive	Negative	Positive Prior Probability	Negative Prior Probability	Value	Distance	Class
1	4	1	1.00	7.46 × 10⁻¹³	9.95 × 10⁻¹	1.16488012	1.0000
2	3	2	1.00	7.08 × 10⁻²¹	9.98 × 10⁻¹	0.92125133	1.0000
3	4	1	1.00	5.89 × 10⁻¹⁰	1.00	0.49446007	1.0000
4	0	5	4.32 × 10⁻¹	9.57 × 10⁻¹	7.43 × 10⁻¹	−0.44328792	2.0000
5	4	1	1.00	7.59 × 10⁻⁵	9.78 × 10⁻¹	0.82691649	1.0000
⋯	⋯	⋯	⋯	⋯	⋯	⋯	⋯
189	3	2	9.12 × 10⁻¹	4.91 × 10⁻¹	9.83 × 10⁻¹	1.38729788	1.0000
190	3	2	2.36 × 10⁻²	9.77 × 10⁻¹	3.33 × 10⁻¹	1.09898215	1.0000
191	5	0	1.99 × 10⁻²	9.80 × 10⁻¹	9.99 × 10⁻¹	1.49192777	1.0000

Table 10. The accuracy, recall, and precision of each model.

Model	Accuracy	Recall	Precision
K-NN	0.8257	0.8094	0.7947
Bayes	0.7939	0.7631	0.7493
BP neural network	0.8702	0.8552	0.8419
SVM	0.8853	0.8947	0.8750
Ensemble	0.9898	0.9566	0.9743
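For reference, the three metrics reported in Table 10 can be computed from the confusion counts of a binary classifier as sketched below. The label convention (1 for the positive class, 2 for the negative class) mirrors the CLASS column of Table 4, but the function name and the short label lists are hypothetical illustration data, not values from the experiments.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, recall, precision) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # share of true positives found
    precision = tp / (tp + fp) if (tp + fp) else 0.0     # share of predicted positives that are right
    return accuracy, recall, precision


if __name__ == "__main__":
    # Hypothetical labels for eight samples (1 = positive, 2 = negative).
    y_true = [1, 1, 1, 2, 1, 2, 1, 1]
    y_pred = [1, 1, 1, 2, 1, 1, 1, 2]
    acc, rec, prec = classification_metrics(y_true, y_pred)
    print(f"accuracy={acc:.4f} recall={rec:.4f} precision={prec:.4f}")
```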

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
