Optimization of Air Traffic Management Efficiency Based on Deep Learning Enriched by the Long Short-term Memory (LSTM) and Extreme Learning Machine (ELM)

Air traffic management (ATM) refers to the activities required for the efficient and safe management of the national air system (NAS) of each country. This concept has been widely assessed due to its complexity and sensitivity for the beneficiaries, including passengers, airlines, regulatory agencies, and other organizations. To date, various methods (e.g., statistical and fuzzy techniques) and data mining algorithms (e.g., neural networks) have been used to solve ATM issues and delay minimization problems. However, each of these techniques has some disadvantages, such as overlooking the data, computational complexity, and uncertainty. The present study aimed to increase ATM efficiency using the deep learning approach. The main research objective was to propose a long short-term memory (LSTM)-based deep learning model in order to increase the predictive accuracy in short (daily) and long-term (annual) windows by enhancing deep learning (two-dimensional). In addition, the output of the deep model was transferred to the extreme learning machine (ELM), a fast-learning neural machine, in order to calculate the estimated time of arrival in real time based on other similar input data, including the NAS data, bureau of transportation statistics system, and automatic dependent surveillance-broadcast system. The final results indicated the increased accuracy of ATM compared to other studies.

Another challenge in flight sequences is the coverage of several airports by a single ATM, while each airport may have several pattern areas for its ATM (3). Moreover, the traffic or pattern areas in the nearby airports may be dependent or independent, and each airport may have several parallel or non-parallel runways. The traffic of parallel runways may be dependent or independent, whereas the traffic of crossover runways is dependent.
Landing and takeoff runways might differ in every airport or be used jointly (3, 4). Furthermore, each runway may have several procedures for landing and takeoff, which might have dependent or independent traffic. These issues show the high complexity of modeling the problem. Given the large scale of air traffic data (big data) in the classification learning process, the level of complexity rises as the number of categories in each class increases. In addition, the selection of the significant features by traditional data mining is almost impossible.
Various measures have been taken to solve the problems of ATM and ASP (5). Many studies have aimed to solve these issues using mathematical models and methods, linear programming, mixed planning, and statistical models.
However, one of the limitations of these studies is that they do not consider the actual data and operating environment.
Other studies have used the first-come, first-served (FCFS) technique, along with the queue model and other mixed methods, to solve the problem (22-26).
Another category of research has applied data mining methods to investigate the influential factors in air traffic and flights (27-32).
Machine learning algorithms are a conventional method used to resolve the issues of air traffic, ASL, delay forecast, and minimization.
An optimal approach to solving the mentioned issue involves using the structure of artificial neural networks (53-55). Following the evolution of neural networks, deep learning

Proposed Method
The proposed model in the present study was based on the LSTM and ELM algorithms. The three gates were a fully connected layer, the input of which was a vector, and the output was a real number. Figure 3 shows the initial structure of the LSTM cell. The calculations of the stacked BiLSTM (sBiLSTM) are as follows:

$$h^q_t = \mathrm{sBiLSTM}(h^q_{t-1}, h^q_{t+1}, q_t), \quad h^q_0 = 0,$$
$$h^a_t = \mathrm{sBiLSTM}(h^a_{t-1}, h^a_{t+1}, a_t), \quad h^a_0 = h^q_n,$$

where d represents the dimensions of the hidden state.
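For reference, the gate computations of a single LSTM cell are usually written as below. This is the standard textbook formulation, given here as an assumption rather than as the exact equations of the proposed model; the symbols $W_i, W_f, W_o, W_c$ (weight matrices), $b_i, b_f, b_o, b_c$ (biases), $x_t$ (input at step t), and $c_t$ (cell state) are introduced only for illustration.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t), \qquad h_t, c_t \in \mathbb{R}^d
\end{aligned}
```

In a bidirectional layer, one such cell reads the sequence forward and a second one reads it backward, and the two hidden states at each step are combined (typically concatenated) before being passed to the next stacked layer.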

Coherence Mechanism for Problem Presentation
In this section, a coherence mechanism was implemented to encode the problem in accordance with the response sequence (Figure 5). In order to obtain the attention (accuracy) vector of the question with respect to each word of the response, we combined the explanatory weights and approximation matrix to calculate the new $C^Q$ and $C^A$ field vectors. Here, $C^Q$ and $C^A$ were the results of the interaction between the problem and response vectors. In the ELM, $H^+$ was the Moore-Penrose generalized inverse of the hidden-layer output matrix. In the final step, we examined the assessable criteria.

Analysis and Evaluation
At this stage of the research, we analyzed and evaluated the applied data and assessed the results and criteria.

Dataset
The dataset obtained from (1,28) included 1,100,000 records and 15 features according to Table 1.

Assessable Criteria
It was crucial to test and evaluate the results by a set of criteria to assess the performance of the proposed method. Table 2 shows the position of the parameters in the confusion matrix.

Table 2. Confusion matrix
The elements of the matrix are described in Table 3. In addition, the following criteria were used to evaluate the performance of the proposed method:

Table 3. Confusion matrix description
TP: The number of the behaviors that represented the existence of a delay and were correctly predicted by the model.
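The confusion-matrix counts (TP, FP, FN, TN) can be turned into the evaluation criteria with a few lines of code. The following is a minimal sketch using the standard definitions; the function name `classification_metrics`, the helper `_ratio`, and the example counts are hypothetical and are not taken from the study.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Compute standard criteria from confusion-matrix counts (delay = positive class)."""
    recall = _ratio(tp, tp + fn)        # share of actual delays that were detected
    specificity = _ratio(tn, tn + fp)   # share of on-time flights predicted as on time
    precision = _ratio(tp, tp + fp)     # share of predicted delays that were real
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f_measure": _ratio(2 * precision * recall, precision + recall),
        "balanced_accuracy": (recall + specificity) / 2,
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Illustrative counts only (not results from the paper).
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
```

Balanced accuracy and the Matthews correlation coefficient (MCC) are included because they remain informative when delayed and on-time flights are not equally frequent.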

Accuracy
The comparison of the accuracy criteria with 15 and 30 minutes of delay is presented in Table 6. The first evaluated criterion was accuracy. As is observed, a 30-minute delay had a higher accuracy percentage, a reason for which is that the delay has
The second criterion evaluated in Table 7
According to Table 7, the cause of 40% of the delays has been recorded in the airports, the most important of which was air time, followed by delayed arrival. Each of the delay factors alone could cause several arrival delays at the subsequent airports, except for the arrival delay factor. Therefore, a significant part of the delay factors was related to delayed arrivals, which will be resolved when airlines have the required time for recovering and returning to the flight schedule. At present, the cause of delays of less than 15 minutes and departure delays is not recorded at most airports. However, the recorded information for the delays between 15 and 30 minutes is more thorough, which leads to higher accuracy and precision.

Comparison of the Proposed Method with Conducted Research
Improving the accuracy and precision of the ATM is a basic aim in ATM research. Several ATM approaches have been provided on an ATC level.
The accuracy of these approaches to traffic was low, and they did not respond to heavy traffic. In the present study, an LSTM-ELM hybrid model was applied to improve the accuracy of the proposed method. The comparison of the proposed approach for the 30-minute delay with the studies in (1) and (28) is shown in Figure 8.
According to the obtained results, the proposed method had a more
Air traffic management (ATM): An aviation term encompassing all systems that assist aircraft to depart from an aerodrome, transit airspace, and land at a destination aerodrome, including Air Traffic Services (ATS), Airspace Management (ASM), and Air Traffic Flow and Capacity Management (ATFCM).
Bidirectional LSTMs: An extension of traditional LSTMs that can improve model performance on sequence classification problems. In problems where all time steps of the input sequence are available, bidirectional LSTMs train two LSTMs instead of one on the input sequence.
Elapsed Flying Time: Actual time an airplane spends in the air, as opposed to time spent taxiing to and from the gate and during stopovers.
Extreme learning machines (ELM): Feedforward neural networks for classification, regression, clustering, sparse approximation, compression, and feature learning with a single layer or multiple layers of hidden nodes, where the parameters of hidden nodes (not just the weights connecting inputs to hidden nodes) need not be tuned.
Long short-term memory (LSTM): An artificial recurrent neural network (RNN) architecture used in the field of deep learning. Unlike standard feed-forward neural networks, LSTM has feedback connections.
Terminal Control Area (TCA or TMA): A terminal control area (TCA in the U.S. and Canada), also known as a terminal manoeuvring area (TMA) in Europe, is an aviation term to describe a designated area of controlled airspace surrounding a major airport where there is a high volume of traffic.
Air traffic management (ATM) refers to the activities required for the efficient and safe management of the national air system (NAS) for each country. Overall, ATM encompasses the two components of air traffic control (ATC) and air traffic flow management (1). The ATC system mainly uses tactical decisions (e.g., real-time separation method) for collision detection. NAS is divided into several sections to present ATC services and help air traffic controllers in the process of traffic control and flight separation. The aircraft sequencing problem (ASP) for the prevention of flight delay and interference is an important issue in the working area of ATM (1). In addition, the expenses of fleet fuel and flight delays in airports and secondary costs constitute a considerable part of airline costs. The arrival schedule of flights is considered by airlines and aviation/airport companies (2), which draws the attention of airlines to the airports with better ATM performance.

networks have been considered to be one of the most recent and complete solutions in this regard. This novel technique could solve problems with a high accuracy owing to its ability to accept the large data of the problem and neural network integration, as well as learning techniques and structural dynamism in the formation of the hidden layers. Aviation and ATM issues are no exception, and most of the recent studies regarding flight delay forecast and aircraft sequencing problems have benefited from this technique (1, 52, 56-58). Multilayer neural networks or deep neural networks are part of machine learning and comprise a set of algorithms that attempt to model high-level abstract concepts based on learning at various levels and layers, thereby enabling deep learning to process large volumes of data in complicated categories (1, 56, 57, 59-61). In the current research, it was attempted to propose an accurate and proper method to solve the problem within the work domain of the terminal management area (TMA) using the combination of deep neural network and other methods. Furthermore, the present study aimed to propose a deep learning model using a long short-term memory (LSTM)-based recurrent neural network (RNN) in order to increase the predictive accuracy of short and long-term annual windows by enhancing deep learning (two-dimensional). In the third phase, the output of the deep model was transferred to the extreme learning machine (ELM) and fast learning deep neural machine in order to calculate the estimated time of arrival (ETA) and estimated time of departure (ETD) of each flight based on other similar input data, including the NAS data, bureau of transportation statistics system, and automatic dependent surveillance-broadcast (ADS-B) system. Finally, an aircraft sequence was developed within the airport TMA range with a 15-minute time window for flight arrival using evolutionary and meta-heuristic algorithms by matching the flight rules with the learning findings and increasing the accuracy. The following sections of the article are structured as follows: section two reviews the previous methods in the ATM field, section three describes the proposed model, section four evaluates and compares the results with other techniques, and section five is dedicated to the conclusion.

A Review of Previous Methods
Numerous efforts have been made to solve the ATM problem and minimize the rates of ETA and ETD delay in various dimensions. Most of the studies in this regard have evaluated inbound and outbound flights separately, attempting to propose solutions using methods such as data mining and mathematical techniques (Figure 1).

Figure 2 depicts the flowchart of the proposed method.

Figure 4. Structure of Stacked BiLSTM Networks
We attempted to interact more closely with the functions and summaries in the coherence mechanism by designing the matrix multiplication to address more questions. Initially, the matrix multiplication was carried out to estimate the L matrix, which included the propensity scores related to all pairs of the problem and response.

Figure 5. Schematic of Coherence Mechanism to Display the Problem
A soft attention (accuracy) layer could be used for the integration of information from the words of the problem and response in order to reduce the information loss of the stacked BiLSTM (67, 68). In the proposed model, the attention mechanism was applied for the cohesion output. In the current research, $C^Q_t$ was assumed to show the t-th attention field vector of the problem, and maximum aggregation (max pooling) was applied to convert the input into a vector $O_q$ with fixed length. In addition, the softmax weight of all the text vectors ($C^A_1, C^A_2, \ldots, C^A_m$) could be learned independently based on $O_q$ through the attention mechanism, and the weighted field vector $O_a$ of the response was used as the final representation.
In the attention equations, $W_{am}$ and $W_{qm}$ show the attention matrices of $C^A_t$ and $O_q$, respectively, and $W_{ms}$ is the attention weight vector. The formal representation $O_a$ of the response was determined based on the attention (accuracy) weight $S_{aq}(t)$ of the t-th word vector of the response text. In addition, normalization was performed by the softmax function, which was proportional to $C^A_t$. Higher values of $S_{aq}(t)$ demonstrated a more significant correlation between $C^A_t$ and the problem, so that the problem vector drew more attention.
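One common attentive-pooling form that is consistent with the symbols named here is sketched below. It is an assumption offered for orientation, not a reproduction of the equations of the proposed model; the intermediate vector $m(t)$ and the number of response words $m$ in the sums are introduced only for illustration.

```latex
\begin{aligned}
m(t) &= \tanh\left(W_{am}\, C^A_t + W_{qm}\, O_q\right) \\
S_{aq}(t) &= \frac{\exp\left(W_{ms}^{\top} m(t)\right)}{\sum_{j=1}^{m} \exp\left(W_{ms}^{\top} m(j)\right)} \\
O_a &= \sum_{t=1}^{m} S_{aq}(t)\, C^A_t
\end{aligned}
```

Under this form, the weights $S_{aq}(t)$ are non-negative and sum to one, so $O_a$ is a weighted average of the response vectors in which the words most related to the problem dominate.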

Figure 7. Structure of ELM Model
As can be seen, $x_1, x_2, \ldots, x_n$ were the inputs of the training data, and $w_{ij}$ and $\beta_{jk}$ were the input weights of the neural network and the output weights between the hidden layer and output nodes, respectively. As a result, the output of the hidden layer corresponded to the input x. In this regard, OL was the node of the hidden layer, and $b_j$ was the neuron threshold in the hidden layer. In addition, the training sample set was $\{(x_i, y_i) \mid x_i \in \mathbb{R}^n, y_i \in \mathbb{R}^m, i = 1, 2, \ldots, N\}$.
In general, the confusion matrix is used to evaluate the position and efficiency of classification and diagnosis systems (e.g., disease classification). The analysis of the confusion matrix in the classification and detection of flight delays led to the four modes of true positive, true negative, false positive, and false negative.
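The ELM structure of Figure 7 can be illustrated with a small, dependency-free sketch: the input weights $w_{ij}$ and thresholds $b_j$ are drawn at random and left untrained, and only the output weights $\beta_{jk}$ are solved for. As an explicit simplification, the code solves ridge-regularized normal equations instead of computing the Moore-Penrose generalized inverse, so that it needs only the standard library. All function names, the tanh activation, the `ridge` value, and the toy data are assumptions for illustration, not the configuration used in the study.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b (a: n x n, b: n x m) by Gaussian elimination with partial pivoting."""
    n, m = len(a), len(b[0])
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + m):
                aug[row][k] -= factor * aug[col][k]
    x = [[0.0] * m for _ in range(n)]
    for row in range(n - 1, -1, -1):
        for k in range(m):
            known = sum(aug[row][c] * x[c][k] for c in range(row + 1, n))
            x[row][k] = (aug[row][n + k] - known) / aug[row][row]
    return x


def hidden_layer(x_rows, weights, biases):
    """Hidden-layer output matrix H: one row per sample, one column per hidden node."""
    return [[math.tanh(sum(w * x for w, x in zip(w_j, x_row)) + b_j)
             for w_j, b_j in zip(weights, biases)]
            for x_row in x_rows]


def train_elm(x_rows, y_rows, n_hidden=10, ridge=1e-6, seed=0):
    """Draw random hidden parameters, then solve (H^T H + ridge * I) beta = H^T Y."""
    rng = random.Random(seed)
    n_inputs, n_outputs = len(x_rows[0]), len(y_rows[0])
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = hidden_layer(x_rows, weights, biases)
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    hty = [[sum(h_row[i] * y_row[k] for h_row, y_row in zip(h, y_rows))
            for k in range(n_outputs)] for i in range(n_hidden)]
    return weights, biases, solve_linear_system(hth, hty)


def predict_elm(x_rows, model):
    """Apply the fixed hidden layer, then the learned output weights beta."""
    weights, biases, beta = model
    return [[sum(h_j * beta[j][k] for j, h_j in enumerate(h_row))
             for k in range(len(beta[0]))]
            for h_row in hidden_layer(x_rows, weights, biases)]


# Toy regression problem (illustrative only): learn y = 2x + 1 on [0, 1].
x_train = [[i / 19] for i in range(20)]
y_train = [[2 * x[0] + 1] for x in x_train]
model = train_elm(x_train, y_train)
predictions = predict_elm(x_train, model)
train_mse = sum((p[0] - y[0]) ** 2 for p, y in zip(predictions, y_train)) / len(y_train)
```

Because the hidden layer is never updated, training reduces to a single linear solve, which is the reason the ELM is described as a fast learner.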

The remaining entries of Table 3 were as follows:
FN: The number of the behaviors that represented the presence of a delay, and the model incorrectly predicted the absence of delay.
FP: The number of the behaviors that indicated the absence of delay, and the model incorrectly predicted the existence of delay.
TN: The number of the behaviors that showed the lack of delay, and the model correctly predicted them.

Table 4. Formulations and description of the criteria
Accuracy: $\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$. This was the most important criterion for determining the performance of a classification algorithm, which showed the percentage of the proper classification of the total set of the experimental records.
Recall: $\mathrm{Recall} = \frac{TP}{TP + FN}$. It showed the ability of the algorithm to accurately detect delay.
A further criterion demonstrated the efficiency of the classifier in the accurate prediction of the lack of delay.
Precision: $\mathrm{Precision} = \frac{TP}{TP + FP}$. It demonstrated the ability of the algorithm to detect the positive categories (i.e., delay).
F-measure: $\text{F-measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$. It showed the harmonic mean between precision and recall.
RMSE: $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}(y_t - \hat{y}_t)^2}$. It measured the accuracy of the predicted rates compared to the correct rates.
MSE: $\mathrm{MSE} = \frac{1}{n}\sum_{t=1}^{n}(y_t - \hat{y}_t)^2$. It was a statistical tool to determine the predictive accuracy in modeling.
Balanced_Acc_Test: If the distribution of the two classes in a dataset was not the same, this criterion was used to calculate the accuracy of the introduced method.

Results and Discussion
Flight delays and the problem of predicting the amount of delay were divided into several factors, conditions, and data. According to a reliable study in this regard, flight delay predictions could be classified as:
1. Delays due to flight planning and scheduling;
2. Delays due to flight operation conditions at the airport;
3. Delays due to weather conditions;
4. Delays due to the terms and conditions of airline aviation operations and air traffic control;
5. Delays due to temporary conditions, such as the flight season or day;
6. Delays due to the flight conditions of the national flight network;
7. Delays due to the flight atmosphere.
Since the type of delay in the present study included the numbers one (delays due to flight planning and scheduling), two (delays due to flight operation conditions at the airport), and six (delays due to the flight conditions of the national flight network), the delay time slots were considered to be less than 15 minutes and 15-30 minutes based on the mentioned findings. A radar squawk is assigned to the flight when the aircraft announces its readiness to fly based on the flight time specified in the flight schedule, and the flight will continue with the same squawk and flight sequence if the delay remains within 15 minutes. Otherwise, the squawk is canceled, and the flight must request a flight squawk from the country's air control center, which will change the flight schedule. On the other hand, if there is a delay of more than 15 minutes and less than 30 minutes, the flight can carry on with the same schedule and a new squawk. In case of a delay of more than 30 minutes, the flight needs to send a flight delay message to the national air traffic network or set and send a new flight schedule.

Calculate MSE and RMSE
MSE is a statistical tool applied to determine the predictive accuracy of a model. Table 5 shows the root-mean-square error (RMSE) of the desired airports. The parameter is mostly used to estimate the difference between the values predicted by a model and the observed values (1, 60). The accuracy of the proposed model would be higher when its MSE was lower than that of the other model. The criteria considered in the proposed method for two delays of 15 and 30 minutes and 10 airports are presented in Table 5. In other words, the higher predictive accuracy of a model leads to a lower MSE. The RMSE criteria in 30-minute delays of LORD, PHX, and JFK airports had a lower percentage compared to the other airports, which was mainly due to the need for fewer traffic data compared to other airports, especially at the PHX Airport. In the case of the PHX Airport, the amount of air traffic data did not exceed the threshold value, while the traffic data for the other airports exceeded the threshold value (28).
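The MSE and RMSE criteria discussed here can be computed directly from paired observed and predicted values. The following is a minimal sketch; the function names and the sample delay values (in minutes) are hypothetical and are not data from the study.

```python
import math


def mse(observed, predicted):
    """Mean squared error between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root-mean-square error, expressed in the same unit as the data."""
    return math.sqrt(mse(observed, predicted))


# Illustrative delays in minutes (not results from the paper).
observed_delays = [10.0, 20.0, 30.0, 40.0]
predicted_delays = [12.0, 18.0, 33.0, 37.0]
example_mse = mse(observed_delays, predicted_delays)
example_rmse = rmse(observed_delays, predicted_delays)
```

RMSE is often easier to interpret than MSE because it is expressed in minutes rather than squared minutes.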
been obtained and calculated due to flight operations in the TMA control space in estimating 30-minute delays, which adds to the previous delays and could no longer be estimated. Moreover, in delays of 30 minutes and more, the recorded information is more accurate since the order of flight arrival and departure numbers changes according to the order intended for the flight with the airport control mechanism and it is necessary to send a flight delay message or a flight plan update.
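The delay time slots used in this study (less than 15 minutes, 15-30 minutes, and more than 30 minutes) map onto different operational consequences. The sketch below encodes that mapping; the function name, the returned labels, and the treatment of delays of exactly 15 or 30 minutes are assumptions, since the text does not specify the boundary cases.

```python
def delay_slot(delay_minutes):
    """Map a departure delay in minutes to the handling described for its time slot."""
    if delay_minutes < 0:
        raise ValueError("delay must be non-negative")
    if delay_minutes < 15:
        # The flight keeps its squawk and its place in the flight sequence.
        return "same_squawk_same_schedule"
    if delay_minutes <= 30:
        # Assumption: boundary values of 15 and 30 minutes fall in this slot.
        return "new_squawk_same_schedule"
    # A delay message or a new flight schedule must be sent.
    return "delay_message_or_new_schedule"


# Illustrative delays in minutes.
slots = [delay_slot(minutes) for minutes in (5, 20, 45)]
```

Labels of this kind could serve as class targets when delay prediction is treated as a classification problem over the 15-minute and 30-minute windows.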
was recall, which had a better percentage for 15-minute delays at the LAX Airport. Some of the advantages of the data of these two airports included less noise and proximity to each other. This airport has the largest number of flights compared to the nearby airports, as well as a higher operating volume than other airports. The amount of system recall in the obtained estimate leads to the detection and reduction of human errors, operating systems, aviation accidents, and operational and airport costs. In addition, the three criteria of balanced accuracy, MCC, and F-measure had better performance in 30-minute delays. With the use of the BiLSTM algorithm and the improvement of the ELM parameters, the model generalized properly to the 30-minute delay. In addition, the improvement of the ELM in the training and testing phases will increase accuracy and precision compared to other airports. Therefore, it could be concluded that the effect of the delay was properly modeled using the proposed method. In general, the improved ELM algorithm is faster, more accurate, and more generalizable in classification compared to other algorithms.
appropriate performance improvement as opposed to the comparable references due to the reconstruction of nonlinear time series and valid predictions. The obtained results also indicated that the proposed method could manage a complex nonlinear time series. Therefore, the use of the BiLSTM algorithm requires fewer hidden layers due to its greater learning capability and the improvement of the ELM network, which could enhance accuracy in air traffic delay prediction. Unlike other algorithms (e.g., backpropagation [BP]), the ELM algorithm needs no iterative tuning of its hidden layer, the parameters of which are selected randomly. The goal of this algorithm is to achieve the lowest training error and the smallest norm of the output weights. Furthermore, the improvement of this algorithm leads to the avoidance of local minima, and BiLSTM could be used to solve the long-term dependency problem. Together, these two algorithms improve accuracy more effectively compared to other methods.

Table 5. MSE and RMSE Criteria