A Method of Reducing Flight Delay by Exploring Internal Mechanism of Flight Delays

This paper explores the internal mechanism of flight departure delay for Delta Air Lines (IATA code: DL) from the viewpoint of statistical law. We roughly divide all delay factors into two sorts: the propagation factor (PF) and the nonpropagation factors (NPFs). From the statistical results, we find that the distribution of flight departure delay caused only by NPFs exhibits an obvious power law (PL) feature, which can be explained by a queuing model, while the original distribution of flight departure delay follows a shifted power law (SPL). The SPL distribution of flight departure delay is considered to result from the aircraft queue for take-off due to airport congestion and the propagation delay caused by late-arriving aircraft. Based on this mechanism, we develop a specific measure for formulating flight planning from the perspective of mathematical statistics, which is easy to implement and reduces flight delays without increasing operational costs. We analyze the punctuality performance of the 10 busiest airports with the highest delay ratios among the 155 airports where DL took off and landed in the second half of 2017. Then, the scheduled turnaround time for all flights and the average scheduled turnaround time for all aircraft operated by DL are counted. Finally, the effectiveness and practicability of our method are verified with flight operation data from the first half of 2018.


Introduction
Flight delay is one of the major issues in aviation systems all over the world. Such delay events degrade the functioning of airlines and cause tremendous losses in human life, the economy, and traffic [1,2]. To alleviate the harm of flight delay, considerable work has been done [3][4][5][6][7][8][9]. The air transportation system is a rather complex system, which has traditionally been described as a graph with vertices representing airports and edges representing direct flights during a fixed time period [10]. These graphs are called aviation networks. Recently, much research has been carried out from the viewpoint of complex networks [11][12][13], which has proposed almost all kinds of aviation network features.
Many networks in nature display rather complex structures that often seem random and unpredictable. Barabási and Albert discovered that many realistic networks [14,15] exhibit the scale-free feature, in which the vertex connectivity follows a PL distribution. The fundamental mechanisms leading to the PL distribution are considered to be growth and preferential attachment [16,17]. On the other hand, in Ref. [18], the author proposed an SPL model with a parameter that controls the relative weights of the power-law and exponential behaviors. Empirical investigations of many real-world networks [19][20][21] also show SPL distributions. These works provide effective theoretical support for us to explore the internal mechanism of flight delay and propose effective measures to alleviate the harm of flight delay.
There are many factors that cause flight delay. The Bureau of Transportation Statistics (BTS) classifies them into five categories [22]: (1) aircraft arriving late, (2) national aviation system (NAS) delay, (3) air carrier delay as a result of crew, baggage loading, or maintenance problems, (4) extreme weather conditions such as hurricanes or blizzards, and (5) security-related delays. If one flight is delayed, then a subsequent flight might also be delayed because it is awaiting that inbound aircraft. This kind of delay is called propagation delay [3,5,22,23,24,25], which is also quite substantial (more than one-third of the delays) [3]. On the other hand, since the schedule of one aircraft is quite tight, the en-route absorption of the previous flight's departure delay is very limited and the delay in subsequent flights is relatively predictable, while the delay caused by NPFs is hard to predict. Thus, quantitative research on propagation delay is of great significance, as it helps to come up with solutions.
In order to alleviate propagation delay, researchers have proposed modifying scheduled departure times so as to re-allocate the existing slack in the flight schedule [3,6,26,27,28]. These studies share a similar research method: they allow the scheduled departure time to vary within a time window, then establish an objective function with several constraints, and finally obtain the optimal solution. They focus on the impact of schedule modification on system performance to maximize the utilization of aviation resources. We are more concerned with how to reduce the flight delay ratio and aim to propose a concrete, practical method. In what follows, we propose a specific implementation method rather than an objective function, although we use the same idea as the previous studies, namely modifying the scheduled departure time. We take advantage of the predictability of propagation delay and assume that there is no newly formed delay (delay caused by NPFs) after changing the plan. The effectiveness and practicability of our method are verified with flight operation data from the first half of 2018. The structure of this paper is as follows: Section 2 presents a statistical law for DL and explores the internal mechanism of flight delay. Section 3 contains an analysis of the operation performance of different airports and statistical results on the scheduled turnaround time for all flights and the average scheduled turnaround time for every aircraft, and then puts forward the specific method. Section 4 presents and discusses the empirical results. In Section 5, conclusions and some hints for future research are given.

Statistical Law and Internal Mechanism
We collect primary records of flight operations from July 1, 2017 to December 31, 2017 for Delta Air Lines. The flight operation data were downloaded from the website of the Bureau of Transportation Statistics (BTS) [29]. Our analysis focuses on the departure delay rather than the arrival delay, because the arrival delay is approximately linearly related to the departure delay [30]. In general, the departure delay is measured as the difference between the scheduled and the actual flight departure time. The Federal Aviation Administration (FAA) defines a flight departure delay as a departure at least 15 minutes behind schedule. Detailed information on the primary data is listed in Table 1.
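The delay definition above can be written down in a few lines. The following is a minimal sketch, not the authors' code; the function names and the example timestamps are illustrative assumptions.

```python
from datetime import datetime

FAA_DELAY_THRESHOLD_MIN = 15  # FAA definition quoted in the text


def departure_delay_minutes(scheduled: datetime, actual: datetime) -> float:
    """Difference between actual and scheduled departure, in minutes."""
    return (actual - scheduled).total_seconds() / 60.0


def is_delayed(scheduled: datetime, actual: datetime) -> bool:
    """True if the flight left at least 15 minutes behind schedule."""
    return departure_delay_minutes(scheduled, actual) >= FAA_DELAY_THRESHOLD_MIN


# Illustrative timestamps (hypothetical, not taken from the BTS data).
sched = datetime(2017, 7, 1, 8, 0)
print(is_delayed(sched, datetime(2017, 7, 1, 8, 14)))  # 14 minutes late
print(is_delayed(sched, datetime(2017, 7, 1, 8, 15)))  # exactly 15 minutes late
```

A flight that leaves 14 minutes late is therefore counted as on time, while one that leaves 15 minutes late is counted as delayed.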
In order to describe the flight delay vividly, we plot in Figure 1 the probability distribution function (PDF) of the departure delay, with the statistical interval of the PDF set to 15 minutes. Clearly, the departure delay distribution shows a decaying trend that is faster than linear decay in the double logarithmic chart. Therefore, we consider the departure delay distribution to be well approximated by an SPL, given by Formula (1). As shown in Figure 1, the SPL fitting function describes the empirical data very well. The statistical data are shown as black filled circles, while the red line describes the fitting result of Formula (1), whose three parameters are approximately 132.43, 25.83, and 2.74. The constants are estimated by least-squares fitting, and the goodness of fit is about R² ≈ 0.999.
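As a rough illustration of the procedure described above, the sketch below bins delays into 15-minute intervals and fits a shifted power law. It assumes the common SPL form p(l) = c (l + shift)^(-alpha), which may differ in notation from Formula (1), and it fits in log-log coordinates while scanning candidate shifts, which is one simple least-squares strategy and not necessarily the one the authors used. All names and the synthetic data are hypothetical.

```python
import math


def binned_pdf(delays, bin_width=15.0):
    """Empirical PDF: share of all flights per bin, divided by the bin width.

    Returns (bin_centre, density) pairs for non-empty bins of delay >= 0.
    """
    counts = {}
    for d in delays:
        if d >= 0:
            k = int(d // bin_width)
            counts[k] = counts.get(k, 0) + 1
    n = len(delays)
    return [((k + 0.5) * bin_width, c / (n * bin_width))
            for k, c in sorted(counts.items())]


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return a, b, 1.0 - ss_res / ss_tot


def fit_spl(points, shifts):
    """Fit p = c * (l + shift) ** (-alpha) by scanning candidate shifts.

    For a fixed shift the model is a straight line in log-log coordinates,
    so (c, alpha) follow from a linear fit; the shift with the highest R^2
    is kept. Returns (c, shift, alpha, r_squared).
    """
    best = None
    for s in shifts:
        xs = [math.log(l + s) for l, _ in points]
        ys = [math.log(p) for _, p in points]
        a, b, r2 = linear_fit(xs, ys)
        if best is None or r2 > best[3]:
            best = (math.exp(a), s, -b, r2)
    return best


# Synthetic check: points generated from a known SPL should be recovered.
true_c, true_shift, true_alpha = 100.0, 25.0, 2.5
points = [(l, true_c * (l + true_shift) ** (-true_alpha))
          for l in range(15, 600, 15)]
c, shift, alpha, r2 = fit_spl(points, shifts=range(0, 61, 5))
print(c, shift, alpha, r2)
```

Setting the shift to zero reduces the same routine to a plain PL fit, where the exponent is the negative slope of the log-log line.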
To explore the internal mechanism of flight departure delay, we first investigate the factors causing flight delays. As noted above, delay factors fall into five categories. We consider that these five kinds of factors can be roughly divided into two sorts: the propagation factor (PF), i.e., category (1) aircraft arriving late, and the nonpropagation factors (NPFs), which include the other four. Flight delays caused by NPFs are more accidental, while delay propagation has more direct relevance. Delay propagation occurs when late arrivals at an airport cause late departures, which in turn cause late arrivals at the destination airports. In general, the air traffic controller will set an appropriate turnaround buffer time to prevent propagation delay when formulating flight planning [7], although this method reduces revenue-making flight time and incurs schedule time costs. From the following statistical results, we find that the current measure of setting buffer time does not play a prominent role.
Actually, a key challenge in exploring the internal mechanism of flight delay is extracting effective information from the raw data, because the existing data do not provide direct information to distinguish between the different types of delay factors [23]. The other reason is that a flight delay may be attributable not merely to the late arrival of the flight immediately preceding it, but also to one or more other factors (NPFs). In order to quantitatively study the propagation delay and simplify the cause-explanation of late arrival in the present work, we consider that a delayed flight for which the time between the last actual arrival and the current scheduled departure is less than a given threshold is attributed to PF. The scheduled turnaround time consists of two portions, namely the scheduled buffer time and the standard ground service time [31]. For different types of aircraft, the required standard ground service time is about 30-50 minutes (generally speaking, the larger the passenger capacity of the aircraft, the longer the necessary ground service time). That means that if the time between the last actual arrival and the current scheduled departure is less than 30-50 minutes, the delay can be attributed to propagation.
To explore the impact of PF on the statistical law of the departure delay, we remove the departure delays caused by PF from the raw data. Since the data we collected do not include the passenger capacity of different aircraft, we plot in Figure 2 the departure delay distribution (with the delayed flights caused by PF removed), setting the threshold to 30, 40, and 50 minutes for all aircraft.
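The attribution rule described above can be sketched as a small predicate. This is an illustrative reading of the rule, not the authors' implementation; the function name, argument names, and example times are assumptions, and the 15-minute delay definition is the FAA one quoted earlier.

```python
from datetime import datetime, timedelta


def attributed_to_propagation(prev_actual_arrival, sched_departure,
                              actual_departure, threshold_min=30,
                              delay_min=15):
    """Return True when a delayed flight is attributed to PF.

    A flight counts as PF-caused when it is delayed and the gap between the
    aircraft's previous actual arrival and this flight's scheduled departure
    is shorter than the ground-service threshold (30, 40, or 50 minutes).
    """
    delayed = actual_departure - sched_departure >= timedelta(minutes=delay_min)
    gap = sched_departure - prev_actual_arrival
    return delayed and gap < timedelta(minutes=threshold_min)


# Hypothetical example: both flights leave 40 minutes late.
sched = datetime(2017, 7, 1, 10, 0)
actual = datetime(2017, 7, 1, 10, 40)
late_inbound = datetime(2017, 7, 1, 9, 50)   # only 10 minutes on the ground
early_inbound = datetime(2017, 7, 1, 9, 0)   # a full hour on the ground
print(attributed_to_propagation(late_inbound, sched, actual))
print(attributed_to_propagation(early_inbound, sched, actual))
```

In this reading, the first flight is attributed to PF because the inbound aircraft arrived too late for normal ground service, while the second is left among the NPF-caused delays.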
It exhibits a PL distribution instead of an SPL distribution, given by Formula (2).
number of delayed flights caused by PF, and the fewer the number of delayed flights caused by NPFs. On the other hand, the smaller the threshold value, the better the fit of the curve using the PL function.
In the statistical process, we use different thresholds to obtain PL distributions, which shows that the distribution of flight delay caused by NPFs does exhibit the characteristics of a PL distribution. To understand the origin of this observed PL distribution, we have to recognize the airport runway restrictions and the take-off queue size as significant causal factors that affect the actual departure time [32]. When one flight is delayed by NPFs, such as extreme weather, the flights behind it at the same airport are usually delayed too. When the emergency is over and operations return to normal, the take-off of the waiting aircraft is a queuing process. Therefore, the distribution characteristic shown in Figure 2 can be regarded as the consequence of a decision-based queuing process [17,33,34]: when some perceived priority is executed, the waiting time of planes for take-off shows the characteristic of PL, with most flights taking off rapidly, whereas a few experience very long waiting times. Therefore, the SPL distribution of flight departure delay is considered to result from the aircraft queue for take-off due to airport congestion and the propagation delay caused by late-arriving aircraft.
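To make the queuing argument concrete, the following toy simulation implements a priority queue in which the highest-priority waiting item is usually served first and is then replaced by a new arrival. It is a sketch under stated assumptions: the queue length, the probability of serving by priority, the uniform priorities, and the seed are all illustrative choices, and the model is a generic priority-queue process rather than a calibrated model of any airport.

```python
import random
from statistics import median


def simulate_priority_queue(steps, list_size=2, p_priority=0.999, seed=0):
    """Simulate waiting times in a fixed-length priority queue.

    At each step the highest-priority item is served with probability
    p_priority (otherwise a random item is served), and a new item with a
    random priority takes its place. Returns the list of waiting times.
    """
    rng = random.Random(seed)
    # Each waiting item is stored as (priority, arrival_step).
    queue = [(rng.random(), 0) for _ in range(list_size)]
    waits = []
    for t in range(1, steps + 1):
        if rng.random() < p_priority:
            idx = max(range(list_size), key=lambda i: queue[i][0])
        else:
            idx = rng.randrange(list_size)
        waits.append(t - queue[idx][1])
        queue[idx] = (rng.random(), t)  # a new item joins the queue
    return waits


waits = simulate_priority_queue(steps=20000)
print("median wait:", median(waits), "max wait:", max(waits))
```

Most items are served one step after they arrive, while a low-priority item can be overtaken many times, which is the qualitative heavy-tail behavior the text attributes to the take-off queue.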

Method
According to the above mechanism of flight delay, we can deal with flight delay from two aspects, namely airport congestion and propagation delay. The most effective way to reduce queuing time is to build multiple airport runways. However, this is a huge investment. From the perspective of statistics, a new method is developed to improve flight on-time performance. This method consists of two stages: (1) data statistics and summarization; (2) implementation steps.

Data Statistics and Summarization.
Due to airport congestion, delay originating from these airports spreads to downstream flights. So the operation performance of airports plays a vital role in the punctuality ratio of airlines. The data that we collected contain not only the times of
where the prefactor of Formula (2) is a constant and the exponent is a constant parameter of the distribution known as the scaling parameter. We obtain both values by least-squares fitting (after taking the logarithm of both sides, the exponent becomes the slope of the line). As shown in Figure 2, the main part of each distribution fits Formula (2) well, while the tail of the distributions (larger delays) is not captured by it. However, the goodness of fit R² for all distributions with different thresholds is greater than 0.99. From the data, we find that the number of delayed flights with delay l greater than 500 minutes is about 400-500, accounting for only 0.085-0.106% of the total number of flights. The fact that the scaling spans close to two orders of magnitude, from minutes to hours, indicates that most flight delays (70.51% for DL) are shorter than one hour. As the threshold value increases, the value of the distribution function becomes smaller. Obviously, the longer the necessary ground service time, the more the
Figure 3: Log-log plots of the departure delay distribution for the 10 busy airports with the highest delay ratio.
and, where required, catering and cabin cleaning procedures.
This measure is associated with airport operational efficiency and is used to improve the planning of flight connectivity and the robustness of the flight plan. In our method, we modify the existing flight schedule and redistribute part of the scheduled buffer time without changing the total slack time of the day or the total daily number of flights.
In order to reset the slack properly, we count the scheduled turnaround time for all flights and the average scheduled turnaround time for all aircraft operated by DL in the second half of 2017. Since there are typically no flights between 0 and 6 o'clock, we do not take this longer overnight interval into account when calculating the scheduled turnaround time. On the other hand, records available in BTS are not always complete for all aircraft.
To improve the quality of the statistics, we take 100 flights within 6 months as the filtering threshold, which means that aircraft with fewer than 100 take-off records are not counted in our statistics. After filtering, a total of 728 aircraft are counted, and the total number of turnarounds for these aircraft is 347073. The scheduling of aircraft turnarounds is a consequence of both the operational policies and the scheduling strategies of an airline. For different airlines, the average scheduled turnaround time is quite different: Southwest Airlines in the USA shows a low average aircraft turn time of 17 minutes and United Airlines an average turn time of 50 minutes [36]. According to Ref. [36], whose database covers September 1987 to May 1994, Delta Air Lines shows an average turnaround time of 46.7 minutes. According to our statistics, the average scheduled turnaround time of all flights is about 75.3 minutes and the standard deviation is about 92.9 minutes. This shows that the scheduled turnaround time of flights has increased greatly, which is particularly advantageous to our method of redistributing part of the scheduled buffer time. The number distribution of the scheduled turnaround time is shown in Figure 4(a); almost every flight's scheduled turnaround time is longer than 30 minutes. So we set the minimum necessary turnaround time to 30 minutes in our method.
departure and arrival, but also the carrier, the tail number of the aircraft, and the airports of departure and arrival. Next, we assess the operation performance of each airport and compute the scheduled turnaround time for all flights and the average scheduled turnaround time for every aircraft.
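The turnaround statistics can be sketched as follows. This is a minimal illustration under assumptions: flights are given as (tail number, scheduled departure, scheduled arrival) tuples, an overnight break is approximated as an arrival and next departure falling on different calendar dates, time zones are ignored, and the tail numbers are hypothetical. The 100-flight filter is the one stated in the text, lowered to 2 in the toy example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def scheduled_turnarounds(flights, min_flights=100):
    """Map tail number -> list of scheduled turnaround times in minutes.

    `flights` is an iterable of (tail_number, sched_dep, sched_arr) tuples.
    Aircraft with fewer than `min_flights` records are dropped.
    """
    by_tail = defaultdict(list)
    for tail, dep, arr in flights:
        by_tail[tail].append((dep, arr))
    result = {}
    for tail, legs in by_tail.items():
        if len(legs) < min_flights:
            continue  # too few records for this aircraft
        legs.sort()
        gaps = []
        for (_, arr), (next_dep, _) in zip(legs, legs[1:]):
            if arr.date() != next_dep.date():
                continue  # assumed overnight break, not a turnaround
            gaps.append((next_dep - arr).total_seconds() / 60.0)
        result[tail] = gaps
    return result


def t(day, hour, minute=0):
    """Helper building a timestamp in July 2017 (illustrative dates)."""
    return datetime(2017, 7, day, hour, minute)


flights = [
    ("TAIL_A", t(1, 8), t(1, 10)),
    ("TAIL_A", t(1, 11), t(1, 13)),
    ("TAIL_A", t(1, 14, 30), t(1, 16)),
    ("TAIL_A", t(2, 7), t(2, 9)),      # next morning: overnight gap skipped
    ("TAIL_B", t(1, 9), t(1, 11)),     # single record: filtered out
]
turnarounds = scheduled_turnarounds(flights, min_flights=2)
print({tail: mean(gaps) for tail, gaps in turnarounds.items()})
```

Averaging the per-aircraft lists gives the average scheduled turnaround time per aircraft, and pooling all gaps gives the per-flight distribution.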
While recent studies on air traffic delays focus primarily on the operation performance of different airlines [22,35], we are interested in the operation performance of different airports. Because airports are located in different places, their punctuality ratios differ considerably owing to weather conditions and other regional factors. From our statistical results, we find that 44 airports had more than 2,000 departing flights in the second half of 2017; the 10 of these 44 airports with the highest delay ratios are reported in Table 2. We can see that SEA has more delayed flights than BOS, but its total delay is smaller. That means flight delays at BOS are mostly longer than at SEA, so delay at BOS will have a greater impact on subsequent flights.
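The airport selection described above amounts to a filter followed by a ranking. The sketch below is illustrative only: the input layout, the function name, and the three-letter codes in the toy data are hypothetical, while the 15-minute delay definition, the 2,000-departure cutoff, and the top-10 selection follow the text.

```python
from collections import Counter


def rank_airports(flights, min_departures=2000, top_n=10, delay_min=15):
    """Rank busy airports by departure delay ratio.

    `flights` is an iterable of (origin_airport, departure_delay_minutes).
    Only airports with more than `min_departures` departures are ranked.
    Returns up to `top_n` (airport, delay_ratio) pairs, highest ratio first.
    """
    departures, delayed = Counter(), Counter()
    for origin, delay in flights:
        departures[origin] += 1
        if delay >= delay_min:
            delayed[origin] += 1
    ratios = {airport: delayed[airport] / n
              for airport, n in departures.items() if n > min_departures}
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy data with hypothetical airport codes and a lowered cutoff.
toy = [("AAA", 20), ("AAA", 0), ("AAA", 30),
       ("BBB", 0), ("BBB", 0), ("BBB", 16),
       ("CCC", 0), ("CCC", 0), ("CCC", 0), ("CCC", 0),
       ("DDD", 50), ("DDD", 60)]
ranking = rank_airports(toy, min_departures=2, top_n=2)
print(ranking)
```

Airport DDD has the highest delay ratio in the toy data but is excluded because it does not exceed the departure cutoff, which mirrors how the text restricts the ranking to busy airports.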
Initial delays affect downstream flights, but small delays do not have much impact owing to the scheduled turnaround buffer time. It is therefore necessary to study the delay distribution of the various airports, not only the delay ratio. In Figure 3, we compare the flight departure delay distributions of the 10 airports. From Table 2, we know that JFK, LGA, LAX, and SEA concentrate a large part of Delta Air Lines' flights, but the characteristics of their delay distributions are not very different from each other. The shape of the delay distribution is similar across airports; a small difference can be observed only for EWR. EWR shows a bias toward larger delays and may have a greater impact on subsequent flights than other airports. Insufficient scheduled turnaround time is another important factor causing propagation delay. The scheduled turnaround time is the time spent by an aircraft on the ground from scheduled arrival at the gate to scheduled departure from it, which is used for the aircraft to absorb the previous flight's delay and complete full off-loading and loading, maintenance of the aircraft
Specific measures are shown in Figure 5, whose symbols denote the scheduled departure time, scheduled arrival time, actual departure time, actual arrival time, scheduled buffer time, standard ground service time, and scheduled turnaround time.
Suppose one aircraft flies from airport 1 to airport 4. If airport 1 is one of the 10 busiest airports in the previous statistics, then we postpone the scheduled departure time of flight 2, and the amount of postponement is equal to the scheduled turnaround time between flight 2 and flight 3 minus the necessary turnaround time. All in all, if the time interval between the actual arrival time of flight 1 and the scheduled departure time of flight 2 is larger than the required ground service time, flight 2 will take off on time.
In Figure 4(b), we can see that almost every aircraft's average scheduled turnaround time is about 50-140 minutes. If we set the necessary turnaround time too large, the change to the flight plan is small, and the effect of restraining delay propagation will not be obvious.
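The adjustment rule for flight 2 can be sketched as below. This is a hedged reading of the rule, not the authors' implementation: function and argument names are hypothetical, the remaining delay is computed under the paper's simplifying assumption that no new delay arises after the schedule change, and the set of busy airports is a subset of the 10 airports named in the paper, used only for illustration.

```python
def departure_shift(sched_turnaround_next_min, necessary_turnaround_min=30):
    """Minutes by which the scheduled departure is postponed.

    Equal to the scheduled turnaround before the next flight minus the
    necessary turnaround time, and never negative.
    """
    return max(0.0, sched_turnaround_next_min - necessary_turnaround_min)


def adjusted_delay(original_delay_min, prev_origin, busy_airports,
                   sched_turnaround_next_min, necessary_turnaround_min=30):
    """Departure delay measured against the shifted schedule.

    The shift applies only when the previous flight of the same aircraft
    departed from one of the busy airports. Assumes the aircraft is ready
    at the same time as before, so the shift is absorbed one for one.
    """
    if prev_origin not in busy_airports:
        return original_delay_min
    shift = departure_shift(sched_turnaround_next_min, necessary_turnaround_min)
    return max(0.0, original_delay_min - shift)


BUSY = {"EWR", "JFK"}  # illustrative subset of the airports in the paper

# Flight 2 was 40 minutes late; 75 minutes are scheduled before flight 3.
print(adjusted_delay(40, "EWR", BUSY, 75, necessary_turnaround_min=30))
print(adjusted_delay(40, "EWR", BUSY, 75, necessary_turnaround_min=50))
print(adjusted_delay(40, "ATL", BUSY, 75, necessary_turnaround_min=30))
```

With a 30-minute necessary turnaround the 45-minute shift absorbs the whole delay, whereas with 50 minutes only 25 minutes are absorbed and the flight still counts as delayed under the 15-minute definition.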

Implementation Steps.
The overall approach is based on the flight delay mechanism in which newly formed delays usually occur at busy airports due to airport/airspace capacity constraints and spread to downstream flights operated by the same aircraft. From our data, it is possible to trace the propagation of delay from airport to airport: if a particular aircraft is scheduled to fly from airport A to airport B and then to airport C and departs from A with a long delay, part or all of that delay will be propagated downstream and result in departure delay at B and, possibly, subsequently at C. In this section, we develop a new method for formulating flight planning by using the previous statistical results.
Since the newly formed delay is hard to predict when formulating the flight planning, we simply assume that flights departing from the 10 highest-delay-ratio airports mentioned above will experience this kind of delay. We cannot reduce the newly formed delay by optimizing flight plans, but we can mitigate the propagation effects of the previous flight's delay by postponing the scheduled departure time of subsequent flights. On the other hand, we have to keep the scheduled departure time of the next flight unchanged and reserve enough turnaround time (greater than the necessary turnaround time) for the next flight. This means that we can postpone the scheduled departure time of the current flight, and the maximum postponement is equal to the scheduled buffer time between the current and the next flight operated by the same aircraft. According to our statistical results, the scheduled turnaround time varies greatly between different flights or different aircraft, but the required standard ground service time is about 30-50 minutes, so we set the necessary turnaround time to 30, 40, and 50 minutes as mentioned earlier.
we eliminated the effects of PF, the distribution of departure delay exhibits an obvious PL feature instead of SPL. The queue model, which executes the highest-priority item on its list, helps to understand the mechanism of the PL feature. We consider that the SPL distribution of flight departure delay results from the aircraft queue for take-off due to airport congestion and the propagation delay caused by late-arriving aircraft.
Based on the above mechanism, we develop a specific measure to mitigate propagation delay without increasing operational costs. Specifically, if one aircraft takes off from an airport with a higher delay ratio, we postpone the scheduled departure time of the next flight operated by the same aircraft by an amount equal to the scheduled buffer time between the next flight and the subsequent flight. The results show that our approach is effective in reducing flight delay, although the improvement is not significant for flights with larger delays.
In addition, our approach is based on the predictability of propagation delays and mathematical induction, which provides a new way to optimize flight schedules. Although this is by no means intended as an exhaustive study, it nonetheless provides a starting point to motivate future research, such as more accurate forecasting of newly formed delays and finding the optimal amount of slack to redistribute.
Data Availability
The data used to support the findings of this study can be found on the website of the Bureau of Transportation Statistics (BTS) at http://www.bts.gov.

Conflicts of Interest
The authors declare that there is no conflict of interest regarding the publication of this paper.

Empirical Results
In order to verify the effectiveness and practicability of our method, we collect an additional six months of flight operation data from the first half of 2018. We use the method of this article to adjust the flight planning and compare the number of delayed flights before and after adjustment for the first six months of 2018. From the previous statistical results, we know that the 10 busiest airports with the highest delay ratios are SFO, EWR, JFK, LGA, MIA, PBI, ORD, BOS, LAX, and SEA. We assume that if a flight departs from one of these 10 airports, it will generate newly formed delay and cause the flight immediately after it on the same aircraft to be delayed as well. However, strictly speaking, the latter flight's delay may be attributable not merely to the late arrival of the flight immediately preceding it, but also to one or more other factors. In other words, the actual departure delay is sometimes hard to predict when we change the flight plan by our method, while the delay caused only by PF is not. Therefore, to simplify the prediction of current flight delays in the present work, we do not take into account the newly formed delay when the previous flight by the same aircraft departed from one of the 10 highest-delay-ratio airports. The six-month data comprise 463322 flight operation records, with a total of 84828 flights departing from the 10 highest-delay-ratio airports. Since there are typically no flights between 0 and 6 o'clock, delay on the last flight of each day does not propagate to the first flight of the next day. Therefore, leaving aside delay propagation from the last flight of each day, we adjust the scheduled departure time for only 72902 flights instead of 84828. Comparing the results before and after adjustment, we find that the departure delay ratio dropped from 13.91% to 12.06%, 12.25%, and 12.39% with the necessary turnaround time equal to 30, 40, and 50 minutes, respectively. The change in the number of delayed flights in each delay interval is presented in Figure 6.
Obviously, the number of delayed flights has decreased in almost all delay intervals. The smaller the necessary turnaround time, the more the delay and the delay ratio are reduced. But we cannot set it too small in our method, because large aircraft require a relatively long turnaround time, and a small value does not correspond to actual operations. The other reason is that flight operation is full of uncertain factors, and the slack time is reserved to help deal with unexpected situations and improve the robustness of the flight plan. On the other hand, our method is quite effective in the case of short delays, but not in the case of long delays. This is due to the limited slack time reserved by the airline in formulating the flight plan. Many flights with small delays are able to take off on time after our measure, but flights with larger delays have their delay only slightly reduced.
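The reported figures can be put side by side with a short calculation. The numbers below are taken from the text; the variable names and the choice to express the change both in percentage points and in relative terms are illustrative.

```python
baseline = 13.91  # departure delay ratio before adjustment, in percent
after = {30: 12.06, 40: 12.25, 50: 12.39}  # necessary turnaround (min) -> ratio

# Relative reduction of the delay ratio for each necessary turnaround time.
relative = {minutes: 100.0 * (baseline - ratio) / baseline
            for minutes, ratio in after.items()}

for minutes, ratio in after.items():
    drop_pp = baseline - ratio
    print(f"{minutes} min: -{drop_pp:.2f} pp ({relative[minutes]:.1f}% relative)")

# Share of departures from the 10 airports whose successor was adjusted.
adjusted_share = 100.0 * 72902 / 84828
print(f"adjusted share: {adjusted_share:.1f}%")
```

The absolute drops are 1.85, 1.66, and 1.52 percentage points, which correspond to relative reductions of roughly 13.3%, 11.9%, and 10.9%, and about 85.9% of the departures from the 10 airports are followed by an adjusted flight.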

Conclusion
By data mining and statistical analysis, we study the distribution characteristics and inherent mechanism of flight departure delay for DL. From the statistical results, we find that the distribution of flight departure delay follows an SPL, and when