Short-Term Arrival Delay Time Prediction in Freight Rail Operations Using Data-Driven Models

Despite rail’s growing popularity as a mode of freight transportation due to its role in intermodal transportation and numerous economic and environmental benefits, optimizing all aspects of rail infrastructure use remains a significant challenge. To address this issue, various methods for developing train disruption prediction models have been used. However, these models continue to struggle with accurately predicting short-term arrival delay times, as well as identifying the causes of delays and the expected impact on operations. The lack of information available to operators makes it difficult for them to effectively mitigate the effects of disruptions. The goal of this study is to investigate a set of data-driven models for the short-term prediction of arrival delay time, using data from the National Railway Company of Luxembourg on freight rail operations between Bettembourg (Luxembourg) and nine other terminal stations across the EU, and then to investigate the effects of the features associated with the arrival delay time. For our dataset, the lightGBM model outperformed the other models in predicting the arrival delay time in freight rail operations, with departure delay time, trip distance, and train composition appearing to be the most influential features in the short term. The National Railway Company of Luxembourg can use the short-term prediction model developed in this study as a decision-support system. For example, knowing a train’s arrival delay time allows operators to estimate future operational time, providing more support to reduce disruptions and subsequent operational delays via a simple web service.


I. INTRODUCTION
The freight transportation industry is constantly changing, and rail transportation is becoming a more popular option due to its advantages in terms of operational costs, efficiency, reliability, emissions, and safety. This trend has resulted in the gradual integration of rail into intermodal transportation, with public agencies encouraging a shift away from other alternatives [1]. As rail intermodal operations become more important for the efficiency and dependability of the freight transport industry, optimizing all aspects of rail infrastructure is critical. This includes ensuring that the infrastructure is well-maintained, properly managed, and capable of meeting the current transportation system's demands. Furthermore, using technology and data analytics to optimize rail infrastructure and improve overall performance of the rail transport sector is critical [2], [3].
The associate editor coordinating the review of this manuscript and approving it for publication was Orazio Gambino.
However, due to the complexity of rail networks and the large volume of rolling stock operating on them, train delays are a significant issue that must be addressed. Delays are divided into two types: those caused by the unpredictable time it takes to prepare the train for departure and those caused by variations in the train's performance during its journey [4], [5], [6]. Arrival delay prediction, which involves calculating the difference between the actual arrival time and the scheduled arrival time for a trip between two stations, is critical for rail risk management. When there are disruptions, train dispatchers must assess the impact on the overall schedule and minimize losses by adjusting operations to reduce the chain of delays that could impact overall system operation [7], [8].
Event-based models, which involve procedures with departure, travel, and arrival events, are a common approach for forecasting disruptions and the resulting operational delays in railway operations. Data-driven models, on the other hand, have shown promise in handling and recognizing relationships between nonlinear, multidimensional, and time-based data. These models have been successfully used to uncover interrelationships between various features in rail operations [8], [9], [10], [11], [12], [13], [14], and previous studies have used them to forecast rail operation delays caused by disruptions [15], [16], [17], [18], [19]. These models, however, have failed to predict short-term arrival delay times, as well as the underlying factors that caused the delay and the expected impact on operations. To overcome the limitations of previous studies, the present research has three main objectives:
• Evaluate and compare the effectiveness of various data-driven models in predicting short-term arrival delay times in freight rail operations.
• Determine the significance of features associated with arrival delay time.
• Create a Short-term Decision Support System (STDSS) to evaluate operational interventions aimed at reducing disruptions and their associated delays in real-time freight operations.
The remainder of this article is structured as follows. Section II provides a review of previous studies that have examined methods for modelling delays in rail operations, as well as data-driven models. In Section III, the problem is described in detail. Section IV describes the case study and methodology used to implement data-driven models for predicting short-term arrival delay times in freight rail operations, as well as an examination of the significance of the characteristics associated with arrival delay time. Section V includes the results and discussion of the study. Finally, Section VI summarizes the research's key findings and suggests future research directions.

II. LITERATURE REVIEW
Numerous studies have been conducted to investigate the issue of delay propagation caused by disruptions in rail operations. Barta et al. proposed a Markov chain-based model to investigate the spread of delays among trains that connect intermodal terminals, which can be caused by unforeseen events like traffic congestion or unscheduled maintenance [20]. In another study, a Bayesian networks approach was proposed to address this issue, where evidence of events was used to reduce uncertainty over time for other events [21]. Additionally, researchers assessed the effectiveness of various timetables, including the shuttle timetable, in allowing operations to continue despite disruptions [22], [23].
Wen et al. investigated data-driven methods for train dispatching in passenger and freight rail operations, finding that ML methods are very promising because of the rich data that can be obtained from train operations. For this reason, numerous studies have used data-driven models such as decision trees, support vector machines, random forests, and artificial neural networks to predict and investigate rail operation delays. These studies' findings have been mixed, with some revealing a strong relationship between train delays and dwell times and others revealing a weaker relationship between running times and departure delays [24], [25]. Peters et al., for example, built a neural network based on rules between dependent trains to forecast delays for real-time delay monitoring [26], whereas Pongnumkul et al. used the moving average of historical travel times and the travel times of the k-nearest neighbors (k-NN) to predict passenger train arrival times [27]. In another interesting study in this field, Oneto et al. created a dynamic data-driven train delay prediction system for large-scale railway networks using weather data from national services [28].
Several data-driven approaches have been used to forecast rail operation delays caused by disruptions using regression models. For example, Kecman and Goverde created a decision tree model and a least-trimmed squares robust linear regression model to predict train running and dwell times [15]. Li et al. used linear regression and K-Nearest Neighbor algorithms to predict the duration of station stops in a different study [16]. Barbour et al. used a support vector regression model to forecast estimated arrival times for freight trains based on train, network, and traffic congestion features [29]. Meanwhile, other researchers [17], [18] used data-driven methods to investigate the characteristics of rail service interruptions and the resulting delays in High-Speed Railway Systems. Minbashi et al. also proposed a machine learning-based framework to improve predictability in freight rail operations by focusing on yard arrivals and departures and employing a random forest algorithm [30].
While previous research has studied the problem of predicting train disruptions, our study addresses a significant gap in the literature. Specifically, we focus on the short-term prediction of arrival delay times in freight rail operations and identify the root causes of delay and their expected impact on operations. Previous studies have struggled to predict the arrival delay time in the short term, particularly after the train departs from the previous control station. Our study develops a consistent data-driven model using supervised Machine Learning (ML) that surpasses other models in predicting the arrival delay time within this context. Additionally, we use the Shapley Additive exPlanation method (SHAP) to thoroughly analyze the impact of features such as departure delay time, trip distance, and train composition on arrival delay time. Our findings allow us to develop an STDSS that can evaluate operational interventions aimed at reducing delays in freight rail operations. In general, our research adds to the existing literature by addressing a specific gap in short-term prediction and identifying the root causes of delay in freight rail operations, with the aim of creating this STDSS for a specific company.
In a previous study [31], some of the authors of this research used a binary classification approach to predict whether a train will be delayed or not in the long-term (days) and to identify the features that cause those delays. However, in this current study, the authors focus on developing a short-term prediction model for real-time decision making using a regression approach. TABLE 1 summarizes recent and representative studies on railway delay propagation in chronological order. It discusses the type of railway investigated as well as the major contributions.

III. PROBLEM DESCRIPTION
Time and distance charts are a common and standardized method for evaluating train performance and detecting intra-journey schedule deviations. FIGURE 1 shows an example with the cumulative distance traveled on the vertical axis and the cumulative time on the horizontal axis. Such a chart makes it easy to identify the critical points where delays accumulate and to compare the train's performance with the schedule at any point during the journey, which provides a comprehensive view of the journey [46].
In FIGURE 1, the orange line represents the scheduled time-space trajectory and the blue line represents the train's actual trajectory. The goal of this research is to predict the arrival delay time after the train has departed from the previous control station (so that the departure delay time is known), using data-driven models to identify the data behavior responsible for the variety of intra-journey possibilities. Equations (1)-(5) relate these quantities, with the arrival delay time given by the actual arrival time minus the scheduled arrival time (the term in parentheses):

$$\text{Arr Delay} = \text{Act Arr} - \left(\text{Sch Time} + \text{Act Dep} - \text{Dep Delay}\right) \qquad (5)$$

Because this problem deals with the short-term prediction of the arrival delay time once the train has departed from the previous station, the arrival delay time is a function of known features, except for the actual arrival time, for which data-driven models are implemented to predict its value.
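As a worked illustration of the relation between arrival delay, departure delay, and scheduled times described above, the following minimal sketch computes the arrival delay for one hypothetical trip segment. It assumes that the departure delay is the actual minus the scheduled departure time and that `sch_time` is the scheduled running time of the segment; the function name and the timestamps are illustrative and not taken from the study.

```python
from datetime import datetime, timedelta

def arrival_delay(act_arr, act_dep, dep_delay, sch_time):
    """Arrival delay of one segment.

    Assumptions (illustrative): dep_delay = actual - scheduled departure,
    and sch_time is the scheduled running time between the two stations.
    """
    sch_dep = act_dep - dep_delay   # scheduled departure time
    sch_arr = sch_dep + sch_time    # scheduled arrival time
    return act_arr - sch_arr        # actual minus scheduled arrival

# Hypothetical segment: scheduled 08:00 -> 10:30, departed 12 minutes late,
# actually arrived at 10:50.
act_dep = datetime(2021, 3, 1, 8, 12)
act_arr = datetime(2021, 3, 1, 10, 50)
delay = arrival_delay(act_arr, act_dep,
                      dep_delay=timedelta(minutes=12),
                      sch_time=timedelta(hours=2, minutes=30))
print(delay.total_seconds() / 60)  # arrival delay in minutes
```

In the prediction setting, every argument except `act_arr` is known at departure, so predicting the arrival delay is equivalent to predicting the actual arrival time.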
Data for train journey segments can be sourced from a schedule of specific waypoints defined by the National Railway Company of Luxembourg (Société Nationale des chemins de fer Luxembourgeois or CFL).
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
To better control the delays accumulated along the train's route, a short-term forecast must be performed each time the train passes through an intermediate station acting as a "checkpoint". These forecasts underpin an STDSS whose accurate real-time predictions allow for the implementation of strategies such as train rescheduling, reordering, and rerouting to optimize freight rail operations.

IV. METHODS AND PROCEDURES
In this section, we describe the process of creating a short-term predictive data-driven model to predict the arrival delay time of a train that has already departed from the previous control station. FIGURE 2 depicts the steps in the methodology used in this study, from data collection to the development of the predictive model for further analysis. All the steps depicted are discussed in depth below.

A. DATA COLLECTION
The study used data from the National Railway Company of Luxembourg (CFL Multimodal), collected over a 17-month period, from November 2019 to April 2021. The datasets contain information on the freight rail operations conducted during this period between Bettembourg (Luxembourg) and nine other stations within the EU (Boulou, Champigneulles and Lyon in France; Zeebrugge and Antwerp in Belgium; Kiel and Rostock in Germany; Poznan in Poland; and Trieste in Italy). The data provided by CFL Multimodal contain a wide variety of attributes related to trains, wagons, stations, and operations, as shown in TABLE 2. The datasets were meticulously analyzed and combined to ensure that all freight rail operations along the various routes depicted in FIGURE 3 were considered.
To ensure high quality, the dataset used in this study underwent various data pre-processing procedures such as feature engineering, data cleaning, and data transformation. Section IV-B describes the resulting dataset in detail, including its size, descriptive statistics, and the data pre-processing procedures used. The data-driven models were then developed and trained using this refined dataset, as described in Section IV-C, with the goal of predicting arrival delay times in freight rail operations and thereby providing insights that can be used to improve the reliability of freight rail transport.
VOLUME 11, 2023

B. DATA PROCESSING
Following the organization and combination of the datasets listed in TABLE 2, a single dataset was processed so that each row represents a trip between a pair of control stations, which are the stations (junctions) that the train passes through between the starting and destination stations, as shown in FIGURE 3.
Data imputation techniques were used to fill any missing values in the merged dataset. The median value was used to fill in numerical features, and the most common class was used to fill in categorical features. Furthermore, by utilizing the dataset's available features such as train weight, train length, and train wagon count, we used feature engineering to create two new features to improve the predictive capability of our models: train weight per length and train weight per wagon. Additionally, we used one-hot encoding to convert categorical features into dummy variables and the z-score standardization method to rescale numerical features to ensure that the data is on the same scale [47], [48]. The interquartile range method was used to remove outliers from the numerical features as well.
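The pre-processing steps described above can be sketched with the Python standard library alone. This is a minimal illustration rather than the pipeline used in the study: the records, the `traction` feature, the order of the steps, and the conventional 1.5 multiplier for the interquartile range rule are all assumptions.

```python
import statistics

def impute_median(values):
    """Fill missing numerical values (None) with the median of the observed ones."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def impute_mode(values):
    """Fill missing categorical values (None) with the most common class."""
    observed = [v for v in values if v is not None]
    top = statistics.mode(observed)
    return [top if v is None else v for v in values]

def one_hot(values):
    """One dummy variable per category."""
    categories = sorted(set(values))
    return [{c: int(v == c) for c in categories} for v in values]

def zscore(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sd = statistics.fmean(values), statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def iqr_inliers(values, k=1.5):
    """True where a value lies within [Q1 - k*IQR, Q3 + k*IQR]; k=1.5 is assumed."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [low <= v <= high for v in values]

# Hypothetical records (tonnes, metres, wagon counts, traction type).
weight = impute_median([1200.0, None, 1500.0, 1350.0, 9000.0, 1280.0])
length = [500.0, 520.0, 600.0, 540.0, 560.0, 510.0]
wagons = [20, 22, 25, 23, 24, 21]
traction = impute_mode(["electric", "diesel", None, "electric", "electric", "diesel"])

# Engineered features: train weight per length and train weight per wagon.
weight_per_length = [w / l for w, l in zip(weight, length)]
weight_per_wagon = [w / n for w, n in zip(weight, wagons)]

keep = iqr_inliers(weight)                                  # flag outliers
scaled_length = zscore([l for l, ok in zip(length, keep) if ok])
dummies = one_hot(traction)
```

In practice the same operations are usually applied column by column to the whole merged dataset, with the imputation and scaling statistics estimated on the training data.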
As part of our correlation analysis, we used a 0.7 Pearson correlation coefficient threshold to eliminate any features that were highly correlated with one another. This threshold was chosen in accordance with standard data analysis practice, which states that a correlation coefficient greater than 0.7 indicates a strong linear relationship between two variables [49]. We were able to reduce redundancy in our dataset and improve model performance by removing these highly correlated features.
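A correlation filter of this kind can be sketched as follows. The text does not say which feature of a highly correlated pair was removed, so this illustration assumes a greedy rule that keeps the first feature encountered; the feature names and values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def drop_correlated(columns, threshold=0.7):
    """Keep a feature only if |r| with every already kept feature is <= threshold.

    Assumption: of two highly correlated features, the one listed first is kept.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical numerical features for five trips.
columns = {
    "train_weight": [1000, 1200, 1400, 1600, 1800],
    "train_length": [400, 480, 550, 640, 700],   # nearly proportional to weight
    "distance": [300, 120, 500, 80, 260],
}
print(drop_correlated(columns))
```

Here `train_length` is dropped because it is almost perfectly correlated with `train_weight`, while `distance` is kept.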
The goal of this study was to predict the arrival delay time, which is the numerical difference between the actual arrival time and the scheduled arrival time for trips between two control stations (as explained in Section III, equations (1)-(5)). A regression approach was chosen as the best data-driven approach because the target feature is a numeric value. Following extensive data pre-processing and feature engineering, a total of 10,265 trips between control stations were identified for analysis.

C. DATA-DRIVEN MODELS
Predicting arrival delay times in rail operations is a difficult task due to the numerous factors that can affect train schedules. Machine Learning models are increasingly being used for this purpose because of their ability to effectively analyze large amounts of data and learn from it in order to make accurate predictions. Several studies have demonstrated the effectiveness of various machine learning models in predicting arrival delay times in rail systems, including linear regression, logistic regressions, k-nearest neighbors, random forests, gradient boosting machines, and artificial neural networks [39], [50], [51], [52], [53]. These models can consider a variety of factors, such as weather, passenger volume, and train speed, to provide more accurate predictions of arrival delays.
To effectively train and evaluate machine learning (ML) models for predicting arrival delay time, the original dataset was randomly divided into two sets, namely a training set and a testing set, with a 70% to 30% ratio [13]. It is worth noting that both the training and testing data are part of the data used in this study, and as such, they were subject to the same preprocessing and cleaning steps to ensure their consistency and quality. The proportions of independent input features and the target feature, which in this case is the arrival delay time, were the same in both subsets. In order to avoid bias in the results, it is also critical to ensure that the distribution of values for all independent features is similar for both groups.
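A random 70/30 hold-out split can be sketched as below. The seed, the function name, and the 1,000 placeholder rows are assumptions made for the illustration; comparing a summary statistic of the target in both subsets is one simple way to check that their distributions are similar.

```python
import random

def train_test_split(rows, test_size=0.3, seed=42):
    """Shuffle row indices with a fixed seed and hold out a share for testing."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = round(len(rows) * test_size)
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

# Hypothetical dataset: one (trip_id, arrival_delay_minutes) pair per row.
rng = random.Random(0)
rows = [(trip_id, rng.gauss(30, 15)) for trip_id in range(1000)]
train, test = train_test_split(rows)

# Simple consistency check: the mean target should be close in both subsets.
mean_train = sum(delay for _, delay in train) / len(train)
mean_test = sum(delay for _, delay in test) / len(test)
print(len(train), len(test), round(mean_train, 1), round(mean_test, 1))
```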
The arrival delay time was then predicted using a set of machine learning models that had previously been widely and efficiently applied to a variety of regression problems. These models are as follows:
• Linear regression is a machine learning algorithm that forecasts a numerically continuous output with a constant slope. This model is typically used to predict values within a continuous range rather than categorizing them into different classes [54].
• The K-nearest neighbors regressor is a non-parametric ML algorithm that approximates the relationship between independent features and continuous outcomes by averaging observations in the same neighborhood [55].
• The random forest regressor is a tree-based ensemble ML model that generates many regressors in parallel and aggregates their results, combining a sampling method and an ensemble approach to improve model building [54].
• The light gradient boosting machine (lightGBM) is an open-source framework developed by Microsoft for training gradient boosting models [56]. This is another tree-based ensemble ML model, but it works sequentially, with each subsequent model attempting to correct the errors of the previous one. As a result, each model improves ensemble performance [57].
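The sequential principle behind gradient boosting in the list above can be illustrated with a deliberately small sketch: each new weak learner (here a single-split regression stump on one feature) is fitted to the residuals of the current ensemble. This is not the lightGBM implementation, and the data, learning rate, and number of rounds are hypothetical.

```python
def fit_stump(x, residuals):
    """Least-squares regression stump: one threshold, one mean per side."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean

def boost(x, y, rounds=20, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    stumps = []
    pred = [base] * len(y)
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
    return lambda xi: base + learning_rate * sum(s(xi) for s in stumps)

def mse(model, x, y):
    return sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(y)

# Hypothetical departure and arrival delays (minutes).
dep_delay = [0, 5, 10, 20, 30, 45, 60]
arr_delay = [2, 4, 12, 18, 33, 41, 65]
weak = boost(dep_delay, arr_delay, rounds=1)
strong = boost(dep_delay, arr_delay, rounds=20)
print(mse(weak, dep_delay, arr_delay), mse(strong, dep_delay, arr_delay))
```

The training error shrinks as rounds are added, which is why boosting needs a validation step to guard against overfitting. A random forest, by contrast, would fit its trees independently and average them.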
The performance of the data-driven models was assessed using several metrics, including the Root Mean Squared Error (RMSE), the Coefficient of Determination (R2), the Mean Absolute Percentage Error (MAPE), and the Mean Absolute Error (MAE), which are commonly used metrics to assess the performance of regression machine learning models [52], [58], [59], [60], [61].
• R2 measures the proportion of variance in the dependent variable that is explained by the independent variable(s).
• MAPE measures the percentage difference between predicted and actual values. It is often used in forecasting and provides a measure of the average magnitude of the errors as a percentage of the actual values. A lower MAPE indicates a more accurate model.
Overall, these metrics are useful for evaluating the performance of regression machine learning models because they offer different perspectives on the model's prediction accuracy. Using multiple metrics can help ensure that the model performs well in various areas. The equations for calculating these metrics are shown in (6)-(9), where $y_i$ is the actual value of observation $i$ (target), $\hat{y}_i$ is the predicted value of observation $i$ (model's output), $\bar{y}$ is the average value of all observations, and $n$ is the number of observations. A good model will typically have a high R2 value as well as low RMSE, MAPE and MAE values.
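For reference, the four metrics can be computed with their standard textbook definitions as in the sketch below. The function names and sample values are illustrative. Note that MAPE is undefined when an actual value is zero, which can occur with delay data; how such cases were handled is not stated in the text.

```python
import math

def rmse(y, y_hat):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def mape(y, y_hat):
    """Mean absolute percentage error (undefined if any actual value is 0)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(y, y_hat)) / len(y)

def r2(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# Hypothetical actual and predicted arrival delays (minutes).
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
print(rmse(actual, predicted), mae(actual, predicted),
      mape(actual, predicted), r2(actual, predicted))
```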
The tuning of hyperparameters is an important step in optimizing the performance of data-driven models. To that end, the random search method was used to find the models' optimal hyperparameters [62]. Furthermore, the models were evaluated using the k-fold cross-validation method to ensure that their performance is robust and generalizable. To do this, the training set was divided into K subsets (folds), each kept representative of the entire dataset. In each round, the learning model was trained on K-1 folds and validated on the remaining one, so that every fold served once as validation data [63]. This method is commonly used to mitigate any bias introduced by the holdout method, which uses a fixed amount of data for training and the remainder for testing.
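Random search combined with k-fold cross-validation can be sketched as follows. To keep the example self-contained it tunes only the number of neighbors of a one-feature k-NN regressor on synthetic data; the search space, fold count, scoring metric, and seeds are assumptions and do not reproduce the study's configuration.

```python
import random

def knn_predict(train, x, k):
    """Average target of the k training points closest to x (single feature)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(target for _, target in nearest) / len(nearest)

def kfold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and deal them into n_folds disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def cv_mae(data, k, n_folds=5):
    """Cross-validated MAE: each fold is held out once, the rest is used to fit."""
    errors = []
    for fold in kfold_indices(len(data), n_folds):
        held_out = set(fold)
        train = [p for i, p in enumerate(data) if i not in held_out]
        for i in fold:
            x, y = data[i]
            errors.append(abs(y - knn_predict(train, x, k)))
    return sum(errors) / len(errors)

def random_search(data, candidates, n_iter=5, seed=0):
    """Score n_iter randomly sampled hyperparameter values and keep the best."""
    tried = random.Random(seed).sample(candidates, n_iter)
    scores = {k: cv_mae(data, k) for k in tried}
    return min(scores, key=scores.get), scores

# Synthetic data: arrival delay roughly linear in departure delay, plus noise.
rng = random.Random(1)
data = [(x, 0.9 * x + 5 + rng.gauss(0, 3)) for x in range(60)]
best_k, scores = random_search(data, candidates=list(range(1, 16)))
print(best_k, scores)
```

Unlike a grid search, random search evaluates only a sample of the candidate values, which keeps the cost bounded when several hyperparameters are tuned at once.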
Following the selection of the best ML model for predicting arrival delay time, it is critical to assess the model's learning curves to ensure that they are accurate. The learning curves depict the trend of the model's training and cross-validation scores as a function of training sample count. This allows us to detect possible problems of overfitting or underfitting as well as determine whether adding more observations to the training set improves model performance [64].
The models were trained and validated in Python 3.8.5 on an Intel Core i9-10885H CPU @ 2.40 GHz with 32 GB of DDR4 RAM, a 1 TB NVMe SSD (class 40), and an NVIDIA Quadro P620 (DDR5) GPU. This hardware configuration enables quick and efficient model training and validation, reducing the time and resources required for the analysis.

D. ANALYSIS OF THE INPUT FEATURES
Following the identification of the best data-driven model, the impact of the features associated with arrival delay time is calculated using the model's coefficients for each input feature. Following the conditional dependence theory [65], the model's coefficients represent the relationship between the given input feature x i and the target y (i.e., arrival delay time), with the assumption that all other features x j remain constant. These coefficients represent the impact of each input feature on the model's output, allowing us to evaluate the effect of each individual feature on the arrival delay time.
After that, the Shapley Additive exPlanation method (SHAP) is used to generate feature dependence plots. This method ensures that the results are better interpreted because it reveals the direct impact of each feature on the model [61], [66], allowing for the discovery of correlations between two variables and their impact on freight rail arrival delay times. SHAP feature dependence plots depict the interaction effect of two combined features from the same observation, as well as their impact on the model-predicted feature: the arrival delay time.
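The idea behind Shapley values can be made concrete with an exact computation on a small hypothetical model. This sketch is not the SHAP library and not the tree-specific algorithm typically used with gradient boosting models: it enumerates all feature coalitions and replaces "absent" features with a single baseline input, both of which are simplifying assumptions. The model coefficients and inputs are invented for the example.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    Features outside a coalition take their baseline value (an assumption;
    SHAP implementations usually average over a background dataset).
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi

def predict(features):
    """Hypothetical delay model with a departure-delay x distance interaction."""
    dep_delay, distance, weight_per_length = features
    return (0.9 * dep_delay + 0.02 * distance + 0.5 * weight_per_length
            + 0.001 * dep_delay * distance)

x = [30.0, 400.0, 3.0]          # trip being explained
baseline = [10.0, 300.0, 2.5]   # reference trip
phi = shapley_values(predict, x, baseline)
print(phi, sum(phi), predict(x) - predict(baseline))
```

The contributions add up to the difference between the prediction for the trip and the prediction for the baseline, and the interaction term is shared between the two features involved. A dependence plot then shows, for many trips, a feature's value on the x-axis against its contribution on the y-axis.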

V. RESULTS AND DISCUSSION
This section presents the results and analysis of data-driven models for short-term prediction of arrival delay times in freight rail operations, which is divided into two parts: (a) an examination of the performance of the trained data-driven models, with the goal of identifying the model that performed the best based on evaluation metrics such as RMSE, R2, MAPE and MAE, and (b) an analysis of the features that have the most significant impact on delays in freight rail operations, which are then used to gain insights into how the features interact in the output of the best data-driven model.

A. DATA-DRIVEN MODELS
Initially, several feature combinations were tested to determine the most effective set of attributes for data-driven models. Using the Pearson method for correlation analysis, less relevant attributes were removed, resulting in a refined set of features that did not compromise the models' performance. The final dataset's composition is shown in TABLE 3, and a validation process was carried out to ensure that the distribution of values for all features was consistent between the training and test groups by carrying out a consistency test of the data, as shown in FIGURE 4. This approach was used to ensure that the models were trained on a representative sample of the data while avoiding overfitting and underfitting risks.
As described in Section IV-C, five data-driven models were initially analyzed and evaluated based on the proposed evaluation metrics, and the best-performing model was then selected for further analysis. As TABLE 4 shows, the lightGBM model performed better than the other models. To improve its performance even further, the random search method was used in conjunction with popular ML Python libraries such as Pycaret, Scikit-learn, and lightGBM [56], [67], [68]. The evaluation metrics of the lightGBM model after tuning its hyperparameters are also shown in TABLE 4.
Considering the results in TABLE 4, even though the results for some models are quite similar, the tuned lightGBM model slightly outperforms the others in predicting the arrival delay time (although any of them would be a valid option). As a result, this model is chosen to assess the impact of the input features on the model output as well as to investigate the relationship between disruptions and their subsequent delays. For both training and test data, the errors between the best model's prediction of the arrival delay time and the actual arrival delay time of the operations performed by CFL Multimodal were estimated (see FIGURE 5). The scatter plots and the corresponding equations provided for the training and test data show that the model performs well in predicting the arrival delay time of operations made by CFL Multimodal. The R2 scores of 0.96 for the training data and 0.89 for the test data indicate a strong correlation between the predicted and actual values of arrival delay time. Equation (10) for the training data and equation (11) for the test data show that the model's predictions are consistent with the actual data, with only slight deviations from the ideal line (y = x), indicating satisfactory overall performance.

$$y_{\text{train}} = 0.96x + 3.22 \qquad (10)$$
$$y_{\text{test}} = 0.94x + 6.40 \qquad (11)$$

LightGBM is an open-source gradient boosting framework that improves prediction accuracy in regression and classification problems by utilizing decision tree algorithms. It is based on the gradient boosting principle, which combines multiple weak learners to create a strong learner capable of making more accurate predictions. The framework grows trees leaf-wise (best-first) rather than level by level and computes gradient and hessian values using a histogram-based approach, which speeds up training and reduces memory usage [56], [57]. LightGBM also supports data parallelism, which enables faster training on large datasets, and regularization, which reduces overfitting.
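Fitted lines such as those reported in (10) and (11) can be obtained by ordinary least squares on the scatter of predicted against actual values. The sketch below assumes that x is the actual and y the predicted arrival delay, which the text does not state explicitly. The data points are synthetic and constructed to lie exactly on the test-set line.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic check: points generated from y = 0.94x + 6.40 recover that line.
actual = [0.0, 15.0, 40.0, 90.0, 150.0]
predicted = [0.94 * a + 6.40 for a in actual]
slope, intercept = fit_line(actual, predicted)
print(round(slope, 2), round(intercept, 2))
```

A slope below 1 together with a positive intercept means that, on average, small delays are slightly overestimated and large delays slightly underestimated.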
In FIGURE 6, the input features are arranged in descending order by the magnitude of their impact: the greater a feature's importance value, the more important it is in predicting the arrival delay time. The departure delay time is the most important factor in predicting the arrival delay time, followed by the distance traveled (between the previous and destination control stations) and the train composition (in terms of weight, length, and number of wagons).
The SHAP method was used to construct the feature dependency plot between each pair of the seven available features, allowing the discovery of greater interaction effects between each pair of features with a higher SHAP value, and thus a higher incidence in the predicted feature. FIGURE 7 and FIGURE 8 depict the strongest interactions discovered in the feature dependence scatter plots, which show the effect of a single feature on the predictions of the lightGBM model. The following considerations must be made:
• Each point represents a single prediction (observation) from the dataset.
• The x-axis represents the value of the specified feature.
• The y-axis displays the SHAP value for that feature, which indicates how much the model's prediction of the arrival delay time for that sample is influenced by knowing the feature's value.
• The color corresponds to the second feature, which interacts significantly with the feature on the x-axis.
FIGURE 7 depicts the variability of the train's weight per length in predicting the arrival delay time, with a growing trend in the impact of this variable on the prediction. It also shows that trains with a higher weight per length generally cover a shorter total trip distance. FIGURE 8, on the other hand, depicts the roughly linear and positive trend between the departure delay time and its SHAP value, that is, the direct correlation between the departure delay time and the arrival delay time.
This emphasizes the significance of departure delay time in predicting arrival delay time in freight rail operations. These findings are consistent with previous research on passenger trains, which found that departure delay time is a significant predictor of arrival delay time [7], [8]. This study, however, is the first to show the same correlation in freight rail operations. FIGURE 7 and FIGURE 8 show how important it is to consider the weight per length of the train and the total distance of the trip as variables in predicting arrival delay time. These findings can be used to inform future operational interventions, such as optimizing routes to reduce distance and weight per length of the train, to improve overall freight railway reliability.
It is also worth noting that this study is based on data from a single freight rail company; thus, it would be advantageous to expand this research by including data from other freight rail companies to develop a more comprehensive study. This would allow for more reliable conclusions and a more comprehensive understanding of freight rail operations' behavior.

C. DISCUSSION
This study is the first of its kind to use gradient boosting models to predict arrival delay times in freight rail operations in the short-term. The resulting model is highly efficient and can handle large-scale datasets with high-dimensional features. LightGBM has been shown to outperform other popular machine learning algorithms such as random forest and XGBoost in various benchmarks and real-world applications, making it a popular choice for predictive modeling tasks [56], [57]. Other studies in the field have used different ML models such as neural networks to address other problems in freight rail operations [41], [69].
Previous research has found that train length has an impact on both passenger and freight train punctuality [70], [71]. However, Van Der Kooij et al. discovered that enforcing temporary speed restrictions on longer and heavier passenger trains to safeguard the use of infrastructure could produce significant network delays [72].
This study created a short-term predictive data-driven model to predict the arrival delay time of a train that has already departed from the previous control station and examined the features associated with arrival delay time. This study makes the following significant contributions:
• The development of a consistent short-term predictive data-driven model, which showed that the lightGBM model surpasses other data-driven models in predicting arrival delay time in freight rail operations.
• The impact of the features associated with arrival delay time was examined, and it was found that the departure delay time, the distance of the trip, and the train composition are critical in predicting the arrival delay time in freight rail operations.
• The possibility of CFL implementing the short-term prediction model developed in this study as an STDSS that can be accessed through a simple web service to predict arrival delay times and assess future operational interventions to reduce disruptions and the resulting delays in freight operations.
The findings of this study are useful for the National Railway Company of Luxembourg and other freight rail operators because they can use the predictive model to anticipate delays in the short term and take proactive measures to reduce disruptions and their consequences. In addition, the analysis of the characteristics associated with arrival delay time provides insights for future research and optimization of freight rail operations.
Some of the authors of this paper previously published a study [31] that used the same dataset to build a long-term prediction model. That study applied a binary classification approach to identify the rail operating features associated with intermodal freight rail operation delays, making it possible to predict, based on a train's composition, whether it will be delayed in the long run. Although both the previous and current studies are concerned with developing predictive models for train delays in intermodal freight rail operations in Luxembourg, there are significant differences between the two. In the previous study, a binary classification approach was developed for long-term predictions, whereas in this new study, a regression approach was developed for short-term predictions, allowing for real-time decision making. Furthermore, the SHAP method is used in this new study to identify the relationships between input features and delay times, allowing for a more thorough analysis of the causes of delay and the expected impact on real-time operations. Finally, the National Railway Company of Luxembourg can use the short-term prediction model developed in this study as a decision-support system, for example through a simple web service, providing more support to reduce disruptions and subsequent operational delays.
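To make the idea behind SHAP-style attributions concrete, the following minimal sketch computes exact Shapley values for a toy linear delay model by enumerating all feature subsets. It is not the implementation used in this study: the function names, the toy coefficients, the feature values, and the single-baseline convention (absent features are replaced by a reference value) are all assumptions made for illustration, whereas SHAP for tree ensembles relies on specialized algorithms and a background dataset.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline).

    Features outside a coalition are set to their baseline value
    (an illustrative simplification of how SHAP treats missing features).
    """
    n = len(x)

    def value(coalition):
        # Coalition members keep their actual value; the rest use the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_delay_model(features):
    # Hypothetical linear model; the coefficients are invented for illustration.
    departure_delay, distance_km, weight_per_length = features
    return 0.8 * departure_delay + 0.01 * distance_km + 2.0 * weight_per_length


x = [30.0, 500.0, 3.0]          # hypothetical train to explain
baseline = [10.0, 400.0, 2.0]   # hypothetical reference train
phi = shapley_values(toy_delay_model, x, baseline)
```

For a linear model, each attribution reduces to the coefficient times the feature's deviation from the baseline, and the attributions sum to the difference between the prediction for the train and the prediction for the baseline. This additivity is the property that allows per-feature contributions to be read directly from SHAP plots.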

VI. CONCLUSION
This study presents a comprehensive approach to predicting freight rail arrival delay times, as well as investigating the underlying causes of delays and their expected impact on operations. The goal is to predict operational delays in real time and to create a STDSS that will assist decision-makers in future operational interventions to reduce disruptions and the resulting delays in freight operations. This will improve railway reliability in the freight transport sector in the long run.
Previous studies have developed models that predict the occurrence of disruptions or delay times in railway operations, but most of them have focused on passenger trains. Freight train research has primarily focused on examining the impact of network delays rather than train delays and has been unable to predict short-term delay times once the train has departed from the previous control station.
In this study, we used regression algorithms to train five data-driven models and analyzed predefined evaluation metrics (R2, RMSE, MAPE and MAE). For the examined dataset, which included railway operations carried out between Luxembourg and nine stations in Belgium, France, Germany, Poland, and Italy over a 17-month period, the lightGBM model stood out as the best data-driven model to predict arrival delay times in freight rail operations.
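As a reminder of how the four evaluation metrics are defined, the sketch below computes R2, RMSE, MAE, and MAPE for a small set of invented delay values. The function name, the sample values, and the choice to skip zero-valued observations when computing MAPE are assumptions for illustration only and do not reproduce the evaluation pipeline or the results of this study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and MAPE (in percent) for paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    r2 = 1 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    # MAPE is undefined for zero observations; they are skipped here (assumption).
    nonzero = [(t, e) for t, e in zip(y_true, errors) if t != 0]
    if nonzero:
        mape = 100 * sum(abs(e / t) for t, e in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    return {"r2": r2, "rmse": rmse, "mae": mae, "mape": mape}


# Invented arrival delays in minutes, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(observed, predicted)
```

Because arrival delays can be zero or close to zero for punctual trains, MAPE can become unstable on such observations, which is one reason to report it alongside MAE and RMSE rather than on its own.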
The lightGBM model has demonstrated that departure delay time, trip distance, and train composition are variables with a significant impact on the prediction of arrival delay times in railway operations. Our findings show that longer trains, longer distances, and heavier trains all have a direct relationship with arrival delay times in general. These findings may pave the way for future research into optimizing the routes of these freight trains' operations to reduce distances, resulting in not only shorter operating times, but also shorter arrival delay times.
However, it is important to note that data from a single company cannot fully capture the behavior of freight rail operations across multiple operators. As a result, future phases of this study could include data from other companies operating in the region to develop a broader study at the continental level, and could also draw on other data sources, such as historical climatic data. Furthermore, future research may be geared toward the creation of a workflow capable of automating all the processes required, from data extraction to the construction and implementation of the models developed in this study. These models enable practitioners to predict arrival delay times in freight rail operations in real time, thereby supporting decisions to reduce the impact of these delays on the overall system operation.
JUAN PINEDA-JARAMILLO received the Ph.D. degree in engineering from the Technical University of Valencia, Spain. He is currently with the MobiLab Transport Research Group, University of Luxembourg, working on the prediction of disruptions and optimization in rail intermodal operations as part of an FNR-funded project between the University of Luxembourg and CFL Multimodal. He is also an experienced data scientist and a researcher with a passion for building and deploying successful algorithms and predictive models in different areas within transport planning, traffic safety, and railway engineering. His research interests include data science and transportation analytics.
FEDERICO BIGI is currently pursuing the Ph.D. degree with the Transportation Department, University of Luxembourg. His research interests range from the optimization of shunting movements for freight train management to agent-based modeling and simulation.
TOMMASO BOSI is currently pursuing the Ph.D. degree in computer science and automation with Roma Tre University, Italy. He is one of the Ph.D. Student Representatives with the Department of Civil, Computer Science and Aeronautical Technologies Engineering. He is also the Co-Founder of Aless Don Milani, a foundation that develops national and European projects in the field of sustainability innovations. His research interests include the application of operations research and big data in transportation systems, with the aim of spreading the ideals of sustainable mobility.