Forecasting the number of road accidents in Poland by day of the week and the impact of pandemics and pandemic-induced changes

Abstract The COVID-19 pandemic has significantly affected the performance of the transport sector and its overall intensity. Reduced mobility has a large impact on the number of road accidents. The aim of this study is to forecast the number of road accidents in Poland and to assess the impact of the COVID-19 pandemic on the variation in road crashes. For this purpose, day-wise historical crash data from 2011 onwards were collected and analysed. Based on this historical field data, forecasts were made for both pandemic and nonpandemic variants. The number of accidents was forecast using selected time series models and exponential models. The results indicate that the pandemic reduced the number of road accidents in Poland by 11% to 30%, depending on the day of the week. The decrease is most visible on three days: Monday, Wednesday, and Saturday. Further, the projections suggest that, given the current situation, a further decrease in the number of road accidents in Poland can be expected.


Introduction
Road accidents are incidents that cause not only injury or death to road users, but also material damage. According to World Health Organization (WHO) data, about 1.3 million people die every year as a result of road accidents. In most countries, road accidents cost an average of 3% of gross domestic product (GDP). Road accidents are the leading cause of death for children and young people aged 5-29 years (WHO. The Global status on road safety, 2018). The UN General Assembly has set an ambitious target of halving the number of deaths and injuries from road accidents by 2030 with respect to the numbers in 2018.
The extent of a road accident is an attribute used to determine its severity. Predicting accident severity is crucial for the relevant authorities when developing transport safety policies to eliminate accidents and to reduce injuries, fatalities and property damage (Tambouratzis et al., 2014; Zhu et al., 2019). Identifying the critical factors that affect accident severity is a prerequisite for developing countermeasures to eliminate accidents and mitigate their severity (Arteaga et al., 2020). Yang et al. (2022) proposed a multi-node Deep Neural Network (DNN) to predict different levels of severity of injury, death and property loss. According to the authors, the DNN technique allows for a comprehensive and accurate analysis of the severity of road accidents.
There are several sources of accident data. Most commonly, they are collected and analysed by government authorities through relevant government agencies. Data are collected through police reports, insurance databases or hospital records. Partial information on road accidents is then processed for the transport sector on a larger scale (Gorzelanczyk et al., 2020).
Intelligent transport systems are currently the most important source of data for road accident analysis and forecasting. These data can be obtained from GPS devices in vehicles (Chen, 2017). Roadside vehicle detection microsystems can continuously record vehicle data such as speed, traffic volume and vehicle type (Khaliq et al., 2019). A vehicle number plate recognition system also enables the collection of large amounts of traffic data over a monitored period of time (Rajput et al., 2015). Social media can be another source of traffic and accident data, but its accuracy may be insufficient because those reporting may lack the necessary competence (Zheng et al., 2018).
To ensure the relevance of accident data, it is necessary to work with several data sources, which must be cross-checked against one another appropriately. Combining different data sources by consolidating heterogeneous road accident data helps to increase the accuracy of the analysis results (Abdullah & Emam, 2016).
A statistical study to assess severity and to find the relationship between road accidents and road users was conducted by Vilaça et al. (2017). The study results in a suggestion to improve road safety standards and to adopt other policies related to transport safety. Bąk et al. (2019) conducted a statistical study of road safety in a selected region of Poland based on the number of road accidents, and attempted to identify the various causes of road accidents. Their study used multivariate statistical analysis to examine safety data on those responsible for the accidents.
The choice of accident data source for analysis depends on the type of traffic problem being addressed. Combining statistical models with naturalistic driving data or other data obtained through intelligent transport systems improves the accuracy of accident forecasts and contributes to the elimination of accidents (Chand et al., 2021).
Different methods for forecasting the number of accidents can be found in the literature. Most commonly, time series methods are used to forecast the number of road accidents (Helgason, 2016; Lavrenz et al., 2018); their disadvantages are the inability to assess the quality of the forecast based on expired forecasts and the frequent autocorrelation of the residual component (Forecasting based on time series, 2022). Procházka et al. (2017) utilized the multiple seasonality model for forecasting, and Sunny et al. (2018) used the Holt-Winters exponential smoothing method. The drawbacks of these methods include the inability to introduce exogenous variables into the model (Dudek, 2013; Szmuksta-Zawadzka & Zawadzki, 2009).
A vector autoregression model is also used to forecast the number of road accidents; its major limitation is that a large number of observations of the variables is required to estimate the parameters correctly (Wójcik, 2014). Monederoa et al. (2021) have also successfully used autoregression models to predict the number of fatalities. Similarly, Boom's regression models have been used by Al-Madani (2018). These models assume a simple linear relationship (Mamczur, 2022; Piłatowska, 2012). Biswas et al. (2019) used Random Forest regression to predict the number of road accidents. With this method, when the data contain groups of correlated features of similar significance for the original data, smaller groups are favoured over larger groups (Random forest, 2022), and the method is prone to instability and spiky predictions (Fijorek et al., 2010). Chudy-Laskowska and Pisula (2014) applied three models to the prediction problem under discussion: an autoregressive model with quadratic trend, a model with univariate periodic trend, and an exponential smoothing model. A moving average model can also be used; however, its disadvantages are low prediction accuracy, loss of data in the sequence, and lack of consideration of trends and seasonal effects (Kaszpruk, 2010). Prochazka and Camaj (2017) used the GARMA method, in which some restrictions are imposed on the parameter space to guarantee the stationarity of the process. Very often, the ARMA model for a stationary process, or ARIMA or SARIMA for a non-stationary process, is used for forecasting (Dutta et al., 2020; Karlaftis & Vlahogianni, 2009; Procházka et al., 2017; Sunny et al., 2018). These models are very flexible, but this is also their disadvantage, as good model identification requires more experience from the researcher than, for example, regression analysis (Łobejko, 2015). Another disadvantage is the linear nature of the ARIMA model (Dudek, 2013).
Chudy-Laskowska and Pisula (2015) used the ANOVA method to forecast the discussed issue. The disadvantage of this method is that it requires additional assumptions, in particular the assumption of sphericity, the violation of which may lead to erroneous conclusions (Road safety assessment handbook). Neural network models are also used to predict the number of road accidents. Their disadvantages are the need for experience in this field (Chudy-Laskowska & Pisula, 2014; Wrobel, 2017), the dependence of the final solution on the initial conditions of the network, and the lack of interpretability in the traditional sense (Data mining techniques StatSoft).
A newer prediction approach is the Hadoop model used by Kumar et al. (2019). The disadvantage of this method is its inability to work with small data files (Top Advantages and Disadvantages of Hadoop, 2022). Karlaftis and Vlahogianni (2009) used the GARCH model for prediction. The disadvantage of this method is the complex form of the model (Fiszeder, 2009; Perczak & Fiszeder, 2014). On the other hand, McIlroy et al. (2019) used the ADF test, which has the drawback of poor power in the case of autocorrelation of the random component (Muck, 2022).
Various researchers (Li et al., 2017; Shetty et al., 2017) have also used data-mining techniques for forecasting, which usually have the disadvantage of producing huge sets of general descriptions (Marcinkowska, 2015). Sebego et al. (2008) proposed a combination of different models. Parametric models were also proposed in the work of Bloomfield (1973).
Given the above analysis, the purpose of this study is to forecast the number of road accidents in Poland and to assess the impact of the COVID-19 pandemic on the variability of road accidents. For this purpose, daily historical accident data since 2011 were collected and analyzed. Based on actual historical field data, forecasts were made for both the pandemic and nonpandemic variants. Selected time series models and exponential models were used to forecast the number of accidents.

Materials and methods
The number of road accidents on Polish roads is decreasing year by year. Broken down by day of the week, the drop in accidents exceeds 60% on Saturdays and Sundays. The day-wise fluctuations in the number of crashes also show a decreasing trend. However, compared with the European Union, the number of accidents in Poland is still very high. A drop in the number of road crashes is also observed during the pandemic: 23% fewer road crashes were recorded in 2020 than in 2019. Figure 1 shows the trend in road crashes in Poland over the years for each of the seven days of the week.
Selected exponential smoothing models were used to forecast the number of road accidents by day of the week. The essence of these methods is that the time series of the forecast variable is smoothed with a weighted moving average whose weights are determined by an exponential function. The weights were selected optimally by the Statistica software in which the analysis was conducted. The forecast is thus a weighted average of the current and historical values of the series. The result depends on the choice of the model and its parameters. Forecasting the number of accidents in Poland was carried out using selected time series models.
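The weighted-average idea described above can be illustrated with simple exponential smoothing. This is a minimal sketch and not the Statistica procedure used in the study: the function name, the smoothing constant `alpha` and the yearly counts are hypothetical.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed series; its last value is the one-step-ahead forecast.

    Each smoothed value is alpha * observation + (1 - alpha) * previous level,
    so the weights of older observations decay exponentially.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    smoothed = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


# Hypothetical yearly accident counts for one day of the week
counts = [5000, 4800, 4700, 4500]
forecast = simple_exponential_smoothing(counts, alpha=0.5)[-1]
print(forecast)
```

In practice the smoothing constant would be chosen to minimise a forecast error measure, and variants with trend or seasonal components (Holt, Holt-Winters) add further smoothing equations.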
The following errors of expired forecasts, determined from equations (1-5), were used as measures of forecast accuracy:
• MAE-mean absolute error
• MPE-mean percentage error
• MAPE-mean absolute percentage error
• SSE-error sum of squares
In these measures, n is the length of the forecast horizon, Y is the observed number of road accidents, and Y_p is the forecasted number of road accidents.
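For reference, the measures listed above are commonly written as follows. This is a sketch based on standard textbook definitions and on the symbols n, Y and Y_p from the text; the index i and the sign convention (observed minus forecast) are assumptions, and the exact forms used in the study, in particular for SSE, may differ (for example by a square root or a 1/n factor).

```latex
\begin{align*}
\mathrm{MAE}  &= \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - Y_{p,i}\right| \\
\mathrm{MPE}  &= \frac{1}{n}\sum_{i=1}^{n}\frac{Y_i - Y_{p,i}}{Y_i}\cdot 100\% \\
\mathrm{MAPE} &= \frac{1}{n}\sum_{i=1}^{n}\frac{\left|Y_i - Y_{p,i}\right|}{Y_i}\cdot 100\% \\
\mathrm{SSE}  &= \sum_{i=1}^{n}\left(Y_i - Y_{p,i}\right)^{2}
\end{align*}
```

MPE keeps the sign of each error and therefore indicates systematic over- or under-forecasting, whereas MAPE measures the typical size of the error regardless of direction.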
To compare the number of accidents during the pandemic with the number expected had it not occurred, forecasts were chosen so as to minimize the mean percentage error.
The day-wise variation in the number of road accidents was tested with a Kruskal-Wallis test. The value of the test statistic is 14.4, with test probability p < 0.05. The hypothesis of equality of the average level of road accidents should therefore be rejected. This means that the average level of accidents decreases systematically over the years. Further, there is marked day-wise variation among the crashes, as presented in Figure 2. Figure 2 clearly reveals that the highest number of road accidents takes place towards the weekend, on Friday and Saturday. This is due to increased weekend traffic, which drives up the number of road accidents. Sundays show the lowest number of road crashes, as workplaces are closed on that day, leading to less traffic on the roads.
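The Kruskal-Wallis statistic reported above can be computed from ranks alone. The sketch below is a plain-Python illustration without tie correction, not the routine used in the study; the function names and the sample data are hypothetical. Under the null hypothesis, H is approximately chi-square distributed with k - 1 degrees of freedom; assuming seven day-of-week groups, the 5% critical value is about 12.59, which the reported statistic of 14.4 exceeds.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)


# Hypothetical samples for three groups (illustrative values only)
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat = kruskal_wallis_h(groups)
print(round(h_stat, 2))
```

For real data with many ties, a library routine such as `scipy.stats.kruskal`, which applies the tie correction and returns a p-value, would be the more practical choice.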
Selected time series models were used for further analysis to forecast the number of road accidents.

Forecasting the road accidents
Data from the Polish Police from 2001 to 2021 were used to forecast the number of accidents by day of the week. To examine the effect of the pandemic on the number of road accidents in Poland, the study was divided into two timelines:
• From the end of 2011 to 2021 (with pandemic), and
• From the end of 2011 till the end of 2019 (without pandemic)
The study assumes that the pandemic began in 2020, due to the lack of police statistics on the number of road accidents broken down by day of the week and by month. The day-wise forecasting results that include the effect of the pandemic (up to 2021) are shown in Figures 3 to 9. Similarly, the day-wise crash forecasts that exclude the pandemic data (up to 2019) are shown in Figures 10 to 16. The forecasting methods used in the study are coded as M1, M2, ..., Mn. The various forecasting techniques used for the study are as follows.

Results
The charts below show the results of forecasting the number of road accidents in Poland by pandemic variant and day of the week (Table 1). For Monday, the number of traffic accidents projected for 2022 to 2031 ranges, for 2022, from 3148 (M8) to 5839 (M3). For the last analyzed period, 2031, it ranges from 1175 (M7) to 4080 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 350%. For this day of the week, the smallest MAPE value was observed for the M7 method (Figure 3).
For Tuesday, the number of traffic accidents projected for 2022 to 2031 ranges, for 2022, from 2880 (M8) to 5539 (M3). For the last analyzed period, 2031, it ranges from 1319 (M8) to 3844 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 291%. For this day of the week, the smallest MAPE value was observed for the M9 method (Figure 4).
For Wednesday, the number of traffic accidents projected for 2022 to 2031 ranges, for 2022, from 3064 (M8) to 5638 (M3). For the last analyzed period, 2031, it ranges from 1102 (M7) to 3914 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 355%. For this day of the week, the smallest MAPE value was observed for the M7 method (Figure 5).
For Thursday, the number of traffic accidents projected for 2022 to 2031 ranges, for 2022, from 2827 (M8) to 5735 (M3). For the last analyzed period, 2031, it ranges from 749 (M8) to 3885 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 518%. For this day of the week, the smallest MAPE value was observed for the M7 method (Figure 6).
For Friday, the number of traffic accidents projected for 2022 to 2031 ranges, for 2022, from 3471 (M14) to 6604 (M3). For the last analyzed period, 2031, it ranges from 1312 (M15) to 4463 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 340%. For this day of the week, the smallest MAPE value was observed for the M7 method (Figure 7).
For Saturday, the number of traffic accidents projected for 2022 to 2031 ranges, for 2022, from 2762 (M10) to 5813 (M3). For the last analyzed period, 2031, it ranges from 813 (M10) to 3809 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 340%. For this day of the week, the smallest MAPE value was observed for the M7 method (Figure 8).
For Sunday, the number of traffic accidents projected for 2022 to 2031 ranges, for 2022, from 2231 (M8) to 4866 (M3). For the last analyzed period, 2031, it ranges from 542 (M7) to 3081 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 568%. For this day of the week, the smallest MAPE value was observed for the M9 method (Figure 9).
For Monday, excluding the pandemic, the number of road accidents projected for 2020 to 2029 ranges, for 2020, from 3956 (M8) to 4786 (M3). For the last analyzed period, 2029, it ranges from 2225 (M7) to 4825 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 216%. For this day of the week, the smallest MAPE value was observed for the M12 method (Figure 10).
For Tuesday, excluding the pandemic, the number of road accidents projected for 2020 to 2029 ranges, for 2020, from 4123 (M7) to 4537 (M3). For the last analyzed period, 2029, it ranges from 2634 (M8) to 4758 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 181%. For this day of the week, the smallest MAPE value was observed for the M11 method (Figure 11).
For Wednesday, excluding the pandemic, the number of road accidents projected for 2020 to 2029 ranges, for 2020, from 4130 (M7) to 4639 (M3). For the last analyzed period, 2029, it ranges from 2558 (M7) to 4875 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 190%. For this day of the week, the smallest MAPE value was observed for the M10 method (Figure 12).
For Thursday, excluding the pandemic, the number of road accidents projected for 2020 to 2029 ranges, for 2020, from 4178 (M7) to 4651 (M3). For the last analyzed period, 2029, it ranges from 2589 (M7) to 4862 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 187%. For this day of the week, the smallest MAPE value was observed for the M14 method (Figure 13).
For Friday, excluding the pandemic, the number of road accidents projected for 2020 to 2029 ranges, for 2020, from 4568 (M7) to 5292 (M3). For the last analyzed period, 2029, it ranges from 2573 (M7) to 5190 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 201%. For this day of the week, the smallest MAPE value was observed for the M9 method (Figure 14).
For Saturday, excluding the pandemic, the number of road accidents projected for 2020 to 2029 ranges, for 2020, from 3995 (M7) to 4477 (M5). For the last analyzed period, 2029, it ranges from 2066 (M9) to 4477 (M5). The difference between the minimum and maximum results is more than 217%. For this day of the week, the smallest MAPE value was observed for the M12 method (Figure 15).
For Sunday, excluding the pandemic, the number of road accidents projected for 2020 to 2029 ranges, for 2020, from 3333 (M7) to 3732 (M3). For the last analyzed period, 2029, it ranges from 1740 (M9) to 3645 (M3). As can be seen, moving average methods should not be used in this case. The difference between the minimum and maximum results is more than 210%. For this day of the week, the smallest MAPE value was observed for the M7 method (Figure 16).
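The percentage spreads quoted for each day appear to correspond to the largest forecast for the final year expressed as a percentage of the smallest one; this reading is an assumption, not something the text states. A minimal sketch using the Sunday figures for 2031 with the pandemic included (542 for M7 and 3081 for M3), with a hypothetical function name:

```python
def spread_ratio_percent(lowest, highest):
    """Largest forecast expressed as a percentage of the smallest one."""
    return highest / lowest * 100


# Sunday, 2031, pandemic variant: 542 (M7) and 3081 (M3), as quoted in the text
sunday_2031 = spread_ratio_percent(lowest=542, highest=3081)
print(f"{sunday_2031:.1f}%")
```

Under this reading the Sunday value lies just above 568%, in line with the figure reported for that day.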
Based on the results that include the pandemic, it can be concluded that the number of road accidents forecast for the end of the analysis period, 2031, varies between 542 and 4117 across all days, depending on the forecasting technique used. Further, the results indicate that the number of road accidents in Poland will decrease over the years.
To compare the number of road accidents by day of the week during the pandemic with the number expected had there been no pandemic, the number of road accidents was forecast using, for each day, the forecasting technique with the smallest average percentage error. The following methods were selected as the best forecasting methods for various days.
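The selection rule described above, picking for each day the method with the smallest error, can be sketched as follows. The function name and the error values are hypothetical; taking the absolute value is an assumption made because a mean percentage error can be negative.

```python
def best_method(errors_by_method):
    """Return the method code whose error is closest to zero."""
    return min(errors_by_method, key=lambda method: abs(errors_by_method[method]))


# Hypothetical mean percentage errors (in %) for one day of the week
monday_errors = {"M3": 12.4, "M7": -1.8, "M8": 3.1}
chosen = best_method(monday_errors)
print(chosen)
```

The same function can be applied to a dictionary of MAPE values, in which case the absolute value has no effect because MAPE is non-negative.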
Based on the research, it can be concluded that the estimated number of road accidents in Poland for the year 2029 varies from 740.30 to 4824.00, depending on the method and day of the week.
Based on the presented research results, it can be concluded that the number of road accidents in Poland will decrease year over year on each day of the week.

Discussion
The forecasted number of road accidents in 2020 on each day of the week was compared with the actual number of road accidents reported by the police (Statistic Road Accident, 2022). These data are presented in Figures 17-23. Based on the obtained data, it can be concluded that the pandemic caused a decrease in the number of road accidents in Poland of 11% to 30%, depending on the day of the week. This reduction is most evident on Monday, Wednesday, and Saturday. For individual days it is as follows:
• Monday-29%,
• Tuesday-23%,
• Wednesday-29%,
• Thursday-22%,
• Friday-11%,
• Saturday-30%,
• Sunday-24%.
As we can see, the smallest difference is on the day when the highest number of traffic accidents occurs.
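The day-wise reductions above compare a forecast made without the pandemic against the count actually recorded. A minimal sketch of that calculation, in which the function name and the two counts are hypothetical and the no-pandemic forecast is assumed to be the reference value:

```python
def pandemic_reduction_percent(forecast_no_pandemic, actual):
    """Percentage by which the actual count falls below the no-pandemic forecast."""
    return (forecast_no_pandemic - actual) * 100 / forecast_no_pandemic


# Hypothetical counts for one day of the week in 2020
reduction = pandemic_reduction_percent(forecast_no_pandemic=4000, actual=2840)
print(f"{reduction:.0f}%")
```

A positive result means fewer accidents than forecast; a negative result would indicate more accidents than the no-pandemic forecast.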
In the next step, taking into account formulas 1-5, forecast errors were determined for the projected number of traffic accidents (Table 2). Due to the popularity of the MAPE error, its value was used for further analysis. The study assumed the following criteria for the MAPE error (Lewis, 1982):
• <10%-Highly accurate forecasting,
• 10%-20%-Good forecasting,
• 20%-50%-Reasonable forecasting,
• >50%-Inaccurate forecasting.
Based on these criteria, it can be concluded that in most cases the forecast accuracy is high. Only in two cases is it merely reasonable. The value of the error varied in the range 1.64%-38.74%. Its average value with the pandemic was 14.51%, while without the pandemic it was 8.47%. Based on this, it can be concluded that the accuracy of the forecasts obtained is high.
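The MAPE calculation and the Lewis (1982) bands listed above can be sketched as follows. The function names are hypothetical, and the treatment of values lying exactly on a band boundary (10%, 20%, 50%) is an assumption, since the text does not specify it.

```python
def mape(observed, forecast):
    """Mean absolute percentage error, in percent."""
    total = sum(abs(y - f) / y for y, f in zip(observed, forecast))
    return total / len(observed) * 100


def lewis_category(mape_percent):
    """Map a MAPE value (in %) to the accuracy bands quoted from Lewis (1982)."""
    if mape_percent < 10:
        return "highly accurate"
    if mape_percent < 20:  # boundary handling is an assumption
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "inaccurate"


# Values reported in the text: range limits and the two averages
for value in (1.64, 8.47, 14.51, 38.74):
    print(value, lewis_category(value))
```

Applied to the reported averages, 8.47% (without pandemic) falls in the highly accurate band and 14.51% (with pandemic) in the good band.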

Conclusion
The day-wise forecasts of the number of accidents in Poland were determined using the exponential smoothing method in the Statistica software. The applied weights were estimated by the program so as to minimize the mean absolute error and the mean absolute percentage error. Based on the obtained results, it can be concluded that the pandemic caused a reduction in the number of road accidents in Poland. The reduction ranges from 11% to 30%, depending on the day for which the forecast is made. Based on the current situation, a further reduction in the number of road accidents in Poland can be expected, which might bring it to a level comparable with the number of accidents in the EU.
Moreover, the analysis of the obtained results shows that the forecasts of the number of road accidents in Poland for the coming years follow a decreasing trend, especially owing to the arrival of the COVID-19 pandemic. The calculated forecast errors support the accuracy of the models used. Another notable finding is that the number of crashes varies across the days of the week. Moreover, depending on the day for which the forecast is made, specific forecasting techniques prove to be more accurate. The forecasts obtained in this article may be used in the future to formulate further actions aimed at minimizing the number of accidents in the country under study. These actions may consist, for example, in the introduction of higher penalties for traffic offences on Polish roads from the beginning of 2022. In further research, the authors plan to consider more factors affecting the level of accidents in Poland. These may include traffic density, weather conditions or the age of the accident perpetrator.

Figure 1. Comparison of the number of accidents in Poland from 2001-2021.
Figure 2. Comparison of the average number of road accidents in Poland by days of the week in 2001-2021.
Figure 3. Forecasting the number of road accidents on Monday from 2022-2031.
Figure 4. Forecasting the number of road accidents on Tuesday from 2022-2031.
Figure 5. Forecasting the number of road accidents on Wednesday from 2022-2031.
Figure 6. Forecasting the number of road accidents on Thursday from 2022-2031.
Figure 7. Forecasting the number of road accidents on Friday from 2022-2031.
Figure 8. Forecasting the number of road accidents on Saturday from 2022-2031.
Figure 9. Forecasting the number of road accidents on Sunday from 2022-2031.
Figure 10. Forecasting the number of road accidents on Monday from 2020-2029 if there was no pandemic.
Figure 11. Forecasting the number of road accidents on Tuesday from 2020-2029 if there was no pandemic.
Figure 12. Forecasting the number of road accidents on Wednesday from 2020-2029 if there was no pandemic.
Figure 13. Forecasting the number of road accidents on Thursday from 2020-2029 if there was no pandemic.
Figure 14. Forecasting the number of road accidents on Friday from 2020-2029 if there was no pandemic.
Figure 15. Forecasting the number of road accidents on Saturday from 2020-2029 if there was no pandemic.
Figure 16. Forecasting the number of road accidents on Sunday from 2020-2029 if there was no pandemic.
Figure 17. Comparison of the number of road accidents on Monday with and without pandemic (presence of pandemic: M7, absence of pandemic: M12).
Figure 18. Comparison of the number of road accidents on Tuesday with and without pandemic (presence of pandemic: M9, absence of pandemic: M11).
Figure 19. Comparison of the number of road accidents on Wednesday with and without pandemic (presence of pandemic: M7, absence of pandemic: M10).
Figure 20. Comparison of the number of road accidents on Thursday with and without pandemic (presence of pandemic: M7, absence of pandemic: M15).
Figure 21. Comparison of the number of road accidents on Friday with and without pandemic (presence of pandemic: M7, absence of pandemic: M9).
Figure 22. Comparison of the number of road accidents on Saturday with and without pandemic (presence of pandemic: M7, absence of pandemic: M12).
Figure 23. Comparison of the number of road accidents on Sunday with and without pandemic (presence of pandemic: M9, absence of pandemic: M7).