Forecasting traffic accidents using disaggregated data

https://doi.org/10.1016/j.ijforecast.2005.11.001

Abstract

Traffic accidents, measured monthly, present different characteristics when the aggregate is compared to its individual components. When disaggregated data are used, the effects of policy variables, calendar events, and different seasonal behaviors should be clearly understood and their coefficients properly estimated. In this paper, we compare the empirical performance of various models in assessing the effects of policy variables, legal changes, and traffic security campaigns. In addition, aggregated versus disaggregated forecasts of the main accident variables are compared in order to examine the robustness of the forecasting improvement from using disaggregated data. In particular, we test the robustness of this improvement against the specification of the model, information set, type of measure of forecasting accuracy, and forecast year. Overall, we conclude that forecast combinations based on disaggregated models display better performance.

Introduction

The effects of disaggregation on forecasting the aggregate have been studied extensively over the past two or three decades. From a theoretical standpoint, Tiao and Guttman (1980) provided guidelines for conditions, with no guarantee of a gain in forecast efficiency (measured by the minimum mean squared error), when employing the component series rather than the aggregate. Wei and Abraham (1981) compared forecast efficiency based on the aggregate series, univariate component series and joint multiple time series. Kohn (1982) defined the aggregate as any linear combination of the observed series, and analyzed the equality of forecasts between the aggregated and disaggregated approaches. Later, Lütkepohl (1984, 1985) established the equality of forecasts when sets of linear combinations of the observed time series are considered. More recently, Clark (2000) examined the problem of forecasting an aggregate composed of cointegrated disaggregates. In addition, this author established conditions under which forecasts of an aggregate variable obtained from a disaggregated Vector Error Correction Model are equal to those obtained from an aggregated univariate time series model. In the case of common trends, Poncela and Garcia-Ferrer (2005) derived conditions that guarantee the equality of forecasts between the aggregated and disaggregated approaches using unobserved component models. The empirical counterpart to this issue has shown mixed results (being case dependent). Nevertheless, recent results regarding Gross Domestic Product growth rates (Zellner & Tobias, 2000) and several European macroeconomic variables (Marcellino, Stock, & Watson, 2003) indicated that, generally, “it pays to disaggregate”. Most studies on short-term predictions show that considerable gains in efficiency, based on mean-square-error-type criteria, can be obtained when using models based on disaggregated data. However, as the prediction horizon increases, the gain in efficiency from using disaggregated data diminishes substantially (Koreisha & Fan, 2004).
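The comparison between forecasting the aggregate directly and summing component forecasts can be illustrated with a small simulation. The sketch below is only a toy under stated assumptions, not any of the models used in this paper: two independent AR(1) components with hypothetical coefficients 0.9 and -0.5, unit-variance Gaussian noise, a fixed random seed, and one-step-ahead forecasts scored by mean squared error.

```python
import random


def simulate_ar1(phi, n, rng):
    """Simulate a zero-mean AR(1) series x_t = phi * x_{t-1} + e_t, e_t ~ N(0, 1)."""
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def fit_ar1(series):
    """Least-squares estimate of phi in a zero-mean AR(1) model."""
    num = sum(prev * cur for prev, cur in zip(series[:-1], series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


rng = random.Random(0)  # fixed seed so the run is reproducible
n_train, n_test = 2000, 2000

# Hypothetical components with very different dynamics (assumed values).
x1 = simulate_ar1(0.9, n_train + n_test, rng)
x2 = simulate_ar1(-0.5, n_train + n_test, rng)
agg = [a + b for a, b in zip(x1, x2)]

# Fit one AR(1) per component, and one AR(1) to the aggregate itself.
phi1 = fit_ar1(x1[:n_train])
phi2 = fit_ar1(x2[:n_train])
phi_agg = fit_ar1(agg[:n_train])

sq_err_agg, sq_err_dis = 0.0, 0.0
for t in range(n_train, n_train + n_test):
    direct = phi_agg * agg[t - 1]                    # aggregate model
    bottom_up = phi1 * x1[t - 1] + phi2 * x2[t - 1]  # sum of component forecasts
    sq_err_agg += (agg[t] - direct) ** 2
    sq_err_dis += (agg[t] - bottom_up) ** 2

mse_agg = sq_err_agg / n_test
mse_dis = sq_err_dis / n_test
print(f"MSE, aggregate model:      {mse_agg:.3f}")
print(f"MSE, disaggregated models: {mse_dis:.3f}")
```

In this setup the components have different dynamics, so a single AR(1) fitted to the aggregate cannot reproduce them and the disaggregated forecast should show the lower error. If both components shared the same coefficient, the two approaches would give essentially the same forecast, which is the kind of equality condition discussed above.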

The existence of many empirical aggregate relationships tends to “obscure” some of the underlying characteristics of the individual series when data on traffic accidents are analyzed. For instance, in the case of monthly seasonal data, a different behavior is often observed in the aggregate variable compared to its individual components. Given this situation, the estimation of both the aggregate seasonal pattern and the calendar effects may be highly biased if these elements differ substantially among the individual components. In some cases, aggregated calendar effects seem to be negligible as a result of offsetting individual effects of different signs (Garcia-Ferrer & del Hoyo, 1987). These unexpected results may not only provide misleading conclusions for policy, but may also cause considerable deterioration in the forecasting performance of alternative models.
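A deliberately artificial example shows how offsetting component effects can vanish in the aggregate. Everything in the sketch below is hypothetical: the two components, the size of the effects (+5 and -5), and the calendar dummy that switches on in one month per year. The effect is estimated as the difference in means between dummy and non-dummy months, which equals the OLS slope on a single 0/1 regressor with an intercept.

```python
def dummy_effect(series, dummy):
    """Difference in means between dummy=1 and dummy=0 observations.

    This equals the OLS slope when regressing the series on an
    intercept and a single 0/1 dummy.
    """
    on = [y for y, d in zip(series, dummy) if d == 1]
    off = [y for y, d in zip(series, dummy) if d == 0]
    return sum(on) / len(on) - sum(off) / len(off)


# Hypothetical calendar dummy: 1 in one month per year, 3 years of monthly data.
dummy = [1 if month % 12 == 3 else 0 for month in range(36)]

# Two made-up components with calendar effects of opposite sign.
comp_a = [100.0 + 5.0 * d for d in dummy]
comp_b = [80.0 - 5.0 * d for d in dummy]
aggregate = [a + b for a, b in zip(comp_a, comp_b)]

effect_a = dummy_effect(comp_a, dummy)
effect_b = dummy_effect(comp_b, dummy)
effect_agg = dummy_effect(aggregate, dummy)

print(effect_a, effect_b, effect_agg)  # 5.0 -5.0 0.0
```

An analyst looking only at the aggregate would conclude that the calendar event has no effect, even though each component responds to it.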

Spain is one of the European countries with the highest rate of road accidents (Page, 2001). Roughly speaking, Spain had the same number of motor fatalities in 2003 as it had thirty years earlier. This occurred in spite of huge infrastructure investments, considerable technological improvement, and generous spending on numerous traffic security campaigns. Traffic authorities in other European countries regularly set targets aimed at reducing road casualties. However, in Spain, most of these efforts thus far have been in vain. This situation is especially apparent when we compare the circumstances in Spain to those observed in other European countries. For instance, in 1987 the target in Great Britain was to reduce the number of fatalities and serious injuries by one-third by 2000 compared with the average for 1981 to 1985. This target was surpassed; road fatalities fell by 39% and serious injuries by 45%. The success in Great Britain was made possible through legislative changes, improved infrastructure, and vehicle crash protection (Raeside, 2004).

In this paper, we compare the forecasting performance of a large number of econometric models in order to address the issue of setting realistic targets. Specifically, we apply these models to monthly Spanish traffic accident variables during the years 1975 to 2003. This paper has two main objectives. The first objective is to evaluate and compare the empirical performance of various models in assessing the effects of policy variables, legal changes and traffic security campaigns. When dealing with this issue, disaggregation is crucial in order to avoid misleading conclusions for policy decisions. The second objective is to compare individual forecasts with various combination forecasts of the main accident variables. We also evaluate whether or not the forecasting improvement when using disaggregated data is robust to the specification of the model, the information set in each model, the type of measure of forecasting accuracy, and the forecast year. In Section 2 we discuss the definitions and characteristics of the data used in the study. Section 3 briefly describes the methodologies used to analyze the aggregate relationships between traffic accidents and real economic activities, and presents the empirical results. Beginning with an analysis of the data, we estimate several univariate and intervention models in order to provide starting points for the construction of causal econometric models. In Section 4, we analyze the predictive performance of models that focus their attention on disaggregated data and assess the value of disaggregation in terms of forecasting. A conclusion of the study is provided in Section 5.
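To make the idea of comparing individual and combination forecasts under different accuracy measures concrete, the following minimal sketch uses made-up numbers. The equal-weight average is an assumption chosen for simplicity; the combination schemes and accuracy measures actually used in the paper are not specified in this excerpt.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical observed values and two competing model forecasts.
actual = [100.0, 110.0, 120.0, 130.0]
model_1 = [104.0, 108.0, 124.0, 128.0]
model_2 = [98.0, 114.0, 118.0, 134.0]

# Equal-weight combination (an assumed scheme, chosen for illustration).
combined = [(f1 + f2) / 2 for f1, f2 in zip(model_1, model_2)]

for name, forecast in [("model 1", model_1), ("model 2", model_2), ("combined", combined)]:
    print(f"{name:9s} RMSE={rmse(actual, forecast):.3f}  MAE={mae(actual, forecast):.3f}")
# model 1   RMSE=3.162  MAE=3.000
# model 2   RMSE=3.162  MAE=3.000
# combined  RMSE=1.000  MAE=1.000
```

The errors of the two hypothetical models partly offset each other, so the average beats both on either measure. With real data the ranking can depend on the measure and on the forecast year, which is why robustness across both is checked.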

Description of the data

The empirical implementation of alternative models requires information on both accident rates and economic variables. As a result, the data used in this paper have been separated into three groups:

  1. The following variables related to accident rates (monthly data from January 1975 to December 2003; 348 observations) were measured: (i) number of accidents with injured passengers (ACC); (ii) number of killed passengers (FAT);

Methodologies and empirical results

In this paper, we will use two different methodologies and three levels of information sets. When our information set is exclusively restricted to the past history of the forecast variable, the dynamic harmonic regression (DHR) model will be used. As we enlarge the information set (including exogenous economic variables), variants of the intervention ARIMA model and dynamic transfer functions will be considered. The theoretical frameworks are briefly sketched in the next subsections and the

Forecasting

The main purpose of this section is to assess the predictive performance of the models that have been analyzed earlier. In particular, we use the same exogenous variables to examine the predictive accuracy of aggregated versus disaggregated models. In addition, we examine the improvement in prediction that occurs when we use more elaborate models with larger information sets. The choice of the forecasting period is directly related to the beginning of the traffic authorities' public

Conclusions and policy issues

The traffic agencies in many developed countries periodically set road safety targets. Although these targets are only mildly defined in Spain, various governments have openly encouraged the target of a “drastic” reduction in accident rates by 2010. In order to maintain an effective use of resources, it is important to monitor the progress that is made towards these targets. This paper, using seasonal monthly data of recent Spanish accident rates, presents a methodology for analyzing the

References

  • Bujosa, M., Garcia-Ferrer, A., & Young, P. C. (2005). An ARMA representation of unobserved component models under...
  • Chen, C., et al. (1993). Joint estimation of model parameters and outlier effects in time series. Journal of the American Statistical Association.
  • Clark, T. E. (2000). Forecasting an aggregate of cointegrated disaggregates. Journal of Forecasting.
  • Clemen, R. T. (1989). Combining forecasts: A review and annotated bibliography with discussion. International Journal of Forecasting.
  • Garcia-Ferrer, A., de Juan, A., & Poncela, P. (2004). The relationship between traffic accidents and real economic...
  • Garcia-Ferrer, A., et al. (1987). Analysis of the car accident indexes in Spain: A multiple time series approach. Journal of Business and Economic Statistics.
  • Harvey, A. (1989). Forecasting structural time series models and the Kalman filter.
  • Harvey, A., et al. (1986). The effects of seat belt legislation in British road casualties: A case study in structural time series modelling. Journal of the Royal Statistical Society, Series A, Part 3.