Introduction

That weather affects our daily lives is well known. Rainy and inclement days increase the likelihood of traffic incidents and delays (Karlaftis and Yannis 2010). Extremely hot and cold weather not only increases energy needs (Wang et al. 2010) but also induces acute health situations (Leung et al. 2008; Kjellstrom et al. 2010). Severe weather events such as hurricanes, tornadoes, and tropical cyclones often demand special rescue and emergency operations (Williamson et al. 2002). Weather forecast data have been used effectively in decision-making to plan road maintenance works, prepare for road clearing on snowy days, and estimate resources required to respond to power outages in bad weather conditions. NOAA (2005) claimed that weather is responsible for about two-thirds of aviation delays in the United States at a cost of US $4 billion annually, $1.7 billion of which would be avoidable with better observations and forecasts. Because the accuracy of weather forecasting has improved with better tools (Shukla et al. 2010), the value of integrating weather information and systems into weather-responsive transport management and emergency operations is irrefutable (Cluett et al. 2011; Williamson et al. 2002).

Weather conditions have long been known to affect human health. Keatinge (2002) reported that cardiovascular (such as ischemic heart and cerebrovascular) diseases and respiratory illnesses gave rise to disproportionately higher mortality in winter. Rooney et al. (1998) and Liang et al. (2009) also found that extreme temperatures could exacerbate chronic illnesses. Locally in Hong Kong, Yan’s (2000) study recorded a significant negative relationship between minimum temperature and mortality for the age group 65 years and above, as well as a significant positive association between cloud cover and mortality. Wong and Lai (2012) also confirmed the negative relationship between daily ambulance demand and daily average temperature up to 28 °C. Recently, the Hong Kong Observatory (HKO) cooperated with the Senior Citizen’s Home Safety Association to explore the effects of air temperature on the usage of the Personal Emergency Link Service and those requiring subsequent hospitalization (Hong Kong Observatory 2007). Their initial findings suggested that the number of hospitalized users was inversely proportional to temperatures below 23 °C and 10 % higher at 12 °C compared to that at 23 °C. The above discussions clearly imply an association between weather conditions and ambulance demand, with the association being more pronounced among the elderly.

The Fire Services Department (FSD) manages emergency ambulance services in Hong Kong while the Hospital Authority and some voluntary organizations shoulder non-emergency ambulance calls. A steady increase in the number of ambulance calls (both emergency and non-emergency) has been observed since 1998 (Table 1). The FSD reportedly handled more than 687,133 ambulance calls in 2010, which reflected a 17.5 % increase from 2005 and a 48 % increase from 1998 (Fire Services Department 2011). During the same period, the total population increased by 4.1 % and 8 %, respectively, in 2005 and 2010 compared with that of 1998. The corresponding increases in the elderly (65+) population were 20.8 % in 2005 and 31.5 % in 2010. The FSD practices a logistic assignment of emergency vehicles that dispatches ambulances on the basis of “next-in-queue” by assigning the nearest available ambulance to the scene regardless of which district the ambulance belongs to. It also implements staggered shifts and a move-up system to further improve service and response times by deploying ambulances from nearby clusters to stand by in a cluster whose ambulance resources have been fully utilized. A yearly rate of 8–11 % emergency move-ups of ambulances in 2006–2009 was needed to provide an adequate operational coverage (Fire Services Department 2011; Hong Kong Legislative Council 2005). In view of the rising demand for emergency ambulance services and limited resources, early knowledge of expected demand is instrumental in the strategic mobilization and deployment of ambulances.

Table 1 Annual counts of ambulance calls from 1998 to 2010

The development of forecasting models for ambulance demand and health incidence made notable progress (Setzler et al. 2009) in the late 1990s. Baker and Fitzpatrick (1986) attempted separate forecasts on daily emergency and non-emergency ambulance demand using an exponential smoothing model developed originally by Winters (Saydam et al. 1986) and then combined the two into a single forecast through goal programming. Four months of data on daily ambulance demand from September to December 1982 were used to forecast daily demand in January and February of 1983. When comparing the mean square errors, the technique of goal programming seemed to outperform multiple linear regression models. Tandberg et al. (1998) employed different time series methods (based on moving averages, means with moving average smoothing, and autoregressive integrated moving averages) to estimate the hourly volumes of ambulance runs. Hourly ambulance statistics in 1994 were used to develop forecast models, and similar statistics in 1995 were used to test and validate the models. The results indicated that time series by “means with moving average smoothing” yielded the best performance, with 54.3 % of the variance explained. Liao et al. (2002) considered weather factors with the back-propagation neural network to predict daily ambulance runs. They employed daily ambulance records from 1996 to 1997 in model development and 1998 records in model validation. Their models based on time-series data were able to predict ambulance runs to within 6 % error on average. Channouf et al. (2007) compared the forecast accuracy of two time series models: (1) an autoregressive model based on ambulance data with trend, seasonality, and special-day effects removed; and (2) a doubly seasonal ARIMA model with special-day effects included. Their models were derived using 36 months of data and validated based on data in the subsequent 14 months. The results suggested that the first model was better, with a smaller root mean square error (RMSE) of 13.91 and a small average absolute percentage error (AAPE) of 5.72 %. A comprehensive review of ambulance demand forecasts can be found in Wargon (2009).

Other than ambulance demand forecasts, time-series models have been used widely in health-related applications. Tong and Hu’s (2001) epidemic-forecasting model on disease incidence of the Ross River Virus in Cairns was based on the autoregressive integrated moving average (ARIMA) method. Their forecasting model confirmed that relative humidity and rainfall were significantly associated with disease incidence. Time-series modeling was also adopted by Ajdacic-Gross et al. (2007) in evaluating seasonal associations between monthly suicide data and weather conditions, and by Abeku et al. (2002) in verifying the accuracy of five different forecasting methods of malaria incidence.

We believe that time-series analysis in forecasting ambulance demand is superior to ordinary multiple regression analysis (McLay et al. 2012) and the back-propagation neural network approach because it can cater for both predictor and temporal effects. Indeed, the general utility of time-series modeling by adequately trained non-specialists has contributed to its popularity (Allard 1998). In this paper, we have employed a series of linear time-series forecasting models that allow for the inclusion of predictor variables in estimating daily ambulance calls of Hong Kong. The predictive powers of various models are compared using the measure of root mean square error (RMSE). Here, the effectiveness of using HKO’s 7-day weather forecast data to predict ambulance demand for the coming days is compared against that using retrospective and actual weather data. Because only forecast data are available before an event day, the practical utility of forecast meteorological data, which are not error-free, in predicting daily ambulance demand for the next few days can be evaluated.

Data and methods

We predicted daily ambulance demand using the autoregressive integrated moving average (ARIMA) method in model development. The series of daily ambulance calls is the dependent series to be predicted, whereas the 7-day weather forecast or actual meteorological series constitute the independent or predictor series.

Daily ambulance demand data

For this study, we obtained over 6 million records (covering a 3-year period from May 2006 to April 2009 inclusive) of emergency attendance at the Accident and Emergency Departments of all hospitals managed by the Hong Kong Hospital Authority. This 3-year period was chosen because there was neither infectious disease outbreak (e.g., SARS or H1N1 influenza) nor policy changes in emergency medical services (e.g., introduction of ambulance user-fee policy) to cause confounding biases in the study. Each anonymized record contains an arbitrary record number, date and ambulance brought-in indicators to yield essential daily statistics for the study. Having excluded walk-in patients by referencing against the ambulance brought-in indicator, 22 % of records (or over 1.3 million) remained in the model. Figure 1 shows a time-series plot of the daily ambulance cases (with 3-day smoothing) that reveals a pattern of random fluctuation with some periodicity showing comparatively more cases in the summer/hotter and winter/cooler months. We also took note of extreme values or outlier data (such as those on the first day of Chinese New Year when it is considered inappropriate to have hospital visits) that could not be explained by forecasting models. This special day with anomalous ambulance demand is excluded from our prediction model. In addition, a 0.7 % increase in the population from mid-2006 to mid-2009 and the change of age structure (Census and Statistics 2012) were considered negligible. We preferred not to make further artificial adjustment (e.g., normalization) on the data series to prevent undesirable artifacts, although the series had an increasing trend.

Fig. 1

A time-series plot of daily ambulance cases with 3-day smoothing (May 2006–April 2009)
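As a rough illustration of the 3-day smoothing applied in Fig. 1, the sketch below computes a centered moving average in Python. The window alignment and the edge handling are assumptions, since the text does not specify them, and the daily counts are hypothetical.

```python
def moving_average(values, window=3):
    """Centered moving average; edge days average only the neighbours that exist."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)                # first index inside the window
        hi = min(len(values), i + half + 1)  # one past the last index inside the window
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Hypothetical daily ambulance counts, for illustration only.
daily_cases = [1200, 1260, 1230, 1290, 1170]
print(moving_average(daily_cases))
```

A trailing (rather than centered) window would be the natural alternative if the smoothed series were to be used for forecasting instead of plotting.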

Meteorological data

Hong Kong, which is situated in the southern part of China bordering Guangdong Province, is subject to cold and northerly winds from China and has a monsoon-influenced humid subtropical climate tending towards moderately warm temperatures for nearly half the year. Daily actual values of common weather variables from May 2006 to April 2009 inclusive, and the corresponding 7-day forecast values were extracted from the website of the HKO (2010). The HKO produces weather forecast data by employing numerical weather prediction models based upon outputs from the combined Japan Meteorological Agency and European Centre for Medium-Range Weather Forecasts global models (Lam et al. 2005). Our study averaged the forecast maximum and minimum daily readings to represent the 7-day forecast values of daily average temperature and humidity. Only temperature and relative humidity were considered in our study, not only because their corresponding 7-day weather forecasts were available from the HKO through the Internet but also because these two variables have been shown to have a direct relationship with ambulance demand (Wong and Lai 2012; Yan 2000). The presence of confounders such as air pollutants was not considered due to unavailability of their corresponding forecast values.
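The averaging step described above can be sketched as follows. Only the rule itself (the daily average is taken as the mean of the forecast maximum and minimum) comes from the text; the function name and the sample 7-day values are hypothetical.

```python
def daily_average(forecast_max, forecast_min):
    """Approximate daily averages as the midpoint of forecast maximum and minimum."""
    if len(forecast_max) != len(forecast_min):
        raise ValueError("maximum and minimum series must have the same length")
    return [(hi + lo) / 2 for hi, lo in zip(forecast_max, forecast_min)]


# Hypothetical 7-day forecast of maximum and minimum temperature (deg C).
t_max = [30, 31, 29, 28, 28, 27, 26]
t_min = [25, 26, 25, 24, 23, 22, 22]
print(daily_average(t_max, t_min))
```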

Method of analysis

ARIMA model

Compared to a typical multiple regression model, the ARIMA, or linear time-series, model (Harvey 1989, 1993) is an advanced forecasting algorithm consisting of two subprocesses. The first is an autoregressive process ϕ_p(B) that describes how an observation is related to past observations. The second is a moving average process θ_q(B) that describes how an observation is related to past error terms. Mathematically, they are expressed as follows.

Order p autoregressive process:

$$ {\phi_p}(B)=1-{\varphi_1}B-{\varphi_2}{B^2}-\cdots -{\varphi_p}{B^p}, $$

where φ_p is an autoregressive coefficient; p is the order of the non-seasonal autoregressive part of the model; and B is the backward shift operator.

Order q moving average process:

$$ {\theta_q}(B)=1-{\vartheta_1}B-{\vartheta_2}{B^2}-\cdots -{\vartheta_q}{B^q}, $$

where ϑ_q is a moving average coefficient; and q is the order of the non-seasonal moving average part of the model.
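To make the backward shift notation concrete, the following sketch applies a polynomial of the form 1 − c_1 B − … − c_k B^k to a series, using the standard convention that B^j shifts a series back by j time steps. The function name is illustrative and is not part of any package used in the study.

```python
def apply_lag_polynomial(coeffs, series):
    """Apply (1 - c_1*B - c_2*B^2 - ... - c_k*B^k) to a series.

    Values are returned for t = k .. len(series) - 1, i.e. only where
    every required lag is available.
    """
    k = len(coeffs)
    out = []
    for t in range(k, len(series)):
        value = series[t]
        for lag, c in enumerate(coeffs, start=1):
            value -= c * series[t - lag]  # B^lag picks the value `lag` steps back
        out.append(value)
    return out


# With a single coefficient of 1 the polynomial is (1 - B), i.e. first differencing.
print(apply_lag_polynomial([1.0], [1, 3, 6, 10]))
```

The same helper covers both polynomials above: pass the autoregressive coefficients and an observed series, or the moving average coefficients and a series of error terms.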

Combining the two processes in an ARIMA model (along with consideration of seasonal components and predictors) yields the mathematical expression of ARIMA(p,d,q) (P,D,Q):

$$ \varPhi (B)\left[ \varDelta \left( y_t-\sum\limits_{i=1}^m c_i x_{it} \right)-\mu \right]=\varTheta (B)\,a_t, \qquad t=1,\ldots ,N, $$

$$ \varPhi (B)={\phi_p}(B)\,{\varPhi_P}(B)\quad \mathrm{and}\quad \varTheta (B)={\theta_q}(B)\,{\varTheta_Q}(B), $$

where Φ_P(B) and Θ_Q(B) are the order P seasonal AR polynomial and order Q seasonal MA polynomial, respectively; Δ is a differencing operator; y_t is the time series under investigation; x_it is the ith independent variable at time t; c_i is a regression coefficient; μ is the model constant; and a_t is the white noise at time t. An example of ARIMA(1,1,1) with temperature and humidity as covariates, and with the constant μ set to zero, can be expressed as follows:

$$ \left( {1-{\varphi_1}B} \right)\left( {{y_t}-{y_{t-1 }}} \right)=\left( {1-{\varphi_1}B} \right)\left[ {{c_{temp }}\left( {{x_{temp,t }}-{x_{temp,t-1 }}} \right)+{c_{humidity }}\left( {{x_{humidity,t }}-{x_{humidity,t-1 }}} \right)} \right]+\left( {1-{\vartheta_1}B} \right){a_t} $$

A more detailed description of the algorithm can be found in IBM SPSS Statistics 20 (2011b).
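The ARIMA(1,1,1) example above can be turned into a one-step-ahead forecast recursion. The sketch below assumes μ = 0, takes the coefficients as given (all values used in the demonstration are hypothetical), and starts the recursion with zero start-up values. It illustrates the algebra only and is not the estimation procedure implemented in SPSS.

```python
def one_step_forecasts(y, x_temp, x_hum, phi, theta, c_temp, c_hum):
    """One-step-ahead forecasts for ARIMA(1,1,1) with two covariates and mu = 0.

    Define w_t = (y_t - y_{t-1}) - c_temp*(x_temp_t - x_temp_{t-1})
                                 - c_hum*(x_hum_t - x_hum_{t-1}).
    The model then reads w_t = phi*w_{t-1} + a_t - theta*a_{t-1}.
    """
    forecasts = []
    w_prev = 0.0  # assumed start-up value for the differenced, regression-adjusted series
    a_prev = 0.0  # assumed start-up value for the error term
    for t in range(1, len(y)):
        # Regression contribution of the differenced covariates at time t.
        reg = (c_temp * (x_temp[t] - x_temp[t - 1])
               + c_hum * (x_hum[t] - x_hum[t - 1]))
        w_hat = phi * w_prev - theta * a_prev   # expected w_t given the past
        forecasts.append(y[t - 1] + reg + w_hat)
        w_t = (y[t] - y[t - 1]) - reg           # realised w_t once y_t is known
        a_prev = w_t - w_hat                    # one-step forecast error
        w_prev = w_t
    return forecasts


# Hypothetical series and coefficients, for illustration only.
calls = [1200, 1230, 1215, 1260]
temp = [24.0, 22.5, 23.0, 20.0]
hum = [80.0, 78.0, 82.0, 75.0]
print(one_step_forecasts(calls, temp, hum, phi=0.3, theta=0.6, c_temp=-8.0, c_hum=0.5))
```

Note that the forecast for day t needs the covariate values for day t, which is why either actual or forecast weather data must be supplied for the target day.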

An obvious advantage of the ARIMA model is that its autoregressive terms and differencing property can cater for specific weather effects such as time lags and daily fluctuations. In contrast, a regression model must introduce specially designed predictors to account for these weather effects. Nevertheless, ARIMA models have been shown to be outperformed by regression models for longer-term forecasts (Wong and Lai 2009). Because this study aims at short-term forecasts of up to 7 days in advance, the ARIMA models were considered a more suitable choice.

All ARIMA models in this study were developed using the IBM SPSS Forecasting module, which contains an Expert Modeler that can resolve, for non-expert users, the best model specification involving seasonal ARIMA models (IBM SPSS 2011a). The winter/summer seasonality was not modeled explicitly as our aim was to undertake 1- to 7-day short-term forecasts and modeling the daily effects of temperature/humidity can serve a similar purpose. Specifically, the Expert Modeler can determine the number of autoregressive and moving average terms and the order of differencing required for transforming non-stationary time-series to stationary, which is a basic assumption of ARIMA modeling. As the research involved numerous ARIMA models, the process was facilitated by implementing an SPSS Macro program (Einspruch 2004).

Assessment of prediction accuracy

We argue that ambulance demand can be represented by the sequence of historical observations such that we can extrapolate the identified pattern to predict future events. The data were divided into two time periods: (1) historical (May 2006–April 2008) and (2) validation (May 2008–April 2009). The historical data were used to develop the forecasting models. Data within the validation period were used to test and evaluate the model performance. Three different versions of ARIMA models were attempted:

  (a) without predictors, i.e., the benchmark model;

  (b) using the actual average temperature as the independent or predictor variable; and

  (c) using both actual average temperature and relative humidity as the independent or predictor variables.
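The division of the data into historical and validation periods described above can be sketched as a simple date filter. The period boundaries come from the text; the record layout (a list of day and count pairs) and the constant demonstration series are assumptions made for illustration.

```python
from datetime import date, timedelta

# Period boundaries taken from the text (inclusive on both ends).
HIST_START, HIST_END = date(2006, 5, 1), date(2008, 4, 30)
VALID_START, VALID_END = date(2008, 5, 1), date(2009, 4, 30)


def split_by_period(records):
    """Split (day, count) pairs into historical and validation sets."""
    historical = [r for r in records if HIST_START <= r[0] <= HIST_END]
    validation = [r for r in records if VALID_START <= r[0] <= VALID_END]
    return historical, validation


# Hypothetical constant series, only to exercise the split.
records = [(HIST_START + timedelta(days=i), 1000) for i in range(1096)]
historical, validation = split_by_period(records)
print(historical[-1][0], validation[0][0])
```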

To determine how well our established models can forecast ambulance demands in the next 1–7 days, RMSE values between the actual and predicted daily ambulance cases were calculated. A smaller RMSE value would indicate a closer resemblance between the actual and predicted values, and a larger RMSE value would indicate otherwise. Hence, the difference between RMSE values of various models would indicate the relative effectiveness of the respective predictors (or forecast weather factor) in estimating ambulance demand. The RMSE measure thus enables an objective comparison among the models to identify the optimum model and relevant independent variables. After identifying the best model among models a, b and c, the actual weather data will be replaced by weather forecast data for assessing the effectiveness of applying the model in real life situations. All ARIMA models in this study were developed using the IBM SPSS Forecasting module, which determines automatically the number of autoregressive and moving average terms and the order of differencing needed to transform non-stationary time-series to stationary series. Moreover, weekly periodic patterns of the ambulance demand not explained by weather factors were also considered. Details of the processing procedures can be found in the IBM SPSS Forecasting manual (IBM SPSS, 2011a).
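For reference, the RMSE used to compare the models is the square root of the mean squared difference between actual and predicted values. A minimal Python sketch follows; the function name and the sample numbers are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical actual and predicted daily ambulance cases.
actual_cases = [1210, 1185, 1250, 1300]
predicted_cases = [1190, 1200, 1240, 1275]
print(round(rmse(actual_cases, predicted_cases), 2))
```

Because the errors are squared before averaging, a few large misses (such as holiday anomalies) weigh more heavily on the RMSE than many small ones.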

The performance of the ARIMA models was assessed on the basis of their forecast accuracy rather than by testing their modeling parameters, because past research has already confirmed significant relationships between specific weather factors and daily ambulance demand (Wong and Lai 2012). Reporting the modeling parameters is also impractical because the study involved three sets of ARIMA models with 3,660 different settings for year 2008 (366 fits for each of models a, b, and c, plus 366 × 7 fits for the 1- to 7-day weather forecasts), each of which requires replacing the weather forecast data for the current day with the actual weather data in the subsequent modeling run. Refitting the ARIMA model using known values from the previous day can ensure that the parameters represent the latest situation. Nevertheless, the general pattern of the modeling parameters (i.e., the values of p, d, and q) is presented in the Results section. It should be noted that the assumptions of the ARIMA model were not checked explicitly during the model development phase for the same reason. Nonetheless, any violation of the fundamental assumptions of the model would be reflected objectively in the forecast accuracy during the model validation phase.
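The count of 3,660 settings follows from 366 × 3 + 366 × 7 = 1,098 + 2,562. The daily refitting can be pictured as a rolling-origin evaluation, sketched below with a placeholder forecaster. The function names are hypothetical, and the naive last-value forecaster merely stands in for the fitted ARIMA model.

```python
def count_model_runs(days=366, variants=3, horizons=7):
    """Number of model fits: one per day per variant, plus one per day per horizon."""
    return days * variants + days * horizons


def rolling_origin_forecasts(series, first_origin, horizon, fit_and_forecast):
    """Refit on all data known at each origin and forecast `horizon` days ahead.

    `origin` is the number of observations known so far, so the 1-day-ahead
    target has index `origin` and the h-day-ahead target has index origin + h - 1.
    """
    results = []
    for origin in range(first_origin, len(series) - horizon + 1):
        history = series[:origin]
        target_index = origin + horizon - 1
        results.append((target_index, fit_and_forecast(history, horizon)))
    return results


def naive_last_value(history, horizon):
    """Placeholder forecaster (an assumption): repeat the last known value."""
    return history[-1]


print(count_model_runs())
print(rolling_origin_forecasts([1, 2, 3, 4, 5], 2, 1, naive_last_value))
```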

Results

Figure 2 shows deterioration in the absolute forecast accuracy (or increasing prediction errors) of HKO weather variables with longer forecast time span, which is not surprising because prediction tends to lose accuracy further into the future. It also shows that the accuracies of forecast average temperature are far better than those of forecast average relative humidity. This preliminary analysis suggests that forecast average temperature is more reliable and should be considered first in the forecast model.

Fig. 2

Average absolute percentage errors between actual and forecast values for average temperature and average relative humidity
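The error measure plotted in Fig. 2 can be sketched as below. The text does not spell out the formula, so the usual definition (the mean of |actual − forecast| / |actual|, expressed in percent) is assumed, and the sample values are hypothetical.

```python
def aape(actual, forecast):
    """Average absolute percentage error, in percent (assumed definition)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical actual and 1-day-ahead forecast average temperatures (deg C).
actual_temp = [20.0, 25.0, 28.0]
forecast_temp = [22.0, 25.0, 27.0]
print(round(aape(actual_temp, forecast_temp), 2))
```

Computing this measure separately for each lead time from 1 to 7 days would give one curve per weather variable, which is how a plot like Fig. 2 could be assembled.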

The performance of the ARIMA models is compared in Table 2. Model b gives smaller RMSEs than the benchmark model a, indicating that average temperature is indeed a useful predictor for improved accuracy. However, model c shows that adding relative humidity to model b brings uncertainty into the prediction, which implies that it would be sufficient to have average temperature as the sole predictor in the forecast model.

Table 2 Comparing forecast results of different autoregressive integrated moving average (ARIMA) models. All modeling parameters (i.e. the values of p, d, and q) fall within one order of magnitude. RMSE Root mean square error

Given that forecasting is a prediction of what would occur in the future in the absence of actual values, the predictor in model b was substituted by forecast average temperatures to predict ambulance demand in the next few days; this gives model b’. Since the forecast average temperature was found to be quite reliable (Fig. 2), the RMSEs of model b’ compared favorably against those of model b in Table 2. Figure 3 shows the RMSEs of the 1- to 7-day forecasts as exhibited by three ARIMA models (i.e., columns 2, 3, and 7 of Table 2, respectively, for models a, b, and b’). The benchmark model a predicts ambulance demand based purely on historical behavior. The prediction accuracy improved by 9.1 % to 15.4 % when the actual average temperature was added as a predictor in model b. Although model b’, which makes use of the forecast average temperature, did not perform as well as model b, the fact that it improved prediction accuracy by 8.8 % to 13.2 % suggests that weather forecast data (i.e., 1- to 7-day forecast data of the average temperature in this case) are an effective predictor in the forecasting model.
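The improvement percentages quoted above are presumably relative reductions in RMSE against the benchmark model a. That reading is an assumption, since the text does not state the formula. Under it, the calculation is a one-liner; the RMSE values below are hypothetical and are not taken from Table 2.

```python
def improvement_pct(rmse_benchmark, rmse_model):
    """Relative reduction in RMSE against the benchmark, in percent."""
    if rmse_benchmark <= 0:
        raise ValueError("benchmark RMSE must be positive")
    return 100 * (rmse_benchmark - rmse_model) / rmse_benchmark


# Hypothetical RMSE values for a benchmark model and a model with a predictor.
print(improvement_pct(100.0, 90.0))
```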

Fig. 3

Comparing root mean square errors of different autoregressive integrated moving average (ARIMA) models

Results of 1- and 7-day forecasts by the three models are shown in Fig. 4. The worst forecast occurred on the day after the Chinese New Year (by date and actual ambulance vs forecast demand: 27 January 2009; 1233 vs 845) because the forecast was severely affected by the low ambulance demand on the day before. This inaccurate forecast is unlikely to be a problem because the abnormality can be easily identified and the anomalous forecast appears for just 1 day out of the whole year. Besides, the modeling parameters (i.e., the values of p, d, and q) appear reliable as they fall within one order of magnitude.

Fig. 4

Results of the 1- and 7-day ahead forecasts by the three models

Discussion and conclusions

In this study, we compared the performance of four different ARIMA forecasting models to predict daily ambulance demand from 1 to 7 days in advance. Our empirical findings showed that the benchmark ARIMA model a was outperformed by the two ARIMA models, b and b’, that make use of average temperature as a predictor. Firstly, the inclusion of actual average temperature as a predictor yielded the best results, which underscores the significance of this weather variable in enhancing forecasting accuracy. Secondly, substituting forecast for actual average temperature produced encouraging results because the former was a credible reflection of the latter. Finally, the combined use of both average temperature and average relative humidity in model c did not improve model performance further.

The insignificant effects of relative humidity arose partly because its actual and forecast values were not in close agreement and partly because the average temperature had already explained most of the variance in ambulance demand. We suspect that a straightforward addition of average relative humidity into the forecasting model was not able to explain the remaining variance in ambulance demand. To further explore the potential role of relative humidity in prediction, it may be necessary to consider an alternative and well-defined weather index such as the net effective temperature (NET) (Hentschel 1987). A weather index, along with more precise weather forecast data, could perhaps capture the underlying and intricate relationships between weather factors and ambulance demand.

A deficiency of this research is data completeness because our ambulance data did not include false calls or cases of death before arrival at a hospital. These demands for ambulance services are important in terms of resource implications. In principle, more inclusive data could be derived from call records of the FSD, the managing authority of ambulance services. Another deficiency concerns the representativeness of average values of weather variables (i.e., temperature and relative humidity) computed from their corresponding maximum and minimum readings. It would be preferable if official average readings of weather variables could be obtained from the HKO. Finally, although the ambulance demand forecast was improved by incorporating the temperature factor, we took note that not all ambulance demand was weather related. The model could be strengthened if different types of ambulance demand could be identified. By differentiating ambulance demand according to its nature, specific models could be developed for various types of ambulance demand to derive more accurate overall forecasts. However, this suggestion was not tested in the study because essential identifiers were not available.

In summary, this is the first study in which a forecasting model of the daily demand for ambulance services in Hong Kong has been developed using the ARIMA method and weather forecast data from the HKO. Although the effectiveness of using the ARIMA model in predicting ambulance or emergency services demand has been demonstrated outside of Hong Kong (see Channouf et al. 2007; Setzler et al. 2009), these latter studies made use of retrospective or actual meteorological data which, in the practical setting, are not available on the day to enable forecasting. The effectiveness of utilizing official weather forecast data to predict daily ambulance demand for the coming days, as we have established in our study, is a more appropriate demonstration of the societal use and values of meteorological information.

The above finding is timely for the development of the Fourth Generation Mobilising System (FGMS) as the existing system will be overloaded in 2013. Weather forecast data could be incorporated into the FGMS without additional cost by linking directly to the HKO website. The incorporation of weather forecast data is worthwhile because its inclusion will improve forecast accuracy and thereby better inform resource management and the deployment of ambulance attendants. For example, if the HKO 7-day weather forecasts warned of declining average temperatures in the next week and the ambulance demand forecast system predicted an increasing demand for ambulance services, such information would signal the need for advance planning and for estimating the number of ambulance attendants required for the occasion. Such forward planning would enhance employee communication and reduce the need for compensation payments due to cancellation of previously approved leave.

There are few studies about the joint decision-making processes of forecasters and emergency managers during severe weather events (e.g., see Baumgart et al. 2008). Due to a lack of standardization and the complex nature of interagency cooperation, much time is wasted at the beginning on liaison and coordination. Medical care of a non-recurrent nature due to extreme weather conditions must not be neglected. In an emergency situation when lives are endangered, a 12-hour difference is important 1 day out but becomes less important the farther out the forecast is. Given that the FSD does not at present provide forecasts of daily ambulance demand, our forecasting model b’ can help the agency or the managing authority to know beforehand, and be better prepared for, the expected rise in the demand for emergency ambulances arising from changing weather conditions. Prior knowledge of service demand can facilitate vehicle logistics to ensure that the needy receive timely services.