The m-delay Autoregressive Model with Application

The classical autoregressive (AR) model has been widely applied to predict future data using m past observations for over five decades. As the classical AR model requires m unknown parameters, this paper implements the AR model by reducing the m parameters to two, obtaining a new model with an optimal delay called the m-delay AR model. We derive the m-delay AR formula for approximating the two unknown parameters based on the least squares method, and develop an algorithm to determine the optimal delay based on a brute-force technique. The performance of the m-delay AR model was tested against the classical AR model. The results, obtained from Monte Carlo simulation using the monthly mean minimum temperature in Perth, Western Australia, from the Bureau of Meteorology, show no significant difference from those obtained with the classical AR model. This confirms that the m-delay AR model is an effective model for time series analysis.


Introduction
Time series analysis has been widely applied in many fields, such as agriculture and the environment. In agriculture, rainfall forecasting is a popular topic in time series analysis for irrigation management purposes. Based on historical rainfall data, a time series model can predict the amount of rainfall for agricultural target planning. This could help farmers manage their resources and increase agricultural output [Moeletsi, Mellaart, Mpandeli et al. (2012)]. For the environment, the weather forecast is important, as temperature is a key factor in government planning and management in many areas such as tourism, vegetation and public health [Ustaoglu, Cigizoglu and Karaca (2008)].
Various approaches based on time series methods have been proposed to fit data sets. These approaches include the Box-Jenkins method, neural networks, hybrid techniques and fuzzy time series methods. The Box-Jenkins method is one of the best-known, comprising the autoregressive (AR) model, the moving average (MA) model, the autoregressive moving average (ARMA) model and the autoregressive integrated moving average (ARIMA) model [Box, Jenkins and Reinsel (2008)]. This method has been applied to a wide range of applications, including economics, especially stock price sequences [Ip, Zhang, Sowinski et al. (2015)], engineering [Amo-Salas, López-Fidalgo and Pedregal (2015); Bacher, Madsen and Nielsom (2009); de Waele and Broersen (2003)], industry [Acedański (2013)] and science [Hurvich and Tsai (1989); Sawa (1978)]. The artificial neural network (ANN) approach is the main tool for nonlinear data, as it is a brain-inspired system [Tealab (2018)]. Hybrid techniques have been established because an individual model may not be adequate to analyze all the actual observations [Taskaya-Temizel and Casey (2005)]; they combine at least two individual forecast methods to improve forecasting accuracy [Pan, Zhang and Xia (2009)]. The ANN approach integrated with either the AR or the ARIMA method is called the AR-ANN model [Qi and Zhang (2001)] or the ARIMA-ANN model [Zhang (2003)]. Fuzzy time series is a forecasting method built on fuzzy principles; it is suitable for numerical and linguistic data [Saxena and Easo (2012)]. Apart from the development of forecasting methods, various approaches for estimating the model parameters of a time series model have been proposed, such as the maximum likelihood method, the method of moments and the Yule-Walker procedure. The maximum likelihood method finds the parameter values that maximize the probability of observing the data. The method of moments is a simple procedure: by equating the sample moments to the corresponding population moments, the model parameters are solved for [Cryer and Chan (2008)]. The Yule-Walker method (or autocorrelation method) fits an AR model to the windowed input data by minimizing the error in the least squares sense [Broersen (2008)].
Time series data are commonly classified into two groups: stationary and non-stationary. For stationary data, the mean and variance are constant over time, and the sequence shows no upward (or downward) trend or seasonal pattern. In contrast, the mean and variance of non-stationary data vary over time. Two common ways to check the stationarity of data are the time series plot and the roots of the characteristic equation [Maddala and Wu (1999)]. As the roots of the characteristic equation may be real and/or complex, the data are stationary if every real root is greater than one in absolute value and the modulus of each complex root is greater than one. The data are considered non-stationary if at least one root lies between minus one and plus one, or inside the unit circle [Box, Jenkins and Reinsel (2008)]. As the AR model requires stationary data to predict future values, single/double differencing and mathematical transformation are two common procedures for converting non-stationary data to stationary data [Hassler (1994)]. The differencing technique removes the trend component of the data, while mathematical transformations, including the square/cube-root and log transformations, are applied when the variance of the data is unstable [Pourahmadi (2001)].
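As a minimal sketch on synthetic data (not from the paper), the two procedures look as follows: differencing removes a linear trend, and a log transform stabilizes a growing variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic non-stationary series: linear trend plus white noise.
t = np.arange(200)
x = 0.5 * t + rng.normal(size=200)

# Single differencing removes the linear trend; the result is stationary.
dx = np.diff(x)

# For a positive series whose variance grows over time, the log transform
# stabilizes the variance before further modelling.
y = np.exp(0.02 * t + 0.1 * rng.normal(size=200))
log_y = np.log(y)
```

After differencing, `dx` fluctuates around the trend slope (0.5 here) with constant variance, so an AR model can be fitted to it.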
As the classical AR model uses m parameters in its formula to predict future observations, statistical techniques have been applied to estimate all of them. In this study, we propose the m-delay AR model, in which only two parameters need to be determined.
The m-delay AR formula based on the least squares method is derived, and an optimal delay algorithm based on a brute-force technique is developed. The remainder of this paper is structured as follows. Section 2 derives the m-delay AR model and its two parameters. The performance of the parameter estimation is presented in Section 3, and the empirical study is reported in Section 4. Discussion and conclusion are given in Section 5.
2 Derivation of the m-delay autoregressive (MAR) model

Generally, an explicit formula for the present value x_t is given by the standard AR model [Box, Jenkins and Reinsel (2008)], i.e.,

x_t = φ_1 x_{t−1} + φ_2 x_{t−2} + · · · + φ_m x_{t−m} + ε_t,   (1)

where x_t represents the present value at time t, {x_{t−1}, x_{t−2}, . . . , x_{t−m}} is the list of past observations, φ_i (i = 1, . . . , m) is the i-th coefficient of the AR model, and ε_t is a Gaussian white noise process assumed to follow the normal distribution N(0, σ²).
In this study, we propose the m-delay AR model, which approximates the present value x_t using only the first term at t − 1 and the last term at t − m:

x_t = φ_1 x_{t−1} + φ_m x_{t−m} + ε_t.   (2)

The first and last coefficients of the AR model are estimated based on the least squares method [Khalil and Moraes (1995)]. The principal idea of the least squares procedure is to minimize the sum of squared errors

S(φ̂_1, φ̂_m) = Σ_{t=m+1}^{n} (x_t − φ̂_1 x_{t−1} − φ̂_m x_{t−m})².   (3)

To find the optimal values of φ̂_1 and φ̂_m in Eq. (3), we differentiate Eq. (3) with respect to φ̂_1 and φ̂_m and set the derivatives to zero:

∂S/∂φ̂_1 = −2 Σ_{t=m+1}^{n} (x_t − φ̂_1 x_{t−1} − φ̂_m x_{t−m}) x_{t−1} = 0,
∂S/∂φ̂_m = −2 Σ_{t=m+1}^{n} (x_t − φ̂_1 x_{t−1} − φ̂_m x_{t−m}) x_{t−m} = 0.   (4)

Solving this 2 × 2 linear system, we finally obtain the m-delay formula for approximating φ̂_1 and φ̂_m, i.e.,

φ̂_1 = (Σ x_t x_{t−1} · Σ x²_{t−m} − Σ x_t x_{t−m} · Σ x_{t−1} x_{t−m}) / (Σ x²_{t−1} · Σ x²_{t−m} − (Σ x_{t−1} x_{t−m})²),
φ̂_m = (Σ x_t x_{t−m} · Σ x²_{t−1} − Σ x_t x_{t−1} · Σ x_{t−1} x_{t−m}) / (Σ x²_{t−1} · Σ x²_{t−m} − (Σ x_{t−1} x_{t−m})²),   (5)

where all sums run over t = m + 1, . . . , n. We call φ̂_1 and φ̂_m the new approximations of the m-delay AR coefficients.
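Because Eq. (5) solves a 2 × 2 least squares problem, the estimates can be checked numerically. A sketch (the helper name `fit_m_delay_ar` is ours, not from the paper) that fits the two coefficients by ordinary least squares:

```python
import numpy as np

def fit_m_delay_ar(x, m):
    """Least-squares estimates (phi1_hat, phim_hat) of the two-parameter
    m-delay AR model  x_t = phi1*x_{t-1} + phim*x_{t-m} + e_t.
    Solving the normal equations via lstsq is equivalent to Eq. (5)."""
    x = np.asarray(x, dtype=float)
    y = x[m:]                        # targets x_t for t = m, ..., n-1
    A = np.column_stack([x[m-1:-1],  # regressor x_{t-1}
                         x[:-m]])    # regressor x_{t-m}
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1]
```

On a long series simulated from model (2) with φ_1 = 0.5, φ_m = 0.3, the returned estimates are close to the true coefficients.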
As the standard AR model is a stationary time series process, we now determine the stationarity condition of the m-delay AR model. In this study, we investigate the stationarity of our proposed model (2) by computing the roots of the AR characteristic equation.
To discuss stationarity, we define the AR characteristic polynomial as

φ(z) = 1 − φ_1 z − φ_m z^m,   (6)

with the characteristic equation

1 − φ_1 z − φ_m z^m = 0.   (7)

Considering the roots of Eq. (7), a sufficient stationarity condition for our model is

|φ_1| + |φ_m| < 1,   (8)

where the delay m is between 3 and 120.
We illustrate the sufficient stationarity condition in Fig. 1.
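The root-based stationarity check can be sketched numerically (the helper `is_stationary` is ours; it assumes the characteristic polynomial of Eq. (7), with stationarity when all roots lie outside the unit circle):

```python
import numpy as np

def is_stationary(phi1, phim, m):
    """Check stationarity of the m-delay AR model by computing the roots of
    the characteristic polynomial 1 - phi1*z - phim*z^m and testing whether
    every root lies strictly outside the unit circle."""
    # numpy.roots expects coefficients from the highest degree down:
    # -phim*z^m + 0*z^(m-1) + ... - phi1*z + 1
    coeffs = np.zeros(m + 1)
    coeffs[0] = -phim   # z^m term
    coeffs[-2] = -phi1  # z term
    coeffs[-1] = 1.0    # constant term
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))
```

For φ_1 = 0.5, φ_m = 0.3 the sufficient condition (8) holds (0.5 + 0.3 < 1) and the root check agrees; for φ_1 = 0.7, φ_m = 0.5 the polynomial has a real root inside the unit circle, so the model is non-stationary.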
Here φ_1 and φ_m are the parameters of the MAR model, φ̂_1 and φ̂_m are the approximated parameters, x̂_t are the forecast observations, n is the sample size, m is the delay and L is the number of simulations.

Parameter estimation
This section concerns the effectiveness of the m-delay formula (Eq. (5)), examined via Monte Carlo simulation, and the determination of the optimal delay using the brute-force technique.

Effectiveness of the m-delay formula (φ̂_1 and φ̂_m)
To examine the coefficients φ_1 and φ_m obtained from the m-delay formulation (Eq. (5)), we set the initial values of the first m data points to zero (m ≪ n). The observations x_t determined by Eq. (2) are generated using the Monte Carlo technique. Fig. 2 presents the examination process using the computation scheme shown in Tab. 1. We perform the same process for different sample sizes with different delays.
The simulation results indicate that the averages of φ̂_1 and φ̂_m approach the actual φ_1 and φ_m. They also indicate that a sample size greater than 300 gives reasonable approximations of φ̂_1 and φ̂_m. In the case of delay 120, the process starts with a sample size of 300, and the estimated φ̂_1 and φ̂_m approach the actual values when the sample size is about 500. We note that Eq. (5) is an effective formula for approximating the m-delay parameters φ_1 and φ_m.
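The examination scheme above can be sketched as follows (a simplified version of the Monte Carlo scheme; the function name and the replication count are ours): estimates from Eq. (5) are averaged over L simulated series.

```python
import numpy as np

def monte_carlo_mean_estimates(phi1, phim, m, n, L, seed=0):
    """Average the least-squares estimates of (phi1, phim) over L series
    simulated from the m-delay AR model (2), each of length n with the
    first m observations set to zero."""
    rng = np.random.default_rng(seed)
    sums = np.zeros(2)
    for _ in range(L):
        x = np.zeros(n)
        for t in range(m, n):
            x[t] = phi1 * x[t-1] + phim * x[t-m] + rng.normal()
        y = x[m:]
        A = np.column_stack([x[m-1:-1], x[:-m]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sums += coef
    return sums / L
```

With a sample size above a few hundred, the averaged estimates settle close to the true (φ_1, φ_m), consistent with the simulation results reported above.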

Determination of optimal delay (m)
To find the optimal delay, we develop the algorithm shown below based on the brute-force technique: we fix the m_0-delay AR parameters φ_1 and φ_{m_0} for the initial delay m_0, and select the sample size n and the prescribed maximum delay n/2 − 1.
The above algorithm begins by prescribing the input parameters, including the sample size (n), the initial delay (m_0) and the actual m-delay AR parameters (φ_1, φ_{m_0}), and setting the values of all m_0 initial observations to zero. We then generate the data set (x_t) by Eq. (2). The process starts with delay 3 at the first iteration. The two unknown parameters (φ̂_1, φ̂_m) are calculated by Eq. (5) and the minimum error (minE) is obtained. We repeat the process, increasing the delay by one each time. In each iteration, the minimum error (minE) and the corresponding minimum delay (minD) are updated. The process stops when the delay equals the prescribed delay. A run is counted as a success when the minimum delay equals the initial delay, i.e., the optimal delay (m_op) is recovered for the sample size n. We then go back to Step 2 until the maximum number of iterations is reached. Finally, the accuracy of the delay estimation based on the brute-force technique is computed by Eq. (11).
To illustrate the performance of the brute-force technique, we compute the accuracy on the simulated data set for each sample size. The accuracy of the delay estimation is computed by

Accuracy (%) = (Number of successful outcomes / Number of possible outcomes) × 100.   (11)
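The brute-force search itself can be sketched as follows (the helper `optimal_delay` is ours; comparing the mean rather than the sum of squared errors across delays is our choice, since the number of fitted points shrinks as m grows):

```python
import numpy as np

def optimal_delay(x, max_delay):
    """Brute-force search over delays m = 3, ..., max_delay for the delay
    whose two-parameter least-squares fit of Eq. (5) gives the smallest
    mean squared error; returns the minimizing delay (minD)."""
    x = np.asarray(x, dtype=float)
    best_m, best_err = None, np.inf
    for m in range(3, max_delay + 1):
        y = x[m:]
        A = np.column_stack([x[m-1:-1], x[:-m]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        err = np.mean((y - A @ coef) ** 2)
        if err < best_err:
            best_m, best_err = m, err
    return best_m
```

On a long series simulated with initial delay m_0 = 10, the search recovers m_op = 10 with high probability, consistent with the accuracy figures reported below.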
Using the brute-force algorithm, our experiments show that the sample size has a significant effect on the accuracy of the m-delay AR model, as shown in Fig. 4. To investigate this effect, as revealed in Fig. 4(a), we choose 24 sample sizes between 60 and 3000, giving accuracies from 27.85% to 99.90% and from 0.22% to 99.90% for delays 5 and 20, respectively. The accuracy for delays 100 and 120 is shown in Fig. 4(b): sample sizes from 700 to 3000 give accuracies of 6% to 99.90% and 0.10% to 99.90%, respectively.

Empirical study
In this section, we first test the performance of the MAR model compared with the AR model, as shown in Algorithm 2. We then apply both models to approximate the monthly mean minimum temperature in Perth, Western Australia.

Monte Carlo simulation
Algorithm 2 below compares the performance of the MAR model (based on only two parameters, namely φ_1 and φ_m) with the AR model (via m coefficients, φ_1, φ_2, . . . , φ_m).

Algorithm 2
Step 1 Prescribe the sample size (n), the parameters of the AR model (φ_1, φ_{m_0}), the initial delay (m_0) and maxL; set L = 0, sumE_AR = 0, sumE_MAR = 0 and x_t = 0 for t = 1, 2, . . . , m_0.
Step 2 Generate {x_t}_{t=m_0+1}^{n} from Eq. (2).
Step 3 Find the root mean square error of the AR model (RMSE_AR): estimate the coefficients of the AR model (φ̂_1, φ̂_2, . . . , φ̂_{m_0}), compute RMSE_AR and set sumE_AR = sumE_AR + RMSE_AR.
Step 4 Find the root mean square error of the MAR model (RMSE_MAR): set m = 3 and calculate φ̂_1, φ̂_m from Eq. (5).
Step 5 If m < n/2 − 1, increase m by one and go to Step 4.
Step 6 Set the optimal delay m_op = minD and calculate RMSE_MAR using the optimal delay m_op.
Step 7 Set sumE_MAR = sumE_MAR + RMSE_MAR and L = L + 1.
Step 8 If L ≤ maxL, go to Step 2.
Step 9 Compute the average root mean square errors of both models: RMSE_AR = sumE_AR/L and RMSE_MAR = sumE_MAR/L.

This algorithm starts by determining the input parameters, including the sample size (n) and the actual parameters of the AR process with delay (φ_1, φ_{m_0}), and setting the values of all m_0 initial observations to zero. We then generate the data set (x_t) by Eq.
(2). To find the root mean square error of the AR model (RMSE_AR), we first estimate the model parameters and then compute the root mean square error by Eq. (10). In the MAR model, the computation of the optimal delay starts at delay 3. The unknown parameters (φ̂_1, φ̂_m) are calculated by Eq. (5), and the minimum error (minE) and the minimum delay (minD) are obtained. The process is repeated until the delay reaches the prescribed delay n/2 − 1. This step returns the optimal delay with the smallest error for each iteration (m_op = minD), and we also calculate RMSE_MAR using the optimal delay (m_op). The process goes back to Step 2 until the total number of iterations equals the maximum number of iterations. Finally, we compute the average root mean square errors of both models.
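The two error computations in the algorithm can be sketched as follows (the helper names are ours): the full AR(m) fit uses all m lags, while the MAR fit uses only lags 1 and m.

```python
import numpy as np

def rmse_ar(x, m):
    """In-sample RMSE of the classical AR(m) model fitted by least squares
    over all m lagged regressors (a sketch of Step 3)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    y = x[m:]
    A = np.column_stack([x[m-k:n-k] for k in range(1, m + 1)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sqrt(np.mean((y - A @ coef) ** 2)))

def rmse_mar(x, m):
    """In-sample RMSE of the two-parameter m-delay AR fit (a sketch of Step 6)."""
    x = np.asarray(x, dtype=float)
    y = x[m:]
    A = np.column_stack([x[m-1:-1], x[:-m]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sqrt(np.mean((y - A @ coef) ** 2)))
```

On data generated from model (2) with σ = 1, both errors should be close to 1, with the full AR fit never larger in-sample, since its regressors include those of the MAR fit.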
Tab. 2 compares the average root mean square errors obtained from the AR model (RMSE_AR) and the MAR model (RMSE_MAR). The experimental results shown in Tab. 2 are generated using φ_1 = 0.5, φ_m = 0.3, m_0 = 10 and L = 10000. We now use the independent-sample t-test to confirm that the results obtained from both models are not significantly different. The null and alternative hypotheses are RMSE_AR = RMSE_MAR and RMSE_AR ≠ RMSE_MAR, respectively.
The result of hypothesis testing is shown in Tab. 3. It indicates that there is not enough evidence to reject the null hypothesis at significance level 0.05, as the p-value > 0.05. This means that our proposed MAR model is as efficient for time series prediction as the AR model.

Validation of the m-delay AR model
This section concerns the prediction of the monthly mean minimum temperature in Perth, Western Australia, using data obtained from the Bureau of Meteorology, Australia, between January 1994 and June 2019, containing 306 observations. The data are divided into two sets: a training set and a testing set. The training set, from January 1994 to December 2017, has 288 observations. The rest, from January 2018 to June 2019, forms the testing set of 18 observations. As the training data have a seasonal pattern, as shown in Fig. 5(a), we first deseasonalize the original data by taking the seasonal difference (x_t − x_{t−12}). The result after transformation is shown in Fig. 5(b). As the AR model requires stationary data, we then check two features of the transformed data: normality and independence [Razali and Wah (2011)]. To check whether the data are normally distributed, we apply the Shapiro-Wilk test; the result is shown in Tab. 4. In Tab. 5, the p-value of 0.911 indicates that the data are independent. We can now use these transformed data, which are normally distributed and independent, to find a suitable AR model.
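The preprocessing described above can be sketched on synthetic monthly data (the Perth series itself is not reproduced here; SciPy's `shapiro` is used for the normality test):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Synthetic monthly series with a yearly seasonal pattern, standing in for
# the 288-observation training data.
months = np.arange(288)
x = 12.0 + 5.0 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 0.8, 288)

# Seasonal differencing with period 12 removes the yearly pattern.
dx = x[12:] - x[:-12]

# Shapiro-Wilk test for normality of the deseasonalized series; a large
# p-value means no evidence against normality.
stat, p_value = stats.shapiro(dx)
```

After the seasonal difference, the deterministic yearly component cancels exactly and only (near-normal) noise remains, which is what the Shapiro-Wilk test then assesses.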

The m-delay AR model
From Eq. (5) and using the transformed data, we obtain the m-delay coefficients φ̂_1 = 0.194561 and φ̂_9 = 0.027676, which satisfy the characteristic Eq. (7) under the inequality condition |φ̂_1| + |φ̂_m| < 1. The results in Tabs. 8 and 9 show that the residuals of the MAR model are normally distributed and independent, with p-values of 0.485 and 0.916, respectively.

Predictive modeling
Using the testing set of 18 observations from January 2018 to June 2019, shown in column 2 of Tab. 10, we apply the AR(9) model and the proposed MAR model to predict the monthly mean minimum temperature, as shown in columns 3 and 4 of Tab. 10. Fig. 7 and Tab. 10 compare the forecast monthly mean minimum temperatures obtained from the MAR model and the AR model with the observed data. Together with the results in Tabs. 2 and 3, this confirms that there is no significant difference between the results of the MAR model and the AR(9) model. Consequently, the m-delay AR model is an effective model for time series prediction.

Conclusion and discussion
In the classical AR model with a large delay, a large number of coefficients need to be determined. In this paper, we consider a special case of the classical AR model, particularly for large delays, and propose the optimal m-delay AR model using only two parameters. This reduces the computation time while giving comparable results. We outline future developments and possible applications below.
• For parameter estimation, we only modified the least squares method to approximate the coefficients of the m-delay AR model. Several other techniques could be used to approximate these AR parameters, namely the maximum likelihood method, Yule-Walker estimation and the method of moments.
• Based on the simulation results in Section 4, the root mean square error is utilized. For future development, additional information criteria could be used to choose the optimal delay, such as the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Minimum Description Length (MDL) [Dickie and Nandi (1994)], the Predictive Least Squares (PLS), the Predictive Densities Criterion (PDC) and the Sequentially Normalized Maximum Likelihood (SNML) criterion [Giurcȃneanu and Razavi (2008)], the Final Prediction Error (FPE) [Akaike (1970)] and the Criterion Autoregressive Transfer function (CAT) [Burshtein and Weinstein (1985)].
• In this work, we assume the random process to be a sequence of independent elements with a normal distribution of zero mean and constant variance. It is possible to change the random process to another distribution, such as the uniform or exponential distribution. Replacing the random process would probably affect the parameter estimation for the m-delay AR model.
• Our model assumes that the noise variance is finite and known. A challenging task is the case of unknown variance. Consequently, we may extend the model and estimate all unknown parameters, including the two coefficients, the delay and the noise variance, namely φ_1, φ_m, m and σ².
• Our model can be applied to historical financial data to analyze the dependence of time-scale volatility measurements.
• The experimental results show that the sample size is one of the key factors. It would be interesting to investigate the connection between the error (RMSE) and the sample size.