MODELING THE EFFECTS OF OUTLIERS ON THE ESTIMATION OF LINEAR STOCHASTIC TIME SERIES MODEL

ABSTRACT. This study investigates the effects of outliers on the estimates of ARIMA model parameters, with particular attention to the performance of two outlier detection and modeling methods aimed at achieving more accurate parameter estimates. The two methods considered are: an iterative outlier detection procedure that jointly estimates the model parameters and the outlier effects, and an iterative outlier detection procedure in which the outlier effects are removed to obtain an outlier-free series, to which a suitable ARIMA model is then fitted. We explored the daily closing share price returns of Fidelity Bank, Union Bank of Nigeria, and Unity Bank from 03/01/2006 to 24/11/2016, with each series consisting of 2690 observations from the Nigerian Stock Exchange. ARIMA(1, 1, 0) models, selected on the basis of the minimum Akaike information criterion, fitted the outlier-contaminated series of the respective banks well. Our findings revealed that the ARIMA(1, 1, 0) models fitted to the outlier-free series outperformed the models in which the parameters and outlier effects were jointly estimated. Furthermore, we discovered that outliers biased the estimates of the model parameters by reducing their estimated values. The implication is that, in order to achieve more


INTRODUCTION
Outliers are a common feature of time series. In general, outliers are extreme observations that deviate from the overall pattern of the sample. Statistically, outliers are often taken to be observations whose standardized values exceed 3 in absolute value, a threshold that coincides with the kurtosis of the normal distribution. The effects of outliers on linear time series models cannot be overemphasized; they include false inference, biased model parameters, model misspecification and misleading confidence intervals ([1], [2], [3], [4]).
By efficiency, we mean the goodness of an estimator of a model as measured by its variance; that is, the model with the smallest variance is considered superior with regard to efficiency. To address the need for efficient estimates of model parameters in the presence of outliers, this study applied two outlier identification and modeling methods. The first is the modified iterative method proposed by [5], which involves the joint estimation of the model parameters and the magnitude of the outlier effects. The second is the modified iterative method proposed by [6], which identifies outliers sequentially by searching for the most relevant anomaly, estimating its effect and removing it from the data. The model parameters are then re-estimated on the outlier-corrected series, and the process is iterated until no significant perturbation is found.
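The sequential idea behind the second method (search for the most relevant anomaly, estimate its effect, remove it, re-estimate, repeat) can be sketched in a few lines. This is only a simplified illustration under stated assumptions, not the procedure of [5] or [6]: it assumes a zero-mean stationary AR(1) model, considers additive outliers only, uses a robust (MAD-based) residual scale, and all function names, the critical value 3.5 and the simulated data are hypothetical choices made for the example.

```python
import random
import statistics

def simulate_ar1(n, phi, seed):
    """Simulate a zero-mean AR(1) series with unit-variance Gaussian innovations."""
    rng = random.Random(seed)
    y, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        y.append(prev)
    return y

def fit_ar1(y):
    """Conditional least-squares estimate of phi in y_t = phi * y_{t-1} + a_t."""
    num = sum(curr * prev for prev, curr in zip(y, y[1:]))
    den = sum(prev * prev for prev in y[:-1])
    return num / den

def ao_candidates(y, phi):
    """Return (|tau|, t, omega_hat) for a possible additive outlier at each t >= 1."""
    # Residuals; e[0] is a start-up value because y[0] has no predecessor.
    e = [0.0] + [curr - phi * prev for prev, curr in zip(y, y[1:])]
    med = statistics.median(e[1:])
    sigma = 1.483 * statistics.median([abs(v - med) for v in e[1:]])  # robust scale
    n, out = len(e), []
    for t in range(1, n):
        if t + 1 < n:
            # For an AR(1), an additive outlier shows up in e[t] and, times -phi, in e[t+1].
            norm = 1.0 + phi * phi
            omega = (e[t] - phi * e[t + 1]) / norm
        else:
            norm, omega = 1.0, e[t]
        out.append((abs(omega) * norm ** 0.5 / sigma, t, omega))
    return out

def iterative_ao_cleaning(y, critical=3.5, max_iter=50):
    """Repeatedly remove the most significant additive outlier and re-estimate phi."""
    y, found = list(y), []
    for _ in range(max_iter):
        phi = fit_ar1(y)
        tau, t, omega = max(ao_candidates(y, phi))
        if tau < critical:
            break  # no significant perturbation left
        y[t] -= omega  # remove the estimated outlier effect
        found.append((t, omega))
    return y, found, fit_ar1(y)

# Hypothetical example: five additive outliers injected into a simulated AR(1) series.
true_outliers = {50: 12.0, 150: -12.0, 250: 12.0, 350: -12.0, 450: 12.0}
clean = simulate_ar1(500, phi=0.5, seed=1)
contaminated = [v + true_outliers.get(t, 0.0) for t, v in enumerate(clean)]
phi_contaminated = fit_ar1(contaminated)
adjusted, found, phi_adjusted = iterative_ao_cleaning(contaminated)
```

In this toy setting the estimate from the contaminated series is pulled towards zero, while the estimate from the adjusted series moves back towards the value used in the simulation, which is the kind of downward bias the study reports for the bank return series.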
The motivation for this study derives from the fact that previous studies such as [7], [8], [9], [10] did not consider outliers when modeling return series in Nigeria.
Thus, this gap in knowledge is fully addressed in our work.
This work is further organized as follows: section 2 takes care of materials and methods; section 3 handles the results and discussion while section 4 treats the conclusion.

Return Series
The return series ($R_t$) can be obtained given that $P_t$ is the price of a unit share at time $t$ and $P_{t-1}$ is the price of the share at time $t-1$. Thus
$$R_t = \nabla \ln P_t = (1 - B)\ln P_t = \ln P_t - \ln P_{t-1}. \qquad (1)$$
In equation (1), $R_t$ is regarded as a transformed series of the share price ($P_t$), meant to attain stationarity such that both the mean and the variance of the series are stable [11], while $B$ is the backshift operator.
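As a minimal sketch, assuming the return in equation (1) is the usual first difference of log prices, the transformation can be computed as follows. The function name and the price values are hypothetical and serve only to illustrate the calculation.

```python
import math

def log_returns(prices):
    """Return ln(P_t) - ln(P_{t-1}) for each consecutive pair of prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    logs = [math.log(p) for p in prices]
    return [curr - prev for prev, curr in zip(logs, logs[1:])]

# Hypothetical daily closing prices (not taken from the study's data).
prices = [2.50, 2.55, 2.40, 2.40, 2.64]
returns = log_returns(prices)
```

A series of n prices yields n - 1 returns, and the returns sum to the log of the ratio of the last price to the first.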

Autoregressive Integrated Moving Average (ARIMA) Model
[3] considered the extension of the ARMA model to deal with homogeneous non-stationary time series in which $X_t$ is non-stationary but its $d$th difference is a stationary ARMA process.
Denoting the $d$th difference of $X_t$ by $\nabla^d X_t = (1 - B)^d X_t$, the model is
$$\varphi(B) X_t = \phi(B) \nabla^d X_t = \theta(B) a_t, \qquad (2)$$
where $\varphi(B)$ is the nonstationary autoregressive operator such that $d$ of the roots of $\varphi(B) = 0$ are unity and the remainder lie outside the unit circle, while $\phi(B)$ is a stationary autoregressive operator, $\theta(B)$ is the moving average operator and $a_t$ is the innovation. It should be noted that in equation (2), the presence of outliers is not taken into consideration.
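For the ARIMA(1, 1, 0) case used later in the paper, the model reduces to an AR(1) for the first-differenced series. The sketch below illustrates this with a conditional least-squares estimate on simulated data; it assumes a zero-mean model with no constant term, and the function names, simulation settings and estimator are illustrative choices rather than the estimation routine used by the authors.

```python
import random

def difference(series):
    """First difference: w_t = x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(w):
    """Conditional least-squares fit of w_t = phi * w_{t-1} + a_t (no constant)."""
    num = sum(curr * prev for prev, curr in zip(w, w[1:]))
    den = sum(prev * prev for prev in w[:-1])
    phi = num / den
    resid = [curr - phi * prev for prev, curr in zip(w, w[1:])]
    sigma2 = sum(e * e for e in resid) / len(resid)
    return phi, resid, sigma2

def simulate_arima110(n, phi, seed=0):
    """Simulate x_t whose first difference follows an AR(1) with coefficient phi."""
    rng = random.Random(seed)
    w_prev, level, out = 0.0, 0.0, []
    for _ in range(n):
        w = phi * w_prev + rng.gauss(0.0, 1.0)
        level += w          # integrate the stationary differences once
        out.append(level)
        w_prev = w
    return out

# Hypothetical example with a known coefficient.
x = simulate_arima110(5000, phi=0.3, seed=42)
phi_hat, resid, sigma2_hat = fit_ar1(difference(x))
```

The residuals returned by `fit_ar1` play the role of the innovations on which residual diagnostics and outlier detection operate.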

ARIMA Model for Outlier-Adjusted Return Series
where $X_t$ is the outlier-free series. Meanwhile, equations (3) and (4) represent major modifications of equation (2) to account for the presence of outliers.

Outliers in Time Series
Generally, four types of outliers are identified in time series: additive outlier, innovative outlier, level shift and temporary change [12].

Additive Outlier (AO)
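A time series $Y_1, \ldots, Y_n$ affected by the presence of an additive outlier at t = T is given by
$$Y_t = X_t + \omega I_t^{(T)}$$
for t = 1, …, n, where
$$I_t^{(T)} = \begin{cases} 1, & t = T, \\ 0, & t \neq T, \end{cases}$$
is the indicator variable representing the presence or absence of an outlier at time T, $X_t$ follows an ARIMA model, and $\omega$ is the outlier size. Hence, an additive outlier affects only a single observation (see also [1], [12], [3], [4]).
Meanwhile, according to [12], the innovations of a time series $Y_1, \ldots, Y_n$ are affected by
$$e_t = a_t + \omega \pi(B) I_t^{(T)},$$
where $e_t = \pi(B) Y_t$ are the innovations computed from the contaminated series, $\pi(B)$ is the autoregressive representation of the ARIMA model, and $a_t$ are the innovations of the uncontaminated series $X_t$.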

Level Shift (LS)
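A time series $Y_1, \ldots, Y_n$ affected by the presence of a level shift at t = T is given by
$$Y_t = X_t + \omega S_t^{(T)},$$
where $S_t^{(T)} = (1 - B)^{-1} I_t^{(T)}$. Note that a level shift affects all the observations of the series from t = T onwards. Hence, according to [12], a level shift serially affects the innovations as follows:
$$e_t = a_t + \omega\, l(B) I_t^{(T)},$$
where $l(B) = (1 - B)^{-1} \pi(B)$.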

Temporary Change (TC)
A time series $Y_1, \ldots, Y_n$ affected by the presence of a temporary change at t = T is given by
$$Y_t = X_t + \frac{\omega}{1 - \delta B} I_t^{(T)},$$
where $\delta$ is an exponential decay parameter such that $0 < \delta < 1$. If $\delta$ tends to 0, the temporary change reduces to an additive outlier, whereas if $\delta$ tends to 1, the temporary change reduces to a level shift. The temporary change affects the innovations as follows:
$$e_t = a_t + \omega \frac{\pi(B)}{1 - \delta B} I_t^{(T)}.$$
If $\pi(B)$ is close to $1 - \delta B$, the effect of a temporary change on the innovations is very close to the effect of an innovative outlier. Otherwise, the temporary change can affect several observations with a decreasing effect after t = T [12].
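The relationship between the three outlier types described above can be made concrete with a short sketch. It treats the additive outlier and the level shift as the limiting cases of a temporary change with decay 0 and 1 respectively, and it illustrates the remark about the innovations under the assumption of a stationary AR(1) model, for which the autoregressive operator applied to the series is 1 minus the AR coefficient times B. The function names, the default decay of 0.7 and the 0-based time index are choices made for the example.

```python
def outlier_effect(n, T, omega, kind, delta=0.7):
    """Effect added to the series at times 0..n-1 by an outlier of size omega at time T."""
    decay = {"AO": 0.0, "TC": delta, "LS": 1.0}[kind]
    # omega * decay**(t - T) from T onwards; decay 0 gives a single spike, decay 1 a step.
    return [omega * decay ** (t - T) if t >= T else 0.0 for t in range(n)]

def effect_on_innovations(effect, phi):
    """Apply (1 - phi*B) to the outlier effect (AR(1) assumption)."""
    return [effect[0]] + [curr - phi * prev for prev, curr in zip(effect, effect[1:])]

ao = outlier_effect(6, T=2, omega=5.0, kind="AO")
ls = outlier_effect(6, T=2, omega=5.0, kind="LS")
tc = outlier_effect(6, T=2, omega=5.0, kind="TC", delta=0.5)

# When the decay equals the AR coefficient, the temporary change leaves a single
# spike in the innovations, which is the footprint of an innovative outlier.
tc_innovations = effect_on_innovations(tc, phi=0.5)
ao_innovations = effect_on_innovations(ao, phi=0.5)
```

Here `tc` decays geometrically after time T, while `tc_innovations` is non-zero only at T because the decay and the AR coefficient coincide.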

Time Plots
Inspecting the plots in Figures 1-3, it is obvious that they are characterized by upward and downward movements away from the common mean, which clearly indicates the existence of nonstationarity.

Linear Time Series Modeling of Return Series of Union Bank
From Figures 9 and 10 and Table VI, the ARIMA(1, 1, 0) model is selected on the grounds that its parameter is significant and it has the minimum AIC.

Figure 9: ACF of Return Series of Union Bank
Figure 10:
A time series $Y_1, \ldots, Y_n$ affected by the presence of an innovative outlier at t = T is given

Table II: Types of Outliers Identified in the Residual Series of ARIMA(1, 1, 0) Model fitted to the Return Series of Fidelity Bank
To account for the effect of outliers, the parameter of the ARIMA(1, 1, 0) model is estimated jointly with the effects of the outliers identified in Table II, as indicated in Table III. Comparing the joint model of ARIMA(1, 1, 0) with outlier effects (AIC = −11922.67, log likelihood = 5979.34) with the ARIMA(1, 1, 0) model (AIC = −11562.17, log likelihood = 5783.09), the joint model has a lower AIC and a higher log likelihood, making it a better model than the ARIMA(1, 1, 0) model in which the influence of outliers is not taken into consideration.
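The comparison above can be checked with a few lines of arithmetic. The sketch below assumes the standard definition AIC = −2 × log likelihood + 2k, where k is the number of estimated parameters; the implied k is a back-calculation from the reported figures, not a value stated in the source, and the function and variable names are hypothetical.

```python
def implied_parameter_count(aic, loglik):
    """Solve AIC = -2*loglik + 2*k for k (standard AIC definition assumed)."""
    return (aic + 2.0 * loglik) / 2.0

# AIC and log likelihood values as reported for Fidelity Bank.
models = {
    "ARIMA(1,1,0)": {"aic": -11562.17, "loglik": 5783.09},
    "ARIMA(1,1,0) + outlier effects": {"aic": -11922.67, "loglik": 5979.34},
}

implied_k = {
    name: implied_parameter_count(m["aic"], m["loglik"]) for name, m in models.items()
}
best = min(models, key=lambda name: models[name]["aic"])  # smaller AIC is preferred
aic_gain = models["ARIMA(1,1,0)"]["aic"] - models[best]["aic"]
```

Under this definition the reported values imply roughly 2 estimated parameters for the plain model and roughly 18 for the joint model, so the joint model is preferred by AIC even after the penalty for the additional outlier terms.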

Model for Outlier-Adjusted Return Series of Fidelity Bank
Here, the second method is applied, namely the removal of the outlier effects to obtain an outlier-adjusted series. The ARIMA(1, 1, 0) model then fitted the outlier-adjusted series well, with its parameter significant at the 5% level [see Table IV], and is found to be adequate given the … On comparing the estimates of the ARIMA(1, 1, 0) model fitted to the outlier-contaminated series with those of the ARIMA(1, 1, 0) model adjusted for outliers using the two proposed methods, it is found that the estimates from the joint ARIMA(1, 1, 0) model with outlier effects and from the ARIMA(1, 1, 0) model fitted to the outlier-adjusted series are the same. However, the latter tends to outperform the former on the basis of the smallest information criteria. Of paramount interest is the discovery that outliers introduced a substantial bias of 0.0109 in the estimate of the ARIMA(1, 1, 0) model, as shown in Table V. Again, the modified iterative method produced the model with the smallest variance, as indicated in Table V, and is hence adjudged the most efficient method.

Table VII: Types of Outliers Identified in the Residual Series of ARIMA(1, 1, 0) Model fitted to the Return Series of Union Bank
Again, applying the first method as indicated in Table VIII, the joint model of ARIMA(1, 1, 0) with outlier effects has AIC = −11560.27 and log likelihood = 5800.13, compared with AIC = −9132.26 and log likelihood = 4567.13 for the ARIMA(1, 1, 0) model. The AIC is smaller and the log likelihood is higher for the joint model, making it a better model than the latter.

Model for Outlier-Adjusted Return Series of Union Bank
Using the second method, the effects of the outliers are removed and an ARIMA(1, 1, 0) model is then fitted to the outlier-adjusted series, with its parameter significant at the 5% level [Table IX]. The model is found to be adequate at the 5% level of significance given the Q-statistics at lags 1, 14, 18, and 24, with Q(1) = 0.0030, Q(14) = 19.228, Q(18) = 24.611 and Q(24) = 27.717 and corresponding p-values of 0.956, 0.1564, 0.136, and 0.2722.
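For readers who wish to reproduce this kind of adequacy check, the sketch below computes a portmanteau Q-statistic from a residual series. It assumes the Q-statistics quoted above are of the Ljung-Box form, which the source does not state explicitly; the function names and the simulated residuals are hypothetical. The p-values are not computed here because they require the chi-square distribution and a choice of degrees of freedom that the source does not report.

```python
import random

def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return ck / c0

def ljung_box_q(resid, max_lag):
    """Ljung-Box statistic Q(m) = n(n+2) * sum_{k=1}^{m} r_k^2 / (n - k)."""
    n = len(resid)
    return n * (n + 2) * sum(
        autocorrelation(resid, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )

# Hypothetical residuals: simulated white noise standing in for model residuals.
rng = random.Random(7)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
q_values = {lag: ljung_box_q(white_noise, lag) for lag in (1, 14, 18, 24)}
```

Small Q-values relative to the chi-square reference distribution, as in the figures reported for Union Bank, indicate no significant autocorrelation left in the residuals.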

Table IX: ARIMA(1, 1, 0) Model for Outlier-Adjusted Return Series of Union Bank
Again, the effect of outliers on the estimate of the ARIMA(1, 1, 0) model fitted to the return series of Union Bank is similar to that for Fidelity Bank, although here the estimate of the model is reduced by 0.164. The modified iterative method is also adjudged superior in terms of efficiency, given that it produced the model with the minimum variance, as shown in Table X.

Table X: Effect of Outliers on Estimate of ARIMA(1, 1, 0) Model for Return Series of Union Bank
ARIMA(1, 1, 0) model and outliers effects is shown in Table XII.