Robust critical values for unit root tests for series with conditional heteroscedasticity errors: An application of the simple NoVaS transformation

Abstract: In this paper, we introduce a set of critical values for unit root tests that are robust in the presence of conditional heteroscedasticity errors, constructed using the normalizing and variance-stabilizing (NoVaS) transformation, and examine their properties using Monte Carlo methods. In terms of the size of the test, our analysis reveals that unit root tests with NoVaS-modified critical values have actual sizes close to the nominal size. In terms of power, we find that unit root tests with NoVaS-modified critical values have either the same power as, or slightly better power than, tests using conventional Dickey–Fuller critical values across the range of sample sizes considered.


ABOUT THE AUTHOR
Panagiotis Mantalos, PhD, is an associate professor of statistics and econometrics in the Department of Economics and Statistics at Linnaeus University, Sweden. His primary field of research is the study and development of statistical methodology for diagnostic testing using bootstrap methods, which assist in evaluating and building robust models. His research includes time series, dynamical models in economics, econometrics and financial data econometrics. He has published numerous articles, both theoretical and applied, which can be used to develop strategies for selecting appropriate models. Moreover, his bootstrap strategies help to avoid inadequate models, misleading results and incorrect conclusions.

PUBLIC INTEREST STATEMENT
A stationary time series is one whose statistical properties, such as mean, variance and autocorrelation, are all constant over time. Many economic and financial time series exhibit trending behaviour or nonstationarity in the mean; if the data are trending, then some form of trend removal is required. Two common trend removal procedures are first differencing and time-trend regression. Unit root tests can be used to determine which kind of trend is present; their null hypothesis is that the series has a unit root (is difference stationary), and the alternative is that it is stationary or trend stationary.
However, the limiting distributions of the test statistics are affected by volatility, that is, the time-varying variance of financial series. We introduce here a technique that transforms the series so that it is approximately standard normal (NoVaS), so that the test procedures are not affected by possible volatility. We also show by simulation experiments that this technique works well under different models of volatility.

Introduction
Recently, there has been great interest among researchers and practitioners in the simultaneous presence of high persistence and conditional heteroscedasticity. The main reason is their occurrence in many economic time series, including stock index and exchange rate series, both common subjects for unit root tests. Alongside burgeoning interest in the second moment (or variance), the first difference of economic variables (here, stock index returns and exchange rate changes) has also been the subject of a sizeable modelling literature in finance and related disciplines, with the most representative approach being the autoregressive conditional heteroscedasticity (ARCH) and generalized ARCH (GARCH) family of models.
Subsequent to the seminal work on ARCH/GARCH by Engle (1982) and Bollerslev (1986), the introduction of new models has continued apace, a common feature being that they estimate and allow for time-varying volatility. A number of surveys of GARCH models are available, with Giraitis, Leipus, and Surgailis (2006) providing a useful reference to existing work across the various models that have predictive ability for squared returns (or variance). Within this body of work, Pantula (1989) is one of the first studies to investigate non-stationary univariate autoregressive models with a regular unit root and a first-order ARCH error, deriving the asymptotic distribution of the least squares estimator (LSE) of the unit root and showing that it is the same as that given by Dickey and Fuller (1979).
Subsequently, Kim and Schmidt (1993) employed Monte Carlo simulation to show that the Dickey-Fuller (DF) tests tend to over-reject in the presence of GARCH errors. However, they also found that the problem was not very serious, the exception being a nearly integrated GARCH process in which the volatility parameter was not small. Ling and Li (1998a) later investigated an autoregressive-integrated moving average (ARIMA) model with GARCH(p, q) errors and derived the asymptotic distribution of the maximum likelihood estimator (MLE). They found that the asymptotic distribution of the MLE of various unit roots involves a series of bivariate Brownian motions and that the MLE of unit roots is more efficient than the corresponding LSE when GARCH innovations are present. Using these asymptotic distributions, Ling and Li (1998b) later constructed some new unit root tests and showed that these tests could provide better performance than DF tests in Monte Carlo simulations.
While there are differences across these and other studies, they all suggest that unit root tests require a set of robust critical values that apply in the presence of GARCH errors. For this purpose, this paper provides a simple and easy-to-apply algorithm using the normalizing and variance-stabilizing transformation (NoVaS) in Politis (2007) to produce robust critical values for ensuring DF tests have appropriate size and adequate power in the presence of conditional heteroscedasticity. However, because the subject of unit root tests is large, we limit the analysis here to investigating and demonstrating our procedure as it applies to unit root tests in the absence of serial correlation.
The remainder of the paper is organized as follows: Section 2 presents the simple NoVaS and the unit root test as it applies in the absence of serial correlation. In Section 3, we present our simulations and use these to establish the size and power properties of the tests. Section 4 provides a brief summary and the main conclusions.
Politis (2007) first defined and analysed the properties of the NoVaS transformation. Let us consider the NoVaS more closely and thereby observe the motivation for our study.

Simple NoVaS transformation
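Let $\{\varepsilon_t\}$ be a sequence of random variables following the ARCH(q) model
$$\varepsilon_t = e_t \sqrt{\omega + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2}, \tag{1.1}$$
where $\omega > 0$, $a_i \ge 0$, $1 \le i \le q$, and the innovations $e_t$ are i.i.d. N(0,1).
Note that we take the innovations $e_t$ to be standard normal because we can interpret the quantity
$$e_t = \frac{\varepsilon_t}{\sqrt{\omega + \sum_{i=1}^{q} a_i \varepsilon_{t-i}^2}} \tag{1.2}$$
as a "studentized" $\{\varepsilon_t\}$ sequence. Moreover, this provides an intuitive explanation of the NoVaS quantity in (1.3).
Equation (1.3) describes the NoVaS transformation proposed by Politis (2007), under which the data series $\{\varepsilon_t\}$ is mapped to the new series $\{u_{t,a}\}$, with $u_{t,a} = \varepsilon_t / \sqrt{W_{t,a}}$.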
The practitioner needs to select the order $k$ ($\ge 0$) and the vector of non-negative parameters with the twin goals of normalization and variance stabilization in mind. With a focus on these goals, Politis (2007) introduced the so-called simple NoVaS transformation for choosing the order $k$ and the parameters $a_i$ in (1.3). Mantalos and Karagrigoriou (2012) subsequently modified the algorithm for the simple NoVaS as follows.
Select the order $k$ ($> 0$) such that the p-value of the Jarque-Bera statistic for the NoVaS-transformed series $u_{t,a}$ is maximized. In our analysis, we undertake this in a loop for $k = 1$ to 25.
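The order selection above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: it assumes the simple NoVaS uses equal weights $a = 1/(k+1)$, so that $W_{t,a}$ is the average of $\varepsilon_t^2, \ldots, \varepsilon_{t-k}^2$ (the form implied by the moment expressions in Appendix A), and the function names `simple_novas` and `select_order` are hypothetical.

```python
import numpy as np
from scipy import stats


def simple_novas(eps, k):
    """Return (u, W) with u_t = eps_t / sqrt(W_t), t = k, ..., n-1.

    Assumed form: W_t is the equal-weight average of eps_t^2, ..., eps_{t-k}^2.
    """
    eps = np.asarray(eps, dtype=float)
    csum = np.concatenate(([0.0], np.cumsum(eps ** 2)))
    # Rolling sum over k + 1 consecutive squared values, divided by k + 1.
    w = (csum[k + 1:] - csum[:-(k + 1)]) / (k + 1)
    return eps[k:] / np.sqrt(w), w


def select_order(eps, k_max=25):
    """Loop over k = 1, ..., k_max and keep the k with the largest Jarque-Bera p-value."""
    best_k, best_p = 1, -1.0
    for k in range(1, k_max + 1):
        u, _ = simple_novas(eps, k)
        _, p_value = stats.jarque_bera(u)
        if float(p_value) > best_p:
            best_k, best_p = k, float(p_value)
    return best_k, best_p
```

Because $W_{t,a}$ contains $\varepsilon_t^2$ itself, the transformed values are bounded by $\sqrt{k+1}$ in absolute value, which is why very small orders tend to produce a series that is too light-tailed and the Jarque-Bera criterion favours a moderate $k$.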
Consider now that $u_{t,a} = \varepsilon_t / \sqrt{W_{t,a}}$ is the new NoVaS-transformed innovation. Then the transformed series has the following mean:
$$E(u_{t,a}) = E\left(\frac{\varepsilon_t}{\sqrt{W_{t,a}}}\right). \tag{1.4}$$
It is not difficult to show, based on the assumption that the innovations $e_t$ are i.i.d. N(0,1) (see Equation (1.1)), that (1.4) becomes zero, that is, $E(u_{t,a}) = 0$.

The variance of the simple NoVaS-transformed series is derived in Appendix A.
Because the innovations $e_t$ are i.i.d. N(0,1), the skewness is zero. Lastly, the kurtosis is also derived in Appendix A. As Equations (1.5) and (1.6) provide zero skewness and kurtosis close to three, the transformed series exhibits approximately the same properties as an i.i.d. N(0,1) variable.

DF unit root tests in the absence of serial correlation
It is well known (Hamilton, 1994) that when testing for unit roots, it is important to specify the null and alternative hypotheses appropriately based on the type of data at hand. For example, if the data do not display any increasing or decreasing trend, then we should not include a time trend in the test regression. Moreover, the inclusion of deterministic terms will also influence the asymptotic distribution of the unit root test statistics. Consequently, given the importance of the correct alternative hypothesis and the trend type, the data will determine the form of the test regression used. For this reason, and also because non-trending financial series, such as interest and exchange rates and spreads, exhibit non-zero means and GARCH errors, the following first model is considered.

Model 1: Constant term in the regression
The test regression is
$$y_t = c + \rho y_{t-1} + \varepsilon_t \tag{1.8}$$
and the hypotheses to be tested are
$$H_0: \rho = 1 \quad \text{against} \quad H_1: |\rho| < 1.$$
The test statistic we use is that proposed by DF: $t = (\hat{\rho} - 1)/\mathrm{SE}(\hat{\rho})$.
We remind ourselves here that $\varepsilon_t$ follows an unknown ARCH/GARCH process that we could estimate, but which we do not require for the purpose of our test procedure.
The second model we consider is as follows:

Model 2: Time trend and constant term in the regression
The test regression is
$$y_t = c + \rho y_{t-1} + bt + \varepsilon_t. \tag{1.11}$$
In this case, we assume that the data under the null hypothesis, an example being a GDP series, follow a simple random walk with drift, that is,
$$y_t = c + y_{t-1} + \varepsilon_t, \tag{1.12}$$
whereas under the alternative hypothesis they follow an AR(1) model with $|\rho| < 1$ and trend. Even here, we use the DF test statistic $t = (\hat{\rho} - 1)/\mathrm{SE}(\hat{\rho})$. Note that this formulation is appropriate for time series such as asset prices, that is, trending time series with several possible kinds of GARCH errors.
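As an illustration of the statistic used in both models, the following Python sketch estimates the test regression by ordinary least squares and returns $t = (\hat{\rho} - 1)/\mathrm{SE}(\hat{\rho})$. It assumes the conventional OLS standard error with a degrees-of-freedom correction; the function name `df_tstat` and the `trend` flag are hypothetical names introduced for this example.

```python
import numpy as np


def df_tstat(y, trend=False):
    """DF statistic from y_t = c + rho*y_{t-1} (+ b*t) + eps_t.

    trend=False corresponds to Model 1, trend=True to Model 2.
    Returns (t_stat, coefficients, residuals); coefficients are (c, rho[, b]).
    """
    y = np.asarray(y, dtype=float)
    n = y.size - 1
    columns = [np.ones(n), y[:-1]]
    if trend:
        columns.append(np.arange(1, n + 1, dtype=float))
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ beta
    # Conventional OLS variance estimate (assumption of this sketch).
    sigma2 = resid @ resid / (n - X.shape[1])
    se_rho = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return (beta[1] - 1.0) / se_rho, beta, resid
```

The residuals are returned alongside the statistic because the critical-value procedure in the next section operates on them.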

NoVaS critical values for DF unit root tests in the presence of GARCH errors
The underlying idea of our procedure to produce robust critical values is simple and identical to the manner in which Dickey and Fuller (1979) produced their critical values for the unit root test: using Monte Carlo methods, we approximate the finite-sample distribution of the unit root test statistic.
Model 1: Consider a sequence of $T$ observations $y_1, y_2, \ldots, y_T$ for which we wish to test whether there is a unit root in the series. In addition, we assume that our observed data do not display any increasing or decreasing trend while at the same time exhibiting a non-zero mean.
The first step is to estimate the test regression
$$y_t = c + \rho y_{t-1} + \varepsilon_t \tag{1.13}$$
and calculate the DF test statistic $t = (\hat{\rho} - 1)/\mathrm{SE}(\hat{\rho})$.
We perform the simple NoVaS transformation on the estimated residuals from (1.13). Based on the simple NoVaS properties (1.5), (1.6) and (1.7), $u_t = W_{t,a}\, z_t$ is a good approximation of the error term in (1.13), where the $z_t$ are $T$ i.i.d. random numbers from the standard normal distribution.
The next step is to calculate the new series under the null hypothesis, $y_t^* = \sum_{i=1}^{t} u_i$.
We then estimate the regression $y_t^* = c^* + \rho^* y_{t-1}^* + e_t$ and afterwards the DF test statistic $t^* = (\hat{\rho}^* - 1)/\mathrm{SE}(\hat{\rho}^*)$.
Repeating steps 3-4 a large number of times, $N_{MC}$ (in our Monte Carlo simulations, $N_{MC} = 1{,}000$), we are able to approximate the finite-sample distribution of the DF unit root test.
By taking the $(1 - \alpha)$ quantile of the approximate finite-sample distribution of $t^*$, we obtain the $\alpha$-level "critical values" ($c_t^*$) and finally reject $H_0$ if $t^* \ge c_t^*$.
A Monte Carlo estimate of the p-value for testing is $P^*(t^* \ge t)$. We use this to study the size and power of the unit root test (with the assistance of the p-value plot).

Model 2:
In the case of a trend, the first step is to estimate the test regression
$$y_t = c + \rho y_{t-1} + bt + \varepsilon_t \tag{1.14}$$
and calculate the DF test statistic $t = (\hat{\rho} - 1)/\mathrm{SE}(\hat{\rho})$.
We perform the simple NoVaS transformation on the estimated residuals from (1.14). The next step is to calculate $u_t = W_{t,a}\, z_t$, where the $z_t$ are, as previously, $T$ i.i.d. random numbers from the standard normal distribution.
The next step is to calculate the new series under the null hypothesis,
$$y_t^* = \sum_{i=1}^{t} (\hat{c} + u_i). \tag{1.15}$$
We then estimate the regression $y_t^* = c^* + \rho^* y_{t-1}^* + b^* t + e_t$ and afterwards the DF test statistic $t^* = (\hat{\rho}^* - 1)/\mathrm{SE}(\hat{\rho}^*)$.
Steps 6-8 are the same as in Model 1.
Note that in Model 1 the null hypothesis does not contain a drift term, while in Model 2 the null hypothesis (1.12), $y_t = c + y_{t-1} + \varepsilon_t$, contains a drift term. For that reason, $y_t^*$ in Model 1 is $y_t^* = \sum_{i=1}^{t} u_i$, and in Model 2 it is given by (1.15).
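The procedure for both models can be sketched as follows. This is an illustration under several stated assumptions rather than a definitive implementation: the NoVaS order is fixed at a hypothetical `k=5` instead of being chosen by the Jarque-Bera criterion; $W_{t,a}$ is taken to be the equal-weight average of the current and $k$ lagged squared residuals; and the simulated errors are rescaled as $\sqrt{W_{t,a}}\, z_t$, which inverts $u_{t,a} = \varepsilon_t/\sqrt{W_{t,a}}$ (the text writes the product as $W_{t,a} * z_t$). All function names are hypothetical.

```python
import numpy as np


def df_fit(y, trend):
    """OLS fit of y_t on (1, y_{t-1}[, t]); returns DF t statistic, coefficients, residuals."""
    y = np.asarray(y, dtype=float)
    n = y.size - 1
    columns = [np.ones(n), y[:-1]]
    if trend:
        columns.append(np.arange(1, n + 1, dtype=float))
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    se_rho = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return (beta[1] - 1.0) / se_rho, beta, resid


def novas_weights(resid, k):
    """Assumed simple NoVaS weights: average of resid_t^2, ..., resid_{t-k}^2."""
    csum = np.concatenate(([0.0], np.cumsum(resid ** 2)))
    return (csum[k + 1:] - csum[:-(k + 1)]) / (k + 1)


def novas_null_draws(y, trend=False, k=5, n_mc=1000, seed=None):
    """Observed DF statistic and n_mc simulated statistics t* under the unit root null."""
    rng = np.random.default_rng(seed)
    t_obs, beta, resid = df_fit(y, trend)           # estimate (1.13) or (1.14)
    w = novas_weights(resid, k)                     # NoVaS step on the residuals
    drift = beta[0] if trend else 0.0               # Model 2 keeps the estimated constant
    t_star = np.empty(n_mc)
    for i in range(n_mc):
        u = np.sqrt(w) * rng.standard_normal(w.size)   # sqrt scaling is an assumption
        y_star = np.cumsum(drift + u)                   # series generated under the null
        t_star[i] = df_fit(y_star, trend)[0]            # re-estimate and store t*
    return t_obs, t_star
```

Given the draws, a critical value is a quantile of `t_star` and a Monte Carlo p-value is the proportion of draws beyond `t_obs`. For example, `np.quantile(t_star, 0.05)` and `np.mean(t_star <= t_obs)` use the left tail, which is the usual rejection region for DF-type statistics; that choice of tail is an assumption of this sketch, whereas the text states the rule in terms of the $(1 - \alpha)$ quantile.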

Monte Carlo
In this section, we perform a simple Monte Carlo experiment. For the size of the test for Model 1, we generate data using the data-generating process (1.16); for the power of the test, we generate data using (1.17). The data-generating process for the size of the test for Model 2 is (1.18); for the power of the test, we generate data using (1.19). We first study the performance of the NoVaS critical values with i.i.d. N(0,1) error terms, that is, white noise; in all other cases, the errors follow a GARCH(1,1) data-generating process. For each model, we perform 5,000 replications for the calculation of size and 1,000 replications for the power functions. Our results are based on series of 600, 1,100 and 2,100 observations, where the first 100 observations in each replication are discarded to avoid possible initial value effects. We calculate the estimated size of the test by counting how many times we reject the null hypothesis in repeated samples under conditions where the null is true.
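A data-generating process of this kind can be sketched in Python as follows. The parameter values of the processes used in the experiment are not reproduced here, so every numerical value passed to the function (`omega`, `alpha`, `beta`, `rho`, `c`) is a hypothetical choice for illustration; only the burn-in of 100 observations and the GARCH(1,1) structure are taken from the text. The function name `simulate_series` is also hypothetical.

```python
import numpy as np


def simulate_series(n, rho, c, omega, alpha, beta, rng, burn=100):
    """Simulate y_t = c + rho*y_{t-1} + eps_t with GARCH(1,1) errors.

    eps_t = sqrt(h_t) * e_t,  h_t = omega + alpha*eps_{t-1}^2 + beta*h_{t-1},
    e_t i.i.d. N(0,1). Setting alpha = beta = 0 gives white noise errors;
    rho = 1 gives the unit root null. The first `burn` observations are
    discarded to reduce initial value effects.
    """
    total = n + burn
    e = rng.standard_normal(total)
    y = np.empty(total)
    h = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    prev = 0.0
    for t in range(total):
        eps = np.sqrt(h) * e[t]
        prev = c + rho * prev + eps
        y[t] = prev
        h = omega + alpha * eps ** 2 + beta * h
    return y[burn:]
```

For instance, `simulate_series(500, rho=1.0, c=0.0, omega=0.1, alpha=0.1, beta=0.8, rng=np.random.default_rng(0))` would produce one replication of 500 retained observations under the null with illustrative GARCH parameters.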

The p-value plots
The conventional way of reporting the results of a Monte Carlo experiment is to tabulate the proportion of times we reject the null hypothesis in repeated samples under conditions where the null is true. Concerning the significance levels used when judging the properties of the tests, different studies have advocated both larger and smaller significance levels. For example, Maddala (1992) suggests significance levels of as much as 25% in diagnostic testing, while MacKinnon (1992) suggests going in the other direction.
To address this problem, in this study we mainly use graphical methods that potentially provide more information about the size and power of the tests. We use the simple graphical methods developed and illustrated in Davidson and MacKinnon (1998), as they are relatively easy to interpret. We employ the p-value plot to assess the size of the tests and size-power curves to evaluate the power of the tests. The graphs of the p-value plots and size-power curves are based on the empirical distribution function (EDF) of the p-values, denoted $F(x_j)$. For the p-value plots, if the distribution being used to compute the $p_s$ terms is correct, each of the $p_s$ terms should be distributed uniformly on (0,1). Therefore, the resulting graph should be close to the 45° line. The p-value plots also make it easy to distinguish between tests that systematically over-reject or under-reject the null hypothesis and those that reject the null hypothesis about the right proportion of the time.
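The EDF underlying a p-value plot is straightforward to compute. The sketch below is a minimal illustration in which the function name `pvalue_edf` and the evaluation grid are hypothetical choices; plotting the returned values against the grid gives the p-value plot, which should lie near the 45° line when the p-values are uniform.

```python
import numpy as np


def pvalue_edf(p_values, grid):
    """EDF of the simulated p-values evaluated at each grid point x_j.

    Returns, for each x_j, the proportion of p-values less than or equal to x_j.
    """
    p_sorted = np.sort(np.asarray(p_values, dtype=float))
    return np.searchsorted(p_sorted, grid, side="right") / p_sorted.size
```

A test that over-rejects produces a curve above the 45° line, because too many p-values fall below each nominal level; under-rejection produces a curve below it.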
To judge the reasonableness of the results, we use a 95% confidence interval for the actual size, based on the nominal size and on $N$, the number of Monte Carlo replications. We consider any results that lie between these bounds as satisfactory. For example, if we consider 5,000 Monte Carlo replications and a nominal size of 5%, we define a result as reasonable if the estimated size lies between 4.39 and 5.61%.
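The bounds in this example can be checked with a short calculation. The sketch assumes the usual normal-approximation interval for a binomial proportion with multiplier 1.96; both the form of the interval and the multiplier are assumptions of this illustration, and the function name `size_interval` is hypothetical.

```python
import math


def size_interval(nominal, n_rep, z=1.96):
    """Approximate 95% interval for the estimated size around the nominal size."""
    half_width = z * math.sqrt(nominal * (1.0 - nominal) / n_rep)
    return nominal - half_width, nominal + half_width


lower, upper = size_interval(0.05, 5000)
print(f"{100 * lower:.3f}% to {100 * upper:.3f}%")
```

With a nominal size of 5% and 5,000 replications, the half-width is about 0.6 percentage points, which agrees with the bounds of 4.39 and 5.61% quoted above up to rounding.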

White noise case
We begin our study of the behaviour of our test procedure with the simple case of white noise errors, that is, series without GARCH errors. We report the results for 500 observations only, given that there is no difference between these and those for 1,000 or 2,000 observations; we also find no difference between these and those for small samples of 75, 100 and 200 observations. As shown in Figure 1(a), in Model 1 the unit root test behaves well, with both the DF and NoVaS critical values giving estimated sizes that are inside the confidence interval lines. As we observe the same behaviour for the unit root test in Model 2, as shown in Figure 1(b), we conclude that the NoVaS critical values work as well as we expect for the case of white noise. Figure 2 depicts the power of the tests. The first thing we observe is that the unit root tests display the same power when using either the DF or the NoVaS critical values. It is also obvious that there is a sample size effect (the lowest curves are for 500 observations, followed by 1,000 observations, and the highest curves are for 2,000 observations). In sum, under white noise errors the unit root tests with NoVaS critical values work as well as the unit root tests with conventional DF critical values.
Notes: Solid line is unit root test with NoVaS critical values; dot-dash line is unit root test with DF critical values.

The size of the tests
In this subsection, we present the results of our Monte Carlo experiment concerning the size of the unit root test given GARCH errors. As shown in Figure 3(a) for Model 1 and Figure 3(b) for Model 2 for 500, 1,000 and 2,000 observations, the p-value plots make it easy to see that unit root tests with DF critical values systematically over-reject the null hypothesis, whereas unit root tests with NoVaS critical values reject the null hypothesis about the right proportion of the time. Once again, we recall that we use the constant term in the regression for Model 1 and a time trend and constant term in the regression for Model 2. As shown, the unit root test with DF critical values over-rejects by almost three times the nominal size: for a 5% nominal size and Model 1, we estimated sizes of 15.35% for 500 observations, 14.9% for 1,000 observations and 14.35% for 2,000 observations. We observe a similar picture for the unit root test with DF critical values for Model 2, where for a 5% nominal size we estimated sizes of 14.85% for 500 observations, 16.26% for 1,000 observations and 15.9% for 2,000 observations. Overall, the unit root tests with NoVaS critical values, unlike those with DF critical values, work well in the presence of GARCH errors.

The power of the tests
In this subsection, we analyse the power of the unit root tests with DF and NoVaS critical values using sample sizes of 500, 1,000 and 2,000 observations. We estimate the power function by calculating the rejection frequencies for 1,000 replications using Equation (1.17) for Model 1 and Equation (1.19) for Model 2. We employ the size-power curves to compare the estimated power functions of the alternative test statistics. We follow the same process to evaluate the EDFs, denoted by $F^{\oplus}(x_j)$, using the same sequence of random numbers used to estimate the size of the tests. Figure 4 displays the results using the size-power curves, with the estimated power functions plotted against the nominal size. As shown, the unit root tests with the DF critical values now appear superior to the unit root tests with the NoVaS critical values. We also again observe a sample size effect, such that the larger the sample, the greater the power of the tests. To ensure a fair comparison of the power of the unit root tests with the DF and NoVaS critical values, we apply the size-power curves on a correct size-adjusted basis.
By plotting the estimated power functions against the true size, that is, $F^{\oplus}(x_j)$ against $F(x_j)$, we obtain size-power curves on a correct size-adjusted basis. On this basis, the unit root tests with DF and NoVaS critical values share essentially the same power, and those with NoVaS critical values are even slightly better for small nominal sizes (less than 15%), as shown in Figure 5. The main finding of our power investigation is that unit root tests with NoVaS critical values generally perform adequately in both the white noise and GARCH error cases.
Notes: Solid line is unit root test with NoVaS critical values; dotted, dot-dash, and dashed lines are unit root tests with DF critical values for 500, 1,000, and 2,000 observations, respectively.

Brief summary and conclusion
In this section, we summarize the results of our investigation. The purpose of this study has been to examine the unit root test in the presence of conditional heteroscedasticity errors with newly proposed critical values modified using the NoVaS method. The main conclusion is that NoVaS-modified critical values are robust with respect to both white noise and conditional heteroscedasticity errors. Moreover, given that their actual size lies close to the nominal size and that their power is adequate, it makes sense to select our NoVaS-modified critical values ahead of the conventional DF critical values, especially in the presence of conditional heteroscedasticity errors.
In summarizing our methodology, we studied the estimated size and power of the unit root test in the presence of conditional heteroscedasticity errors using two sets of critical values: conventional DF values and our newly proposed NoVaS-modified values. In terms of the size of the tests, we used Monte Carlo methods to investigate their properties using 5,000 replications per model, with 500, 1,000 and 2,000 observations. We employed p-value plots to investigate the size of the tests and found that the unit root tests perform better with NoVaS-modified critical values across all of our samples. Importantly, when the size-power curves are considered on a correct size-adjusted basis, the unit root tests with NoVaS-modified critical values share the same power as tests using DF critical values, or at least the difference is very small, across all of our samples.

In the same manner, we can show that the simple NoVaS model with two lags has $E(W_{t,a}^2) = 3a^2 m_4 + 6a^2 (m_2)^2$, for three lags $E(W_{t,a}^2) = 4a^2 m_4 + 12a^2 (m_2)^2$, and finally for the general ($k$) lags
$$E(W_{t,a}^2) = (k+1)a^2 m_4 + k(k+1)a^2 (m_2)^2. \tag{1.27}$$
Now the variance is $\mathrm{Var}(W_{t,a}) = E(W_{t,a}^2) - E(W_{t,a})^2$.

Kurtosis
Consider that $z_t = \varepsilon_t / \sqrt{W_{t,a}}$ is the new NoVaS-transformed series. Then the kurtosis of this series is $E(z_t^4)/[E(z_t^2)]^2$. We can see from (1.22) that the mean of the simple NoVaS is $E(W_{t,a}) = (k+1)\,a\,E(\varepsilon_t^2)$. It is easy to see that for $a = 1/(k+1)$, (1.22) becomes $E(W_{t,a}) = E(\varepsilon_t^2)$, and the kurtosis of the new transformed series is expressed on this basis. Recall that we derived the second moment of the NoVaS for the general ($k$) lags in (1.27). Given that we expect $\varepsilon_t / \sqrt{W_{t,a}}$ to be i.i.d. normal and that $E(\varepsilon_t^4) < \infty$, substituting the right-hand side of (1.27) for $E(W_{t,a}^2)$ in (1.33) gives the kurtosis of the NoVaS-transformed series, (1.34). Moreover, for $a = 1/(k+1)$, (1.34) simplifies further, and the last part of the resulting equation is the kurtosis of the simple NoVaS-transformed series.
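As a worked check of the second moment (1.27) and of the variance of $W_{t,a}$ used above, the following sketch expands the square under two assumptions that are not stated explicitly at this point of the text: that the simple NoVaS weight has the form $W_{t,a} = a\sum_{i=0}^{k}\varepsilon_{t-i}^2$, and that the squared terms are treated as independent with common moments $m_2 = E(\varepsilon_t^2)$ and $m_4 = E(\varepsilon_t^4)$.

```latex
\begin{align*}
E(W_{t,a}^2)
  &= a^2 \sum_{i=0}^{k} E(\varepsilon_{t-i}^4)
   + a^2 \sum_{i \neq j} E(\varepsilon_{t-i}^2)\, E(\varepsilon_{t-j}^2) \\
  &= (k+1)\, a^2 m_4 + k(k+1)\, a^2 m_2^2, \\[4pt]
E(W_{t,a}) &= (k+1)\, a\, m_2, \\[4pt]
\operatorname{Var}(W_{t,a})
  &= (k+1)\, a^2 m_4 + k(k+1)\, a^2 m_2^2 - (k+1)^2 a^2 m_2^2 \\
  &= (k+1)\, a^2 \left(m_4 - m_2^2\right), \\[4pt]
\text{and for } a = \tfrac{1}{k+1}: \quad
\operatorname{Var}(W_{t,a}) &= \frac{m_4 - m_2^2}{k+1}.
\end{align*}
```

The first line uses the fact that the double sum has $k+1$ diagonal terms and $k(k+1)$ off-diagonal terms. Under these assumptions the variance of $W_{t,a}$ shrinks at rate $1/(k+1)$, which is consistent with the variance-stabilizing role of the transformation as the order $k$ increases.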