Estimating ARCH Models When the Coefficients are Allowed to be Equal to Zero

In order to be consistent with volatility processes, autoregressive conditional heteroskedastic (ARCH) models are constrained to have non-negative coefficients. The estimators incorporating these constraints possess non-standard asymptotic distributions when the true parameter has zero coefficients. This situation, where the parameter is on the boundary of the parameter space, must be considered to derive the critical values of tests of the hypothesis that one or several ARCH coefficients are equal to zero. In this paper we compare the asymptotic theoretical properties, as well as the finite sample behavior, of the main estimation methods in this framework.


Introduction
Least squares (LS) and quasi-maximum likelihood (QML) procedures are arguably the most widely-used estimation methods for ARCH models, and were already considered in the seminal paper by Engle (1982). The LS estimator (LSE) has the advantage of being a closed-form estimator that can be easily implemented and does not require the use of optimization procedures, but has the disadvantage of being generally much less efficient than the QMLE. The quasi-generalized least squares estimator (QGLSE) improves the efficiency of the LSE but remains user-friendly. Deriving the asymptotic properties of these estimators is not a trivial task. Berkes, Horváth, and Kokoszka (2003) is the first reference in which the asymptotic properties of the QMLE of ARCH and generalized ARCH (GARCH) models were established in a mathematically rigorous way under weak conditions (see also Francq and Zakoïan, 2004, and Straumann, 2005, where several technical assumptions made in Berkes et al., 2003, are relaxed).
For an estimator to be asymptotically normal (AN), a crucial assumption is that the true parameter belongs to the interior of the parameter space. This requirement, made by the above-mentioned papers, is not satisfied when the ARCH coefficients are constrained to be nonnegative by the estimation procedure and some components of the true ARCH parameter are equal to zero. Following Chernoff (1954) and Andrews (1999), who studied in general frameworks the asymptotic distribution of estimators when the parameter is on the boundary of the parameter space, Jordan (2003) and Francq and Zakoïan (2007) studied the ARCH and GARCH QMLE when the parameter is allowed to have zero components. This framework is particularly relevant for hypothesis testing problems, which often put the parameter on the boundary of the parameter space under the null. Tests of the significance of the coefficients and tests of conditional homoscedasticity constitute typical situations where we have to study the estimators when the parameter is at the boundary.
Austrian Journal of Statistics, Vol. 37 (2008), No. 1, 31-40
In this paper we compare the asymptotic behaviour of the LSE, QGLSE and QMLE of ARCH models, when the true parameter may have zero coefficients.We also consider constrained and truncated versions of the LSE and QGLSE.We limit ourselves to ARCH models because the LSE and QGLSE lose their main practical advantage (namely the fact that they do not need any numerical optimization procedure) in the GARCH framework.

Constrained and Unconstrained Estimators
Consider the standard ARCH(q) model given by the equations

    ε_t = σ_t η_t,   σ_t² = ω_0 + Σ_{i=1}^q α_0i ε²_{t-i},    (1)

where the noise sequence (η_t) is independent and identically distributed (iid) with mean 0 and variance 1, the distribution of η_t² is not degenerate, and θ_0 = (ω_0, α_01, . . ., α_0q)' is a parameter vector which satisfies the positivity constraints ω_0 > 0 and α_0i ≥ 0 for i = 1, . . ., q. These constraints are sufficient (and are also necessary when η_t has a positive density) to ensure the positivity of the volatility process σ_t² = σ_t²(θ_0).
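The model is easy to simulate, which is useful for the experiments discussed later. The sketch below is our own; the function name and the Gaussian choice for η_t are assumptions, since any iid mean-zero, unit-variance noise fits the model.

```python
import numpy as np

def simulate_arch(n, omega, alpha, seed=None, burn=500):
    """Simulate an ARCH(q) series eps_t = sigma_t * eta_t with iid N(0,1) noise.

    A burn-in period is discarded so the returned sample is approximately
    drawn from the stationary distribution.
    """
    rng = np.random.default_rng(seed)
    q = len(alpha)
    eta = rng.standard_normal(n + burn)
    eps = np.zeros(n + burn)
    for t in range(q, n + burn):
        # sigma_t^2 = omega + sum_i alpha_i * eps_{t-i}^2
        sigma2 = omega + np.dot(alpha, eps[t - q:t][::-1] ** 2)
        eps[t] = np.sqrt(sigma2) * eta[t]
    return eps[burn:]

eps = simulate_arch(2000, omega=0.2, alpha=[0.3], seed=0)
```

For this parameterization the unconditional variance is ω/(1 − α₁) = 0.2/0.7, which the sample variance should roughly reproduce.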

Quasi-Maximum Likelihood Estimator
The QML estimation procedure requires the computation of the logarithm of σ_t²(θ) = ω + Σ_{i=1}^q α_i ε²_{t-i} at any point θ = (ω, α_1, . . ., α_q)' of the parameter space Θ. If α_i < 0 for some i, the volatility σ_t²(θ) is likely to take negative values and the QML procedure fails. Assuming that the parameter space Θ is a compact subset of [0, ∞)^{q+1} that bounds the first component away from zero, one can compute the quasi-likelihood criterion

    Q_n(θ) = n^{-1} Σ_{t=1}^n { ε_t²/σ_t²(θ) + log σ_t²(θ) }.

The QMLE is then defined as any measurable solution θ̂_n^QML of

    θ̂_n^QML = arg min_{θ ∈ Θ} Q_n(θ).

Assume θ_0 ∈ Θ. We have strong consistency under the sole assumption that (ε_t) is a strictly stationary solution to (1) such that ε_t is measurable with respect to {η_u, u ≤ t} (this assumption is maintained throughout the paper). We have AN,

    √n(θ̂_n^QML − θ_0) →_d N{0, (Eη_1⁴ − 1) J^{-1}},   J = E{σ_t^{-4}(θ_0) (∂σ_t²(θ_0)/∂θ)(∂σ_t²(θ_0)/∂θ)'},    (2)

under the additional assumption that

    θ_0 ∈ int(Θ),    (3)

where int(Θ) denotes the interior of the parameter space Θ (see Berkes et al., 2003, Francq and Zakoïan, 2004, and Straumann, 2005). When some of its components are equal to zero, the parameter θ_0, which is constrained to have nonnegative components, belongs to the boundary of the parameter space and (3) is not satisfied.
The following elementary example shows that the asymptotic distribution of the QMLE cannot be Gaussian when (3) is not satisfied.
Example 2.1 Due to the positivity constraints, the QMLE of an ARCH(1) model satisfies α̂_n ≥ 0 almost surely, for all n. When the DGP is a white noise, then α_01 = 0 and, with probability one,

    √n(α̂_n − α_01) = √n α̂_n ≥ 0.

In this case √n(α̂_n − α_01) cannot converge in law to any non-degenerate Gaussian distribution N(m, s²) with s² > 0. Indeed, such a limit would assign the positive probability P{N(m, s²) < 0} to the event {√n(α̂_n − α_01) < 0}, whose probability is zero for every n. For the same reason, when the true value of a general GARCH parameter has zero components, the asymptotic distribution cannot be Gaussian, for the QMLE or for any other estimator which takes the positivity constraints into account.
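This boundary effect is easy to reproduce numerically. The sketch below fits a constrained ARCH(1) QMLE to pure white noise; the optimizer, starting values, and bounds are our own choices, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

def arch1_qmle(eps):
    # Quasi-likelihood criterion for ARCH(1); the positivity constraints
    # omega > 0, alpha >= 0 are enforced through box bounds.
    def neg_quasi_loglik(theta):
        omega, alpha = theta
        sigma2 = omega + alpha * eps[:-1] ** 2
        return np.mean(eps[1:] ** 2 / sigma2 + np.log(sigma2))
    res = minimize(neg_quasi_loglik, x0=[np.var(eps), 0.1],
                   method="L-BFGS-B", bounds=[(1e-6, None), (0.0, None)])
    return res.x

# 200 replications of a white-noise DGP (alpha_01 = 0): every estimate of
# alpha is nonnegative, and a large share of them sit at the boundary 0.
rng = np.random.default_rng(1)
alphas = [arch1_qmle(rng.standard_normal(500))[1] for _ in range(200)]
```

Since roughly half of the estimates pile up at exactly zero, √n α̂_n cannot be asymptotically Gaussian here.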

Least Squares Estimators
The LSE is an alternative estimator based on the AR(q) representation for ε_t²:

    ε_t² = ω_0 + Σ_{i=1}^q α_0i ε²_{t-i} + u_t,   u_t = ε_t² − σ_t² = σ_t²(η_t² − 1),

which can be written in matrix form as Y = Xθ_0 + U, where Y stacks the observations ε_t², the rows of X are the vectors Z_t' = (1, ε²_{t-1}, . . ., ε²_{t-q}), and U stacks the innovations u_t.

Unconstrained Least Squares Estimator
With probability one, it can be shown that the matrix X'X is non-singular for large enough n. The LSE of θ_0 is thus given by

    θ̂_n^LS = (X'X)^{-1} X'Y.    (5)

If E(ε_1⁴) < +∞, the LSE can be shown to be strongly consistent (see Bose and Mukherjee, 2003). If, in addition, E(ε_1⁸) < +∞, the estimator is AN (see also Bose and Mukherjee, 2003) and

    √n(θ̂_n^LS − θ_0) →_d N(0, A^{-1} B A^{-1}),    (6)

where A = E(Z_t Z_t') and B = E(u_t² Z_t Z_t').
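Because (5) is closed-form, the LSE is a few lines of linear algebra. A sketch (our own code), illustrated on a homoscedastic sample, for which the estimator should be close to (Eε², 0, . . ., 0):

```python
import numpy as np

def arch_lse(eps, q):
    """LSE (5): regress eps_t^2 on (1, eps_{t-1}^2, ..., eps_{t-q}^2)."""
    e2 = eps ** 2
    Y = e2[q:]
    # Design matrix with a constant and q lags of eps_t^2.
    X = np.column_stack([np.ones(len(Y))] +
                        [e2[q - i:-i] for i in range(1, q + 1)])
    theta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return theta  # (omega_hat, alpha_hat_1, ..., alpha_hat_q)

eps = np.random.default_rng(0).standard_normal(1000)  # white noise: theta_0 = (1, 0, 0)
theta = arch_lse(eps, q=2)
```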

Constrained Least Squares Estimator
Contrary to the QMLE, the computation of the LSE does not require positivity constraints. Note that one or several components of θ̂_n^LS, as defined by (5), can be negative. This is a serious practical problem because we know that ARCH models with negative coefficients are not viable and can produce negative predictions of the volatility. This is why it is worth considering the constrained LSE (CLSE) defined by

    θ̂_n^cLS = arg min_{θ ∈ [0, +∞)^{q+1}} ||Y − Xθ||².    (7)

When X has rank q + 1, the constrained estimator θ̂_n^cLS is the orthogonal projection of θ̂_n^LS on [0, +∞)^{q+1} with respect to the metric X'X:

    θ̂_n^cLS = arg min_{θ ∈ [0, +∞)^{q+1}} (θ − θ̂_n^LS)' X'X (θ − θ̂_n^LS).

The following proposition states that, when θ_0 belongs to the interior of the parameter space, the asymptotic behaviors of the constrained and unconstrained LSE are the same.

Proposition 2.1 When (6) holds and θ_0 ∈ (0, +∞)^{q+1}, the estimator θ̂_n^cLS converges almost surely to θ_0 and

    √n(θ̂_n^cLS − θ_0) →_d N(0, A^{-1} B A^{-1}).    (8)
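Since the CLSE criterion is a non-negative least-squares problem, it can be computed with a standard NNLS solver; minimizing ||Y − Xθ||² over θ ≥ 0 is exactly the projection in the X'X metric. A sketch using scipy (our own implementation, not the authors' code):

```python
import numpy as np
from scipy.optimize import nnls

def arch_clse(eps, q):
    """CLSE: minimize ||Y - X theta||^2 subject to theta >= 0."""
    e2 = eps ** 2
    Y = e2[q:]
    X = np.column_stack([np.ones(len(Y))] +
                        [e2[q - i:-i] for i in range(1, q + 1)])
    theta, _ = nnls(X, Y)  # second return value is the residual norm
    return theta

eps = np.random.default_rng(0).standard_normal(1000)
theta_c = arch_clse(eps, q=2)  # all components nonnegative by construction
```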
Example 2.1 shows that the AN (8) does not hold when some ARCH coefficients are equal to zero.

Truncated Least Squares Estimator
Since all the ARCH coefficients must be nonnegative, a naive approach could be to replace any negative component of the LSE θ̂_n^LS = (θ̂_n,1^LS, . . ., θ̂_n,q+1^LS)' by zero. This leads to the truncated LSE (TLSE) defined componentwise by

    θ̂_n,i^tLS = max(θ̂_n,i^LS, 0),   i = 1, . . ., q + 1;

using the Hadamard product ⊙, the truncated estimator can be written as

    θ̂_n^tLS = θ̂_n^LS ⊙ 1{θ̂_n^LS ≥ 0},

with the indicator taken componentwise. This estimator is simpler to implement than the CLSE, and the following proposition shows that its asymptotic properties are the same as those of the constrained and unconstrained LSE when θ_0 is not on the boundary of the parameter space.
Proposition 2.2 Proposition 2.1 remains valid when θ̂_n^cLS is replaced by θ̂_n^tLS.

Quasi-Generalized Least Squares Estimator
For linear regression models with heteroscedastic observations, it is well known that the (ordinary) LSE is outperformed by the QGLSE (see, e.g., Hamilton, 1994, Chapter 8). In the ARCH framework the QGLSE is defined by

    θ̂_n^QG = (X'Ω̂X)^{-1} X'Ω̂Y,

where X is supposed to have full rank q + 1, and Ω̂ is a non-singular consistent estimator of Ω = Diag(σ_n^{-4}, . . ., σ_{1+q}^{-4}). If a first-step estimator θ̃_n = (ω̃, α̃_1, . . ., α̃_q)' is available, the matrix Ω̂ can be obtained by replacing σ_t² by ω̃ + Σ_{i=1}^q α̃_i ε²_{t-i} in Ω. In order to be sure that Ω̂ is well defined and invertible, one can employ the truncated LSE θ̃_n = θ̂_n^tLS. Then the two-stage least squares estimator θ̂_n^QG is consistent and AN,

    √n(θ̂_n^QG − θ_0) →_d N{0, (Eη_1⁴ − 1) J^{-1}},    (9)

under the moment assumption E(ε_1⁴) < ∞ when all the ARCH coefficients are strictly positive, and under a slightly stronger moment assumption in the general case (see Bose and Mukherjee, 2003, and Gouriéroux, 1997).
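The two-stage scheme stays closed-form. Below is our own sketch of the QGLSE with the TLSE as first-step estimator; the floor on the fitted σ_t² is a numerical safeguard of ours, not part of the paper's definition.

```python
import numpy as np

def arch_qglse(eps, q, floor=1e-8):
    e2 = eps ** 2
    Y = e2[q:]
    X = np.column_stack([np.ones(len(Y))] +
                        [e2[q - i:-i] for i in range(1, q + 1)])
    # First stage: LSE truncated at zero (the TLSE), so the fitted
    # volatilities below are nonnegative.
    theta0, *_ = np.linalg.lstsq(X, Y, rcond=None)
    sigma2 = np.maximum(X @ np.maximum(theta0, 0.0), floor)
    # Second stage: weighted LS with Omega_hat = Diag(1 / sigma2_t^2).
    W = 1.0 / sigma2 ** 2
    XtW = X.T * W
    return np.linalg.solve(XtW @ X, XtW @ Y)

eps = np.random.default_rng(0).standard_normal(1500)
theta_qg = arch_qglse(eps, q=1)
```

On homoscedastic data the weights are nearly constant, so the QGLSE essentially reduces to the ordinary LSE, as expected.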
Obviously one can define constrained and truncated versions of the QGLSE,

    θ̂_n^cQG = arg min_{θ ∈ [0, +∞)^{q+1}} (Y − Xθ)'Ω̂(Y − Xθ)   and   θ̂_n^tQG = θ̂_n^QG ⊙ 1{θ̂_n^QG ≥ 0}.

When E(ε_t⁸) < ∞ and θ_0 ∈ (0, +∞)^{q+1}, the three estimators θ̂_n^QG, θ̂_n^cQG and θ̂_n^tQG converge almost surely to θ_0 and have the same asymptotic distribution, given by (9), as n → ∞.

Conditions for AN of the Estimators and Comparison of the Asymptotic Variances
In view of (2), (6) and (9), the following lemma shows the well-known result that, under assumptions ensuring AN, the LSE and its variants (the CLSE and TLSE) are less efficient than the QMLE and the (unconstrained, constrained and truncated) QGLSE.
Lemma 3.1 Under the assumption E(ε_t⁸) < ∞, the matrix

    A^{-1} B A^{-1} − (Eη_1⁴ − 1) J^{-1}

is positive semi-definite. Note however that the conditions required to obtain AN are not the same for the different estimators. In particular the computation of the QMLE requires positivity constraints, contrary to the LSE and QGLSE. On the other hand, the LSE and its extensions require moment conditions, whereas the QMLE requires only the strict stationarity condition.
For an ARCH(1) model the strict stationarity condition is α_01 < exp{−E log η_t²}, whereas second-order stationarity requires the much stronger condition α_01 < 1, and the moment conditions needed for the AN of the least-squares estimators impose even stronger restrictions on α_01. The absence of moment conditions is an important advantage of the QMLE over the other estimators, because ARCH models are often fitted to financial series showing evidence of fat tails.
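For standard Gaussian η_t the strict stationarity bound is available in closed form: η_t² is χ²(1), so E log η_t² = ψ(1/2) + log 2 and exp{−E log η_t²} = 2e^γ ≈ 3.56, far above the second-order bound α_01 < 1. A quick check (our own computation):

```python
import numpy as np
from scipy.special import digamma

# Exact value of exp{-E log eta_t^2} for eta_t ~ N(0, 1): eta_t^2 is
# chi-square(1), whose log-moment is digamma(1/2) + log(2).
exact = np.exp(-(digamma(0.5) + np.log(2.0)))  # roughly 3.562

# Monte Carlo cross-check of the same constant.
rng = np.random.default_rng(0)
mc = np.exp(-np.mean(np.log(rng.standard_normal(10 ** 6) ** 2)))
```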
Table 1 summarizes the constraints on the different estimators in the simple ARCH(1) case, when η_t follows a standard Gaussian N(0, 1) distribution or a Student distribution normalized in such a way that Eη_1² = 1 (St_ν stands for a normalized Student distribution with ν degrees of freedom). Note that, as shown by a trivial extension of Example 2.1 to general constrained estimators, the value α_01 = 0 is not allowed for the AN of the QMLE, CLSE and CQGLSE. The next section gives the asymptotic distribution of these estimators when θ_0 belongs to the boundary of [0, ∞)^{q+1}.

Asymptotic Distribution of the Estimators when the Parameter is on the Boundary
The parameter θ_0 is allowed to contain zero components, but we exclude the situation where θ_0 attains the upper boundary of the parameter space. Under this assumption the set √n(Θ − θ_0) converges to the so-called local parameter space Λ defined by

    Λ = Λ_1 × Λ_2 × · · · × Λ_{q+1},

where Λ_1 = R and, for i = 2, . . ., q + 1, Λ_i = R if θ_0i > 0 and Λ_i = [0, ∞) if θ_0i = 0.
(Note a of Table 1: for the proof of the AN of this estimator, a technical additional assumption (see Equation (8) in Bose and Mukherjee, 2003) is required. This technical assumption is satisfied, in particular, when α_01 > 0 or when E(ε_t⁶) < ∞.)
In view of the positivity constraints, the random vector √n(θ̂_n^QML − θ_0) belongs to Λ with probability one. Following Chernoff (1954) or Andrews (1999), who studied boundary problems in very general frameworks, Francq and Zakoïan (2007) gave conditions under which

    √n(θ̂_n^QML − θ_0) →_d λ^Λ,   with   λ^Λ = arg min_{λ ∈ Λ} (λ − Z)' J (λ − Z),   Z ∼ N{0, (Eη_1⁴ − 1) J^{-1}}.

In the ARCH framework these conditions reduce to the moment condition E(ε_t⁶) < ∞. When (3) holds true, we have Λ = R^{q+1} and we retrieve the standard result because λ^Λ = Z ∼ N{0, (Eη_1⁴ − 1) J^{-1}}. When θ_0 is on the boundary, the asymptotic distribution of √n(θ̂_n^QML − θ_0) is more complex than a Gaussian: it is the law of the projection, with respect to the metric J, of the Gaussian vector Z on the convex cone Λ. The asymptotic distributions of the constrained LSE and QGLSE are of the same type.
Proposition 4.1 When (6) holds (i.e., when E(ε_1⁸) < ∞), √n(θ̂_n^cLS − θ_0) converges in law to the projection, with respect to the metric A = E(Z_t Z_t'), of the Gaussian vector Z ∼ N(0, A^{-1} B A^{-1}) of (6) on the convex cone Λ.

The asymptotic distribution of the truncated estimators is simply the truncation of the asymptotic distribution of the unconstrained estimators.
Proposition 4.2 With the notation Z introduced in Proposition 4.1, when (6) holds (i.e., when E(ε_1⁸) < ∞), √n(θ̂_n^tLS − θ_0) converges in law to the vector obtained from Z by replacing Z_i with Z_i⁺ = max(Z_i, 0) for every index i such that θ_0i = 0.

We use MSE_QML = trace E(λ^Λ λ^Λ') as a scalar measure for the asymptotic accuracy of the QMLE, and define similarly the MSEs of the other estimators. Because we do not have an explicit expression for the matrix J, it seems difficult to compute and compare the MSEs of all the estimators in the general setting. Comparison is however possible in the following example.
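The limit laws above can be sampled directly: draw Z and project it on Λ in the relevant metric. The sketch below does this for an illustrative 2×2 matrix J of our own choosing (the box-constrained quadratic program is solved with L-BFGS-B):

```python
import numpy as np
from scipy.optimize import minimize

def project_on_cone(z, J, free):
    """Minimize (lam - z)' J (lam - z) over the cone where the components
    flagged free=False are constrained to be >= 0."""
    bounds = [(None, None) if f else (0.0, None) for f in free]
    res = minimize(lambda lam: (lam - z) @ J @ (lam - z),
                   x0=np.maximum(z, 0.0), method="L-BFGS-B", bounds=bounds)
    return res.x

# Monte Carlo draws of lambda^Lambda for an ARCH(1) at the boundary:
# Lambda = R x [0, inf), Z ~ N(0, (E eta^4 - 1) J^{-1}) with Gaussian eta.
rng = np.random.default_rng(0)
J = np.array([[2.0, 0.5], [0.5, 1.0]])  # hypothetical information matrix
Z = rng.multivariate_normal(np.zeros(2), 2.0 * np.linalg.inv(J), size=500)
lam = np.array([project_on_cone(z, J, free=[True, False]) for z in Z])
# The second component of lam has an atom at 0, so the limit law is not
# Gaussian; the empirical trace of E(lam lam') estimates the MSE measure
# defined in the text.
```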

Comparing the Accuracy of the Estimators under Conditional Homoscedasticity
Consider an ARCH(q) in which α_01 = · · · = α_0q = 0. This framework is encountered in conditional homoscedasticity tests. In this case the local parameter space is Λ = R × [0, ∞)^q and the information matrix J has a simple expression. A straightforward computation, available from the authors upon request, yields explicit expressions for the asymptotic MSEs of the estimators; in particular, the QMLE is dominated by simpler estimators both when 5 ≤ q < (π + 1) + π/ω_0² and when q > (π + 1) + π/ω_0². That the QMLE (which is actually the maximum likelihood estimator (MLE) when η_t is Gaussian) might be dominated by another estimator seems quite surprising, at least at first sight. Our interpretation of this interesting phenomenon is the following. According to Le Cam's theory on the convergence of local experiments (see, e.g., van der Vaart, 1998), our problem is closely related to the problem of estimating m ∈ Λ from one observation X ∼ N(m, 2J^{-1}), assuming for simplicity that η_t ∼ N(0, 1). The form of J^{-1} being very simple, the MLE m̂^ML of m can be derived in closed form. It is easy to see that this MLE is less efficient than m̂^tLS = (X_1, X_2⁺, . . ., X_{q+1}⁺)' when m = (m_1, 0, . . ., 0)' and q > 4.
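The local experiment can be explored numerically. In the sketch below, J is our own computation of the information matrix under conditional homoscedasticity with Gaussian noise (σ_t² ≡ ω_0, so ∂σ_t²/∂θ = (1, ε²_{t-1}, . . ., ε²_{t-q})'), and both estimators of m are evaluated at m = 0 by Monte Carlo; the numbers are illustrative, not the paper's exact MSE expressions.

```python
import numpy as np
from scipy.optimize import minimize

q, omega0 = 6, 0.04   # omega0 matches the N(0, 0.2^2) DGP of Table 2
# E[z z'] for z = (1, eps_{t-1}^2, ..., eps_{t-q}^2) under homoscedasticity
# with Gaussian noise: E eps^2 = omega0, E eps^4 = 3 omega0^2.
EZZ = np.empty((q + 1, q + 1))
EZZ[0, 0] = 1.0
EZZ[0, 1:] = EZZ[1:, 0] = omega0
EZZ[1:, 1:] = omega0 ** 2 * (np.ones((q, q)) + 2.0 * np.eye(q))
J = EZZ / omega0 ** 2
cov = 2.0 * np.linalg.inv(J)   # X ~ N(m, 2 J^{-1})

rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(q + 1), cov, size=1000)  # true m = 0

def mle(x):
    # MLE of m: projection of x on Lambda = R x [0, inf)^q in the J metric.
    res = minimize(lambda m: (m - x) @ J @ (m - x), x0=np.maximum(x, 0.0),
                   method="L-BFGS-B",
                   bounds=[(None, None)] + [(0.0, None)] * q)
    return res.x

mse_ml = np.mean([np.sum(mle(x) ** 2) for x in X])
mse_tls = np.mean(np.sum(np.concatenate(
    [X[:, :1], np.maximum(X[:, 1:], 0.0)], axis=1) ** 2, axis=1))
# For q = 6 the truncation estimator tends to come out at least as accurate,
# in line with the comparison discussed above.
```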

Monte Carlo Results
Table 2 summarizes the output of Monte Carlo experiments. The empirical MSEs are generally close to the asymptotic MSEs obtained from (11)-(13) (given in the rows n = ∞). The smallest MSEs are displayed in bold type. We note that the (Q)MLE can be outperformed by simpler estimators, in finite samples and also asymptotically when q = 6. The TLSE, although particularly simple to implement, performs remarkably well in the framework of Table 2. Other simulation experiments, not reported here, reveal that, as expected, the QMLE and (C/T)QGLSE are much more accurate than the other estimators in the presence of conditionally heteroscedastic data.
From the asymptotic theory, as well as from Table 2 and other numerical experiments not presented here, we draw the following conclusions: i) the QMLE is generally superior to the other estimators in terms of accuracy when the data show evidence of conditional heteroscedasticity and/or heavy-tailed distributions; ii) the (Q)MLE can be outperformed by simpler estimators, such as the TLSE, when the true value of the parameter lies on the boundary of the parameter space. Detailed proofs of the results given in this paper are available from the authors.

Table 1 :
Conditions ensuring asymptotic normality for estimators of an ARCH(1) model with coefficients (ω_0, α_01), when the iid noise η_t follows a standard Gaussian or a normalized Student distribution.

Table 2 :
Empirical MSE and asymptotic MSE for estimators of an ARCH(q) model when the data generating process is a N(0, 0.2²) iid sequence. The number of replications is N = 1000.