Nearly Unstable Integer-Valued ARCH Process and Unit Root Testing

This paper introduces a Nearly Unstable INteger-valued AutoRegressive Conditional Heteroskedasticity (NU-INARCH) process for dealing with count time series data. It is proved that a proper normalization of the NU-INARCH process endowed with a Skorohod topology weakly converges to a Cox-Ingersoll-Ross diffusion. The asymptotic distribution of the conditional least squares estimator of the correlation parameter is established as a functional of certain stochastic integrals. Numerical experiments based on Monte Carlo simulations are provided to verify the behavior of the asymptotic distribution under finite samples. These simulations reveal that the nearly unstable approach provides satisfactory and better results than those based on the stationarity assumption even when the true process is not that close to non-stationarity. A unit root test is proposed and its Type-I error and power are examined via Monte Carlo simulations. As an illustration, the proposed methodology is applied to the daily number of deaths due to COVID-19 in the United Kingdom.


Introduction
First-order nearly unstable continuous autoregressive processes have been well explored in the literature; see, for example, Chan and Wei (1987), Phillips (1987), Chan, Ing and Zhang (2019), and the references therein. In these works, it is assumed that the model approaches the non-stationarity region as the sample size increases. More specifically, a nearly unstable continuous process $\{Y^{(n)}_t\}_{t\in\mathbb{N}}$ is defined by $Y^{(n)}_t = \rho_n Y^{(n)}_{t-1} + \eta_t$, where $\{\eta_t\}_{t\in\mathbb{N}}$ is a white noise and $\rho_n = 1 - b/n$, for $b > 0$.
Another popular approach for dealing with count time series data is the class of INteger-valued Generalized AutoRegressive Conditional Heteroskedastic (INGARCH) models by Ferland, Latour and Oraichi (2006), Fokianos, Rahbek and Tjøstheim (2009), Fokianos and Fried (2010), Zhu (2011), Fokianos and Tjøstheim (2011), Zhu (2012), Christou and Fokianos (2015), Gonçalves et al. (2015), Davis and Liu (2016), Silva and Barreto-Souza (2019), Weiß et al. (2020), which constitute in some sense an integer-valued counterpart of the classical GARCH models by Bollerslev (1986). In this paper, we introduce a Nearly Unstable INARCH (NU-INARCH) process for dealing with count time series data. To the best of our knowledge, this is the first time that a nearly unstable count time series model is being proposed based on the INARCH approach; all existing nearly unstable discrete processes in the literature consider the INAR approach. We establish the weak convergence of the NU-INARCH process (when properly normalized) endowed with a Skorohod topology. With this result at hand, we derive the asymptotic distribution of the conditional least squares estimator of the correlation parameter as a functional of certain stochastic integrals. An equally important contribution of this paper is to develop a unit root test (URT) for the INARCH model, where the asymptotic distribution of the test statistic under the null hypothesis is provided. Note that although URTs are well explored in the continuous case, only sporadic results are available for the discrete case.
A few works dealing with this relevant problem, based on the INAR approach, are due to Hellström (2001) and Drost, Van Den Akker and Werker (2009).
The paper is organized as follows. In Section 2, the NU-INARCH model is introduced and a fluctuation theorem is established, which involves the Cox-Ingersoll-Ross diffusion process. The asymptotic distribution of the CLS estimator for the correlation parameter is derived in Section 3 under the nearly unstable and stationarity assumptions. Section 4 provides simulated results about the asymptotic distribution of the CLS estimator under both nearly unstable and stationary approaches and also compares them in terms of confidence interval coverages. A unit root test for the INARCH process is proposed in Section 5 and its performance is evaluated via Monte Carlo simulations. An empirical application about the daily number of deaths due to COVID-19 in the United Kingdom, which exhibits a nearly unstable/non-stationary behavior, is provided in Section 6. Concluding remarks and future research are addressed in Section 7.

Model and the Fluctuation Theorem
In this section, we define the nearly unstable INARCH process and obtain its weak convergence (under a proper normalization) in the space of the non-negative càdlàg functions endowed with the Skorokhod topology.
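Definition 2.1. We say that a sequence $\{X^{(n)}_t\}_{t\in\mathbb{N}}$ is a nearly unstable INARCH(1) process if $X^{(n)}_t \mid \mathcal{F}^{(n)}_{t-1} \sim \mbox{Poisson}(\lambda^{(n)}_t)$, with $\lambda^{(n)}_t = \beta + \alpha_n X^{(n)}_{t-1}$, (1) for $n \in \mathbb{N}$, where $\mathcal{F}^{(n)}_{t-1}$ denotes the σ-field generated by the process up to time $t-1$, $\beta > 0$, and $\alpha_n$ is specified by the parameterization (2).
Remark 2.1. For the nearly unstable INARCH model defined above, we have that corr(X
The parameterization of $\alpha_n$ in (2) was first proposed by Chan and Wei (1987) and subsequently used in Ispány, Pap and Van Zuijlen (2003).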
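As a minimal sketch of how a path of such a process can be generated, the Python code below simulates a Poisson INARCH(1) sequence whose conditional mean is β + α_n X_{t−1}. The specific form α_n = 1 − γ/n (mirroring ρ_n = 1 − b/n from the Introduction), the constant name `gamma`, the starting value X_0 = 0, and the function names are assumptions made for illustration only.

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    The rate is consumed in chunks of at most 30 so that exp(-chunk) never
    underflows; a sum of independent Poisson draws is again Poisson.
    """
    count = 0
    while lam > 0:
        chunk = min(lam, 30.0)
        limit = math.exp(-chunk)
        prod = rng.random()
        while prod > limit:
            count += 1
            prod *= rng.random()
        lam -= chunk
    return count


def simulate_nu_inarch(n, gamma, beta, x0=0, seed=None):
    """Simulate X_0, ..., X_n with X_t | past ~ Poisson(beta + alpha_n * X_{t-1}).

    Assumption (not from the paper): alpha_n = 1 - gamma / n with gamma > 0.
    """
    rng = random.Random(seed)
    alpha_n = 1.0 - gamma / n
    path = [x0]
    for _ in range(n):
        lam = beta + alpha_n * path[-1]
        path.append(poisson_draw(lam, rng))
    return path


# Illustrative values only.
path = simulate_nu_inarch(n=200, gamma=2.0, beta=1.0, seed=1)
print(len(path), min(path), max(path))
```

The sampler relies only on the standard library, which keeps the sketch self-contained; for long series or large conditional means a vectorized Poisson generator would be a more practical choice.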
In the next proposition, we provide the mean, variance, and autocorrelation function of the NU-INARCH process.These results will be important to establish the proper normalization in order to obtain a non-trivial limit for the counting process.
Proposition 2.2. Let $\{X^{(n)}_t\}_{t\in\mathbb{N}}$ be a nearly unstable INARCH process. Then, its marginal mean and variance, and autocorrelation function are given respectively by
Proof. We have that $E(X^{(n)}_t) = \beta + \alpha_n E(X^{(n)}_{t-1})$. By using recursion t times, we obtain the result for the marginal mean. For the variance, it follows that Var(X Finally, for $k, t \in \mathbb{N}_0$, the autocorrelation function becomes cov(X where we have used in the third equality the fact that E(X
From Proposition 2.2, we have that E(X
In the following theorem, we establish the weak convergence of the process $\{X^{(n)}(t); t \ge 0\}$ as $n \to \infty$. We introduce some notation before presenting such a result. Denote by $D^+[0, \infty)$ the space of the non-negative càdlàg (right continuous with left limits) functions on $[0, \infty)$ and by $C^\infty_c[0, \infty)$ the space of infinitely differentiable functions on $[0, \infty)$ having compact supports.
Proof. We have that X x ) . From Theorem 6.5 in Chapter 1 and Corollary 8.9 in Chapter 4 of Ethier and Kurtz (1986), to obtain the desired result, it is enough to show that , where $h'(\cdot)$ and $h''(\cdot)$ denote the first and second derivatives of $h(\cdot)$, respectively.
For Z (n) x = x, we have that and By combining (5) and (6), we obtain that Note that Equation (7) also holds for Z (n) x = x. Further, we can write We now use Equations (7) and (8) to express n (x) as follows: We will show that lim n→∞ sup x∈En | (j) n (x)| = 0, for j = 1, 2, 3. This result, Equation (9), and the triangle inequality imply that (4) holds and therefore conclude the proof of the theorem.
To show the case j = 1, we argue as in the proof of Theorem 3.1 in Chapter 9 of Ethier and Kurtz (1986).Then, the result follows by showing that lim n→∞ | (1) ω * (0) = 0, ω * (x) = max{1, c/x} for x > 0, and ω * (0) = 1.Hence, it follows that Further, we have that E ( Z These results give us that the right-hand side of (10) goes to 0 as n → ∞.We obtain the same conclusion when x − x) via its characteristic function as follows: Hence, the integrand in (1) n (x n ) is bounded above by an integrable random variable.Further, this integrand converges in probability to 0 since Z We then apply the Dominated Convergence Theorem to conclude that lim n→∞ | (1) For the case j = 2, it follows that as n → ∞.In a similar fashion, for j = 3, it can be shown that lim n→∞ sup x∈En | (3) n (x)| = 0, which concludes the proof.

Conditional Least Squares
In this section, we provide the asymptotic distribution of the conditional least squares estimator of $\alpha_n$ for the nearly unstable INARCH process. The parameter β is assumed to be known. It can be seen as a nuisance parameter since our main interest lies in the parameter $\alpha_n$, which controls the dependence in the model. In the empirical illustration, we discuss how to deal with the unknown β case.
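The CLS estimator of α is obtained by minimizing the Q-function given by $Q_n(\alpha) = \sum_{t=2}^{n} \big(X^{(n)}_t - \beta - \alpha X^{(n)}_{t-1}\big)^2$. Hence, we obtain explicitly the CLS estimator of α, say $\hat{\alpha}_n$, which is given by
$\hat{\alpha}_n = \dfrac{\sum_{t=2}^{n} X^{(n)}_{t-1}\big(X^{(n)}_t - \beta\big)}{\sum_{t=2}^{n} \big(X^{(n)}_{t-1}\big)^2}$. (11)
We begin by deriving the asymptotic distribution of $\hat{\alpha}_n$ under the stationary assumption, where we denote the count time series by $\{X_t\}_{t\in\mathbb{N}}$ (no need for the superscript (n)). This case will be contrasted with the nearly unstable INARCH process through simulation in the following section.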
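Since the conditional mean is β + α X_{t−1} with β known, the CLS estimate is an ordinary least squares slope through the origin of X_t − β on X_{t−1}. The short Python sketch below illustrates this computation; the function name `cls_alpha` and the toy series are hypothetical and are not taken from the paper.

```python
def cls_alpha(x, beta):
    """CLS estimate of alpha for known beta.

    Minimizes sum_t (x_t - beta - alpha * x_{t-1})**2 over alpha, which gives
    sum_t x_{t-1} * (x_t - beta) / sum_t x_{t-1}**2.
    """
    num = sum(prev * (cur - beta) for prev, cur in zip(x[:-1], x[1:]))
    den = sum(prev * prev for prev in x[:-1])
    if den == 0:
        raise ValueError("all lagged observations are zero; the estimate is undefined")
    return num / den


counts = [3, 4, 2, 5, 6]                 # toy series, not data from the paper
print(cls_alpha(counts, beta=1.0))       # equals 46 / 54
```

For the toy series, the numerator is 3·3 + 4·1 + 2·4 + 5·5 = 46 and the denominator is 9 + 16 + 4 + 25 = 54.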
Theorem 3.1. Assume that $X_1, \ldots, X_n$ is a trajectory from a stationary Poisson INARCH(1) model, that is, $\alpha_n = \alpha < 1$. Then, the CLS estimator $\hat{\alpha}_n$ given in (11) satisfies as $n \to \infty$, where
Proof. From Fokianos, Rahbek and Tjøstheim (2009), we have that $\{X_t\}$ is strictly stationary and ergodic since α < 1. Hence, we can use Theorem 3.2 from Tjøstheim (1986) to establish the asymptotic normality of the CLS estimator $\hat{\alpha}_n$. The other conditions necessary to obtain this weak convergence can be straightforwardly checked in our case and therefore are omitted. Applying this theorem, we get that the asymptotic variance, say $\sigma^2$, assumes the form for the marginal moments of a Poisson INARCH(1) model are given in Weiß (2010). Using these results and the notation considered there with . Direct algebraic manipulations conclude the proof.
From now on assume that {X k , for $t \in \mathbb{N}_0$ and $s \ge 0$, where $\lfloor x \rfloor$ denotes the integer part of $x \in \mathbb{R}$. As in the nearly unstable INAR process by Ispány, Pap and Van Zuijlen (2003), we can express $\hat{\alpha}_n - \alpha_n$ as
In the following lemma, we provide the asymptotic behavior of the autocovariance function of the process $\{W^{(n)}(s); s \ge 0\}$; note that $E(W^{(n)}(s)) = 0$. This will be important to identify the proper normalization of $\hat{\alpha}_n - \alpha_n$ in (12) yielding a non-trivial weak limit.
Lemma 3.2. For $s, v \ge 0$, we have that cov(W
Proof. It is straightforward that $E(W^{(n)}(s)) = 0$ and cov(W , where the last equality follows from the expression of the covariance given in Proposition 2.2. After using the expression of the variance given in that proposition, we obtain that Var(W From the above results and Proposition 2.2, we obtain that
Lemma 3.2 and Theorem 2.3 give us that $\hat{\alpha}_n - \alpha_n = O_p(n^{-1})$. We are now able to establish the asymptotic distribution of the CLS estimator $\hat{\alpha}_n$ under the nearly unstable INARCH process as follows.
, where both numerator and denominator have the same order of magnitude $O_p(n^2)$.
For s > 0, it follows that and then $W^{(n)}(s)$ can be expressed by Define the functions $\Phi_n$ (n = 1, 2, ...) and Φ mapping
Figures 1 and 2, respectively. From Figure 1, it is evident that the normal approximation is not adequate and that it worsens as α gets closer to 1, which is expected since these results are based on stationarity.
asymptotic distribution for all scenarios.
A natural question is what happens when α is not close to 1. To address this point, we run additional simulations with α = 0.7, 0.8, 0.9, and the remaining settings as before. Figures 3 and 4 exhibit histograms and qq-plots of the standardized CLS estimates of α obtained from a Monte Carlo simulation for the stationary and nearly non-stationary Poisson INARCH processes. From Figure 3, we observe some deviation from normality even for the case α = 0.7, which is clearly visible in the qq-plots. Surprisingly, the results based on the nearly unstable methodology are quite satisfactory even for α = 0.7. The same conclusions can be drawn from Figure 4, where we note a good agreement between the empirical standardized CLS estimates and the theoretical asymptotic distribution derived in Theorem 3.3.
All the configurations considered here are repeated with a sample size n = 1000. Figures 5 and 6 give us the histograms and qq-plots of the standardized CLS estimates under the stable and nearly unstable Poisson INARCH processes, respectively, under the settings α = 0.98, 0.99, 0.999. The plots regarding the settings α = 0.7, 0.8, 0.9 for the stable and nearly unstable cases are reported in Figures 7 and 8, respectively.
The conclusions are quite similar to the case n = 500 for the configurations near non-stationarity, α = 0.98, 0.99, 0.999. Regarding the configurations where α = 0.7, 0.8, 0.9, although there is an improvement in the results based on the stationary case (compared to n = 500), deviations from normality can still be observed. In contrast, the nearly unstable approach again works very well and provides the best outcomes. In short, we recommend using the nearly unstable-based approach even when the fitted model may in practice not be too close to the non-stationarity region, because the proposed methodology works well and performs better than the stationary-based approach.
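The Monte Carlo design used in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the value β = 1, the burn-in length, the small number of replications, and all function names are hypothetical, and the sketch reports raw CLS estimates rather than the standardized ones shown in the figures, since standardization requires the asymptotic results of Theorems 3.1 and 3.3.

```python
import math
import random
import statistics


def poisson_draw(lam, rng):
    """Poisson(lam) draw via Knuth's method, in chunks to avoid underflow."""
    count = 0
    while lam > 0:
        chunk = min(lam, 30.0)
        limit = math.exp(-chunk)
        prod = rng.random()
        while prod > limit:
            count += 1
            prod *= rng.random()
        lam -= chunk
    return count


def simulate_inarch1(n, alpha, beta, rng, burn_in=200):
    """Return n observations of a Poisson INARCH(1) path after a burn-in from 0."""
    x = 0
    out = []
    for t in range(burn_in + n):
        x = poisson_draw(beta + alpha * x, rng)
        if t >= burn_in:
            out.append(x)
    return out


def cls_alpha(x, beta):
    """CLS estimate of alpha with beta treated as known."""
    num = sum(prev * (cur - beta) for prev, cur in zip(x[:-1], x[1:]))
    den = sum(prev * prev for prev in x[:-1])
    if den == 0:
        raise ValueError("all lagged observations are zero; the estimate is undefined")
    return num / den


def monte_carlo_cls(alpha, beta, n, reps, seed=0):
    """Collect CLS estimates over independent simulated paths."""
    rng = random.Random(seed)
    return [cls_alpha(simulate_inarch1(n, alpha, beta, rng), beta)
            for _ in range(reps)]


# Illustrative settings only; the paper uses far more replications.
estimates = monte_carlo_cls(alpha=0.7, beta=1.0, n=300, reps=50)
print(statistics.mean(estimates), statistics.stdev(estimates))
```

Histograms or qq-plots of such estimates, after standardization, correspond to the kind of comparison reported in the figures.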
Our interest now is to evaluate the coverages of the confidence intervals based on the asymptotic results under the nearly unstable and stable assumptions. In Table 1, we provide the empirical coverages of confidence intervals, from a Monte Carlo simulation with 10000 replications, for α with significance level at 10%, 5%, and 1% based on Theorem 3.3 (under nearly non-stationarity). The sample size is n = 500 and we consider α = 0.999, 0.99, 0.98, 0.9, 0.8, 0.7. These results show that inference on the correlation parameter using our methodology is satisfactory since the coverages are close to the nominal levels for all cases considered, even when α is not close to the non-stationarity region.
        0.314  0.798  0.995  1.000  1.000  1.000  1.000
500     0.367  0.927  1.000  1.000  1.000  1.000  1.000
1000    0.434  0.998  1.000  1.000  1.000  1.000  1.000
2000    0.551  1.000  1.000  1.000  1.000  1.000  1.000
5000    0.839  1.000  1.000  1.000  1.000  1.000  1.000

Real Data Application
We apply the proposed methodology to the daily number of deaths due to COVID-19 in the United Kingdom from January 30, 2020, to June 4, 2021, yielding n = 492 observations. This dataset is publicly available at https://coronavirus.data.gov.uk. The plot of the daily number of deaths and its associated ACF are provided in Figure 9, which reveals a nearly unstable/non-stationary behavior.
We assume that the time series comes from an NU-INARCH(1) process. The aim of this application is to illustrate that the theoretical results found in this paper can reveal the unit root behavior for a real dataset. We first need to deal with β, which is unknown and can be seen as a nuisance parameter; our primary interest in this paper lies in the correlation parameter $\alpha_n$. One strategy is to estimate β through the conditional maximum likelihood method, which consists in maximizing the conditional log-likelihood, proportional to $\sum_{t=2}^{n} (y_t \log \lambda_t - \lambda_t)$, and then to treat it as known in what follows. This procedure gives $\hat{\beta} = 0.269$. At the end of this application, we will evaluate such an approach by performing a small Monte Carlo simulation study.
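To make the conditional maximum likelihood step concrete, the following Python sketch maximizes the sum of y_t log λ_t − λ_t over β, with λ_t = β + α y_{t−1}. The text does not state how α is handled during this step; holding α fixed at a user-supplied non-negative value is an assumption of this sketch, as are the function name `cml_beta`, the bracketing interval, and the toy data.

```python
def cml_beta(y, alpha, tol=1e-10):
    """Maximize sum_{t>=2} (y_t * log(lam_t) - lam_t) over beta > 0,
    where lam_t = beta + alpha * y_{t-1} and alpha >= 0 is held fixed.

    The objective is concave in beta, so its derivative (the score) is
    decreasing and its root can be located by bisection.
    """
    pairs = list(zip(y[:-1], y[1:]))

    def score(beta):
        return sum(cur / (beta + alpha * prev) - 1.0 for prev, cur in pairs)

    # score(hi) < 0 because every lam_t then exceeds max(y).
    lo, hi = 1e-8, max(y) + 1.0
    if score(lo) <= 0:
        return lo                      # maximum sits at the lower boundary
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy check: with alpha = 0 the maximizer is the mean of y_2, ..., y_n (here 5).
print(cml_beta([2, 4, 6, 5], alpha=0.0))
```

Bisection on the score is used here because the problem is one-dimensional and concave once α is fixed; a joint optimization over (α, β) would need a general-purpose optimizer instead.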
Using (11), we obtain the estimate for the correlation parameter equal to $\hat{\alpha}_n = 0.997$, which is very close to 1. We obtain the standard error of the estimate, s.e.($\hat{\alpha}_n$), using the asymptotic distribution stated in Theorem 3.1, which gives s.e.($\hat{\alpha}_n$) ≈ 0.014. We perform the URT proposed in Section 5 for testing the hypothesis $H_0: \alpha = 1$ against $H_1: \alpha < 1$. We obtain $n(\hat{\alpha}_n - 1) = -1.257 > -17.952 = q_{0.05}$ and therefore we do not reject the null hypothesis of a unit root with significance level at 5%.
We conclude this application by evaluating our strategy of estimating β and then treating it as known. To do this, we run a small Monte Carlo simulation with 1000 replications. In each loop, we generate an NU-INARCH model with β = 0.269, α = 0.997, and n = 492 (specifications of the application), construct confidence intervals for α based on both approaches with fixed and non-fixed (estimated as done in this section and then assumed known) β, and check if they contain the "true" value. The empirical coverages of the 90%, 95%, and 99% confidence intervals under both approaches are reported in Table 4.
As can be seen from this table, the proposed solution given here in the application provides the expected nominal coverages and works even better than the fixed β case for the 90% and 95% coverages; the 99% coverages are very close to each other.

Discussion and Future Research
A nearly unstable INARCH(1) process was introduced and weak convergence of a normalized version was established. The asymptotic distribution of the CLS estimator of the correlation parameter was derived under both nearly unstable and stable cases, which have been explored via Monte Carlo simulations. We also proposed a unit root test and checked its performance in terms of yielding the desired Type-I error and power through simulation. The nearly unstable INARCH approach was applied to the daily number of deaths due to COVID-19 in the UK, which exhibits a non-stationary behavior. The proposed URT has provided evidence for the existence of a unit root, in agreement with the descriptive analysis.
We have assumed that the conditional distribution in (1) is Poisson, but the methods presented in this paper can be easily adapted for other distributional assumptions such as negative binomial or, more generally, mixed Poisson distributions, among others. More specifically, the very same strategy given in Proposition 2.2 and Lemma 3.2 can be employed to find the proper normalizations for the processes $\{X^{(n)}_{\lfloor nt \rfloor}, t \ge 0\}$ and $\{W^{(n)}(t), t \ge 0\}$ in these other cases. After obtaining these results, the asymptotic distributions of the normalized count process and CLS estimator are established following the same steps as those given in Theorems 2.3 and 3.3, respectively. We also believe that extending the results for higher-order INGARCH models deserves future investigation.
and $a_n \approx b_n$ denoting that $\lim_{n\to\infty} a_n/b_n = 1$ for real sequences $\{a_n\}$ and $\{b_n\}$.

Figure 9: Plot of the daily number of deaths due to COVID-19 in the UK and its associated ACF.

Figure 10: Density (based on Gaussian kernel) of $D_0$ using 100000 Monte Carlo replications. Vertical solid and dashed lines represent the statistic and the 0.05-quantile (of the $D_0$ distribution), respectively.

Figure 11: Left: Number of deaths (points) and the predicted mean $E(Y_t \mid Y_{t-1}) = \beta + \alpha_n Y_{t-1}$ (solid line). Right: Number of deaths against predicted means.
The INGARCH methodology is the focus of this paper. Like the existing literature on nearly unstable continuous and INAR processes that assumes first-order autoregressive dependence, in this paper we consider the first-order autoregressive version of the INGARCH approach, which is known as INARCH(1) (INteger-valued AutoRegressive Conditional Heteroskedasticity). Our chief goal in this paper is to introduce a Nearly Unstable INARCH (denoted by NU-INARCH) process for dealing with count time series data.

Table 2: Empirical significance levels obtained from a Monte Carlo study to evaluate the proposed unit root test under some sample sizes and nominal significance levels at 10%, 5%, and 1%.

Table 3: Empirical power obtained from a Monte Carlo study to evaluate the proposed unit root test under some sample sizes and values of α. Significance level at 5%.