Local asymptotic normality and efficient estimation for multivariate GINAR(p) models

Abstract: We derive the Local Asymptotic Normality (LAN) property for the multivariate generalized integer-valued autoregressive (MGINAR) process of order p. The generalized thinning operator in the MGINAR(p) process includes not only the usual Binomial thinning but also Poisson thinning, Geometric thinning, Negative Binomial thinning, and so on. By using the LAN property, we propose an efficient estimation method for the parameter of the MGINAR(p) process. Our procedure is based on the one-step method, which updates initial $\sqrt{n}$-consistent estimators to efficient ones. The one-step method has advantages in both computational simplicity and efficiency. Some numerical results for the asymptotic relative efficiency (ARE) of our estimators and the CLS estimators are presented. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method.


ABOUT THE AUTHOR
Hiroshi Shiraishi received the BS degree in mathematics in 1998, and the MS and Dr degrees in mathematical science from Waseda University, Japan, in 2004 and 2007, respectively. He joined the GE Edison Life Insurance Company, the Prudential Life Insurance Company of Japan, and the Hannover-Re Reinsurance Company in 1998, 2000 and 2005, respectively. His research interests are actuarial science, time series analysis, econometric theory and financial engineering. In particular, he investigates the statistical analysis of discrete-valued time series, the statistical estimation of optimal dividend problems in actuarial science, the statistical estimation of Hawkes graphs, and so on. He is currently an associate professor in the Department of Mathematics, Keio University, Japan. He is a fellow of the Institute of Actuaries of Japan (FIAJ).

PUBLIC INTEREST STATEMENT
We derive the Local Asymptotic Normality (LAN) property for a multivariate generalized integer-valued autoregressive (MGINAR) process with order p. Recently, there has been a growing interest in modelling discrete-valued time series that arise in various fields of statistics.
The MGINAR process is one of the classes of discrete-valued time series models and contains various subclasses. By using the LAN property, we propose an efficient estimation method for the parameter. Our procedure is based on the one-step method, which updates initial root-n consistent estimators to efficient ones. The one-step method has advantages in both computational simplicity and efficiency. Some numerical results for the asymptotic relative efficiency (ARE) of our estimators and the CLS estimators are presented. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method.

Introduction
Recently, there has been a growing interest in modelling discrete-valued time series that arise in various fields of statistics (e.g., Weiß, 2018). This paper is concerned with a special class of observation-driven models termed "integer-valued autoregressive processes" (INAR processes), which were introduced independently by Al-Osh and Alzaid (1987) and McKenzie (1985, 1988). They introduced the integer-valued autoregressive process of order 1 (INAR(1) process) to model non-negative integer-valued phenomena with time dependence. The more general INAR(p) processes were considered by Alzaid and Al-Osh (1990), Du and Li (1991), among others. The INAR process combines the distribution of the thinning operator with the distribution of the innovation process. Alzaid and Al-Osh (1990) discussed the INAR(p) process in the case where the thinning operator follows a Binomial distribution (i.e., is specified), while the distribution of the innovation process is left unspecified. Latour (1998) introduced a generalized version of the INAR(p) process, namely, the causal and stationary generalized integer-valued autoregressive (GINAR(p)) process. The generalized thinning operator in the GINAR(p) process includes not only the usual Binomial thinning but also Poisson thinning, Geometric thinning, and Negative Binomial thinning (Ristić, Bakouch, & Nastić, 2009). The above history concerns the univariate case. Nowadays, extensions to the multivariate case are being actively researched and applied. One of the first approaches to a multivariate thinning mechanism was by McKenzie (1988). After that, Franke and Subba Rao (1993) introduced a multivariate INAR(1) (MINAR(1)) model based on independent Binomial thinning operators. Extensions of the MINAR(1) model were discussed by Latour (1997), Karlis and Pedeli (2013), and others. This paper discusses a parameter estimation problem for a multivariate version of the GINAR(p) (MGINAR(p)) process. The MGINAR model is quite a large class, including all the models described above.
Estimation of the parameters of the INAR(p) process can be carried out in a variety of ways. Common approaches include the method of moments (MM), based on the Yule-Walker equations, and conditional least squares (CLS). The main advantages of both approaches are their simplicity, due to the closed-form formulae, and their robustness, since no distributional assumptions are required. It is known that the MM and CLS estimators are asymptotically equivalent. However, Al-Osh and Alzaid (1987), among others, have recommended using maximum likelihood (ML) estimators instead of MM and CLS estimators because they are less biased for small sample sizes. On the other hand, it is well known that the ML method is computationally unattractive due to complicated transition probabilities involving many convolutions. To overcome this problem, Drost, Van Den Akker, and Werker (2008) considered one-step, asymptotically efficient estimation of the INAR(p) model. Their method reduces the high computational cost caused by the convolutions involved in the ML method. Following Drost et al. (2008), this paper provides a one-step method, which updates initial $\sqrt{n}$-consistent estimators to efficient ones for the MGINAR(p) process. In the class of multivariate integer-valued models, the number of convolutions involved in the likelihood function is very large. For some distributions with the reproductive property, the likelihood can be simplified, but such a simplification is not available for all models in the MGINAR(p) class. Therefore, it is quite important to reduce the computational cost by using one-step estimation. We first establish the local asymptotic normality (LAN) structure for experiments of the MGINAR(p) process. Taking the CLS estimator as the initial estimator, a one-step update estimator is proposed. We also show that this estimator is asymptotically efficient.
The organization of this paper is as follows. In Section 2, the MGINAR(p) process is introduced and the LAN property is established. Section 3 discusses efficient estimation: the CLS estimator is introduced and its asymptotic properties are shown; then, by using the LAN property, a one-step estimator is proposed to update the CLS estimator. In Section 4, the asymptotic relative efficiency (ARE) of the one-step estimator relative to the CLS estimator is examined through simulation experiments. In addition, a real data analysis is provided to illustrate the application of the proposed estimation method. The proofs and other details are included in the Appendix.

The LAN property
Let $\{X_t = (X_{1,t},\dots,X_{d,t})^T;\ t\in\mathbb{Z}_+ = \mathbb{N}\cup\{0\}\}$ be a $d$-dimensional non-negative integer-valued random process (i.e., $X_t(\omega)\in\mathbb{Z}_+^d$). The multivariate generalized integer-valued autoregressive process of order p (MGINAR(p)) is defined by
$$X_t = \sum_{k=1}^{p} A^{(k)}\circ X_{t-k} + \epsilon_t, \qquad (1)$$
where $A^{(k)} = (A^{(k)}_{i,j})_{i,j=1,\dots,d}$ is a $d\times d$ matrix for $k=1,\dots,p$, and the matrix thinning $A^{(k)}\circ X_{t-k}$ gives a $d$-dimensional random vector whose $i$th component is
$$\big(A^{(k)}\circ X_{t-k}\big)_i = \sum_{j=1}^{d}\sum_{r=1}^{X_{j,t-k}}\epsilon^{(k)}_{i,j,r}. \qquad (2)$$
Note that, for any $A\in\mathbb{R}$, $A\circ 0 = 0$. Here $\{\epsilon^{(k)}_{i,j,r};\ r\in\mathbb{Z}_+\}$ is a collection of independent and identically distributed (i.i.d.), non-negative, integer-valued random variables with a common distribution function, and the thinning collections $\{\epsilon^{(k)}_{i,j,r}\}$ and the innovations $\{\epsilon_t\}$ are independent of each other.
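As an illustration of definitions (1) and (2), the following sketch simulates a bivariate MGINAR(1) path in the Binomial-thinning case, where the counting variables $\epsilon^{(1)}_{i,j,r}$ are Bernoulli$(A_{i,j})$, so that each thinning sum is Binomial$(X_{j,t-1}, A_{i,j})$. The Poisson innovation, the function name, and the parameter values are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def simulate_mginar1(A, lam, n, rng=None):
    """Simulate an MGINAR(1) path X_t = A o X_{t-1} + eps_t.

    A   : (d, d) matrix of Bernoulli thinning probabilities A_{i,j}.
    lam : (d,) Poisson means for the innovation eps_t (illustrative choice).
    n   : number of observations generated after the initial value X_0 = 0.
    """
    rng = np.random.default_rng(rng)
    d = A.shape[0]
    X = np.zeros((n + 1, d), dtype=int)          # X[0] is the initial value
    for t in range(1, n + 1):
        for i in range(d):
            # ith component of the matrix thinning: sum over j of
            # X_{j,t-1} i.i.d. Bernoulli(A_{i,j}) counting variables.
            thin = sum(rng.binomial(X[t - 1, j], A[i, j]) for j in range(d))
            X[t, i] = thin + rng.poisson(lam[i])  # add the innovation
    return X

A = np.array([[0.3, 0.1], [0.1, 0.3]])           # stationary: row sums < 1
X = simulate_mginar1(A, lam=np.array([1.0, 2.0]), n=500, rng=1)
```

For these parameters the stationary mean is $(I - A)^{-1}\lambda$, which the sample path should fluctuate around.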
Throughout the paper, the number of lags ($p\in\mathbb{N}$) and the dimension ($d\in\mathbb{N}$) are fixed and known. Let $\mathcal{F}$ and $\mathcal{G}$ denote the classes of distributions of the innovations and of the thinning variables, which belong to parametric classes, say $\mathcal{F} := (F_a;\ a\in\Theta_a)$ and $\mathcal{G} := (G_A;\ A\in\Theta_A)$. For instance, when we consider the Binomial thinning operator and the Binomial innovation, $\mathcal{F}$ and $\mathcal{G}$ are defined by $F_a(x) = a^x(1-a)^{1-x}$ and $G_A(x) = A^x(1-A)^{1-x}$ for $a\in\Theta_a = (0,1)$, $A\in\Theta_A = (0,1)$, and $x\in\{0,1\}$. Our goal is to estimate the parameter $\theta = (a^T, A^T)^T\in\Theta$ efficiently.
For (probability) measures $F_1,\dots,F_k\in\mathcal{F}\cup\mathcal{G}$, we introduce the following notation for convolutions:
• $F_1\circledast F_2$: the convolution of $F_1$ and $F_2$;
• $\circledast_{k=1,\dots,p}F_k$: the convolution of $F_1,\dots,F_k$ (i.e., $\circledast_{k=1,\dots,p}F_k = F_1\circledast\cdots\circledast F_p$);
• $F_1^{n\circledast}$: the $n$-fold convolution of $F_1$ (i.e., $F_1^{n\circledast} = F_1\circledast\cdots\circledast F_1$, $n$ times).
Based on this notation, we consider the corresponding probability space for $\{X_t\}$, denoted by $(\Omega,\mathcal{F},P_{\nu,\theta,F,G})$, where $\Omega$ is the sample space, $\mathcal{F}$ is the $\sigma$-algebra, and $P_{\nu,\theta,F,G}$ is the probability measure determined by $\nu$ (the distribution of $\{X_{-p},\dots,X_{-1}\}$), $\theta$ (the parameter), $F$ (the distribution of the innovations) and $G$ (the distribution of the thinning variables). Furthermore, we introduce $\mathbb{F} = \{\mathcal{F}_t;\ t\ge -p\}$ as the natural filtration generated by $\{X_{-p},\dots,X_t\}$.
From (1) and (2), we can write down the transition probabilities of $\{X_t\}$. For $x_{t-k} = (x_{1,t-k},\dots,x_{d,t-k})^T\in\mathbb{Z}_+^d$, $k = 0,1,\dots,p$, the transition probability $P^\theta_{(x_{t-1},\dots,x_{t-p}),x_t}$ is given by the convolution of the thinning distributions $G_{A^{(k)}_{i,j}}^{x_{j,t-k}\circledast}$ (entering the $i$th component for $i,j = 1,\dots,d$ and $k = 1,\dots,p$) with the innovation distribution $F_a$. We consider parametric MGINAR models in which the parameter space is restricted to the stationary parameter space; for instance, in the case of the Binomial thinning operator and $d = 1$, the thinning part of the parameter space is $\{A;\ A = (A^{(1)},\dots,A^{(p)})^T\in(0,1)^p,\ \sum_{k=1}^{p}A^{(k)} < 1\}$. Suppose that $\mathcal{F}\times\mathcal{G}$ is a combination of families of parametric distributions for the innovation and the thinning operator. The sample space for the observations $(X_{-p},\dots,X_n)$ is $\mathbb{Z}_+^{d(n+1+p)}$, and the power set of $\mathbb{Z}_+^{d(n+1+p)}$ is the $\sigma$-algebra on this sample space. Observing $(X_{-p},\dots,X_n)$ yields the sequence of experiments $\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G})$, $n\in\mathbb{Z}_+$, where the initial distribution $\nu$ is fixed and the distributions of the thinning operator and the innovation are parametrized (i.e., once $\theta = (a^T, A^T)^T\in\Theta$ is fixed, the distributions are given by $F_a$ and $G_A$).
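For intuition, in the univariate case with Binomial thinning and Poisson innovations the transition probability is a plain discrete convolution. The following sketch (our own illustration; all function names are ours) computes $P(X_t = \cdot \mid X_{t-1} = x)$ as the convolution of a Binomial and a truncated Poisson pmf.

```python
import math

def binom_pmf(n, p):
    """pmf of Binomial(n, p) on {0, ..., n}."""
    return [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def poisson_pmf(lam, kmax):
    """pmf of Poisson(lam), truncated to {0, ..., kmax}."""
    return [math.exp(-lam) * lam**k / math.factorial(k) for k in range(kmax + 1)]

def convolve(f, g):
    """Discrete convolution of two pmfs supported on {0, 1, 2, ...}."""
    h = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj
    return h

def transition_pmf(x_prev, A, lam, kmax=30):
    """P(X_t = . | X_{t-1} = x_prev) for a univariate GINAR(1) with
    Binomial thinning (parameter A) and Poisson(lam) innovations:
    the convolution of Binomial(x_prev, A) and a truncated Poisson(lam)."""
    return convolve(binom_pmf(x_prev, A), poisson_pmf(lam, kmax))

pmf = transition_pmf(x_prev=5, A=0.4, lam=1.5)
```

Already for $d = 2$ and general thinning laws, the number of such convolutions grows quickly, which is exactly the computational burden on the ML method that motivates the one-step approach.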
To prove the LAN property for the sequence of experiments $\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G})$, $n\in\mathbb{Z}_+$, we impose Assumptions (A1)-(A7).
To prove that $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$ has the LAN property, we need to determine the behavior of a localized log-likelihood ratio. To this end, we first write down the likelihood
$$L_n(\theta) = \nu(X_{-p},\dots,X_{-1})\prod_{t=0}^{n}P^\theta_{(X_{t-1},\dots,X_{t-p}),X_t},$$
and introduce the log-likelihood contributions $\ell(X_{t-1},\dots,X_{t-p},X_t;\theta) = \log P^\theta_{(X_{t-1},\dots,X_{t-p}),X_t}$. Following Drost et al. (2008), we establish the LAN property by means of a Taylor expansion, for which the transition score $\dot\ell$ is needed. The score is obtained by calculating the partial derivatives of $\ell\,(= \log P)$ with respect to the components of $a$ (Equation (3)) and with respect to the components of $A^{(1)},\dots,A^{(p)}$ (Equation (4)); the latter involve terms of the form $x_{j,t-k}\,g_{A^{(k)}_{i,j}}$, where $g$ denotes the density associated with $G$. Then, by using Equations (3) and (4), we can derive a Taylor expansion of the localized log-likelihood ratio, and the appropriate limit theorems show that $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$ has the LAN property, as follows.

Theorem 1. Suppose that every $F_a\times G_A\in\mathcal{F}\times\mathcal{G}$ with $\theta = (a^T, A^T)^T\in\Theta$ satisfies Assumptions (A1)-(A7), and let $\nu$ be a probability measure on $\mathbb{Z}_+^{dp}$ with finite support. Then the sequence of experiments $(\mathcal{E}^{(n)}(\nu,\Theta,\mathcal{F},\mathcal{G}))_{n\in\mathbb{Z}_+}$ has the LAN property in $\theta\in\Theta$; i.e., for every $u\in\mathbb{R}^q$ the expansion
$$\log\frac{\mathrm{d}P_{\nu,\theta+u/\sqrt{n},F,G}}{\mathrm{d}P_{\nu,\theta,F,G}} = \frac{1}{\sqrt{n}}\sum_{t=0}^{n}u^T\dot\ell(X_{t-1},\dots,X_{t-p},X_t;\theta) - \frac{1}{2}u^TJ(\theta)u + R_n(u,\theta)$$
holds, where the Fisher information matrix $J(\theta) = \mathrm{E}\big[\dot\ell\,\dot\ell^T(X_{-1},\dots,X_{-p},X_0;\theta)\big]$ is nonsingular, and $R_n := R_n(u,\theta)\to_p 0$ under $P_{\nu,\theta,F,G}$.

Efficient estimation
This section provides efficient estimators based on the one-step update method. First, we use the multivariate conditional least squares estimator as an initial estimator of θ (e.g., Bu, McCabe, & Hadri, 2008).
Definition 1. Suppose that $\{X_{-p},\dots,X_0,X_1,\dots,X_n\}$ is observed from the MGINAR(p) process defined by (1). Then the conditional least squares (CLS) estimator $\hat\theta_{CLS}$ of $\theta$ is defined by
$$\hat\theta_{CLS} = \arg\min_{\theta\in\Theta}Q_n(\theta),\qquad Q_n(\theta) = \sum_{t=0}^{n}\big(X_t - g_t(\theta)\big)^T\big(X_t - g_t(\theta)\big),$$
where $g_t(\theta) = \mathrm{E}[X_t\,|\,\mathcal{F}_{t-1}] = a + \sum_{k=1}^{p}A^{(k)}X_{t-k}$.
Note that by setting the derivatives of $Q_n$ with respect to all entries of $\theta$ equal to zero, we obtain the closed form $\hat\theta_{CLS} = (Z^TZ)^{-1}Z^TY$, where $Y\in\mathbb{R}^{d(n+1)}$ stacks the observations $X_0,\dots,X_n$ and $Z\in\mathbb{R}^{d(n+1)\times q}$ is the corresponding design matrix built from intercepts and lagged observations. Then, Du and Li (1991) showed the following.
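To make the closed form concrete, here is a minimal sketch (our own illustration, for $p = 1$) that computes the CLS estimate by least squares regression of $X_t$ on an intercept and $X_{t-1}$; the function name `cls_mginar1` is ours.

```python
import numpy as np

def cls_mginar1(X):
    """Conditional least squares for an MGINAR(1) model.

    Regresses X_t on (1, X_{t-1}); column i of the solution gives
    (a_i, A_{i,1}, ..., A_{i,d}), since E[X_t | X_{t-1}] = a + A X_{t-1}.
    Returns the innovation-mean estimate a_hat and the thinning-matrix
    estimate A_hat.
    """
    X = np.asarray(X, dtype=float)
    Y = X[1:]                                        # responses X_1, ..., X_n
    Z = np.column_stack([np.ones(len(Y)), X[:-1]])   # regressors (1, X_{t-1})
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)     # (d+1, d) coefficients
    a_hat = coef[0]                                  # intercept row
    A_hat = coef[1:].T                               # A_hat[i, j] multiplies X_{j,t-1}
    return a_hat, A_hat
```

Because the objective is quadratic in $\theta$, this single regression solves the minimization exactly; no iterative optimization is required.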
Moreover, we have the following.
Proposition 2. The CLS estimator $\hat\theta_{CLS}$ is not asymptotically efficient, since, except in some special cases, there exists $w\in\mathbb{R}^q$ such that $w^T\big(\Gamma^{-1}\Sigma\Gamma^{-1} - J^{-1}\big)w > 0$ under $P_{\nu,\theta,F,G}$.
Since we have a $\sqrt{n}$-consistent but inefficient estimator of $\theta$, we update the CLS estimator to an efficient estimator by using the LAN result.
Theorem 2. Let $\nu$ be a probability measure on $\mathbb{Z}_+^{dp}$ with finite support, and let $\hat\theta_{CLS}$ be the CLS estimator. Define the one-step estimator
$$\hat\theta^* = \hat\theta_{CLS} + \frac{1}{n}J(\hat\theta_{CLS})^{-1}\sum_{t=0}^{n}\dot\ell(X_{t-1},\dots,X_{t-p},X_t;\hat\theta_{CLS}).$$
Then $\sqrt{n}(\hat\theta^* - \theta)\to_d N\big(0, J(\theta)^{-1}\big)$ under $P_{\nu,\theta,F,G}$; that is, $\hat\theta^*$ is asymptotically efficient.
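The one-step update can be sketched numerically as follows. This is our own generic illustration, not the paper's exact procedure: `loglik_terms` stands for a user-supplied function returning the per-transition log-likelihood contributions $\ell$, the score $\dot\ell$ is approximated by central finite differences, and the Fisher information by the average outer product of the per-term scores.

```python
import numpy as np

def one_step_update(theta_cls, loglik_terms, h=1e-5):
    """One-step (Newton-type) update of a sqrt(n)-consistent estimator.

    theta_cls    : initial sqrt(n)-consistent estimate (length-q vector).
    loglik_terms : function theta -> length-n array of per-transition
                   log-likelihood contributions l(..., X_t; theta).
    """
    theta = np.asarray(theta_cls, dtype=float)
    q = theta.size
    n = loglik_terms(theta).size
    scores = np.zeros((n, q))
    for j in range(q):
        e = np.zeros(q)
        e[j] = h
        # central finite-difference approximation of the jth score component
        scores[:, j] = (loglik_terms(theta + e) - loglik_terms(theta - e)) / (2 * h)
    J_hat = scores.T @ scores / n        # empirical Fisher information
    delta = scores.sum(axis=0) / n       # average score at theta_cls
    return theta + np.linalg.solve(J_hat, delta)
```

The update requires the transition log-probabilities only at (a neighborhood of) the initial estimate, instead of repeatedly over an iterative search, which is the computational advantage stressed in the text.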

Numerical study
In this section, we first examine the asymptotic relative efficiency (ARE) of our proposed estimator relative to the CLS estimator through simulation experiments. Then, we present a real data analysis to illustrate the application of the proposed estimation method.

Simulation study
Our proposed estimator ($\hat\theta^*$) and the conditional least squares estimator ($\hat\theta_{CLS}$) under the MGINAR model are compared through a series of simulation experiments. Specifically, we assess the small-sample properties of the two estimators, with $d = 2$ and $p = 1$, in the following cases: a Binomial thinning operator and a Binomial innovation (Case 1); and a Poisson thinning operator and a Poisson innovation (Case 2). In each case, the count series $\{X_t\}$ is generated from model (1), and the true parameter vector $\theta$ is chosen so that the count series is stationary.
We ran 1000 Monte Carlo replications with sample sizes n = 10, 100, 1000, 10000. For each replication, we estimate the model parameters by the two procedures (CLS: $\hat\theta_{CLS}$; Efficient Est: $\hat\theta^*$) and calculate the (approximated) bias (Table 1) and the diagonal part of the (approximated) MSE (Table 2) of the parameter estimators. Finally, we calculate the (approximated) ARE (Table 3).
The bias results are reported in Table 1. It can be seen that the biases of both estimators tend to 0 when the sample size is sufficiently large, which implies that both estimators are asymptotically unbiased. However, for the CLS estimates of $A_{12}$ and $A_{21}$, the biases are relatively large, which implies that the convergence rate is relatively slow. In contrast, our proposed estimator improves on the CLS estimator in terms of bias.
The corresponding MSE results are displayed in Table 2. As with the bias results, the MSEs of both estimators tend to 0 when the sample size is sufficiently large, which implies that both estimators converge to the true values in probability. However, for the CLS estimates of $A_{12}$ and $A_{21}$, the MSEs are relatively large, which implies that the convergence rate of the variance is relatively slow. In contrast, our proposed estimator improves on the CLS estimator, as the MSEs of all of its components appear to converge to 0.
Finally, the ARE (asymptotic relative efficiency) results are given in Table 3. The ARE of two estimators is defined as the ratio of their asymptotic variances (e.g., Cox & Hinkley, 1974; Serfling, 2011). Let $\hat\theta_1$ and $\hat\theta_2$ be two estimators of $\theta\in\Theta\subset\mathbb{R}^q$, and let $\Sigma_1$ and $\Sigma_2$ be their asymptotic covariance matrices, i.e., $\sqrt{n}(\hat\theta_i - \theta)\to_d N(0,\Sigma_i)$ for i = 1, 2. Then, the ARE of $\hat\theta_2$ relative to $\hat\theta_1$ is given componentwise by the ratios of the diagonal entries of $\Sigma_1$ and $\Sigma_2$. In our study, we consider the ARE of our proposed estimator ($\hat\theta^*$, with asymptotic covariance $J^{-1}$) relative to the conditional least squares estimator ($\hat\theta_{CLS}$, with asymptotic covariance $\Gamma^{-1}\Sigma\Gamma^{-1}$). Clearly, in this setup, an ARE larger than 1 suggests that our proposed estimator improves on the CLS estimator in terms of efficiency. Table 3 reports the ARE results, with $\Gamma^{-1}\Sigma\Gamma^{-1}$ and $J^{-1}$ replaced by the sample MSEs. It can be seen that the ARE tends to exceed 1 as the sample size increases, which implies that our estimator improves on the CLS estimator in terms of efficiency. We also ran the same simulation studies under several other parameter settings; the results, which we omit, are similar.
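As a small illustration of the sample-MSE convention just described (our own sketch; the componentwise-ratio reading of Table 3 is an assumption, and the numbers below are hypothetical, not from the paper):

```python
import numpy as np

def sample_are(mse_cls, mse_eff):
    """Componentwise approximated ARE: ratio of the CLS estimator's sample
    MSEs to the efficient estimator's sample MSEs, so that values above 1
    favour the efficient (one-step) estimator."""
    return np.asarray(mse_cls, dtype=float) / np.asarray(mse_eff, dtype=float)

# Hypothetical per-component sample MSEs for illustration only.
ratios = sample_are([0.20, 0.40], [0.10, 0.25])
```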

Real data analysis
The data set consists of the weekly numbers of cases of infectious diseases by prefecture for the period 2015-2018 (208 weeks), as reported by the National Institute of Infectious Diseases (NIID) in Japan (URL: https://www.niid.go.jp/niid/en/). Here we use the numbers of cases of "Epidemic keratoconjunctivitis (EK)" and "Aseptic meningitis (AM)" in Shimane prefecture. The sample path plot for the data in Figure 1 reveals some seasonality or periodicity, but no trend is apparent and the series look stationary. Figure 2 shows the sample ACF and PACF plots for each disease. The figure shows time dependency, but no long-range dependence is observed, so it is acceptable to fit the MGINAR(1) model.
We suppose that the time series of count data $\{X_t = (X_{1,t}, X_{2,t})^T\}$ follows the MGINAR(1) model (1), where both the thinning operator and the innovation follow the Binomial or the Poisson distribution. The CLS estimates $(\hat a_{CLS}, \hat A_{CLS})$ of the parameter $(a, A)$ are then computed from the data. In what follows, for simplicity, we write $(X_s,\dots,X_t) := X_{s:t}$ and $(x_s,\dots,x_t) := x_{s:t}$ for $s,t\in\mathbb{Z}$ with $s < t$.