Inference and testing for structural change in general Poisson autoregressive models

Abstract: We jointly consider inference and the change-point problem in a large class of Poisson autoregressive models (see Tjøstheim, 2012 [34]). The conditional mean (or intensity) of the process is a non-linear function of its past values and of the past observations. Under Lipschitz-type conditions, it can be written as a function of the lagged observations. For the latter model, assume that the link function depends on an unknown parameter θ 0 . The consistency and the asymptotic normality of the maximum likelihood estimator of the parameter are proved. These results are used to study the change-point problem in the parameter θ 0 . From the likelihood of the observations, two tests are proposed. Under the null hypothesis (i.e. no change), each of these test statistics converges to an explicit distribution. Consistency under alternatives is proved for both tests. Simulation results show how these procedures work in practice, and applications to real data are also presented.


Introduction
Time series of counts appear as natural for modeling count events. Some examples can be found in epidemiology (number of new infections), in finance (number of transactions per minute), in industrial quality control (number of defects), just to name a few. We refer the reader to Held et al. (2005) [23], Brännäs and Quoreshi (2010) [5], Lambert (1992) [30] among others, for more details.
Real advances have been made in count time series modeling during the last two decades. Let Y = (Y t ) t∈Z be an integer-valued time series; denote by F t = σ(Y s , s ≤ t) the σ-field generated by the whole past at time t and by L(Y t /F t−1 ) the conditional distribution of Y t given the past. A model is characterized by the type of marginal distribution L(Y t /F t−1 ) and by the dependence structure between L(Y t /F t−1 ) and the past. Models with various marginal distributions and dependence structures have been studied; see for instance Kedem and Fokianos (2002) [26], Davis et al. (2005) [8], Ferland et al. (2006) [15], Davis and Wu (2009) [10], Weiß (2009) [35]. Fokianos et al. (2009) [19] considered the Poisson autoregression in which L(Y t /F t−1 ) is Poisson distributed with an intensity λ t that is a function of λ t−1 and Y t−1 . Under linear autoregression, they proved both the consistency and the asymptotic normality of the maximum likelihood estimator of the regression parameter by using a perturbation approach, which allows one to use the standard Markov ergodic setting. Fokianos and Tjøstheim (2012) [20] extended the method to nonlinear Poisson autoregression with λ t = f (λ t−1 ) + b(Y t−1 ) for nonlinear measurable functions f and b. In the same vein, Neumann (2011) [31] studied a much more general model where λ t = f (λ t−1 , Y t−1 ). He focused on the absolute regularity and the specification test for the intensity function, while the recent work of Fokianos and Neumann (2013) [18] studied goodness-of-fit tests which are able to detect local alternatives. Doukhan et al. [13] considered a more general model with infinitely many lags. Stationarity and the existence of moments are proved by using a weak dependence approach and contraction properties.
Later, Davis and Liu (2012) [9] studied the model where, the distribution L(Y t /F t−1 ) belongs to a class of one-parameter exponential family with finite order dependence. This class contains Poisson and negative binomial (with fixed number of failures) distribution. From the theory of iterated random functions, they established the stationarity and the absolute regularity properties of the process. They also proved the consistency and asymptotic normality of the maximum likelihood estimator of the parameter of the model. Douc et al. (2013) [11] considered a class of observation-driven time series which covers linear, log-linear, and threshold Poisson autoregressions. Their approach is based on a recent theory for Markov chains based upon Hairer and Mattingly (2006) [22] recent work; this allows existence and uniqueness of the invariant distribution for Markov chains without irreducibility. Further, they proved the consistency of the conditional likelihood estimator of the model (even for mis-specified models); the asymptotic normality is not yet considered in this setting.
Asymptotic theory for inference on time series models usually requires stationarity of the process. In practice, however, real data often suffer from non-stationarity, which may be due to structural changes occurring during the data generating period. Several ways to consider such structural changes are possible, as was demonstrated during the thematic cycle Nonstationarity and Risk Management held in Cergy-Pontoise during 2012. In the context of count models, Kang and Lee (2009) [25] proposed a CUSUM procedure for testing parameter changes in a first-order random coefficient integer-valued autoregressive model defined through a thinning operator. Fokianos and Fried (2010, 2012) [16,17] studied mean shifts in linear and log-linear Poisson autoregression. Dependence between the level shift and time allows their model to detect several types of intervention effects such as outliers. Franke et al. (2012) [21] considered parameter change in Poisson autoregression of order one. Their tests are based on the cumulative sum of residuals using the conditional least-squares estimator.
Here, we shall first consider a time series of counts $Y = (Y_t)_{t\in\mathbb{Z}}$ satisfying

$$Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t) \quad\text{with}\quad \lambda_t = F(\lambda_{t-1}, \lambda_{t-2}, \ldots; Y_{t-1}, Y_{t-2}, \ldots), \tag{1}$$

where $\mathcal{F}_t = \sigma(Y_s, s \le t)$ and $F$ is a measurable non-negative function. The properties of the general class of Poisson autoregressive models (1) have been investigated in Doukhan et al. [13]. Such infinite order processes provide a flexible way to take into account dependence on the past observations. Proceeding as in Doukhan and Wintenberger (2008) [12], we show that under some Lipschitz-type conditions on $F$, the conditional mean $\lambda_t$ can be written as a function of the past observations. This leads us to consider the model

$$Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t) \quad\text{with}\quad \lambda_t = f(Y_{t-1}, Y_{t-2}, \ldots), \tag{2}$$

where $f$ is a measurable non-negative function. We assume that $f$ is known up to a parameter $\theta_0$ belonging to some compact set $\Theta \subset \mathbb{R}^d$. That is,

$$Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t) \quad\text{with}\quad \lambda_t = f_{\theta_0}(Y_{t-1}, Y_{t-2}, \ldots) \ \text{and}\ \theta_0 \in \Theta. \tag{3}$$

Model (3) (as well as models (1) and (2)) can be represented in terms of Poisson processes. Let $\{N_t(\cdot)\,;\ t = 1, 2, \ldots\}$ be a sequence of independent Poisson processes of unit intensity. $Y_t$ can be seen as the number (say $N_t(\lambda_t)$) of events of $N_t(\cdot)$ in the time interval $[0, \lambda_t]$. So, we have the representation $Y_t = N_t(\lambda_t)$.

Poisson autoregressive models are known to capture the overdispersion phenomenon in count data: if the process $(Y_t)_{t\in\mathbb{Z}}$ is stationary, then $\mathrm{Var}(Y_t) \ge \mathbb{E}(Y_t)$, since $\mathrm{Var}(Y_t) = \mathbb{E}(\lambda_t) + \mathrm{Var}(\lambda_t)$.

The paper first works out the asymptotic properties of the maximum likelihood estimator of model (3). Under some Lipschitz-type assumptions on the function $f_\theta$, we investigate sufficient conditions for the consistency and the asymptotic normality of the maximum likelihood estimator of $\theta_0$. Note that Doukhan et al. [13] used a weak dependence approach to prove the existence of a stationary and ergodic solution of (1). By using their results, some assumptions (such as an increasing condition on $f_\theta(\cdot)$ or four times differentiability of the function $\theta \mapsto f_\theta$) that are often needed for the perturbation technique (see for instance Fokianos et al. [19] or Fokianos and Tjøstheim [20]) are relaxed. Although the models studied by Davis and Liu [9] and Douc et al. [11] allow large classes of marginal distributions, the infinitely many lags of model (1) (or model (3)) enable a higher order dependence structure.
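The Poisson process representation and the overdispersion property can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes a linear first-order intensity λ t = α 0 + α 1 λ t−1 + β 1 Y t−1 as one member of the class, and the function names and parameter values are purely illustrative. Each count is drawn as the number of events of a unit-rate Poisson process in [0, λ t ].

```python
import random
import statistics

def poisson_via_unit_process(lam, rng):
    """Count the events of a unit-rate Poisson process falling in [0, lam]."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival <= lam:
        count += 1
        arrival += rng.expovariate(1.0)  # next inter-arrival time
    return count

def simulate_ingarch11(n, alpha0, alpha1, beta1, seed=0, burn_in=500):
    """Simulate lambda_t = alpha0 + alpha1*lambda_{t-1} + beta1*Y_{t-1} (illustrative choice)."""
    rng = random.Random(seed)
    lam = alpha0 / (1.0 - alpha1 - beta1)  # start at the stationary mean
    y = poisson_via_unit_process(lam, rng)
    path = []
    for t in range(n + burn_in):
        lam = alpha0 + alpha1 * lam + beta1 * y
        y = poisson_via_unit_process(lam, rng)
        if t >= burn_in:  # discard the burn-in period
            path.append(y)
    return path

# Hypothetical parameter values, chosen only so that alpha1 + beta1 < 1.
counts = simulate_ingarch11(5000, alpha0=1.0, alpha1=0.3, beta1=0.5)
print("mean:", statistics.mean(counts), "variance:", statistics.variance(counts))
```

On such a path the empirical variance should exceed the empirical mean, in line with Var(Y t ) = E(λ t ) + Var(λ t ).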
The second contribution of this work is a pair of tests for change detection in model (3). We propose a new idea to take into account the change-point alternative. As a consequence, the proposed procedures are numerically easier to apply than those proposed by Kengne (2012) [27]. The consistency under the alternative is proved. Contrary to Franke et al. [21], the multiple change alternative is considered and independence between the observations before and after the change-point is not assumed. Note that the intervention problem studied by Fokianos and Fried [16,17] targets sudden shifts in the conditional mean of the process. Such an outlier could in some cases be seen as a particular case of the structural change problem that we develop here for a large class of models. However, if the intervention affects only a few data points, then in the classical change-point setting (where the length of each regime tends to infinity at the same rate as the sample size) such effects will be asymptotically negligible.
Section 2 provides some assumptions on the model, with examples. Section 3 is devoted to the definition of the maximum likelihood estimator and its asymptotic properties. In Section 4, we propose the tests for detecting changes in the parameter of model (3). Some simulation results and real data applications for inference and change-point detection are presented in Sections 5 and 6. Lastly, the proofs of the main results are provided in Section 7.

Assumptions
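We will use the following classical notations:
1. $\|y\| := \sum_{j=1}^{p} |y_j|$ for any $y \in \mathbb{R}^p$;
2. for any compact set $K \subseteq \mathbb{R}^d$ and for any function $g : \ldots$

A classical Lipschitz-type condition is assumed on model (1).

Assumption (A F ): There exists a sequence of non-negative real numbers $(\alpha_j)_{j\ge 1}$ satisfying $\sum_{j=1}^{\infty} \alpha_j < 1/(1+\varepsilon)$ for some $\varepsilon > 0$ and such that, for any $x, x' \in ((0,\infty)\times\mathbb{N})^{\mathbb{N}}$,

$$|F(x) - F(x')| \le \sum_{j=1}^{\infty} \alpha_j \|x_j - x'_j\|.$$

Under assumption (A F ), Doukhan et al. [13,14] prove that model (1) has a strictly stationary solution $(Y_t, \lambda_t)_{t\in\mathbb{Z}}$; moreover, this solution is τ-weakly dependent with finite moments of any order. The following proposition shows that the conditional mean $\lambda_t$ of model (1) can be expressed as a function of only the past observations of the process.

Proposition 2.1. Under assumption (A F ), the conditional mean of the stationary solution of (1) satisfies $\lambda_t = f(Y_{t-1}, Y_{t-2}, \ldots)$ for any $t \in \mathbb{Z}$, where $f : \mathbb{N}^{\mathbb{N}} \to \mathbb{R}_+$ is a measurable function.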
From Proposition 2.1, it appears that the information on the unobservable process (λ t−j ) can be captured by the observable process (Y t−j ). Hence, we will focus on model (3), which has the advantage that the derivative ∂λ t /∂θ is easy to compute (this is very useful for deriving the asymptotic covariance matrix in the inference study). Note that if one carries out inference on model (1) by assuming that F = F θ depends on a parameter θ, then it will not be easy (or not possible) to compute ∂λ t /∂θ or to express it as a function of ∂F θ /∂θ in the general case.
We focus on the model (3) with the following assumptions. For i = 0, 1, 2 and for any compact set K ⊂ Θ, we introduce and there exists a sequence of non-negative real numbers (α The Lipschitz-type condition A 0 (Θ) is the parametric version of the assumption A F . It is classical when studying the existence of solutions of such model (see for instance [12,1] or [13]). In particular, A 0 (Θ) implies that for all θ ∈ Θ and y ∈ (R + ) N , The latter relation is a useful tool for proving that the stationary solution of (3) admits finite moments (see the proof of Theorem 2.1 of [13]). The assumptions A 1 (K) and A 2 (K) as well as the following assumptions D(Θ), Id(Θ) and Var(Θ) are needed to define and to study the asymptotic properties of the maximum likelihood estimator of the model (3).

Particular case of INGARCH(p, q) processes Assume that
Hence, the Lipschitz-type condition (A F ) is satisfied. In this case, we can find, for any θ ∈ Θ, a sequence of non-negative real numbers (ψ k (θ)) k≥0 satisfying Therefore, assumption A 0 (Θ) holds. Moreover, the functions ψ k (θ) are twice continuously differentiable with respect to θ and their derivatives decay exponentially, hence A 1 (Θ) and A 2 (Θ) hold. If inf θ∈Θ (α 0 ) > 0 then D(Θ) holds. For this particular case, Id(Θ) holds automatically. See [15] and [35] for more details on this model. The adequacy of this linear model for the number of transactions per minute for the stock Ericsson B on July 2, 2002 has been established by Fokianos and Neumann [18].
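For the first-order case, the infinite-lag representation can be written explicitly. As a sketch, assume the convention λ t = α 0 + α 1 λ t−1 + β 1 Y t−1 with 0 ≤ α 1 < 1 (the convention used for the INGARCH(1,1) model in the numerical sections). Unrolling the recursion gives λ t = α 0 /(1 − α 1 ) + β 1 Σ k≥1 α 1 ^(k−1) Y t−k , so the lag coefficients decay geometrically and sum to β 1 /(1 − α 1 ), which is below one exactly when α 1 + β 1 < 1. The code below checks this identity numerically on a short hypothetical series; all names and values are illustrative.

```python
def lambda_by_recursion(y, alpha0, alpha1, beta1):
    """Intensities from the recursion, started at alpha0 / (1 - alpha1)."""
    lam = [alpha0 / (1.0 - alpha1)]
    for obs in y:
        lam.append(alpha0 + alpha1 * lam[-1] + beta1 * obs)
    return lam  # lam[t] is the intensity given y[0], ..., y[t-1]

def lambda_by_expansion(y, alpha0, alpha1, beta1):
    """Same intensities from the unrolled form, unobserved past counts set to 0."""
    psi0 = alpha0 / (1.0 - alpha1)
    lam = []
    for t in range(len(y) + 1):
        past = y[:t][::-1]  # most recent observation first
        lam.append(psi0 + sum(beta1 * alpha1 ** k * obs for k, obs in enumerate(past)))
    return lam

# Hypothetical counts and parameter values.
y = [3, 0, 5, 2, 7, 1]
rec = lambda_by_recursion(y, 1.0, 0.3, 0.5)
exp_ = lambda_by_expansion(y, 1.0, 0.3, 0.5)
print(max(abs(a - b) for a, b in zip(rec, exp_)))
```

The two computations agree up to floating-point error, which is the property that makes the recursion a convenient way to evaluate the infinite-lag form.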

Threshold Poisson autoregression
We consider a threshold Poisson autoregression model defined by: where We can also write This example of nonlinear model is also called an integer-valued threshold ARCH (or INTARCH) due to its definition like the threshold ARCH model proposed by Zakoïan (1994) [36]; see also [21] for INTARCH(1) model. In the INTARCH(∞) model, the regression coefficient such model can then be used to capture a piecewise phenomenon. is the threshold parameter of the model. If
For any k, k ∈ Z such as k ≤ k , denote Theorem 3.1. Let (j n ) n≥1 and (k n ) n≥1 be two integer valued sequences such that j n ≤ k n , k n → +∞ and k n − j n → +∞ as n → +∞. Assume θ 0 ∈ • Θ and D(Θ), Id(Θ) and A 0 (Θ) hold with It holds that θ n (T jn,kn ) a.s.
The following theorem regarding the asymptotic normality of the MLE of model (3) holds.
Theorem 3.2. Let (j n ) n≥1 and (k n ) n≥1 be two integer valued sequences such that j n ≤ k n , k n → +∞ and k n − j n → +∞ as n → +∞. Under the assumptions of Theorem 3.1 and Var(Θ), if for
According to the Lemma 7.2 and the proof of Theorem 3.2, the matrix are both consistent estimators of Σ.  2. In Theorems 3.1 and 3.2, the typical sequences j n = 1 and k n = n, ∀n ≥ 1 can be chosen. This choice is the case where the estimator is computed with all the observations. But in the change-point study and depending on the procedure used, one might need to compute the estimator on each regime. Results are written this way to cover this situation. 3. If the Lipschitz coefficients (α (10) and (11) hold.

Testing for parameter changes
We consider the observations Y 1 , . . . , Y n generated as in model (3) and assume that the parameter θ 0 may change over time. More precisely, we assume that j ) t∈Z is a stationary solution of (3) depending on θ * j . The case where the parameter does not change corresponds to K = 1. This problem leads to the following test hypotheses: (3), depending on θ * j . Let us note that, contrary to Franke et al. [21], the independence between the observations before and after the change-point is not assumed. Moreover, their assumption (A9) impose a change in the mean of the marginal distribution of the observed process. In the case of linear Poisson autoregression, this condition leads to a change in the unconditional mean. The procedures developed here include the situation where parameter can change but not the mean of the marginal distribution (see an example of empirical results in the Section 6). Therefore, the present change-point problem is more general.
Hence for any y = (y k ) k≥1 and y = (y k ) k≥1 : (12) where the second equality follows by seeing By using assumption A 0 (Θ) and relation (12), one can show that the approximated process ( Y t , λ t ) t * j−1 <t≤t * j converges (in L r for any r ≥ 1) to the stationary regime (see for instance Bardet et al. [2] where similar approximation has been made for a class of causal time series models). So, the results of Section 4.2 can be extended (after an approximation study) by relaxing the stationarity assumption after change.
Recall that under H 0 , the likelihood function of the model computed on T ⊂ {1, . . . , n} is given by 1 , . . . ) and the maximum likelihood estimator is given by θ n (T ) = argmax θ∈Θ L n (T, θ). It holds from Theorem 3.2 that, under H 0 , the asymptotic covariance matrix of . Σ n is a consistent estimator of under H 0 (see the proof of Theorem 3.2). The consistency of Σ n under H 1 is not ensured, because Σ n does not take into account the change-point alternative. So, the consistency under H 1 of any test based on Σ n will not be easy to prove. Let (u n ) n≥1 and (v n ) n≥1 be two integer valued sequences satisfying u n , v n → +∞ and u n /n, v n /n → 0 as n → +∞. Our test statistic is based on the following matrix . Theorem 3.1 and Lemma 7.2 show that Σ n (u n ) is consistent under H 0 . Under H 1 , we will use the classical assumption that the breakpoint grows at the rate n. This will allow us to show that the first component of Σ n (u n ) converges to the covariance matrix of the stationary model of the first regime. This will be the key to proving the consistency under H 1 . Another option is to consider the matrix Asymptotically, both the matrices Σ n (u n ) and Σ n (u n ) have the same behavior under H 0 . In the case of non-stationarity after the change, the procedure based on Σ n (u n ) can suffer more distortion because, owing to the dependence on the past, the second component of Σ n (u n ) will converge more slowly than the first component of Σ n (u n ).
Let us now define the test statistics: where q is a weight function defined on (0, 1), see below. The first procedure is based on the statistic C n and the other one is based on Q n defined by The weight function q is used to increase the power of the test based on the statistic C n . Its behavior can be controlled in the neighborhood of zero and one by the integral see Csörgo et al. [6] or Csörgo and Horváth [7]. The natural choice is Furthermore, in practice the sequences (u n ) n≥1 and (v n ) n≥1 are chosen to ensure the convergence of the numerical algorithm used to compute θ n (T 1,un ) and θ n (T 1,vn ). Hence, these procedures might not be accurate for small sample sizes (i.e. when n < 200). For the Poisson INGARCH model, u n = v n = [(log n) δ0 ] (with 5/2 ≤ δ 0 ≤ 3) can be chosen (see also Remark 1 of [27]).

Asymptotic behavior under the null hypothesis
The asymptotic distributions of these statistics under H 0 are given in the next theorem.
The distribution of sup 0≤τ ≤1 W d (τ ) 2 is explicitly known. In the general case, the quantile values of the limit distribution of the first procedure (based on C n ) can be computed through Monte-Carlo simulations. In the sequel, we will take q ≡ 1. The Theorem 4.2 below implies that the statistics C n and Q n are too large under the alternative. For any α ∈ (0, 1), denote c α the (1−α)-quantile of the distribution of sup 0≤τ ≤1 W d (τ ) 2 . Then at a nominal level α ∈ (0, 1), take ( C n > c α ) as the critical region of the test procedure based on C n . This test has correct size asymptotically. On the other hand, it holds that lim sup So we can use c α/2 as the critical value of the test based on Q n i.e. ( Q n > c α/2 ) as the critical region. This leads to an asymptotically conservative procedure. To get correct asymptotic size in the procedure based on Q n , we have to study the asymptotic distribution of ( Q (1) n , Q (2) n ). This seems to be a very difficult problem in view of the dependence structure of the model and the general structure of the parameter. In the problem of discriminating between long-range dependence and changes in mean, Berkes et al. [3] have studied the limit distribution of such statistic (i.e. the maximum of the maximum between the statistic based on the estimator computed with the observations until the time k (X 1 , . . . , X k ) and the one computed with the observations after k (X k+1 , . . . , X n )). This problem is the topic of a different research project.
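The critical values c α can be approximated by simulation. The sketch below is an illustration under a stated assumption, namely that W d denotes a d-dimensional Brownian bridge with independent components; the grid size, the number of replications and the function names are arbitrary choices, not taken from the paper. Each component is built from a Gaussian random walk B on a grid and tied down through W(τ) = B(τ) − τ B(1).

```python
import random

def sup_sq_norm_bridge(d, steps, rng):
    """Maximum over a grid of the squared Euclidean norm of a d-dim Brownian bridge."""
    scale = (1.0 / steps) ** 0.5
    walks = []
    for _ in range(d):
        pos, path = 0.0, [0.0]
        for _ in range(steps):
            pos += rng.gauss(0.0, scale)  # Brownian increment over one grid cell
            path.append(pos)
        walks.append(path)
    best = 0.0
    for i in range(steps + 1):
        tau = i / steps
        # Tie each walk down at tau = 1 to obtain a bridge, then take the squared norm.
        sq = sum((w[i] - tau * w[-1]) ** 2 for w in walks)
        best = max(best, sq)
    return best

def critical_value(alpha, d, reps=1000, steps=200, seed=0):
    """Empirical (1 - alpha)-quantile of the simulated supremum."""
    rng = random.Random(seed)
    draws = sorted(sup_sq_norm_bridge(d, steps, rng) for _ in range(reps))
    return draws[min(reps - 1, int((1.0 - alpha) * reps))]

# Example: approximate 5% critical value for a one-dimensional parameter.
print(critical_value(0.05, d=1))
```

Because the supremum is taken over a finite grid, the simulated quantiles are slightly biased downwards; a finer grid and more replications reduce both the bias and the Monte Carlo error. For the conservative procedure based on Q n , the same function would be called with α/2.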

Asymptotic under the alternative
Under H 1 , we assume is the integer part of x). The asymptotic behaviors of these test statistics are given by the following theorem.
It follows that the procedure based on C n is consistent under a single change alternative while the statistic Q n diverges to infinity even under multiple changepoints alternative. So, combined with an iterated cumulative sums of squares type algorithm (see [24]), the latter procedure can be used to estimate the number and the break points in the multiple change-points problem.
The Figure 1 is an illustration of these tests for the linear Poisson autoregressive model of order 1 One can see that, under H 0 , the statistics C n,k , Q n,k and Q (2) n,k are all below the horizontal line (see c-), e-), g-)) which represents the limit of the critical region. These statistics are greater than the critical value in the neighborhood of the breakpoint under H 1 (see d-), f-), h-)). In several situations, only one of the statistics Q (1) n and Q (2) n is greater than the critical value under the alternative; so the use of Q n := max( Q (1) n , Q (2) n ) is needed to get more powerful procedure (see for instance the real data application in Subsection 6.2).

Some numerical results for inference in INTGARCH model
We consider the integer-valued threshold GARCH(p, q) (or INTGARCH(p, q)) the true parameter of the model. This model is a special case of the INTARCH(∞) see (7).

Estimation and identification
Assume that a trajectory (Y 1 , . . . , Y n ) of Y is observed. If the orders (p, q) and the threshold are known, then the parameter θ 0 can be estimated by maximizing the conditional log-likelihood defined in (8), and Theorems 3.1 and 3.2 apply.
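To make the estimation step concrete, here is a minimal sketch of a Poisson conditional log-likelihood and of its maximization. It is not the authors' implementation: it assumes the standard form Σ t (Y t log λ t (θ) − λ t (θ)), with the term log(Y t !) dropped because it does not depend on θ, it uses the linear first-order intensity λ t = α 0 + α 1 λ t−1 + β 1 Y t−1 instead of the threshold model, and it replaces a numerical optimizer by a crude search over a hypothetical grid.

```python
import math

def conditional_loglik(y, alpha0, alpha1, beta1):
    """Sum of Y_t*log(lam_t) - lam_t, with unobserved past counts set to zero."""
    lam = alpha0 / (1.0 - alpha1)  # intensity implied by an all-zero past
    total = 0.0
    for obs in y:
        total += obs * math.log(lam) - lam
        lam = alpha0 + alpha1 * lam + beta1 * obs  # intensity for the next time point
    return total

def grid_mle(y, grid):
    """Return the grid point (alpha0, alpha1, beta1) with the largest log-likelihood."""
    return max(grid, key=lambda theta: conditional_loglik(y, *theta))

# Hypothetical data and a coarse hypothetical grid over the stationarity region.
y = [4, 6, 3, 5, 9, 7, 4, 2, 3, 5, 8, 6]
grid = [(a0 / 2, a1 / 10, b1 / 10)
        for a0 in range(1, 9) for a1 in range(0, 9) for b1 in range(0, 9)
        if a1 + b1 < 10]
print(grid_mle(y, grid))
```

In practice the grid search would be replaced by a constrained numerical optimizer started from several initial values; the grid is used here only to keep the example dependency-free.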
[Figure 1: the statistics C n,k , Q (1) n,k and Q (2) n,k . The horizontal line represents the limit of the critical region of the test.]
The order $(p, q)$ can be estimated by using an information criterion such as AIC or BIC, and the threshold $\ell$ is estimated by maximizing the log-likelihood over a set of integer values $\{0, 1, \ldots, \ell_{\max}\}$, where $\ell_{\max}$ is an upper bound of the true threshold. In practice, we can choose the data-driven bound $\ell_{\max} = \max(Y_1, \ldots, Y_n)$.

Let $p_{\max}$ and $q_{\max}$ be upper bounds of the orders $p$ and $q$ respectively. The estimation of $\theta_0$, $(p, q)$ and $\ell$ can be done in the following three steps:
• Step 1: for each fixed $(p, q, \ell) \in \{0, \ldots, p_{\max}\} \times \{0, \ldots, q_{\max}\} \times \{0, \ldots, \ell_{\max}\}$, compute the estimate $\hat\theta_{p,q,\ell}$ as in (9).
• Step 2: for each fixed $\ell \in \{0, \ldots, \ell_{\max}\}$, select the "best" order $(\hat p_\ell, \hat q_\ell)$ by minimizing the AIC or the BIC criterion.
• Step 3: select the threshold $\hat\ell_n$ by maximizing the log-likelihood of the models retained in Step 2 over $\ell \in \{0, \ldots, \ell_{\max}\}$.

Therefore, the final estimated parameters of the model are $\hat\ell_n$, $(\hat p_n, \hat q_n)$ and $\hat\theta_{\hat p_n, \hat q_n, \hat\ell_n}$. We have implemented this procedure in R (the R Project for Statistical Computing).
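The selection logic can be sketched as follows, assuming the usual definitions AIC = −2 log L + 2k and BIC = −2 log L + k log n, where k is the number of estimated parameters. The final step, which keeps the threshold whose retained model has the largest log-likelihood, is an assumption based on the description of threshold estimation given above. The function names, the dictionary layout and all numerical values are hypothetical.

```python
import math

def aic(loglik, n_params):
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    return -2.0 * loglik + n_params * math.log(n_obs)

def select_model(fits, n_obs, criterion="bic"):
    """fits maps (p, q, threshold) -> (maximized log-likelihood, number of parameters)."""
    # For each threshold, keep the order with the smallest criterion value.
    best_by_threshold = {}
    for (p, q, thr), (loglik, k) in fits.items():
        score = bic(loglik, k, n_obs) if criterion == "bic" else aic(loglik, k)
        if thr not in best_by_threshold or score < best_by_threshold[thr][0]:
            best_by_threshold[thr] = (score, loglik, (p, q))
    # Assumed final step: keep the threshold whose retained model has the largest log-likelihood.
    thr_hat = max(best_by_threshold, key=lambda thr: best_by_threshold[thr][1])
    return best_by_threshold[thr_hat][2], thr_hat

# Hypothetical maximized log-likelihoods for four candidate models, n = 500.
fits = {
    (1, 1, 5): (-1200.0, 4),
    (2, 1, 5): (-1199.5, 5),
    (1, 1, 6): (-1195.0, 4),
    (2, 1, 6): (-1193.5, 5),
}
print("AIC:", select_model(fits, 500, "aic"))
print("BIC:", select_model(fits, 500, "bic"))
```

With these made-up numbers the AIC retains the larger order and the BIC the smaller one, which reflects the heavier penalty log n > 2 of the BIC.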

Some simulations results
We consider an −1 , 6). (14) This scenario is close to the real data example (see below). Note that, the INTGARCH(1, 1) model can be seen as a nonlinear Poisson autoregressive model with one knot. The MLE is reasonably good, as discussed in Davis and Liu [9]. We will focus on the estimation of the threshold and the order (p, q) of an INTGARCH((p, q)) model. Let (Y 1 , . . . , Y n ) be a trajectory generated according to (14). We fit an INTGARCH(p, q) model from these observations; p, q and are assumed to be unknown and are estimated as described above. For the problem of selecting the order (p, q), Table 1 indicates the proportions of the true order (1, 1), low and high order models selected (using AIC and BIC) based on 200 replications  with n = 500 and n = 1000. We have used p max = q max = 5. Some empirical statistics of the estimator of (the threshold) n are reported in Table 2.
These results show that the (empirical) probability of selecting the true order increases with n for both AIC and BIC. Even when n = 500 (this length is close to that of the real data example, see below), this proportion is reasonably acceptable. It also appears that the BIC selects low order models more often and high order models less often than the AIC; this is not surprising, since the penalty term of the BIC is greater than that of the AIC. Note that the consistency of the BIC and the efficiency of the AIC have been proved in many situations in model selection theory; even if such results have not yet been proved for our model, we can nevertheless see that the (empirical) probability of selecting the true model increases faster (with n) for the BIC than for the AIC.
The empirical statistics of n displayed in Table 2 show that the estimation of the threshold is reasonably good (for both the AIC and the BIC) in terms of mean and quantiles. For example, when n = 1000 for more than 60% of the replications, the estimation of the threshold belongs to the set {5, 6, 7} while the true value of the threshold is 6.

Application to real data
We consider the number of transactions per minute for the stock Ericsson B on July 2, 2002. There are 460 available observations, which represent the transactions over approximately 8 hours (from 09:35 to 17:14), see Figure 2. The empirical mean and variance of the series are 9.909 and 32.836 respectively. That is, the data are overdispersed; together with the positive dependence (see the ACF in Figure 2), this suggests that the models studied here are candidates for fitting these data.
Fokianos et al. [19] have fitted these data using linear Poisson autoregression and exponential autoregressive model with one lag autoregression. These two models describe the data reasonably well and the Pearson residuals provided appear to be white. The linear Poisson autoregression pass the goodness-of-fit test proposed by Fokianos and Neumann [18]. The nonlinear Poisson autoregression with one knot applied by Davis and Liu [9] seems also to describe well the data according to the Pearson residuals analysis.
Nevertheless, the autocorrelation function of the observations displays strong dependence between transactions. Therefore, we apply an INTGARCH(p, q) (with p and q unknown) and select the "best" model as described above. AIC leads to INTGARCH(2, 1) model and BIC to INTGARCH(1, 1). The conditional  (Table 2) for the efficiency of this model selection procedure. It appears that the AIC and the BIC globally work well.
To examine the adequacy of the fitted model, we consider the estimated counterparts of the Pearson residuals (see for instance [26]) given by ξ t = (Y t − λ t )/√λ t . If the model is correctly specified, then these residuals should be close to a white noise sequence. The autocorrelation function displayed on Figure 3 [19] and tested by Fokianos and Neumann [18]), what is the best model for such data? This question is not easy to answer: even though the Poisson INGARCH(1, 1) passes the test proposed by Fokianos and Neumann, some doubts have been raised about the linearity assumption for these data (see the test based on H n in [18]), and the Poisson INTGARCH(2, 1) and INTGARCH(1, 1) models seem to describe the data well. Let us discuss this question according to the AIC, the BIC and the Pearson residuals.
• The Pearson residuals of these three models seem to be white (see [19] for the case of the INGARCH(1, 1)). The mean square error of the Pearson residuals (defined by where d is the number of estimated parameters) displayed in Table 3 shows a slight gain with the INTGARCH(1, 1) model.
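The residual diagnostics can be sketched in a few lines. The code below assumes the standard Pearson residual for a Poisson model, (Y t − λ t )/√λ t , and takes the mean square error to be the sum of squared residuals divided by n − d, which is a common convention and an assumption here. The function names and the numerical values are hypothetical.

```python
import math

def pearson_residuals(y, lam):
    """Standardize each count by the fitted mean and standard deviation."""
    return [(obs - l) / math.sqrt(l) for obs, l in zip(y, lam)]

def residual_mse(residuals, n_params):
    """Sum of squared residuals over n - d (assumed convention)."""
    return sum(r * r for r in residuals) / (len(residuals) - n_params)

def sample_acf(x, max_lag):
    """Sample autocorrelations at lags 1, ..., max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [sum((x[t] - mean) * (x[t + h] - mean) for t in range(n - h)) / c0
            for h in range(1, max_lag + 1)]

# Hypothetical counts and fitted intensities.
y = [6, 2, 5, 9, 4, 7, 3, 8]
lam = [4.0, 4.0, 5.5, 6.0, 6.5, 5.0, 5.5, 5.0]
res = pearson_residuals(y, lam)
print(residual_mse(res, n_params=3), sample_acf(res, max_lag=2))
```

For a well specified model, the sample autocorrelations of the residuals should stay within roughly ±1.96/√n of zero and the mean square error should be close to one.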

Testing for parameter change in INGARCH model
We provide some simulation results to show the empirical performance of the test procedures described above. We consider the Poisson INGARCH(1,1) model:

$$Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t) \quad\text{with}\quad \lambda_t = \alpha_0 + \alpha_1 \lambda_{t-1} + \beta_1 Y_{t-1}. \tag{15}$$

For sample sizes n = 500, 1000, the statistics C n and Q n are computed with u n = v n = [(log n) 5/2 ]. The empirical levels and powers reported in the following tables are obtained after 200 replications at the nominal level α = 0.05.
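As a quick check of the orders of magnitude involved, the trimming sequences used in these experiments can be evaluated directly. The helper name below is illustrative; the formula is the integer part of (log n) raised to the power δ 0 = 5/2.

```python
import math

def trimming_length(n, delta0=2.5):
    """Integer part of (log n) ** delta0."""
    return int(math.log(n) ** delta0)

for n in (500, 1000):
    u_n = trimming_length(n)
    # Candidate break points k range over v_n <= k <= n - v_n, with v_n = u_n here.
    print(n, u_n, n - 2 * u_n + 1)
```

This gives u n = 96 for n = 500 and u n = 125 for n = 1000, so roughly a fifth of the sample at each end is excluded from the search when n = 500, which is consistent with the caution expressed about small sample sizes.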

Poisson INARCH(1) with two change-points alternative.
We assume in (15) that α 1 = 0 and denote by θ = (α 0 , β 1 ) the parameter of the model. Table 5 reports the empirical levels computed when the parameter is θ 0 , and the empirical powers computed when θ 0 changes to θ 1 at 0.3n, which in turn changes to θ 2 at 0.7n. The second alternative scenario is a case where the change in the parameters does not induce a change in the mean of the marginal distribution. Tables 4 and 5 show that these two procedures display a size distortion when n = 500, but the empirical levels are close to the nominal one when n = 1000. One can also see that the empirical powers of these procedures increase with n and remain satisfactory even when the break does not induce a change in the mean of the marginal distribution. Although the procedure based on Q n is a little more powerful, the test based on C n provides satisfactory empirical powers even in the case of the two change-points alternative. This could be a starting point for investigating the consistency of this procedure under multiple change-points alternatives.
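A scenario of this kind can be generated as follows. This is a sketch with hypothetical parameter values, not those of the tables: for the INARCH(1) intensity λ t = α 0 + β 1 Y t−1 the stationary mean is α 0 /(1 − β 1 ), so the pairs (1, 0.5) and (0.5, 0.75) share the mean 2 while having different dependence strength. The observations after a break are generated from the observed past, so no independence between regimes is imposed.

```python
import random

def poisson_draw(lam, rng):
    """Number of events of a unit-rate Poisson process in [0, lam]."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival <= lam:
        count += 1
        arrival += rng.expovariate(1.0)
    return count

def stationary_mean(alpha0, beta1):
    return alpha0 / (1.0 - beta1)

def simulate_inarch1_with_breaks(n, thetas, break_fractions, seed=0):
    """Piecewise-constant (alpha0, beta1); each regime continues from the previous path."""
    rng = random.Random(seed)
    breaks = [round(f * n) for f in break_fractions] + [n]
    y_prev = round(stationary_mean(*thetas[0]))  # crude initial value
    path, regime = [], 0
    for t in range(n):
        if t >= breaks[regime]:
            regime += 1  # switch to the next parameter value
        alpha0, beta1 = thetas[regime]
        y_prev = poisson_draw(alpha0 + beta1 * y_prev, rng)
        path.append(y_prev)
    return path

# Hypothetical parameters: two changes at 0.3n and 0.7n, same stationary mean throughout.
thetas = [(1.0, 0.5), (0.5, 0.75), (1.0, 0.5)]
path = simulate_inarch1_with_breaks(1000, thetas, [0.3, 0.7])
print([stationary_mean(*th) for th in thetas], sum(path) / len(path))
```

Such a path has no visible level shift, which is the situation where tests relying on a change in the marginal mean would be expected to lose power.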

Real data application
We consider the number of transactions per minute for the stock Ericsson B during July 16,2002. There are 460 observations which represent trading from 09:35 to 17:14. Figure 4 displays the data and its autocorrelation function.
Several works (see for instance Fokianos et al. [19], Davis and Liu [9]) on the series of July 2, 2002 led to use an INGARCH(1,1) model (on the series of July 2, 2002). This model provides α 1 + β 1 close to unity. It can be seen in the slow decay of the autocorrelation function (see Figure 2). The series in the period 2-22 July 2002 have been studied by Brännäs and Quoreshi [5]. They have pointed out the presence of long memory in these data and applied INARMA model to both level and first difference forms.
For the transactions on July 16, 2002, we test the adequacy of the Poisson INGARCH(1,1) model by applying the goodness-of-fit test proposed by Fokianos and Neumann [18]. Let θ n = ( α 0,n , α 1,n , β 1,n ) be the maximum likelihood estimator computed on the observations. Denote I t = ( λ t , Y t ) where λ t = α 0,n + α 1,n λ t−1 + β 1,n Y t−1 . Recall that the estimated Pearson residuals are given by ξ t = (Y t − λ t )/√λ t . The goodness-of-fit test is based on a statistic built from these residuals and a univariate kernel. A parametric bootstrap procedure can be used to compute the p-values of this test. See [18] for more details on this test procedure.
We have applied this test with B = 300 bootstrap replications, and the p-values 0.032 and 0.05 have been obtained for the uniform and the Epanechnikov kernel respectively. So, the linear Poisson INGARCH(1,1) model is rejected.
The previous tests for change detection have been applied to the series. A change has been detected around midday, at t * = 12:05. See the statistics C n,k , Q (1) n,k and Q (2) n,k as well as the breakpoint and the autocorrelation function of each regime in Figure 5.
To assess the adequacy of the linear Poisson INGARCH(1,1) on each regime, we apply the goodness-of-fit test of Fokianos and Neumann [18]. The p-values obtained are displayed in Table 6. These results point to the adequacy of the linear Poisson INGARCH(1,1) on the first regime and raise some doubt about the linearity on the second regime. This shows that the model structure of the transactions in the morning may differ from that of the transactions in the afternoon. Moreover, Figure 5 shows that the autocorrelation function of each regime decreases fast; this rules out the idea of long memory in the series.

Proofs of the main results
Proof of the Proposition 2.1. We will use the same techniques as in [12]. Let p, q two fixed non-negative integers. Definite the sequence (λ p,q t ) t∈Z by The existence of moment of any order of the process (Y t , λ t ) t∈Z (see [13]) and assumption (A F ) imply the existence of moment of any order of (λ p,q t ) t∈Z . Let us show that (λ p,q 0 ) q≥0 is a Cauchy sequence in L 1 . By using (A F ), we have By definition and the strictly stationarity of (Y t ) t∈Z , we can easily see that for j = 1 . . . , p, the couples (λ p,q+1 For any fixed p, denote v q = E|λ p,q+1 By applying the Lemma 5.4 of [12], we obtain Hence, v q → 0 as q → ∞. Thus, for any p > 0, the sequence (λ p,q 0 ) is a Cauchy sequence in L 1 . Therefore, it converges to some limit denoted λ p 0 . Moreover, since the sequence (λ p,q 0 ) q≥1 is measurable w.r.t to σ(Y t , t < 0), it is the case of the limit λ p 0 . So, there exists a measurable function f (p) such that Y −1 , . . .). By going along similar lines, it holds that for any t ∈ Z, the sequence (λ p,q t ) q≥1 converges in L 1 to some λ p t = f (p) (Y t−1 , . . .) and since (Y t ) t∈Z , is stationary and ergodic, the process (λ p t ) t∈Z is too stationary and ergodic.
Let p and t fixed. For q large enough, we have (see (16)). By using the continuity (which comes from Lipschitz-type condi- . . . , Y p , 0 . . . ; y) for any fixed y = (y 1 ) i≥1 and by carrying q to infinity, it holds that Denote μ p = Eλ p t , μ = sup p≥1 μ p , Δ p,t = E|λ p+1 t − λ p t | and Δ p = sup t∈Z Δ p,t . By going the same lines as in [12], we obtain Δ p ≤ Cα p+1 . Therefore, Δ p → 0 as p → ∞. This shows that for any fixed t ∈ Z, (λ p t ) p≥1 is a Cauchy sequence in L 1 . Thus it converges to some random λ t ∈ L 1 . Moreover, λ t is measurable w.r.t σ(Y j , j < t) (because it is the case of (λ p t ) p≥1 ). Thus, there exists a measurable function f such that λ t = f (Y t−1 , . . .) for any t ∈ Z. This implies that ( λ t ) t∈Z is strictly stationary and ergodic. Finally, by using equation (17) and continuity of F , it comes that for any t ∈ Z.
Hence, the process (Y t , λ t ) t∈Z is strictly stationary ergodic and satisfying (1). By the uniqueness of the solution, it holds that λ t = λ t a.s.
Proof of the Theorem 3.1. Without loss of generality, for simplifying notation, we will make the proof with T jn,kn = T 1,n . The proof is divided into two parts. We will first show that Hence, We will show that, for any Thus, by using the stationarity of the process (Y t ) t∈Z , it follows that By the uniform strong law of large number applied on ( t (θ)) t≥1 (see Straumann and Mikosch (2006) [33]), it holds that Now let us show that We have We will apply the Corollary 1 of Kounias and Weng (1969) [29]. So, it suffices to show that 1 n . Hence . (10)).
Hence, it follows that From (19) and (20), we deduce that (ii) We will now show that the function θ → L(θ) = E 0 (θ) has a unique maximum at θ 0 . We will proceed as in [9]. Let θ ∈ Θ, with θ = θ 0 . We have By applying the mean value theorem at the function Since θ = θ 0 , it follows from assumption Id(Θ) that 1 Thus, the function θ → L(θ) has a unique maximum at θ 0 .
(i), (ii) and standard arguments lead to the consistency of θ n (T 1,n ).
The following lemmas are needed to prove Theorem 3.2. Lemma 7.1. Let (j n ) n≥1 and (k n ) n≥1 be two integer valued sequences such that (j n ) n≥1 is increasing, j n → ∞ and k n − j n → ∞ as n → ∞. Let n ≥ 1; for any segment T = T jn,kn ⊂ {1, . . . , n}, it holds under the assumptions of Theorem 3.2 that

Proof. Let i ∈ {1, . . . , n}. We have Hence Let r > 0. Using the Minkowski and Hölder's inequalities, it holds that Thus, We The same argument gives Hence, Therefore, we have (with r = 1) This holds for any coordinate i = 1 . . . , d; and completes the proof of the lemma.
Lemma 7.2. Let (j n ) n≥1 and (k n ) n≥1 two integer valued sequences such that (j n ) n≥1 is increasing, j n → ∞ and k n − j n → ∞ as n → ∞.
Moreover, according to Lemma 7.1, it holds that So, for n large enough, we have To complete the proof of Theorem 3.2, we have to show that ∂θ∂θ 0 (θ 0 )) and G n (T 1,n , θ n (T 1,n )) Moreover, recall that For any j = 1, . . . , d, we have This holds for any 1 ≤ i, j ≤ d. Thus, (c) If U is a non-zero vector of R d , according to assumption Var, it holds that U ∂ ∂θ f 0 θ0 = 0 a.s. Hence Thus Σ is positive definite.
(i) For any v n ≤ k ≤ n − v n , we have as n → ∞ Thus, as n → ∞, it holds that

P. Doukhan and W. Kengne
The last equality above holds because sup 0<τ <1 it is a consequence of the properties of the function q when I 0,1 (q, c) is finite for some c > 0. (ii) Goes the same lines as in (i).
Proof of Theorem 4.1.
1. According to Lemma 7.3, it suffices to show that Let v n ≤ k ≤ n − v n . By applying (24) with T = T 1,k and T k+1,n , we have and As n → +∞, we have Thus, we have i.e.
Proof of Theorem 4.2.