Estimation for a class of generalized state-space time series models

https://doi.org/10.1016/S0167-7152(02)00325-5

Abstract

State-space models with exponential and conjugate exponential family densities are introduced. Examples include Poisson–Gamma, Binomial–Beta, Gamma–Gamma and Normal–Normal processes. Maximum likelihood and quasilikelihood estimators and their properties are discussed. Results from a simulation study for the Poisson–Gamma model are reported.

Introduction

State-space models typically involve two related processes: an observation or output process {Yt} and an unobserved “state” process {St}. Typically, the observation process {Yt} is specified conditionally on {St}, and the evolution of the state process {St} is then characterized conditionally given either the past states (St−1, St−2, …) or the past observations (Yt−1, Yt−2, …), the former being the more common approach. If {St} is specified in terms of the past states (St−1, St−2, …), the model is a state-driven or parameter-driven model; on the other hand, if {St} is specified in terms of the past observations (Yt−1, Yt−2, …), the model is referred to as an observation-driven model. A typical example of a state-driven model is the linear state-space model characterized by

Yt = Zt St + εt (observation equation),
St = Tt St−1 + ηt (state equation),

where {Zt} and {Tt} are non-random functions possibly depending on unknown parameters, and the error terms {εt} and {ηt} are uncorrelated white noise processes. See Harvey (1989), Diderrich (1985), Tanizaki (1993), Burridge and Wallis (1988), Brockwell and Davis (1987) and Chow (1983) for discussion of linear models.
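The linear recursions above are easy to simulate directly. The following sketch (with scalar, time-constant Zt = Z and Tt = T and Gaussian white-noise errors, all illustrative assumptions rather than choices made in the paper) generates a sample path of the observation and state processes:

```python
import random

# Minimal sketch of the linear state-space model
#   Y_t = Z * S_t + eps_t  (observation equation)
#   S_t = T * S_{t-1} + eta_t  (state equation)
# with scalar constants Z, T and Gaussian white noise (assumed values).
def simulate_linear_ss(n, Z=1.0, T=0.7, sigma_eps=1.0, sigma_eta=0.5, seed=0):
    rng = random.Random(seed)
    s, ys, ss = 0.0, [], []
    for _ in range(n):
        s = T * s + rng.gauss(0.0, sigma_eta)   # state equation
        y = Z * s + rng.gauss(0.0, sigma_eps)   # observation equation
        ss.append(s)
        ys.append(y)
    return ys, ss

ys, ss = simulate_linear_ss(500)
```

With |T| < 1 the state process is a stationary AR(1), which is why this model is the standard entry point for Kalman filtering.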

A non-linear generalization of the above model is given by

Yt = ht(St, xt, εt) and St = gt(St−1, xt, ηt),

where ht(·) and gt(·) are known functions and xt is a covariate. See Anderson and Moore (1979), Gelb (1974), Tanizaki (1996), Tanizaki and Mariano (1996), and Wisher et al. (1969) for discussion of non-linear state-space models.

More generally, one may consider a model defined in terms of conditional densities:

p(yt|st) and p(st|s(t−1)) (state-driven model)

or

p(yt|st) and p(st|y(t−1)) (observation-driven model),

where s(t−1) = (st−1, …, s1) and y(t−1) = (yt−1, …, y1).

As a specific example, suppose that, conditional on St = st, the {Yt}, t = 1, 2, …, are independent Poisson random variables with means {st}. Davis et al. (1999) discuss a state-driven model where they choose p(st|s(t−1)) to be a log-normal density with

E(log St|S(t−1)) = xt′β + φ(log St−1 − xt−1′β) and Var(log St|S(t−1)) = σ2.

Zeger (1988) used the above model to analyze data on US polio incidence. Harvey (1989) considered an observation-driven Poisson model with p(st|y(t−1)) being a gamma density with parameters (αt, βt), where

αt = ∑_{j=1}^{t−1} w^j yt−j,  βt = ∑_{j=1}^{t−1} w^{j−1},

and 0 < w < 1 is an unknown parameter. This model belongs to the so-called class of power-steady models (Smith, 1979) characterized by

p(ηt|y(t−1)) ∝ [p(ηt−1|y(t−1))]^w,

where 0 < w < 1 and ηt = log st. As noted by Grunwald et al. (1997), the power-steady models suffer from a serious flaw regarding their asymptotic behavior: for instance, for the above Poisson–Gamma model, Yt → 0 almost surely as t → ∞. Grunwald et al. (1997) propose to avoid this problem by directly modeling μt and specifying a Markov process for {Yt}, bypassing the state process altogether. However, the state process is a very useful tool in modeling and for filtering and prediction. In this paper, we show how to fix the problem, achieving the same goal while retaining the state-space structure.
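The degeneracy of the power-steady Poisson–Gamma model can be seen in simulation. The sketch below uses a discounted conjugate update, αt = w(αt−1 + yt−1), βt = w(βt−1 + 1); this recursion is one standard formulation and is an assumption here, not copied from the paper. Counts generated this way tend to drift toward zero over long horizons, as Grunwald et al. (1997) note.

```python
import random

def poisson(rng, lam):
    # Sample Poisson(lam) by counting unit-rate exponential arrivals in [0, lam].
    k, acc = 0, 0.0
    while True:
        acc += rng.expovariate(1.0)
        if acc > lam:
            return k
        k += 1

# Hedged sketch of a power-steady Poisson-Gamma model: the discounted
# updates below are an assumed standard form, not the paper's notation.
def simulate_power_steady(n, w=0.8, seed=1):
    rng = random.Random(seed)
    alpha, beta, ys = 1.0, 1.0, []
    for _ in range(n):
        s = rng.gammavariate(alpha, 1.0 / beta)  # S_t | y^(t-1) ~ Gamma(alpha_t, beta_t)
        y = poisson(rng, s)                      # Y_t | S_t ~ Poisson(S_t)
        ys.append(y)
        alpha = w * (alpha + y)                  # discounted conjugate update
        beta = w * (beta + 1.0)
    return ys
```

Note that βt converges to the fixed point w/(1 − w), while αt is a geometrically discounted sum of past counts, so once a run of zeros occurs the conditional mean αt/βt collapses toward zero.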

The main goal of this paper is to study maximum likelihood and quasilikelihood estimation for an observation-driven generalized state-space model with p(yt|st) and p(st|y(t−1)) belonging to an exponential family and its conjugate family, respectively. The conditional mean μt = E(Yt|Y(t−1)) turns out to be an important quantity, which can be modeled appropriately to ensure that {Yt} is a stationary process. We have chosen to model the parameters of the conjugate family p(st|y(t−1)) in such a way that the forecast density p(yt|y(t−1)) is determined by specifying μt. Furthermore, μt is chosen so that {Yt} is an ergodic Markov process. This approach avoids the problems associated with the power-steady models. The likelihood function based on (Y1, …, Yn) is

Ln = p(y(0)) ∏_{t=1}^{n} p(yt|y(t−1)).

For state-driven models, the expression for p(yt|y(t−1)) is typically not available in closed form. On the other hand, for the observation-driven models discussed in this paper, the conditional density p(yt|y(t−1)) (and hence Ln) has an explicit, simple form.

This paper is organized as follows. The generalized state-space models are reviewed in Section 2. Hierarchical exponential family processes and their examples are discussed in Section 3. The Markov process specification is given in Section 4. Section 5 is concerned with maximum likelihood estimation and application to the examples discussed in Section 3. Quasilikelihood estimation is discussed in Section 6. Finally, a brief simulation study is presented in Section 7.

Section snippets

Generalized state space models

Let {Yt} and {St}, t = 0, ±1, ±2, …, denote the observations and related unobserved “states”, respectively, at time t. The process {(Yt, St)} can be characterized by a general state-space model via the following conditional densities:

p(yt|st, s(t−1), y(t−1)) (observation density) and p(st|st−1, s(t−2), y(t−1)) (state density),

where s(t−1) = (st−1, st−2, …, s1), s(t−2) = (st−2, …, s1), y(t−1) = (yt−1, …, y1), and p(u|v) denotes, as a generic symbol, the conditional density of u given v. The two broad classes of models

Hierarchical exponential family processes

Observation driven state-space models using exponential families and their conjugates were discussed by Harrison and Stevens (1976) in the context of Bayesian forecasting. Here we use a slightly different parameterization, which will enable us to express the forecast density directly in terms of the conditional mean of Yt given Y(t−1).

Suppose {Yt} is a k-dimensional time series with the conditional densities

p(yt|st) = h(yt) exp[yt′η(st) − R(η(st))],

where

R(η(st)) = log ∫ exp(yt′η(st)) dμ(yt) < ∞.

In terms of
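For a concrete (scalar, k = 1) instance, the Poisson density fits this form with natural parameter η(s) = log s, log-normalizer R(η) = e^η, and base measure h(y) = 1/y!. The following sketch checks that the exponential-family parameterization reproduces the usual Poisson pmf:

```python
import math

# The Poisson density in exponential-family form:
# p(y|s) = h(y) * exp(y * eta(s) - R(eta(s))), with
# eta(s) = log s, R(eta) = exp(eta), h(y) = 1/y!.
def poisson_pmf(y, s):
    return math.exp(-s) * s**y / math.factorial(y)

def expfam_pmf(y, s):
    eta = math.log(s)             # natural parameter
    R = math.exp(eta)             # log-normalizer (cumulant function)
    h = 1.0 / math.factorial(y)   # base measure
    return h * math.exp(y * eta - R)

# The two parameterizations agree term by term:
assert abs(poisson_pmf(3, 2.5) - expfam_pmf(3, 2.5)) < 1e-12
```

Binomial, gamma and normal observation densities admit the same rewriting, which is what makes the conjugate-family construction of Section 3 uniform across the examples.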

Markovian specification

Consider the process {Yt} specified by the observation and state densities above. It is then seen that the predictive density p(yt|y(t−1)) is given by (3.6). The dependence on the past y(t−1) = (yt−1, …, y1) in (3.6) enters only through the parameter θt, which in turn is related to the conditional mean μt (see (3.5)). By modeling μt (or θt) appropriately, one can control the asymptotic behaviour of the process {Yt} and avoid the problems associated with the power-steady models. We propose to use a Markovian specification by first

Maximum likelihood estimation

The likelihood function for the general hierarchical exponential family process introduced in Section 3 is given by

Ln(γ0, β) = p(y(0)) ∏_{t=1}^{n} p(yt|y(t−1)),

where p(yt|y(t−1)) is given by (3.6). Denote

Ht(γ0, β) = H(γ0+1, θt+yt) − H(γ0, θt) = H(γ0+1, γ0μt+yt) − H(γ0, γ0μt),

where μt = θt/γ0 = E(Yt|Y(t−1)). Note that, for simplicity, we have taken γ0t = γ0. Specification of θt as a function of the past observations Y(t−1) and the parameter β will determine Ht and hence the likelihood function Ln. Assuming y(0) is fixed,
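In the Poisson–Gamma case this likelihood is fully explicit: if p(st|y(t−1)) is gamma with shape θt = γ0μt and rate γ0 (so E(St|y(t−1)) = μt, consistent with μt = θt/γ0 above), the forecast density p(yt|y(t−1)) is negative binomial. The sketch below computes log Ln under that assumption; the mean recursion μt = β0 + β1 yt−1 is a hypothetical illustration, not the paper's Example 5.1.

```python
import math

# Forecast density for the Poisson-Gamma model (assumed parameterization:
# S_t | y^(t-1) ~ Gamma(shape a = gamma0*mu_t, rate gamma0)): mixing
# Poisson(S_t) over this gamma gives a negative binomial in y.
def neg_binom_logpmf(y, mu, gamma0):
    a = gamma0 * mu  # gamma shape
    return (math.lgamma(a + y) - math.lgamma(a) - math.lgamma(y + 1)
            + a * math.log(gamma0 / (gamma0 + 1.0))
            + y * math.log(1.0 / (gamma0 + 1.0)))

# log L_n = sum_t log p(y_t | y^(t-1)), with y^(0) treated as fixed.
# The recursion mu_t = beta0 + beta1 * y_{t-1} is a hypothetical example.
def log_likelihood(ys, beta0, beta1, gamma0=1.0, mu0=1.0):
    mu, ll = mu0, 0.0
    for t, y in enumerate(ys):
        if t > 0:
            mu = beta0 + beta1 * ys[t - 1]
        ll += neg_binom_logpmf(y, mu, gamma0)
    return ll
```

Maximizing this function over (β0, β1) (and γ0, if unknown) by a generic numerical optimizer gives the maximum likelihood estimates; the point of the observation-driven construction is precisely that each factor p(yt|y(t−1)) is available in closed form.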

Quasilikelihood estimation

In order to use the maximum likelihood method of estimation, we need a precise specification of the conditional densities p(yt|y(t−1)), which in turn requires knowledge of the observation density p(yt|ηt) and the state density p(ηt|y(t−1)). Typically, one begins with a model for p(yt|ηt), say for Poisson count data. However, a precise model for p(ηt|y(t−1)) may not be obvious. One can then consider the quasilikelihood score function

Qn(γ) = ∑_{t=1}^{n} (dμt/dγ) [Var(Yt|y(t−1))]^{−1} (yt − μt),

where μt = E(Yt|y(t−1)).
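A quasilikelihood estimate is the root of Qn(γ) = 0, which only requires the first two conditional moments. The sketch below treats a scalar γ in the Poisson–Gamma setting, assuming Var(Yt|y(t−1)) = μt(1 + 1/γ0) (which follows from E(St|y(t−1)) = μt and Var(St|y(t−1)) = μt/γ0 under the gamma parameterization assumed earlier); the mean model μt(γ) = γ(yt−1 + 1) is a hypothetical illustration.

```python
# Hedged sketch of the quasilikelihood score Q_n for a scalar parameter gamma.
# Assumptions (not from the paper): mean model mu_t = gamma * (y_{t-1} + 1)
# and conditional variance mu_t * (1 + 1/gamma0).
def quasi_score(gamma, ys, gamma0=1.0):
    q = 0.0
    for t in range(1, len(ys)):
        mu = gamma * (ys[t - 1] + 1.0)     # mu_t(gamma)
        dmu = ys[t - 1] + 1.0              # d mu_t / d gamma
        var = mu * (1.0 + 1.0 / gamma0)    # Var(Y_t | y^(t-1))
        q += dmu / var * (ys[t] - mu)
    return q

def solve_score(ys, lo=1e-6, hi=50.0, tol=1e-10):
    # Bisection on Q_n(gamma) = 0; Q_n is decreasing in gamma for this model.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if quasi_score(mid, ys) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For this particular mean model the estimating equation even has the closed form γ̂ = ∑ yt / ∑ (yt−1 + 1), which the bisection recovers; in general the root is found numerically.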

A simulation study

Consider the Poisson-Gamma model in Example 5.1 with γ0=1,β0=15,β1=0.9 and β2=0.1. Observations {Yt} were generated with n=200, and the sample autocorrelations ρ̂(h),h=1,2,…,6, were computed. The experiment was repeated 1000 times and the average values of ρ̂(h) over the 1000 replications were computed. The results are given in Table 1.

Note that ρ̂(h) decreases exponentially fast as h increases. This behavior of ρ̂(h) is similar to that for a stationary AR(1) process.
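An experiment of this shape is easy to reproduce in outline. The sketch below generates a Poisson–Gamma series (St|y(t−1) ~ Gamma(γ0μt, γ0), Yt|St ~ Poisson(St)) and computes the sample autocorrelations ρ̂(h); the mean recursion μt = b0 + b1 yt−1 is a hypothetical stationary choice, not the paper's Example 5.1 specification, so the numbers differ from Table 1 while showing the same qualitative AR(1)-like decay.

```python
import random

def poisson(rng, lam):
    # Sample Poisson(lam) by counting unit-rate exponential arrivals in [0, lam].
    k, acc = 0, 0.0
    while True:
        acc += rng.expovariate(1.0)
        if acc > lam:
            return k
        k += 1

# Poisson-Gamma series with hypothetical mean recursion mu_t = b0 + b1*y_{t-1}
# (b1 < 1 keeps the process stationary; parameter values are assumptions).
def simulate(n, b0=2.0, b1=0.5, gamma0=1.0, seed=7):
    rng = random.Random(seed)
    mu, ys = b0 / (1.0 - b1), []
    for _ in range(n):
        s = rng.gammavariate(gamma0 * mu, 1.0 / gamma0)  # S_t | y^(t-1)
        y = poisson(rng, s)                              # Y_t | S_t
        ys.append(y)
        mu = b0 + b1 * y
    return ys

def acf(ys, max_lag):
    # Sample autocorrelations rho_hat(1), ..., rho_hat(max_lag).
    m = sum(ys) / len(ys)
    c0 = sum((y - m) ** 2 for y in ys)
    return [sum((ys[t] - m) * (ys[t + h] - m) for t in range(len(ys) - h)) / c0
            for h in range(1, max_lag + 1)]
```

For this mean recursion the theoretical lag-1 autocorrelation is b1-driven and ρ(h) decays geometrically in h, mirroring the AR(1)-like behavior reported for ρ̂(h) above.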

Assuming γ0=1 to be known,

Acknowledgements

We thank the referee for a careful reading and constructive suggestions.

References (27)

  • B.D.O. Anderson et al.

    Optimal Filtering

    (1979)
  • I.V. Basawa et al.

    Statistical Inference for Stochastic Processes

    (1980)
  • Billingsley, P., 1961. Statistical Inference for Markov Processes. The University of Chicago Press, Chicago,...
  • P.J. Brockwell et al.

    Time Series: Theory and Methods

    (1987)
  • P. Burridge et al.

    Prediction theory for autoregressive moving average processes

    Econometric Reviews

    (1988)
  • G.C. Chow

    Econometrics

    (1983)
  • D.R. Cox

Statistical analysis of time series: some recent developments

    Scand. J. Statist.

    (1981)
  • R.A. Davis et al.

    Modeling time series of count data

  • P. Diaconis et al.

    Conjugate priors for exponential families

    Ann. Statist.

    (1979)
  • G.T. Diderrich

    The Kalman filter from the perspective of Goldberger–Theil estimators

    Amer. Statistician

    (1985)
  • A. Gelb

    Applied Optimal Estimation

    (1974)
  • V.P. Godambe

    The foundations of finite sample estimation in stochastic processes

    Biometrika

    (1985)
  • G.K. Grunwald et al.

    Some properties and generalizations of non-negative Bayesian time series models

    J. Roy. Statist. Soc. B

    (1997)