Bayesian modeling for overdispersed event-count time series

  • Original Paper
  • Behaviormetrika

Abstract

Social scientists are frequently interested in event-count time-series data. One of the state-of-the-art methods, the Poisson exponentially weighted moving average (P-EWMA) model, leads to incorrect inference in the presence of omitted variables even if they are not confounding. To tackle this problem, this paper proposes a negative binomial integrated error [NB-I(1)] model, which can be estimated via Markov chain Monte Carlo (MCMC) methods. Simulations show that when the data are generated by a P-EWMA model but a non-confounding covariate is omitted at the estimation stage, the P-EWMA model’s credible interval is too narrow to contain the true value at the nominal rate, whereas the NB-I(1) model does not suffer from this problem. To explore the models’ performance, we replicate a study on an annual count of militarized interstate disputes.

Notes

  1. Another exception is the Poisson autoregressive model (Brandt and Williams 2001), though we do not consider it here. The recent literature in statistics on ECTS models is reviewed by Brandt and Sandler (2012), Cameron and Trivedi (2013, ch. 7), Fokianos (2012), and Trivedi and Munkin (2011).

  2. This is different from the Poisson autoregressive model (Brandt and Williams 2001).

  3. The state-space parametrization we use is non-standard. In the Supporting Information (SI), we therefore present the models in the more standard parametrization of Brandt et al. (2000) and Harvey and Fernandes (1989) and show how it corresponds to ours (SI Sect. S.1.1). There we also explain why we use the alternative parametrization here.

  4. Bradlow et al. (2002) propose an approximate conjugate prior for the negative binomial distribution. If one parametrizes the negative binomial distribution by size and probability, the beta distribution is the only conjugate prior for the probability parameter (Harvey and Fernandes 1989). This parametrization does not, however, help us to model overdispersion without changing the mean. In addition, previous non-Gaussian time-series models have mostly focused on exponential family distributions. To our knowledge, no discrete distributions from the exponential family with support on the non-negative integers exist.

  5. Trivedi and Munkin (2011) consider MCMC estimation of cross section and panel count models.

  6. For the distinction between observation-driven and parameter-driven models, see Davis et al. (2003), Durbin and Koopman (2000), and Jung et al. (2006).

  7. In a P-EWMA model, it is the marginal distribution:

    $$\begin{aligned} {{\mathcal {N}}}{{\mathcal {B}}}(Y_{t}|y_{t-1}) = \int ^{\infty }_{0} {\mathcal {P}}(Y_{t}|\epsilon _{t}) {\mathcal {G}}(\epsilon _{t}|y_{t-1}) {\text {d}} \epsilon _{t} \end{aligned}$$

    that follows a negative binomial distribution, not the conditional distribution \({\mathcal {P}}(Y_{t}|\epsilon _{t})\).

  8. We downloaded Brandt et al.’s (2000) R script, pests.r (version 1.1.5 dated September 7, 2009), on March 10, 2014 from http://www.utdallas.edu/~pbrandt/Patrick_Brandts_Website/Code_%26_Software.html. We appreciate that the authors make their script public. We use their pewmadgp() function for data generation, for which we should specify the posterior shape and rate parameters of \(\epsilon _{0}\) (denoted by \(a_{0}\) and \(b_{0}\), respectively). Following their example file, pests-example.r, we define \(b_{0} = 0.25\). Following Brandt et al. (2000, f.n. 12) and pests.r, we set \(a_{0}=\epsilon _{1} b_{0} = 2.5\).

  9. The models are implemented and estimated using R and Stan (R version 3.5.3 with the rstan library, version 2.18.2). The SI includes copies of the Stan model implementations (SI Sect. S.2.1). We draw four chains. For each chain, we draw 2000 samples and discard the first half, which leaves 1000 samples. In total, we have \((2000 - 1000) \times 4 = 4000\) samples. Following the Pewma() function in pests.r, for all three models, we exclude the first count (\(y_1\)) from the dependent variable and use it only for estimating \(\epsilon _1\) (P-I(1) and NB-I(1) models) or the mean and variance of \(\epsilon _1\) (P-EWMA model), though it is possible to build models that include the first count in the dependent variable. To focus the comparison on model differences, not the difference between maximum likelihood estimation (MLE) and MCMC as estimation methods, we use the Stan version of the P-EWMA model instead of the MLE-based Pewma() R function. We thank the associate editor and a reviewer for this suggestion. The SI does include results using the original Pewma() function; its performance is generally worse than that of the P-I(1) and NB-I(1) models (SI Sect. S.2.2).

  10. We sample \(\delta ^{-1}\) instead of \(\delta \) due to Stan parametrization.

  11. Brandt et al. (2000) also repeat their procedure 200 times for each setup. If \(y_{1}=0\) (Brandt et al. 2000, f.n. 22), if \(y_{t}\) is missing for any t, if any model fails to produce estimates, or if any model does not converge (that is, the potential scale reduction factor for any parameter of any model is larger than 1.1), we discard the estimates of all three models, draw the \(y_{t}\)’s from scratch, and estimate the parameters for this trial again. We will make our R scripts for simulation and application available on Harvard Dataverse upon publication.

  12. We use the bridgesampling library. For each value of the dispersion parameter, we omit the several trials for which the log marginal likelihood cannot be calculated.

  13. Stan samples \(\delta ^{-1(i,j)}\); we invert it to obtain \(\delta ^{(i,j)}\).

  14. In Eq. 4, the dispersion parameter corresponds to \((\rho _{\text {W}} \odot y_{t-1})^{-1}\), which increases if \(\rho _{\text {W}}\) decreases. Grunwald et al. (1997, 619) also argue that, in power steady models including P-EWMA, “[b]oth the temporal characteristics of the model \(\cdots \) and the dispersion of the forecast distribution are controlled by” a single model parameter. “As a result, the range of possible models is quite limited.”

  15. We use the term “model” to refer to a statistical model, e.g., P-EWMA and NB-I(1), and the term “specification” to refer to a set of covariates in a statistical model.

  16. Reanalysis of other long-cycle theories is presented in the SI, Sect. S.3.6.

  17. We would like to thank the original authors for sharing a copy of the original data that we could use for validation. MID Onsets in 1816, the first year in the full data, is equal to zero. Following Brandt et al. (2000, f.n. 22), we drop the first year of data and start with 1817, when MID Onsets is positive. The MID data have been revised since Pollins (1996) was published. Because we are replicating, we use the same series that Pollins (1996) and Brandt et al. (2000) used. Researchers interested in the impact of long cycles and hegemony on armed conflicts will want to collect revised data (as well as consider alternative operational decisions) and estimate their own specifications.

  18. Since our analysis excludes the first non-zero count (in 1817) from the dependent variable (footnote 9), this figure starts with 1818.

  19. We are able to effectively replicate the result by Brandt et al. (2000, Table 1, bottom panel) using the Pewma() implementation (SI, Sect. S.3.1).

  20. In the SI (Sect. S.3.3, esp. Fig. S.12), we report the specification in which the independent variables include CR3 through CR8.

  21. We have to drop both the constant term and a reference category for identification purposes (see the SI, Sect. S.3.2), whereas Brandt et al. (2000) estimate the P-EWMA model without a constant term but also without dropping a reference category.

  22. We run three MCMC chains of 21000 iterations each. We discard the first 6000 iterations of each chain as warmup and thereafter keep every 10th sample, for a total of \(3\times (21000 - 6000)/10 = 4500\) samples from the posterior distribution. The prior distributions are the same as in Eq. 7. We checked for convergence using Geweke and Gelman–Rubin convergence diagnostics (SI, Sect. S.3.4). We also estimated our specification for the P-EWMA model using the Pewma() function; the substantive implications are mostly similar to the MCMC-based results (SI, Sect. S.3.5).

  23. To put it another way, if a model specifies parameters’ moments only (e.g. Zeger’s (1988) serially correlated error model), it cannot be written in state-space form. We do not consider this class of models here.
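The marginalization in note 7 can be checked numerically. The following is a minimal Python sketch, not the authors' implementation: it integrates a Poisson likelihood over a gamma density by the midpoint rule and compares the result with the closed-form negative binomial probability. The function names, the integration grid, and the fixed shape and rate values (borrowed from the \(a_{0} = 2.5\) and \(b_{0} = 0.25\) of note 8 purely for illustration) are assumptions; in the P-EWMA model the gamma parameters depend on the history of the series.

```python
import math


def poisson_pmf(y, mean):
    """Poisson probability of count y given a positive mean."""
    return math.exp(y * math.log(mean) - mean - math.lgamma(y + 1))


def gamma_pdf(x, shape, rate):
    """Gamma density in the shape/rate parametrization."""
    return math.exp(shape * math.log(rate) + (shape - 1.0) * math.log(x)
                    - rate * x - math.lgamma(shape))


def negbin_pmf(y, shape, rate):
    """Closed-form negative binomial pmf of a Poisson-gamma mixture."""
    p = rate / (rate + 1.0)
    log_pmf = (math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
               + shape * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)


def mixture_pmf(y, shape, rate, upper=200.0, steps=50_000):
    """Midpoint-rule approximation of the integral in note 7.

    The infinite upper limit is truncated at `upper` (an assumption that is
    harmless here because the gamma tail beyond it is negligible).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        eps = (i + 0.5) * h  # midpoint of the i-th cell, always positive
        total += poisson_pmf(y, eps) * gamma_pdf(eps, shape, rate)
    return total * h


# Illustrative values only (hypothetical stand-ins for the state's parameters).
shape, rate = 2.5, 0.25
for y in (0, 5, 10, 25):
    print(y, mixture_pmf(y, shape, rate), negbin_pmf(y, shape, rate))
```

With these illustrative values the mixture has mean \(a_{0}/b_{0} = 10\) and variance \(a_{0}/b_{0} + a_{0}/b_{0}^{2} = 50\), so the marginal distribution is overdispersed even though the conditional distribution given \(\epsilon _{t}\) is Poisson.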

References

  • Alzaid AA, Al-Osh M (1990) An integer-valued pth-order autoregressive structure (INAR(p)) process. J Appl Prob 27(4):314–324

  • Boehmke FJ, Osborn TL, Schilling EU (2015) Pivotal politics and initiative use in the American states. Polit Res Q 68(4):665–677

  • Bradlow ET, Hardie BGS, Fader PS (2002) Bayesian inference for the negative binomial distribution via polynomial expansions. J Comput Gr Stat 11(1):189–201

  • Brandt PT, Sandler T (2012) A Bayesian Poisson vector autoregression model. Polit Anal 20(3):292–315

  • Brandt PT, Williams JT (2001) A linear Poisson autoregressive model: the Poisson AR(p) model. Polit Anal 9(2):164–184

  • Brandt PT, Williams JT, Fordham BO, Pollins B (2000) Dynamic modeling for persistent event-count time series. Am J Polit Sci 44(4):823–843

  • Cameron AC, Trivedi PK (2013) Regression analysis of count data, 2nd edn. Cambridge University Press, Cambridge

  • Davis RA, Dunsmuir WTM, Streett SB (2003) Observation-driven models for Poisson counts. Biometrika 90(4):777–790

  • Durbin J, Koopman SJ (2000) Time series analysis of non-Gaussian observations based on state space models from both classical and Bayesian perspectives. J R Stat Soc Ser B 62(1):3–56

  • Fokianos K (2012) Count time series models. In: Rao TS, Rao SS, Rao C (eds) Time series analysis: methods and applications. North Holland, Amsterdam

  • Fukumoto K (2008) Legislative production in comparative perspective: cross sectional study of 42 countries and time-series analysis of the Japan case. Jpn J Polit Sci 9(1):1–19

  • Grunwald GK, Hamza K, Hyndman R (1997) Some properties and generalizations of non-negative Bayesian time series models. J R Stat Soc Ser B 59(3):615–626

  • Harvey AC, Fernandes C (1989) Time series models for count or qualitative observations. J Business Econ Stat 7(4):407–417

  • Howell W, Adler S, Cameron C, Riemann C (2000) Divided government and the legislative productivity of Congress, 1945–94. Legislat Stud Q 25(2):285–312

  • Howell WG (2003) Power without persuasion: the politics of direct presidential action. Princeton University Press, Princeton

  • Jung RC, Kukuk M, Liesenfeld R (2006) Time series of count data: modeling, estimation and diagnostics. Comput Stat Data Anal 51(4):2350–2364

  • King G (1988) Statistical models for political science event counts: bias in conventional procedures and evidence for the exponential Poisson regression model. Am J Polit Sci 32(3):838–863

  • King G (1989a) Unifying political methodology: the likelihood theory of statistical inference. Cambridge University Press, Cambridge

  • King G (1989b) Variance specification in event count models: from restrictive assumptions to a generalized estimator. Am J Polit Sci 33(3):762–784

  • Mayhew DR (2005) Divided we govern: party control, lawmaking, and investigations, 1946–2002, 2nd edn. Yale University Press, New Haven

  • McKenzie E (1988) Some ARMA models for dependent sequences of Poisson counts. Adv Appl Prob 20(4):822–835

  • Metternich NW, Dorff C, Gallop M, Weschle S, Ward MD (2013) Antigovernment networks in civil conflicts: how network structures affect conflictual behavior. Am J Polit Sci 57(4):892–911

  • Mitchell SM, Moore WH (2002) Presidential uses of force during the Cold War: aggregation, truncation, and temporal dynamics. Am J Polit Sci 46(2):438–452

  • Park JH (2010) Structural change in the U.S. presidents’ use of force abroad. Am J Polit Sci 54(3):766–782

  • Pollins BM (1996) Global political order, economic change, and armed conflict: coevolving systems and the use of force. Am Polit Sci Rev 90(1):103–117

  • Trivedi PK, Munkin MK (2011) Recent developments in cross section and panel count models. In: Ullah A, Giles DEA (eds) Handbook of empirical economics and finance. Chapman and Hall/CRC, Boca Raton, pp 87–131

  • Zeger SL (1988) A regression model for time series of counts. Biometrika 75(4):621–629

Acknowledgements

We thank Patrick Brandt, Andrew Martin, Kevin Quinn, and Shawn Treier for their comments, and Ellen Furby for data management assistance.

Funding

We acknowledge financial support from the Japan Election Studies Association.

Author information

Corresponding author

Correspondence to Kentaro Fukumoto.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Communicated by Takahiro Hoshino.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Previous versions of this paper were presented at the Annual Meetings of the Japan Election Studies Association, Tokyo, Japan, May 17–18, 2014 and May 20–21, 2006; the Annual Meetings of the Midwest Political Science Association, Chicago, IL, USA, April 11–14, 2013 and April 20–23, 2006; and the Annual Meeting of the American Political Science Association, New Orleans, LA, USA, August 30–September 2, 2012.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 2823 KB)

About this article

Cite this article

Fukumoto, K., Beger, A. & Moore, W.H. Bayesian modeling for overdispersed event-count time series. Behaviormetrika 46, 435–452 (2019). https://doi.org/10.1007/s41237-019-00093-5

  • DOI: https://doi.org/10.1007/s41237-019-00093-5
