
Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection

  • Research Article
  • Published in the Journal of the Korean Statistical Society

A Discussion to this article was published on 16 September 2020


Abstract

Many existing procedures for detecting multiple change-points in data sequences fail in frequent-change-point scenarios. This article proposes a new change-point detection methodology designed to work well in both infrequent and frequent change-point settings. It is made up of two ingredients: one is “Wild Binary Segmentation 2” (WBS2), a recursive algorithm for producing what we call a ‘complete’ solution path to the change-point detection problem, i.e. a sequence of estimated nested models containing \(0, \ldots , T-1\) change-points, where T is the data length. The other ingredient is a new model selection procedure, referred to as “Steepest Drop to Low Levels” (SDLL). The SDLL criterion acts on the WBS2 solution path, and, unlike many existing model selection procedures for change-point problems, it is not penalty-based, and only uses thresholding as a certain discrete secondary check. The resulting WBS2.SDLL procedure, combining both ingredients, is shown to be consistent, and to significantly outperform the competition in the frequent change-point scenarios tested. WBS2.SDLL is fast, easy to code and does not require the choice of a window or span parameter.
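To make the two ingredients concrete, here is a minimal, hypothetical Python sketch of the WBS2 recursion (not the authors' implementation; the published version, together with SDLL, is available in the breakfast R package cited in the references). On each segment it draws random subintervals, keeps the location maximising the absolute CUSUM over all of them, recurses on both halves, and finally orders the resulting complete solution path of \(T-1\) candidates by magnitude. The interval-drawing scheme and all constants here are illustrative placeholders.

```python
import numpy as np

def cusum(x, s, e):
    """Absolute CUSUM statistics for splits b = s, ..., e - 1 of the
    (0-indexed, inclusive) segment x[s..e]."""
    n = e - s + 1
    b = np.arange(1, n)                        # number of points left of the split
    left = np.cumsum(x[s:e + 1])[:-1]          # partial sums of the first b points
    total = left[-1] + x[e]
    return np.abs(np.sqrt((n - b) / (n * b)) * left
                  - np.sqrt(b / (n * (n - b))) * (total - left))

def wbs2_path(x, M=100, seed=0):
    """Toy WBS2 sketch: on each segment, draw up to M random subintervals
    (always including the segment itself), keep the split maximising the
    absolute CUSUM over all of them, and recurse on both halves.  Returns
    (candidate, magnitude) pairs sorted by decreasing magnitude; for data
    of length T the path always has T - 1 entries, as in Theorem 3.1(i)."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    path = []

    def recurse(s, e):
        if e <= s:                             # segments of length <= 1: nothing to add
            return
        intervals = [(s, e)]
        for _ in range(M):
            a, b = sorted(rng.integers(s, e + 1, size=2))
            if b > a:
                intervals.append((int(a), int(b)))
        best_val, best_b = -1.0, s
        for sm, em in intervals:
            stats = cusum(x, sm, em)
            j = int(np.argmax(stats))
            if stats[j] > best_val:
                best_val, best_b = float(stats[j]), sm + j  # split after x[sm + j]
        path.append((best_b, best_val))
        recurse(s, best_b)
        recurse(best_b + 1, e)

    recurse(0, len(x) - 1)
    return sorted(path, key=lambda p: -p[1])
```

On a noiseless step of length 40 with a jump after index 19, the top-ranked candidate is the true change-point and the remaining 38 candidates have (near-)zero magnitude, so any sensible model selection rule keeps exactly one.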


References

  • Amiri, A., & Allahyari, S. (2012). Change point estimation methods for control chart postsignal diagnostics: A literature review. Quality and Reliability Engineering International, 28, 673–685.

  • Anastasiou, A., & Fryzlewicz, P. (2018a). Detecting multiple generalized change-points by isolating single ones. Preprint.

  • Anastasiou, A., & Fryzlewicz, P. (2018b). IDetect: Detecting multiple generalized change-points by isolating single ones. https://CRAN.R-project.org/package=IDetect. R package version 1.0.

  • Andreou, E., & Ghysels, E. (2002). Detecting multiple breaks in financial market volatility dynamics. Journal of Applied Econometrics, 17, 579–600.

  • Arlot, S. (2019). Minimal penalties and the slope heuristics: A survey. Journal de la Société Française de Statistique, 160, 1–106.

  • Arlot, S., Brault, V., Baudry, J.-P., Maugis, C., & Michel, B. (2016). capushe: CAlibrating Penalities Using Slope HEuristics. https://CRAN.R-project.org/package=capushe. R package version 1.1.1.

  • Bai, J. (1997). Estimating multiple breaks one at a time. Econometric Theory, 13, 315–352.

  • Bai, J., & Perron, P. (2003). Computation and analysis of multiple structural change models. Journal of Applied Econometrics, 18, 1–22.

  • Baranowski, R., & Fryzlewicz, P. (2015). wbs: Wild binary segmentation for multiple change-point detection. https://CRAN.R-project.org/package=wbs. R package version 1.3.

  • Baranowski, R., Chen, Y., & Fryzlewicz, P. (2019). Narrowest-Over-Threshold detection of multiple change-points and change-point-like features. Journal of the Royal Statistical Society: Series B, 81, 649–672.

  • Baudry, J.-P., Maugis, C., & Michel, B. (2012). Slope heuristics: Overview and implementation. Statistics and Computing, 22, 455–470.

  • Birgé, L., & Massart, P. (2001). Gaussian model selection. Journal of the European Mathematical Society, 3, 203–268.

  • Birgé, L., & Massart, P. (2007). Minimal penalties for Gaussian model selection. Probability Theory and Related Fields, 138, 33–73.

  • Bosq, D. (1998). Nonparametric statistics for stochastic processes (2nd ed.). New York: Springer.

  • Boysen, L., Kempe, A., Liebscher, V., Munk, A., & Wittich, O. (2009). Consistencies and rates of convergence of jump-penalized least squares estimators. Annals of Statistics, 37, 157–183.

  • Braun, J., & Mueller, H.-G. (1998). Statistical methods for DNA sequence segmentation. Statistical Science, 13, 142–162.

  • Braun, J., Braun, R., & Mueller, H.-G. (2000). Multiple changepoint fitting via quasilikelihood, with application to DNA sequence segmentation. Biometrika, 87, 301–314.

  • Brodsky, B., & Darkhovsky, B. (1993). Nonparametric methods in change-point problems. Dordrecht: Kluwer Academic Publishers.

  • Chen, K.-M., Cohen, A., & Sackrowitz, H. (2011). Consistent multiple testing for change points. Journal of Multivariate Analysis, 102, 1339–1343.

  • Cho, H., & Fryzlewicz, P. (2011). Multiscale interpretation of taut string estimation and its connection to Unbalanced Haar wavelets. Statistics and Computing, 21, 671–681.

  • Cho, H., & Fryzlewicz, P. (2012). Multiscale and multilevel technique for consistent segmentation of nonstationary time series. Statistica Sinica, 22, 207–229.

  • Cho, H., & Fryzlewicz, P. (2015). Multiple change-point detection for high-dimensional time series via sparsified binary segmentation. Journal of the Royal Statistical Society Series B, 77, 475–507.

  • Ciuperca, G. (2011). A general criterion to determine the number of change-points. Statistics & Probability Letters, 81, 1267–1275.

  • Ciuperca, G. (2014). Model selection by LASSO methods in a change-point model. Statistical Papers, 55, 349–374.

  • Cleynen, A., Rigaill, G., & Koskas, M. (2016). Segmentor3IsBack: A fast segmentation algorithm. https://CRAN.R-project.org/package=Segmentor3IsBack. R package version 2.0.

  • D’Angelo, M., Palhares, R., Takahashi, R., Loschi, R., Baccarini, L., & Caminhas, W. (2011). Incipient fault detection in induction machine stator-winding using a fuzzy-Bayesian change point detection approach. Applied Soft Computing, 11, 179–192.

  • Davies, P. L., & Kovac, A. (2001). Local extremes, runs, strings and multiresolution. Annals of Statistics, 29, 1–48.

  • Davis, R., Lee, T., & Rodriguez-Yam, G. (2006). Structural break estimation for nonstationary time series models. Journal of the American Statistical Association, 101, 223–239.

  • Du, C., Kao, C.-L., & Kou, S. (2016). Stepwise signal extraction via marginal likelihood. Journal of the American Statistical Association, 111, 314–330.

  • Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32, 407–499.

  • Eichinger, B., & Kirch, C. (2018). A MOSUM procedure for the estimation of multiple random change points. Bernoulli, 24, 526–564.

  • Frick, K., Munk, A., & Sieling, H. (2014). Multiscale change-point inference (with discussion). Journal of the Royal Statistical Society Series B, 76, 495–580.

  • Fryzlewicz, P. (2014). Wild binary segmentation for multiple change-point detection. Annals of Statistics, 42, 2243–2281.

  • Fryzlewicz, P. (2017). breakfast: Multiple change-point detection and segmentation. https://CRAN.R-project.org/package=breakfast. R package version 1.0.0.

  • Fryzlewicz, P. (2018). Tail-greedy bottom-up data decompositions and fast multiple change-point detection. The Annals of Statistics, 46, 3390–3421.

  • Fryzlewicz, P., & Subba Rao, S. (2014). Multiple-change-point detection for auto-regressive conditional heteroscedastic processes. Journal of the Royal Statistical Society Series B, 76, 903–924.

  • Galceran, E., Cunningham, A., Eustice, R., & Olson, E. (2015). Multipolicy decision-making for autonomous driving via changepoint-based behavior prediction. In 2015 robotics: Science and systems conference, RSS 2015 (Vol. 11).

  • Guntuboyina, A., Lieu, D., Chatterjee, S., & Sen, B. (2020). Adaptive risk bounds in univariate total variation denoising and trend filtering. The Annals of Statistics, 48, 205–229.

  • Hansen, B. (2001). The new econometrics of structural change: Dating breaks in U.S. labour productivity. Journal of Economic Perspectives, 15, 117–128.

  • Harchaoui, Z., & Lévy-Leduc, C. (2010). Multiple change-point estimation with a total variation penalty. Journal of the American Statistical Association, 105, 1480–1493.

  • Huang, C.-Y., & Lyu, M. (2011). Estimation and analysis of some generalized multiple change-point software reliability models. IEEE Transactions on Reliability, 60, 498–514.

  • Huskova, M., & Slaby, A. (2001). Permutation tests for multiple changes. Kybernetika, 37, 605–622.

  • James, N., & Matteson, D. (2014). ecp: An R package for nonparametric multiple change point analysis of multivariate data. Journal of Statistical Software, 62, 1–25.

  • Killick, R., Fearnhead, P., & Eckley, I. (2012). Optimal detection of changepoints with a linear computational cost. Journal of the American Statistical Association, 107, 1590–1598.

  • Killick, R., Haynes, K., & Eckley, I. (2016). changepoint: An R package for changepoint analysis. https://CRAN.R-project.org/package=changepoint. R package version 2.2.2.

  • Korkas, K., & Fryzlewicz, P. (2017). Multiple change-point detection for non-stationary time series using wild binary segmentation. Statistica Sinica, 27, 287–311.

  • Lavielle, M. (1999). Detection of multiple changes in a sequence of dependent variables. Stochastic Processes and their Applications, 83, 79–102.

  • Lavielle, M. (2005). Using penalized contrasts for the change-point problem. Signal Processing, 85, 1501–1510.

  • Lavielle, M., & Moulines, E. (2000). Least-squares estimation of an unknown number of shifts in a time series. Journal of Time Series Analysis, 21, 33–59.

  • Lebarbier, E. (2005). Detecting multiple change-points in the mean of Gaussian process by model selection. Signal Processing, 85, 717–736.

  • Lee, C.-B. (1995). Estimating the number of change points in a sequence of independent normal random variables. Statistics and Probability Letters, 25, 241–248.

  • Li, H., & Munk, A. (2016). FDR-control in multiscale change-point segmentation. Electronic Journal of Statistics, 10, 918–959.

  • Li, H., & Sieling, H. (2017). FDRSeg: FDR-control in multiscale change-point segmentation. https://CRAN.R-project.org/package=FDRSeg. R package version 1.0-3.

  • Lin, K., Sharpnack, J. L., Rinaldo, A., & Tibshirani, R. J. (2017). A sharp error analysis for the fused lasso, with application to approximate changepoint screening. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, & R. Garnett (Eds.), Advances in neural information processing systems (pp. 6884–6893). Curran Associates, Inc.

  • Liu, D., Chen, X., Lian, Y., & Lou, Z. (2010). Impacts of climate change and human activities on surface runoff in the Dongjiang River basin of China. Hydrological Processes, 24, 1487–1495.

  • Maidstone, R., Hocking, T., Rigaill, G., & Fearnhead, P. (2017). On optimal multiple changepoint algorithms for large data. Statistics and Computing, 27, 519–533.

  • Mallows, C. (1991). Another comment on O’Cinneide. The American Statistician, 45, 257.

  • Matteson, D., & James, N. (2014). A nonparametric approach for multiple change point analysis of multivariate data. Journal of the American Statistical Association, 109, 334–345.

  • Meier, A., Cho, H., & Kirch, C. (2018). mosum: Moving sum based procedures for changes in the mean. https://CRAN.R-project.org/package=mosum. R package version 1.2.0.

  • Muggeo, V. (2003). Estimating regression models with unknown break-points. Statistics in Medicine, 22, 3055–3071.

  • Muggeo, V. (2012). cumSeg: Change point detection in genomic sequences. https://CRAN.R-project.org/package=cumSeg. R package version 1.1.

  • Muggeo, V., & Adelfio, G. (2011). Efficient change point detection for genomic sequences of continuous measurements. Bioinformatics, 27, 161–166.

  • National Research Council. (2013). Frontiers in massive data analysis. Washington, DC: The National Academies Press. https://doi.org/10.17226/18374.

  • Olshen, A., Venkatraman, E. S., Lucito, R., & Wigler, M. (2004). Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics, 5, 557–572.

  • Pan, J., & Chen, J. (2006). Application of modified information criterion to multiple change point problems. Journal of Multivariate Analysis, 97, 2221–2241.

  • Pein, F., Hotz, T., Sieling, H., & Aspelmeier, T. (2018). stepR: Multiscale change-point inference. https://CRAN.R-project.org/package=stepR. R package version 2.0-2.

  • Pezzatti, G., Zumbrunnen, T., Bürgi, M., Ambrosetti, P., & Conedera, M. (2013). Fire regime shifts as a consequence of fire policy and socio-economic development: An analysis based on the change point approach. Forest Policy and Economics, 29, 7–18.

  • Pierre-Jean, M., Rigaill, G., & Neuvial, P. (2017). jointseg: Joint segmentation of multivariate (copy number) signals. https://CRAN.R-project.org/package=jointseg. R package version 1.0.1.

  • Ranganathan, A. (2012). PLISS: Labeling places using online changepoint detection. Autonomous Robots, 32, 351–368.

  • Reeves, J., Chen, J., Wang, X., Lund, R., & Lu, Q. (2007). A review and comparison of changepoint detection techniques for climate data. Journal of Applied Meteorology and Climatology, 46, 900–915.

  • Rigaill, G. (2015). A pruned dynamic programming algorithm to recover the best segmentations with 1 to \(k_{max}\) change-points. Journal de la Société Française de Statistique, 156, 180–205.

  • Rigaill, G., & Hocking, T. D. (2016). fpop: Segmentation using optimal partitioning and function pruning. https://R-Forge.R-project.org/projects/opfp/. R package version 2016.10.25/r55.

  • Rinaldo, A. (2009). Properties and refinements of the fused lasso. Annals of Statistics, 37, 2922–2952.

  • Rojas, C., & Wahlberg, B. (2014). On change point detection using the fused lasso method. Unpublished manuscript.

  • Ross, G. J. (2015). Parametric and nonparametric sequential change detection in R: the cpm package. Journal of Statistical Software, 66, 1–20.

  • Salarijazi, M., Akhond-Ali, A., Adib, A., & Daneshkhah, A. (2012). Trend and change-point detection for the annual stream-flow series of the Karun River at the Ahvaz hydrometric station. African Journal of Agricultural Research, 7, 4540–4552.

  • Tibshirani, R. (2014). Adaptive piecewise polynomial estimation via trend filtering. Annals of Statistics, 42, 285–323.

  • Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B, 67, 91–108.

  • Truong, C., Oudre, L., & Vayatis, N. (2020). Selective review of offline change point detection methods. Signal Processing, 167, 107299.

  • Venkatraman, E.S. (1992). Consistency results in multiple change-point problems. Technical Report No. 24, Department of Statistics, Stanford University. https://statistics.stanford.edu/resources/technical-reports.

  • Venkatraman, E. S., & Olshen, A. (2007). A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics, 23, 657–663.

  • Vostrikova, L. (1981). Detecting ‘disorder’ in multidimensional random processes. Soviet Mathematics Doklady, 24, 55–59.

  • Wang, D., Yu, Y., & Rinaldo, A. (2018). Univariate mean change point detection: Penalization, CUSUM and optimality. Preprint.

  • Wang, T., & Samworth, R. (2018). High dimensional change point estimation via sparse projection. Journal of the Royal Statistical Society: Series B, 80, 57–83.

  • Wang, Y. (1995). Jump and sharp cusp detection by wavelets. Biometrika, 82, 385–397.

  • Wu, Y. (2008). Simultaneous change point analysis and variable selection in a regression problem. Journal of Multivariate Analysis, 99, 2154–2171.

  • Yao, Y.-C. (1988). Estimating the number of change-points via Schwarz’ criterion. Statistics & Probability Letters, 6, 181–189.

  • Yao, Y.-C., & Au, S. T. (1989). Least-squares estimation of a step function. Sankhya Series A, 51, 370–381.

  • Younes, L., Albert, M., & Miller, M. (2014). Inferring changepoint times of medial temporal lobe morphometric change in preclinical Alzheimer’s disease. NeuroImage: Clinical, 5, 178–187.

  • Zeileis, A., Leisch, F., Hornik, K., & Kleiber, C. (2002). strucchange: An R package for testing for structural change in linear regression models. Journal of Statistical Software, 7, 1–38.

  • Zhang, N., & Siegmund, D. (2007). A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data. Biometrics, 63, 22–32.

Author information

Corresponding author

Correspondence to Piotr Fryzlewicz.

Additional information


Work supported by the Engineering and Physical Sciences Research Council Grant no. EP/L014246/1.

Proofs

Proof of Theorem 3.1

Part (i). The statement is trivially true if \(T = 1\). Suppose, inductively, that it is true for data lengths \(1, \ldots , T-1\). For an input data sequence of length T, the first addition to the solution path splits the domain of operation into two, of lengths \(b_{m_0}\) and \(T-b_{m_0}\). Therefore, by the inductive hypothesis, the length of the solution path for the input data is \(1 + (b_{m_0} - 1) + (T-b_{m_0} - 1) = T-1\), which completes the proof of part (i).
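The induction can also be checked mechanically: whatever admissible split \(b \in \{1, \ldots, T-1\}\) is chosen at each step, the recursion "one path entry plus the paths of the two halves" telescopes to \(T-1\). The following illustrative check (hypothetical code, not from the paper) draws arbitrary splits:

```python
import random

def path_length(T, rng=random.Random(1)):
    """Length of the solution path produced by the part (i) recursion for data
    of length T: one entry for the chosen split b, plus the paths of the two
    sub-segments of lengths b and T - b, with an arbitrary admissible b."""
    if T <= 1:
        return 0
    b = rng.randint(1, T - 1)
    return 1 + path_length(b, rng) + path_length(T - b, rng)
```

For every T the value is T - 1, independently of the (random) choice of splits, exactly as the inductive argument asserts.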

Part (ii). The computation of the CUSUM statistic for all b in formula (2) is of computational order \(O(e-s)\). Therefore, the addition of elements to the solution path from all sub-domains within a single scale of operation is of computational order \(O({\tilde{M}}T)\). Once the scale reaches J, there is nothing left to do and the procedure stops. Therefore, the overall computational complexity before the sorting step is of order \(O({\tilde{M}}JT)\). The sorting can be performed in time \(O(T \log \,T)\), which never dominates \(O({\tilde{M}}JT)\), as the smallest possible J is of order \(O(\log \,T)\), which is straightforward to see from its definition. This completes the proof of part (ii).

Part (iii). We first set the probabilistic framework in which we analyse the behaviour of the WBS2 solution path algorithm. With the change-point locations denoted by \(\eta _1, \ldots , \eta _N\) (with the additional notation \(\eta _0 = 1\), \(\eta _{N+1} = T+1\)), we define the intervals

$$\begin{aligned} I_{i,j}^{k,l} = [\max (1, \eta _i + k), \min (T, \eta _j + l)], \end{aligned}$$

where \(k,l \in \mathbb {Z} \cap [- C_1(\Delta )({\underline{f}}_T)^{-2}\log \,T, C_1(\Delta )({\underline{f}}_T)^{-2}\log \,T]\) for each \(1 \le i+1 < j \le {N+1}\). Suppose that on each interval \(I_{i,j}^{k,l}\), \({\tilde{M}}\) intervals \(\{[s_m, e_m]\}_{m=1}^{{\tilde{M}}}\) have been drawn, with the start- and end-points having been drawn uniformly with replacement from \(I_{i,j}^{k,l}\) (if \({\tilde{M}} > |I_{i,j}^{k,l}|(|I_{i,j}^{k,l}| - 1)/2\), then we understand \(\{[s_m, e_m]\}_{m=1}^{{\tilde{M}}}\) to contain all possible subintervals of \(I_{i,j}^{k,l}\)). Note we do not reflect the (stochastic) dependence of \(s_m, e_m\) on ijkl so as not to over-complicate the notation. As in the proof of Theorem 3.2 in Fryzlewicz (2014), we define intervals \({{\mathcal {I}}}_r\) between change-points in such a way that their lengths are at least of order O(T), and they are separated from the change-points also by distances at least of order O(T). To fix ideas, define \({{\mathcal {I}}}_r = [\eta _{r-1} + \frac{1}{3}(\eta _r - \eta _{r-1}), \eta _{r-1} + \frac{2}{3}(\eta _r - \eta _{r-1})]\), \(r = 1, \ldots , N+1\). For each interval \(I_{i,j}^{k,l}\), we are interested in the following event

$$\begin{aligned} A_{i,j}^{k,l} = \{ \exists _{m_0\in \{1, \ldots , {\tilde{M}}\}} \,\, \exists _{r\in \{i+1,\ldots ,j-1\}}\,\, (s_{m_0}, e_{m_0}) \in {{\mathcal {I}}}_r \times {{\mathcal {I}}}_{r+1} \}. \end{aligned}$$

Note that

$$\begin{aligned} P\{(A_{i,j}^{k,l})^c\}\le & {} \prod _{m=1}^{{\tilde{M}}} P\left\{ (s_m, e_m) \not \in \bigcup _{r=i+1}^{j-1} {{\mathcal {I}}}_r \times {{\mathcal {I}}}_{r+1} \right\} \\\le & {} \prod _{m=1}^{{\tilde{M}}} \max _{r\in \{i+1,\ldots ,j-1\}} (1 - P\{(s_m, e_m) \in {{\mathcal {I}}}_r \times {{\mathcal {I}}}_{r+1}\} ) \le (1 - \delta ^2/9)^{{\tilde{M}}}, \end{aligned}$$

and hence

$$\begin{aligned} P\left( \bigcap _{i,j,k,l} A_{i,j}^{k,l} \right)\ge & {} 1 - \sum _{i,j,k,l} P\{(A_{i,j}^{k,l})^c\} \\\ge & {} 1 - \frac{1}{2}N(N+1) (2C_1(\Delta )({\underline{f}}_T)^{-2}\log \,T + 1)^2 (1 - \delta ^2/9)^{{\tilde{M}}}. \end{aligned}$$
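To see how quickly this lower bound approaches one as \({\tilde{M}}\) grows, it can be evaluated numerically. The parameter values below (\(\delta\), N, and the squared logarithmic factor) are arbitrary placeholders for illustration only, not quantities from the paper:

```python
def lower_bound(M, N=10, delta=0.3, log_factor_sq=1e4):
    """Illustrative evaluation of the display above: a lower bound on the
    probability that all events A_{i,j}^{k,l} hold simultaneously,
    1 - N(N+1)/2 * log_factor_sq * (1 - delta**2/9)**M.
    All parameter values are hypothetical placeholders."""
    return 1 - 0.5 * N * (N + 1) * log_factor_sq * (1 - delta ** 2 / 9) ** M
```

The bound is vacuous for small \({\tilde{M}}\) but converges to one geometrically fast, which is the mechanism behind requiring only a moderate number of drawn intervals per sub-domain.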

Define further the following event

$$\begin{aligned} {{\mathcal {D}}}_{\Delta , T} = \left\{ \forall \,\, 1 \le s \le b < e \le T\qquad |{\tilde{\varepsilon }}_{s,e}^b| \le 2 \sigma \{ (1 + \Delta ) \log \, T\}^{1/2} \right\} , \end{aligned}$$

where \({\tilde{\varepsilon }}_{s,e}^b\) is defined as in formula (2) with \(\varepsilon _t\) in place of \(X_t\). It is the statement of Lemma A.1 in Fryzlewicz (2018) that \({\mathcal {B}}_{\Delta , T} \subseteq {{\mathcal {D}}}_{\Delta , T}\).

The following arguments apply on the set \(\bigcap _{i,j,k,l} A_{i,j}^{k,l} \cap {{\mathcal {D}}}_{\Delta , T}\) for any fixed \(\Delta > 0\). Proceeding exactly as in the proof of Theorem 3.2 in Fryzlewicz (2014), at the start of the procedure we have \(s = 1\), \(e = T\), and in view of the fact that we are on \(A_{0,N+1}^{0,0} \cap {{\mathcal {D}}}_{\Delta , T}\), the procedure finds a \(b_{m_0} \in [s_{m_0}, e_{m_0}] \subseteq [s,e]\) such that

  1. (a) there exists an \(r \in \{1, 2, \ldots , N\}\) such that \(|\eta _r - b_{m_0}| \le C_1(\Delta )({\underline{f}}_T)^{-2}\log \,T\), and

  2. (b) \(|{\tilde{X}}_{s_{m_0}, e_{m_0}}^{b_{m_0}}| \gtrsim T^{1/2} {\underline{f}}_T\),

where the \(\gtrsim \) symbol means “of the order of or larger”. Again by arguments identical to those in the proof of Theorem 3.2 in Fryzlewicz (2014), on each subsequent segment containing previously undetected change-points, we are on one of the sets \(A_{i,j}^{k,l} \cap {{\mathcal {D}}}_{\Delta , T}\), and therefore the procedure again finds a \(b_{m_0}\) satisfying properties (a), (b) above for a certain previously undetected change-point \(\eta _r\), until all change-points have been identified in this way. Once all change-points have been identified, by Lemma A.5 of Fryzlewicz (2014), all maximisers \(b_{m_0}\) of the absolute CUSUM statistics \(|{\tilde{X}}_{s_m, e_m}^b|\) (over b) are such that \(|{\tilde{X}}_{s_{m_0}, e_{m_0}}^{b_{m_0}}| \le C_2(\Delta ) \log ^{1/2}T\). As \(\log ^{1/2}T = o(T^{1/2}{\underline{f}}_T)\), for T large enough, the sorting of the elements of \(\tilde{\mathcal {P}}\) does not move any elements from the first N to the last \(T-1-N\) or vice versa, which completes the proof of part (iii). \(\square \)

Proof of Theorem 3.2

With the notation as in the statement of the theorem and in the proof of Theorem 3.1, the following arguments apply on the set \(\bigcap _{i,j,k,l} A_{i,j}^{k,l} \cap {{\mathcal {D}}}_{\Delta , T} \cap \Theta _{\theta , T}\) for any fixed \(\Delta > 0\). Let T be large enough for the assertion of Theorem 3.1 to hold. Further, let \({\tilde{C}}\) be large enough for the following inequality to hold

$$\begin{aligned} {\tilde{\zeta }}_T = {\tilde{C}}{\hat{\sigma }}_T \{2 \log \,T\}^{1/2} > \max ( C_2(\Delta )\log ^{1/2}T, 2\sigma \{ (1+\Delta ) \log \,T \}^{1/2} ) \end{aligned}$$
(6)

(this is always possible on \(\Theta _{\theta , T}\)). If \(N = 0\), then in view of inequality (6) and the fact that we are on \({{\mathcal {D}}}_{\Delta , T}\), from the definition of the SDLL algorithm we necessarily have \({\hat{N}} = 0\). If \(N > 0\), then by part (iii) of Theorem 3.1, we must have \(K + 1 \ge N\). If \(|{\tilde{X}}_{s_{K+1}, e_{K+1}}^{b_{K+1}}| > {\tilde{\zeta }}_T\), then by inequality (6), we have \(N = K + 1\) and the SDLL procedure correctly identifies \({\hat{N}} = K + 1 = N\). If \(|{\tilde{X}}_{s_{K+1}, e_{K+1}}^{b_{K+1}}| < {\tilde{\zeta }}_T\), then we necessarily have \(N < K+1\), and by part (iii) of Theorem 3.1, we have, for \(Z_k = \log |{\tilde{X}}_{s_k, e_k}^{b_k}| - \log |{\tilde{X}}_{s_{k+1}, e_{k+1}}^{b_{k+1}}|\),

$$\begin{aligned} Z_N\sim & {} \log \,T,\\ Z_k\le & {} \log \left( \frac{(1+ \theta ) C_2(\Delta )}{ \beta {\tilde{C}} \sigma 2^{1/2}} \right) \quad \text {for}\quad k=N+1, \ldots , K\quad \text {if this range is non-empty}. \end{aligned}$$

Therefore, from the definition of the SDLL and by part (iii) of Theorem 3.1, for T large enough, \({\hat{N}} = N\) will be chosen. This completes the proof of the Theorem. \(\square \)
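The case analysis above suggests the following much-simplified caricature of the SDLL rule (hypothetical code; the genuine algorithm, with the precise definition of K and the choice of \({\tilde{C}}\), is given in the paper and implemented in the breakfast R package): rank the path by CUSUM magnitude, use the threshold only as the secondary check, and otherwise select the model size at the steepest drop in log-magnitude.

```python
import math

def sdll_toy(mags, zeta):
    """Caricature of Steepest-Drop-to-Low-Levels model selection.
    mags: strictly positive |CUSUM| magnitudes along the solution path,
          sorted in decreasing order;
    zeta: the threshold playing the role of zeta~_T in inequality (6).
    Returns an estimate of the number of change-points."""
    above = sum(m > zeta for m in mags)    # candidates clearing the threshold
    if above == 0:
        return 0                           # nothing survives: N_hat = 0
    if above == len(mags):
        return len(mags)                   # whole path above threshold
    # steepest drop Z_k = log|X_k| - log|X_{k+1}| among surviving candidates
    logs = [math.log(m) for m in mags[:above + 1]]
    drops = [logs[k] - logs[k + 1] for k in range(above)]
    return max(range(above), key=drops.__getitem__) + 1
```

For example, with magnitudes [10, 9.5, 9, 0.5, 0.4] and threshold 1, the drop from 9 to 0.5 dominates in log scale and three change-points are selected, mirroring how \(Z_N \sim \log T\) dwarfs the remaining drops in the display above.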


About this article

Cite this article

Fryzlewicz, P. Detecting possibly frequent change-points: Wild Binary Segmentation 2 and steepest-drop model selection. J. Korean Stat. Soc. 49, 1027–1070 (2020). https://doi.org/10.1007/s42952-020-00060-x
