Unsupervised learning on U.S. weather forecast performance

  • Original paper
  • Journal: Computational Statistics

Abstract

Climate events and weather forecasts strongly affect human activities. To assess the accuracy of weather prediction, we applied functional principal component analysis (FPCA) to investigate the main patterns of variance in U.S. weather prediction errors over a period of 3 years. We then grouped the U.S. states by their similarity in weather forecast performance using two types of functional clustering approaches: the filtering method and the model-based method. The strengths and weaknesses of each clustering method were examined through simulation studies. The clustering approaches were then applied to U.S. weather data from 2014 to 2017. Cluster-specific patterns were detected visually, and cluster-to-cluster differences were quantified to identify the most and least predictable U.S. states.
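
As a rough illustration of the pipeline the abstract describes, the sketch below runs a discretized FPCA on synthetic forecast-error curves and then clusters the resulting scores with k-means. It assumes that the "filtering method" means reducing each curve to a few FPCA scores before clustering, which is the usual sense of the term in the functional clustering literature. The data, the function names `fpca_scores` and `kmeans`, and all parameter values are hypothetical; this is not the authors' implementation and it does not use their data.

```python
import numpy as np


def fpca_scores(curves, n_components=2):
    """Discretized FPCA for curves of shape (n_curves, n_timepoints) on a common grid."""
    centered = curves - curves.mean(axis=0)
    # Rows of vt are the discretized eigenfunctions of the sample covariance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = centered @ vt[:n_components].T
    return scores, explained[:n_components]


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of points."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic (hypothetical) forecast-error curves: 10 "states" with a low
# error level and 10 with a high error level, sharing one seasonal shape.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
seasonal = 0.5 * np.sin(2 * np.pi * t)
group_a = 1.0 + seasonal + 0.1 * rng.standard_normal((10, t.size))
group_b = 3.0 + seasonal + 0.1 * rng.standard_normal((10, t.size))
curves = np.vstack([group_a, group_b])

# Filtering step: reduce each curve to two FPCA scores, then cluster the scores.
scores, explained = fpca_scores(curves, n_components=2)
labels = kmeans(scores, k=2)
print("Variance explained by each component:", np.round(explained, 3))
print("Cluster labels:", labels)
```

Clustering the scores rather than the raw curves keeps the dimension low and discards most of the pointwise noise. A model-based method would instead fit a mixture model to the curves or their basis coefficients, which this sketch does not attempt.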

Acknowledgements

The authors are most appreciative of the organizers of the 2018 JSM Data Expo, who made this work possible. We also thank Dr. Peijun Sang, PhD candidates Yuping Yang and Zhiyang Zhou, and the faculty members and graduate students in the Department of Statistics and Actuarial Science at Simon Fraser University who provided helpful suggestions on this project.

Author information

Corresponding author

Correspondence to Jiguo Cao.

About this article

Cite this article

Lin, C., Yu, Y., Wu, L.Y. et al. Unsupervised learning on U.S. weather forecast performance. Comput Stat 38, 1193–1213 (2023). https://doi.org/10.1007/s00180-023-01340-w

  • DOI: https://doi.org/10.1007/s00180-023-01340-w
