Unsupervised learning on U.S. weather forecast performance

Lin, Chuyuan; Yu, Ying; Wu, Lucas Y.; Cao, Jiguo

doi:10.1007/s00180-023-01340-w

Original paper
Published: 20 March 2023

Volume 38, pages 1193–1213, (2023)
Computational Statistics

Chuyuan Lin¹, Ying Yu¹, Lucas Y. Wu¹ & Jiguo Cao¹ (ORCID: orcid.org/0000-0001-7417-6330)

Abstract

Climate events and weather predictions have a major impact on human activities. To understand the accuracy of weather prediction, we applied functional principal component analysis (FPCA) to investigate the main patterns of variance in U.S. weather prediction error over a period of 3 years. We further grouped the U.S. states by their similarity in weather forecast performance using two types of functional clustering approaches: the filtering method and the model-based method. The strengths and weaknesses of each clustering method were identified through simulation studies. The clustering approaches were then applied to U.S. weather data from 2014 to 2017. Through clustering, cluster-specific patterns were detected visually, and cluster-to-cluster differences were quantified in order to identify the most and least predictable U.S. states.
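The filtering approach mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each state's forecast-error curve is observed on a common, equally spaced grid, approximates FPCA by an ordinary PCA (via SVD) of the discretized curves without smoothing, and then clusters the resulting scores with a plain k-means. The data are simulated, and all names (`fpca_scores`, `kmeans`, `group_a`, `group_b`) are hypothetical.

```python
import numpy as np


def fpca_scores(curves, n_components=2):
    """Discretized FPCA: PCA of centered curves sampled on a common grid."""
    centered = curves - curves.mean(axis=0)
    # Rows of vt are the discretized principal component functions.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, vt[:n_components], explained[:n_components]


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with a deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Simulated "forecast error" curves (assumption: two groups of 20 curves each).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
pattern = np.sin(2 * np.pi * grid)
group_a = 2.0 * pattern + rng.normal(0.0, 0.3, size=(20, 50))
group_b = -2.0 * pattern + rng.normal(0.0, 0.3, size=(20, 50))
curves = np.vstack([group_a, group_b])

# Filtering step: reduce each curve to a few FPCA scores, then cluster the scores.
scores, components, explained = fpca_scores(curves, n_components=2)
labels, centers = kmeans(scores, k=2)
print("variance share of leading components:", explained)
print("cluster labels:", labels)
```

The two-stage design (dimension reduction first, clustering second) is what distinguishes a filtering method from a model-based method, which would instead fit a mixture model to the curves or their basis coefficients directly. A fuller analysis would typically smooth the curves with a basis expansion before computing the principal components.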

Acknowledgements

The authors are most appreciative of the organizers of the 2018 JSM Data Expo, who made this happen. We also thank Dr. Peijun Sang, PhD candidates Yuping Yang and Zhiyang Zhou, and the faculty members and graduate students in the Department of Statistics and Actuarial Science at Simon Fraser University who provided helpful suggestions relating to this project.

Author information

Authors and Affiliations

Department of Statistics and Actuarial Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
Chuyuan Lin, Ying Yu, Lucas Y. Wu & Jiguo Cao

Corresponding author

Correspondence to Jiguo Cao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 699 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lin, C., Yu, Y., Wu, L.Y. et al. Unsupervised learning on U.S. weather forecast performance. Comput Stat 38, 1193–1213 (2023). https://doi.org/10.1007/s00180-023-01340-w

Received: 27 April 2019
Accepted: 22 February 2023
Published: 20 March 2023
Issue Date: September 2023
DOI: https://doi.org/10.1007/s00180-023-01340-w
