GEVcdn: An R package for nonstationary extreme value analysis by generalized extreme value conditional density estimation network

doi:10.1016/j.cageo.2011.03.005

Computers & Geosciences

Volume 37, Issue 9, September 2011, Pages 1532-1533

https://doi.org/10.1016/j.cageo.2011.03.005

Abstract

An R package is developed for the Generalized Extreme Value conditional density estimation network (GEVcdn). Parameters in a GEV distribution are specified as a function of covariates using a probabilistic variant of the multilayer perceptron neural network. If the covariate is time or is dependent on time, then the GEVcdn model can be used to perform nonlinear, nonstationary extreme value analysis. Due to the flexibility of the neural network architecture, the model is capable of representing a wide range of nonstationary relationships, including those involving interactions between covariates. Model parameters are estimated by generalized maximum likelihood, an approach that is tailored to the analysis of hydroclimatological extremes. Functions are included to assist in the calculation of parameter uncertainty via bootstrapping.

Introduction

The distribution of a series of extreme values computed from long sequences of data asymptotically approaches the Generalized Extreme Value (GEV) distribution as the number of samples becomes large. The extreme value theorem, which is the extreme value analog of the central limit theorem (Coles, 2001), forms the basis for extreme value analysis of meteorological and hydrological series, for example, annual maxima of rainfall or streamflow observations, and, in turn, the estimation of design criteria for engineering structures. One of the main assumptions is that the series is stationary, meaning that its statistical properties are independent of time. There is ample evidence that the hydroclimatological system is nonstationary on time scales relevant to applied extreme value analysis (Milly et al., 2008). The assumption of stationarity in extreme value analysis is therefore questionable, and new methods that explicitly allow for nonstationarity in the GEV distribution parameters are required.
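
For reference, the GEV cumulative distribution function can be written as follows, with location \mu, scale \sigma > 0, and shape \xi. The symbols and the sign convention for the shape parameter follow the common textbook form and are an assumption here; some hydrological literature and software use the opposite sign for the shape, so the convention should be checked before comparing fitted values. In a nonstationary analysis, \mu, \sigma, and \xi are allowed to depend on time or on other covariates.

```latex
G(z) = \exp\left\{ -\left[ 1 + \xi \left( \frac{z - \mu}{\sigma} \right) \right]^{-1/\xi} \right\},
\qquad 1 + \xi \, \frac{z - \mu}{\sigma} > 0 ,

% Gumbel limit as \xi \to 0:
G(z) = \exp\left\{ -\exp\left[ -\left( \frac{z - \mu}{\sigma} \right) \right] \right\}
```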

The GEV conditional density estimation network (GEVcdn), a model for nonstationary extreme value analysis, was developed by Cannon (2010). Parameters of the GEV distribution are specified as a function of covariates using a probabilistic extension of the multilayer perceptron neural network. Nonlinear relationships, including ones involving unspecified interactions between multiple covariates, can be represented, thus resulting in a flexible statistical model for analyzing extremes.
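
The idea of letting a small neural network output the GEV parameters can be sketched in a few lines. The following Python example is an illustration only, not the GEVcdn package code: the function names, the single scalar covariate, the tanh hidden layer, and the output transformations (identity for location, exponential for scale, a scaled tanh that keeps the shape within (-0.5, 0.5)) are assumptions chosen for this sketch, and the shape parameter uses the convention in which positive values give a heavy upper tail.

```python
import math


def gev_params(x, w_hidden, b_hidden, w_out, b_out):
    """Map a scalar covariate x to (location, scale, shape).

    Hypothetical one-hidden-layer network: tanh hidden nodes followed by
    three linear outputs that are transformed to valid GEV parameters.
    """
    hidden = [math.tanh(w * x + b) for w, b in zip(w_hidden, b_hidden)]
    raw = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    location = raw[0]                 # unconstrained
    scale = math.exp(raw[1])          # forced to be positive
    shape = 0.5 * math.tanh(raw[2])   # assumed bound: (-0.5, 0.5)
    return location, scale, shape


def gev_neg_log_density(y, location, scale, shape):
    """Negative log of the GEV density at y (infinite outside the support)."""
    z = (y - location) / scale
    if abs(shape) < 1e-8:
        # Gumbel limit for a shape parameter close to zero
        return math.log(scale) + z + math.exp(-z)
    t = 1.0 + shape * z
    if t <= 0.0:
        return math.inf
    return math.log(scale) + (1.0 + 1.0 / shape) * math.log(t) + t ** (-1.0 / shape)


def neg_log_likelihood(xs, ys, weights):
    """Sum of negative log densities; weights = (w_hidden, b_hidden, w_out, b_out)."""
    return sum(
        gev_neg_log_density(y, *gev_params(x, *weights))
        for x, y in zip(xs, ys)
    )
```

Minimizing `neg_log_likelihood` over the weights would give a maximum-likelihood fit; adding a penalty on the shape parameter would move the sketch toward a generalized maximum-likelihood fit. With all hidden-to-output weights set to zero, the outputs depend only on the biases, which corresponds to a stationary model.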

This note describes the GEVcdn package, which provides an implementation of the GEVcdn model in the R programming language (R Development Core Team, 2009). GEVcdn provides functions for (i) fitting single models (gevcdn.fit), (ii) fitting ensembles of bootstrap aggregated models (gevcdn.bag), (iii) predicting covariate-dependent GEV parameters from fitted models (gevcdn.evaluate), and (iv) calculating bootstrap-based confidence intervals for GEV parameters and specified quantiles (gevcdn.bootstrap).
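
To illustrate what a bootstrap-based confidence interval for a quantile involves, the following Python sketch resamples the data with replacement, re-estimates the quantile each time, and takes percentiles of the resulting estimates. It is a generic percentile bootstrap written under stated assumptions, not the procedure implemented in gevcdn.bootstrap: the function names are hypothetical, and a closed-form method-of-moments Gumbel fit stands in for the model fit so that the example stays short and self-contained.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def gumbel_quantile_mom(sample, p):
    """Stand-in estimator: method-of-moments Gumbel fit, then the p-quantile."""
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    location = statistics.mean(sample) - EULER_GAMMA * scale
    return location - scale * math.log(-math.log(p))


def bootstrap_interval(sample, estimator, n_boot=500, level=0.95, seed=1):
    """Percentile bootstrap interval for estimator(sample)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    estimates = sorted(
        estimator([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(math.floor(alpha * (n_boot - 1)))]
    upper = estimates[int(math.ceil((1.0 - alpha) * (n_boot - 1)))]
    return lower, upper
```

For example, `bootstrap_interval(annual_maxima, lambda s: gumbel_quantile_mom(s, 0.99))` would return an approximate 95% interval for the 100-year return level of a hypothetical series `annual_maxima`. Other bootstrap variants, such as resampling residuals or simulating from the fitted distribution, follow the same resample, refit, and summarize pattern.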

Features and capabilities

The gevcdn.fit function fits a GEVcdn model via the generalized maximum-likelihood approach of Martins and Stedinger (2000). Nonlinear and linear models can be specified using the same model architecture. In the nonlinear case, the number of hidden nodes in the neural network controls the overall complexity of the model. GEV location, scale, and shape parameters can optionally be held constant (i.e., stationary). The form of the beta distribution prior for the GEV shape parameter, discussed by

Software availability

Name of software: GEVcdn

Version: 1.0

Developer: Alex J. Cannon

Contact address: Meteorological Service of Canada, Environment Canada Pacific and Yukon Region, 201-401 Burrard Street, Vancouver, BC, V6C 3S5, Canada

E-mail address: [email protected]

Availability and online documentation: Free download with manual and supporting material at: http://www.eos.ubc.ca/~acannon/GEVcdn

Year first available: 2010

Software required: R (http://www.r-project.org)

Acknowledgment

Portions of this work were conducted while visiting the Climate Prediction Group in the Department of Earth and Ocean Sciences (EOS) at The University of British Columbia (UBC).

References (10)

A.J. Cannon et al. (2002). Downscaling recent streamflow conditions in British Columbia, Canada using ensemble neural network models. Journal of Hydrology.

L. Breiman (1996). Bagging predictors. Machine Learning.

K.P. Burnham et al. (2004). Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods and Research.

A.J. Cannon (2010). A flexible nonlinear modelling framework for nonstationary generalized extreme value analysis in hydroclimatology. Hydrological Processes.

A.J. Cannon et al. (2001). Modeling transient pH depressions in coastal streams of British Columbia using neural networks. Journal of the American Water Resources Association.

There are more references available in the full text version of this article.


^☆: Code available from: http://www.eos.ubc.ca/~acannon/GEVcdn.
