Elsevier

Computers & Geosciences

Volume 37, Issue 9, September 2011, Pages 1532-1533

Short note
GEVcdn: An R package for nonstationary extreme value analysis by generalized extreme value conditional density estimation network

https://doi.org/10.1016/j.cageo.2011.03.005

Abstract

An R package is developed for the Generalized Extreme Value conditional density estimation network (GEVcdn). Parameters in a GEV distribution are specified as a function of covariates using a probabilistic variant of the multilayer perceptron neural network. If the covariate is time or is dependent on time, then the GEVcdn model can be used to perform nonlinear, nonstationary extreme value analysis. Due to the flexibility of the neural network architecture, the model is capable of representing a wide range of nonstationary relationships, including those involving interactions between covariates. Model parameters are estimated by generalized maximum likelihood, an approach that is tailored to the analysis of hydroclimatological extremes. Functions are included to assist in the calculation of parameter uncertainty via bootstrapping.

Introduction

The distribution of a series of extreme values computed from long sequences of data asymptotically approaches the Generalized Extreme Value (GEV) distribution as the number of samples becomes large. The extreme value theorem, which is the extreme value analog of the central limit theorem (Coles, 2001), forms the basis for extreme value analysis of meteorological and hydrological series, for example, annual maxima of rainfall or streamflow observations, and, in turn, for the estimation of design criteria for engineering structures. One of the main assumptions is that the series is stationary, meaning that its statistical properties are independent of time. There is ample evidence that the hydroclimatological system is nonstationary on time scales relevant to applied extreme value analysis (Milly et al., 2008). The assumption of stationarity in extreme value analysis is therefore questionable, and new methods that explicitly allow for nonstationarity in the GEV distribution parameters are required.
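As a concrete illustration of how GEV parameters translate into design values, the following minimal sketch computes the GEV distribution function, its quantile function, and a T-block return level. It is written in Python with the standard library only and does not use the GEVcdn package. The sign convention for the shape parameter is assumed to follow Coles (2001); the function names and the numerical values in the example are hypothetical.

```python
import math

def gev_cdf(x, loc, scale, shape):
    """GEV cumulative distribution function (shape sign as in Coles, 2001)."""
    z = (x - loc) / scale
    if abs(shape) < 1e-12:
        # Gumbel limit as the shape parameter tends to zero
        return math.exp(-math.exp(-z))
    t = 1.0 + shape * z
    if t <= 0.0:
        # Outside the support: below the lower bound (shape > 0)
        # or above the upper bound (shape < 0)
        return 0.0 if shape > 0 else 1.0
    return math.exp(-t ** (-1.0 / shape))

def gev_quantile(p, loc, scale, shape):
    """Value not exceeded with probability p, for 0 < p < 1."""
    if abs(shape) < 1e-12:
        return loc - scale * math.log(-math.log(p))
    return loc + scale / shape * ((-math.log(p)) ** (-shape) - 1.0)

def return_level(T, loc, scale, shape):
    """Level exceeded on average once every T blocks (e.g., years)."""
    return gev_quantile(1.0 - 1.0 / T, loc, scale, shape)

# Hypothetical nonstationary example: the location drifts linearly with time,
# so the 100-year return level differs between year 0 and year 50.
for year in (0, 50):
    loc = 10.0 + 0.05 * year
    print(year, round(return_level(100, loc, 2.0, 0.1), 2))
```

Under stationarity the three parameters are constants and the return level is a single number; once any parameter depends on a covariate such as time, the return level becomes a function of that covariate.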

The GEV conditional density estimation network (GEVcdn), a model for nonstationary extreme value analysis, was developed by Cannon (2010). Parameters of the GEV distribution are specified as a function of covariates using a probabilistic extension of the multilayer perceptron neural network. Nonlinear relationships, including ones involving unspecified interactions between multiple covariates, can be represented, thus resulting in a flexible statistical model for analyzing extremes.
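The idea of specifying GEV parameters as a function of covariates can be sketched as a small forward pass through a one-hidden-layer network. This is an illustrative Python sketch, not the package's code: the hyperbolic tangent hidden layer, the exponential link for the scale, the bounding of the shape to (-0.5, 0.5), and all weights are assumptions made here for illustration.

```python
import math

def gev_cdn_forward(x, W1, b1, W2, b2):
    """Map covariates x to (location, scale, shape) with a one-hidden-layer MLP.

    W1: one weight vector per hidden node; b1: hidden biases.
    W2: three output weight vectors (location, scale, shape); b2: output biases.
    """
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    out = [sum(w * h for w, h in zip(row, hidden)) + b
           for row, b in zip(W2, b2)]
    location = out[0]                # identity: unconstrained
    scale = math.exp(out[1])         # exponential: keeps the scale positive
    shape = 0.5 * math.tanh(out[2])  # assumed bound: shape stays in (-0.5, 0.5)
    return location, scale, shape

# Hypothetical weights for a single covariate (e.g., scaled time) and two hidden nodes
W1 = [[1.5], [-0.8]]
b1 = [0.0, 0.3]
W2 = [[2.0, -1.0], [0.4, 0.1], [0.2, 0.0]]
b2 = [10.0, 0.5, 0.1]

for t in (0.0, 0.5, 1.0):
    print(t, gev_cdn_forward([t], W1, b1, W2, b2))
```

Setting all hidden-to-output weights for a parameter to zero leaves only its bias, which is one way to picture holding that parameter stationary.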

This note describes the GEVcdn package, which provides an implementation of the GEVcdn model in the R programming language (R Development Core Team, 2009). GEVcdn provides functions for (i) fitting single models (gevcdn.fit), (ii) fitting ensembles of bootstrap aggregated models (gevcdn.bag), (iii) predicting covariate-dependent GEV parameters from fitted models (gevcdn.evaluate), and (iv) calculating bootstrap-based confidence intervals for GEV parameters and specified quantiles (gevcdn.bootstrap).
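To make the notion of a bootstrap-based confidence interval for a quantile concrete, the sketch below applies a parametric bootstrap to a deliberately simplified model: a stationary Gumbel distribution fitted by the method of moments. This stand-in is an assumption chosen to keep the example short and dependency-free in Python; it is not the GEVcdn model and does not reproduce the behavior of gevcdn.bootstrap. All function names are hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def fit_gumbel_moments(sample):
    """Method-of-moments Gumbel fit (stand-in for a full GEV fit)."""
    scale = statistics.stdev(sample) * math.sqrt(6.0) / math.pi
    loc = statistics.mean(sample) - EULER_GAMMA * scale
    return loc, scale

def gumbel_quantile(p, loc, scale):
    return loc - scale * math.log(-math.log(p))

def bootstrap_quantile_ci(sample, p=0.99, n_boot=500, alpha=0.05, seed=1):
    """Percentile interval for the p-quantile via a parametric bootstrap."""
    rng = random.Random(seed)
    loc, scale = fit_gumbel_moments(sample)
    estimates = []
    for _ in range(n_boot):
        # Simulate a series from the fitted model by inverse-CDF sampling
        # (uniform draws are clamped away from 0 and 1), then refit.
        sim = [gumbel_quantile(min(max(rng.random(), 1e-12), 1.0 - 1e-12),
                               loc, scale)
               for _ in range(len(sample))]
        estimates.append(gumbel_quantile(p, *fit_gumbel_moments(sim)))
    estimates.sort()
    lower = estimates[int((alpha / 2.0) * n_boot)]
    upper = estimates[int((1.0 - alpha / 2.0) * n_boot) - 1]
    return gumbel_quantile(p, loc, scale), lower, upper

# Hypothetical annual maxima
maxima = [12.1, 9.8, 14.3, 11.0, 10.4, 13.7, 9.5, 12.9, 15.2, 10.8,
          11.6, 13.1, 9.9, 12.4, 14.8, 10.1, 11.9, 13.4, 12.0, 10.6]
print(bootstrap_quantile_ci(maxima))
```

The same pattern (fit, simulate or resample, refit, collect the statistic, take percentiles) carries over to covariate-dependent models, where the interval is computed separately at each covariate value.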

Features and capabilities

The gevcdn.fit function fits a GEVcdn model via the generalized maximum-likelihood approach of Martins and Stedinger (2000). Nonlinear and linear models can be specified using the same model architecture. In the nonlinear case, the number of hidden nodes in the neural network controls the overall complexity of the model. GEV location, scale, and shape parameters can optionally be held constant (i.e., stationary). The form of the beta distribution prior for the GEV shape parameter, discussed by
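A generalized maximum-likelihood objective of the kind referred to here can be sketched as the GEV negative log-likelihood minus the log density of a beta prior on the shape parameter. The Python sketch below makes several assumptions that should be checked against Martins and Stedinger (2000) and the package manual: the shape follows the Coles (2001) sign convention, a single shared shape is used, the prior is mapped onto (-0.5, 0.5) as shown, and the default values p = 6 and q = 9 are illustrative.

```python
import math

def gev_neg_log_lik(y, loc, scale, shape):
    """Negative log-likelihood of one observation under GEV(loc, scale, shape)."""
    z = (y - loc) / scale
    if abs(shape) < 1e-12:
        # Gumbel limit
        return math.log(scale) + z + math.exp(-z)
    t = 1.0 + shape * z
    if t <= 0.0:
        return math.inf  # observation lies outside the support
    return (math.log(scale) + (1.0 + 1.0 / shape) * math.log(t)
            + t ** (-1.0 / shape))

def beta_log_prior(shape, p=6.0, q=9.0):
    """Log density of a beta prior restricting the shape to (-0.5, 0.5)."""
    u = 0.5 - shape  # assumed mapping; the sign depends on the shape convention
    if not 0.0 < u < 1.0:
        return -math.inf
    log_norm = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    return (p - 1.0) * math.log(u) + (q - 1.0) * math.log(1.0 - u) - log_norm

def gml_objective(ys, locs, scales, shape, p=6.0, q=9.0):
    """Penalized objective to minimize: likelihood term minus log prior."""
    nll = sum(gev_neg_log_lik(y, m, s, shape)
              for y, m, s in zip(ys, locs, scales))
    return nll - beta_log_prior(shape, p, q)

# Hypothetical data with a covariate-dependent location and constant scale
ys = [10.2, 11.5, 9.9, 12.8, 13.1]
locs = [10.0, 10.5, 11.0, 11.5, 12.0]
scales = [1.2] * 5
print(gml_objective(ys, locs, scales, shape=0.1))
```

The prior acts as a penalty that steers the shape estimate away from physically implausible values, which matters most for the short records typical of hydroclimatological extremes.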

Software availability

Name of software: GEVcdn

Version: 1.0

Developer: Alex J. Cannon

Contact address: Meteorological Service of Canada, Environment Canada Pacific and Yukon Region, 201-401 Burrard Street, Vancouver, BC, V6C 3S5, Canada

E-mail address: [email protected]

Availability and online documentation: Free download with manual and supporting material at: http://www.eos.ubc.ca/~acannon/GEVcdn

Year first available: 2010

Software required: R (http://www.r-project.org)

Acknowledgment

Portions of this work were conducted while visiting the Climate Prediction Group in the Department of Earth and Ocean Sciences (EOS) at The University of British Columbia (UBC).

References (10)

  • A.J. Cannon et al. (2002). Downscaling recent streamflow conditions in British Columbia, Canada using ensemble neural network models. Journal of Hydrology.
  • L. Breiman (1996). Bagging predictors. Machine Learning.
  • K.P. Burnham et al. (2004). Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods and Research.
  • A.J. Cannon (2010). A flexible nonlinear modelling framework for nonstationary generalized extreme value analysis in hydroclimatology. Hydrological Processes.
  • A.J. Cannon et al. (2001). Modeling transient pH depressions in coastal streams of British Columbia using neural networks. Journal of the American Water Resources Association.
There are more references available in the full text version of this article.

Code available from: http://www.eos.ubc.ca/~acannon/GEVcdn.
