Elsevier

Coastal Engineering

Volume 135, May 2018, Pages 16-30
Bayesian Networks in coastal engineering: Distinguishing descriptive and predictive applications

https://doi.org/10.1016/j.coastaleng.2018.01.005

Highlights

  • Ten years of daily shoreline data are used to develop a Bayesian Network (BN) that models shoreline change in response to storm events.

  • The BN can be differentiated into versions that optimise predictive or descriptive application.

  • The predictive BN is more skilful at predicting unseen storms than an equivalent empirical model.

  • The descriptive BN can be used to identify and investigate the key processes driving storm erosion.

Abstract

Bayesian networks (BNs) are increasingly being used to model complex coastal processes due to their ability to integrate non-linear systems, their transparent probabilistic framework, and low computational cost. A BN may be suited to descriptive or predictive application. Descriptive BNs are highly calibrated models that are useful for better understanding the physics and causal relationships driving a system. Predictive BNs are generalisations of a system that have skill at predicting outside of the training domain. The predictive and descriptive usefulness of a BN depends on its complexity and the amount of data available to train it, but there is often a trade-off; higher descriptive skill comes at the cost of reduced predictive skill. To demonstrate the differences between predictive and descriptive BNs in a coastal engineering context, a BN to predict shoreline recession caused by coastal storm events is developed and tested using an extensive 10-year dataset incorporating 137 individual storm events monitored at Narrabeen-Collaroy Beach, Australia. A parsimonious approach to BN development is used to separately determine the optimum predictive and descriptive BNs for this dataset. Results show that for this dataset two quite different BNs can be developed: one that is optimised to achieve the highest predictive skill, and a second network that is optimised to maximise descriptive skill. The optimum predictive BN is found to comprise 3 nodes (variables) and can predict the shoreline recession caused by unseen storm events with a skill of 65%. The optimum descriptive BN is composed of 5 nodes and can reproduce 88% of the training dataset, but with more limited predictive capabilities. The uses and limitations of these two different approaches to BN formulation are illustrated with example applications to coastal process modelling. It is anticipated that the insights provided in this paper will help to clarify the further development of Bayesian Networks applied to coastal modelling.

Introduction

Bayesian networks (BNs) are probabilistic graphical models that can be used to represent causal systems. They model interactions between the variables describing a system using representative datasets and statistics founded on Bayes’ rule of conditional probability. BNs originate from artificial intelligence research and are increasingly being used to model environmental systems (Aguilera et al., 2011). BNs can easily handle non-linear systems, have low computational cost, can deal with missing data and data from different sources, explicitly include uncertainties, and have a simple and intuitive graphical structure that is easily understood by non-technical users (Chen and Pollino, 2012; Uusitalo, 2007). On the other hand, BNs depend on the quality of the data used to develop them and require continuous variables to be discretised. For a thorough introduction to BNs, the reader is referred to Pearl (1988) and Charniak (1991).
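To make the two ingredients mentioned above concrete, the following minimal sketch shows discretisation of a continuous variable and a single application of Bayes' rule. The bin edges and all probabilities are hypothetical values chosen for illustration; they are not taken from the Narrabeen dataset or from the BN developed in this paper.

```python
from bisect import bisect_right

# Hypothetical inner bin edges for wave height in metres (illustrative only).
WAVE_HEIGHT_EDGES = [2.0, 4.0]  # bins: [0, 2), [2, 4), [4, inf)


def discretise(value, inner_edges):
    """Return the index of the bin that a continuous value falls into."""
    return bisect_right(inner_edges, value)


def posterior(prior, likelihood):
    """Bayes' rule: P(state | evidence) is proportional to
    P(evidence | state) * P(state), normalised over all states."""
    unnormalised = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Hypothetical prior over beach response: [erosion, accretion].
prior = [0.5, 0.5]
# Hypothetical P(wave height in highest bin | beach response).
likelihood_high_waves = [0.8, 0.2]

updated = posterior(prior, likelihood_high_waves)
bins = [discretise(h, WAVE_HEIGHT_EDGES) for h in (0.5, 2.5, 6.0)]
```

In a full BN the same update is applied across every node of the network; software such as Netica handles this bookkeeping, so the sketch is only meant to show the underlying arithmetic.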

Recently, BNs have been used in a number of coastal engineering applications, including predicting episodic coastal cliff erosion (Hapke and Plant, 2010), reproducing wave-height evolution in the surf zone (Plant and Holland, 2011), assessing coastal vulnerability to sea level rise (Gutierrez et al., 2011), predicting barrier island response to storms (Plant and Stockdon, 2012; Wilson et al., 2015), predicting dune retreat resulting from coastal storms (Palmsten et al., 2014) and modelling hurricane damage to urbanised coasts (van Verseveld et al., 2015).

These studies and others have shown that BNs can model a range of complex coastal processes with considerable skill. However, one point that is not well clarified in the literature is that a BN may be suited to descriptive or predictive application; that is, a BN may be skilful at representing and reproducing a unique dataset descriptively, or at generalising the causal relationships in the dataset such that they are applicable to predicting unseen data. It is therefore important to clarify whether the purpose of a BN is descriptive or predictive, as this dictates its generic applicability. Fienen and Plant (2015) developed a k-fold cross-validation application using the BN software package Netica (Norsys Software Corporation, 1995–2017) for assessing the predictive and descriptive skill of a BN. In k-fold cross-validation a dataset is divided into k folds (or partitions), where k is commonly taken as 10 (Marcot, 2012). A BN is trained and tested on all but one fold of the data (descriptive skill) and then tested on the one withheld fold (predictive skill), and this is repeated for all k permutations of training and testing sets. k-fold cross-validation is an unbiased way of evaluating model descriptive and predictive skill (Elsner and Schmertmann, 1994), and is widely applied to test machine learning models (Refaeilzadeh et al.). Fienen and Plant (2015) and other recent coastal studies (e.g., Gutierrez et al., 2015; Poelhekke et al., 2016) have used cross-validation to show that predictive skill and descriptive skill vary with BN model complexity and that there is a trade-off between the two: better descriptive power usually comes at the cost of reduced predictive power (Fienen and Plant, 2015; Gutierrez et al., 2015). As a result, the optimum BN structure differs between descriptive and predictive applications.
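The cross-validation procedure described above can be sketched in a few lines. This is not the Netica-based tool of Fienen and Plant (2015): the "model" here is a simple lookup table of outcome counts standing in for a trained BN, skill is assumed to be the fraction of correctly classified events, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_table(records):
    """Count outcomes per input combination (a stand-in for BN training)."""
    table = defaultdict(Counter)
    overall = Counter()
    for inputs, outcome in records:
        table[inputs][outcome] += 1
        overall[outcome] += 1
    return table, overall


def predict(model, inputs):
    """Most frequent outcome for these inputs; use the prior if unseen."""
    table, overall = model
    counts = table[inputs] if inputs in table else overall
    return counts.most_common(1)[0][0]


def skill(model, records):
    """Fraction of records classified correctly (assumed skill metric)."""
    hits = sum(predict(model, x) == y for x, y in records)
    return hits / len(records)


def k_fold_skill(records, k=10):
    """Return (mean descriptive skill, mean predictive skill) over k folds.

    Requires len(records) >= k so that no fold is empty.
    """
    folds = [records[i::k] for i in range(k)]
    descriptive, predictive = [], []
    for i, held_out in enumerate(folds):
        training = [r for j, fold in enumerate(folds) if j != i for r in fold]
        model = train_table(training)
        descriptive.append(skill(model, training))   # tested on seen data
        predictive.append(skill(model, held_out))    # tested on unseen data
    return sum(descriptive) / k, sum(predictive) / k
```

Each record is a pair of (tuple of discretised inputs, outcome). With many input variables almost every record becomes a unique combination, so descriptive skill approaches 1 while predictive skill falls back towards the prior; this is the trade-off the cross-validation is designed to expose.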

While cross-validation provides a useful method of distinguishing between a predictive and a descriptive BN model, there remains no standard procedure for developing the optimum predictive or descriptive model for a particular dataset (Chen and Pollino, 2012; Marcot, 2012). In coastal applications to date, the typical approach to BN model development has been to start with a complete conceptual model of the system and then iteratively modify and evaluate this structure to investigate model skill and sensitivity (e.g., Plant and Stockdon, 2012; Wilson et al., 2015; Palmsten et al., 2014). An alternative and more objective approach that is often used for empirical model development, but has received less attention in the coastal BN literature to date, is the parsimonious approach to model development (Sivapalan and Young, 2005). The parsimonious approach builds a model up, from simple to complex, using only model inputs and causal relations that are justified and optimised by the available training dataset (Sivapalan and Young, 2005). Such an approach integrates sensitivity analysis into model development and protects against the model fitting spurious relationships in the data. Practically, parsimonious BNs can be developed by constructing and evaluating a conceptualised BN model one variable at a time, based on maximising the descriptive and predictive skill at each step of construction. This not only allows identification of the optimal predictive and descriptive variable subsets to use in a BN for a given dataset, but also serves the practical and physically meaningful purpose of identifying how individual variables in the dataset impact the skill of the model (Sivapalan and Young, 2005).
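One way to picture the variable-at-a-time construction is as a greedy forward selection, sketched below. This is an illustrative assumption about how such a procedure could be coded, not the authors' implementation; the `score` callable is hypothetical and would in practice return the cross-validated descriptive or predictive skill of a BN built from the given variables.

```python
def forward_select(candidates, score):
    """Greedy forward selection.

    At each step, add the candidate variable that gives the highest
    score when combined with the variables already selected, and record
    the score reached. Returns a list of (variables, score) pairs, one
    per model size, ordered from simplest to most complex.
    """
    selected, history = [], []
    remaining = list(candidates)
    while remaining:
        best = max(remaining, key=lambda v: score(selected + [v]))
        selected.append(best)
        remaining.remove(best)
        history.append((tuple(selected), score(selected)))
    return history


def optimum(history):
    """Variable subset with the highest recorded score."""
    return max(history, key=lambda item: item[1])[0]
```

Running the selection twice, once with a predictive-skill score and once with a descriptive-skill score, would yield two histories whose optima need not coincide, and the recorded history shows how much each added variable contributes to model skill.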

The aim of this paper is to explore the distinction between descriptive and predictive BNs in coastal modelling. To this end, a BN to model shoreline recession caused by coastal storm events is developed and tested using an extensive 10-year dataset from Narrabeen-Collaroy Beach, on the southeast coast of Australia. Understanding and predicting the response of the shoreline to coastal storm events remains a focus of the coastal research community (Holman et al., 2015), having important implications for both emergency and long-term coastal management. This is particularly the case for highly developed, dynamic sandy coastlines such as Narrabeen-Collaroy Beach, where coastal storm events can place beachfront infrastructure at risk in the short term (Harley et al.) and often dominate longer-term patterns of shoreline change (Harley et al., 2011). BNs offer an appealing method of modelling the impacts of storm events that differs from the empirical or process-based model approaches that have typically been used in these coastal settings (e.g., Davidson et al., 2013; Harley et al., 2009; Karunarathna et al., 2014; Wright et al., 1985; Splinter et al., 2014a). Here, a parsimonious approach to BN development is used to develop the optimal descriptive and predictive BNs for modelling storm-induced shoreline change at Narrabeen-Collaroy Beach, followed by a discussion and example applications of how these models can be used in coastal settings. This paper further serves as an introduction to BN modelling and a reference point for understanding other BN studies in the coastal science and engineering community.

Study site

Narrabeen-Collaroy Beach (hereafter referred to simply as Narrabeen) is a sandy, 3.6 km long embayed beach bounded at its extremities by rocky headlands. It is situated on the southeast coast of Australia approximately 20 km north of the centre of Sydney. The beach is composed of fine to medium quartz sand (D50 ≈ 0.3 mm), with ∼30% carbonate fraction, and has a typical intertidal slope of 0.12. Its modal beach state ranges from dissipative-intermediate (wider, flatter beaches often with bars

Bayesian Networks

A BN is a graphical representation of the joint probability distribution of a system comprised of discrete variables. A very simple illustration of a hypothetical BN to predict the occurrence of erosion versus accretion as a response to different combinations of wave height and period is shown in Fig. 2. The BN consists of nodes representing variables in the system (e.g., Wave Height, Wave Period and Beach Response) that are connected with arcs representing causality between nodes. The arcs and

Results

The shoreline change BN model shown in Fig. 4 was evaluated in an incremental manner one node at a time based on the order of influence of each input variable (Table 2 & Fig. 5) to evaluate how the predictive and descriptive skill of the BN varied with increasing input nodes. Fig. 6 shows the predictive skill (i.e., the ability of the model to correctly predict events it has not been trained on) and descriptive skill (i.e., the ability of the model to correctly ‘re-predict’ events it has

Application of a predictive Bayesian Network and comparison to an empirical model

To date, most BNs developed for coastal engineering applications have been used as predictive tools (e.g., Hapke and Plant, 2010; Plant and Holland, 2011; Plant and Stockdon, 2012; Gutierrez et al., 2015; Poelhekke et al., 2016). In this section we discuss and illustrate how the predictive BN developed in the present study (Fig. 7a) can be used for such predictive purposes and how it compares to an alternative empirical model developed at Narrabeen by Harley et al. (2009)

Conclusion

Bayesian networks (BNs) provide an alternative approach to empirical and physics-based modelling of coastal processes, offering the benefits of probabilistic modelling, uncertainty quantification, and low computational cost. BNs can be useful for both describing a dataset and predicting new data, but there is often a trade-off: better descriptive capability is anticipated to be achieved at the cost of reduced predictive skill. In this paper, a BN predicting shoreline change by coastal storms

Acknowledgements

Data for this research were partially funded through ongoing support from Northern Beaches Council, the Australian Research Council (LP04555157, LP100200348, DP150101339) and the NSW Environmental Trust Environmental Research Program (RD 2015/0128). Wave and tide data were kindly provided by Manly Hydraulics Laboratory under the NSW Coastal Data Network Program managed by the Office of Environment and Heritage (OEH). The third author is additionally supported through an Australian Research Council Future

References (50)

  • G. Masselink et al., Role of wave forcing, storms and NAO in outer bar dynamics on a high-energy, macro-tidal beach, Geomorphology (2014)
  • J.E. Nash et al., River flow forecasting through conceptual models part I—a discussion of principles, J. Hydrol. (1970)
  • N.G. Plant et al., Prediction and assimilation of surf-zone processes using a Bayesian network, Coast. Eng. (2011)
  • L. Poelhekke et al., Predicting coastal hazards for sandy coasts with a Bayesian Network, Coast. Eng. (2016)
  • K.D. Splinter et al., A relationship to describe the cumulative impact of storm clusters on beach erosion, Coast. Eng. (2014)
  • L. Uusitalo, Advantages and challenges of Bayesian networks in environmental modelling, Ecol. Model. (2007)
  • H.C.W. van Verseveld et al., Modelling multi-hazard hurricane damages on an urbanized coast with a Bayesian Network approach, Coast. Eng. (2015)
  • K.E. Wilson et al., Application of Bayesian Networks to hindcast barrier island morphodynamics, Coast. Eng. (2015)
  • L. Wright et al., Morphodynamic variability of surf zones and beaches: a synthesis, Mar. Geol. (1984)
  • L. Wright et al., Short-term changes in the morphodynamic states of beaches and surf zones: an empirical predictive model, Mar. Geol. (1985)
  • N. Booij et al., A third-generation wave model for coastal regions: 1. Model description and validation, J. Geophys. Res.: Oceans (1999)
  • J. Cain, Planning Improvements in Natural Resources Management (2001)
  • E. Charniak, Bayesian networks without tears, AI Mag. (1991)
  • A.P. Dempster et al., Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Stat. Soc. B (1977)
  • J.B. Elsner et al., Assessing forecast skill through cross validation, Weather Forecast. (1994)