Impact of measurement error and limited data frequency on parameter estimation and uncertainty quantification

https://doi.org/10.1016/j.envsoft.2019.03.022

Highlights

  • Historical observed data is required for calibration.

  • Measurement error and limited data frequency result in parameter uncertainty.

  • The results highlight the critical roles of measurement error and frequency in the calibration.

  • The effect of the measurement uncertainty is significant when the calibration data are limited.

  • The research findings can be used to support measurement prioritization and resource allocation.

Abstract

Parameter estimation using historical observed data is an important part of environmental modeling. Uncertainty in the parameter estimation limits the applications of environmental models.

In this paper, the influence of limited and uncertain calibration data on the performance of the parameter estimation is systematically investigated. For this purpose, synthetic observations with a given uncertainty and frequency are used to estimate the model parameters of a conceptual water quality (WQ) model of the River Zenne, Belgium. Bayesian inference using Markov Chain Monte Carlo sampling is adopted to simultaneously perform the automatic calibration and the uncertainty analysis. The results highlight the critical roles of measurement frequency and uncertainty in the model calibration. We found that the effect of the measurement uncertainty on the parameter estimation is significant when the calibration data points are limited (e.g. monthly data). The research findings can be used to support measurement prioritization and resource allocation.

Introduction

In order to use environmental models for different tasks, such as predictions, scenario analysis, and setting up regulations, they should represent reality adequately. Moreover, they should be scientifically sound, robust and defensible (U.S. EPA, 2002). In general, model results are affected by the model structure (i.e. the model assumptions), the model inputs, the boundary conditions, and the model parameters (van Griensven and Meixner, 2006; Rode et al., 2010). The underlying assumptions of the model are often fixed, and therefore the model structure is not changed during the modeling process. Moreover, the input data and the boundary conditions, obtained through measuring campaigns or provided by responsible authorities, are not altered by the modeler (Nossent, 2012). On the other hand, most of the model parameters, which represent properties of the system, cannot be measured directly (Vrugt et al., 2003). As a consequence, the model parameters should be set to appropriate values in order to increase the agreement between the model results and the real system. Parameter adjustment is based on reducing the difference between the model results and historical measurements of the system response (Laloy et al., 2010; Vrugt et al., 2013; Leta et al., 2015). This procedure is referred to as parameter estimation, parameter optimization, model calibration or inverse modeling (Raat et al., 2004).

Parameter estimation using historical observed data is an important part of environmental modeling practice and has been the focus of many studies (e.g. Duan et al., 2006). However, because the models are only an approximation of the real system and the observed data used for the calibration contain error (i.e. measurement uncertainty), parameter estimation is error-prone (Vrugt et al., 2002). As a result, the parameters are difficult to identify, which gives rise to parameter uncertainty. In addition, in some fields, such as water quality modeling, data collection is resource-intensive (Mannina and Viviani, 2010). Consequently, the available data have a limited frequency, e.g. biweekly or even monthly intervals (Zheng and Keller, 2007a). In these cases, a serious complication for the model calibration is the lack of reliable calibration data, which results in parameter uncertainty (Raat et al., 2004; Franceschini and Tsai, 2010a,b). The ambiguity in the parameter estimation has a considerable impact on the model simulation uncertainty and therefore limits the applications of environmental models, such as water quality models (Wagener et al., 2003).
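As a rough intuition for why both aspects matter, consider the simplest possible case, which is an assumption made here for illustration and not the WQ model of the study: a single constant θ is estimated from n independent observations, each carrying Gaussian error with standard deviation σ. The symbols θ, n and σ are introduced only for this sketch.

```latex
\[
  \mathrm{SE}(\hat{\theta}) = \frac{\sigma}{\sqrt{n}},
  \qquad
  \frac{\partial\,\mathrm{SE}(\hat{\theta})}{\partial \sigma} = \frac{1}{\sqrt{n}}
\]
```

Under this simplification, the same change in σ shifts the standard error by a factor of 1/√n, so it matters more when n is small. Over one year, monthly data (n = 12) would be about √(365/12) ≈ 5.5 times more sensitive to the measurement error than daily data (n = 365). A nonlinear, multi-parameter model will not follow this relation exactly, but the direction of the effect is the same.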

Despite the critical role of the amount and reliability of calibration data in the performance of the parameter estimation, to the best of the authors’ knowledge, the literature contains very few studies that quantify the impact of these two aspects of the measurements (i.e. amount and reliability) on the parameter uncertainty of water quality models. For example, for a catchment nitrogen model, Raat et al. (2004) explored the relationship between the quality of the calibration data (i.e. measurement uncertainty) and the uncertainty associated with the final parameter estimates (i.e. parameter uncertainty), using virtual data. Wang et al. (2017) used synthetic data to explore how the number of tracer (i.e. isotope) data samples (i.e. measurement amount) affects model calibration. However, the effect of measurement errors in the tracer data on the parameter estimation was not studied. Therefore, the effect of both the amount and the reliability of the calibration data on the performance of the calibration needs to be investigated. Neglecting the other sources of uncertainty (e.g. model structural uncertainty, input data uncertainty), a modeler should first investigate whether it is feasible to reach a pre-defined model performance with a given amount of uncertain measured data.

As the impact of these critical characteristics of calibration data, amount and reliability, on the model calibration has not been fully addressed in the literature, in this study we focus on assessing the influence of limited, uncertain calibration data on the performance of the parameter estimation and on the parameter uncertainty intervals of water quality models. For this purpose, the following questions are formulated:

  • 1) What is the effect of increasing the measurement frequency on the water quality parameter estimation?

  • 2) What is the effect of reducing the measurement error of the observed data on the water quality parameter estimation?

To address the research questions, synthetic observations with a given uncertainty and frequency are used to estimate the model parameters of a conceptual water quality (WQ) model of the River Zenne in Belgium, for simulating dissolved oxygen (O2) and biological oxygen demand (BOD). As an optimization tool, Bayesian inference using Markov Chain Monte Carlo (MCMC) sampling (Bates and Campbell, 2001; Kuczera and Parent, 1998) is adopted to simultaneously perform the automatic calibration and the uncertainty analysis. The synthetic data series are generated by running a WQ model with a given set of model parameter values as ‘true’ values. The model outputs are then perturbed with a pre-specified random error, representing measurement error, and sampled with a given frequency to mimic discrete measurements. The sampled data are then treated as if they were observed data and used to calibrate the model parameters, using the MCMC algorithm. Finally, it is verified whether the model parameters can be identified using the limited and uncertain data. To evaluate the relationship between the amount and reliability of the calibration data and the parameter uncertainty, nine different synthetic data sets are generated by increasing the measurement error and decreasing the measurement frequency. Then, the generated synthetic data are used as calibration data in subsequent optimization runs.
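The workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: `toy_model` is a hypothetical one-parameter stand-in for the WQ model, `TRAVEL_TIME`, the three error levels and the three sampling intervals are invented for the example (the text only states that nine data sets were used), and a plain random-walk Metropolis sampler replaces the DREAM algorithm.

```python
import math
import random
import statistics

TRAVEL_TIME = 1.0  # days; hypothetical fixed residence time in the reach


def toy_model(k, t):
    """Hypothetical stand-in for the WQ model: a seasonal inflow
    concentration attenuated by first-order decay at rate k (1/day)."""
    inflow = 8.0 + 2.0 * math.sin(2.0 * math.pi * t / 365.0)
    return inflow * math.exp(-k * TRAVEL_TIME)


def make_synthetic_obs(k_true, n_days, step, rel_error, rng):
    """Run the model with the 'true' parameter, keep one output every
    `step` days, and perturb each kept value with Gaussian error."""
    obs = []
    for t in range(0, n_days, step):
        truth = toy_model(k_true, t)
        sigma = rel_error * truth  # error std proportional to the signal
        obs.append((t, rng.gauss(truth, sigma), sigma))
    return obs


def log_posterior(k, obs, k_min=0.0, k_max=2.0):
    """Gaussian log-likelihood plus a uniform prior on (k_min, k_max)."""
    if not k_min < k < k_max:
        return -math.inf
    return -0.5 * sum(((y - toy_model(k, t)) / s) ** 2 for t, y, s in obs)


def metropolis(obs, rng, n_iter=5000, step=0.02, k0=0.5):
    """Random-walk Metropolis sampler for the single parameter k."""
    chain = []
    k, lp = k0, log_posterior(k0, obs)
    for _ in range(n_iter):
        k_new = k + rng.gauss(0.0, step)
        lp_new = log_posterior(k_new, obs)
        # Accept with probability min(1, posterior ratio);
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < lp_new - lp:
            k, lp = k_new, lp_new
        chain.append(k)
    return chain[n_iter // 2:]  # discard the first half as burn-in


if __name__ == "__main__":
    rng = random.Random(42)
    for rel_error in (0.05, 0.10, 0.20):  # assumed measurement error levels
        for step in (1, 7, 30):           # daily, weekly, roughly monthly
            obs = make_synthetic_obs(0.3, 365, step, rel_error, rng)
            chain = metropolis(obs, rng)
            print(f"error={rel_error:.2f} step={step:>2}d n={len(obs):>3} "
                  f"k={statistics.mean(chain):.3f} "
                  f"+/- {statistics.stdev(chain):.3f}")
```

Running the script prints the posterior mean and standard deviation of `k` for each of the nine combinations, which allows the spread of the posterior to be compared against the 'true' value of 0.3 as the data become sparser and noisier. The fixed proposal step is a simplification; an adaptive sampler would be more efficient when the posterior is very narrow.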

Section snippets

Conceptual Integrated Tool for Water Quality Assessment (CIToWA)

In order to enhance the applicability of conceptual river water quality simulators, Woldegiorgis (Woldegiorgis, 2017; Woldegiorgis et al., 2017) developed a Conceptual Integrated Tool for Water Quality Assessment (CIToWA), as an alternative to detailed WQ simulators. In CIToWA, the river system is represented by reaches which are conceptual elements that divide the channel longitudinally into different parts. CIToWA obtains estimates of discharges and velocities of the reaches from external

The results of the sensitivity analysis

The PAWN sensitivity indices of the model parameters, together with the dummy parameter, are presented in Fig. 3. The dashed line represents the sensitivity index of the dummy parameter, used as a threshold for the parameter screening. For simulating O2 (Fig. 3 (a)), RK2 is the most influential parameter, followed by RK1 and RK3. Considering the sensitivity index of the dummy parameter (dashed line in Fig. 3 (a)), the other parameters are considered less influential for simulating O2. For

Conclusion

Parameter estimation requires measurements of the system response. However, in some fields, such as water quality modeling, the measurement frequency is limited. Moreover, the measurements are uncertain. These limitations cause uncertainty in the parameter estimation and in the simulation results. The objective of this study was to investigate the effect of the measurement frequency and uncertainty on the parameter estimation process and the uncertainty quantification.

To this aim, the DREAM(ZS)

Software/data availability

The PAWN method is implemented in the SAFE Matlab/Octave Toolbox for GSA (Pianosi et al., 2015). SAFE is freely available for non-commercial purposes at www.bristol.ac.uk/cabot/-resources/safe-toolbox/.

The MATLAB toolbox of DREAM is available upon request from the author, [email protected].

The CIToWA tool is available upon request from the author, [email protected].

Acknowledgment

The authors would like to thank Flanders Hydraulics Research for supporting and coordinating the project “Development of conceptual models for an integrated river basin management”.

References (55)

  • O.T. Leta et al. Assessment of the different sources of uncertainty in a SWAT model of the River Senne (Belgium). Environ. Model. Softw. (2015)
  • G. Mannina et al. Water quality modelling for ephemeral rivers: model development and parameter assessment. J. Hydrol. (2010)
  • P. Meert et al. Computationally efficient modelling of tidal rivers using conceptual reservoir-type models. Environ. Model. Softw. (2016)
  • M.K. Muleta et al. Sensitivity and uncertainty analysis coupled with automatic calibration for a distributed watershed model. J. Hydrol. (2005)
  • J. Nash et al. River flow forecasting through conceptual models part I—a discussion of principles. J. Hydrol. (1970)
  • J. Norton. An introduction to sensitivity assessment of simulation models. Environ. Model. Softw. (2015)
  • J. Nossent et al. Sobol’ sensitivity analysis of a complex environmental model. Environ. Model. Softw. (2011)
  • F. Pianosi et al. PAWN: a simple and efficient method for Global Sensitivity Analysis based on cumulative distribution functions. Environ. Model. Softw. (2015)
  • F. Pianosi et al. A Matlab toolbox for global sensitivity analysis. Environ. Model. Softw. (2015)
  • F. Pianosi et al. Sensitivity analysis of environmental models: a systematic review with practical workflow. Environ. Model. Softw. (2016)
  • K. van Werkhoven et al. Sensitivity-guided reduction of parametric dimensionality for multi-objective calibration of watershed models. Adv. Water Resour. (2009)
  • J. Vrugt et al. Hydrologic data assimilation using particle Markov chain Monte Carlo simulation: theory, concepts and applications. Adv. Water Resour. (2013)
  • B.C. Bates et al. A Markov chain Monte Carlo scheme for parameter estimation and inference in conceptual rainfall‐runoff modeling. Water Resour. Res. (2001)
  • K.J. Beven. Uniqueness of place and process representations in hydrological modelling. Hydrol. Earth Syst. Sci. Discuss. (2000)
  • K. Beven et al. The future of distributed models: model calibration and uncertainty prediction. Hydrol. Process. (1992)
  • R.E. Brazier et al. Equifinality and uncertainty in physically based soil erosion models: application of the GLUE methodology to WEPP–the Water Erosion Prediction Project–for sites in the UK and USA. Earth Surf. Process. Landforms: J. Br. Geomorphol. Res. Group (2000)
  • L.C. Brown et al. Computer Program Documentation for the Enhanced Stream Water Quality Model QUAL 2E (No. 471 (1985)