Elsevier

Computers & Geosciences

Volume 34, Issue 9, September 2008, Pages 1154-1166
A simple method for representing some univariate frequency distributions, with particular application in Monte Carlo-based simulation

https://doi.org/10.1016/j.cageo.2007.04.009

Abstract

Introduced in this paper is a simple yet effective method for representing some of the types of univariate frequency distribution that commonly are required in Monte Carlo-based simulation. To use the method, an appropriate parent distribution is first chosen; then this distribution is modified by blending a constant value into the density function; the particular value used is the ordinate of the density function at its mode. The advantages of the method are (1) that a wide variety of forms of distribution can be represented, (2) that the number of parameters is low, (3) that the parameters can be varied continuously to let sets of systematically related distributions be constructed, and (4) that the resulting distribution functions are straightforward to invert numerically, thereby letting random deviates be generated quickly and efficiently. The method is potentially of particular value in Monte Carlo-based simulation, because it allows distributions of greatly differing forms to be represented within a single, flexible framework. The paper describes the method, provides the necessary equations for parameter estimation, and gives an example of a simulation exercise in which the method proved valuable. A demonstration program is provided that allows experimentation with the method.

Introduction

The Monte Carlo method, described initially by Metropolis and Ulam (1949), has long played an important role in geoscientific modelling. For this there is an excellent reason, viz. that Monte Carlo-based simulation is an extraordinarily powerful tool for studying the behaviour of the types of systems with which geoscientists commonly work. These systems typically are highly complex ones, with parameters that cannot be specified exactly and for which the effects of parameter variation cannot be determined analytically. The use of Monte Carlo-based simulation allows models of these systems to be built and analysed in a straightforward and effective way.

There are two preconditions for the use of Monte Carlo-based simulation. The first, which is the general precondition for all successful scientific modelling, is that the model being used is behaviourally equivalent to the system being studied. The second, a contrastingly specific precondition, is that the distributions of the variables used in the model are capable of being represented in a computationally efficient way. This second precondition ensures that appropriately distributed random deviates can be generated quickly whenever they are needed, as the Monte Carlo method demands.

The purpose of this present paper is to propose a simple yet effective method for representing some of the types of univariate frequency distribution that are commonly required in Monte Carlo-based work. These types of distribution share the following characteristics: (1) they are continuous, with unimodal density functions; (2) the variables concerned have ranges that are bounded in at least one of the two directions; (3) the density functions are non-zero in value at the bounds. The first part of the paper introduces the proposed representation method; then the matter of parameter estimation is considered; finally an example is given of how the method was used in a recent stratigraphic modelling exercise. A program is provided that allows experimentation using different sets of parameter values; this program can be downloaded from the journal website.

The representation problem

Distributions of the types being considered here are met with in many applications, both in the geosciences and elsewhere, commonly in situations in which there is little or no theory to point to their ideal mathematical form. The problem is then always to decide on how they will best be represented. This problem is a very application-specific one, for the choice of representation will always be influenced by the purposes for which the distribution in question is to be used. However, usually it

Representing distributions bounded in one direction

The mechanics of the proposed method are readily appreciated by looking first at how it is used to represent a unimodal distribution bounded in one direction only (Fig. 1A). The following notation is used: the variable concerned is x, 0 ⩽ x < ∞; the required density and distribution functions are f(x) and F(x), respectively; the mode is located at x = m.

First, the parent distribution is chosen. This must have three properties: (1) its range must be identical to the required range of x, (2) the values

Distributions bounded in both directions

Next consider how the method is used to represent a unimodal distribution bounded in both directions (Fig. 1B); in this case the variable x is taken to lie in the range 0 ⩽ x ⩽ a. The density function f(x) is again constructed in two sections, but both of these are now rescaled linear blends of g(x) and g(m). Thus

$$f(x) = f_l(x)/R \quad \text{for } x \le m, \quad \text{where } f_l(x) = w_l\,g(x) + (1 - w_l)\,g(m);$$

$$f(x) = f_u(x)/R \quad \text{for } x \ge m, \quad \text{where } f_u(x) = w_u\,g(x) + (1 - w_u)\,g(m),$$

where $w_l$ and $w_u$ are the blending coefficients for the lower and upper sections,
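The two-section construction can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's demonstration program: the triangular parent `triangular_parent` is a hypothetical choice made only because it has the range [0, a] and a single mode at m; R is interpreted as the constant that makes f(x) integrate to 1; the blending coefficients are assumed to lie between 0 and 1; and the function names, the trapezoidal integration, and the tabulate-and-interpolate inversion are all choices made for this sketch.

```python
import bisect
import random


def triangular_parent(m, a):
    """Hypothetical parent density g on [0, a] with its mode at m (illustration only)."""
    def g(x):
        if x < 0.0 or x > a:
            return 0.0
        return 2.0 * x / (a * m) if x <= m else 2.0 * (a - x) / (a * (a - m))
    return g


def make_blended_pdf(g, m, a, wl, wu, steps=2000):
    """Return (f, R): the rescaled two-section blend of g(x) with the constant g(m)."""
    gm = g(m)

    def unscaled(x):
        w = wl if x <= m else wu  # lower or upper blending coefficient
        return w * g(x) + (1.0 - w) * gm

    # Assumption: R is the constant that makes f integrate to 1 over [0, a].
    # It is estimated here with the composite trapezoidal rule.
    total = 0.5 * (unscaled(0.0) + unscaled(a))
    total += sum(unscaled(a * i / steps) for i in range(1, steps))
    R = total * a / steps
    return (lambda x: unscaled(x) / R), R


def tabulate_cdf(f, a, steps=2000):
    """Tabulate the distribution function F on an even grid over [0, a]."""
    xs = [a * i / steps for i in range(steps + 1)]
    cdf = [0.0]
    for x0, x1 in zip(xs, xs[1:]):
        cdf.append(cdf[-1] + 0.5 * (x1 - x0) * (f(x0) + f(x1)))
    return xs, cdf


def draw(xs, cdf, rng):
    """Generate one random deviate by inverting the tabulated F with linear interpolation."""
    u = rng.random() * cdf[-1]
    j = bisect.bisect_left(cdf, u)  # first grid index with cdf[j] >= u
    if j == 0:
        return xs[0]
    frac = (u - cdf[j - 1]) / (cdf[j] - cdf[j - 1])
    return xs[j - 1] + frac * (xs[j] - xs[j - 1])


# Usage: mode at 3 on the range [0, 10], with illustrative blending coefficients.
g = triangular_parent(m=3.0, a=10.0)
f, R = make_blended_pdf(g, m=3.0, a=10.0, wl=0.6, wu=0.9)
xs, cdf = tabulate_cdf(f, a=10.0)
rng = random.Random(1)
deviates = [draw(xs, cdf, rng) for _ in range(5)]
```

Although the triangular parent is zero at both bounds, the blended density is not, because each section retains a share of the constant g(m). Tabulating F once and inverting it by table lookup keeps the cost per deviate low, which is the property the method relies on for Monte Carlo work.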

Parameter estimation

Distributions used in simulation work commonly have to represent sets of data. The values of the distribution parameters then have to be chosen to give the best fit to those data. For ungrouped data, the parameter estimation can be carried out by maximising the appropriate log likelihood function. This is

$$\log L(x \mid w, \theta) = \sum_{i=1}^{r} \log f_l(x_i) + \sum_{i=r+1}^{n} \log g(x_i) - n \log R$$

for the one-bound case, and

$$\log L(x \mid w_l, w_u, \theta) = \sum_{i=1}^{r} \log f_l(x_i) + \sum_{i=r+1}^{n} \log f_u(x_i) - n \log R$$

for the two-bound case. θ denotes the set of parameters belonging
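The two-bound log likelihood can be evaluated with a short Python sketch. Several things here are assumptions rather than statements from the paper: the observations are taken to be split so that the first sum covers those at or below the mode m and the second covers the rest; R is interpreted as the integral of the unscaled blend over [0, a]; the triangular parent is a hypothetical stand-in; the parent parameters θ (here just m) are held fixed; and the coarse grid search in `fit_weights` is a simple substitute for a proper optimiser, not the paper's estimation procedure.

```python
import math


def triangular_parent(m, a):
    """Hypothetical parent density g on [0, a] with its mode at m (illustration only)."""
    def g(x):
        if x < 0.0 or x > a:
            return 0.0
        return 2.0 * x / (a * m) if x <= m else 2.0 * (a - x) / (a * (a - m))
    return g


def trapezoid(func, lo, hi, steps=1000):
    """Composite trapezoidal rule for the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    total += sum(func(lo + i * h) for i in range(1, steps))
    return total * h


def log_likelihood(data, g, m, a, wl, wu):
    """Two-bound log likelihood: sum log f_l + sum log f_u - n log R."""
    gm = g(m)

    def fl(x):
        return wl * g(x) + (1.0 - wl) * gm

    def fu(x):
        return wu * g(x) + (1.0 - wu) * gm

    # Assumption: R normalises the two unscaled sections taken together.
    R = trapezoid(fl, 0.0, m) + trapezoid(fu, m, a)
    lower = [x for x in data if x <= m]  # observations in the lower section
    upper = [x for x in data if x > m]   # observations in the upper section
    return (sum(math.log(fl(x)) for x in lower)
            + sum(math.log(fu(x)) for x in upper)
            - len(data) * math.log(R))


def fit_weights(data, g, m, a, grid=11):
    """Coarse grid search over (wl, wu) in [0, 1] x [0, 1]; a stand-in for an optimiser."""
    candidates = [i / (grid - 1) for i in range(grid)]
    pairs = [(wl, wu) for wl in candidates for wu in candidates]
    return max(pairs, key=lambda w: log_likelihood(data, g, m, a, w[0], w[1]))


# Usage with made-up observations clustered near the mode.
g = triangular_parent(m=3.0, a=10.0)
data = [2.5, 3.0, 3.5, 2.8]
wl_hat, wu_hat = fit_weights(data, g, m=3.0, a=10.0)
```

With both coefficients set to 0 the blend reduces to a uniform density on [0, a], so the log likelihood becomes −n log a, which gives a convenient check on the implementation. Note that with this particular parent, a coefficient of exactly 1 combined with an observation lying on a bound would give a zero density and an undefined logarithm.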

Two applications of the proposed method in Monte Carlo-based simulation

The value of the proposed method in Monte Carlo-based simulation is demonstrated in a recent exercise in sedimentation modelling (Tipper, 2007). The aim there was to look at how sediment accumulation models could be used to predict the thickness–time relationships to be expected over the long term in one-dimensional stratigraphic successions; by ‘long term’ is meant intervals of 10⁴–10⁵ years duration. The exercise, an avowedly exploratory one, had three parts: (1) a model was selected and its

A brief discussion

This paper has had a deliberately limited purpose: to describe the proposed new method for representing univariate distributions and to illustrate its application in Monte Carlo-based simulation. Discussing the results of the particular simulation exercise used in the illustration is therefore hardly appropriate here. All that is relevant is perhaps to remark that the simulation approach used there provides for the first time a way of predicting the nature of the thickness–time relationships to be

Acknowledgements

I thank Martin Stynes and Robert Boik for responding so magnificently to my cry for help.

Code available from server at http://www.iamg.org/CGEditor/index.htm.
