A framework for the modelling of uncertainty between remote sensing and geographic information systems

https://doi.org/10.1016/S0924-2716(00)00018-6

Abstract

This paper addresses the modelling of uncertainty in an integrated geographic information system (GIS), specifically focused on the fusion of activities between GIS and remote sensing. As data is abstracted from its ‘raw’ form to the higher representations used by GIS, it passes through a number of different conceptual data models via a series of transformations. Each model and each transformation process contributes to the overall uncertainty present within the data. The issues that this paper addresses are threefold. Firstly, a description of various models of geographic space is given in terms of the inherent uncertainty characteristics that apply; this is then worked into a simple formalism. Secondly, the various transformation processes that are used to form geographic classes or objects from image data are described, and their effects on the uncertainty properties of data are stated. Thirdly, using the formalism to describe the transformation processes, a framework for the propagation of uncertainty through an integrated GIS is derived. By way of a summary, a table describing sources of accumulated uncertainty across four underlying models of geographic space is derived.

Introduction

Good science requires statements of accuracy by which the reliability of results can be understood and communicated. Where accuracy is known objectively, it can be expressed as error; where it is not, the term uncertainty applies (Hunter and Goodchild, 1993). Thus, uncertainty covers a broader range of doubt or inconsistency and, in the context of this paper, includes error as a component. The understanding of uncertainty as it exists in geographic data remains a problem that is only partly solved (Story and Congleton, 1986; Goodchild and Gopal, 1989; Veregin, 1995; Ruiz, 1997; Worboys, 1998). However, without quantification, the reliability of any results produced remains problematic to assess and difficult to communicate to the user. A geographic information system (GIS) provides a whole series of tools with which data can be manipulated, without offering any control over misuse. To quote Openshaw et al. (1991):

A GIS gives the user complete freedom to combine, overlay and analyse data from many different sources, regardless of scale, accuracy, resolution and quality of the original map documents and without any regard for the accuracy characteristics of the data themselves.

This is a serious issue; without quantification of uncertainty, the results themselves may only be considered as qualitative information, and this greatly devalues their merit in both a scientific and a practical sense. To compound the problem, in the fusion of activities from remote sensing and GIS, an integrated approach to managing geographic information is required. This must necessarily support many different types of data (Ehlers et al., 1991), gathered according to different models of geographic space (Goodchild, 1992), each possessing different types of inherent errors and uncertainties (Chrisman, 1991). As well as providing individual support for these different models of space, it is necessary to explicitly include methods to keep track of uncertainty, as data is changed from low level forms (such as remotely sensed image data) to the higher level abstractions required by cartography and GIS (such as distinct objects and themes). It is particularly toward the propagation of uncertainty through different conceptual models that this paper is oriented.

Whether a particular data set can be considered suitable for a given task depends on many different criteria, and despite the fact that various aspects of uncertainty can be measured objectively, their importance will be largely determined by the current task. Our overall goal with modelling uncertainty is therefore threefold: (i) to produce a statement of uncertainty to be associated with each data set so that an objective statement of reliability may be reported, (ii) to develop methods to propagate uncertainty as the data is processed and transformed, and (iii) to ultimately determine the suitability of a data set for a given task (‘fitness for use’). Another valid goal, not considered here, is to communicate uncertainty information to the user (e.g. Hunter and Goodchild, 1996).

A good deal of research effort has been directed towards the study of uncertainty in geographic data. A useful framework, recognising the separate error components of value, space, time, consistency and completeness, was proposed by Sinton (1978) and later embellished by Chrisman (1991). However, uncertainty in geographic data can be described in a variety of alternative ways, such as those provided by Bedard (1987), Miller et al. (1989) and Veregin (1989). Although different, these approaches all have a number of aspects in common, including the observation that uncertainty itself occurs at different levels of abstraction. For example, positional and temporal errors describe uncertainty in a metric sense within a space–time framework, whereas completeness and consistency represent more abstract concepts describing coverage and reliability, and are consequently more problematic to describe.

Work to date on uncertainty addresses the inherent errors present within specific types of data structure (e.g. raster or vector) or data models (e.g. field, object). The effects of combining data layers within these various paradigms have been studied by Veregin (1989, 1995), Openshaw et al. (1991), Goodchild et al. (1992), Heuvelink and Burrough (1993), Ehlers and Shi (1997) and Leung and Yan (1998). Scant attention has so far been given to the problem of modelling uncertainty as the data is transformed through different models of geographic space; a notable exception is the work of Lunetta et al. (1991). A typical path taken by data captured by satellite and then abstracted into a suitable form for GIS is shown in Fig. 1, and involves four such models. Continuously varying fields are quantised by the sensing device into image form, then classified and finally transformed into discrete mapping objects. The overall object extraction process is sometimes referred to as semantic abstraction (Waterfeld and Schek, 1992) due to the increasing semantic content of the data as it is manipulated into forms that are easier for people to work with.

When transforming data between different conceptual models of geographic space, the uncertainty characteristics in the data may change: the techniques used to transform the data alter the inherent uncertainty and may also introduce further uncertainty of their own. Furthermore, many of the abstraction techniques employed combine data with different uncertainty characteristics, for example multi-temporal image classification (Jeon and Landgrebe, 1992) or knowledge-based feature extraction (McKeown, 1987). Consequently, two inter-related problems must be addressed, namely:

  1. How do the uncertainty characteristics of data change as data are transformed between models?

  2. How do the transformation methods used affect and combine the uncertainty present in the data (Lodwick et al., 1990)?

This paper concentrates mainly on the first question, proposing a framework within which the second question can be tackled.

One of the consequences of the separation of GIS and remote sensing activities into separate communities and separate software environments is that there is an artificial barrier between the two disciplines. Thus, the integration of these two branches of science is to some extent an artificial problem. As a result, there is no easy flow of meta-data between systems, inter-operability is often restricted to the exchange of image files or object geometry, and the problem of managing uncertainty is compounded.

The four stages shown in Fig. 1 represent four models of geographic space that are considered here. They are termed the field (F), image (I), thematic (T) and object (or feature) (O) models and are typical (though not exhaustive) of those used in the integration of GIS and remote sensing activities. These models represent the conceptual properties of the data only and are considered here as independent from any particular data structure that might be used to encode and organise the data.

The description of uncertainty used here follows that proposed by Sinton (1978). It covers the sources of error as they occur in remote sensing and GIS integration (although other approaches may be equally valid). Uncertainty is restricted to the following properties: (i) value (including measurement and label errors), (ii) spatial, (iii) temporal, (iv) consistency and (v) completeness. These are symbolised below using the following five letters of the Greek alphabet (α, β, χ, δ and ε, respectively). Of these, α, β and χ can be applied either individually to a single datum or to any set of data. The latter two properties of consistency and completeness can only apply to a defined data set since they are comparative (either internally among data or to some external framework). Issues regarding scale and resolution of the data (e.g. Bruegger, 1995) are postponed; these are not objective uncertainty criteria, but instead become important when assessing the ‘fitness for use’ of data for a specific task. Also, we do not concern ourselves here with structures for the provision of lineage information (e.g. Lanter, 1991), although these are certainly required to support the propagation of uncertainty.
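The distinction drawn above, between properties that can attach to a single datum and those that only make sense for a defined data set, can be written down compactly. The following is a minimal sketch only; the names `UncertaintyProperty`, `per_datum` and `applicable` are hypothetical and do not come from the paper.

```python
from enum import Enum


class UncertaintyProperty(Enum):
    """The five uncertainty properties, as (symbol, applies to a single datum?)."""

    VALUE = ("α", True)
    SPATIAL = ("β", True)
    TEMPORAL = ("χ", True)
    CONSISTENCY = ("δ", False)   # comparative, so data-set level only
    COMPLETENESS = ("ε", False)  # comparative, so data-set level only

    @property
    def symbol(self):
        return self.value[0]

    @property
    def per_datum(self):
        return self.value[1]


def applicable(is_data_set):
    """Return the properties that can be stated for a datum or for a data set."""
    return [p for p in UncertaintyProperty if is_data_set or p.per_datum]


if __name__ == "__main__":
    print([p.symbol for p in applicable(is_data_set=False)])
    print([p.symbol for p in applicable(is_data_set=True)])
```

Under this reading, a single datum carries only α, β and χ, while a data set can carry all five.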

The techniques proposed for quantifying error and uncertainty referred to in this paper are not in themselves new. The purpose of the formal notation developed later is to describe succinctly and unambiguously the forms of uncertainty that exist in the data, where they originate and what they affect. Ongoing work by the authors and others is attempting to quantify some of these uncertainty terms as they relate to the integration of GIS and remote sensing; for example, ISPRS Commissions 2, 3 and 4 span the full range of integration activities described here (see http://www.isprs.org/technical_commissions.html for more details).


Description of data model properties and their uncertainty characteristics

More formally, a single geographic datum (a) at any level of abstraction can be described by its value (d), spatial extents (s) and temporal extents (t) (Gahegan, 1996), each of which has associated uncertainty u, where u has three components, α, β and χ:

    a(d, s, t, α, β, χ).
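As an illustration of the datum expression above, the six elements can be held in a small record type. This is a sketch under stated assumptions: the class name `Datum`, the bounding-box and interval encodings of the extents, and the use of plain floats for α, β and χ are all hypothetical choices, not part of the paper's formalism.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Datum:
    """A single geographic datum a(d, s, t, α, β, χ)."""

    d: Any                                # value (measurement or label)
    s: Tuple[float, float, float, float]  # spatial extents, assumed to be a bounding box
    t: Tuple[float, float]                # temporal extents, assumed to be (start, end)
    alpha: float                          # α: value uncertainty (assumed scalar)
    beta: float                           # β: spatial uncertainty (assumed scalar)
    chi: float                            # χ: temporal uncertainty (assumed scalar)

    def as_tuple(self):
        """Return the elements in the order used by the expression."""
        return (self.d, self.s, self.t, self.alpha, self.beta, self.chi)


if __name__ == "__main__":
    pixel = Datum(d="wheat", s=(0.0, 0.0, 30.0, 30.0), t=(0.0, 1.0),
                  alpha=0.1, beta=15.0, chi=0.5)
    print(pixel.as_tuple())
```

Storing α, β and χ as three independent fields mirrors the separability that the expression assumes; a representation in which the components interact would need a richer type.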

The above expression assumes that α, β and χ may be orthogonalised. As will be shown later, it is often difficult to treat these components separately, since each may affect the others. A similar formulation, using upper case

The transformation process between geographic models

This section describes the three transformation processes between geographic models (field→image, image→theme and theme→object) and their effects on the data characteristics introduced above.
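The chain of transformations can be sketched as a simple bookkeeping exercise in which each higher model inherits the transformation steps of the models below it. This is an illustrative sketch only: the step descriptions paraphrase the introduction, the function name `accumulated_steps` is hypothetical, and the uncertainty inherent in each model itself is deliberately left out.

```python
# The four models of geographic space considered in the paper.
MODELS = {"F": "field", "I": "image", "T": "thematic", "O": "object"}

# (source, target, step) triples; the step labels are illustrative paraphrases.
TRANSFORMATIONS = [
    ("F", "I", "quantisation by the sensing device"),
    ("I", "T", "classification"),
    ("T", "O", "object formation"),
]


def accumulated_steps():
    """Map each model to the transformation steps whose uncertainty it inherits."""
    steps = {"F": []}
    for source, target, step in TRANSFORMATIONS:
        # The target inherits everything from the source and adds one step.
        steps[target] = steps[source] + [step]
    return steps


if __name__ == "__main__":
    for model, inherited in accumulated_steps().items():
        print(MODELS[model], "<-", inherited)
```

The object model ends up with all three steps in its history, which is the sense in which uncertainty accumulates as data moves towards higher abstractions.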

Discussion, conclusions and future work

The expressions described above seem to offer some insight into the complex transformation processes that occur within integrated GIS, particularly with reference to keeping track of the sources of errors and uncertainties and their effects. Table 1 categorises these errors and uncertainties as they apply to the four geographic models considered here.

Uncertainty propagates from left to right across the table as the higher models inherit uncertainty properties from the lower models and add to

References (41)

  • M.F. Goodchild, Geographical data modelling, Comput. Geosci. (1992)
  • R.M. Haralick, Performance characterisation in computer vision, CVGIP-Image Understanding (1994)
  • M. Worboys, Computation with imprecise geospatial data, Comput., Environ. Urban Syst. (1998)
  • Y. Bedard, Uncertainties in land information databases
  • B.P. Bruegger, Theory for the integration of scale and representation formats: major concepts and practical implications
  • P.A. Burrough, Fuzzy mathematical methods for soil survey and land evaluation, J. Soil Sci. (1989)
  • N.R. Chrisman, The error component in spatial data
  • G. Edwards et al., Modelling uncertainty in photo-interpreted boundaries, Photogramm. Eng. Remote Sens. (1996)
  • M. Ehlers et al., Integration of remote sensing and GIS: data and data access, Photogramm. Eng. Remote Sens. (1991)
  • M. Ehlers et al., Error modelling for integrated GIS, Cartographica (1997)
  • M.N. Gahegan, Specifying the transformations within and between geographic data models, Trans. GIS (1996)
  • M.N. Gahegan et al., A model to support the integration of image understanding techniques within a GIS, Photogramm. Eng. Remote Sens. (1996)
  • M. Gahegan et al., Recent developments towards integrating scene understanding within a geographic information system for agricultural applications, Trans. GIS (1999)
  • M.F. Goodchild et al., Development and test of an error model for categorical data, Int. J. Geogr. Inf. Syst. (1992)
  • C.M. Gurney, The use of contextual information in the classification of remotely sensed data, Photogramm. Eng. Remote Sens. (1983)
  • G.B.M. Heuvelink et al., Propagation of errors in spatial modelling with GIS, Int. J. Geogr. Inf. Syst. (1989)
  • G.B.M. Heuvelink et al., Error propagation in cartographic modelling using Boolean logic and continuous classification, Int. J. Geogr. Inf. Syst. (1993)
  • G. Hunter et al., Mapping uncertainty in spatial databases, putting theory into practice, J. Urban Reg. Inf. Syst. Assoc. (1993)
  • G. Hunter et al., Communicating uncertainty in spatial databases, Trans. GIS (1996)
