A framework for the modelling of uncertainty between remote sensing and geographic information systems

doi:10.1016/S0924-2716(00)00018-6

ISPRS Journal of Photogrammetry and Remote Sensing

Volume 55, Issue 3, September 2000, Pages 176-188


Abstract

This paper addresses the modelling of uncertainty in an integrated geographic information system (GIS), specifically focused on the fusion of activities between GIS and remote sensing. As data is abstracted from its ‘raw’ form to the higher representations used by GIS, it passes through a number of different conceptual data models via a series of transformations. Each model and each transformation process contributes to the overall uncertainty present within the data. The issues that this paper addresses are threefold. Firstly, a description of various models of geographic space is given in terms of the inherent uncertainty characteristics that apply; this is then worked into a simple formalism. Secondly, the various transformation processes that are used to form geographic classes or objects from image data are described, and their effects on the uncertainty properties of data are stated. Thirdly, using the formalism to describe the transformation processes, a framework for the propagation of uncertainty through an integrated GIS is derived. By way of a summary, a table describing sources of accumulated uncertainty across four underlying models of geographic space is derived.

Introduction

Good science requires statements of accuracy by which the reliability of results can be understood and communicated. Where accuracy is known objectively, it can be expressed as error; where it is not, the term uncertainty applies (Hunter and Goodchild, 1993). Thus, uncertainty covers a broader range of doubt or inconsistency and, in the context of this paper, includes error as a component. The understanding of uncertainty as it exists in geographic data remains a problem that is only partly solved (Story and Congleton, 1986; Goodchild and Gopal, 1989; Veregin, 1995; Ruiz, 1997; Worboys, 1998). However, without quantification, the reliability of any results produced remains problematic to assess and difficult to communicate to the user. A geographic information system (GIS) provides a whole series of tools with which data can be manipulated, without offering any control over misuse. To quote Openshaw et al. (1991):

A GIS gives the user complete freedom to combine, overlay and analyse data from many different sources, regardless of scale, accuracy, resolution and quality of the original map documents and without any regard for the accuracy characteristics of the data themselves.

This is a serious issue; without quantification of uncertainty, the results themselves may only be considered as qualitative information, and this greatly devalues their merit in both a scientific and a practical sense. To compound the problem, in the fusion of activities from remote sensing and GIS, an integrated approach to managing geographic information is required. This must necessarily support many different types of data (Ehlers et al., 1991), gathered according to different models of geographic space (Goodchild, 1992), each possessing different types of inherent errors and uncertainties (Chrisman, 1991). As well as providing individual support for these different models of space, it is necessary to explicitly include methods to keep track of uncertainty, as data is changed from low level forms (such as remotely sensed image data) to the higher level abstractions required by cartography and GIS (such as distinct objects and themes). It is particularly toward the propagation of uncertainty through different conceptual models that this paper is oriented.

Whether a particular data set can be considered suitable for a given task depends on many different criteria, and despite the fact that various aspects of uncertainty can be measured objectively, their importance will be largely determined by the current task. Our overall goal with modelling uncertainty is therefore threefold: (i) to produce a statement of uncertainty to be associated with each data set so that an objective statement of reliability may be reported, (ii) to develop methods to propagate uncertainty as the data is processed and transformed, and (iii) to ultimately determine the suitability of a data set for a given task (‘fitness for use’). Another valid goal, not considered here, is to communicate uncertainty information to the user (e.g. Hunter and Goodchild, 1996).

A good deal of research effort has been directed towards the study of uncertainty in geographic data. A useful framework, recognising the separate error components of value, space, time, consistency and completeness, was proposed by Sinton (1978) and later extended by Chrisman (1991). However, uncertainty in geographic data can be described in a variety of alternative ways, such as those provided by Bedard (1987), Miller et al. (1989) and Veregin (1989). Although different, these approaches have a number of aspects in common, including the observation that uncertainty itself occurs at different levels of abstraction. For example, positional and temporal errors describe uncertainty in a metric sense within a space–time framework, whereas completeness and consistency represent more abstract concepts describing coverage and reliability, and are consequently more problematic to describe.

Work to date on uncertainty addresses the inherent errors present within specific types of data structure (e.g. raster or vector) or data models (e.g. field, object). The effects of combining data layers within these various paradigms have been studied by Veregin (1989, 1995), Openshaw et al. (1991), Goodchild et al. (1992), Heuvelink and Burrough (1993), Ehlers and Shi (1997) and Leung and Yan (1998). Scant attention has so far been given to the problem of modelling uncertainty as the data is transformed through different models of geographic space; a notable exception is the work of Lunetta et al. (1991). A typical path taken by data captured by satellite and then abstracted into a suitable form for GIS is shown in Fig. 1, and involves four such models. Continuously varying fields are quantised by the sensing device into image form, then classified and finally transformed into discrete mapping objects. The overall object extraction process is sometimes referred to as semantic abstraction (Waterfeld and Schek, 1992) due to the increasing semantic content of the data as it is manipulated into forms that are easier for people to work with.

When transforming data between different conceptual models of geographic space, the uncertainty characteristics in the data may change: the techniques used to transform the data alter the inherent uncertainty and may, in addition, introduce further uncertainty of their own. Furthermore, many of the abstraction techniques employed combine data with different uncertainty characteristics, for example multi-temporal image classification (Jeon and Landgrebe, 1992) or knowledge-based feature extraction (McKeown, 1987). Consequently, two inter-related problems must be addressed, namely:

1. How do the uncertainty characteristics of data change as data are transformed between models?
2. How do the transformation methods used affect and combine the uncertainty present in the data (Lodwick et al., 1990)?

This paper concentrates mainly on the first question, proposing a framework within which the second question can be tackled.

One of the consequences of the separation of GIS and RS activities into separate communities and separate software environments is that there is an artificial barrier between the two disciplines. Thus, the integration of these two branches of science is to some extent an artificial problem. As a result, there is no easy flow of meta-data between systems, inter-operability is often restricted to the exchange of image files or object geometry and the problem of managing uncertainty is compounded.

The four stages shown in Fig. 1 represent four models of geographic space that are considered here. They are termed the field (F), image (I), thematic (T) and object (or feature) (O) models and are typical (though not exhaustive) of those used in the integration of GIS and remote sensing activities. These models represent the conceptual properties of the data only and are considered here as independent from any particular data structure that might be used to encode and organise the data.

The description of uncertainty used here follows that proposed by Sinton (1978). It covers the sources of error as they occur in remote sensing and GIS integration (although other approaches may be equally valid). Uncertainty is restricted to the following properties: (i) value (including measurement and label errors), (ii) spatial, (iii) temporal, (iv) consistency and (v) completeness. These are symbolised below using the following five letters of the Greek alphabet (α, β, χ, δ and ε, respectively). Of these, α, β and χ can be applied either individually to a single datum or to any set of data. The latter two properties of consistency and completeness can only apply to a defined data set since they are comparative (either internally among data or to some external framework). Issues regarding scale and resolution of the data (e.g. Bruegger, 1995) are postponed; these are not objective uncertainty criteria, but instead become important when assessing the ‘fitness for use’ of data for a specific task. Also, we do not concern ourselves here with structures for the provision of lineage information (e.g. Lanter, 1991), although these are certainly required to support the propagation of uncertainty.

The techniques proposed for quantifying error and uncertainty referred to in this paper are not in themselves new. The purpose of the formal notation developed later is to describe succinctly and unambiguously the forms of uncertainty that exist in the data, where they originate and what they affect. Ongoing work by the authors and others is attempting to quantify some of these uncertainty terms as they relate to the integration of GIS and remote sensing; for example, ISPRS Commissions 2, 3 and 4 span the full range of integration activities described here (see http://www.isprs.org/technical_commissions.html for more details).

Section snippets

Description of data model properties and their uncertainty characteristics

More formally, a single geographic datum (a) at any level of abstraction can be described by its value (d), spatial extents (s) and temporal extents (t) (Gahegan, 1996), each of which has associated uncertainty u, where u has three components, α, β and χ:

$a(d, s, t, \alpha, \beta, \chi).$

The above expression assumes that α, β and χ may be orthogonalised. As will be shown later, it is often difficult to treat these components separately, since each may affect the others. A similar formulation, using upper case
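As an illustration only, the datum a(d, s, t, α, β, χ) and the set-level properties δ (consistency) and ε (completeness) might be represented as in the following minimal Python sketch. The class names `Datum` and `DataSet`, the method `max_value_uncertainty`, the use of plain floats for each uncertainty term, and the simplification of spatial extents to a single point are all assumptions made for this sketch; the paper does not prescribe a concrete encoding.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass(frozen=True)
class Datum:
    """One geographic datum a(d, s, t, alpha, beta, chi) (hypothetical encoding)."""
    d: Any                    # value: a measurement or a class label
    s: Tuple[float, float]    # spatial extents, simplified here to a point (x, y)
    t: Tuple[float, float]    # temporal extents as (start, end)
    alpha: float              # value uncertainty (assumed scalar)
    beta: float               # spatial uncertainty (assumed scalar)
    chi: float                # temporal uncertainty (assumed scalar)


@dataclass
class DataSet:
    """A defined set of data.

    Consistency (delta) and completeness (epsilon) are comparative,
    so they are attached to the set rather than to a single datum.
    """
    data: List[Datum]
    delta: Optional[float] = None    # consistency, set-level only
    epsilon: Optional[float] = None  # completeness, set-level only

    def max_value_uncertainty(self) -> float:
        """Largest value uncertainty (alpha) over all data in the set."""
        return max(x.alpha for x in self.data)


# Example values are invented purely to exercise the structure.
pixel_a = Datum(d=0.42, s=(10.0, 20.0), t=(0.0, 1.0), alpha=0.05, beta=0.5, chi=0.0)
pixel_b = Datum(d=0.40, s=(11.0, 20.0), t=(0.0, 1.0), alpha=0.10, beta=0.5, chi=0.0)
scene = DataSet(data=[pixel_a, pixel_b], delta=0.9, epsilon=0.8)
```

Keeping δ and ε off the `Datum` type reflects the point made in the introduction that these two properties can only apply to a defined data set, while α, β and χ may be stated for a single datum.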

The transformation process between geographic models

This section describes the three transformation processes between geographic models, field→image, image→theme and theme→object and their effects on the data characteristics introduced above.
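The chain of transformations can be sketched in Python as a simple left-to-right accumulation, in which each model inherits the uncertainty sources of the model it was derived from and adds its own. This is a rough illustration, not the paper's formalism: the function `propagate`, the constant `TRANSFORMS`, and the source labels in the example are hypothetical placeholders and do not reproduce the contents of the paper's Table 1.

```python
from typing import Dict, FrozenSet, List, Tuple

# The three transformations between the four models:
# field (F) -> image (I) -> thematic (T) -> object (O).
TRANSFORMS: List[Tuple[str, str]] = [("F", "I"), ("I", "T"), ("T", "O")]


def propagate(own_sources: Dict[str, FrozenSet[str]]) -> Dict[str, FrozenSet[str]]:
    """Accumulate uncertainty sources across the chain of models.

    own_sources maps a model letter to the sources introduced at that
    model (or by the transformation producing it). The result maps each
    model to everything it has inherited plus its own sources.
    """
    accumulated = {"F": own_sources.get("F", frozenset())}
    for src, dst in TRANSFORMS:
        # Inherit from the lower model, then add this model's own sources.
        accumulated[dst] = accumulated[src] | own_sources.get(dst, frozenset())
    return accumulated


# Placeholder labels, named after the steps described in the introduction.
example = {
    "I": frozenset({"sensor quantisation"}),
    "T": frozenset({"classification"}),
    "O": frozenset({"object formation"}),
}
result = propagate(example)
```

A set union is used here only to track where uncertainty originates; combining magnitudes of uncertainty would need a model of how each transformation method affects the terms, which this sketch does not attempt.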

Discussion, conclusions and future work

The expressions described above seem to offer some insight into the complex transformation processes that occur within integrated GIS, particularly with reference to keeping track of the sources of errors and uncertainties and their effects. Table 1 categorises these errors and uncertainties as they apply to the four geographic models considered here.

Uncertainty propagates from left to right across the table as the higher models inherit uncertainty properties from the lower models and add to

References (41)

M.F. Goodchild. Geographical data modelling. Comput. Geosci. (1992)

R.M. Haralick. Performance characterisation in computer vision. CVGIP-Image Understanding (1994)

M. Worboys. Computation with imprecise geospatial data. Comput., Environ. Urban Syst. (1998)

Y. Bedard. Uncertainties in land information databases.

B.P. Bruegger. Theory for the integration of scale and representation formats: major concepts and practical implications.

P.A. Burrough. Fuzzy mathematical methods for soil survey and land evaluation. J. Soil Sci. (1989)

N.R. Chrisman. The error component in spatial data.

G. Edwards et al. Modelling uncertainty in photo-interpreted boundaries. Photogramm. Eng. Remote Sens. (1996)

M. Ehlers et al. Integration of remote sensing and GIS: data and data access. Photogramm. Eng. Remote Sens. (1991)

M. Ehlers et al. Error modelling for integrated GIS. Cartographica (1997)

M.N. Gahegan. Specifying the transformations within and between geographic data models. Trans. GIS (1996)

M.N. Gahegan et al. A model to support the integration of image understanding techniques within a GIS. Photogramm. Eng. Remote Sens. (1996)

M. Gahegan et al. Recent developments towards integrating scene understanding within a geographic information system for agricultural applications. Trans. GIS (1999)

M.F. Goodchild et al. Development and test of an error model for categorical data. Int. J. Geogr. Inf. Syst. (1992)

C.M. Gurney. The use of contextual information in the classification of remotely sensed data. Photogramm. Eng. Remote Sens. (1983)

G.B.M. Heuvelink et al. Propagation of errors in spatial modelling with GIS. Int. J. Geogr. Inf. Syst. (1989)

G.B.M. Heuvelink et al. Error propagation in cartographic modelling using Boolean logic and continuous classification. Int. J. Geogr. Inf. Syst. (1993)

G. Hunter et al. Mapping uncertainty in spatial databases, putting theory into practice. J. Urban Reg. Inf. Syst. Assoc. (1993)

G. Hunter et al. Communicating uncertainty in spatial databases. Trans. GIS (1996)
