Variable selection in regression models used to analyse Global Positioning System accuracy in forest environments

https://doi.org/10.1016/j.amc.2012.08.069

Abstract

Reliable information on the geographic location of individual points using GPS (Global Positioning System) receivers requires an unobstructed line of sight from the points to a minimum of four satellites. This is often difficult to achieve in forest environments, as trunks, branches and leaves can block the GPS signal. Forest canopy can be characterized by means of dasymetric parameters such as tree density and biomass volume, but it is important to know which parameters in particular have a bearing on the accuracy of GPS measurements. We analyzed the relative influence of forest canopy and GPS-signal-related variables on the accuracy of the GPS observations using a methodology based on linear regression models and bootstrapping and compared the results to those for a classical variable-selection method based on hypothesis testing. The results reveal that our methodology reduces the number of significant variables by approximately 50% and that both forestry and GPS-signal-related variables are significant.

Introduction

Knowledge of the shape and dimensions of the earth’s surface and, more specifically, identification of the position of its characteristic features is a standard exercise in various branches of engineering, including forestry. The development of techniques based on global navigation satellite systems (GNSS), including the Global Positioning System (GPS), has particularly transformed surveying practices [1]. The use of triangulation and traversing methods is no longer limited by the availability of a direct line of sight between a known position and the object whose position is to be determined [2].

Although satellite positioning systems provide reliable information on the position of individual points, irrespective of the weather conditions, at any time and place on or near the terrain surface, they require an unobstructed line of sight to a minimum of four satellites [3]. The use of GNSS techniques to study, describe and measure land used for forestry is increasingly common. Applications of GPS and GNSS techniques in forestry include plot inventories [4], cadastral surveys [5], [6], map and plan making [7], geographic information systems [8], surface area and plot perimeter estimates [9] and even forestry planning and implementation [7]. However, tree cover reduces the effectiveness of these techniques, because trunks, branches and foliage cause interference and signal loss [10]. This is evident in the precision obtained for the position of characteristic terrain features, which is lower by one order of magnitude [11]. Most studies report a number of complications in using these techniques and provide practical recommendations for ensuring correct measurement.
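The four-satellite requirement mentioned above follows from a count of unknowns. As a simplified sketch (atmospheric delays, multipath and satellite clock errors are omitted, and the symbols are our own notation rather than the paper's), each observed pseudorange relates the receiver position and receiver clock offset to a known satellite position:

```latex
% Simplified pseudorange model for satellite i (notation is illustrative):
%   (x, y, z)        unknown receiver coordinates
%   (x_i, y_i, z_i)  known satellite coordinates
%   \delta t         unknown receiver clock offset, c the speed of light
\rho_i = \sqrt{(x_i - x)^2 + (y_i - y)^2 + (z_i - z)^2} + c\,\delta t,
\qquad i = 1, \dots, n .
% Four unknowns (x, y, z, \delta t) require n \ge 4 independent equations,
% i.e. simultaneous signals from at least four satellites.
```

Each satellite blocked by trunks, branches or foliage removes one equation from this system, which is why canopy obstruction can degrade or prevent a position fix.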

The accuracy of GNSS observations, which needs to be consistent with the tolerance limits established for each case, depends on the systematic error component, mathematically expressed as bias, and the accidental error component directly related to precision (expressed as standard deviation). To handle error appropriately and obtain sufficiently accurate measurements, it is important to determine possible causes of error during the observation phase and to assess their possible bearing on measurements.
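The split between systematic and accidental error can be illustrated with a small numerical sketch. The snippet below assumes a hypothetical set of signed errors (in metres) for one coordinate measured against a known reference; the function name `error_components` and the sample values are illustrative and are not taken from the study. It computes the bias as the mean error, the precision as the population standard deviation, and the root mean square error, which satisfy RMSE² = bias² + sd².

```python
import math


def error_components(errors):
    """Return (bias, sd, rmse) for a list of signed errors.

    bias: mean error (systematic component)
    sd:   population standard deviation (accidental component)
    rmse: root mean square error (overall accuracy)
    """
    n = len(errors)
    bias = sum(errors) / n
    sd = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    return bias, sd, rmse


# Hypothetical signed errors (m) of one coordinate; not data from the paper.
errors = [0.8, 1.2, 0.5, 1.9, 1.1]
bias, sd, rmse = error_components(errors)
print(f"bias={bias:.3f} m, sd={sd:.3f} m, rmse={rmse:.3f} m")
```

A receiver can therefore be precise (small standard deviation) and still inaccurate if the bias is large, which is why both components have to be assessed.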

Previous research has revealed that, along with conventional causes, several dasymetric parameters (tree dimensions, tree growth and standing volume) significantly influence accuracy in measurements made in forest environments. Bakula et al. [12], referring to real time kinematic observations, indicated the need to resolve ambiguities (a process called initialization) to ensure a high degree of precision and accuracy. Hasegawa et al. [13] evaluated the accuracy of static-mode dual-frequency GPS receivers operating in forest environments and developed a logistic regression model that estimates the probability of resolving ambiguities, with the observation period and tree cover index as independent variables. They concluded that, although position was more accurate when tree cover was less dense, 15 min of observation was sufficient to resolve ambiguities and obtain satisfactory precision under tree cover. Using a method based on genetic algorithms, Ordóñez et al. [14] concluded that dasymetric parameters had a greater bearing on positioning accuracy than variables associated with the GPS signal. Considering only the accuracy of vertical measurements, Wing and Frank [15] recorded significant differences between measurements made with GPS receivers with the same settings in environments with and without tree cover, concluding that forest cover had a negative influence on accuracy.

We describe a methodology for analyzing the relative importance of eight dasymetric parameters (arithmetic mean diameter, tree density, treetop height, Hart–Becking index, dominant height, basal area, standing volume and slenderness coefficient) and variables related to the GPS signal (signal-to-noise ratio in codes CA and P, position dilution of precision (PDOP), number of satellites transmitting signal, number of satellites receiving code and mean elevation angle) for the accuracy of GPS receiver observations made under tree cover. We used a linear regression model combined with bootstrap techniques to determine the minimum number of explanatory variables necessary to obtain the best prediction. To compare results, we used a backward stepwise method implemented in the R program [16].
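For readers unfamiliar with backward stepwise selection, the comparison method can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not say which R routine or criterion was used, so the sketch assumes AIC-based elimination (the criterion that R's `step` function applies by default), written in Python with NumPy. The function names `aic_ols` and `backward_eliminate` and the synthetic data are hypothetical.

```python
import numpy as np


def aic_ols(X, y):
    """AIC (up to an additive constant) of an OLS fit with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    return n * np.log(rss / n) + 2 * A.shape[1]


def backward_eliminate(X, y):
    """Start from all columns; drop one at a time while the AIC decreases."""
    active = list(range(X.shape[1]))
    best = aic_ols(X[:, active], y)
    while len(active) > 1:
        # AIC obtained after removing each remaining column in turn.
        scores = [
            (aic_ols(X[:, [j for j in active if j != d]], y), d)
            for d in active
        ]
        candidate, drop = min(scores)
        if candidate >= best:
            break  # no single removal improves the criterion
        best = candidate
        active.remove(drop)
    return active


# Synthetic data (illustrative only): only columns 0 and 3 affect y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=200)
selected = backward_eliminate(X, y)
print("retained columns:", selected)
```

Because each removal is judged one variable at a time, a procedure of this kind may retain covariates that contribute little, which is the behaviour the bootstrap-based method is compared against.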

Section snippets

Methodology

For multiple regression models (with p variables), a common question is how to determine the best subset of q (q ≤ p) covariates, that is, the subset that ensures the best possible fit to the data. This problem is particularly important in situations with many variables or with redundancy between highly correlated variables. In these contexts, the addition of a new variable to a model may appear to yield a better data fit, yet there are reasons why the estimates obtained may not be
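The best-subset question can be made concrete with a small sketch. The code below is an illustration under our own assumptions, not the procedure of the paper: it searches exhaustively over all subsets of size q, scores each by the residual sum of squares (RSS) of an ordinary least squares fit, and uses synthetic data in which one covariate nearly duplicates another. The names `rss_ols` and `best_subset` are hypothetical.

```python
from itertools import combinations

import numpy as np


def rss_ols(X, y):
    """Residual sum of squares of an OLS fit with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ beta) ** 2))


def best_subset(X, y, q):
    """Search all size-q column subsets; return (columns, rss) of the best."""
    candidates = combinations(range(X.shape[1]), q)
    return min(
        ((cols, rss_ols(X[:, list(cols)], y)) for cols in candidates),
        key=lambda pair: pair[1],
    )


# Synthetic data (illustrative only): column 4 nearly duplicates column 0.
rng = np.random.default_rng(1)
X = rng.normal(size=(150, 5))
X[:, 4] = X[:, 0] + 0.05 * rng.normal(size=150)
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(scale=0.5, size=150)

best_rss = []
for q in range(1, 6):
    cols, rss = best_subset(X, y, q)
    best_rss.append(rss)
    print(q, cols, round(rss, 2))
```

The in-sample RSS of the best subset never increases as q grows, so fit alone cannot indicate when to stop adding variables; a formal criterion or test is needed to choose q. Exhaustive search also grows combinatorially with p, so it is practical only for a moderate number of covariates.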

Experimental test

The usefulness of the proposed method was tested for a set of code range and carrier phase observations captured using two dual-frequency GPS receivers (Hyper-Plus, Topcon Positioning Systems, Inc., Livermore, CA, USA) in 12 forest scenarios with different kinds of tree cover. The observations, each lasting a minimum of 1.5 h, were made over four working days (lasting 5–6 h) between 20 and 23 August 2007. The logging rate was 1 s and antenna height was 1.45–1.60 m.

Records were reviewed subsequently

Results and discussion

The application of the proposed methodology to our observations indicated that, for a confidence level of 95%, the null hypothesis could not be rejected for q = 7. Fig. 1 shows the T̂ statistic for each value of q and the corresponding lower limit a.

It can be observed that, for a confidence level of 95%, the null hypothesis was rejected until seven variables were included and was thereafter accepted (for values of a of less than 1).

Table 2 shows, for models with 1–21 variables, the input variables that

Conclusions

We evaluated the influence of dasymetric parameters and GPS-signal-related variables on the accuracy of observations made with a GPS receiver under tree cover, using a linear regression model and a variable selection method that determines the minimum number of variables needed to obtain the best estimate. The results obtained indicate that no single model explains accuracy, as different combinations offered the same prediction capacity.

The proposed method provided better results than

Acknowledgments

The authors gratefully acknowledge the financial support from the project MTM2011-23204 of the Spanish Ministry of Science and Innovation (FEDER support included) and Xunta de Galicia (10 PXIB 300 068 PR).

References (25)

  • M.G. Wing et al., Vertical measurement accuracy and reliability of mapping-grade GPS receivers, Comput. Electron. Agr. (2011)
  • Instituto Geográfico Nacional (IGN), GNSS, Ministerio de Fomento del Gobierno de España, Madrid, 2011. Online: 26...
  • B. Hofmann-Wellenhof et al., GPS Theory and Practice (2001)
  • J. Bao-Yen Tsui, Fundamentals of Global Positioning System Receivers: A Software Approach (2004)
  • D. Evans et al., Use of global positioning system (GPS) for forest plot location, Southern J. Appl. Forestry (1992)
  • T. Soler et al., GPS high accuracy geodetic networks in Mexico, J. Surv. Eng. (1996)
  • T. Yoshimura, S. Gandaseca, S. Gumus, H. Acar, Evaluating the accuracy of GPS positioning in the forest of the Macka...
  • T.P. McDonald et al., Using the global positioning system to map disturbance patterns of forest harvesting machinery, Can. J. Forest Res. (2002)
  • M.G. Wing et al., GIS: an updated primer on a powerful management tool, J. Forest Sci. (2006)
  • Y. Tachiki et al., Effects of polyline simplification of dynamic GPS data under forest canopy on area and perimeter estimations, J. Forest Res. (2005)
  • C. Ordóñez et al., Machine learning techniques applied to the assessment of GPS accuracy under the forest canopy, J. Surv. Eng. (2009)
  • P. Sigrist et al., Impact of forest canopy on quality and accuracy of GPS measurements, Int. J. Remote Sens. (1999)