Short communication
A three-dimensional gap filling method for large geophysical datasets: Application to global satellite soil moisture observations

https://doi.org/10.1016/j.envsoft.2011.10.015

Abstract

The presence of data gaps is always a concern in geophysical records, creating not only difficulty in interpretation but, more importantly, a large source of uncertainty in data analysis. Filling the data gaps is a necessity for use in statistical modeling. Numerous approaches exist for this purpose. Particularly challenging, however, is the increasing number of very large spatio-temporal datasets, such as those from Earth observation satellites. Here we introduce an efficient three-dimensional method based on discrete cosine transforms, which explicitly utilizes information from both time and space to predict the missing values. To analyze its performance, the method was applied to a global soil moisture product derived from satellite images. We also validated the method by introducing synthetic gaps. The results show that this method is capable of filling data gaps in the global soil moisture dataset with very high accuracy.

Introduction

The presence of data gaps is a cause for concern in many geophysical datasets and presents a large source of uncertainty in data analysis. This is of particular importance when analyzing the spatio-temporal variability of large datasets, e.g., large-scale satellite observations. In the last two decades, satellite observations have demonstrated the potential to become a major tool for observing the properties of the Earth’s land surface and atmosphere, such as soil moisture, temperature, aerosols and, more recently, greenhouse gases. The data gaps in satellite datasets are intrinsic, primarily due to the satellite orbits. Other causes, such as cloud contamination or instrument failure, can also create data gaps. The rapidly growing volume and diversity of satellite datasets require an efficient method for filling the data gaps.

Several methods for this purpose have emerged in recent years (e.g., Diamantopoulou, 2010), among which the most promising are based on the empirical orthogonal functions (EOF) of spatial variability (Beckers and Rixen, 2003; Alvera-Azcárate et al., 2007) or the singular spectrum analysis (SSA) of temporal variability (Kondrashov and Ghil, 2006; Hocke and Kämpfer, 2009). These methods use a few optimal spatial or temporal modes occurring at low frequencies to predict the missing values. Because the other components are discarded as noise, the statistical models fitted to the existing values may lose accuracy, and so may the missing values predicted from these models. More importantly, for large spatio-temporal datasets it is of critical importance to utilize information from both spatial and temporal variability to predict the missing values. This demands a method that explicitly takes into account the full three-dimensionality (2-D spatial + time) of the spatio-temporal dataset. However, no such method has been reported to date.

Here we introduce a penalized least squares method based on three-dimensional discrete cosine transforms (DCT-PLS) for filling data gaps in large spatio-temporal datasets. To show its performance we apply it to a global soil moisture product derived from satellite images. There are two reasons for choosing a soil moisture dataset as the primary example. First, soil moisture is an important climate component, which affects the drought and heat conditions of the lower atmosphere through partitioning of the available net radiation into latent heat for evaporation and sensible heat for temperature (Koster et al., 2010; Seneviratne et al., 2010). Complete soil moisture datasets are urgently needed, both for a number of practical applications, such as agriculture and weather forecasting (Varella et al., 2010), and for an improved empirical understanding of the interactions between soil moisture and the atmosphere. Second, soil moisture exhibits a red spectrum in time (Wang et al., 2010). This poses a special challenge to existing gap filling methods that utilize only optimal modes at low frequencies (Kondrashov and Ghil, 2006). It is worth noting that some methods are specifically designed for filling data gaps in high-resolution in-situ soil moisture time series, as reviewed in Dumedah and Coulibaly (2011); however, these were not compared to our method, which considers large spatio-temporal satellite products with coarse resolution.
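The general idea of a DCT-based penalized least squares fill can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes gaps are encoded as NaN, uses a hand-picked fixed smoothing parameter `s` and a fixed iteration count `n_iter` (both hypothetical choices for this sketch), and penalizes the squared discrete Laplacian over all axes at once, so time and space are treated jointly. The function name `dct_pls_fill` is likewise an assumption.

```python
import numpy as np
from scipy.fft import dctn, idctn


def dct_pls_fill(data, s=1.0, n_iter=100):
    """Fill NaN gaps in an N-D array (e.g. time x lat x lon) by iterating
    a DCT-domain smoother. Observed values are returned unchanged."""
    data = np.asarray(data, dtype=float)
    observed = np.isfinite(data)
    weights = observed.astype(float)      # 1 where observed, 0 in gaps
    y = np.where(observed, data, 0.0)     # gap values are never used (weight 0)

    # Eigenvalues of the second-difference operator in the DCT-II basis,
    # summed over every axis so that all dimensions are smoothed together.
    lam = np.zeros(data.shape)
    for axis, n in enumerate(data.shape):
        shape = [1] * data.ndim
        shape[axis] = n
        lam = lam + (2.0 - 2.0 * np.cos(np.pi * np.arange(n) / n)).reshape(shape)
    gamma = 1.0 / (1.0 + s * lam ** 2)    # low-pass filter in DCT space

    # Start from the mean of the observed values and iterate to a fixed point:
    # z <- IDCT(gamma * DCT(W * (y - z) + z))
    z = np.full(data.shape, y[observed].mean())
    for _ in range(n_iter):
        blended = weights * (y - z) + z
        z = idctn(gamma * dctn(blended, norm="ortho"), norm="ortho")

    return np.where(observed, data, z)
```

A larger `s` gives a smoother fill, while a smaller `s` follows the observations more closely but converges more slowly inside large gaps. In practice the smoothing parameter would need to be tuned or estimated for the dataset at hand.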


Global soil moisture product

We use the VU University-NASA (VUA-NASA) global volumetric soil moisture product (m³ m⁻³) derived from the Advanced Microwave Scanning Radiometer-Earth Observing System (Owe et al., 2008). This sensor is mounted on NASA’s Aqua satellite and has daily ascending (13:30 equatorial local crossing time) and descending (01:30) overpasses. The surface moisture is retrieved with the Land Parameter Retrieval Model (LPRM) that solves simultaneously for the surface soil moisture and vegetation optical

Gap filling results

Fig. 1 shows the fraction of data gaps in the soil moisture product for the study period. The major causes of data gaps in this dataset are track changes, dense vegetation, frozen soil (snow) and water bodies. As a polar-orbiting satellite, Aqua gives better coverage over the high latitudes. However, data gaps amount to 60–90% north of 45°N because of frozen soil. The same situation exists in high-elevation regions such as the Tibetan Plateau.
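A per-cell gap fraction of the kind described above can be computed directly from the data cube. The sketch below assumes a (time, lat, lon) array layout with gaps encoded as NaN; both the layout and the function name `gap_fraction` are assumptions for illustration, not details taken from the product.

```python
import numpy as np


def gap_fraction(cube, time_axis=0):
    """Fraction of time steps without a valid value, for each grid cell."""
    cube = np.asarray(cube, dtype=float)
    return np.mean(~np.isfinite(cube), axis=time_axis)


# Tiny example: 4 time steps on a 2 x 2 grid, one cell missing 3 of 4 values.
cube = np.ones((4, 2, 2))
cube[:3, 0, 0] = np.nan
fractions = gap_fraction(cube)   # 0.75 at cell (0, 0), 0.0 elsewhere
```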

Discussion

In this communication, we introduce an efficient DCT-PLS method for filling the data gaps in large spatio-temporal datasets. It is recommended particularly for the rapidly growing volume and diversity of satellite observations in environmental sciences. Using a global satellite soil moisture dataset as a challenging example, we have demonstrated the very good skill of this method for gap filling purposes.
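Skill of this kind is typically assessed with synthetic gaps, as mentioned in the abstract: some observed values are hidden, the gaps are refilled, and the predictions are compared with the withheld values. The sketch below shows one way to set this up for any gap filling routine. The function names, the 10% hold-out fraction, the error measures, and the placeholder `temporal_mean_fill` are assumptions for illustration and do not reproduce the validation design of the paper.

```python
import numpy as np


def synthetic_gap_scores(data, fill_fn, frac=0.1, seed=0):
    """Hide a random fraction of the observed values, refill them with
    fill_fn, and score the predictions against the withheld values."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    observed = np.flatnonzero(np.isfinite(data))
    n_hide = max(1, int(frac * observed.size))
    hidden = rng.choice(observed, size=n_hide, replace=False)

    masked = data.copy()
    masked.flat[hidden] = np.nan          # introduce the synthetic gaps
    filled = np.asarray(fill_fn(masked))

    truth = data.flat[hidden]
    pred = filled.flat[hidden]
    return {
        "rmse": float(np.sqrt(np.mean((pred - truth) ** 2))),
        "bias": float(np.mean(pred - truth)),
        "n": int(n_hide),
    }


def temporal_mean_fill(cube):
    """Placeholder filler: each gap takes its grid cell's mean over time (axis 0)."""
    cell_mean = np.nanmean(cube, axis=0, keepdims=True)
    return np.where(np.isfinite(cube), cube, cell_mean)
```

Any filler with the same call signature can be passed in place of `temporal_mean_fill`, which makes it straightforward to compare methods on identical synthetic gaps.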

This DCT-PLS method has some novel features with respect to other gap filling

Acknowledgements

The work was supported by the Netherlands Organization for Scientific Research (NWO; grant No. 854.00.026) and ESA's STSE-funded Integrated Project WAter Cycle Multimission Observation Strategy (WACMOS; Contract No. 22086/08/I-EC).
