Satellite-detected gain in built-up area as a leading economic indicator

Leading indicators of future economic activity include measures such as new housing starts, the purchasing managers' index, money supply, and bond yields. Such macroeconomic and financial indicators hold predictive power in signaling recessionary periods. However, many indicators are constrained by the fact that data are often published with some delay and are subject to constant revision (Bandholz and Funke 2003, Huang et al 2018, Orphanides 2003). In this research, we propose a leading indicator derived from satellite imagery: the expansion of anthropogenic bare ground. Satellite-detected gain in built-up area, a major land cover and land use (LCLU) outcome of anthropogenic bare ground gain (ABGG), provides an inexpensive, consistent, and near-real-time indicator of global and regional macroeconomic change. Our panel data analysis across four major regions of the world from 2001 to 2012 shows that the logarithm of total ABGG in year t, t − 1 or t − 2, mostly owing to its major LCLU outcome, the expansion of built-up land, is significantly correlated with the year t logarithm of gross domestic product (GDP, de-trended by the Hodrick–Prescott filter). Global ABGG between 2001 and 2012 averaged 7875 km2 yr−1, with a peak gain of 11 875 km2 (±2014 km2 at the 95% confidence interval) in 2006, prior to the 2007–2008 global financial crisis. The curve of global ABGG, or of its major LCLU outcome of built-up area, in year t − 1 accords well with that of the de-trended logarithm of global GDP in year t. Given the 40-year archive of free satellite data, a growing satellite constellation, advances in machine learning, and scalable methods, this study suggests that analyses of ABGG as a whole or of its LCLU outcomes can provide valuable information in near-real time for socioeconomic research, development planning, and economic forecasting.


Introduction
The 2007-2008 global financial crisis is considered the worst since the Great Depression of the 1930s and had dramatic impacts on global and regional economies and societies. Economists and policy makers seek indicators of overall economic health to help diagnose and forecast expected performance, with the goal of mitigating volatility and avoiding shocks such as the crisis of 2007-2008. However, such efforts are limited because macroeconomic and financial variables are often reported with delays and constantly revised (Bandholz and Funke 2003, Orphanides 2003, Huang et al 2018). Recently, new data resources from satellite images (Jean et al 2016, Bennett and Smith 2017), cellphone metadata (Blumenstock et al 2015) and social media (Li et al 2013, Liu et al 2015) have emerged as indicators of economic activity. For example, proposed proxies for GDP, such as lit area (Elvidge et al 1997) and luminosity (Chen and Nordhaus 2011), have been derived from satellite data on nighttime light (Henderson et al 2012). However, the ability of new big data sources to predict changes in economic outcomes at global or regional scale is a work in progress. Based on the presented research, we propose a leading indicator derived from freely available satellite imagery: the expansion of built-up area, a major LCLU outcome of anthropogenic bare ground gain (ABGG).
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
ABGG is a dynamic of land-cover and land-use change (LCLUC) that principally results from economic activities. Estimates of ABGG, based on the characterization of publicly available satellite imagery, can potentially serve as a low-cost, near-real time source for proxies of economic change from national to global scales. We define bare ground gain as a process of land-cover change featuring the removal and continued absence of vegetative cover for at least three years by either human or natural disturbances (Ying et al 2017). In our previous research (Ying et al 2017), globally and for each of seven regions over the 2001-2012 period, we partitioned bare ground gain into six components, defined by their LCLU outcomes: resource extraction; infrastructure development; commercial/residential built-up area; transitional bare ground gain, defined as new bare ground gain that had not yet been clearly put to some use; greenhouses; and one component for all natural gain (see Ying et al 2017 for detailed explanation). The five components of ABGG accounted for 95% of total bare ground gain over the study period. Examples of ABGG include expansion of urban areas, construction of new roads, mining, installation of oil wells, among other dynamics.
For the present study we estimated temporal trends of bare ground gain and its LCLU outcomes at the global scale (figure 1(a)) and for the seven regions (figures 1(b)-(e), S3; supplementary material is available online at stacks.iop.org/ERL/14/114015/mmedia) for which time series of economic indicators are aggregated by the World Bank: East Asia and the Pacific, North America, Europe and Central Asia, Latin America and the Caribbean, South Asia, Sub-Saharan Africa, and the Middle East and North Africa. We conducted panel data analysis of the four regions that account for over 90% of ABGG and have relatively low uncertainties: East Asia and the Pacific, North America, Europe and Central Asia, and Latin America and the Caribbean. This analysis shows that the logarithm of ABGG in year t, t − 1 or t − 2 is significantly correlated with the de-trended (by Hodrick-Prescott (HP) filter) logarithm of GDP in year t, whereas such gains in year t or t − 1 are significantly correlated with the de-trended logarithms of merchandise imports and exports, or energy consumption, in year t. Globally, annual ABGG between 2001 and 2012 averaged 7875 km2, with a peak gain of 11 875 km2 (±2014 km2 at the 95% CI) occurring in 2006, prior to the 2007-2008 global financial crisis. The curve of global ABGG in year t − 1 accords well with that of the de-trended logarithm of global GDP in year t. This predictive attribute makes remotely sensed ABGG an important LCLUC theme that can help socioeconomic analysts and policy makers develop financial plans and allocate resources towards stable growth.

Method and data
Estimation of annual bare ground gain and attribution of LCLU outcomes
Unbiased estimates of the area of annual bare ground gain were produced from a set of probability-based samples (Ying et al 2017). Specifically, global land area was stratified by a set of globally seamless bare ground gain layers that were produced through automatic classification methods using Landsat 7 Enhanced Thematic Mapper Plus (ETM+) growing season composites between 2000 and 2012. Global annual composites were produced from 654 178 growing season Landsat scenes with per-pixel detection of cloud, shadow, snow/ice, water or qualified observation (Hansen et al 2013). Ying et al (2017) calculated multi-temporal metrics from the time series of annual composites, which were then used to build tree models for classification of bare ground gain. Because of errors inherent in the mapped layers of bare ground gain, we did not calculate areas by counting the pixels labeled as gain by the classification models, which would have biased the reported areas. Instead, we employed those layers as a stratifier to distribute a set of samples efficiently for unbiased area estimation of bare ground gain. A total of 5750 sample pixels were selected globally in a stratified random design (25 strata) and then interpreted as to whether bare ground gain had occurred (1635 gain pixels versus 4115 no-gain pixels) using time series of Landsat images, 32-day normalized difference vegetation index (NDVI) time sequences from Google Earth Engine (Gorelick et al 2017) and high-resolution images on Google Earth (figure S1). For gain sample points, we assigned the change year as the year when NDVI dropped by over 50% and remained low for at least three years, following our definition of bare ground gain. We attributed the LCLU outcomes of gain samples by combining the spectral characteristics and spatial configuration seen in Landsat and high-resolution images with, where available, local photos from Google Earth.
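The change-year rule described above (an NDVI drop of over 50% that persists for at least three years) can be sketched in code. This is a minimal illustration and not the interpretation procedure used in the study, which relied on visual interpretation supported by imagery. The function name `detect_change_year`, the use of annual NDVI values, measuring the drop against the previous year, and counting the change year as the first of the three low years are all assumptions made for the example.

```python
from typing import Optional, Sequence


def detect_change_year(
    years: Sequence[int],
    ndvi: Sequence[float],
    drop_fraction: float = 0.5,
    persistence_years: int = 3,
) -> Optional[int]:
    """Return the first year in which NDVI falls by more than `drop_fraction`
    relative to the previous year and stays below that level for
    `persistence_years` consecutive years (change year included).
    Returns None when no such year can be confirmed."""
    for i in range(1, len(ndvi)):
        baseline = ndvi[i - 1]
        if baseline <= 0:
            continue  # no vegetation signal to lose
        threshold = baseline * (1.0 - drop_fraction)
        window = ndvi[i : i + persistence_years]
        if len(window) < persistence_years:
            break  # too close to the end of the record to confirm persistence
        if all(value < threshold for value in window):
            return years[i]
    return None
```

Under these assumptions a single low year, such as a fallow field, is not flagged, and a drop in the last two years of the record cannot be confirmed, which mirrors the edge effect that a three-year persistence requirement creates at the end of a time series.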
Recording the change year and attributing the LCLU outcomes of sample pixels that were labeled as bare ground gain enabled us to estimate annual change areas and to disaggregate all changes into six components of direct bare ground gain drivers: resource extraction, infrastructure development, commercial/residential built-up area, transitional bare ground gain, greenhouses and natural bare ground gain (Ying et al 2017, Zalles et al 2019). We estimated the bare ground gain area attributable to LCLU outcome i in year t in stratum h as
$$\hat{A}_{i,t,h} = A_h \, \hat{p}_{i,t,h} = A_h \, \frac{n'_{i,t,h}}{n_h},$$
where $A_h$ is the total area in stratum h, $\hat{p}_{i,t,h}$ is the sample proportion of pixels interpreted as bare ground gain of LCLU outcome i in year t in stratum h, $n'_{i,t,h}$ is the number of sample pixels interpreted as bare ground gain of LCLU outcome i in year t in stratum h, and $n_h$ is the number of sample pixels allocated to stratum h. The bare ground gain area of LCLU outcome i in year t is then obtained by summing the area estimates over all strata:
$$\hat{A}_{i,t} = \sum_{h} \hat{A}_{i,t,h}.$$
Area estimates and uncertainty quantification of bare ground gain were performed at global and regional levels. To reduce the variation in annual bare ground gain estimates caused by interpretation bias in the change year, which arises from noise and missing data in NDVI time sequences and historic high-resolution images, the sample count for year t was taken as the average of the counts for year t and its one-year neighbors (t−1 and t+1). Land cover change maps derived from remotely sensed imagery are subject to omission and commission errors (Olofsson et al 2014). However, probability-based samples that use maps as strata to target specific LCLU dynamics yield unbiased estimates of change areas (Stehman 2013). Using classified land cover change maps to target the sampling of a class that is rare relative to the overall land surface, in this case bare ground gain, raised the sampling efficiency elevenfold and greatly reduced the standard error of our change estimates (Ying et al 2017).
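The stratified estimator described above can be illustrated with a short sketch. The area estimate follows the definitions in the text (stratum area times the sample proportion of gain pixels, summed over strata). The standard error uses the textbook formula for stratified random sampling of a proportion, which is an assumption here because the text does not state its variance estimator. The `Stratum` class and `stratified_area` function are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stratum:
    area_km2: float  # A_h: total area of stratum h
    n_sampled: int   # n_h: sample pixels allocated to stratum h
    n_gain: int      # n'_{i,t,h}: pixels interpreted as gain of outcome i in year t


def stratified_area(strata: List[Stratum]) -> Tuple[float, float]:
    """Return (area estimate, standard error) in km2 for one outcome and year.

    Area: sum over strata of A_h * n'_h / n_h.
    Standard error (assumed, standard stratified formula):
    sqrt(sum over strata of A_h^2 * p_h * (1 - p_h) / (n_h - 1)).
    """
    total = 0.0
    variance = 0.0
    for s in strata:
        p = s.n_gain / s.n_sampled
        total += s.area_km2 * p
        if s.n_sampled > 1:
            variance += s.area_km2 ** 2 * p * (1.0 - p) / (s.n_sampled - 1)
    return total, math.sqrt(variance)
```

With two made-up strata, a small high-probability stratum (1000 km2, 20 gain pixels out of 100) and a large low-probability stratum (50 000 km2, 1 gain pixel out of 200), the estimate is 200 + 250 = 450 km2. Most of the variance comes from the large stratum, which is why an accurate stratifier that concentrates the rare class in small strata improves sampling efficiency.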
Producing highly accurate maps of bare ground gain is the key to increasing sampling efficiency and reducing estimation uncertainty. For the historical data set, we required a three-year absence of vegetation in the remote sensing classification of bare ground gain to eliminate commission errors from agricultural fallow being falsely classified as bare ground gain. Although the annual composites used for detection of bare ground gain were selected from growing season Landsat images, it is possible that, owing to limited observations in some areas, the reflectance of pre-planting cleared agricultural land was selected for a pixel in three consecutive years. To better serve as a leading indicator in near-real time, a number of enhancements are possible, including the use of monthly satellite observations of bare ground in place of the annual measure employed here, and the use of land use maps to track ephemeral bare ground gain associated with established land uses such as agriculture and forestry.
Growing trends and business cycles of selected economic variables
Economic development is depicted here by economic, trade and energy measures considered to be related to ABGG. The World Bank is one of the leading groups collecting and analyzing economic data at global, regional and national levels. Indicators including GDP, merchandise imports and exports, and energy use and production (table S3) were downloaded from the World Bank database at global and regional levels for 2000 to 2012 (The World Bank 2019a). Annual statistics were recorded for 211 countries and regions, and then aggregated to the seven regions grouped by geographic location as identified by the World Bank. We converted the values of GDP and merchandise imports and exports, which were measured in current US dollars in the original data, to constant 2010 US dollars by adjusting for inflation.
The HP filter (Hodrick and Prescott 1997), as implemented in the R statistical package (Balcilar 2015), was employed to decompose the time series of the five economic variables into a growth trend and a business cycle. The HP filter is arguably the most commonly used mathematical tool in macroeconomics, especially in real business cycle theory, for separating the time trend from the cyclical component of a time series (Hodrick and Prescott 1997, Williamson 2002). Its objective is composed of two terms. One keeps the fitted trend close to the time series and is measured by the residuals. The other controls the smoothness of the trend and is measured by the second derivative of the trend. A parameter λ weighs the two terms, moving the trend between a straight line (large λ) and the original time series (λ = 0). The choice of λ depends on data frequency. The optimal trend is the one that minimizes the weighted sum of the two terms. We took natural logarithms of the time series variables, removed the trend component, and derived the cyclical component using a λ of 100 for annual data (Backus and Kehoe 1992).
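The two-term objective described above has a closed-form solution, sketched here in plain Python. This is an illustration only and not the R implementation used in the study. It solves (I + λK'K)·trend = series, where K is the second-difference operator, using a small Gaussian elimination so that no external library is needed. The function name `hp_filter` is hypothetical.

```python
from typing import List, Sequence, Tuple


def hp_filter(series: Sequence[float], lam: float = 100.0) -> Tuple[List[float], List[float]]:
    """Split `series` into (trend, cycle) by minimising
    sum((y - trend)^2) + lam * sum((second difference of trend)^2)."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three observations")
    # Build A = I + lam * K'K, where K is the (n-2) x n second-difference matrix.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for row in range(n - 2):
        coeffs = {row: 1.0, row + 1: -2.0, row + 2: 1.0}
        for i, ci in coeffs.items():
            for j, cj in coeffs.items():
                a[i][j] += lam * ci * cj
    # Solve A * trend = series by Gaussian elimination with partial pivoting.
    aug = [a[i] + [float(series[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    trend = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(aug[i][j] * trend[j] for j in range(i + 1, n))
        trend[i] = (aug[i][n] - rest) / aug[i][i]
    cycle = [y - t for y, t in zip(series, trend)]
    return trend, cycle
```

To follow the procedure in the text, the input would be the natural logarithm of an annual series and `lam` would be 100. A perfectly linear input has zero second differences, so the filter returns it unchanged as the trend and the cycle is zero.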
Panel data analysis of bare ground gain dynamics and economic fluctuations
Panel data analysis (Flanagan et al 2006) was used in fixed effect mode to examine the correlation between the cyclical component of each economic variable and the natural logarithms of different combinations of LCLU outcomes of bare ground gain, while controlling for the unobservable time-invariant heterogeneity associated with each region. In the fixed effect model, taking GDP as an example, the cyclical component of the logarithm of GDP in region r and year t is regressed on the logarithm of bare ground gain area plus a region-specific term a_r, where subscripts r and t denote region and year. In the model, a_r represents characteristics such as resource endowments, laws and regulatory regimes, or culture that are unique to each region, do not change much over a short period of about one decade, but are correlated with the predictor variable. The fixed effect model controls for the effect of these characteristics to assess the net effect of changes in bare ground gain areas on the variation in economic variables. We used the R 'plm' package for estimation (Yves and Giovanni 2008) and performed the Hausman test to justify the selection of the fixed effect model.
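The fixed effect specification described above can be written out explicitly. The notation below is an assumption made for illustration. Only the region effect a_r and the subscripts r and t are named in the text. The slope β, the error term ε, the lag k (in years) and the superscript c marking the HP-filtered cyclical component are introduced here.

```latex
\ln(\mathrm{GDP})^{c}_{r,t} = a_r + \beta \, \ln(\mathrm{ABGG}_{r,t-k}) + \varepsilon_{r,t}
```

Under this form, k = 0 gives the coincident model and k = 1 or k = 2 gives the models in which bare ground gain leads the economic variable by one or two years.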
We only used data in regions of East Asia and the Pacific, Europe and Central Asia, Latin America and the Caribbean and North America for their relatively low uncertainty of annual bare ground gain estimates. As our baseline year is 2000, the full-time frame of bare ground gain estimates is from 2001 to 2012.
Because our definition of bare ground gain requires an absence of vegetation for three consecutive years, edge effects may have influenced our estimates for the last two years, which could cause underestimation of bare ground gain areas in 2011 and 2012. We tested regression models for both full (2001-2012,

Results
Global distributions, trends and compositions of bare ground gain
A probability-based, stratified-random sample (figure S2) was selected from mapped bare ground strata from 2000 to 2012. Each sample was visually interpreted using reference imagery, specifically Landsat Enhanced Thematic Mapper Plus (ETM+) imagery with a moderate spatial resolution (∼30 m) and commercial imagery with very high spatial resolution freely viewable on Google Earth. Each sample location of bare ground gain was assigned to a type of LCLU outcome and to a year of initial vegetation removal (figure S1).
From 2000 to 2012, global bare ground gain averaged 7881 km2 yr−1. That is, every year a land area close to the size of Yellowstone National Park semi-permanently or permanently lost its vegetation cover. About half of this average rate, 3550 km2 yr−1, was in the East Asia and the Pacific region, and the smallest portion, 154 km2 yr−1, was in the Middle East and North Africa. The temporal trend of total bare ground gain globally is unimodal (figure 1(a)).
The greatest LCLU outcome of bare ground gain in most regions was commercial/residential built-up area (49% in North America, 44% in Europe and Central Asia, 29% in Latin America and the Caribbean), followed by resource extraction (North America 32%, Europe and Central Asia 26%, Latin America and the Caribbean 23%). East Asia and the Pacific differed: there, infrastructure was the largest outcome (39%), followed by commercial/residential built-up area (34%). Different LCLU outcomes of bare ground gain, however, showed different patterns of peak time and change rate (figure 1, figure S4). The trends in the expansion of commercial/residential built-up area varied among the four regions in the years following their 2006 peaks (figure S4). For example, expansion in North America and in Europe and Central Asia gradually declined, expansion in East Asia and the Pacific temporarily stabilized from 2007 through 2010 and then resumed its decline, and expansion in Latin America and the Caribbean appeared to begin recovering in 2010 (figure S4). New infrastructure development generally peaked in 2008-2010. A shorter cycle of transitional bare ground gain appeared in each region following the decline of commercial/residential built-up area. Growth in resource extraction resumed about two years after the peak in each region except East Asia and the Pacific, and was the source of the recovery in overall bare ground gain in Latin America and the Caribbean.

Gain in built-up area foreshadowed the Great Recession
The cyclic patterns of global ABGG foreshadowed the decade's macroeconomic fluctuations, which were dominated by the 2007-2008 global financial crisis. The rise in ABGG, driven mainly by commercial/residential built-up area, transitioned to a decline prior to the 2008 crash (figure 3(a)). Furthermore, inter-annual ABGG and commercial/residential built-up area were both significantly correlated with the de-trended global GDP (figures 3(b), (c)).
To explore how these patterns were related to economic activities during the study period, we carried out panel data regressions of fluctuations in individual economic variables versus different combinations of LCLU outcomes of ABGG, and their leading or lagging terms. GDP was used in the analysis because, among the four components of expenditure, investment is most closely related to infrastructure development and commercial/residential built-up area, whereas consumption and net exports are more associated with all LCLU outcomes of ABGG. Merchandise exports and imports were included to further account for domestic land use and displacement (Lambin and Meyfroidt 2011, Yu et al 2013). Energy use and production partly accounted for fossil fuel extraction, one major component of bare ground gain from resource extraction. Merchandise imports and exports, and energy use, are significantly correlated with GDP (figure S7).
Panel analysis indicates that trends in regional ABGG are significant leading indicators, by one year, of the tested economic variables, especially GDP (table 1, S4, figure 4). Varying the lag length of bare ground gain, panel regressions show that commercial/residential built-up area leads GDP by one or two years at highly significant levels (figure 4).
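The lagged fixed effect regressions described above can be sketched with the within transformation, in which each variable is demeaned by region before a simple least-squares fit. This is an illustrative stand-in for the R 'plm' estimation reported in the text, not a reproduction of it. The function name `within_slope`, the panel layout (region, then year, then a pair of predictor and response), and the reporting of the within r2 are assumptions.

```python
from typing import Dict, List, Tuple

# region -> year -> (x, y), e.g. x = ln(ABGG), y = cyclical component of ln(GDP)
Panel = Dict[str, Dict[int, Tuple[float, float]]]


def within_slope(panel: Panel, lag: int = 1) -> Tuple[float, float]:
    """Fixed effect (within) estimate of y[t] on x[t - lag].

    Returns (slope, within r2). A positive `lag` means x leads y.
    """
    xs: List[float] = []
    ys: List[float] = []
    for by_year in panel.values():
        # Pair the response in year t with the predictor in year t - lag.
        pairs = [
            (by_year[t - lag][0], by_year[t][1])
            for t in sorted(by_year)
            if (t - lag) in by_year
        ]
        if len(pairs) < 2:
            continue
        mean_x = sum(p[0] for p in pairs) / len(pairs)
        mean_y = sum(p[1] for p in pairs) / len(pairs)
        # Demeaning by region removes the region-specific term a_r.
        xs.extend(p[0] - mean_x for p in pairs)
        ys.extend(p[1] - mean_y for p in pairs)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("no within-region variation to fit")
    return sxy / sxx, (sxy * sxy) / (sxx * syy)
```

Calling the function with `lag` set to 0, 1 and 2 and comparing the returned r2 values corresponds to the comparison of lag lengths in figure 4. A negative `lag` pairs the response with a later predictor, which is the case labeled as a positive lag in that figure.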
Comparing different compositions of LCLU outcomes of ABGG, panel regressions show that commercial/residential built-up area leading GDP yields the highest r2 values (figure 4). A 10% increase in ABGG in an antecedent year is associated with growth in the following year of 0.6% (±0.2%) for GDP, 1% (±0.3%) for merchandise imports, and 0.9% (±0.3%) for merchandise exports (table 1). For one-year lagged terms (t − 1), total ABGG, as well as gains in commercial/residential built-up area, alone or combined with those in infrastructure development and transitional land, are all significantly correlated with GDP, merchandise imports and exports, and energy use (table 1), showing a capability to predict economic changes. Compared to resource extraction and infrastructure development, the one-year lagged term of commercial/residential built-up area outperforms the coincident term, with an increase in r2 of 0.21 (table 1, S4).
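The percentage interpretation quoted above follows from the log-log form of the regression, and the arithmetic can be checked with a short sketch. The slope of 0.06 used below is an illustrative value chosen to be consistent with the reported 0.6% response to a 10% increase. It is not a coefficient taken from table 1. The function name `pct_response` is hypothetical.

```python
import math


def pct_response(beta: float, pct_change_x: float) -> float:
    """Percent change in y implied by a log-log slope `beta`
    for a given percent change in x."""
    log_change_x = math.log(1.0 + pct_change_x / 100.0)
    return (math.exp(beta * log_change_x) - 1.0) * 100.0


# A 10% rise in the predictor with an assumed slope of 0.06.
print(round(pct_response(0.06, 10.0), 2))
```

For small slopes the result is close to the slope multiplied by the percent change, which is why a coefficient near 0.06 corresponds to a response of roughly 0.6% for a 10% increase. Because the dependent variable is the cyclical component of the logarithm, the response is measured relative to the long-run trend.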
Leading indicators performed differently among regions. For example, commercial/residential built-up area peaked in 2006 in Europe and Central Asia and in Latin America and the Caribbean (figures 1(d)-(e), figure S4), two years earlier than the GDP peak in these regions (figure S6). Also, the magnitude of changes in ABGG was modest compared to the magnitude of GDP changes. For example, the decrease in new commercial/residential built-up area in 2008 was weak compared to the plunge in the GDP of Europe and Central Asia in 2009 (figure S8).

Discussion
Link between gain in built-up area and regional economic activities
As with the global trend, ABGG and gains in commercial/residential built-up area in North America and in Europe and Central Asia, where the economies were first hit by the subprime mortgage crisis in 2007 and the substantial European debt crisis, all peaked in 2006, earlier than these financial crises. According to the Standard & Poor's/Case-Shiller Composite Home Price Index, a measure of the aggregate market for single-family homes in 10 major US cities, the real estate market entered a price boom in the late 1990s and abruptly turned down after mid-2006 (Shiller 2008). Commercial/residential built-up area is closely related to the housing market; thus the trends in this component of ABGG mirrored the market, immediately signaling a downturn in real estate investment, which is later reflected in macroeconomic accounts.
The temporal dynamics of commercial/residential built-up area, infrastructure development and transitional bare ground gain were inherently coupled over our study period. For example, the peak in infrastructure gain lagged that of commercial/residential built-up area by two years. This suggests that flexible housing markets are more sensitive to economic changes than infrastructure projects are, the latter typically requiring more planning and equipment.
East Asia and the Pacific accounted for 45% of global bare ground gain, 78% of which was in China. China has experienced large-scale rural-urban migration since the economic reform. The urbanization rate increased from 36% in 2000 to 52% in 2012, while the average urban household income grew fourfold (The World Bank 2019b). To accommodate massive urban in-migration, China carried out urban housing reform, moving the provision of urban housing from a welfare system to a market-oriented system. These demographic, economic and institutional changes resulted in an average gain of 976 km2 yr−1 in commercial/residential built-up area (figure 2(a) and yellow samples in figure S2).
Implications for near-real time monitoring of global and regional economic health
In the long run, the accumulated change of bare ground from anthropogenic demands is driven by population growth and economic development (table S4). Nevertheless, short-term factors that cause fluctuations in GDP, such as market anticipation, the monetary system, technological innovation and policy decisions, also affect annual changes in the quantity, attributes and spatial allocation of new bare ground. More importantly, the ability of satellite-detected changes in built-up area to signal economic recession with a one-year lead can help policy makers initiate counter-recession measures in a much more timely manner than current policymaking practice allows. This ability can also help financial institutions and specialists make better-informed investment decisions, and help the public prepare for economic recessions. Thus, it is important to implement near-real time monitoring of bare ground gain at global and regional scales.
The ongoing earth observation programs of the Landsat and Sentinel 2 satellites enable near-real time monitoring of land change at large scale, as exemplified by an alert system for forest disturbance that operates on a weekly basis. The surge of CubeSat technology (Hand 2015) also provides high-resolution images that are especially important for the validation and land-use attribution of bare ground gain. The latency of confirming bare ground gain and attributing LCLU change may be reduced by high-resolution observations and contextual inference. Our approach is scalable and bridges socioeconomics and land-use change at national or local scale, with potential for operational implementation.

Data availability
Any data that support the findings of this study are included within the article. Sample reference images and plots are openly available at https://glad.geog.umd.edu/bare-ground-gain-as-leading-economic-indicator.

ORCID iDs
Qing Ying https://orcid.org/0000-0002-9752-8973
Matthew C Hansen https://orcid.org/0000-0003-0042-2767
Laixiang Sun https://orcid.org/0000-0002-7784-7942

Figure 4. Comparison of r2 (y axis) and significance (point size) between fixed effect regression models of GDP for different time lags (x axis; −1 means ABGG in year t − 1 and GDP in year t, 1 means ABGG in year t + 1 and GDP in year t) and different compositions of LCLU outcomes of ABGG (point color). Commercial/residential built-up area (CR), ABGG, the sum of infrastructure development, commercial/residential built-up area and transitional bare ground gain (ID+CR+TR), and the sum of infrastructure development and commercial/residential built-up area (ID+CR) led GDP by one year and by two years, significant at the 5% level. The two-year lead of CR has an r2 of 0.18 and the one-year lead of CR has an r2 of 0.33.