Operational continental-scale land cover mapping of Australia using the Open Data Cube

ABSTRACT To comprehensively support national and international initiatives for sustainable development, land cover products need to be reliably and routinely generated within operational frameworks. Coupled with consistent semantics and taxonomies, ensuring confidence in mapping land cover for multiple time periods, facilitates informed decision-making at scales appropriate to multiple policy domains. The United Nations Food and Agriculture Organisation (FAO) Land Cover Classification System (LCCS) provides a taxonomy that comparable at different scales, level of detail and geographic location. The Open Data Cube (ODC) initiative offers a framework for operational continental-scale land cover mapping using analysis-ready Earth Observation data. This study utilised the FAO LCCS framework and the Landsat sensor data through Digital Earth Australia (DEA; Australia’s ODC instance) to generate consistent and continent-wide land cover mapping (DEA Land Cover) of the Australian continent. DEA Land Cover provides annual maps from 1988 to 2020 at 25 m resolution. Output maps were validated with ∼12,000 independent validation points, giving an overall map accuracy of 80%. DEA Land Cover provides Australia with a nationally consistent picture of land cover, with an open-source software package using readily available global coverage data and demonstrates a pathway of adoption for national implementations across the world.


Introduction
Identifying the state and condition of landscapes through continuous monitoring is paramount to ecological sustainable development (Díaz et al. 2019). This is becoming increasingly urgent given the rapid changes occurring globally in relation to declared climate and biodiversity emergencies (Tollefson 2019;Gills and Morgan 2020). Landscape state can be identified using land cover, an effective and easily interpretable biophysical descriptor of the Earth's surface (Wulder et al. 2018). Earth Observation (EO) can provide spatially and temporally explicit land cover information, identifying landscape state and its change over time and space. Reliable, standardised, continental-scale mapping of land cover facilitates informed decision-making at scales appropriate for planning and implementing national and international policies on sustainable development (Franklin and Wulder 2002;Kavvada et al. 2020;Metternicht, Mueller, and Lucas 2020;Mohamed-Ghouse et al. 2021). The provision of ongoing and reproducible land cover maps, particularly at high spatial and temporal scales, has been difficult to achieve in the past (Grekousis, Mountrakis, and Kavouras 2015;Gómez, White, and Wulder 2016). Recent efforts in analysis-ready data (ARD) accessibility and advances in systems capable of providing computational requirements are timely to address these challenges, whereby demand for land cover maps intersects current capacity to provide products that meet end user requirements (Wulder et al. 2018;Homer et al. 2020).
Land cover maps generated recently have overcome several historical limitations of mapping at national and continental scales. Contemporary efforts mapping land cover for Europe (Pflugmacher et al. 2019;Malinowski et al. 2020;Venter and Sydenham 2021), China (Yang and Huang 2021), Africa Zhang et al. 2020), the conterminous United States (Homer et al. 2020), and near-real time global land use land cover (Brown et al. 2022) have generated products at greater spatial resolutions (i.e. 10 m using Sentinel-2, 30 m using Landsat-8) and higher temporal frequencies (e.g. updated map every 1-5 years), providing valuable data for identifying large scale landscape change. However, the transferability of methods and land cover taxonomies remains a common limitation to widespread adoption beyond the geographic extent or particular year of method development. Several methods use disaggregation measures or ancillary data to account for climatic variation (Pflugmacher et al. 2019;Venter and Sydenham 2021), or employ numerous spatiallystratified models to ensure high classification accuracy of the end mapping product (Inglada et al. 2017;Zhang et al. 2020). Similarly, selected land cover taxonomies are often relevant to a limited geographic extent (Griffiths, Nendel, and Hostert 2019;Ghorbanian et al. 2020), or reflect a data-driven approach (e.g. based on reference data available) Malinowski et al. 2020;Brown et al. 2022), where land cover classes may not be applicable for many end user requirements (e.g. supporting regional and national policy implementation).
The increasing need to monitoring landscape change for sustainable development, as well as identifying opportunities for climate change mitigation and adaptation, ecosystem restoration and biodiversity conservation, requires ongoing and routine generation of land cover products (Metternicht, Mueller, and Lucas 2020;Owers et al. 2021). National and international efforts to report landscape change, such as the United Nations System of Environmental Economic Accounting (SEEA) and Sustainable Development Goals (SDGs), rely on consistent and relevant national to global land cover information (Kavvada et al. 2020;Metternicht, Mueller, and Lucas 2020;Skidmore et al. 2021). However, many existing products are unsuitable for this reporting as they are not produced routinely or on a consistent, long-term basis, providing only a snapshot of landscape state for specific years (Griffiths, Nendel, and Hostert 2019;Venter and Sydenham 2021). Often methods applied in previous iterations vary widely (e.g. Li et al. 2020;Zhang et al. 2020), requiring modifications for additional retrieval and generation of updated maps (e.g. changing land cover semantics and taxonomy, updated machine learning algorithms). Such variations in the production of land cover information make comparisons between maps through time very difficult (Herold et al. 2016;Phiri and Morgenroth 2017;Yang et al. 2017), limiting confidence in their use for identifying changes in landscape states.
Australia, a country with diverse and dynamic landscapes, is a prime example of the essential need for routine mapping and monitoring of landscape state. To date, the National Dynamic Land Cover Dataset (DLCD) (Lymburner et al. 2011) has provided a nationally consistent land cover product for Australia andits territories (2002-2015). However, the maps are limited in their capacity to identify landscape change, largely because these were produced from a 250 m resolution MODIS composite product encompassing multiple years. Recent advances have utilised the DLCD within a machine learning approach to achieve 30 m resolution mapping for 5-year epochs based on Landsat sensor data (Calderón-Loor, Hadjikakou, and Bryan 2021). However, to comprehensively support international initiatives for sustainable development, land cover products need to prioritise methods that are more transparent (i.e. generated according to Findable, Accessible, Interoperable and Reusable (FAIR) principles, where caveats are easily understood; Wilkinson et al. 2016) and adaptable (e.g. across sensors and platforms, utilising available computational resources), with consistent semantics and taxonomies to facilitate robust and routine generation.
The Land Cover Classification System (LCCS), developed by the United Nations Food and Agriculture Organisation (FAO; Di Gregorio and Jansen 2000), provides a taxonomy that is fundamentally well suited to consistent classification of land cover (Lucas and Mitchell 2017;Lucas et al. 2019). Despite the issues inherent to land cover classification taxonomies (i.e. thematic representations of continuous realworld landscapes), the LCCS addresses key issues of interoperability, a major caveat with many other land cover taxonomies and classifications . This is because the LCCS is a semantically-driven integrated system, providing a taxonomy that is consistent and comparable at different scales, level of detail and geographic location. Moreover, as an internationally recognised taxonomy, land cover mapping using the LCCS taxonomy is also interoperable with end-user requirements (i.e. classes generated closely align with habitat taxonomies that are widely used by ecologists) (Atyeo and Thackway 2006;Kosmidou et al. 2014;Punalekar et al. 2020Punalekar et al. , 2021. Application of the LCCS for use with EO data has been established using the Earth Observation Data for Ecosystem Monitoring (EODESM) system (Lucas and Mitchell 2017;Lucas et al. 2019Lucas et al. , 2020. Unlike other EO implementations of the LCCS, which generally directly classify a limited number of 'end classes' in the taxonomy, the EODESM adopts a sequential classification that aligns with the LCCS hierarchy using derived products from EO data (Lucas and Mitchell 2017). The EODESM system places emphasis on retrieving continuous or categorical environmental descriptors with predefined units or codes respectively, which are progressively integrated within the hierarchical and module structure of the FAO LCCS and combined to generate several thousands of land cover classes . The advantage of this classification approach is that it is relevant and applicable to any location globally and can be applied independently of spatial scale. The EODESM system was demonstrated initially through the EU Horizon 2020 Ecopotential project for national parks (Lucas and Mitchell 2017), subsequently for select sites in Australia  and Malaysia  and most recently for Wales . A fully implemented EODESM platform, Living Earth, has been optimised for EO data . However, operational continental scale implementation for routine map production is yet to be demonstrated.
Performing reliable, standardised, continental-scale land cover mapping requires access to analysis-ready data (ARD) and adequate high-performance computing facilitates. The Open Data Cube (ODC) initiative provides a framework to store and analyse EO data for continuous monitoring . Digital Earth Australia (DEA) is an ODC instance containing the entire Australian archive of Landsat data (1987 to present), utilising the high-speed storage and processing power of Australia's National Computing Infrastructure (NCI) as well as the public cloud (i.e. Amazon Web Services; AWS). The ODC framework enables a pixel-based rather than the traditional scene-based approach to analysing Landsat data, directly comparing observations from a specific location acquired over two or more epochs (Dhu et al. 2017). This analytical power provides unprecedented capability for continental-scale analysis at high temporal frequency and has been used to develop innovative products, such as Water Observations from Space (WOfS) (Mueller et al. 2016), intertidal extent and elevation (Sagar et al. 2017;Bishop-Taylor et al. 2019), and high dimensional statistics that allow the production of cloud-free composite mosaics (Roberts, Mueller, and McIntyre 2017, 2019. The prospect of routinely generating consistent land cover maps using an internationally recognised taxonomy aligns with DEA's transparent and transferable EO analytical framework . The aim of this study was to generate consistent and continent-wide land cover mapping (DEA Land Cover) for Australia and its territories from the Landsat archive. More specifically, land cover maps were generated from environmental descriptors retrieved or classified from the Landsat sensor data according to the dichotomous FAO LCCS taxonomy (termed Level 3), which provides capacity for more detailed classifications based on the modular-hierarchical phase (termed Level 4). Annual maps were generated for the entire Landsat archive . The specific objectives of this study were to: . Implement the dichotomous framework of the FAO LCCS within the Living Earth platform. . Differentiate six (6) key land cover types as defined by the FAO LCCS Level 3; terrestrial (semi) natural vegetation, aquatic natural vegetation, cultivated and managed vegetation, artificial surfaces, natural (bare) surfaces, and water bodies. . Develop an implementation optimised for national operational land cover mapping using high performance computing within the ODC framework The method developed and implemented in this study is provided as an open-source software package using readily available global coverage data. This method, demonstrated using the Australian continent, has clear potential for country-level, regional and global adoption. Whilst we demonstrate the application of Living Earth using the rich time-series of Landsat sensor data, the approach can be applied independently of the satellite platform and sensor type.

The Australian landscape
The focus of this study was the Australian continent, its mainland and island territories. With a land area of 7.7 million km2 and a latitudinal range of 10°41´S to 43°38´S (ABS 2012), Australia encompasses 6 major climate groups (Peel, Finlayson, and McMahon 2007), ranging from tropical regions in the north, extensive inland arid areas, and temperate regions in the south. Although only 18% of the continent is considered to be desert, up to 70% of the mainland receives less than 500 mm of rain annually, varying substantially with elevation and latitude (Geoscience Australia 1994), as well as through interannual climate variability (e.g. El Niño-Southern Oscillation; ABS 2012). Coupled with a long history of landform evolution, Australia exhibits complex and varying landscapes with vastly different biomes.
The distribution, extent and change in land cover across the continent is broadly influenced by climate and landscape suitability. Agricultural expanses occupy the subtropical and temperate areas of eastern Australia, making up approximately 52% of the land area (ABS 2012). Urbanised areas are predominantly concentrated along the coastline, with temperate coasts hosting most of Australia's major cities (Clark and Johnston 2017). Naturally bare surfaces are widespread in the continent's interior, with an extensive band of natural terrestrial and aquatic vegetation along the northern, eastern and southern coastlines. Reported changes in land cover include a continual trend of native vegetation removal for agriculture and urban expansion (Harwood et al. 2016;White, Griffioen, and Newell 2020), fluctuations in water extent (Krause et al. 2021) and variable levels of greenness associated primarily with climate impacts on vegetation cover and productivity (Donohue, McVicar, and Roderick 2009;Xie et al. 2019;Lamchin et al. 2020). Episodic events (e.g. flooding, drought, wildfires, tropical cyclones) are becoming more intense. These can have extensive impacts on the landscape, some with unparalleled severity, including mass mangrove mortality in northern Australia (Duke et al. 2017;Asbridge et al. 2019) and extreme forest fires in southeast Australia (Boer, de Dios, and Bradstock 2020).
Two calendar years (2010 and 2015) were selected to design, apply and evaluate the proposed method for its subsequent scaling out to the entire Landsat archive . These years represent the range of climatic conditions common to the Australian continent; 2010 was one of the wettest years on record (records commenced in 1900) with mean rainfall well above the longterm average (2010, 690 mm; 1961-1990 average, 465 mm) with this attributed to a significant La Niña event (BOM 2011). 2015 represented a particularly warm and dry year, with mean temperatures 0.83°C above the 1961-1990 average, coinciding with below average rainfall (2015, 444 mm; BOM 2016). A robust land cover classification approach developed for these years served to ensure a consistent classification for annual maps of the entire Landsat archive.

FAO LCCS framework
The FAO LCCS Level 3 land cover taxonomy is hierarchical, consisting of a binary decision tree structure ( Figure 1). Primarily vegetated areas, defined as vegetative cover of at least 4% for at least two months of the year, are initially differentiated from primarily non-vegetated areas (Di Gregorio 2005). Terrestrial and aquatic areas are subsequently distinguished, where the latter are associated with landscapes that are significantly influenced by the presence of water over an extensive period each year. Primarily vegetated areas are further classified based on human activities, generating four primarily vegetated Level 3 classes consisting of (a) cultivated and managed terrestrial areas (CTV), (b) natural and semi-natural terrestrial vegetation (NTV), (c) cultivated aquatic or regularly flooded areas and (d) natural and semi-natural aquatic or regularly flooded vegetation (NAV). Similarly, primarily non-vegetated terrestrial areas are separated into (a) artificial surfaces and associated expanses (AS) or (b) naturally bare surfaces (BS). Primarily non-vegetated aquatic areas (W) are also differentiated into artificial and natural land cover. Cultivated aquatic or regularly flooded areas were not included in this study given as land cover type is uncommon in Australia and, where existing, is small in area. In addition, non-vegetated aquatic areas were not differentiated into artificial and natural waterbodies, as the former are difficult to consistently identify and map from EO data.
The classification (to LCCS Level 3) was based on the concepts associated with EODESM implemented through the Living Earth platform, as a foundation to enabling wider application to include Level 4. Binary raster inputs were generated using one or more EO products for each decision in the hierarchy. In this way, input products were used for multiple binary decisions, masked by previous levels in the hierarchy. The modified LCCS Level 3 framework used in this study ( Figure 1) provides six (6) land cover types, requiring only four (4) binary raster inputs.

Data
The Landsat data is indexed into DEA as ARD, having already been geometrically corrected, converted to surface reflectance, adjusted for solar illumination and viewing angles, and masked for Figure 1. The modified LCCS Level 3 hierarchical structure with FAO LCCS-2 definitions suitable for land cover products derived from EO data. The dotted line for aquatic vegetation and waterbodies indicates that the land cover class is not differentiated from the Level 2 land cover type. L1; Level 1, L2; Level 2, L3; Level 3. cloud and cloud shadow (Irish 2000;Li et al. 2020;Zhu and Woodcock 2012;Mueller et al. 2016;Lewis et al. 2017). When stored and accessed through DEA, the Landsat ARD allows individual pixel values for the same location to be accessed and interrogated with consistency through time. Several products derived from Landsat ARD were utilised in the hierarchical framework.

Classification workflow
2.4.1. Level 1 Primarily vegetated areas were identified using time-series analysis of Fractional Cover; a product which utilises a spectral unmixing algorithm to measure green photosynthetic vegetation (FC-PV), non-photosynthetic vegetation (FC-NPV) and bare soil (FC-BS) from each terrain corrected Landsat surface reflectance observation (Gill et al. 2017). Water Observations from Space (WOfS) (described below) was used to mask the occurrence of open water in each observation to avoid erroneous errors associated with Fractional Cover endmembers and water (Guerschman et al. 2009). Similarly, a salt index (SI5; Abbas and Khan 2007) was calculated using annual geometric median pixel composites (Geomedian; Roberts, Mueller, and McIntyre 2017) to mask primarily non-vegetated areas such as mudflats and saltpans that Fractional Cover endmembers did not correctly attribute to BS ( Figure 2). Monthly median aggregates were then generated from the masked Fractional Cover observations and used to determine the presence or absence of vegetation. Primarily vegetated areas were defined where the FC-PV or FC-NPV fractions were greater than the FC-BS fraction. These monthly vegetation classifications were then collated and, where the frequency of vegetation presence was two or more consecutive months, the pixel was classified as primarily vegetated. This was considered to meet the FAO LCCS guidelines for primarily vegetated areas. Specific caveats to this method include the inability to confidently define 4% of vegetation cover. However, the criteria were set such that non-vegetated areas needed to demonstrate a majority in the FC-BS endmember (i.e. > 50%).

Level 2
Aquatic areas were determined using several DEA products applied in combination. Water Observations from Space (WOfS) identifies inland and coastal water bodies through a regression tree classification of Landsat (Mueller et al. 2016). An area was considered aquatic where the presence of water detected was greater than 20% over the annual observations, ensuring infrequent flood events were removed. The Intertidal Extent Model (ITEM) was used to identify non-vegetated intertidal areas such as mud and sand flats (Sagar et al. 2017). This dataset represents the exposed intertidal region of the Australian coastline, based on Landsat observations tagged with reference to corresponding tide heights at the time of image capture. This study applied ITEM to extract aquatic intertidal areas, that is, areas being exposed between 10% and 80% of the tidal range observed by the Landsat archive. National mangrove maps were then used to identify tidally inundated areas not captured by ITEM due to obscuration of water by the mangrove canopy cover ). All three data layers were then used in combination to identify aquatic areas, with all those remaining assumed to be terrestrial ( Figure 2).

Level 3
Cultivated and managed areas were identified using a supervised machine learning approach. Training data samples were generated for selected Albers tiles using Landsat geometric median images (Geomedians; Roberts, Mueller, and McIntyre 2017) and interpretation confirmed using high resolution imagery from Google Earth TM for the corresponding years (i.e. training data generated from 2010 and 2015 only). Each tile was segmented in eCognition (to simply group spectrally homogeneous pixels), applying the Normalised Difference Water Index (NDWI) to aggregate and identify water bodies, then manually classifying features to one of five land cover types (cultivated vegetation, n = 5254; natural vegetation, n = 41464; bare areas, n = 995; artificial surface, n = 415; water, n = 2481). Features were selected based on the criteria that they only contain pixels of the required classes and they represent a broad variety of environmental conditions for that class. Each classified training feature was used to extract the median spectral values from two highdimensional Landsat composite products; annual Geomedians (Roberts, Mueller, and McIntyre 2017), including several spectral indices to enhance differentiation between cultivated areas and other land cover types (see Table S1), and median absolute deviations (MAD; Roberts, Dunn, and Mueller 2018). The MAD dataset identified areas of high and low variability within a calendar year, particularly useful for cultivated areas with distinct sowing, growing and harvesting seasons (Roberts, Dunn, and Mueller 2018).
Training features were selected from a range of locations across Australia to account for variability in climate and cultivated type (e.g. rainfed, irrigated). Features were collated, randomised, and stratified on the five land cover types to provide a balanced training dataset, achieved through scikitlearn ensemble methods such as 'random state' and 'balanced subsample' weighting (Pedregosa  Tables S1 and S2. et al. 2011), to develop a classification model. However, testing showed that training on binary data produced more accurate results so the input data was merged to represent just two classes, cultivated areas and counter examples (i.e. natural vegetation, naturally bare surface, artificial surface, water). As an iterative process, several machine learning algorithms were tested with additional training data and features added based on the performance of initial classifications. A Random Forest approach, implemented in scikit-learn (Pedregosa et al. 2011), was selected from this process. Feature selection was conducted using Lasso regression (Tibshirani 1996), this reduced the number of input features to the model (Table S1). The classification model was applied to Landsat Geomedian and MAD datasets where areas classified as cultivated were used as input into LCCS.
Artificial surfaces were differentiated from Bare Surfaces using a deep learning approach, specifically an artificial neural network (ANN). Initially, a continental mask was applied using the MAD dataset to limit false positives due to spectral similarly of salt lakes and other large features in central Australia.
Training data was then generated in a semi-automated manner using a non-linear spectral unmixing model based on selected spectral indices for brightness (Tasseled Cap Brightness;Crist 1985), vegetation (Modified Soil Adjusted Vegetation Index; Qi et al. 1994), and water (Modified Normalised Difference Water Index; Xu 2006). The unmixing analysis was used to provide three fractional endmembers based on visual interpretation for Albers tiles covering the Brisbane region. These fractions were labelled based on their broad distinction of vegetation (n = 10000), bare surface (n = 10000), and urban areas (n = 10000). The labelled fractions were then used as model input trained on all Landsat Geomedians bands and several spectral indices (see Table S2, 11 parameters in total). The ANN architecture consisted of four fully connected (dense) layers in sequence (Table S3), implemented with the Keras library (Chollet 2015), using the Tensorflow backend (Abadi et al. 2015). The model was trained for 200 epochs with a batch size of 100, using a Rectified Linear Unit activation function (ReLU) and gradient-based optimisation where the learning rate is adaptive (RMSprop). Thresholds were determined for the three-class output (i.e. three fractions with values that add to 1) through an iterative process (vegetation < 0.5, bare surface > 0.75, urban areas > 0.08) for applying the model to classify Artificial Surfaces. The Woody Cover Fraction (WCF), developed by Liao et al. (2020), was used to further mask naturally bare surfaces (< 0.004) in combination with the Urban Centre and Localities (UCL) map (ABS 2019). The resulting binary mask identifying Artificial Surfaces was used as input to LCCS (Figure 2).
Overall performance of the Cultivated areas model and Artificial surfaces model were evaluated using independent validation data not used in the training of the models. Input variables for each model were evaluated using SHapely Additive exPlanations (SHAP) to identify feature importance (Lundberg and Lee 2017). SHAP values enable fair comparison of input variables to a machine learning model by comparing the model outputs with and without the feature in an iterative process.

Implementation
The hierarchical classification was executed in Python using the Living Earth software package , with plugins used to generate custom products. Dataset storage was provided by AWS using S3 and ODC framework of DEA. For implementation, the Australian continent was tiled into 100 km × 100 km Albers tiles (EPSG: 3577) to facilitate distributed compute on AWS. Each tile was classified using the hierarchical approach described in section 2.4. Annual continent-scale land cover maps were produced for the Landsat archive (1988-2020).

Validation
Validation was completed for the continental-scale land cover maps, using the years (2010 and 2015) for method development and evaluation, in accordance with good practice guidelines (Olofsson et al. 2014;Salk et al. 2018) and FAO accuracy assessment standards (FAO 2016). An overall expected standard error of 0.007 was selected, and the number of points generated was based on proportional area (see Olofsson et al. 2014). Validation points were generated based on the LCCS level 3 output for 2015, stratified to ensure adequate points were selected for each output class based on areal proportion. This stratification was then adjusted to account for the dominant landscape classes across the continent, with 5% more validation points being added to the four under-represented classes and 10% removed from (Semi) Natural Terrestrial Vegetation (NTV) and Bare Surfaces (BS). Final percentages were rounded to the nearest integer prior to generating the standard validation set (Table S4). This resulted in ∼12,000 validation samples (i.e. 6,000 validation points each for 2010 and 2015 annual maps) separated into 60 geographical clusters (∼100 points each cluster) (see Figure S1).
Accuracy assessment was carried out using visual interpretation of annual Geomedians (Roberts, Mueller, and McIntyre 2017) in combination with WOfS (Mueller et al. 2016) and Google Earth TM / Sentinel-2 imagery for the same epoch where available. Accuracy assessment was manually completed by a small team of remote sensing scientists who observed and followed the same methodology of interpretation, documenting difficult samples, in an effort to enhance consistency and accuracy of the validation set (Wickham et al. 2013;Olofsson et al. 2014;FAO 2016). Each validation team member completed an independent calibration sub-set (∼ 150 points) across the continent to compare visual interpretation consistency between team members (Table S5).
User's and producer's accuracy, f-score, quantity and allocation disagreement, as well as overall accuracy, were generated using estimated area proportions, with variance of accuracy measures reported to 95% confidence intervals (Olofsson et al. 2014;FAO 2016). Spatial variation of accuracy was mapped, stratified on major climate groups (Beck et al. 2018), to visualise the performance of the LCCS Level 3 taxonomy. Area proportions were used to estimate the area of each LCCS Level 3 class by generating an error matrix as well as 95% confidence intervals for each class.

Land cover classifications
Annual continent-scale land cover maps were produced for Australia from 1988 to 2020 at 25 m resolution ( Figure 3). All maps were generated using the same framework with respective Landsat sensor observations for the calendar year. The DEA Land Cover maps are freely available and accessible on DEA Maps (https://maps.dea.ga.gov.au/ -Select 'Land and vegetation' > 'DEA Land Cover'). The open-source software was produced under an Apache 2.0 license and available online (https:// bitbucket.org/au-eoed/livingearth_lccs, https://bitbucket.org/geoscienceaustralia/livingearth_australia).
DEA Land Cover demonstrated that the Australian continent was dominated by primarily vegetated areas (> 50%) and interior expanses of primarily bare natural surfaces (> 20%). Artificial surfaces, water bodies and aquatic natural vegetation make up < 5% of the landscape combined, while cultivated and managed areas account for 5-10%. For the selected method development and evaluation years 2010 and 2015, the relative area of land cover classes varied for each year, largely due to considerable differences in rainfall and their influence on vegetation condition (Figures 4 and 5). Waterbodies in 2010 accounted for only 1% of the continental surface, however, this was double the area compared to 2015 (0.45%). Differences in relative area of primarily naturally bare surfaces demonstrate the substantial influence of climate variability on the Australian landscape, with a 10% increase in area for 2015 (2,677,904 km 2 ; 35%) relative to 2010 (1,901,213 km 2 ; 25%).

Level 3 model accuracy
Total input data to the Cultivated areas Random Forests model was 50609 training samples, with 11737 available for independent validation. The Random Forest model had a reported accuracy of 98% and F1-score of 92%. Feature selection using the Lasso regression removed several Landsat bands and indices from the Geomedian dataset (see Table S1), as well as Bray-Curtis distance median absolute deviations (BCDEV) from the MAD dataset. SHAP analysis demonstrated the greatest variable importance of the near-infrared band (NIR) and Euclidian distance median absolute deviations (EDEV), with several vegetation indices also of moderate importance (Figure 6a). Input variables of lesser importance were those relating to built-up areas (BUI), moisture (NDMI) and bare soil (BSI).
The Artificial surfaces ANN model was trained on total input data size of 54,000 samples, with 6,000 available for independent validation. Final accuracy for the ANN model was 88%, trained for 200 epochs with a batch size of 100, with a reported loss of 0.89. SHAP analysis identified vegetation and water indices, NDVI and MNDWI, as the most prominent input variables impacting model output (Figure 6b). NDVI and MNDWI has substantially higher impact on the model for all output fractions (i.e. vegetation, bare surface, urban areas fractions), having combined mean SHAP values of 1.00 and 0.87 respectively, followed by the tillage index (NDTI) and red band (RED) with only 0.24 and 0.20 respectively. Both the Cultivated areas and Artificial Surfaces models are available on bitbucket (https://bitbucket.org/geoscienceaustralia/livingearth_australia).

Validation
The overall accuracy of DEA Land Cover annual maps produced using LCCS Level 3 classification was 80%. Validation results discussed here present combined statistics for method evaluation years (2010 and 2015 maps) for the classification approach, however individual statistics for 2010 and 2015 maps are present in the supplementary material. Overall, the summary statistics demonstrate a reasonably robust classification, with high sensitivity (i.e. validation points classified correctly; > 75%, Table 1) and very high specificity (i.e. validation points classified outside of particular category correctly; > 85%, Table 1) across all land cover classes. Allocation and quantity disagreement between validation points and classifications were similar, suggesting that the overall error (i.e. equal to 1overall accuracy) was spatially and proportionally distributed (Table 1). However, the exchange component of quantity disagreement (i.e. quantity disagreement = exchange + shift) was substantially greater than the shift, suggesting that the spatial distribution of the overall error was largely due to actual changes in classification. Cultivated and managed areas, as well as naturally bare surfaces, showed substantially higher rates of false discovery (CTV; 45.1%, BS; 41.0%) compared to all other land cover classes (Table 1), reflecting likely confusion in the classification of cultivated and managed areas, and spectral similarity of naturally bare surfaces to other land cover classes.
The combined error matrix demonstrates patterns of misinterpretation by the classification (Table 2). Naturally bare surfaces and natural terrestrial vegetation were somewhat misclassified with each other, suggesting spectral similarity of endmember fractions in the Fractional Cover product. Similarly, cultivated and managed areas were sometimes misinterpreted as natural terrestrial vegetation but were rarely misclassified as another land cover class. Natural aquatic vegetation, artificial surfaces, and waterbodies presented the least misinterpretation or misclassification, with both user's and producer's accuracies > 85%.

Discussion
This study provides annual continental-scale land cover maps (1988-2020) for Australia and its territories from Landsat sensor data delivered through the ODC platform. Our framework provides a standardised and repeatable approach to land cover mapping utilising an internationally recognised taxonomy in the FAO LCCS Level 3, implemented using the open-source Living Earth system for EO data, with this based on the concepts used in EODESM. Six (6) key land cover classes were generated using this approach, with this requiring only four (4) binary input layers representing vegetated, aquatic, cultivated and artificial landscapes. The DEA instance of the ODC platform was used for developing input environmental descriptors for the Living Earth system, taking full advantage of the Landsat time series to generate annual land cover maps at 25 m spatial resolution. This approach was guided by FAIR principles for software design (Wilkinson et al. 2016) to ensure a nationally operational, open-access mapping product. Map outputs presented here are important for systematic identification of landscape change at scales ranging from continental to sub-national. The LCCS Level 3 classification provides foundational metrics required for reporting on national and international policies and environmental accounting, as well as capacity for more detailed attribution based on LCCS Level 4.

Annual land cover maps
The overall accuracy of the annual land cover maps generated was 80%, with select assessment year 2015 having marginally greater accuracy (82%) compared to 2010 (78%). This is consistent with other continental-scale maps utilising moderate resolution satellite data (Pflugmacher et al. 2019;Homer et al. 2020;Yang and Huang 2021). The DLCD reports map accuracy using fuzzy logic, where the 'exact' and 'very similar' matches are 30% and 35% respectively (Lymburner et al. 2011). Although these may appear as low accuracy outputs, the DLCD generated 35 independent classes based on the LCCS Level 4. Therefore, a comparative overall accuracy for DLCD Level 3 can be interpreted as 65%, lower than map accuracies of DEA Land Cover (80%). More recent efforts demonstrate that greater accuracy can be achieved for the Australian continental at 30 m resolution (∼93% overall accuracy, Calderón-Loor, Hadjikakou, and Bryan 2021), though accomplished for 5-year epochs rather than annual time steps, for six user defined classes rather than a  Figure S2; 2015, Figure S3). standard taxonomic framework. It is important to note that unlike many other land cover products, the accuracy of the land cover products generated in this study are a direct consequence of the four binary input layers, not the final six thematic classes. This approach, therefore, provides a highly flexible and robust platform for generating land cover products that is not grounded on the quality of reference data, or a particular machine learning approach for generating end classes. This is likely at the expense of some map accuracy. However, achieving higher overall accuracy may require compromise on the land cover taxonomy (e.g. use of spectrally similar 'other areas' that cannot be identified, see Calderón-Loor, Hadjikakou, and Bryan 2021), reducing the useability of the map product for multiple purposes. The accuracy obtained for the six land cover classes was similar for 2010 and 2015, with natural aquatic vegetation (99.8% = 2010, 99.6% = 2015), waterbodies (99.8% = 2010, 99.6% = 2015), and artificial surfaces (99.9% = 2010, 99.9% = 2015) recording the highest classification accuracy. Natural aquatic vegetation and waterbodies were primarily derived using existing validated products (e.g. Mangroves = 97.9 -98.5%, Lymburner et al. 2019;WOfS = 97%, Mueller et al. 2016), contributing to their impressive classification accuracies, whereas artificial surfaces utilised state-of-the-art deep learning techniques that have demonstrated substantial promise in mapping land covers (e.g. Parente et al. 2019;Amani et al. 2020). Naturally bare surfaces proved challenging to classify correctly, with considerable confusion occurring with natural terrestrial vegetated areas. This is reflective of the reduced accuracies for the interior arid regions of Australia, which are dominated by naturally bare and sparsely vegetated landscapes (Walter and Breckle 1986;Donohue, McVicar, and Roderick 2009;McFarlane and Wallace 2019). This confusion has been recognised in similar arid environments when producing continental mapping products (Gong et al. 2019;Venter and Sydenham 2021), as well as limitations with the Fractional Cover product utilised to determine vegetated and non-vegetated areas (Guerschman et al. 2009;Gill et al. 2017;Hill and Guerschman 2020). Moreover, evidence of temporal climatic variability (e.g. year's with above average rainfall) across the continent have substantial impact on natural terrestrial vegetated area classifications due to greening of sparsely vegetated landscapes (e.g. Figure 3; 2000 and 2011).
Similarly, reduced classification accuracy was recorded for cultivated and managed areas. Evidence of misclassification and confusion is present in tropical areas of the continent, with this attributed in part to the monsoonal seasonal variation of natural vegetation that mimics similar growth patterns observed for cultivated areas (Huete et al. 2006;Poulter and Cramer 2009;Bégué et al. 2018). Notably, the classification was largely correct in areas of cultivation and high land use management (southeast and southwest temperate regions of Australia), suggesting that the machine learning approach utilised in this study has a high classification accuracy in areas of known relevance (i.e. agricultural expanses).

Land cover classifications using EODESM and Living Earth software
The FAO LCCS is a globally relevant taxonomy with semantics that support and align with end-user requirements (Atyeo and Thackway 2006;Kosmidou et al. 2014;Di Gregorio 2016). The EODESM hierarchical approach, fully implemented in the Living Earth software, facilitates direct comparison between map outputs Planque et al. 2020;Owers et al. 2021). The four binary input layers used to generate six land cover classes through a range of methods, including a time-series rule-based approach at Level 1, existing aquatic products for Level 2, and a range of machine learning techniques at Level 3, demonstrate the simplicity and flexibility of EODESM in that it enables tailoring and adapting a range of methods for different thematic classes. The hierarchical classification of LCCS Level 3 means that regardless of the environmental descriptors used to form the binary layer inputs, map comparisons will be valid, as the same taxonomic end classes are used. The approach implemented in this study can be applied broadly, to produce spatially and temporally comparable map outputs; a unique feature compared to other continental scale land cover classifications of Australia, providing a straightforward approach for application regardless of geographical context.
In addition, the hierarchical classification approach is adaptive to technological advances in EO sensors and image processing, ensuring longevity of the standardised cartography of land covers. This is a major limitation of other land cover products, and it affects the life cycle of national and continental-scale maps (Wulder et al. 2018;Li et al. 2020;Venter and Sydenham 2021). As higher accuracy environmental descriptor layers become available for implementations such as DEA Land Cover, they are simply substituted or added into the hierarchical framework to generate the next annual land cover product, where the end class taxonomy and hierarchical implementation provide consistent and comparable outputs to previously generated maps. This is currently not possible for many national and continental-scale land cover products underpinned by reference data to train models (e.g. Inglada et al. 2017;Griffiths, Nendel, and Hostert 2019;Malinowski et al. 2020;Zhang et al. 2020;Calderón-Loor, Hadjikakou, and Bryan 2021;Yang and Huang 2021;Brown et al. 2022). Moreover, as a machine learning model becomes superseded in favour of newer models that produce greater accuracy (e.g. spatial and temporal convolutional neural networks in favour of single pixel-based classifications), this will result in maps that are not standardised or comparable, where end classes are generated directly from model classification. This is likely to limit the use of a particular land cover programme for ongoing use and future development.

Current caveats and future opportunities
While this study has demonstrated the benefits of using the EODESM for generating land cover maps, there are some conceptual limitations which are important to acknowledge. The hierarchical approach of the EODESM means classification accuracy of higher levels (i.e. Level 1 and 2) are inherited by lower levels (i.e. Level 3). For example, if an area in the landscape is naturally bare and incorrectly classified at Level 1 or 2, then correct classification at Level 3 (i.e. differentiation from artificial surfaces) will be irrelevant. This can have additional adverse effects, for instance, if incorrectly classified at Level 1 to primarily vegetated, it may become systematically misinterpreted down the hierarchy and attributed to an incorrect class. If the naturally bare surface is incorrectly classified as natural vegetation at Level 1, then at Level 3 it may have spectral characteristics similar to cultivated and managed areas. The map accuracy statistics may then suggest the cultivated and managed areas machine learning model is an inaccurate classifier. However, this actually represents a misclassification propagated in the hierarchy. This limitation is inherent to hierarchical classifications. However, it demonstrates that implementation of this approach should invest greater resources to ensuring the binary input layer accuracy is greatest at Level 1 and Level 2 before higher accuracies differentiating end classes at Level 3 can be achieved.
The modified FAO LCCS Level 3 framework presented in this paper does not differentiate between natural and artificial waterbodies (such as reservoirs, canals and artificial lakes) nor natural and cultivated aquatic areas (such as rice). This was primarily due to the difficulty of accurate and routine retrieval from EO for these categories based on the FAO LCCS taxonomic definitions (see Di Gregorio 2005;Owers et al. 2021). For example, cultivated aquatic areas in Australia (predominately rice with comparatively high water use efficiency, see Humphreys et al. (2006), Bajwa and Chauhan (2017)) rarely have surface water covering herbaceous vegetation that can be accurately detected from the Landsat sensor data. Instead, land cover characteristics are presented that are spectrally similar to irrigated cultivated and managed areas. Similarly, artificial and natural waterbodies are primarily identified through shape rather than spectral characteristics, with artificial waterbodies having more uniform and rigid extent boundaries. Although many studies have used object-based analysis and machine learning approaches over traditional pixel-based classification, we lack consistent evidence of a robust spatiotemporal techniques that can be applied consistently at a continental scale. Correct and complete identification of these land covers is highly valuable for detecting anthropogenic impacts on landscape change. Future development of these input layers for LCCS level 3 classes would be highly beneficial, and the operational implementation of DEA Land Cover provides the opportunity to develop image processing methods that can be easily integrated into the existing workflow. As the software implementation has been fully designed, interim ancillary products can be easily added within the Living Earth system, such as the Australian Hydrological Geospatial Fabric (Geofabric) to map artificial hydrological features (Atkinson et al. 2008). Further opportunities exist to generate additional environmental descriptors for LCCS Level 4 (additional descriptors for each Level 3 class mapped in this study) that provide detailed land cover descriptions for end users .

Application of DEA Land Cover
This study presents a standardised approach to operational implementation of the FAO LCCS Level 3 at national and continental scales. To ensure robust and routinely producible land cover maps, we employ an open-source approach underpinned by the ODC platform, using free and accessible Landsat sensor data (Woodcock et al. 2008) and guided by the FAIR principles of software development. All DEA Land Cover products and metadata are available through DEA Maps (https:// maps.dea.ga.gov.au). For DEA Land Cover, two important functions can be implemented in the short to medium term that will enhance ongoing capacity to map and monitor landscape change over the entire Australian continent.
(1) DEA Land Cover provides a nationally consistent picture, with the six land cover classes identifying broad land cover categories that are consistent over time and can be compared to detect change. This is highly valuable for retrospective identification of landscape changes over the past 30 years (using the Landsat archive) and into the future (using Landsat and Sentinel sensors), without caveats of jurisdictional boundaries or interoperable land cover taxonomies. Landscape change is a particularly valuable metric derived from multi-temporal land cover maps, and has been demonstrated using the EODESM, where 64 categories of LCCS Level 3 change are identified (Lucas et al. , 2022. This approach could be successfully implemented and extended using the DEA Land Cover products. (2) The LCCS Level 3 output maps provide the foundation for implementing the FAO LCCS Level 4, an attribution-based thematic classification (Di Gregorio 2005Gregorio , 2016. Delivering a robust Level 3 classification is crucial to providing high confidence in the Level 4 attribution. Level 4 provides more detail to each Level 3 category as available (e.g. canopy cover, water persistence, crop typesee example from Planque et al. 2021) and has been systemically implemented and optimised for EO data . Existing continental products could be used to derive Level 4 attributes and integrated seamlessly with DEA Land Cover.
DEA Land Cover products have been developed to support national and international reporting that require land cover and land cover change information for routine reporting, such as Sustainable Development Goals (SDGs) and the System of Environmental Economic Accounting (SEEA). EO capacity to support SDGs has been well established (e.g. Paganini et al. 2018;EO4SDGs 2020;Estoque 2020), where maps that produce information for action are required, rather than simply satellite ARD (Metternicht, Mueller, and Lucas 2020;Mohamed-Ghouse et al. 2021). LCCSbased land cover maps can support several SDGs (see Metternicht, Mueller, and Lucas 2020;Owers et al. 2021) and similarly, routine generation of land cover provides consistent and comparable quantification of landscape change for national land and environment accounts. This is particularly useful to address discrepancies across reporting mechanisms, by utilising the same routinely generated national land cover product, and following the principles of good data and information management; 'collect once, use many times' (IGGI 2005).

Conclusion
DEA Land Cover provides annual continental-scale land cover maps for Australia and its territories using Landsat sensor data from 1988 to 2020. This is generated at a national operational level through Digital Earth Australia using the ODC platform. This study presents a standardised and repeatable approach to land cover mapping utilising an internationally recognised taxonomy in the FAO LCCS, implemented using the open-source Living Earth system that has been optimised for EO data. We developed and evaluated the approach on two calendar years (2010 and 2015) representing the range of climatic conditions experienced on the Australian continent, achieving an overall map accuracy of 80%. The hierarchical structure implemented in FAO LCCS ensures the longevity of the land cover taxonomy, where resources can focus on generating high quality environmental descriptors to be utilised for producing layers that do not alter taxonomic class outputs. Moreover, this study provides key land cover classes (LCCS Level 3) that encourage generation of environmental descriptors for further attribution to landscape descriptions (LCCS Level 4). Map outputs presented here are important for systematic identification of landscape change at scales ranging from the continental to sub-national and can therefore provide foundational measurements required for reporting on national and international policies and environmental accounts.