Towards a comprehensive and consistent global aquatic land cover characterization framework addressing multiple user needs

Aquatic land cover represents the land cover type that is significantly influenced by the presence of water over an extensive part of a year. Monitoring global aquatic land cover types plays an essential role in preserving aquatic ecosystems and maintaining the ecosystem services they provide for humans, while at the same time their accurate and consistent monitoring for multiple purposes (e.g. climate modelling, biodiversity conservation, water resource management) remains challenging. Although a number of global aquatic land cover (GALC) datasets are available for use to monitor aquatic ecosystems, there are prominent variabilities among these datasets, which are primarily caused by the inconsistency between different land versus water-related monitoring approaches and characterization schemes. As aquatic land cover exists in many different forms on Earth (e.g. wetland, open water) and can be mapped by different approaches, it is necessary to consider a much more consistent and comprehensive characterization framework that not only ensures the consistency in the monitoring of aquatic land cover but also serves the needs of multiple users (e.g. climate users, agricultural users) interested in different aspects of aquatic lands. In this study, we addressed this issue by 1) reviewing 33 GALC datasets and user needs identified from the citing papers of current datasets and international conventions, policies and agreements in relation to aquatic ecosystems, 2) proposing a global characterization framework for aquatic land cover based on the Land Cover Classification System (LCCS) classifier principles and the identified user needs, and 3) highlighting the opportunities and challenges provided by remote sensing techniques for the implementation of the proposed framework. Results show that users require or prefer various kinds of information on aquatic types including vegetation type, water persistence, the artificiality of cover (i.e. artificial vs natural), water salinity, and the accessibility to the sea (i.e. coastal vs inland). Datasets with medium to high spatial resolution, intra-annual dynamics and inter-annual changes are needed by many users. However, none of the existing datasets can meet all these requirements, and most GALC datasets lack a rigorous quantitative accuracy assessment of their quality. The proposed framework has three levels and users are allowed to derive their aquatic land cover types of interest by combining different levels and classifiers of information. This comprehensive mapping framework can help to bridge the gap between user needs and current GALC datasets as well as the gap between generic and aquatic land cover monitoring. The implementation of the framework can benefit from evolving satellite-data availability, improved computation capability and open-source machine learning algorithms, although at the same time it faces challenges mainly coming from the complexity of aquatic ecosystems. The framework proposed in this study provides insights for future operational aquatic land cover monitoring initiatives and will support better understanding and monitoring of complex aquatic ecosystems.


Background
The presence of water on Earth has a significant influence on land surfaces and ecosystems. Land cover types that exist in terrestrial areas, such as bare lands, croplands, grasslands, shrubs, or trees, can also be present in aquatic environments. As the water table may vary during a year, land surfaces could be regularly or permanently flooded with an extensive period of water presence. Depending on the inundation frequency of different types of land surfaces, a variety of water-related land covers and ecosystems have been formed, for example, open water (permanent water bodies), mangroves (permanently flooded tree vegetation), rice paddies (regularly flooded cultivated vegetation), and mudflats (regularly flooded bare lands). These land cover types share a common characteristic: water is the dominant factor determining their formation, soil development or the type of plant communities living on their surface. The ISO-certified United Nations Land Cover Classification System (LCCS; Di Gregorio, 2005) refers to these land cover types as aquatic land cover where the environment is significantly influenced by the presence of water over an extensive period of the year. This study follows the LCCS definition and uses "aquatic land cover" to refer to water-related land cover types, whereas open ocean and snow/ice are excluded. Wetland is also a typical aquatic ecosystem, and the interplay among its three key components, hydrology, soil and vegetation (Mitsch and Gosselink, 2007), means that wetland is not a uniform land cover type but comprises diverse aquatic land cover types (Gallant, 2015).
Aquatic land cover types provide many valuable ecosystem services such as water and food supply, flood mitigation, water purification, coastal protection, and increasingly tourism and recreation (Gardner and Finlayson, 2018; Millennium Ecosystem Assessment, 2005). Despite their importance, some essential aquatic ecosystems (such as surface water and wetlands) are reported to have suffered great degradation and loss globally in the past decades (Gardner and Finlayson, 2018; Nel et al., 2009; Pekel et al., 2016). The Sustainable Development Goal (SDG) 6 specifically pointed out the significance of protecting and restoring water-related ecosystems by 2020. Mapping aquatic land cover globally is therefore very important for gaining knowledge on its status and it has recently received renewed interest, particularly in the context of global climate change (Arnell, 1999).

Global mapping of aquatic land cover based on remote sensing
Observations from remote sensing (RS) platforms can provide continuous, non-invasive and spatially explicit data over large areas. They have thus become the most effective way to monitor land cover globally and are increasingly evolving into operational global land monitoring systems (Buchhorn et al., 2020; Herold et al., 2016; Rebelo et al., 2009). The capability of RS technology for aquatic land cover observation has moved forward with the development of new satellite archives such as the Copernicus programme's Sentinel Constellation (Berger et al., 2012; Mora et al., 2014) and cloud computing platforms such as the Google Earth Engine (Gorelick et al., 2017). Although aquatic land cover is different from most terrestrial land cover because of the presence of water, there has been no universally applicable classification scheme to describe aquatic land cover types and RS map producers have developed different products to characterize aquatic land covers according to their own understanding and application purposes.
Aquatic land covers are often mapped by global land cover (GLC) products, but they are represented by very limited classes; for instance, the high spatial resolution GlobeLand30 includes water bodies and wetlands as aquatic land covers (excluding open ocean and snow/ice). The spatial distribution and extent of aquatic land covers, especially wetlands, usually vary a lot among these different products (Nakaegawa, 2012). One of the reasons for the inconsistency in aquatic types and their distribution lies in the fact that different datasets adopt different classification schemes (Amler et al., 2015; Hu et al., 2017a; Nakaegawa, 2012). Unlike the aforementioned GlobeLand30, the global land cover database for the year 2000 (GLC2000; Bartholome and Belward, 2005) uses four types to define aquatic land covers, namely (1) tree cover, regularly flooded, fresh and brackish water, (2) tree cover, regularly flooded, saline water, (3) regularly flooded shrub and/or herbaceous cover and (4) water bodies. These differing interpretations among GLC products have directly resulted in the disagreement of spatial distribution and areal statistics of aquatic land covers.
Apart from GLC products, there are also specific global aquatic land cover (GALC) datasets. One group of these datasets delineates the general extent of aquatic land covers, such as the Global Inundation Extent from Multi-Satellites (GIEMS; Prigent et al., 2007) and the global surface water extent dataset (Papa et al., 2010) that captures but does not discriminate among inundated wetlands, rivers, small lakes and irrigated agriculture. The second group of specific GALC datasets is narrowed down to a single type, for instance, global mangroves (Giri et al., 2011), global saltmarshes (McOwen et al., 2017), or global lakes (Messager et al., 2016). The third group of specific GALC datasets contains multiple types such as the Global Lakes and Wetlands Database (GLWD; Lehner and Döll, 2004) level-3 dataset, which has 12 aquatic land cover types covering both vegetated (e.g. freshwater marsh, swamp forest) and non-vegetated wetlands (e.g. lake, reservoir, river). Although these datasets are more comprehensive than GLC products and datasets with a single type, they are still confronted with the issue of inconsistent classification schemes and varying spatial distribution and extent of aquatic land covers (Zhang et al., 2017). As a result, it is necessary to come up with a consistent characterization framework to describe different aquatic land cover types.
As aquatic ecosystems are essential to almost every aspect of human life, GALC datasets have attracted a large number of users from different fields. Depending on the purpose of the application, users of GALC datasets may require different thematic information. For example, climate modellers apply GALC datasets, specifically wetland datasets, to evaluate methane emissions and the information they need is natural vegetated wetlands with anaerobic conditions to produce methane, such as bogs, fens, and flooded swamps (Matthews and Fung, 1987), while for hydrological modellers surface water and its dynamics are key focuses of their models (Luo et al., 2017). Users in the agricultural management domain may apply GALC datasets in irrigation water management and thus the information about aquatic croplands (e.g. rice paddy) and freshwater is preferred by them (Zohaib et al., 2019). Apparently, users from different fields have a different focus on the characteristics of aquatic land cover types. Some of them care about the vegetation type, while some others care more about water dynamics. A full understanding of user requirements is beneficial for any mapping purpose, while the investigation of user needs towards aquatic land cover mapping has not been achieved yet.
Since aquatic land covers exist in many different forms on Earth and can be characterized by different mapping purposes and approaches, it is necessary to consider a consistent and comprehensive characterization framework that ensures the consistency of the understanding of aquatic land cover types and serves the needs of multiple users interested in different aspects of aquatic land covers. Although many countries have their national classification systems, such as the Cowardin et al. (1979) classification system adopted by the US National Wetlands Inventory (NWI) and the Canadian wetland classification system (Warner and Rubec, 1997), these nation-wide systems have limitations in representing the wetland types in their own countries (e.g. the Cowardin et al. classification system was revised by the Federal Geographic Data Committee in 2013 for mapping US wetlands, Tiner et al., 2015), let alone in global-scale classifications. Up to now, the most widely used global wetland inventory system is defined by the Ramsar Convention on Wetlands (Matthews, 1993). However, this conservation-based classification system has been criticized as being too broad (Amler et al., 2015) for RS-based mapping, as such level of detail (e.g. freshwater springs, seasonal streams or creeks) is beyond what satellite sensors can deliver (Congalton et al., 2014).
It has been agreed that a flexible structure of a classification framework is preferred for future global wetland datasets (Hu et al., 2017a) and harmonization efforts of classification schemes have already taken effect in global land cover monitoring (Herold et al., 2008). The LCCS (Di Gregorio, 2005) aims to ensure the comprehensiveness, consistency and flexibility of classification schemes (Herold et al., 2009; Mora et al., 2014) and it was designed to serve the needs of different user communities. LCCS defines land cover types according to a series of pre-identified classifiers (Bartholome and Belward, 2005), making it easy for the developed classification system to be tailored for different applications, such as forest monitoring, biodiversity conservation, and climate modelling (Tsendbazar et al., 2015). Global land cover monitoring is becoming increasingly operational and the recently launched Copernicus Global Land Service fully adopted the LCCS approach (Szantoi et al., 2020) and considered the connection of land types and water dynamics (Buchhorn et al., 2020), i.e. permanent water bodies and temporary water bodies are added to the classification scheme, but the representation of further aquatic land cover characteristics (e.g. vegetation) has remained limited. In this study, we intend to go further by proposing an aquatic land cover classification scheme that addresses different aspects of aquatic land cover characteristics using the LCCS approach.

Objectives
With the aim of coming up with a consistent and comprehensive global aquatic land cover characterization framework addressing multiple user requirements, this paper addresses four questions:
(1) What is available currently? Here we provide an overview and synthesis of the thematic, spatial, and temporal characteristics of existing GALC datasets.
(2) What is needed by users? A comprehensive and updated user analysis is conducted, and we summarize user needs to capture the variety of requirements and specifications for GALC datasets.
(3) How can we conceptually characterize aquatic land cover types in a consistent way? Based on the understanding of current datasets and evolving user needs, we propose a novel aquatic land cover characterization framework building upon the LCCS approach.
(4) How to integrate all the types in the proposed framework with remote sensing? For putting the novel framework into practice, we review recent Earth Observation developments and assess the feasibility in implementing the framework building on existing and evolving remote sensing capabilities.
With these four objectives, we are developing a comprehensive approach for improving global aquatic land cover monitoring considering the limitations of available datasets, refined user requirements and evolving remote sensing capabilities.

Data and methods
In order to come up with a consistent and comprehensive characterization framework towards aquatic land covers, we first evaluated the thematic (i.e. land cover types), spatial (i.e. spatial resolution) and temporal (i.e. temporal frequency) characteristics of available GALC datasets. Then, major user groups and user needs were identified by analysing international conventions, policies, and agreements in relation to aquatic ecosystems as well as the papers that cite each dataset, i.e. citing papers of current GALC datasets. Based on the user required information on aquatic land cover types and characteristics, the global aquatic land cover characterization framework was proposed applying the LCCS approach. Finally, the feasibility of RS capabilities in achieving the proposed framework was analysed. Fig. 1 summarizes the main steps taken for this study. The following subsections provide details on these steps.

Global aquatic land cover datasets
A total of 33 GALC datasets published until 2019 were reviewed in this study (Table 1) and these datasets were divided into four groups: (1) Inundation/Extent datasets not including detailed aquatic land cover classification types, but only serving as a baseline of aquatic areas like a general delineation of inundated areas, (2) Global Land Cover (GLC) datasets that contain many land cover classes, but only a limited number of land cover classification types are related to water, (3) Single-type GALC datasets which comprise only one type of aquatic land cover, and (4) Multi-type GALC datasets that have various aquatic land cover classification types.
To understand the thematic, spatial, and temporal characteristics of current GALC datasets, we collected information about the aquatic land cover class, spatial resolution, and temporal frequency of each dataset (Table 1). The richness of thematic categories of each dataset was scored on five aspects with respect to its information on vegetated vs non-vegetated cover, permanent vs temporal/waterlogged cover, natural vs artificial cover, inland vs coastal cover, and freshwater vs brackish/saline water. Score 2 was assigned to an aspect if the dataset has both types of cover (e.g. both vegetated and non-vegetated types), score 1 was assigned if the dataset has only one type of cover (e.g. only vegetated cover), and score 0 was assigned if the dataset has no information on this aspect (e.g. no information on vegetation type).
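The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the aspect keys, the cover-type labels and the example dataset are hypothetical names chosen here, not values taken from Table 1 or Table 3.

```python
# The five thematic aspects named in the text, each with its two cover types.
# Key names and labels are illustrative assumptions, not identifiers from the study.
ASPECTS = {
    "vegetation": ("vegetated", "non-vegetated"),
    "persistence": ("permanent", "temporal/waterlogged"),
    "artificiality": ("natural", "artificial"),
    "location": ("inland", "coastal"),
    "salinity": ("freshwater", "brackish/saline"),
}


def richness_scores(dataset_types):
    """Score each aspect: 2 if both cover types are present, 1 if one, 0 if none."""
    scores = {}
    for aspect, cover_types in ASPECTS.items():
        present = dataset_types.get(aspect, set())
        scores[aspect] = sum(1 for cover in cover_types if cover in present)
    return scores


# Hypothetical single-type product that maps only coastal, saline, vegetated cover.
example = {
    "vegetation": {"vegetated"},
    "location": {"coastal"},
    "salinity": {"brackish/saline"},
}
print(richness_scores(example))
```

Under these assumptions the example receives a score of 1 for vegetation, location and salinity, and 0 for the two aspects on which it carries no information.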
The quality of each dataset was assessed based on the result of accuracy assessments found in the published literature. As there exist prominent variabilities among the completeness of the validation of each dataset (i.e. some datasets were validated using independent reference samples, while some datasets were not validated at all), we adopted the Committee on Earth Observation Satellites (CEOS) validation stage hierarchy (Land Product Validation Subgroup, 2003) to show the validation status of each dataset. Five stages (0-4) were defined according to the CEOS land product validation hierarchy, where Stage 0 indicates no validation. At Stage 1, the accuracy of the product is evaluated from a small (typically < 30) set of locations and time periods by comparison with in-situ or other suitable reference data. At Stage 2, the product accuracy is assessed over a significant (typically > 30) set of locations and time periods and, at the same time, the spatial and temporal consistency of the product is evaluated over globally representative locations and time periods. Stage 3 extends Stage 2 to the global scale. At Stage 4, validation results for Stage 3 are systematically and regularly updated when new products are released. The detailed result on the review of accuracy assessments of each dataset is presented in the supplemental file (Table S1).

Evaluation of user groups
During the past decades, many international conventions, policies, and agreements have been established for the wise use of aquatic ecosystems (e.g. Ramsar Convention on wetlands, Davidson, 2016), biodiversity conservation (e.g. Aichi Biodiversity Targets, Convention on Biological Diversity, 2018), sustainable development (e.g. Sustainable Development Goals, United Nations, 2015), land management (e.g. Land Degradation Neutrality, IUCN et al., 2015), climate change mitigation (e.g. Paris Agreement, FCCC, 2015), and disaster risk reduction (e.g. Sendai Framework for Disaster Risk Reduction, Aitsi-Selmi et al., 2015). Most of them are either directly or indirectly linked to aquatic land covers, which makes them potential users of GALC products. In an attempt to make the proposed aquatic land cover characterization framework globally applicable, we focused on eight international conventions, policies, and agreements (Table 2), which have been established and implemented by working with a diverse global network of partners including national governments (Ramsar Convention Secretariat, 2010) and international or national non-governmental organizations (Sustainable Brands, 2018). Details about the targets and goals of these conventions, policies, and agreements are shown in Table S2 of the supplemental file.
Apart from the international conventions, policies, and agreements in relation to aquatic ecosystems, the citing papers are also a good source to find potential users and user needs or preferences. In this study, we used the Science Citation Index Extended (SCIE) database from Web of Science, which covers high-quality peer-reviewed publications for the citation analysis. Statistics on the Web of Science Categories (Clarivate Analytics, 2019) of the citing papers were generated. According to the most frequently cited research categories, we could find potential user groups. Based on the identified research areas of citing papers as well as the selected international conventions, policies, and agreements, we finally generalized the major user groups of GALC datasets.
According to Fig. 2, about 50% (16 out of 33) of the GALC datasets reviewed in this study were produced after 2014, which indicates that users have had more choices among a variety of GALC datasets since then. To avoid a biased statistic (because older datasets may have more citations than new datasets), in this study only the citing papers of each dataset between 2015 and 2019 were analysed. In addition, we consider scientists and experts in the field of remote sensing, computer science, and imaging science as map producers who aim to improve the map quality and we did not include this group as the targeted users of GALC datasets. Furthermore, only papers with the document type of "Article" were counted because they present full information on original research. To sum up, the citing papers were pruned based on the following criteria:
(1) Refine the papers published between 01/01/2015 and 31/12/2019.
(2) Exclude papers in the areas of remote sensing, computer science, and imaging science.
(3) Refine the document type to "Article" papers.
For the citing papers of GLC datasets, we focused only on water-related studies, so the inquiry was refined using the keyword "water* OR wetland* OR aquatic OR flooded OR inundated". A total of 3151 papers were reviewed for the 33 GALC datasets (Table S3 in the supplemental file).
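As an illustration only, the three pruning criteria and the keyword refinement could be expressed as a filter over citation records. The record fields (`published`, `doc_type`, `areas`, `text`) and the regular expression that approximates the wildcard query are assumptions made for this sketch; the actual refinement was carried out in Web of Science.

```python
import re
from datetime import date

# Research areas treated as map producers rather than users (criterion 2).
EXCLUDED_AREAS = ("remote sensing", "computer science", "imaging science")

# Approximation of "water* OR wetland* OR aquatic OR flooded OR inundated".
WATER_PATTERN = re.compile(
    r"\b(water\w*|wetland\w*|aquatic|flooded|inundated)\b", re.IGNORECASE
)


def keep_paper(paper, cites_glc_dataset=False):
    """Return True if a citing paper passes the pruning criteria.

    `paper` is a hypothetical dict with keys 'published' (date),
    'doc_type' (str), 'areas' (list of str) and 'text' (str).
    """
    in_window = date(2015, 1, 1) <= paper["published"] <= date(2019, 12, 31)
    is_article = paper["doc_type"] == "Article"
    is_producer = any(
        excluded in area.lower()
        for area in paper["areas"]
        for excluded in EXCLUDED_AREAS
    )
    if not (in_window and is_article) or is_producer:
        return False
    # Citing papers of GLC datasets must additionally be water-related.
    if cites_glc_dataset:
        return bool(WATER_PATTERN.search(paper["text"]))
    return True


paper = {
    "published": date(2017, 6, 1),
    "doc_type": "Article",
    "areas": ["Ecology"],
    "text": "Wetland loss and waterbird decline",
}
print(keep_paper(paper, cites_glc_dataset=True))
```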

Assessment of user demands towards global aquatic land cover datasets
In this study, the user needs towards aquatic land covers are derived from two parts: 1) direct user needs identified from international conventions, policies, and agreements, and 2) users' preferences and uptakes of aquatic land cover information summarized from the citing papers of existing datasets. The information obtained from the content of international conventions, policies, and agreements reflects users' requirements on specific types, while the citation of a GALC dataset represents a broad overview of users' preferences towards the general features and characteristics of aquatic land cover. Although a dataset might be cited but not used by the user, we assume that if a GALC dataset is frequently cited by a specific user group, then, to a large extent, the information contained in this dataset has attracted the interest of this user group.
The goals, targets, indicators, articles, priorities, or variables (Table S2 in the supplemental file) of the international conventions, policies, and agreements with respect to aquatic land covers were reviewed to collect the information they cared about, including the aquatic land cover extent, thematic aquatic land cover types, spatial resolution of data, intra-annual land cover dynamics and inter-annual land cover changes. For example, according to Target 11 of the Aichi Biodiversity Targets, which states that "By 2020, at least 17 percent of terrestrial and inland water, and 10 percent of coastal and marine areas, …, and integrated into the wider landscapes and seascapes", we are able to conclude that biodiversity researchers need information about inland water bodies and coastal/marine wetlands as well as the extent of these classes. The detailed contents of these conventions, policies, and agreements are listed in Table S2 of the supplemental file.
In order to know how each dataset was cited by different user groups, an intensive interpretation was done to assign each citing paper to a specific user group. The title, abstract and keywords were assessed to determine which user group a paper belonged to. For papers whose user group was not clear from the title, abstract and keywords, we further checked the full paper. After this procedure, the number of citations of each dataset by each user group was acquired (Fig. 4). Based on this statistic, we further analysed users' preference and uptake of the thematic, spatial, and temporal characteristics of GALC datasets. According to the most frequently cited datasets by each user group and their use cases, we summarized the general thematic characteristics of aquatic land cover preferred by each user group and then translated them into the LCCS language, i.e. classifiers. The spatial and temporal resolution of users' preference and uptake was evaluated based on the datasets cited by each user group, and the cited datasets were divided into five spatial ranges, namely > 1 km, 500 m-1 km, 100-500 m, 30-100 m and ≤ 30 m, and four temporal ranges, namely daily, monthly, yearly and static.
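The grouping of cited datasets into the five spatial ranges can be sketched as follows. The text does not say to which range a value on a shared boundary (e.g. exactly 100 m) belongs; this sketch assumes that each range includes its upper bound, which is consistent with the "≤ 30 m" and "> 1 km" end ranges. The citation records in the example are hypothetical.

```python
from collections import Counter


def spatial_range(resolution_m):
    """Assign a spatial resolution in metres to one of the five ranges.

    Assumption: every range includes its upper bound, so 100 m falls in
    '30-100 m' and 1000 m falls in '500 m-1 km'.
    """
    if resolution_m <= 0:
        raise ValueError("resolution must be positive")
    if resolution_m <= 30:
        return "<= 30 m"
    if resolution_m <= 100:
        return "30-100 m"
    if resolution_m <= 500:
        return "100-500 m"
    if resolution_m <= 1000:
        return "500 m-1 km"
    return "> 1 km"


# Hypothetical (resolution in metres, citation count) pairs for one user group.
cited = [(30, 120), (300, 45), (1000, 200), (25000, 15)]
citations_per_range = Counter()
for resolution_m, n_citations in cited:
    citations_per_range[spatial_range(resolution_m)] += n_citations
print(dict(citations_per_range))
```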

LCCS-based aquatic land cover characterization framework
The LCCS has been developed as a comprehensive and standardized classification system specifically for mapping purposes (Mora et al., 2014). Land cover classes are created at different levels by the combination of a set of independent diagnostic attributes that are called classifiers. According to Di Gregorio (2005), the classification system developed based on the LCCS approach is: 1) comprehensive covering all possible combinations of classifiers; 2) capable of meeting the needs of a variety of users; 3) scale-independent that can be used at different scales and at different levels of detail; 4) with clear class boundary definitions and internal class consistency. In this study, we adopted the LCCS approach to build the aquatic land cover characterization framework.
The classification with LCCS comprises a dichotomous phase and a modular-hierarchical phase (Di Gregorio, 2005). The dichotomous phase starts with three initially pre-defined classifiers, namely the presence of vegetation (designed for the differentiation between vegetated and non-vegetated land cover types), the edaphic condition (designed for the differentiation between terrestrial and aquatic types), and the artificiality of cover (designed for the differentiation between artificial and natural land cover types) (Di Gregorio, 2005). Developers are allowed to add other classifiers or attributes at different levels of the classification according to their own application purposes. Since we only focus on aquatic land cover types in this study, the edaphic condition classifier was not used here. Instead, we adopted several other classifiers according to the identified user needs of the thematic aquatic land cover types. In the modular-hierarchical phase, land cover types were further specified by another set of pre-defined classifiers. For example, the vegetated types derived from the presence of vegetation classifier in the dichotomous phase can be separated into trees, shrubs, and herbaceous cover by the life form classifier. To derive their classes of interest, users are required to start with the pure land cover classifiers defined in the dichotomous phase and stop at the level where they can derive the details they need. In this study, according to the identified user demands on aquatic land cover types, we split the required thematic information into different levels and finally developed a hierarchical aquatic land cover characterization framework (Table 9).
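To make the classifier principle concrete, the sketch below treats each classifier as an independent attribute and a class as a combination of classifier values. The classifier names and value labels are simplified assumptions based on the user-required information listed in the abstract; the assignment of classifiers to levels in Table 9 is not reproduced here.

```python
from itertools import product

# Simplified, illustrative classifiers; names and values are assumptions.
CLASSIFIERS = {
    "presence_of_vegetation": ("vegetated", "non-vegetated"),
    "water_persistence": ("permanent", "temporary"),
    "artificiality": ("natural", "artificial"),
    "water_salinity": ("fresh", "saline"),
    "accessibility_to_sea": ("inland", "coastal"),
}


def define_class(**choices):
    """Define a class from any subset of classifiers, kept in canonical order."""
    for name, value in choices.items():
        if name not in CLASSIFIERS:
            raise KeyError(f"unknown classifier: {name}")
        if value not in CLASSIFIERS[name]:
            raise ValueError(f"invalid value for {name}: {value}")
    return tuple((name, choices[name]) for name in CLASSIFIERS if name in choices)


def all_classes(classifier_names):
    """Enumerate every combination of values for the selected classifiers."""
    names = [name for name in CLASSIFIERS if name in classifier_names]
    value_sets = (CLASSIFIERS[name] for name in names)
    return [tuple(zip(names, combo)) for combo in product(*value_sets)]


# A user interested only in cultivated aquatic vegetation can stop early.
aquatic_cropland = define_class(
    presence_of_vegetation="vegetated", artificiality="artificial"
)
print(aquatic_cropland)
print(len(all_classes(CLASSIFIERS)))
```

Because the classifiers are independent, a user who needs less detail simply specifies fewer of them, while specifying all of them enumerates every possible class, which reflects the comprehensiveness property stated by Di Gregorio (2005).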

Characteristics of global aquatic land cover datasets
The assessment of the richness of thematic information (i.e. vegetated vs non-vegetated cover, permanent vs temporal/waterlogged cover, natural vs artificial cover, inland vs coastal cover, and freshwater vs brackish/saline water) of each dataset is shown in Table 3. In general, multi-type GALC datasets and GLC datasets are more comprehensive than the inundation/extent datasets and the single-type GALC datasets. However, none of these datasets can be completely filled by all the five aspects of thematic information (Table 3).
The inundation/extent products were scored as 0 for all the five aspects of information because they do not contain any detailed information on land cover classification types, which means the inundation/extent products can only serve as a proxy of aquatic areas. Among the eleven single-type GALC datasets, five of them are water-only products and six of them are vegetation datasets. Only two datasets (i.e. GSW and G3WBM) provide information on water seasonality. Few of the single-type GALC datasets give more useful information on water salinity and artificiality of cover. The four multi-type GALC datasets cover both vegetated and non-vegetated types while giving only partial information on the other four aspects. Among the four multi-type datasets, GLWD is the most comprehensive one containing information about not only vegetated and non-vegetated types, but also human-made types, saline wetlands, and coastal wetlands. Some of the GLC datasets (e.g. GLC2000, Land Cover CCI, GLOBCOVER) are more comprehensive in terms of the information on vegetation because they indicate the specific life form of trees, shrubs, or herbaceous cover. Many GLC datasets also provide information about water salinity (e.g. Land Cover CCI) and water seasonality (e.g. GLOBCOVER), but the information on artificial vs natural cover and inland vs coastal cover is still lacking among all the GLC datasets.
In general, the single-type GALC datasets tend to be finer than the other three groups of datasets and about 91% (10 out of 11) of the single-type datasets have resolutions ≤100 m (Table 4). Half (3 out of 6) of the inundation/extent products are coarser than 1 km (Table 4), among which the GIEMS dataset and the GSWE dataset developed by Papa et al. (2010) have a spatial resolution of 0.25° (~28 km) and 25 km, respectively. The multi-type GALC datasets are even coarser than the inundation/extent products and they normally have a spatial resolution coarser than 0.5° (~55 km). The most comprehensive GLWD dataset is also the finest among the four multi-type datasets with a spatial resolution of 1 km. The spatial resolution of GLC datasets ranges from 30 m to 1 km, and two of the GLC datasets have a fine spatial resolution of 30 m, namely FROM-GLC and GlobeLand30.
The majority of GLC datasets (83%), as well as single-type (82%) and multi-type (75%) GALC datasets, are static, while the inundation/extent products tend to be more dynamic: 67% of these datasets have a daily or monthly frequency (Table 4), and most of these products also track inundated areas over a long period. Among the single-type GALC products, GSW is the only monthly dynamic dataset that covers 32 years of surface water changes (Pekel et al., 2016) and CGMFC-21 is the only yearly map reporting the extent changes of global mangrove forests (Hamilton and Casey, 2016). Two GLC datasets (i.e. Land Cover CCI and CGLS-LC100) are yearly updated to provide dynamic and long-term monitoring of the status and evolution of the land surface. Among the multi-type GALC datasets, only one dataset (i.e. SWAMPS-GLWD) provides intra-annual dynamics with a monthly frequency.
There is a clear gap in the quality assessment levels between GLC products and the other three groups of GALC datasets. As a statistically robust GALC validation dataset is not available, most of the inundation/extent datasets as well as the single-type and multi-type GALC datasets were mainly assessed by qualitative comparison with previously published water-related datasets. According to the CEOS land product validation stage hierarchy, around 81% (17 out of 21) of these three groups of GALC datasets were under validation Stage 1, and only two datasets (i.e. GIW and GRIPC) reached validation Stage 2. Most GLC products were well validated based on independent reference data and the product accuracy was systematically reported (Table S1), reaching CEOS land product validation Stage 3. Although it is hard to determine the quality of the datasets without a rigorous quantitative accuracy assessment, the qualitative assessment gives some useful information on these datasets. For example, from the comparison it is clear that GIEMS missed many small water bodies in densely forested regions in comparison to the IGBP-DISCover dataset (Prigent et al., 2007). The GLWD level-3 dataset tends to overestimate tropical peatland extents compared with the PEATMAP (Xu et al., 2018). According to the reported accuracy of GLC products, the classification of water bodies achieves relatively high accuracies (generally > 80%), while temporarily flooded vegetated types in GLC datasets are poorly mapped (Table S1). For instance, the producer's and user's accuracy of marshlands in the FROM-GLC dataset is 11.48% and 24.82%, respectively, making it less feasible to be used in further studies.
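The approximate kilometre equivalents of the angular resolutions quoted above can be checked with a short calculation. The sketch assumes a spherical Earth with the WGS84 equatorial radius and measures the east-west extent of a grid cell, so the values hold at the equator and shrink with the cosine of latitude.

```python
import math

EARTH_EQUATORIAL_RADIUS_KM = 6378.137  # WGS84 semi-major axis


def degrees_to_km(degrees, latitude_deg=0.0):
    """Approximate east-west ground distance of an angular resolution.

    Assumes a sphere of equatorial radius; the distance scales with
    cos(latitude), so cells are narrower away from the equator.
    """
    km_per_degree = 2 * math.pi * EARTH_EQUATORIAL_RADIUS_KM / 360
    return degrees * km_per_degree * math.cos(math.radians(latitude_deg))


print(round(degrees_to_km(0.25), 1))      # 27.8
print(round(degrees_to_km(0.5), 1))       # 55.7
print(round(degrees_to_km(0.25, 60), 1))  # 13.9
```

At the equator this gives roughly 28 km for 0.25° and roughly 56 km for 0.5°, in line with the rounded values in the text; at 60° latitude the same 0.25° cell is only about half as wide.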
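For readers less familiar with the two accuracy measures quoted above, the sketch below shows how producer's and user's accuracy are conventionally derived from a confusion matrix. The matrix values are hypothetical and are not taken from any of the reviewed datasets.

```python
def class_accuracies(confusion, class_index):
    """Return (producer's accuracy, user's accuracy) for one class.

    confusion[i][j] is the number of samples whose reference class is i
    and whose mapped class is j.
    """
    correct = confusion[class_index][class_index]
    reference_total = sum(confusion[class_index])             # row sum
    mapped_total = sum(row[class_index] for row in confusion)  # column sum
    producers = correct / reference_total if reference_total else float("nan")
    users = correct / mapped_total if mapped_total else float("nan")
    return producers, users


# Hypothetical two-class example: index 0 = water, index 1 = marshland.
confusion = [
    [90, 10],  # reference water: 90 mapped as water, 10 as marshland
    [30, 20],  # reference marshland: 30 mapped as water, 20 as marshland
]
water = class_accuracies(confusion, 0)
marshland = class_accuracies(confusion, 1)
print(water, marshland)
```

In this made-up example water reaches a producer's accuracy of 0.90 and a user's accuracy of 0.75, while marshland reaches only 0.40 and about 0.67, mirroring the pattern that open water is mapped more reliably than flooded vegetation.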

Major user groups
The top 20 Web of Science Categories that the citing papers fall into are shown in Fig. 3. A detailed explanation of each category can be found in the supplemental file (Table S4). These categories cover research about ecological (e.g. Ecology), biological (e.g. Biodiversity Conservation), hydrologic (e.g. Limnology, Oceanography), climatic (e.g. Meteorology Atmospheric Sciences) and agricultural studies (e.g. Agronomy) as well as research about water resource management (e.g. Water Resources), sustainable development (e.g. Green Sustainable Sciences Technology) and land management (e.g. Engineering Civil).
Among the eight international conventions, policies, and agreements reviewed in this study, the Aichi Biodiversity Targets correspond to the "Biodiversity Conservation" category mentioned above. As ecological and biological research are closely related, we put them together in this study to form the ecological/biological user group (Table 5). The Paris Agreement, the 2013 Wetland Supplement to the 2006 IPCC Guidelines as well as the ECVs correspond to climate studies, and together with the "Meteorology Atmospheric Sciences" category, they form the climate user group. The principal idea of the Ramsar Convention on Wetlands and the water-related SDGs is to conserve and sustainably use water and wetland resources, and the aim of the Sendai Framework for Disaster Risk Reduction is likewise to manage aquatic ecosystems sustainably to reduce risks; together with Web of Science Categories such as "Water Resources", "Limnology" and "Green Sustainable Sciences Technology", they form the sustainable water resource management user group. The Land Degradation Neutrality is related to land management, and together with the Web of Science Categories of "Engineering Civil" and "Geography Physical", it forms the land management user group. Considering that agricultural activities have a close relation with aquatic ecosystems and that the Web of Science Category of "Agronomy" reflects that GALC datasets are used in agricultural studies, the agricultural user group is also investigated in this study.

Table 2 The international conventions, policies, and agreements reviewed in this study.

Name: Brief description
Ramsar Convention on Wetlands: An intergovernmental treaty whose mission is the "conservation and wise use of all wetlands through local, regional and national actions and international cooperation, as a contribution towards achieving sustainable development throughout the world".
Sustainable Development Goals (SDGs): SDGs aim to achieve the prosperity of people and the planet through sustainable development. Aquatic land covers are a key aspect in achieving the SDGs through the valuable ecosystem services they provide. SDG 6 aims to protect and restore water-related ecosystems, SDG 15 calls for protecting the inland freshwater ecosystems, and SDG 14 encourages conserving marine areas.
The Sendai Framework for Disaster Risk Reduction: The Sendai Framework aims to prevent new and reduce existing disaster risk. It contains seven targets and four priorities, of which Priorities 3 and 4 advise reducing risks in aquatic areas.
Aichi Biodiversity Targets: The Aichi Biodiversity Targets aim to halt the loss of biodiversity and ensure the resilience of ecosystems. Of the 20 targets, Target 6 emphasizes the sustainable use of aquatic species and Target 7 the management of aquaculture. Target 11 underlines conserving at least 17% of terrestrial and inland water, and 10% of coastal and marine areas by 2020.
Land Degradation Neutrality (LDN): The LDN aims to halt and reverse land degradation and maintain the world's resource of healthy and productive land. Many forms of land degradation are linked to water management, and land degradation directly impacts aquatic land covers such as peatlands, estuaries, and rivers.

P. Xu, et al. Remote Sensing of Environment 250 (2020) 112034
Concluding from the above analysis, we target five major user groups: sustainable water resource management users, ecological/biological users, climate users, land management users, and agriculture users. The main focuses of each user group are listed in Table 5. It should be noted that some of the Web of Science Categories, for example Environmental Sciences and Geosciences Multidisciplinary, are quite broad and may overlap with several different categories. We therefore did not include them in Table 5, but all the citing papers falling into these categories were reviewed in the user demands assessment later on (section 3.2.2).

User demands towards global aquatic land cover datasets
3.2.2.1. Information demanded by international conventions, policies, and agreements
User needs concerning the general extent (i.e. general delineation of aquatic land cover), thematic land cover types, spatial and temporal resolutions of GALC datasets were summarized (Table 6) according to the international conventions, policies, and agreements in relation to aquatic land covers. In Table 6, the intra-annual dynamics correspond to the daily or monthly temporal resolution and the inter-annual changes correspond to the yearly temporal resolution. It should also be noted that the requirement about the spatial resolution is only indicated by the ECVs, while other conventions, policies, and agreements do not specify this information.
As indicated in section 3.2.1, the Ramsar Convention, SDGs, and the Sendai Framework for Disaster Risk Reduction represent sustainable water resource management users. Both the Ramsar Convention and the SDGs need data on the general extent of aquatic areas. The thematic aquatic land cover types wanted by the three international conventions, policies, and agreements cover both non-vegetated (e.g. rivers, lakes) and vegetated types (e.g. flooded forests), inland and coastal wetlands, natural and man-made types as well as saline and freshwater wetlands. The intra-annual dynamics and inter-annual changes are demanded by the Ramsar Convention and SDGs to track the changes of wetlands (Ramsar Convention Secretariat, 2016) and restore aquatic ecosystems (SDG 6).
The Aichi Biodiversity Targets (Convention on Biological Diversity, 2018), representing the ecological/biological user group, require not only a general extent of aquatic land covers but also detailed information about inland water bodies, vegetated types (e.g. natural permanently or regularly flooded forests and aquatic plants), marine/coastal wetlands and aquatic artificial lands (specifically aquaculture and regularly flooded agriculture). Inter-annual land changes are also wanted to evaluate the loss of natural habitats and reduce degradation and fragmentation in aquatic ecosystems.
The Land Degradation Neutrality representing the land management user group is focused on the land affected by desertification and floods; thus, besides the general extent data, the thematic information wanted by this group includes vegetated wetlands and flooded areas. Inter-annual land changes are also required to assess land degradation.
The Paris Agreement and the 2013 Wetland Supplement focus on wetlands serving as sinks and sources of greenhouse gases (GHGs), including peatlands, coastal wetlands (specifically mangroves, marshes and seagrass), inland wetlands (specifically riparian wetlands, forested swamps, marshes, saline and brackish wetlands) as well as artificial wetlands (specifically wastewater management infrastructure and rice paddy). To monitor climate changes, the ECVs need surface water (specifically lakes, rivers, and surface inundation), peatlands, freshwater wetlands, and marine or coastal wetlands (mangrove forest, seagrass bed, etc.). For the purpose of climate change assessment, some ECVs such as the lake extent and surface inundation are supposed to be updated daily (Global Climate Observing System, 2019). The spatial resolution for the monitoring of lakes, rivers, peatlands, and land cover tends to be fine (≤ 250 m), while for the monitoring of surface inundation the spatial resolution is much coarser (i.e. 1-25 km) (Table 6).
3.2.2.2. Users' preference and uptake of aquatic land cover information identified from the citing papers of existing datasets
In general, the GMF, GRanD, GLOWABO, GSW, and the CGMFC-21 of the single-type dataset group, the GLWD and the Matthews and Fung (1987) wetland product of the multi-type dataset group, and the MODIS Collection 5 and IGBP-DISCover of the GLC dataset group are cited more frequently (i.e. with more than 100 citations) than other datasets (Fig. 4).
GALC datasets used more often by climate users include GLWD, GLOWABO, GMF, and the wetland dataset developed by Matthews and Fung (1987) (Table 7). According to the use cases, the information about peat-accumulating wetlands, mangroves, surface water, dams/reservoirs and inundated areas is used to assess greenhouse gas emissions (Ito, 2019; Peltola et al., 2019), and to evaluate the impact of climate change on aquatic ecosystems (Ellison, 2015) as well as the response of aquatic ecosystems to climate change (Woolway and Merchant, 2018).
The datasets most frequently used by ecological/biological users include GMF, GRanD, GLWD, and GLOWABO; accordingly, thematic information on mangroves, dams/reservoirs, persistent and natural wetlands, and surface water is preferred by this user group. They use GALC datasets for studies about biodiversity conservation (Asaad et al., 2017; Bolivar et al., 2018), biomass estimation, species distribution (Cano et al., 2018), and also ecological research like ecosystem services (Duncan et al., 2016) and ecological models (Janssen et al., 2019).
Sustainable water resource management users utilize GRanD, GLWD, GSW, and GMF more frequently. Information about dams/reservoirs and surface water is essential for water resource monitoring. Freshwater wetlands and forested wetlands (e.g. mangroves) are also required for sustainable wetland management (Chow, 2018). Topics related to water resource management include water storage estimation (Binh et al., 2019), water quality (Rasul, 2019), optimizing water allocation or supply (Martinsen et al., 2019), and the future gap between water demand and supply under socio-economic development (Wijngaard et al., 2018). Besides water resource management, these users also apply GALC datasets to wetland restoration (Dutta et al., 2018), sustainable development strategies (including social, economic and political ones) towards aquatic ecosystems (Haer et al., 2018), and hazard or risk (e.g. flooding, drought, contamination) control (Wan et al., 2017).
GALC datasets used more often by agriculture users are GRIPC, GRanD, MODIS Collection 5, and the GLWD dataset. Information on dams/reservoirs, surface water and regularly flooded cultivated land (e.g. rice paddy) is preferred by agriculture users mainly for agriculture and water management (du Preez et al., 2018; Rodell et al., 2018; Zaussinger et al., 2019). In addition, fishery conservation and management are widely studied by agriculture users (de Graaf et al., 2015; Deines et al., 2017; Lo et al., 2019) and the information on aquaculture ponds, freshwater wetlands and mangroves is required by this user group.
Land management users utilize datasets like GMF, GSW, GLWD and GLC products including MODIS Collection 5, FROM-GLC, GlobeLand30 and GLC2000 more frequently. Their primary focus is to monitor aquatic land cover/use changes (Davidson and Finlayson, 2018) and also to explore the impact of land change on aquatic ecosystems (Deb and Ferreira, 2017) as well as drivers of land cover/use changes (Hao et al., 2015; Sabic et al., 2018). Of the reviewed cases, the thematic information covers surface water, mangroves, permanent wetlands, and dams/reservoirs.
The above results show that different user groups prioritize different aquatic land cover types, which indicates to map producers that a comprehensive dataset containing different types is helpful to fulfil the needs of most users. For example, the primary focus of climate users is peat-dominated wetlands, while ecological and biological users concentrate more on the mangrove forest ecosystem, and agriculture users are more interested in regularly flooded croplands. In some cases users apply GALC datasets as a mask to define their regions of interest, indicating that the broad-level split of aquatic and non-aquatic land cover is still necessary. Considering the user needs collectively, the thematic information on aquatic land covers that users prefer and take up includes open water (e.g. lakes, rivers), vegetation types (e.g. forests, marshes), water seasonality (e.g. regularly or permanently flooded), man-made aquatic land covers (e.g. dams/reservoirs, croplands), coastal wetlands, and freshwater or saline aquatic types. Accordingly, the related LCCS classifiers are the presence of vegetation, the persistence of water, the artificiality of cover, the relative accessibility of aquatic land cover to the sea, and water salinity.
There are variabilities among users' choices towards the spatial resolution of the GALC dataset. According to Table 8, 46.6% of climate users apply coarse to medium (> 500 m) datasets. A large proportion of climate-related studies are carried out at larger scales (i.e. global or continental) in which coarser resolution datasets are frequently used (Tsendbazar et al., 2015). In contrast, 57.6% of land management users prefer datasets with ≤ 30 m resolution, showing that these users prefer more details and may focus more on local studies. Besides land management users, the majority of sustainable water resource management users (55.5%) and ecological/biological users (69.3%) prefer finer resolution datasets (≤ 30 m) as well. Results also show that 46.3% of the climate users apply datasets with ≤ 30 m resolution. Therefore, considering most of the users' requirements on the spatial resolution of GALC datasets, developing a higher spatial resolution dataset will be an ongoing trend for future aquatic land cover mapping initiatives (Mahdianpari et al., 2020; Pickens et al., 2020). Concerning the temporal frequency, static datasets are more widely used by most of the user groups (Table 8). As only nine of the 33 reviewed datasets have a daily, monthly, or yearly temporal resolution, it is reasonable that citing papers of dynamic datasets are rare. However, the daily and monthly products are still useful for dynamic water resource management and datasets with a yearly temporal resolution are useful for long-term aquatic land cover change monitoring.

In this table, the deeper the colour, the more comprehensive the dataset is. "0" represents none of the types of information is included in the dataset; "1" represents only one type of information is included, or both types of information are included but not discriminated between each other; "2" represents both types of information are included and the information can be directly obtained from the dataset.

Global aquatic land cover characterization framework
Applying the LCCS approach, the thematic user needs were translated into a three-level aquatic land cover characterization framework (Table 9).
At the first level, aquatic land cover is separated from terrestrial land cover, which corresponds to the split (i.e. masking) between aquatic and non-aquatic areas and the extent estimate of aquatic areas. The primary difference between the terrestrial land cover and the aquatic land cover lies in the edaphic condition, where terrestrial land covers are influenced by a substratum, while aquatic land covers are dominated by the presence of water. In addition, the water is supposed to exist over extensive periods of time so that occasionally flooded land within a terrestrial environment is not considered as "aquatic".
Table 4 The characteristics of the spatial and temporal resolution of the four groups of global aquatic land cover datasets. Numbers in the table represent the number (or percentage) of datasets falling into different ranges of spatial resolution and temporal frequency. For those GALC datasets that are created by integrating previous maps and offering a scale as spatial resolution, the spatial resolution of the finest dataset used for generating the GALC dataset is counted in this table.

Fig. 3. The top 20 Web of Science Categories that the reviewed global aquatic land cover datasets fall into.

At the second level, five classifiers including the persistence of water, the presence of vegetation, the artificiality of cover, the accessibility to the sea, and water salinity are adopted in the proposed framework. The persistence of water classifier divides aquatic land covers into permanently flooded, temporarily flooded, and waterlogged types according to the inundation frequency and duration. According to LCCS, permanently flooded areas are covered by water for a substantial period, while the water in temporarily flooded areas stays less time. Waterlogged types are not characterized by eminent surface flooding but by a very high water table. The presence of vegetation classifier discriminates primarily vegetated areas from primarily non-vegetated areas. The vegetation can have different life forms, e.g. trees or shrubs, and the non-vegetated surface can also have various appearances when no water is covering it, such as bare rock, bare soil, or sand. The artificiality of cover classifier corresponds to user needs on artificial or cultivated types and natural classes, such as the man-made wetlands required by the Ramsar Convention and the natural permanently or regularly flooded forests required by the Aichi Biodiversity Targets. The accessibility to the sea classifier aims to differentiate coastal aquatic areas from inland aquatic areas. Though not included in the LCCS, it is an important layer of information required by users (i.e. marine/coastal wetlands), so we added this classifier at the second level of the framework. The water salinity classifier corresponds to user demands on saline wetlands or freshwater wetlands. According to LCCS, water salinity can be classified as freshwater, brackish water and saline water based on the concentration of Total Dissolved Solids (TDS). Freshwater contains less than 1000 parts per million (ppm) of TDS while saline water contains more than 10,000 ppm TDS (Cowardin et al., 1979), and the water in between "fresh" and "saline" is called "brackish" water.
At the third level, the vegetation in primarily vegetated aquatic areas is further divided into trees, shrubs, and herbaceous cover according to the life form defined by LCCS. The non-vegetated areas are separated into the open water body and bare rock, soil, or sand based on the surface type of the land exposed when there is no water. If needed, developers can also define detailed types for other level-2 classifiers at the third level. The detailed thematic information on aquatic land cover types required by users mainly comes from level-2 and level-3 of the proposed framework, and these classifiers cover almost all the attributes and features demanded by the five user groups.
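The TDS thresholds given for the water salinity classifier can be written as a simple rule. The sketch below is illustrative only: the function name is hypothetical, and assigning the exact boundary values (1000 and 10,000 ppm) to the brackish class is an assumption, since the text only specifies "less than 1000" and "more than 10,000".

```python
def salinity_class(tds_ppm: float) -> str:
    """Classify water by Total Dissolved Solids (ppm) using the thresholds in the text.

    Assumption: values of exactly 1000 or 10,000 ppm are treated as brackish.
    """
    if tds_ppm < 0:
        raise ValueError("TDS concentration cannot be negative")
    if tds_ppm < 1000:
        return "freshwater"
    if tds_ppm > 10000:
        return "saline water"
    return "brackish water"


# Example values are invented for illustration.
for tds in (250, 5000, 35000):
    print(tds, salinity_class(tds))
```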
Inherent to the LCCS approach, the classifiers presented at the same level are independent from each other. Users can define their aquatic land cover class of interest by combining different classifiers. The more classifiers added, the more detailed the class. For instance, by combining the "permanently flooded" cover type defined by the persistence of water classifier with the "vegetated" cover type defined by the presence of vegetation classifier, users can derive the permanently flooded vegetated class. This class can be further specified into permanently flooded, coastal, saline water, trees (frequently corresponds to mangrove forests) with the use of the "coastal" cover type of the accessibility to the sea classifier, the "saline water" of the water salinity classifier, and the "tree cover" defined by the life form classifier at level-3. Likewise, by combining "temporarily flooded", "herbaceous cover", "artificial", and "freshwater", users can obtain the "rice paddy" type. Following such a step-by-step process, i.e., level by level, classifier by classifier, users can select their preferred classes, which demonstrates the flexibility of the proposed framework. This LCCS-based characterization framework also ensures flexibility in that developers can add their own classifiers or features at different levels of the framework according to their specific needs. Fig. 5 is a visual presentation of the proposed aquatic land cover characterization framework and it emphasizes that a comprehensive land cover characterization is not a matter of providing a few classes but rather different layers or classifiers of information that can be derived from multiple data sources and combined in different ways to meet various user demands.
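The combination of independent classifiers described above can be sketched as a small data structure. This is a minimal illustration, not an implementation of LCCS: the dictionary keys, function names, and the example mapping unit are hypothetical, while the classifier values follow the wording of the text.

```python
# Classifier values taken from the text; the key names are illustrative.
CLASSIFIERS = {
    "persistence": {"permanently flooded", "temporarily flooded", "waterlogged"},
    "vegetation": {"vegetated", "non-vegetated"},
    "artificiality": {"artificial", "natural"},
    "sea_access": {"coastal", "inland"},
    "salinity": {"freshwater", "brackish water", "saline water"},
    "life_form": {"trees", "shrubs", "herbaceous cover"},  # level-3, vegetated areas
}


def define_class(**choices):
    """Build a class definition from any subset of classifiers, validating values."""
    for name, value in choices.items():
        if name not in CLASSIFIERS:
            raise KeyError(f"unknown classifier: {name}")
        if value not in CLASSIFIERS[name]:
            raise ValueError(f"{value!r} is not a valid value for {name}")
    return dict(choices)


def matches(unit, class_def):
    """True if a mapping unit agrees with every classifier in the class definition."""
    return all(unit.get(name) == value for name, value in class_def.items())


# The two combinations used as examples in the text.
mangrove_like = define_class(
    persistence="permanently flooded", vegetation="vegetated",
    sea_access="coastal", salinity="saline water", life_form="trees",
)
rice_paddy = define_class(
    persistence="temporarily flooded", life_form="herbaceous cover",
    artificiality="artificial", salinity="freshwater",
)

# Hypothetical mapping unit (e.g. one pixel) carrying one value per classifier.
unit = {
    "persistence": "permanently flooded", "vegetation": "vegetated",
    "artificiality": "natural", "sea_access": "coastal",
    "salinity": "saline water", "life_form": "trees",
}
print(matches(unit, mangrove_like), matches(unit, rice_paddy))
```

Because each classifier is stored as a separate attribute, adding a classifier to a definition only narrows the class, which mirrors the statement that more classifiers yield a more detailed class.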
The proposed framework also has its limitations concerning the scope of user groups and datasets reviewed in this study. For instance, the water depth which plays an important role in the formation and functioning of aquatic ecosystems is not included here because the user needs identified in this study focus more on the surface aspect of aquatic land cover. However, it is possible to add water depth and other classifiers according to the application purpose when developers create the map, which also reveals the flexibility of our proposed aquatic land cover characterization framework.

Addressing the gap between current global aquatic land cover datasets and user needs
The analysis of the four groups of GALC datasets shows that existing inundation/extent products are dynamic but coarse in spatial resolution (Table 4). The single-type GALC datasets have finer spatial resolutions, but their thematic information is sometimes too specific to meet multiple user demands. The multi-type GALC datasets are more comprehensive, but in many cases, they are outdated and too coarse in spatial resolution. The GLC datasets provide somewhat more information on land cover concepts, but they underrepresent the complexity of aquatic land cover (Amler et al., 2015). Inundation/extent datasets can be used for the general delineation of aquatic areas. However, although there are several inundation/extent datasets, the distribution and extent of global aquatic land cover vary considerably among them, with a maximum areal extent estimate of 29 million km² (Tootchi et al., 2019) and a minimum estimate of only 2.12 million km² (Prigent et al., 2007). The information on water persistence is addressed by some of the single-type GALC datasets and by some GLC products, but it is still incomplete concerning the variety of aquatic land cover types. For instance, the information on water persistence of the GSW dataset only exists in open water areas while such information in vegetated areas is missing. In contrast, the GLOBCOVER dataset only indicates water persistence in flooded forests and grasslands, while the persistence of water in waterbody-only areas is not included. The information about vegetation types is well represented by multi-type GALC datasets, GLC products and some of the single-type GALC datasets. However, datasets containing a specific description of the life forms of vegetation, i.e. trees, shrubs, and herbaceous cover, are rare and exist mainly among GLC products.
Current GALC datasets addressing the information of man-made aquatic land cover primarily focus on dams/reservoirs and rice paddies, while the user-demanded aquaculture ponds have not been mapped over large areas and the constructed wetlands for wastewater treatment have not been mapped at all yet. The information on coastal aquatic land cover and freshwater or saline aquatic types receives little attention from existing GALC datasets.
Concerning the user needs of the spatial and temporal resolution of GALC datasets, a medium (≤ 1 km) to high spatial resolution (≤ 30 m) dominates user needs and a dynamic dataset with intra-annual or inter-annual dynamics is also needed by users. Current GALC datasets with a high or medium spatial resolution are mostly single-type and GLC datasets, while most of the multi-type datasets are coarser than user demands. GALC datasets with intra-annual dynamics are mainly inundation/extent products with daily or monthly frequency, while the majority of single-type and multi-type datasets, as well as GLC datasets, are static. Among all the reviewed datasets, only three yearly updated products with vegetated land cover types (i.e. Land Cover CCI, CGLS-LC100, CGMFC-21) can be applied to assess inter-annual changes of aquatic vegetated types. The GSW dataset (Pekel et al., 2016) provides the information on inter-annual changes of open water extent that can be directly used by users. However, although it is possible to extrapolate inter-annual changes of inundated areas using the daily or monthly updated datasets that have a long-term tracking of water (Aires et al., 2017; Papa et al., 2010; Prigent et al., 2007; Schroeder et al., 2015), aggregating these daily or monthly estimates brings challenges and uncertainties, especially for users who have no expertise in RS. Dynamic datasets reviewed in this study only provide changes in the extent of aquatic land covers, while none of them provides information on the transformation of specific classification types (e.g. the transformation between natural wetland and artificial wetland). A rigorous quantitative assessment of the mapping accuracy of the inundation/extent datasets and the single-type and multi-type GALC datasets is lacking, leaving users unsure about the quality of the dataset they choose, and users may have to consider the uncertainties coming from the dataset while applying these datasets in their specific research (Wang et al., 2020).

In this table, "+" means the information is required by the user; and "-" means the information is not indicated in the content of the international conventions, policies, and agreements. Aquatic land cover types in the parentheses are specifically mentioned in the content of international conventions, policies, and agreements and the required aquatic land cover types might include but are not limited to the types listed in the parentheses.
Although GLC datasets are systematically validated, the mapping accuracy of temporarily flooded vegetated types in GLC products is too limited to be applied in further studies. Concerning the gap in the quality assessment levels between the GLC maps and other GALC datasets, it would be better if future aquatic land cover mapping initiatives could provide a rigorous quantitative assessment for the product. In addition, global land cover mapping programs may have to enhance the accuracy of flooded vegetated types to promote the usability of GLC products in the monitoring of vegetated aquatic ecosystems.

Addressing the aquatic land cover monitoring gaps by the proposed global aquatic land cover characterization framework
Compared with the user-required thematic information on aquatic land covers, existing GALC datasets are incomplete. One of the causes of the gaps between current datasets and user needs comes from the incomplete classification schemes they have adopted. Existing wetland classification systems are either nationally based (e.g. US National Wetlands Inventory, Canadian Wetland Inventory) and thus not globally applicable, or use too many details (e.g. Ramsar wetland classification system) that are beyond RS capabilities. On the other hand, in generic land cover classification systems (e.g. IGBP DISCover Land Cover Classification System, Belward, 1996) aquatic land covers are underrepresented. In comparison, the global aquatic land cover characterization framework proposed in this study not only addresses every aspect of aquatic land cover characteristics required by users but also considers the mappability of aquatic land cover features by applying a set of diagnostic criteria. In addition, the proposed framework ensures flexibility by allowing users to select their aquatic land cover classes of interest at different levels and with different combinations of classifiers. The more classifiers added, the more detailed the class.

Fig. 4. The number of citing papers of each GALC dataset cited by different user groups.

Table 7 Use cases of the most frequently cited datasets by each user group and users' preference and uptake towards the thematic information of aquatic land covers. The top ten most frequently cited datasets are obtained according to Fig. 4.

User group: Climate users
Top ten most frequently cited datasets: GLWD; GLOWABO; GMF; Matthews and Fung (1987); GRanD; IGBP-DISCover; MODIS Collection 5; GIEMS; CGMFC-21; SWAMPS-GLWD
Thematic information included in these datasets: [...]
Use cases: [...]

User group: Ecological/biological users
Top ten most frequently cited datasets: [...]; Matthews and Fung (1987); IGBP-DISCover
Thematic information included in these datasets: Mangroves; dams, reservoirs; lakes; persistent wetlands; natural wetlands; surface water
Use cases: Biodiversity conservation; biomass estimation; species distribution, richness, or loss; genetic diversity; ecosystem functions and services; the stability of aquatic ecosystem; ecological models; chemical or metal concentration in water bodies; the ecohydrological response; the vegetation productivity; food web; eutrophication or nitrification; organic matter in aquatic ecosystems; sediment exchange in wetlands; resilience of wetlands after disturbances; the impact of wetland degradation on biodiversity and ecosystem functioning; dam effects on streams and species; the bio-geomorphic responses to flow regulation

User group: Sustainable water resource management users
Top ten most frequently cited datasets: GRanD; GLWD; GSW; GMF; MODIS Collection 5; GLOWABO; IGBP-DISCover; GLC2000; HydroLAKES; GIEMS
Thematic information included in these datasets: Dams, reservoirs; surface water; mangroves; persistent wetlands; lakes; inundated areas
Use cases: Water resource management (including water storage/volume assessment, flow regulation, water quality, renewable energy or hydropower production, hydrologic alteration induced by damming, optimizing water resources allocation or supply, global human water consumption and footprint, and future water gap under socio-economic development, etc.); wetland restoration; sustainable use of wetlands; social, economic, or political sustainable development strategies; hazard, risk, and disaster control or reduction (flood, drought, or contamination); the sustainability of mangrove forests

User group: Agriculture users
Top ten most frequently cited datasets: GRIPC; GRanD; MODIS Collection 5; GLWD; IGBP-DISCover; GSW; GMF; GLC2000; GLOWABO; GlobeLand30
Thematic information included in these datasets: Regularly flooded cultivated land; dams, reservoirs; permanent wetlands; surface water; mangroves; lakes
Use cases: Agricultural water management (including irrigation water quality, irrigation water use, and irrigation dam command, etc.); sustainable nitrogen management in agriculture; the link between crop production and irrigation; irrigated agriculture vulnerability; wetland soils under rice management; food security; fisheries conservation and management (including the contribution of lakes to global inland fisheries harvest, aquaculture-mangrove farming system, the relationship between forest cover around rivers and fish consumption, etc.); mangrove forest management with agricultural planning

User group: Land management users
Top ten most frequently cited datasets: GMF; GSW; MODIS Collection 5; FROM-GLC; GLWD; GlobeLand30; CGMFC-21; GLC2000; IGBP-DISCover; GRanD
Thematic information included in these datasets: Mangroves; surface water; permanent wetlands; dams, reservoirs
Use cases: Aquatic land cover changes (including the trend of wetland extent, land cover transformation, the impact of land degradation and reclamation on aquatic ecosystems, land subsidence, drivers of aquatic land cover changes, etc.); the impact of urbanization on aquatic ecosystems; the urban land-water system; anthropogenic disturbances on wetlands; freshwater landscapes; the damage, deforestation, and recovery of mangroves; spatial and temporal dynamics of aquatic land cover
The citation analysis of current datasets shows that specific GALC datasets (including both single-type and multi-type) are used more often than GLC products (Fig. 4), which indicates that there also exist gaps between aquatic land cover characterization and generic land cover monitoring. However, characterizing aquatic and non-aquatic land cover characteristics in a consistent manner is essential and this is particularly important for evolving operational global land monitoring initiatives such as those under the Copernicus programme (Buchhorn et al., 2020) aiming to address a variety of user needs. The implication for future comprehensive and consistent land cover monitoring initiatives is that on the one hand, generic global land monitoring has to consider aquatic ecosystems together with their complex attributes instead of just a simple class, and on the other hand, specific aquatic land monitoring has to recognize that aquatic areas are not disconnected from the surrounding terrestrial areas. In our study, the proposed global aquatic land cover characterization framework connects aquatic and generic land cover mapping by applying LCCS classifier principles to describe aquatic land cover types. Following this framework, the developed global aquatic land cover maps can serve as an extension of global land cover products specifically in aquatic areas.
Because most GALC datasets do not report product accuracy and different users have different needs regarding aquatic land cover, accuracy could be assessed at different levels, or for individual classifiers, of the proposed global aquatic land cover characterization framework. For instance, besides evaluating the overall accuracy of the whole product, the classification accuracy could be independently assessed for level-1, level-2, and level-3 of the characterization framework. The classification accuracy for each classifier, such as the presence of vegetation, could also be reported separately. Furthermore, accuracy could be assessed for the derived products that users generate by combining different levels or classifiers into their own types of interest.
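The level-wise assessment described above can be illustrated with a minimal sketch. The sample labels, the tuple layout (level-1 class, water persistence, vegetation form) and both function names are hypothetical and serve only to show how one set of validation samples could yield accuracies per level and per classifier.

```python
# Hypothetical validation samples: (level-1 class, water persistence, vegetation form).
# None means the classifier does not apply to that sample.
reference = [
    ("aquatic", "permanently flooded", "trees"),
    ("aquatic", "temporarily flooded", "herbaceous"),
    ("aquatic", "temporarily flooded", "shrubs"),
    ("terrestrial", None, None),
]
mapped = [
    ("aquatic", "permanently flooded", "trees"),
    ("aquatic", "temporarily flooded", "shrubs"),
    ("aquatic", "permanently flooded", "shrubs"),
    ("terrestrial", None, None),
]


def overall_accuracy(ref, pred, depth):
    """Share of samples whose labels agree on the first `depth` entries."""
    if len(ref) != len(pred) or not ref:
        raise ValueError("reference and map samples must be non-empty and aligned")
    hits = sum(r[:depth] == p[:depth] for r, p in zip(ref, pred))
    return hits / len(ref)


def classifier_accuracy(ref, pred, index):
    """Agreement for one classifier, using only samples where it applies."""
    pairs = [(r[index], p[index]) for r, p in zip(ref, pred) if r[index] is not None]
    if not pairs:
        raise ValueError("classifier does not apply to any reference sample")
    return sum(r == p for r, p in pairs) / len(pairs)


for depth in (1, 2, 3):
    print(f"labels agree down to entry {depth}:", overall_accuracy(reference, mapped, depth))
print("vegetation form only:", classifier_accuracy(reference, mapped, 2))
```

A real assessment would also need a probability-based sampling design and area-weighted estimators, which this sketch leaves out.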

Opportunities provided by open-source satellite data, cloud computing platforms and machine learning algorithms
The provision of large volumes of open-access RS data has advanced in recent years and offers opportunities to address the monitoring gaps. Besides Landsat, which provides the longest open-access satellite data archive (Loveland and Dwyer, 2012), the Copernicus programme's Sentinel missions have also started to offer high-resolution satellite images at frequent intervals, adding an important extension to current RS data streams (Drusch et al., 2012). The Sentinel-1 satellites carry a radar system that provides C-band Synthetic Aperture Radar (SAR) images unaffected by cloud cover. As SAR is sensitive to water and moisture, it has considerable potential for detecting water beneath vegetation (Tsyganskaya et al., 2018a). The Sentinel-2 satellites carry a multispectral sensor with resolutions of 10 m, 20 m and 60 m and a five-day revisit time (Drusch et al., 2012), which makes it possible to monitor water dynamics in aquatic lands.
With the advent of the big data era, machine learning techniques are increasingly used for interpreting RS images (Lary et al., 2016). A multitude of machine learning algorithms such as support vector machines, random forests, decision trees, and neural networks are available in open-source programming languages (e.g. R, Python) and on platforms (e.g. GitHub). Increased cloud computing capability has also facilitated global data processing and management. Some powerful large-scale cloud computing platforms, such as Google Earth Engine (Gorelick et al., 2017), Amazon Web Services (2016), and the System for Earth Observation Data Access, Processing and Analysis for Land Monitoring (SEPAL, Open Foris, 2018), allow users to query and process satellite data quickly and efficiently and to create advanced analyses tailored to their own use. These cloud computing platforms have improved the efficiency of analysis in land monitoring applications (Deines et al., 2019; Hansen et al., 2013; Pekel et al., 2016) and will also benefit future global aquatic land cover characterization programs.

General classification of aquatic land cover
The delineation of general aquatic areas can be achieved in many ways, such as hydrological modelling based on water table depth (Fan and Miguez-Macho, 2010), topographic modelling using both topographic indices and precipitation data (Hu et al., 2017b), optical and SAR satellite data classification (Papa et al., 2010), or the combined use of topographic inputs with optical and SAR images (Hird et al., 2017). The hydrologic and topographic approaches generate the potential distribution of aquatic areas according to the relationship between aquatic land cover formation and water table depth or topography, which does not consider the surface characteristics (e.g. human influence, vegetation cover) and tends to overestimate the extent of aquatic areas (Hu et al., 2017b). In comparison, the combined use of topographic parameters and satellite images produces more reliable results (Hird et al., 2017). As aquatic areas are subject to water dynamics, the reflectance and energy backscatter properties might be substantially altered within a short period (Gallant, 2015), which poses challenges for consistent monitoring of the extent of aquatic land covers. Generating dynamic maps with daily or monthly frequency is a good solution (Prigent et al., 2007; Schroeder et al., 2015) and the use of stable topographic data is also able to compensate for the uncertainties caused by water dynamics (Hird et al., 2017).

Numbers in the table represent the number (or percentage) of citing papers falling into different ranges of spatial resolution and temporal frequency.

Table 9 The proposed global aquatic land cover characterization framework.

Level 1 class name

Temporarily flooded - The area is regularly flooded but the water cover does not remain for a substantial period or other than for a particular season. For vegetated areas, the water persists for < 3 but more than two months a year or during a specific season; for open water bodies, the water covers the surface for < 9 months each year.
Waterlogged - The water table in this area is very high and at or near the surface. These areas could be occasionally flooded, but the main characteristic is the high level of the water table.

Presence of vegetation (The existence or absence of vegetation)
Primarily Vegetated Areas
Trees - A tree is defined as a woody perennial plant with a single, well-defined stem carrying a more-or-less-defined crown (Ford-Robertson, 1971) and being at least 3 m tall.
Shrubs - These are woody perennial plants with persistent and woody stems and without any defined main stem (Ford-Robertson, 1971), being less than 5 m tall. The growth habit can be erect, spreading, or prostrate.
Herbaceous - Defined as plants without persistent stem or shoots above ground and lacking definite firm structure (Scoggan, 1978).
Primarily Non-Vegetated Areas
Water body - This class refers to areas with physical appearances of open water, such as rivers and lakes.
Bare rock, soil, or sand - This class refers to areas with physical appearances of bare rock, soil, or sand when there is no water, such as floodplains.

Artificiality of cover (The anthropogenic or natural origin of cover)
Artificial - This type is cultivated or managed by human and requires human activities to maintain it in the long term.
Natural - This cover type is in balance with the abiotic and biotic forces of its biotope and not maintained by human.

Accessibility to the sea (The accessibility to seawater)
Coastal - This type is along the coast where seawater mixes with fresh water to form an environment of varying salinities.
Inland - This type is outside the coastal area and mostly surrounded by terrestrial upland.

Water salinity (The concentration of Total Dissolved Solids (TDS))
Fresh - This type of water contains less than 1000 ppm TDS (Cowardin et al., 1979).
Brackish - This type of water contains 1000-10,000 ppm TDS.
Saline - This type of water contains more than 10,000 ppm TDS.

Terrestrial land cover - The cover is formed on a non-water substratum.
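The way classifiers from Table 9 combine into user-specific classes can be sketched as a small lookup structure. The classifier values are transcribed from the framework as described in the text. The dictionary keys, the flattening of the vegetated and non-vegetated sub-classes into one `cover` entry, the label format and the function name `describe_class` are illustrative assumptions rather than part of the framework.

```python
# Classifier values follow the framework; keys and ordering are hypothetical.
CLASSIFIERS = {
    "accessibility": ("coastal", "inland"),
    "salinity": ("fresh", "brackish", "saline"),
    "artificiality": ("artificial", "natural"),
    "persistence": ("permanently flooded", "temporarily flooded", "waterlogged"),
    "cover": ("trees", "shrubs", "herbaceous", "water body", "bare rock, soil, or sand"),
}


def describe_class(**selected):
    """Build a class label from any subset of classifiers.

    The more classifiers are supplied, the more detailed the resulting class.
    """
    unknown = set(selected) - set(CLASSIFIERS)
    if unknown:
        raise KeyError(f"unknown classifier(s): {sorted(unknown)}")
    parts = []
    for name, allowed in CLASSIFIERS.items():
        if name in selected:
            if selected[name] not in allowed:
                raise ValueError(f"{selected[name]!r} is not a valid value for {name!r}")
            parts.append(selected[name])
    parts.append("aquatic land cover")
    return "; ".join(parts)


# No classifier selected: the general (level-1) aquatic class.
print(describe_class())
# Mangroves expressed as permanently flooded trees in coastal areas.
print(describe_class(persistence="permanently flooded", cover="trees", accessibility="coastal"))
```

Holding the classifiers in one structure lets a user request a coarse class with one classifier or a detailed class with several, without a fixed legend.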

Classification of the persistence of water
The persistence of water is typically monitored through multitemporal images (Tulbure et al., 2016; Xu, 2006). According to the definition of permanently and temporarily flooded areas, water persistence can be determined by the period that water covers the surface. Time-series approaches are appropriate for the extraction of water persistence because they capture seasonal or annual fluctuations of water (Tsyganskaya et al., 2018a). The GSW dataset developed by Pekel et al. (2016) was created based on a Landsat time-series of 32 years and the water persistence was presented in one seasonality map. Recently, a new global surface water dynamics dataset (Pickens et al., 2020) characterizing the inter-annual and intra-annual open surface water dynamics for 1999-2018 has become available. Compared with GSW, this new dataset applies a temporally denser time-series and produces water percent layers for each individual month of the entire period, improving the characterization of the dynamics of global open surface water extent. When applying optical images, spectral indices such as the Normalized Difference Water Index (NDWI; McFeeters, 1996) and the Normalized Difference Vegetation Index (NDVI; Kriegler, 1969) are often used to assess water seasonality based on time-series analysis. Considerable effort has been devoted to developing automatic extraction methods (Huang et al., 2018a), but one limitation of optical images is that they are frequently obstructed by clouds, as aquatic areas are typically cloud-prone.
In comparison, temporally dense SAR data is more useful in characterizing surface water dynamics and flood frequencies (Slagter et al., 2020). For the detection of water persistence in vegetated areas, SAR data is more effective than optical imagery because of the ability to penetrate the vegetation. Time-series features derived from VV (vertical/vertical) polarization work well for characterizing the flooding frequency under vegetation (Tsyganskaya et al., 2019).
Different from permanently or temporarily flooded areas, waterlogged areas are characterized by a high level of the water table. As a result, it is possible to map waterlogged areas using topographic data, soil moisture and water table depth estimations (Bechtold et al., 2018;Delancey et al., 2019). Current research towards the characterization of waterlogged areas mostly focuses on peatlands (i.e. a typical waterlogged ecosystem) over local or regional scales (Bechtold et al., 2018;Gumbricht et al., 2017;Kalacska et al., 2018). Efforts are still needed for future operational identification of the global-scale waterlogged areas.
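The optical time-series approach to water persistence can be sketched for a single open-water pixel as follows. The NDWI formula is that of McFeeters (1996). The reflectance values, the water threshold of 0 and the function names are assumptions, and the nine-month boundary for permanently flooded open water is inferred from the definition of temporarily flooded open water in Table 9.

```python
def ndwi(green, nir):
    """NDWI (McFeeters, 1996): (green - NIR) / (green + NIR)."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def months_with_water(green_series, nir_series, threshold=0.0):
    """Count monthly observations whose NDWI exceeds the (assumed) water threshold."""
    return sum(ndwi(g, n) > threshold for g, n in zip(green_series, nir_series))


def persistence_class(n_months, permanent_from=9):
    """Assign an open-water persistence class; the 9-month boundary is an assumption."""
    if n_months >= permanent_from:
        return "permanently flooded"
    if n_months > 0:
        return "temporarily flooded"
    return "not flooded"


# Hypothetical monthly surface reflectances for two pixels over one year.
green = [0.08] * 12
nir_lake = [0.03] * 10 + [0.20] * 2        # water in 10 of 12 months
nir_floodplain = [0.03] * 4 + [0.20] * 8   # water in 4 of 12 months

print(persistence_class(months_with_water(green, nir_lake)))
print(persistence_class(months_with_water(green, nir_floodplain)))
```

Cloud-obscured months are not handled here. In practice they would have to be masked or gap-filled before counting, which is one reason SAR time series are attractive in cloud-prone areas.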

Classification of the presence of vegetation
Vegetated and non-vegetated aquatic areas can be discriminated from each other by multispectral optical images and SAR data because they respond differently in different spectral bands and SAR signals. Open and smooth water surfaces can be identified by low SAR backscatter values and are well differentiated from non-water regions, which show higher backscatter values (Huang et al., 2018b). SAR sensors can also detect water under vegetation canopies (Tsyganskaya et al., 2018b) and different sets of SAR modes can achieve the level-3 classification of different vegetation forms, i.e. trees, shrubs, grasses. SAR data with large incidence angles, short wavelengths (e.g. C-band), and horizontal transmission and vertical reception polarization (HV) are considered helpful to map herbaceous aquatic vegetation covers (Henderson and Lewis, 2008; Mahdavi et al., 2018), while SAR images with small incidence angles, long wavelengths (e.g. L-band), and horizontal transmission and reception polarization (HH) are more effective in the mapping of flooded trees or shrubs (Mahdavi et al., 2018). The BIOMASS mission, scheduled for launch in 2022 by the European Space Agency, will carry a fully polarimetric P-band SAR (Quegan et al., 2019). This new archive will achieve more accurate measurements of forest height at 50 m spatial resolution and is expected to benefit the characterization of flood extent under tall or very dense flooded forests (Henderson and Lewis, 2008).
Among various multispectral bands, the red-edge band and the near-infrared band are particularly helpful because different vegetation types show the greatest variation at these wavelengths (Schmidt and Skidmore, 2003; Sims and Gamon, 2002). The freely available Sentinel-2 imagery provides three red-edge bands (i.e. Band 5, 6 and 7) at 20 m resolution and two shortwave infrared bands (i.e. SWIR1 and SWIR2) that are believed to be effective in detecting water under dense vegetation cover (Lefebvre et al., 2019). However, the characterization of vegetated areas becomes challenging when a heterogeneous landscape is accompanied by irregular water flooding. Such an effect is reflected in the poor mapping of temporarily flooded vegetated areas in GLC datasets and the difficulty in identifying peatlands in the northern boreal area, which are patchy and fragmented, with waterlogged soil and diverse vegetation (Bourgeau-Chavez et al., 2017). To deal with this issue, the multi-source integration of different types of data (e.g. optical, topographic and SAR data) and the use of high-resolution images are recommended (Rasanen and Virtanen, 2019).
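The low-backscatter criterion for open water mentioned above can be turned into a simple flood-frequency estimate for a SAR time series. This is a minimal sketch: the backscatter values are invented, the threshold of -18 dB is a hypothetical placeholder that would need calibration per sensor, polarization and region, and the function name is illustrative. It applies to open water only, since flooded vegetation does not necessarily produce low backscatter.

```python
def flood_frequency(backscatter_db, water_threshold_db=-18.0):
    """Fraction of valid acquisitions in which a pixel looks like open water.

    backscatter_db: backscatter values in dB, with None for missing acquisitions.
    water_threshold_db: values below this are treated as open water (assumed value).
    Returns None when there is no valid acquisition.
    """
    valid = [value for value in backscatter_db if value is not None]
    if not valid:
        return None
    return sum(value < water_threshold_db for value in valid) / len(valid)


# Hypothetical backscatter series (dB) for one pixel, with one missing acquisition.
series = [-20.1, -19.5, -9.8, None, -21.0, -8.7]
print(flood_frequency(series))  # 3 of 5 valid acquisitions fall below the threshold
```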

Classification of the artificiality of cover
Artificial aquatic land covers of interest to users include regularly flooded croplands, dams/reservoirs, aquaculture ponds, and constructed wetlands for wastewater treatment. Global mapping of regularly flooded croplands is mainly focused on rice paddy (Dong and Xiao, 2016; Kuenzer and Knauer, 2013). Multi-temporal analysis and phenology-based approaches applying time-series images are effective methods for rice paddy classification. Global rice paddy mapping faces the challenges of cloud-induced noise (rice paddies are usually planted in cloud-prone areas) and the fragmentation of rice paddy fields (90% of global rice paddy fields are distributed in Asia, where most cropland fields are patchy and fragmented) (Dong and Xiao, 2016). However, these two issues can be addressed by the combined use of high-resolution optical (e.g. Landsat-like or Sentinel-2) and SAR (e.g. Sentinel-1) imagery with a short revisit time (Torbick et al., 2011; Zhang et al., 2009).
The dam/reservoir dataset reviewed in this study (i.e. GRanD, which contains 6862 dams) was generated by compiling existing maps and datasets. Recently, a new dam dataset (i.e. GOODD, Mulligan et al., 2020) containing more than 38,000 dams has been produced by the digitisation of satellite imagery globally. However, the mapping of dams/reservoirs by remote sensing on a global scale has not been achieved yet. Current attempts at classifying dams and reservoirs are limited to small scales (Amitrano et al., 2017; Annor et al., 2009), and dams/reservoirs are extracted by change-detection-based methods using multi-temporal optical or SAR images (Amitrano et al., 2017).
Aquaculture ponds are distinguishable from other water bodies by their distinct rectangular shapes, but their relatively small size and intermingling with lakes or other water bodies make them difficult to recognize in satellite images (Zeng et al., 2019). A high spatial resolution (e.g. 10 m) is very important not only to discriminate between ponds and other land surfaces but also to separate adjacent ponds from each other (Ottinger et al., 2017). Time-series of optical and SAR data and object-based feature selection methods are recommended for the classification of aquaculture ponds (Ottinger et al., 2017; Stiller et al., 2019; Virdis, 2014; Zeng et al., 2019).
Constructed wetlands for wastewater treatment are wetlands designed to use natural processes involving vegetation, soils, and associated microbial assemblages to treat wastewater (IPCC, 2014). Currently, no study has characterized this special kind of wetland using RS techniques, mainly because such wetlands are too small to be recognized in RS data and because reliable reference and ground truth data for training and validation are lacking.

Classification of the accessibility to the sea and water salinity
The classification of coastal vs inland and freshwater vs brackish/saline aquatic land cover based on RS techniques is not widely studied. A general distinction between these areas could be achieved according to the definition of coastal and inland aquatic land cover (Table 9) or the locations where they are likely to occur. For example, coastal and inland areas could be discriminated using ancillary data, such as the marine ecoregion (Spalding et al., 2007), and saline water can be roughly discriminated from freshwater using coastlines, as saline water-covered areas are mostly located in coastal areas. Specific coastal aquatic land cover types demanded by users include mangrove forests, seagrass meadows, tidal marshes, and floodplains. The mapping of mangroves, saltmarshes, and floodplains is in effect the classification of permanently flooded trees, temporarily flooded herbaceous cover, and temporarily flooded bare rock/soil/sand in coastal areas, respectively. However, the classification of seagrass meadows at the global scale by remote sensing is difficult due to the confusion between seagrass and other substrate types in shallow coastal water environments (Hossain et al., 2015). High spatial and spectral resolution data, as well as reliable field samples, are required for an accurate mapping of seagrass meadows (Knudby and Nordlund, 2011).
According to LCCS, brackish/saline water is water that normally contains more than 1000 ppm TDS and freshwater is water with less than 1000 ppm TDS (Table 9). As salinity has no direct colour signal, remote sensing characterization of water salinity can only be achieved using a proxy that has a direct relationship with salinity (Chong et al., 2014), such as the chromophoric dissolved organic matter (CDOM) (Bai et al., 2013; Fang et al., 2019) derived from satellite ocean colour data. In this case, a large quantity of reliable field data is required to model the relationship between CDOM and water salinity (Chong et al., 2014). Specific freshwater types demanded by users include freshwater lakes, freshwater marshes and freshwater forested wetlands (Costanza and Sklar, 1985; Davidson, 2016). Saline aquatic types include salt pans, salt lakes and saltmarshes (Davidson, 2016; IPCC, 2014). The classification of these categories may use ancillary data, such as the aforementioned marine ecoregions, building on the classification outputs of prior classifiers.
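The salinity thresholds of Table 9 translate directly into a decision rule, sketched below. The thresholds are those given in the table; the function name and the treatment of the exact boundary values (1000 and 10,000 ppm are both assigned to brackish, following the stated 1000-10,000 ppm range) are assumptions. The TDS input would have to come from field measurements or from a proxy-based model, which this sketch does not provide.

```python
def salinity_class(tds_ppm):
    """Map a Total Dissolved Solids concentration (ppm) to a Table 9 salinity class."""
    if tds_ppm < 0:
        raise ValueError("TDS concentration cannot be negative")
    if tds_ppm < 1000:
        return "fresh"
    if tds_ppm <= 10000:
        return "brackish"
    return "saline"


# Hypothetical TDS values in ppm.
for tds in (250, 1000, 4500, 35000):
    print(tds, salinity_class(tds))
```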
The above analysis shows that a successful implementation of the proposed global aquatic land cover characterization framework requires the integration of multiple data sources and different analytical approaches. The improved computation capability, open-source machine learning algorithms and evolving satellite data availability improve the feasibility of implementing the comprehensive aquatic land cover mapping framework (Fig. 5). Challenges of implementing the framework mainly come from the complexity of aquatic ecosystems (e.g. dynamic water flooding, heterogeneous and fragmented landscapes), the lack of reliable field data, and the difficulty in acquiring high-quality data (i.e. very high-resolution images) on a global scale (Fig. 5).

Conclusion
Aquatic land cover types provide many valuable ecosystem services for human well-being, but they have suffered great losses in the past decades. The global monitoring of aquatic land cover is therefore of high importance. Although plenty of GALC datasets are available for monitoring aquatic ecosystems, map users are confronted with prominent inconsistencies and uncertainties when applying these datasets in different fields of research and applications. The increased satellite data availability has moved global land monitoring towards an operational stage that seeks to satisfy multiple user demands. As aquatic land cover exists in many different forms, it is also important to come up with a consistent and comprehensive characterization framework that ensures a universal understanding of aquatic land covers consistent with that of terrestrial land cover characterization. In this study, we addressed the gaps in aquatic land cover monitoring through a comprehensive approach that assessed the limitations of available datasets, refined user requirements and evolving remote sensing capabilities, and that resulted in a concrete framework for improving global aquatic land cover monitoring.
Among the four groups of GALC datasets, inundation/extent products are dynamic but coarse in spatial resolution. The single-type GALC datasets have finer spatial resolutions but they are too specific in thematic information to meet multiple user needs. The multi-type GALC datasets are more comprehensive, but they are outdated and too coarse in spatial resolution. The GLC datasets address more aspects of aquatic features, but the complexity of aquatic ecosystems is underrepresented in them. The assessment of user requirements indicates that the thematic information on aquatic land covers that users require or prefer concerns open water, vegetation types, water persistence, man-made aquatic land covers, coastal wetlands, and freshwater or saline aquatic types. Datasets with medium to high spatial resolution, intra-annual dynamics and inter-annual changes are also required by users. However, none of the existing datasets can fully meet such demands and a rigorous assessment of the quality of most GALC datasets is lacking.
Based on the identified user needs and the LCCS approach, a three-level global aquatic land cover characterization framework was proposed. The first level of the framework is a general delineation of aquatic areas. At the second level, five classifiers including the persistence of water, presence of vegetation, the artificiality of cover, the accessibility to the sea, and water salinity are adopted. At the third level, vegetated and non-vegetated categories are further defined. This framework is highly flexible, allowing users to combine different layers or classifiers of land cover types to meet their specific needs. This LCCS-based framework is able to bridge the gap between aquatic land cover characterization and generic land cover mapping: it not only considers the complexity of aquatic ecosystems but also ensures the consistency between aquatic and non-aquatic land cover types.
The evolving satellite data availability, improved computation capability, and open-source machine learning algorithms offer tremendous opportunities to implement the proposed framework, while the complexity of aquatic ecosystems, the lack of reliable field data, and the difficulty in acquiring very-high-resolution images on a global scale also bring challenges for the implementation. This comprehensive aquatic land cover mapping framework provides a reference for future operational global aquatic land cover mapping initiatives and will support better understanding and monitoring of complex aquatic ecosystems.

Declaration of Competing Interest
None.