Demonstrating the potential of Picture Pile as a citizen science tool for SDG monitoring

The SDGs are a universal agenda to address the world's most pressing societal, environmental and economic challenges. The supply of timely, relevant and reliable data is essential in guiding policies and decisions for successful implementation of the SDGs. Yet official statistics cannot provide all of the data needed to populate the SDG indicator framework. Citizen science offers a novel solution and an untapped opportunity to complement traditional sources of data, such as household surveys, for monitoring progress towards the SDGs, while at the same time mobilizing action and raising awareness for their achievement. This paper presents the potential offered by one specific citizen science tool, Picture Pile, to complement and enhance official statistics to monitor several SDGs and targets. Designed to be a generic and flexible tool, Picture Pile is a web-based and mobile application for ingesting imagery from satellites, orthophotos, unmanned aerial vehicles or geotagged photographs that can then be rapidly classified by volunteers. The results show that Picture Pile could contribute to the monitoring of fifteen SDG indicators under goals 1, 2, 11, 13, 14 and 15 based on the Picture Pile campaigns undertaken to date. Picture Pile could also be modified to support other SDGs and indicators in the areas of ecosystem health, eutrophication and built-up areas, among others. In order to leverage this particular tool for SDG monitoring, its potential must be showcased through the development of use cases in collaboration with governments, NSOs and relevant custodian agencies. Additionally, mutual trust needs to be built among key stakeholders to agree on common goals that would facilitate the use of Picture Pile or other citizen science tools and data for SDG monitoring and impact.


Introduction
The United Nations (UN) Sustainable Development Goals (SDGs), a framework of 17 goals, 169 targets and 231 indicators, was adopted by UN member states in 2015 (UN, 2015). Designed as a global call for progress on social, economic and environmental challenges, the SDGs require robust monitoring mechanisms and timely, accurate and comprehensive data to track progress towards their achievement. The Inter-Agency Expert Group on SDG Indicators (IAEG-SDGs) is responsible for developing the protocols for indicator population, including metadata standards, conceptual and methodological development, and data availability. The IAEG-SDGs initially classified indicators into 3 tiers, where Tier 3 signified indicators with no established methodology or standards to collect the data; Tier 2 referred to indicators that have a clear internationally established methodology, but no regular data production by countries; and Tier 1 were indicators with an internationally established methodology and regular data production by at least 50% of countries, where relevant (UN, 2020a). The SDG indicator framework has evolved over time and is updated regularly by the IAEG-SDGs. In December 2017, 40% of the SDG indicators were Tier 1, 28% Tier 2 and 29% Tier 3. By April 2019, these figures had changed to, respectively, 44%, 39% and 15%, as methodologies and data availability had improved (MacFeely, 2019). As of 28 December 2020, there were 130 Tier 1 indicators, 97 Tier 2 indicators and 4 indicators with multiple tiers. There are no remaining Tier 3 indicators (UN, 2020b).
Despite these incremental improvements, many of the indicators are still beyond the technical and financial capacities of many National Statistical Systems (NSSs), which comprise the statistical organizations and units (e.g., National Statistical Offices, NSOs) in countries that collect the data for global SDG reporting (IAEG-SDGs, 2018). For example, by 2020 (i.e., 5 years into the SDG process), it was found that more than half of the SDG targets could not be measured in the Asia-Pacific region due to a lack of data (UN ESCAP, 2020); data were available for only 33% of the 104 gender-related SDG indicators in OECD countries (OECD, 2020); and 68% of the environmental SDG indicators lacked data (UN Environment, 2019). Research undertaken by Dang and Serajuddin (2020) also highlights the overwhelming challenge of missing data and suggests that not all Tier 1 indicators are actually populated. They estimate that only 19% of the required data are currently available. Even before Covid-19, the funding required to measure all SDG indicators was identified as a huge barrier to the timely and efficient monitoring of the SDGs (Fraisl et al., 2020).
Traditional sources of data, such as household surveys, constitute the basis for SDG monitoring and reporting. Eighty out of 232 SDG indicators (the number of indicators at the time of the study; currently there are 231 indicators) could be generated through household surveys covering 13 of the 17 SDGs, particularly in the health, education, gender equality, poverty, hunger, labor and justice domains (ISWGHS, 2019). However, household surveys have limitations, such as high costs, infrequent data collection, and lack of spatial coverage and openness, among others. The potential offered by new sources of data, including Earth Observation (EO), mobile data, financial transactions, citizen science and others, is well acknowledged (UN Global Pulse, GSMA, 2017; UN Global Pulse, 2018; Jensen and Campbell, 2018; UNOOSA, 2018; Fritz et al., 2019; MacFeely, 2019; MacFeely and Nastav, 2019; ESA, 2020; Fraisl et al., 2020). UNEP has identified citizen science as one important component of a global digital ecosystem for the environment that calls for partnerships between governments, the private sector, academics, citizens and other stakeholders to work together to match data gaps and needs, technological solutions and required capacities (Jensen and Campbell, 2018).
Citizen science has been defined in different ways by different researchers (Eitzel et al., 2017; Haklay et al., 2020b) but here we define it based on three main characteristics: "public participation", "voluntary contributions", and "scientific research and knowledge production" (SDSN TReNDS, 2019; Fraisl et al., 2020). Therefore, the following terms are considered under the umbrella of citizen science in the context of this paper: community science, participatory mapping, participatory science, volunteered geographic information (VGI), citizen observatory, crowdsourcing, crisis mapping, community-based monitoring, and citizen generated data, among others (Haklay et al., 2020a).
There is a growing literature on the contributions of citizen science data to SDG monitoring (Campbell et al., 2020; Fraisl et al., 2020; Head et al., 2020; König et al., 2020; Laso Bayas et al., 2020b; Wuebben et al., 2020). Additionally, UNEP has included citizen science as a source of data in their global methodology for SDG indicator 14.1.1b on floating plastic debris density, which covers beach litter, plastic in the sea column, floating plastic and plastic on the sea floor (GESAMP, 2019; UN, 2020c). UNEP also used citizen science data in global and regional reporting for the first time in 2021 for indicator 14.1.1b (UN, 2021a). The metadata for SDG indicator 6.3.2 on the proportion of bodies of water with good ambient water quality also refers to citizen science as an additional source of data to complement official sources of data, which would otherwise require substantial investments (UN, 2020c).
Fraisl et al. (2020) carried out a systematic review of all SDG indicators and citizen science initiatives, demonstrating that citizen science data are already contributing to the monitoring of 5 SDG indicators including: (i) 9.1.1 Proportion of the rural population who live within 2 km of an all-season road, (ii) 14. [...] As part of the review by Fraisl et al. (2020), many tools were identified, such as collaborative websites, mobile apps and low-cost sensors, which aid data collection by citizen scientists. One of the most successful examples of online collaborative mapping, which has led to the creation of a free and open map of the world, is OpenStreetMap (OSM) (Jokar Arsanjani et al., 2015). The road network in OSM is already being used as an input to indicator 9.1.1 Proportion of the rural population who live within 2 km of an all-season road, and it has the potential to help track progress on the SDGs in other areas, such as disaster response, the environment, water and sanitation, access to electricity and public health. For example, the Humanitarian OSM Team (HOT) uses the collaborative mapping framework of OSM for humanitarian purposes, such as disaster response management, community development and urban planning (Soden and Palen, 2014; Bertolotto et al., 2020), which could have relevance for SDG 1 (target 1.5, indicator 1.5.2) and SDG 11 (target 11.5, indicator 11.5.2). Other relevant applications identified for these targets and indicators include the MicroMappers set of tools developed by the Digital Humanitarian Network, such as TweetClicker and ImageClicker for tagging text and images from Twitter (Tan et al., 2017), and Picture Pile, an application for mobile and web devices developed by the International Institute for Applied Systems Analysis (IIASA) (Danylo et al., 2018).
By combining citizen science with EO, Picture Pile integrates innovative data streams while engaging volunteers in the rapid classification of different types of remotely sensed imagery. Data from previous campaigns will be openly shared (under a CC-BY type of license) while future campaigns will be encouraged to do the same or pay for the service if the data are to remain confidential or are of a sensitive nature.
These are only some of the many initiatives identified that have the potential to be used for SDG monitoring and reporting; for the complete list, see Table S1 in Fraisl et al. (2020). The next step to unlock the value of citizen science for collecting SDG data at scale is to demonstrate how these individual tools could contribute, either directly or in a supplementary way, to the SDGs. To our knowledge, there are only a few studies that specifically address this gap (Ajates et al., 2020). In addition, many existing citizen science applications focus on photo- or text-based data collection tasks, rather than data analysis tasks such as labeling, classification, annotating, and/or tagging. Hence, the aim of this paper is to focus on one tool, i.e., Picture Pile, to demonstrate its use as a generic citizen science tool that was designed to support EO but can also support and/or complement official data and statistics for the monitoring of some SDG indicators. The paper also provides concrete recommendations on how to leverage this citizen science tool for SDG monitoring to enhance official statistics.

The Picture Pile application
Picture Pile is a web-based and mobile application that was designed for the rapid classification of images, e.g., satellite images, orthophotos, geotagged photographs and drone imagery. The images are assembled into a pile and provided to the Picture Pile application for classification by volunteers. The initial idea behind the application was to provide the large amounts of reference data needed for the training and validation of remote sensing-based products, e.g., land cover maps and cropland maps, as there is a lack of data available for these purposes (Szantoi et al., 2020). Although other rapid image classification products exist, such as ImageClicker (Tan et al., 2017) and MapSwipe (Herfort et al., 2017), both focus on the humanitarian domain rather than serving as generic image classification tools suitable for EO purposes.
Picture Pile is the second generation of a prototype app called Cropland Capture, where volunteers were asked a simple question: Can you see cropland in the red box? An example of the Cropland Capture interface is shown in Fig. 1a. On a mobile device, users swipe an image to the right if answering yes, to the left if answering no and downwards if the answer is maybe, e.g., due to clouds covering the satellite imagery or other uncertainties that prevent a respondent from answering definitively. The Cropland Capture campaign ran for around six months, collecting more than 5 million classifications through the involvement of more than 3000 participants. An analysis of the results showed that 92% of the classifications were done in less than 2.5 s each (Sturn et al., 2015).
Moving beyond classifying cropland, Picture Pile was first used to identify evidence of human impact on the landscape, followed by a campaign on deforestation (Fig. 1b), where the interface was changed to accommodate pairs of images from different time periods to identify change. The mechanics of the app remained similar, i.e., a simple question with a yes, no, or maybe answer.
Since then, Picture Pile has been used in multiple campaigns, leading to the collection of more than 11 million classifications with the help of more than 4500 volunteers. In addition to the identification of deforestation, Picture Pile has been used in many different domains including the identification of: (i) clouds to improve cloud detection algorithms; (ii) building damage for post-disaster assessment; (iii) crop types to improve crop type mapping; (iv) indicators related to poverty; and (v) marine litter, many of which have relevance for the SDGs. The mechanics of Picture Pile have, therefore, also evolved to accommodate the data collection needs of different applications. For example, questions that could be answered with categorical and continuous variables have been added as shown in Fig. 2. Figs. 2a and 2b show examples of how Picture Pile has been used to collect nominal and ordinal categorical variables, respectively, while Fig. 2c shows an example of collecting data for a continuous variable related to wealth. Instead of swiping the image to the left or right, the user moves the image in the direction of the wheel to choose an answer. These simple mechanics of swiping images to the left, right or down for questions with a yes/no/maybe answer, or towards a category or continuous value in a wheel, have all been designed with high usability; these features are also explained in a visual way when the user starts the app. Regarding the classification task itself, users are provided with two forms of guidance. The first involves a set of training images used at the start of the app where examples of different images are provided along with the answers so that users can learn how to classify them. Then during the campaign, answers to expert images are provided at regular intervals so that users can keep learning over time. The second is a gallery of examples that are provided in the app, which users can consult at any time to understand how the images should be classified.
It is important to highlight that Fig. 2c belongs to an experimental campaign that aimed to understand perceptions of wealth and poverty. The objectives of the campaign were (i) to test if perception data could provide additional information on aspects of poverty that objective methods, such as identifying slums through EO, may not be able to capture, and (ii) to compare the results derived from perception data to the results produced by EO data, such as slum maps. The analysis is still ongoing.

Assessing the potential of Picture Pile for the SDGs
To understand the potential contribution of Picture Pile to the SDGs, we first summarize the campaigns undertaken by domain that are relevant to the SDGs and then link these to the findings in Table S1 of Fraisl et al. (2020), who already made a first mapping between SDG indicators and the Picture Pile application. Additional information, such as the number of participants, the number of observations and the length of the campaigns, provides further context for assessing this potential. We then provide results from Picture Pile by individual indicator to demonstrate how the data could be used in SDG monitoring and reporting.
Building on the methodology adopted by Fraisl et al. (2020), we categorized the contributions from Picture Pile to the SDG indicator framework as direct or supplementary contributions. Direct contribution means that data from Picture Pile could contribute to the calculation of the SDG indicators. For example, Picture Pile could directly contribute to the monitoring of indicator 11.5.2 Direct economic loss in relation to global GDP, damage to critical infrastructure and number of disruptions to basic services, attributed to disasters, by providing information on damaged buildings and infrastructure (Danylo et al., 2018). Supplementary contribution refers to data that are useful to contextualize an SDG indicator or target. For indicator 1.5.1 Number of deaths, missing persons and directly affected persons attributed to disasters per 100,000 population, Picture Pile could produce supplementary information on the Number of dwellings/houses destroyed attributed to disasters to contextualize this SDG indicator.
Indicator 1.5.2 is the Direct economic loss attributed to disasters in relation to global gross domestic product (GDP), which is further extended within indicator 11.5.2 to include damage to critical infrastructure and number of disruptions to basic services, attributed to disasters. The metadata for indicators 1.5.2 and 11.5.2 describe direct economic loss as the monetary value of total or partial destruction of physical assets existing in the affected area and highlight that it is nearly equivalent to physical damage (UN, 2021a). Many countries maintain a disaster loss database, which should be built (or adjusted) using the recommendations of the Open-ended Intergovernmental Working Group on Indicators and Terminology related to Disaster Risk Reduction (OEIWG). The disaster loss database contains the number of physical assets damaged by event, e.g., damaged buildings (houses, industrial and commercial facilities, etc.) and critical infrastructure, as well as agricultural loss.
The direct economic loss (disaggregated by event) is then calculated by multiplying the number of facilities damaged, the size of the facilities, and the unit cost of replacement (UN Office for Disaster Risk Reduction, 2016; UN, 2021b).
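The multiplication described above can be illustrated with a minimal sketch. The record structure, field names and all numbers below are hypothetical and chosen only for illustration; a real disaster loss database following the OEIWG recommendations will have its own schema and cost parameters.

```python
from dataclasses import dataclass


@dataclass
class DamagedAssetGroup:
    """One hypothetical row of a disaster loss database for a single event."""
    asset_type: str
    count: int                # number of facilities damaged
    avg_size_m2: float        # average size of one facility
    unit_cost_per_m2: float   # replacement cost per unit of size


def direct_economic_loss(groups):
    """Sum (count * size * unit replacement cost) over all asset groups."""
    return sum(g.count * g.avg_size_m2 * g.unit_cost_per_m2 for g in groups)


# Illustrative values only; not taken from any real event.
event = [
    DamagedAssetGroup("houses", 120, 50.0, 200.0),
    DamagedAssetGroup("commercial facilities", 8, 300.0, 350.0),
]
loss = direct_economic_loss(event)
print(f"Direct economic loss for the event: {loss:,.0f}")
```

In such a workflow, counts of damaged buildings derived from volunteer classifications would feed the `count` field, while sizes and unit costs would still have to come from national sources.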

Picture Pile campaigns
Picture Pile was used in 2017 to collect data on damaged buildings, using pairs of very high-resolution satellite images from before and after Hurricane Matthew (which occurred in October 2016) (Danylo et al., 2018). Working with the HOT, the idea was to demonstrate that Picture Pile could be used to collect information on damaged buildings in Haiti in a short period of time. The modified interface is shown in Fig. 3a, while Fig. 3b shows the damage map aggregated from the Picture Pile results.
The Damaged category was assigned when a minimum of 4 volunteers agreed on visible damage to buildings, whereas Likely Damaged only required 3 volunteers to agree. Where no majority agreement was reached, the area on the map was labelled as Unknown. The most common reason for unusable images was cloud cover. Half of all classifications were collected within the first five days of the campaign, usually from a small number of volunteers. If the network of HOT had been mobilized in real-time, such a map could have been created for all of Haiti within a week.
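The agreement rule described above can be sketched as follows. The thresholds of 4 and 3 agreeing volunteers come from the text; the vote labels, the function name and the treatment of a "Not Damaged" class (assigned here when a strict majority answers no) are assumptions made for illustration, since the campaign's exact implementation is not described.

```python
from collections import Counter


def classify_location(votes, damaged_min=4, likely_min=3):
    """Aggregate the yes/no/maybe votes for one location into a map category."""
    counts = Counter(votes)
    if counts["yes"] >= damaged_min:
        return "Damaged"
    if counts["yes"] >= likely_min:
        return "Likely Damaged"
    # Assumption: a strict majority of "no" votes means no visible damage.
    if counts["no"] > len(votes) / 2:
        return "Not Damaged"
    # No sufficient agreement, e.g., because of cloud cover in the imagery.
    return "Unknown"


print(classify_location(["yes", "yes", "yes", "yes", "no"]))
print(classify_location(["yes", "maybe", "no", "maybe", "yes"]))
```

Keeping the thresholds as parameters makes it straightforward to test how sensitive the resulting damage map is to the level of agreement required.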
The data generated by Picture Pile can directly contribute to the calculation of SDG indicators 1.5.2 and 11.5.2. This contribution can occur in two different ways. The aggregated damage map generated by the Picture Pile results could be used to highlight key areas of damage, which could then be used by damage assessors and insurance companies on the ground in undertaking a more focused spatial inventory of damaged assets. Alternatively, for countries where a comprehensive damage assessment is not possible, the results could be used to identify damage from satellite images to determine the number of destroyed or partly destroyed assets, including a rough area calculation.
Picture Pile could also provide supplementary contributions to the repeating indicators 1.5.1, 11.5.1 and 13.1.1 Number of deaths, missing persons and directly affected persons attributed to disasters per 100,000 population. The metadata for these indicators refer to the Technical Guidance for Monitoring and Reporting on Progress in Achieving the Global [...]
Indicator 2.4.1 is defined as the Proportion of agricultural area under productive and sustainable agriculture, which is calculated as the aggregation of 11 sub-indicators that jointly indicate productive and sustainable practices divided by the agricultural land area. The methodology for this indicator focuses on the use of a farm survey for the collection of data on productive and sustainable agricultural practices. Agricultural area is reported at national level to FAO (UN, 2021c). Picture Pile can collect data on agricultural land (cropland as well as grazing areas for livestock), which can directly contribute to the monitoring of this indicator and can be fed into remote sensing of agricultural areas, particularly for those countries where capacity for agricultural inventories and surveys is limited. Fig. 2a shows an example of Picture Pile being used to collect crop type from street level photographs while Cropland Capture (Fig. 1a) and Geo-Wiki, an IIASA platform for engaging citizens in environmental monitoring, have been used to collect reference data on cropland in the past (Sturn et al., 2015; Laso Bayas et al., 2017; Lesiv et al., 2019). This type of information, which has a considerable global distribution (Fig. 4), can be used to improve the denominator of this indicator at a national level.
Data were also collected through a partnership between Picture Pile and the Global Earth Challenge campaign. Led by EARTHDAY.ORG, the Wilson Center, and the U.S. Department of State, Earth Challenge 2020 looks across six research areas, including food security, to increase the amount of open and interoperable citizen science data in each, while concretely driving individual and policy change. The Picture Pile/Global Earth Challenge partnership focused on embedding a link to the Picture Pile application within the Global Earth Challenge mobile application (Fig. 4a). Once users open the link, they are prompted to download the Picture Pile app, where they can drag or swipe the image to classify crops as wheat, maize, sunflower, vineyard, sorghum, other, or "I don't know" (Fig. 2a). Through this campaign, 386 citizen scientists made 123,187 classifications (Table 1), where almost 50% of users were located in the USA, 4% in Australia and the remainder in Europe. In addition to providing data that could directly contribute to SDG indicator 2.4.1, this example also demonstrates how partnerships around SDG-oriented campaigns can help bring in a wider, more geographically diverse range of citizen science volunteers working together towards a common, global goal than would be possible with one mobile app or project alone.
The 11 sub-indicators of the numerator could be collected using other crowdsourcing tools, such as AgroTutor (Laso Bayas et al., 2020a), which was also developed by IIASA, to collect data on farming practices including fertilizer use and management of pesticides, which are two of the sub-indicators. As data are currently not available for this indicator (Tier 2), new digital approaches may help to facilitate farm surveys in the future.

SDG 11/SDG 1, Indicators 11.1.1/11.3.1 and 1.1.1/1.2.1/1.2.2
Indicator 11.1.1 is the Proportion of urban population living in slums, informal settlements or inadequate housing. Slums are defined as households that have one or more of the following characteristics: (i) lack of access to improved water sources (i.e., those protected from contamination); (ii) lack of access to improved sanitation facilities; (iii) lack of sufficient living area, where sufficient is defined as no more than 3 people sharing the same area with a minimum of 9 m²; (iv) lack of housing durability, where durable housing is defined as a permanent structure built on a non-hazardous location; and (v) lack of tenure security, i.e., protection from forced eviction. Informal settlements have the same criteria as slums while inadequate housing additionally has housing affordability as a criterion (UN, 2021d).
The sources of data for this indicator are censuses and household surveys but the lack of tools to measure these components at the city and national levels is acknowledged within the metadata for this indicator, which may result in underestimation of slums, informal settlements and inadequate housing (UN, 2021d). Moreover, these traditional sources of data collection are expensive and only take place every three to five years, while alternative data sources may provide cheaper and more up-to-date information for this indicator.
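To make the slum definition given earlier in this section concrete, the following minimal sketch flags a household as a slum household when it lacks one or more of the five conditions and then computes the share of the population living in such households. The record structure and field names are hypothetical, the sketch assumes all records refer to urban households, and it covers only the slum component (not informal settlements or inadequate housing).

```python
from dataclasses import dataclass


@dataclass
class Household:
    """Hypothetical urban household record, e.g., from a survey or census."""
    members: int
    improved_water: bool
    improved_sanitation: bool
    sufficient_living_area: bool
    durable_housing: bool
    secure_tenure: bool


def is_slum_household(h):
    """True if the household lacks one or more of the five conditions."""
    return not all([
        h.improved_water,
        h.improved_sanitation,
        h.sufficient_living_area,
        h.durable_housing,
        h.secure_tenure,
    ])


def slum_population_share(households):
    """Share of the population living in slum households (0 to 1)."""
    total = sum(h.members for h in households)
    if total == 0:
        raise ValueError("no population in the supplied households")
    in_slums = sum(h.members for h in households if is_slum_household(h))
    return in_slums / total


# Illustrative records only.
households = [
    Household(4, True, True, True, True, True),
    Household(6, False, True, True, True, True),
]
share = slum_population_share(households)
print(f"{share:.0%} of the sampled population lives in slum households")
```

Volunteer classifications of housing durability, as collected in the campaigns described in this section, would correspond to only one of these five fields.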
Picture Pile has been used in two campaigns to collect data related to the mapping of slum areas. Fig. 5a shows the results for locations identified as slums in Dhaka, Bangladesh. Volunteers were asked if slums were present based on very high-resolution satellite imagery and street level photographs, where the question was largely focused on the identification of housing durability. Comparing this to slum data mapped by Gruebner et al. (2014), based on delineation of slum areas in a GIS using visual interpretation of very high-resolution satellite imagery, similarities in the patterns are evident. In a second campaign, volunteers were asked to rate the degree of poverty/wealth on a sliding scale using street level images (see Fig. 2c for an example of the interface). The results are plotted in Fig. 5b, which may provide a proxy for housing affordability and inadequate housing. However, the main reason for capturing this information was to use both slum presence and degree of poverty/wealth as inputs to a machine learning (ML) model that additionally used variables derived from remote sensing in combination with a building data set to develop a wall-to-wall map of slum households. Hence, Picture Pile provided the reference data needed to produce a comprehensive slum map for Dhaka. Such an approach could be used in situations where traditional data collection tools cannot be implemented or to complement these more traditional data. Therefore, data produced by Picture Pile have the potential to directly contribute to the monitoring of this indicator. However, we do acknowledge the fact that socio-economic backgrounds and location may play a role in how volunteers perceive poverty and wealth. We provided some examples in this experimental study, but a wider scale implementation of such an approach would require training materials that are both culturally neutral and sensitive to differing concepts of poverty and wealth around the world.
The two aforementioned campaigns on slum mapping in Dhaka and on poverty/wealth could also provide supplementary information for the monitoring of indicators 1.1.1 Proportion of the population living below the international poverty line by sex, age, employment status and geographic location (urban/rural), 1.2.1 Proportion of population living below the national poverty line, by sex and age and 1.2.2 Proportion of men, women and children of all ages living in poverty in all its dimensions according to national definitions. For example, indicator 1.1.1 has a rigorous methodology that focuses on income using household survey data. The metadata of this indicator refers to issues around measuring poverty using the current methodology, such as timeliness, frequency, quality, and comparability of household surveys, particularly in the poorest countries (UN, 2021b). However, the combination of different sources of data could help to address these challenges. Picture Pile could support this indicator in a supplementary way by providing data on areas that could help understand and contextualize poverty, such as information on nighttime lights and proximity to basic services and infrastructure density, among others. These data could complement the survey data in a way that could be mapped out to identify the spatial distribution of poverty.
Indicator 11.3.1 is the Ratio of land consumption rate to population growth rate. The recommended methodology for calculating the land consumption rate is that individual countries use remote sensing to create a classification of built-up areas on an annual basis or for a time interval to be determined locally. From these national maps of built-up areas over time, the total built-up area is derived (UN, 2021e).
Although a number of global built-up products are available (Corbane et al., 2017;Marconcini et al., 2020), the metadata for indicator 11.3.1 does not recommend using these products because of inconsistency between globally and nationally produced numbers as well as the inability to adequately capture the uniqueness of local contexts (UN, 2021e). To support national efforts in remote sensing of built-up areas, Picture Pile could be used to support the rapid collection of training and/or validation data on built-up areas for these efforts. In 2019, a Picture Pile campaign was run to gather data on built-up areas based on a global random sample generated from the 2015 VIIRS (Visible Infrared Imaging Radiometer Suite) annual nighttime lights product (Elvidge et al., 2017), selecting only locations with some level of radiance as this correlates with the presence of built-up areas (Fig. 6).
For the monitoring of this indicator, individual countries could generate a sample for which very high-resolution satellite imagery is extracted to provide training and/or validation data to support national remote sensing efforts in producing the required built-up area products. Therefore, Picture Pile could provide a direct contribution to the monitoring of this indicator.
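The sampling step mentioned above, i.e., drawing a random sample restricted to locations with some level of nighttime radiance, can be sketched as follows. The grid is a toy stand-in for a nighttime lights raster, and the threshold of radiance greater than zero and the function name are assumptions; the actual sampling design of the 2019 campaign is not specified here.

```python
import random


def sample_lit_locations(radiance_grid, n_samples, seed=0):
    """Randomly sample (row, col) cells whose radiance is greater than zero."""
    lit = [
        (r, c)
        for r, row in enumerate(radiance_grid)
        for c, value in enumerate(row)
        if value > 0  # assumption: any positive radiance counts as "lit"
    ]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(lit, min(n_samples, len(lit)))


# Toy radiance values, not real VIIRS data.
grid = [
    [0.0, 0.0, 1.2],
    [0.0, 3.5, 0.0],
    [0.4, 0.0, 0.0],
]
sample = sample_lit_locations(grid, n_samples=2)
print(sample)
```

Each sampled cell would then be matched with very high-resolution imagery and added to the pile for classification by volunteers.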

SDG 14, Indicator 14.1.1b
Indicator 14.1.1b is the plastic debris density, which is supported by remote sensing undertaken globally for identification of plastic patches greater than 10 m², a numerical model run by UN Environment to estimate the likely presence and origin of beach litter found on coastlines, and data collected nationally on beach litter, plastic in the sea column, floating plastic and plastic on the sea floor (reported as an average count of plastic items per km²), with guidelines provided by GESAMP (2019). The latter is increasingly being supported by volunteers as part of beach cleanups, e.g., organized by the Ocean Conservancy's International Coastal Cleanup initiative, and numerous citizen science projects around the world (UNEP, 2021).
A Picture Pile campaign was run where volunteers were asked to classify images taken from an Unmanned Aircraft System (UAS), which surveyed the Stomio coastline in western Crete, using an interface like that shown in Fig. 7a. The UAS imagery was divided into a set of tiles (shown in Fig. 8a as yellow squares). From 1215 images, the majority of volunteers identified marine litter in 101 images (shown in Fig. 8a as red dots). This information was then used in a geospatial model to create a density map (Fig. 8b). However, the ultimate aim of the Picture Pile data collection would be to provide citizen science inputs to a machine learning model for identifying marine litter from both near-earth UAS and satellite-based remote sensing, see e.g., Papakonstantinou et al. (2021). It is also possible to modify Picture Pile to capture the percentage of marine litter in an area of known size using an interface like that shown in Fig. 7b, which may provide additional information for the machine learning algorithm.
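The reporting unit for this indicator, an average count of plastic items per km², could in principle be derived from item counts on image tiles of known size. The sketch below is a minimal illustration under the assumptions that all tiles have the same area and that a single count per tile is available (e.g., after aggregating volunteer answers); the function name and the numbers are hypothetical.

```python
def items_per_km2(item_counts, tile_area_m2):
    """Average count of plastic items per km2 over tiles of equal, known size."""
    if not item_counts or tile_area_m2 <= 0:
        raise ValueError("need at least one tile and a positive tile area")
    total_items = sum(item_counts)
    total_area_m2 = len(item_counts) * tile_area_m2
    # 1 km2 = 1,000,000 m2
    return total_items * 1_000_000 / total_area_m2


# Illustrative counts for four tiles of 100 m2 each.
counts = [0, 3, 1, 0]
density = items_per_km2(counts, tile_area_m2=100.0)
print(f"{density:,.0f} items per km2")
```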
The imagery from the UAS could also be used by volunteers to count plastic items found in each image (of known size), which could provide supplementary data to beach cleanups and in-situ citizen science projects or could supplement and validate other methodologies used during beach cleanup campaigns, such as paper and pencil checklists.

SDG 15, Indicators 15.1.1, 15.2.1, 15.4.2
Indicator 15.1.1 is forest area as a proportion of total land area, where the methodology for this indicator is based on national reporting to the UN Food and Agriculture Organization's Forest Resource Assessment (FAO-FRA) every five to ten years. Indicator 15.2.1 is progress towards sustainable forest management, which is composed of five sub-indicators: (i) Forest area annual net change rate; (ii) Above-ground biomass stock in forest; (iii) Proportion of forest area located within legally established protected areas; (iv) Proportion of forest area under a long-term forest management plan; and (v) Forest area under an independently verified forest management certification scheme.
Sub-indicator (i) is calculated by FAO from forest area reports over time while the other sub-indicators are reported to FAO as part of the FAO-FRA exercise. Where there are missing data for a country, FAO makes an estimate based on previous assessments, remote sensing and/or information from the literature. Indicator 15.4.2, which is the Mountain Green Cover Index, is also managed by FAO. It is calculated as green areas (defined as forest, grassland, cropland and wetland), taken from the annually produced land cover product from the Climate Change Initiative of the European Space Agency (ESA-CCI), divided by mountainous area (defined above 300 m elevation), calculated from a global mountain layer developed by Kapos et al. (2000).
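The ratio behind the Mountain Green Cover Index, as described above, can be illustrated with a short sketch. The cell representation, the class labels (plain strings rather than the ESA-CCI legend codes), the function name and the choice to express the result as a percentage are assumptions made for this example; it is not the FAO implementation.

```python
# Land cover classes counted as "green", following the definition in the text.
GREEN_CLASSES = {"forest", "grassland", "cropland", "wetland"}


def mountain_green_cover_index(cells):
    """Compute green mountain area as a percentage of total mountain area.

    `cells` is an iterable of (land_cover_class, is_mountain, area_km2)
    tuples. This layout is hypothetical and chosen only for illustration.
    """
    green_area = 0.0
    mountain_area = 0.0
    for land_cover, is_mountain, area_km2 in cells:
        if not is_mountain:
            continue  # only cells inside the mountain layer are counted
        mountain_area += area_km2
        if land_cover in GREEN_CLASSES:
            green_area += area_km2
    if mountain_area == 0:
        raise ValueError("no mountainous area in input")
    return 100.0 * green_area / mountain_area


# Hypothetical cells: 100 km2 of mountain, of which 50 km2 is green.
example_cells = [
    ("forest", True, 40.0),
    ("cropland", True, 10.0),
    ("bare", True, 50.0),
    ("forest", False, 200.0),  # outside the mountain layer, ignored
]
print(mountain_green_cover_index(example_cells))
```

In this framing, reference data from volunteers would act on the `land_cover_class` values: correcting a misclassified cell changes the numerator, which is where errors in the underlying land cover product propagate into the index.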
For each of these indicators, Picture Pile could provide reference data on forests to complement the estimates of forest area, particularly if remote sensing is used, or to improve the data set on green areas, which is currently based on the ESA-CCI yet has been shown to both over- and underestimate cropland when compared to FAO statistics (Liu et al., 2018). Hence, reference data on cropland could be useful to improve estimates of this land cover type in mountainous areas (see Section 3.2.2). The Urandata land cover campaigns, run in Indonesia as part of the RESTORE+ project (RESTORE+ Consortium, 2020) on land restoration, are also examples of data collection campaigns that could contribute land cover reference data to improve the underlying land cover data sets at the national level.
The calculation of sub-indicator (i) of 15.2.1 could also be supported by Picture Pile, e.g., through a campaign that is used to collect data on deforestation (Fig. 1a), where the results for Tanzania are shown in Fig. 9. The amount of deforestation can be used to support the estimate of the forest area annual net change rate, which is calculated by FAO. Countries can verify these estimates if they have better information on deforestation. Therefore, Picture Pile could provide a direct contribution to the monitoring of these indicators.

Discussion
Monitoring the SDGs is key to their achievement. However, official statistics alone will not be capable of producing information on the scale needed to measure progress on the SDGs and populate the SDG indicator framework. As noted above, more than 40% of the SDG indicators are still Tier 2, and many of the Tier 1 indicators still suffer from significant data gaps. To fill this gap, new approaches to data collection and production will be needed (MacFeely and Nastav, 2019). Citizen science is one of the new data sources that could complement and enhance official statistics. In what has been characterized as the 'Data Revolution', the data ecosystems in which NSSs operate now include not just administrative data, but big data in all of its myriad forms and crowdsourced or citizen-generated data. Consequently, this is a timely research topic for the direction and future of SDG monitoring and implementation, as ways to improve official statistics using citizen science tools and approaches are being widely debated within the citizen science and statistics communities; see, for example, the Bogota Declaration (UN, 2017); the Dubai Declaration (UN, 2018a); or the UN Statistical Commission Seminar on Emerging Issues (UN, 2019). However, there are only a few studies that discuss the potential of such tools in a way that directly addresses the needs and concerns of official statistics compilers and users (Ajates et al., 2020; Fraisl et al., 2020).
Here we have shown that Picture Pile, a citizen science tool that combines EO and crowdsourcing approaches, could contribute to the monitoring of fifteen SDG indicators under goals 1, 2, 11, 13, 14 and 15. In terms of the type of contributions, Picture Pile could support the monitoring of eight indicators directly and seven indicators in a supplementary way, which aligns with the findings of Fraisl et al. (2020). The analysis presented in this paper is based on the Picture Pile campaigns undertaken to date, which are summarized in Table 1.
In terms of tier classification, some of the indicators that could benefit from Picture Pile are currently Tier I with an internationally established methodology and regular data production (e.g., 11.1.1, 15.1.1, etc.), whereas the others are Tier II with existing methodologies but no regular data production (e.g., 2.4.1, 11.5.2, etc.). Even though Picture Pile has greater potential for input to Tier II indicators, as a data gap exists, it also offers potential for Tier I indicators, which require that data are produced regularly by at least 50% of the countries. Even countries that have data available for Tier I indicators may not have these data consistently over time, with sufficient timeliness or with adequate coverage to facilitate any meaningful disaggregation.
Picture Pile also offers great potential for monitoring the indicators concerning ecosystem health. There are a number of SDG indicators that capture ecosystem extent, ecosystem health or other environmental quality issues in the SDG framework (e.g., 6.3.2 on water quality, 6.6.1 on water-related ecosystems, 15.1.1 on forest ecosystems, 15.3.1 on land degradation, etc.). These indicators are based largely on remote sensing data. Although the current data products provide data layers related to the overall extent of ecosystems, there is a lack of available information related to the health of different ecosystems: ecosystems are designated in a binary fashion as present or absent. Picture Pile has the potential to add a layer of analysis that could be used to create a gradient of ecosystem health. The use of Picture Pile for the SDGs would thus allow policy makers to identify when ecosystem health is declining and develop targeted interventions before it is too late and the ecosystem has been damaged irreparably or has disappeared completely. Additionally, Picture Pile could be used to bring additional remote sensing data into the analysis through the creation of training data and could be used to improve the resolution of the current analytical products, helping realize the value of triangulating citizen science, EO, and ML.
Although not yet tested in a campaign, other SDG indicators could also benefit from Picture Pile, such as 6.4.2 (Level of water stress: freshwater withdrawal as a proportion of available freshwater) and 11.7.1 (Average share of the built-up area of cities that is open space for public use for all, by sex, age and persons with disabilities), the latter by providing maps of open and green spaces in cities, among others.
Picture Pile also has the potential to support SDG monitoring efforts by providing an open, free and easy-to-use platform that allows users to set up and run their own crowdsourced image classification campaigns that could be used to train ML algorithms as part of a European Research Council (ERC) funded project. By providing mechanisms to ensure data quality, such as collecting multiple classifications per image from different users, which can provide confidence in the results, or providing training images with instructions, the newly established Picture Pile Platform will improve the accuracy of the data collected by any user, which is of utmost importance for measuring progress towards the SDGs more rigorously.
Some limitations in the use of Picture Pile for SDG monitoring, such as a lack of human capacity and financial resources, may slow implementation; these also apply to other citizen science tools and other new sources of data. Other issues, for example, data protection and privacy-related issues, may also need to be resolved. However, as the Picture Pile Platform will adhere to the FAIR (Findable, Accessible, Interoperable and Reusable) principles and will operate in compliance with the EU General Data Protection Regulation (GDPR), most governance and privacy issues should already be addressed. Users will be required to agree to a set of Terms and Conditions that will clearly set out what data are being collected, how the data are being used, and the option to delete any data upon request. Another important consideration is the type of task asked of the volunteers and whether the answers may be influenced by socio-economic background, world view or geographical location, e.g., in the campaign on slum mapping. Thus, careful consideration of the types of questions asked and the guidance provided to the users, which should be as culturally neutral as possible, is needed.
It is important to highlight that this paper is not an attempt to present Picture Pile or any other citizen science tool and new source of data as a way to replace traditional sources of data, but instead, as a way to complement and enhance them. It is essential to be aware of both the advantages and the limitations of such tools and the data they produce, and to try to minimize the limitations when designing a data collection activity, analyzing the data produced and deriving results from them.

Conclusions, recommendations and future work
This paper outlined the potential of Picture Pile as a citizen science tool for SDG monitoring, particularly for countries and indicators where data are currently lacking. The analysis reveals that Picture Pile could directly contribute to the monitoring of eight indicators, and support seven indicators by providing supplementary information. In total, Picture Pile could contribute to the monitoring of fifteen indicators.
Picture Pile, as with many other citizen science tools and initiatives, could also help make the SDGs and other national goals local, encouraging citizens to get involved in SDG monitoring and reporting efforts while remaining scientifically independent of any political or SDG organizational processes. It could also help mobilize action by raising awareness of relevant environmental and societal challenges. However, providing a precise assessment of the impact of Picture Pile on SDG achievement requires a well-designed and targeted impact monitoring and evaluation approach. Additionally, understanding volunteer motivation is essential for recruiting and retaining volunteers in citizen science initiatives to produce high-quality results. Therefore, research is needed to understand participant motivations and relevant design improvements in Picture Pile.
Further, if Picture Pile is to be used for SDG monitoring, the potential of this tool must be showcased to governments, NSOs and relevant custodian agencies. Forums, such as the WeObserve SDGs and Citizen Science Community of Practice (SDGs CoP), the UN World Data Forum, the International Conference on Big Data for Official Statistics, the UNEP Science Policy Business Forum and other relevant conferences and platforms, could be used. Peer-reviewed publications and reports, as well as discussions and partnerships with the relevant custodian agencies, such as the FAO and UNEP, could also be explored. Additionally, NSOs need to be guided and supported in how to use Picture Pile to address their data gaps and needs. Picture Pile is currently being operationalized into the Picture Pile Platform, which will provide a service where users can run their own campaigns with no associated costs. This will make it easier for countries to use Picture Pile for SDG monitoring. At the same time, step-by-step guidelines and how-to manuals will be required to support countries in using Picture Pile. In addition, we plan to incorporate a visualization tool on the platform so that any data collected can be mapped in near-real-time, which can then be linked to information about how the data will be or are being used. In this way, the platform will engage users beyond simple data collection.
Another important step could be the development of best practice 'use cases' in cooperation with pioneering NSOs, line ministries and other relevant governmental agencies, to illustrate how Picture Pile can be used for SDG monitoring. In this regard, the treatment of confidential data will be of paramount importance to NSOs, where adherence not only to the UN Fundamental Principles of Official Statistics (UN, 2014), but also to their own national statistical legislation, will never be far from the minds of national statisticians. Additionally, in cases where governments are not willing to use Picture Pile or citizen science data more broadly, mainly for political reasons or lack of priorities and resources, this approach becomes even more relevant as an effort to democratize science and policy and to keep governments accountable, particularly in situations where the government cannot control the collection of data. Capacity building to enable citizen engagement and the use of mobile technology for data collection are two of the recommendations by Saner et al. (2018) in the example of human trafficking and the SDGs, where it is recognized that more needs to be done to fill the data gaps in this highly sensitive area.
From a broader perspective, the successful uptake of the Picture Pile tool and results, as well as other citizen science tools, for SDG monitoring and reporting will require establishing some mechanism(s) to reassure NSOs that the data coming from Picture Pile meet the standards of their national statistical quality assurance frameworks or codes of practice. One step that Picture Pile could consider, in addition to adopting the FAIR principles, would be to incorporate the UN Fundamental Principles of Official Statistics and UN Statistical Quality Assurance Framework (UN, 2018b) into its governance framework.
There are also a number of strategic developments that may assist in the use of Picture Pile and other citizen science data for official statistics. In 2020, the Chief Statisticians launched their System-wide Road Map for Innovating United Nations Data and Statistics (UN, 2020d). This roadmap comprises three goals, the first of which is 'Create new and timely data solutions'. More explicitly, the ambition of goal 1 is that the UN system creates new data solutions using multiple data sources that address emerging policy issues and describe 'what matters, when it matters'. Additionally, the ongoing work to put in place a global governance structure for citizen science through the Global Citizen Science Partnership (GCSP) will help enormously in this regard, as it will make it easier for the international statistical system and the citizen science communities to agree on and promulgate standards and accords. A longer-term strategic initiative that could greatly facilitate the use of a wide range of secondary data, not just citizen science data, would be the establishment of formal accreditation or certification mechanisms (as part of NSSs) that would create pathways for unofficial statistics to be recognized as official for the purposes of SDG monitoring and reporting (MacFeely and Nastav, 2019).
Picture Pile, as one of many citizen science tools and initiatives, offers great potential for complementing official statistics for SDG monitoring and reporting in a time- and resource-efficient way. However, its uptake for SDG purposes is a process that requires time, capacities and, most importantly, a change in business-as-usual mindsets and approaches. Building partnerships that include NSOs, governments, custodian agencies, academia, volunteer groups, Civil Society Organizations and other stakeholders will almost certainly be at the heart of developing future NSSs. It will also be key to building the trust required to agree on common goals, which in turn will facilitate the use of Picture Pile and other citizen science tools and initiatives for SDG monitoring and impact.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.