Towards user-driven earth observation-based slum mapping

The capability of Earth observation (EO) to produce up-to-date geographical information on slums over large areas, in support of urban planning and evidence-based policymaking, is widely acknowledged. However, most EO studies use a data-driven approach without an understanding of end-user requirements. This study addresses this gap by aligning EO methods with societal needs and concerns using a user-driven approach in Accra, Ghana. By carrying out in-situ observations and interviews with slum experts, we produced a user-driven slum map that meets potential users' expectations. To do so, we used a random forest classifier, SPOT 6 imagery, and ancillary geospatial data such as OpenStreetMap information. The overall classification accuracy for the user-driven approach reached 84%. The results show that adding local context-knowledge, end-user requirements, and geo-ethics helps to better contextualise and conceptualise slums. Our research demonstrates an approach to slum mapping that is reflective and open to societal needs and concerns.


Introduction
Rapid urbanisation in Low- and Middle-Income Countries (LMICs) has led to a growing number of slum dwellers with inadequate access to infrastructure and services (UN-Habitat, 2015). In Sub-Saharan Africa, 62% of the urban population lives in slums (UN-Habitat, 2016). Slum dwellers are often characterised by socioeconomic vulnerability, including poverty and unemployment, and slums are usually located in environmental risk areas such as flood zones (UN-Habitat, 2016). The poor living conditions of slum dwellers have been further exacerbated by the COVID-19 pandemic, as they face the dire consequences of the crisis, such as loss of income (United Nations, 2020). Therefore, systematic monitoring of the urban environment is essential for local, national, and international organisations, and for the realisation of the sustainable development goals (SDGs), in particular SDG 11 (sustainable cities and communities). Unfortunately, reliable and up-to-date spatial information is often not readily available (Mahabir, Crooks, Croitoru, & Agouris, 2016).
Although EO-based slum mapping is widely studied, what is typically missing is the link to local data needs. The main challenges EO scientists face include an inadequate understanding of the local context, poorly understood end-user requirements (Thomson et al., 2020) and a lack of acknowledgement of the ethical and privacy concerns associated with slum mapping. The two main elements that contribute to successful EO-based slum mapping are (i) contextualising slums, which allows an understanding of the local context and user requirements, and (ii) conceptualising and building models, which allows local slum characteristics to be translated into image features. Generally, the methods of the EO community have been data-driven, with an inadequate understanding of the spatial information needs of different end-users. Map-makers are often uncertain about the map scale, aggregation level, accuracy measures, and level of detail required by end-users (Thomson et al., 2020). Similarly, end-users have misconceptions about, and low awareness of, the potential and limitations of EO (Leonita, Kuffer, Sliuzas, & Persello, 2018). For example, noise (the salt and pepper effect) in EO-based classifications can give an impression of inaccuracy and make users reluctant to use them (Leonita et al., 2018).
The task of slum mapping is a sociotechnical problem with an increasing risk of unintended consequences. In this context, sociotechnical means that the slum mapping task is not just technical but also has a complex socio-cultural dimension that needs to be considered (Durk, 2017), as the products are used in a real-world context for making decisions that can heavily impact society. Slum information is sensitive and political, and its misuse threatens marginalised communities. Geo-ethics is often about fairness, accountability and transparency in the way people (in this context, slum dwellers) are mapped, represented and treated (Durk, 2017). Therefore, the process of mapping slums and disseminating the resulting information should address geo-ethical issues. Unfortunately, geo-ethical concerns about slum mapping are widely neglected in the literature. The lack of geo-ethical considerations can jeopardise marginalised communities (e.g., through the risk of eviction and stigmatisation) (Gevaert, Sliuzas, Persello, & Vosselman, 2018).
Motivated by the need to produce policy-driven geoinformation, this study presents a novel user-driven approach for mapping slums by integrating local context-knowledge, end-user requirements and geo-ethics in making slum information publicly available for Accra, Ghana (Fig. 1). First, we applied a global slum ontology framework to create a prototype map to discuss with end-users. In this way, we sought to understand the views of end-users in order to contextualise the slum mapping process. Second, we conceptualised and built a model that meets the expectations of end-users while addressing geo-ethical concerns. It uses an open-access processing chain and low-cost SPOT 6 imagery. This solution is relevant for LMICs with limited funds and a rapid pace of urbanisation, which implies the need for frequent slum map updating.
The main contribution of this paper is twofold. First, it identifies knowledge-based features, including local context-knowledge, end-user requirements and geo-ethics, to contextualise slums. EO studies typically classify slums without fully understanding the end-user requirements and often do not ask critical questions on how maps should be produced to ensure ethical data sharing. Second, it proposes a user-driven EO-based approach that integrates these knowledge-based features into the machine learning-based mapping model. Our proposed approach aggregates slums to an appropriate unit to avoid unnecessary details and ensure ethical data sharing.
The structure of the paper is as follows. Section 2 describes the study area and data. Section 3 demonstrates the methodological workflows used for the study. The main findings are presented in Section 4. In Section 5, we discuss the main findings, limitations and areas for further improvement, followed by conclusions in Section 6.

Study area
This study was conducted in the Greater Accra Region, Ghana. Historical effects of race-based town planning, military cantonments, migrant communities, and rapid urbanisation have led to the proliferation of slums (Agyei-Mensah & Owusu, 2010). The Greater Accra Region had an estimated population of 4.9 million in 2019. Within the Accra Metropolitan Assembly (AMA) limit, about 38.4% of the population lives in slums (AMA, 2011).
The official slum dataset is highly fragmented, inconsistent and not up-to-date. Such datasets are usually produced in collaboration with international organisations and are often not maintained after the project ends. The most recent slum mapping activity was carried out in 2016 within the Accra Metropolis using a participatory rapid appraisal tool (People's Dialogue, 2016). The mapping was done at the neighbourhood scale, with slum sizes ranging from 0.01 to 2.3 km². Survey-based mapping is labour-intensive, costly and time-consuming, especially when regular updating is needed.
In this study, the urban centre boundary of 2014 obtained from the global human settlement layer (GHSL) (Florczyk et al., 2019) was used as the area of interest (AOI). This is because we wanted to capture the intra-urban diversity of slums at the urban region level, as the city now merges with Kasoa in the Central Region and with Nsawam, Berekuso and Aburi in the Eastern Region. Fig. 2 illustrates the extent of the AOI, covering 764.3 km², superimposed with the district units.

Input data
This study used both primary and secondary data (Table 1). Primary data include SPOT 6 imagery of 2017 obtained from the European Space Agency (ESA) through a third-party grant. At the time of the research, this was the most recent cloud-free imagery that covered the AOI. Accra frequently suffers from cloud cover and dust storms. The SPOT 6 imagery consists of a panchromatic band of 1.5 m resolution and 4 multispectral bands of 6 m. Images were pansharpened using the modified intensity hue saturation (IHS) resolution merge algorithm with nearest neighbour resampling in Erdas Imagine (Siddiqui, 2003). Primary data also include field photos and expert interviews to understand the local context, end-user requirements, and geo-ethics. Secondary data include the official slum dataset, used for training and validation of the machine learning-based slum mapping, and OpenStreetMap (OSM) data (OpenStreetMap contributors, 2020). The OSM data include linear features such as roads, railways, rivers, streams and drains, used for creating street-block(s), as well as OSM landcover information (e.g., buildings and trees), used for training and validation.

Software
The study relied on free and open source software for geospatial (FOSS4G) solutions. FOSS4G solutions are particularly beneficial for LMICs characterised by limited funds and allow anyone to review and adapt them to their needs (Rico & Maseda, 2012). GRASS GIS (GRASS Development Team, 2019) and QGIS (QGIS Development Team, 2020) were used for raster and vector processing, respectively. PostgreSQL with the PostGIS extension was used for storing, managing and processing large vector datasets. The Python (Van Rossum & Drake, 2009) and R (R Core Team, 2019) languages were used for advanced statistical methods, mainly those pertinent to machine learning. The whole workflow was implemented in a Jupyter notebook to allow sharing of code for reproducibility. The code is available in a dedicated GitHub repository (https://github.com/maxwellowusu/Accra_slum_map.git).

Methodology
The methodology of the research consists of two main steps, as depicted in Fig. 3. In the first step, we focused on a data-driven approach (i.e., the common EO approach) for slum mapping using the global slum ontology (GSO) of Kohli, Sliuzas, Kerle, and Stein (2012). The resulting slum map served as a prototype for investigating end-user requirements and the causes of misclassification. The second step was the user-driven approach, split into two parts. The first part consists of field observation and expert interviews to understand the causes of misclassification and end-user requirements (commonly not done by EO experts). The outcome provided informative knowledge for contextualising and conceptualising the slum mapping process. The second part focused on a user-driven slum mapping that integrates local context-knowledge, end-user requirements, and geo-ethics to produce geoinformation that meets the expectations of potential users as far as possible (our proposed user-driven approach). A detailed overview of the steps is given below.
Fig. 3. Overview of the methodology, split into two main steps.

Data-driven slum mapping (Step 1)
A machine learning-based mapping using the GSO framework was implemented. Slums were mapped using a workflow similar to Grippa et al. (2018). This workflow was, however, enriched with the local knowledge of the first author regarding the characteristics of the mapped slum areas. Fig. 4 shows the process of the data-driven approach. The first step consists of extracting Grey Level Co-occurrence Matrix (GLCM) texture features and the normalised difference vegetation index (NDVI) (see Supplementary material Section 1 for texture extraction details). Next, we combined the GLCM, NDVI, and multispectral bands to map landcover (LC) using the Supervised Sequential Maximum a Posteriori (SMAP) classifier (Mccauley, Engel, & Clover, 1995). LC classes include built-up, vegetation, water, and bare land (see Supplementary material Section 2 for land-cover classification details). The overall accuracy of the LC product was 89%. The result of the LC classification was used as an input for the land-use (LU) classification to help the algorithm discriminate between classes. For example, a high-income residential area has a different composition of LC types (e.g., built-up, vegetation, bare land) compared to slums (usually built-up).
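To make the NDVI input concrete, the following minimal sketch applies the standard definition NDVI = (NIR - Red) / (NIR + Red) to a tiny grid of reflectance values. The function names and the sample values are hypothetical and chosen for illustration only; the study itself derived NDVI from the SPOT 6 bands within its GRASS GIS workflow, which is not reproduced here.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel; 0.0 when both bands are zero (assumed convention)."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def ndvi_band(nir_band, red_band):
    """Apply ndvi() cell by cell to two equally sized 2-D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical reflectance values (not taken from the SPOT 6 scene).
nir = [[0.50, 0.30], [0.10, 0.00]]
red = [[0.10, 0.30], [0.30, 0.00]]

result = ndvi_band(nir, red)
```

High positive values indicate vegetation, while values near zero or below are typical of built-up surfaces, bare land or water, which is why the index helps to separate the LC classes listed above.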
Before the LU classification, we extracted street-block(s) to serve as the mapping unit. The street-block is the most fundamental and appropriate unit for mapping urban structure types (Bochow, Taubenböck, Segl, & Kaufmann, 2010). It shows homogeneous structure types and provides sufficient spatial detail for this study (Grippa et al., 2018). Similar to Grippa et al. (2018), we relied on OSM data to generate street-block(s), since the official network datasets were not readily available for the AOI (see Supplementary material Section 3 for street-block(s) extraction details). The minimum size of the street-block(s) is 0.5 ha, the mean is 3.1 ha, and the standard deviation is 10.3 ha.
LU classification was performed using a random forest (RF) classifier (Breiman, 2001). RF is an ensemble learning method that achieves high prediction accuracy, can handle high data dimensionality, and is robust to overfitting (Belgiu & Drȃguţ, 2016). It is also efficient in parameter selection and computationally fast. The land-use classes include slum, high-density residential, low-density residential, non-residential and non-built-up. The choice of these classes was based on visual interpretation of the urban structure types. Details of the LU classification are given in Supplementary material Section 4.

Uncertainty analysis
The occurrence of misclassifications in EO-based applications is inevitable. Such misclassifications affect the mapping product's credibility (Pratomo, Kuffer, Martinez, & Kohli, 2017). Apart from using the confusion matrix, an analysis of the uncertainty of the LU classification results was carried out at the street-block(s) level using the Equivalent Reference Probability (ERP) measure (Bogaert, Waldner, & Defourny, 2017). ERP is built on the concept of information-based criteria and has the advantage of taking the maximum probability value into account while also accounting for the full set of probabilities. It provides the uncertainty associated with every street-block. The uncertainty analysis focused on slum street-block(s) only.
In this study, uncertainty is defined as the probability that a street-block is correctly classified. It can be expressed as existential and extensional uncertainty (Molenaar, 2000). Existential uncertainty refers to the possibility that a street-block is classified as a slum but does not correspond to a slum on the ground, or the possibility that a slum street-block is not detected. Extensional uncertainty refers to the level of confidence with which a street-block is classified as a slum. Most RS-based slum mapping studies report overall accuracies from 70 to 90% (Kuffer et al., 2016). Therefore, street-block(s) were labelled uncertain if the ERP was less than 70%. This step resulted in the first map of slums, which was used in step 2 to improve the map.
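The thresholding rule described above can be sketched as follows. This is an illustration only: the ERP values are taken as given inputs (the ERP computation of Bogaert et al. (2017) is not reproduced), and the block identifiers, function name and sample values are hypothetical.

```python
ERP_THRESHOLD = 0.70  # from the text: slum blocks with ERP below 70% are labelled uncertain


def label_slum_block(predicted_class, erp):
    """Relabel a predicted slum block as 'uncertain' when its ERP is below the threshold.

    Non-slum classes are returned unchanged, because the uncertainty
    analysis in the text covers slum street-blocks only.
    """
    if predicted_class != "slum":
        return predicted_class
    return "slum" if erp >= ERP_THRESHOLD else "uncertain"


# Hypothetical street-blocks: (identifier, predicted LU class, ERP value).
blocks = [
    ("b1", "slum", 0.91),
    ("b2", "slum", 0.55),
    ("b3", "high-density residential", 0.40),
]

labels = {block_id: label_slum_block(cls, erp) for block_id, cls, erp in blocks}
```

Keeping the threshold as a named constant makes it straightforward to test how sensitive the share of uncertain blocks is to the 70% cut-off.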

Field observations and expert interviews
Fig. 4. The process of data-driven slum mapping.
In-situ observations were conducted to investigate the causes of misclassification in the data-driven model described above (land-use classification results from step 1). Fourteen location points were purposefully sampled using specific criteria based on visual assessment of the data-driven classification results (see Fig. 5). The selection criteria are:
1. Areas classified as slum (more than 70% certainty) that are not slums in the available reference data.
2. Areas classified as uncertain (less than 70% certainty) that are slums in the reference data.
3. Areas classified as high-density residential that are slums in the reference data.
4. Areas classified as non-residential that are slums in the reference data.
Afterwards, a buffer of 1000 m was created for each point. Using a Quick Scan approach (Ajami, Kuffer, Persello, & Pfeffer, 2019), field observation was carried out within the defined 1000 m buffer. The Quick Scan fieldwork was designed to collect slum data from all the different buffer zones within three weeks. As the walk progressed, stops were made at key features, and pictures were taken.
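The four selection criteria above can be expressed as a simple rule that flags candidate street-blocks for a field visit. This is only a sketch of the logic: in the study the fourteen points were chosen purposively by visual assessment, and the function name, class labels and sample records below are assumptions made for illustration.

```python
def candidate_criterion(map_class, reference_is_slum):
    """Return the number (1-4) of the selection criterion a block matches, or None.

    map_class is the class shown on the data-driven map, where predicted slum
    blocks with less than 70% certainty already carry the label 'uncertain'.
    """
    if map_class == "slum" and not reference_is_slum:
        return 1
    if reference_is_slum:
        if map_class == "uncertain":
            return 2
        if map_class == "high-density residential":
            return 3
        if map_class == "non-residential":
            return 4
    return None


# Hypothetical blocks: (identifier, class on the map, slum in the reference data?).
blocks = [
    ("b1", "slum", False),
    ("b2", "uncertain", True),
    ("b3", "high-density residential", True),
    ("b4", "non-residential", True),
    ("b5", "slum", True),
]

candidates = {block_id: candidate_criterion(cls, ref) for block_id, cls, ref in blocks}
```

Blocks that return None, such as correctly classified slums, show agreement between the map and the reference data and therefore fall outside the four criteria.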
Next, the spatial information required by end-users was collected and analysed. Furthermore, geo-ethics-related information was collected through interviews with local institutions following the approach proposed by Brey (2012). This approach consists of three levels: the technology level, the product level, and the application level (Fig. 6). In this paper, the technology level focused on ethical concerns about machine learning, input features, and accuracy metrics. The product level focused on the product and the deliverables to be included when making slum information publicly available. The application level focused on the uses of the product by different institutions and community-based organisations, and the impact on slum dwellers.
We conducted interviews with the institutions listed in Table 2. These institutions were identified and purposely selected from the literature as they relate to slum issues in Accra (see reports on slums: AMA, 2011; Engstrom et al., 2015; People's Dialogue, 2016). Using semi-structured interviews (see Supplementary material Section 5), several topic-focused discussions were organised to understand the end-user requirements, the geo-ethical considerations in making slum information publicly available, and the local context-knowledge.
To obtain diverse views on the subject matter, we interviewed four planners from four different districts (Accra Metropolitan Assembly, Tema Metropolitan Assembly, Ablekuma Municipal Assembly and Ledzokuku Municipal Assembly), two experts from PWD (Tema Metropolitan Assembly and La-Dade Kotopon Municipal Assembly) and one expert from each of the other institutions.

User-driven slum mapping
The field observations collected at the locations depicted in Fig. 5 and the expert-interview outcomes helped us understand the end-user requirements, geo-ethics, and local context-knowledge required to produce user-driven geoinformation. First, we contextualised the morphology of slums by incorporating different stages of slum development into the initial LU classification schema, which helps increase the quality of the training data and reduce uncertainties (e.g., old traditional townships). Next, we analysed the end-user requirements, geo-ethics, and local context-knowledge from the interviews and conceptualised the mapping process. Using a workflow similar to that of step 1, we modelled a user-driven machine learning-based slum map. Lastly, we created an interactive web-based interface that provides enough information, including uncertainty levels and ground-validated areas, to disseminate the results.
The mapping process is detailed as follows. First, we characterised slums in the Accra urban region. Slums in Accra vary in size, nature and typology. Four slum development stages were identified, similar to Sliuzas, Mboup, and de Sherbinin (2008), namely kiosk (temporary), infant, consolidated, and matured slums. Kiosks are the earliest stage of development, with only temporary structures. In the next stage, they become infant slums, characterised by a mix of temporary and permanent structures with few houses. They become consolidated when they grow in number, with the introduction of some services such as water and improved living conditions. The matured stage is reached when growth leads to high densification and the settlement boundary has taken a definite shape. Table 3 describes the characteristics of the different stages of slums identified in Accra. We adopted the global slum ontology framework proposed by Kohli et al. (2012) to ensure consistency.
Based on the outcome of the end-user requirements analysis and the distinct morphological characteristics of slums identified during field observation, we generated a new training and validation dataset and repeated the steps in Section 3.1. The revised LU scheme includes infant slums, matured slums, high-density residential, low-density residential, non-residential (commercial, industrial, and administrative) and non-built-up (vegetation and open space). Consolidated slums were merged with matured slums due to the difficulty of obtaining samples. This was done to reduce uncertainty, because they have a similar appearance to matured slums. Kiosk slums were not included because of geo-ethical concerns raised by end-users, as described in Section 5.2.
In total, 500 street-blocks were randomly sampled for training and validation. Labels were assigned according to the dominant LU class in the sampled street-blocks, using visual interpretation and local knowledge acquired during the field campaign. The random sample was imbalanced for the infant slum, matured slum, and high-density residential classes. Given the sensitivity of machine learning to imbalanced training samples, an extra 235 street-blocks were manually sampled (not randomly) to capture the infant slum, matured slum, and high-density residential classes. In total, 735 samples were used. Samples were randomly split into 67% for training and 33% for validation.
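The sample bookkeeping and the 67/33 split can be illustrated with a short sketch. It assumes a simple (non-stratified) random split with a fixed seed, since the text does not state whether the split was stratified by class; the sample identifiers, function name and seed are hypothetical.

```python
import random


def split_samples(sample_ids, train_fraction=0.67, seed=42):
    """Shuffle the sample identifiers and split them into training and validation sets."""
    rng = random.Random(seed)  # fixed seed (assumed) so the split is reproducible
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# 500 randomly sampled street-blocks plus 235 manually added ones, as in the text.
sample_ids = list(range(500 + 235))

train_ids, validation_ids = split_samples(sample_ids)
```

With 735 samples, this yields 492 training and 243 validation street-blocks. Because the split is not stratified here, the class proportions in the two sets may differ slightly from those of the full sample.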
Slum maps were created using the same model architecture as in step 1. We combined the LC, GLCM features, NDVI and multispectral information for classification at the street-block scale. Although most end-users recommended the use of a grid, we used the street-block level as there was no consensus on the grid size. It was also easier to identify street-block(s) than grid cells during fieldwork because they follow the urban structure. It is important to acknowledge that users' needs can be contradictory (e.g., size of grids) and that they might be in opposition to feasibility (grid vs street-block). Furthermore, the street-block scale allows omitting kiosk slums, which face the highest eviction pressure.
Finally, we prepared an interactive interface showing how the final product can be disseminated, considering geo-ethics and end-users' expectations. An interactive map was created using the qgis2web plugin in QGIS (QGIS.org, 2019).

Results of data-driven slum mapping
The data-driven LU model achieved a high overall accuracy (OA) of 92% and an F1-score of 96% using an independent validation dataset at the street-block scale. The F1-score was used to assess the disparities between classes (Sokolova & Lapalme, 2009).
All classes obtained a producer accuracy (PA) and user accuracy (UA) above 80%, except for high-density residential, which scored below 75%. A predicted slum street-block is classified as uncertain when its ERP is less than 70%. A large portion (62.7%) of the predicted slum blocks was reclassified as uncertain based on the ERP value. A visual assessment of the classification indicates that most of the inner city was classified as slum or uncertain (Fig. 7). Inspection of the confusion matrix reveals confusion between slums and high-density residential (see Supplementary material Section 6). This confusion indicates the danger of relying only on OA, since this metric is very dependent on the validation data used.

Causes of misclassification
Three main causes of misclassification were identified from field observation. The first is the morphological similarity of typical slums (Fig. 8b) and "old towns" (Fig. 8a). Old towns are neighbourhoods that existed before settlement planning became part of the government system of Accra. They are usually fishing communities inhabited by low-income groups that have grown over time. They have a high building density and irregular settlement patterns similar to typical slums. The second is the presence of areas with a slum-like appearance due to unplanned and uncontrolled extension (Fig. 8c). These areas are predominantly inhabited by low- and middle-income groups with access to social services and infrastructure such as electricity and potable water. The last is the fact that some slum communities that have been upgraded (they have received infrastructure development without a spatial redesign of the neighbourhood) or regularised still look like slums on images (Fig. 8d). These areas do not follow strict planning standards; for example, alleys or small roads are created with below-standard widths.

Spatial information requirement
The interviews show that the top priority in terms of slum information requirements is planning and management. In general, experts pointed out insufficient information on temporal growth (slum dynamics), slum location and boundary, stage of slum development, slum population, level of deprivation, and building characteristics. However, we observed a diversity of information needs in Accra, as shown in Table 4. The mapping scale and level of detail required vary depending on the institution. This diversity is illustrated in Fig. 9 (more information on the level of aggregation can be found in Appendix Table A). The National Disaster, NGO, and District Planning experts required highly disaggregated and highly detailed information. They mentioned that highly disaggregated information at the street-block and grid scale is needed to support their activities. Also, highly detailed information (e.g., stage of slum development) is required to take the initiative to relocate, redesign or upgrade slums. Detailed information on the level of deprivation will also help in deciding where and when to intervene. However, experts from the Private Company and District Development Control required highly disaggregated information but little detail. They mentioned that highly disaggregated maps (e.g., 100 m grid size) could help identify kiosk slums (temporary slums) at an early stage of development to prevent their growth. They required fewer details as their main activity is to ensure development control and management. Therefore, identifying the location of new slum areas is essential for their activities.
There were inconsistencies concerning the features that should be included in the final map. The NGO and National Disaster experts indicated that they want two classes, namely slum and non-slum. Furthermore, the slum class should show the level of deprivation (good to worst slums). This information can support them in pro-poor initiatives. The other experts (District Planner, Development Control officer, Private Company and Researcher) opted for more LU classes such as planned residential, commercial, industrial, and open space. They mentioned that the LU information would help to effectively monitor land uses changing into slums and vice versa.

Geo-ethics in making slum information publicly available
We analysed the geo-ethics in making slum information publicly available to avoid unnecessary details and ensure adequate ethics for data sharing. The results from the interview showed no objection to making slum information publicly available by all parties. Several issues under the three ethical levels of analysis should be considered.

Technology level.
The technology level analysed ethical issues related to the machine learning classification, input features, and accuracy metrics. In general, most of the experts raised geo-ethical concerns on the machine learning algorithm, input features and accuracy metrics. Regarding machine learning, most interviewed experts attested that they have little knowledge of how the maps were prepared. They lack knowledge of artificial geo-intelligence, mainly machine learning classification, and cannot comprehend the mapping process. This knowledge gap hinders the use of EO products.
Concerning input features, experts criticised the use of only the physical morphology to map slums. They indicated that slums are characterised by both socio-cultural and physical constructs. Experts from the District Planning, District Development Control and Research Institution mentioned that housing development strategies (e.g., incremental housing development) contribute to uncontrolled extension (both settlement and building extension) in the city. This happens because most new extensions are developed without a settlement layout (Adarkwa, 2012). Furthermore, old towns have characteristics similar to typical slums. These local contextual mismatches introduce uncertainties in the training and classification, as seen in Fig. 8.
Pertaining to the model error, most interviewed experts were willing to use maps with an accuracy of over 70%. However, they cautioned that some errors are more costly than others and should be mitigated. For example, a wrong classification of planned residential areas as slums can lead to stigmatisation. When asked about potential solutions to the foreseen model error, they mentioned that map producers should clearly describe how the accuracy measures were computed. EO scientists should make the metrics available to support the interpretation of the map and should make clear why the model behaves the way it does. To overcome such expectation mismatches, experts recommended that map producers report on the uncertainties related to the final product and elaborate on the potential implications of using such information.
Fig. 7. Initial land-use map (the uncertain class comprises street-blocks with an equivalent reference probability of less than 70%). Reference data obtained from AMA show the actual location of slums.

Product level.
One aspect of geo-ethics is how information is represented and disseminated. Fig. 10 summarises the expected deliverables that should be provided when making slum information publicly available. The final map package should include metadata, guidelines on the use of EO information, a report on accuracy metrics, a report on the integration of local context, and an indication of areas with and without ground validation. This integration will increase the usability and acceptability of EO information.

Application level.
The last aspect of geo-ethics is about how the information is used to treat slum dwellers. In general, there was an application mismatch among institutions. While the NGO and Research Institution needed information to develop pro-poor initiatives, the District Planning and Management experts as well as the Private Company were more interested in using the information for eradicating slums and preventing their growth. Most experts from the government institutions mentioned that slums are urban planning and management challenges that need to be addressed. This mismatch shows that slum information can be used either to develop initiatives that support slum dwellers' wellbeing or as a weapon for eviction and stigmatisation. The NGO expert mentioned that "slum housing means more to the slum dweller than the stereotypical picture of deprivation and poverty." In other words, slum dwellers prioritise having a place of abode over the poor quality of their living environment. For this reason, slum dwellers would rather remain invisible than risk eviction and stigmatisation. Therefore, any effort to make slums visible should ensure adequate privacy. On how to ensure adequate privacy while producing satisfactory slum information for all users, the NGO expert hinted that matured slums are not threatened, or are less threatened, by eviction than kiosk and infant slums. Therefore, map-makers should map at an aggregated scale that omits high-risk slums (in this study, the kiosk slums), thus avoiding unnecessary details and having clear ethics for data sharing.

Prototype of a user-driven slum mapping
By integrating the insights and lessons from steps 2 and 3, we produced user-driven slum maps. While some expectations were met, others will require further resources and studies. Therefore, we built the model based on the available resources. As such, the following modifications were carried out:
1. Mapping sub-classes of slums (matured and infant slums);
2. Setting the minimum size of a street-block to 0.5 ha to omit kiosk slums, which are the group most vulnerable to eviction;
3. Showing uncertainty levels per slum street-block and areas with ground validation using an interactive web map.
The user-driven map achieved an overall accuracy of 84% and an F1score of 84% using an independent validation dataset at the street-block level. Except for the infant slum class, both PA and UA for the subclasses achieved a score of over 75% (Table 5). The infant slum class showed a strongly lower user accuracy of 25% and producer accuracy of 60%, partially caused by the imbalanced class distribution. The F1-score, which defines the harmonic mean of omission and commission error, is lower than 35%. The confusion matrix (Table 5) revealed misclassification between infant slums, matured slums and high-density residential. These subclasses have similar morphological characteristics. Since most experts were willing to use maps with an accuracy of over 70%, we computed uncertain slum street-block(s). Overall, 7.6% of the slum street-blocks falls under uncertain class when ERP is less than 70%.
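The accuracy measures reported above can be computed from a confusion matrix as in the following sketch. The two-class matrix is invented purely for illustration and is not Table 5; rows are assumed to hold the reference classes and columns the predicted classes, and the function names are hypothetical. The last line applies the F1 formula to the rounded UA and PA reported for the infant slum class.

```python
def overall_accuracy(matrix):
    """Share of correctly classified samples (diagonal sum over total)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def producer_accuracy(matrix, k):
    """Correct samples of class k divided by all reference samples of class k (row sum)."""
    return matrix[k][k] / sum(matrix[k])


def user_accuracy(matrix, k):
    """Correct samples of class k divided by all samples predicted as class k (column sum)."""
    return matrix[k][k] / sum(row[k] for row in matrix)


def f1_score(ua, pa):
    """Harmonic mean of user accuracy (precision) and producer accuracy (recall)."""
    return 2 * ua * pa / (ua + pa)


# Hypothetical confusion matrix: rows = reference, columns = predicted; classes = [slum, non-slum].
toy = [[40, 10],
       [5, 45]]

oa = overall_accuracy(toy)
pa_slum = producer_accuracy(toy, 0)
ua_slum = user_accuracy(toy, 0)
f1_slum = f1_score(ua_slum, pa_slum)

# F1 from the rounded UA (25%) and PA (60%) reported for the infant slum class.
f1_infant = f1_score(0.25, 0.60)
```

With the rounded inputs, f1_infant comes out at roughly 0.35; because the published UA and PA are rounded, a small difference from the figure reported in the text is to be expected.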
Using an interactive web map (available at https://slummap.net/index.php/geoknowledge/), we were able to present the final output in a way that better meets users' expectations. Uncertainty indicators and validated areas can be easily visualised interactively. By doing so, users can quickly assess the credibility of the maps (Fig. 11).

Discussion
EO studies typically map slums using a data-driven approach without understanding end-user requirements or asking critical questions about how the maps should be produced and shared (Thomson et al., 2020; Wang, Kuffer, Roy, & Pfeffer, 2019). This study introduced a user-driven approach that integrated local context-knowledge, end-user requirements, and geo-ethics to align EO-based methods with societal needs and concerns. The user-driven approach prevents unnecessary details and ensures ethical data sharing to support pro-poor programs and slum dwellers facing eviction. The proposed solution not only provides policy-relevant information but also improves usability and acceptability and promotes EO-based information.

Socio-cultural context
We first used a data-driven approach that gave us a preliminary map with which to investigate the causes of misclassification and prepare the expert interviews. From field observation, it was clear that socio-cultural factors can lead to misclassification when not considered. Unfortunately, such discussion is rarely found in slum mapping applications. Studies exclusively present the predictive capabilities of the proposed model and fail to discuss the socio-cultural limitations of slum mapping (Grippa et al., 2018; Ranguelova et al., 2019). However, this is an important issue in preparing training and validation datasets for the model.

End-user requirement and geo-ethics
By incorporating insights and lessons learned, we were able to produce geoinformation that meets users' expectations while ensuring geo-ethics. This indicates the importance of a bottom-up approach, as discussed by Lilford et al. (2019). From the interviews, we noticed that the slum information required by end-users varies in terms of thematic detail, aggregation scale, and final output, depending on the purpose of the institution. Understanding user needs will improve the use and acceptance of EO-based products. This study produced a prototype of a user-driven slum map that aimed to combine different user requirements rather than deliver tailor-made products for every user.
The EO community is aiming for a global slum repository (Thomson et al., 2020). Nevertheless, studies on geo-ethics are rare. This article reveals geo-ethical concerns that need to be addressed before slum information is made publicly available. We found a mismatch in the thematic detail and aggregation scale needed by end-users. The NGO cautioned that producing highly disaggregated maps means putting the most vulnerable people in danger. A similar concern was discussed by Kuffer et al. (2018). However, the District Development Control and Private Company experts require disaggregated data to prevent slum growth. This is a typical real-world challenge faced by most EO modellers. In this study, we aimed to produce data to support and improve slum dwellers' living conditions rather than contributing to their stigmatisation and risk of eviction. By restricting the minimum size of the street-blocks to 0.5 ha, we were able to omit kiosk slums, which are the most vulnerable to eviction (Fig. 12). Therefore, mapping at the street-block level proved suitable for addressing the geo-ethical concerns.
Furthermore, most experts criticised the sole use of morphological features to map slums as a simplistic approach. Similar to the discussion of Mahabir et al. (2016), stakeholders see slums as both physical and social constructs. Yet, we used only image features for this study because of the lack of social data at the urban region scale. Further research can explore crowdsourcing methods to obtain social data that can be integrated into the model (Abascal et al., 2021).
Table 5. Confusion matrix, producer accuracy (PA), user accuracy (UA) and F1-score of the land-use classification. HDR: high-density residential, LDR: low-density residential, NON-RST: non-residential, NON-BLT: non-built-up. Green colour: high accuracy, yellow: medium accuracy and red: low accuracy.

User-driven slum mapping outcomes
Similar to other studies (e.g., Engstrom et al., 2015; Sandborn & Engstrom, 2016), the user-driven result achieved a high overall accuracy of over 80%. Accuracy levels differed by slum stage, with matured slums having the highest classification accuracy and infant slums the lowest. The low score of infant slums may be related to the low number of samples available for the training and validation datasets (only 38 street-blocks) or to the use of GLCM features. The first explanation contrasts with reports that RF is robust to small sample sizes (Belgiu & Drȃguţ, 2016; Folleco, Khoshgoftaar, Van Hulse, & Bullard, 2008). In addition, GLCM features underperform in identifying less prominent patterns in the urban environment (Kit, Lüdeke, & Reckien, 2012). Further studies are recommended to better understand the unique characteristics of infant slums and to find out which image features are relevant for their extraction.
Aside from RF's high prediction accuracy (Belgiu & Drȃguţ, 2016), the study took advantage of the class membership probabilities to estimate spatial uncertainty per street-block. Most of the extensional uncertainties were found in heterogeneous street-blocks. Showing areas with a low level of certainty helps to improve the credibility of the map. It indicates that these areas require field investigation to verify and update the map. Taken together, this discussion shows the strength of integrating local context-knowledge, end-user requirements, and geo-ethics into EO-based slum mapping.
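Flagging uncertain street-blocks from class membership probabilities can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study. It takes the highest class membership probability of each block as its certainty and applies the 70% level mentioned by the experts as the cut-off. The exact uncertainty measure of the study is not defined in this section, and the block identifiers and probabilities below are invented.

```python
UNCERTAINTY_THRESHOLD = 0.70  # assumed cut-off, based on the 70% level in the text


def flag_uncertain(block_probs, threshold=UNCERTAINTY_THRESHOLD):
    """Assign each block its most probable class and an uncertainty flag.

    block_probs maps a block id to a dict of {class label: probability},
    e.g. the per-class vote shares of a random forest (assumed input).
    """
    result = {}
    for block_id, probs in block_probs.items():
        label, p_max = max(probs.items(), key=lambda kv: kv[1])
        result[block_id] = {
            "label": label,
            "probability": p_max,
            "uncertain": p_max < threshold,  # below the cut-off: needs field checking
        }
    return result


# Invented probabilities for two hypothetical street-blocks.
block_probs = {
    "B1": {"matured": 0.85, "infant": 0.05, "HDR": 0.10},
    "B2": {"matured": 0.45, "infant": 0.15, "HDR": 0.40},
}

flags = flag_uncertain(block_probs)
share_uncertain = sum(f["uncertain"] for f in flags.values()) / len(flags)
print(flags)
print(f"share of uncertain blocks: {share_uncertain:.1%}")
```

In this toy case, block B2 is split between matured slum and high-density residential, the kind of heterogeneous block where uncertainty concentrates, and it would be flagged for field verification.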

Limitation of the proposed model
The study did not capture the views of all stakeholders. To have a more holistic user-driven slum map, further research should investigate the data required by the health institutions, environmental institutions, and slum dwellers themselves. However, it is important to acknowledge that it is impossible to satisfy the needs/expectations of all stakeholders.
Moreover, we faced difficulties in using visual interpretation to distinguish different slum stages. We combined matured and consolidated slums since they have similar morphological characteristics and were difficult to distinguish visually. There was high confusion between matured slums and high-density residential areas. Comparatively, slums have low-height buildings and are usually located in high-risk areas, making them distinguishable from other residential land-use (Kohli et al., 2012). Further studies can include height information such as digital elevation models and other contextual information such as risk maps and socio-economic data, for example social media data (e.g., Taubenböck et al., 2018). With the increasing availability of Google Earth's street views, scene information can be added to improve the classification (Ibrahim, Titheridge, Cheng, & Haworth, 2019).

Conclusion
Developing policy-driven geoinformation is essential to support the SDG 11 targets. Our research has demonstrated the potential of a new workflow that integrates local context-knowledge, end-user requirements and geo-ethics to produce user-driven geoinformation on slums. Our proposed user-driven approach improves current EO-based methods by contextualising and conceptualising the mapping of slums as a sociotechnical classification task. Since potential users have varying geospatial information needs, this study presents geoinformation that suits almost all purposes while addressing critical geo-ethical concerns. The final map is detailed enough to support the activities of most institutions, while at the same time protecting slum areas with a high risk of eviction. Such image-derived spatial information on slums has the potential to provide relevant information for pro-poor programs as well as strategic planning and management in a complex city. Further research should investigate slum ethics in more depth by including further stakeholders, such as health, environment and housing institutions and slum dwellers themselves.

Funding
The research pertaining to these results received financial aid from the Belgian Federal Science Policy (BELSPO) according to the agreement of subsidy no. (SR/11/380) (SLUMAP: http://slumap.ulb.be/) and from NWO grant number VI.Veni.194.025.

Authors' contributions
Maxwell Owusu is the main author of the study; he wrote the manuscript, processed the data, analysed the results, and developed the online map for visualisation of the results. Monika Kuffer and Mariana Belgiu conceptualised the study and reviewed and edited the manuscript. Tais Grippa and Stefanos Georganos prepared the code for street-block extraction and classification. Moritz Lennert and Sabine Vanhuysse reviewed and edited the manuscript and helped to improve it. All authors have read and agreed to the published version of the manuscript.