A Proposed Methodology for Detecting the Urban Footprint in Egypt

Informal built-up areas that sprawl onto agricultural land are difficult to detect and codify, particularly as the population of the Arab Republic of Egypt grows. In 2017, a law was passed to remove building infractions in rural areas on the basis of aerial photographs taken by the military. Building rules were further regulated under the 2019 Construction Violations Reconciliation Law (CVRL), which mandates the demolition of buildings constructed without a permit, a stock put at up to 8.2 million units since 2007. Remote sensing, which relies on regularly updated satellite images, is a key tool for detecting built-up encroachments. The European Space Agency's Sentinel-1 Synthetic Aperture Radar (SAR) constellation is a key component of this research: it offers improved spatial resolution, is independent of atmospheric conditions, and acquires images day and night, which makes it useful for a wide range of land cover detection applications. This is an advantage over the optical sensors of the Sentinel-2 and Landsat missions, which rely on clear weather. The study evaluates radar satellites against optical satellites for detecting urban built-up areas and encroachments on agricultural land. It was conducted over an area of 64 square kilometres in Damietta City, East Delta, Egypt. Supervised classification was used to derive the overall accuracy of the radar images and compare it with that of the optical images, using error matrix tables produced with the Semi-Automatic Classification Plugin (SCP) in QGIS. Combining the Sentinel-1A and 1B images yielded a high land cover classification accuracy with the Maximum Likelihood (ML) algorithm compared to Sentinel-2 images, reaching an overall accuracy (OA) of up to 80%, while classification with the Random Forest (RF) algorithm reached an OA of 87%.
This paper recommends merging SAR Sentinel-1A and 1B images and monitoring urban expansion periodically with remote sensing techniques, to support sustainable decisions on future urban expansion master plans in the region and to detect informal built-up encroachments quickly.


Introduction
Earth observation and space technology are recognized by the United Nations as a means of achieving the Sustainable Development Goals (SDGs) (UN, 2020). However, accurate urban mapping remains a challenge, owing to the spatial, spectral, and temporal constraints of remote sensing systems, which often prevent a rapid response (C. Small, 2018). Recent studies have used the thermal infrared properties of urban areas and surface temperature, as well as night-time light satellite images, to classify the urban footprint (Bailang Yu, 2014). The most common approach to urban monitoring uses optical satellite sensors, such as those on the Sentinel-2 and Landsat-8 satellites (Martino Pesaresi et al., 2016). Other studies have combined images from synthetic aperture radar (SAR) satellites and optical satellites (interferometry approach). Castro Gómez (2017) combined Sentinel-1A (SAR) images with the Sentinel-2 (MSI) optical sensor and reported an accurate method for urban footprint classification and detection, with overall accuracies of up to 84%. Others have used interferometric Synthetic Aperture Radar (InSAR) to detect the urban footprint. InSAR is a radar technique that uses more than one SAR image to generate maps of the Earth's surface and to derive digital elevation from variations in the phase of the waves returning to the satellite (Tomás, 2014). The InSAR approach is frequently used to detect landslides, earthquakes, and vegetation crops, because the coherence of the interferometric pair is a valuable parameter for a range of thematic mapping applications. Nevertheless, it is not frequently used to detect the urban footprint.
The main advantages of radar sensing data are improved spatial resolution, high revisit frequency, and the capture of microwave characteristics of the urban environment (interferometric coherence, speckle divergence, and backscattering phenomena). These make radar data useful for a wide range of land cover detection applications and give them an edge over optical sensors, which depend on clear weather. The Sentinel-1A/1B SAR constellation acquires data in single or dual polarization with a revisit time of 6 days. The 6-day repeat cycle provides global coverage and enhanced performance for InSAR applications (Wegmüller U. et al., 2015). Sentinel-1 Level 1 data are distributed by the Copernicus Open Access Hub as two product types: Ground Range Detected (GRD) and Single Look Complex (SLC). Pre-processing GRD products is simpler and less time consuming than pre-processing SLC data (ESA, 2021). The GRD dataset comprises VV and VH polarization channels acquired during ascending and descending orbits over the Earth's surface. Semenzato et al. (2020) used InSAR to detect the urban footprint, reaching overall accuracies ranging from 85% to 90%. Their pre-processing workflow for the SLC S1A images comprised 9 steps in the ESA SNAP platform; unsupervised classification was applied, and accuracy was assessed with the Semi-Automatic Classification Plugin (SCP) in QGIS. Kumar (2021) also applied an InSAR approach in C-band, but with GRD datasets acquired from the Sentinel-1A/B sensors and Google Earth datasets for ground truth verification. That pre-processing workflow comprised 6 steps in the SNAP toolbox, and the Refined Lee filter was observed to perform well in providing detailed information about the various urban objects.
The experimental results of that study indicated that the proposed technique outperforms traditional methods in accuracy and precision. However, the study did not include an accuracy assessment step to evaluate overall accuracy. The method selected in this study explores the local area statistics (speckle divergence and the backscattering coefficient sigma0) extracted from pre-processed Sentinel-1 GRD images, with a focus on the capabilities of Sentinel-1B. The study aims to (a) explore the advantages of combining the Sentinel-1B sigma0 and its VV and VH polarization channels to enhance InSAR urban footprint detection, and (b) simulate and evaluate the overall accuracy of the built-up area classifications of the proposed methodology. The presented workflow offers a benchmark for developing the InSAR technique for urban area detection and generates reliable information of interest to a wide range of communities. The results of this study can be used to improve existing urban area detection methods based on Sentinel-1B data and to develop new ones.

Structure
To promote the use of satellite virtual constellations by means of data fusion techniques, a standard, generic pre-processing workflow for Copernicus Sentinel-1 GRD data is presented here. The workflow applies a series of standard corrections: it applies the precise orbit of acquisition, performs radiometric calibration, and applies terrain correction. It also allows GRD Sentinel-1A products to be stacked onto Sentinel-1B data grids. Figure 1 shows the flow chart of the methodology. The steps are as follows:
• Validation of the Sentinel-2 image in the region of interest.
• Pre-processing of the Sentinel-1A and 1B images and the combined image.
• Supervised classification: Maximum Likelihood (ML) algorithms of (LCS) with the SCP plugin in QGIS (Qgis, 2020), and Random Forest (RF) classification of the satellite images with SNAP TBx.
• Evaluation of the ML and RF classification results to identify the best overall accuracy.

Study Area
The study area extends over 64 km² in Damietta city in the north of Egypt, bordered by the Mediterranean Sea to the north (lon. 31°45'0.00" E, lat. 31°26'60.00" N) (see Fig. 2). The governorate has the highest standard of living per capita among the governorates of Egypt and no unemployment, owing to an economic pattern that is among the best in the country, characterized by diversity and built on the human element. This economic activity directly affects social characteristics and creates different urban environments. Damietta Port has 39 projects (14 private free zones and 25 public free zones), and the availability of free investment zones makes it a residential attraction. The main industrial activities in the complex are fertilizer manufacturing by Mopco and methanol production by Methanex (Methanex Corporation, 2015). In addition, it contains the largest port in the Middle East and has a population of over 1 million.

The Köppen-Geiger climate classification system classifies the climate as hot desert (BWh), but winds blowing from the Mediterranean Sea greatly moderate the temperatures, as is typical of Egypt's north coast. Summers are moderately hot and humid, while winters are mild and moderately wet, with sleet and hail also common. The average annual temperature is 21.1 °C (69.9 °F), and the annual rainfall is 145 mm (5.7 in) (Climate-data, 2021). The region shows extension and encroachment onto agricultural land through informal urbanization beside the industrial areas.

Table 1 lists the satellite products and the number of bands used to derive the classification evaluation results. All satellite images and additional data sets are referenced to the World Geodetic System with a Universal Transverse Mercator projection (WGS 84, UTM Zone 33N), EPSG 32632.
The area of interest is shown in Figure 1 as a boundary shapefile measuring 8 × 8 km (yellow square).

Validating S2 images
Sentinel-2 was selected to validate the satellite classifications. Its multispectral images contain 13 spectral bands and are acquired from an altitude of 786 kilometres (ESA, 2014). These capabilities make it possible to distinguish urban from other areas using colour composites of the red, green, and blue bands (4, 3, 2). In this way, ground points already known in the region can be located. 50 points for each land cover class (urban and non-urban) are placed on the visual image to assess the classification.

Pre-processing S1A, S1A/1B
The Sentinel-1 pre-processing workflow in SNAP TBx is illustrated in Fig. 4 and summarized in the following steps:
• Orbit update: pre-processing of the Sentinel-1A and 1B images begins with updating the orbit metadata to provide accurate information about the position and velocity of the satellites. Sentinel-1 Ground Range Detected (GRD) products are geocoded, which ensures that the image is placed at the correct location on the Earth's surface and is displayed as expected when viewed outside a spatially enabled framework.
• Radiometric calibration: the values of the acquired images are transformed to represent the reflectivity of the ground surface, so that sigma0 can be calculated for both the amplitude and intensity polarization channels.
• Terrain correction: the images are flipped automatically and the DEM files are added to the stack composition.
• Speckle filtering: the Lee sigma algorithm reduces the variance of the complex scatterings.
• SAR image texture analysis (GLCM): urban extraction is enhanced by displaying the backscattering coefficient (sigma0) for the VH and VV polarizations, which reveals the pattern of differences.
The intensity of the Sentinel-1A image for the VV and VH polarization bands is shown in Fig. 5 (a, b).
The maximum intensity of the Sentinel-1A VV polarization reaches 14252.190 W/m², while the maximum for the VH channel reaches 1297.811 W/m². High backscattered intensities over elevated topography and the built-up environment in the region of interest appear as white-toned pixels in the images. The Sentinel-1B image was radiometrically calibrated and terrain corrected to project it onto the study area. The intensity values of its two channels are high, especially in the VH band (shown in the histogram of the polarization channels; see Fig. 5 c, d).

Figure 5: Sentinel-1 pre-processing workflow and channel intensities: (a) S1A (VV), (b) S1A (VH), (c) S1B (VV channel), (d) S1B (VH channel). Source: researcher working on SNAP TBx (ESA, 2021). The ROI is marked by the yellow rectangle in S1A and by the blue rectangle in S1B; the yellow arrow indicates the direction of sensing.
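Calibrated sigma0 values are often easier to compare on a decibel scale. The following is a minimal Python sketch of that conversion, not part of the SNAP workflow used in the study. It assumes the sigma0 values are on a linear power scale; the function name `sigma0_to_db` and the `floor` parameter are illustrative, and the sample values are the sigma0 figures quoted in the Discussion.

```python
import math


def sigma0_to_db(sigma0_linear, floor=1e-10):
    """Convert a linear-scale backscatter coefficient to decibels.

    Assumption: the input is linear power (not amplitude, not already dB).
    """
    # Clamp to a small positive floor so log10 never receives zero or a
    # negative value (for example from no-data pixels).
    return 10.0 * math.log10(max(sigma0_linear, floor))


# Sigma0 values quoted in the Discussion, assumed here to be linear.
examples = {"S1A VH": 0.0005, "S1B VH": 0.059, "VV maximum": 0.4}

for name, value in examples.items():
    print(f"{name}: {value} -> {sigma0_to_db(value):.1f} dB")
```

On the decibel scale, the gap between the S1A and S1B VH values becomes a difference of roughly 20 dB, which is easier to read than a ratio of small linear numbers.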

Classification scenarios
Two types of supervised classification were used to classify the S1A, S1B, and S2 images, with the same training shapefile (Figure 6) and two classes (urban and non-urban). The first is Maximum Likelihood (ML) classification, applied with the Semi-Automatic Classification Plugin (SCP) in QGIS. The second is Random Forest (RF) supervised classification, applied to the satellite imagery with SNAP TBx to allow further spatial analysis.

Figure 6: Training ROIs of the satellite images. Source: researcher working on QGIS, 2021.
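To illustrate the idea behind Maximum Likelihood classification, the sketch below fits one Gaussian per class to training pixels and assigns a new pixel to the class with the highest likelihood. It is a simplified Python illustration, not the SCP implementation: it assumes two features per pixel (VV and VH backscatter in dB), equal prior probabilities, and invented training values. The function names are hypothetical.

```python
import math


def fit_class(samples):
    """Estimate the mean vector and 2x2 covariance from (vv, vh) pixels."""
    n = len(samples)
    m0 = sum(s[0] for s in samples) / n
    m1 = sum(s[1] for s in samples) / n
    c00 = sum((s[0] - m0) ** 2 for s in samples) / (n - 1)
    c11 = sum((s[1] - m1) ** 2 for s in samples) / (n - 1)
    c01 = sum((s[0] - m0) * (s[1] - m1) for s in samples) / (n - 1)
    return (m0, m1), (c00, c01, c11)


def log_likelihood(pixel, mean, cov):
    """Gaussian log-likelihood, up to a constant shared by all classes."""
    c00, c01, c11 = cov
    det = c00 * c11 - c01 * c01
    d0, d1 = pixel[0] - mean[0], pixel[1] - mean[1]
    # Mahalanobis distance via the closed-form inverse of a 2x2 matrix.
    maha = (c11 * d0 * d0 - 2.0 * c01 * d0 * d1 + c00 * d1 * d1) / det
    return -0.5 * math.log(det) - 0.5 * maha


def classify(pixel, models):
    """Return the class whose Gaussian gives the pixel the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(pixel, *models[name]))


# Hypothetical training pixels (VV dB, VH dB); not data from the study.
training = {
    "urban": [(-4.0, -11.0), (-5.5, -12.5), (-3.0, -10.0),
              (-6.0, -13.5), (-4.5, -12.0)],
    "non-urban": [(-12.0, -20.0), (-14.5, -22.0), (-11.0, -19.5),
                  (-13.0, -23.0), (-15.0, -21.0)],
}
models = {name: fit_class(pixels) for name, pixels in training.items()}

print(classify((-5.0, -12.0), models))
print(classify((-13.5, -21.5), models))
```

Because each class carries its own covariance, the decision depends on how spread out the training pixels of a class are as well as on the distance to the class mean. This is one reason the same training shapefile can give different results under ML and RF.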

Accuracy assessment
The automated error matrix (also known as the confusion matrix or contingency table) was used to evaluate the results of the satellite image classification. This approach is one of the most exact ways of deriving several statistical measures: the overall accuracy, the commission and omission errors, and the Kappa coefficient (K) of the classification (G. Congalton, 1991). Accuracy assessment measures consistency with the information documented in a spatial data set and is used to assess the quality of maps developed from remotely sensed data (Hashemian et al., 2001). It relies on stratified random sampling, which divides the area of interest into smaller areas according to map category: a number of land cover categories are selected and 50 points are randomly assigned to each category (Cochran, 1977).

The Sentinel-1 pre-processing added more accurate information to the urban footprint detection process. The workflow started with the orbit file operator and the calibration process, and sigma0 was calculated for each polarization channel (VV, VH) (see Fig. 7). The S1A image shows higher backscattering values in the VV channel than in the VH channel (high intensity values are represented by white pixels in Fig. 7a). Visual inspection showed high sigma0 intensities in the pre-processed S1B image (Fig. 7 c, d). Built-up areas in the Sentinel-1B image can be distinguished easily in the VH polarization, where the difference between the high brightness of built-up areas and the low brightness of vegetation and water is pronounced (Fig. 7d).
Figure 7: Processed Sentinel-1A and 1B products (sigma0 values): (a) S1A VV polarization, (b) S1A VH polarization, (c) S1B VV polarization channel, (d) S1B VH polarization channel. Source: researcher processing Sentinel-1A on SNAP TBx, 2021.
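The measures named in this section (overall accuracy, commission and omission errors, and the Kappa coefficient) can all be derived from the error matrix. The Python sketch below shows one way to compute them for a two-class case. The counts are hypothetical and only mirror the 50-points-per-class design; they are not results of this study, and the function name `accuracy_metrics` is illustrative rather than part of SCP.

```python
def accuracy_metrics(matrix, classes):
    """Compute accuracy measures from an error matrix.

    matrix[i][j] = number of points classified as class i whose
    reference (ground truth) class is j. Every class is assumed to
    have at least one classified and one reference point.
    """
    n = len(classes)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(n))
    row_totals = [sum(matrix[i]) for i in range(n)]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    overall = diagonal / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(n)) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)

    # Commission: points wrongly included in a class (row-wise error).
    commission = {classes[i]: 1.0 - matrix[i][i] / row_totals[i] for i in range(n)}
    # Omission: reference points missed by a class (column-wise error).
    omission = {classes[j]: 1.0 - matrix[j][j] / col_totals[j] for j in range(n)}

    return {
        "overall_accuracy": overall,
        "kappa": kappa,
        "commission_error": commission,
        "omission_error": omission,
    }


# Hypothetical counts: 50 reference points per class.
classes = ["urban", "non-urban"]
error_matrix = [
    [40, 6],   # classified urban: 40 truly urban, 6 truly non-urban
    [10, 44],  # classified non-urban: 10 truly urban, 44 truly non-urban
]

metrics = accuracy_metrics(error_matrix, classes)
for key, value in metrics.items():
    print(key, value)
```

Reporting Kappa alongside overall accuracy helps when class sizes are unbalanced, since Kappa discounts the agreement that would occur by chance.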

Pre-processing results
The speckle divergence band (Fig. 8a) and the GLCM variance of the processed image (Fig. 8b) were generated by stacking both the amplitude and intensity channels of the S1A and 1B satellite images. The process maximized the intensities of the backscattering channel reflected from the built-up regions (white pixels in Fig. 8a). In the GLCM output, small differences in grey-level pixels distinguished the urban areas and the small regions of urban settlement.
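Local-area speckle statistics of this kind build on a measure of local heterogeneity. A common choice is the coefficient of variation (standard deviation divided by mean) in a moving window, sketched below in Python. This is an illustration of the underlying idea only, and the exact speckle divergence operator used in SNAP is not reproduced here. The window radius, the function name `local_cov`, and the small intensity grid are assumptions made for the example.

```python
import math


def local_cov(image, row, col, radius=1):
    """Coefficient of variation (std / mean) in a square window around a pixel.

    The window is clipped at the image border.
    """
    values = [
        image[r][c]
        for r in range(max(0, row - radius), min(len(image), row + radius + 1))
        for c in range(max(0, col - radius), min(len(image[0]), col + radius + 1))
    ]
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance) / mean


# Hypothetical intensity grid: a homogeneous field on the left and a
# heterogeneous, built-up-like pattern of strong and weak returns on the right.
image = [
    [0.05, 0.06, 0.05, 0.90, 0.10],
    [0.06, 0.05, 0.06, 0.15, 0.85],
    [0.05, 0.05, 0.05, 0.80, 0.20],
    [0.06, 0.05, 0.06, 0.10, 0.95],
    [0.05, 0.06, 0.05, 0.90, 0.15],
]

field = local_cov(image, 2, 1)
built_up = local_cov(image, 2, 3)
print(f"field: {field:.2f}, built-up: {built_up:.2f}")
```

Homogeneous surfaces such as fields and water give low values, while the mix of strong and weak scatterers in built-up areas gives high values, which is why such a band highlights urban structures.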

Classification results
In the supervised classification of the Sentinel-2 image (Fig. 9), the Random Forest classification yielded a higher count of urban pixels, because the more sophisticated algorithm captures small details such as building shadows and scattered pixels around water surfaces (Fig. 9b). Figure 10 shows the results of the classification scenarios for S1A and for the S1A/1B combination: the images were classified with Maximum Likelihood (ML) in the SCP plugin for QGIS (Fig. 10 a, c) and then with the Random Forest (RF) algorithm in SNAP TBx (Fig. 10 b, d). Table 2 shows the pixel counts of each class (urban, non-urban) in the satellite images.

Figure 10: Sentinel-1A supervised classification: (a) S1A (ML) classification, (b) S1A (RF) classification, (c) S1A/1B (ML) classification, (d) S1A/1B (RF) classification. Source: researcher working on QGIS, 2021.

Discussion
Pre-processing the Sentinel-1 images showed that the VV polarization band of S1A yields higher intensities and amplitudes than its VH polarization. The VH band of S1B has much higher intensities than the VH band of S1A: the VH polarization sigma0 of S1B reached 0.059, compared with 0.0005 for the S1A VH channel. For both intensity and amplitude, S1A and S1B produced an equal maximum sigma0 value of 0.4 for the VV bands. Combining the VH bands of the S1A and 1B satellites makes it possible to extract more information about built-up regions and urban footprints (the stack comprises 18 bands). The speckle characteristics of the stacked image provide a texture layer that highlights and distinguishes built-up surfaces from vegetation and water surfaces. These variations in the reflectivity of structures lead to a local heterogeneity that gives urban areas a very specific and distinct appearance, represented by white pixels. In addition, the Gray-Level Co-Occurrence Matrix (GLCM) texture analysis of the Sentinel-1A and 1B images produced considerably better homogeneity values, which complements the speckle divergence step in extracting more information about small urban clusters and settlement areas of differing scales, e.g. small villages as well as cities.
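To make the GLCM texture measures concrete, the Python sketch below builds a symmetric, normalized co-occurrence matrix for a small quantized patch and derives its variance and homogeneity. It is an illustration under stated assumptions, not the SNAP GLCM operator: the patches are invented, the image is assumed to be already quantized to four grey levels, and only the horizontal neighbour offset is used.

```python
def glcm(image, levels, dr=0, dc=1):
    """Symmetric, normalized grey-level co-occurrence matrix for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count both directions for symmetry
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_variance(p):
    """Spread of grey levels around the GLCM mean."""
    levels = len(p)
    mean = sum(i * p[i][j] for i in range(levels) for j in range(levels))
    return sum((i - mean) ** 2 * p[i][j]
               for i in range(levels) for j in range(levels))


def glcm_homogeneity(p):
    """Inverse difference moment: high when neighbours have similar levels."""
    levels = len(p)
    return sum(p[i][j] / (1 + (i - j) ** 2)
               for i in range(levels) for j in range(levels))


# Hypothetical patches quantized to grey levels 0..3.
smooth = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # e.g. water or a field
rough = [[0, 3, 0], [3, 0, 3], [0, 3, 0]]    # e.g. dense built-up pattern

for name, patch in (("smooth", smooth), ("rough", rough)):
    p = glcm(patch, levels=4)
    print(name, glcm_variance(p), glcm_homogeneity(p))
```

The smooth patch has zero variance and maximal homogeneity, while the alternating patch shows the opposite. This contrast is what allows texture bands to separate small settlements from their surroundings.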
As for the overall accuracies of the SAR image classifications, Sentinel-1A alone reached an OA of 0.7666 when classified with the RF algorithm, but it performed poorly in detecting urban areas with the ML algorithm (OA 0.599). Stacking the S1A and S1B images produced a high overall accuracy (OA 0.803) when classified with the ML algorithm (derived from the error pixel counts in the SCP plugin for QGIS), and reached an OA of 0.869 when compared against the Sentinel-2 points assigned by the stratified random sampling model. With the RF algorithm, the combination produced a low accuracy (OA 0.566). The classification results are shown in Table 8. These results indicate that the hybrid methodology used in this study has considerable potential for detecting the built-up environment with Maximum Likelihood (ML) classification algorithms.

Conclusion
SAR satellites make it possible to explore urban land without relying on weather conditions or adequate lighting. In addition, the classification results showed that high levels of accuracy in identifying urban areas can be achieved by combining images from the Sentinel satellites. However, sampling in SAR data needs to cover larger areas to mitigate speckle and to better establish a local 'normal' backscatter value. The combination technique illustrated in this paper, which uses the SAR Sentinel-1A and 1B images, detected built-up areas with overall accuracies (OA) ranging from 80% to 86%, which suggests that it is a reliable method for future studies of the built-up environment.

Recommendations
The results of the study indicate that making full use of spatial information is important for improving land cover classification. Given that urban footprint maps are characterized by large uncertainties, remote sensing (RS) satellite images offer a much more cost-effective way to spot informal residential complexes that need to be relocated. In addition, incorporating suitable textural and spectral images and using segmentation-based classification methods can significantly improve land cover classification compared with traditional per-pixel, spectral-based classification methods. We also recommend that the methodology followed here guide further study and be linked to the strategic plan for limiting informal housing expansion in the Arab Republic of Egypt (ARE).