A fusion approach for evaluating ground conditions for seismic microzoning at the Egyptian Solar Park in Benban, Aswan

Seismic microzoning is the process of mapping and understanding how earthquake ground motion varies across a given location. Accurate seismic microzoning is vital for the development and safety of buildings and infrastructure in earthquake-prone locations. In this work, we present the application of microtremors, multichannel analysis of surface waves (MASW), and machine learning approaches for seismic microzoning at Benban Solar Park in Aswan, Egypt. The findings indicate that the ground at Benban Solar Park is generally stiff, with certain regions having greater stiffness and damping than others. The data also reveal variations in ground conditions across the solar park, with certain regions facing a greater risk of earthquake ground motion. Overall, the combination of microtremors, MASW, and machine learning proved to be an effective strategy for mapping the ground conditions at Benban Solar Park accurately and efficiently. The results can guide the design and construction of future solar power plants at the park and support the assessment of the safety and structural integrity of the existing ones. The application of these techniques also promotes sustainable development by providing valuable information for future plants, in line with the principles of environmentally conscious and responsible development.


Introduction
Egypt has a large population, most of which is concentrated in the Nile Valley and Delta. The expanding population and the desire to improve living standards are driving the creation of new metropolitan areas, special economic zones, infrastructure, and other development initiatives. The only sites feasible for such projects are the desert zones on both sides of the Nile (Soltani 2022). Although the scale and rate of development vary among Egyptian cities, the common challenge is satisfying a rising demand for safe energy resources. The impending energy shortage, along with the risk of a changing climate, necessitates innovation in the energy sector. To secure a prosperous, healthy, and ecologically sound future, there is a growing demand for a new industrialization powered by inexpensive, accessible solar and wind energy (Abouaiana and Battisti 2022). Egypt's geographic location places it at the centre of the global solar belt, making it one of the world's richest countries in terms of solar energy resources. Southern Egypt is one of the ideal regions for building future concentrated solar power plants owing to its high solar radiation. The Governorate of Aswan, in the southern region of Egypt, holds significant potential for future renewable energy projects. Aswan is distinct from the other Egyptian provinces in a variety of socio-economic, anthropological, historic, and cultural respects. One of the biggest solar energy stations in the world is being constructed near the village of Benban; it is expected to produce the equivalent of 90% of the electricity generated by the High Dam. This is within the framework of the strategy developed by the New and Renewable Energy Authority, which aims for 20% of the electricity produced in Egypt to come from clean energy sources (Ibrahim et al. 2021; Hochberg 2021).
Benban Solar Park (BSP) is a photovoltaic power station with a total capacity of 1650 megawatts. It is situated in the Western Desert, about 40 km northwest of Aswan, and is currently one of the largest solar power plants in the world. The BSP covers an area of 37.2 km² subdivided into 41 separate plots arranged in four rows, with each plot ranging in size from 0.3 to 1.0 km². The 41 plants are linked to the high-voltage network via four new substations built on-site by the Egyptian Electricity Transmission Company. The park has attracted significant global attention because of its scale and its location in an area with high solar radiation. However, Aswan is situated in a seismically active zone with a history of moderate to large earthquakes. It is therefore vital to examine and mitigate seismic hazards at the BSP through a seismic microzonation study that guarantees the safety of buildings and infrastructure and the dependability of the system. Seismic microzonation is the partitioning of a territory into zones according to seismic hazard level. It comprises the measurement of site-specific parameters, such as soil type, thickness, and stiffness, and the assessment of the ground shaking and damage that might occur during an earthquake. Seismic microzonation is particularly vital for big infrastructure projects such as the BSP, where the safety of the buildings and equipment is critical. By identifying regions with increased seismic risk, suitable design and construction procedures can be adopted to reduce the likelihood of damage and ensure the safety of the infrastructure (Pitilakis 2004). Site effect analysis is an indispensable part of any microzonation study for the proper planning of urban areas. It relates to the surface geology and geotechnical properties of soil deposits, which have a significant impact on seismic ground motion (Bonnefoy-Claudet et al. 2006).
Generally, site effects include the modification of the characteristics that are controlled by anomalies in the mechanical properties of the shallowest layers of subsoil, when they consist of soft sediments, or by the shape of surfaces of discontinuity close to or coincident with the topographic surface (Nakamura 1989;Aki 1993;Bard 1999). Depending on the site response factors, such as the fundamental frequency, spectral amplification factor, shear-wave velocity, and soft soil thickness, a site may experience different levels of ground shaking during a seismic event (Lebrun et al. 2002;Pitilakis 2004).
Ambient noise measurements can be used in seismic microzonation studies to estimate the site-specific seismic response of an area. The results help to characterise the local soil conditions and site geology, which are important factors that affect the way seismic waves propagate through an area (Gosar 2017). Additionally, ambient noise measurements can be used to identify areas that may be more vulnerable to earthquakes, which helps to prioritise resources and efforts to mitigate potential damage (Cristina García-Nieto et al. 2021). Ambient noise is generated by a variety of sources, including traffic, construction, industrial activities, and natural sources such as wind and water. Multichannel Analysis of Surface Waves (MASW) and the Horizontal-to-Vertical Spectral Ratio (HVSR) are two methods that can be used to infer the soil and rock properties that affect ground motion during an earthquake. In the MASW method, surface waves are generated by impulsive or resonant sources, and the waveforms are recorded at multiple locations using geophones. The dispersion characteristics of the surface waves are analysed to infer the shear-wave velocity profile of the soil and rock layers. The HVSR method involves recording the response of the ground to ambient noise at multiple locations and calculating the horizontal-to-vertical spectral ratio of the ground motion. It can be used to infer the fundamental frequency of a site and to estimate the spectral amplification of the soil and rock layers at that frequency. Together, the two methods provide valuable information about the subsurface and can be used to identify areas that may be more vulnerable to earthquake damage.
In the recent decade, these approaches have been extensively applied in numerous applications, including seismic microzonation (Moustafa 2015; Gosar 2017), foundation depth mapping (Ibs-von Seht and Wohlenberg 1999), soil liquefaction probability (Bazzurro and Cornell 2004) and shallow subsurface shear-wave velocity structure characterization (Anbazhagan et al. 2016). The research area's sediments range from soft to stiff and overlie extremely hard bedrock (Issawi and Hinnawi 1980), resulting in a significant impedance contrast and amplification of seismic waves (Mohamed et al. 2022). Consequently, in the present work, the HVSR and MASW methods are applied to understand and characterise the subsurface conditions. Moreover, the HVSR data are used to estimate the seismic vulnerability index (Nakamura 1997). This is a critical component of geotechnical design and planning, especially in infrastructure development and land-use planning.
After estimating the various characteristics, the next step is to undertake soil classification. Soil classification systems are used to describe and classify soils in order to better understand their properties and behavior, and to predict how they will respond to various types of loading and environmental conditions. Soils are classified by geotechnical engineers based on their engineering properties as they apply to their use as foundation support or building material. The category is applied in the design of structural-related projects, such as bridges, retaining walls, and buildings (Panah et al. 2002;Anbazhagan et al. 2019). Usually, soil categorization is subjective and prone to misconceptions. To achieve impartial categorization, unsupervised machine learning methods are used to detect patterns in data. Unsupervised machine learning methods can be used to classify or categorise soils based on their physical and chemical properties. These methods can be used to classify soils that have never been labelled or classified before because they do not require labelled training data. One way to use unsupervised machine learning methods for soil classification is to use clustering algorithms, which can group soils into different clusters based on their similarity. K-means clustering is a commonly used clustering algorithm that can group soils into a certain number of clusters based on their similarity in terms of certain soil properties. The classification results of unsupervised machine learning classifiers are found to be equivalent to those of supervised machine learning classifiers (Jain and Dubes 1988;Moustafa et al. 2021).
This study proposes the use of clustering analysis of MASW and HVSR data to investigate the sites having similar responses based on the important parameters including shear-wave velocity, fundamental frequency, spectral amplification factor, and seismic vulnerability index. The improved version of K-means clustering technique (Shukla and Naganna 2014) known as Weighted K-means Clustering ( WKMC ) (Kerdprasop et al. 2005) was applied to the dataset. The quality of cluster solutions was evaluated using cluster validity indices. The results are used to prepare a preliminary individual seismic site microzonation map of the BSP site. They were generated to provide actionable information for further analysis and study the heterogeneity in terms of site response. The results of the current study can aid in the development of seismic hazard maps and building codes based on the site response parameter in order to reduce the impact of future disasters caused by major events. The current study could be useful for researchers integrating seismic hazard assessment studies with machine learning.
The rest of this work is organised and structured as follows: "Features of study area" provides a summary of the main characteristics of the mapped region in terms of location, geology, and seismicity. "Site characterization methodology" provides the local site effect assessment using MASW and HVSR approaches to achieve our stated aim, as well as the used classification technique known as weighted K-means clustering (WKMC). A theoretical framework for the applied strategies is given. In "Experimental data collection and analysis" data gathering and exploratory data analysis are described. This section explains the model's formulation as well as the assessment techniques. "Results and discussions" summarises the results and compares our findings to those of other comparable studies. In that part, evaluations of the performance of the implemented models are also provided. Conclusions are formed in "Conclusions".

Features of study area
The BSP area is characterised by an arid environment with desert-like characteristics. Despite the fact that precipitation is rarely significant throughout the year, certain unusual and irregular storms occur. This basin absorbed a substantial quantity of surface runoff during the rainy season from the eastern Red Sea Mountains range and western calcareous plateau as well. Topographically, it is generally flat, and covered by alluvium (sand and gravel) carried by an ancient E-W striking stream channel (wadi), which is older in age than the contemporary River Nile (Yousif 2019). These sediments are recognised as Protonile deposits (El Ramly and Hermina 1978) on the geological map of the research region ( Fig. 1).
The western vestiges of this historic wadi system are known as Wadi El-Kubanyia. The investigated site lies in the western extension of the Kom Umbo sub-basin and has most probably been affected by the geo-structural setting that shaped the basin (Gaber et al. 2011). The Kom Umbo sub-basin has a separate late Paleozoic, Mesozoic-Tertiary lithostratigraphic sequence. Numerous lithological units that vary in age from the Cretaceous to the Quaternary are exposed at the surface (El Ramly and Hermina 1978). These sediments came from several distinct sources and directions. Sand, mixed gravel and sand, and gravel deposits of fluvial and aeolian origin are the most common top strata. The geological units in the area under investigation (Gaber et al. 2015; Yousif 2019) are summarized in Fig. 1. Structurally, the region is affected mostly by faults of various scales. The mapped faults are organised into four distinct groups, listed in decreasing order of frequency: ENE-WSW to E-W, NNW-SSE to N-S, NE-SW, and NW-SE (Gaber et al. 2011). Recent studies highlighted the possibility that the well-known Spillway Fault Zone extends to the BSP. The Spillway Fault Zone is the region's longest fault, at about 60 km, and stretches from Nasser Lake to the BSP in the NNW direction (Conco 1987; Mohamed et al. 2022). The research location has a variety of geomorphological features, including hills, wadis, streams, sand dunes, the Nile Valley, and the Nubian Plain. The BSP is located between the Nubian Plain in the east and Gebel El-Barqa in the west. The area was thought to be aseismic until the Aswan earthquake (Mb 5.3), which struck on November 14, 1981 (Kebeasy et al. 1987). That event was the largest yet recorded in the Aswan region. A 4.0-magnitude earthquake struck the west Kom Ombo region on March 22, 2003.
Many aftershocks, ranging in magnitude from 2.7 to 3.0, were felt throughout the region and its surroundings. The main shock's hypocentral location and its aftershocks were ascertained close to the northern end of the Western Desert's Gebel el-Barqa fault, located 40 km west of the River Nile (Consultants 1985;Fat-Helbary and Mohamed 2004).

Site characterization methodology
The method used in the present study is the integration of ambient noise and unsupervised machine learning to develop local microzonation maps for BSP . Microzonation involves subdividing an area into smaller zones based on factors such as geotechnical and geomorphological properties, which can impact the behaviour of the ground and structures within the area.
The main goal of the current research is to create various microzonation maps for the BSP region. Creating microzonation maps for Benban could provide a number of benefits. Improved risk assessment is one potential benefit, as a microzonation map can help identify areas that are more prone to geohazards such as earthquakes, allowing for better risk assessment and the development of risk reduction strategies. Enhanced structural design is another potential benefit, as a microzonation map can provide important information for the design and construction of structures, such as buildings, allowing for more accurate and cost-effective designs that are better suited to the local ground conditions at BSP. Enhanced infrastructure planning is another potential benefit, as a microzonation map for BSP can help identify areas that are suitable for the construction of infrastructure such as pipelines or roads.

Fig. 1 The mapped area's general geological settings. Adopted from El Ramly and Hermina (1978); Conco (1987)
The process of creating various microzonation maps for the BSP region includes conducting a field geological survey, collecting remote-sensing data, and collecting geophysical data measurements. The data are then classified using the weighted K-means algorithm and visualised using Geographic Information System (GIS) contouring tools. Contour maps can be an effective way to communicate the results of the weighted K-means analysis and to support decision-making related to the design and construction of structures or the mitigation of geohazards. The resulting map and associated data can be saved in a shareable format and shared with relevant parties as needed. The following algorithm outlines the steps involved in producing individual microzonation maps for the BSP area.
To better understand the seismotectonic processes that can affect the BSP region, surface lineaments and geological faults are identified as the initial phase of the proposed integrated procedure. Lineaments are linear topographic features that can be seen on satellite images or on topographic maps. They may be the surface expression of faults, fractures, or other geologic structures formed by tectonic activity. Lineaments can therefore be used to pinpoint the positions and orientations of active or possibly active faults when creating a seismotectonic map for the research area. This information supports the understanding of the region's seismotectonic environment and the estimation of the likelihood of future earthquakes. Lineaments can also be used to infer the tectonic history of the area and to understand how the region has been affected by past earthquakes. By analysing the distribution and orientation of lineaments, geologists and seismologists can gain insights into the tectonic forces that have shaped the region over time and how those forces might influence future seismic activity (Moustafa et al. 2022). Several theories have been proposed to explain the formation and evolution of lineaments. One suggests that lineaments form where different rock layers or structures, such as faults or folds, intersect. According to this theory, the intersecting layers or structures have different physical properties, such as density or strength, and their intersection creates a lineament. For these reasons, lineaments are extracted from satellite images in this study to improve the interpretation of the area's microzonation maps (Koçal et al. 2004; Moustafa et al. 2022).
The first geophysical approach for site-specific characterization is Multichannel Analysis of Surface Waves (MASW), a widely used technique for subsurface exploration that is known for its speed and reliability. It measures the mechanical properties of subsurface materials and is based on the principle that surface waves, which travel along the surface of the earth, are sensitive to the mechanical properties of the materials through which they pass (Dal Moro et al. 2015). By analysing the characteristics of the surface waves, it is possible to determine the stiffness, density, and other properties of the subsurface materials (Park et al. 2007). MASW is commonly used in a variety of applications, including site characterization for civil engineering projects, environmental assessments, and oil and gas exploration. It is a useful tool for understanding the subsurface conditions at a given location, as it allows for non-invasive investigation of the subsurface materials (Mohamed et al. 2022). The output of a MASW survey typically consists of a dispersion curve, which is a plot of phase velocity versus frequency for the recorded surface waves. Using an inversion method (Park et al. 2007), it is possible to determine from this dispersion curve the soil's shear-wave velocity profile as a function of depth. The time-averaged shear-wave velocity in the top 30 m of the soil, denoted $V_{s30}$, is estimated using the equation:

$$V_{s30} = \frac{30}{\sum_{i=1}^{N} \dfrac{h_i}{v_i}} \qquad (1)$$

where $h_i$ and $v_i$ refer to the thickness (in meters) and the shear-wave velocity (at a shear strain level of $10^{-5}$ or less) of the $i$th layer, out of a total of $N$ layers in the top 30 m. The 1997 Uniform Building Code adopted $V_{s30}$ for site classification (Dobry et al. 1998), as did the provisions of Eurocode 8 (Code 2005).
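The time-averaged velocity described above can be sketched in a few lines of Python. This is a minimal illustration, not the processing code used in the study: the function name `vs30`, the clipping of the layer that crosses the 30 m depth, and the three-layer profile are all assumptions made for the example.

```python
def vs30(thicknesses_m, velocities_ms, depth_m=30.0):
    """Time-averaged shear-wave velocity over the top `depth_m` metres.

    Implements depth / sum(h_i / v_i), using only the part of each layer
    that lies above `depth_m`.
    """
    if len(thicknesses_m) != len(velocities_ms):
        raise ValueError("one velocity per layer is required")
    remaining = depth_m
    travel_time = 0.0
    for h, v in zip(thicknesses_m, velocities_ms):
        if remaining <= 0.0:
            break
        used = min(h, remaining)  # clip the layer that crosses the target depth
        travel_time += used / v   # vertical shear-wave travel time in this layer
        remaining -= used
    if remaining > 1e-9:
        raise ValueError("profile is shallower than the averaging depth")
    return depth_m / travel_time


# Hypothetical three-layer profile (illustrative values, not survey data)
profile_h = [5.0, 10.0, 15.0]       # layer thicknesses in metres
profile_v = [250.0, 500.0, 750.0]   # shear-wave velocities in m/s
site_vs30 = vs30(profile_h, profile_v)
```

Because the average is taken over travel time rather than thickness, a thin slow layer near the surface lowers $V_{s30}$ more than its thickness alone would suggest.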
The second geophysical approach used for site-specific characterisation is the Horizontal-to-Vertical Spectral Ratio (HVSR) method (Nakamura 1989). It is a technique used to estimate the fundamental frequency (also known as the natural or resonance frequency) of soil or rock layers in the subsurface. It is based on the idea that the ratio of the horizontal to the vertical spectrum of ground motion peaks near the natural frequency of the soil or rock (Nagoshi and Igarashi 1971; Nakamura 1989). The HVSR method relies on recording ground motion at a site on three components: one vertical and two horizontal. The signals are typically filtered to isolate the frequency range of interest, and the spectral amplitudes of the vertical and horizontal components are calculated. The HVSR is then calculated as the ratio of the spectral amplitude of the merged horizontal components to that of the vertical component using the equation:

$$\mathrm{HVSR}(f) = \frac{\sqrt{S_{NS}(f)\, S_{EW}(f)}}{S_{Z}(f)} \qquad (2)$$

where $\mathrm{HVSR}(f)$ is the horizontal-to-vertical spectral ratio, and $S_{NS}(f)$, $S_{EW}(f)$ and $S_{Z}(f)$ are the Fourier amplitude spectra in the NS, EW and vertical directions, respectively. The HVSR method has several advantages, including its ability to provide site-specific estimates of the fundamental frequency, its simplicity, and its robustness to noise and other disturbances. It has been widely used in geotechnical engineering and seismic risk assessment, as it provides a quick and efficient way to estimate the fundamental frequency of the subsurface material, which can be used to predict the response of the ground to seismic shaking. The natural frequency of the soil is read from the HVSR curve as the frequency at which the HVSR is maximal.
It is particularly useful in cases where it is difficult to obtain direct measurements of the soil or rock properties, such as in urban areas with a lot of infrastructure or in areas with shallow soil cover (Nakamura 1997). The fundamental frequency is related to the soil's shear wave velocity and stiffness, and can be used to estimate the site amplification factor for a given frequency and soil type.
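As a minimal sketch of the ratio and the peak picking, the snippet below merges the two horizontal amplitude spectra by their geometric mean, divides by the vertical spectrum, and takes the largest value as the fundamental peak. The function names and the synthetic spectra are hypothetical; real curves would also need the windowing, smoothing and peak reliability checks described for the data processing.

```python
import math


def hvsr_curve(s_ns, s_ew, s_z):
    """H/V ratio per frequency bin, merging horizontals by geometric mean."""
    return [math.sqrt(ns * ew) / z for ns, ew, z in zip(s_ns, s_ew, s_z)]


def pick_peak(freqs_hz, ratio):
    """Return (f0, A0): frequency and amplitude of the largest H/V value."""
    idx = max(range(len(ratio)), key=ratio.__getitem__)
    return freqs_hz[idx], ratio[idx]


# Synthetic amplitude spectra on a coarse frequency grid (illustrative only)
freqs = [0.5, 1.0, 2.0, 4.0, 8.0]
s_ns = [1.0, 2.0, 8.0, 2.0, 1.0]
s_ew = [1.0, 2.0, 2.0, 2.0, 1.0]
s_z = [1.0, 1.0, 1.0, 1.0, 1.0]

f0, a0 = pick_peak(freqs, hvsr_curve(s_ns, s_ew, s_z))
```

Taking the global maximum is the simplest possible picker; a curve with several peaks, or a peak of industrial origin, would need the manual screening that the study applies.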
In addition to $V_{s30}$ and the fundamental frequency, other parameters that can be computed from the output of a MASW or HVSR survey include the soil's thickness. Determining the thickness of soft sediments overlying bedrock and the geometry of the bedrock surface is a main component of many geological, hydrological, and engineering studies. Several authors, including Ibs-von Seht and Wohlenberg (1999) and Parolai et al. (2002), showed that seismic noise data can be used to determine the thickness of soft sediments. The fundamental site frequency $f_0$ is related to the sediment thickness $h$ by the following relationship:

$$f_0 = \frac{V_s}{4h} \qquad (3)$$

where $V_s$ is the average shear-wave velocity of the sediment layer overlying the bedrock and $h$ is the thickness of the sediments overlying the bedrock.
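The quarter-wavelength relationship between fundamental frequency, average shear-wave velocity and sediment thickness can be rearranged to give thickness directly. The helper names below are hypothetical and the numbers are illustrative rather than measured values.

```python
def sediment_thickness(vs_ms, f0_hz):
    """Sediment thickness h = Vs / (4 * f0), in metres."""
    if vs_ms <= 0 or f0_hz <= 0:
        raise ValueError("velocity and frequency must be positive")
    return vs_ms / (4.0 * f0_hz)


def fundamental_frequency(vs_ms, h_m):
    """Inverse relation f0 = Vs / (4 * h), in hertz."""
    if vs_ms <= 0 or h_m <= 0:
        raise ValueError("velocity and thickness must be positive")
    return vs_ms / (4.0 * h_m)


# Illustrative values: a 400 m/s sediment layer resonating at 2 Hz
h = sediment_thickness(400.0, 2.0)
```

The relation assumes a single uniform layer over much stiffer bedrock, so the result is best read as a first-order estimate.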
The seismic vulnerability index $K_g$ is a measure of the susceptibility of a building or structure to damage from earthquakes. It is typically calculated based on a combination of factors that contribute to the seismic performance of a building, including its structural characteristics, the quality of the construction, and the characteristics of the soil on which the building is founded. Nakamura (Nakamura 1997; Nakamura et al. 2000) showed that the $K_g$ values are linked to the ratio between the peak amplitude and the peak frequency of the quasi-transfer spectrum. With this approach, the hazardous locations in the research area can be delineated, and the places where damage might occur can be assessed before an earthquake happens. The $K_g$ index can also be used to inform the development of building codes and standards and to guide the design and construction of new buildings in seismically active areas. Following Nakamura (1997), the $K_g$ value is derived as:

$$K_g = \frac{A_0^{2}}{f_0} \qquad (4)$$

where $A_0$ is the amplification and $f_0$ is the fundamental frequency of the site, both obtained with the HVSR approach.
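A short sketch of Nakamura's index, $K_g = A_0^2 / f_0$, computed from HVSR peak values and used to rank sites. The site names and peak values are hypothetical placeholders, not results from the BSP survey.

```python
def vulnerability_index(a0, f0_hz):
    """Nakamura's seismic vulnerability index Kg = A0**2 / f0."""
    if f0_hz <= 0:
        raise ValueError("fundamental frequency must be positive")
    return a0 ** 2 / f0_hz


# Hypothetical HVSR peaks: site -> (A0, f0 in Hz)
peaks = {"site_a": (4.0, 2.0), "site_b": (2.0, 8.0), "site_c": (3.0, 1.0)}

kg = {site: vulnerability_index(a0, f0) for site, (a0, f0) in peaks.items()}

# Sites ordered from most to least vulnerable according to Kg
ranking = sorted(kg, key=kg.get, reverse=True)
```

Because the amplitude enters squared, a site with strong amplification at a low frequency dominates the ranking.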
Clustering techniques are a type of unsupervised machine learning that can be used in seismic microzonation to group similar regions based on their seismological characteristics (Moustafa et al. 2021). This is useful for identifying areas that are more or less vulnerable to earthquakes, as well as for developing more accurate models for predicting ground motion during an earthquake. In this research, the weighted K-means clustering technique is applied to the parameters estimated from MASW and HVSR to improve our understanding of site response and enhance the safety and resilience of the built environment. In the context of seismic microzoning at Benban Solar Park, weighted K-means clustering groups the data collected from microtremors and MASW into clusters based on the characteristics of the ground, with each cluster representing a different soil type or ground condition. The algorithm starts by selecting a set of initial centroids, which represent the centres of the clusters. Each data point is then assigned to the nearest centroid. The centroids are recomputed from the data points assigned to them, and the process is repeated until the centroids converge and the clusters become stable. The goal of weighted K-means clustering is to group the data points into clusters such that the weighted sum of the squared errors between the data points and their respective cluster centroids is minimised (Kerdprasop et al. 2005). This can be expressed mathematically as:

$$J = \sum_{j=1}^{k} \sum_{i=1}^{n} w_i \left( x_i^{(j)} \right)^{2} \qquad (5)$$

where $n$ is the number of data points, $k$ is the number of clusters, $w_i$ is the weight of data point $i$, and $x_i^{(j)}$ is the distance between data point $i$ and cluster centroid $j$, counted only for the points assigned to cluster $j$. The computation procedure can be represented as shown in the outlined algorithm.
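The assign-and-update loop described above can be sketched as follows. This is a generic weighted variant of Lloyd's iterations, written under stated assumptions and not the exact procedure of Kerdprasop et al. (2005) or of this study: per-point weights enter the centroid update as a weighted mean, the initial centroids are supplied by the caller, and the function names and toy data are hypothetical.

```python
def weighted_kmeans(points, weights, init_centroids, max_iter=100):
    """Lloyd-style K-means with per-point weights; returns (labels, centroids)."""
    centroids = [list(c) for c in init_centroids]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance
        new_labels = [
            min(range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids no longer move
        labels = new_labels
        # Update step: weighted mean of the points assigned to each cluster
        for j in range(len(centroids)):
            members = [i for i, lab in enumerate(labels) if lab == j]
            total_w = sum(weights[i] for i in members)
            if total_w == 0:
                continue  # keep the previous centroid for an empty cluster
            centroids[j] = [
                sum(weights[i] * points[i][d] for i in members) / total_w
                for d in range(len(centroids[j]))
            ]
    return labels, centroids


def weighted_sse(points, weights, labels, centroids):
    """Objective: weighted sum of squared distances to the assigned centroid."""
    return sum(
        w * sum((a - b) ** 2 for a, b in zip(p, centroids[lab]))
        for p, w, lab in zip(points, weights, labels)
    )


# Toy two-feature data set with one heavily weighted point (illustrative only)
pts = [(0, 0), (0, 2), (10, 0), (10, 2)]
wts = [1, 3, 1, 1]
labels, centroids = weighted_kmeans(pts, wts, init_centroids=[(0, 0), (10, 0)])
sse = weighted_sse(pts, wts, labels, centroids)
```

In practice the input features (for example shear-wave velocity in m/s and frequency in Hz) have very different scales, so they would normally be standardised before clustering; that step is omitted here for brevity.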
Virtually all machine learning tasks are ill-posed, since a unique solution cannot be found unless certain assumptions, or inductive biases, are included. These start with the choice of learning algorithm and may also include hyperparameters that control its optimization. Here we use unsupervised learning to classify the various soil conditions for microzonation mapping. The number of groups $k$ that should result from clustering is unknown in advance, yet it is a required parameter of the clustering process. To determine the number of clusters, we consider the overall compactness of the clusters as well as their separation from one another over a range of $k$ values, using appropriate assessment criteria such as the average Silhouette coefficient (Rousseeuw 1987) and the Calinski-Harabasz criterion (Calinski and Harabasz 1974). Silhouettes rely on average proximities and are known to work best with distinct, compact, and spherical clusters. The silhouette value of an element is defined as (Rousseeuw 1987):

$$s(x) = \frac{b(x) - a(x)}{\max\{a(x),\, b(x)\}} \qquad (6)$$

where $s(x)$ is the silhouette value of a single clustered observation $x$, $a(x)$ is the average dissimilarity, measured on a ratio scale (such as the Euclidean distance), between $x$ and all other observations in its own cluster, and $b(x)$ is the smallest average dissimilarity between $x$ and the observations of any other cluster. The range of silhouette values is $[-1, 1]$. Values close to 1 indicate that an observation is strongly clustered in its cluster, 0 indicates that it may just as well be grouped in its neighbouring cluster, and −1 indicates that the observation is almost certainly in the wrong cluster.
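The silhouette value can be computed directly from its definition. The sketch below uses Euclidean distance and assigns 0 to observations in singleton clusters, following the usual convention; the function name and the four-point data set are hypothetical.

```python
import math


def silhouette_values(points, labels):
    """Silhouette value s(x) = (b - a) / max(a, b) for every observation."""
    clusters = sorted(set(labels))
    values = []
    for i, p in enumerate(points):
        # Mean distance from p to the members of each cluster, excluding p itself
        mean_dist = {}
        for c in clusters:
            members = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(math.dist(p, q) for q in members) / len(members)
        own = labels[i]
        others = [d for c, d in mean_dist.items() if c != own]
        if own not in mean_dist or not others:
            values.append(0.0)  # singleton cluster or single-cluster solution
            continue
        a, b = mean_dist[own], min(others)
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Two compact, well-separated groups (illustrative only)
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
labs = [0, 0, 1, 1]
sil = silhouette_values(pts, labs)
mean_sil = sum(sil) / len(sil)
```

To choose the number of clusters, the same calculation would be repeated for each candidate k and the value with the highest average silhouette retained.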
The Calinski-Harabasz criterion is defined as (Calinski and Harabasz 1974):

$$CH = \frac{SS_B}{SS_W} \times \frac{N - k}{k - 1} \qquad (7)$$

where $SS_B$ is the between-group sum of squares, which gives the overall variance between clusters, $SS_W$ is the within-group sum of squares, which gives the overall variance within clusters, $k$ is the number of clusters, and $N$ is the number of observations. The greater the Calinski-Harabasz criterion, the better the cluster structure (Calinski and Harabasz 1974).
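As a minimal sketch, the Calinski-Harabasz criterion can be evaluated from the between-group and within-group sums of squares. The function name and the toy data are hypothetical, and the code assumes the within-group sum of squares is non-zero.

```python
def calinski_harabasz(points, labels):
    """CH = (SS_B / (k - 1)) / (SS_W / (N - k)) for a labelled data set."""
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need at least 2 clusters and fewer clusters than points")
    dim = len(points[0])
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    ss_b = 0.0  # between-group sum of squares
    ss_w = 0.0  # within-group sum of squares
    for c in clusters:
        members = [p for p, lab in zip(points, labels) if lab == c]
        centre = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        ss_b += len(members) * sum((centre[d] - overall[d]) ** 2 for d in range(dim))
        ss_w += sum((p[d] - centre[d]) ** 2 for p in members for d in range(dim))
    return (ss_b / (k - 1)) / (ss_w / (n - k))


# Two compact, well-separated groups (illustrative only)
pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
labs = [0, 0, 1, 1]
ch = calinski_harabasz(pts, labs)
```

As with the silhouette, the index is computed for each candidate number of clusters, and the k that maximises it is preferred.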

Experimental data collection and analysis
The satellite imagery for the study was obtained from the Landsat 8 mission, operated by the United States Geological Survey (USGS). Landsat 8 provides high-resolution, multispectral imagery of the Earth's surface at a spatial resolution of 30 m and a swath width of 185 km (Acharya and Yang 2015). Lineaments were extracted from the imagery in three steps. First, the imagery was pre-processed to enhance contrast and remove noise, using techniques such as contrast stretching, histogram equalization, and filtering. Second, image processing techniques, such as the edge detection algorithm (Koçal et al. 2004), were used to identify candidate lineaments. Third, the candidate lineaments were filtered and merged into a final set by applying criteria such as minimum length or minimum intensity to eliminate false positives and low-quality lineaments. Finally, the lineaments were interpreted as faults and fractures, which can be useful indicators of geologically significant structures or changes in surface material (Moustafa et al. 2022). The delineated surface lineaments are linear patterns on the Earth's surface that have substantial geological and geomorphological importance and influence seismic hazard.
Next, the HVSR and MASW geophysical techniques were used to investigate subsurface conditions and conduct geotechnical evaluations of the BSP site. Both microtremor measurements and MASW are non-invasive, as they do not involve digging or drilling into the ground. This is especially important at BSP, where the solar panels are already in place and the ground underneath them cannot be reached. Both techniques also gather data rapidly, which is vital for the efficient and timely completion of the seismic microzoning procedure at BSP. Used in conjunction, they offer a detailed and precise picture of the ground conditions at BSP, which is necessary for correct seismic microzoning (Park et al. 2007; Nakamura 1989).
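The lineament-extraction steps described above (contrast enhancement, edge detection, and thresholding of candidates) can be sketched as follows. This is only an illustrative sketch: the source does not name the edge operator, so a Sobel gradient is used here as a stand-in, and the function names, percentile limits, and threshold are assumptions rather than the settings used in the study.

```python
import numpy as np

def contrast_stretch(img, low_pct=2.0, high_pct=98.0):
    """Linearly rescale intensities between two percentiles to the range [0, 1]."""
    img = np.asarray(img, dtype=float)
    lo, hi = np.percentile(img, [low_pct, high_pct])
    if hi == lo:  # flat image: nothing to stretch
        return np.zeros_like(img)
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)

def sobel_magnitude(img):
    """Gradient magnitude from 3x3 Sobel kernels (assumed stand-in operator)."""
    kx = np.array([[-1.0, 0.0, 1.0], [-2.0, 0.0, 2.0], [-1.0, 0.0, 1.0]])
    ky = kx.T
    padded = np.pad(img, 1, mode="edge")
    gx = np.zeros(img.shape, dtype=float)
    gy = np.zeros(img.shape, dtype=float)
    for i in range(3):
        for j in range(3):
            patch = padded[i:i + img.shape[0], j:j + img.shape[1]]
            gx += kx[i, j] * patch
            gy += ky[i, j] * patch
    return np.hypot(gx, gy)

def candidate_edges(img, threshold=0.5):
    """Boolean mask of pixels whose gradient is at least `threshold` of the maximum."""
    mag = sobel_magnitude(contrast_stretch(img))
    if mag.max() == 0.0:
        return np.zeros(mag.shape, dtype=bool)
    return mag >= threshold * mag.max()

# Synthetic scene (hypothetical): a vertical brightness step between columns 9 and 10.
scene = np.zeros((20, 20))
scene[:, 10:] = 1.0
edges = candidate_edges(scene)
```

In a real workflow the resulting mask would still need to be vectorised into line segments and filtered by length before any geological interpretation.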
The free-field microtremor experiment was carried out at 100 measurement locations in the BSP, whereas MASW measurements were made at only 80 profiles. The spatial distribution of the measurement locations is shown in Fig. 2. Mohamed et al. (2022) describe the data collection and processing for the MASW experiment in detail. They analysed the foundation layers and the underlying structures at the BSP site using two geophysical methods, shallow seismic refraction and MASW; consequently, these are not addressed in the present study, and only the HVSR survey details are presented.
In the current research, the Horizontal-to-Vertical Spectral Ratio (HVSR) approach developed by Nakamura (1989) was implemented for the seismic site evaluation. To collect HVSR data, 100 locations spaced at regular intervals along a line that covers the BSP area of interest were selected (Fig. 2). A Nanometrics-Compact Trillium-120 broadband seismometer with a sensitivity of 750 V·s/m was used to detect and record microtremors, mostly at the same sites as the MASW profiles. At each location, microtremors were recorded for 120 min at a sampling rate of 100 samples per second. The standards created for ambient vibration measurements by the Site Effects assessment using AMbient Excitations (SESAME) project (Bard and Participants 2004) were adopted throughout the observation and analysis of the microtremor data at the BSP site. HVSR measurements were taken in a desert environment, so there are no nearby buildings, trees, or subsurface structures to impair the data. The same data collection process was repeated at multiple locations along the line to gather a sufficient amount of data for analysis. The spatial distribution of HVSR observation sites is depicted in Fig. 2.
To analyse the HVSR data, several pre-processing steps were performed to remove noise and artefacts that could interfere with the analysis. These involve filtering out specific frequency ranges and applying smoothing techniques to clean the data. Data processing was done using the Geopsy program (Wathelet et al. 2020), which is designed to analyse and compute the HVSR from ambient seismic noise recordings. The HVSR curves were constructed by averaging the spectral ratios of the microtremor signals over the selected windows. First, the recorded signals were divided into non-overlapping windows of 25 s each, as shown in Fig. 3. The data analysis focuses on the frequency band between 0.2 and 20 Hz. Second, each window was transformed with the Fast Fourier Transform (FFT) and smoothed with the Konno and Ohmachi (1998) function, using a bandwidth corresponding to 10% of the central frequency. The key advantage of that smoothing function is its ability to accommodate the varying number of points at lower frequencies. Third, the two horizontal components (EW and NS) were merged into a single horizontal component by their geometric mean, which was then divided by the vertical component to obtain the observed HVSR curve (Nakamura 1989) according to Eq. 2. During processing, all peaks on the HVSR curves were examined for their reliability and for their origin (natural or industrial), as indicated in Fig. 3. Some representative examples of the HVSR curves obtained at different sites in the study area are shown in Fig. 4. The fundamental frequency of the ground vibrations, which is the frequency at which the amplitude of the HVSR curve is maximum, was identified for each observation site. The delineated fundamental frequency and spectral amplification were used to infer the properties of the subsurface soil.
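The processing chain described above (windowing, FFT, geometric mean of the horizontals, division by the vertical, and averaging over windows) can be sketched in a few lines. This is a simplified sketch and not a substitute for Geopsy: the Konno and Ohmachi smoothing and the window-selection criteria are omitted, a Hann taper is assumed, and the function name `hvsr_curve` and the synthetic three-component record are hypothetical.

```python
import numpy as np

def hvsr_curve(ns, ew, ud, fs, win_s=25.0, fmin=0.2, fmax=20.0):
    """Average H/V spectral ratio over non-overlapping windows (no smoothing)."""
    ns, ew, ud = (np.asarray(c, dtype=float) for c in (ns, ew, ud))
    n_win = int(round(win_s * fs))
    n_windows = len(ud) // n_win
    freqs = np.fft.rfftfreq(n_win, d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    taper = np.hanning(n_win)  # assumed taper; the source does not specify one
    log_ratios = []
    for w in range(n_windows):
        sl = slice(w * n_win, (w + 1) * n_win)
        # Amplitude spectrum of each demeaned, tapered component in the band.
        spec = [np.abs(np.fft.rfft(taper * (c[sl] - c[sl].mean())))[band]
                for c in (ns, ew, ud)]
        horizontal = np.sqrt(spec[0] * spec[1])  # geometric mean of NS and EW
        log_ratios.append(np.log(horizontal / spec[2]))
    # Average the ratios over windows in the log domain.
    return freqs[band], np.exp(np.mean(log_ratios, axis=0))

# Synthetic record (hypothetical): white noise plus a 1 Hz horizontal resonance.
fs = 100.0
t = np.arange(25000) / fs  # 250 s, i.e. ten 25 s windows
rng = np.random.default_rng(0)
resonance = 5.0 * np.sin(2.0 * np.pi * 1.0 * t)
ns = rng.standard_normal(t.size) + resonance
ew = rng.standard_normal(t.size) + resonance
ud = rng.standard_normal(t.size)

freqs_hv, hv = hvsr_curve(ns, ew, ud, fs)
f_peak = freqs_hv[np.argmax(hv)]  # estimate of the fundamental frequency
a_peak = hv.max()                 # estimate of the peak amplitude
```

Averaging in the log domain is one common choice for combining windows; an arithmetic mean of the ratios would be an equally plausible reading of the text.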
Finally, to ensure that all characteristics had the same effect on the employed unsupervised machine-learning algorithm, all measured or estimated features were standardised to have the same variance as a final step in the processing procedure. The Z-score was used for normalization:

$$Z_j = \frac{X_j - \mu_j}{\sigma_j}$$

where $X_j$ is the vector of all data points for feature $j$, $\mu_j$ is the mean of the data points in feature $j$, $\sigma_j$ is the standard deviation of feature $j$, and $Z_j$ is the normalised vector.
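A minimal sketch of this standardisation is given below. The feature table is hypothetical (its values are illustrative, not measurements from the survey), and the use of the population standard deviation (`ddof=0`) is an assumption, since the text does not say which estimator was used.

```python
import numpy as np

def zscore(features):
    """Standardise each column: subtract its mean and divide by its standard deviation."""
    features = np.asarray(features, dtype=float)
    mu = features.mean(axis=0)
    sigma = features.std(axis=0, ddof=0)  # assumed population standard deviation
    return (features - mu) / sigma

# Hypothetical rows: one site per row; columns are f0 (Hz), A0, Vs30 (m/s), Kg.
table = np.array([
    [0.40, 4.0, 350.0, 40.0],
    [0.55, 5.0, 400.0, 45.5],
    [0.90, 3.0, 600.0, 10.0],
    [0.45, 8.0, 330.0, 142.2],
])
Z = zscore(table)
```

After this step every column has zero mean and unit variance, so a feature measured in m/s no longer dominates the Euclidean distances used by the clustering.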

Results and discussions
To start the process of creating different microzonation maps, remote sensing was used to identify structural faults in the BSP area that could affect it. The Earth's crust has a variety of fractures due to a variety of tectonic and geologic processes, known as faults. Results of the analysis suggest four primary fault sets in the studied area: ENE-WSW to E-W, NNW-SSE to N-S, NE-SW, and NW-SE (illustrated in Fig. 5). Further investigation reveals a steep dip angle of these faults, typically exceeding 75°. Additionally, dextral (right-lateral) movement and normal displacement have been observed in the ENE-WSW to E-W fault set. The NNW-SSE faults (N-S) in the western part of the area are classified as normal, along with the well-known Spillway Fault Zone (illustrated in Fig. 5). This fault zone is 60 km in length and has a steep angle of 80°, as well as sinistral (left-lateral) displacement. The Spillway Fault Zone is linked to a variety of seismic activities in the area, as per the previous geodetic, geophysical, and seismological studies (Conco 1987; Deif et al. 2009; Hassib et al. 2012; Mohamed et al. 2020, 2022; Consultants 1985). Finally, though the depth of the water table is unknown, previous studies have identified that the highly fractured zones related to these faults may be conducive to water accumulation.
The results give practical information about faults that may be used in seismic hazard assessment and microzonation, especially in identifying the locations of active faults that can lead to earthquakes. This information may be used to identify regions of increased risk, enabling decision-makers to devote the appropriate resources to reduce such possible hazards. Moreover, information on the locations of fractures and their relation to the level of water tables may be used to establish measures to limit groundwater-induced seismicity in the region as well as to understand the seismic response of these locations. Such measures may help minimise seismic risk as well as boost the overall safety of the solar park.
Analysis of the Multichannel Analysis of Surface Waves (MASW) datasets reveals that most profiles display two layers of soil with varying shear wave velocities. The upper layer, located at the surface and extending to a depth of 2-18 m, has velocities ranging from 204 to 570 m/s. The lower layer displays velocities ranging from 336 to 977 m/s. A third layer, in some profiles, has velocities between 766 and 946 m/s. Furthermore, the MASW technique is employed to create maps of the $V_s^{30}$ distribution, where $V_s^{30}$ is a measure of the shear wave velocity of the soil in the top 30 m of the earth's surface. This $V_s^{30}$ value is highly significant for seismic microzonation. By obtaining a $V_s^{30}$ distribution, it is possible to comprehend the spatial variations in soil characteristics within the BSP area. Results of the MASW analysis are used to identify areas with high or low soil stiffness and to highlight areas that may be more vulnerable to ground shaking during an earthquake. Shear wave velocities for the top 30 m are estimated to range from 319 to 834 m/s (Fig. 6a).
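For a layered MASW profile, the shear wave velocity of the top 30 m is conventionally taken as the time-averaged velocity, that is, 30 m divided by the vertical shear-wave travel time through the top 30 m. The source does not state the formula it used, so the sketch below assumes this standard definition; the function name `vs30` and the two-layer example profile are hypothetical, with values chosen inside the ranges reported above.

```python
def vs30(thicknesses_m, velocities_ms):
    """Time-averaged shear wave velocity over the top 30 m of a layered profile."""
    depth = 0.0
    travel_time = 0.0
    for h, v in zip(thicknesses_m, velocities_ms):
        used = min(h, 30.0 - depth)  # only the part of the layer above 30 m counts
        if used <= 0.0:
            break
        travel_time += used / v
        depth += used
    if depth < 30.0:
        raise ValueError("profile is shallower than 30 m")
    return 30.0 / travel_time

# Hypothetical profile: 10 m at 300 m/s over a half-space at 600 m/s.
example = vs30([10.0, float("inf")], [300.0, 600.0])
```

Because slow layers contribute more travel time per metre, this average (450 m/s in the example) is lower than the thickness-weighted arithmetic mean of the layer velocities (500 m/s).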
Fig. 3 Example of data processing and window selection in the Geopsy software for the recorded microtremor data. The evaluated peak of the HVSR curve for sites 6, 40, and 99 in the mapped area is given next to each selection. Each dotted curve indicates the standard deviation, while the solid line displays the average of the HVSR spectra. Other curves indicate the average of all HVSR spectra for various windows
According to the estimated $V_s^{30}$ values, the average is 400 m/s, the standard deviation is 110 m/s, and the coefficient of variation (standard deviation divided by the mean) is 0.27. The mean gives the central value around which the $V_s^{30}$ estimates are distributed. The standard deviation indicates that the values are fairly spread out, with some values significantly higher or lower than the mean, and the coefficient of variation shows that this spread amounts to 27% of the mean. Using the estimated velocity values, it is possible to design buildings and other infrastructure in the BSP region to be resistant to local seismic conditions and to ensure the safety of communities in the area. Descriptive statistics of estimated parameters are given in Table 1.
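The three summary statistics quoted here are related by a single ratio, which the short sketch below makes explicit. The helper `describe` and the six sample values are hypothetical (illustrative numbers, not survey data); the use of the sample standard deviation (`ddof=1`) is likewise an assumption.

```python
import numpy as np

def describe(values):
    """Return the mean, standard deviation, and coefficient of variation."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sd = values.std(ddof=1)  # assumed sample standard deviation
    return {"mean": mean, "sd": sd, "cv": sd / mean}

# Cross-check of the reported summary: 110 / 400 = 0.275, quoted as 0.27.
reported_cv = 110.0 / 400.0

# Illustrative values in m/s (not measurements from the survey).
sample = [320.0, 350.0, 380.0, 400.0, 450.0, 500.0]
stats = describe(sample)
```

The same ratio applies to the other parameters summarised in Table 1, so the coefficient of variation is a convenient way to compare the spread of quantities with different units.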
Fig. 4 displays some representative examples of the HVSR curves acquired in this investigation. The data reveal various forms of HVSR curves. In the case of a single peak, the corresponding frequency was regarded as the fundamental frequency ($f_0$). Curves with two peaks were seen at only a few sites, owing to the existence of two impedance contrasts at distinct subsurface depths (Lebrun et al. 2002); in these cases the first peak, at the lower frequency, denotes the fundamental frequency. The reliability of the HVSR curves was examined to confirm that these peaks are not of industrial origin. Some HVSR curves show no obvious peak, as a consequence of hard rock at the surface or at extremely shallow depths. Spatial distribution maps of the fundamental frequencies ($f_0$) and related amplitudes ($A_0$) were developed (Fig. 6b and c). Fundamental frequency values varied between 0.352 Hz at site 54 and 1.014 Hz at site 39. At ten seismic stations the fundamental frequency was above 0.78 Hz. At stations 12, 26, 28, 16, 11, and 13 it ranged from 0.92 to 1.014 Hz (Fig. 6b). Based on the calculated values of $f_0$, the average is 0.55 Hz, with a standard deviation of 0.186 Hz and a coefficient of variation of 0.34. The standard deviation shows that the values are relatively dispersed, with some values deviating significantly from the mean, and the coefficient of variation shows that this spread amounts to 34% of the mean. A summary of these descriptive statistics can be found in Table 1. A significant proportion of the study area, represented by 68 sites, exhibited a fundamental frequency below 0.5 Hz. This observation may be attributed to the presence of thick, dense soil layers or the depth of these layers. The low fundamental frequency values have implications for the design of structures in these locations, such as buildings or bridges, and should be taken into consideration during the design process.
Amplification factor values varied between 2.3 and 10.3. The highest amplification factor was obtained at seismic station 22. At stations 21, 17, and 18, amplification factor values ranged from 8.4 to 9.92. At 17 stations, the amplification factor values were smaller than 3.58 (Fig. 6c). The estimated $A_0$ values are an important indicator of the seismic hazard at a site, as they show how much the ground motion is amplified at a specific location. High values of $A_0$ can increase the vulnerability of buildings and structures at that site. The average value of 5.1 suggests that, on average, the ground motion at the surveyed sites is amplified by a factor of 5.1. The standard deviation of 1.76 indicates that the values of $A_0$ are relatively spread out, with some sites experiencing significantly higher or lower levels of amplification than the average. The coefficient of variation of 0.34 shows that this spread amounts to 34% of the mean. The descriptive statistics of the estimated parameters are presented in Table 1. These statistics summarise the distribution of the $A_0$ values and can be used to understand the overall pattern and variability of the data.
The seismic vulnerability index is calculated from the amplification and fundamental frequency estimated in the HVSR computation, using the definition given in Eq. 4. The seismic vulnerability index Kg is a numerical measure of the susceptibility of a site to damage from earthquakes. The Kg values calculated for this study ranged between 5 and 232. Very high Kg values (> 50) were obtained at 49 seismic stations. High Kg values (20-50) were calculated at 34 seismic stations. Low Kg values (< 20) were obtained at only 19 stations (Fig. 6d). The low seismic vulnerability indices found in very small areas suggest that these areas may be less severely affected in a future earthquake. The majority of the surveyed area, however, has high seismic vulnerability indices, suggesting that it may be damaged when seismic events strike the region, particularly earthquakes that occur near Nasser Lake. A very high to high seismic vulnerability index (Kg) means that a structure or building at the site has a high probability of experiencing significant damage or collapse during an earthquake (Fig. 6d). Based on the calculated values of Kg, the average is 61.21, with a standard deviation of 48.32 and a coefficient of variation of 0.78. The standard deviation shows that the values are widely dispersed, with some sites deviating significantly from the mean, and the coefficient of variation shows that this spread amounts to 78% of the mean. Descriptive statistics of estimated Kg values are given in Table 1.
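As a worked illustration, the index and the three classes quoted above can be expressed as follows. Eq. 4 is not reproduced in this section, so the sketch assumes it is the commonly used Nakamura form, Kg = A0² / f0; the function names are hypothetical, and the treatment of the exact boundary values 20 and 50 is also an assumption.

```python
def vulnerability_index(a0, f0):
    """Seismic vulnerability index, assuming the form Kg = A0**2 / f0."""
    return a0 ** 2 / f0

def kg_class(kg):
    """Classes quoted in the text: low (< 20), high (20-50), very high (> 50)."""
    if kg < 20.0:
        return "low"
    if kg <= 50.0:
        return "high"
    return "very high"

# Example using the reported mean amplification (5.1) and mean frequency (0.55 Hz).
kg_example = vulnerability_index(5.1, 0.55)  # 26.01 / 0.55, about 47.3
label = kg_class(kg_example)
```

Note that the index evaluated at the mean A0 and mean f0 need not equal the mean of the site-by-site Kg values, because Kg is a nonlinear function of both parameters.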
The results of the MASW and HVSR surveys were integrated to obtain site-specific information on soil properties, namely shear wave velocity, fundamental frequency, spectral amplification, and the seismic ground vulnerability index. Weighted K-means unsupervised clustering, a technique that detects patterns in the data and groups similar areas without making presumptions about the data, was then applied to construct individual seismic microzonation maps for the BSP area. The data used in the analysis included four variables: the fundamental frequency ($f_0$) of soft sediments, the spectral amplitude ($A_0$), the shear-wave velocity in the upper 30 m ($V_s^{30}$), and the soil vulnerability index (Kg). Some data points were removed from the dataset because they were determined to be outliers. To make the data more homogeneous, the values of the different variables were standardised using the Z-score scheme, as the clustering strategies used require the data to be within the same range.
Fig. 5 The geological map of the study area displays the distribution of rock units and the locations of major faults that impact the area. The trend of the faults is also depicted in a rose diagram
Fig. 6 The average shear-wave velocity ($V_s^{30}$ (m/s)), the fundamental frequency ($f_0$ (Hz)), spectral amplification ($A_0$), and seismic ground vulnerability (Kg) distribution maps of the study area. The blue lines indicate the main faults affecting BSP
To ascertain the optimal number of clusters ($k$) in the dataset, the elbow method (distortion score), the Calinski-Harabasz index, and the silhouette score, as depicted in Fig. 7, were taken into consideration. The elbow method involves plotting the distortion score as a function of the number of clusters and identifying the "elbow" of the curve. The optimal number of clusters is indicated where the score begins to level off. The Calinski-Harabasz index is used to assess the compactness and separation of the clusters; higher values of the index suggest better separation and compactness of the clusters. The silhouette score evaluates how similar an observation is to its own cluster compared with other clusters; higher values of the score signify better separation and compactness of the clusters. According to the scores depicted in Fig. 7, increasing the number of clusters beyond three does not substantially enhance the analysis. Therefore, the optimal cluster number ($k$) for all the selected variables is three. Once the data are grouped into clusters, individual microzonation maps can be generated for each variable. These maps can be used to identify areas of similar soil properties and areas that are particularly susceptible to seismic hazard and vulnerability. Microzonation maps for the BSP area, constructed using weighted K-means clustering with three clusters, are shown in Figs. 8, 9, 10, and 11.
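The elbow procedure described above can be sketched with a small weighted K-means implementation. Several points here are assumptions rather than details from the source: the weights are interpreted as per-feature weights on the squared distance, a deterministic farthest-first initialisation is used for reproducibility, and the function name, weights, and synthetic three-group dataset are hypothetical.

```python
import numpy as np

def weighted_kmeans(X, k, weights, n_iter=100):
    """Lloyd's algorithm with feature-weighted squared Euclidean distance."""
    # Scaling each feature by sqrt(weight) makes plain squared distance weighted.
    Xw = np.asarray(X, dtype=float) * np.sqrt(np.asarray(weights, dtype=float))
    # Farthest-first initialisation (an assumed, deterministic choice).
    centers = [Xw[0]]
    for _ in range(1, k):
        d2 = np.min([((Xw - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Xw[np.argmax(d2)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((Xw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        updated = np.array([
            Xw[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    # Final assignment and distortion (sum of squared distances to own centre).
    d2 = ((Xw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), d2.min(axis=1).sum()

# Synthetic data (hypothetical): three well-separated groups of 30 points each.
rng = np.random.default_rng(1)
blob_centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X_demo = np.vstack([c + 0.5 * rng.standard_normal((30, 2)) for c in blob_centres])
feature_weights = np.array([2.0, 1.0])  # placeholder weights

distortion = {k: weighted_kmeans(X_demo, k, feature_weights)[1] for k in range(1, 7)}
labels3, _ = weighted_kmeans(X_demo, 3, feature_weights)
```

For this synthetic case the distortion drops sharply up to three clusters and only slowly afterwards, which is the "elbow" pattern used to choose the number of clusters.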
The first map developed for the microzonation is that of the average shear wave velocity in the top 30 m (Fig. 8). This map is used to classify the soil type present at the BSP site and to predict how the surface may respond during an earthquake. The results of the classification demonstrate that the majority of the BSP region has a low shear wave velocity, with a mean value of 335 m/s, implying that this terrain would be more easily affected by ground motion. This can consequently result in an unstable foundation and the collapse of existing structures supported by the soil. There are minor areas in the southeast and southwest that possess medium and high shear wave velocities, with respective averages of 447 m/s and 618 m/s, and there are also isolated spots in the northwest that are classed as having a medium shear wave velocity (Fig. 8). According to the National Earthquake Hazards Reduction Program (NEHRP), sites with a $V_s^{30}$ of less than 360 m/s are classed as very soft soils, sites with a $V_s^{30}$ of 360-750 m/s are classed as soft soils, and sites with a $V_s^{30}$ of over 750 m/s are categorised as firm to stiff soils. In essence, soils with high to medium $V_s^{30}$ values are more resistant to ground shaking.
The potential for ground motion amplification or reduction at various locations in the BSP area, based on the characteristics of the soil at those locations, is illustrated by the final HVSR spectral amplification microzonation map of Fig. 9. This map, generated through the application of the weighted K-means clustering algorithm, divides the area into three classes based on $A_0$ values. Areas of loose sandy soils exhibit high amplification potential, indicating that they may be vulnerable to increased levels of damage in response to seismic events. The majority of the study area is classified as a moderate amplification site. The borders of the BSP area show low amplification, implying a greater capacity to resist seismic events. This map can be used by engineers and building designers to gain insight into the amplification potential in the area and, in turn, to design structures with enhanced resistance to earthquakes and other seismic events.
The HVSR fundamental frequency microzonation map of the BSP region (Fig. 10) is used to evaluate the seismically induced ground shaking characteristics and the potential for soil amplification. This map can be consulted by engineers and other professionals when designing structures that can withstand earthquakes and other natural disasters. In BSP, the majority of the mapped area has low fundamental frequency values. This may be attributed to factors such as soil type and depth of the soil layers. Conversely, high values could be indicative of areas that experience ground shaking at higher frequencies, which can be attributed to soil type and shallow depth of the soil layers. In the mapped area, the classification map shows values below 1.0 Hz, which are typically associated with longer periods of ground shaking, thus leading to less structural damage (Fig. 10).
The seismic vulnerability classification map illustrates the potential for harm or destruction from an earthquake in a given region (Fig. 11). Areas highlighted as having a high seismic vulnerability may experience elevated levels of ground shaking when an earthquake occurs, which may result in considerable destruction. By contrast, regions with a low seismic vulnerability are less likely to sustain extreme ground shaking during an earthquake, which can reduce the amount of damage caused to buildings and other structures (Fig. 11). The microzonation maps of the BSP area highlight the numerous elements that may affect the intensity of seismic ground motion, including soil type and depth as well as the presence of groundwater. Alluvial soils, which are sediments deposited by water, are common in Benban and its surroundings and are part of the Wadi El-Kubanyia and Kom Umbo sub-basins. There, low-hazard zones are prominent in hills and high topography, while high-hazard zones appear in plain terrains, which have also been subjected to seismic activity and local climate transformations. Additionally, groundwater may influence soil stiffness and damping properties, which in turn can affect ground motion frequency and magnitude during an earthquake. The degree and spacing of groundwater may also impact earthquake shaking. Intense rains and floods are both capable of modifying the topography and soil qualities of BSP, increasing the danger of collateral damage to houses and other structures during earthquakes, especially in regions with sloping land or lacking adequate soil layers.

Conclusions
Analysis of the Multichannel Analysis of Surface Waves (MASW) datasets suggests that two layers of soil with varying shear wave velocities, ranging from 204 to 570 m/s and from 336 to 977 m/s respectively, are commonly observed in most profiles. On rare occasions, a third layer with velocities between 766 and 946 m/s is also present. By employing the MASW technique, a $V_s^{30}$ distribution map was created to characterise the shear wave velocity of the soil in the top 30 m of the earth's surface. According to the analysis, the average $V_s^{30}$ is 400 m/s, with a standard deviation of 110 m/s and a coefficient of variation of 0.27. Nine sites indicate relatively high values of shear wave velocity, ranging from 593 to 673 m/s, while just five locations reveal low shear wave velocity values in the range from 326 to 338 m/s. The estimated parameters show that the $V_s^{30}$ values are distributed around the mean with a spread of about 27% of the mean. This information can be used to design buildings and other infrastructure in the BSP region to be resistant to local seismic conditions and to ensure the safety of the population.
In this research, various techniques were used to analyse the geology and soil conditions of the Benban area, with the ultimate goal of creating microzonation maps that will help to inform development and construction in the region. To achieve this goal, several techniques were combined: lineament extraction with remote sensing, HVSR surveys to estimate fundamental frequency and amplification, MASW surveys to estimate shear-wave velocity in the upper 30 m, and weighted K-means clustering to build individual microzonation maps. One of the key benefits of using remote sensing for lineament extraction is that it allows for the efficient and accurate mapping of geologic features such as faults and fractures. This information is critical for understanding the stability of the ground and the potential for earthquakes, landslides, and other geohazards. The HVSR survey technique is a useful tool for estimating the fundamental frequency and amplification of soil layers, which can help to predict the ground motion that will occur during an earthquake. The MASW survey method is useful for estimating shear-wave velocity in the upper 30 m, which is a key parameter for evaluating the seismic response of a site. The use of weighted K-means clustering to build individual microzonation maps is also a valuable aspect of this research. This technique groups data points into clusters based on common characteristics, which can help to identify areas that are similar in terms of their geologic and soil conditions. With microzonation maps, researchers and policymakers can have a clearer understanding of the risk profiles of different areas and can make informed decisions about where and how to develop or construct buildings and infrastructure.
Overall, the integration of these techniques appears to be an effective way to gain a detailed understanding of the geology and soil conditions of the Benban area, which will be valuable for informing development and construction projects in the region and for mitigating the risks associated with geohazards.