Evaluating the Performance of Machine Learning and Deep Learning Techniques to HyMap Imagery for Lithological Mapping in a Semi-Arid Region: Case Study from Western Anti-Atlas, Morocco

Accurate lithological mapping is a crucial step for geological studies and mineral exploration. Hyperspectral data provide the opportunity to extract detailed information about the geology and mineralogy of the Earth's surface. Machine learning (ML) and deep learning (DL) techniques provide an accurate and effective mapping of various types of lithologies in arid and semi-arid regions. This article discusses the use of machine learning algorithms, specifically Support Vector Machines (SVM), one-dimensional Convolutional Neural Network (1D-CNN), random forest (RF), and k-nearest neighbor (KNN), for lithological mapping in a complex area with strong hydrothermal alteration. The study evaluates the performance of the four algorithms in three different zones in the Ameln valley shear zone (AVSZ) area at the eastern Kerdous inlier, Moroccan western Anti-Atlas. The results demonstrated that the 1D-CNN achieved the best classification results for most lithological units. Additionally, the linear-kernel SVM (LK-SVM) demonstrated good mapping results compared to the other SVM models, as well as RF and KNN. Our study concludes that the combination of the CNN and HyMap data can provide the most accurate lithological mapping for the three selected regions, with an overall accuracy of ~95%. However, this study highlights the challenges in identifying different lithological units using remotely sensed data due to spectral similarities induced by similar chemical and mineralogical compositions. This study emphasizes the importance of carefully considering and evaluating ML and DL methods for lithological mapping studies, and recommends high-resolution hyperspectral data and DL models for accurate results. The implications of this study should be of interest to exploration geologists for Mineral Prospectivity Mapping (MPM), especially in selecting the most appropriate techniques for highly accurate mineral mapping in metallogenic provinces.


Introduction
Reflectance spectroscopy data have been used in several studies to measure specific chemical and physical properties of surface materials. Remotely sensed Hyperspectral Imaging (HSI) data play an important role in geological investigations and mineral exploration, especially in arid and semi-arid environments [1][2][3][4][5][6]. Various hyperspectral satellites
gence allowed the hardly accessible regions' mineral potential to be mapped accurately and time effectively [19,32].
In this study, the capability of ML and DL techniques was evaluated for automated lithological mapping using HyMap imagery. This work evaluates three support vector machine types (linear, polynomial, and radial basis function kernels), as well as RF, KNN, and the 1D-CNN, in the mapping of lithologies within three parts of the Ameln valley shear zone (AVSZ) area at the eastern Kerdous inlier, Moroccan western Anti-Atlas (Figure 1). The following steps were adopted to perform this investigation: (i) selecting three areas in the HyMap imagery and evaluating their spectral characteristics; (ii) carrying out the lithological mapping using the different ML and DL classifiers; (iii) undertaking an accuracy assessment. The validation data are essentially based on field observations of the lithological units in each given area, and the performance evaluation used various measures, namely overall accuracy, Kappa, and F1-score. In summary, this work seeks to present a comprehensive evaluation of the ML and DL approaches for lithological mapping in a semi-arid region of the AVSZ using HyMap imagery. The findings of this study will be of interest to exploration geologists for Mineral Prospectivity Mapping (MPM) and to other experts in the Earth sciences, in particular when selecting appropriate techniques to use in similar regions.

Geological Setting of the Study Area
The study area is located in the Anti-Atlasic belt of Morocco (Figure 1). The Anti-Atlas forms the northern edge of the West African Craton [33]. The Proterozoic lithological formations crop out in several inliers, including the Kerdous, Akka, and Bas Draa inliers, and are surmounted by an Ediacaran-Paleozoic cover in the western Anti-Atlas [34].
Within the Kerdous inlier, the Paleoproterozoic basement represents more than 30% of the rock [35]. Basement units are formed by a polymetamorphic complex [34,36,37], represented in the study area by the orthogneisses of the Jbel (mountain) Ouiharen (Xoε), schist, and mica-schist (XIξ). These metamorphic units are overlapped by several granitoids from the Paleoproterozoic era, including the Tasserhirt Plateau calc-alkaline granite (XIγm). The Neoproterozoic and the Paleozoic formations were deposited unconformably on the Paleoproterozoic units [37,38]. In the study area (Figure 1), the Neoproterozoic units consist of the quartzites of Jbel Lkest (XII2q), the rhyolitic volcanites and ignimbrites of Adrar Mkorn (XIIIm), the Pan-African granites of Tafraout (XII3γ), and the volcano-detrital deposits of the Tanalt formation (XIIIS1). The dolerite dykes (XII2δ) are mapped essentially within the Jbel Lkest quartzites. The Lower Cambrian units are formed by the Adoudou formation, which is divided into the basal series, represented by schist and sandstone (Ad11a) and limestone and dolomite (Ad11a), and the lower series, represented by dolomite and limestone (Ad12). The study area is known for several structural trends, with a particular dominance of the NE-SW trend. The dominance of this trend was demonstrated with multisource remote sensing datasets [39].

Characteristics of HyMap Data
HyMap is a hyperspectral airborne imaging system developed by Integrated Spectronics, Sydney, Australia, and operated by the HyVista Corporation. The principal characteristics of the HyMap data are summarized in Table 1.

Table 1. HyMap sensor characteristics, Cocks et al. [13].

Module   Spectral Range (nm)   Number of Bands   Spectral Resolution (nm)
VIS      450-890               31                15
NIR      890-1350              31                15
SWIR1    1400-1800             32                13
SWIR2    1950-2500             32                17

The HyMap scene of the Ameln valley region was recorded in 124 bands, from 450 nm to 2500 nm, with a spatial resolution of around 5 m and an average spectral resolution of 15 nm. Geometric and atmospheric corrections were performed on the scene by HyVista. The imagery was provided as ground reflectance by the National Office of Hydrocarbons and Mines (ONHYM). The HyMap scene was acquired over the Anti-Atlasic belt during a regional airborne survey covering a total area of 10,000 km². The imagery was already georeferenced in the UTM zone 29 projection with the WGS-84 datum. The Atmospheric and Topographic Correction model (ATCOR4) was applied to the data to convert radiance to surface reflectance on the basis of the MODTRAN radiative transfer code; it can also eliminate the topographic effect of illumination differences [40]. Noisy bands and bands covering water absorption features were eliminated during this stage, since bad bands must be removed from hyperspectral datasets before further processing. Thus, only 110 bands were used for further processing. A vegetation mask was applied to the study area scene, and a gap-filling tool integrated in the ENVI (5.3) software was then used to reconstitute the resulting no-data pixels and avoid gaps in the results [19]. Figure 2 presents the flowchart of the methodology used to process the HyMap data in this analysis.

Image Processing by Applying Machine and Deep Learning Techniques
The minimum noise fraction (MNF) method is a valuable transformation technique for remote sensing imagery. Its primary objectives are to determine the intrinsic dimensionality of the data (the optimal number of bands) and to separate noise from the underlying signal. Figure 3 displays MNF1, MNF2, and MNF3 as an RGB color combination. The MNF image exhibits enhanced spectral contrast, facilitating the discrimination of different lithological units within the area. The MNF transformation groups image pixels of similar colors, which helps delineate the corresponding boundaries with high precision and enhances the visual interpretation of the data. Hence, the MNF image can be used to confirm the litho-boundaries observed during fieldwork.

Classification Using SVMs
Essentially, support vector machines (SVMs) aim to find the best possible boundary between different classes of data. SVMs are supervised learning algorithms first introduced by Vladimir Vapnik in the early 1990s [41]. They separate data into different classes by finding a hyperplane with a maximum margin between the classes, which tends to make them robust and less prone to overfitting.
Some of the most popular kernel functions (K) are chosen in our study for the x_i and x_j input vectors:

RBF: $K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right),\ \gamma > 0$ (1)
Linear: $K(x_i, x_j) = \gamma\, x_i^{T} x_j$ (2)
Polynomial: $K(x_i, x_j) = \left(\gamma\, x_i^{T} x_j + r\right)^{d},\ \gamma > 0$ (3)

where the kernel parameters are γ, d, and r. The gamma parameter acts as an inner product coefficient in the polynomial function (Equation (3)) and also controls the kernel width in the RBF (Equation (1)) [42]. The parameter d represents the degree of the polynomial function (Equation (3)). The r parameter controls how much the high-degree terms versus the low-degree terms of the polynomial influence the model [42].
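As an illustration of Equations (1)-(3), the three kernels can be sketched in plain Python. This is a minimal sketch, not the implementation used in the study: the function names and the default parameter values are assumptions chosen for the example.

```python
import math


def dot(a, b):
    """Inner product of two equal-length spectra."""
    return sum(p * q for p, q in zip(a, b))


def rbf_kernel(xi, xj, gamma=1.0):
    """Equation (1): exp(-gamma * ||xi - xj||^2); gamma sets the kernel width."""
    squared_distance = sum((p - q) ** 2 for p, q in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


def linear_kernel(xi, xj, gamma=1.0):
    """Equation (2): scaled inner product of the two vectors."""
    return gamma * dot(xi, xj)


def polynomial_kernel(xi, xj, gamma=1.0, r=1.0, d=2):
    """Equation (3): (gamma * <xi, xj> + r) ** d."""
    return (gamma * dot(xi, xj) + r) ** d


if __name__ == "__main__":
    a, b = [1.0, 2.0], [3.0, 4.0]  # toy two-band "spectra" (hypothetical values)
    print(linear_kernel(a, b), polynomial_kernel(a, b), rbf_kernel(a, a))
```

With identical inputs the RBF kernel returns 1, and it decays toward 0 as the spectra move apart, faster for larger γ.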


K-Nearest Neighbor
The K-NN algorithm is a highly popular supervised ML classification technique and is widely used as a classifier in many statistical studies [43]. K-NN is an instance-based, lazy learning method, which takes more processing time than the other common ML methods [44]. It classifies each sample according to its distance from its k nearest neighboring samples. The Euclidean distance is usually used to calculate the distance to the k neighbors of each point [45].
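The voting rule described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function name `knn_predict`, the toy two-band spectra, and the class labels are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, sample, k=3):
    """Assign `sample` to the majority class among its k nearest training spectra."""
    # Euclidean distance from the sample to every training spectrum.
    distances = sorted(
        (math.dist(x, sample), label) for x, label in zip(train_x, train_y)
    )
    # Count the labels of the k closest neighbors and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    spectra = [(0.10, 0.20), (0.12, 0.22), (0.80, 0.90), (0.82, 0.88)]
    labels = ["granite", "granite", "schist", "schist"]
    print(knn_predict(spectra, labels, (0.11, 0.21), k=3))
```

Because no model is fitted in advance, every prediction scans all training samples, which is why the method is described as lazy and comparatively slow.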

Random Forest
Random forest is a supervised classification algorithm first proposed by Breiman [46]. It has been recognized as a powerful machine learning technique and has been applied to a broad range of regression and classification tasks. Several lithological classification studies using remotely sensed data use the RF algorithm [47,48]. A random forest is generated from a large number of decision trees (DTs), where a random subset of the input data is used to train each DT [25]. A bagging process draws each new training set at random, with replacement, from the initial training set [49]. Each pixel is assigned to the class that receives the most votes among the trees of the forest [50].
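The two ingredients described above, bootstrap sampling (bagging) and majority voting, can be sketched as follows. This is not a full random forest: the training of the individual decision trees is omitted, and the function names and class labels are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(samples, rng):
    """Draw len(samples) items at random, with replacement (one bagging draw)."""
    return [rng.choice(samples) for _ in samples]


def majority_vote(tree_predictions):
    """Return the class predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so the draw is repeatable
    training_pixels = list(range(10))  # stand-ins for labelled pixels
    print(bootstrap_sample(training_pixels, rng))
    print(majority_vote(["schist", "granite", "schist"]))
```

Each tree would be trained on its own bootstrap sample, and the per-pixel predictions of all trees would then be passed to `majority_vote`.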

Convolutional Neural Network
In recent years, CNNs have increasingly demonstrated significant benefits for lithological identification and mapping with hyperspectral datasets [51,52]. The basic structure of a CNN comprises the convolutional layer, the pooling layer, and the fully connected layer, as represented in Figure 4. Overfitting can be limited beforehand by applying principal component analysis (PCA) to reduce the number of spectral bands before feeding them into the 1D-CNN, which also reduces the computational cost of the convolution operation. Thereafter, the series of convolutions extracts deep features, which are then flattened into a column of neurons that is used as the input of the fully connected layer [53]. The final layers of a convolutional neural network are fully connected layers that process the information passed on by the preceding layers and formulate decisions [53,54].

In the 1D-CNN (Figure 4), several 1D-convolution, pooling, and fully connected layers are applied. Thereafter, the fully connected layer's output is used to generate the final classification.

Tuning Parameters
Implementing each ML and DL method requires setting a number of hyperparameters. In this study, the classification of the HyMap data was performed using the Advanced Hyperspectral Data Analysis Software (AVHYAS) [55], a Python-based plugin in QGIS. The default parameter settings were adopted. The 1D-CNN structure used is similar to that proposed by Hu et al. in 2014 [56]. Its architecture consists of five layers, each carrying its own set of weights: the input layer, the convolutional layer, the max pooling layer, the fully connected layer, and the output layer. In the 1D-CNN, the input layer can only be represented by vector spectral data. The first hidden layer is a convolutional layer of 20 filters with a kernel size of 12. In the CNN architecture used, the kernel size is calculated by dividing the sequence length by nine [56]. MaxPooling1D is implemented as the pooling layer before the fully connected layer is applied. The number of epochs in the CNN was set to 20 to reduce the computation time.
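The kernel-size rule can be checked with a short calculation. This is a sketch under stated assumptions: the 110 retained HyMap bands are taken as the input sequence length, the division by nine is rounded down, and the convolution is assumed to be unpadded with a stride of 1; none of these settings is confirmed by the AVHYAS configuration, and the helper name is hypothetical.

```python
def conv1d_output_length(sequence_length, kernel_size, stride=1):
    """Output length of an unpadded ("valid") 1D convolution."""
    return (sequence_length - kernel_size) // stride + 1


n_bands = 110                 # bands kept after bad-band removal (assumed input length)
n_filters = 20                # filters in the convolutional layer
kernel_size = n_bands // 9    # sequence length divided by nine, rounded down

conv_length = conv1d_output_length(n_bands, kernel_size)
# Each filter holds kernel_size weights plus one bias term.
conv_parameters = n_filters * (kernel_size + 1)

if __name__ == "__main__":
    print(kernel_size, conv_length, conv_parameters)
```

Under these assumptions the rule reproduces the kernel size of 12 quoted above, and each filter produces a feature vector of length 99 before pooling.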

Accuracy Assessment Approaches
The confusion matrix is constructed using four parameters: true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs) [57]. Each of these parameters represents a different type of classification result: TPs are samples that have been correctly predicted as positive; TNs are samples that have been correctly predicted as negative; FPs are samples that have been predicted as positive but are actually negative; and FNs are samples that have been predicted as negative but are actually positive. By analyzing the values in each cell of the confusion matrices, we can calculate different indices such as accuracy, producer accuracy, user accuracy, sensitivity, and specificity, which provide insight into the performance of the classification model [58].

Accuracy
One of the most widely used metrics in multi-class classification is accuracy, which is calculated on the basis of the confusion matrix. Accuracy provides a summary indicator of how well the model has predicted the complete set of data. Individual samples are the basic unit of the metric; each sample has the same weight and contributes equally to the accuracy value [57]. With the accuracy (Equation (4)), only the elements on the diagonal of the confusion matrix are counted as correct. In its standard form, with $n_{ii}$ the diagonal elements of the confusion matrix, $k$ the number of classes, and $N$ the total number of samples:

$\text{Accuracy} = \frac{\sum_{i=1}^{k} n_{ii}}{N}$ (4)

Kappa Coefficient
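The Kappa coefficient measures the agreement between the classification and the reference data after correcting for the agreement expected by chance. The Kappa coefficient obtained in this study demonstrates how well the results match the reference data [59]. Because it takes every component of the confusion matrix into account, it provides an objective statistic when evaluating the classification [60]. In its standard form, the Kappa coefficient (KC) can be calculated as follows (Equation (5)):

$K_C = \frac{p_o - p_e}{1 - p_e}$ (5)

where $p_o$ is the observed agreement (the overall accuracy) and $p_e$ is the agreement expected by chance, computed from the row and column totals of the confusion matrix.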
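Overall accuracy and the Kappa coefficient can both be computed directly from a confusion matrix, as in the following minimal sketch. The function names and the 2 x 2 toy matrix are hypothetical and are not results from the study.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa(cm):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_o = overall_accuracy(cm)
    # Chance agreement: product of row total and column total for each class.
    p_e = sum(
        sum(cm[i]) * sum(row[i] for row in cm) for i in range(n)
    ) / total ** 2
    return (p_o - p_e) / (1 - p_e)


if __name__ == "__main__":
    toy = [[40, 10],
           [10, 40]]  # hypothetical counts for two classes
    print(overall_accuracy(toy), kappa(toy))
```

For the toy matrix the overall accuracy is 0.8 while Kappa is about 0.6, which shows how the chance correction lowers the score when classes are balanced.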

F1-Score
The F1-score is a single metric defined, in the binary case, as the harmonic mean of precision and recall. In multi-class cases, the F1-score has to involve all the classes used, so the multi-class measures of recall (Re) and precision (Pr) must be combined. The F1-score is given as follows (Equation (6)):

$F1 = \frac{2 \cdot Pr \cdot Re}{Pr + Re}$ (6)

where Pr = TP/(TP + FP) and Re = TP/(TP + FN).
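A multi-class F1-score can be obtained by computing precision and recall per class and averaging the per-class F1 values. The sketch below assumes macro-averaging and a confusion matrix with reference classes in rows and predicted classes in columns; the source does not state which averaging or layout was used, and the function names and toy matrix are hypothetical.

```python
def per_class_f1(cm):
    """F1-score of each class from a square confusion matrix (rows = reference)."""
    n = len(cm)
    scores = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, actually another class
        fn = sum(cm[c]) - tp                       # actually c, predicted another class
        pr = tp / (tp + fp) if tp + fp else 0.0
        re = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * pr * re / (pr + re) if pr + re else 0.0)
    return scores


def macro_f1(cm):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    toy = [[40, 10],
           [10, 40]]  # hypothetical counts for two classes
    print(per_class_f1(toy), macro_f1(toy))
```

Macro-averaging gives every lithological class the same weight regardless of how many pixels it contains, so a poorly mapped minor unit lowers the score noticeably.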

Training Region Spectral Characteristics
The supervised classifications with both machine and deep learning classifiers were performed based on the sampling of the lithological units' spectra (Table 2). Initially, all the samples were divided into training and test sets (70% for training and 30% for testing). The sampling step was carried out based on the geological map of the study area, a visual inspection of the HyMap image in the true color composite and in the false color composite of the MNF bands 3.2.1, as well as the field survey. Five classes were selected for AVZ1, and seven classes were selected for both AVZ2 and AVZ3. Figure 5 displays the extracted spectral signatures of the lithological units from the AVZ1, AVZ2, and AVZ3 HyMap scenes, respectively. The spectral responses of the lithofacies vary from one zone to another, providing distinct spectral signatures along the entire wavelength range (Figure 5), owing to the high spectral resolution (15 nm). In our study area, Mg-Fe-OH/CO₃, Al-OH, and Fe³⁺/Fe²⁺ hydrothermal alteration minerals were detected and mapped [19]. Almost all the litho-units show an Al-OH absorption feature (2200 nm) in AVZ1, while this absorption becomes deeper in the AVZ2 and AVZ3 litho-units, except for some carbonate litho-units, represented by lower series limestone and dolomite and basal series limestone and dolomite. A Mg-Fe-OH/CO₃ absorption feature at 2336 nm was also present in the AVSZ litho-units, becoming deeper in AVZ3.
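The 70/30 division of the labelled samples mentioned at the start of this section can be sketched as below. The per-class (stratified) split, the fixed seed, the function name, and the class labels are assumptions made for the example; the source does not specify how the split was drawn.

```python
import random


def split_train_test(samples_by_class, test_fraction=0.3, seed=0):
    """Split labelled samples into train/test lists, class by class."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test += [(s, label) for s in shuffled[:n_test]]
        train += [(s, label) for s in shuffled[n_test:]]
    return train, test


if __name__ == "__main__":
    # Integers stand in for pixel spectra; labels are hypothetical.
    labelled = {"granite": list(range(10)), "schist": list(range(10, 20))}
    train_set, test_set = split_train_test(labelled)
    print(len(train_set), len(test_set))
```

Splitting class by class keeps the 70/30 ratio within every lithological unit, which matters when some units have far fewer samples than others.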
An Fe³⁺/Fe²⁺ absorption feature (950 nm) was also detected in the spectral signatures of the study area lithofacies. Hydrothermal alteration minerals, including hematite, kaolinite, illite, muscovite, montmorillonite, topaz, dolomite, and pyrophyllite, were identified and mapped in the AVSZ area, and can explain the various previously described absorption features within AVZ1, AVZ2, and AVZ3 [19]. Table A1 displays the average Jeffries-Matusita (JM) distances [61] for AVZ1, AVZ2, and AVZ3. The values of the JM matrix for AVZ1, AVZ2, and AVZ3 reach the ideal value of 2 in most cases, or are very close to it in several cases, indicating strong spectral dissimilarity between the classes.
Additionally, the Ascendant Pair Separation (APS) was computed for the HyMap data of the three zones. In AVZ1, the APS shows values of 1.997790, 1.999036, and 1.999995 for classes 3 and 4, 1 and 4, and 2 and 3, respectively. In AVZ2, the APS shows values of 1.998016, 1.999988, and 1.999968 for classes 4 and 5, 4 and 2, and 5 and 7, respectively. Last, for AVZ3, more distinct spectra were observed, with the APS showing values of 1.99993, 1.99994, and 1.99999 for classes 6 and 7, 5 and 6, and 5 and 7, respectively.
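For reference, the JM distance between two classes i and j is commonly written as follows, assuming that each class is described by a multivariate Gaussian with mean vector μ and covariance matrix Σ. The notation is an assumption of this sketch and may differ from that of [61].

```latex
\[
JM_{ij} = 2\left(1 - e^{-B_{ij}}\right)
\]
\[
B_{ij} = \frac{1}{8}\,(\mu_i - \mu_j)^{T}
         \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1}
         (\mu_i - \mu_j)
       + \frac{1}{2}\,\ln\!\left(
         \frac{\left|\tfrac{1}{2}(\Sigma_i + \Sigma_j)\right|}
              {\sqrt{|\Sigma_i|\,|\Sigma_j|}}
         \right)
\]
```

Here B is the Bhattacharyya distance. Since B is non-negative, JM ranges from 0 (identical class distributions) to 2 (complete separability), which is why values such as 1.9999 indicate nearly ideal separation.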

Lithological Mapping Results
Lithological classification maps of the HyMap dataset using the different SVM types, RF, KNN, and 1D-CNN are displayed in Figure 6. Moreover, the false color composite (FCC) image of MNF 3.2.1 as an RGB composite was extracted for every classified area (Figure 6).
In zone 1, the lithological results obtained using SVM-RBF show more noise than those of the other SVM types, as illustrated in Figure 6. The mapping results show that the granite (Cl-1) and gneisses (Cl-2) units are misclassified in some parts into the surrounding lithological units when using SVM-RBF. In addition, the volcanites (Cl-5) unit is over-classified within the schist (Cl-3) unit compared to the SVM-LK and SVM-PK results (Figure 6). The KNN results are comparable to those of RF, while the CNN results delineate the litho-units relatively better.
In zone 2, the lithological classification extracted using the SVM-RBF method shows a significant difference compared to the other two SVM kernels (Figure 6). Almost all the basal limestone and dolomite units (Cl-3) and basal schist and sandstone units (Cl-2), which are clearly depicted in the MNF image, were misclassified into schist (Cl-7) and lower limestone and dolomite (Cl-1) (Figure 6), while the other classifications show less confusion between these units. The SVM-LK derived map shows relatively more promising results, especially in detecting the ultimate conglomerate (Cl-4) units in the south-east area of AVZ2. On the other hand, the KNN and RF results exhibit a high resemblance to those derived from RBF-SVM, while the CNN classification reveals a greater ability to differentiate between basal and lower series as well as avoid misclassification between comparable ultimate conglomerates and quartzites unit capabilities when mapping lithological units in zone 3 (Figure 6), in spite of the occurrence of some misclassification of basal limestone and dolomite units (Cl-2) and ultimate conglomerates (Cl-4) while using SVM-RBF.
It is observed that the lithological units derived from the SVM-LK and SVM-PK methods exhibit greater precision when compared to the results obtained from the SVM-RBF method, which is supported by the MNF results, the field survey, and the geological map (Figure 6).
Accuracy and loss are plotted versus iterations (epochs) for the training and test phases of the 1D-CNN over 20 epochs. Notably, the training and test loss curves decrease until they reach a stable point, which indicates learning curves with a good fit. The user (UA) and producer (PA) accuracies of the four classifiers are represented in Figure 7. The UA and PA metrics provide insights into the commission and omission errors associated with individual classes, respectively [59].

SVC with RBF, Polynomial, and Linear Kernels Evaluation
In AVZ1, the best classification results were obtained with the PK-SVM method, with an overall accuracy of 85.73% (KA = 81.54, F1 = 78). The LK-SVM method gave comparable results, with an overall accuracy of 85.4% (KA = 81.17, F1 = 84.85) (Table 6), while the RBF-SVM method gave the lowest accuracy, with an overall accuracy of 79.41% (KA = 73.19, F1 = 78.29). Figure 7A shows the producer and user accuracy of each class. The highest average accuracy was obtained for the gneisses unit (Cl-2), with more than 95% for all the classifiers (Table 3); the best result for this unit was obtained using the LK-SVM method, with an accuracy of 99.6%. A low accuracy was observed for the vulcanites unit (Cl-5), owing to the low PA obtained with the PK-SVM (72.1%) and LK-SVM (62.5%) methods (Figure 7A), while the quaternary sediments unit shows a low UA of about 70% for all the SVM types, which reduced the average accuracy of the SVMs (Figure 7A).
In AVZ2, the highest classification accuracy was obtained with the PK-SVM method, with an OA of 75.93% (KA = 67.7, F1 = 72.6), which is ~0.4% (OA) higher than the LK-SVM method and 5.7% (OA) higher than the RBF-SVM method. With the RBF-SVM method, a substantial portion of Cl-3 was misclassified as Cl-1 and Cl-4 (Table 4); the same confusion occurred, to a lesser degree, with the other SVM types. Figure 7B shows the UA and PA of each class in AVZ2 for the three SVM types. The highest accuracy was obtained for the lower limestone and dolomite unit (Cl-1), which was mapped well (see Figure 6), with average accuracies of 92.5%, 94.48%, and 85.06% using the PK-SVM, LK-SVM, and RBF-SVM methods, respectively. The lowest accuracy was obtained for the basal schist and sandstone unit (Cl-2), which was almost completely misclassified as Cl-1, Cl-4, and Cl-7 (Table 4) using the RBF-SVM method. The misclassification of Cl-2 as Cl-7 was not recorded with the LK-SVM and PK-SVM methods, although confusion with the surrounding units (Cl-1 and Cl-3) persisted. Additionally, the basal limestone and dolomite unit (Cl-3) shows a moderate accuracy with all three methods, the highest being recorded with the PK-SVM method. The other classes are generally well classified, with accuracies above 70%.
In AVZ3, the best mapping results were obtained with the LK-SVM method, with an overall accuracy of 88.4% (KA = 85.64, F1 = 84.91), which is 3% and 22% higher than the PK-SVM and RBF-SVM results, respectively. The schist unit (Cl-6) shows the highest classification accuracy, with more than 80% for all the classifiers (Table 5). The lower limestone and dolomite unit (Cl-1) is mapped well by the PK-SVM and LK-SVM methods, with about 90% accuracy, but poorly by the RBF-SVM method because of confusion with its surrounding units and Cl-7. The lower limestone and dolomite (Cl-1), basal schist and sandstone (Cl-2), and basal limestone and dolomite (Cl-3) units show low classification accuracy, especially with the RBF-SVM method. However, good accuracy (>80%) was achieved for almost all the other classes. The results in Table 6 show that the LK-SVM and PK-SVM methods provide the more accurate SVM-based lithological mapping within the AVSZ area, with mean overall accuracies of 83.12% and 82.36%, respectively, across the three selected zones. The two methods are closely comparable in performance, with the LK-SVM method giving an OA only 0.76% higher.
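The three summary metrics reported above (OA, kappa, and F1) can all be derived from one confusion matrix. The sketch below assumes a macro-averaged F1, which is an assumption because the averaging scheme is not stated in the text; the function name and the example matrix are hypothetical.

```python
def accuracy_metrics(cm):
    """Return (OA, Cohen's kappa, macro-F1) from a square confusion matrix.

    Assumed layout: rows = reference classes, columns = mapped classes.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ref_totals = [sum(cm[i]) for i in range(n)]
    map_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]

    oa = correct / total
    # Agreement expected by chance, from the row and column marginals
    pe = sum(ref_totals[i] * map_totals[i] for i in range(n)) / total ** 2
    kappa = (oa - pe) / (1.0 - pe) if pe < 1.0 else 0.0

    # Per-class F1 = 2TP / (2TP + FP + FN) = 2TP / (row total + column total)
    f1_scores = []
    for i in range(n):
        denom = ref_totals[i] + map_totals[i]
        f1_scores.append(2.0 * cm[i][i] / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / n  # assumption: unweighted mean over classes
    return oa, kappa, macro_f1


# Hypothetical three-class confusion matrix (illustrative values only)
cm = [[45, 5, 0],
      [10, 80, 10],
      [0, 5, 45]]
oa, kappa, macro_f1 = accuracy_metrics(cm)
```

Because kappa discounts chance agreement, it is lower than the OA for the same matrix, which matches the pattern of the KA values relative to the OA values in Table 6.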

SVC-Types, RF and KNN, and CNN Accuracy Assessment
The classification results in AVZ1 indicate that the 1D-CNN achieved the best classification results for most lithological units. The traditional SVM, KNN, and RF methods performed less well; in particular, granite was misclassified as orthogneisses and schist (Figure 6). The 1D-CNN demonstrated more powerful learning capabilities, resulting in fewer classification errors and more precise boundary delineation than the other methods. Table 3 presents the accuracy results of the different algorithms for AVZ1. The ML methods, including SVM-LK, KNN, and RF, showed lower classification performance, as seen for the quaternary sediments (Cl-4), with an accuracy of 78.4% using SVM-LK, and for granites (Cl-1), with accuracies of 63.1% and 59.7% using the KNN and RF methods, respectively. Among the ML methods, the LK-SVM method gave the most favourable results. The 1D-CNN model achieved the best results, with an OA of 95.56% (KC = 94.19%, F1 = 95.86%).
Figure 6 shows comparable classification results for the RBF-SVM, KNN, and RF methods in AVZ2. With these methods, the basal schist and sandstone (Cl-2) and the basal limestone and dolomites (Cl-3) were mostly misclassified as surrounding litho-units. The SVM-LK, SVM-PK, and 1D-CNN results show a relatively greater ability to delineate the lithological formations in AVZ2. The classification accuracies in Table 6 indicate that the 1D-CNN achieved the best performance, with an overall accuracy of 92.19% (KC = 89.77%, F1 = 88.78%). Compared to the linear-kernel SVM (LK-SVM) method, the 1D-CNN improves the classification accuracy for the basal schist and sandstone (Cl-2) and basal limestone and dolomites (Cl-3) by 28.9% and 24.8%, respectively (Table 4).
The highest misclassification in AVZ3 with the RBF-SVM method was observed between the basal schist and sandstone (Cl-2) and orthogneisses (Cl-7), between the basal limestone and dolomites (Cl-3) and the basal schist and sandstone (Cl-2), and between the basal limestone and dolomites (Cl-3) and the lower limestone and dolomites (Cl-1). The KNN and RF methods showed similar litho-unit confusion, with a moderate increase in accuracy for the basal limestone and dolomites (Cl-3) and the basal schist and sandstone (Cl-2) (see Table 5). Among the ML algorithms, the best classification of Cl-1 and Cl-2 was achieved with the LK-SVM and PK-SVM methods, with an overall accuracy of 87.6% and 85.0%, respectively. However, as shown in Figure 6, the 1D-CNN noticeably improves the classification results, with the OA improved by 11.64% compared to the LK-SVM results. Moreover, the 1D-CNN achieved the best overall accuracy, 94.76% (Table 6). Integrating spectral features with the 1D-CNN can therefore enhance the classification of high-resolution hyperspectral remote sensing data.
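To illustrate how a 1D-CNN operates on the spectral dimension of a single pixel, the sketch below applies one convolution filter, a ReLU activation, and global max pooling to a spectrum. It is a conceptual example only: the eight-band spectrum, the hand-set filter weights, and the function names are hypothetical, and the network architecture used in the study is not reproduced here. In a trained network the filter weights are learned rather than fixed.

```python
def conv1d(signal, kernel, bias=0.0):
    """Valid (no padding), stride-1 cross-correlation along the band axis."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Keep positive filter responses and zero out the rest."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Summarise a feature map by its strongest response."""
    return max(values)


# Hypothetical 8-band reflectance spectrum with an absorption dip at index 3
spectrum = [0.50, 0.52, 0.51, 0.30, 0.50, 0.53, 0.52, 0.51]

# Hand-set filter that responds to a local minimum (an absorption feature)
dip_filter = [1.0, -2.0, 1.0]

feature_map = relu(conv1d(spectrum, dip_filter))
strongest_response = global_max_pool(feature_map)
```

The filter responds most strongly where the window is centred on the dip, which shows how convolution along the spectrum can pick out local absorption features that help separate spectrally similar units.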
The lithological classification obtained from the HyMap data was validated through a field survey (Figure 8). Different locations were selected to validate the RS-based results.

Discussion
The AVSZ study area is geologically complex and shows strong hydrothermal alteration, with many zones of iron, argillic, phyllic, and dolomitization alteration reported in the eastern, northeastern, and northwestern parts of the study area [19]; because of these occurrences, lithological mapping using remotely sensed data can be a challenging task. In semi-arid regions with sparse vegetation cover, lithological variations are more readily observed. In particular, within the western Anti-Atlas, remotely sensed data from multispectral sensors (i.e., ASTER, OLI, and Sentinel-2A) have played an important role in lithological and mineral mapping [62]. Hyperspectral data provide more spectral information and are therefore more sensitive to subtle changes in the reflectance patterns of lithological units, allowing these variations to be detected and classified effectively. With this in mind, our study evaluates the performance of HyMap data with different algorithms, namely SVMs, the 1D-CNN, RF, and KNN, for lithological mapping in three different zones.
The results indicate that the 1D-CNN algorithm achieved the best classification results for most lithological units, while the traditional SVM, KNN, and RF methods performed less well.
Machine learning and deep learning techniques have demonstrated a strong ability to map surface geology when used with remote sensing data [22]. Although deep learning techniques can effectively represent intricate and massive datasets, no previous lithological mapping research using them has been conducted in the western Anti-Atlas of Morocco. Deep learning is broadly categorized into two types of neural network architectures: feedforward and recurrent [63].
In the current study, three zones were selected for lithological classification, namely AVZ1 (with five classes), and AVZ2 and AVZ3 (with seven classes each). The enhancement of the litho-boundaries via the MNF transform (Figure 3), together with fieldwork, assisted in the sampling of each class. The average Jeffries-Matusita distances (Table A1) show good separability between the spectra of the study area classes, which reflects accurate sampling. The SVM algorithm was applied and assessed in the three selected zones over the AVSZ using three kernels: the RBF, LK, and PK. The LK-SVM method gave good mapping results compared to the other SVM models. Additionally, the LK-SVM and PK-SVM methods showed comparable capability in lithological mapping within the study area, with a higher accuracy than the RBF-SVM method. Figure 7 shows that the UA and PA of the litho-units improve considerably with the LK-SVM and PK-SVM methods compared to the RBF-SVM method in all three zones. Moreover, the accuracy assessment using the OA, KA, and F1-score shows that the ranking of the three SVM types is highly consistent from one zone to another, which supports the robustness of the results of the present study and their applicability in other geologically comparable regions. For example, in AVZ1, the OA, KA, and F1 obtained with the LK-SVM method were 5.9%, 7.9%, and 6.5% higher, respectively, than those obtained with the RBF-SVM method (Table 6).
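For reference, the Jeffries-Matusita (JM) separability between two classes $i$ and $j$ is commonly computed as follows. This is a sketch of the standard formulation, assuming normally distributed class spectra and the convention in which the distance is bounded by 2; the exact formulation used to produce Table A1 is not stated in the text.

```latex
% \mu_i, \Sigma_i: mean vector and covariance matrix of the training spectra of class i
% B_{ij}: Bhattacharyya distance between classes i and j
B_{ij} = \frac{1}{8}\,(\mu_i - \mu_j)^{\mathsf{T}}
         \left(\frac{\Sigma_i + \Sigma_j}{2}\right)^{-1}(\mu_i - \mu_j)
       + \frac{1}{2}\,\ln\!\left(
         \frac{\left|\tfrac{1}{2}(\Sigma_i + \Sigma_j)\right|}
              {\sqrt{|\Sigma_i|\,|\Sigma_j|}}\right),
\qquad
JM_{ij} = 2\left(1 - e^{-B_{ij}}\right), \quad 0 \le JM_{ij} \le 2 .
```

Under this convention, values close to 2 indicate class pairs whose training spectra are well separated, while low values flag pairs that are likely to be confused by any classifier.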
In previous studies, De Boissieu et al. [64] used HyMap and RBF-SVM for regolith-geology mapping, where the classification accuracy assessment showed an OA of 70%. The overall accuracy presented by De Boissieu et al. [57] was 3.3% higher than that of our investigation, which may be due to several factors, including terrain complexity, the geochemistry of the target materials, and sampling errors. However, their accuracy assessment results and ours remain close and comparable. In addition to the VNIR and SWIR bands, hyperspectral TIR remote sensing images and CNNs can be used to improve on the lithological classification obtained using conventional ML methods [21]. A study conducted by Liu et al. [21] in Liuyuan, Gansu Province, China, revealed that the use of CNNs and hyperspectral TIR data in lithological mapping can improve on the OA of conventional ML methods by 2.5% to 25%. Even with multispectral satellite datasets (ASTER, OLI, and Sentinel), CNNs could provide the highest lithological mapping accuracy in mineral-rich areas [24].
Many remote sensing data classification studies have reported the good performance and adaptability of the RBF kernel with SVMs [65]. On the other hand, it is worth noting that selecting an appropriate kernel function for a specific problem is a complex process that usually involves testing various types of SVMs and kernel functions to identify the most effective approach. Across the four ML and DL methods, according to the accuracy assessment using the OA, KA, and F1-score, and taking the fieldwork in the AVSZ area into account, the combination of the 1D-CNN and HyMap data yielded the most accurate lithological mapping results. Compared to the ML methods, the 1D-CNN has the ability to extract the hidden relationships in the HyMap data. However, DL methods are still limited, with one of the most important limitations being the large number of samples required.
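For readers less familiar with the three kernels compared in this study, their standard definitions can be written in a few lines. This is a minimal sketch; the parameter names (gamma, degree, coef0) and their default values are illustrative assumptions and are not the settings used in the study.

```python
import math


def linear_kernel(x, y):
    """Linear kernel (LK): plain dot product of two spectra."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel (PK): (gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel (RBF): exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors (illustrative values only)
x = [1.0, 2.0]
y = [3.0, 4.0]
k_lin = linear_kernel(x, y)
k_poly = polynomial_kernel(x, y)
k_rbf = rbf_kernel(x, y)
```

The RBF kernel depends only on the distance between two spectra and is sensitive to the choice of gamma, whereas the linear kernel has no kernel-specific parameter to tune, which is one reason why testing several kernels on the same training set is informative.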
The performance of the DL and ML methods may also be influenced by various factors, such as the choice of hyperparameters, the presence of outliers or noise in the data, and the quality and quantity of the training data. In general, because of the spectral similarities induced by the comparable chemical and mineralogical compositions of the various lithological units, it is challenging to distinguish different lithological units using remotely sensed data [66]. The results could be further optimised by refining the hyperparameters of the SVM, RF, and KNN algorithms, for example through a grid search in a programming environment. To sum up, even though the present study did not use 3D-CNN or 2D-CNN methods, the experimental results highlight the effectiveness of the 1D-CNN in enhancing lithological classification using hyperspectral HyMap imagery. Applying 3D-CNN models to hyperspectral imagery, in contrast, introduces increased complexity and potentially longer computation times [67].
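The grid search mentioned above can be sketched as an exhaustive loop over candidate hyperparameter values, scored on held-out samples. The example below is a hedged illustration rather than the procedure used in this study: the small KNN classifier, the synthetic four-band data, the candidate grid, and all function names are assumptions. In practice, a library routine such as scikit-learn's `GridSearchCV` with cross-validation would normally be used.

```python
import itertools
import random


def knn_predict(train_X, train_y, x, k, p):
    """Majority vote among the k nearest training samples (Minkowski-p distance)."""
    neighbours = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, x)) ** (1.0 / p), label)
        for row, label in zip(train_X, train_y)
    )
    votes = {}
    for _, label in neighbours[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def grid_search(train, valid, grid):
    """Score every (k, p) combination on the validation set and return the best."""
    train_X, train_y = train
    valid_X, valid_y = valid
    scores = {}
    for k, p in itertools.product(grid["k"], grid["p"]):
        hits = sum(knn_predict(train_X, train_y, x, k, p) == y
                   for x, y in zip(valid_X, valid_y))
        scores[(k, p)] = hits / len(valid_y)
    best = max(scores, key=scores.get)
    return best, scores


def make_samples(rng, n_per_class):
    """Synthetic 4-band 'spectra' for two hypothetical classes."""
    X, y = [], []
    for label, mean in ((0, 0.3), (1, 0.6)):
        for _ in range(n_per_class):
            X.append([rng.gauss(mean, 0.15) for _ in range(4)])
            y.append(label)
    return X, y


rng = random.Random(0)            # fixed seed so the sketch is repeatable
train = make_samples(rng, 60)
valid = make_samples(rng, 20)
grid = {"k": [1, 3, 5, 7], "p": [1, 2]}   # assumed candidate values
best_params, scores = grid_search(train, valid, grid)
```

The same loop structure applies to the SVM (for example, the penalty and kernel parameters) and RF (for example, the number of trees) by swapping the classifier and the grid.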

Conclusions
The current study evaluated the HyMap VNIR-SWIR bands for lithological classification of the Ameln valley region at the eastern Kerdous inlier, Moroccan western Anti-Atlas, using machine learning and deep learning approaches. The classification yielded an acceptable overall accuracy (>70%) in the majority of cases. The lithological mapping derived from the HyMap data provided more detail on the three distinct zones and delineated the lithological boundaries more precisely than the geological map of Tafraout (1/100,000). The contact between the ultimate conglomerates and the basic series, and between the basic series and the lower series, was clearly distinguished, with the transition between the Precambrian basement and the Palaeozoic cover being marked by several mineral occurrences within the western and central Anti-Atlas province.
The choice of hyperparameters, the presence of noise in the data, and the quantity of the training data are the most challenging issues for the ML and DL classifiers. Furthermore, we can conclude that choosing the right SVM kernel function for a given problem is not a simple task and often requires experimentation with different kernel functions to determine the best one. With the parameters used here, the LK-SVM and PK-SVM methods are suitable for application to HyMap data, yielding more accurate and detailed geological mapping than the RBF-SVM, KNN, and RF methods. For future work, we suggest applying dimensionality reduction methods such as MNF before implementing machine or deep learning methods on the dataset when the aim is the best visualization of the litho-unit boundaries. To sum up, HyMap imagery is recommended for small-scale lithological mapping in semi-arid regions with hydrothermal alteration occurrences. More broadly, the LK-SVM and PK-SVM ML methods, as well as the 1D-CNN DL method, coupled with high-resolution hyperspectral data, are suggested for achieving optimal lithological mapping results in arid and semi-arid regions. Future research could establish an approach that combines deep learning techniques with an optimum band selection of HyMap data to improve the accuracy and efficiency of lithological classification and mineral exploration in semi-arid regions around the world.

Table A1. Average Jeffries-Matusita distances for AVZ1, AVZ2, and AVZ3 calculated within the HyMap scene of AVSZ (the ascending pair separation (Pair Sep) is also indicated).