A Fully-Automated Framework for Mineral Identification on Martian Surface Using Supervised Learning Models

The availability of various spectral libraries for CRISM (Compact Reconnaissance Imaging Spectrometer for Mars) data on the NASA PDS (Planetary Data System) has greatly facilitated research on the surface mineralogy of Mars. However, building supervised learning models for mineral mapping remains challenging because of the lack of ground-truth training data. In this paper, an automated framework is presented that classifies the spectra in a CRISM hyperspectral image using supervised learning models. The required training data is produced by augmenting the mineral spectra available in the MICA (Minerals Identified in CRISM Analysis) spectral library in a way that keeps the key absorption signatures of the mineral spectra intact while providing adequate variability. The framework contains a pre-processing pipeline that, in addition to some conventional pre-processing steps, includes a new feature extraction method to capture the most distinguishable absorption patterns in the spectra. The proposed framework is validated on a set of CRISM images captured at different locations on the Martian surface, using different types of supervised learning models such as random forests, support vector machines, and neural networks. An uncertainty analysis of the steps in the pre-processing pipeline is provided, along with a performance comparison against some previously used methods, which shows that the framework performs comparably well, with a mean accuracy of around 0.8. Interactive mineral maps are also provided for the detected dominant minerals.


I. INTRODUCTION
CRISM is a visible to near-infrared imaging spectrometer for hyperspectral imagery with improved spectral and spatial resolution, and it has been widely used to categorize minerals on the Martian surface and to understand their diversity. The CRISM imaging spectrometer was introduced to investigate minerals that could indicate geological evolution and past sustainability on the Martian surface. It began operating at Mars in 2006 and has since captured hundreds of contiguous bands from the most critical parts of the landscape, such as volcanic areas, steep cliffs, and sedimentary deposits. With a swath width of 9.5-12 kilometers and a high spatial resolution of 18-20 meters per pixel, CRISM measures electromagnetic energy at visible to near-infrared (VNIR) wavelengths in the range 364-1055 nanometers using an 'S' detector, and at infrared (IR) wavelengths in the range 1001-3936 nanometers using an 'L' detector, with a bandwidth of 6.55 nanometers per channel [1].
The associate editor coordinating the review of this manuscript and approving it for publication was Wei Liu.
The characterization of minerals on the Martian surface is challenging because of instrumental artifacts and seasonal fluctuations in CO2, ice aerosols, and dust in the Martian environment at the time of data collection. The raw Experimental Data Record (EDR), with 544 bands, therefore undergoes numerous correction stages to form a Map-Projected Targeted Reduced Data Record (MTRDR), in which 489 bands are retained. From the MTRDR data, spectral summary parameters and browse products are calculated, which are essential for spectral analysis and visual interpretation when characterizing the Martian surface [2]. The MTRDR data collected from 2006 to 2012, along with a collection of 60 spectral summary parameters prepared by Viviano-Beck et al. in 2014, is available at the PDS geoscience node [3] and is used in the present study so that our findings can be compared with similar works in the literature.
The MICA (Minerals Identified in CRISM Analysis) type spectral library was made available at the PDS geoscience node in 2014 and has since been used widely for both identification and validation of minerals. The MICA type spectral library includes numerator, denominator, and ratioed (ratio of radiance to incident and solar radiation) reflectance spectra for 31 distinct minerals/mineral groups [figure 1], which were prepared by analyzing CRISM hyperspectral imagery from different mineral-rich locations on the Martian surface [2].
Despite the availability of such an extensive spectral library, the development of automated systems for mineral identification is still very much in progress. Most early methods for mineral mapping relied on similarity metrics computed over the full spectral distribution to map the mineral classes, which limited the quality of the end results. Mineral identification is not straightforward because, firstly, the reflectance values for a mineral in the spectral library and those for the same mineral in MTRDR data may be at different scales [figure 2a], and secondly, the presence of different continuum noise in MTRDR data changes the global shapes of the spectra [4], though the absorptions are preserved [figure 2b]. For this reason, building learning models for mineral mapping based on similarities in the position and strength of absorption features has been a focus of research in recent years [5], [6], [7], [8]. One major challenge in building a supervised model for mineral identification in CRISM data is the insufficiency of labelled data for model training. A dearth of studies on spectral pre-processing to increase mapping quality is also observed.
Supervised models are acclaimed for recognizing hidden features in data, and may thus be trusted to map test pixels based on the absorption features; the challenge lies in accumulating an unbiased and diverse training dataset. In this study, a new data augmentation technique is applied to the mineral spectra available in the MICA-type spectral library to generate the training data, which eliminates the manual intervention otherwise needed to gather training data. Because the training data is generated solely from the library, these models are applicable to any MTRDR data of the Martian surface. In addition, a new feature extraction idea is proposed that calculates the absorption information over the most diverse wavelength ranges in a spectrum, which in turn helps to train the models faster and with improved accuracy. The objective of this study is to create a common framework for mineral mapping in CRISM hyperspectral images that can be used with any supervised learning model. This study does not compare or evaluate the competence of the supervised learning models used; instead, its sole goal is to determine the efficacy of the framework with the proposed data augmentation and absorption extraction steps. To our knowledge, no such fully-automated framework for mineral identification in Martian data is available to date.

II. PRE-PROCESSING PIPELINE
In this section, the process used to create the augmented data is described first, and then the pre-processing pipeline used to build viable training data is discussed. The domain collating step establishes the consistency of the features in the training data. The smoothing step removes overly spurious portions of the spectrum as well as small unwanted kinks that occur due to various noises in real data. The continuum removal step enhances absorption signatures in the spectrum that are subdued by the presence of a continuum in real data, and the standardisation step introduces high variance in the training data to speed up the learning process. The feature extraction procedure is then used to create the training dataset by extracting features from the most diverse spectral areas.
The illustrations in figure 3 depict the model training and mineral identification procedure.
Figure 3: Flowchart of the data pre-processing pipeline that is used to train the supervised models, and, at right, the mineral mapping procedure using the trained models.
A. DATA AUGMENTATION
Data augmentation essentially involves introducing noise into the training data in a way that has a regularization effect, thus increasing the robustness of the supervised learning models. The MICA library contains a single sample (ground truth) spectrum for each of 31 different minerals. Building a supervised learning model with such a small amount of training data could make the model biased towards some features or overly sensitive to others, making it harder for the model to generalize absorption patterns, which could decrease output accuracy. Augmenting the spectral signatures of minerals from the signatures in the MICA library is not straightforward. The nonlinear multiplicative noise that predominates in CRISM images is particularly difficult to replicate via augmentation. This noise component varies not only between distinct images but even within the same image. In this study, a new augmentation method is used that preserves the direction of the spectral curve while deviating randomly, within a carefully chosen limit, from the original spectrum; the result is then combined with noise and a randomly generated continuum so that the model works with real data.
step-1: (Augmenting the pure spectra from MICA)
Let $Aug$ be the spectrum augmented from the pure spectrum $Org$ in the MICA library, and let $w'$ be the wavelength immediately preceding $w$ at which a reflectance is captured. $Org_w$ and $Org_{w'}$ are the reflectance values of the original mineral spectrum in the MICA library, and $Aug_w$ and $Aug_{w'}$ are the determined reflectance values in the augmented spectrum, at wavelengths $w$ and $w'$ respectively. Equations (1)-(3) generate $Aug$ from $Org$, constrained by a deviation limit $m$; $w_0$ is the wavelength at which the first reflectance value in the spectrum is measured. Equation (1) determines the change of reflectance value at $w$ in $Org$, and equation (2) determines a proportional change limited by the two constraints $m$ and $r$, where $r \in [0, 1]$ is generated randomly and $m \geq 0$ is set beforehand.
The proportional change is applied to $Aug_{w'}$ to get the augmented reflectance value at $w$ in equation (3). Note that, by equation (2), the direction of change (upward or downward) from $w'$ to $w$ in $Org$ remains the same in $Aug$.
If $m = 0$, $Aug$ is parallel to $Org$ at the distance of the initial deviation $d$.
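The properties stated for step-1 can be illustrated with a short sketch. The exact form of equations (1)-(3) is not reproduced in this text, so the update rule below (each change in $Org$ scaled by the factor $1 + m \cdot r$, with the first value offset by $d$) is only an assumed form chosen to satisfy the two stated properties: the direction of change is preserved, and $m = 0$ gives a curve parallel to $Org$. The function name `augment_spectrum` and the sample values are hypothetical.

```python
import numpy as np

def augment_spectrum(org, m, d, rng):
    """Direction-preserving augmentation of one spectrum.

    ASSUMED update rule (not the paper's exact equations):
      delta_w = org[w] - org[w']                  change in the original
      aug[w]  = aug[w'] + delta_w * (1 + m * r)   with r drawn from U[0, 1]
      aug[w0] = org[w0] + d                       initial deviation d
    """
    org = np.asarray(org, dtype=float)
    delta = np.diff(org)                          # change between consecutive bands
    r = rng.uniform(0.0, 1.0, size=delta.shape)   # one random factor per band
    scaled = delta * (1.0 + m * r)                # same sign as delta because m >= 0
    aug = np.empty_like(org)
    aug[0] = org[0] + d
    aug[1:] = aug[0] + np.cumsum(scaled)
    return aug

rng = np.random.default_rng(0)
org = np.array([0.50, 0.48, 0.40, 0.45, 0.47, 0.41])   # made-up reflectances
aug = augment_spectrum(org, m=0.7, d=0.02, rng=rng)
parallel = augment_spectrum(org, m=0.0, d=0.02, rng=rng)
```

With `m=0.0` the random factor has no effect, so `parallel` differs from `org` by the constant `d` at every band.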

step-2: (Blending noise and continuum)
In real hyperspectral data, the presence of continuum and noise often alters the absorption information stored in the spectrum. Therefore, the augmented spectra created in the previous step are mixed with random noise and a continuum to create a more accurate reproduction of the real data, making our model capable of detecting characteristics in real spectra. The random noise $N$ is generated from randomly chosen values of a Gaussian distribution $\mathcal{N}(0, \sigma_N^2)$, and the continuum curve $C$ is modelled by a Gaussian curve with parameters $(w, 0.65)$, where the wavelength $w$ is the position of the peak of the curve and is picked randomly within the selected range of wavelengths. In the expression for the noisy augmented curve $AugN$, the prefix scaled denotes a min-max scaling operation on the data. Note that, even though the augmented reflectance values are not entirely coherent with the original spectrum, this will not affect the classification process, owing to the continuum removal and standardization steps in the pipeline. Figure 4 graphically depicts the augmentation process.

B. DOMAIN COLLATING
Supervised learning models must have domain consistency between training and testing data. For this purpose, a predefined set of wavelengths needs to be fixed as the domain for both training and testing data. In the MICA library, different minerals have spectra defined over different sets of wavelengths (for example, the spectrum of gypsum has 430 bands and that of alunite has 469 bands, whereas that of bassanite has 480 bands). The fixed set of wavelengths to work on has been derived by unifying all the different wavelengths. Various curve-based interpolation techniques, such as cubic, cosine, and Gaussian interpolation, can be used to perform this task. In this study, linear interpolation is applied to generate the additional reflectance values, for the sake of simplicity and because more complicated interpolation methods would make a negligible difference. Different studies have concluded that most of the absorptions occur in the wavelength range 1-2.6 micrometers [8], [9], [10]. Though a few distinguishable absorptions of some minerals lie beyond this range, the spectra there are generally very prone to noise and thus exhibit additional drifts. To avoid training models on unwanted features, a trade-off is made by giving up the effect of those distinguishable absorptions. From the interpolated 480 bands, the 247 bands within the 1-2.6 micrometer wavelength range are selected for use in the next pre-processing steps.
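The collating step can be sketched as follows. This is a minimal illustration, assuming that "unifying" the wavelengths means taking the union of all per-mineral grids; the function name `collate_domain` and the toy spectra are hypothetical, and `numpy.interp` is used for the linear interpolation.

```python
import numpy as np

def collate_domain(spectra, lo=1.0, hi=2.6):
    """Resample spectra with differing wavelength grids onto one common grid.

    spectra: list of (wavelengths_um, reflectance) pairs, wavelengths increasing.
    Returns the common grid (union of all grids, cropped to [lo, hi] micrometers)
    and a 2-D array holding one linearly interpolated spectrum per row.
    """
    union = np.unique(np.concatenate([w for w, _ in spectra]))
    grid = union[(union >= lo) & (union <= hi)]
    rows = [np.interp(grid, w, r) for w, r in spectra]
    return grid, np.vstack(rows)

# Two toy spectra sampled on different grids (values are made up).
w_a = np.linspace(0.9, 2.7, 10)
r_a = 0.3 + 0.1 * w_a
w_b = np.linspace(1.0, 2.6, 7)
r_b = 0.8 - 0.05 * w_b
grid, X = collate_domain([(w_a, r_a), (w_b, r_b)])
```

Every row of `X` then shares the same wavelength axis `grid`, which is the property the learning models need.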

C. SPECTRA SMOOTHING
In CRISM hyperspectral data, the narrow bandwidth captures diverse energy, which introduces self-generated noise inside the sensors and leads to spurious spectra in both the MICA library and MTRDR data. These small kinks remain present in each of the augmented spectrum replicas and can affect the generalization ability of the learning models. Hence, the next stage in the pre-processing procedure is to smooth each spectrum. The Savitzky-Golay filter [11], a smoothing technique that has been used frequently in computational and optimization studies for years, increases the precision of the data without deforming the tendency of the signal. Precedence was given to including a wide range of minerals while reducing the noise blended with real data, even though Savitzky-Golay smoothing, even with small window sizes, is not certain to preserve the doublet characteristics present in some uncommon phases. Moreover, the Savitzky-Golay filter cannot remove the larger spikes from a spectrum; hence an additional spike removal step is necessary.
Let $SG$ be a smoothed spectrum with coefficient of variation $cv_{SG}$. Let a small window of size $\omega$ be moved across the set wavelength range, such that $cv^{\omega}_w$ is the coefficient of variation and $\mu^{\omega}_w$ is the mean of the reflectance values available within the window, $w$ being the middle position of the window. Let $mean(cv^{\omega})$ and $max(cv^{\omega})$ respectively be the mean and the maximum of the coefficients of variation over all such windows. The spike removal procedure produces $SR_w$, the spike-removed reflectance value at wavelength $w$, and the parameter $s$ regulates the spuriosity in the resultant spectrum. Figure 5 shows a sample result of the smoothing generated by Savitzky-Golay filtering followed by spike removal.
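The two ingredients described above can be sketched with NumPy alone. The first function is a plain least-squares implementation of Savitzky-Golay smoothing (a library routine such as SciPy's `savgol_filter` would normally be used instead); the second computes the windowed coefficient of variation $cv^{\omega}_w$ and mean $\mu^{\omega}_w$. The window sizes, the polynomial order, and the edge padding are assumptions for illustration, and the spike replacement rule itself is not shown because its expression is not reproduced in this text.

```python
import numpy as np

def savgol_smooth(y, window=7, polyorder=2):
    """Savitzky-Golay smoothing: fit a polynomial in each odd-sized window
    and keep the fitted value at the window centre."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Row 0 of the pseudo-inverse of the Vandermonde matrix gives the weights
    # that evaluate the least-squares polynomial at offset 0.
    A = np.vander(offsets, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(A)[0]
    padded = np.pad(y, half, mode="edge")   # simple edge handling (assumption)
    return np.correlate(padded, weights, mode="valid")

def window_cv(y, omega=5):
    """Coefficient of variation and mean inside a sliding window of size omega."""
    half = omega // 2
    padded = np.pad(np.asarray(y, dtype=float), half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, omega)
    mu = windows.mean(axis=1)
    return windows.std(axis=1) / mu, mu

x = np.arange(30.0)
y = 0.01 * x**2 - 0.2 * x + 1.0             # smooth test curve
smoothed = savgol_smooth(y, window=7, polyorder=2)
cv, mu = window_cv(np.full(10, 0.4), omega=5)
```

A quadratic is reproduced exactly away from the padded edges, which is a quick sanity check that the filter does not deform the signal tendency.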

D. CONTINUUM REMOVAL
Even after atmospheric and photometric corrections on CRISM data, the spectrum contained in a pixel of an MTRDR image may be distorted by several factors like reflectance, radiance, and other environmental conditions producing a continuum, which alters some essential criteria for mineral identification, such as the reflectance value, the curvature tendency, the position of the absorption and others, making mineral identification much more difficult [4]. The continuum removal step focuses on eliminating these distortions and enhances the absorption qualities of the spectrum. Because continuum spectra are frequently convex or flat, the continuum is estimated by fitting an upper convex hull to the original spectrum [12], [13]. The ratio between the pixel spectrum and the value of its convex hull at the same wavelength is calculated to remove the continuum from the spectrum [figure 6]:
$$CR_w = \frac{Aug_w}{CH_w}$$
where, at wavelength $w$, $Aug_w$ is the original reflectance value, $CH_w$ is the value of the upper convex hull around $Aug$, and $CR_w$ is the value of the continuum-removed spectrum.
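A minimal sketch of this step is given below, assuming strictly increasing wavelengths and positive reflectance values. The hull is found with a monotone-chain scan and evaluated at every band by linear interpolation; the function names and the sample spectrum are hypothetical.

```python
import numpy as np

def upper_hull(w, r):
    """Indices of the points on the upper convex hull (w strictly increasing)."""
    hull = []
    for i in range(len(w)):
        # Drop the last hull point while it lies on or below the chord
        # joining the point before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (w[b] - w[a]) * (r[i] - r[a]) - (r[b] - r[a]) * (w[i] - w[a])
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return hull

def remove_continuum(w, r):
    """Divide the spectrum by its upper convex hull (CR_w = Aug_w / CH_w)."""
    w, r = np.asarray(w, dtype=float), np.asarray(r, dtype=float)
    idx = upper_hull(w, r)
    ch = np.interp(w, w[idx], r[idx])       # hull value at every wavelength
    return r / ch

w = np.linspace(1.0, 2.6, 9)
r = np.array([0.50, 0.52, 0.45, 0.51, 0.55, 0.40, 0.50, 0.53, 0.49])
cr = remove_continuum(w, r)
```

Points on the hull map to 1 and every absorption falls below 1, so the result lies in the (0, 1] range mentioned in the standardization step.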

E. SPECTRA STANDARDIZATION
Machine learning models based on gradient descent converge to the minima more quickly if the variance of the different features is high. Patterns in a spectrum emerge as absorptions positioned around specific wavelengths recorded in consecutive bands, implying that the features/bands are not fully independent of one another. Also, the continuum removal step ensures that all the features in the input data are in the (0, 1] range, thus minimizing the variance. A standard scaler is therefore applied to the continuum-removed data: each spectrum is centred and divided by its standard deviation, which sets the mean to 0 and the standard deviation to 1 for each spectrum. This enhances the absorption characteristics even more [figure 7] and speeds up the learning process. As an alternative, row normalization can be performed with a min-max scaler.
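Both scaling options can be sketched in a few lines. This is an illustrative version that operates row-wise (one spectrum per row), as described above; the function names and sample values are hypothetical.

```python
import numpy as np

def standardize_rows(X):
    """Give every spectrum (row) zero mean and unit standard deviation."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=1, keepdims=True)
    sigma = X.std(axis=1, keepdims=True)
    return (X - mu) / sigma

def minmax_rows(X):
    """Alternative: rescale every spectrum (row) to the [0, 1] range."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    return (X - lo) / (hi - lo)

# Two made-up continuum-removed spectra in the (0, 1] range.
X = np.array([[1.00, 0.95, 0.80, 0.97, 1.00],
              [1.00, 0.99, 0.98, 0.90, 1.00]])
Z = standardize_rows(X)
M = minmax_rows(X)
```

Note that the scaling is per spectrum rather than per band, so a test pixel can be processed without any statistics from the training set.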

F. ABSORPTIONS EXTRACTION
Though supervised models are well-known for discovering hidden patterns in data on their own, they are referred to as black-box models because it is unclear which patterns are prioritized; this means there is no certainty of acceptable performance on delicate test data. On the other hand, feeding the possible patterns more distinctively to the learning models imposes more control over the models' performance. To be precise, though it is an additional challenge for the designers to generate more sophisticated feature values, doing so provides more control over the model outcome.
VOLUME 11, 2023
A section of a spectrum known as an absorption band is made up of reflectance values from a group of nearby wavelengths, with the lowest reflectance value occurring in the middle of the wavelengths (C point in figure 8), all reflectance values before it having a negative slope, and all reflectance values after it having a positive slope. The endpoints of the absorption band are the points where the slope changes (A and B points in figure 8). The following are some of the commonly used absorption features in the literature [14]:
• band-depth: If the endpoints of the absorption band are interpolated, the depth of the lowest point from the interpolation line is referred to as the band-depth (dashed line in figure 8).
• band-angle: This is the angle between the lines connecting the lowest point with the endpoints (θ in figure 8).
• band-area: This is the total area enclosed by the absorption band and the interpolated line, and it is estimated by the summation of the depths of all the points in the absorption band from the interpolated line (grey-coloured area in figure 8).
In a continuum-removed and standardized spectrum $S$, the band-depth at a particular wavelength $w$ within a wavelength range $w_s$ to $w_t$ is calculated from $S'_{[w_s, w_t]}$, which represents a linear interpolation between $w_s$ and $w_t$ with the corresponding processed reflectance values, and from $S_w$, the processed reflectance value at $w$, $w$ being a wavelength within the range. The band-area within this range is estimated from these band-depths. Note that, by definitions (6) and (7), these values can be negative on a spectrum for a specified range.
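The band-depth and band-area features can be sketched as below. The expressions (6) and (7) are not reproduced in this text, so the code follows the verbal definitions: depth is taken as the interpolation line minus the processed reflectance, and area as the sum of those depths. That sign convention, the function names, and the assumption that $w_s$ and $w_t$ coincide with sampled wavelengths are all illustrative choices.

```python
import numpy as np

def band_depths(w, s, ws, wt):
    """Depth of every sample in [ws, wt] below the chord joining the endpoints.

    ASSUMED convention: depth = interpolated line - processed reflectance,
    so a genuine absorption gives positive depths.
    """
    w, s = np.asarray(w, dtype=float), np.asarray(s, dtype=float)
    mask = (w >= ws) & (w <= wt)
    seg_w, seg_s = w[mask], s[mask]
    chord = np.interp(seg_w, [seg_w[0], seg_w[-1]], [seg_s[0], seg_s[-1]])
    return chord - seg_s

def band_area(w, s, ws, wt):
    """Band-area estimated as the sum of band-depths over the range."""
    return band_depths(w, s, ws, wt).sum()

w = np.array([1.0, 1.1, 1.2, 1.3, 1.4])
s = np.array([1.0, 0.8, 0.6, 0.8, 1.0])     # a symmetric toy absorption
depths = band_depths(w, s, 1.0, 1.4)
area = band_area(w, s, 1.0, 1.4)
```

When the spectrum bulges above the chord within the chosen range, the same functions return negative values, which matches the remark that these quantities can be negative.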
Here a new idea is proposed to extract a set of absorption features from a spectrum. The first task in this step is to extract, from all 31 numerator reflectance spectra, the wavelength ranges with higher diversity. This is performed by identifying all shoulder points/endpoints in each of the processed spectra available in the MICA library and considering the ranges between every two consecutive endpoints in them. More than 350 absorptions of varying shapes with overlapping wavelength ranges have been retrieved. Then, for each of the wavelength ranges in each numerator spectrum, the band-area is calculated. The diversity of a wavelength range is measured by the Average Absolute Deviation (AAD) of the band-areas.
A new set of training data is derived by calculating band-depths in the processed spectra for a portion of the retrieved wavelength ranges with high diversities. As increasing the dimension increases the learning time of supervised models and also makes them prone to overfitting, feeding such models a training dataset with a large number of features is avoided. Figure 9 shows some example absorption bands with high diversity.
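The diversity ranking can be sketched as follows, assuming the band-areas have already been computed for every (spectrum, wavelength range) pair. The function name, the toy band-area matrix, and the choice of keeping the top `n_keep` ranges are hypothetical; AAD is taken here as the mean absolute deviation from the mean.

```python
import numpy as np

def rank_ranges_by_aad(areas, n_keep):
    """Rank wavelength ranges by the AAD of their band-areas across spectra.

    areas: array of shape (n_spectra, n_ranges).
    Returns the indices of the n_keep most diverse ranges and all AAD scores.
    """
    areas = np.asarray(areas, dtype=float)
    centre = areas.mean(axis=0)
    scores = np.mean(np.abs(areas - centre), axis=0)
    order = np.argsort(scores)[::-1]        # most diverse range first
    return order[:n_keep], scores

# Band-areas of 3 spectra over 3 candidate ranges (made-up numbers).
areas = np.array([[0.1,  0.5, 0.3],
                  [0.1, -0.5, 0.2],
                  [0.1,  0.0, 0.4]])
top, scores = rank_ranges_by_aad(areas, n_keep=2)
```

The first range has the same band-area in every spectrum, so it carries no discriminating information and is ranked last.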
The pre-processing steps need not be performed in the same order as given in figure 3; for example, performing augmentation after the cropping and interpolation steps would reduce the run-time. However, reordering these steps does not have any evident effect on the performance of the learning models. The training data prepared by the discussed pipeline can be used to train the supervised learning models. The number of training classes in all learning models is 31, which corresponds to the number of mineral classes in the MICA spectral library. To generate the mineral map of an MTRDR image, the spectrum of each pixel is extracted, processed with the same pre-processing steps, and evaluated by a model; the label is set on the basis of the confidence given by the model for each mineral class. An issue with supervised learning models is that they are bound to classify each pixel into one of the mineral classes even if the pixel spectrum does not match the mineral spectrum considerably, which often leads to misclassification. We therefore set a confidence cut-off that the model must achieve for a pixel to be classified. For each pixel, the model outputs a confidence, in the form of a probability, for each of the classes, that is, the minerals present in the MICA library. With the criterion imposed, a pixel is mapped to the mineral that has the highest confidence only if that confidence value is more than 0.5 and the second-highest confidence is less than half of the highest confidence. If the condition fails, the pixel is marked as unclassified, though in the performance measures an unclassified pixel is considered a false negative.
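The confidence cut-off described above translates directly into code. The sketch below assumes the model returns one row of class probabilities per pixel; the function name and the sentinel value `-1` for unclassified pixels are illustrative choices.

```python
import numpy as np

UNCLASSIFIED = -1   # sentinel label for pixels that fail the cut-off (assumption)

def label_pixels(proba, min_conf=0.5, ratio=0.5):
    """Assign the most confident class only when the cut-off is satisfied.

    A pixel is labelled when the highest confidence exceeds min_conf and the
    second-highest confidence is below ratio * highest confidence.
    """
    proba = np.asarray(proba, dtype=float)
    ordered = np.sort(proba, axis=1)
    top, second = ordered[:, -1], ordered[:, -2]
    accept = (top > min_conf) & (second < ratio * top)
    return np.where(accept, proba.argmax(axis=1), UNCLASSIFIED)

proba = np.array([[0.70, 0.20, 0.10],    # confident and well separated
                  [0.55, 0.40, 0.05],    # runner-up too close to the winner
                  [0.40, 0.35, 0.25]])   # winner below 0.5
labels = label_pixels(proba)
```

Only the first pixel receives a class label; the other two are marked as unclassified and would count as false negatives in the evaluation.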

III. PERFORMANCE ANALYSIS
In this section, the performance of the proposed procedure is evaluated. The augmented data, after the proposed pipeline is applied, is used to train three types of supervised learning models: the deep learning-based Artificial Neural Network (ANN), the binary classification-based Support Vector Classifier (SVC), and the decision tree-based Random Forest Classifier (RFC). Moreover, the performances of these models are compared with that of a shallow Convolutional Neural Network (CNN) model that is trained on the augmented data pre-processed only up to the spectra standardization step. Applying convolutions to the domain of extracted features is ineffective because a CNN model learns the absorption characteristics on its own, using convolutions that rely on the resemblance of the reflectance values at consecutive wavelengths. The purpose of using different supervised learning models is to show that the automated framework can work with any of them, given the pre-processing pipeline proposed in the previous section.
The experimental setup, which covers the description of the test data, the parameter tuning of the learning models, and the selection of an ideal set of parameter values for the pre-processing pipeline, is described first. Then, through an uncertainty analysis, the significance of each step of the pipeline is demonstrated. Finally, the performances of these models are compared with some of the previously used methods for mineral classification in Martian data. The code for all the experiments and results presented here can be found online at [15].

A. EXPERIMENTAL SETUP
1) DATA SPECIFICATIONS
Recently, Plebani et al. [16] published a collection of labelled pixels from 77 different TRDR images from various locations on the Martian surface in the wavelength range 1 to 3.47 micrometers. The pixels were labelled using a hierarchical Bayesian model for estimating distributions of spectral patterns, and the collection can be used as a basis for nonlinear noise removal, as training data for mineral classification models, or for validating models that use TRDR or MTRDR data for training. The dataset contains a total of 592413 spectra with 39 different labels, including the spectra for artifacts and some bland pixels, of which 28 labels are similar to those in the MICA library. We analyzed the performance of our models on these 28 labels by randomly sampling 500 spectra from each label, repeating some spectra for the labels that have fewer than 500 records in the dataset. Note that the supervised models used in this performance analysis, i.e., CNN, ANN, RFC, and SVC, are trained on the dataset augmented from the MICA spectral library and tested on this data, whereas all the other methods mentioned in this section are both trained and tested on this data by k-fold cross-validation.
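The per-label sampling can be sketched as below. How spectra are repeated for the small classes is not specified here, so sampling with replacement is used as one plausible reading; the function name, the seed, and the toy label array are hypothetical.

```python
import numpy as np

def sample_per_label(labels, n_per_label=500, seed=0):
    """Pick n_per_label sample indices for every label.

    Labels with enough records are sampled without replacement; labels with
    fewer records are sampled with replacement, which repeats some spectra
    (an assumed way of implementing the repetition).
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    picked = []
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        replace = idx.size < n_per_label
        picked.append(rng.choice(idx, size=n_per_label, replace=replace))
    return np.concatenate(picked)

labels = np.array([0] * 1200 + [1] * 120)   # one large and one small class
chosen = sample_per_label(labels)
```

Each class ends up with exactly 500 entries, so a total over 28 labels would give the 28 × 500 samples used in the accuracy computation.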

2) PRE-PROCESSING PARAMETERS
How the accuracy of the models depends on the different parameters used to produce the training data is discussed here. The plots presented in figure 10 show that a mean accuracy of around or above 0.8 was frequently attained by using particular values for the different parameters: the deviation limit ($m$) and the standard deviation of the noise distribution ($\sigma_N$), which generate the randomness within an augmented spectrum, the number of augmented spectra for each mineral class ($n_c$), and the number of extracted features ($n_E$). Each of the plots in figure 10 shows the change in mean accuracy with respect to the change in one of these parameters while the other parameters are held constant. Figures 10a and 10b show that the optimum result is generated if $m$ and $\sigma_N$ are set to around 0.7 and 0.45 respectively. From figures 10c and 10d it can be seen that increasing the size of the augmented dataset or the number of extracted features does not improve the performance significantly beyond a certain limit, which in turn helps to find a suitable upper limit for both parameters to restrain the run-time of the learning process. Table 1 summarises the various parameter settings that were used to produce the training data.
3) MODEL SPECIFICATIONS
a: CNN MODEL [17]
A CNN with 1-D convolutions operates in a multiscale manner from local to global, which further examines the mutual information between locally adjacent bands. In the case of mineral identification, spectra are matched based on the position and the intensity of the absorptions. In test data, the absorption patterns must occur at the same wavelengths, or at negligibly different wavelengths, to be classified within the same class as in the training data. For this reason, minimal kernel sizes and strides are used with 1-D convolutions at the convolution layers of the model. The model structure has only three convolution layers, followed by a dense layer, with batch normalization and drop-out layers after each of the convolution layers [figure 11a] to stabilize the learning process by standardizing the updates during back-propagation. Thus the risk of early over-fitting is reduced, albeit not fully avoided. The ideal number of epochs to use in the learning process is estimated as 50, with a batch size of 150. With the vast set of tuning parameters available in the TensorFlow 2.0 package [18], it is sometimes tricky to find the best set of parameter values for a model. With the help of hyper-parameter tuning on each layer, the use of Leaky ReLU (Rectified Linear Unit) activation [19] is validated in all but the last layer. SoftMax activation is applied in the last (output) layer. The loss function and optimizer are, respectively, categorical cross-entropy and RAdam [20], with the warm-proportion (an idea similar to momentum in general neural network optimizers like Adam) set to 0.2 and the learning rate set to $1 \times 10^{-5}$.
b: ANN MODEL [21]
This model structure has three hidden layers, and the output layer has 31 nodes corresponding to the 31 mineral classes. The sizes of the dense layers decrease gradually towards the output layer, and a drop-out layer (drop value of 0.4) followed by a batch-normalization layer is appended after each hidden layer [figure 11b]. From multiple experiments, it is deduced that the best model performance occurs with a batch size of 150. The model is set with the network weights at the epoch of the best validation loss out of a maximum of 300 epochs. The node activation functions, loss function, and optimizer are the same as for the CNN model discussed above.
c: RFC MODEL [22]
The Scikit-learn library [23] provides a vast set of parameters to fine-tune the performance of a random forest. The branching criterion used in our model is based on the Gini impurity, which measures the probability that an element would be mislabelled if it were labelled randomly according to the distribution of labels in the dataset. The number of trees (estimators) is set to 100, which fixes the number of votes available in the ensemble learning process. No restriction is set on the maximum depth of the trees, to avoid impurity at the leaf nodes. In the same way, no particular restriction is set on other branching factors, such as the minimum number of elements needed to split an internal node and the minimum number of elements needed to form a prediction node. A low correlation between the individual trees is ensured by using bootstrapping.
d: SVC MODEL [24]
A Support Vector Classifier is a binary classifier with hyperplanes acting as class boundaries, which are often learned by using a kernel. The regularization parameter, symbolized as $C$, puts a bound on the distance from the class boundary for each training example.
When $C$ is high, the optimizer selects a separating hyperplane with a smaller margin if that hyperplane classifies the training points more accurately, whereas with a low value of $C$ the optimizer selects a separating hyperplane with a larger margin to create a more generalized model. The best result was generated by using a radial basis function kernel and setting $C$ to 1, as found by hyper-parameter tuning [23].
Note that learning the hyperplane classifier is a time-consuming process whose cost increases with the number of features/dimensions in the data; hence a dimensionality reduction step is required before learning an SVC model. PCA [25], a transformation applied for a wide variety of objectives that require dimensionality reduction, is included in our model to speed up the learning process of SVC. Naturally, reducing the number of features in the data affects accuracy; however, the objective of dimensionality reduction is to sacrifice some accuracy in exchange for simplicity, as machine learning algorithms can learn the features more easily and quickly without having to deal with superfluous factors. Table 2 provides a description of the different parameters set to fine-tune the learning models through numerous experiments in order to achieve the best outcomes. The confidence cut-off mentioned in the previous section is applied to all the models to generate the classification. The CNN model is included only in this section, to provide a comparison with the other models that use the data processed by the proposed feature extraction step; those are the models used to generate the results in section V.

B. UNCERTAINTY ANALYSIS OF PRE-PROCESSING PIPELINE
An uncertainty analysis is presented in table 3, which contains the mean accuracy and standard deviation for the supervised models, calculated over multiple runs, i.e., taking a different set of validation samples from the data in [16] in each run. The accuracy is computed as the ratio of the total number of correct predictions to the total number of samples (i.e., 28 × 500). The pre-processing steps mentioned in each row of table 3 indicate the pipeline used, in the same order as mentioned, for both the training data and the validation data to calculate the corresponding evaluation value.
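The evaluation measure can be written down in a few lines. The sketch below assumes that unclassified pixels carry a sentinel label (here `-1`) that never equals a true label, so they count as incorrect; the function names and all numbers are made up for illustration and are not results from the paper.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Number of correct predictions divided by the total number of samples."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def summarize_runs(run_accuracies):
    """Mean accuracy and standard deviation over repeated runs."""
    acc = np.asarray(run_accuracies, dtype=float)
    return float(acc.mean()), float(acc.std())

# One toy run: the third pixel was left unclassified (-1) and counts as wrong.
single = accuracy([0, 1, 2, 2], [0, 1, -1, 2])

# Hypothetical accuracies from three runs with different validation samples.
mean_acc, std_acc = summarize_runs([0.81, 0.79, 0.80])
```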
The pre-processing steps mentioned in section II are used for the uncertainty analysis. The mean accuracies of the models increase as the pipeline is extended by almost every pre-processing step. The improvement after the smoothing step is small, which could be due to the exclusion of some weaker absorption signatures in the process of removing unwanted kinks from the spectra. Although the smoothing step alone does not significantly alter the accuracy, using it before the continuum removal stage results in higher mean accuracy. If spectra standardization is included in the pipeline, a significant increase in mean accuracy is observed. However, it is clear from the table that this improves the accuracy to an acceptable level when used in association with the continuum removal step. It can be concluded that the pre-processing steps in the order given in figure 3 yield the best combination for all the supervised models. The similar accuracy obtained by the CNN, which works on the standardized data, and by the other models (ANN, RFC, and SVC), for which the absorption extraction step is included in the data processing pipeline, shows the significance of the feature extraction step for using such supervised learning models.

C. PERFORMANCE COMPARISON
Extensive research has previously been done on the use of probabilistic and unsupervised models for mineral identification in Martian data. Allender et al. [5] used the OPTICS (Ordering Points To Identify Clustering Structure) algorithm to generate the first level of clusters, followed by the rare class discovery algorithm DEMUD (Discovery via Eigenbasis Modeling of Uninteresting Data) to identify mineralogical units with distinct features, and used a decision tree based on different browse products to assign semantic labels to the identified mineral deposits in full-resolution targeted (FRT) data. Bue et al. [6] applied a metric learning method using multiclass linear discriminant analysis (LDA) to improve segment constancy to the classes of interest. Intending to identify some previously unidentified minerals, Dundar et al. used the Partially Observed Hierarchical Dirichlet Process (PO-HDP), which uses a Dirichlet Process Mixture (DPM) to model each set of data and connects the various groups of data by a higher-level DP to create a hierarchy in a test dataset comprising both identified and unknown samples. The unknown samples were then classified by mapping them to the known samples that share common ancestors in the hierarchy. The key advantage of this method is that it simplifies the process by automatically clustering data that do not fit into one of the observed classes, allowing verification after generating the initial clusters rather than at the sample level, as other anomaly detection methods do [7].
The performances of the supervised models, with the previously specified parameter settings and trained on the augmented spectra controlled by the aforementioned parameters, are compared with these models in Table 4, where the mean accuracy and standard deviation are measured over successful mineral-group classification. The mean accuracy and standard deviation are calculated over multiple runs with different sets of training and test data. The overall performances of the supervised models are comparable to those of the previously used methods, which demonstrates the potential of the proposed framework. As carbonates, due to the C-O bond vibration overtones, have the same paired 2.3 and 2.5 micrometer absorption signatures that are also used to identify phyllosilicates [2], the accuracy of carbonate detection is relatively low. Halides like epidotes are spectrally similar to phyllosilicate mixtures including calcite, chlorite, and illite, with the major 2.34 and 2.25 micrometer absorptions, and are only distinguishable by the presence of a minor band at 1.55 micrometer [2]; thus, the confidence of identification for halides could have been lower, resulting in a slightly lower accuracy.

IV. GEOLOGICAL INFORMATION OF STUDY AREA FOR MINERAL MAPPING
On Mars, the Syrtis Major quadrangle (which spans latitudes 0° to 30° north and longitudes 270° to 315° west) contains a cluster of massive lowlands known as the Nili Fossae. Olivines, smectites, hydrated silica, kaolinite, and iron oxides are the main minerals found in this area [27], [28]. It was chosen as a prospective landing location for the Mars 2020 rover in September 2015. Jezero crater, a huge impact crater about 45 kilometers in diameter [29] located in the Nili Fossae region, was in fact the landing site for the Mars 2020 mission, which sought to gather samples of the mineral deposits that might be brought back to Earth for the first time ever by a future mission. It may therefore be used to compare and assess how the diversity of minerals evolves over time in a specific region of the Martian surface. For this reason, the regions close to Jezero crater are thoroughly researched in the literature to identify minerals and sediments. Mg-Carbonate, Fe/Mg-Phyllosilicates, Mafic minerals (Pyroxene and Olivine), and outcrops of Serpentine and Chlorite are the most common minerals found in the Jezero crater region [8], [30]. An MTRDR image, FRT93BE, from this region has been used in this study to describe the validation process in detail using browse products and absorption matching with library spectra. We also applied the proposed framework for mineral mapping to MTRDR data collected from several other regions of the Martian surface.
Mawrth Vallis is a 100-kilometer-wide valley that has been altered by many impact craters, many small streams, and volcanism. Visible and near-infrared spectrometers have been used extensively to study Mawrth Vallis. Some of the dominant minerals identified in earlier studies are Fe/Mg-Phyllosilicates, Al-Phyllosilicates, and High Calcium Pyroxene (HCP) [31], [32]. Gale Crater is a 150-kilometer-wide impact crater that formed around the early Hesperian. Earlier studies have reported extensive detections of mafic minerals and hydrated silicates in both bedrock and sediments, as well as temporal changes in water chemistry as evidenced by the presence of chloride salts [33], [34]. The 119-kilometer-diameter Columbus Crater is situated in Terra Sirenum, a massive region in the southern hemisphere encompassing latitudes 10° to 70° south and longitudes 110° to 180° west. An orbital near-infrared spectrometer analysis has revealed that it contains layers of clay, gypsum, and sulfates, indicating the past presence of fresh water and the possibility of life on the planet [35]. The 280-kilometer-diameter impact crater Aram Chaos is located in the Margaritifer Terra region of Mars. Hematite, jarosite, and hydrated sulfates are minerals that are frequently detected in this area [36]. Northeast Syrtis, a part of the Syrtis Major volcanic province in Mars' northern hemisphere, has stratified terrain that hosts a variety of igneous minerals, including olivine and high- and low-calcium pyroxene, as well as aqueous minerals, such as clay, carbonate, serpentine, and sulfate [37], [38]. Figure 12 shows the locations on the Martian surface from which the images used in the present study were acquired, and figure 13 shows the false-colour images.

V. RESULTS AND DISCUSSIONS
The FRT93BE image from Jezero crater depicts a high mineral diversity and therefore serves as a good test for evaluating the performance of the proposed framework. The dominant minerals found by the supervised models in FRT93BE are Mg-Carbonate (belonging to the Carbonate group of minerals), Fe/Mg-Smectite, or nontronite (belonging to the Phyllosilicate group of minerals), and HCP and Mg-Olivine (both belonging to the Mafic group of minerals) (figure 14). Some other mineral outcrops, such as plagioclase, serpentine, and chlorite, are also detected in the image, but they are not considered for model evaluation.
Different false-colour composites, known as browse products, each combine a set of three thematically associated summary parameters as RGB channels to provide distinctive visualizations of the information derived from the MTRDR image and to identify the spatial distribution of minerals within a scene. Each browse product has its own significance through its three band components and colour interpretation. The information about the browse product used to identify each of the dominant minerals in FRT93BE is presented in Table 5.
The CAR browse product is sensitive to subtle variability in the Carbonate group of minerals, which has strong absorptions near 2.3 and 2.5 micrometer, and Mg-Carbonate thus appears yellowish/white in it; the CR2 browse product, in contrast, distinguishes Carbonate minerals using the absorption features at 2.2-2.3 and 2.4-2.5 micrometer, and thus differentiates between Mg-Carbonate and Fe/Ca-Carbonate. The PHY and PFM browse products recognize patterns related to the cation composition of hydroxylated minerals and the Phyllosilicate group of minerals. Indistinguishable Fe/Mg-Smectites appear in red/magenta in the PHY product and with a cyan tinge in the PFM product when hydration bands of equivalent strength are considered. The MAF browse product gives information associated with Mafic mineralogy. Both Fe-Phyllosilicate and Olivine have a bowl-shaped absorption at 1.0-1.7 micrometer and appear in red in the MAF browse product. HCPINDEX2 is sensitive to the broad 2-micrometer absorption of Pyroxene and is less susceptible to false positives from other spectral signatures that display convexity centered at 1.3 and 1.5 micrometer. HCP appears in a blue/magenta shade in the MAF composite.
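The construction of a browse product from three summary parameters can be sketched as follows. The sketch assumes a simple linear stretch with clipping for each channel; the stretch limits and parameter values are hypothetical, and the actual browse products use product-specific parameters and stretch limits that are not reproduced here.

```python
def stretch(value, low, high):
    """Linearly map a summary parameter from [low, high] to 0-255, with clipping."""
    scaled = (value - low) / (high - low)
    return round(255 * min(1.0, max(0.0, scaled)))


def browse_pixel(parameters, limits):
    """Build one RGB pixel from three summary-parameter values.

    parameters: (red_value, green_value, blue_value) for the pixel
    limits: one (low, high) stretch pair per channel (hypothetical values)
    """
    return tuple(stretch(v, low, high) for v, (low, high) in zip(parameters, limits))


# A pixel whose second parameter is strong and whose others are absent
# comes out as pure green in the composite.
limits = [(0.0, 0.05), (0.0, 0.05), (0.0, 0.05)]
pixel = browse_pixel((0.0, 0.10, -0.01), limits)
```

A pixel in which one summary parameter dominates therefore takes the colour of the channel assigned to that parameter, which is what allows a mineral to be associated with a characteristic colour in a given browse product.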
In figure 14, the locations classified as Mg-Carbonate, Fe/Mg-Smectite, HCP, and Olivine minerals in FRT93BE by the ANN, SVC, and RFC models closely resemble the corresponding regions in the browse products listed in Table 5. It is found that the Carbonate-bearing mineral inside Jezero is more spatially diverse than previously detected.
Characterization of minerals can be validated using the MICA library based on their absorption features. The median spectra calculated for each dominant mineral class from the spectra labelled by the supervised models in the image FRT93BE are compared with the corresponding spectra in the MICA library in figure 15, which shows that both exhibit similar absorption features. Carbonates are characterized by paired absorptions at 2.3 and 2.5 micrometer because of C-O bond vibration overtones, with the dips of Fe/Ca-Carbonates concentrated at 2.33 and 2.53 micrometer, and those of Mg-abundant Carbonates shifted to 2.3 and 2.5 micrometer together with a hydration band at 1.93 micrometer [34]. The median spectra exhibit the same absorptions at 1.93, 2.3, and 2.5 micrometer as the MICA library spectrum, as shown in figure 15a. Fe/Mg-Smectites are a Phyllosilicate group of minerals that is likely to be combined with Carbonates in large abundance. Outside of Carbonate-bearing terrains, Fe/Mg-Smectites are identified by absorptions near 1.5, 2.3, and 2.4 micrometer, which can be observed in figure 15b, where a 2.3 micrometer band center is consistent with Mg-rich Smectites [39]. The HCP mineral is characterized by absorptions at 2.0, 2.3, and 2.4 micrometer, which can be seen in figure 15c. Olivine has a wide absorption centered near 1.0 micrometer, which broadens and deepens as the iron content increases [2]. Other absorptions that characterize the Mg-Olivine mineral are at 2.0, 2.4, and 2.5 micrometer, as shown in figure 15d.
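The absorption matching described above can be sketched in a few lines. The sketch assumes continuum-removed spectra sampled on a common wavelength grid, takes the band-wise median over the spectra of one class, and checks whether the local minima of the median spectrum fall close to the expected band centers; the wavelength grid, the spectra, and the tolerance of 0.02 micrometer are toy values, and the function names are hypothetical.

```python
import statistics


def median_spectrum(spectra):
    """Band-wise median over all spectra assigned to one mineral class."""
    return [statistics.median(band_values) for band_values in zip(*spectra)]


def absorption_centers(wavelengths, spectrum):
    """Wavelengths at which the spectrum has a strict local minimum."""
    return [
        wavelengths[i]
        for i in range(1, len(spectrum) - 1)
        if spectrum[i] < spectrum[i - 1] and spectrum[i] < spectrum[i + 1]
    ]


def bands_match(found, expected, tolerance=0.02):
    """True if every expected band center has a detected minimum nearby."""
    return all(any(abs(f - e) <= tolerance for f in found) for e in expected)


wavelengths = [1.80, 1.93, 2.10, 2.30, 2.40, 2.50, 2.60]   # micrometer
labelled_spectra = [                                        # toy spectra of one class
    [1.00, 0.90, 1.00, 0.80, 0.95, 0.85, 1.00],
    [1.00, 0.92, 0.99, 0.82, 0.96, 0.84, 1.00],
    [0.99, 0.91, 1.00, 0.78, 0.94, 0.86, 0.99],
]
median = median_spectrum(labelled_spectra)
centers = absorption_centers(wavelengths, median)
is_consistent = bands_match(centers, expected=[1.93, 2.30, 2.50])
```

Using the median rather than the mean keeps the class spectrum robust to a small number of mislabelled or noisy pixels.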
A few additional results of the mineral mapping generated for the MTRDR images mentioned in section IV, along with the validation by browse products and absorption matching, are also presented here.

VI. CONCLUSION AND FUTURE WORK
The lack of ground-truth data is a formidable obstacle to overcome in building a supervised learning model. In this study, an augmentation technique is presented to generate the training data only from the spectra available in the MICA spectral library, which effectively eliminates the manual labour needed to create a viable training dataset. A novel feature extraction technique that involves accumulating absorption indices for each spectrum over the most diverse wavelength ranges has also been demonstrated, which improves both the training time and the performance of the models. The models are applicable to MTRDR data from any area of Mars since the training data are not biased towards any location-specific feature of the Martian surface, unlike the problem that arises when a training dataset is built manually by collecting spectra from known MTRDR images. A pre-processing pipeline for the training dataset is presented that, without the explicit feature extraction step, can be used to train a CNN model and, with that step included, can be used to train supervised learning models like ANN, RFC, and SVC for mineral identification in MTRDR data. The increase in the mean accuracy of the models with the addition of each pre-processing step validates the need for each of these steps and leads to the conclusion that further research on improving and fine-tuning each pre-processing step, together with more sophisticated feature extraction techniques and more fine-tuned models, can improve the overall performance.

The learning models are used here only for illustrative purposes, specifically to demonstrate how they can be applied within this framework to automatically construct the mineral mapping in CRISM hyperspectral data. Only shallow neural networks, as well as basic random forests and support vector machines, were used in this study because the primary goal was to attain an acceptable degree of accuracy rather than an optimal level. The learning models used in this study act on the features differently: for example, ANN eventually assigns weights to feature values, RFC makes decisions based on feature values, and SVC determines the maximum-margin separating hyperplane in the feature space. The satisfactory results shown here suggest that different kinds of supervised learning models can be used in the proposed framework. Given the tremendous success of deep learning approaches to classification problems, the performance of the framework could be improved by using convolution-based deep neural networks within the framework. Until now, no benchmark supervised learning models have been available for mineral mapping in CRISM hyperspectral images; therefore, this study can be extended in that direction, perhaps incorporating more dynamic learning models like Long Short-Term Memory networks and Graph Neural Networks.
Since one of the primary objectives of the Mars 2020 mission is to search for rare minerals on the Martian surface, a more diversified spectral library may become available once the data are accessible on NASA PDS, which will be helpful for building more powerful learning models. In this study, it is assumed that the spectra in the experimented images are pure, i.e., each pixel contains information about a single mineral, and the training data is built using the pure spectra from the MICA library; however, this is not the case in practice. Pixels with mixed spectra are another challenge to overcome in mineral identification from hyperspectral images. The use of spectral unmixing techniques in the framework could aid in the mineral identification process.