A Novel Sample Generation Method for Deep Learning Lithological Mapping with Airborne TASI Hyperspectral Data in Northern Liuyuan, Gansu, China

High-resolution and thermal infrared hyperspectral data acquired from the Thermal Infrared Airborne Spectrographic Imager (TASI) have been recognized as efficient tools in geology, demonstrating significant potential for rock discernment. Deep learning (DL), as an advanced technology, has driven substantial advancements in lithological mapping by automatically extracting high-level semantic features from images to enhance recognition accuracy. However, gathering sufficient high-quality lithological samples for model training is challenging in many scenarios, posing limitations for data-driven DL approaches. Moreover, existing sample collection approaches are plagued by limited verifiability, subjective bias, and variation in the spectra of the same class at different locations. To tackle these challenges, a novel sample generation method called multi-lithology spectra sample selection (MLS3) is proposed. This method involves multiple steps: multiple spectra extraction, spectra combination and optimization, lithological type identification, and sample selection. In this study, the TASI hyperspectral data collected from the Liuyuan area in Gansu Province, China, were used as experimental data. Samples generated based on MLS3 were fed into five typical DL models, including two-dimensional convolutional neural network (2D-CNN), hybrid spectral CNN (HybridSN), multiscale residual network (MSRN), spectral-spatial residual network (SSRN), and spectral partitioning residual network (SPRN) for lithological mapping. Among these models, the accuracy of the SPRN reaches 84.03%, outperforming the other algorithms. Furthermore, MLS3 demonstrates superior performance, achieving an overall accuracy 2.25–6.96% higher than other sample collection methods when SPRN is used as the DL framework. In general, MLS3 improves both the quantity and quality of samples, providing inspiration for the application of DL to hyperspectral lithological mapping.


Introduction
Geological maps contain essential information critical in a variety of fields, such as landslide risk assessment, mineral resource management and development, and land use planning [1]. However, the difficulty of accessing geological outcrops and the limited duration of the field missions have resulted in heterogeneity and discontinuity in geological data collection, posing challenges for generating geological maps in extensive arid or semi-arid regions. Remote sensing images provide a cost-effective way to identify various geological units and facilitate geological interpretation compared to traditional field surveys [2]. With the increase in remote sensing satellites and airborne sensors, it has become feasible to acquire different sources of remote sensing images for lithological mapping [2,3].
Multispectral remote sensing imagers, such as Landsat and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), are commonly utilized for interpreting geological formations and units [4,5]. However, they provide limited spectral information due to their small number of bands. Hyperspectral technology, combining two-dimensional imaging and spectroscopic techniques to acquire spectral-spatial information simultaneously, has garnered significant attention in the field of lithological mapping [6,7]. Common hyperspectral satellite sensors, including Hyperion and Gaofen 5 (GF5) [8,9], as well as common airborne hyperspectral sensors like Hyperspectral Mappers (HyMAP) and Airborne Visible and Infrared Imaging Spectrometers (AVIRIS) [10,11], are widely utilized. However, certain rock-forming minerals (e.g., quartz and feldspar) lack distinct spectral features in the visible-near infrared and short-wave infrared (VNIR-SWIR; 0.4-2.5 µm) ranges [12][13][14]. To compensate for this gap, thermal infrared (TIR; 8-12 µm) hyperspectral sensors like Spatially Enhanced Broadband Array Spectrograph Systems (SEBASS) and the Thermal Airborne Spectrographic Imager (TASI) are employed to identify lithologies and minerals lacking specific spectral features in VNIR-SWIR [15][16][17]. Moreover, TASI hyperspectral data have enabled fine-scale lithological mapping with 2.25 m resolution data, offering the potential for precise remote retrieval of the surface lithological types. Previous studies utilizing TASI hyperspectral data have highlighted the substantial role of the sensor in lithological mapping and mineral mapping [16,17].
Traditional machine learning (ML) algorithms such as support vector machine (SVM) and random forest (RF) have proven effective in lithological mapping [18][19][20]. Recently, deep learning (DL) techniques have rapidly advanced, offering a new dimension in lithological mapping. These techniques can automatically extract high-level semantic features from input data, providing greater accuracy than earlier lithological mapping methods [9,10]. Consequently, they are becoming powerful and flexible tools in lithological mapping, facilitating more detailed analysis and characterization of geological bodies [21]. Several researchers have improved geological body identification using convolutional neural networks (CNN) [22][23][24], fully convolutional networks (FCN) [25], and other networks [26]. Particularly, CNN excels at discerning spectral and spatial features, achieving high mapping accuracy, and exhibiting substantial robustness. These models extract features from sample datasets and abstract them into higher-level representations to comprehend and address complex issues [27]. The quantity and quality of these samples are crucial for enhancing the learning capacity and performance of the models [28,29]. A larger number of samples provides broader information, reduces the risk of overfitting, and improves the generalization abilities of models. Concurrently, the reliability of the samples is paramount. Using inaccurate samples may mislead the models into learning incorrect patterns and information, resulting in unstable and potentially inaccurate outcomes. However, acquiring high-quality samples remains a significant challenge in DL-based lithological mapping, directly affecting the effectiveness of DL strategies. Most DL-based lithological mapping research obtains samples using geological maps as a reference [23][24][25]30]. Some studies utilize regions of interest (ROI) for sample acquisition [31,32], while others extract lithological endmember spectra to process images [27]. However, these sample collection approaches are limited by verifiability, subjective bias, and variation in the spectra of the same class at different locations.
To address these limitations, we propose a novel sample generation method called multi-lithology spectra sample selection (MLS3) to construct a sample dataset. It comprises the following steps: multiple spectra extraction, spectra combination and optimization, lithological type identification, and sample selection. This approach minimizes the impact of human factors and spectral variability on the samples. In this paper, the TASI hyperspectral data collected from the Liuyuan area in Gansu Province, China, which features complex geological conditions and sparse vegetation cover, were used as experimental data. The samples generated by MLS3 were fed into five DL models to map lithologies, including two-dimensional convolutional neural network (2D-CNN), hybrid spectral CNN (HybridSN), multiscale residual network (MSRN), spectral-spatial residual network (SSRN), and spectral partitioning residual network (SPRN). In addition, the different sample collection methods were compared, and the experimental results verified the superiority of MLS3 in lithological mapping.
The remainder of the article is structured as follows. Section 2 provides a detailed literature review of previous work. Section 3 details the geographic information, geological background, ground-truth data, as well as TASI data and its pre-processing. Section 4 outlines detailed information about the methodology. Section 5 presents the experimental results and comparative analyses. Section 6 discusses the findings. The final Section 7 summarizes the conclusions.

Related Work
Lithological mapping is a vital component of geological mapping. The corresponding interpretation results are of great value in analyzing the geological conditions and metallogenic potential of an area [2,33]. Remote sensing technology has progressively become a critical tool in lithological mapping due to its ability to quickly yield data across extensive surfaces. Notably, the adoption of DL for lithological mapping in remote sensing has been increasing due to its powerful feature learning capability. In the following segment, we review some important developments in the application of DL to lithological mapping.

Lithological Mapping Based on DL
Currently, the majority of lithological mapping tasks focus on CNN. Clabaut et al. [34] demonstrated the promising potential of CNN for gossan detection, achieving 77% accuracy in the Canadian Arctic. Ye et al. [9] explored various CNN architectures, including multi-scale 3D deep CNN, hybrid spectral CNN, and spectral-spatial unified network, for lithological mapping. Their results showed accuracies exceeding 90% for all methods. Yu et al. [23] introduced a 3D convolutional autoencoder to extract lithological spatial and spectral features in the Liuyuan area, achieving compelling results. Shirmard et al. [10] combined CNN and ASTER data for lithological mapping, enabling almost all test data to be correctly predicted to match the field data. Pan et al. [24] constructed a CNN model for lithological mapping in Inner Mongolia, China, achieving an overall classification accuracy of 83.0%, outperforming the RF model. Dong et al. [30] proposed a network consisting of a transformer and a dynamic graph convolution module. This network enhances feature extraction by using the transformer to explore the long-range interactive sequence features of lithology and the dynamic graph convolution module to obtain the dynamic graph structure features of lithology, achieving 97% accuracy. Additionally, other networks, such as semantic segmentation models, have been used for lithological mapping. Wang et al. [25] developed a semantic segmentation-based FCN to determine lithological classes, achieving an overall classification accuracy of 96%. Notably, however, most of these DL-based lithological mapping studies overlook the impact of samples on the results [2,35].

Sample Dataset Construction Approaches for DL-Based Lithological Mapping
Most of the studies on DL-based lithological mapping typically utilize geological maps as a standard, from which some data are then randomly selected as training samples [23][24][25]30]. While these maps provide valuable labels, their lack of pixel-by-pixel verifiability may render them less accurate [9], potentially affecting model inference. Obtaining training samples from regions of interest (ROI) is also a common method [9,31,32], but this method is easily affected by the manual operation of interpreters. Alternatively, the ground-measured spectra are used as the reference spectra, and the samples are identified by comparing the similarity between the reference spectra and the pixel spectra [36]. However, due to the influence of terrain, environment, and other conditions, the spectra of the same lithologies tested in the field may have large differences. It becomes difficult to choose which lithological spectrum is the reference spectrum, and it remains uncertain whether the selected reference spectrum can match the pixel spectra within the image. Another approach is to use spectra extracted from the images as reference spectra for sample construction [27]. However, this method does not consider the existence of variations in the spectra of the same class at different locations in hyperspectral imagery. Therefore, relying on a single spectrum to represent a type of object may not capture its spectral characteristics under various conditions. Against this background, it is important to construct an appropriate sample generation method for DL-based lithological mapping.

Overview of Study Site
The study area, located in northwestern Gansu, China (41°13′-41°14′N; 95°30′-95°33′E), is shown in Figure 1a,b. The area covers approximately 9.34 km² and has an altitude ranging from 1700 to 2000 m. This area exhibits an arid environment characterized by undulating Gobi terrain and a lack of vegetation but has excellent bedrock exposure [37]. The area is characterized by complex geological conditions and contains a major ore-forming zone, making it an ideal location for obtaining high-quality TIR hyperspectral images.

TASI Data and Pre-Processing
TASI is equipped with 32 channels within the 8-11.5 µm range. The sensor operated at a 2 km altitude, capturing the image in September 2010. The TASI data were provided by the Beijing Research Institute of Uranium Geology (Beijing, China). Pre-processing of the TIR hyperspectral image involves radiometric calibration, atmospheric correction, and temperature emissivity separation [38]. Radiometric calibration was processed with the system software provided by ITRES (Canada) and operated by the Beijing Research Institute of Uranium Geology (Beijing, China). Atmospheric correction and temperature emissivity separation were implemented using MATLAB R2019a. An atmospheric radiation transfer model with intermediate spectral resolution was used for atmospheric correction to correct atmospheric absorption and upwelling radiation [38]. The image processed in the previous step was transformed into emissivity and temperature images using the normalized emissivity module (NEM), the emissivity ratio module (RATIO), and the average/maximum-minimum difference module (MMD) [39]. Channels with wavelengths greater than 11 µm or less than 8.5 µm were removed to ensure the accuracy of inversion because some of them lie outside the atmospheric windows and are subject to environmental influences. Finally, 22 channels were chosen, ranging from channel 6 to channel 27. Figure 1c depicts the color composite hyperspectral emissivity image.
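The channel selection described above can be illustrated with a minimal sketch. It assumes that channels are numbered from 1 and stored along the last axis of a (rows, columns, channels) array; the function name and the dummy cube are hypothetical and are not part of the original processing chain, which was carried out in MATLAB.

```python
import numpy as np

def keep_window_channels(cube, first=6, last=27):
    """Keep the 1-based channels first..last (inclusive) along the last axis."""
    return cube[..., first - 1:last]

# Hypothetical emissivity cube with the 32 TASI channels.
cube = np.zeros((4, 5, 32))
subset = keep_window_channels(cube)
print(subset.shape)  # (4, 5, 22)
```

Channels 6 to 27 inclusive give 27 - 6 + 1 = 22 bands, which matches the count stated in the text.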

Geological Background and Ground-Truth Data
Tectonically, the study area lies within the Yujingzi and Liuyuan intracontinental rift zones on the southern margin of the Beishan epicontinental active belt, situated between the Tarim and Sino-Korean plates [40]. The region exhibits complex geological conditions with exposed strata [41]. The study area features slate, granite, granodiorite, diorite, marble, and Quaternary sediments. To obtain precise geological information, a combination of outdoor field surveys and indoor laboratory analyses was conducted. Outdoor activities were carried out in late August 2020, including Global Positioning System (GPS) surveys and rock collection. Indoor rock thermal infrared spectra measurements were conducted using the 102F portable Fourier transform infrared spectrometer (FTIR). Ground-truth data involved 25 points with GPS coordinates, rock-type details, and thermal infrared spectra. As shown in Figure 1c, slate is represented by points 1 to 12, granite by points 13 to 16, granodiorite by point 17, diorite by points 18 to 24, and marble by point 25. Figure 2 shows several hand specimen images and several field photographs. Most studies use geological maps as reference maps. Since geological maps are only a general description of the geological situation, while remote sensing images reflect the real surface, geological maps lack pixel-by-pixel verifiability. In this case, we interpreted the TASI image based on geological resources [9,41] and the lithological types of the ground-truth data, as shown in Figure 3, facilitating algorithm evaluation.

Methodology
A schematic diagram illustrating the methodology adopted in this study is presented in Figure 4, which consists of four processes. The first part is the input of pre-processed surface emissivity TASI data. The second part involves constructing a sample dataset from the emissivity data using MLS3. This dataset is divided into training, validation, and test sets for the next part. The third part involves training, validating, and testing the dataset obtained in the previous step using different DL models. To validate the performance, each model is run five times. The best-performing model from these runs is selected for the next part. The fourth part involves generating the lithological map using the best model. The two main processes, "sample dataset construction" and "lithological map creation models", will be described in detail.
These methods were implemented in ENVI 5.3.1 on a Windows 10 computer with an Intel i7-10700K CPU and 16 GB of RAM, as well as in Python 3.8 on a machine equipped with an Intel Xeon Silver 4210R, 20 GB of RAM, and an NVIDIA GeForce RTX 3090.

Sample Dataset Construction
MLS3 is implemented to generate a sample dataset consisting of the following steps: multiple spectra extraction, spectra combination and optimization, lithological type identification, and sample selection, as shown in Figure 5.

Multiple Spectra Extraction
In general, the same lithological type in remote sensing images may exhibit different states at different locations due to imaging conditions, resulting in variations or significant differences in the spectra of the same lithological type [42]. To overcome this effect, the original image is segmented into multiple patches based on its size to extract lithological endmember spectra. After segmentation, the image patches are significantly smaller than the original image. This processing helps extract richer lithological endmember spectral information, alleviates the problem of local spectral variability, and reduces the extraction errors of the lithological endmember spectra [42]. The number of lithological endmember spectra in each patch is determined using the hyperspectral signal subspace identification by minimum error (HySime) algorithm. The HySime algorithm, which estimates the dimensionality of hyperspectral subspaces, first calculates the correlation matrix of the signal and noise. It then selects the subspace with the smallest mean squared difference before and after projection in a space consisting of signal eigenvectors [43]. Bioucas-Dias and Nascimento [43] asserted that the number of endmembers is related to the subspace of eigenvectors that best captures the information in the original data. Therefore, the dimension of the eigenvector subspace is taken as the number of endmembers. Then, the lithological endmember spectra in each image patch can be extracted using the sequential maximum angle convex cone (SMACC) algorithm. SMACC, which is based on a convex cone model, identifies the lithological endmember spectra with the aid of constraints [44]. It uses extreme points to identify the convex cone and define the first lithological endmember spectrum. The next lithological endmember spectrum is then generated by applying an oblique projection with constraints to the existing cone. The cone is extended in this way to generate new lithological endmember spectra, and the process is repeated until the existing lithological endmember spectra are included in the generated convex cone or until a specified number of lithological endmember spectra classes is reached [44].
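The patch segmentation step can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify how the patches are laid out, so a regular grid of rows and columns is assumed here, and the function name and the random cube are hypothetical. The HySime and SMACC steps themselves are not reproduced.

```python
import numpy as np

def split_into_patches(cube, n_rows, n_cols):
    """Split a (rows, cols, bands) cube into n_rows * n_cols spatial patches."""
    patches = []
    for row_block in np.array_split(cube, n_rows, axis=0):
        # Each block of rows is cut again along the column axis.
        patches.extend(np.array_split(row_block, n_cols, axis=1))
    return patches

# Hypothetical 22-band emissivity cube, split on an assumed 2 x 3 grid.
cube = np.random.default_rng(0).random((100, 301, 22))
patches = split_into_patches(cube, 2, 3)
print(len(patches), [p.shape for p in patches[:3]])
```

Each patch would then be passed separately to the endmember-number estimation and endmember extraction steps, so that locally distinct spectra of the same lithology are not averaged away by the rest of the scene.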

Spectra Combination and Optimization
After extracting the lithological endmember spectra from each image patch, all the lithological endmember spectra are collected into a set. K-means, which is an unsupervised learning algorithm [45], is used to cluster these spectra into different classes. After classifying the lithological endmember spectra, a single lithological type may correspond to multiple endmember spectra. When a large number of endmember spectra exist within a class, it increases the variety and quantity of endmember spectra but results in redundant calculations. In this case, the endmember average root mean square error (EAR) metric is employed to optimize the selection of lithological endmember spectra [46,47]. For a class with $n$ lithological endmember spectra $\{E_1, E_2, \ldots, E_n\}$, the EAR for the $i$th lithological endmember spectrum is defined as follows:

$$\mathrm{EAR}_i = \frac{\sum_{j=1}^{n} \mathrm{RMSE}(E_i, E_j)}{n - 1} \tag{1}$$

where $\mathrm{EAR}_i$ denotes the EAR for the $i$th lithological endmember spectrum and $\mathrm{RMSE}(E_i, E_j)$ represents the root mean square error between $E_i$ and $E_j$. A lower EAR value indicates a higher representativeness of the lithological endmember spectra. For each lithology, several representative spectra are chosen based on their EAR values.
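A minimal sketch of the EAR-based selection is given below. It assumes that EAR is the sum of the RMSE values between one spectrum and all spectra of the same class, divided by n - 1 (the self-comparison contributes zero); the function names and the toy spectra are hypothetical.

```python
import numpy as np

def ear_scores(spectra):
    """EAR of each spectrum in one class; spectra has shape (n, k)."""
    spectra = np.asarray(spectra, dtype=float)
    n = spectra.shape[0]
    if n < 2:
        raise ValueError("EAR needs at least two spectra in the class")
    diff = spectra[:, None, :] - spectra[None, :, :]   # pairwise differences, (n, n, k)
    rmse = np.sqrt(np.mean(diff ** 2, axis=2))         # pairwise RMSE, zero diagonal
    return rmse.sum(axis=1) / (n - 1)

def most_representative(spectra, count=2):
    """Indices of the `count` spectra with the lowest EAR."""
    return np.argsort(ear_scores(spectra))[:count]

toy_class = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]       # three 2-band toy spectra
print(ear_scores(toy_class))                           # [2.5 2.  3.5]
print(most_representative(toy_class, count=2))         # [1 0]
```

In the toy class the middle spectrum has the lowest EAR because it is, on average, closest to the other two, which is the sense in which a low EAR marks a representative spectrum.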

Lithological Type Identification
After selecting the representative lithological endmember spectra, the spectral angle (SA) is utilized to classify these spectra. A smaller spectral angle indicates a closer similarity between a representative lithological endmember spectrum and a measured lithological spectrum [48]. Consequently, the type of each representative lithological endmember spectrum is identified based on its best match to the measured lithological spectra.
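The algorithm is both convenient and efficient and has been widely used for identifying lithological types [16,49]. The technique is implemented by applying:

$$\mathrm{SA} = \arccos\left(\frac{\sum_{i=1}^{k} t_i r_i}{\sqrt{\sum_{i=1}^{k} t_i^2}\,\sqrt{\sum_{i=1}^{k} r_i^2}}\right) \tag{2}$$

where SA is the spectral angle (in radians; 0 to π/2), $t$ is the lithological endmember spectrum, $r$ is the measured lithological spectrum, and $k$ is the number of bands.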
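The matching step can be sketched as below, assuming the usual definition of the spectral angle as the arccosine of the normalized dot product of the two spectra. The function names and the two library spectra are hypothetical values chosen for illustration only; they are not measured data from this study.

```python
import numpy as np

def spectral_angle(t, r):
    """Angle in radians between spectra t and r (same number of bands)."""
    t = np.asarray(t, dtype=float)
    r = np.asarray(r, dtype=float)
    cos = np.dot(t, r) / (np.linalg.norm(t) * np.linalg.norm(r))
    # Clip guards against values marginally outside [-1, 1] from rounding.
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def identify_type(endmember, library):
    """Name of the library spectrum with the smallest angle to `endmember`."""
    return min(library, key=lambda name: spectral_angle(endmember, library[name]))

# Hypothetical measured spectra, already resampled to the image bands.
library = {"slate": [0.90, 0.80, 0.90], "marble": [0.95, 0.95, 0.95]}
print(identify_type([0.45, 0.40, 0.45], library))  # slate
```

Because the angle depends only on the shape of the spectrum and not on its overall magnitude, the scaled-down test spectrum is still matched to slate.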

Sample Selection
After determining the types of all representative lithological endmember spectra, fully constrained linear spectral unmixing (FCLS) is used for sample selection, as follows:

$$x = \sum_{i=1}^{b} c_i e_i + a \tag{3}$$

where $b$ represents the total number of representative lithological endmember spectra, $c_i$ denotes the abundance value corresponding to the $i$th representative lithological endmember spectrum $e_i$, $a$ is an error term, and $x$ is any $k$-dimensional spectral vector from the image ($k$ is the number of bands in the image). Two constraints must be observed with FCLS:

$$\sum_{i=1}^{b} c_i = 1, \qquad c_i \ge 0 \tag{4}$$

All representative lithological endmember spectra are put into Equation (3) to generate the abundance map. Each pixel in the abundance map has a value ranging from 0 to 1. A pixel's value closer to 1 indicates a higher likelihood that the pixel belongs to the type represented by that abundance map. Whether a pixel is assigned to a specific class is decided by a threshold. The sample selection method using the abundance map involves the following main steps: (1) The threshold is defined using histogram calculations. Specifically, the cumulative percentage of the histogram is calculated, and the threshold is set at the pixel value where this percentage exceeds a predefined value T. (2) Pixels are classified using the threshold. If a pixel's abundance value exceeds the threshold, it is assigned to the corresponding class. Otherwise, it is categorized as unclassified. (3) Since multiple representative lithological endmember spectra correspond to a single class, the samples obtained for the same class in steps (1) and (2) are grouped accordingly to generate a final labeled sample map. The pseudo-code of the algorithm is provided in Algorithm 1.
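The unmixing and thresholding steps can be sketched as follows. This is not the implementation used in the study: the sum-to-one constraint is enforced here by appending a heavily weighted row of ones to a non-negative least squares problem (a common FCLS approximation), and the threshold is taken as a quantile of the abundance values rather than from a binned histogram. The function names, the weight `delta`, and the toy spectra are assumptions.

```python
import numpy as np
from scipy.optimize import nnls

def fcls_abundances(x, endmembers, delta=1e3):
    """Abundances c >= 0 with sum(c) ~ 1 for pixel x; endmembers is (b, k)."""
    E = np.asarray(endmembers, dtype=float).T          # (k, b) mixing matrix
    b = E.shape[1]
    A = np.vstack([E, delta * np.ones((1, b))])        # extra row enforces sum-to-one
    y = np.append(np.asarray(x, dtype=float), delta)
    c, _ = nnls(A, y)                                  # non-negativity from NNLS
    return c

def select_samples(abundance_map, T=0.996):
    """Mask of pixels whose abundance exceeds the T cumulative-fraction value."""
    threshold = np.quantile(abundance_map, T)
    return abundance_map > threshold

# Toy check: a noiseless mixture of three hypothetical 4-band endmembers.
E = np.array([[0.90, 0.80, 0.70, 0.90],
              [0.95, 0.90, 0.92, 0.97],
              [0.60, 0.70, 0.80, 0.75]])
true_c = np.array([0.2, 0.5, 0.3])
print(fcls_abundances(true_c @ E, E))                  # close to [0.2 0.5 0.3]

# With T = 99.6%, the 0.4% highest-abundance pixels are kept as samples.
toy_map = np.arange(1000, dtype=float).reshape(10, 100)
print(int(select_samples(toy_map).sum()))              # 4
```

In a full pipeline, one abundance map per representative endmember spectrum would be thresholded in this way, and the masks belonging to the same lithological class would then be merged into a single labeled sample map.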

Lithological Map Creation Models
The samples obtained using MLS3 are employed to evaluate five distinct DL models, including 2D-CNN [38], HybridSN [52], MSRN [53], SSRN [54], and SPRN [55]. To assess the applicability of the original networks for lithological mapping, the structural integrity of each original network framework is maintained as much as possible. The networks are implemented using PyTorch 1.7, with the number of training epochs set to 300. An initial learning rate of 0.001 is established, which is reduced by a factor of 0.1 at the 100th epoch and by 0.01 at the 250th epoch. The Adam optimizer is utilized for training updates. Additionally, the batch size is set to 128, and the input image size is 7 × 7, eliminating the requirement for dimensionality reduction in the original data. A brief conceptual overview of the implemented DL models is presented below.

1. Two-dimensional convolutional neural network (2D-CNN)
The 2D-CNN is a neural network designed for spatial feature extraction [38]. Firstly, convolutional layers are utilized to capture features such as edges and textures from the input data. Subsequently, pooling layers are used to reduce computational complexity and resource consumption. Finally, fully connected layers perform classification based on the features learned by the convolutional and pooling layers.

3. Multiscale residual network (MSRN)
MSRN considers multi-scale feature extraction to capture optimal spatial features. Specifically, MSRN replaces depth separable convolution (DSC) with mixed depth convolution (MDConv) to extract features at different scales from each feature map [53]. This improves the feature representation capability of the network by considering feature interactions at different scales. MSRN replaces the convolutional layer in the conventional residual block with MDSConv and uses the multiscale residual block (MRB) as its main unit. The entire MSRN network consists of four MRB units. To further enhance feature representation capability, skip connections are incorporated into two cascaded MRBs. The maximum pooling layer is removed, and only the first two MRB blocks are retained due to the smaller input patch and the large amount of input data in this research.

4. Spectral-spatial residual network (SSRN)
SSRN employs residual learning to construct spectral and spatial residual blocks [54]. Specifically, 3D convolutional layers are the fundamental elements, and a batch normalization layer is introduced after each convolutional layer to standardize the learning process and improve model performance. Each spectral residual block utilizes multiple 1 × 1 × k convolutional layers to extract and reduce the dimensionality of spectral features from the original input image. Each spatial residual block uses multiple 3 × 3 × 1 convolutional layers to learn and enhance spatial features.

5. Spectral partitioning residual network (SPRN)
SPRN utilizes group convolution (GC) to partition the input spectra into multiple non-overlapping continuous sub-bands and employs cascaded parallel residual blocks to extract local spectral and spatial features from these sub-bands [55]. Simultaneously, ordinary convolution is utilized to extract global information over the entire band through additional branches. Finally, the input information, local information, and global information are fused through a skip connection.
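All of the networks above take 7 × 7 spatial neighborhoods with the full set of bands as input. A minimal sketch of how such inputs could be cut from the emissivity cube is shown below. The reflect padding at the image border is an assumption, since the text does not state how border pixels are handled, and the function names and the toy cube are hypothetical.

```python
import numpy as np

def pad_cube(cube, size=7):
    """Reflect-pad the two spatial axes so border pixels get full patches."""
    half = size // 2
    return np.pad(cube, ((half, half), (half, half), (0, 0)), mode="reflect")

def patch_at(padded, row, col, size=7):
    """size x size x bands patch centered on pixel (row, col) of the unpadded cube."""
    return padded[row:row + size, col:col + size, :]

# Hypothetical 22-band cube; the patch center equals the labeled pixel itself.
cube = np.arange(10 * 12 * 22, dtype=float).reshape(10, 12, 22)
padded = pad_cube(cube)
patch = patch_at(padded, 0, 0)
print(patch.shape)  # (7, 7, 22)
```

Padding once and slicing afterwards keeps the per-sample cost low, which matters when patches are drawn for every labeled pixel in a batch of 128.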

Sample Dataset Generation
Based on the procedure outlined in Section 4.1, a sample dataset was constructed. Firstly, the TASI image was divided into six patches according to the size of the image. The HySime algorithm was then used to determine the number of lithological endmember spectra for each patch. To ensure the inclusion of all crucial and significant spectra, eight lithological endmember spectra were extracted from each patch, totaling 48 spectra.
Secondly, these spectra were classified into six classes using the K-means algorithm. Two representative lithological endmember spectra were selected from each class using the EAR algorithm, considering the complexity of the geological conditions in the study area.
Thirdly, the measured infrared spectra were resampled to match the TASI bands, and these spectra were used to determine the type of representative lithological endmember spectra. The six sets of spectral curves are shown in Figure 6. Specifically, the first set of spectra corresponds to slate, exhibiting a spectral signature of quartz with a minimum emission near 9 µm, as illustrated in Figure 6a. The second set of spectra represents granite, displaying a broad emission signature in the 8.5–10 µm range, as shown in Figure 6b. The third set of spectra corresponds to granodiorite, as depicted in Figure 6c, exhibiting a distinct emission feature between 9 and 9.5 µm. The fourth set of spectra corresponds to diorite, as shown in Figure 6d, with a minimum emission near 9.5 µm. The fifth set of spectra is similar to marble, as shown in Figure 6e; the marble spectrum is generally smooth without obvious spectral features. As shown in Figure 6f, the sixth set of spectra does not have a direct match with the measured TIR spectra. A brief mapping of the TASI data using these spectra reveals that the sixth set corresponds to the quaternary sediments in the image.
Fourth, the abundance maps were generated using FCLS. Then, these maps were processed to produce the initial labeled samples. In this process, the given value was set to 99.6%, which enables more accurate samples to be obtained. These initial samples can be further corrected and optimized according to the ground-truth data. In particular, if labeled samples are present in and around the measured data, the results are retained. Otherwise, the initial results are supplemented. Figure 7 shows the samples identified using MLS3.

Lithological Mapping Results of DLs
These samples were divided into a training set, a validation set, and a test set in the ratio of 6:2:2, as shown in Table 1. The performance of the lithological mapping models is evaluated using overall accuracy (OA), user's accuracy (UA), producer's accuracy (PA), and the kappa coefficient (Kappa).
Figure 8 shows the mapped results obtained from different DL algorithms. To highlight these differences, four locally enlarged patches, indicated by distinct colors (red, green, blue, and yellow), are situated on the right side of each image. Visually, the 2D-CNN algorithm demonstrates the most significant deviation between its results and the reference map. This discrepancy arises from its structure, which includes only two convolutional layers for feature extraction. This limits the feature extraction capability of the algorithm and results in a high volume of misidentifications. HybridSN performs closer to the reference image than 2D-CNN because HybridSN considers both the spatial and spectral features of lithology. MSRN, SSRN, and SPRN, particularly the SPRN algorithm, demonstrate clearer lithological boundaries and superior results compared to those of 2D-CNN and HybridSN. SPRN outperforms other methods by reducing the input dimension of each CNN and by fusing local spectral features with global features to obtain more accurate semantic information. Therefore, SPRN contributes to superior performance and delivers the most accurate result.
Table 2 provides the mapping accuracy of various algorithms. Quantitatively, compared to 2D-CNN, HybridSN, MSRN, and SSRN, SPRN performs notably better, with OA values higher by 14.89%, 10.22%, 5.76%, and 1.34%, respectively. This improvement is further reflected in the Kappa values, which are higher by 0.2085, 0.1469, 0.0843, and 0.0237, respectively. In the slate type, both the PA and UA of SSRN and SPRN are higher than those of the other algorithms. Notably, the UA of SPRN outperforms all other algorithms at 95.27%. For categories such as granite, granodiorite, and quaternary sediments, SPRN demonstrates significantly superior PA and UA results compared to those of the other methods. In the diorite type, SPRN surpasses other algorithms with a PA value of 94.43%. As for the marble type, SPRN leads with a UA value of 58.39%. Therefore, SPRN consistently delivers higher mapping accuracy due to its robust feature extraction capability.
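For readers less familiar with these measures, the following minimal sketch shows how OA, PA, UA, and Kappa are commonly derived from a confusion matrix. It assumes rows hold the reference classes and columns the predicted classes; the function name `accuracy_metrics` and the two-class counts are made up for illustration and are not taken from Table 2.

```python
def accuracy_metrics(cm):
    """Return (OA, PA per class, UA per class, Kappa) for a square confusion matrix.

    Assumed convention: cm[i][j] is the number of reference-class-i samples
    that were predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(n)]
    row_sums = [sum(cm[i]) for i in range(n)]                        # reference totals
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # predicted totals

    oa = sum(diag) / total
    # Producer's accuracy: correct / reference total (sensitive to omission errors).
    pa = [diag[i] / row_sums[i] if row_sums[i] else 0.0 for i in range(n)]
    # User's accuracy: correct / predicted total (sensitive to commission errors).
    ua = [diag[j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
    # Expected chance agreement used by the kappa coefficient.
    pe = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0
    return oa, pa, ua, kappa


# Hypothetical two-class example.
oa, pa, ua, kappa = accuracy_metrics([[50, 10], [5, 35]])
```

A low UA for a class means many pixels assigned to that class belong to another lithology, whereas a low PA means many reference pixels of that class were missed.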

Comparison of Sample Collection Methods
To demonstrate the effectiveness of the MLS3 method described in this paper for lithological mapping, we conducted a comparative analysis with three representative sample collection methods.

• ROI: ROI selects patches from the image as samples based on user selection.

• Spectral angle mapping (SAM): SAM determines the samples by comparing the angle between the ground-measured spectra (used as the reference spectra) and the pixel spectra. Given that slate, granite, and diorite each have numerous field-measured spectra, selecting appropriate reference spectra becomes challenging. In our study, we selected the measured spectra that exhibit the highest correlation with the image endmember spectra to ensure the greatest similarity between the reference spectra and the pixel spectra. Notably, quaternary sediments lack matching field-measured spectra, so their lithological endmember spectrum is used instead.

• Spectral unmixing (SU): SU extracts one spectrum for each lithology and uses FCLS to generate abundance maps. It utilizes the abundance map to select samples for each class.
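The angle comparison used by SAM can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and the angle threshold `max_angle` are hypothetical, since the threshold used in the experiments is not given in the text.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def select_by_angle(pixels, reference, max_angle):
    """Indices of pixels whose angle to the reference is at most max_angle.

    max_angle is a hypothetical, user-chosen threshold in radians.
    """
    return [i for i, p in enumerate(pixels)
            if spectral_angle(p, reference) <= max_angle]


# Toy spectra with two bands each.
toy_pixels = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
selected = select_by_angle(toy_pixels, reference=[1.0, 0.0], max_angle=0.8)
```

Because the angle ignores vector length, a pixel whose spectrum is a scaled copy of the reference has an angle of zero, which is why SAM is relatively insensitive to overall brightness differences.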
To ensure a fair comparison, the samples from ROI and MLS3 were selected to match in number, and their spatial distributions were made as similar as possible. Furthermore, the samples obtained through both SAM and SU methods underwent the same correction procedures based on ground-truth data. Table 3 lists the number of samples obtained through the various sample dataset construction methods. Table 3 shows that the total number of samples for these lithologies is approximately 13,000. Among the methods, SU procures the highest number of samples, while SAM yields the fewest. MLS3 and ROI obtain a moderate number of samples. For each lithology type, the number of samples collected by ROI, SU, and MLS3 is generally comparable, while SAM collects significantly fewer samples than the other methods. Figure 9 illustrates the spatial distribution of the samples obtained using different algorithms. To highlight the differences, two locally enlarged patches are provided on the right side of each image, indicated by distinct colors (red and green). Panels (a)–(d) of Figure 9 show the sample datasets obtained by ROI, SAM, SU, and MLS3, respectively. Subsequently, these samples were input into the SPRN classifier. The effectiveness of the sample dataset construction algorithms is evaluated based on the mapping accuracy achieved by SPRN.
Figure 10 presents the mapped results obtained using different sample acquisition methods. To highlight the differences, four zoomed-in patches are displayed on the right side of each map, distinguished by unique colors (red, green, blue, and yellow). As shown in Figure 10, the result generated by MLS3 demonstrates a closer alignment with the reference map, revealing more distinct lithological boundaries. In contrast, the results from ROI, SAM, and SU exhibit more misclassification and low spatial aggregation, indicating that the samples obtained through MLS3 are more representative and provide more accurate mapping results.
It is evident from Table 4 that MLS3 achieves a higher OA, with improvements of 2.25%, 6.96%, and 3.33% over ROI, SAM, and SU, respectively. Additionally, MLS3 exhibits a superior Kappa, with improvements of 0.0413, 0.0961, and 0.047 compared to the aforementioned methods. For the slate and granite types, the UA values obtained by MLS3 are significantly higher than those obtained by the other algorithms, suggesting that MLS3 rarely misclassifies other lithologies into these two lithologies. For the granodiorite, diorite, marble, and quaternary sediment types, the PA values obtained by MLS3 are considerably higher than those of the other algorithms, indicating that MLS3 rarely omits these lithologies. This underscores the strong results yielded by the proposed sample dataset construction method.
Before implementing the lithological map creation algorithms, a large number of training samples need to be acquired [56,57]. The choice of spectra is essential throughout the entire MLS3. Hyperspectral image mapping typically uses ground truth or laboratory spectra as the reference spectra for lithological mapping [58]. However, environmental, climatic, and temporal factors can cause significant differences between ground-truth or laboratory spectra and image spectra. Furthermore, variations in the spectra of the same class at different locations in the image cannot be avoided. To address these issues, various effective algorithms, such as SMACC, K-means, SA, EAR, and FCLS, are applied to help select the most representative lithological endmember spectra, ensuring the chosen samples are appropriate for each class. During the process, the optimal number of lithological spectra and the given threshold for the abundance image are the two primary factors. In general, the number of spectra required per class should be determined by the actual geological conditions. In this study, two spectra per class are sufficient for lithological identification. Moreover, the given threshold of 99.6% is appropriate as it ensures the quantity of samples while maintaining their quality.
In evaluating different methods for constructing sample datasets, it is evident that each approach has unique challenges and limitations. While the quantity of samples obtained through ROI matches that of MLS3, the ROI results are highly susceptible to human influence. Different investigators may choose different samples, potentially affecting the accuracy of lithological mapping. SAM obtains the smallest number of samples. Because SAM uses measured spectra to select samples, there may be inaccuracies in comparing these measured spectra to pixel spectra in the images, leading to unsatisfactory results. Even though SU obtains a larger number of samples than the other methods, it overlooks the variability of lithologies across different regions and the variations in endmember spectra within the same class, leading to poor sample quality. In contrast, our proposed MLS3 method considers these factors. Although it generates fewer samples than SU, the quality of the samples is significantly higher. Consequently, MLS3 ensures both satisfactory sample quantity and quality.
Furthermore, the application of MLS3 requires geological knowledge of the study area to achieve more accurate lithological mapping results. The algorithms used in this study rely on geo-specific measurements, such as high-quality field spectral data, lithology types of field-measured points, and accurate GPS coordinates, to produce more reliable results. In situations where field data are unavailable, geological maps remain a plausible alternative, providing insight into the local geological context.

DL Algorithmic Considerations
As for the lithological mapping models, five state-of-the-art CNN models were selected from the existing literature and tested using samples obtained from MLS3 based on the airborne TASI hyperspectral data. The experimental results indicate that all models performed well, achieving good lithological mapping results. However, the UA values obtained by these algorithms are imbalanced across categories. For instance, SPRN has a UA of only 31.65% for granodiorite. The UA for marble and quaternary sediments hovers at around 55%, implying misidentification. One potential contributor to this discrepancy could be class imbalance. Slate and granite are abundant throughout the study area, while granodiorite and marble appear in lesser quantities. This imbalance may make it difficult for the algorithm to establish relationships between samples, leading to misidentification. Another aspect is the mineralogical similarity between certain lithologies, such as granite and granodiorite. This resemblance can lead to instances of granite being incorrectly classified as granodiorite. Therefore, future efforts should optimize the training process of the models and change the learning strategy [59] to improve the performance of CNNs, or explore other types of neural network models, such as graph convolutional networks (GCN) [60] and transformers [61]. In addition, the combination of multi-source data has been shown to improve the accuracy of lithological mapping, so using data from multiple sources could be explored in the future.
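One common way to counter class imbalance, shown here purely as an illustration and not as the learning strategy of [59] or a method used in this study, is to weight each class inversely to its sample count when computing the training loss. The function name and the sample counts below are hypothetical.

```python
def inverse_frequency_weights(counts):
    """Map each class to total / (num_classes * class_count).

    counts: dict of class name -> number of training samples.
    Rare classes receive weights above 1, abundant classes below 1.
    """
    if any(n <= 0 for n in counts.values()):
        raise ValueError("every class needs at least one sample")
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


# Hypothetical counts, not the values reported in Table 1.
weights = inverse_frequency_weights({"slate": 600, "granite": 300, "marble": 100})
```

With these made-up counts, a misclassified marble sample would contribute roughly six times as much to a weighted loss as a misclassified slate sample.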

Conclusions
This study explores the practical challenges encountered in leveraging DL for generating lithological maps. These challenges include the difficulty in acquiring representative samples arising from inadequate verifiability, subjective bias, and differences in the spectra of the same class at different locations in the hyperspectral image. We evaluated the efficacy of the proposed MLS3 and tested the abilities of different DL models utilizing TASI data gathered from the Liuyuan area in Gansu Province, China. Based on both theoretical and empirical results, we draw the following conclusions: (1) MLS3 considers the potential differences in spectra of the same lithology, reduces the influence of subjective factors, and achieves an overall accuracy 2.25–6.96% higher than other sample collection methods. In general, MLS3 is designed to generate labeled samples in a more scientific and comprehensive manner. (2) MLS3 can be successfully applied to various DL models to enhance the performance of lithological mapping. In particular, SPRN shows the best result among the CNN methods, with an OA of 84.03% and a Kappa of 0.7416. SPRN improves the lithological mapping task due to its strong learning capabilities.
The above results show excellent mapping accuracy and offer a possible solution for DL-based lithological mapping when samples are scarce. However, improving the accuracy of lithological mapping remains a challenging task. In future work, more complex DL models and multi-source remote sensing data will be tried for experimental applications and evaluations.

Data Availability Statement:
The datasets presented in this article are not readily available. Further inquiries can be directed to the corresponding author.

Figure 1. Map of study area locations: (a) Guazhou County within Gansu Province, China; (b) study area in Liuyuan Town, Guazhou County; (c) color composite hyperspectral image and measured points in the field.

Figure 2. Hand specimen images and field photographs. (a–d) Hand specimen images. These hand specimens were obtained from points 12, 15, 19, and 21, respectively; (e,f) Field photographs. These field photographs were taken at points 14 and 15, respectively.

Figure 3. Annotation map of the study area (modified according to [9,41] and ground-truth information from Figure 1c).

Figure 4. Workflow of the experimental process in this paper.

Figure 5. Overview of MLS3 for sample dataset construction.

Algorithm 1. The sample selection using the abundance map
Input: Abundance map U (m, h, b), user-defined given value T
thresholds = []
# Compute the cumulative pixel percentage of the histogram and determine thresholds.
for band in range(b):
    histogram, bins = np.histogram(U[:, :, band].flatten(), bins=255)
    cumulative_pixel_percentage = np.cumsum(histogram / np.sum(histogram) * 100)
    indexes = np.argmax(cumulative_pixel_percentage >= T)
    thresholds.append(bins[indexes])
end for
# Pixels in U that are greater than or equal to the thresholds in each band are marked with the band index, otherwise they are marked as 0.
outputs_list = []
for band in range(b):
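Algorithm 1 can be sketched as runnable Python as follows. This is a minimal sketch, not the authors' implementation: the second loop is truncated in the listing, so its body is completed here from the accompanying comment, and labeling pixels with band + 1 (so that 0 stays reserved for unlabeled pixels) is an assumption. The function name `select_samples_from_abundance` and the toy abundance cube are hypothetical.

```python
import numpy as np


def select_samples_from_abundance(U, T):
    """Threshold each abundance band at the T-th cumulative histogram percentage.

    U: abundance cube of shape (rows, cols, b).
    T: cumulative pixel percentage, e.g. 99.6.
    Returns the per-band thresholds and one label map per band.
    """
    b = U.shape[2]
    thresholds = []
    for band in range(b):
        histogram, bins = np.histogram(U[:, :, band].ravel(), bins=255)
        cumulative = np.cumsum(histogram / histogram.sum() * 100)
        index = int(np.argmax(cumulative >= T))  # first bin that reaches T percent
        thresholds.append(bins[index])           # left edge of that bin

    outputs_list = []
    for band in range(b):
        # Assumption: label with band + 1 so that 0 means "not selected".
        labels = np.where(U[:, :, band] >= thresholds[band], band + 1, 0)
        outputs_list.append(labels)
    return thresholds, outputs_list


# Toy cube: band 0 ramps from 0 to 1, band 1 is the reversed ramp.
ramp = (np.arange(100) / 99.0).reshape(10, 10)
toy_U = np.stack([ramp, ramp[::-1, ::-1]], axis=2)
toy_thresholds, toy_outputs = select_samples_from_abundance(toy_U, T=90)
```

A higher T keeps only the pixels with the largest abundances in each band, which trades sample quantity for purity.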

Figure 6. Six groups of extracted lithological spectra (red and green lines) and their closest match from the field spectra data (black line): (a) matching results for the first group of spectra; (b) matching results for the second group of spectra; (c) matching results for the third group of spectra; (d) matching results for the fourth group of spectra; (e) matching results for the fifth group of spectra; (f) the sixth group of extracted lithological spectra.

Figure 7. Sample distribution map of the study area.

Author Contributions:
Conceptualization, H.L.; methodology, H.L. and K.W.; software, H.L.; formal analysis, H.L., K.W. and D.Z.; writing-original draft preparation, H.L.; writing-review and editing, H.L., K.W., D.Z. and Y.X.; funding acquisition, K.W. All authors have read and agreed to the published version of the manuscript.

Funding:
This work was supported by the National Natural Science Foundation of China, grant number U21A2013; the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan), grant number 2642022009; the Open Fund of State Key Laboratory of Remote Sensing Science, grant number OFSLRSS202312; the Global Change and Air-Sea Interaction II, grant number GASI-01-DLYG-WIND0; the Open Fund of Wenzhou Future City Research Institute, grant number WL2023007; the Foundation of State Key Laboratory of Public Big Data, grant number PBD2023-28; the Open Fund of Key Laboratory of Regional Development and Environmental Response, grant number 2023(A)003; the Hebei Key Laboratory of Ocean Dynamics, Resources and Environments, grant number HBHY2302; and the Open Fund of Key Laboratory of Space Ocean Remote Sensing and Application, MNR, grant number 202401001.

Table 1. Sample dataset for the study area.

Table 2. Mapped results of different DL methods.

Table 3. Sample size for different sample dataset construction methods.

Table 4. Mapped results of different sample acquisition methods.

Table 4 compares the mapping accuracies achieved by different sample dataset construction methods.
