Spatial Validation of Spectral Unmixing Results: A Systematic Review

The pixels of remote images often contain more than one distinct material (mixed pixels), and so their spectra are characterized by a mixture of spectral signals. Since 1971, a shared research effort has enabled the development of techniques for retrieving information from mixed pixels. The most analyzed, implemented, and employed procedure is spectral unmixing. Among the extensive literature on spectral unmixing, nineteen reviews were identified, and each highlighted the many shortcomings of spatial validation. Although an overview of the approaches used to spatially validate spectral unmixing results could be very helpful in overcoming these shortcomings, such a review has never been provided. Therefore, this systematic review provides an updated overview of the approaches used, analyzing the papers published in 2022, 2021, and 2020, and a dated overview, analyzing the papers published not only in 2011 and 2010, but also in 1996 and 1995. The key criterion is that the results of the spectral unmixing were spatially validated. The Web of Science and Scopus databases were searched, using all the names that have been assigned to spectral unmixing as keywords. A total of 454 eligible papers were included in this systematic review. Their analysis revealed that six key issues in spatial validation were considered and differently addressed: the number of validated endmembers; sample sizes and sampling designs of the reference data; sources of the reference data; the creation of reference fractional abundance maps; the validation of the reference data with other reference data; and the minimization and evaluation of the errors in co-localization and spatial resampling. Since addressing these key issues enabled the authors to overcome some of the shortcomings of spatial validation, it is recommended that all of them be addressed together.
However, few authors addressed all the key issues together, and many authors did not specify the spatial validation approach used or did not adequately explain the methods employed.


Background
A pixel that contains more than one "land-cover type" is defined as a mixed pixel, and its spectrum is formed by combining the spectral signatures of these "land-cover types" [1]. The presence of mixed pixels constrains the techniques that can be used to analyze, characterize, and classify remote sensing images [2,3]. To retrieve mixed-pixel information from remote sensing images, a shared research effort has enabled the development of several methods (e.g., spectral unmixing, probabilistic, geometric-optical, stochastic geometric, and fuzzy models [1]). However, the literature shows that, for over 40 years, spectral unmixing has been the most commonly used method for the discrimination, detection, and classification of surface materials [4][5][6].
Spectral unmixing was defined as the "procedure by which the measured spectrum of a mixed pixel is decomposed into a collection of constituent spectra, or endmembers, and a set of corresponding fractions, or abundances, that indicate the proportion of each endmember present in the pixel" [6]. It is important to point out that many names have been given to this procedure in the literature.
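Under the linear mixing model that this definition describes, the abundances can be recovered, for example, by nonnegative least squares followed by sum-to-one normalization. The sketch below is a simplified illustration of this idea, not the fully constrained method of [33]; the endmember matrix and mixed spectrum are synthetic.

```python
import numpy as np
from scipy.optimize import nnls

def unmix_pixel(spectrum, endmembers):
    """Estimate fractional abundances of a mixed pixel.

    spectrum:   (bands,) measured mixed-pixel spectrum
    endmembers: (bands, n_endmembers) matrix of pure constituent spectra
    Returns nonnegative abundances normalized to sum to one.
    """
    abundances, _ = nnls(endmembers, spectrum)  # nonnegativity constraint
    total = abundances.sum()
    return abundances / total if total > 0 else abundances  # sum-to-one

# Synthetic example: a pixel that is 70% endmember A and 30% endmember B
E = np.array([[0.1, 0.9],
              [0.4, 0.6],
              [0.8, 0.2]])            # 3 bands, 2 endmembers
x = 0.7 * E[:, 0] + 0.3 * E[:, 1]     # noise-free mixed spectrum
print(unmix_pixel(x, E))              # ≈ [0.7 0.3]
```

In real scenes, the noise term and imperfect endmember spectra make the recovered fractions only approximate, which is precisely why their spatial validation matters.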

Reviews on the Spectral Unmixing Procedure
In order to more effectively convey the importance of spectral unmixing, a quantification of the works that have studied, implemented, and applied this procedure since 1971 was provided. For this purpose, all the names that were given to the spectral unmixing procedure were used as terms in the search strategy. A total of 5768 and 5852 papers were identified using the Web of Science and Scopus search engines, respectively (accessed on 19 May 2023). Among these papers, 19 reviews offered the status of spectral unmixing (Table 2).

Table 2. Reviews on the spectral unmixing procedure.

An interesting overview of the "linear models" developed up to 1996 was offered by Ichoku & Karnieli [1], who compared this method with four other unmixing models: probabilistic, geometric-optical, stochastic geometric, and fuzzy models. The authors concluded that the evaluated spatial accuracies were not representative of the real accuracies at the level of individual pixels, because the spatial validation was performed on only a few test pixels.
Heinz & Chein-I Chang [33] focused on the second constraint of linear spectral mixture analysis (i.e., the fractional abundances of each mixed pixel must be positive), which is very difficult to implement in practice. Reviewing the literature, the authors pointed out that, because most researchers did not know in detail the spectra present in the image scene, their results did not necessarily reflect the true abundance fractions of the materials [33].
Keshava [42] exploited hierarchical taxonomies to facilitate comparison of the wide variety of methods used for spectral unmixing and to reveal their similarities and differences. Furthermore, the author reiterated that most of the problems these methods were developed to solve were due to a lack of detailed knowledge of the ground truth. In their extensive description of spectral unmixing methodology, Keshava and Mustard [6] focused on the processing chain of linear unmixing methods applied to hyperspectral data. The authors highlighted that the shortcomings in spatial validation were due to the lack of detailed ground-truth knowledge; for this reason, the main focus of the research was on determining endmembers, rather than on recovering fractional abundance maps [6].

Screening and Eligible Criteria
The abstracts of the identified papers were read to select only those that applied spectral unmixing to remote images. Excluding duplicates, 760 papers were selected in the first screening (orange box in Figure 1): 535 papers published in 2022, 2021, and 2020; 186 papers published in 2011 and 2010; and 100 papers published in 1996 and 1995.
The full texts of the screened papers were read to identify only those that spatially validated the spectral unmixing results (bright red box in Figure 1). This last analysis identified the eligible papers: 326 papers published in 2022, 2021, and 2020; 112 papers published in 2011 and 2010; and 16 papers published in 1996 and 1995.
In conclusion, 454 eligible papers were included in this systematic review. In Appendix A, Tables A1-A7 summarize the characteristics of the eligible papers that were published in 2022, 2021, 2020, 2011, 2010, 1996, and 1995, respectively.
The analysis of these data showed that most studies that analyzed hyperspectral images were performed at the local scale and did not carry out multitemporal studies, whereas most studies that analyzed multispectral images were performed at the regional or continental scale and carried out multitemporal studies (more than one image was analyzed). Therefore, spectral unmixing is widely applied to multispectral images, despite their smaller number of bands compared with hyperspectral images, because these data are characterized by greater spatial and temporal availability than hyperspectral data.
Moreover, spectral unmixing was also applied to some hyperspectral and multispectral images characterized by high spatial resolutions (e.g., an AMMIS image with a spatial resolution of 0.5 m [56] and a WorldView-3 image with a spatial resolution of 0.31 m [166]). These papers confirm that, no matter how high the spatial resolution might be, no image pixel is completely homogeneous in its spectral characteristics [9,516,517].

Accuracy Metrics
Accuracy, which is defined as "the degree of correctness of the map", is usually assessed by comparing the "ground truth" with the map retrieved from remote images [518,519]. Because no map can fully and completely map the territory [520], ground truth is more correctly called reference data [521]. To assess the differences between the reference data and the results of the spectral unmixing, the eligible papers exploited different metrics. Figure 3 shows the pie chart of the distribution of the metrics that were adopted by the eligible papers. The other 14 metrics were average accuracy [522], correct labeling percentage for the unchanged pixels [141], correlation coefficient [150], Kling-Gupta efficiency [523], mean abundance error [117], mean error [169], mean relative error [169], normalized average of spectral similarity measures [524], producer's accuracy [153], the Receiver Operating Characteristic (ROC) curve method [525], relative mean bias [165], separability spectral index [526], signal-to-reconstruction error [56], and systematic error [109].
In conclusion, the authors of the 454 eligible papers employed 22 different metrics, and most authors employed more than one metric. Overall, 25% of the eligible papers did not specify the accuracy metrics used. It is very important to note that some standard accuracy assessments, such as the kappa coefficient, "assume implicitly that each of the testing samples is pure"; therefore, some of these metrics were inappropriate for evaluating the accuracy of the fractional abundance maps [41,518].
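For illustration, two metrics of the kind adopted by the eligible papers, the root-mean-square error and the mean abundance error, can be computed over fractional abundance maps as follows. This is a minimal sketch with hypothetical arrays, not the code of any eligible paper.

```python
import numpy as np

def rmse(retrieved, reference):
    """Root-mean-square error between fractional abundance maps."""
    return float(np.sqrt(np.mean((retrieved - reference) ** 2)))

def mean_abundance_error(retrieved, reference):
    """Mean absolute difference of fractional abundances."""
    return float(np.mean(np.abs(retrieved - reference)))

# Hypothetical 2x2 fractional abundance maps for one endmember
ref = np.array([[1.0, 0.5], [0.0, 0.25]])
ret = np.array([[0.9, 0.5], [0.1, 0.25]])
print(rmse(ret, ref))                  # ≈ 0.0707
print(mean_abundance_error(ret, ref))  # 0.05
```

Unlike the kappa coefficient, both metrics operate directly on continuous abundance values, so they remain meaningful when the testing samples are mixed rather than pure.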

Key Issues in the Spatial Validation
Since the literature highlighted many sources of error in the accuracy assessment of retrieved maps [518,519,521], the authors identified and addressed several "key issues" in order to minimize these errors. Figure 4 and Tables A1-A7 summarize the key issues that were identified.

Validated Endmembers
Before analyzing the endmembers that were validated, it is necessary to remember that the number of endmembers that can be determined from an image must be less than the number of sensor bands; therefore, the number of endmembers determined with multispectral data is less than the number determined with hyperspectral data [6,23,527]. Consequently, the authors who elaborated multispectral images employed lower levels of model complexity than the authors who elaborated hyperspectral images [528,529]. For example, the VIS model was used to map only three endmembers (Vegetation, Impervious surfaces, and Soil) in many urban areas that were retrieved from multispectral data (e.g., [109,152,477,493]).
The third columns of Tables A1-A7 list the endmembers that were determined using spectral unmixing; the fourth columns show how many of these endmembers were validated. It is interesting to note that some authors validated a smaller number of endmembers than the number that were determined (i.e., 40 eligible papers). Separating the works that analyzed hyperspectral images from those that analyzed multispectral data, Figure 5 shows the percentage of studies that validated the total or partial number of endmembers. It is important to highlight that, since 4 eligible papers analyzed both hyperspectral and multispectral data [104,227,231,281], the sum of the papers that analyzed hyperspectral data and the papers that analyzed multispectral data (i.e., 458) is greater than the number of eligible papers (i.e., 454). Only 2% of the studies that elaborated hyperspectral images partially validated the determined endmembers, whereas 18% of the studies that elaborated multispectral images did so. As mentioned above, hyperspectral images were mostly used to carry out local-scale studies that were not repeated over time (252 papers of a total of 262), whereas most multispectral images were used to carry out regional- or continental-scale studies that were or were not repeated over time (180 papers of a total of 196). Therefore, some of these authors, who analyzed more than one image, chose to spatially validate only the materials or groups of materials on which they focused their study. For example, Hu et al. [149] spatially validated only the blue ice fractional abundance maps that were retrieved from MODIS images covering the period 2000-2021 in order to present a FABIAN (Fractional Austral-summer Blue Ice over Antarctica) product. It should be noted that 5 and 12% of the papers that analyzed hyperspectral or multispectral data, respectively, did not specify which endmembers were validated.

Sampling Designs for the Reference Data
The literature demonstrated that a possible source of error in spatial validation is the choice of the sampling design for the reference data [518,519,521,530]. The sampling design mainly includes the definition of the sample size and the spatial distribution of the reference data [518]. The authors of the eligible papers chose three kinds of sample sizes: the whole study area; a representative area; and small sample sizes (pixel, plot, and polygon samples). The eighth columns of Tables A1-A7 show the sample sizes adopted by every eligible paper, and Table 10 shows the number of papers that adopted each sample size. Most authors of the eligible papers chose to validate the whole study area, followed, in descending order, by those who employed small sample sizes and then by those who employed representative areas. It is also important to note the high percentages of papers that did not specify the sample size of the reference data: 18, 11, and 31%, respectively.
The literature also pointed out that the sampling designs for spatially validating maps at the local scale cannot be the same as the designs for spatially validating maps at the regional or continental scale [518,530]. As mentioned above, most of the studies that analyzed hyperspectral data were performed at the local scale (252 papers of a total of 262), whereas most of the studies that analyzed multispectral images were performed at the regional or continental scale (180 papers of a total of 196). Therefore, the eligible papers that analyzed hyperspectral images were analyzed separately from those that analyzed multispectral images (Figure 6, right and left, respectively), not only to analyze the different sampling designs adopted for the hyperspectral and multispectral data, but also to highlight the different sampling designs chosen for local- or regional/continental-scale studies. Most papers that processed hyperspectral images validated the whole study area (212 papers), whereas most papers that processed multispectral images employed small sample sizes (94 papers).
The authors of the eligible papers that employed small sample sizes adopted three different sampling designs for the reference data: partial, random, and uniform. The ninth columns of Tables A1-A7 show the sampling designs of every eligible paper. Most authors who published in 2022, 2021, and 2020 and in 2011 and 2010 chose a random distribution of the reference data (78% of a total of 326 papers and 76% of a total of 110 papers, respectively), whereas the authors who published in 1996 and 1995 did not specify the sampling designs employed. Stehman and Foody [519] highlighted that "the most commonly used designs" chosen to assess land cover products were "simple random, stratified random, systematic, and cluster" designs. Therefore, these results confirmed that random designs were the most commonly used approaches.
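For illustration, a simple random sampling design of validation pixels, the first of the designs listed by Stehman and Foody [519], can be sketched as follows; the map size, sample count, and seed are hypothetical, and stratified or systematic designs would replace the selection step.

```python
import numpy as np

rng = np.random.default_rng(seed=42)  # fixed seed for reproducibility

def simple_random_sample(n_rows, n_cols, n_samples):
    """Draw distinct pixel coordinates for validation, without replacement."""
    flat = rng.choice(n_rows * n_cols, size=n_samples, replace=False)
    return np.column_stack(np.unravel_index(flat, (n_rows, n_cols)))

# 50 validation pixels drawn at random from a hypothetical 100 x 100 map
samples = simple_random_sample(100, 100, 50)
print(samples.shape)  # (50, 2): one (row, col) pair per validation pixel
```

Sampling without replacement over the flattened grid guarantees that no pixel is validated twice, which keeps the accuracy estimates unbiased under the simple random design.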

Sources of the Reference Data
The eligible papers employed four different sources of reference data to spatially validate the spectral unmixing results: images, in situ data, maps, and previous reference maps. Table 11 shows the number of eligible papers that employed each of these reference data sources, whereas the fifth columns of Tables A1-A7 detail the sources of the reference data. The number of authors who chose to utilize geological, land use, or land cover maps as reference maps is the smallest (5% of the total eligible papers), followed, in ascending order, by the number who chose to create the reference maps using in situ data (20% of the total eligible papers), and then by the number of authors who chose to create the reference maps using other images (31% of the total eligible papers). Finally, the number of authors who chose to use the previous reference maps is the largest (44% of the total eligible papers).
As regards the authors who chose to create the reference maps using other images, most of them employed images at higher spatial resolutions than those of the remote images analyzed (95% of a total of 143 papers). To create the reference maps from the images, 47% of the eligible papers did not specify the method used to map the endmembers, 29% employed photo-interpretation, 21% classified the images, 2% used vegetation indices, and 2% used a mixed approach that combined classification, photo-interpretation, and/or vegetation indices (e.g., [114,145,531]). As regards the classification methods, four works applied the same classification procedure to analyze the remote images and to create the reference maps [65,66,149,261]. Among these, the authors of 3 papers compared the fractional abundance maps retrieved from multispectral images at moderate spatial resolutions (10, 30, and 60 m) with the fractional abundance maps retrieved from multispectral data at coarse spatial resolutions (0.5 and 1 km) [65,66,149].
Moreover, the reference data sources chosen to validate the results of the hyperspectral images were analyzed separately from those chosen to validate the results of the multispectral images. Figure 7 shows the percentage of papers that adopted the different sources of reference data to validate the results of hyperspectral (right) and multispectral data (left). As regards the papers that analyzed multispectral data, most of the authors chose to create the reference maps from other images, whereas most of the authors who analyzed hyperspectral data chose to employ the previous reference maps. It is important to emphasize that 97% of these reference maps are available online together with the hyperspectral images and/or reference spectral libraries (e.g., [532-535]; Figure 8). Therefore, these images are well known: the Cuprite (Nevada, USA, e.g., [70,458]), Indian Pines (Indiana, USA, e.g., [78,458]), Jasper Ridge (California, USA, e.g., [68,97]), and Salinas Valley (California, USA, e.g., [75,78]) datasets that were acquired with AVIRIS sensors; the Pavia (Italy, e.g., [81,85]) datasets that were acquired with the ROSIS sensor; the Samson (Florida, USA, e.g., [59,89]) dataset that was acquired with the Samson sensor; the University of Houston (Texas, USA, e.g., [59,78]) dataset that was acquired with the CASI-1500 sensor; and the Urban (Texas, USA, e.g., [59,68]) and Washington DC Mall (Washington DC, USA, e.g., [81,90]) datasets that were acquired with the HYDICE sensor. Moreover, 93% of these papers proposed a method and tested it not only on these "real" hyperspectral data, but also on synthetic images. Borsoi et al. [4] highlighted that, in order to overcome "the difficulty in collecting ground truth data", some authors generated synthetic images. However, the authors lamented that "there is not a clearly agreed-upon protocol to generate realistic synthetic data" [4].

Reference Fractional Abundance Maps
"Misclassifications" of the reference data or "misallocations of the reference data" are another possible source of error in spatial validation, defined as "imperfect reference data" by [519] or "error magnitude" by [518]. The authors highlighted that these errors can be caused also by the use of "standard" reference maps to validate the spectral unmixing results (i.e., the fractional abundance maps) [41,518,519]. The difference between standard reference maps and reference fractional abundance maps is that each pixel of the standard reference map is assigned to a corresponding land cover class, whereas each pixel of the reference fractional abundance map is labeled with the fractional abundances of each endmember that is present in that pixel. Therefore, the values of the standard reference map are equal to 0 or 1, whereas the values of the reference fractional abundance map are greater than 2 and vary between 0 and 1 (100 values are able to fully validate the fractional abundance of endmembers [114]).
The reference fractional abundance maps were employed by 133 eligible papers published in 2022, 2021, and 2020; by 62 eligible papers published in 2011 and 2010; and by 13 eligible papers published in 1996 and 1995 (45% of the total eligible papers). Moreover, among these works, 87, 47, and 8 papers, respectively, estimated the full range of abundances using 100 values (31% of the total eligible papers), whereas 41, 10, and 5 works partially estimated the fractional abundances using fewer than 100 values (12% of the total eligible papers). It is important to note that 7% of the total eligible papers did not specify whether they used standard reference maps or reference fractional abundance maps.
The eligible papers were analyzed separately according to the reference data sources adopted, in order to find out how the fractional abundances were estimated. In the four parts of Figure 9, the eligible papers are clustered according to the reference data sources, and each part shows the percentage of papers that did not specify the reference maps used and the number of papers that fully or partially estimated the reference fractional abundance maps.
Figure 9. Distribution of the eligible papers that did not specify the reference maps used and that fully or partially estimated the fractional abundances, according to the reference data sources, where n is the total number of papers clustered according to the reference data sources and included in the pie charts: (a) the papers that employed the maps; (b) the papers that employed in situ data; (c) the papers that employed the images; (d) the papers that employed the previous reference maps.
High-spatial-resolution images were the most widely employed to make the reference fractional abundance maps (81% of the total papers that employed the images), followed by in situ data (68% of the total papers that employed in situ data), and then the maps (50% of the total papers that employed the maps). Moreover, in situ data were the most widely employed to estimate the full range of fractional abundances (62% of the total papers that employed in situ data), followed by high-spatial-resolution images (52% of the total papers that employed the images), and then the maps (21% of the total papers that employed the maps). The previous reference maps were not employed to make the reference fractional abundance maps.
Many authors highlighted that it is not easy to create the reference fractional abundance maps (e.g., [4,6,518,519]). Cavalli [145] implemented a method that was proposed by [537] in order to create the reference fractional abundance maps. This method creates the reference fractional abundance maps by varying the spatial resolution of the high-resolution reference maps several times, and the range of fractional abundances can be fully estimated according to the spatial resolution of the reference maps [114].
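The core of this kind of approach, degrading a high-resolution standard reference map to the spatial resolution of the unmixed image, can be sketched as follows. This is a simplified illustration under the assumptions of an integer resampling factor and a single-endmember binary map; it is not the implementation of [114,145,537].

```python
import numpy as np

def fractional_abundance_map(binary_map, factor):
    """Aggregate a high-resolution 0/1 class map into fractional abundances.

    binary_map: 2-D array of 0/1 labels for one endmember
    factor:     number of high-resolution pixels per coarse-pixel side
    """
    r, c = binary_map.shape
    assert r % factor == 0 and c % factor == 0, "map must tile evenly"
    # Group the map into factor x factor blocks; each block is one coarse pixel
    blocks = binary_map.reshape(r // factor, factor, c // factor, factor)
    return blocks.mean(axis=(1, 3))  # fraction of the class in each block

# A hypothetical 4x4 binary reference map aggregated to 2x2 abundances
high_res = np.array([[1, 1, 0, 0],
                     [1, 0, 0, 0],
                     [1, 1, 1, 1],
                     [1, 1, 0, 0]])
print(fractional_abundance_map(high_res, 2))
# [[0.75 0.  ]
#  [1.   0.5 ]]
```

With a factor of 10 per side, each coarse pixel averages 100 high-resolution pixels, which matches the 100 values needed to fully sample the 0-1 abundance range [114].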

Validation of the Reference Data with other Reference Data
In order to further minimize the errors due to the "misclassification" or "misallocation" of the reference data [518,519], some authors validated the reference data using other reference data: 61 eligible papers published in 2022, 2021, and 2020; 21 eligible papers published in 2011 and 2010; and 4 eligible papers published in 1996 and 1995. Therefore, 81% of the total eligible papers did not take into consideration that the reference map may not be "ground truth" and may be "imperfect" [519,520].
It is very important to point out that some authors took advantage of the online availability of reference data to validate their reference data (e.g., [114,123,127,140,145,152,231,448,496]). Many efforts are being made to create networks of accurate validation data [48,538-540]. For example, Zhao et al. [140] exploited in situ measurements of the Leaf Area Index (LAI) that were provided by the VALERI project [540], whereas Halbgewachs et al. [123], Lu et al. [423], Shimabukuro et al. [353], and Tarazona Coronel [127] utilized validation data provided by the Program for Monitoring Deforestation in the Brazilian Amazon (PRODES) [541].

Error in Co-Localization and Spatial Resampling
The key issues described above addressed only the errors in the thematic accuracy of the spectral unmixing results [518,519], whereas this key issue aimed to address the geometric errors due to the comparison of remote images with reference data [542]. The impact of co-localization and spatial resampling errors was minimized and/or evaluated by 6% of the eligible papers: 20 eligible papers published in 2022, 2021, and 2020; 8 eligible papers published in 2011 and 2010; and 1 eligible paper published in 1996. In order to minimize these errors, Arai et al. [368], Cao et al. [164], Li et al. [107], Soenen et al. [500], and Zurita-Milla et al. [419] carefully chose the size of the reference maps; Bair et al. [254], Cavalli [114,145], Ding et al. [152], and Fernandez-Garcia et al. [480] selected the size and the spatial resolution of the reference maps; Ben-Dor et al. [507], Fernandez-Guisuraga et al. [342], Kompella et al. [328], Laamarani et al. [343], and Plaza & Plaza [465] carefully co-localized the retrieved fractional abundance maps on the reference maps; Wang et al. [366] expanded the windows of the field sample size; and Zhu et al. [64] resampled the reference fractional abundance map at "four kinds of grids" (i.e., 1100 × 1100 m, 2200 × 2200 m, 4400 × 4400 m, and 8800 × 8800 m) and compared the results. Bair et al. [254], Binh et al. [341], Cavalli [114,145], Cheng et al. [543], and Ruescas et al. [448] evaluated the errors in co-localization and spatial resampling due to the comparison of different data at different spatial resolutions. Moreover, Cavalli [145] proposed a method to minimize these errors: the comparison of the histograms of the reference fractional abundance values with the histograms of the retrieved fractional abundance values.
It is important to point out that 94% of the total papers did not address the geometric errors due to the comparison of remote images with reference data.
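The histogram-based comparison proposed by Cavalli [145] can be sketched in simplified form as follows; this is a hypothetical illustration, not the paper's implementation, and the bin count and abundance arrays are assumptions. Because histograms discard pixel positions, the comparison is less sensitive to small co-localization errors than a pixel-by-pixel comparison.

```python
import numpy as np

def abundance_histograms(reference, retrieved, n_bins=10):
    """Compare distributions of fractional abundances over the 0-1 range.

    Returns the two normalized histograms and the sum of their absolute
    differences, which is insensitive to small co-localization shifts.
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    h_ref, _ = np.histogram(reference, bins=bins)
    h_ret, _ = np.histogram(retrieved, bins=bins)
    h_ref = h_ref / h_ref.sum()  # normalize counts to fractions
    h_ret = h_ret / h_ret.sum()
    return h_ref, h_ret, float(np.abs(h_ref - h_ret).sum())

# A positional shuffle of identical values leaves the histograms identical
ref = np.array([0.1, 0.4, 0.8, 0.4])
ret = np.array([0.4, 0.1, 0.4, 0.8])  # same values, different positions
_, _, diff = abundance_histograms(ref, ret)
print(diff)  # 0.0
```

A pixel-by-pixel metric such as the RMSE would report a large error for these two arrays even though the retrieved abundances are distributionally correct, which is the motivation for the histogram comparison.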

Conclusions
The term validation is defined as "the process of assessing, by independent means, the quality of the data products derived from the system outputs" by the Working Group on Calibration and Validation (WGCV) of the Committee on Earth Observation Satellites (CEOS) [48]. Since 1969, research efforts have sought to establish shared key issues for validating the land cover products retrieved from remote images [518,519,539,544]. These products can be obtained by applying classifications called "hard", because they extract information only from "pure pixels", and classifications called "soft", because they also extract information from "mixed pixels" [519,544].
However, not only the literature related to spatial validation, but also every review of the spectral unmixing procedure (i.e., a soft classification) highlighted that the key issues in the spatial validation of soft classification results have yet to be clearly established and shared (e.g., [4,6,518,519]).
Since no review had been performed on this fundamental topic, this systematic review aims (a) to identify and analyze how the authors addressed the spatial validation of spectral unmixing results and (b) to provide readers with recommendations for overcoming the many shortcomings of spatial validation and minimizing its errors. The papers published in 2022, 2021, and 2020 were considered to analyze the current status of spatial validation, and the papers published not only in 2011 and 2010, but also in 1996 and 1995, were considered to analyze its progress over time. Since the literature on spectral unmixing is extensive, only the papers published in these seven years were considered. A total of 454 eligible papers were included in this systematic review, and their analysis showed that the authors addressed six key issues in the spatial validation. In this text, the order in which the key issues are presented is not an order of importance.
1. The first key issue concerned the number of the endmembers validated. Some authors chose to focus on only one or two endmembers, and only these were spatially validated. This choice facilitated the conduct of regional- or continental-scale studies and/or multitemporal analyses. It is important to note that 8% of the eligible papers did not specify which endmembers were validated.
2. The second key issue concerned the sampling designs for the reference data. The authors who analyzed hyperspectral images preferred to validate the whole study area, whereas those who analyzed multispectral images preferred to validate small sample sizes that were randomly distributed. It is important to point out that 16% of the eligible papers did not specify the sampling designs for the reference data.
3. The third key issue concerned the reference data sources. The authors who analyzed hyperspectral images primarily used the previous reference maps and secondarily created reference maps using in situ data, whereas the authors who analyzed multispectral images chose to create reference maps primarily using high-spatial-resolution images and secondarily using in situ data.
4. The fourth key issue was, perhaps, the one most closely related to the spectral unmixing procedure; it concerned the creation of the reference fractional abundance maps. Only 45% of the eligible papers created the reference fractional abundance maps to spatially validate the retrieved fractional abundance maps. These papers mainly employed high-resolution images and secondarily in situ data. Therefore, 55% of the eligible papers did not specify the employment of the reference fractional abundance maps.
5. The fifth key issue concerned the validation of the reference data with other reference data; it was addressed by only 19% of the eligible papers. Therefore, 81% of the eligible papers did not validate the reference data.
6. The sixth key issue concerned the errors in co-localization and spatial resampling, which were minimized and/or evaluated by only 6% of the eligible papers. Therefore, 94% of the eligible papers did not address these errors.
In conclusion, to spatially validate the spectral unmixing results and to minimize and/or evaluate the related errors, six key issues were considered not only by the eligible papers published in 2022, 2021, and 2020, but also by those published in 2011, 2010, 1996, and 1995. The results obtained from both hyperspectral and multispectral data were spatially validated considering all the key issues, although these were addressed in different ways. Addressing all six key issues together enabled a rigorous spatial validation to be performed. Therefore, this systematic review provides readers with a suitable tool to rigorously address the spatial validation of spectral unmixing results and minimize its errors.
The key difference between the reference data suitable for hard and soft classifications is that, for the latter, the reference maps must have a higher spatial resolution than that of the image pixels [6,114,518]. The optimal scale would be 100 times larger than the image pixel resolution [114]. However, many hyperspectral data were validated using previous reference maps at the same spatial resolution as the remote image, so these standard reference maps can be turned into reference fractional abundance maps only with the help of other reference data. The employment of standard reference maps instead of reference fractional abundance maps was also evidenced by the employment of metrics to assess spatial accuracy that "assume implicitly that each of the testing samples is pure" [37,217].
However, only 4% of the eligible papers addressed every key issue, and many authors did not specify which approach they employed to spatially validate the spectral unmixing results. Moreover, most of the authors who specified the approach employed did not adequately explain the methods used or the reasons for their choices. Six "good practice criteria to guide accuracy assessment methods and reporting" were identified by [519]. As a result, these papers did not fully meet three of the good practice criteria: "reliable", "transparent", and "reproducible" [519].

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
In accordance with the PRISMA statement [49,50], 454 eligible papers were identified, screened, and included in this systematic review: 326 eligible papers were published in 2022, 2021, and 2020; 112 eligible papers were published in 2011 and 2010; and 16 eligible papers were published in 1996 and 1995. The eligibility criterion was that the results of the spectral unmixing were spatially validated. Analyzing these papers, six key issues were identified that were differently addressed to spatially validate the spectral unmixing results. The ways in which the key issues were addressed by the eligible papers published in 2022, 2021, 2020, 2011, 2010, 1996, and 1995 are summarized in Tables A1, A2, A3, A4, A5, A6, and A7, respectively.