A New Elbow Estimation Method for Selecting the Best Solution in Sparse Unmixing

The goal of hyperspectral image analysis is often to determine which materials, out of a given set of possibilities, are present in each pixel. As hyperspectral data are being gathered in rapidly increasing amounts, automatic image analysis is becoming progressively more important. Automatic identification of materials from a mixed pixel is possible with 1) Bayesian unmixing algorithms and 2) multiobjective sparse unmixing algorithms when a method such as elbow estimation is used to select the best solution from the set of Pareto-optimal solutions. We develop a new elbow estimation method called termination condition adaptive elbow (TCAE) for selecting the best solution from the set of Pareto-optimal solutions to a biobjective unmixing problem. Specifically, the two objectives are assumed to be the sparsity level of the fractional abundance vector and the reconstruction error. We conduct experiments with real-world unmixing applications in mind, and TCAE performs significantly better than a state-of-the-art elbow estimation method when they are both used to select the best solution from the sequence of fractional abundance vectors generated by iterative spectral mixture analysis (ISMA). Furthermore, the combination of ISMA and TCAE is able to identify endmembers from mixed pixels several times faster and with higher F1-score than the two Bayesian unmixing algorithms used as a reference. We conclude that the combination of ISMA and TCAE facilitates automatic, reliable, and rapid identification of endmembers from mixed pixels.


I. INTRODUCTION
H YPERSPECTRAL cameras are used for close-range imaging in addition to remote sensing from satellites and manned aircraft. Specifically designed instruments may also be mounted on unmanned aerial vehicles. As a result, hyperspectral imaging sees wide use in many different fields, including geology, precision agriculture, food quality control, medical diagnosis, forensics, and military applications [1], [2], [3]. Furthermore, the use of supercontinuum lasers for illumination has been studied in the recent decade, and in the future, they may enable more widespread use of hyperspectral imaging in underground mines and other applications that have challenging illumination conditions [4]. Manuscript  Numerous methods for hyperspectral image analysis have been presented [5]. Classification algorithms assign a single label to every pixel. Thus, materials are left unidentified in the case of mixed pixels. Furthermore, a classification algorithm may yield a label that does not correspond to any of the endmembers occupying a mixed pixel, as it is difficult to classify a mixed pixel by comparing its spectral signature with the individual endmember spectra [5].
Unmixing methods have the advantage that they can identify multiple endmembers from a single pixel and estimate their abundances [6], [7], [8], [9]. Abundance estimates tend to be the most accurate when the endmembers participating in the given pixel have been correctly identified [10]. As the endmember spectra are assumed to be known in semisupervised unmixing, identifying the endmembers from a pixel equates to determining the smallest subset of endmembers with which all essential features of the pixel spectrum y can be modeled.
Several different approaches have been developed for finding the subset of endmembers that neither overfits nor underfits the pixel spectrum y. Greedy and relaxation-based unmixing algorithms select the best subset with the help of hyperparameters whose values need to be determined by a human expert. However, the optimal values of the hyperparameters are typically data dependent and difficult to find [11], [12], [13], [14]. Furthermore, the need for expert intervention reduces the speed at which a large number of pixels can be analyzed.
Automatic identification of endmembers from a mixed pixel is possible with multiobjective sparse unmixing algorithms when a method such as elbow estimation is used to select the best solution from the set of Pareto-optimal solutions [15], [16]. In addition, Bayesian unmixing algorithms can automatically identify which materials occupy a mixed pixel [9].
Angle-based elbow estimation (ABEE) [14] has been used to select the best solution from the set of Pareto-optimal solutions in many studies, e.g., [17], [18], [19], [20]. However, ABEE was criticized in [11] by stating that it sometimes selects wrong solutions or fails to unambiguously identify the best solution. Typical downsides of Bayesian unmixing algorithms include computational heaviness and general complexity [21], [22]. As hyperspectral data are being gathered in rapidly increasing amounts [23], automatic identification of materials from mixed pixels warrants further research.
The main contribution of this article is a method that enables automatic, reliable, and rapid identification of materials from mixed pixels. Specifically, we have developed a new elbow estimation method called termination condition adaptive elbow (TCAE), which has been combined with iterative spectral mixture analysis (ISMA) [10] in this article. However, TCAE can be used to select the best solution from the set of Pareto-optimal solutions generated by any unmixing algorithm, whereby TCAE is widely usable in sparse unmixing.
The rest of this article is organized as follows. Section II describes the theoretical and methodological background of our work, while Section III presents TCAE. Section IV describes the chosen performance metrics, while Section V focuses on experiments with synthetic data. Sections VI-VIII cover real-data experiments. Section IX examines the performance of TCAE with an ablation experiment and presents a runtime comparison between the algorithms. Section X discusses the novelty and wider applicability of TCAE in addition to suggesting future research directions. Finally, Section XI concludes this article.

II. BACKGROUND
The linear mixing model may be written as y = Φx + n (1) in which y ∈ R m is the mixture spectrum, the dictionary Φ ∈ R m×n contains the endmember spectra as columns, x ∈ R n is the fractional abundance vector, and n ∈ R m represents noise. Only nonnegative abundances are physically meaningful, whereby the abundance nonnegativity constraint (ANC) is typically imposed. In addition, it is sometimes required that the components of x must sum to 1. The use of the abundance sum-to-one constraint (ASC) has been criticized because it is rarely satisfied in other than simulation data [7], [8].
Deviations from the linear mixing model arise if photons scatter several times before being detected. Such multiple scattering is especially common in the case of intimate mixtures, in which the materials are mixed on spatial scales smaller than the path length of photons in the mixture. Spectral preprocessing methods [24] and nonlinear mixing models, such as bilinear models, can be used to deal with the effects that arise from multiple scattering. We will return to multiple scattering in Section VI.
In many unmixing algorithms, each target material is represented by a single endmember spectrum. However, in reality, the spectral signature of a particular material may vary within a single scene [25]. One way to account for such endmember variance is to represent each target material with a set of spectra instead of one spectrum. For example, the MESMA algorithm and its variations are based on this principle [13], [26]. We will return to endmember variance in Section X-A.

A. Sparse Unmixing
In terms of the linear mixing model (1), the goal in sparse unmixing is to find such sparse x that Φx ≈ y. Thus, sparse unmixing has two fundamental questions. The first question is how accurately the sparse approximation Φx must match the mixture spectrum y. Subsequently, one must find the smallest subset of endmembers that gives the desired accuracy.
In the equation form, the second step may be written as min x ||x|| 0 s.t. ||y − Φx|| ≤ δ (2) in which || · || 0 yields the number of nonzero components in the argument vector, || · || is the Euclidean norm, and δ ≥ 0 is the largest allowed reconstruction error. Although ANC and ASC could be imposed in (2), we have omitted them here for simplicity. Since (2) is a combinatorial problem for a given δ, finding the exact solution to (2) is computationally unaffordable even for moderately large dictionaries Φ [13]. To obtain useful solutions in reasonable time, computationally lighter methods have been developed for approximately solving (2). While greedy and relaxation-based algorithms can be used to approximately solve (2), they contain hyperparameters whose values must be set before the estimated fractional abundance vector is obtained. Chen and Zhang [27] have derived a formula that yields the optimal value of the regularization parameter in nonconvex relaxation of (2). Specifically, the formula yields the optimal value of the regularization parameter as a function of signal-to-noise ratio (SNR). However, we believe that the applicability of the formula is limited mainly to simulations, because SNR is usually not known in real-world measurements. Thus, it remains problematic that the values of the hyperparameters in greedy and relaxation-based unmixing algorithms must be set before the estimated fractional abundance vector is obtained.
To avoid the problem posed by hyperparameters, sparse unmixing methods based on multiobjective optimization have been developed [11], [12], [14]. Considering the sparsity level ||x|| 0 and the squared reconstruction error ||y − Φx|| 2 as the objectives, sparse unmixing may be formulated as a biobjective optimization problem Since the second objective tends to increase when the first objective is decreased, the biobjective minimization problem (3) does not have a unique solution. A solution x is called Pareto-optimal if neither of the objectives ||x|| 0 and ||y − Φx|| 2 can be decreased without increasing the other. The solution to the biobjective minimization problem (3) is the set of all Pareto-optimal solutions x. The task of choosing the best solution from the set of Paretooptimal solutions may sometimes be left for a human expert [12], while automatic selection is usually based on elbow estimation [15], [16]. The reason for using elbow estimation is that the Pareto-optimal front of a biobjective minimization problem usually features a region, where a small decrease in one objective causes a large increase in the other objective [16].

B. Iterative Spectral Mixture Analysis
ISMA is a greedy sparse unmixing algorithm that uses backward elimination to find the best subset of endmembers [10]. Algorithm 1 presents the details of ISMA in pseudocode. The first input of ISMA is the mixture spectrum y ∈ R m . The second input is the dictionary Φ ∈ R m×n , which has the n endmember 8: Increase the iterator by setting i = i + 1. 9: end while 10: Set r n+1 = ||y||. 11: The critical iteration critIT = v + η. 14: Return x (critIT) as the output. spectra as columns. The last inputs are the hyperparameters Δ th ∈ (0, 1] and η ∈ {1, 2, 3, . . .}, which determine the sparsity level of the estimated fractional abundance vector x.
Steps 3-9 of ISMA constitute a while loop that carries out the backward elimination. Specifically, Step 4 fits the mixture spectrum y using the columns of Φ that are listed in the set S. At the first iteration of the while loop, this amounts to using the whole dictionary Φ, because S was initialized to contain all integers from 1 to n in Step 1.
Step 5 stores the reconstruction error r i = ||y − Φx (i) || for later use in Step 11. Step 6 finds out which component of the fractional abundance vector x (i) has the smallest value, whereafter Step 7 removes the corresponding index from the set S. Thus, the endmember that had the smallest abundance will not be used for fitting in Step 4 at the subsequent iterations of the while loop.
As a result, endmembers corresponding to negative abundances get eliminated first. Note that this is a desired feature in ISMA, because negative abundances are unphysical, i.e., they are not possible in reality. At some point, the elimination results in a situation in which all abundances are nonnegative. Then, the endmember corresponding to the smallest abundance is likely used to fit noise or is otherwise the least important for a good fit. Thus, the endmembers that have contributed the most to the mixture spectrum y are likely to get eliminated last.
We note that the solution x (i) in Step 4 must be unique, because otherwise the endmember selected for elimination in Step 6 will be arbitrary. Consequently, the dictionary Φ must be undercomplete, i.e., it must have less columns than rows. The Δ i values calculated in Step 11 of ISMA describe how the reconstruction error r i increases over consecutive iterations. Specifically, Δ i indicates how large the relative change in reconstruction error was between iterations i and i + 1. For the purposes of calculating Δ n , we use the definition r n+1 = ||y|| made in Step 10. Small and large values of Δ i correspond to small and large relative changes, respectively.
The upper panel in Fig. 1 illustrates how the reconstruction error r i might increase at consequent iterations. The lower panel in Fig. 1 shows the corresponding Δ i values calculated in Step 11 of ISMA. The total number of endmembers is n = 12, while it is assumed that the mixture spectrum y is a linear combination of five endmember spectra plus noise.
At the early iterations in Fig. 1, the Δ i values are small, because the endmembers that were eliminated from the set S fitted only noise. The first prominent increase in the Δ i values occurs when one of the mixture components is eliminated from S (Step 7 at iteration i = 8). The reason is that the remaining endmembers can no longer model all major features of the mixture spectrum y.
A human expert must specify a threshold Δ th that separates the small Δ i values from the larger values that arise when mixture components are removed from the set S. Furthermore, the human expert must specify a nonnegative integer η, which determines how resistant the search for the critical iteration will be against local minima.
Part 2 of ISMA determines the critical iteration with the help of the hyperparameters Δ th and η. Starting from the last iteration, one finds the first η consecutive Δ i values that are smaller than Δ th . Assuming that the ηth value was found from iteration v, the critical iteration critIT is taken to be v + η.
The endmembers indexed in S at the beginning of iteration critIT are considered to form the optimal endmember set. ISMA returns the corresponding fractional abundance vector x (critIT) as the output.

C. Angle-Based Elbow Estimation
As mentioned in Section I, ABEE [14] is widely used for selecting the best solution from the set of Pareto-optimal solutions to (3). The incomparable scales of the objectives ||x|| 0 and ||y − Φx|| 2 in (3) complicate the elbow estimation. To avoid this problem, the values of both objectives are scaled with their maximum values before estimating the location of the elbow. That is, ABEE seeks the elbow from the points in which x (i) with i ∈ {1, 2, . . . , n} are the Pareto-optimal solutions of (3). To find the elbow, the four angles α 1 , α 2 , α 3 , and α 4 defined in [14,Fig. 5] are calculated for each point P i . The maximum of these four angles is taken to represent the tradeoff of the associated solution x (i) . The solution x (i) having the largest tradeoff is chosen as the best solution.
In ISMA, ABEE can be used instead of Steps 12 and 13 to determine the critical iteration critIT. To elaborate, let us assume that {x (i) | i = 1, 2, . . . , n} is the set of solutions determined by the while loop in ISMA. Furthermore, we define x (n+1) = 0 as the solution corresponding to the reconstruction error r n+1 = ||y|| defined in Step 10 of ISMA.
The points P i with i ∈ {n − 3, n − 2, . . . , n + 1} are schematically illustrated in Fig. 2. In the x-axis, the maximum sparsity level max i (||x (i) || 0 ) is equal to the number of fractional abundance vectors generated by ISMA, namely, n. In the y-axis, r 2 i = ||y − Φx (i) || 2 in accordance with Step 5 of ISMA. For points P i with i ∈ {3, 4, . . . , n − 1}, all of the angles α 1 , α 2 , α 3 , and α 4 are calculated. For point P 2 , only angles α 1 and α 2 are calculated, as the calculation of α 3 and α 4 is impossible. Similarly, only angles α 1 and α 3 are calculated for point P n , because the calculation of α 2 and α 4 is impossible.
According to the principle of ABEE, the maximum of the angles associated with a point P i is taken to represent the tradeoff of the associated solution x (i) . The value of i corresponding to the largest tradeoff is taken as the critical iteration critIT. As a result, the solution x (critIT) returned by ISMA has the largest tradeoff.

D. Spectral Angle Mapper (SAM) Classifier
The spectral angle between two vectors a ∈ R m and b ∈ R m is Let us now denote the angle between a pixel spectrum and the ith endmember spectrum by θ i . The SAM classifier considers the pixel to consist of the ith endmember if θ i < θ th is a user-specified threshold.
Thanks to its simplicity and computational lightness, the SAM classifier is the most commonly used technique for mineral identification from hyperspectral images [28]. Its primary disadvantage is that it requires the estimation of the threshold values θ (i) th , which is subjective and challenging. Since this article focuses on unmixing algorithms that do not require manual tuning of hyperparameters, we used a simplified version of the SAM classifier that considers a pixel to consist of the kth endmember in which k = argmin i θ i .

E. Nonnegativity Constrained Least Squares (NCLS)
NCLS finds the fractional abundance vector x that minimizes ||y − Φx|| subject to ANC. That is, the estimated fractional abundance vector given by NCLS iŝ in which the greater than or equal to relation is understood componentwise. We used the MATLAB function lsqnonneg as our NCLS implementation. According to MATLAB documentation, lsqnonneg uses the NCLS algorithm described in [29].

F. Fully Constrained Least Squares (FCLS)
FCLS finds the fractional abundance vector x that minimizes ||y − Φx|| subject to ANC and ASC. That is, the estimated fractional abundance vector given by FCLS iŝ in which 1 is a column vector of 1s having as many components as x. We realized FCLS by using an embedded ASC in our NCLS implementation. Thus, the FCLS algorithm we used is essentially identical with the one that was introduced in [30] and discussed in Section V-A of [31].

G. Sparse Bayesian Learning (SBL) Algorithm
An unmixing algorithm based on SBL was presented in [32]. Henceforth, we will call the algorithm SBL in accordance with [32]. SBL first uses the relevance vector machine (RVM) [33] to select a subset of endmembers, whereafter it determines the abundances of the selected endmembers with FCLS as RVM does not enforce ANC or ASC.
To discuss how SBL uses RVM to select a subset of endmembers, we denote the fractional abundance vector given by RVM by x = (x 1 , x 2 , . . . , x n ) T . In contrast to what is stated in [32], the support of x was taken to be 1 with sthr = 0.001. That is, the value of sthr was not zero as implied in [32, p. 645].
After solving the support S, SBL determines the abundances of the endmembers indexed in S with FCLS. That is, FCLS is executed with a dictionaryΦ that contains only those columns of the original dictionary Φ that are indexed in S. The abundances of the endmembers that are not indexed in S are considered to be zero.
However, we have observed instances in which FCLS yields one or more vanishing (i.e., zero) abundances when it is executed with the dictionaryΦ. That is, SBL sometimes predicts a vanishing abundance for an endmember indexed in S. Consequently, it is somewhat ambiguous which endmembers SBL actually considers as mixture components. Our choice was to consider all endmembers indexed in S as mixture components. Thereby, we were able to eliminate the effect of FCLS and examine how well the RVM-based basis selection performed individually.
The results presented in [32] were calculated with the RVM implementation developed by Tipping [34]. Following the example of [32], we left all adjustable parameters in the Tipping's implementation to their default values.

H. BI-ICE
BI-ICE is a Bayesian unmixing algorithm [35], which is based on the linear mixing model (1). The point estimate x for the fractional abundance vector is calculated from [35, eq. (29)]. All components of x are necessarily positive, because the right-hand side of (29) in [35] is positive for all μ * ∈ R and σ * > 0.
Considering the original purpose of BI-ICE, namely, abundance estimation, it is not problematic that all the components of x are nonzero. However, an additional rule is needed if one wishes to use BI-ICE for identifying which endmember spectra constitute a given mixture spectrum y. We constructed the needed rule from the probability distributions that BI-ICE determines for each component of x. Specifically, [35, eq. (30)] implies that each component ii defined in [35, eq. (31) and (32)], respectively. The parameter μ * i specifies where the probability density function attains its maximum, whereas σ * ii describes the width of the distribution. Accordingly, we considered the ith endmember to be present in the mixture if Fundamentally, (9) means that the ith endmember is considered to be present in the mixture if the associated probability distribution attains its maximum sufficiently far from zero in 1 Additional details were received directly from the first author of [32].

Algorithm 2:
Automatic Iterative Spectral Mixture Analysis. Input 1: Mixture spectrum y ∈ R m Input 2: Dictionary Φ ∈ R m×n having the endmember spectra as columns. Input 3: String critName, which specifies the termination condition that will be used to determine the critical iteration. Output: Estimated fractional abundance vector x 1: Execute Part 1 of ISMA. 2: Determine critical iteration critIT with the termination condition specified in the string critName. 3: Return x (critIT) as the output.
comparison to the width of the distribution. Specifically, we chose the constant 3 in (9) in analogy to the three-sigma rule [36].
Themelis et al. [35] reported that establishing a formal convergence criterion for BI-ICE had been cumbersome. We required BI-ICE to execute at least five iterations, whereafter the convergence criterion was that a new iteration did not change any of the fractional abundances by more than 10 −4 . According to our tests, the chosen value of 10 −4 should be so small that choosing a smaller value should not have any appreciable effect on the unmixing results.

III. PROPOSED ALGORITHMS
In addition to TCAE, ABEE can be used to automatically determine the critical iteration in ISMA (see Section II-C). Therefore, we need formalism for dealing with algorithms that automatically determine the critical iteration in ISMA.
Automatic ISMA (AISMA) described in Algorithm 2 formally represents the combination of ISMA and some algorithm that automatically determines the critical iteration. For example, we will refer to the combination of ISMA and TCAE as AISMA with TCAE. The first two inputs of AISMA are the mixture spectrum y ∈ R m and the dictionary Φ ∈ R m×n . The third input specifies which termination condition is used to determine the critical iteration.

A. Termination Condition Adaptive Elbow
TCAE is the main contribution of our research, and it is described in Algorithm 3. TCAE is called in Step 2 of AISMA to determine the critical iteration critIT. To illustrate how TCAE automatically determines the critical iteration critIT, we use dictionary Φ 2 . The properties of dictionary Φ 2 have been described in Table I.   3. Red curve is the mixture spectrum y that we use to illustrate the functionality of AISMA and TCAE. Also shown are the three components of y and the noise term.
The mixture spectrum y we use in this illustration is a linear combination of three endmember spectra plus correlated noise. Specifically, the mixture components are antigorite, chlorite, and carnallite-hs with fractional abundances of 55%, 30%, and 15%, respectively. The mixture spectrum y and its components plus the noise term are shown in Fig. 3.
TCAE takes the fractional abundance vectors x (1) , x (2) , . . . , x (n) and fractional changes Δ 1 , Δ 2 , . . . , Δ n determined by ISMA as inputs. The values of Δ 1 , Δ 2 , . . . , Δ 12 are shown in Fig. 4. We recall that ISMA calculated each of the Δ i values by leaving out the mineral that corresponded to the smallest abundance. The name of the excluded mineral is shown at the upper horizontal axis.
The black dots in Fig. 4 indicate the iterations in which excess minerals are included in the unmixing solution. The red triangles indicate the iterations in which all the remaining minerals are represented in the mixture spectrum y. Note that ISMA removed all other minerals before starting to exclude endmembers which were in the mixture.
The first iteration of the while loop in TCAE is illustrated in Fig. 6. The upper panel shows the line L(z) defined in Step 5 for the values of i = 1 and j = 12 set in Steps 12 and 10, respectively. The area shaded in blue is A L , which is defined
Thereby, δ i vanishes whenever the associated abundance vector x (i) contains unphysical values. 2: DefineΔ i = max j∈{1,2,...,i} δ j for i ∈ {1, 2, 3 · · · , n}. This may be perceived as smoothing that eliminates local minima. 3: Extend the definition ofΔ i to nonpositive integers i by settingΔ i = 0 for i ∈ {0, −1, −2, · · · }. 4: Hereafter it is assumed that the integer valued variables i and j follow the constraint i < j ≤ n. 5: Let us consider a straight line connecting the points ( 6: Let us consider the area of the rectangular triangle having L(z) as the hypotenuse. The area of the said triangle It is used in Steps 13 and 15. For a discussion, see the last three paragraphs of Section III-A. 9: Set initial value critIT = 1. Thus, all endmembers belong to the initial unmixing solution. The while loop starting from Step 11 will refine this initial solution. 10: Set initial value j = n. Effectively, this fixes the right end of the line L(z) to the point (n,Δ n ).  Fig. 6 shows L(z) after Step 14. The five-pointed star indicates the location of the elbow determined in Step 18 [15], [37], [38], [39].
Steps 19 and 20 perform the update critIT = elbow + 1 = 9, because the initial value critIT = 1 < elbow + 1 = 9. Thus, the unmixing solution determined at the first iteration of the while loop includes one excess mineral, namely elbaite.
The variable j is decreased to 12 − 1 = 11 in Step 22. The second iteration of the while loop ensues and again the ninth iteration is determined as the critical iteration. Subsequently, j is decreased to 11 − 1 = 10 in Step 22, and the third iteration of the while loop commences.
The situation at the third iteration of the while loop is illustrated in Fig. 7. The upper panel shows the line L(z) for the initial values of i = 1 and j = 10. Since the ratio A L /AΔ ≈ 6.68 > R, Step 16 adjusts A L /AΔ to 2.97 by moving the left end of the line L(z) to 6. The lower panel of Fig. 7 shows L(z) after Step 16. The five-pointed star indicates the value elbow = 9 determined in Step 18. Since critIT = 9 < elbow + 1 = 10, Steps 19 and 20 perform the update critIT = elbow + 1 = 10.
Thereafter, Step 22 decreases j to 9 and the while loop terminates because the condition critIT < j is no longer true. As a result, Step 24 returns the tenth iteration as the critical iteration. Thus, AISMA managed to identify all the endmember spectra used to construct the mixture spectrum y, while no excess endmembers were included in the unmixing solution.
To summarize, TCAE determines the critical iteration with the assistance of elbow point estimation. Elbow detection is an inherently heuristic process, which may be understood as finding the point in which a curve changes from steep to flat [37], [38], [39]. However, the last twoΔ i values depicted in Fig. 6 do not contribute to the steep part, because they are equal to the third lastΔ i value. Consequently, we believe that a human would intuitively disregard the last twoΔ i values when visually searching for the elbow. That is, a human would likely start searching for the elbow in the way depicted in the upper panel of Fig. 7.
Elbow detection algorithms usually work the best if the steep and flat parts of the curve have roughly equal lengths [38], [39]. Relatedly, we believe that a human intuitively disregards some of the earlier iterations if the tail of smallΔ i appears to be excessively long. Similarly, we believe that a human mentally extends the tail of smallΔ i values to the left if a longer tail of smallΔ i values is needed to make the elbow apparent. In particular, the extension of the tail may be necessary if the dictionary contains only a handful of endmembers and they all have contributed to the given mixture spectrum.
The constant R defined in Step 8 controls the length of the tail formed by the smallΔ i values. Specifically, large and small values of R favor Steps 14 and 16, respectively. Thus, large and small values of R lead to long and short tails, respectively. We chose to fix R = 3, because we believe that the resulting tails of smallΔ i values are approximately as long as a human would consider when intuitively searching for the elbow.

IV. PERFORMANCE METRICS
We consider the quality of unmixing from two complementary perspectives: 1) quantitative abundance estimation and 2) endmember identification. To discuss these approaches in greater detail, we assume that x = x 1 , x 2 , . . . , x n T andx = x 1 ,x 2 , . . . ,x n T are the ground truth fractional abundance vector and its estimate, respectively.
In quantitative abundance estimation, the estimatex is considered to be better the closer it is to x in some continuous sense. We measure the goodness ofx in quantitative abundance estimation with the relative L 2 error Note that RL2E is the reciprocal of the square root of the signalto-reconstruction error used in [7] and [8]. We use RL2E instead of the root-mean-square error (RMSE) because we believe that a large relative error usually renders an abundance estimate useless even if the absolute error is small. In endmember identification, the goal is to determine which endmembers are present in a given pixel. That is, the goal is to decipher the support of x, namely from the mixture spectrum y [40]. By bearing in mind the linear mixing model (1), one may view endmember identification as basis selection from the dictionary Φ [41]. We use recall, precision, and F 1 -score to measure how well the support ofx matches the support of x. Together, recall and precision give more information than any single metric such as the distance between the supports ofx and x. At the same time, F 1 -score acts as an overall measure of the goodness ofx in endmember identification.
The number of true positives in which • means componentwise product. The number of false positives The number of false negatives By using (12)- (14), we can define recall and precision as follows: Recall Precision The harmonic mean of recall and precision, namely, F 1 -score V. SIMULATIONS We examined the performance of the unmixing algorithms with simulations similar to [7]. To cover different types of unmixing scenarios, we used four different dictionaries. The endmember spectra included in the dictionaries are mineral spectra from the ECOSTRESS library [42], [43] and the USGS spectral library [44].
The properties of the dictionaries are summarized in Table I. Table I shows that the number of wavelength bands in each dictionary is m = 200. The bands equidistantly span the wavelength range from λ = 0.8 μm to λ = 2.5 μm. The third column shows the number of endmembers in the dictionary, whereas the fourth column shows the mutual coherence [7] of the dictionary. We chose to use dictionaries that entail up to 40 endmembers, because the number of endmembers in a real-life unmixing scenario is often small, say less than 20 [7].  The smallest dictionary Φ 1 contains three mineral spectra, namely, the spectra of alunite, muscovite, and zoisite. The dictionary Φ 2 contains the 12 mineral spectra used in [32].
Dictionary Φ 3 contains 22 common rock-forming minerals, which were used in the article that introduced ISMA [10]. However, we did not include the practically featureless spectra of quartz and sulfur in Φ 3 . The reason is that these featureless minerals cannot be detected by hyperspectral imaging [45]. To illustrate the flatness of quartz and sulfur spectra, Fig. 8 shows them along with two mineral spectra included in Φ 3 .
The largest dictionary Φ 4 contains 40 mineral spectra. We required all spectra in Φ 4 to have appreciable absorption features, whereby minerals such as quartz and sulfur were not included in Φ 4 .
We used the 12 simulations detailed in Table II to study the performance of the unmixing algorithms. The second column of Table II shows the dictionary used in the simulation, while the third column shows the sparsity level of the generated fractional abundance vectors x. We limited the maximum number of mixture components to five, because real-world unmixing scenarios typically entail up to five endmembers per pixel [21], [46].
In each simulation, 1 000 000 fractional abundance vectors x were generated. For each x, the number of nonzero components was randomly drawn from the range indicated in the third column of Table II. Thereafter, the indices of the nonzero components were chosen randomly. Finally, the selected components were assigned values drawn from the flat Dirichlet distribution. Hence, each x satisfies ANC and ASC.
In our view, the use of the flat Dirichlet distribution may be considered analogous to the use of a flat (uninformative) prior in Bayesian analysis. Since we are not aware of any reasons to prefer some specific form of the Dirichlet distribution, we opted to use the flat distribution.
Corresponding to each x, a mixture spectrum was generated. Here, Φ and n are the dictionary and the noise vector, respectively. We used correlated noise in the simulations because it is a better model of real-world noise than independent and identically distributed (i.i.d.) Gaussian noise [7]. The correlated noise was generated according to [7]: low-pass filtering i.i.d. Gaussian noise with a normalized cutoff frequency of 5π/m, in which m is the number of spectral bands. Since (18) does not entail baseline shifts or other scattering effects, we did not use any spectral preprocessing [47] in the simulations of Table II. We define the SNR as the power of the signal divided by the power of the noise. That is To adjust the SNR of a mixture spectrum y, we set n = αñ, with α ≥ 0 being a parameter and vectorñ containing the unscaled noise. The desired SNR was then obtained by setting As presented in the fourth column of Table II, we used the SNR values of 20 dB, 35 dB, and 50 dB in our simulations (cf. Section IV-E in [7]). We note that simulations 1-3 do not belong to the category of sparse unmixing as the other simulations. The reason is that the maximum sparsity level ||x|| 0 = 3 is equal to the number of endmembers in the dictionary Φ 1 . Nevertheless, some hyperspectral images involve only few endmembers, which may all be present in a single pixel. For example, the dataset of [48] contains images in which most or all pixels contain all the three or four possible endmembers. Therefore, we consider simulations 1-3 relevant even though they do not represent sparse unmixing.

A. Simulation Results
The performance metrics describing the quality of endmember identification are shown in Table III. AISMA with TCAE tended to have the largest F 1 -score, and the marginal by which AISMA with TCAE has the largest F 1 -score tends to increase according to SNR.
The critical difference diagram of Fig. 9 shows the average ranks of the unmixing algorithms based on the F 1 -scores shown  in Table III. We excluded NCLS from the diagram because FCLS has a larger F 1 -score in most simulations. The horizontal bars in Fig. 9 indicate cliques in which the performance differences of the algorithms are not significant according to the one-sided Wilcoxon signed-rank test with Holm correction on significance level 0.05. AISMA with TCAE performed the best in endmember identification by a significant margin.
In all subsequent critical difference diagrams, the horizontal bars have the same meaning as in Fig. 9. That is, a horizontal bar indicates a clique in which the performance differences of the algorithms are not significant according to the one-sided Wilcoxon signed-rank test with Holm correction on significance level 0.05.
The average RL2E of the estimated fractional abundance vectors is shown in Table IV. Specifically, the shown values are  Table IV. averages over the 1 000 000 RL2Es obtained in each simulation. The column "full dictionary" reports the FCLS and NCLS results obtained with the whole dictionary. In contrast, the results in the column "ground truth" were obtained by running FCLS and NCLS with the ground truth endmembers. In the columns titled "SBL," "AISMA w/ ABEE," and "AISMA w/ TCAE," the endmembers selected by the respective algorithms were used for FCLS and NCLS unmixing.
The ASC included in FCLS is not satisfied in all real-world unmixing scenarios [7], [8]. Nevertheless, SBL natively uses FCLS for abundance estimation, whereas AISMA uses NCLS. Thus, SBL and AISMA could have different relative L 2 errors even if they would have selected the same endmember subsets. Therefore, we ran both FCLS and NCLS with the endmember Fig. 11. Schematic illustration of the Scene III setup. Figure adapted from [48]. subsets selected by SBL and AISMA. To further facilitate fair comparison, we always ran BI-ICE with and without an embedded ASC.
The critical difference diagram of Fig. 10 was calculated from the RL2Es shown in Table IV. Specifically, each column contains two RL2Es per simulation, and we always took into account the smaller one.
Unmixing with ground truth endmembers yielded the best abundance estimates by a significant margin, while AISMA with ABEE performed the worst by a significant margin. Furthermore, full dictionary ranked better than BI-ICE, SBL, and AISMA with TCAE. Thus, the endmember subsets selected by the three aforementioned sparse unmixing algorithms did not lead to better abundance estimates than the full dictionary.

VI. EXPERIMENTS WITH CLOSE-RANGE HYPERSPECTRAL IMAGES
The unmixing dataset introduced by Zhao et al. [48] has been specifically designed for unmixing experiments. It consists of close-range hyperspectral images measured in a laboratory and abundance ground truth is available for all pixels.
The images in the Zhao's dataset have been divided into four groups named Scene I, Scene II, Scene II RGB, and Scene III. Scene I samples are printed checkerboards, whereby the linear mixing model should work well. In contrast, the Scene II samples are homogeneous mixtures of colored quartz sands. Since Scene II samples may be regarded intimate mixtures, multiple scattering is expected to reduce the validity of the linear mixing model in Scene II. Scene II RGB samples are nonhomogeneous but otherwise similar to Scene II samples.
As illustrated in Fig. 11, Scene III images were taken from Scene I checkerboards while a plastic board was standing next to the checkerboard. Because of the Scene III setup, double scattering is prominent. That is, most incident photons scatter both from the checkerboard and the plastic board before being detected.
In all three scenes, there are mixtures that contain all endmembers from the scene. To retain focus on sparse unmixing, we opted to use a single dictionary Φ that contains the endmember spectra from all the three scenes. Specifically, we formed each endmember spectrum by averaging over a hyperspectral image that was taken from a sample consisting of the respective We used low-, medium-, and high-SNR mixture spectra in our experiments. The low-SNR spectra correspond to individual pixels, whereas the high-SNR spectra are averages over all the pixels in the image. The medium-SNR spectra were generated by applying a 3 × 3× 1 box filter on the raw mixture images. Thus, the filter width was three pixels in both spatial dimensions, whereas no averaging was done along the wavelength axis. Fig. 12 illustrates the differences between the three SNR levels by using the Scene I mixture 1 as an example. The low-SNR spectrum corresponds to pixel (30,30) in the raw image, whereas the medium-SNR spectrum has been taken from the pixel (30,30) after the 3 × 3× 1 box filter was applied on the image. The high-SNR spectrum is the average over all the pixels. Visually, the high-SNR spectrum is the smoothest, whereas the low-SNR spectrum appears to contain the largest amount of noise.
In the unmixing experiments with Scene I and Scene III spectra, we did not use any spectral preprocessing. The reason is that the preprocessing methods we tried, namely, linear detrending and logarithm transformation, did not appreciably improve the results. In Scene II and Scene II RGB experiments, we used the generalized logarithm transformation with c = 10 −6 to convert the measured reflectance spectra to absorbance spectra [24]. For more information about the generalized logarithm transformation (21) used in Scene II and Scene II RGB experiments, see Appendix A.

A. Unmixing Results
The endmember identification metrics for Zhao's dataset [48] are shown in Table V. The values shown on each row are averages over all mixture spectra belonging to the indicated scene and SNR level. The first three algorithms are not sparse unmixing algorithms, and therefore, we focused on the last three algorithms when boldfacing the largest value of each metric from every row.
A critical difference diagram calculated from the F 1 -scores is shown in Fig. 13. AISMA with TCAE had the best F 1 -scores by a significant margin.
The RL2Es for Zhao's dataset [48] are shown in Table VI. The values shown on each row are averages over all mixture spectra belonging to the indicated scene and SNR level. To compare the unmixing algorithms with each other, we bolded the smallest  Table VI. value from each row without considering the "ground truth" column.
A critical difference diagram calculated from the RL2Es of Table VI is shown in Fig. 14. Specifically, each column in Table VI contains two RL2Es per SNR level, and we always took into account the smaller one. Unmixing with ground truth Fig. 15. First row shows the ground truth composition of the pixels, whereas the last five rows show the predictions of the algorithms. From left to right, the columns correspond to bitumen, green fabric, red fabric, red metal, and blue fabric. The upper right corner has been masked in each subplot, because the corresponding ground area contained objects whose spectra were not provided within the HySU dataset. endmembers yielded the best abundance estimates by a significant margin. Furthermore, AISMA with TCAE performed significantly better than AISMA with ABEE. However, the endmember subsets selected by BI-ICE, SBL, and AISMA with TCAE did not lead to significantly better abundance estimates than the full dictionary.
The hyperspectral image of Scene II RGB sample 4 (viz. spatial pattern D) is represented as a false color image in the left panel of Fig. 16 in Appendix B. The estimated and ground truth abundance maps for the Scene II RGB sample 4 are shown in Fig. 18 in Appendix C.

VII. EXPERIMENT WITH A REMOTE-SENSING IMAGE
Remote sensing is the traditional application area of hyperspectral imaging. Therefore, we carried out an experiment with an airborne hyperspectral image, which was specifically acquired for the purposes of testing unmixing algorithms [49]. Specifically, the image is from a grass-covered football field, where targets of different size were deployed. Each target was made of either bitumen, green fabric, red fabric, red metal, or blue fabric.
Corresponding to each material, there were five targets having the sizes of 3 m × 3 m, 2 m × 2 m, 1 m × 1 m, 0.5 m × 0.5 m, and 0.25 m × 0.25 m. In comparison, the reported pixel size was 0.7 m × 0.7 m. Thereby, the 0.5 m × 0.5 m and 0.25 m × 0.25 m targets could not give rise to any pure pixels, but appeared entirely as subpixel objects. The arrangement of the targets has been graphically illustrated in [49, Fig. 1].
Since there are five target materials lying on grass, the total number of the endmembers is six. The HySU dataset [49] provides two versions of these endmember spectra, namely, spectra extracted from the hyperspectral image and spectra measured with a field spectrometer. The dictionary Φ we used for unmixing comprises the endmember spectra extracted from the hyperspectral image. The endmember spectra included in Φ are shown as solid lines in [49,Fig. 5].
In our experiment, we used only a part of the original hyperspectral image. The part that we used contains all the targets and is approximately the same as the one shown in [49,Fig. 4b]. The right panel of Fig. 16 in Appendix B is a false color representation of the hyperspectral image that we used.
The first row of Fig. 15 shows the endmembers that are present in each pixel. From left to right, the columns correspond to bitumen, green fabric, red fabric, red metal, and blue fabric. As in [49, Fig. 4b], the upper right corner has been masked because it contains endmembers whose spectra were not provided within the HySU dataset.
The second row of Fig. 15 shows the results of SAM classifier. The third, fourth, fifth, and sixth rows show the unmixing results obtained with SBL, BI-ICE, AISMA with ABEE, and AISMA with TCAE, respectively.
The SAM classifier and AISMA with ABEE perform well in avoiding false positives but have trouble detecting the smallest and the second smallest targets. In contrast, SBL and BI-ICE yield many false positives for all the five target materials, resulting in the third and fourth row being qualitatively rather different from the first row. The compositional maps estimated by AISMA with TCAE are in relatively good agreement with the ground truth for green fabric, red fabric, red metal, and blue fabric. For bitumen, the match with the ground truth is poorer, but even then the majority of estimated bitumen pixels are located close to the ground truth of bitumen pixels. Overall, AISMA with TCAE managed to detect even some of the smallest targets, while the number of false positives remained relatively low.

VIII. EXPERIMENTS WITH JASPER RIDGE AND URBAN DATASETS
The remote sensing images provided in the Jasper Ridge and Urban datasets have been used in many unmixing studies. The remote sensing image included in the Jasper Ridge dataset has been extracted from a larger image [50]. The Jasper Ridge and Urban datasets can be downloaded, e.g., from [51].
The Jasper Ridge dataset contains two versions of the hyperspectral image with the difference being in the number of wavelength bands. The first version has 224 wavelength bands extending from 380 to 2500 nm, while the number of wavelength bands in the second version is 198 following the removal of the bands that are affected by atmospheric absorption.
We used the Jasper Ridge image that has 198 wavelength bands, and a false color representation of the used image is shown in the left panel of Fig. 17 in Appendix B. The spatial plane of the hyperspectral image is divided into 100 × 100 pixels with each pixel corresponding to a 20 m × 20 m region in the scene [50]. The abundance map provided in the Jasper Ridge dataset has four endmembers, namely, tree, water, dirt, and road.
The Urban dataset contains two versions of the hyperspectral image with the difference being in the number of wavelength bands. The first version has 210 wavelength bands extending from 400 to 2500 nm, while the number of wavelength bands in the second version is 162 following the removal of the bands that are affected by atmospheric absorption.
We used the Urban image that has 162 wavelength bands, and a false color representation of the used image is shown in the right panel of Fig. 17 in Appendix B. The spatial plane of the hyperspectral image is divided into 307 × 307 pixels with each pixel corresponding to a 2 m × 2 m region in the scene [50]. Three different reference maps indicating the pixelwise fractional abundances are provided in the Urban dataset. We used the reference map that has six endmembers, namely, asphalt road, grass, tree, roof, metal, and dirt [50], [52].
The abundance maps for the Jasper Ridge and Urban images are shown, respectively, in Figs However, it should be noted that the reference maps included in the Jasper Ridge and Urban datasets are not ground truth maps, as they have been calculated with certain algorithms instead of being determined through in situ measurements [49], [53], [54]. At the same time, we share the view of many scholars that comparison of estimated maps with ground truth maps is the most objective way to assess the performance of unmixing algorithms (see, for instance, [48], [49], [55], [56], and [57]).
To further examine the plausibility of reference maps that have been calculated with algorithms, we consider the first row of Fig. 19. We note that the water area has pixels in which the fractional abundance of the road endmember does not vanish but ranges from tiny positive values to about 0.2. However, these pixels are scattered in such a way that we consider it unlikely that they would correspond to a road that is partly or thinly covered by water. Further contextual information can be gathered from the larger Jasper Ridge image [50] and physical maps of the Jasper Ridge area [58], and our view is that they do not suggest the presence of a road in the water area of the Jasper Ridge image. As a result, we feel that the reference maps provided in the Jasper Ridge dataset suffer from plausibility issues, which would not be present in ground truth maps determined through in situ measurements.
Owing to the lack of ground truth in Jasper Ridge and Urban datasets, we consider the related experiments less conclusive than the ones carried out in Sections VI and VII. Overall, we believe that the results obtained with the Jasper Ridge and Urban datasets qualitatively indicate the applicability of AISMA with TCAE to the unmixing of remote sensing images.

IX. ABLATION AND TIMING EXPERIMENTS
In Section IX-A, we examine the role of TCAE in the performance of AISMA by performing an ablation experiment. Specifically, we will repeat the simulations of Table II with TCAE replaced by a termination condition that determines the critical iteration with the assistance of the ground truth fractional abundance vector. Thus, we get to compare the performance of AISMA with TCAE against an upper limit that AISMA cannot exceed with any conceivable termination condition.
In Section IX-B, we state the computational complexity of AISMA with TCAE and conduct a runtime comparison between the algorithms.

A. Role of TCAE in the Performance of AISMA
To examine to the role of TCAE in the performance of AISMA, we define a new termination condition called termination condition oracle (TCO). TCO determines the critical iteration with the assistance of the ground truth fractional abundance vector x, and it is described in Algorithm 4.
In addition to the ground truth fractional abundance vector x, TCO receives the fractional abundance vectors x (1) , x (2) , . . . , x (n) generated by ISMA as inputs. With the assistance of the ground truth fractional abundance vector x, TCO calculates the number of true positives and false positives for each of the estimated fractional abundance vectors x (1) , x (2) , . . . , x (n) .
Step 4 of TCO chooses the critical iteration critIT to maximize the number of true positives, which are subject to the constraint of the minimization of false positives. In practice, the number of false positives will be zero if the endmember remaining at the last iteration is a true positive (viz. belongs to the mixture). Otherwise, the number of false positives will be one. We stress that the performance of AISMA with TCO sets an upper limit to the performance that AISMA can attain with any conceivable termination condition, because TCO uses the ground truth fractional abundance vector to determine the critical iteration.
We repeated the simulations from Table II with TCAE replaced by TCO, and the endmember identification results are shown in Table VII. In the low-SNR simulations with smaller dictionaries, TCAE led to considerably smaller recall than TCO. Otherwise, we feel that the results obtained with TCAE are close to the best possible performance obtained with TCO.
In the medium-and high-SNR simulations, the difference between the F 1 -scores obtained with TCAE and TCO does not appreciably depend on the number of endmembers in the dictionary. In the low-SNR simulations, the gap between the

B. Computational Complexity and Runtime Analysis
The main computational complexity in AISMA with TCAE results from the least-squares fit, which is performed in Step 4 of ISMA. Specifically, the asymptotic computational complexity of AISMA with TCAE is O(n 3 m), with n and m being the number of columns and rows, respectively, in the dictionary Φ (see Appendix D).
We examined the runtimes of the unmixing algorithms by running simulations in a MATLAB environment with a laptop having a 1.9-GHz CPU and 8 GB of memory. To illustrate how the number of endmembers affects runtime, we used the dictionaries Φ 2 , Φ 3 , and Φ 4 in the simulations. Each runtime reported in Table VIII is the average time it takes to unmix 10 000 mixture spectra. For further details on the timing experiment, see Appendix E.
AISMA with TCAE was about two and three times slower than FCLS in the simulations with Φ 2 and Φ 4 , respectively. In comparison to SBL, AISMA with TCAE was about four and 11 times faster in the simulations with Φ 2 and Φ 4 , respectively. BI-ICE was roughly two times slower than SBL in all the three simulations.
We conclude that SBL and BI-ICE have a larger computational complexity than AISMA with TCAE with respect to the number of endmembers in the dictionary. Since the number of endmembers in a real-world unmixing scenario is typically less than 20, the results with the dictionary Φ 3 suggest that AISMA with TCAE could be up to seven times faster than SBL and up to 16 times faster than BI-ICE in real-life unmixing applications.

X. DISCUSSION
The main contribution of our research, namely TCAE, is an extension to ISMA that automatically selects the best solution from the sequence of fractional abundance vectors generated by ISMA. Although we developed TCAE as an extension to ISMA, TCAE can be used with any unmixing algorithm that yields a sequence of estimated fractional abundance vectors with an increasing sparsity level. To elaborate, let us assume that x (1) , x (2) , . . . , x (n) is a sequence of estimated fractional abundance vectors with an increasing sparsity level ||x (i) || 0 = i. By changing the labeling, one can form a sequence x (1) , x (2) , . . . , x (n) , in which ||x (i) || 0 = n − i + 1. Subsequently, the Δ i values can be calculated by using the formulas from Steps 5, 10, and 11 of ISMA.
For example, this means that TCAE can be used to select the best fractional abundance vector from the set of fractional abundance vectors generated by a matching pursuit algorithm. Similar logic also dictates that TCAE can be used to select the best solution from the set of Pareto-optimal solutions to the biobjective minimization problem (3). Thus, TCAE can be used in multiobjective sparse unmixing to select the best solution from the set of Pareto-optimal solutions. TCAE selects the best solution from the set of Pareto-optimal solutions with the help of elbow estimation. Thus, TCAE belongs to the same family of methods as ABEE, the weighted sum of objective values method, and the distance to the extreme line method [15]. An important novelty in our work is that we seek the elbow from the points (||x (i) || 0 ,Δ i ) instead of the points (||x (i) || 0 , ||y − Φx (i) || 2 ), in which i ∈ {1, 2, . . . , n} and n is the number of solutions. In comparison to the distance to the extreme line method, TCAE has the advantage that it can adaptively choose the end points of the line segment that is used to determine the location of the elbow.

A. Future Directions
Abundance estimates tend to be the most accurate when the endmembers participating in the given pixel have been correctly identified [10]. Although the critical difference diagrams 9 and 13 show that AISMA with TCAE had the best F 1 -scores by a significant margin, the critical difference diagrams 10 and 14 indicate that the abundance estimates yielded by AISMA with TCAE were not significantly better than the ones obtained with SBL, BI-ICE, and full dictionary. Thus, the results suggest that F 1 -score is not the best possible performance metric for endmember identification if the goal of endmember selection is to improve the abundance estimates. A possible future direction would be to study the correlation between various endmember identification metrics and the relative L 2 error.
Since AISMA with TCAE is based on ISMA, it does not have any specific mechanism for mitigating problems that arise from endmember variance. Accordingly, we carried out simulations and real-data experiments with such data that do not exhibit endmember variance in any appreciable degree.
Although we did not consider endmember variance, it is common in real-world data. Therefore, extending the combination of ISMA and TCAE to handle endmember variance would increase its applicability to real-world problems. We surmise that grouping the spectra representing the same mineral as in SMESMM [13] would lead to better results than simply adding variations of the same mineral to the dictionary Φ. With the grouping, the iterative elimination of the endmember spectra could be carried out one group at a time instead of a single endmember spectrum at a time.

XI. CONCLUSION
In this article, we studied the problem of automatic identification of materials from mixed pixels. We developed a new elbow estimation method TCAE that enables a rapid and reliable identification, providing better performance than a state-of-the-art elbow detection method and two Bayesian unmixing algorithms. Specifically, Figs. 9, 10, 13, and 14 show that 1) AISMA performed endmember identification and quantitative abundance estimation significantly better with TCAE than ABEE, and 2) AISMA with TCAE outperformed SBL and BI-ICE in endmember identification.
Since AISMA does not have any hyperparameters, we used a simple version of SAM classifier that does not have userspecified thresholds. Specifically, the used version of SAM classifier considers a pixel to consist of the endmember for which the spectral angle between the pixel spectrum and the endmember spectrum is the smallest. Figs. 9 and 13 show that AISMA with TCAE outperformed the SAM classifier in endmember identification.
Timing experiments indicated that AISMA with TCAE could be up to seven times faster than SBL and up to 16 times faster than BI-ICE in real-life unmixing applications. Thus, AISMA with TCAE is better suited for applications that require real-time or near real-time performance. Overall, our conclusion is that AISMA with TCAE facilitates automatic, reliable, and rapid identification of endmembers from mixed pixels.

APPENDIX A SPECTRAL PREPROCESSING IN SCENE II AND SCENE II RGB EXPERIMENTS
The reason for using the generalized logarithm transformation (21) instead of A = − log 10 R in Scene II and Scene II RGB experiments is that in some rare instances, noise may cause a negative reflectance value R. We believe that the chosen value c = 10 −6 in (21) may be considered small, because it is several orders of magnitude smaller than typical reflectance values such as the ones exhibited by the high-SNR curve in Fig. 12. The advantage of choosing a small value for c is that the generalized logarithm transformation is then practically equivalent to A = − log 10 R, but avoids the problem posed by slightly negative reflectance values arising from noise.
[48, Table VI] shows the RMSEs of various unmixing algorithms for Scene II mixtures. The shown values were calculated with reflectance spectra, while the used dictionary contained only the Scene II endmember spectra. The mean RMSEs of FCLS, K-Hype, and Hapke were 0.1630, 0.1221, and 0.0940, respectively. In comparison, we have calculated that the mean RMSE of FCLS with absorbance spectra is 0.0872.
We conclude that conversion from reflectance to absorbance substantially improves the applicability of the linear mixing model (1) for Scene II spectra. Since Scene II RGB samples are similar to Scene II samples, we assume that the generalized logarithm transformation (21) improves the validity of the linear mixing model (1) also for Scene II RGB spectra. Therefore, we feel that our choice to preprocess spectra with (21) in Scene II and Scene II RGB experiments is justified.   Fig. 18. Abundance maps for the Scene II RGB sample 4 (spatial pattern D). The first three columns correspond to the endmembers constituting the sample, while the last five columns correspond to the other endmembers included in the dictionary. The first row contains the ground truth abundance maps, while the following rows show the estimated abundance maps. Comparison of the FCLS and NCLS results shows that ASC has a substantial effect on the estimated abundance maps. The SBL maps were calculated by running FCLS with the endmembers selected by the RVM-based endmember selection. Similarly, embedded ASC was used in BI-ICE, while the AISMA maps were calculated by executing FCLS with the endmembers selected by AISMA.

APPENDIX C ABUNDANCE MAPS
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply. Fig. 19. Abundance maps for the Jasper Ridge image. The first row shows the reference maps included in the Jasper Ridge dataset, while the following rows show the estimated abundance maps. The reference maps satisfy ASC, and we scaled the fractional abundance vectors in the estimated maps so that each estimated abundance map also satisfies ASC. Fig. 20. Abundance maps for the Urban image. Reference maps based on four, five, and six endmembers are provided in the Urban dataset, and we opted to use the ones based on six endmembers. The first row shows the reference maps, while the following rows show the estimated abundance maps. The reference maps satisfy ASC, and we scaled the fractional abundance vectors in the estimated maps so that each estimated abundance map also satisfies ASC.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

APPENDIX D COMPUTATIONAL COMPLEXITY OF AISMA WITH TCAE
The main computational complexity in AISMA with TCAE results from the least-squares fit, which is performed in Step 4 of ISMA. The least-squares fit is based on matrix multiplication and matrix inversion. The multiplication of an s × t matrix by a t × u matrix has the complexity of O(stu) [59], whereas inversion of an n × n matrix with Gauss-Jordan elimination has the complexity of O(n 3 ) [60]. Relatedly, we recall that the mixture spectrum y ∈ R m and the dictionary Φ ∈ R m×n .
The least-squares fit argmin x ||y − Φx|| begins with the matrix multiplications Φ T y and Φ T Φ, which have the complexities of O(mn) and O(n 2 m), respectively. Subsequently, the inversion of Φ T Φ ∈ R n×n has the complexity of O(n 3 ). Since Φ T y ∈ R n×1 has already been calculated, the final matrix multiplication (Φ T Φ) −1 Φ T y has the complexity of O(n 2 ). Thereby, the complexity of the least-squares fit is O(n 3 + n 2 m).
Step 4 of ISMA is repeated n times, i.e., as many times as there are endmembers in the dictionary Φ. Thus, the overall computational complexity resulting from Step 4 is n · O(n 3 + n 2 m) = O(n 4 + n 3 m). Since the dictionary Φ must be undercomplete in ISMA (i.e., n < m), the asymptotic computational complexity of AISMA with TCAE is O(n 3 m).

APPENDIX E RUNTIME EXPERIMENT
By using (18), we constructed ten sets of mixture spectra for each of the dictionaries Φ 2 , Φ 3 , and Φ 4 . Each set contained 10 000 mixture spectra and each mixture spectrum was corrupted with correlated noise. The SNR of each mixture spectrum was randomly chosen from the interval 20 to 50 dB (cf. Table II). The number of components in each mixture was randomly drawn from {1, 2, . . . , 5}.
Each unmixing algorithm was timed as it processed the ten sets. For further averaging, the process of constructing the ten sets and timing the algorithms was repeated ten times. Each runtime reported in Table VIII is the mean of the soobtained 100 durations. Thus, each runtime reported in Table VIII is the average time it takes to unmix 10 000 mixture spectra.