Blind source separation for non-stationary random fields

Regional data analysis is concerned with the analysis and modeling of measurements that are spatially separated, by specifically accounting for typical features of such data. Namely, measurements in close proximity tend to be more similar than measurements that are further apart. This may also hold true for cross-dependencies when multivariate spatial data are considered. Often, scientists are interested in linear transformations of such data which are easy to interpret and may be used for dimension reduction. Recently, spatial blind source separation (SBSS) was introduced for that purpose; it assumes that the observed data are formed by a linear mixture of uncorrelated, weakly stationary random fields. However, in practical applications it is well known that, as the spatial domain increases in size, the weak stationarity assumption can be violated in the sense that the second-order dependency varies over the domain, which calls for non-stationary analysis. In our work we extend the SBSS model to adjust for these stationarity violations, present three novel estimators and establish the identifiability and affine equivariance property of the unmixing matrix functionals defining these estimators. In an extensive simulation study we investigate the performance of our estimators and also show their use in the analysis of a geochemical dataset derived from the GEMAS geochemical mapping project.


Introduction
In spatial data analysis observations x(s_i), i = 1, . . . , n, are collected in a domain S ⊂ R^d where s_i ∈ S specifies the location of the observation x(s_i). In most applications d = 2, which will be assumed in the following if not mentioned otherwise. It is by now well established that when analyzing spatial data the proximity of different observation locations has to be taken into account, as observations located close to each other are expected to be more similar than observations further apart. The common way to account for this is via the covariance function C_x(s_i, s_j) = E((x(s_i) − E(x(s_i)))(x(s_j) − E(x(s_j)))).
To make working with spatial data more tractable one assumes that the spatial observations are realizations of a weakly stationary random field, which means one assumes that (i) E(x(s_i)) = µ for all s_i ∈ S and that (ii) C_x(s_i, s_j) = C_x(s_i + h, s_j + h), where h can be any shift with s_i, s_j, s_i + h and s_j + h ∈ S. These assumptions state that the mean is constant over the domain and that the covariance function is only a function of the difference between the sample locations but does not depend on the actual locations. This in turn means one can express the covariance function as a function of a single vector argument, namely the difference h = s_i − s_j. If additionally the covariance function only depends on the distance ‖h‖ = ‖s_i − s_j‖ then it is said to be isotropic. Usually, parametric covariance functions are specified and fitted to the data. One of the most popular parametric covariance functions is the isotropic stationary Matérn covariance function [1]

C(h; σ², ν, φ) = σ² / (2^{ν−1} Γ(ν)) (‖h‖/φ)^ν K_ν(‖h‖/φ),

where K_ν is the modified Bessel function of the second kind, Γ is the gamma function and σ² > 0, ν > 0 and φ > 0 are the variance, shape and range parameters, respectively.
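For illustration, this covariance function is straightforward to evaluate in R using the built-in Bessel function; the following minimal sketch, with an illustrative function name and illustrative parameter values, only serves to make the formula concrete.

```r
# Isotropic Matern covariance as given above; sigma2: variance, nu: shape,
# phi: range. The function name matern_cov and the parameter values are ours.
matern_cov <- function(h, sigma2 = 1, nu = 0.5, phi = 1) {
  u <- h / phi
  out <- sigma2 / (2^(nu - 1) * gamma(nu)) * u^nu * besselK(u, nu)
  out[h == 0] <- sigma2  # limit at lag zero, where besselK is not finite
  out
}

matern_cov(c(0, 0.5, 1, 2), sigma2 = 1, nu = 1, phi = 1)
```

For ν = 0.5 this reduces to the exponential covariance σ² exp(−‖h‖/φ), which can serve as a quick sanity check of the implementation.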
In many applications not only one variable is measured at each sample location but several, which yields multivariate spatial data where also cross-dependencies between the different variables have to be taken into account. Many suggestions and approaches for modeling the spatial cross-covariance function C_x(s_i, s_j) = E((x(s_i) − E(x(s_i)))(x(s_j) − E(x(s_j)))^T) of a p-variate random field x(s) are reviewed for example in [2], where it is also pointed out that it is not easy to construct flexible and valid spatial cross-covariance functions. One of the most popular approaches is the linear model of coregionalization (LMC) [3,4], where the multivariate covariance function is formed by r summands of p × p positive semi-definite coregionalization matrices T_k multiplied by univariate, parametric spatial correlation functions ρ_k(h). Formally, the LMC is stated as C(h) = Σ_{k=1}^{r} T_k ρ_k(h). Another approach is followed by [5], where the marginal and the cross-covariances are of the above Matérn covariance form. The marginal covariance functions are C_ii(h; σ²_ii, ν_ii, φ_ii) = σ²_ii C(h; 1, ν_ii, φ_ii) for i = 1, . . . , p, and the cross-covariances are given by C_ij(h; ρ_ij, σ_ii, σ_jj, ν_ij, φ_ij) = ρ_ij σ_ii σ_jj C(h; 1, ν_ij, φ_ij) for i, j = 1, . . . , p, i ≠ j.
Conditions on the shape, range, variance and correlation parameters ν_ij, φ_ij, σ²_ii and ρ_ij for i, j = 1, . . . , p which result in a valid multivariate cross-covariance function can be formulated; however, these conditions are rather involved and therefore the interested reader is referred to [5]. Similarly to the univariate case, the two families of cross-covariance functions above, and many others, make the assumption of weak stationarity and isotropy.
As the domains in modern applications are often huge, it is by now commonly accepted that the weak stationarity assumption is convenient but not realistic. Stationarity seems justifiable on a local scale but not globally. Thus, recent years saw an increased interest in developing spatial methods which do not assume weak stationarity, with the focus mainly on univariate approaches. For example, [6] reviews four different strategies to develop non-stationary covariance functions, where the most popular approach seems to be based on spatial deformations. [7] focus on extending the Matérn covariance to the non-stationary case by letting the shape, scale and variance parameters vary over the spatial domain. For the multivariate case, [8] point out that extensions to non-stationary cross-covariance functions are even more challenging to develop. [9] extend the LMC to account for non-stationarity and [10] extend the multivariate Matérn model, both by introducing spatially varying parameters. [8] on the other hand extend the spatial deformation approach to the multivariate setting and also review some other approaches. In any case, all the discussed approaches start with the selection of one or more cross-covariance functions which are then fitted to the data.
For multivariate spatial data, [11,12] recently suggested another approach, denoted spatial blind source separation (SBSS). In SBSS the p-variate random field x(s) is decomposed into p uncorrelated / independent components which allows independent univariate modeling. However, SBSS also assumes weak stationarity. The goal of this work is to extend SBSS to the case of non-stationary spatial data, which makes it possible to discard the complex multivariate covariance modeling in favor of individual univariate modeling.
The structure of the paper is as follows. In Section 2 we specify the exact considered spatial non-stationary blind source separation model. Three estimators for recovering the latent random fields are introduced in Section 3, where the identifiability and affine equivariance properties of the underlying unmixing matrix functionals are studied. In an extensive simulation study we test the validity of our estimators in Section 4 and illustrate their use on an environmental example in Section 5. Lastly, we conclude the paper in Section 6 and outline ideas for further research. The appendix contains the proofs of the stated propositions.

A non-stationary spatial blind source separation model
For the remainder of the paper we assume that the random field x(s) at hand follows a spatial non-stationary (blind) source separation (SNSS) model which is defined as follows.
Definition 1 (Spatial non-stationary source separation model). A p-variate random field x(s) defined on a d-dimensional spatial domain S ⊆ R^d follows a spatial non-stationary source separation model (SNSS) if it can be formulated as

x(s) = A z(s) + b,

where A is a deterministic invertible p × p mixing matrix, b is a p-variate deterministic location vector and z(s) is a p-variate latent random field which fulfills the following assumptions:

(SNSS 1) E(z(s)) = 0 for all s ∈ S,
(SNSS 2) Cov(z(s)) = E(z(s)z(s)^T) = Σ_s, where Σ_s is a positive definite diagonal matrix for all s ∈ S, and
(SNSS 3) Cov(z(s), z(s')) = E(z(s)z(s')^T) = Σ_{s,s'} for all s ≠ s' ∈ S, where Σ_{s,s'} is a diagonal matrix depending on s and s'.
In practical considerations the random field x(s) of Definition 1 is observed on a set of n deterministic sample locations C = {s 1 , . . . , s n } ⊂ S which is a natural assumption for geostatistical applications. The domain S can be thought of as a continuous version of the sample locations C and can in principle be of any shape, but for convenience it is often a d-dimensional hyperrectangle, which so-to-speak covers C.
Assumption (SNSS 1) states that the mean of each entry of the latent random field is a constant for the whole domain.
In contrast, assumptions (SNSS 2) and (SNSS 3) allow the diagonal covariance as well as the diagonal spatial cross-covariance matrices to depend on the specific sample locations. Altogether, the observed random field is formed by uncorrelated latent random fields that are non-stationary in the sense that the second order dependencies are allowed to vary across the spatial domain. Often, however, the assumption of uncorrelated latent components is replaced by the stronger assumption of mutual independence. For general overviews of blind source separation (BSS) methods and their assumptions see for example [13,14]. The SNSS model here can be seen as a spatial variant of the non-stationary time series model which is for example considered in [15,16,17,18].
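To make the model concrete, the following minimal R sketch generates a toy realization of Definition 1 with assumed, purely illustrative values: three independent latent components whose variances differ between the left and the right half of the unit square (for simplicity without spatial dependence within the components), mixed by an invertible matrix A and shifted by b. Later sketches in this paper reuse these objects.

```r
# Toy realization of the SNSS model x(s) = A z(s) + b (illustrative values only)
set.seed(1)
n <- 1000
coords <- cbind(runif(n), runif(n))                  # sample locations s_1, ..., s_n
left   <- coords[, 1] < 0.5                          # indicator of the left half of S
sds    <- cbind(ifelse(left, 1, 3),                  # location dependent standard
                ifelse(left, 2, 0.5),                # deviations of the three
                ifelse(left, 0.5, 2))                # latent components
z <- sds * matrix(rnorm(n * 3), n, 3)                # latent field with diagonal Cov(z(s))
A <- matrix(c(2, 1, 0, 1, 3, 1, 0, 1, 2), 3, 3)      # invertible mixing matrix
b <- c(1, -1, 0)                                     # location vector
x <- z %*% t(A) + rep(b, each = n)                   # observed field x(s_i) = A z(s_i) + b
```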
If (SNSS 2) and (SNSS 3) are forced to be stationary, i.e., Σ_s is constant and the diagonal matrix Σ_{s,s'} carries stationary covariance functions on its diagonal, i.e. functions only of the difference vector h between s and s', then the model of Definition 1 corresponds to the (stationary) SBSS model discussed in detail in [11,12].
The main goal of SNSS is to recover the true latent random field z(s) based on x(s) alone. Thus, an unmixing matrix functional W = W(x(s)) and a location functional T = T(x(s)) are required such that z(s) = W(x(s) − T). Note that assumptions (SNSS 1)-(SNSS 3) are not sufficient to make this a well-defined problem, as the conditions do not fix the order, signs and scales of the latent components of z(s). That is, let J = PSD where P denotes a permutation matrix, S a sign-change matrix and D a diagonal matrix with positive diagonal values. Then, the pairs (A, z(s)) and (AJ^{-1}, Jz(s)) both lead to the same x(s) and fulfill all requirements of Definition 1, hence they are not distinguishable. Therefore, recovering z(s) is only possible up to order, signs and scales, an ambiguity present in all BSS models that is not considered a problem. For a detailed discussion of identifiability and general ambiguities in BSS models see for example [12,19,20].
Another requirement of unmixing matrix functionals is the affine equivariance property [21], which states that the same latent random field is recovered (up to order and sign) independent of the exact way of mixing. Let x(s) be a random field and x*(s) = Bx(s) + a its affine transformed version, where B is any invertible p × p matrix and a is any p-dimensional vector. For an affine equivariant unmixing matrix functional it holds that W(Bx(s) + a) = W(x(s))B^{-1} = A^{-1}B^{-1}, up to order and sign of the row vectors. Multivariate statistical tools fulfilling this property belong to the more general invariant coordinate system (ICS) framework [22].
The following definition formally states identifiability and the affine equivariance property of unmixing matrix functionals discussed before. Definition 2 (Unmixing matrix functional). For a random field x(s) following the SNSS model (Definition 1) a p × p matrix-valued functional W(x(s)) is an unmixing matrix functional if it satisfies: (Identifiability) W(x(s))A = PSD for some permutation matrix P, sign change matrix S and diagonal matrix with strictly positive diagonal elements D.
(Affine equivariance) W(Bx(s) + a) = PSW(x(s))B^{-1}, where B is an invertible p × p matrix, a is a p-dimensional vector, P is some permutation matrix and S is some sign change matrix.
In the subsequent section we introduce three unmixing matrix functionals that solve the above stated SNSS problem and investigate their identifiability and affine equivariance properties.

Three SNSS methods
The goal of this section is to introduce three different unmixing matrix functionals W(x(s)) that can be used in conjunction with any location functional T(x(s)) to recover the latent random field z(s) by W(x(s)) (x(s) − T(x(s))).
For T(x(s)) we simply use the expectation and in the following focus our discussion solely on W(x(s)).
The key quantities for all three following unmixing matrix functionals are so-called local covariance matrices, defined as

M_{S,f}(x(s)) = (1 / |S ∩ C|) Σ_{s_i, s_j ∈ S ∩ C} f(s_i − s_j) E((x(s_i) − E(x(s_i)))(x(s_j) − E(x(s_j)))^T).    (2)

Local covariance matrices were introduced in [11] and refined in [12,23] in the context of SBSS for the second order stationary case. Note that in Equation (2) we allow the considered spatial domain S not to contain all of C, which differs slightly from the original definition; this will be useful when considering subdomains (see Section 3).
The matrices M_{S,f}(x(s)) compute a weighted average of the spatial covariances of all available pairs of coordinates in S ∩ C, where the weights are determined by the so-called spatial kernel function f : R^d → R. Three options are introduced in [12] as follows.
• Ball kernel: f(h) = I(‖h‖ ≤ r),
• Ring kernel: f(h) = I(r_1 < ‖h‖ ≤ r_2),
• Gauss kernel: f(h) = exp(−0.5 (Φ^{-1}(0.95) ‖h‖ / r)²).

Here I(·) denotes the indicator function and Φ^{-1} the quantile function of the standard normal distribution. All three kernel functions above assume isotropic random fields as they only depend on the norm of their argument h. It is possible to define spatial kernel functions differently and account for possible anisotropies present in the random fields; this is, however, beyond the scope of this paper.
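For illustration, these kernel functions can be written directly in R as functions of the lag vector h; the constant Φ^{-1}(0.95) in the Gauss kernel follows common SBSS implementations and is stated here as an assumption.

```r
# The three spatial kernel functions; they depend on h only through its norm.
ball_kernel  <- function(h, r) as.numeric(sqrt(sum(h^2)) <= r)
ring_kernel  <- function(h, r1, r2) {
  d <- sqrt(sum(h^2))
  as.numeric(d > r1 & d <= r2)
}
gauss_kernel <- function(h, r) exp(-0.5 * (qnorm(0.95) * sqrt(sum(h^2)) / r)^2)
```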
For the special case of a ball kernel with parameter r = 0, denoted as f_0, local covariance matrices reduce to the average covariance in S where no spatial dependence is utilized. Formally,

M_{S,f_0}(x(s)) = (1 / |S ∩ C|) Σ_{s_i ∈ S ∩ C} E((x(s_i) − E(x(s_i)))(x(s_i) − E(x(s_i)))^T).

Considering a finite sample, the estimation of the subsequently introduced unmixing matrix functionals is carried out by replacing the population quantities from Equation (2) by their sample counterparts. Specifically, the corresponding sample version of Equation (2) is given by

M̂_{S,f}(x(s)) = (1 / |S ∩ C|) Σ_{s_i, s_j ∈ S ∩ C} f(s_i − s_j)(x(s_i) − x̄)(x(s_j) − x̄)^T,    (3)

which also defines the sample version of M_{S,f_0}(x(s)). Additionally, we always estimate the location functional T(x(s)) by the sample mean x̄.
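A minimal R sketch of the sample local covariance matrix in Equation (3) is given below; it reuses the toy data and kernel functions from the earlier sketches and uses a double loop for clarity rather than speed.

```r
# Sample local covariance matrix of Equation (3) for one sub-domain
local_cov <- function(x, coords, kernel, ...) {
  n  <- nrow(x)
  xc <- sweep(x, 2, colMeans(x))                     # centre with the sample mean
  M  <- matrix(0, ncol(x), ncol(x))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      w <- kernel(coords[i, ] - coords[j, ], ...)    # kernel weight f(s_i - s_j)
      if (w != 0) M <- M + w * tcrossprod(xc[i, ], xc[j, ])
    }
  }
  M / n
}

# f_0 (ball kernel with r = 0) yields the average covariance of the left block
M0_left <- local_cov(x[left, ], coords[left, ], ball_kernel, r = 0)
```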
For a random field x(s) following the SNSS model (Definition 1) we observe that M_{S,f_0}(z(s)) as well as M_{S,f}(z(s)) yield diagonal matrices for all kernel function options discussed above, which motivates the following three estimators.

Simultaneous diagonalization of two average covariance matrices
The first unmixing matrix functional is based on the simultaneous diagonalization (sd) of two average covariance matrices, which is formalized in the following definition.
Definition 3 (SNSS.sd functional). Consider a random field x(s) following the SNSS model (Definition 1) and a partition of the spatial domain S into S_1, S_2 with S_1 ∩ S_2 = ∅. The SNSS.sd functional W = W(x(s)) is defined as the simultaneous diagonalizer satisfying W M_{S_1,f_0}(x(s)) W^T = I_p and W M_{S_2,f_0}(x(s)) W^T = D_{S_1 S_2}, where D_{S_1 S_2} is a diagonal matrix with decreasingly ordered diagonal elements.
Given a sample, an unmixing matrix W can be found by solving the generalized eigenvalue-eigenvector problem, which always yields exact diagonalization of the former two matrices. Furthermore, the decreasing ordering of the diagonal elements of D_{S_1 S_2} comes without loss of generality as the order of the latent random field can anyhow only be recovered up to permutations. The following proposition gives a necessary and sufficient condition for the identifiability of the above unmixing matrix functional as well as the desired affine equivariance property.

Proposition 1. The SNSS.sd functional seen in Definition 3 is (1) identifiable as seen in Definition 2 if and only if the elements of the diagonal matrix M_{S_1,f_0}^{-1}(z(s)) M_{S_2,f_0}(z(s)) are pairwise distinct, (2) affine equivariant as seen in Definition 2.
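The following minimal R sketch illustrates the SNSS.sd estimator on the toy data from Section 2 by solving the generalized eigenproblem directly; this is an illustrative rather than numerically optimized route.

```r
# SNSS.sd sketch: simultaneously diagonalize the average covariances of the
# two halves of the toy domain via a generalized eigendecomposition.
M1 <- local_cov(x[left,  ], coords[left,  ], ball_kernel, r = 0)
M2 <- local_cov(x[!left, ], coords[!left, ], ball_kernel, r = 0)
evd <- eigen(solve(M1) %*% M2)                 # solves M2 u = lambda M1 u
W_sd <- t(evd$vectors)
W_sd <- W_sd / sqrt(diag(W_sd %*% M1 %*% t(W_sd)))   # rescale rows so W M1 W' = I_p
z_hat <- sweep(x, 2, colMeans(x)) %*% t(W_sd)        # latent field up to sign, order, scale
```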
According to [18, Result 1], to ensure identifiability there need to exist at least two locations s_1, s_2 ∈ C for which the elements of the diagonal matrix Σ_{s_1}^{-1} Σ_{s_2} are pairwise distinct (where Σ_s refers to the covariance matrices from Definition 1). If this holds, then it is possible to find two disjoint sub-domains S_1, S_2 of S such that all elements of the diagonal matrix M_{S_1,f_0}^{-1}(z(s)) M_{S_2,f_0}(z(s)) are pairwise distinct. Note that [18, Result 1] is formulated for the non-stationary blind source separation model for time series; the above outline is the natural extension of this statement to the spatial non-stationary case. However, in practical considerations the desired partition is unknown, therefore the a-priori choice of the partition of the domain is not trivial and greatly affects the performance of the method. This issue is addressed in the following extension of the former unmixing matrix functional.

Joint diagonalization of more than two average covariance matrices
In contrast to the former method, the spatial domain is divided into more than two subdomains and the corresponding average covariance matrices are jointly diagonalized (jd) as follows.

Definition 4 (SNSS.jd functional). Consider a random field x(s) following the SNSS model (Definition 1) and a partition of the spatial domain S into K ≥ 2 subdomains S_1, . . . , S_K with S_k ∩ S_l = ∅ for k ≠ l. Standardize the random field by x^{st}(s) = M_{S,f_0}^{-1/2}(x(s))(x(s) − T(x(s))) and let U be an orthogonal p × p matrix that maximizes

Σ_{k=1}^{K} ‖diag(U M_{S_k,f_0}(x^{st}(s)) U^T)‖_F².

Then, the SNSS.jd functional equals W(x(s)) = U M_{S,f_0}^{-1/2}(x(s)).
In the above definition diag(·) denotes a diagonal matrix with diagonal elements equal to those of the matrix-valued argument, and ‖·‖_F denotes the Frobenius norm. U is called an orthogonal joint diagonalizer of the matrices M_{S_k,f_0}(x^{st}(s)) for k = 1, . . . , K, as maximizing the diagonal elements is equivalent to minimizing the off-diagonal elements by the orthogonal invariance of the Frobenius norm. Note that for a finite sample the sample versions of the matrices M_{S_k,f_0}(x^{st}(s)) for k = 1, . . . , K given by Equation (3) usually do not commute, hence exact joint diagonalization is impossible. Therefore, algorithms that find an approximate joint diagonalizer are needed. We choose one such algorithm that relies on Givens rotations [24], but many others are available, see for example [25].
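A minimal R sketch of the resulting SNSS.jd estimator, using the rjd function of the JADE package for the Givens-rotation based approximate joint diagonalization and reusing the toy data and the local_cov function from above, could look as follows.

```r
# SNSS.jd sketch: whiten with the average covariance of the whole domain, then
# jointly diagonalize the block-wise covariances of the whitened data.
library(JADE)

M_all   <- local_cov(x, coords, ball_kernel, r = 0)
M_isqrt <- with(eigen(M_all, symmetric = TRUE),
                vectors %*% diag(1 / sqrt(values)) %*% t(vectors))  # M_all^{-1/2}
x_st    <- sweep(x, 2, colMeans(x)) %*% M_isqrt                     # standardized field

lower   <- coords[, 2] < 0.5
blocks  <- list(left & lower, left & !lower, !left & lower, !left & !lower)  # 2 x 2 grid
M_stack <- sapply(blocks, function(idx)
  local_cov(x_st[idx, ], coords[idx, ], ball_kernel, r = 0))
dim(M_stack) <- c(ncol(x), ncol(x), length(blocks))                 # p x p x K array

U    <- rjd(M_stack)$V                    # orthogonal (approximate) joint diagonalizer
W_jd <- t(U) %*% M_isqrt                  # unmixing matrix of Definition 4
```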
The next proposition is concerned with identifiability as well as affine equivariance.

Proposition 2. The SNSS.jd functional seen in Definition 4 is (1) identifiable as seen in Definition 2 if for every pair 1 ≤ i < j ≤ p there exists at least one k ∈ {1, . . . , K} such that the i-th and j-th diagonal elements of M_{S,f_0}^{-1}(z(s)) M_{S_k,f_0}(z(s)) differ, (2) affine equivariant as seen in Definition 2.
The condition for identifiability given in Proposition 2 is more general than the one in Proposition 1 as a finer partition of the domain is allowed. Therefore, the exact partition of the domain should have less influence on the performance of the SNSS.jd method as long as enough sub-partitions are considered. In practical applications it might be useful to simply overlay the spatial domain with a grid formed by equally sized square blocks which define the sub-division of S, a procedure that we investigate in more detail in the simulation study in Section 4; a minimal sketch of such a block assignment is given below. The advantage of lower sensitivity to the exact domain sub-division of the SNSS.jd method comes at the cost of giving up the exact diagonalization of the SNSS.sd method, which introduces more computational complexity as joint diagonalization algorithms need to be applied.
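A possible block assignment, assuming an axis-aligned grid with a user-chosen block size, is sketched below for the toy data.

```r
# Assign each sample location to a block of an overlaid grid of equally sized
# square blocks; block_size is a tuning choice of the user.
block_id <- function(coords, block_size) {
  interaction(floor(coords[, 1] / block_size),
              floor(coords[, 2] / block_size), drop = TRUE)
}
table(block_id(coords, block_size = 0.5))   # 2 x 2 grid for the toy data on [0, 1]^2
```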
Both former methods have in common that only the spatial ordering of the points is taken into account, but not the spatial dependencies between them, when computing the unmixing matrix. A trivial example which would cause problems is the case when the matrices Σ_s are the identity matrix for all s ∈ S but Σ_{s,s'} is non-zero and spatially dependent. In that case the identifiability condition of Proposition 2 and consequently the one of Proposition 1 do not hold and the two methods fail. Recovering the latent random field is then still possible when considering second order spatial dependencies, as suggested in the following approach.

Joint diagonalization of more than two local covariance matrices
The following SNSS.sjd method divides the domain into at least two parts and jointly diagonalizes the corresponding local covariance matrices for a set of kernel functions; it therefore utilizes second order spatial dependence (sjd). Again, as in the case of the SNSS.jd method, for a finite sample approximate joint diagonalization algorithms need to be used.
When setting the number of spatial kernel functions to L = 1 and the resulting spatial kernel function to f = f_0, the SNSS.sjd method reduces to the SNSS.jd method. If additionally the spatial domain is only divided into two parts and the transformation step is adapted accordingly, the SNSS.sjd method further reduces to the SNSS.sd method. In a similar manner, if the choice of the spatial kernel functions is free but the domain is not partitioned, then the original SBSS method as introduced in [11,12] is obtained.
Identifiability and affine equivariance results are obtained in the following proposition.

Simulations
In this part we investigate the performance of the different unmixing matrix estimators introduced above in an extensive simulation study. All simulations are carried out in R version 3.6.1 [26] with the help of the packages SpatialBSS [27], JADE [28] and RandomFields [29].
We always use square two-dimensional domains of the form S = [0, n] × [0, n] (later also denoted as n × n) where n ∈ {20, 30, 40, 50, 60, 70}. The set of sample locations C is formed by two different patterns, namely a uniform and a skewed pattern. For the uniform coordinate pattern, n² x and y values are sampled from the uniform distribution U(0, 1) and then multiplied by n. The skewed coordinate pattern is formed by n² x values sampled from the beta distribution β(2, 5) and n² y values sampled from the uniform distribution U(0, 1); again all values are multiplied by n. This way of sampling coordinates ensures that the density of sample locations is the same for all domain sizes. In the case of the uniform pattern it equals one throughout the whole domain, whereas the skewed pattern shows more densely packed sample locations in the left half of the domain. Figure 1 depicts one example of the uniform and the skewed coordinate pattern for different domain sizes. A minimal sketch of how such coordinate patterns can be generated is given below.
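```r
# Sampling the two coordinate patterns used in the simulations, here for n = 30;
# by construction the density of sample locations does not depend on n.
n <- 30
coords_unif <- n * cbind(runif(n^2), runif(n^2))         # uniform pattern on [0, n]^2
coords_skew <- n * cbind(rbeta(n^2, 2, 5), runif(n^2))   # skewed pattern, denser on the left
```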
Moreover, we randomly divide the spatial domain at hand into three different parts. This is done by randomly placing three locations on the spatial domain that act as cluster centers, which are depicted by the crosses (×) in Figure 1.

Settings 4 and 5
These settings are based on the non-stationary extension of the Matérn covariance function presented in [7], in which K_ν is again the modified Bessel function of the second kind and σ²: S → R_+, ν: S → R_+ and φ: S → R_+ are local variance, shape and range parameter functions that vary over the spatial domain.

For the SNSS.jd and SNSS.sjd methods we divide the domain at hand into four equal squared blocks as shown in the right panel of Figure 1. Additionally, for the SNSS.sjd method we either use a ball kernel with r = 2 (SNSS.sjd B(2)) together with f_0, or a ring kernel with (r_1, r_2) = (0, 2) (SNSS.sjd R(0,2)) together with f_0. This choice keeps the average number of sample locations inside the kernel at r²π ≈ 12 for the uniform setting. As contender methods we estimate the unmixing matrix with the SBSS method, introduced in [11,12], with the same spatial kernel function settings as before but without f_0 (SBSS B(2) and SBSS R(0,2)). Lastly, we use the fourth order blind identification (FOBI) method, a popular independent component analysis (ICA) method that does not utilize spatial information but fourth order cumulants, see [30,31].
To evaluate the quality of the unmixing matrix estimate Ŵ obtained from the different methods we use the minimum distance index (MDI) [32,33], which is defined as

MDI(Ŵ) = (p − 1)^{−1/2} inf_{J ∈ J} ‖JŴA − I_p‖_F.

Here, J is the set of all matrices that carry exactly one non-zero element in each row and column, which corresponds to all matrices of the form PSD that are exactly the indeterminacies of our model definition. The MDI is a function taking values between 0 and 1, where values closer to 0 indicate better recovery of the mixing matrix.

The average MDI based on 2000 simulation iterations for the above estimators for all six considered random field models is presented in Figure 2 for the uniform sample location pattern. As the random field in Setting 1 shows no spatial dependence, all SBSS methods completely fail since they rely solely on spatial dependencies, and the SNSS.jd method outperforms all contender methods. SNSS.sd is inferior, which might be explained by the fact that it only halves the spatial domain, whereas SNSS.jd uses four equally sized sub-domains. Even though the SNSS.sjd methods use the sample covariance matrix inside each sub-domain, additional (non-informative) local covariance matrices are used which might bring noise into the joint diagonalization algorithm and therefore reduce its performance in this setting. In contrast to Setting 1, only methods that rely on spatial dependencies perform well in Settings 2 and 4, as the variance in these settings equals one for each entry of the random field globally. Interestingly, the SBSS methods still perform well in Setting 2, which might result from the fact that this setting is based on stationary covariance functions. In Setting 4 SBSS is clearly outperformed by the SNSS.sjd method. As the covariance is non-constant in Settings 3 and 5, the SNSS.sd and SNSS.jd methods also show good performance there. Lastly, as Setting 6 is formed by stationary latent fields with globally constant variances, only the SBSS and the SNSS.sjd methods are expected to deliver meaningful results. However, SBSS shows a better performance because the domain is not split into parts, therefore the effective sample size for the local covariance estimation is higher, leading to a better separation. Interestingly, for all simulations where the variance is non-constant, FOBI improves its performance as the sample size increases. Also, the choice of the kernel function for SNSS.sjd does not seem to have a high impact.
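In practice the MDI can be computed with the MD function of the JADE package; applied to the toy sketches from Section 3 this reads, for example, as follows.

```r
# Minimum distance index of an unmixing matrix estimate; 0 corresponds to
# perfect recovery up to the model indeterminacies PSD.
library(JADE)
MD(W_jd, A)   # unmixing estimate from the SNSS.jd sketch and the true mixing matrix
```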
The results for the skewed sample location pattern are presented in Figure 3. The qualitative results are very similar to the uniform setting with two differences. Firstly, the overall performance worsens for all methods due to the imbalanced distribution of the sample locations. Secondly, the SNSS.sd method, where the domain is halved across the y axis, clearly improves its performance as the sample location density is still constant along the y axis.
The former simulations are carried out for a fixed partition of the spatial domain for the SNSS.jd and SNSS.sjd methods.
In this part we investigate the influence of different partitions on the overall performance of the unmixing matrix estimation. We consider sub-divisions into four (2 × 2), nine (3 × 3) and 16 (4 × 4) equally sized squared blocks for both methods. As examples, the 3 × 3 division is depicted in the left panel and the 2 × 2 division in the right panel of Figure 1.
Additionally, we halve the domain across the x and the y axes for the SNSS.sjd method. The mean MDIs based on 2000 simulation repetitions are shown in Figure 4 for the uniform sample location pattern; we do not present the results for the skewed setting as the qualitative results are very similar to the uniform ones. Overall, the influence of the domain sub-division is very minor except for the SNSS.sjd method in Settings 1 and 6. Again, in Setting 6 the performance increases as the sub-division of the domain becomes coarser and more information is available to estimate the matrices of interest. The optimal case is given when the domain is not divided at all, which leads to the original SBSS method.
Generally, the simulation study showed that SNSS.sjd is a particularly good method: it always improves its performance with increasing sample size, its performance is never among the worst and it is among the best in four out of six simulation settings. Therefore, we investigate the usefulness of the SNSS.sjd method on a real data example in the following.

Data example
In this section we illustrate the use of the above introduced methods on an environmental application. Specifically, we consider a dataset derived from the GEMAS geochemical mapping project [34] which consists of concentration measurements of 18 elements (Al, Ba, Ca, Cr, Fe, K, Mg, Mn, Na, Nb, P, Si, Sr, Ti, V, Y, Zn, Zr) in 2017 agricultural soil samples. This dataset is freely available in the R package robCompositions [35].
As is common practice in geochemical applications, we respect the relative information of the data by performing typical compositional data analysis transformations prior to the actual SNSS analysis. In a BSS context this is for example discussed in [36,11], and we proceed in exactly the same fashion as outlined in [11]. We first perform an isometric log-ratio (ilr) transformation using pivot coordinates and then apply the SNSS method. The loadings matrix is formed by combining the contrast matrix and the estimated unmixing matrix. Here the contrast matrix is an orthogonal matrix that transforms the data from centered log-ratio (clr) into ilr coordinates. Details on clr, ilr and compositional data analysis in general are given for example in [37]. Note that the ilr transformation reduces the dimension of the dataset by one, therefore p = 17.
We carry out SNSS.sjd as it showed the overall best performance in the simulation study above. We divide the domain into four equally sized rectangles; the four resulting blocks of sample locations are depicted in the left panel of Figure 5. The circle in that figure illustrates the parameter (r_1, r_2) = (0°, 1.5°) of the used ring kernel function, and the right panel of Figure 5 shows boxplots of the number of neighboring sample locations defined by this ring kernel choice for each of the four considered blocks of sample locations. In addition to the ring kernel function we also include the covariance matrix for each of the four blocks (kernel function f_0).
We compute moving block variance maps for each entry of the latent random field to indicate possible non-stationary variances. Specifically, we overlaid the domain with a grid with a resolution of one degree where the center is placed on the minimum longitude and latitude values present in the dataset. The variance for each cell of the grid is computed by considering all sample locations that lie inside a block of size 3° × 3° that is placed on that cell. A minimal sketch of this computation is given below.
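The sketch below assumes blocks centred on the grid cells; the function name and arguments are illustrative and not part of any package.

```r
# Moving block variance map for one latent component: per grid cell, compute
# the variance of all scores whose location lies in a block x block degree
# window placed on that cell (NA if fewer than two locations fall inside).
block_variance_map <- function(score, lon, lat, res = 1, block = 3) {
  gx <- seq(min(lon), max(lon), by = res)
  gy <- seq(min(lat), max(lat), by = res)
  outer(gx, gy, Vectorize(function(cx, cy) {
    inside <- abs(lon - cx) <= block / 2 & abs(lat - cy) <= block / 2
    if (sum(inside) > 1) var(score[inside]) else NA_real_
  }))
}
```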
After visual inspection of all recovered entries of the latent random field and the corresponding moving block variance maps, we present the first two entries as examples in Figures 6 and 7. The corresponding combined loadings (matrix product of the contrast and the estimated unmixing matrix) that transform the clr data into the first and second entries of the latent random field are presented in Table 1. A cluster of high values for the first component of the latent random field is found on the Iberian Peninsula. This cluster is mostly formed by the high balance between the pairs Al, Ti and Na, V, as the corresponding loadings show roughly equal values with opposite signs. The second component of the latent random field shows a cluster of high variance as well as high values in Greece, along the Balkans up to the northern and central parts of Italy. The high loading of clr(Al) and the roughly equal absolute values of the clr(Cr) and clr(Zr) loadings suggest that this entry is mostly driven by a positive log-ratio between Cr and Zr combined with the high relative dominance of Al. The opposite effect is observed for the cluster of low values from central to eastern Europe and the southern part of Scandinavia. Deeper investigation of the found latent random field and the possibly driving physical phenomena can be carried out by geological experts.

Conclusion
BSS has been successfully used in many scientific applications [13]. BSS has a long tradition for iid data, where it is known as independent component analysis (ICA), and for stationary and non-stationary time series [38]. Recently, BSS approaches were suggested for stationary spatial data [11,12]. In this paper, we combine ideas from non-stationary time series methods and stationary spatial BSS to develop approaches for non-stationary spatial data. We formulate a spatial non-stationary blind source separation model and provide three different estimators that are based on the joint diagonalization of covariance and local covariance matrices for sub-divisions of the spatial domain. These estimators can easily be applied to spatial datasets with irregular sample locations, and their use is illustrated in an extensive simulation study and on an environmental application.
Interesting future research would be to derive asymptotic results for the different estimators. Furthermore, it is of great interest to explore the use of the SNSS methods in the context of spatial prediction. The entries of the latent random field are uncorrelated; therefore, p univariate non-stationary models can be built, which is much simpler than building one multivariate model for the original data. In the stationary case such an approach seemed promising, as discussed in [39]. Another interesting question would be to test whether all latent components are actually informative and non-stationary; perhaps some exhibit spatial dependence but are stationary, and others might be just white noise. In such cases modelling could be simplified. The separation of stationary processes from white noise in SBSS is for example discussed in [23]. We have focused so far on simple rectangular subdivisions of the domain at hand for the SNSS estimators, but irregular divisions might also be beneficial.
In a time series context, [40] viewed such a partition of the data as a realization of grouped data and adapted the BSS model to that case. A motivating example is EEG signals where the sensors are placed at the same locations for different patients, ensuring the same way of mixing; the measurements of each patient then form the different groups. However, the adaptation of this idea to the spatial setting is left for future work.
From the last equations W*B can be identified as the unmixing matrix W(x(s)), which leads to W(x*(s)) = W(x(s))B^{-1} and concludes the proof.
Proof of Proposition 2. 1. Definition 4 is a special case of Definition 5; therefore, the proof of Proposition 2 is a special case of the proof of Proposition 3 for L = 1. As M_{S,f_0}(z(s)) is a diagonal matrix with strictly positive diagonal elements by assumption, it follows that W A M_{S,f_0}^{1/2}(z(s)) = V where V is a p × p orthogonal matrix. From the maximization equation it follows that