A GPM-based algorithm for solving regularized Wasserstein barycenter problems in some spaces of probability measures

In this paper, we focus on the analysis of the regularized Wasserstein barycenter problem. We provide uniqueness and a characterization of the barycenter for two important classes of probability measures: (i) Gaussian distributions and (ii) $q$-Gaussian distributions, each regularized by a particular entropy functional. We propose an algorithm based on the gradient projection method in the space of matrices to compute these regularized barycenters. We also consider a general class of $\varphi$-exponential measures, for which only the non-regularized barycenter is studied. Finally, we numerically show the influence of the parameters and the stability of the algorithm under small perturbations of the data.


Regularization of barycenters in the Wasserstein space
In this paper we are interested in the regularization of barycenters in the Wasserstein space, which is a minimization problem of the form
$$\min_{\mu \in \mathcal{P}_2(\mathbb{R}^d)} \sum_{i=1}^n \lambda_i W_2^2(\mu, \mu_i) + \gamma F(\mu), \qquad (1)$$
where $\mathcal{P}_2(\mathbb{R}^d)$ is the Wasserstein space of probability measures on $\mathbb{R}^d$ with finite second moments; $\{\mu_i\}_{i=1}^n$ are $n$ given probability measures in $\mathcal{P}_2(\mathbb{R}^d)$; $W_2$ is the $L^2$-Wasserstein distance between two probability measures in $\mathcal{P}_2(\mathbb{R}^d)$ (cf. Section 2); and $F : \mathcal{P}_2(\mathbb{R}^d) \to \mathbb{R}$ is an entropy functional. Finally, $\gamma \geq 0$ is a given regularization parameter, and $\lambda_1, \ldots, \lambda_n$ are given non-negative numbers (weights) satisfying $\sum_{i=1}^n \lambda_i = 1$.

Literature review
Problem (1) for $\gamma = 0$ has been studied intensively in the literature. It was first studied by Knott and Smith [23] for Gaussian measures. In [1], Agueh and Carlier studied the general case, proving, among other things, the existence and uniqueness of a minimizer provided that one of the $\mu_i$'s vanishes on small sets (i.e., sets whose Hausdorff dimension is at most $d-1$). Examples of such measures include those that are absolutely continuous with respect to the Lebesgue measure. The minimizer is called the barycenter of the measures $\mu_i$ with weights $\lambda_i$, extending a classical characterization of the Euclidean barycenter. The article [1] has sparked many research activities, from both theoretical and computational aspects, over the last years. Wasserstein barycenters in different settings, such as over compact Riemannian manifolds [22] and over discrete data [4], have been investigated. In the compact Riemannian setting, the condition of vanishing on small sets ensuring uniqueness is replaced by absolute continuity with respect to the volume measure [22]. In the discrete setting, however, the uniqueness and absolute continuity of the barycenter are lost [4]. Connections between Wasserstein barycenters and optimal transport have been explored [29,21]. Several methods for the computation of the barycenter have been developed [13,2,24,30]. Recently, Wasserstein barycenters have found many applications in statistics, image processing and machine learning [31,26,33]. We refer the reader to the mentioned papers and the references therein for a more detailed account of the topic.
The case $\gamma > 0$ has been studied in the recent paper [8], where the existence, uniqueness and stability of a minimizer, called the regularized barycenter, have been established. In particular, that paper shows that if the regularizing functional is proper and lower semicontinuous (for the Wasserstein distance) and strictly convex on its domain, then there exists a unique regularized barycenter even in the case of discrete measures. In addition, the regularization parameter $\gamma$ was proved to yield smooth barycenters, especially when the input probability measures are irregular, which is useful for data analysis [7,32]. The regularized barycenter problem also resembles the time-discretization of Wasserstein gradient flows for dissipative evolution equations [20,3,11] and for the fractional heat equation [15] at a given time step, where $\{\mu_i\}$ represent discretized solutions at the previous steps and $\gamma$ is proportional to the time-step parameter.
Gaussian measures play an important role in the study of the Wasserstein barycenter problem since in this case a useful characterization of the barycenter exists [1,6], which gives rise to efficient computational algorithms such as the fixed-point approach [2] and the gradient projection method [24]. Our aim in this paper is to find a large class of probability measures for which the regularized barycenter can be explicitly characterized and computed similarly to the case of Gaussian measures. It is worth mentioning that many papers in the literature study a related problem of entropic regularization of optimal transport, where the Wasserstein distance itself is regularized by an entropic term. The problem of finding a closed-form solution of such problems in the case of Gaussian distributions has attracted increasing interest in the computational optimal transport and machine learning communities [16,17]. The problem that we study in this paper is different, since the entropy term is added outside of the Wasserstein distance.
We will study the regularization problem (1) for two important classes of probability measures, namely Gaussian and $q$-Gaussian measures, where the entropy functional is the negative Boltzmann entropy and the Tsallis entropy, respectively. In addition, we also study the non-regularized problem (i.e., (1) with $\gamma = 0$) for the class of $\varphi$-exponential measures, which contains both Gaussian measures and $q$-Gaussian measures as special cases, cf. Section 1.3 below. To state our main results, we now briefly recall the definition of $\varphi$-exponential measures; more details will be given in Section 2.

ϕ-exponential distributions
Let $\varphi$ be an increasing, positive, continuous function on $(0, \infty)$. The $\varphi$-logarithm is defined by [10]
$$\ln_\varphi(t) := \int_1^t \frac{ds}{\varphi(s)}, \qquad t > 0, \qquad (2)$$
which is increasing, concave and $C^1$ on $(0, \infty)$. Let $l_\varphi$ and $L_\varphi$ be respectively the infimum and the supremum of $\ln_\varphi$, that is,
$$l_\varphi := \lim_{t \downarrow 0} \ln_\varphi(t), \qquad L_\varphi := \lim_{t \to \infty} \ln_\varphi(t).$$
The function $\ln_\varphi$ has an inverse function, called the $\varphi$-exponential function, defined on $(l_\varphi, L_\varphi)$. This inverse function can be extended to the whole of $\mathbb{R}$ by
$$\exp_\varphi(t) := \begin{cases} 0 & \text{if } t \le l_\varphi,\\ \ln_\varphi^{-1}(t) & \text{if } t \in (l_\varphi, L_\varphi),\\ \infty & \text{if } t \ge L_\varphi, \end{cases} \qquad (3)$$
which is $C^1$ on $(l_\varphi, L_\varphi)$. Let $S(d, \mathbb{R})^+$ be the set of symmetric positive definite matrices of order $d$. Let $v \in \mathbb{R}^d$ be a given vector and $V \in S(d, \mathbb{R})^+$ a given symmetric positive definite matrix. The $\varphi$-exponential measure with mean $v$ and covariance matrix $V$, denoted by $G_\varphi(v, V)$, is the probability measure on $\mathbb{R}^d$ with Lebesgue density
$$\frac{dG_\varphi(v, V)}{dx} = \frac{1}{\sqrt{\det V}}\, \exp_\varphi\!\Big(c_\varphi - \lambda_\varphi\, \frac{|x - v|_V^2}{2}\Big), \qquad (4)$$
where $|x|_V^2 := \langle x, V^{-1} x \rangle$ and $\lambda_\varphi$ and $c_\varphi$ are normalization constants. Two important examples of $\varphi$-exponential measures are Gaussian measures and $q$-Gaussian measures, corresponding to $\varphi(s) = s$ and $\varphi(s) = s^q$ respectively. The $\varphi$-exponential measures play an important role in statistical physics, information geometry and the analysis of nonlinear diffusion equations [28,27,34,35]. More information about $\varphi$-exponential measures will be reviewed in Section 2.
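For a concrete feel for the $\varphi$-logarithm, the integral definition $\ln_\varphi(t) = \int_1^t ds/\varphi(s)$ can be checked numerically. The following Python sketch is our own illustration (the function name `ln_phi` and the midpoint-rule discretization are not from the paper); it recovers the ordinary logarithm for $\varphi(s) = s$ and the $q$-logarithm for $\varphi(s) = s^q$.

```python
import math

def ln_phi(t, phi, n=100_000):
    """phi-logarithm ln_phi(t) = integral from 1 to t of ds / phi(s),
    approximated with the composite midpoint rule on n subintervals (t > 1)."""
    h = (t - 1.0) / n
    return h * sum(1.0 / phi(1.0 + (k + 0.5) * h) for k in range(n))

# phi(s) = s recovers the ordinary logarithm
print(ln_phi(2.0, lambda s: s))        # close to math.log(2)

# phi(s) = s^q recovers the q-logarithm (t^(1-q) - 1)/(1-q), here with q = 1/2
q = 0.5
print(ln_phi(2.0, lambda s: s ** q))   # close to (2**(1-q) - 1)/(1-q)
```

The two printed values agree with the closed-form logarithms to many digits, which is a quick consistency check on the definition.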

Main results of the paper
As already mentioned, in this paper we study the regularization problem (1) for Gaussian measures and $q$-Gaussian measures, where the entropy functional is the (negative) Boltzmann entropy functional and the Tsallis entropy functional, respectively, as well as the non-regularized problem for $\varphi$-exponential distributions. The main results of the present paper are explicit characterizations of the minimizer of (1) and properties of the objective functions, which can be summarized as follows.
1. Suppose that for each $i = 1, \ldots, n$, $\mu_i$ is a $q$-Gaussian measure (a Gaussian measure when $q = 1$) with mean zero and covariance matrix $A_i \in S(d, \mathbb{R})^+$. Then the regularized barycenter problem (1) has a unique minimizer, which is also a $q$-Gaussian measure with mean zero, whose covariance matrix $X$ satisfies the nonlinear matrix equation (26) below, in which a constant $m(q, d)$ depending on $q$ and $d$ appears (see Theorem 4.1 for its explicit formula; in particular $m = 1$ when $q = 1$).
2. Suppose that for each $i = 1, \ldots, n$, $\mu_i$ is a $\varphi$-exponential measure with mean zero and covariance matrix $A_i$. Then the unregularized barycenter problem (i.e., (1) with $\gamma = 0$) has a unique minimizer, which is also a $\varphi$-exponential measure with mean zero, whose covariance matrix $X$ satisfies
$$X = \sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2}.$$
The key to the analysis of the present paper is that the spaces of $\varphi$-exponential measures and Gaussian measures are isometric in the sense of Wasserstein geometry [34,35], that is,
$$W_2\big(G_\varphi(u, U), G_\varphi(v, V)\big) = W_2\big(\mathcal{N}(u, U), \mathcal{N}(v, V)\big),$$
where $\mathcal{N}(v, V)$ denotes the Gaussian measure with mean $v$ and covariance matrix $V$. Therefore, since the Wasserstein distance between Gaussian measures can be computed explicitly, the objective functional in (1) can also be computed explicitly in terms of the covariance matrices, and (1) becomes a minimization problem over the space of symmetric positive definite matrices. We then prove the strict convexity of the objective function and the existence of solutions to the optimality equation using matrix analysis tools as in [6]. Theorems 3.1, 4.1 and 5.1 establish the existence and uniqueness of a minimizer and provide an explicit characterization of the minimizer in terms of nonlinear matrix equations for the covariance matrix, generalizing the characterization of the Wasserstein barycenter for Gaussian measures in [1,6] to the regularized Wasserstein barycenter for Gaussian measures, $q$-Gaussian measures, and $\varphi$-exponential measures. Theorems 6.2 and 6.3 prove the Lipschitz continuity of the gradient of the objective function, providing an explicit upper bound for the Lipschitz constant and generalizing the results of [24] for the barycenter of Gaussian measures to our setting. We also perform numerical experiments to show the effect of the parameter $q$ and a stability property of the algorithm under small perturbations of the data, cf. Section 7.

Organization of the paper
The rest of the paper is organized as follows. In Section 2 we review relevant background on the Wasserstein metric and the Wasserstein geometry of Gaussian and $\varphi$-exponential distributions that will be used in subsequent sections. Then we study the regularization of barycenters for Gaussian measures in Section 3 and extend these results to $q$-Gaussian and $\varphi$-exponential measures in Sections 4 and 5. In Section 6 we describe a gradient projection method for the computation of the minimizer and prove that the gradient of the objective function is Lipschitz continuous. Finally, in Section 7, we numerically show the effect of the parameters on the minimizer and the stability of the algorithm under small perturbations of the data.

Wasserstein metric, Gaussian measures and ϕ-exponential measures
In this section, we summarize relevant knowledge that will be used in subsequent sections on the Wasserstein metric and the Wasserstein geometry of Gaussian and ϕ-exponential distributions.

Wasserstein metric
We recall that $\mathcal{P}_2(\mathbb{R}^d)$ is the space of probability measures $\mu$ on $\mathbb{R}^d$ with finite second moment, namely
$$\mathcal{P}_2(\mathbb{R}^d) := \Big\{\mu : \int_{\mathbb{R}^d} |x|^2\, d\mu(x) < \infty\Big\}.$$
Let $\mu$ and $\nu$ be two probability measures belonging to $\mathcal{P}_2(\mathbb{R}^d)$. The $L^2$-Wasserstein distance $W_2(\mu, \nu)$ between $\mu$ and $\nu$ is defined via
$$W_2^2(\mu, \nu) := \inf_{\gamma \in \Gamma(\mu, \nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^2\, d\gamma(x, y), \qquad (5)$$
where $\Gamma(\mu, \nu)$ denotes the set of transport plans between $\mu$ and $\nu$, i.e., the set of all probability measures on $\mathbb{R}^d \times \mathbb{R}^d$ having $\mu$ and $\nu$ as the first and second marginals respectively. More precisely,
$$\gamma(A \times \mathbb{R}^d) = \mu(A), \qquad \gamma(\mathbb{R}^d \times A) = \nu(A)$$
for all Borel measurable sets $A \subset \mathbb{R}^d$. It has been proved that, under rather general conditions (e.g., when $\mu$ and $\nu$ are absolutely continuous with respect to the Lebesgue measure), an optimal transport plan in (5) exists uniquely and is of the form $\gamma = [\mathrm{id} \times \nabla\psi]_\# \mu$ for some convex function $\psi$, where $\#$ denotes the push-forward [9,18]. The Wasserstein distance is an instance of a Monge--Kantorovich optimal transportation cost functional; it plays a key role in many branches of mathematics, such as optimal transportation, partial differential equations and geometric analysis, and has found many applications in other fields such as economics, statistical physics and, recently, machine learning. We refer the reader to the celebrated monograph [36] for a great exposition of the topic.
We now consider two important classes of probability measures, namely Gaussian measures and $\varphi$-exponential measures, for which there is an explicit expression for the Wasserstein distance between two members of the same class. Although Gaussian measures are special cases of $\varphi$-exponential measures, we consider them separately since many proofs for the former are much simpler than those for the latter.

Wasserstein distance of Gaussian measures
Given any $X \in S(d, \mathbb{R})^+$, we denote by $X^{1/2}$ the unique symmetric positive definite matrix such that $X^{1/2} X^{1/2} = X$. Throughout the paper, we denote by $I$ the identity matrix of order $d$. The Wasserstein distance between two Gaussian measures is well known [19], see also e.g. [34]:
$$W_2^2\big(\mathcal{N}(u, U), \mathcal{N}(v, V)\big) = |u - v|^2 + \operatorname{tr}(U) + \operatorname{tr}(V) - 2 \operatorname{tr}\Big(\big(U^{1/2} V U^{1/2}\big)^{1/2}\Big). \qquad (6)$$
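Formula (6) is straightforward to evaluate numerically. The following Python/NumPy sketch (the helper names `spd_sqrt` and `w2_gaussian` are ours, for illustration only) computes the Wasserstein distance between two Gaussians via an eigendecomposition-based matrix square root.

```python
import numpy as np

def spd_sqrt(M):
    """Symmetric square root of a symmetric positive semidefinite matrix."""
    w, Q = np.linalg.eigh(M)
    return (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.T

def w2_gaussian(u, U, v, V):
    """W_2 distance between N(u, U) and N(v, V), following formula (6)."""
    Us = spd_sqrt(U)
    cross = spd_sqrt(Us @ V @ Us)  # (U^{1/2} V U^{1/2})^{1/2}
    d2 = np.sum((u - v) ** 2) + np.trace(U) + np.trace(V) - 2.0 * np.trace(cross)
    return np.sqrt(max(d2, 0.0))  # clip tiny negative round-off

# For commuting (here diagonal) covariances the formula reduces to
# |u - v|^2 + sum_j (sqrt(U_jj) - sqrt(V_jj))^2 under the square root.
u, v = np.zeros(2), np.array([1.0, 0.0])
U, V = np.diag([1.0, 4.0]), np.diag([4.0, 1.0])
print(w2_gaussian(u, U, v, V))  # sqrt(1 + 5 + 5 - 2*4) = sqrt(3)
```

Using `eigh` (rather than a generic matrix square root) exploits the symmetry of the covariance matrices and keeps the result symmetric up to round-off.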

The entropy of Gaussian measures
The (negative) Boltzmann entropy of a probability measure $\mu = \mu(x)\,dx$ on $\mathbb{R}^d$ is defined by
$$F(\mu) := \int_{\mathbb{R}^d} \mu(x) \ln \mu(x)\, dx. \qquad (8)$$
Using the Gaussian integral, the (negative) Boltzmann entropy of a Gaussian measure can be computed explicitly [12, Theorem 9.4.1]:
$$F\big(\mathcal{N}(v, V)\big) = -\frac{1}{2} \ln\big((2\pi e)^d \det V\big). \qquad (9)$$
We now consider the second class of probability measures: $\varphi$-exponential measures.

ϕ-exponential measures and Wassertein distance
We recall that, for a given increasing, positive and continuous function $\varphi$ on $(0, \infty)$, the $\varphi$-logarithmic function and the $\varphi$-exponential function are respectively defined in (2) and (3). Two important classes of $\varphi$-exponential functions are:
(i) $\varphi(s) = s$: the $\varphi$-logarithmic function and the $\varphi$-exponential function become the usual logarithmic and exponential functions: $\ln_\varphi(t) = \ln(t)$, $\exp_\varphi(t) = \exp(t)$.
(ii) $\varphi(s) = s^q$ for some $q > 0$: the $\varphi$-logarithmic function and the $\varphi$-exponential function become the $q$-logarithmic and $q$-exponential functions respectively,
$$\ln_q(t) = \frac{t^{1-q} - 1}{1 - q}, \qquad \exp_q(t) = \big[1 + (1 - q)t\big]_+^{\frac{1}{1-q}} \qquad (q \neq 1),$$
where $[x]_+ = \max\{0, x\}$ and, by convention, $0^a := \infty$ for $a < 0$. The $q$-logarithmic function satisfies the following property:
$$\ln_q(st) = \ln_q(s) + s^{1-q} \ln_q(t). \qquad (10)$$
Definition 2.1. For any $a \in \mathbb{R}$, we define $O(a)$ to be the set of all increasing, positive, continuous functions $\varphi$ on $(0, \infty)$ such that $\max\{\delta_\varphi, \bar\delta_\varphi\} < a$, where $\delta_\varphi$ and $\bar\delta_\varphi$ are growth exponents of $\varphi$ whose explicit definitions are given in [35].
It is proved in [35, Proposition 3.2] that for any $\varphi \in O(2/(d+2))$ there exist constants $\lambda_\varphi$ and $c_\varphi$ such that the expression in (4) in the Introduction, with $|x|_V^2 := \langle x, V^{-1} x \rangle$, is a probability density on $\mathbb{R}^d$ with mean $v$ and covariance matrix $V$; the resulting measure is called a $\varphi$-exponential distribution. Note that, in this expression, it suffices to define $\lambda_\varphi$ and $c_\varphi$ at the identity matrix $I_d$ only, not on all of $S(d, \mathbb{R})^+$. We define the space of all $\varphi$-exponential distributions by
$$\mathcal{G}_\varphi := \big\{ G_\varphi(v, V) : v \in \mathbb{R}^d,\ V \in S(d, \mathbb{R})^+ \big\},$$
each element of which is absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$ on $\mathbb{R}^d$. Two important cases:
(i) $\varphi(s) = s$: $\mathcal{G}_\varphi$ reduces to the class of Gaussian measures with mean $v$ and covariance matrix $V$.
(ii) $\varphi(s) = s^q$: $\mathcal{G}_\varphi$ becomes the class of all $q$-Gaussian measures, in which case the normalization constants are given by explicit constants $C_0(q, d)$ and $C_1(q, d)$ depending only on $q$ and $d$. The $\varphi$-exponential measures play an important role in statistical physics, information geometry and the analysis of nonlinear diffusion equations [28,27,34,35]. We refer to [27,34,14] for further details on $q$-Gaussian measures, $\varphi$-exponential measures and their properties.
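The $q$-logarithm and $q$-exponential above are easy to implement and to sanity-check against each other. In the following Python sketch the names `ln_q` and `exp_q` are ours, and the handling of the convention $0^a := \infty$ (for $a < 0$) is made explicit.

```python
import numpy as np

def ln_q(t, q):
    """q-logarithm: (t^(1-q) - 1)/(1-q) for q != 1; ordinary log at q = 1."""
    if q == 1.0:
        return np.log(t)
    return (t ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(t, q):
    """q-exponential: [1 + (1-q) t]_+ ^ (1/(1-q)) for q != 1; ordinary exp at q = 1."""
    if q == 1.0:
        return np.exp(t)
    base = max(1.0 + (1.0 - q) * t, 0.0)
    expo = 1.0 / (1.0 - q)
    if base == 0.0:
        return np.inf if expo < 0 else 0.0  # convention 0^a := infinity for a < 0
    return base ** expo

# exp_q inverts ln_q on the positive reals (within the admissible range)
for q in (0.5, 1.0, 1.25):
    for t in (0.5, 1.0, 2.0):
        assert abs(exp_q(ln_q(t, q), q) - t) < 1e-10
```

Note how the two branches of the extension (3) appear: for $q < 1$ the $q$-exponential vanishes below $l_\varphi$, while for $q > 1$ it blows up to $\infty$ at a finite threshold $L_\varphi$.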
The following result explains why q-Gaussian measures and ϕ-exponential measures are special. It will play a key role in the analysis of this paper.
Proposition 2.2. The following statements hold [34,35]:
1. For any $q \in (0, 1) \cup \big(1, \frac{d+4}{d+2}\big)$, the space of $q$-Gaussian measures is convex and isometric to the space of Gaussian measures with respect to the Wasserstein metric.
2. For any $\varphi \in O(2/(d+2))$ with $d \ge 2$, the space $\mathcal{G}_\varphi$ is convex and isometric to the space of Gaussian measures with respect to the Wasserstein metric.

In particular, we have
$$W_2\big(G_\varphi(u, U), G_\varphi(v, V)\big) = W_2\big(\mathcal{N}(u, U), \mathcal{N}(v, V)\big). \qquad (12)$$
The Tsallis entropy of a q-Gaussian measure
The Tsallis entropy of a probability measure $\mu = \mu(x)\,dx$ on $\mathbb{R}^d$ is defined by
$$F_q(\mu) := \frac{1}{q - 1} \int_{\mathbb{R}^d} \big(\mu(x)^q - \mu(x)\big)\, dx,$$
which reduces to the (negative) Boltzmann entropy (8) as $q \to 1$. The Tsallis entropy of a $q$-Gaussian measure can also be computed explicitly, using property (10) and computations similar to those in the Gaussian case.
The first result of the present paper is the following proposition.
Proposition 2.4. Suppose that each $\mu_i$, $i = 1, \ldots, n$, is a $q$-Gaussian measure with mean $0$. Then the regularized barycenter problem (1) has a unique minimizer, which is also a $q$-Gaussian measure with mean $0$. This statement also holds for $q = 1$, in which case the minimizer is a Gaussian measure with mean $0$. Similarly, when the $\mu_i$ are all $\varphi$-exponential distributions with mean $0$, the unregularized barycenter problem has a unique minimizer, which is also a $\varphi$-exponential distribution with mean $0$.
Proof. Since each of $\{\mu_i\}_{i=1}^n$ is a $q$-Gaussian measure with mean zero, there exists a unique minimizer $\mu_* \in \mathcal{P}_2(\mathbb{R}^d)$, which is absolutely continuous with respect to the $d$-dimensional Lebesgue measure [8]. Let $v$ and $V$ be the mean and covariance matrix of $\mu_*$, and let $G_q(v, V)$ be the $q$-Gaussian measure with the same mean $v$ and covariance matrix $V$. We will show that $G_q(v, V)$ attains an objective value no larger than that of $\mu_*$. First, $G_q(v, V)$ minimizes the Tsallis entropy $F_q$ among all probability measures $\mu$ that are absolutely continuous with respect to the $d$-dimensional Lebesgue measure and have mean $v$ and covariance matrix $V$ (see for instance [34]). For the transport terms, we recall the equivalent Monge and Kantorovich duality characterizations of the Wasserstein distance between two probability measures $\mu, \nu \in \mathcal{P}_2(\mathbb{R}^d)$ (see [37, Theorem 5.10]), in which the optimal transport map $T^*$ and the optimal Kantorovich potential $\phi^*$ are related in the standard way. Let $T_i$ and $\phi_i$, $i = 1, \ldots, n$, be the optimal transport map and the optimal Kantorovich potential for $W_2(\mu_i, G_q(v, V))$. According to [34, Theorem A], each $T_i$ is affine, so the Jacobian matrix $J_i$ appearing in the change of variables is constant, and the resulting transport expression depends only on the mean and covariance of the target measure, which are the same for $\mu_*$ and $G_q(v, V)$. Combining the entropy and transport estimates, $G_q(v, V)$ is also a minimizer, and by the uniqueness of minimizers we deduce that $\mu_* = G_q(v, V)$. Moreover, a translation argument shows that $v = 0$. Note that this proof also holds for $q = 1$, where $q$-Gaussian measures and the Tsallis entropy are respectively replaced by Gaussian measures and the Boltzmann entropy. Similarly, using Proposition 2.2, one shows that the minimizer of the unregularized barycenter problem is again a $\varphi$-exponential distribution if all the $\mu_i$ are $\varphi$-exponential distributions. This completes the proof of the proposition.

Regularization of barycenters for Gaussian measures
In this section we study the following regularization of barycenters in the space of Gaussian measures:
$$\min_{\mu} \sum_{i=1}^n \lambda_i W_2^2(\mu, \mu_i) + \gamma F(\mu), \qquad (17)$$
where $\mu_i \sim \mathcal{N}(0, A_i)$ $(i = 1, \ldots, n)$, $F$ is the (negative) Boltzmann entropy functional defined in (8), and $\gamma > 0$ is a regularization parameter.
According to Proposition 2.4, we only need to seek the minimizer $\mu$ among Gaussian measures with mean zero, that is, $\mu \sim \mathcal{N}(0, X)$ for some covariance matrix $X$. We consider Gaussian measures with zero mean just for simplicity; see Remark 3.3 for further discussion of this assumption. The main results of the paper extend easily to the case of non-zero means. From now on, we equip $S(d, \mathbb{R})^+$ with the Frobenius inner product $\langle X, Y \rangle := \operatorname{tr}(X^T Y)$; the Frobenius norm is $\|X\|_F := \sqrt{\operatorname{tr}(X^T X)}$.

Theorem 3.1. Assume that the $\mu_i$ are Gaussian distributions with mean zero and covariance matrices $A_i$, $\mu_i \sim \mathcal{N}(0, A_i)$, for $i = 1, \ldots, n$. The regularized barycenter problem (1) has a unique solution $\mu \sim \mathcal{N}(0, X)$, where the covariance matrix $X$ solves the nonlinear matrix equation
$$X = \frac{\gamma}{2} I + \sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2}. \qquad (18)$$
In particular, in the scalar case ($d = 1$), we obtain
$$\sqrt{x} = \frac{1}{2}\bigg(\sum_{i=1}^n \lambda_i \sqrt{a_i} + \sqrt{\Big(\sum_{i=1}^n \lambda_i \sqrt{a_i}\Big)^2 + 2\gamma}\,\bigg). \qquad (19)$$
Before proving this theorem, we show the existence of solutions to equation (18).

Lemma 3.2. Equation (18) has a solution in $S(d, \mathbb{R})^+$.

Proof. Pick $0 < \alpha_0 < \beta_0$ so that $\alpha_0 I \le A_i \le \beta_0 I$ for all $i = 1, \ldots, n$. Set $\alpha_* := \alpha_0$ and let $\beta_* \ge \beta_0$ be large enough that $\frac{\gamma}{2} + \sqrt{\beta_* \beta_0} \le \beta_*$. Then for matrices $X$ satisfying $\alpha_* I \le X \le \beta_* I$ we have
$$\sqrt{\alpha_* \alpha_0}\, I \;\le\; \big(X^{1/2} A_i X^{1/2}\big)^{1/2} \;\le\; \sqrt{\beta_* \beta_0}\, I,$$
so that, by the definition of $\alpha_*$ and $\beta_*$,
$$\alpha_* I \;\le\; \frac{\gamma}{2} I + \sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2} \;\le\; \beta_* I$$
for every $X \in [\alpha_* I, \beta_* I] := \{Z : \alpha_* I \le Z \le \beta_* I\}$. This shows that the map
$$X \mapsto \frac{\gamma}{2} I + \sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2}$$
is a continuous self-map on the Löwner order interval $[\alpha_* I, \beta_* I]$. By Brouwer's fixed point theorem, it has a fixed point.
We are now ready to prove Theorem 3.1.

Proof of Theorem 3.1. According to (6) and (9), the objective of (1) over Gaussian measures $\mathcal{N}(0, X)$ can be expressed, up to an additive constant, through the covariance matrices. Thus we can write (1) as the minimization problem
$$\min_{X \in S(d, \mathbb{R})^+} f(X), \qquad (20)$$
where
$$f(X) := \sum_{i=1}^n \lambda_i \Big[\operatorname{tr}(X) + \operatorname{tr}(A_i) - 2 \operatorname{tr}\Big(\big(X^{1/2} A_i X^{1/2}\big)^{1/2}\Big)\Big] - \frac{\gamma}{2} \ln \det X. \qquad (21)$$
It has been proved in [6] that
$$\nabla_X \operatorname{tr}\Big(\big(X^{1/2} A X^{1/2}\big)^{1/2}\Big) = \frac{1}{2}\, X^{-1} \sharp A,$$
where $A \sharp B$ denotes the geometric mean of $A$ and $B$, defined by
$$A \sharp B := A^{1/2} \big(A^{-1/2} B A^{-1/2}\big)^{1/2} A^{1/2}, \qquad (22)$$
which is symmetric in $A$ and $B$. According to [25, Proof of Theorem 8, Chapter 10], $X \mapsto -\ln \det(X)$ is strictly convex. Using Jacobi's formula for the derivative of the determinant and the chain rule, we get
$$\nabla_X \big({-\ln \det X}\big) = -X^{-1}.$$
It follows that $X \mapsto f(X)$ is strictly convex. Furthermore, we deduce that
$$\nabla f(X) = I - \sum_{i=1}^n \lambda_i\, X^{-1} \sharp A_i - \frac{\gamma}{2} X^{-1},$$
where the gradient is taken with respect to the Frobenius inner product. Hence $\nabla f(X) = 0$ if and only if
$$\sum_{i=1}^n \lambda_i\, X^{-1} \sharp A_i = I - \frac{\gamma}{2} X^{-1}.$$
Using the definition (22) of the geometric mean, i.e., $X^{-1} \sharp A_i = X^{-1/2} \big(X^{1/2} A_i X^{1/2}\big)^{1/2} X^{-1/2}$, and multiplying both sides by $X^{1/2}$ on the left and on the right, the above equation can be written as
$$\sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2} = X - \frac{\gamma}{2} I,$$
which is equation (18). By Lemma 3.2, this equation has a positive definite solution. Together with the strict convexity of $f$, this implies that (1) has a unique minimizer, namely the Gaussian measure $\mathcal{N}(0, X)$ where $X$ solves (18).
In the one-dimensional case this equation reads
$$x = \frac{\gamma}{2} + \sqrt{x} \sum_{i=1}^n \lambda_i \sqrt{a_i},$$
which, upon solving the quadratic equation in $\sqrt{x}$, results in (19). This completes the proof of the theorem.
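The self-map used in the proof of Lemma 3.2 suggests a simple fixed-point iteration for computing the covariance matrix of the regularized Gaussian barycenter, namely $X \leftarrow \frac{\gamma}{2} I + \sum_i \lambda_i (X^{1/2} A_i X^{1/2})^{1/2}$. The following Python sketch is our own illustration (the paper's algorithm is the GPM of Section 6), and its convergence is only checked empirically on simple examples.

```python
import numpy as np

def spd_sqrt(M):
    """Symmetric square root via eigendecomposition."""
    w, Q = np.linalg.eigh(M)
    return (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.T

def barycenter_fixed_point(As, lams, gamma, tol=1e-12, max_iter=500):
    """Iterate X <- (gamma/2) I + sum_i lam_i (X^{1/2} A_i X^{1/2})^{1/2},
    starting from the identity, until successive iterates are close."""
    d = As[0].shape[0]
    X = np.eye(d)
    for _ in range(max_iter):
        Xs = spd_sqrt(X)
        X_new = (gamma / 2.0) * np.eye(d)
        for lam, A in zip(lams, As):
            X_new = X_new + lam * spd_sqrt(Xs @ A @ Xs)
        if np.linalg.norm(X_new - X) < tol:
            return X_new
        X = X_new
    return X

# unregularized (gamma = 0) barycenter of two commuting covariances:
# each eigenvalue solves sqrt(x) = 0.5*(sqrt(a1) + sqrt(a2))
X = barycenter_fixed_point([np.diag([1.0, 4.0]), np.diag([4.0, 1.0])], [0.5, 0.5], 0.0)
print(X)  # approximately 2.25 * I
```

In the scalar case the iterate converges to the value predicted by (19), which gives an easy way to test any implementation.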
Remark 3.3. Suppose more generally that $\mu_i \sim \mathcal{N}(m_i, A_i)$ with possibly non-zero means $m_i$. Using the decomposition
$$W_2^2\big(\mathcal{N}(m_1, U), \mathcal{N}(m_2, V)\big) = |m_1 - m_2|^2 + W_2^2\big(\mathcal{N}(0, U), \mathcal{N}(0, V)\big)$$
and the formula of the entropy functional (9) (noting that the entropy of a normal distribution is independent of its mean), we deduce that the minimizer is $\mu \sim \mathcal{N}(m, X)$, where the mean $m$ is given by
$$m = \sum_{i=1}^n \lambda_i m_i$$
and the covariance matrix $X$ satisfies the nonlinear matrix equation (18).
The above statement about the mean is also true for the case of q-Gaussian measures and ϕ-exponential measures in the subsequent sections.

Regularization of barycenters for q-Gaussian measures
In this section we study the following regularization of barycenters in the space of $q$-Gaussian measures:
$$\min_{\mu} \sum_{i=1}^n \lambda_i W_2^2(\mu, \mu_i) + \gamma F_q(\mu), \qquad (24)$$
where $\mu_i \sim G_q(0, A_i)$ $(i = 1, \ldots, n)$ and $F_q$ is the Tsallis entropy of a probability measure $\mu = \mu(x)\,dx$ on $\mathbb{R}^d$, defined by
$$F_q(\mu) := \frac{1}{q - 1} \int_{\mathbb{R}^d} \big(\mu(x)^q - \mu(x)\big)\, dx.$$
According to Proposition 2.4, we only need to seek the minimizer $\mu$ among $q$-Gaussian measures with mean zero, that is, $\mu \sim G_q(0, X)$ for some covariance matrix $X$.
Theorem 4.1. Assume that $\mu_i \sim G_q(0, A_i)$ with $\alpha I \le A_i \le \beta I$ for $i = 1, \ldots, n$. The regularized barycenter problem (24) has a unique solution $\mu \sim G_q(0, X)$: for all $\gamma \ge 0$ if either $0 < q \le 1$ or $1 < q \le 1 + \frac{2\alpha^2}{d\beta^2}$, and for $\gamma$ sufficiently small if $1 + \frac{2\alpha^2}{d\beta^2} < q < \frac{d+4}{d+2}$. The covariance matrix $X$ solves the nonlinear matrix equation
$$\sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2} = X - \frac{\gamma}{2}\, m(q, d)\, (\det X)^{\frac{1-q}{2}}\, I, \qquad (26)$$
where $m(q, d)$ is an explicit positive constant depending only on $q$ and $d$, with $m(1, d) = 1$, so that (26) reduces to (18) when $q = 1$.

The following proposition shows that equation (26) has a positive definite solution.

Proposition 4.2. Equation (26) has a solution in $S(d, \mathbb{R})^+$.

Proof. As in the proof of Lemma 3.2, we apply Brouwer's fixed point theorem: we show that the map
$$\psi(X) := \frac{\gamma}{2}\, m(q, d)\, (\det X)^{\frac{1-q}{2}}\, I + \sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2}$$
has a fixed point which is a positive definite matrix. Due to the determinant factor in the first term, the proof is significantly more involved than that of Lemma 3.2. Suppose that $\alpha_0 I \le A_i \le \beta_0 I$ for all $i = 1, \ldots, n$. Then, as in the proof of Lemma 3.2, for $\alpha_* I \le X \le \beta_* I$ (with $\alpha_*, \beta_*$ chosen later) we have
$$\sqrt{\alpha_* \alpha_0}\, I \;\le\; \big(X^{1/2} A_i X^{1/2}\big)^{1/2} \;\le\; \sqrt{\beta_* \beta_0}\, I.$$
Multiplying this inequality by $\lambda_i$ and summing, noting that $\sum_i \lambda_i = 1$, we obtain
$$\sqrt{\alpha_* \alpha_0}\, I \;\le\; \sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2} \;\le\; \sqrt{\beta_* \beta_0}\, I. \qquad (27)$$
To continue we consider two cases.
It follows from (27) that the image $\psi(X)$ stays in an order interval whose endpoints are controlled by $\alpha_*$, $\beta_*$ and the determinant factor. We next show that the resulting system of inequalities has positive solutions $0 < \alpha_* < \beta_* < \infty$: the endpoints must satisfy the self-mapping conditions corresponding to (27), which reduce to a two-dimensional fixed-point problem for a pair $(a_*, b_*)$. One checks that the associated map $f$ is a continuous self-map of the rectangle $[\alpha_0, a_*] \times [\beta_0, b_*]$; by Brouwer's fixed point theorem, $f$ has a fixed point there, which means that the system has a positive solution $(\alpha_*, \beta_*)$. Using this solution, $\psi$ is a continuous self-map of $[\alpha_* I, \beta_* I]$. Hence, by Brouwer's fixed point theorem again, $\psi$ has a fixed point in $[\alpha_* I, \beta_* I]$, as desired. This finishes the proof of the proposition.
Next we show that the functional that we wish to minimize in (24) is strictly convex under rather general conditions. According to Proposition 2.2 and Lemma 2.3, the Wasserstein terms in (24) can be expressed through the corresponding Gaussian measures via (12). Therefore the minimization problem (24) can be written as
$$\min_{X \in S(d, \mathbb{R})^+} g(X), \qquad (32)$$
where $g$ consists of the transport term appearing in (21) (with $\gamma = 0$) together with the Tsallis entropy of $G_q(0, X)$, cf. (33). Using the definition of the $q$-logarithmic function to evaluate $\ln_q C_0(q, d)$, together with the explicit formula for $C_1(q, d)$, the entropy term can be written in closed form in terms of $\det X$.
Substituting these expressions into (33), we obtain the explicit form (34) of $g$. The following proposition studies the convexity of $g$.
Proposition 4.3. Under the hypotheses of Theorem 4.1, the function $g$ is strictly convex.

Proof. As in the proof of Theorem 3.1, using again Jacobi's formula for the derivative of the determinant and the chain rule, together with the definition of $m(q, d)$, the gradient of the entropy term is proportional to $(\det X)^{\frac{1-q}{2}} X^{-1}$. In the computations below, the linear operator $P(X)$ is defined by $P(X) Y := X Y X$; this operator is called the quadratic representation in the literature. By the Leibniz rule, one obtains an explicit expression for the Hessian of $g$. Furthermore, according to [6], for $\alpha I \le A_i, X \le \beta I$ the Hessian of the transport term admits a positive lower bound. From these estimates, we deduce the following cases: (i) if $0 < q \le 1$, the Hessian of $g$ is positive definite for all $\gamma \ge 0$; (ii) if $1 < q \le 1 + \frac{2\alpha^2}{d\beta^2}$ (a condition fulfilled, in particular, when $\beta^2 \le \frac{d+2}{d}\, \alpha^2$), the Hessian of $g$ is again positive definite for all $\gamma \ge 0$; (iii) otherwise, the Hessian of $g$ is positive definite provided $\gamma$ is sufficiently small. Hence $g$ is strictly convex in each of the cases of Theorem 4.1.
We are now ready to prove Theorem 4.1.
Proof of Theorem 4.1. Suppose that the hypotheses of Theorem 4.1 are satisfied, that is, either (i) $0 < q \le 1$, or (ii) $1 < q \le 1 + \frac{2\alpha^2}{d\beta^2}$, or (iii) $1 + \frac{2\alpha^2}{d\beta^2} < q < \frac{d+4}{d+2}$; in the last case suppose in addition that $\gamma$ is sufficiently small, while in the other cases $\gamma \ge 0$ is arbitrary. As shown in the paragraph before Proposition 4.3, the minimization problem (24) can be written as $\min_X g(X)$, where $g(X)$ is given in (34). By Proposition 4.3, $X \mapsto g(X)$ is strictly convex. We now compute the derivative of $g(X)$. According to the proof of Theorem 3.1, the gradient of the transport term is $I - \sum_{i=1}^n \lambda_i\, X^{-1} \sharp A_i$, and by (35) the gradient of the entropy term is $-\frac{\gamma}{2}\, m(q, d)\, (\det X)^{\frac{1-q}{2}}\, X^{-1}$. Combining these, $\nabla g(X) = 0$ if and only if
$$\sum_{i=1}^n \lambda_i\, X^{-1} \sharp A_i = I - \frac{\gamma}{2}\, m(q, d)\, (\det X)^{\frac{1-q}{2}}\, X^{-1},$$
which, using the definition (22) of the geometric mean, is equivalent to
$$\sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2} = X - \frac{\gamma}{2}\, m(q, d)\, (\det X)^{\frac{1-q}{2}}\, I.$$
This is precisely equation (26). By Proposition 4.2, it has a positive definite solution. This, together with the strict convexity of $g$, guarantees the existence and uniqueness of a minimizer of $g$, completing the proof of the theorem.

Barycenters for ϕ-exponential measures
In this section we consider the following barycenter problem in the space of $\varphi$-exponential measures:
$$\min_{\mu} \sum_{i=1}^n \lambda_i W_2^2(\mu, \mu_i), \qquad \mu_i \sim G_\varphi(0, A_i). \qquad (40)$$
In contrast to the Gaussian and $q$-Gaussian cases, we are not aware of an explicit formula for the entropy of a general $\varphi$-exponential measure; therefore, in the above formulation we do not include the regularization term. The main result of this section is the following theorem, which states that the equation determining the barycenter of $\varphi$-exponential measures is the same as that for Gaussian measures.
Theorem 5.1. Assume that $\mu_i \sim G_\varphi(0, A_i)$ for $i = 1, \ldots, n$. The unregularized barycenter problem (40) has a unique solution $\mu \sim G_\varphi(0, X)$, where the covariance matrix $X$ solves the nonlinear matrix equation
$$X = \sum_{i=1}^n \lambda_i \big(X^{1/2} A_i X^{1/2}\big)^{1/2}.$$
In particular, for $n = 2$, $X$ is given explicitly by
$$X = \lambda_1^2 A_1 + \lambda_2^2 A_2 + \lambda_1 \lambda_2 \big[(A_1 A_2)^{1/2} + (A_2 A_1)^{1/2}\big].$$
Proof. This theorem is a direct consequence of Proposition 2.2 and [1, Theorem 6.1] or [6, Theorem 8]. Indeed, as in the proof of Theorem 4.1, by using (12) we can write (40) as the corresponding barycenter problem for the Gaussian measures $\mathcal{N}(0, A_i)$, whose solution is characterized by the equation above.

Gradient projection method
In this section, we describe a gradient projection method for the computation of the minimizer of the regularization problems (20) and (32), and analyze its convergence properties.
First, we formally describe the algorithmic procedure for the gradient projection method (GPM) below.
The stepsize is selected by the Armijo rule along the feasible direction [5], as described below.
Note that $\psi = f$ for the regularization problem (20) and $\psi = g$ for the regularization problem (32). The projection of a matrix $S \in S^d$, where $S^d$ is the set of $d \times d$ symmetric matrices, onto the set $[\bar\alpha I, \bar\beta I]$ is the solution of the minimization problem
$$\min_{Z \in [\bar\alpha I, \bar\beta I]} \|Z - S\|_F^2,$$
which is obtained by clipping the eigenvalues of $S$ to the interval $[\bar\alpha, \bar\beta]$. In the following subsections, we show the Lipschitz continuity of the gradients of the regularization problems. In this case, we can use a constant stepsize for the gradient projection method, namely $t^{(k)} = 1/L$, where $L$ is a Lipschitz constant of the gradient; the standard convergence theory for the gradient projection method then applies.
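To make the procedure concrete, here is a minimal Python sketch of a projected gradient iteration for the Gaussian case. It assumes the gradient $\nabla f(X) = I - \sum_i \lambda_i\, X^{-1} \sharp A_i - \frac{\gamma}{2} X^{-1}$ and eigenvalue clipping for the projection onto $[\bar\alpha I, \bar\beta I]$; a fixed stepsize is used instead of the paper's Armijo rule, and all function names are our own.

```python
import numpy as np

def spd_sqrt(M):
    """Symmetric square root via eigendecomposition."""
    w, Q = np.linalg.eigh(M)
    return (Q * np.sqrt(np.clip(w, 0.0, None))) @ Q.T

def proj_box(S, lo, hi):
    """Frobenius projection of a symmetric matrix onto the order interval
    [lo*I, hi*I]: clip the eigenvalues to [lo, hi]."""
    w, Q = np.linalg.eigh((S + S.T) / 2.0)
    return (Q * np.clip(w, lo, hi)) @ Q.T

def grad_f(X, As, lams, gamma):
    """Gradient I - sum_i lam_i X^{-1} # A_i - (gamma/2) X^{-1},
    with X^{-1} # A = X^{-1/2} (X^{1/2} A X^{1/2})^{1/2} X^{-1/2}."""
    d = X.shape[0]
    Xs = spd_sqrt(X)
    Xs_inv = np.linalg.inv(Xs)
    G = np.eye(d) - (gamma / 2.0) * np.linalg.inv(X)
    for lam, A in zip(lams, As):
        G -= lam * (Xs_inv @ spd_sqrt(Xs @ A @ Xs) @ Xs_inv)
    return G

def gpm(As, lams, gamma, lo=1e-5, hi=1e5, step=0.5, tol=1e-10, max_iter=5000):
    """Projected gradient iteration X <- proj_[lo*I, hi*I](X - step * grad f(X))."""
    d = As[0].shape[0]
    X = np.eye(d)
    for _ in range(max_iter):
        X_new = proj_box(X - step * grad_f(X, As, lams, gamma), lo, hi)
        if np.linalg.norm(X_new - X) < tol:
            return X_new
        X = X_new
    return X
```

On simple test problems (scalar data, or commuting covariances) the iterates converge to the value predicted by the scalar solution (19); the fixed stepsize here is chosen by hand and should in practice be set from a Lipschitz constant of the gradient, as discussed above.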

Regularization of barycenters for Gaussian measures
We recall that the unique minimizer of the minimization problem (17) in the space of Gaussian measures satisfies the nonlinear matrix equation $\nabla f(X) = 0$, where $f$ is given in (21). We establish the following theorem on the Lipschitz continuity of the gradient.

Theorem 6.2. Suppose that $A_i \in [\alpha I, \beta I]$ for all $i = 1, \ldots, n$. Then $\nabla f$ is Lipschitz continuous on the order interval $[\alpha I, \beta I]$, with an explicit Lipschitz constant depending on $\alpha$, $\beta$ and $\gamma$.
Proof. According to [24, Proof of Theorem 3.1], the gradient of the transport term in $f$ is Lipschitz continuous on $[\alpha I, \beta I]$. Moreover, the gradient of the entropy term, $-\frac{\gamma}{2} X^{-1}$, is Lipschitz on the same set with constant $\frac{\gamma}{2\alpha^2}$, since $\|X^{-1} - Y^{-1}\|_F \le \alpha^{-2} \|X - Y\|_F$ for $X, Y \ge \alpha I$. Combining the two estimates gives the claimed bound.

Regularization of barycenters for q-Gaussian measures
We recall that the unique minimizer of the minimization problem (24) in the space of $q$-Gaussian measures solves the nonlinear matrix equation $\nabla g(X) = 0$. The following main theorem of this section proves the Lipschitz continuity of $\nabla g$.

Theorem 6.3. Suppose that $A_i \in [\alpha I, \beta I]$ for all $i = 1, \ldots, n$. Then $\nabla g$ is Lipschitz continuous on the order interval $[\alpha I, \beta I]$, with an explicit Lipschitz constant.

Proof. Let $\alpha I \le X, Y \le \beta I$. According to the proof of Theorem 6.2, the gradient of the transport term is Lipschitz on $[\alpha I, \beta I]$. It remains to study the Lipschitz continuity of $\tilde h(X) = (\det X)^{\frac{1-q}{2}} X^{-1}$; this follows from the mean value theorem applied to $X \mapsto (\det X)^{\frac{1-q}{2}}$, together with the Lipschitz estimate for $X \mapsto X^{-1}$ used in the proof of Theorem 6.2.
Moreover, we test the gradient projection method applied to the regularization of barycenters for $q$-Gaussian measures on $n$ randomly generated matrices of size $d \times d$. The random matrices used in our tests are generated by MATLAB code so that their eigenvalues are randomly distributed in the interval $[\mathrm{eiglb}, \mathrm{eiglb} + \mathrm{eigub}]$. In our experiments, we set $n = 100$, $d = 10$ if $q < 1$ and $n = 50$, $d = 5$ if $q > 1$, and we set $\mathrm{eiglb} = 0.1$ and $\mathrm{eigub} = 9.9$. For GPM we set $\xi = 0.5$, $\sigma = 0.1$, $\bar\alpha = 10^{-5}$, $\bar\beta = 10^5$, and $\lambda_i = 1/n$ for $i = 1, \ldots, n$. All runs are performed on a laptop with an Intel Core i7-10510U CPU (2.30 GHz) and 16 GB of memory, running 64-bit Windows 10 and MATLAB (Version 9.8). Throughout the experiments, we choose the initial iterate $X_0 = I$ and stop the algorithm when $\|D^{(k)}\|_F \le 10^{-8}$. We report in Table 1 our numerical results, showing the Frobenius norm of the difference between the final estimated solution of the model (32) with $q = 0.5$ and that with various given $q$ less than 1. In Table 2, the difference between the final estimated solution of the model (32) with $q = 1.25$ and that with various given $q$ greater than 1 is reported. From Tables 1-2, we see that the difference increases as $q$ approaches 1 when the regularization parameter $\gamma$ is fixed, and that the bigger the regularization parameter $\gamma$ is, the bigger the difference is when $q$ is fixed.
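The eigenvalue-controlled test matrices can be generated as in the following Python/NumPy sketch (our illustration of the construction described above, not the authors' MATLAB code; `random_spd` is a hypothetical helper): draw a random orthogonal matrix and prescribe uniformly distributed eigenvalues.

```python
import numpy as np

def random_spd(d, eiglb, eigub, rng):
    """Random symmetric positive definite d x d matrix whose eigenvalues
    are uniformly distributed in [eiglb, eiglb + eigub]."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix
    w = eiglb + eigub * rng.random(d)                  # eigenvalues in [eiglb, eiglb + eigub]
    return (Q * w) @ Q.T

rng = np.random.default_rng(0)
A = random_spd(10, 0.1, 9.9, rng)  # eigenvalues in [0.1, 10.0], as in the experiments
```

Constructing the matrix from a prescribed spectrum guarantees the Löwner bounds $\mathrm{eiglb}\, I \le A \le (\mathrm{eiglb} + \mathrm{eigub})\, I$ assumed in the Lipschitz estimates of this section.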
In the next experiment, we investigate stability properties of the model (32). We perturb the given data $A_i$ to obtain perturbed data $B_i$, with the perturbation size controlled by a parameter $\epsilon$. From Tables 3-4, we observe that $\|X_B - X_A\|_F \le 4\epsilon$ in all cases, where $X_A$ is the final estimated solution of the model (32) with data $A_i$ and $X_B$ is that with the perturbed data $B_i$. The value $\|X_B - X_A\|_F / \epsilon$ tends to decrease as the regularization parameter $\gamma$ and $q$ get larger.
To visualize the effect of $\gamma$, we create a toy example; in this experiment, we set $q = 0.5$ and $\epsilon = 10^{-5}$. Here $X_{A,\gamma}$ denotes the final estimated solution with the given $\gamma$ and data $A_i$, and $X_{B,\gamma}$ the final estimated solution with the given $\gamma$ and the perturbed data $B_i$. The bigger the penalty parameter $\gamma$ is, the smaller the value of each diagonal entry of the solution is.

Table 3: Test results of the value $\|X_B - X_A\|_F / \epsilon$, where $X_A$ is the final estimated solution of the model (32) with data $A_i$ and $X_B$ is that with the perturbed data $B_i$, on 5 random data sets when $q < 1$.

Table 4: Test results of the value $\|X_B - X_A\|_F / \epsilon$, where $X_A$ is the final estimated solution of the model (32) with data $A_i$ and $X_B$ is that with the perturbed data $B_i$, on 5 random data sets when $q > 1$.