An extrapolated fixed-point optimization method for strongly convex smooth optimization

Abstract: In this work, we focus on minimizing a strongly convex smooth function over common fixed-point constraints. We propose an extrapolated fixed-point optimization method, which is a modified version of the extrapolated sequential constraint method with conjugate gradient direction. We prove the convergence of the generated sequence to the unique solution of the considered problem without any boundedness assumption. We also present numerical experiments that underline the effectiveness and performance of the proposed method.


Introduction
Let H be a real Hilbert space equipped with an inner product ⟨•, •⟩ and its induced norm ∥•∥. In this work, we deal with the minimization of a strongly convex smooth function over common fixed-point constraints, which is of the following form:

minimize f(x) subject to x ∈ ∩_{i=1}^m Fix T_i,    (1.1)

where Fix T_i := {x ∈ H : T_i(x) = x}, and the objective function and the constrained operators satisfy the following assumptions:
(A1) The function f : H → R is η-strongly convex and L-smooth.
(A2) For every i = 1, …, m, the operator T_i : H → H is a cutter with ∩_{i=1}^m Fix T_i ≠ ∅.
Thanks to the closedness and convexity of the common fixed-point set ∩_{i=1}^m Fix T_i and the strong convexity of f, we can ensure that the solution set of problem (1.1) is a singleton. We denote the solution point of problem (1.1) by x* throughout this work.

Related works
The minimization problem and its generalizations (such as the variational inequality problem) over common fixed-point constraints have been investigated by many authors; we briefly note some related works as follows. Bauschke [1] considered problem (1.1) in the case when the objective function is given by f(x) := (1/2)∥x − a∥², where a ∈ H is a given point and the operators T_i, i = 1, 2, …, m, are nonexpansive. He proposed the following method: given x_1 ∈ H and a nonnegative sequence {β_k}_{k=1}^∞, for all k ∈ N, compute

x_{k+1} := β_k a + (1 − β_k) T_{[k]}(x_k),    (1.2)

where [•] is the modulo m function taking values in {1, …, m}. Bauschke proved that the sequence {x_k}_{k=1}^∞ generated by the method (1.2) converges strongly to the unique solution of the considered problem, provided that the constrained set satisfies

Fix(T_m T_{m−1} ⋯ T_1) = Fix(T_1 T_m ⋯ T_2) = ⋯ = ∩_{i=1}^m Fix T_i,    (1.3)

and the sequence {β_k}_{k=1}^∞ ⊂ [0, 1) satisfies lim_{k→∞} β_k = 0, Σ_{k=1}^∞ β_k = ∞, and Σ_{k=1}^∞ |β_{k+m} − β_k| < ∞. After that, Yamada [2] considered a more general setting of problem (1.1), namely the variational inequality problem governed by a strongly monotone and Lipschitz continuous operator F : H → H. Thanks to the optimality condition, it is well known that such a problem recovers problem (1.1) by setting F := ∇f. He proposed the so-called hybrid steepest descent method, which is of the following form: given x_1 ∈ H, a parameter μ > 0, and a nonnegative sequence {β_k}_{k=1}^∞, for all k ∈ N, compute

x_{k+1} := T_{[k]}(x_k) − μ β_k F(T_{[k]}(x_k)).    (1.4)

Under the same assumption (1.3) on the constrained set and similar conditions on the control sequence {β_k}_{k=1}^∞, the strong convergence of the method (1.4) to the unique solution of the considered problem was obtained.
Xu and Kim [3] also investigated the problem in the same setting as Yamada [2] and obtained the convergence of the method (1.4) under a variant condition on the control sequence {β_k}_{k=1}^∞ ⊂ (0, 1], namely lim_{k→∞} β_k/β_{k+m} = 1, in place of the condition Σ_{k=1}^∞ |β_{k+m} − β_k| < ∞ used in [2]. The other conditions on the control sequence {β_k}_{k=1}^∞ and the assumption (1.3) on the constrained set remain the same as above. After the works of Yamada [2] and Xu and Kim [3] were presented, many authors considered variants and generalizations; see, for instance, [4][5][6][7][8][9].
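To make the hybrid steepest descent iteration (1.4) concrete in the simplest case m = 1, the sketch below is our own illustration: the quadratic objective, the ball constraint, the choice μ = 1, and β_k = 1/k are assumptions, not taken from [2]. It minimizes f(x) = (1/2)∥x − a∥² over the closed unit ball, whose solution is the boundary point closest to a.

```python
import math

# Minimize f(x) = 0.5 * ||x - a||^2 over the closed unit ball in R^2.
# F = grad f is 1-strongly monotone and 1-Lipschitz; T is the
# (nonexpansive) projection onto the ball, so Fix T is the constraint set.
a = (2.0, 0.0)                      # unconstrained minimizer (infeasible)

def T(x):
    """Projection onto the closed unit ball."""
    nx = math.hypot(x[0], x[1])
    if nx <= 1.0:
        return x
    return (x[0] / nx, x[1] / nx)

def F(x):
    """Gradient of f at x."""
    return (x[0] - a[0], x[1] - a[1])

mu = 1.0                            # mu in (0, 2*eta/L^2) = (0, 2)
x = (0.0, 0.0)
for k in range(1, 5001):
    beta = 1.0 / k                  # beta_k -> 0, sum beta_k = infinity
    tx = T(x)
    g = F(tx)
    # Hybrid steepest descent step: x_{k+1} = T(x_k) - mu*beta_k*F(T(x_k)).
    x = (tx[0] - mu * beta * g[0], tx[1] - mu * beta * g[1])

err = math.hypot(x[0] - 1.0, x[1])  # distance to the solution (1, 0)
assert err < 0.01
print("converged near (1, 0)")
```

The steering term −μβ_k F(T(x_k)) vanishes as β_k → 0, so the limit point lies in Fix T while the accumulated steering selects the minimizer of f over it.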
Prangprakhon et al. [10] also considered the solving of the variational inequality problem over the intersection of fixed-point sets of firmly nonexpansive operators. They presented an iterative algorithm that can be viewed as a generalization of the hybrid steepest descent method in which appropriate additional information is allowed when computing operator values. The main feature of their method is that this added information can be viewed as allowing possible numerical errors in the computations of the operators' values. They subsequently proved the strong convergence of the proposed method without the assumption (1.3) on the constrained set.
Prangprakhon and Nimana [11] subsequently proposed the so-called extrapolated sequential constraint method with conjugate gradient direction (ESCoM-CGD) for solving the variational inequality problem over the intersection of the fixed-point sets of cutters. ESCoM-CGD is motivated by the ideas of the hybrid conjugate gradient method [12], which is an accelerated version of the hybrid steepest descent method, together with the extrapolated cyclic cutter methods [13,14]. This method has two interesting features: the extrapolation stepsize function σ(y_k) (see Step 2 of Algorithm 1) and the search direction d_{k+1}, which is a combination of the current direction −∇f(x_{k+1}) and the previous search direction d_k. By assuming the boundedness of the generated sequence {x_k}_{k=1}^∞ together with assumptions on the control sequences, they proved the strong convergence of the proposed method.
Very recently, Petrot et al. [15] proposed the so-called dynamic distributed three-term conjugate gradient method for solving the strongly monotone variational inequality problem over the intersection of fixed-point sets of firmly nonexpansive operators. Unlike [11], this method has a simultaneous structure, so it allows the independent computation of each operator along with a dynamic weight that is updated at every iteration. Moreover, the strong convergence of the method was obtained without assuming that the sequence {x_k}_{k=1}^∞ is bounded.

Our contribution
In this work, we present a modified version of ESCoM-CGD that uses the search direction considered in [15]. We prove the convergence of the proposed method without assuming that the generated sequence is bounded. It is important to underline that we simplify and shorten the proof compared to that of ESCoM-CGD given in [11, Theorem 3]. We also show that the proposed algorithm and its convergence result are applicable to binary classification via support vector machine learning.
The remainder of this work is organized as follows. In the rest of this section, we collect some mathematical preliminaries consisting of useful definitions and facts needed in the proofs. In Section 2, we present the modified version of ESCoM-CGD for finding the solution to problem (1.1); the main convergence theorem and some important remarks are also given there. In Section 3, we provide some technical convergence properties of the generated sequences and, subsequently, prove the main convergence theorem. In Section 4, we illustrate the usefulness of the proposed method and its convergence result in solving the minimum-norm problem for a system of homogeneous linear inequalities, as well as binary classification via support vector machine learning. Lastly, we give a concluding remark.

Mathematical preliminaries
In this subsection, we recall some definitions, properties, and key tools used in proving our convergence results. The reader can consult the books [16,17] for more details.
We denote by Id the identity operator on a Hilbert space H. The strong and weak convergence of a sequence {x_k}_{k=1}^∞ to a point x ∈ H is denoted by x_k → x and x_k ⇀ x, respectively.
We now recall the convexity and smoothness of a function. Let f : H → R be a real-valued function and let α > 0 be given. We say that f is convex if, for all x, y ∈ H and all λ ∈ [0, 1],

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y).

In particular, we say that f is α-strongly convex if, for all x, y ∈ H and all λ ∈ [0, 1],

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) − (α/2)λ(1 − λ)∥x − y∥².

The constant α is called the strong convexity parameter. Let x ∈ H be given. We say that f is Fréchet differentiable or, in short, differentiable at x if there exists a vector g ∈ H such that

lim_{∥h∥→0} (f(x + h) − f(x) − ⟨g, h⟩)/∥h∥ = 0.

This vector g is unique; we call it the gradient of f at x and denote it by ∇f(x).
It is well known that the convexity of f can be characterized in terms of differentiability as follows.
Fact 1.1. [16, Theorem 5.24] Let f : H → R be a real-valued differentiable function. Then f is α-strongly convex if, and only if, for all x, y ∈ H, we have

f(y) ≥ f(x) + ⟨∇f(x), y − x⟩ + (α/2)∥y − x∥².

Let L ≥ 0 be given. We say that f is L-smooth if it is differentiable and, for all x, y ∈ H,

∥∇f(x) − ∇f(y)∥ ≤ L∥x − y∥.

The constant L is called the smoothness parameter. It is clear that an L-smooth function is continuously differentiable. The cornerstone of our proofs is Fact 1.2, which gives a necessary and sufficient condition for the optimality of a convex function over a nonempty closed and convex set.
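As a quick sanity check of these notions, consider the separable quadratic f(x) = (1/2)(2x_1² + x_2²), which is 1-strongly convex and 2-smooth; the quadratic and the sampled points below are illustrative assumptions, not from the paper.

```python
import random

# f(x) = 0.5 * (2*x1^2 + x2^2): alpha-strongly convex and L-smooth
# with alpha = 1 (smallest curvature) and L = 2 (largest curvature).
ALPHA, L = 1.0, 2.0

def f(x):
    return 0.5 * (2.0 * x[0] ** 2 + x[1] ** 2)

def grad(x):
    return (2.0 * x[0], x[1])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def norm_sq(u):
    return dot(u, u)

random.seed(0)
for _ in range(1000):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    y = (random.uniform(-5, 5), random.uniform(-5, 5))
    d = (y[0] - x[0], y[1] - x[1])
    g, gy = grad(x), grad(y)
    # Fact 1.1: f(y) >= f(x) + <grad f(x), y - x> + (alpha/2)||y - x||^2.
    assert f(y) >= f(x) + dot(g, d) + 0.5 * ALPHA * norm_sq(d) - 1e-9
    # L-smoothness: ||grad f(x) - grad f(y)||^2 <= L^2 ||x - y||^2.
    dg = (g[0] - gy[0], g[1] - gy[1])
    assert norm_sq(dg) <= L ** 2 * norm_sq(d) + 1e-9
print("strong convexity and smoothness checks passed")
```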
The following fact is a very useful tool in obtaining the convergence result.The proof can be found in [2,Lemma 3.1(b)].
Fact 1.3. Let f : H → R be η-strongly convex and L-smooth. For any μ ∈ (0, 2η/L²) and β ∈ (0, 1], if we define the operator F_β : H → H by F_β(x) := x − μβ∇f(x) for all x ∈ H, then

∥F_β(x) − F_β(y)∥ ≤ (1 − βτ)∥x − y∥ for all x, y ∈ H,

where τ := 1 − √(1 − μ(2η − μL²)) ∈ (0, 1].
Next, we recall the definition of cutters and some useful properties. Let T : H → H be an operator with Fix T := {x ∈ H : T(x) = x} ≠ ∅. We say that T is a cutter if, for all x ∈ H and z ∈ Fix T, we have

⟨z − T(x), x − T(x)⟩ ≤ 0.

The important properties of cutters are collected in Fact 1.4. Next, we recall relaxations of an operator. Let T : H → H be an operator and let λ ∈ [0, 2] be given. The relaxation of T is defined by T_λ(x) := (1 − λ)x + λT(x) for all x ∈ H; the parameter λ is called a relaxation parameter. Moreover, let σ : H → (0, ∞) be a function. The generalized relaxation of T is defined by

T_{σ,λ}(x) := x + λσ(x)(T(x) − x) for all x ∈ H.

In this case, the function σ is called a stepsize function. If σ(x) ≥ 1 for all x ∈ H, then T_{σ,λ} is called an extrapolation of T_λ. We denote T_σ := T_{σ,1}. It is worth noting that, for all x ∈ H,

T_{σ,λ}(x) − x = λσ(x)(T(x) − x),

and, for all λ ≠ 0, we also have Fix T_{σ,λ} = Fix T. For simplicity, we denote the compositions of operators by

T := T_m T_{m−1} ⋯ T_1, S_0 := Id, and S_i := T_i T_{i−1} ⋯ T_1, for i = 1, 2, …, m.

We now recall properties that will be useful for the convergence analysis.
Fact 1.5. Let T_i : H → H, i = 1, …, m, be cutters with ∩_{i=1}^m Fix T_i ≠ ∅, and let σ : H → (0, ∞) be defined by

σ(x) := (Σ_{i=1}^m ⟨T(x) − S_{i−1}(x), S_i(x) − S_{i−1}(x)⟩)/∥T(x) − x∥² if x ∉ Fix T, and σ(x) := 1 otherwise.

Then the following properties hold: (i) σ(x) ≥ 1/2 for all x ∈ H; (ii) the operator T_σ is a cutter.
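To make the extrapolation concrete, the sketch below (our illustration, with two hypothetical half-spaces in R² whose projections are cutters) sweeps the sequential projections u_0 = x, u_i = T_i(u_{i−1}), computes the stepsize σ of Fact 1.5, and checks numerically that σ(x) ≥ 1/2.

```python
import random

# Two illustrative half-spaces in R^2: H_i = {x : <a_i, x> <= b_i}.
halfspaces = [((1.0, 0.0), 1.0),   # x1 <= 1
              ((0.0, 1.0), 2.0)]   # x2 <= 2

def proj(x, a, b):
    """Metric projection onto the half-space <a, x> <= b (a cutter)."""
    viol = x[0] * a[0] + x[1] * a[1] - b
    if viol <= 0.0:
        return x
    s = viol / (a[0] ** 2 + a[1] ** 2)
    return (x[0] - s * a[0], x[1] - s * a[1])

def sweep(x):
    """Return u_0 = x, u_1 = S_1(x), ..., u_m = T(x)."""
    us = [x]
    for a, b in halfspaces:
        us.append(proj(us[-1], a, b))
    return us

def sigma(x):
    """Extrapolation stepsize of Fact 1.5 for the composition T."""
    us = sweep(x)
    Tx = us[-1]
    denom = (Tx[0] - x[0]) ** 2 + (Tx[1] - x[1]) ** 2
    if denom < 1e-15:              # x is (numerically) a fixed point of T
        return 1.0
    num = sum((Tx[0] - us[i - 1][0]) * (us[i][0] - us[i - 1][0])
              + (Tx[1] - us[i - 1][1]) * (us[i][1] - us[i - 1][1])
              for i in range(1, len(us)))
    return num / denom

def T_sigma_lambda(x, lam=1.0):
    """Generalized relaxation T_{sigma,lambda}(x) = x + lam*sigma(x)*(T(x)-x)."""
    Tx = sweep(x)[-1]
    s = sigma(x)
    return (x[0] + lam * s * (Tx[0] - x[0]), x[1] + lam * s * (Tx[1] - x[1]))

random.seed(1)
for _ in range(1000):
    x = (random.uniform(-5, 5), random.uniform(-5, 5))
    assert sigma(x) >= 0.5 - 1e-9  # Fact 1.5(i)
print("sigma >= 1/2 on all samples")
```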
Next, we recall the notion of the demi-closedness principle.
Let T : H → H be an operator having a fixed point. The operator T is said to satisfy the demi-closedness principle if, for every sequence {x_k}_{k=1}^∞ ⊂ H such that x_k ⇀ x ∈ H and ∥T(x_k) − x_k∥ → 0, we have x ∈ Fix T. We close this section by recalling the well-known metric projection, which is defined as follows. Let C be a nonempty subset of H and x ∈ H. If there is a point y ∈ C such that ∥x − y∥ ≤ ∥x − z∥ for all z ∈ C, then y is called a metric projection of x onto C and is denoted by P_C(x). If P_C(x) exists and is uniquely determined for all x ∈ H, then the operator P_C : H → H is called the metric projection onto C. In fact, some additional properties of the set C are needed to ensure the existence and uniqueness of the metric projection P_C(x) for every x ∈ H, as the following fact shows.
Fact 1.6. [17, Theorem 1.2.3] Let C be a nonempty closed convex subset of H. Then, for every x ∈ H, the metric projection P_C(x) exists and is uniquely determined.
Moreover, we finally note that the metric projection onto a nonempty closed convex set C is a cutter with Fix P_C = C; see [17, Theorem 2.2.21] for further details.
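For instance, the projection onto a Euclidean ball can be checked numerically against the cutter inequality ⟨z − T(x), x − T(x)⟩ ≤ 0; the ball and the sampled points below are an illustrative check, not part of the paper.

```python
import math
import random

def proj_ball(x, radius=1.0):
    """Metric projection of x in R^2 onto the closed ball of given radius."""
    nx = math.hypot(x[0], x[1])
    if nx <= radius:
        return x
    return (radius * x[0] / nx, radius * x[1] / nx)

random.seed(2)
for _ in range(1000):
    x = (random.uniform(-3, 3), random.uniform(-3, 3))
    # z is any fixed point of the projection, i.e., any point of the ball.
    r = random.uniform(0, 1)
    t = random.uniform(0, 2 * math.pi)
    z = (r * math.cos(t), r * math.sin(t))
    px = proj_ball(x)
    # Cutter inequality: <z - T(x), x - T(x)> <= 0.
    inner = (z[0] - px[0]) * (x[0] - px[0]) + (z[1] - px[1]) * (x[1] - px[1])
    assert inner <= 1e-9
print("cutter inequality verified for the ball projection")
```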

Algorithm and its convergence
We proceed in this section by presenting a modified version of the extrapolated sequential constraint method with conjugate gradient direction (ESCoM-CGD) for solving problem (1.1). We are now in a position to state it as the following algorithm. We call the proposed method the modified extrapolated sequential constraint method with conjugate gradient direction (MESCoM-CGD for short).
Step 2. Compute the stepsize σ(y_k).
Step 3. Compute the iterate x_{k+1} ∈ H and the search direction d_{k+1} ∈ H.
Step 4. Update k := k + 1 and return to Step 1.
Some comments on particular situations of MESCoM-CGD are in order.
Remark 2.1. (i) As mentioned above, the convergence of ESCoM-CGD (see [11, Theorem 3]) relies on the assumption that the generated sequence {x_k}_{k=1}^∞ is bounded. However, it is pointed out in [11, Remark 3] that the boundedness of the search direction {d_k}_{k=1}^∞ and of the sequence {φ_k}_{k=1}^∞, together with the definition of the search direction, yields the boundedness of the sequence {x_k}_{k=1}^∞. In this situation, if we construct a bounded search direction in place of the previous version, then the boundedness of the generated sequence {x_k}_{k=1}^∞ becomes provable as well (see Lemma 3.3 below). One can notice that the key idea of this bounding strategy is nothing other than restricting the search direction to the unit ball centered at the origin.
(ii) If the search direction {d_k}_{k=1}^∞ is guaranteed to stay within the unit ball centered at the origin, then max{1, ∥d_k∥} = 1, so MESCoM-CGD reduces to ESCoM-CGD in the case when the operator T_m = Id.
(iii) If m = 1, the relaxation parameter λ_k = 1, the stepsize σ(y_k) = 1, and max{1, ∥d_k∥} = 1 for all k ∈ N, then MESCoM-CGD is nothing other than the hybrid conjugate gradient method proposed in [12]. Now, let us investigate the assumptions needed for the convergence result.
Let us briefly discuss some particular situations in which hypotheses (H1)-(H4) hold. (iii) The role of the constant ε ∈ (0, 1) is to ensure that the term λ_k(2 − λ_k) stays away from zero for all k ∈ N, which allows this term to be canceled in the estimates. One can take, for instance, λ_k = λ_0 for some λ_0 ∈ (0, 2) as a trivial example of relaxation parameters {λ_k}_{k=1}^∞ satisfying hypothesis (H3).
(iv) The demi-closedness principle for the operators T_i, i = 1, …, m, in (H4) is a key property and is typically assumed when dealing with convergence results for common fixed-point type problems. Actually, the demi-closedness principle is satisfied by every nonexpansive operator, i.e., every T_i with ∥T_i(x) − T_i(y)∥ ≤ ∥x − y∥ for all x, y ∈ H; see [17, Lemma 3.2.5] for the proof and some historical notes. This yields that the metric projection onto a nonempty closed and convex set also satisfies the demi-closedness principle; see [17, Theorem 2.2.21]. Moreover, it is worth mentioning that the demi-closedness principle is also satisfied by the subgradient projection operator P_f of a convex continuous function f : H → R that is Lipschitz continuous relative to every bounded subset of H; see [17, Theorem 4.2.7] for more details.
The main result of this work is the following theorem.
Theorem 2.4. Let the sequence {x_k}_{k=1}^∞ be generated by MESCoM-CGD, and assume that assumptions (A1) and (A2) and hypotheses (H1)-(H4) hold. Then the sequence {x_k}_{k=1}^∞ converges strongly to the unique solution x* of problem (1.1).

Proof of Theorem 2.4
In this section, we will provide some technical convergence properties and close this section by proving the convergence of the proposed method to the unique solution of the problem (1.1).
For simplicity, we introduce shorthand notation for the quantities used below: for every k ∈ N and x ∈ ∩_{i=1}^m Fix T_i, we denote the corresponding quantities and, for every k ≥ 2, further abbreviations. In order to prove the convergence result in Theorem 2.4, we collect several facts that will be useful in what follows. We first state a lemma relating the distance from the iterate x_k to a common fixed point. The analysis is similar to that given in [11, Lemma 4]; however, we state it again for the sake of completeness. To prove Lemma 3.1, let x ∈ ∩_{i=1}^m Fix T_i be given. By using the properties of the extrapolation and Facts 1.4(ii) and 1.5(ii), we obtain the estimate (3.1). We then note, from the definition of {y_k}_{k=1}^∞, a bound on ∥y_k − x∥; invoking this relation in (3.1) yields the claimed inequality. In order to prove the boundedness of the generated sequence {x_k}_{k=1}^∞, we need the following proposition [18, Lemma 3.1].
Proposition 3.2. Let {a_k}_{k=1}^∞ be a sequence of nonnegative real numbers such that there exists a subsequence {a_{k_j}}_{j=1}^∞ of {a_k}_{k=1}^∞ with a_{k_j} < a_{k_j+1} for all j ∈ N. For all k ≥ k_0, define v(k) := max{i ∈ N : i ≤ k, a_i < a_{i+1}}. Then the sequence {v(k)}_{k≥k_0} is nondecreasing, a_{v(k)} ≤ a_{v(k)+1}, and a_k ≤ a_{v(k)+1} for every k ≥ k_0.

Proof of Lemma 3.3. Let k ∈ N and x ∈ ∩_{i=1}^m Fix T_i be given. By using the nonnegativity of the neglected terms together with the inequality obtained in Lemma 3.1, we get the estimate (3.2). For the sake of simplicity, we write Γ_k for the resulting quantity for all k ∈ N; in this notation, the inequality (3.2) can be rewritten as (3.3). Next, we divide the proof into two cases based on the behavior of the sequence {Γ_k}_{k=1}^∞.
Case I: Suppose that there exists k_0 ∈ N such that Γ_{k+1} ≤ Γ_k for all k ≥ k_0. In this case, the inequality (3.3) can be iterated for all k ≥ k_0. Since the series Σ_{k=1}^∞ β_k² converges, we obtain that the sequence {∥x_k − x∥²}_{k=1}^∞ is bounded and, subsequently, the sequence {x_k}_{k=1}^∞ is also bounded.
Case II: Suppose that there exists a subsequence {Γ_{k_j}}_{j=1}^∞ with Γ_{k_j} < Γ_{k_j+1} for all j ∈ N, and let {v(k)}_{k=1}^∞ be defined as in Proposition 3.2. This yields, for all k ≥ k_0, the relation (3.4). By using the relation (3.4) in the inequality (3.3), the nonnegativity of the neglected term, and the fact that 0 ≤ φ_k ≤ 1, we arrive at the estimate (3.6). Since f is η-strongly convex, Fact 1.1 together with the inequality (3.6) implies that the sequence {∥x_{v(k)} − x*∥}_{k=1}^∞ is bounded. Now, let us observe that the sequence {Γ_{v(k)+1}}_{k=1}^∞ is bounded above. Finally, by using the inequality (3.5), we obtain that {Γ_k}_{k=1}^∞ is bounded and, subsequently, {x_k}_{k=1}^∞ is also bounded. □

A simple consequence of Lemma 3.3 is the boundedness of related sequences. To see this, let x ∈ ∩_{i=1}^m Fix T_i be given. We first note from the L-smoothness of f that ∥∇f(x_k)∥ ≤ L∥x_k∥ + ∥∇f(0)∥ ≤ LM_1 + ∥∇f(0)∥, where M_1 := sup_{k∈N} ∥x_k∥ < ∞. This means that the boundedness of the sequence {∇f(x_k)}_{k=1}^∞ is now obtained.
Next, by the construction of d_k, we note for all k ≥ 1 that ∥d_{k+1}∥ ≤ ∥∇f(x_{k+1})∥ + φ_k ∥d_k∥/max{1, ∥d_k∥} ≤ M + 1, where M := sup_{k∈N} ∥∇f(x_k)∥ < ∞. Thus, we have ∥d_k∥ ≤ max{M + 1, ∥d_1∥} for all k ∈ N, and the boundedness of {d_k}_{k=1}^∞ is obtained. Finally, these two results immediately yield the boundedness of the sequence {y_k}_{k=1}^∞. The next lemma states a refined estimate valid for all x ∈ ∩_{i=1}^m Fix T_i. To prove it, let x ∈ ∩_{i=1}^m Fix T_i be given. We note from the inequality (3.1) that the claimed estimate follows, where the second inequality holds by the fact that ∥x + y∥² ≤ ∥x∥² + 2⟨y, x + y⟩ for all x, y ∈ H, and the third inequality holds by Fact 1.

□
The following proposition is a key tool for obtaining the convergence result in Theorem 2.4. The idea and its proof can be consulted in [19, Lemma 2.6] and [20, Lemma 2.4].
Proposition 3.6. Let {a_k}_{k=1}^∞ be a nonnegative real sequence, {δ_k}_{k=1}^∞ be a real sequence, and {γ_k}_{k=1}^∞ ⊂ (0, 1) be a sequence with Σ_{k=1}^∞ γ_k = ∞ such that

a_{k+1} ≤ (1 − γ_k)a_k + γ_k δ_k for all k ∈ N.

If lim sup_{j→∞} δ_{k_j} ≤ 0 for every subsequence {a_{k_j}}_{j=1}^∞ of {a_k}_{k=1}^∞ satisfying lim inf_{j→∞} (a_{k_j+1} − a_{k_j}) ≥ 0, then lim_{k→∞} a_k = 0.
Now, we are in a position to prove Theorem 2.4.
Proof. Let x* be the unique solution to problem (1.1). For simplicity, we denote a_k := ∥x_k − x*∥² for all k ∈ N. By using the facts obtained in Lemmata 3.3 and 3.4, the fact that lim_{k→∞} β_k = 0, and the inequality obtained in Lemma 3.1, we obtain the relation (3.7). On the other hand, since {y_k}_{k=1}^∞ is a bounded sequence, so is the subsequence {y_{k_j}}_{j=1}^∞. Due to this boundedness, there exist a weak cluster point z ∈ H and a subsequence {y_{k_{j_ℓ}}}_{ℓ=1}^∞ of {y_{k_j}}_{j=1}^∞ such that y_{k_{j_ℓ}} ⇀ z ∈ H. According to the fact obtained in (3.7), we note that lim_{ℓ→∞} ∥T_1(y_{k_{j_ℓ}}) − y_{k_{j_ℓ}}∥ = 0. Since T_1 satisfies the demi-closedness principle, we obtain that z ∈ Fix T_1. Furthermore, the fact that y_{k_{j_ℓ}} ⇀ z, together with the corresponding asymptotic regularity along this subsequence and the assumption that T_2 satisfies the demi-closedness principle, yields that z ∈ Fix T_2. By continuing the same argument, we obtain that z ∈ Fix T_i for all i = 1, …, m, that is, z ∈ ∩_{i=1}^m Fix T_i. Since x* is the unique solution to problem (1.1), we note from the optimality condition in Fact 1.2 that lim inf_{ℓ→∞} ⟨∇f(x*), y_{k_{j_ℓ}} − x*⟩ ≥ 0. Now, the assumption that lim_{k→∞} φ_k = 0, the boundedness of the sequences {y_k}_{k=1}^∞ and {d_k}_{k=1}^∞, and the relation (3.8) yield that lim sup_{j→∞} δ_{k_j} ≤ 0. Hence, by applying Proposition 3.6, we conclude that lim_{k→∞} a_k = 0. The proof is completed. □

Some illustrated consequences
In this section, we provide some simple consequences of the main theorem. We start with the minimization problem over a system of linear inequalities; this constraint is nothing other than the intersection of linear half-spaces. We subsequently show that the proposed algorithm and convergence result are also applicable to the well-known support vector machine learning problem.

Minimization problem over system of linear inequalities
In this subsection, we assume that the function f : H → R is η-strongly convex and L-smooth, as in assumption (A1). Now, let a_i ∈ H with a_i ≠ 0 and b_i ∈ R, i = 1, …, m. We consider the minimization of a strongly convex smooth function over the intersection of nonempty closed and convex sets, which is of the following form:

minimize f(x) subject to ⟨a_i, x⟩ ≤ b_i, i = 1, …, m.    (4.1)

We may assume the consistency of the system of linear inequalities, and we denote the solution point of problem (4.1) by x*. Now, for each i = 1, …, m, we let H_i := {x ∈ H : ⟨a_i, x⟩ ≤ b_i} be the half-space corresponding to a_i and b_i. Furthermore, we set T_i := P_{H_i}, the metric projection onto the half-space H_i, for all i = 1, …, m. It is worth noting that the metric projection onto a half-space has the closed-form expression

P_{H_i}(x) = x − (max{0, ⟨a_i, x⟩ − b_i}/∥a_i∥²) a_i;

see [17, Subsection 4.1.3] for further details. Note that H_i is a closed and convex set and that Fix T_i = H_i for all i = 1, …, m. To derive an iterative method for solving problem (4.1), we recall the notations T := T_m T_{m−1} ⋯ T_1, S_0 := Id, and S_i := T_i T_{i−1} ⋯ T_1 for all i = 1, 2, …, m. Now, for every x ∈ H, by setting u_i := S_i(x), we have u_0 = x and u_i = T_i(u_{i−1}) for all i = 1, 2, …, m. In this case, the stepsize function σ : H → (0, ∞) can be written as

σ(x) = (Σ_{i=1}^m ⟨u_m − u_{i−1}, u_i − u_{i−1}⟩)/∥u_m − x∥² whenever u_m ≠ x, and σ(x) = 1 otherwise;

see [13, Section 4.2] for further details.
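The closed-form projection above is easy to check numerically: the projected point must lie in the half-space and be no farther from x than any feasible point. The half-space data below are hypothetical.

```python
import random

def proj_halfspace(x, a, b):
    """Closed-form projection onto H = {x : <a, x> <= b}:
    P(x) = x - (max{0, <a, x> - b} / ||a||^2) * a."""
    viol = sum(xi * ai for xi, ai in zip(x, a)) - b
    if viol <= 0.0:
        return list(x)
    s = viol / sum(ai * ai for ai in a)
    return [xi - s * ai for xi, ai in zip(x, a)]

random.seed(3)
a, b = [1.0, 2.0], 3.0  # hypothetical half-space <a, x> <= 3 in R^2
for _ in range(1000):
    x = [random.uniform(-5, 5), random.uniform(-5, 5)]
    p = proj_halfspace(x, a, b)
    # The projection lands in the half-space ...
    assert sum(pi * ai for pi, ai in zip(p, a)) <= b + 1e-9
    # ... and is no farther from x than any sampled feasible point z.
    z = [random.uniform(-5, 5), random.uniform(-5, 5)]
    if sum(zi * ai for zi, ai in zip(z, a)) <= b:
        d_p = sum((xi - pi) ** 2 for xi, pi in zip(x, p))
        d_z = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
        assert d_p <= d_z + 1e-9
print("closed-form projection verified")
```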
In order to solve the problem (4.1), we consider a particular situation of MESCoM-CGD as the following algorithm.
We obtain an immediate consequence of Theorem 2.4 in the following corollary.
To examine the behavior of Algorithm 2 and the convergence given in Corollary 4.1, we consider the solving of the minimum-norm problem for a system of homogeneous linear inequalities. We assume that the whole space H is finite dimensional, so that H = R^n. By putting H_i, i = 1, …, m, as the corresponding half-spaces and T_i := P_{H_i}, i = 1, …, m, the metric projections onto H_i with Fix T_i = H_i, the operators T_i satisfy the demi-closedness principle. Furthermore, the function f(x) := (1/2)∥x∥² is 1-strongly convex and 1-smooth. In this situation, problem (4.2) is nothing other than a special case of problem (4.1), which yields that Algorithm 2 and the convergence given in Corollary 4.1 are applicable.
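As an illustration of this setting, the sketch below assembles an iteration of the type described in Section 2 (a vanishing steering step along a normalized conjugate-gradient-like direction, followed by an extrapolated sweep of half-space projections) for a small hypothetical instance whose minimum-norm point is known. The control sequences β_k = 1/(k+1), φ_k = 1/(k+1)², λ_k = 1 and the exact update rules are illustrative assumptions, not the paper's algorithm verbatim.

```python
# Hypothetical instance: minimize (1/2)||x||^2 subject to x1 >= 1 and x2 >= 1,
# i.e. <a_i, x> <= b_i with a_1 = (-1, 0), b_1 = -1 and a_2 = (0, -1), b_2 = -1.
# The minimum-norm feasible point is (1, 1).
halfspaces = [((-1.0, 0.0), -1.0), ((0.0, -1.0), -1.0)]

def proj(x, a, b):
    viol = x[0] * a[0] + x[1] * a[1] - b
    if viol <= 0.0:
        return x
    s = viol / (a[0] ** 2 + a[1] ** 2)
    return (x[0] - s * a[0], x[1] - s * a[1])

def sweep(x):
    us = [x]
    for a, b in halfspaces:
        us.append(proj(us[-1], a, b))
    return us

def sigma(x, us):
    Tx = us[-1]
    denom = (Tx[0] - x[0]) ** 2 + (Tx[1] - x[1]) ** 2
    if denom < 1e-15:
        return 1.0
    num = sum((Tx[0] - us[i - 1][0]) * (us[i][0] - us[i - 1][0])
              + (Tx[1] - us[i - 1][1]) * (us[i][1] - us[i - 1][1])
              for i in range(1, len(us)))
    return num / denom

grad = lambda x: x              # gradient of f(x) = 0.5 * ||x||^2

x = (3.0, 4.0)
d = (-x[0], -x[1])              # d_1 = -grad f(x_1)
for k in range(1, 5001):
    beta, phi, lam = 1.0 / (k + 1), 1.0 / (k + 1) ** 2, 1.0
    nd = max(1.0, (d[0] ** 2 + d[1] ** 2) ** 0.5)
    # steering step along the normalized direction
    y = (x[0] + beta * d[0] / nd, x[1] + beta * d[1] / nd)
    # extrapolated sweep of half-space projections
    us = sweep(y)
    s = sigma(y, us)
    Ty = us[-1]
    x = (y[0] + lam * s * (Ty[0] - y[0]), y[1] + lam * s * (Ty[1] - y[1]))
    # conjugate-gradient-like direction update with normalization
    g = grad(x)
    d = (-g[0] + phi * d[0] / nd, -g[1] + phi * d[1] / nd)

err = ((x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2) ** 0.5
assert err < 0.05
print(f"final error: {err:.4f}")
```

The vanishing steering term selects the minimum-norm point among all common fixed points of the projections, mirroring the role of β_k in the convergence theory.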
Next, we show the behavior of Algorithm 2 when solving problem (4.2) for various error tolerances ε. We set the number of variables to n = 100 and consider several numbers of inequalities, namely m = 100, 200, 300, 400, and 500. According to the above experiments, we set the parameters β_0 = 1, φ_0 = 1, and λ_0 = 1. We plot the number of iterations and the computational time in seconds in Figure 1. It can be noticed from Figure 1 that a smaller number m requires a larger number of iterations for all error tolerances. Moreover, for m = 100, 200, and 300, it can be seen that a smaller number m also requires a larger computational time.

Support vector machine learning problem
In this subsection, we consider the construction of a classifier for the binary classification problem, starting from a given training dataset of two classes. More precisely, we are given the i-th training data a_i ∈ R^n and the i-th label b_i ∈ {−1, +1} for all i = 1, …, m. Support vector machine learning trains a weight u ∈ R^n so that the (linear) classifier c(a) := ⟨a, u⟩ gives the correct class of every new tested data a ∈ R^n. In this situation, a is identified to be in the class −1 if c(a) < 0, and in the class +1 otherwise. Mathematically, support vector machine learning can be formulated as the minimization problem (4.3), where ξ_i ≥ 0 is the nonnegative slack variable corresponding to the i-th training data for all i = 1, …, m. By introducing a new variable x := (u, ξ_1, …, ξ_m) ∈ R^{n+m} and using the idea given in, for instance, [15,21], problem (4.3) can be written as problem (4.4) for a suitable matrix A ∈ R^{2m×(n+m)} and vector b ∈ R^{2m}. Problem (4.4) is nothing other than a particular case of problem (4.2) and, subsequently, Algorithm 2 and the convergence given in Corollary 4.1 are also applicable.
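The displayed matrix A and vector b are not reproduced above, but for the standard soft-margin constraints b_i⟨a_i, u⟩ ≥ 1 − ξ_i and ξ_i ≥ 0 (our assumption about the intended model), the reformulation Ax ≤ b can be assembled as follows; the two-point training set is hypothetical.

```python
# Stack the soft-margin constraints b_i <a_i, u> >= 1 - xi_i and xi_i >= 0
# as a single system A x <= b with x = (u, xi_1, ..., xi_m).
def build_system(data, labels):
    m, n = len(data), len(data[0])
    A, b = [], []
    for i in range(m):
        # -b_i <a_i, u> - xi_i <= -1
        row = [-labels[i] * v for v in data[i]] + [0.0] * m
        row[n + i] = -1.0
        A.append(row)
        b.append(-1.0)
    for i in range(m):
        # -xi_i <= 0
        row = [0.0] * (n + m)
        row[n + i] = -1.0
        A.append(row)
        b.append(0.0)
    return A, b

# Hypothetical 2-point training set in R^2.
data = [[2.0, 0.0], [-1.0, -1.0]]
labels = [1, -1]
A, b = build_system(data, labels)
assert len(A) == 2 * len(data) and len(A[0]) == len(data[0]) + len(data)

# Feasibility check: u = (1, 0) separates the two points with zero slack.
x = [1.0, 0.0, 0.0, 0.0]
for row, bi in zip(A, b):
    assert sum(r * v for r, v in zip(row, x)) <= bi + 1e-9
print("reformulated system is consistent for the separating weight")
```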
In the first experiment, we aim to classify the 28 × 28 gray-scale images of handwritten digits from the MNIST dataset, available at https://cs.nyu.edu/roweis/data.html. We used a dataset of 5000 images of the handwritten digit 9 and a dataset of 5000 images of the handwritten digits 0-8, labeled by the classes +1 and −1, respectively. We perform 10-fold cross-validation on the given datasets: we put one fold of 1000 images as the testing data and the remaining nine folds, consisting of 9000 images, as the training data. We repeat the cross-validation process 10 times so that each fold serves once as the testing data.
Recall that the class of the digit 9 and the class of the digits 0-8 are labeled by +1 (positive) and −1 (negative), respectively. If a tested data is labeled as positive and is classified as positive, it is counted as a true positive (TP); if it is classified as negative, it is counted as a false negative (FN). If a tested data is labeled as negative and is classified as negative, it is counted as a true negative (TN); if it is classified as positive, it is counted as a false positive (FP). In the experiment, we terminate the run when the number of iterations reaches k = 50 and, subsequently, average each performance metric over the 10 repetitions of the cross-validation process. The classification performance metrics for each parameter are presented in Table 1.

Fact 1.2.
[16, Corollary 3.68] Let f : H → R be a continuously differentiable convex function and C ⊂ H be a closed and convex set. Then f attains its minimum at a point x* ∈ C if, and only if, ⟨∇f(x*), y − x*⟩ ≥ 0 for all y ∈ C.

Fact 1.4.
[17, Remark 2.1.31 and Lemma 2.1.36] Let T : H → H be a cutter. Then the following statements hold: (i) Fix T is closed and convex; (ii) for all x ∈ H and all z ∈ Fix T, we have ⟨z − x, T(x) − x⟩ ≥ ∥T(x) − x∥².

Lemma 3.1.
For all k ∈ N and x ∈ ∩_{i=1}^m Fix T_i, we have the estimate (3.1). (Proposition 3.2, conclusion:) Then the sequence {v(k)}_{k≥k_0} is nondecreasing, a_{v(k)} ≤ a_{v(k)+1}, and a_k ≤ a_{v(k)+1} for every k ≥ k_0. Now, we are in a position to prove the boundedness property of the generated sequence {x_k}_{k=1}^∞ as the following lemma; the idea of the proof is based on [15, Lemma 5]. Lemma 3.3. The sequence {x_k}_{k=1}^∞ is a bounded sequence. Proof. Let k ∈ N and x ∈ ∩_{i=1}^m Fix T_i be given.

Figure 1.
Behavior of Algorithm 2 for various error tolerances.

To measure the performance of each classifier obtained from each cross-validation, we consider the classification performance metrics of accuracy, precision, recall, specificity, and F-measure, where TP, FP, TN, and FN denote the numbers of true positives, false positives, true negatives, and false negatives, respectively. These metrics are computed as follows:
Accuracy = (TP + TN)/(TP + TN + FP + FN),
Precision = TP/(TP + FP),
Recall = TP/(TP + FN),
Specificity = TN/(TN + FP),
F-measure = 2 × (Recall × Precision)/(Recall + Precision).
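These metrics translate directly into code; the confusion counts in the example below are hypothetical, not taken from Table 1.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the five metrics listed above from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_measure = 2 * recall * precision / (recall + precision)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "specificity": specificity,
            "f_measure": f_measure}

# Hypothetical confusion counts for one cross-validation fold.
m = classification_metrics(tp=900, fp=50, tn=950, fn=100)
assert abs(m["accuracy"] - (900 + 950) / 2000) < 1e-12
print({k: round(v, 4) for k, v in m.items()})
```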
The minimum-norm problem is to find the vector x ∈ R^n of smallest norm that solves the system ⟨a_i, x⟩ ≤ b_i, i = 1, …, m; the constrained sets H_i are then the corresponding half-spaces.