An adaptive splitting algorithm for the sum of two generalized monotone operators and one cocoercive operator

Splitting algorithms for finding a zero of the sum of operators often involve multiple steps which are referred to as forward or backward steps. Forward steps are the explicit use of the operators, and backward steps involve the operators implicitly via their resolvents. In this paper, we study an adaptive splitting algorithm for finding a zero of the sum of three operators. We assume that two of the operators are generalized monotone and their resolvents are computable, while the other operator is cocoercive but its resolvent is missing or costly to compute. Our splitting algorithm adapts its parameters to the generalized monotonicity of the operators and, at the same time, combines appropriate forward and backward steps to guarantee convergence to a solution of the problem.


Introduction
Operator splitting algorithms are developed for structured optimization problems based on the idea of performing the computation separately on individual operators. At each iteration, they require multiple steps which are known as either forward or backward steps.
The forward steps are almost always easy, as they use the operator directly. The backward steps, on the other hand, are often more complicated, as they use the resolvents of the operators. While there are many operators whose resolvents are readily computable, there exist operators whose resolvents may not be computable in closed form; thus, it is necessary to use forward steps in certain situations. Notable examples of splitting algorithms include the forward-backward algorithm [15], the Douglas-Rachford algorithm [14,15], and many others.
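As a minimal numerical illustration (not taken from the paper), the following sketch combines the two kinds of steps on hypothetical scalar operators: a forward step on C = ∇(½(x − 3)²) = x − 3, which is 1-cocoercive, and a backward step on B = ∂|·|, whose resolvent is the soft-thresholding map. All operators and parameter values here are assumptions chosen only to make the two kinds of steps visible.

```python
# Illustrative sketch: one forward step (explicit use of C) followed by one
# backward step (resolvent of γB, implicit use of B).
gamma = 0.5                       # step size / resolvent parameter

def forward(x):                   # forward step: uses C = (· − 3) directly
    return x - gamma * (x - 3.0)

def backward(x):                  # backward step: resolvent of γB for B = ∂|·|,
    return max(abs(x) - gamma, 0.0) * (1.0 if x >= 0 else -1.0)  # soft-threshold

x = 0.0
for _ in range(100):
    x = backward(forward(x))      # forward-backward iteration
print(x)                          # approaches 2, the zero of ∂|·| + (· − 3)
```

The iterate approaches the unique zero of B + C, namely x = 2, where the subgradient −1 of |·| cancels the forward term 2 − 3.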
In this paper, we study an adaptive splitting algorithm for the inclusion problem

find x ∈ X such that 0 ∈ Ax + Bx + Cx, (1)

The algorithm
Throughout, X is a real Hilbert space with inner product ⟨·, ·⟩ and induced norm ∥·∥. The set of nonnegative integers is denoted by N and the set of real numbers by R. We denote the set of nonnegative real numbers by R+ := {x ∈ R | x ≥ 0} and the set of positive real numbers by R++ := {x ∈ R | x > 0}. The notation A : X ⇒ X indicates that A is a set-valued operator on X, and the notation A : X → X indicates that A is a single-valued operator on X.
Let A : X ⇒ X be an operator on X. Then its domain is dom A := {x ∈ X | Ax ≠ ∅}, its set of zeros is zer A := {x ∈ X | 0 ∈ Ax}, and its set of fixed points is Fix A := {x ∈ X | x ∈ Ax}.
The graph of A is the set gra A := {(x, u) ∈ X × X | u ∈ Ax} and the inverse of A, denoted by A⁻¹, is the operator with graph gra A⁻¹ := {(u, x) ∈ X × X | u ∈ Ax}. The resolvent of A is defined by J_A := (Id + A)⁻¹, where Id is the identity operator.

Now, let η, γ, δ ∈ R++ and set λ := 1 + δ/γ. In order to address problem (1), we employ the operator

T_{A,B,C} := Id − ηJ_{γA} + ηJ_{δB}((1 − λ)Id + λJ_{γA} − δCJ_{γA}). (3)

We will also refer to γ and δ as the resolvent parameters, as they are used to scale the operators A and B in their respective resolvents. In fact, we adapt γ and δ to the generalized monotonicity of A and B in order to guarantee the convergence of T_{A,B,C}. Intuitively, in the case where A and B are maximally monotone, one would expect the use of equal resolvent parameters γ = δ; in other cases, γ and δ are no longer the same. This phenomenon was initially observed in [11,12]. Although the imbalance of monotonicity can be resolved by shifting the identity between the operators as in [11, Remark 4.15], our plan is to conduct the convergence analysis of the algorithm applied to the original operators. To motivate the use of (3), the following result shows the relationship between the fixed point set of T_{A,B,C} and the solution set of (1).
Proof Let x ∈ dom T_{A,B,C} and set u := J_{γA}x. Since η > 0, we have x ∈ Fix T_{A,B,C} if and only if J_{γA}x = J_{δB}Sx, where S := (1 − λ)Id + λJ_{γA} − δCJ_{γA}. Now u = J_{γA}x means that (x − u)/γ ∈ Au, while u = J_{δB}Sx means that (Sx − u)/δ ∈ Bu; since Sx − u = (λ − 1)(u − x) − δCu = (δ/γ)(u − x) − δCu, the latter reads (u − x)/γ − Cu ∈ Bu. Therefore, if x ∈ Fix T_{A,B,C}, then 0 = (x − u)/γ + ((u − x)/γ − Cu) + Cu ∈ Au + Bu + Cu, so J_{γA}x ∈ zer(A + B + C), which completes the proof.
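To make the operator T_{A,B,C} concrete, here is a numerical sketch (an illustration, not part of the paper) on scalar linear operators, for which both resolvents are available in closed form. The operators A = a·Id, B = b·Id, C = c·Id and all parameter values are assumptions chosen only so that the iteration is easy to follow; the inner operator S = (1 − λ)Id + λJ_{γA} − δCJ_{γA} is the one used in the convergence analysis.

```python
# Sketch of the iteration x ↦ T_{A,B,C}x = x − ηJ_{γA}x + ηJ_{δB}(Sx)
# on scalar linear operators, whose resolvents are J_{γA}x = x/(1 + γa), etc.
a, b, c = 1.0, 2.0, 0.5        # A = a·Id, B = b·Id strongly monotone; C = c·Id is (1/c)-cocoercive
gamma, delta = 0.5, 0.5        # resolvent parameters
lam = 1 + delta / gamma        # λ = 1 + δ/γ
eta = 0.5                      # relaxation parameter

def J(t, m, x):                # resolvent of the linear operator u ↦ m·u
    return x / (1 + t * m)

def T(x):
    JA = J(gamma, a, x)
    Sx = (1 - lam) * x + lam * JA - delta * c * JA   # S = (1-λ)Id + λJ_{γA} - δCJ_{γA}
    return x - eta * JA + eta * J(delta, b, Sx)

x = 10.0
for _ in range(200):
    x = T(x)
print(abs(J(gamma, a, x)))     # J_{γA}x approaches the unique zero of A + B + C, here 0
```

With these values one iteration contracts by the factor 17/24, so J_{γA}x_n tends to 0, the unique zero of (a + b + c)·Id.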
In the rest of this section, we recall some preliminary concepts and results. Let T : X → X be a single-valued operator on X. Then T is nonexpansive if it is Lipschitz continuous with constant 1 on its domain, i.e., for all x, y ∈ dom T, ∥Tx − Ty∥ ≤ ∥x − y∥. The operator T is said to be conically averaged with constant θ ∈ R++ (see [4,7]) if there exists a nonexpansive operator N : X → X such that T = (1 − θ)Id + θN. A conically θ-averaged operator is θ-averaged when θ ∈ ]0, 1[ and nonexpansive when θ = 1. Further properties are discussed in the following result from [4, Proposition 2.2].

Proposition 2.2
Let T : X → X, θ ∈ R++, and λ ∈ R++. Then the following are equivalent:
(i) T is conically θ-averaged;
(ii) (1 − λ)Id + λT is conically λθ-averaged;
(iii) for all x, y ∈ X, ∥Tx − Ty∥² ≤ ∥x − y∥² − ((1 − θ)/θ)∥(Id − T)x − (Id − T)y∥².

Recall from [11] that an operator A : X ⇒ X is α-monotone, with α ∈ R, if for all (x, u), (y, v) ∈ gra A, ⟨x − y, u − v⟩ ≥ α∥x − y∥². We say that A is monotone if α = 0, strongly monotone if α > 0, and weakly monotone if α < 0. The operator A is said to be maximally α-monotone if it is α-monotone and there is no α-monotone operator B : X ⇒ X such that gra B properly contains gra A. We say that A is σ-cocoercive if σ ∈ R++ and, for all x, y ∈ dom A, ⟨x − y, Ax − Ay⟩ ≥ σ∥Ax − Ay∥². Clearly, if A is σ-cocoercive, then A is single-valued and monotone. In fact, σ-cocoercivity was extended to σ-comonotonicity to allow for a negative parameter σ; see [4,7] for more details. Next, we recall a result from [11, Lemma 3.3 and Proposition 3.4].

Proposition 2.3 (single-valued and full domain) Let
A : X ⇒ X be α-monotone and let γ ∈ R++ be such that 1 + γα > 0. Then the following hold:
(i) J_{γA} is single-valued;
(ii) if A is maximally α-monotone, then J_{γA} has full domain, i.e., dom J_{γA} = X.

Finally, we recall the demiclosedness principle for cocoercive operators developed in [2]. A fundamental result in the theory of nonexpansive mappings is Browder's celebrated demiclosedness principle [9]. It was extended to finitely many firmly nonexpansive mappings in [5] and later generalized in [2] to a finite family of conically averaged mappings and to a finite family of cocoercive mappings. An immediate application of the demiclosedness principle is a simple proof of the weak convergence of the shadow sequence of the Douglas-Rachford algorithm [5] and of the adaptive Douglas-Rachford algorithm [2]. For our analysis, we recall only the result for two operators.
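As a small numerical companion to Proposition 2.3 (an illustration with an assumed operator, not from the paper): even when a resolvent has no closed form, the backward step J_{γA}x = (Id + γA)⁻¹x can be evaluated by solving the implicit equation, here in one dimension by bisection. The operator A(u) = αu + u³ is α-monotone (the cubic term is monotone), and with 1 + γα > 0 the map u ↦ u + γA(u) is strictly increasing, so the solution is unique.

```python
# Evaluating J_{γA}x by bisection for the α-monotone operator A(u) = αu + u³.
alpha, gamma = -0.5, 1.0       # weakly monotone A, but 1 + γα = 0.5 > 0

def A(u):
    return alpha * u + u**3

def resolvent(x, lo=-100.0, hi=100.0, tol=1e-12):
    # u ↦ u + γA(u) is strictly increasing (derivative ≥ 1 + γα > 0),
    # so bisection on u + γA(u) = x converges to the unique solution
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid + gamma * A(mid) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

u = resolvent(2.0)
print(abs(u + gamma * A(u) - 2.0))   # residual of the implicit equation
```

This is exactly the situation the paper avoids for C: when no closed form is available, each backward step costs an inner solve, which motivates using forward steps on the cocoercive operator instead.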

An abstract convergence result
In order to study T_{A,B,C}, it is reasonable to consider the general operator

T := Id − ηT1 + ηT2 ∘ S with S := (1 − λ)Id + λT1 − δT3T1,

where T1, T2, T3 : X → X and η, ν, λ, δ ∈ R++ with λ = 1 + ν. In this section, we establish a convergence result for the operator T under the cocoercivity of T1, T2, T3. We begin with a useful technical lemma.
The following proposition is inspired by [13, Proposition 2.1].
As a result, (x_n)_{n∈N} is bounded, and so is (T1x_n)_{n∈N}. Let y* be a weak cluster point of (T1x_n)_{n∈N}. Then there exists a subsequence (x_{k_n})_{n∈N} such that T1x_{k_n} ⇀ y*. Define z_n := Sx_n = (1 − λ)x_n + λT1x_n − δT3T1x_n. Since T3T1x_n → T3T1x* by (ii), it follows that

Next, we have from (i) that

which, due to (41), implies that

Set ρ1 := λ − 1 = ν > 0 and ρ2 := 1. Then

and it follows from (42) that

Using the definition of z_n and then (43), we obtain

Now, in view of (40), (41), (42), (43), (44), (45), and (47), we apply Proposition 2.4 to derive that y* = T1x*, which is the unique weak cluster point of (T1x_n)_{n∈N}. Thus, T1x_n ⇀ T1x*. Since T1x_n − T2Sx_n = (1/η)(Id − T)x_n → 0 and x* ∈ Fix T, we derive that T2Sx_n ⇀ T1x* = T2Sx*.
(iv): In this case, ω3 > 0. So (37) implies that, as n → +∞,

On the other hand,

Since y is arbitrary in Fix T and x* ∈ Fix T, it also follows that T1y = T1x* and T2Sy = T2Sx*. The proof is complete.

Zeros of the sum of three operators
In this section, we apply the result to the problem of finding a zero of the sum of three operators. We assume that the operator A is maximally α-monotone, the operator B is maximally β-monotone, and the operator C is σ -cocoercive. We will consider two cases: α + β = 0 and α + β > 0.
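As a quick numerical aside (an illustration with an assumed operator, not from the paper), the σ-cocoercivity assumed of C can be spot-checked by sampling its defining inequality ⟨x − y, Cx − Cy⟩ ≥ σ∥Cx − Cy∥². The choice C = tanh = ∇ log cosh is hypothetical: log cosh is convex with 1-Lipschitz gradient, so by the Baillon-Haddad theorem C is 1-cocoercive.

```python
# Spot-checking 1-cocoercivity of C = tanh on random scalar pairs.
import math
import random

random.seed(0)
sigma = 1.0
for _ in range(1000):
    x, y = random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0)
    gap = math.tanh(x) - math.tanh(y)
    # the defining inequality, with a tiny tolerance for floating point
    assert (x - y) * gap >= sigma * gap ** 2 - 1e-12
print("cocoercivity inequality verified on 1000 random pairs")
```

Such a sampled check can of course only refute, never prove, cocoercivity, but it is a useful sanity test when modeling C.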
Using Theorem 4.1, we recover the results in [13, Theorem 2.1(1)], which partly spurred our interest in the topic.

Corollary 4.2 Suppose that A and B are maximally monotone, that C is σ-cocoercive, and that δ = γ ∈ ]0, 4σ[.
Then the following hold:

Proof Apply Theorem 4.1 with α = β = 0, δ = γ, and η* = 2 − γ/(2σ).

Next, we consider the case α + β > 0. This case indeed allows for some flexibility in choosing the resolvent parameters γ, δ. Recall that in the case α + β = 0 of Theorem 4.1, the resolvent parameters γ, δ must be related by an exact equation. In the case α + β > 0, this exact relation is no longer necessary; instead, for a given γ, one can choose δ within a range whose width is a positive number depending on α, β, and γ. In the next results, we will show that such choices of (γ, δ) always exist and guarantee convergence of the algorithm.

Lemma 4.4 (existence of resolvent parameters) Let α, β ∈ R be such that α + β > 0, let σ ∈ R++, and let γ, δ ∈ R++. Set

Then γ0 ≥ max{0, −α + 1/(4σ)} and the following statements are equivalent:

Consequently, there always exist γ, δ ∈ R++ that satisfy both (i) and (ii).
We are now ready to prove the convergence of the algorithm in the case α + β > 0.

Theorem 4.5 (convergence in the case α + β > 0) Suppose that A and B are respectively maximally α- and β-monotone with α + β > 0, that C is σ-cocoercive, and that γ, δ ∈ R++ satisfy conditions (i) and (ii) of Lemma 4.4. Set λ := 1 + δ/γ and let η ∈ R++. Let (x_n)_{n∈N} be a sequence generated by T_{A,B,C} in (3) and set S := (1 − λ)Id + λJ_{γA} − δCJ_{γA}. Then the following hold:
(i) T_{A,B,C} is single-valued and has full domain.
(ii) For all x, y ∈ X,

In particular, T_{A,B,C} is conically (η/η*)-averaged.
(iii) If zer(A + B + C) ≠ ∅ and η < η*, then the rate of asymptotic regularity of T_{A,B,C} is o(1/√n) and (x_n)_{n∈N} converges weakly to a point x* ∈ Fix T_{A,B,C}, while (J_{γA}x_n)_{n∈N} and (J_{δB}Sx_n)_{n∈N} converge strongly to J_{γA}x* = J_{δB}Sx* ∈ zer(A + B + C), (CJ_{γA}x_n)_{n∈N} converges strongly to CJ_{γA}x*, and zer(A + B + C) = {J_{γA}x*}.
On the other hand, Therefore, we obtain (ii) due to Proposition 3.2(ii). Finally, applying Theorem 3.3(i), (ii), and (iv) and noting that J γ A (Fix T A,B,C ) = zer(A + B + C) due to Proposition 2.1, we complete the proof.

Zeros of the sum of two operators
The new results in Theorems 4.1 and 4.5 allow us to revisit the relaxed forward-backward, relaxed backward-forward, and adaptive Douglas-Rachford algorithms for finding a zero of the sum of two operators.

Theorem 5.1 (relaxed forward-backward) Suppose that B is maximally β-monotone with β ∈ R+ and that C is σ-cocoercive. Let γ ∈ ]0, 4σ[, η ∈ ]0, 2 − γ/(2σ)[, and let (x_n)_{n∈N} be a sequence generated by x_{n+1} := T_FB x_n, where T_FB := (1 − η)Id + ηJ_{γB}(Id − γC). Then the following hold:
(i) For all x, y ∈ X,

In particular, T_FB is (2ησ/(4σ − γ))-averaged.
(ii) If zer(B + C) ≠ ∅, then the rate of asymptotic regularity of T_FB is o(1/√n) and (x_n)_{n∈N} converges weakly to a point x* ∈ zer(B + C), while (Cx_n)_{n∈N} converges strongly to Cx*, and C(zer(B + C)) = {Cx*}. Moreover, if additionally β > 0, then (x_n)_{n∈N} converges strongly to x* and zer(B + C) = {x*}.
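The relaxed forward-backward iteration can be sketched numerically as follows (an illustration, not from the paper). The operator is written in the standard relaxed form T_FB = (1 − η)Id + ηJ_{γB}(Id − γC), consistent with the averagedness constant in Theorem 5.1; the scalar operators B = 2·Id (maximally 2-monotone) and C = (· − 3), which is 1-cocoercive so σ = 1, are assumptions for the demo. Note that γ = 3 exceeds the classical forward-backward bound 2σ but lies in the range ]0, 4σ[ permitted here.

```python
# Relaxed forward-backward sketch: x⁺ = (1 − η)x + ηJ_{γB}(x − γCx).
sigma, gamma, eta = 1.0, 3.0, 0.4
assert 0 < gamma < 4 * sigma and 0 < eta < 2 - gamma / (2 * sigma)

def C(x):                       # 1-cocoercive forward operator
    return x - 3.0

def J_gammaB(x):                # resolvent of γB for B = 2·Id
    return x / (1 + 2 * gamma)

def T_FB(x):
    return (1 - eta) * x + eta * J_gammaB(x - gamma * C(x))

x = 10.0
for _ in range(200):
    x = T_FB(x)
print(x)                        # zer(B + C): 2x + (x − 3) = 0, i.e. x = 1
```

Since β = 2 > 0 here, the theorem predicts strong convergence to the unique zero, which the scalar iteration exhibits as a linear contraction toward x = 1.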