A THEORETICAL FRAMEWORK FOR THE REGULARIZATION OF POISSON LIKELIHOOD ESTIMATION PROBLEMS

Abstract. Let z = Au + γ be an ill-posed, linear operator equation. Such a model arises, for example, in both astronomical and medical imaging, in which case γ corresponds to the background, u the unknown true image, A the forward operator, and z the data. Regularized solutions of this equation can be obtained by solving

R_α(z) := arg min_{u≥0} { T_0(Au; z) + αJ(u) },

where T_0(Au; z) is the negative-log of the Poisson likelihood functional, and α > 0 and J are the regularization parameter and functional, respectively. Our goal in this paper is to determine general conditions which guarantee that R_α defines a regularization scheme for z = Au + γ. Determining the appropriate definition of regularization scheme in this context is important: not only does it unify previous theoretical arguments in this direction, it also provides a framework for future theoretical analyses. To illustrate the latter, we end the paper with an application of the general framework to a case in which such an analysis has not been done.

1. Introduction. Consider the linear operator equation

(1) z = Au + γ.

Our application of interest is image processing, so that z ∈ L∞(Ω) denotes the image intensity, u ∈ L²(Ω) the (nonnegative) intensity of the unknown object, and γ ∈ L∞(Ω) the (nonnegative) intensity of the background. Each function is defined on a closed, bounded domain Ω ⊂ R^d. Finally, A is the integral operator

(Au)(x) = ∫_Ω a(x, y) u(y) dy,

where a ∈ L∞(Ω × Ω) is the error-free, nonnegative point spread function (PSF). Given these assumptions, A : L²(Ω) → L²(Ω) is a compact operator, and hence the problem of solving (1) for u is ill-posed [10,11]. Moreover, Au ≥ 0 whenever u ≥ 0, and hence, assuming that the true image u_exact ≥ 0, the error-free data z = Au_exact + γ is bounded below by γ. Note that here, and in the remainder of the document, we omit "almost everywhere" from the mathematical statements in which its presence is called for.
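The structure of the model can be illustrated with a small discrete sketch. The Gaussian kernel, grid size, true image, and background below are illustrative assumptions, not quantities from the paper; the point is only that a nonnegative kernel gives Au ≥ 0 for u ≥ 0, so the error-free data satisfies z ≥ γ pointwise.

```python
import numpy as np

# Discrete sketch of the model z = A u + gamma on Omega = [0, 1].
# The kernel a(x, y) (a Gaussian, purely for illustration) and all
# concrete values are assumptions, not taken from the paper.
n = 200
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# a(x, y) >= 0: a Gaussian point spread function
a = np.exp(-((x[:, None] - x[None, :]) ** 2) / (2 * 0.01 ** 2))
A = a * h  # quadrature discretization of (Au)(x) = int_Omega a(x, y) u(y) dy

u_exact = np.maximum(0.0, np.sin(np.pi * x))  # nonnegative "true image"
gamma = 0.1 * np.ones(n)                      # nonnegative background

z = A @ u_exact + gamma  # error-free data

# Au >= 0 whenever u >= 0, hence z is bounded below by gamma
print(z.min() >= gamma.min())  # True
```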
In previous works of the author [2,3,4], the following variational problem is theoretically analyzed:

(2) min_{u∈C} T_α(Au; z) := T_0(Au; z) + αJ(u),

where C = {u ∈ L²(Ω) | u ≥ 0}, α and J are the regularization parameter and functional, respectively, and T_0 is the functional analogue of the negative-log of the Poisson likelihood function [4,7], which arises in positron emission tomography (PET) [8] and in image deblurring when a charge-coupled-device (CCD) camera is used to collect images [9]; it is given by

(3) T_0(Au; z) := ∫_Ω { (Au + γ) − z log(Au + γ) } dx.

In [2], an analysis of (2) with the regularization functional

(4) J(u) = ½ ||u||²_{L²(Ω)}

is the focus; in [3], the regularization functional is

(5) J(u) = ½ ∫_Ω (Λ(x)∇u(x)) · ∇u(x) dx,

where Λ(x) is a 2 × 2 positive definite matrix with continuously differentiable components for all x ∈ Ω; and finally, in [4], we consider the total variation regularization functional

(6) J_β(u) = sup_{v∈V} ∫_Ω ( −u ∇·v + β √(1 − |v(x)|²) ) dx,

where β ≥ 0 and V := { v ∈ C¹_c(Ω; R^d) : |v(x)| ≤ 1 for all x ∈ Ω }. We note that if u is continuously differentiable on Ω, (6) takes the recognizable form [1, Theorem 2.1]

(7) J_β(u) = ∫_Ω √( |∇u(x)|² + β ) dx.

J_0(u) is known as the total variation of u.
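A minimal discrete analogue of the objective in (2) can be written down directly; the sketch below uses the Tikhonov-type choice J(u) = ½||u||² and a random nonnegative matrix as a stand-in for the blurring operator, all of which are illustrative assumptions. It also checks numerically a fact used later in the paper: u_exact minimizes T_0 over the nonnegative cone.

```python
import numpy as np

# Discrete version of (2): T_alpha(Au; z) = T_0(Au; z) + alpha * J(u),
# with T_0 the negative-log Poisson likelihood functional and
# J(u) = 0.5 * ||u||^2. All concrete values are illustrative assumptions.
rng = np.random.default_rng(0)
n = 100
A = np.abs(rng.normal(size=(n, n))) / n  # nonnegative stand-in for the operator
gamma = 0.05 * np.ones(n)                # positive background
u_exact = np.abs(rng.normal(size=n))     # nonnegative "true image"
z = A @ u_exact + gamma                  # error-free data

def T0(u):
    w = A @ u + gamma
    return np.sum(w - z * np.log(w))

def T_alpha(u, alpha):
    return T0(u) + alpha * 0.5 * np.dot(u, u)

# Since w - z*log(w) is minimized pointwise at w = z, and z = A u_exact + gamma,
# u_exact is a global minimizer of T_0 over nonnegative u.
for _ in range(100):
    v = np.abs(rng.normal(size=n))
    assert T0(v) >= T0(u_exact) - 1e-10
print("u_exact attains the minimum of T0 over all sampled nonnegative u")
```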
In [2,3,4], the term regularization scheme is not defined. We do this rigorously below, and then argue that the results of [2,3,4] imply that (2) defines a regularization scheme for J given by (4), (5), and (6). Moreover, the definition, together with our previous arguments, provides a general framework for future analyses of (2) with, for example, different regularization functionals or, as we will see later, with convex transformations of T_0, such as the one used in positron emission tomography (PET) [8].
Before continuing, we note that the definition of T_0 used in [2,3,4] is given by T_0(Au + σ²; z + γ + σ²), with T_0 defined by (3). We have made this change for two reasons: first, we are using (1) instead of z = Au as our model, and second, we have removed the parameter σ², which stems from a noise model for CCD camera data [9] and, as such, is extraneous in the functional (non-stochastic and non-discrete) setting of this paper.
The paper is organized as follows. We present our definition of regularization scheme in Section 2. In Section 3, we prove that the results of [2,3,4] imply that problem (2) defines a regularization scheme when J is given by (4), (5), and (6).
Finally, in Section 4, we illustrate the utility of the general framework by applying it to a variational problem that arises in PET imaging.
2. Definition of regularization scheme. We begin with some definitions and notation. Let

B := L∞(Ω × Ω) × L∞(Ω), with ||(a, z)||_B := ||a||_{L∞(Ω×Ω)} + ||z||_{L∞(Ω)},

and, for each α > 0, define the solution operator

(8) R_α(a, z) := arg min_{u∈C} T_α(Au; z),

whenever the minimizer exists and is unique. Finally, we define a sequence of operator equations

(9) z_n = A_n u + γ, where (A_n u)(x) = ∫_Ω a_n(x, y) u(y) dy,

with a_n ∈ L∞(Ω × Ω) a nonnegative point spread function (PSF). Note, then, that A_n u ≥ 0 whenever u ∈ C. The functions z_n and a_n should be viewed as approximations of the true data z and PSF a, respectively. In astronomy and PET, for example, both the data and the PSF are estimated and hence contain errors. Thus our definition of regularization scheme, which we present now, accounts for errors in the measurements of both z and a.

Definition 2.1. We say that {R_α}_{α>0} defined by (8) is a regularization scheme for (1) if
1. for each α > 0, R_α is a well-defined and continuous operator on B; and
2. whenever ||(a_n, z_n) − (a, z)||_B → 0, there exists a sequence of regularization parameters α_n → 0 such that R_{α_n}(a_n, z_n) → u_exact in an appropriate norm as n → ∞.
3. R_α(a, z) defines a regularization scheme.

3.1. R_α(a, z) is a well-defined operator on B. We begin by defining two function spaces that will be of import in the discussion that follows. First, recall the inner product on the Hilbert space H¹(Ω) [5,12], given by

(10) ⟨u, v⟩_{H¹(Ω)} = ⟨u, v⟩_{L²(Ω)} + ⟨∇u, ∇v⟩_{L²(Ω)},

with induced norm ||·||_{H¹(Ω)} = √⟨·, ·⟩_{H¹(Ω)}. Next, the space of functions of bounded variation [6] is defined by

BV(Ω) = { u ∈ L¹(Ω) | J_0(u) < ∞ },

where J_0 is defined by (6). BV(Ω) is a Banach space with norm

||u||_{BV(Ω)} = ||u||_{L¹(Ω)} + J_0(u).

We now prove that solutions of (2) exist and are unique under certain reasonable assumptions. This will follow provided that T_α is strictly convex and coercive [5,12].
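The practical distinction between H¹(Ω) and BV(Ω) can be seen in a small discrete experiment: for a step function (a sharp edge), the total variation J_0 stays fixed at the jump height as the grid is refined, while the discrete H¹ seminorm blows up. This is a purely illustrative sketch; the grid and the step function are assumptions.

```python
import numpy as np

# Discrete comparison of the H^1 seminorm and the total variation J_0
# on a unit step function (an "edge") as the grid is refined.
def seminorms(n):
    x = np.linspace(0.0, 1.0, n)
    h = x[1] - x[0]
    u = (x > 0.5).astype(float)              # unit step at x = 0.5
    du = np.diff(u) / h                      # forward-difference gradient
    h1_semi = np.sqrt(h * np.sum(du ** 2))   # discrete ||u'||_{L^2(Omega)}
    tv = h * np.sum(np.abs(du))              # discrete J_0(u)
    return h1_semi, tv

for n in (100, 1000, 10000):
    print(n, seminorms(n))
# The total variation stays at ~1 (the jump height), while the H^1
# seminorm grows like 1/sqrt(h): edges live in BV(Omega) but not H^1(Omega).
```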
In order to see that T_0 is convex, we note that the gradient and Hessian of T_0 are given, respectively, by

(11) ∇T_0(Au; z) = A*( (Au + γ − z) / (Au + γ) ),

(12) ∇²T_0(Au; z) = A* diag( z / (Au + γ)² ) A,

where "*" denotes the operator adjoint and diag(w) denotes pointwise multiplication by w. Given our assumptions regarding γ, z, and A, we see immediately that the Hessian is positive semi-definite for all u ∈ C, and is positive definite provided the null-space of A is {0}. (Note that this significantly weakens the assumption of the invertibility of A made in [2,4].) Moreover, since z = Au_exact + γ, we see that ∇T_0(Au_exact; z) = 0, and hence u_exact is a minimizer of T_0.
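The discrete analogues of these formulas, grad T_0 = Aᵀ((Au + γ − z)/(Au + γ)) and Hess T_0 = Aᵀ diag(z/(Au + γ)²) A, can be checked by finite differences. The matrix, background, and data below are illustrative assumptions.

```python
import numpy as np

# Finite-difference check of the discrete gradient and Hessian of T_0,
# and of the facts that the gradient vanishes at u_exact and the Hessian
# is positive semi-definite. All concrete values are illustrative assumptions.
rng = np.random.default_rng(1)
n = 30
A = np.abs(rng.normal(size=(n, n))) / n  # nonnegative stand-in operator
gamma = 0.1 * np.ones(n)
u_exact = np.abs(rng.normal(size=n))
z = A @ u_exact + gamma                  # error-free data

def T0(u):
    w = A @ u + gamma
    return np.sum(w - z * np.log(w))

def grad_T0(u):
    w = A @ u + gamma
    return A.T @ ((w - z) / w)

def hess_T0(u):
    w = A @ u + gamma
    return A.T @ (np.diag(z / w ** 2) @ A)

u = np.abs(rng.normal(size=n))
g_fd = np.array([(T0(u + 1e-6 * e) - T0(u - 1e-6 * e)) / 2e-6
                 for e in np.eye(n)])
print(np.allclose(grad_T0(u), g_fd, atol=1e-5))          # True
print(np.allclose(grad_T0(u_exact), 0.0))                # True: u_exact is a critical point
print(np.all(np.linalg.eigvalsh(hess_T0(u)) >= -1e-12))  # True: positive semi-definite
```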
The strict convexity of T_α requires an even weaker assumption on A if J is given by (4), (5), or (6). In particular, J given by (4) is strictly convex, in which case T_α is strictly convex for any A satisfying the assumptions made following (1). On the other hand, when J is given by (5) or (6), it is enough to assume that the null-space of A does not contain the constant functions (since then the null-spaces of ∇²T_0 and ∇²J intersect only in {0}). Thus T_α is strictly convex if J is given by (4), or if J is given by (5) or (6) and the null-space of A does not contain the constant functions.
In addition to strict convexity, we need that T_α is coercive, which is defined as follows:

(13) T_α(Au; z) → ∞ whenever ||u|| → ∞.

Here ||·|| is a norm that depends upon the choice of J: for J given by (4), ||·|| := ||·||_{L²(Ω)}; for J given by (5), ||·|| := ||·||_{H¹(Ω)}; and for J given by (6), ||·|| := ||·||_{BV(Ω)}. Proofs of the coercivity of T_α for each case can be found in [2,3,4].
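Coercivity is easy to observe numerically in the Tikhonov case (4): along any ray t·u, the penalty term α·t²||u||²/2 eventually dominates, so T_α grows without bound. The operator, data, and the value of α below are illustrative assumptions.

```python
import numpy as np

# Illustration of coercivity with J(u) = 0.5 * ||u||^2: T_alpha(t*u) grows
# without bound as t -> infinity. All concrete values are illustrative.
rng = np.random.default_rng(4)
n = 50
A = np.abs(rng.normal(size=(n, n))) / n
gamma = 0.1 * np.ones(n)
z = A @ np.abs(rng.normal(size=n)) + gamma
alpha = 1e-3

def T_alpha(u):
    w = A @ u + gamma
    return np.sum(w - z * np.log(w)) + alpha * 0.5 * np.dot(u, u)

u = np.abs(rng.normal(size=n))
vals = [T_alpha(t * u) for t in (10, 100, 1000)]
print(vals == sorted(vals))  # True: T_alpha increases along the ray t*u
```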
It follows, then, that solutions of (2) exist and are unique, so that R_α(a, z) is a well-defined operator, when J is given by (4), or when J is given by (5) or (6) and the null-space of A does not contain the constant functions.
Another question of interest, which has not been addressed directly in our previous work, is: what is the range of R_α? The proofs of existence and uniqueness of minimizers of T_α found in [2,3,4] indicate that when J is given by (4), Range(R_α) ⊂ L²(Ω); when J is given by (5), Range(R_α) ⊂ H¹(Ω); and when J is given by (6), Range(R_α) ⊂ BV(Ω). Thus the choice of regularization functional has a significant effect on the properties of the regularized solution, as is readily verified by numerical experiment (see [2,3,4]).

3.2. R_α(a, z) is a continuous operator on B. For each choice of regularization functional J, the theoretical arguments in [2,3,4] give us that ||R_α(a_n, z_n) − R_α(a, z)||_{L^p(Ω)} → 0, for some p ≥ 1, provided ||A_n − A||_{L¹(Ω)}, ||z − z_n||_{L∞(Ω)} → 0. Noting that ||(a_n, z_n) − (a, z)||_B → 0 implies ||A_n − A||_{L¹(Ω)}, ||z_n − z||_{L∞(Ω)} → 0, the continuity of R_α on B follows.
3.3. R_α(a, z) is convergent. Finally, we must show that condition 2 of Definition 2.1 holds. First, we note that, by our discussion above, if ||(a_n, z_n) − (a, z)||_B → 0 then ||A_n − A||_{L¹(Ω)}, ||z_n − z||_{L∞(Ω)} → 0.
In [2,3,4], it is argued that if ||A_n − A||_{L¹(Ω)}, ||z − z_n||_{L∞(Ω)} → 0 and {α_n}_{n=1}^∞ is chosen so that α_n → 0 at a rate such that

(14) ( T_0(A_n u_exact; z_n) − T_0(A_n u_{0,n}; z_n) ) / α_n is bounded,

where u_{0,n} is a minimizer of T_0(A_n u; z_n) over C, it follows that

(15) ||R_{α_n}(a_n, z_n) − u_exact||_{L^p(Ω)} → 0,

for p = 2 when J is given by (4) or (5), and for 1 ≤ p < d/(d − 1) when J is given by (6). Then R_α satisfies condition 2 of Definition 2.1.

However, in [2,3,4], the existence of u_{0,n} is not argued. Rather than do this, we can replace (14) by

(16) ( T_0(A_n u_exact; z_n) − inf_{u∈C} T_0(A_n u; z_n) ) / α_n is bounded,

and the arguments of the proofs in [2,3,4] go through unchanged. We note that inf_{u∈C} T_0(A_n u; z_n) exists due to the fact that T_0(A_n u; z_n) is bounded below. This follows from Jensen's inequality (assuming w.l.o.g. |Ω| = 1) and the properties of the function x − c log x for c > 0: since x − z log x ≥ min{0, c − c log c} for all x > 0 and 0 ≤ z ≤ c, we have, with c := ||z_n||_{L∞(Ω)},

T_0(A_n u; z_n) ≥ min{ 0, c − c log c } > −∞.

For J given by (4), however, the arguments of [2] remain incomplete. In particular, from those arguments, we have only that R_{α_n}(a_n, z_n) converges to u_exact weakly in L²(Ω). But strong convergence can be proved provided (16) is replaced by

(17) ( T_0(A_n u_exact; z_n) − inf_{u∈C} T_0(A_n u; z_n) ) / α_n → 0.

This change does not affect the other arguments, since (17) implies (16). Now, weak convergence in L²(Ω), together with ||R_{α_n}(a_n, z_n)||_{L²(Ω)} → ||u_exact||_{L²(Ω)}, implies strong convergence in L²(Ω), so it remains to prove norm convergence. Since R_{α_n}(a_n, z_n) minimizes T_{α_n}(A_n · ; z_n) over C, we note that

(18) inf_{u∈C} T_0(A_n u; z_n) + α_n J(R_{α_n}(a_n, z_n)) ≤ T_{α_n}(A_n R_{α_n}(a_n, z_n); z_n) ≤ T_0(A_n u_exact; z_n) + α_n J(u_exact).

With J given by (4), rearranging (18) yields

||R_{α_n}(a_n, z_n)||²_{L²(Ω)} ≤ 2( T_0(A_n u_exact; z_n) − inf_{u∈C} T_0(A_n u; z_n) ) / α_n + ||u_exact||²_{L²(Ω)},

and hence, by (17), lim sup ||R_{α_n}(a_n, z_n)||²_{L²(Ω)} ≤ ||u_exact||²_{L²(Ω)}. On the other hand, by the weak lower semi-continuity of the norm, ||u_exact||²_{L²(Ω)} ≤ lim inf ||R_{α_n}(a_n, z_n)||²_{L²(Ω)}. Combining the two bounds gives ||R_{α_n}(a_n, z_n)||²_{L²(Ω)} → ||u_exact||²_{L²(Ω)}, and so R_{α_n}(a_n, z_n) converges to u_exact strongly in L²(Ω).
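The claim that T_0(A_n u; z_n) is bounded below can be spot-checked numerically. The sketch below uses the pointwise bound x − z log x ≥ min{0, c − c log c} for x > 0 and 0 ≤ z ≤ c := ||z_n||_∞, which follows from the properties of x − c log x; the discrete operator, data, and quadrature weights (chosen so that |Ω| = 1) are illustrative assumptions.

```python
import numpy as np

# Numerical check of the lower bound T_0(A_n u; z_n) >= min{0, c - c*log(c)},
# c = ||z_n||_inf, with quadrature weights summing to |Omega| = 1.
# All concrete values are illustrative assumptions.
rng = np.random.default_rng(2)
n = 200
h = 1.0 / n                                  # weights summing to |Omega| = 1
A = np.abs(rng.normal(size=(n, n))) * h      # nonnegative discretized operator
gamma = 0.05 * np.ones(n)
z = A @ np.abs(rng.normal(size=n)) + gamma   # "measured" data z_n

def T0(u):
    w = A @ u + gamma
    return h * np.sum(w - z * np.log(w))

c = np.max(z)
lower_bound = min(0.0, c - c * np.log(c))

# Evaluate T_0 over nonnegative inputs of widely varying magnitude.
vals = [T0(np.abs(rng.normal(size=n)) * s) for s in (0.0, 0.1, 1.0, 10.0, 100.0)]
print(min(vals) >= lower_bound - 1e-12)  # True: T_0 is bounded below
```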

3.4. Conclusions. The discussion above constitutes a proof of the following theorem.
Theorem 3.1. Let R_α : B → C be defined as in (8). Then {R_α}_{α>0} is a regularization scheme, as defined in Definition 2.1, when J is given by (4), and when J is given by (5) or (6) and the null-space of A does not contain the constant functions. Moreover, if J is given by (4) then Range(R_α) ⊂ L²(Ω); if J is given by (5) then Range(R_α) ⊂ H¹(Ω); and if J is given by (6) then Range(R_α) ⊂ BV(Ω).
We advocate the use of Definition 2.1 in proving analogous results for (8) with regularization functionals other than those presented in this paper, or with modifications of the likelihood function. We present such an example next.

4. Using the framework on a new example. In positron emission tomography (PET), the following model is sometimes used in place of (1) [8]:

z = z_0 e^{−Au} + γ,

where z_0 ∈ L∞(Ω) is known as the blank scan and is positive, and z, A, u, and γ are as above. The corresponding variational problem then takes the form

(19) min_{u∈C} T_0( z_0 e^{−Au}; z ) + αJ(u),

where T_0 is defined as in (3), and the regularization functional J is one of (4), (5), or (6). Since (19) results from a strictly convex (and hence continuous) transformation of T_0, condition 1 of Definition 2.1 immediately follows from the same result for (8).
Condition 2 of Definition 2.1 follows for (19) from the fact that ||e^{−Au} − e^{−Av}||_{L¹(Ω)} ≤ ||Au − Av||_{L¹(Ω)} for all u, v ∈ C (this can be easily verified by an appeal to the inequality |e^{−x} − e^{−y}| ≤ |x − y| for x, y ≥ 0), which supplies the estimate needed in the proofs of convergence in each of [2,3,4]. Thus Theorem 3.1 also holds for (19).
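The key inequality is easy to spot-check numerically: since Au, Av ≥ 0 for u, v ∈ C, the mean value theorem gives |e^{−x} − e^{−y}| ≤ |x − y| pointwise, and the L¹ bound follows by integration. The discrete operator below is an illustrative assumption.

```python
import numpy as np

# Numerical spot-check of ||e^{-Au} - e^{-Av}||_1 <= ||Au - Av||_1 for
# u, v >= 0, which follows from |e^{-x} - e^{-y}| <= |x - y| for x, y >= 0.
# The operator is an illustrative assumption.
rng = np.random.default_rng(3)
n = 100
h = 1.0 / n
A = np.abs(rng.normal(size=(n, n))) * h  # nonnegative discretized operator

for _ in range(1000):
    u = np.abs(rng.normal(size=n))
    v = np.abs(rng.normal(size=n))
    Au, Av = A @ u, A @ v
    lhs = h * np.sum(np.abs(np.exp(-Au) - np.exp(-Av)))  # discrete L^1 norms
    rhs = h * np.sum(np.abs(Au - Av))
    assert lhs <= rhs + 1e-12
print("inequality verified on 1000 random pairs")
```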

5. Conclusions. We have provided a definition of regularization scheme for (1) and have shown that the arguments of [2,3,4] (together with some corrections made here) imply that (8) defines a regularization scheme for J given by (4), (5), and (6). This result is given in detail in Theorem 3.1, and we note that we have significantly weakened the assumptions made on the operator A in [2,3,4]. We have also demonstrated the usefulness of the general framework implied by the definition and our previous arguments by using it to prove, in a straightforward manner, that the results of Theorem 3.1 also hold for (19), which arises in positron emission tomography.