SPECT with a multi-bang assumption on attenuation

We consider the identification problem which arises in single photon emission computed tomography (SPECT) of jointly reconstructing both the attenuation a and the source density f. Assuming that a takes only finitely many values and f ∈ C 1 c (R 2 ), we are able to characterise the singularities appearing in the attenuated Radon transform R a f, which models SPECT data. Using this characterisation we prove that both a and f can be determined in some circumstances from R a f. We also propose a numerical algorithm to jointly compute a and f from R a f based on a weakly convex regularizer when a only takes values from a known finite list, and show that this algorithm performs well on some synthetic examples.


Introduction
In this paper we consider the problem of single photon emission computed tomography (SPECT) in which we seek to recover both the attenuation a and the radiation source density f. We will only consider the problem in two dimensions, and use the photon transport equation for the forward model.

1 Supported by a President's Doctoral Scholarship award from the University of Manchester. This work was initiated with support from a NuSec pilot project grant.
* Author to whom any correspondence should be addressed.

Assume that a and f are compactly supported functions on R 2 and let Ω ⊂ R 2
be a bounded set containing the supports. Then the photon transport equation is

θ · ∇u(x, θ) + a(x) u(x, θ) = f(x), (x, θ) ∈ Ω × S 1 ,
u(x, θ) = 0 when x ∈ ∂Ω and θ · n(x) < 0, (1)

where u(x, θ) is the photon flux through the point x in a unit direction θ ∈ S 1 and where n(x) is the unit outward pointing normal to the boundary at x. Intuitively, the differential equation in (1) states that photons are created by a source with density f and then move along straight lines while being attenuated at a rate given by a. The boundary condition in (1) requires that no radiation enters the domain Ω. The photon transport equation can be solved (for example by the method of characteristics as in [29]) to find the solution u(x, θ), which satisfies

lim τ→∞ u(sθ ⊥ + τθ, θ) = ∫ R f(sθ ⊥ + tθ) e −Da(sθ ⊥ +tθ, θ) dt, (2)

where θ ⊥ ∈ S 1 denotes θ rotated in the anticlockwise direction by π/2 and Da is the beam transform of a defined as

Da(x, θ) = ∫ 0 ∞ a(x + tθ) dt. (3)

We define the attenuated Radon transform (AtRT) [29] of f with attenuation a to be the integral on the right-hand side of (2), and denote this R a f(s, θ). The basic question we investigate in this paper is whether it is possible to determine both a and f from R a f. When a is fixed, the mapping f → R a f is known to be analytically invertible under certain mild conditions, and when these hold a closed form solution for the inverse is known; see [9,23,32]. In fact, similar formulae have been established for attenuated tensor transforms [27], and furthermore the problem has been studied on surfaces both for functions and tensor fields [22,24]. The problem of recovering both a and f from the AtRT is sometimes known as the SPECT identification problem [28,35,36], and a known result, given in [19], shows non-uniqueness for radial a and f.
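The forward model (1)–(3) can be sketched numerically. The following is a minimal illustration and not the discretisation used later in the paper: a and f are assumed to be given as vectorised Python callables, the integrals are truncated to a finite interval and computed by a simple Riemann sum, and all function names are our own.

```python
import numpy as np

def beam_transform(a, x, theta, t_max=4.0, n=400):
    """Da(x, theta) = integral_0^infinity a(x + t*theta) dt, truncated at t_max."""
    t = np.linspace(0.0, t_max, n)
    dt = t[1] - t[0]
    pts = x[None, :] + t[:, None] * theta[None, :]
    return np.sum(a(pts)) * dt

def atrt(a, f, s, omega, t_max=4.0, n=400):
    """R_a f(s, theta): integral of f * exp(-Da) along the line s*theta_perp + t*theta,
    following the parametrisation in the text (theta is tangent to the line)."""
    theta = np.array([np.cos(omega), np.sin(omega)])
    theta_perp = np.array([-np.sin(omega), np.cos(omega)])
    t = np.linspace(-t_max, t_max, n)
    dt = t[1] - t[0]
    pts = s * theta_perp[None, :] + t[:, None] * theta[None, :]
    attn = np.array([beam_transform(a, p, theta, t_max, n) for p in pts])
    return np.sum(f(pts) * np.exp(-attn)) * dt
```

With a = 0 this reduces to the ordinary Radon transform; with a > 0 along the line the value strictly decreases, reflecting the attenuation factor e −Da in (2).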
That is, there are different pairs of a and f depending only on distance to the origin which give the same AtRT. Numerical evidence in [25] also suggests that reconstructions for a and f which are 'close' to being radial are unstable. Furthermore, investigations in [34] show evidence of non-uniqueness in other situations. One should also note that if f = 0, then it is trivially clear that a can be anything and still R a f = 0.
Despite these negative results, under some additional hypotheses determination of a and f from R a f can still be possible. In the medical imaging literature there has been considerable work on numerical methods for the SPECT identification problem (e.g. [18,37,38] and many others), although for practical applications of SPECT the standard method is to obtain the attenuation through a separate transmission CT scan. Linearisation has been applied to the numerical reconstruction of attenuation from SPECT data alone in [7], and range conditions have been exploited for the same purpose [6,8]. Some studies have also attempted to make use of scattered photons for attenuation identification in SPECT ([15,16]), although the forward model must be enriched to include the scattering, and so the mathematical problem is different. There are not many positive theoretical results concerning recovery of both a and f from emission data alone. In the case when Da in (2) is replaced by a constant μ times t, the transform is called the exponential Radon transform, and in [35] it is shown that μ can be determined from the exponential Radon transform exactly when f is not radial ( f is not assumed to be known for this result). A linearisation of the problem is studied from a microlocal point of view in [36], and used to establish some results for the nonlinear problem as well. A range characterisation of f → R a f for given a, originally found in [32] and further explored in [31], was used in [2] to analyse recovery of both a and f. Perhaps closest to the results of the present paper is the work in [11], which shows that unique recovery of a and f is possible when a is a multiple of the characteristic function of a star shaped region. In this work we assume that a takes on only finitely many values, and refer to such a as multi-bang. As detailed below, we are able to show unique recovery of such a from R a f in some cases.
Recently the authors of [13,14] introduced a convex multi-bang regularization technique intended to allow reconstruction of images which take only certain known values, and our line of research leading to the present paper was originally inspired by this technique. There are many applications where the multi-bang regularization technique might be useful, particularly in many forms of medical imaging, e.g. SPECT imaging [19] and x-ray imaging [39]. The convex multi-bang technique of [13,14] was applied numerically to the problem of recovering multi-bang a and f from R a f in [34] with mixed results, and in the present paper we modify the method to use a weakly convex (rather than convex) multi-bang regularization combined with total variation (TV) to promote the joint recovery of multi-bang a and f from the AtRT R a f. We implement this by alternating updates between a and f, using results from [1] to prove convergence. The a update is the most computationally intensive step due to the nonlinearity, and building on recent work [3,20] we apply a variant of the alternating direction method of multipliers (ADMM) with a non-convex multi-bang regularizer, which we show lends itself to promoting multi-bang solutions.
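As a concrete illustration of the regularization idea (this is not the exact functional of [13,14] nor the one used in section 3): one simple weakly convex multi-bang penalty is the pointwise squared distance to the nearest admissible value, which vanishes exactly on the admissible set. A sketch, with a proximal step that is exact whenever the minimiser stays in the cell of the nearest admissible value:

```python
import numpy as np

def multibang_penalty(v, admissible):
    """Pointwise squared distance to the nearest admissible value: vanishes exactly
    on the admissible set and is weakly convex (convex up to a quadratic)."""
    v = np.asarray(v, dtype=float)
    d = np.abs(v[..., None] - np.asarray(admissible))
    return np.min(d, axis=-1) ** 2

def multibang_prox(v, admissible, lam):
    """Approximate prox of lam * penalty: shrink each entry toward its nearest
    admissible value (exact when the minimiser stays in the same cell)."""
    v = np.asarray(v, dtype=float)
    adm = np.asarray(admissible, dtype=float)
    nearest = adm[np.argmin(np.abs(v[..., None] - adm), axis=-1)]
    return (v + 2.0 * lam * nearest) / (1.0 + 2.0 * lam)
```

Iterating such a proximal step inside a splitting scheme pushes values toward the admissible list, which is the multi-bang promotion effect referred to above.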
The novel contributions of this paper are summarised as follows. The precise and rigorous versions of the theoretical results, which include a few other technical assumptions, are presented later in the paper.
(a) Theorem 1. Assuming that a is multi-bang, f ∈ C 1 c (R 2 ) is non-negative, and with some additional assumptions about the regularity of the boundaries of the regions of constant a, we can characterise the singularities occurring in R a f as a result of the jumps in a.
(b) Theorem 2. If a and f are as in theorem 1 with the regions of constant a being constructed using a sequence of nested convex sets, and assuming supp( f ) satisfies certain hypotheses, then a and f can be uniquely determined from R a f.
(c) We propose a numerical algorithm for joint recovery of multi-bang a and f for limited projection data, and demonstrate its utility with some numerical examples.
As it is presented in this paper, the hypotheses of theorem 2 imply that supp(a) ⊂ supp( f ), which is not realistic in practical applications of SPECT. We would like to thank an anonymous referee for pointing out that in the case where a = c on some circle of radius r, methods from [35] give a counterexample to theorem 2 with radial f supported within that circle. In general, if supp( f ) ⊂ supp(a), then the outermost boundary on which a is discontinuous cannot be determined using the techniques presented in this paper.
Our proofs for theorems 1 and 2 are based on a careful analysis of the singularities which occur in R a f arising from the jumps of a, as well as a result that if a is known outside of a convex region, then R a f determines f uniquely also outside this convex region (see lemma 8). We prove this latter result by reducing it to the problem considered in [10, theorem 3.1], although other proofs using for example analytic microlocal analysis as in [17] should also be possible.
The rest of the paper is structured as follows. Section 2 gives the theoretical results of the paper introducing some necessary definitions in section 2.1, stating and proving theorem 1 as well as some related results in section 2.2, and stating and proving theorem 2 in section 2.3. Section 3 describes the numerical methods used and section 4 gives a number of numerical examples. The final section concludes the work in the paper and suggests avenues for further research.

Problem set-up and definitions
We first recall from the introduction that the AtRT R a f is defined by the formula

R a f(s, θ) = ∫ R f(sθ ⊥ + tθ) e −Da(sθ ⊥ +tθ, θ) dt. (5)

Throughout this section we will assume that f ∈ C 1 c (R 2 ) and that a is multi-bang. Note that (s, θ) corresponds to the line with tangent direction θ and signed distance s from the origin (this is slightly different from the standard parametrisation of lines in the Radon transform, in which θ is normal to the line), and R a f is an integral along that line. We will always parametrise θ ∈ S 1 by the angle ω ∈ R such that θ = (cos(ω), sin(ω)) and θ ⊥ = (−sin(ω), cos(ω)).
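This parametrisation of lines can be made concrete in a few lines (a small helper of our own, not from the paper):

```python
import numpy as np

def line_point(s, omega, t):
    """Point at parameter t on the line labelled (s, omega): theta is the line's
    tangent direction and s its signed offset along theta_perp, as in the text."""
    theta = np.array([np.cos(omega), np.sin(omega)])
    theta_perp = np.array([-np.sin(omega), np.cos(omega)])
    return s * theta_perp + t * theta
```

Projecting any point of the line onto theta_perp recovers the signed offset s, and |s| is the distance of the line from the origin.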
We now introduce the precise definition of multi-bang.
Definition 1 (Multi-bang). Let A ⊂ R be a known finite set of admissible values. We say that a is multi-bang if it can be written in the form

a = Σ n j=1 a j χ Ω j , a j ∈ A, (4)

where the Ω j are disjoint, bounded, open sets. Here χ Ω j is the characteristic function of the set Ω j , and we assume that for all Ω j the interior of the closure of Ω j is equal to Ω j . We also assume that any line only intersects the boundaries ∪ n j=1 ∂Ω j finitely many times. In using definition 1 we will assume that there is actually a jump in a between any two adjacent regions Ω j in (4) sharing a relatively open section of boundary. This can be done without loss of generality since we can always combine regions with the same value of a j that share a section of boundary.
We are deliberately making a distinction between multi-bang functions and piecewise constant functions in the fact that the range of multi-bang functions is known and finite. It is worthwhile to point out that for the theoretical results in section 2 we do not require knowledge of the admissible set A, and hence they could also be applied to piecewise constant functions with bounded support and hypotheses on the regularity of the boundaries of the constant regions. However, we do assume knowledge of the admissible set for the numerical algorithm.
As we will see below in section 2.3, we will be able to determine certain points on the boundaries of the sets Ω j in definition 1 from R a f. The set of points which we can determine will be denoted P a , which we now define.

Definition 2 (P a ).
Suppose that a is multi-bang with sets {Ω j } n j=1 as in definition 1. Then P a is the set of points x ∈ ∂Ω j for some j such that either (a) ∂Ω j has non-zero curvature at x, or (b) ∂Ω j has a corner at x.
We further write P a,1 for the subset of P a where the boundary has non-zero curvature as in (a) and P a,2 for the subset of P a of corners as in (b).
As we might expect from microlocal analysis (e.g. see [36]) the jumps in a along the boundaries of Ω j lead to singularities in R a f at points (s, θ) corresponding to lines which are either tangent to the boundaries ∂Ω j , or pass through corners of ∂Ω j . We introduce the following definition for these lines.
Definition 3 (K a ). Suppose that a is multi-bang with sets {Ω j } n j=1 as in definition 1 and P a is as in definition 2. We define K 0 a to be the subset of {(s, θ) : s ∈ R, θ ∈ S 1 } such that the line corresponding to (s, θ) is either tangent to ∂Ω j or passes through a corner of ∂Ω j for some j. We further define K a ⊂ K 0 a to be the same set with the added requirement that if the line is tangent, then the tangency must be at a point where ∂Ω j has non-zero curvature. Finally, we define two subsets of K a : the set K a,1 of lines tangent at a point where the curvature is non-zero, and the set K a,2 of lines passing through corners. (Note that K a,1 and K a,2 may not be disjoint.)
Note that in this definition we consider lines passing through a corner of ∂Ω j that are also tangent to one of the branches of the corner to be tangent to ∂Ω j (so there would always be at least two lines passing through a corner that are also tangent).
For theorem 2 we will also require an additional definition which describes precisely what we meant when we said the regions of constant a are constructed using a sequence of nested convex sets.
Definition 4 (Nicely multi-bang). We say that a is nicely multi-bang if a is multi-bang and can furthermore be written in the form

a = Σ n j=1 a j χ C j ,

where the sets C j are all convex, bounded and open with smooth boundary possibly having corners, and nested in the sense that C n ⋐ C n−1 ⋐ · · · ⋐ C 1 .
Here C j ⋐ C j−1 means that C j is contained in a compact set that is contained in C j−1 .
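A concrete instance of definition 4, assuming for illustration that the sets C j are concentric open disks (a toy example of our own): the value of a at a point x is then the accumulated sum of the coefficients a j over all disks containing x.

```python
import numpy as np

def nested_disk_multibang(x, radii, coeffs):
    """a(x) = sum_j coeffs[j] * chi_{C_j}(x) with C_j the open disk of radius
    radii[j]; strictly decreasing radii give the nesting C_n << ... << C_1.
    (Illustrative assumption: the C_j are disks.)"""
    r = float(np.linalg.norm(x))
    return sum(c for rad, c in zip(radii, coeffs) if r < rad)
```

Moving outward across each ∂C j the value drops by coeffs[j], so every boundary carries a genuine jump whenever the coefficient is non-zero.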
In the next section we will state and prove theorem 1 as well as some other related results.

Theorem 1 and related results
We start the section with the statement of theorem 1.

Theorem 1.
Suppose that f ∈ C 1 c (R 2 ) is non-negative and a is multi-bang with sets Ω j as given in definition 1. The theorem has two parts corresponding to P a,1 and P a,2 .
(a) Suppose x ∈ P a,1 and the line tangent to a boundary ∂Ω j at x is not tangent to a boundary anywhere else. If this tangent line is given by (s * , θ * ) with θ * = (cos(ω * ), sin(ω * )) and the ray {x + tθ * | t < 0} intersects the set { f > 0}, then ∂ s R a f(s, θ * ) has a singularity of order 1/2 at s = s * , and x is the unique point on the line such that

lim ω→ω * |sin(ω − ω * )| 1/2 ∂ ω R a f(x · θ ⊥ , θ) = 0

(recall that θ = (cos(ω), sin(ω))).
(b) Suppose that x ∈ P a,2 lies on a corner of a boundary for a component of Ω j and is also a corner for only one other component of some Ω l . If (s * , θ * ) ∈ K a,2 corresponds to any line passing through x that passes through no other corners, is not tangent to any of the boundaries, and the ray {x + tθ * | t < 0} passes through the set { f > 0}, then ∂ s R a f(s, θ * ) has a jump across s = s * .

Remark 1.
We comment that while we have chosen theorem 1 to summarise the results in this section in a concise manner, in fact the lemmas we prove taken together can provide stronger statements and more information than given in theorem 1 concerning the singularities of R a f when a is multi-bang. In particular lemmas 1 and 5, and corollary 1 are not reflected in the statement of theorem 1.
We now begin establishing some lemmas and a corollary which will lead to the proof of theorem 1. We first look at the regularity of R a f in the complement of K 0 a .

Lemma 1. Suppose that f ∈ C 1 c (R 2 ), a is multi-bang, and (s * , θ * ) ∈ (K 0 a ) c . Then the mapping s → ∂ s R a f(s, θ * ) is continuous at s * .
Proof. Since (s * , θ * ) ∈ (K 0 a ) c the line corresponding to (s * , θ * ) is neither tangent to one of the boundaries where a jumps, nor passes through a corner of one of these boundaries. W.L.O.G. we can rotate the axes so that the line corresponding to (s * , θ * ) lies on the x-axis and θ * = (1, 0); in particular s * = 0. Then we have

R a f(s, θ * ) = ∫ R f(x, s) e −Da((x,s),θ * ) dx. (6)

For x ∈ R, recall that

Da((x, s), θ * ) = ∫ x ∞ a(t, s) dt.

By assumption, the line given by (s, θ * ) only crosses the jumps of a finitely many times, and we label the ordered values of t for which these crossings occur (when the line is parametrised as t → (t, s)) as {t i (s)} N i=1 . Since (s * , θ * ) ∈ (K 0 a ) c these functions t i are differentiable in a neighbourhood of s = 0. Next we introduce the functions

φ i (x, s) = max(x, t i (s)), i = 1, . . . , N. (7)

Note that for all i, φ i is continuous with bounded first derivative in a neighbourhood of s = 0, which is also continuous except when x = t i (s). Using these functions we have

Da((x, s), θ * ) = Σ N−1 i=1 c i (φ i+1 (x, s) − φ i (x, s)), (8)

where for each i, c i is the value of a between the crossings t i (s) and t i+1 (s), which is one of the admissible values or possibly zero (in particular the values c 0 and c N taken before the first and after the last crossing are always zero, which is why these terms do not appear in (8)). We thus see that Da((x, s), θ * ) also has bounded derivatives in a neighbourhood of s = 0 that are continuous except when x = t i (s) for some i, and differentiating this with respect to s gives

∂ s Da((x, s), θ * ) = Σ N−1 i=1 c i (∂ s φ i+1 (x, s) − ∂ s φ i (x, s)), ∂ s φ i (x, s) = t i ′(s) χ {x<t i (s)} .

Since f ∈ C 1 (R 2 ) as well, we can therefore differentiate under the integral sign in (6) to get

∂ s R a f(s, θ * ) = ∫ R (∂ s f(x, s) − f(x, s) ∂ s Da((x, s), θ * )) e −Da((x,s),θ * ) dx. (9)

Since f ∈ C 1 c (R 2 ) and the derivatives t i ′ are all continuous, we see that in fact ∂ s R a f(s, θ * ) is also continuous with respect to s at s = 0, which completes the proof.
For a line (s * , θ) we have two cases: either it passes through a corner or it does not. If it does not pass through a corner, then there exists a δ > 0 such that for all s satisfying |s − s * | < δ the line (s, θ) also does not pass through any corners. Since the line does not cross any of the boundaries tangentially, none of the terms φ i+1 − φ i tend to 0 as s → s * , and therefore all of the terms in the derivative are continuous as s → s * . Hence ∂ s Da is continuous at s * . If the line (s * , θ) passes through a corner, then by the previous argument only the behaviour of ∂ s R a f(s, θ) at corners is yet to be determined. At a corner we have a jump in the derivative at the boundary, corresponding to the angle at the corner; for each corner this jump is bounded at s * , and since we have only finitely many boundaries the derivative is bounded at s = s * , as required.
We now begin to consider what can happen to R a f at lines in K a . First we consider K a,2 .
Lemma 2. Suppose that f ∈ C 1 c (R 2 ), a is multi-bang, and (s * , θ * ) ∈ K a,2 corresponds to a line which passes through exactly one corner and is not tangent to any boundary ∂Ω j . Then s → ∂ s R a f(s, θ * ) is bounded near s * .
Suppose additionally that the corner point occurs at s * (θ * ) ⊥ + t * θ * , is a corner for N different components of the regions Ω j , and the boundaries of these regions make angles {α k } N k=1 with (θ * ) ⊥ where the orientation is chosen so that θ * is at a positive angle. Also suppose that the jump in a across the boundary with angle α k in the direction of increasing angle is b k (see figure 1 and caption). Then there is a jump in ∂ s R a f(s, θ * ) across s = s * given by

lim s→(s * ) + ∂ s R a f(s, θ * ) − lim s→(s * ) − ∂ s R a f(s, θ * ) = −(Σ N k=1 b k tan(α k )) ∫ t * −∞ f(s * (θ * ) ⊥ + tθ * ) e −Da(s * (θ * ) ⊥ +tθ * , θ * ) dt. (10)

Proof. We use the same set-up and notation as in the proof of lemma 1, although by translating if necessary we also assume W.L.O.G. that the corner occurs at the origin. Unlike in lemma 1 there may be a different number of boundary crossings when s > 0 and s < 0, and so we introduce corresponding functions giving the crossing points, where the t + i are defined for s > 0 and the t − i are defined for s < 0. As in lemma 1 these t ± i will all have bounded derivatives up to s = 0 (this is because the line given by (s * , θ * ) is not tangent to any boundary). We also introduce the corresponding φ ± i defined as in (7) but only for s > 0 and s < 0 respectively. The formula (8) still holds with ± added in appropriate places, and we can still see that

∂ s Da ± ((x, s), θ * ) = Σ i c ± i (∂ s φ ± i+1 (x, s) − ∂ s φ ± i (x, s)). (11)

Since these derivatives are all bounded (but not necessarily continuous at s = 0) we still have (9) for s ≠ 0, and we see that ∂ s R a f(s, θ * ) is bounded, thus proving the first statement of the lemma. It remains to analyse the jump at s = 0. Let us first consider the jump in ∂ s Da((x, s), θ * ) across s = 0. The only terms contributing to this jump in (11) will be those with φ ± i where t ± i (s) → 0 as s → 0 ± , since the others correspond to boundaries which do not have corners along (s * , θ * ) and the line is not tangent to any of the boundaries. Let us reindex those i corresponding to such t ± i using a new index k = 1, . . . , N.
Then for x < 0 the jump in ∂ s Da((x, s), θ * ) across s = 0 is given by Σ N k=1 b k tan(α k ), while for x > 0 there is no jump. Using (9) we find that the jump of ∂ s R a f(s, θ * ) across s = 0 will be

−(Σ N k=1 b k tan(α k )) ∫ 0 −∞ f(x, 0) e −Da((x,0),θ * ) dx.

Taking into account the rotation and translation used at the beginning we see that this corresponds with (10) and so completes the proof.
Next we consider K a,1 which requires a bit more work.
Lemma 3. Suppose that f ∈ C 1 c (R 2 ), a is multi-bang, and that (s * , θ * ) ∈ K a,1 is such that the line corresponding to (s * , θ * ) is only tangent to a boundary ∂Ω j at one point given by s * (θ * ) ⊥ + t * θ * which is not also a corner. Furthermore, suppose that a = c on the convex side of the point of tangency, a = c 0 on the concave side, the curvature of ∂Ω j at the point of tangency is κ > 0, and θ * ⊥ ∈ S 1 is orthogonal to θ * and pointing into the convex side. Then

lim s→(s * ) ± |s − s * | 1/2 ∂ s R a f(s, θ * ) = (c 0 − c) √(2/κ) ∫ t * −∞ f(s * (θ * ) ⊥ + tθ * ) e −Da(s * (θ * ) ⊥ +tθ * , θ * ) dt, (12)

where ± is the sign of (θ * ) ⊥ · θ * ⊥ .

Proof. After possibly rotating as in the proof of lemma 1 and reflecting across the x-axis we can assume that s * = 0, θ * = (1, 0) and θ * ⊥ = (0, 1). We further assume W.L.O.G. by translating if necessary that the single point of tangency is the origin. This means that locally near the origin the boundary ∂Ω j will be given as a graph in the form

y = x 2 g(x), (13)

where g is a strictly positive function and g(0) = κ/2. For sufficiently small s > 0 we then follow the same reasoning as in the proof of lemma 1 and so obtain (for s > 0) the equation (8). The difference from lemma 1 is that in the current case the derivatives of the t i 's corresponding to the point of tangency will blow up as s → 0 + . There will be two such t i 's which both go to 0 as s → 0 + , one positive and one negative, which we label respectively as t ± , and differentiating (13) we can show that

lim s→0 + s 1/2 t ± ′(s) = ±(2κ) −1/2 . (14)

Now we have (9), which holds for s > 0 sufficiently small, and all the terms in this will be bounded as s → 0 + except those that involve derivatives of t ± . Thus when we multiply by s 1/2 and take the limit s → 0 + the only terms that can possibly remain are those involving t ± ′(s). The term coming from the integral over x between t − (s) and t + (s) vanishes in the limit by (14), since the interval of integration shrinks to a point and the integrand is continuous, while using (14) another time we can evaluate the remaining term, over x < t − (s), to get

lim s→0 + s 1/2 ∂ s R a f(s, θ * ) = (c 0 − c) √(2/κ) ∫ 0 −∞ f(x, 0) e −Da((x,0),θ * ) dx.

Taking into account the rotation, translation and reflection at the beginning of the proof, this corresponds with (12) and so completes the proof.
Remark 2. Note that lemma 3 precisely characterises the leading order singularity of the derivative ∂ s R a f at (s * , θ * ). It is possible to obtain a similar formula and characterisation for some smooth parts of the boundary where the curvature is zero but some higher order derivative does not vanish at x = 0. In this case we would use y = x 2n g(x) in place of (13), where n ≥ 2 and g(0) ≠ 0. The order of the singularity in ∂ s R a f as s → (s * ) ± is then 1 − 1/(2n) rather than 1/2.

Remark 3.
It is possible to combine the methods of proof of the previous lemmas to characterise the singularities at (s * , θ * ) corresponding to lines both tangent to the boundaries at multiple places and/or passing through multiple corners. Contributions from each point of tangency or corner along a line are added together; these can sum to 0 and so may not show a jump in the derivative.
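The order-1/2 blow-up of lemma 3 can be seen in the simplest unattenuated analogue: the Radon transform of the characteristic function of a disk of radius r is the chord length 2√(r 2 − s 2 ), whose s-derivative blows up like |s − s * | −1/2 at the tangent lines s = ±r. A minimal numerical check (a toy example of our own, with the plain Radon transform in place of R a f):

```python
import numpy as np

def disk_radon(s, r=1.0):
    """Radon transform of chi_{disk of radius r}: chord length 2*sqrt(r^2 - s^2)."""
    s = np.asarray(s, dtype=float)
    return np.where(np.abs(s) < r, 2.0 * np.sqrt(np.maximum(r * r - s * s, 0.0)), 0.0)

def dds(s, h=1e-6):
    """Central finite-difference derivative of disk_radon in s."""
    return (disk_radon(s + h) - disk_radon(s - h)) / (2.0 * h)
```

Sampling the derivative at distances h and h/4 from the tangent line s = 1 and taking the ratio gives approximately 2, the signature of an inverse-square-root singularity.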
At this point we note that these first three lemmas already show how we can determine some information about multi-bang a from R a f. First of all, lemma 1 shows that if ∂ s R a f(s, θ) is not continuous in s at a point (s * , θ * ), then the corresponding line must either be tangent to, or pass through a corner of, the boundary of one of the regions Ω j . If ∂ s R a f(s, θ) is bounded near (s * , θ * ) but has a jump in s at this point, then the line must pass through a corner by lemma 2. If ∂ s R a f(s, θ * ) blows up as s → s * , then the line (s * , θ * ) must be tangent to one of the boundaries by lemma 3. This already gives most of the information required to prove theorem 1 except for the part about the derivative with respect to ω. For this we include one additional lemma studying the derivative with respect to variation in the angle ω rather than s as in the previous lemmas.
Lemma 4. Suppose that f ∈ C 1 c (R 2 ), a is multi-bang, and (s * , θ * ) ∈ K a,1 satisfies the hypotheses of lemma 3, with point of tangency s * (θ * ) ⊥ + t * θ * and curvature κ > 0 there. For a point x * on the line corresponding to (s * , θ * ) write ℓ = x * · θ * − t * . Then

lim ω→(ω * ) −sgn(ℓ) |sin(ω − ω * )| 1/2 ∂ ω R a f(x * · θ ⊥ , θ) = ±(c 0 − c) √(2|ℓ|/κ) ∫ t * −∞ f(s * (θ * ) ⊥ + tθ * ) e −Da(s * (θ * ) ⊥ +tθ * , θ * ) dt, (15)

where ± is as in lemma 3; when ℓ = 0 the limit may be taken from either side and equals zero. In particular the limit is zero if and only if x * is the point of tangency.

Proof. As in the previous lemmas, by rotating, translating and possibly reflecting about the x-axis we assume W.L.O.G. that θ * = (1, 0), θ * ⊥ = (0, 1) and the point of tangency is at the origin (i.e. t * = 0). We also assume that x * = (ℓ, 0) and for the moment consider only the case ℓ ≠ 0. Note that the line corresponding to (x * · θ ⊥ , θ) is precisely the line through x * tangent to θ, and we will change the parametrisation of this line in the integral definition of the AtRT so that t = 0 always corresponds with x * . After doing all of this we have

R a f(x * · θ ⊥ , θ) = ∫ R f(x * + tθ) e −Da(x * +tθ, θ) dt. (16)

Now we use the same notation as in the previous lemmas and label the ordered values of t along the line t → x * + tθ, for −sgn(ℓ)ω > 0 and |ω| sufficiently small, at which the line intersects one of the boundaries ∂Ω j as {t i (ω)} N i=1 . Two of the t i will correspond to the point of tangency, and these will satisfy t i (ω) → −ℓ as ω → 0 −sgn(ℓ) ; we label them t ± . Combining (13) with the geometric relations between the two parametrisations we can show, by taking derivatives with respect to ω and some computation, that

lim ω→0 −sgn(ℓ) |sin(ω)| 1/2 t ± ′(ω) = ±√(|ℓ|/(2κ)). (18)

In the case that ℓ = 0 we will still have t ± (ω), for ω ≠ 0 sufficiently small, corresponding to the two intersections near the tangent point, but t −sgn(ω) (ω) = 0 and ω t sgn(ω) (ω) > 0 for all ω ≠ 0. In this case we can show in a similar manner to the ℓ ≠ 0 case that

lim ω→0 |sin(ω)| 1/2 t ± ′(ω) = 0. (19)

We next define functions φ i in a similar way to before (compare with (7)), now as functions of (t, ω) for −sgn(ℓ)ω > 0, and as before obtain the analogue of (8), also for −sgn(ℓ)ω > 0. In the case ℓ = 0 we still have versions of the previous formula for ω ≠ 0, but it will change depending on the sign of ω. We will also write φ ± for those φ i corresponding to t ± . Now let us take the derivative of (16), in the case when −sgn(ℓ)ω > 0 if ℓ ≠ 0, or ω ≠ 0 if ℓ = 0; this gives the analogue of (9), which we label (21). First consider the case ℓ = 0. In this case when we multiply by |sin(ω)| 1/2 and take the limit as ω → 0, using (19) we see that the limit is zero. Since ℓ = x * · θ * − t * , this proves the result when ℓ = 0. Now consider when ℓ ≠ 0.
In this case we multiply by |sin(ω)| 1/2 and take the limit as ω → 0 −sgn(ℓ) . The only terms that are not bounded in (21) for ω close to zero are those that involve derivatives of φ ± . Applying (18) to these terms we finally obtain the limit

±(c 0 − c) √(2|ℓ|/κ) ∫ −ℓ −∞ f(x * + tθ * ) e −Da(x * +tθ * , θ * ) dt.

Taking into account the translations, rotation and reflection from the beginning of the proof this formula agrees with (15), and so completes the proof.
Before giving the proof of theorem 1 we record a corollary of the proof of lemma 4 which looks at one case in which, at the point of tangency of a line to the boundary of an Ω j , the curvature of the boundary is zero. This corollary will be useful for the proof of theorem 2 later.

Corollary 1. Assume the same hypotheses as in lemma 4, including the assumption that there is a convex and a concave side of the boundary near the point of tangency, but say the curvature is κ = 0 at the point of tangency. If x * · θ * − t * ≠ 0 and the ray {x * + tθ * : t < t * } intersects the set { f > 0}, then the limits in (15) and (12) are one of ±∞.
Proof. The proof follows the same outline as the proofs of lemmas 3 and 4, but in (13) we have g(0) = 0. Because of this, (14) and (18) respectively change to

lim s→0 + s 1/2 t ± ′(s) = ±∞ and lim ω→0 −sgn(ℓ) |sin(ω)| 1/2 t ± ′(ω) = ±∞.

Following the proofs through the rest of the way with this change, and using the fact that the integrals appearing at the end do not vanish, proves the corollary.
The proof of theorem 1 now follows simply from lemmas 2-4 as we now point out.
Proof of theorem 1. For the first item in theorem 1 we note that if the hypotheses are satisfied, then by lemma 3 equation (12) holds, and since the ray {x + tθ * | t < 0} intersects { f > 0}, the integral on the right side of (12) is not zero. Therefore ∂ s R a f(s, θ * ) has a singularity of order 1/2 at s = s * . Similarly, lemma 4 implies that (15) will hold, and this limit will only be zero when x * = x. This proves the first part. The second part follows similarly from lemma 2, although we note that under the given hypotheses N = 2 in (10) and b 1 = −b 2 ≠ 0. Thus we have a jump in ∂ s R a f if tan(α 1 ) ≠ tan(α 2 ). This will always be true at a corner, since there would be equality only if α 1 = α 2 + nπ for an integer n, but that would mean we are not at a corner.
To finish this section we prove one more lemma concerning what can happen if there is a flat section of a boundary of one of the Ω j . This will be used in the proof of theorem 2.

Lemma 5.
Assume that a is multi-bang and f ∈ C 1 c (R 2 ) is non-negative. Suppose that the line given by (s * , θ * ) intersects the boundary of one of the Ω j in the line segment {s * (θ * ) ⊥ + tθ * | t ∈ [t − , t + ]}, that there are no corners of any of the other regions contained in the interior of this line segment, and that the line given by (s * , θ * ) does not intersect any other boundary in a line segment; to simplify the notation write x * = s * (θ * ) ⊥ . Assume also that the ray {x * + tθ * | t < t + } intersects the set { f > 0}. Then R a f(s, θ * ) is discontinuous at s = s * .
Furthermore suppose a = c + on the side of the line segment t ∈ [t − , t + ] corresponding to s > s * and a = c − on the opposite side. Then the jump in R a f(s, θ * ) across s = s * is given by

lim s→(s * ) + R a f(s, θ * ) − lim s→(s * ) − R a f(s, θ * ) = ∫ t + −∞ f(x * + tθ * ) (e −c + ℓ(t) − e −c − ℓ(t) ) e −D 0 (t) dt, (22)

where ℓ(t) = t + − max(t, t − ) and D 0 (t) denotes the part of the beam transform along the line which is common to both one-sided limits.

Proof. We follow the same method as the proof of lemma 2, using the notation t ± i and φ ± i as before. The difference here is that some of these t ± i (s) will converge to the endpoints t − and t + of the line segment as s → 0 ± . These will lead to a jump in Da((x, s), θ * ), given by (8), at s = 0 when x < t + . Taking limits then gives

lim s→0 + R a f(s, θ * ) = ∫ t + −∞ f(x * + tθ * ) e −c + ℓ(t) e −D 0 (t) dt + ∫ ∞ t + f(x * + tθ * ) e −Da(x * +tθ * , θ * ) dt. (24)

In a similar manner we can determine the limit as s → 0 − ; the only difference is that c − replaces c + in (24). Since c + ≠ c − and by assumption f is not identically zero on the ray {x * + tθ * | t < t + }, the two one-sided limits differ. Combining (24), the corresponding formula when s → 0 − , and accounting for the transformations made at the start of the proof, we obtain the formula (22).
We next proceed to the statement and proof of theorem 2 as well as some related results.

Theorem 2 and related results
In this section we state and prove theorem 2.

Theorem 2.
Suppose that a is nicely multi-bang (see definition 4) and f ∈ C 1 c (R 2 ) is non-negative. Also assume that every line which is tangent to one of the boundaries ∂C j at a point x contains a ray ending at x which intersects the set { f > 0}. Then a and f are uniquely determined by R a f.

We prove theorem 2 throughout this section in a series of lemmas. The initial step in the proof of theorem 2 is to show that under the given hypotheses we can determine the set of points where a jumps. We will do this now.

Lemma 6. Assume the same hypotheses as theorem 2. Then we can determine from R a f the sets C j appearing in definition 4 for a.
Proof. We first note that by lemmas 1 and 2 and corollary 1, the set of (s * , θ * ) such that ∂ s R a f(s, θ * ) is not bounded for s near s * gives the set of lines which are tangent to some boundary ∂C j , possibly missing some of the lines which intersect a boundary in a line segment. We can get rid of all of the (s * , θ * ) corresponding to lines which intersect a boundary ∂C j in a line segment by looking at the continuity of R a f(s, θ * ) near s * and using lemma 5. Thus we can determine the set of (s * , θ * ) such that the corresponding lines are tangent to a boundary ∂C j at some point, and since the C j are nested convex sets the point of tangency along each such line must be unique. We can determine the point of tangency along each line that is tangent at a point where the curvature of ∂C j is not zero using theorem 1, or we can determine that the curvature at the point of tangency is zero using corollary 1. Thus we can identify all points in the boundaries of the C j at which the curvature of ∂C j is not zero. Next we will show that we can also find the corners of the boundaries ∂C j .
By lemma 2 and the hypotheses, for every corner point x of some ∂C j there will be infinitely many lines passing through x such that, for the (s * , θ * ) corresponding to these lines, ∂ s R a f(s, θ * ) is bounded but has a jump at s = s * . This allows us to determine the corner points, and combining this with the previous paragraph we see that we can determine P a from R a f under the given hypotheses. We next show that this is sufficient to determine all of the C j . By lemma 7, which we will prove next, the closure of the convex hull of P a is equal to the closure of C 1 . Therefore we can determine C 1 . The rest of the sets C j can now be determined inductively. Indeed, suppose that we know C l for all l < j. Then by lemma 7 again, the closure of the convex hull of those points of P a not lying on any ∂C l with l < j is equal to the closure of C j , and so we can determine C j . This completes the proof.
The following geometric lemma was needed in the proof of lemma 6. Lemma 7. Suppose that C ⊂ R^2 is closed, convex and bounded, with boundary that is smooth except possibly at corners. Also let P be the subset of ∂C consisting of points which are either corners of ∂C or points at which ∂C has nonzero curvature. Then C coincides with the closure of conhull(P), where conhull(P) is the convex hull of P.
Proof. Since C is closed and convex and P ⊂ C, we have conhull(P) ⊂ C. Thus it only remains to show the opposite inclusion. Suppose that x ∈ ∂C\conhull(P). Then there must be a neighbourhood U of x such that U ∩ ∂C does not intersect P. Therefore the curvature of ∂C is zero at all points of U ∩ ∂C and, since x is also not a corner of ∂C, U ∩ ∂C must contain a line segment with x in its relative interior. There must then be some maximally extended line segment containing x which is contained in ∂C. At least one of the end points of this maximal line segment must also be in ∂C\conhull(P), since otherwise we would have x ∈ conhull(P) by convexity. However this is a contradiction, since by the argument we have already given this endpoint would lie in the relative interior of a line segment contained in ∂C, contradicting maximality. Thus ∂C\conhull(P) = ∅, which then implies the result.
Having proven in lemma 6 that we can determine the sets C_j from R_a f under the hypotheses of theorem 2, it remains to show that we can recover f and the jumps in a across each of the boundaries. For this we argue by induction, starting at the outermost region C_1 and continuing inward. First suppose that a and f are known everywhere outside of C_{j−1} for some j ≥ 2. Then, since there must be at least one point x on the boundary ∂C_{j−1} in P_a, we can use either (10) or (12) to determine the jump in a across the boundary at x. Therefore we can determine a outside of C_j. To complete the induction step it then remains to show that we can determine f outside of C_j. For this we use the following lemma.
Lemma 8. Suppose that f ∈ C_c^1(R^2) and a is nicely multi-bang with sets {C_j}_{j=1}^n, and let C_0 be an open ball centred at the origin that is sufficiently large so that C_1 ⊂ C_0 and supp(f) ⊂ C_0. Then for j ≥ 1, f|_{C_{j−1}\C_j} is uniquely determined if we know all of (a) R_a f, (b) a|_{R^2\C_j}, and (c) f|_{R^2\C_{j−1}}. Proof. For this proof we will write (x, y) as Cartesian coordinates for points in R^2. By translating and rotating as necessary we assume without loss of generality that C_j is contained in the upper half-plane {y > 0}, and show that we can then uniquely determine f restricted to the lower half-plane {y < 0}. By translating to bring C_j arbitrarily close to {y = 0} and rotating, this then shows we can determine f everywhere outside of C_j and so will complete the proof.
Having done the transformations described in the previous paragraph, we also assume that C_{j−1} ⊂ {y > −h}. Now choose ω > ε > 0 such that the parabola {y = εx^2} lies entirely outside of C_j and the parabola {y = ωx^2 − h} lies entirely outside of C_{j−1}. It is possible to find such ω and ε since the C_j are all bounded. Now choose φ ∈ C^∞(R^2) such that φ(x, y) = 1 on C_{j−1} and φ(x, y) = 0 on the set {y < ωx^2 − h}. Then define f̃ = φf, so that f̃ has support contained in the set {y ≥ ωx^2 − h}, and define ã by ã = a on C_j, ã = c on the set {y ≥ εx^2 − h − 1} outside of C_j, and ã(x, y) = 0 otherwise.
The setup described in the last few lines is illustrated in figure 2; there we assume a = c in the region C_{j−1}\C_j, ã = c in the region above the lowest parabola translated downwards by 1 and outside of C_j, f is assumed to be known outside of C_{j−1}, and f̃ is then supported in the region above the middle parabola. Our next step is to show that we can determine Rã f̃(s, θ) if (s, θ) corresponds to a line contained in the set {y < εx^2}, given the hypotheses of the lemma. If this line does not pass through C_{j−1}, then there is no problem since we know f̃ and ã outside of C_{j−1}. Suppose on the other hand that the line does pass through C_{j−1}, and let the two points of intersection between the line and ∂C_{j−1} be denoted t_1 < t_2 (note there will always be two such points by convexity and these can be determined from C_{j−1}). We then have

R_a f(s, θ) = (∫_{−∞}^{t_1} + ∫_{t_1}^{t_2} + ∫_{t_2}^{∞}) e^{−Da(sθ^⊥ + tθ, θ)} f(sθ^⊥ + tθ) dt.
The first and third terms on the right-hand side of the last equation only involve a|_{R^2\C_j} and f|_{R^2\C_{j−1}}, as well as t_1 and t_2, and thus are known functions of (s, θ) under the given hypotheses. Combining these together with R_a f into one function G(s, θ), we obtain

∫_{t_1}^{t_2} e^{−Da(sθ^⊥ + tθ, θ)} f(sθ^⊥ + tθ) dt = G(s, θ),

where G is a function which can be determined from the known information. Next note that for t ∈ (t_1, t_2) the quantity F = −Da(sθ^⊥ + tθ, θ) + Dã(sθ^⊥ + tθ, θ) only depends on s and θ, and can be determined under the hypotheses. Since also f̃ = φf = f on C_{j−1}, we therefore have

∫_{t_1}^{t_2} e^{−Dã(sθ^⊥ + tθ, θ)} f̃(sθ^⊥ + tθ) dt = e^{−F(s, θ)} G(s, θ).

Finally, we can add back in the integrals with f̃ and ã from −∞ to t_1 and from t_2 to ∞, since these only involve ã|_{R^2\C_j} and f̃|_{R^2\C_{j−1}}, which are assumed to be known. Doing this we see that Rã f̃ can be determined given the hypotheses. The problem has now been reduced to determining f|_{C_{j−1}∩{y<0}}. Our final step is to change variables in order to reduce the problem to the one considered in [10, theorem 3.1]. For this we consider only Rã f̃(s, θ) for (s, θ) corresponding to lines contained in {y < εx^2}. We reparametrise such lines as y = kx + l, where the slope k and intercept l replace θ and s respectively. We will also write x_+(k, l) for the larger value of x at which {y = kx + l} intersects {y = εx^2 − h − 1}. When we parametrise the lines in this way, the beam transform and the AtRT become integrals in x along the line y = kx + l. We now introduce the new coordinates (z, w) defined by z = √ε x and w = y − εx^2 + h. With this change, the region {εx^2 > y > εx^2 − h} becomes the strip {h > w > 0}, and, abusing notation slightly by writing f̃ also for the same function in these coordinates, the AtRT becomes an integral over this strip for any (k, l) corresponding to a line contained in {y < εx^2} and passing through the region {y > εx^2 − h}. The uniqueness of f̃ in the region {y < εx^2}, and therefore also of f in the same region, now follows from [10, theorem 3.1], since the resulting integrand is an analytic function of k/(2√ε) and z provided the imaginary part of k is sufficiently small.
Lemma 8 now allows us to complete the proof of theorem 2. Indeed, the induction step is already proved as described just above the lemma. The base case is also covered by lemma 8, since we can determine the set C_0 of the lemma by looking at the support of R_a f, and then we always know f|_{R^2\C_0} = 0 and a|_{R^2\C_1} = 0. This completes the proof of theorem 2.

Numerical method
We now turn our attention to numerically recovering a and f from data. We begin by outlining how we discretise the domain. Let Ω be the domain of interest, and split Ω into M^2 square pixels of side length dx, ordered lexicographically from the top left to the bottom right. We then assume that a and f are piecewise constant over each pixel. Recall that for an oriented line given by (s, θ) we define the AtRT via (25), where s is the signed distance of closest approach to the origin and θ is a unit direction tangent to the line giving its orientation.

Since a and f are piecewise constant on the pixels, we can evaluate (25) exactly as follows. Let P be the list of pixels crossed by the oriented line, in the order in which they are crossed, and denote the length of P by N (note that for this to be well-defined we need the ray to be oriented). Let K be the ordered set of t values which correspond to an intersection with an edge of a pixel in the grid, and let IT be the set of distances between adjacent entries of K. Using this notation we find

R[a]f(s, θ) = Σ_{i=1}^{N} f_{P(i)} S(i) (1 − e^{−IT(i) a_{P(i)}}) / a_{P(i)},   (26)

where a_{P(i)}, f_{P(i)} are the values of a and f in the P(i)th pixel, the ith term is interpreted in the limiting sense f_{P(i)} S(i) IT(i) when a_{P(i)} = 0, and S(N) = 1, S(i − 1) = S(i) e^{−IT(i) a_{P(i)}}, so that S(i) is the attenuation accumulated between the ith pixel and the detector.

This allows us to rewrite the AtRT as a vector equation involving a and f. If we are given a data vector d for a set I of oriented lines (s_i, θ_i)_{i∈I}, then we can combine all of these vector equations into a single matrix equation

R[a] f = d.   (28)

The discretised problem of interest is then to determine both a and f from d given by (28), where a is multi-bang with known admissible set A := {a_0, a_1, . . . , a_n}, a_0 < a_1 < · · · < a_n (note that for notational convenience we have reindexed the admissible values relative to definition 1). We attempt to do this by solving the variational problem

min_{a, f} R(a, f) := ½‖R[a]f − d‖_2^2 + αM(a) + λ TV(a) + η TV(f),   (29)

where TV is a discrete version of the total variation and M, which will be described below, is used to enforce the multi-bang assumption.
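The exact ray sum (26) can be sketched in code. The following is a minimal illustration, not the implementation used in the paper, assuming the pixel values and intersection lengths along one oriented ray have already been extracted:

```python
import numpy as np

def atrt_ray(a_vals, f_vals, lengths):
    """Exact attenuated ray sum for piecewise constant a and f.

    a_vals, f_vals: values of a and f in the pixels P(1), ..., P(N)
    crossed by the oriented ray, in the order they are crossed;
    lengths: the intersection lengths IT(i).
    S is the accumulated attenuation between a pixel and the detector,
    with S(N) = 1 and S(i-1) = S(i) * exp(-IT(i) * a_P(i)).
    """
    total = 0.0
    S = 1.0  # no attenuation beyond the last pixel
    for a, f, L in zip(reversed(a_vals), reversed(f_vals), reversed(lengths)):
        if a > 0:
            total += f * S * (1.0 - np.exp(-a * L)) / a
        else:
            total += f * S * L  # a -> 0 limit of (1 - e^{-aL})/a is L
        S *= np.exp(-a * L)
    return total
```

With a = 0 everywhere this reduces to the unattenuated ray sum Σ_i f_{P(i)} IT(i).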
Recent work in [13,14] attempted to design a convex regularizer to promote multi-bang solutions when the admissible set is known. The original idea in [13] was to make a convex penalty with jumps in gradient at admissible values.
In this paper we instead use a modified, non-convex, version of the multi-bang penalty given by

M(a) = ∫_Ω m(a(x)) dx,   (30)

where m is the pointwise multi-bang penalty: m(t) = 0 for all multi-bang values t = a_i, and m behaves quadratically when t lies between two adjacent a_i (see figure 3 for an example). Compared to the convex multi-bang penalty from [13], this has the advantage of giving a proximal map which has the multi-bang values as stationary points. Note that in the discrete case we consider piecewise constant a, and so (30) is really a sum over pixels given by

M(a) = Σ_{i=1}^{M^2} m(a_i).   (32)

Strictly speaking (32) should have a factor of dx^2 in front of the summation, but this gets absorbed by the regularization parameter α and so we omit it.

Although the regularizer (32) promotes multi-bang solutions it provides no spatial regularity, and so we also include TV [30] regularization as a joint regularizer. TV has been widely studied and is well known to promote piecewise constant images with small perimeter [13, 30]. This combination, at least numerically, allows us to significantly reduce the number of projections required to obtain a good reconstruction. For practical implementation we use a smoothed version of the isotropic TV [30],

TV_c(a) = Σ_{i=1}^{M^2} √(‖D_i a‖_2^2 + c),   (33)

where c > 0 is a small smoothing constant and each D_i ∈ R^{2×M^2} is a finite difference matrix satisfying (34). Note that the smoothing of the TV is required in order to guarantee global Lipschitz continuity of its gradient.

At this point we would like to mention that although (33) is convex and the non-convex multi-bang regularizer is weakly convex, we still have to be careful with the data fidelity term: it can be shown that for sufficiently large a, ‖R[a]f − d‖^2 may be non-convex as a function of a. Because of this we use the following alternating minimization scheme [1], designed for non-convex objective functions:

a^{k+1} = argmin_a R(a, f^k) + (1/ξ_k)‖a − a^k‖^2,
f^{k+1} = argmin_f R(a^{k+1}, f) + (1/ξ_k)‖f − f^k‖^2,   (35)

for sufficiently small {ξ_k}_{k=1}^∞. We first turn our attention to the a update.
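For illustration, the smoothed isotropic TV (33) can be sketched as below for a square image, using forward differences with a replicated boundary; the exact difference matrices D_i in the paper may differ, so this is only a representative choice:

```python
import numpy as np

def tv_smoothed(img, c):
    """Smoothed isotropic TV: sum over pixels of sqrt(|D_i img|^2 + c).

    Forward differences with the last row/column replicated. The c > 0
    term smooths the kink of the Euclidean norm at zero, which is what
    makes the gradient of TV_c globally Lipschitz.
    """
    gx = np.diff(img, axis=1, append=img[:, -1:])  # horizontal differences
    gy = np.diff(img, axis=0, append=img[-1:, :])  # vertical differences
    return float(np.sum(np.sqrt(gx ** 2 + gy ** 2 + c)))
```

For a constant image both difference fields vanish and TV_c reduces to M^2 √c, which is why c is kept small.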

Updating attenuation a
Since we only keep the parts of the objective function R(a, f) involving a, the a update in (35) is equivalent to

min_a ½‖R[a]f^k − d‖_2^2 + αM(a) + λ TV_c(a) + (1/ξ_k)‖a − a^k‖^2.   (36)

For the purpose of solving this optimisation problem we introduce two auxiliary variables: x, corresponding to a itself, and y, corresponding to the discrete derivative of a. These two parts are linked by the matrix equation y = Dx, where D is the finite difference matrix obtained by stacking all the D_i defined by (34) on top of each other. We further split y into a series of 2-by-1 column vectors y_i, which are linked to x by the matrix equations y_i = D_i x. Therefore, we can rewrite (36) as the constrained problem

min_{x, y} ½‖R[x]f^k − d‖_2^2 + αM(x) + λ Σ_i √(‖y_i‖_2^2 + c) + (1/ξ_k)‖x − a^k‖^2 subject to y = Dx.   (37)

A standard algorithm for solving an optimization problem in the form of (37) is the ADMM [5]. In this case the augmented Lagrangian [5] is obtained by adding to the objective of (37) the terms Σ_i μ_i^T(y_i − D_i x) + (β/2)‖y − Dx‖_2^2 (38), where β > 0 and the μ_i are Lagrange multipliers related to the y_i. We also define a vector μ given by placing the μ_i related to y_i in the same positions in μ as the corresponding y_i are in y. The ADMM algorithm for solving (36) then alternates updates of x, y and μ.

Removing terms not involving x, we see that the update x^{l+1} can be calculated using the first order optimality condition (39), in which the gradient of the data fidelity term is determined from (26) as in [34]. Note that since the non-convex multi-bang regularizer is separable, the optimality condition holds elementwise, and the smooth part of the objective is differentiable with globally Lipschitz gradient. Furthermore, as the pointwise multi-bang regularizer is weakly convex, by [3] it admits a well-defined proximal map, given by (40), whenever ½ > αt > 0, where A = {a_0, a_1, . . . , a_n} is the admissible set. Figure 4 gives an example of the proximal map for the weakly convex multi-bang regularizer when the admissible set is A = {0, 0.25, 0.5, 0.75, 1}. Note in particular that prox^{1/t}_{αm}(a_i) = a_i, which is in contrast to the convex case [13]. Using this we can find x^{l+1} satisfying the optimality condition (39) via a fixed point iteration such as ISTA or FISTA [4].
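The fixed-point behaviour of the proximal map shown in figure 4 can be illustrated numerically. The quadratic form of m used below is an assumption made purely for illustration (the paper's exact formula for m is not reproduced here); it is chosen only so that m vanishes at the admissible values and is quadratic between adjacent ones, and the proximal map is then evaluated by brute-force minimisation over a fine grid:

```python
import numpy as np

A = [0.0, 0.25, 0.5, 0.75, 1.0]  # admissible set, as in figure 4

def m(t):
    """Hypothetical pointwise multi-bang penalty: zero at each admissible
    value, quadratic in between (an assumed stand-in for the paper's m)."""
    t = min(max(t, A[0]), A[-1])
    for lo, hi in zip(A[:-1], A[1:]):
        if lo <= t <= hi:
            return 0.5 * (t - lo) * (hi - t)
    return 0.0

GRID = np.linspace(-0.5, 1.5, 20001)

def prox_m(v, alpha, t):
    """Brute-force proximal map: argmin_u alpha*m(u) + (t/2)(u - v)^2."""
    vals = [alpha * m(u) + 0.5 * t * (u - v) ** 2 for u in GRID]
    return float(GRID[int(np.argmin(vals))])
```

As expected, each admissible value is a fixed point of the map, while intermediate values are pulled towards an admissible one; this is the property that is lost in the convex case [13].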
Indeed, provided 0 < αt < 1/2, by [3, 4] both ISTA and FISTA produce iterates which converge to a solution x^{l+1} of (39) to within any prescribed tolerance.
We now turn our attention to the y update. The first order optimality condition for the y update gives (41) for all i. Whilst this cannot easily be solved explicitly for y_i, we can make use of the gradient appearing in (41) to solve the y update via gradient descent. Once we have updated all of the y_i in this way we can combine them to update y. Finally, since the multi-bang term is lower semi-continuous and Σ_i λ√(‖y_i‖_2^2 + c) has Lipschitz continuous gradient, [20] gives convergence of the ADMM to a critical point; that is, both the primal residual r^l := y^{l+1} − Dx^{l+1} and the dual residual s^l := βD^T(y^{l+1} − y^l) converge to zero. Numerically we can speed up the rate of convergence of the ADMM algorithm by adapting β. We use the following scheme from [5]: pick β_0 > 0 and then for l ≥ 0 define

β_{l+1} = τ_+ β_l if ‖r^l‖ > ν‖s^l‖,    β_{l+1} = β_l / τ_− if ‖s^l‖ > ν‖r^l‖,    β_{l+1} = β_l otherwise,   (42)

for some chosen positive τ_± and ν. This completes the a update section of the numerical method. We now turn our attention to updating f.
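A sketch of one y_i update and of the residual-balancing rule (42) follows. The gradient matches the subproblem min_y λ√(‖y‖^2 + c) + μ·y + (β/2)‖y − D_i x‖^2 coming from the augmented Lagrangian, and the step size comes from its strong convexity and Lipschitz constants; merging τ_± into a single factor τ is a simplification made here:

```python
import numpy as np

def y_update(g, mu, lam, beta, c, iters=200):
    """Gradient descent for min_y lam*sqrt(|y|^2 + c) + mu.y
    + (beta/2)|y - g|^2, where g = D_i x.  The objective is
    beta-strongly convex with a (lam/sqrt(c) + beta)-Lipschitz gradient,
    so a fixed step of 1/(lam/sqrt(c) + beta) converges."""
    step = 1.0 / (lam / np.sqrt(c) + beta)
    y = np.array(g, dtype=float)
    for _ in range(iters):
        grad = lam * y / np.sqrt(y @ y + c) + mu + beta * (y - g)
        y = y - step * grad
    return y

def update_beta(beta, r_norm, s_norm, tau=2.0, nu=10.0):
    """Residual balancing for ADMM: grow beta when the primal residual
    dominates, shrink it when the dual residual dominates."""
    if r_norm > nu * s_norm:
        return tau * beta
    if s_norm > nu * r_norm:
        return beta / tau
    return beta
```

With λ = 0 the subproblem is quadratic and the minimiser g − μ/β is reached exactly.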

Updating source radiation f
Removing terms not involving f, the f update satisfies

f^{k+1} = argmin_f ½‖R[a^{k+1}]f − d‖_2^2 + η TV_c(f) + (1/ξ_k)‖f − f^k‖^2,   (43)

where again c > 0 is some small smoothing constant. We find f^{k+1} by solving (43) via the ADMM [5, 20] in a similar way to the a update. However, it is simpler here because we do not have the multi-bang regularization term. The method is again proven to converge to a critical point.
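The f update can be sketched with the same ADMM splitting z = Df used for a, here on a one-dimensional signal with an anisotropic smoothed TV and without the proximal term of (35), both simplifications made for brevity (the paper works on the 2-D pixel grid with the isotropic TV_c):

```python
import numpy as np

def f_update_admm(R, d, eta, c=1e-2, beta=1.0, outer=300, inner=50):
    """ADMM sketch for min_f 0.5|R f - d|^2 + eta * sum_i sqrt((Df)_i^2 + c)
    with the splitting z = D f, D being 1-D forward differences."""
    n = d.size
    D = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)   # forward differences
    A = R.T @ R + beta * (D.T @ D)                 # f-step normal matrix
    f, z, mu = d.copy(), D @ d, np.zeros(n - 1)
    step = 1.0 / (eta / np.sqrt(c) + beta)         # z-step step size
    for _ in range(outer):
        # f-step: quadratic in f, solved exactly
        f = np.linalg.solve(A, R.T @ d - D.T @ mu + beta * (D.T @ z))
        Df = D @ f
        # z-step: a few gradient descent iterations, as in the y update
        for _ in range(inner):
            grad = eta * z / np.sqrt(z * z + c) - mu - beta * (Df - z)
            z = z - step * grad
        # multiplier update
        mu = mu + beta * (Df - z)
    return f, z
```

Applied to a noisy step signal with R the identity this performs TV denoising; in the joint algorithm R would be the AtRT matrix R[a^{k+1}].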
With both the a and f update dealt with we are ready to outline the joint reconstruction algorithm.
We point out that in this algorithm β 0 is reset to the same initialised value whenever the inner iterations aimed at the a update in (35) (those indexed by l) restart. With the numerical method outlined we now present some numerical results.

Numerical reconstructions
Throughout this section we produce data on a 340 by 340 pixel grid and reconstruct on a 200 by 200 grid to avoid inverse crimes. The size of each pixel in all examples presented is 0.2 by 0.2 (i.e. dx = 0.2). All of the following examples have 5% added Gaussian white noise and were run on a standard 4-core laptop using MATLAB. Note that much of the computational time is spent computing and recomputing the matrix representation of R[a] when a is updated, and many of the steps in this reconstruction can be done using parallel computing toolboxes. Unless otherwise stated, the following reconstructions use 12 parallel ray projections, equally spaced with some small perturbation to make the angles irrationally related (i.e. we only use data with 12 different values of θ). Irrationally related angles have been shown to reduce the number of projections required to obtain good reconstructions [12, 39]. For 12 projections the simultaneous reconstruction algorithm takes approximately 8 min.
We note that most of the reconstructions presented are not covered by the theory in section 2. Although a is multi-bang in all cases, we give only a single example where a is nicely multi-bang (figure 6), as we believe the other examples, where a is not nicely multi-bang, are more challenging. In fact, similar results have been obtained in cases when a is nicely multi-bang. The assumption that f ∈ C^1 from section 2 is used throughout, at least up to discretisation, except in one example (figure 6) where we demonstrate the behaviour of the algorithm when both a and f are multi-bang.
There are a large number of parameters to control, which gives good flexibility but does require extensive parameter tuning in order to obtain optimal results. Parameter selection is therefore an important part of the algorithm. Although optimal results require fine tuning, we find that a large range of the parameters β_0, t, α, λ and η lead to good results. Firstly, the step sizes t and β_0 can be chosen quite freely, with only the number of iterations required for convergence affected. Since it is actually the quantity 1/t that we use as the step size, small values of t lead to large step sizes. For a given α there is an upper bound on the value of t that can be used, determined by (40), since the proximal map is not well-defined if αt is too large. As we update β at each iteration, the initial choice of β_0 can be varied widely and still produce good results. Perhaps the most interesting parameter choice concerns α. Since it is the quantity αt that matters, taking α too large causes the proximal map to be ill-defined. If α is close to the limit allowed by αt, then in numerical experiments we found the algorithm produced completely multi-bang reconstructions but the boundaries were poorly recovered, whereas taking α too small tends to give non-multi-bang reconstructions. Again there is a large range, spanning several orders of magnitude, for which good reconstructions are obtained. For example, in the reconstructions shown in this section any choice of α ∈ [10^{−3}, 0.5] and t ∈ [0.001, 0.1] would yield similar results. There is some interaction between the choices of α and λ; in particular, if one is much larger than the other the reconstructions essentially become purely multi-bang or purely TV reconstructions. Keeping α and λ of the same order, at least in the experiments we have tried, produces a with both good multi-bang behaviour and good shape recovery. Finally, a large range of η yields good recovery, and so this parameter was not extensively tuned.
In the following examples we use initial guesses where a is constant. In practice convergence is obtained for all tested phantoms with any constant initial guess of a, provided the constant lies between a_0 and a_n in the admissible set. We therefore use the initial guess a^0 = 0 for all numerical results presented here. The last general comment we make is that if we set ξ = ∞, effectively removing the added terms ‖a − a^k‖^2 and ‖f − f^k‖^2 from (35), we still obtain convergence. In many cases removing this part improves the speed of convergence, although the theoretical proof of convergence does not hold in this case. Throughout the section we fix ξ = 50 and set all tolerances δ_i to 1 × 10^{−3}.

Figure 5 shows reconstructions obtained via an optimized-parameter joint TV and multi-bang regularizer algorithm against those obtained purely by a TV approach. The left-hand column gives the true phantoms for a and f; a is binary, so here A = {0, 1}. The middle column shows the joint reconstruction of a and f with both TV and multi-bang regularization. Here the step sizes are t = 0.10 and β_0 = 0.1, and the regularization parameters are α = 0.2 and λ = η = 0.1. The reconstruction of a is multi-bang with the overall structure very well recovered, and this is also seen in the reconstruction of f. We note that an L1 regularizer could have been used in place of TV on f. In practice, however, TV performs much better at removing the cross-talk artefacts which are common in these types of joint reconstruction [10, 19, 33]. The right-hand column is an optimized-parameter reconstruction obtained using just TV with 80 projections. In this case both the binary nature and the structure of a are lost, even with the extra data. This can also be seen in the recovery of f: the structure is roughly recovered, but several ring artefacts appear in the reconstruction. Figure 6 shows an example of a nicely multi-bang a with a multi-bang f which jumps across the same boundaries as a.
Here a takes the values 0, 0.25, 0.5 and 1 and f takes the values of 0, 0.125, 0.25 and 0.5. The reconstructions in this example use the joint regularizer on a and only TV regularisation on f. The left column shows the true phantoms for a and f.
The middle column of figure 6 is an optimized reconstruction using the correct admissible set A = {0, 0.25, 0.5, 1} for a with α = 0.1 and λ = η = 0.05. In this case the reconstruction of both a and f is very good indicating that unique recovery may still be possible even when f is not C 1 . This is a step beyond the theoretical results of the previous sections.
The right column of figure 6 shows an optimized reconstruction using the erroneous admissible set A = {0, 0.2, 0.4, 0.6, 0.7, 0.8} for a, with α = 0.1, λ = 0.05 and η = 0.15. Note that for this reconstruction only 0 from the admissible set is a true multi-bang value. In this case the region where a is 0 is well recovered, but circular artefacts appear in a taking values from the incorrect A. Bands appear near jumps in the original a, and the innermost region is slightly larger than in the reconstruction with the true A. In fact it appears the algorithm has attempted to interpolate the correct a using values from the erroneous admissible set. Lastly, we see that f suffers in a similar manner to a. Interestingly, the multi-bang property of f is not so well inherited in this case; in particular the middle region shows some variation. The ability of the algorithm to handle erroneous values of a in the admissible set in more general settings requires further study and is not covered by the theory presented in section 2.

Figure 7 shows the effect of the number of projections on reconstruction quality. Here the phantom is made up of 3 regions with A = {0, 0.5, 1}. The left column shows the true phantoms for a and f. The middle column is an optimized reconstruction using 6 projections with α = 0.1 and λ = η = 0.05. The right column shows an optimized reconstruction using 12 projections with α = 0.1, λ = 0.05 and η = 0.15. Both reconstructions of a are multi-bang. The middle column shows a poor recovery of the structure of a and f: the recovered a has a lot of misclassification and the regions have not been separated, and the inaccuracies in a have an impact on the recovery of f, with the outermost regions of f poorly recovered. The rightmost column shows a very good recovery of both a and f, with just a small section on the left bracket being misclassified.
The matching f is also very well recovered. Although the smoothed TV regularization removes cross-talk artefacts, it is important not to use too large an η, as this removes the continuous nature of f at the edges. We also remark that there is no significant improvement in the quality of the reconstruction if we increase the number of projections further.

Figure 8 shows a plot of the proportion of pixels taking admissible values against outer iteration number (that is, k in algorithm 1) for the rightmost reconstruction of a in figure 7. The initial guess is a constant at 0, which is why the initial proportion is 1. The proportion increases monotonically after about 20 iterations, with some large jumps before this point; these typically line up with β_{l+1} being either increased or decreased in the inner iterations. This graph is typical of the reconstructions presented here and suggests that another suitable stopping criterion would be to terminate once a certain proportion of admissible values is reached. Typically a convergent reconstruction has a multi-bang proportion of over 0.95 by the time the algorithm is terminated by the step size tolerances.

Figure 9 shows the effect of shrinking the size of the support of f on the reconstruction. This is linked to the proofs of uniqueness in theorems 1 and 2, where we require f to be non-zero on the support of a.

Algorithm 1. Joint reconstruction algorithm.
1: Input a^0 as initial guess, step sizes t, β_0, tolerances δ_1, δ_2, δ_3, δ_4, δ_5 and regularization parameters α, λ and η.
2: Set f^0 to be the least squares solution of min_f ‖R[a^0]f − d‖^2.
3: for k ≥ 0 do
4:   Set x^0 = a^k and y^0 = Dx^0.
5:   for l ≥ 0 do
6:     Update x^{l+1} via ISTA or FISTA with δ_1 as a tolerance on ‖x^{l+1} − x^l‖.
7:     Update y^{l+1} via gradient descent on (41).
8:     Set μ^{l+1} = μ^l + β_l(y^{l+1} − Dx^{l+1}).
9:     Update β_{l+1} via (42).
10:    Terminate the inner loop when ‖r^l‖ < δ_2 and ‖s^l‖ < δ_3, and output a^{k+1} = x^{l+1}.
11:   Update f^{k+1} via (43) using the ADMM with tolerance δ_4.
12: Terminate when ‖a^{k+1} − a^k‖_2 < δ_5 and ‖f^{k+1} − f^k‖_2 < δ_5.

The reconstruction here is then performed by simply performing one update for a. The rightmost reconstruction in figure 9 captures the shape and classifies almost all pixels correctly; most importantly, it captures the smallest regions well. Note that here, as in theorems 1 and 2, f has a larger support than a; it is possible to obtain reconstructions similar to the rightmost recovery for f with support slightly smaller than that of a. The 2nd column from the right also has a decent shape recovery but has lost some of the finer features, such as the ellipses at the bottom. The 2nd column from the left does a poor job of recovering a, with very few pixels correctly classified as 1: the high-valued outer ellipse is lost and all the finer details are missing. This is a similar effect to reducing the number of projections, which is to be expected, since shrinking the support of f leaves fewer rays contributing data per projection. We can however still see some detail outside of the support of f; this is due to TV being able to fill in the gaps and extend our visibility. The angled straight edges in the reconstruction are related to the angles of the projections in the data set. Figure 10 shows reconstructions based on a realistic SPECT phantom described in [26].
Here a and f have the same support, and a takes the values 0, 1, 2 and 20, so A = {0, 1, 2, 20}. The left-hand column shows the original phantoms for a and f, and the right-hand column shows an optimized recovery with 5% added Gaussian white noise. The step sizes are t = 0.025 and β_0 = 0.15, and the regularization parameters are α = 0.02, λ = 0.05 and η = 0.15. This example is intended to better capture the range of attenuation values which can occur in SPECT imaging. The non-uniformly spaced admissible values lead to a proximal map which has larger intervals around certain multi-bang values. If the differences between adjacent multi-bang values are very large (i.e. > 10^3) then this can create problems in the reconstructions, but an additional scaling term related to the difference between adjacent multi-bang values can be added to ameliorate this issue. In this example, however, the differences are only of the order of 20, so this is not necessary, and by lowering the value of α we can obtain a good reconstruction of both a and f.
The final numerical result we present, in figure 11, is obtained using as the attenuation map an image of a walnut reconstructed from CT data by the Finnish Inverse Problems Society [21]. The walnut phantom for a is binary, so A = {0, 1}. Here the step sizes are t = 0.05 and β_0 = 0.2, and the regularization parameters are α = 0.2, λ = 0.1 and η = 0.15. The left-hand column is the true a and f and the right-hand column the reconstruction. In general the larger structures of a are recovered but the finer details are lost; this remains true even if the number of projections is increased significantly. This is in part due to the detail being compressed in going from 340 by 340 to 200 by 200 pixels, and to the TV regularizer eliminating the smallest non-zero regions. The larger sections are well recovered and the outermost boundary is very well classified. The reconstruction of f is again good, with the areas towards the boundaries most affected by the errors in a. There are no cross-talk artefacts present, even with the more complicated a.

Conclusion
In this paper we have presented and proved two theorems on the identification problem for SPECT involving multi-bang attenuation. In particular, for nicely multi-bang a and nonnegative f ∈ C_c^1(R^2) with sufficiently large support, we have shown uniqueness of the joint recovery from the AtRT. The method of proof for these theorems suggests ways to obtain similar results under further relaxed conditions on a and f, and we intend to investigate this in future work.
On the numerical side, we have formulated a variational problem including a weakly convex version of the convex multi-bang regularizer [13, 14] and a smoothed TV, and presented an algorithm for simultaneous recovery of a and f from the resulting variational problem (29). Using an alternating direction approach for non-convex objective functions [1] coupled with the ADMM [5], we are able to successfully solve these variational problems. The addition of a joint multi-bang and TV regularizer has produced good results for joint recovery with projection numbers similar to those used in the x-ray recovery case when the image is known to take only finitely many values [12, 39]. The apparent convergence of the algorithm in all cases we have investigated, even when ξ_k = ∞ and independently of the TV smoothing parameter c, presents theoretical questions for future work. Also, even though the variational problems on which our method is based are non-convex, our algorithm consistently converges to a reasonable approximation of the correct solution. This suggests that we may be finding, or getting close to, the global minimiser, and there is potential for future research investigating whether this is indeed the case.
Finally, the numerical examples shown in figure 9 indicate that we are able to obtain good recovery for a even when the support of f is smaller than required in our theoretical results. We suspect that the TV may be playing a role in filling in the boundaries of regions of constant a, and would like to investigate this further as well.