Fast algorithms for nonlinear and constrained phase retrieval in near-field X-ray holography based on Tikhonov regularization

Based on phase retrieval, lensless coherent imaging and in particular holography offers quantitative phase and amplitude images. This is of particular importance for spectral ranges where suitable lenses are challenging, such as for hard x-rays. Here, we propose a phase retrieval approach for inline x-ray holography based on Tikhonov regularization applied to the full nonlinear forward model of image formation. The approach can be seen as a nonlinear generalization of the well-established contrast-transfer-function (CTF) reconstruction method. While similar methods have been proposed before, the current work achieves nonlinear, constrained phase retrieval at competitive computation times. We thus enable high-throughput imaging of optically strong objects beyond the scope of CTF. Using different examples of inline holograms obtained from illumination by a x-ray waveguide-source, we demonstrate superior image quality even for samples which do not obey the assumption of a weakly varying phase. Since the presented approach does not rely on linearization, we expect it to be well suited also for other probes such as visible light or electrons, which often exhibit strong phase interaction.


Introduction
X-ray imaging and tomography offers a set of advantages for biomedical and materials research: resolution, penetration power, high volume throughput, compatibility with hydrated specimen, and quantitative contrast mechanisms relatively unaffected by multiple scattering. The advantage of phase over absorption contrast for brain tomography has been realized early, since for the high photon energies of hard x-rays, the index of refraction ( , r) = 1 − ( , r) + ( , r) leads to dominating phase interaction compared to absorption. For low-elements, in particular, we have / 1, and a proportionality of (r) and the electron density. Hence, it has been a major research goal over the last two decades to exploit the small phase shifts in a wavefront imparted by matter for high resolution imaging. Different phase-contrast methods can be used to transform these phase shifts into measurable intensities patterns by ways of interference [1,2], based on different geometries and mechanisms [3,4]. Propagation-based imaging only requires the self-interference of the diffracted beam behind the object and the unattenuated or weakly attenuated primary beam [5]. It hence does not require scanning of the object or any optical element, which makes the method also dose effective. Finally, the imaging modality can cover a wide range of scales, from macroscopic (clinical) settings such as mammography [6] down to imaging sub-cellular spatial resolution below 50 nm [7] using synchrotron radiation.
A central challenge in propagation-based imaging has been the formulation of accurate and efficient phase retrieval schemes [8][9][10]. In the so-called direct-contrast regime, i.e. at small propagation distances or correspondingly large Fresnel numbers, linearization in can be used to derive approximate inversion equations [11]. On the contrary, the present work is concerned with the holographic regime of small Fresnel numbers, which is exploited for imaging with spatial resolutions in the order of 100 nm or below, where such inversion methods fail [5,12]. In this regime, where propagation imaging is also known as near-field holography (NFH), phase-sensitivity is high enough to 'pick up' the small differences, for example encountered in unstained soft tissues. This enables a wide range of applications such as virtual histology of COVID-19 lung tissue [13], structural imaging of microscopic catalyst particles [14], observation of expanding cavitation bubbles [15] or resolving neuronal structures throughout entire Drosophila melanogaster brains [16]. For NFH, an inversion based on the known contrast transfer function (CTF) for optically weak objects has been proposed twenty years ago [10,17,18], using holograms data recorded at different measurement planes and Fresnel numbers 1 , . . . , . This so-called CTF phase retrieval has been extremely successful, and has found widespread use also due to its numerical simplicity and efficiency.
However, CTF fails or has limited image quality in many cases, when the basic assumption of a weakly varying phase is not fulfilled [19][20][21]. For example, interior or exterior interfaces can result in large phase gradient which often substantially deteriorate the image quality, even if the bulk of the sample is characterized by smaller phase gradients. To treat data with larger phase gradients, iterative phase retrieval schemes have to be considered that overcome the limitation to the linear CTF regime. Such nonlinear reconstructions in NFH, using gradient descent algorithms, have been reported over a decade ago [22,23]. Regularized Newton methods have also been proposed [24]. More recently, Hagemann et al. introduced a simple scheme of alternating projections (AP), which operates on the same input data and constraint sets as the conventional CTF-based phase retrieval [25]. As in far-field coherent diffractive imaging (CDI), iterative projection algorithms [26,27], which cycle between measurement and object plane, do not rely on linearization and are well suited also for phase retrieval in the NFH regime. In addition to the measured data, a priori constraints (prior information) on the imaged object, such as finite support, separability, or range constraints on the numerical phase values can be used [28,29]. However, single distance acquisitions of extended (not compactly supported) specimen, which in practice are of particular relevance for tomography, often suffer from stagnation and noise-induced artifacts as phase retrieval becomes more strongly ill-posed in this setting [30,31]. In fact, without a compact support as in [7,28], AP-type algorithms often only yield low-quality reconstructions, unless the data acquisition is extended by longitudinal (i.e. parallel to the optical axis) or lateral scanning. Moreover, iterative reconstructions are typically numerically expensive, which is prohibitive for large tomographic data sets. In order not to bottleneck tomographic imaging pipelines, computation times for image-reconstruction should not exceed the typical image-acquisition times ( 1 s). On the contrary, iterative projection algorithms applied to holograms with 2000 2 pixels typically require thousands of iterations and take minutes to complete even on data center graphical processing units (GPUs) [25].
In this work, we propose an algorithm for phase retrieval in the holographic regime, denoted as NLTikh, using Tikhonov regularization to invert the full nonlinear forward model of NFH. The approach is a nonlinear generalization of the well-established CTF reconstruction, which no longer depends on a weak object assumption and on the other hand permits to impose supportand/or range constraints on the recovered phase-image. As an intermediate step, we also introduce a constrained CTF algorithm (cCTF) that retains the linear CTF model but adds flexibility in terms of incorporating constraints. The proposed methods overcome the aforementioned performance issues of other iterative methods by applying state-of-the-art optimization methods for minimizing the Tikhonov functional, significantly reducing the number of required iterations compared to previous approaches. Additionally, our algorithms make use of GPUs whenever available, while retaining the ease-of-use of CTF phase retrieval from the user's perspective.
While the early phase retrieval algorithms have been reported solely in the literature without usable implementations, it has become customary to publish the source code as well [32][33][34]. We follow this trend and provide Matlab implementations for the proposed methods.

Phase retrieval algorithms
The task of phase retrieval in NFH is to reconstruct the phase (and absorption ) from near-field intensity data measured in the detector plane: (1) D ( ) = F −1 (exp(−i 2 /(4 )) · F ( )) denotes Fresnel-propagation of a wave-field from the object plane to the detector, with the Fourier transform F and the dimensionless Fresnel number , where 1 corresponds to the holographic regime. The object exit wave = exp(i − ) is given by projections of the imaged sample's optical indices = − ∫ ( , , )d and absorption = ∫ ( , , )d ( = 1 − + i : refractive index, : optical axis, : wavenumber), as long as diffraction within the object can be neglected. The widely used weak object approximation, i.e. the linearization of (1) with respect to the phase-and absorption images and , then leads to the contrast transfer function (CTF) model [19], where ( ) = sin( 2 /(4 )) and ( ) = cos( 2 /(4 )) denote the phase-and absorption CTFs. Direct inversion of the CTF-model, as introduced by Peter Cloetens twenty years ago [10], is the standard phase retrieval method in the holographic regime today. It works by computing a regularized inverse of (2) based on a Fourier-filter constructed from the CTF. The method makes use of either a pure phase or a single-material approximation, i.e. = 0 or -more generallyproportionality of and is assumed: Mathematically, CTF phase retrieval amounts to applying Tikhonov regularization to the linearized forward model under the constraint in (3), i.e. it can be written as where the are the holograms measured at Fresnel numbers 1 , . . . , , := ( ∫ | ( )| 2 d ) 1/2 denotes the 2 -norm and the weighting function ≥ 0 controls the strength of the regularization term 1/2 · F ( ) 2 in the different spatial frequencies of . Note that, if regularization was omitted, i.e. = 0, the denominator in (4b) could become arbitrarily small for spatial frequencies where the summed CTF =1 ( − / · ) 2 is close to zero. This would lead to a prohibitively strong amplification of data noise in the reconstructed images. As is customary, we consider a two-level regularization, exhibiting a smooth transition from a weaker penalization of low spatial frequencies to a stronger penalization beyond the first maximum of the phase-CTF: is the mean Fresnel number of the holograms. We use low-freq = 10 −3 and high-freq = 10 −1 as default values, which turn out to yield good results in many settings, although the 'ideal' choice of regularization parameters is application-specific.
In the present work, we want to build upon this successful model and achieve a generalization, rather than a replacement. To this end, we propose two extensions: First, we introduce a CTF-based method which is more flexible with respect to incorporation of additional a priori constraints, including in particular support: | R 2 \Ω = 0 outside some known bounded domain Ω ⊂ R 2 , and negativity: , − ≤ 0, which results from the positive definiteness of electron density. Formally, such a constrained CTF (cCTF) reconstruction simply amounts to restricting the search set of the minimization problem in (4): Second, and more importantly, we want to omit the weak-object assumption underlying to CTF phase retrieval, by replacing the linearized model L with the general nonlinear forward map N . Accordingly, we aim to reconstruct by solving the following minimization problem: While the proposed generalizations are hence conceptually straightforward, they bring about significant challenges on the practical, algorithmic side: contrary to (4), the minimizer in the constrained variant (6) can no longer be computed by a single-step FFT-based filtering operation as given by (4b). Moreover, the minimization problem in (7) even becomes non-convex, so that a global solution strategy is completely beyond reach. In both cases, we therefore have to resort to iterative algorithms to compute the sought minimizers.
To achieve reasonably fast termination, we employ the following methods: 1. State-of-the-art optimizers: For cCTF we employ an accelerated variant [35] of the alternating direction method of multipliers (ADMM). For the nonlinear case we use a (projected) gradient-descent method with adaptive Barzilai-Borwein stepsizes, nonmonotone linesearch and a gradient-based stopping rule as described in [36].

Warm starting:
The NLTikh-algorithm is initialized by a linear CTF reconstruction obtained with the same parameters. This helps to avoid local minima associated with phase-wrapping and typically reduces the number of iterations required for convergence.
3. GPU computing: Our implementation automatically performs massively parallelized computations on a graphical processing unit (GPU) if such hardware is available.
Notably, all of these algorithmic tweaks act behind the scenes -from the user's perspective, our cCTF and NLTikh implementations are as easy to use as standard CTF phase retrieval. The code is fully publicly available as part of the HoloTomoToolbox [32,33].
Relation to existing methods: The proposed algorithms bear similarities to several previous work in the literature. In [23,34,37], gradient-descent schemes are proposed to iteratively minimize the distance to the hologram data (or rather to square-root-data √︁ in [37]) based on the nonlinear model, i.e. the first term on right-hand side of (7b). Moreover, the approach in [23] also includes warm-starting from a CTF-reconstruction and the algorithm in [37] allows to impose range-or support-constraints via projected gradient descent as in the current work. On the other hand, these previous methods both lack the ability to impose regularization as well as adaptive stepsizes to accelerate convergence, leading to as many as 10000 iterations for a single reconstruction in [37]. Perhaps the most closely related work to the present one is [22], where nonlinear Tikhonov regularization with a gradient-penalty (corresponding to the choice ( ) = constant · | | in (7b)) is proposed. However, no support-or range-constraints are enabled and the algorithm involves a computationally expensive line-search strategy. Also the regularized Newton method in [24] is similar to the present approach, but involves nested (outer) Newton-and (inner) conjugate-gradient-iterations that spoil the computational performance. In conclusion, we find that our NLTikh algorithm stands out from previous methods in three respects: (1) it is flexible in terms of incorporating constraints, (2) builds upon experience gained with CTF phase retrieval, in particular retaining the established regularization strategy, and (3) is algorithmically optimized to keep computational costs as low as possible.
Concerning our constrained CTF method, we note that similar ADMM-schemes have been proposed for linearized phase retrieval with total variation (TV) regularization [21,38]. While such a regularization strategy is somewhat more advanced than our approach, we note that it has been so far limited to the weak object regime due to the algorithmic challenges it brings about.

ADMM for constrained CTF
We solve (6) by splitting it into the uncoupled subproblems of the unconstrained CTF (4) on the one hand and the constraints on the other hand. This splitting enables us to use an augmented Lagrangian method, namely ADMM [39,40]. The resulting iterations read Here, and are auxiliary variables that are initialized with zeros, > 0 is a stepsize parameter and Π is the projection onto the convex constraints-set from (6). If, for example, a negative-phase constraint is to be imposed, i.e. = { ∈ 2 (R 2 , R) : ≤ 0}, this projection is simply a pointwise minimum: Π ( ) = min(0, ). Note that the minimization in (8a) can be efficiently solved in closed form (8b), similar to the unconstrained CTF in (4b). For faster convergence, we use an accelerated variant of the basic ADMM-scheme in (8), given by Algorithm 8 in [35]. We stop the iterations automatically as soon as the relative value of so-called primal and dual residual (see [35]) both drop below a tolerance of 10 −3 .
Computational costs: Noting that the summand 2 =1 ( − / · ) · F ( − 1) in (8b) can be precomputed during the initialization of the algorithm, we see that the computation of the ADMM-update (8) essentially requires one forward-and one inverse Fourier transform per iteration, in addition to some less expensive pointwise arithmetic operations.

Projected gradient descent for nonlinear Tikhonov
In contrast to the linear CTF setting, no closed form, even for the case without constraints ( = R 2 ), exist for equation (7). However, Fréchet-differentiability of the nonlinear operator N allows us to compute the gradient of the nonlinear Tikhonov functional T NL in (7). As derived in appendix A, it can be computed as with the abbreviations := i − / , N , := |D (exp( ))| 2 and the adjoint Fréchet derivative We determine the minimizer of (7) using a projected gradient descent method. The iteration is a gradient descent step followed by the projection onto the constraints-set : Without constraints, this reduces to the well-known gradient descent method. Convergence of the algorithm crucially depends on the chosen stepsizes > 0: small typically lead to slow convergence, whereas overly large stepsizes may cause the method to fail to decrease the T NL ( ) altogether. We use the popular stepsize-rule proposed by Barzilai and Borwein [41], for even (12) ( ·, · : 2 -inner product) combined with a non-monotone line-search [42], as detailed in [36]. We note that, by construction, the NLTikh-method reduces to CTF-phase retrieval in the linear weak object limit → 0. On the other hand, a CTF-reconstruction is numerically both more stable and less expensive to compute (also in a constrained setting!). We exploit this fact by using a CTF-phase as an initial guess for the NLTikh-algorithm, i.e. 0 = cCTF .
The iterations are automatically stopped as soon as the (relative) residual gradient, drops below a threshold of 10 −3 . The reasoning is as follows: as the Tikhonov functional T NL is continuously differentiable, the sought minimum is characterized by grad T NL ( NLTikh ) = 0 and grad T NL ( ) must tend to zero when the iterates converge to NLTikh . In this sense, grad measures proximity to the Tikhonov minimizer, i.e. provides a convergence metric.

Stabilization of high frequencies:
The linear CTF model is diagonal in Fourier space, i.e. the reconstruction of a certain spatial frequency only depends on the Fourier-mode of the same frequency in the hologram data 1 , . . . , . In particular, this means that low-frequency data-errors may not cause high-frequency artifacts in the recontruction and vice-verser. Switching from the CTF-to the nonlinear model L → N does not preserve this property. Hence, low-frequency background variations in the holograms, as often caused by imperfect flat-field correction, may result in oscillatory artifacts when reconstructing nonlinearly. We find that this undesirable effect can be effectively suppressed by adding a third level to the regularization defined in (5):

14)
where denotes the aspect length of the detector and beyond-NA is large (default: beyond-NA = 2 ). As the naming suggests it, the modification enforces a higher penalization for spatial frequencies that scatter under angles exceeding the numerical aperture (NA) of the imaging system, as defined by the field-of-view covered by the detector. Note that the corresponding high-frequency Fourier-modes are thus not sufficiently represented in the acquired hologram data anyway so that these could not be stably recovered [43]. Thus, nothing is really lost by damping out these frequencies, while stability is gained on the algorithmic side.
Computational costs: Up to computationally inexpensive pointwise arithmetic operations, evaluating one projected gradient-descent step (11) essentially requires 2 forward-and backward Fresnel propagations according to equations (9) and (10). Consequently, the computational costs of an NLTikh-iteration is comparable to that of typical alternating-projection iterations. Note that neither the stepsize-rule (12) nor checking the stopping-criterion (13) requires significant computational effort, as grad T NL ( ) must be computed in every iteration anyway.

Performance tests and comparison to existing methods
The algorithms were tested with NFH data measured at the Göttingen instrument for nano imaging with x-rays (GINIX), installed at the P10 beamline of the PETRA III storage ring (DESY, Hamburg), at a photon energy of 8 keV (Fig.1, Fig.4) and 13.8 keV (Fig.3). An x-ray waveguide was used as a spatial and coherence filter of the undulator radiation which was focused onto the waveguide by a Kirkpatrick-Baez (KB) mirror [44]. The exit of the waveguide forms a quasi-point source for holographic illumination. The geometrically magnified holograms of size 2048 × 2048 pixels were recorded by two different fiber coupled scintillator sCMOS detectors (Photonic Science and Hamamatsu) each with pixel sizes of 6.5 m, placed approximately at 02 ≈ 5 m behind the KB focus. All holograms considered as test data in this work are given by dark-and flat-field-corrected detector images. The experimental setup parameters for the test cases in the subsequent sections are detailed in

Test structure of polysysterene microspheres
First, we demonstrate the performance of nonlinear phase retrieval using a test object of polystyrene spheres with diameter 15 m [25]. Holograms were acquired at four different defocus distances (see table 1 for detailed parameters), as in standard linearized (CTF-based) phase retrieval to compensate for minima in the CTF [10,18]. The data set has been previously considered to demonstrate the advantage of iterative phase retrieval with an alternating projections (AP) algorithm over CTF [25]. Figure 1 shows a comparison of reconstructions obtained by (a) standard (unconstrained) CTF, the proposed NLTikh algorithm (subfigures (b), (c), (d)) as well as (e) the AP-result from [25]. The same parameters, / = 0 (pure phase object) and low-freq = 10 −3 , high-freq = 10 −1 , were used for all CTF-and NLTikh-reconstructions. Table 2 gives an overview over the observed computation times for the different reconstrution methods.
As noted already in [25], the CTF result shows significant artifacts from the large phase gradients introduced by the spheres, whereas NLTikh and AP yield visually artifact-free images of similar quality. This is confirmed by the line-scans in Fig. 1(f) where NLTikh and AP both show good agreement with the theoretically predicted phase profiles for 15 m-sized polystyrene spheres (assuming the literature value = 3.673 · 10 −6 [45] for the refractive index of C 8 H 8 with density 1.05 g/cm 3 ). However, while the AP-reconstruction took 2500 iterations to complete [25], corresponding to minutes of computations on typical GPUs, the NLTikh algorithm requires only about 50 iterations to converge, thus completing within about two seconds on our benchmark workstation. Moreover, Fig. 1(d) demonstrates that even an NLTikh-reconstruction from a single hologram results in an acceptable image quality, apart from an expectable increase of noise compared to the four distance setting. Finally, a comparison of Figs. 1 (b) and (c) shows that the ability of NLTikh to impose negativity of the reconstructed phases helps to effectively suppress low-frequency variations of the background and thus achieve quantitatively correct phaseswithout a relevant increase in computational costs.  Table 2. Computation times for phase reconstruction of the polystyrene-microspheres data set using different algorithms. See Fig. 1(a)-(e) for the corresponding reconstructed images, respectively. All GPU-computations were performed in double-precision floating-point arithmetics on an NVIDIA Quadro RTX 6000 using Matlab R2020a.

Convergence experiments
For an exemplary assessment of convergence of the NLTikh algorithm, we consider the reconstruction setting in Fig. 1 (c), that includes both constraints and nonlinearity. First, we compute a (supposedly) fully converged reference solution ref by performing 1000 iterations of the proposed projected gradient descent scheme. We then compute the gradient residual grad defined in (13) as well as the residual of the Tikhonov functional relative to ref : value ( ) =   (13), respectively, for the reconstruction setting in Fig. 1 (c). Blue lines: the proposed algorithm. Red: same as (a) but without warm start, i.e. the algorithm is initialized with 0 = 0 instead of a CTF-reconstruction. Yellow: same as (a) but with constant-instead of Barzilai-Borwein stepsizes . By default, the iterations are stopped when grad drops below 10 −3 , as marked by the black-dashed line.
The convergence test is carried out both for the proposed Barzilai-Borwein stepsize-rule as well as for a constant stepsize for comparison. Moreover, to motivate our warm-starting approach, we compare reconstructions initialized with a CTF-reconstruction (as proposed) to the convergence behavior if the algorithm is initialized with zeros, 0 = 0. Results for value and grad are plotted in Figs. 2 (a) and (b), respectively. The following observations can be made: • Barzilai-Borwein stepsizes yield considerably faster convergence than constant stepsizes.
• While the gradient-residual grad ( ) may vary strongly between subsequent iterations, its overall decrease is comparable to that of value ( ) for all three test cases. This emphasizes the validity of the proposed stopping criterion based on the metric grad .
• Although the difference levels out asymptotically, the warm-start strategy accelerates initial convergence. In particular, it takes only 47 iterations to reach the stopping criterion grad ( ) < 10 −3 with warm start compared to 105 iterations if initialized with 0 = 0.
Overall, the convergence experiments thus show that our algorithmic optimizations to accelerate the projected gradient-descent scheme indeed pay off in the considered example.

Golgi-Cox stained hippocampus
Next, we tested the nonlinear Tikhonov algorithm on a biological sample, notably a Golgi-Cox stained brain slice of mouse hippocampus embedded in EPON resin, which was already investigated before by CTF [46,47] and the AP-algorithm [25]. The imaging setup is detailed in table 1. The Golgi-Cox stain is a classical stain for brain sections, which visualizes individual neurons by seemingly stochastic all-or-nothing events based on precipitation of silver along the neuron's neurites. The metalized neurites then stand out in strong contrast with respect to the neuropil infiltrated by EPON resin, resulting in strongly varying phases. Hence, the assumption neither of a weak object, nor of a homogeneous (single material) object are strictly met. Fig. 3 shows reconstructions of the four-hologram data set obtained by (a) CTF phase retrieval, and (b) the nonlinear Tikhonov algorithm. In both cases, the same regularization parameters ( low-freq = 10 −3 , high-freq = 10 −1 ) and / = 0.1 were used without additional constraints. Visual comparison of the reconstructed projections shows significantly less artifacts for NLTikh compared to CTF. In particular, the flawful wavy background at high spatial frequency, visible as oscillations of the CTF line-scan in Fig. 3(c), is absent. Furthermore, the NLTikh-reconstruction exhibits a much larger range of reconstructed phases compared to CTF, which is physically plausible for the high-contrast metal-stained neurons.

Sedimented silicon spheres in a capillary
For a final test case, we investigate a setting which also often encountered in practice, where a strong phase gradient at the edges of the object has to be imaged along with significantly weaker phase-variations in the interior. Examples are many: glass capillaries with tissue punches, single particles in air, trimmed specimens of material science. Here we consider silicon spheres of diameter 2 m in a capillary. The tomographic data set consists of 726 holograms acquired at a single source-to-sample distance (Fresnel number = 6.499 · 10 −4 ) and equidistant incident angles covering a range of 180 degrees, see table 1 for details. It has been recorded as part of a sedimentation experiment where the 3D-motion of the single spheres is tracked over time [48]. The considered tomogram shows the final, fully sedimented state of the sample.
We reconstruct the holograms for each incidence angle using CTF and NLTikh, imposing the / -ratio of the glass capillary (SiO 2 ) at a photon energy of 8 keV, / = 0.0135 according to [45], as well as negativity of the recovered phases. Regularization parameters are low-freq = 10 −5 and low-freq = 10 −1 . Results are shown in Fig. 4. The capillary has an approximately square cross-section. Accordingly, the severeness of the phase gradients at the capillary-boundary depends on the incidence direction of the x-rays: when the latter is skew with respect to the sides of the capillary, only moderate phase-variations occur, whereas strongly varying phases result in the case of (almost) parallel incidence. Fig. 4 (a), (b) and (c), (d) shows exemplary projections reconstructed for skew and parallel settings, respectively. We observe that the CTF-reconstructions suffer from fringe-artifacts in both cases, which also distort the sphere-structures in the interior of the capillary. For NLTikh, such artifacts still occur in the parallel-incidence setting, whereas the skew reconstruction is practically artifact-free -despite the strong total phase-shifts up to ≈ 20 rad. As revealed by the tomographic slices (computed by applying filtered back-projection to the recontructed phase images for all tomographic angles)) in Fig. 4 (e), (f), our nonlinear NLTikh method overall achieves an improved, yet still imperfect reconstruction quality compared to CTF, yielding a clearer but still not artifact-free image of the silicon spheres' packing in the capillary.

Summary and Outlook
In summary, we have presented a nonlinear Tikhonov phase retrieval algorithm for near-field holography (NFH), to cope with optically non-weak objects beyond the scope of the established linear CTF model. Since the NLTikh approach comprises standard CTF phase retrieval as a limiting case, it can be regarded as a nonlinear generalization of the CTF approach. We have validated NLTikh for experimental data, including test objects and 'real samples', recorded under relevant conditions of nanoscale holographic x-ray imaging, both in 2D and in a tomographic 3D setting. As expected by the algorithmic design, which is based on the full nonlinear model of NFH, the method is found capable to reconstruct objects with moderately strongly varying phase shifts and outperforms CTF-based methods in terms of image-quality in such settings. As a further option, NLTikh enables imposing additional support and range constraints -like the proposed constrained CTF (cCTF) algorithm for the weak object regime. This is of particular interest if hologram data is available only for a single distance, where imposing additional a priori knowledge may compensate for the lack of data richness [30,31]. We emphasize that both the ability to reconstruct optically non-weak objects and to exploit additional a priori constraints are also provided by previously considered methods, such as alternating projection schemes (AP). The distinctive feature of our NLTikh algorithm, however, lies in the demonstrated superior numerical efficiency, rendering nonlinear, constrained image-reconstruction applicable for large-scale tomographic data sets with thousands of high-resolution holograms to be processed.
Despite the merits of the presented approach, one should also note its limitations. While not being strictly limited to the weak object regime like CTF, NLTikh still relies on linearity to a certain degree: it uses a CTF-reconstruction as an initial guess and the iterations are based on derivatives of the nonlinear model, i.e. on local linear approximations. Due to non-convexity, convergence is thus only guaranteed if the true phase is sufficiently close to the linear CTF-result, i.e. nonlinearity must not be too strong. For very strong objects, as considered in section 4.4, the initial guess may be too far off for the gradient-descent scheme to find the global minimum of the Tikhonov functional. In such a setting, AP and related methods like RAAR [49] may perform significantly better than NLTikh, as they are originally designed for highly non-convex imagereconstruction problems like CDI. A second limitation of our algorithm lies in the assumption of a homogeneous (single material) object. Like the generalizations of CTF phase retrieval made in this work, lifting this assumption is conceptually simple: just optimize the Tikhonov functional in (4), (6) or (7) for both parameters and , omitting the coupling constant / . Yet, achieving accurate image reconstructions in a numerically stable and efficient manner by such an approach is expected to constitute a severe algorithmic challenge, that might be addressed in future research. Owing to the advent of energy-resolving X-ray detectors, another interesting research direction might be to extend NLTikh to settings with holograms acquired at different X-ray energies, similar to what has been demonstrated in the direct-contrast regime [50,51].

A. Derivation of the gradient
The nonlinear forward model N , in (9) is Fréchet-differentiable with derivative (see [24]) In order to compute the adjoint derivative, we consider the 2 -inner product of N , [ ] ( ) with an intensity for arbitrary real-valued functions , , and : where we used that D is unitary as well as real-valuedness of , , and . By the defining property of the adjoint, we also have that Since (17) and (18) Now we proceed to computing the gradient of the nonlinear Tikhonov functional T NL from (7). Noting that the Fréchet-derivative of ( ) := 2 = , is given by [ ] (ℎ) = 2 , ℎ and that T NL ( ) = =1 (N ((i − / ) ) − ) + ( 1/2 · F ( )), we obtain by application of the sum-and chain-rule for Fréchet-derivatives In the final step, we used that ↦ → 1/2 · F ( ) is linear with adjoint ↦ → F −1 ( 1/2 · ) together with linearity of the inner product. Since equation (20) holds for arbitrary real-valued ℎ, it follows that grad T NL is given by the expression in (9).

Funding
The work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through grant SFB 1456-C03.