3D X-Ray Imaging of Continuous Objects beyond the Depth of Focus Limit

X-ray ptychography is becoming the standard method for sub-30 nm imaging of thick extended samples. Available algorithms and computing power have traditionally restricted sample reconstruction to 2D slices. We build on recent progress in optimization algorithms and high performance computing to solve the ptychographic phase retrieval problem directly in 3D. Our approach addresses samples that do not fit entirely within the depth of focus of the imaging system. Such samples pose additional challenges because of internal diffraction effects within the sample. We demonstrate our approach on a computational sample modeled with 17 million complex variables.


INTRODUCTION
Over the entire span of the electromagnetic spectrum, x-rays offer a unique combination of nanometer wavelength to enable high spatial resolution imaging, large penetration in millimeter-scale specimens, and low values of plural and inelastic scattering to enable straightforward image interpretation and quantification [1,2]. A variety of lens-based and lensless imaging methods have demonstrated 2D spatial resolution better than 20 nm [3,4]. Among these, ptychography [5,6] offers a unique combination of having a spatial resolution determined not by optics but by maximum detected scattering angle, while displaying robustness in phase retrieval for non-isolated objects [7]. In ptychography, a finite-sized x-ray probe (coherent beam spot) is used to illuminate the specimen at multiple overlapping probe positions, while recording the far-field diffraction intensity corresponding to each probe position. The resulting data redundancy allows one to recover both the object and the probe. One can even reconstruct multiple probe function modes to account for x-ray beam partial coherence [8], sample vibration [9], and continuously moving illumination [10][11][12][13]. Ptychography has been used to image thin circuit layers through 300 μm of silicon at 12 nm resolution [14], sub-10 nm resolution has been achieved with thinner specimens [15][16][17], and sub-wavelength resolution has been obtained using EUV light [18].
As the transverse resolution δ_r of x-ray imaging continues to be improved, a challenge lies ahead: the invalidity of the pure projection approximation. When using a circular lens, the depth of focus (DOF) of an image is given by [19]

DOF = 2λ/NA²,    (1)

while DOF = 5.2δ_r²/λ has been found to describe the depth of focus of x-ray ptychographic images [20]. For samples with an overall dimension less than the equivalent depth of focus of the imaging approach used, each view as the object is rotated can be treated as representing a pure projection for use in a standard tomographic reconstruction algorithm. However, 3D x-ray imaging experiments that are soon within reach will involve conditions where the pure projection approximation can no longer be applied. Therefore we consider here the case of near-wavelength resolution imaging of an extended object.
Normally ptychography reconstructs a single plane, which is the exit wave leaving a 2D object or the pure projection of complex optical modulation by a 3D object. With pure projections of 3D objects, single-slice ptychographic tomography (SSPT) allows one to recover the 3D object by first applying phase-unwrapping [21] to the projection images, and then using these projections in standard tomographic reconstruction algorithms [22,23]. However, one can deal with objects located at multiple planes along the x-ray beam direction by applying the object's optical modulation at each plane and propagating the resulting wavefield to the plane of the next object slice before propagating the final exit wave to the detector plane [24]. In the reconstruction process, the probe and object functions are updated at each plane, since the ptychographic update steps work to maximize the separability of object and probe. Using this multislice approach, images have been obtained of a few discrete, separated planes by using a single viewing direction [20,[25][26][27][28] or a limited range of tilt angles [29].
One approach to 3D imaging that builds on the above work was recently demonstrated [30]. This approach, termed multi-slice ptychographic tomography (MSPT), involved the use of multislice ptychography to reconstruct five depth planes of a 3D object at each viewing direction (with a span over all planes sufficient to encompass the 3D object and a spacing of planes fine enough so that one can ignore to some extent Fresnel diffraction within each plane). Since the object showed primarily phase contrast, the phase projections of the sample at each angle were calculated by pixelwise addition of the phases of the five slices. This again required a phase unwrapping process to yield a pure projection image for standard tomographic reconstruction.
We demonstrate here a different approach to the reconstruction of 3D images of extended, complex objects. Rather than seeking the solution of several object planes from each viewing direction separately before combining calculated pure projections in a standard tomographic approach, we consider the totality of the 3D object in each update step. To do so, we simulate the propagation of probe function illumination waves at various probe positions and incident angles through a present guess of the 3D object so as to produce a set of assumed intensity patterns to be recorded on a far-field detector; we then adjust the guess of the object so as to minimize the difference between the actual detector-plane Fourier magnitudes and our present guess of the same. Such an approach has been used with learning algorithms to guide the object updates [31,32], as well as with the imposition of a sparsity regularizer [33]. In our case, we use a newly developed proximal alternating linearized minimization algorithm for finding the object, as will be discussed below. Our algorithmic approach exploits a parallelized implementation that can address the additional nonlinearities and computational complexity introduced with the fully propagated model, and the capabilities of high performance computing. Our approach requires no phase unwrapping because the phase shift per 3D voxel is always small, and because it is based on a forward model rather than backwards propagation of the wavefield through the object. We term our approach multislice optimized object recovery, or MOOR.
The use of multislice propagation to carry out the forward model calculation is well established. First introduced to interpret high-resolution electron microscopy data [34,35], the multislice method has been shown [36] to recreate a wide range of x-ray optical phenomena relevant to nanoscale imaging. These include grazing incidence reflectivity and wave propagation in arbitrarily thick transmission gratings that were previously understood only by using coupled-wave theories for simple, mathematically definable structures [37][38][39][40]. Objects that have refractive index boundaries within gridded voxels can be treated by filling the voxel with a weighted sum of materials [36]. Thus, multislice propagation releases one from the limit of considering only those objects for which the Born approximation applies, or objects that are constrained to be within the effective depth of focus of the imaging scheme employed.

BEYOND THE PURE PROJECTION APPROXIMATION
The Rayleigh resolution criterion [41] uses the position of the first minimum of the Airy intensity distribution as the measure of the transverse resolution δ_r, giving

δ_r = 0.61λ/NA,    (2)

where λ is the wavelength and NA is the numerical aperture of a circular lens. We will also describe the angle of maximum scattering from the object by NA. For a circular lens, the axial intensity distribution I(z) along the focal distance [42] is

I(z) = I_0 [sin(u)/u]²,  with  u = πNA²z/(2λ),    (3)

and with a central stop of fractional radius b_f, the effective NA² is reduced by a factor of (1 − b_f²). The first minimum of the longitudinal intensity distribution of Eq. 3 occurs when u = π, giving a suggested longitudinal resolution of 2λ/NA². In fact, a more realistic criterion is to define the depth resolution δ_z as half this value, or δ_z = λ/NA², so that the DOF extends by ±δ_z about the central focus plane, giving DOF = 2δ_z. When combined with the Rayleigh resolution of Eq. 2, the DOF can be written as

DOF = 2δ_r²/(0.61²λ) ≈ 5.4δ_r²/λ,    (4)

which agrees well with experimental observations for absorption contrast imaging in a scanning transmission x-ray microscope [19] as well as with multiple-plane x-ray ptychography observations [20]. That is, as the transverse resolution approaches the x-ray wavelength, the DOF approaches the transverse resolution. Without an approach to go beyond the DOF limit, the natural combination of short wavelength (enabling high spatial resolution) and large penetration (enabling thick specimen tomography) intrinsic to imaging with x-rays cannot be fully exploited. This situation motivates the development of approaches that can work beyond the DOF limit.
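As a quick numeric check of these relations, the short sketch below (our own illustration, using the Rayleigh prefactor 0.61 and the λ = 0.25 nm, δ_r = 1 nm values used later in this paper) confirms that eliminating NA between the transverse and depth resolution expressions reproduces the ≈5.4δ_r²/λ form of Eq. 4:

```python
# Numeric check of the resolution / depth-of-focus relations.
# Rayleigh transverse resolution: delta_r = 0.61 * lam / NA   (Eq. 2)
# Depth resolution:               delta_z = lam / NA**2, DOF = 2 * delta_z
# Eliminating NA gives DOF = 2 * delta_r**2 / (0.61**2 * lam)  (Eq. 4)

def rayleigh_na(lam, delta_r):
    # NA needed to reach transverse resolution delta_r (Eq. 2 inverted)
    return 0.61 * lam / delta_r

def dof_from_na(lam, na):
    # DOF = 2 * delta_z with delta_z = lam / NA**2
    return 2.0 * lam / na**2

def dof_from_resolution(lam, delta_r):
    # Same DOF expressed in terms of delta_r; prefactor 2/0.61**2 = 5.37...
    return 2.0 * delta_r**2 / (0.61**2 * lam)

lam = 0.25e-9       # 5 keV x rays (wavelength in meters)
delta_r = 1.0e-9    # 1 nm target transverse resolution
na = rayleigh_na(lam, delta_r)
# Both routes to the DOF must agree
assert abs(dof_from_na(lam, na) - dof_from_resolution(lam, delta_r)) < 1e-15
print(f"DOF = {dof_from_resolution(lam, delta_r) * 1e9:.1f} nm")  # prints DOF = 21.5 nm
```

The result of about 22 nm for 1 nm resolution at λ = 0.25 nm is the in-object depth of focus quoted in the demonstration below.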
Ptychography currently allows for the solution of the phase problem from a thin object or from discrete planes. However, the nature of diffraction (that it is jointly contributed to by materials throughout the entire depth of the sample) implies that the exiting wave contains 3D information about the whole object, which inspires an iterative optimization-based solution for the object unknowns. This strategy requires a forward model to propagate the probe wave through the object in each iteration, which can be implemented by using a multislice approach. Multislice propagation decomposes the object into a number of thin layers with thickness Δz along the beam axis. For the jth layer, the incident wavefront ψ_j is modified by the well-tabulated [43] refractive index n = 1 − δ − iβ of the material in that layer as

ψ'_j = ψ_j exp[−ik(n_j − 1)Δz] = ψ_j exp(ikδ_jΔz) exp(−kβ_jΔz),

where k = 2π/λ. It is then free-space propagated to the next slice. For this step, the Fresnel diffraction integral can be used and implemented as the convolution between the wavefront and a Fresnel kernel, so that the wavefront at the (j + 1)-th layer can be expressed as

ψ_{j+1} = ℱ⁻¹{ ℱ[ψ'_j] · exp[−iπλΔz(u² + v²)] },

where ℱ denotes the Fourier transform operator and (u, v) are spatial frequencies in the slice plane. Propagation from the exit plane to the detector is usually over a long distance L satisfying L ≫ a²/λ, where a is the wavefield extent. In such a scenario, the Fraunhofer approximation applies, allowing the wavefield at the detector plane to be written simply as the Fourier transform of the exit wave. If the object is divided into s layers, the detector wavefront ψ_d will then be

ψ_d = ℱ[ψ'_s].

Together, the entire forward model is

ψ_d = ℱ{ O_s P_Δz O_{s−1} ⋯ P_Δz O_1 ψ_1 },

where O_j denotes the modulation by layer j and P_Δz the free-space propagation over Δz. This converts the inherently nonlinear problem of 3D diffraction into a set of nonlinear matrix equations. One can carry out multislice propagation with a layer thickness equal to the transverse pixel size, or can reduce computational steps by increasing the slice thickness to be a fraction of the DOF of Eq. 4.
A good strategy is to use a voxel size that is a fraction (e.g., a tenth) of the desired spatial resolution, so as to avoid too coarse a grid, and a slice thickness that is a fraction (e.g., a tenth) of the DOF of Eq. 4; these choices have been shown in simulations [36] to agree well with the asymptotic limit as one goes to finer voxel sizes and slice thicknesses.
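The multislice forward model described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the parallel implementation used in this work; the function names and the angular-spectrum form of the Fresnel kernel are our choices here:

```python
import numpy as np

def multislice_exit_wave(probe, delta, beta, lam, dz, px):
    """Propagate a 2D probe through a 3D object using the multislice method.

    probe : (l, l) complex incident wavefront
    delta, beta : (s, l, l) refractive index decrements, slice by slice
    lam : wavelength; dz : slice thickness; px : transverse pixel size
    (all lengths in the same units)
    """
    k = 2.0 * np.pi / lam
    # Fresnel transfer function (angular-spectrum form) for one slice step
    fx = np.fft.fftfreq(probe.shape[0], d=px)
    fy = np.fft.fftfreq(probe.shape[1], d=px)
    H = np.exp(-1j * np.pi * lam * dz * (fx[:, None]**2 + fy[None, :]**2))

    psi = probe.astype(complex)
    for d_j, b_j in zip(delta, beta):
        # modulation by slice j: phase shift from delta, absorption from beta
        psi = psi * np.exp(1j * k * dz * d_j - k * dz * b_j)
        # free-space propagation to the next slice (convolution via FFT)
        psi = np.fft.ifft2(np.fft.fft2(psi) * H)
    return psi

def detector_intensity(exit_wave):
    # Fraunhofer far field: intensity is the squared magnitude of the
    # Fourier transform of the exit wave
    return np.abs(np.fft.fftshift(np.fft.fft2(exit_wave)))**2
```

With δ = β = 0 the object drops out and the probe simply free-space propagates; since the transfer function is unimodular, the total power of the wavefield is conserved, which is a convenient sanity check.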

OPTIMIZATION METHODOLOGY
We discretize the object domain with a 3D regular grid of dimension n × n × n and discretize the probe domain with a 2D regular grid of dimension l × l; the real-valued measurements are assumed to lie on a 2D regular grid of size l × l. For ease of representation, we employ vector notation and denote the object variable by y ∈ ℂ^(n³), the probe variable by x ∈ ℂ^(l²), and each of the m scan data by d_j ∈ ℝ₊^(l²). The variable y is related to β and δ by

y_v = exp[ik(δ_v + iβ_v)Δz] = exp(ikδ_vΔz) exp(−kβ_vΔz)

for each voxel v, so that y holds the per-voxel complex modulation applied in the multislice forward model. We choose to solve for y to avoid unnecessary nonlinear transformation during the inversion.
We let S_{q,j} ∈ ℝ^(l²×n³) denote the linear operator that samples the qth propagation plane in the jth scan of the object. In this notation, the forward model generating scan j is

F_j(x, y) = ℱ [ Π_{q=1…s} C diag(S_{q,j} y) ] x,    (11)

where C denotes the convolution (free-space propagation) operator and the product is in the order Π_{q=1…s} a_q = a_s a_{s−1} ⋯ a_1. We perform the inversion by solving the constrained minimization problem

min_{x∈X, y∈Y}  Σ_{j=1…m} ‖ |F_j(x, y)| − d_j ‖₂² + κ TV₃(y),    (12)

where the constraint sets are given by

X = {x ∈ ℂ^(l²) : |x_w| ≤ v_x for all w},   Y = {y ∈ ℂ^(n³) : |y_v| ≤ v_y for all v}.

The constants v_x and v_y are used to define a compact search space and provide bounds on the magnitudes of the probe and object functions. In practice, we imagine that these values are set in a way that the minimization of Eq. 12 is not expected to result in values of x and y that lie on the boundary of the sets X and Y, respectively.
The first term of the objective function in Eq. 12 is a generalization of the one used for 2D ptychography [44] and represents the data mismatch based on the forward model in Eq. 11. The second term is composed of a scalar κ > 0 and a 3D total variation regularization term, which has proven highly effective in 3D image reconstruction [33]. This term is the 1-norm of the finite-difference approximation of the gradient of the variable y. That is, TV₃(y) = ‖∂_h y‖₁, where ∂_h is the finite-difference approximation of the gradient computed using the difference between neighboring voxels in each of the three spatial directions. We use the proximal alternating linearized minimization algorithm [45] to solve the minimization problem in Eq. 12. Our algorithm is summarized in Algorithm 1 and uses projections and proximal operators as in [44,45]. The jth component of the projection onto X is

[P_X(x)]_j = x_j if |x_j| ≤ v_x,  and  v_x x_j/|x_j| otherwise,

and the jth block (of size l²) of the projection onto the measurement magnitude set M = {z : |z_j| = d_j, j = 1, …, m} is given componentwise by

[P_M(z)]_{j,w} = d_{j,w} z_{j,w}/|z_{j,w}| if z_{j,w} ≠ 0,  and  d_{j,w} otherwise,

where w = 1, …, l². The computation of the proximal operator

prox_{α_y}(κTV₃)(y) = arg min_{v∈Y} { κTV₃(v) + (1/(2α_y)) ‖v − y‖₂² }

can be performed efficiently by fast gradient projection [46].
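The projection and update steps above can be illustrated with a minimal NumPy sketch. The helper names are hypothetical; the gradient and proximal routines passed to `palm_iteration` are placeholders for the analytical gradients and the fast-gradient-projection TV solver referenced in the text, and no line search or inertial acceleration is shown:

```python
import numpy as np

def project_X(x, v_x):
    # Componentwise projection onto {x : |x_j| <= v_x}: rescale any
    # component whose magnitude exceeds the bound, preserving its phase.
    mag = np.abs(x)
    scale = np.where(mag > v_x, v_x / np.maximum(mag, 1e-30), 1.0)
    return x * scale

def project_M(z, d):
    # Projection onto the magnitude set {z : |z| = d}: keep the phase of
    # z, replace its magnitude by the measured amplitude d; where z = 0
    # the projection is taken as d itself.
    mag = np.abs(z)
    return np.where(mag > 0, d * z / np.maximum(mag, 1e-30), d.astype(complex))

def palm_iteration(x, y, grad_x, grad_y, prox_tv, proj_X, proj_Y,
                   alpha_x, alpha_y, kappa):
    # Schematic of one PALM outer iteration for Eq. 12:
    # 1) linearized probe update, then projection onto X
    x = proj_X(x - alpha_x * grad_x(x, y))
    # 2) linearized object update, then a TV proximal step and
    #    projection onto Y (shown sequentially for simplicity)
    y = prox_tv(y - alpha_y * grad_y(x, y), alpha_y * kappa)
    y = proj_Y(y)
    return x, y
```

Applying the TV proximal step and the projection onto Y one after the other, as done here, is a simplification of the combined proximal operator described in the text; it is meant only to show the alternating block structure of Algorithm 1.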
Because of the nonlinearity inherent to the forward model in Eq. 11, updating the stepsize parameters in each iteration (e.g., by a line search procedure, as in our implementation) can significantly benefit overall performance. The convergence of Algorithm 1 can be further accelerated using an inertial version of the proximal alternating linearized minimization (iPALM) [47], which our implementation exploits. The evaluation of the analytical partial gradients is the main computational bottleneck of Algorithm 1; however, significant computational savings are possible by careful ordering of the terms. Furthermore, the independence of scans, as well as the spatial structure of probe and object functions, can be exploited in a parallel implementation of Algorithm 1 to enable large-scale solutions such as that considered in our demonstration.

Algorithm 1. A proximal alternating linearized minimization algorithm for solving Eq. 12.

Figure 2 illustrates the setup used for our simulated experiment. One can increase the throughput of x-ray ptychography by using small focused beams and small-pixel-count detector arrays with high frame rates [48], so this is the configuration we selected for our simulation. We simulated a Gaussian focus spot of 5 keV (λ = 0.25 nm) x rays with 14 nm FWHM probe size, represented on an l × l = 72 × 72 pixel grid with 1 nm pixel size. We considered an object described below (200 nm in extent, sampled on a 3D grid with n × n × n = 256 × 256 × 256) that is large enough to go beyond the pure projection approximation at the chosen spatial resolution (1 nm voxel size, since sub-wavelength resolution imaging has been demonstrated with ptychography [18]), forcing us to account for beam propagation effects within the specimen. In other words, the "depth of focus" of diffraction effects within the object is found from Eq. 4 to be 5.4(1 nm)²/(0.25 nm) ≈ 22 nm, while the probe has a depth of focus of 4,300 nm. With 1 nm voxel size, we were able to use nearest-neighbor sampling rather than interpolation for tomographic object rotation. Assuming Nyquist sampling of discrete Fourier transforms, an l × l = 72 × 72 pixel detector array was assumed to collect scattered signal going out to a maximum spatial frequency of 1/(2 · 1 nm) = 500 μm⁻¹. Figure 3 shows the simulated object shape and the incident illumination wavefront. The focus spot is considered to be at the center of the object, so the 2D illumination function shown in Fig. 3 is the exit wave of the probe backpropagated by half of the 3D object size, though this effect is small given the relatively large DOF of the probe compared to the object size.

DEMONSTRATION
The object was designed to resemble the thin capillary tubes that are widely used in laboratories as a carrier for cryo microscopy of cells in liquid suspension [50], and for applications such as electrophysiological probing or photocatalysis [51]. Composite structures like these are interesting subjects for x-ray 3D imaging because of the presence of features ranging from nanospheres with diameters of tens of nanometers, to quantum dots several nanometers in size, close to the wavelength of a soft x-ray probe beam. Further imaging challenges come from the capillary tube substrate, whose diameter can be orders of magnitude larger than that of the finer features, rendering the projection approximation no longer valid. The object was designed by using the open-source package XDesign [52], using x-ray refractive index data retrieved from a database within Xraylib [53]. This demonstration involved a complex object array with n³ = 256³ voxels, which means that we are inverting for approximately 17 million complex variables. We simulated the data acquisition scans in the following way. First, data from 23² probe positions on a single large plane were generated, with the center of each probe position at [12j, 12k] for j, k ∈ {0, …, 22}. We tested two rotation sampling schemes: one with 90 specimen rotations at 4° increments about a single axis over a 360° range, and another with 90 rotations at 2° increments over a 180° range. The total number of far-field diffraction patterns recorded for both sampling schemes was thus m = 90 · 23 · 23 = 47610. We note that the 2° rotation increment is not fine enough to meet the Crowther criterion for fine angular spacing in standard tomography [54]. Finally, while our simulation did not include the effects of noise due to finite photon statistics, we note the following:

•
One can arrive at a simple model for the required photon fluence to image a given feature type with a specified spatial resolution and signal-to-noise ratio [55]. In x-ray ptychography experiments, this estimate has been found to agree within 20% of what is required for imaging integrated circuit features [14] and frozen hydrated biological specimens [56] using iterative phase retrieval algorithms. This indicates that the algorithms themselves can be robust to noise, and work at the limit of the minimally required photon exposure.
• Simulation studies of the effects of photon noise in iterative algorithms for reconstructing coherent diffraction imaging data have shown that the achievable resolution is no different than conventional imaging with the same photon fluence [57]. This conclusion is consistent with other studies of the effects of photon noise on coherent diffraction imaging with X rays [58,59].
Therefore we expect that our MOOR approach should work within the limits of the resolution that is achievable given a particular feature contrast and photon fluence. For comparison with the (erroneous in this case) assumption that the object can be described by the pure projection approximation (PPA), we also reconstructed the simulated experimental dataset using the single slice ptychographic tomography (SSPT) approach [22,23] discussed in the introduction. We first reconstructed a set of 2D complex-valued projections using 2,000 iterations of a GPU implementation [60] of the ePIE algorithm [61].
To align all phase projections and account for linear and constant phase terms that typically plague independent ptychographic reconstructions of the same sample, we subtracted a best-fit plane from each reconstructed phase [23]. The aligned phase and magnitude projections were then tomographically reconstructed by using 200 iterations of the simultaneous iterative reconstruction tomography (SIRT) method as implemented in the TomoPy package [62]. We note that the employed ptychography and tomography algorithms assume a pure projection forward model, so their poor performance shown in Fig. 4 is expected in this case where the pure projection approximation does not apply. For example, propagation fringes at the boundaries of the sample are clearly visible, along with periodic artifacts caused by the relatively low number of projections (which can be suppressed by Fourier filtering [49]) and by ePIE's inability to retrieve the correct illumination function in the circumstances of our demonstration. In order to provide a fair comparison between the pure projection approximation (PPA) and our MOOR approach, we post-processed the result of the PPA reconstruction (we emphasize that no post-processing steps were used for the MOOR algorithm results) to give a secondary result that we have labeled "PPA+filtering." First, we used median filtering with a window length of 10 in order to suppress the line artifacts that can be observed in the PPA column of Fig. 4; we then thresholded the values of δ of the filtered image with a threshold of 10⁻⁵.
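The "PPA+filtering" post-processing amounts to a median filter followed by a threshold, and can be sketched as follows. Here `scipy.ndimage.median_filter` stands in for whichever median filter implementation was actually used, and the boundary handling is our assumption:

```python
import numpy as np
from scipy.ndimage import median_filter

def ppa_postprocess(delta_vol, window=10, threshold=1e-5):
    """Sketch of the post-processing applied to the PPA reconstruction:
    median filtering (window length 10 in the text) to suppress line
    artifacts, then thresholding of the recovered delta values."""
    filtered = median_filter(delta_vol, size=window)
    # zero out voxels whose delta falls below the threshold
    filtered[filtered < threshold] = 0.0
    return filtered
```

On an isolated single-voxel spike, the median filter removes the artifact entirely, which is the intended effect on the line artifacts discussed above.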
3D ptychographic experiments typically only sample 180° around the object. This is a valid procedure under the pure projection approximation: in this setup two samples taken from symmetrically opposed locations would, in theory, produce the exact same projection image and corresponding diffraction pattern. However, situations beyond the pure projection approximation (thus requiring the multislice approach) break this symmetry, and our numerical results show that in this case a sampling around all 360° is necessary. Figure 4 shows the comparison of four ptychographic reconstructions with the true object: MOOR using 180° sampling, MOOR using 360° sampling, a reconstruction from a pure projection approximation (PPA) and a post-processed PPA reconstruction. In addition, Fig. 5 shows isosurface renderings of the true object and the 360° and 180° MOOR reconstructions.
Together these results show that the "hidden" side of the object is poorly resolved when using MOOR with only 180° rotation sampling.
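For reference, the scan bookkeeping of the demonstration above (a 23 × 23 grid of probe positions with 12-pixel steps, repeated at 90 rotation angles per sampling scheme) can be reproduced directly:

```python
# Scan geometry of the simulated experiment: 23 x 23 probe positions at
# [12j, 12k] pixels, and 90 rotation angles per sampling scheme
# (4-degree steps over 360 degrees, or 2-degree steps over 180 degrees).
positions = [(12 * j, 12 * k) for j in range(23) for k in range(23)]
angles_360 = [4.0 * t for t in range(90)]   # 0, 4, ..., 356 degrees
angles_180 = [2.0 * t for t in range(90)]   # 0, 2, ..., 178 degrees

# Total far-field diffraction patterns per sampling scheme
m = len(angles_360) * len(positions)
print(m)  # prints 47610
```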

DISCUSSION AND SUMMARY
Single-slice ptychographic tomography (SSPT) is extremely successful [22,23,63,64] at obtaining high quality 3D x-ray reconstructions of objects to which the pure projection approximation applies. For objects with larger extent, the multislice ptychographic tomography (MSPT) approach [30] has shown promise for treating the object as being represented by several planes along each viewing direction, which can be reconstructed and added to yield approximations of a pure projection. With multislice optimized object recovery (MOOR), we take the approach of allowing for multiple slice propagation through thick objects along each viewing direction; an optimization approach is then used to recover a 3D object that is consistent with all diffraction measurements, and which requires no phase unwrapping. Although we have not made an explicit comparison with the MSPT approach, the fact that one obtains an improved reconstruction with a 360 degree rotation relative to a 180 degree rotation suggests that a simple summation of separately reconstructed planes may not be sufficient to accurately reconstruct objects of increasing size beyond the range where the pure projection approximation applies.
In this first demonstration of the MOOR approach, a proximal alternating linearized minimization algorithm is used to obtain rapid convergence for the case of ptychography (where a small coherent probe is scanned across the entire specimen projection field of view at each rotation angle). This approach is consistent with the expectations of SSPT and MSPT, where the iterative rules alternate between updates of the object and the probe [61,65]. However, our approach employs the capabilities of high-performance computing to carry out a 3D calculation with isotropic voxel size, and no need for phase unwrapping. It could also be used for other situations, such as probe overlap during rotation rather than at each viewing angle [66]. Our approach might also be useful in situations where the probe is larger than the object and hence the overlapping probe feature of ptychography is absent [67].
Although our approach is successful in recovering a computer-synthesized 3D object to which the pure projection approximation does not apply, there is room for future development. In this first demonstration, data parallelism was used so that each computing node owned and updated only a subset of the variables z_j (and associated scans d_j), but also held a copy of the full 3D object, which, in turn, created a synchronization point in the parallel computation for merging those copies after every update step. This allowed us to show that the approach works, but in an inefficient manner where data communication (between all the nodes, with each one calculating a different probe position, and the 3D data) limited computational throughput. This limitation could be overcome by exploiting domain decomposition (i.e., parallelism in the object/reconstruction domain) for partitioning the 3D object based on the scanning pattern and the associated sparse intersections of localized illumination probes with the overall object. One could then have each node hold the probe function, a local subregion of the object, and the measured Fourier magnitudes for rapid computation, with periodic synchronization happening only where the subregions overlap (we have used such an approach in standard ptychography reconstructions [60,68]). In addition, a stochastic asynchronous version of the proximal alternating linearized minimization algorithm [69] is likely to reduce the computational cost of the algorithm as well as the synchronization cost. We will also examine the capability of our algorithm with ptychography experiments on nanoscale objects, including nanofabricated devices and subcellular structures of eukaryotic cells. Technical challenges such as probe alignment (already addressed in 2D ptychography [70][71][72]) could be addressed when applying the algorithm to experimental results.

Fig. 1. Depth of focus as a function of x-ray photon energy for a variety of transverse resolution δ_r values. Also shown is the exp[−1] penetration depth μ⁻¹ for amorphous ice (for frozen hydrated biological specimens) and silicon (for microelectronics specimens) as proxies for the thickness range of x-ray imaging as a function of photon energy. As the transverse resolution δ_r in x-ray microscopy is improved to finer values, the DOF decreases with the square of the resolution improvement (Eq. 4), leading to a decrease in the size of a specimen that can be imaged within the projection approximation required by standard tomography.

Fig. 2. Experimental geometry used for our simulated experiment. A lens was assumed to produce a Gaussian coherent illumination probe of 14 nm full width at half maximum (FWHM), through which the object is scanned at each object rotation angle. A far-field diffraction pattern is then captured in each scan.

Fig. 3. Left: an isosurface rendering of the true object, with 200 nm length along the cone's axis. The simulated object was a conical glass capillary tube with embedded Ti nanospheres. Right: the as-designed Gaussian probe function with 14 nm FWHM size, as represented at the midpoint of the object region; brightness indicates the amplitude of the wave, and hue indicates the phase (see color wheel inset).

Fig. 4. Comparisons of two cuts (zonal, top row; meridional, bottom row) of the reconstructed phase-shifting part δ of the x-ray refractive index n = 1 − δ − iβ for different methods and data collections. These cuts show the reconstructed value of δ in the voxels at the selected planes. The true object is shown at left, followed by the MOOR reconstructions using 360° and then 180° rotation ranges. Finally, a reconstruction from the standard pure projection approximation (PPA) to ptychographic tomography is shown. As noted in the text, the pure projection approximation does not properly reproduce the object, and it also suffers from regular artifacts due to insufficient probe overlap for the reconstruction method used [49]. For this reason, we also show a column of "PPA+filtering" images with post-processing. These figures show that 360° MOOR gives a reconstructed image that represents the true object with a high degree of fidelity.

Fig. 5. Comparisons of 3D isosurface renderings of the true object and MOOR reconstructions using 360° and 180° object rotation. Rotation over a full 360° range gives improved results. The 3D object reconstructed using the pure projection approximation (PPA) is not shown, as its errors (Fig. 4) do not make it possible to obtain a clean isosurface rendering.