Dose requirements for resolving a given feature in an object by coherent x-ray diffraction imaging

We address the question of what dose is required to image an object by coherent x-ray diffraction imaging (CXDI) and to resolve a certain sub-unit or feature of that object. We show that a necessary condition for being able to resolve the detail is that the feature can be imaged by itself. The quality of the reconstruction of the feature is nearly independent of the surrounding, whether it is embedded in a larger object or not. This allows one to easily estimate the dose requirements for identifying atoms and clusters in larger objects. We illustrate the result by a numerical example and give an estimate for the dose required to resolve single atoms of different elemental species in CXDI experiments at free-electron laser and synchrotron radiation sources.


Introduction
Structure determination on the nanoscale is of the utmost importance to understand the properties of nanoscale objects, for example in biology, chemistry, physics, materials science and nanotechnology. X-ray microscopy is ideally suited for this purpose, as it can access the inner structures of an object without destructive sample preparation. In addition, it is well suited for in situ investigations, as x-rays can penetrate special sample environments, such as a chemical reactor or a pressure cell.
X-ray microscopes based on focusing or imaging are limited in resolution by the performance of the underlying x-ray optics, in particular their numerical aperture. Today, spatial resolutions down to a few tens of nanometers and below are reached with different x-ray optics [1]-[7]. While some of these optics have already reached their physical limits (e.g. total reflection mirrors and waveguides), the numerical aperture of most of them is technology-limited, leaving room for improving the performance of optics-based x-ray microscopes down to the nanometer level [8]-[10]. This requires, however, a substantial technological effort. Coherent x-ray diffraction imaging (CXDI) is a new microscopy technique that is not limited in spatial resolution by any optic. Therefore, it is the ideal candidate to bridge the resolution gap between a few tens of nanometers and the atomic level. First introduced in 1999 [11], it has developed quickly over the last decade [12]-[20], and spatial resolutions below 10 nm have been demonstrated [14,18,21,22].
In this technique, the sample is illuminated by coherent x-rays and its far-field diffraction pattern is recorded. From these data and with some additional knowledge about the sample, e.g. its finite support, the phases of an appropriately sampled far-field diffraction pattern can be recovered by iterative phase retrieval algorithms [23]. In this way, the shape and structure of a (non-periodic) object can be reconstructed. More precisely, the exit wave field behind the object is recovered. The main advantage of this technique is that no optic is needed for the image formation. As in crystallography, the resolution is limited by the maximal momentum transfer that can be captured by the detector, easily exceeding the numerical aperture of any x-ray optic. For this reason, this technique has been proposed for the imaging of clusters and molecules down to the atomic level [24,25].
In principle, if the diffraction pattern can be recorded on a large solid angle, the spatial resolution is only limited by the wavelength of the x-rays. In practice, however, the spatial resolution is limited by the largest scattering angle up to which a significant diffraction signal can be detected. This angle depends on the coherent dose applied to the sample. For biological samples, radiation damage limits that dose, allowing for spatial resolutions down to about 10 nm [26,27]. A way around this resolution limit is thought to be possible by applying a much higher dose in a very short time at an x-ray free-electron laser (XFEL) source [24]. First steps toward this goal have been taken at the free-electron laser source FLASH at DESY in Hamburg, Germany [28,29].
For radiation-hard objects, the limitation of the resolution arises from the limited coherent flux available from the source. The diffraction patterns for non-ordered, generic objects exhibit a strong fall-off in intensity toward large scattering vectors q. For length scales down to slightly below the nanometer level, the scattered intensity typically falls off with the fourth power of q [17,26,27,30]. This means that an increase in resolution by one order of magnitude requires an increase in coherent dose on the sample by four orders [21].
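The scaling argument above can be made concrete with a short sketch. It assumes an ideal power law for the scattered intensity, I ∝ q^(−n) with n = 4 as quoted in the text, so that keeping the photon count at the highest detected q constant requires the dose to grow with the n-th power of the resolution gain. The function name `dose_factor` is illustrative and does not appear in the paper.

```python
def dose_factor(resolution_gain: float, exponent: float = 4.0) -> float:
    """Factor by which the coherent dose must grow when the resolution
    improves by `resolution_gain`, assuming intensity ~ q**(-exponent)."""
    return resolution_gain ** exponent


# One order of magnitude in resolution costs four orders in dose.
print(dose_factor(10.0))  # 10000.0
# Doubling the resolution costs a factor of 16.
print(dose_factor(2.0))   # 16.0
```

Real samples deviate from the ideal exponent, so the result should be read as an order-of-magnitude estimate.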

The question of the dose required to reconstruct an object to a given resolution was addressed in several publications [18,26,27,30,31]. The main result is that at least a few photons must fall into a so-called Shannon pixel $\Delta\Omega_{\mathrm{obj}}$ that spans a solid angle just small enough to sample the diffraction pattern appropriately [26].
In this paper, we discuss the closely related question of what dose is required to image a certain feature or sub-unit of a non-periodic object to a given resolution. More specifically, the question is: what dose is required to detect a given atom or a molecular sub-unit at a given position in a molecule or solid? The main result is that such a sub-unit can only be detected in a larger sample if it can be detected by itself without the rest of the larger object around. The quality of the reconstruction of the feature is nearly independent of the surrounding, whether it is embedded in a larger object or not. This simplifies the calculations of the dose significantly, as only the dose requirement for the sub-unit of interest needs to be calculated. Under certain circumstances that are discussed in section 2, the requirement for detecting a certain feature in an object is equivalent to that of being able to reconstruct the object as described in [26]. In general, however, these requirements are not equivalent.
In section 2, a necessary condition to image a certain feature inside an object is derived. We illustrate the result by a numerical example in section 3, and calculate the minimal dose densities required to image single atoms of carbon and gold at XFEL or synchrotron radiation sources in section 4.

Imaging a feature as part of a larger object
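To first Born approximation, the radiation scattered from an object is given by

$$F(\mathbf{q}) = I_c \int_{\Delta\Omega} \frac{d\sigma}{d\Omega}(\mathbf{q})\, d\Omega, \tag{1}$$

where $F(\mathbf{q})$ is the flux into the solid angle $\Delta\Omega$ subtended by a detector pixel for a momentum transfer $\mathbf{q}$. $I_c$ is the incident coherent intensity and

$$\frac{d\sigma}{d\Omega}(\mathbf{q}) = r_0^2\, p(\vartheta) \left| \sum_i f_i(\mathbf{q})\, e^{i \mathbf{q} \cdot \mathbf{r}_i} \right|^2 \tag{2}$$

is the differential elastic scattering cross section of the sample. Here, $r_0$ is the classical electron radius, $p(\vartheta)$ is the polarization factor (for the scattering angle $\vartheta$ with respect to the direction of polarization), and the sum runs over all atoms $i$ in the object, with $f_i(\mathbf{q})$ being the form factor of the $i$th atom and $\mathbf{r}_i$ its position. We now subdivide the object into two parts: a given feature $d$ (for 'detail') under consideration and the rest $r$ of the object. The differential cross section (2) can now be written as

$$\frac{d\sigma}{d\Omega} = |\psi_d + \psi_r|^2 = |\psi_d|^2 + |\psi_r|^2 + 2\,\mathrm{Re}(\psi_d \psi_r^*), \tag{3}$$

with

$$\psi_d = r_0 \sqrt{p(\vartheta)} \sum_{i \in d} f_i(\mathbf{q})\, e^{i \mathbf{q} \cdot \mathbf{r}_i} \tag{4}$$

and $\psi_r$ defined analogously. This separation can be made for any two parts of any object and is rigorous. The total number of photons in a given detector pixel after the exposure time $t$ is

$$N(\mathbf{q}) = I_c \cdot t \int_{\Delta\Omega} \left[ |\psi_d|^2 + |\psi_r|^2 + 2\,\mathrm{Re}(\psi_d \psi_r^*) \right] d\Omega. \tag{5}$$

In order for the feature to be detectable, the contribution in equation (5) of the 'heterodyne' term proportional to $2\,\mathrm{Re}(\psi_d \psi_r^*)$ or the 'homodyne' term proportional to $|\psi_d|^2$ must exceed the shot noise level of the scattering from the contribution of the rest $|\psi_r|^2$ alone by some factor $\alpha$ (e.g. $\alpha = 5$ for the Rose criterion).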
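We first consider the heterodyne term: assuming Poissonian shot noise, the heterodyne term can be detected in the shot noise of the rest of the object if

$$I_c \cdot t \left| \int_{\Delta\Omega} 2\,\mathrm{Re}(\psi_d \psi_r^*)\, d\Omega \right| \ge \alpha \sqrt{I_c \cdot t \int_{\Delta\Omega} |\psi_r|^2\, d\Omega}. \tag{6}$$

A necessary condition for equation (6) to hold is

$$2\, I_c \cdot t \int_{\Delta\Omega} |\psi_d|\, |\psi_r|\, d\Omega \ge \alpha \sqrt{I_c \cdot t \int_{\Delta\Omega} |\psi_r|^2\, d\Omega}, \tag{7}$$

namely that the noise from the scattering signal of the rest of the object does not exceed an upper bound for the heterodyne term. Consider $\Delta\Omega \le \Delta\Omega_d$, where $\Delta\Omega_d$ is the Shannon pixel size of the detail taken by itself. Over these $\Delta\Omega$, the amplitude $|\psi_d|$ varies slowly and can be considered as nearly constant. Therefore, the integral on the left-hand side of equation (7) can be simplified:

$$\int_{\Delta\Omega} |\psi_d|\, |\psi_r|\, d\Omega \approx |\psi_d|\, \Delta\Omega\, \langle |\psi_r| \rangle, \tag{8}$$

using the abbreviation $\langle f \rangle = \frac{1}{\Delta\Omega} \int_{\Delta\Omega} f\, d\Omega$ for the average of a function over the pixel with size $\Delta\Omega$ at a given $\mathbf{q}$. With equation (8) and $\Delta\Omega = \Delta\Omega_d$, equation (7) can be squared to take the form

$$I_c \cdot t\, |\psi_d|^2\, \Delta\Omega_d \ge \frac{\alpha^2}{4}\, \frac{\langle |\psi_r|^2 \rangle}{\langle |\psi_r| \rangle^2}. \tag{9}$$

As $\langle |\psi_r|^2 \rangle \ge \langle |\psi_r| \rangle^2$ for any function $\psi_r$, the right-hand side of (9) is bounded from below, and a necessary condition for the feature to be observable inside the object as a result of the heterodyne term can be derived:

$$I_c \cdot t\, |\psi_d|^2\, \Delta\Omega_d \ge \frac{\alpha^2}{4}. \tag{10}$$

This condition is equivalent to the feature being reconstructable by itself [26], i.e. that the photon statistics in a Shannon pixel of the detail lies above a certain constant threshold, e.g. $\alpha^2/4 = 6.25$ in the case of using the Rose criterion [26]. Note that equation (10) describes a necessary condition. The actually required photon statistics can be higher by at least the factor $\langle |\psi_r|^2 \rangle / \langle |\psi_r| \rangle^2$. Information on the detail is encoded not only in this heterodyne part, but also in the homodyne term. For this one, a very similar estimation can be made:

$$I_c \cdot t\, |\psi_d|^2\, \Delta\Omega_d \ge \alpha^2\, \frac{\langle |\psi_r|^2 \rangle}{|\psi_d|^2}. \tag{11}$$

If the feature or detail is much smaller than the overall object, the condition in equation (11) is usually much more stringent than that in equation (10). In that case, the detail is predominantly encoded in the heterodyne term. The homodyne term could play a role in cases where the diffraction pattern of the rest of the object is very weak over the size of a Shannon pixel $\Delta\Omega_d$ of the detail. For generic objects, this is not to be expected.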
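The step from equation (9) to equation (10) can be checked numerically. The sketch below evaluates the factor ⟨|ψ_r|²⟩/⟨|ψ_r|⟩² for sampled values of ψ_r inside one Shannon pixel of the detail and multiplies it by α²/4. As an assumption that is not taken from the paper, the rest of the object is modelled once as a constant amplitude and once as fully developed speckle (circular complex Gaussian statistics). The names `jensen_factor` and `required_photons` are illustrative.

```python
import numpy as np


def jensen_factor(psi_r):
    """<|psi_r|^2> / <|psi_r|>^2 over the samples; always >= 1."""
    amp = np.abs(np.asarray(psi_r))
    return np.mean(amp**2) / np.mean(amp) ** 2


def required_photons(psi_r, alpha=5.0):
    """Right-hand side of equation (9): photons needed in one Shannon
    pixel of the detail for the heterodyne term to be detectable."""
    return alpha**2 / 4.0 * jensen_factor(psi_r)


# Constant |psi_r|: the bound of equation (10) is reached exactly.
flat = np.ones(1000)
print(required_photons(flat))  # 6.25

# Assumed speckle model: the factor approaches 4/pi, so somewhat more
# photons are needed than the lower bound of 6.25.
rng = np.random.default_rng(0)
speckle = rng.normal(size=100_000) + 1j * rng.normal(size=100_000)
print(required_photons(speckle))
```

The constant case reproduces the threshold α²/4 = 6.25 for the Rose criterion; any variation of |ψ_r| across the pixel only raises the requirement.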
We can now analyze under what conditions the feature or detail can be detected in a larger object and compare it to the condition to reconstruct the large object as discussed in [26]:

$$I_c \cdot t\, |\psi_{\mathrm{obj}}|^2\, \Delta\Omega_{\mathrm{obj}} \ge \frac{\alpha^2}{4}. \tag{12}$$

Here, $\Delta\Omega_{\mathrm{obj}}$ is the Shannon pixel size and $|\psi_{\mathrm{obj}}|^2 = d\sigma/d\Omega$ the scattering cross section of the large object. Comparing (10) and (12), the following cases can be distinguished.
(i) The feature is a strong scatterer in a relatively weakly scattering object. The feature taken by itself could be reconstructed more easily than the whole object and condition (12) is more stringent. If the object can be reconstructed, the feature or detail is well reconstructed, too.
(ii) The contrast of the feature is comparable to the average contrast in the object. Both conditions are equivalent. This case applies to binary test objects and to two-dimensional (2D) objects with an overall homogeneous density distribution.
(iii) The contrast of the feature is weaker than that of the rest of the object on average. In this case, the overall object can be reconstructed with a lower dose than the feature: the reconstruction does not represent the feature well. This situation arises, for example, when a small object is imaged together with much stronger scatterers, a weakly scattering material coexists with a strongly scattering one, or the object is 3D, with the feature being thin compared to the overall object thickness. In the extreme case, a weaker object remains completely invisible, such as, for example, the silicon nitride pyramid in [18].
Equation (10) can also be considered as a measure for the dynamic range in a reconstructed image. We are, for example, free to choose the feature as $\psi_d = \gamma \cdot \psi_{\mathrm{obj}}$, where $0 < \gamma < 1$ is the value of the smallest relative gray value that is supposed to be reconstructed. In this way, the question is answered as to what coherent dose density is needed to reconstruct the complex amplitudes behind the object with a relative accuracy of $\gamma$. In this case, $\Delta\Omega_d = \Delta\Omega_{\mathrm{obj}}$ and equation (10) becomes

$$I_c \cdot t\, |\psi_{\mathrm{obj}}|^2\, \Delta\Omega_{\mathrm{obj}} \ge \frac{\alpha^2}{4 \gamma^2}. \tag{13}$$

This means that, for example, if ten shades of gray are to be faithfully reconstructed ($\gamma = 0.1$), the required dose per Shannon pixel must be increased by a factor of 100. Furthermore, the scheme can also be applied to Fourier transform holography. In this case, the object may be considered as $\psi_d$ and the reference wave as $\psi_r$. Shintake [32] proposed the use of a strongly scattering gold particle to amplify the scattered signal from a biomolecule in single shot CXDI experiments. While, indeed, the scattered signal from the biomolecule can be significantly increased in this way, the signal-to-noise ratio to recover the biomolecule is not affected according to our previous considerations.
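The dynamic-range statement can be tabulated directly. The sketch below assumes the feature is chosen as a fraction γ of the object amplitude with the same Shannon pixel, so that the threshold of equation (10) scales as α²/(4γ²). The function name `photons_per_shannon_pixel` is illustrative.

```python
def photons_per_shannon_pixel(gamma: float, alpha: float = 5.0) -> float:
    """Photons required per Shannon pixel of the object so that a relative
    gray value gamma is resolved (alpha = 5 is the Rose criterion)."""
    return alpha**2 / (4.0 * gamma**2)


# Number of gray levels 1/gamma versus required photons per Shannon pixel.
for levels in (1, 2, 10, 100):
    gamma = 1.0 / levels
    print(f"{levels:4d} gray levels: {photons_per_shannon_pixel(gamma):.3g} photons")
```

Ten gray levels thus require about 625 photons per Shannon pixel instead of 6.25, the factor of 100 quoted in the text.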

Numerical examples
In the following, we illustrate the implications of equation (10) by a numerical simulation. We consider two test objects that both include the same small feature. Test object I is flat with a mainly 2D distribution of scatterers, while test object II is 3D. The transmission through these objects is shown in figures 1(a) and (c), respectively. The gray scale encodes the thickness of the material. Both objects are made of clusters of gold spheres together with a small gold test structure magnified in insets (b) and (d) of figure 1. The gold spheres are randomly distributed on a circular area of 140 nm diameter. The diameters of the spheres range from 2 to 24 nm. While the spheres in object I are distributed in two dimensions with only a slight overlap and a relatively low overall density, the spheres in object II are arranged in three dimensions, with significant overlap in the projection. The test feature $d$ has a size of approximately 15 nm and has structures with strong contrast on the nanometer scale. For simplicity, the test feature does not overlap with any other structures of the test objects in the projection. The objects were chosen to have no particular symmetries and to cover all length scales between 1 and 140 nm.
We simulate a coherent x-ray diffraction experiment at E = 8 keV, assuming an ideal flat illumination of the objects. The complex transmission functions of the objects are calculated using the complex refractive index $n = 1 - \delta + i\beta$. The transmitted field amplitude is given by

$$E_t(x, y) = E_i(x, y)\, \exp[-i k \delta\, d(x, y)]\, \exp[-k \beta\, d(x, y)], \tag{14}$$

where $E_i(x, y)$ is the incident wave field amplitude that is chosen constant for a flat homogeneous illumination, $\delta$ and $\beta$ are the decrement of the index of refraction and its imaginary part, respectively, $k = 2\pi/\lambda$ is the wave number, and $d(x, y)$ is the thickness of the material along the optical axis in the pixel at the position $(x, y)$. For gold at the x-ray energy of E = 8 keV, $\delta = 4.81 \times 10^{-5}$ and $\beta = 4.78 \times 10^{-6}$, where $\beta$ includes photo absorption and inelastic scattering. The far-field diffraction patterns were calculated using the Fourier transform of the object and introducing Poissonian noise in each pixel according to various incident coherent dose densities $I_c \cdot t$. (Here, the dose density is the number of photons incident on a unit area during the exposure.) In figures 2(a)-(d) and figures 3(a)-(d), the far-field diffraction patterns of test objects I and II, respectively, are given for different values of the dose density $I_c \cdot t$ in the range from $10^5$ to $10^8$ photons nm$^{-2}$. The noise visible in the diffraction patterns is pure shot noise, neglecting other types of noise that may be introduced by the detector. The detector has 512 × 512 pixels and the geometry is chosen in such a way as to cover a q range down to a real space pixel size of 1 × 1 nm$^2$, i.e. $q_{\max} = 3.14$ nm$^{-1}$ in the horizontal and vertical direction.

Figure 4 (caption fragment). The superior photon statistics of the thicker object II are clearly visible. The azimuthal average of the diffraction pattern of the single test feature alone is also shown.
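The simulation procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the paper: the far field is taken as a unitary discrete Fourier transform of the exit wave so that the total photon number is conserved, the wavelength follows from hc ≈ 1.2398 keV nm, and the test object is a single 20 nm gold sphere on a 64 × 64 grid rather than the objects of figure 1. The function names are hypothetical.

```python
import numpy as np

# Optical constants of gold at 8 keV as quoted in the text.
DELTA, BETA = 4.81e-5, 4.78e-6
WAVELENGTH_NM = 1.2398 / 8.0          # hc / E with hc ~ 1.2398 keV nm
K = 2.0 * np.pi / WAVELENGTH_NM       # wave number in nm^-1


def exit_wave(thickness_nm, e_in=1.0):
    """Transmitted field behind a projected thickness map (in nm)."""
    return e_in * np.exp(-1j * K * DELTA * thickness_nm) * np.exp(-K * BETA * thickness_nm)


def noisy_diffraction(thickness_nm, dose_density, pixel_nm=1.0, rng=None):
    """Far-field photon counts with Poisson noise for a dose density in
    photons per nm^2 (assumed normalisation: unitary FFT)."""
    rng = np.random.default_rng() if rng is None else rng
    photons_per_pixel = dose_density * pixel_nm**2
    field = np.sqrt(photons_per_pixel) * exit_wave(thickness_nm)
    far_field = np.fft.fftshift(np.fft.fft2(field, norm="ortho"))
    return rng.poisson(np.abs(far_field) ** 2)


# Hypothetical test object: projected thickness of a 20 nm gold sphere.
n = 64
y, x = np.indices((n, n)) - n // 2
thickness = 2.0 * np.sqrt(np.clip(10.0**2 - (x**2 + y**2), 0.0, None))
counts = noisy_diffraction(thickness, 1e6, rng=np.random.default_rng(0))
print(counts.shape, counts.sum())
```

The central pixel contains the unscattered beam; in an experiment it would be blocked by a beam stop, which this sketch does not model.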
From figures 2 and 3, it can already be seen that object II is a stronger scatterer than object I. The photon statistics of both objects I and II are also illustrated in figure 4, showing the azimuthal average of the diffraction patterns as a function of q for a coherent dose density of $10^6$ photons nm$^{-2}$. The intersections of the dashed line with the curves in figure 4 show the expected resolution limits for the two objects according to equation (12) and the Rose criterion. At the same dose, object II can thus be reconstructed to a higher resolution than object I according to equation (12) [26] or, stated differently, yields the same resolution as object I at a lower dose.
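An azimuthal average of the kind plotted in figure 4 can be computed as sketched below. The sketch assumes a diffraction pattern with the zero frequency at the centre of the array (as produced by `numpy.fft.fftshift`) and a q axis reaching ±q_max at the detector edge; the binning scheme and the name `azimuthal_average` are illustrative choices, not taken from the paper.

```python
import numpy as np


def azimuthal_average(pattern, q_max=np.pi, n_bins=32):
    """Mean of `pattern` in rings of constant |q|.

    Returns the bin centres (same unit as q_max) and the ring averages.
    Pixels with |q| >= q_max (detector corners) are ignored.
    """
    n_y, n_x = pattern.shape
    qy = np.fft.fftshift(np.fft.fftfreq(n_y)) * 2.0 * q_max
    qx = np.fft.fftshift(np.fft.fftfreq(n_x)) * 2.0 * q_max
    q = np.hypot(qy[:, None], qx[None, :])
    edges = np.linspace(0.0, q_max, n_bins + 1)
    idx = np.digitize(q.ravel(), edges) - 1
    valid = idx < n_bins
    sums = np.bincount(idx[valid], weights=pattern.ravel()[valid], minlength=n_bins)
    hits = np.bincount(idx[valid], minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    with np.errstate(invalid="ignore", divide="ignore"):
        return centers, sums / hits  # empty rings yield NaN


# For a 1 nm real-space pixel, q_max = pi nm^-1, as in the text.
q, mean_counts = azimuthal_average(np.full((64, 64), 3.0), n_bins=16)
print(q[:3], mean_counts[:3])
```

To compare with the Rose threshold, the ring averages would still have to be converted from photons per detector pixel to photons per Shannon pixel of the object in question.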
These noisy data are now used to feed a phase retrieval algorithm in order to solve the phase problem and determine the corresponding real-space structure. The applied algorithm is based on an iterative refinement of the phases, where each iteration involves the introduction of known constraints in real and reciprocal space. The algorithm alternated between error-reduction constraints (20 iterations) and hybrid-input-output constraints with feedback parameter β = 0.9 (80 iterations) [23]. A single reconstruction was considered to be complete after a total of 1000 iterations. A fixed and relatively tight support was used in order to guide the algorithm to the correct solution. For each dataset this reconstruction procedure was repeated 100 times with random starting phases, and the best 20 reconstructions were averaged to yield the final image. The results of these reconstructions are shown in figures 5 and 6. As expected, the reconstruction quality depends on the dose density: the image becomes sharper the more photons are available.
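The alternation of error-reduction (ER) and hybrid-input-output (HIO) steps described above can be sketched as follows. This is a generic textbook form of the two update rules with a support constraint only, written under several assumptions: the measured amplitude is the square root of the counts in unshifted FFT layout, each cycle starts with the ER block, and the selection and averaging of the best reconstructions is omitted. It is not the authors' implementation, and all function names are illustrative.

```python
import numpy as np


def project_modulus(g, amplitude):
    """Replace the Fourier modulus of g by the measured amplitude."""
    G = np.fft.fft2(g)
    return np.fft.ifft2(amplitude * np.exp(1j * np.angle(G)))


def er_step(g, amplitude, support):
    """Error reduction: enforce the modulus, then zero outside the support."""
    return project_modulus(g, amplitude) * support


def hio_step(g, amplitude, support, beta=0.9):
    """Hybrid input-output: negative feedback outside the support."""
    gp = project_modulus(g, amplitude)
    return np.where(support, gp, g - beta * gp)


def reconstruct(amplitude, support, n_total=1000, n_er=20, n_hio=80, beta=0.9, rng=None):
    """One reconstruction from random starting phases (assumed cycle order:
    n_er ER iterations followed by n_hio HIO iterations)."""
    rng = np.random.default_rng() if rng is None else rng
    support = np.asarray(support, dtype=bool)
    phases = np.exp(2j * np.pi * rng.random(amplitude.shape))
    g = np.fft.ifft2(amplitude * phases)
    for it in range(n_total):
        if it % (n_er + n_hio) < n_er:
            g = er_step(g, amplitude, support)
        else:
            g = hio_step(g, amplitude, support, beta)
    return g * support
```

A call such as `reconstruct(np.sqrt(counts), support)` would be repeated with different random seeds; the individual results must be aligned (global phase, possible twin image) before they can be averaged.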
Owing to the power-law dependency of approximately $d\sigma/d\Omega\,(q) \propto q^{-4}$ [17,26,30], an increase of almost one order of magnitude in resolution is expected between the two extreme dose densities shown in parts (a) and (d) of figures 2 and 3, respectively. This increase in resolution with increasing dose density can be observed quite well in figures 5 and 6, ranging from an image blurred on the scale of about 10 nm to a sharp image on the 1 nm scale.
The criterion equation (12) predicts a slightly better reconstruction for test object II. For the feature introduced in both test objects, however, we expect more or less the same contrast, independent of the shape of the larger object and the potentially higher overall spatial resolution achieved in object II. In addition, we expect the same (or slightly better) contrast for the small feature taken by itself and illuminated with the same dose density. The corresponding diffraction patterns are shown in figure 7. Since the feature is about an order of magnitude smaller in both directions compared to the two test objects, the Shannon pixel size $\Delta\Omega_d$ is about a factor of 100 larger than $\Delta\Omega_{\mathrm{obj}}$. The diffraction patterns in figure 7 were reconstructed using the same procedures as for the two previous objects. The reconstruction results are shown in figures 8(a)-(d), together with the magnified regions around the feature in the reconstructions shown in figures 5 and 6. It becomes apparent that indeed the quality of the reconstruction of the feature is nearly independent of the surrounding, i.e. whether it is embedded in a larger object or not. This shows that a feature can be reconstructed as part of an object if it can be reconstructed by itself. Besides those for objects I and II, figure 4 also shows the azimuthal average of the diffraction signal of the single feature. Their comparison shows that equation (12) does not necessarily describe the quality of reconstruction of the single feature very well. Indeed, in the case of object I, the quality of the reconstruction of the single feature is underestimated: object I falls into case (i) discussed at the end of section 2, i.e. the feature is a strong scatterer compared to the sparsely filled rest of the object. The three-dimensional object II falls more closely into case (ii), having on average the same contrast as the feature.
Therefore, the number of photons per Shannon pixel is very similar for object II and the single feature in figure 4. The single feature's increased average number of photons at higher q is a result of a higher contrast on small length scales.
The phase retrieval transfer function (PRTF) is often used to characterize the resolution in a CXDI experiment [18]. It is plotted in figure 9 for object I, object II and the single feature at a dose density of $10^6$ photons nm$^{-2}$. For a fixed cut-off, the PRTF yields resolution estimates similar to those obtained from equation (12). Like equation (12), it is thus not always a good measure for the quality of reconstruction of a given detail inside the object. The reason for this is that, for example, in order to have a good representation of a weak feature (case (iii) of section 2), the phases must be known to a higher accuracy than for a feature of similar or higher average relative scattering power (case (ii) or (i), respectively). In the given example, very different resolutions for the feature would be deduced from the PRTFs in figure 9, although the reconstructions are nearly the same.
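For orientation, the PRTF is commonly computed as the modulus of the averaged Fourier transforms of many independent reconstructions divided by the measured Fourier amplitude. The sketch below follows this common definition and returns the two-dimensional map before any azimuthal averaging. It assumes that the reconstructions have already been aligned in position and global phase and that the amplitude is given in unshifted FFT layout; the name `prtf_map` and the small regularisation constant `eps` are illustrative.

```python
import numpy as np


def prtf_map(reconstructions, measured_amplitude, eps=1e-12):
    """|<FT(reconstruction)>| / measured amplitude, pixel by pixel.

    Values near 1 indicate consistently retrieved phases; values near 0
    indicate phases that vary randomly between reconstructions.
    """
    mean_ft = np.mean([np.fft.fft2(r) for r in reconstructions], axis=0)
    return np.abs(mean_ft) / np.maximum(measured_amplitude, eps)


# Toy check with a hypothetical object: identical reconstructions give a
# PRTF of 1, reconstructions with opposite sign average out to 0.
rng = np.random.default_rng(0)
obj = rng.random((16, 16))
amp = np.abs(np.fft.fft2(obj))
print(prtf_map([obj, obj], amp).mean())
print(prtf_map([obj, -obj], amp).mean())
```

The curves in a plot such as figure 9 would follow from averaging this map over rings of constant q.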

Consequences for coherent x-ray diffraction imaging (CXDI)
The main result of this work is that a given feature in an object is reconstructed only if the dose density requirements for imaging it by itself are met. This means that the dose densities $I_c \cdot t$ must be adapted to the features to be imaged.
In an experiment, the flux density $I_c = F_c/A$ depends on the coherent flux $F_c$ from the source and the area $A$ onto which this flux is focused. $F_c$ is bounded from above for a given source by

$$F_c \le Br \cdot \lambda^2 \cdot \frac{\Delta E}{E}, \tag{15}$$

where $Br$ is the source's brilliance, $\lambda$ the wavelength of the radiation, and $\Delta E/E$ is its bandwidth. For a given source, $F_c$ is fixed. Only the illuminated area $A$ can be manipulated by focusing the x-rays, but must always be larger than the object that is to be imaged. In view of this, the highest feature resolution will be achieved for the smallest objects, i.e. small molecules or clusters, since only for those can the area $A$ be minimized, and thus the coherent flux density maximized.

Figure 9. PRTF for the two-dimensional test object I (red curve) and three-dimensional test object II (blue dotted curve). The latter has a stronger contrast and scatters more photons to large diffraction angles, resulting in a PRTF that is increased at higher q compared to that of object I. The black dash-dotted curve shows the PRTF for the feature imaged by itself. In all three cases, the dose density was $10^6$ photons nm$^{-2}$.
In order to assess under what circumstances an individual atom can be reconstructed in the context of a molecule from a single CXDI experiment, we first estimate the dose density that is required to detect a single atom by itself. We say that a single atom is detected if a certain number of photons is scattered into the solid angle $\Delta\Omega_d$ given by the Shannon pixel size for an individual atom. Assuming an atomic radius of about 1 Å, the Shannon pixel size of the atom is $\Delta\Omega_d \approx 0.25$ sr for an x-ray wavelength of λ = 1 Å. In the following calculation, we assume further that we can detect the atom if 6.25 photons per $\Delta\Omega_d$ fall onto the detector outside the direct beam. The scattering cross section of an atom is bounded from above by $r_0^2 \cdot Z^2$, where $Z$ is the atomic number of the element. Thus, the coherent dose density required according to (10) is at least

$$I_c \cdot t \ge \frac{6.25}{r_0^2\, Z^2\, \Delta\Omega_d}. \tag{16}$$

To detect single carbon atoms (Z = 6) by themselves and as part of a molecule, a dose density of at least $I_c \cdot t \approx 9 \times 10^{10}$ photons nm$^{-2}$ is required. For a gold atom (Z = 79), a dose density of $I_c \cdot t \approx 5 \times 10^{8}$ photons nm$^{-2}$ is needed. These rather high dose densities are difficult to obtain with current and even upcoming x-ray sources. Here, we estimate under which experimental conditions individual atoms can be detected in a molecule or cluster using a single pulse of the XFEL without any additional chemical knowledge. Assuming a fixed coherent dose of about $D_c = 10^{12}$ photons pulse$^{-1}$ for a free-electron laser source, this requires lossless focusing to an area of

$$A \le \frac{D_c}{I_c \cdot t} = \frac{D_c\, r_0^2\, Z^2\, \Delta\Omega_d}{6.25}. \tag{17}$$

To detect single carbon atoms, the focused area must be smaller than $A \approx 10$ nm$^2$. Likewise, for gold atoms, the focused area must be below $A \approx 2000$ nm$^2$. For a circular focus, this corresponds to a focus diameter of ≈ 4 nm for carbon and ≈ 50 nm for gold.
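The numbers quoted above can be reproduced with a short calculation. The sketch uses the values stated in the text (6.25 photons per Shannon pixel, a Shannon pixel of 0.25 sr, 10^12 coherent photons per pulse) together with the classical electron radius r_0 ≈ 2.818 × 10^−6 nm; the function names are illustrative.

```python
import math

R0_NM = 2.8179403e-6  # classical electron radius in nm


def min_dose_density(z, solid_angle_sr=0.25, photons=6.25):
    """Lower bound on the dose density (photons per nm^2) needed to detect
    an atom with atomic number z, using the upper bound r0^2 Z^2 for the
    scattering cross section."""
    return photons / (R0_NM**2 * z**2 * solid_angle_sr)


def max_focus_area(z, photons_per_pulse=1e12):
    """Largest focus area (nm^2) that still reaches the required dose
    density with a single pulse, assuming lossless focusing."""
    return photons_per_pulse / min_dose_density(z)


def focus_diameter(area_nm2):
    """Diameter (nm) of a circular focus with the given area."""
    return 2.0 * math.sqrt(area_nm2 / math.pi)


for name, z in (("C", 6), ("Au", 79)):
    dose = min_dose_density(z)
    area = max_focus_area(z)
    print(f"{name}: dose >= {dose:.1e} photons/nm^2, "
          f"area <= {area:.0f} nm^2, diameter <= {focus_diameter(area):.0f} nm")
```

The results agree with the rounded values in the text: roughly 9 × 10^10 photons nm^−2 and a 4 nm focus for carbon, and 5 × 10^8 photons nm^−2 and a 50 nm focus for gold.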
For one thing, this limits the size of molecules or clusters in which a certain atomic species can be imaged with a single XFEL pulse of $10^{12}$ photons; for another, it imposes stringent requirements on the focusing optics in terms of efficiency, numerical aperture and imaging quality. The case of imaging individual carbon atoms with a single pulse seems to be very challenging unless the number of coherent photons per pulse can be increased. For gold and other heavier elements, however, imaging clusters and molecules with a single pulse at atomic resolution does not seem out of reach, as high-efficiency optics focusing XFEL radiation to below 100 nm are currently under development.
At synchrotron radiation sources and for radiation-hard samples, the coherent dose is limited by the exposure time t. The coherent dose from an undulator source at a modern third-generation synchrotron radiation source exceeds that of the XFEL for exposure times roughly above 100 s. This opens possibilities for imaging radiation-hard static samples, for example from materials science, down to the atomic level.