Efficient Coherent Diffractive Imaging for Sparsely Varying Objects References and Links

We demonstrate an efficient phase-retrieval method, allowing the recovery of features of dynamic objects that are sparsely varying – i.e. when the difference between two consecutive frames is small. The method uses redundancy in information between similar consecutive frames to recover the features more robustly than current phase-retrieval methods, and necessitates a considerably smaller number of measurements. Both of these features directly lead to shorter acquisition time, paving the way to coherent diffractive imaging of fast dynamic processes. Numerical simulations show a possible 100-fold improvement in temporal resolution over existing methods. Single mimivirus particles intercepted and imaged with an X-ray laser, " Performance test of fresnel zone plate with 50 nm outermost zone width in hard X-ray region, " Jpn. Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens, Quantization analysis of speckle intensity measurements for phase retrieval, " Appl. Femtosecond diffractive imaging with a soft-X-ray free-electron laser, " Nat. Phys. Reconstruction of the shapes of gold nanocrystals using coherent x-ray diffraction, " Phys. Lensless diffractive imaging using tabletop coherent high-harmonic soft-X-ray beams, " Phys. Time-resolved imaging of purely valence-electron dynamics during a chemical reaction, " Nat. Ultrahigh 22 nm resolution coherent diffractive imaging using a desktop 13 nm high harmonic source, " Opt. Super-resolution and reconstruction of sparse images carried by incoherent light, " Opt. Quantum limits of super-resolution of optical sparse objects via sparsity constraint, " Opt. Compressive phase retrieval from squared output measurements via semidefinite programming, " arXiv:1111.6323v3, (2012). 29. S. Mukherjee and C. S. Seelamantula, " An iterative algorithm for phase retrieval with sparsity constaints: Application to frequency-domain optical-coherence tomography, " ICASSP 2012. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information, " IEEE Trans. Sparsity based sub-wavelength imaging with partially incoherent light via quadratic compressed sensing, " Opt. Experimental compressive phase space tomography, " Opt.Understanding the twin-image problem in phase retrieval," J. Opt. Soc.


Introduction
The spatial resolution of any standard imaging system is fundamentally bounded by Abbe's diffraction limit, setting the minimal resolvable feature size even in an ideal imaging system to be comparable to one half of the optical wavelength (λ/2).Therefore, direct optical imaging of an object with nanometric-scale features, e.g.viruses [1], or nanometric magnetic domains in thin magnetic films [2], requires extreme UV or X-ray light.However, focusing a light beam or constructing an imaging system at such wavelengths is very difficult, due to the lack of suitable optical components.Namely, focusing a hard X-ray beam is typically done with a Fresnel zone plate [3] or with focusing mirrors, where in both cases the current focusing ability is fundamentally limited to a spot much larger than the wavelength.For example, at hard X-ray wavelength (λ ~0.1nm) the current resolution limit, in focusing as well as in imaging, is ~10nm, which is ~170 λ [4].
Another avenue to overcome this practical resolution limit for imaging applications, at extreme UV and X-ray wavelengths, is to use coherent diffractive imaging (CDI) [5,6].X-ray CDI, in its most basic form -plane-wave CDI, consists of illuminating the object with a coherent X-ray plane wave, and measuring the diffracted intensity pattern in the far-field [6].The diffracted intensity is then algorithmically inverted to obtain the sought object, most commonly by using an iterative phase-retrieval algorithm [7,8].The loss of information, which is inherent in the measurement process -where only the magnitude of the diffraction pattern is measured and not the phase -is handled algorithmically using prior knowledge about the region where the object resides (the "support").However, the algorithmic inversion process is challenging, and the prior knowledge it requires for successful reconstruction is often very precise (e.g., on the location of the object, or the support boundaries, etc.).Moreover, all such phase-retrieval algorithms fundamentally require a high signal to noise ratio (SNR), and a large number of measurements in the diffraction plane, a criterion known as the oversampling criterion.Recent techniques suggested improving the phase-retrieval process by use of a phase diffuser to create speckles in the far-field, thereby increasing the spatial variation of the measured far-field intensity [9].However, the SNR is still the limiting factor for phase-retrieval problems.For a recent review on CDI see [10].
Due to the high SNR and oversampling requirements, the level of signal (light) needed for a high quality recovery is large.This can be obtained by using a single intense pulse of light, e.g. from a free-electron laser [11,1], or through accumulating a large number of relatively weaker X-ray pulses, e.g. from synchrotron radiation [12,13,2], or using light from a highharmonic generation process [14].Since an intense X-ray pulse is destructive for certain objects (e.g.proteins), imaging a dynamic process of such an object while it varies in time is challenging: It can either be done by destructive pump-probe time resolved imaging, where identical copies of a dynamic process are imaged with different time delays between the pump and the probe [15,16], or by accumulation of weaker pulses [14,17,18].However, pulse accumulation naturally prolongs the acquisition time necessary for a single image; currently, the acquisition time is on the scale of tens of seconds, at least [14,17,18] -far from being suitable for real-time dynamic imaging.In the case of lensless imaging (CDI) of magnetic structures [2,13], the exposure time may be shorter (~0.2 seconds in [2]) -but the exposure time is still limited by SNR.Thus, finding a technique that would work under lower SNR could facilitate faster acquisition, thereby improving the ability to track the formation of magnetic domains with better temporal resolution.For the specific case of imaging the formation of nanometric domains in magnetic films, improving the temporal resolution may allow viewing phase transitions as they occur [13].
Here, we propose and demonstrate an algorithmic method that performs robust phaseretrieval of sparsely varying video, i.e. when the difference between two consecutive frames is small.The proposed method utilizes the similarity between consecutive frames to compensate for the loss of information associated with the measurements.The loss of information is caused primarily by the lack of phase information, but our method compensates also for low SNR, and allows using a considerably smaller number of measurements than traditional phase-retrieval methods.As such, our method can be used to recover dynamically varying objects using a considerably shorter acquisition time compared to existing algorithms, without requiring multiple identical samples.Using the sparsity of frame-differences for efficient measurements in a tracking system has been recently suggested and demonstrated in a linear imaging model [19].Our current work deals with a different group of problems, namely, is a quadratic mathematical problem (phase retrieval).For such nonlinear problems, standard linear sparse recovery algorithms are inapplicable.This, together with the motivation to pursue sub-wavelength imaging, are the reasons we develop the new methodology presented here.Applications of our method may include imaging of moving or changing biological specimen, magnetic nano-structure formation, or multiple particle tracking, in all of which the improvement of temporal resolution can be greatly beneficial.Imaging magnetic domains during their formation is particularly appropriate for our method because the temporal variations are sparse, as the domains tend to flip in small groups.Another interesting application where our method can be used is for CDI of growing neurons [20] -where, due to the nature of the dendritic growth in which most of the neuron remains unchanged, the difference between consecutive frames is sparse.Yet another example for a phenomenon that conforms to our sparse-variation model is the formation of crystal, such as the colloidal crystal dynamically imaged by STED [21].Our method relies on new ideas associated with sparsity-based subwavelength imaging [22][23][24] and phase retrieval of sparse objects [25][26][27][28][29][30], and utilizes a recently developed efficient greedy algorithm [30] in order to solve the problem.

CDI problem formulation
In plane-wave CDI, an object is illuminated by a plane wave, and its far field diffraction pattern is measured, as depicted in Fig. 1.Therefore, performing CDI amounts to recovering the object's optical transmission function from the magnitude of the diffraction pattern.In the far-field, or the Fraunhofer approximation (applicable when 2  1 a d being the distance between the detection plane and the object, and a and b being the radii of circles within which the object and its far-field diffraction pattern are localized around the optical axis, respectively), the intensity of the diffraction pattern corresponds to the magnitude of the Fourier transform of the object [31].Such is the case when the features of the object are large compared to λ, and the diffraction pattern is measured sufficiently far away.( , ) , where is the 2D Fourier transform of ( ', ') t x y .The mathematical description of the CDI problem is therefore to invert Eq. ( 1), namely to recover the phase of ( , )   x y T ν ν -since once its phase is known, it can be inverse-Fourier transformed to yield ( ', ') t x y .However, due to the missing phase information, the inversion problem is ill-posed, and in general some additional prior information is necessary in order to solve it.Most commonly, this additional knowledge is about the location of the object and its boundaries (its "support").The support is incorporated into an iterative phase retrieval algorithm developed by Fienup [7], attempting to find a solution consistent with both the prior information and the measurements.There are several variants of the Fienup algorithm [8], and they all work by alternating between imposing real-space constraints (e.g. the assumed support of the object) and Fourier constraints -i.e. the measured Fourier magnitude.In order to converge to the correct solution, however, they require both oversampling, i.e. a larger number of measurements than unknowns (more than twice), and high SNR [10].

Sparsity-based dynamic CDI
Using CDI to image a dynamic process requires taking a set of far-field measurements at consecutive times.In principle, in each temporal segment, the far-field diffraction pattern can be captured, its phase can be retrieved using a phase-retrieval algorithm [7], and a snapshot of the object can be recovered.However, due to the high SNR requirement of current phaseretrieval methods, and in order to avoid possible damage to the object, a strong enough signal can only be obtained by accumulating many weak pulses.The acquisition time and hence the temporal resolution are therefore inherently very slow -on the order of tens of seconds at least [14,17,18], rendering current CDI techniques inapplicable for fast imaging dynamic processes.
Here, we suggest and demonstrate a method facilitating considerably shorter acquisition times for CDI of dynamically varying objects, and in doing that allowing for improving the temporal resolution substantially.Our technique works under the assumption that two consecutive frames are similar, rather than treating each temporal frame separately.Under this assumption, which is commonly used in applications of video compression [32] and tracking systems [19,33], the number of unknowns becomes smaller than the number of measurements.At the k'th frame we have the following system of quadratic equations: where ( , ) k I x y is the k'th temporal frame, and F denotes the 2D Fourier transform.The difference between consecutive frames, denoted as ( , ) k t x y Δ , is therefore sparse, i.e. it contains a small number of nonzero elements, when it is spatially discretized.
In our method for solving the dynamic CDI problem, we shall assume that the first frame is measured with high quality -i.e.high SNR and sufficient number of samples.However, from then on the following frames can be sampled under less strict conditions (low SNR, fewer samples).Under this assumption, after spatial discretization, Eq. ( 2) translates to finding a sparse solution to a set of quadratic equations.
Finding sparse solutions to underdetermined systems of equations is a problem that has been studied extensively in recent years -although until recently all such studies addressed linear equations only (see, e.g., [34][35][36][37]).In fact, the first studies on finding sparse solutions to sets of quadratic equations appeared only in the past two years, in the context of sparsitybased sub-wavelength imaging [38,39], and have recently been considered in more general contexts [25,26].In the context of optical imaging, sparsity has also been used in various different application such as digital holography [40,41], phase space tomography [42], tracking [19,33], and more.Here, in order to solve the quadratic problem, we use a recently developed greedy method, based on iterative application of the damped Gauss Newton algorithm [30].

Formulation of the discrete problem
Discretization of Eq. (2) yields the following vector relation: where k x is a , where i A is a matrix defined by The formulation of Eq. ( 4) shows that, given the recovery of the previous frame, 1 k x − , the measurements are a quadratic function of the framedifference vector k x Δ only.The goal is therefore to find a solution k x Δ that is consistent with the measurements.However, in general, this problem is ill-posed, and the solution k x Δ is not unique.We therefore regularize the problem by requiring the frame-difference to be sparse, based on the logic that the object changes only slightly between consecutive time segments.As we will see, this essential concept of sparsity enables improved performance over standard methods.
The mathematical formulation we propose is summarized as follows.At each timesegment k , we seek a vector k x Δ that is consistent with the quadratic measurement vector k y (through the relation given by Eq. ( 4)), and is at the same time sparse -i.e.contains a small number of non-zero elements.This yields the following optimization problem:

Solution method -GESPAR
We find a solution to Eq. ( 5) by using a slightly modified version of our recently developed greedy algorithm for sparse phase-retrieval, named GESPAR, described in detail in [30].The algorithm is based on a local search, where within each iteration we solve a nonlinear leastsquares problem using the damped-Gauss-Newton (DGN) [43] method.Given the previous frame 1 k x − , the algorithm produces a solution x Δ that is both sparse and consistent with the quadratic measurements of the current frame.The algorithm's steps are summarized below: 1. Select an initial random support for k x Δ , namely the initial locations of the nonzero elements in the guessed solution.
2. Given the support, the problem defined by Eq. ( 5) reduces to a nonlinear least squares problem, which we solve by the DGN algorithm that is commonly used for this type of problem.The DGN produces an estimate x Δ .
3. Calculate the cost function with its gradient around the current estimate x Δ .
4. Perform a local search by index swapping: replace an index i from the support of x Δ containing an element with a small absolute value element with an index j containing a high absolute gradient value.Given the new support s, perform DGN, obtain support is updated -i.e. the index i is removed from the support and the index j is added to it .Go to step 3. Otherwise, go to 4, until a predetermined number of swaps have been performed.If there is no possible improvement in the cost function by swapping -go to step 1.The procedure stops after a predetermined number of swaps.The output is the solution x Δ that yields the lowest value of the cost function ( ) Since the algorithm recovers the frame-difference at each temporal segment, we avoid carrying a large accumulative error by assuming that the initial frame 0 x is recovered with high quality, i.e. using sufficient measurements and high SNR.The main difference between the method used here and GESPAR as detailed in [30] is the fact that here the cost function in the current problem contains an additional term linear in x Δ .It is important to note that problem (5) is very difficult to solve -it is highly non-convex and has combinatorial complexity.In general, there is no guarantee that the true solution or a global minimum will be found, and indeed GESPAR finds only a local minimum.However, when the sought signal is sparse enough, empirical evidence (based on numerical simulations in [30]) suggests that the true solution is found with high probability.In order to avoid stagnation of the algorithm, for reasons such as the "twin image problem" [44,45] associated with iterated projection algorithms [46], GESPAR includes randomization.Specifically, the indices i and j from part 4 in the algorithm are selected randomly from a sub-set of indices.

Simulations
We test the performance of our algorithm by performing a series of numerical simulations.The first simulation examines robustness to noise: The Fourier transform magnitude of a simulated object that is varying in time is sampled under various SNR conditions.The realspace object is then recovered frame by frame, and the recovery is compared with the true dynamic object.The recovery result is compared with frame-by-frame recovery using the standard Fienup Hybrid Input-Output (HIO) algorithm [8].The size of the simulated object is 51x51 pixels, and the sparsity of each frame-difference is 5 s = .The number of Fourier measurements taken for each frame in this example is 2601.In all of these simulations the number of iterations used in the Fienup HIO algorithm is chosen as 3000, which is sufficient for convergence, where the last 600 iterations are averaged to yield the final reconstruction.The parameter β in the HIO algorithm is chosen as 0.5, which yielded the best results.The initial guess for HIO in all frames is the clean first frame, which proves to work better than other options, such as taking the initial guess as the output of the previous frame.The prior support knowledge, needed for the HIO algorithm, is taken as the 31x31 pixel square, visible in the videos and in Figs. 2 and 4. The total number of GESPAR swaps is set as 500 for each frame.
Simulated examples showing our sparsity-based reconstruction of dynamically varying objects, and comparing to standard HIO reconstruction are shown in Fig. 2 (Media 1) and Fig. 2 (Media 2). Figure 2 shows two example frames: the left panel shows the true simulated object, the center shows our GESPAR recovery, based on frame-difference sparsity, and the right panel displays the HIO recovery.In this example, the SNR is 30dB, defined as: , where n is an independently and identically distributed (i.i.d.) Gaussian noise added to the measurement vector y.In these typical examples GESPAR clearly obtains a much better reconstruction than the HIO algorithm, as will be quantified next.The performance of both algorithms under various SNR is displayed in Fig. 3.The normalized mean squared error, defined as , where x is the unknown object and rec x is the reconstruction, is plotted as a function of frame number for different SNR's.Several conclusions can be drawn from Fig. 3. First, in order to obtain a solution of similar quality (measured by NMSE), the HIO algorithm requires a much higher SNR than GESPAR algorithm; e.g. the performance of GESPAR under SNR of ~30 corresponds to SNR of ~50 in HIO, corresponding to a factor of 10 in the relative signal to noise energy.Since the SNR is related to acquisition time, this translates directly to reduction in acquisition time.Namely, under the reasonable assumption that the dominant noise factor in the imaging system is shotnoise, the factor of 10 in signal to noise energy translates to a factor of 100 in acquisition time.Second, the visible increase in the error with frame number can be explained by the fact that GESPAR iteratively reconstructs only the differences between frames, so the error is accumulated.The less apparent increase in the HIO error with frame number is the result of the initial guess being more similar to the first frames than to the following frames in our example.
In addition to the considerably superior performance of our algorithm in terms of noise robustness, we also examine the performance as a function of the number of measurements required for reliable reconstruction.Once again, successful recovery from a small number of measurements directly leads to faster acquisition time.For example, several adjacent measured pixels can be summed ("binned") together, yielding a smaller number of measurements but a higher SNR. Figure 4 shows two example frames recovered using only 650 Fourier magnitude measurements (about ¼ of the number used in Fig. 2), located on a square grid, and with very low noise (SNR=60).Fienup algorithms are known to require oversampling in the Fourier plane [10], therefore the low quality of the HIO reconstruction in Fig. 4 (Media 4) (right column) is not surprising.Using the sparsity prior information, on the other hand, compensates for the considerably lower number of Fourier measurements, leading to the high quality reconstruction with GESPAR shown in Fig. 4 (Media 3) (center column).Figure 5 shows the reconstruction error more quantitatively, as a function of frame number, for both algorithms.The reconstruction error increases as a function of frame number in both algorithms, for related but different reasons.In the HIO algorithm the reason is that the initial guess used for each frame is the clean first frame -which proved to work better than using a random initial guess or the output of the previous frame.With time, the object changes more and more, so that the first frame becomes a worse and worse initial guess.In contrast, for GESPAR, the increase in reconstruction error is mainly due to the accumulating error, which results from reconstructing only frame-differences, as discussed earlier.We also test the performance of our GESPAR algorithm under various levels of sparsity, as shown in Fig. 6.The reconstruction quality naturally improves as the frame difference is more sparse (low values of s).Fig. 6: Reconstruction error vs. frame number for various sparsity levels.SNR=40.Error accumulation due to the reconstruction of frame-differences can impose a practical limit on the possible number of frames in a reconstructed movie.This can be tackled by performing highquality sampling every certain number of frames, dictated by the system's parameters (e.g.noise, number of measurements).In this case, it is also possible to further improve performance by back-propagation -i.e. using a high-quality frame in order to reconstruct a previous frame.
The spatial resolution of our technique is an important issue.One may naively think that the best resolution is pre-determined by the high-resolution first frame.However, our recent work on sub-wavelength imaging demonstrates that it is possible to obtain resolution beyond the resolution of the measurement device, which in our current case would correspond to improving the resolution beyond the resolution of the first frame.Indeed, current work is targeting this goal.
The limitations of our technique may arise from several causes.Of course, in some cases the assumption that the frame-difference is sparse may be inappropriate.Such would be the case, for example, if the images of the dynamically changing object are taken at a low rateallowing the object to change significantly between two consecutive frames.Another limitation may arise from the fact that the GESPAR algorithm recovers only framedifferences -implying that the high-resolution first frame is important, and that error will generally accumulate with frame number -as our simulation results indeed indicate.The temporal resolution of the imaging system, although improved significantly by our method, will still ultimately depend on SNR.

Summary and future directions
We suggested and demonstrated a method to perform fast dynamic CDI, based on the prior knowledge that the object changes only slightly (sparsely) in between successive frames.The problem was formulated as seeking a sparse solution to a set of quadratic equations, and a recently developed algorithm for sparse phase retrieval (GESPAR) is used to solve it.In principle, additional prior knowledge about the object can be incorporated into the method in order to further improve the results -with the ultimate goal of relieving the method of the need for a clean first frame.
Importantly, since our method succeeds in recovering the features of an object from only a subset of its Fourier spectrum, it can serve as a solid basis for the recovery of sub-wavelength features.This is because the problem of sub-wavelength CDI corresponds to recovering an object using only the magnitude of its low Fourier frequencies [39].Indeed, our next goal is to employ the current ideas to obtain super-resolution beyond the numerical aperture of the system.

Fig. 1 :
Fig. 1: CDI setup Denoting the 2D transmission function of the object as ( ', ') t x y and the measured intensity as ( , ) I x y , under the assumptions above, the following relation holds, up to a constant multiplicative factor:

2 1n 1 mF
× vector representing the n n × sought object in time k , obtained from the discretized 2D object by stacking its columns, and the vector k y is, similarly, a 2 × column stack of the Fourier intensity measurements.The notation i is the i'th row of the 2D DFT matrix, so that each element i in k y is obtained by the absolute value squared of the inner product of k x with the appropriate spatial frequency.Finally, k x Δ is the frame difference vector, i.e. 3) can be rewritten as a function of the unknown frame-differences:

Fig. 2 :
Fig. 2: Simulated example frames comparing the quality recovered through our sparsity-based phase-retrieval technique with the best result calculated through the Fienup HIO algorithm.(a) Frame #2, and (b) Frame #20.The left column shows the true simulated frames, the center frame shows our GESPAR reconstruction (Media 1), and the right column shows the reconstruction with the standard Fienup HIO (Media 2).SNR=30.The simulated object has uniform intensity and phase -i.e. it is opaque.However, the asymmetry of the object means that its phase structure in Fourier space is complicated and not binary.

Fig. 4 :
Fig.4: Simulated example frames comparing the recovery using 650 measurements through our sparsity-based phase-retrieval technique with the best result calculated through the Fienup HIO algorithm.The left column shows the true simulated frames, the center frame shows our GESPAR reconstruction (Media 3), and the right column displays the recovery using the standard Fienup HIO reconstruction (Media 4).SNR=60.The simulated object has uniform intensity and phase -i.e. it is opaque.