Subgradient-projection-based stable phase-retrieval algorithm for X-ray ptychography

This paper proposes a phase-retrieval algorithm for X-ray ptychography with almost the same computational efficiency as the conventional methods. It exploits an optimization technique, subgradient projection, which has an interesting property that means it can be expected to avoid yielding poor images.


Introduction
Ptychography is a lensless imaging technique for microscopic observation, based on the diffraction of a coherent wave, including X-rays (Maiden & Rodenburg, 2009;Harada et al., 2013;Valzania et al., 2018).With the increasing demand for nondestructive high-spatial-resolution imaging of various specimens, hard X-ray ptychography has gained much attention.The short wavelength and high penetrability of hard X-rays enable nondestructive observation of internal structures of thick specimens at high spatial resolution, even though they are too thick for electron microscopy.Hard X-ray ptychography has been applied to specimens in various fields including biology (Polo et al., 2020;Suzuki et al., 2016;Shahmoradian et al., 2017;Jones et al., 2014), chemistry (Pattammattel et al., 2020;Shi et al., 2019;Hirose et al., 2019Hirose et al., , 2020) ) and materials science (Cuesta et al., 2019;Grote et al., 2022;Uematsu et al., 2021;Gao et al., 2020).
Fig. 1 schematically illustrates the ptychographic measurement in transmission geometry.In the measurement, Fraunhofer diffraction patterns from the specimen (termed 'object') are collected with a localized X-ray beam (the 'probe').This measurement is repeated by shifting the illumination area on the object with some overlap.The diffraction data set is then subjected to a computational image reconstruction process utilizing an iterative optimization algorithm, yielding the spatial distribution of the complex-valued refractive index (i.e.phase and absorption contrast image) of the object and also the complex wavefield of the probe.This image reconstruction process is often called phase retrieval (PR) because the process recovers phase information of the diffraction wavefield lost in the diffraction data.
Various algorithms have been proposed to solve the PR problem in ptychography: the conjugate gradient method (Guizar-Sicairos & Fienup, 2008), the difference map (Elser, 2003;Thibault et al., 2009), relaxed averaged alternating reflections (Luke, 2004;Marchesini et al., 2016), the maximum likelihood method (Thibault & Guizar-Sicairos, 2012), the proximal splitting algorithm (Hesse et al., 2015;Qian et al., 2014) and the extended ptychographical iterative engine (ePIE) (Rodenburg & Faulkner, 2004;Maiden & Rodenburg, 2009).In the iterative update of the object and probe functions, most algorithms use the entire set of diffraction patterns simultaneously, while ePIE sequentially reflects the diffraction patterns point by point.The former batch updating approach is advantageous for parallel computing, but it requires more memory compared with the latter sequential updating approach and is prone to a poor local solution (Pham et al., 2019).It is also empirically known that the batch updating approach can actually require more iterations for convergence (Yatabe & Takayama, 2022).On the other hand, the ePIE algorithm is simple and computationally efficient and thus widely used.
Because of the favorable properties of the ePIE algorithm, it is also used for advanced PR. Advanced PR tackles, for example, a practical restriction that the higher the spatial resolution, the thinner the sample must be.This trade-off relationship between sample thickness (depth of field) and spatial resolution is known as the diffraction limit in the theory of general transmission microscopy (Tsai et al., 2016;Born & Wolf, 1980).To overcome this trade-off relationship, a multi-slice PR approach has been proposed as an extension of ePIE (Tsai et al., 2016;Maiden et al., 2012).Thanks to the strong real-space constraint in ptychography, multi-slice PR achieved extension of the depth of field.However, the increase in information to be retrieved in the multi-slice approach makes the PR problem more difficult, and reconstructed-object slices often suffer from cross-talk artifacts caused by poor depth resolution (Du et al., 2021).To reduce the artifacts, it is important to develop a base PR algorithm that performs better than ePIE.
To improve the convergence performance of ePIE, some variants have been proposed (Maiden & Rodenburg, 2009;Maiden et al., 2017;Pham et al., 2019).The regularized PIE (rPIE) (Maiden et al., 2017) involves a modification of the updating formula of ePIE with a regularization weighting that uses the square of the absolute value of the object and probe.The momentum PIE (mPIE) improves the convergence speed with a momentum motivated by a stochastic or incremental gradient approach (Maiden et al., 2017).Although rPIE and mPIE may work better with well tuned hyperparameters, it is difficult to tune them because their heuristic modifications involve an increase in the number of hyperparameters.
In this paper, we propose a stable PIE-like algorithm named CRISP to realize a reliable method that can automatically tune its hyperparameter.CRISP exploits an optimization technique that is called subgradient projection.Thanks to the favorable property of subgradient projection, CRISP can avoid getting stuck in poor local solutions.Moreover, an auxiliary method included in the proposed method can automatically tune the parameter for subgradient projection.With the combination of these techniques, CRISP achieves high performance despite its simplicity.Our experiments showed that CRISP improved reliability especially at high spatial frequencies and reduced PR artifacts while achieving convergence speeds comparable to those of ePIE and rPIE.

Notation
The two-dimensional Fourier transform operator and the two-dimensional inverse Fourier transform operator are denoted by F and F À 1 , respectively.Matrices are indicated by bold capital letters, i.e.A, and the elements of the ith row and jth column are denoted by A [i, j] . sign(•) denotes the signum function that is generalized for complex numbers as sign(z) = z/|z|.The complex conjugate of A is denoted by A � .For any differential function g, rg denotes the gradient of g.

Ptychographical iterative engine framework
We first explain the ptychographic measurement model shown in Fig. 2. In ptychographic measurement, a set of diffraction patterns is observed by shifting the measurement position.The nth observed diffraction intensity pattern I n corresponds to the squared magnitude of the two-dimensional Fourier transform of the exit wavefield.The wavefield is modeled by multiplication of the illumination probe function P and the object transmission function O. Thus, the nth observed diffraction intensity pattern I n can be written as Ptychography aims to estimate the object transmission and illumination probe functions from the observed diffraction patterns.In the PIE-based algorithms, the PR problem can be formulated as the following optimization problem that minimizes the squared error in the exit wavefield in the real space: n is the revised exit wavefield constrained to the reciprocal space (Maiden et al., 2017): where W n is the exit wavefield of the nth scan position, i.e.W n ¼ P n � O n .This manipulation replaces the propagated modulus with the square root of the observed diffraction patterns and propagates back to the real space.It is difficult to find a reasonable solution to the problem in equation ( 2) because it involves the product of unknown variables P n and O n and has local minima.To make the problem simpler, the PIE-based algorithms deal with the following two subproblems that optimize each variable O n and P n while fixing the other ones: ePIE (Maiden & Rodenburg, 2009) updates the object (o) and probe (p) as follows: where � o , � p 2 (0, 1] are step size parameters, and �W n ¼ W n À W 0 n .The update formulas can be obtained by applying the gradient descent method to each subproblem in equations ( 4) and ( 5) (details will be given in the following section).
rPIE (Maiden et al., 2017) considers the slightly different subproblems in equations ( 4) and ( 5) and includes a regularization term with the subproblems such as where U n and W n are weight matrices with nonnegative elements.The 1 2 kU n � ðO 0 n À O n Þk 2 F and 1 2 kW n � ðP 0 n À P n Þk 2 F terms penalize significant changes to objects and probes between updates.By setting the weights as U n ¼ jP n j 2 max and W n ¼ jO n j 2 max and solving the subproblems in equations ( 8) and ( 9), the updating formulas of rPIE can be derived as follows: where � o , � p > 0 are balancing parameters, and division is computed element-wise.
The entire procedure of ePIE is summarized in the following pseudocode, where k = 1, . . ., K is the iteration index, K is the number of iterations, l is a vector of the randomized sample indices, and randPermðNÞ is a function that randomly permutes an integer vector up to N. The function b S r n assigns the object sampled by the function S r n to the scan position r n , and the function b T r n restores the probe shifted by the function T r n .The updating formulas, equations ( 6) and ( 7), are on lines 11 and 12.When lines 11 and 12 are changed to equations ( 10) and ( 11), the procedure becomes the rPIE algorithm.

Figure 2
Schematic illustration of the observation model.

ePIE as a gradient descent method
As mentioned in the previous section, the updating formulas of ePIE are based on the gradient descent method.It is helpful for characterizing the proposed method to explain the derivation of ePIE's updating formulas.In particular, the property of its step size will give a suggestion for the proposed method.
Let x be an optimization variable and g a differentiable cost function to be minimized.The gradient descent method finds a local minimum of g by iterating the following update: where x and x 0 are, respectively, variables before and after the update, and � is a step size parameter.By applying equation ( 12) to equation ( 4), the following updating formula of O n is obtained: In the same manner, from equation ( 5) and ( 12) the updating formula of P n is obtained: If we set the step size parameters � o ¼ � o =jP n j 2 max for equation ( 13) and � p ¼ � p =jO n j 2 max for equation ( 14), these updating formulas become equal to equations ( 6) and ( 7).
The step size setting of ePIE is derived from the property that g(x 0 ) < g(x) if � 2 (0, 1/L], where L is the Lipshitz constant of rg(x) (Bauschke & Combettes, 2017).This property guarantees the cost always decreases through updates.The Lipshitz constants of the gradient of the cost functions in equations ( 4) and ( 5) are given by jP n j 2 max and jO n j 2 max , respectively (Qian et al., 2014).Thus, the updating formulas in equations ( 13) and ( 14) always decrease the cost under the conditions � o 2 ð0; 1=jP n j 2 max � and � p 2 ð0; 1=jO n j 2 max �.In the update of ePIE, setting � o , � p 2 (0, 1] satisfies the condition.In practice, it has been shown that ePIE stably converges when � o , � p is chosen from the range (0, 1] (Maiden et al., 2017).

Proposed method
We propose a PR algorithm, CRISP (clipped reliable iterative subgradient projection).We exploit a subgradient projection that is considered to have a favorable property (Section 3.3).The proposed method consists of a main updating formula and two auxiliary methods that improve the stability (Section 3.4) and simplicity (Section 3.5) of the proposed method.The explanations of these auxiliary methods follow an intuitive explanation of the subgradient-projection-based algorithm.

Subgradient projection
We first introduce subgradient projection, which plays a central role in the proposed method.Subgradient projection is known in the field of optimization and is usually used for optimization with non-differentiable cost functions whose subgradient can be calculated.We introduce the simplified version of subgradient projection that is for differentiable functions because we only consider a differentiable function in this paper.
The iterative subgradient projection algorithm (Bauschke & Combettes, 2017), which uses subgradient projection iteratively, can be regarded as a type of gradient descent method.Compared with the updating formula in equation ( 12), the step size � is replaced by a scalar-valued function �ðgðxÞ À �Þ þ =krgðxÞk 2 2 .From now on, we treat this scalar variable as the step size of the updating formula of iterative subgradient projection.

The basic form of CRISP
We next derive a basic updating formula based on subgradient projection.As introduced in Section 2.2, the cost function of the PR problem in ptychography is Applying the cost function for each variable O n and P n to the formula of the subgradient projection in equation ( 15), the following formulas can be obtained: where � o , � p are step size parameters with the same role as the step size parameters of ePIE (� o , � p ), and � is a hyperparameter.We use the same � for all n.
An important property of the proposed updating formula is that the step size changes adaptively.The main part of the step size This property can bring a better solution (details are given in the following section).

Intuitive explanation of CRISP
Let us clarify the properties of the updating formulas in equations ( 16) and ( 17).We first show that the iterative update using subgradient projection can bring about benefits that are likely to reach a better solution.In Fig. 3, an iterative update by the gradient descent method (corresponding to ePIE) and the subgradient projection (corresponding to CRISP) are demonstrated.As shown in Figs.3(a) and 3(b), the gradient descent can get stuck in the local minimum, while the subgradient projection arrives at the global minimum even when it starts from a poor initial value.This is because the gradient descent method stops when rg(x) = 0 even if it is a local minimum whose cost is large.On the other hand, the subgradient projection keeps updating whenever g(x) > �.
When the sequence of the subgradient projection approaches the local minimum, krg(x)k becomes close to 0, while g(x) is still larger than �.This makes the step size large because the step size includes the reciprocal of the gradient rg(x) as in equation ( 15).This property can avoid poor local minima.
Next, we visualize the importance of the setting of the hyperparameter �.Fig. 4 shows the sequences generated by iterative updates under different conditions of � and E. As shown in Figs.4(a) and 4(b), when the minimum value of g is 0 and � = 0, the sequence of the subgradient projection converges to the minimum in the same way as the gradient method.As shown in Fig. 4(c), when the minimum value of g is E > 0 and � = 0, the sequence diverges because the step size gðxÞ=krgðxÞk 2 2 becomes extremely large near the minimum point where the gradient rg(x) is close to 0. To avoid this, we need to set an appropriate constant � > 0. As shown in Fig. 4(d), when we set � greater than E, the points stop at the area where g(x) � � À E.
The entire CRISP procedure is summarized in the following pseudocode, where D o and D p denote the matrices containing the updating directions, R is a set of scan position vectors, and I is a set of observed diffraction patterns.The vector e stores the cost in each sample for calculation in line 16.
The following subsections are about auxiliary methods for CRISP.

Adaptive step size clipping
As the first auxiliary method, we propose a stabilizing method for updating objects and probes.Although the subgradient projection can avoid local minima, its step size may become extremely large around points where the gradient is close to 0, as shown in Fig. 3(b).This may cause a performance degradation.For stability, limiting the step size can be conceivable.This technique is commonly used in the field of machine learning and has shown good results (Lin et al., 2018).Fig. 3(c) illustrates updating with subgradient projection with step size clipping.The left sequence in Fig. 3(c) goes to the global minimum while avoiding a large step as in Fig. 3(b).As can be seen, step size clipping may work well for optimization with subgradient projection.
To make it easier to explain step size clipping, we first introduce a step size function G that replaces part of the step size in equations ( 16) and ( 17) as where Q is either P n or O n .By using this step size function, the updating formulas of the proposed method in equations ( 16) and ( 17) are rewritten as We next introduce a modified step size function that sets an upper limit for the update amount: limits the maximum step size to that of ePIE with (� o , � p ) = (� o , � p ).This is because G clip ðQ; �W; �; �Þ becomes 1=jQ n j 2 max , and equation ( 19) becomes equation ( 6) when (� o , � p ) = (� o , � p ).The latter adopts an adaptive limit that depends on the scale of the object and probe.Our experiment will show that the former setting often works well in practice and that the latter setting may further improve performance.The proposed clipping is used in lines 5 and 6 in the following algorithm, which corresponds to the function CRISPdirection.

Automatic tuning of n
As the second auxiliary method for CRISP, we propose to improve its practicality by relaxing the difficulty of hyperparameter tuning.The subgradient-projection-based update finally stops near a good solution when � is appropriately set for the minimum value of the objective function, as shown in Fig. 4(d).However, the update stops in an undesired solution when � is set to too large a value because the value of the step size function G becomes 0 early before the cost 1 2 k�W n k 2 F becomes sufficiently small.To avoid this, we automatically adjust � with each iteration to keep updating until the cost becomes small.Since the subgradient-projection-based update depends on the error variable 1 2 k�W n k 2 F , we adjust � [k+1] according to the average of the cost ð1=NÞ P N n¼1 1 2 k�W n k 2 F at the kth iteration.We adopt the adjusting step as follows: where c > 0 is a tuning parameter for adjusting how much to reduce from the average cost.Since we expect the average cost to decrease with updates, we set c 2 (0, 1) so that � [k+1]  becomes smaller than the average cost in the kth iteration.For this procedure, the error should be saved for each iteration as in line 4 in CRISPdirection.In each iteration, � is updated to � 0 after all objects and probes have been updated as line in 16 in the CRISP algorithm.� is initialized before the iteration begins as in line 1 in the CRISP algorithm.As shown in the pseudocode initializeXi, this initialization computes the error for each sample using the initial values of the object and probe and then uses all of them to calculate the first �.

Experiment
To investigate the properties of the proposed algorithm, numerical experiments were performed.For a numerical simulation, we used a TiO 2 -particle-filled polymethyl methacrylate film used in a previous paper (Yatabe & Takayama,Figure 4 Comparison of iterative updates between the gradient descent method and the subgradient projection for several cases.2022).These data were configured to imitate the hard-X-ray ptychographic measurement system at an imaging station of the Hyogo ID beamline BL24XU at SPring-8 (Takayama et al., 2020(Takayama et al., , 2021)).This measurement system is shown in Fig. 5.An undulator radiation X-ray beam monochromatized to a photon energy of 8 keV was cropped with the slit used as a virtual light source, and then the X-ray beam diffracted from the slit illuminated the beam-defining aperture (BDA).The BDA, a Fresnel zone plate (FZP) lens and the specimen were placed according to the thin lens formula; thus the demagnified image of the BDA was formed on the specimen for the local illumination.The first-order diffraction was selected with an order-sorting aperture (OSA).The diameter of the BDA was 20 mm, the outermost zone width and diameter of the FZP were 86 nm and 416 mm, respectively, and the demagnification ratio was set to 5, yielding an illumination beam with a diameter of 4 mm.It was assumed that the diffraction intensity patterns of the specimen were measured with an EIGER X 1M detector (Dectris Ltd) placed 4.5 m downstream of the specimen.
We used two simulation data sets whose diffraction intensities at the origin (I 0 ) were varied: 10 10 (high dose) and 10 8 (low dose).We added Poisson noise to the observed intensity patterns, which is a common assumption for actual measurements.For an experiment with real data, we used the measurement of a Siemens star chart and ink toner particles.
The proposed algorithm CRISP was compared with ePIE.For a quantitative evaluation, we used the following evaluation indices.To check the convergence of the algorithm, the R F factor (Miao et al., 2006) was calculated.
To evaluate the image resolution of the objects, we calculated the Fourier ring correlation (FRC) (Rosenthal & Henderson, 2003), � are the Fourier transforms of the estimated and true object transmission functions, respectively, and I ðSÞ is the set of frequency indices corresponding to the ring whose radius is equal to the given spatial frequency S. The closer the value is to 1, the higher the accuracy at that spatial frequency.

Result 1: performance per iteration
We compared the convergence speed among ePIE, rPIE and CRISP.We used the high-dose simulation data.To compare under similar conditions, the step size parameters of the proposed method were set to the same as those of ePIE, i.e. � o = � o = 1.0, � p = � p = 0.4.The balancing parameters of rPIE were set to those providing the best results according to the preliminary tuning: � o = 0.1, � p = 1.0.The clipping parameters and the tuning parameter of CRISP were (� o , � p ) = (1, 1) and c = 0.5, respectively.The number of iterations was 300.The ratios of the total computational time of CRISP to those of ePIE and rPIE were 1.09 and 1.02, respectively.
The results are shown in Fig. 6.In Fig. 6(j), the R F factors of ePIE and rPIE are lower than those of CRISP until the 100th iteration, while CRISP reached a lower R F factor than ePIE and rPIE at the 300th iteration.In Fig. 6(i), the FRC of CRISP at the final iteration is higher at the high spatial frequency than that of ePIE and rPIE.The reconstructed-object images shown in Figs.For each algorithm, the correlation coefficients between the reconstructed probes [Figs.6(d)-6( f)] and the ground truth [Fig. 6(h)] in the reciprocal space were almost 1, where these were calculated over the spatial frequency range cropped with the aperture of the illumination zone plate.
Compared with ePIE and rPIE, CRISP had a slower drop in R F during the first 100 iterations.This behavior can be interpreted as follows.At the beginning of iterations, the error �W n tends to be large, resulting in a large gradient.This makes the step size of ePIE and rPIE large at the beginning of iterations, as shown in Fig. 3(a).On the other hand, the step size of CRISP has less variation for the entire set of iterations, as shown in Fig. 3(c), because equation ( 15) normalizes its step size by the norm of the gradient in the denominator.These characteristics of CRISP might be the reason for its stable performance and better reconstruction quality.

Result 2: effect of the step size clipping
The following two experiments confirm the validity of the auxiliary methods of CRISP.This experiment verified the effect of the step size clipping.We used both high-dose and low-dose data.To show the difference in behavior due to clipping parameters, we compared the two types of setting:  parameter for CRISP was c = 0.01, and the number of iterations was 400 (compared with the experiment with high-dose data, more iterations were required for sufficient convergence).
The results are shown in Fig. 7.In both high-dose and lowdose data, the FRC results with step size clipping were better than those without clipping, and even ePIE and rPIE as shown in Figs.7( f) and 7(g).The reconstructed-object images, especially those reconstructed from low-dose data, exhibited higher improvement of FRC than those reconstructed without clipping.The third from the right images of Figs. 7(d) and 7(e) were reconstructed with step size clipping.These were less noisy than the image reconstructed without step size clipping [the third from the right of Fig. 7(c)].These results confirm that step size clipping is essential for CRISP to perform stable high-precision reconstruction.Similarly to Section 4.1, the correlation coefficients in the reciprocal space between the reconstructed probes and the ground truth were 1 for all results.
The behavior of CRISP depended slightly on the clipping parameter as shown in Figs.7( f ) and 7(g).The ePIE-like clipping worked best with the high-dose data, and the scaleadaptive clipping worked slightly better than the ePIE-like clipping with the low-dose data.Although this suggests that which clipping parameter to set may depend on the condition of the data, we found experimentally that ePIE-like clipping generally stabilizes performance.Therefore, CRISP can avoid a detailed setting of the clipping parameter, and the only additional parameter that needs to be set in CRISP is the tuning parameter c for automatic tuning.
An advantage of CRISP over rPIE is noise robustness.In Figs.7( f ) and 7(g), rPIE performed well for high-dose data, while FRC was poor for low-dose data.rPIE uses an elementwise step size in equations ( 10) and ( 11) to improve the Figure 8 �-dependent performance of CRISP.The left and right columns are the result of high-dose and low-dose data, respectively.The top row is the R F factor, the middle is the R F factor at the final iteration for each � and the bottom is FRC.For all subfigures, the colors correspond to values of fixed �, which go from yellow to blue as the value increases from 0. The dashed black lines are ePIE, the dashed brown lines are rPIE, the red line and points are the automatic-�-tuned CRISP, and the rest are the �-fixed CRISP.The black lines in the top row represent the R F factor between the diffraction patterns and their noise-free version.The � values of the red points on the middle row were calculated using the final updated � [K] as (1/c)� [K] by considering the scaling by tuning parameter c.For the Siemens star chart, the median of � ij was 1.4 � 10 À 2 for ePIE, 1.5 � 10 À 2 for rPIE and 1.1 � 10 À 2 for CRISP; the variance of � ij was 5.8 � 10 À 6 for ePIE, 8.5 � 10 À 6 for rPIE and 8.0 � 10 À 7 for CRISP; the R F factor at the final iteration was 0.422 for ePIE, 0.499 for rPIE and 0.423 for CRISP.For ink toner, the median of � ij was 4.5 � 10 À 3 for ePIE, 9.3 � 10 À 3 for rPIE and 4.2 � 10 À 3 for CRISP; the variance of � ij was 4.2 � 10 À 7 for ePIE, 2.6 � 10 À 6 for rPIE and 1.7 � 10 À 7 for CRISP; the R F factor at the final iteration was 0.237 for ePIE, 0.299 for rPIE and 0.230 for CRISP.Bars in (c) indicate 4 mm, and that in the inset 1 mm.
We provide further arguments to support the validity of the automatic tuning.The bottom and middle rows of Fig. 8 show that FRC increased and the R F factor decreased as � increased to a certain value such as � = 0.5 for the high-dose data and � = 0.06 for the low-dose data.On the other hand, FRC degenerated when � was set greater than its appropriate value.These results suggest that there is a proper � range for better performance.The � derived by automatic tuning was 0.47 for the high-dose data and 0.074 for the low-dose data, as shown in the middle rows of Fig. 8.These values were close to the setting of � that exhibited the best performance among the fixed �.Thus, it was shown that automatic tuning simplifies the difficult parameter setting and can set an appropriate � to give a good performance.

Result 4: performance with real data
For real data, we verified the robustness of the proposed method to the order for the object update, which is known to affect the reconstruction performance.We evaluated the dispersion of reconstruction performance in ten trials in which the update order was randomly changed.To evaluate the dispersion, we used the � ij score which is calculated as follows (Sekiguchi et al., 2017;Takayama & Nakasako, 2024): The results are shown in Fig. 9.We visualized the histograms of � ij for each data set as shown in Figs.9(d) and 9(e).In both figures, � ij scores of CRISP have smaller values and are distributed in a narrower range than those of the other methods.This was also shown by the fact that the median and variance of � ij scores were lower in CRISP than in the other methods.
These results confirm that CRISP is more robust to the update order than the other methods.

Conclusion
In this paper, we proposed a PR algorithm named CRISP that is based on subgradient projection.It performs error-adaptive updates and can avoid poor update stagnation.The proposed method stably reconstructed higher-quality images than ePIE and rPIE, while the convergence speed and complexity in parameter tuning were almost the same as those of ePIE.In future work, we will extend the proposed method to handle regularizations and different noise models, such as the Poisson model, and apply it to multi-slice approaches.

Figure 1
Figure 1Schematic illustration of a ptychographic measurement.

GFigure 3
Figure 3 Contour plot of a cost function and trajectories of iterative updates.(a) An example using the gradient descent method.Examples using the iterative subgradient projection (b) without step size clipping and (c) with it.The points are colored according to the number of updates, and the arrows indicate the transition of each update.The global minimum is shown with a red star.
(a) An example using the gradient descent method.(b)-(d) Examples using the iterative subgradient projection.(b) shows the case where the minimum value of the objective function g is 0, and (c) and (d) show the case where the minimum value is E. In (c) and (d), the constants used in the subgradient projection are 0 and �, respectively.The three-dimensional schematic diagram in the upper panels is projected to a two-dimensional plane and is shown in the lower panels.The color of the points represents the number of iterations, and the arrows indicate the transition of each update.The plane at g(x) = 0 is shown in light gray and the plane at g(x) = � in light blue.In the bottom diagram of (d), the area where g(x) � � À E is shown in blue.
6(a) to 6(c) reflect the result in Fig. 6(i).At the final (300th) iteration, the amplitude image reconstructed by CRISP was closer to the ground truth [Fig.6(g)] than that of ePIE.Note that ePIE enhanced the white contour of the particles as shown in the upper-right box of Fig. 6(a), which is known as an artifact of defocusing.Moreover, the amplitude image reconstructed by CRISP shown in the upper-right box of Fig. 6(c) was less noisy than that reconstructed by rPIE shown in the upper-right box of Fig. 6(b).These results confirm that CRISP converges as fast as other methods and can reconstruct higher-quality images.

Figure 5
Figure 5Schematic illustration of the experimental setup.

Figure 6 7
Figure 6 Comparison of performance per iteration among ePIE, rPIE and CRISP.(a)-(c) show the amplitude (upper) and phase (bottom) of object images reconstructed by (a) ePIE, (b) rPIE and (c) CRISP, respectively.(d)-( f ) show the probe reconstructed by ePIE, rPIE and CRISP, respectively.(g) and (h) show the ground-truth object image and probe, respectively.(i) and (j) are the calculated FRC and R F factor, respectively.The FRC figures correspond to 100, 200 and 300 iterations from left to right, respectively.The line colors correspond to the colors of (a)-(c).The horizontal black line in (j) represents the R F factor between the diffraction patterns and their noise-free version.Bars in (c), (d) and (g) indicate 4 mm, and those in the inset are 0.5 mm.

Figure 9
Figure 9 Results for real data.The left and right columns are the Siemens star chart and ink toner, respectively.Typical diffraction intensity patterns are shown at the top.The amplitude images of the object, phase images of the object and probe reconstructed by (a) ePIE, (b) rPIE and (c) CRISP are shown in the left, middle and right, respectively.(d) and (e) are histograms of � ij for (d) the Siemens star chart and (e) ink toner.The blue bins are ePIE, the yellow bins are rPIE and the red bins are CRISP.For the Siemens star chart, the median of � ij was 1.4 � 10 À 2 for ePIE, 1.5 � 10 À 2 for rPIE and 1.1 � 10 À 2 for CRISP; the variance of � ij was 5.8 � 10 À 6 for ePIE, 8.5 � 10 À 6 for rPIE and 8.0 � 10 À 7 for CRISP; the R F factor at the final iteration was 0.422 for ePIE, 0.499 for rPIE and 0.423 for CRISP.For ink toner, the median of � ij was 4.5 � 10 À 3 for ePIE, 9.3 � 10 À 3 for rPIE and 4.2 � 10 À 3 for CRISP; the variance of � ij was 4.2 � 10 À 7 for ePIE, 2.6 � 10 À 6 for rPIE and 1.7 � 10 À 7 for CRISP; the R F factor at the final iteration was 0.237 for ePIE, 0.299 for rPIE and 0.230 for CRISP.Bars in (c) indicate 4 mm, and that in the inset 1 mm.
jO i ½k; l� À O j ½k; l�j= P k;l jO i ½k; l� þ O j ½k; l�j; ð25Þ where O i and O j are the pair of reconstructed-object images from two different trials.The lower � ij is, the smaller the difference between the two reconstructed images.Since ten trials were conducted, there were 45 combinations of reconstructed images (excluding itself).The step size parameters were set to � o = � o = 0.8, � p = � p = 0.4 for the Siemens star chart and � o = � o = 0.4, � p = � p = 0.2 for ink toner.The balancing parameters of rPIE were set to � o = 0.4, � p = 2.0 for the Siemens star chart and � o = 0.5, � p = 1.0 for ink toner.The tuning parameter of CRISP was c = 0.01 for both, and the clipping parameter was (� o , � p ) = (1, 1) for both.The number of iterations was 300 for the Siemens star chart and 100 for ink toner.

.
Vectors are indicated by bold lower-case letters, i.e. v, and the ith element is denoted by v[i].Element-wise multiplication is denoted by � , the element-wise absolute value is denoted by | • | and the largest absolute value among all elements of input is denoted by | • | max .kAk F is the Frobenius norm, which is defined as kAk