Ptychographic wavefront characterisation for single-particle imaging at X-ray lasers

A well-characterised wavefront is important for many X-ray free-electron laser (XFEL) experiments, especially for single-particle imaging (SPI), where individual bio-molecules randomly sample a nanometer-region of highly-focused femtosecond pulses. We demonstrate high-resolution multiple-plane wavefront imaging of an ensemble of XFEL pulses, focused by Kirkpatrick-Baez (KB) mirrors, based on mixed-state ptychography, an approach letting us infer and reduce experimental sources of instability. From the recovered wavefront profiles, we show that while local photon fluence correction is crucial and possible for SPI, a small diversity of phase-tilts likely has no impact. Our detailed characterisation will aid interpretation of data from past and future SPI experiments, and provides a basis for further improvements to experimental design and reconstruction algorithms.

These XFEL experiments depend on highly-focused and spatially-coherent X-ray pulses. Furthermore, knowing at high resolution how the intensities and phases of these pulses develop through the optical focus is helpful for SPI. This knowledge helps us determine where the nanometer-sized particles should be injected to reproducibly give bright diffraction patterns. The phase and intensity profiles differ between individual pulses because of the self-amplified spontaneous emission (SASE) pulse-generation mechanism. While pulse-to-pulse intensity fluctuations are typically monitored by upstream gas detectors, the complex-valued beam profiles of the focused pulses are much less well-studied. Without well-characterised beam profiles, experimental design, instrument alignment, and data interpretation can be delayed or severely handicapped.
Many of these wavefront characterisation experiments indicate that variations between pulses are significant, and these variations have even been speculated to adversely impact SPI. To study this impact definitively requires an experiment that directly measures both the spatial and temporal variations of the XFEL pulses at the foci used for SPI. Here we describe a ptychographic wavefront-sensing experiment designed specifically to study this impact. To image these intense foci with ptychography, the pulses must be attenuated to keep the target undamaged, which leads to dimmer diffraction in a single pulse exposure. To obtain a stable ptychographic reconstruction we must average many dim patterns together. However, the intrinsic variations between pulses or sub-micrometer mechanical vibrations anywhere along the beam path can reduce the speckle contrast in these averaged patterns, thus jeopardizing the ptychographic reconstruction.
These issues with beam diversity could be overcome with a mixed-state approach [34] to account for unavoidable experimental variations encoded in the diffraction data. Furthermore, we describe a single-pulse fitting procedure that infers hidden sources of variation, e.g. sample stage vibrations. A similar strategy could also be applied to related imaging methods such as electron ptychography, which often face similar important but hidden experimental variations (e.g. drift, aberrations) [35].
This mixed-state reconstruction strategy allowed us to characterise the full spatial intensity and phase profile of a focused XFEL beam and study their implications for SPI experiments, in particular the expected shot-by-shot variations in fluence and phase tilt. Through numerical propagation along the beam axis, we quantified the phases and intensities at different defocus positions and computed their effect on SPI data collection efficiency (i.e. hit-rate). Using this quantitative reconstruction of the complex-valued beam profile we simulated a large SPI data set under realistic experimental conditions. A unique insight from this is the importance of a photon fluence correction in SPI reconstruction pipelines.

A. Data collection
A ptychographic wavefront sensing experiment, with its setup shown in Fig. 1a, was performed inside the LAMP endstation of the Atomic, Molecular and Optical Science (AMO) instrument [36] at the Linac Coherent Light Source (LCLS) under beam conditions similar to SPI experiments. A 200 nm thick Siemens Star test pattern, made by depositing gold (X30-30-2, www.zeiss.com) on a 110 nm Si$_3$N$_4$ membrane, was used as a fixed target. To limit damage to the target, the full and unfocused LCLS beam with a photon energy of 1.26 keV and an average pulse energy of (2.76 ± 0.16) mJ (as measured by an upstream gas monitor detector) was attenuated using a 4.26 m long nitrogen gas attenuator at an average pressure of 14.23 Torr, resulting in a transmission of about $10^{-7}$. The attenuated beam was collimated using a pair of slits and focused onto the test pattern by a pair of Kirkpatrick-Baez (KB) mirrors which were located 1.1 and 1.6 m upstream of the interaction region.
The test pattern was scanned by moving the sample using a motorised piezo stage in a 10 × 10 "snake" pattern (black circles in Fig. 1b), while diffraction images were taken with a p-n junction charge-coupled device (pnCCD) detector [37] placed 0.73 m downstream of the test pattern. The pnCCD was moved off-center to place the diffraction pattern on a 192 × 192 pixel area without any dead area. Because the beam was attenuated by the upstream gas attenuator, no beamstop was necessary, and the signal on the detector covered the complete dynamic range of diffraction from the sample. The scanning pattern was designed such that each position was given a randomised offset from a regular raster grid position to avoid raster grid pathology [28,38]. To reduce sample vibrations, the cryochillers of the pnCCDs were switched off temporarily during data acquisition. At each of the 100 scan positions $\mathbf{x}_j = (x_j, y_j)$, a total of 900 single-shot data frames were recorded, distributed over 3 separate data collections (each with 300 frames per position). A selection of diffraction patterns is shown in Fig. 1c.

B. Data pre-processing
For each diffraction frame, a running dark subtraction was performed based on the average of the closest 100 dark frames. Dark frames were recorded every two seconds using BYKIK, the upstream undulator beam kicker magnet. This dynamic thermal dark correction was necessary to account for changes in the readout of the pnCCD as it was slowly heating up. The detector was in gain mode 3 where a photon at the given energy is equivalent to 25 analogue-to-digital units (ADUs). For all detector pixels, dark-corrected ADU values were converted to units of photons and values smaller than 0.8 photons were rounded down to zero with all other values rounded to the closest integer value.
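The photon conversion described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function and constant names are hypothetical, the 25 ADU per photon and the 0.8 photon threshold are taken from the text, and the handling of exact half-integer values (NumPy rounds them to the nearest even integer) is an assumption, since the text does not specify it.

```python
import numpy as np

ADU_PER_PHOTON = 25.0     # gain mode 3 at 1.26 keV, as quoted in the text
THRESHOLD_PHOTONS = 0.8   # values below this are set to zero photons


def adu_to_photons(dark_corrected_adu):
    """Convert dark-corrected ADU values to integer photon counts."""
    photons = np.asarray(dark_corrected_adu, dtype=float) / ADU_PER_PHOTON
    # Below the threshold: zero. Otherwise: nearest integer.
    # np.rint sends exact halves to the even integer (assumption, see lead-in).
    counts = np.where(photons < THRESHOLD_PHOTONS, 0.0, np.rint(photons))
    return counts.astype(int)
```

For instance, a pixel reading of 20 ADU (0.8 photons) is counted as one photon, while 19.9 ADU is set to zero.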

A. Averaged data set with nominal positions
For each of the 3 data collections, all 300 frames sharing the same nominal scanning position were averaged together to form a new set of ptychographic diffraction patterns $\hat{I}_{j\mathbf{q}}$, where $j$ is the position index and $\mathbf{q}$ is the reciprocal space coordinate. These averaged diffraction patterns have a much higher signal-to-noise ratio compared to individual frames but also a reduced speckle visibility due to pulse-to-pulse variations. This reduction step effectively produces a virtual data set, encoding any form of shot-to-shot variation in a "blurred" diffraction pattern. The mixed-state approach to ptychography [34] was designed precisely for such cases where one or more sources of partial coherence are present. In the mixed-state formalism, the diffracted intensity pattern at scan position $\mathbf{x}_j$ can be written as

$$I_{j\mathbf{q}} = \sum_{m=0}^{M-1} \left| \mathcal{F}_{\mathbf{x}\to\mathbf{q}}\left[ P^{(m)}_{\mathbf{x}} \, O_{\mathbf{x}-\mathbf{x}_j} \right] \right|^2 \qquad (1)$$

where $M$ is a given number of probe modes (components) $P^{(m)}_{\mathbf{x}}$ into which the partially coherent illumination is decomposed, $O_{\mathbf{x}}$ is the transmission function of the object and $\mathcal{F}_{\mathbf{x}\to\mathbf{q}}[\cdot]$ is the 2D Fourier transform. The iterative reconstruction algorithm adapted for this problem, as described in [34] and implemented in PtyPy, recovered the 5 most dominant probe modes, along with a high-resolution complex-valued image of the Siemens Star test pattern (Fig. 2a). We used 2000 iterations of the difference map (DM) algorithm [38] followed by 1000 iterations of the maximum-likelihood (ML) algorithm [40]. As an initial guess of the probe, we used an idealized illumination model based on the average focal distance of 1.35 m of the two KB mirrors, which gave a nominal focal size of 2 × 2 µm$^2$.
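The mixed-state forward model described above (an incoherent sum, over probe modes, of the far-field intensity of probe times object) can be sketched as follows. This is a minimal illustration, not the PtyPy implementation: the function name, the array layout and the orthonormal FFT scaling are assumptions, and the object patch is assumed to have already been cut out at the scan position of interest.

```python
import numpy as np


def mixed_state_intensity(probe_modes, object_patch):
    """Incoherent sum of far-field intensities over all probe modes.

    probe_modes  : complex array of shape (M, N, N), one slice per mode
    object_patch : complex array of shape (N, N), the object region
                   illuminated at the current scan position
    """
    # Exit wave of each mode: probe mode times object transmission
    exit_waves = probe_modes * object_patch[np.newaxis, :, :]
    # Far-field amplitude of each mode (2D FFT over the last two axes)
    far_fields = np.fft.fftshift(
        np.fft.fft2(exit_waves, norm="ortho"), axes=(-2, -1)
    )
    # Modes do not interfere with each other: add intensities, not amplitudes
    return np.sum(np.abs(far_fields) ** 2, axis=0)
```

Because the modes are summed as intensities, the speckle contrast of the modelled pattern drops as power is spread over more modes, which is how the model absorbs the blurring in the averaged data.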
Reconstructions with more than 5 probe modes did not significantly change the relative power of the probe modes shown in Fig. 2a. Furthermore, the weakest reconstructed mode has distinct features forming towards the edges of the field of view. These features do not correspond to regions where the reconstruction has converged to the expected structure of the test patterns, and are attributed to possible detector readout artifacts in our measurement, an effect that had been previously observed [41].

B. Checking for hidden sources of pulse-to-pulse variation
Even though we had experimentally minimised vibrations on the sample stage, we still expected minor variations in the positions, due to dynamic changes either in the beam or in the sample's location. XFELs are known to show some beam pointing instabilities that can lead to changes in the focused wavefront [42]. In order to check for these types of pulse-to-pulse variations, we introduced a list of free parameters and defined a model for a single-shot diffraction pattern $n$ as the incoherent sum

$$I_{n\mathbf{q}} = \left| \mathcal{F}_{\mathbf{x}\to\mathbf{q}}\left[ c_{n0} \, A_{\mathbf{x}n} \, P^{(0)}_{\mathbf{x}-\Delta\mathbf{x}_n} \, O_{\mathbf{x}-\mathbf{x}_j} \right] \right|^2 + \sum_{m=1}^{M-1} \left| \mathcal{F}_{\mathbf{x}\to\mathbf{q}}\left[ c_{nm} \, B_{\mathbf{x}n} \, P^{(m)}_{\mathbf{x}} \, O_{\mathbf{x}-\mathbf{x}_j} \right] \right|^2 \qquad (2)$$

where $P^{(m)}_{\mathbf{x}}$ are the orthogonalised probe modes, $O_{\mathbf{x}-\mathbf{x}_j}$ is the object at the nominal scan position $\mathbf{x}_j$ of that shot, $c_{nm}$ are positive real-valued coefficients, $\Delta\mathbf{x}_n = (\Delta x_n, \Delta y_n)$ is an additional translation of the main probe mode with respect to the object and $A_{\mathbf{x}n}$, $B_{\mathbf{x}n}$ are additional phase terms for the main and all other probe modes, respectively. These phase terms are defined as

$$A_{\mathbf{x}n} = e^{i(\alpha_{nx} x + \alpha_{ny} y)} \quad \text{and} \quad B_{\mathbf{x}n} = e^{i(\beta_{nx} x + \beta_{ny} y)} \qquad (3)$$

with the phase differences $\alpha_{nx}$, $\alpha_{ny}$, $\beta_{nx}$ and $\beta_{ny}$, which are related to phase tilt angles (see Section C in Methods) and further correspond to center shifts on the detector (and in the source plane). Altogether we have defined $M + 6$ scalars for a single pulse and treat them as fitting parameters in an optimisation scheme, minimizing the quadratic distance between the measured intensity pattern $\hat{I}_{n\mathbf{q}}$ and the model $I_{n\mathbf{q}}$. We used Powell's method, a conjugate direction approach that does not require any derivatives, to obtain these single-shot parameters, and cropped model and data to a 72 × 72 pixel area for improved computational efficiency. We performed this single-pulse fitting analysis on all 3 data collections, with a total of 90k single-shot events, and discovered substantial variations in the positions with a root-mean-square deviation (RMSD) from the nominal positions of 0.34 µm in the horizontal and 0.33 µm in the vertical direction.
Interestingly, the changing positions follow well-defined oscillations with a dominant frequency at around 30 Hz, which can likely be attributed to electrical equipment in the vicinity of the sample chamber. We have also observed variations in the phase tilts and the coefficients, both of which are further discussed in the Results section.
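One simple way to look for such an oscillation in a trace of fitted positions is to locate the strongest non-DC peak of its Fourier spectrum. The sketch below is an illustration only: the function name is hypothetical, and it assumes that consecutive shots were recorded at a uniform rate (for example the 120 Hz repetition rate mentioned later in the text), which may not match how the actual analysis was done.

```python
import numpy as np


def dominant_frequency(positions, sampling_rate_hz):
    """Return the frequency (Hz) of the strongest non-DC Fourier component."""
    trace = np.asarray(positions, dtype=float)
    trace = trace - trace.mean()                 # remove the static offset
    spectrum = np.abs(np.fft.rfft(trace))
    freqs = np.fft.rfftfreq(trace.size, d=1.0 / sampling_rate_hz)
    # Skip bin 0 (DC) when searching for the peak
    return freqs[np.argmax(spectrum[1:]) + 1]
```

With 300 consecutive shots at 120 Hz the frequency resolution is 0.4 Hz, so a 30 Hz oscillation would fall well below the 60 Hz Nyquist limit and be resolved cleanly.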

C. Averaged data set with corrected positions
With the corrected scan positions, we defined a new averaged virtual data set which combined the diffraction patterns from all 3 data collections sampled on a much denser 60 × 60 grid with a total of 2642 scan positions and at least 15 patterns averaged per position (grey dots in Fig. 1b). Thereby, we reduced the positional variation for each data point at the cost of lower signal-to-noise ratios. We performed another mixed-state reconstruction, again with a simulated initial guess for the probe, using 1000 iterations of DM and 1000 iterations of ML, resulting in an improved reconstruction of the Siemens star together with the 5 most dominant probe modes (Fig. 2b). The two weakest modes exhibit similar corner features, attributed to detector artefacts as described above. We performed a second round of single-pulse fitting, again on the full set of 90k single-hit events but with the updated reconstruction, and found that variations in the positions were reduced to an RMSD of 0.24 µm in the horizontal and 0.14 µm in the vertical direction compared to the first round. Further rounds of single-pulse fitting did not significantly improve the corrected positions.

D. Single-frame data set with corrected positions
For the final ptychographic reconstruction, we selected 20 single-shot events per nominal position, giving a total of 2000 events, each with corrected positions (red dots in Fig. 1b). We then performed a mixed-state reconstruction using 2000 iterations of the ML algorithm with an iteration-dependent smoothing operation on the object. Starting with 20 pixels, the size of the Gaussian smoothing kernel was reduced exponentially with a decay rate of 1/200 per iteration. As the initial guess for the illumination, we used the 5 probe modes from the previous reconstruction (Fig. 2b) and started updating the probe modes after 500 iterations. The final reconstruction of the Siemens star, along with the 5 most dominant modes, is presented in Fig. 2c. Based on the given geometry of the experiment, the half-period resolution (pixel size) was 50.3 nm.
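The quoted pixel size can be checked approximately from the far-field sampling relation, pixel size = wavelength × distance / (number of pixels × pixel pitch). The sketch below is a rough cross-check under assumptions: the function names are hypothetical, the 75 µm pixel pitch is the pnCCD value quoted later in the text, and the full 192-pixel window at 0.73 m is assumed to define the reconstruction grid. With these numbers the result is close to 50 nm; the small difference from the quoted 50.3 nm presumably reflects the exact geometry used by the authors.

```python
H_C_EV_NM = 1239.841984  # Planck constant times speed of light, in eV nm


def wavelength_nm(photon_energy_ev):
    """Photon wavelength in nm for a given photon energy in eV."""
    return H_C_EV_NM / photon_energy_ev


def real_space_pixel_nm(photon_energy_ev, distance_m, pixel_pitch_m, n_pixels):
    """Real-space pixel size (nm) of a far-field ptychographic reconstruction."""
    wavelength_m = wavelength_nm(photon_energy_ev) * 1e-9
    return wavelength_m * distance_m / (n_pixels * pixel_pitch_m) * 1e9


# Values from the text; pixel pitch and window size are assumptions (see above)
pixel_nm = real_space_pixel_nm(1260.0, 0.73, 75e-6, 192)
```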

RESULTS
After two rounds of data averaging and single-pulse fitting, we identified the 5 reconstructed modes (Fig. 2c) as the best approximation of the partially coherent average AMO beam, with 65.7 % of the total power in its most dominant mode (Fig. 3a). We note that partial coherence in this context describes all sources of "effective" decoherence, including unresolved fringes and speckles produced by scatterers along the optical path, and the detector point-spread function. However, back-propagating each orthogonal probe mode into the secondary source plane and inspecting its intensity (Fig. 3d-h) suggests that all 5 modes, incoherently summed together (Fig. 3i), are necessary to describe the features in the recorded direct beam intensity of a single LCLS pulse focused by the KB optics at AMO (Fig. 3c). Therefore, we present the average intensity profile at AMO as the incoherent sum of the 5 probe modes (Fig. 3) measured by mixed-state ptychography 0.5 mm upstream of the focal plane with the strongest intensity (as determined by numerical propagation).
The most intense region of the focus has an estimated full width at half maximum (FWHM) of 2 µm in the horizontal and 0.5 µm in the vertical direction with a large tail extending several micrometres towards the upper left corner. In the same intense focus region, the phase in the average wavefront appears to be flat with small changes towards the edges as the intensity gets weaker and large changes in the tails of the beam. These tails (or "wings") in a KB-focused beam are often attributed to mid-spatial frequency roughness of the mirror [43,44].
In the following sections, we derive multiple properties of the reconstructed intensity and phase profiles numerically propagated into different defocus planes (see Section A in Methods for details) and describe their implications on single-particle imaging.

A. The beam intensity profile seen by single particles
Our recovered attenuated photon fluence distribution, when properly rescaled (Fig. 4), is comparable to an independent study performed at the same beamline with sucrose particles [45]. For upscaling the recovered intensity profile at different defocus planes, we used an unattenuated pulse energy of 2.76 mJ (equivalent to $1.38 \times 10^{13}$ photons) and an estimated transmission through the optics of (4.8 ± 0.4) % (see Section 1 in Supplement 1 for details). This transmission efficiency is reasonable when compared to more recent power measurements [46] given the additional apertures used in our experiment (see Fig. 3i). The upscaling approximates the fluence distributions as observed by a single particle in a regular SPI experiment at AMO. While 2D profiles as shown in Fig. 4a are typically unknown in such experiments, it is possible to map out their 1D distributions (Fig. 4b) based on a fitting analysis of single-shot diffraction from particles randomly sampling the beam [5,10,16,26,45]. Fig. 4b shows that the distributions obtained from this ptychographic experiment (red curves) agree with the fluence histograms (shown in gray) from sucrose cluster diffraction obtained at a slightly lower photon energy (1140 eV) and pulse energy (1.1 mJ) but otherwise similar conditions at the AMO endstation [45,47]. The sucrose data have been rescaled to match the average pulse energy of the ptychography data and adjusted to consider the 70 % dynamic scattering efficiency reported in [45]. The agreement between the two measurements persists within ±3 mm of the nominal focus.
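The photon numbers used for the upscaling follow from dividing the pulse energy by the photon energy. The sketch below is a back-of-the-envelope check with hypothetical function and variable names; using exactly 1.26 keV it gives about $1.37 \times 10^{13}$ photons, within about 1 % of the quoted value, and the photon count after the optics is an illustrative product of that number with the quoted 4.8 % transmission rather than a figure stated in the text.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # joules per electronvolt


def photons_per_pulse(pulse_energy_j, photon_energy_ev):
    """Number of photons in a pulse of given energy and photon energy."""
    return pulse_energy_j / (photon_energy_ev * ELEMENTARY_CHARGE_C)


n_unattenuated = photons_per_pulse(2.76e-3, 1260.0)   # full LCLS pulse
optics_transmission = 0.048                           # (4.8 ± 0.4) % from the text
n_after_optics = n_unattenuated * optics_transmission # illustrative estimate
```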

B. The "optimal" defocus position for single-particle imaging
One of the most time-consuming challenges in SPI is to adjust the overlap between the particle beam and the X-ray pulses to maximise the fluence and the ratio of detected single-particle events (hit ratio) [5,48]. With the reconstructed 2D fluence profile, we can also define this hit ratio as the area of all spots with a fluence larger than a certain detection limit multiplied by the particle density (particles per area). This particle density depends on the particle delivery conditions; here we assume 0.001 particles/µm$^2$, which seems a reasonable choice for commonly used particle injectors [49].
In practice, one has to find the best possible trade-off between these two quantities (fluence and hit ratio), as the focal plane with the highest peak fluence (the focus) usually also minimises the achievable hit ratio (Fig. 5). Similarly, going further away from the focus increases the hit ratio while decreasing the peak fluence. This is always the case in Gaussian beam optics no matter which level of fluence is used for defining the single-particle hit detection limit (Figs. 5a and 5c).
Looking at how the peak fluence changes with defocus (Fig. 5a), we can see that the KB-focused beam at AMO is similar to a Gaussian beam with Rayleigh length 2.4 mm but with highly asymmetric features downstream of the focus. Similar effects have been found in ablation studies of the same KB-focused beam [21]. Furthermore, we also see a similar trend in the expected hit ratios compared to a Gaussian beam, but crucially, as the hit detection limit increases, which is equivalent to imaging smaller and smaller particles, the situation is different and the position along the beam axis with the minimal hit ratio is no longer found at the focus (Figs. 5a and 5c). This also means that measuring the hit ratio as the particle stream is scanned along the X-ray beam axis is likely to give a biased estimate of the focal plane.
Concerning the best choice of defocus plane for SPI in the given KB-focused beam at AMO (or other similar KB-focused beams), there is no obvious answer because it depends strongly on the size and detectability of the particles. At a given particle density of 0.001 particles/µm$^2$, larger particles that can be detected at as little as 1 % of the maximum achievable peak fluence might still give rise to reasonable diffraction intensities at −10 mm defocus, with much smaller peak fluence but also a much higher hit ratio. Smaller particles that are only detectable above 10 % of the maximum achievable peak fluence should probably be imaged closer to the focus where both fluence and hit ratio can be maximised (Figs. 5a and 5b).
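The hit-ratio definition above translates directly into a computation on a pixelated fluence map. The following sketch is an illustration with hypothetical names; it assumes the map is sampled on square pixels of known area and, like the definition in the text, returns the expected number of detectable particles per pulse, which approximates a probability only while it is much smaller than one.

```python
import numpy as np


def hit_ratio(fluence_map, pixel_area_um2, detection_limit,
              particle_density_per_um2=0.001):
    """Area above the detection limit times the particle density.

    fluence_map     : 2D array of fluences (e.g. photons per square micrometre)
    pixel_area_um2  : area of one map pixel in square micrometres
    detection_limit : minimum fluence at which a particle counts as a hit
    """
    n_pixels_above = np.count_nonzero(np.asarray(fluence_map) > detection_limit)
    area_above_um2 = n_pixels_above * pixel_area_um2
    return area_above_um2 * particle_density_per_um2
```

For example, a region of 1 µm$^2$ above the detection limit at 0.001 particles/µm$^2$ corresponds to a hit ratio of 0.1 %.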
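For orientation, the Gaussian-beam reference behaviour can be written in closed form. The sketch below assumes an ideal, circularly symmetric Gaussian beam (a simplification; the measured beam is asymmetric) with the 2.4 mm Rayleigh length quoted above, and hypothetical function names. The peak fluence falls as $1/(1 + (z/z_R)^2)$, while the area above a detection limit, set as a fraction $f$ of the in-focus peak, scales as $s \ln(1/(f s)) / \ln(1/f)$ with $s = 1 + (z/z_R)^2$.

```python
import math


def gaussian_peak_fraction(z_mm, rayleigh_mm=2.4):
    """Peak fluence at defocus z relative to the in-focus peak fluence."""
    return 1.0 / (1.0 + (z_mm / rayleigh_mm) ** 2)


def gaussian_relative_hit_area(z_mm, limit_fraction, rayleigh_mm=2.4):
    """Area above the detection limit at defocus z, relative to the focus.

    limit_fraction is the detection limit divided by the in-focus peak
    fluence and must lie strictly between 0 and 1.
    """
    s = 1.0 + (z_mm / rayleigh_mm) ** 2          # beam area growth factor
    local_peak_over_limit = 1.0 / (limit_fraction * s)
    if local_peak_over_limit <= 1.0:
        return 0.0                               # nowhere above the limit
    return s * math.log(local_peak_over_limit) / math.log(1.0 / limit_fraction)
```

With a detection limit at 1 % of the peak, one Rayleigh length of defocus halves the peak fluence but increases the detectable area, and hence the hit ratio, by about 70 %.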

C. Strong fluence variations impact structural inference
Reconstructing the 3D Fourier intensity in SPI is equivalent to inferring each particle's unknown orientation [50] and the local incident photon fluence [8,51] from only their respective diffraction measurements. Our reconstructed pulse profile in Fig. 4 clearly demonstrates how this local fluence can vary by orders of magnitude depending on where injected particles randomly intercept the focused pulse, rather than on pulse-to-pulse variations from the SASE lasing process. While such fluence variation between patterns is sometimes corrected as a pre-processing step in SPI by normalizing the number of photons per pattern, this is challenging when the patterns are severely photon-limited (Fig. 6b).

Fig. 4b. Fluence distributions, comparing the histogram of (a) to the distribution obtained from single-particle sucrose diffraction data. The distributions obtained by ptychography (red curve) are normalised such that their integral equals 1. The distribution obtained by sucrose SPI is rescaled to match the average pulse energy of 2.76 mJ and normalised such that the integral is equal to the ptychography distribution for fluences above $5 \times 10^{9}$ ph/µm$^2$. The bin width is $2 \times 10^{9}$ ph/µm$^2$.

Here, we used the reconstructed average pulse profile to study whether the variations in local photon fluences for different illuminated particles can be correctly inferred, and the impact of such variations on structure determination. With the Dragonfly package [52] we simulated 1 million single-particle diffraction patterns of 1.4 MDa particles (Fig. 6a,b) randomly placed in the expected beam profile at $\Delta f = 0$ mm (Fig. 4). For this simulation, we used relevant AMO parameters with a photon energy of 1.26 keV and the pnCCD detector placed 130 mm behind the interaction region with a pixel pitch of 75 µm and a circular beamstop of radius 30 px. A 'hit' detection limit of $1 \times 10^{10}$ ph/µm$^2$, corresponding to around 50 scattered photons per pattern, was assumed.
For a single structural class, we found that the expand-maximize-compress (EMC) [50] SPI reconstruction algorithm (see Section B in Methods for details) adequately recovers the ground truth photon fluences of each particle/pattern (Fig. 6c).
Surprisingly, if the patterns were instead randomly generated from two different 1.4 MDa structures (Fig. 6a), Fig. 6d shows that the correct structural class cannot be determined without also inferring (hence correcting for) the fluence variations. Furthermore, the structural classes are only correctly inferred for particles illuminated by sufficiently high local photon fluences (above approximately $5 \times 10^{10}$ ph/µm$^2$). This dependence on fluence supports the importance of the few photons scattered to sufficiently high angles, which are important for distinguishing structural classes through their higher-resolution differences.

D. The impact of phase tilts on single-particle diffraction
Besides the optimisation for fluence and hit ratios, in SPI it is also desired to inject the particles into a flat wavefront [48]. Several single-particle studies have observed displacements in the centre of diffraction attributed to differences in phase tilts between local regions of different pulse wavefronts intersected by different particles [10,16,26]. These variations in the phase tilt can either be due to shot-to-shot fluctuations of the wavefront (e.g. beam pointing) or because of local variations in the phase profile of the beam sampled by the randomly positioned particles. These two sources of variations, unfortunately, are indistinguishable in an SPI experiment. Our ptychographic pulse profiling allows us to directly measure these two sources of phase tilt variations, and study their independent impact on SPI.
When fitting our ptychographic model to single-pulse diffraction data, we found that the changes in the overall phase tilt between pulses for the dominant mode have a small standard deviation of 87 µrad and 32 µrad for the horizontal and vertical components respectively. For all other modes combined, these standard deviations shrank to 8 µrad and 6 µrad respectively. For a detector distance of 130 mm (typical for SPI experiments at AMO) this would translate into displacements of the centre of diffraction of up to about 11 µm, which is much smaller than the pixel pitch of the pnCCD detector. This suggests that the overall pulse-to-pulse fluctuations in the LCLS wavefront contribute only negligibly to the displacements in the centre of diffraction often observed in SPI.
Since the differences in overall phase tilts between pulses are small, the random shifts observed in SPI diffraction patterns of different particles might be due to where each particle intersects a curved (and possibly irregular) pulse wavefront. To study this second source of local phase tilt variation, we computed the gradient in the horizontal and vertical direction of the most dominant pulse mode in our ptychographic reconstruction (shown at different defocus planes in Fig. 7a). We again converted those pixel-to-pixel phase differences to a distribution of phase tilts (see Section C in Methods for details) that a single particle would observe in an SPI experiment (Fig. 7b) with the average fluence mapped in color. The white-dotted contour lines indicate an average fluence of $5 \times 10^{9}$ ph/µm$^2$.
The fluence-phase tilt distribution that we reconstructed here at 3 mm defocus resembles that reported earlier (Fig. 4a of ref. [26]). Considering all defocus planes shown in Fig. 7, the expected deviation is less than 1 mrad translating to a 130 µm displacement at 130 mm detector distance.
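The displacements quoted in this section follow from simple geometry, and the conversion from a pixel-to-pixel phase difference to a tilt angle can be sketched with the standard small-angle relation tilt = wavelength × phase step / (2π × pixel size). This is an illustration with hypothetical function names; whether Section C in Methods uses exactly this relation is an assumption.

```python
import math


def tilt_from_phase_step(phase_step_rad, pixel_size_m, wavelength_m):
    """Wavefront tilt (rad) for a given phase change per real-space pixel."""
    return wavelength_m * phase_step_rad / (2.0 * math.pi * pixel_size_m)


def centre_displacement_m(tilt_rad, detector_distance_m):
    """Shift of the diffraction centre on the detector for a given tilt."""
    return detector_distance_m * math.tan(tilt_rad)


# Pulse-to-pulse tilt of the dominant mode (87 µrad) at 130 mm: about 11 µm
shift_pulse_to_pulse = centre_displacement_m(87e-6, 0.130)
# Local tilt of 1 mrad at 130 mm: about 130 µm
shift_local = centre_displacement_m(1e-3, 0.130)
```

Compared with the 75 µm pixel pitch of the pnCCD, the first shift is a small fraction of a pixel while the second is somewhat less than two pixels.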
Overall, local variations in phase tilts on a single pulse's wavefront are larger than variations in overall phase tilts between pulses. Nevertheless, for sufficiently small particles whose diffraction speckles are well sampled at low scattering angles, this larger source of phase tilt variations is unlikely to have a big impact on SPI experiments at AMO. Furthermore, these small displacements in the centre of diffraction can be inferred and corrected when individual patterns contain sufficiently many photons [10].

Fig. 5. (b) Expected single-particle hit ratios derived from the propagated fluence distributions assuming a particle density of 0.001 particles/µm$^2$ and shown for detection limits at different fractions of the peak fluence at $\Delta f = 0$ mm. (c) Expected single-particle hit ratios for a Gaussian beam under the same conditions as described in (b).

E. Correlation with electron bunch and pulse properties
Each of the 90k pulses used in this ptychographic experiment includes a set of diagnostic EBEAM parameters that measure properties of each electron bunch and photon pulse along their trajectory in the accelerator tunnel and inside the optics hutches respectively. A comprehensive description of these parameters can be found in [54].
We correlated these EBEAM parameters with the parameters fitted for individual pulses in our ptychographic reconstruction pipeline. For this purpose, we have chosen the robust nonparametric Spearman's rank correlation. Most notably, we found that only the equatorial angular "wobble" of individual pulses (α x ) that was inferred from our ptychographic reconstructions seems to follow fluctuations in a subset of upstream electron bunch properties. Although other reconstructed parameters, such as c 0 and c 2 , also showed significant correlations with EBEAM parameters (see Section 3 and Table S1 in Supplement 1), their physical significance is less obvious.
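Spearman's rank correlation is the Pearson correlation of the ranks of the two variables, which makes it insensitive to any monotonic rescaling of either one. The sketch below is a self-contained illustration with hypothetical function names, using average ranks for tied values; it is not the authors' analysis code, and an established implementation such as `scipy.stats.spearmanr` would normally be preferred.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```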
Importantly, inferred simple translations of individual pulses (e.g. ∆x and ∆y) show little to no correlation with EBEAM parameters. Hence, it seems plausible that most of these translational variations might be attributed to vibrations of the sample stage, the KB optics or the offset mirror rather than beam pointing instabilities further upstream. Coincidentally, the translational variations in our reconstructions (see supplementary Fig.  S2) have characteristic frequencies that are most likely caused by equipment close to the sample stage.
We anticipate that future beam optimisation experiments at XFELs in combination with machine-learning approaches [54] where the focused beam under relevant SPI conditions is monitored (e.g. using single-pulse ptychography [33]) while adjusting the upstream beam parameters would allow for smart pulse-picking strategies that could potentially improve the data quality for SPI.

DISCUSSION
In addition to their high brilliance, XFEL pulses are prized for SPI because of their high spatiotemporal coherence. Naturally, several prior experiments were dedicated to measuring this degree of coherence [22,55]. While these measurements relied on the assumption that the pulses comprised a few low-order Hermite-Gaussian modes, our mixed-state ptychographic wavefront sensing directly decomposes the ensemble of XFEL pulses into a set of orthogonal wavefront modes without such assumptions (Fig. 3). The power distribution of our recovered modes gives a useful quantification of the coherence properties of the beam. We have found that on average 65.7 % of the total power was present in the dominant mode of the beam, which means that a hypothetical ideal spatial filter could create a fully coherent beam with about 65 % of the pulse fluence. This relative power of the dominant mode is lower than the 78 % previously reported for the neighbouring SXR endstation at the LCLS [22], though the widely differing experimental settings prevent a direct comparison of these numbers. Another metric is the degree of coherence [56], which can simply be computed by summing the squares of the relative mode powers $w_m$:

$$\xi = \sum_m w_m^2$$

We find a degree of coherence of $\xi$ = 46.8 %, again lower than the value of 56 % previously reported for the SXR endstation. While our method does not discriminate between the possible sources of coherence loss, it is likely that scattering from upstream optics and the detector point-spread function are the main factors.

Unlike fixed-target experiments, the random injection of small particles into the stream of XFEL pulses in SPI decidedly has an element of chance and uncertainty. Our mode decomposition of these pulses provides a unique window to efficiently explore how random samples of different regions of many pulses can impact SPI.

Fig. 6. Simulated single-particle reconstructions using EMC. (a) 1FFK and 1Y69, two similar but different 1.4 MDa structures from the Protein Data Bank (PDB) with their electron densities at 1.5 nm resolution rendered in gray using Chimera [53]. (b) Example diffraction from 1FFK in a random orientation at low ($1 \times 10^{10}$ ph/µm$^2$) and high ($16 \times 10^{10}$ ph/µm$^2$) fluence. (c) Comparison of the recovered and true fluence when running regular (single-model) EMC with 1 million patterns generated from structure 1FFK at relevant AMO parameters and using the expected in-focus fluence distribution (Fig. 4a). The black line indicates equal values for true and recovered fluence. (d) Two-model (binary) EMC with 500k 1FFK patterns and 500k 1Y69 patterns, highlighting the importance of fluence correction in multi-model single-particle reconstructions. The correctness score is defined as the fraction of patterns correctly classified into the two models for 1FFK and 1Y69. The underlying fluence distribution with a bin size of $5 \times 10^{9}$ ph/µm$^2$ is provided for reference in gray on a log scale.
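As a small illustration of the degree of coherence discussed in this section (the sum of the squared relative mode powers), the sketch below normalises a list of mode powers and sums their squares. The function name is hypothetical and the example powers in the comments are made up for illustration; the individual mode powers of the reconstruction are not listed in the text.

```python
def degree_of_coherence(mode_powers):
    """Sum of squared relative mode powers.

    mode_powers may be unnormalised; they are divided by their total first.
    A single mode gives 1 (fully coherent); M equal modes give 1/M.
    """
    total = sum(mode_powers)
    weights = [p / total for p in mode_powers]
    return sum(w * w for w in weights)


# Made-up example: powers 2:1:1 give weights 0.5, 0.25, 0.25 and a value of 0.375
example = degree_of_coherence([2.0, 1.0, 1.0])
```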
To first order, we found that previously reported random translations in the diffraction centers of individual diffraction patterns [10,26] can be largely explained by particles randomly sampling different regions of pulses with irregular local phase tilts (Fig. 7). Such irregularity is primarily determined by the surface finish of the KB-focusing mirrors [57]. Our pulse-to-pulse regressions showed that the variations in overall phase tilt between pulses played a lesser role here. Within a few millimeters of the pulses' foci, we found the maximum observed phase tilt was only around 1 mrad, which in most scenarios only has a minimal influence on SPI data analysis.
In contrast, the variations in local photon fluence as particles randomly sample different spots in the focused LCLS beam has a much larger impact on SPI. These variations were previously observed indirectly from randomly injected particles at both the AMO [5,26,45] and CXI beamlines [10] at the LCLS, and recently also at the SPB instrument at the European XFEL [16]. Ptychographic reconstructions, however, allowed us to directly observe how these spatial intensities distributions develop near the pulses' nominal foci (Fig. 4). Depending on where the particle beam intersects the X-ray beam along its propagation axis, the intensity profile also has interesting and non-intuitive implications on the expected hit ratio (Fig. 5).
Comprehensive beam profiling experiments help us better anticipate, measure, plan for, and/or correct uncertainties in an SPI experiment that may degrade imaging resolution. In this capacity, ptychographic wavefront sensing using a mixed-state approach will be useful for complex SPI experiments where perfect knowledge of all parameters is practically impossible (e.g. sources of vibrations in Fig. 2). The importance of robust photon fluence correction for 3D SPI reconstruction algorithms and structural inference, demonstrated in Fig. 6, is one such example.
Rapidly profiling the foci of tens of thousands of pulses during an SPI beamtime using ptychography can also help save precious experiment time. In principle, collecting and reconstructing the wavefront modes of 10,000 pulses at a 120 Hz repetition rate may only take minutes (acquisition alone takes 10,000/120 Hz ≈ 83 s), which gives timely feedback for tuning optical elements to maximize photon transmission, changing focus conditions, and/or reducing background scattering. Currently, because fast feedback is limited, such tuning can take several hours, if it succeeds at all. A routine and well-optimised ptychographic pulse profiling instrument can substantially reduce such optical tuning times. Furthermore, such profiles can rapidly yield estimates of ideal and target hit rates (Fig. 3), which in turn allow experimenters to efficiently diagnose potential issues with particle injectors (e.g. clogging).
Overall, we are confident that live pulse profiling will be important for efficient SPI experiments. This article demonstrates that mixed-state ptychography can serve this key role. Other XFEL experiments that assemble insights from many partial measurements of an ensemble of pulses may also benefit from such detailed profiling.

A. Numerical wavefront and intensity propagation
Based on the orthogonalised probe modes $P^{(m)}_x$ and the wavelength λ, we can define the propagated probe at a near distance $z$ as $P^{(m)}_x(z)$ using the numerically stable and convenient angular spectrum method, neglecting the plane wave term. This allows us to further define fluence maps $\Phi_x(z) = \sum_m |P^{(m)}_x(z)|^2$ as the incoherent sum of the $M$ propagated probe modes.
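The propagation and fluence-map definitions above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the square-pixel sampling, and the sign convention of the transfer function, $\exp[i z (k_z - k)]$ with the plane wave term $\exp(ikz)$ dropped, are assumptions.

```python
import numpy as np

def propagate_angular_spectrum(probe, wavelength, pixel_size, z):
    """Propagate complex field(s) by a distance z with the angular spectrum method.

    probe: array of shape (..., ny, nx); a stack of modes is propagated mode by mode.
    The plane wave factor exp(i k z) is omitted, so only the relative phase
    exp(i z (k_z - k)) is applied (assumed sign convention).
    """
    ny, nx = probe.shape[-2:]
    fy = np.fft.fftfreq(ny, d=pixel_size)          # spatial frequencies (cycles per length)
    fx = np.fft.fftfreq(nx, d=pixel_size)
    s = wavelength ** 2 * (fy[:, None] ** 2 + fx[None, :] ** 2)
    propagating = s <= 1.0                          # evanescent components are discarded
    root = np.sqrt(np.clip(1.0 - s, 0.0, None))
    # k_z - k written as -k s / (1 + sqrt(1 - s)) to avoid cancellation for small s
    kz_minus_k = -(2.0 * np.pi / wavelength) * s / (1.0 + root)
    transfer = np.where(propagating, np.exp(1j * z * kz_minus_k), 0.0)
    return np.fft.ifft2(np.fft.fft2(probe) * transfer)

def fluence_map(modes, wavelength, pixel_size, z):
    """Incoherent sum over propagated modes: Phi_x(z) = sum_m |P^(m)_x(z)|^2."""
    propagated = propagate_angular_spectrum(modes, wavelength, pixel_size, z)
    return np.sum(np.abs(propagated) ** 2, axis=0)
```

Calling `fluence_map` for a series of `z` values with a mode stack of shape `(M, ny, nx)` gives a through-focus fluence scan. Because the transfer function has unit modulus for propagating components, the total fluence is conserved from plane to plane.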

B. The binary-model EMC algorithm
The single-model EMC reconstruction in Fig. 6c iteratively maximises the target log-likelihood function in [50][51][52], where the current model for the diffraction intensities $W$ is updated to $W'$, and the likely local photon fluence ($\phi_d$) that illuminated the particle that resulted in diffraction pattern $K_d$ is updated to $\phi'_d$. Here, the set of diffraction patterns is denoted $K_{dt}$, where the subscript $d$ denotes the pattern index and the subscript $t$ the pixel index on each pattern. The maximisation of Eq. (7) runs over a set of discrete samples of the 3D rotation group, indexed by $r$, based on a linear refinement scheme of the unit quaternions that correspond to the 4D 600-cell [50]. In the likelihood function of Eq. (7), $W_{rt}$ is the Ewald sphere section of the diffraction volume $W$ at orientation $r$ that is sampled by the detector pixels labeled $t$. We maximise the log-likelihood in Eq. (7) by solving $dQ/dW_x = 0$ and $dQ/d\phi_d = 0$ alternately, where $x$ is the voxel index of the Fourier intensity volume. This gives the typical updates of the intensity volume and local photon fluence, in which $\{rt; x\}$ indicates all pairs of $r$ and $t$ that rotate to a given voxel $x$.
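For orientation, the expressions referred to above can be sketched in the standard Poisson form used in the EMC literature cited here. This is an assumption about their general shape (uniform prior over orientations, $\sum_r P_{dr} = 1$, primes marking updated quantities), not a reproduction of the numbered equations:

```latex
\begin{align}
P_{dr} &= \frac{\prod_t (\phi_d W_{rt})^{K_{dt}} \, e^{-\phi_d W_{rt}}}
               {\sum_{r'} \prod_t (\phi_d W_{r't})^{K_{dt}} \, e^{-\phi_d W_{r't}}}, \\
Q(W', \phi') &= \sum_{d,r} P_{dr} \sum_t \left[ K_{dt} \ln\!\left(\phi'_d W'_{rt}\right) - \phi'_d W'_{rt} \right], \\
W'_x &= \frac{\sum_d \sum_{\{rt;x\}} P_{dr} K_{dt}}{\sum_d \sum_{\{rt;x\}} P_{dr} \phi_d},
\qquad
\phi'_d = \frac{\sum_t K_{dt}}{\sum_{r,t} P_{dr} W_{rt}}.
\end{align}
```

The two updates follow from setting the derivative of $Q$ with respect to $W'_x$ (at fixed fluences) and with respect to $\phi'_d$ (at fixed intensities) to zero, which is why they are applied alternately.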
In anticipation of the high-throughput XFEL-SPI experiments at LCLS-II, we simulated three sets of one million diffraction patterns each using only elastic scattering [52]. The resulting EMC reconstruction is both memory- and compute-intensive. Hence, the implementation of Eq. (10) is distributed over GPUs (NVIDIA GTX 1080Ti) spread across several compute nodes.
When two latent structural models (labeled $m$) give rise to the diffraction patterns (Fig. 6a,d), the above equations can be modified by substituting $r$ with $m, r$:
$$\phi'_d = \frac{\sum_t K_{dt}}{\sum_{m,x} \sum_{\{rt;x\}} P_{dmr} W_{mx}} = \frac{\sum_t K_{dt}}{\sum_{m,r,t} P_{dmr} W_{mrt}}.$$
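A minimal sketch of the multi-model fluence update is given below, assuming Poisson statistics, a uniform prior over the joint index $(m, r)$, and precomputed, strictly positive tomograms $W_{mrt}$. The function name and array layout are hypothetical, and the intensity-volume update and GPU distribution are omitted.

```python
import numpy as np

def two_model_fluence_update(K, W, phi):
    """One responsibility + fluence step for a multi-model Poisson EMC sketch.

    K   : (D, T) photon counts for D patterns and T detector pixels
    W   : (M, R, T) tomograms W_mrt (assumed precomputed and strictly positive)
    phi : (D,) current fluence estimates
    Returns P with shape (D, M, R) and the updated fluences with shape (D,).
    """
    K = np.asarray(K, dtype=float)
    W = np.asarray(W, dtype=float)
    phi = np.asarray(phi, dtype=float)
    M, R, T = W.shape
    tomograms = W.reshape(M * R, T)           # treat (m, r) as one joint index
    tomogram_sums = tomograms.sum(axis=1)     # sum_t W_mrt

    # Poisson log-likelihood, dropping terms that do not depend on (m, r)
    log_like = K @ np.log(tomograms).T - phi[:, None] * tomogram_sums[None, :]
    log_like -= log_like.max(axis=1, keepdims=True)   # avoid overflow in exp
    P = np.exp(log_like)
    P /= P.sum(axis=1, keepdims=True)         # normalise over all (m, r)

    # phi'_d = sum_t K_dt / sum_{m,r,t} P_dmr W_mrt
    phi_new = K.sum(axis=1) / (P @ tomogram_sums)
    return P.reshape(-1, M, R), phi_new
```

Summing the returned `P` over the orientation axis gives a per-pattern probability for each model, from which a pattern can be assigned to the more likely of the two structures.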

C. Conversion from phase differences to phase tilts
For any given phase difference α, i.e. between two neighboring pixels or the same pixel at different time points, it is possible to calculate a phase tilt angle from the wavelength λ and the size δ of a pixel in the reconstructed wavefront. A more complete derivation of this simple relationship is given in Section 4 of Supplement 1.
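One way to sketch this conversion is to assume a locally planar wavefront whose transverse phase gradient α/δ equals (2π/λ) sin θ, which reduces to θ ≈ αλ/(2πδ) for small angles. The function name and the example values below are hypothetical, and the expression actually used is the one derived in Supplement 1.

```python
import math

def phase_tilt(alpha, wavelength, pixel_size):
    """Tilt angle (rad) of a locally planar wave from a phase difference.

    alpha      : phase difference in radians across one pixel
    wavelength : lambda, in the same length unit as pixel_size
    pixel_size : delta, pixel size of the reconstructed wavefront
    Assumes alpha / pixel_size = (2 * pi / wavelength) * sin(theta).
    """
    s = alpha * wavelength / (2.0 * math.pi * pixel_size)
    if abs(s) > 1.0:
        raise ValueError("phase difference too large for the given wavelength and pixel size")
    return math.asin(s)

# Illustrative numbers only: 0.1 rad across a 100 nm pixel at a 1 nm wavelength.
theta = phase_tilt(0.1, 1e-9, 100e-9)
```

For the sub-milliradian tilts discussed in the main text, the arcsine and its small-angle approximation are practically indistinguishable.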