THz streak camera performance for single-shot characterization of XUV pulses with complex temporal structures

Abstract: The THz-field-driven streak camera has proven to be a powerful diagnostic technique that enables the shot-to-shot characterization of the duration and the arrival time jitter of free-electron laser (FEL) pulses. Here we investigate the performance of three computational approaches capable of determining the duration of FEL pulses with complex temporal structures from single-shot measurements of up to three simultaneously recorded spectra. We use numerically simulated FEL pulses to validate the accuracy of the pulse length retrieval on average as well as in single-shot mode. We discuss the requirements on the THz field strength for achieving reliable results and compare our numerical study with the analysis of experimental data obtained at FLASH, the FEL in Hamburg.


Introduction
Short-wavelength pulses of ultra-short duration and high intensity as delivered by extreme ultraviolet (XUV) and (soft) X-ray free-electron lasers (FELs) have opened new frontiers in atomic and molecular physics, non-linear spectroscopy, solid density plasma physics, photochemistry, and structural biology [1][2][3][4][5]. The accurate knowledge of temporal characteristics of the FEL pulses is essential for a large number of applications. In particular, structural studies aim to shorten the pulse duration in order to outrun radiation damage during the exposure [6][7][8]. The FEL-pulse duration is related to the rate of energy deposition into the sample. Therefore, it is an essential parameter for understanding the electronic response of atoms [9][10][11], nanoparticles [12][13][14], and solids [15]. Moreover, the convolution of the pump and probe temporal pulse profiles defines the time resolution in conventional pump-probe experiments.
In high-gain single-pass FELs the radiation is emitted in short coherent spikes via the self-amplified spontaneous emission (SASE) process. The phase relationship between individual spikes is random [16][17][18]. The coherence time is related to the SASE gain bandwidth. In the frequency domain, spikes are also observed in single-shot spectra [17,18]. The width of the spectral spikes is inversely proportional to the temporal length of the photon pulse. The complex spectral and temporal structure of SASE FEL pulses makes their diagnostics challenging.
To provide a broad user community with reliable information on the temporal structure of FEL pulses, a number of direct and indirect diagnostic techniques have been developed [18][19][20]. Indirect methods are based on electron bunch duration measurements in the time as well as in the frequency domain [17,21]. Other indirect methods utilize statistical SASE properties and investigate either statistical fluctuations of the radiation energy or spectral intensity correlations in the emitted spectra [16,17]. An additional approach based on advanced machine learning algorithms has been developed recently [22]. Indirect techniques now enable a reliable prediction of the temporal characteristics of the FEL pulses. However, their verification and calibration require photon pulse measurements in the time domain. Therefore, several cross-correlation schemes with externally synchronized optical laser pulses have been developed [19,20]. These methods are based either on FEL-induced changes of the optical properties of materials [23][24][25] or on laser-dressed photoelectron spectroscopy in atomic gas media, the so-called generation of side-bands [26,27]. When the X-ray pulse duration approaches the oscillation period of the dressing laser, the side-band generation regime turns into the streaking regime, as shown by numerical calculations in Ref. [28]. The light-field-driven streak camera is actively used in attosecond pulse metrology [29]. For the metrology of femtosecond FEL pulses, a terahertz (THz) field is used for streaking [30].
Following the first demonstration of THz-field streaking [30], this technique has been successfully implemented for measurements of fs SASE FEL pulses [31] as well as for pulses produced by high harmonic generation (HHG) [32,33]. The main advantage of the streak camera measurement is its capability to extract the FEL pulse duration and the arrival time jitter of each FEL shot from a few simultaneously recorded spectra. A time-consuming THz-XUV delay scan, which is essential for the characterization of the THz field, needs to be performed only once, prior to the measurement. The single-shot capability of the streak camera has motivated efforts towards its integration into the beamline photon diagnostics at several user facilities [34][35][36]. The strong interest of the photon science community in streak camera measurements at FEL and other laser facilities creates a high demand for efficient data analysis procedures that are able to extract time information from a limited number of measurements [20,37].
In the current study we investigate the performance of a THz streak camera for measurements of SASE FEL pulses with pulse lengths of about 100 fs full-width-at-half-maximum (FWHM). We numerically simulate SASE FEL pulses and validate the accuracy of the XUV pulse length retrieval on average and in single-shot mode. Three approaches that are able to extract the pulse length from a limited number of measurements are considered. We discuss the requirements on the THz field strength for a reliable pulse length retrieval and use this knowledge to analyse streaking measurements performed at FLASH, the FEL in Hamburg. Our results help to optimize the performance of the THz-based streak camera and contribute to the development of efficient data analysis routines that provide machine diagnostics and users of experimental end-stations with fast and reliable feedback.

THz streaking principle
In a THz streak camera experiment the ionizing XUV pulse is collinearly overlapped with the laser streaking field in a gas medium. The period of the laser streaking field has to be longer than the XUV pulse duration. Therefore, a THz field is used for diagnostics of femtosecond XUV pulses [30][31][32][33][36,38] and a near-infrared (NIR) field is used for diagnostics of attosecond pulses [29][39][40][41][42]. The presence of the laser field imparts an additional momentum to the electrons that are released into the continuum state during the interaction of the XUV pulse with the atoms.
By changing the relative time delay $\tau$ between the XUV pulse and the laser streaking field, a series of electron kinetic energy spectra $I(W, \tau)$, the so-called streaking spectrogram, is measured, with $W$ being the kinetic energy of the electron in the final continuum state. The single active electron approximation in combination with the strong field approximation is used to describe the electron spectrum [39,40]:

$$ I(W,\tau) = \left| \int_{-\infty}^{\infty} dt\, E_{XUV}(t-\tau)\, d_{p(t)}\, e^{i\varphi(t)}\, e^{i(W+I_p)t} \right|^2, \qquad \varphi(t) = -\int_{t}^{\infty} dt' \left[ p_0 A(t') + A^2(t')/2 \right]. \quad (1) $$

Here $d_{p(t)}$ is the dipole transition matrix element from the ground to the continuum state. Far from any resonances $d_{p(t)}$ is typically considered to be a constant. $p(t) = p_0 + A(t)$ is the instantaneous momentum of the free electron in the laser streaking field $E_{streak}(t)$, with $p_0$ being the initial electron momentum without streaking field and $A(t)$ being the vector potential that is given by $E_{streak}(t) = -\partial A(t)/\partial t$. $E_{XUV}(t)$ is the electric field of the XUV pulse, $I_p$ is the ionization potential of the atom and $\varphi(t)$ is the phase accumulated in the continuum state. We note the use of atomic units throughout the manuscript ($e = m_e = \hbar = 1$) unless further specified. Equation (1) shows that the laser dressing field introduces a temporal phase modulation on the electron wave packet and therefore the light-field-driven streak camera is often considered as an ultrafast electron phase modulator [40,41].

The following conditions are discussed in Refs. [40,41]: (i) a linearly polarized streaking field $E_{streak}(t) = E_0(t)\cos(\omega_{streak} t)$ has an envelope that is long enough to apply the slowly-varying envelope approximation, (ii) electrons are emitted in the direction of the laser polarization, and (iii) $U_p \ll W$, where $U_p = E_0^2(t)/(4\omega_{streak}^2)$ is the ponderomotive potential of the electron in the laser field and $\omega_{streak}$ is the angular frequency of the streaking field. In this case it can be shown that close to the zero-crossing point of the vector potential $A(t)$, the phase term in Eq. (1) can be approximated, up to a time-independent offset that does not affect the spectrum, by

$$ \varphi(t) \approx -\sqrt{8 U_p W}\, \omega_{streak}\, \frac{t^2}{2}. \quad (2) $$

Introducing the streaking speed $\alpha = \sqrt{8 U_p W}\, \omega_{streak}$ into Eq. (2) gives

$$ \varphi(t) \approx -\frac{\alpha t^2}{2}, \quad (3) $$

which reveals that the phase modulation $\varphi(t)$ is quadratic in time and, thus, the electron wave packet experiences a linear streaking in energy, $dW/dt = -\partial^2 \varphi(t)/\partial t^2 = \alpha$ [40,41]. This allows several simplified procedures to be applied to extract information on the pulse duration from the streaking measurements, as discussed in the following section.
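The linear mapping of time onto energy can be checked numerically. The following minimal Python sketch evaluates the integral of Eq. (1) for a chirp-free Gaussian XUV pulse at a zero crossing, assuming a constant dipole matrix element and the quadratic streaking phase −αt²/2. Practical units are used (eV and fs, with ħ ≈ 0.6582 eV fs), the carrier is removed so that energies are offsets from the photoline centre, and the pulse length and streaking speed are illustrative values, not parameters of this study. The function names are hypothetical.

```python
import cmath
import math

HBAR = 0.6582  # reduced Planck constant in eV*fs


def streaked_spectrum(tau, alpha, energies, dt=0.1):
    """Spectrum of a chirp-free Gaussian pulse streaked at a zero crossing.

    tau      rms length of the XUV intensity profile (fs)
    alpha    streaking speed (eV/fs)
    energies kinetic-energy offsets from the photoline centre (eV)
    """
    t_max = 6.0 * tau
    n = int(round(2.0 * t_max / dt)) + 1
    times = [-t_max + k * dt for k in range(n)]
    # Field envelope exp(-t^2 / (4 tau^2)): the intensity has rms length tau.
    field = [math.exp(-t * t / (4.0 * tau * tau)) for t in times]
    spectrum = []
    for w in energies:
        # Fourier kernel exp(i*W*t/hbar) times streaking phase exp(-i*alpha*t^2/(2*hbar))
        amp = sum(
            f * cmath.exp(1j * (w * t - 0.5 * alpha * t * t) / HBAR)
            for f, t in zip(field, times)
        ) * dt
        spectrum.append(abs(amp) ** 2)
    return spectrum


def rms_width(x, y):
    """Intensity-weighted rms width of the profile y(x)."""
    total = sum(y)
    mean = sum(xi * yi for xi, yi in zip(x, y)) / total
    return math.sqrt(sum(yi * (xi - mean) ** 2 for xi, yi in zip(x, y)) / total)


tau, alpha = 20.0, 0.05  # fs, eV/fs (illustrative values)
energies = [-6.0 + 0.05 * k for k in range(241)]
sigma_num = rms_width(energies, streaked_spectrum(tau, alpha, energies))
# Analytic width: transform-limited part and streaking part add in quadrature.
sigma_ref = math.sqrt((HBAR / (2.0 * tau)) ** 2 + (alpha * tau) ** 2)
print(f"numerical {sigma_num:.3f} eV, analytic {sigma_ref:.3f} eV")
```

For these values the streaking term ατ dominates the bandwidth, so the streaked spectrum is essentially a copy of the temporal intensity profile stretched by the factor α.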

Procedures to extract pulse durations from the THz streaking measurements
To extract the duration of FEL pulses from measured photoelectron spectra I(W, τ), we consider three approaches that require only a limited number of measured spectra: (i) Gaussian approximation: it is based on the analytical solution of the integral in Eq. (1) obtained by approximating the electric field of the XUV pulse E XUV (t) with a Gaussian amplitude envelope and a quadratic phase [30,37]; (ii) Linear streaking: it is valid at sufficiently high streaking speed α, i.e. in the so-called linear streaking regime. The temporal phase of the XUV pulse is neglected in this approach and the temporal profile of the FEL pulse is retrieved by a direct projection of I(W, τ) onto the time axis [20,31,42]; (iii) Simplified chronocyclic tomography (CT): this approach treats the streaking effect as a shear of the Wigner distribution in the time-frequency space [40,43,44] according to Eq. (3). Simplified CT holds the potential to retrieve the temporal structure of the XUV pulse including its temporal phase [40,43,44].

Gaussian approximation
Consider the electric field $E_{XUV}(t)$ of the XUV pulse to have a Gaussian amplitude envelope and a quadratic phase:

$$ E_{XUV}(t) = \exp\left(-a t^2\right) \exp\left[-i\left(\omega_0 t - b t^2/2\right)\right], \quad (4) $$

where the parameter $a$ is related to the root mean square (rms) pulse length $\tau_{XUV}$ of the XUV intensity $I_{XUV}(t) = |E_{XUV}(t)|^2$ according to $\tau_{XUV} = 1/(2\sqrt{a})$. Here and further in the text we refer to the rms length of the intensity profile of the pulse as $\tau_{XUV}$, which is related to the FWHM of the intensity profile as $T_{XUV} = 2\sqrt{2\ln 2}\,\tau_{XUV} \approx 2.35\,\tau_{XUV}$. In Eq. (4) $\omega_0$ is the carrier frequency and $b$ is the linear chirp parameter that describes the linear sweep of the instantaneous frequency $\omega_{inst}(t) = \omega_0 - bt$. Substitution of Eq. (3) and Eq. (4) into Eq. (1) provides an analytical solution for the integral and allows direct calculation of $\tau_{XUV}$ and $b$ from streaked spectra [30,37]:

$$ \tau_{XUV} = \sqrt{\frac{\sigma_+^2 + \sigma_-^2 - 2\sigma_0^2}{2\alpha^2}}, \qquad b = \frac{\sigma_-^2 - \sigma_+^2}{4|\alpha|\,\tau_{XUV}^2}, \quad (5) $$

where $\sigma_0$, $\sigma_+$ and $\sigma_-$ represent the rms spectral bandwidths of unstreaked spectra and, respectively, of streaked spectra with positive ($\alpha>0$) and negative ($\alpha<0$) streaking speed. Spectra with the spectral bandwidths $\sigma_+$ and $\sigma_-$ are measured at the zero-crossing points of the vector potential $A(t)$ using electron spectrometers installed parallel and antiparallel with respect to the electric field vector of the streaking field [30,37]. The spectrum with the spectral bandwidth $\sigma_0$ is measured in the absence of the streaking field. It is determined from the online measurement of the XUV pulse spectrum as described in section 5.1. The streaking speed $\alpha$ is determined from the analysis of streaking spectrograms as described in section 5.2. The Gaussian approximation has been implemented for the retrieval of SASE FEL pulses at FLASH [30,37], HHG pulses [32,33] as well as seeded FEL pulses [38] with pulse durations of several tens of femtoseconds. All studies report good agreement between retrieved and expected photon pulse durations. Retrievals of the chirp of HHG pulses reproduce the decrease of the chirp parameter with increasing harmonic order [33]. Analyses of SASE FEL [30,37] and seeded FEL [38] photon pulses reveal a linear chirp that is attributed to the linear energy chirp in the electron bunch.
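The retrieval with the Gaussian approximation amounts to simple arithmetic on three rms bandwidths. The sketch below is a minimal illustration, not the analysis code of this study. It assumes the Gaussian forward model in which the transform-limited width ħ/(2τ) and the total chirp add in quadrature, with the streaked widths given by σ±² = (ħ/2τ)² + ((|α| ∓ b)τ)², which follows from dW/dt = α and ω_inst = ω_0 − bt. Bandwidths are in eV, α and b in eV/fs, τ in fs. The function names and test values are hypothetical.

```python
import math

HBAR = 0.6582  # reduced Planck constant in eV*fs
FWHM_PER_RMS = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.355


def gaussian_retrieval(sigma_0, sigma_plus, sigma_minus, alpha):
    """Return (rms pulse length in fs, chirp in eV/fs) from three rms bandwidths in eV."""
    tau_sq = (sigma_plus**2 + sigma_minus**2 - 2.0 * sigma_0**2) / (2.0 * alpha**2)
    if tau_sq <= 0.0:
        raise ValueError("streaked bandwidths must exceed the unstreaked one")
    # Sign convention (assumption): positive b narrows the alpha > 0 spectrum.
    chirp = (sigma_minus**2 - sigma_plus**2) / (4.0 * abs(alpha) * tau_sq)
    return math.sqrt(tau_sq), chirp


def forward_widths(tau, chirp, alpha):
    """Bandwidths (sigma_0, sigma_plus, sigma_minus) of a chirped Gaussian pulse."""
    limit = HBAR / (2.0 * tau)  # transform-limited rms bandwidth
    sigma_0 = math.hypot(limit, chirp * tau)
    sigma_plus = math.hypot(limit, (abs(alpha) - chirp) * tau)
    sigma_minus = math.hypot(limit, (abs(alpha) + chirp) * tau)
    return sigma_0, sigma_plus, sigma_minus


# Round trip with illustrative values: 43 fs rms, chirp 4 meV/fs, alpha 50 meV/fs.
s0, sp, sm = forward_widths(43.0, 0.004, 0.05)
tau_retr, chirp_retr = gaussian_retrieval(s0, sp, sm, 0.05)
print(f"tau = {tau_retr:.2f} fs rms ({FWHM_PER_RMS * tau_retr:.1f} fs FWHM), "
      f"b = {1e3 * chirp_retr:.2f} meV/fs")
```

The round trip recovers the input values exactly only because the test pulse is itself Gaussian; for structured SASE pulses the same formulas return equivalent rms quantities.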

Linear streaking regime
When the streaking speed is large ($|\alpha| \gg |b|$), i.e. in the so-called linear streaking regime [20,31,42], $\tau_{XUV}$ can be retrieved from the direct projection of the rms streaked spectral bandwidth $\sigma_{streak}$ onto the time axis according to $\tau_{XUV} = \sigma_{streak}/|\alpha|$. This relationship can also be deduced from Eq. (5) when $\sigma_+ \approx \sigma_- \gg \sigma_0$ is considered. Therefore, at large streaking speeds both approaches should be applicable for pulses with arbitrary shapes. In our study we tested whether the use of an average streaked bandwidth $(\sigma_+ + \sigma_-)/2$ would provide a more accurate estimation of the XUV pulse length:

$$ \tau_{XUV} = \frac{\sigma_+ + \sigma_-}{2|\alpha|}. \quad (6) $$

The linear streaking approach has successfully been implemented to retrieve SASE FEL pulses at FLASH [31] and LCLS [42] with pulse durations of a few fs.
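The direct projection can be sketched in a few lines of Python, together with a rough estimate of why it needs a large streaking speed. The example assumes a simplified Gaussian model in which the unstreaked bandwidth and the streaking term ατ add in quadrature and the chirp cross term is neglected. The 1.1 eV and 103 fs FWHM values are the average pulse properties quoted in this study; everything else (function names, the model itself) is an illustrative assumption.

```python
import math

FWHM_PER_RMS = 2.0 * math.sqrt(2.0 * math.log(2.0))


def tau_linear(sigma_streak, alpha):
    """Projection of one streaked rms bandwidth (eV) onto the time axis (fs)."""
    return sigma_streak / abs(alpha)


def tau_linear_avg(sigma_plus, sigma_minus, alpha):
    """Same projection using the average of the two streaked bandwidths."""
    return (sigma_plus + sigma_minus) / (2.0 * abs(alpha))


sigma_0 = 1.1 / FWHM_PER_RMS      # unstreaked rms bandwidth, eV
tau_true = 103.0 / FWHM_PER_RMS   # rms pulse length, fs

ratios = {}
for alpha in (0.005, 0.025, 0.05, 0.1):  # streaking speed, eV/fs
    # Assumed model: quadrature sum, chirp cross term neglected.
    sigma_streak = math.hypot(sigma_0, alpha * tau_true)
    ratios[alpha] = tau_linear(sigma_streak, alpha) / tau_true
    print(f"alpha = {1e3 * alpha:5.1f} meV/fs -> tau_retr / tau_orig = {ratios[alpha]:.3f}")
```

In this model the projection overestimates the pulse length by the factor sqrt(1 + (σ_0/(ατ))²), which is large when the streaked width is comparable to the unstreaked one and approaches 1 as α grows.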

Simplified chronocyclic tomography
Another approach to retrieve a pulse in the time domain from measurements in the frequency domain is to use tomographic techniques [40,43,44]. In this approach the spectral $I(\omega)$ and the temporal $I(t)$ intensity profiles of a pulse are considered as frequency and time marginals of the Wigner distribution function $W_{Wigner}(t, \omega)$, i.e. its projections onto the frequency axis (recall that the photon energy of a pulse is $\hbar\omega$) and the time axis, respectively:

$$ I(\omega) = \int W_{Wigner}(t,\omega)\, dt, \qquad I(t) = \int W_{Wigner}(t,\omega)\, d\omega. \quad (7) $$

In chronocyclic tomography (CT) a large number of projections of the rotated Wigner distribution is measured and then a back-projection algorithm is applied to retrieve the Wigner distribution from the set of measured projections.
In simplified CT the electric field of a pulse can be retrieved using one projection of the Wigner function and its angular derivative [40,43,44]. According to Eq. (2), streaking with a streaking speed $\alpha$ at the zero-crossing points of the vector potential $A(t)$ is considered as a quadratic temporal phase modulation that corresponds to a shear of the Wigner function along the frequency axis in the chronocyclic $(t, \omega)$ space: $W_{Wigner}^{OUT}(t, \omega) = W_{Wigner}^{IN}(t, \omega + \alpha t)$. Following the discussion in Refs. [40,43,44] it can be shown that for streaking speed $\alpha$ close to zero:

$$ \frac{\partial I_\alpha(\omega)}{\partial \alpha} = \frac{\partial}{\partial \omega}\left[ I(\omega)\, \frac{\partial \phi}{\partial \omega} \right]. \quad (8) $$

This means that the group delay $\partial\phi/\partial\omega$ of a pulse can be obtained from the frequency marginal of the Wigner function $I(\omega)$ and its angular derivative $\partial I_\alpha/\partial\alpha$. The angular derivative $\partial I_\alpha/\partial\alpha$ is estimated via two spectra $I_+$ and $I_-$ measured at two opposite streaking speeds:

$$ \frac{\partial I_\alpha}{\partial \alpha} \approx \frac{I_+ - I_-}{2\alpha}. $$

The simplified CT has been verified experimentally in the optical regime [43] and has been proposed theoretically for the evaluation of light-field streaking spectrograms [40]. In the current study we investigate the applicability of this approach to SASE FEL pulses with complex temporal and spectral structures.
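The group delay retrieval of simplified CT can be sketched as a finite difference followed by a cumulative integration. The Python example below is a minimal illustration under stated assumptions: the sign convention is that the derivative of the streaked spectrum with respect to α equals the frequency derivative of I(ω) times the group delay, the product I(ω)∂ϕ/∂ω vanishes at the low-frequency edge of the grid, and points with negligible intensity are discarded. The function name, the intensity floor and the synthetic test spectrum are hypothetical.

```python
import math


def group_delay(omega, i_0, i_plus, i_minus, alpha, floor=1e-3):
    """Group delay from one unstreaked and two oppositely streaked spectra.

    Returns a list with the group delay at each frequency, or None where the
    unstreaked intensity is below floor * max(i_0).
    """
    # Angular derivative estimated from the two spectra at +alpha and -alpha.
    deriv = [(p - m) / (2.0 * alpha) for p, m in zip(i_plus, i_minus)]
    peak = max(i_0)
    flux = 0.0  # running trapezoidal integral, equals I(omega) * group delay
    delays = []
    for k in range(len(omega)):
        if k > 0:
            flux += 0.5 * (deriv[k] + deriv[k - 1]) * (omega[k] - omega[k - 1])
        delays.append(flux / i_0[k] if i_0[k] > floor * peak else None)
    return delays


# Synthetic check: Gaussian spectrum with a linear group delay g * omega.
g, alpha = 0.3, 0.01
omega = [-6.0 + 0.01 * k for k in range(1201)]
i_0 = [math.exp(-w * w / 2.0) for w in omega]
# d/d(omega) [I * g * omega] for the Gaussian above
d_flux = [i * g * (1.0 - w * w) for i, w in zip(i_0, omega)]
i_plus = [i + alpha * d for i, d in zip(i_0, d_flux)]
i_minus = [i - alpha * d for i, d in zip(i_0, d_flux)]

delays = group_delay(omega, i_0, i_plus, i_minus, alpha)
print(delays[700])  # group delay at omega = 1, expected close to g * 1
```

The division by I(ω) makes the result sensitive to noise wherever the spectrum is weak, and the finite difference is only a good estimate of the derivative for small α, which is consistent with the restriction of the method to low streaking speeds.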

Numerical study of THz streak camera performance
In the following section we describe how model pulses with pulse shapes similar to those expected in the experiment are generated numerically. Eq. (1) is then used to calculate streaked spectra at zero-crossing points of the vector potential at different streaking speeds α. The frequency of the THz field is kept at 1 THz while the field strength is varied in a range that is typical for THz streaking experiments (with the exception of one very low streaking speed, α = 0.2 meV/fs) as summarized in Table 1. We then apply the Gaussian approximation, the linear streaking and the simplified CT to extract the rms pulse duration from the numerically simulated streaked spectra. To evaluate the performance of the three pulse retrieval approaches, we compare the retrieved with the original rms pulse duration.

Table 1. THz field strengths and resulting parameters as used in the simulation: streaking speed $\alpha$, amplitude of the streaking field $E_0$, ponderomotive potential $U_p$, maximal kinetic energy shift $W_{max}$ experienced by an electron with an initial kinetic energy of $W_0 = 71$ eV in the streaking field, and relative spectral broadening $\Delta\sigma/\sigma_0$ due to streaking (averaged over all shots), defined as the ratio between the de-convoluted streaked bandwidth $\Delta\sigma = \sqrt{\sigma_{streak}^2 - \sigma_0^2}$ and the unstreaked bandwidth $\sigma_0$.

Generation of model pulses
The SASE process starts from the shot noise in the electron bunch. This results in characteristic spectral and temporal intensity profiles of the photon pulses consisting of multiple random spikes with random phase fluctuations between individual spikes [18,45]. To mimic the random structure of the SASE FEL pulses numerically, the electric field and phase are constructed in the spectral domain by using pseudo-random Wiener processes. To set an appropriate limit to the bandwidth of the numerical XUV pulses, the envelope of the field is multiplied with a Gaussian profile. The temporal profile and the temporal phase of a pulse are calculated via the Fourier transform. The width of the Gaussian function as well as a drift parameter in the random walk process are chosen randomly. The main criterion in the choice of parameters is to match the temporal and spectral bandwidth as well as the coherence time of the numerically generated pulses with the FEL pulse properties expected in the experiment. Finally, about 3000 pulses with an average FWHM spectral bandwidth of (1.1±0.2) eV (standard deviation), a FWHM pulse duration of (103±33) fs (standard deviation) and a coherence time of (10.6±1.3) fs (standard deviation) are selected for modelling of the streaking process. Our approach is similar to the one used in Ref. [46] to model FEL pulses produced in the SASE operation mode.
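The construction described above can be sketched as follows. This is a minimal Python illustration and not the generator used in this study: the spectral amplitude is taken as the absolute value of one random walk, the spectral phase as a second random walk with an optional drift, and the product is limited by a Gaussian envelope before a direct Fourier sum gives the temporal profile. All parameter names and default values (grid spacing, envelope width, phase step, drift) are hypothetical and would have to be tuned to match the bandwidth, duration and coherence time quoted in the text.

```python
import cmath
import math
import random


def sase_like_pulse(n_freq=256, d_omega=0.02, sigma_env=0.7, drift=0.0,
                    phase_step=3.0, t_max=150.0, dt=1.5, seed=None):
    """Return (omegas, spectral intensity, times, temporal intensity).

    omegas are offsets from the carrier in rad/fs, times are in fs.
    """
    rng = random.Random(seed)
    omegas = [(k - n_freq // 2) * d_omega for k in range(n_freq)]
    amp_walk = 0.0
    phase_walk = 0.0
    spectrum = []
    for w in omegas:
        # Wiener increments scale with the square root of the step size.
        amp_walk += rng.gauss(0.0, 1.0) * math.sqrt(d_omega)
        phase_walk += drift * d_omega + rng.gauss(0.0, phase_step) * math.sqrt(d_omega)
        # Gaussian field envelope: spectral intensity envelope has rms width sigma_env.
        envelope = math.exp(-w * w / (4.0 * sigma_env ** 2))
        spectrum.append(abs(amp_walk) * envelope * cmath.exp(1j * phase_walk))
    n_time = int(round(2.0 * t_max / dt)) + 1
    times = [-t_max + k * dt for k in range(n_time)]
    # Direct (slow but dependency-free) Fourier sum to the time domain.
    field_t = [
        sum(s * cmath.exp(-1j * w * t) for s, w in zip(spectrum, omegas)) * d_omega
        for t in times
    ]
    intensity_w = [abs(s) ** 2 for s in spectrum]
    intensity_t = [abs(e) ** 2 for e in field_t]
    return omegas, intensity_w, times, intensity_t


omegas, intensity_w, times, intensity_t = sase_like_pulse(seed=1)
print(len(omegas), len(times), max(intensity_t) > 0.0)
```

In a selection step like the one described in the text, many such pulses would be generated with randomly drawn envelope widths and drifts, and only those whose rms widths and coherence time fall in the desired range would be kept.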
A representative example of a numerically generated pulse is shown in Fig. 1. The upper photon energy scale in Fig. 1(a) shows the spectral intensity profile of the XUV pulse of 92.5 eV central photon energy. Ionization of atomic gas results in an electron distribution that (under low irradiation intensity, low atomic gas density conditions and far from any resonances) is considered as a replica of the XUV pulse. The lower electron kinetic energy scale in Fig. 1(a) corresponds to the electron kinetic energy spectrum resulting from the release of Ne(2p) electrons from a Ne gas target after absorption of XUV photons. Note that the average spectral bandwidth of FEL pulses is 1.1 eV (FWHM) in our simulation and 1.25 eV (FWHM) in the experiment, i.e. substantially larger than the spin-orbit splitting of the Ne 2p 1/2 and 2p 3/2 states with the corresponding ionization potentials of 21.7 eV and 21.6 eV. Therefore, the center of the electron kinetic energy spectrum in Fig. 1(a) is shifted by 21.65 eV (the average ionization potential $I_p$ of Ne(2p) electrons) with respect to the central XUV photon energy. The temporal intensity profile shown in Fig. 1(b) is composed of multiple spikes in accordance with the expected FEL pulse structure [18,45]. To calculate the spectral bandwidth $\sigma_{XUV}$ and the temporal width $\tau_{XUV}$ from the simulated spectral $I(W)$ and temporal $I(t)$ intensity profiles, the corresponding equations for the weighted standard deviation are used:

$$ \sigma_{XUV} = \sqrt{\frac{\sum_i I(W_i)\,(W_i - \bar{W})^2}{\sum_i I(W_i)}}, \qquad \tau_{XUV} = \sqrt{\frac{\sum_i I(t_i)\,(t_i - \bar{t})^2}{\sum_i I(t_i)}}, \quad (9) $$

where $\bar{W} = \sum_i I(W_i) W_i / \sum_i I(W_i)$ and $\bar{t} = \sum_i I(t_i) t_i / \sum_i I(t_i)$ are the intensity-weighted mean values. $\tau_{XUV}$ is calculated in the range of ±400 fs and $\sigma_{XUV}$ from 60 eV to 90 eV. To implement the Gaussian approximation, we assume that the spectral and temporal profiles are approximated by Gaussian shapes with the corresponding equivalent rms spectral and temporal widths, as shown in Figs. 1(a) and 1(b) by the grey shaded area.
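The rms widths used throughout the analysis are intensity-weighted standard deviations evaluated inside a fixed window. The short Python sketch below assumes the plain weighted form without any sample-size correction; the function name is hypothetical, and the test profile is a single Gaussian photoline with illustrative parameters (centre 70.85 eV, rms width 0.47 eV) rather than a simulated SASE spectrum.

```python
import math

FWHM_PER_RMS = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.355 for a Gaussian


def weighted_rms(x, intensity, lo, hi):
    """Intensity-weighted rms width of intensity(x), restricted to lo <= x <= hi."""
    points = [(xi, yi) for xi, yi in zip(x, intensity) if lo <= xi <= hi]
    total = sum(yi for _, yi in points)
    mean = sum(xi * yi for xi, yi in points) / total
    variance = sum(yi * (xi - mean) ** 2 for xi, yi in points) / total
    return math.sqrt(variance)


# Gaussian photoline on a 60 eV to 90 eV kinetic-energy axis with 10 meV steps.
energies = [60.0 + 0.01 * k for k in range(3001)]
line = [math.exp(-((e - 70.85) ** 2) / (2.0 * 0.47 ** 2)) for e in energies]

sigma = weighted_rms(energies, line, 60.0, 90.0)
print(f"rms width {sigma:.3f} eV, equivalent Gaussian FWHM {FWHM_PER_RMS * sigma:.3f} eV")
```

Because every point is weighted by the squared distance from the mean, weak signal far from the line centre contributes disproportionately, which is why the choice of the evaluation window matters for noisy measured spectra.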

Results of pulse reconstruction for numerically generated pulses
Figure 2 summarizes the results of pulse retrievals using the Gaussian approach, linear streaking as well as the simplified CT as a function of streaking speed. Statistical variations and absolute accuracies for the retrieval of individual pulses are shown in Fig. 3. At each streaking speed (see Table 1) the analyses are performed on the same set of 3000 numerically generated pulses. We compare $\langle\tau_{retr}/\tau_{orig}\rangle$, the average of the ratios between the retrieved $\tau_{retr}$ and the original $\tau_{orig}$ rms pulse lengths. To characterize statistical variations between the retrieved $\tau_{retr}$ and the original $\tau_{orig}$ rms pulse lengths, we calculate the Pearson correlation coefficient $r_{\tau_{orig},\tau_{retr}}$ between the two variables $\tau_{orig}$ and $\tau_{retr}$ at different streaking speeds. The Pearson correlation coefficient $r_{\tau_{orig},\tau_{retr}} = \mathrm{COV}(\tau_{orig}, \tau_{retr})/(\sigma_{\tau_{orig}} \sigma_{\tau_{retr}})$ is defined as the ratio between the covariance and the product of the corresponding standard deviations $\sigma_{\tau_{orig}}$ and $\sigma_{\tau_{retr}}$.

As follows from the comparison of $\tau_{orig}$ and $\tau_{retr}$, the linear approach should not be used at very small streaking speeds. According to Table 1, at streaking speeds α ≤ 5 meV/fs the streaked spectral widths are comparable to the non-streaked width. This results in an overestimation of the pulse lengths, which is manifested by the large $\langle\tau_{retr}/\tau_{orig}\rangle$ observed in Fig. 2.
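The two figures of merit used in this comparison, the average of ratios and the Pearson correlation coefficient, can be computed as in the following Python sketch. The four pulse lengths in the usage example are made-up numbers chosen only to exercise the functions; they are not results of this study, and the function names are hypothetical.

```python
import math


def mean_ratio(retrieved, original):
    """Average of the shot-by-shot ratios tau_retr / tau_orig."""
    return sum(r / o for r, o in zip(retrieved, original)) / len(original)


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The common 1/n normalisation cancels between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


tau_orig = [30.0, 40.0, 50.0, 60.0]   # fs rms, illustrative
tau_retr = [33.0, 41.0, 52.0, 63.0]   # fs rms, illustrative

print(f"<tau_retr/tau_orig> = {mean_ratio(tau_retr, tau_orig):.4f}")
print(f"r = {pearson_r(tau_orig, tau_retr):.4f}")
```

The two quantities are complementary: the average of ratios measures a systematic bias of the retrieval, whereas the correlation coefficient measures how well shot-to-shot variations of the pulse length are tracked, independent of any constant scale factor.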
Reconstruction by Gaussian and linear streaking approaches provides very similar results (Fig. 2, orange, green and red solid and dashed lines). With increasing streaking speed the average of ratios τ retr /τ orig and the correlation coefficient r τ orig ,τ retr are approaching 1. At α ≥ 25 meV/fs the difference between the two methods almost disappears. For linear streaking we have used the relationship τ XUV = σ streak /|α| (red solid and dashed lines) as well as Eq. (6) with an average streaked bandwidth (green solid and dashed lines) in order to estimate the XUV pulse lengths. The average of ratios τ retr /τ orig is nearly the same for both methods at all streaking speeds (green and red solid lines are overlapping). At lower streaking speeds α<25 meV/fs the correlation coefficient r τ orig ,τ retr is higher when Eq. (6) is used, while at higher streaking speeds α ≥ 25 meV/fs both approaches show no difference. Our analysis indicates that at large streaking speeds as typically used in THz streaking experiments [30][31][32][33] both Gaussian and linear approximations provide reasonable estimation of the mean pulse duration. For Gaussian approach implemented at α = 50 meV/fs streaking speed the absolute accuracy is around 10 fs for most of reconstructed pulses (Fig. 3(c)) and is found to be very similar for the linear streaking approach (data not shown).
In contrast to the Gaussian and the linear streaking approach, for simplified CT the average of ratios τ retr /τ orig is close to 1 only at very low streaking speed values α< 5 meV/fs (Fig. 2, blue solid line). The Pearson correlation coefficient is very low (Fig. 2, blue dashed line) indicating a very large spread between τ retr and τ orig . This is analysed in Fig. 3 that compares shot-to-shot XUV pulse retrieval by the Gaussian (a) vs. the simplified CT (b) approach. For the Gaussian approach (Fig. 3(a)) the standard deviation of data points from a median value (spread of data points) is about 15 fs and is found to be similar for all streaking speeds. Running median values (solid lines in Fig. 3(a)) are close to the τ orig /τ retr = 1 line (red dashed line). The deviation for longer (τ orig >50 fs) pulses at large streaking speed α=100 meV/fs (green solid line) is explained by the numerical artefacts in the simulation, namely, a broadening of streaked spectra becomes very large (Table 1) and approaches the whole length of the electron energy axis used in the simulation.
For the simplified CT ( Fig. 3(b)) the spread between the original τ orig and the retrieved τ retr pulse lengths is substantially larger. The running median line is parallel to the τ orig /τ retr = 1 line (red dashed line) only at very low streaking speed α=0.2 meV/fs (orange solid line) that is in correspondence with relatively high correlation value of 0.8 at this streaking speed (Fig. 2). However, the retrieved pulse lengths are significantly overestimated (the orange solid line is located above the red dashed line in Fig. 3(b)). With increasing streaking speed τ retr becomes smaller than the original pulse length τ orig . Moreover, τ retr becomes less dependent on τ orig as indicated by the vanishing slope of the running median lines (green, blue and red solid lines). At a streaking speed α=50 meV/fs the retrieved pulse length τ retr becomes independent of τ orig . Figure 3(d) shows that the absolute accuracy at 5 meV/fs streaking speed becomes larger with increasing τ orig . This indicates that the simplified CT might be more sensitive to the lower order phase terms and is getting less applicable for pulses with increasing complexity of the spectral phase. In our simulation the spectral bandwidth was limited to FWHM of (1.1±0.2) eV (standard deviation). Due to this limitation the larger pulse duration corresponds to the increasing complexity of the spectral phase. From Fig. 2 and Fig. 3(b), (d) we conclude that the implementation of simplified CT to pulses with very complex temporal and spectral phase is not straightforward. In accordance with simplifications made in Eq. (8) a possible application of simplified CT is limited to very small streaking speeds, in our case to α<5 meV/fs.
To summarize the results of our numerical study, we have found that for pulses with complex temporal and spectral structure a linear streaking or Gaussian approach needs to be implemented at large streaking speed to provide a reasonable estimation of the pulse duration. The simplified CT algorithm does not provide satisfactory results even at very small streaking speeds. In the following section we discuss the implementation of the Gaussian, the linear streaking and the simplified CT approaches for the analysis of experimental data.

Experimental setup
The THz streaking experiment is performed in the unfocused branch of the beamline BL3 [47] and the THz beamline [48] of the FLASH FEL. The FEL is tuned to maximum pulse energy with an electron bunch charge of 0.3 nC and an electron energy of 680 MeV. At these parameters of FEL operation an XUV pulse duration of about 100 fs (FWHM) is expected. Both XUV and THz pulses are generated from the same electron bunch, enabling a high level of synchronization of the multi-cycle THz pulse with the XUV pulse [30,48]. In the current study the jitter between XUV and THz pulses was on the order of 10 fs rms, as revealed by an analysis of streaking spectrograms. The experimental geometry is shown in Fig. 4(a). In the interaction region XUV pulses of 13.4 nm central wavelength (92.5 eV photon energy) and THz pulses of 3 THz central frequency (100 µm wavelength) are collinearly overlapped and focused into an atomic gas jet. To balance the XUV and THz beamline lengths, the XUV pulses are delayed and back-focused by a spherical Mo/Si multilayer mirror with a focal length of 2.0 m to a spot size of about 100 µm FWHM. The maximum reflectivity of the Mo/Si mirror is at ∼92.5 eV with a bandwidth of ∼5 eV FWHM. The THz pulses are focused with an off-axis parabolic mirror of 200 mm focal length to a spot size of ∼0.5 mm. A band-pass filter (QMC Instruments) was used to reduce the THz spectral bandwidth. In order to characterize the THz field as well as to identify zero-crossing points of the THz vector potential for the subsequent XUV pulse diagnostics, THz streaking spectrograms are obtained by moving the delay line available in the THz branch [48] in 33 fs steps. Electron kinetic energy spectra resulting from the ionization of Ne atoms by XUV pulses are measured in a single shot with two opposite time-of-flight (TOF) spectrometers that are installed parallel to the THz polarization direction. This yields two streaking spectrograms with the same magnitude of the streaking speed but opposite signs within one XUV-THz time-delay scan (Figs. 5(a) and 5(b)). At each fixed XUV-THz delay stage position a relatively large data set of 100 to 120 consecutive measurements is obtained. The combination of a time-delay scan for the THz field characterization with the acquisition of single-shot data at each fixed delay line position for further analysis of the XUV pulse durations has been chosen in order to use the limited beamtime more efficiently by postponing the data analysis until after the end of the user beamtime. The fluctuating number of single shots at each time delay is explained by the data acquisition routine. Care was taken to minimize broadening of the electron spectra by space charge effects as well as to prevent complete ionization of the neutral target within the XUV pulse [12]. Therefore, the XUV beam was strongly attenuated with zirconium foil filters.
Synchronously with the acquisition of the electron kinetic energy spectra, the spectrum of the XUV pulse is measured by a variable line spacing (VLS) grating spectrometer (Fig. 5(c)). The VLS spectrometer is installed upstream and uses a fraction of the same beam that is delivered to the experiment [49]. The VLS spectrum is taken as the unstreaked spectrum for the pulse retrieval algorithms. Figure 4(b) shows a representative example of the photon energy spectrum of an XUV pulse (blue solid line, upper scale) measured with the VLS spectrometer and, for the same pulse, the electron kinetic energy spectra measured with the opposite TOF spectrometers in the absence of the THz field (orange and green dashed lines, bottom scale). The electron kinetic energy scale is shifted with respect to the photon energy scale by 21.65 eV, i.e. by the average I p of the Ne 2p 1/2 and 2p 3/2 states (similar to Fig. 1(a)). All spectra are normalized to the area under the main peak. Peaks appearing in the electron kinetic energy spectra to the left and to the right of the Ne(2p) photoline are artefacts caused mainly by microchannel plate (MCP) dark counts. The photon and electron spectra overlap well, showing that the spectral bandwidth of the XUV pulse is not cut by the beam propagation through the beamline and, in particular, by the multilayer focusing mirror.
For a careful implementation of the pulse retrieval algorithms an accurate characterization of the THz vector potential is essential. Therefore, at each THz-XUV time delay an average center-of-mass value of the electron kinetic energy spectra is determined (Figs. 5(a) and 5(b), points) and then fitted by an amplitude-modulated sinusoidal function (Figs. 5(a) and 5(b), solid line). This allows one to obtain the exact positions of the zero-crossing points as well as the streaking speed values. The following THz field parameters have been extracted: maximal electric field strength 19 MV/m, maximal vector potential A = 4×10⁻⁵ V s/m, maximal streaking speed α = 95 meV/fs, THz frequency 3.14 THz and a linear chirp of 0.136 THz/ps.
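The relation between field strength and streaking speed can be cross-checked with a few lines of arithmetic. The Python sketch below evaluates the ponderomotive potential U_p = e²E²/(4mω²), the maximal energy shift sqrt(8 U_p W) and the streaking speed sqrt(8 U_p W)·ω in SI-based units for the field strength and frequency quoted above. The initial kinetic energy of 70.85 eV is taken as the photon energy of 92.5 eV minus the average Ne(2p) ionization potential of 21.65 eV; the function name is hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837e-31         # electron mass, kg


def streaking_parameters(field_v_per_m, freq_hz, w0_ev):
    """Return (U_p in eV, maximal energy shift in eV, streaking speed in meV/fs)."""
    omega = 2.0 * math.pi * freq_hz                               # rad/s
    u_p_ev = E_CHARGE * field_v_per_m ** 2 / (4.0 * M_E * omega ** 2)
    shift_ev = math.sqrt(8.0 * u_p_ev * w0_ev)
    alpha_mev_fs = shift_ev * omega * 1e-15 * 1e3                 # eV/s -> meV/fs
    return u_p_ev, shift_ev, alpha_mev_fs


u_p, shift, alpha = streaking_parameters(19e6, 3.14e12, 92.5 - 21.65)
print(f"U_p = {1e3 * u_p:.1f} meV, shift = {shift:.2f} eV, alpha = {alpha:.1f} meV/fs")
```

With these inputs the streaking speed comes out close to the 95 meV/fs extracted from the spectrogram fit. Note that sqrt(8 U_p W)·ω reduces to eE·sqrt(2W/m), so the streaking speed at a zero crossing depends on the field strength and the electron velocity but not on the THz frequency.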

FEL pulse retrieval
The pulse retrieval algorithms are implemented by taking single-shot spectra measured with two opposite spectrometers at zero-crossing points of the vector potential (streaked spectra) and one XUV photon energy spectrum measured with the VLS spectrometer (unstreaked spectrum). At FLASH the THz pulse is multi-cycle as shown in Figs. 5(a) and 5(b). This allows to select the optimal streaking speed for the pulse retrieval in accordance with our numerical study. For the Gaussian and the linear streaking approach zero-crossing points that are located close to the pulse maximum amplitude with streaking speed α>85 meV/fs (green dashed lines in Figs. 5(a) and 5(b)) have been selected. Correspondingly for the implementation of simplified CT it is essential to keep the streaking speed as low as possible. Therefore, a region on the edge of the THz pulse with the streaking speed of 5 meV/fs has been selected (blue dashed line in Figs. 5(a) and 5(b)). The number of analysed single-shot measurements at each zero-crossing point is indicated in Figs. 5(a) and 5(b) on the top. For the Gaussian and the linear streaking approaches the time delay steps closest to the green dashed lines in Figs. 5(a) and 5(b) are selected. The fluctuating numbers of measurements at the corresponding time delay step are due to the data acquisition routine. Equation (9) applied in the range of 60 eV to 90 eV is used to calculate the rms spectral bandwidths. Note that the presence of noise spikes might lead to a possible overestimation of the rms spectral bandwidth and, thus, according to Eqs. (5) and (6), to the overestimation of the pulse duration. The choice of the broader spectral range and of the weighted standard deviation given by Eq. (9) for the calculation of the rms width helps to suppress the contribution of noise spikes. 
For the simplified CT it is essential to perform the measurement exactly at the zero-crossing point in order to eliminate the shift of the electron kinetic energy spectra due to the streaking. We have analysed the difference between the center-of-mass of the Ne(2p) photoline in the streaked spectra and that in the unstreaked spectrum at three time delay steps located close to the blue dashed line in Figs. 5(a) and 5(b). If this difference is below 150 meV, the corresponding measurement is selected for further analysis. In total 134 shots are selected. Figure 6 shows representative examples of single-shot spectra taken for analysis with the simplified CT (a, c) and with the Gaussian as well as the linear streaking approaches (b, d). The histograms of the pulse durations retrieved with the Gaussian and the linear streaking approaches are very similar (grey bars represent the overlap between the two histograms). This agrees with our numerical study, which reveals almost no difference between the two methods at large streaking speeds. For the Gaussian approach the average maximum of the histogram, weighted with the number of samples in each bin, is at 97 fs and the rms width of the distribution is 17 fs. For the linear streaking approach the weighted average maximum of the histogram is at 100 fs and the rms width of the distribution is 16 fs. The analysis performed on spectra averaged over all shots in the histogram yields a slightly larger pulse duration (with respect to the weighted average maximum of the histogram) of 105 fs (FWHM) for the Gaussian and 110 fs (FWHM) for the linear streaking approach. This can be explained by the shot-to-shot energy jitter of the FEL pulses (visible in the center-of-mass fluctuations of the photon energy spectra in Fig. 5(c)): averaging the spectra causes a spectral broadening that leads to an overestimation of the pulse duration.
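The broadening caused by averaging can be made explicit with the law of total variance. As a sketch, assume that each of the N single-shot spectra is normalized to unit area and has a center of mass E_k and an rms width σ_k. These symbols are introduced here for illustration and do not appear in the equations of this paper. The rms width of the averaged spectrum then satisfies

```latex
\sigma_{\mathrm{avg}}^{2}
  = \frac{1}{N}\sum_{k=1}^{N}\sigma_{k}^{2}
  + \frac{1}{N}\sum_{k=1}^{N}\bigl(E_{k}-\bar{E}\bigr)^{2},
\qquad
\bar{E} = \frac{1}{N}\sum_{k=1}^{N}E_{k}.
```

The second term is the variance of the shot-to-shot central energy. Any energy jitter therefore makes the averaged spectrum broader than the mean single-shot spectrum, which is consistent with the longer pulse durations retrieved from averaged spectra.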
The results of our pulse retrieval are in good agreement with the 100 fs FWHM photon pulse duration that is expected from the parameters of the machine operation. The rms width of the retrieved distribution may originate either from natural shot-to-shot fluctuations of the FEL pulse duration or from the uncertainty of the method. The latter is mainly caused by the limitations of describing pulses with complex temporal and spectral structures by their rms widths. In particular, the rms pulse width is known to overestimate small contributions to the signal at the edges of a pulse (not to be confused with the noise spikes). Note that the expected absolute accuracy of the linear and Gaussian approaches, as estimated from our numerical simulations, is around 10 fs (Fig. 3(c)). Additionally, in order to estimate the uncertainty related to the method we perform a numerical test. From the distribution of 3000 numerically generated pulses (Section 4), pulses with an average duration of 100 fs (FWHM) and a very narrow distribution width of 2.5 fs rms are preselected. We then apply the Gaussian approach to this selected distribution. The analysis yields the same average pulse duration of 100 fs (FWHM). However, the width of the retrieved distribution is 15 fs rms. This is substantially broader than the initial 2.5 fs rms width but comparable with the width of the distribution obtained from the experimental data (red solid line in Fig. 7). Therefore we suggest that the spread of the reconstructed pulse lengths is mainly related to the accuracy of the method rather than to fluctuations of the FEL pulse length in our experimental run or to other possible sources of noise.
To implement the simplified CT approach, a small streaking speed and the possibility to determine the exact streaking speed value are essential. In the streaking spectrogram shown in Fig. 5 we were able to implement the simplified CT approach at only one zero-crossing point (blue dashed line). Results of the single-shot analyses are shown in the histogram in Fig. 7(b). The distribution of the retrieved pulse lengths is comparatively broad. For the CT approach the information on the spectral phase is encoded in small changes of the spectral amplitude. Therefore, care is taken to select spectra with almost no shift between the center-of-mass of the VLS spectrum (Fig. 6(a), blue line) and the center-of-mass of the Ne(2p) photoline in the streaked spectra (Fig. 6(a), green and orange lines). Moreover, the method is very sensitive to the noise present in the single-shot electron spectra (note the spikes in the green and orange spectra in Fig. 6(a)). A Gaussian filter is used to suppress these contributions (Fig. 6(c)). Additionally, the low signal-to-noise level, i.e. the low number of electrons contributing to the Ne(2p) signal (Fig. 6(a)), introduces some uncertainty in the shape of the spectral amplitude. To estimate an average pulse duration and to eliminate the effect of the noise we have performed the simplified CT pulse retrieval on spectra that were averaged over all shots (the spectra streaked with +α, the spectra streaked with −α, and the unstreaked VLS spectra were averaged separately). The retrieved average pulse duration of 82 fs FWHM is substantially shorter than expected from the parameters of the machine operation. Both findings, i.e. a very broad distribution of retrieved pulse durations and an underestimated average pulse length, agree with the results of our numerical study (Fig. 3(b)). Both the numerical study and the analysis of the experimental results confirm the difficulty of applying the simplified CT to pulses with complex temporal and spectral profiles.
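The noise suppression step can be illustrated with a minimal Gaussian smoothing routine. The filter width used in the experiment is not stated in the text, so `sigma_bins` below is a hypothetical parameter, and the renormalization of the kernel at the edges of the spectrum is a choice made for this sketch.

```python
import math


def gaussian_smooth(values, sigma_bins):
    """Smooth a sampled spectrum with a Gaussian kernel of rms width sigma_bins.

    The kernel is truncated at 4 sigma and renormalized near the edges,
    so a constant spectrum is left unchanged.
    """
    radius = int(math.ceil(4.0 * sigma_bins))
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma_bins) ** 2) for k in offsets]
    n = len(values)
    smoothed = []
    for i in range(n):
        acc = 0.0
        norm = 0.0
        for k, w in zip(offsets, kernel):
            j = i + k
            if 0 <= j < n:
                acc += w * values[j]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


# A single-bin noise spike is spread out and strongly reduced in height.
spike = [0.0] * 20 + [1.0] + [0.0] * 20
print(max(gaussian_smooth(spike, sigma_bins=2.0)))
```

The filter width is a trade-off: a wider kernel removes more noise, but it also washes out the small amplitude changes that carry the phase information in the CT approach.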

Summary
We have applied the Gaussian approximation, the linear streaking, and the simplified CT approaches in order to retrieve the durations of XUV pulses with complex spectral and temporal structures. The single-shot performance of all three pulse retrieval algorithms is compared using numerically generated as well as experimental data. We found no significant difference between the Gaussian approximation and the linear streaking approaches when applied at large streaking speeds. At smaller streaking speeds the Gaussian approach provides a better estimate of the average pulse duration. However, as judged by the correlation coefficient, the accuracy of the shot-to-shot pulse analysis decreases with decreasing streaking speed for both the Gaussian and the linear streaking approaches. Therefore, we suggest that at large streaking speeds (>25 meV/fs in our case) a reliable estimate of the FEL pulse duration can be obtained with linear streaking, $\tau_{\mathrm{XUV}} = \sigma_{\mathrm{streak}}/|\alpha|$, which requires the measurement of only one streaked spectrum and the knowledge of the streaking speed. From the user perspective, the very fast evaluation of $\tau_{\mathrm{XUV}}$ from the measured spectrum makes the linear streaking algorithm well suited for efficient XUV pulse diagnostics. Care has to be taken when judging single-shot results. We have found that even at large streaking speeds the retrieved pulse duration distribution is substantially broader than the original distribution. In these analyses we selected numerically generated pulses with a FWHM duration of 100 fs ± 2.5 fs (standard deviation); the retrieved pulses had a FWHM duration of 100 fs ± 15 fs (standard deviation). We explain this finding mainly by the limitations of defining spectral and temporal widths by rms values for pulses with a complex structure.
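The linear streaking estimate is simple enough to be written in a few lines. The sketch below assumes that σ_streak is an rms width in eV and that α is given in meV/fs, so the result is an rms duration in fs. The conversion to FWHM additionally assumes a Gaussian pulse shape. The function names and the input value of 4 eV are hypothetical.

```python
import math

# FWHM of a Gaussian in units of its rms width.
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def linear_streaking_duration(sigma_streak_ev, alpha_mev_per_fs):
    """rms pulse duration in fs from tau = sigma_streak / |alpha|."""
    if alpha_mev_per_fs == 0.0:
        raise ValueError("streaking speed must be non-zero")
    return sigma_streak_ev * 1000.0 / abs(alpha_mev_per_fs)


def rms_to_fwhm(sigma):
    """Convert an rms width to FWHM, assuming a Gaussian profile."""
    return FWHM_PER_SIGMA * sigma


# Hypothetical streaked rms bandwidth of 4 eV at the maximal speed of 95 meV/fs.
tau_rms = linear_streaking_duration(4.0, -95.0)
print(f"{tau_rms:.1f} fs rms, {rms_to_fwhm(tau_rms):.1f} fs FWHM")
```

The sign of α drops out, so spectra from either of the two opposite spectrometers can be used with the same function.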
In contrast to the Gaussian and the linear streaking approaches, the application of the simplified CT to pulses with complex spectral and temporal structures is difficult. According to the approximation made in Eq. (8), the method is limited to very low streaking speeds. In the case considered here the streaking speed has to be below 5 meV/fs. However, the data analysis at these streaking speeds is very challenging because of the very low electron kinetic energy shift $W_{\mathrm{max}}$ and the very small changes of the spectral bandwidth in the streaked spectra (Table 1 and Fig. 6(a)). Moreover, the method is very sensitive to noise in the single-shot spectra because the information on the spectral phase is encoded in small amplitude changes. In particular, we have found that the noise on the edges of the main photoline has a substantial influence on the results of the reconstruction. From the user perspective, this limitation makes the analysis of individual pulses very time consuming. Additionally, the requirement to perform the measurement exactly at the zero-crossing point of the vector potential leads to a high rejection rate because of the jitter between the XUV and THz pulses.
So far the simplified CT algorithm has been implemented in the optical regime for pulses with an almost flat spectral phase [43]. The authors found very good agreement between the retrieved second-order intensity autocorrelation function and the one measured in a non-linear optical crystal. The reconstructed pulse profile is not shown in that study [43]. In the numerical example shown in Ref. [40] the original spectral phase of the pulse contains a combination of quadratic and third-order terms; in the retrieved spectral phase, however, the quadratic term seems to dominate. Our simulations show that with increasing streaking speed the retrieved pulse length becomes almost independent of the original pulse length. For the numerically generated pulses a larger pulse duration corresponds to an increasing complexity of the spectral phase, because the spectral bandwidth was kept limited at an average FWHM of (1.1±0.2) eV (standard deviation). Summarizing these results, we suggest that the CT algorithm might be more sensitive to the lower-order phase terms and becomes less applicable for pulses with increasing complexity of the spectral phase. Further investigations of the applicability of the simplified CT algorithm to XUV pulse diagnostics are needed.
In conclusion, we have explored the performance of the THz streak camera for the single-shot characterization of XUV pulses with complex temporal structure. The results of our numerical study agree with the analysis of the experimental data. We expect that our results will stimulate further development of efficient pulse retrieval algorithms for single-shot XUV pulse diagnostics at existing and emerging user facilities based on free-electron lasers or on high-harmonic generation.