Three-dimensional, time-dependent simulation of free-electron lasers with planar, helical, and elliptical undulators

Free-electron lasers (FELs) have been built ranging in wavelength from long-wavelength oscillators using partial wave guiding through ultraviolet through hard x-ray that are either seeded or start from noise. In addition, FELs that produce different polarizations of the output radiation ranging from linear through elliptic to circular polarization are currently under study. In this paper, we develop a three-dimensional, time-dependent formulation that is capable of modeling this large variety of FEL configurations including different polarizations. We employ a modal expansion for the optical field, i.e., a Gaussian expansion with variable polarization for free-space propagation. This formulation uses the full Newton–Lorentz force equations to track the particles through the optical and magnetostatic fields. As a result, arbitrary three-dimensional representations for different undulator configurations are implemented, including planar, helical, and elliptical undulators. In particular, we present an analytic model of an APPLE-II undulator to treat arbitrary elliptical polarizations, which is used to treat general elliptical polarizations. To model oscillator configurations, and allow propagation of the optical field outside the undulator and interact with optical elements, we link the FEL simulation with the optical propagation code OPC. We present simulations using the APPLE-II undulator model to produce elliptically polarized output radiation, and present a detailed comparison with recent experiments using a tapered undulator configuration at the Linac Coherent Light Source. Validation of the nonlinear formation is also shown by comparison with experimental results obtained in the Sorgente Pulsata Auto-amplificata di Radiazione Coerente SASE FEL experiment at ENEA Frascati, a seeded tapered amplifier experiment at Brookhaven National Laboratory, and the 10 kW upgrade oscillator experiment at the Thomas Jefferson National Accelerator Facility.


Introduction
While free-electron lasers (FELs) have been intensively studied since the 1970s, new developments and concepts keep the field fresh. Intensive work is ongoing into new FEL-based light sources that probe ever shorter wavelengths with a variety of configurations. There presently exists a large variety of FELs ranging from longwavelength oscillators using partial wave guiding to ultraviolet and hard x-ray FELs that are either seeded or starting from noise (i.e., self-amplified spontaneous emission (SASE)). As these new light sources come on-line, interest will grow in shorter pulses, new spectral ranges and higher photon fluxes. In addition, interest is growing in producing photons with a variety of polarizations ranging from linear, through elliptical, to circular. Indeed, novel configurations have been described for producing variable polarizations in synchrotron light sources and FELs using a variety of different undulator designs including APPLE-II and Delta-type undulators [1][2][3][4][5][6][7].
In this paper, we develop a three-dimensional, time-dependent nonlinear formulation of the interaction that is capable of modeling such a large variety of FELs, in particular this represents the first presentation of a three-

General simulation properties
The formulation we develop describes the particles and fields in three spatial dimensions and includes time dependence as well. Electron trajectories are integrated using the complete Newton-Lorentz force equations. No wiggler-averaged-orbit approximation is made. The magnetostatic fields can be specified by analytical functions for a variety of analytic undulator models (such as planar, elliptical, or helical representations), quadrupoles, and dipoles. These magnetic field elements can be placed in arbitrary sequences to specify a variety of different transport lines. As such, we can set up field configurations for single or multiple wiggler segments with quadrupoles either placed between the undulators or superimposed upon the undulators to create a FODO lattice. Dipole chicanes can also be placed between the undulators to model various optical klystron and/or high-gain harmonic generation configurations. The fields can also be imported from a field map.
The electromagnetic field is described by a modal expansion. For free-space propagation, we use Gaussian optical modes. The Gauss-Hermite modes are used for simulation of planar undulators, while Gauss-Laguerre modes are used for elliptical or helical undulators.
The electromagnetic field representations are also used in integrating the electron trajectories, so that harmonic motions and interactions are included in a self-consistent way. Further, the same integration engine is used within the undulator(s) as in the gaps, quadrupoles, and dipoles, so that the phase of the optical field relative to the electrons is determined self-consistently when propagating the particles and fields in the gaps between the undulators.
Particle loading is done in a deterministic way using Gaussian quadrature that preserves a quiet start for both the fundamental and all harmonics. Shot noise is included using a Poisson statistics algorithm [21] so that the formulation is capable of simulating SASE FELs; however, provision is made for enhanced shot-noise due to various levels of micro-bunching.
The FEL simulation has also been linked to the OPC [8,9] for the simulation of FEL oscillators or propagating an optical field beyond the end of the undulator line to a point of interest. OPC propagates the optical field using either the Fresnel diffraction integral or the spectral method in the paraxial approximation using fast discrete Fourier transforms (FFT). A modified Fresnel diffraction integral [22,23] is also available and allows the use of FFTs in combination with an expanding grid on which the optical field is defined. This method is often used when diffraction of the optical field is large. Propagation can be done either in the time or frequency domain. The latter allows for the inclusion of dispersion and wavelength dependent properties of optical components. Currently, OPC includes mirrors, lenses, phase and amplitude masks, and round and rectangular diaphragms. Several optical elements can be combined to form more complex optical component, e.g., by combining a mirror with a hole element, extraction of radiation from a resonator through a hole in one of the mirrors can be modeled. Phase masks can be used, for example, to model mirror distortions or to create nonstandard optical components like a cylindrical lens.
In a typical resonator configuration, OPC handles the propagation from the end of the gain medium to the first optical element, applies the action of the optical element to the optical field and propagates it to the next optical element and so on until it reaches the entrance of the gain medium. Diagnostics can be performed at the planes where the optical field is evaluated. Some optical elements, specifically diaphragms and mirrors allow forking of the optical path. For example, the reflected beam of a partial transmitting output mirror forms the main intracavity optical path, while the transmitted beam is extracted from the resonator. When the intracavity propagation reaches the output mirror, this optical propagation can be temporarily suspended and the extracted beam can be propagated to a diagnostic point for evaluation. Then the intra-cavity propagation (main path) is resumed.
The numerical procedure involves translating between the input/output required for the FEL simulation and OPC. Initially, we run the FEL simulation to determine the optical output after the first pass through the undulator, which then writes a file describing the complex field of the optical mode. OPC is then used to propagate this field to the downstream mirror, which is partially transmissive in the current example. The portion of the optical mode that is reflected is then propagated to the upstream mirror (which is a high reflector) by OPC, and then back to the undulator entrance. The field at the undulator entrance is then reduced to an ensemble of Gaussian modes that is used as input to the FEL simulation for the next pass. This process is repeated for an arbitrary number of passes. While the example discussed in this paper relates to a concentric resonator, OPC has also been used to simulate a regenerative amplifier with a ring resonator [24].

The field representations
The undulator field models are three-dimensional representations. Two planar undulator models are available corresponding to flat-pole-faces and parabolic-pole-faces. The parabolic-pole-face model provides weak two-plane focusing. The elliptical undulator field is modeled by a representation of an APPLE-II undulator consisting of two flat-pole-face undulators that are shifted in phase. In each case, however, the injection into and ejection from the undulators is simulated by the particle tracking algorithms using smooth models for the undulator transitions. The quadrupole and dipole field models used are curl-and divergence-free representations with hard-edged field transitions.

The flat-pole-face undulator
The flat-pole-face undulator is represented by where B w and k w (= 2π/λ w , where λ w is the undulator period) are the undulator amplitude and wavenumber respectively. This field is both curl-and divergence-free when the amplitude, B w , is constant. The transitions at the ends of each undulator segment are modeled via where B w0 is the field amplitude in the uniform region, L w is the undulator segment length, N tr is the number of undulator periods in the transition region, and L tr (= L w -N tr λ w ) is the start of the output transition. The field in the transitions is divergence-free, and the z-component of the curl also vanishes. The transverse components of the curl do not vanish, but are of the order of (k w B w ) −1 dB w /dz, which are usually small.

The parabolic-pole-face undulator
The parabolic-pole-face field model is given by and B w (z) is given in equation (2). As in the case of the flat-pole-face model, this field is divergence-free and the zcomponent of the curl also vanishes.

The helical undulator
The helical undulator model that is employed is of the form in cylindrical coordinates where χ=k w z-θ, I 1 denotes the regular Bessel function of the first kind, and B w (z) is given by equation (2).

The APPLE-II undulator model
An approximate representation of an APPLE-II undulator can be formed by the super-position of two flat-poleface undulator models that are oriented perpendicularly to each other and phase shifted with respect to the axis of symmetry. As such, the field is represented in the form ( ) where, as before, B w (z) is given by equation (2). This is an approximate representation of an APPLE-II undulator that is valid near the axis of symmetry. The ellipticity is governed by the choice of the phase, f.
For 0f π/2, the ellipticity, u e , is given by for which the semi-major axis is oriented along π/2. The choice of f=0 (π/2) corresponds to planar (helical) polarization. When π/2fπ, the ellipticity is cos 8 e and the semi-major axis is oriented along −π/2. Illustrations of the on-axis field contours are shown in figure 1, where we plot the y-component of the field versus the x-component (normalized to the amplitude) for f = π/8, π/4, π/2 and 3π/4. The choice of elliptical polarization for the Gaussian modes has the semi-major axis aligned along the x-axis, so that this undulator field must be rotated in order to correspond to the polarization of the radiation field.

Quadrupole and dipole fields
The quadrupole field model used is , 9 x y Q Q where B Q (z) is the field gradient (constant) defined over a range z 1 zz 2 . This field is both curl-and divergence-free over this range. The dipole field model is described by a constant field oriented perpendicularly to the axis of symmetry over some range z 1 zz 2 .

The Gaussian optical modes
The Gauss-Hermite modes are used in simulating the interaction with planar undulators. In this case, the field representation is 2 where the indices (l, n) describe the transverse mode structure, the index h is the harmonic number, the field amplitudes, ( ) dA , describes the transverse mode structure where H l are the Hermite polynomials, ζ x =√2x/w h , ζ y =√2y/w h , and w 0,h and w h denote the waist size and spot size of the hth harmonic respectively. The spot size is assumed to be a slowly-varying function of (z, t). The phase is where k 0 =ω/c, α h denotes the curvature of the phase front of the hth harmonic and which is assumed to be a slowly-varying function of (z, t).
The Gauss-Laguerre modes are used when simulating elliptical and helical undulators. The field representation is where the transverse mode structure is given by L n l is the associated Laguerre polynomial, and ζ=√2r/w h . The phase is given by The total power carried in each mode, P l,n,h , is given by integration of the Poynting vector over the cross section. This is given by for the Gauss-Hermite modes, and

The dynamical equations
The dynamical equations for the fields employ the source-dependent expansion [25] which is an adaptive eigenmode algorithm in which the evolution of the spot size and curvature are determined self-consistently in terms of the interaction with the electron beam. As such, the dynamical equations for the fields are of the form where ( ) S l n h , , 1,2 are the source terms, is the convective derivative, and for F l,n = 1 +l+n(= 1+|l|+2n) for the Gauss-Hermite (Gauss-Laguerre) modes. The source terms are given by for the Gauss-Hermite modes, and for the Gauss-Laguerre modes, where ω b is the beam plasma frequency, and ( ) á ¼ ñdenotes an average over the initial particle distribution and υ i (i=x, y, z) are the Cartesian components of the velocity of an electron with z taken along the axis of the undulator. A uniform distribution in initial phase and a Gaussian distribution in coordinate and momentum space is assumed in the examples discussed in this paper. In this case

/ /
where γ avg and Δγ denote the average energy and energy spread, and σ r and σ p describe the initial transverse phase space. The averaging operator defines how the particle phase space is generated. The integral over each degree of freedom is discretized using Gaussian quadrature to set the initial phase space coordinates and charge weight of each electron in the simulation. Note that the various macro-electrons included in the simulation carry a variety of different charge weights that are set initially by the Gaussian quadrature algorithm. Once the initial phase space is generated, the coordinates and momenta for each macro-electron is tracked by integrating the Newton-Lorentz equation simultaneously with the equations for the optical fields.
We remark that there are alternative ways of implementing this particle average and different distributions can be used to initialize the electron phase space internally. However, MINERVA also has the ability to import initial particle distributions from alternative sources such as electron beam dynamics simulations through accelerators and magnetic transport lines.
The evolution of the spot size and curvature are governed by where the source terms are defined as These field equations are integrated together with the Newton-Lorentz force equations for the particles.
where ψ is the ponderomotive phase, where δE and δB correspond to the electric and magnetic fields of the complete super-position of Gaussian modes, and B static is the magnetostatic fields (undulators, quadrupoles, and dipoles).

Numerical considerations
MINERVA and MEDUSA [13] both employ the SVEA using a Gaussian modal superposition for the optical fields and a non-wiggler-averaged orbit analysis. While the development of MINERVA benefits from lessons learned in the development of MEDUSA, it goes about the process in significantly different ways and contains many improvements in features, diagnostics, and computational efficiency. Rather than list all of the differences, we note that the two most important differences are (1) the Gaussian modal representation has been changed and significantly simplifies the dynamical equations for the fields and (2) the application of slippage has been improved by implementing a higher order interpolator. These improvements preserve energy conservation to within several parts in 10 3 , which is 1 to 2 orders of magnitude better than is achieved with full time-dependent SASE simulations with MEDUSA. MINERVA can treat steady-state simulation by the simple expedient of including a single temporal slice in the simulation. Time dependence is treated by including multiple temporal slices and allowing the field slices to advance relative to the electron slices at arbitrary integration intervals. Since the optical field slips ahead of the electrons at the rate of one wavelength per undulator period, if this slippage operation is performed at shorter intervals than the undulator period, then the field advance is interpolated between adjacent temporal slices based on this slippage rate.
The total number of equations in each simulation is where N slices is the number of slices in the simulation, and for each slice, N particles is the number of particles, N modes is the number of modes in all the harmonics, and N harmonics is the number of harmonics. This complete set of coupled nonlinear differential equations is solved numerically using a Runge-Kutta algorithm. A 4th order Runge-Kutta integrator is typically used; however, since small spatial steps are required to resolve the wiggler motion (20-30 steps per undulator period), it is sometimes possible to use a 2nd order Runge-Kutta algorithm which halves the run times without a significant loss of accuracy. It should also be remarked that the Runge-Kutta algorithms allow for changing the step size 'on the fly' so that different step sizes can be used in quadrupoles, dipoles, or drift spaces between magnetic elements. The number of Gaussian optical modes required to achieve numerical convergence is determined empirically by adding modes until the desired level of convergence is obtained. The number of modes needed to achieve convergence to within 1%-5% varies with the parameters of the simulation. In general, however, the number of required modes depends upon the relative values of the gain length and the Rayleigh range, with fewer modes needed for longer Rayleigh ranges. For all the examples discussed in this paper, 20-30 modes were used.
The particle averages in the source terms are implemented by converting the continuous integral over a distribution function into a discrete set of macro-particles using Gaussian quadrature over each of the degrees of freedom. Good accuracy typically requires using at least 8000 or so particles per slice, but significantly more particles may be required when greater transverse resolution is desired or when simulating harmonic generation.

Vacuum propagation
That these equations recover vacuum propagation can be demonstrated by considering the case in which the electron beam is not present and the sources (equations (21) and (22)) vanish. As a result, X h = Y h = 0 so that the spot size and curvature satisfy the following equations These equations have the well-known solutions for the spot size and curvature in vacuo where is the Rayleigh range and z 0 denotes the position of the mode waist. then the dynamical equations can be reduced to equations for the derivatives of the amplitude δA l,n,h and phase j as which shows that the power is constant, and which indicates that the phase variation is described by the Gouy phase. As a result, the dynamical equations describe vacuum diffraction in the absence of an electron beam.

Elliptical undulators
We now describe the generation of elliptically polarized radiation using an elliptically polarized undulator. For convenience, we consider the same beam, undulator and focusing configuration as used in the simulation of the SPARC experiment (section 8), except that we now use the APPLE-II undulator model and elliptically polarized radiation. In addition, we limit the simulation to the steady-state (i.e., a single temporal slice) regime since that is sufficient to demonstrate the reliability of the formulation and allows us to compare the simulation results with an analytic theory. In order to compare the simulation results with an analytic theory, we make use of a description of the effect of an elliptical undulator on the resonant wavelength and the usual JJ-coupling factor that has been given by Henderson et al [26]. The generalized resonance condition varies with the ellipticity as follows Observe that this reduces to the usual expressions in the limits of planar (u e =0) and helical (u e =1) undulators. The generalized JJ-factor is given by In [26], the authors compared the results of simulations for different choices of the ellipticity using (1) a onedimensional, orbit-averaged simulation code in which the generalized resonance condition and JJ-factor were implemented and (2) the implementation of an elliptical undulator model in the one-dimensional PIC PUFFIN [14] code. Since the PUFFIN code does not make use of the orbit average and does not explicitly include either the resonance condition or the JJ-factor, it is expected that the ellipticity is included self-consistently. The comparison of the two codes showed excellent agreement. Hence, we conclude that the generalized dynamical equations constitute a reliable description of the ellipticity. As a result, we can obtain a three-dimensional approximation of the interaction in an elliptical undulator by using these expressions for the resonant wavelength and JJ-factor in the parameterization given by Ming Xie [15]. This generalized parameterization is then compared with the results of three-dimensional simulations. The undulator field amplitude (left axis in blue) associated with the generalized resonance and the generalized JJ-factor (right axis in red) for the parameters of interest are shown in figure 2 as functions of the ellipticity. We employed these undulator field amplitudes in performing the simulations for various choices of the ellipticity. Note, however, that the simulation model does not employ a wiggler-averaged orbit integration; hence, the physics associated with the JJ-factor is implicitly included in the simulations. As a result, the JJ-factor is only used in generalizing the parameterization developed by Ming Xie for comparison purposes.
Simulations have been performed for ellipticities ranging from zero (planar undulator) to unity (helical undulator) using the APPLE-II undulator representation. In each case, the simulation was started from shot noise using the same noise seed. No average over multiple noise seeds was performed; however, this is perfectly adequate since our intention is to study the variation in performance due to different ellipticities and the initial phase space used in the different simulations is invariant with respect to the ellipticity. Results showing the power growth along the undulator line are shown in figure 3 for ellipticities of 0, 0.15, 0.25, 0.50 and 1.0. As shown in the figure, the distance to saturation tends to decrease with increasing ellipticity. This is understandable since the JJ-factor increases with the ellipticity and this tends to increase the strength of the interaction.
A comparison between the saturation distances found in simulations and the predictions based on the generalized parameterization due to Ming Xie is shown in figure 4 where we plot the saturation distance versus the ellipticity. It should be remarked that we have added the drift space between the undulators to the predictions of the generalized parameterization. Since the simulation includes two extra undulator periods in each undulator to model the transitions at the entrances and exits of the undulators, we have added these lengths to the generalized parameterization as well. It is evident from the figure that the simulation is in good agreement with the generalized parameterization.

The LCLS SASE FEL
The LCLS [16] is a SASE FEL user facility that became operational in 2009 operating at a 1.5 Å wavelength. In this paper, we first discuss a comparison with the initial results from the LCLS in order to validate the model. We then present the first comparison showing substantial agreement between simulation and an experiment on the LCLS that employed an aggressive taper to enhance the efficiency.
The best experimental estimates for the fundamental operating parameters for operation at 1.5 Å are listed in table 1. It employs a 13.64 Ge V/25 pC electron beam with a flat-top temporal pulse shape of 83 fs duration. The  normalized emittance (x and y) is 0.4 mm mrad and the rms energy spread is 0.01%. The undulator line consisted of 33 segments with a period of 3.0 cm and a length of 113 periods including one period each in entry and exit tapers. A mild down-taper in field amplitude of −0.0016 kG/segment starting with the first segment (with an amplitude of 12.494 kG and K rms =2.4748) and continuing from segment to segment was used. This is the so-called gain taper. The electron beam was matched into a FODO lattice consisting of 32 quadrupoles each having a field gradient of 4.054 kG cm −1 and a length of 7.4 cm. Each quadrupole was placed a distance of 3.96 cm downstream from the end of the preceding undulator segment. The Twiss parameters for this FODO lattice are also shown in table 1.
The propagation of the beam through the LCLS undulator/quadrupole lattice as found in simulation is shown in figure 5, where we plot the beam envelope in x (blue, left axis) and y (red, right axis) versus position. Observe that the beam is well-confined over the 130 m of the extended lattice with an average beam size of about 21 mm.
The LCLS produces pulses of about 1.8 mJ at the end of the undulator line [16] and saturation is found after about 65-75 m along the undulator line. A comparison between the measured pulse energies (red circles), as  obtained by giving the electrons a kick to disrupt the FEL process, and the simulation (blue) is shown in figure 6. The experimental data is courtesy of P Emma and H-D Nuhn at SLAC, and the simulation results represent an average over an ensemble of 25 runs performed with different noise seeds. As shown in the figure, the simulations are in good agreement with the measurements in the exponential growth region with close agreement for the gain length. The simulation exhibits saturation at the same distance as the experiment in the range of 65-75 m at a pulse energy of 1.5 mJ. After saturation, in view of the gain taper, the pulse energy grows more slowly to about 2.02 mJ at the end of the undulator line, which is approximately 8% higher than the observed pulse energy. The agreement between simulation and experiment for the pulse energy is poorer during the early stages of the interaction. This may be due to a variety of reasons. On the experimental side, as the pulse energy grows by 5 to 6 orders of magnitude from the initial shot noise to saturation, it is difficult to calibrate the detectors for the low pulse energies at the early stages of the interaction. Also, while kicking the electrons provides a fast measurement method, it is accompanied by a larger background signal and the possibility of restarting the FEL process downstream the undulator line when the kick is performed at the beginning of the undulator line [17]. On the simulation side, there may be some inaccuracies in the shot noise algorithm that underestimates the initial noise level. Experiments have also been performed at the LCLS [17] to investigate enhancing the efficiency using a more sharply tapered undulator. The LCLS configured with a stronger taper for the last segments has demonstrated enhancements in the efficiency. This experiment employed an undulator in which the aforementioned mild linear down-taper is enhanced by the addition of a more rapid down-taper starting at the 14th undulator segment. This so-called saturation taper profile is shown in figure 7 (data courtesy of D Ratner).  In comparison with the undulator and electron beam properties employed in the first lasing experiments, the tapered undulator experiment employed undulators tuned to somewhat different field strengths and electron beam parameters that may have varied from the first lasing experiment. The pulse energies in the experiment were obtained by measuring the energy loss in the electron beam. Simulations were conducted over a parameter range including emittances of 0.40-0.45 mm mrad and energy spreads of 0.010%-0.015% that are thought to characterize the electron beam.
A comparison between the measured pulse energies and simulations over the parameter range that most closely agree with the experiment is shown in figure 8, where the experimental results are shown in red and have been obtained by measuring the electron energy loss. The maximum pulse energy shown represents an enhancement of the efficiency by a factor of 2-3 over what is found with the gain taper alone. As is evident from the figure, the simulations for the three choices are all very similar and are in good agreement with the measurements, indicating that the efficiency enhancement could be achieved for a variety of electron beam parameters.

The SPARC SASE FEL
The SPARC experiment is a SASE FEL located at ENEA Frascati [18]. The best estimate for the experimental parameters of SPARC are summarized in table 2 and are as follows. The electron beam energy was 151.9 MeV, with a bunch charge of 450 pC, and a bunch width of 12.67 ps. The peak current was approximately 53 A for a parabolic temporal bunch profile. The x and y emittances were 2.5 mm mrad and 2.9 mm mrad respectively,  and the rms energy spread was 0.02%. There were six undulators each of which was 77 periods in length (with one period for the entrance up-taper and another for the exit down-taper) with a period of 2.8 cm and an amplitude of 7.88 kG. In the simulation eight undulators were used to show saturation of the system. The gap between the undulators was 0.4 m in length and the quadrupoles (0.053 m in length with a field gradient of 0.9 kG cm −1 forming a strong focusing lattice were located 0.105 m downstream from the exit of the previous undulator. Note that the quadrupole orientations were fixed and did not alternate. The electron beam was matched into the undulator/focusing lattice. The resonance occurred at a wavelength of 491.5 nm. The pulse energies were measured in the gaps between the undulator segments.
Given the bunch charge available, the SASE interaction was unable to reach saturation over the six undulators present. Hence, for the purposes of the simulation we shall add two extra undulators to bring the interaction to saturation.
The propagation of the beam through the undulator/quadrupole lattice as found in simulation is shown in figure 9, where we plot the beam envelope in x (blue, left axis) and y (red, right axis) versus position. Observe that the beam is well-confined over the 20 m of the extended lattice with an average beam size of about 115 mm.
A comparison of the evolution of the pulse energy as found in simulation and as measured in the experiment is shown in figure 10 where the simulation is indicated by the blue line and is an average taken over 20 simulation runs with different noise seeds. The pulse energy was measured in the gaps between the undulators, and the results for a sequence of shots are indicated by the red markers (data courtesy of L Giannessi). Observe that the agreement between the simulation and the measured performance is excellent over the entire range of the experiment. In addition, the simulation shows that saturation could have been reached after about 18-20 m with two additional undulator segments.
This result is in substantial agreement with the parameterization developed by Ming Xie [15]. Using a βfunction of about 2 m, we find that the Pierce parameter ρ≈2.88×10 −3 and that this parameterization predicts a gain length of 0.67 m, and a saturation distance of 18.1 m (including the additional 3.2 m represented by the gaps between the undulators). This is in reasonable agreement with the simulation.
A comparison between the evolution of the relative linewidth as determined from simulation and by measurement (data courtesy of L Giannessi) is shown in figure 11 over the range of the installed undulators and agreement between the simulation and the measured linewidth is within about 35% after 15 m. As shown in the figure, the predicted linewidths are in substantial agreement.
The initial decrease in linewidth shown in figure 11 results from the development of temporal coherence as can be seen in figures 12-14, where we plot the power versus time within the optical pulse. The time window used in the simulation was chosen to be 14 ps in order to allow for slippage across the 12.67 ps electron bunch. The optical pulse at the start-up of the SASE interaction is expected to contain a large number of 'spikes'. This is indeed what is found in simulation as shown in figure 12, where we plot the power in the pulse over the entire time window. This pulse is near the start of the undulator line and exhibits a broad distribution of spikes       coinciding roughly with the center of the electron bunch, which is located at the center of the time window. As shown in figure 11, the linewidth narrows as the interaction proceeds and this corresponds to the development of temporal coherence. This evolution of temporal coherence is illustrated in figures 13 and 14, by comparison with figure 12, where the temporal pulses are shown at z = 10.0 m and 15.0 m respectively. These two figures correspond to the exponential gain region prior to saturation. It is clear in these figures that the early collection of a large number of spikes has coalesced into a more sharply peaked distribution containing a smaller number of spikes. This corresponds to the narrowing of the linewidth due to the development of coherence in the exponential gain region. Comparing figures 10 and 11, we observe that the measured pulse energy is somewhat higher and the relative linewidth somewhat lower than the simulation results, which may be due to a small difference in the beam match.

The BNL tapered amplifier
A tapered-wiggler, seeded amplifier experiment was conducted at the Source Development Laboratory at BNL [19] using a high brightness electron injector, a chicane bunch compressor feeding a 100 MeV, S-band SLAC type traveling wave linac. The electron beam is then injected into the NISUS wiggler [27] that was built for Boeing Aerospace. The NISUS wiggler is a 10 m long planar wiggler with a period of 3.89 cm and with weak, twoplane focusing. The NISUS undulator consists of a linkage of 1 m segments, and a taper can be imposed by choosing a segment and opening the jaws of the undulator starting at that point. This creates a linear downward taper of the field. A Ti:sapphire laser was used both as the driver for the photo-cathode electron gun and as the seed laser for the FEL amplifier operating at a wavelength of 793.5 nm. A 300% enhancement over the uniform wiggler interaction was observed when the NISUS undulator was tapered.
The experimental parameters are given in table 3. The resonant electron beam energy is 100.86 MeV, and the bunch charge is 360 pC over a bunch duration of 1.8 ps (full width), yielding a peak current of 300 A for a parabolic pulse shape. The normalized emittance was 4.0 mm mrad and the rms energy spread was 0.1%. The electron beam was matched into the weak focusing, NISUS undulator with a matched beam radius of about 212 mm and a 2.23 m β-function. The amplitude of the NISUS wiggler in the uniform section was 3.03 kG (K rms =0.848). The optical seed pulses provided by the Ti:Sapphire laser had peak powers of up to about 10 kW with a pulse duration of 6 ps, which was wider than the electron bunch duration. Indeed, this ensures that the electron beam experiences a relatively uniform seed laser intensity over the entire bunch at the outset.
The experiment was run at the resonant energy. While the simulation can be run unambiguously at the resonant energy, finding the resonant energy in the experiment involved adjusting the electron beam energy from the linac. Since there was insufficient bunch charge to reach saturation in SASE mode, and since the growth rate peaks on-resonance, the unsaturated SASE interaction will yield maximum power when the beam energy is tuned to the resonant energy. As a result, when this condition was realized, the linac was 'locked down' to this beam energy and the seed laser was turned on. The pulse energy was measured by 'kicking' the beam to the wall at various axial positions and measuring the output pulse energy that resulted. A comparison between the simulation and measured pulse energies for a uniform undulator is shown in figure 15, where the data (courtesy of X J Wang and J B Murphy) is indicated in red and the error bars indicate the standard deviation for a series of shots. It is evident that good agreement is found between the simulation and the measurements. Saturation at about 113 ± 28 μJ is found after about 7.0-7.5 m. The simulation result of 103 μJ is well within the range of uncertainty found in the experiment.
The NISUS undulator can be tapered in 1 m steps. Since the optimal taper is dependent upon both the starttaper point and the taper slope, and since the start-taper point must be located prior to saturation in the uniform undulator, finding the optimal taper configuration was an iterative process. The choice of 10 kW seed power was made by trial and error to optimize the start-taper point at 7.0 m. Further optimization indicated that a down taper of 4% over the final 3 m of the undulator yielded the maximum output power.
A comparison between the measured pulse energies for the tapered undulator (data courtesy of X J Wang and J B Murphy) and the corresponding simulation results is shown in red in figure 16. The uniform undulator results taken from figure 15 are also shown for comparison in blue. As evidenced in the figure, the agreement between the simulation and the measurements is excellent. The measured output was 283 ± 68 μJ, and the simulation result of 296 μJ also falls well within the range of experimental uncertainty. This represents an increase of almost 300% over the output of the uniform undulator.
The spectra as observed in the experiment and as found in simulation show similar agreement. Note that the accuracy of the spectral measurements was limited by the bandwidth of the filter, and the experimental spectra may be shifted by as much as ±0.5 nm. The spectra as determined at z = 6.0 m are shown in figure 17 where the measured spectrum is shown in red and the simulation result is shown in blue. Observe that the peaks and spectral widths agree closely. The comparison between the observed (red) and simulation (blue) spectra at the exit from the tapered undulator is shown in figure 18, and the agreement is very good at this point as well. The shift in the simulation spectrum relative to that measured is only 0.45 nm, which is well within the sensitivity of the spectrometer. In addition, the spectral widths are very close, and the sidebands indicated at about 796 nm are also in good agreement between the measurement and the simulation. Although the simulation observed somewhat more sideband growth than the experiment, sidebands do not seem to be an important component of the output spectra.

The JLab IR-Upgrade FEL oscillator
To further investigate the simulation capabilities we also compared the simulation with the IR-Upgrade FEL oscillator at JLab [20]. The basic experimental parameters were a kinetic energy of 115 MeV, an energy spread of 0.3%, a bunch charge of 115 pC, a pulse length of 390 fs, a normalized emittance of 9 mm mrad in the wiggle plane and 7 mm mrad in the plane orthogonal to the wiggle plane, and a repetition rate of 74.85 MHz for the electron beam. The planar undulator was 30 periods long, had a period of 5.5 cm, and a peak on-axis magnetic field of 3.75 kG. For proper electron beam transport through the undulator, we used a one period up-and downtaper. The electron beam was focused into the undulator with the focus at the center of the device. The resonator    length was about 32 m and the cold-cavity Rayleigh length was 0.75 m. The total loss in the resonator was 21% with about 18% out-coupled per pass from the downstream mirror. For these settings, the wavelength was 1.6 μm.
To simulate the FEL oscillator, OPC takes the optical pulse at the exit of the undulator and propagates the pulse through the resonator and back to the entrance of the undulator. The FEL simulation takes this optical pulse and propagates it together with a fresh electron bunch through the undulator. This process repeats for a predefined number of roundtrips. In this simulation, the number of particles was 5832 per slice, while the separation between slices was 5.4 fs. The number of optical modes was dynamically adjusted each roundtrip to accommodate the evolution of the optical field inside the resonator.
The length of the optical cavity must be selected so that the returning optical pulse is in synchronism with the electron bunches. The roundtrip time for the optical pulses in the cavity is t roundtrip =2L cav /c and the separation between electron bunches is t sep =1/f rep , where L cav is the cavity length and f rep is the electron bunch repetition rate. Perfect synchronism (referred to as zero-detuning) is obtained when t roundtrip = Mt sep , where M is the number of optical pulses in the cavity. In this case there were 16 optical pulses in the cavity and the zero-detuning length is L 0 =32.041 946 079 m. The cavity detuning curve is shown in figure 19 as a function of the difference between the cavity length L cav and the zero-detuning length. We find that the maximum output power of 14.52 kW occurs for a positive detuning of 2 μm and is close to the measured value of 14.3 ± 0.72 kW [20]. As a result, the predicted extraction efficiency is about 1.4%, which is close to the theoretical value of 1/2N u ≈1.7%. We remark that previous simulation of this experiment [28] yielded an average output power of 12.3 kW, and the present formulation is in better agreement with the experiment than in the earlier simulation. As in the previous simulation [28], the roughly triangular shape of the detuning curve is also in agreement with the experimental observation.
The temporal profiles of the optical pulse at the undulator entrance and exit as well as that of the electron bunch current are shown in figure 20 for the zero-detuning cavity length after pass 100 which corresponds to a stable, saturated steady-state. Observe that the electron bunch is centered in the time window, which has a duration of 1.4 ps. That this is at zero-detuning is indicated by the fact that the incoming optical pulse at the undulator entrance is in close synchronism with the electron bunch. It is also evident that the center of the optical pulse advances by about 0.16 ps as it propagates through the undulator, and this is in good agreement with the theoretical slippage estimate of N w λ/c, where N w is the number of periods in the undulator. Finally, it should be remarked that this is in the steady-state regime where the losses in the resonator and the out-coupling are compensated for by the gain in the undulator.

Summary and conclusion
In this paper, we have described a three-dimensional, time-dependent nonlinear formulation for the simulation of a variety of FEL configurations utilizing helical, planar, and elliptical undulators. For future reference, we refer to this formulation and simulation code as MINERVA. Along with the well-known three-dimensional representations of helical and planar undulators, a three-dimensional model is described to simulate an APPLE-II undulator. The simulation is in substantial agreement with a generalized parameterization for elliptical undulators. Comparisons of the simulation with a seeded infrared FEL amplifier, an infrared FEL oscillator, and SASE FELs operating at optical and x-ray wavelengths all showed good agreement with the experiments. Consequently, we feel that the formulation captures the basic physics of the FEL interaction over a wide range of parameters and can accurately, and with confidence, predict the performance of a large variety of FELs.
The Gaussian optical modes are not the ideal electromagnetic representation for all FELs. There is also interest in the development of FELs at spectral ranges that approach mm wavelengths. At wavelengths longer than 100 μm or so, the boundary conditions imposed by the walls of the drift tube cannot be satisfied using the Gaussian optical modes. Instead, a waveguide mode decomposition is more appropriate. Future development of this formulation will include a waveguide mode decomposition in addition to the Gaussian optical modes. Indeed, it is also intended to include a mixed decomposition appropriate where the waveguide boundary conditions are appropriate in one direction while the free-space modes are appropriate in the other direction. This will permit the simulation of long wavelength THz FELs using a rectangular drift tube which is compressed in one direction but relatively open in the other. These developments will be reported in future publications.
In addition, there is continuing and developing interest in short wavelength FELs with variable polarizations that are formed using undulator lines composed of mixed undulator types and orientations (i.e., planar undulators aligned along the x-or y-axes). At the present time, MINERVA is capable of setting up such a complex undulator line. However, the polarization (planar, helical, or elliptical) is fixed at the initialization. In order to self-consistently model such an undulator line where the polarization of the optical field can vary along the axis of symmetry, the dynamical equations for the x-and y-components of the field must be treated separately. The modifications to MINERVA needed to implement this field model are currently under study and the results will be presented in a forthcoming paper.