High Efficiency Terahertz Generation in a Multi-Stage System

We describe a robust system for laser-driven narrowband terahertz generation with high conversion efficiency in periodically poled Lithium Niobate (PPLN). In the multi-stage terahertz generation system, the pump pulse is recycled after each PPLN stage for further terahertz generation. By out-coupling the terahertz radiation generated in each stage, extra absorption is circumvented and effective interaction length is increased. The separation of the terahertz and optical pulses at each stage is accomplished by an appropriately designed out-coupler. To evaluate the proposed architecture, the governing 2-D coupled wave equations in a cylindrically symmetric geometry are numerically solved using the finite difference method. Compared to the 1-D calculation which cannot capture the self-focusing and diffraction effects, our 2-D numerical method captures the effects of difference frequency generation, self-phase modulation, self-focusing, beam diffraction, dispersion, and terahertz absorption. We found that the terahertz generation efficiency can be greatly enhanced by compensating the dispersion of the pump pulse after each stage. With a two-stage system, we predict the generation of a $17.6$ mJ terahertz pulse with total conversion efficiency $\eta_{\text{total}}=1.6\%$ at $0.3$ THz using a 1.1 J pump laser with a two-line spectrum centered at 1 $\mu$m. The generation efficiency of each stage is above $0.8\%$ with the out-coupling efficiencies above $93.0\%$.


Introduction
The last decades have seen a surge in research studies on generation and applications of terahertz radiation, which typically refers to electromagnetic waves in the spectral range from 0.1 to 3 THz. Among the wide range of applications, spectroscopy [1,2,3], spin dynamics control [4,5] and linear electron acceleration highly benefit from high power terahertz sources [6,7,8]. In particular, applications like particle accelerations place steep requirements of a few millijoules of single/multi-cycle terahertz pulse energy to enable bunch manipulation in the relativistic regime. Such performance is largely contingent on increasing the terahertz generation efficiency towards the percent level and beyond. Different approaches have addressed the generation of few-cycle terahertz pulses, including four wave mixing in ambient air [9], plasma driven effects in air [10], terahertz emission from photoconductive switches [11,12], tilted-pulsefronts [13] and echelons [14,15,16]. In addition, there are techniques available for the generation of multi-cycle terahertz radiation. Quantum cascade lasers are attractive due to their compactness and production simplicity in its use in the spectral range > 1 THz despite their limited tunability [17,18,19]. Molecular gas lasers can provide high terahertz energies but are less tunable in terms of the generated terahertz frequency [20]. Free electron based terahertz sources are ideal in providing high energy multi-cycle terahertz pulses but are less accessible and are even more challenging for a compact implementation [21,22,23]. Laser based multi-cycle terahertz generation can leverage on developments in solidstate laser technology to enable compact coherent terahertz sources with high conversion efficiency at low terahertz frequency (<1 THz) [24,25,26]. This study revolves around the generation of high energy terahertz pulses in the multi-cycle (narrowband) regime.
Laser-driven terahertz generation utilizes the second order nonlinearity of the nonlinear material to perform difference frequency generation. Increasing the optical to terahertz generation/conversion efficiency (referred to as efficiency throughout this paper) in laser-based schemes is often hindered by several challenges. The first limitation is introduced by the Manley-Rowe relation [38] obtained after the assumption of single photon conversion. This limitation can be overcome by a cascaded second order process (repeated energy down-conversion of pump photons to terahertz photons) [36, 39]. The damage threshold of the nonlinear crystal imposes another challenge. Large pump electric field strength induces optical breakdown and damage to the material, which also limits the achievable terahertz efficiency [40]. Another challenge is introduced by the terahertz absorption of the material. This challenge can be overcome by cryogenic cooling [41,42], since the temperature decrease impedes thermal phonon excitation and, therefore, considerably reduces the terahertz absorption. Additionally, it was found by Lee et al. that cooling of the nonlinear crystal increases the terahertz generation efficiency [43].
Our previous work based on a 1D-propagation model [35], partially addressed these issues and showed the possibility of obtaining high efficiencies of several percent for optimal pump pulse formats. However, in practice, complex pump pulse formats are hard to engineer. Therefore, further innovations are necessary to achieve the desired percent level efficiencies. Here, we aim to alleviate these limitations by recycling the optical pump and separating the generated terahertz radiation in a staged approach, before significant terahertz absorption occurs. In this multi-stage architecture, consecutive stages utilize the same optical pump pulse. The terahertz pulse generated at each PPLN stage is then coupled out, circumventing the excessive terahertz absorption. A similar concept was experimentally tested by Chen et al.
[44] with a two-stage system for single-cycle terahertz generation. However, in their system, the terahertz pulse generated in the first stage is also coupled into the second stage together with the pump pulse [44].
In this paper, section 2 describes the coupled wave equations governing terahertz generation and the utilized numerical approach based on a finite difference method in cylindrical coordinates. The effects of dispersion and mitigation mechanisms are discussed. Furthermore, the critical aspect of isolating the copropagating terahertz and optical beams at each stage is resolved with a suitably designed quartz coupler. Section 3 analyzes the dependence of terahertz generation on pump pulse duration, effective length and beam size. Also, the influence of phase front distortions and spatial intensity fluctuations of the pump pulse on terahertz generation are studied. Detailed information about the terahertz spatial profile and terahertz beam combination is presented.  The schematic illustration of the multi-stage system is presented in Fig. 1. A set of PPLN crystals are placed in a serial arrangement, through which an optical pump pulse is consecutively recycled. At each stage, the terahertz beam is out-coupled by a designed quartz coupler (see section 2.2.2) to avoid excessive terahertz absorption. The out-coupling and separation of the optical and terahertz beams may also be achieved by a combination of anti-reflection coatings for optical and terahertz beams at the exit of the crystal, and a dichroic mirror. In this article, we focus on the quartz coupler due to its simplicity in manufacturing and the low loss for the terahertz beam.

Theory
Terahertz radiation is generated in PPLN through difference frequency generation (DFG) processes. The collinear propagation of terahertz and pump beams suggest considering the cylindrical symmetry in solving the governing equations for terahertz generation. For the quartz coupler, however, this cylindrical symmetry is violated. We solve for the propagation of terahertz beam in quartz using an angular spectrum method [45]. For this purpose, the terahertz beam profile is reconstructed in 3-D Cartesian coordinates from the 2-D cylindrical coordinates. It is assumed that the quartz coupler has no influence on the pump pulse (detail in section 3.3). To precisely simulate the evolution of the terahertz and optical pulses, all dominant effects such as DFG, self-phase modulation (SPM), self-focusing, beam diffraction, dispersion and material absorption of the terahertz radiation, are included. In our studies, stimulated Raman scattering is neglected because of its relatively small influence [46], since the desired terahertz frequency range (< 1 THz) is far away from the phonon resonance frequency of Lithium Niobate (LiNbO 3 ), occurring around 7.5 THz[47].

Coupled Wave Equations in Cylindrical Coordinate
We choose to perform calculations on a slice along the radial axis in cylindrical coordinates as shown in Fig. 2. For high energy terahertz generation, the usage of large crystals is mandatory due to the limited damage threshold of the nonlinear materials. Using the state-of-the-art fabrication technology, PPLN crystals as large as 1 cm × 1 cm transverse dimensions and few centimeters in length are realizable [48]. The propagation of the pump and terahertz beams in bulk PPLN crystals resemble the propagation of linearly polarized beams in a square waveguide. However, the waveguide dimensions are far larger than the wavelength of both pump and terahertz in the crystal. As a result, the effect of the waveguide boundaries become negligible. Consequently, the entire problem reduces to solving the scalar coupled wave equations with the ansatz E(ω, r, z) = A(ω, r, z)e −ik(ω)z , considering the slowly varying amplitude approximation. The governing equations of the field ampli-tude of pump and terahertz beams read as: In Eqs. (1) and (2), A op and A THz stand for the electric field amplitude of the optical pump and the terahertz beam respectively; F denotes the Fourier transform operator; ω and Ω are the optical and terahertz angular frequencies respectively; k represents the wavenumber; α denotes terahertz absorption; n 2 represents the nonlinear refractive index and n represents the refractive index. The effective nonlinear coefficient is written as χ 333 (z) in the last terms on the right hand side of Eqs. (1) and (2). The term e i kz = e i(k(Ω)+k(ω)−k(ω+Ω))z evaluates phase mismatch between the propagation of the terahertz beam and the optical pump. The term e i kz together with the term in the first order Fourier expansion series of χ  1) and (2), the first terms correspond to the transverse spatial dependence of the propagation, describing beam divergence due to diffraction. In Eq. (1), the second term describes the impact of SPM on the optical beam which is independent of the phase-matching condition. As one can see, we consider a frequency dependent SPM effect which implicitly includes the effect of self-steepening (see Appendix 5.2). Additionally, by considering radially dependent SPM, self-focusing is considered. The third term captures two important effects. One is the DFG term A op (ω + Ω)A * THz (Ω) term which describes the frequency down-shift of the pump, and the other is the A op (ω − Ω)A THz (Ω) term which describes the back conversion from terahertz to the pump beam (sum-frequency generation) [49]. The second and third terms on the right hand side of Eq. (2) evaluate the effects of terahertz absorption and DFG processes respectively.
Our 2-D calculations in cylindrical coordinates allow for the possibility of including the spatial variation of the beam, diffraction and self-focusing. This can not be addressed by a 1-D calculation.

Numerical Method and Simulation Parameters
Various finite difference methods can be used for solving Eqs. (1) and (2). In this study, a split step Fourier method is used to compute the integral terms corresponding to the nonlinear polarization terms containing n 2 and χ (2) eff . The second order derivative in the r direction is expressed by a conventional 3-point finite difference method as shown by Eq. (15) in Appendix (5.3). Ultimately, the values of A op and A THz are updated throughout the propagation distance z using the low-storage Runge-Kutta method [50]. The entire method leads to an explicit update algorithm along z rather than in time t. This is only valid when there is no reflection in the system (i.e only forward propagation along z is involved). The 3-point finite difference method takes advantage of the tridiagonal matrix of the discretization along the r direction, thus reducing computational cost. A combination of split step, explicit finite difference, and low-storage Runge-Kutta methods highly enhance the computational performance. Using other finite difference methods such as the Crank-Nicolson method is also possible. However, this implicit method necessitates the inversion of a finite difference matrix for each frequency and increases computational cost. Additionally, an angular spectrum method based on Hankel transformation [51] can be used but the approach suffers from a lack of the development of the fast Fourier transform (FFT) in Bessel series [52] and the boundary conditions are less straightforward to be implemented. Our calculation is performed in C++ with MPI and OpenMP for code parallelization. Details of the numerical method can be found in Appendix (5.3).
In this work, we base all simulations for terahertz generation centered at 0.3 THz due to its amenability for electron acceleration. The overarching conclusions may however be extended to other frequencies as well. Terahertz generation with pulse trains is effective in mitigating walk-off and yields high conversion efficiencies [35]. In the remainder of this paper, the simulations are based on pump pulses consisting of two narrow spectral lines (referred to as two-lines throughout this paper) separated by the desired terahertz frequency (Ω 0 = 2π×0.3 THz). This forms a pulse train (beat note) with each sub-pulse separated by a time 2π/Ω 0 , i.e. the terahertz electric field period. The simulation parameters are tabulated in Table (1). The refractive index and absorption (α) of terahertz radiation at 80 K and 0.3 THz are obtained from linear interpolation of the experimental data between temperatures 50 K and 100 K from [56]. The material absorption at optical wavelengths, i.e. the pump absorption, is neglected and the refractive index is fitted to a Sellmeier equation [57]. We assume a super-Gaussian spatial distribution (M = 5) and a Gaussian temporal profile for the optical pump. This is also the case for high power lasers. Additionally, a super-Gaussian spatial distribution ensures that the generated terahertz radiation is homogeneous in the transverse r dimension and also minimizes the damage due to peak fluence of the pump compared to a Gaussian spatial profile. The input pump pulse energy, which follows from spatial and temporal integrations of the electric field defined in Table. 1, is given by where Γ is the gamma function, σ is the waist of the pump beam and τ FWHM is the full width half maximum (FWHM) of the pump temporal envelope.

Terahertz Efficiency Enhancement with Dispersion Compensation
We suggest compensating the dispersion accumulated in the material of the pump pulse generated at the end of each stage before recycling it to the subsequent PPLN stage in order to enhance the conversion efficiency. In other words, before the pump pulse is recycled to stage N, an opposite dispersion is added to the pump pulse at the end of stage N-1 to cancel the dispersion obtained by propagation through the stage N-1. For the first and second stages, dispersion compensation can be realized by a prism or grating (i.e second and third order dispersion). For many stages, the cascading process broadens the spectrum drastically. Dispersion compensation for a large bandwidth beam can be challenging in practice since higher order dispersion needs to be taken into consideration. However, it still can be achieved with double chirped mirrors [58]. The simulation results comparing the terahertz spectra at the end of each stage with and without dispersion compensation are shown in Fig. 3(a,b). The resulting conversion efficiencies are shown in Fig. 3(c,d).
One can see that the terahertz conversion efficiency reduces subsequently after each stage without dispersion compensation. Since the terahertz generation broadens the pump spectrum via the cascading process, part of the broadened pump spectrum can not be phase matched for further cascading. In other words, different frequency elements in the pump spectrum pick up different phases due to dispersion in the PPLN via propagation. Consequently, the terahertz radiation generated by DFG from different spectral ranges of the pump pulse carries different phases, leading to partial destructive interference of the total generated terahertz radiation. Thus, the efficiency is degraded due to the dispersion. It can be seen in Fig. 3(b,d) that the efficiency can be significantly enhanced by dispersion compensation.

Spectral Dynamics of Optical Pump and Terahertz Pulses
Figure (4) depicts the temporal profile of the pump pulse with the short time Fourier transform (STFT) (i.e instantaneous spectrum) of the corresponding time range. The broadening of the pump pulse spectrum causes a reduction of the pulse duration, an increase in peak intensity and a drastic variation in the instantaneous spectrum of each sub-pulse (see Fig. 4(a,b)). The STFT in Fig. 4(b) indicates that the instantaneous spectrum in each sub-pulse forms a 'U' shape due to the cascading effect, leading to a drastic spectral variation in time. The maximum cascading (spectral down-shift) occurs where the highest peak intensity is present. The STFT in Fig. 4(c) suggests that the dispersion compensation alters the spectral distribution with respect to time. If a trans-form limited pump pulse, which has a uniform spectrum distributed over time, could be obtained after each stage, the terahertz efficiency could be greatly enhanced. However, the realization of such a process is challenging, since perfect compression of the 'U' shaped spectrum induced by SPM, the second order nonlinear effect, and dispersion, is not straightforward. Figure. 5 compares the spectral density (spatially dependent spectra in radial dimension r) of the terahertz generated at each stage for two cases with and without dispersion compensation. The terahertz beams generated with dispersion compensation follow a Gaussian-like spectrum, whereas those generated by direct pump pulse recycling have distorted spectral shapes, particularly in the last stage. This is due to the fact that the phase of the pump perturbed by dispersion influences the phase-matching condition of terahertz generation, and thus leads to a broader terahertz spectrum. Figure. 6 shows the comparison of the spectral density of the output optical pump at each stage without and with dispersion compensation. It can be seen that dispersion compensation boosts the cascading process and leads to greater pump spectral broadening. The results depicted in Fig. 3, Fig. 5 and Fig. 6 confirm that the terahertz conversion efficiency can be largely enhanced by pump pulse recycling with dispersion compensation.  Figure 5: (a-d) show the terahertz spectral density generated by 4 consecutive stages respectively. The first and second rows show the terahertz spectra generated without and with dispersion compensation, respectively. Figure 6: (a-d) represent pump pulse spectral density after each corresponding stage. The first and second rows show the pump pulse spectra at the output of each stage generated without and with dispersion compensation, respectively.  Figure. 7(a) shows one possible arrangement of the multi-stage setup accounting for the subtlety of terahertz and optical beam separation. The output pump beam from the PPLN propagates at a certain angle with respect to the input pump pulse direction due to the refraction from the quartz coupler (QC). However, in the sketch, we do not depict this for simplicity of illustration. Incidence at Brewster angle plays a crucial role in this scheme in order to couple out and separate both pump and terahertz beams with high transmission efficiency from the PPLN to air (n = 1). Out-coupling of the terahertz beam from Lithium Niobate to air through a silicon coupler was first demonstrated experimentally and theoretically by Kawase et al. [59]. In this article, crystalline quartz is chosen as the material of the output coupler owing to its low absorption for terahertz radiation at 0.3 THz (0.01/cm) [60,61], large bandgap (12.5 eV) [62] which has lower nonlinearity and higher damage threshold compared to silicon or Lithium Niobate [63]. Additionally, the dispersion of quartz at both pump and terahertz frequencies is negligible for few centimeters of propagation [61,62]. We neglect the small influence of the quartz coupler on the optical pump and fully focus on the out-coupling of the terahertz radiation. In Fig. 7(a), the inner angles of the bulk quartz coupler are represented by α 1 , α 2 , and α 3 . This ensures that the terahertz beams experience Brewster incidence angles π − α 1 and γ 2 at S 1 and S 2 surfaces, respectively. β 1 and β 2 are the output angles of the pump beam at the S 1 surface and the pump beam Brewster incidence angle at the S 3 surface, respectively. The refraction loss of the pump is negligible. S 3 can be chosen to be parallel to surface S 1 to further reduce angular deviation. Additionally, L (the length of the top edge of the quartz coupler) is the minimum distance that ensures sufficient spatial separation of pump and terahertz beams. Details of the equations used for calculating the involved angles can be found in Table 2, where σ = 5 mm, n LN (ω 0 ) = 2.1, n Q (ω 0 ) = 1.4, n LN (Ω 0 ) = 5, n Q (Ω 0 ) = 2. The subscripts 'LN' and 'Q' correspond to PPLN and Quartz respectively.

Quartz Output Coupler for Beam Separation
Since the refractive index of the PPLN varies with temperature and manufacturing process, an output coupler with a reasonable tolerance is desired. In Fig. 7(b,c), comparison of two out-coupling cases 'Air' (red curves) and 'QC' (blue curves) is shown. In the case of 'Air' where no output coupler is utilized, the PPLN crystal is cut such that the terahertz pulse is incident at Brewster's angle (tan −1 (1/n LN (Ω 0 )) = tan −1 (1/5) = 11.3 • ) from PPLN to air directly. In the case of 'QC', the terahertz pulse is coupled out from the PPLN to air through the designed quartz coupler shown in Fig. 7(a) with Brewster incidence angle (tan −1 (n Q (Ω 0 )/n LN (Ω 0 ) = tan −1 (2/5) = 21.8 • ) on S 1 surface.
In both Fig. 7(b) and Fig. 7(c), the transmission window of the 'QC' case at σ t = 5 mm (flat interval with terahertz energy transmission T ≈ 1) is due to total internal reflection (TIR) at both the S 1 and S 2 surfaces. This window range reduces for the smaller terahertz beam size σ t = 1 mm (i.e terahertz beam with larger angular divergence). The terahertz beam components with angular divergence larger than the TIR are filtered out, leading to lower energy transmission. This TIR induced transmission loss is particularly pronounced for the 'Air' case since the refractive index varies from n LN (Ω 0 ) = 5 to n Air (Ω 0 ) = 1. This large refractive index difference causes a sharp transmission drop at the incidence angle larger than Brewster angle. This explains why the transmission peak occurs at an angle smaller than Brewster's angle in the 'Air' case in Fig. 7(c). With larger beam size σ t = 5 mm, the transmission peak approaches the Brewster angle due to smaller angular divergence. It can be seen that from PPLN to air, the terahertz energy transmission at σ t = 1 mm is around 45% and 55% for 'Air' case and 'QC' case respectively. The terahertz energy transmission at σ t = 5 mm is around 63% and 96% for 'Air' case and 'QC' case respectively. With the given refractive index at the corresponding Brewster incident angle, a quartz coupler leads to higher transmission. More discussion can be found in section (3.3).

Single Stage: Terahertz Efficiency in Terms of Pump
Pulse Duration The crucial parameters in designing a complete terahertz source are the pulse duration of the pump and the necessary length of the PPLN crystal. As one can see in Eq. (3), longer pulse durations allow higher input energy, but simultaneously necessitate longer interaction length, which in turn intensifies the effects caused by terahertz absorption. The optimal pump pulse duration is found to be 150 ps with conversion efficiency η = 1.05 % as shown in Fig. 8(a,c). In order to define the optimal PPLN length L eff , we introduce the length parameter L 0 where the efficiency increase is most rapid. This is obtained at d 2 η(z)/dz 2 | z=L0 = 0. We define L eff as: One should avoid choosing L eff at dη(z)/dz = 0, because the plateau length in Fig. 8(c) increases with the increase of the pump pulse duration and dη(z)/dz = 0 occurs approximately in the middle of the plateau. Within the range of the plateau, the amount of generated terahertz radiation equals to the absorbed amount. Consequently, the optimal L eff should be chosen at the onset of the plateau to avoid unnecessarily long crystals. If we define δ ≈ [n(Ω 0 ) − n g (ω 0 )] /(cτ FWHM ), Eq. (4) results in the following analytic equations for L eff and L 0 (see Appendix 5.5 for details): for short pump pulses, δ α 1, L 0 = tan −1 ( δ α )/δ for long pump pulses, δ α 1, L 0 = 2 ln (2)/α, L eff = 2 α ln ( L 0 and L eff variations in terms of the pump pulse duration are depicted in Fig. 8(b) where the results of analytic calculations are compared against numerical results.

Single Stage: Effects of Transverse Spatial Variations of Pump
High power terahertz generation demands high power lasers which usually suffer from phase front distortions and intensity modulations. In the following section, we investigate the influence of the spatial variation of the pump pulse on the terahertz generation process. Here, we aim at exploring the dependence of the terahertz generation efficiency on the pump beam size. For this purpose, we perform simulations for a super-Gaussian (M = 5) pump beam with beam sizes σ = 1 mm, 3 mm and 5 mm, respectively. Figure 9 suggests that the terahertz beam size and spatial profile resemble the corresponding pump properties for large pump beam sizes. It can be seen in Fig. 9(a) that the terahertz conversion efficiency is lower when σ = 1 mm compared to σ = 3 mm and 5 mm. The diffraction has a minor influence on the spatial profile of the pump pulse since the Rayleigh range (λ ≈ 1 µm) is about few tens of meters. In a super-Gaussian beam, smaller beam size leads to larger spatial phase gradient at the edge of the spatial profile. Therefore, as observed in Fig. 9(b), the self-focusing effect leads to the peak appearing at the edge of the 1 mm pump beam. The change of the spatial profiles of the terahertz beams in Fig. 9(c) is due to the diffraction of the super-Gaussian beam [64,65]. It can be seen that for a 1 mm terahertz beam, diffraction modifies the super-Gaussian profile to a Gaussian profile after 25 mm generation distance.

Beam Size Dependence
The above results are obtained with perfect super-Gaussian pump spatial profiles and a flat spatial phase front. In order to examine the influence of spatial energy and phase front variations of the pump pulse on terahertz generation with the two-line pump spectrum, we studied 4 cases: (I) flat phase front with one of the pump spectral lines having a 5% modulation in the electric field strength.
(II) flat phase front with both pump spectral lines having a 5% modulation in the electric field strength. (III) flat spatial profile with one of the pump spectral lines having modulation from −π to π in the phase front. (IV) flat spatial profile with both of the pump spectrum lines having modulation from −π to π in phase front.
In the following context, E(ω, r) denotes the electric field of the input pump pulse and E 0 (ω, r) is the electric field with a perfect super-Gaussian spatial profile (σ=5 mm) and flat phase front. The interaction length is chosen to be z = 25 mm as in Fig. 9. r max = 8.5 mm is the maximum distance of the calculated radial dimension.

Case I: One line with Cosine Electric Field Amplitude Modulation
It can be seen in Fig. 10 that terahertz generation is not degraded by the intensity modulation in one of the spectral lines. Though the transverse intensity modulation can also induce spatial phase variations due to self-focusing, with our pump pulse parameters, the induced spatial phase variation is negligible. Case II: Both lines with Cosine Electric Field Amplitude Modulation E(ω, r) = E 0 (ω, r)(1 + 0.05 cos(2π × 4r r max )) (7) Fig. 11 suggests that the terahertz generation is not degraded when both lines are subjected to the same intensity modulation, which is expected from the Case I study.  Fig. 12 suggests that if both of the pump lines are identically modulated in phase, the terahertz generation process is not degraded. The phase modulation induces self-focusing which can be observed in Fig. 12(c). Case IV: One line with Cosine Phase Modulation E(ω, r) = E 0 (ω, r)e iπ cos(2π× 4r rmax ) , ω ≤ 2π × 291.3 THz E(ω, r) = E 0 (ω, r), ω > 2π × 291.3 THz (9) Figure 13 indicates that a distortion in the phase front, not unusual for high power laser beams, strongly reduces the conversion efficiency. This is due to the nature of the difference frequency generation process, which transfers the phase difference between the two frequency lines of the optical pump to the terahertz beam at a thousand times longer wavelength. In turn, the modulated terahertz phase front suffers from strong diffraction, leading to destructive interference. Consequently, the terahertz conversion efficiency is greatly reduced and a strongly spatially modulated terahertz beam appears as shown in Fig. 13(d).
The peaks in Fig. 13(c) are due to self-focusing, making it different from case III. In case III, since the phase fronts of both spectral lines are identical, the terahertz generated inherits a flat phase front. As a result, the cascaded frequencies newly generated in the optical region maintain the same phase front as the initial two-line spectra. However, in case IV since DFG creates a terahertz pulse with the phase difference of the pump spectral lines, the new pump frequency lines generated by the cascading process carry different phases. The later generated cascaded lines inherit multiple times of the original phase difference, forming an enormously curved phase front. Thus, strong self-focusing occurs.  60.3% Figure 14: (a,b) and (c,d) represent terahetz spatial profiles in PPLN in x − y coordinate before S 1 surface (see Fig. 7(a)) generated by a two-stage system with pump beam size σ = 5 mm and σ = 1 mm respectively. (e-h) represent coupled out terahertz spatial profiles of (a-d) after S 2 surface in x − y coordinate in the air. The color bar represents fluence in the unite of J/m 2 . Each figure window size is 15 mm × 15 mm. Terahertz out-coupling efficiency from (a-d) to (e-h) is η a = 96.1%, η b = 93.7%, η c = 62.5%, η d = 60.3% respectively.

Terahertz Spatial Profile after Quartz Coupler
The quartz coupler breaks the cylindrical symmetry of the PPLN. In order to calculate the terahertz spatial profile after the quartz coupler, we reconstruct the terahertz beam profiles in x-y coordinates from the numerical results in the cylindrical coordinates inside the PPLN before the incidence on the S 1 surface (see Fig. 7(a)). The refraction at the S 1 , S 2 surfaces (see Fig. 7(a)) for p polarized light and the beam propagation in the quartz coupler is calculated by the angular spectrum method [45]. Figs. 14(a,b) and Figs. 14(c,d) correspond to terahertz spatial profiles generated with a pump beam size σ = 5 mm and σ = 1 mm, respectively by a two-stage setup. Note that with a pump beam size σ = 5 mm, the terahertz spatial profile retains its flat top nature, whereas with σ = 1 mm, the terahertz spatial profile reduces to a Gaussian as shown in Fig. (9). The corresponding transmitted spatial profiles after the S 2 surface in x − y coordinates in the air (see Fig. 7(a)) are shown in Fig. 14(e-h) with the terahertz out-coupling efficiency η a = 96.1%, η b = 93.7%, η c = 62.5%, η d = 60.3% respectively. The influence of the S 1 and S 3 surfaces together with the propagation in the quartz coupler has a negligible effect on the pump spatial profile, i.e the spatial profile of the pump stays flat top but with a reduced size in the x direction i.e σ = σ tan(β 2 ) cos(β 1 )/ sin(α 1 ) ≈ 0.6σ. The beam size of the terahertz radiation in the x direction follows the same relation σ t = σ t tan(β 2 )/tan(α 1 ) ≈ 0.2σ t . On the other hand, the refraction at S 1 and S 2 surfaces greatly changes the energy distribution of the flat top terahertz beams spatially ( see Fig. 14(a,b,e,f) ) while the Gaussian spatial profile remains unchanged ( see Fig. 14(c,d,g,h) ). The flat top function is a superposition of Hermite Gaussian modes in Cartesian coordinates. Thus, the final spatial profile of terahertz is a result of a superposition of all the Hermite Gaussian modes after refraction.
The terahertz beams coupled out by the QC could be combined by a power combiner with a specifically designed input coupler which matches the terahertz beam profile to the mode of the power combiner [66]. This investigation is beyond the scope of this article and will be presented in a separate study. For an ideal symmetric power combiner, the combination efficiency η combine can be written as the following [67]: where E k (x, y, t) is the terahertz electric field obtained from kth stage. The combination efficiency η combine reaches the maximum when E 1 (x, y, t) = E 2 (x, y, t) = ..... = E N (x, y, t), e.g terahertz outputs from each stage need to have identical spectrum, phase and amplitude [67]. This could be achieved by adjusting the length of each stage and the relative arrival time of the terahertz beams. With equation (10), the combination efficiency of the terahertz output beams are 93.0% (Fig. 14(e,f)) and 95.2% ( Fig. 14(g,h)), respectively. This high combination efficiency is the result of similarity between terahertz beams generated in each stage. As a result, the final total efficiency η total of a two stage system can be written in Eq. (11) where η 1 , η 2 is the generation efficiency of the first and second stage respectively, η a , η b is the out-coupling efficiency, from PPLN to the air with QC, of the first and second stage respectively and η combine is the combination efficiency of the terahertz generated by the first and the second stage.
Using equation (11) and (3), a total terahertz energy 17.6 mJ with η total = 1.6% can be achieved with a two stage system using pump beams of size σ = 5 mm and input pump energy 1.1 J.

Conclusion
We presented a scheme for generating terahertz pulses using a consecutive arrangement of PPLN crystals (multi-stage system). For this purpose, a 2-D numerical scheme assuming cylindrical symmetry is developed. The simulation includes the effects of DFG, SPM, self-focusing, beam diffraction, dispersion and terahertz absorption.
For a single stage, the optimal pump beam pulse duration for a 2-line pump spectrum configuration is 150 ps with effective length 25 mm and conversion efficiency η = 1.05%. With this pump pulse configuration, the terahertz spatial profile resembles the pump when the pump waist σ ≥ 3 mm. Additionally, we found that the intensity profile modulation has negligible influence on the terahertz generation efficiency, whereas a distortion in phase front strongly reduces the conversion efficiency. This suggests that care must be taken in circumventing phase induced efficiency deterioration for high energy terahertz generation.
For a multi-stage system, we found that by recycling the pump pulse with dispersion compensation, the efficiency can be greatly enhanced. Subsequently, a quartz output coupler for separating the terahertz beam and optical pump with high terahertz transmission is designed.
Specifically, for a two stage system, we predict generation of a terahertz pulse of 17.6 mJ energy with total conversion efficiency η total = 1.6% at 0.3 THz for a 1.1 J input pump energy, where the generation efficiencies are η 1 = 1.0%, η 2 = 0.8%, the out-coupling efficiencies are η a = 96.1%, η b = 93.7% and combination efficiency is η combine = 93.0%.

Acknowledgments
In addition to DESY, this work has been supported by the European Re-
In Eq. (12), subscripts 1 and 2 represent the ordinary axes and 3 represents the extraordinary axis [68]. Axes 1 and 2 are identical at the first order (n o1 = n o2 ), which is known as uniaxial birefringence. However, in the second order, the o axes are not identical. This explains why d ij matrix elements are not identical on 1 and 2 axes. With the linearly polarized input pump pulseÊ = (0, 0, E 3 ) along the extraordinary axis, the second order polarization term driving the terahertz field is also linearly polarized along the same direction (P . The phase-matching condition shown in Eq. (13) ensures that the terahertz pulses generated at different positions of the PPLN crystal can add up in phase provided: where Λ is the period of PPLN, Ω is the terahertz angular frequency, ω is the pump angular frequency, v THz is the phase velocity of terahertz pulse and v g is the group velocity of the pump pulse. Note that for a given PPLN structure, there is more than one terahertz frequency fulfilling the phase-matching condition ( Ω N = Ω 0 × (1 + 2N ))

Frequency Dependent Self Phase Modulation
Frequency dependent SPM includes self-steepening which can be shown using the following derivation: where the first and second terms correspond to the SPM effect at ω 0 and selfsteepening, respectively.

Finite Difference Method
The finite difference methodology for solving Eq. (1) is presented. Equation (2) can be solved by following the same procedure. By defining r + j = 1 2 (r j+1 + r j )/( r 2 r j ), r − j = 1 2 (r j−1 + r j )/( r 2 r j ), where r is the step size in the r dimension and r j represents a mesh point at a specific radial position, one can obtain the discretized form of the second order derivative as shown in Eq. (15).
Equation (15) can be written in a matrix multiplication form as shown on the right hand side of the last column in Eq. (18) where the r i related matrix is denoted as a tridiagonal matrix M R as shown in Eq. (16).
Equation (1) can be written as Eq. (17) where P NL (ω m , r j , z k ) represents all the second and third nonlinear polarization terms.
Equation (17) can be written in matrix form as Eq. (18) where N r , N ω are the number of the mesh grids at r and ω dimension respectively. Each color of the corresponding column vector represents the coupled wave equation at a specific frequency ω m . In Eq. (18), the 2-D matrix of angular frequency (ω) and radius (r) on the left hand side is updated by a low-storage Runge-Kutta method [50]. The electric field elements at the same position r j with different frequencies are mixed by the nonlinear polarization P NL . The electric field elements of the same frequency ω m at different positions in r are mixed by M R (e.g diffraction). The totally number of the mesh points in r dimension is N r + 2. The extra 2 points A op (ω, r 0 , z), A op (ω, r Nr+1 , z), marked with dashed boxes, are "ghost points". The "ghost points" together with the two columns marked by dashed lines in M R in Eq. 16 are used to construct boundary conditions. P NL (ω Nω , r 1 , z k ) P NL (ω Nω , r 2 , z k ) . . . P NL (ω Nω , r Nr , z k ) P NL (ω m , r 1 , z k ) P NL (ω m , r 2 , z k ) . . . P NL (ω m , r Nr , z k ) P NL (ω 1 , r 1 , z k ) P NL (ω 1 , r 2 , z k ) . . . P NL (ω 1 , r Nr , z k ) + A op (ω Nω , r 0 , z k ) A op (ω Nω , r 1 , z k ) A op (ω Nω , r 2 , z k ) . . .

Boundary Conditions
The boundary conditions are enforced to the solution by the elements in the dashed boxes in Eq. (18), and M R matrix in Eq. (16). For the mesh points in r dimension, the r = 0 singularity is avoided by defining r 0 = 3∆r, and thus r − 1 = 1 2 (r 0 + r 1 )/( r 2 r 1 ). The "boundary condition" near r 1 is confined by the symmetry at the origin of the cylindrical coordinate itself ( e.g ∂Aop(ω,r,z) ∂r | r=r1 = 0). Thus, A op (ω, r 0 , z) = A op (ω, r 1 , z) For the boundary condition at r Nr , three different conditions can be applied. Note that as shown in Fig. (2), in the calculated area the polarization of the field is perpendicular to the boundary. Of course with the same numerical method, any polarization at the boundary can be considered. However, the boundary condition has to be changed accordingly.
If the field polarization is parallel to the contact surface of the dielectric (PPLN) and the metal cladding outside the PPLN, one can also apply a reflective boundary condition.
This is equivalent to the situation where electric field is reflected by a mirror.
2. Transparent boundary [69]. This boundary represents the situation where the boundary has no effect on the field as the field propagates through the calculation boundary. This kind of boundary is also equivalent to the situation when Eq. (1) and (2) are solved by angular spectrum method. The condition can be written as [69].

Pulse Duration in Terms of Effective Length
As section (3.1) suggests, the effective length is related to the pump pulse duration. In order to get more insight into this problem, we begin with a 1-D undepleted model. Consequently, Eq. (2) can be written as Eq. (23).