SFQEDtoolkit: a high-performance library for the accurate modeling of strong-field QED processes in PIC and Monte Carlo codes

Strong-field quantum electrodynamics (SFQED) processes are central in determining the dynamics of particles and plasmas in extreme electromagnetic fields such as those present in the vicinity of compact astrophysical objects or generated with ultraintense lasers. SFQEDtoolkit is an open source library designed to allow users for a straightforward implementation of SFQED processes in existing particle-in-cell (PIC) and Monte Carlo codes. Through advanced function approximation techniques, high-energy photon emission and electron-positron pair creation probability rates and energy distributions are calculated within the locally-constant-field approximation (LCFA) as well as with more advanced models [Phys. Rev. A 99, 022125 (2019)]. SFQEDtoolkit is designed to provide users with high-performance and high-accuracy, and neat examples showing its usage are provided. In the near future, SFQEDtoolkit will be enriched to model the angular distribution of the generated particles, i.e., beyond the commonly employed collinear emission approximation, as well as to model spin and polarization dependent SFQED processes. Notably, the generality and flexibility of the presented function approximation approach makes it suitable to be employed in other areas of physics, chemistry and computer science.


Introduction
Since the invention of chirped pulse amplification, the continuous technological progress in highpower laser systems has opened up the possibility of accessing ultrastrong electromagnetic fields in the laboratory. Peak laser intensities of the order of 10 23 W/cm 2 have been recently achieved with multi-petwatt laser systems [1], while 10 PW lasers are expected to become soon available at the Extreme Light Infrastructure (ELI) [2] and in several other facilities worldwide [3]. This new experimental capability has driven substantial theoretical and experimental effort in exploring the strong-field frontier of quantum electrodynamics (QED) from atoms [4,5] to plasmas [6,7,8,9]. Indeed, first experiments coupling ultrarelativistic electron beams with intense lasers have recently provided the first evidence of radiation reaction effects and of their quantum nature [10,11]. In parallel, progress in accelerator technology with the possibility of generating high-current and strongly focused ultrarelativistic lepton beams has provided a novel route to investigate SFQED in beam-beam and in beam-plasma interaction [12,13,14,15,16]. Further technological development could even allow to probe the supercritical regime of SFQED, where particles experience fields largely exceeding the QED critical one F cr = m 2 e c 3 /|e|ℏ ≈ 1.3 × 10 18 V/m (≈ 4.4 × 10 9 T) in their rest frame. Here c is the speed of light in vacuum, ℏ the reduced Planck constant, while m e and e are the electron mass and charge, respectively.
In relativistic laboratory astrophysics, the study of SFQED processes in plasmas with even the possibility to generate dense electron-positron jets extending over several plasma skin depths [17,18] could enable access to the microphysics that determines both the plasma dynamics and the powerful emission processes thought to occur around compact astrophysical objects such as pulsars and magnetars [19,20,21,22]. This unique opportunities justify the growing interest and effort in studying SFQED effects in extreme-field plasma physics. A prominent example are QED cascades, whose dynamics is essentially determined by the interplay between SFQED and the collective plasma response [23,24,25,26,27,28,29,30,31,32,33,34,35].
SFQED-dominated plasmas are complex and strongly nonlinear systems. Modeling and gaining insights into the intrinsically multiscale dynamics that governs these complex systems in realistic conditions critically depends on the development of advanced numerical tools and codes that can efficiently run on high-performance computing (HPC) machines. In the last few decades, several Monte Carlo and PIC codes have been developed to study beam physics and classical plasma dynamics [36]. Some of these codes have been enriched with SFQED routines that model highenergy photon emission and electron-positron pair creation in strong background electromagnetic fields [25,37,38,39,40,33,41,42,43], and merging algorithms [44,45] have been developed to tackle the excessive or even unfeasible computational demands caused by the rapid and potentially exponential growth of the number of computational particles in SFQED simulations. However, to our knowledge, the standard approach to model probability rates and distributions of SFQED processes in state-of-the-art codes resorts to binary search and interpolation applied to pre-computed lookup tables where the required functions are sampled at a limited number of points logarithmically distributed between a minimum and a maximum that can span several orders of magnitude [37,40,42]. This approach is motivated by the fact that SFQED rates and distribution functions involve numerically expensive special functions and integrals whose evaluation at runtime would largely exceed the computational cost of PIC routines, therefore rendering SFQED PIC simulations unviable, practically. However, the major shortcomings of this approach are that: (i) it struggles to cover the whole range of parameters. In fact, e.g., a cut-off in the photon spectrum is commonly introduced. (ii) to achieve high accuracy a very high number of sampling points is needed, which noticeably increases the computational cost. It is worth noting that the insufficient resolution of lookup tables may introduce both substantial systematic errors in probability rates (see section 4) and artifacts such as stairlike structures in the high-energy region of the photon spectrum (see Ref. [43] and section 4). The above shortcomings can unacceptably affect the reliability of simulation results.
Here we introduce SFQEDtoolkit, a suite of modules designed to implement SFQED processes in existing PIC and Monte-Carlo codes while overcoming the above-mentioned issues. We follow the standard methodology for implementing SFQED processes in a code, which consists of two main steps. First, determine whether the SFQED event is deemed to occur according to its rate, i.e., its probability per unit time. This is a critical step and needs to be highly optimized because SFQED rates are required for each particle and at each timestep, thereby critically affecting both the performance and the accuracy of the simulation. Second, if the event occurs, sample the state after the event from its probability distribution. In the collinear emission approximation, this reduces to calculate the particles' energy after the event.
Note that in the S-matrix formalism of SFQED the calculation of the probability involves integrals in time ranging from −∞ to +∞, i.e., one needs to consider the asymptotic states in the past and in the future. The use of rates depending only on the instantaneous value of particles' parameters is possible when SFQED probabilities are formed over a temporal scale much smaller than the scale of variation of electromagnetic fields (see Refs. [46,47] for further details). This assumption constitutes the core of the widely employed locally-constant-field approximation (LCFA). In the paradigmatic case of photon emission by an electron colliding with a laser pulse, LCFA requires ξ ≡ a 0 = |e|E/m e ωc ≫ 1 and ξ 3 ≫ χ e , where χ e = F * /F cr with F * being the field amplitude in the instantaneous rest frame of the electron, while E and ω are the laser field amplitude and frequency, respectively [46,47,48]. Physically, the above conditions imply that SFQED probabilities receive most of their contribution from a region of a particle's trajectory that is small compared to the scale of variation of the background electromagnetic fields, and that most of the radiated photon spectrum is well approximated by the LCFA photon spectrum [49,50]. Interestingly, however, even when the above conditions are fulfilled the LCFA photon spectrum differs quantitatively and qualitatively from the exact one for relatively low emitted photon energies ε γ ≲ (χ e /ξ 3 )ε e , where ε γ and ε e are the photon and the electron energy, respectively [49]. To properly model this region of the photon spectrum one has to resort to more advanced techniques [50,51,52,53,54]. On the one hand, some of these techniques accurately reproduce the interference features of the emitted photon spectrum predicted by SFQED. On the other hand, most of them are suitable only for specific cases such as the interaction of particles with a plane-wave pulse or with a crystal. In SFQEDtoolkit we implement the technique described in Ref. [50]. Albeit this technique cannot exactly reproduce the SFQED spectrum for all photon energies, it provides a several orders of magnitude better approximation than the LCFA in the low energy part of the photon spectrum while retaining the advantages of the LCFA. Namely, the high energy part of the photon spectrum corresponds to the SFQED spectrum, it depends only on the local instantaneous value of a particle's parameters, and is suitable for background electromagnetic fields with arbitrary spacetime structure such as those occurring in PIC simulations.
SFQEDtoolkit can be used as a black box, in which case the user can directly refer to Appendix A, which summarizes the currently available routines and the essential steps for a straightforward implementation of SFQEDtoolkit in a code. More detailed instructions, updates and neat examples of the usage of SFQEDtoolkit are provided in C++ and Fortran in the "example cpp" and "example fortran" folders in the GitHub repository at https://github.com/QuantumPlasma/ SFQEDtoolkit.

Rates and cumulative probabilities
The emission of a photon by an electron and the conversion of a photon into an electron-positron pair in a background electromagnetic field are stochastic events. Under suitable conditions such as those of the LCFA, the first order S-matrix contribution to the probability of an event in a short temporal interval dt is much smaller than unity and the probability of the process is linearly proportional to dt [46,47]. This is at the heart of the local probability method, where a stochastic event is deemed to occur if R(t)dt > r. Here R(t) is the rate of the process at time t, and 0 < r < 1 is a uniformly distributed random number. However, in general, the emission probability of several photons over an arbitrary interval, is not known.
In this section we present the theory that justifies one of the broadly used stochastic techniques, namely, the optical depth method (see also the statistical treatment in Ref. [14]). We then compare its advantages and disadvantages with respect to the local probability method. We start by introducing the probability P n (0, t) that an electron emits n photons in an arbitrary temporal interval (0, t). In the following, we assume that: 1. the probability of emission in a small temporal interval dt depends only on the instantaneous value of physical parameters at t; 2. for small dt the probability of multiple n−photon emissions is negligibly small. Hence, the only possible event in dt is a single photon emission; 3. the electron motion is quasiclassical so, between two emission events, its trajectory is well defined and obtained by solving the Lorentz equation of motion.
The probability of no emission P 0 (0, t + dt) in an interval (0, t + dt) can be written in terms of the probability of no emission P 0 (0, t) in (0, t) and the probability of no emission P 0 (t, t + dt) in (t, t + dt). In fact, from the first assumption P 0 (0, t) and P 0 (t, t + dt) are independent. Thus, P 0 (0, t + dt) = P 0 (0, t)P 0 (t, t + dt). In addition, from the second assumption P n (t, t + dt) = 0 for n > 1, which in turn implies that P 0 (t, t + dt) = 1 − P 1 (t, t + dt). Since the probability of emission in a sufficiently small time interval dt is P 1 (t, t + dt) = R pe (t)dt, where R pe (t) is the rate of photon emission at time t, one gets the differential equation (2.1) whose solution after setting P 0 (0, 0) = 1 as initial condition is In Eq. (2.2), Γ 0 : t → [x(t), p(t)] denotes the curve representing the electron trajectory in phase space which, according to the third of the above hypothesis, is well defined and determined by the Lorentz equation of motion. We stress that, in general, if Γ a and Γ b are two different trajectories then t 0(Γa) From (2.2), the cumulative probability of any event P ae (0, t) in (0, t) is Equation (2.4) is at the heart of the optical depth method, where a stochastic event is deemed to occur if P ae (0, t) > r. In practice, every particle is initialized with an optical depth τ 0 = − ln(1 − r) obtained by inverting P ae (0, t) = r. The event occurs when τ e (t) > τ 0 , where τ e (t) = t 0(Γ 0 ) R pe (τ )dτ is calculated along the trajectory Γ 0 obtained from the Lorentz equation. Typically, first order Eulerian integration τ e (t + ∆t) = τ e (t) + R pe (t)∆t is used in PIC codes, where ∆t is the timestep (see, e.g., Refs. [37,42]). Notice that, instead of storing both τ 0 and τ e (t), to reduce memory requirements one can equivalently store only τ d (t) = τ 0 − t 0(Γ 0 ) R pe (τ )dτ and check when τ d (t) < 0. When an event occurs, τ 0 is then re-initialized by using a new random number.
Although the local and the optical depth method are equivalent, each presents advantages and disadvantages. The main advantage of using an optical depth method is that random numbers are needed only when particles are initialized or after an event occurs. The need of storing the τ d (t) of each particle is a relatively minor drawback, and this method may provide better performance when R pe (τ ) can be efficiently calculated. By contrast, the local method can be more efficient when R pe (τ ) is computationally expensive. In this case the local method allows one to use an acceptancerejection technique. In fact, one can drastically reduce the number of required evaluations of R pe (τ ) by introducing a computationally cheaperR pe (τ ) such thatR pe (τ ) > R pe (τ ). Thus,R pe (τ )∆t > r is a necessary but not sufficient condition for R pe (τ )∆t > r, r being the same random number. If in the majority of casesR pe (τ ) closely follows R pe (τ ), one can noticeably reduce the number of evaluations of R pe (τ ) by first checking the necessary conditionR pe (τ )∆t > r, and only in the rare cases whereR pe (τ )∆t > r, then also R pe (τ ) is evaluated and the condition R pe (τ )∆t > r is checked to decide whether an event is deemed to occur at the considered timestep. Since the rate must be calculated for each particle at each timestep, and since irrespective of the adopted local or optical depth method for good numerical accuracy the timestep must be chosen such that R pe (τ )∆t ≪ 1, the computational cost of evaluating R pe (τ ) may significantly affect the simulation's performance. A prominent example where the above-mentioned technique is implemented and the local method provides a simpler and more efficient strategy than the optical depth method is discussed in section 5, where we detail the implementation of photon emission beyond the LCFA model. In fact, in the beyond the LCFA modelR pe (τ ) reduces to a simple one parameter function, while R pe (τ ) is a relatively complicated multivariate function.
It is worth determining also P 1 (0, t), . . . , P n (0, t) when possible, i.e., when there is essentially a single possible trajectory in phase space (see below). We begin by considering the probability of a single photon emission P 1 (0, t + dt) in the interval (0, t + dt). In this case only two mutually exclusive channels are possible: either no emission in (0, t) followed by one photon emission in (t, t + dt), or one emission in (0, t) followed by no emission in (t + dt). The corresponding equation is P 1 (0, t + dt) = P 0 (0, t)P 1 (t, t + dt) + P 1 (0, t)P 0 (t, t + dt). (2.5) Following the same reasoning as above, we have where for clarity the dependence of R pe on the electron energy and quantum parameter have been explicitly reported, and the subscript 0 (1) denotes quantities from the trajectory in phase space without (with) photon emission. By replacing Eq. (2.2), Eq. (2.6) and Eq. (2.7) into Eq. (2.5) we obtain the differential equation with the initial condition P 1 (0, 0) = 0. Along a specific trajectory in phase space, Eq. (2.8) can be solved with the integrating factor method. However, in practice, neither ε e,1 (t), χ e,1 (t) nor the time of emission are uniquely defined. In fact, an infinite number of ε e,1 , χ e,1 and emission instants are possible, such that an infinite number of Γ 1 need to be considered, and the integral along a particular Γ 1 does not give the probability of one emission in (0, t). In the limiting case either of R pe,1 (t) = 0 or of soft photon emissions, where the electron energy and its dynamics are basically unaffected by the emission events such that R pe,1 (t) ≈ R pe,0 (t), there is a single phase space trajectory, and one can determine the corresponding total probabilities. In the R pe,1 (t) = 0 case one immediately gets which consistently coincides with the total probability of any event in Eq. (2.4). This reasoning applies to photon conversion into an electron-positron pair where, obviously, the assumption that no event is possible after the first one holds rigorously. In the soft-photon emission case one gets (2.10) Following the same reasoning that has led to Eq. (2.5), the probability of n−photon emissions is P n (0, t + dt) = P n−1 (0, t)P 1 (t, t + dt) + P n (0, t)P 0 (t, t + dt). (2.11) For soft photon emission this leads to the equation Thus, the probability of emission of n soft photons follows a Poisson distribution, and t 0(Γ 0 ) R pe (τ )dτ is the mean number of emitted photons in the interval (0, t). This result corresponds to the classical limit, where multiple soft photon emissions occur, and each photon carries away a negligible fraction of the electron energy [55,56].

Methodology of implementation
SFQEDtoolkit aims at computing rates and probability distributions of SFQED processes with better than 0.1% accuracy throughout the domain of interest with performance better than or similar to coarse lookup tables. The open-source PIC code Smilei with its default 256 points 1 and 1024 points 2 precomputed tables is used as a benchmark [57]. The methodology employed to implement SFQED processes in SFQEDtoolkit chiefly resorts to a combination of: (i) function approximation with Chebyshev polynomials; (ii) asymptotic expansions; (iii) variable and function transformation.
We start by briefly describing Chebyshev polynomials and discussing a few properties that render them an ideal basis for approximating smooth functions. Chebyshev polynomials are defined via the recurrence relation and form an infinite-dimensional basis of functions orthogonal with respect to the weight 1/ √ 1 − x 2 and defined in the interval −1 ≤ x ≤ 1, i.e., An explicit expression of Chebyshev polynomials is T n (x) = cos(n arccos x), which highlights their relation with the discrete Fourier transform. A continuous function f (x) can be approximated as where c i are the coefficients of the expansion and N is the order of the expansion, i.e., the higher order polynomial in the expansion. The power of function approximation with Chebyshev polynomials resides in: 1. The coefficients c i can be easily calculated by exploiting the orthogonality relations of Chebyshev polynomials (see Eq. (3.2) or, e.g., Ref. [58] for the discrete orthogonality relationships).
To calculate the coefficients used by SFQEDtoolkit, we have developed a parallel code that efficiently computes Chebyshev coefficients of an arbitrary multivariable 1D, 2D, and 3D function defined in a finite range of its parameters until a predefined accuracy is achieved. Note that a variable x in the interval a < x < b can be mapped on a variable X in the interval A Chebyshev approximation is nearly the same as the minimax polynomial, i.e., the polynomial which has the smallest maximum deviation from the true function f (x) among all polynomials of the same degree. However, unlike the exact minimax polynomial, its coefficients can be readily computed. 3. The error is almost uniformly distributed over the entire approximation range. 4. The magnitude of the first neglected Chebyshev coefficient provides a good estimate of the error of the approximation. This is due to T n (x) = cos(n arccos x) ≤ 1. 5. For smooth functions, its Chebyshev expansion is rapidly convergent. This allows to condense most of the information on the function in relatively few coefficients that can fit in a modern CPU cache. Indeed, according to Ref. [58]: Any function that is bounded on the interval [of the Chebyshev polynomials] will have a convergent Chebyshev approximation as [the order] N → ∞, even if there are nearby poles in the complex plane. For functions that are not infinitely smooth, the actual rate of convergence depends on the smoothness of the function: the more derivatives that are bounded, the greater the convergence rate. For the special case of a C ∞ function, the convergence is exponential.
6. Due to its recurrence relation in Eq. (3.1), a Chebyshev expansion can be efficiently evaluated with the Clenshaw's recurrence formula [59]. This enables one to evaluate the expanded function using only the coefficients of the expansion c i , without evaluating the actual polynomials T i (x). Moreover, Clenshaw's recurrence formula guarantees both efficiency and numerical stability against roundoff errors (see Appendix B); 7. A generic multivariable function f (x, y) can be expanded as If f (x, y) needs to be evaluated multiple times, e.g., at several different x but with the same y, its computational cost can be greatly reduced. In fact, Eq. (3.4) can be rewritten as Clenshaw's recurrence is first used to compute the N + 1 coefficients d i (y). Then, it is further used for evaluating the function in x by employing d i (y) as coefficients. After the first iteration, the computational cost for evaluating the function at any x with same y is therefore reduced to that of a single variable function with N + 1 coefficients instead of a two variable function with (N + 1) × (M + 1) coefficients. This is highly beneficial, for example, when employing root finding routines.
The above properties are systematically used in SFQEDtoolkit. Given a generic function of one f (x) or two f (x, y) arbitrary variables x and y that we want to approximate with a desired accuracy, we divide the interval of definition of x and y into smaller intervals, and in each subinterval of x, y the function is approximated with a suitably chosen approximation method. Even when the same approximation method, e.g., a Chebyshev expansion, is used for two distinct subintervals of definition of the same function, this can result in a substantial reduction of the number of coefficients required to achieve the desired accuracy. At runtime, the microprocessor selects the relevant subinterval among those possible depending on the actual value of the variables, which is called branching. Since modern microprocessors are also pipelined, which means that they can execute multiple code instructions in a single cycle, branching creates a bottleneck. For this reason modern microprocessors have branch predictor capabilities, namely, they select the most likely subinterval of values in a repeated cycle. Thus, a correct prediction of the required branch at runtime may significantly improve the efficient execution of the code. One can take advantage of these capabilities by judiciously choosing the subintervals in which the function is approximated, such that a single subinterval is likely to be most frequently required. In the following, we denote the Chebyshev expansion of a function f (x) as C[f (x)]. As we have emphasized above, the rate of convergence of C[f (x)] strongly depends on the smoothness of f (x). Thus, before calculating a Chebyshev expansion, it is important to analyze f (x) and determine whether its smoothness can be increased, e.g., via a suitable change of variables. In addition, it can be simpler and computationally cheaper to use asymptotic expansions in the regions of the domain where the function or its derivatives grow rapidly, or when the domain is unbounded. Similarly, asymptotic expansions are preferable when f (x) is exponentially small, or tends to zero much faster than a polynomial. In fact, in this case f (x) and C[f (x)] tend to zero at different paces, and the accuracy of the approximation decreases. Alternatively, it may be convenient to calculate the Chebyshev expansion of a function derived from f (x). In this case, at runtime one first computes the derived function, and then obtains f (x) from the the evaluation of the derived function. For example, if f (x) or its derivatives are divergent, then 1/f (x) or f (x)/g(x) where g(x) is an easily computable asymptotic function that renders f (x)/g(x) smooth, may have a much more rapidly ], respectively. In SFQEDtoolkit, the required Chebyshev coefficients are loaded from text files into memory only once when the library is initialized (see Appendix A).

Photon emission and pair creation in the LCFA
The differential probability of photon emission per unit time t per unit photon energy ε γ by an electron or positron with energy ε e is [60] where α = e 2 /ℏc ≈ 1/137.036 is the fine structure constant, u = ε γ /(ε e − ε γ ), and K ν (x) are the modified Bessel functions of the second kind. Analogously, the differential probability of electronpositron pair creation per unit time per unit electron energy from a photon with energy ε γ is [60] where η = 2ε 2 γ /(3ε e ε p χ γ ), and ε p = ε γ − ε e is the positron energy. The electron/photon quantum nonlinearity parameter χ e/γ as computed in SFQEDtoolkit by calling either the function compute chi with vectors or the function compute chi with components (see Appendix A for details) is [60] which holds both for electrons and for photons with momentum p e/γ and energy ε e/γ in a background electric E and magnetic B field. Eqs. (4.1)-(4.2) hold if the two field invariants |E 2 −B 2 |/F 2 cr and |E · B|/F 2 cr are much smaller than min(1, χ 2 e/γ ), and electrons and positrons in the initial and final states are ultrarelativistic ε e/p /(m e c 2 ) ≫ 1 (see Ref. [60]). In fact, in this case the discrete nature of energy levels and the energy associated to the spin degrees of freedom can be neglected, such that a quasiclassical approach is applicable. If background fields are also nearly constant and uniform over the formation length of the process, Eqs. (4.1)-(4.2) can be used also with time-and space-dependent electromagnetic fields [46,60,49,50]. For completeness, we stress that all currently available SFQED calculations require that perturbation theory in the Furry picture can be used. This assumption was conjectured to break down in the αχ 2/3 ≳ 1 regime, i.e., for χ ≳ 1600 (see, e.g., Ref. [61] and references therein). Recent developments have shown that the conjectured breakdown does not occur in general, but its onset depends on the structure of the electromagnetic fields and on the energy of the incoming particles [62,63]. Finally, we mention that the assumption of ultrarelativistic particles also allows one to use the collinear approximation, i.e., immediately after the event the momentum of the initial and of the generated particles are aligned. Moreover, the initial (final) spin and polarization degrees of freedom of particles are averaged (summed) in Eqs. (4.1)-(4.2). The implementation of the angular distribution of the generated particles beyond the collinear approximation as well as the inclusion of spin-dependent SFQED effects will be presented in a separate publication. Table 1: Summary of the methods employed by the functions SFQED INV COMPTON rate and SFQED LCFA INV COMPTON PHOTON energy in each region of their domain. "C" denotes Clenshaw's recurrence applied to a Chebyshev expansion, with the number inside the round brackets reporting the available number of Chebyshev coefficients. The value of the Chebyshev coefficients is available in the "coefficients" folder of SFQEDtoolkit, see, e.g., https://github.com/QuantumPlasma/SFQEDtoolkit. "A." denotes the asymptotic approximation in Eq. (4.19), while "E." denotes the exponential approximation in Eq. (4.21).

Photon emission rate
As briefly discussed, each computational cycle involving SFQED processes first consists in determining whether an event occurs according to its rate. From Eq. (4.1), the rate of photon emission is [60] For its implementation in SFQEDtoolkit, it is convenient to change the dummy variable of integration from u to v = 2u/3χ e and express all quantities in normalized units. Namely, we use an angular frequency ω r as a reference, and consequently obtain a reference time T r = 1/ω r , a reference length λ r = c/ω r , and a reference field E r = m e cω r /|e|, while electron and photon energies are normalized as γ e/γ = ε e/γ /m e c 2 . Notice that, e.g., in a simulation where lengths are in units of the laser wavelength λ, then ω r = 2πc/λ and λ r = λ/2π. In practice, ω r should be an important frequency that we want to resolve. The use of normalized quantities exhibits the scale invariance of the Lorentz equation, and avoids to incur in possible numerical issues related to the use of numbers that are too big or too small for floating-point arithmetic. After the above transformations, Eq. (4.4) becomes where λ C = ℏ/m e c is the reduced Compton length and 3 In order to implement Eq. (4.5), we only need to compute C[W rad (χ e )] to the desired accuracy Figure 1 displays ∆ pe over the considered interval 0 ≤ χ e ≤ 2000 clearly showing SFQEDtoolkit's better than 0.1% accuracy throughout its domain. 3 In practice, the upper bound of integration is 700 as the integrand becomes so small that the whole integral from 700 to infinity is ≲ 10 −305 leading to underflow even with double precision accuracy. In SFQEDtoolkit, the Chebyshev coefficients of C[W rad (χ e )] are precomputed and stored into five separate text files according to the following ranges: 0 ≤ χ e < 2, 2 ≤ χ e < 20, 20 ≤ χ e < 80, 80 ≤ χ e < 600, and 600 ≤ χ e ≤ 2000 (see the summary in Tab. 1). At runtime, coefficients are loaded once when the simulation is initialized. When the rate is calculated by calling the function SFQED INV COMPTON rate, the relevant interval is selected depending on χ e , and Clenshaw's recurrence formula is applied to the corresponding coefficients (see Appendix B). Approximately, ten coefficients per interval are needed to compute C[W rad (χ e )] with the desired accuracy (see Tab. 1). Such a small number of coefficients can fit in a modern CPU L1 cache greatly speeding up the simulation. Given the smoothness of C[W rad (χ e )], no asymptotic expansion is used. We stress that the fast and accurate calculation of SFQED rates has major implications on the simulation results. On the one hand, insufficient accuracy may result into a significant error in the predicted number of events, which systematically accumulates during the simulation. On the other hand, low performances may noticeably slow down the simulation given that SFQED rates are evaluated at each timestep and for each particle. For example, we implemented SFQEDtoolkit in Smilei and simulated the evolution of an ultralow-density bunch of N e = 10 10 electrons with 10 GeV energy in a constant and uniform magnetic field. Parameters were chosen such that χ e = 1, initially, and the duration of the simulation was T sim = 1.3 fs, i.e., approximately half of the mean time required for an electron to emit once. By comparing the number of photons produced in the Smilei simulation with SFQEDtoolkit N tk = 4.834×10 9 and in the Smilei simulation with its default 256 points table N S = 5.111 × 10 9 , with the expected number of photons N a = N e T sim R pe (ε e , χ e = 1) ≈ 4.835 × 10 9 , we observe that even after such a short amount of time Smilei's 256 points table value N S differs by 5.70% from the analytical prediction, while SFQEDtoolkit's value N tk differs by 0.04%. Finally, we recall that the above LCFA rates are suited for both the local probability and the optical depth method (see Sec. 2).

Pair creation rate
The treatment of the pair creation rate follows similar steps as those of the photon emission rate. By using the symmetry with respect to the ε γ /2-axis of the integrand in Eq. (4.2) and by changing the variable of integration to v with 0 < v < 1 and ε e = ε γ (1 + v)/2, the rate of photon Table 2: Summary of the methods employed by the functions SFQED BREIT WHEELER rate and SFQED BREIT WHEELER ELECTRON energy in each region of the computational domain. "C" denotes Clenshaw's recurrence applied to a Chebyshev expansion, with the number inside the round brackets reporting the available number of Chebyshev coefficients. The value of the Chebyshev coefficients is available in the "coefficients" folder of SFQEDtoolkit, see, e.g., https://github.com/QuantumPlasma/SFQEDtoolkit. "A." denotes the asymptotic approximation in Eq. (4.10), while "E." denotes the exponential approximation in Eq. (4.29).
conversion into an electron positron pair is [60] Notice that in Eq. (4.7) the ultrarelativistic assumption ε γ ≫ m e c 2 and ε e ≫ m e c 2 allowed us to approximate the upper and lower limits of integration to ε γ and zero, respectively. By converting to normalized units, we get . (4.9) Thus, we only need to approximateW pair (χ γ ) to the desired accuracy ∆ pp defined, as in the photon case, as the relative error between the exact and the approximate function. The considered range 0.01 ≤ χ γ ≤ 2000 was divided into seven intervals: 0.01 ≤ χ γ < 0.24, 0.24 ≤ χ γ < 0.4, 0.4 ≤ χ γ < 2, 2 ≤ χ γ < 20, 20 ≤ χ γ < 80, 80 ≤ χ γ < 600 and 600 ≤ χ γ ≤ 2000. While for each interval in 0.24 ≤ χ γ ≤ 2000 the coefficients of C[W pair (χ γ )] were computed, for χ γ < 0.24 the pair creation rate is exponentially small and a Chebyshev expansion is no longer suited to accurately calculateW pair (χ γ ). In this range SFQEDtoolkit uses the asymptotic expansion [60] Note that the pair creation rate becomes negligibly small W pair ≲ 3.5 × 10 −119 αλ r /λ C γ γ for χ γ < 0.01. Hence, the contribution of the regions where χ γ < 0.01 is neglected in SFQEDtoolkit. It is worth noticing that, when using simulations to model realistic scenarios, the value of χ γ is necessarily affected by an experimental or observational uncertainty ∆χ γ , e.g., because of the limited knowledge of physical parameters such as the electromagnetic field or the particle's energy, and the simulation itself is affected by numerical and round-off effects. This uncertainty ∆χ γ propagates ∆ pp χ γ to W pair as ∆W pair ≈ (dW pair /dχ γ )∆χ γ . Since the relative error d ln(W pair )/dχ γ rapidly diverges for χ γ → 0, not only W pair is tiny for small χ γ , but also its relative error necessarily becomes large. For this reason it is often of limited significance to consider SFQED pair production in regions where χ γ ≪ 1. In addition to the function SFQED BREIT WHEELER rate which returns W pair (γ γ , χ γ ) for 0.01 ≤ χ γ ≤ 2000, SFQEDtoolkit also provides a function SFQED BREIT WHEELER rate fast which returns zero for χ γ < 0.3. Since W pair ≲ 9.1 × 10 −6 αλ r /λ C γ γ for χ γ < 0.3, the computationally cheaper SFQED BREIT WHEELER rate fast is expected to provide essentially the same results as SFQED BREIT WHEELER rate in many relevant cases while possibly improving the performance of the simulation.

Photon emission spectrum
Once an event is deemed to occur, the particles of the final state need to be generated according to the probability distribution of the process. By averaging (summing) over the initial (final) spin and polarization degrees of freedom and by assuming that the momentum of the generated particles is aligned to the momentum of the incoming particle, only the calculation of particles' final energy is needed to determine the state of particles after the event. Now, let us assume that we want to sample a particle with energyε in the range 0 <ε < ε tot according to an arbitrary distribution f (x, ε). In this case one can resort to the inverse transform sampling (ITS) method. Namely, given an f (x, ε), one needs to solve the equation in the unknownε. Here x represents generic constant parameters and 0 < r < 1 is a uniformly distributed random number. For photon emission by an electron, this implies that one needs to where the cumulative distribution function I pe (ε, ε e , χ e ) is obtained by integrating Eq. (4.1) up to an arbitrary energy valueε Equation (4.12) defines an implicit functionε = G pe (r, ε e , χ e ), which can be calculated up to arbitrary accuracy by numerically solving Eq. (4.12) with a suitable root-finding algorithm such as the Brent-Dekker method [64,65]. However, direct application of the above recipe at runtime in a PIC code would be prohibitively expensive. Modified Bessel functions of the second kind K 2/3 (x) and ∞ η dyK 1/3 (y) as well as several computational cycles where multiple integrals need to be calculated are required to obtainε from Eq. (4.12) via a root-finding routine.
In SFQEDtoolkit, the function SFQED LCFA INV COMPTON PHOTON energy provides users with a fast and accurate approximation of G pe (r, ε e , χ e ). The accuracy of the approximation ∆ r is given by ∆ r = |ε IT S −ε tk | /ε IT S , whereε IT S is the value obtained via the ITS method computed with more than ten significant digits accuracy, andε tk is the value returned by SFQEDtoolkit. It is worth noting that although the functions I pe and R pe in Eqs. (4.12)-(4.13) depend on two-and three-variables, respectively, the total energy of the parent particle ε e can be easily factored out, and one is required to approximate functions of only one and two variables, in practice.
Similarly to the photon emission rate, the domain of χ e and r was divided into smaller intervals either to reduce the required number of Chebyshev coefficients or to use a different approximation method. The same intervals as for the photon emission rate were used for χ e , while r was divided into four intervals: 0 < r ≤ r min , r min < r ≤ r inv , r inv < r ≤ r max , and r max < r < 1. The value of r min , r inv and r max depends on χ e , and is reported in Tab. 3. Table 1 summarizes the decomposition of the domain, the method used in each interval and, when the function is approximated with Chebyshev polynomials, the number of employed Chebyshev coefficients. Figure 3 plots the value of ∆ r for the whole range of 0 ≤ χ e ≤ 2000 and 0 < r < 1, which shows that ∆ r < 10 −4 throughout the whole considered domain.
We begin describing how SFQEDtoolkit implements the two-variable functionw =G pe (r, χ e ) implicitly defined by Eq. (4.17) in the interval r min < r < r max . A natural choice is to approximatẽ G pe (r, χ e ) with Chebyshev polynomials. In practice, despite this approach performs extremely well almost everywhere,G pe (r, χ e ) and its derivatives rapidly grow for r → r max such that the number of Chebyshev coefficients needed to accurately approximate the region around r max becomes large. To reduce the number of required coefficients while retaining high accuracy, in the interval r min < r < r inv and r inv < r < r max the Chebyshev expansion ofG pe (r, χ e ) and of 1/G pe (r, χ e ) is employed, respectively. In the latter case,G pe (r, χ e ) is then readily obtained from [1/G pe (r, χ e )] −1 .
Finally, SFQEDtoolkit solves Eq. (4.17) in the limit r → 0 and r → 1, i.e., when the lower and the higher energy tail of the photon spectrum are approached, by resorting to asymptotic expansions and exponential approximations. For r → 0, one can expand Eq. (4.16) as which is a better than 0.1% approximation ofĨ pe (w, χ e ) for r ≤ r min . By substituting Eq. (4.18) in Eq. (4.17), one immediately obtains (4.19) Regarding r → 1, i.e.,w → ∞, we leverage on the fact that forw above a threshold w 0 the functioñ I pe (w, χ e ) saturates toW rad (χ e ) =Ĩ pe (∞, χ e ), and the following exponential approximation holds . (4.21) Note that: (i) there is no need to calculate the cubic root in Eq. (4.21) since ε γ depends only on w 3 ; (ii) in Eq. (4.21) we have explicitly indicated that in SFQEDtoolkit the Chebyshev expansion ofW rad (χ e ) andĨ pe (w 0 , χ e ) are used; (iii) this approximation is used for r ≥ r max , and the corresponding χ e -dependent w 0 is obtained from w 0 = C[G pe (r max , χ e )] where C[G pe (r max , χ e )] is the Chebyshev approximation of the function obtained by setting r = r max in Eq. (4.17). We implemented SFQEDtoolkit in the open source PIC code Smilei [57] version 4.7 and simulated the evolution of an ensemble of 10 10 electrons. The charge density was kept very low to suppress self-fields, and particles were placed in a constant and uniform external magnetic field with 10 GeV energy. Simulation parameters were chosen such that χ e = 1, initially. The total simulation time was set to one tenth of the time expected for an electron to emit once. We then repeated the same simulation using the original version of Smilei, which makes use of 256-points pre-computed lookup tables and compared the results with those obtained with SFQEDtoolkit. Figure 4(a) displays the photon spectrum obtained from the original version of Smilei (red line), with SFQEDtoolkit (blue line), and the analytical prediction (dashed green line). While Smilei with its default 256-points tables shows a marked stairlike pattern at high photon energies and a significant deviation from the analytic spectrum also for photon energies around 0.2 ε e , SFQEDtoolkit nearly perfectly matches the analytic spectrum throughout the whole domain. Note that such discrepancies of Smilei with its 256-points tables with respect to the analytic spectrum originate from the coarse tabulation. By employing Smile with 1024-points tables the agreement with the analytic spectrum significantly improves with respect to the 256-points tables, particularly in the region ε γ /ε e ∼ 0.2, while the stairlike pattern is much milder but remains visible also with 1024-points tables. Regarding performance, a direct comparison of the execution time of the default version of Smilei and of the SFQEDtoolkit implementation does not directly reflect the improvement, because the total simulation time is strongly determined by the other operations of the PIC loop. Nevertheless, in our tests the implementation with SFQEDtoolkit outperformed the 256 points default Smilei by up to 12%.

Pair creation spectrum
The same methodology employed to sample the energy of emitted photons applies in the case of photon conversion into an electron-positron pair. One solves the equation (ε e , ε γ , χ γ )dε e − r εγ 0 d 2 W pp dtdε e (ε e , ε γ , χ γ )dε e = I pp (ε, ε γ , χ γ ) − rR pp (ε γ , χ γ ) = 0, (4.22) where I pp (ε, ε γ , χ γ ) is obtained by integrating Eq. (4.2) up toε In SFQEDtoolkit, we exploited the symmetry of Eq. (4.23) with respect to ε γ /2 to halve the domain of integration to ε γ /2 < ε e < ε γ , converted to normalized units, and changed the dummy variable Thus, the equation that needs to be solved simplifies tõ , and r ′ = 2r − 1 is a random number uniformly distributed in −1 < r ′ < 1 obtained from a random number r defined in 0 < r < 1. Equation (4.25) defines an implicit functionv =G pp (|r ′ |, χ γ ). Oncev has been determined, the relation for |r ′ | it is divided into three intervals: 0 ≤ |r ′ | < 0.9, 0.9 ≤ |r ′ | ≤ 0.9999, and |r ′ | > 0.9999. In 0 ≤ |r ′ | ≤ 0.9999, Chebyshev polynomials are used to approximateG pp (|r ′ |, χ γ ) to the desired accuracy. The division into two intervals, i.e., in two distinct set of Chebyshev coefficients, is made only to reduce the number of required coefficients in the more probable region 0 ≤ |r ′ | < 0.9. In fact, when |r ′ | → 1 the number of coefficients necessary to accurately approximateG pp (|r ′ |, χ γ ) increases dramatically, and to preserve the computational speed granted by working with a restricted set of coefficients, we have chosen to separate the smaller interval requiring a larger number of coefficients |r ′ | > 0.9 from the more probable interval 0 ≤ |r ′ | < 0.9. Finally, for |r ′ | > 0.9999 an exponential approximation similar to Eq. (4.20) is used to approximate the cumulative functioñ where we have indicated that in SFQEDtoolkit a Chebyshev expansion is used to computeW pair (χ γ ) as well asĨ pp (v 0 , χ γ ), The function SFQED BREIT WHEELER ELECTRON energy returns the energy of the created electron once the normalized photon energy γ γ , quantum parameter χ γ and a random number r uniformly distributed in (0, 1) are provided as input parameters. Figure 5 displays the accuracy of the approximation ∆ r = |ε IT S −ε tk | /ε IT S in the domain 0.3 < χ γ < 2000 and 0 < r < 1. Symbols have the same meaning as in the photon emission case.

Changing the number of coefficients used in a simulation
SFQEDtoolkit allows users to reduce the number of Chebyshev coefficients to be employed at runtime therefore reducing the accuracy but possibly enhancing the performance of the library.
For instance, the Chebyshev coefficients c i of a one-variable function f (x) are stored in text files where the first row is an integer reporting the total number of coefficients n stored in the file, the second and third row contain floating point numbers that report the minimum and the maximum of the interval where the function is approximated with Chebyshev polynomials, respectively, while from the fourth row on the coefficients c i are written row by row. By adding a colon followed by an integer k < n, i.e., by replacing n with n : k, only the leading k coefficients out of the n total are employed at runtime.
Similarly, the Chebyshev coefficients c ij of a two-variable function f (x, y) are stored in text files where the first and fourth row report the number of columns n and of rows m of c ij , respectively. The second and third (fifth and sixth) row of the file report the minimum and the maximum of first x (second y) variable of f (x, y), respectively. Finally, starting from the seventh row of the file the coefficients c ij are written row by row following a row-major order. Also in this case, one can reduce the number of used coefficients by editing the first (fourth) row of the file, i.e, replacing n (m) with n : k (m : l) where k (l) is an integer smaller than n (m). Unless otherwise specified, SFQEDtoolkit uses all the coefficients available in a file.
For example, by repeating the simulation reported in Fig. 4(a) with a set of 15 × 25 coefficients instead of the 17 × 35 provided with SFQEDtoolkit, the relative error rises up to 0.7% while the time to execute the specific task of calculating the photon spectrum reduces by approximately 30%. Figure 4(b) displays the photon spectrum obtained with the above-mentioned reduced set of coefficients. Even with the reduced set of coefficients, SFQEDtoolkit provides a photon spectrum in manifestly much better agreement with the analytical prediction than Smilei's default algorithm and 256 × 256 points table [see Fig. 4(b)].

Photon emission beyond the locally-constant-field approximation
In the following we detail how the method of photon emission beyond LCFA developed in Ref. [50] is implemented in SFQEDtoolkit. In order to retain flexibility and allow users to better adapt and customize functionality to their codes, SFQEDtoolkit provides a set of independent functions each carrying out a specific task of the algorithm.
We begin by briefly reviewing the rationale of the algorithm and explaining how it benefits from a local probability approach instead of the optical depth method (see Sec. 2). As detailed in Ref. [50], the differential probability of photon emission d 2 W pe /dtdγ γ becomes almost flat for a broad range of photon energies below a threshold γ γ,LCFA , here normalized to m e c 2 , while it nearly coincides with the LCFA distribution above this threshold. This implies that the use of the LCFA rate systematically results in an orders of magnitude overestimated number of emitted photons for ε γ ≲ (χ e /ξ 3 )ε e . Notice that γ γ,LCFA depends on a characteristic local time of variation τ of the transverse Lorentz force F L,⊥ experienced by the emitting particle. This characteristic time τ is obtained from the firstḞ L,⊥ and secondF L,⊥ time derivative of F L,⊥ calculated along the emitting particle's trajectory [50].
We stress that the calculation of the improved rate of photon emission for each particle and at each timestep, which is required by the optical depth method, now becomes relatively complex and computationally expensive. A simpler and more efficient option is to use the LCFA rate of photon emission to determine whether a photon emission event occurs, locally. Only if the event is deemed to occur according to the LCFA model, then the expected photon energy is sampled from the LCFA differential probability of photon emission. If the sampled photon energyγ γ exceeds the threshold γ γ,LCFA , a photon with energyγ γ is generated and the emitting particle recoils.
Otherwise, the photon emission event is either accepted or rejected by comparing an independent uniformly distributed random number in (0, 1) with the ratio between the LCFA and the improved differential probability of photon emission calculated atγ γ (see below). This greatly improves the performance of the code given that most of the above-mentioned extra computational steps are performed rarely and only when needed.
SFQEDtoolkit provides a C++ object named BLCFA Object which contains (i) a three-element double precision array to store the transverse Lorentz force at the penultimate timestep, (ii) a threeelement double precision array to store the difference between the Lorentz force at the penultimate and the antipenultimate timestep, and (iii) a boolean signaling whether the particle was created at the penultimate timestep. This boolean is needed because for a new particle the transverse Lorentz force at the penultimate and antipenultimate timestep is not available, and either the LCFA is employed or no emission occurs for the first timestep after the particle is created (see Ref. [50] and below). These quantities together with the simulation timestep ∆t and the emitting particle energy γ e and quantum parameter χ e are used by the routine SFQED BLCFA INV COMPTON PHOTON threshold to calculate the threshold γ γ,LCFA (see below and Appendix A). In addition, the class BLCFA Object serves to derive through inheritance an extended C++ object, which can be used to include all needed information on the state of a computational particle. In the given PIC or Monte Carlo code where the user wants to implement SFQEDtoolkit, a linked list, or possibly better for memory locality, an array or a C++ vector class of particle objects derived from BLCFA Object can be used to describe the state of an arbitrary number of computational particles. Each BLCFA Object is created by calling the routine SFQED CREATE BLCFA OBJECT, and its content is updated at each timestep and for each particle via the routine SFQED BLCFA OBJECT update (see Appendix A for details).
For clarity and completeness, we summarize below the steps needed to calculate γ (n) γ,LCFA at timestep (n) as well as the acceptance-rejection method. Following Ref. [50], by starting from x (n) and p (n−1/2) advance the momentum to p (n+1/2) L with the Lorentz force integrator existing in the code 4 . If the particle was created at the penultimate timestep, which is signaled, e.g., by the boolean of the BLCFA Object, set F L,⊥ , then update the boolean describing its status, and continue to the next particle. Otherwise, calculate where τ C is the Compton time and normalized units are employed. Note that the ultrarelativistic approximation p (n) L,⊥ ) 2 and χ (n) > ζ, where ζ is a nearly negligible number relative to unity 5 , then calculate τ (n) /τ C = 2π [F (n) L,⊥ ] 2 /δ (n) . Otherwise, the background fields are either basically constant and the LCFA applies throughout the photon spectrum or the quantum parameter χ (n) is negligibly small. This condition is introduced to avoid numerical issues for constant background fields, where the LCFA holds and τ (n) /τ C diverges. Finally, following Ref. [50] γ (n) 9) and the condition γ is evaluated. If this latter condition is violated, most of the spectrum is not approximated by the LCFA model, and the formation of the emission probability becomes an intrinsically nonlocal process. In this case we assume that no emission is deemed for the considered particle at this timestep.
Once γ γ,LCFA as well as two independent uniformly distributed random numbers 0 < r 1 < 1 and 0 < r 2 < 1. This routine first calls SFQED LCFA INV COMPTON PHOTON energy by giving r 1 as input parameter to sample the LCFA-predicted emitted photon energyγ γ . Ifγ γ > γ (n) γ,LCFA , the routine simply returnsγ γ . Otherwise, SFQED BLCFA INV COMPTON PHOTON energy uses the second random number r 2 to evaluate the condition γ,LCFA , γ e , χ e ). (5.10) If Eq. (5.10) is fulfilled, then the routine returnsγ γ . Otherwise, it returns zero, which implies that no photon emission occurs. Note that in Eq. (5.10) γ e has been written for clarity but is a global constant factor which multiplies both sides therefore canceling out. Hence, it is not used for evaluating Eq. (5.10) in the code. In addition, for its implementation the change of variable γ γ = (3γ e χ e w 3 )/(2 + 3χ e w 3 ) was performed, which gives (2 + 3w 3 χ e ) 2 18γ e w 2 χ e . (5.11) The above change of variable is motivated by the fact that the function d 2 W pe /dtdw is efficiently approximated with Chebyshev polynomials in SFQEDtoolkit. By contrast, d 2 W pe /dtdγ γ exhibits a γ −2/3 γ divergence when γ γ tends to zero, which makes it unsuitable for being expanded with Chebyshev polynomials, directly (see Sec. 3). Figure 6 displays the results obtained after implementing SFQEDtoolkit routines into a fourthorder Runge-Kutta pusher, and by repeating the electron beam-laser pulse simulations with the same choice of electron and laser parameters as in Ref. [50]. By comparing the results in Fig. 6 with the corresponding results in Ref. [50], it is apparent that the improved beyond LCFA spectrum is recovered. Figure 6: The LCFA vs the beyond LCFA (see Ref. [50]) differential photon emission probability for an electron colliding head-on with a plane-wave pulse for the following parameters: (a) 5 GeV electron initial energy, 5 fs (FWHM of the intensity) pulse duration, and a0 = 8 normalized field amplitude; (b) 10 GeV electron initial energy, 5 fs pulse duration, and a0 = 10 normalized field amplitude; (c) 10 GeV electron initial energy, 10 fs pulse duration, and a0 = 3 normalized field amplitude.

Conclusions and outlook
In this work we have presented a novel approach that allows an efficient implementation of the complex and computationally expensive functions needed to model strong-field QED processes into codes. This approach leverages on a combination of advanced function approximation techniques, and its key concepts can be naturally used in other areas of research. The method resorts to a combination of: (i) function approximation with Chebyshev polynomials; (ii) asymptotic expansions; (iii) variable and function transformation (see Sec. 3). We have applied this method to create an open source library named SFQEDtoolkit, which is designed to allow users for a straightforward implementation of SFQED processes, namely nonlinear Compton emission and nonlinear Breit-Wheeler pair creation, in existing particle-in-cell and Monte Carlo codes.
SFQEDtoolkit provides users with an efficient and better than 0.1% accuracy implementation of SFQED processes throughout the whole particles' spectrum. Benchmarks performed with the PIC code Smilei version 4.7 have shown that SFQEDtoolkit outperforms the default 256-points tables and achieves an accuracy better than that of 1024-points lookup-tables. Currently, photon emission and pair creation with χ e/γ ≤ 2000 are implemented by assuming the locally-constantfield approximation and collinear emission of the generated particles. For photon emission, the more advanced method beyond the locally-constant-field approximation presented in Ref. [50] is also included. The implementation of the angular distribution of generated particles as well as of the spin and polarization-dependent SFQED processes is currently underway and will be presented in a separate publication.

Acknowledgments
This article comprises parts of the Ph.D. thesis work of Samuele Montefiori, to be submitted to the Heidelberg University, Germany.

Appendix A. User guide
SFQEDtoolkit is an open source library written in C++. A wrapper for Fortran-based programs which leverages the standardized interoperability of modern Fortran with C is also provided with the current version. SFQEDtoolkit can be used as a black box by following the instructions below and the examples showing its usage provided on GitHub at: https://github.com/QuantumPlasma/SFQEDtoolkit. This appendix describes the key steps needed for implementing SFQEDtoolkit, and the main functions currently available. SFQEDtoolkit exposes all functions 6 to the user through the module SFQEDtoolkit Interface.hpp and SFQEDtoolkit Interface.f90 to C++ and Fortran codes, respectively. We recommend users to read this appendix before implementing the library.
SFQEDtoolkit must be initialized and finalized once at the beginning and at the end of the simulation, respectively. These are the only functions that must be implemented in a code using SFQEDtoolkit. All other available SFQEDtoolkit functions are independent of each other and can be ignored if not required, i.e., only the functions that are desired have to be implemented in a code. Thus, for example, none of the functions of the beyond LCFA method of Sec. 5 needs to be implemented if not used. Analogously, for example, if one is interested in photon emission according to the LCFA method but not in pair creation, then the implementation of the function that compute the LCFA photon emission rate and of the function that calculate the LCFA emitted photon energy suffice. In fact, e.g., for testing purposes, one can even implement only the SFQEDtoolkit function for the LCFA photon emission rate, and use a custom version for calculating the LCFA emitted photon energy, or vice versa. This enables high flexibility and allows users to customize SFQEDtoolkit to their code and to their specific objective without implementing unnecessary functions. Figure A.7 displays the flowchart of a PIC code embedding SFQEDtoolkit. At initialization all the precomputed Chebyshev coefficients stored in the txt files included with the library are loaded into memory 7 . At this stage the user needs to specify either the reference length or the reference frequency of the simulation. This is required because in SFQEDtoolkit all quantities are given in normalized units, and an angular frequency ω r is used as a reference. Consequently, the reference time T r = 1/ω r , length λ r = c/ω r , and field E r = m e cω r /|e| are obtained, while energy is normalized to m e c 2 . In particular, notice that in a problem where lengths are given in units of the laser wavelength λ, then λ r = λ/2π such that ω r = 2πc/λ.
The initialization is carried out through either the first or the second of the two following functions The reference length must be given in meters, while the simulation timestep must be given in normalized units, i.e., in units of T r = 1/ω r . The simulation timestep is used only by the routines that implement the beyond LCFA photon emission method described in Sec. 5. However, because of its generality, an input value is expected also when the beyond LCFA photon emission functions are not implemented. At runtime, SFQEDtoolkit reads the required information from files located in the subdirectory "coefficients" from the parent directory where the program is executed. If initialization is successful, the function returns a boolean that is true. If the photon emission method beyond LCFA of Sec. 5 is employed, SFQEDtoolkit provides users with a C++ object named BLCFA Object that is designed to allocate the additional information that is required to keep track of the force acting on the particle. Namely, a boolean signaling whether the particle was created at the penultimate timestep, and two arrays each with three double precision elements. One array stores the transverse Lorentz force at the penultimate timestep, the other array stores the difference of the Lorentz force between the penultimate and the antipenultimate timestep. This object is created by calling the function 3: BLCFA Object* SFQED CREATE BLCFA OBJECT(): creates an object designed to store the information required to apply the beyond LCFA method to a particle. The boolean is set to true, while the two arrays are initialized to zero. The function returns a pointer to the created BLCFA Object.
Note that functions that allow users to implement the beyond LCFA method without employing a BLCFA Object are also provided and described below. If function 3 is implemented, each computational particle should have one BLCFA Object associated. It is up to the user to organize objects in a suitable data structure (array, linked list, etc.) and, if the code in which SFQEDtoolkit is implemented uses message passing interface (MPI), to manage the communication of the object data among MPI domains. A recommended choice is to use inheritance to derive from BLCFA Object an object Particle where information such as particle position and momentum is stored, and to create either an array or a C++ vector to store the data of all particles in the simulation. Another option is to include a pointer to a BLCFA Object in the class or data type used by the existing code to store information about a particle's status. In a parallel code using MPI, the communication of the additional particle data stored in BLCFA Object should follow the same methodology used by the code to communicate other particle information such as particle's position and momentum. In fact, one should simply communicate the elementary datatypes stored in a BLCFA Object instead of the BLCFA Object itself.
Next, the standard steps of a PIC loop are performed: currents are deposited on the grid, Maxwell equations are solved, the resulting fields on the grid are interpolated at the particle's position, and particles are advanced in time according to the Lorentz force with a suitable particle 'pusher' (see, e.g., Ref. [36]). If a leapfrog integrator is used, a convenient choice is to adapt the implementation to its staggered method by first advancing the particle momentum, then call the routines for modeling SFQED processes, and only afterward advance the particle position and store the new data into memory.
Again, only when using the beyond LCFA method for photon emission of Sec. 5, a user employing the BLCFA Object and its features can resort to the function 4: bool SFQED BLCFA OBJECT update( BLCFA Object* obj, double* p push, double* p, double delta, double gamma, double chi): if the boolean of the BLCFA Object pointed by obj is true, calculates F While functions 3 and 4 are relevant only for the photon emission model beyond LCFA of Sec. 5, SFQED probability rates must be computed for each particle and at each timestep in the considered LCFA and beyond LCFA models. SFQEDtoolkit allows users to efficiently compute the SFQED rates by means of 5: double SFQED INV COMPTON rate( double gamma, double chi): uses the normalized electron or positron energy gamma as well as its quantum parameter chi to return the LCFA photon emission probability rate. 6: double SFQED BREIT WHEELER rate( double gamma, double chi): uses the normalized photon energy gamma as well as its quantum parameter chi to return the LCFA probability rate of photon conversion in an electron-positron pair. We recall that, as thoroughly discussed in Sec. 2, in the LCFA model both the local probability and the optical depth method can be used to determine whether a SFQED event is deemed to occur from probability rates. However, the beyond LCFA photon emission method of Sec. 5 is based on an acceptance-rejection technique applicable only with the local probability method.
In case a SFQED event is deemed to occur according to the LCFA probability rate, the energy of generated particles is computed by calling 8: double SFQED LCFA INV COMPTON PHOTON energy( double gamma, double chi, double rnd): returns the normalized photon energy by sampling from the LCFA energy distribution. Its input parameters are the normalized electron or positron energy gamma, the quantum parameter chi of the emitting particle, and a random number rnd sampled from a uniform distribution in (0, 1). 9: double SFQED BREIT WHEELER ELECTRON energy( double gamma, double chi, double rnd): returns the normalized energy of the generated electron (positron) by sampling from the LCFA energy distribution for photon conversion into an electron-positron pair. The energy of the positron (electron) is then readily obtained from the difference between the photon and the electron (positron) energy. The normalized photon energy gamma and its quantum parameter chi as well as a uniformly distributed random number rnd in (0, 1) must be passed as input parameters.
10: double SFQED BREIT WHEELER ELECTRON energy fast( double gamma, double chi, double rnd): same as function 9, but chi < 0.3 is not managed. This function must be used only in combination with function 7.
When the beyond LCFA model of Sec. 5 is used, if function 4 returns true and a SFQED event is deemed to occur according to the LCFA probability rate (see function 5), the calculation of the local LCFA energy threshold γ γ,LCFA as defined in Eq. (5.9) is required to determine the improved photon energy distribution. If BLCFA Objects are created and updated through functions 3 and 4, respectively, then γ γ,LCFA can be computed by calling 11: double SFQED BLCFA INV COMPTON PHOTON threshold( BLCFA Object* obj, double delta, double gamma, double chi): returns the normalized photon energy threshold below which the LCFA breaks, i.e., the γ γ,LCFA defined in Eq. (5.9). The arguments of this routine are those passed to and updated by function 4.
After calling function 11 all required quantities for the beyond LCFA method are available, and the photon energy is obtained from 12: double SFQED BLCFA INV COMPTON PHOTON energy( double limit, double gamma, double chi, double rnd, double rnd2): returns the normalized photon energy according to the beyond LCFA model of Sec. 5. This function requires as input the same parameters of function 8, plus two additional arguments: (i) the limit=γ γ,LCFA defined in Eq. (5.9), which users can obtain by calling function 11, and (ii) rnd2, which is a uniformly distributed random number in (0, 1) independent of rnd. If γ γ,LCFA > 0.75γ e the function returns zero. Otherwise, it applies the acceptance-rejection technique of Sec. 5 and returns zero if the event is rejected.
After the above steps, the computational cycle is complete and is repeated until the end of the simulation is reached. At this point, if the BLCFA Object was used, memory should be deallocated by calling the function 13: void SFQED FINALIZE BLCFA OBJECT( BLCFA Object* obj): deallocates the memory space reserved for the BLCFA Object pointed by obj.
and SFQEDtoolkit should be 'finalized', i.e., the memory reserved by SFQEDtoolkit for storing the tables of coefficients should be deallocated, by calling the function 14: void SFQED FINALIZE ALL(): frees all the memory allocated at initialization.
The BLCFA routines 4 and 11 both rely on the BLCFA Object entity, which is initialized with function 3. If desired, the user can completely bypass the usage of BLCFA Objects. In this case, the declaration and initialization of the additional variables that are necessary for the BLCFA algorithm has to be done by the user, directly, and functions 4 and 11 are replaced by 15: bool SFQED BLCFA OBJECT update raw( double* p push, double* p, double* Lorentz F, double* Delta Lorentz F, bool just created, double delta, double gamma, double chi): replaces function 4. Namely, if just created is true, calculates F (n) L,⊥ from Eqs. (5.1)-(5.4) by using the particle momentum before p and after p push one timestep, and changes just created to false. In addition, it sets Lorentz F and Delta Lorentz F equal to the computed F (n) L,⊥ , and the boolean returned by the function is false to signal no emission at this timestep. Note that for a new particle, the user must have declared in the code suitable variables corresponding to Lorentz F and Delta Lorentz F, both set to zero, and to just created, set to true. By contrast, if just created is false, uses the particle momentum before p and after p push one timestep to perform all calculations in Eqs. Finally, SFQEDtoolkit provides users with two functions for the computation of the particle quantum nonlinearity parameter χ according to Eq. (4.3), and one function implementing the collinear emission model for the generated particle. 17: double compute chi with vectors( double gamma, double p [3], double EE [3], double BB [3]): returns the quantum nonlinearity parameter χ γ/e of a particle with energy gamma and momentum p in an electric EE and magnetic BB field. All quantities must be provided in normalized units. 18 [3], double p out [3]): returns the momentum of the generated particle p out aligned with the momentum of the parent particle p in and sets its magnitude according to the energy of the generated particle gamma out. All parameters are in normalized units.
The above functions can be used, e.g., in combination with functions 5-10.
Thus, there is no need to compute the Chebyshev polynomials T i (x). Elegant and computationally efficient, Clenshaw's recurrence is also stable against round-off error, as reported in Ref. [58]: You need to be aware that recurrence relations are not necessarily stable against roundoff error in the direction that you propose to go (either increasing n or decreasing n).
[...] Clenshaw's recurrence is always stable, independent of whether the recurrence for the functions F k [the functions appearing in the series] is stable in the upward or downward direction. Finally, the application of Clenshaw's recurrence to the coefficients g i (y) gives f (x, y) evaluated at the desired x, y.
Remarkably, if f (x, y) needs to be evaluated multiple times and y is fixed, then the N + 1 coefficients g i (y) do not change therefore greatly reducing the computational cost of the further evaluations. The above iterative procedure can be naturally extended to functions of an arbitrary number of variables.