Modern methods of signal processing applied to gyrokinetic simulations

Numerical simulations, like the ones necessary for e.g. electromagnetic gyrokinetic models in plasma physics, require large computational resources and long run times. Using tools from signal processing, it is possible to draw conclusions about frequencies, damping rates and mode structures using shorter runs. These tools can also be applied to analyse transient signals. We give a pedagogical review of two contemporary methods from signal processing: damped multiple signal classification and stochastic system identification. An application to simulations of Alfvén modes in a tokamak is presented.


Introduction
Gyrokinetic simulations using initial value codes are a mainstay of theoretical plasma physics. Unfortunately, they can be extremely expensive with regard to computing time especially for electromagnetic perturbations. Thus, it would be desirable to be able to draw conclusions from simulations using as few time steps as possible. Initial-value codes are normally used in the linear regime for finding the most dominant mode, in either damped or unstable systems. If the system contains modes with similar growth or damping rates then simulations need to be run long enough until the ratio of the mode amplitudes is so large that the modes can be discerned. A second problem arises for transient phenomena, e.g. finding the frequencies of a collection of strongly damped modes. For standard Fourier transform (FT) methods, the frequency resolution ∆f is inversely proportional to the signal length. Thus, in order to distinguish modes with similar frequency or to estimate damping Original Content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. rates one needs long time traces. Nevertheless, the modes of interest may have decayed so strongly that the signal at later times is useless because it may be corrupted by e.g. numerical noise (which especially can be a problem for Monte Carlo simulations). Additionally, if the phenomenon of interest only appears during a specific time period a longer time trace may not give better resolution. We note that, in principle, FT methods can be used for getting frequencies and damping rates but obtaining time traces of sufficient length can be prohibitively expensive in simulations, may not be helpful for transient phenomena or are simply not available.
Methods from the toolbox of signal processing can be useful to help overcoming these common problems. We have the impression that, in contrast to the well established Fourier transform, these methods, used in the signal processing or structural mechanics community for quite some time, are not well known to the plasma simulation practitioner. Methods like the ones presented here, give a high surplus value at costs which, albeit higher than for a simple Fourier transform, are still negligible. Accordingly, two methods, which we found useful for attacking the two above problems, will be presented here in a pedagogical review: damped multiple signal classification (DMUSIC) and stochastic system identification (SSI). These methods require manipulating large matrices and thus need much more computing time than a simple Fourier transform but compared to gyrokinetic simulations their costs are still negligible.
In section 2, the basics of both methods are briefly reviewed and in section 3 applications to the result of electromagnetic gyrokinetic simulations in tokamaks will be presented.

DMUSIC
The Fourier Transform (or the periodogram) is the usual standard method for finding the frequency content of a signal. However, its frequency resolution is limited by the signal length t S and can not be better than t −1 S . Consequently, if high frequency resolution is needed one has to run long simulations which can be extremely time consuming. For damped signals, an accurate estimation of frequency is complicated by the fact that the spectral peaks get smeared out. Even worse, if the signal is noisy the variance of the Fourier spectrum cannot be reduced by providing longer time traces (see e.g. [1]).
These fundamental limits of the FT stem from the fact that it is a non-parametric method, i.e. no assumption about the signal is made. In contrast, parametric methods presuppose a given signal model and estimate its parameters. They can thus break the FT resolution limit and lead to so-called superresolution methods. Additionally, they show much superior performance for noisy signals compared to the FT. The most well-known methods (see e.g. [1,2]) are MUSIC (Multiple Signal Classification) and ESPRIT (Estimation of Signal Parameters by Rotational Invariance Techniques) which are only applicable for stationary signals. For MUSIC, the frequency estimation of a noiseless signal can (under relatively weak conditions) be shown to be exact while for a noisy signal the accuracy depends on the signal-to-noise ratio [3].
For damped signals, Prony's method can be used but is known to give inconsistent estimates [4], to be numerically unstable and to show poor performance if noise is present. Damped MUSIC (DMUSIC [5,6]), which is an extension of MUSIC also applicable for damped noisy signals, will be used in the following. Since this method does not seem to belong to common knowledge, its basic idea is briefly presented here.
The derivation of the method starts from so-called linear prediction. Here, one makes the following ansatz to predict a signal value y at discrete time n ∈ [0, N] (where N is the signal length) from the preceding J values: In order to estimate the coefficients c J − i , one applies this ansatz to the last N − J values y(J), . . . , y(N − 1), of the signal resulting in the following system of equations for the coefficients: . . .
Note, that the matrix on the left-hand side, the so-called prediction matrix A ∈ R J× (N−J) , contains a random contribution if the signal is noisy. The system (2) is overdetermined for J < N/2 and the coefficients can be obtained by e.g. a least squares method. Nevertheless, in the DMUSIC algorithm, the interest is not to obtain the coefficients but to use the property of A being a Hankel matrix in combination with a signal model to get an estimator for the model parameters. For frequency estimation, the model is chosen as the sum of K < J exponentials with amplitude a i ∈ C and frequency s i ∈ C (we assume all s i to be different).
Inserting equation (3) into A, the matrix can be written as: where C is a K × K diagonal matrix. The K columns of S L and S R consist of the vectors ⃗ r L (s) = (1, e s , . . . , e (N−J)s ) T and ⃗ r R (s) = (1, e s , . . . , e (J−1)s ) T for s = s 1 , . . . , s K . So, the columns of S L and S * R are linearly independent and consequently A has rank K. Using the fact that none of the vectors ⃗ r R (s i ) is in the kernel of A leads to an orthogonal decomposition R J = W S ⊕ W N with the so-called signal space W S := span{⃗ r R (s 1 ), . . . ,⃗ r R (s K )} and the noise space W N := ker(A).
Having elucidated the structure of A using the model equation (3), now the actual signal values are used to build the prediction matrix A. Performing a singular value decomposition then results in A = USV * . Since rank(A) = K, only the first K entries in the (N − J)×J diagonal matrix S are non-zero. Consequently the columns K + 1, . . . , J of V (the corresponding matrix we call V N ) span the kernel of A and thus the noise space W N . Now define a function R(s) = ∥V * ⃗ r R (s)∥ 2 2 with s ∈ C and ∥ · ∥ 2 denoting the usual Euclidean norm. If an s 0 can be found such that R(s 0 ) = 0 then⃗ r R (s 0 ) is orthogonal to W N and, due to the above definition of signal and noise space, must be a linear combination of vectors from W S :⃗ r R (s 0 ) = ∑ K i=1 a i ⃗ r R (s i ). This overdetermined system has the unique solution a i = 0 for i ̸ = i 0 and a i0 = 1 (for some index i 0 ∈ {1, . . . , K}) and so s 0 = s i0 . Therefore, the frequencies s i of the model can be found by searching the complex plane for poles of the signal function P(s) : = 1/R(s).
If noise is present A has full rank and a clear separation into signal and noise space is no longer possible. Nevertheless, if the noise is not too strong the maxima of P(s) will still be estimators for the complex frequency content of the signal (for a mathematical analysis of the simpler MUSIC algorithm see e.g. [3]). It is important to keep in mind that the actual value of P does not give any information (for the case without noise it is formally infinite). In particular, it cannot be used as an indicator for the size of the coefficients a i of the model.
It is interesting to note that the DMUSIC algorithm can also be interpreted as a method to provide a rank-K matrix approximation of the prediction matrix A which preserves its Hankel matrix structure [5].
The maxima of P(s) are found by performing a scan over a rectangular domain [z r,0 , z r,1 ]×[z i,0 , z i,1 ] in the complex plane. Since often one is not interested in the damping rate but only in the frequency, we define a frequency estimator by P r (Ω) = Scanning of the domain and calculating the estimator can, especially for large J, be time consuming. So, a parallel implementation in the Julia programming language has been used for the results presented here.

Stochastic system identification
System identification in general assumes that the system is driven by a known signal and the system's response is used to draw conclusions about its internals. Often it is neither possible nor desirable to apply a given signal to a system but it can be assumed that it is driven by (external or internal) noise. Then, the methods of SSI can be applied. This is especially interesting for the analysis of initial-value simulations where one starts with noise and lets the system evolve freely.
System identification is akin to hidden Markov models or the filtering problem in signal processing: one assumes a dynamic process with state vector⃗ x k ∈ C s (here s is the dimension of the state space and k is a discrete index labelling time intervals of length ∆t). This process is hidden in the sense that its state is not directly accessible to observation but only through a secondary process yielding an observable ⃗ y k ∈ C m in an m-dimensional observational space. It is then natural to ask what can be said about the process ⃗ x k when only the ⃗ y k are known. Here ⃗ x k is assumed to follow a linear evolution driven by white noise ⃗ w k . The observation ⃗ y k is perturbed by white noise ⃗ v k but otherwise depends linearly on the state vector. So the model is given by: with A ∈ C s× s , C ∈ C m× s and k = 1, . . . , N (where N is the length of the data set). Furthermore, we make the essential assumption that each of ⃗ x k and ⃗ y k is uncorrelated with the necessary, sequences are extended by applying zero padding at the end).
In the following we closely follow [7] (referring especially to what they call 'principal component algorithm') and the very accessible exposition by Magalhães and Cunha [8].
can be written as: with Choosing an arbitrary number J > 2 s, the first J matrices Λ i can be arranged to form a block-Hankel matrix H ∈ C J m× J m : Using (7) it is easy to decompose H as H = Γ∆ using the block vectors: A singular value decomposition of H gives H = USV * from which Γ and ∆ can be calculated as Γ = US 1 2 and ∆ = S 1 2 V * (for the freedom involved in this procedure, see [7]). By removing the first or the last block from Γ, one defines the two reduced vectors: which are connected by the relation ΓA = Γ. Applying the Moore-Penrose inverse, A can be recovered as A = Γ + Γ. From a subsequent eigendecomposition A = FDF −1 the eigenfrequencies λ i (for i = 1 . . . s) and ⃗ x-eigenvectors can be found from the matrices D and F while the ⃗ y-eigenvectors ⃗ λ i are contained in the columns of CF. All eigenvectors in F must be normalised in the same way (e.g. ∥ ⃗ λ i ∥ 2 = 1). This is necessary to conserve the information about the relative contribution (encoded in C) of the ⃗ y-eigenvectors to the signal. Usually, one is interested in a system given by differential equations and the model (5) and (6) can be regarded as its time discretisation. The eigenvalues of the discrete and the continuous system can easily be related: assuming e.g. the simple equationẋ = ωx and using its solution x(t) ∼ e ωt at the discrete time points t = k∆t (k ∈ N) one arrives at the discrete model x k+1 = x((k + 1)∆t) = e ω∆t x k . If λ denotes the eigenvalue of the discrete system one thus arrives at ω = 1 ∆t ln λ. The dimension of the state space is usually not known a priori. Thus for each chosen value of s one obtains a solu- There is some freedom in how to define 'near' but a natural measure is that solutions are 'near' when |λ s+1 i1 − λ s i0 |/|λ s i0 | < ϵ and MAC( ⃗ λ s i0 , ⃗ λ s+1 i1 ) > 1−δ (with small numbers ε, δ) where the modal assurance criterion MAC(⃗ v, ⃗ w) = |⃗ v · ⃗ w|/(∥⃗ v∥ 2 ∥⃗ w∥ 2 ) measures collinearity of vectors. One thus performs multiple calculations for an increasing sequence of s and, for the stable solutions, marks the real part of the eigenfrequencies over s. This gives a stability diagram from which the value of s necessary to describe the system accurately enough can be deduced (for an example see figure 5).
It is helpful to additionally plot the mode indicator function (MIF) in this diagram since it is an independent means of estimating the systems frequencies. This function results from an alternative method of system identification: frequency domain decomposition [9]. Here first the power spectral density matrix G ∈ C m× m of the signal ⃗ y k is constructed: G(ω) = ⃗ yk ·⃗ y T k where ⃗ yk denotes the discrete Fourier transform of ⃗ y k with respect to k. Performing a singular value decomposition of G, the mode indicator function is then given by the largest singular value as a function of ω =k/∆t.
We note that recovering mode structures from space-time data can also be done by traditional Fourier based methods. One of the simplest ways is to apply a well chosen band-pass filter around a peak in frequency space simultaneously at all space positions and then transform back to the time domain. This approach suffers the problem of all Fourier based methods of not giving reliable results for the mode structure if the frequency peaks overlap (e.g. by damping or too less resolution). Being a parametric method, SSI does not have such problems in many cases. Other advantages of SSI are that it aims at reconstructing true eigenmodes of the system and when combined with the MAC and MIF gives additional reliability.

DMUSIC
The two methods described above will now be applied to electromagnetic gyrokinetic simulations. Such simulations are numerically difficult and extremely time consuming, so they are a natural candidate for applying the above tools. The three-dimensional global δ f particle-in-cell code EUTERPE [10,11] will here be used to simulate Alfvén waves in tokamak equilibria. The first case is a circular tokamak with aspect ratio A = 10 (the often-used ITPA standard case [12]). To guarantee that all waves in the system are excited, the simulation is started with random noise. 4 × 10 6 ion and 16 × 10 6 electron markers with a time step of ∆t = 5 Ω −1 i (here Ω i denotes the ion gyro frequency) were evolved for a total of 6000 times steps (needing about 9 h on a Linux cluster with 48 processors). The radial resolution (using the normalised toroidal flux ρ as a radial coordinate) was N ρ = 80 grid points and the simulation was restricted to m = 5, . . . , 17 and n =−6 toroidal and poloidal Fourier components, respectively. For further analysis, the first 1500 time steps were omitted since they just show an initial numerical relaxation phase of the system. Fourier transforming the time trace of the electrostatic potential at each radial grid point ρ (for fixed toroidal and poloidal coordinate θ = 0, ϕ = 0) gives the plot shown in figure 1, left. Continuum branches (which consist of radially singular modes [13]) and their respective crossings can be seen.
The solid line in the left panel shows the ideal magnetohydrodynamic (MHD) continuum calculated with the CONTI code [14]. The only gap is the low frequency toroidal Alfvén (TAE) gap near ρ = 0.25. Gaps at the higher frequency branch crossings could appear if the tokamak equilibrium were not circular (e.g. ellipticity induced gaps). Due to the short time trace (4500 steps) the Fourier resolution is limited to ∆Ω ≈ 8 × 10 4 s −1 and nearly nothing can be concluded especially about the region ρ > 0.6. In particular, it is not possible to estimate the geodesic acoustic (GAM) up-shift of the lowest branch (m = 11) around ρ = 0.8. The result of zooming into this region can be seen in figure 2 (left). It is conspicuous that the FT does not allow to gain much information due to its highly limited resolution.
Applying the DMUSIC algorithm to the same data results in figure 1, right. Here, for each ρ-position a two dimensional scan over the relevant region of s ∈ C (using 300 × 50 points) was performed to obtain the signal function P(s) (the angular frequency Ω and the damping factor are given by Im(s) and Re(s), respectively). The result was then integrated over the imaginary part of s giving the reduced signal function P r (Ω) containing the angular frequency part of the signal. Note that for the DMUSIC plot no colour bar is shown, since P r only gives the strength of the reduced signal function but, in contrast to the FT case, does not convey any information about the mode amplitude (as noted in section 2.1, P(s) even has a pole at the true s of the system).
The continuum branches are now very clear and details in the region ρ > 0.6 can be discerned. Usually the gyro radius is much less than the radial resolution used to solve the field equations inside EUTERPE. We may conclude that the kinetic Alfvén waves present in a fully gyro-kinetic model need not be radially resolved for a calculation aiming at global modes and that the radial average over their frequencies is very close to that of the MHD continuum. Apart from the continuum branches, a global toroidal Alfvén eigenmode (TAE) in its early phase of formation (where it is not yet observable in the electrostatic potential) is visible as the red line extending from ρ ≈ 0.1, . . . , 0.5 at Ω ≈ 4 × 10 5 s −1 . Comparing the measured continuum with the ideal MHD continuum, one observes that the branches near the TAE are distorted, i.e. they collapse to the frequency of the global mode.
The GAM up-shift in the lowest branch near ρ = 0.8 is well resolved, nevertheless one of the benefits of DMUSIC, compared to FT based methods, is that one can zoom into the region of interest without loosing resolution. This is shown in figure 2 (right): using the same time series as for figure 1, a scan with 100 × 50 s-points gives a very detailed result. The advantage of DMUSIC over a FT is striking. To obtain a similar frequency resolution with FT would require a time series O(10) longer, i.e. using 90 h of computing time instead of 9 h (the time needed to run DMUSIC is negligible compared to this, see below). Figure 3 illustrates the relation between P and P r : both functions are displayed for a constant ρ = 0.5 covering the lower frequency range of figure 1. It shows how well frequencies and damping rates can be resolved. Note that the strongest signal comes from the two continuum branches, while the TAE   is barely visible. This is to be expected since no fast ions capable of exciting the TAE are present in this simulation. In ideal MHD, the continuum is neither damped nor growing but due to kinetic effects it is slightly damped here.
Obtaining damping rates of modes is awkward using FT based methods since the width of the, sufficiently resolved, frequency line needs to be estimated by e.g. fitting a Lorentz distribution to it. Using DMUSIC, damping rates can directly be extracted by looking at P instead of P r . This is shown in figure 4. Here the frequency and damping rate of the GAM up-shifted part is obtained by finding the maximum of P over s for radial positions ρ ∈ [0.595, 0.95]. The frequency of the branch is compared with the result from standard MHD theory which uses an adiabatic index of κ = 5/3 (dotted line). The results from the simulations are met quite well but near ρ = 0.8 the MHD results give a somewhat lower frequency. We interpret this behaviour as a sign of kinetic effects coming into play leading to an effective adiabatic index. Its value is empirically found to be close to κ = 2 (dashed curve). The value κ = 5/3 is also obtained by using moments of the gyrokinetic equation and assuming isotropic pressure perturbations (see e.g. [15]). Relaxing the latter assumption leads to a parallel adiabatic index κ = 2 (see e.g. [16]). The damping of the continuum modes is found to be the largest near the frequency minimum but changes only slightly with ρ. Note that a damping can not be obtained from ideal MHD but a kinetic extension is necessary, which is currently under investigation.
The performance and runtime of the algorithm clearly depend crucially on the resolution for the scan over s as well as on the choice for the parameter J determining the prediction matrix size. Setting those to too large values increases the run time considerably but does not lead to improved results. K does not influence the runtime as strongly as J but choosing a value too low may cause some modes to be missed in the analysis. Finding an optimal set of values for J and K needs some experience (Li et al [6] recommend choosing J around N/2).
The advantage of using an FT is its speed: producing figure 1 (left) took less than 1 s. On the other hand, running DMUSIC (K = 50, J = 2000 and 300 × 50 points for the s scan) in a Julia implementation required about 1000 s on 16 processors (a straightforward parallelisation over the radial position ρ was used) on a Linux cluster to get figure 1, right. Putting it differently, reaching the same frequency resolution as with DMUSIC using FT would require a prohibitively expensive ten times longer time series from the gyrokinetic simulation. Furthermore, it would mainly show the TAE while the continuum as a transient effect would be blurred. Thus a combination of both methods suggests itself: an FT gives a first approximation of the whole spectrum and then DMUSIC is used to get the frequency region of interest with the desired resolution.

SSI
In order to show the capabilities of SSI, a gyrokinetic simulation for a tokamak with A = 3 was used. The rotationaltransform profile was chosen such that ideal MHD shows the presence of a reversed-shear Alfvén eigenmode (RSAE) together with a TAE. 16 × 10 6 ions and 64 × 10 6 electrons were used in the simulation resulting in a time series for the  To reduce the data size for the analysis, only every 320th time step was selected. The run time on one processor using a Julia implementation of the above algorithm is about 1 min what is negligible compared to the run time of the simulation.
Even though the simulation was run for a long time, both modes are still present at the end since their damping rates are very similar. It is not possible to separate the modes using the simulation data as such, so SSI can be employed to find their respective mode structures. Obviously SSI could be applied to the time series of all 100 × 8 channels resulting in a very large Hankel matrix. In order to avoid this, a different strategy was followed here: each single Fourier component was used for a separate SSI analysis of order s resulting in frequencies λ s i,m and corresponding ⃗ λ s i,m . The i, m pairs were then grouped together in such a way that within a group the pairwise differences between the real parts of λ s i,m , as well as that between the imaginary ones, are each smaller than given thresholds. The corresponding groups of ⃗ λ s i,m then constitute a mode. Note that this procedure is not unique: the grouping depends on the thresholds which must be chosen in an appropriate way.
In ideal MHD, all modes are standing waves and thus, when decomposed into travelling waves, have a negative and positive frequency part of equal modulus. Furthermore, all modes have damping rate γ = 0. Applying the ideal-MHD code CKA [17] to the above tokamak equilibrium gives two dominant modes: an RSAE with frequency ω RSAE = 196.7 kHz and a TAE with ω TAE = 297.7 kHz. Adding kinetic effects should lead to a loss of these properties: first, positive and negative frequency waves acquire frequencies with slightly different magnitude and, second, the modes obtain some kinetic damping. The result of applying SSI to the time trace of the gyrokinetic simulation are displayed in figure 5 for m = 8 and m = 9 (choosing a model order up to s = 25). It shows the frequencies found by the SSI for each s together with the MIF. Choosing the results for s = 20 and applying the grouping procedure, the four modes with the largest amplitude (corresponding to prominent peaks in the MIF) have frequencies ω RSAE,1 = −226.6 kHz, ω RSAE,2 = 223.5 kHz, ω TAE,1 = −321.6 kHz, ω TAE,2 = 306.9 kHz. Also the modes are weakly damped γ RSAE,1 = −24.3 s −1 , γ RSAE,2 = −52.0 s −1 , γ TAE,1 = −28.2 s −1 , γ TAE,2 = −10.6 s −1 . Figure 6 shows the eigenmodes, i.e. the modulus of the electrostatic potential Φ as a function of radial position ρ, given by SSI (left) and by CKA (right). Eigenfunctions and frequencies of the corresponding results agree very well.
Note that, as can be seen in figure 5, SSI finds more frequencies than just the four given above. Nevertheless, they are either unstable (in the sense explained above) or the amplitudes of the eigenmodes are low and thus of minor importance for the time series. Overall, it can be concluded that the total procedure of identifying modes and frequencies with SSI is not wholly automatic but requires some informed inspection by the user.

Conclusions
We have presented two modern methods of signal processing which can be used to analyse time series: DMUSIC and SSI. The first, as a super-resolution method, is highly sensitive in finding frequencies and damping rates but does not provide information about mode structures. If this is required, SSI is the tool of choice.
As an illustrative example, we applied these methods to the results of electromagnetic gyrokinetic simulations for tokamaks. Since those simulations are very expensive with respect to computing time, it is desirable to have methods at our disposal which facilitate finding frequencies, damping rates and mode structures using shorter runs or which can be used when it comes to analyse transient phenomena. We have shown how the continuous Alfvén spectrum could be found and how TAE and RSAE mode structures could be identified and separated from the simulation results using DMUSIC and SSI.
SSI may be especially useful when only a few modes are dominant, as it is often the case for simulations of e.g. Alfvén modes. Whether it may also be helpful for identifying structures when many modes are present, as is the case for e.g. turbulence, needs further investigation.
Each of the presented methods has a free parameter specifying the order of the underlying model (for DMUSIC it is K and for SSI it is s). Choosing these parameters in a good way is usually guided by experience. It would also be possible to use criteria based on information theory like e.g. Akaike's information criterion (see e.g. [18]) to select the optimal model parameters. This is left for future development.
The presented methods are very general and can be useful also for the analysis of experimental data. With respect to this, one important advantage of parametric methods is that, being based on an underlying model, they are less sensitive to noise than non-parametric methods. Thus they may be applicable even when the signal-to-noise ratio is not optimal. In experiments, it is often difficult to increase the available resolution by simply getting longer time series (e.g. if the machine parameters change over time). Here combining DMUSIC with a time window (as replacement for a windowed Fourier transform) leads to better results. Regarding the application of SSI, we found that when the signal strength in the mode indicator function of two signals differs by many orders of magnitude the weaker signal may not be reliably detected. In this situation attenuation of the strong signal by a band-stop filter should be considered as a remedy. Recent applications of SSI and DMUSIC for the analysis of experimental signals from a Mirnov diagnostic in Wendelstein 7-X can be found in [19][20][21].