COMPRESSED SENSING AND DYNAMIC MODE DECOMPOSITION

This work develops compressed sensing strategies for computing the dynamic mode decomposition (DMD) from heavily subsampled or compressed data. The resulting DMD eigenvalues are equal to DMD eigenvalues from the full-state data. It is then possible to reconstruct full-state DMD eigenvectors using `1-minimization or greedy algorithms. If full-state snapshots are available, it may be computationally beneficial to compress the data, compute DMD on the compressed data, and then reconstruct full-state modes by applying the compressed DMD transforms to full-state snapshots. These results rely on a number of theoretical advances. First, we establish connections between DMD on full-state and compressed data. Next, we demonstrate the invariance of the DMD algorithm to left and right unitary transformations. When data and modes are sparse in some transform basis, we show a similar invariance of DMD to measurement matrices that satisfy the restricted isometry property from compressed sensing. We demonstrate the success of this architecture on two model systems. In the first example, we construct a spatial signal from a sparse vector of Fourier coefficients with a linear dynamical system driving the coefficients. In the second example, we consider the double gyre flow field, which is a model for chaotic mixing in the ocean. 2010 Mathematics Subject Classification. Primary: 65P99, 37M99, 37M10, 68P30, 94A99; Secondary: 76M99, 90C25, 94A08.


1.
Introduction. Dynamic mode decomposition (DMD) is a powerful new technique introduced in the fluid dynamics community to isolate spatially coherent modes that oscillate at a fixed frequency [54,49,52]. DMD differs from other dimensionality reduction techniques such as the proper orthogonal decomposition (POD) [37,9,30,43], where modes are selected to minimize the 2 -projection error of the data onto modes. In particular, DMD results in spatial modes that are constrained to each evolve at a single frequency and/or growth or decay rate in time. Thus, DMD not only provides modes, but also a linear model for how the modes evolve in time, in terms of these frequencies and growth/decay rates [60].
There are many immediate benefits to this formulation of coherent modes. First, it is possible to extract coherent structures that are known to oscillate at a fixed temporal frequency, which is common in fluid dynamics [64]. In addition, DMD modes with zero frequency and zero growth rate are background modes, providing efficient foreground/background separation in videos [25]. Finally, having a loworder model that describes the evolution of DMD modes may be useful for closedloop feedback control.
The DMD is a data-driven and equation-free method that applies equally well to data from experiments or simulations. An underlying principle is that even if the data is high-dimensional, it may be described by a low-dimensional attractor subspace defined by a few coherent structures. When the data is generated by a nonlinear dynamical system, then the DMD modes are closely related to eigenvectors of the infinite-dimensional Koopman operator [35,41,49,12,40]. In Ref. [63], DMD has been shown to be equivalent to linear inverse modeling (LIM) [46,47] from climate science, under certain conditions, and it also has deep connections to the eigensystem realization algorithm (ERA) [29,33,38].
DMD has been used to study various fluid experiments [53,55], shock turbulent boundary layer interaction [26], the cylinder wake [4], and foreground/background separation in videos [25]. In the context of fluid dynamics, DMD typically relies on time-resolved, full-state measurements of a high-dimensional fluid vector field. For complex, turbulent flows, it may be prohibitive to collect data across all spatial and temporal scales required for this analysis.
The present work leverages tools from compressed sensing [20,14,16,15,7,8] to facilitate the collection of considerably fewer measurements, resulting in the same dynamic mode decomposition, as illustrated in Figure 2. This reduction in the number of measurements may have a broad impact in situations where data acquisition is expensive and/or prohibitive. In particular, we envision these tools being used in particle image velocimetry (PIV) to reduce the data transfer requirements for each snapshot in time, increasing the maximum temporal sampling rate. Other applications include ocean and atmospheric monitoring, where individual sensors are expensive. Even if full-state measurements are available, the proposed method of compressed DMD will be computationally advantageous in many situations where there is low-rank structure in the high-dimensional data.

Previous work on sparsity in dynamics.
There are a few examples of prior work utilizing sparsity for the dynamic mode decomposition. In [32], a sparsitypromoting variant of the dynamic mode decomposition was introduced whereby an 1 -penalty term on the number of DMD modes balanced the tradeoff between the number of modes and the quality of the DMD representation. Other algorithms have been developed to obtain only a fixed number of modes, but these have involved global minimization techniques that may not scale with large problems [18].

COMPRESSED SENSING AND DMD 3
In [5,65,6,27], compressed sensing has been used to design non-time resolved sampling strategies motivated by particle image velocimetry (PIV) of a fluid velocity field; in [65,27], this sampling is specifically used to compute DMD utilizing sparsity of mode coefficients in time. These experimental methods are based on the fact that temporally sparse signals may be sampled considerably less often than suggested by the Shannon-Nyquist sampling frequency [44,56].
In [57], compressed sensing is paired with the theory of linear dynamical systems to obtain higher temporal sampling resolution and accurate reconstruction of video MRI. Their work is based on prior studies relating compressed sensing, linear dynamical systems, and video MRI [50,45]. Incoherent measurements are used to estimate an underlying snapshot matrix of hidden-Markov states, as well as an embedding from this low-dimensional attractor into the high-dimensional image pixel space. For the first part, they use system identification based on Hankel matrices and minimal realization theory [29]. Based on the heavy use of Hankel matrices in ERA, and the established connections between DMD and ERA, it will be interesting to see how the present work connects to [57] in the future.

1.2.
Contribution of this work. This work deviates from the prior studies combining compressed sensing and DMD [32,65], in that we utilize sparsity of the spatial coherent structures to reconstruct full-state DMD modes from few measurements. This method results in full-state DMD modes from spatially compressed or subsampled measurements using compressed sensing. The eigenvalues of the DMD on compressed data are shown to be equivalent to the full-state DMD eigenvalues under some conditions, so that we obtain the same low-dimensional DMD model to advance mode coefficients.
These results highlight the ability to perform DMD with significantly less data acquisition when the data and modes are sparse in some transform basis. If full-state snapshots are available, it is also possible to pre-compress the data, compute DMD on the compressed data, and then reconstruct full-state DMD modes as a linear combination of the original full-state data. Performing the DMD on compressed data requires significantly less computational overhead when compared with traditional DMD, because the most expensive step, computing a singular value decomposition (SVD), is performed on a much smaller data matrix. These methods are described in Section 3.3 as various paths we can take in Figure 2, depending on the available of full-state or compressed initial data.
Our results rely on a number of theoretical advances that may be useful more broadly. First, we establish connections between DMD on full-state and compressed data. We then show that DMD is invariant to left and right unitary transformations of the data. This implies that the DMD computed in the spatial domain, Fourier domain, or in a POD coordinate system will all be closely related, since these coordinate systems are related by unitary transformations. We then show that when data and modes are sparse in some basis, we obtain a similar invariance of the DMD when our measurement matrix and sparse basis satisfy the restricted isometry property.
The methods in this paper are illustrated on two examples that are relevant to fluid mechanics. However, we believe that there is broad applicability of compressed sensing techniques and dynamic mode decomposition in dynamical systems more generally.

2.
Background. Dynamic mode decomposition is a method of modal extraction from full-state snapshot data that results in spatial-temporal coherent structures oscillating with a fixed frequency and damping rate. This theory has recently been generalized and extended to a larger class of datasets [63], and it is discussed in Sec. 2.1.
The present work is centered around the use of compressed sensing to compute the dynamic mode decomposition from very few spatial measurements. In compressed sensing, a high-dimensional signal may be reconstructed from few measurements as long as the signal is sparse in some transform basis. We discuss compressed sensing in Sec. 2.2.
2.1. Dynamic mode decomposition (DMD). The dynamic mode decomposition (DMD) is a new tool in dynamical systems that has been introduced in the fluid dynamics community [54,49,52]. The DMD provides the eigenvalues and eigenvectors of the best-fit linear system relating a snapshot matrix and a time-shifted version of the snapshot matrix at some later time.
Consider the following data snapshot matrices: Here, x k ∈ R n is the k th snapshot, and typically n m.
x is often the state of a high-dimensional dynamical system, such as a fluid flow. We consider snapshots that are spaced evenly in time, and we also allow for these measurements to be discrete-time samples of a continuous-time signal, so that x k = x(k∆t).
The dynamic mode decomposition involves the decomposition of the best-fit linear operator A relating the matrices above: (1) The system X = AX is generally underdetermined for n > m, and A is the leastsquares solution, which is chosen to minimize the Frobenius norm of X − AX F . In the atypical case when Eq. (1) is overdetermined (e.g., m > n), it is possible to solve for A as a minimum-norm solution [63]. When ambiguous, we may refer to A in Eq. (1) as A X . It is important to note that Eq. (1) may not be exactly satisfied by the data, and exact equality is not required at any step in the results that follow, hence the least-squares solution.
In practice, the matrix A may be extremely large, making it difficult or impossible to store in memory and analyze. However, DMD seeks the dominant eigenvectors of A, corresponding to non-zero eigenvalues, of which there are at most m when n ≥ m. Thus it is possible to analyze a low-rank projection of A onto modes defined by the singular value decomposition (SVD) of X. The eigendecomposition of this projected matrix is then used to reconstruct full-state DMD modes. The exact DMD algorithm proceeds as follows: Algorithm 1. The method of exact DMD was recently defined [63], and it is repeated here for convenience. The exact DMD algorithm is given by the following: 1. Collect data X, X and compute the SVD of X: In general, U and V are square unitary matrices of sizes n × n and m × m, respectively. The rank of the SVD is r ≤ m, and so we only keep the first r COMPRESSED SENSING AND DMD 5 columns of U and V and the first r × r terms in Σ; thus U and V become rectangular matrices. In practice, even if the matrix has full column rank m, one may choose to truncate the SVD at r < m, either at some pre-determined cutoff threshold or at an optimal threshold for noise reduction [22]. 2. Compute the least-squares fit A that satisfies X = AX and project onto POD/PCA modes U:Ã If the matrix U contains all m left-singular vectors of X arranged as columns, thenÃ is projected so that it acts on the range of X. However, in practice, U may be truncated, in which caseÃ is projected onto the leading POD modes of X. 3. Compute the eigen-decomposition ofÃ: Λ are the DMD eigenvalues. 4. Compute the DMD modes Φ: It is proven in [63] (Theorem 1) that the diagonal elements of Λ are eigenvalues of A with corresponding eigenvectors given by columns of Φ.
Remark 1. The first three steps in the algorithm above are identical to those in [52]. However, the last step differs in the computation of DMD modes. In [52], the modes are given by Φ = UW. In [63], this formula Φ = UW is still used to compute modes corresponding to zero eigenvalues. The new definition of exact DMD has the benefit that modes Φ are in the column space of X , whereas traditional DMD modes are in the column space of X (i.e., they are a linear combination of POD modes U). In many fluid systems, the column space of X and X are either close or identical, because the system is sampled on an attractor, so that either convention may be used to compute DMD modes. None of the major theoretical results in this paper depend on defining DMD modes using Eq. (5) instead of Φ = UW. However, we use the definition in Eq. (5) to be consistent with [63].
The data X, X may come from a nonlinear system in which case the DMD modes are related to eigenvectors of the infinite-dimensional Koopman operator K which acts as the pull-back operator on observable functions [35,1,49,40]. In particular, K acts on observable functions g as: The connection between DMD and the Koopman operator justify the application of this method in a variety of contexts. We may interpret DMD as a model reduction technique if data is acquired from a high-dimensional model, or a method of system identification if the data comes from measurements of an uncharacterized system. In the latter case, the resulting DMD model is data-driven and may be used in conjunction with equation-free methods [34]. The hierarchy of structure in the data is illustrated in Figure 1. Recently, the assumption of evenly spaced snapshots was relaxed, so that the columns of X may be sampled at any times, as long as the columns of X are sampled a fixed ∆t later [63].

Nonlinear dynamics
Linear dynamics Figure 1. Schematic of various assumptions of dynamic structure underlying data X, X .

2.2.
Compressed Sensing. Consider a signal x ∈ R n , which is sparse in some orthogonal basis given by the columns of Ψ, so that and s is a vector containing mostly zeros. The signal s is K-sparse if it has exactly K nonzero elements. Most natural signals are sparse in some basis. For example, natural images and audio signals are sparse in Fourier or wavelet bases, resulting in a high-degree of compressibility. If we take the Fourier or Wavelet transform of an image, most of the coefficients will be small and can be neglected without resulting in much loss of image quality. Truncating in Fourier or Wavelet bases is the foundation of JPEG-2000 image compression and MP3 audio compression. Similarly, many high-dimensional nonlinear PDEs have sparse solutions, allowing for accelerated computational methods [51,39].
The theory of compressed sensing [20,14,16,15,7,8] suggests that instead of measuring the high-dimensional signal x and then compressing, it is possible to measure a low-dimensional subsample or random compression of the data and then directly solve for the few non-zero coefficients in the transform basis. Consider the measurements y ∈ R p , with K < p n: The measurement matrix C is often denoted by Φ in the compressed sensing literature. However, Φ is already used to represent DMD modes in Eq. (5). If x is sparse in Ψ, then we may solve the underdetermined system of equations for s and then reconstruct x. Since there are infinitely many solutions to this system of equations, we seek the sparsest solutionŝ, However, this amounts to a brute force combinatorial search, which is infeasible for even moderately large problems. Under certain conditions on the measurement matrix C, Eq. (8) may be relaxed to a convex 1 -minimization [15,20]: Specifically, the measurement matrix C must be incoherent with respect to the sparse basis Ψ, so that rows of C are uncorrelated with columns of Ψ. The incoherence of the measurement matrix C with respect to the sparse basis Ψ is quantified by the parameter µ(C, Ψ): where c k is the k-th row of the matrix C. When measurements are incoherent, the matrix CΨ satisfies a restricted isometry property (RIP) for sparse vectors s, [17]. The constant δ K is defined as the smallest number that satisfies the above inequality for all K-sparse vectors s. When δ K is small, then CΨ acts as a near isometry on K-sparse vectors s. In practice, it is extremely difficult to compute δ K directly; moreover, the measurement matrix C may be chosen to be random, so that it is more desirable to derive statistical properties about the bounds on δ K for a family of measurement matrices C, rather than to compute δ K for a specific C. Generally, increasing the number of measurements will decrease the constant δ K , improving the property of CΨ to act isometry on sparse vectors. In addition to taking incoherent measurements, we must take on the order of K log(n/K) measurements to accurately determine the K nonzero elements of the n-length vector s [13,14,7]. In this case, there are bounds on the constant δ K that guarantee exact signal reconstruction for noiseless data. A more in-depth discussion of incoherence and the RIP can be found in [7,17]. We will use the fact that CΨ acts as a near isometry on K-sparse vectors to combine sparsity and dynamic mode decomposition for data with low-rank structure.
Typically a generic basis such as Fourier or wavelets is used to represent the sparse signal s. The Fourier transform basis is particularly attractive for engineering purposes since single-pixel measurements are incoherent, exciting broadband frequency content. If a signal is K-sparse in the Fourier domain, we may then reconstruct the full state from O(K log(n/K)) single-pixel measurements at random spatial locations. This is especially beneficial when individual measurements are expensive, for example in ocean and atmospheric sampling, among other applications.
Another major result of compressed sensing is that Bernouli and Gaussian random measurement matrices C will satisfy the RIP with high probability for a generic basis Ψ [16]. There is also work describing incoherence with sparse matrices and generalizations to the RIP [23]. Recent work has shown the advantage of pairing compressed sensing with a data-driven POD/PCA basis, in which the data is optimally sparse [36,10,5,6,11]. The use of a POD/PCA basis results in a more computationally efficient signal reconstruction from fewer measurements.
In addition to the 1 minimization described in Eq. (9), there are a host of greedy algorithms [61,62,42,24] that iteratively determine the sparse solution to the underdetermined system in Eq. (7). There has also been significant work on compressed SVD and PCA based on the Johnson-Lindenstrauss (JL) lemma [31,21,48,24]. The JL lemma is closely related to the RIP, and it states when it is possible to embed a set of high-dimensional vectors in a low-dimensional space while preserving the spectral properties.
3. Compressive DMD. In this section, we combine ideas from compressed sensing to compute the dynamic mode decomposition from a few spatially incoherent measurements. In Sec. 3.1, we establish basic connections between the DMD on full-state and compressed data. These connections facilitate the two main applied results of this work: 1) It is possible to compute DMD on compressed data and reconstruct full-state DMD modes through compressed sensing. This will be referred to as the compressed sensing dynamic mode decomposition (CSDMD). 2) If full-state measurements are available, it is advantageous to compress the data, compute the compressed DMD, and then compute full-state DMD modes by linearly combining snapshots according to the compressed DMD transforms. This will be referred to as the compressed dynamic mode decomposition (CDMD). In both cases, the full-state and compressed DMD eigenvalues are equal. These two approaches are described in Section 3.3 as various pathways to take in Figure 2, depending on what the initial data is.
Next, in Sec. 3.2, we demonstrate the invariance of the DMD algorithm to left or right unitary transformations of the data. We then discuss how the condition of unitarity may be relaxed to a transformation satisfying a restricted isometry property, as long as the data is sparse in a basis that is incoherent with respect to the measurements. This strengthens the connection to compressed sensing. Fullstate DMD modes are then reconstructed from DMD modes on compressed data using compressed sensing; in particular, we use the method of compressive sampling matching pursuit (CoSaMP) [42].
3.1. DMD eigenvalues and eigenvectors on compressed data. It is possible to either collect data X, X , or compressed data Y, Y , where Y = CX, Y = CX and C ∈ R p×n is the measurement matrix.
Definition 3.1. We refer to X and X as the full-state snapshot matrices and Y and Y as the compressed snapshot matrices.
Similar to Eq. (1) above, we seek to find a best-fit operator A Y that relates Y and Y as Again, note that exact equality in the expression above is not required, nor is it likely to be satisfied by the data. We may also rely on the following three assumptions of sparsity of data and incoherence of measurements. Assumption 1. The columns of X and X are sparse in a transform basis Ψ so that X = ΨS and X = ΨS , where the columns of S, S are sparse (mostly zeros).

Assumption 2.
The measurement matrix C is incoherent with respect to Ψ and there are enough measurements (i.e., p > K log(n/K)) so that CΨ acts as a near isometry on K-sparse vectors. In this case, there are bounds on the restricted isometry constant δ K that ensure, with high probability, that exact sparse signal reconstruction is possible for noiseless data.
Remark 2. Under Assumptions 1 and 2, it is possible to reconstruct x from y = Cx, and therefore X, X from Y, Y using compressed sensing. However, this is laborious and inelegant. Assumption 3. In addition to sparsity of the columns of X and X , we may also require each of the columns to be in the same sparse subspace of the basis Ψ. The POD modes U X and the DMD modes Φ X are then guaranteed to be in this sparse subspace. This condition is reasonable when the dynamical system evolves on a low-dimensional subspace.
Recall that the matrix V Y may be truncated, so that it has size r × m. If the system has rank r = m, then V Y V * Y will be the identity matrix and the assumption is trivially satisfied. However, if r < m, then the columns of V X are required to be in the column space of V Y ; of course, this will only be approximately satisfied if there is measurement noise.
Lemma 3.2. If Assumption 4 holds, then the full-state and compressed DMD matrices A X from Eq. (1) and A Y from Eq. (10) are related as: Proof. We first use Assumption 4 to derive a useful identity: Then The above lemma allows us to prove the following theorem, which establishes the central connection between DMD on full and compressed data.
Then Cφ x is an eigenvector of A Y with the same eigenvalue λ.
Then starting with Lemma 3.2, we find that so Cφ x is an eigenvector of A Y with eigenvalue λ.
For real data, it is likely that DMD eigenvectors of A X will be approximately in column space of X, since X and X will generally have similar column spaces; however, if there is measurement error, the two matrices will not have identical column spaces. If C is chosen poorly, so that φ x is in the null-space of C, then Theorem 3.3 applies trivially. Theorem 3.3 does not guarantee that every eigenvector φ y of A Y is the projection of an eigenvector of A X through C. The rank r of X is guaranteed to be less than or equal to m, the number of columns, when n ≥ m. In many cases, r is chosen by a threshold criteria to truncate noise [22], or to capture 99% of the variance in X. If the noise-truncated rank r is equal to m, then it may be necessary to collect more snapshots in order to adequately sample the attractor.
The assumption of low-rank structure is implicit in most dimensionality reduction strategies. For compressed sensing DMD, we also require that there are at least as many output measurements p as the rank r, so that p ≥ r. As long as the columns of X are not in the null-space of C and the rank of Y = CX is also r, then the r nontrivial DMD eigenvalues Λ Y of Y, Y will be be equal to Λ X . Similarly, the projected (compressed) DMD eigenvectors are related to full-state DMD eigenvectors as described in Theorem 3.3. Corollary 1. Given Assumptions 1-4, we may reconstruct φ x from the eigenvector φ y = Cφ x = CΨφ s from Theorem 3.3 by compressed sensing.
Remark 3. Even starting with full-state measurements X, X , it is beneficial to compress, compute the DMD, and reconstruct the modes according to: We refer to this as compressed DMD, as opposed to the compressive-sampling DMD above. Under certain conditions, compressed DMD is equivalent to DMD.
Theorem 3.4. If Assumption 4 holds and the column space of X is contained in that of X, then the compressed DMD modes are DMD modes.
Proof. By assumption, we have U X U * X X = X . Together with Eq. (12), which follows from Assumption 4, this allows us to write Thus, the columns ofΦ X are eigenvectors of A X , making them DMD modes (as well as compressed DMD modes).

Remark 4.
It is important to note that neither Lemma 3.2 nor Theorem 3.3 require exact equality in Eqs. (1) and (10). However, often X ≈ A X X, where A X = X X † X V X Σ −1 X U * X . In this case, we require that (CX) † ≈ X † C * , which holds either when C satisfies the Johnson-Lindenstrauss theorem, or X = ΨS and CΨ satisfies the restricted isometry property, as in the next section.

3.2.
Invariance of DMD to unitary transformations. In this section, we show that the dynamic mode decomposition is invariant to left and right unitary transformations of the data X and X . We first assume that we have an n × n unitary measurement matrix C with the number of measurements p equal to n. We will then relax the condition of C being a square unitary measurement matrix, and instead require that our data is sparse in some basis and the p n measurements are incoherent with respect to that basis.
This section relies on the fact that the singular value decomposition of Y = CX is related to the singular value decomposition of X = UΣV * if C is unitary: Now CU are the left-singular vectors of Y. Similarly, if Y = XP for unitary P, then Y = UΣV * P. This is discussed in more detail in the Appendix.
Theorem 3.5. The DMD eigenvalues and eigenvectors are invariant to righttransformations P of the columns of X and X if P is unitary.
Proof. Let P be a m × m unitary matrix that acts on the columns of X and X as: Y = XP, and Y = X P. The four steps of the exact DMD algorithm proceed as: Therefore, the DMD eigenvalues and eigenvectors are invariant to right-unitary transformations of the data.  Proof. Again, let Y = CX and Y = CX . Exact DMD proceeds as follows: Corollary 3. The DMD computed in the spatial domain is related to the DMD computed in the Fourier domain or in a coordinate system defined by principle components. Both the discrete Fourier transform F and the principal component coordinate transformation U are unitary transformations, and so Theorem 3.6 applies with C = F or C = U, respectively. The DMD eigenvalues will be unchanged, and the DMD eigenvectors will be projected into the new coordinates.
Now we may consider data X, X that are sparse in some basis Ψ, and relax the condition that C is unitary. Instead, C must be incoherent with respect to the sparse basis, so that the product CΨ satisfies a restricted isometry property with small constant δ K . This allows us to compute DMD on the compressed data Modes Figure 2. Schematic of compressive-sampling and compressed DMD as they relate to data X, X and compressed data Y, Y . C is a projection down to measurements that are incoherent with respect to sparse basis.
(Y, Y ) = (CX, CX ) = (CΨS, CΨS ) and reconstruct full-state DMD modes by compressed sensing on Φ Y = CΦ X = CΨΦ S . The invariances above persist if CΨ acts as an isometry on sparse vectors and the columns of X, X are sparse in Ψ. This can be verified by working through the method of snapshots in the Appendix on the compressed data.

Various approaches and algorithms.
There are a number of algorithms that arise from various paths in Figure 2 depending on what data we have access to. The primary paths are: Path 1B (compressed DMD) and Path 2B (compressed sensing DMD). A schematic of the data flow is shown in Fig. 3.

Path 1:
We start with full-state data X, X . Option A. Compute DMD to obtain (Λ X , Φ X ). Option B. Compress data first: struct Φ S and then construct Φ X = ΨΦ S . Path 2: We only have compressed data Y, Y .
Option A. First reconstruct X, X using compressed sensing.
(i) Perform 1 -minimization on Y = CΨS to solve for S, and hence X (same for X ). (ii) Compute DMD on X (or S). Option B. Compute DMD on compressed data and only reconstruct r modes using compressed sensing.

Reconstruct Modes
Therefore, the DMD eigenvalues and eigenvectors are invariant to right-unitary transformations of the data.
Corollary 2. The DMD eigenvalues and eigenvectors are invariant to permutations of the columns of X and X 0 .
Theorem 3.5. The DMD eigenvalues are invariant to left-transformations C of data X if C is unitary, and the resulting DMD modes are projected through C.
Proof. Again, let Y = CX and Y 0 = CX 0 . Exact DMD proceeds as follows: Therefore, the DMD modes Y are the projection of X through C: Y = C X . Corollary 3. The DMD computed in the spatial domain is related to the DMD computed in the Fourier domain or in a coordinate system defined by principle components. Both the discrete Fourier transform F and the principal component coordinate transformation U are unitary transformations, and so Theorem 3.5 applies with C = F or C = U, respectively. The DMD eigenvalues will be unchanged, and the DMD eigenvectors will be projected into the new coordinates. Now we may consider data X, X 0 that are sparse in some basis , and relax the condition that C is unitary. Instead, C must be incoherent with respect to the sparse basis, so that the product C satisfies a restricted isometry principle. This allows us to compute DMD on the compressed data (Y, Y 0 ) = (CX, CX 0 ) = (C S, C S 0 ) and reconstruct full-state DMD modes by compressive sampling on

Various approaches and algorithms.
There are a number of algorithms that arise from various paths in Figure 2 depending on what data we have access to. The primary paths are: Path 1B (compressed DMD) and Path 2B (compressive sampling DMD).

Path 1:
We start with full-state data X, X 0 . Option A. Compute DMD to obtain (⇤ X , X ). Option B. Compress data first: iv) [alternative to (iii)] Perform`1-minimization on Y = C S to reconstruct S and then construct X = S . Path 2: We only have compressed data Y, Y 0 .
Option A. First reconstruct X, X 0 using compressive sampling. (i) Perform`1-minimization on Y = C S to solve for S, and hence X (same for X 0 ). (ii) Compute DMD on X (or S).
Proof. Again, let Y = CX and Y = CX . Exact DMD proceeds as follows: Therefore, the DMD modes Y are the projection of X through C: Y = C X . Corollary 3. The DMD computed in the spatial domain is related to the DMD computed in the Fourier domain or in a coordinate system defined by principle components. Both the discrete Fourier transform F and the principal component coordinate transformation U are unitary transformations, and so Theorem 3.5 applies with C = F or C = U, respectively. The DMD eigenvalues will be unchanged, and the DMD eigenvectors will be projected into the new coordinates. Now we may consider data X, X 0 that are sparse in some basis , and relax the condition that C is unitary. Instead, C must be incoherent with respect to the sparse basis, so that the product C satisfies a restricted isometry principle. This allows us to compute DMD on the compressed data (Y, Y 0 ) = (CX, CX 0 ) = (C S, C S 0 ) and reconstruct full-state DMD modes by compressive sampling on

Various approaches and algorithms.
There are a number of algorithms that arise from various paths in Figure 2 depending on what data we have access to. The primary paths are: Path 1B (compressed DMD) and Path 2B (compressive sampling DMD).
Path 1: We start with full-state data X, X 0 . Option A. Compute DMD to obtain (⇤ X , X ). Option B. Compress data first: iv) [alternative to (iii)] Perform`1-minimization on Y = C S to reconstruct S and then construct X = S . Path 2: We only have compressed data Y, Y 0 .
Option A. First reconstruct X, X 0 using compressive sampling. (i) Perform`1-minimization on Y = C S to solve for S, and hence X (same for X 0 ). (ii) Compute DMD on X (or S).   f. Again, let Y = CX and Y 0 = CX 0 . Exact DMD proceeds as follows: refore, the DMD modes Y are the projection of X through C: Y = C X . ollary 3. The DMD computed in the spatial domain is related to the DMD puted in the Fourier domain or in a coordinate system defined by principle ponents. Both the discrete Fourier transform F and the principal component dinate transformation U are unitary transformations, and so Theorem 3.5 apwith C = F or C = U, respectively. The DMD eigenvalues will be unchanged, the DMD eigenvectors will be projected into the new coordinates.
ow we may consider data X, X 0 that are sparse in some basis , and relax condition that C is unitary. Instead, C must be incoherent with respect to sparse basis, so that the product C satisfies a restricted isometry principle. s allows us to compute DMD on the compressed data (Y, Y 0 ) = (CX, CX 0 ) = S, C S 0 ) and reconstruct full-state DMD modes by compressive sampling on = C X = C S .
Various approaches and algorithms. There are a number of algorithms arise from various paths in Figure 2 depending on what data we have access The primary paths are: Path 1B (compressed DMD) and Path 2B (compressive pling DMD). h 1: We start with full-state data X, X 0 . ption A. Compute DMD to obtain (⇤ X , X ). ption B. Compress data first: iv) [alternative to (iii)] Perform`1-minimization on Y = C S to reconstruct S and then construct X = S . h 2: We only have compressed data Y, Y 0 . ption A. First reconstruct X, X 0 using compressive sampling.
(i) Perform`1-minimization on Y = C S to solve for S, and hence X (same for X 0 ). (ii) Compute DMD on X (or S).  As a general rule, if full data X, X is available, Path 1B (i)-(iii) is the most efficient option. This is because the most expensive step, consisting of an SVD on full data, is replaced by an SVD on compressed data; the computational cost of the SVD is reduced from O(nm 2 ) (full SVD) to either O(pm 2 ) (if p > m) or O(mp 2 ) (if p < m). We refer to this as compressed DMD. If only the compressed data Y, Y is available, Path 2B is more computationally efficient than Path2A, since there will be fewer DMD modes than columns of Y and Y , resulting in fewer 1 minimization problems. We refer to this as compressed sensing DMD. Path 2 is, in general, quite computationally expensive, especially for large problems, because of the 1 -minimization. However, when full-state data is unavailable, such as for examples in ocean or atmospheric sampling problems, the cost of compressed sensing is secondary, since the alternative is a prohibitively expensive NP-hard brute force search for DMD modes. Note that alternative (iv) in Path 1B is presented for completeness, although it is not recommended because of the unnecessary expense of 1 -minimization when full-state data is available.

4.
Results. We illustrate the above methods on two example problems that are relevant for fluid dynamics. In the first example, we construct a spatially evolving system that has sparse dynamics in the Fourier domain. In the second example, we consider the time-varying double-gyre, which has been used as a model for ocean mixing. Both examples are coded in Matlab, using the built-in svd command, which is based on LAPACK routines [2]. In each example, the sparse basis Ψ is the two-dimensional inverse discrete Fourier transform (iDFT2) matrix.
4.1. Example 1: Sparse linear system in Fourier domain. This system is designed to test to compressive DMD algorithms in a well-controlled numerical experiment. We impose sparsity by creating a system with K = 5 non-zero 2D spatial Fourier modes; all other modes are exactly zero. It is also possible to allow the other Fourier modes to be contaminated with Gaussian noise and impose a very fast stable dynamic in each of these directions. We then define a stable linear, timeinvariant dynamical system on the K modes. This is done by randomly choosing a temporal oscillation frequency and small but stable damping rate for each of the modes independently. In this way, we construct a system in the spatial domain that is a linear combination of coherent spatial Fourier modes that each oscillate at a different fixed frequency. Table 1 contains the specific values for this example. Figure 4 shows a snapshot of this system at t = 2. We see the five large Fourier mode coefficients generate distinct spatial coherent patterns. Figure 5 shows the five Fourier modes (real and imaginary parts) that contribute to the spatial structures in Figure 4. This example is constructed to be an ideal test case for compressed sampling dynamic mode decomposition. The linear time-invariant system underlying these dynamics is chosen at random, so the data matrix X will contain significant energy from many of the modes. Therefore, POD does not separate the spatial modes, as shown in Figure 6. Instead, POD extracts structures based on their variance in the data. Since POD is invariant to permutation of the columns, the underlying Table 1. Parameters of system in for Example 1, visualized in Figure 4. Individual modes are obtained by constructing an oscillating and decaying Fourier mode (with eigenvalue λ = d + iω), x(I, J) = e λt = e dt (cos(ωt) + i sin(ωt)), and taking the real part of the inverse Fourier transform:  temporal structure is not reflected in the POD structures. Since each of our Fourier modes is oscillating at a fixed and distinct frequency, this is ideal for dynamic mode decomposition, which isolates the spatially coherent Fourier modes exactly, as shown in Figure 7.
In the case of no background noise, the compressed sensing DMD algorithm (Path 2B in Sec. 3.3) works extremely well, as seen in the mode reconstruction in  Figure 9. DMD modes from compressed data, using starting with full-state snapshots, compressing, performing DMD, and then reconstructing using formula in Eq. (13) results in accurate reconstruction, shown in Figure 9. Both compressive-sampling DMD and compressed DMD match the true eigenvalues nearly exactly, as shown in Figure 10. The X, X data matrices are obtained on a 128 × 128 spatial grid from times 0 to 2 in increments of ∆t = 0.01, so that X ∈ R 16384×200 . We use p = 15 measurements; randomly placed single-pixel measurements and measurements obtained by taking the dot product of a Gaussian or Bernoulli random vector with the state vector x all yield accurate DMD eigenvalues and modes.  The accuracy of the compressed and compressed sensing DMD methods in reconstructing DMD modes and eigenvalues are summarized in Table 2. The accuracy of the standard DMD method is also included, since this example was constructed from a true low-rank linear model on FFT modes, providing an explicit solution.
Since the DMD modes in this example are in complex conjugate pairs, and since these modes have a phase difference from the true FFT modes, we compare the magnitudes. For each of the five modes, the correct FFT modal wavenumber is identified, and all other wavenumbers are zero. The mode error in Table 2 arises from small differences in the magnitude of the single non-zero wavenumber. The computational time of each method has also been benchmarked and summarized in Table 3. For this example, the compressed sensing method is still faster than standard DMD because of the sparsity of the solution vector.
When we add small to moderate amounts of background noise (2-5% RMS) in the Fourier domain, a number of things change. The modes and frequencies are still very well characterized by both methods of compressed DMD and compressivesampling DMD. However, the resulting DMD damping is often inflated; this effect is more pronounced when the full-state DMD eigenvalues are strongly damped. Recent results predict the change in DMD spectrum with small additive noise [3], and our results appear to be consistent. There are currently efforts to understand and correct for the effect of noise in the DMD computation [19,28]. However, if the goal of a DMD model is closed-loop feedback control, then one may desire some controller robustness with respect to uncertain damping rates. 4.2. Example 2: Double gyre flow. The double gyre is a simple two-dimensional model [59] that is often used to study mixing between ocean basins. The double-gyre flow is given by the following stream-function ψ(x, y, t) = A sin (πf (x, t)) sin(πy) which results in the following time-periodic vector field  Figure 11 illustrates the double gyre vector field, with color representing vorticity and arrows showing the vector field. The vorticity field is highly compressible, in that 99% of the Fourier coefficients may be zeroed with little effect on the reconstructed vorticity field (panel c).
The datasets X, X are constructed by sampling the vorticity (curl of velocity field) on a 512 × 256 spatial grid at times 0 to 10 with ∆t = 0.05. The POD modes for the double gyre data set are shown in Figure 12 (a). The DMD modes are shown in Figure 12 (b). Taking 2500 single pixel measurements (i.e., constructing Y and Y by sampling 2500 random rows of X and X ), which accounts for under 2% of the total pixels, the reconstructed modes are shown in Figure 12 (c) and (d). The modes exhibit good agreement. Moreover, the compressive DMD eigenvalues are close to the exact DMD eigenvalues, shown in Figure 13.
A quantitative comparison of the compressed DMD eigenvalues with the exact DMD eigenvalues is provided in Fig. 14. Only the first five modes are shown, since modes 6-9 are conjugate pairs with modes 1-4. The agreement between eigenvalues is surprisingly close, even for p = 10 compressed measurements. The compressed DMD eigenvalues continue to converge as the number of measurements p is increased. It is interesting to note that the lower frequency modes have significantly smaller errors in the eigenvalue approximations. Similar convergence plots of the L 2 error of the compressed DMD modes are shown in Fig. 15.
The compressed sensing DMD method yields the same eigenvalues as the compressed DMD algorithm, so Fig. 14 also applies to both methods. The convergence plots of the L 2 error of the compressed sensing DMD modes are shown in Fig. 16 for increasing numbers of measurements p. The convergence of modes is much slower than in compressed DMD, since full-state snapshots are not available for reconstruction. However, this is consistent with results from the compressive sensing literature. For the CoSaMP algorithm [42], we use 10 iterations and a desired sparsity of K ∼ p/4(1 + log 10 n − log 10 p), approximated from p ∼ 4K log 10 (n/K).
A benchmark comparison of the computational efficiency of the various algorithms is provided in Fig. 17. The compressed DMD algorithm is 30-40 times faster than the standard DMD algorithm for a small number of measurements (p = 10 to p = 1250), although the method is only 13 times faster for p = 10000 measurements.   Compressed sensing DMD is faster than DMD for a small number of measurements (p < 300), though it rapidly becomes inefficient, as is generally the case with compressed sensing. However, if full-state measurements are unavailable, the additional computational cost may be justified to approximate full-state DMD modes. All results in Figs. 14-17 use p randomly placed single-pixel measurements. We have also investigated Gassian random compression (i.e., >>C=randn(p,n); in Matlab) and uniform random compression (i.e., >>C=rand(p,n); in Matlab) of the data X, which yield similar results. Although the compressed measurements are chosen randomly in this work, there are recent studies that investigate optimal sensor placement for improved sparse approximation [11]. Compressed sensing works well with random measurements, because aliasing effects is highly unlikely.
DMD has been widely used for data generated by fluid dynamical systems, because of the ability to isolate spatially coherent modes that oscillate at a fixed frequency in time. For example, in [64], DMD modes are used to understand interactions between the shear layer and wake behind a finite-thickness flat plate.
Recently, DMD has also been effectively used to find background structures by identifying DMD modes corresponding to λ = 1 (in discrete time) [25]. The DMD analysis on the double gyre flow identifies such a coherent background mode, corresponding to Mode 5. Without the DMD analysis, it would be unclear that this is coherent and unchanging in time.    5. Discussion. In this work we have developed a new method that leverages compressed sensing to compute the dynamic mode decomposition (DMD) for spatially subsampled or compressed data. There are two key advances resulting from this work. First, it is possible to reconstruct full-state DMD modes from heavily subsampled or compressed data using compressed sensing. This is called compressivesampling DMD, and is illustrated in Sec. 3.3 and Figs. 2-3 as Path 2B. Second, if full-state snapshots are available, it is possible to first compress the data, perform DMD, and then reconstruct by taking a linear combination of the snapshot data, determined by the DMD on compressed data. This is denoted by compressed DMD, and is explained in Path 1B in Sec. 3.3 and Figs 2-3. The theory relies on relationships between DMD on full-state and compressed data established in Sec. 3.1. We also show that DMD is invariant to left and right unitary transformations. We then relax this condition and use the restricted isometry property that is satisfied when incoherent measurements are applied to a signal that is sparse in some basis.
Both of these methods are demonstrated to be effective on two example problems with relevance to fluid dynamics and oceanographic/atmospheric sciences. In the first example, a low-order linear dynamical system is evolved on a few Fourier coefficients, establishing a high-dimensional, time-varying spatial flow with underlying low-rank patterns. In the second example, we consider the double gyre flow, which is a model for ocean mixing. The accurate and efficient reconstruction of the DMD eigenvalues and modes are demonstrated on both examples using few spatial measurements and compressed/compressive-sampling DMD.
There are a number of interesting directions that arise from this work. First, it will be a natural extension to apply these methods to high-dimensional systems in fluid dynamics and to oceanographic/atmospheric measurements. The reduced burden of spatial sampling may also allow for increased temporal sampling rates in particle image velocimetry (PIV), similar to recent advances in MRI [50,45,57]. In PIV, the data transfer from camera to RAM is a limiting factor, although one may envision compressing the full-state data before transferring to memory. It is also important to characterize and address the effect of noise on DMD [3,19,28], both full-state and compressed. The success of the compressed DMD method, even for very few measurements p, suggests that further theoretical developments are required, based on random matrix theory. Providing error bounds and theoretical limits for compressed DMD will be the subject of ongoing research. Finally, it may be possible to combine the spatial compressed sensing strategy advocated here with the temporal sampling strategy in [65] for greater efficiency gains.
Appendix: SVD and Unitary transformations. In this appendix we demonstrate that left or right-multiplication of a data matrix X by a unitary transformation preserves all of the terms in the singular value decomposition except for the corresponding left or right unitary matrix U or V, respectively. These matrices are simply multiplied by the new unitary transformation. 5.1. Method of snapshots. Typically when computing the SVD of a large data matrix X ∈ R n×m , with n m, we use the method of snapshots [58]. In this method, we compute the m × m matrix X * X and solve the following eigendecomposition: With V and Σ computed, it is possible to construct U: Thus, we have the singular value decomposition: X = U X Σ X V * X . We have added the subscript X to denote that the U, Σ, and V are computed from X.

5.2.
Left unitary transformation. Now consider the data matrix Y = CX, where C is unitary. We find that Y * Y = X * C * CX = X * X, so the compressed data has the same eigendecompositon and results in the same V X and Σ X . Now, constructing U Y , we have: U Y = YV X Σ −1 X = CXV X Σ −1 X = CU X . Therefore, we have the following singular value decomposition: Y = CU X Σ X V * X . In other words, CU X are the new left-singular vectors.

Right unitary transformation.
Similarly, consider Y = XP * , where P is a unitary matrix. We find Y * Y = PX * XP * = PV X Σ 2 X V * x P * . This results in the following eigendecomposition: X . Therefore, V Y = PV X and Σ Y = Σ X . It is then possible to construct U Y : The singular value decomposition Y = U X Σ X V * X P * has PV X as the new rightsingular vectors.