MODULO: A software for Multiscale Proper Orthogonal Decomposition of data

In the era of the Big Data revolution, methods for the automatic discovery of regularities in large datasets are becoming essential tools in applied sciences. This article presents an open-source software package, named MODULO (MODal mULtiscale pOd), to perform the Multiscale Proper Orthogonal Decomposition (mPOD) of numerical and experimental data. This novel decomposition combines Multi-resolution Analysis (MRA) and standard Proper Orthogonal Decomposition (POD) to allow for the optimal compromise between decomposition convergence and spectral purity of its modes. The software is equipped with a Graphical User Interface (GUI) and enriched by numerous examples and video tutorials (see the YouTube channel MODULO mPOD).


Motivation and significance
Data-driven modal analysis aims at decomposing a dataset as a linear combination of elementary contributions called modes. This provides the fundamental framework for many areas of applied mathematics and related applications, including pattern recognition, machine learning, data compression, filtering and model order reduction [1,2]. A linear decomposition consists of projecting the dataset onto a suitable space, spanned by a basis that is, hopefully, better capable of capturing the essential features of the data.
A mode is computed by projecting the data onto a certain element of a basis. In most engineering applications, the data results from the discretization, or sampling, of a physical quantity (e.g. grayscale entries in images, pressure fields in a fluid flow simulation or stress fields in solid mechanics) over a spatial discretization x_i and a temporal discretization t_k. Each of the modes produced by a decomposition has its own spatial and temporal structure.
In pattern recognition, one aims at linking modes to specific patterns of interest [3,4]. In filtering or data compression, one aims at removing modes that describe unwanted features or modes that do not significantly contribute to the data [5,6]. In machine learning, this is often a fundamental preprocessing step for many supervised or unsupervised problems [7]. In model order reduction, one aims at projecting partial differential equations onto the space spanned by a few of the leading modes [8,9], thus significantly reducing the computational complexity of a problem and eventually enabling system identification methods and control [10,11].
The convergence optimality of the POD comes at the cost of setting no constraints on the frequency content of the structures constituting its modes. In many applications, however, it is of interest to have harmonic decompositions to facilitate physical interpretability. In fluid mechanics, this need has motivated the development of an alternative decomposition known as Dynamic Mode Decomposition (DMD, [15]), which extends the Fourier decomposition to the data-driven paradigm: frequencies are identified from the dataset and not imposed a priori.
There exist, nevertheless, cases for which both convergence optimality and spectral purity yield poor feature detection capabilities [16]. The lack of frequency constraints in the POD occasionally results in modes that capture phenomena occurring at very different scales. On the other hand, the constraint of purely harmonic modes prevents time-frequency localization and yields convergence problems for datasets that are not strictly periodic.
The Multiscale Proper Orthogonal Decomposition proposed in [16] overcomes the limitations of both approaches with a hybrid method that combines their advantages. This decomposition has already been successfully used in various experimental [17,18,19,20] and numerical works [21,16]. The software package MODal mULtiscale pOd (MODULO) described in this work is a Matlab-based software developed at the von Karman Institute for Fluid Dynamics to perform mPOD, POD and DFT on numerical and experimental data. Equipped with a Graphical User Interface (GUI), an executable, and a set of exercises and video tutorials, MODULO does not require direct interaction with the source code and can thus serve applied scientists who prefer not to deal with programming technicalities. In what follows, we refer the reader to each of the eight video tutorials in MODULO's YouTube channel for a more detailed review of the software features.

Theoretical Background
A brief description of the theoretical background of the code is presented here; for more details, the reader is referred to [16] and to the first three video tutorials in the YouTube channel. As described in the first video, all the modal decompositions implemented are matrix factorizations of the form

D = Φ Σ Ψ†   (1)

where D is the data matrix to decompose, here assumed to be a function of space (x_i) and time (t_k). Regardless of the dimensionality (e.g. 2D or 3D, scalar or vector) of the data, we here assume that every temporal realization (snapshot) of the data is flattened into a vector and stored as a column of D. Every column of D is thus a function of x_i and every row is a function of t_k. This matrix is thus of size n_s × n_t, with n_s the number of spatial points and n_t the number of temporal realizations (snapshots), and has rank R = rank(D) ≤ N = min(n_s, n_t).
The matrices Φ ∈ C^(n_s×R) and Ψ ∈ C^(n_t×R) collect the spatial and the temporal structures (bases), and Σ = diag[σ_1, σ_2, . . . , σ_R] ∈ R^(R×R) is the diagonal matrix collecting their importance (amplitude). More generally, if other independent variables are considered instead of space and time, Φ and Ψ contain a basis for the columns and the rows of D, respectively.
The bold notation (x_i or i) denotes linear matrix indices. In the current version of the code, it is assumed that the data has uniform sampling both in space and in time, although the latter constraint can be relaxed for the POD. Consider, for example, a 2D velocity field from planar Particle Image Velocimetry (PIV) sampled over a grid x_i = [x_i, y_j] containing 128 × 128 points. Flattening the entire velocity field into the columns of D, the number of spatial points is n_s = 2 × 128 × 128 = 32768, having concatenated both vector components into a single snapshot.
The matrix D is constructed by assigning every k-th snapshot d_k[i] to a column of D. Here k is the index over the time discretization t_k = (k − 1)Δt, sampled at a frequency f_s = 1/Δt with k ∈ [1, n_t]. As the spatial structures φ_r[i] are columns of Φ, and the temporal structures ψ_r[k] are columns of Ψ, eq. (1) can be written as a dyadic expansion:

D = Σ_{r=1}^{R} σ_r φ_r ψ_r†.   (2)

If the summation is truncated at r < N, an approximation of the original data is obtained. Since Σ is a diagonal matrix, and both the spatial and temporal structures have unitary norm, ||φ_r|| = ||ψ_r|| = 1 ∀r ∈ [1, R], the decomposition can be completed once either Φ or Ψ is given. The decompositions implemented in MODULO, namely the POD, the DFT and the mPOD, compute the temporal structures Ψ first. All these decompositions have an orthonormal temporal basis (Ψ^(−1) = Ψ†, with † denoting the Hermitian transpose). Hence, the spatial structures can be easily computed as

Φ Σ = D Ψ,   (3)

where the diagonal matrix Σ follows from the normalization of the spatial structures, that is σ_r = ||D ψ_r||. The theoretical background for DFT and POD is provided in the second video tutorial, while the third video tutorial is dedicated to the mPOD. The computation of the temporal structures for these three decompositions proceeds as follows.
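The completion of the decomposition from a given orthonormal temporal basis can be sketched in a few lines of NumPy (Python is used here purely for illustration; MODULO itself is Matlab-based, and the matrix sizes and random dataset are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n_s, n_t = 200, 64
D = rng.standard_normal((n_s, n_t))  # synthetic snapshot matrix

# Any orthonormal temporal basis Psi completes the decomposition;
# here we take the right singular vectors of D purely for illustration.
Psi = np.linalg.svd(D, full_matrices=False)[2].conj().T  # n_t x R

Phi_Sigma = D @ Psi                        # projection, as in eq. (3)
sigma = np.linalg.norm(Phi_Sigma, axis=0)  # sigma_r = ||D psi_r||
Phi = Phi_Sigma / sigma                    # unit-norm spatial structures

# the dyadic expansion of eq. (2) reconstructs D exactly (R = n_t here)
D_rec = Phi @ np.diag(sigma) @ Psi.conj().T
assert np.allclose(D, D_rec)
```

Note that the normalization step is precisely what produces the amplitude matrix Σ, regardless of which decomposition supplied Ψ.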
-DFT. The temporal basis is the well-known Fourier matrix, which can be written as

Ψ_F = {ψ_F,n}_{n=1}^{n_t}, with ψ_F,n[k] = exp(2πi (k−1)(n−1)/n_t)/√n_t .

The construction of this matrix is independent of the data; hence the DFT is not data-driven. In practice, the multiplication by the Fourier matrix is carried out using the FFT algorithm.
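The orthonormality of this basis and its link to the FFT can be verified with a short NumPy sketch (an illustration only, with 0-based indices and the 1/√n_t normalization assumed to make the basis unitary):

```python
import numpy as np

n_t = 32
k = np.arange(n_t)
# Fourier temporal basis: psi_n[k] = exp(2*pi*1j*n*k/n_t)/sqrt(n_t)
Psi_F = np.exp(2j * np.pi * np.outer(k, k) / n_t) / np.sqrt(n_t)

# the basis is orthonormal (Psi_F^{-1} = Psi_F^dagger) and data-independent
assert np.allclose(Psi_F.conj().T @ Psi_F, np.eye(n_t))

# projecting onto Psi_F is equivalent to an FFT, up to scaling
x = np.random.default_rng(1).standard_normal(n_t)
assert np.allclose(Psi_F.conj().T @ x, np.fft.fft(x) / np.sqrt(n_t))
```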
-POD. The temporal basis is computed from the eigenvalue decomposition of the matrix K = D^T D:

K Ψ_P = Ψ_P Λ,   (4)

with Λ = diag[λ_1, . . . , λ_R] and σ_r = √λ_r. The matrix K is known as the temporal correlation matrix in the fluid mechanics community, and the approach implemented is known as Sirovich's snapshot method [22].
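The snapshot method can be sketched as follows (an illustrative NumPy re-implementation on synthetic data, not MODULO's Matlab source; the cross-check against the SVD of D is added for verification):

```python
import numpy as np

rng = np.random.default_rng(2)
n_s, n_t = 300, 40
D = rng.standard_normal((n_s, n_t))

K = D.T @ D                       # temporal correlation matrix, eq. (4)
lam, Psi_P = np.linalg.eigh(K)    # eigenvalues in ascending order
idx = np.argsort(lam)[::-1]       # sort by decreasing energy
lam, Psi_P = lam[idx], Psi_P[:, idx]

Sigma = np.sqrt(lam)              # amplitudes sigma_r = sqrt(lambda_r)
Phi = (D @ Psi_P) / Sigma         # spatial structures via eq. (3)

# cross-check: the amplitudes match the singular values of D
s = np.linalg.svd(D, compute_uv=False)
assert np.allclose(Sigma, s)
```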
-mPOD. The temporal basis is computed via a combination of multi-resolution analysis and eigenvalue decomposition. The fundamental idea of the mPOD is to perform a POD at different scales, each retaining non-overlapping portions of the frequency spectrum. For example, assuming that the dataset is sampled at f_s = 1000 Hz, one might decide to separate phenomena occurring in the ranges [0 − 100] Hz, [100 − 300] Hz and [300 − 500] Hz and perform a POD on each of these independently. As described in [16], this could be done by using a filter bank, defined by a frequency splitting vector F_V = [100, 300] Hz, to break the dataset into three contributions, and performing the POD on each of these separately. To reduce the computational cost of this operation, the mPOD performs the MRA on the temporal correlation matrix K = D^T D. Given a set of suitable transfer functions {H_m}_{m=1}^{M}, with M the number of scales to identify, the mPOD breaks the correlation matrix as

K = Σ_{m=1}^{M} K_m = Σ_{m=1}^{M} Ψ_F (K̂ ⊙ H_m) Ψ̄_F,   (5)

where K̂ = Ψ̄_F K Ψ_F is the 2D Fourier transform of the correlation matrix and ⊙ is the Schur product, that is, the entry-by-entry multiplication.
The filtering operation is designed to preserve the key properties of K: each contribution K_m is symmetric and positive semi-definite and thus equipped with a set of orthonormal eigenvectors (POD modes) and non-negative real eigenvalues:

K_m = Ψ_m Λ_m Ψ_m†, with Λ_m = diag[λ_{m,1}, . . . , λ_{m,n_m}, 0, . . . , 0],   (6)

where n_m is the number of non-zero eigenvalues at each scale. These spectral constraints impose that a mode (eigenvector) having frequency content in one scale has no frequency content in the others. Therefore, it is possible to show that the eigenvectors of all the scales are orthogonal complements that span the entire R^(n_t) space, that is Σ_{m=1}^{M} n_m ≈ n_t. The mPOD basis is then constructed by collecting the POD bases of all scales, sorted by amplitude:

Ψ_M = [Ψ_1 Ψ_2 . . . Ψ_M] P_Σ,   (7)

with P_Σ a permutation matrix that ranks the structures in decreasing order of energy contribution. This decomposition generalizes the POD and the DFT: for M → 1, the mPOD is a standard POD; in the limit M → n_t, the mPOD is a DFT.
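A toy NumPy re-implementation of this procedure is sketched below on synthetic data, continuing the F_V = [100, 300] Hz example. It is a deliberately simplified variant: the transfer functions are idealized sharp masks of the form H_m = h_m h_m^T (so cross-scale terms of K̂ are discarded), whereas MODULO uses smooth FIR filters; all names and tolerances here are illustrative assumptions.

```python
import numpy as np

def band_mask(n_t, f_lo, f_hi, f_s):
    """Indicator of |f| in [f_lo, f_hi) on the FFT frequency grid."""
    f = np.abs(np.fft.fftfreq(n_t, d=1.0 / f_s))
    return ((f >= f_lo) & (f < f_hi)).astype(float)

def mpod_basis(D, F_V, f_s):
    """Toy mPOD: split K = D^T D by sharp frequency masks H_m = h_m h_m^T,
    diagonalize each K_m and collect the eigenvectors (cf. eqs. (5)-(7))."""
    n_t = D.shape[1]
    K = D.T @ D
    K_hat = np.fft.fft2(K)                  # 2D Fourier transform of K
    edges = [0.0, *F_V, f_s / 2 + 1.0]      # scale boundaries from F_V
    Psi_cols, lam_all = [], []
    for f_lo, f_hi in zip(edges[:-1], edges[1:]):
        h = band_mask(n_t, f_lo, f_hi, f_s)
        H = np.outer(h, h)                  # idealized transfer function
        K_m = np.fft.ifft2(K_hat * H).real  # filtered correlation K_m
        K_m = 0.5 * (K_m + K_m.T)           # symmetrize round-off
        lam, Psi_m = np.linalg.eigh(K_m)
        keep = lam > 1e-8 * max(lam.max(), 1e-12)  # n_m non-zero eigenvalues
        Psi_cols.append(Psi_m[:, keep])
        lam_all.append(lam[keep])
    Psi_M = np.hstack(Psi_cols)
    lam_all = np.concatenate(lam_all)
    order = np.argsort(lam_all)[::-1]       # role of the permutation P_Sigma
    return Psi_M[:, order], lam_all[order]

rng = np.random.default_rng(5)
D = rng.standard_normal((400, 64))
Psi_M, lam = mpod_basis(D, F_V=[100.0, 300.0], f_s=1000.0)
# the collected basis is orthonormal across scales, as eq. (6) implies
assert np.allclose(Psi_M.T @ Psi_M, np.eye(Psi_M.shape[1]), atol=1e-7)
```

Because the bands do not overlap, eigenvectors of different scales have disjoint frequency support and the collected basis is orthonormal, which is what the assertion checks.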

Software Architecture
MODULO has a minimal Graphical User Interface (GUI), shown in Figure 1, that opens after launching the executable. The decomposition process can be followed from the toolbar in the upper part of the main menu. An overview of MODULO's GUI is given in the fourth video tutorial. In this first version of the software, the data is assumed to be 1D or 2D: the first row of non-editable tabs shows the number of points along the x-axis (Nx), the y-axis (Ny), the total number of spatial points (Ns) and the number of time steps (Nt). In a 1D test case, it is assumed that Ns=Nx (Ny=0); in a 2D scalar test case one has Ns=Nx×Ny, while a 2D vectorial test case yields Ns=2×Nx×Ny.
The dataset can be imported from the menu Import Data, which allows for two options: embedded mesh or separated mesh. The first option should be used if the mesh grid is stored in each of the data files; the second option should be used if the mesh is stored in another file.
Before importing the data, the user can mean-center all the snapshots, i.e., remove the average column (the time average if the row domain is linked to time) from the snapshot matrix D. This option can be useful for plotting purposes in the DFT and is generally recommended for POD and mPOD of statistically stationary datasets. Once the files are loaded, the Region of Interest (RoI) GUI shown in Figure 2 opens. This GUI allows for setting the portion of the spatial domain that will be used in the decomposition. By default, MODULO considers the entire domain, but the user can introduce the ranges along the horizontal (x_L, x_R) and the vertical (y_L, y_R) axes. The upper-left pop-up menu allows for selecting and previewing any of the imported snapshots. In the case of vector datasets, the parameters on the left allow for adjusting the quiver plot in terms of spacing (delta_x, delta_y) and arrow length (Scale), while the bottom tick boxes can be used to flip the axes or adjust the axis aspect ratio to 1 : 1. The settings used at this step will also be used when exporting the spatial structures of the decomposition.
Once these parameters are set, the user can either use the RESET button to restore the default values, or the DONE button to proceed with the import of the data and the preparation of the matrix D. The Memory saving option is described in Section 2.3.1.

Process
From the Process menu, the user can select which of the three decompositions described in Section 2 is to be performed. When choosing the POD or DFT algorithms, the user is asked to introduce the sampling frequency f_s and the extremes of the range of modes to be exported. For all the decompositions, this indexing assumes that the modes are always exported in decreasing order of amplitude, even if the DFT and the mPOD are not energy-based. Consequently, for the DFT and the mPOD, the need for ordering the modes requires the calculation of the complete basis regardless of the number of exported modes, while for the POD only the modes to be exported are computed.
MODULO is a dimensionless software; hence the units of the sampling frequency need not be specified: digital frequency bins are computed as Δf = f_s/n_t and the frequency domain is f_n ∈ [−f_s/2, f_s/2]. Once these parameters are introduced, the decomposition begins in the case of DFT and POD, and a wait bar indicates the progress of the calculation. The fifth and sixth video tutorials are dedicated to the POD and the DFT, respectively.
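The frequency-axis construction can be mirrored in a few lines (a NumPy illustration with a hypothetical f_s = 1000 and n_t = 200; any consistent units work, since only ratios matter):

```python
import numpy as np

f_s, n_t = 1000.0, 200                 # hypothetical sampling setup
delta_f = f_s / n_t                    # frequency bin, Delta f = f_s/n_t
# frequency axis f_n in [-f_s/2, f_s/2), centered around zero
f_n = np.fft.fftshift(np.fft.fftfreq(n_t, d=1.0 / f_s))

assert np.isclose(f_n[1] - f_n[0], delta_f)   # uniform bins
assert np.isclose(f_n.min(), -f_s / 2)
assert np.isclose(f_n.max(), f_s / 2 - delta_f)
```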
In the case of mPOD, additional parameters should be introduced, and a dedicated GUI appears to support the user in this process. This GUI is described in the seventh video tutorial and is shown in figure 4.
The window shows a contour of the value of the cross-spectral density matrix on the left and its diagonal on the right. In the top-left editable box, the user introduces the sampling frequency Fs and the number of scales, excluding the largest. Assuming, for example, that one is interested in four scales, say [0 − 100] Hz, [100 − 200] Hz, [200 − 300] Hz and [300 − f_s/2] Hz, the number to introduce here is 3. The reason for excluding the first (largest) scale from the count is that this scale is always kept by default in MODULO, while the other scales can be removed if the software is used as a low-pass filter.
The parameters C-axis and Frequency axis limit can be modified for plotting purposes; the first changes the upper limit in the color axis of the contour plot, the second sets the frequency limits in the axes of both figures.
The last input parameter in the figure is the Modes cut-off, which controls the number of modes computed, as indicated in the non-editable text box below. A red line also indicates this percentage in the normalized spectrum (right plot). This number controls the number of modes, in each scale, that will be considered in the final mPOD basis: if this is set to, e.g., 10%, then only the modes that have at least 10% of the energy of the leading POD mode in each scale will be taken. Observe that this estimate is made before computing the decomposition, based on the transfer functions of the filters that will isolate the scales, and is thus only an estimate. It becomes available once the settings of the filters in each scale are introduced via the SET button. The first parameter is the frequency splitting vector: its entries collect the cut-off frequencies of each scale. In the previous example, this is F_V = [100, 200, 300] Hz (units are arbitrary). The second input is a vector containing the lengths of the filter kernels that will be used to compute each scale. These are FIR filters with a Hamming window. Finally, the last input allows for introducing the keep vector. This indicates the intermediate scales that will be kept and allows for using the mPOD also as a filter. In the previous example, if the keep vector is set to [1,1,0], then the highest scale [300 − f_s/2] Hz is removed from the decomposition. If the keep vector is set to [0,0,0], then the mPOD is equivalent to performing the POD of the dataset obtained by low-pass filtering to keep only the portion [0 − 100] Hz.
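The construction of a complementary kernel set from the frequency splitting vector can be sketched as follows (a windowed-sinc illustration with a Hamming window; the helper names `lowpass_kernel` and `filter_bank` are hypothetical, and MODULO's actual FIR design may differ in detail):

```python
import numpy as np

def lowpass_kernel(n_taps, f_cut, f_s):
    """Windowed-sinc low-pass FIR kernel (Hamming window), unit DC gain."""
    m = np.arange(n_taps) - (n_taps - 1) / 2
    h = (2 * f_cut / f_s) * np.sinc(2 * f_cut / f_s * m) * np.hamming(n_taps)
    return h / h.sum()                        # normalize DC gain to 1

def filter_bank(F_V, n_taps, f_s):
    """Kernels for each scale in F_V: each band-pass is the difference of
    two low-pass kernels; the last scale is the high-pass complement."""
    lows = [lowpass_kernel(n_taps, fc, f_s) for fc in F_V]
    impulse = np.zeros(n_taps)
    impulse[(n_taps - 1) // 2] = 1.0          # all-pass (delayed impulse)
    bank = [lows[0]]                          # largest scale, [0, F_V[0]]
    bank += [hi - lo for lo, hi in zip(lows[:-1], lows[1:])]
    bank.append(impulse - lows[-1])           # highest scale, [F_V[-1], f_s/2]
    return bank

F_V = [100.0, 200.0, 300.0]                   # Hz, as in the example above
bank = filter_bank(F_V, n_taps=101, f_s=1000.0)
# the kernels sum to a (delayed) unit impulse: the bank is complementary,
# so keeping all scales reconstructs the original signal
assert np.allclose(sum(bank), np.eye(101)[50])
```

Dropping entries of the bank before summing mimics the effect of the keep vector: the corresponding frequency band is simply filtered out.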
Once the frequency splitting vector, the lengths of the kernels and the keep vector are set, the RUN button that starts the computation is enabled. As for the DFT, the user inputs the range of modes to be exported only at the end of the decomposition.

Memory Saving
The largest matrix in every decomposition is the snapshot matrix D ∈ R^(n_s×n_t). Depending on the computational resources available, this matrix can be prohibitively large, and the cost of matrix products such as the correlation matrix in (4) or the projection in (3) might exceed the available RAM. To cope with this limitation, MODULO offers a 'Memory Saving' option, accessible from the RoI GUI shown in Figure 2. This option is described in the eighth video tutorial.
When this option is active, all decompositions are performed without loading the full snapshot matrix D in memory, but only a few partitions of it at a time. The calculation of the temporal correlation matrix K = D^T D, needed for POD and mPOD, is thus carried out from n_P 'column-wise' partitions of D. Each partition D_C,i is of size n_s × n_C, with n_C = n_t/n_P. These partitions are saved as temporary .mat files and loaded one at a time during the calculation of the correlation, which is performed in blocks. This limits the number of stored entries to n_s × n_C × 2, although at the cost of increasing the number of reading/writing operations. The calculation steps for this matrix using three blocks are illustrated in Figure 5. To limit the cost of the memory saving feature, MODULO takes advantage of the symmetry of the correlation matrix, computing only its upper triangular part while the remaining portion is mirrored.
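The block-wise assembly of K with the upper-triangular shortcut can be illustrated as follows (in-memory NumPy arrays stand in for MODULO's temporary .mat files; sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n_s, n_t, n_P = 500, 12, 3
n_C = n_t // n_P
D = rng.standard_normal((n_s, n_t))
# column-wise partitions of D (MODULO would stream these from disk)
parts = [D[:, i * n_C:(i + 1) * n_C] for i in range(n_P)]

# assemble K = D^T D block by block, computing only the upper triangle
K = np.zeros((n_t, n_t))
for i in range(n_P):
    for j in range(i, n_P):   # j >= i: upper-triangular blocks only
        K[i * n_C:(i + 1) * n_C, j * n_C:(j + 1) * n_C] = parts[i].T @ parts[j]
# mirror the strictly lower triangle from the upper one (K is symmetric)
K = np.triu(K) + np.triu(K, 1).T

assert np.allclose(K, D.T @ D)
```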
A similar approach is pursued in computing (3), namely the final projection step of every decomposition. In this case, D is split into n_P 'row-wise' partitions D_R,j of size n_R × n_t, with n_R = n_s/n_P, and the projection is carried out independently in each portion. The calculation steps using three blocks are illustrated in Figure 6. The resulting 'row-wise' blocks are then assembled back into 'column-wise' partitions to compute the amplitude of each mode via column normalization of the matrix ΦΣ.

Figure 6: Calculation of the final projection in (3) using three partitions to limit memory requirements. The final 'row-wise' blocks of ΦΣ must be regrouped into 'column-wise' blocks for the normalization step.
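The row-wise variant of the partitioned projection can be sketched in the same illustrative NumPy style (again, sizes and the random orthonormal basis are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(4)
n_s, n_t, n_P = 600, 20, 3
n_R = n_s // n_P
D = rng.standard_normal((n_s, n_t))
# any orthonormal temporal basis (QR of a random matrix, for illustration)
Psi = np.linalg.qr(rng.standard_normal((n_t, n_t)))[0]

# project each row-wise partition independently, then stack the results
blocks = [D[j * n_R:(j + 1) * n_R, :] @ Psi for j in range(n_P)]
Phi_Sigma = np.vstack(blocks)

# amplitudes come from column normalization of the reassembled matrix
sigma = np.linalg.norm(Phi_Sigma, axis=0)
Phi = Phi_Sigma / sigma

assert np.allclose(Phi_Sigma, D @ Psi)   # same result as the full product
```

The normalization requires whole columns of ΦΣ, which is why the row-wise blocks must be regrouped column-wise before this last step.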
In all the analyzed exercises, the extra computational cost of the memory saving option is of the order of a minute, depending on the decomposition and the size of each snapshot. Table 1 collects the computational time (in seconds) required to perform POD, DFT and mPOD without memory saving (n_P = 1) and with memory saving with n_P = 4 and n_P = 12. The calculations were performed on a laptop with an Intel(R) Core(TM) i7-7700HQ CPU at 2.8 GHz and 16 GB of RAM.
The timing is the one given in output by MODULO. Hence, the time to prepare the dataset is excluded (since this step is done before selecting the decomposition), while for the mPOD this timing also excludes the preparation of the correlation matrix (since this is done before the mPOD setting GUI opens). The scope of Table 1 is thus not to compare the decomposition times but to show how the partitioning influences the timing. For the mPOD, four scales are chosen, with kernel widths of 100 and all scales kept.
The most expensive operation appears to be the final projection with the normalization and sorting steps: since the DFT always requires computing n_t modes, this decomposition is the most sensitive to an increase in the number of snapshots. On the other hand, as the POD requires no normalization, its computational cost is much lower. In general, the computational time increases with the number of partitions due to the increased time spent in reading/writing operations. The price to pay to maintain a limited memory usage appears nevertheless acceptable. Finally, it is worth observing that, in case the memory saving option is not selected but the computational resources are not sufficient for the calculation, a warning dialog appears. In this case, the user is strongly advised to either hit "Activate Memory Saving" (in which case the memory saving will be activated with the default number of partitions) or close the warning box and select one of the proposed partitions.

Table 1: Computational time (in seconds) of POD, DFT, and mPOD with no memory saving option (n_P = 1) and with memory saving with different partitions. The test case considered is the one from Exercise 4 in MODULO's GitHub repository, with each snapshot consisting of a 2D velocity field on a grid with n_s = 13680 points. The mPOD is performed with four scales. The time for preparing the dataset matrix is not included; for the mPOD, the timing also excludes the preparation of K.

Illustrative Examples
The code repository currently includes five exercises that allow for testing all the features of MODULO and, at the same time, exploring the limitations and strengths of each decomposition. These exercises are also solved using various commented Matlab files (sorted from 'A' to 'D') to let the user follow the decomposition procedures from the source code. The first exercise presents the analysis of a 1D scalar dataset, which collects the time-dependent velocity profile of a pulsating Poiseuille flow. As described in the second video tutorial, this dataset can be analytically decomposed into eigenfunctions and hence offers a comparison between data-driven and analytical decompositions. Moreover, since the flow is sustained by two known source terms, the exercise allows for exploring the convergence and the time-frequency analysis capabilities of all the methods. The second and third exercises present the analysis of 2D scalar datasets. These were described in [17]. Both are useful to analyze the problem of uniqueness of the POD, which occurs when modes have similar energy content. The second exercise consists of a simple superposition of known modes, while the third features the numerical solution of the nonlinear advection-diffusion of prescribed source terms. The fourth exercise, also presented in [17], presents the decomposition of experimental data, namely velocity fields obtained via Time-Resolved Particle Image Velocimetry (TR-PIV). This allows for practicing with modern experimental data. Figure 7 shows an exemplary mPOD mode obtained in MODULO for this test case, which consists of a planar gas jet impinging on a flat wall. The top figure shows the spatial structure of the mode; the bottom one shows the frequency content of the associated temporal structure. This mode captures the turbulent structures evolving from the shear layer instability of the jet.
As described in video tutorials three and seven, neither the POD nor the DFT can clearly isolate these structures. The POD is limited by the constraint of optimal convergence, which forces the decomposition to mix multiple features in the same modes. The DFT is limited by the constraint of harmonic temporal structures, which prevents its modes from capturing coherent patterns composed of multiple frequencies. Finally, the fifth exercise considers the velocity field obtained via TR-PIV of the flow past a cylinder in transient conditions. The dataset is described in [19]. This test case consists of a much larger number of snapshots, which forces most laptop computers to use the memory saving features of MODULO.

Impact and Conclusions
We have presented the functionalities of the open-source software package MODULO, starting from its theoretical background. The software allows for performing classical modal decompositions such as POD and DFT as well as the novel mPOD. Moreover, thanks to its memory saving feature, MODULO is well suited to analyze relatively large datasets while keeping moderate memory requirements. While these decompositions are nowadays essential tools in fluid mechanics, their general framework is certainly of great interest to any applied scientist. Finally, the complete set of exercises available can also serve didactic purposes and encourage the novice to enter this important discipline.