AUTOENCODING FOR THE ’GOOD DICTIONARY’ OF EIGEN PAIRS OF THE KOOPMAN OPERATOR

Reduced order modelling relies on representing complex dynamical systems using simplified modes, which can be achieved through Koopman operator analysis. However, computing Koopman eigen pairs for high-dimensional observable data can be inefficient. This paper proposes using deep autoencoders, a type of deep learning technique, to perform non-linear geometric transformations on raw data before computing Koopman eigen vectors. The encoded data produced by the deep autoencoder is diffeomorphic to a manifold of the dynamical system, and has a significantly lower dimension than the raw data. To handle high-dimensional time series data, Takens's time delay embedding is presented as a pre-processing technique. The paper concludes by presenting examples of these techniques in action.


Introduction
With the rapid advancement of computing machinery, scientists have acquired the capability to analyze large data sets and extract information, and data-driven techniques have gathered momentum with this growth. The modern discipline of data-driven sciences can be viewed as a combined outcome of available advanced mathematical techniques and high performance computing. According to [1], data-driven discoveries are revolutionizing modeling, prediction, and control in a diverse range of complex systems found in turbulence, the brain, climate, epidemiology, finance, robotics, and autonomy.
Rather than analyzing the behavior of complex dynamical systems by examining single trajectories, it has become possible to empirically discuss global questions in terms of the evolution of densities. Apart from the traditional analysis using geometrical techniques, we can analyze the evolution of dynamical systems using the Frobenius-Perron transfer operator [2]. The left-adjoint of the Frobenius-Perron operator is the Koopman operator (KO). The KO acts on a measurement function applied to the dynamical system: instead of producing the current value of the measurement, it produces the value of the measurement at a future time.
The perspective of data-oriented KO analysis for analyzing dynamical systems has recently become extensively popular and relevant in science and engineering [3][4][5][6], and it can be used to interpret dynamical systems irrespective of them being linear or nonlinear [3]. However, the compromise is that a finite dimensional nonlinear system may require infinitely many dimensions to be represented as a linear system. For most computational purposes a small error can be tolerated, so the infinite dimensional representation can be truncated at the cost of reduced accuracy [3]. This also explains why a low dimensional dataset can be used for the same purpose. This paper presents a machine learning technique that reduces the dimensionality of the dataset produced by a dynamical system using deep autoencoders (AEs).
A good dictionary in this context refers to a set of eigenvectors that gives an efficient representation of the dynamical system. When representing arbitrary observable functions by a series of eigenvectors, there is freedom to define an efficient representation. Choosing the best from the rich and infinite collection of eigenvectors produces a good dictionary of eigenvectors. The reader is referred to [7][8][9][10] for Koopman mode decomposition as a global analysis of dynamical systems. This can be used for forecasting, as well as for descriptive decompositions into growing, fluctuating and decaying components.
The utility of the initial Dynamic Mode Decomposition (DMD) and exact DMD methods as numerical techniques for computing Koopman eigenpairs persists, as evidenced by their continued use in research [3,5,11,12]. Several variants of DMD exist, including sparse DMD [13], extended DMD (EDMD), and those that employ basis vectors to interpret data flow [14,15]. In machine learning nomenclature, these basis vectors are commonly referred to as a dictionary of features [3]. Following this direction, [16] describes an extended dynamic mode decomposition with optimal dictionary learning (EDMD-DL). DMD with control is introduced in [5,17], with an emphasis on developing control laws.
In this paper we develop a computationally efficient way of computing a good dictionary of Koopman eigenpairs. The technique of computing eigenpairs is based on [3]. The gain in computational efficiency is obtained by applying a geometric transformation to the data manifold by means of deep learning techniques. In section 2 we present the mathematical and numerical foundations on which the Koopman eigenpairs are computed. In section 3 the machine learning technique is introduced, and in section 4 examples are presented using the technique. In section 5 the conclusions are derived with a direction for future developments.

Koopman analysis
First, we review the underlying mathematics of Koopman analysis. Let
$$\dot{X} = F(X), \qquad X \in M,$$
be a dynamical system. The flow can be stated as a function $\rho_t : M \to M$ for each $t \in \mathbb{R}$, such that $X(t) = \rho_t(X_0)$ where $X_0 = X(0) \in M$. This means that the trajectory starts at $X_0$ when $t = 0$. A KO describes the evolution of observables along the flow [3,4,18]. This study is based on analyzing these observables. Let us call these observables observation functions. With reference to [3], in this work we consider the observation function to be
$$g : M \to \mathbb{C}, \quad \text{where } g \in L^2(M). \qquad (2.3)$$
The scalar observation functions introduced here can be stacked to form vector observations. The KO defines how the observation functions evolve over time along the orbits of the dynamical system.
Definition 2.1. Let $\rho_t$ be a semiflow; that is, $\rho_t$ is defined for $t \ge 0$. Then, the KO $K_t$ is defined on $L^2(M)$ such that
$$K_t[g](X_0) = g \circ \rho_t(X_0) = g(X_t),$$
where $\rho_t(X_0) = X_t$.
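In layman's terms, for each $X$ we observe the value of an observable $g$, not at $X$ itself, but after a time $t$, at $\rho_t(X)$. The KO is linear on $L^2(M)$. This may be at the cost of it being infinite dimensional. Eigenvectors and eigenvalues are studied under the spectral analysis of the KO [4,18,19].
Definition 2.2. Suppose $\phi_\lambda$ is an eigenvector of $K_t$ and $\lambda$ is the corresponding eigenvalue. Then the eigenpair should satisfy the equation
$$K_t[\phi_\lambda](X) = e^{\lambda t}\,\phi_\lambda(X),$$
or
$$\phi_\lambda \circ \rho_t(X) = e^{\lambda t}\,\phi_\lambda(X).$$
For each eigenvalue $\lambda$ there are uncountably many eigenvectors. In general, these are not linearly independent. It is interesting to explore how an observable function can be decomposed into eigenvectors of the KO.
Definition 2.3. Let $g(X)$ be a vector of observable functions $g_k(X)$,
$$g(X) = \left(g_1(X), g_2(X), \ldots\right)'. \qquad (2.7)$$
Suppose $g(X)$ could be written as follows,
$$g(X) = \sum_{j} \phi_{\lambda_j}(X)\, v_j,$$
with vector coefficients $v_j$. Then it follows that
$$g(X_0) = \sum_{j} \phi_{\lambda_j}(X_0)\, v_j. \qquad (2.9)$$
Applying the KO to both sides of the equation gives
$$g(X_t) = \sum_{j} e^{\lambda_j t}\, \phi_{\lambda_j}(X_0)\, v_j,$$
where $X_t = X(t)$.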
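The eigenpair condition of Definition 2.2 can be checked numerically on a toy system. The following is a minimal sketch, assuming the scalar linear ODE $\dot{x} = a x$ with flow $\rho_t(x_0) = x_0 e^{a t}$ (this system, the value of `a`, and the helper names `flow` and `koopman` are illustrative assumptions, not part of the paper). For this system $\phi(x) = x$ and $\phi(x) = x^2$ should satisfy $\phi \circ \rho_t = e^{\lambda t}\phi$ with $\lambda = a$ and $\lambda = 2a$, respectively.

```python
import numpy as np

def flow(x0, t, a=-0.5):
    """Flow of the assumed toy ODE dx/dt = a*x, i.e. rho_t(x0) = x0 * exp(a*t)."""
    return x0 * np.exp(a * t)

def koopman(g, t, a=-0.5):
    """Return the observable K_t[g], i.e. g composed with the flow rho_t."""
    return lambda x0: g(flow(x0, t, a))

a, t = -0.5, 1.3
x0 = np.linspace(0.1, 2.0, 5)      # a few initial conditions

phi1 = lambda x: x                 # candidate eigenfunction, eigenvalue a
phi2 = lambda x: x ** 2            # candidate eigenfunction, eigenvalue 2a

# Left- and right-hand sides of phi(rho_t(x)) = exp(lambda*t) * phi(x)
lhs1, rhs1 = koopman(phi1, t, a)(x0), np.exp(a * t) * phi1(x0)
lhs2, rhs2 = koopman(phi2, t, a)(x0), np.exp(2 * a * t) * phi2(x0)

print(np.max(np.abs(lhs1 - rhs1)), np.max(np.abs(lhs2 - rhs2)))
```

Products of eigenfunctions are again eigenfunctions in this example, which illustrates why the set of eigenvectors is rich and why a selection criterion for an efficient dictionary is useful.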
Since eigenvectors are not unique, the authors are interested in finding an efficient representation using eigenvectors.
Definition 2.4. In [3], a $k$-efficient finite set of Koopman eigenpairs is defined as follows. Let $\{\psi_{\lambda_i}(X)\}_{i=1}^{k}$ and $\{\phi_{\lambda_i}(X)\}_{i=1}^{k}$ be sets of $k$ unit eigenvectors. Further, let $q$ be an observable function. If condition (2.12) holds, then $\{\psi_{\lambda_i}(X)\}_{i=1}^{k}$ is a set of $k$-efficient Koopman eigenvectors.

Computational technique of determining Koopman eigenpairs
The following definition of the eigenvector was rephrased from [3].
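Theorem 2.5. Let the Koopman eigenfunction partial differential equation (PDE) be defined for an ordinary differential equation (ODE) $\dot{X} = F(X)$, with a flow $X(r) = \rho_r(X_0) : M \times \mathbb{R} \to M$. Let $\Lambda \subset M$ be a co-dimension-1 data manifold that is non-recurrent on a time interval $[t_1, t_2]$ that contains 0, and is transverse to the flow. Further let $U = \cup_{t \in [t_1, t_2]} \rho_t(\Lambda)$ be the resulting non-recurrent closed domain. Furthermore, let $h : \Lambda \to \mathbb{C}$ be an initial data function. Then the Koopman eigenpair $(\lambda, \phi_\lambda(X))$, $\phi_\lambda : U \to \mathbb{C}$, has the form
$$\phi_\lambda(X) = h \circ s^*(X)\, e^{\lambda r^*(X)},$$
where $r^*(X)$ is the time of flight along the flow from $\Lambda$ to $X$ and $s^*(X)$ is the parametrization coordinate of the point on $\Lambda$ from which $X$ is reached.
Now, we present a computational explanation of Theorem 1 of [3]. This is the mathematical infrastructure of the work presented in this paper.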
Suppose $\dot{X} = F(X)$ is an ODE satisfying the Koopman eigenvector PDE. Let the corresponding flow be given by $X(r) = \rho_r(X_0) : M \times \mathbb{R} \to M$. Further, let $\Lambda \subset M$ be a co-dimension-1 data manifold. We should pick a time interval where the data values are non-recurrent. Let us call $\Lambda$ a non-recurrent data manifold, and let us assume that the non-recurrent time span is $[t_1, t_2]$ with $0 \in [t_1, t_2]$. If $\Lambda$ is transverse to the flow, we can define $U$ as follows:
$$U = \bigcup_{t \in [t_1, t_2]} \rho_t(\Lambda).$$
That is, $U$ is the collection of all the trajectories starting from $\Lambda$. This includes reverse trajectories as well. Furthermore, let $h : \Lambda \to \mathbb{C}$ be an initial data function. Then, an eigenpair $(\lambda, \phi_\lambda(X))$ satisfies equation 2.14, where $s$ is the parametrization of $\Lambda$. That is, $s$ is the data function available on $\Lambda$. The contribution of [3] was to introduce this formulation together with an algorithm to numerically compute $h$ and the optimal $\lambda$ for a given stream of data.
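Following is a summary of the numerical algorithm described in [3]. Suppose $q \in L^2(U)$, $q : U \to \mathbb{C}$, is an arbitrary data function. Let
$$\psi = \underset{\lambda,\, h}{\arg\min}\ \lVert q - \psi_{\lambda,h} \rVert_{L^2(U)} \qquad (2.18)$$
be the eigenvector that most closely estimates $q$, with $\psi : U \to \mathbb{C}$. It is worth noting that the optimization is over both $\lambda$ and $h$. Furthermore, $\lVert f \rVert_{L^2(U)}^2 = \langle f, f \rangle$, where $\langle f, g \rangle = \int_U f(X)\,\overline{g(X)}\,dX$ for any $f, g \in L^2(U)$.
Let us explain an example scenario for a 2-dimensional dynamical system, in which $\Lambda$ becomes 1-dimensional since it should be co-dimension 1. The algorithm can be generalized to larger dimensions, limited only by the computational resources. Let $s_0 < s_1 < \cdots < s_n$ be a uniform partition of $\Lambda$. The data function $h : \Lambda \to \mathbb{C}$ can be indexed accordingly as $h_i = h(s_i)$ for $0 \le i \le n$. We will consider a uniform grid of time as well, $r_0 < r_1 < \cdots < r_m$; that is, the time grid has $m + 1$ equally spaced points. Then, the grid points can be identified as follows:
$$s_{i,j} = \rho_{r_j} \circ x(s_i), \qquad 0 \le i \le n,\ 0 \le j \le m.$$
In terms of the grid, the optimization problem in equation 2.18 for the function $\psi(x)$ represented on the grid as $\psi \circ x(s_{i,j})$ is approximated by solving the finite rank least squares problem
$$\min_{\lambda,\, h} \sum_{i,j} \left| q(s_{i,j}) - h_i\, e^{\lambda r_j} \right|^2.$$
On this grid, both functions $q$ and $\psi_{\lambda,h}$ can be sampled at $\rho_{r_j} \circ x(s_i)$.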
Let $q_{i,j} = q(s_{i,j})$ for $0 \le i \le n$, $0 \le j \le m$. In [3], the Frobenius norm [21] is used as the distance. We can rewrite the optimal initial data as an $n \times 1$ vector $h_0(\lambda) \in \mathbb{C}^n$. The problem restatement is as follows.
Let us define the $m \times 1$ vector $E(\lambda)$ as $E(\lambda) = \left(e^{\lambda r_0}, e^{\lambda r_1}, \ldots, e^{\lambda r_m}\right)' \in \mathbb{C}^m$. Then, $A(\lambda) = E(\lambda) \otimes I_n$, where $\otimes$ is the Kronecker product. Furthermore, let $b$ be the vector obtained by reshaping the matrix $q \in \mathbb{C}^{n \times m}$ into a vector of shape $mn \times 1$. That is, $b = \mathrm{reshape}(q, mn, 1)$.
Then $h_0(\lambda)$ is computed by solving the least squares problem
$$h_0(\lambda) = \underset{h}{\arg\min}\ \lVert A(\lambda)\, h - b \rVert_2.$$
Note that $q = q(s_{i,j})$ is a grid of values. It follows that
$$p_0(\lambda) = A(\lambda)\, h_0(\lambda).$$
Thus, $\psi_1^0 = \mathrm{reshape}(p_0(\lambda), n, m)$. That is, $p_0(\lambda)$ is reshaped to produce the optimal eigenvector. $\psi_1^0$ is the eigenvector computed by carrying out the computation once. This can be done successively to reduce the residuals.
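Therefore, we call the residual at the $k$th iteration $R_k$. Then
$$R_k = q - \sum_{i=1}^{k} \psi_i^0,$$
and the successive eigenpairs can be computed by using $b = \mathrm{reshape}(R_k, mn, 1)$ in the least squares problem.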
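The least squares step for a fixed $\lambda$ can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the implementation of [3]: the function name `fit_initial_data`, the synthetic grid, and the test data are assumptions, and the array sizes `n` and `m` are taken directly from the shape of the sampled grid. The column-major (`order="F"`) reshape is chosen so that the stacking of $b$ matches the block structure of $E(\lambda) \otimes I_n$.

```python
import numpy as np

def fit_initial_data(q, r, lam):
    """Least-squares estimate of h on Lambda for a fixed eigenvalue candidate.

    q   : (n, m) array with q[i, j] sampled at the grid point (s_i, r_j)
    r   : (m,) array of time-grid values
    lam : scalar eigenvalue candidate (real or complex)
    """
    n, m = q.shape
    E = np.exp(lam * r).reshape(m, 1)           # m x 1 vector E(lambda)
    A = np.kron(E, np.eye(n))                   # (m*n) x n matrix E(lambda) kron I_n
    b = q.reshape(m * n, order="F")             # column-major stacking of the grid
    h0, *_ = np.linalg.lstsq(A, b, rcond=None)  # optimal initial data h_0(lambda)
    psi = (A @ h0).reshape(n, m, order="F")     # fitted eigenfunction on the grid
    return h0, psi

# Synthetic check: data generated by a single separable mode h(s) * exp(lam * r)
s = np.linspace(0.0, 1.0, 6)
r = np.linspace(0.0, 2.0, 9)
h_true = np.sin(2 * np.pi * s) + 2.0
lam_true = -0.7
q = h_true[:, None] * np.exp(lam_true * r)[None, :]

h0, psi = fit_initial_data(q, r, lam_true)
print(np.max(np.abs(h0 - h_true)))
```

In practice $\lambda$ is not known in advance, so this solve would be repeated over a range of candidate values and the one with the smallest residual retained.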

Manifold learning
In [20] and [22], a manifold is described as a collection of charts called an atlas. Each chart maps a subset of the manifold to Euclidean space with an invertible map, and these subsets may have non-empty intersections. The invertible maps should be consistent on the non-empty intersections. Furthermore, a manifold is a topological space that is locally Euclidean [20].

Manifold hypothesis
The manifold hypothesis is the assumption that higher dimensional data lies on lower dimensional manifolds embedded within the high-dimensional space [23,24]. This is due to the belief that high dimensional data could be generated from a low-dimensional dynamical system. Specifically, in the context of dynamical systems, orbit data may be attracted to a stable invariant manifold, which occurs in certain systems where dissipation leads to the potential of a reduced order model. A collection of techniques developed to identify such a low-dimensional manifold using the available high-dimensional data is called manifold learning. For a comprehensive study of the manifold hypothesis and its mathematical consequences, the reader is referred to [23].

Manifold learning
Among the manifold learning techniques, isometric feature mapping (ISOMAP), locally linear embeddings (LLE), Laplacian eigenmaps, diffusion maps, and nonlinear principal component analysis stand out from the rest. Manifold learning is introduced in [25] as an algorithmic technique for dimensionality reduction. In [26], a geometric framework for nonlinear dimensionality reduction is introduced. The authors claim that their technique efficiently computes the global optimum and is guaranteed to converge asymptotically to the true structure. For a comprehensive study of LLE, the reader is referred to [27]. For a treatment of ISOMAP algorithms and Laplacian eigenmaps the interested reader is referred to [28] and [29], respectively. However, this investigation is more concerned with manifold learning using deep learning techniques. Following is a concise empirical study of deep learning based manifold learning in various engineering applications. For a more descriptive treatment of the topological point of view of manifold learning, refer to [22].

Mathematical foundations of manifold learning
Dimensional reduction of the data is the main application of manifold learning. The data, which is to be dimensionally reduced, is assumed to be generated from a lower dimensional submanifold [30]. A $d$-dimensional manifold $M$ has the property that for all $x \in M$, there is a neighborhood $X_x \subset M$ of $x$ that is homeomorphic to an open set $\Theta_x \subset \mathbb{R}^d$ [31]. Then, $\Theta_x$ can be called a local coordinate of $X_x$ [31].
Since $M = \bigcup_{x \in M} X_x$, where $\{X_x \mid x \in M\}$ is an open covering of $M$, we can find a homeomorphism $\phi_x : X_x \to \Theta_x$ for all $x \in M$. Then, for all $x \in M$, we have $\phi_x(X_x) = \Theta_x$. Therefore, $\{\Theta_x \mid x \in M\}$ can be represented as $\{(X_x, \Theta_x) \mid x \in M\}$. The set of homeomorphisms $\{\phi_x\}_{x \in M}$ need not be unique [31]. This leads to the observation that learnt manifolds need not be unique. Suppose the observed data points $X_i$, $1 \le i \le N$, lie on such a $d$-dimensional manifold, with corresponding local coordinates $Y_i$. It can then be concluded that the observable data was generated by a $d$-dimensional dynamical system. Finding the homeomorphism $\phi : X \to Y$ such that $\phi(X_i) = Y_i$ for $1 \le i \le N$ is the process of manifold learning. Even though the previously surveyed techniques can be used for manifold learning, this investigation is interested in deep learning techniques. Deep learning techniques are favorable in real time computing, and the learnt deep neural network can be used to uncover the intrinsic local coordinates. This ability comes as a consequence of the results of this investigation.
Following is a concise survey of applications of deep manifold learning. While introducing the phrase deep manifold learning (DML) for the deep learning of hidden lower dimensional manifolds, [32] presents a framework of DML for action recognition using convolutional neural networks. In their paper on image set classification, [33] introduces a technique for Riemannian manifold deep learning. That paper also provides a summary of previous efforts on Riemannian manifold learning.

Autoencoders
AEs have become a standout machine learning method to find a reduced order model in an invariant manifold within an artificial neural network framework, including dynamical systems with stable reduced order models [34,35]. Figure 1 depicts a basic neural network topology of a deep AE. AEs were first proposed and formulated in the PhD thesis of Y. LeCun [36]. These are traditionally used for dimensionality reduction [37]. Based on the formulation of [38], the mathematical foundation of an AE can be viewed as the recursive formula 3.1. The mathematical foundation of neural networks can be found in [39].
Let $K^{(l)}$ denote the number of neurons in layer $l$, where $l = 1, 2, 3, \ldots, L$. Let the output of neuron $k$ in layer $l$ be $x_k^{(l)}$, and the vector of all outputs for this layer be $x^{(l)} = \left(x_1^{(l)}, x_2^{(l)}, x_3^{(l)}, \ldots, x_{K^{(l)}}^{(l)}\right)'$. In each node, a nonlinear activation function $g(\cdot)$ is applied after the linear operation, so that the recursion reads $x^{(l)} = g\left(W^{(l-1)} x^{(l-1)} + b^{(l-1)}\right)$. In the recursive equation 3.1, $W^{(l-1)}$ is a $K^{(l)} \times K^{(l-1)}$ matrix of weight parameters and $b^{(l-1)}$ is a $K^{(l)} \times 1$ vector of bias parameters [38]. For this work, "sigmoid" and "ReLU" [40] were found to be acceptable activation functions. We decided to do our experiments with sigmoid consistently.
AEs, in general, take the form $Z = h(X)$ and $\hat{X} = g(Z)$, where $\hat{X}$ is the output of the neural network. The vector $Z$ is known as the latent vector, which is the encoded (transformed) form of $X$. The transformation $g$ is meant to invert the transformation $h$. Therefore, ideally, $g = h^{-1}$. The loss function of the neural network is a function of $\lVert X - \hat{X} \rVert$. We used the mean squared error as the loss function,
$$L = \frac{1}{N} \sum_{i=1}^{N} \lVert X_i - \hat{X}_i \rVert^2.$$
Subsequently, $L$ is minimized using gradient descent over a multitude of iterations.
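The layer recursion and the reconstruction loss can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions, not the trained Keras model used in the experiments: the weights are random and untrained, the helper names `init_layers` and `forward` are hypothetical, the layer sizes (3, 100, 100, 2) mirror the first example only for illustration, and the sigmoid is applied on every layer including the latent one.

```python
import numpy as np

def sigmoid(a):
    """Logistic activation g(a) = 1 / (1 + exp(-a))."""
    return 1.0 / (1.0 + np.exp(-a))

def init_layers(sizes, rng):
    """Random weights W (K_l x K_{l-1}) and zero biases b (K_l,) for consecutive layers."""
    return [(rng.normal(scale=0.1, size=(k_out, k_in)), np.zeros(k_out))
            for k_in, k_out in zip(sizes[:-1], sizes[1:])]

def forward(x, layers):
    """Apply x_l = g(W x_{l-1} + b) layer by layer; x has shape (K_0, N)."""
    for W, b in layers:
        x = sigmoid(W @ x + b[:, None])
    return x

rng = np.random.default_rng(0)
encoder = init_layers([3, 100, 100, 2], rng)   # h : X -> Z
decoder = init_layers([2, 100, 100, 3], rng)   # g : Z -> X_hat

X = rng.uniform(size=(3, 50))                  # 50 samples, already scaled to [0, 1]
Z = forward(X, encoder)                        # latent vectors
X_hat = forward(Z, decoder)                    # reconstructions

# Mean over samples of the squared reconstruction error
mse = np.mean(np.sum((X - X_hat) ** 2, axis=0))
print(Z.shape, X_hat.shape, mse)
```

Training would then adjust every `W` and `b` by gradient descent to reduce `mse`; a deep learning library handles that step through automatic differentiation.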

AEs for reduced order modeling
Full order models require increased utilization of resources. The main goal of a reduced representation is to capture the essential physical features of a system and project them onto a lower-dimensional space or manifold in a way that preserves as much information as possible, while still allowing for meaningful comparisons to be made with a full-order model (FOM). This approach is therefore called reduced order modeling (ROM) [41]. There have been many attempts at developing ROMs for dynamical systems [42][43][44]. Compiling a ROM for a complex dynamical system can be an exceedingly arduous task due to the inherent complexity of the system. The projection-based ROM is one of the earliest and most widely used techniques for ROM. This method involves transforming the original space into a lower-dimensional space using the governing PDEs of the physical system [45]. For an elaborate introduction to the dimensionality of the ROM, the reader is referred to [46].
Proper orthogonal decomposition (POD) is a widely used projection-based ROM technique, particularly in the field of computational fluid dynamics. It involves decomposing the original data into a set of orthogonal modes that capture the dominant dynamics of the system. For those interested in delving deeper into the topic of POD, we recommend referring to the following seminal works: [47][48][49]. Principal component analysis (PCA) [50] is a method comparable to POD, which gathered momentum due to the availability of computing resources. Empirical orthogonal functions [51] and the Karhunen-Loeve expansion [52] are also similar techniques for ROM. According to [53], these ROM techniques utilize the eigen-decomposition of the snapshot matrix using singular value decomposition (SVD).
DMD is a ROM technique that is popular in modeling physics-based systems using their spatiotemporal coherent structures [11]. DMD is being used for a wide range of applications [1], with image processing [54], neuroscience [55], and robotics [56] to name a few. Koopman operator theory (KOT) itself has direct connections to DMD. Initially KOT was used for the characterization of the dynamics of Hamiltonian functions [57]. EDMD was used in [16] to decompose the KO for the analysis of a nonlinear dynamical system. A matching dynamical system was formed using the Koopman operator by decomposing the operator using EDMD in [58].
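For readers unfamiliar with DMD, the SVD-based computation can be sketched as follows. This is the standard exact DMD construction written as a minimal NumPy example, not code from any of the cited works; the function name `exact_dmd`, the matrix `A_true`, and the snapshot data are illustrative assumptions.

```python
import numpy as np

def exact_dmd(X, Y, rank):
    """Exact DMD: eigenpairs of the best-fit linear map Y ~ A X via a truncated SVD."""
    U, S, Vh = np.linalg.svd(X, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]
    V_Sinv = Vh.conj().T @ np.diag(1.0 / S)
    A_tilde = U.conj().T @ Y @ V_Sinv        # operator projected onto the POD basis
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y @ V_Sinv @ W                   # exact DMD modes
    return eigvals, modes

# Snapshots of an assumed linear map x_{k+1} = A_true x_k
A_true = np.array([[0.9, 0.2],
                   [0.0, 0.5]])
snapshots = [np.array([1.0, 1.0])]
for _ in range(20):
    snapshots.append(A_true @ snapshots[-1])
data = np.array(snapshots).T                 # shape (2, 21), one snapshot per column

X, Y = data[:, :-1], data[:, 1:]             # time-shifted snapshot matrices
eigvals, modes = exact_dmd(X, Y, rank=2)
print(np.sort(eigvals.real))
```

For data generated by a linear map, the DMD eigenvalues coincide with the eigenvalues of that map; for nonlinear systems they approximate part of the Koopman spectrum on the span of the chosen observables.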

Examples
Now, we provide a couple of worked examples of the developed theory.

Example 1: Van der Pol Oscillator
We use the Van der Pol (VdP) oscillator as the first example for simulation. The VdP is an oscillator with nonlinear damping [59]. The first step of the process is to simulate trajectories of the VdP. In order to simulate geometric transformations using deep AEs, we considered a complete cycle of each trajectory. These trajectories are similar to what is depicted in Figure 2. The original trajectories lie on a 2 dimensional plane. We projected these trajectories to a 3 dimensional space, which transforms the 2d trajectories into trajectories in 3d. These projected trajectories are depicted in Figure 2. At this stage, the dynamical system is treated as a 3 dimensional system. The next step is to reduce the order of the model to 2. We did this using a deep autoencoder (DAE). Following is a brief description of the DAE that we tuned for this particular system. For implementing the deep neural network, we used Keras, which is based on TensorFlow. Both of these were programmed using the Python programming language. The DAE had a latent vector size of 2, consistent with our goal. We obtained optimal results when the first and second hidden layers had 100 nodes each. These two hidden layers are the encoder. The decoder, which is the neural network after the latent vector, has the same architecture as the encoder. Optimal results were obtained when sigmoid was used as the activation function. Comparable results were achieved when the rectified linear unit (ReLU) was used as the activation function. The tanh and softmax functions did not provide comparable results, and were rejected after visually inspecting the resulting reduced order trajectories. All the layers in the neural network were dense. As a standard practice, the networks were trained for 2,000 epochs throughout the study. Before feeding into the neural network for training, each stream of data has to be scaled to the range from 0 to 1. The scaled 2 dimensional trajectories are depicted in Figure 3. There are many trajectories, each with a different initial point. The nature of the VdP is that they converge to a stable limit cycle. The range of initial points is depicted by the anomaly around the points (0.9, 0.5) and (1, 0.5) in Figure 3. All the trajectories, initiated from the range of different initial points, eventually converge to the stable limit cycle and stay on it. Therefore, it is sufficient to simulate one initial period. The transformed trajectory is depicted in Figure 4. There is no guarantee that the transformed trajectory will always be the same. Even with the original trajectories being the same, the AE might converge to a different local minimum of the error, producing a different transformation. Finding a geometric transformation using a DAE is a trial and error process. The number of hidden layers, the nodes on each layer, the number of training epochs, and the activation function may all vary with the application at hand. Therefore, they have to be found and optimized over several iterations during the process of simulation. As intended, the ROM of the VdP has 2 dimensions.
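The data preparation steps described above (simulating several VdP trajectories and scaling each coordinate to the range from 0 to 1) can be sketched as follows. This is a minimal illustration under stated assumptions: the damping parameter `mu = 1.0`, the initial points, the step size, and the fixed-step Runge-Kutta integrator are choices made for the sketch and are not taken from the paper, and the lift to 3 dimensions is omitted because its mapping is not reproduced here.

```python
import numpy as np

def vdp(state, mu=1.0):
    """Van der Pol vector field: x' = y, y' = mu * (1 - x^2) * y - x."""
    x, y = state
    return np.array([y, mu * (1.0 - x ** 2) * y - x])

def rk4_trajectory(f, x0, dt, steps):
    """Integrate x' = f(x) with the classical fourth-order Runge-Kutta scheme."""
    traj = np.empty((steps + 1, len(x0)))
    traj[0] = x0
    for k in range(steps):
        x = traj[k]
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        traj[k + 1] = x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return traj

def minmax_scale(data):
    """Scale each column of data to the interval [0, 1]."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    return (data - lo) / (hi - lo)

# A small family of trajectories with different (assumed) initial points
initial_points = [np.array([x0, 0.0]) for x0 in np.linspace(1.5, 2.5, 5)]
trajectories = [rk4_trajectory(vdp, p, dt=0.01, steps=2000) for p in initial_points]

scaled = minmax_scale(np.vstack(trajectories))   # rows are samples fed to the AE
print(scaled.shape, scaled.min(), scaled.max())
```

The stacked and scaled array has one state per row, which is the layout expected by a dense autoencoder with an input layer of matching width.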

Reduced order modeling for Koopman eigenpair computation
Now, we move into the analysis of the geometric transformation of the trajectory we used for the Koopman eigenpair computation. For the Koopman analysis, we do not need a full cycle of a trajectory. Therefore, the trajectories are restricted to a shorter time period. The scaled trajectory is depicted in Figure 5. Again, this is projected to the 3 dimensional space using the same transformation as in the complete trajectory case, as depicted in Figure 6. The same neural network architecture and hyperparameters as for the full trajectory DAE were used to autoencode the set of shorter trajectories. This guarantees that accuracy is going to converge to a stable value. The ROM trajectories are depicted in Figure 7. A closer comparison between Figures 4 and 7 shows that the two transformations are not equal to each other. The next step is to use the ROM to compute the Koopman eigenvectors and eigenvalues.

Computing Koopman Eigenvectors with the ROM
The next step is to find Koopman eigenvectors as given by equation 2.14. As given by equation 2.17, $s^*$ is derived as a consequence of the flow of the dynamical system. Therefore, we only need to determine the optimal eigenvalue $\lambda$ and the corresponding $h$ of equation 2.14. To be consistent with the investigation of [3], we chose the observation function to be $q = 3e^{-\frac{x_1^2 + x_2^2}{10}}$. When each $x_i$ is replaced by $z_i$ for $i = 1, 2$, we get $q = 3e^{-\frac{z_1^2 + z_2^2}{10}}$. In this case, instead of using $x_1$ and $x_2$, we use the transformed coordinates $z_1$ and $z_2$ to compute the eigen pairs. We computed the first 10 modes, that is, the first 10 eigenvectors and their corresponding eigenvalues. For each mode the corresponding eigenvalue is the value of $\lambda$ that minimizes the error depicted in Figure 8. The $h(s)$ vectors for the first 10 modes are depicted in Figure 9. The absolute error between the sum of eigenvectors and the observation function decreases with the mode. This is depicted in Figure 10.
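The mode-by-mode procedure (scan $\lambda$, keep the value with the smallest error, subtract the fitted mode, repeat) can be sketched as follows. This is an illustrative sketch under several assumptions: $\lambda$ is restricted to a real grid, the grid coordinates `z1` and `z2` come from a made-up flow rather than from the autoencoder, and the names `best_mode` and `greedy_modes` are hypothetical. For real $\lambda$ the least squares solution for $h$ has the closed form used in the code, which is equivalent to solving the Kronecker-structured problem.

```python
import numpy as np

def best_mode(residual, r, lambdas):
    """Scan candidate eigenvalues and return (error, lambda, h, psi) of the best fit."""
    best = None
    for lam in lambdas:
        E = np.exp(lam * r)                    # temporal factor exp(lambda * r_j)
        h = residual @ E / (E @ E)             # closed-form least squares for h(s_i)
        psi = np.outer(h, E)                   # fitted mode on the (s_i, r_j) grid
        err = np.linalg.norm(residual - psi)   # Frobenius norm of the misfit
        if best is None or err < best[0]:
            best = (err, lam, h, psi)
    return best

def greedy_modes(q, r, lambdas, n_modes=10):
    """Extract modes one at a time, each fitted to the residual left by the previous ones."""
    residual = np.array(q, dtype=float)
    eigenvalues, errors = [], []
    for _ in range(n_modes):
        err, lam, h, psi = best_mode(residual, r, lambdas)
        residual = residual - psi
        eigenvalues.append(lam)
        errors.append(err)
    return eigenvalues, errors

# Made-up grid of reduced coordinates (a stand-in for the autoencoded trajectories)
s = np.linspace(0.0, 1.0, 12)
r = np.linspace(0.0, 1.0, 25)
z1 = np.outer(s, np.exp(-r))
z2 = np.outer(1.0 - s, np.exp(-0.5 * r))
q = 3.0 * np.exp(-(z1 ** 2 + z2 ** 2) / 10.0)   # observation function on the grid

lambdas = np.linspace(-2.0, 2.0, 41)
eigenvalues, errors = greedy_modes(q, r, lambdas, n_modes=10)
print(errors[0], errors[-1])
```

Because the zero function is always an admissible fit, the error of each new mode cannot exceed the error left by the previous one, which is consistent with an error curve that decreases with the mode number.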

Eigenpairs with a different observation function
Now, we change the observation function to $q = z_1^2 + z_2^2$. The variation of the error with $\lambda$ and the function $h(s)$ for the first 10 modes are plotted in Figures 11a and 11b, respectively. The corresponding minimum error versus mode graph is presented in Figure 12.

Figure 12. Minimum error vs. mode for $q = z_1^2 + z_2^2$.

Example 2: Reaction diffusion
Now, we apply our algorithm to a 1 dimensional reaction diffusion equation that appears in [60]. (4.2) In the system 4.2, $x \in [0, 1]$ and it is subject to Dirichlet boundary conditions. We chose to fix $\epsilon = 0.01$ and $\alpha = 0.01$ for our experiments. For the simulation we have presented, we used $D = 0.0322$. The surface plots for $u_1$ and $u_2$ for these parameters are presented in Figure 13, where $x \in [0, 1]$ and $0 \le t \le 60$. It is noted that this reaction-diffusion system produces chaotic [61] behavior depending on the values of $\epsilon$ and $D$. The trajectories of $u_1$ and $u_2$ at $x = 1/2$ are graphed in Figure 14. Even though mathematically there are uncountably many trajectories between $x = 0$ and $x = 1$, computationally this number is finite. We used x to be linspace(0,1,100). This leads to the resulting simulation having 100 trajectories for each variable. With 100 trajectories each for $u_1$ and $u_2$, there are 200 trajectories in total. Each of these trajectories becomes an input to the DAE. In order for the machine to learn a reduced order model, the deep neural network has to have at least 4,000 nodes in the first hidden layer. However, the network architecture with (4,000, 4,000) did not converge beyond 25% accuracy. Increasing the number of nodes to (10,000, 10,000) did not improve the accuracy significantly. The available cloud computing resources were not sufficient to increase the number of nodes further. Therefore, we had to look for other means of producing the reduced order model.
The solution we found was to preprocess the data using other means.In particular, we used another embedding technique using time delayed snapshots of the trajectories.

Takens' time delay embedding
Takens' theorem provides conditions under which we can reproduce the dynamical system using sequential data from a single trajectory rather than processing all the trajectories in parallel. Since all the trajectories from the reaction-diffusion system cannot be processed together in parallel with the available computing resources, Takens' theorem can be used to reconstruct the dynamical system using time delay embedding with one trajectory at a time.
Takens' time delay embedding was introduced in [62] and provides conditions under which a smooth attractor can be reconstructed from the observations made with an observable function. Suppose we have a $d$-dimensional dynamical system given by a state vector $x_t$, which is continuous. Further, assume that we have one observable function $y(t)$, which is coupled to all components of $x_t$. Then, a $k$ dimensional vector of observations can be created by considering $k$ time lagged observations of $y(t)$ with period $\tau$, that is, $\ldots, y_{t-2\tau}, y_{t-\tau}, y_t, y_{t+\tau}, y_{t+2\tau}, \ldots$, and so on. As $k \to \infty$, the system becomes deterministic and predictable. Takens' theorem states that the dynamics of the lagged vector become deterministic at a finite dimension. The finite dimension is given by $k \ge 2d + 1$.
Let us formally present Takens' embedding theorem.
Theorem 4.1 (Takens's Embedding Theorem). Let $\dot{X} = f(X)$ be a dynamical system defined on the manifold $M$, where $f : M \to M$ is smooth. Suppose that the dynamics $f$ has a strange attractor [63] $A$ with Minkowski-Bouligand dimension [64] $d_A$. Using Whitney's embedding theorem [65,66], $A$ can be embedded in $k$-dimensional Euclidean space with $k > 2d_A$. That is, there is a diffeomorphism $\phi$ that maps $A$ into $\mathbb{R}^k$ such that the derivative of $\phi$ has full rank.
Building on Theorem 4.1, it is possible to construct a vector using only a single trajectory from the flow of trajectories of 4.2. In the work presented in this paper, each of the trajectories $u_1$ and $u_2$ is time embedded into a vector of size 5. This is explained further using the trajectory at $x = 1/2$. The values of the trajectory at $x = 1/2$ are isolated for both $u_1$ and $u_2$. Since the values are discrete, $\tau$ is taken to be the available time step. Let the time stamp we are considering be $t = t_0$; then the scalar $u_1(x = 1/2, t = t_0)$ is extended to a delay vector of size 5. When both $u_1$ and $u_2$ are considered, their delay vectors are combined into a single vector.
Then, the trajectories for different x values are fed to the neural network serially.
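The construction of delay vectors from a single scalar trajectory can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `delay_embed` is hypothetical, forward lags are used (the exact lag direction of the vectors in the paper is not reproduced here), and the sine and cosine series merely stand in for the simulated $u_1$ and $u_2$ at a fixed $x$.

```python
import numpy as np

def delay_embed(series, dim=5, tau=1):
    """Stack dim time-lagged copies of a scalar series into delay vectors.

    Row t of the result is [y_t, y_{t+tau}, ..., y_{t+(dim-1)*tau}].
    """
    series = np.asarray(series)
    n_rows = len(series) - (dim - 1) * tau
    if n_rows <= 0:
        raise ValueError("series too short for the requested embedding")
    return np.stack([series[k * tau: k * tau + n_rows] for k in range(dim)], axis=1)

# Stand-in signals for u_1 and u_2 at a fixed spatial location
t = np.linspace(0.0, 60.0, 601)
u1 = np.sin(t)
u2 = np.cos(t)

# Embed each variable into vectors of size 5 and join them per time stamp
embedded = np.hstack([delay_embed(u1), delay_embed(u2)])
print(embedded.shape)
```

Each row of `embedded` is then a single training sample, so the network input width is fixed by the embedding size rather than by the number of spatial grid points.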
The first hidden layer had 3,200 nodes while the second hidden layer had 100 nodes. With a latent vector size of 6, the network produced an accuracy above 70%. Hence this ROM is of dimension 6. Using this ROM, the Koopman eigenvectors were computed.
For the simulation of the eigenpair computations, q = 3e − was taken as the observation function.Similar to the previous experiment, we have produced the graphs corresponding to the first 10 modes.
Figure 15 displays graphs that illustrate how the error changes with the eigenvalue for the first 10 modes. The binary nature of the graphs is observed again. While the graphs of the odd modes exhibit a similar shape, the graphs of the even modes display a shape that differs from the odd modes yet is similar among themselves. Figure 16 depicts the graphs of the computed $h(s)$ functions of the corresponding modes. It is visible that there is a "saw-tooth" nature to all the graphs even though there are two distinctive types of graphs. Figure 17 depicts the error variation with mode. The absolute error is initially around 8.9 but decreases to slightly below 8.4, where it then remains constant.

Conclusions and further work
In this paper we presented deep autoencoding as a technique for uncovering simple geometries of complex dynamical systems. Autoencoding works as a geometric transformation. The transformed stream of data is then used to compute a dictionary of Koopman eigenpairs. The technique used to compute Koopman eigenpairs is different from the conventional technique introduced in [3]. In that conventional technique the eigenpairs are determined using the data streams as they are presented. In this paper, we improved on that and computed the eigenpairs using the transformed geometry.
The transformation minimized the vector size of the input data by compressing the data driven dynamical system into a low-dimensional manifold. This reduced the computational resources required to compute the eigenpairs. The transformed data stream was then a reduced order model of the full-order dynamical system.
When the ROM is formed using an AE, there is no mechanism to govern the formulation of the model according to our need. This is because autoencoding is an unsupervised learning technique. To overcome this, a nontrivial loss function can be introduced. This loss function could partly depend on the latent variables as well as the output of the neural network. Even though this is still unsupervised learning, the ROM can be governed using a nontrivial loss function, and this will, in turn, lead to less error in Koopman eigenpair calculations.

Use of AI tools declaration
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

Figure 1. Graphical depiction of a simplified deep AE.

Figure 13. Surface plots of $u_1$ and $u_2$ of the reaction-diffusion PDE.

Figure 15. Error vs. $\lambda$ variation through the first 10 modes of the reaction-diffusion PDE.
Figure 16. $h(s)$ of the first 10 modes of the reaction-diffusion PDE.