A fully adaptive nonintrusive reduced-order modelling approach for parametrized time-dependent problems

We present an approach to build a reduced-order model for nonlinear, time-dependent, parametrized partial differential equations in a nonintrusive manner. The approach combines proper orthogonal decomposition (POD) with a Smolyak hierarchical interpolation model for the POD coefficients. The sampling of the high-fidelity model to generate the snapshots is based on a locally adaptive sparse grid method. The novelty of the work lies in the adaptive sampling of time, which is treated as an additional parameter. The goal is a robust and efficient sampling strategy that minimizes the risk of overlooking important dynamics of the system while disregarding snapshots at times when the dynamics do not contribute to the construction of the reduced model. The developed algorithm was tested on three numerical problems. The first was an advection problem parametrized by a five-dimensional space. The second was a lid-driven cavity test, and the last was a neutron diffusion problem in a subcritical nuclear reactor with 11 parameters. In all tests, the algorithm was able to detect and include more snapshots in important transient windows, which produced accurate and efficient representations of the high-fidelity models.


Introduction
In science and engineering applications, dynamic systems are described by time-dependent mathematical models. Often, these models are written as parametrized partial (integro-)differential equations (PDEs). Applications such as uncertainty and sensitivity analysis and design optimization require solving the equations repeatedly for different values of the PDE parameters. For complex, large-scale problems, such repeated evaluations demand excessive computational power and memory resources. In such cases, model reduction techniques are used to overcome the computational burden. Model reduction methods aim to replace the high-fidelity model with an efficient, low-dimensional reduced-order model (ROM) capturing the main dynamics of the system with a controlled level of accuracy. Model reduction methods can be classified into intrusive and nonintrusive methods. Intrusive methods are mainly projection-based, where the high-dimensional model is approximated by projecting the original equations onto a reduced subspace. For an overview of different projection-based approaches, see [1,2].
Intrusive approaches require access to the operator of the high-fidelity model, which can be limiting for applications where the numerical solver is closed-source, legacy, or a coupled multi-physics code. In these cases, nonintrusive methods are applicable, where the reduced model is built using only data generated from the high-fidelity model. For this reason, they are also called data-driven methods. A class of nonintrusive methods aims at recovering part of the problem's physical structure by inferring an assumed operator from the data. In the Loewner framework [3], a reduced model is built by interpolating measurements of the transfer function in the frequency domain. This approach was extended to construct a reduced model from time-domain data [4]. However, reduced models in the Loewner framework are limited to linear, time-invariant systems (or linear PDEs). For nonparametrized PDEs, dynamic mode decomposition (DMD) [5] learns a linear operator by fitting a sequence of time-snapshot data in a least-squares sense. However, this approach cannot be directly applied to parametrized problems.
A different line of research attempts to construct a nonintrusive reduced model for general (nonlinear) parametrized PDEs without operator inference. The PDE solver is treated as a black box. This class of methods uses the generated data to fit a model mapping a defined input space to the desired output space. Hence, these methods are closer to machine learning techniques. However, while machine learning methods are typically applied in settings where data are abundant, in numerical and experimental computational science and engineering applications, data are typically expensive to generate [6]. Therefore, an important challenge for nonintrusive ROM methods is to build an accurate model using as little data as possible. One effective black-box ROM method adapts the projection-based proper orthogonal decomposition (POD) method into a nonintrusive approach. This nonintrusive version starts in a similar way to the projection-based version by constructing a reduced basis space. However, instead of projecting the high-fidelity model equations onto the reduced basis space to solve for the POD coefficients, data-fit surrogate models for the POD expansion coefficients are used. Different routes can be followed to construct the models for the expansion coefficients. One can interpolate with splines [7] or, more commonly, use radial basis functions (RBFs) [8,9]. Additionally, neural networks can be used to learn a surrogate model for the coefficients [10]. Gaussian process regression (or kriging) is another option to build the surrogate model [11,12]. We have presented an approach using a locally adaptive sparse grid and hierarchical interpolation [13], which was then applied to analyse the uncertainties in a coupled multi-physics model of a nuclear reactor system [14].
Most of the work on nonintrusive ROM methods has been developed either for parametrized time-independent problems (steady-state solutions) or for time-dependent non-parametrized problems. Generalizing a ROM method to address both the spatiotemporal discretization and the parameter space is not trivial. As a direct approach, one could build a separate ROM for each time instance of interest using any of the (steady-state) nonintrusive ROM methods. However, such an approach is computationally unfeasible for the entire discretized time series. The challenge is even more complicated if the boundary and initial conditions are also parameter- and time-dependent or if the parameter space is of high dimension. Audouze et al. (2013) [9] suggested a nonintrusive ROM approach for time-dependent PDE problems using a two-level RBF-POD technique. This approach constructs two reduced basis spaces, one for the spatial basis and a second for the temporal basis. The authors use a coarse grid discretization of the spatial coordinates, time, and parameter spaces to sample the high-fidelity model and generate the snapshots for the POD. Chen et al. (2018) [15] extended this work to include adaptive sampling of the parameter space using an RBF error estimator based on the distance between the RBF coefficients. The adaptivity in this approach cannot easily be extended to higher dimensional parameter spaces. Xiao et al. (2017) [16] presented an approach to tackle the high-dimensional parameter space challenge using a (non-adaptive) sparse grid to generate the sampling points. Their approach is also based on RBF-POD, but only one reduced basis space is constructed offline, while a two-level RBF interpolation is used online; the first layer generates interpolated coefficients in parameter space, and a second RBF layer then propagates these coefficients in time.
Peherstorfer and Willcox (2016) [17] proposed a nonintrusive operator inference ROM approach that can be applied to linear systems or systems with nonlinear terms of low-order polynomials. As an extension of this work, Qian et al. (2020) [18] proposed to first lift the data generated from the high-fidelity model to a quadratic form using auxiliary variables and then apply the operator inference approach to the lifted system. However, defining the lifting maps is problem-specific and requires characterization of the nonlinear term, which is an intrusive step. Guo et al. (2019) [19] proposed an approach based on Gaussian process regression models for the POD coefficients. Time is treated as a parameter, and the snapshots for the POD basis construction are generated based on a tensorization of parameters and time. Recently, several studies investigated ROM approaches based on Artificial Neural Networks (ANNs) [20][21][22][23][24][25][26]. Swischuk et al. [6] compared different machine learning methods for POD-based ROM modelling. They found ANNs to underperform in cases where data are scarce. Their finding is in line with the ANN literature, where it has been established that successful training of an ANN model requires a minimum data size that is a multiple of the complexity of the ANN structure [27,28].
In all of the ROM methods discussed, one must select the time snapshots of the high-fidelity model a priori. If the snapshots are too close to each other in time, the computational burden is unnecessarily increased. On the other hand, defining coarse time intervals risks overlooking important system dynamics. This issue has been identified by the projection-based community [29][30][31][32], where an adaptive selection of the time snapshots has been proposed to improve projection-based POD modelling. However, the selection of the snapshots is based on criteria that require knowledge of the governing equations, which is infeasible when the system's precise dynamics are unknown, as in our nonintrusive setting.
In the present work, we aim to develop a general nonintrusive approach to identify and select the snapshots for any parametrized, time-dependent system. We build on our previous work for steady-state systems, where we presented an adaptive sparse grid approach combined with POD [13]. We extend the adaptivity in parameter space to the time domain. We consider time as a parameter and use our adaptive technique to choose the important snapshots both in time and in parameter space. This approach assumes a bounded time window of interest (i.e., t ∈ [0, T]). Therefore, the reduced model has no predictive capabilities beyond the defined end time T. However, the reduced model is able to simulate the spatiotemporal evolution of the system as a function of the system parameters up to the end time T. We present three numerical tests for our adaptive approach. The first is a two-dimensional linear unsteady advection problem (Molenkamp test) that has an exact solution. This problem has five input parameters to investigate. In this test, we compare the direct (fixed time grid) method with our time-adaptive approach. The second test is a lid-driven cavity problem, which was solved as a non-parametrized model (i.e., only time was considered as a parameter). The third is a two-dimensional time-dependent neutron diffusion problem in a subcritical reactor, which was parametrized with an 11-dimensional space. This problem is challenging due to the higher dimensionality of the parameter space and the abrupt response of the system during the transient.
The remainder of this paper is organized as follows: Problem formulation is introduced in Section 2, along with a summary of the adaptive-POD algorithm and the time treatment approach. The numerical tests are presented in Section 3. Our conclusions are discussed in Section 4.

Problem formulation
We are interested in building a reduced model for a general parametrized time-dependent problem. Because our approach is nonintrusive, the governing equations are unknown. A general form for the problem under an unknown nonlinear operator N(·) can be written as

∂y(x, t, α)/∂t = N(y(x, t, α)) + s(x, t, α),    (1)

where y(x, t, α) is the solution of the system, x is the state independent variable (e.g., spatial coordinates, energy, or angular directions), t is time, α ∈ R^d is a vector of d parameters representing properties of the system (e.g., material, geometry, or boundary conditions), and s(x, t, α) is a source function. We aim at building a reduced model to capture the dynamics of the solution y(x, t, α) within a defined range of the parameter α. We assume the availability of a numerical solver for the discretized version of the problem. That is,

dy(t, α)/dt = N(y(t, α)) + s(t, α),    (2)

where y ∈ R^n is a vector of n state variables. The computational burden usually scales with the dimension of the state vector n. Our approach is based on the POD method, where we seek to approximate y(α, t) using an expansion of the form

y(α, t) ≈ Σ_{j=1}^{r} c_j(α, t) v_j,    (3)

where v_j ∈ R^n is a basis vector (or POD mode) and c_j(α, t) is its coefficient, which depends on the parameter α and the time t. The POD method extracts a reduced basis space for the system using the left singular vectors of the singular value decomposition (SVD) applied to the snapshot matrix, a matrix containing an ensemble of solutions at different states of the system. The basis space can be truncated at the first r left singular vectors such that the truncation error is below a cut-off threshold γ_tr. That is,

Σ_{j>r} σ_j² / Σ_j σ_j² ≤ γ_tr,    (4)

where σ_j is the singular value associated with the left singular vector v_j. We have proposed in [13] an iterative algorithm to build a reduced model by adaptively selecting important points from the parameter space and updating the snapshot matrix. In this work, we propose to treat time as a parameter.
That is, we consider time to be an additional input parameter such that the solution y(α, t) = y(α*), where α* = [α⊤, t]⊤ and the symbol ⊤ denotes the transpose. The dimension of the parameter space becomes d* = d + 1, and Eq. (3) becomes

y(α*) ≈ Σ_{j=1}^{r} c_j(α*) v_j.    (5)

Formulating the problem in this way allows us to directly use the previously developed adaptive tool. Once the orthonormal basis is known, the coefficient values at a sampled point α*_q can be computed as

c_j(α*_q) = ⟨y(α*_q), v_j⟩,    (6)

where ⟨·, ·⟩ indicates the scalar product.
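In code, the basis construction and coefficient computation above reduce to a truncated SVD followed by scalar products. A minimal NumPy sketch (function names and the snapshot layout are our own, not the paper's implementation):

```python
import numpy as np

def pod_basis(snapshots, gamma_tr=1e-12):
    """Truncated POD basis from a snapshot matrix whose columns are
    solution vectors y(alpha*) sampled at different parameter/time points."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    # keep the smallest r whose discarded singular-value energy is below gamma_tr
    r = min(int(np.searchsorted(energy, 1.0 - gamma_tr)) + 1, s.size)
    return U[:, :r]                     # columns v_1, ..., v_r

def pod_coefficients(V, y):
    """c_j = <y, v_j>: scalar products of a sampled solution with the modes."""
    return V.T @ y

def pod_reconstruct(V, c):
    """y ~ sum_j c_j v_j."""
    return V @ c
```

For a snapshot lying in the span of the basis, `pod_reconstruct(V, pod_coefficients(V, y))` recovers `y` up to the truncation level.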

Smolyak interpolation
To compute the coefficients at any non-sampled point, we use the Smolyak iterative interpolant developed in [13]. Here, we only present a summary of the adaptive algorithm. At iteration k, the d*-dimensional interpolant A_{k,d*}(c)(α*) is given by

A_{k,d*}(c)(α*) = A_{k−1,d*}(c)(α*) + Σ_{n=1}^{m^Δ_k} w^k_n Θ_n(α*),    (7)

with A_{0,d*}(c)(α*) = 0, and where m^Δ_k is the cardinality of the so-called important set Z_k. The important set contains the parameter points α* at which the interpolant was found to have an error greater than a pre-defined threshold γ_int. In the next iteration, the algorithm refines the sampling scheme in the neighbourhood of the points in the important set. The d*-variate basis function Θ_n(α*) is defined for every point α*_n with support nodes (α_1, . . . , α_{d*}) as a tensor product of unidimensional basis functions,

Θ_n(α*) = ∏_{p=1}^{d*} a^{i_p}(α_p),

where i_p is the level (tree depth) index along dimension p, and a^{i_p} is the unidimensional piecewise-linear (hat) basis function centred at the corresponding support node; the dependence on the dimension p is dropped for notational convenience. The surplus w^k_n is defined as the difference between the true value of the coefficient and the interpolated value,

w^k_n = c(α*_n) − A_{k−1,d*}(c)(α*_n).

The reduced model is then built by using the interpolant of Eq. (7) as a surrogate model for the POD coefficients in Eq. (5), yielding

ỹ(α*) = Σ_{j=1}^{r} A_{k,d*}(c_j)(α*) v_j.    (13)
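The hierarchy is easiest to see in one dimension. Below is a minimal sketch (our own simplification, not the paper's implementation) of the hierarchical surplus construction with the equidistant rule: level 1 is the midpoint with a constant basis function, level 2 adds the two boundary nodes, and each deeper level bisects the remaining gaps with hat functions of shrinking support:

```python
def hat(x, c, h):
    """Piecewise-linear hat centred at c with support half-width h."""
    return max(0.0, 1.0 - abs(x - c) / h)

def hierarchical_interpolant(f, max_level):
    """Build a 1-D hierarchical interpolant of f on [0, 1].

    Each node stores the surplus w = f(node) - A_{k-1}(node), the gap
    between the true value and the interpolant of the previous levels.
    """
    terms = []  # (level, centre, surplus)

    def interp(x):
        total = 0.0
        for lvl, c, w in terms:
            # level 1 uses a constant basis function; deeper levels use
            # hats of half-width 2^(1 - level)
            total += w if lvl == 1 else w * hat(x, c, 2.0 ** (1 - lvl))
        return total

    for lvl in range(1, max_level + 1):
        if lvl == 1:
            nodes = [0.5]
        elif lvl == 2:
            nodes = [0.0, 1.0]
        else:
            step = 2.0 ** (1 - lvl)
            nodes = [i * step for i in range(1, 2 ** (lvl - 1), 2)]
        for c in nodes:
            terms.append((lvl, c, f(c) - interp(c)))
    return interp
```

Because each hat vanishes at all coarser nodes, the interpolant is exact at every sampled node, and the surpluses of an already well-resolved region shrink to zero, which is what the adaptive refinement exploits.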

Adaptive sampling strategy
The adaptive sparse grid algorithm is based on arranging the nodes along each dimension in a tree structure, as shown in Fig. 1. Each node has two children and one father, with the exception of the boundary nodes at level 2, each of which has only one child. The nodes are then tensorized to form points in parameter space. To maximize the separation of points in parameter space, we choose the equidistant rule for the unidimensional nodes. Each point in parameter space has forward and backward points. The forward points for a point α = (α_1, α_2, . . . , α_{d*}) are generated by tensorizing the children of each node with the rest of the nodes. That is, the first forward point of α is (b_1(α_1), α_2, . . . , α_{d*}), where b_1(α) is a function that returns the first child from the tree. The second forward point is (b_2(α_1), α_2, . . . , α_{d*}), where b_2(α) is a function that returns the second child. By applying b_1(α) and b_2(α) to α_2, the third and fourth forward points are generated, and so forth for the rest of the dimensions. Therefore, for any point in parameter space we can generate up to 2d* forward points. The backward points are generated in the same manner but by using a function that returns the father of a node instead of the child functions. Hence, each point has at most d* backward points. By generating the backward points recursively, the set of ancestors is created. Note that a forward point can be shared between two different backward points. Therefore, points in parameter space do not form a classical tree structure but rather are connected as a network.
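A sketch of this tree bookkeeping (our own illustrative implementation, using exact rational node values to avoid floating-point key mismatches):

```python
from fractions import Fraction as F

def children(level, x):
    """Children of the unidimensional node x (equidistant rule): the root
    1/2 has children 0 and 1; the boundary nodes at level 2 have one
    child each; deeper nodes have two children at x -+ 2^-level."""
    if level == 1:
        return [(2, F(0)), (2, F(1))]
    if level == 2:
        return [(3, F(1, 4) if x == 0 else F(3, 4))]
    h = F(1, 2 ** level)
    return [(level + 1, x - h), (level + 1, x + h)]

def father(level, x):
    """The unique father of a node (None for the root)."""
    if level == 1:
        return None
    if level == 2:
        return (1, F(1, 2))
    if level == 3:
        return (2, F(0) if x == F(1, 4) else F(1))
    h = F(1, 2 ** (level - 1))
    i = int(x * 2 ** (level - 1))   # odd index of the node on its level
    return (level - 1, x + h if i % 4 == 1 else x - h)

def forward_points(point):
    """Forward points of a multidimensional point: replace one coordinate
    node by one of its children (up to 2 d* points)."""
    fwd = []
    for p, (lvl, x) in enumerate(point):
        for child in children(lvl, x):
            fwd.append(tuple(child if q == p else n
                             for q, n in enumerate(point)))
    return fwd
```

Backward points follow the same pattern with `father` in place of `children`, and iterating `father` yields the ancestor set used by the admissibility check.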
The parameter space is bounded by the defined upper and lower values for each dimension of α*. This space is mapped to a unit hypercube of dimension d*, where 1 is mapped to the upper value and 0 to the lower value of each range. In the initialization step (k = 0), the algorithm selects the central point of the hypercube and adds it to the important set Z_0. The high-fidelity model is then sampled at that point, and a reduced model is built using Eq. (13). In the first iteration (k = 1), a trial set is generated from the forward points of the points in Z_0. The algorithm then samples the high-fidelity model and computes the error of the reduced model at each point in the trial set. Points with an error above a pre-defined threshold (γ_int) are considered important points. In any subsequent iteration k, the trial set is generated from the forward points of Z_{k−1}. For each point in the trial set, if the error is found to be above γ_int, the point is marked as a candidate point. The algorithm then considers the ancestors of each candidate point. If all ancestors are included in ⋃_{l=0}^{k−1} Z_l, the candidate point is added to the important set. On the other hand, if the candidate point has an ancestor that is not included in ⋃_{l=0}^{k−1} Z_l, that ancestor is marked important, and the candidate point is stored and tested again in the next iteration. Points that are not marked important are added to the inactive set.
To control the efficiency of the sampling scheme, a generated forward point is excluded from the trial set if its fraction of inactive backward points is above a predefined parameter µ ∈ [0, 1]. For µ = 1, all forward points are sampled and the algorithm is more exploratory, whereas for µ = 0, the algorithm is more efficient, sampling only points that have all their backward points in the important set. Fig. 2 summarizes the algorithm with a flow chart. For a detailed description of the algorithm, we refer the reader to [13].
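One way to encode the greediness criterion (a sketch; `backward_points` stands for any function returning the backward points of a candidate):

```python
def admissible(candidate, inactive, backward_points, mu):
    """Keep a generated forward point in the trial set only if its fraction
    of inactive backward points does not exceed the greediness mu in [0, 1].

    mu = 1 keeps every forward point (exploratory); mu = 0 keeps only
    points with no inactive backward points.
    """
    back = backward_points(candidate)
    if not back:
        return True  # e.g. the root point, which has no backward points
    frac_inactive = sum(b in inactive for b in back) / len(back)
    return frac_inactive <= mu
```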
Strictly speaking, time is not an input parameter, and special care has to be taken with such an approach. This is because numerical solvers are discretized in time: the algorithm could request a snapshot at a certain time t_l that falls in-between the solver's default time steps. This can be addressed either by solving up to the last default time step before t_l and then adjusting the time step to reach t_l, or by interpolating between the two time steps before and after t_l. Additionally, for every request of t_l, the high-fidelity model solves for all time instances from t = 0 up to t_l. Management of the interface with the solver is therefore important to avoid redundant simulations. If at one iteration the algorithm requests α_q with t_l, we can store all generated snapshots for t < t_l in a library. In this manner, the algorithm can recall snapshots from this library instead of rerunning the high-fidelity solver with each α_q call. Likewise, if the algorithm requests t > t_l with α_q, the solver only needs to be restarted from t_l instead of from the initial conditions at t = 0. This strategy saves computational resources.
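A sketch of this bookkeeping (the `solver(alpha, t_from, t_to, y0)` interface is a hypothetical stand-in for a restartable high-fidelity time stepper):

```python
class SnapshotLibrary:
    """Cache time histories per parameter point so that repeated requests
    at the same alpha restart from the latest stored time, not from t = 0."""

    def __init__(self, solver, y_init):
        self.solver = solver        # solver(alpha, t_from, t_to, y0) -> state
        self.y_init = y_init        # y_init(alpha) -> initial state
        self.store = {}             # alpha -> {time: snapshot}

    def get(self, alpha, t):
        hist = self.store.setdefault(alpha, {0.0: self.y_init(alpha)})
        if t in hist:
            return hist[t]                    # recall instead of rerunning
        t0 = max(s for s in hist if s < t)    # closest stored earlier time
        hist[t] = self.solver(alpha, t0, t, hist[t0])
        return hist[t]
```

In a memory-limited setting, `store` would hold only a fraction of the computed snapshots, at the price of longer restarts.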
However, in cases where memory is limited, one might opt to store only a certain percentage of the generated snapshots and restart a requested simulation from the closest stored snapshot. Note that storing all generated snapshots is not an integral part of the algorithm. The only snapshots that need to be stored are the ones marked as important and included in the snapshot matrix. Stored snapshots that are not used during the construction stage can serve as testing points for the reduced model once the algorithm has terminated. The snapshot matrix is the only memory-consuming part of the algorithm. One can reduce the memory burden of the snapshot matrix by using an SVD updating algorithm [33] instead of a full SVD at each iteration. In our implementation for this work, however, we have used the full SVD at each iteration. We use the ℓ2 norm to compute the relative error of the reduced model at any point α*_q as

e^k_q = ‖y(α*_q) − ỹ(α*_q)‖_{ℓ2} / (‖y(α*_q)‖_{ℓ2} + ϵ),

where ϵ is introduced as an offset for cases when ‖y(α*_q)‖_{ℓ2} has near-zero magnitude. A point α*_q is marked as a candidate point when e^k_q is above the threshold γ_int. The iterative algorithm is terminated when e^k_q is below a global tolerance ζ for all points in the trial set of iteration k.
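The error metric and the stopping logic reduce to a few lines (the default ϵ here is illustrative; γ_int and ζ are the user-set thresholds):

```python
import numpy as np

def relative_error(y_hf, y_rom, eps=1e-10):
    """Relative l2 error with an offset eps guarding near-zero solutions."""
    return np.linalg.norm(y_hf - y_rom) / (np.linalg.norm(y_hf) + eps)

def classify(errors, gamma_int, zeta):
    """Candidate points have error above gamma_int; the iteration
    terminates when every trial-set error is below the global tolerance."""
    candidates = [q for q, e in errors.items() if e > gamma_int]
    terminate = all(e < zeta for e in errors.values())
    return candidates, terminate
```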

Applications
Our proposed algorithm is tested on three time-dependent problems. The first is a two-dimensional linear unsteady advection problem that has an exact solution. This problem is also called the Molenkamp test [34]. We parametrize this problem on a five-dimensional space. The second is a lid-driven cavity test. This problem is not parametrized but tests the ability of the algorithm to detect the important transient window. The third is a challenging 11-dimensional transient nuclear reactor problem. This problem simulates a subcritical reactor with an external source.

Molenkamp test
The original Molenkamp test [34] is a two-dimensional advection problem with an exact solution: a Gaussian cloud of material is transported along a circular path without changing its shape. However, in order to create a more challenging setting for the adaptive-POD algorithm, we modified this problem to include an additional reaction term, which causes the amplitude of the solution to decay over time. The dimensionless advection-reaction equation is

∂q/∂t + u ∂q/∂x + v ∂q/∂y + λ_3 q = 0,    (15)

where the velocity field describes a solid-body rotation, u = −2π y and v = 2π x. The initial condition is the Gaussian cloud

q(x, y, 0) = λ_1 exp(−λ_2 [(x − x_0)² + (y − y_0)²]),    (16)

with centre (x_0, y_0) controlled by λ_4 and λ_5. The exact solution is the initial cloud rotated and exponentially decayed in time,

q(x, y, t) = e^{−λ_3 t} q(x̂, ŷ, 0),  with  x̂ = x cos(2π t) + y sin(2π t),  ŷ = −x sin(2π t) + y cos(2π t),    (17)

which is also imposed as the inflow boundary condition. The exact solution is evaluated on a uniform Cartesian 100 × 100 grid. Therefore, the model has 10,000 degrees of freedom. Note that for this problem, evaluating the solution is computationally efficient and a reduced model is not necessary. However, the problem is selected to test the ability of the developed algorithm to capture such dynamics.
The problem is parametrized with a five-dimensional space λ_i for i = 1, . . . , 5. Figs. 3 and 4 show selected time snapshots of the solution for different values of λ_2. The snapshots show the Gaussian cloud initially centred at x = −0.5 and y = 0. Over time, the cloud is transported in a circle, completing a full rotation at t = 1. The cloud also decays, reaching a near-zero magnitude after a full rotation. The parameter λ_1 is a linear scaling factor that controls the magnitude of the initial cloud, while λ_4 and λ_5 control the initial coordinates of the centre of the cloud with respect to the domain centre. The parameter λ_3 is the decay constant of the cloud that controls the speed of the decay. The parameter λ_2 controls the size of the cloud. For smaller values of λ_2, the cloud is larger and the solution is smoother over the domain, as shown in Fig. 3, whereas for higher values, the solution has a steeper gradient (spike-like), as shown in Fig. 4.
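Under the solid-body rotation u = −2π y, v = 2π x, the exact solution can be evaluated by rotating the evaluation point back through the accumulated angle 2π t and applying the exponential decay. The sketch below assumes a Gaussian cloud profile λ_1 exp(−λ_2 r²) centred at (λ_4, λ_5); the precise profile and centre parametrization of the benchmark may differ:

```python
import numpy as np

def molenkamp_exact(x, y, t, lam):
    """Decaying Gaussian cloud under solid-body rotation (assumed profile)."""
    l1, l2, l3, l4, l5 = lam
    # rotate the evaluation point back by the accumulated angle 2*pi*t
    c, s = np.cos(2 * np.pi * t), np.sin(2 * np.pi * t)
    xb = c * x + s * y
    yb = -s * x + c * y
    r2 = (xb - l4) ** 2 + (yb - l5) ** 2   # distance to the initial centre
    return l1 * np.exp(-l2 * r2) * np.exp(-l3 * t)
```

After one full rotation (t = 1) the cloud returns to its initial position with its amplitude reduced by the factor e^{−λ_3}.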
We test two different settings of the problem. The first is a smooth solution obtained by varying λ_2 between 0.1 and 0.2, and the second is a steep solution with values of λ_2 between 2 and 4. The steep solution is more challenging to capture for a POD-based ROM because, as the solution becomes steeper, the snapshots become closer to mutually orthogonal and the required basis space becomes larger. That is, the rank of the snapshot matrix is higher for snapshots that are orthogonal or nearly orthogonal, which entails more POD modes for an accurate representative model. Table 1 summarizes the range of variation of each parameter. We are interested in building a reduced model that can reproduce the solution q(x, y, t) over the spatial domain and time t ∈ [0, 1] for any values of the parameters within the defined ranges.
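This rank growth is easy to reproduce with a toy experiment: snapshot matrices of translated Gaussian profiles need more POD modes as the profiles narrow (an illustration of the mechanism, not the Molenkamp data itself):

```python
import numpy as np

def pod_rank(width, gamma_tr=1e-8, n=200, m=40):
    """POD rank of m translated Gaussian snapshots exp(-((x - c)/width)^2).

    Narrower profiles overlap less, are closer to mutually orthogonal,
    and therefore need more modes to reach the energy threshold."""
    x = np.linspace(-1.0, 1.0, n)
    centres = np.linspace(-0.5, 0.5, m)
    A = np.exp(-(((x[:, None] - centres[None, :]) / width) ** 2))
    s = np.linalg.svd(A, compute_uv=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    return min(int(np.searchsorted(energy, 1.0 - gamma_tr)) + 1, m)
```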
We compare two approaches for this test. The first is the developed time-adaptive approach as described in Section 2. The second is the more direct approach of defining a fixed time grid and then building a separate reduced model for each time instance on the grid. To reproduce the solution at any time t, the solutions of the ROM models on the time grid are interpolated. However, to select the snapshots in parameter space, we still use the adaptive algorithm for each model on the grid. For this approach, a single basis space is constructed for all ROM models on the grid. A point in parameter space is selected for the important set if any of the ROM models marked that point as important. Thus, the basis space for all ROM models on the fixed grid is updated with any point marked important by at least one of the ROM models. We initially defined the fixed time grid with 11 time points uniformly spaced in the time window of interest (i.e., t ∈ [0, 1]).
For both approaches, we choose a greediness value of µ = 0 and require the reduced model to have a maximum ℓ2 norm error of 1%. Therefore, we set the global tolerance (ζ) to 1% and the adaptive threshold (γ_int) to 0.1%. The POD truncation threshold (γ_tr) was set to 10^−12. The results are summarized in Table 2. The table compares the two approaches, for both the smooth and the steep solution settings, in the number of calls to run the high-fidelity model, the total number of snapshots resulting from these runs, the number of POD modes after truncation, and the maximum relative error obtained from testing the model on 1000 randomly generated points using Latin hypercube sampling (LHS) (i.e., snapshots generated at random points in the space formed by the parameters λ_i and time t, which were not part of the snapshot matrix). In Table 2, we report the maximum error of the fixed grid approach on two separate occasions: the maximum error at the predefined fixed grid instances and the maximum error at interpolated points in-between these instances. The maximum of the two values is the more relevant result to compare with the error of the adaptive approach. The interpolated values were obtained using spline interpolation.

Table 2: Results for the Molenkamp problem for the smooth and steep settings comparing the time-adaptive and fixed grid approaches. The number of model runs indicates the number of calls to run the high-fidelity model. The number of snapshots indicates the total number of snapshots resulting from the high-fidelity model runs. The number of POD modes indicates the number of basis vectors selected after truncation. The maximum relative error reports the maximum computed error from testing the model on 1000 randomly generated points that are not part of the snapshots. For the fixed grid, the errors at the defined grid points and at interpolated time instances are reported separately.
For the smooth Molenkamp setting, the time-adaptive approach needed 775 high-fidelity model runs and computed 6369 snapshots. Out of the total number of snapshots, 4692 were marked important and included in the snapshot matrix. The number of POD modes after truncation was 33. On the other hand, the fixed time grid approach needed 1379 model runs, resulting in a total of 15 169 snapshots, out of which 9889 snapshots were important. The test on the 1000 random points showed that the time-adaptive model had a maximum error of 0.5%, which is below the set tolerance of 1%. The fixed grid model resulted in a maximum error of 0.17% at the grid points. However, at interpolated points (time instances in-between the defined grid points), the maximum error was 1.1%. The adaptive model has a clear advantage in this test, as its error was lower and it was more efficient in the number of high-fidelity model calls. Note that for the time-adaptive approach, not all high-fidelity model calls are simulated up to the end time T (in this case, T = 1). This is because the time-adaptive algorithm requests some high-fidelity model runs with a time t_l that is less than T. Therefore, the efficiency of the time-adaptive model lies not only in the reduced number of high-fidelity model runs but also in the reduced computational burden of each model run.
For the steep solution test, we notice that both the time-adaptive and fixed grid approaches needed an increased number of model runs and a larger POD basis space compared to the smooth solution setting. For the time-adaptive approach, a total of 2944 high-fidelity model runs were requested, with 78 035 snapshots sampled, of which 64 379 were marked as important. The algorithm selected 238 POD modes. A conclusion similar to that of the smooth setting can be drawn in this case: the time-adaptive model is more efficient and more accurate. In fact, the fixed grid model captured the dynamics of the solution at the grid points, but the error was as high as 76% at the interpolated points. In order to produce a more accurate model, we built another fixed grid model with 101 time points uniformly distributed in time t ∈ [0, 1]. This model reduced the maximum error at the interpolated points to about 0.34%. However, this was achieved with about twice as many model runs as the time-adaptive approach and about 6 times more snapshots. Table 3 summarizes the number of important points projected onto each dimension. This number reflects the degree of nonlinearity of the output with respect to each dimension. A value of 1 signals that the algorithm considered that dimension to be constant. In other words, varying the value of that parameter has a negligible effect on the output of the model with respect to the defined tolerance. A value of 3 means that the algorithm considered the output of the model to be linear or piecewise linear with respect to that dimension. A higher value implies a nonlinear parameter, and the degree of the nonlinearity scales with the value. It can be seen that the algorithm recognized λ_1 as a linear scaling parameter in both settings. The decay constant λ_3 was the most sampled parameter and was not affected by the change in the shape of the solution controlled by λ_2. This can be confirmed from the exact solution in Eq. (17).
The parameters λ 4 and λ 5 , on the other hand, were affected by the shape as they control the location of the cloud. For this reason, the algorithm added more points along the dimensions of these parameters in the steep setting, where the cloud size is smaller compared to the smooth setting.

Lid-driven cavity test
In this test, the incompressible Navier-Stokes equations are solved in a two-dimensional lid-driven cavity. Zero-velocity boundary conditions are assumed around the cavity except at the top lid, where a velocity equal to v_lid is imposed. The model equations read

∂u/∂t + (u · ∇)u = −∇p + ν ∇²u,
∇ · u = 0,

where u(x, y, t) is the flow velocity, ν is the viscosity, p is the pressure, and v_lid is the velocity of the top cavity wall, which was imposed as a ramp in time. The domain is illustrated in Fig. 5. We consider a laminar flow with a Reynolds number of 1000 and are interested in the velocity field within the time range t ∈ [0, 100]. An in-house Navier-Stokes solver was used as the high-fidelity model [35]. The system of equations is solved with a pressure-correction method, discretizing the equations in space with a discontinuous Galerkin finite element method and in time with the implicit Euler scheme. A fixed time-step size of 10^−3 was chosen. The domain was discretized on a structured non-uniform (finer near the walls) mesh of 60 × 60 elements. The velocity field was discretized using second-order polynomials, which leads to a total of 43 200 degrees of freedom for the high-fidelity model. A single high-fidelity simulation to t = 100 requires about 35 CPU-hours. Lorenzi et al. (2016) [36] presented an approach to build a reduced model for this benchmark using a projection-based POD approach. To select the snapshots, the authors sampled the velocity field using a fixed grid of 1000 equally spaced time points. As pointed out in their work, this test is challenging for projection-based POD methods due to the potential instability of the reduced model induced by truncating modes that have small energy magnitudes but are important for dissipating the energy of the system. Nonintrusive approaches do not face such an issue.
We aim to build a non-parametrized reduced model that captures the velocity evolution in time as a response to v_lid. We require a 0.5% maximum ℓ₂-norm error and set the POD truncation threshold γ_tr to 10⁻¹². The algorithm selected 463 snapshots and marked 232 of them as important. The number of POD modes was 229. The selected points are shown in Fig. 6. The algorithm successfully identified the first part of the transient as more important than the last. This is because most of the changes to the velocity field occur within the first few seconds and then gradually stabilize until a steady state is reached. In fact, the algorithm recognizes that the flow is in a steady state by t = 50, and no snapshots were marked important between t = 50 and t = 100.

To test the reduced model, Fig. 7 shows the computed relative ℓ₂-norm error between the reduced model and the high-fidelity model at 10,000 randomly generated points in time. The tested points were not part of the snapshot matrix. All tested points resulted in an error below the set tolerance of 0.5%. The figure shows the error oscillating between 10⁻⁷ and the tolerance (5 × 10⁻³), with the oscillation frequency being higher in the first part of the time domain. These oscillations arise because some of the tested points are very close to points that were marked important and included in the snapshot matrix. The error in reconstructing a point in the snapshot matrix can be estimated with the POD truncation threshold γ_tr = 10⁻¹². Therefore, γ_tr can be considered a lower bound on the error of the reduced model. When a point is tested near a point included in the snapshot matrix, the error can be expected to approach γ_tr. On the other hand, when the tested point is further from any point in the snapshot matrix, the error increases towards the tolerance. This is evident in Fig. 6, where the frequency of the selected important points correlates well with the frequency of the oscillation in the error shown in Fig. 7.

[Fig. 8. Comparison between the reduced model (ROM) and the high-fidelity model (HFM) for the lid-driven cavity problem at t = 68.157, where the relative ℓ₂ error was found to be 0.054%.]
The figure shows the maximum error to be 0.38%, occurring at the first time step of the high-fidelity model. This high error is attributed to the discontinuity in the velocity field between the initial condition (null velocity everywhere) and the first time step (velocity almost zero except at the very top of the cavity, where v_lid is introduced). The relative error is magnified by the near-zero ℓ₂ norm of the solution at this first step. The maximum absolute difference between the reduced model and the high-fidelity model at this point was about 6 × 10⁻⁶, while the magnitude of the maximum velocity at the top of the cavity was 8 × 10⁻⁴. Beyond t = 1 (when the input ramp ends), the highest error observed is 0.054% at t = 68.157 s. A comparison between the high-fidelity model and the reduced model at this point is shown in Fig. 8. We also plot the velocity components along the horizontal and vertical centre lines at t = 68.157 in Fig. 9. The figures show that the reduced model produced an accurate representation of the high-fidelity model despite the fact that no snapshots were selected in the important set between t = 50 and t = 100. In addition, Fig. 9 compares the results with the steady-state benchmark data of Botella and Peyret (1997) [37] to verify that the algorithm was successful in recognizing the flow to be in a steady state beyond t = 50. Simulating the flow at the 10,000 tested points required about 10 s on a personal computer with the reduced model, compared to the 35 CPU-hours needed by the high-fidelity model to simulate the flow to t = 100.
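The behaviour described above, in which the relative error is magnified whenever the reference norm is near zero, can be made concrete with a small sketch. The helper name and the pairing with the absolute difference are our own, not the paper's:

```python
import numpy as np

def relative_l2_error(u_rom, u_hfm):
    # Relative l2-norm error between ROM and HFM snapshots. When the
    # reference norm is near zero (e.g. the first time step after a cold
    # start), the ratio is magnified, so the absolute difference is also
    # returned as a complementary measure.
    diff = np.linalg.norm(u_rom - u_hfm)
    ref = np.linalg.norm(u_hfm)
    return diff / ref, diff
```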

Subcritical reactor test
Nuclear reactors are complex systems with multiple interacting physical phenomena. A standard model to describe the neutron flux dynamics inside a reactor is the time-dependent diffusion equation [38]

(1/v) ∂φ(x, t)/∂t = ∇ · (D(x) ∇φ(x, t)) − Σ_a(x) φ(x, t) + S(x, t),

where φ(x, t) is the one-speed neutron flux with speed v = 300,000 cm/s, D(x) is the diffusion coefficient, and Σ_a is the absorption (removal) cross section. The source term S(x, t) is defined as

S(x, t) = (1 − β) ν Σ_f φ(x, t) + λ C(x, t) + q(x, t),

where β is the delayed neutron fraction, ν is the number of neutrons emitted per fission, Σ_f is the fission cross section, q(x, t) is the external neutron source, and λ is the decay constant of the precursors C(x, t). The dynamics of the precursors is governed by

∂C(x, t)/∂t = β ν Σ_f φ(x, t) − λ C(x, t).

We consider a two-dimensional domain (i.e., x = (x, y)) divided into 4 regions, as shown in Fig. 10. The dimensions of the reactor (including the extrapolated length) were set to x, y ∈ [−109.36, 109.36], chosen such that the flux φ(x, t) is zero at the boundary Γ. Each region has uniform material properties. The external source q(x, t) is assumed to be present only in the lower left corner of the domain. The multiplication factor of a reactor (k_eff) is the ratio of the neutrons produced from fission in one generation to the neutrons lost in the previous generation. For k_eff < 1, a fission chain reaction cannot be sustained and the reactor is said to be subcritical, while for k_eff > 1, the reactor is supercritical since the neutron population multiplies over time. For k_eff = 1, the reactor is critical and the neutron population is constant in time. In our test, the reactor is assumed to be in a subcritical condition with a multiplication factor k_eff = 0.94 in the nominal state. The neutron population is kept in a steady state by the external source, with q_ext(t) = 1 before the perturbation. At time t = 100 s, the source intensity undergoes a step change to a value q₁ ∈ [0, 5].
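For concreteness, the coupled flux-precursor system can be advanced with an implicit Euler step in a minimal one-dimensional sketch. Everything here (the 1D finite-difference discretization, uniform material, the zero-flux ghost-node boundary, and all parameter values in the usage below) is our own illustrative assumption, not the paper's 2D finite element configuration:

```python
import numpy as np

def step_flux(phi, C, q, dt, dx, D, Sig_a, nuSig_f, beta, lam, v):
    """One implicit-Euler step of the 1D one-group diffusion model with a
    single precursor group (illustrative sketch; uniform material, zero
    flux outside the domain via ghost nodes)."""
    n = phi.size
    # The precursor update, C_new = (C + dt*beta*nuSig_f*phi_new)/(1 + dt*lam),
    # is eliminated into the flux equation:
    # (1/v)(phi_new - phi)/dt = D Lap(phi_new) - Sig_a phi_new
    #                           + (1 - beta) nuSig_f phi_new + lam C_new + q
    A = np.zeros((n, n))
    idx = np.arange(n)
    A[idx, idx] = (1.0 / (v * dt) + 2.0 * D / dx**2 + Sig_a
                   - (1.0 - beta) * nuSig_f
                   - lam * dt * beta * nuSig_f / (1.0 + dt * lam))
    A[idx[:-1], idx[:-1] + 1] = -D / dx**2
    A[idx[1:], idx[1:] - 1] = -D / dx**2
    rhs = phi / (v * dt) + lam * C / (1.0 + dt * lam) + q
    phi_new = np.linalg.solve(A, rhs)
    C_new = (C + dt * beta * nuSig_f * phi_new) / (1.0 + dt * lam)
    return phi_new, C_new
```

Because the model is linear in φ, C, and q, the steady-state flux scales linearly with the source intensity, which is exactly the behaviour the algorithm later detects for q₁.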
The neutron flux response is then observed as a function of time and space. We aim to build a reduced model that captures the dynamics of the flux under different conditions of the material properties; the 11 parameters and their ranges are listed in Table 4. The ranges of variation of the parameters were chosen such that the reactor remains in a subcritical condition (k_eff < 1) in all cases. The level of subcriticality is controlled by Σ_a and D, which also set the initial flux value. As the reactor gets closer to criticality, the response following a perturbation becomes steeper and the transient becomes longer. Therefore, the time until a new steady state is reached is a function of the material properties. The flux response to the perturbations can be described in two main parts. The first is an initial abrupt response due to the prompt neutrons, with a magnitude controlled by β, Σ_a, and D. This prompt response has a duration on the order of 1/(vΣ_a) = 2 × 10⁻⁵ s. The second is the response due to the delayed neutrons emitted from the decay of the precursors, which is governed by a time on the order of 1/λ = 12 s. The final steady-state value scales linearly with the external source q₁. Therefore, this test poses a challenge for any nonintrusive approach because different parameters affect the dynamics at different timescales.
The model was solved using a finite element method with an unstructured mesh discretizing the spatial domain, giving a model with 1084 degrees of freedom. Implicit Euler discretization in time was employed with variable step sizes: the step size following the perturbation was taken as 10⁻⁵ s for the first 0.1 s to resolve the steep variations, and a step size of 10⁻² s was employed for the remainder of the transient. A single simulation of the transient takes about 30 s on a personal computer, so this model is not computationally demanding. However, we chose this test to challenge the algorithm in capturing the effect of the 11 parameters on the complete transient. The initial flux distribution before perturbing the source (t = 0) for the nominal case is shown in Fig. 11a, while Fig. 11b shows the initial flux distribution when Σ_a is reduced by 5%. Reducing Σ_a caused the flux intensity to increase and the shape to broaden over the spatial domain, which is expected since fewer neutrons are absorbed in this case. Fig. 12 shows the transient tracking the flux at the centre of the reactor following three different source perturbations q₁ ∈ {0, 2.5, 5}, comparing the case of all parameters at their nominal values with the case of Σ_a reduced by 5% from nominal. The figure shows that with the reduced Σ_a, the reactor is closer to criticality, which not only affects the initial flux value but also results in a slower response towards a new steady state following a perturbation.
We built a reduced model with a global tolerance of 1% and a POD truncation threshold of 10⁻¹². The algorithm ran 3295 high-fidelity simulations and selected 155,270 snapshots, of which 52,710 were marked important. The number of POD modes after truncation was 294. The high number of selected snapshots is a result of the high-dimensional parameter space combined with the complex dynamics of the problem. Note that the number of selected snapshots is not a function of the number of degrees of freedom; therefore, scaling this problem to a larger number of degrees of freedom will not affect the selection of the snapshots.
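The role of the truncation threshold can be illustrated with a minimal POD sketch. The energy-based criterion below is one common choice and an assumption on our part; the paper's exact criterion for γ_tr may differ:

```python
import numpy as np

def pod_basis(snapshots, gamma_tr=1e-12):
    # POD via thin SVD of the snapshot matrix (columns = snapshots);
    # keep modes until the discarded relative energy (sum of squared
    # singular values) falls below gamma_tr
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(energy, 1.0 - gamma_tr)) + 1
    return U[:, :r], r
```

For a snapshot matrix that is numerically of rank 2, this returns exactly two orthonormal modes.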
The projection of the important points onto each dimension is given in Table 5. The table shows that the algorithm considered the diffusion coefficients to be linear within the defined range of ±5%, while the absorption cross sections were the most nonlinear parameters. This is expected because the absorption cross section has a direct effect on the subcriticality level of the reactor. Among them, Σ_a1 was considered the most nonlinear and was sampled most densely, because it belongs to the region that contains the external source. On the other hand, Σ_a4 belongs to the region furthest from the source and was sampled the least. The parameters λ and β had 3 unique nodes each, which implies that, within the defined ranges of ±20%, the effect of these parameters on the dynamics of the reactor is linear. The external source intensity was correctly identified as a linear scaling factor. The time parameter was considered important at 110 time instances. Fig. 13 shows the projection of the sampled points onto the (Σ_a1, t) plane. Most of the points were sampled during the period from t = 100 s to t = 250 s, which is the transient time following the source perturbation. Along the Σ_a1 dimension, most of the sampled points were in the lower range of the domain. This is expected because, for lower values of the absorption cross section, the reactor is closer to criticality and the response becomes more nonlinear.

The model was tested on 1000 points randomly generated with the Latin hypercube sampling (LHS) method. The histogram of the relative errors in Fig. 14 shows that 99.5% of the points were below the tolerance. The maximum relative error was found to be 3.2%. The point with the maximum error corresponds to a case with source value q₁ = 1.1 × 10⁻³ n/(cm³ s) and time t = 203 s. The solutions of the reduced and high-fidelity models are compared for this case in Fig. 15.
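Test points of this kind can be generated as in the following Latin hypercube sketch. This is a simple stratified implementation of our own; the LHS variant used in the paper is not specified:

```python
import numpy as np

def lhs_sample(n, bounds, seed=None):
    # One sample per equal-probability stratum in each dimension,
    # with independently shuffled stratum assignments per dimension.
    rng = np.random.default_rng(seed)
    d = len(bounds)
    strata = rng.permuted(np.tile(np.arange(n), (d, 1)), axis=1).T  # (n, d)
    u = (strata + rng.random((n, d))) / n                           # in [0, 1)
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    return lo + u * (hi - lo)
```

Each of the n equal-width strata of every dimension then contains exactly one sample, which gives better coverage of an 11-dimensional space than plain uniform sampling with the same budget.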
It can be seen from the figure that the flux in this case is almost zero at t = 203 s. The maximum absolute difference between the reduced and high-fidelity models was found to be 7 × 10⁻³ n/(cm² s). The complete transient for this case is also shown in Fig. 15d. The figure shows that the reduced model was able to track the reference solution with great accuracy at the initial and final steady states, while most of the discrepancy was contained in the transient. The second highest error was 1.5%, also at a point with q₁ near zero (q₁ = 7 × 10⁻⁴ n/(cm³ s)). The third highest error was 1.2%, found at q₁ = 1.76 n/(cm³ s) and time t = 470 s. This case is shown in Fig. 16, where the error lies in the steady-state value rather than in the transient. Simulating the 1000 points needed 10 s with the reduced model, while the high-fidelity model required about 6 h for the same points.

Conclusions
This work presented an approach for time and parameter adaptivity in building a nonintrusive reduced-order model. The approach is an extension of our sparse grid adaptive-POD algorithm to time-dependent parametrized problems. Time was treated as an additional parameter, which enabled the locally adaptive sparse grid algorithm to be applied directly. The adaptivity provides a tool to include more snapshots from important time windows, which reduces the probability of overlooking crucial dynamics in the POD snapshot matrix. Moreover, the efficiency of the construction (offline) phase is improved by sampling the high-fidelity model less in time periods of steady-state or slow (smooth) changes.
Three numerical problems were presented to test the proposed approach. The first was a Molenkamp problem with five parameters, which was solved in two settings: a smooth solution and a more challenging steep solution. In this test, we compared the time-adaptive approach with a fixed a priori sampling of the time domain. The results in both settings showed that the time-adaptive approach was more efficient without compromising accuracy. Additionally, the algorithm was able to identify the linearity of the response with respect to each parameter. The second test was a standard lid-driven cavity problem, for which only time was considered as a parameter. The adaptive algorithm identified the important time period as the first few seconds of the transient, when the flow is still developing.
Moreover, the algorithm recognized that after about t = 50 the flow was fully developed, and no important snapshots were selected between t = 50 and t = 100. The reduced model was able to simulate the flow in 10 s compared to the 35 CPU-hours needed by the high-fidelity model. The last test, the subcritical reactor, was challenging not only due to the higher dimensionality of the parameter space but also due to the abrupt dynamics at small timescales. The algorithm correctly recognized the time of the important transient following the source perturbation. In addition, the algorithm revealed the region of importance of each parameter and concentrated the sampling of points in these discovered regions accordingly. This improved the efficiency of the approach compared to non-adaptive techniques. The model was tested on 1000 randomly generated points, which were simulated in 10 s, while the reference model needed about 6 h for the same points. In all tests, the reduced models built with the time-adaptive approach captured the dynamics of the model with an accuracy within the defined tolerances.
Our approach is nonintrusive and can therefore be applied to a wide range of problems. Although nonintrusive approaches do not preserve the physical structure of the system, adaptive approaches such as the one presented in this work provide insight into the physics of the system by ranking the importance of the parameters or exploring linearity. A challenge for any adaptive method is to scale efficiently to higher-dimensional spaces. This issue was addressed in our approach by using the locally adaptive sparse grid method. However, for the Molenkamp and subcritical reactor tests, the algorithm required a high number of snapshots compared to the number of POD modes retained after truncation. This indicates that most of the sampled snapshots were needed for the construction of the surrogate model of the POD coefficients rather than for revealing additional dynamics of the system. Therefore, an area for future work is the use of higher-order interpolation models for the POD coefficients, with the aim of reducing the number of snapshots and further improving the efficiency. Another interesting direction towards this goal is investigating a space-time decomposition of the basis space, or the construction of several local basis spaces tailored to different dynamics instead of a single global basis space.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.