Remarks on mixed-integer formulations for hyper-reduction schemes in nonlinear dynamics

The hyper-reduction problem for reduced-order internal forces evaluation in transient, nonlinear, explicit dynamics is reformulated, employing Mixed-Integer Programming (MIP), taking into account consistency constraints. Constraint reduction is introduced. Resulting quadratures, as well as reduced runs, are compared against the standard Energy Conserving Sampling and Weighting (ECSW) scheme, on a reference example. Rather than searching for optimal performance, the goal is to provide a benchmark solution, for evaluation of heuristic hyper-reduction formulations along with a non-greedy approach.


Introduction
Simulation is widely used in the industry when designing and virtually testing new products. In constant search of a better approximation of reality, Finite Element (FE) models are becoming increasingly complex. Last decade, the number of elements in models used by car manufacturers when testing new cars in crash situations had increased by orders of magnitude for, amongst others, fracture representation. Simultaneously, the need to speed-up simulations has motivated the development of Model Order Reduction (MOR) methods. Projection-based Reduced Order Modeling (PROM) approximates the unknown field variable as a linear combination of a set of global, domain-spanning Reduced Basis (RB) functions. The Proper Orthogonal Decomposition (POD) [1], originating from statistical data analysis [2], post-processes training data gathered from Full Order Model (FOM) runs to build RBs used 'online' in reduced simulations. POD has found extensive applications in turbulent flow modeling [3][4][5][6] and computer graphics [7][8][9][10] as well as in a variety of scientific fields such as modeling of composites [11], inverse problems [12,13] and shape optimization [14][15][16]. However, when applied to explicit nonlinear dynamics [17,18], POD does not reduce the complexity of evaluating internal variables and entails a computational overhead in the 'online' reduction phase due to the necessity of computing internal forces over all elements. Hyper-Reduction (HR) methods are combined to PROM to tackle the internal forces computational complexity and achieve 'online' speed-up by selecting a representative subset of elements for internal forces approximation.
Miscellaneous HR methods exist in the literature: In the PGD context, the Reference Point Method [19] may be considered as a HR technique. The Discrete Empirical Interpolation Method (DEIM) [20][21][22] reduces the complexity of evaluating nonlinear internal variables combining projection and interpolation. DEIM originates from the Empirical Interpolation Method [23] and aims to approximate the projection of the internal forces by using only a few of their components. This method is closely related to the magic points method [24]. Other HR approaches directly approximate the projected internal forces by building a reduced spatial integration scheme. The multidimensional 'a priori' hyper-reduction (APHR) [25] creates a reduced integration domain without any knowledge from the FOM. In the present work, we consider the Energy-Conserving Sampling and Weighting (ECSW) [26] formulation of an optimization problem, based on data collected from the FOM, to build a hyper-reduced integration scheme whose particularity is to impose integration weights positivity ensuring energy conservation of the reduced integrator. Further evolution of this scheme introducing volume preservation is the Empirical Cubature Method(ECM) [27]. ECSW is an 'a posteriori' method as it relies on data from full-scale simulations to train a hyper-reduced integration scheme, as opposed to 'a priori' methods which use the knowledge of the physical problem rather than training data [25]. The optimization problem that arises in the ECSW is NP-hard and is in practice suboptimally solved by mean of the Sparse Non-Negative Least Square (SNNLS) greedy algorithm. Similar numerical integration schemes, in which it is necessary to preserve the consistency and compatibility during the shape functions integration, arise in Galerkin meshless methods such as Diffuse Elements [28], and Element Free Galerkin [29][30][31].
Present work is motivated by the observation that, proceeding greedily by subsequent enrichments, the SNNLS algorithm becomes computationally expensive in building large hyper-reduced integration schemes , and that a reference method is needed to compare performances of heuristic approaches. The ECSW optimization problem is reformulated as a linear mixed integer one solved with [32], and illustrated on a benchmark FE model. Different formulations incorporating consistency conditions and constraint reduction are tested. Resulting integration schemes are compared in terms of number and position of integration points, and error in the work of internal forces within the 'online' reduction phase.
The paper is organized as follows. "Projected hyper-reduced order model Notations" section briefly reviews PROM and ECSW methods in a nonlinear explicit structural dynamics framework. "Hyper-reduced integration" section develops the theory behind the proposed linear MIP formulation, which is then tested in "Results and discussion" section on a pierced plate under uniform tension FE model. Results are then discussed, and recommendations are made regarding future developments.

Projected hyper-reduced order model Notations
Throughout this paper, curly brackets designate vectors and square brackets designate matrices. Following notations are sorted out in alphabetical order.

Projected reduced order model
We focus on semi-discretized equations used in nonlinear explicit FE solvers Above approximation is injected into the FOM (1) which is then projected on the RB, yielding the PROM The ROM is of size k, much smaller than N , Projecting the FOM (1) employing an orthogonal matrix [ ] potentially increases the critical time step ensuring numerical stability. A rigorous mathematical proof of this property is provided in [34]. However, in applications to explicit structural dynamics, PROM methods generally yield a computational overhead. First, there is no real gain in reducing the model size to k when the lumped mass approach is used in the FOM (1) as a diagonal mass matrix inverse is computed with negligible computational effort. Second, internal forces have to be computed for all elements, involving the time-consuming integration of the constitutive law, and two additional steps are required: expansion of reduced DoF However, as internal forces vector of size N is to be projected on the reduced space of size k spanned by the columns of the RB [ ] and k << N , it may not be necessary to compute the whole internal forces vector to have a good approximation of its projection on the reduced space, and this is where hyper-reduction takes place. The following section briefly reminds the ECSW scheme motivating this work.

Energy-conserving sampling and weighting (ECSW)
The Galerkin method used in FE analysis in the divide and conquer spirit successively computes internal forces {f e int } ∈ R N in each of n e elements of the model and assembles respective contributions In the ROM (4), internal forces are projected Hyper-reduction computes internal forces only for a subset H ⊂ 1, n e of elements indexes and applies weights ζ * e to the elemental contributions ahead of summation and projection on the reduced space where {ζ } * = (ζ * 1 , ζ * 2 , . . . , ζ * n e ) T ∈ R n e ≥0 contains weights associated with all elements in the model. ζ * e = 0 if and only if element e is not selected (e / ∈ H). The ECSW method imposes also ζ * e > 0 for selected elements to maintain the integrator positivity. The hyper-reduced integration scheme, given by the subset of selected elements H and associated weights {ζ * }, is obtained through optimization. Given a RB [ ] and unassembled internal forces ({f e int (t i )}) (e,i)∈ 1,n e × 1,n s , the hyper-reduced quadrature scheme integrates the projected unassembled internal forces training data set up, to a user-defined precision, while selecting the fewest possible elements in H. First, n s unassembled internal forces snapshots are collected at training times (t i ) i∈ 1,n s . Without loss of generality, snapshots may as well be taken at different model parameters values. Once collected, unassembled internal forces ({f e int (t i )}) (e,i)∈ 1,n e × 1,n s ⊂ R N are projected on the reduced space, yielding ([ ] T {f e int (t i )}) (e,i)∈ 1,n e × 1,n s ⊂ R k , and are organized in the matrix . . . where Using this notations, the non-reduced assembly process (5) writes with {ζ * } = {1 n e } corresponding to the selection of all elements with integration weights equal to 1 and {b} ∈ R k * n s is the 'exact' projection of all internal forces snapshots on [ ], used as reference to train the weights. Finally, given a targeted precision τ , the optimization problem of the hyper-reduced integration scheme is stated as The threshold τ on the approximation precision constraint is imposed in the admissible space A ECSW alongside weights positivity, • 0 denotes the zero-norm associating the number of its non-zero coefficients to a vector, equivalent to the number of selected finite elements, to be minimized. However, the zero norm is not differentiable, making (11) NPhard. In practice (11) is suboptimally solved with greedy algorithms such as SNNLS [26] (Algorithm 1). Alternatives such as the LASSO algorithm have been compared to SNNLS in [35]. In the present work, (11) is reformulated and solved using MIP optimization, yielding an optimal method used in this paper as a reference to assess the performances of the heuristic SNNLS procedure on an academic example.

MIP formulation
IBM CPLEX MIP [32] solves problems involving both integer and real variables based on a combinatorial Branch And Bound [36,37] algorithm for integer unknowns and simplex algorithm for real variables. (11) is transformed into a problem that may be solved using MIP in the following steps. An additional boolean unknown {ξ } ∈ {0, 1} n e is introduced to take account of the elements affiliation to H ξ e = 1 , e ∈ H 0 , e ∈ 1, n e \H.  18 Recompute {h} and n (eventual zeroed values) Mixed optimization variables are thus ({ζ }, {ξ }) ∈ R n e ≥0 × {0, 1} n e . The link between the non-negative real weights and the Boolean selectors is: if ξ e = 0 for a given element e, the element is not selected and ζ e = 0 is imposed. A maximal value for the weights is prescribed bounding each selected element weight with ζ max and each unselected element weight with 0 and linearizing the problem. Together with the weight non-negativity condition, (12) ensures Similarly, a minimal value ζ min is imposed on each selected element weight by the constraint (13) imposed with ζ min = 1 ensures that each selected element counts at least for itself while avoiding over-fitting.
Additionally, defining ξ min and ξ max , the minimal and maximal number of elements in allows to target prescribed intervals for the number of selected elements. In contrary to the ECSW implementation in [26], this constraint allows to directly start searching for solutions from a given ξ min or assess unattainability of the targeted precision for a given ξ max , reducing computation time by shrinking the admissible space.
Finally, for a given threshold τ , the MIP optimization problem writes Note, that the threshold τ on the hyper-reduced quadrature scheme precision is now imposed in L ∞ norm and not in L 2 norm in contrary to the ECSW optimization problem (11). The linear constraint matrix is with [G] defined by Eq. (8).

Consistency constraints
When internal forces functions are strongly varying in space and time, such as in carcrash analysis, exact domain integration may not be possible. Exactness in the projected Galerkin hyper-reduced scheme is not guaranteed even if the RB functions possess sufficient completeness to represent the solution. A similar problem arises in the Element Free Galerkin context [28,38], where the exactness in the Galerkin approximation, conditioned by the numerical verification of volume and divergence equalities, is met provided additional zero and first-order integration constraints, respectively. The present work also investigates the impact of adding consistency conditions at training times (t i ) i∈ 1,n s such as exact volumic and polynomial integration up to a given degree. Volume preservation writes where ω e (t i ) denotes the volume of element e at training time t i , i ∈ 1, n s , without loss of generality, sampled at the same training times as unassembled internal forces. Constraint (17) is expressed in the MIP formalism by appending [G v ] to (16) [ In a similar fashion, first degree polynomials integration is imposed at training times ζ e ω e (t i )g 1 (t 1 ) . . . ω n e (t 1 )g (n e ) 1 (t 1 ) . . . . . .
while {b} in the left and right members of (15) is computed using Eq. (10) with modified [G].

Constraints reduction
Overfitting is a significant concern when training a hyper-reduced integration scheme over a set of collected data. Redundant snapshots may eclipse others and result in inadequate internal forces approximation in the online reduction phase. Moreover, an excessively large data set may lead to an unnecessarily large number of linear constraints, deteriorating performances of the MIP solver. A constraint reduction in the optimization problem (15) is thus proposed to address those two issues.
In this section and the remainder of the paper, [G] dimensions are denoted m × n e . Data is decomposed using Singular Value Decomposition (SVD). Note, that as the SVD decomposes the second dimension of the matrix and, in the present case, the constraints are to be reduced, the decomposition is performed on [G] T of a presumably lower number of lines  , λ 2 , ..., λ n e ) ∈ R n e ×n e containing singular values λ 1 ≥ λ 2 ≥ · · · ≥ λ m arranged in descending order. Using (23), the constraint The size of [ϒ] makes the problem numerically non-tractable. Therefore, the idea is to keep only l < n e columns of [ϒ], yielding the first form of the reduced constraints, with Choosing l so that λ 1 ≥ · · · ≥ λ l > 0 ensures the kernel to be of size n e − l. As a matter fo fact, [Υ] has l linearly independent columns and the rank theorem yields For every {χ} ∈ R n e −l , {ζ } defined by Eq. (28) verifies implicitly (25). Constraint on weights non-negativity writes and constraint (12) implies Finally, the reduced optimization problem is with the reduced constraint matrix In this problem there are 2n e + 1 linear constraints and 2n e − l MIP unknowns, yielding more contraints than unknowns. Yet, depending on the value of ξ min , ξ max and s max , it is not always possible to find a solution. With (ξ min , ξ max ) = (1, n e ) and s max ≥ 1, a trivial solution is {χ} = {0} and {ξ } = {1 n e }. Non-trivial solutions exist for every ξ max greater or equal to (n e − l), ξ min = 1. As a matter of fact, the kernel of [Υ], being spanned by the columns of [Q 2 ] of dimension l, it is possible to satisfy the constraints by selecting less than n e − l elements.

Results and discussion
Approaches proposed in the present work are implemented in the industrial FE solver Altair Radioss [39] modified for research purposes. The mixed optimization problems are solved with CPLEX [32]. In this section, different hyper-reduced integration schemes are computed with the state-of-the-art SNNLS algorithm and compared with the proposed approaches in terms of numbers and positions of selected elements, offline training data, and online work of internal forces approximation errors. The model used for comparison is presented in Fig. 1 Fig. 3. No parametric variations are considered in this work and the training times correspond to different discrete times (t i ) i∈ 1,n s of the same simulation. In this test case, an RB [ ] of size eight is used, and n s = 1148 unassembled internal forces snapshots are taken at training times (t i ) i∈ 1,n s uniformly distributed every 50 time cycle during the simulation. For consistency conditions, elements' centers and volumes are gathered at the same simulation times as unassembled forces snapshots.
Results obtained in this example are presented in two groups. The impact of adding consistency conditions is first studied in "Consistency conditions" section while constraints reduction is tested in "Constraints reduction" section.

Consistency conditions
In this section, the following hyper-reduced formulations are tested both in the offline training phase, in terms of training data approximation error at given quadratures size, and in the online reduction phase, in terms of the work of internal forces reconstruction. These tests include: • SNNLS greedy algorithm presented in Algorithm 1 solving the ECSW optimization problem (11); • MIP optimization problem (15) The reduced run using the RB [ ] of size eight without hyper-reduction is used as reference to investigate the impact of the different hyper-reduction methods. Figure 4 shows the error on unassembled internal forces snapshots training data Consistency constraints are not appended to [G] when computing this error for respective hyper-reduced quadratures. Figure 5 shows the approximation error of work of internal forces within the online reduction phase. The work of internal forces W int is defined as the integral over the domain of the tensor dot product between the stress ε and the strain σ tensors The approximation error of the work of internal forces int is defined as with W int , andW int , the work of internal forces of the reference and the hyper-reduced simulations, respectively. Element subsets H selected by the different methods are compared for hyper-reduction schemes of size 7, 10, and 14 in Figs. 6, 7, and 8. On those examples, and as mentioned above, results are extended to the full pierced plate through three planar symmetries. This choice allows for a better comparison of element selection as, on a full model, hyperreduction may indiscriminately select among symmetric elements. Consequently, each selected element is represented eight times in the symmetric parts of the model.   Hyper-reduced integration weights obtained with the different formulations are represented in boxplots in Fig. 9. Figure 4 assesses the performances of the greedy SNNLS algorithm in the offline training phase. Among the hyper-reduced quadratures of size seven, the SNNLS algorithm solution offers the less accurate approximation in the offline phase. However, for larger hyper-reduced quadratures, proposed MIP formulations do not always offer better offline approximations than the SNNLS. This is in part due to the incapacity to express the constraints on training data integration in L2 norm in the proposed linear methods, thus, they don't share the same admissible space with the SNNLS. Moreover, the MIP formulations do not minimize the offline training data approximation error but only keeps it under a prescribed threshold. On the other hand, quadratures obtained with the MIP approach may also be unattainable with the SNNLS algorithm due to its greedy nature. Unassembled training data approximation quality is overall quite similar between the different approaches and adding consistency constraints does not deteriorate the training data approximation at a given quadrature size.
The maximal online work of internal forces approximation error is presented in Fig. 5. While the SNNLS shows very good online performances for an heuristic approach, the proposed MIP approach offers better online work of internal forces reconstruction on the given example. Adding consistency constraints to the proposed method does not enhance online internal forces approximation, in particular, it tends to deteriorate the SNNLS and MIP select elements in a similar pattern, as shown in Figs. 6, 7, and 8. Most selected elements are located in the necking zone near the hole and few elements, with larger integration weights, summarize internal forces behavior on the outer parts of the model. Element selection at size 7, presented in Fig. 6, is quite similar for the SNNLS and MIP methods but tends to diverge as more elements are selected. As a matter of fact, the SNNLS greedy procedure allows the deselection of one element in subsequent enrichment only if the associated weight is set to zero when computing the least feasible step (lines 15, 16 and 17 of SNNLS presented in Algorithm 1). Thus, on the three figures, elements selected by the SNNLS at a given quadrature size are still selected in larger integration schemes, which is not the case for the proposed methods.
Boxplots in Fig. 9 show that the proposed method including consistency conditions on polynomial integration suffer from weight overfitting as it consistently yield most significant integration weights. On this example, adding contraints on volume integration prevents overfitting, as observed comparing the MIP quadrature of size 16 with the MIP+V quadrature of the same size. On the other hand, weights computed with the SNNLS procedure are uniform.
The volume integration is tested for quadratures originating from the SNNLS and MIP + V on Fig. 10. In this figure, volume obtained by integrating with different quadratures is averaged over all n s = 1148 training times. Quadratures computed with the SNNLS do not preserve the volume while the proposed MIP+V formulation does independently of the number of elements. Results show that it is possible to add constraints to the hyper-reduced quadrature without deteriorating offline and online performances.

Constraints reduction
In this section, results are presented for the two following formulations of the hyperreduction problem without consistency condition:  Figure 11 plots the offline unassembled internal forces data approximation errors for hyper-reduced quadratures size ranging from 6 to 17 while associated online work of reconstructed internal forces is presented in Fig. 12. Selected elements are illustrated in Figs. 13, 14 and 15 for integration schemes of size seven, ten and fourteen. Weights are represented in boxplots in Fig. 16.
While offline performances, plotted in Fig. 11, are very similar between the different methods, reducing constraints has improved online performances, except for integration  We remark that being accurate over the training data does not ensure good performances in the online reduction phase. As a matter of fact, overfitting over the training data is observed with the Reduced MIP + LS method. This phenomenon may be explained by the relatively sparse sampling of the training data, thus, even when reproducing the same test case, online results may differ. The Reduced MIP + LS is always more accurate than the Reduced MIP in the offline training phase as weights have been reoptimized on the unreduced training data. Nevertheless, it does not outperform the Reduced MIP in the online reduction phase, the latter offering better online internal forces approximation for hyper-reduced quadratures of size 8, 11, 12, 13, 14, and 16. Elements are similarly selected by the different approaches, as observed in Figs. 13, 14, and 15 where most selected elements concentrate in the necking zone around the hole. By contrast with the proposed method, the SNNLS does not deselect elements when quadrature size increases, leading to different quadratures for similar prescribed precision in the learning phase. Differences are more visible on intermediate-size quadratures. In Fig. 15, Reduced MIP and the Reduced MIP + LS did not select the same elements. Obviously, when all elements are selected, results are the same for all methods. The Reduced MIP + LS solution corresponds, in fact, to the Reduced MIP for 15 elements in which one element has been unselected by the single pass in the SNNLS algorithm.
Weights comparison in Fig. 16 indicates overfitting issues of both reduced approaches in some cases. Overfitting issue is most pronounced in quadratures of small and large sizes, while quadratures of size eleven and thirteen exhibit uniform weight repartition regardless of the method. Neither constraints reduction nor adding weights re-optimization seem to circumvent this issue. Overall, the SNNLS algorithm still provides the most homogeneous weights repartition.

Discussion
The SNNLS algorithm shows out to be an excellent heuristic to solve the optimization problem arising in the ECSW training phase. It provides good performances on the pierced plate example both in offline and online phases compared with the proposed approach.
Adding consistency conditions to the MIP formulation seems to deteriorate online performances. On the contrary, constraints reduction improved online performances.
Further testing should include the effect of the maximal weight value ζ max , introduced in "Hyper-reduced integration" section, on both offline and online performances.
Both offline and online accuracy of the different methods have been investigated in details. A computationally expensive method for the NP-Hard quadrature problem has been used to find a near-optimal solution to be used as a reference. While the computational cost of the training phase may be considered as a secondary issue, it rapidly becomes excessive when the size of the problem increases, making a proper convergence study difficult. Concerning CPU times, the online phase depends only on the quadrature's size, it is therefore similar for different methods. In the offline phase, further progress can be obtained by properly choosing ξ min and ξ max which is clearly problem dependent.

Conclusion
The contribution of this paper is the formulation of the hyper-reduction problem in terms of a linear mixed-integer optimization problem, imposing a threshold on the infinity norm of the training data reconstruction error. This formulation enables optimal solution by IBM CPLEX algorithm, which is then compared to the standard SNNLS heuristic from the literature.
The impact of additional consistency constraints on volume preservation and polynomial integration is studied.
Due to high computational cost, the problem is reformulated using reduced constraints set.
All results are compared in both offline and online phases of projected Galerking ROM on an explicit nonlinear transient dynamics case.
The different methods are implemented in a research development branch of the industrial FE solver Altair Radioss [39].
The obtained results may be used as reference for evaluating different hyper-reduction schemes. While the proposed MIP method remains costly for larger problems, the use of infinity norm in the offline trianing phase enhances online performances and the proposed constraints reduction allows also for further research on heuristic schemes.
This work is focused on the technical aspect of hyper-reduction. Nevertheless, further research is required on the influence of hyper-reduced Galerkin ROM on such topics as timestep, wave propagation, precision of acceleration.