Structure preserving model order reduction of large sparse second-order index-1 systems and application to a mechatronics model

ABSTRACT Nowadays, mechanical engineers heavily depend on mathematical models for simulation, optimization and controller design. In either of these tasks, reduced dimensional formulations are obligatory in order to achieve fast and accurate results. Usually, the structural mechanical systems of machine tools are described by systems of second-order differential equations. However, they become descriptor systems when extra constraints are imposed on the systems. This article discusses efficient techniques of Gramian-based model-order reduction for second-order index-1 descriptor systems. Unlike, our previous work, here we mainly focus on a second-order to second-order reduction technique for such systems, where the stability of the system is guaranteed to be preserved in contrast to the previous approaches. We show that a special choice of the first-order reformulation of the system allows us to solve only one Lyapuov equation instead of two. We also discuss improvements of the technique to solve the Lyapunov equation using low-rank alternating direction implicit methods, which further reduces the computational cost as well as memory requirement. The proposed technique is applied to a structural finite element method model of a micro-mechanical piezo-actuators-based adaptive spindle support. Numerical results illustrate the increased efficiency of the adapted method.


Introduction
This article discusses an efficient technique for model-order reduction (MOR) of large-scale sparse second-order index-1 descriptor systems. Mainly, we focus on second-order to secondorder balancing of such systems, in order to preserve the structure of the original model. We consider second-order systems of the form (1) CONTACT Jens Saak saak@mpi-magdeburg.mpg.de where zðtÞ 2 R n 1 , φðtÞ 2 R n 2 are the states, M; D and K 2 R nÂn are the finite element method (FEM)-matrices, H 2 R nÂp is the input matrix and the output matrix is H T , i.e. we assume collocated actuators and sensors. The corresponding control inputs and measurement outputs to the system are denoted by u(t) and y(t), respectively. The matrix D a 2 R mÂp represents the direct feedthrough from the input to the output. The matrices M; D; K and H are sparse and the first three are symmetric. Moreover, we assume the matrix block K 22 to be nonsingular. We call (1) an index-1 system due to the analogy to first-order index-1 (see the next section) linear timeinvariant (LTI) systems. See, e.g. [2,Chapter 2] and the references therein about the index of such structured descriptor systems. The dynamical system (1) arises, e.g. in mechatronics, where mechanical and electrical components are coupled with each other. In the specific case of the model example we use in the numerical experiments, the index-1 character results from the multiphysics application with very different timescales. This allows to treat one variable by a (quasi-)stationary analysis, while the other is covered fully dynamic. If the model is very large, performing the simulation with it requires prohibitively high computational effort or is simply impossible due to the limited computer memory. Therefore, reducing the size of the system is unavoidable for fast simulation. A classical approach to find a reduced-order model (ROM) of second-order index-1 descriptor systems is to first rewrite (1) in first-order form. Then MOR techniques are applied to find a reduced first-order state space system [1]. Under these circumstances, since the block structure of the original model is obfuscated in the reduced model, one cannot go back to the second-order representation if it is desired, to use software designed for second-order systems as, e.g. in flexible multibody simulations.
During the recent years, structure preserving MOR of second-order systems received a lot of attention, see e.g. [3][4][5][6] and the references therein. But, all of those approaches are only for standard second-order systems. As in our earlier work [1,7] on this model, also in this article, our goal is to apply MOR to the high dimensional model in (1) and replace it by the substantially lower dimensional modelM € zðtÞ þD _ zðtÞ þKẑðtÞ ¼ĤuðtÞ; yðtÞ ¼Ĥ Tẑ ðtÞ þD a uðtÞ; whereM;D;K 2 R lÂl ,Ĥ 2 R lÂp and l ( n. It is required that y Àŷ k k is small and the ROM preserves necessary properties, e.g. stability, passivity and symmetry of the original model. This article is concerned with balancing-based structure-preserving MOR of the second-order index-1 system (1). The central idea of this method is to truncate the less important states from the system, which correspond to the negligible system Hankel singular values (HSVs). The system HSVs are the square roots of the eigenvalues of the product of the controllability and observability Gramians [8] or equivalently the singular values of the product of the two Gramian factors [9]. A system is balanced if both the Gramians are identical and diagonal with decreasingly ordered entries which are the system's HSVs. The magnitudes of the HSVs have direct relations to the energy contributions of the corresponding state components in the total input-output behaviour of the system. The smaller the value, the lower the energy portion is and thus the less important the state component.
It is known that the most expensive part in this particular MOR method is to solve the two Lyapunov equations determining the system Gramian factors, which are the key ingredients in the derivation of the truncating matrices for forming the ROM. For a large sparse LTI system, the LRCF-ADI (low-rank Cholesky factor-alternating direction implicit) method [10,11] is one efficient method to compute these Gramian factors. We have already investigated this in [7] for a large sparse second-order index-1 descriptor system. In contrast to [7], in this article, the LRCF-ADI method is updated by exploiting the symmetry properties of the systems, and computing real Gramian factors applying the ideas from [12]. Moreover, we use the residual factor-based stopping criterion [13] to terminate the LRCF-ADI iteration. The presented algorithm to compute the low-rank Gramian factor is based on the second-order index-1 descriptor system (1).
The proposed techniques are applied to a piezo-actuated structural FEM model of a certain building block of a parallel kinematic machine tool. Numerical results illustrate the superiority of the new technique compared to our results in [7].

Model example
We investigate a part of the experimental machine tool as shown in Figure 1. It is a complex system, where a piezo-actuator-based adaptive spindle support (ASS) is mounted on a parallel kinematic machine in order to gain additional positioning freedom and accuracy during machining operations (see [14,15] for more details). The important purpose of the piezo-sensor and -actuator is to control active vibration or shunt damping so that the machine can ensure a highquality product. For analysing the mechanical design and performance of the ASS, a mathematical model as in (1) is formed using the FEM, where M, D and K are the mass, damping and stiffness matrices, respectively. The time-dependent state vector z(t) consists of the components of mechanical displacements and φðtÞ are the electrical charges. Separating the mechanical and electrical parts, M 1 , D 1 and K 11 are, respectively, mechanical mass, mechanical damping and mechanical stiffness matrices. The matrix K is composed of mechanical (K 11 ), electrical (K 22 ) and coupling (K 12 ) terms. The general force quantities (mechanical forces and electrical charges) are chosen as the input quantities u, and the corresponding general displacements (mechanical displacements and electrical potential) are the output quantities y. The total mass matrix contains zeros at the locations of the electric potential equations. More precisely, the electric potential (degrees of freedom [DoF] for the electrical part) is not associated with an inertia. The equation of motion of the mechanical system in (1) can be found in [16]. This equation results from a finite element discretization of the balance equations. For piezo-mechanical systems, these are the mechanical balance of momentum (with inertia term) and the electro-static balance. From this, the electric potential without inertia term is obtained. Thus, for the whole system (mechanical and electrical DoF) the mass matrix has rank deficiency. There are several ways to transform (1) into its equivalent first-order forms, see e.g. [2] for a selection. In this article, we prefer the following representation: The advantage of this representation is that if all coefficient matrices in system (1) are symmetric, then so is the matrix in (3). Moreover, the input-output matrices are still transposes of each other after the transformation. This fact can be exploited in the solvers and the MOR process. It accelerates the computations and allows to preserve stability in the reduced-order model, which can not be guaranteed for general second-order systems [3].

The BT method for second-order systems and related issues
In this section, we briefly review the BT method for second-order LTI systems M€ xðtÞ þ D _ xðtÞ þ KxðtÞ ¼ HuðtÞ; yðtÞ ¼ LxðtÞ þ D s uðtÞ; where M, D and K are nonsingular, and x(t) is the n dimensional state vector. Transforming (5) into first-order form yields The controllability Gramian W c 2 R 2nÂ2n and the observability Gramian W o 2 R 2nÂ2n for system (6) are the solutions of the Lyapunov equations The Gramians can also be defined from a physical point of view. Defining an energy function Glover in [8] shows that the optimal value of the minimization problem min u JðuÞ is This is the required minimal energy to steer the state of system (6) from t ¼ À1 to the state ζ 0 at time t = 0. Based on the optimization problem (8), the Gramians for the second-order system (5) are first defined in [17]. Let us consider the following optimization problems and min x 0 min u JðuÞ s:t: M€ xðtÞ þ D _ xðtÞ þ KxðtÞ ¼ HuðtÞ; xð0Þ ¼ x 0 : Due to the structure of system (6), the controllability Gramian can be compatibly partitioned as The authors in [17] (see also [18]), prove that the optimal solution to problem (10) is x 0 , which is the minimal energy required to reach the given velocity _ x 0 over all past inputs and initial values. The solution of problem (11) is x 0 P À1 p x 0 , which is the minimal energy required to reach the given position x 0 over all past inputs and initial values. Here, P v and P p are defined as secondorder controllability velocity Gramian and position Gramian, respectively. Analogously, one can interpret the observability Gramian W o by using duality arguments. Then partitioning W o as we obtain the observability velocity Gramian Q v and position Gramian Q p . We consider R as a low-rank controllability Gramian factor such that W c % RR T . The structure of the first-order system allows us to split R as Therefore, the controllability Gramian can be written as Hence, we have Similarly, considering W o % LL T , we find Apparently, R v and R p are obtained from the first n rows and the lower n rows of R, respectively. Analogously, L v and L p can be obtained from the first n rows and the lower n rows of the low-rank observability Gramian factor L. Once we have R α and L β , where α; β 2 fv; pg, the balancing transformation can be formed using the SVD and defining Here U αβ;1 and V αβ;1 are composed of the leading k columns of U αβ and V αβ , respectively and AE αβ;1 is the first k Â k block of the matrix AE αβ . Now, the ROM as in (2) can be formed by constructing the matrices of reduced dimensions: [3] for more details.
The fundamental drawback of the balancing based model reduction is the necessity to compute the two Gramian factors by solving two Lyapunov equations. Among several approaches, the LRCF-ADI method has been developed that allows to exploit the fact that often all coefficient matrices are sparse and the number of inputs and outputs are very small compared to the number of DoFs. We refer the reader to [10,19] and the references therein for details on the LRCF-ADI approach. Recently, this prominent method has been updated by exploiting the ideas of computing real lowrank Gramian factors [12] and a low-rank residual based stopping criterion. For convenience, the updated version of the LRCF-ADI (GLRCF-ADI) algorithm is summarized in Algorithm 3.
This algorithm either successively computes Z ¼ R for ðÊ; F; NÞ ¼ ðE; A; BÞ or Z = L for ðÊ; F; NÞ ¼ ðE T ; A T ; C T Þ. In this algorithm, μ i È É J i¼1 are called the ADI shift parameters or simply shift parameters [11]. A set of good shift parameters is necessary for fast convergence of the algorithm. Although several strategies are available in the literature (see e.g. [20] and the references therein), in this article, we restrict ourselves to the heuristic approach introduced in [11] and an adaptive choice following [20].

The BT method for second-order index-1 systems
This section discusses the balancing based model reduction methods for the second-order index-1 descriptor system (1). For a survey on balancing-related MOR methods for general descriptor systems, we refer to the recent survey [21]. In this case first, we transform the second-order index-1 descriptor system (1) into a standard system (5), where Note that K, H will then usually be dense matrices. Section 5 gives details on how to avoid forming them.
The first-order representation of this standard second-order model is obtained as in (6). Since the first-order form is symmetric (A T = A, E T = E) and the input-output matrices are transposes of each other (C = b, C = B T ), the controllability Gramian and the observability Gramian coincide, i.e. W c = W o = W and only one Lyapunov equation needs to be solved. Therefore, we can consider Note that the following section will discuss how to solve the Lyapunov equation (17) efficiently. Once we have computed Z v and Z p by solving the Lyapunov equation, following (13) and (14) we can compute four types of balancing transformations as shown in Table 1.
Now, using these projectors we obtain four types of ROMs, as in (2). In each case, the reducedorder matrices are constructed aŝ Note that in the VV and PP cases, we have the following result.
Theorem 4.1. Let the equivalent first-order system be as in (3), i.e. symmetric in the sense that both E and A are symmetric and the output matrix is the transpose of the input matrix. Then the truncating projection in the VV and PP cases becomes a Ritz-Galerkin projection. Moreover, stability is preserved by the reduction process.
Proof. By construction, for the VV and PP cases, the projection matrices T L and T R coincide. Thus, the Petrov-Galerkin projection becomes a Ritz-Galerkin method. Since, furthermore, the system is symmetric and stable, it is dissipative. Then by Bendixon's Theorem [22], the projected system is stable, as well.

Computation of the low-rank Gramian factors for second-order index-1 systems
This section concentrates on the efficient computation of Z v and Z p as defined above for the second-order index-1 DAEs (1) by solving Lyapunov equation (17). As a follow-up to our previous work [1], we want to apply Algorithm 3 with all its efficiency improving features. In the following, we discuss some computational strategies in this regard. We know that the most expensive part in the LRCF-ADI iteration is to solve a linear system in each iteration step. The linear system can be solved directly or iteratively. In either case, for our problem, avoiding the Schur complement formulation, i.e. K 11 À K 12 K À1 22 K T 12 À Á and exploiting the second-order block structure, we can accelerate the computation. In the following, we discuss these issues.
When we solve the Lyapunov equation (17) by applying Algorithm 3, in the ith step (see Step 3 in Algorithm 3), we need to solve This is equivalent to Now, inserting M, D and K from (16), linear system (21) yields It can easily be shown that reversing the Schur complement, instead of solving linear system (22), we can solve the linear system Although the dimension of the matrices in (23) is larger than that of (22), it can (in contrast to (22)) be treated using a sparse direct solver [23, Ch. 5], or any suitable iterative solver [24], since we removed the dense blocks resulting from the explicit Schur complements. The computation can be accelerated further by splitting linear system (23) as follows.
A simple algebraic manipulation on (23), again leads us first to solve the linear system for χ 2 , then to compute Thus, in the ith step V i can be computed by The whole procedure is presented in Algorithm 3.
Algorithm 3: SOGS-LRCF-ADI for the second-order index-1 systems Solve Update low-rank solution factor 11 Note that in this algorithm to compute the exact residual, our initial guess is

Shift parameters selection
It is known that for fast convergence of Algorithm 5, proper ADI shift parameter selection is crucial. Several approaches are proposed in the literature to compute the shift parameters. See, e.g. [20] for an overview of different shift selection approaches. For a large scale dynamical system an often used ADI shift selection technique is Penzl's heuristic procedure [11]. In [20], the authors propose an automatic shift generation technique that automatically selects shifts during the execution of the algorithm, rather than before. There, the technique is called adaptive procedure and numerical experiments show that the approach performs very well for a first-order index-1 power system model [25]. Here, we investigate both the techniques and propose a modification to the adaptive shift selection procedure. The heuristic procedure for selecting some suboptimal ADI parameters for Algorithm 3 has been discussed in our previous work (see e.g. [7]). For a second shift selection technique we recall [20]. There, the shifts are initialized by the eigenvalues of the pencil λE À A projected onto the span of B. Then, whenever all shifts in the set have been used, the pencil is projected onto the span of the current V i and the eigenvalues are used as the new set of shifts. Here, we use the same initialization. For the update step, however, we extend the subspace to all the V i generated with the previous set of shifts. Let us assume that this (orthogonalized) extended subspace is U. Now from the eigenvalues of λU T EU À U T AU, select some desired number of shifts by solving the so called ADI min-max problem like in the heuristic procedure. Repeat this approach while the algorithm has not converged within the given tolerance. Note that our system is dissipative, i.e. all the eigenvalues of λ E þ E T À Á À A þ A T À Á lie in the left complex halfplane. Therefore, Bendixon's theorem [22] ensures that all the eigenvalues of the projected pencil are stable and thus are admissible shifts.

Numerical results
In this section, we illustrate numerical results to asses the accuracy and efficiency of our proposed technique. The method is applied to a set of data for the finite element discretization of the ASS discussed in Section 2, see also [26]. The dimension of the original model is n = 290137, which consists of n 1 = 282699 differential equations and n 2 = 7438 algebraic equations. It is exactly the model we have investigated in [1], where we have not exploited the symmetry of the matrices to the same extent as here.
All the reduction results presented in the following have been obtained using MATLAB 7.11.0 (R2010b) on a board with 4 INTEL XEON E7-8837 CPUs with a 2.67-GHz clock speed, 8 Cores each and 1 TB of total RAM and nonexclusive access. In order to get representative timings for the speed-up checks, these have been performed using MATLAB 8.0.0 (R2012b) on a system with 2 INTEL XEON X5650 CPUs with a 2.67-GHz clock speed, 6 cores each and 48 GB of total RAM that was running the tests exclusively.
To execute Algorithm 2, we compute the low-rank Gramian factor Z using Algorithm 5. To implement this algorithm, we use both heuristic and adaptive shift parameters as mentioned in Section 5. First, we consider 40 heuristic shifts out of 60 large and 50 small magnitude approximate eigenvalues (see [7] for details on the computation of heuristic ADI shift parameters for the ASS model). This is equivalent to the selection in [1]. The algorithm is stopped by the maximum number of iteration steps, i.e. i max ¼ 400. Next we apply the adaptive shift computation approach to compute Z. In this case, again i max ¼ 400 iteration steps are taken. As we can see in Table 2, the performance of the adaptive shifts is better than that of the heuristic shifts in terms of the final normalized residual norm, although both versions of the algorithm have not fully converged.
Note that before forming Z v and Z p by taking the upper n 1 rows and lower n 1 rows of Z, we apply a column compression on the low-rank Gramian factor Z (see, e.g. [27] for the column compression technique) to remove linearly dependent columns.
Algorithm 2 is applied to the ASS model considering the truncation tolerance 10 −5 to the relative magnitude of the HSVs, i.e. we remove all singular values that are at least five orders of magnitude smaller than the largest one. In this case, the different second-order balancing approaches compute disparate dimensional reduced systems as shown in Table 3. The frequency domain comparisons of the full and different dimensional reduced systems are shown in Figure 2 on the wide range 10 1 -10 4 [rad/s]. Figure 2(a) shows the frequency responses of full and reduced systems with good matching. The absolute error and the relative error of the frequency responses of full and reduced systems are exhibited in Figure 2(b,c), respectively. As we can see in Figure 2(c), the relative errors for all reduced systems are well below the truncation tolerance (10 −5 ). We further compute the 40, 30, 20 and 10 dimensional ROMs using the same algorithm via balancing the system on the position-position level. In this case, the frequency responses of the reduced systems also resemble the graph in Figure 2(a). Figure 3    relative errors between the full and different dimensional ROMs. We observe that, as expected, the lower the dimension of the reduced models, the higher the relative error. But even a very low dimensional model, e.g. the model of dimension 10, preserves the important feature of the original model's input/output behaviour, such that all of them are expected to work well as the basis of a high performance controller. Figure 4 shows that the ROM are all stable numerically as expected from the theory. Figure 5 presents some particularly interesting SISO relations for full and different dimensional ROMs. We selected the exact same relations as in [1] for comparison with our earlier approaches. Since in the SISO case we know that the transfer function matrix is just a scalar rational function, here we have computed the absolute values of the transfer function in different frequencies. The relative errors between the original and ROMs of the respective SISO relation are also shown in the same figure. Comparing the results to the ones in [1], we see that using the new method, we can achieve a comparable error with models of almost half the dimension. Thus the new implementation is faster, requires less than half the memory during computations and, in contrast to the old method, guarantees stability of the ROM. Although we have observed stability in the old implementation as well, the new one is to be preferred due to all its advantages.

depicts the
The acceleration we get from the exploitation of the block structure, i.e. the speedup of Algorithm 3 compared to Algorithm 1, are similar to the ones reported in [4,13]. Comparing the computation times of the new symmetry exploiting implementation with the one required to perform the general reduction approach reported in [1], we observe that the new version is roughly four times faster. The largest part of the time is spent computing the Gramian factors. A factor of about two is expected from the fact that here, we only need to compute one factor instead of two in the earlier approach. The real factor computation also only needs to solve one complex shifted linear system per pair of complex shifts instead of the two required before. This motivates a further factor of two, since in the vibrating system we investigate, most eigenvalues appear as complex pairs and, thus, so do the ADI shifts. In fact in our test, we observe a factor of 4.05 longer runtime for the previous implementation. Note that in both cases, we limited the number of ADI iterations to 100 in order to have a representative average per step timing and still fit both versions into the 48 GB RAM even with the suboptimal memory management experienced in MATLAB. The combined runtime for the computation of Gramian factors, projection matrices and reduced-order models was roughly 1 h and 15 min for the new symmetry exploiting case and 5 h and 6 min in the previous approach. We also observed that due to the improved shift selection, the convergence is often accelerated, especially in the index-1 case, see e.g. [20]. This can increase the speedup even further due to earlier termination of the Gramian factor computation.
If desired, an α-shift strategy as proposed in [25] can be incorporated into both methods. This usually further accelerates convergence in the Gramian factor computation.
Additional details and numerical experiments can be found in [2,Chapter 2].

Conclusions
In this article, we have introduced an enhanced technique for structure preserving model reduction of a special class of second-order index-1 differential algebraic systems using balanced truncation. It is shown that the appropriate choice of the first-order representation of this type of systems enables us to solve one Lyapunov equation only, which reduces both computational cost and memory requirement by roughly one half. The Lyapunov equation is solved efficiently by using recent updates to the basic low-rank ADI iteration. We also have adapted the way to solve the linear system in each iteration step of the low-rank ADI method by exploiting the block structure. Compared to our earlier article [1], a different reformulation to first-order form was used. Similar to [1], we enable the optimal exploitation of the original matrices, which may dramatically decrease the computational cost. The performance of our proposed strategy has been demonstrated for one large FEM model of an ASS employing piezo actuators with almost 300,000 DoF, proving the applicability of our method in real world problems. In the numerical results, we have seen that even the 10 dimensional model nicely preserves the general behaviour of the transfer function of the original model in frequency domain. Therefore, it is expected to have a  Figure 5. The rows respectively, show the 1st input (mechanical force) to 1st output (displacement), 9th input (electrical potential) to 1st output, 1st input to 9th output (charge) and 9th input to 9th output relations (left) and the respective relative errors (right) of full and reduced systems (dimensions indicated in the legend) for position-position balancing. good performance in controller design. In [3], the authors showed that in general none of the existing BT techniques for second-order systems guarantees stability of the reduced systems. For the case of symmetric systems with collocated inputs and outputs, using our proposed method, stability can be guaranteed by Theorem 4.1, which is numerically depicted in Figure 4 and serves as the possibly most important advantage of the new method.