pp. X–XX NUMERICAL SOLUTION AND FAST-SLOW DECOMPOSITION OF A POPULATION OF WEAKLY COUPLED SYSTEMS

The modeling of the microphysics of a population of atmospheric particles interacting through a common medium leads to the solution of a large system of weakly coupled differential-algebraic equations. An implicit time discretization of the system of differential-algebraic equations is solved with a Newton method at each time step. The structure of the global system and the sparsity of the Newton matrix allow the efficient use of a Schur complement approach for the decoupling of the various subsystems at the discrete level. A numerical approach for the decomposition of the population into fast and slow subsystems is proposed. Numerical results are presented for organic atmospheric particles to illustrate the properties of the method.


1.
Introduction.The modeling of the dynamics of aerosol particles is a crucial step in the simulation of atmospheric processes [12,16,17].The computation of the microphysics of a population of aerosol particles is an actual challenge, since particles of different sizes are interacting through a common medium (the gas phase), and directly by coagulation.
A numerical method for the resolution of large population of subsystems is presented here.The interactions between subsystems are assumed to hold through a common medium (or agent), and direct interactions are neglected.The model leads to a large system of differential equations, constrained by the global minimization of the internal energy of each subsystem [1,2,3].The corresponding chemical model equations have been presented in [4].
Due to the wide range of particles' sizes and the diversity of the chemical components involved, the resulting system of differential equations is stiff [15].Moreover large time steps are usually required to cover the large time intervals of simulation.A mathematical and numerical framework is proposed here.An implicit first order time discretization of the differential-algebraic system is considered.The corresponding system of nonlinear algebraic equations is solved with a Newton method.

ALEXANDRE CABOUSSAT AND ALLISON LEONARD
Numerical linear algebra techniques based on the Schur complement are very efficient for the resolution of the Newton system, when considering a large number of subsystems of different sizes [7,8].
The reaction speed and the time to reach a stationary solution vary from one subsystem to the other.A decomposition technique is advocated to decouple fast and slow subsystems, for both theoretical and numerical purposes.Numerical results exhibit a continuum of speeds among the various particles that depend mainly on the size of the particles.A decomposition of the subsystems into fast and slow entities [6,13] is therefore proposed based on the behavior of the residuals and sizes for each subsystem.This captures the fast and slow scales without restriction on the time stepping procedure [5].Numerical experiments in a simple, two particles, case, show that the fast-slow decomposition allows the computational cost of the algorithm to decrease, without sacrificing the accuracy.
2. Model problem.Let N > 0 be a given integer denoting the number of subsystems.A population of N spherical particles, each with corresponding radius r i , i = 1, . . ., N is considered, as illustrated in Figure 1.The particles do not interact directly with each other, but each particle interacts with the common medium (or agent).Let us denote by b 0 , resp.b i , the chemical concentration vector in the common medium, resp. in the subsystem number i.The model problem reads as follows: where g is a non-convex, smooth internal energy function, f is a given smooth flux function, and A ∈ R n×m is a matrix describing the chemical reactions and phase separations inside each subsystem; the variable z i (t) is the global minimum of the optimization problem at time t ∈ (0, T ) corresponding to the right-hand side data b i (t) in the constraint.At each time t ∈ (0, T ), the radius r i of the particle is directly computed from the variables b i as r i (t) = h(b i (t)), i = 1, . . ., N , where h : R n → R is a given mapping that depends on the physical properties of the subsystem [4].The parameters r i in (1) are those governing the reaction speed of each subsystem.By replacing the global minimization problem in (1) by its first order optimality conditions (describing the stationary points of the optimization problem), a system of nonlinear differential-algebraic equations of index one [10] is obtained, and reads: where λ i (t) is the Lagrange multiplier related to the equality constraint Az i (t) = b i (t), θ i (t) is the Kuhn-Tucker multiplier related to the inequality constraint z i (t) ≥ 0, and G : R m+n+n+m → R m+n is defined as: The set of stationary points, i.e. the set of solutions to G(z, b, λ, θ) = 0, includes not only the global minimum of the original non-convex optimization problem in (1), but also any local minima, maxima or saddle-points, implying that the solution of the algebraic part of (2) is not necessarily unique, and a bifurcation phenomenon may exist when z i (t) = 0 [2,3].
An interior-point method for the relaxation of the inequality constraints in (2) has been described in [1].Tracking techniques for the activation/deactivation of inequality constraints are reminiscent of the detection of discontinuities for ordinary differential equations or delay differential equations [9,10,14].In order to focus on the behavior of a large population of subsystems, the inequality constraints appearing in the last equation of (2) are neglected in the present article; the unknowns θ i are therefore also neglected and G : R m+n+n → R m+n is redefined as: The numerical solution of (2)(3) is addressed for large values of N , and the repartition of subsystems into fast and slow entities is discussed.
3. Numerical method and numerical linear algebra.A first order implicit time scheme is advocated for the discretization of (2)(3).Consider a time step h > 0, t n = nh, and ) respectively.An implicit first order discretization of (2)(3) reads, for all n ≥ 0: The discretization of (2)(3) can be achieved with more accurate methods than the first order implicit Euler scheme.For instance, in the case of one single subsystem, a Crank-Nicolson scheme has been used in [3], and a numerical method based on the RADAU5 method [11] has been presented in [14].We plan to extend the latter to the case of a population in the future.
The solution of the (large) system ( 4) is obtained with a Newton method.Let us define x i ) := (b n i , x n i ), i = 1, . . ., N , the increments are computed as the solutions of the following Newton system: where the matrices in (5) are defined by , and the right-hand sides are defined as Finally the Newton iterates are updated by setting b until some stopping criterion is satisfied.The specific features of this linear system come from the mixing of optimization and differential-algebraic equations.
The numerical method for the solution of (5) relies on the sparse structure of the linear system.In order to avoid additional errors induced by an iterative method for the solution of linear systems, a direct method is preferred.When the size n of the last block line is small compared to the number of diagonal blocks N , a Schur complement approach is known to be efficient [7,8].The underlying idea is to eliminate the weak coupling variable b 0 , and decouple the linear systems relative to each subsystem.The Schur complement system is first constructed and solved, where For h > 0 sufficiently small, the linear system (5) and the Schur complement S are invertible, and therefore ( 6) is well-defined, and solved with a LU decomposition.Although S is invertible, it may be ill-conditioned.If needed, a regularization of S can be achieved by adding positive terms in its diagonal.Adding terms to the diagonal of S corresponds to modifying either H i , O i or H 0 in the original system.Increasing the diagonal terms of H i and H 0 corresponds to an artificial decrease of the time step h, while increasing the diagonal terms of O i can be interpreted as a convexification of the objective function g in the underlying optimization problem.
Once the solution of ( 6) is obtained, a sequence of linear systems is solved for given p b0 : The solutions of systems similar to (8) are also required for the construction of the Schur complement system (7).The cost of the total algorithm is equivalent to the solution of one system of size n (6), and N (n + 2) systems of size m + 2n (8).
Remark 1.The use of a Schur complement strategy for this kind of matrices is standard.Note that the matrix S is full, but of reasonable size n.On the other hand, we can take advantage of the special block structure of the matrices appearing in (8) in a second level of resolution as explained below.
Remark 2. The linear system (8) is block-structured.A hierarchic sequence of linear algebra techniques can be designed.Problem (8) is therefore solved by LU decomposition where Optimization-like techniques similar to those presented in [1] can be developed for the solution of linear systems with matrix V i .The simulation of large population of subsystems (N >> 1) may suffer from scalability issues when decomposition methods are not used.The computational cost of the algorithm for the solution of ( 5) is at least proportional to the number of diagonal blocks N , but may be larger when using direct methods without decomposition.Numerical results presented in the next section show the scalability properties of the Schur complement algorithm, and the computational cost when N becomes large.
4. Numerical results.We consider the three-component system (n = 3) formed by 1-hexacosanol (C 26 H 54 O), pinic acid (C 9 H 14 O 4 ) and water (H 2 O) at temperature 298.15 [K] and pressure 1 [atm].The energy function g is a non-convex function defined by the chemical UNIFAC model (see [1] and references therein).The flux f depends on the radius r, b 0 and z only, and are given by f (t, b 0 , z, r) = ϕ(r)(b 0 − Γe ∇g(z) ), where ϕ(r) is the mass transfer rate, and Γ is a constant depending on vapor pressures and temperature [2].The radius is computed by the relation 4  3 πr i (t , where m k is the molecular weight and ρ k is the density of the chemical component k. The evolution of three particles of different radii is considered, when the global system reaches its equilibrium.A time step of τ = 0.5 is chosen.Initial concentrations are given by b 1,0 = (2.0 Figure 2 (left) shows the evolution of the normalized particle feed b i (t)/e T b i (t), i = 1, 2, 3 in the two-dimensional simplex (triangle).Each vertex of this twodimensional simplex corresponds to a composition without mixing, while any point in the interior of the simplex corresponds to a specific mixing of water, pinic acid and 1-hexacosanol.Regions separated by solid black lines are determined by the number of activated inequality constraints [2]. Figure 2 (right) illustrates the evolution and the convergence to a stationary solution for the radii r i (t), i = 1, 2, 3.The scalability of the algorithm is discussed by considering a population of N particles of identical initial concentrations b i (0) = (0.2, 0.1, 0.7) T , together with b 0 (0) = (2.001,0.00114, 1.27) T .A simulation with 20 time steps of size h = 0.5 is considered.Numerical results exhibited in Figure 3 show that the Schur complement approach exhibits a (optimal) linear growth of the CPU time with respect to the number of diagonal blocks N .On the other hand, when using the monolithic approach that consists of solving directly ( 5) with an LU-type method, the CPU time increases faster (but is not proportional to N 3 , due to the sparsity of the matrix).Such a direct resolution of ( 5) with a LU decomposition is actually never used in practice, and the results in Figure 3 shows that it is computationally not tractable for that particular application.Figure 3. Scalability of the numerical method: Left: CPU times for the simulation of the system of differential-algebraic equations, as a function of the number of subsystems.Right: log-log plot of the CPU times for the simulation of the system of differentialalgebraic equations, vs. the number of subsystems N ; decomposition approach (solid line); monolithic approach (dashed line).
5. Fast/slow scales.In atmospheric chemistry applications, the size of each particle strongly influences the time before reaching a stationary or nearly-stationary solution, and small particles are reaching equilibrium more rapidly [16].From the numerical viewpoint, small particles therefore require smaller time steps to capture the fast scales, while larger ones allow larger time steps (see for instance [5]).Since the sizes of particles are forming a continuum spectrum, the separation between fast and slow entities is not trivial.Fast subsystems evolving on a fast time scale require a small time step to allow the internal concentrations to reach a stationary solution, while the concentrations in slower subsystems stay approximately constant.When fast subsystems have reached a stationary state, a larger time step can be used for the simulation of the slow scales.
Particles of various sizes are considered to study the qualitative behavior of the population and design decomposition techniques into fast and slow subsystems.Let us define b = (0.2 , 0.7 , 0.1) T , b 0 (0) = (2.001,0.00114, 1.27) T , and b i (0) = 10 k • b for k = −1, 0, 1, 2, 3, 6 respectively.These initial values correspond to N = 6 values of radii (reaction speeds).These values are typical values for large atmospheric particles.
Figure 4 show the evolution of the subsystems, by showing the difference between two successive (normalized) iterates b n+1 Particles with a smaller radius are represented with solid lines on Figure 4 and considered fast variables.The evolution of the increments shows a small overshoot that indicates that the time step cannot be larger for these subsystems.Particles with larger radii are represented with dashed lines on Figure 4, and considered slow variables.Figure 4 (left) shows the continuum of time scales among various particles.When considering only two particles with radii r 1 = 1.53 • 10 −6 and r 2 = 6.64 • 10 −4 , Figure 4 (right) shows that a clear distinction between the two regimes can be made.The time to reach a stationary state is approximately twice longer when the radius is 10 2 times larger.As expected, the radius of the particle can therefore be used as a criterion to distinguish between fast and slow scales.With such continuous spectrum, the separation of several scales is not trivial, and the subject of current research.We propose in the following a first numerical strategy for the decomposition of the subsystems into fast and slow entities when considering only two categories of sizes (or speeds) for the particles in the population.Note that our goal at this point is not to design decomposition methods for the accurate simulation of all scales, but to investigate how to reduce the computational time in some simple cases, without sacrificing the accuracy of the solution.
The system (5) is first reduced to a system of smaller size by freezing the slow subsystems.A small time step is used for the reduced system until the fast subsystems reach a stationary state.Then the slow subsystems are reintroduced into the system and a larger time step is used.This alternating procedure is repeated.
Figure 5 compares the evolution of the trajectories b n i /(e T b n i ) when using two approaches: i) when solving the complete system with two particles, or ii) when deactivating the slow particle until the fast particle reaches a stationary state.The trajectories of the fast particle by using both methods are very similar, and the maximal difference between the values of b 1 (t)/(e T b 1 (t) obtained by the two methods (in the Euclidean norm) is smaller the 1.38 • 10 −4 .This implies that the influence of the slow subsystem on the fast one can be neglected in a first stage.On the other hand the CPU time when deactivating the slow particle is 1.16[s] vs. 2.02[s] when computing with both subsystems, implying a significant gain of CPU time. ) for the resolution with two particles ( ) and with one fast particle only (•).The trajectories for the fast particles are nearly superimposed.Right: Representation of the ratio between radii obtained with both methods.The ratio for the fast particle is close to one, implying less than 1% of error on the size of the slow particle.
As expected, the simulation of the fast particle with the decomposition method is very close to the original one.Errors introduced for the slow particles have to be compensated for by reintroducing the slow variables.More thorough strategies for the decoupling of the population into fast and slow entities will be investigated in the future, following [5] for instance.When the particles cannot be distinguished into only two categories (fast/slow), decomposition techniques are not straightforward.The amplitudes of the residuals have to be categorized into several classes (depending of their order of magnitude) at each time step.We plan to investigate a hierarchic approach to define e.g.fast/slow particles at several levels: among the fast particles, we will distinguish recursively the various fast/slow scales, and repeat the procedure.The number of repetitions of the procedure is still to be determined, but can rely on the number of size bins used in three-dimensional global climate models [16] (typically between 10 and 20 size bins).
6. Conclusion.The modeling of a population of subsystems at equilibrium has been addressed.The subsystems are weakly coupled through a common medium.An efficient numerical method has been proposed and optimal results for the computational cost have been exhibited.The decomposition of the subsystems between various time scales has been investigated as a function of the size of the subsystems, and a continuum of scales throughout the population has been obtained.A first decomposition method, valid only when two kinds of particles (fast/slow) exist in the system, has shown to provide accurate results with significantly smaller computational cost.Application to atmospheric organic particles has been presented.A recursive approach for the decomposition of the various scales will be investigated in the future.

Figure 1 .
Figure 1.Population of N subsystems in a common medium (N = 4).

Figure 2 .
Figure 2. Dynamics for three particles and three components (water, pinic acid, 1-hexacosanol).Left: evolution of the normalized concentration vectors b i (t)/e T b i (t), i = 1, 2, 3, in the phase diagram; right: evolution of the radii r i (t) i = 1, 2, 3, of the particles.
i /(e T b n+1 i ) − b n i /(e T b n i ) for i = 1, . . ., N , as a function of the time iterations.It shows that the transient time before reaching a stationary solution depends strongly on the size of the subsystem.

Figure 5 .
Figure 5. Fast/slow decomposition: Comparison of the trajectories of a two-particle system when freezing the slow particle.Left: comparison of the trajectories b n i /(e T b n i ) for the resolution with two particles ( ) and with one fast particle only (•).The trajectories for the fast particles are nearly superimposed.Right: Representation of the ratio between radii obtained with both methods.The ratio for the fast particle is close to one, implying less than 1% of error on the size of the slow particle.