Minimization and Eulerian Formulation of Differential Geormetry Based Nonpolar Multiscale Solvation Models

Abstract In this work, the existence of a global minimizer for the previous Lagrangian formulation of nonpolar solvation model proposed in [1] has been proved. One of the proofs involves a construction of a phase field model that converges to the Lagrangian formulation. Moreover, an Eulerian formulation of nonpolar solvation model is proposed and implemented under a similar parameterization scheme to that in [1]. By doing so, the connection, similarity and difference between the Eulerian formulation and its Lagrangian counterpart can be analyzed. It turns out that both of them have a great potential in solvation prediction for nonpolar molecules, while their decompositions of attractive and repulsive parts are different. That indicates a distinction between phase field models of solvation and our Eulerian formulation.


Introduction
Solvation is an elementary process in nature and is of paramount importance to sophisticated chemical, biological and biomolecular processes including signal transduction, DNA recognition, transcription, posttranslational modification, translation, protein folding and protein ligand binding. The understanding of solvation is an essential prerequisite for the quantitative description and analysis of biomolecular systems. In particular, the calculation of solvation free energy has captured a great deal of interests [2]. Solvation models have been developed ranging from simple phenomenological modifications of Coulomb's law, implicit solvent models that describe the solvent by mean-field approximations [3][4][5][6][7], explicit solvent models that treat the solvent in molecular or atomic detail [8], to complex quantum mechanical methods [9][10][11][12]. Implicit solvent methods become very popular in solvation analysis and the quantitative description of general biological processes due to their promising balance between accuracy and efficiency.
Solvation free energy is a physical quantity that can be measured by experiments. To calculate the solvation energy, it is helpful to break down the solvation process into a nonpolar process of inserting the uncharged solute into solvent and a polar process of charging the solute in vacuum and solvent [13]. Nonpolar solvation is generally associated with the insertion of the uncharged solute into solvent. There are many nonpolar solvation models available, among which a solvent-accessible surface area (SASA) model is commonly used. SASA models state that nonpolar solvent-solute interactions are proportional to the area of the solvent-solute interface. It is worthy to note that the proportional constant varies dramatically in the literature because the energies of other processes are also assumed to be proportional to SASA [14]. Roughly speaking, SASA models are based on the scaled particle theory (SPT) [5,15] which actually includes the energy of surface tension effect and the mechanical work of immersing a particle into solvent. Studies indicate that the nonpolar distribution should depend on the solvent-accessible volume and surface area, with a crossover to SASA when the size of solute is large [16]. Moreover, recent works by Levy, Gallicchio, and others [16][17][18][19][20] have demonstrated the importance of the treatment of attractive solute-solvent dispersion terms in nonpolar solvation models.
Recently, intensive efforts have been made to develop differential geometry (DG) based multiscale models for solvation analysis, electrostatic field and transport of complex chemical and biological systems [1,[21][22][23]. One essential feature of such models is the use of differential geometry of surface as a natural means to dynamically couple a discrete description of solute and continuum description of solvent. The main idea is to construct a total energy functional of the system to encompass energies of interest. Then variational analysis leads to desired partial differential equations (PDE) for surface generation and other unknowns. DG based solvation models can be implemented in an Eulerian formulation [22,23] and a Lagrangian formulation [13]. While the Lagrangian formulation based DG solvation models are similar to those of Dzubiella et al. in spirit [24,25], the discription of surface tension, nonpolar solvation free energy and the treatment of the solutesolvent interface are different in these two approaches.
Our DG based nonpolar solvation model in Lagrangian formulation was able to predict nonpolar solvation energy in excellent agreement with experimental data for a large number of nonpolar molecules [1]. With implicit solvent, we use the following model for nonpolar solvation free energies [16] Gnp = · Area + p · Vol + ∫︁ Ωs ρs U vdW dr, where is the surface tension, 'Area' is the area of the solute, p is the hydrodynamic pressure, 'Vol' is the volume of the solute, ρs is the solvent density, Ωs denotes the solvent region, and U vdW is the solvent-solute van der Waals (vdW) interaction potential. The first two terms in Eq. (1) are those from the SPT model [5,15]. The first term is the surface energy. It measures the disruption of intermolecular and/or intramolecular bonds that occurs when a surface is created. The second term is the mechanical work of creating the vacuum of a biomolecule in the solvent. The third term represents the attractive dispersion effect near the solvent-solute interface which has been shown by Wagoner and Baker [16] to have played a crucial role in an accurate nonpolar solvation analysis. The nonpolar solvation formula (1) has been demonstrated to be in a good agreement with explicit solvent solvation forces on proteins [16] and RNA hairpins [6]. Work by Levy and co-workers has demonstrated the good performance of a similar model [17][18][19][20]. The objective of this work is three-fold. First, we theoretically prove the existence of a global minimizer in our nonpolar solvation free energy functional. This provides a solid mathematical basis for previous numerical implementations [1]. Second, we present an Eulerian formulation of differential geometry based nonpolar solvation model and its performance under a similar parameter setting to its Lagrangian counterpart [1]. By doing so, we are able to analyze the connection, similarity and difference between the Lagrangian and Eulerian formulations. Third, we look into the distinction between phase field models of solvation [26,27] and our Eulerian formulation.
The rest of this paper is organized as follows. We first prove the existence of a global minimizer for our Lagrangian formulation of nonpolar solvation model proposed in [1]. Two proofs are provided here. One is a direct approach based on the sharp interface formulation. The other one is inspired by a recent work [27] using a phase field model. They are followed by a brief description of the numerical scheme to find the minimizer. Then an Eulerian formulation of DG based nonpolar solvation model is proposed with its implementation in the solvation calculation for nonpolar molecules. Finally, numerical results from our Eulerian formulation are demonstrated. This paper ends with a conclusion remark.

Minimization of nonpolar solvation free energy and its numerical scheme
In our Lagrangian formulation of DG based nonpolar solvation model, the interface can be represented as a closed surface in the 3D Euclidean space, denoted as Γ. The solute region Ωm(Γ) and the solvent region Ωs(Γ) are separated by Γ and therefore they can be regarded as functions of Γ. We express the surface area and volume in Eq. (1) as the following integrals where dσ represents the infinitesimal surface element on the solute-solvent interface. The van der Waals (vdW) potential U vdW is computed pairwisely for each atom U vdW (r) = ∑︀ i U LJ i (r). The formulation is under the assumption that the nonpolar solute-solvent potential is pairwise. A 6-12 L-J pair potential is formulated as follows: Eq.
(3) is used in OPLS-AA force field [19,28] where well depth parameters are ϵ is = √ ϵs ϵ i , and σ is = 2 √ σ i σs in which σ i and σs are solute atomic and solvent radii, respectively. In this work, σ is = σ i + σs. Here r is the position variable and r i is the position of the i-th atom. Then Eq. (1) becomes Note that U vdW (r) can be formulated by ∑︀ i U att i (r) in which U att i (r) represents the attractive part of Lennard-Jones potential [16,22,23]. To this end, the L-J potential can be divided into attractive U att i and repulsive U rep i in different ways. It can be a "6-12" decomposition: or a Weeks-Chandler-Andersen (WCA) decomposition based on the original WCA theory [19]:

Minimizer existence
In this section , we will prove the existence of a global minimizer of nonpolar solvation free energy functional (4). Related mathematical notations and assumptions will be given first. Then a formal minimizer statement and its proofs will be described in detail. A function u ∈ L 1 (Ω) is of bounded variation in Ω if [29,30] ∫︁ Ω |∇u|dx := sup{

Due to the lower boundedness of U vdW and the fact that
is of finite value. Based on the above assumptions, we attempt to prove the existence of a global minimizer of the nonpolar solvation free energy functional G 0 np [M] stated as follows: The first proof of theorem 2.1: , M ∈ A} and be finite. Now there exists a sequence M k ∈ A(k = 1, 2, · · · ) such that , compact embedding BV(Ω) ˓→ L 1 (Ω) , and Helly's selection theorem, there exists a subsequence of {χ M k }, not relabeled, such that χ M k → χ M * in L 1 (Ω) for some Lebesgue-measurable set M * ⊂ Ω and Then lim If needed, we may take a further subsequence of Now the combination of (7),(8),(9) leads to

The second proof via a phase-field model
The second proof can be done by using a phase-field model [27]. For the purpose, a phase-field model of nonpolar solvation free energy needs to be constructed first. Then one proves that the phase-field representation of energy functional Γ-converges to the sharp interface counterpart (5). Finally, the existence of a global minimizer of sharp interface based nonpolar solvation free energy functional (4) is followed by the existence of a global minimizer of the corresponding phase-field based functional. Since the functional (4) is the nonpolar part of the energy functional (1.6) in [27], the second proof can be considered as a part of the results in [27]. Readers may refer to [27] for proof details. However, for the reader's convenience, the main results and related key lemmas are described in this paper.
In general, phase-field models are useful tools for the numerical simulation of interfacial problem to avoid the explicit treatment of boundary condition at a sharp interface. They are constructed in order to reproduce the given sharp interface based dynamics. In the limit of an infinitesimal interface width, the correct interfacial dynamics can be recovered. According to the sharp-interface based functional (5), a phase-field model is described as follows [27]. First of all, we define a continous phase-field function s : Ω → R which has values close to 0 and 1. The solute and solvent regions are approximated by 1 and 0, respectively. While, a thin transition layer, having value between 0 and 1, is used to describe the diffuse solute-solvent interface and then solvation system. With interface width parameter ξ ∈ (0, 1), we consider the phase-field functional G ξ np : where W(s) = 18s 2 (1 − s) 2 is called van der Waals-cahn-Hilliard functional. The second term in Eq. (10) Γconverges to the area of solute-solvent interface as ξ → 0. Notation ξ k ↘ 0 is used in this paper to indicate that ξ k is a decreasing sequence such that ξ 1 > ξ 2 > · · · and ξ k → 0 as k → ∞. Two Lemmas are used to prove theorem 2.1. Lemma 2.1 (refer to Lemma 4.1 in [27]) For any ξ ∈ (0, ξ 0 ] and ξ 0 < 1, there exists a s ξ ∈ H 1 (Ω) such that To prove the Γ-Convergence of phase-field based energy functional to the sharp interface based counterpart, we rewrite Eq.(5) in the following: It is easy to see that Eq. (11) is the same as Eq. (5). Now the Γ-convergence of free energy functional is stated as Lemma 2.2. Lemma 2.2 : (refer to Theorem 2.1 in [27]) For any sequence ξ k ↘ 0, the sequence of functional G ξ k np : In other words, the following holds: For any s ∈ L 1 (Ω) and any sequence ξ k ↘ 0, there exists a sequence Finally, theorm 2.1 can be proved by using Lemma 2.1 and Lemma 2.2. Proof two of theorem 2.1 [1] for each k. Also by the inequality (4.1) in [27], the corresponding Thus χ M minimizes G 0 np #

Solution and mathematical implementation for minimizer
To obtain the optimal solvent-solute surface, we take the first variation on nonpolar energy functional with respect to surface definition Γ to find a geometry flow equation. A brief description of variational analysis is given here, readers may refer to paper [13] for technical details. First, we consider a surface element f(u 1 , u 2 ) and its infinitesimal displacement in the normal direction where N is the outward unit normal direction and φ is an arbitrary C 2 function and u 1 , u 2 are two parameters for a 2D surface.The surface variation of the area leads to where H is the mean curvature. Second, the volume variation with respect to Γ by means of the variation with respect to ε yields Third, the first variation of the volume integration of solute-solvent vdw interaction can be attained as follows Eventually, we have Since φ is an arbitrary function, the following condition must be satisfied for each point on the optimized interface During the minimization process, the directional derivative of solvation free energy functional G 0 np in the direction of a normal variation φ can be expressed as If we choose φ(u 1 , u 2 ) = −Wn, then This means that the total free energy decreases along the normal direction when φ(u 1 , u 2 ) = −Wn until it reaches a minimum. Therefore the evolution f (ε) = f − εWnN leads to a steady state and associated solventsolute interface with strictly smaller energy. This analysis motivates the following potential driven geometric flow equation for the optimal solute-solvent interface where X ∈ Γ ⊂ R 3 is a position vector on the evolving manifold Γ.
To avoid numerical difficulties in handling topological changes during the biomolecule surface evolution, we embed a Lagrangian operator into its Eulerian representation. To this end, we introduce an arbitrary hypersurface function S(r) with r ∈ R 3 . It is easy to verify that the unit normed vector can be expressed in Then the desired surface can be represented as a set of points with a constant value of function S where L is an isosurface value. By the chain rule where X is a 3D position vector confined to the manifold Γ. Moreover, the surface mean curvature H can be rewritten in terms of S as According to Eq. (20) and (23), one has where all terms should be expressed in terms of the level surface function S.

Multiscale modeling of nonpolar solvation: Eulerian formulation
From a microscopic point of view, there is no sharp interface between solvent and solute. In principle, an isolated molecule can be analyzed by the first principle -a quantum mechanical description of the wavefunction or density distribution of all the electrons and nuclei. However, such description is computationally intractable for large biomolecules. Under physiological condition, biomolecules are in a non-isolated environment, and are interacting with solvent molecules and/or other biomolecules. Therefore, their wavefunctions overlap spatially, so do their electron density distributions. The Eulerian formulation becomes important in the sense that it is able to produce an overlapping solvent-solute boundary, which may be able to describe the true physical boundary between the solvent and solute when its generation is governed by the variational analysis -the total free energy optimization [22]. Moreover, an Eulerian formulation is useful because it is able to handle topological changes directly during the surface evolution. Finally, an Eulerian formulation avoids the complex interface problem that arises from a Lagrangian formulation for the electrostatic solution in an implicit solvent model.
Previously, an Eulerian formulation of multiscale solvation model has been proposed and used in polar solvation calculation [22]. Here, the solvation of nonpolar molecules is considered where the electrostatic interaction is negligible. This simplified case minimizes modeling uncertainties and confirms the reliability of the Eulerian formulation of the differential geometry based solvation model for the description of solvation free energy. The details of modeling, numerical schemes and parameterization are described briefly below.

Model and method
Let us consider a multi-domain setting of a biomolecule and solvent system. The biomolecule is described in discrete atomic detail, while the solvent is treated as a continuum. Therefore, the domain Ω ∈ R 3 is essentially divided into two (types of ) regions, i.e., solvent domain Ωs and biomolecular domain Ωm. As such, one has Ω = Ωs Similarly the attractive dispersion term can be rewritten in the form where we assume that the solvent bulk density ρs is a constant in space. For the surface area term in Equation (1), we make use of the co-area formula in the geometric measure theory [32]. Specifically, we introduce a concept of mean surface area of a family of isosurfaces which are subsets satisfying {S(r) = y|, 0 ≤ y ≤ 1}. Therefore the mean surface area can be given by a volume integral as = ∫︁ Ω ‖∇S(r)‖dr.
Note that ∇S ≠ 0 only in the region of the solvent-solute boundary. Numerical test of this formulation was validated in our earlier work [22]. Finally, nonpolar free energy functional of solvation for biomolecules at equilibrium is given by ∫︁ Ω ‖∇S(r)‖ + pS(r) + ρs(1 − S(r))U vdW dr. (28) Now, the nonpolar solvation free energy functional is a functional in terms of the characteristic function S. To obtain the optimal function S, we take the first variation on Gnp with respect to S to yield where ∇ · (︁

∇S ‖∇S‖
)︁ is a generalized Laplace-Beltrami operator. is treated as a constant in our present computation. In general it can be a function of the position = (r) to reflect surface hydrophobicity at different locations. The solution of Eq. (29) leads to a "physical solvent-solute boundary" S. As discussed in earlier work [22,33], the solution of this elliptic partial differential equation can be attained via a parabolic partial differential equation where the generalized "potential" V is defined as Note that in Eq. (30), as t → ∞, the initial profile of S evolves into a steady state solution, which solves the original Eq. (29). In this section, . The discretization scheme used here for the solution of the generalized geometry flow equation (30) is similar to what we designed previously [22,33]. It can be rewritten in the form For the initial value of S, we consider where we define the domain enclosed by the solvent accessible surface to be D = ⋃︀ Na i=1 {r : |r − r i | < r i + rp}, with rp being the probe radius. Then S ∈ H 1 can be satisfied via a numerical smoothing process. Here Na denotes the total number of atoms for a given biomolecular system. Let atom centers be r i = (x i , y i , z i ), i = 1, · · · , Na, and r i represents the radius of the ith atom. To make the computation more efficient, we assume that the domain enclosed by van der Waals surface is always pure solute (S = 1) , and we only update the values of S(x, y, z, t) at the points in between the van der Waals surface and the solvent accessible surface; i.e., (x, y, z) ∈ ⋃︀ Na i=1 {r : r i < |r − r i | < (r i + rp)}. Numerically, to avoid possible zeros in the denominator of Eq. (32) we add a very small number, such as 10 −7 , to the denominator. That turned out not to affect the result at all [22].
Regarding the parameterization in our nonpolar model, the nonpolar solvation model involves parameters, such as, surface tension , hydrodynamic pressure p, L-J well-depth parameters ϵ is , solvent density ρs, solvent radius σs and solute radii σ i . The parameter optimization is measured by root mean square (RMS) error between the calculated and solvation experimental data. In this work, the solvent density ρs is fixed at 0.0334 1/Å 3 [1,16]. Moreover, , p and ϵ is are considered as fitting parameters. Therefore, an iterative procedure is designed in the following process: (a) Choose a trial set of molecules with given atomic coordinates, radii, and experimental data of solvation free energies. Then take an initial set of parameters , p and ϵ is for the trial set. (b) For the jth molecule, solve Eq. (30) to compute surface area, molecular volume and solvation free energy G j np with current parameter values. (c) Set up a target function where G j,exp np are experimental data of solvation free energies. (d) All parameters p, and ϵ is are updated by resolving a nonnegative least squares problem to determine non-negative parameters. (e) The iterative procedure (b)-(d) continues until convergence reaches within a pre-set tolerance for the above fitting parameters.

Results and discussion
First of all, a set of 11 alkanes, including linear, branched, and cyclic apolar compounds, is taken as a calibration set to validate our model and its numerical implementation. Experimental data is available for this set of hydrophobic solute [16,17,34]. Note that recently Wang et al [35] used an Eulerian formulation of DG based full solvation model to calculate solvation free energy for both polar and nonpolar molecules in a unified formulation. However, in the present work, nonpolar solvation free energy is calculated only by nonpolar free energy containing terms related to surface area, volume and Lennard-Jones (LJ) solvent-solute interactions.
With the set of 11 alkane compounds, optimized parameters for alkanes are obtained. In particular, surface tension = 0.0715 kcal/(mol Å 2 ), solvent pressure p = 0.0154 kcal/(mol Å 3 ) , LJ parameters ϵcs = 0.496 kcal/mol and ϵ hs = 0.00 kcal/mol. Here we fixed the solvent radius σs =0.65 Å and carbon atom radius σc =1.87 Å. Interestingly, the pressure p=0.0154 kcal/(mol Å 3 ) is comparable to experimental measurement which is 0.0248 kcal/(mol Å 3 ). Surface tension = 0.0715 kcal/(mol Å 2 ) is also comparable to its experimental measurement which is 0.103 kcal/(mol Å 2 ). With the optimized parameters, results of 11 alkane compounds calculated from the present model are shown in the top section of Table 3. It is evident that our model accurately catches subtle differences between linear, branched, and cyclic apolar compounds. Meanwhile, it perfectly reproduces the total solvation free energies of 11 alkanes. The root mean square (RMS) error is found to be as small as 0.11 kcal/mol. It is worthwhile to point out that current nonpolar Eulerian formulation does not need to apply artificially enlarged Van der Waals radii in solvation analysis as required by our earlier Eulerian representation [22] We also examined the importance of dispersion interactions U vdW in the solvation analysis by carrying out simulations with and without the dispersion term. Computed solvation results and their comparisons are listed in Table 1 for the set of 11 alkanes. It turns out that the dispersion interaction is necessary for our DG based solvation model and substantially improves the performance of current nonpolar solvation model [16,19]. Let us point out that the optimized pressure parameter is 0 without the dispersion term. This directly supports the surface-area-only type of nonpolar solvation models when the dispersion attraction is not taken into consideration.
When current results of 11 alkanes are compared with those from the previous Lagrangian formulation of DG based nonpolar model [1,36], it is found that both of them reproduce the total solvation free energies very well for each molecule. The RMS errors are 0.11 kcal/mol and 0.12 kcal/mol, respectively. The comparison of total solvation free energy is displayed in Figure 1. This again indicates that the framework of our differential geometry based solvation model can be powerful for solvation prediction in both Lagrangian and Eulerian formulations [13]. However, they are different when one looks into each contributed term. Solvation decomposition results listed in Table 2 show that the repulsive and attractive parts of solvation free energies differ from each other. In particular, the magnitudes of repulsive and attractive energies are smaller in the Eulerian formulation than those in the Lagrangian formulation, although the summation of them are almost the same. Moreover, we conducted a predictive study for other 19 alkane compounds. the optimized parameters obtained from the above training set of 11 alkane molecules are utilized. Results are shown in the bottom section of Table 3 , in which the repulsive and attractive decomposition is also demonstrated. Our predictive values fit the experimental data very well and their comparison is displayed in Figure 2. The RMS error is 0.34 kcal/mol.
Having demonstrated the accuracy and reliability of our approach for the alkane molecules, we further carry out a prediction of a set of 11 alkene compounds which have been studied by Ratkova et al. [37]. By  assuming the same solvent behavior, we use the same set of optimized parameters obtained from the alkane training set. Solvation free energies of 11 alkene compounds are shown in Table 4. They match the experimental data very well and their comparison is shown in Figure 3. The RMS errors is 0.24 kcal/mol. The result is significantly better than that reported in [37] using integral equation techniques, which is about 0.462 kcal/mol, and is close to that obtained by our Lagrangian approach which is 0.18 kcal/mol [1]. Note that in the previous Lagrangian calculation, the surface tension parameter was optimized particularly for the chosen alkene set while we did not fit any parameter here.

Conclusion
In this paper, the existence of a global minimizer of Lagrangian formulation based nonpolar solvation energy functional, which was proposed and implemented previously [1], has been proved. It provides a mathematical foundation for our model justification and numerical implementation. Two proofs are given here. One is a direct method based on the sharp interface model (or Lagrangian formulation). The other one is via a phasefield model. The latter gives an insight of the relationship between the sharp interface model and its phasefield counterpart, namely, the phase-field model Γ-converges to the sharp interface model with respect to L 1 convergence. Moreover, an Eulerian formulation of our differential geometry based nonpolar solvation model is proposed and implemented using a very similar parameterization strategy to the corresponding Lagrangian approach in [1]. It turns out that our Eulerian formulation works perfectly in blindly predicting total solvation free energies of nonpolar molecules without enlarged van der Waals radii. In addition, the calculated results of total free energy are almost the same as those predicted by our Lagrangian formulation, while the decom- Table 4: Numerical and experimental total solvation free energies for 11 alkenes. The numerical energy is the sum of the repulsive part and attractive part. The free parameters are chosen as = 0.0715 kcal/(mol Å 2 ), p = 0.0154 kcal/(mol Å 3 ) and ϵcs = 0.496 kcal/mol and ϵ hs = 0.00 kcal/mol  positions of attractive and repulsive parts are clearly different. That indicates the similarity in the prediction power for total solvation between Eulerian and Lagrangian formulations as well as their differences in the modeling approach and the resulting repulsive and attractive contributions. Finally, together with the relationship between the phase-field model and Lagrangian formulation, differences between Eulerian and Lagrangian formulations imply a distinction between phase-field models of solvation and our Eulerian formulation. In both models, the targeted functions appear to be quite similar, namely, S=1 in the solute region , S=0 in the solvent area, and S ∈ (0, 1) at the solute-solvent boundary region. However, the description of each term and its resulting calculation in nonpolar solvation functional are different. Moreover, a phase-field model is normally used to approximate a sharp interface model while Eulerian formulation is not. In the future, the existence of a global minimizer for the Eulerian formulation based total solvation free energy functional will be explored. In addition, current parameterization scheme will be extended to the solvation prediction for both polar and nonpolar molecules, and then to other applications such as the description of ionic density near a solute.