MODELING DNA THERMAL DENATURATION AT THE MESOSCOPIC LEVEL

. In this paper a mesoscopic approach is proposed to describe the process of breaking of hydrogen bonds during the DNA thermal denaturation, also known as DNA melting. A system of integro-diﬀerential equations describing the dynamic of the variable which characterizes the opening of the base pairs is proposed. In the derivation of the model non linear eﬀects arising from the collective behavior, namely the interactions, of base pairs are taken into account. Solutions of the mesoscopic model show signiﬁcative analogies with the experimental S-shaped curves describing the fraction of broken bonds as a function of temperature at the macroscopic level, althought we instead simu-late the variation in time. With this respect a research perspective connecting the theoretical results to the experimental one is proposed.


1.
Introduction. Although if very specific phenomena are considered in the literature, the modeling of processes occurring in the DNA molecule remains a challenging research field of applied mathematics. In this paper we focus our attention on the thermal denaturation of DNA (or DNA melting) and the related macroscopic Sshaped curves (experimentally observed -see [32,33]) of broken bonds of base pairs vs. temperature, starting from a mesoscopic model and applying a multiscale approach -cf. [21].
The genetic information on the development and functioning of living organisms is contained into the DNA molecules. The Watson-Crick model [37] describes the structure of DNA: it is composed of two single-strand molecules wrapped around one another in the shape of a double helix (dsDNA, namely double stranded DNA). Every strand contains a sequence of the four components called bases: A -adenine, C -cytosine, G-guanine and T -thymine. The two strands are linked by weak hydrogen bonds connecting base pairs, but only two types of base pairs are allowed: A-T which is connected by two hydrogen bonds and C-G which is connected by three hydrogen bonds. DNA denaturation is the process of separation of the two strands due to the breaking of the hydrogen bonds; this process gives rise to the ssDna, namely single stranded DNA. The stable configuration of the dsDNA can be disrupted through different processes: by heating, by mechanically applied forces or by exposure to a low salt concentration. In particular, it is called DNA thermal denaturation the process such that under conditions of sufficiently high temperatures, the DNA double helix "melts" achieving the ssDNA configuration; the melting temperature is experimentally defined as the temperature at which the fraction of unbound base pairs reaches about half of its maximal value. The DNA melting curve is the fraction of unbound base pairs versus temperature at equilibrium and it has a typical S-shape -see e.g. [33,32].
The process which lead the dsDNA configuration to the ssDNA one is involved in many important biological phenomena such as transcription, i.e. the reading of the genetic code, or replication; moreover the same process is used in many synthetic biomolecular techniques such as PCR (Polymerase Chain Reaction), a technique to amplify a piece of DNA -cf. [33,32] and references therein -hybridization and antigen targeting among many others (see [1] and references therein). Experiments involving the DNA thermal denaturation give also useful data for modeling; moreover it is a process which is monitored in quite a simple procedure because it is accompanied by a large increase of the absorbance of UV light at 260 nm.
From the modeling viewpoint, it is worth mentioning a model introduced by M. Peyrard and A.R. Bishop [28] at the end of the 1980's, where theinteractions between pair bases are introduced at the microscopic level by an appropriate anharmonic stacking interaction potential modeling the change in the electronic distribution on the bases when the hydrogen bonds are broken. The Peyrard-Bishop model is based on the observation that the two state variable (0 for closed base-pair and 1 for open base-pair) adopted in the classical Ising models [36], is not able to capture the nonlinear dynamical features of DNA denaturation, while it should be possible to account for nonlinear interactions, in principle, by using a continuous variable ("stretching parameter") describing the intermediate opening states. In the Peyrard-Bishop model a continuous real variable is adopted to describe the stretching of the bond linking the bases. Usually, in performing numerical simulations, thermical fluctuations are considered by means of a thermal bath, thus obtaining a thermalized nonlinear DNA model [26]. The interest of the P-B model relies also on the fact that it gives an example of the possibility of an entropy-driven transition in a one-dimensional system, showing the existence of a first-order temperature driven transition where the bulk entropy changes at the transition [10]. An extensive use of this model is present in literature; for instance its multifractality is analyzed in [5,6], trying tho show that the multifractality of the DNA melting is independent of the model terms and the applied methodology. Numerical results based on the P-B model and regarding the "denaturation bubbles" (i.e. denaturated, single-stranded domains) length distribution in dependance of temperature are presented in [18], to cite a recent one; many applications of the model can be found in literature (see for instance [11,38]).
Theoretical models of DNA denaturation based on the entropy-enthalpy interplay can be first addressed to  model, where the DNA molecule is modeled as an alternating sequence of intact double-helical and denatured, singlestranded domains. One of the first to apply the Poland-Scheraga model was Fisher [14]. As an example of applications of the P-S model, one can see [16] where a generalization of the P-S model of DNA denaturation in the presence of an external stretching force is analyzed and a temperature-force phase diagram is obtained showing a lower melting temperature. In [16], a revisiting of the P-S model is introduced with the perspective to apply the introduced methodology in the general context of measurements and studies of polymers. Other papers comparing obtained results with the P-S model are [31,3] and [4], where the influence of the conservation of the linking number, i.e. the number of times that two chains of the DNA wind around each other, on DNA thermal denaturation is considered. A connection of the entropy of "denaturation bubbles" to the quantum entanglement entropy is made in [34].
From the point of view of mechanics, it is worth mentioning the Worm Like Chain based models (WLC): a standard way of modeling the mechanical behavior of nanoscales filamentous structures in thermal equilibrium. These models present the interesting feature that a single parameter, i.e. the elastic parameter of bending modulus of the filament, is used. In the framework of WLC models or the so called augmented helix coil WLC, the nonlinear elasticity of DNA under large energy deformations is often modeled. For a recent reference see for instance [13], where the breaking of hydrogen bonds is considered as the principal effect associated with the base pairing interaction: the authors basic idea is that the melted regions of DNA have a higher free energy per unit length than the ordered DNA, but a significantly lower bending modulus. In [19] an interesting comparison of a modeling principles in the framework of WLC model with results obtained in the frameworks of P-B and P-S models is presented.
Generally speaking, the DNA has the property to store the genetic information and, from the modeling viewpoint, its specificity lies in the highly non linear and localized motions it undergoes and in the high rapidity of the processes of heating determining the fact that the system is strongly out of equilibrium; moreover DNA thermal denaturation is indeed a collective process due to the fact that involves many copies of base pairs. These features are characteristic of complex systems and one of the greatest difficulties of the modeling approaches is the selection of the representation and modeling scale, which depends also on the properties one is interested in; there is a general agreement in literature on the fact that the study of biological systems needs a multiscale approach [9,21] although many technical and conceptual problems are connected to this [7,8]. Multiscale frameworks involve also the problem of finding the "right" approach regarding to numerical simulations.
A preliminary approach to the modeling DNA melting at the mesoscopic scale in terms of integro-differential equations was proposed in [21]. The microscopic parameter characterizing every bond was a stretching parameter measuring continuously the opening of the hydrogen bonds between base pairs. The main disadvantage of the model in [21] was the assumption that every two bonds interact with the same intensiveness and the rôle of temperature was not directly taken into account (temperature was increasing proportionally with time).
In the present paper we propose a more realistic model assuming the interaction between the nearest-neighbor bonds and taking separately the interaction and temperature-depending mechanisms. Moreover we include two types of bonds: possibly weaker and stronger in order to take into account the A-T and C-G bonds.
The present paper is organized as follows. We propose a new mesoscopic model -Eq. (1) -of DNA termal denaturation. The model takes into account the (time) evolution referred to the stretching parameter of a test bond of DNA. Theorem 3.1 states the basic properties of the model: the existence and uniqueness of solutions in a natural space setting. Then the model is simplified to its symmetrized version by Eq. (27). Its macroscopic version is Eq. (11) -a system of ODEs for the first moments. Theorem 4.1 states the asymptotic properties of the solutions to Eq. (11). In some particular it is shown that the first moments satisfy the logistic equation and therefore they correspond to the S-shaped curves [32,33]. Moreover, some simulations show that this agreement can be achieved even in the case in which analytical results are not available. In the case we study the numerical solutions are not related to the percentage of broken bonds as function of temperature, as in the experimental curves, however they have a similar behavior.

2.
A new mesoscopic model of DNA denaturation. In the present paper we consider some aspects of deoxyribonucleic acid DNA thermal denaturation process (cf. [26,27,1,36]). The denaturation process produces melting curves: the fraction of broken bonds of base pairs as a function of temperature, having a typical S-shaped -cf. [36] and [33,32]. Continuing the idea of [21] we propose here a new, more adequate, model that takes into account at the mesoscopic level the time evolution of the probability distribution of the state of hydrogen bonds. Two types of bonds (with two and three hydrogen bonds) and the direct dependence on the temperature are included in the model.
We number the base pairs (also called 'bonds') A-T, C-G by the discrete variable j ∈ J , J = {1, 2, 3, . . . , n}, n ≥ 3, and we consider the continuous variable u ∈ [0, ∞[ representing the stretching of the distance between the two connected bases. The variable u is called stretching parameter in the sequel.
Every base pair ('bond') is then characterized by two variables: the discrete variable j and the continuous variable u. The discrete variable belongs to one of two subsets of bonds: J 2 -two hydrogen bonds connecting A and T and J 3three hydrogen bonds connecting C and G. By the definition it follows that We may assume that the three hydrogen bonds are more resistant to heating than the two hydrogen bonds. We consider the probability densities f j , j ∈ J that describe the distribution of the variable u at the j-th bond. We are going to propose a general mesoscopic model containing the terms • A j -describing stochastic change of the parameter u due to the interactions between two neighborhood base pairs; • B j -describing stochastic change of the parameter u upon heating. In such a case, the general class of bilinear systems of (Boltzmann-like) integrodifferential equations describes the dynamics of entities undergoing kinetic interactions (at the mesoscopic level) reads (n ≥ 3) and Moreover we assume that for all j, k and for a.a. u, v, w ∈]0, ∞[, and for all j and for a.a. u, v ∈ ]0, ∞[. The functions B j describe the growing probability density of raising the stretching parameter as the temperature T is increasing. Therefore we assume that B j , j ∈ J , depend on the temperature T in such a way that where γ T = 1 for T ≤ T 0 , for some given T 0 , and γ T is an increasing function of T for T ≥ T 0 . We may note that B j must be singular at v = 0. If T changes with time t (cf. Ref. [20]) then γ T is a function of time. In such a case we assume that it is a continuous function of time For simplicity of notations, we shall not indicate directly the temperature Tdependence of B j .
Next we assume that where b (2) and b (3) are non-negative constants such that 0 < b (3) ≤ b (2) < ∞ to model that the three hydrogen bonds are more resistant to heating than the two hydrogen bonds. The functions A j,j±1 and a j,j±1 describe the interactions between the nearestneighbor bonds (as we have stated before only the nearest-neighbor bonds are interacting). We assume where α is a positive constant. Assumption (8) reflects the fact that the higher the stretching parameter in the nearest-neighbor bonds the higher is the probability of raising the stretching parameter in the given bond. The bonds with high stretching parameters have stronger impact on the nearest-neighbor bonds than those with small stretching parameters.
There is a difference between the ending bonds j = 1, j = n and those j = 2, . . . , n − 1 as in the former case they have only one neighbor bond.

MARINA DOLFIN AND MIROS LAW LACHOWICZ
Moreover we assume for a.a. v, w ∈ ]0, ∞[. Again as in the case of B j we may note that A j,j±1 must be singular at v + w = 0. The above ad hoc assumptions may be modified in the case of availability of further information on the properties of DNA. Therefore From the a priori non-negativity of the solution we obtain the following a priori estimate for the solution sup where c t is a constant that depends on t. This estimate assures existence of solutions for arbitrary large times.
By Eq. (11) the moments x j :=f j , j ∈ J , satisfy Assuming for simplicity that in (7) b (2) one may observe that Eq. (14) has two equilibrium solutions Then the solution to Eq. (14) is defined by the solution x * to the logistic equatioṅ with the initial datum X * , that is the solution (x 1 , . . . , x n ) of Eq. (14) corresponding to the initial data (X 1 , . . . , X n ), is such that Therefore x j satisfies the macroscopic logistic equation and thus the first moments corresponds to the S-shaped logistic curve.
and all bonds are initially distributed with the same probability density F * on [0, ∞[ then the solution (x 1 , . . . , x n ) of Eq. (14) corresponding to the initial data (X 1 , . . . , Moreover x * = x * (t) is a solution oḟ with initial datum X * , and Proof. This is a classical problem of ODEs. It is known that the solution to Eq.
Let ε > 0 be arbitrary but fixed. By the assumptions Therefore the first term on the RHS of Eq. (22) tends to 0 for t → ∞, i.e. it is estimated by ε 3 for t sufficiently large. We consider then the second term on the RHS. Let t 1 > 0 be such that We have Taking t sufficiently large we obtain and therefore Finally, taking t ≥ t 2 : and the statement follows.
The following numerical solutions, for non-homogeneous initial conditions, show the analogies with the experimental S-shaped curve of broken bonds vs. temperature, although we here take into consideration the variation in time of the corresponding variable. In all simulations α = 4; the most appropriate value for all the involved parameters can be obtained by a comparison of the theoretical melting curves with the experimental ones whenever a particular DNA sample is chosen.  In the last two numerical solutions ( Fig.5 and Fig.6) a sequence of alternating A-T and C-G bonds have been simulated. 4. Symmetrized model. The end points j = 1 and j = n in the model (1) cause some asymmetry. Analogously to [36] we may modify the model (1) considering its symmetrization where ⊕ is defined as follows j ⊕ ±1 = j ± 1 for j = 2, . . . , n − 1 , 1 ⊕ (−1) = n , 1 ⊕ 1 = 2 , n ⊕ (−1) = n − 1 , n ⊕ 1 = 1 .   It is easy to see that for Eq. (27) exactly the analogous theorem as Theorem 3.1 holds true. Moreover the corresponding moments x j :=f j satisfẏ and analogous results as in Corollary 1 and Remark 1 follow. Changing the variables in the system (28) for constant b T , we obtain the following systeṁ    Therefore the corresponding linearized system, around the equilibrium state ( b T 2α , . . . , b T 2α ) ∈ R n , is defined by a circulant matrix (a special kind of Toeplitz matrices [15])  where c 0 = −α, c 1 = c n−1 = α−b T 2 and all other elements are 0.

5.
Conclusions and research perspectives. This paper aims to put the basis for a mathematical description of the collective behaviors arising in DNA thermal denaturation. Simulations performed for the presented model show that significant analogies with the experimental S-shaped curves of broken bonds vs. temperature [29] can be achieved even in the case in which analytical results are not available. In our approach however we consider the time evolution under possibly high temperature. These two ways (i.e. the change in temperature and the time evolution) may coincide if we assume the time evolution of temperature -the temperature as an opportune function of time -like in the experimental data given in [20]. Following this approach, the most appropriate values for the model parameters can be obtained by a direct comparison of the theoretical melting curves with the experimental ones whenever a particular DNA sample is chosen.