Fluid Phase Equilibria Coarse grained force ﬁeld for the molecular simulation of natural gases and condensates

Theatomistically-detailedmolecularmodellingofpetroleumﬂuidsischallenging,amongstotheraspects, due to the very diverse multicomponent and asymmetric nature of the mixtures in question. Complicat-ing matters further, the time scales for many important processes can be much larger than the current and foreseeable capacity of modern computers running fully-atomistic models. To overcome these limitations, a coarse grained (CG) model is proposed where some of the less-important degrees of freedom are safely integrated out, leaving as key parameters the average energy levels, the molecular confor-mations and the range of the Mie intermolecular potentials employed as the basis of the model. The parametrization is performed by using an analytical equation of state of the statistical associating ﬂuid theory (SAFT) family to link the potential parameters to macroscopically observed thermophysical prop- erties. The parameters found through this top-down approach are used directly in molecular dynamics simulations of multi-component multi-phase systems. The procedure is exempliﬁed by calculating the phaseenvelopeofthemethane–decanebinaryandoftwosyntheticlightcondensatemixtures.Amethod-ology based on the discrete expansion of a mixture is used to determine the bubble points of these latter mixtures, with an excellent agreement to experimental data. The model presented is entirely predictive and an abridged table of parameters for some ﬂuids of interest is provided.


Introduction
A typical crude oil consists of several thousands of distinct chemical species, all of them roughly similar in chemical nature, but with an important spread in terms of their molecular size, morphology, and thermophysical behaviour. Furthermore, being a natural product, the particular properties of the mixture vary widely from reservoir to reservoir and can even change with time and during the extraction and processing stages [1]. While the lighter ends can be characterized individually, e.g. by gas chromatography or mass spectrometry, as the molecular weight increases the number of closely-related structures and their complexity increase combinatorially as their number or mass fraction decreases. As the heavier end of the spectrum is approached, only very general descriptions of these fractions can be obtained, usually expressed in terms of some general characteristics as the aromatic character, the percentage of heteroatoms, etc.
It is ludicrous to postulate that one could model these highly complex systems by explicitly taking into account each and every distinct molecule present in the system, even if such information could ever be obtained. As a consequence, in the pursuit of the theoretical modelling of these mixtures, historically two schemes have become the mainstream tools of petroleum engineering; either the description as a continuum distribution [2] or the description as a discrete but finite set of pseudo-components [3]. Pseudo-components are artificial assignments of a cut or fraction of the mixture to values of critical properties, densities and acentric factors which on average represent the bulk behaviour, obtained from measured oil bulk properties, light ends analysis, distillation, or other characterization methods. The concept brings back simplicity into the description of a mixture and the number of pseudo-components usually employed to describe a crude is in the order of dozens. There are a number of empirical ways to perform this mapping, and no consensus of an optimal procedure exits [4]. Related approaches map the behaviour of a crude to a mixture of real components [5] or characterize pseudo-components based on 13 C NMR and other analytical data subsequently applying group contribution methods [6,7]  brought to you by CORE View metadata, citation and similar papers at core.ac.uk provided by Elsevier -Publisher Connector mapping of the mixture to a finite set of constituents allows the use of analytical EoS to be used as fitting tools, as they are mostly built with a discrete mixture in mind. The number and diversity of the EoS available for this purpose is staggering and their review is removed from the scope of this manuscript. The reader is referred to recent monographs [8,9] for further details.
A more modern approach to study thermophysical properties of fluid mixtures is by means of classical molecular simulations. The recent perspectives by Maginn [10,11] and Palmer and Debenedetti [12] give the reader some insight on the currently accepted views. However, it is important to not to raise false expectations on the capabilities of computer simulations and particularly to understand the present and future limitations. Fully atomistic modelling, where the individual molecules are described in terms of their constituent atoms and the bonds between them, cannot currently be used to explore more than several nanoseconds of time (in the case of Molecular Dynamics) and a few thousands of individual molecules. Even with modern advances in parallel processing, the use of graphical processing units and the reduction in the costs of hardware, these limits are bound to remain essentially unchanged (see for example the comments made during a recent Faraday Discussion [13] on the topic). This is not to say that both massively large systems [14,15] and/or long time frames [16] have not been explored, but they are far from the norm; furthermore one extreme usually precludes the other. Unfortunately, this scenario is incompatible with the apparent need for modelling crudes with hundreds or thousands of different species, each in a discrete composition and for reasonably long times (e.g. to study asphaltene aggregation; freezing of waxes; solubility of gases, etc.). An immediate corollary of the above comments is that in the present and immediate future a detailed atomistic description of a crude oil is essentially unfeasible. It is thus natural to consider that the atomistic modelling will follow the EoS modelling approach, i.e. it is compulsory to describe a crude oil as a mixture of a relatively small number of prototypical species or surrogate real molecules [17]. Some key questions still remain to be answered as to the number and nature of discrete elements necessary for a trustworthy representation, the level of fidelity required from the models and the strategies employed to represent the more "unknown" fractions present in heavier crudes.
Notwithstanding some of the above warning signs, some "heroic" efforts have been made to atomistically describe complex oil mixtures by simulations. Notable is the seminal work of Lagache et al. [18] captured in extenso in a book [19], that a decade ago described the modelling of naturally occurring high-pressure high-temperature hydrocarbon gas mixtures using 18 discrete representative molecules. Other examples are the work of Maldonado et al. [20] who have presented a molecular dynamics (MD) simulation of 25 discrete n-alkanes from C6 to C30 using atomistic models. They considered a very low density system and the adsorption behaviour of this mixture onto graphite surfaces. Other authors have considered discrete mixtures of a handful of small molecular weight hydrocarbons to mimic a crude, analysing, for example the accumulation of aromatics at the oil-water interface [21] the interfacial properties of gases and brine [22] and the diffusion of gases [23]. Recently, Li and Greenfield [24] employed a system composed of a dozen representative molecules to describe asphalt systems. The use of proxy molecules to represent a cut or family of homologous molecules is a natural progression in the simplification of the problem. Mixtures of a small number of discrete model molecules, each representing a family of molecules (e.g. resins, asphaltenes) have been employed to discuss bitumens [25,26] while binary mixtures of heptane (or toluene) plus a model molecule are routinely employed to study the effects of asphaltene aggregation [27][28][29][30] in spite the fact that pure heptane (or toluene) are clearly deficient models of a complex crude. Wax deposition is also commonly studied using representative few-component alkane mixtures [31][32][33][34][35][36].
All of the above approaches encounter the technical problem associated with the fact that molecular simulations are based on the a priori specification of pairwise potentials amongst the N atoms that constitute the mixture and that the time required to solve the problem scales in principle as N 2 . The recently observed increased proliferation of atomistically-based studies is a reflection of both reduction in the cost of high performance computer hardware and the increased confidence in the quality of the predictive power of classical atomistic force-fields. However, the crux of the matter is that even the speedup provided by advanced algorithms which decrease the scaling of the problem and/or the steady historical increase in computational power implicit in Moore's law [37] are not enough to provide the baseline for fully atomistic modelling of crudes. A way forward is to recognize that the level of detail incorporated into the existing atomistic models is far too great for the needs of this problem, more so if one recognizes the large uncertainties surrounding the detailed characterization of the actual crudes. The use of simplified versions of the potentials, generically called coarse grained (CG) models becomes the immediately obvious route.
Coarse graining is a term that refers to the use of simplified molecular models, where the atomistic detail is removed and substituted by the description of molecules in terms of "super-atoms" which represent, typically, a small number of heavy atoms. For example, in a standard CG representation, a propane molecule could be modelled as an isotropic spherical bead where all the electronic details, the intramolecular vibrations, bond bendings and molecular topology are incorporated within a point pair-wise interaction model. Coarse graining techniques have been extensively used in computational biology [38,39] where the self-assembly of large molecules is the main point of interest, and have become a mainstream technique for the study of complex fluids, materials and soft matter. One of the key issues in developing CG force fields is the methodology used to parameterize the intermolecular potential. Although not uniquely, most CG approaches start with an atomistically detailed model and integrate out the degrees of freedom not deemed to be relevant [40]. This procedure, by its own nature, removes information and the resulting force field is inherently deficient, especially in terms of transferability. In the case of interest here, a bottom up coarse graining makes no sense, as the initial components are not well defined to start with. More aggressive CG of this type inevitably ends up losing the link to the parent models, with the corresponding loss in robustness. Dissipative Particle Dynamics (DPD), for example has been employed to model crude oil systems [41][42][43][44], borrowing the idea that the properties of soft repulsive beads may mimic "lumps" of fluid. DPD is appropriate for qualitative studies, but is challenging to use as a predictive tool [45].
A fundamentally different "top-down" approach is used herein, where the CG potential parameters are optimized to reproduce the macroscopically observed thermophysical properties (instead of integrating high fidelity atomistic models). This change in paradigm is achieved by employing an equation of state (an analytical representation of the free energy) as the link between the molecular-level interaction potential and the macroscopic experimental data that relates to it. We seek to perform the search for effective potential parameters in an average sense capturing the thermophysical properties of a molecule, e.g. its density over a wide temperature range, its vapour pressure, etc. with a single set of parameters. The idea of using an EoS to obtain parameters to be used in molecular simulations is not new; for example Cuadros et al. [46] used a cubic equation of state to fit Lennard-Jones (LJ) spherical parameters to a series of fluids. The fact that the LJ model does not have the required flexibility to model a wide range of fluids, its inability to model non-spherical geometries and in some cases, the weak link between the equation of state and the intermolecular potential have hampered the popularity of these approaches. These limitations are removed if one employs a molecular-based equation of state; e.g. Müller and Gubbins [47] used a decorated LJ sphere with association sites to obtain an intermolecular potential for water by linking it to an appropriate EoS while Vrabec et al. [48] used an accurate equation of state to successfully parameterize a two center LJ bead model with central dipoles and quadrupoles and used the approach to develop force fields for a large range of small molecules with an accuracy that rivals experimental measurements. This is the essence of our approach [49]: to employ a molecular-based EoS to parameterize a force field that can be employed directly in molecular simulations, details provided in the next section.

SAFT-␥ force field for coarse graining oils and gases
The Statistical Associating Fluid Theory (SAFT) is a welldeveloped perturbation theory used to describe quantitatively the volumetric properties of fluids. The reader is referred to several reviews on the topic which describe the various stages of its development and the multiple versions available [50][51][52][53]. The fundamental difference between the versions is in the underlying intermolecular potential employed to describe the unbounded constituent particles. Hard spheres, square well fluids, LJ fluids, argon, alkanes have all been employed as reference fluids in the different incarnations of SAFT. For the purpose of this work we will center on a particular version of the SAFT EoS, i.e. the SAFT-VR Mie recently proposed by Laffitte et al. [54] and expanded into a group contribution approach, SAFT-␥, by Papaioannou et al. [55]. This particular version of SAFT provides a closed form EoS that describes the macroscopical properties of the Mie potential [56], also known as the (m,n) potential; a generalized form of the LJ potential (albeit predating it by decades). The Mie potential has the form where C is an analytical function of the repulsive and attractive exponents, a and r , respectively, is a parameter that defines the length scale and is loosely related to the average diameter of a Mie bead; ε defines the energy scale and corresponds to the minimum potential energy between two isolated beads; expressed here as a ratio to the Boltzmann constant, k B . The Mie function, as written above, deceivingly suggests that four parameters are needed to characterize the behaviour of an isotropic molecule, however the exponents a and r are intimately related, and for fluid phase equilibria, one needs not consider them as independent parameters [57]. Accordingly, we choose herein to fix the attractive exponent to a = 6 which would be expected to be representative of the dispersion scaling of most simple fluids and refer from here on to the repulsive parameter as = r . The potential simplifies to In the CG application of the SAFT models one considers spherical elements that correspond to a chemical moiety comprised of several heavy atoms; i.e. "super-atom" beads. Furthermore, the SAFT theory lends itself naturally to consider chain molecules made of tangentially-bonded beads. This adds to the model an additional parameter, m, which quantifies the number of elements in a chain molecule. SAFT also has a built-in provision for embedding associating sites unto the models, which has not yet been employed in CG models, although there is no fundamental limitation for this. In summary, the CG model for an arbitrary molecule is sketched in Fig. 1 and corresponds to a chain of m tangent spherical segments, each of them characterized by a triad of parameters, (ε, , ). Cartoon of a SAFT CG molecule composed of m beads bonded at a characteristic distance . ε is the energy scale corresponding to the minimum in the intermolecular potential, while the range of the potential is determined by the repulsive exponent . Values for common substances are given in Table 1.

Fitting of parameters
The key requirement for an EoS model to be used in a topdown CG approach is its accuracy in representing the underlying Hamiltonian, e.g. the question is: how well do the simulations of the potential agree with the description made by the EoS? Fig. 2 shows an example of such a fit for the SAFT-VR-Mie EoS, where the properties of the (34.29, 6) potential (a model of propane) obtained both by simulations and theory are compared. The same set of parameters are used in both the theory and the simulations. The agreement of the two routes is excellent. The correspondence between theory and simulations makes it possible to invert the procedure, i.e. to use the EoS to fit the parameters (ε, , ) to match experimental data and then to use the same parameters obtained with the theory in a simulation. The agreement shown in Fig. 2 is not fortuitous, it is seen for a wide range of fluids, including, but not limited, to small polar molecules, refrigerants, chain-like fluids, etc. [49].
Having established that the EoS is capable of representing the underlying potential accurately, there are several plausible alternatives for obtaining the parameter sets for pure components. The obvious one is to perform a least square minimization between target experimental data sets and those predicted by the EoS. Using as a target both the saturated liquid densities and vapour pressures along the extent of the fluid region is a classical approach which leads to the most consistently robust parameters. Arguably it does require coding of the EoS and an appropriate optimization routine. While tedious, the process is aided by the fact that commercial software packages [58] are starting to include the SAFT-␥ models alongside optimization tools. To circumvent and streamline the fitting procedure, Mejia et al. [59] expressed the SAFT-VR-Mie EoS in terms of reduced units and found there was a direct association between the value of the repulsive exponent in the Mie potential and slope of the vapour pressure curve in a pressure-temperature diagram. This observation suggested that an empirical correlation could be made between the Pitzer acentric factor, ω, which for a spherical molecule is related to said slope, and the repulsive exponent, . A similar, and possibly more obvious link can be made between the value of the critical temperature and the energy scale of the model, ε, and between the size parameter , and a characteristic liquid density. The resulting correlation, known light-heartedly as the M&M correlation, allows the determination of parameters for the SAFT CG force fields solely from the knowledge (or estimation of) critical properties. Table 1 shows a very abridged collection of parameters of interest in the oil and gas industry obtained using this methodology. As an example, the experimental densities and pressures for propane (m = 1) are plotted in Fig. 2  has the required degrees of freedom to appropriately reproduce the data, but even more importantly, that the molecular simulations that are performed with these parameters will reproduce the theoretical results and, by extension, the experimental data. This three-pronged agreement is not always possible; e.g. most EoS will not faithfully reproduce the properties of the underlying potential due to inherent approximations made throughout the theoretical derivation. Similarly, not all potential functions have the flexibility to reproduce the properties of real fluids as a consequence of the reduced degrees of freedom within their functional form, e.g. the LJ model shown in Fig. 2 will be incapable of fitting simultaneously the densities and vapour pressures of propane regardless of the choice of parameters (ε, ) employed. Inherent in the use of a multi-parameter force field such as the Mie potential is the fact that there is the need to simultaneously fit several parameters which can, in principle have some degree of degeneracy. If one is not careful to include a wide range of experimental data, multiple solutions can be found to reproduce the  [66]. b A more accurate model for benzene corresponds to a trimer in a triangle configuration is given in Ref. [64]. This latter model gives not only satisfactory thermophysical properties but also improves on the structural properties thanks to its correct shape and geometrical aspect ratio.
c An alternative single-bead model for CO2 with non-conventional attractive exponent is given in Ref. [67]. d An alternative model for water consists of a single coarse grained bead representing two water molecules with parameters m = 1, = 8; ε/kB = 400 K; = 0.37467 nm. More faithful models have temperature dependent parameters, see Ref. [68].
same data with the same quality of fit. As an example, Gordon [60] showed how the temperature-density diagram of methane could be predicted with accuracy with a wide range of potential parameters. It is by taking a look at other properties (in the case presented by Gordon, at viscosity) that one could discern between the transferability of the potentials found. In our case, we have taken to use simultaneously liquid phase density, vapour pressure and critical temperature to bracket the parameter region. In spite of the above, it is clear, from an analysis of the behaviour of the parameters, that multiple parameters can be found all with a similar performance. While this is an indicative of the robustness of the model, it also implies the need for some care to be taken when selecting the particular parameter values. Take for example the case of butane. If one were to fit the SAFT-VR-Mie equation of state to the experimental vapour-liquid phase equilibrium properties, one obtains [54] (m = 1.8514, = 13.65, ε/k B = 273.64 K, = 0.40887 nm). The non-integer value of m precludes the use of these parameters in CG simulations (i.e. what is a fraction of a bead?). This leaves us the choice to arbitrarily choose to model butane as either a single sphere (m = 1) or as a dimer (m = 2). The resulting parameters, obtained through the M&M correlation (or through direct fitting to the EoS) are (m = 1, = 40.81, ε/k B = 510.63 K, = 0.5303 nm) and (m = 2, = 13.29, ε/k B = 256.36 K, = 0.3961 nm), respectively. Ramrattan et al. [57] have noted that the value of the repulsive exponent has a direct relation to the fluid range, i.e. the ratio between the critical and triple point of a fluid; and that this metric is a valuable tool to bracket the possible parameter space. For the attractive exponent used here, "hard" repulsive exponents, e.g. values larger than = 12 reduce the fluid range and after a value of = 43 the fluid phase is no longer stable being suppressed by the presence of the solid phase [57]. The upshot of this is that hard potentials might exhibit premature freezing as compared to the experimental results. In the example above, Ramrattan [61] predicts a triple point of 331 K for the single sphere model of butane and 139.8 K for the dimer model, the latter comparing much better to the experimental value of 134.6 K [62].
For the case of mixtures, a new set of unknown parameters come into play, namely the cross-parameters corresponding to the binary interactions. The best course of action is to obtain these parameters by fitting them to reproduce the properties of selected mixtures, however this is seldom possible. Lafitte et al. [54] suggested the following combination rules that can be used as first approximation to describe the interaction between two different Mie fluids, labelled with subscripts ii and jj.
The SAFT coarse grained models do not provide information on the intramolecular interactions, as these are all averaged out during the fitting procedure. However, one can recognize that both overall shape, intersegment connectivity and rigidity are crucial to preserve the quality of the structure prediction [64]. A limitation of the theory is that the CG segments be rigidly bonded at a distance corresponding to that used to evaluate the reference radial distribution function. In this work, this distance is taken to be the characteristic size, , i.e. the CG spheres are bonded at a distance of . With respect to the bending of longer chains, the underlying theory only specifies that on average, the molecules should remain extended [65]. This is a natural configuration for alkanes and similar molecules present in crude oils. Within these models, this elongation is biased by adding a bond angle bending potential [66], angle , between three consecutive beads, angle = k angle (Â − Â 0 ) 2 , where Â is the angle subtended by three consecutively bonded spheres. The particular values of the angle, Â 0 = 157.6 • , and the constant that restricts the distribution, k angle = 3.38 J mol −1 deg −2 (2.65 kcal mol −1 rad −2 ), are obtained by averaging over all-atom models of short-length alkanes.

Methane-decane binary
As an example of the predictive capability of the methodology, we use here the binary mixture of methane and n-decane at 363.15 K. This is a particularly asymmetric mixture at a temperature which is supercritical for the light component. Methane is modelled as a single spherical molecule and decane as a chain of three beads, c.f. Table 1. Cross interactions are not fitted to the mixture properties in order to explore the robustness of the force field parameters, although it is clear that a binary fit could, in principle produce a better match at the expense of predictability.
Molecular simulations ran for this mixture in the standard canonical (NVT) ensemble, where the number of molecules, N, the temperature, T, and the system volume V are kept constant. All simulations were run using GROMACS [69] software suite and correspond to classical molecular dynamics (MD) simulations. Visualizations are rendered using VMD [70]. Reported properties are averaged over at least 2 × 10 6 steps ( t = 0.01 ps) after the equilibration of the simulated system as determined by monitoring its total energy, and output pressure. The Nose-Hoover thermostat was chosen for the NVT simulations. Periodic boundary conditions and a potential cut-off of 2.0 nm is applied in all simulations. For a binary system, calculating the phase behavior of the mixture is a reasonably simple affair, as it amounts to preforming a phase split (isothermal flash) and evaluating the resulting pressure. In MD, this is frequently done by quenching isochorically to the desired temperature an otherwise well mixed mixture [71]. We employ a simulation cell composed of 3750 decane molecules and 15,000 methane molecules corresponding to an overall mol fraction of methane of x methane = 0.8. After equilibration, the system will present a liquid slab surrounded by a vapour phase. Analysis of the density distributions allows the calculation of the molar phase compositions. The pressure is obtained by inspecting the component of the pressure tensor which is normal to the interface (z direction). expect for a purely predictive model. Since the simulations describe the two-phase region, one can extract from them information regarding structural transport and/or interfacial properties. Particularly, the interfacial tension of the mixture is calculated both through the mechanical route and thermodynamic route [73], and compares well with the available experimental data, as shown in Fig. 4. This information would not be directly available from an EoS and hints at the extent of the transferability and representability of the CG models used. In a light crude oil, the modelling of this binary pair is often the most sensitive one, as it comprises typically the most abundant compound (methane) with one of the most dissimilar ones (decane) in terms of phase behaviour. The accurate and predictive capacity of the simulations based on the SAFT CG force field suggests its potential for the description of multicomponent mixtures as considered below.

Light condensate synthetic mixture
Yarborough [75] documented a selection of synthetic mixtures of light condensates with compositions and phase equilibria in a range of conditions of interest to the reservoir engineering community. In particular and with no prejudice we study mixtures labeled M4 and M8 which are composed of light alkanes up to n-decane with and without toluene. The overall compositions of the mixtures are given in Table 2.
A system was set up with 10,000 molecules, corresponding to the compositions given in Table 2. An NVT calculation from an initial well-mixed system, quenched to 366.48 K (200 • F) into a rather expanded system (11 × 11 × 100 nm 3 ) provides a two-phase liquid vapour split, from which the surface tension is calculated as before. Similarly, the vapour pressure is determined from the analysis of the z-component of the pressure tensor. Other simulation details mirror the conditions used for the methane-decane binary. Fig. 5 presents a snapshot of an equilibrium condition and an average density profile for each component along the z-axis. A further analysis over these profiles was used to calculate the average molar fractions in the gas and liquid phases, y i , x i , and their Table 2 Overall experimental molar compositions, z i , of the mixtures studied [75] and corresponding number of molecules included in the simulation n i . ratio, the distribution factors K i = y i /x i . The agreement (see Table 3) between the experimental data and the simulations is very good, considering there are no adjustable parameters. The equilibrium pressure is calculated as 29.97 ± 0.06 bar and compares well with the experimental [75] value of 31.85 bar. The interfacial tension of the mixture at this point is calculated as 13.12 ± 0.46 mN/m. Fig. 5 shows that the light components (C1 to C5) exhibits excess adsorption at the liquid-vapour interface, i.e. they accumulate at the interface, with the most noticeable adsorption by methane. The interfacial thickness is seen to be considerable (∼5 nm) suggesting that rather large system sizes are needed to include interfacial effects. The rather elongated length in the z direction, corresponding to 0.1 m, strengthens the idea that for explicit simulations of multicomponent multiphase systems, a speed up in the calculations is needed, in this case resulting from the reduced number of interactions required from the CG model.

Bubble point determination
The determination of the bubble and dew points of a mixture is a staple of petroleum engineering thermodynamics. The bubble (dew) point of a mixture is defined as the condition of pressure, P, and temperature, T, where a liquid (vapour) mixture is in equilibrium with an incipient second phase, i.e. a coexisting vapour (liquid). In practice, the overall composition of a single phase mixture is specified and either the temperature and pressure boundary at which the second fluid phase becomes stable is the required output. The results are usually expressed in terms of a P-T diagram where the curves describe said phase boundaries and may include other curves describing other existing phase boundaries, such as fluid-fluid or solid-fluid and/or similar curves at other compositions. The practical importance of determining the phase envelope of a gas or crude oil cannot be underestimated, as it is crucial knowledge in many aspects of reservoir production and transport.
Experimentally, the bubble (or dew) point is obtained by a slow depressurization (expansion) of a mixture in a pure state, usually employing a mercury displacement pump. At different points during the process, the volume of the mixture is monitored. A change in slope of the pressure-volume diagram is indicative of the appearance of a second fluid phase, as the compressibilities of the gas and the liquid are often significantly different. The phase change point is not normally found experimentally, but rather found by the intersection of two lines fitted to the pure phase and mixed phase compressibilities. A comprehensive review of the method and its relation with other phase equilibria methods is given in standard textbooks [76][77][78] and detailed in recent review [79]. In the oil and gas industry, this procedure is a rather standard part of the PVT characterization of a crude, however it is expensive and time consuming.
For mixtures described by an equation of state, this calculation amounts to simultaneously solving the condition of thermal, mechanical and diffusive equilibria (equality of chemical potential) amongst two fluid phases for each component of the mixture. The analytical nature of this calculation lends itself to a reasonably rapid solution by numerical methods. In its most common form, the composition and temperature are fixed and the pressures at either the bubble or the dew point are recursively calculated. The reader is referred to the excellent textbooks that describe the common algorithms employed [80][81][82]. The quality of the result is obviously limited by the accuracy of the EoS to faithfully represent fluid mixtures. Furthermore, the fact that some of the more interesting features of the phase diagram are close to the critical points of the mixture, make these calculations particularly challenging for all but the most optimized and force-fitted of models.
From the point of view of performing a canonical (NVT) simulation, the determination of the bubble (or dew) point is far from trivial. Quenching a one-phase mixture to a temperature at which phase separation occurs (flashing) produces two phases with usually very distinct compositions. More importantly, the liquid (or vapour) phase composition is an output of the simulation and cannot be fixed a priori. Some algorithms have been published for the purpose of obtaining the bubble point calculations from simulations using pseudo ensembles [83,84] although they are tailored for Monte Carlo and Gibbs Ensemble-based simulations. For the bubble point determination of a mixture, we consider a one phase state point similar to the one described above, but at a much higher pressure (a much lower total volume) than that expected for the bubble point. The precise location of this point is irrelevant to the outcome of the calculation. In this state we record both the pressure and the density. We use this state point as the initial condition for an NP zz T simulation, where the pressure, P, coupling is isotropic in the x and y direction, but different in the z direction. This latter ensemble is useful to achieve different pressure levels (with the longest length of the simulation cell properly oriented in the z-axis) all at constant temperature and overall composition. The Berendsen thermostat and barostat were selected as the coupling algorithms for the NP zz T simulations. After equilibration, the system density is recorded and further decompression is applied (the pressure is set at a lower value), mimicking the experimental procedure. Eventually the system will cross the bubble point and a two phase system will evolve. A plot of the pressure as a function of the mass density, , for all the equilibrated states shows a kink in the slope, corresponding to a change in the compressibility, (∂P/∂ ) T , associated with a change in the nature of the phases that compose the system. As expected, the pure liquid phases have a higher compressibility and a steeper slope. The bubble point, instead of being simulated is obtained by the intercept of the slopes in the pressure-density diagram (Fig. 6). The results for mixture M4 are plotted alongside the experimental results [75], the pressure and the bubble point predictions are excellent; 215.1 bar which implies a 0.92% error above the experimental value. The densities seem slightly overpredicted by about 2%; the density at the bubble point is found to be 401.5 kg m −3 , again, slightly above the experimental value.
Following the above-mentioned methodology, Fig. 7 shows the results of the bubble point determination for mixture M8. Here, only one experimental point (in red) is reported and is compared with several simulations points and a standard fit using an optimized cubic equation of state; the Peng-Robinson EoS [85] with the Peneloux volume translation [86]. The EoS calculations are based on the use of binary interactions parameters which allow the theory to match, in as much as possible, the available experimental point. In addition, we plot the prediction from the multi-parameter GERG-2008 reference EoS [87]; an engineering EoS tailored specifically for these type of systems. Although no clear conclusion can be obtained, as the EoS results are conflicting and there is only one data point to compare to, the trends of the simulations seem in reasonable agreement with the expected results. The simulations allow for the prediction of the full phase envelope including the retrograde condensation and the low pressure dew point with no mixture adjustable parameter whatsoever; i.e. it is a full prediction of the experimental curve. Fig. 8 shows a typical configuration at conditions of an impending appearance of the bubble point. An interesting observation is even at such conditions is not visually evident that a phase separation will occur. The emerging vapour phase is predominantly methane (white-gray beads in Fig. 8) and there is no qualitative indication of a nucleating phase or clustering. Configurations very close, but below the bubble point show roughly the same characteristics. The bubble point is seen to be a macroscopic property which even at these very large system sizes would be hard to detect directly.

Conclusions and outlook
No level of foreseeable technological prowess will suffice to allow the commonplace atomistic modelling of complex crudes. On the other hand, an appropriate coarse graining model can allow for the quantitative calculation of fluid phase properties of fluid mixtures of interest to the oil and gas sectors.
The correspondence between the theory and simulations makes it possible to use the SAFT-␥ EoS to fit the parameters (ε, , ) to match experimental data and use these same calculated parameters in a molecular simulation. This apparently cyclic argument becomes useful when the simulations are employed to gain information otherwise inaccessible from EoS. The robustness of the force fields allows the predictions of adsorption [88], transport and interfacial properties [89] which are not part of the original fit.
The level of CG described here is different from that understood in the oil and gas industry when approximations are made to reduce the degrees of freedom by considering solvent-free models [90] and effective averaging of potentials [91]. The Mejia et al. M&M correlations [59] have a real potential for developing models in this field, as they are particularly well suited for calculating intermolecular potentials of effective pseudo-components and similarly undefined fractions, which can then be modelled in classical MD programs without loss of fidelity. All that is needed as an input is the characterization by means of an acentric factor, a critical temperature and a density. This is a key aspect of the methodology which can be exploited to model poorly defined crude mixtures.
The SAFT CG simulations are able to predict the phase behaviour of light crude oil mixtures and allow the simulation of reasonably large systems (we have explored elsewhere systems with up to 300,000 particles, corresponding to millions of atoms). This is enough to observe complex dynamics, including, but not limited to cluster formation and phase segregation. While we used here a seven-component mixture, there are no real limitations to expand this number. Similarly, we have spanned several million time steps. A point to note is that for these coarse grained models, the time scale changes in an unclear way. In an all-atom simulation, these timesteps have a direct relationship with a well-defined time scale, as they link atomic masses and the distance parameters with time. In a coarse grained simulation, both the masses and the energy and distance parameters are changed, and each step represents a different "time". More crucially, however, by eliminating the details and "roughness" of the molecules, their diffusion and mobility is significantly enhanced. The molecules explore a larger part of phase space, reaching equilibrium states and overcoming energy barriers much before they would in an atomistic model. Unfortunately, there is no clear recipe for this scale up [38]. In fact, it has been suggested [92] that the distribution of, for example, characteristic timescales, should correspond to appropriately weighted average of distributions from the different dynamics under consideration. If one compares the self-diffusion of small molecules, e.g. alkanes, in a liquid state from both an atomistic and a CG model, one sees [66] a speedup in the latter of at least an order of magnitude. Using this rough guide, the simulations presented correspond to effective times of up to 10 × 20 ns = 0.2 s, which are enough to observe phase separation and clustering of even the most complex systems.
The model presented here corresponds to homonuclear molecular models. An improvement can clearly be made if one employs the theory to its fullest, and considers heteronuclear models, i.e. chains made of different type beads. An expanded version of the theory [93] allows for this to be done, which is most useful when considering complex polyphilic molecules such as surfactants [94,95], transferrable models for paraffins and waxes [66], and larger hetronuclear molecules and will be valuable when extending the model to the heavier fractions, including resins and asphaltenes [96]. Work is under progress in this area.