Eﬃcient chemical equilibrium calculations for geochemical speciation and reactive transport modelling

Chemical equilibrium calculations are essential for many environmental problems. It is also a fundamental tool for chemical kinetics and reactive transport modelling, since these applications may require hundreds to billions equilibrium calculations in a single simulation. Therefore, an equilibrium method for such critical applications must be very eﬃcient, robust and accurate. In this work we demonstrate the potential eﬀectiveness of a novel Gibbs energy minimisation algorithm for reactive transport simulations. The algorithm includes strategies to converge from poor initial guesses; capabilities to specify nonlinear equilibrium constraints such as pH of an aqueous solution and activity or fugacity of a species; a rigorous phase stability test to determine the unstable phases; and a strategy to boost the convergence speed of the calculations to quadratic rates, requiring only few iterations to converge. We use this equilibrium method to solve geochemical problems relevant to carbon storage in saline aquifers, where aqueous, gaseous and minerals phases are present. The problems are formulated to mimic the ones found in kinetics and transport simulations, where a sequence of equilibrium calculations are performed, each one using the previous solution as the initial guess. The eﬃciency and convergence rates of the calculations are presented, which require an average of 1–2 iterations. These results indicate that critical applications such as chemical kinetics and reactive transport modelling can potentially beneﬁt by using this multiphase equilibrium algorithm.


INTRODUCTION
In a chemical equilibrium state, the forward and reverse rates of the reactions in a system are equal, and therefore no changes in the concentrations of its species are observed with time. It is possible to demonstrate, with the use of the first and second laws of thermodynamics, that a chemical system undergoing an isobaric and isothermal process progress towards a state of minimum Gibbs free energy.
Other conditions also apply during this process, such as mass conservation of the chemical elements.
Therefore, an equilibrium problem consists of finding the number of moles of the chemical species that simultaneously minimises the Gibbs free energy of the system and satisfies a system of equilibrium constraints (Smith and Missen, 1982). In addition, a non-negativity constraint for the number of moles is required in order to guarantee a physically feasible molar composition.
The applicability of chemical equilibrium solvers for environmental problems is wide. For instance, speciation modelling of aquatic systems, calculation of solubilities of gases and minerals, analysis of the effect of pH on the dissolution of a mineral, investigation of water-gas-rock effects during carbon storage in geological formations, and Mathematical symbols r x the gradient operator with respect to the primal variables x only T the transpose operator of a matrix or vector problems that require equilibrium calculations. In addition, it is a fundamental tool for chemical kinetics and subsurface reactive transport modelling. In these applications, some reactions in the system are controlled by thermodynamics instead of kinetics, and consequently equilibrium calculations are necessary (Lichtner, 1985;Steefel and Cappellen, 1990;Steefel and Lasaga, 1994;Steefel and Mac-Quarrie, 1996;Steefel et al., 2005).
1.1. Existing geochemical equilibrium algorithms Smith and Missen (1982) classify chemical equilibrium algorithms in two types: stoichiometricand non-stoichiometric. The former solves a system of mass-balance and mass-action equations, while the latter minimises the Gibbs free energy of the system. In Zeleznik and Gordon (1968) and Van Zeggeren and Storey (1970), however, it is shown that both approaches are conceptually equivalent.
In general, these solvers rely only on equilibrium constants of reactions, which are insufficient to calculate the Gibbs free energy of the geochemical system. As a result, determining the stable equilibrium phase assemblage is a difficult and expensive task. Many heuristic techniques have been developed to resolve this issue for pure mineral phases. For example, Bethke (2007) presents a technique to add the most supersaturated mineral to the calculation and remove the most undersaturated. Although popular among many geochemical solvers, this heuristic is inadequate and unsuitable for aqueous and gaseous phases. Consequently, many of these packages require that the aqueous and gaseous phase always exist at equilibrium, by fixing the amount of liquid water and the fugacity of a gaseous species.
In Leal et al. (2013) we pointed out some issues with the numerical methods frequently used in stoichiometric geochemical solvers. For example, MINTEQA2 (Allison and Kevin, 1991), EQ3/6(Wolery et al., 1992), CHESS (der Lee et al., 2002), PHREEQC (Parkhurst andAppelo, 1999, 2013), and The Geochemist's Workbench (Bethke, 2007) adopt an incomplete Newton scheme developed by Morel and Morgan (1972) and further improved by Reed (1982) for solving aqueous speciation. The approach consists of arranging the species in a set of primary species and another of secondary species. The composition of the primary species is calculated by applying Newton's method to the modified mass-balance equations. The composition of the secondary species, on the other hand, is calculated via a successive substitution approach using the mass-action equations.
This incomplete Newton scheme aims to reduce the dimension of the Jacobian matrix so that less computational effort is spent in solving linear systems. However, combining Newton's method with a successive substitution approach has the potential of preventing the calculation to converge at quadratic rates near the solution. Moreover, in some chemical equilibrium calculations, the cost of evaluating expensive equations of state and their derivatives could exceed the cost of solving linear systems. See Appendix A for a more detailed description of this stoichiometric formulation, which also presents an in-depth discussion on its convergence rates.
Geochemical packages based on Gibbs energy minimisation algorithms include ChemSage (Eriksson and Hack, 1990), THERIAC (de Capitani and Brown, 1987), HCh (Shvarov, 1999(Shvarov, , 2008, FactSage (Bale et al., 2002(Bale et al., , 2009, and GEM-Selektor (Karpov et al., 1997(Karpov et al., , 2001Karpov, 2002;Kulik et al., 2004Kulik et al., , 2013. In ChemSage (Eriksson and Hack, 1990), a Gibbs energy minimisation algorithm is implemented in which the non-ideal values of the chemical potentials are used, but their derivatives correspond to ideal models. Although convergence might not be an issue with such practice (e.g., quasi-Newton methods converge with artificial derivatives that are considerably different from their real values, Nocedal and Wright, 1999;Fletcher, 2000), the approach cannot converge at a quadratic rate near the solution. In Eriksson and Hack (1990) it is pointed out that a strategy was developed to refine the initial guess to improve the efficiency of the algorithm. However, this improvement does not change the convergence rate of the method.
In GEM-Selektor (Karpov et al., 1997(Karpov et al., , 2001Karpov, 2002;Kulik et al., 2004Kulik et al., , 2013, an interior-point algorithm is used to minimise the Gibbs free energy of multiphase systems. Their method, however, does not use the logarithmic barrier functions of Fiacco and McCormick (1990) nor the KKT perturbation approach of El-Bakry et al. (1996), which are practices commonly adopted in several non-linear programming packages and algorithms (Ulbrich et al., 2004;Silva et al., 2008;Wächter and Biegler, 2005a,b,c;Vanderbei, 1999;Benson et al., 2000;Vanderbei, 2006;Byrd et al., 1999;Fernandes, 2008, 2011a,b). GEM-Selektor assumes that at every iteration the minimisation of the Gibbs energy is a convex problem, even though non-ideal phases are assumed in their calculation. As a consequence, the algorithm is limited to linear constraints only, such as mass-balance and charge-balance, otherwise the convexity assumption would not be possible. At every iteration, a convex minimisation algorithm is used to find a feasible descent direction that will be used to find a step length that sufficiently minimises the Gibbs free energy, a procedure similar to a line-search strategy. The algorithm not only maintains feasibility on the bounds at every iteration, but also on the mass-balance constraints. Our approach, in contrast, considers possible non-linear equilibrium constraints, such as the imposition of pH, activities and fugacities. Therefore, these more complex constraints make it impractical to maintain full feasibility at every iteration, though the converged solution satisfies all equality and inequality constraints. Harvie et al. (1987) presents an equilibrium algorithm for non-ideal multiphase systems based on the minimisation of the Gibbs free energy. Their algorithm consists of transforming a constrained minimisation problem into an unconstrained one by introducing Lagrange multipliers and quadratic slack variables to circumvent the bound constraints. Moreover, their approach has some similarity to projected gradient active-set minimisation methods, since their calculated directions are always feasible and some species are sometimes made active (i.e., having zero number of moles) during the search for the stable phase assemblage. However, their method presents some strategies for finding the global minimum, an approach we do not attempt in this work, since its computational cost would make our proposed method prohibitive for reactive transport simulations.
In the recent work of Harvey et al. (2013), a Gibbs free energy minimisation method for systems composed of solid solutions was developed. Their method uses exact Hessian expressions, which potentially produce fast convergence rates near the solution. They compared their method with the general-purpose optimisation packages IPOPT (Wächter and Biegler, 2005c), KNITRO (Byrd et al., 1999), and SNOPT (Gill et al., 2002(Gill et al., , 2005, identifying the necessity of suitable heuristics to improve the efficiency and robustness of chemical equilibrium calculations. Moreover, they observed that using quasi-Newton approximations for the Hessian matrix, an approach adopted by the general optimisation solver SNOPT, results in severely inefficient chemical equilibrium calculations. In their algorithm, all phases and species are initially assumed to exist at equilibrium, allowing a simplification of the KKT equations by eliminating the complementary equations. However, some phases can be excluded during the calculation if they are detected as unstable. The authors define an unstable phase as a phase that for a certain number of past iterations a full Newton step would bring its molar abundance to a negative value. These phases are then removed because eliminating the complementary conditions from the KKT equations is subjected to all species having non-zero number of moles. In the end of the calculation, phase stability tests are performed to identify any excluded phase that should be added to the system, or any phase at equilibrium that should be removed. Our algorithm, on the other hand, does not exclude or add any phase in the course of the calculation to attain numerical stability. All phases are assumed in the calculation, and the unstable phases are identified in the end of the calculation with a rigorous phase stability test.

Integration into reactive transport simulators
As discussed by Kulik et al. (2013), a chemical equilibrium solver to be integrated into a reactive transport simulator must be extremely efficient and accurate. Efficiency is paramount in these large scale simulations, since equilibrium calculations must be performed for every grid block of the mesh, at every time step. Accuracy is equally important, since the mass-balance residuals of the calculations can accumulate over time and propagate throughout the reservoir. These accuracy problems can then culminate in an unstable state of the whole transport simulation. Another essential attribute of an equilibrium solver is to accurately and robustly determine the stable phases of a multiphase system. Otherwise, the fronts of the flow, where phases are constantly appearing and disappearing, can be highly inaccurate.
Due to these demanding requirements on equilibrium solvers for reactive transport simulations, Kulik et al. (2013) has recently reviewed their own algorithm. Their new GEMS3K code is an improvement of the previous GEMIPM2K, whose accuracy and stability were not sufficient for transport modelling (Shao et al., 2009). Therefore, this necessity of Kulik et al. (2013) in adapting their numerical method for critical geochemical applications raises the question of whether other approaches in the geochemical literature require some adaptations as well. Since many geochemical equilibrium methods were initially developed with the primary intent for speciation-solubility calculations and plotting phase diagrams, this might indeed be the case.
In principle it may not be very clear how a dedicated equilibrium solver, specially one based on the minimisation of Gibbs free energy, could be integrated into chemical kinetics or reactive transport calculators. This is because the common approach is to directly incorporate the massaction equations into the system of differential equations, and solve both transport and chemical processes simultaneously. Examples of simulators that adopt this approach are TOUGHREACT (Xu et al., 2004(Xu et al., , 2006, PFLOTRAN (Lu and and Lichtner, 2005), and CrunchFlow (Steefel et al., 2009). This approach is also adopted in the works of Saaltink et al. (1998) and Bea et al. (2009).
In Leal et al. (2014b), however, we develop an algorithm for chemical kinetics that uses our Gibbs energy minimisation method. The mathematical formulation of the problem, nonetheless, is derived in a way that any equilibrium solver could be used, which can only be achieved if the equilibrium conditions in the kinetics problem are handled abstractly. For example, without direct substitution of mass-action equations in the kinetics equations. The method is based on an implicit formulation to guarantee higher efficiency and stability in the integration of the differential equations. The quantities that evolve with time are the natural molar amounts of chemical elements, and not the usual total concentration of primary species (Steefel and Cappellen, 1990). The presented systematic approach for kinetics could be straightforwardly applied into the context of reactive transport equations, either using an implicit or an operator splitting approach. For the implicit approach, the partial derivatives of the molar abundance of the species with respect to the molar abundance of the elements are necessary, and its calculation is also shown in Leal et al. (2014b).

Proposal of a novel geochemical equilibrium algorithm
In this work we propose the solution of the Gibbs energy minimisation problem by casting it into a non-linear programming problem. The advantage of this approach is that it becomes easier to identify the relevant aspects of the problem and to decide which minimisation method, among those existent in the mathematical literature, best solves it in terms of efficiency, accuracy and robustness. Moreover, this approach eliminates the dependence of specific chemical details of the problem on its numerical solution, simplifying its analysis and implementation.
Our numerical methodology for multiphase equilibrium calculations is based on the trust-region primal-dual interior-point algorithm of Ulbrich et al. (2004) and Silva et al. (2008). Their algorithm solves non-linear programming problems with non-convex objective functions containing both equality and inequality constraints. We briefly describe their method here for completeness, together with our modifications tailored to yield more efficient and robust chemical equilibrium calculations. The interested reader, however, is referred to Leal et al. (2014a) for the complete presentation of the method.
This chemical equilibrium method has been implemented in REAKTORO, a scientific library written in the C++ programming language for computational geochemical modelling. Currently, it provides methods for both multiphase chemical equilibrium and kinetics calculations. The code is freely available at http://bitbucket.org/reaktoro, where its licensing information can be found.

CHEMICAL EQUILIBRIUM
In this section we present the mathematical formulation for chemical equilibrium problems, where the system is composed of multiple phases and species. The equilibrium problem formulation is done generically, with the chemical system not being restricted in any aspect.
In what follows, we shall assume a chemical system composed of N species, where the ith species is denoted by a i and the set of species by a ¼ fa 1 ; . . . ; a N g. In addition, we consider that there exist E elements from which these species can be formed, where the jth element is denoted by e j and the set of elements by e ¼ fe 1 ; . . . ; e E g. Finally, we assume that the chemical species are partitioned among P phases, where a p i denotes the ith species in the pth phase, a p ¼ fa p 1 ; . . . ; a p N p g denotes the set of species in the pth phase, N p denotes the number of species in the pth phase, and I p denotes the set of indices of the species in the pth phase.

Equilibrium problem
Finding the equilibrium point of a multiphase system is a non-linear optimisation problem, where the Gibbs free energy of the system is minimised. The problem, however, is far from being trivial, since it contains both equality and inequality constraints, defining, therefore, a non-linear programming problem.
The equality constraints arises from the need to specify, for example, the number of moles of each chemical element in the system, the pH and the charge balance condition of an aqueous solution, the partial pressure of a gaseous species, and so forth. The inequality constraints, on the other hand, results from the physical condition that the number of moles of the species are non-negative.
From the principle of minimum Gibbs free energy, we calculate the equilibrium state of a chemical system by solving the following constrained minimisation problem: min n Gðn; T ; P Þ subject to where n 2 R N is the molar composition vector of the system; T and P are, respectively, the temperature and pressure of the system; h : R N # R M is the equilibrium constraint function; N is the number of species; and M is the number of equilibrium constraints. The temperature T and pressure P are assumed given parameters. The Gibbs energy function G : R 2þN # R is defined by: where n i is the number of moles of the ith species; and l i : R 2þN #R is the chemical potential function of the ith species, given by: In the previous equation, l i : R 2 #R is the standard chemical potential function of the ith species; a i : R 2þN #R is the activity function of the ith species; and R is the universal gas constant.
In the literature, the activity of a species is commonly replaced by the product of a concentration quantity and an activity coefficient. For example, the activity of an aqueous species is replaced by: where m i and c i are its molality and activity coefficient. The activity of a gaseous species, on the other hand, is replaced by: where x g i and u i are its molar fraction and fugacity coefficient, and P is a reference pressure, usually equal to 1bar, though this depends on the standard chemical potential data of the gaseous species.
In our formulation, however, the term a i representing the activity of a species is included in all equations. The idea is to avoid the explicit dependence of the type of the species every time an equation is written. For example, the Gibbs energy function (2.2) can be written succinctly and generically without the need to write terms that corresponds to the aqueous, gaseous, mineral, and possibly other types of species. This is a fundamental step for the development of a general chemical equilibrium algorithm that can be applied to any type of systems. At the computational level, such abstraction is also possible to be made with programming techniques such as polymorphism.
In general, the standard chemical potential l i of a species can be interpolated from temperature versus pressure tables. This is an efficient and sufficiently accurate approach, justified by the fact that the evaluation of their equations of state can be prohibitively expensive. However, the same practice cannot be done with the activity a i of a species, because this function depends on the composition of the system. Therefore, they must be directly computed from their equations of state. Moreover, fast convergence of the Gibbs energy minimisation problem require at least their first-order partial molar derivatives.
More information about the partial molar derivatives of the species activities, and the calculation of the gradient and Hessian of the Gibbs energy function (2.2) can be found in Leal et al. (2014a).

Equilibrium constraints
Below is a list of quantities that can be constrained at equilibrium: number of moles of an element; charge-balance of the aqueous solution; activity of a species; and partial pressure of a gaseous species.
The Gibbs energy minimisation problem, as formulated in Eq. (2.1), already provides the means to enforce these constraints. This can be done by using the equilibrium constraint function h, where each component of this vector function imposes an equilibrium condition.
Observe that our approach for specifying equilibrium constraints is general and flexible. A common constraint used in equilibrium calculations is the mass-balance conservation of elements, which are given by the linear equation: where W denotes the E Â N formula matrix of the system, with its ðj; iÞth entry denoting the number of atoms of the jth element in the ith species; E the number of elements; and b the molar abundance vector of the elements, whose component b j denotes the number of moles of the jth element in the system. If the mass-balance constraint is to be specified, then the equilibrium constraint function h can be defined as: and its gradient as: Let us now show how some other equilibrium constraints can be constructed individually.

Imposition of number of moles of an element
There are equilibrium problems in which the number of moles of some elements are unknown. Instead of specifying these amounts, they are in fact calculated, which constitutes an inverse problem (Kulik, 2006;Kulik et al., 2013).
As an example, consider the case where one needs to find out the amount of hydrochloric acid that needs to be mixed with 1 kg of water to produce a solution with pH equal to 4.0. In this problem, therefore, the number of moles of elements H and Cl are unknown a priori. However, these amounts can be determined by solving an equilibrium problem where (i) the pH of the solution is imposed together with (ii) the condition of charge-balance of the aqueous solution and (iii) the known amount of element oxygen. Thus, we see that the capability of specifying the number of moles of individual elements is of fundamental importance for chemical equilibrium modelling.
Imposing the number of moles of the jth element is governed by the following equilibrium constraint function: where b H j denotes the desired number of moles of the jth element; and w j 2 R N the vector formed from the jth row of the formula matrix W. The gradient of this function is given by: ð2:9Þ 2.2.2. Imposition of charge-balance As we just saw, the charge-balance condition is a convenient constraint to be imposed when the amount of a chemical element is unknown. This constraint can be even imposed when the amounts of all elements are known, so that the obtained solution certainly does not violate this physical condition. For example, Kulik et al. (2013) always impose the charge-balance condition, since they use the Lagrange multiplier corresponding to this linear constraint to compute redox potential quantities such as pE and Eh, as demonstrated in Kulik (2006).
By accounting for the balance of charges in an aqueous solution, the following condition must be attained: where z a 2 R N is the vector of charges of all species, whose entries are non-zero for the charged aqueous species only. Therefore, the equilibrium constraint function that imposes the charge-balance condition is given by: hðnÞ :¼ z T a n; ð2:11Þ whose gradient is: rhðnÞ ¼ z a : ð2:12Þ

Imposition of activity of a species
Imposing the activity of a species can be useful in some cases. For example, the pH of an aqueous solution can be easily measured, and so the activity of species H þ is known, given by: Therefore, this information can be used towards the calculation of the equilibrium state of the solution. Imposing the activity of the ith species requires defining the equilibrium constraint as: where a H i is the desired activity for the ith species. It follows that the gradient of the above function is given by: where ra i ðnÞ is the gradient of the activity function a i . The analytical calculation of these derivatives results in efficient equilibrium calculations, and they are of utmost importance in the minimisation of the Gibbs free energy. For simplicity, note that we have omitted the dependence on temperature T and pressure P of the activity function a i and its gradient ra i in the previous formulation.

Imposition of partial pressure of a gaseous species
Imposing the partial pressure of a gas is a fairly common practice. In Bethke (2007), the equilibrium problem of the dissolution of pyrite (FeS 2 ) is considered. In this problem, the fugacity of the gaseous species O 2 (g) is kept constant in order to simulate the contact of the solution with the atmosphere. Sometimes, however, one might opt to impose the partial pressure of a gas instead of its fugacity, which are equivalent practices when the gases are considered ideal. Thus, for instance, since oxygen makes up about 20% of the atmosphere, one needs to set the partial pressure of O 2 (g) to 0.2 atm, assuming the atmospheric pressure is 1 atm.
The partial pressure of the ith gaseous species is given by: Thus, the equilibrium constraint function can be defined as: ð2:17Þ and its gradient as: where the jth component of the vector rx g i , denoted by @x g i =@n j , should be zero if the jth species does not belong to the gaseous phase. Note that x g i and rx g i can be easily calculated from n using the indices of the gaseous species I g .

NUMERICAL METHOD
In this section we briefly describe the developed numerical method for equilibrium calculations of multiphase systems. Its detailed description can be found in Leal et al. (2014a), which adopts an approach of separating pure mathematical concepts from those specific to chemical equilibrium that are irrelevant at the algorithmic level. As mentioned before, the method is based on an adaptation of the interior-point minimisation algorithm of Ulbrich et al. (2004) and Silva et al. (2008).
Let us represent the Gibbs energy minimisation problem (2.1) as a general non-linear programming problem in the standard form: where the objective function f : R n #R and the equality constraint function c : R n #R m are assumed to be twice continuously differentiable. Moreover, let us assume that m 6 n, where n denotes the number of variables and m the number of constraints . Finally, we denote the vector of variables in the optimisation problem (3.1) by x 2 R n , which corresponds to the molar amounts of the chemical species in the context of the Gibbs free energy minimisation.

First-order optimality conditions
Let us write the necessary first-order optimality conditions for the non-linear programming problem (3.1). These are also known as the Karush-Kuhn-Tucker or KKT conditions (see Nocedal and Wright, 1999), and depends on the following definition of the Lagrange function: Lðx; y; zÞ :¼ f ðxÞ þ cðxÞ T y À x T z; ð3:2Þ where y 2 R m and z 2 R n are Lagrange multipliers. The KKT conditions, and their corresponding names, are then written as: r x Lðx; y; zÞ ¼ 0; optimality ð3:3Þ x; z P 0; dual-feasibility ð3:6Þ where X :¼ diagðxÞ and r x L denotes the gradient of the Lagrange function with respect to the primal variables x, given by: r x Lðx; y; zÞ ¼ rf ðxÞ þ rcðxÞ T y À z: ð3:7Þ Eqs. (3.3)-(3.6) are the requirements that a local solution must satisfy. Note that besides the primal variables x, the Lagrange multipliers y and z are also unknowns in the problem. Thus, there are a total of 2n þ m unknowns, which corresponds to two times the number of chemical species plus the number of equilibrium constraints, 2N þ M. As discussed in Kulik et al. (2013), it is imperative that the size of the linear systems to be solved are moderate in order to efficiently solve equilibrium problems. Fortunately, the Lagrange multipliers z can be explicitly written in terms of x and y, which decreases the size of the linear systems by n, as shown in Leal et al. (2014a).

Perturbed KKT conditions
Similarly to El-Bakry et al. (1996), the primal-dual interior-point algorithm of Ulbrich et al. (2004) solves the problem (3.1) by suitably perturbing the KKT complementary condition (3.5). The approach results in the following perturbed system of non-linear equations: where Fl is defined by: Flðx; y; zÞ :¼ r x Lðx; y; zÞ cðxÞ Xz Àle 2 6 4 3 7 5; ð3:9Þ withl > 0 denoting a small perturbation parameter; and e 2 R n the vector of all ones. More details on the perturbation parameterl, and its progress towards zero can be found in Leal et al. (2014a). Eq. (3.8) is solved using Newton's method. However, several strategies are adopted to aid convergence from arbitrary or poor initial guesses, since Newton's method is not guaranteed to converge far from a local solution (Nocedal and Wright, 1999). Other strategies are also adopted to guarantee that the solution is a local minimum, and not a maximum for example.

Convergence strategies
In this section we list a few convergence strategies adopted in the algorithm to aid convergence from poor initial guesses. A more in-depth description of these strategies can be found in Ulbrich et al. (2004) and Silva et al. (2008), and their modification for chemical equilibrium calculations is shown in Leal et al. (2014a).

Filter
The filter technique developed by Fletcher and Leyffer (2002) is used. The idea has its origins in multi-criteria optimisation problems, whose adaptation for non-linear programming problems is possible by considering the minimisation of the optimality and feasibility measures as two competing targets. These measures are respectively related to the norm of the optimality and feasibility conditions (3.3) and (3.4).
The filter works as follows. At every iteration, a decision is made about storing or not the current optimality and feasibility measures. These records are then used in subsequent iterations to decide the rejection of poor iterates. A poor iterate is defined as an iterate whose corresponding optimality and feasibility measures is not sufficiently smaller than those stored in the filter. Therefore, the filter guarantees that at every iteration the current iterate is closer to a local optimum solution.

Trust-region
Newton's method is well known to frequently behave erratically far from the solution. Thus, convergence strategies exist in which the length of the step taken towards a new iterate is somehow controlled. Differently from a line-search strategy, in which the size of Newton's step is controlled and its direction preserved, a trust-region strategy controls both its size and direction. This is done with the introduction of a decomposition of Newton's step into a tangential and normal components, whose lengths are restricted so that they fit inside a variable trust-region.
The benefit of this decomposition is the possibility to control how much the iterations should favour the decrease of either the feasibility measure or the optimality measure. By controlling the size of the tangential step, iterates that satisfy more the feasibility condition (3.4) is obtained. For example, the smaller the feasibility measure, the smaller the mass-balance residuals in the minimisation of Gibbs free energy. On the other hand, by controlling the size of the normal step, iterates that conform more to the optimal condition (3.3) are obtained. In addition, the lengths of these components can be appropriately adjusted to enforce positive number of moles at every iteration.

Restoration phase
Whenever the algorithm fails to find an iterate that does not satisfy the filter conditions, some action other than halt its execution must be done. When this happens, the algorithm enters in a restoration phase, in which it focus in the decrease of an alternative feasibility measure. During the iterations in the restoration phase, some progress towards improved optimality measures is also obtained, though the main result after the end of the restoration phase is in general the decrease of the feasibility measure (e.g., the mass-balance residuals). Once an iterate in the restoration phase satisfies the filter conditions, the normal trust-region algorithm takes back control of the calculation.

Scaling
As discussed in Nocedal and Wright (1999), it is crucial that proper scaling of the variables is done to improve the performance of the algorithm. Indeed, as we shall see later, the primal-dual interior-point algorithm works more efficiently for sequential optimisation calculations if proper scaling of the variables is performed.
Let us denote D as the diagonal scaling matrix of the primal variable x. Then, we define the scaled primal variable x as: ð3:10Þ The implication of this scaling on the objective and constraint function, as well as on their derivatives, is detailed in Leal et al. (2014a). When performing sequential equilibrium calculations, the previous solution serves as an excellent option for scaling the primal variables for the next calculation. Therefore, denoting byx the primal solution in the previous calculation, we set: Note, however, that for a standalone calculation, or at the beginning of a sequence of optimisation calculations, a good scaling is not available. This follows from the fact that only a poor initial guess can be provided at that time. Therefore, we always perform the first optimisation calculation without scaling. The performance boost of the calculation with scaling, as we shall see later, is in part explained by the shift of the scaled variables to the interior of the feasible domain. This is a beneficial condition for interior-point algorithms, which can behave erratically near the boundaries of the feasible domain. Moreover, this solves the issue of some variables having different orders of magnitude, which is very common in chemical equilibrium problems.

Watchdog strategy
The minimisation algorithm adopts a monotone strategy. In this strategy, every accepted iterate provides a sufficient decrease in either the optimality or feasibility measures. Its main advantage is to aid convergence from arbitrary initial guesses, since an uncontrolled scheme of the step lengths could result in huge increases of these measures.
Nevertheless, a well-known disadvantage of monotone strategies is that it potentially reject iterates that would make good progress towards the solution (Nocedal and Wright, 1999). This is known as the Maratos effect, when the algorithm continuously discards good iterates because they either increase the optimality or feasibility measures. As a result, convergence to a solution becomes very slow.
The Maratos effect can be circumvented with the use of the watchdog technique of Chamberlain et al. (1982). Our adapted approach consists of applying plain Newton's method to the relaxed KKT Eq. (3.8) with a constant perturbation parameter _ l, rather than the variablel. In addition, normal and tangential components for the trust-region approach are not computed, nor is the new iterate supposed to be acceptable by the filter. Therefore, this technique is considered a nonmonotone strategy.
After a sufficient progress has been made in the calculation, our algorithm switches to the watchdog mode. This is done in order to speed up convergence towards the optimum solution. The calculation is said to have progressed sufficiently ifl < l w , where l w is the watchdog threshold, whose default value is l w ¼ 10 À1 . Note, however, that the algorithm leaves a monotone strategy to a nonmonotone one, opening up the possibility of divergence. Hence, it is important to monitor the calculation to detect when no progress is being made towards the solution.
Our monitoring approach consists of checking after W iterations if the current iterate is acceptable by the filter. If so, then the filter is extended with the current feasibility and optimality measures and a new round of W iterations under the watchdog strategy is allowed. Otherwise, the algorithm reverts to the monotone trust-region strategy, starting from the last watchdog iterate acceptable by the filter. In addition, we monitor the optimality measure at every watchdog iteration. The idea is to prevent an uncontrolled increase of that measure.

Phase stability test
Once an equilibrium calculation has been performed, it is fundamental to determine the stability of the phases in the chemical system. The presented interior-point minimisation algorithm alone produce a rough estimate of which phases are unstable, since these phases have small number of moles with respect to the total in the system. However, this simplistic phase stability test is not very accurate and helpful, since the small number of moles of the unstable phases are very sensitive to the perturbation parameterl of the interior-point method. In addition, it does not indicate how far from equilibrium the unstable phases are.
Therefore, we adopt a phase stability test similar to the one presented by Kulik et al. (2013), which has been successfully applied in the geochemical package GEM-Selektor. In this test, stability indices for all phases are readily calculated from the Lagrange multipliers of the minimisation calculation. The advantage of this approach over the analysis of the relative number of moles of the phases is that these indices are less sensitive to algorithmic parameters, providing a more accurate indication of which phases are unstable. In addition, it also provides a quantitative measure of how far they are from equilibrium.
In Kulik et al. (2013), the stability index K p of the pth phase is defined as: where X p is e same phase. Differently from Kulik et al. (2013), however, we define X p as: where I p is the set of indices of the species in the pth phase; i H is the local index of the ith species in its phase; x p i H is the molar fraction of the ith species; and z i is the ith component of the vector of Lagrange multipliers z. Note that the minimisation algorithm of Kulik et al. (2013) does not introduce the additional Lagrange multipliers z, and so the calculation of X p uses a different, but equivalent, equation.
After the calculation of the phase stability indices, the unstable phases can be identified. For every stability index K p such that jK p j P K , the pth phase is classified as unstable, where K is the phase stability tolerance, whose default value is K ¼ 0:01. Then, their molar composition is zeroed and the mass-balance of the stable phases is corrected. This procedure ensures that the equilibrium solution does not violate the Gibbs phase rule.
To perform this mass-balance correction procedure, new equilibrium calculations are required. However, at this time, only the stable phases need to be considered. Observe that this correction approach is in general very efficient, because (i) a good initial guess is already known, and (ii) the size of the chemical system has been reduced. Therefore, using a quadratic convergent equilibrium method should require one or two iterations in general.
We have opted to use the equilibrium method based on the law of mass-action presented in Leal et al. (2013) to perform these mass-balance corrections. This is because the linear systems solved in that method have dimension N, instead of N þ M in the interior-point algorithm. In addition, it does not involve any Lagrange multiplier, and so the state of the multipliers y and z of the interior-point method is not artificially altered, which could compromise subsequent equilibrium calculations that require these as initial guesses.

RESULTS AND DISCUSSION
In this section we demonstrate the capability of our Gibbs energy minimisation algorithm to perform efficient chemical equilibrium calculations. Our objective is to show that (i) the algorithm is capable of minimising the Gibbs energy of multiphase systems, (ii) only a few iterations is required to obtain convergence when carrying out sequential chemical equilibrium calculations, and (iii) the calculations converge at quadratic rates near the solution.
The following activity and fugacity coefficient models were used to perform the calculations for the study of carbon dioxide injection in deep saline aquifers: HKF extended Debye-Hü ckel activity coefficient model for solvent water and ionic species, Kirkham (1974a,b, 1976); Helgeson et al. (1981); the Setschenow activity coefficient model for neutral aqueous species other than CO 2 (aq); the activity coefficient models of Drummond (1981) or Duan and Sun (2003) for CO 2 (aq); the fugacity coefficient models of Spycher and Reed (1988); Spycher et al. (2003) or Duan et al. (2006) for CO 2 (g) and H 2 O(g).
Similarly to Leal et al. (2013), the ideal fugacity coefficient model for H 2 O(g) is used in conjunction with the model of Duan et al. (2006) for CO 2 (g). As shown there, accurate results of CO 2 solubility in brines can be obtained with this simplification. This choice is an alternative to the use of the expensive equation of state of Duan et al. (1992) for H 2 O(g).
The chemical potential of the species were obtained using the equations of state of Helgeson and Kirkham (1974a); Helgeson et al. (1978); Tanger and Helgeson (1988); Shock and Helgeson (1988) and Shock et al. (1992). The database file slop98.dat from the software SUPCRT92  was used to obtain the parameters for the equations of state. The equation of state of Wagner and Pruss (2002) was used to calculate the density of water and its temperature and pressure derivatives.

Solubility of CO 2 in NaCl brines
Modelling carbon storage in saline aquifers requires accurate calculations of the solubility of CO 2 in NaCl brines. This is necessary to model its fate after injection and to estimate how much of it could be trapped by solubility mechanisms. In addition, the accuracy of these calculated solubilities have a direct impact on the correctness of the geochemical modelling of the reactions between fluid and rock. This is because these reactions are strongly pro-moted by the acidification of brine that occurs with the dissolution of CO 2 .
The multiphase system used to model the H 2 O-CO 2 -NaCl system is described in Table 1. The solubility calculations were performed by assuming a mixture containing 1 kg of H 2 O, 10 mol of CO 2 , and 2.5 or 4.0 mol of NaCl. The amount of CO 2 was enough to saturate the aqueous solution and form a gaseous phase at all considered temperatures, pressures and salinities. This is fundamentally necessary since our goal is to calculate the saturated solubility of that gas in brine. Once the amounts of H 2 O, CO 2 and NaCl are specified, one can easily calculate the number of moles of the elements H, O, C, Na, and Cl and perform a chemical equilibrium calculation. From the obtained solution, the solubility of CO 2 is given by the molality of element C in the aqueous phase, which accounts for all aqueous species containing carbon. Fig. 1 compares our calculated CO 2 solubilities with the experimental measurements of Hou et al. (2013). The calculations were performed at temperatures 323. 15, 373.15, and 423.15 K, pressures 20-200 bar, and salinities 2.5 and 4.0 molal. By using the activity coefficient model of Drummond (1981) for CO 2 (aq) and the fugacity coefficient model of Spycher et al. (2003) for CO 2 (g) and H 2 O(g), we obtained solubility results with percentage deviation errors smaller than 5%.
Observe in Fig. 1 the salting out effect on the solubility of CO 2 at different brine salinities. This result shows that the amount of carbon trapped in saline aquifers by solubility mechanisms decreases with the increase in brine salinity. The increase in pressure, on the other hand, causes an increase in the amount of dissolved CO 2 , while the increase in temperature has the opposite effect. Note that the results indicate that our choice of activity and fugacity coefficient models can accurately represent the salting out effect as well  Fig. 1. Comparison of our calculations of CO 2 solubility in 2.5 and 4.0 molal NaCl brine (lines), using the activity coefficient model of Drummond, 1981 for CO 2 (aq) and the fugacity coefficient model of Spycher et al., 2003 for CO 2 (g) and H 2 O(g), with the experimental solubility data of Hou et al., 2013 (points). The calculations assumed a CO 2 -H 2 O-NaCl system composed of an aqueous and gaseous phase.
as compute the solubility of carbon dioxide at high temperatures and pressures.

Sequential chemical equilibrium calculations
As discussed in Section 1, efficient sequential equilibrium calculations are fundamental for some critical applications, such as reactive transport modelling. In this section we show the performance results of our Gibbs energy minimisation algorithm, and we analyse the importance of the watchdog strategy and scaling of the variables on the efficiency of the equilibrium calculations. Consider the multiphase system in Table 2. This system is of interest for modelling water-gas-rock interactions in CO 2 storage in deep saline aquifers, where the reservoir is mainly composed of carbonate minerals. The formation rock is modelled by the mineral phases calcite, magnesite and dolomite. In order to model a possible precipitation of sodium chloride when the aqueous phase becomes salt saturated, the mineral phase halite is also assumed.
To assess the efficiency of our equilibrium calculations in a carbon injection simulation, we start from a chemical state containing only an aqueous phase and mineral phases. Then, CO 2 is gradually added to the system until a specified amount is reached, which should be high enough to saturate the aqueous phase and produce a gaseous phase. This example is also useful to assess the robustness of the algorithm when handling phase assemblage transitions.
The modelling of the previous problem is now described. Let H 2 O; CO 2 ; NaCl; CaCO 3 and MgCO 3 denote auxiliary components of the chemical system in Table 2. At the initial and final states of the chemical system, we consider their molar abundance given by Table 3. Next, let n c denote the vector of molar abundance of the components. Define the following linear path: where n i c and n f c are the given initial and final molar abundance of the components; and t 2 ½0; 1 is a scalar parameter. The inputs of our sequential equilibrium calculations are determined, therefore, by gradually increasing the parameter t in order to model the addition or removal of components from the system. Note, however, that the molar abundance of the components n c are auxiliary inputs for   the chemical equilibrium calculations. The natural inputs are the number of moles of the chemical elements, which can be determined from the expressions in Table 4.
Using the molar abundance of the components in Table 3, we calculated the initial and final equilibrium states of the system as shown in Table 5. It can be seen that the amount of dissolved carbon was considerably increased from the initial to the final state, while the aqueous solution became more acidic, with initial and final pH as 9.2 and 4.8 respectively. In addition, note that the zeroed number of moles of some species indicates that the phases containing them are not present at equilibrium. For example, from all assumed mineral phases, only calcite and dolomite exist at given conditions of temperature, pressure and molar amounts of the components.
The calculations assumed T = 60°C and P = 150 bar, and a tolerance error of 10 À8 . In addition, the activity coefficient model of Duan and Sun (2003) was used for the aqueous species CO 2 ðaqÞ because of the presence of ions Ca 2þ and Mg 2þ in the aqueous solution. The fugacity coefficient model of Duan et al. (2006) was adopted for the gaseous species CO 2 ðgÞ. The initial guess for the calculations in the absence of good estimates is shown in Table 6.
Figs. 2 and 3 show the efficiency of the Gibbs energy minimisation algorithm without and with the watchdog strategy. They show the required number of iterations to achieve convergence for each of the 20 sequential equilib-rium calculations. Observe that the watchdog strategy is capable of boosting the convergence speed significantly. For example, the first calculation required 3.5 times fewer iterations when adopting this strategy.
Note that the first calculation requires more iterations than the others. This is because only poor initial guesses for the molar abundance of the species and the Lagrange multipliers are available. The second calculation still requires a few more iterations than the others, which is justified by the fact that the initial state is relatively distant from the second, since the former did not contain any CO 2 . Nevertheless, observe that the calculations achieve convergence in 1-3 iterations after the second calculation.
At about n CO 2 ¼ 0:8 mol, the gaseous phase emerges in the system and the number of iterations increases slightly during the handling of this phase assemblage transition, as seen in Fig. 2 . As shown in Fig. 3, however, the number of iterations in this region is just slightly affected when the watchdog strategy is used. This is an important efficiency demonstration of the method, since in multiphase reactive transport simulations the front of the flow is constantly experiencing appearance and disappearance of phases.
Figs. 2 and 3 adopted the scaling procedure presented in Section 3.4. Figs. 4 and 5, on the other hand, show the same calculations without such scaling. Fig. 4 indicates that the trust-region minimisation algorithm alone is highly sensitive to scaling, as discussed in Nocedal and Wright (1999), and its efficiency is severely compromised without it. The watchdog strategy, however, is just slightly affected by the lack of scaling, with its main inefficiency occurring near a phase boundary as shown in Fig. 5. Therefore, we see that scaling of the variables is fundamental for efficient equilibrium calculations, either using or not the watchdog strategy, since any small gain in efficiency can cause expressive performance results in reactive transport simulations.
Figs. 6 and 7 show the efficiency of the method when performing 100 sequential calculations instead of 20. Since these calculations used shorter subintervals, it can be said Table 6 The initial guess of the molar abundance of the species.

Species
Initial guess H 2 OðlÞ n H2O CO 2 ðgÞ n CO2 Halite n NaCl Calcite n that the difference between two of its consecutive solutions is smaller than the one when 20 sequential calculations are performed. This is reflected by the decrease in the number of iterations in the region ½0:2; 0:8 when compared to the same region in Figs. 2 and 3. Finally, increasing the number of subintervals to 1000 yields an average number of iterations of 1.2 with the use of watchdog strategy. Fig. 8 demonstrate that the sequence of equilibrium calculations are still efficiently calculated assuming simultaneous variations in temperature and pressure. The initial and final temperatures and pressures used were 60-160°C and 100-300 bar, resulting in a variation of 5°C and 10 bar for every equilibrium calculation. Note that the number of iterations after the gaseous phase is formed has increased by one. This is because this phase is more sensitive to changes in temperature and pressure, and an extra iteration is necessary to correct the molar abundance of the gaseous species.
These results demonstrate that the use of the watchdog technique is fundamental if this algorithm is to be used for multiphase reactive transport modelling. By adopting this nonmonotone strategy, the calculations will potentially converge in only few iterations, even in regions near the front flow. Moreover, the scaling technique used in the calculations is capable of further decreasing the number of iterations necessary to solve the chemical equilibrium problems. As to the accuracy of the calculations, the massbalance residuals are either zero or in the range 10 -25 to 10 -19 for some elements. Thus, the accuracy of the method satisfies the relative threshold of 10 À13 discussed in Kulik In Figs. 2-8 , the additional effort in correcting the massbalance of the stable phases once an equilibrium calculation has finished was not represented. In these sequential calculations, we performed only a single correction iteration when unstable phases were encountered.
Finally, we present Fig. 9 as a side result of all these calculations. It shows the pH of the aqueous solution and the molalities of the species Ca 2+ and Mg 2+ as carbon dioxide is injected into the system. Observe that the aqueous phase becomes acidic with the injection of CO 2 , and promotes further dissolution of the carbonate minerals, as suggested by the increase in the concentrations of the ions Ca 2+ and Mg 2+ . Also note that once the gaseous phase emerges in the system, the solubility of CO 2 in the aqueous phase remains constant, and so does the pH and the concentrations of the ionic species.

Convergence rates of the interior-point method
In Section 1 we argued that superlinear rates of convergence are essential if this algorithm is to be incorporated in critical applications such as chemical kinetics and reactive transport modelling. Because initial guesses in these applications are usually close to the solution, with its proximity being mainly dependent on the used time step, it is of utmost importance that an equilibrium algorithm converges at fast rates locally. In this section we present results that permit us to analyse the convergence rates of the trust-region interior-point algorithm with and without the watchdog strategy. Figs. 10-12 present the convergence plot of equilibrium calculations assuming the multiphase system of Table 2. The residual of a calculation is defined as the Euclidean norm of the relaxed KKT function FlðwÞ given in Eq. (3.9). All calculations were performed using poor initial guesses, so that the initial residual would be large. The watchdog threshold l w ¼ 10 À6 was used instead of l w ¼ 10 À1 to create two distinct regions in the graphs that show when the watchdog strategy is activated. The solid circles indicate the activation of the watchdog strategy, and the empty circles indicate the maximum residual attained during the non-monotone iterations. Note that all calcula-tions succeeded under the watchdog mode, without returning to the monotone trust-region strategy.
These calculations were performed with different equilibrium conditions. In Fig. 10, for example, the amount of CO 2 in the system is not enough to produce a gaseous phase. In Fig. 11, a gaseous phase is about to emerge, and so the calculation was performed near a phase boundary. In Fig. 12, the specified amount of CO 2 was sufficient to saturate the aqueous phase and to form a gaseous phase.
These results show that the use of the watchdog strategy boosts the convergence rate near the solution. This applies even in critical regions, such as those where a phase boundary exists. By using the nonmonotone watchdog strategy, our calculations could achieve quadratic rates of convergence. These rates are superior to the linear ones obtained with the original monotone trust-region interiorpoint method of Ulbrich et al. (2004) and Silva et al. (2008). The convergence rates a r of the calculations were computed using the formula: a r :¼ logðr kþ1 =r k Þ= logðr k =r kÀ1 Þ; ð4:2Þ obtained by assuming that at the kth iteration the following applies: where r k denotes the residual at the kth iteration, and C is a positive constant. The values of the convergence rates displayed in Figs. 10-12 are average of the last three iterations.

Sensitivity of the interior-point method
The description of the interior-point method in the previous sections did not contain any discussion about how sensitive its produced solutions are with respect to the final perturbation parameterl. Recall that this parameter is used to relax the strict complementary conditions (3.5) of the KKT equations. Therefore, this relaxation is expected to perturb the solution as well, which we shall see in this section at which extent this happens. For our experiments, however, we will allow the watchdog strategy to be used, which uses a small constant perturbation parameter _ l until convergence.
Consider the multiphase system of Table 1. Similarly as before, we will fix the molar amounts of components H 2 O  10. Residual of the equilibrium calculation at n CO 2 ¼ 0:1 mol with and without the watchdog strategy. At this condition the gaseous phase is not present at equilibrium. The activation of the watchdog strategy is indicated by the solid circle, and the maximum residual attained during the non-monotone iterations is indicated by the empty circle. and NaCl and gradually add CO 2 to the system. Eventually a gaseous phase will be formed, but before this happens we want to investigate the effect of parameter _ l on the number of moles of the inexistent gaseous species. We expect that these are tiny values in comparison with the number of moles in other phases. Fig. 13 shows the effect of different values of _ l on the number of moles of the gaseous species CO 2 ðgÞ. The figure shows that the smaller the perturbation parameter _ l is, the more accurate is the numerical representation of the inexistent gaseous species. For example, using _ l ¼ 10 À15 results in the inexistent CO 2 ðgÞ having number of moles in the order of 10 À12 . Compare this with the case _ l ¼ 10 À9 , where the number of moles of CO 2 ðgÞ in its inexistent region is about to 10 -6 . Moreover, observe that the smaller the perturbation parameter _ l is, the sharper is the phase assemblage transition curve.

CONCLUDING REMARKS
The efficiency of a chemical equilibrium method for multiphase systems has been assessed. The method has been developed based on a primal-dual interior-point algorithm for non-convex minimisation problems. It provides flexibility for the imposition of several types of equilibrium constraints besides mass-balance and charge-balance. The trust-region approach aids convergence of the equilibrium calculations starting from poor initial guesses. The restoration algorithm increases the robustness of the method, overcoming convergence problems that would lead the calculation to a failure. The watchdog strategy produces quadratic rates of convergence locally and substantially reduces the number of iterations. The method comprises a rigorous phase stability test to identify the unstable phases, allowing the enforcement of the Gibbs phase rule and correction of mass distribution among the stable phases.
We have successfully applied the Gibbs free energy minimisation algorithm in a geochemical equilibrium problem relevant to CO 2 sequestration in saline aquifers. The method was capable of determining the correct phase assemblage at equilibrium in a system containing many phases. Application of the algorithm for sequential equilibrium calculations showed that the approach is very efficient, requiring 1-3 iterations in most calculations, even in critical regions such as phase boundaries. The results demonstrate that reactive transport simulations can potentially benefit by using this multiphase equilibrium algorithm.
The interested reader can find the implemented Gibbs energy minimisation algorithm in REAKTORO, a C++ scientific library for geochemical modelling, freely available at http://bitbucket.org/reaktoro.

APPENDIX A. CONVERGENCE RATES OF A STOICHIOMETRIC ALGORITHM
Here we briefly describe a common stoichiometric formulation for equilibrium calculations followed by comments on its convergence rate. Differently from a non-stoichiometric formulation, which is based in the minimisation of the Gibbs free energy, the stoichiometric formulation is based on the solution of a system of mass-action and mass-balance equations (Smith and Missen, 1982).

A.1. Mathematical formulation
In what follows, the chemical system has been partitioned into primary and secondary species. The ith primary species is denoted by a i , and the jth secondary species by a j . As it is commonly done in the geochemistry literature, the subscripts i and j are enough to tell if a species is primary or secondary (Bethke, 2007). In addition, the number of primary and secondary species are denoted respectively by N i and N j . For simplicity reasons, only an aqueous phase is assumed in the system.
In the stoichiometric approach, the primary and secondary species are related to each other according to the following linearly independent system of equilibrium reactions: where m ji is the stoichiometric coefficient of the ith primary species in the jth secondary species. For example, for the system H 2 O-CO 2 , we can write the following linearly independent reactions: OH À H 2 O À H þ ; CO 2 ðaqÞ HCO À 3 þ H þ À H 2 O; where the sets of primary and secondary species are fH 2 O; H þ ; HCO À 3 g and fOH À ; CO 2 ðaqÞg. Computing the equilibrium state of the system requires solving the following mass-balance equations: where m i and m j are the molalities of the ith primary species and jth secondary species; M i is the given total molality of the ith primary species; c i and c j are the activity coefficients of the ith primary species and jth secondary species; and K j is the equilibrium constant of the jth reaction. Remark: Using molalities as unknowns has the inconvenience of the need to handle different equations for solvent water and solutes, since the molality of water does not make sense. In this discussion, however, we keep it simple by listing only those mass-balance and mass-action equations that involves only solutes. The presented idea, however, is not compromised by this lack of completeness.

A.2. Numerical method
The common practice of solving Eqs. (A.2) and (A.3) is to eliminate the dependency of the the mass-balance Eq. (A.2) on the molalities of the secondary species m j . This results in the following system of equations: Let k denote the iteration counter of the calculation. Assume that the molalities of the primary and secondary species m k i and m k j are known at iteration k. Therefore, the new molalities of the primary species m kþ1 i can be calculated by applying Newton's method to the following system of equations: where it can be observed the resemblance of this update of m kþ1 j with a successive substitution method. Note that the dependence of the activity coefficients c i and c j on the molalities m i and m j is commonly ignored, and their values are hold constant until the end of the iteration. According to Bethke (2007), this is called the soft formulation.The hard formulation, on the other hand, takes into account a partial dependence of the activity coefficients c i and c j on the molalities of the primary species m i only. Theirupdate is done once both m kþ1 i and m kþ1 j have been obtained, after which the previous two-stage calculations using Eqs. (A.5) and (A.6) are repeated until convergence.
If a full dependence of the activity coefficients on the species molalities were taken into account, then Eq. (A.5) would become: in two stages as before. This formulation based on the law of mass-action approach that adopts a complete Newton scheme has been further studied in Leal et al. (2013).

A.3. Discussion
The motivation of only applying Newton's method to Eq. (A.5) is to avoid large linear systems. This incomplete Newton's scheme results in Jacobian matrices that are only N i Â N i , and since the number of primary species N i is in general considerably smaller than that of secondary species N j , the size of the linear systems is substantially decreased. This incomplete scheme is only possible if the non-linear dependence of the activity coefficients on the molalities of the species is completely or partially ignored. Eqs. (A.5) and (A.6) were obtained by completely ignoring this dependence, using a soft formulation, and so the following derivative terms are not included in the Jacobian matrix: @c i @m i ; @c i @m j ; @c j @m i ; @c j @m j ; ðA:8Þ which are extremely valuable information. Its absence can negatively impact the convergence rate of the calculation, specially in strong non-ideal systems or near phase boundaries, where these values are very sensitive. If, on the other hand, a hard formulation were used to obtain Eq. (A.5), then only the derivative terms: @c i @m i ; @c j @m i ; ðA:9Þ would have been taken into account. Since in general N i ( N j , the above terms represent just a small portion of all that should be considered. Thus, we see that the resulting Jacobian matrix, in both the soft and hard formulations, always lacks important first-order partial derivative information. Since quadratic convergence rates in Newton's method is subject to exact Jacobian expressions, the presented stoichiometric method should expect slower convergence rates. In fact, it can be said that the incomplete Newton's method is fundamentally more similar to the family of quasi-Newton methods, which converges Q-superlinearly, than standard Newton methods, which converges Q-quadratically (see Kelley, 1995;Nocedal and Wright, 1999).
Depending on the kind of chemical system, and its physical and chemical conditions, a soft or hard stoichiometric method might still converge in few iterations. Perhaps even less than an approach that attains quadratic convergence rates, since these optimal rates only happen near the solution. Therefore, it is near the solution that the algorithms should differ in performance, since one that converges with quadratic rates should do so in very few iterations. From this we see that for equilibrium problems in the context of chemical kinetics or reactive transport modelling, where the calculations always start with hot initial guesses, a well implemented quadratic convergent method can deliver optimal performance.