Extensions of CASH+ thermodynamic solid solution model for the uptake of alkali metals and alkaline earth metals in C-S-H

The new CASH+ core sublattice solid solution model of calcium-silicate-hydrate (C-S-H) can describe calcium uptake, solubility, water content, and mean silicate chain length up to 90 C. In this study, the model has been consistently extended to describe the equilibrium uptake of alkalis (Li, Na, K, Rb, Cs) and alkaline earths (Mg, Sr, Ba, Ra). The new endmembers for each cation were defined, and their properties, along with binary interaction parameters, were fitted against known aqueous and solid phase compositions. The CASH+ model with its extensions can be directly used in GEMS codes or discretized and exported for use with other geochemical speciation computer programs. In agreement with the experimental data, the model indicates that the uptake of alkalis and alkaline earth cations occurs mainly as a competitive (interlayer) ion exchange, uptake is favored at low C/S ratios, silicate chains get shorter at higher pH, and the uptake of bivalent cations (in the order Mg2+ < Ca2+ < Sr2+ < Ba2+ < Ra2+) is preferential over that of monovalent cations (Li ≈ Na < K ≈ Rb ≈ Cs).


Introduction
Calcium silicate hydrate (C-S-H) is the main solid phase in hydrated Portland cements, which is a poorly crystalline, nano-particulate, porous material, sometimes referred to as a "gel-like phase". The acronym "C-S-H" refers to the cement chemistry notation, where 'C' stands for CaO, 'S' for SiO 2 and H for H 2 O, the hyphens denote the variable composition of C-S-H gel, which is usually expressed in terms of molar C/S ratio C/S from 0.7 to 1.7. At lower C/S ratios C-S-H gel co-exists with amorphous silica (SiO 2(am) ) and at higher C/S ratios C-S-H gel co-exists with portlandite (Ca(OH) 2(cr) ) [1]. Chemical thermodynamic modelling of equilibria involving C-S-H, alkalis and other elements is crucial for quantitative predictions of composition and properties of hydrated cements and their changes upon interaction with the environment. In the last four decades, many alternative models aiming at describing the stability and solubility of C-S-H phases have been published, as recently summarized by Kulik [2] and Walker et al. [3]. Some models were even based on a sublattice solid solution concept with various simplifications (a detailed review of previous models is beyond the scope of this paper). From the variety of C-S-H models, only a few reconciled some structural features such as the connectivity of silica chains [2,4], and even fewer accounted for the uptake of alkalis [5,6].
XRD studies, pair distribution function extracted from total X-ray scattering, molecular modelling, and 29 Si NMR data [7][8][9][10][11][12][13] indicate that the C-S-H structure can be described as a defect tobermorite structure consisting of a main calcium oxide layer with silicate chains attached on both sides and organized in "dreierketten", i.e. repeating chains of three silica tetrahedra [14,15]. Fig. 1 shows a schematic representation: the two "pairing" silica tetrahedra are linked to the calcium oxide layer, while the third one, a "bridging" tetrahedron, links the two "pairing" silica tetrahedra. Several tobermorite-like sheets can be stacked together with interlayers containing water, calcium, alkalis, and other ions, with basal spacing between 1.4 nm and 1.1 nm.
This structural concept has been considered in developing the new CASH+ core solid solution model [16], implemented in the Gibbs energy minimization code GEM-Selektor ( [17] https://gems.web.psi.ch /GEMS3) and parameterized for the C-S-H sub-system. This thermodynamic sublattice solid solution model, compatible with PSI/Nagra [18] and Cemdata18 [19] chemical thermodynamic databases and consistent with recent structural, atomistic and spectroscopic studies, is aimed at describing the equilibrium composition, stability, solubility, and density of C-S-H phases under different chemical conditions and compositions. To facilitate its use in law of mass action (LMA)-based geochemical computer programs (e.g., PHREEQC), the CASH+ model can be discretized into a series of "frozen" solid phases (pseudocompounds) with fixed stoichiometries and equilibrium solubility constants [16]. Sublattice sites and moieties defining the CASH+ solid solution model from [16], with extensions for alkali metals and alkaline earth metals are: bridging tetrahedral (BT), interlayer cation (IC), interlayer water (IW), and dimeric unit (DU) (Fig. 1, Table 1).
The previous published study [16] contains a detailed description of concepts, thermodynamic basis, and development methodology of a sublattice solid solution model for the CASH+ system. In particular, it provides (as Appendices in Supplementary material) 27 equations that describe the mixing in sublattice (multi-site) non-ideal solid solutions, along with the relevant terms of the endmember chemical potential. These include: the site-fraction-based reference Gibbs energy and configurational entropy terms, the excess energy of site interactions and endmember activity coefficient terms (in Berman and compound energy formalism CEF forms), and the reciprocal Gibbs energy non-ideality terms. All necessary calculations are implemented in GEMS codes; the principles of GEM calculation of equilibria in aqueous-solid solution systems are also described. The model parameterizations are performed using the GEMSFITS code [20]. The Appendices describe strategies used in fitting, testing and evaluation of CASH+ model variants, the impact of GEM initial guesses and numerical settings and tolerances, and the selection of robust calculation variants. Regarding the excess energies of mixing, GEM calculations with one versus two C-S-H solid solution phases in the system were tested for possible appearance of a miscibility gap, Berman symmetric vs Berman asymmetric vs CEF symmetric variants of interaction parameters were compared to conclude that the simplest (Berman symmetric) variant is sufficient. The Appendices also list the sources of standard thermodynamic data, as well as describe the algorithms for discretization of the CASH+ model with the export for use in LMA codes, in particular, into reaction-based PHREEQC thermodynamic data format for use with the PHREEQC variant of Cemdata18 database.
All these technicalities, along with the description of prediction methods for obtaining initial standard thermodynamic properties of generated CASH+ endmembers provided in a previous paper [16], essentially remain the same for all the incremental extensions of the CASH+ core sub-model with more cations substituting in IC and/or BT sublattices, and can be consistently applied to perform such extensions. Therefore, while using the same methodology throughout, the reader is referred to paper [16] for all the detailed descriptions of concepts, thermodynamic basis, and development methods for extended CASH+ sublattice solid solution models. In the following sections, as well as in follow-up papers on CASH+ extensions with Al, Fe III and hazardous cations, only those methods are described that deviate from or go beyond the basis laid down in [16].
The CASH+ core model [16] in the C-S-H system has been shown to perform well at temperatures at least between 10 • C and 90 • C at ambient pressures in presence of liquid water, reproducing most sets of experimental data for Ca and Si solubility, pH, mean silicate chain length (MCL), density, and solid water content in C-S-H [16]. Hence, the CASH+ core model is a solid foundation for the incremental extensions to accommodate Na, K, Al, Fe, and other elements by retaining the previously adjusted thermodynamic properties of endmembers and interaction parameters. Endmembers of the CASH+ sublattice solid solution model are constructed by the permutation of chemical moieties assumed to exist on different structural sites. Before starting parameterizations using the GEMSFITS code [20], the initial thermodynamic properties of endmembers must be evaluated using various prediction methods such as those described in [16].
C-S-H phases can take up alkali metals (Li, Na, K, Rb, Cs) and alkaline earth metals (Sr, Ba, Ra, Mg) from aqueous solution. A common feature of these cations is that they are thought to be sorbed by ion-exchange with Ca 2+ in the interlayer of C-S-H, and are unlikely to occupy the bridging tetrahedral sites in silicate chains [21]. One justification of this assumption stems from the knowledge that more alkali ions get incorporated into low-C/S C-S-H than in high-C/S C-S-H [22][23][24][25][26]. The alkali uptake increases with alkali concentration, but decreases at high calcium concentrations, thus indicating a competition between calcium and alkali ions on the same type of sites [1], where cations compensate the negative charge formed due to deprotonation of ---SiOH to ---SiO − sites. The silanol groups deprotonate stronger at higher pH, leading to the build-up of negative charge [24,[27][28][29]. Divalent cations such as calcium are preferentially sorbed compared to monovalent sodium or potassium [24,30] due to a stronger electrostatic interaction, hence high Ca concentration in solution can inhibit the alkali uptake [1]. Less experimental or modelling evidence is available for the uptake of alkaline earth metals Sr, Ba and Ra [31][32][33][34][35][36][37]. However, at higher Ca-content, their sorption isotherms show the same strong suppression as observed for alkali metals, and their sorption is lower in the presence of alkali metals [38]. Thus, an uptake on the IC sites of C-S-H seems the most probable [21]. Nevertheless, in the current study, during the model parameterization, the possibility of alkaline earth metals substitutions in the BT sites has been considered as well.
The aim of this contribution is to extend the CASH+ core solid solution model [16] of C-S-H to accurately predict the uptake of alkali and alkaline earth metals in aqueous equilibria relevant to cementitious systems. Building up on [16], the construction of the respective new endmembers is described, followed by the initial estimation of their standard thermodynamic properties, which is necessary to start the parameterization process and to predict the temperature trends of CASH+ stability and solubility. Next, the parameterization strategy is discussed, and the optimization of the model is done using experimental data on alkali metals and alkaline earth metals uptake in C-S-H. The final optimized standard Gibbs energies of new endmembers are provided, with the respective interaction parameters for cationic moiety substitutions in the IC sublattice, along with some tests regarding the temperature trends. The CASH+ subsystems further extended with aluminum or ferric iron will be considered in a separate paper.

Extension endmembers and their standard thermodynamic properties
The CASH+ model is based on a defect-tobermorite structure with 5 sublattices identified (Fig. 1, Table 1), along with ublattice sites and moieties defining the CASH+ solid solution model (from [16], with extensions for alkali metals and alkaline earth metals). In two sublattices, IC and BT, the simultaneous substitutions of ionic moieties or vacancies are considered. Each moiety is represented by a chemical formula containing a specific ion and water associated with it. The stoichiometry of moieties was chosen such that all possible chemical substitutions were represented, all endmembers and substitutions were electrically neutral, and that water contents of C-S-H gel was realistic. Further details and more reasoning about endmember stoichiometries can be found in [16].
The CASH+ solid solution model consists of endmembers that were generated by permutation, i.e. by considering all possible combinations of moieties that can be substituted on the respective sublattice sites. For example, the "TSNh" endmember is built from the following moieties in the respective sites: 'T' ([(CaSiO 3 ) 2 H 2 O] 0 ) in the DU site, 'S' (SiO 2 OH − ) in the BT site, 'N' (NaHOH + ) in the IC site, and 'h' (H 2 O) in the IW site. The relative stability of different endmembers at given conditions (temperature, pressure, composition) defines mole fractions of endmembers and thus the bulk properties of the modeled C-S-H phase. To take into account possible non-ideal intra-site interactions, the approach of Berman [39,40] is used. In this approach, binary, pseudo-ternary or ternary interaction parameters can be defined for each combination of moieties substituting on each structural site. No interaction parameters between moieties on adjacent sites are considered. The summation then yields the activity coefficients of endmembers and the total excess Gibbs energy of the whole CASH+ phase (more details in [16], Appendix A1). The endmembers, generated (in extension of the CASH+ core model) using Table 1, are listed in Table 2.
The CASH+ model allows the calculation of the mean (silicate) chain length (MCL), thus making the stability/solubility model consistent with certain structural changes in C-S-H phases. From mole fractions x j2 of dimeric endmembers (j 2 represents the j-th dimeric endmember with missing Si in the BT site), MCL can be calculated using the formula [2]: By definition, dimeric endmembers have CL = 2 and polymeric ones have CL = ∞ (Table 2).

Estimation of thermodynamic properties of alkali and alkaline earth endmembers
Sublattice solid solution models may consist of many endmembers, most of which do not exist as pure mineral substances, and their thermodynamic properties cannot be found in thermodynamic databases or obtained from calorimetry or solubility data. In order to start parameter optimization using the GEMSFITS code, at least the initial standard chemical potential μ* T (standard Gibbs energy per mole g* T = μ* T ) values of endmembers at the temperature T of interest must lie within ±30 to ±50 kJ⋅mol − 1 interval relative to their "optimal" values. Besides, the solid solution model should correctly reproduce the known values of density [41][42][43][44][45][46], non-gel-pore water content [43,47] (the interlayer water), enthalpy [48], and heat capacity [48], of C-S-H. This requires a need for consistent prediction/correlation of water stoichiometry, Table 1 Sublattice sites and moieties defining the CASH+ solid solution model (from [16], with extensions for alkali metals and alkaline earth metals).
For the fully hydrated state (in presence of liquid H 2 O), the vacancy Va 'v' in IW site is not considered (no substitution). Letters 'T', 'S', 'v', 'C', 'N', 'K', and so on are codes of respective moieties in respective sites used in short names of generated endmembers (Table 2). Vacancy 'v' is represented by OHin BT sites and by H + in IC sites. It is assumed that alkali and/or alkaline earth metals except Ca 2+ are substituted in IC sublattice sites only; the Na, K, and Sr species in BT sublattice (with question marks) are listed for trial variants of the model.  (J⋅mol − 1 ) for all possible endmembers. In order to predict these properties, the systematic approach developed and described by Kulik et al. [16] has been followed in the current study. Superscript "∘"and subscript "298" signify that the properties are at the standard state and at the reference temperature of 298.15 K. Values of density, d = 2.5 g⋅cm − 3 are set for all endmembers having silica in the BT site (e.g., TSNh, TSLih, TSSrh, etc.,) and d = 2.7 g⋅cm − 3 for all other endmembers (e.g., TCNh, TvLih, TCSrh, etc.). This allows the CASH+ core model [16] to reproduce known values of density [41][42][43][44][45][46] as a function of C/S. The same assumption was used for the alkali and alkaline earth metals endmembers. Using the set density and the molar mass of the endmember, its standard molar volume was calculated. The estimates of standard molar properties of CASH+ endmembers (Table 2) were obtained from the so-called volume-based thermodynamics (VBT) equations [49], to calculate values of S ∘ 298 and Cp ∘ 298 from V ∘ 298 .
The first six endmembers comprise the CASH+ core model [16], and are shown for completeness. Boxes are used to group endmembers for each subsystem. *Na-, K-, and Sr-containing endmembers assuming Na, K, or Sr on both IC and BT sites (grey) were used only for trial calculations and are not part of the final model.
where V m = 10 21 ⋅ V ∘ 298 NA in nm 3 /(formula unit), N A = 6.022 • 10 23 is Avogadro's number, and V ∘ 298 is in cm 3 ⋅mol − 1 . The mean absolute errors are 7.4% and 24.4% for Eqs. (2) and (3), respectively. Trial estimates were also done using isocoulombic reactions, exchanging Ca 2+ with the other alkaline earth metal ions and assuming zero entropy and heat capacity of the reaction [50]. However, because the change in entropy and heat capacity with the change in volume of alkaline earth metal ions does not follow the VBT correlation, this estimation method produced values for the endmembers properties that are not consistent with the VBT correlation and were not further considered. The estimated V ∘ 298 from assumed densities of endmembers, and S ∘ 298 and Cp ∘ 298 from the VBT correlations (Eqs. (2), (3)) are not part of the parameter optimization and were not adjusted during CASH+ extensions parameterization.
Furthermore, initial standard molar Gibbs energy of formation G* 298 values (* denotes "initially estimated") for endmembers with Na, K were obtained using a simple method of polyhedral contributions suggested by Chermak and Rimstidt [51]. The estimations of initial G* 298 of CASH+ endmembers with Li, Rb, Cs, Mg, Sr, Ba and Ra have been performed using exchange reactions (Eq. (4) with the Gibbs energy of reaction ∆ r G* 298 = 0) and the fitted values for TSCh, TvCh and TCCh endmembers from Kulik et al. ([16], Table 11), estimated values for Na and K endmembers, plus the data for involved aqueous ions from the GEMS version of the PSI/Nagra thermodynamic database ( [18] https ://gems.web.psi.ch/TDB/): where A and B are alkali or alkaline earth metal ions. The G* 298 of Licontaining CASH+ were estimated from Na-containing endmembers.
As Cs + and Rb + ions are more similar to K + than to Na + , the G* 298 of Csand Rb-containing CASH+ endmembers were estimated from K-containing endmembers. In an analogous way, the G* 298 of Sr-and Mgcontaining endmembers were estimated from the Ca-containing endmembers. It is known that the partial molar properties of Ba 2+ ions are more similar to those of Sr 2+ than of Ca 2+ , and properties of Ra 2+ are more similar to Ba 2+ than to Sr 2+ and Ca 2+ [36,52,53]. Therefore, initial G* 298 of Ba-containing endmembers were obtained from Sr ones, and of Ra-containing endmembers from the previously estimated Ba ones. Assuming that the Gibbs energy of such reactions is equal to zero means that the initial values for solubility products of the two endmembers (of element A and B, Eq. (4)) are equal. This assumption is sufficient to get G* 298 values that were then taken as initial guess for the GEMSFITS parameterization within the assumed uncertainty of ±50 kJ⋅mol − 1 . The final fitted G ∘ 298 values together with S ∘ 298 estimated from VBT and the entropy of elements [18] were used to calculate the final standard molar enthalpy of formation H ∘ 298 of endmembers. The final complete set of standard state thermodynamic properties of CASH+ endmembers is given in the Appendix, Table A1 and the properties for their dissolution reactions in Table A2.

Objectives of parameterization exercises
The CASH+ model has been parameterized in a stepwise manner, using the C-S-H core model [16] as a starting dataset, then incrementally extended to take into account the alkali and alkaline earths metals uptake in fully hydrated C-S-H at various C/S and alkali concentrations. During the parameterization procedure, several scientific issues were investigated and solved.
(1) Are regular interaction parameters necessary and sufficient to describe the experimental data, or it can be accurately described only using the ideal mixing of endmembers?
(2) Can the model be incrementally extended, by independently fitting parameters for each subsystem and then keeping them fixed when extending the model with a new ion uptake? (3) How sensitive is the extended CASH+ model to the used thermodynamic model of aqueous speciation (i.e. the effect of considering possible aqueous alkali silica complexes formation on the model parameterization)? (4) Regarding the uptake of Na, K, and Sr, can the agreement with the experimental data be improved by taking into account the assignment of ions into different structural sites (i.e. allowing alkalis to be exchanged in the BT and IC sites)?
In addition, the performance of extended CASH+ (Li, Na, K, Rb, Cs, Mg, Sr, Ba, Ra) model was investigated by running simulations for chemical systems of different compositions to evaluate the impact on the mineral assemblage and aqueous composition. The adequate predictive modelling of the solid phase composition, the solubility behavior of C-S-H, and its ability to bind alkali metals and alkaline earth metals are fundamental to make quantitative predictions of composition and volume of hydrated cements and their usability for environmental applications.

Chemical system setup and tools
The chemical system, composed of elements, substances (chemical species) and phases (pure and solid solutions), was defined using the GEM software [17]. The standard thermodynamic properties were obtained from the GEMS version of the PSI/Nagra database and Cem-data18 TDBs [18,19,54,55]. The GEM-Selektor uses the TSolMod library [56], which provides activity models for aqueous electrolyte solutions, and contains rigorous methods for computing activity coefficients of endmembers of complex solid solutions, in particular, the implementation of Berman sublattice solid solution model [16,40].
Activity coefficients (γ) of aqueous species were calculated using the extended Debye-Hückel equation [57]: where A and B represent the Debye-Hückel solvent parameters, å the ion size parameter, b γ a semi-empirical extended term parameter, and I represents the effective ionic strength (corrected for ion pairing and complexing). The å the ion size parameter and b γ are common for charged complexes and are related to the background electrolyte in the modeled system. Values for å equal to 3.72, 3.31, 4.08, 3.67 Å and for b γ equal to 0.064, 0.098, 0.025, 0.123 were used when modelling experiments with NaCl, NaOH, KCl and KOH as major electrolyte, respectively. When other background electrolyte was used in the experimental setting (e.g., NaNO 3 ), the parameters for NaCl were used. For mixed electrolyte systems, such as in the case of cementitious systems weighted average parameters based on the electrolyte amount could be used. Such a method is not implemented in GEMS and generally for modelling mixed cementitious systems the parameters of the KOH background electrolyte are used [19]. Test calculations were done exchanging the parameters for different background electrolytes with no significant deviations being observed (Section 7.5). For neutral species, the first term in Eq. (5) disappears and the activity coefficient is calculated as b γ I. The activity coefficient of water is calculated from the osmotic coefficient [56]. Some of the uptake experimental data are represented as sorption isotherms, expressed as distribution coefficients R d defined as: To be able to refine the values of endmember properties and of interaction parameters, different types of experimental measurements from selected experimental datasets are needed, such as the measured aqueous and solid phase composition or distribution coefficients of elements of interest. Selected experiments need to cover a wide range of system chemical composition, the parameters should be sensitive to them, and must be all at once considered in the parameter optimization process.
The GEMSFITS multi-property multi-parameter optimization code [20] has been used to simultaneously optimize different sets of parameters against a large dataset of experiments. GEMSFITS can generate an internally consistent set of parameters of various types, fitted against a selection of different types of experimental data. GEMSFITS uses the GEMS3K chemical solver [17] for modelling each experimental run and takes advantage of the state-of-the-art NLOpt optimization library [58] for finding the best combination of parameters that reproduce the experimental data. GEMSFITS provides graphical and tabular output including the goodness of fit, parameter sensitivities and parameter confidence intervals obtained using Monte Carlo sensitivity simulations [20,59]. These statistical methods are helpful to evaluate the accuracy and uncertainty of fitted parameter values, to decide which parameters are more significant in reproducing the experimental data, as well as to avoid overparameterization by removing parameters that are not sensitive to the experimental data.

Aqueous species
Standard thermodynamic properties of all compounds other than C-S-H (aqueous species, gases and solids) included in the system will have an impact on the thermodynamic properties of endmembers and site interaction parameters of the CASH+ sublattice mixing model . An error due to the absence of the standard Gibbs energy G ∘ j of a j-th compound that, if present, would form in a significant amount in equilibrium with the C-S-H phase, will propagate into the optimized parameter values. The list of solid phases and aqueous species considered in the chemical system, along with the sources of their thermodynamic data, is reported in Table 3. Moreover, the effects of the aqueous speciation (e.g., calcium and alkali silica complexes) onto the parameterization of CASH+ model was investigated. For instance, the stability of CaSiO 3 0 (+ H 2 O = CaH 2 SiO 4 0 in hydrated form) complex has a significant impact on the dissolved silica concentrations at high C/S ratios, as considered in [3,16]. Alkali metal complexes with silica and chloride that can be stable in alkali aqueous solutions are also reported in the literature. The effect of NaCl 0 , KCl 0 , CaCl + , CaCl 2 0 , NaHSiO 3 0 , KHSiO 3 0 , Ca(OH) 2 0 as possible species present in the aqueous solution in equilibrium with C-S-H phase has been considered; more details on these investigation and test calculations are given in Section 7.4. With the exception of Ca(OH) 2 0 , these neutral complexes were finally excluded from the chemical system when optimizing CASH+ solid solution model extensions.

Fitting strategies
The CASH+ sublattice solid solution model is designed for the incremental parameterization against the available experimental data on equilibrium uptake of different cations. The C-S-H subsystem of the CASH+ model represents the starting point for further parameterization, as discussed in Kulik et al. [16] along with the overall fitting strategy.
Here the strategy of parameterization of CASH+ model extensions is outlined.
First, the model extension for a new subsystem is formulated, including the required endmembers and site interaction parameters. Initial estimates of standard properties of endmembers are generated using simple correlations (e.g., exchange reactions) with those of already provided by CASH+ endmembers and/or of other relevant substances (in Section 2.2).
Second, the new model parameters are refined against various types of selected measured experimental data, while keeping fixed the previously optimized parameters from the underlying CASH+ sub-model(s). At each optimization run, the sensitivity of adjusted parameters to the variation of experimental data is monitored.
Finally, the fitted parameters not significantly sensitive to the experimental data (i.e., not necessary for improving the quality of fit) are set to 0 in case of interaction parameters or in the case of G ∘ 298 of endmembers were a premise for reconsidering specific ion uptake sites or excluding them from the fit and independently estimating them. Such insignificant parameters decrease the minimized sum of residuals (total or per compared property) by <10% from the previous value, and parameters also have low composite scaled sensitivity (CSS) values. CSS is a measure of how sensitive one parameter is to input observational data [20,66]. At the end of the parameter refinement process, a solid solution model containing all significant endmembers and interaction parameters is constructed that provides the optimal description of the experimental data that can be used to model the uptake of investigated ions in a wide range of compositions.
During the parameter optimization, each individual experiment (input composition, T, P conditions of equilibrium, possible kinetic constraints) was simulated in GEMSFITS using the GEM algorithm. To accurately model the experiments, the reported elemental composition, solid to liquid ratios and other conditions were considered when reproducing the experimental points. The least square fitting method was used for calculating the objective function of residuals, i.e. the sum of the differences between measured and calculated values [20].
To better constrain the model, different types of measurements were simultaneously used to optimize the parameters. The largest portion of data consists of measured dissolved element concentrations in the aqueous solution in equilibrium with the C-S-H phase. The measured pH of the aqueous solution and the composition of C-S-H phase, measured or calculated from mass balance, was also used in the optimization process if reported as measured, calculated and/or target measured, calculated and/or target. The measured and calculated pH of the aqueous solution is linked to the C-S-H and aqueous phase composition Table 3 List of chemical species (compounds) considered in the Ca-Mg-Si-Na-K-Li-Cs-Rb-Sr-Ba-Ra-Cl-N-OH chemical system for modelling the experimental data for C-S-H solubility.  [18] This data, augmented with the data from the GEMS version of PSI-Nagra database 12/07, is also the basis of the Cemdata18 database. a Gibbs energy of species was adjusted in this study; starting value used in the optimization from [65]. b Properties derived in this study (see Section 7.4). and speciation. Unless otherwise specified, most data are plotted against the measured, calculated and/or target C/S ratio in the solid, as appropriate. This is equivalent to the C/S in C-S-H, except cases at low C/S where amorphous silica is in equilibrium or at high C/S where portlandite is in equilibrium.

Uptake of Na and K
The CASH+ core sub-model has been extended to CASH+N and CASH+K sub-models to account for the uptake of Na and K in C-S-H phases. In a first step, it was tested whether Na and K can be substituted in both IC and BT sites. By permutation of Na and K moieties, six endmembers were constructed for each alkali element subsystem (Table 2). To account for the non-ideal mixing, five binary Berman interaction parameters were considered for each alkali metal subsystem: two for interactions of an alkali metal with Ca and vacancy in IC sites, and three for interactions of an alkali metal with Ca, Si, and vacancy in BT site.
In a second step, Na and K were assumed be substituted only in the IC site. In this case, the model becomes much simpler, with only three endmembers and two interaction parameters for each alkali metal.

The CASH+N sub-model
Experimental studies that provided the data used in the optimization procedure for the uptake of Na in C-S-H are listed in Table 4. The selected experimental datasets were obtained at different target C/S and N/S (Na/Si) mole ratios, using different sodium electrolyte solutions (NaOH, NaCl, and NaNO 3 ), and covering a relatively wide range of alkali concentrations. The CASH+N sub-model parameters were optimized against the measured aqueous composition (pH, total Na, Ca, and Si concentrations) and C/S ratio in the solid phase. The available data on the effect of alkalis on the MCL is scarce and was thus not used in the optimization, but was compared against the model MCL predictions.
Experimental studies, as summarized by Lothenbach and Nonat [1] and in Fig. 2, show an increase of alkali metal binding by C-S-H at higher alkali metal concentrations and at lower C/S. The addition of sodium salts such as NaCl or NaNO 3 , has little effect on silicon concentrations, but moderately increases calcium concentrations, as sodium can replace calcium in the interlayer sites [21]. The addition of sodium hydroxide, however, increases pH, this lowers calcium concentrations and increases the silicon concentrations especially at low C/S. At step 1, the sensitivity of each endmember to the experimental data was tested to see if all the considered parameters are necessary for reproducing the experimental data (allowing Na to be exchanged in both IC and BT sites). To do this, the G ∘ 298 value of each individual endmember was set as a free fitted parameter, while keeping G ∘ 298 values of remaining endmembers fixed at their initial values. Additional tests were conducted after setting pairs of G ∘ 298 for two and three endmembers as freely adjustable parameters, and keeping the remaining ones fixed. In addition, the composite scaled sensitivities of parameters were monitored. When allowing the G ∘ 298 values of TNCh and TNvh endmembers as adjustable parameters in the optimization run, there was little to no improvement in the agreement between calculated and measured experimental data, and the fitting process always made these endmembers less stable. Optimized values of the interaction parameters between vacancy, Na and Ca in the bridging tetrahedral site were close to zero (0 +/− 1 kJ⋅mol − 1 ). These results indicate that allowing the substitution of Na in the BT (bridging tetrahedral) sites brings no improvement of the fit and is not necessary to reproduce the experimental data.
At step 2, the endmembers and interaction parameters related to the uptake of Na in the bridging tetrahedral site were removed from the CASH+N sub-model setup. This reduced the number of parameters from 11 to 5 (3 endmembers and 2 interaction parameters). Upper-and lower bounds of 30 kJ⋅mol − 1 were set upon the initial parameter values, and all five of them were simultaneously fitted against selected experimental data ( Table 4).
The same degree of agreement was obtained between measured and calculated data as in the case of step 1 when 11 parameters were optimized. Comparisons between the measured experimental data used in the fit and calculated data using the final parameter values are shown in Fig. 2. There is a good agreement at different C/S ratios, at various Na concentrations, and for the systems where different sodium electrolyte solution were used, including NaOH, NaCl and NaNO 3 . Disagreements are visible at low C/S with the experiments from [23] (Fig. 2D), while the agreement for the other experimental dataset was good also at low C/S. The model underestimates the measured dissolved alkali concentration at 0.85 C/S ratio (higher predicted uptake in C-S-H) while the measured Ca and Si concentrations are reproduced. When comparing the measured and calculated pH (Fig. 2E), deviations up to 0.5 can be observed for some experiments.
At an additional step 3, it was tested whether allowing the optimization of parameters from the CASH+ core subsystem (that were kept constant at steps 1 and 2) can further improve the agreement with the experimental data in the CASH+N subsystem. The G ∘ 298 values of endmembers, as well as the interaction parameters were allowed to be freely optimized individually, simultaneously or together with the Na uptake parameters. In all cases, the difference between measured and calculated values was not significantly improved as compared with the results from step 2, suggesting that the CASH+ core subsystem can be kept fixed as determined in Kulik et al. [16], and, hence, the CASH+ model can be incrementally extended.
The final internally consistent set of five adjusted parameters for the CASH+N sub-model, together with the 95% confidence intervals, can be found in Table 5.

The CASH+K sub-model
The selected experimental data for the uptake of K in C-S-H are summarized in Table 4. Several complementary studies are available, in which the authors performed the same type of measurements for both systems, at similar conditions, replacing Na for K [23,25,68,69]. In all cases, the experimental data showed only a little difference between Na and K uptake, although K tended to be slightly favored over Na.
Similar to the CASH+N sub-model parameterization step 1, the same sensitivity test was performed, showing that the additional endmembers and interaction parameters related to allowing K for substitutions in the BT (bridging tetrahedral) site are not necessary to accurately reproduce the selected experimental data.
At step 2, the G ∘ 298 values of the 3 endmembers and the values of the 2 interaction parameters were simultaneously optimized against measured concentrations of K, Si, and Ca in the aqueous solution and against reported C/S in the solid. The final agreement between measured and calculated data was good, but the two interlayer TCKh and TvKh endmembers dominant at high C/Si showed low sensitivity values compared to the TSKh endmember dominant at low C/S (the most sensitive). This suggests that G ∘ 298 of TCKh and TvKh are not well constrained by the experimental data, and that it would be better to estimate their values independently.
The experimental data on Na and K uptake show similar trends, see Figs. 2 and 3. This suggests that there are analogies between the two submodels. The possibility for estimating the G ∘ 298 values of endmembers in the K subsystem was investigated. This was done using the fitted G ∘ 298 values of endmembers from the Na subsystem. The following ion exchange reactions between the endmembers from the Na and K sub systems were considered: TCNh TvNh The Gibbs energy effects (∆ r G ∘ 298 ) of above reactions were assumed to be equal to zero. We began with optimizing the G ∘ 298 value of TSKh (low C/S endmember with the largest sensitivity to the experimental data) together with two interaction parameters, keeping TCKh and TvKh fixed. At the next step, only the G ∘ 298 value of TSKh was fitted and the interaction parameters in the K subsystem were set to be equal to those in the Na one. The sum of residuals did not significantly change between the three cases. Estimating the G ∘ 298 values of TCKh and TvKh endmembers (∆ r G ∘ 298 = 0 for Eqs. (8) and (9)), using the same values of interaction parameters as for the Na subsystem, and only adjusting the G ∘ 298 value for TSKh, were sufficient to get a good agreement with the experimental data. From the fitted G ∘ 298 value for TSKh endmember in reaction (7), obtained a value of − 2.15 kJ⋅mol − 1 for the Gibbs energy of the reaction was obtained, corresponding to a log equilibrium constant, log K = 0.38, indicating a slight preference for K uptake. This suggests the log K of exchange reactions for alkali metals containing endmembers have values close to 0 and that their solubility constants are close to equal. This makes the predicted uptake of alkali metals by the CASH+ to be similar. There is a good agreement between calculated and measured experimental data used in the fit (Fig. 3).
Similar to the CASH+N sub-model, some disagreements are visible with the experiments from [23] at low C/S ratio (Fig. 3D). More potassium is predicted to go into C-S-H while the measured concentrations of dissolved Ca and Si are reproduced. For the measured and calculated pH (Fig. 3E), maximum deviations of 0.5 units were observed.
The final internally consistent set of five parameters for the CASH+K sub-model, together with 95% confidence intervals obtained from 500 Monte Carlo runs, can be found in Table 5. The errors for TCKh and TvKh endmembers, as well as the errors for the interaction parameters, are to be taken the same as for the Na subsystem counterparts.
The CASH+N and CASH+K model was tested against the experiments of L'Hôpital et al. [25] done in NaOH and KOH solutions that were not used in the fit (Fig. 4). The model overestimates the dissolved Ca measured by L'Hôpital et al. [25] in 0.05 M NaOH at C/S between 0.8 and 1.2 (Fig. 4A) showing a better agreement with the datasets that were used in the fit.

Uptake of Li, Cs and Rb
The experimental data sets selected for parameterizing and checking further extensions of the CASH+ core model with alkali metals other than Na, K are summarized in Table 6.
To extend the CASH+ model for the uptake of Li, Cs, and Rb, no fitting was used. Instead, the standard thermodynamic properties of Li endmembers were estimated from those of Na, and those of Cs and Rb endmembers ( Table 2) were estimated from those of K using the exchange reactions as formulated in Eqs. (7) to (9), setting all thermodynamic effects of the reactions to zero. The G ∘ 298 values of the Li, Cs and Rb endmembers are summarized in Table 7. Values of interaction parameters of Li, Cs, or Rb with Ca and vacancies in the IC sites were assumed to be equal to those retrieved for Na, and all interaction parameters between all alkali metals were assumed to be zero.
Calculated dissolved Ca and Si concentrations and isotherms for Li, and Cs were compared with solubility measurements of C-S-H in CsOH and LiOH solutions (Figs. 5 and 6). The best overall agreement between the experimental data and the model is at 0.5 M concentration for both alkali metals. Disagreements were observed between the calculated and the measured Ca and Si in 0.1 M CsOH and LiOH solutions >1C/S, and Ca and Si in 1 M LiOH solution. The model predicts a similar behavior for both Cs and Li. This is because the G ∘ 298 of Li-and Cs-containing endmember were estimated from Na-and K-containing ones. Both the measured experimental data and the model predictions show similar behavior for systems at 0.5 M alkali metals concentrations. For lower and higher concentrations, the experimental data shows different trends when comparing the two alkali metals. This is the reason why the model, which predicts similar behavior for the two alkali metals in all concentrations, cannot reproduce well the experimental data that do not show similar trends. In trying to resolve these discrepancies, it was tested whether the optimization of endmember standard properties and interaction parameters would improve the agreement with the measured data. No significant improvement could be obtained, and data for the different concentrations could not be simultaneously brought in agreement with the model. This means that either the model does not correctly capture the uptake mechanisms or there is an inconsistency between the experimental data at different concentrations.
The measured and calculated concentrations of Li and Cs in solution are in good agreement and have similar values (Fig. 6). Only at 1 M LiOH, the measured Li concentrations are systematically lower (~0.04 mol/L) than calculated ones, even if compared with the measured data for Cs under the same conditions. For the C-S-H synthesized in 1 M LiOH, in addition Li 2 SiO 3 was detected by XRD for C-S-H with C/S = 0.6. In the solid with 1.2 < C/S < 1.6, a small shoulder located at 19 o 2θ belonging to Li 2 SiO 3 could also be observed [72]. This indicates that part of Li precipitates as a solid, and could explain the disagreement between the calculated and the lower measured [Si] in 1 M LiOH. Using the thermodynamic data for Li 2 SiO 3 [77] does not lead to an improved agreement, which could indicate that a different phase might be present.
For the Rb uptake in C-S-H, no experimental data is yet available to compare with the model calculations. Based on the good agreement with the data on Cs, it is expected that the model will perform reasonably good also in the case of Rb.

Uptake of Sr, Ba, Ra and Mg
In addition to alkali metals, C-S-H gel can also take up the alkaline earth metals Mg, Sr, Ba, and Ra. The uptake is probably based on ion exchange with Ca 2+ in the interlayer of C-S-H, as sorption isotherms show a stronger sorption at low calcium concentrations. Sorption isotherms for Sr, Ba, or Ra are available in the literature [34,37,74,76,78,79]. The uptake of Mg on C-S-H has only been studied at a target C/S = 0.8 in C-S-H gel with the addition of MgCl 2 to reach target M/S ratios = 0.05-1.34 or by the addition of periclase to reach target M/S ratios = 0.05-0.86 [80,81], but the uptake of Mg on C-S-H was observed to be <0.001 Mg/Si [81], which had been related to the low solubility of magnesium silicate hydrate (M-S-H) phases that can coexist with C-S-H.
Strontium is found in radioactive waste as Sr 90 and is relevant to the waste disposal safety performance assessment, also in cementitious waste matrices and engineering barriers. Non-radioactive strontium is also present in the cement clinker with concentrations of 0.01-0.1 wt% [78] and 0.1 to 1 mM Sr is present in the pore solution of hydrated cements [82]. The uptake of Sr in C-S-H has been investigated by several studies [32,73,74], see Table 6. The experimental dataset of [32] was chosen for optimizing the stability of Sr endmembers and respective interaction parameters. This dataset contains both sorption and coprecipitation measurements performed from low 0.75 to high 1.65C/S ratio in water and synthetic alkali cement water (ACW) solution (180 mM KOH + 114 mM NaOH). The experiments done in water were used to fit the properties of Sr-containing endmembers and the interaction parameters of Sr with Ca and vacancy in the IC. Only the interaction parameter of Sr with Na and K in the IC was incrementally fitted to the ACW experiments. The remaining datasets [73,74] were not used in the fitting, but only to verify the optimized model. The final parameter values are given in Table 8 and show a preference of the ion-exchange IC sites for Sr 2+ over Ca 2+ (for tabulated logK values, see Appendix Table A2). The experimental data of [32] could be fitted to the CASH+Sr model for both experiments done in water and in ACW solution (Fig. 7). In general, a higher uptake is observed in the high-pH ACW solution, where lower calcium concentrations and higher deprotonation of the silanol sites contribute to a higher uptake at higher pH value. A similar mechanism was observed for the alkali metals. A disagreement exists with the measurements at low C/S (<0.80) ratio in ACW, where the slope of the measured isotherm is different when compared to measurements done at higher C/S ratios. A different slope in the uptake isotherm may point to a different uptake mechanism compared to the higher C/S ratio data, which is not captured by the model. Because of the similar ionic radius of Ca 2+ and Sr 2+ [83], it can be reasonably assumed that Sr may also enter the bridging tetrahedral sites as well. A model where Sr is allowed to substitute in both IC and BT sites was tested. There was no additional improvement relative to the simpler model with Sr only entering the IC sites. Ca only enters the BT site at C/S ratios >0.9, signifying that for Sr to enter the BT sites, similar Sr/Si ratios >0.9 would be needed. This does not exclude that some fraction of Sr may enter the BT sites, but based on the experimental data, the model is not sensitive to this.
Due to the similarity of Ba and Ra ions, Ba is often used in experiments as an analogue of Ra to avoid radiotoxicity. The uptake of Ba in C-S-H has been studied experimentally by [75]. This dataset was used for optimizing the stability of the Ba endmembers and interaction parameters. During initial trial calculations, it became evident that a good agreement between the model calculated and the experimental values could be obtained by taking the same interaction parameter values as in CASH+Sr sub-model and making the G ∘ 298 of Ba endmembers 4 to 7 kJ⋅mol − 1 more negative than that of Sr endmembers. The optimized parameter values are given in Table 8. A good agreement between the calculated and measured isotherms was obtained (Fig. 8). Minor disagreements can be observed only at the highest Ba concentrations, where the slope of the measured isotherms decreases.
The alkaline earth element Ra can also be incorporated into C-S-H [34,37,76]. Ra has a high radiotoxicity, making experiments relatively uncertain and difficult to perform [76]. All experimental datasets [34,37,76] show the same trend of decreasing uptake with increasing C/ S ratio. The datasets of [37] and [34] are in agreement and fall on the same trend of R d as a function of C/S, while the values reported by [76] are systematically lower, showing lower uptake values by up to one order of magnitude.
The dataset of [76] shows a similar Ra uptake as Ba uptake and can be reproduced if we assume zero reaction effects for the exchange of Ba with Ra in the IC sites of C-S-H. If the datasets of [37] and [34] are used to optimize the properties of CASH+Ra endmembers, a systematically higher uptake is obtained (Fig. 9B, C) compared with the uptake of Ba and measurements of [76]. The optimized parameter values are given in Table 8. The interaction parameters between Ra and the alkali metals in the interlayer were determined based on the data of [76]. The resulting positive interaction parameters are around three times larger when compared with the Sr system (Table 8). Combining the interaction parameters with the endmember properties resulting from the measurements of [37] and [34] in water gives a R d value for the alkaline cement water that agrees with the one reported by [37] at C/S = 0.9 and one that is 0.5 log units larger at C/S = 1.4 (Fig. 9C).
The Mg uptake in C-S-H was estimated by correlating the R d values of different alkaline earth metals at different C/S with their ionic radius [84,85] (Fig. 10). The R d correlates almost linearly with the ionic radius, with Ba having slightly lower R d with increasing C/S. Finally, G ∘ 298 of Mg endmembers were fitted to the estimated R d for Mg uptake in C-S-H

Discussion
The full set of endmembers of the CASH+ model with their thermodynamic properties and the interaction parameters is provided in the appendix (Tables A1, A2 and A3). The CASH+ phase model can be composed considering the full system with all alkali and alkaline earth metals by taking all endmembers (Appendix Table A1) and the interaction parameters (Appendix Table A3) into one solid solution phase definition. Alternatively, specific subsystems for just a selection of elements can be considered e.g., CASH+Na, K, Sr, Ba, and Ra. A full CASH+ model phase definition (simultaneously accounting for the uptake of all elements considered in this study) contains 33 endmembers and 58 (43 nonzero) interaction parameters.
The properties of all endmembers and values of interaction parameters between the alkali and alkaline earth metals and calcium or vacancy in the IC site were fitted to experiments of C-S-H in equilibrium with the alkali solution of interest or were estimated based on similarities between the alkali cations. For the full CASH+ model, additional interaction parameters between the alkali and alkaline earth metals need to be considered. The interaction parameters for the alkaline earth metals between themselves and with the alkali metals have nonzero values (Table 8). Negative values signify that the uptake of the two cations is favored when compared to the ideal mixing case, while positive values signify that the uptake is hindered. The interaction parameters between the alkali metals are assumed to be 0 based on the similarities of the uptake of Na and K in C-S-H (Tables 5 and 7). Based on the experimental data on the uptake of Na and K, the value for the interaction parameter between the alkalis and Ca was also found to be 0. A zero value for the interaction parameters signifies that the mixing of these cations in the IC site is ideal and their uptake is controlled by the properties of their respective endmembers.

CASH+ model parameter uncertainties
Realistic 95% (two-sigma) confidence intervals for the fitted parameters were retrieved using the Monte Carlo method as implemented in the GEMSFITS code [20]. The uncertainty of G ∘ 298 of CASH+ model endmembers are between 0.3 (0.05 pK units) and 6.4 (1.12 pK units) kJ⋅mol − 1 , that results in uncertainties between 0.05 and 0.3 log 10 units for the corresponding dissolution reactions. For the "estimated" endmembers, the uncertainty of standard Gibbs energies was assumed equal to that of the endmembers used in the estimation, but may be actually larger due to the extrapolation uncertainty. The 95% confidence intervals for the interaction parameters related to the alkali metals and alkaline earth metals exchange in the IC site could only be computed for the interactions between Na and vacancy and between Sr, Ca, vacancy, and Na.

CASH+NK model performance
Relations between the system, solid, gas and aqueous phase compositions, and properties of phases such as pH and mean (silicate) chain length (MCL), depend on the thermodynamic properties of components that make up the chemical system in equilibrium, namely amorphous silica, portlandite and the C-S-H-N-K solid solution. The way the CASH+NK model performs is strongly related to how the moieties are defined (Table 1), how these moieties are combined to form endmembers (Table 2), and to the optimized standard properties of endmembers and interaction parameters.
Several simulations were conducted to investigate how the parameterized CASH+NK model behaves under various system compositions. Namely, the impact of variation in C/S ratio, increase in alkali concentration and change in pH and their effect on the uptake of alkali metals and the MCL. To avoid the effect of the liquid to solid ratio on the calculated results, all the following modelling calculations were conducted at a liquid to solid mass ratio of 50. Detailed account for these tests is provided in the Supplementary material. The CASH+ core model [16] calculates C/S of the C-S-H phase in range from 0.72 (in equilibrium with amorphous silica) to 1.64 (in equilibrium with portlandite). The C/S ratio in C-S-H is influenced by the uptake of cations and the change in pH of the aqueous solution due to the increased concentration of hydroxides. The compositional range   The changes caused by the addition of NaCl and KCl are due to the replacement of Ca by alkali metals in the IC sites, leading to a lower content of Ca in C-S-H. A stronger effect is calculated when adding alkali metal hydroxide. The effect is mainly due to the influence of pH on the stability of amorphous silica, C-S-H, and portlandite. The increase in pH leads to an increased solubility of silicon and a decreased portlandite solubility. This makes C-S-H containing more Ca at the low C/S end and less Ca at the high C/S end. Data on the MCL in the C-S-H alkali metals system is scarce. Using the CASH+NK sub-model, MCL for the C-S-H-N-K phase can be independently predicted and compared with the limited experimental data (Fig. 11A). Fig. 11 shows effects of C/S in the solid, the presence of alkali metals in the system, and of pH on the modeled MCL. At the same C/S ratio, the uptake of alkali metals lowers the MCL of C-S-H, and more so in the presence of hydroxides than in the presence of chlorides. This effect is also observed in the experimental data (Fig. 11A). The addition of alkali metals at C/S > 0.72 leads to a decrease in concentration of Si in Fig. 7. Comparison of calculated and measured [32,74] Sr sorption isotherms in water (A), in synthetic ACW (B) for different C/S ratios in solid (calculated by the CASH+Sr model). Comparison of calculated and measured distribution ratio (R d ) as a function of the C/S ratio in solid for measurements done in water (C) and for measurements done in synthetic alkali cement water (D). BT sites, as SiO 2 OH − is replaced by vacancies, represented as OH − , with no effect on the solution pH. If the alkali metals are added as hydroxide solution, pH of the solution increases, silicon in the BT site becomes less stable at higher pH and is replaced by vacancies at low C/S and by Ca at high C/S. At the same C/S ratio and alkali metal concentration (Fig. 11B), the MCL decreases with increasing pH as Si in the BT sites is replaced by vacancies and Ca (see also Supplementary material).

Evaluation of CASH+NK(Li, Rb, Cs, Mg, Sr, Ba, Ra) model performance
The thermodynamic behavior of alkali metal and alkaline earth metal ions in aqueous solution is largely influenced by their different ionic radius and hydrated ion size [90,91]. The same is here assumed for uptake of these cations in C-S-H. The ionic radii increase in a sequence: Li + < Na + < K + < Rb + < Cs + and Mg 2+ < Ca 2+ < Sr 2+ < Ba 2+ < Ra 2+ . When hydrated, the smaller the ionic radius the greater the hydrated ion size. Among alkali metal ions, Cs + is the smallest and Li + the largest cation in hydrated form.
The model extensions for the uptake of alkali metals other than Na and K were estimated from Na and K, resulting in two groups (Fig. 12). Na and Li (group 1) show slightly lower uptake than K, Rb and Cs, (group 2), but the difference is so small that it is safe to conclude that all alkali metals show similar uptake in C-S-H. The uptake of Cs and Na in C-S-H was experimentally compared by [79], observing that the uptake of Cs was larger than that for Na. The log 10 (R d ) values measured for Na are comparable to the ones calculated by the CASH+ model of 1.5 at 0.8C/S and 1 at 1.2C/S. The values reported in [79] for the uptake of Cs at low Cs concentrations are around 0.4 log 10 (R d ) units higher than the ones calculated by the CASH+ model. The experiments in [79] also show that with increasing Cs concentration > 1⋅10 − 5 M, the Rd value decreases at all C/S. The uptake of alkaline earth metals in C-S-H decreases from Ra to Mg in agreement with Ra having the smallest and Mg is having the largest hydrated cation in the group (Fig. 12). The same trend is observed in the ACW systems (Fig. 12D). In the presence of alkali hydroxide solution, the uptake distribution coefficient of a divalent cation is higher when compared with the water-only system and shows a less steep decrease with increasing C/S ratio. The uptake of alkali metals  Fig. 10. Correlation of R d with ionic radius (in pm) [83] for Sr, Ba, and Ra uptake at different C/S ratios in solid. Solid symbols are calculated using the fitted CASH+ model and empty symbols are estimates using the linear correlation.
increases with their concentration in solution, but there is a higher uptake if the alkali metal is added in form of hydroxide than if it is added in the form of chloride.
The addition of hydroxides leads to an increase in pH, which strongly promotes the uptake of cations in the interlayer, because vacancies represented by H + moiety are less stable with increasing pH and are substituted by Ca 2+ and alkali metals. When the alkali metals are added as chloride salts, there is no effect on pH, and the uptake is driven by the competition between calcium, alkali metals and alkaline earth metals. This competition leads to a lower uptake of the alkali and alkaline earth metals. In the case of Mg uptake, a different trend can be seen at low C/S (Fig. 12D). This may be due to the formation of MgSiO 3 0 species, which leads to a higher dissolved concentration of Mg. When suppressing this species, a similar trend as for the other alkaline earth metals was observed (not shown).
Mg uptake by C-S-H gel is extremely limited (Mg/Si < 0.001) by the preferred precipitation of the sparingly soluble M-S-H and brucite [81].  Fig. 11. The effect of (A) alkali metals and C/S ratio [6,29,[86][87][88][89], and (B) pH at different C/S ratio on the calculated MCL. simplification purposes, the uptake of Mg is not necessary to be considered in the CASH+ model, since the concentration of Mg is controlled by M-S-H or brucite [80].
Using the same correlation (Fig. 10), the uptake of beryllium would be predicted to be even lower than Mg, with a log(R d ) < − 1. This is in contradiction to the measured data [92], with a log(R d ) between 5 and 7, that shows a stronger uptake for Be with the lowest ionic radius compared with Ra that has the largest ionic radius, contrary to the radius correlation. This mismatch could be due to a different uptake mechanism compared to the other elements, like the formation of Ca-Be complexes on the surfaces of C-S-H as suggested by [92]. The uptake of beryllium will be considered in future CASH+ extensions.

Effect of aqueous complexes
In the literature, several aqueous alkali and alkaline earth metal complexes are reported that have not been considered in the parameterization of CASH+NK sub-models. Most notably, the formation of NaCl 0 and KCl 0 complexes could have an impact when dealing with concentrated alkali metal chloride solutions. Test calculations were done up to 3 m NaCl concentration, including and excluding these two complexes. Although at the highest electrolyte concentration, these species form up to 15% of the total dissolved alkali metals, the effect of not considering these species on the calculated concentrations of dissolved Si, Ca, and on pH is below 0.05 log units, and their impact on the solid phase composition and endmember properties is insignificant.
Like the calcium-silica neutral complex (CaSiO 3 0 ), which strongly influences Si speciation at high calcium concentrations, complexes between alkaline earth metals (M) and silica may also form at high pH. Stability constants for the reaction of 5.7 ± 0.2, 4.0 ± 0.1 and 2.9 are available for Mg [60], Ca [3,16], and Sr [93], respectively. These constants show a linear trend with the ionic radius (r, taken from [83]) with the following relation: The stability of these complexes decreases from Mg to Sr, suggesting that the silica complexes of Ba and Ra are less stable than that of Sr.
Alkali metal silica complexes have been proposed to be stable at high pH and concentrated dissolved alkali metal conditions [94]. Thermodynamic properties for the NaHSiO 3 0 complex were reported in [62], but no data for a similar possible KHSiO 3 0 species is knowingly available. Adding NaHSiO 3 0 to the model system would increase the concentration of dissolved Si with increasing Na concentration, with NaHSiO 3 0 forming the dominant Si species. This behavior of dissolved Si was not observed in experiments (Fig. 14), where dissolved Si concentrations remained relatively constant upon the increasing alkali metal concentration. Based on the result shown in Fig. 14, it is assumed here that no alkali silicate complex will form under any conditions in any alkali metal electrolyte solution. The reported thermodynamic data for this complex at 25 • C, 1 bar are not reliable. It has been previously suggested that such a complex might in fact not form under these conditions and that the reported properties were retrieved from an erroneous interpretation of experimental data [95]. The presence of Ca(OH) 2 0 neutral species in highly alkaline solutions has been suggested in [96][97][98], and used to explain the measured portlandite solubility in concentrated NaOH-NaCl solutions [96,97] and C-S-H immersed in NaOH solution data reported by [99]. This species becomes the dominant Ca species above 1 M NaOH at room temperatures. Because some of the experimental data on C-S-H solubility were obtained in sodium hydroxide solutions at concentrations of 1 M, the possible effect of Ca(OH) 2 0 formation needs to be considered.
The stability constant of Ca(OH) 2 0 at 25 • C for the reaction: with log 10 K ∘ 298 1.85 ± 0.1, fitted in the present study to the experimental data on portlandite solubility in NaOH and NaCl solutions [97,100] (Fig. 15) and the entropy of reaction, 61 ± 1 J mol − 1 K − 1 , was taken from [97], based on experiments run from 5 to 75 • C. Using the value of G ∘ 298 reported in [98], the log 10 K ∘ 298 for reaction (12) would be 3.06. Such a stable complex that would make up most of the Ca in a NaOH solution in equilibrium with portlandite would be inconsistent with the observations of [96][97][98].
To obtain an improved agreement with the experimental data, the Fig. 13. Measured Si, Ca, Mg aqueous concentration as a function of pH [80] and calculated concentrations using the present CASH+ model and the M-S-H model [64]. Calculation done at 20 • C, at L/S mass ratio = 45.

Fig. 14.
Comparison between measured [68] and modeled aqueous Si concentration using the final optimized parameters (Table 5) NaOH 0 aqueous complex had to be removed from the chemical system (Fig. 15). Tested thermodynamic properties for NaOH 0 taken from other sources [101][102][103] lead to similar discrepancies. The monovalent ion pairs are generally weak complexes [63], often have high uncertainties associated with their thermodynamic properties, and do not form in significant amounts at low temperatures. Fitting with the CASH+NK model, and calculation for CASH+Li and CASH+Cs (see Section 5), both with and without the alkali metal ion pairs, showed no significant changes in the results. Monovalent ion pairs NaOH 0 , KOH 0 , LiOH 0 , and CsOH 0 were therefore excluded from the calculations. The concentration of Ca(OH) 2 0 in the C-S-H gel system and its effect on the solubility of the solid phases increases with increasing pH. Increasing pH by the addition of 0.2 M NaOH results in Ca(OH) 2 0 forming 10% of dissolved Ca in solution, which starts to have a marked effect on solid phase solubility. By the addition of 0.6 M NaOH then Ca (OH) 2 0 becomes the dominant Ca bearing species in solution (Fig. 15A).
At higher NaOH concentrations then the formation of Ca(OH) 2 0 controls the solid phase solubility.

Effect of temperature
Calculations were also perfomred to test the CASH+NK model predictions as a function of temperature. All temperature effects are based on standard state thermodynamic data and parameters for solids and species taken form sources in Table 3 and estimations done for the CASH+ endmembers (Section 2.2), and are independent from the CASH+ model optimization that is based on the data obtained at ambient conditions.
The effect of temperature on the uptake of alkali metals as estimated by the CASH+ model was independently tested against the available experimental data for Na uptake in NaCl and NaOH solutions [6,104,105]. As reported in Section 2.2, estimates for the entropy and heat capacity of endmembers were obtained from correlations with the molar volume. These estimated values, combined with the temperature dependence of major aqueous species (Table 3), determine the solubility of C-S-H at different temperatures and compositions. There was no fitting involved for the temperature dependence of the CASH+ model, which produces reasonable calculated values for the dissolved Ca and Si in equilibrium with C-S-H in NaCl and NaOH solutions of various concentrations from 25 to 90 • C (Figs. 16 and 17).
Simulated Ca concentrations are in reasonable agreement with their experimental counterparts (Fig. 17). Discrepancies can be seen when comparing Si concentrations in dilute NaOH solutions at 90 • C, but there could be an issue either with calculated or measured Si values in Fig. 17C and D. Additional tests were done to check the effect of using different background electrolyte parameters for the extended Debye-Hückel activity coefficient model [57]. Only small differences result when using the KOH background electrolyte parameters to model systems with NaCl and NaOH aqueous solutions even at high electrolyte concentrations and elevated temperatures (Figs. 16F, 17D). The model tends to overestimate the measured Ca concentration in NaCl solutions at 25 • C, by around 0.1 log units, and at elevated temperatures at C/S > 1.4, by up to 0.3 log units. This is also visible in Figs. 2C and 3C where the measured Ca concentrations are overestimated at high alkali metal/S. This difference could be reduced by making the alkali containing endmembers 2 kJ less stable but then the data at low alkali metal/S are underestimated. This means that data at low alkali metal/S make the endmembers slightly too stable for the high alkali metal/S conditions.
The predicted temperature dependence of the uptake of alkali and alkaline earth metals is determined by the entropy and heat capacity effects of the CASH+ endmembers solubility reactions (Appendix Table A2). There is no temperature effect considered in the interaction parameters. It is expected that the temperature effect on the interaction parameters in the 0-90 • C interval will be minimal, and most of it will be accommodated by the thermodynamic properties of the endmembers. Figs. 18A and B show that the uptake of alkali metals decreases with increasing temperature. The CASH+ model predicts a decrease in the uptake of Ra and Ba as the temperature increases at C/S ratio < 1.1-1.2 (Fig. 18C). At higher C/S, the opposite trend is observed. An increase of Sr and Mg uptake with temperature increase is predicted for the whole C/S interval, with Mg showing a more pronounced temperature effect. This outcome is a consequence of the S ∘ 298 and Cp ∘ 298 values of the endmembers and those of the aqueous ions and complexes of alkali and alkaline earth metals.

Discretization of CASH+NK model
Sublattice mixing models are not provided in most chemical speciation solvers, even popular ones such as PHREEQC [106]. The easiest method to implement the CASH+NK sublattice mixing model in such a program is to provide a discretized solid phase (DSP) model variant, in which the changing composition of the solid solution phase is split into a finite number of pseudocompounds or "frozen" bulk compositions, each treated as a pure phase. Usage of DSP variants has a long history (see [3]) in C-S-H gel modelling with computer codes that cannot really handle complex solid solutions. The discretization procedure used here has been developed and applied to the CASH+ core model by Kulik et al. [16] in order to promote the use of CASH+ model in other chemical speciation codes. Here, an extended DSP of the CASH+NK model covering the uptake of Na and K is provided (see Supplementary material). The discretization has been done in a semi-automatic mode using the GEM-Selektor and ThermoMatch codes, as described in the Supplementary material of [16].
Briefly, the bulk property of a frozen phase (pseudocompound) as the mole-fraction weighted sum of the respective standard property of all endmembers plus the property of mixing in the solution phase can be defined as:   where G ss, 298 is the Gibbs energy of the ss discrete phase, the G ∘ j, 298 is the standard Gibbs energy of j-th endmember, x j is its mole fraction, and y j is its activity coefficient related to the excess Gibbs energy for the ss discrete phase (which also includes the reciprocal mixing terms).
The 11 discrete C-S-H phases from C/S = 0.72-1.65 of the core CASH+ DSP model reported in [16] were used as the starting point for generating the Na and K containing discrete phases. Alkali-containing DSP were generated by equilibrating the CASH+ model having as input each of the 11 discrete bulk compositions (target C/S mole ratios from 0.6 to 1.7 with step 0.1 mol/mol) with solutions between 0.1 and 300 mmol/L alkali metal hydroxide concentration. This covers CASH+ compositions having between 0.001 and 0.32 alkali/Si at target C/S of 0.6 and between 1e-6 and 0.073 alkali/Si at target C/S of 1.7, in equal intervals. The total number of pseudocompounds for both Na and K uptake is thus 55 (Supplementary material).
The DSP model seems to overestimate the dissolved [Si] aq and underestimate [Ca] aq when used to simulate systems with mixed Na and K and indicates the need for mixed Na-K discrete phases. This would require, however, many discrete solid phases to correctly account for equilibria with the CASH+NK solid with all possible variations in Na and/or K concentrations. It is therefore recommended that the DSP variants are only used in relatively simple chemical systems where one alkali element is dominant.

Implications for cement pore solutions
The CASH+NK model has been developed to predict the composition of the C-S-H phase in a hydrated cement and its corresponding pore solution at equilibrium. Knowledge on the amount of alkali and alkaline earth metals bound to C-S-H has important implications for their concentrations in the pore solution of cement and, in particular, on the pH of the pore solution. The pH is one of the main factors affecting the durability of cements. High values pH >10 in the pore solution are needed to efficiently prevent the oxidation of steel bars embedded in concrete [107], whereas pH values and alkali concentrations in the pore solution that are too high can accelerate the alkali silica reaction [108,109], which can lead to expansion and damage on a structural level.
Most C-S-H models neglect the binding of alkali and alkaline earth metals to C-S-H [2][3][4]16], thus leading to strong overestimation of alkali metal concentrations and pH values >14 in the pore solution in contact with hydrated cements [110], which does not agree with pore solution measurements, where measured pH values = 13.0-13.5 [111] are generally observed. The uptake of alkali metals in C-S-H [23,110,112] is often corrected by using the empirical distribution coefficient (R d values), which allows predicting the alkali uptake in Portland cements relatively well. However, the application of R d to blended cements is difficult due to the strong dependency of such R d values on the C/S ratio in the C-S-H. The uptake of alkalis has also been modeled by introducing (simplified) additional Na-and K-endmembers, e.g. (NaOH) 2.5 SiO 2 H 2 O and (KOH) 2.5 SiO 2 H 2 O, into a C-S-H solid solution model as described in [19]. Indeed, this simplifies the modelling, but shows only satisfactory agreements with the experimental results [19]. The compositionally reliable and inherently flexible CASH+NK solid solution model presented here helps to close this important gap in thermodynamic models for hydrated cement systems.

Conclusions
A new CASH+NK(Li, Cs, Rb, Mg, Sr, Ba, Ra) sublattice mixing model was parameterized against the available experimental data on solubility and uptake of alkali metals and alkaline earth metals in C-S-H. The CASH+ sublattice mixing model considers the defect tobermorite structure and yields a satisfactory account of measured MCL. The internally consistent set of thermodynamic parameters, provided as an extension to the Cemdata 18 database, can now be used in GEM-Selektor code for improved modelling of equilibrium solid-and pore-water composition in hydrated cementitious materials. In a discretized form, the model can also be used in other geochemical computer codes that can describe equilibrium between a mineral and an aqueous phase, albeit with less accuracy. Key conclusions that resulted from the development of the CASH+NK model in the current study are: (1) The CASH+ core solid solution model [16] was incrementally extended, i.e., all parameters fitted earlier could be kept constant when extending the model for the uptake of alkali and alkaline earth metals. The agreement with experimental data could not be significantly improved by additionally fitting the sub-model parameters that were initially fixed. (2) For the CASH+N and CASH+K sub-models, allowing the alkali metal uptake in the IC sites only, so that no K or Na ions entered the BT sites, was completely sufficient to describe all the selected experiments. This reduced the number of added parameters to 5 for each alkali metal subsystem, instead of 11 if alkali metals would have been allowed to enter BT sites. (3) The standard entropy and heat capacity of the CASH+ endmembers could be estimated from their standard molar volumes using the VBT correlations. (4) Regular binary non-ideal site interaction parameters are needed for CASH+ sublattice model and its extensions to better describe the measured element concentrations in solution, the C-S-H composition, and the MCL data. Fig. 19. Comparison between the calculated C-S-H solubility using the CASH+NK model (continuous lines) and the DSP model variant (dashed lines).
(5) Thermodynamic properties of endmembers and values of interaction parameters related to the uptake of Li, Rb and Cs could be estimated from the Na and K counterparts. The model predictions based on estimations show a satisfactory agreement with experimental data. (6) The model was successfully extended for the uptake of alkaline earth metals (Mg, Sr, Ba, Ra), with parameters fitted to the available experimental data. Standard thermodynamic properties of endmembers related to the Mg uptake were estimated from correlations of R d with the ionic radius. (7) The uptake of alkaline earth metals increases from Mg to Ra (smaller to larger ionic radius), while K shows only a slightly higher uptake than Na. The uptake of Li, Rb and Cs is comparable to the uptake of Na and K by C-S-H. (8) High Ca concentrations lower the uptake of alkali metals (Li, Na, K, Rb and Cs) and alkaline earth metals (Mg, Sr, Ba, Ra) due to the competition in the IC sites. (9) The addition of NaOH or KOH increases the uptake of alkali and alkaline earth metals due to the increase in pH that promotes a replacement of vacancies in the IC sites. (10) The presence of alkali hydroxides decreases the MCL in C-S-H as the bridging silicas were simulated as being replaced by vacancies at low C/S and by Ca at high C/S.
With stepwise incremental extensions, the CASH+ sublattice solid solution model will be further applied for the uptake of Al and potentially, for the uptake other cations and anions, as well as extensions of the model to partially and fully dehydrated states. The extended CASH+ model, anticipated to be part of the next release of Cemdata database, is currently provided for GEMS codes, though it can also be used in a discretized form with other geochemical computer programs that can describe equilibrium between solid and liquid phases in a simplified way.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments
Funding from SNF CASH-2 project 200021_169014/1 and the SNF Sinergia CASH project 130419, European Joint Programme on Radioactive Waste Management EURAD Assessment of Chemical Evolution of ILW and HLW Disposal Cells (ACED) work package (Grant Agreement No. 847593), as well as partial support from Nagra, Wettingen, are gratefully acknowledged. Thanks go to Andre Nonat for helpful discussions, to Elina Bernard for help with the Mg system, and to two anonymous reviewers for abundant and extremely insightful comments on the initial and the revised manuscript that greatly helped to improve this paper.

Table A1
Standard molar properties and uncertainties of CASH+ endmembers, master aqueous species, and species derived in this study. δG ∘ 298 95% confidence interval (2σ). Sites: DU dimeric unit, BT bridging tetrahedral, IC interlayer cation, IW interlayer water. Letters: 'T', 'S', 'v', 'C', 'N', 'K', and so on are codes of respective moieties (Table 1). Charges in the sublattice sites have been removed for brevity. G ∘ 298 fitted, S ∘ 298 and Cp ∘ 298 from VBT, V ∘ 298 from density. Additional information on the thermodynamic data of elements and master species can be found in [18].  a CASH+ core model reported in [16].

Table A2
Standard molar properties and uncertainties of dissolution reactions for CASH+ endmembers and aqueous species derived in this study (calculated based on values in Table A1). Sites: DU dimeric unit, BT bridging tetrahedral, IC interlayer cation, IW interlayer water. Letters: 'T', 'S', 'v', 'C', 'N', 'K', and so on are codes of respective moieties (Table 1). Charges in the sublattice sites have been removed for brevity. δlog 10   a As an example, these endmembers have Ca in BT sites (CaOOH is in the third position of the name) and their sum will give the total mole fraction of Ca in the bridging tetrahedral sites.  [16].