Modeling viscosity of CO2 at high temperature and pressure conditions

The present work applies machine learning approaches to predict CO2 viscosity under different thermodynamic conditions. Various data-driven techniques, including multilayer perceptron (MLP), gene expression programming (GEP) and group method of data handling (GMDH), were implemented using 1124 experimental points covering temperatures from 220 to 673 K and pressures from 0.1 to 7960 MPa. Viscosity was modelled as a function of temperature and density measured at the stated conditions. Four backpropagation-based techniques were considered in the MLP training phase: Levenberg-Marquardt (LM), Bayesian regularization (BR), scaled conjugate gradient (SCG) and resilient backpropagation (RB). MLP-LM was the best-fitting of the proposed models, with an overall root mean square error (RMSE) of 0.0012 mPa s and a coefficient of determination (R2) of 0.9999. A comparison showed that our MLP-LM model outperformed the best preexisting machine learning CO2 viscosity models, and that our GEP correlation was superior to preexisting explicit correlations.


Introduction
Carbon dioxide (CO2) is the main contributor to greenhouse gas (GHG) emissions, accounting for up to 72% of the total GHG emissions recorded in 2010. Its effects on the climate and environment have become a great concern for both industry and academia (Norhasyima and Mahlia, 2018; Gambhir et al., 2017). Major efforts have been undertaken in the last decade to reduce CO2 emissions, especially from industrial processes. Two key strategies have been developed and adopted: carbon capture and storage (CCS) and carbon capture and utilization (CCU) (Cuéllar-Franca and Azapagic, 2015). In CCS, the CO2 is stored in underground geological formations including saline aquifers, depleted oil/gas reservoirs, and other geological options (Aminu et al., 2017). Storage in deep ocean water (Khatiwala et al., 2013; Adams and Caldeira, 2008) and CO2 mineral carbonation (Gerdemann et al., 2007; Oelkers et al., 2008) are also considered good alternatives to underground storage. However, the most promising alternative nowadays is to reuse captured CO2 for other industrial purposes (Abas and Khan, 2014). CO2 is widely used in enhanced oil and gas recovery (EOR and EGR) (Gong and Gu, 2015; Al-Bayati et al., 2018; Yu et al., 2015; Pu et al., 2016) and enhanced coalbed methane recovery (ECBM) (Mazzotti et al., 2009; Liu et al., 2019). Developments of the last decade have also indicated that CO2 injection can be a viable option for enhanced oil and gas recovery in tight shale reservoirs (Yu et al., 2014; Sheng, 2015; Eshkalak et al., 2014; Hoffman and others, 2012; Jin et al., 2017). In particular, studies (Nuttal, 2010; Busch et al., 2008; Klewiah et al., 2020) have demonstrated that CO2 has greater adsorption affinity than methane in shales, so that it can both be stored effectively and lead to more efficient shale gas production.
Alternative industrial applications include the use of CO2 as a refrigerant in heating and refrigerating processes (Sawalha et al., 2017; Li et al., 2016), as feedstock in the production of chemicals (Ampelli et al., 2015; Chen et al., 2016) and as a carbon source for microalgae to produce biofuels (Taher et al., 2015; Aslam et al., 2018).
Designing and optimizing the above processes requires a thorough understanding of the thermo-physical properties of CO2, which include PVT relations, enthalpy, thermal conductivity, viscosity, and diffusion coefficient, to name a few (Islam and Carlson, 2012). In particular, the viscosity of CO2 is one of the most crucial parameters for successful implementation and forecasting of numerous applications. In CO2-EOR projects, the viscosity of CO2 is directly related to its mobility inside the reservoir (Eshraghi et al., 2016). For CO2 flooding, low volumetric sweep efficiencies have consistently been associated with the high viscosity contrast between CO2 and other reservoir fluids (Yu et al., 2015). Likewise, the viscosity indirectly determines the energy and cost efficiency of CO2 transportation by pipeline (Zhang et al., 2006), as is the case for CCS projects in the USA and Canada (Cole and Itani, 2013). In fact, the viscosity is related through the Reynolds number to the pressure drop during flow in pipelines, which in turn affects the power consumption of pumps and compressors. It was reported that a viscosity underestimation of 30% will lead to a 30% underestimation of the pump/compressor power consumption (Li et al., 2011).
Historically, great attention has been paid to studying the viscosity of CO2 both experimentally and theoretically, by developing models and correlations able to predict its variation for different fluid states, mixtures, and operating conditions. The available experimental data on the viscosity of CO2 were thoroughly reviewed and compiled by Li et al. (2011) and more recently by Laesecke and Muzny (2017). On the theoretical side, the CO2 viscosity can be calculated using equations of state (EoS) or models based on EoS (Fan and Wang, 2006; Guo et al., 1997). In particular, the Span and Wagner EoS (Span and Wagner, 1996) is dedicated to CO2 property prediction. Empirical correlations were also developed. One of the first predictive correlations was reported by Chung et al. (1988). Their generalized correlation allowed the estimation of the viscosity of polar, nonpolar, and associating pure fluids (including CO2) and mixtures over a wide range of fluid states. However, this method was shown to be consistent with measurements only for low-pressure gases. Fenghour et al. (1998) improved a prior correlation established by Vesovic et al. (1990), which had suffered from deficiencies in the liquid region due to inconsistencies in some of the experimental data employed in its development. The new correlation covered a wider range of temperatures (200-1500 K) and pressures (from 0.1 up to 300 MPa). To date, the most complete correlation is the one established in 2017 by Laesecke and Muzny (2017). The authors employed all available viscosity data to develop this correlation in conjunction with the Span and Wagner EoS. The final correlation covered temperatures from 100 to 2000 K for gaseous CO2 and from 220 to 700 K for compressed and supercritical liquid states. Considering the work by Laesecke and Muzny to be the state of the art, we will later compare our results to their correlation.
With the emergence of artificial intelligence, the trend in this research area now consists of using heuristic algorithms to model and predict CO2 properties based on the large existing set of experimental data. Zhang et al. (2018) used general regression neural network (GRNN) and back-propagation neural network (BPNN) algorithms to predict CO2 solubility, solution density and viscosity in potassium lysinate (which, like monoethanolamine (MEA) solution, has a primary amine group due to a carbon-nitrogen bond (Kumar et al., 2003)) under various operating conditions and liquid concentrations. In addition, the structure-property relationships between ionic liquids with different molecular structures and their CO2 solubilities were investigated by Venkatraman and Alsberg (2017) using random forests (RF), conditional inference trees (CTREE), and partial least squares regression (PLSR), and by Ouaer et al. (2020) by applying machine learning algorithms such as multilayer perceptron (MLP) and gene expression programming. Recently, Abdolbaghi et al. (2019) reported the application of four computer-based models, namely particle swarm optimization (PSO), multilayer perceptron (MLP), hybrid-adaptive neuro fuzzy inference system (hybrid-ANFIS) and coupled simulated annealing-least square support vector machine (CSA-LSSVM), to the prediction of CO2 viscosity at high temperatures and pressures. Many studies have shown that machine learning algorithms can offer faster and more robust computational schemes than the classical empirical correlations and methods, and can thus save time and costs related to designing or performing future experimental studies (Javadian et al., 2018; Hemmati-Sarapardeh et al., 2013; Nait Amar et al., 2019a; Ayegba et al., 2017; Ahmadi et al., 2018; Raja et al., 2017; Shokir et al., 2017; Benamara et al., 2019a).
The main purpose of this study is to advance research on the development of highly accurate and simple-to-use machine learning approaches that can predict the viscosity of CO2. This is done by modeling CO2 viscosity as a function of temperature and density, based on an extensive database encompassing wide ranges of pressure and temperature conditions. To do so, three advanced data-driven techniques, namely multilayer perceptron (MLP), group method of data handling (GMDH) and gene expression programming (GEP), were implemented. Both the GMDH and GEP methods produce the output as an explicit function of the input parameters: GMDH applies a polynomial form, while for GEP the types of mathematical operations are user-specified. Numerous statistical and graphical assessment criteria were considered in the evaluation of the newly proposed models. The performance of our models was compared with the best available prior machine learning models and with well-performing explicit correlations. A trend analysis was conducted to observe variations of CO2 viscosity with respect to density and temperature. Further, outlier detection was performed to quantify potential experimental points deviating from the main trends in the database.
There are some important differences between this work and existing studies where ML has been applied to model CO2 viscosity: (1) wider temperature (220-673 K) and pressure (0.1-7960 MPa) ranges were considered in the training of the models; (2) four backpropagation-based learning algorithms were evaluated in the training of the MLP-based CO2 viscosity model; (3) this study is not limited to black-box ML models, as two distinct explicit correlations for CO2 viscosity, based on GMDH and GEP, were also derived; (4) the database of experimental points and the best of the developed ML models are provided in Excel files for the benefit of the readers.
The rest of the paper follows this structure: Section 2 describes the database and main assumptions utilized in this study to develop the models and explicit correlations. Section 3 summarizes the implemented data-driven techniques. Results are presented and discussed in Section 5. The study is summarized in Section 6.

Data gathering and preparation
Consistent models need to be developed from a reliable database that contains a large number of experimental measurements. The viscosity of CO2 has been investigated experimentally in several studies, which focused on the effects of variables such as temperature and pressure on this parameter. Among such works, Laesecke and Muzny (2017) gathered the largest known database on CO2 viscosity as a function of temperature and pressure.
In this work, an extensive experimental database including 1124 samples of CO2 viscosity with corresponding values of temperature, density and pressure was utilized for developing the proposed models. These experimental measurements were collected from several literature sources (Laesecke and Muzny, 2017; Van Der Gulik, 1997; Haepp, 1976; Vogel and Barkow, 1986; Golubʹev, 1970; Estrada-Alexanders and Hurly, 2008; Abramson, 2009; Kestin and Whitelaw, 1963; Vogel, 2016; Michels et al., 1957; Schäfer et al., 2015; Hendl et al., 1993). The database was randomly divided into a training set (80% of the database) and a test set (20% of the database). Table 1 presents the ranges of the variables applied in this study. Our models (whether using MLP, GMDH or GEP) assumed that the viscosity of CO2 could be modelled as a function of temperature and density, in line with the literature (Laesecke and Muzny, 2017; Abdolbaghi et al., 2019), where it has been claimed that higher accuracy can be obtained using density rather than pressure as the second input parameter. Hence, the models and correlations take the following form:

$$\mu = f(T, \rho)$$

where $\mu$ is the viscosity of CO2, $T$ is the temperature and $\rho$ is the density. When performing the modeling task using MLP, the data points were normalized between −1 and 1:

$$x_{norm} = 2\,\frac{x_i - x_{min}}{x_{max} - x_{min}} - 1$$

where $x_{norm}$ is the normalized value of $x_i$, and $x_{min}$ and $x_{max}$ are the minimum and maximum values of the variable $x$ (corresponding to viscosity, temperature and density), respectively. The data were not normalized when applying the GMDH or GEP methods. Mean square error (MSE) was considered as the assessment function during the training phase of the different models. MSE is defined as follows:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(\mu_i^{exp} - \mu_i^{pre}\right)^2$$

where $\mu$ is the viscosity of CO2, $N$ is the number of samples in the dataset and the superscripts exp and pre indicate experimental and predicted values, respectively.
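As an illustration of the preprocessing described above, the following Python sketch shows one possible way to apply the scaling to the interval [−1, 1], the random 80/20 split and the MSE criterion. The function names, the fixed random seed and the sample temperatures are assumptions made for this example; they are not taken from the implementation used in this study.

```python
import random


def normalize(values):
    """Map values linearly onto [-1, 1] using their own minimum and maximum."""
    lo, hi = min(values), max(values)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def denormalize(scaled, lo, hi):
    """Invert normalize() given the original minimum and maximum."""
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]


def mse(measured, predicted):
    """Mean square error between measured and predicted values."""
    return sum((m - p) ** 2 for m, p in zip(measured, predicted)) / len(measured)


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and cut them into training and test subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary choice
    cut = int(round(train_fraction * n_samples))
    return indices[:cut], indices[cut:]


# Illustrative temperatures in K (not actual database rows).
temperatures = [220.0, 300.0, 450.0, 673.0]
scaled = normalize(temperatures)
restored = denormalize(scaled, min(temperatures), max(temperatures))

# Index split for a database of 1124 samples.
train_idx, test_idx = split_indices(1124)
```

The same `normalize` call would be applied separately to viscosity, temperature and density, and model predictions would be mapped back to physical units with `denormalize` before computing errors in mPa s.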

Multilayer perceptron (MLP)
Artificial neural network (ANN) is one of the most robust data-driven methods. This approach is characterized by a high ability to recognize and identify relationships between input and output parameters in complex systems (Hemmati-Sarapardeh et al., 2018a). The learning strategy and the manner of processing information in ANN were inspired by the human brain (Hemmati-Sarapardeh et al., 2018b). Multilayer perceptron (MLP) is one of the most frequently utilized types of ANN for modeling purposes (Nait …) and is the first machine learning approach we apply here. The MLP model consists mainly of two principal elements: neurons and layers. The neurons are considered the basic component of any ANN. In the case of MLP, the neurons are distributed across three different kinds of layers, namely the input, hidden and output layers. The input layer is where the inputs (density and temperature data in this case, i.e. two neurons) are given to the model. The output layer is where the output (CO2 viscosity, i.e. one neuron) is returned. For a given neuron $j$ in layer $i$ (not in the input layer), the input $x_{ij}$ consists of a linearly weighted sum of the outputs $y_{i-1,k}$ from the neurons $k$ of the previous layer plus a bias term $b_{ij}$. The output $y_{ij}$ of the given neuron is the evaluation of this input by an activation function $f_i$ (which can vary from layer to layer):

$$x_{ij} = \sum_{k} w_{ijk}\, y_{i-1,k} + b_{ij}$$

$$y_{ij} = f_i\left(x_{ij}\right)$$

where $w_{ijk}$ denotes the weight connecting neuron $k$ of layer $i-1$ to neuron $j$ of layer $i$. An MLP model contains at least one hidden layer. Generally, one hidden layer is enough to identify the relationships in simple to moderately complex systems, while more than one hidden layer is needed in highly complex systems (Haykin, 2001). Indeed, the role of hidden layers is to map the inputs into higher-level features by means of activation functions. The latter are generally of a non-linear type such as Logsig and Tansig:

$$\mathrm{Logsig}(x) = \frac{1}{1 + e^{-x}}$$

$$\mathrm{Tansig}(x) = \frac{2}{1 + e^{-2x}} - 1$$

Pureline, the linear function $f(x) = x$, is generally the proper transfer function for the output layer.
A trial and error method is frequently considered when looking for the best number of hidden layers, their numbers of neurons as well as the proper activation and transfer functions.
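To make the layer-by-layer evaluation concrete, the following Python sketch runs a forward pass through a small MLP with Tansig hidden layers and a linear output. The layer sizes (2-11-11-9-1) follow the structure reported later in the Results section; the random weights, the input values and all function names are placeholders assumed for illustration only, so the output has no physical meaning.

```python
import math
import random


def tansig(x):
    """Hyperbolic tangent sigmoid, mathematically identical to tanh(x)."""
    return 2.0 / (1.0 + math.exp(-2.0 * x)) - 1.0


def purelin(x):
    """Linear transfer function for the output layer."""
    return x


def init_layer(n_in, n_out, rng):
    """Create placeholder weights (n_out rows of n_in values) and biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
    return weights, biases


def forward(inputs, layers, activations):
    """Propagate inputs: weighted sum plus bias, then the layer activation."""
    y = list(inputs)
    for (weights, biases), f in zip(layers, activations):
        y = [f(sum(w * v for w, v in zip(row, y)) + b)
             for row, b in zip(weights, biases)]
    return y


rng = random.Random(0)  # arbitrary seed, only for reproducibility
sizes = [2, 11, 11, 9, 1]  # inputs, three hidden layers, one output
layers = [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]
activations = [tansig, tansig, tansig, purelin]

# Two normalized inputs (e.g. scaled temperature and density) in [-1, 1].
output = forward([0.1, -0.3], layers, activations)

# Total number of trainable weights and biases in this structure.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
```

Training then amounts to adjusting the entries of `layers` so that `forward` reproduces the normalized viscosity; the sketch leaves that step out.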

Training
The training phase of the MLP model aims to find appropriate values for the weights and bias terms in order to minimize the gap between the predictions and the real values. Back-propagation (BP) learning approaches are suitable for this purpose, and we applied four such alternatives to train the MLP: the Levenberg-Marquardt (LM) algorithm, Bayesian regularization (BR), scaled conjugate gradient (SCG), and resilient backpropagation (RB). For more information about these algorithms, the reader is referred to previous works (Hemmati-Sarapardeh et al., 2018b; Nait Amar et al., 2019b; Benamara et al., 2019b; Nait Amar and Zeraibi, 2019). The four resulting models are denoted MLP-LM, MLP-BR, MLP-SCG and MLP-RB.

Group method of data handling (GMDH)
Group Method of Data Handling (GMDH) is another neural network method, which provides the output as an explicit mathematical correlation of the input parameters (Dargahi-Zarandi et al., 2017). It is also called polynomial neural network (PNN), as its structure consists of nodes organized in one or more intermediate layers and its generated expression is given in a polynomial form such as the quadratic form, introduced to GMDH by Ivakhnenko (1971). The relation between the inputs $x_i, x_j, \ldots, x_k$ to a node and the output $Y$ from the node is expressed as follows:

$$Y = a + \sum_{i=1}^{D} b_i x_i + \sum_{i=1}^{D}\sum_{j=1}^{D} c_{ij}\, x_i x_j + \cdots + \sum_{i=1}^{D}\sum_{j=1}^{D}\cdots\sum_{k=1}^{D} d_{ij\ldots k}\, x_i x_j \cdots x_k \tag{9}$$

where $a$, $b_i$, $c_{ij}$ and $d_{ij\ldots k}$ are the coefficients of the polynomial and $D$ is the number of inputs to the node. In this work, two nodes were applied, each as a third-degree polynomial function of two input parameters (with $D = 2$ and terms up to third order in Eq. (9)). The first node's output was a polynomial function of the inputs density and temperature, while the output of node 2 (giving viscosity) was a polynomial function of density and the output from node 1. Generally, the number of nodes and the order of the polynomial can be tuned. The coefficients of the polynomials (as in Eq. (9)) were optimized by applying the least square errors method, formulated as:

$$\min \sum_{i=1}^{N}\left(y_i^{GMDH} - y_i\right)^2 \tag{10}$$

In this equation, $y_i$ is the experimental value of viscosity, $y_i^{GMDH}$ is the value suggested by the polynomial formula and $N$ is the number of training samples. To achieve the final trained GMDH model, the problem in Eq. (10) is solved by first transforming $y$ into matrix form (Dargahi-Zarandi et al., 2017; Hemmati-Sarapardeh and Mohagheghian, 2017):

$$y = XA$$

and the solution is then given as:

$$A = \left(X^{T}X\right)^{-1}X^{T}y$$

where $X$ is the matrix composed of vectors of input products for each data point appearing in the combined polynomial term (one row per data point) and $A$ is the corresponding vector of coefficients. In the present work, hybrid GMDH was employed, which allows (1) interactions between nodes from different layers and (2) the combination of more than two independent variables at a time.
These two advantages provide more flexibility when dealing with complicated modeling cases (Rostami et al., 2019). The final correlation obtained from the hybrid version of GMDH is expressed in terms of the polynomial coefficients $\theta_{ij\ldots k}$ and the number of layers $L$.
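The least-squares step for a single GMDH node can be sketched in Python as follows. The example builds all monomials up to third degree in two inputs and solves the normal equations with a small Gaussian elimination routine. The helper names, the synthetic target polynomial and the grid of sample points are assumptions introduced for illustration; the sketch covers one node only and does not reproduce the hybrid GMDH structure or the coefficients obtained in this study.

```python
def poly_terms(x1, x2, degree=3):
    """All monomials x1**p * x2**q with p + q <= degree (10 terms for degree 3)."""
    return [x1 ** p * x2 ** q
            for p in range(degree + 1)
            for q in range(degree + 1 - p)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_node(x1s, x2s, ys, degree=3):
    """Least-squares coefficients A from the normal equations (X^T X) A = X^T y."""
    rows = [poly_terms(a, b, degree) for a, b in zip(x1s, x2s)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def eval_node(coeffs, x1, x2, degree=3):
    """Evaluate the fitted polynomial node at one point."""
    return sum(c * t for c, t in zip(coeffs, poly_terms(x1, x2, degree)))


# Synthetic data on a 7 x 7 grid in [-1, 1]^2 (not CO2 measurements).
grid = [-1.0 + i / 3.0 for i in range(7)]
x1s = [a for a in grid for _ in grid]
x2s = [b for _ in grid for b in grid]
ys = [1.0 + 2.0 * a - a * b + 0.5 * b ** 3 for a, b in zip(x1s, x2s)]

coeffs = fit_node(x1s, x2s, ys)
```

In a two-node arrangement like the one described above, the second node would be fitted in the same way, with density and the first node's output as its two inputs.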

Gene expression programming (GEP)
Gene expression programming (GEP) is, like GMDH, a data-driven technique generating explicit (mathematical) expressions. It was proposed by Ferreira (2001) as an improved version of the so-called genetic programming (GP) introduced by Koza (1992). GEP is regarded as an evolutionary technique; hence, in searching for the best-fitting explicit correlation, it applies the fundamental genetic operators (e.g. selection, crossover, mutation) and some other specific procedures that did not exist in the earliest versions of evolutionary algorithms.
The GEP search process begins with creating a set of possible solutions (termed 'chromosomes'). In this case, the possible solutions represent different explicit correlations. The chromosomes are made up of 'genes', which are fixed-length terms involving variables (such as $\{x_1, x_2, x_3\}$) and operators (such as $\{+, /, \times, -, \tanh, \ln\}$) (Teodorescu and Sherwood, 2008). Fig. 1 illustrates a scheme of a chromosome including two genes and its converted expression. The main steps performed in GEP for reaching the accurate correlation are given as follows:
- Setting the GEP control parameters: population size (the number of chromosomes/correlations in each generation), the number of genes (i.e. the maximum number of terms, also called the length), the considered operators and the mutation rate.
- Create an initial population of chromosomes (possible correlation forms).
- Each solution is optimized to fit the data by determining the involved coefficients using the least squares approach.
- Each chromosome in the population is evaluated based on a fitness function indicating its performance. In our case, mean square error (MSE) was used.
- The next generation of solutions is generated using:
  - Elitism: the most fit element in the current population is saved for the following generation.
  - Selection and recombination: individuals are picked which will be recombined to give new offspring.
  - Mutation: modify elements within a genome according to a mutation rate.
  - Transposition, inversion and insertion of sequences: this is done by adding or inverting parts of the genome in the chromosome to improve the prediction ability of the correlations (Ferreira, 2001).
- The new population is then evaluated and the calculation steps from the elitism operator onward are repeated until a stopping criterion is satisfied.
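The evaluation part of the steps above (coefficient fitting, fitness scoring and elitism) can be sketched in Python as follows. This is a deliberately simplified illustration: candidate correlations are written as nested tuples rather than encoded genes, each candidate gets a single multiplicative coefficient fitted by least squares, and no recombination, mutation or transposition is performed. The data, the candidate expressions and all names are assumptions made for this example.

```python
import math


def evaluate(expr, variables):
    """Recursively evaluate a nested-tuple expression such as ('*', 'x1', 'x2')."""
    if isinstance(expr, str):
        return variables[expr]
    op, *args = expr
    vals = [evaluate(a, variables) for a in args]
    if op == "+":
        return vals[0] + vals[1]
    if op == "-":
        return vals[0] - vals[1]
    if op == "*":
        return vals[0] * vals[1]
    if op == "/":
        return vals[0] / vals[1]
    if op == "tanh":
        return math.tanh(vals[0])
    if op == "ln":
        return math.log(vals[0])
    raise ValueError(f"unknown operator: {op}")


def fit_and_score(expr, samples, targets):
    """Fit y ~ c * g(x) by least squares and return (c, MSE) as the fitness."""
    g = [evaluate(expr, s) for s in samples]
    c = sum(y * gi for y, gi in zip(targets, g)) / sum(gi * gi for gi in g)
    mse = sum((y - c * gi) ** 2 for y, gi in zip(targets, g)) / len(targets)
    return c, mse


# Synthetic samples and targets (the "true" relation is 3 * x1 * x2).
samples = [{"x1": 0.5 + 0.25 * i, "x2": 1.5 + 0.1 * j}
           for i in range(5) for j in range(5)]
targets = [3.0 * s["x1"] * s["x2"] for s in samples]

# A tiny hand-written population of candidate correlation forms.
population = [
    ("+", "x1", "x2"),
    ("*", "x1", "x2"),
    ("tanh", "x1"),
    ("/", "x1", ("ln", "x2")),
]

# Score every candidate and keep the fittest one (elitism).
scored = [(fit_and_score(e, samples, targets), e) for e in population]
(best_c, best_mse), elite = min(scored, key=lambda item: item[0][1])
```

A full GEP run would wrap this scoring in a loop that creates new candidates from the current population with the genetic operators listed above, carrying `elite` over unchanged to the next generation.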

Results and discussion
In the MLP approach, trial and error was used to select the proper number of layers and nodes and the control parameters of the various employed learning algorithms. The best-found structure in the four proposed MLP paradigms was three hidden layers with 11, 11 and 9 neurons, respectively. Tansig was the optimal activation function in all the hidden layers, while Pureline was the optimal transfer function for the output layer. The main control parameters applied in the GEP algorithm are stated in Table 2. Two nodes with third-order polynomials were applied in GMDH.

Performance evaluation
To evaluate the performance and robustness of the proposed models, two statistical indicators are used, namely root mean square error (RMSE) and coefficient of determination (R2). Their mathematical expressions are specified as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mu_i^{exp} - \mu_i^{pre}\right)^2}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(\mu_i^{exp} - \mu_i^{pre}\right)^2}{\sum_{i=1}^{N}\left(\mu_i^{exp} - \bar{\mu}^{exp}\right)^2}$$

where $\bar{\mu}^{exp}$ is the mean of the experimental viscosity values. Fig. 2 illustrates the obtained cross plots for the different models, showing predicted CO2 viscosity versus the experimental data. The reference unit slope line corresponds to a perfect match. It is seen that all the models exhibit very satisfactory distributions of predictions near the unit slope line for both training and testing datasets, demonstrating high reliability in estimating the viscosity of CO2. On close inspection, the MLP-LM and MLP-BR models have the least scatter around the unit slope line, while more scatter is seen for the MLP-SCG and MLP-RB models and even more for the GMDH and GEP models. The goodness of the match is illustrated more directly in Fig. 3, which shows comparative bar plots of RMSE and R2 values for the proposed models evaluated on the overall dataset. As seen, all models have R2 values very close to 1, indicating very good prediction, although the RMSE values clearly distinguish between the models in terms of their matching ability. In particular, these values demonstrate the same quality trend as seen visually in Fig. 2. The statistical parameters of the calibrated models evaluated for the training, test and overall datasets are listed in Table 3. Evaluation of the statistical parameters reported in Table 3 and Fig. 3 shows that the MLP model optimized using LM is the most accurate model. It has the highest coefficient of determination during training, testing and overall (R2 = 0.9999 in each), and the lowest RMSE values, namely 0.0012, 0.0011 and 0.0012, respectively.
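For readers who wish to reproduce such indicators, a minimal Python sketch is given below. The R2 form used here (one minus the residual sum of squares divided by the total sum of squares about the experimental mean) is the common textbook definition and is assumed for this example; the viscosity values are invented and serve only to exercise the functions.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Coefficient of determination relative to the mean of the measurements."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Made-up viscosities in mPa s, not taken from the database.
measured = [0.015, 0.020, 0.050, 0.120]
predicted = [0.016, 0.019, 0.051, 0.119]

error = rmse(measured, predicted)
r2 = r_squared(measured, predicted)
```

Because RMSE keeps the units of viscosity while R2 is dimensionless, the two indicators are complementary: models with nearly identical R2 can still be separated by their RMSE.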
It can also be noticed from Table 3 and Fig. 3 that the reliability of the developed models takes the following order: MLP-LM > MLP-BR > MLP-SCG > MLP-RB > GEP > GMDH. Furthermore, regarding the models providing explicit correlations, the GEP-based correlation outperforms that obtained with GMDH. Accordingly, the MLP-LM model is considered for further evaluation and comparison with prior machine learning models from the literature, and the GEP-based correlation is compared with existing explicit correlations from the literature for predicting the viscosity of CO2. Fig. 4 shows a comparison between the MLP-LM predicted CO2 viscosity values and the experimental values for the training and test sets. This figure reveals that the majority of the predictions during the training and testing phases are in line with the measurements. Fig. 5 presents the relative error distribution of the MLP-LM model in a contour map plotted against temperature and density. Very small relative errors are achieved.

Comparison with the existing models
To compare the performance of the proposed MLP-LM model and the GEP correlation in estimating CO2 viscosity, two prior approaches were selected: the best established paradigm by Abdolbaghi et al. (2019), which consists of a radial basis function neural network optimized by PSO (RBF-PSO), and the correlation of Laesecke and Muzny (2017). The four stated models were applied to the collected CO2 viscosity database. The performance results of the models by Abdolbaghi et al. (2019) and Laesecke and Muzny (2017) are stated and compared with our newly proposed models (MLP-LM and GEP) in Table 4 and Fig. 6. As demonstrated, our best implemented smart model (MLP-LM) outperforms the best prior model proposed by Abdolbaghi et al. (2019), with an RMSE of 0.0012 compared to 0.0018. In addition, the newly developed explicit correlation using GEP shows a better match for predicting the viscosity of CO2 than that of Laesecke and Muzny (2017), with an RMSE of 0.016 compared to 0.020. The correlations have notably higher RMSE than the machine learning models, but overall, they all have very high performance. It is worth noting that the proposed GEP-based explicit correlation can be applied more directly than less transparent intelligent schemes and hence can more easily be integrated into different software for studying and simulating tasks related to CO2 utilization. With some programming, however, intelligent schemes can also easily be implemented and applied. Hence, there is no practical limitation to the use of our best-established paradigm in this study, namely the MLP-LM model. For illustration, we have developed an Excel macro to allow its application and utilization. The procedure for using this Excel macro is described in Appendix A.

Fig. 6. RMSE comparison of our best models with the best models known from the literature. RMSE was evaluated using the entire experimental dataset.

Trend analysis and relevancy factor
The impact of the input parameters, namely density and temperature, on CO2 viscosity in the developed MLP-LM model is investigated. The variation of CO2 viscosity with density for several constant temperatures (313.15, 363.15 and 423.15 K) is reported in Fig. 7 together with experimental points. The stable, smooth trends passing through the experimental points indicate the robustness of the newly proposed MLP-LM model in terms of physical interpretation. Fig. 8 shows CO2 viscosity versus temperature for constant densities (0.208, 1.442 and 2.732 kg/m3) with experimental points. The implemented MLP-LM model is able to correctly follow the experimental variations of CO2 viscosity with temperature.
To quantify the impact of various input parameters on model outputs, the relevancy factor (Pearson correlation coefficient) is normally used. The closer the absolute value of this parameter is to 1, the more the specific input parameter impacts the output (although a linear relation is assumed). The relevancy factor $r$ is defined as follows:

$$r\left(I_j, O\right) = \frac{\sum_{i=1}^{n}\left(I_{j,i} - \bar{I}_j\right)\left(O_i - \bar{O}\right)}{\sqrt{\sum_{i=1}^{n}\left(I_{j,i} - \bar{I}_j\right)^2 \sum_{i=1}^{n}\left(O_i - \bar{O}\right)^2}}$$

where $I_j$ indicates the $j$th model input (here there are two inputs, temperature and density), $I_{j,i}$ is the $i$th value of input $j$ and $\bar{I}_j$ denotes the average of these values. $O_i$ is the output corresponding to input $I_{j,i}$, and $\bar{O}$ is the average of the output values. As depicted in Fig. 9, the viscosity of CO2 correlates positively with both input parameters, as given by relevancy factors of 0.26 for temperature and 0.64 for density. Hence, increasing density or temperature is expected to result in rising CO2 viscosity. Furthermore, as density exhibits a higher relevancy factor than temperature, it has more influence on the calculated CO2 viscosity values.
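The relevancy factor is straightforward to compute. The Python sketch below implements the Pearson correlation coefficient between one input and the output; the function name and the small numerical series are assumptions for illustration and are unrelated to the values reported in Fig. 9.

```python
import math


def relevancy_factor(inputs, outputs):
    """Pearson correlation coefficient between one input series and the output."""
    n = len(inputs)
    mean_i = sum(inputs) / n
    mean_o = sum(outputs) / n
    cov = sum((i - mean_i) * (o - mean_o) for i, o in zip(inputs, outputs))
    var_i = sum((i - mean_i) ** 2 for i in inputs)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_i * var_o)


# Toy series: the output rises linearly with the first input
# and falls linearly with the second one.
rising_input = [1.0, 2.0, 3.0, 4.0]
falling_input = [4.0, 3.0, 2.0, 1.0]
output = [2.0, 4.0, 6.0, 8.0]

r_rising = relevancy_factor(rising_input, output)
r_falling = relevancy_factor(falling_input, output)
```

Applied to the database, the function would be called once per input (temperature, then density) against the predicted viscosity. Since it only captures linear association, a modest value does not rule out a strong non-linear dependence.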

Outlier detection
The statistical validity of the developed model in predicting CO2 viscosity is analyzed using the Leverage statistical approach: the standardized residuals, which represent the difference between the forecasted results and the experimental data, and the Hat matrix Leverage values are depicted in the Williams plot (standardized residuals versus Leverage indices) to detect possible outliers. The Hat matrix is calculated using the formula below (Kamari et al., 2015; Jaworska et al., 2005):

$$H = X\left(X^{t}X\right)^{-1}X^{t}$$

where $X$ is an ($n \times d$) matrix, $n$ and $d$ denote the number of data points and the dimension (number of model inputs), respectively, and $X^{t}$ is the transpose of $X$. In the Williams plot, a limit Leverage value (H*), which is a constant, is computed as 3(d + 1)/n. The data points are selected in the range of ±3 standard deviations from the mean, where the cut-off value of 3 covers 99% of the distributed data. To obtain a valid model which leads to predictions in the applicable domain, the majority of data points must be situated in the intervals 0 ≤ H ≤ H* and −3 ≤ R ≤ 3. The Williams plot of the MLP-LM model is shown in Fig. 10. The majority of data points are in the ranges −3 ≤ R ≤ 3 and 0 ≤ H ≤ 0.008, and only 1.96% of data points are located outside the applicability domain of the model. This means that most of the data are well predicted and the validity of the model is confirmed. Therefore, it can be stated that the developed model predicts CO2 viscosity with high accuracy.
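The quantities behind a Williams plot can be sketched in Python as follows for a two-input model. Several choices here are assumptions made for illustration: the matrix X holds only the two input columns (no intercept column), the residuals are standardized by subtracting their mean and dividing by their sample standard deviation, and the data are synthetic with one deliberately corrupted prediction.

```python
import math


def leverages(rows):
    """Diagonal of H = X (X^t X)^-1 X^t for an (n x 2) matrix given as rows."""
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    c = sum(r[1] * r[1] for r in rows)
    det = a * c - b * b
    inv = ((c / det, -b / det), (-b / det, a / det))  # inverse of X^t X
    return [r[0] * (inv[0][0] * r[0] + inv[0][1] * r[1])
            + r[1] * (inv[1][0] * r[0] + inv[1][1] * r[1])
            for r in rows]


def standardized_residuals(measured, predicted):
    """Residuals centred on their mean and scaled by their standard deviation."""
    res = [p - m for m, p in zip(measured, predicted)]
    mean = sum(res) / len(res)
    std = math.sqrt(sum((e - mean) ** 2 for e in res) / (len(res) - 1))
    return [(e - mean) / std for e in res]


def leverage_limit(n_points, n_inputs):
    """Limit Leverage value H* = 3 (d + 1) / n."""
    return 3.0 * (n_inputs + 1) / n_points


def flag_outliers(rows, measured, predicted):
    """Indices of points with H > H* or |R| > 3."""
    h_star = leverage_limit(len(rows), len(rows[0]))
    h = leverages(rows)
    r = standardized_residuals(measured, predicted)
    return [i for i, (hi, ri) in enumerate(zip(h, r))
            if hi > h_star or abs(ri) > 3.0]


# Synthetic two-column inputs and predictions (not the CO2 database).
rows = [[-1.0 + 0.1 * i, math.sin(i)] for i in range(21)]
measured = [1.0] * 21
predicted = [1.0 + 0.001 * (-1) ** i for i in range(21)]
predicted[10] = 1.5  # deliberately corrupted prediction

h = leverages(rows)
flagged = flag_outliers(rows, measured, predicted)
```

With d = 2 inputs and n = 1124 points, `leverage_limit` gives 9/1124, approximately 0.008, which matches the upper Leverage bound quoted above.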

Conclusions
In this paper, several machine learning techniques were applied to establish robust and simple-to-use models to accurately predict the viscosity of CO2 under wide ranges of pressure and temperature conditions, using density and temperature as input parameters. A dataset of 1124 experimental points was used to calibrate and validate the models. The applied data-driven methods were multilayer perceptron optimized with four distinct back-propagation algorithms (LM, BR, SCG and RB) and two robust 'white-box' techniques yielding explicit correlations, namely GMDH and GEP.
The analysis revealed that MLP optimized with LM (MLP-LM) resulted in the best paradigm for predicting the viscosity of CO2, with very low RMSE values: 0.0012 during training, 0.0011 for the test data and 0.0012 overall. In addition, the best of the newly proposed models outperformed prior paradigms. The MLP-LM model performed better than the previous best-performing intelligent model by Abdolbaghi et al. (2019); also, the best explicit correlation we obtained, from GEP, outperformed the best available explicit correlation, by Laesecke and Muzny (2017). Trend analysis of the MLP-LM model demonstrated that increasing temperature or density causes the viscosity of CO2 to increase, in line with experimental trends.

Declaration of competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.