A novel predictive model for estimation of cell voltage in electrochemical recovery of copper from brass: application of gene expression programming

Because brass is highly resistant to corrosion in sulfuric acid, leaching is the most important step in the hydrometallurgical recovery of brass scrap. In this study, the electrochemical dissolution of brass chips in sulfuric acid was investigated. The electrochemical cell voltage depends on several parameters and, because of the complexity of electrochemical dissolution, it cannot be easily predicted from the operational parameters of the cell. Modeling techniques are therefore needed to predict the cell voltage. In this study, 139 leaching experiments were conducted under different conditions. Using the experimental results and gene expression programming (GEP), acid concentration, current density, temperature and anode-cathode distance were taken as the inputs and the voltage of the electrochemical dissolution was predicted as the output. The results showed that the GEP-based model was capable of predicting the voltage of electrochemical dissolution of brass alloy with a correlation coefficient of 0.929 and a root mean square error (RMSE) of 0.052. Based on the sensitivity analysis, acid concentration and anode-cathode distance were the most and least effective parameters, respectively. The modeling results confirmed that the proposed model is a powerful tool for deriving a mathematical equation that relates the parameters of electrochemical dissolution to the resulting cell voltage.


Introduction
Increasing demand for metals, combined with industrial progress and population growth, has significantly depleted mineral resources. Along with the high production and consumption of metals, a huge amount of metallic scrap is produced. Therefore, recovery of metals from these secondary resources has gained considerable interest, both to compensate for the mineral shortage and to resolve environmental problems [1]. Recovery of copper [2-7] and zinc [8-10] from secondary resources has been widely researched because of the broad application of these metals. Brass production wastes such as slag, dusts, chips and scraps are among the secondary resources containing both Zn and Cu. Despite the popularity of hydrometallurgical methods for recovering metals from secondary resources and scraps, these methods have a major drawback for brass scrap: the corrosion resistance of brass towards sulfuric acid. Hence, the majority of studies focus on the leaching step [11]. To improve the yield of brass leaching, electrochemical methods can be employed in which the metal is placed in the anode position of an electrochemical cell and dissolved by application of direct current. The use of electrochemical dissolution for copper recovery from copper wastes has also been reported by other researchers [12,13]. In the electrochemical dissolution of brass, copper and zinc are dissolved at the anode. Copper ions are reduced at the cathode surface and deposit as cathode copper. The resulting zinc sulfate solution can undergo electrowinning in a separate cell to produce cathode zinc.
One of the main operational parameters of an electrochemical dissolution cell is its voltage, which has a direct impact on the electrical energy consumption of the process. Cell voltage is not an independent parameter but depends on the other operational parameters of the cell, such as the electrolyte condition (acid concentration and temperature), current density and anode-cathode distance. Because of the large number of effective parameters, the cell voltage cannot be predicted by simple modeling methods such as regression analysis. In this context, the application of modern techniques to determine the optimal conditions of electrochemical brass dissolution is a crucial step. Among the available methods, gene expression programming (GEP), invented by Ferreira, is a suitable tool for obtaining the governing mathematical equation. GEP is an evolutionary algorithm and is in fact an extension of the genetic algorithm (GA) and genetic programming (GP) that resolves their limitations [14]. GEP has been used to predict nonlinear behavior in leaching and related processes, such as copper recovery in column leaching of copper oxide ores [15], modeling of diaspore leaching kinetics [16], modeling of the leaching step in cobalt recovery from Li-ion batteries [17], Cu-Zn separation by supported liquid membrane [18], chemical kinetic modeling and parameter sensitivity analysis for the carbonation of Ca2+ and Mg2+ [19], modeling and optimization of the synergistic effect of Cyanex 302 and D2EHPA on the separation of zinc and manganese [20], and assaying the microbial population in heap bioleaching operations [21]. The accuracy of the method was confirmed in these studies. In the present study, the electrochemical dissolution voltage of brass in sulfuric acid was simulated with a new GEP-based method. According to the literature review, no previous study has modeled the electrochemical dissolution voltage using GEP.
To determine the dependence of the electrochemical dissolution process on the process parameters (sulfuric acid concentration, current density, temperature and anode-cathode distance), experiments were conducted on the electrochemical dissolution of brass under different operating conditions. The results of these trials were used to train and test 8 different GEP models. To assess the accuracy of the simulations, 3 of these 8 models were investigated further.

Experimental and theoretical background
2.1. Materials and experimental procedure
Brass chips with the chemical composition listed in Table 1 were used as the starting material. The chips were first sieved, and the fraction in the size range of 850-1499 underwent electrochemical dissolution. Since alloy chips could not be used directly as the anode, the chips were placed in an anodic basket made of Ti with dimensions of 30×10×3 mm³. A glass electrochemical cell with dimensions of 15×15×15 cm³ was used as the electrolysis container. The anodic basket was immersed in the electrolyte to a depth of 10 cm and two 316 stainless steel cathodes (10×10 cm²) were placed on either side of it. The electrolyte was prepared by diluting sulfuric acid (Merck, 98%) in deionized water. The required electrochemical potential was applied using a DC power supply (MCH-K3010DN) and the voltage was measured by a digital multimeter (WH5000) linked to a PC, where it was recorded. Fig 1 schematically illustrates the equipment used in the electrochemical dissolution of brass. Four parameters were studied: acid concentration, current density, temperature and anode-cathode distance. In each experiment, the electrolysis voltage was measured and recorded as the result of the experiment. The experimental conditions are listed in Table 2. In total, 139 data points were collected from the experiments.

Gene expression programming (GEP)
Like the genetic algorithm, gene expression programming uses linear chromosomes of constant length and, like genetic programming, it uses tree-like structures of various sizes and shapes. The only difference is that in GEP the tree structure is called an expression tree (ET). GEP is one of the most powerful methods for nonlinear and complex modeling [22,23]. In GEP, the genome or chromosome is a coded linear string of fixed length that can include one or several genes. Despite the constant length of the chromosomes, the ETs in GEP can have different sizes and shapes. GEP has two representation languages, Karva and the expression tree, which can be converted into each other. The Karva language was invented by Ferreira to read and express the program coded in the chromosome. In this type of coding, the numbers written above the symbols indicate the positions of the functions (mathematical operators) and terminals (problem variables and numerical constants). The positions are numbered from 0 to 9; after 9, the numbering starts again from 0. A gene always starts at position 0, but it does not always end at position 9, because part of the gene may not be expressed in the tree structure even though it plays a significant role in development. The section of the chromosome that can be expressed as a tree is called the open reading frame (ORF). In GEP models, the results are displayed in ET form. Each gene has two parts, a head and a tail. The head may contain both functions and terminals, whereas the tail may contain only terminals. The head size (h) is chosen by the designer, while the tail size (t) is a function of the head size and of the maximum number of arguments of the functions (n_max), as given by Eq (1) [14]:

$t = h\,(n_{max} - 1) + 1$ (1)

Chromosomes usually contain several genes. For each problem, the number of chromosomes, the number of genes and the head size of the genes are determined by the designer through trial and error.
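The head/tail rule and the reading of a Karva-coded gene described above can be illustrated with a minimal Python sketch. The function set, the protected operators, the gene string and the terminal values are all hypothetical choices made for illustration; they are not the function set or genes used in this study. The sketch assumes that the symbols of a gene fill the expression tree level by level (breadth-first), which is how Karva notation is read.

```python
import math
import operator

# Illustrative function set: symbol -> (arity, implementation).
# Any symbol not listed here is treated as a terminal (arity 0).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, lambda a, b: a / b if b != 0 else 1.0),  # protected division (assumption)
    "Q": (1, lambda a: math.sqrt(abs(a))),            # protected square root (assumption)
}


def arity(symbol):
    """Number of arguments a symbol takes; terminals take none."""
    return FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0


def tail_length(head_size, max_arity):
    """Tail size t = h * (n_max - 1) + 1."""
    return head_size * (max_arity - 1) + 1


def decode_orf(gene):
    """Return the open reading frame: the leading part of the gene that is expressed."""
    open_slots = 1  # the root position still has to be filled
    for i, symbol in enumerate(gene):
        open_slots += arity(symbol) - 1  # fills one slot, opens `arity` new ones
        if open_slots == 0:
            return gene[: i + 1]
    raise ValueError("gene is too short to form a complete expression tree")


def evaluate(gene, variables):
    """Build the expression tree breadth-first from the ORF and evaluate it."""
    orf = decode_orf(gene)
    children = []
    next_child = 1
    for symbol in orf:
        n = arity(symbol)
        children.append(list(range(next_child, next_child + n)))
        next_child += n

    def value(i):
        symbol = orf[i]
        if symbol in FUNCTIONS:
            func = FUNCTIONS[symbol][1]
            return func(*(value(c) for c in children[i]))
        return variables[symbol]

    return value(0)


# Hypothetical gene: head "Q*+-" (h = 4), tail "abcda" (t = 5).
gene = "Q*+-abcda"
print(tail_length(4, 2))   # 5
print(decode_orf(gene))    # Q*+-abcd  (the final "a" is not expressed)
print(evaluate(gene, {"a": 5, "b": 4, "c": 6, "d": 2}))  # sqrt((5+4)*(6-2)) = 6.0
```

Because the tail holds only terminals and is long enough for the worst case (a head made entirely of functions with the largest arity), every gene decodes to a valid tree, whatever the genetic operators do to the head.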
For multi-gene chromosomes, the code of each gene forms a sub-tree (sub-ET); the sub-trees are connected to each other through functions called linking functions, giving rise to a larger ET. Finally, a mathematical equation for predicting the output values can be extracted from the ET. As in GA and GP, an initial population of chromosomes is first created at random. These chromosomes are initially linearly coded structures that follow the Karva language. In the next step, they are expressed as ETs. Then the fitness of each chromosome of the first generation is calculated with the fitness function chosen by the designer. If the termination condition (also chosen by the designer) is met, the process stops; otherwise, the best individuals (chromosomes) of each generation are selected and copied to the next generation. The genetic operators are then applied to the chromosomes to form the new generation. These steps are repeated until the termination condition is reached. The genetic operators of GEP include mutation, inversion, three types of transposition (gene, RIS and IS) and three types of recombination (single-point, two-point and gene recombination), in

Methodology and prediction of electrochemical cell voltage
The parameters in Table 2 were considered as the model inputs, while the electrolysis voltage was taken as the model output. GeneXpro v5 software was employed for modeling. Of the 139 data points, 125 were used for training and 14 for model testing. The modeling process involved 5 major stages [24]. The first stage is the selection of the fitness function; in this paper, the root mean square error of the i-th chromosome, Eq. (2), was employed as the error measure. Since a higher fitness should indicate a better model, whereas a lower RMSE is better, the RMSE cannot be used directly as the fitness function; Eq. (3) was therefore employed to evaluate the fitness of the i-th chromosome. The second stage is the selection of the terminals (inputs) and of the functions used to form the chromosomes. The third stage is the determination of the chromosome structure (i.e. head size and number of genes for each chromosome). The fourth stage is the selection of the linking function, and the genetic operators and their rates are selected in the last stage. In this study, 150 models were built by trial and error to find the best model; the structures of a selection of them are presented in Table 3. The genetic operator rates of the resulting models are presented in Table 4.
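$RMSE_i = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(P_{ij} - T_j\right)^2}$ (2)

in which $P_{ij}$ is the value predicted by the i-th chromosome for the j-th data point among n data points, and $T_j$ is the measured electrolysis voltage for the j-th data point.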
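As a minimal sketch of the error measure described above, the root mean square error of one candidate model can be computed as follows. The conversion of the error into a fitness value, `scale / (1 + RMSE)`, is an assumption made here for illustration (a common way of turning an error to be minimized into a fitness to be maximized); it should not be read as the exact form of Eq. (3). The function names and the numbers in the example are hypothetical.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between predicted and measured values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - t) ** 2 for p, t in zip(predicted, measured))
    return math.sqrt(squared / len(measured))


def fitness_from_rmse(error, scale=1000.0):
    """Map an error (lower is better) to a fitness (higher is better).

    Assumed form: scale / (1 + error). A perfect model (error = 0)
    receives the maximum fitness `scale`.
    """
    return scale / (1.0 + error)


# Hypothetical voltages (V) for three data points, not data from this study.
measured = [2.10, 2.35, 2.60]
predicted = [2.05, 2.40, 2.55]
error = rmse(predicted, measured)
print(error, fitness_from_rmse(error))
```

With this kind of mapping, selection can always favor the chromosome with the largest fitness, while the underlying quantity being minimized remains the RMSE.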

Results and discussions
3.1. Model Validation
The model efficiency was evaluated by the R² and RMSE indices. The correlation coefficient (R²) indicates the correlation between the measured and predicted voltages; its value lies in the range 0 ≤ R² ≤ 1. R² values near zero reflect weak or no linear correlation between the variables, meaning that any relationship between them is nonlinear or random. When the data lie on a straight line (R² = 1), the two variables are completely correlated. The higher the correlation coefficient, the better the performance of the designed model. In addition to the correlation coefficient, RMSE was used to validate the model. Lower values of this parameter indicate better performance of the model in predicting the system voltage. The system voltages predicted by the models listed in Table 3, together with their correlation coefficients and RMSEs, are presented in Fig 3 and Fig 4, respectively. As the figures suggest, models No. 9, 2 and 7 performed best, as they exhibited the highest correlation coefficients and the lowest RMSEs. The ETs of these three models are presented in Fig 5. Finally, a mathematical equation was extracted from each of these sub-ETs, as listed in Table 5. The system voltages predicted by these three superior models are presented in Table 6 for the training and testing stages. Model No. 9 performed slightly better than the other models. Fig 6 and Fig 7 show the difference between the measured and predicted voltages and the scatter plot of the measured versus predicted voltages (by model No. 9), respectively. Table 7 lists the measured and predicted values (by the three superior models) for the test data, together with their absolute errors (calculated by Eq 4).
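The validation indices discussed above can be sketched in a few lines of Python. The sketch assumes that R² is taken as the square of the Pearson correlation coefficient between measured and predicted voltages and that the absolute error is the magnitude of their difference; the function names and the sample values are hypothetical and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    return pearson_r(measured, predicted) ** 2


def absolute_errors(measured, predicted):
    """Absolute error |measured - predicted| for each data point."""
    return [abs(m - p) for m, p in zip(measured, predicted)]


# Hypothetical test-set voltages (V), for illustration only.
measured = [2.10, 2.35, 2.60, 2.80]
predicted = [2.05, 2.40, 2.55, 2.85]
print(r_squared(measured, predicted))
print(absolute_errors(measured, predicted))
```

A squared correlation of 1 only states that the predictions lie on a straight line against the measurements, not that they are equal to them, which is one reason an error measure such as RMSE is reported alongside it.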

Sensitivity analysis
In this section, to determine the effect of each parameter on the system voltage, the Pearson correlation coefficient of each parameter with the system voltage predicted by model 9 (for both training and testing data) was calculated, as presented in Fig 8. The direction of the columns in Fig 8 (a: correlation coefficients; b: change in output mean) shows whether each input has a positive or negative impact on the output. If the column points downward, the output decreases as the input parameter increases; if it points upward, the output increases as the input parameter increases. The magnitude of the coefficient of each input shows which input parameter is more effective [25,26]. As the figure shows, the two graphs confirm each other. Acid concentration had the highest correlation and was identified as the most effective parameter on the system voltage. Its correlation was, however, negative, meaning that an increase in sulfuric acid concentration lowers the system voltage. As the sulfuric acid concentration increases, the electrical conductivity of the electrolyte rises and hence the IR drop in the electrolyte decreases. Therefore, the voltage required for brass dissolution is reduced. Moreover, an increase in sulfuric acid concentration raises the concentration of H+ ions, which facilitates the cathodic reduction of these ions at lower voltages:

$2H^{+} + 2e^{-} \rightarrow H_{2}$

Higher concentrations of sulfuric acid also increase the corrosivity of the electrolyte, so the electrochemical dissolution of brass becomes easier and achievable at lower voltages. An increase in sulfuric acid concentration therefore has a significant impact on reducing the process voltage, and the acid concentration should be increased as much as possible.
However, the sulfuric acid concentration should be chosen so that the zinc sulfate solution obtained at the end of the electrolysis remains suitable for the electrowinning process.
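The conductivity argument above can be made explicit with a textbook decomposition of the cell voltage. The following is a sketch under idealized assumptions (parallel electrodes, uniform current distribution, homogeneous electrolyte); the symbols are introduced here for illustration and do not appear in the original derivation of the model.

```latex
% Idealized cell-voltage balance (illustrative symbols, not from the source):
%   U_cell   : cell voltage (V)
%   E_rev    : reversible (equilibrium) cell voltage (V)
%   eta_a    : anodic overpotential (V)
%   eta_c    : cathodic overpotential (V)
%   j        : current density (A m^-2)
%   d        : anode-cathode distance (m)
%   kappa    : electrolyte conductivity (S m^-1)
\begin{align}
  U_{\mathrm{cell}} &= E_{\mathrm{rev}} + \eta_a + \lvert \eta_c \rvert + \Delta U_{IR}, \\
  \Delta U_{IR}     &= \frac{j\, d}{\kappa}.
\end{align}
```

In this idealization the ohmic term is inversely proportional to the conductivity, which is the mechanism invoked above for the lower voltage observed at higher acid concentration.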
Anode-cathode distance showed a near-zero correlation (0.00065%), indicating that this parameter did not have a significant impact on the system voltage. In electrochemical processes such as electrowinning and electrorefining, a larger anode-cathode distance reduces the number of anodes and cathodes per cell, so more cells are required to extract a given amount of metal; from this point of view, the anode-cathode distance should be as short as possible. However, if the distance is too short, the electrodes may touch each other and cause a short circuit; moreover, the metallic product has to be separated from the cathode at shorter time intervals, which increases staff costs. From this point of view, the anode-cathode distance should be adequately long. Given this dual impact, an optimized distance should be selected. Fortunately, since this parameter has a low impact on the process voltage (based on the sensitivity analysis), the decision about the distance is simpler, because one of the factors that would otherwise influence the optimal choice (the process voltage) can be omitted from the decision.

Current density exhibited a positive correlation with the process voltage, indicating that an increase in current density raises the system voltage. An increase in current density raises the anodic and cathodic overpotentials associated with the dissolution of copper and zinc from the anode into the solution and with the reduction of copper ions and hydrogen at the cathode. Therefore, the voltage required for the electrolysis increases. Thus, if the only aim were to reduce the voltage and energy consumption, the current density should be lowered. However, a decrease in current density reduces the cell productivity and hence increases production costs such as staff costs. Working at low current densities also requires more electrolysis cells to produce a given amount of metal, which requires higher capital investment.
Thus, an optimal value should be selected for the current density. An increase in current density raises the density of the energy input to the system, which increases the temperature of the electrolyte. According to Fig 8, an increase in temperature decreased the system voltage and the process energy consumption. Therefore, working at higher current densities is also desirable. In summary, the sulfuric acid concentration should be as high as possible, while optimized values should be used for the anode-cathode distance and the current density. Selecting the optimal values requires further information on process technology and costs (energy costs and staff costs); a further study is recommended in this context.
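The correlation-based sensitivity analysis described in this section can be sketched as follows. The data are synthetic: a small full-factorial grid of hypothetical input levels and a made-up linear "voltage" with a negative acid term, a smaller positive current-density term and no distance term. They only mimic the qualitative pattern reported above and are not the experimental data or model 9 of this study; all names and coefficients are assumptions.

```python
import itertools
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_inputs(columns, output):
    """Correlate each input column with the output; sort by absolute correlation."""
    scores = {name: pearson_r(values, output) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical input levels (arbitrary units) on a full-factorial grid.
grid = list(itertools.product(
    [100.0, 150.0, 200.0],  # acid concentration
    [200.0, 400.0],         # current density
    [2.0, 4.0],             # anode-cathode distance
))
columns = {
    "acid": [row[0] for row in grid],
    "current": [row[1] for row in grid],
    "distance": [row[2] for row in grid],
}

# Made-up stand-in for the model output: not the GEP equation of this study.
voltage = [3.0 - 0.01 * acid + 0.001 * current + 0.0 * distance
           for acid, current, distance in grid]

for name, r in rank_inputs(columns, voltage):
    direction = "decreases" if r < 0 else "increases"
    print(f"{name}: r = {r:+.3f} (voltage {direction} with this input)")
```

The sign of each coefficient gives the direction of the effect and its magnitude gives the ranking, which is the reading applied to Fig 8. A Pearson coefficient captures only the linear part of a dependence, so a near-zero value does not by itself rule out a nonlinear effect.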

Conclusions
In this study, after brass dissolution experiments and collection of the experimental data, the GEP method was employed to predict the voltage of electrochemical brass dissolution. For this purpose, acid concentration, temperature, current density and anode-cathode distance were used as inputs to predict the electrolysis voltage during the electrochemical dissolution of brass. After various models had been built, three superior models were selected. The predictions showed that the GEP method performs well in predicting the system voltage. The best GEP model (model 9) predicted the system voltage with a correlation coefficient of 0.929 and an RMSE of 0.052. Using the system voltages predicted by model 9 and the input parameters, a sensitivity analysis was conducted in which acid concentration and anode-cathode distance were identified as the most and least effective parameters on the system voltage, respectively. Finally, given the success of GEP in simulating electrochemical brass dissolution, it can be concluded that this approach can be applied to simulate the electrochemical dissolution of other materials.