Analysis of waste groundnut oil biodiesel production using response surface methodology and artificial neural network

This work investigated the use of KOH and NaOH catalysts for waste groundnut oil (WGO) biodiesel production and compared response surface methodology (RSM) with artificial neural network (ANN) for modelling the relationship between yield and process parameters. A Box–Behnken experimental design was adopted, and the four process parameters considered were methanol-oil mole ratio (6–12), catalyst concentration (0.7–1.7 wt%), reaction temperature (48–62 °C) and reaction time (50–90 min). The results reveal that the KOH catalyst produced a higher biodiesel yield than the NaOH-catalysed process. For the KOH-catalysed transesterification process, the ANN model had a regression coefficient (R) of 0.9241 and a coefficient of determination (R²) of 0.8539, while the R and R² calculated from RSM were 0.9290 and 0.8516. For NaOH-catalysed WGO biodiesel production, the overall R and R² of the ANN model were 0.9629 and 0.9272, while those calculated from RSM were 0.9210 and 0.8791. Hence, the results typify the robustness and superiority of ANN over RSM in predicting and solving complex problems, specifically in the transesterification of biodiesel, due to the larger values of R and R² as recorded.


Rationale
The constant search for sustainable energy sources to meet global energy demand is propelled by the astronomical rise in fossil fuel prices, health and safety concerns, the non-renewable nature of fossil fuels and the increasing negative environmental impact of greenhouse gas emissions [1][2][3]. Among biofuels, biodiesel is gaining acceptance and economic value as a reliable alternative source of energy because of the benefits it has over fossil fuels. These benefits include its renewability as an energy source, zero sulphur content and excellent lubrication property [3][4][5]. Biodiesel is a processed fuel that can be obtained from the transesterification reaction between lipids (such as vegetable oils or animal fats) and a low-carbon-chain alcohol (such as methanol, ethanol or butanol) in the presence of a suitable catalyst. 'Bio' represents its renewability and biological source, while 'diesel' implies that its performance is similar to that of petroleum diesel, without alteration of the diesel engine [6]. The use of waste cooking oils (WCO), instead of conventional virgin oils, has proven to be an effective way of reducing the cost of biodiesel production. However, the free fatty acid (FFA) content of WCO has to be removed before the oil is utilised. The commonly used WCO are waste soybean oil, waste groundnut oil, waste palm oil and waste palm kernel oil.
The three categories of catalysts generally used in the production of biodiesel are base catalysts, acid catalysts and enzymes. The drawback of enzymatic catalysts is that they are expensive for industrial-scale production of biodiesel [7]. Acid-catalysed transesterification usually requires a high molar ratio of alcohol to oil, a long reaction time (such as 10-15 h) and high pressure to reach completion. In addition, both the metallic process materials and engines are subject to corrosion under acid-catalysed transesterification [8]. In contrast, the use of base catalysts is cost effective because it requires a short reaction time (45 min to 2 h), a low molar ratio of alcohol to oil (5-15) and a low catalyst concentration (0.4-2.0 wt%). The two forms of base catalyst commonly used are KOH and NaOH [9][10]. These two base catalysts are mostly preferred to other forms of catalyst because they are readily available, cheap and possess good catalytic behaviour [9].
Response surface methodology (RSM) is a statistical tool that is adopted, among its many other uses, to explore the relationship (in the form of models and diagrams) between the process variables (inputs) and one or more responses (outputs) [11][12][13][14][15]. This is achieved through an experimental design that gives direction on how to obtain an optimal response and a second-order polynomial model [11][13][14][15]. Another tool that can be used to establish the relationship between the inputs and output(s) is artificial neural network (ANN).
ANN is a versatile computing tool for probing complex and chaotic systems by training on many input parameters to produce system outputs [16]. Its ability to connect experimental data with the underlying theory has attracted interest in biology, biochemical and chemical engineering [17][18][19][20][21][22][23]. It offers many embedded modelling methods that can be adopted in project design and analysis. It comprises an input layer, a hidden layer and a response (output) layer. The aim of this research work is to compare the performance of KOH and NaOH catalysts (in terms of biodiesel yield) during the transesterification of WGO and, more importantly, to formulate models that relate the four process variables (inputs) to biodiesel yield (output), using RSM and ANN modelling tools.

Materials, reagents and equipment
Materials and reagents used in the course of this research work include waste groundnut oil, KOH pellets, NaOH pellets, methanol, hydrochloric acid and tetraoxosulphate (VI) acid (all reagents are analytical grade products of Sigma-Aldrich, J.T Baker, Qualikems and Romil Ltd.). In addition, the equipment used includes Gas Chromatography Mass Spectroscopy (Agilent Technologies 7890A) and Atomic Absorption Spectroscopy (Analyst 200, Perkin Elmer).

Pre-treatment of the WGO
All unwanted constituents (sand, sticks, fish particles, free fatty acid and water) present in the WGO were removed to prevent soap formation and low biodiesel yield, using the methods reported in previous work [5,24].

Experimental design
The Box-Behnken method (Minitab 17) [24] was used for the experimental design. The four process variables considered are methanol-oil mole ratio, catalyst concentration, reaction temperature and reaction time (Table 1).
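As an illustration of how such a design is laid out, the sketch below builds a four-factor Box-Behnken design in coded levels (-1, 0, +1) and maps it to real units. It is only a minimal sketch and not the Minitab procedure used in this work: the factor ranges are those stated in the abstract, all four variables are treated as continuous, and the number of centre points (`n_center=3`), the dictionary keys and the function names are assumptions made for illustration.

```python
from itertools import combinations, product

# Factor ranges (low, high) as stated in the abstract; key names are illustrative.
FACTORS = {
    "methanol_oil_mole_ratio": (6.0, 12.0),
    "catalyst_conc_wt_pct": (0.7, 1.7),
    "temperature_C": (48.0, 62.0),
    "time_min": (50.0, 90.0),
}


def box_behnken_coded(n_factors, n_center=3):
    """Return coded Box-Behnken runs: every pair of factors takes the four
    (+/-1, +/-1) combinations while the remaining factors sit at the centre (0)."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            run = [0] * n_factors
            run[i], run[j] = a, b
            runs.append(run)
    # Centre-point runs (the count is an assumption, not taken from the paper).
    runs.extend([[0] * n_factors for _ in range(n_center)])
    return runs


def decode(run, factors):
    """Map coded levels to real units: centre + level * half-range."""
    settings = {}
    for level, (name, (low, high)) in zip(run, factors.items()):
        settings[name] = (low + high) / 2 + level * (high - low) / 2
    return settings


coded = box_behnken_coded(len(FACTORS))
design = [decode(run, FACTORS) for run in coded]
print(len(design))  # 24 edge runs + 3 centre runs = 27
print(design[0])
```

With four factors there are six factor pairs and four sign combinations per pair, which gives 24 runs before the centre points are added.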

Biodiesel production
As described in previous work [5], treated WGO was transesterified by reacting it with methanol (using KOH and NaOH catalysts separately) in laboratory-scale reactors at the specified operating conditions (as indicated in Table 1).

Modelling of biodiesel yield using RSM and ANN
Both RSM and ANN tools were used to model the relationship between the four input variables (methanol/oil mole ratio, catalyst concentration, reaction temperature and reaction time) and biodiesel yield (output).
The RSM design used three continuous factors, one categorical factor, one block and one replicate. For the ANN model, the ANN toolbox in MATLAB R2016a was utilised. A log-sigmoid function was adopted at the hidden layer because of its high correlation profile. In the ANN toolbox, a total of 316 experimental values were utilised for training and testing the efficiency of the artificial neural network. The experimental data comprise four input parameters, namely methanol per oil mole ratio (α_m), catalyst concentration (α_c), reaction temperature (α_T) and reaction time (α_t), as shown in Fig. 1. In addition, the report of Hojjat et al. [25] advised grading (scaling) the input and response parameters. Thus, the input parameters were graded by dividing each column by its highest value in order to obtain values within the limit of zero to one (0-1), as shown in Tables 2 and 3. The study therefore used the graded parameters as input values in modelling the artificial neural network. Further, the data were sectioned into training, validation and test sets of 70%, 15% and 15% respectively. This research work chose the Levenberg-Marquardt algorithm, which works on error back-propagation, to train the ANN [26][27][28][29][30][31]. In addition, the mean square error was used to quantify the deviation of the experimental values from the ANN response values.
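The data preparation described above can be sketched as follows. This is a hedged illustration in Python with NumPy rather than the MATLAB toolbox used in the study; the function names, the random seed and the three example rows are assumptions, and the rows are illustrative settings, not data from this work. Dividing each column by its maximum gives values in (0, 1] only when all entries are positive, which is assumed here.

```python
import numpy as np


def grade_by_column_max(data):
    """Scale each column by its maximum so that positive values fall in (0, 1]."""
    data = np.asarray(data, dtype=float)
    return data / data.max(axis=0)


def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition sample indices into training, validation and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_val = int(round(fractions[1] * n_samples))
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])


def mse(targets, outputs):
    """Mean square error between target values and network outputs."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    return float(np.mean((targets - outputs) ** 2))


# Illustrative rows only: mole ratio, catalyst wt%, temperature (deg C), time (min).
example = [[6, 0.7, 48, 50], [9, 1.2, 55, 70], [12, 1.7, 62, 90]]
graded = grade_by_column_max(example)
train_idx, val_idx, test_idx = split_indices(316)

print(graded)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the split sizes are rounded, the three sets together always cover all 316 samples, with the test set absorbing the remainder.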

Biodiesel yields from KOH and NaOH catalysed transesterifications
The yields of biodiesel obtained from the transesterification reaction (using NaOH and KOH catalysts separately) are presented in Fig. 2. In general, the results revealed that KOH-catalysed transesterification produced a higher biodiesel yield than NaOH-catalysed transesterification under the same experimental conditions. That is, a higher percentage conversion to biodiesel was attained with the KOH catalyst under the same reaction conditions. This could be due to the fact that KOH is more active than NaOH in terms of chemical reactivity. The atomic radius of potassium is greater than that of sodium, so the single valence electron of potassium is located farther from the nucleus than that of sodium; hence, less energy is required to excite the single valence electron in potassium than in sodium during reaction [24].

Biodiesel yield model using RSM
The results of the model formulated between biodiesel yield and the four process variables using RSM (for both KOH and NaOH catalysts) are shown in Eqs. [...] also confirm the high reliability of the model, since these values indicated insignificant variance between the experimental data and the model-generated data. Tables 3 and 4 show the parameters utilised in the formulation of the model between the four input parameters and the responses (biodiesel yields), using KOH catalyst and NaOH catalyst respectively. Figs. 3 and 4 show the regression plots for the training, validation, test and overall data, using KOH and NaOH catalysts respectively, while Fig. 5(a) and (b) show the ANN model plots for the training, validation, test and overall data, using KOH and NaOH catalysts respectively.
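For readers unfamiliar with how a second-order RSM model is fitted, the sketch below estimates a full quadratic polynomial (intercept, linear, squared and two-factor interaction terms) by ordinary least squares. It is only an assumed illustration, not the Minitab fit reported here: the data are synthetic and noise-free, generated from randomly chosen coefficients, and every function and variable name is hypothetical.

```python
import numpy as np
from itertools import combinations


def quadratic_terms(X):
    """Build the design matrix [1, x_i, x_i^2, x_i*x_j] of a second-order model."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]                       # linear terms
    cols += [X[:, i] ** 2 for i in range(k)]                  # squared terms
    cols += [X[:, i] * X[:, j] for i, j in combinations(range(k), 2)]  # interactions
    return np.column_stack(cols)


def fit_rsm(X, y):
    """Least-squares estimate of the second-order model coefficients."""
    A = quadratic_terms(X)
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef


def predict_rsm(coef, X):
    """Evaluate the fitted second-order model at the rows of X."""
    return quadratic_terms(X) @ coef


# Synthetic demonstration only: 40 random coded settings of four factors.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(40, 4))
true_coef = rng.normal(size=15)          # 1 + 4 + 4 + 6 = 15 model terms
y = quadratic_terms(X) @ true_coef       # noise-free synthetic response

coef = fit_rsm(X, y)
print(np.allclose(coef, true_coef))
```

With four factors the model has 15 terms, so at least 15 distinct runs are needed before the coefficients can be estimated.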

Biodiesel yield model using ANN
A number of sigmoidal neurons with one to two hidden layers were trained with the generated feed-forward network. In addition, evaluation of the single-hidden-layer architecture showed that the mean squared error (MSE) increases, and an over-fitting effect is observed [27], when the number of neurons exceeds a certain limit. Moreover, this over-fitting may result in network perturbation and faulty prediction in spite of accurate input data [28]. In order to avoid erroneous outcomes due to premature convergence of the network to a local minimum, various weight initialisations need to be adopted in the training method [29,30]. This study varied the number of neurons from 1 to 20 under 1000 iterations, and the hidden layer suitable for this study was found at 12 neurons. Because the test mean square error began to diverge rapidly at higher neuron counts, the training was stopped. This research work adopted a topology of 4-12-1 because of the 4 graded input parameters, 12 hidden neurons and one response (output).
Fig. 4. Plots of regression for (i) training, (ii) validation, (iii) test and (iv) overall, catalysed by NaOH.
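To make the 4-12-1 topology concrete, the following sketch implements only the forward pass of such a network with a log-sigmoid hidden layer. It is a minimal illustration under stated assumptions: the weights are random rather than trained, the output layer is taken to be linear (the transfer function of the output layer is not stated in this work), and the class name, seed and input vector are hypothetical. Levenberg-Marquardt training is not shown.

```python
import numpy as np


def logsig(z):
    """Log-sigmoid transfer function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


class FeedForward4_12_1:
    """Untrained 4-12-1 feed-forward network (forward pass only)."""

    def __init__(self, n_in=4, n_hidden=12, n_out=1, seed=0):
        rng = np.random.default_rng(seed)
        # Random initial weights; a trained network would replace these.
        self.W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        x = np.atleast_2d(np.asarray(x, dtype=float))
        hidden = logsig(x @ self.W1.T + self.b1)   # 12 hidden activations
        return hidden @ self.W2.T + self.b2        # linear output (assumption)

    def n_parameters(self):
        return sum(p.size for p in (self.W1, self.b1, self.W2, self.b2))


net = FeedForward4_12_1()
graded_input = [0.5, 0.7, 0.8, 0.6]   # hypothetical graded inputs in (0, 1]
print(net.n_parameters())             # 4*12 + 12 + 12*1 + 1 = 73
print(net.forward(graded_input).shape)
```

The parameter count grows with the number of hidden neurons, which is one reason larger hidden layers become prone to over-fitting on a fixed data set.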
A trial-and-error method was used to achieve the lowest mean square error in the validation process and to tune the performance of the trained network so that the response values replicate the target values [31][32][33]. In Fig. 5(a), the minimum mean square error, which indicates how well the network is trained, is 0.000024207, which is close to 0 [28]; its validation value is 0.9944 and the overall regression coefficient of 0.92409 is close to 1 [30,31]. Similarly, the minimum mean square error in Fig. 5(b) is 0.000034504, which is also close to 0 [29]; its validation value is 0.98845 and the overall regression coefficient of 0.96258 is nearer to 1 than that of Fig. 5(a) [30,31]. Although the minimum mean square error of Fig. 5(a) converges faster than that of Fig. 5(b), the overall regression coefficient of Fig. 4 is higher than that of Fig. 3 by 0.03849. Therefore, the best ANN topology occurs where the mean squared error is in the neighbourhood of zero and the correlation coefficient is around one. The training of the network was stopped at the point where overshooting was noticed. Fig. 5(a) shows the best validation performance for WGO biodiesel in the presence of KOH catalyst, with an overall global minimum of 0.000024270 at epoch 2, while Fig. 5(b) shows the best validation performance for WGO biodiesel in the presence of NaOH catalyst, with an overall global minimum of 0.000034504 at epoch 9.

Statistical comparison of RSM and ANN models
Table 5 shows that, for the KOH-catalysed transesterification process, the ANN model had an overall regression coefficient R of 0.9241 and a coefficient of determination R² of 0.8539, while the R and R² calculated from RSM were 0.9290 and 0.8516. For NaOH-catalysed WGO biodiesel production, the overall R and R² of the ANN model were 0.9629 and 0.9272, while the R and R² calculated from RSM were 0.9210 and 0.8791. Furthermore, the fit between experimental and predicted data for both KOH and NaOH catalysts with the ANN tool shows good agreement, and the validation sets reveal that the ANN forecasts are accurate and reliable [33].
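The two statistics compared above can be computed from paired experimental and predicted yields as sketched below. This is an illustrative calculation only: the yield values are made up for the example and are not data from this study, and the function names are assumptions. R is taken as the Pearson correlation between experimental and predicted values, and R² as one minus the ratio of the residual to the total sum of squares.

```python
import numpy as np


def correlation_r(observed, predicted):
    """Pearson correlation coefficient R between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1])


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up yields (%) for illustration; not experimental data from this work.
observed = [80.0, 85.0, 90.0, 95.0]
predicted = [81.0, 84.0, 91.0, 94.0]

print(correlation_r(observed, predicted))
print(r_squared(observed, predicted))
```

When R² is taken as the square of R, the ANN values quoted for the NaOH-catalysed process are mutually consistent, since 0.9629 squared is approximately 0.9272.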

Conclusion
This study has compared the predictive power of RSM and ANN models in the transesterification of WGO catalysed by KOH and NaOH. The results reveal that the KOH catalyst produced a higher yield of biodiesel (as shown in Fig. 2). Also, the results typify the robustness and superiority of ANN over RSM in predicting and solving complex problems, specifically in the transesterification of biodiesel, due to the larger values of R and R² as recorded in Table 5. Interestingly, the visual observation of Tables 4 -5