Data on artificial neural network and response surface methodology analysis of biodiesel production

The production of biodiesel from waste soybean oil (WSO), using NaOH and KOH catalysts independently, was investigated in this study. Two optimization tools, artificial neural network (ANN) and response surface methodology (RSM), were used to model the relationship between biodiesel yield and process parameters. The variables employed in the experimental design of biodiesel yields were methanol-oil mole ratio (6–12), catalyst concentration (0.7–1.7 wt/wt%), reaction temperature (48–62 °C) and reaction time (50–90 min). Both tools predicted the regression models accurately, with R-sq values of 0.93 and 0.98 for RSM and ANN respectively. © 2020 Published by Elsevier


Specifications
The variables employed in the generation of biodiesel yield data were methanol-oil mole ratio (6–12), catalyst concentration (0.7–1.7 wt/wt%), reaction temperature (48–62 °C) and reaction time (50–90 minutes). A Box-Behnken BB(4) design in the Minitab 17 environment was used for the design of experiments; the ANN algorithm and RSM software were used for optimization studies. The esterification process of free fatty acid (FFA) removal involved the addition of 40 mL of a mixture of 25 mL isopropyl alcohol and 15 mL benzene solution to waste soybean oil (heated to 55 °C), followed by 2 drops of phenolphthalein; the mixture was then titrated with 0.

Value of the data
• The data aided the determination of the quality level and efficiency of the two processes investigated in biodiesel production.
• The data are useful to authors in the field of renewable energy (at the global level). This is because the data revealed the processing conditions for the production of high yield and high-quality WSO-biodiesel.
• The ANN and RSM data could be used to predict reliable models that comparatively connect yields, processing conditions and the quality of yields (in terms of statistical tools).
• The ANN and RSM data could be used to improve the biodiesel production process by focusing on the operating conditions that gave high-quality value of biodiesel.
• The data showed the comparative assessment of KOH and NaOH catalysed processes (under the same operating conditions) during the WSO biodiesel process. This is a piece of important information in the consideration of a more suitable catalyst for WSO biodiesel process.

Data Description
Fig. 1 depicts the ANN architecture in which the 27 samples (from the experimental design) were distributed into training, validation and testing in the proportion of 19 samples (70%), 4 samples (15%) and 4 samples (15%) respectively. The experimental design (using Box-Behnken BB(4)) for WSO biodiesel production at three levels and four process parameters (X1, X2, X3 and X4) is presented in Table 1. However, when the ANN technique was applied, these experimental input parameters were re-assigned αm, αc, αT and αt to represent the mole ratio of methanol to oil, concentration of catalyst, temperature of reaction and time of reaction respectively. For ease of application of the ANN algorithm, these four input parameters and the yields were computed by dividing each column of the actual process variables by the highest value in the column so as to have a range of zero (0) to one (1) values.

Table 1. Experimental design for WSO biodiesel production using Box-Behnken BB(4)
Process parameter; Notation; Levels
Methanol to oil mole ratio; X1; 6:1, 9:

Table 2 shows the process parameters and the WSO biodiesel yield obtained from the RSM and ANN tools. Fig. 2 shows the comparative analyses of the WSO biodiesel yields from KOH and NaOH catalysed processes. Eqs. 1 and 2 show the respective RSM models that relate the biodiesel yields and the four process variables investigated using KOH and NaOH catalysts. The fitness and reliability of the models were revealed by the high values of R-sq and R-sq (adj.), low values of probability value (p ≤ 0.05 at significant level), the low values of Sum of Errors (SE) coefficient and high values of F, as presented in Tables 3-4 for the statistical analyses of the RSM models for WSO biodiesel production using KOH and NaOH catalysts respectively.

[Figure: mean squared error and error histogram for WSO biodiesel production using NaOH catalyst]

The best ANN topology could be observed at the least mean squared error value of approximately zero (0) and a correlation coefficient of one (1). The training of the network was terminated at the overshooting point.
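The column-wise scaling described above (dividing each column by its highest value) can be illustrated with a minimal Python sketch. The function name `scale_by_column_max` is hypothetical, and the yield values in the sample rows are invented for illustration only; they are not taken from the dataset. The process-variable values are the low, middle and high levels of the stated ranges.

```python
def scale_by_column_max(rows):
    """Divide every entry by the largest value found in its column."""
    col_max = [max(col) for col in zip(*rows)]
    return [[value / m for value, m in zip(row, col_max)] for row in rows]


# Columns: mole ratio, catalyst (wt/wt%), temperature (deg C), time (min), yield (%).
# NOTE: the yield column holds made-up placeholder numbers, not measured data.
runs = [
    [6.0, 0.7, 48.0, 50.0, 80.0],
    [9.0, 1.2, 55.0, 70.0, 90.0],
    [12.0, 1.7, 62.0, 90.0, 85.0],
]

scaled = scale_by_column_max(runs)
for row in scaled:
    print([round(v, 3) for v in row])
```

With strictly positive data, this scaling maps the largest entry of each column to one and every other entry to a value above zero, so the lower end of the range is approached only when a column contains values much smaller than its maximum.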

Experimental Design, Materials, and Methods
The pre-treatment of WSO involves the removal of impurities, free fatty acid (FFA) and water to prevent low biodiesel yield (and even the formation of other products) during the transesterification reaction [1–6]. Hence, WSO was pre-treated before the transesterification process was carried out. Impurities were removed by filtration (using an industrial sieve of 50 μm pore size). FFA was removed through the esterification process, which involved mixing 25 mL isopropyl alcohol with 10 mL benzene. 40 mL of this solution was then added to WSO and heated to 55 °C. 2 drops of phenolphthalein were added to the mixture, which was then titrated with 0.1 M KOH until a permanent pale pink colouration was observed. Water was removed by heating the esterified oil at 110 °C for 20 min [7, 8].
[Figure: Error histogram for biodiesel production from WSO using NaOH catalyst]
A Box-Behnken BB(4) experimental design (through the application of Minitab 17 software) was used for the design of experiments. The software was also used for statistical analysis, which involved the plots of experimental biodiesel yields and statistical modelling. The procedures used for the transesterification of the waste soybean oil using methanol and the two forms of catalysts (KOH and NaOH) separately were exhaustively outlined in Ayoola et al. [9, 10].
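For readers without access to Minitab, the structure of a four-factor Box-Behnken design can be sketched in Python as follows. This is the textbook construction (every pair of factors at its low/high combinations with the remaining factors at the centre, plus centre points), not a reproduction of Minitab's output or run order. The choice of three centre points is an assumption made so that the total matches the 27 runs reported, and the centre levels are assumed to be the midpoints of the stated ranges. The names `box_behnken`, `decode` and `LOW_HIGH` are hypothetical.

```python
from itertools import combinations, product


def box_behnken(n_factors, n_center):
    """Coded (-1, 0, +1) Box-Behnken runs: factor pairs at +/-1, others at 0."""
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * n_factors
            point[i], point[j] = a, b
            runs.append(point)
    # Centre points (assumed count; chosen to give 27 runs in total).
    runs.extend([0] * n_factors for _ in range(n_center))
    return runs


# (low, high) per factor: mole ratio, catalyst wt/wt%, temperature deg C, time min.
LOW_HIGH = [(6, 12), (0.7, 1.7), (48, 62), (50, 90)]


def decode(point):
    """Map coded levels to actual values, assuming the centre is the midpoint."""
    return [(lo + hi) / 2 + c * (hi - lo) / 2 for c, (lo, hi) in zip(point, LOW_HIGH)]


design = box_behnken(n_factors=4, n_center=3)
actual = [decode(p) for p in design]
print(len(design))  # 6 factor pairs x 4 combinations + 3 centre points
```

Each non-centre run varies exactly two factors away from the centre, which is why the design never requires all four variables at their extremes simultaneously.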
The 27 samples from the experimental design were distributed into training, validation and testing in the proportion of 19 samples (70%), 4 samples (15%) and 4 samples (15%) respectively. To generate the R-sq values and least error conditions, the plots of regression (training, validation, test and overall), mean squared error and error histogram were made using the ANN algorithm.
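The 70/15/15 partition and the two goodness-of-fit quantities named above can be sketched in Python as follows. The source does not state how individual samples were assigned to each subset, so the seeded random shuffle here is an assumption, and the helper names `split_indices`, `r_squared` and `mean_squared_error` are hypothetical. The formulas are the standard definitions of the coefficient of determination and mean squared error.

```python
import random


def split_indices(n_samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle sample indices and cut them into train/validation/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # random assignment is an assumption
    n_train = round(n_samples * train_frac)  # 27 * 0.70 = 18.9 -> 19
    n_val = round(n_samples * val_frac)      # 27 * 0.15 = 4.05 -> 4
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]         # remaining samples
    return train, val, test


def mean_squared_error(observed, predicted):
    """Average of squared differences between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


train, val, test = split_indices(27)
print(len(train), len(val), len(test))
```

For 27 samples the rounding reproduces the 19/4/4 split reported above; other sample counts may need an explicit rule for distributing the remainder.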

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships which have, or could be perceived to have, influenced the work reported in this article.