Optimization of Anaerobic Treatment of Petroleum Refinery Wastewater Using Artificial Neural Networks

Anaerobic treatment of petroleum refinery wastewater has many advantages over other biological methods, particularly for complex wastewater. In this study, accumulated data from Up-flow Anaerobic Sludge Blanket (UASB) reactors treating petroleum refinery wastewater under six different volumetric organic loads (0.58, 1.21, 0.89, 2.34, 1.47 and 4.14 kg COD/m³·d) were used to develop a mathematical model that could simulate the process pattern. The data consist of 160 entries gathered over approximately 180 days from two UASB reactors operating continuously in parallel. Artificial neural network software was used to model the reactor behavior under the different applied loads. Two transfer functions were compared and different numbers of neurons were tested to find the optimum model for predicting the reactor pattern. A tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer, with 12 neurons, was selected as the optimum model.


INTRODUCTION
Petroleum refinery wastewater is hazardous to the environment because it contains several organic and inorganic pollutants of environmental concern, including ammonia, oil, phenol, sulphur-based contaminants and heavy metals (Vohra et al., 2006). One of the most important monitoring parameters for wastewater treatment plants is Chemical Oxygen Demand (COD), as it requires a relatively short analysis time (Khan et al., 2006).
Anaerobic treatment of high-strength petroleum refinery wastewaters (generally >1000 mg COD/L) (Behling et al., 1997) has been shown to provide a very cost-effective alternative to aerobic processes, with savings in energy, nutrient addition and reactor volume. Petroleum refinery wastewater is degraded under anaerobic conditions, and many toxic and recalcitrant organic compounds found in petroleum wastewater serve as growth substrates (Metcalf and Eddy, 2003). The Up-flow Anaerobic Sludge Blanket (UASB) reactor is one of the most notable developments in anaerobic treatment technology and is commonly used to treat a wide range of industrial wastewaters. UASB is a high-rate system that retains biomass and offers high treatment capacity and a low site area requirement, in addition to other advantages (Zinatizadeh et al., 2007).
Using software to simulate existing historical experimental data and to predict unknown data from a model representing the process helps to minimize experimental effort and generates estimates for conditions that were not measured. The modeling and simulation of processes have been developed using ever more complex deterministic models, owing to the recent evolution of the personal computer (Gontarski et al., 2000).
Neural Networks (NN), widely known as Artificial Neural Networks (ANN), are a mathematical modeling tool used to simulate complex relationships by mimicking, at a simplified level, the activity of the human brain through a large number of highly interconnected processing elements (neurons). They have been used in artificial intelligence applications and have shown considerable promise in engineering, pattern recognition and analysis (Hamed et al., 2004). Anaerobic digestion is a non-linear process that requires a non-linear control strategy; artificial neural networks are the method of choice when a large amount of anaerobic digestion data is available but there is no reliable model and little knowledge of how the process works (Ward et al., 2008).
ANN has been used to model existing data and to simulate predicted behavior in many wastewater treatment processes in order to ease operation. Artificial neural networks are claimed to have a distinctive advantage over some other nonlinear estimation methods used for bio-processes, as they do not require any prior knowledge about the structure of the relationships that exist between important controlling variables (Holubar et al., 2002).
ANN has been used to simulate a full working wastewater treatment plant using a model developed from ten months of laboratory data. Modeling of this wastewater treatment process used a configuration with the tan sigmoid activation function for the input and hidden layers and the linear activation function for the output layer, resulting in R² values ranging from 0.63 to 0.81 for Biochemical Oxygen Demand (BOD) and from 0.45 to 0.65 for Suspended Solids (SS) (Hamed et al., 2004). Using the same configuration, Chemical Oxygen Demand (COD) removal was modeled using ANN in a wastewater treatment process for the prediction and simulation of degradation. A back propagation neural network with 14 neurons and the Levenberg-Marquardt back propagation training algorithm (TRAINLM) predicted the actual experimental results with a correlation coefficient (R²) of 0.997 and an MSE of 0.000376 (Elmolla et al., 2010).
Anaerobic biological treatment of wastewater was modeled based on integrated fuzzy systems and neural networks for the simulation and control of complex anaerobic treatment systems (anaerobic fluidized bed reactor and up-flow anaerobic sludge blanket) (Tay and Zhang, 1998).
Several Feed-Forward Back Propagation neural networks (FFBP) were trained to model, and subsequently control, methane production in four anaerobic continuous stirred tank reactors. The model was able to predict gas production and avoid shock loadings (Holubar et al., 2002).
Using a neural network simulation, the anaerobic wastewater treatment process has been modeled to identify the potentially damaging events that occur during disturbances to anaerobic digestion. The neural network was capable of rapidly recognizing disturbances in the form of an increase in influent COD concentration by utilizing data from an online bicarbonate alkalinity sensor (Wilcox et al., 1995).
A batch of high-strength wastewater (7300 mg COD/L) from a local petroleum refinery was treated in a UASB reactor as part of a train of biological reactors; the COD removal was found to be 80% (Gasim et al., 2012a). Two parallel UASB reactors were used to evaluate the treatment efficiency for petroleum refinery wastewater under six volumetric organic loading rates (0.58, 0.89, 1.21, 1.47, 2.34 and 4.14 kg COD/m³·d); the corresponding COD removal efficiencies were 78, 82, 83, 80, 81 and 75%, respectively (Gasim et al., 2012b, c).
The present study follows from the previous investigation by modeling the anaerobic treatment of petroleum refinery wastewater considering the influent and effluent COD concentration under different loads; the developed model was then used to simulate the reactor performance.

MATERIALS AND METHODS
Experimental data: The raw data were adapted from previous work and represent two laboratory-scale Up-flow Anaerobic Sludge Blanket (UASB) reactors (A and B) operated in parallel at room temperature to treat petroleum refinery wastewater. The raw petroleum refinery wastewater was collected from a local petroleum refinery and fed to the two reactors at different concentrations, ranging on average from 982 to 6972 mg COD/L, over approximately 180 days. Chemical Oxygen Demand (COD) was measured in influent and effluent samples by the colorimetric method using a HACH DR 2000 spectrophotometer; other parameters were measured according to Standard Methods (APHA, 1980). The experiment yielded 160 data entries for influent and effluent.
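The paired influent and effluent COD measurements are what the removal efficiencies are derived from. As a minimal sketch, assuming the usual definition of removal efficiency (the fraction of influent COD that does not appear in the effluent), the calculation can be written as follows; the function name and the sample values are illustrative and are not taken from the data set.

```python
def cod_removal_percent(influent_mg_l: float, effluent_mg_l: float) -> float:
    """Return the percentage of influent COD removed across the reactor.

    Assumes the conventional definition:
        removal (%) = (influent - effluent) / influent * 100
    """
    if influent_mg_l <= 0:
        raise ValueError("influent COD must be positive")
    return (influent_mg_l - effluent_mg_l) / influent_mg_l * 100.0


# Hypothetical sample (not a measured value from the study):
# 1000 mg COD/L entering the reactor, 200 mg COD/L leaving it.
print(cod_removal_percent(1000.0, 200.0))
```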
ANN procedure: The COD monitoring results under the different loadings were used for modeling. An artificial neural network was used as the mathematical tool to simulate and predict the pattern of the reactor. Optimal generalization was targeted; therefore, a Feed-Forward Back Propagation network type was selected, with the Levenberg-Marquardt algorithm as the training function and the batch gradient descent with momentum back propagation algorithm (TRAINGDM) as the adaption learning function. The number of neurons has to be determined because it is related to the convergence of the output error function during training. Increasing the number of neurons usually improves learning performance: too few neurons limit the ability of the neural network to model the process, but too many may result in a loss of generalization and in learning the noise present in the training data (Holubar et al., 2002).
Input data were normalized by dividing all input values by the maximum input, which placed the data in the range 0 to 1. Output data were normalized in the same way, by dividing all output values by the maximum output, which likewise placed the data in the range 0 to 1.
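The normalization described above can be sketched in a few lines. The study itself used MATLAB; the Python below is only an illustration, and the function names and the middle sample value are assumptions. Keeping the maximum alongside the scaled values allows predictions to be converted back to mg COD/L.

```python
def normalize_by_max(values):
    """Scale positive values into the range 0 to 1 by dividing by the maximum.

    Returns the scaled list and the maximum, which is needed later to
    convert normalized predictions back to the original units.
    """
    max_value = max(values)
    if max_value <= 0:
        raise ValueError("maximum must be positive for max-normalization")
    return [v / max_value for v in values], max_value


def denormalize(scaled_values, max_value):
    """Undo normalize_by_max: map 0-1 values back to the original units."""
    return [s * max_value for s in scaled_values]


# 982 and 6972 mg COD/L are the average influent limits reported in the
# text; 3500 is a made-up intermediate value for illustration only.
influent = [982.0, 3500.0, 6972.0]
scaled, influent_max = normalize_by_max(influent)
print(scaled)
print(denormalize(scaled, influent_max))
```

Because the data are divided by the maximum rather than min-max scaled, the smallest normalized value is greater than 0 unless the data contain a zero.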
The number of neurons was varied from 5 to 35. For better initialization, the model was run 100 times at each number of neurons tested. The optimum number of neurons was selected in this study based on minimum RMSE, maximum VAF, maximum R² and minimum MAPE. The Neural Network tool in MATLAB (R2009a) was used with a three-layer back propagation neural network in two configurations: first, with a Log Sigmoid transfer function (LOGSIG) at the hidden layer and a linear transfer function (PURELIN) at the output layer; second, with a Tangent Sigmoid transfer function (TANSIG) at the hidden layer and a linear transfer function (PURELIN) at the output layer. The linear activation function (PURELIN) was used for the output neuron in both configurations since it is appropriate for continuous-valued targets (Hamed et al., 2004).
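To make the two configurations concrete, the sketch below writes out the three transfer functions using their standard mathematical definitions (tansig is equivalent to the hyperbolic tangent, logsig is the logistic function, purelin is the identity) and a forward pass through one hidden layer. It assumes a single input (normalized influent COD) and a single output (normalized effluent COD), which the text implies but does not state explicitly. The `forward` helper and all weights are hypothetical placeholders, not the trained network.

```python
import math


def tansig(n: float) -> float:
    """Tangent sigmoid: squashes any real input into (-1, 1)."""
    return math.tanh(n)


def logsig(n: float) -> float:
    """Log sigmoid: squashes any real input into (0, 1)."""
    if n >= 0:
        return 1.0 / (1.0 + math.exp(-n))
    e = math.exp(n)  # avoids overflow for large negative n
    return e / (1.0 + e)


def purelin(n: float) -> float:
    """Linear transfer function: returns its input unchanged."""
    return n


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias,
            hidden_fn=tansig):
    """One input -> hidden layer (hidden_fn) -> one linear output neuron."""
    hidden = [hidden_fn(w * x + b)
              for w, b in zip(hidden_weights, hidden_biases)]
    weighted_sum = sum(v * h for v, h in zip(output_weights, hidden))
    return purelin(weighted_sum + output_bias)


# Placeholder parameters for a 3-neuron hidden layer (the study tested
# 5 to 35 neurons); these numbers are arbitrary, not fitted values.
w_hidden = [0.8, -1.2, 0.5]
b_hidden = [0.1, 0.3, -0.2]
w_out = [0.4, -0.6, 0.9]
b_out = 0.05

x = 0.5  # a normalized influent COD value
print(forward(x, w_hidden, b_hidden, w_out, b_out, hidden_fn=tansig))
print(forward(x, w_hidden, b_hidden, w_out, b_out, hidden_fn=logsig))
```

Only the hidden-layer function differs between the two configurations; the output neuron is linear in both, so predictions are not confined to a bounded range.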

RESULTS AND DISCUSSION
Modeling results: The raw data from the anaerobic reactors were modeled using artificial neural network software. The Logsig-Purelin transfer function configuration was compared with the Tansig-Purelin configuration to determine the optimum model. The selected model was then used to predict the reactor performance, and the simulation data were used to find the optimum performance.
During testing and validation, the number of neurons was varied from 5 to 35. Table 1 shows the numbers of neurons tested and the scores recorded for RMSE, VAF, R² and MAPE during evaluation of the Logsig-Purelin and Tansig-Purelin transfer functions.
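The four scores are not defined in the text. The sketch below uses commonly cited definitions, which are assumptions here and may differ in detail from those applied in the study: root mean square error (RMSE), variance accounted for (VAF, in percent), the coefficient of determination (R²) and mean absolute percentage error (MAPE, in percent). The sample values are illustrative only.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _variance(xs):
    """Population variance."""
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def rmse(measured, predicted):
    """Root mean square error; lower is better."""
    return math.sqrt(_mean([(m - p) ** 2 for m, p in zip(measured, predicted)]))


def vaf(measured, predicted):
    """Variance accounted for, in percent; higher is better (max 100)."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    return (1.0 - _variance(residuals) / _variance(measured)) * 100.0


def r_squared(measured, predicted):
    """Coefficient of determination; higher is better (max 1)."""
    mean_m = _mean(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def mape(measured, predicted):
    """Mean absolute percentage error; lower is better.

    Undefined when a measured value is zero.
    """
    return _mean([abs((m - p) / m) for m, p in zip(measured, predicted)]) * 100.0


# Made-up effluent COD values (mg/L), for illustration only.
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 310.0]
for name, fn in [("RMSE", rmse), ("VAF", vaf), ("R2", r_squared), ("MAPE", mape)]:
    print(name, fn(measured, predicted))
```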
Although 5 to 35 neurons were tested, Fig. 1 shows that beyond 15 neurons the R² of the training set diverges from the R² of the validation set. This indicates overfitting: the model can no longer generalize the pattern learned from the training set to the validation data (Jeon, 2007).
Thus, the number of neurons was limited to the range of 5 to 15, and the optimum number of neurons was selected, as shown in Table 1, based on minimum RMSE, maximum VAF, maximum R² and minimum MAPE.
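The text does not say how the four criteria are combined when they disagree. One possible approach, assumed here purely for illustration, is to rank the candidates on each criterion and pick the lowest rank sum, breaking ties in favour of fewer neurons. The `select_optimum` helper and every score below are hypothetical placeholders, not values from Table 1.

```python
def select_optimum(scores):
    """Pick the neuron count with the best combined rank over four criteria.

    scores maps a neuron count to a dict with keys "rmse", "vaf", "r2"
    and "mape". Rank 0 is best on each criterion. Candidates tied on one
    criterion are ranked in input order; a tie in the rank sum goes to
    the smaller network.
    """
    higher_is_better = {"rmse": False, "vaf": True, "r2": True, "mape": False}
    totals = {n: 0 for n in scores}
    for name, descending in higher_is_better.items():
        ordered = sorted(scores, key=lambda n: scores[n][name],
                         reverse=descending)
        for rank, n in enumerate(ordered):
            totals[n] += rank
    return min(totals, key=lambda n: (totals[n], n))


# Placeholder scores for four candidate sizes (hypothetical numbers).
candidate_scores = {
    5: {"rmse": 0.080, "vaf": 90.0, "r2": 0.90, "mape": 12.0},
    10: {"rmse": 0.060, "vaf": 93.0, "r2": 0.93, "mape": 10.0},
    12: {"rmse": 0.050, "vaf": 95.0, "r2": 0.95, "mape": 8.0},
    15: {"rmse": 0.055, "vaf": 94.0, "r2": 0.94, "mape": 9.0},
}
print(select_optimum(candidate_scores))
```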
The Logsig-Purelin transfer function indicated 15 neurons as the optimum, while Tansig-Purelin suggested 12 neurons. Simpler models with fewer parameters are usually preferable to more complicated ones with more parameters, whenever feasible (Hamed et al., 2004; Holubar et al., 2002). Thus, the tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer, with 12 neurons, is the optimum configuration. Figure 2 shows the measured experimental data and the ANN predictions for the eighty data entries used for training. Figure 3 shows the measured experimental data and the ANN predictions for the eighty data entries used for validation. The best selected model predicts the actual experimental results well; hence, it was then used for simulation.
Simulation results: The best model, with the Tansig-Purelin transfer function and 12 neurons, was used to simulate random data in order to find the optimum efficiency. Figure 4 shows all one hundred and sixty data entries used for training and validation, used here for simulation. Random data entries ranging from 500 to 10000 were used to simulate the reactor performance. Figure 5 shows the simulated influent and effluent

CONCLUSION
Raw data from the treatment of petroleum refinery wastewater under different loads in two UASB reactors were successfully used for modeling. The modeling produced an optimum model that uses a tangent sigmoid transfer function (tansig) at the hidden layer and a linear transfer function (purelin) at the output layer, with 12 neurons. Simulation using the optimum model with random data entries ranging from 500 to 9000 produced a pattern that simulates the reactor performance for conditions that were never experimentally tested in the laboratory. The laboratory experiments showed a highest removal of 82%, which was confirmed by the best selected model.

ACKNOWLEDGMENT
This study was supported by the Universiti Teknologi PETRONAS graduate assistantship scheme.
