Biomedical Prediction of Radial Size of Powdered Element using Artificial Neural Network

Silver nanoparticles (Ag-NPs) are biosynthesized from aqueous silver nitrate (AgNO3) solution through a green and simple route using tuber powder extracts of Curcuma longa (C. longa). The aim is to model an Artificial Neural Network (ANN) using seven existing training algorithms in MATLAB to forecast the size of the silver nanoparticles, with the volume of C. longa extract, the volume of AgNO3, the stirring time and the reaction temperature as input parameters. Several techniques, including Quasi-Newton, Conjugate Gradient and Levenberg-Marquardt, are employed to train the designed ANN model, a feed-forward backpropagation network, with different combinations of architecture and transfer functions. Each algorithm is tuned for its best performance by calculating the Regression (R), Mean Square Error (MSE), Mean Absolute Error (MAE) and Error Sum of Squares (SSE); the results are then compared to identify the optimum training algorithm for this nanoengineering application. Finally, based on the simulation results, the optimum network is proposed.

Nanoparticles (NPs) are a broad class of materials that includes particulate substances with at least one dimension smaller than 100 nm [1]. Metal nanoparticles of noble metals such as silver have mainly been used for experimental purposes because of their robust optical properties. This leads to a large number of applications in areas such as photography, dentistry, electronics, the food industry and clothing [2]. The shape and size of metal nanoparticles are typically measured using techniques such as Scanning Electron Microscopy and Transmission Electron Microscopy [3].
In the last decade, increasing use of artificial intelligence tools has been observed in nanotechnology research. Artificial intelligence can be used for the classification of material properties at the nanoscale, design, simulation, nanocomputing and so on [4]. An artificial neural network (ANN) is an efficient and dynamic simulation tool that allows one to classify, predict or estimate relationships between inputs and outputs [2]. ANNs can solve difficult problems such as stock exchange prediction, image compression and face recognition. These tasks may be carried out without any prior information.

Artificial neural network
An ANN is a computational technique that uses a learning paradigm together with processing nodes and attempts to establish a relationship between the input and output data [5]. There are two major learning paradigms that an ANN can employ: supervised learning and unsupervised learning [6].
A basic ANN comprises three primitive layers: an input layer, a hidden layer and an output layer, as illustrated in Fig. 1. These layers contain nodes, also called artificial neurons, and mathematical functions associated with weights or coefficients that build the structure of the neural network [7]. When an input and the corresponding target are provided to the ANN model (the training in this case is supervised), the error is calculated from the difference between the system output and the target response. This error information is fed back during the training phase (backpropagation, or BP, learning) and the weights are adjusted accordingly, thereby improving the system parameters. The process is repeated until the desired performance is achieved [8].
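The error-feedback loop described above can be illustrated on the smallest possible case. The following is a minimal sketch, assuming a single linear neuron trained with the delta rule rather than full backpropagation through hidden layers; the function name, learning rate, performance goal and toy data are illustrative assumptions, not values from the study.

```python
def train_linear_neuron(samples, lr=0.2, goal=1e-6, max_epochs=5000):
    """Adjust one weight and one bias until the mean squared error meets the goal."""
    w, b = 0.0, 0.0
    epochs_run = 0
    for epoch in range(max_epochs):
        squared_error = 0.0
        for x, target in samples:
            output = w * x + b          # system output
            error = target - output     # difference from the target response
            w += lr * error * x         # feed the error back into the weight
            b += lr * error             # and into the bias
            squared_error += error ** 2
        epochs_run = epoch + 1
        if squared_error / len(samples) < goal:
            break                       # desired performance reached
    return w, b, epochs_run

# Toy targets generated from target = 2*x + 1 (hypothetical data).
samples = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
w, b, epochs_run = train_linear_neuron(samples)
print(w, b, epochs_run)
```

The loop stops as soon as the performance goal is met, mirroring the repetition until the desired performance is achieved.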
Several types of ANNs have been designed with different configurations, with either a single layer or multiple layers of neurons. A multilayer perceptron (MLP) is well suited to complex problems. By introducing additional hidden layers, an MLP overcomes the drawback of the single-layer perceptron. In a conventional feed-forward MLP network, the inputs are multiplied by the weights, and the weighted signals from each input are then summed and passed to a transfer function, which gives the output of that particular neuron [7].
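The computation performed by a single neuron can be sketched as follows. The transfer function names purelin, logsig and tansig are the MATLAB names used later in this paper; the input values, weights and bias are arbitrary illustrative numbers, and the bias term is an assumption added for completeness.

```python
import math

def purelin(x):
    return x                             # linear transfer function

def logsig(x):
    return 1.0 / (1.0 + math.exp(-x))    # log-sigmoid, output in (0, 1)

def tansig(x):
    return math.tanh(x)                  # tan-sigmoid, output in (-1, 1)

def neuron_output(inputs, weights, bias, transfer):
    """Multiply inputs by weights, sum them with the bias, apply the transfer function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(net)

inputs = [0.5, -1.0, 2.0, 0.0]       # four inputs, as in the network studied here
weights = [0.1, 0.2, -0.3, 0.4]      # illustrative values only
bias = 0.05

for transfer in (purelin, logsig, tansig):
    print(transfer.__name__, neuron_output(inputs, weights, bias, transfer))
```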

Learning algorithms
There are several types of training algorithms that can be adopted to train an ANN. MATLAB provides 9 different algorithms for an Engine Data Set problem, of which the top 7 are explored in this study.

Conjugate Gradient
Conjugate Gradient (CG) starts by searching in the steepest descent direction (the negative of the gradient) in its first iteration. Before the next search direction is determined, a line search is performed to find the optimal distance to move along the current search direction. The next search direction is then determined by combining the new steepest descent direction with the preceding search direction, so that the two search directions are conjugate [9]. Several versions of CG exist, differing in how the constant that weights the previous search direction is computed. For the Fletcher-Reeves Update (CGF), the constant is calculated as the ratio of the norm squared of the present gradient to the norm squared of the previous gradient [7].
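The Fletcher-Reeves constant and the resulting search direction can be written out in a few lines. This is a minimal sketch with made-up gradient vectors; the function names are illustrative, and the line search that precedes each update is omitted.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fletcher_reeves_beta(grad_new, grad_old):
    """Ratio of the squared norm of the present gradient to that of the previous one."""
    return dot(grad_new, grad_new) / dot(grad_old, grad_old)

def next_direction(grad_new, prev_direction, beta):
    """Combine the new steepest descent direction with the preceding search direction."""
    return [-g + beta * d for g, d in zip(grad_new, prev_direction)]

grad_old = [3.0, 4.0]                     # illustrative gradients
grad_new = [1.0, 2.0]
prev_direction = [-g for g in grad_old]   # the first search is along steepest descent

beta = fletcher_reeves_beta(grad_new, grad_old)
direction = next_direction(grad_new, prev_direction, beta)
print(beta, direction)
```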

Polak-Ribière Update (traincgp)
Another variant of the CG algorithm is the Polak-Ribière Update (CGP). In CGP, the constant is calculated as the inner product of the previous change in the gradient with the current gradient, divided by the norm squared of the previous gradient. CGP requires more storage than CGF [6].
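For comparison, the constant for this variant can be sketched as below, using made-up gradient vectors and an illustrative function name. Because the formula uses the change in gradient, the previous gradient vector itself must be kept, not only its norm.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def polak_ribiere_beta(grad_new, grad_old):
    """Inner product of the gradient change with the current gradient,
    divided by the squared norm of the previous gradient."""
    grad_change = [gn - go for gn, go in zip(grad_new, grad_old)]
    return dot(grad_change, grad_new) / dot(grad_old, grad_old)

grad_old = [3.0, 4.0]   # illustrative gradients
grad_new = [1.0, 2.0]

beta = polak_ribiere_beta(grad_new, grad_old)
print(beta)
```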

Scaled Conjugate Gradient (trainscg)
Scaled Conjugate Gradient (SCG) does not require a line search at every iteration and instead employs a step-size scaling mechanism, which reduces the time consumption and makes SCG the fastest among the second-order algorithms, although the number of iterations needed for the algorithm to converge may increase [9].

Quasi-Newton
Newton's method provides improved optimization and converges faster than CG techniques. However, its elementary step requires the Hessian matrix (second derivatives) of the performance index at the current values of the weights and biases, which is time-consuming to compute and makes the method complex for feed-forward ANNs. A class of algorithms based on Newton's method, called quasi-Newton or secant methods, does not require the computation of second derivatives. Instead, an approximate Hessian matrix is updated in each iteration of the algorithm [6].

Broyden-Fletcher-Goldfarb-Shanno (trainbfg)
In Broyden-Fletcher-Goldfarb-Shanno (BFGS), the approximate Hessian matrix is stored with dimension n x n, where n is the number of weights and biases in the ANN model. Although it converges in fewer iterations, it requires more computation and storage than CG methods [7, 9].
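To give a sense of the storage involved, the sketch below counts the weights and biases of a network with 4 inputs, one hidden layer and 1 output (the layout used in this study) and the resulting number of entries in the n x n approximate Hessian. It assumes one bias per hidden and output neuron, which the text does not state; the function names are illustrative.

```python
def count_parameters(n_inputs, n_hidden, n_outputs):
    """Number of weights and biases in a single-hidden-layer network."""
    hidden_layer = n_inputs * n_hidden + n_hidden     # weights plus assumed biases
    output_layer = n_hidden * n_outputs + n_outputs   # weights plus assumed biases
    return hidden_layer + output_layer

def hessian_entries(n_params):
    """Entries in an n x n approximate Hessian."""
    return n_params * n_params

for n_hidden in (10, 20, 30):
    n = count_parameters(4, n_hidden, 1)
    print(n_hidden, n, hessian_entries(n))
```

Under these assumptions the matrix grows quadratically with the number of hidden neurons, which is the storage cost that CG methods avoid.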

One Step Secant Algorithm (trainoss)
The One Step Secant (OSS) technique assumes that, at every iteration, the preceding Hessian matrix is the identity matrix. It therefore does not store the complete Hessian, which gives it the additional benefit of calculating the new search direction without computing a matrix inverse [6].

Levenberg-Marquardt (trainlm)
The Levenberg-Marquardt (LM) training algorithm is a numerical least-squares technique for minimizing non-linear functions [10]. The LM method computes a Jacobian matrix that contains the first derivatives of the network errors with respect to the weights and biases. Calculating the Jacobian matrix with the standard BP technique is less complicated than calculating the Hessian matrix [6].
The LM algorithm first initializes the weights of the network and then computes the outputs and errors for all the input responses. Subsequently, the Jacobian matrix is calculated and the new weights are obtained. A new error value is determined from these weights and compared with the current error value. If the new error is smaller, the regularization parameter µ is reduced by a factor of β; otherwise it is increased by a factor of β. The procedure is repeated until the error falls below a predefined value or a stopping condition is met [10].
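The accept/reject logic for the regularization parameter can be sketched on a small curve-fitting problem. This is not the network training used in the study: it assumes a hypothetical two-parameter model y = a*exp(b*x) in place of the network, a synthetic noise-free data set, and illustrative starting values for the regularization parameter (mu) and its scaling factor (beta).

```python
import math

def residuals(params, xs, ys):
    a, b = params
    return [a * math.exp(b * x) - y for x, y in zip(xs, ys)]

def jacobian(params, xs):
    """First derivatives of each residual with respect to a and b."""
    a, b = params
    return [[math.exp(b * x), a * x * math.exp(b * x)] for x in xs]

def sum_squares(res):
    return sum(r * r for r in res)

def lm_fit(xs, ys, params, mu=0.01, beta=10.0, goal=1e-12, max_iter=200):
    err = sum_squares(residuals(params, xs, ys))
    for _ in range(max_iter):
        if err < goal:
            break                                   # error below the predefined value
        res = residuals(params, xs, ys)
        J = jacobian(params, xs)
        # Solve (J^T J + mu*I) * delta = -J^T r for the 2x2 case.
        a11 = sum(row[0] * row[0] for row in J) + mu
        a12 = sum(row[0] * row[1] for row in J)
        a22 = sum(row[1] * row[1] for row in J) + mu
        g1 = sum(row[0] * r for row, r in zip(J, res))
        g2 = sum(row[1] * r for row, r in zip(J, res))
        det = a11 * a22 - a12 * a12
        d1 = (-g1 * a22 + g2 * a12) / det
        d2 = (-g2 * a11 + g1 * a12) / det
        trial = (params[0] + d1, params[1] + d2)
        trial_err = sum_squares(residuals(trial, xs, ys))
        if trial_err < err:
            params, err = trial, trial_err
            mu /= beta                              # smaller error: reduce mu
        else:
            mu *= beta                              # larger error: increase mu, retry
    return params, err

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]          # synthetic data with a=2, b=0.5
fitted, final_err = lm_fit(xs, ys, (1.0, 0.1))
print(fitted, final_err)
```

A rejected step leaves the parameters unchanged and only raises mu, so the next trial step is shorter and closer to gradient descent.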

Network design
Data Set
In this study, the sample data employed to train the ANN model is presented in [2, Table I]. The database is split into a training set, a validation set and a testing set. The training set is used for learning, that is, to fit the parameters by adjusting the weights of the network in each iteration to reduce the error [2, 11]. The validation set tunes the parameters: it is used to vary and improve the structure of the ANN, such as the training function, the transfer function, and the number of hidden layers and neurons [2, 11]. The test set is used only to assess the effectiveness and efficiency of the ANN [2]. Table I of [2] presents the four parameters used as inputs to predict the size of the Ag-NPs, along with the actual size of the nanoparticles obtained.
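A three-way split of this kind can be sketched as follows. The paper does not state the proportions used, so the 70/15/15 division, the random seed and the placeholder sample list are all assumptions made for illustration.

```python
import random

def split_dataset(samples, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the samples and divide them into training, validation and test sets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # whatever remains is held out for testing
    return train, val, test

samples = list(range(20))                    # placeholder sample indices, not real data
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```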

METHODOLOGY
An appropriate ANN model requires a learning algorithm, a transfer function, and a suitable number of hidden layers and neurons. The framework used to build and select the appropriate ANN model for the chosen application is shown in Fig. 2. The most common learning technique in ANNs is BP, which uses supervised learning. A supervised learning paradigm compares the output response with the target response to calculate the learning error. This learning error is used to adjust the network parameters and enhance the performance of the network [5]. In this paper, the designed network has four input parameters and one output parameter. Thus, the ANN is constructed with 4 neurons in the input layer and 1 neuron in the output layer. The number of neurons in the hidden layer and the transfer function are varied to find the most suitable architecture for the application. The final evaluation of each network is done using Mean Square Error (MSE), Mean Absolute Error (MAE), Error Sum of Squares (SSE) and Regression (R).
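The values of these indices can be calculated using the following equations,

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - P_i\right)^2 \qquad (1)$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|Y_i - P_i\right| \qquad (2)$$
$$\mathrm{SSE} = \sum_{i=1}^{n}\left(Y_i - P_i\right)^2 \qquad (3)$$

where n is the number of points, Y_i is the value predicted by the ANN model and P_i is the actual value [2]. R, the determination coefficient of the linear regression between the values predicted by the ANN model and the target output, measures the quality of the fit: the closer R is to 1, the better the model fits the actual data [12].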
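The four indices can be computed directly from paired predicted and actual values. The sketch below assumes the standard definitions of MSE, MAE and SSE and computes R as the Pearson correlation coefficient between predictions and targets; the sample values are made up for illustration and are not taken from the data set.

```python
import math

def mse(predicted, actual):
    return sum((y - p) ** 2 for y, p in zip(predicted, actual)) / len(predicted)

def mae(predicted, actual):
    return sum(abs(y - p) for y, p in zip(predicted, actual)) / len(predicted)

def sse(predicted, actual):
    return sum((y - p) ** 2 for y, p in zip(predicted, actual))

def regression_r(predicted, actual):
    """Correlation coefficient of the linear regression between outputs and targets."""
    n = len(predicted)
    mean_y = sum(predicted) / n
    mean_p = sum(actual) / n
    cov = sum((y - mean_y) * (p - mean_p) for y, p in zip(predicted, actual))
    var_y = sum((y - mean_y) ** 2 for y in predicted)
    var_p = sum((p - mean_p) ** 2 for p in actual)
    return cov / math.sqrt(var_y * var_p)

predicted = [2.0, 4.0, 6.0]   # illustrative values only
actual = [1.0, 4.0, 8.0]

print(mse(predicted, actual), mae(predicted, actual),
      sse(predicted, actual), regression_r(predicted, actual))
```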

RESULTS AND DISCUSSIONS
All 7 algorithms are coded in MATLAB R2012b (version 8.0.0.783). The study is carried out with one input layer, one hidden layer and one output layer. The architecture of the ANN model is changed by altering the number of neurons in the hidden layer (10, 20 and 30) along with the transfer functions (purelin, logsig and tansig) in both the hidden and output layers. Table II presents the values obtained for the various architectures and transfer function arrangements of each algorithm. Normalization of all the input data in accordance with the transfer function is the first step of the calculation before using the neural networks. The last step is the de-normalization of the output data [2]. To select the optimum architecture for the application, the performance indicators (1)-(3) and R between the target response and the output obtained are analyzed.
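The normalization and de-normalization steps can be sketched with a simple min-max mapping. The paper does not specify the scaling used, so the target range of [-1, 1] (a natural choice for tansig), the function names and the sample temperatures are assumptions for illustration.

```python
def normalize(values, lo=-1.0, hi=1.0):
    """Map values linearly onto [lo, hi]; also return the bounds needed to undo it."""
    vmin, vmax = min(values), max(values)
    scaled = [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]
    return scaled, (vmin, vmax)

def denormalize(scaled, bounds, lo=-1.0, hi=1.0):
    """Invert normalize() using the stored bounds."""
    vmin, vmax = bounds
    return [vmin + (s - lo) * (vmax - vmin) / (hi - lo) for s in scaled]

temperatures = [25.0, 40.0, 55.0, 70.0]      # hypothetical inputs, not the study data
scaled, bounds = normalize(temperatures)
restored = denormalize(scaled, bounds)
print(scaled, restored)
```

For a logsig layer the same functions could be called with lo=0.0 and hi=1.0, again as an assumption rather than the authors' stated procedure.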
The values of the other indices, comprising MSE, MAE and SSE, are recorded in Fig. 3, 4 and 5. A transfer function is applied to both the hidden and output layers in the ANN model. For example, in Fig. 3, the label (2-1) denotes the use of the logsig transfer function in the hidden layer and the purelin transfer function in the output layer. All the other combinations follow the same pattern.
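The labelling scheme and the resulting set of candidate networks can be enumerated as below. The text confirms only that 2 denotes logsig and 1 denotes purelin; the mapping of 3 to tansig is inferred by elimination, and the assumption that every pairing of hidden and output transfer functions is tried for each hidden-layer size is also not stated explicitly.

```python
from itertools import product

# 1 and 2 follow the (2-1) example in the text; 3 -> tansig is an assumption.
TRANSFER_CODES = {1: "purelin", 2: "logsig", 3: "tansig"}
HIDDEN_SIZES = (10, 20, 30)

def decode_label(label):
    """Turn a figure label such as '2-1' into (hidden, output) transfer functions."""
    hidden_code, output_code = (int(part) for part in label.split("-"))
    return TRANSFER_CODES[hidden_code], TRANSFER_CODES[output_code]

# Every (hidden neurons, hidden transfer, output transfer) combination.
configurations = [
    (n_hidden, TRANSFER_CODES[h], TRANSFER_CODES[o])
    for n_hidden, h, o in product(HIDDEN_SIZES, TRANSFER_CODES, TRANSFER_CODES)
]

print(decode_label("2-1"))
print(len(configurations))
```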
The values of the indices are computed within the MATLAB code itself. As presented in Table II, the optimum network for traincgb has 10 neurons in the hidden layer with logsig; purelin as the activation functions. The corresponding MSE is 0.003, and all the other MSE readings are larger than that of the optimum network found. The R values for this network are 0.9864, 0.9877 and 0.9711. For traincgf, the optimum network is 10, logsig; purelin, with an MSE of 0.027 and R values of 0.9880, 0.9981 and 0.9784, whereas traincgp gives its optimal results with 10, tansig; logsig, where the MSE is 0.0026 and the R values are 0.9924, 0.9824 and 0.9634. The trainscg algorithm gives its best results with 10, tansig; purelin as its architecture and activation functions; the MSE is 0.0028 and the R values are 0.9882, 0.9885 and 0.9932. For trainbfg, the MSE of 0.0018 is the same for the 10, logsig; purelin and 10, tansig; purelin networks. In this case, the optimal network is chosen by comparing the R values and the best validation performance, which gives 10, logsig; purelin as the most favorable architecture for trainbfg, with a best validation performance of 0.0016 at epoch 21 and R values of 0.9920, 0.9663 and 0.9917. For trainoss, the best MSE is 0.0027 with R values of 0.9838, 0.9881 and 0.9745, obtained with 20, logsig; purelin. Finally, for trainlm, the MSE is 0.00007 and the R values are the nearest to 1 (0.9977, 0.9968 and 0.9959) when the network has 10 neurons in the hidden layer and logsig; purelin as the activation functions.
The effect of each of the seven algorithms on the output response, as the architecture of the ANN model and the transfer functions in the hidden and output layers are varied, is shown in Fig. 3, 4 and 5. The ANN models simulated with the various training functions differ in the number of neurons in their hidden layer. The MSE of all the recorded responses is illustrated in Fig. 3. MSE is an important criterion for measuring the overall performance of a designed ANN model. Fig. 4 plots the MAE against the total number of nodes in the hidden layer and the activation function for all 7 algorithms used to design the various ANN models. The absolute error is the absolute value of the difference between the target value provided to the ANN model for training and the actual value obtained. Fig. 5 plots the error sum of squares, which measures the total deviation of the obtained values from the fitted (regression) line, against the total number of nodes in the hidden layer and the activation function. The smaller the value of SSE, the better the regression line.

CONCLUSION
In this research, the size of the Ag-NPs is predicted using ANN modeling with different combinations of architectures and transfer functions. A feed-forward neural network model captures the effect of the volume of C. longa extract, the stirring time, the temperature and the volume of AgNO3 on the nanocomposite behavior. The ANN model is simulated, trained and tested on the dataset with learning algorithms such as Quasi-Newton, Conjugate Gradient and Levenberg-Marquardt. The work shows that Levenberg-Marquardt is the best-suited algorithm for this application when an engine data set type problem is considered. It converges in fewer epochs and takes less time than all the other training algorithms. Within each algorithm, some architectures performed well, with R values close to 1. The experiment shows that ANN is an effective tool for studying problems in nanoengineering, as the size of the silver nanoparticles can be predicted without costly and time-consuming tests.

Fig. 1. Basic architecture of a Neural Network Model
Fig. 2. Flowchart of the methodology used

Fig. 3. Effect of Mean Squared Error on total nodes in the hidden layer and activation function on each algorithm

Table I. Experimental Values for Prediction of the Size of Ag-NPs

Table II. Results and Comparison of Algorithms Using Different Architectures and Transfer Functions