Quality Prediction Model Based on Novel Elman Neural Network Ensemble

In this paper, we propose a novel prediction algorithm based on an improved Elman neural network (NN) ensemble for quality prediction, thus achieving quality control of designed products at the product design stage. First, the Elman NN parameters are optimized using the grasshopper optimization (GRO) method, and then the weighted average method is improved to combine the outputs of the individual NNs, where the weights are determined by the training errors. Simulations were conducted to compare the proposed method with other NN methods and evaluate its performance. The results demonstrated that the proposed algorithm for quality prediction obtained better accuracy than other NN methods. The Elman NN is combined with GRO to yield an optimized Elman network ensemble model with high generalization ability and prediction accuracy.


Introduction
During product design, once the scheme design is complete, designers expect to obtain a reliable estimation of the overall characteristic index of the product system for the purpose of further adjusting the designing parameters and improving overall performance. For large complex product systems, there are many designing parameters, and complex relationships, which are nonlinear and strongly coupled, often exist between these parameters and the product's overall quality characteristics. Therefore, establishing an organic link between designing parameters and a product's overall quality characteristics is an essential problem that needs to be solved during product design.
Once the designing parameters are determined, designers start to conduct a detailed technical design according to corresponding technical codes, in addition to their experience. During this stage, differences caused by different designers and technicians only appear in part of the product and do not have a substantial influence on the quality characteristics of the entire product system. Therefore, there exists a certain correspondence between the product's quality characteristic index and the various designing parameters determined in the scheme's design phase. However, this type of correspondence is difficult to express using an exact explicit function. For some small and simple products, the overall characteristics are determined accurately with the aid of approaches such as the finite element and modal synthesis methods. The overall characteristics can even be obtained by creating models and conducting relative experiments.
Many quality prediction techniques are available in the literature, such as the Bayesian model [1], support vector machine [2], neural network (NN) [3][4][5][6], state-dependent parameter model [7], and deep learning techniques [8]. Yu et al. [1] proposed a Bayesian model-based multi-kernel Gaussian process regression approach to predict the quality of nonlinear batch processes with multiple operating phases. Bidar et al. [7] proposed a data-driven soft sensor approach for online quality prediction, which uses state-dependent parameter models. A comparison with other methods showed that the proposed model is much more robust and reliable with fewer model parameters, which makes it useful for industrial applications. With regard to quality prediction in manufacturing, Bai et al. [9] conducted a comparative study of intelligent learning approaches. Two categories of intelligent learning approaches, that is, shallow learning and deep learning, were investigated and compared for manufacturing quality prediction. The authors' experiments showed that the deep framework outperformed the shallow architecture in terms of the root-mean-square error (RMSE), threshold statistics, and mean absolute percentage error. Zheng et al. [10] proposed a new phase adaptive relevance vector machine (RVM) model for quality prediction in multiphase batch processes. Based on the information transfer of relevance vectors in each RVM model, different phases are connected one after another, which provides simultaneous information for the prediction of final product quality. To predict the quality of a complex production process, Zhang et al. [11] proposed a multimodel modeling approach based on fuzzy C-means clustering and support vector regression to solve the problems of nonlinearity, a wide operating condition range, and difficult prediction. Svalina et al. [12] provided models based on moving least-squares and moving least absolute deviations methods to predict machined surface quality. Comparisons with NNs were also conducted to show that the total mean square deviation in the proposed models was considerably higher than that in the application of the NN method. To address issues of the real-time prediction of final product quality during batch operation, Wang [13] presented a robust data-driven modeling approach using available process information up to the current points to capture their time-varying relationships with final product quality during the course of operation. Chu et al. [14] proposed an improved JYKPLS (Joint-Y kernel partial least squares) process transfer model to solve the issue of final quality prediction for new batch processes. To cope with the difficulty of online quality prediction for multigrade processes, Liu et al. [15] proposed a just-in-time latent variable modeling method based on extracting the common and special features of multigrade processes. Garcia et al. [16] applied regression models to estimate physical quality indices in a tube extrusion process. Park et al. [17] developed a system for predicting the printed part quality during the SLM (selective laser melting) process by simulation. Zhu et al. [18] presented an improved analytical variation propagation model for a multistage machining process based on a geometric constraint equation. Their study provided a method to predict part deviation under the influence of the fixture error, datum error, and machining error.
Additionally, there are many studies on water quality prediction. Quality control is a process through which one seeks to ensure that product quality is maintained or improved with either reduced or zero errors. In this study, we address "quality control" at the stage of product design. The quality of the final product is a dependent variable that is determined by some independent input variables. We explore the relationship between the independent input variables and the dependent variable (product quality). In this sense, although air and water quality control are not fields clearly related to the case studied in this paper, the logic of quality control is similar, that is, the control of final quality through the control of independent input variables. Final quality is predicted through establishing the relationship between input variables and quality characteristics. Liu et al. [2] proposed a water quality prediction model based on support vector regression. A hybrid approach, known as real-value genetic algorithm support vector regression, was presented and applied to predict aquaculture water quality using data collected from the aquatic factories of Yixing in China. Li et al. [8] proposed a novel spatiotemporal deep learning-based air quality prediction method that inherently considers spatial and temporal correlations. Statistical models are widely used in the prediction of water quality. Avila et al. [19] compared the performance of a wide range of statistical models, including a naïve model, multiple linear regression, dynamic regression, regression tree, Markov chain, classification tree, random forest, multinomial logistic regression, and Bayesian network, in the prediction of water quality for the weekly data collected over the summer months from 2006 to 2014 from the Oreti River in Wallacetown in New Zealand. Their results demonstrated that the Bayesian network was superior to all other models.
However, for large complex products (systems), all these methods are difficult to apply. Our purpose is to propose a novel prediction algorithm based on an improved Elman NN ensemble for quality prediction. The contributions of this paper include the following.
(1) A novel quality prediction approach based on the Elman NN ensemble is proposed. The prediction model helps to establish an organic link between designing parameters and a product's overall quality characteristics, which is essential in product design.
(2) A novel grasshopper optimization (GRO)-based Elman NN ensemble model is designed to improve the generalization ability and effectiveness for quality prediction. First, the GRO algorithm is used to enhance the performance of an individual Elman NN with respect to the required quality prediction. Then, while yielding an NN ensemble, the improved weighted average is used to combine the outputs of different individual networks, in which the weights are determined by the training error of the individual NNs.
The remainder of this paper is structured as follows: In Section 2, we present a literature review, which provides the basis of our research. In Section 3, we present the quality prediction model based on novel Elman NN ensembles. We conduct simulations of our proposed approach, in addition to comparisons with other algorithms, in Section 4. In Section 5, we provide a conclusion.

Literature Review
The artificial neural network (ANN) has been one of the main techniques for quality prediction. NNs have been applied to various areas for prediction. Xu and Liu [6] combined the wavelet transform with the BPNN to establish a short-term wavelet NN water quality prediction model. Najah et al. [20] investigated different artificial intelligence techniques in water quality prediction, including multilayer perceptron NNs, ensemble NNs, and the support vector machine, to develop a computationally efficient and robust approach for predicting water quality. Han et al. [5] presented a flexible structure radial basis function NN (FS-RBFNN) and applied it to water quality prediction. To build an ANN with a self-organizing architecture and suitable learning algorithm for nonlinear system modeling, Han et al. [21] developed an automatic axon-NN, which can self-organize the architecture and weights, thus improving network performance. Sheoran et al. [22] proposed a hybrid method for software quality prediction, which used an advanced NN that incorporated a hybrid cuckoo search optimization algorithm for better prediction accuracy. Russo et al. [4] applied optimal ANNs to air quality forecasting. Their proposed model used methods in stochastic data analysis to derive a small set of stochastic variables that represent relevant information about a multivariate stochastic system and are used as input for NNs. Liu et al. [3] presented a SNCCDBAGG-based NN ensemble approach for quality prediction in the injection molding process, in which bagging is used to create NNs for the ensemble and negative correlation learning via correlation-corrected data (NCCD) is used to achieve negative correlation of each network's error against errors for the remainder of the ensemble. Ma et al. [23] presented a new outsourcing model for privacy-preserving neural network prediction under a framework of two noncolluding servers. They proposed a fully noninteractive privacy-preserving neural network prediction scheme. Olawoyin [24] used a BP ANN as a prediction tool to study the potential toxicity of PAH carcinogens in soils. Liang et al. [25] proposed a neural network prediction control model based on an improved rolling optimization algorithm. The control model realizes advanced prediction to achieve intelligent control of unbalanced drilling's underpressure value and performs fast and stable self-feedback control of the output prediction results. Gu et al. [26] used an improved BP neural network based on the GA algorithm to develop a yield-irrigation prediction model for a subsurface drip irrigation system. Bardak et al. [27] presented an application of ANN to predict the wood bonding quality based on pressing conditions. Heidari et al. [28] optimized the multilayer perceptron NN using the GRO algorithm, which was applied to many popular datasets.
As seen from the above review of related work, many studies have considered the problem of quality prediction. There are many studies on water quality prediction, air quality prediction, and quality prediction in manufacturing. However, publications on quality prediction during the stage of product design are scarce. Modeling the relationships between designing parameters and a product's overall quality characteristics has been ignored. Considering the NN's ability to simulate the complex input and output relationship of a real system and its powerful ability in nonlinear modeling, a quality prediction model based on a NN is built in this study. To improve the prediction accuracy, a novel prediction algorithm based on the Elman ensemble is proposed. First, the Elman NN parameters are optimized using the GRO algorithm, and then the weighted average method is improved to combine the outputs of the individual NNs, where the weights are determined by the training errors.

Quality Prediction Model Based on Elman Neural Network Ensembles
Elman Neural Network. The Elman NN is a type of locally recurrent network, which is considered as a special type of feedforward NN with additional memory neurons and local feedback. Thus, the BP algorithm, which is typically used for feedforward NN training, can be used to train the Elman network.
As shown in Figure 1, the Elman NN consists of the context layer, input layer, hidden layer, and output layer [29][30][31]. $W_1$ denotes the weight from the context layer to the hidden layer, $W_2$ denotes the weight from the input layer to the hidden layer, and $W_3$ denotes the weight from the hidden layer to the output layer. $u(k-1)$ denotes the network input vector at the $(k-1)$th iteration, $x(k)$ denotes the hidden layer output vector at the $k$th iteration, and $y(k)$ denotes the network output vector at the $k$th iteration. The context layer retains the hidden layer output vector from the previous iteration; that is, $x_c(k)$ denotes the context layer output vector at the $k$th iteration, and its value equals the hidden layer output vector at the $(k-1)$th iteration. The thresholds of the hidden layer unit and output layer are $\theta_h$ and $\theta_o$, respectively. Suppose the transfer functions of the hidden layer and output layer units are $f(\cdot)$ and $h(\cdot)$, respectively; then the transfer relations of the Elman network are expressed as
$$x(k) = f\left(W_1 x_c(k) + W_2 u(k-1) + \theta_h\right), \tag{1}$$
where $x_c(k) = x(k-1)$.
The output of the Elman NN is represented as
$$y(k) = h\left(W_3 x(k) + \theta_o\right). \tag{2}$$
The Elman NN is one of the most widely used and most effective NN models in ANNs and has powerful processing ability for nonlinear decisions [30]. Because it has better learning efficiency, approximation ability, and memory ability than other neural networks, the Elman NN can be used not only in time series prediction but also in system identification and prediction [32,33]. Therefore, the Elman NN is chosen for quality prediction in our paper.
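The transfer relations described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `elman_forward`, the choice of tanh for the hidden transfer function, the identity output function, the zero initial context, and the random weights are all assumptions made only for illustration.

```python
import numpy as np

def elman_forward(inputs, W1, W2, W3, theta_h, theta_o):
    """Run an Elman network over a sequence of input vectors.

    W1: context -> hidden, W2: input -> hidden, W3: hidden -> output.
    Assumptions: hidden transfer function is tanh, output transfer
    function is the identity, and the context layer starts at zero.
    """
    context = np.zeros(W1.shape[0])
    outputs = []
    for u in inputs:
        hidden = np.tanh(W1 @ context + W2 @ u + theta_h)
        outputs.append(W3 @ hidden + theta_o)
        context = hidden  # the context layer stores the previous hidden output
    return np.array(outputs)

# Hypothetical sizes: 7 inputs, 3 hidden (and context) units, 3 outputs.
rng = np.random.default_rng(0)
n_in, n_h, n_out = 7, 3, 3
W1 = rng.normal(scale=0.1, size=(n_h, n_h))
W2 = rng.normal(scale=0.1, size=(n_h, n_in))
W3 = rng.normal(scale=0.1, size=(n_out, n_h))
theta_h = np.zeros(n_h)
theta_o = np.zeros(n_out)

seq = rng.uniform(-1.0, 1.0, size=(5, n_in))
y = elman_forward(seq, W1, W2, W3, theta_h, theta_o)
print(y.shape)
```

Because the context starts at zero, the first output depends only on the input-to-hidden and hidden-to-output weights; the recurrent weights begin to matter from the second step onward.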
However, the Elman NN also has some disadvantages, such as a low convergence rate, a tendency to become trapped at a local minimum, and the lack of theory to determine the initial weights and thresholds of the network. Generally, optimization methods can overcome the deficiencies of NNs. Additionally, the NN ensemble is a technique that can significantly improve the generalization ability of NNs through training a number of NNs and then combining them ([31]; He and Cao, 2012). Generally, the most common methods for the NN ensemble model are simple averaging and weighted averaging for regression problems [34]. In this study, the GRO method is used to make up for the deficiencies of the Elman NN, and the input weights and output layer thresholds are optimized. Then a novel method based on the training error is established to construct an ensemble model.
Grasshopper Optimization Algorithm. The GRO algorithm was proposed by Saremi et al. [34]. The main characteristics of a swarm of grasshoppers in the larval phase are slow movement and small steps. The mathematical model used to simulate the swarming behavior of grasshoppers, which can provide random behavior, is presented as follows:
$$X_i = r_1 S_i + r_2 G_i + r_3 A_i, \tag{3}$$
where $X_i$ denotes the position of the $i$th grasshopper, $S_i$ denotes social interaction, $G_i$ denotes the gravity force on the $i$th grasshopper, $A_i$ denotes the wind advection, and $r_1$, $r_2$, and $r_3$ are random numbers in [0, 1]. The mathematical model (3) cannot be used directly to solve optimization problems, mainly because grasshoppers quickly reach the comfort zone and the swarm does not converge to a specified point. A modified version of this equation is proposed as follows to solve optimization problems:
$$X_i^d = c\left(\sum_{j=1,\, j\neq i}^{N} c\,\frac{ub_d - lb_d}{2}\, s\left(\left|x_j^d - x_i^d\right|\right)\frac{x_j - x_i}{d_{ij}}\right) + \hat{T}_d, \tag{4}$$
where $ub_d$ denotes the upper bound in the $d$th dimension, $lb_d$ denotes the lower bound in the $d$th dimension, $\hat{T}_d$ denotes the value of the $d$th dimension in the target, and $c$ denotes a decreasing coefficient to shrink the comfort zone, repulsion zone, and attraction zone. Here $N$ is the number of grasshoppers, $s(\cdot)$ is the social force function, and $d_{ij}$ is the distance between the $i$th and $j$th grasshoppers. Equation (4) shows that the next position of a grasshopper is defined based on its current position, the position of the target, and the positions of all other grasshoppers. Note that the first component of this equation considers the location of the current grasshopper with respect to other grasshoppers.
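The position update described above can be sketched in Python as follows. This is a simplified illustration rather than the reference implementation: the function names, the social force $s(r) = f e^{-r/l} - e^{-r}$ with $f = 0.5$ and $l = 1.5$, the linear decrease of $c$ from 1 to 0.00004, and the sphere test function are assumptions, and the distance normalization used in the original algorithm is omitted for brevity.

```python
import numpy as np

def social_force(r, f=0.5, l=1.5):
    # Assumed form of the social force: attraction minus repulsion.
    return f * np.exp(-r / l) - np.exp(-r)

def gro_minimize(objective, lb, ub, n_agents=10, n_iter=30,
                 c_max=1.0, c_min=4e-5, seed=0):
    """Minimal grasshopper-style optimizer (sketch, not the reference code)."""
    rng = np.random.default_rng(seed)
    lb = np.asarray(lb, dtype=float)
    ub = np.asarray(ub, dtype=float)
    X = rng.uniform(lb, ub, size=(n_agents, lb.size))
    fitness = np.array([objective(x) for x in X])
    best, best_val = X[fitness.argmin()].copy(), float(fitness.min())
    history = []
    for it in range(n_iter):
        c = c_max - it * (c_max - c_min) / n_iter  # shrinking coefficient
        X_new = np.empty_like(X)
        for i in range(n_agents):
            social = np.zeros(lb.size)
            for j in range(n_agents):
                if j == i:
                    continue
                diff = X[j] - X[i]
                dist = np.linalg.norm(diff) + 1e-12  # avoid division by zero
                social += c * (ub - lb) / 2 * social_force(np.abs(diff)) * diff / dist
            # Move relative to the target and bring the agent back inside the bounds.
            X_new[i] = np.clip(c * social + best, lb, ub)
        X = X_new
        fitness = np.array([objective(x) for x in X])
        if fitness.min() < best_val:
            best, best_val = X[fitness.argmin()].copy(), float(fitness.min())
        history.append(best_val)
    return best, best_val, history

def sphere(x):
    return float(np.sum(x ** 2))

best, best_val, history = gro_minimize(sphere, [-5.0] * 3, [5.0] * 3)
print(best, best_val)
```

The best value found so far is kept across iterations, so the recorded history never increases even when individual grasshoppers move to worse positions.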
The pseudocode of the GRO algorithm is as follows [34]:
    Initialize the swarm $X_i$ ($i = 1, 2, \ldots, N$)
    Calculate the fitness of each search agent and set $X_{best}$ to the best one
    while the maximum number of iterations is not reached
        Update the coefficient $c$
        for each search agent
            Update the position of the current search agent
            Bring the current agent back if it goes outside the boundaries
        end for
        Update $X_{best}$ if there is a better solution
    end while
    Return $X_{best}$
Improved Elman NN. In this study, the GRO algorithm is used to support NN training, as shown in Figure 2. In the Elman-GRO combination, the NN manages nonlinearity, whereas GRO is used to optimize the NN parameters for an accurate fit. GRO optimizes a problem using individual grasshoppers, which move around in the exploration space according to a simple mathematical model. A grasshopper position represents one of the solutions for determining input weights and output thresholds for an Elman NN. The length of the grasshopper position equals the dimensionality of the input weights and output layer thresholds. The objective of the GRO is to determine an input weight and output threshold vector that can result in the minimum RMSE from the NN.
The RMSE is designed as the objective function; thus, the GRO is engaged to minimize the RMSE during the training phase. It measures the difference between the values predicted by the NN and the true values. The RMSE of the prediction is determined as the square root of the mean-squared error and is given by
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(V_i - \hat{V}_i\right)^2}, \tag{5}$$
where $V_i$ denotes the originally observed value of the $i$th data instance, $\hat{V}_i$ denotes the value predicted by the NN, and $n$ is the number of data instances.
Convergence is checked after each iteration of GRO instead of each epoch. The objective of the optimization is to determine an input weight and output threshold vector that can meet the requirement of RMSE, which is written as
$$\mathrm{RMSE} \le \varepsilon, \tag{6}$$
where $\varepsilon$ is the desired threshold. As shown in Figure 2, if the convergence condition is met, the optimization stops.
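As a small worked illustration of the objective function and the stopping rule, the sketch below computes the RMSE for a toy set of observed and predicted values and compares it with a threshold. The function names, the sample values, and the threshold values are hypothetical.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(v - p) ** 2 for v, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def converged(observed, predicted, eps):
    """Stopping rule: the RMSE has reached the desired threshold eps."""
    return rmse(observed, predicted) <= eps

observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 6.0]
# Squared errors are 0, 0, 0, 4; their mean is 1, so the RMSE is 1.0.
print(rmse(observed, predicted))
print(converged(observed, predicted, eps=0.5))
```

In the optimization loop, a check of this kind would be evaluated once per GRO iteration on the training set.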
NN Ensemble Model for Quality Prediction. An ANN ensemble is a composite model of multiple NNs, with better generalization ability and stability than a single network. When using the improved Elman NN to construct the NN ensemble, first, the individual NNs are produced, and then the outputs of the individual NNs are combined.
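Bagging is a statistical resampling technique for generating training data for each individual in an ensemble model [35,36]. Each training dataset is drawn randomly with replacement from the original training set. Additionally, the size of each training set is typically the same as that of the original training set. The bagging technique increases the difference between individuals using bootstrap sampling and improves the generalization ability of the NN ensemble. Then, the outputs of the individual networks are weighted to form the overall output $\bar{y}$ of the ensemble:
$$\bar{y}(x) = \sum_{i=1}^{M} w_i\, y^{(i)}(x), \tag{7}$$
where $x$ is the input vector, $y^{(i)}(x)$ is the output of the $i$th individual network, $w_i$ is the weight applied to the output of the $i$th individual network, and $M$ is the number of NNs in the ensemble model. In this algorithm, $w_i$ is designed according to the training error $E_i$ of each NN, where $E = [E_1, E_2, \ldots, E_M]$ and $\mathrm{sum}(\cdot)$ is the summation function of a vector.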
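The bagging and weighted-combination steps can be sketched as follows. The exact weight formula of the proposed method is not reproduced here; the sketch assumes, purely for illustration, weights proportional to the inverse training error and normalized to sum to one, so that networks with smaller training errors contribute more. All function names and numeric values are hypothetical.

```python
import numpy as np

def bootstrap_indices(n_samples, rng):
    """Draw a bootstrap sample (with replacement) of the same size as the training set."""
    return rng.integers(0, n_samples, size=n_samples)

def error_based_weights(train_errors):
    """Assumed weighting rule: inverse training error, normalized to sum to 1."""
    errors = np.asarray(train_errors, dtype=float)
    inverse = 1.0 / errors
    return inverse / inverse.sum()

def ensemble_predict(member_outputs, weights):
    """Weighted sum over members; member_outputs has shape (M, n_samples, n_outputs)."""
    return np.tensordot(weights, member_outputs, axes=1)

rng = np.random.default_rng(0)
idx = bootstrap_indices(25, rng)  # indices used to train one member

train_errors = [0.1, 0.2, 0.2, 0.5]  # hypothetical training RMSE of 4 networks
weights = error_based_weights(train_errors)

# Hypothetical member predictions: 4 networks, 7 test samples, 3 quality indices.
member_outputs = np.stack([np.full((7, 3), v) for v in (1.0, 2.0, 3.0, 4.0)])
prediction = ensemble_predict(member_outputs, weights)
print(weights, prediction.shape)
```

With this rule the network with training error 0.1 receives the largest weight, 10/22, and the one with error 0.5 receives the smallest, 2/22.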

Simulation of Quality Prediction and Analysis
Simulation. To verify the effectiveness of the proposed approach, computer simulations were conducted. A 10 m diameter circular paraboloid antenna is considered as an example in this section. The quality prediction model was established on the basis of relationships between the designing parameters and quality characteristics indices. A circular paraboloid antenna is a structure with high precision. It is widely used in areas such as aeronautics and astronautics, radar technique, and satellite communication. The requirements of its quality characteristics include reflector accuracy, self-weight deformation requirement, and homologous design requirement. As shown in Figure 3, the section characteristics of beams (1), (2), (3), (6), and (7) and the heights of beams (4) and (8) greatly affect the overall quality of the antenna. $h_6$ and $h_8$ were introduced to denote the ordinates of point 6 and point 8, respectively, which give the heights of beams (4) and (8). Accordingly, the designing parameters for the quality design of the circular paraboloid antenna are the section characteristics of beams (1), (2), (3), (6), and (7), together with $h_6$ and $h_8$, denoted by $x_1, x_2, x_3, x_4, x_5, x_6, x_7$, respectively. $(x_1, x_2, x_3, x_4, x_5, x_6, x_7)$ is identified as the input vector of the quality prediction model to be established. Simultaneously, the indices of the quality characteristics of the circular paraboloid antenna are identified as mean square error (MSE), dead weight, and fundamental frequency, denoted by $y_1, y_2, y_3$, respectively, which form the output vector of the model.
The fractional factorial design of $2^{k-p}$ was selected to conduct experiments [37]. In the simulation, $k = 7$, $p = 2$, and the resulting 32 sets of data were divided into two parts: a training set and a testing set, which consisted of 25 and 7 samples, respectively. There were 7 input nodes, 3 hidden nodes, 5 context nodes, and 3 output nodes in the Elman NN. Therefore, 21 input weights and 3 output layer thresholds were optimized. During NN training, the epoch size was set to 1,000. The grasshopper number (size of swarm) N and iteration number L were set to 200 and 200, respectively. Four Elman NNs were used for building the ensemble model. Due to the characteristics of artificial intelligence optimization and the bagging technique for the ensemble, the 4 Elman NNs are designed to differ from each other. Although only 25 sets of data are used in the simulation, they are employed to train the Elman NN 4 times, which can be regarded as approximately equivalent to using 100 sets of data in our simulation. Owing to the excellent performance of the Elman NN, the small-size data training problem is also addressed in Vairavan et al. [38], which uses 60 sets of data to train the Elman NN and obtains higher detection accuracy.
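The run count and the data split can be checked with a short sketch. A two-level design with 7 factors and 2 generated factors has $2^{7-2} = 32$ runs. The generators used below (F = ABCD and G = ABDE) are a commonly tabulated choice and are an assumption here, since the generators of the actual design are not stated; the random 25/7 split and its seed are likewise hypothetical.

```python
from itertools import product
import random

def fractional_factorial_7_2():
    """Two-level 2^(7-2) design: 5 base factors plus 2 generated factors.

    Assumed generators: F = ABCD and G = ABDE (not taken from the paper).
    """
    runs = []
    for a, b, c, d, e in product((-1, 1), repeat=5):
        f = a * b * c * d
        g = a * b * d * e
        runs.append((a, b, c, d, e, f, g))
    return runs

design = fractional_factorial_7_2()

# Split the 32 runs into 25 training and 7 testing samples.
indices = list(range(len(design)))
random.Random(0).shuffle(indices)
train_idx, test_idx = indices[:25], indices[25:]

# Number of optimized parameters: 7 inputs x 3 hidden nodes + 3 output thresholds.
n_optimized = 7 * 3 + 3
print(len(design), len(train_idx), len(test_idx), n_optimized)
```

Each grasshopper position in the optimization would therefore be a vector of 24 values under the settings described above.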
Three levels were determined for each designing parameter. Detailed setting values are shown in Table 1. The outer diameters of the steel pipes of beams (1) and (2) were set to 50 mm, and the thickness varied from 2 mm to 4 mm. The thickness of 3 mm was set to level 0, whereas 2 mm and 4 mm were set to level -1 and level 1, respectively. The outer diameters of beams (3) and (7) were set to 65 mm, and the thickness varied from 3 mm to 5 mm. The thickness of 4 mm was set to level 0, whereas 3 mm and 5 mm were set to level -1 and level 1, respectively. The outer diameter of beam (6) was set to 40 mm, and the thickness varied from 1 mm to 3 mm. The thickness of 2 mm was set to level 0, whereas 1 mm and 3 mm were set to level -1 and level 1, respectively. Test data acquired through experiments included combinations of designing parameters from the 32 tests and the corresponding values of the indices for the quality characteristics. Part of the data (80% of the samples) was used to train the NN, whereas the remainder (20% of the samples) was used to verify the efficiency and generalization ability of the proposed model. Comparisons were conducted with other methods, including the Elman NN method, GRO-based Elman method, and weighted ensemble method [35]. The prediction results based on the NN methods are shown in Figures 4-9. Table 2 shows the average RMSE of all the methods, which supports the superiority of the proposed ensembled Elman NN model. With the same population size and iteration number, we also compared the convergence performance of GRO with the popular GOA [39] in one optimization process, which is shown in Figure 10. It shows that GRO had better convergence performance, which can provide more advantages in optimizing the Elman NN.
Analysis and Discussion. We established that the single Elman method had the worst performance compared with the other methods. Meanwhile, the GRO-based Elman method achieved better performance than the single Elman method. The simulation results also established that the proposed ensemble model outperformed all the other methods.
An ANN ensemble has better generalization ability and stability than a single network. In this study, we combined the Elman NN with GRO to yield an optimized Elman network ensemble model. The Elman NN supported by GRO proved its efficiency for prediction. Using GRO to enhance the performance of the Elman NN to achieve the required prediction is the innovation of the current study. The improved weighted average was used to combine the outputs of the individual networks, and the weights were determined by the training errors of the individual NNs, which improved the generalization ability and effectiveness of the model. However, optimizing the Elman NNs and building the ensemble required a long time. Therefore, the proposed method can only be used in offline prediction or in fields that do not have a strict runtime requirement.

Conclusion
In this paper, a novel prediction algorithm based on the Elman NN ensemble model was proposed for quality prediction. Elman NN parameters were optimized using the GRO method, and then a novel NN ensemble model was designed for quality prediction. The quality prediction model for a circular paraboloid antenna was considered as an example to verify the proposed algorithm. An Elman NN ensemble model that described the correlation between seven designing parameters and three indices of quality characteristics was built. The simulation results showed that the proposed algorithm obtained better prediction accuracy.

Table 1: Levels of designing parameters and the corresponding actual values (unit: mm).