An Intelligent Predictive Model-Based Multi-Response Optimization of
EDM Process

Electrical Discharge Machining (EDM) is a popular non-traditional machining process that is widely used due to its ability to machine hard and brittle materials. It does not require a cutting tool and can machine complex geometries easily. However, it suffers from drawbacks like a poor rate of machining and excessive tool wear. In this research, an attempt is made to address these issues by using an intelligent predictive model coupled global optimization approach to predict suitable combinations of input parameters (current, pulse on-time and pulse off-time) that would effectively increase the material removal rate and minimize the tool wear. The predictive models, which are based on the symbolic regression approach exploit the machine intelligence of Genetic Programming (GP). As compared to traditional polynomial response surface (PRS) predictive models, the GP predictive models show compactness as well as better prediction capability. The developed GP predictive models are deployed in conjunction with NSGA-II to predict Pareto optimal solutions.


Introduction
The ability of electrical discharges to cause an erosive effect which forms the basis of electrical discharge machining was initially propounded by Joseph Priestly, an English chemist in 1770. Much later, in 1943, B. R. Lazarenko and N. I. Lazarenko invented an electrical discharge machine used for working difficult-tomachine materials such as tungsten [1]. EDM is a machining process where the desired shape is obtained by using electrical discharges or sparks. It does not require cutting tools, nor does it depend on the hardness of the material. Advantages like not being affected by mechanical properties of the machined material and independence from cutting forces make EDM a suitable machining technique for brittle, high hardness and high strength materials [2]. Complex shapes with high precision can be achieved by using cheap electrode materials [3].
EDM works on the principle of material erosion by applying recurring sparks between the workpiece and the tool, which is submerged in a dielectric fluid. On applying a voltage to the workpiece and the electrode, electrons break away from the cathode (i.e., the electrode) and accelerate towards the workpiece. Upon hitting the molecules of the dielectric fluid which is neutral, more electrons are generated. The electron flow in this manner accelerates towards the anode. The dielectric fluid in the vicinity evaporates due to a leakage current caused by the moving electrons. Thus, a "plasma" channel gets created between the electrode and the workpiece. A crater is created on both the workpiece and the electrode due to the high temperature of the plasma. The removed material is flushed by the flow of dielectric fluid once the plasma channel extinguishes.
Thus, the machining performance of EDM can be characterized by material removal rate (MRR), tool wear rate (TWR), surface roughness (SR), etc. Ideally, the desire is to maximize MRR, thereby increasing productivity. MRR and SR are directly dependent upon discharge energy released during sparking. MRR is higher at higher discharge energy but the surface finish is adversely affected. Discharge energy is dependent on gap voltage, discharge current, pulse frequency, pulse duration and electrode characteristics. In general, increasing the on-gap voltage, discharge current, pulse frequency, pulse duration increases the MRR but lowers surface finish [4]. TWR, which is directly related to machining costs, is an important factor to consider while using EDM. Spark current has a significant impact on TWR. The fundamental machining parameters involved in EDM are gap voltage, discharge current, pulse on-time and pulse off-time. Gap voltage is the voltage applied between the electrode and the workpiece during the EDM process. The current applied to the electrode during pulse on-time is referred to as discharge current. Pulse on-time is the time for which the current is applied to the electrode during each EDM cycle [5]. The waiting time between two pulse on-times is called pulse off-time. During pulse off-time melted particles are removed from the setup.
Though there are several other effects [6,7] that affect the materials of the machines, however, the effects of only process parameters are focused in this study. It is imperative to understand the effect of each parameter on the EDM process. To do so a parametric study can be carried out. This can be done by changing one variable at a time while keeping the other variables at a constant level. However, to make an exhaustive analysis a huge number of experiments will have to be carried out; cost and time-wise which is impractical. Additionally, it is desirable to find optimum combinations of parameters that would effectively maximize or minimize desired output characteristics. To achieve both these tasks, predictive models (also called as metamodels) can be built. A metamodel is an approximate model of the actual process [8]. It is built using a few experimental runs. Polynomial response surface (PRS), often referred to in scientific publications as RSM is essentially a polynomial regression approach, is perhaps the most widely used metamodeling technique in parametric study and optimization of machining/manufacturing process [9,10]. A few researchers have also used advanced metamodeling techniques like artificial neural networks (ANN) [11,12], ANFIS [13], gene expression programming (GEP) [3] in machining/manufacturing scenario. Once the metamodel is developed it can be coupled with a global optimization algorithm to search for the optimum machining combinations. Among the various classes of optimization algorithms, metaheuristics are the most commonly used [14]. Within the metaheuristic family, genetic algorithms (GA) are perhaps the most popular and have been applied to all classes of optimization problems with significant success. Ragavendran et al. [4] built an RSM model using a Box-Behnken design (BBD) to optimize the discharge current, pulse on-time and pulse offtime. Subsequently, they used a GA to maximize the MRR and minimize SR. They found the pulse current to be the most important parameter. It was also observed by them that the increase in current leads to increased MRR. RSM was used as a metamodeling technique to express MRR and EWR as a function of voltage, current, pulse on time, and duty factor by Hourmand et al. [15]. They too found the current to be the most influencing parameter in determining MRR. Chiang [16] too used RSM to study the impact of discharge current, pulse on time, duty factor and open discharge voltage on MRR, EWR and SR. The experimental design was carried out using the central composite design in which samples points outside the region of interest are also considered. The analysis was further augmented by using analysis of variance (ANOVA). Discharge current and the duty factor was found to be the two most influential factors in determining MRR. On the other hand, discharge current and the pulse on-time played the most prominent role in determining the EWR and SR. RSM was used in combination with (non-dominating sorting genetic algorithm-II) NSGA-II by Baraskar et al. [17]. The NSGA-II was used to predict a set of a compromise solution, called the Pareto front. Hewidy et al. [18] too used RSM to model wire electrical discharge machining. Maity et al. [19] used an ANN to build a metamodel to predict MRR, overcut effect and recast layer thickness. The training dataset for the metamodeling process was generated by using CCD. This 3-layer ANN metamodel was then used in conjunction with Elitist Teaching learning-based optimization.
Inconel based alloys are widely used in different areas like chemical, aeronautical, nuclear and medical industries due to their outstanding mechanical properties. This material has excellent resistance to corrosion, fatigue and creeps at high temperatures. Thakur et al. [20] carried out an analysis that confirms that more than 50% of the materials used in the engines of airplanes are Inconel alloys. Out of all Inconel alloys, Inconel 718 is the popular material that is used for the manufacturing of turbine disks, blades, combustors, etc. [21]. Due to the low thermal conductivity of Inconel alloys, a very high temperature is induced while grinding and cutting the material which is a serious problem [22]. EDM can be extensively used for the machining Inconel alloys due to its non-contact material removal mechanisms [23]. Also, the low process forces of EDM produce burr-free microstructures with very high aspect ratios. Copper and Brass are extensively used as electrode material in EDM due to their high conductivity.
From the literature review, it is observed that in die-sinking EDM, brass in rarely used as an electrode material. So, in the present research, brass is taken as an electrode material for the machining of Inconel 718. Further, it is seen that researchers so far have relied mainly on RSM for building empirical relationships between input and output parameters of the EDM process. Thus, in the present work, a novel metamodeling approach-genetic programming (GP) based on advanced machine learning features is used. By comparing with a traditionally applied metamodeling technique-RSM, the comparative advantages of GP are highlighted. MRR and TWR are experimentally studied based on a BBD design. The GP metamodels for MRR and TWR are coupled with NSGA-II (non-dominated sorting genetic algorithm) to carry out multiobjective Pareto optimization. As compared to a straight forward single-objective optimization, multi-response Pareto optimization are more beneficial because this presents the user with a host of candidate solutions where a suitable compromise between the maximized MRR and minimized TWR is obtained.

Experimental Details
Any physical system or phenomenon can be thought of as an input-output system. The factors responsible for a particular behavior of the system may be considered as the inputs whereas the behavior of the system may be considered as the output. In the present work for the EDM process, material removal rate (MRR) and tool wear rate (TWR) are considered as the response variables (i.e., the outputs). The effect of current (I), pulse-on-time (T on ) and pulse-off-time (T off ) on MRR and TWR is studied. A Box-Behnken design (BBD) is used for selecting the design points. In BBD each input parameter is coded into three levels viz. -1, 0 and 1. The design of experiments as per BBD design is reported in Tab. 1.
In the present work, a Die sinking EDM is used for the machining of the Inconel 718. A cylindrical brass tool with a diameter of 2 mm is used as an electrode for the machining of the workpiece. The workpiece and electrode are separated by a moving dielectric fluid EDM oil. The Inconel 718 has a chemical composition of C-0.8%, Mn-0.35%, Ni-54%, Cr-20%, Ti-0.75% with balanced Fe and physical properties of Inconel 718 is described in Tab. 2. Current (I), pulse-on-time (T on ) and pulse-off-time (T off ) are considered as the input parameter whereas MRR and TWR are considered as response parameters.
MRR and TWR in the experiments are calculated by the weight loss method using a precision electronic balance weighing machine with a least count of 1 mg. MRR is calculated by measuring the weight loss of the workpiece as per the Eq. (1), where W b and W a are the weight of the workpiece before and after machining respectively. T is the machining time in minutes.
TWR is calculated by measuring the weight loss of the tool as per the Eq. (2), TW b and TW a are the weights of the tool before and after machining respectively.
Each experiment is carried out thrice and its average is considered as the working value, to account for uncertainties in experimentation. The average experimental values of MRR and TWR along with the standard deviations of the experiments are presented in Fig. 1.

Response Surface Methodology
Response Surface Methodology (RSM) is a combination of statistical and mathematical techniques that reduce a complex input-output system into easily understandable polynomial equations. Using the training dataset RSM fits a curve of the a priori selected polynomial form. In general, the RSM fitted empirical equation may be represented as, Here, f denotes the approximate response surface and e is the normally distributed statistical error. x 0 s represents each independent parameter (inputs) while k is the maximum number of independent parameters.
In this research, a second-order polynomial model of the following form is selected a priori, Here b 0 s are the coefficients of regression. These coefficients of regression help in describing the response (y) as a function of predictor variables (x 0 s).
The coefficients of regression in Eq. (4) are selected such that the sum of squared residuals (SSR) is minimum. Since residuals are the errors in the fitting, sometimes SSR is also referred to as the sum of squared error in predictions (SSE).
Here n is number of sample points.
ANOVA is then performed to identify and eliminate the non-significant terms of the fitted RSM model [24,25].

Genetic Programming
Genetic Programming (GP) is a powerful computational intelligence technique that can perform symbolic regression to build explanatory models based on the provided training datasets. It is a highly automated process that requires no manual intervention once the algorithm is started. On the contrary, a polynomial regression carried out by RSM may require the use of additional statistical tests like ANOVA to determine the significance of the polynomial terms in the model. The GP, on the other hand, selfprunes the insignificant terms because of the inherent evolutionary traits.
GP starts with a randomly generated population of candidate solutions called individuals or GP trees. This population is called as generation zero. Each GP tree is made up of two key ingredients-functions is calculated using some predefined metrics, like mean squared error or coefficient of determination. In the case of using mean squared error as the metric lower values are considered better. Similarly, in the case of the coefficient of determination, the higher the values, the better. Next, the population in generation zero is improved by using three key genetic operators called selection, crossover and mutation. An additional genetic property called elitism is also used in the present work. Crossover is the process of randomly grafting chosen parts from one GP to another GP tree. This is done by associating a higher probability of selection for the crossover of higher fitness GP trees. A mutation is used to create new GP trees for the new population by randomly altering a small part or node of the selected GP tree. Elitism is the act of copying a small proportion of the fittest GP trees, unchanged into the next generation. This improved population makes up the subsequent generation. This process is repeated until a predefined accuracy on the performance measurement scale or the maximum number of generations is reached.

Optimization Using NSGA-II
Since it is desired to have maximized material removal but minimum tool wear, it is important to use a Pareto multiobjective optimization concept which presents the user with a set of equally desirable solutions, called the Pareto front. While in any single-objective optimization, the search for optima is straight forward; in case of multi-objective Pareto optimization, a non-dominated sort algorithm is used to rank the optimal solutions such that each solution in Pareto set is not dominated by any other solution. From an academic viewpoint, all the solutions in the Pareto front are equally viable. However, based on the practical necessity, the solutions within the Pareto set can be ranked according to their suitability using a multicriteria decision making (MCDM) approach. A multi-objective optimization problem may be expressed as, where k is the number of independent parameters, obj: ! 2 represents the number of objectives and m is the number of constraints.
In this work, multi-objective Pareto optimization is carried out by using NSGA-II. GA and GP work on the same concept i.e., Darwin's laws of natural selection [26]. Instead of trying to build a function relation like GP, GA is engaged in exploring the search space to find a suitable combination of input parameters that would maximize or minimize the target response.
Like GP, GA too starts by initiating a random population and calculating its fitness. Based on the probability associated with each fitness, individuals are selected for operation by various genetic operators. New individuals are created for the next generation by crossover (a random combination of parts from two parents to form two children) and mutation (random small modifications in individuals created by the flipping of the binary encoded genes). These operations are carried out until the maximum iterations are reached [27]. The pseudo code of the NSGA-II used in this research is given below.
Begin | Define objective functions Y 1 ; Y 2 ; . . . ; Y obj: | Initialize generation t ¼ 0 | Initialize random population P of n pop individuals | Calculate fitness of all individuals | Conduct non-dominated sorting | Assign ranks and select parents | Generate child population | Tournament selection | Crossover and mutation | Do | | Do for all individuals | | | Calculate fitness | | | Conduct non-dominated sorting | | | Generate Pareto fonts | | | Determine crowing distance | | | Loop inside by adding solution to next generation from the first front until n pop individuals | | End | | Select points on lower front with high crowding distance | | Create next generation | | Tournament selection | | Crossover and mutation | Until t ¼ t max: | Report the Pareto front having P nd non-dominated solutions End

Ranking Solutions from Multi-Objective Optimization Pareto Set Using EDAS
Since the Pareto front represents a set P nd number of feasible solutions, it is important to be able to extract a particular solution from the Pareto set based on the requirements of the actual problem at hand. In this regards, MCDM techniques can be used to rank the P nd solutions within the Pareto set based on a predefined weightage to each selection criteria. In MCDM terminology, each of the P nd solutions from the Pareto set is referred to as alternatives and the two objective functions (MRR and TWR) are referred to as criteria. In this research, EDAS MCDM technique is used.
EDAS or "Evaluation based on Distance from Average Solution" calculates the distance of each alternative from the average solution and uses this information to select the best alternative. The current MCDM decision matrix D ð Þ and the weight vector W ð Þ for the criterions can be expressed as, . . .
Each row in the D matrix corresponds to an alternative and the columns corresponds to criteria.
where w 1 and w 2 represents the weights or preference scores associated with the two criteria (MRR and TWR).
Next, criteria-wise average solutions are calculated.
The positive distances from average (PDA) are calculated as The negative distances from average (NDA) are calculated as The PDA and NDA matrices are formed based Eqs. (11) and (12) and has order of P nd Â 2. B and C represent benefit and cost criteria respectively. Next by using the weight vector in Eq. (9), weighted sum of PDA and NDA is calculated.
Next, the normalized values of SP and SN are calculated.
The appraisal score is then calculated as Based on decreasing appraisal score, the alternatives are ranked from best alternative to worst alternative. The most favorable solution is ranked 1.

Training the GP Metamodels
13 EDM machining operations are carried out as per the combination of the input parameters listed in Tab. 1. The corresponding MRR and TWR of the 13 experiments are listed in Tab. 3. This training dataset is used to build metamodels for MRR and TWR by using RSM and GP. Since parameters involved in the GP (i.e., population size, generations, crossover, mutation and elitism) have a significant influence on the training outcome, a pilot study is conducted. Based on the pilot study, a population size of 500 GP trees is iterated for 50 generations. The crossover probability and mutation probability are kept as 0.9 and 0.1 respectively. The top 1% of the population of each generation is considered as the elites. Additional restrictions on the GP trees like maximum tree length of 50 and a maximum depth of 8 is also applied. The application of restrictions on tree length and depth ensures that the trained solutions (GP trees) are not too cumbersome. The coefficient of determination (R 2 ) is used as the performance metric for calculating the fitness of the GP trees.
The improvement of the GP metamodel across the training generations is seen in Fig. 2. In Fig. 2a it is seen that the GP average fitness increases rapidly up to generation 40, beyond which it becomes very sluggish. This means that the selected number of 50 generation limit is appropriate as training beyond this would lead to overfitting. In Fig. 2b, it is seen that the GP's best fitness increases monotonically until the 30th generation around which it attains the ideal value of R 2 i.e., 1. Based on the fitness, the GP algorithm adjusts the terminals and functions in the GP expression so that the R 2 improves. The updating of the relative frequencies of the terminals and functions across the 50 training generations is shown in Fig. 3. Similarly, in Fig. 4, the relative frequency of the process parameters across the 50 generations is presented. It is seen that in the case of MRR GP metamodel, the process parameter current (I) has the most occurrence in the final expression. However, in the case of the TWR GP metamodel, all three process parameters have a similar number of occurrences in the final GP expression.

Comparative Assessment of Metamodels
The predicted values of MRR and TWR as reported by RSM and GP are listed in Tab. 3 along with the corresponding experimental output. Fig. 5 shows the prediction performance of RSM and GP on training data. MRR GP metamodel has an R 2 of 100% which means all variability in training data is explained by the GP model, whereas RSM metamodel has an R 2 of 40.57%. Similarly, the GP TWR metamodel also records 100% R 2 , but RSM achieves only 71.93%.
Additional four experiments are carried out to serve as testing data. Fig. 6 shows the prediction performance of RSM and GP on testing data. The predictions of the RSM and GP metamodels, as well as GP performed much better than RSM on MRR prediction, with RSM recording around 35% average relative error while GP had only 27%. On TWR model GP had an average error of 27% compared to RSM's 29%. The performance of the GP and the RSM metamodels are further evaluated using three different performance metrics-R 2 , MAE (mean absolute error) and RMSE (root mean squared error). From Tab. 5, it is seen that the GP metamodel has significant improvement over the RSM metamodels. In the case of performance measurement on the training dataset, GP records ideal performance for both MRR and TWR responses. Considering the overall dataset, the GP is seen to be 161% (on MRR estimation) and 41%  (on TWR estimation) better than their RSM counterparts. In terms of mean absolute error, the GP metamodels perform 65% and 52% better. Thus, henceforth in this work, only GP metamodel is used for carrying out the rest of the analysis like parameter effect study and multi-objective optimization.

Effect of Current, Pulse On-Time and Pulse Off-Time on MRR and TWR
Due to the non-linear relation between the input and response parameters in the experiments, it is difficult to predict the response parameter for a particular set of input parameters. In the present work current (I), pulse-on-time (T on ) and pulse-off-time (T off Þ are the input parameters. The combined effect of the three input parameters on MRR and TWR in the form of contour plots is shown in Figs. 7 and 8 respectively. Out of these three input parameters T on and I have a significant effect on the MRR and an increase of these parameters generally increases the MRR. From the experimental data, it is observed that expt. no. 7 has the least MRR which is due to the less I and T on whereas expt. no. 8 has a very high  Figure 7: Contour plots of MRR at various parameter combinations MRR which is due to higher I and T on . From the experimental data, it is also noted that expt. no. 4 has the highest I and T on (I = 12 amp, T on = 400 µs), but the MRR is less. This may be due to the less T off in that experiment by which the debris does not get cleared from the gap between the tool and workpiece and arcing take place which results in decreasing the MRR. The TWR is directly related to all the three input parameters. Higher current and T on leads to eroding more material both from the work surface and electrode surface. For machining of any material, TWR is considered as one of the most significant response parameters in the EDM process. The TWR is directly related to the MRR and practically those experiments having higher MRR will, in general, have higher TWR and vice-versa. From the experimental data, it is observed that in expt. no. 7 has a very low MRR and TWR while expt. no. 8 has a very high MRR.

Multi-Objective Optimization and MCDM Ranking
NSGA-II is used for finding a set of Pareto optimal solutions to maximize the MRR and minimize the TWR simultaneously. It is clear from the study of the involved independent parameters that both these objectives are contradictory. For example, while the high current could increase the MRR, it would also increase the TWR. Because of such conflicting agendas, single-objective optimization may not always be the best approach. The Pareto front for MRR vs. TWR is shown in Fig. 9. It is clear from the Pareto front that the absolute maximum MRR and absolute minimum TWR cannot be achieved simultaneously; because with increasing MRR, TWR increases. Thus, the user is left with an option to choose what is best for the particular application. Using this Pareto curve, the user can choose any combination of input parameters, that he/she feels works best for the problem without violating any set constraints.
The selection of the appropriate Pareto solution can be done based on prior experience or the help of multi-criteria decision-making approaches like EDAS employed in the current research. Three different cases of MCDM selection are considered based on the varying preference weights to the criteria. The weight associated with the MRR i.e., w 1 is kept as 10%, 50% and 90% respectively for 'Case 1', 'Case 2' and 'Case 3' respectively. The comparison of the NSP i and NSN i of each alternative is carried out in Figure 8: Contour plots of TWR at various parameter combinations Fig. 10 for all the three different cases. It is seen that the weights associated to the criteria has significant influence on the outcome of the optimal solution selection. This further highlights the fact that arbitrary selection of optimal solution from the Pareto set may not always suit the problem requirement. The selected optimal solutions by EDAS are further highlighted on the Pareto front in Fig. 11.

Conclusion
In this research, a traditionally hard-to-machine material is machined using EDM. To maximize the productivity of the EDM process and minimize the tool costs, a metamodel coupled global optimization approach is employed. To maximize productivity, material removal rate (MRR) is sought to be maximized. On the other hand, to minimize the tool costs, tool wear rate (TWR) is sought to be minimized. Using genetic programming (GP) metamodeling approach, MRR and TWR are expressed as functions of current, pulse on-time and pulse off-time. In comparison with traditionally applied polynomial regression based RSM metamodels, the proposed GP metamodels show about 59% improvement in MRR prediction capability and 28% improvement in TWR prediction on training data. Similarly, the GP metamodel recorded 8% and 2% less average errors for MRR and TWR respectively on independent test data. Thus, the present GP based metamodeling approach can be suitably used for predictive modeling of machining processes.
Funding Statement: The author(s) received no specific funding for this study.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.