Abstract

The ultraviolet spectrophotometric method is often used for determining the content of glycyrrhizic acid in the Chinese herbal medicine Glycyrrhiza glabra. Based on the traditional single variable approach, four extraction parameters, namely, ammonia concentration, ethanol concentration, reflux time, and liquid-solid ratio, are adopted as the independent extraction variables. In the present work, a central composite design of four factors and five levels is applied to design the extraction experiments. Subsequently, prediction models based on response surface methodology, artificial neural networks, and genetic algorithm-artificial neural networks are developed to analyze the experimental data, while the genetic algorithm is used to find the optimal extraction parameters for each of these models. The optimized extraction conditions are found to be an ammonia concentration of 0.595%, an ethanol concentration of 58.45%, a reflux time of 2.5 h, and a liquid-solid ratio of 11.065 : 1. Under these conditions, the model-predicted value is 381.24 mg, the experimental average value is 376.46 mg, and the discrepancy between them is 4.78 mg. For the first time, a comparative study of these three approaches is conducted for the evaluation and optimization of the effects of the independent extraction variables. Furthermore, it is demonstrated that the combination of genetic algorithm and artificial neural networks provides a more reliable and more accurate strategy for the design and optimization of glycyrrhizic acid extraction from Glycyrrhiza glabra.

1. Introduction

Glycyrrhiza glabra, or G. glabra, is one of the most widely used traditional Chinese medicines in China. It is used to invigorate the spleen, replenish qi, and clear away heat and toxic substances in the treatment of conditions such as weakness of the spleen and stomach, cough, and phlegm [1]. Owing to its anti-inflammatory, antispasmodic, antiallergic, antidepressive, antiviral, antifungal, and antioxidant activities, it has drawn increasing attention [2–5]. In particular, long-term clinical practice has demonstrated that it alleviates pain, tonifies the spleen and stomach [6], eliminates phlegm [7], and relieves coughing [8]. It is also a highly nutritional plant, widely used as a sweetening and flavouring agent in products such as candies, chewing gum, toothpaste, and beverages [9, 10]. Recent in-depth studies have reported more than 400 isolated compounds and active ingredients from this herb [11].

Among these constituents, glycyrrhizic acid has been reported to be the main biologically active ingredient in Glycyrrhiza glabra, as it is thought to be responsible for the hepatoprotective and antiulcer effects of the herb [12, 13]. This active substance has been widely studied for its pharmacological effects and medical benefits in animal models and human studies [14, 15]. For example, considerable interest has been directed to glycyrrhizic acid for its important pharmacological activities, including anti-inflammatory, antioxidative, and antitumor activities [16]. Hence, it is necessary to develop an accurate, precise, and reliable prediction method for the analysis of glycyrrhizic acid from Glycyrrhiza glabra.

With the application of industrial extraction technologies, the extraction process has been greatly improved, and the corresponding cost has been reduced [17]. However, the analysis and interpretation of the optimum extraction conditions remain largely empirical, and some parameters are still under investigation. The traditional single variable approach, in which the level of each parameter is varied individually while the others are held constant, is time-consuming and requires a large number of experiments [18]. Accordingly, advanced multivariable methodologies for analyzing the optimal extraction process are in high demand [19], with the purpose of obtaining high-quality glycyrrhizic acid. The central composite design (CCD), first presented by Box and Wilson [20], is a typical example. This design consists of the following parts: a full factorial or fractional factorial design, an additional design, and a central point.

Various prediction models and methods have been proposed for the analysis of extraction processes. These methods can be roughly grouped into two categories. The first class is based on experimental analysis and relies mostly on assumptions for model simplification. Such an extraction process model is evaluated statistically [20, 21]. For example, response surface methodology (RSM) is the most relevant statistical technique used in analytical optimization. RSM, which is a collection of mathematical and statistical techniques, fits a polynomial equation to the experimental data with the objective of making statistical predictions. The second class is inspired by data-driven techniques and aims to characterize the highly nonlinear relationships between the extraction and response variables [22]. As an example, the artificial neural network (ANN) methodology is one of the most promising artificial intelligence methods [23, 24]. ANN analysis can model complex relationships and is quite flexible with regard to the number and form of the experimental data. This makes it possible to use more informal experimental designs than with statistical approaches, potentially making the ANN model more accurate.

Over the years, the backward propagation neural network (BPNN), which is a typical ANN, has been widely employed to unveil the complicated relationships between input and output variables [25]. Since the weights of a BPNN are trained with an optimization method such as the gradient descent algorithm, there is no guarantee of finding the global minimum; the training may reach only a local minimum [26]. In contrast, the genetic algorithm (GA), which simulates the survival-of-the-fittest principle of nature, is commonly used to search for the global optimum over the entire solution space [27]. Besides, GA is applicable to a variety of optimization problems, specifically those with discontinuous, nondifferentiable, stochastic, or highly nonlinear objective functions [28]. To overcome the local minimum limitation of the backward propagation algorithm, the weights of the BPNN can therefore be optimized by GA. This combined method, abbreviated as GA-BPNN, has been successfully applied in diverse fields [29–31].

Based on the single variable approach, four extraction parameters, namely, ammonia concentration, ethanol concentration, reflux time, and liquid-solid ratio, which strongly influence the content of glycyrrhizic acid, are chosen as the independent variables in the present study [32]. Furthermore, a CCD of four factors and five levels is used for the experiments on glycyrrhizic acid extraction from Glycyrrhiza glabra. After acquiring data for each experimental point, the prediction models of RSM, BPNN, and GA-BPNN are developed to describe the interactions between the experimental variables and to approximate a nonlinear response function for the extraction process of glycyrrhizic acid. Based on these models, GA is used to search for the optimal extraction parameters of glycyrrhizic acid. For the first time, a comparative study of these three computational models is conducted for the evaluation and optimization of the effects of the independent extraction variables on the extraction efficiency of glycyrrhizic acid from Glycyrrhiza glabra. Additionally, it is shown that GA-BPNN is more reliable and accurate and has better predictive power than the other two models for the optimization of glycyrrhizic acid extraction from Glycyrrhiza glabra.

2. Materials and Experimental Design

2.1. Reagents and Materials

The herbal drug—Glycyrrhiza glabra (batch number: 150701)—was purchased from Huqingyutang pharmacy (Zhejiang province, China) and was identified by Shenwu Huang, the professor of Zhejiang Chinese Medical University. The crude slices were of the stipulated quality standards in Chinese Pharmacopoeia (2015 edition). Glycyrrhizic acid (batch number: 110731-201517, purity: 98%) was purchased from National Institute for the Control of Pharmaceutical and Biological Products (Beijing, China). The chemical structure of glycyrrhizic acid is shown in Figure 1. The other reagents were of analytical grade. The working solutions of glycyrrhizic acid were prepared by diluting appropriate amounts of the stock solutions with buffer solutions. The FA1004N analytical balance (Precision Instrument Co., Ltd., Shanghai, China) was utilized to accurately weigh the materials. The 018268 type electric-heated thermostatic water bath (Automation Instrument Factory, Suzhou, China) was used to extract the glycyrrhizic acid. The TU 1900 type double-beam UV-visible spectrophotometer (Puxi General Instrument Co., Ltd., Beijing, China) was prepared to detect the content of glycyrrhizic acid. Finally, the Milli-Q (Millipore, Bedford, MA, USA) purification system was employed to provide the deionized water for preparing all the required solutions.

2.2. Calibration Curve and Methodological Study

Here, the methodological study followed [32]. To prepare the sample solutions, 5.00 g of powdered Glycyrrhiza glabra was precisely weighed and extracted under the conditions listed in (1), using heat reflux with condensation in ammonia-ethanol. The mixture was then filtered and transferred quantitatively to a 100 ml measuring flask with ethyl alcohol. After being diluted to 100 ml, the solutions were passed through a 0.22 μm syringe filter and subsequently measured under the established ultraviolet conditions.

To verify the reliability of the experimental methodology, the following four aspects were used as evaluation criteria:
(i) Calibration curve: seven working solutions with concentrations of 8, 16, 24, 32, 40, 48, and 56 μg/ml were used to construct the calibration curve of glycyrrhizic acid, in which the optical density and the corresponding standard concentration followed a linear regression curve. One of the working solutions was scanned over the whole wavelength range, and the maximum absorption wavelength of glycyrrhizic acid was detected at 252 nm. The resulting linear regression for the calibration curve had a fitting degree of 0.9995 over the concentration range 8–56 μg/ml.
(ii) Precision: both the within-day and the between-day precision were checked. To determine the within-day precision, one working solution (56 μg/ml) was examined 5 times on the same day. To determine the between-day precision, the same working solution was analyzed on 5 further consecutive days. The relative standard deviation (RSD) was taken as the metric of precision. The RSDs of the within-day and between-day precision were 0.17% and 0.65%, respectively, which implied that the developed UV detection method was feasible.
(iii) Stability: the stability of the sample solutions was measured after 0 h, 2 h, 4 h, 6 h, and 8 h at room temperature under the selected UV detection conditions. RSD was again taken as the evaluation metric. The measurements of the 5 samples fluctuated without significant difference, and the RSD was 0.884%, which indicated that the sample solutions were stable at room temperature.
(iv) Recovery: three working solutions (16, 32, and 48 μg/ml) were used in the recovery test. The absorbance values were substituted into the calibration curve.
The corresponding results of recovery for 16, 32, and 48 μg/ml (expressed by “mean value ± standard deviation” with ) were , and respectively.

2.3. Experimental Design and Data Normalization

Before applying the predictable models, it is necessary to choose an experimental design to define which experiments should be carried out in the experimental region being studied. In this work, the independent variables with major effects on the extraction process to determine glycyrrhizic acid from Glycyrrhiza glabra were selected through the single variable approach. These parameters and their delimitation for the extraction experiments are displayed in (1).

In order to evaluate the coefficients of the interaction parameters, CCD was chosen to carry out the experiments. The domain of variation for each factor was determined based on knowledge of the system and on initial experimental trials. The ranges and levels, with actual and coded values of each parameter, are shown in Table 1, where the factorial levels of the independent variables are coded as low (−1) and high (+1), whereas the axial points are coded as −2 and +2. The Design-Expert software (version 8.01) was used to generate the experimental design matrix. In total, 30 experimental points, including 16 factorial points, 8 axial points, and 6 replicates at the center point, were defined for four independent factors and five levels. All runs were conducted in duplicate in random order to minimize experimental errors, as well as to verify the adequacy of the proposed models. The complete CCD matrix in terms of the coded variables $x_1$, $x_2$, $x_3$, and $x_4$, together with the experimental results, is exhibited in Table 2.
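The point counts stated above (16 factorial, 8 axial, 6 center) can be reproduced with a short sketch. This is only an illustration of a four-factor CCD with axial distance 2 in coded units; the function name `central_composite_design` is hypothetical, and the rows come out in standard order rather than in the randomized run order of Table 2.

```python
import itertools
import numpy as np

def central_composite_design(n_factors=4, alpha=2.0, n_center=6):
    """Return a coded CCD matrix: factorial, then axial, then center points."""
    # 2^n factorial points at the coded levels -1 and +1.
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=n_factors)))
    # Two axial points per factor at -alpha and +alpha, other factors at 0.
    axial = np.zeros((2 * n_factors, n_factors))
    for i in range(n_factors):
        axial[2 * i, i] = -alpha
        axial[2 * i + 1, i] = alpha
    # Replicated center points, all factors at 0.
    center = np.zeros((n_center, n_factors))
    return np.vstack([factorial, axial, center])

design = central_composite_design()
print(design.shape)  # (30, 4)
```

Each factor takes exactly the five coded levels −2, −1, 0, +1, and +2, which matches the "four factors and five levels" description.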

Codification of the levels for each variable consists of transforming the studied real values into coordinates on a dimensionless scale, which must be proportional to their location in the experimental space. Moreover, it cancels the order-of-magnitude differences between the extraction parameters and thus avoids large prediction errors, which improves the learning efficiency and the prediction accuracy of the models. Therefore, this codification scheme can be used as the normalization process for the experimental data. Precisely, the normalization process transforms a real value ($X_i$) into a coded value ($x_i$) according to the following equation:

$$x_i = \frac{X_i - X_0}{\Delta X_i},$$

where $x_i$ is the dimensionless value of the independent variable $i$, $X_i$ and $X_0$ are the actual value and that at the central point, respectively, and $\Delta X_i$ is the step change of $X_i$ corresponding to a unit variation of the dimensionless value. The related parameters for the normalization of each independent variable are also displayed in Table 1.
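As a minimal sketch of this normalization, the two helper functions below convert between actual and coded values. The function names and the numerical center and step are hypothetical placeholders chosen for illustration; they are not the values of Table 1.

```python
def to_coded(actual, center, step):
    """Map an actual value to its dimensionless coded value."""
    return (actual - center) / step

def to_actual(coded, center, step):
    """Map a coded value back to the actual scale."""
    return center + coded * step

# Hypothetical factor (NOT taken from Table 1): center 60, step 10.
center, step = 60.0, 10.0
print(to_coded(80.0, center, step))   # 2.0
print(to_actual(-1.0, center, step))  # 50.0
```

The inverse mapping is what turns a coded optimum found by the optimizer back into laboratory settings.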

3. Models and Optimization

3.1. Determination of the Optimal Extraction Parameters

GA is a parameter search and optimization technique based on the emulation of natural evolutionary processes. In a GA, a population of candidate solutions (also called individuals) to an optimization problem is evolved toward the best solution. Individuals are typically represented in binary as strings of 0s and 1s, but other encodings are also possible. In each generation, the fitness, which is usually the value of the objective function in the optimization problem being solved, is assessed for each individual in the population. Fitter individuals are stochastically selected from the current population, and the next population of candidates is created through bio-inspired operators such as selection, mutation, and crossover. As the algorithm proceeds, the best fitness of the population gradually improves. Commonly, the algorithm terminates when either a maximum number of generations has been reached or a satisfactory fitness level has been achieved.

In light of its powerful search capability, GA was used to optimize the extraction conditions for glycyrrhizic acid throughout this work. Each individual was represented by the coded extraction parameters $x_1$, $x_2$, $x_3$, and $x_4$, with values within the upper and lower variable bounds defined in (1). The fitness functions were given by the mathematical models presented in the following subsections.
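The GA loop described above can be sketched as follows. This is a simplified real-coded GA, not the Matlab GA toolbox used in this work: tournament selection, arithmetic crossover, Gaussian mutation, elitism, the mutation width of 0.3, and the toy fitness function are all assumptions made for illustration, and `genetic_maximize` is a hypothetical name.

```python
import numpy as np

def genetic_maximize(fitness, lower, upper, pop_size=30, generations=50,
                     crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Maximize `fitness` over a box; return the best individual and history."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    pop = rng.uniform(lower, upper, size=(pop_size, lower.size))
    history = []
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        best = pop[np.argmax(scores)].copy()
        history.append(scores.max())
        # Tournament selection: the fitter of two random individuals is kept.
        a = rng.integers(pop_size, size=pop_size)
        b = rng.integers(pop_size, size=pop_size)
        parents = np.where((scores[a] >= scores[b])[:, None], pop[a], pop[b])
        # Arithmetic crossover between each parent and a shuffled partner.
        partners = parents[rng.permutation(pop_size)]
        mix = rng.uniform(size=(pop_size, 1))
        do_cross = rng.uniform(size=(pop_size, 1)) < crossover_rate
        children = np.where(do_cross, mix * parents + (1 - mix) * partners, parents)
        # Gaussian mutation on a random subset of genes, clipped to the bounds.
        mutate = rng.uniform(size=children.shape) < mutation_rate
        children = children + mutate * rng.normal(0.0, 0.3, size=children.shape)
        pop = np.clip(children, lower, upper)
        pop[0] = best  # elitism: the best individual always survives
    scores = np.array([fitness(ind) for ind in pop])
    history.append(scores.max())
    return pop[np.argmax(scores)], history

# Toy fitness with a hypothetical optimum inside the coded range [-2, 2].
target = np.array([0.5, -1.0, 1.0, 0.0])
best, history = genetic_maximize(lambda x: -np.sum((x - target) ** 2),
                                 [-2.0] * 4, [2.0] * 4)
```

Because of elitism, the best fitness recorded in `history` never decreases from one generation to the next. In the setting of this work, the toy fitness would be replaced by the prediction of a fitted RSM or BPNN model.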

3.2. RSM and Statistical Analysis

In statistics, CCD is the most popular class of design used for fitting a second-order model in RSM, especially for extraction processes. Based on the experimental data, RSM with a second-degree polynomial was applied to explore the relationship between the four explanatory variables $x_1$, $x_2$, $x_3$, and $x_4$ and the single response variable, the content of glycyrrhizic acid, as follows:

$$Y = \beta_0 + \sum_{i=1}^{4} \beta_i x_i + \sum_{i=1}^{4} \sum_{j=i}^{4} \beta_{ij} x_i x_j + \varepsilon,$$

where $Y$ expresses the content of glycyrrhizic acid, $x_i$ ($i = 1, 2, 3, 4$) represent the extraction parameters, $\beta_0$ is the constant term, $\beta_i$ ($i = 1, 2, 3, 4$) are the coefficients of the linear part, $\beta_{ij}$ ($i, j = 1, 2, 3, 4$) indicate the coefficients of the quadratic part, and $\varepsilon$ means the residual associated with the experiments.
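A second-order model of this kind can be fitted by ordinary least squares. The sketch below is an assumption-laden illustration rather than the Design-Expert procedure: it builds the 15 columns of a full quadratic model in four factors (intercept, 4 linear terms, 10 squared and interaction terms) and recovers known coefficients from synthetic, noise-free data. The function name and the random data are hypothetical.

```python
import itertools
import numpy as np

def quadratic_design_matrix(x):
    """Columns: intercept, x_i, then x_i * x_j for all i <= j."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n_factors = x.shape[1]
    columns = [np.ones(x.shape[0])]
    columns += [x[:, i] for i in range(n_factors)]
    columns += [x[:, i] * x[:, j]
                for i, j in itertools.combinations_with_replacement(range(n_factors), 2)]
    return np.column_stack(columns)

# Synthetic check: 30 random coded points and made-up coefficients.
rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, size=(30, 4))
beta_true = rng.normal(size=15)
features = quadratic_design_matrix(x)
y = features @ beta_true
beta_hat, *_ = np.linalg.lstsq(features, y, rcond=None)
```

With real measurements, `y` would hold the observed contents and the residual term would no longer vanish; significance testing of individual coefficients is a separate step that this sketch does not cover.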

The mathematical model obtained by fitting the function to the data may sometimes fail to describe the studied experimental domain satisfactorily. A more reliable way to evaluate the quality of the fitted model is the application of one-way analysis of variance (ANOVA). In this work, the data were analyzed with the Design-Expert software (version 8.01), and the coefficients were interpreted by Fisher's test. The statistically nonsignificant terms were omitted from the specific model. The accuracy and generalization ability of the fitted polynomial model can be evaluated by the coefficient of determination $R^2$, which is defined as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},$$

where $\hat{y}_i$ is the value predicted by the model, $y_i$ is the experimental value, and $\bar{y}$ is the mean of the experimental values. It is worth mentioning that $R^2$ is only applicable to the training data set, and its range is $[0, 1]$. In addition, a larger $R^2$ indicates that a larger percentage of the variance in the response variable can be explained by the explanatory variables.

3.3. BPNN Model

ANN is an information processing technique and a simplified computational model inspired by the structure of biological neural networks. It often consists of three layers, i.e., the input, hidden, and output layers. The pattern of interconnection among the neurons is called the network structure, and it can be conveniently illustrated by a graph as shown in Figure 2(a). Data generated from the experimental design can be used as the inputs and outputs for ANN training.

The training is carried out by adjusting the strength of the connections between neurons so that the outputs of the entire network move closer to the desired outputs. In this approach, the inputs arriving at each neuron are weighted and summed, and an output signal is generated through an activation function $f$ as

$$o_j = f\left(b_j + \sum_{i=1}^{n} w_{ij} x_i\right),$$

where $b_j$ and $w_{ij}$ ($i = 1, \ldots, n$) are the bias and connection weights between two sequential layers, and $x_i$ and $o_j$ are the corresponding inputs and outputs, respectively. A general schematic of such an architecture is illustrated in Figure 2(b).
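The weighted-sum-plus-activation rule can be illustrated with a forward pass through a three-layer network. This is a sketch under assumptions: a separate bias vector per layer, the tangent sigmoid (tanh) transfer function at both the hidden and output layers as reported later for the GA-BPNN model, and random placeholder weights. The function name `forward` is hypothetical.

```python
import numpy as np

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: each layer applies tanh to a weighted sum plus bias."""
    hidden = np.tanh(x @ w_hidden + b_hidden)
    return np.tanh(hidden @ w_out + b_out)

# Placeholder weights for a 4-7-1 network (random, not trained values).
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 7, 1
w_hidden = rng.normal(size=(n_in, n_hidden))
b_hidden = rng.normal(size=n_hidden)
w_out = rng.normal(size=(n_hidden, n_out))
b_out = rng.normal(size=n_out)

x = np.array([[0.0, 0.0, 0.0, 0.0],    # center point in coded units
              [1.0, -1.0, 1.0, -1.0]])  # one factorial point
prediction = forward(x, w_hidden, b_hidden, w_out, b_out)
```

Because tanh is bounded, a network with this output layer can only produce values between −1 and 1, so the response would have to be scaled to that range before training and scaled back afterwards.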

Then, the network calculates the output values and obtains the evaluation criteria by comparing the predicted values and the experimental values. After that, the network updates the weights to improve the criteria and achieves the optimal goals through a neural network learning algorithm. In the present work, a three-layer BPNN was developed for explaining the extraction mechanism of glycyrrhizic acid, in the sense that the weights were updated via the backward propagation algorithm.

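The performance of the model is statistically evaluated by the following evaluation criteria: the maximum absolute error MAE and the correlation coefficient $R$, as well as the coefficient of determination $R^2$. The former two metrics are calculated as follows:

$$\mathrm{MAE} = \max_{1 \le i \le n} \left| y_i - \hat{y}_i \right|, \qquad R = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}},$$

where $\hat{y}_i$ and $\bar{\hat{y}}$ are the predicted values and the corresponding mean value, respectively, and $y_i$ and $\bar{y}$ are the experimental values and the corresponding mean value, respectively. In particular, MAE is a stricter criterion than metrics such as the mean absolute error and the root mean square error, because a single poorly predicted point determines its value.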
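The three criteria can be computed directly from paired experimental and predicted values. The sketch below uses made-up numbers purely for illustration; note that MAE here denotes the maximum absolute error, as in the text, not the mean absolute error.

```python
import numpy as np

def max_absolute_error(y_true, y_pred):
    """Largest absolute deviation between experiment and prediction."""
    return float(np.max(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between experimental and predicted values."""
    dt = np.asarray(y_true) - np.mean(y_true)
    dp = np.asarray(y_pred) - np.mean(y_pred)
    return float(np.sum(dp * dt) / np.sqrt(np.sum(dp ** 2) * np.sum(dt ** 2)))

def coefficient_of_determination(y_true, y_pred):
    """One minus the ratio of residual to total sum of squares."""
    y_true = np.asarray(y_true)
    residual = np.sum((y_true - np.asarray(y_pred)) ** 2)
    total = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - residual / total)

# Made-up example values (not data from this study), in mg.
y_true = np.array([300.0, 350.0, 360.0, 375.0])
y_pred = np.array([305.0, 345.0, 362.0, 370.0])
print(max_absolute_error(y_true, y_pred))  # 5.0
```

A model can have a high $R$ and still a large maximum error on one run, which is why the two are reported side by side.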

In order to avoid overfitting the model to the data, the number of hidden nodes was varied over a range of candidate values, and a 10-fold cross-validation method was applied in this work, where all data were randomly subdivided into two distinct groups: the training set was used to train the network, and the testing set was used to evaluate its performance.
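A cross-validation split of this kind can be sketched without any machine learning library. The helper below is a hypothetical illustration that shuffles the sample indices and yields one training/testing pair per fold; with 30 design points and 10 folds, each testing set holds 3 points and each training set 27.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index arrays for each of the k folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    folds = np.array_split(order, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

splits = list(k_fold_indices(30, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 27 3
```

Every point serves as test data exactly once, so the testing error is not tied to one arbitrary hold-out choice.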

3.4. GA-BPNN Method

Although the backward propagation method is the best known example of a neural network learning algorithm, it has trouble crossing plateaux in the error function landscape. Because it relies on the gradient of the loss function with respect to the network weights, it can become trapped in a local optimum. In this study, GA was applied to optimize the parameters of the BPNN, and the outline of the combined GA-BPNN model is depicted in Figure 3. As can be seen in Figure 3, the combined algorithm consists of the following two parts:
(i) Determining the BPNN structure: the input layer consists of four nodes corresponding to the independent extraction variables, and the output layer has one node associated with the response of glycyrrhizic acid. The optimal number of hidden nodes is determined among the examined candidate values.
(ii) Utilizing GA to optimize the BPNN: a population of individuals is generated randomly, and the individuals are decoded into the network weights. Then, the BP algorithm is employed to update the weights, and the fitness value of each individual is assessed. Finally, the best individual is found through the selection, crossover, and mutation operators, and the corresponding BPNN model is confirmed.

The performance of the BPNN was measured by the mean square error function provided with Matlab 2015a. Meanwhile, a $k$-fold cross-validation method was also applied. In the iterative optimization, the weights were encoded by real encoding, and the encoding length was calculated from the numbers of input, hidden, and output nodes. The tangent sigmoid transfer function was employed at both the hidden and output layers. Network training was performed for a fixed number of epochs. The fitness function of each individual was defined as the larger of the maximum absolute error of the training set and the maximum absolute error of the testing set:

$$\text{fitness} = \max\left(\mathrm{MAE}_{\text{train}}, \mathrm{MAE}_{\text{test}}\right).$$
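The real encoding of the network weights and the fitness rule can be sketched as follows. The layout of the chromosome is an assumption: it is taken here to hold both connection weights and biases, which gives a length of $n_{in} n_{h} + n_{h} + n_{h} n_{out} + n_{out}$; if only connection weights were encoded, the length would be shorter. The function names are hypothetical.

```python
import numpy as np

def encoding_length(n_in, n_hidden, n_out):
    """Chromosome length, ASSUMING weights and biases are both encoded."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out

def decode(chromosome, n_in, n_hidden, n_out):
    """Split a flat real-valued chromosome into layer weights and biases."""
    c = np.asarray(chromosome, dtype=float)
    pos = 0
    w_hidden = c[pos:pos + n_in * n_hidden].reshape(n_in, n_hidden)
    pos += n_in * n_hidden
    b_hidden = c[pos:pos + n_hidden]
    pos += n_hidden
    w_out = c[pos:pos + n_hidden * n_out].reshape(n_hidden, n_out)
    pos += n_hidden * n_out
    b_out = c[pos:pos + n_out]
    return w_hidden, b_hidden, w_out, b_out

def ga_bpnn_fitness(mae_train, mae_test):
    """Larger of the two maximum absolute errors; the GA should minimize it."""
    return max(mae_train, mae_test)

length = encoding_length(4, 7, 1)
print(length)  # 43
w_hidden, b_hidden, w_out, b_out = decode(np.arange(length), 4, 7, 1)
```

Taking the larger of the training and testing errors penalizes an individual that fits the training set well but generalizes poorly, since the testing error then dominates the fitness.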

4. Results and Discussion

Experimental results for optimizing the four factors according to the selected CCD are shown in Table 2. The average value of runs 6, 18, 21, 22, 28, and 30, carried out at the central point, was 365.45 mg, which was higher than the values of most other runs in this experiment. Meanwhile, the relative standard deviation was 0.46%, which showed that the experiments were stable. The highest value (376.46 mg) was obtained in run 12 (see Table 2 for its coded conditions), which highlighted the importance of changing these conditions to enhance the extraction yield. The lowest value (290.93 mg) was obtained in run 7 with the extraction conditions $x_1 = -1$, $x_2 = 1$, $x_3 = -1$, and $x_4 = -1$. A comparison of the extraction parameters of these two extreme values suggests that ammonia concentration and liquid-solid ratio might exert significant effects on the response $Y$.

To disclose the more precise relationships between the independent and dependent variables, the previously introduced RSM, BPNN, and GA-BPNN models in combination with experimental design were utilized to optimize the extraction conditions for glycyrrhizic acid from Glycyrrhiza glabra. The corresponding results and analysis are presented in the following subsections.

4.1. Modeling and Optimization by RSM

From the regression analysis applied to the results in Table 2, the RSM model (8) is derived for the content of glycyrrhizic acid as a function of the extraction conditions $x_1$, $x_2$, $x_3$, and $x_4$, where the coefficients are estimated via the least squares method, and the statistically nonsignificant ones ($p > 0.05$) are removed. The corresponding ANOVA results are displayed in Table 3.

According to (8), the model terms with negative coefficients indicate unfavorable effects on the extraction of glycyrrhizic acid, whereas the model terms with positive coefficients indicate favorable effects on the dependent variable. Meanwhile, the goodness of fit of the regression equation can be assessed by the adjusted coefficient of determination. The $R^2$ value of 0.9465 and the adjusted $R^2$ of 0.8966 show that the model is significant in predicting the response and explains approximately 90% of the variability in the extraction of glycyrrhizic acid. The model $F$-value of 18.96 implies that the model is statistically significant at the 95% confidence level ($p < 0.05$).
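The reported statistics can be cross-checked with the standard formulas for the adjusted $R^2$ and the model $F$-value. The check below assumes 30 runs and 14 model terms (4 linear, 4 squared, and 6 interaction terms of a full quadratic model); the number of terms is an inference from the arithmetic, not a figure stated in the text.

```python
n_runs = 30    # experimental points in the CCD
n_terms = 14   # ASSUMED number of model terms, excluding the intercept
r2 = 0.9465    # reported coefficient of determination

residual_df = n_runs - n_terms - 1
adj_r2 = 1 - (1 - r2) * (n_runs - 1) / residual_df
f_value = (r2 / n_terms) / ((1 - r2) / residual_df)

print(round(adj_r2, 4), round(f_value, 2))  # 0.8966 18.96
```

Under this assumption, both results agree with the reported adjusted $R^2$ of 0.8966 and $F$-value of 18.96, which suggests that these ANOVA figures refer to the full second-order model.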

Even where the coefficient of a linear term is nonsignificant, the corresponding parameter can still negatively influence the response value by virtue of a significant quadratic coefficient. According to ANOVA, the predicted $R^2$ value is 0.6954, and the lack of fit is statistically significant, both of which indicate that this model is not valid for predictive purposes. Nevertheless, the profile for predicted values and desirability option in the GA toolbox of Matlab 2015a was used for the optimization process. Each individual was represented by the extraction parameters $x_1$, $x_2$, $x_3$, and $x_4$, with values within the upper and lower variable bounds in Eq. (1), and the fitness function was the regression equation Eq. (8). In addition, GA was run for 12 generations (Figure 4) with a population size of 30 and the remaining settings at their defaults. As a result, the maximum content of glycyrrhizic acid (427.1562 mg) was predicted by GA at the conditions (0.722, 0, 2, −2) in coded form for the four parameters (Table 4). Verification experiments were then conducted under these extraction conditions (Table 5). It could therefore be concluded that this model is not a good choice for modeling the experimental data of this study. Nonetheless, it is presented here for comparison with BPNN and GA-BPNN modeling.

4.2. Modeling and Optimization by BPNN

First, the 10-fold cross-validation method was adopted to divide the input data into two distinct sets: a training set with 90% of the input data and a testing set with 10% of the input data. Meanwhile, the BPNN toolbox of Matlab 2015a, with a maximum of 2000 epochs, was applied for the BPNN model. The relation among the number of hidden neurons, $R$, $R^2$, and MAE is shown in Table 6.

As can be seen in Table 6, the MAE for the training set improves significantly as the number of hidden neurons increases, whereas, for the testing set, the performance deteriorates for 5, 7, 8, and 9 neurons. Taking the correlation coefficient $R$ into account as well, the network structure with 4 neurons in the hidden layer is selected for further prediction.

Thereafter, the whole data set was used to train the neural network. The learning curve for training is given in Figure 5(a). As can be seen in Figure 5(a), the mean square error decreases initially and then becomes almost constant. Moreover, the trained network is used to estimate the response at the 30 experimental points, and the correlation between the actual and estimated responses is shown in Figure 5(b).

After the network was trained, an optimization was performed using GA, and the results are also listed in Table 4. As already mentioned, this model was obtained so as to deliberately overtrain the network. Despite the better performance obtained for this model, it cannot be considered a good choice because it was obtained through an inadequate training/testing methodology. Hence, there was poor agreement between the BPNN predictions and the experimental data (Table 5) at the above optimum conditions. Although the optimization cannot be considered reliable for this reason, it is presented here for comparison with RSM and GA-BPNN modeling.

4.3. Modeling and Optimization by GA-BPNN

As already mentioned, BPNN is an effective data processing method. However, the BP algorithm is prone to getting stuck in local minima, in the sense that different initial weights give rise to different training outcomes. For this reason, GA was combined with BPNN to optimize the initial distribution of weights and to enable the BPNN to fit not only the training data but also the testing data very well.

Similarly, the GA toolbox of Matlab 2015a was applied for the GA-BPNN model. The relationships among the number of hidden neurons, $R$, $R^2$, and MAE are shown in Table 7. The main criteria for selecting the optimum BPNN structure are the MAE of the test data and the correlation coefficient $R$. As can be seen in Table 7, both $R$ and MAE reach their best values with 7 hidden neurons. The corresponding weights of the BPNN are listed in Table 8. The plot of the measured against the model-predicted values is illustrated in Figure 6(a), which implies that the BPNN model with 7 hidden neurons is consistent with the experimental data. It should be noted that the Q-Q probability plot of the prediction residuals can provide additional information regarding how well the model fits a data set. In fact, a careful examination of the Q-Q plot in Figure 6(b) reveals that the residuals follow the reference line of the expected normal distribution, which supports the credibility of the predictions of this model for glycyrrhizic acid. Hence, based on these considerations as a whole, one can infer that the GA-BPNN model is the best modeling and optimization tool under the specific conditions selected for this work.

After modeling, an optimization was performed using GA, and the results are listed in Table 4. The fitness function was the equation of the BPNN model with the weights presented in Table 8. Compared with the experimental data in Table 5, the GA-BPNN model produced the best agreement between the predicted and the experimental values among the three models. The better generalization ability of GA-BPNN may be due to the fact that GA is good at global searching, which allows the weights to be adjusted more finely.

4.4. Comparative Study of RSM, BPNN, and GA-BPNN

On the one hand, Chinese herbology is the theory of traditional Chinese herbal therapy, which accounts for the majority of treatments in traditional Chinese medicine. There are roughly 13,000 medicinal plants used in China and over 100,000 medicinal recipes recorded in the ancient literature. Chinese herbal extracts are herbal decoctions that have been condensed into a granular or powdered form. For example, glycyrrhizic acid is the major active ingredient of the Chinese herbal medicine Glycyrrhiza glabra and has many pharmacological activities. On the other hand, RSM, BPNN, and GA-BPNN are three alternative computational prediction models capable of solving linear and nonlinear multivariate problems. In the present work, these three models were developed to describe the experimental data on the extraction of glycyrrhizic acid from Glycyrrhiza glabra. All models could be fitted to the experimental response of glycyrrhizic acid. After the models were established, GA was used to optimize the extraction conditions, which are summarized in Table 4 for the selected models. In order to further evaluate their prediction accuracy and practicability, extraction experiments were carried out at each of the predicted optimum conditions, and the corresponding results are displayed in Table 5. The high degree of agreement between the experimental results and the predicted optimum results indicated that GA-BPNN could be used effectively for the evaluation and optimization of the effects of the independent extraction variables on the extracted content of glycyrrhizic acid from Glycyrrhiza glabra.

RSM does provide a regression equation for forecasting and for finding the optimum conditions of the extraction process. However, classical RSM requires a polynomial function, such as a linear, first-order interaction, or second-order quadratic form, to be specified in advance and then fitted by regression. Moreover, the number of terms in the polynomial is limited by the number of experimental design points. Hence, RSM has difficulty capturing the complex intrinsic relationships in the input/output data. This might be why it was less effective for prediction and therefore not well suited to the experimental data in this work. BPNN, in contrast, can model complex relationships, especially nonlinear ones, without complicated equations; however, it cannot be considered a good choice when it is obtained through an inadequate training/testing methodology.
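To make the structural constraint of RSM concrete, the following sketch builds the full second-order polynomial for four coded factors and fits it by least squares. It is illustrative only: the rotatable central composite layout, the number of centre points (n_center=6), and the synthetic coefficients are assumptions and do not reproduce the design or data of this work. For k = 4 factors the quadratic model has 1 + 4 + 6 + 4 = 15 terms, so at least 15 distinct design points are needed to estimate it.

```python
import itertools
import numpy as np

def quadratic_design_matrix(X):
    """Columns: intercept, linear terms, two-factor interactions, squared terms."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in itertools.combinations(range(k), 2)]
    cols += [X[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)

def central_composite_design(k, n_center=6):
    """Coded CCD: 2^k factorial points, 2k axial points at +/-alpha, centre points.
    n_center is an assumed value chosen for illustration."""
    alpha = (2 ** k) ** 0.25  # rotatable design; equals 2 when k = 4
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.vstack([alpha * np.eye(k), -alpha * np.eye(k)])
    center = np.zeros((n_center, k))
    return np.vstack([factorial, axial, center])

X = central_composite_design(k=4)
A = quadratic_design_matrix(X)

# Synthetic, noise-free response generated from random coefficients (not real data).
rng = np.random.default_rng(1)
true_beta = rng.normal(size=A.shape[1])
y = A @ true_beta

# Ordinary least squares recovers the coefficients of the assumed polynomial.
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print("design points:", A.shape[0], "model terms:", A.shape[1])
```

The fit can only ever reproduce the polynomial form chosen in advance, which is the limitation discussed above.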

Since GA has good global searching ability and can find a near-optimum solution without gradient information about the error function, it has become a powerful tool for optimization, searching, and machine learning. Additionally, the cross-validation method prevents the so-called overtraining, which reduces the ability of a neural network to generalize. Comparing the discrepancies between the experimental and predicted data in Table 5, it is evident that GA-BPNN is the most precise of the three methods (RSM, BPNN, and GA-BPNN) for modeling the extraction of glycyrrhizic acid from Glycyrrhiza glabra.
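As an illustration of how cross-validation estimates generalization error, a generic k-fold procedure can be sketched as follows. The exact validation scheme used for the networks in this work is not restated here, so the fold count, the synthetic data, and the linear stand-in model (fit_linear, predict_linear) are all assumptions made for the example.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)

def cross_validated_rmse(fit, predict, X, y, n_folds=5):
    """Average validation RMSE over folds; each fold is held out exactly once."""
    folds = k_fold_indices(len(y), n_folds)
    errors = []
    for i, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        residual = predict(model, X[val_idx]) - y[val_idx]
        errors.append(np.sqrt(np.mean(residual ** 2)))
    return float(np.mean(errors))

# Stand-in model (ordinary least squares); a neural network would be used in practice.
def fit_linear(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

# Synthetic data for four coded factors (not the experimental data of this work).
rng = np.random.default_rng(2)
X = rng.uniform(-2.0, 2.0, size=(30, 4))
y = 1.0 + X @ np.array([0.5, -1.0, 2.0, 0.0]) + rng.normal(0.0, 0.1, size=30)

cv_rmse = cross_validated_rmse(fit_linear, predict_linear, X, y)
print("cross-validated RMSE:", round(cv_rmse, 4))
```

A model whose training error keeps falling while this held-out error rises is being overtrained, which is the situation cross-validation is meant to detect.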

As discussed in [29], GA-BPNN is not suitable for some complicated data sets. When the data sets are complex, GA processes them slowly and with difficulty, so it can only be treated as a presearch technique, that is, a way to find a better search space. Moreover, GA-BPNN is not always valid, and its parameters are hard to determine. The future goals of this study include applying this method to the optimization of bioactive ingredient extraction from other Chinese herbal drugs and adjusting the related parameters to further improve the efficiency of the algorithm.

4.5. Remarks

In current studies of traditional Chinese medicine, a variety of data (such as extraction data, pharmacokinetic and pharmacodynamic data, and clinical data) are produced. The corresponding relationships are complex, and some are even random and fuzzy. Therefore, deterministic approaches are often powerless, and one needs to appeal to new technologies, such as complex systems and artificial intelligence. In particular, machine learning, a fundamental concept in artificial intelligence research, has been demonstrated to be able to describe complex relationships between inputs and outputs. Moreover, the recently developed deep neural networks have an even more powerful self-learning ability [26]. Indeed, neural networks in machine learning and genetic algorithms in global optimization have received extensive attention and research and have shown attractive application prospects.

It is worth noting that the evaluation and optimization of the extraction process of saponins and total flavonoids from Glycyrrhiza glabra have been conducted in [32]. That is a multilevel optimization problem, and the entropy weight method was used to assign the corresponding weights. In addition, a BPNN model was developed to explain the extraction mechanism. That previous study found that the BP algorithm easily gets stuck in a local minimum, which means that one has to run it several times to obtain a satisfactory result. In view of this fact, a combined GA-BPNN model is offered here as an alternative to RSM and BPNN as a modeling tool. Accordingly, one component from Glycyrrhiza glabra is reconsidered as a single-objective optimization problem to illustrate the feasibility of the methodology proposed in this work. Admittedly, the results of the two studies are not directly comparable, and multilevel optimization problems remain important topics for further research.
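For readers unfamiliar with the entropy weight method mentioned above, a common textbook formulation is sketched below. It is not necessarily the variant used in [32]: the normalization by column sums, the assumption that all indices are non-negative and "larger is better", and the small example matrix are illustrative choices only.

```python
import numpy as np

def entropy_weights(scores):
    """Entropy weight method for an (n_runs, n_indices) matrix of non-negative scores.
    Indices that vary more across runs carry more information and get larger weights.
    Assumes at least one index is not constant across runs."""
    scores = np.asarray(scores, dtype=float)
    n_runs = scores.shape[0]
    # Proportion of each run within each index (column).
    p = scores / scores.sum(axis=0, keepdims=True)
    # Shannon entropy per index, with 0 * log(0) treated as 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    entropy = -plogp.sum(axis=0) / np.log(n_runs)
    divergence = 1.0 - entropy
    return divergence / divergence.sum()

# Hypothetical example: the first index is identical in every run, the second varies.
example_scores = np.array([[10.0, 5.0],
                           [10.0, 1.0],
                           [10.0, 9.0]])
weights = entropy_weights(example_scores)
print("weights:", np.round(weights, 4))
```

In this example the constant index receives essentially zero weight, since it cannot discriminate between runs, and the varying index receives nearly all of it.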

5. Conclusion

In this study, the bioactive ingredient glycyrrhizic acid was successfully extracted from Glycyrrhiza glabra. By using central composite design, the analysis time and experimental expense were decreased without an obvious reduction in efficiency. Afterwards, the significant variables were optimized by RSM, BPNN, and GA-BPNN. These three models were compared for their predictive and generalization capabilities as well as their ability to optimize the concentration of glycyrrhizic acid. Comparing the results of numerical optimization and experiments for the above methods showed that GA-BPNN gave the most satisfactory results owing to its adequate training/testing method. Under the optimum conditions, the experimental and predicted contents of glycyrrhizic acid were 376.46 mg and 381.24 mg, respectively. Although the GA-BPNN model has been successfully validated by comparison with two well-known approaches, it has drawbacks in processing more complex data sets. Nevertheless, the main benefits of GA-BPNN for the extraction and determination of glycyrrhizic acid from Glycyrrhiza glabra are low sample consumption, minimum use of raw materials, simplicity, and a highly enriched product.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was jointly supported by the NSFC (Grant no. 81473587), the Natural Science Foundation of Zhejiang Province (Grant no. LR16H270001), the Foundation of Zhejiang Educational Committee (Grant no. Y201534584), and Basic Public Welfare Research Project of Zhejiang Province (Grant no. LGN18A010001).

Supplementary Materials

To make the article easier to follow, a graphical abstract is provided as Fig. S1 in the supplementary material.