Discovery of Herbal Pairs Containing Gastrodia elata Based on Data Mining and the Delphi Expert Questionnaire and Their Potential Effects on Stroke through Network Pharmacology

Background. Traditional Chinese medicine (TCM) formulae can be regarded as a source of new antistroke drugs. The aim of this study was to discover herbal pairs containing Gastrodia elata (Tianma, TM) from formulae based on data mining and the Delphi expert questionnaire. The proposed approach for discovering new herbal combinations, which included data mining, a clinical investigation, and a network pharmacology analysis, was evaluated in this study. Methods. A database of formulae containing TM was established. All possible herbal pairs were acquired by association rule mining, and herbal pairs containing TM were screened according to the Support and Confidence levels. Taking stroke as the research object, the relationships between herbal pairs containing TM and stroke were explored by the Delphi expert questionnaire and statistical methods. To explore the effects of herbal pairs containing TM on stroke, a network pharmacology analysis was performed to predict core targets, biological functions, pathways, and mechanisms of action. Results. A total of 1903 formulae containing TM, involving 896 Chinese herbal medicines (CHMs) and 126 herbal pairs containing TM, were analyzed by association rules. A total of 27 herbal pairs were further screened according to the Support and Confidence levels. Twelve herbal pairs containing TM were added according to the expert questionnaires. Weightiness analysis showed that 9 groups of core herbal pairs contained TM, including TM-QX, TM-JH, TM-CX, TM-GG, TM-SJM, TM-JC, TM-SCP, TM-MJZ, and TM-GT. Two core herbal pairs, TM-JH and TM-CX, were randomly selected to explore their network pharmacological mechanisms in stroke. The important stroke-related biological targets identified by the network pharmacological analysis of TM-CX and TM-JH were PTGS2, ACE, APP, NOS1, and NOS2.
An herbal pair-compound-core target-pathway network (H-C-T-P network) was established, and arginine biosynthesis, arginine and proline metabolism, and the relaxin signaling pathway were identified by enrichment analysis. Conclusion. Network pharmacology indicated that the herbal pairs TM-CX and TM-JH, obtained from data mining and the expert investigation, may help prevent and treat stroke. Combining data mining with questionnaires could be a viable approach to uncover hidden knowledge about TCM formulae and to discover herbal combinations with clinical and medicinal value.


Introduction
Over thousands of years, traditional Chinese medicine (TCM) has accumulated extensive theoretical knowledge and clinical experience. TCM formulae are formed from herbal medicines, animal medicines, minerals, and other traditional medicines to enhance curative effects and reduce adverse reactions. The book named Prescriptions for Fifty-two Diseases (52 Bingfang, 206 BC∼8 AD) is one of the oldest existing works recording medical formulae in China. Since then, Yellow Emperor's Inner Canon (Huangdi Neijing, 770 BC∼220 AD) laid the foundation for the formula theories of TCM, and Treatise on Cold Pathogenic and Miscellaneous Diseases (Shanghan Zabing Lun, 200 AD∼210 AD) is another representative work [1] containing approximately 100,000 formulae recorded in various studies. The numerous formulae are of great value in clinical research of TCM [1]. Herbal pairs are the simplest and most basic element of formulae and represent the concentrated expression of herbal compatibility. Shennong's Classic of Materia Medica (Shennong Bencao Jing, 100 BC∼46 BC) provided an important theoretical basis for herbal compatibility [2]. The principle of "monarch, minister, assistant, and guide" described in Yellow Emperor's Inner Canon has been applied to herbal compatibility and provides guidance for clinical practice [3]. The compatibility of Chinese herbal medicines (CHMs) can enhance the efficacies of herbs and reduce their possible toxicities [4,5]. The reasonable compatibility of CHMs and the compatibility law of formulae have become a foundation in the exploration of TCM [6,7]. GASTRODIAE RHIZOMA (Gastrodia elata Bl., Tianma, TM) is a main herb that can prevent and treat stroke [8], a destructive neurological condition that can lead to death and long-term disability. Stroke places a huge burden on society [9].
A growing body of experimental and evidence-based medical evidence shows that TM combined with other CHMs can be used to treat and prevent stroke and recurrent stroke. TM and Uncaria rhynchophylla (Gouteng, GT) have been confirmed to modulate the antioxidant system and antiapoptotic genes in oxygen glucose-deprived neuronal differentiated PC12 cells and in middle cerebral artery occlusion (MCAO) rats [10], benefiting the treatment of stroke or recurrent stroke. Therefore, for the development of new drugs, it is of great significance to identify more herbal pairs containing TM from formulae or clinical prescriptions that can effectively prevent ischemic stroke and to explain their antistroke mechanisms. The application of data mining, bioinformatics, network pharmacology, and other emerging technologies has greatly changed the scientific understanding of TCM and has played major roles in addressing the complexity of TCM [11][12][13]. Data mining is considered an important tool for discovering the potential associations of formulae [14]. TCM herbal formulae have been considered multicomponent and multitarget therapeutics, which can potentially meet the demand to treat a number of complex diseases, and network pharmacology methods can be used to gain a priori knowledge about the combination rules embedded in formulae [15]. Thus, these are considered emerging and powerful approaches for revealing the underlying complex interactions between formulae and cellular proteins [16]. Studies have suggested employing data mining and TCM network pharmacology approaches as a new research paradigm for translating TCM from an experience-based medicine system to an evidence-based medicine system, which would accelerate drug discovery from TCM [17].
In this study, all herbal pairs containing TM were obtained from formulae by data mining, and herbal pairs that could effectively prevent and treat stroke were further identified through a clinical expert questionnaire. Then, the potential effects of the herbal pairs containing TM on stroke were predicted and evaluated by analyzing their roles in the stroke target network by network pharmacology. This study provides new research ideas and strategies for discovering new Chinese herbal pairs and potential herbal combinations.

Data Sources.
The data used in this study were obtained from the Chinese Traditional Medicine Database, which contains 84464 formulae from more than 710 ancient and modern documents and was built by the Information Institute of China Academy of Chinese Medical Sciences (http://cintmed.cintcm.com/cintmed/main.html).

Inclusion Criteria.
The formulae underwent data cleaning and transformation to be suitable for data mining [18]. Formulae and CHMs were required to meet the following criteria: (1) the formula composition contains TM; (2) the formula originated from ancient books of TCM before 1911 AD with a clear source; (3) the administration route of the formula was oral; (4) the formula has a clear composition of herbs (here, "herbs" refer not only to plants but also to animal sources and minerals with therapeutic effects); and (5) the names of CHMs were standardized according to Chinese Pharmacopoeia (2015 edition, China Pharmaceutical Science and Technology Publishing House) and Great Compendium of Chinese Medicines (second edition of 2014, Shanghai Science and Technology Publishing House), such as changing the herbal name "Wu Shi" to "Niu Bang Zi" (ARCTII FRUCTUS) and changing the name "Dong Chong Cao" to "Dong Chong Xia Cao" (CORDYCEPS).

Data Mining Process and Methods.
High-frequency herbal pairs were discovered using the association rule algorithm implemented in the arules package for R [19]. Association rule mining is the most common method for data mining of TCM formulae; it can be used to investigate CHM compatibility patterns and to reflect the interdependence and relationship between variables [20]. The high-frequency herbal pairs containing TM were screened and retained according to Support, given that Lift > 1 and Confidence ≥ 90% [21]. The higher the Support is, the more consistently the two herbs are used together and the more strongly they depend on each other as a pair.
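Support is a measure that reflects how frequently the rule occurs in the database. In this study, Support indicates the probability that herbs X and Y occur together in a formula:

$$\mathrm{Support}(X \Rightarrow Y) = P(X \cap Y) \tag{1}$$

Confidence refers to the ratio of the probability of the coexistence of herbs X and Y to the probability of the existence of herb X in a dataset, reflecting the closeness of the relationship between them:

$$\mathrm{Confidence}(X \Rightarrow Y) = \frac{P(X \cap Y)}{P(X)} \tag{2}$$

Lift refers to the ratio of the Confidence of the rule to the overall probability of herb Y, that is, how much more often X and Y occur together than would be expected if they were independent:

$$\mathrm{Lift}(X \Rightarrow Y) = \frac{P(X \cap Y)}{P(X)\,P(Y)} \tag{3}$$

A Lift greater than 1 indicates that the association rule is meaningful.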
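The three measures can be illustrated with a short sketch. The study itself used the arules package in R; the Python function below is only an illustration of the definitions, and the four toy formulae and the name `rule_metrics` are hypothetical and not taken from the study database.

```python
def rule_metrics(formulae, x, y):
    """Support, Confidence, and Lift of the rule x => y.

    `formulae` is a list of sets, each set holding the herbs of one formula.
    """
    n = len(formulae)
    n_x = sum(1 for f in formulae if x in f)
    n_y = sum(1 for f in formulae if y in f)
    n_xy = sum(1 for f in formulae if x in f and y in f)

    support = n_xy / n                            # P(X and Y)
    confidence = n_xy / n_x if n_x else 0.0       # P(X and Y) / P(X)
    lift = confidence / (n_y / n) if n_y else 0.0  # Confidence / P(Y)
    return {"support": support, "confidence": confidence, "lift": lift}


# Hypothetical toy data: four formulae written as sets of herb codes.
toy_formulae = [
    {"TM", "CX", "JH"},
    {"TM", "CX"},
    {"TM", "GT"},
    {"CX", "JH"},
]

metrics = rule_metrics(toy_formulae, "TM", "CX")
print(metrics)
```

In this toy example TM and CX co-occur in two of four formulae, so Support is 0.5, while Lift is below 1, so the rule would not pass the Lift > 1 screen described above.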

Delphi Process.
The Delphi process is an objective method to mine formula compatibility with data mining. In the mining of herbal pairs for antistroke treatment, attention should be paid to both deciphering the potential objective laws of ancient formulae and the clinical application. The Delphi method is considered an ideal method for reaching consensus and is essentially an anonymous feedback inquiry method. Two rounds of consultation were performed in this study. The questionnaire based on the Delphi method involved five aspects.

Group.
The research group was set up to investigate the clinical use of TM in the prevention and treatment of stroke.

Object.
Referring to international experience, the questionnaire follows the general principles of integrity, simplicity, guidance, comparability, uniformity, operability, practicality, regionality, authority, and representativeness [22]. Herbal pairs containing TM included the high-frequency herbal pairs obtained by data mining and the clinical herbal pairs obtained from a literature investigation. Clinicians were required to recommend the commonly used herbal pairs containing TM in the first round of the questionnaire, and the herbal pairs were added to the second round. In the whole process of questionnaire consultation, clinicians should evaluate the relationship of herbal pairs and stroke from "herbal pairs correspond to clinical syndrome type of stroke" and "seven features of compatibility" [23].

Questionnaire Design Form.
The clinicians evaluated whether herbal pairs containing TM were commonly used in preventing and treating stroke, and the degree of use of herbal pairs was graded according to a Likert scale (Table 1).

Selection Conditions of Clinicians.
Conditions for screening clinicians were defined, and the questionnaires were sent to the clinicians who met the conditions and agreed to participate [24] (Table 2). Eligible clinicians had to meet the first 4 conditions and at least one of conditions 5, 6, and 7. All the completed questionnaires were received within one month, and statistical processing was carried out. The names of the respondents were not revealed at any point in the process.

Feedback.
In the second round of the questionnaire, the judgment results and opinions obtained from the first round were fed back to the clinicians, who were asked whether they had changed their original judgments. The clinicians in the second round were also invited to evaluate the newly added herbal pairs.

Statistical Analysis of the Delphi Process.
The indexes of the Delphi questionnaire were quantitatively analyzed by the descriptive statistics module, the reliability analysis module, and the nonparametric tests module of SPSS 19.0. The maximum, minimum, mean, standard deviation, median, frequency, coefficient of variation, reliability, and Kendall's W harmony coefficient were analyzed [25] (Table 3).
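For readers without access to SPSS, the main indicators can be sketched with textbook formulas. The functions below are an illustrative approximation rather than the SPSS procedures used in the study: all function names are hypothetical, the example scores are made up, and the Kendall's W implementation assumes complete rankings without ties (SPSS applies a tie correction).

```python
from statistics import mean, stdev, variance


def positive_coefficient(returned, distributed):
    """Questionnaire recovery rate in percent (C = n / N x 100)."""
    return returned / distributed * 100


def coefficient_of_variation(scores):
    """CV in percent: sample standard deviation divided by the mean."""
    return stdev(scores) / mean(scores) * 100


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


def kendalls_w(rankings):
    """Kendall's W for m raters ranking n objects (ranks 1..n, no ties)."""
    m, n = len(rankings), len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((rs - mean_rank_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# 18 of 24 questionnaires returned, as in the first round of this study.
print(positive_coefficient(18, 24))
# Hypothetical data: three raters who rank three herbal pairs identically.
print(kendalls_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]]))
```

Perfect agreement among raters gives W = 1, and completely offsetting rankings give W = 0, which matches the 0 to 1 range stated for the harmony coefficient.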

Weightiness Analysis.
In this experiment, the data mining association rule and the Delphi expert investigation method were used to obtain the herbal pairs containing TM for stroke. The two groups of data were standardized by the dimensionless method, and the weightiness analysis was performed according to a certain proportion [26]. The dimensionless calculation formula for the data $x_1, x_2, \ldots, x_n$ is as follows:

$$y_i = \frac{x_i - \min_{1 \le j \le n} x_j}{\max_{1 \le j \le n} x_j - \min_{1 \le j \le n} x_j} \tag{4}$$

The new sequence $y_1, y_2, \ldots, y_n \in [0, 1]$ is dimensionless. The herbal pairs in the association rule group (Group A) were subjected to dimensionless treatment according to the Support, and the herbal pairs in the Delphi expert investigation group (Group B) were subjected to dimensionless treatment according to the degree of herbal pair usage.
The weightiness analysis calculation formula is given by

$$W_i = \alpha\, y_{iA} + \beta\, y_{iB}, \qquad \alpha + \beta = 1 \tag{5}$$

where $y_{iA}$ refers to the dimensionless value of herbal pair $i$ obtained by the data mining method (Group A), and $y_{iB}$ refers to the dimensionless value of herbal pair $i$ obtained by the Delphi expert survey method (Group B). This study considered data mining and expert questionnaires to be equally important ($\alpha = \beta = 0.5$). The index function was used to randomly select two herbal pairs for network pharmacology research.
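The weightiness calculation can be sketched as follows, assuming that the dimensionless step is min-max scaling to [0, 1] and that the two groups are weighted equally, as stated above. The pair names, Support values, and usage scores are hypothetical placeholders, not results from this study.

```python
def min_max(values):
    """Scale a sequence to [0, 1]; a constant sequence maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weightiness(support, usage, alpha=0.5):
    """Combine scaled Support (Group A) and scaled usage score (Group B).

    alpha is the weight of Group A; 1 - alpha is the weight of Group B.
    """
    y_a = min_max(support)
    y_b = min_max(usage)
    return [alpha * a + (1 - alpha) * b for a, b in zip(y_a, y_b)]


# Hypothetical values for three herbal pairs.
pairs = ["pair_1", "pair_2", "pair_3"]
support = [10, 20, 30]   # e.g. Support expressed as co-occurrence counts
usage = [4.0, 2.0, 3.0]  # e.g. mean Likert score from the questionnaire

scores = weightiness(support, usage)
ranked = sorted(zip(pairs, scores), key=lambda item: item[1], reverse=True)
print(ranked)
```

Because both inputs are scaled to the same range before they are combined, a pair can rank highly only if it scores well on at least one source and not poorly on the other.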

Network Pharmacological Analysis of Herbal Pairs Containing TM.
The molecular mechanism of herbal pairs containing TM was predicted and confirmed by network pharmacology. To facilitate this research, herbal pairs with high weightiness analysis values were screened, and the mechanism of action of the herbal pairs on stroke was analyzed by using the network pharmacology method.
[27], and the compounds of CHM were supplemented and validated by the PubChem Compound database.

Screening of Potential Active Compounds of Herbs.
DruLiTO is a tool for calculating the druglikeness of compounds from herbal pairs and known drugs, which follows Lipinski's "rule of five" (i.e., a molecule with a molecular mass less than 500 Da, no more than 5 hydrogen bond donors, no more than 10 hydrogen bond acceptors, no more than 10 rotatable bonds, and an octanol-water partition coefficient log P not greater than 5) [28]. SwissADME (http://www.swissadme.ch/) was used to screen the components of herbal pairs that are easily absorbed by the gastrointestinal (GI) tract and that permeate the blood-brain barrier (BBB). The properties of the components were displayed with BOILED-Egg, a predictive model available in SwissADME [29].
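The druglikeness thresholds listed above amount to a simple filter. The sketch below applies them to precomputed molecular descriptors; it does not call DruLiTO or SwissADME, and the `Descriptors` class, the function name, and the descriptor values are hypothetical illustrations.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (values supplied by the user)."""
    mol_weight: float     # molecular mass in Da
    h_donors: int         # hydrogen bond donors
    h_acceptors: int      # hydrogen bond acceptors
    rotatable_bonds: int
    log_p: float          # octanol-water partition coefficient


def passes_druglikeness(d: Descriptors) -> bool:
    """Apply the thresholds stated in the text above."""
    return (
        d.mol_weight < 500
        and d.h_donors <= 5
        and d.h_acceptors <= 10
        and d.rotatable_bonds <= 10
        and d.log_p <= 5
    )


# Hypothetical compounds, not taken from the study's compound list.
small_phenol = Descriptors(154.2, 2, 3, 2, 0.5)
large_glycoside = Descriptors(650.0, 8, 14, 12, -1.0)

print(passes_druglikeness(small_phenol))
print(passes_druglikeness(large_glycoside))
```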

Target Prediction for Potential Active Molecule Compounds.
The SwissTargetPrediction platform [30] was used to predict the candidate targets of active molecule compounds in herbs, and Homo sapiens was chosen by default.

Building the Disease Target Database.
Taking "apoplexy," "apoplexia," "stroke," and "ischemic stroke" as themes, the disease targets were searched in OMIM (Online Mendelian

Statistical indicators | Concept of indicators | Significance of indicators
Positive coefficient (C) | Recovery rate of the expert survey and consultation questionnaire (C = n/N × 100%; n represents the number of clinicians participating in the questionnaire; N represents the total number of clinicians consulted) | A high positive coefficient indicates that the clinicians have a high degree of attention and enthusiasm in participating in this research project
Concentration degree | Reflects the degree of concentration of clinicians' opinions on the relative importance of various indicators; evaluated by the median, mean, standard deviation, and percentage | The higher the percentage, the larger the mean, and the smaller the standard deviation, the more important the CHM in the expert evaluation opinions
Degree of coordination (CV) | Reflects the convergence of divergent clinicians' opinions, which is usually expressed by the coefficient of variation and the Kendall harmony coefficient | CV = (S/X̄) × 100%; the smaller the coefficient of variation is, the higher the degree of coordination among the clinicians' opinions on herbal pairs containing TM, the smaller the divergence, and the better the convergence. The Kendall harmony coefficient indicates the overall degree of coordination of clinicians' opinions on herbal pairs. The larger the value is, the higher the degree of coordination of clinicians' opinions (its value ranges from 0 to 1)
Questionnaire reliability (α) | Reliability refers to the degree of consistency of the results obtained by repeated measurements of the same object using the same method, which is expressed by Cronbach's alpha | α ≥ 0.9 indicates high reliability; 0.8 ≤ α < 0.9 indicates acceptable reliability; 0.7 ≤ α < 0.8 indicates that some problems may exist; and α < 0.7 indicates major problems

The Gene Ontology (GO) annotation bubble charts of the core targets were mapped using the Omicshare cloud platform (https://www.omicshare.com/tools/Home/Report/goenrich). The list of genes was submitted, the species "Homo sapiens" was selected for GO annotation enrichment, and the functions were screened with a cutoff P < 0.001.

The "herbal pair-compound-core target-pathway" network (H-C-T-P network) was constructed with Cytoscape 3.2.1. Starting from the analysis of core targets, the related active small molecule compounds of herbal pairs and metabolic pathways of diseases were correlated to form the molecular mechanism action network of herbal pairs [31]. The String database was used to enrich the molecular pathways from all core targets of herbal pairs for treating stroke, and these pathways were screened with a cutoff FDR (false discovery rate) < 0.05.

Association Rule Analysis Results.
A total of 896 herbs were identified in the 1903 formulae containing TM, with 15918 occurrences in total. SAPOSHNIKOVIAE RADIX

Investigation Results of Herbal Pairs Containing TM Based on the Delphi Expert Questionnaire

Expert Basic Survey.
According to the expert screening criteria, the respondents of the returned questionnaires were evaluated based on their academic qualifications, professional titles, and working years. The questionnaires from respondents who did not meet the criteria for expert selection were excluded from the statistical analysis (Table 5).

Expert Positive Coefficient.
In the first round, 24 questionnaires were distributed in May 2017, and 18 valid questionnaires were received back (the positive coefficient was 75%). In the second round, 21 questionnaires were distributed from July to September 2017, and 20 valid questionnaires were collected (the positive coefficient was 94.73%). Both rounds of questionnaires were effective consultations.

Distribution of Expert Opinions.
The usage of TM and its herbal pairs by clinicians was the best evidence reflecting the application value of TM (Table 6).

Reliability of the Expert Questionnaire.
The reliability of the expert consultation questionnaire was calculated by the reliability analysis in the "measurement" module of SPSS 21.0. Cronbach's alpha coefficients of the two rounds of questionnaires were 0.900 and 0.813, respectively, which showed that the indexes were reliable in the two rounds. (Table 8).

Network Pharmacological Analysis
According to the above results, TM ⇒ QX, TM ⇒ FF, TM ⇒ JC, TM ⇒ CX, TM ⇒ QH, TM ⇒ BFZ, and TM ⇒ JH had relatively high weight values. To facilitate network pharmacological analysis, two herbal pairs were selected by the index function for the follow-up study. TM-CX and TM-JH were randomly selected.

Potential Active Molecule Compounds of Herbal Pairs Containing TM.
Potential active molecule compounds were screened by SwissADME online tools (Figure 1), including 115 from CX, 28 from JH, and 24 from TM. Prediction by the BOILED-Egg chart showed that the compounds of TM, CX, and JH had good gastrointestinal absorption (blue) or blood-brain permeation (red), with a tendency to permeate the BBB.

Target Prediction and Analysis.
A total of 680 targets of active small molecule compounds of TM, CX, and JH were predicted based on the principle of chemical structure similarity, including 383 targets related to CX, 184 targets related to JH, and 149 targets related to TM. A total of 232 stroke-related targets were screened by the OMIM and HPO disease databases. A Venny chart was used to show the common predictive targets and stroke targets (Figure 2). Among the common targets with herbal pairs, there were 106 common targets related to TM-CX and 62 common targets interrelated with TM-JH. Among the common stroke targets, there were 12 common targets relevant to CX, 9 common targets relevant to TM, and 9 common targets relevant to JH.

Common Target Clustering Analysis.
The results of cluster analysis showed that there were 17 common targets between the potential targets of TM-CX and the known targets of stroke (Figure 3), of which 6 were core targets (Figure 4). There were 14 common targets between the potential targets of TM-JH and the known targets of stroke (Figure 5), of which 7 were core targets (Figure 6). Targets such as PTGS2, ACE, APP, NOS1, and NOS2 were all in the datasets of CX and JH (Tables 9∼10).
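The overlap between the two core target sets can be checked with plain set operations. The sketch below uses the core targets of TM-CX and TM-JH as they are listed in the Discussion; the function name `shared_and_unique` is a hypothetical helper introduced only for this illustration.

```python
# Core targets of each herbal pair, as listed in the Discussion.
tm_cx_core = {"PTGS2", "NOS2", "NOS1", "APP", "F2", "ACE"}
tm_jh_core = {"APP", "ACE", "PTGS2", "MMP9", "LDLR", "NOS2", "NOS1"}


def shared_and_unique(first, second):
    """Split two target sets into shared targets and targets unique to each."""
    return {
        "shared": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }


overlap = shared_and_unique(tm_cx_core, tm_jh_core)
print(sorted(overlap["shared"]))       # targets present in both herbal pairs
print(sorted(overlap["only_first"]))   # targets found only for TM-CX
print(sorted(overlap["only_second"]))  # targets found only for TM-JH
```

The shared set reproduces the five targets named in the text (PTGS2, ACE, APP, NOS1, and NOS2), with F2 specific to TM-CX and MMP9 and LDLR specific to TM-JH.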

Biological Function Enrichment Analysis of Core Targets.
We obtained 1890 GO annotations about the core targets relevant to TM-CX, including the regulation of protein

Herbal Pair-Compound-Core Target-Pathway Network.
The pathways corresponding to the core targets relevant to each herbal pair were obtained by the String database, and the "herbal pair-compound-core target-pathway network" (H-C-T-P network) was established by Cytoscape 3.7.1. In the H-C-T-P network (Figure 9), the degree of PTGS2 was the highest, and many compounds in TM, CX, and JH were related to PTGS2. NOS1 and NOS2 were common targets related to CX and JH. APP was a common target related to TM and CX. ACE was a common target related to TM and JH. MMP9 and LDLR were targets related to JH, and F2 was a target related to CX. In the pathway enrichment analysis, important pathways were involved, such as metabolic pathways, cancer pathways, and the relaxin signaling pathway.
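The notion of node degree can be illustrated on a simplified version of this network. The sketch below keeps only the herb-to-target relationships stated in the paragraph above and omits the compound and pathway layers, so the degrees are not those of the full Cytoscape network; the variable names are hypothetical.

```python
from collections import Counter

# Herb-to-core-target relationships as described in the text above.
herb_targets = {
    "TM": ["PTGS2", "APP", "ACE"],
    "CX": ["PTGS2", "NOS1", "NOS2", "APP", "F2"],
    "JH": ["PTGS2", "NOS1", "NOS2", "ACE", "MMP9", "LDLR"],
}

# Flatten the mapping into an edge list of (herb, target) pairs.
edges = [(herb, target) for herb, targets in herb_targets.items() for target in targets]

# The degree of a target is the number of herbs connected to it.
degree = Counter(target for _, target in edges)
top_target, top_degree = degree.most_common(1)[0]

print(top_target, top_degree)
print(sorted(degree.items(), key=lambda item: (-item[1], item[0])))
```

Even in this collapsed herb-level view, PTGS2 is the only target linked to all three herbs, which is consistent with its having the highest degree in the full network.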

Discussion
TCM is a popular complementary or alternative medicine in Europe, America, and other countries [32,33] and involves a mature theory of methodology, prescription, and CHMs. Data mining has been successfully used to study the rules of CHMs combined with syndrome differentiation of TCM [13]. TM was first recorded in Shennong's Classic of Materia Medica and was listed as a top grade medicine. It is mainly used to prevent and treat headache, dizziness, stroke, migraine, epilepsy, convulsions, neurological headaches, Alzheimer's disease, and other diseases [34]. Studies on compounds and pharmacological effects have shown that TM includes phenols, polysaccharides, sterols, organic acids, and other chemical components. TM is used for a number of effects, including sedative, hypnotic, antiepileptic, anticonvulsant, antianxiety, antidepressive, neuroprotective, antivertigo, circulation-regulating, anti-inflammatory, analgesic, antioxidative, memory-improving, antiaging, antiviral, and antitumoral effects. In this study, 126 herbal pairs containing TM were obtained based on data mining. A total of 27 herbal pairs were selected for inclusion in the expert questionnaire consultation. Twelve herbal pairs containing TM were added during the expert questionnaire survey.
(Figure legend: blue dots represent molecules that are better absorbed by the intestine; red dots represent molecules that are more permeable to the brain.)
Through the weight analysis of the data mining and Delphi results, TM-CX and TM-JH were selected for evaluation by network pharmacological methods to explore their molecular mechanisms in stroke. The findings suggested that the herbal pairs containing TM might have specific therapeutic effects against stroke. PTGS2, NOS2, NOS1, APP, F2, and ACE were identified as core targets of TM-CX, and APP, ACE, PTGS2, MMP9, LDLR, NOS2, and NOS1 were identified as core targets of TM-JH. Among them, NOS2, NOS1, APP, ACE, and PTGS2 exist in both herbal pairs. Among the active small molecule compounds associated with the core targets, 7 are from TM, 38 are from CX, and 13 are from JH. For example, L-pyroglutamic acid, 3-hydroxybenzoic acid, 4-formyl-2-methoxyphenyl acetate, vanillyl alcohol, p-hydroxybenzyl ethyl ether, bis(4-hydroxybenzyl)ether, and 4-ethoxymethylphenyl-4′-hydroxybenzyl ether are considered the main active small molecule compounds of TM. Research shows that TM and GT extract can treat hypertension and cerebrovascular disease, markedly improve neurological function, and reduce cerebral infarction [35]. Three pathways with very low FDRs were predicted to be important for TM-CX and TM-JH in the treatment of stroke: arginine biosynthesis, arginine and proline metabolism, and the relaxin signaling pathway. NOS1 and NOS2 belong to the family of nitric oxide synthases, which produce nitric oxide (NO), a reactive free radical that acts as a biologic mediator in several processes in the brain and peripheral nervous system. NOS1 is the main source of NO in the central nervous system and catalyzes the conversion of L-arginine [36,37]. NOS2 is involved in the inflammatory cascade reaction process after ischemia. The inhibition of NOS2 expression in leukocytes and brain endothelial cells after cerebral ischemia can prolong the treatment time window and induce long-term neuroprotection [38].
APP is upregulated after acute stroke, chronic cerebrovascular disease, hypoxia, and ischemic brain injury, and it may exert its function by regulating neuronal calcium homeostasis and cell survival [39]. ACE converts angiotensin I into angiotensin II, a potent vasopressor and aldosterone-stimulating peptide that controls blood pressure and vascular homeostasis. ACE plays an important role in the development of cerebrovascular and cardiovascular diseases [40]. PTGS2 is the key enzyme in prostaglandin biosynthesis and plays an important role in modulating motility, proliferation, and resistance to apoptosis. Upregulation of PTGS2 is associated with increased cell adhesion, resistance to apoptosis, and tumor angiogenesis [41].

Technical Route and Results Diagram
The aim of this study was to find a reasonable and effective method to explore herbs that could be compatible with TM for preventing and treating stroke.
This study provides a method and technical support for basic research and the clinical application of rational compatibility of TCM for disease prevention and treatment (Figure 10).

Conclusion
In this study, data mining, an expert investigation, network pharmacology analysis, statistics, and other methods were used to analyze TCM prescriptions in a step-by-step manner. From single variable to multivariate, macroscopic to microscopic, and local to integrated, we provide a comprehensive analysis of TCM prescriptions. We have uncovered hidden knowledge, sought to meet clinical needs, and provided some avenues for the rational integration of TCM, research and development of new herbs or formulae, and clinical prescription.

Limitations
This study has several potential limitations. First, although we sought to obtain as many herbal pairs compatible with TM as possible, it was difficult to uncover deeper tacit knowledge of the compatibility of Chinese medicinal herbs solely by association rules. Second, there are some limitations of the Delphi expert questionnaire: on the one hand, more attention should be paid to the rational design of questionnaires, the principles of expert selection, discussion links with the main issues, statistical methods, and other aspects; on the other hand, modern Chinese medicine lacks a unified basis for judging drug pairs. The screening of herbal pairs containing TM for stroke prevention and treatment depends on clinicians' clinical experiences and subjective knowledge, which makes it difficult for expert consensus to be reached regarding the effectiveness of some infrequently used herbal pairs. Third, the study lacks corresponding follow-up work, such as animal and clinical research. Fourth, the experimental samples and scale were small, and the experiment lacked repeated measurements given the limited resources and time. Although subjective errors were avoided as much as possible, some human factors may still exist.

Data Availability
The data used to support the findings of this study are available from the first author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.