Based on Network Pharmacology to Explore the Molecular Targets and Mechanisms of Gegen Qinlian Decoction for the Treatment of Ulcerative Colitis

Background. Gegen Qinlian (GGQL) decoction is a common Chinese herbal compound for the treatment of ulcerative colitis (UC). In this study, we aimed to identify its molecular targets and the mechanisms involved in UC treatment by network pharmacology and molecular docking. Materials and Methods. The active ingredients of Puerariae, Scutellariae, Coptis, and Glycyrrhiza were screened using the TCMSP platform with drug-like properties (DL) ≥ 0.18 and oral bioavailability (OB) ≥ 30%. To find the intersection genes and construct the TCM compound-disease regulatory network, the molecular targets were determined in the UniProt database and then compared with the UC differential genes with P value < 0.005 and |log2(fold change)| > 1 obtained from the GEO database. The intersection genes were subjected to protein-protein interaction (PPI) network construction and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. After screening the key active ingredients and target genes, molecular docking was performed with the AutoDock software, and the best binding targets were selected to verify the binding activity. Results. A total of 146 active compounds were screened; quercetin, kaempferol, wogonin, and stigmasterol were identified as the active ingredients with the most associated targets, and NOS2, PPARG, and MMP1 were the targets associated with the largest number of active ingredients. Through topological analysis, 32 strongly associated proteins were found, of which EGFR, PPARG, ESR1, HSP90AA1, MYC, HSPA5, AR, AKT1, and RELA were predicted targets of the traditional Chinese medicine, and PPARG was also an intersection gene. It was speculated that these targets were the key to the use of GGQL in UC treatment. GO enrichment results showed significant enrichment of biological processes, such as response to oxygen levels, leukocyte migration, collagen metabolic process, and response to nutrients. KEGG enrichment showed that genes were particularly enriched in the IL-17 signaling pathway, AGE-RAGE signaling pathway, toll-like receptor signaling pathway, tumor necrosis factor signaling pathway, transcriptional misregulation in cancer, and other pathways. Molecular docking results showed that key components in GGQL had good potential to bind to the target genes MMP3, IL1B, NOS2, HMOX1, PPARG, and PLAU. Conclusion. GGQL may play a role in the treatment of ulcerative colitis through anti-inflammatory and antioxidant effects and inhibition of cancer gene transcription.


Background
Ulcerative colitis (UC) is a chronic nonspecific enteritis with typical clinical manifestations of abdominal pain, diarrhea, and mucopurulent bloody stools; it mostly manifests as superficial ulcers in the rectum and sigmoid colon but can also extend to the proximal colon and even the entire colon, causing permanent fibrosis and tissue damage. Despite improvements in treatment, the incidence of long-term colectomy has not decreased over the last 10 years, and achieving mucosal healing early may be strongly associated with a reduced risk of future colectomy [1].
Gegen Qinlian (GGQL) decoction originates from the Treatise on Febrile and Miscellaneous Diseases. Its monarch drug, Pueraria, has antipyretic and antidiarrheal properties; the subject drugs Scutellaria and Coptis eliminate dampness and heat, and Glycyrrhiza replenishes qi.
A meta-analysis including 2028 patients found that GGQL could improve clinical symptoms and reduce the endoscopic severity index (UCEIS) and recurrence rate; it also led to fewer adverse events, whether used alone or in combination with Western medicine [2]. According to spectrum-effect relationship studies, its main components (puerarin, daidzin, and coptisine) act synergistically, but the specific mechanism of action is not clear.
Network pharmacology combines pharmacology, bioinformatics, and several other sciences with system network analysis. By constructing a "disease-phenotype-gene-drug" network, it explains the multicomponent and multitarget mechanism of drug treatment from the perspective of gene distribution, molecular function, and signaling pathways, which makes it suitable for the study of traditional Chinese medicine (TCM) compounds.

Materials and Methods
2.1. Screening of Active Ingredients and Target Genes. The active ingredients of the GGQL herbs, including Puerariae, Scutellariae, Coptis, and Glycyrrhiza, were screened using the TCMSP platform (http://lsp.nwu.edu.cn/tcmsp.php) with drug-like properties (DL) ≥ 0.18 and oral bioavailability (OB) ≥ 30% as conditions. The predicted targets of the screened compounds were acquired from the DrugBank (https://www.drugbank.ca/) database and verified literature. Meanwhile, the UniProt database (https://www.uniprot.org/) was used for comparison of target information and gene name standardization.
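The two screening conditions above amount to a simple threshold filter. The following is a minimal sketch, assuming compound records have already been exported from TCMSP into memory; the `Compound` class, the function name, and all numeric values are hypothetical and serve only to illustrate the rule OB ≥ 30% and DL ≥ 0.18.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    """One candidate ingredient (hypothetical record layout)."""
    name: str
    ob: float  # oral bioavailability, in percent
    dl: float  # drug-like property score


def screen_compounds(compounds, min_ob=30.0, min_dl=0.18):
    """Keep compounds that satisfy both thresholds (OB >= 30% and DL >= 0.18)."""
    return [c for c in compounds if c.ob >= min_ob and c.dl >= min_dl]


# Hypothetical values for illustration only; they are not taken from TCMSP.
candidates = [
    Compound("compound_A", ob=46.4, dl=0.28),  # passes both thresholds
    Compound("compound_B", ob=24.0, dl=0.35),  # fails OB
    Compound("compound_C", ob=41.9, dl=0.10),  # fails DL
    Compound("compound_D", ob=30.0, dl=0.18),  # exactly on both thresholds
]
selected = screen_compounds(candidates)
print([c.name for c in selected])
```

Both comparisons are inclusive, so a compound lying exactly on a threshold is retained, in line with the "≥" conditions stated in the text.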
2.2. Acquisition of Differential Genes. The gene expression samples (series GSE38713) of UC patients and healthy people were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). A script was run in Strawberry Perl 5.30.2.1 (Perl) to annotate the gene probe names as gene symbols and to group the samples. The "limma" package was then used to correct the sample values and apply the log2 (logFC) transformation. Genes with P value < 0.005 and |log2(fold change)| > 1 were considered statistically significant differential genes. A gene volcano map of the samples was generated, and the top 20 genes with the most significant up- and downregulation were selected to draw the heat map.
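The significance criteria in this subsection can be illustrated with a short sketch. It only applies the two cutoffs (P value < 0.005 and |log2(fold change)| > 1) to values that are assumed to have been computed already; it does not reproduce the statistical model of the "limma" package. The function names, gene labels, and numbers are hypothetical.

```python
import math


def log2_fold_change(mean_disease, mean_control):
    """log2 ratio of mean expression (disease over control), on the linear scale."""
    return math.log2(mean_disease / mean_control)


def classify_gene(log_fc, p_value, p_cutoff=0.005, lfc_cutoff=1.0):
    """Label a gene 'up', 'down', or 'ns' (not significant) by the two cutoffs."""
    if p_value < p_cutoff and abs(log_fc) > lfc_cutoff:
        return "up" if log_fc > 0 else "down"
    return "ns"


# Hypothetical (logFC, P value) pairs for illustration only.
results = {
    "GENE_1": (2.3, 0.0001),   # large increase, significant
    "GENE_2": (-1.6, 0.001),   # large decrease, significant
    "GENE_3": (0.4, 0.0001),   # significant P value but small change
    "GENE_4": (3.0, 0.02),     # large change but P value above the cutoff
}
labels = {gene: classify_gene(lfc, p) for gene, (lfc, p) in results.items()}
print(labels)
```

A gene must pass both cutoffs to be retained, which is why GENE_3 and GENE_4 in this toy example are labeled "ns".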

2.3. Traditional Chinese Medicine Compound-Disease Regulatory Network. The Perl software was used to acquire intersection genes of the disease differential genes and the target genes of TCM, as well as the active ingredients of TCM. Subsequently, the TCM compound-disease regulatory network was generated using the Cytoscape software.
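The intersection step described above can be sketched as a set operation followed by an edge list that a network tool could import. This is an illustrative sketch rather than the Perl script used in the study; the function name, compound labels, and gene labels are hypothetical.

```python
from collections import Counter


def build_regulatory_edges(compound_targets, disease_genes):
    """Return the intersection genes and the compound-gene edges restricted to them."""
    predicted = set().union(*compound_targets.values())
    shared = predicted & set(disease_genes)
    edges = sorted(
        (compound, gene)
        for compound, targets in compound_targets.items()
        for gene in targets
        if gene in shared
    )
    return shared, edges


# Hypothetical compound-to-target mapping and disease gene set, for illustration only.
compound_targets = {
    "compound_A": {"GENE_1", "GENE_2", "GENE_3", "GENE_9"},
    "compound_B": {"GENE_1", "GENE_2"},
    "compound_C": {"GENE_1", "GENE_8"},
}
disease_genes = {"GENE_1", "GENE_2", "GENE_3", "GENE_4"}

shared, edges = build_regulatory_edges(compound_targets, disease_genes)
genes_by_compound_count = Counter(gene for _, gene in edges)
print(sorted(shared))
print(genes_by_compound_count.most_common())
```

Counting edges per gene gives the number of active ingredients associated with each intersection gene, and counting edges per compound gives the number of targets per ingredient; both counts are the kind of summary reported for the regulatory network.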
2.4. Protein-Protein Interaction (PPI) Network and Topological Analysis. The BisoGenet and CytoNCA plug-ins were installed in the Cytoscape 3.8.0 software, the intersection genes were entered, and the species "Homo sapiens" was selected. Data for constructing the PPI network were sourced from six main experimental research databases: Database of Interacting Proteins, Biological General Repository for Interaction Datasets, Human Protein Reference Database, IntAct molecular interaction database, Molecular INTeraction Database, and Biomolecular Interaction Network Database. The method "input nodes and its neighbors" was selected to obtain the PPI network and perform topological analysis based on its network centrality.

2.5. GO and KEGG Enrichment Analysis. The R packages "colorspace," "stringi," and "ggplot2" were installed in R 4.0.0, and the Bioconductor packages "DOSE," "clusterProfiler," and "enrichplot" were used for GO and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis. The function "enrichGO" was used for GO enrichment analysis with the database org.Hs.eg.db (DOI: 10.18129/B9.bioc.org.Hs.eg.db); the "enrichKEGG" function was applied for KEGG enrichment analysis with the KEGG database (https://www.kegg.jp/). For both functions, the species was set to "hsa," and the filter values (i.e., P value and q-value) were set to 0.05. The first 20 enrichment results were visualized as a bar graph, and the KEGG regulatory network was generated with the Cytoscape 3.8.0 software.
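Over-representation analysis of the kind described above is commonly based on a hypergeometric test followed by a multiple-testing correction. The sketch below shows that idea in plain Python; it is not the implementation used by the R packages named in the text, the function names and numbers are hypothetical, and the Benjamini-Hochberg adjustment is shown as one common correction that may differ from the q-value procedure of the enrichment software.

```python
from math import comb


def enrichment_p_value(background_size, term_size, query_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background_size: number of annotated genes in the background
    term_size:       number of background genes annotated to the term
    query_size:      number of genes in the query list
    overlap:         number of query genes annotated to the term
    """
    upper = min(term_size, query_size)
    total = comb(background_size, query_size)
    favorable = sum(
        comb(term_size, i) * comb(background_size - term_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return favorable / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy numbers for illustration only: 10 background genes, 3 in the term,
# 4 query genes, 2 of which fall in the term.
p = enrichment_p_value(10, 3, 4, 2)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
print(p, adjusted)
```

A term would be kept when both its raw and adjusted values fall below the chosen cutoff (0.05 in the text).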
2.6. Molecular Docking. The target genes involved in the first 10 pathways of the KEGG enrichment results were searched in the PDB database (https://www.rcsb.org), and the 3D protein conformations with a crystal resolution below 3 Å, as determined by X-ray crystal diffraction, were acquired. The Mol2 format files of the GGQL key active ingredients were downloaded from the TCMSP platform. The AutoDockTools 1.5.6 software was applied to process the proteins as follows: separate proteins, add nonpolar hydrogen, calculate the Gasteiger charge, assign the AD4 type, and set all the flexible bonds of small molecule ligands to be rotatable. According to the original ligand coordinates, the docking box was adjusted to include all protein structures. Meanwhile, the receptor protein was set to rigid docking, the genetic algorithm was selected, and the maximum number of evals was set to medium. The docking results were obtained by running autogrid4 and autodock4, by which the binding energies were revealed. The partial diagram of molecular docking was then generated using the PyMol software.
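A docking box that encloses the whole protein, as described above, can be derived from the atomic coordinates. The sketch below is a generic geometric illustration, not the procedure performed in AutoDockTools; the function name, the padding value, and the coordinates are hypothetical, and the grid spacing is treated as an adjustable assumption.

```python
import math


def enclosing_grid_box(coords, spacing=0.375, padding=4.0):
    """Return (center, grid_intervals) of an axis-aligned box enclosing all atoms.

    coords:  iterable of (x, y, z) atom positions in angstroms
    spacing: distance between grid points in angstroms (assumed value)
    padding: extra margin added on every side in angstroms (assumed value)
    """
    xs, ys, zs = zip(*coords)
    lows = (min(xs), min(ys), min(zs))
    highs = (max(xs), max(ys), max(zs))
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(lows, highs))
    intervals = tuple(
        math.ceil((hi - lo + 2.0 * padding) / spacing)
        for lo, hi in zip(lows, highs)
    )
    return center, intervals


# Hypothetical atom positions for illustration only.
atoms = [(0.0, 0.0, 0.0), (10.0, 20.0, 30.0), (5.0, 5.0, 5.0)]
center, intervals = enclosing_grid_box(atoms, spacing=1.0, padding=0.0)
print(center, intervals)
```

The box center is the midpoint of the coordinate extremes on each axis, and the number of grid intervals is the padded extent divided by the spacing, rounded up so that no atom falls outside the box.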

BioMed Research International
3.2. Differential Gene Screening. By comparing 15 normal samples with 30 disease samples in the GEO database, a total of 21,653 differential genes were acquired, including 9198 upregulated genes and 12,455 downregulated genes. After screening with P value < 0.005 and |log2(fold change)| > 1, a total of 305 upregulated genes and 263 downregulated genes were obtained. As indicated by the gene volcano map (Figure 1), the differential genes in the disease samples display a normal distribution, with a larger number of significantly upregulated genes than significantly downregulated genes. The top 20 genes with the most significant upregulation and downregulation are presented in Figure 2 and Table 2.
3.3. Construction of the TCM Compound-Disease Regulatory Network. As listed in Table 3, there are 23 intersection genes (sorted by logFC). The targeting relationship between TCM active ingredients and intersection genes is presented by the TCM compound-disease regulatory network (Figure 3). The active ingredients of Glycyrrhiza and Scutellariae have the largest number of related target genes, suggesting that Glycyrrhiza and Scutellariae are the most efficacious components of GGQL. The active ingredients quercetin, kaempferol, wogonin, and stigmasterol are associated with 18, 5, 3, and 3 target genes, respectively; they are therefore classified as multitarget and multieffect compounds. NOS2 is the gene associated with the highest number of active components, followed by PPARG and MMP1.

3.4. PPI Network and Topological Analysis. In the PPI network, the degree centrality (DC) of a node is simply the number of edges it has. The higher the degree, the more central the node is. The betweenness centrality (BC) captures how much a given node is in between others. Specifically, it is the ratio of the number of the shortest paths passing through the node to the total number of the shortest paths in the network. DC and BC reflect the influence of the corresponding node in the entire network. They describe the topological centrality based on the connectivity and controllability of the network. The combination of DC and BC values has been confirmed to be effective for screening reliable important proteins [3]. As shown in Figure 4, 830 protein nodes and 9689 edges were obtained for intersection genes. After screening with DC > 61 and a BC range of 0-113.2, the first  .
BP mainly involves aspects of response to oxygen levels, leukocyte migration, collagen metabolic process, and response to nutrients. CC is mainly related to the extracellular matrix, collagen-containing extracellular matrix, fibrillar collagen trimer, and banded collagen fibril. MF is mainly involved in serine-type endopeptidase activity, serine-type peptidase activity, serine hydrolase activity, CXCR chemokine receptor binding, platelet-derived growth factor binding, and cytokine activity (Figure 5). According to KEGG enrichment results, the mechanism of GGQL in treating UC is mainly concentrated in the IL-17 signaling pathway, relaxin signaling pathway, AGE-RAGE signaling pathway in diabetic complications, toll-like receptor signaling pathway, TNF signaling pathway, fluid shear stress and atherosclerosis, transcriptional misregulation in cancer, proteoglycans in cancer, rheumatoid arthritis, and prostate cancer (Figure 6). Genes associated with the greatest number of pathways were IL1B, MMP9, and MMP3 (Figure 7 and Table 5).
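The degree and betweenness centralities defined in the PPI topological analysis above can be computed on a small graph as follows. This is a minimal sketch using Brandes' algorithm for an unweighted, undirected graph and returning unnormalized betweenness; the Cytoscape plug-in may normalize or scale the values differently. The graph and node labels are hypothetical.

```python
from collections import deque


def degree_centrality(adj):
    """Degree of each node: the number of edges attached to it."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def betweenness_centrality(adj):
    """Unnormalized betweenness (Brandes' algorithm, unweighted undirected graph)."""
    scores = {node: 0.0 for node in adj}
    for source in adj:
        stack = []
        predecessors = {node: [] for node in adj}
        path_counts = {node: 0 for node in adj}
        distance = {node: -1 for node in adj}
        path_counts[source] = 1
        distance[source] = 0
        queue = deque([source])
        # Breadth-first search counting shortest paths from the source.
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    path_counts[w] += path_counts[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies in order of decreasing distance.
        dependency = {node: 0.0 for node in adj}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += path_counts[v] / path_counts[w] * (1.0 + dependency[w])
            if w != source:
                scores[w] += dependency[w]
    # Each undirected pair was visited from both endpoints, so halve the totals.
    return {node: score / 2.0 for node, score in scores.items()}


# Hypothetical toy network for illustration only (adjacency sets, undirected).
network = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B"},
    "D": {"B", "E"},
    "E": {"D"},
}
dc = degree_centrality(network)
bc = betweenness_centrality(network)
print(dc)
print(bc)
```

Nodes can then be screened by keeping those whose DC and BC exceed chosen thresholds; the thresholds reported in the text apply to the full PPI network and would not be meaningful for this toy graph.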
Figure 2: Gene heat map. In the gene heat map, red and green represent upregulated (logFC > 0) and downregulated (logFC < 0) genes in the sample, respectively, whereas black represents no significant difference. The first 13 samples were from healthy people, and the last 30 samples were from patients with ulcerative colitis.

3.6. Molecular Docking. Molecular docking is a technique that mimics the interaction between small ligand molecules and receptor protein macromolecules; the binding energy between the two can be calculated to predict their affinity. A binding energy lower than 0 indicates that the two molecules bind spontaneously, and lower binding energies correspond to more stable conformations. Most ingredients in GGQL can bind well with target genes, among which stigmasterol, coptisine, and berberine have the best binding properties (Table 6). The genes MMP3, IL1B, NOS2, HMOX1, PPARG, PLAU, and MMP1 can dock well with most active ingredients. Figure 8 illustrates some local structures of molecular docking in detail.
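The link between binding energy and affinity mentioned in this subsection follows the standard thermodynamic relation ΔG = RT ln K, where K is the dissociation (or inhibition) constant. The sketch below converts a binding energy into an estimated constant under that relation; the function name and the example energies are hypothetical, and docking energies are themselves estimates, so the result should be read as a rough indication only.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_inhibition_constant(binding_energy_kcal, temperature_k=298.15):
    """Estimated constant K (mol/L) from the relation dG = R * T * ln(K).

    binding_energy_kcal: binding free energy in kcal/mol (negative means favorable)
    temperature_k:       absolute temperature in kelvin (assumed 298.15 K)
    """
    return math.exp(binding_energy_kcal / (GAS_CONSTANT_KCAL * temperature_k))


# Hypothetical binding energies in kcal/mol, for illustration only.
for energy in (-5.0, -7.0, -9.0):
    k = estimated_inhibition_constant(energy)
    print(f"{energy:+.1f} kcal/mol -> K = {k:.3e} mol/L")
```

Because K depends exponentially on the energy, each additional kcal/mol of favorable binding energy lowers the estimated constant several-fold, which is why lower binding energies are interpreted as tighter and more stable binding.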

Discussion
In this study, a network pharmacological analysis was conducted on the medicinal ingredients of the four TCMs (Puerariae, Scutellariae, Coptis, and Glycyrrhiza) in GGQL and UC disease targets. Quercetin, kaempferol, and wogonin were identified as the active ingredients associated with the most targets. The results of molecular docking also verified that they have good binding properties with most target genes. Quercetin is a common flavonoid compound in nature. It is considered the most effective reactive oxygen species (ROS) scavenger and inhibits the production of several proinflammatory factors, such as TNF-α and NO. The antioxidant and anti-inflammatory advantages of quercetin in UC treatment have been confirmed by various in vivo and in vitro experiments [4]. Quercetin plays an anticancer role by reducing the activity of kinase MEK1, downregulating the cascade reaction of Raf and MAPK, and inhibiting telomerase [5]. Kaempferol is also a natural flavonoid, and its efficacy as an anti-inflammatory, antioxidant, and anticancer agent has been reported in the treatment of a variety of diseases, such as diabetes, obesity, and cancer (e.g., skin, liver, and colon cancers) [6]. Wogonin is the compound with the highest content in Scutellariae; it is quickly converted into metabolites, such as baicalin, after entering the bloodstream. Baicalin has been confirmed to significantly inhibit TLR4-induced   [7,8]. The molecular docking results indicate that stigmasterol, coptisine, and berberine have superior affinities to the target genes MMP3, IL1B, NOS2, HMOX1, PPARG, and PLAU, and they are the effective ingredients with potent anti-inflammatory and antioxidant effects. Stigmasterol has been shown to inhibit the innate immune response induced by lipopolysaccharide in a mouse model [9]. Berberine exerts local anti-inflammatory effects by blocking the IL-6/STAT3/NF-κB signaling pathway. 
Meanwhile, it effectively enhances the expression of SIgA and lowers the expression of iNOS, MPO, and MDA [10]. A PPI topological analysis was performed for 23 intersection genes, revealing 32 strongly associated proteins, among which 9 proteins (EGFR, PPARG, ESR1, HSP90AA1, MYC, HSPA5, AR, AKT1, and RELA) are the predicted targets of TCMs. PPARG is also an intersection gene, and these 9 targets were speculated to be the key targets of GGQL in the treatment of UC.

Figure 3: TCM compound-disease regulatory network. This network shows the targeted relationship between the active components of TCM and the intersection genes. Green GC represents Glycyrrhiza, yellow GG represents Pueraria, light blue HQ represents Scutellaria, purple HL represents Coptis, red M represents common components, and dark blue triangle represents intersection genes.

UC is an inflammatory disease related to intestinal immune recognition disorders. The KEGG enrichment results indicate four inflammation-related pathways: IL-17 signaling pathway, AGE-RAGE signaling pathway in diabetic complications, toll-like receptor signaling pathway, and TNF signaling pathway. Meanwhile, further analysis revealed inflammation-related genes, such as IL1B, CXCL10, CXCL11, MMP9, MMP3, SPP1, NFKB1, IKBKG, and RELA. The first six genes are widely involved in IL-17, toll-like receptors, and TNF signaling pathways, while the latter three are participants in the NF-κB pathway, particularly the gene RELA, which encodes NF-κB p65. It is suggested by the results that both the treatment mechanism of GGQL and the pathogenesis of UC are related to inflammatory regulation. A meta-analysis showed that in allelic and dominant models, the genetic polymorphisms of IL-17A and IL-17F may increase the risk of UC occurrence. In addition, IL-17 levels in serum are significantly associated with the severity of UC [11]. Toll-like receptors (TLRs) are a group of transmembrane proteins widely distributed in immune cells, playing a key role in identifying invading pathogens and upregulating signals related to inflammatory cytokines and costimulatory molecules [12]. Tumor necrosis factor (TNF) not only is a potent proinflammatory mediator but also upregulates the production of ROS and RNS and exacerbates cell damage [13]. Anti-TNF therapy has been proven to quickly induce clinical and endoscopic remission in UC patients; however, its safety and related risks still require urgent attention [14]. NF-κB induces cytokine expression and neutrophil aggregation. It is often regarded as a sign and the central pathway of inflammatory response, while being involved in cancer development through various pathways [15]. 
In the classical NF-κB signaling pathway, TNF-α and IL-1 activate toll-like receptors (TLRs), followed by the activation of the IκB kinase complex, which can phosphorylate IκBα [15]. Advanced glycation end products (AGE) and IL-17 (highly expressed during UC activity) can also act as mediators of NF-κB pathway activation [16,17]. This study suggests that the mechanism of action of GGQL is related to the four inflammatory pathways, and the regulation of these pathways is linked to the transcription of NF-κB, indicating that GGQL may exert its therapeutic effect by inhibiting the inflammatory response mediated by the NF-κB signaling pathway. Other studies have also confirmed that baicalin and berberine exert anti-inflammatory effects by downregulating the expression of proinflammatory cytokines (i.e., IL-1β, TNF-α, ICAM-1, TLR2, and TLR4) and inhibiting the NF-κBp65 signal transduction pathway [18,19].
The excessive ROS in the intestinal mucosa can trigger inflammatory responses by inducing redox-sensitive signaling pathways and transcription factors (e.g., NF-κB, TNF-α, and AP-1). The inflammatory responses then lead to the generation of more ROS, forming a vicious circle between oxidative stress and inflammation [20]. The KEGG pathway map shows three pathways related to oxidative stress: (1) the relaxin signaling pathway; (2) the fluid shear stress and atherosclerosis pathway, which activates downstream second messengers of PI3K-AKT and thus induces eNOS expression; and (3) the AGE-RAGE signaling pathway in diabetic complications, in which the receptor RAGE activates NADPH oxidase 2, leading to excessive ROS production. PPARG, NOS2, and DUOX2 are the three key target genes closely related to oxidative stress. NOS2 and PPARG are the targets associated with the most active ingredients, and the molecular docking results support these associations. PPARG is an important gene for fatty acid metabolism and oxidation and for insulin sensitization. It also inhibits the activation of NF-κB to exert anti-inflammatory effects. PPAR-γ has been proven to be a key receptor through which 5-ASA exerts anti-inflammatory and antioxidant effects [21]. Relevant studies have found that the expression of PPAR-γ messenger RNA in the colonic mucosa of UC patients is impaired, accompanied by enhanced expression of the corresponding inflammatory factors (e.g., NF-κB). Partial PPAR-γ agonists may be a new option for UC treatment [22]. According to the differential gene results, PPARG was downregulated. In GGQL, 67 compounds have a regulatory effect on PPARG. NOS2 encodes inducible nitric oxide synthase (iNOS). The highly expressed iNOS in the inflamed mucosa plays a key role in oxidative stress-induced inflammation [23]. NOS activity in colon biopsies has been shown to be correlated with disease intensity [24]. DUOX2 participates in the regulation of hydrogen peroxide anabolism and mediates peroxidase activity on the mucosal surface. DUOXA2 is a partner of DUOX2. Among the differential genes, DUOXA2, DUOX2, and NOS2 were significantly upregulated. This implies that excessive production of ROS and RNS causes oxidative damage to cellular components, including lipids, DNA, and proteins, thus damaging the mucosal barrier and leading to a sustained inflammatory response [20]. In GGQL, 86 compounds, including quercetin and kaempferol, act on NOS2 and DUOX2 according to the TCM compound-disease regulatory network.

Figure 7: KEGG relational regulatory network. This network shows the relationship between the enriched 14 pathways and 18 genes, and the size of each node shows the number of pathways or genes connected.

As suggested by the KEGG enrichment results, GGQL may help prevent the occurrence of cancer through the following pathways: transcriptional misregulation in cancer and proteoglycans in cancer. A persistent inflammatory response and oxidative stress can promote the mutation of cancer genes. The abnormal expression of NOS2, DUOX2/DUOXA2, ESR1, EGFR, MYC, and AKT1, which are discussed in this study, has been confirmed to be related to cancer [25-28]. The expression of the protooncogene MYC is significantly upregulated in up to 70% of colorectal cancer (CRC) patients. The overexpression of its gene product, c-Myc, leads to the activation of downstream genes, DNA synthesis, cell proliferation, and chromosomal aberrations. These mechanisms ultimately lead to genomic instability and chemoresistance [29]. Encoded by the EGFR gene, EGFR is a transmembrane glycoprotein belonging to the ErbB family; it is reportedly overexpressed in 85% of non-small cell lung cancer (NSCLC) cells and is associated with a poor prognosis. By regulating downstream signaling pathways, mainly PI3K/Akt and MAPK pathways, the activated EGFR leads to receptor dimerization and tyrosine autophosphorylation, which could result in aberrant proliferation in certain cells, such as NSCLC cells. MYC is a protooncogene that encodes transcription factors involved in basic cellular pathophysiological processes. Activation of MYC causes abnormal cell proliferation, regression, and redifferentiation of cancer cells and susceptibility to aurora kinase inhibition in small cell lung cancer (SCLC) cells [30]. Proteoglycans are the main components of the extracellular matrix. They participate in matrix remodeling during tumor cell growth and in the formation of stromal vessels, thus affecting the response of tumor cells and tissues and regulating the cancer phenotype by influencing the signals within cancer cells [31].
Many other studies are consistent with our findings. Xu et al. found that GGQL decoction increased SOD activity and decreased MDA and iNOS activities, along with TNF-α and IL-1β levels, in UC rats [18]. Li et al. revealed that GGQL could reduce TLR4 expression and NF-κB activation, along with the levels of several inflammatory cytokines, such as TNF-α, IL-6, IL-1β, and IL-4, and NO [32].