Interpreting the Molecular Mechanisms of Yinchenhao Decoction on Hepatocellular Carcinoma through Absorbed Components Based on Network Pharmacology

To investigate the mechanisms through which Yinchenhao decoction (YCHD) inhibits hepatocellular carcinoma (HCC), we analyzed YCHD ingredients absorbed into the bloodstream by using network pharmacology. We conducted a weighted gene coexpression network analysis on gene expression data collected from the Gene Expression Omnibus and The Cancer Genome Atlas databases to derive an HCC gene set; moreover, we used four online prediction system databases to predict the potential targets of YCHD ingredients absorbed into the bloodstream. We discovered that YCHD directly interfered with 17 HCC-related disease targets. Subsequent gene ontology enrichment analyses of these 17 disease targets revealed that YCHD exhibited effects through 17 biological processes, 7 molecular functions, and 9 cellular components. Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses indicated 14 pathways through which YCHD inhibits HCC. We observed similar trends in how the 17 small molecules interfered with the key target set. We surmised that YCHD inhibits HCC by regulating inflammatory and metabolic pathways. Network pharmacological analysis of YCHD ingredients absorbed into the bloodstream may provide new insights and serve as a new method for discovering the molecular mechanisms through which YCHD inhibits HCC.


Introduction
Hepatocellular carcinoma (HCC) is the fourth most common cancer and the third most deadly cancer in China, accounting for 85%-90% of primary malignant liver tumors [1]. Although the multikinase inhibitor sorafenib has been approved as a first-line treatment for late-stage HCC because of its favorable antiangiogenesis effects, it increases patients' survival time by only a few months [2]. Traditional Chinese medicine (TCM), which has a long history of clinical application, exhibits multimolecule, multitarget, and synergistic effects. TCM has subsequently become a valuable alternative for HCC treatment. Therefore, the effectiveness of TCM in inhibiting HCC merits further investigation [3].
Introduced in Shanghan Lun (Treatise on Cold Damage Diseases), Yinchenhao decoction (YCHD)-which consists of yinchen (Artemisiae Scopariae Herba), dahuang (Radix et Rhizoma Rhei), and zhizi (Gardeniae Fructus)-is a popular remedy in TCM and is commonly used to treat yang jaundice. A study identified 160 potential targets related to 16 diseases, including cancer; of these targets, 96 were related to cancer [4]. Concerning oral TCM compounds (i.e., medicine comprising two or more ingredients), the ingredients must undergo absorption, distribution, metabolism, and excretion before they enter the bloodstream; only molecules that complete this process are considered active. These molecules merge with targets in the organism and then exhibit medicinal effects. In a preliminary study, we obtained active chemical molecules by using oral bioavailability (OB) screening [5], drug-likeness (DL) assessments [6], and intestinal epithelial permeability (Caco-2) screening. In that study, we were able to predict which molecules were beneficial; however, their effects were minimal if the molecule content was low or if the molecules were metabolized quickly. Sun et al.
Accordingly, the present study analyzed data obtained from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases to identify the differentially expressed genes (DEGs) of HCC. On the basis of the YCHD ingredients absorbed into the bloodstream, network pharmacology was then used to identify active HCC-inhibitive pharmaceutical ingredients and their possible molecular targets. The workflow of our bioinformatics and network pharmacology analyses of the effects of YCHD on HCC is presented in Figure 1. The results may serve as a basis for subsequent experimental research.

Data Collection.
The GEO is a public gene expression profile database of the National Center for Biotechnology Information, National Institutes of Health (USA). By mining public databases, medical researchers can obtain a clearer understanding of the molecular mechanisms underlying the onset of cancer. Such an understanding can enhance early detection, diagnosis, and treatment of various cancers [15]. In the current study, we used "liver cancer" and "hepatocellular carcinoma" as search terms in the high-throughput GEO database and collected disease-related gene expression profile chips. After analyzing and comparing the different chips, we selected the GSE121248 chip for analysis. This chip originated from the GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array platform, which contains 70 liver cancer samples and 37 normal samples. TCGA is a joint project initiated by the American National Cancer Institute and the National Human Genome Research Institute in 2006. TCGA is the largest cancer gene database worldwide and contains a wealth of clinical information [16]. We searched TCGA for (and subsequently downloaded) liver cancer-related RNA-seq gene expression data and the corresponding clinical data files, which contained 373 tumor samples and 49 normal samples.
2.2. DEG-Based Disease Gene Analysis. We used the limma package (version: 3.42.2) in R language to analyze DEGs identified in the GSE121248 and TCGA data. Subsequently, we filtered out upregulated and downregulated DEGs from the GEO and TCGA chip data by using the following conditions:jlog 2 FCj > 1:0 and adj:p:value < 0:05. Finally, we employed the ggplot2 package (version: 3.3.1) and pheatmap package (version: 1.0.12) to draw volcano maps and heat maps corresponding to the DEGs, and we used Venny 2.1 (https://bioinfogp.cnb.csic.es/tools/venny/index.html) to determine the intersection of TCGA and GEO data in order to derive the differential gene analysis-based gene set G D .

Weighted Gene Coexpression Network Disease Gene
Analysis. We performed a weighted gene coexpression network analysis (WGCNA) to analyze the gene expression data from the GEO and TCGA. From this analysis, we identified modules and hub genes related to clinical phenotypes (i.e., normal and disease-based).
A gene coexpression network was constructed using a weighted gene expression correlation network analysis. The soft-threshold power was extracted from PowerEstimated and the topological overlap matrix (TOM). Subsequently, on the basis of the weighted gene expression correlation network, the TOM was used to cluster all the genes, and dynamic tree cutting was employed to identify various gene modules. To determine the correlation between the different modules of the clustered genes, a minimum gene module size of 50 and a dynamic shear height value of 0.25 were set. The different gene modules are represented by various colors.    BioMed Research International Finally, the correlation between different gene modules and clinical phenotypes was calculated, and a heat map of the correlation between the modules and clinical phenotypic traits was drawn. According to the correlation between the gene module and the normal and control groups, we determined modules with the highest positive and negative correlation values in the GEO and TCGA data, respectively. Subsequently, gene modules with the highest positive and negative correlation values in the GEO and TCGA data were intersected, and two intersecting genes were combined to constitute the HCC gene set G W through the WGCNA.

HCC Protein-Protein Interaction Network Construction
and Analysis. We combined the HCC gene set G D obtained from the DEG-based analyses with the HCC gene set G W obtained from the WGCNA to form the liver cancer disease gene set G HCC ; these gene sets were imported into the STRING database (version: 11.0; https://string-db.org/).
Regarding the parameter settings, we set the parameter "organism" to "homo sapiens" and set the "combine score" threshold to 0.9 to produce an interaction network of all gene targets. We then used Cytoscape (version: 3.7.2, https:// cytoscape.org/) to produce a protein-protein interaction (PPI) network for the target proteins. Furthermore, we used a Cytoscape-based CytoHubba plug-in to mine the core genes in the PPI network and selected the maximal clique centrality (MCC) algorithm for the network. We also used default values for the other parameters and calculated the core gene set G P in the liver cancer PPI network.  7 BioMed Research International molecular functions, and cellular components of genes. A group of genes, rather than a single gene, typically participates in a biological process or pathway. Enrichment analyses are performed on the premise that if a biological process or pathway is known to be abnormal, genes that function together are extremely likely to be selected into the gene set related to this process or pathway.
We used OmicShare (http://www.omicshare.com/tools) to annotate the GO functions for the HCC gene set G HCC and applied the clusterProfiler package (version: 3.14.3) to conduct Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses. Moreover, we selected genes using the conditions p:value < 0:05 and q:value < 0:05 and employed Cytoscape to construct an HCC target-pathway network according to the enrichment analysis results.

Discovery of Key Disease Genes through Survival Analysis.
We determined the key gene set G K through a survival analysis and included all genes that satisfied the following conditions in our subsequent analyses: (1) genes in the intersection between the gene sets G D and G W obtained from the DEG-based analyses and WGCNA, respectively, and (2) the core gene set G P obtained from the PPI network analyses described in Section 2.4.
We downloaded clinical data files from TCGA. After organizing the data, we used the survival package (version: 3.1) and survminer package (version: 0.4.7) to conduct survival analyses (where p < 0:05) and plot gene survival curves.

Collection of Effective YCHD Ingredients and Assessment
of Absorption, Distribution, Metabolism, Excretion, and Toxicity. We referenced data on YCHD ingredients collected in animal experiments performed in related studies [7]; we also collected chemical information or data on effective YCHD ingredients from the PubChem database (https:// pubchem.ncbi.nlm.nih.gov/) and organized them according to PubChem name and CID, molecular formula, and the simplified molecular-input line-entry system (SMILES) strings (hereafter referred to as canonical SMILES). Additionally, we used Advanced Chemistry Development, Inc. (ACD/Labs) software (version: 2019) and the SwissADME online prediction system (http://www.swissadme.ch/) to evaluate and analyze the absorption, distribution, metabolism, excretion, and toxicity (ADMET) of the ingredients.
We obtained the intersection between the predicted target gene set and the disease gene set G W , identified the potential targets of YCHD effective in treating HCC, and The 25 crucial targets in the PPI network obtained using the MCC algorithm. In the figure, the degree value of the node is presented as a gradient from red to green (large to small) according to the degree of the target in the entire PPI network. The darker the color is, the greater the degree value of the node is. 8 BioMed Research International obtained the intersection gene set G YH . Additionally, we conducted survival analyses on all genes in the intersection gene set G YH .

Calculation Model-Based Effective Ingredient Regulatory
Network Analysis. In practice, some TCM ingredients do not exhibit direct effects on disease-related targets. However, such ingredients occasionally affect or regulate key disease targets by interfering with neighboring targets. Therefore, we adopted a network calculation model to detect small molecules consistent with the aforementioned description.
We downloaded a high-quality, all-human PPI background network integrating 15 commonly used databases and selected a high-quality PPI incorporating five types of evidence. The PPI network contained a total of 16,677 proteins and 243,603 interactive relationships, and it was subsequently set as the background network of the current study to determine the intensity of YCHD ingredients' interference with and regulation of key genes and the key gene set G K .
2.9.1. Analysis of Key Gene Set Interference with the Target Sets of Different Ingredients. We performed the analyses by using three network topological distances hd AB i (i.e., hd S AB i, h d C AB i, and hd K AB i). These distances can be expressed as follows: where A is the target set of an ingredient, kAk is the number of targets in the target set, B is the key gene set, kBk is the number of targets in the key gene set, and dða, bÞ is the distance between two nodes in the PPI network.     BioMed Research International where hd S AA i is the average distance between the targets of an ingredient, hd S BB i is the average distance between key genes, and hd S AB i is the average distance between the target set of an ingredient and the key target set. S P AB < 0 signifies that the target set A of an ingredient and the key gene set B are close together in the network topological structure, indicating that the ingredient can regulate the key gene set B by interfering with the target set A. S P AB ≥ 0 signifies that the target set A of an ingredient and the key gene set B are far apart in the network topological structure, indicating that the ingredient cannot markedly regulate the key gene set B. Therefore, by calculating hd S AB i, hd C AB i, hd K AB i, and S P AB , one can determine whether YCHD contains ingredients that directly or indirectly affect key disease targets by interfering in certain target (set) proteins, thereby achieving the goal of alleviating or curing diseases.
In this study, we used R language (version 3.6.2) and the igraph package (version 1.2.5) to complete the aforementioned programming, calculations, and analyses.

DEG-Based Disease Gene Analysis Results.
We used the limma package to conduct a DEG analysis on liver cancer data obtained from GSE121248 and TCGA. For GSE121248, we obtained 558 DEGs, 167 and 413 of which were upregulated and downregulated genes, respectively. For TCGA, we obtained 2713 DEGs, 1024 and 1689 of which were upregulated and downregulated genes, respectively. Figures 2(a)-2(d) illustrate the volcano and heat maps of the analysis results. By obtaining the intersection of TCGA and GSE121248 data, we identified that the two sets of data shared 493 genes (G D set) (Figure 2(e)).

WGCNA Results for Disease Genes.
We performed a WGCNA on the gene expression profile data collected from GSE121248 and identified that the modules most positively and negatively correlated with the normal and disease phenotypes were MEbrown (0.82) and MEturquoise (−0.72), which contained 691 and 33 genes, respectively. Conversely, the WGCNA performed on TCGA data revealed that the modules most positively and negatively correlated with the normal and disease phenotypes were MEturquoise (0.4) and MEblue (−0.76), which contained 4381 and 3538 genes, respectively (Figures 3(a)-3(d)).
We obtained the intersection between the positively correlated module collected from the GEO-based WGCNA (i.e., MEbrown) and that collected from TCGA-based WGCNA (i.e., MEturquoise), identifying 63 common genes. Subsequently, we obtained the intersection between the negatively correlated module collected from the GEO-based WGCNA (i.e., MEturquoise) and that collected from TCGA-based WGCNA (i.e., MEblue), identifying 150 common genes. We then combined the two sets of common genes (in total, 213 common genes) to form the WGCNA-based HCC gene set G W (Figures 3(e) and 3(f)).

HCC PPI Network Analysis
Results. We combined the 493 genes (in gene set G D ) obtained from the DEG analysis with the 213 genes (in gene set G W ) obtained from the WGCNA to form a liver cancer disease gene set G HCC , which contained 670 genes. Subsequently, we imported the 670 genes into the STRING database to generate the HCC PPI network. We then applied the MCC algorithm for analysis, identifying 25 genes in the PPI network with much higher scores compared with those of the other genes. The 25 genes were thus determined to be core gene set G P in the HCC PPI network (Figures 4(a) and 4(b)).

Discovery of Key HCC Genes Based on Survival Analysis.
We obtained the intersection between the DEG set G D and WGCNA gene set G W , obtaining 36 common genes. We then combined these genes with 25 core genes collected from the PPI network-based analyses, resulting in a total of 61 key genes (G k ) on which we performed a survival analysis that revealed 26 genes related to survival in G k ( Figure 5). These genes were ACADS, BUB1B, CCNA2, CCNB1, CDC20, CDK1, CENPE, CEP55, DLGAP5, FAM149A, KIF11, KIF20A, KIF4A, NDC80, NUF2, NUSAP1, PALM3, PBK, PRC1, RBP7, RRM2, SFN, SPINK1, TOP2A, TPX2, and TTK, as illustrated in Figure 6. 3.5. ADMET Assessment Results for Effective YCHD Ingredients. We conducted a literature analysis and find that of the 41 chemical components entered into blood, 5 components could not be retrieved from PubChem CID information. We derived data on 36 small YCHD molecules and subsequently performed ADMET assessments on the effective ingredients by using ACD/Labs and SwissADME. The corresponding structure of each small-molecule compound and its corresponding canonical SMILES were obtained from the PubChem database and then imported into the ACD/LABS software and the SwissADME online prediction system to evaluate these active ingredients.
In the assessments, we evaluated indicators such as gastrointestinal absorption, bioavailability (%), and dose (mg = 50), as presented in Table 1. Among the 36 molecules, 24 (66.7%) could be sufficiently absorbed by the stomach and intestines, and 22 (61.1%) had a bioavailability greater than 70%.

Results of the Prediction and Identification of Small YCHD Molecular Targets.
We imported all small YCHD molecules into four online prediction systems (i.e., HitPick, SEA, SwissTargetPrediction, and STITCH) by using the method described in Section 2.8 to predict molecular targets.    In (c), the triangles, circles, and squares represent small molecules, targets, and pathways, respectively, in which targets were enriched; the relationship between small molecules and targets and that between targets and pathways are represented by solid and dotted lines, respectively, and a greater node degree indicates a larger node size.
We then used thresholds to perform screening, obtaining 24 small YCHD molecules and 105 potential targets. Information on the targets is presented in Table 2.

Network Analysis of YCHD Ingredients
Interfering with HCC-Related Disease Targets 3.7.1. Enrichment Analysis of YCHD Ingredients Interfering with HCC-Related Disease Targets. We performed an enrichment analysis on YCHD molecules and 17 HCC-related disease targets. A GO functional enrichment analysis revealed that the mechanisms underlying the effects of YCHD involved 17 biological processes, 7 molecular functions, and 9 cellular components (Figure 8(a)). The main YCHD effects included inhibition of abnormal cell proliferation, strengthening of the immune system, and facilitation of metabolism, possibly explaining the mechanism through which YCHD inhibits HCC. Through a KEGG enrichment analysis, we derived 14 pathways (Figure 8(b)): a cancer overview pathway (chemical carcinogenesis), an inflammatory signaling pathway (tumor necrosis factor (TNF) signaling pathway), and nine metabolic pathways (i.e., steroid hormone biosynthesis, drug metabolism-cytochrome P450, metabolism of xenobiotics by cytochrome P450, linoleic acid metabolism, retinol metabolism, drug metabolism-other enzymes, arachidonic acid metabolism, nitrogen metabolism, and pentose and glucuronate interconversions). Therefore, we deduced that the inflammatory signaling pathway and metabolic pathways were the most critical pathways for the delivery of the antiliver cancer effects of YCHD.

Construction and Analysis of the Network of YCHD Ingredients Interfering with HCC-Related Disease Targets.
To determine the targets and pathways regulated by YCHD compounds, we constructed a compound-target-pathway network incorporating all compounds, targets, and signaling pathways (Figure 8(c)).
The nitrogen metabolism and bile secretion signaling pathways (degree = 7) were linked to seven chemical molecules; chemical carcinogenesis (degree = 6) was linked to six chemical molecules; steroid hormone biosynthesis, drug metabolism-cytochrome P450, and xenobiotic metabolism by cytochrome P450 (degree = 5) were linked to five chemical molecules; retinol metabolism, drug metabolism-other enzymes, and linoleic acid metabolism (degree = 4) were linked to four chemical molecules; and the AGE-RAGE signaling pathway in diabetes, pentose and glucuronate interconversions, arachidonic acid metabolism, and the TNF signaling pathway (degree = 3) were linked to three chemical molecules. The remaining signaling pathways were linked to at least two chemical molecules.

Regulatory Network Analysis of YCHD Ingredients
Interfering with Key Targets. S P AB < 0 was used to determine whether small YCHD molecules affected and regulated the key target set in the G k set. The results indicated that 17 small YCHD molecules interfered with the key target set ( Table 3).

Discussion
HCC is a complex disease whose onset and progression is related to multiple proteins and pathways [17]. TCM compounds contain numerous ingredients that affect multiple targets and exhibit diverse, multipathway pharmacological activities [18] that may assist in HCC treatment. Hanahan et al. proposed 10 theories of tumor hallmarks according to modern tumorigenesis and tumor development mechanisms [19]. Because TCM has a complex composition and affects various targets, it may simultaneously interfere with multiple tumor hallmarks. TCM compounds contain numerous ingredients and involve complex mechanisms that may be linked to multiple targets and multiple pathways. In TCM, compounds without certain pharmacokinetic properties cannot reach the target organs; thus, they cannot effectively exhibit their biological activity. In a preliminary study, we observed that OB screening and DL assessments were the most favorable predictors of compounds with biological activity; however, the molecule effects were minimal if the molecule content was low or if the molecules metabolized quickly. In oral medication, ingredients that enter the bloodstream contribute the most to medicinal effects. Because of the complex characteristics of TCM, comprehensively studying the internal mechanisms of TCM compounds is difficult. Nevertheless, the use of bio-informatics methods to analyze TCM compound ingredients absorbed into the bloodstream may be a viable approach to determining the properties of such complex mechanisms. Thus, the current study employed such an approach to demonstrate the pharmacological mechanisms through which YCHD interferes with HCC.
YCHD is a traditional decoction commonly used in clinical practice. The current study used the network pharmacology method to study the anti-HCC effects of YCHD. We posit that YCHD demonstrates inhibitory effects on HCC by directly regulating the metabolism, inflammation, and signaling transduction pathways. Among the 36 YCHD ingredients absorbed into the bloodstream, 17 were directly related to liver cancer targets and inhibited HCC through interference with 14 pathways. For example, the inflammationrelated signaling pathways have been identified to play a key role in the treatment and prevention of HCC [20]. Our research also revealed that MOL17 (salicylic acid), MOL29 (geniposide), and MOL33 (naringenin) can be used to treat HCC by regulating the TNF signaling pathway. Two studies have reported that geniposide [21] exerts anti-HCC effects by suppressing vascular endothelial growth factor expression and angiogenesis and that naringenin [22] suppresses the invasiveness and metastatic potential of HCC by inhibiting multiple signal transduction pathways. Energy metabolism may also play a critical role in the inhibition of HCC [23]. Our study revealed that 12 YCHD compounds affected 9 metabolism pathways. Five of these compounds can be used to treat HCC by regulating the energy metabolism pathway (i.e., MOL01 (gallic acid), MOL02 (2,5-dimethyl-7-hydroxychromone), MOL12 (7-methoxycoumarin), MOL16 (isoquercitrin), and MOL17 (salicylic acid)).

BioMed Research International
We identified 17 HCC targets directly affected by YCHD, of which 5 genes (AKR1D1, CYP2C9, CYP2E1, CYP3A4, and SLC22A7) were associated with the prognosis of patients with HCC. AKR1C3 plays crucial roles in multiple cancers, and a low expression level of AKR1D1 predicted poor prognosis and short median survival time [24]. CYP2C9 is involved in the metabolism of many carcinogens and drugs, and CYP2C9 was downregulated in human HCC progression [25]. CYP2E1 is a unique gene expressed in the liver but not expressed in HCC [26]. CYP3A4 expression is significantly low in the liver tumor tissue of patients with HCC [27]. Low SLC22A7 expression indicates a high risk of poor prognosis [28]. These five protein targets may be the prognostic targets for HCC and the targets of YCHD in the treatment of HCC. The network pharmacology analysis results demonstrate that MOL17 (salicylic acid) regulates SLC22A7 and CYP2C9, MOL20 (azelaic acid) regulates AKR1D1, MOL29 (geniposide) regulatesCYP2E1, and MOL33 (naringenin) regulates CYP3A4. In addition, gallic acid [29], physcion [30], rhein [31], isofraxidin [32], geniposide [21], naringenin [22], and chlorogenic acid [33] have been reported to exhibit anti-HCC effects.
The current study investigated how YCHD ingredients regulated and interfered with key targets; the results reveal similar trends in how the 17 small molecules interfered with the key target set. Among the 17 small molecules, MOL02 (2,5-dimethyl-7-hydroxychromone), MOL12 (7-methoxycoumarin), MOL17 (salicylic acid), and MOL23 (cirsimaritin) exhibited the most profound interference with the target set.
The current study adopted the network pharmacology method and used YCHD ingredients absorbed into the bloodstream to predict the targets of YCHD and construct meaningful pathways. In addition, the relationships between YCHD compounds, targets, and diseases were consolidated, and the multi-ingredient and multitarget characteristics of YCHD were analyzed. Finally, the mechanisms through which YCHD treat HCC were revealed.
In the current study, we combined bioinformatics and the network pharmacology method to predict which YCHD 34 Targets acted on directly by 4 small molecules 25 Core targets related to survival Interfered targets in human PPI networks Figure 10: Targets acted on directly by four YCHD small molecules, core targets related to survival, and targets interfered with. Purple triangles represent the four small YCHD molecules, dark green octagons represent the 34 potential targets that the four small molecules directly acted on, ellipses represent the proteins that interfered with the 34 direct action targets in the human PPI network, and the V shapes represent the 25 key genes associated with HCC survival in G k .

20
BioMed Research International ingredients entered the bloodstream as well as the molecular targets and pathways through which YCHD treats HCC; the current results may serve as a basis for subsequent experimental studies. The specific mechanisms governing these ingredients should be examined through experiments in future studies.

Data Availability
The data used to support the findings of this study are included within the article.