Molecular mechanism of the treatment of lung adenocarcinoma by Hedyotis Diffusa: an integrative study with real-world clinical data and experimental validation

Background: Hedyotis diffusa (H. diffusa) contains a variety of active ingredients and is used to treat a variety of tumors. This study used real-world clinical data and laboratory experiments to evaluate the efficacy and possible molecular mechanism of H. diffusa in the treatment of lung adenocarcinoma (LUAD).
Methods: Phenotype-genotype and herb-target associations were extracted from the SymMap database, and disease-gene associations were extracted from the MalaCards database. A molecular network-based correlation analysis was conducted on the set of genes associated with traditional Chinese medicine (TCM) and the set of genes associated with diseases and symptoms. The network separation metric S_AB was then applied to evaluate the network proximity between TCM and symptoms. Finally, a cell apoptosis assay, Western blot, and real-time PCR were used for experimental validation.
Results: The study included 85,437 electronic medical records (318 patients with LUAD). The proportion of prescriptions containing H. diffusa was much higher in the LUAD group than in the non-LUAD group (p < 0.005). We compared symptom relief between patients who used H. diffusa and those who did not: except for fatigue, palpitations, and dizziness, the improvement rate of symptoms was higher in the user group than in the non-user group. We selected the five most frequent symptoms in the user group, namely cough, expectoration, fatigue, chest tightness, and wheezing, and combined the genes of these five symptoms into one group. The overlapping genes obtained were CTNNB1, STAT3, CASP8, and APC. CTNNB1 was selected for experimental validation. The proliferation rate of A549 LUAD cells in the drug intervention group was significantly lower than that in the control group, in a concentration-dependent manner. H. diffusa promoted the apoptosis of A549 cells, and the apoptosis rate in the high-concentration group was significantly higher than that in the low-concentration group. The mRNA and protein expression levels of CTNNB1 in the drug intervention group were significantly decreased.
Conclusion: H. diffusa inhibits proliferation and promotes apoptosis of LUAD A549 cells, which may be related to its regulation of CTNNB1 expression.


Introduction
Globally, the incidence and death rates of lung cancer are among the highest of all malignant tumors. Lung adenocarcinoma (LUAD) is a type of non-small cell lung cancer (NSCLC) and accounts for about 40%-55% of lung cancer patients (Herbst et al., 2018). LUAD tends to occur in female and male non-smokers, is more common in China, and is characterized by brain, bone and liver metastases (Stein et al., 2019). LUAD usually has no obvious clinical symptoms in its early stage, and about 75% of patients are already in the middle or late stage when the disease is found, with a low 5-year survival rate and a poor prognosis (Wang et al., 2017). Although tyrosine kinase inhibitors and immunotherapy have brought survival benefits to some patients with LUAD, the overall survival rate is still low. Therefore, there is an urgent need to find safer and more effective drugs to treat LUAD.
Traditional Chinese medicine (TCM) has rich clinical experience and a large number of effective drugs for the treatment of tumors. Hedyotis diffusa (H. diffusa) is slightly bitter and sweet in flavor and cold in nature, belongs to the stomach, large intestine and small intestine meridians, and has the efficacy of clearing heat, removing toxins, and inducing diuresis (Corrales et al., 2020). With a variety of active ingredients, H. diffusa can treat a variety of tumors, such as digestive system tumors, lung cancer, prostate cancer and cervical cancer, and has anti-tumor effects such as inhibiting angiogenesis, inhibiting proliferation, and promoting apoptosis (Ma et al., 2014; Ye et al., 2019). Studies have shown that the main antitumor components of H. diffusa are ursolic acid and oleanolic acid. Ursolic acid inhibits the growth of breast and colon cancer cell lines by triggering apoptosis and cell cycle arrest and through anti-metastatic and anti-angiogenic properties, acting on various molecular targets and signaling pathways (He et al., 2018). Oleanolic acid induces apoptosis by regulating different signaling pathways and biological processes through multiple targets, including intracellular calcium levels, the NF-κB, Notch and JAK/STAT3 signaling pathways, and the expression of ribose polymerase (He et al., 2018; Chan et al., 2019). However, large-sample clinical data analyses and related mechanism studies on the treatment of LUAD with H. diffusa are still lacking.
Symptom phenotypes (i.e., symptoms and signs) are among the main clinical manifestations of disease. Because they can be obtained through natural human perception and cognition, they play a vital role in medical visits, clinical diagnosis, and disease treatment. In recent years, researchers have developed a series of systems or network pharmacology strategies to detect the molecular mechanisms of TCM (Schroen et al., 2015). One study extracted symptom co-occurrences from clinical textbooks to construct a phenotype network of clinically co-occurring symptoms, and incorporated high-quality symptom-gene associations and protein-protein interactions to explore the molecular network patterns of symptom phenotypes (Shu et al., 2021). Lu et al. showed that an integrated network analysis method could be used to identify robust symptom clusters (SCs) and investigate their molecular mechanisms, which would be valuable for symptom science and precision health (Lu et al., 2020). A protein-protein interaction network framework revealed that network distance is a general rule in TCM treatment selection (Gan et al., 2023). By constructing a symptom-based human disease network, Zhou et al. analyzed the relationship between disease-symptom similarity and gene-protein interactions, and explored the relationship between the diversity of clinical manifestations of a single disease and its molecular mechanism (Zhou et al., 2014). GPU acceleration can effectively speed up the implementation of drug-symptom correlation analysis (Tian et al., 2020). The Herb-Target Interaction Network (HTINet) approach obtained low-dimensional representations of nodes by constructing a heterogeneous network and performing network embedding, and used these representations to predict the relationships between Chinese herbal medicines and targets (Wang et al., 2019). A drug-target association prediction method that uses deep learning and heterogeneous network topology to calculate the similarity between drugs and targets has been proposed (Zong et al., 2019). Other proposed methods include an iterative algorithm on three-layer heterogeneous networks for drug repositioning (Wang et al., 2014) and a Network-based Random Walk with Restart (NRWRH) method for predicting unknown drug-target interactions (Chen et al., 2012).
Based on real-world clinical data, this study reviewed and analyzed the use of H. diffusa in LUAD and the associated symptom improvement, and performed the relevant statistical analyses. In addition, the network relationship between symptom improvement and disease genes was analyzed, the gene target most closely related to symptom improvement in lung adenocarcinoma was screened out, and molecular and cellular experiments were designed to validate the regulatory effect of H. diffusa on this target. The present study demonstrated the efficacy and possible molecular mechanism of H. diffusa in the treatment of lung cancer at both clinical and experimental levels, and provides a basis and new ideas for the study of the molecular mechanism of H. diffusa in the treatment of LUAD. The technical flowchart of the study is shown in Figure 1.

Clinical data and preprocessing
The data of 85,437 inpatients were collected from the electronic medical records (EMRs) database of a provincial cancer hospital, including 318 patients with LUAD and 85,119 patients without LUAD. The data included each patient's age, gender, length of stay (LOS), admission records, discharge records, diagnosis information, and medical order information. The patients' personal information was removed when the data were exported. Since admission and discharge records are free text that cannot be used directly for statistical analysis, we used a clinical information extraction tool (Shu et al., 2020) to extract the biomedical entities (e.g., symptoms, diseases) from these records. Then, to normalize the various clinical term descriptions, we manually checked and standardized the disease and herb terms by referring to the 10th Revision of the International Classification of Diseases (ICD-10) (Chabra S et al., 2019) and the Pharmacopoeia of the People's Republic of China 2020 Revision (ChP 2020) (Chinese Pharmacopoeia Commission, 2020), respectively. In addition, the symptom data were standardized by two physicians in a back-to-back manner with reference to the Unified Medical Language System (UMLS) (Bodenreider et al., 2004).

External data sources
The herbal target data and symptom-related gene data used in this article were downloaded from the SymMap database (Wu et al., 2019) (www.symmap.org). SymMap integrates traditional Chinese medicine (TCM) with modern medicine (MM) through both internal molecular mechanisms and external symptom mapping, thus providing extensive information on herbs/ingredients, targets, and the clinical symptoms and diseases they are used to treat for drug screening efforts. The SymMap database contains target sets of 618 herbal medicines, 832 TCM symptom genes, and a total of 20,965 targets/genes.

Statistics analysis
We used chi-square tests and t-tests on the processed data. The chi-square test assesses whether two categorical variables (gender, symptoms) are associated, based on the difference between observed and expected frequencies. We first construct a contingency table of the observed frequencies of the categorical variables. The expected frequency of each cell is then calculated under the assumption of independence. Next, the difference between the observed and expected frequency in each cell is squared and divided by the expected frequency. Finally, the chi-square statistic is obtained by summing these standardized differences; it indicates whether the discrepancy between observed and expected frequencies exceeds what randomness alone would produce. The t-test is used to compare whether the means of a continuous variable (age, length of hospitalization) differ significantly between two samples. We first calculate the mean and variance of each sample and then the t-value, which expresses the difference between the two sample means relative to its standard error.
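The two statistics described above can be sketched in a few lines of standard-library Python. This is only an illustration: the counts and lengths of stay are hypothetical, the function names are ours, and the unequal-variance (Welch) form of the t statistic is an assumption, since the text does not say which variant was used.

```python
from math import sqrt
from statistics import mean, variance


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def welch_t_statistic(sample_a, sample_b):
    """Difference of the two sample means divided by its standard error (Welch form)."""
    se = sqrt(variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se


# Hypothetical counts: rows = used / unused group, columns = relieved / not relieved.
table = [[30, 10], [20, 20]]
chi2 = chi_square_statistic(table)

# Hypothetical lengths of stay (days) in two groups.
t_value = welch_t_statistic([10, 12, 14, 16], [8, 9, 10, 11])
```

To obtain a p-value, the chi-square statistic would be compared against a chi-square distribution with (rows - 1) × (columns - 1) degrees of freedom, and the t-value against a t distribution; a statistics library is the usual choice for that step.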

Drug-symptom molecular network analysis
To further study the mechanism by which traditional Chinese medicine treats disease, a molecular network-based correlation analysis was conducted on the set of genes associated with traditional Chinese medicine and the set of genes associated with diseases and symptoms. Network medicine leverages the human protein-protein interactome (PPI) to reveal disease and drug patterns. The PPI is a network whose nodes are proteins linked to each other by physical (binding) interactions. Network medicine has shown that disease-associated proteins tend to form locally clustered modules in the PPI, and that a shorter network distance between two disease modules is indicative of their comorbidity.
The network relation between two node sets (e.g., the target modules of herbs A and B) can be measured using the network distance d_AB and the network separation S_AB. The network distance d_AB is the average of the network distances between all node pairs in the two node sets. The network separation S_AB was designed to characterize disease-disease and drug-drug relations. S_AB compares the shortest distances between proteins within each gene set (d_AA for the TCM target set and d_BB for the disease gene set) with d_AB. For gene set A and gene set B in the protein network, S_AB is calculated as shown in the formula.
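For reference, the separation measure as it is usually written in the network medicine literature is given below. We assume this is the form intended here, with the angle brackets denoting mean shortest distances within set A, within set B, and between the two sets.

```latex
S_{AB} = \langle d_{AB} \rangle - \frac{\langle d_{AA} \rangle + \langle d_{BB} \rangle}{2}
```

Under this definition, a negative value means the two gene sets are, on average, closer to each other than their members are to one another, i.e., the sets overlap in the network.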
Taking a Chinese medicine (X) and a symptom (Y) as an example, the shortest distance between a Chinese medicine target (x) and a symptom gene (y) is calculated as shown in the formula.
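A minimal sketch of how such a separation value could be computed on a toy network is shown below. It assumes the closest-distance form of the measure that is common in network medicine: for each node, the distance to the nearest node of the other set, and within a set, the distance to the nearest other node of the same set. The graph, the node names P1 to P6 and the helper functions are hypothetical and are not taken from the study's pipeline.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def closest_distances(graph, set_x, set_y, exclude_self=False):
    """For each node in set_x, the distance to its closest node in set_y."""
    result = []
    for x in set_x:
        dist = bfs_distances(graph, x)
        candidates = [dist[y] for y in set_y
                      if y in dist and not (exclude_self and y == x)]
        if candidates:
            result.append(min(candidates))
    return result


def average(values):
    return sum(values) / len(values)


def separation(graph, set_a, set_b):
    """S_AB = <d_AB> - (<d_AA> + <d_BB>) / 2 using closest distances."""
    d_aa = closest_distances(graph, set_a, set_a, exclude_self=True)
    d_bb = closest_distances(graph, set_b, set_b, exclude_self=True)
    d_ab = closest_distances(graph, set_a, set_b) + closest_distances(graph, set_b, set_a)
    return average(d_ab) - (average(d_aa) + average(d_bb)) / 2


# Hypothetical toy interactome: a simple chain P1 - P2 - P3 - P4 - P5 - P6.
graph = {
    "P1": ["P2"], "P2": ["P1", "P3"], "P3": ["P2", "P4"],
    "P4": ["P3", "P5"], "P5": ["P4", "P6"], "P6": ["P5"],
}

s_far = separation(graph, {"P1", "P2"}, {"P5", "P6"})               # sets far apart
s_near = separation(graph, {"P1", "P2", "P3"}, {"P2", "P3", "P4"})  # overlapping sets
```

In this toy case the distant sets give a positive value and the overlapping sets a negative one, which matches the interpretation of S_AB < 0 as two gene sets sharing a network neighborhood.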

2) CCK-8 assay
After treatment of the cells, the culture medium was removed and replaced with 100 μL of fresh medium in each well. Next, 10 μL of CCK-8 was added and the cells were incubated at 37 °C for 1 h. The OD value at 450 nm was measured on a microplate analyzer, and data analysis was performed.
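The text does not state how the OD readings were converted into an inhibition value. One common convention, shown here purely as an assumption, expresses inhibition as one minus the blank-corrected ratio of treated to control absorbance. The OD values and the function name are hypothetical.

```python
def inhibition_rate(od_treated, od_control, od_blank=0.0):
    """Percent inhibition relative to the untreated control, after blank subtraction."""
    viability = (od_treated - od_blank) / (od_control - od_blank)
    return (1.0 - viability) * 100.0


# Hypothetical OD450 readings for a low- and a high-concentration well.
readings = {"low": 0.9, "high": 0.5}
rates = {
    dose: inhibition_rate(od, od_control=1.2, od_blank=0.2)
    for dose, od in readings.items()
}
```

With these made-up readings, the higher concentration yields the larger inhibition rate, which is the pattern a concentration-dependent effect would show.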

3) Flow cytometry
Cell apoptosis was detected by staining with Annexin V-FITC/PI. Cells were collected by centrifugation and the supernatant was discarded; the cells were washed twice with PBS and 500 μL of binding buffer was added. Next, 5 μL of Annexin V-FITC and 10 μL of propidium iodide were added and the cells were incubated at room temperature in the dark for 5-15 min. The samples were then analyzed by flow cytometry.

4) Western blot
Cells were lysed in RIPA buffer containing PMSF, phosphatase inhibitor and protease inhibitor (at 100:1:1:1). Protein quantification was performed using a BCA protein quantification kit. Lysates were boiled in 5× loading buffer and separated on 10% SDS-PAGE gels at 80 V for 30 min and then 120 V for 50 min, followed by transfer to a PVDF membrane. Membranes were blocked in 5% skim milk-TBST at room temperature for 2 h. The membrane was washed three times with TBST, 8 min each time, and incubated with primary antibody at 4 °C overnight. Primary antibodies were anti-EGFR, KDR, MAPK3, PTPN11 and CTNNB1 (1:1,000; KDR, MAPK3 and PTPN11) and anti-GAPDH (1:1,000). The membrane was washed three times with TBST for 8 min each and incubated with secondary antibody at room temperature for 2 h. Secondary antibodies included HRP-labeled sheep anti-rabbit IgG (1:10,000) and HRP-labeled sheep anti-mouse IgG (1:10,000). The blots were developed using an ECL kit and the bands were analyzed using a gel image processing system (Gel-Pro-Analyzer software).

Results
Analysis of the role of H. diffusa in patients with LUAD
First, as shown in Table 2, the proportion of LUAD patients using prescriptions containing H. diffusa was 39%, much higher than that in the non-LUAD group (p < 0.005).
Second, within the LUAD patient group, demographic statistics were calculated for the group using H. diffusa and the group not using it. We extracted patient symptoms from admission and discharge medical records through manual annotation and a clinical information extraction tool (Human-machine Cooperative Phenotypic Spectrum Annotation System, HCPSAS; www.tcmai.org). Table 3 presents the statistical comparison between the used group and the unused group; only the symptoms cough and sleep disturbances showed a statistically significant difference.
Finally, statistical analysis was performed on the symptoms that were significantly relieved among discharged cases. We counted the relieved symptoms of patients in the used group and the unused group, and sorted them in descending order of the number of symptoms in the used group; the number in brackets is the total number of symptoms in that group. Except for Asthenia, Heart pounding, and Dizziness, the improvement rate of symptoms in the used group was higher than that in the unused group, as shown in Table 4.

Herb-symptom network analysis
First, based on the symptom statistics of admitted patients, we selected the five most frequently occurring symptoms in the used group: Non-productive cough; Productive cough; Asthenia; Chest pain, dull; and Asthma. For these symptoms, we performed exact or approximate matching in the symptom database and obtained the numbers of genes shown in Table 5.
Based on the above symptom gene sets and the herbal medicine target set, we combined the genes of the above five symptoms into one group, and put the total symptom gene set and the herbal medicine target set into the gene association analysis platform. The overlapping genes obtained were CTNNB1, STAT3, CASP8, and APC. Then, the ten symptoms with the highest degree of improvement in the used group were selected for analysis, and the S_AB value between the Hedyotis diffusa target set and the gene set of each of the ten symptoms was calculated. The calculation results are shown in Table 6 and illustrated in Figure 2. S_AB < 0 means that the targets of the traditional Chinese medicine and the genes of the symptom are located in the same network area; the smaller the value, the stronger the correlation. Therefore, when evaluating the correlation between a traditional Chinese medicine and a symptom gene set, a negative S_AB value is the more meaningful result, because it indicates a closer correlation between the traditional Chinese medicine and the symptom. Combined with the comparison of improvement rates in the previous section, this verifies that Hedyotis diffusa has a positive effect on the above symptoms.

Effects of H. diffusa on apoptosis of lung cancer cells
Flow cytometry was used to detect the effects of H. diffusa components on the apoptosis of A549 LUAD cells, and the results are shown in Figure 3. The apoptosis rates of the control group, the low concentration group and the high concentration group were 9.05%, 32.53%, and 74.43%, respectively. These results indicate that H. diffusa promotes the apoptosis of lung cancer cells.

Regulatory effects of H. diffusa on CTNNB1 in lung cancer cells
Western blot was used to detect the effects of H. diffusa on key proteins of LUAD, and the results are shown in Figures 4A,B. The expression level of CTNNB1 protein in cells treated with H. diffusa was significantly lower than that in the control group (p < 0.05). In addition, the protein expression level in the high concentration group was significantly lower than that in the low concentration group (p < 0.05). CTNNB1 was further analyzed at the mRNA level; as shown in Figure 4C, the mRNA results were consistent with the protein results. The mRNA expression level of CTNNB1 in cells treated with H. diffusa was significantly lower than that in the control group (p < 0.05). In addition, the mRNA expression level in the high concentration group was significantly lower than that in the low concentration group (p < 0.05).

FIGURE 2 Illustration diagram of S_AB analysis between Hedyotis diffusa and relieved symptoms.

Discussion
LUAD is the most common subgroup of NSCLC and has aggressive and metastatic potential, leading to drug resistance and treatment failure (Wang et al., 2009; Travis et al., 2015). Despite advances in the detection and treatment of NSCLC, the prognosis remains poor when the disease is detected at a late clinical stage, and some patients still suffer rapid disease recurrence and progression, resulting in a 5-year survival rate of only about 15% (Erridge et al., 2007). TCM is an important part of the treatment of LUAD. Owing to its characteristic syndrome differentiation and treatment, and in combination with surgery, chemotherapy and targeted therapy, TCM plays an anti-tumor role through two aspects: self-healing ability and disease resistance. Recently, an increasing number of studies (Wu et al., 2022; Huang et al., 2022; Qian et al., 2019) have shown that H. diffusa has antitumor effects, including in the treatment of LUAD. H. diffusa is a well-known Chinese herbal medicine with thousands of years of clinical practice history. It is an important component of various anticancer formulations and has been reported to inhibit tumor cell proliferation and metastasis and to reduce side effects after chemotherapy (Song et al., 2019). Early pharmacological studies confirmed that H. diffusa has anti-tumor, anti-inflammatory, immunoregulatory, antioxidant and other biological activities (Shen et al., 2016). Current studies on its mechanism in lung adenocarcinoma remain at the theoretical stage of network pharmacology or at the in vitro experimental stage, and analyses of real-world clinical data have not been reported. Therefore, based on real-world clinical data, our study analyzed the key targets of LUAD from the perspective of drug-related symptom improvement and verified them with biological experiments, so as to provide a theoretical basis and support for the mechanism of TCM in treating LUAD.
H. diffusa is an important component of TCM clinical anticancer prescriptions (Han et al., 2020; Wazir et al., 2021; Zhang et al., 2021), and the results of this study showed that it is used significantly more frequently in patients with LUAD than in patients with other diseases (p < 0.05). A literature study on the treatment of NSCLC (Qi et al., 2021) showed that H. diffusa is one of the more frequently prescribed Chinese medicines. Many studies in Taiwan have also shown that it is one of the most frequently prescribed TCM herbs (Hung et al., 2017; Kuo et al., 2018; Ting et al., 2017; Wang et al., 2021) in the treatment of other cancers (such as liver, gastric, nasopharyngeal and pancreatic cancer). An experimental study (Huang et al., 2022) showed that H. diffusa injection significantly reduced the survival of lung adenocarcinoma cells in vitro, inhibited tumor growth in BALB/c nude mice in vivo, and induced ferroptosis through Bax regulation of VDAC2/3 by inhibiting the Bcl-2 gene. In addition, the results of this study showed that H. diffusa significantly improved clinical symptoms (such as cough, chest tightness, wheezing, sputum, shortness of breath, headache, and dry mouth) and improved quality of life.
Most previous studies have focused on the relationship between TCM and disease, whereas TCM diagnosis and herbal therapy are based on the symptom phenotype rather than on disease diagnosis. Our study therefore builds on symptoms rather than disease, which is consistent with the practice of TCM in diagnosing and treating patients based on the symptom phenotype. Gan et al. (2023) found that proteins associated with symptoms tend to cluster in a local PPI module, and that the network proximity between herbal targets and symptom modules indicates the effectiveness of herbs in treating symptoms. Based on the gene sets of the improved symptoms and the TCM target set, we combined the genes of the five symptoms into one group, and put the total symptom gene set and the TCM target set into the gene association analysis platform. The overlapping genes obtained were CTNNB1, STAT3, CASP8, and APC. We selected CTNNB1, the highest-weighted target, for the subsequent validation. Wang et al. (2017) used network pharmacology to analyze the potential targets of H. diffusa-Astragalus membranaceus for the treatment of colorectal cancer, and found that the core targets also included CTNNB1. Zhou et al. (2020) included 564 patients with LUAD and found a poor prognosis in patients with primary lung adenocarcinoma carrying a CTNNB1 mutation. In vitro, we treated lung adenocarcinoma A549 cells with H. diffusa, and the apoptosis experiment showed that the higher drug concentration promoted apoptosis more than the lower concentration. Furthermore, Wang et al. (2022) showed that H. diffusa significantly reduced the survival rate of lung adenocarcinoma cells in vitro and significantly inhibited the adhesion, invasion and migration of lung adenocarcinoma A549 cells. We further verified the effect of H. diffusa on CTNNB1 at the mRNA and protein levels, and found that, compared with the control group, the drug group showed significantly lower mRNA and protein levels of CTNNB1, and that the inhibitory effect was more pronounced in the high concentration group than in the low concentration group. We therefore speculate that H. diffusa may promote the apoptosis of lung cancer A549 cells by inhibiting the CTNNB1 gene. A shortcoming of our study compared with that of Peng et al. is that we did not consider prescriptions, but rather looked at the effects of an individual drug (Wang et al., 2022; Peng et al., 2023).
In conclusion, based on real-world data, our study analyzed the relationship between changes in the clinical symptoms of LUAD and the use of H. diffusa, identified key targets with a network algorithm, and verified the key target experimentally. The overall approach is rigorous and can provide theoretical support for the clinical treatment of lung adenocarcinoma.

FIGURE 1 Schematic diagram of the technical process.

FIGURE 3 Effects of H. diffusa on the apoptosis of A549 LUAD cells. (A) Control group; (B) Low concentration group; (C) High concentration group.

This work was supported by grants from the National Natural Science Foundation of China (82174533), 2021 Henan Province Key R&D and Promotion Special Programs (212102310761), 2019 Henan University of Traditional Chinese Medicine Outstanding Doctoral Fund (RSBSJJ2019-17) and the Natural Science Foundation of Henan Province (222300420538).

TABLE 1
Primer sequence list.

TABLE 2
Statistics on LUAD patients and users of H. diffusa.

TABLE 3
Hospitalization information statistics of used group and unused group.

TABLE 4
Statistics on symptom improvement in the used group and unused group.

TABLE 5
Target and gene set information for specific herbs and symptoms.

From the results table, it can be seen that the S_AB values of Hedyotis diffusa and symptoms such as Non-productive cough, Shortness of breath, Breath-holding spells, Headache, cephalalgia and Xerostomia are all less than 0.

TABLE 7
Inhibitory effects of the H. diffusa at different concentrations on A549 LUAD cells.