Elucidating the molecular targets of Curcuma longa for breast cancer treatment using network pharmacology, molecular docking and molecular dynamics simulation

Background: To elucidate the molecular mechanisms of Curcuma longa ( C. longa ) in breast cancer treatment. Methods: Phytocompounds of C. longa were obtained from Dr. Duke’s Phytochemical and Ethnobotanical Database. Potential active targets were retrieved from Bindingdb, SEA and Swiss Target Prediction databases. Breast cancer targets were retrieved from the Therapeutic Target Database. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway enrichment analyses were done using DAVID and KOBAS3.0 databases respectively. The Cytoscape software was used to construct the phytocompound-target-pathway network. The PyRx and Desmond software were utilized for molecular docking and molecular dynamics simulation respectively. Results: Out of one hundred and fifty-six phytocompounds, fifty-four modulated proteins involved in breast cancer. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes pathway analysis indicated C. longa exerts its therapeutic effect through regulating several key pathways. Molecular docking analysis revealed that most phytocompounds of C. longa had a good affinity with the key targets. Molecular dynamics simulation showed that ethinylestradiol formed stable ligand-protein complexes. Conclusion: The results of this study will enhance our understanding of the potential molecular mechanisms by which C. longa inhibits breast cancer and lay a foundation for future experimental studies.


Introduction
Breast cancer, a malignant condition affecting the breast tissue, is recognized as the second leading cause of cancer-related deaths in women worldwide, following lung cancer [1]. A vast majority of breast cancers (85%) arise in the epithelium of the ducts and about 15% arise in lobules in the glandular tissue of the breast. In 2020, the World Health Organization reported that around 2.3 million women worldwide received a breast cancer diagnosis, leading to approximately 685 deaths globally [2].
Biomarkers, including the presence of hormone receptors, extra copies of human epidermal growth factor 2 (HER2) gene and an excess amount of HER2 protein, helps classify breast cancer into three types-estrogen receptor-positive breast cancer, HER2-amplified breast cancer, and triple-negative breast cancer [3,4]. The different types of breast cancer feature varying risk factors for incidence, disease progression, therapeutic response, and preferred organ site for metastasis [4].
Several efforts have been put into the treatment of breast cancer. This ranges from chemotherapy, immunotherapy, cancer vaccinations to stem cell transformation. Among the treatment options, chemotherapy is widely used, involving cytostatic and cytotoxic drugs. Although such drugs have been highly effective against different forms of cancers, there have been some limitations associated with them, including a range of side effects, high cost and toxicity as well as not being eco-friendly [5]. Therefore, it is crucial to look for more alternative drugs that have fewer side effects, are less expensive, and are less toxic to patients.
Medicinal plants are an essential but frequently ignored source for novel antitumor drugs. Many plant-based chemotherapeutics, including vincristine, vinblastine, sulphoraphane, and paclitaxel, have been shown to be effective against a variety of tumors. These drugs are in high demand because they are known to have cytotoxic effects on cancer cells while having non-toxic effects on normal cells [6,7]. Sometimes, medicinal plants are used extensively as a complementary therapy to treat several cancers, including breast cancer, with relatively few and milder side effects [8,9].
Curcuma longa (C. longa), commonly known as turmeric, is a member of the Zingiberaceae family and has been used for several years in treating several diseases [10]. The rhizome of C. longa is known to possess several therapeutic activities, including anti-inflammatory, anti-diabetic, hepato-protective, hypolipidemic, anti-diarrhoeal, anti-asthmatic and anti-cancer activities [11]. The bioactive compounds found in C. longa, known as curcuminoids, including curcumin, demethoxycurcumin, and bisdemethoxycurcumin, have demonstrated anti-cancer properties against various cancer cell lines, including breast cancer [12,13]. These effects have also been observed in animal models [14,15]. However, the multiple targets, pathways and molecular mechanisms of its antineoplastic effect are still unknown.
Network pharmacology is a research field that examines the molecular mechanism of action of drugs and plant-based compounds in treating diseases using multiple components and targets as the starting point [16]. It is a discipline that integrates systems biology, pharmacology, and bioinformatics to investigate the synergistic effects of several components, targets, and pathways, in contrast to the traditional single-component/single-target concept [17,18]. In this study, network pharmacology was used to predict the potential therapeutic targets and signaling pathways of C. longa for the treatment of breast cancer The findings of this study are expected to provide new targets and signaling pathways for breast cancer treatment.

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis
GO and KEGG pathway analyses were done using the DAVID (https://david.ncifcrf.gov/) and KEGG Orthology Based Annotation System 3.0 (http://kobas.cbi.pku.edu.cn/) databases respectively [17]. The top 10 GO results were sorted and displayed as bubble plots. Similarly, bubble plots were created to display the results of KEGG pathway enrichment. The ggplot2 package in R programming was used for generating these plots base on P-value. A significance level of P < 0.05 was used [27].

Construction of protein-protein interaction (PPI) network
The protein targets were inputted into the STRING database (https://string-db.org/) to generate a network illustrating the interactions between proteins [28].

Construction of component-target-pathway network
The component-target-pathway network was created using the Cytoscape 3.8.2 software. Active phytocompounds, targets, and pathways represent the input nodes, while edges represented the connections between these nodes [17].

Molecular docking simulation
Molecular docking analyses were run between CA1, CA2 (the proteins that were targeted by most phytocompounds), EGFR, AKT1 (the proteins that interacted with most targets). ESR1, ESR2 and PGR were also picked for docking as they are frequently implicated in breast cancer [29]. The 3D structures of these proteins were collected from RCSB Protein Data Bank (PDB) (https://www.rcsb.org/). The PDB IDs of the proteins are CA1:1HCB, CA2:3DC1, EGFR:5FED, AKT1:3O96, ESR1:1ERR, ESR2:3OLS and PGR:1A28. The heteroatoms and water molecules in these structures were removed using Discovery Studio 2021, and hydrogen bonds were added using Chimera 1.14 to reduce docking interference [25,30]. Computed Atlas of Surface Topography of proteins 3.0 (http://sts.bioe.uic.edu/castp/) was used to identify the active sites of the proteins [31]. The structure-data file structures of the phytocompounds were collected from PubChem and ZINC databases. The active sites of protein targets were utilized for docking the phytocompounds, and the binding affinities were assessed using Autodock Vina, a component of PyRx. The resultant interactions between proteins and phytocompounds were then visualized using Discovery Studio 2021 [30].

Molecular dynamics simulation
Molecular dynamics simulation was conducted for a duration of 50 nanoseconds using Desmond software, developed by Schrödinger [32]. The simulation employed the standard approach of integrating Newton's classical equation of motion to track the movements of atoms over time [33,34]. Its objective was to predict the binding status of the ligand within a physiological environment. To prepare the protein-ligand complex, the Maestro's Preparation Wizard was utilized for optimization and minimization. The System Builder tool was employed to configure each system, incorporating the TIP3P solvent model within an orthorhombic box and the OPLS 2005 force field [35]. The models were neutralized by adding counterions, and the simulation recreated physiological conditions with a concentration of 0.15 M NaCl. During the simulation, the isothermal-isobaric ensemble was used with a temperature of 300 K and pressure of 1 atm. The models were relaxed prior to the simulation. Trajectories were saved at intervals of 100 ps for subsequent analysis, and the stability of the simulation was evaluated by comparing the Root Mean Square Deviation (RMSD) of the protein and ligand over time [36].

Active components and their targets
One hundred and fifty-six phytocompounds were identified from C. longa in which fifty-four modulated proteins involved in breast cancer (Table 1).

Druglikeness and ADMET profile
Among the fifty-four phytocompounds from C. longa, eleven were predicted to have a positive druglikeness score, while forty-three had a negative score. Thiamin had the highest druglikeness score at 0.87. Five phytocompounds, alpha-tocopherol, beta-sitosterol, campesterol, cholesterol, and stigmasterol, violated one of lipinski's rules of five. The remaining phytocompounds did not violate any of the rules. Table 2 displays the list of phytocompounds along with their druglikeness scores. The ADMET profile of each phytocompound is presented in Figure 1 as a heat map. With the exception of o-coumaric acid, all phytocompounds from C. longa were predicted to have positive scores for human intestinal absorption. Thirteen phytocompounds were determined to be non-blood brain barrier permeant, while thirty-three had positive bioavailability scores. Except for curcumin, dihydrocurcumin, divanillalacetone, and tetrahydrocurcumin, all phytocompounds were predicted to be non-inhibitors of P-glycogen. Six, eight, fifteen, three, and eighteen phytocompounds were predicted to be inhibitors of CYP3A4, CYP2C9, CYP2C19, CYP2D6, and CYP1A2, respectively. Thirty-four phytocompounds were projected to cause eye irritation. Quercetin was predicted to be mutagenic, while eight phytocompound were inhibitors of human ether a-go-go. Among the phytocompounds investigated, thirteen demonstrated a potential for hepatotoxicity, while eighteen showed a tendency towards nephrotoxicity. However, none of the phytocompounds exhibited any indication of carcinogenicity. Thirty phytocompounds were predicted to be skin sensitive, twenty-two were respiratory toxic, and eighteen were reproductive toxic.

Network pharmacology analysis
The component-target-pathway network consisted of 491 edges, with 376 representing component-target interactions and 115 representing target-pathway interactions. The network included 123 nodes, representing 54 phytocompounds, 47 targets, and 22 pathways. CA2 had the highest edge count, with interactions found with 24 protein molecules. Similarly, the PI3K-Akt signaling pathway modulated the highest number of proteins, with interactions observed with 12 proteins (Figure 4).

Molecular docking analysis
The phytocompounds of C. longa as predicted by network pharmacology were docked with the selected breast cancer targets. Table 3 shows the binding energies of the three best docked phytocompounds for each protein. Figure 5 and Figure 6 illustrate the interactions between the amino acid residues of the targets and the phytocompounds with the lowest binding energies.

Molecular dynamics simulation
In order to evaluate the stability of the protein-ligand complexes at their binding sites, molecular dynamics (MD) simulation was conducted. MD simulation studies the behaviour of proteins and ligands for a particular period of time. The entire simulation was carried out for 50 nanoseconds in the production phase for the protein-ligand complexes. The stability of the complexes during the MD simulation was evaluated using two parameters: RMSD and radius of gyration. Figure 7-12 shows the RMSDs of the proteins when compared to their initial positions. The results show that in all protein-ligand complexes (especially in the EGFR-demethoxycurcumin, ESR1-ethinylestradiol, PGR-ethinylestradiol and PGR-curcumol complexes), the proteins remained stable throughout the MD simulation. In the AKT1-demethoxycurcumin complex, the protein reached equilibrium position at 18 ns and remained in equilibrium position almost throughout the entire simulation. Also, RMSDs of ligands compared to the position of the proteins were tracked for each complex to determine how well the binding pose was maintained during MD simulation. The results show that in the PGR-ethinylestradiol complex, the position of the ligand was preserved during the entire simulation. In the AKT1-demethoxycurcumin complex, demethoxycurcumin was not stable throughout the entire simulation. It was observed that demethoycurcumin detached from AKT1 at 6 th and reattached at 33 rd ns. Also, in the ESR1-ethinylestradiol complex, ethinylestradiol detached from ESR1 at 44 th ns and reattached at 46 th ns. In the Figure 3 Protein-protein interaction of regulated proteins by the phytocompound from C. longa. In the network diagram, each protein is represented by a node or circle, and the interactions between proteins are denoted by lines. The thickness of the lines reflects the strength of the interaction. If a circle is not connected to any other circles by lines, it signifies that the protein did not interact with any other protein within the network. C. Longa, Curcuma longa. Submit a manuscript: https://www.tmrjournals.com/pmr ESR1-1-Hydroxy-1,7-bis-(4-hydroxy-3-methoxyphenyl)-6-heptene-3,5dione complex, 1-Hydroxy-1,7-bis-(4-hydroxy-3-methoxyphenyl) -6-heptene-3,5-dione detached and moved away completely from ESR1 in the 7 th ns. In the EGFR-demethoxycurcumin complex, demethoxycurcumin detached and moved away completely from EGFR in the 5 th ns. In the PGR-curcumol complex, cuurcumol also detached and moved away from PGR in the 4 th ns. Radius of gyration is another parameter that is used to study the stability of the proteins and ligands. The radius of gyration of the ligands were observed during the MD simulation ( Figure 13). The results show that there is no considerable change in the radius of gyration of the ligands in the ESR1-ethinylestradiol, PGR-ethinylestradiol and PGR-curcumol complexes. They remain stable during the entire simulation. It was also observed that demethoxycurcumin detached from the AKT1-demethoxycurcumin complex at 12 th ns and reattached at 15 th then detach again at 17 th ns and attach at 32 ns. In the EGFR-demethoxycurcumin complex, demethoxycurcumin detached completely from the complex at 5 th ns. In the ESR1-divanillalaceton complex, divanillalaceton was on unstable throughout the simulation, moving in and out of binding pocket of the protein.

Figure 4 Component-target-pathway network of phytocompound of Curcuma longa.
The color green represents the phytocompound, pink represents the targets, and purple represents the pathways. The lines or edges in the diagram represent the interactions between them.

Discussion
C. longa is known to possess several therapeutic activities, including anti-inflammatory, anti-diabetic, hepato-protective, hypolipidemic, anti-diarrhoeal, anti-asthmatic and anti-cancer activities [11]. The natural bioactive phytocompounds of C. longa known as curcuminoids, have been shown to exert anti-cancer effects to diverse cancer cell lines in vitro, including breast cancer [12,13] and also in vivo in animal models [14,15]. However, the multiple targets, pathways and mechanism of its antineoplastic effect remain unknown. Network pharmacology has emerged as a valuable approach for exploring the activity, targets, and pathways of substances with complex compositions, including plants, herbal formulas, and drugs. To our knowledge, this study represents the first attempt to identify the molecular mechanisms underlying the therapeutic use of C. longa in breast cancer treatment through network pharmacology. In this study, various in silico tools were used to analyze the inhibitory mechanism of C. longa on breast cancer. Network analysis predicted the modulation of forty-seven proteins by fifty-four phytocompounds, suggesting several interactions between phytocompounds and proteins. Several proteins were found to be hit by several phytocompounds. For instance, CA1, CA2 and ESR2 were modulated by over 20 phytocompounds. Additionally, phytocompounds of C. longa such as demethoxycurcumin, divanillalaceton, quercetin and tetrahydrocurcumin have the ability to modulate over 15 targets. This observation suggests that the phytocompounds of C. longa might have a synergistic effect on multiple targets, indicating their therapeutic potential not only for breast cancer but also for other diseases. This validates the nature of phytomedicine as being multicomponent, multitarget and multi-disease.
ESR1 and PGR are targets that are greatly associated with breast cancer. This indicates that C. longa could probably inhibit breast cancer. ESR1 is one of the most important molecular markers of breast cancer and it is expressed in 70% of breast cancers [37,38]. Estrogen are steroidal hormones that function primarily as the female sex hormone. Estrone, estradiol, and estriol are the three main types of estrogen. Estrone and estriol are typically produced during pregnancy and after the onset of menopause, respectively, while estradiol is the primary estrogen in non-pregnant females [39]. Estradiol plays an essential role in the development and progression of breast cancer, and most human cancers start out as estrogen dependent and express the estrogen receptor (ER). ER is made up of a DNA binding domain (DBD), two activation function domains (AF1/2), a hinge domain and a ligand binding domain (LBD). As a ligand-dependent transcription factor, ER activates the transcription of genes when a ligand (estrogen) binds to the LBD, resulting in breast cancer transcription [40,41]. PGR is another intercellular steroid hormone receptor that is expressed in breast cancer. It is estimated that PGR is expressed in about 65% of breast cancer cases [42]. From a single gene, PGR is expressed as two isoforms (PGR-A and PGR-B). Similar to ESR1, PGR has a DNA binding domain, LBD, and numerous AFs. PGR is an estrogen-regulated gene, and both normal and cancer cells need estrogen and ESR1 for its synthesis.
PPI analysis of the 47 targets revealed that AKT1, EGFR, ESR1, AR, CDK1, ABL1, CDK2, CXCR4, CDK4, and HDAC were the top 10 breast cancer genes, representing crucial targets for C. longa in breast cancer treatment. AKT is a crucial component of the PI3K/AKT signaling pathway and is also referred to as protein kinase B. The hallmarks of cancer, such as tumor growth, survival, and tumor cell invasiveness, are regulated by AKT [43]. AKT has three distinct isoforms, AKT1, AKT2, and AKT3, which exhibit distinct and non-redundant functions in various tumor types despite having high sequence and structural homology [44]. Several studies show that AKT1 activation speeds up tumor initiation and growth [45,46]. Through the delocalization of p21 from the nucleus and subsequent disinhibition of the G1-S transition, AKT1 promotes cell proliferation. This causes the release of CDK2, which raises the levels of cyclin A and aids in the progression of the cell cycle [47]. The overexpression of EGFR is linked with decreased overall survival in women with early breast cancer [48]. EGFR is activated by binding to its ligands, especially epidermal growth factor and transforming growth factor-alpha. When activated, EGFR pairs with other members of the EGFR family to form homodimers or heterodimers. The resulting dimerization promotes the tyrosine kinase activity of intracellular domain and may lead to cancer cell proliferation, resistance to cell death and metastasis [49,50]. The overexpression of EGFR has been seen in all forms of breast cancer, with it occurring more often in inflammatory breast cancer and triple-negative breast cancer [51].
GO enrichment analysis of the 47 targets was carried out in order to better understand the multiple mechanisms that C. longa uses to inhibit breast cancer from a systematic standpoint. Functional enrichment analysis showed the overrepresented GO terms and their corresponding functional domains. KEGG pathway enrichment analysis demonstrated significant enrichment of the 47 target proteins in 22 signaling pathways. Among these pathways, key targets were predominantly found in signaling pathways such as the estrogen signaling pathway, PI3K-Akt signaling pathway, MAPK signaling pathway, HIF-1 signaling pathway, and Jak-STAT signaling pathway.
The key proteins involved in these pathways include ESR1, PGR, AKT1, MTOR, EGFR among others. It is possible that C. longa might exert effects on various malignant tumors, including colorectal cancer, pancreatic cancer, and prostate cancer. This is because some of the signaling pathways that are significantly enriched by the target proteins were strongly correlated with to these cancers.
To confirm the active ingredients and targets predicted by network pharmacology, molecular docking simulation was carried out. The proteins CA1, CA2, AKT1, EGFR, ESR1, and PGR were docked with different phytocompounds identified through network pharmacology. The results of the verification showed that most phytocompounds exhibited binding free energies much lower than -5 kcal/mol (Table  3), indicating spontaneous binding of the ligand molecules to the proteins [52].
The top scoring protein-ligand complexes of EGFR, AKT1, ESR1, and PGR were subjected to MD simulation after molecular docking studies in order to investigate the stability of the protein-ligand complexes and to investigate the protein-ligand interactions in greater detail. MD simulation was therefore carried out on the AKT-demethoxycurcumin, EGFR-demethoxycurcumin, ESR1-ethinylestradiol and PGR-ethinylestradiol complexes. Studies show that majority (70%) of breast cancers depend on estrogen to develop and grow [53,54]. Treating these kinds of cancers with ethinylestradiol which is a form of estrogen in plants will be counterproductive. However, ethinylestradiol has been used to suppress the growth of breast tumors in postmenopausal women [55,56]. For these reasons, ESR1-divanillalacetonand PGR-curcumol complexes were also subjected to MD simulation. MD simulation showed that the ESR1-ethinylestradiol and PGR-ethinylestradiol complexes were stable throughout the simulation. Conversely the AKT1-demethoxycurcumin, EGFR-demethoxycurcumin, ESR1-divanillalacetonand PGR-curcumol complexes were not stable during the simulation.
Druglikeness character of each phytocompound from C. longa was predicted based on Lipinski's Rule of Five. Among the fifty-four phytocompounds that were predicted to target breast cancer proteins, only six phytocompounds (alpha-pinene, alpha-tocopherol, beta-sitosterol, campesterol, cholesterol and stigmasterol) were predicted to violate 1 Lipinski rule. Additionally, the ADMET of each phytocompound were predicted to assess important pharmacokinetic parameters and potential toxicity in the human body.

Conclusion
A network-based pharmacological analysis was employed to predict potential molecular targets for the inhibitory activity of C. longa against breast cancer. Based on the data from network pharmacology, molecular docking, and molecular dynamics simulation, phytocompounds such as quercetin, demethoxycurcumin, curcumin, and ethinylestradiol, among others, were predicted to inhibit key proteins associated with breast cancer. Key target proteins such as AKT1, EGFR, ESR1, and PGR, as well as signaling pathways like the estrogen, mTor, HIF-1 and Jak-STAT signaling pathways, were found to be associated with the breast cancer inhibitory effects of C. longa. The results of molecular dynamics simulation revealed that the ESR1-ethinylestradiol and PGR-ethinylestradiol complexes maintained stability throughout the entire simulation period. However, the AKT1-demethoxycurcumin, EGFR-demethoxycurcumin, ESR-divanillalaceton, and PGR-curcumol complexes were observed to be unstable during the simulation.