Doxorubicin-induced transcriptome meets interactome: identification of new drug targets

The working mechanism of the chemotherapeutic drug doxorubicin, which is frequently used in cancer treatment, its effects on cell metabolism, and pathways activated solely by doxorubicin are not fully known. Understanding these principles is important both in improving existing therapies and in finding new drug targets. Here, I describe a systems-biology approach to find a generalizable working principle for doxorubicin by superimposition of human interactome over gene datasets commonly expressed among various cancer types. The common –in at least two different diseases–transcriptional response of distinctive cancer cell lines to doxorubicin was reflected via 199 significantly and differentially expressed genes, mostly related to the regulation of transcription. Then, by integrating with interactome data, an active network was constructed allowing detection of clusters. Since each cluster defines densely connected regions, another level of understanding of functional principles is provided. Significant clusters were associated with the linked transcription factors and transcriptional factor enrichment analysis within these regulatory networks led to the proposition of Pou5f1b, Znf428, Prmt3, Znf12, Erg, Tfdp1, Foxm1, and Cenpa as new drug targets in drug development that can be applied in different cancer types.


Introduction
One of the most used chemotherapeutic drugs, shown to be efficient in different types of cancer, is doxorubicin. Doxorubicin is known to intercalate into DNA, inhibit topoisomerase II, lead to DNA double-strand breaks, induce production of reactive oxygen species (ROS), overproduce ceramide, damage chromatin through histone modifications and disrupt nucleosome assembly (Yang et al., 2014;van der Zanden et al. 2020). Although it has a high drug efficacy, usage of doxorubicin creates resistance and causes several side effects (Ferreira et al., 2017;Ashrafizaveh et al., 2021). Despite the known functions of doxorubicin, the mechanism of action is not fully clarified. To overcome the limitations of the drug, to design better therapies and to utilize it in combination therapies, comprehensive understanding of the mechanism is required.
With the broad usage of transcriptome profiling, there have been studies suggesting signature genes that can be used to predict type of a disease, subtype and eventually the drug that can be administered for therapy, centering at cancer-and/or cell type-specific responses (Yu et al., 2019;Gopi and Kidder, 2021) or effects of various concentrations of the drug on a certain cell model (Maillet et al., 2016). Functional annotation and/or correlation analysis together with clustering methods on gene expression data are used to identify differences between tumors or cell lines. Focusing on the differences in gene expression, therefore, serves to improve the prediction of that certain type of disease's response. However, such approaches yield ad hoc results specific to the used data and/or platform used for the measurement of data leading to lack of generalization.
Integrative multiomic approaches were applied to understand doxorubicin-induced cardiotoxicity, to identify the common signature of anthracycline-induced cardiotoxicity, or to suggest patient-specific optimal combination therapy (Selevsek et al., 2020;Cava et al., 2021). Systems-wide effects of doxorubicin have been investigated also in yeast cells. While long-term treatment showed extensive reconfiguration of metabolic and signaling networks with ROS formation and DNA damage (Taymaz-Nikerel et al., 2018), short-term treatment showed the contribution of DNA repair, DNA replication, and RNA surveillance pathway (Karabekmez et al., 2021). Addition to the transcriptome and fluxome, interactome was integrated in those studies, which explained well the effects in eukaryotic yeast cells. Mentioned metaanalysis studies focused on the effects of doxorubicin in a certain cell type or differences across the set of various cells. However, a genome-wide approach to explore the common response of doxorubicin in different cancer types has not been applied.
Among the cancer types in which doxorubicin treatment has been applied either in single form or in combination with other chemotherapeutic drugs, are breast cancer (Di Francesco et al., 2021), leukemia (Turk et al., 2020), and nonsmall cell lung cancer (Shahriari et al., 2021). Though, doxorubicin is not much preferred in the treatment of colon cancer (Sonowal et al., 2017) or renal cancer (Acharya and Singh, 2021) due to severe cardiotoxiciy and drug resistance. These different sets of diseases are selected to examine the common effect(s) of doxorubicin regardless of the cancer/cell type. By this, affected processes, drug-specific transcriptional and regulatory networks in doxorubicin treatment irrespectively can be identified, and consensus mechanism of action can be offered.
Here, a systems biology approach is described, which is based on the integration of transcriptome with interactome data for the selected cell types, providing the regulatory components having a role in the mechanism of action of doxorubicin. Enrichment analyses relative to processes, pathways, and transcription factors demonstrate that there are several common types of machinery responsible for the cellular changes caused by doxorubicin. Moreover, such an approach enables the prediction of new targets for the development of drugs or of combination therapies.

Compilation of gene expression data
While selecting gene expression datasets, cancer types, cell lines, dosage, and duration of administration of doxorubicin drug were taken into consideration. Microarray data, obtained using the Affymetrix platform, associated with doxorubicin was collected by scanning the NCBI Gene Expression Omnibus, GEO, and ArrayExpress, the functional genomics data repository supporting MIAME-compliant data submission.
Doxorubicin-induced transcriptome responses were then collected from the measurements across the National Cancer Institute (NCI)-60 cell line panel (Monks et al., 2018). In that study, expression of genes in NCI-60 human tumor cell lines were measured in response to several anticancer agents for 2, 6, and 24 h. For renal cancer, breast cancer, leukemia, nonsmall cell lung cancer and colon cancer, the gene expression data, Gene Expression Omnibus (GEO) accession number of GSE116441 measured in the cells treated with 1000 nm of doxorubicin were used. As reference gene expressions, data measured in untreated control cultures were applied. Details of the compiled data are given in Supplementary Table S1.

Data analysis 2.2.1. Differential expression analysis in transcriptome
First, datasets were normalized. Differentially expressed genes (DEGs) were defined from normalized logexpression values using linear models for microarray data. In data analysis, t-test method was applied to determine statistically significant (fold change in gene expression >|1.05|, p-value < 0.05) subsets. These analyses were performed in CLC Genomics Workbench (Qiagen Bioinformatics).

Integration with interactome data
A protein-protein interaction (PPI) network was created using the proteins encoded by the common -in at least two diseases -differentially expressed genes and physical interactions of these proteins obtained from BIOGRID Homo sapiens 3.5.187 database comprising 613869 protein-protein interactions (Oughtred et al., 2019). Subnetwork analysis was carried out to detect clusters within the active network via MCODE application in Cytoscape 3.8.0, where all networks were visualized (Shannon et al., 2003).

Functional annotation analysis
Functional annotation analyses were carried out for the common differentially expressed genes and for each cluster identified within the active PPI network. Gene ontology (GO) term enrichment analyses were performed in DAVID functional annotation tool to identify significantly (Benjamini-Hochberg corrected p-value < 0.05) associated biological process, molecular function, cellular compartment, and KEGG pathways (Huang et al., 2009).

Transcription factor enrichment analysis
For each cluster identified within the active PPI network, transcription factor (TF) enrichment analysis was done in ChEA3, a web-based tool, ranking TFs (Keenan et al., 2019).

Results and discussion
This study aims to explain the working principle of doxorubicin and, via this, propose new drug targets for cancer therapy. A systems biology approach was applied through a metaanalysis of available transcriptome data measured in the presence of doxorubicin subjected to different cell lines of cancer. Commonly altered genes were coupled with interactome yielding a mechanistic and systems-level understanding.

Common genes differentially expressed under doxorubicin treatment in different cancer types
Genome-wide response of renal cancer, breast cancer, leukemia, nonsmall cell lung cancer, and colon cancer cells to doxorubicin at the transcriptional level revealed that numerous genes were differentially and significantly (fold change >|1.05| and p-value <0.05) expressed. Number of identified differentially expressed genes (DEGs) for each cancer type are presented in Figure 1.
In order to determine the common DEGs of different cancer types, genes that showed significant differences in at least two diseases were identified. This resulted in 199 genes, listed in Supplementary Table S2. Gene ontology (GO)-term enrichment analyzes for these 199 common DEGs revealed statistically significant (Benjamini-Hochberg < 0.05) biological processes (Figure 2A), molecular functions ( Figure 2B), and the cellular part they take place in ( Figure 2C). TNF signaling pathway and osteoclast differentiation are significant pathways. When these findings are examined, it is seen that these common genes are mostly related to the regulation of transcription ( Figure 2A).

Construction of the active network
An active network was formed by obtaining the proteinprotein interactions of the proteins encoded by the common 199 DEGs (Supplementary Table S2). After removing the unlinked residues, a linked network of active protein-protein interactions was identified through 7022 proteins and 15775 protein-protein interactions. Topological analysis of the active protein-protein interaction network revealed that it is a scale-free network, with degree distribution following the power law model P(k) ≈ k −0.856 with R 2 = 0.975, (Supplementary Table S3 and Figure S1).

Modular analysis of the active network
The heavily connected clusters in the active network were determined in MCODE application, resulting in nine clusters (Supplementary Figure S2). The parameters for each cluster are summarized in Supplementary Table  S4. Functional enrichment analyzes were performed for the identified clusters. Biological process terms/cellular compartments/molecular functions and pathways that were found to be statistically significant (Benjamini-Hochberg p-value < 0.05) are presented for the first three clusters ( Figures 3A).
Among the biological processes related to the genes/ proteins in Cluster 1, besides general terms such as transcription and translation, more specific processes such as ribosomal large subunit, ribosomal large subunit biogenesis, NIK/NF-kappaB signaling, tumor necrosis factor-mediated regulation of signaling pathway, and cell adhesion were also observed ( Figure 3B). NIK/NF-kappaB was reported to potentially trigger doxorubicin resistance (Gao et al., 2019). Ribosomes were frequently encountered in different terms in this cluster. This allows us to predict in advance that Cluster 1 is important in defining transcriptional regulatory networks.
In Cluster 2 ( Figure 4A) there are no significant (Benjamini-Hochberg p-value < 0.05) results for biological processes, but many for pathways ( Figure 4B). Many cancer-related terms have been obtained, such as nonsmall cell lung cancer, glioma, viral carcinogenesis, melanoma, chronic myeloid leukemia, microRNAs in cancer, serotonergic synapse, thyroid hormone signaling pathway, thyroid cancer, bladder cancer, proteoglycans in cancer, regulation of actin cytoskeleton, endometrial cancer, acute myeloid leukemia, central carbon metabolism in cancer, renal cell carcinoma, prostate cancer, and choline metabolism in cancer. In addition, the genes in Cluster 2 were found to be associated with several signaling pathways that can be associated with cancer, long-term depression, and natural killer cell-mediated cytotoxicity.
Statistically significant biological processes associated with the genes/proteins that constitute Cluster 3 ( Figure  5A) are cell cycle regulation, DNA damage checkpoint, and negative regulation of transcription from the RNA polymerase II promoter ( Figure 5B). Significant molecular function terms are mostly related to binding. HTLV-I infection is the only significant pathway in Cluster 3.

Transcriptional regulatory networks
Identification of the transcription factors (TFs) responsible for the regulation of genes, which had altered expression is important to understand the working mechanism of doxorubicin. Transcription factor enrichment analysis for the three statistically significant clusters was performed in R e n a l c a n c e r B r e a s t c a n c e r C o lo n c a n c e r Leukemia N o ns m a l l c e l l l u n g c a n c e r Renal cancer  314  Breast cancer  139  Leukemia 288 Non-small cell lung cancer 367 Colon cancer 459 Figure 1. Number of identified differentially expressed genes (DEGs) in the presence of doxorubicin exposed to different cancer cell lines.   TA  F1  B  TB  L1  X TC  F4 TC  F7  L2  TF  AP  2A  TF  AP  2B TIA  M1 TN  IK TN  S3 TR  AF  2  TR  IM3  6 Figure 2. Functional enrichment analysis among the common 199 differentially expressed genes. Plots are generated in R via GOplot: the left side of the circles display genes, and the right sides display GO terms, annotated below each circle. If the gene belongs to a GO term, there is a line between the gene and the GO term for biological processes (A), molecular functions (B), and cellular compartments (C).

Cancer type Number of DEGs
cell-cell adhesion cytoplasmic translation NIK/NF-kappaB signaling nuclear-transcribed mRNA catabolic process, nonsense-mediated decay regulation of tumor necrosis factor-mediated signaling pathway ribosomal large subunit assembly ribosomal large subunit biogenesis rRNA processing SRP-dependent cotranslational protein targeting to membrane translation translational initiation viral transcription translation SRP-dependent cotranslational protein targeting to membrane rRNA processing ribosomal large subunit biogenesis ribosomal large subunit assembly regulation of tumor necrosis factor-mediated signaling pathway nuclear-transcribed mRNA catabolic process, nonsense-mediated decay NIK/NK-kappaB signaling Cytoplasmic translation Cell-cell adhesion   ChEA3 (Keenan et al., 2019). TF co-expression networks help visualize top-ranked transcription factors in the framework of the larger human transcription regulation network. TF-TF co-regulatory networks are dynamically created using the best results of the selected library. The regulatory network for the top 10 TF results for Cluster 1 is presented in Figure 3C. Expression of TF Pou5f1bp associated with Cluster 1 was shown to increase in cervical cytology and was proposed as a biomarker to identify cervical high-grade squamous lesions . Another TF, Znf428p, was presented as one of the regulators of cell invasion and cell migration in tumor cells (Barreiro-Alonso et al., 2018); it is also one of the seven panel biomarker candidates for the early diagnosis of hepatocellular carcinoma . The transcription factor Prmt3p (protein arginine methyltransferase 3) has been shown to be dysregulated in gemcitabine-resistant pancreatic cancer cells. Overexpression of Prmt3p resulted in increased resistance to gemcitabine in pancreatic cancer cells, while reduction of Prmt3p was observed to restore gemcitabine sensitivity in resistant cells (Hsu et al., 2018). Based on this, inhibition of Prmt3p was proposed as a new strategy for the treatment of gemcitabine-resistant pancreatic cancer (Hsu et al., 2018).
The regulatory network for the top 10 TFs of Cluster 2 is presented in Supplementary Figure S3. Znf12p was proposed as a new target in imatinib-resistant gastrointestinal stromal tumor cells (Cao et al., 2018). Znf888p was reported as one of the methylation-based genes associated with clear-cell renal cell carcinoma (Wang et al., 2020). Overexpression of another Cluster 2-related TF, Ergp, was reported to be associated with tumor stage in prostate cancer but did not strongly predict the rate of relapse or death among men treated with radical prostatectomy (Pettersson et al., 2012). TMPRSS2-ERG gene fusions are the major subtype of prostate cancer and are predominantly seen in young patients and lead to constitutive overexpression of the transcription factor Ergp. ERG overexpression alone is not prognostic; however, ERG was shown to modulate the expression of > 1600 genes in prostate epithelial cells (Büscheck et al., 2019).
The regulatory network for the top 10 TFs of Cluster 3 is presented in Supplementary Figure S4. Of the TFs associated with Cluster 3, Tfdp1p has the highest meanrank. Extensive characterization of DNA amplification at chromosome region 13q34 in breast cancer has revealed Tfdp1p as one of the possible candidate target genes; in addition, tumors with high gene expression have been associated with markers of tumor proliferation and cell cycle progression (Melchor et al., 2009). Under normal physiological conditions and in most cancer cells, Tfdp1p is a predominant protein that binds to E2F; deregulated TFDP1/E2F1 is known to induce stress leading to high levels of p53 in cancerous cells (Zhan et al., 2017).
Foxm1p associated with Cluster 3 is an oncogenic transcription factor that is overexpressed in most human cancers. Foxm1p is involved in cell migration, invasion, angiogenesis, and metastasis. The important role of Foxm1p in cancer confirms its importance for therapeutic intervention (Halasi and Gartel, 2013). Understanding the regulation and function of Foxm1p has gained attention, which will provide potential roles of it in cancer and additional diseases.
Another TF associated with Cluster 3 is Cenpa (centromere protein-A), which was shown to be important in hepatocellular carcinoma development (Zhang et   2020), and gastric cancer progression and prognosis (Xu et al., 2020).

Conclusion
The analyses reported in this study have shown that doxorubicin causes common changes in gene expression in different types of cancer. Significant differential expressions were mainly observed in genes that have a role in transcription, regulation of transcription, and regulation of cell proliferation. Binding of protein, chromatin, and DNA are among significant molecular functions in addition to several terms related to RNA polymerase II activity, indicating that doxorubicin targets transcription machinery of RNA polymerase II, related to DNA intercalating property of doxorubicin. Further analysis of the PPI network revealed the interconnectivity of expression-level effects of doxorubicin on additional cancer types such as glioma, melanoma, thyroid, bladder, endometrial, and prostate cancers. Identification of TFs within the significant clusters of the active network revealed the necessity of a comprehensive understanding of the regulation principles of these TFs in order to provide new potential therapeutic strategies for cancer therapy. As a result of this study, Pou5f1b, Znf428, Prmt3, Znf12, Erg, Tfdp1, Foxm1, and Cenpa can be suggested as new drug targets in the development of drugs that can be effective in application to different cancer types.