Bioinformatics analysis identified CDC20 as a potential drug target for cholangiocarcinoma

Background Cholangiocarcinoma (CCA) is a malignancy that originates from bile duct cells. The incidence and mortality of CCA are very high especially in Southeast Asian countries. Moreover, most CCA patients have a very poor outcome. Presently, there are still no effective treatment regimens for CCA. The resistance to several standard chemotherapy drugs occurs frequently; thus, searching for a novel effective treatment for CCA is urgently needed. Methods In this study, comprehensive bioinformatics analyses for identification of novel target genes for CCA therapy based on three microarray gene expression profiles (GSE26566, GSE32225 and GSE76297) from the Gene Expression Omnibus (GEO) database were performed. Based on differentially expressed genes (DEGs), gene ontology and pathway enrichment analyses were performed. Protein-protein interactions (PPI) and hub gene identifications were analyzed using STRING and Cytoscape software. Then, the expression of candidate genes from bioinformatics analysis was measured in CCA cell lines using real time PCR. Finally, the anti-tumor activity of specific inhibitor against candidate genes were investigated in CCA cell lines cultured under 2-dimensional and 3-dimensional cell culture models. Results The three microarray datasets exhibited an intersection consisting of 226 DEGs (124 up-regulated and 102 down-regulated genes) in CCA. DEGs were significantly enriched in cell cycle, hemostasis and metabolism pathways according to Reactome pathway analysis. In addition, 20 potential hub genes in CCA were identified using the protein-protein interaction (PPI) network and sub-PPI network analysis. Subsequently, CDC20 was identified as a potential novel targeted drug for CCA based on a drug prioritizing program. In addition, the anti-tumor activity of a potential CDC20 inhibitor, namely dinaciclib, was investigated in CCA cell lines. Dinaciclib demonstrated huge anti-tumor activity better than gemcitabine, the standard chemotherapeutic drug for CCA. Conclusion Using integrated bioinformatics analysis, CDC20 was identified as a novel candidate therapeutic target for CCA.

For the 3D cell culture, the adherent CCA cells were trypsinized and seeded into a 96 well ultra-low attachment plates (Corning, Incorporated, Kennebunk, Maine) at the optimal starting cell numbers. CCA cells were then cultured at 37 • C, 5% CO 2 for 24-72 h for spheroid formation. Spheroid morphology was observed under an inverted light microscope. Spheroid integrity were analyzed by ImageJ NIH software (https://imagej.nih.gov/ij/download.html).

Microarray data sources
Three gene expression datasets of CCA and normal tissues were retrieved from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) including GSE26566, GSE32225

Gene ontology, pathway enrichment, protein-protein interaction (PPI) and hub gene identification analysis
The common DEGs were inputted into STRING software version 11.0 (https://stringdb.org/) for analyzing the interactions among various proteins to constitute a PPI network, gene ontology and pathway enrichment. Gene functions were classified into cellular components (CC), cellular processes (CP), molecular function (MF) and the Reactome pathway. GO results were visualized as a bubble plot by RStudio software. Screening and identification of hub genes were conducted with cytohubba (https://apps.cytoscape.org/apps/cytohubba) (Chin et al., 2014) of the Cytoscape (https: //cytoscape.org/) plug-in MCC model.

Prediction and prioritizing of novel candidate anti-cancer drugs
After hub gene identifications, the next step was further prioritizing and predicting a novel anti-cancer drug based on up-regulated hub genes using PanDrugs software (https://www.pandrugs.org) (Piñeiro Yáñez et al., 2018). Prioritization criteria for the selection of a novel anti-tumor drug was that the candidate drug had to have been used in clinical trials or is already approved for clinical use.

RNA extraction and real-time PCR
Cells were collected and total RNA samples were extracted from CCA cell lines using TRIzol R reagent (Invitrogen; Thermo Fisher Scientific, Van Allen Way, California.) according to the manufacturer's protocol. RNA (2 µg) was reverse transcribed to cDNA. The system conditions for real-time PCR contained (10 µl): 5 µl for 2X SYBR Green I Master LightCycler R 480 SYBR green I master (Roche Diagnostics GmbH, Mannheim, Germany), 2 µl of cDNA (25 ng/µl), 2 µl of forward and reverse primers and 1 µl DD H 2 O PCR grade. The primers for CDC20 sequences were as follows: forward, 5 -CGGGTAGCAGAACACCATGT-3 reverse, 5 -ACTGGCCAAATGTCGTCCAT-3 . The PCR conditions were 95 • for 5 min, followed by 50 cycles at 95 • for 10 s and 62 • for 10 s.

Western blot analysis
Cells were lysed with RIPA buffer (Thermo Fisher Scientific, Rockford, Illinois) containing Proteinase Inhibitor Cocktail (NACALAI TESQUE, INC., Kyoto, JAPAN). The lysates were shaken at 4 • C overnight. Protein content was determined using BCA assay kit (Thermo Fisher Scientific, Rockford, Illinois). Protein lysates were separated in a 12% (for actin) or 15% (for CDC20) sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) and transferred on a polyvinylidene fluoride or polyvinylidene difluoride (PVDF) membrane (Merck KGaA, Darmstadt, Germany) using Towbin transfer buffer in a wet tank transfer blotting system (Bio-Rad Laboratories (Shanghai) Co., Ltd., Shanghai, China). After blocking with 5% skimmed milk for 1 h at room temperature (RT), membranes were incubated with each primary antibody, CDC20 (Cell Signaling Technology, Tokyo, Japan), or actin (Santa Cruz Biotechnology, Inc., Dallas, Texas) at 4 • C, overnight. Blots were then incubated with anti-rabbit IgG, HRP-linked Antibody (Cell Signaling Technology, Tokyo, Japan) or anti-mouse IgG, HRP-linked Antibody (Cell Signaling), at RT for 1 h. The western blot bands were detected using an ImageQuant LAS 4000 (GE Healthcare Bio-Sciences AB, Uppsala, Sweden).

The cytotoxicity test in two-dimensional (2D) and three-dimensional (3D) cell culture models
For the 2D cell culture model, CCA cell lines were seeded into 96-well plates (1,500 cells/well) overnight (16 h) and then treated with various concentrations of gemcitabine (Sigma-Aldrich, St. Louis, Missouri) or dinaciclib (Abcam plc., Cambridge, UK) for 72 h. After incubation, cell viability was measured by ATPlite 1step (PerkinElmer, inc, Waltham, Massachusetts) for luminescence detection of ATP from cultured cells under the 2D model using SpectraMax L Microplate Reader (Molecular Devices, St, San Jose, California).
For the 3D cell culture model, 2,500 cells/well of CCA cell lines were seeded into 96 well ultra-low attachment flat-bottom plates (Corning, Incorporated, Kennebunk, Maine), and incubated for 24 to 78 h depending on cell type. The formed spheroids were then exposed to gemcitabine or dinaciclib for 72 h. Cell viability was measured by the ATPlite3D kit (PerkinElmer, inc, Waltham, Massachusetts) using SpectraMax L Microplate Reader. Cells cultured in DMSO; 0.01% for 2D-culture and 0.4% for 3-D culture, were used as a vehicle control in all drug testing experiments. Three separated experiments with triplicate assays were performed.

Statistical analysis
Statistical analyses were conducted using Graphpad Prism software version 7.03 (GraphPad Software, Inc., San Diego, California). All experiments were performed in three independent experiments. The cytotoxicity results are expressed as mean ± SD. Statistical significance was determined by Students' t test and P < 0.05 was noted as statistical significance.

Identification of common DEGs
Four GSE datasets of CCA were retrieved form GEO including GSE26566, GSE32225, GSE76297 and GSE89749. Of these, GSE89749 which comprised of a small number of normal cases and showed the abnormality of gene distribution was excluded from the study. The GSE26566 contained 104 cancer and 59 normal cases, while GSE32225 comprised of 149 cancer and 6 normal cases. The CCA cases that formed both datasets were non-Opisthorchis viverrini (OV )-associated CCA. GSE76297 datasets consisted of 91 OV -associated CCA and 92 surrounding liver tissues. The volcano plot elucidated DEGs of CCA and normal tissues (Figs. 1A-1C). Based on the selection criteria of an adjusted P value <0.05 and absolute log fold-changes ≥ 0.5, a total of 2,029, 4,153 and 4,212 DEGs were identified from GSE26566, GSE32225 and GSE76297. 1,152 and 877 genes were up-and down-regulated in the GSE26566 dataset. 2,165 and 1,988 genes were up-and down-regulated in GSE32225. 2,407 upregulated and 1,805 down-regulated genes were identified in GSE76297 dataset. Finally, Venn diagram analysis revealed that there are 226 common DEGs in the three microarray datasets, in which 124 were up-regulated and 102 down-regulated genes (Figs. 1D and 1E).

Gene ontology and pathway enrichment analysis
The GO and pathway enrichment analysis of common DEGs elucidated in cellular component (CC) terms ( Fig. 2A), common DEGs were mainly enriched in cytoplasm, extracellular space, high-density lipoprotein particles and in the peroxisomal matrix. The common DEGs were also involved with several molecular functions (MF) (Fig. 2B) such as oxidoreductase activity, endopeptidase inhibitor activity and enzyme inhibitor activity. In cellular process terms (Fig. 2C), the results showed that common DEGs were significantly enriched in carboxylic acid catabolic processes, whereas small molecule metabolic processes, were responsive to inorganic substances and organic acid metabolic processes. Moreover, the Reactome pathway enrichment analysis indicated that common DEGs were mainly

PPI of common DEGs and hub gene identifications
The PPIs of 226 common DEGs were constructed using STRING software as shown in (Fig. 3A). The hub genes based on the PPI network were further prioritized (Figs. 3B and 3C). The results categorized the top 20 hub genes automatically into two groups. The up-regulated hub genes from common DEGs, including CCNB1, CDC20, MAD2L1, FANCI, CDKN3, RRM2, EZH2, MCM3, ANLN, MCM7 and HMMR were ranked from the degree of connectivity with other proteins (binding scores) (Table 1). Down-regulated hub genes included FGA, AHSG, SERPINA10, F9, F11, FETUB, C6, F12 and KNG1. Node colors are represented for the degree of connectivity. The pseudocolor scale from red to yellow represents the gene ranking from 1 to 20. The dark red color represents the highest degree while an orange color stands for the intermediate degree and yellow color is the To confirm the reliability of the analyzed results, the expression levels of all upregulated hub genes using different datasets and different databases were further examined. RNA sequencing profiles were retrieved from The Cancer Genome Atlas (TCGA, https://www.cancer.gov/about-nci/organization/ccg/research/structural-genomics/tcga). Box plots were plotted for the elucidation expression level of the hub genes in patient CCA tissues compared with the normal tissues (Figs. 4 and 5A). The results elucidated strong evidence to support the analysis that all the identified hub genes were highly expressed in CCA when compared with normal controls even though the analysis was performed on different databases and different platforms. (B-C) Subnetwork of top 20 hub genes from protein-protein interaction network using Cytoscape software. Node color reflects degree of connectivity. The pseudocolor scale from red to yellow represents the top nine hub genes rank from 1-20. Red color represents highest degree, and orange color represents intermedia degree, and yellow color represents lowest degree.
Full-size DOI: 10.7717/peerj.11067/ fig-3 Prioritizing and prediction of a novel targeted drug for CCA Novel candidate drugs targeted for CCA were based on 11 up-regulated hub genes using PanDrugs, a bioinformatics platform, to prioritize anticancer drug treatments. PanDrugs not only provided drug status descriptions but it also predicted the possible drug response and the interaction between drugs. The drug prediction and prioritizing in Table 2 shows several potential novel drugs for 6 of the 11 upregulated hub genes (Fig. 3B). CDC20 was the candidate that fit to the selection criteria ( Fig. S1) as CDC20 plays significant roles in cell cycle and mitosis, and was the high-scoring hub gene (Fig. 3B). There are several candidate drugs suggested for CDC20 (Table 2). Dinaciclib, however, was selected to be the drug of choice for CDC20 in this study as it has never been reported for CCA, and the safety and efficacy has been reported in the clinical trial phase III (Ghia et al., 2015;Sharp & Corp, 2017).

CDC20 is a candidate novel target for CCA
After the in-silico analysis, a series of experiments for investigating the antitumor activity of dinaciclib in CCA cell lines under 2D and 3D cell culture models was performed. CDC20 mRNA and protein were investigated in four CCA cell lines, KKU-100, KKU-213A, KKU-213B and KKU-452, using real-time PCR and western blot analysis. Expression of CDC20 in patient CCA tissues was significantly higher than those of normal counterpart (Fig. 5A), and all CCA cell lines differentially expressed CDC20 mRNA ( Fig. S2 and File S1) and protein (Figs. 5B and 5C). The anti-tumor activity of dinaciclib in CCA cell lines was compared with that of gemcitabine, the standard chemotherapeutic agent for CCA treatment. In the 2D cell culture model (Figs. 5D and 5E), dinaciclib elucidated very effective anti-tumor activity against CCA cell lines at approximately 40 times of the 50% inhibitory concentrations (IC 50 ) lower than gemcitabine (Table 3). In the 3D cell culture model (Figs. 5F and 5G), KKU213A and KKU-452 spheroids were treated with various concentrations of dinaciclib or gemcitabine for 72 h. The results demonstrated that dinaciclib had highly effective antitumor activity better than a thousand times gemcitabine in both CCA cell lines (Table 3). The IC 50 of KKU-213A and KKU-452 spheroids to dinaciclib were 514.8 nM and 4.7 nM, whereas the IC 50 of gemcitabine against KKU-213A spheroid cannot be calculated because it was higher than 500 ×10 3 nM, the maximum tested drug concentration. A similar trend was observed for KKU-452 spheroids with gemcitabine IC 50 is 25.78 ×10 3 nM. The KKU-213 spheroid morphology at 72 h exposure time with both drugs and vehicle were demonstrated in Fig. 5H. It was then concluded that the CCA cells were more sensitive to dinaciclib than gemcitabine.

DISCUSSION
The identification of the molecular mechanisms of cancer cells is crucial for diagnosis and therapy of cancer patients. Various high throughput screening (HTS) techniques, cDNA microarray and RNA sequencing, are widely used to explore DEGs involved in carcinogenesis and progression which has provided valuable information for clinical applications (Lowe et al., 2017). A huge amount of corresponding data from a cDNA microarray and RNA sequencing is stored in several public databases such as ENCODE (https://www.encodeproject.org/), TCGA, ICGC (https://dcc.icgc.org/) and GEO. Integration of multiple HTS datasets (cDNA microarray and RNA sequencing) is considered a better approach of enhancing the reliability of results than an individual HTS dataset (Yan et al., 2018;Le, 2019;Huang et al., 2019). (continued on next page) This gene uses alternative splicing to generate two different proteins-high molecular weight kininogen (HMWK) and low molecular weight kininogen (LMWK). HMWK is essential for blood coagulation and assembly of the kallikrein-kinin system. In the present study, an in-silico analysis using several bioinformatics approaches for screening and identification novel molecular targets for CCA treatment was performed using three microarray datasets of patient CCA tissues. The analyses revealed several novel targets for CCA treatment. Of these, CDC20, a regulatory protein involves in multi-cell cycle checkpoints was identified as a novel target molecule for CCA treatment and dinaciclib, a pan CDK inhibitor, was suggested as a compatible drug for CDC20. The suggestion from the in-silico analyses was proved by the in vitro cytotoxic experiments of dinaciclib in comparison with Gemcitabine in human CCA cell lines. Three microarray datasets from CCA patients including GSE26566, GSE32225 and GSE76297 were retrieved from the GEO database and used in this study. From GEO2R and Venn diagram analysis, a total of 226 common DEGs including 124 up-regulated and 102 down-regulated genes in CCA tissues were revealed. GO and pathway enrichment analysis revealed that common DEGs were enriched in metabolism, and cell cycles especially in the mitosis phase. As is well known, one of the hallmarks of cancers is the alteration of metabolism in which cancer cells predominantly use the glycolytic pathway as the main energy source rather than tricarboxylic acid cycle (TCA) even in an adequate oxygen condition, known as ''Aerobic glycolysis'' or the ''Warburg effect'' (Warburg, 1956;Schwartz, Supuran & Alfarouk, 2017). The understanding of the cancer metabolism also provides an opportunity for development specific novel targets for cancer diagnosis and treatment. At present, several alterations of glycolytic related molecules have been reported in CCA such as glucose transporters (GLUT) hexokinase (HK) II (Paudyal et al., 2008;Thamrongwaranggoon et al., 2017), and Tumor M2-pyruvate kinase (PKM2) (Cuenco et al., 2018). Moreover, the alteration in expression levels of other molecules such, c-MET, RAS/BRAF, EGFR/ERBB2 which are involved downstream, affect cancer cell proliferation and development. (Terada, Nakanuma & Sirica, 1998;Miyamoto et al., 2011;Churi et al., 2014;O'Dell et al., 2012;Leone et al., 2006;Ito et al., 2001).
The PIP network and module analysis in the currently study demonstrated the top 20 high-scoring hub genes in CCA. Several of these identified hub genes, for instance, CCNB1, CDC20, MAD2L1, were involved in mitosis and cell cycle control (Table 1) (Percy et al., 2000;Chi et al., 2019;Zhang et al., 2019;Wang et al., 2015). The computational analysis for prioritizing and predicting of novel targeted drugs based on the 11 up-regulated hub genes by PanDrugs suggested several new targets and drug of choices. Among these, CDC20 is the most interested target as: firstly, it was upregulated in patient CCA tissues compared with the normal counterpart tissues (Fig. 5A). Secondly, it plays an important role in chromosome segregation (Kapanidou, Curtis & Bolanos-Garcia, 2017) during metaphaseanaphase transition (Fujimitsu, Grimaldi & Yamano, 2016;Wang et al., 2015) and also other cellular processes, for example suppressing apoptosis (Harley et al., 2010;Wan et al., 2014). Thirdly, expression of CDC20 have been demonstrated to be involved with poor patient outcomes in several human malignancies. For instance, the high expression of the CDC20 gene were associated with poor prognosis and outcome of breast cancer patients (Karra et al., 2014). Overexpression of CDC20 in tumor tissues was related with poor differentiation and a lower 5-year recurrence-free survival rate of pancreatic cancer patients (Li et al., 2003;Chang et al., 2012), and a short survival of colorectal cancer (Wu et al., 2013). Moreover, knockdown of CDC20 gene inhibited cell growth and induced the G2/M arrest in cell cycle of lung cancer cells (Kidokoro et al., 2008). Lastly, several   (Table 2). Dinaciclib treatment effectively suppressed tumor growth in CCA cell lines cultured under 2D and 3D cell culture models (Figs. 5D-5G). In similar in-vitro studies, dinaciclib elucidated antitumor activity against cultured cells in 2D models at low nanomolar levels such as, 12, 8, 17 and 14 nM IC 50 for prostate, breast, colon, and ovarian cancer cell lines (Parry et al., 2010). Interestingly, the tumor-suppressing effect of dinaciclib was better than gemcitabine, the current standard treatment drugs for CCA. The 4 CCA cell lines tested showed different response to gemcitabine as indicated by a wide IC 50 range of 87-241 nM, whereas all cell lines showed a similar sensitivity to dinaciclib with a narrow IC 50 range of 5.4-6.9 nM. Cancers seem to be more sensitive to dinaciclib than gemcitabine as the similar observations were also reported in several pancreatic cancer cell lines (Khan et al., 2020) and hepatocellular carcinoma (Shao et al., 2019;Hassan et al., 2019). The different response of cancer cells to gemcitabine and dinaciclib may be due to the different drug targets and actions. Gemcitabine targets the DNA synthesis and induces cell death, whereas dinaciclib acts on several CDKs in cell cycle and hence targets several points of the cell cycle at the same time. Moreover, as the 4 CCA cell lines were established from different subtypes of CCA, using dinaciclib as the treatment of choice for CCA may give more advantage than gemcitabine in such the way that dinaciclib exhibited a high anti-tumor activity independently of CCA subtypes.
Dinaciclib is the novel generation of the potent multi-CDKs inhibitor which can suppress CDK1, CDK2, CDK5 and CDK9 (Parry et al., 2010) with the IC 50 in a low nanomolar range (1-4 nM). The effective anti-tumor activity of Dinaciclib have been elucidated both in vitro and in vivo in various types of cancers (Rello-Varona et al., 2019;Chen et al., 2016). Enhancing G2/M phase arrest and induction of apoptosis by dinaciclib have been demonstrated in the in vitro and xenografted mouse model in thyroid cancer (Lin et al., 2017) and the triple negative breast cancer (Rajput et al., 2016). In addition, dinaciclib was shown to trigger abnormal mitotic division (anaphase catastrophe) in lung cancer cells through, CDK1 and CDK2 suppression (Danilov et al., 2016). Moreover, dinaciclib has been used in clinical trials of several cancers, such as chronic lymphocytic leukemia (Flynn et al., 2015;Gojo et al., 2013) and breast cancer (Mita et al., 2014). The tolerant side effects such as transient gastrointestinal toxicities, liver function tests abnormalities, fatigue, and hypotension have been reported (Gojo et al., 2013).
A similar but different approach has been performed using a dataset from GEO (GSE26566) to find the potential candidate treatment agents for CCA without validation (Chujan et al., 2018). Recently, Ye et al. (2020) has identified the key genes associated with the progression of intrahepatic CCA using three GEO dataset including GSE107943, GSE119336 and GSE26566. The present study provided several target candidates different from those reported in the previous studies. This may be due to the different objectives, different workflow of the analysis and different GEO dataset used. Combining various bioinformatics methods using several gene expression datasets may reduce the bias and strengthen the analysis outcome. The experimental evidence in vitro and in vivo must be conducted to confirm the analysis outcome before translating to the clinical practice.
Taken together, the current study provided several novel draggable-target molecules for CCA treatment from computational analysis. The reliability of in-silico results for biological investigation was confirmed in CCA cell lines. The compatible results between computational and biological investigations demonstrated very interesting and strong evidence that all identified hub genes, especially CDC20 was highly expressed in CCA tissues and may be used as an effective novel target for CCA treatment.

CONCLUSIONS
In the present study, the aim was to identify a novel molecular target for CCA treatment using bioinformatics analysis based on three transcriptomic datasets available in the GEO database. The identified hub genes were mostly involved with metabolism, mitosis and cell cycle control which indicated that, CCA was highly effective in these processes. CDC20 was suggested by bioinformatic analysis as a potential novel target for CCA. Not only using in-silico analysis, the expression levels of CDC20 in CCA cell lines as well as the investigation of the anti-tumor activity of novel potential targeted drugs against CCA cells were compared with the standard chemotherapy, gemcitabine. This novel drug has a stronger antitumor activity than gemcitabine.