Disparity between Inter-Patient Molecular Heterogeneity and Repertoires of Target Drugs Used for Different Types of Cancer in Clinical Oncology

Inter-patient molecular heterogeneity is the major declared driver of the expanding variety of anticancer drugs and of the personalization of their prescription. Here, we compared inter-patient molecular heterogeneities of tumors with the repertoires of drugs and their molecular targets currently in use in clinical oncology. We estimated molecular heterogeneity using genomic (whole exome sequencing) and transcriptomic (RNA sequencing) data for 4890 tumors taken from The Cancer Genome Atlas database. For thirteen major cancer types, we compared heterogeneities at the levels of mutations and gene expression with the repertoires of targeted therapeutics and their molecular targets accepted by the current guidelines in oncology. In total, 85 drugs were investigated, collectively covering 82 individual molecular targets. For the first time, we showed that the repertoires of molecular targets of accepted drugs did not correlate with molecular heterogeneities of different cancer types. On the other hand, we found that the clinical recommendations for the available cancer drugs were strongly congruent with the gene expression but not the gene mutation patterns. We detected the best match between drug usage recommendations and molecular patterns for the kidney, stomach, bladder, ovarian and endometrial cancers. In contrast, brain tumors, prostate and colorectal cancers showed the lowest match. These findings provide a theoretical basis for reconsidering the usage of targeted therapeutics and intensifying drug repurposing efforts.


Introduction
Cancers have high levels of molecular heterogeneity [1]. It is manifested at least at two levels: intertumoral, between different patients with the same type of tumor, and intratumoral, between different parts of a tumor and/or its metastases [2]. While intratumoral heterogeneity is thought to be the main cause of drug resistance and relapse of individual cancers, the inter-patient level of heterogeneity hinders the development of universally active anticancer drugs [3,4]. Tumor localization, histologic and molecular differences dictate the need for the development of multiple drugs with different specificities. From the molecular perspective, even tumors of the same localization and histological type are considered a heterogeneous set [5]. The inter-patient heterogeneity of tumors is an important problem in clinical oncology that requires personalization of most of the treatment options [6].
Modern methods in molecular diagnostics enable classification of tumors not only by their histologic type but also by specific genetic features [7]. Information on the molecular genetic patterns that determine the course of carcinogenesis, tumor progression and response to treatments is constantly accumulating [8].
In a number of large-scale experimental and meta-analysis investigations, multiple cancer molecular biomarkers and critical pathways of carcinogenesis were identified. Understanding the molecular biology of tumor heterogeneity stimulates the updating of therapeutic instruments, including targeted anticancer drugs. These drugs have a known molecular target, which increases their selectivity and reduces side effects compared to conventional chemotherapy [14][15][16]. For several cancers, target drugs demonstrated a dramatic increase in survival [14]. There are currently around 200 anticancer target drugs available in clinical oncology worldwide [17]. Outstanding improvements were achieved for hematological and lymphoid malignancies, non-small cell lung cancer (NSCLC), colorectal cancer, gastro-intestinal stromal tumors, breast cancer, kidney cancer, melanoma and others [18][19][20][21][22]. Nevertheless, results of clinical trials show that targeted therapeutics are generally strongly effective only for minor cohorts of patients, whereas the average efficacy for all patients of a given cancer type remains relatively low [14,18]. Furthermore, the mechanism of action of targeted drugs restricts their use to patients who express or otherwise manifest the respective molecular targets or other relevant biomarkers [23]. For example, therapeutic monoclonal antibodies against EGFR, such as Cetuximab and Panitumumab, are poorly effective in KRAS/NRAS-mutated colorectal cancers, that is, for ~40% of the patients [24]. Trastuzumab, a therapeutic antibody targeting the HER2 receptor, is effective only for the ~20%-30% cohort of patients overexpressing the ERBB2 gene [18].
The effectiveness of Epidermal Growth Factor Receptor (EGFR)-specific tyrosine kinase inhibitors in NSCLC is associated with mutations in the EGFR gene; that is, deletions in the 19th-21st exons and amplifications of the EGFR gene positively correlate with the clinical benefit of treatment [20]. In melanoma or NSCLC, the specific BRAF kinase inhibitors Vemurafenib and Dabrafenib and the MEK inhibitors Binimetinib and Trametinib are used only in BRAF-mutated tumors. The tyrosine kinase inhibitor Larotrectinib is recommended for solid tumors with fusions of NTRK genes. In turn, inhibitors of isocitrate dehydrogenase-1 (IDH1) protein are used for the treatment of patients with relapsed or refractory acute myeloid leukemia with a diagnostic mutation in the IDH1 gene. To date, there are similar molecularly guided restrictions for more than 50 targeted cancer drugs [25].
Moreover, the US Food and Drug Administration (FDA) now recommends developing companion diagnostic tests for all new cancer drugs entering the pharmaceutical market [26]. Several marketed target drugs already have such companion molecular tests [25]. Alternatively, clinicians can use transcriptomics-based high throughput data-driven second opinion systems of targeted therapeutics selection [27][28][29][30][31].
Historically, the treatment standards have been formulated for most types of cancer [32][33][34][35][36][37][38][39][40][41][42][43][44]. However, the underlying treatment schemes are focused primarily on localization or histological characteristics of a tumor but do not consider most projections of its molecular phenotype. Moreover, the currently accepted treatment regimens do not reflect the degree of intertumoral heterogeneity within a particular cancer type [1]. Exceptions are made only for a narrow spectrum of specific genetic damages, such as diagnostic mutations discussed above or epigenetic changes like methylation of MGMT gene promoter in brain tumors [18,19,24,45].
Thus, it can be assumed theoretically that cancer types with a higher degree of intertumoral molecular heterogeneity have a smaller proportion of responders to any particular targeted therapy, and that more different drugs should therefore be accepted for clinical use in these instances. In this study we investigate whether this concept is in line with the currently accepted standards of care in oncology.
We estimated the degree of intertumoral heterogeneity in different primary cancer localizations by analyzing the whole exome and gene expression data of 4890 tumors taken from the TCGA project database. The extent of heterogeneity was assessed by measurements of clustering quality. For all major cancer types, we compared heterogeneities at the levels of mutations and gene expression with the repertoires of targeted therapeutics and their molecular targets accepted by the National Comprehensive Cancer Network (NCCN) guidelines. In total, 85 drugs were investigated, including targeted monoclonal antibodies; immunotherapeutics; tyrosine kinase, cyclin, histone deacetylase, poly-ADP ribose polymerase and proteasomal inhibitors; rapalogues; antiangiogenic and microtubule agents; and others. Collectively, they covered 82 individual molecular targets. For the first time, we showed that the repertoires of molecular targets of accepted drugs did not correlate with molecular heterogeneities of different cancer types. On the other hand, we found that the current clinical recommendations for the available cancer drugs were strongly congruent with the gene expression but not the gene mutation patterns. We detected the best match between drug usage recommendations and molecular patterns for the kidney, stomach, bladder, ovarian and endometrial cancers. In contrast, brain tumors, prostate and colorectal cancers showed the lowest match. These findings provide a theoretical basis for reconsidering clinical guidelines and intensifying drug repurposing efforts.

Biosample Sets
Intertumoral variation was measured here using gene expression data and gene mutation frequencies from the molecular profiles of 4890 patient biosamples representing thirteen cancer types. The following cancer types were investigated (according to TCGA classification): (i) Bladder Urothelial Carcinoma, For those types of cancer, major variation parameters were compared. However, the variation may depend on the sample size. Ideally, datasets of the same size should be used for reasons of statistical equivalence in comparative analyses. In reality, however, it is difficult to obtain groups of biosamples of the same size for different cancer types. Artificially reducing most cancer type datasets to the size of the smallest set is undesirable because it can bias their distribution patterns. An alternative is to use datasets of comparable rather than identical sizes. To this end, we selected from the TCGA project database thirteen cancer type datasets of 136-797 samples each, with an average of 377 samples per dataset. For every sample, whole exome and RNA sequencing data were available.
We then checked whether there is a relationship between the variation of gene expression and the number of biosamples per cancer type. For the expression levels of every gene, we calculated a standard deviation (SD) in every cancer type. Pearson correlation was then calculated between the obtained SDs and the respective numbers of biosamples per cancer type (Figure 1a). Similarly, mutation data were investigated using normalized gene mutation rates and their SDs (Figure 1b) [46]. In both cases, we found no significant correlation between SDs and sample sizes for the thirteen tumor types investigated. Furthermore, we performed a computational randomization experiment to investigate the effect of sample size on our expression and mutation datasets. The groups of biosamples that previously corresponded to cancer types were now selected randomly, 1000 times. The numbers of samples in the randomized groups were equal to the numbers of samples per group in the initial TCGA cancer datasets (Table 1). We then calculated SDs for logarithms of gene expression and for normalized mutation rates and correlated them with the numbers of samples per group (Figure 1c,d). We observed no significant correlations between SD and number of samples per group for either mutation or gene expression data. We therefore concluded that the cancer group sizes investigated here are acceptable for further functional comparisons of inter-patient heterogeneities.
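The randomization check described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes a single hypothetical gene, simulated expression values, illustrative group sizes and a hand-written `pearson` helper, and it only shows the general idea of splitting samples randomly into groups of fixed sizes and correlating the group SDs with the group sizes.

```python
import math
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def sd_vs_group_size(values, group_sizes, rng):
    """Randomly split `values` into groups of the given sizes and
    correlate each group's sample SD with its size."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    sds, start = [], 0
    for size in group_sizes:
        sds.append(statistics.stdev(shuffled[start:start + size]))
        start += size
    return pearson(sds, group_sizes)

rng = random.Random(0)
# Hypothetical log-expression values of one gene across 300 samples.
values = [rng.gauss(5.0, 1.0) for _ in range(300)]
# Illustrative group sizes (assumption), not the TCGA dataset sizes.
group_sizes = [20, 40, 60, 80, 100]

# Repeat the random split 1000 times, as in the described experiment.
correlations = [sd_vs_group_size(values, group_sizes, rng) for _ in range(1000)]
mean_r = statistics.fmean(correlations)
print(f"mean Pearson r over 1000 randomizations: {mean_r:.3f}")
```

In a full analysis the same procedure would presumably be applied to every gene and to the normalized mutation rates; the sketch keeps one gene only to stay short.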

Intertumoral Heterogeneities
The inter-patient (intertumoral) heterogeneities were measured for every cancer type in two ways: by similarity with the other cancer types and by similarity between different samples in the same cancer type.
The similarities within the same cancer type were measured for all genes as pairwise distances between either normalized mutation rates or logarithms of expression (Figure 2a,b). We then compared the intragroup heterogeneities of different cancer types with the numbers of molecular targets of the respective National Comprehensive Cancer Network (NCCN)-recommended targeted therapeutics. We also did cancer stage-specific comparisons for stages I-IV, where only the samples and the corresponding molecular targets of drugs were considered for each stage (Supplementary File 1). To perform cancer stage-specific investigations, we selected matched transcriptomic and mutational profiles for 3211 previously untreated cases with known cancer stage. We investigated only cancer types in which a given stage had 40 or more samples, to increase the statistical significance of the heterogeneity analysis.
For both expression and mutation data, we found no significant correlations between the intragroup pairwise distances and the numbers of the respective molecular targets of drugs, both for the whole set of samples (Figure 2d) and for the cancer stage-specific subsets (Supplementary File 1).
The similarities of different cancer types were measured by their abilities to cluster specifically in a general clustering dendrogram of 4890 tumor samples. Samples of more heterogeneous cancer types did not form separate clusters and showed mixed patterns with samples from the other cancer types, and vice versa. For heterogeneity assessments, cancer types were taken one by one and compared against the mixed samples of all the remaining cancer types. Two sorts of labels were used for the samples, that is, whether or not they belong to the cancer type under investigation. On the dendrogram of 4890 tumor samples (for both expression and mutation data), clustering quality reflects the degree of separation of the cancer type under study from the others and, consequently, the degree of its intragroup similarity. The quality of clustering was assessed using the Watermelon Multisection (WM) method (Supplementary File 2, [10,13,[47][48][49][50][51][52][53][54][55][56]), where a bigger value of a specific metric termed WM area corresponds to a higher intragroup similarity (Figure 2c). It is worth noting that measurements of the WM area suggested an almost perfect separation of every type of cancer from the others using gene expression data (Figure 2c). However, the mutation data generally showed a very mixed pattern between the different cancer types. Nevertheless, the WM area for mutation data was relatively high for three types of cancer: colon/rectum adenocarcinoma, lung adenocarcinoma/squamous cell carcinoma and bladder urothelial carcinoma (Figure 2c). Intriguingly, two of these cancer types (colorectal and lung cancer) are outstanding for their known specific mutation patterns strongly associated with the outcomes of targeted therapies that work for these cancers but not the others [20,24,57].
Finally, from the perspective of clustering quality, we found no correlation between the intragroup similarities of cancer groups and the numbers of molecular targets of the respective NCCN-recommended drugs (Figure 2e), for both expression and mutation data. The same phenomenon was also observed for the cancer stage-specific subsets of gene expression and mutation data (Supplementary File 1). We then repeated these analyses considering only molecular data for the 82 genes that encode molecular targets of the NCCN-recommended drugs in the thirteen cancer types under study. For the same 4890 tumor samples, we established 82-gene expression profiles. For the mutation data, we analyzed only 2696 tumor samples, because the rest had no mutations in the drug target genes, and we screened only 78-gene profiles, because for the remaining four target genes no mutation data were obtained in any of the samples.
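As an illustration of the intragroup measure, the following minimal sketch computes the average pairwise distance between sample profiles of one group. The use of the Euclidean distance here is an assumption, and the three-gene profiles and the name `mean_pairwise_distance` are hypothetical.

```python
import math
from itertools import combinations

def mean_pairwise_distance(profiles):
    """Average Euclidean distance over all unordered pairs of profiles."""
    pairs = list(combinations(profiles, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

# Hypothetical three-gene profiles (e.g., log expression) for two cancer types.
tight_group = [(1.0, 2.0, 3.0), (1.1, 2.0, 3.0), (1.0, 2.1, 3.0)]
loose_group = [(1.0, 2.0, 3.0), (4.0, 0.0, 3.0), (0.0, 5.0, 1.0)]

# A larger value indicates a more heterogeneous group.
print(mean_pairwise_distance(tight_group))
print(mean_pairwise_distance(loose_group))
```

Under these assumptions, one such value per cancer type would then be compared with the number of drug targets recommended for that cancer type.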
Overall, the latter type of analysis fully confirmed the trends established initially using the whole-exome and transcriptome input data (Figure 3), also for the cancer stage-specific subsets (Supplementary File 3). The major difference was that for the mutation data in target genes, thyroid cancer showed the highest WM area, thus reflecting the most distinctive target gene mutation profiles for the samples of this group (Figure 3c). This fact can be at least partly explained by the outstandingly high frequency of the BRAF V600E mutation in the most frequent, papillary subtype of thyroid cancer. In contrast, bladder cancer samples now showed an average value of WM area, which may reflect the apparently poor response of this cancer to targeted therapies [57]. However, the lung cancer and colorectal cancer groups of samples showed, as before, relatively high values of WM area, which is in line with the previous considerations. Taken together, our results clearly suggest that for the groups investigated there was no significant correlation across cancer types between the intertumoral heterogeneity and the number of molecular targets of the cancer therapeutics (Figure 3d,e). Interestingly, we observed a statistically significant correlation between cancer type-specific average pairwise distances for the mutation and expression profiles of all genes (Figure 4a). The same trend was also seen for the stage-specific subsets of cancer samples (Supplementary File 4). This finding most probably reflects a direct relationship between overall genetic changes and altered regulation of gene expression: a larger number of mutated genes here means an overall stronger alteration of the expression profiles.
However, our data also suggest that this pattern may have a complex nature and is evident only for large-scale genetic profiles, because no significant correlations were found for the fraction of 78 drug target genes (Figure 4b, Supplementary File 4). Moreover, for the individual tumor samples from the same datasets we previously showed no correlation between normalized tumor-specific gene expression changes and normalized mutation rates [58].

Clustering of Cancer Types in Relation with Recommended Targeted Therapeutics and Molecular Data
We then clustered molecular profiles averaged for every cancer type. The averaged profiles were calculated separately for every cancer type for both gene expression and mutation data. As before, the clustering was performed in two modes: for all genes and for a reduced set of target genes of NCCN-recommended drugs (Figure 5), including cancer stage-specific clustering (Supplementary File 5). Interestingly, clustering of the cancer types by expression data reflected the anatomical proximity of the tissues or organs of primary tumor localization. This trend was more pronounced for the clustering of full expression profiles (Figure 5a,d). In turn, for the mutation-based clustering by both all genes and drug target genes, we observed a tendency for colorectal cancer, lung cancer, uterine corpus endometrial carcinoma, bladder urothelial carcinoma and stomach adenocarcinoma to cluster together. These cancers are known for their high frequency of mutations [59].
It is worth noting that the kidney, prostate and thyroid cancers strongly tended to cluster together in both mutation and expression-based dendrograms (Figure 5a,b,d,e), including cancer stage-specific subsets (Supplementary File 5).
The molecular-based clustering was then compared with the clustering by the molecular targets of NCCN-recommended drugs for the particular cancer types (Figure 5c). We assessed whether the cancer types that had similar repertoires of recommended drugs were also clustering together according to molecular profiles and vice versa. We screened clustering features for a mix of cancer stages (Figure 5c) and also separately for all cancer stages I-IV (Supplementary File 5).
For example, for a mix of cancer stages we observed three major clusters of cancer types according to the target drugs used (Figure 5c). The rightmost cluster differed from the rest by frequent therapeutic targeting of tubulins, PARP, hormone receptors, cyclin dependent kinases and CD molecules. The leftmost cluster had targets such as receptors for vascular, placental, platelet derived and endothelial growth factors, NTRK, DDR2, MAPK11, other tyrosine kinases: ABL1, EPHA2, TEK and FRK. Finally, for the middle cluster including prostate, stomach, endometrial and cervical cancers we observed the minimal number of drug targets (Figure 5b,e).
This type of analysis was done for every specific cancer stage (I-IV, Supplementary File 5), and for all investigations we assessed whether drug usage clustering was reflected by the clustering according to DNA mutation or gene expression data (Table 2). For some cancer types, like kidney, thyroid and colorectal cancers, enough information on targeted drug recommendations was available for all stages, whereas for the prostate, ovarian, endometrial, cervical and brain tumors there was not enough information on stage-specific targeted drug usage to enable clustering (Table 2). For both stage-specific and unspecific settings, we quantified coincidences of molecular-based and drug usage-based clustering for all cancer types (Table 2).
We found that clustering by gene expression profiles was significantly more congruent with the drug recommendation-based clustering than clustering by mutation data (82% coincidence for expression vs. only 49% for mutation data, Table 3). Drug usage in some cancer types perfectly matched the molecular profiles (e.g., kidney, bladder, stomach, endometrial, cervical and ovarian cancers for expression data; kidney, prostate, endometrial, ovarian and brain tumors for mutation data). In turn, the poorest matches were the brain, prostate and colorectal cancers for expression data and the lung, colorectal, endometrial and stomach cancers for mutation data (Table 3). The kidney, liver, endometrial and ovarian cancers matched the drug usage profiles well for both expression and mutation data, whereas colorectal cancer matched rather poorly for both types of molecular data (Table 3).

Discussion
Molecular classification of tumors is a promising field in cancer research that can bring new diagnostic, prognostic and therapeutic options along with improved treatment outcomes [60]. High throughput sequencing provides an instrument for modern classification of tumors based on whole exome mutation profiles and/or gene expression data. The classes may be the groups of tumors showing specific molecular features, like mutation and gene expression patterns. The heterogeneity of cancer types depends on the number of molecular classes/subclasses. However, the criteria for cluster allocation are most frequently subjectively determined by the researcher. Heterogeneity of a given type of cancer can be measured by characterizing mixing of the respective samples with the other types on clustering dendrograms. In this case, homogeneous types of cancer will form separate clusters, whereas in heterogenous types samples can be mixed with the other types. However, quantitative assessment of the quality of clustering, especially with high number of samples (e.g., thousands) is a complex non-intuitive task.
For this purpose, a method termed Watermelon Multisection (WM) was developed that enables relatively quick and objective assessment of clustering quality and hence the intertumoral heterogeneity. The method is based on algorithmic assessment of entropy on every cut of clustering. We speculate that WM may be useful also for the analysis of other types of Big Data in biomedicine. In this study, we applied WM to assess heterogeneities of thirteen cancer types presented by 4890 tumor samples with genomic and transcriptomic data. To date, these samples were already described as a heterogeneous group [61,62], but, to our knowledge, their inter-tumor heterogeneity was never assessed numerically. Here we quantitatively characterized intertumoral heterogeneities between and inside different cancer types.
The analysis was performed independently for the gene expression and DNA mutation data, both at the level of the full set of human genes and for the fraction of drug target genes. For the first time we performed a qualitative assessment of the relationship between intertumoral heterogeneity and the repertoires of clinically approved targeted therapeutics. We investigated here the molecular profiles of 4890 patients representing thirteen cancer types and assessed the molecular targets of 85 targeted therapeutics. We found that there is no correlation between the intertumoral heterogeneity and the number of molecular targets of the recommended drugs. Our data demonstrate that currently the repertoire of approved/recommended targeted drugs does not reflect the spectrum of molecular genetic variants of cancers and probably could be expanded for the most highly heterogeneous cancers.
On the other hand, clustering of cancers by gene expression data reflects their anatomical and histological proximities, whereas clustering by mutations shows little dependence on these factors (Figure 5). Overall, the most frequently mutated drug target genes were BRAF for thyroid cancer, FGFR2 for endometrial cancer, EGFR for brain tumors, MGMT for colorectal cancer, FGFR2 and FGFR3 for bladder cancer, NTRK3 for non-small cell lung cancer and NTRK2, FGFR2 and EGFR for stomach cancer. These mutations were mentioned in the previous literature, but additional studies are necessary to investigate whether they can be considered clinically actionable. Thus, thyroid carcinomas bearing BRAF mutations are less sensitive to BRAF inhibitors than melanomas and develop primary or acquired resistance due to additional mutations and activation of alternative signaling pathways that reinforce ERK signaling [63]. Recently, an NRG Oncology/Gynecologic Oncology Group study showed an association of FGFR2 mutations with poor outcomes in endometrial cancer [64]. Targeting the EGFR signal transduction pathway in brain tumors faces the issue of rapid adaptation through activation of alternative signaling pathways. The role of temozolomide in colorectal cancer remains controversial and further research is warranted. Temozolomide showed modest activity in colorectal cancers with MGMT promoter methylation, and the corresponding clinical trial did not meet its primary end point [65]. Tracing of FGFR3 mutations is currently used for following bladder cancer recurrence, but no related therapeutic options have become available [66]. Finally, the role of mutated NTRK3 as a target gene for the treatment of non-small cell lung cancer is being explored in clinical trials [67], whereas targeting FGFR2 and EGFR in stomach cancer is considered promising for improving current strategies [68].
Our results also add to the discussion of whether gene expression biomarkers have the potential to replace or add value to DNA mutation biomarkers [69]. When comparing the clustering by drug recommendation patterns with the clustering by molecular profiles, we found that the gene expression data reflected drug usage practice much better than the gene mutation profiles (Table 3). This finding may be in line with the results of a previous clinical investigation, WINTHER, where personalized gene expression-guided experimental drug prescriptions to advanced cancer patients resulted in better clinical outcomes than mutation-based prescriptions [70].
NCCN treatment guidelines for drug usage represent a consensus critical assessment of the results of multiple clinical trials. We showed that they are in line with the transcriptomic features of different cancer types, as reflected by the coincidence of clustering by drug usage patterns and by gene expression in 82% of the cases tested (Table 3). This was not the case for the mutation data, where coincidence was detected in only 49% of the cases, thus strongly arguing that, theoretically, gene expression data could be a more adequate source of information for cancer treatment selection.
From this perspective, considering the existing repertoire of targeted therapeutics, kidney, stomach, bladder, ovarian and endometrial cancer patients receive the best molecular-matched therapy. In contrast, brain tumor, prostate and colorectal cancer patients receive the least molecular-matched targeted therapy (Table 3). We hypothesize that rethinking drugs usage practice towards more molecularly-matched therapeutics has a potential to improve treatment outcomes in the latter group of cancers.
The best validation for our results could be similar analyses of individual cancer molecular profiles associated with known responses to treatment with targeted therapeutics. Unfortunately, only 130 out of the 4890 tumor profiles investigated here have been annotated with known responses to targeted drugs. Such a small number of samples does not make it possible to form groups of drug responders and non-responders large enough for statistically significant analysis.
However, further accumulation of clinically annotated high throughput molecular profiles, for example [71][72][73], can dramatically improve this situation in the future.
To conclude, we showed here for the first time that the repertoires of current targeted therapeutics do not correspond to molecular heterogeneities of different cancer types. On the other hand, the clinical recommendations for the available cancer drugs are mostly congruent with the gene expression but not gene mutation patterns. We detected the best match among the drugs usage and molecular patterns for the kidney, stomach, bladder, ovarian and endometrial cancers; in contrast, brain tumors, prostate and colorectal cancers showed the lowest match. We propose that studies of a similar design could be carried out periodically to control the progress of anticancer drugs development.

Tumor Samples
We used molecular profiles for individual tumor samples simultaneously investigated by RNA sequencing and whole exome sequencing in The Cancer Genome Atlas (TCGA) project [10]. For higher statistical significance, we considered only primary localizations having at least one hundred samples with RNA sequencing and exome sequencing data. In total, 4890 tumor samples were selected for the analysis (Table 1). The term "cancer type" is used as a synonym of cancer from a particular primary site according to TCGA classification [74]. TCGA sample barcodes for all molecular profiles investigated are given in Supplementary Table S1. For stage-specific studies, we selected molecular profiles for previously untreated tumor samples with a known diagnosed stage (Supplementary Table S1). For our analyses, we considered a minimal group size of 40 samples.
Normalized mutation rate (NMR) was calculated as follows:

$$\mathrm{NMR}_{n,g} = \frac{N_{\mathrm{mut}(n,g)}}{\mathrm{Length}_{\mathrm{CDS}(n)}}$$

where $\mathrm{NMR}_{n,g}$ is the NMR of gene n in sample g; $N_{\mathrm{mut}(n,g)}$ is the number of mutations for gene n in sample g; $\mathrm{Length}_{\mathrm{CDS}(n)}$ is the length of the coding DNA sequence (CDS) of gene n in nucleotides [27,46]. The gene CDS normalization was done to eliminate a bias caused by different gene lengths [46]. Calculated NMR values are given in Supplementary
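As a worked example of this normalization, the sketch below assumes, as the symbol definitions suggest, that NMR is the number of mutations of a gene in a sample divided by the CDS length of that gene. The gene names, CDS lengths and mutation counts are hypothetical.

```python
def normalized_mutation_rate(n_mutations, cds_length):
    """NMR of one gene in one sample: mutation count divided by CDS length
    (assumed form of the normalization)."""
    if cds_length <= 0:
        raise ValueError("CDS length must be positive")
    return n_mutations / cds_length

# Hypothetical CDS lengths (nucleotides) and mutation counts in one sample.
cds_lengths = {"GENE_A": 1000, "GENE_B": 4000}
mutation_counts = {"GENE_A": 2, "GENE_B": 2}

nmr = {
    gene: normalized_mutation_rate(mutation_counts[gene], cds_lengths[gene])
    for gene in cds_lengths
}
print(nmr)
```

With equal mutation counts, the shorter gene receives the higher NMR, which is the length bias that the normalization is meant to remove.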

Gene Expression Data
RNA sequencing data were extracted from the GDC portal in HTSeq counts format [76]. For every gene expression profile, DESeq2 normalization was applied [47]. For further analyses we used logarithms of normalized gene counts.
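The following minimal sketch illustrates the median-of-ratios idea that underlies DESeq2 size factors, followed by a logarithmic transformation. It is not a call to DESeq2 itself; the function names, the count values, the base-2 logarithm and the pseudocount of 1 are assumptions made for illustration only.

```python
import math
import statistics

def size_factors(counts):
    """Median-of-ratios size factors. `counts` is a list of per-sample lists
    of raw counts, with genes in the same order in every sample."""
    n_genes = len(counts[0])
    # Use only genes with non-zero counts in every sample.
    usable = [g for g in range(n_genes) if all(s[g] > 0 for s in counts)]
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = {
        g: statistics.fmean([math.log(s[g]) for s in counts]) for g in usable
    }
    factors = []
    for sample in counts:
        log_ratios = [math.log(sample[g]) - log_geo_means[g] for g in usable]
        factors.append(math.exp(statistics.median(log_ratios)))
    return factors

def log_normalized(counts, pseudocount=1.0):
    """Divide counts by the sample size factor, then take log2(x + pseudocount)."""
    factors = size_factors(counts)
    return [
        [math.log2(c / f + pseudocount) for c in sample]
        for sample, f in zip(counts, factors)
    ]

# Hypothetical raw counts: sample 2 is sample 1 sequenced twice as deeply.
counts = [[10, 100, 0, 50], [20, 200, 0, 100]]
factors = size_factors(counts)
logged = log_normalized(counts)
print(factors)
print(logged)
```

In this toy case the two samples differ only in sequencing depth, so their normalized log profiles coincide.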

Clinical Utility of Drugs and Molecular Targets
We extracted information about targeted therapeutics used in clinical practice for treatment of the cancer types under investigation from the US Food and Drug Administration (FDA) portal [25] and from the National Comprehensive Cancer Network (NCCN) guidelines [32][33][34][35][36][37][38][39][40][41][42][43][44]. The inclusion criterion for a therapeutic was its FDA approval and/or recommendation by the NCCN, category higher than 2B. Only those drugs were included that are accepted for the histotypes present in the 4890-tumor sampling for the thirteen cancer types investigated here (  [17,77]. In total, we identified 82 unique genes for molecular targets of the drugs selected (Supplementary Table S7).

Clustering Parameters and Quality
The clustering of tumor samples was performed using the Ward.D2 method [78] with the "euclidean" distance. The clustering quality was assessed by calculating the Watermelon Multisection (WM) metric. WM is a method that characterizes the entropy of a cluster dendrogram at every cut level. The obtained values are compared with those for random orders of class labels. The final output value of WM, termed WM area, reflects the quality of clustering. When the elements of different classes are perfectly separated, the clustering has minimal entropy at every dendrogram cut level and the WM area is equal to 1. When the elements of different classes are randomly distributed among clusters, the WM area varies around zero and may also take negative values. WM scoring demonstrates an advantage over other clustering quality metrics, as described in detail in Supplementary File 2.
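The notion of entropy at dendrogram cut levels can be illustrated with the simplified sketch below. It is not the WM metric, whose definition is given in Supplementary File 2: it only computes the size-weighted Shannon entropy of binary class labels within clusters for cuts into 1 to 4 clusters. The data are simulated, the helper `cut_entropies` is hypothetical, and SciPy's "ward" linkage on Euclidean distances is used here on the assumption that it is a reasonable counterpart of Ward.D2.

```python
import numpy as np
from scipy.cluster.hierarchy import cut_tree, linkage

def cut_entropies(linkage_matrix, labels, max_clusters):
    """Size-weighted Shannon entropy (bits) of class labels within clusters,
    for dendrogram cuts into 1..max_clusters clusters."""
    labels = np.asarray(labels)
    ks = list(range(1, max_clusters + 1))
    # One column of cluster assignments per requested number of clusters.
    assignments = cut_tree(linkage_matrix, n_clusters=ks)
    result = []
    for col in range(len(ks)):
        total = 0.0
        for cluster in np.unique(assignments[:, col]):
            member_labels = labels[assignments[:, col] == cluster]
            _, counts = np.unique(member_labels, return_counts=True)
            p = counts / counts.sum()
            entropy = float(-(p * np.log2(p)).sum())
            total += (len(member_labels) / len(labels)) * entropy
        result.append(total)
    return result

rng = np.random.default_rng(0)
# Two hypothetical, well-separated groups of 20 samples with 5 features each.
group_a = rng.normal(0.0, 1.0, size=(20, 5))
group_b = rng.normal(10.0, 1.0, size=(20, 5))
data = np.vstack([group_a, group_b])
labels = [1] * 20 + [0] * 20  # 1 = cancer type under study, 0 = all others

Z = linkage(data, method="ward", metric="euclidean")
entropies = cut_entropies(Z, labels, max_clusters=4)
print(entropies)
```

For well-separated classes the entropy drops to zero as soon as the dendrogram is cut into two clusters, whereas for mixed classes it would stay high over many cut levels; comparing such a profile against randomly permuted labels is the step that this sketch omits.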

Data Presentation
The results were visualized using the packages graphics and ggplot2 [79,80].