Pharmacoproteomic characterisation of human colon and rectal cancer

Abstract

Most molecular cancer therapies act on protein targets but data on the proteome status of patients and cellular models for proteome‐guided pre‐clinical drug sensitivity studies are only beginning to emerge. Here, we profiled the proteomes of 65 colorectal cancer (CRC) cell lines to a depth of > 10,000 proteins using mass spectrometry. Integration with proteomes of 90 CRC patients and matched transcriptomics data defined integrated CRC subtypes, highlighting cell lines representative of each tumour subtype. Modelling the responses of 52 CRC cell lines to 577 drugs as a function of proteome profiles enabled predicting drug sensitivity for cell lines and patients. Among many novel associations, MERTK was identified as a predictive marker for resistance towards MEK1/2 inhibitors and immunohistochemistry of 1,074 CRC tumours confirmed MERTK as a prognostic survival marker. We provide the proteomic and pharmacological data as a resource to the community to, for example, facilitate the design of innovative prospective clinical trials.

Figure EV2. Protein/mRNA correlation & missing value imputation (related to Fig 3).
A Bar chart of the fraction of missing values per intensity bin, relative to the respective dataset.
B Scatterplot of median mRNA expression before MComBat adjustment for all transcripts across all cell lines versus protein expression in both the CRC65 cell line and CPTAC patient datasets (log2-transformed & median-centred giBAQ; diagonal: x = y; lm: linear model; R²: coefficient of determination).
C Scatterplot of the number of peptides per protein used for the calculation of giBAQ values in the CPTAC dataset versus the CRC65 dataset. The quantification of the majority of all proteins is based on more peptides in the CRC65 dataset compared to the CPTAC dataset (lm: linear model; R²: coefficient of determination; see main text and Appendix Supplementary Methods for details).
D, E Histograms visualising the distribution of log2-transformed and median-centred giBAQ values of the CRC65 (top row) and CPTAC (bottom row) datasets after application of (D) mRNA-guided, (E) sequential mRNA-guided and Perseus-type, as well as sequential mRNA-guided and minimum-guided missing value imputation on the protein or peptide level, respectively. Additional lines visualise the contribution of measured values (green), values imputed by mRNA-guided missing value imputation (yellow) and values imputed by either Perseus-type or minimum-guided missing value imputation (red, see also Appendix Supplementary Methods).

Figure EV3. Estimating protein levels from mRNA levels (related to Fig 3).

A Protein/mRNA ratios (log10) for four proteins plotted across the CRC65 and CPTAC datasets show that this ratio is relatively stable for a given protein/mRNA.
B Distribution of the median absolute fold-change in protein/mRNA ratios, relative to the protein-wise median protein/mRNA ratio, indicating that variation in the protein/mRNA ratios was typically below twofold. The median is marked with a bold horizontal line inside a box spanning the interquartile range (IQR) from the 25% quantile (q25%; lower horizontal line) to the 75% quantile (q75%; upper horizontal line), while the whiskers extend to a = q25% − 1.5 × IQR and b = q75% + 1.5 × IQR. Outliers according to the standard boxplot definition (x < a, as well as x > b) were excluded for visual clarity.
C Scatterplot of median mRNA expression after MComBat adjustment for all transcripts across all cell lines/tumours versus protein expression in the CRC65 and CPTAC datasets, indicating that protein levels can be estimated reasonably well from transcript levels (lm: linear model; R²: coefficient of determination; see main text and Appendix Supplementary Methods for details).

Figure EV5. Reproducibility of Kinobeads-based quantification, Western blots of EPHA4, ABL1 and BRAF V600E, correlation of LFQ intensity with densitometry and full proteome giBAQ values, as well as overlap of kinase IDs, Kinobeads Subtypes and predictive markers for cetuximab (related to Fig 4).

A 3-D scatterplot showing the correlation of LFQ values for all proteins across all cell lines in the three biological replicates of the Kinobeads pulldowns after ComBat adjustment (R: Pearson's R).
B-D Scatterplot (B) and underlying Western blot data (C and D) visualising the correlation between normalised densitometry log2-ratios relative to C10 and normalised LFQ log2-ratios relative to OXCO-1 for EPHA4 and ABL1 (diagonal: x = y; lm: linear model; R²: coefficient of determination). Densitometry was performed using ImageStudioLite v5.2.5 (LI-COR), expressing EPHA4 and ABL1 expression relative to the respective ERK1/2 signal, followed by dividing all expression values by the expression value of OXCO-1 (present on each gel) and log2-transformation. LFQ log2-ratios were based on log2-transformed and median-centred LFQ intensities after ComBat adjustment. We reverted the log2-transformation and divided all LFQ values by the LFQ value of OXCO-1, followed by log2-transformation of the resulting ratios. Cell lines harbouring the BRAF V600E mutation are visualised using a mutation-specific antibody.
E Scatterplot of kinase expression as quantified using CRC65 full proteome measurements versus Kinobeads experiments. Kinase expression is systematically higher in the Kinobeads data.

Figure EV6. Co-treatment with UNC569 (MERTK inhibitor) and MEK inhibitors is more effective than treatment with MEK inhibitors alone (related to Fig 5).
A Effect-size heat maps of two drugs (one from two datasets) targeting MEK1/2 show consistent association of high MERTK expression with drug resistance, even when drug sensitivity is modelled based on all proteins in the Kinobeads expression matrix (see Appendix Supplementary Methods).
B Boxplots of MERTK expression (log2-transformed and median-centred LFQ values after ComBat adjustment) in cell lines predicted to be sensitive (CC07, HDC-143, SK-CO-1; dark blue) or resistant (C10, CaCo-2, T84; yellow) towards two MEK1/2 inhibitors RDEA119 and PD-0325901. The whiskers extend to the minimum and maximum expression of MERTK for cell lines with a given sensitivity prediction, while the median expression of MERTK is marked with a bold horizontal line inside a box spanning the interquartile range (IQR) from the 25% quantile (lower horizontal line) to the 75% quantile (upper horizontal line). Resistant cell lines show significantly higher expression of MERTK than sensitive ones (P ≤ 0.05, one-sided Mann-Whitney test).
C-E 10-point dose-response curves of (C) the MERTK inhibitor UNC569, (D) two MEK1/2 inhibitors RDEA119 and PD-0325901 and (E) their constant-ratio combination (see Appendix Supplementary Methods) for the same cell lines as in (B) from our own in vitro experiments. Relative response is expressed as the mean of three technical replicates.
F Boxplots of the AUC from (C-E), showing significant differences between cell lines predicted to be sensitive (CC07, HDC-143, SK-CO-1; dark blue) and cell lines predicted to be resistant (C10, CaCo-2, T84; yellow) to RDEA119 and PD-0325901 (P ≤ 0.05, one-sided Mann-Whitney test). The whiskers extend to the minimum and maximum AUC for a given drug and sensitivity prediction, while the median AUC is marked with a bold horizontal line inside a box spanning the interquartile range (IQR) from the 25% quantile (lower horizontal line) to the 75% quantile (upper horizontal line). This difference disappears when co-treating cells with UNC569 in addition to MEK1/2 inhibitors.
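
The one-sided Mann-Whitney comparisons in panels B and F each involve three sensitive and three resistant cell lines. The following is a minimal sketch of how such a comparison could be computed, assuming an exact permutation form of the test; the caption does not state which variant or software was used, so this is an illustration rather than the authors' procedure. The function names and the expression values are hypothetical and are not data from the study.

```python
from itertools import combinations


def u_statistic(x, y):
    """Count (x, y) pairs in which x exceeds y; ties contribute 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )


def exact_one_sided_mann_whitney(x, y):
    """Return (U, p), where p = P(U >= observed) over all relabellings.

    Tests the one-sided alternative that values in x tend to be
    larger than values in y. Exact enumeration, so only suitable
    for small groups such as 3 versus 3.
    """
    pooled = list(x) + list(y)
    positions = range(len(pooled))
    observed = u_statistic(x, y)
    at_least_as_extreme = 0
    total = 0
    for chosen in combinations(positions, len(x)):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in positions if i not in chosen_set]
        total += 1
        if u_statistic(group_x, group_y) >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / total


# Hypothetical log2 expression values, NOT measurements from the study.
resistant = [1.8, 0.9, 1.2]
sensitive = [-0.7, -1.1, 0.1]

u, p = exact_one_sided_mann_whitney(resistant, sensitive)
print(f"U = {u}, one-sided exact p = {p}")
```

With three values per group there are only 20 ways to split six observations, so the smallest exact one-sided P-value is 1/20 = 0.05, and it is reached only when the two groups are completely separated.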