Discovery of Molecular Glue Degraders via Isogenic Morphological Profiling

Molecular glue degraders (MGDs) are small molecules that degrade proteins of interest via the ubiquitin–proteasome system. While MGDs were historically discovered serendipitously, approaches for MGD discovery now include cell-viability-based drug screens or data mining of public transcriptomics and drug response datasets. These approaches, however, have target spaces restricted to the essential proteins. Here we develop a high-throughput workflow for MGD discovery that also reaches the nonessential proteome. This workflow begins with the rapid synthesis of a compound library by sulfur(VI) fluoride exchange chemistry coupled to a morphological profiling assay in isogenic cell lines that vary in levels of the E3 ligase CRBN. By comparing the morphological changes induced by compound treatment across the isogenic cell lines, we were able to identify FL2-14 as a CRBN-dependent MGD targeting the nonessential protein GSPT2. We envision that this workflow would contribute to the discovery and characterization of MGDs that target a wider range of proteins.


General procedure for the synthesis of fluorosulfates FL1 to FL4
To a solution of phenol precursor 6, 7, 9 or 10 (1.0 eq.) in a mixture of acetonitrile/DMSO (0.1 M, 90:10, to facilitate solubility), triethylamine (2.6 eq.) is added dropwise and the mixture is stirred for 20 min at room temperature. 1-(Fluorosulfuryl)-2,3-dimethyl-1H-imidazol-3-ium trifluoromethanesulfonate (2.6 eq.) is added and the reaction is stirred at room temperature for 2-3 h until TLC shows no starting phenol. The mixture is then diluted with ethyl acetate and washed with water, aq. 0.5 M HCl and brine. The organic layer is dried over anhydrous Na2SO4, filtered and the solvent evaporated under reduced pressure. The desired fluorosulfates were purified by flash chromatography in hexane:ethyl acetate and obtained as yellow powders.
The reaction is then diluted with ethyl acetate and washed with water and aq. 0.5 M HCl. The organic layer is dried over anhydrous Na2SO4, filtered and the solvent evaporated under reduced pressure. The final compound is purified via flash chromatography using hexane:ethyl acetate as solvents.

Assessment of gene expression and essentiality in RKO WT
Public datasets on the transcriptome and essentiality of protein-encoding genes for RKO (cell line ID: ACH-000943) were retrieved from DepMap Public 23Q2 (https://depmap.org/portal/, accessed on 14 August 2023). The public dataset of proteins predicted to harbor the G-loop, the degron recognized by CRBN, was retrieved from the CRBN Substrate Database (https://bailab.siais.shanghaitech.edu.cn/services/crbn-subslib, accessed on 17 May 2023). 5 We retrieved the protein-encoding genes with available transcriptomic and essentiality data (17798 protein-encoding genes). We also filtered the list of predicted CRBN substrates to those containing the key glycine residue in the seventh position of the G-loop, yielding a list of 2610 CRBN substrates, of which 2484 had transcriptomic and essentiality data. We considered a protein expressed if log2(transcripts per million + 1) ≥ 2. Expressed proteins were then binned into essential (CERES score ≤ -1) and non-essential (CERES score > -1).
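The binning described above can be sketched in a few lines of pandas. This is a minimal illustration rather than the code used in this study: the function name `classify_genes`, the column names and the toy gene table are hypothetical, and the expression cutoff is assumed to apply to log2(TPM + 1) values.

```python
import pandas as pd


def classify_genes(df, expr_col="log2_tpm_plus_1", dep_col="dependency_score",
                   expr_cutoff=2.0, essential_cutoff=-1.0):
    """Label each gene as not expressed, essential or non-essential.

    Column names and cutoffs are illustrative assumptions.
    """
    out = df.copy()
    expressed = out[expr_col] >= expr_cutoff
    essential = out[dep_col] <= essential_cutoff
    out["category"] = "not expressed"
    out.loc[expressed & essential, "category"] = "expressed, essential"
    out.loc[expressed & ~essential, "category"] = "expressed, non-essential"
    return out


# Toy table with hypothetical genes (not real DepMap values).
toy = pd.DataFrame({
    "gene": ["GENE_A", "GENE_B", "GENE_C"],
    "log2_tpm_plus_1": [5.1, 3.0, 0.4],
    "dependency_score": [-1.3, -0.1, -1.5],
})
result = classify_genes(toy)
```

Genes below the expression cutoff are labeled first, so a low essentiality score alone never places a gene in the essential bin.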

CellTiter-Glo assay
CellTiter-Glo® Luminescent Cell Viability Assays (Promega, #G7572) were performed according to the manufacturer's instructions. Cell viability was determined after 48 h of treatment with the drugs. For the screening of the whole SuFEx compound library, compound dilution and transfer (8-point dosage, 3-fold dilution, 13.5 µM starting concentration, in duplicates) as well as seeding of RKO WT cells were carried out by a liquid handling system (PerkinElmer). We used bortezomib (13.5 nM) as a positive control. Each plate had 32 negative control wells (DMSO) and 32 positive control wells. We used these controls to calculate a Z'-factor 7 for each plate individually. All plates had Z' > 0. To compare compounds across plates with different signals, we calculated their signals as a percentage of the control. This calculation was done by linear regression, setting the mean signal of the DMSO wells to 100% and the mean signal of the positive control wells to 0% for each plate. Data visualization was done with seaborn (https://seaborn.pydata.org, version 0.12.2).
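As an illustration of the per-plate quality metric and normalization described above, the sketch below computes the Z'-factor in its standard form, Z' = 1 - 3(σ_neg + σ_pos)/|μ_neg - μ_pos|, and rescales raw signals so that the DMSO mean maps to 100% and the positive-control mean to 0%. The function names and the toy luminescence values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import numpy as np


def z_prime(neg, pos):
    """Z'-factor of one plate from negative- and positive-control wells."""
    neg = np.asarray(neg, dtype=float)
    pos = np.asarray(pos, dtype=float)
    spread = neg.std(ddof=1) + pos.std(ddof=1)  # sample standard deviations
    return 1.0 - 3.0 * spread / abs(neg.mean() - pos.mean())


def percent_of_control(signal, neg, pos):
    """Map raw signals so that mean(neg) -> 100 % and mean(pos) -> 0 %."""
    neg_mean = float(np.mean(neg))
    pos_mean = float(np.mean(pos))
    return 100.0 * (np.asarray(signal, dtype=float) - pos_mean) / (neg_mean - pos_mean)


# Toy luminescence readings (arbitrary units, not measured data).
dmso_wells = [1000.0, 1020.0, 980.0, 1000.0]
bortezomib_wells = [100.0, 110.0, 90.0, 100.0]

plate_z = z_prime(dmso_wells, bortezomib_wells)
normalized = percent_of_control([1000.0, 100.0, 550.0], dmso_wells, bortezomib_wells)
```

Because the two control means define the line, the rescaling is equivalent to a two-point linear fit performed separately for each plate.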

Isogenic Cell Painting Assay
The Cell Painting assay we implemented follows the method described by Bray et al. 8 and the updated version described in Cimini et al. 9 with the omission of one cellular stain (SYTO 14) due to limitations of the laser and filter configuration of the Opera Phenix used in this study. We used RKO cell lines (WT, CRBN KO and CRBN OE) 6 which were passaged for no more than 5 passages. When the cells reached a confluency of ca. 80%, they were washed with PBS, detached with trypsin and resuspended in DMEM media. The cell suspension was passed through a 40 μm cell strainer to reduce the clumping of cells. The cells were counted with a CASY Cell Counter and diluted to achieve a count of 1000 cells per well in a volume of 40 μL per well. The cells were dispensed into PhenoPlate 384-well microplates using the Multidrop Combi reagent dispenser (Thermo Scientific #5840300). Microplates were transferred to an incubator at 37 °C, 5.0% (v/v) CO2, 95.0% humidity for 24 h.
The microplates were then dosed with compound using the Labcyte Echo 550 acoustic liquid handler (Beckman Coulter). We avoided dosing (and imaging) the outermost wells in the microplates as we observed that these wells tended to exhibit imaging biases, with lower median channel intensities detected in our imaging setup. The microplates were returned to the incubator for another 48 h.
To stain the cells, we first prepared working solutions of each stain. We diluted Concanavalin A Alexa Fluor 488 conjugate, Phalloidin Alexa Fluor 568 conjugate and MitoTracker Deep Red following the manufacturer's instructions. We diluted Hoechst 33342 with water to a concentration of 20 mg/mL. We prepared the permeabilization buffer comprising 1× HBSS mixed with 0.1% (v/v) Triton X-100 and 1.0% (wt/v) BSA. We mixed the permeabilization buffer with Concanavalin A Alexa Fluor 488 conjugate (f.c. 2 μg/mL), Hoechst 33342 (f.c. 5 μg/mL) and Phalloidin Alexa Fluor 568 conjugate (f.c. 8.25 nM) to obtain a multi-staining and permeabilization solution.
We stained the cells by dispensing 20 μL MitoTracker Deep Red into each well. The microplates were returned to the incubator for 30 min. We then fixed the cells by dispensing 20 μL 16.0% methanol-free paraformaldehyde and incubated the microplates at r.t. for 30 min. We washed the wells with 60 μL PBS 4 times using the BioTek ELX405 washer (Perkin Elmer). We permeabilized and stained the cells using 20 μL multi-staining and permeabilization solution, and incubated the microplates at r.t. for 30 min. We washed off the solution and added 40 μL 1× HBSS with 0.05% (wt/v) NaN3. We sealed the lid of the microplates with parafilm. The microplates were wrapped with aluminium foil and stored at 4 °C until ready for imaging (for up to four days).
Compound treatment for the isogenic Cell Painting Assay. We used eight compounds that induced diverse morphology (dubbed "morphology controls") as recommended by the JUMP-Cell Painting Consortium. 10 We also curated a list of nine molecular glue degraders and PROTACs with CRBN-dependent bioactivity and seven compounds without CRBN-dependent activity (Table S3). Together with the 132 test compounds synthesized in-plate by sulfur(VI) fluoride exchange chemistry, we used a total of 156 compounds. The control compounds were present in all microplates used for the Cell Painting assay, while the 132 test compounds were divided across microplates (44 test compounds per plate). The microplates were dosed with compound or DMSO to a final concentration of 10 μM or the equivalent volume to reach the IC50 value.
Feature extraction and data pre-processing. We processed the acquired images using CellProfiler (https://cellprofiler.org, version 4.2.1) and cellpose (https://cellpose.org, version 1.0.2) to extract 3005 morphological features and one feature reflecting cell count. In brief, the illumination correction matrix per channel was calculated for each plate and used for correcting images prior to nuclei and whole cell segmentation and feature extraction. The DNA channel images were used for nuclei segmentation, and the composite images of DNA and AGP channels were used for whole cell segmentation. Segmentation was done using the default models of cellpose (i.e. "nuclei" and "cyto"). The segmentation masks were then used for defining the nuclei, whole cells and cytoplasmic regions for feature extraction using a custom CellProfiler pipeline adapted from the pipeline used by the JUMP-Cell Painting Consortium (https://github.com/broadinstitute/imaging-platform-pipelines/tree/master/JUMP_production).
We assessed the quality of the images with a supervised approach that uses the measurements in the "Image.csv" file from the feature extraction pipeline. To identify biases in cell seeding or staining across the plate, we visualized the cell counts ("Count_RelatedUnfilteredCells") and the median channel intensities of each channel (aggregated per well) as a heatmap formatted to mimic the layout of a 384-well plate and visually inspected them. We also visualized the same features per site as a kernel density estimate plot and rug plot. The plots indicated a skewed distribution in median intensities of the channels with trailing right tails. We, thus, visually inspected images sampled from the trailing right tail of each distribution to determine whether thresholding on median channel intensities was necessary and, if so, a suitable threshold. The data presented in this study did not require thresholding on the median channel intensities. We also flagged blurry images and images with saturation artefacts by manually setting thresholds on all fluorescent channel measurements with the prefixes "ImageQuality_PowerLogLogSlope" and "ImageQuality_PercentMaximal". We visually inspected flagged images to ensure that they were indeed blurry or contained saturation artefacts. Lastly, we flagged treatments that exhibited an unusually high ratio of nuclei to cells, i.e. the supernumerary nuclei phenotype. Treatments with a ratio within one standard deviation of the ratio for AMG-900 were flagged. AMG-900 is a morphological control that induces the supernumerary nuclei phenotype. All the images and treatments flagged as "failing" our quality checks were recorded in cell line-specific CSV files.
Finally, we assembled the morphological profiles. The individual object CSV files (i.e. "Nuclei.csv", "Cell.csv" and "Cytoplasm.csv") were merged per cell line. Images and treatments flagged during quality control were excluded. The merger of the individual object CSV files assumes each cell has one matching nucleus and cytoplasm object. As such, treatments that induce the supernumerary nuclei phenotype must be excluded. Thereafter, the data was aggregated as the median of all detected objects (i.e. cells) per site. Wells with fewer than three sites and/or 10 cells per site detected in the feature extraction pipeline were excluded. Finally, the cell line-specific profiles were merged, and the cell line context was indicated as "Metadata_Cell".
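The aggregation and filtering step can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the function name `aggregate_profiles`, the metadata column names and the toy feature "Area" are hypothetical, and the exclusion rule is read as "drop sites with fewer than 10 cells, then drop wells left with fewer than three sites", which is one possible interpretation of the criterion above.

```python
import pandas as pd


def aggregate_profiles(cells, feature_cols, min_cells_per_site=10, min_sites_per_well=3):
    """Median-aggregate single-cell rows per site, then filter sparse sites and wells."""
    grouped = cells.groupby(["Metadata_Well", "Metadata_Site"])
    site_profiles = grouped[feature_cols].median()
    site_profiles["n_cells"] = grouped.size()
    # Assumed reading: sites with too few cells are removed first.
    site_profiles = site_profiles[site_profiles["n_cells"] >= min_cells_per_site]
    site_profiles = site_profiles.reset_index()
    # Then wells with too few remaining sites are removed.
    sites_per_well = site_profiles.groupby("Metadata_Well")["Metadata_Site"].transform("nunique")
    return site_profiles[sites_per_well >= min_sites_per_well].reset_index(drop=True)


# Toy single-cell table: well A01 has three sites with 10 cells each,
# well A02 has one site with only 5 cells.
rows = []
for well, counts in {"A01": [10, 10, 10], "A02": [10, 10, 5]}.items():
    for site, n_cells in enumerate(counts, start=1):
        for k in range(n_cells):
            rows.append({"Metadata_Well": well, "Metadata_Site": site, "Area": float(k)})
cells = pd.DataFrame(rows)
profiles = aggregate_profiles(cells, ["Area"])
```

In the toy table, well A02 is removed entirely because only two of its sites pass the cell-count filter.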
Feature selection. We employed two strategies for feature selection, the "global" and the "treatment-centric" feature selection, using a custom Python class (SelectFeatures in post_feature_extraction_modules.py). For both strategies, features that are constant (i.e. they have a median absolute deviation (MAD) of 0) across all treatments or within all DMSO controls in all cell lines are discarded. The remaining features are then selected depending on the strategy employed.
We only carried out global feature selection for RKO WT data, where redundant features are discarded directly. We quantified the MAD of each feature, ordered the features by descending MAD to select features that vary more and removed features with a Pearson's correlation coefficient greater than 0.8. We assumed that features with higher variation could be used to distinguish treatments better than features with lower variation.

The set of global features should, thus, approximate the minimal set of morphological features changed across all treatments.
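A minimal sketch of the global strategy is given below. It is not the SelectFeatures class itself: the function names and the toy profile table are hypothetical, and two details are assumptions, namely that redundancy is judged on the absolute Pearson correlation and that features are accepted greedily in order of descending MAD.

```python
import numpy as np
import pandas as pd


def mad(values):
    """Median absolute deviation of a 1-D array."""
    values = np.asarray(values, dtype=float)
    return np.median(np.abs(values - np.median(values)))


def select_global_features(profiles, corr_cutoff=0.8):
    """Drop constant features, then greedily keep non-redundant ones by descending MAD."""
    mads = profiles.apply(mad, axis=0)
    ordered = mads[mads > 0].sort_values(ascending=False).index
    corr = profiles[ordered].corr(method="pearson").abs()  # assumption: |r| is used
    selected = []
    for feature in ordered:
        # Keep the feature only if it is not too correlated with any kept feature.
        if all(corr.loc[feature, kept] <= corr_cutoff for kept in selected):
            selected.append(feature)
    return selected


# Toy profiles: "b" is a scaled copy of "a", "d" is constant.
toy_profiles = pd.DataFrame({
    "a": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [0.0, 2.0, 4.0, 6.0, 8.0, 10.0],
    "c": [1.5, -0.5, 1.0, -1.0, 0.5, 0.0],
    "d": [7.0, 7.0, 7.0, 7.0, 7.0, 7.0],
})
selected = select_global_features(toy_profiles)
```

Here "b" is kept ahead of "a" because it has the larger MAD, "a" is then discarded as redundant with "b", and the constant feature "d" never enters the ranking.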
Treatment-centric feature selection uses data from all cell lines, where an additional step is carried out before redundant features are discarded. For each treatment, we calculated the correlation between feature measurements and the cell lines used with Kendall's Tau-b coefficient (Scientific Python, https://rdcu.be/b08Wh, version 1.11.1). We treated the cell lines as ordinal data and ranked them as "RKO CRBN KO, RKO CRBN WT and RKO CRBN OE" (in order of increasing CRBN expression). From the Kendall's Tau-b coefficient results, we selected features with a p-value less than 0.05 after Bonferroni correction for multiple-hypothesis testing. For each cell line, the remaining features are put through a vote on redundancy using a similar process as the global feature selection strategy (ordering the features by ascending MAD instead to select features that are more consistent per treatment). Finally, we selected features that are redundant in no more than one cell line. We assume that features that are redundant in more than one cell line are truly redundant, although we could falsely discard non-redundant features in theory. We proceeded with this approach as any redundant features can artificially inflate the corrected U scores used for CRBN-dependency prediction (see Prioritizing Test Compounds by their Likelihood of CRBN-dependent Bioactivity). The final set of treatment-centric features should, thus, approximate the minimal set of features describing the morphology induced by a compound treatment in a CRBN-dependent manner.
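The correlation step of the treatment-centric strategy can be illustrated with `scipy.stats.kendalltau`, whose default variant is tau-b. The sketch below is an assumption-laden example rather than the study's code: the function name, the rank dictionary and the toy measurements are hypothetical, and the Bonferroni correction is assumed to multiply each p-value by the number of features tested for the treatment.

```python
import numpy as np
from scipy.stats import kendalltau

# Ordinal ranks in order of increasing CRBN expression.
CELL_LINE_RANK = {"CRBN KO": 0, "WT": 1, "CRBN OE": 2}


def crbn_correlated_features(values_by_feature, cell_lines, alpha=0.05):
    """Return {feature: tau} for features whose Bonferroni-corrected p-value is < alpha."""
    ranks = np.array([CELL_LINE_RANK[c] for c in cell_lines])
    n_tests = len(values_by_feature)  # assumption: one test per feature
    selected = {}
    for name, values in values_by_feature.items():
        tau, p_value = kendalltau(ranks, values)  # tau-b is the default variant
        if np.isfinite(p_value) and p_value * n_tests < alpha:
            selected[name] = float(tau)
    return selected


# Toy data for one treatment: 8 images per cell line.
cell_lines = ["CRBN KO"] * 8 + ["WT"] * 8 + ["CRBN OE"] * 8
toy_values = {
    # Increases with CRBN level.
    "feature_up": [float(v) for v in list(range(0, 8)) + list(range(10, 18)) + list(range(20, 28))],
    # Same distribution in every cell line.
    "feature_flat": [float(v) for v in list(range(8)) * 3],
}
correlated = crbn_correlated_features(toy_values, cell_lines)
```

Only "feature_up" survives in the toy example, since "feature_flat" has as many concordant as discordant pairs across cell lines.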
The features selected are exported as a CSV file detailing the feature selection strategy used and the respective set of features selected using the strategy.
Two-dimensional UMAP projection of features selected by "global" feature selection for RKO WT. We reduced the features from the 470 selected by the "global" strategy to two UMAP dimensions with umap-learn (https://umap-learn.readthedocs.io/en/latest/#, version 0.5.3) with metric set to "cosine".

Quantification of strength of morphological perturbation.
The strength of morphological perturbation by a compound compared to DMSO controls was calculated using the RKO WT profile and features selected by the "global" strategy (see Feature Selection). We quantified the strength of morphological perturbation as Robust Hellinger Distance following Vuillard et al. 11 We first reduced the features from the 470 selected by the "global" strategy to four UMAP dimensions with UMAP.jl (version 0.1.9), using CosineMetric() and setting min_dist to 2. We then calculated the Robust Hellinger Distance using BioProfiling.jl.
Calculating the likelihood that a compound has CRBN-dependent bioactivity. We used the RKO CRBN KO and RKO CRBN OE profiles and features selected by the "treatment-centric" strategy (see Feature Selection) to calculate an "induction score" (relevant custom Python functions in profile_interpretation_modules.py; see GitHub repository). For each treatment, a set of N features is selected. The induction score is the average absolute robust Z score per treatment-centric feature (x_i) per image, calculated using the following equation:

$$\text{Induction score} = \frac{\sum_{i=1}^{N} |x_i|}{N}$$

As the features selected correlate with CRBN dependency, we hypothesized that compounds that have CRBN-dependent bioactivity should have induction scores greater in the RKO CRBN OE background than in the RKO CRBN KO background. We, thus, compared the induction scores in RKO CRBN OE and RKO CRBN KO using the Mann-Whitney U test (Scientific Python, https://rdcu.be/b08Wh, version 1.11.1). The maximum value of U is the product of the sample sizes of (i.e. number of images in) RKO CRBN OE and RKO CRBN KO data. We, thus, normalized the U scores calculated by this maximum to allow us to compare U scores for different treatments with varying sample sizes:

$$\text{Corrected } U = \frac{U}{n_{\text{OE}} \times n_{\text{KO}}}$$

where n_OE and n_KO are the numbers of images in the RKO CRBN OE and RKO CRBN KO data, respectively. Mathematically, if a compound has a corrected U score > 0.5, the compound has marginally higher induction scores in RKO CRBN OE than in RKO CRBN KO. We considered compounds with corrected U score > 0.5 as potentially CRBN-dependent.
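The induction score and the normalization of U can be sketched with NumPy and `scipy.stats.mannwhitneyu`. This is an illustrative example only: the function names and the toy robust Z scores are hypothetical, the one-sided alternative is an assumption, and the code relies on SciPy (version 1.7 or later) reporting the U statistic of the first sample.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def induction_scores(robust_z):
    """Mean absolute robust Z score over features; one score per image (row)."""
    return np.mean(np.abs(np.asarray(robust_z, dtype=float)), axis=1)


def corrected_u(scores_oe, scores_ko):
    """U of the OE sample divided by its maximum, n_OE * n_KO, plus the p-value."""
    # Assumption: a one-sided test (OE greater than KO) is used here.
    u_stat, p_value = mannwhitneyu(scores_oe, scores_ko, alternative="greater")
    return u_stat / (len(scores_oe) * len(scores_ko)), p_value


# Toy robust Z scores: rows are images, columns are treatment-centric features.
z_oe = [[2.0, -2.0], [3.0, -3.0], [4.0, -4.0]]
z_ko = [[0.5, -0.5], [1.0, -1.0], [0.0, 0.2]]

scores_oe = induction_scores(z_oe)
scores_ko = induction_scores(z_ko)
u_corrected, p_value = corrected_u(scores_oe, scores_ko)
```

A corrected U of 1 means every RKO CRBN OE image scored higher than every RKO CRBN KO image, and a value of 0.5 means neither background tends to score higher.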

Comparing the treatment-centric features of compounds predicted to have CRBN-dependent bioactivity.
We compared the treatment-centric features between all pairs of compounds predicted to have CRBN-dependent bioactivity (number of treatment-centric features ≥ 5 and corrected U scores ≥ 0.7). We first retrieved the set of treatment-centric features selected for the pair of compounds compared. We then retrieved the matrix of Kendall's τ-b correlation coefficients calculated per feature for each compound. In our use case, τ approximates the strength of positive or negative correlation between the robust Z scores of a feature and the CRBN-expression level of cell lines. We finally compared the matrices of τ values between compounds using the cosine similarity metric (Numerical Python, https://numpy.org, version 1.24.3).
We then visualized the cosine similarities calculated on a network graph with NetworkX (https://networkx.org, version 3.1), using the spring_layout() with seed set to 500 and iterations set to 200. Cosine similarities ≥ 0.85 were represented as thick edges with a weight of 3.0, while those < 0.85 were represented as light grey edges with a weight of 0.2. Thick edges were also color-coded. If any compound/node connected by the edge is a CRBN-dependent control, the edge is blue. Otherwise, the edge is black.
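The pairwise comparison of τ values described above reduces to a cosine similarity between two vectors. The sketch below is a minimal illustration: the function name, the compound labels and the τ values are hypothetical, and the feature set compared is assumed to be the union of the treatment-centric features of the two compounds.

```python
import numpy as np


def tau_cosine_similarity(tau_a, tau_b, features):
    """Cosine similarity between two compounds' tau values over a shared feature list."""
    a = np.array([tau_a[f] for f in features], dtype=float)
    b = np.array([tau_b[f] for f in features], dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical tau-b values per feature for three compounds.
tau_cmpd_1 = {"f1": 0.8, "f2": -0.6, "f3": 0.0}
tau_cmpd_2 = {"f1": 0.4, "f2": -0.3, "f3": 0.0}
tau_cmpd_3 = {"f1": -0.8, "f2": 0.6, "f3": 0.0}

# Assumed: compare over the union of the features selected for the pair.
selected_1 = {"f1", "f2"}
selected_2 = {"f2", "f3"}
shared = sorted(selected_1 | selected_2)

same_direction = tau_cosine_similarity(tau_cmpd_1, tau_cmpd_2, shared)
opposite_direction = tau_cosine_similarity(tau_cmpd_1, tau_cmpd_3, shared)
```

Because cosine similarity ignores vector length, two compounds whose features shift in the same direction across CRBN levels score close to 1 even if one of them produces weaker correlations.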

Expression proteomics to identify the degradation target of FL2-14
Sample preparation. Quantitative proteomics was performed for an unbiased, proteome-wide identification of the degradation target(s) of FL2-14 in RKO cells. 40 million RKO cells per condition were treated with selected SuFEx IMiDs for 16 hours in biological duplicates. Cells were harvested via centrifugation, washed three times in ice-cold PBS and snap-frozen in liquid nitrogen. Each washed cell pellet was lysed separately in 500 µL of freshly prepared lysis buffer containing 50 mM HEPES (pH 8.0), 2% SDS, 1 mM PMSF and protease inhibitor cocktail (Sigma-Aldrich). Samples rested at RT for 20 minutes before heating to 99°C for 5 min. After cooling down to RT, DNA was sheared by sonication using a Covaris S2 high performance ultrasonicator. Cell debris was removed by centrifugation at 16,000 × g for 15 min at 20°C. Supernatant was transferred to fresh Eppendorf tubes and protein concentration determined using the BCA protein assay kit (Pierce Biotechnology). Filter-aided sample preparation (FASP) was performed using 30 kDa molecular weight cutoff centrifugal filters (Microcon 30, Ultracel YM-30, Merck Millipore) essentially according to the procedure described by Wisniewski et al. 12
In brief, 100 µg of total protein per sample was reduced by the addition of DTT to a final concentration of 83.3 mM, followed by incubation at 99°C for 5 minutes. After cooling to room temperature, samples were mixed with 200 μL of freshly prepared 8 M urea in 100 mM Tris-HCl (pH 8.5) (UA solution) in the filter unit and centrifuged at 14,000 × g for 15 min at 20°C to remove SDS. Residual SDS was washed out by a second wash step with 200 μL UA solution. Proteins were alkylated with 100 µL of 50 mM iodoacetamide in the dark for 30 min at RT. Thereafter, three washes were performed with 100 μL of UA solution, followed by three washes with 100 μL of 50 mM TEAB buffer (Sigma-Aldrich). Proteolytic digestion was performed using the protease trypsin in a 1:50 ratio overnight at 37°C. Peptides were recovered using 40 μL of 50 mM TEAB buffer followed by 50 μL of 0.5 M NaCl (Sigma-Aldrich). Peptides were desalted using the Thermo Scientific™ Pierce™ Peptide Desalting Spin Columns (Pierce). The TMTpro 18plex Label Reagent Set was used for labeling according to the manufacturer (Pierce). After the labeling reaction was quenched, the samples were pooled, the organic solvent removed in the vacuum concentrator, and the labeled peptides purified by C18 solid phase extraction (SPE).
Tryptic peptides were re-buffered in 10 mM ammonium formate buffer pH 10, shortly before separation by reversed phase (RP) liquid chromatography at pH 10 as described by Gilar et al. 13 Peptides were separated into 96 time-based fractions on a Phenomenex C18 RP column (150 × 2.0 mm Gemini-NX, 3 µm C18 110 Å, Phenomenex, Torrance, CA, USA) using a Dionex Ultimate 3000 series HPLC system fitted with a binary pump delivering solvent at 50 µL/min. Acidified fractions were consolidated into 36 fractions via a concatenated strategy described by Wang et al. 14 After removal of solvent in a vacuum concentrator, samples were reconstituted in 0.1% TFA prior to LC-MS/MS analysis.

Sample analysis (LC-MS/MS).
Mass spectrometry analysis was performed on an Orbitrap Fusion Lumos Tribrid mass spectrometer (ThermoFisher Scientific) coupled to a Dionex Ultimate 3000 RSLCnano system (ThermoFisher Scientific) via a Nanospray Flex Ion Source (ThermoFisher Scientific) interface. Peptides were loaded onto a PepMap 100 C18, 5 μm, 5 × 0.3 mm trap column (ThermoFisher Scientific) at a flow rate of 10 μL/min using 0.1% TFA as loading buffer. After loading, the trap column was switched in-line with an Acclaim PepMap nanoHPLC C18 analytical column with 2.0 µm particle size and a dimension of 75 µm ID × 500 mm (ThermoFisher Scientific, #164942). The column temperature was maintained at 50°C. Mobile phase A consisted of 0.4% formic acid in water, and mobile phase B consisted of 0.4% formic acid in a mixture of 90% acetonitrile and 10% water. Separation was achieved using a multistep gradient over 150 min at a flow rate of 230 nL/min (increase of initial gradient from 6% to 9% solvent B within 1 min, 9% to 30% solvent B within 146 min, 30% to 65% solvent B within 8 min, 65% to 100% solvent B within 1 min and 100% solvent B for 6 min before equilibrating to 6% solvent B for 24 min before the next injection). In the liquid junction setup, electrospray ionization was enabled by applying a voltage of 1.8 kV directly to the liquid being sprayed, and a non-coated silica emitter was used.
The mass spectrometer was operated in a data-dependent acquisition mode (DDA) and used a synchronous precursor selection (SPS) approach. For both MS2 and MS3 levels, we collected a 400-1600 m/z survey scan in the Orbitrap at 120 000 resolution (FTMS1); the AGC target was set to 'standard' and a maximum injection time (IT) of 50 ms was applied. Precursor ions were filtered by charge state (2-5), dynamic exclusion (60 s with a ±10 ppm window), and monoisotopic precursor selection. Precursor ions for data-dependent MSn (ddMSn) analysis were selected using 10 dependent scans (TopN approach). A charge-state filter was used to select precursors for data-dependent scanning. In ddMS2 analysis, spectra were obtained using one charge state per branch (from z = 2 to z = 5) in a dual-pressure linear ion trap (ITMS2). The quadrupole isolation window was set to 0.7 Da and the collision-induced dissociation (CID) fragmentation technique was used at a normalized collision energy of 35%. The normalized AGC target was set to 200% with a maximum IT of 35 ms. During the ddMS3 analyses, precursors were isolated using the SPS waveform and different MS1 isolation windows (1.3 m/z for z = 2, 1.2 m/z for z = 3, 0.8 m/z for z = 4 and 0.7 m/z for z = 5). Target MS2 fragment ions were further fragmented by high-energy collision-induced dissociation (HCD) followed by Orbitrap analysis (FTMS3). The normalized HCD collision energy was set to 45% and the normalized AGC target was set to 300% with a maximum IT of 100 ms. The resolution was set to 50 000 with a defined scanning range of 100 to 500 m/z. Xcalibur Version 4.3.73.11 and Tune 3.4.3072.18 were used to operate the instrument.
Data processing and analysis. Following data acquisition, the acquired raw data files were processed using the Proteome Discoverer v.2.4.1.15 platform, with a TMT18plex quantification method selected. In the processing step, we used the Sequest HT database search engine and the Percolator validation software node to remove false positives with a false discovery rate (FDR) of 1% at the peptide and protein level under stringent conditions. All MSn spectra were searched against the human proteome (canonical, reviewed, 20 304 sequences) with appended known contaminants and streptavidin, with a maximum of two allowable miscleavage sites. The search was performed with full tryptic digestion. Methionine oxidation (+15.994 Da) and protein N-terminal acetylation (+42.011 Da), as well as methionine loss (-131.040 Da) and protein N-terminal acetylation with methionine loss (-89.030 Da) were set as variable modifications, while carbamidomethylation (+57.021 Da) of cysteine residues and tandem mass tag (TMT) 18-plex labeling of peptide N termini and lysine residues (+304.207 Da) were set as fixed modifications. Data were searched with mass tolerances of ±10 ppm and ±0.6 Da for the precursor and fragment ions, respectively. Results were filtered to include peptide spectrum matches (PSMs) with Sequest HT cross-correlation factor (Xcorr) scores of ≥ 1 and high peptide confidence assigned by Percolator. MS2 signal-to-noise (S/N) values of TMTpro reporter ions were used to calculate peptide/protein abundance values. PSMs with precursor isolation interference values of > 70, average TMTpro reporter ion S/N < 10 and SPS Mass Matches < 65% were excluded from quantification. Both unique and razor peptides were used for TMT quantification. Correction of isotopic impurities was applied. Data were normalized to total peptide abundance to correct for experimental bias and scaled "to all average". Protein ratios were directly calculated from the grouped protein abundances using an ANOVA hypothesis test. Adjusted p-values were calculated using the Benjamini-Hochberg method.
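For readers unfamiliar with the Benjamini-Hochberg adjustment mentioned above, the sketch below shows the standard step-up procedure in NumPy. It is an illustration of the method only and not a description of how Proteome Discoverer implements it internally; the function name and the example p-values are hypothetical.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)
    # Scale the k-th smallest p-value by n / k.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(adjusted_sorted, 0.0, 1.0)
    return adjusted


# Hypothetical raw p-values for four proteins.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
```

For these four values the adjusted p-values are 0.02, 0.04, 0.04 and 0.02, so the ranking of proteins is preserved while the values are inflated to control the false discovery rate.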

Prediction of the G-loop degron sequence in GSPT2
We first compiled all CRBN degron sequences with known crystal structures (Table S2). We subsequently discarded 6UML as its degron sequence included an alanine instead of the typical glycine observed in the seventh amino acid residue of other degron sequences, leaving six degron sequences. We then scanned the latest AlphaFold structure of GSPT2 15 for a potential stretch of 10 amino acids that would align best with each degron sequence. All six alignments yielded the same stretch of amino acid residues from position 560 to 569 (LVDKKSGEKS), with RMSD values under 0.5 Å.
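One way such a scan could be implemented is a sliding window with optimal superposition (Kabsch algorithm) at each position. The sketch below is a generic illustration and not the procedure used in this study: it assumes one coordinate per residue (for example Cα atoms), uses 0-based window indices, and runs on synthetic coordinates rather than the AlphaFold model; all function names are hypothetical.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (n, 3) coordinate sets after optimal rigid superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q  # covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against improper rotations (reflections).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = (R @ P.T).T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def best_window(protein_coords, degron_coords):
    """Return (start index, RMSD) of the window that best matches the degron."""
    protein_coords = np.asarray(protein_coords, dtype=float)
    k = len(degron_coords)
    rmsds = [kabsch_rmsd(protein_coords[i:i + k], degron_coords)
             for i in range(len(protein_coords) - k + 1)]
    best = int(np.argmin(rmsds))
    return best, rmsds[best]


# Synthetic chain of 30 residues; the "degron" is a rotated and shifted copy
# of residues 7-16 (0-based), so the scan should recover index 7.
rng = np.random.default_rng(0)
chain = rng.normal(size=(30, 3)).cumsum(axis=0)
angle = np.deg2rad(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
degron = chain[7:17] @ rot_z.T + np.array([5.0, -3.0, 2.0])
start, rmsd = best_window(chain, degron)
```

Superposing before measuring the RMSD makes the comparison independent of where each crystal structure sits in its own coordinate frame.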

Analysis of G-loop mutation
GSPT2 in a pENTR223 vector was obtained from the BCCM/GeneCorner ORFeome (https://bccm.belspo.be/, #81108-A10) and cloned in a lentiviral backbone containing a C-terminal 2xHA-tag by Gateway cloning. The Invitrogen Clonase Gateway LR Clonase II Enzyme Mix (Thermo Fisher, #11-791-020) was used according to the manufacturer's instructions. The construct was amplified in E. coli Stbl3 cells at 30°C overnight and DNA was extracted with the QIAGEN Plasmid Plus Midi Kit (Qiagen, #12943).
The G566N point mutation was introduced in the pENTR-GSPT2 vector using the Q5 site-directed mutagenesis kit (NEB, #E0552S) with forward primer (CAAAAAATCAaacGAAAAAAGTAAGACACGAC) and reverse primer (TCTACCAAGGAGATTAAC) according to the manufacturer's instructions. For the PCR reaction, an initial denaturation step at 98°C was followed by 30 cycles of 10 sec at 98°C, 20 sec at 55°C and 120 sec at 72°C and a final extension of 2 min at 72°C. The PCR product was treated with KLD mix (NEB, #M0554S) according to the manufacturer's instructions. The plasmid was amplified in E. coli Stbl3 cells, the mutation was verified by Sanger sequencing and the modified insert was cloned in the same HA-tag destination vector as wild-type GSPT2 as described above.
For degradation assays, 800 000 RKO WT or CRBN knock-out cells were seeded in 6-well plates 1 day prior to the treatment. Cells were treated with 10 µM of the compounds for 16 hours if not otherwise stated. Cells were harvested and lysed as described above, and 10 µg of total proteome were subjected to western blot analysis as described above.

Table S3: List of compounds used as CRBN-dependent (in blue) and CRBN-independent (in pink) controls in the isogenic CPA and their reported targets. Transcriptomics and essentiality data were retrieved from DepMap Public 23Q2 files (OmicsExpressionProteinCodingGenesTPMLogp1.csv and CRISPRGeneDependency.csv, respectively) for RKO WT (DepMap ID: ACH-000943). Targets with a log2 TPM ≥ 2 were considered as "expressed" and targets with a gene dependency score ≤ -1 were considered as "essential".

Table S4: List of compounds with at least five treatment-centric features and corrected U scores ≥ 0.7, i.e. they are predicted to have CRBN-dependent bioactivity. The compounds colored in blue are CRBN-dependent controls, while the ones in black are CRBN binders. The CRBN binding affinities shown were quantified by the fluorescence polarization-based competition assay (data also shown in Figure S2).

Figure S1: Distribution of proteins in RKO (the cell line background primarily used in this study) classified by expression (presumed from transcription levels) and essentiality. A total of 17798 protein-encoding genes have transcriptomic and essentiality data available publicly from DepMap (https://depmap.org/portal/). RKO expressed slightly over half of the proteins comprising the human proteome (ignoring protein isoforms encoded by the same gene). A tenth of the expressed proteome is essential, while the remainder of the expressed proteome is non-essential. Reflecting a similar trend, out of the 2484 proteins predicted to contain a G-loop (the degron recognized by CRBN) 5 with publicly available transcriptomic and essentiality data, over half are expressed in RKO. A large proportion of these G-loop containing proteins are non-essential and expressed in RKO.

Figure S2: CRBN binding affinities of the library synthesized using the SuFEx protocol reported in this study. The CRBN binding affinities of these compounds were determined with the fluorescence polarization-based competition assay described in the methods section. The values are given in μM and are the average of triplicate measurements.

Figure S3: Assessment of cytotoxicity of CRBN binders in RKO WT using CellTiter-Glo assays (Promega). None of the CRBN binders reduced the normalized CellTiter-Glo signal to 50% (marked in red). This observation indicates that none of the CRBN binders induce cytotoxic effects up to treatment concentrations of 13.5 μM. Only two compounds induce mild cytotoxic effects, with normalized CellTiter-Glo signals slightly under 65.0% at the maximum treatment concentration of 13.5 μM. The normalized CellTiter-Glo signal is the average of duplicate measurements normalized to a positive control (bortezomib-treated cells) and a negative control (DMSO-treated cells).

Figure S4: The number of treatment-centric features can vary greatly across compounds. Some CRBN-independent controls have numbers of treatment-centric features similar to (or even greater than) those of CRBN-dependent controls.

Figure S5: The corrected U score calculated does not correlate with the CRBN binding affinity or morphological perturbation strength of the compound. (A) The CRBN binding affinity of CRBN binders (given in µM, also shown in Figure S2) compared to their corrected U scores. (B) The morphological perturbation strength of all compounds compared to their corrected U scores.

Table S1: Library of secondary amines used for the high throughput synthesis

Table S2: List of crystal structures used for the computational prediction of the G-loop degron sequence in GSPT2