Improving drug discovery using image-based multiparametric analysis of the epigenetic landscape

High-content phenotypic screening has become the approach of choice for drug discovery due to its ability to extract drug-specific multi-layered data. In the field of epigenetics, such screening methods have suffered from a lack of tools sensitive to selective epigenetic perturbations. Here we describe a novel approach, Microscopic Imaging of Epigenetic Landscapes (MIEL), which captures the nuclear staining patterns of epigenetic marks and employs machine learning to accurately distinguish between such patterns. We validated the MIEL platform across multiple cells lines and using dose-response curves, to insure the fidelity and robustness of this approach for high content high throughput drug discovery. Focusing on noncytotoxic glioblastoma treatments, we demonstrated that MIEL can identify and classify epigenetically active drugs. Furthermore, we show MIEL was able to accurately rank candidate drugs by their ability to produce desired epigenetic alterations consistent with increased sensitivity to chemotherapeutic agents or with induction of glioblastoma differentiation.


Introduction
A cell's epigenetic landscape is largely determined by its chromatin organization, the pattern of its DNA, and its histone modifications, all of which confer differential accessibility to areas of the genome and, through direct and indirect regulation of all DNA-related processes, form the basis of the cellular phenotype (1,2). By collecting global information about the epigenetic landscape, for example using ATAC-or histone ChIP-seq, we can derive multilayered information regarding cellular states (3,4). These include stable cell phenotypes such as quiescence, senescence, or cell fate, as well as transient changes such as those induced by cytokines and chemical compounds. However, current methods for collecting such information are not adapted for high-content drug screening. Over the past decade the decreasing cost and remarkable scalability have made high content screening particularly attractive for drug discovery. More recently, novel image analysis coupled with multiparametric analysis and machine learning have significantly impacted our ability to understand and process phenotypic screening outputs (5,6). Despite these advantages, such assays have not been adapted to extract and utilize information for the cellular epigenetic landscape.
While malignant glioblastoma is the most common and lethal brain tumor, current therapeutic options offer little prognostic improvement, so the median survival time has remained virtually unchanged for decades (7)(8)(9). Tumor-propagating cells (TPCs) are a subpopulation of glioblastoma cells with increased tumorigenic capability (10) and are operationally defined as early-passaged (<15) glioblastoma cells propagated in serum-free medium (11). Compared to the bulk of the tumor, TPCs are more resistant to drugs, such as temozolomide (TMZ) and radiation therapy (12,13). This resistance may explain the failure of traditional therapeutic strategies based on cytotoxic drugs targeting glioblastoma. Multiple approaches aimed at reducing or circumventing the resilience of TPCs have been proposed. These include targeting epigenetic enzymes (i.e., enzymes that write, remove, or read DNA and histone modifications) that would increase sensitivity to cytotoxic treatments (14)(15)(16)(17), differentiating TPCs to reduce tumor expansion by decreased cell proliferation, and increasing sensitivity to cytotoxic treatments (18)(19)(20)(21)(22)(23).
Culturing primary GBM cells in serum-containing medium induces their differentiation into cells with drastically reduced tumorigenic potential (24). In addition, Bone Morphogenetic Protein 4 (BMP4) treatment was reported to induce GBM differentiation (25,26), which might be reversible (27) and is contingent on the presence of functional BMP receptors (28). These observations support the potential therapeutic value of small molecules that mimic the differentiation effect of serum and BMPs on TPCs Here we have used the novel high-content screening platform MIEL to profile chromatin organization by using the endogenous patterns of histone modifications present in all eukaryotic cells.
We validated this platform across multipole cell lines using epigenetically active compounds and applied MIEL to realize the mechanism by which BET inhibitors synergize with TMZ treatment. Also relevant to the tumor differentiation paradigm, we demonstrated that MIEL can identify epigenetically active drugs, classify them by molecular function, and accurately rank them by the ability to produce a set of desired epigenetic alterations consistent with inducing glioblastoma differentiation.

Development of the MIEL platform.
We have developed a novel phenotypic screening platform, MIEL, which interrogates the epigenetic landscape at both population and single cell levels using image derived features and machine learning (29). MIEL takes advantage of epigenetic marks such as histone methylation and acetylation, which are always present in eukaryotic nuclei and can be revealed by immunostaining.
MIEL analyzes the immunolabeling patterns of epigenetic marks using conventional image analysis methods for nuclei segmentation, feature extraction, and previously described machine-learning algorithms (30) (Fig. 1a and Methods). Primarily, we utilized four histone modifications: H3K27me3 and H3K9me3, which are associated with condensed (closed) facultative and constitutive heterochromatin, respectively; H3K27ac, associated with transcriptionally active (open) areas of chromatin, especially at promoter and enhancer regions; and H3K4me1, associated with enhancers and other chromatin regions ( Fig. 1a; (31,32)). To focus on the intrinsic pattern of epigenetic marks, we used only textureassociated features (e.g., Haralick's texture features (33), threshold adjacency statistics, and radial features (34)) for multivariate analysis. Previous studies have successfully employed similar features for cell painting techniques combined with multivariate analyses to accurately classify subcellular localization of proteins (34), cellular subpopulations (35), and drug mechanisms of action (30,(36)(37)(38).
We employed three main methods of data visualization and analysis: To visualize similarity between multiple cell populations, we calculated the multivariate centroids for each cell population and the Euclidean distance between all populations. To reduce data dimensionality and present as a 2D scatter plot (termed distance map), we used multidimensional scaling (MDS; Methods and Fig. 1a). To classify multiple cell populations, we employed quadratic discriminant analysis of multivariate centroids, while single cells across cell populations were classified using a Support Vector Machine (SVM; Methods and Fig. 1a).
The most commonly used cellular assays for epigenetic drug discovery are lysis and ELISA, such as AlphaLISA (PerkinElmer). Imaging-based alternatives rely on staining for relevant histone modification and monitoring changes in average fluorescent intensity (39,40). Using MIEL, we screened a library of 222 epigenetically active compounds, many with known targets among epigenetic writers, erasers, or readers (SBP epigenetic library, Supplementary Fig. 1a, b). We focused on MIEL's ability to (1) detect active compounds; (2) group drugs by function and identify off-target effects; (3) be robust across cell lines and drug concentrations; (4) rank active drugs, and derive information regarding drug mechanism of action.

MIEL improves detection of epigenetically active drugs
To determine how well MIEL could detect active compounds and compare them against other intensity-based methods, primary-derived TPCs (GBM2 cell line) were treated with epigenetically active drugs for 24 hours (10 uM, triplicates). Treated cells were immunolabeled for multiple histone modifications expected to exhibit alterations following drug treatment (H3K9me3, H3K27me3, H3K27ac, and H3K4me1). Image analysis, including nuclei segmentation and features extraction, was conducted, as previously described (30) on an Acapella 2.6 (PerkinElmer). Phenotypic profiles were generated for each compound or control-treated (DMSO) treated wells. These are vectors were composed of 1048 (262 features per modification X 4 modifications) texture features derived from the staining of each modified histone modification and representing the average value for each feature across all stained cells in each cell population (drug or DMSO). When treatment reduced cell count to under 50 imaged nuclei per well, the compound was deemed toxic and excluded from analysis.
Following feature normalization by z-score, we calculated the Euclidean distance between vectors of the compounds and DMSO-treated cells. These distances were then normalized (z-score) to the average distance between DMSO replicates and the standard deviation of these distances.
Compounds with a distance z-score of greater than 3 were defined as active (see Methods section).
This analysis identified 122 compounds that induced significant epigenetic changes. Active compounds were not uniformly distributed across all functional drug categories. Rather, we identified 10 categories in which 50% of the drugs were identified as active and nontoxic and 13 categories in which 25% or less fewer of the drugs induced detectable epigenetic alterations following a 24-hour treatment (Fig.   1b).
To compare MIEL with current thresholding methods, we repeated the calculation using mean fluorescence intensity for all histone modifications by normalizing (z-score) each drug against DMSO; active compounds were defined as compounds for which z-scored intensity for at least one of the histone modifications was greater than 3 or less than -3. As a result, we identified 94 active compounds, which were distributed across functional categories similarly to MIEL-identified compounds ( Fig. 1b). For each functional category, the number of compounds identified as active using thresholding was fewer than the number identified using MIEL (Fig. 1b), demonstrating MIEL's increased detection sensitivity over standard thresholding.
To determine the contribution of individual histone modifications, we repeated both MIEL and thresholding analyses individually for each of the 4 modifications. Using MIEL-based analysis, a single modification yielded similar detection rates to the combination of modifications across most functional categories ( Supplementary Fig. 2a). Using intensity-based analysis, individual modifications yielded lower detection rates compared to the combination of modifications and displayed equal or reduced detection rates when compared to MIEL in all categories and modifications ( Supplementary Fig. 2a). Of note, 3 of the 4 modifications used for MIEL analysis showed similar detection rates across most of the functional categories. However, detection rates of modified H3K27me3 were consistently reduced across the most active categories ( Supplementary Fig. 2a) except for EZH1/2 inhibitors, possibly due to the role these enzymes play in regulating this posttranslational modification. To further compare MIEL and thresholding, we estimated the magnitude of epigenetic alterations induced by active compounds.
We calculated the fold increase in distance from DMSO (normalized to average distance between DMSO replicates), as well as the fold change in fluorescence intensity for active compounds in each category. In all categories, MIEL showed an increased effect ( Supplementary Fig. 2b).
These results demonstrate that, across all tested epigenetic modifications, detecting epigenetically active compounds using high content imaging was markedly improved by implementing MIEL compared to current image-based thresholding methods.

MIEL suggests functional groups and identifies the off-target effects
One key advantage of phenotypic profiling methods like MIEL is the ability to classify compounds by function and identify its nonspecific effects by comparing with profiles of well-defined controls. To assess whether MIEL could correctly group compounds by function, we applied discriminant analysis (DA) to all active, nontoxic compounds from categories that had at least 3 such compounds (85 compounds; 7 categories and DMSO). Two replicates from each drug and 38 DMSO replicates were used as a training set for a quadratic DA, using all texture features derived from images of the four histone modifications (features displaying multicollinearity were reduced). The third replicate for each compound, as well as 10 DMSO replicates, were used as a test set to validate the model.
Results showed that MIEL separated multiple categories of epigenetically active drugs with an average accuracy of 91.4% (Fig. 1c, d). Although many of the epigenetically active compounds induced alterations in average fluorescence ( Supplementary Fig. 2b), a DA utilizing intensity measurements from all 4 channels was ineffective at separating the various categories and yielded only 51.6% average accuracy ( Supplementary Fig. 3a). To test whether modification textures of individual histones contained sufficient information to distinguish between the various drug classes, we performed DA using features derived from each modification. Although this degraded MIEL's ability to separate compound subclasses, which affected similar changes in histone modification such as Class I and Pan HDAC inhibitors, MIEL was still able to separate major categories, such as histone phosphorylation and deacetylation ( Supplementary Fig. 3b).
Of note, the compound library used in this study included Pan HDAC inhibitors (HDACi), Class I HDACi, and Class I HDACi, known to also target HDAC6. HDAC inhibitors targeting both Class I and HDAC 6 displayed the same profile as Pan HDAC, and DA showed the two categories to be undistinguishable. This was likely due to the high expression of HDAC Class I and HDAC 6 and low expression of other HDACs in GBM2 glioblastoma line ( Supplementary Fig. 4a, b, c).
Of the 85 compounds tested, 7 (8.2%) were identified as active but were misclassified by MIEL.
One of these was valproic acid, a commonly used anticonvulsant (41), which also functions as a Pan HDAC inhibitor at high concentrations (42). Though valproic acid is expected to inhibit HDACs only at high concentrations (>1.2mM), a short 24-hour treatment induced detectable epigenetic changes even at low concentrations (<30uM). However, when we quantified H3K27ac and H3K27me3 Taken together, we have demonstrated a unique ability of the MIEL approach to identify epigenetically active compounds, to accurately categorize them according to their molecular mechanism of action, and to detect off-target effects of compounds with known mechanism of action.

Unbiased detection of drug concentration effect on cellular epigenetic state.
As drugs vary in potency, predicting the function of unknown drugs relies on generating functional category-specific profiles that remain valid over a range of activity levels. To determine whether MIEL could correctly identify the functional category of drugs with different potencies, we In addition to their on-target effect, the drugs may induce epigenetic alterations through toxicity and stress. To estimate the impact of toxicity on changes to drug-induced profiles and its contribution to drug misclassification across a range of concentrations, we plotted z-scored distance from DMSO (effect size) against z-scored nuclei count (a proxy for cytotoxicity) for GBM2 cells treated at a range of drug concentrations (0.1, 0.3, 1, 3, 10µM). This demonstrated that some compound classes, such as Aurora and JAK inhibitors, induce epigenetic alterations only in concentrations where cell count is significantly reduced, whether through toxicity or direct effect on proliferation (Fig. 2b -dark blue and pink respectively), while other compounds, such as HDAC inhibitors, characteristically have a concentration range where epigenetic alterations are not accompanied by reduced cell counts (Fig. 2b -green and yellow). Interestingly, both SIRT and EZH1/2 ( Fig. 2b -light-blue and red, respectively) inhibitors affected significant epigenetic changes without inducing significant changes in cell count.
These results indicated the MIEL platform is ideally positioned to analyze dose-dependent effects from drug treatment. In particular, our data suggest that low (0.1uM) and high (10uM) concentration of HDAC inhibitors resulted in distinct and separable epigenetic landscapes, suggesting potentially distinct chromatin/gene expression profiles and divergent biological outcomes when using a low vs high concentration of such compounds.

MIEL profiles are coherent across multiple cell lines
Testing candidate drugs in multiple cell lines can help gauge their inclusivity and identify tumor subtypes that do not respond to a specific drug or drug class. To test whether MIEL readouts were coherent across multiple glioblastoma TPCs, we treated 4 cell lines with a subset of drugs from the epigenetic library (57 drugs), derived phenotypic profiles, and calculated their effect size (z-scored Euclidean distance from DMSO replicates. This revealed a significant positive correlation between all 4 cell lines pointing to the similarities in their drug sensitivity profiles and demonstrating the robustness of the MIEL read out (Fig. 2c,d). To assess the ability of MIEL to group compounds by function across multiple cell lines we employed DA to classify DMSO and drug treated TPCs across these 4 GBM lines.
In this way, we could accurately separate cells treated with drugs modulating distinct functions, such as EZH1/2 or SIRT inhibitors (5 and 3 compounds respectively; mean accuracy 100%; Fig. 2e). However, we were unable to separate drug subclasses with similar functions, such as class I and pan HDACs inhibitors (6 and 17 compounds respectively; mean accuracy 76.8%; Fig. 2e). These results demonstrate MIEL's ability to correctly categorize by function drugs with varying degrees of potency across multiple cells lines.
Finally, although individual drug activity correlated well across cells lines, the magnitude of the effect for some classes of drugs was highly correlated to the gene expression levels of the target. For example, SIRT inhibition was significantly more effective in lines showing reduced Sirt1 expression (the main SIRT to deacetylate histone 3; n=4 compounds, p<0.02; Supplementary Fig. 6b, c), and there was a significant inverse correlation between Sirt1 expression and the effect size (R=-0.87; Supplementary   Fig. 6c).
In sum, we documented that the MIEL assay was both sensitive and robust across multiple primary human glioblastoma cells lines, which further underscores its ability to detect the differences in gene expression and to provide a cumulative measurement of the effect of each compound on cellular epigenetic landscape.

MIEL helps uncover the mechanism of BET inhibitors synergy with TMZ and ranks their activity.
MIEL analysis demonstrated that the magnitude of drug-induced profile change, as measured by the distance from DMSO controls, vary between individual drugs within each drug class ( Supplementary Fig. 7a). To test whether these differences are biologically meaningful, we correlated MIEL-based activity of epigenetic drugs which are often designed to work in combination with other treatments (16,17). One common approach is to use epigenetic drugs to sensitize tumor cells to a standard of care in cytotoxic treatment (15,(43)(44)(45), such as radiation and temozolomide (TMZ), which are used to treat glioblastoma. To identify drug classes that sensitize glioblastoma TPCs to cytotoxic therapy, GBM2 cells were treated with epigenetic drugs for 2 days prior to radiation or TMZ. Cytotoxic treatment was carried out for 4 days at levels that induced a 50% reduction in cell numbers (1Gy radiation or 200uM TMZ; Fig. 3a). At the end of the treatment (day 6), cells were counted, and a combined drug index (CDI) was calculated (see Methods). Though we did not identify any drugs that synergized (CDI<0.7) with the radiation therapy ( PARPi have been extensively studied in this context and have been shown to function through multiple nonepigenetic mechanisms such as PARP trapping (46)(47)(48). Consistent with this, most PARPi did not induce detectable epigenetic changes using MIEL ( Only a few BETi-induced epigenetic changes occurred during our 24-hour initial screening (Fig. 1b).
However, following 6 days of treatment, 6 of 11 BETi induced significant (average z-scored distance from DMSO replicates >3) epigenetic changes in all cell lines (Fig. 3d, Supplementary Fig. 7b). In lines displaying TMZ and BETi synergy, the degree of BETi activity, as measured by MIEL, significantly correlated with the degree of synergism ( Fig. 3d -top panel). This demonstrated that for individual compounds, MIEL can predict relative drug activity and suggests an epigenetic component for the mechanism of BETi-TMZ synergy. O 6 -alkylguanine DNA alkyltransferase (MGMT), which provides the main line of defense against DNA alkylating agents such as TMZ, has been found to be epigenetically silenced through DNA methylation in a large fraction of glioblastoma tumors (50,51). To gain a better understanding of the mechanism by which BETi sensitizes glioblastoma TPCs to TMZ treatment, we quantified MGMT expression in the 6 lines tested using qPCR. Analysis showed that while all lines expressed similar BET-TF levels, such as Brd2 (Fig. 3e), and were thus susceptible to BET inhibitors, only 3 lines displaying BETi-TMZ synergy expressed MGMT (Fig.3e). Yet after treating those 3 lines with BETi, MGMT expression was dramatically reduced (Fig. 3f). Finally, after combining BET inhibitors with the MGMT inhibitor Lomeguatrib, we detected no increase in sensitivity to TMZ above the levels conferred by Lomeguatrib alone (Fig. 3g).
In sum, we have discovered that several BETi synergized with TMZ treatment by reducing MGMT expression. We determined that the degree of synergism displayed by individual BETi positively correlated with the magnitude of their epigenetic effect as measured using MIEL, suggesting that their mechanism of action involves epigenetic change. In contrast, the activity of PARP inhibitors didn't correlate with MIEL distance, suggesting an alternative mechanism of action unrelated to epigenetic changes.

MIEL discriminates between multiple cell fates.
To determine whether MIEL could discriminate between different cell fates we analyzed 3 cell types: primary human fibroblasts, induced pluripotent stem cells (iPSCs) derived from these fibroblasts, and neural progenitor cells (NPCs) differentiated from the iPSCs. The fibroblasts were isolated from 3 unrelated donors (WT-61, WT-101, WT-126) and used to obtain corresponding iPSC and NPC lines. Classification of the test set indicated a high degree of separation between the different fates at a single cell level ( Supplementary Fig. 8b, d). Additionally, MIEL analysis (using only H3K9me3) was able to discriminate between the main lineages of primary hematopoietic cell types freshly isolated from mouse bone marrow, namely lymphoid, myeloid, and stem/progenitors ( Supplementary Fig. 9). However, closely related cell types in each lineage such as hematopoietic stem and progenitor cells were not readily separated ( Supplementary Fig. 9).
These results underscore MIEL's ability to discriminate multiple different cell types and differentiation states uniquely based on their single-cell epigenetic landscapes both in cultured and primary cells of human and mouse origin.

MIEL determines the signatures of glioblastoma stem cells and differentiated glioblastoma.
Most epigenetic drugs are known to directly affect the level histone and DNA modifications, which are the substrates MIEL assay. To test whether MIEL is capable to identify and classify drugs that affect epigenetic landscape indirectly, we focused on glioblastoma differentiation paradigm.
Although such approach was proposed by several groups (24)(25)(26), identification of small molecule inducers of glioblastoma differentiation has been challenging. Previous attempts to design screening strategies for this purpose have met with multiple difficulties. One critical problem is the lack of informative markers faithfully reporting GBM differentiation that could be used for high-throughput screening (10). Therefore, we tested the utility of MIEL platform to screen for drugs inducing glioblastoma TPCs differentiation.
We tested MIEL's ability to distinguish TPCs and differentiated glioma cells (DGCs), derived from primary human glioblastomas (10). Three TPC/DGC pairs were derived in parallel from 3 genetically distinct glioblastoma tumor samples (MGG4, MGG6, and MGG8) over a 3-month period using either serum-free FGF/EGF for TPCs or 10% serum for DGCs (10). Visualization using a distance map demonstrated that TPCs and DGCs segregate to form two visually distinct territories ( Supplementary Fig. 8g) and were separated with high accuracy using DA (mean accuracy 100%; Fig.   4d). SVM-based pairwise classification of single cells distinguished TPCs from their corresponding DGC lines with an average accuracy of 83%, using any of the 4 epigenetic marks tested (H3K27me3, H3K9me3, H3K27ac, and H3K4me1; Fig. 4e). An SVM classifier derived from images of the MGG4 TPC/DGC pair separated all 3 TPC/DGC pairs with 88% average accuracy, providing proof of principle for the derivation of a signature for nontumorigenic cells obtained following serum differentiation of primary glioblastoma cells (Fig. 4f).
These findings suggest that MIEL can readily distinguish undifferentiated TPCs from differentiated DGCs based on multiparametric signatures of these glioblastoma cells using only the patterns of universal epigenetic marks. Of note, until now such signatures could only be obtained using simultaneous assessment of dozens of transcripts by averaging thousands of cells (10, 52).

Short-term treatment with serum or Bmp4 initiates TPC differentiation.
For the purpose of establishing a screening protocol, we tested whether short serum or Bmp4 treatment is sufficient to induce a differentiation-like phenotype in TPCs. We treated several glioblastoma cell lines for 3 days with either serum or Bmp4, then quantified expression of core transcription factors previously shown to determine the TPC transcriptomic program (10).
Immunostaining revealed that the 4 transcription factors Sox2, Sall2, Brn2 and Olig2 were downregulated by both serum and Bmp4 in a cell line-dependent manner (Supplementary Fig. 10a).
RNAseq analysis of serum-and Bmp4-treated GBM2 cells revealed that 3 days of treatment reduced (vs untreated cells) expression of most genes previously found to constitute the transcriptomic stemness signature (52) (Supplementary Fig. 10b). Additionally, both serum and Bmp4 were found to attenuate TCP growth rate ( Supplementary Fig. 10c). To identify the cellular processes altered by these treatments, we conducted differential expression analysis. Expression of 4852 genes was significantly altered (p<0.01 and -1.5<Fold Change >1.5) by either serum or Bmp4. Gene Ontology (GO) analysis of these altered genes indicated enrichment in multiple GO categories consistent with initiation of TPC differentiation; these include cell cycle, cellular morphogenesis associated with differentiation, differentiation in neuronal lineages, histone modification, and chromatin organization ( Supplementary   Fig. 11).
These results demonstrate that a 3-day treatment with either serum or Bmp4 is sufficient to induce transcriptomic changes characteristic of TPC differentiation. Previous work indicated distinct features of glioblastoma differentiation induced with BMP compared to serum (27). Indeed, we observed distinct expression changes, including differences in expression of genes regulating chromatin organization and histone modifications ( Supplementary Fig. 12a, b), between serum-and Bmp4-induced glioblastoma differentiation.

MIEL detects epigenetic changes following short-term serum or Bmp4 treatment.
We treated 4 genetically distinct glioblastoma lines with serum or BMP4, then conducted MIEL analysis using either H3K9me3 and H3K4me1 or H3K27ac and H3K27me3 to detect TPC differentiation. Discriminant analysis allowed high accuracy separation of these treatments across all cell lines using both histone modification combinations (mean accuracy 100%; Fig. 4h; Supplementary   Fig. 12c).
The global gene expression profile represents a gold standard for defining the cellular state (53). To test whether MIEL reliably reports the epigenetic changes associated with serum and Bmp4 treatments we conducted a correlation between MIEL-based and global gene expression-based metrics. We sequenced untreated and 3 days serum-or Bmp4-treated GBM2 TPCs. All genes with FPKM>1 in at least one cell population were used to calculate the Euclidean distance matrix between all cell populations. FPKM-based distances were then correlated to image texture feature-based distances. The resulting Pearson correlation coefficient of R=0.93 (p<0.001) suggests a high correlation between these two metrics (Fig. 4j, k), demonstrating that MIEL is capable of distinguishing closely related glioblastoma differentiation routes induced by serum or BMP and validating the robustness of the MIEL approach for analyzing glioblastoma differentiation.

MIEL successfully prioritizes small molecules inducing TPCs differentiation.
We screened the Prestwick compound library (~1200 compounds) using MIEL to identify compounds inducing glioblastoma TPC differentiation based on the differentiation signatures obtained with serum/Bmp4 treatments. GBM2 TPCs were treated for 3 days with Prestwick compounds at 3 µM fixed, then immunolabeled for H3K27ac and H3K27me3. GBM2 cells treated with DMSO, serum, BMP4, or compound were compared within the same plate (to avoid imaging artifacts and normalization issues).
To identify epigenetically active compounds, we calculated the Euclidean distance to the DMSO center for each DMSO replicate and Prestwick compound. Distances were z-scored, and active compounds were defined as compounds for which z-scored distance was greater than 3. Compounds with less than 50 cells imaged were considered toxic and excluded from analysis. Following analysis, MIEL identified 144 active compounds. To identify compounds inducing epigenetic changes reminiscent of serum-BMP4-induced differentiation, we used quadratic DA to build a model separating untreated, serum-treated, and Bmp4-treated cells and classified all 144 active compounds to these categories ( Fig. 5a,b). A total 31 compounds were classified as similar to either serum or Bmp4 (i.e., differentiated). Of these, 20 compounds belonged to 1 of the following 4 categories: Na/K-ATPase inhibitors of the digoxin family, molecules that disrupt microtubule formation or stability, topoisomerase inhibitors, or nucleotide analogues that disrupt DNA synthesis (Fig. 5b). To further narrow down the list of candidates, we conducted pairwise SVM classification of DMSO-and either serum-or BMP4-treated cells, then selected compounds that induced at least 50% of the cells to be classified as either serumor BMP4-treated. We then calculated the Euclidean distance between candidate compounds and serum-and BMP4-treated cells; we selected compounds where the distance to one or both treatments was less than the distance between DMSO and that treatment. Of the 20 candidate compounds identified, 15 belonged to 1 of the 4 categories mentioned above (Supplementary Fig. 13a). Because most of these compounds are known for their cytotoxic effects, we verified the growth rates of drug-treated glioblastoma cells. With the exception of digoxin, which was cytostatic, drug treatment resulted in growth rates comparable with those induced by serum or BMP4 (supplementary Fig. 14a).
We used immunofluorescence to test for expression of core TPC transcription factors (Sox2, Sall2, Brn2 and Olig2). Except for trifluridine, all compounds induced statistically significant reductions in Sox2; digoxin and digitoxigenin also induced a significant reduction of Sall2 and Brn2; olig2 expression was unaltered by any treatment (Supplementary Fig. 14b).
Next, we investigated whether the compounds identified using MIEL can induce transcriptomic changes similar to serum and Bmp4 treatment and quantified the ability of MIEL to predict compounds best at mimicking these treatments. GBM2 cells were treated with DMSO, serum, Bmp4, or each of the eight candidate compounds; after 3 days, RNA was extracted and sequenced. Transcriptomic profiles of the eight compounds were ranked according to average Euclidean distance (based on FPKM values for all expressed genes) from serum-or BMP4-treated cells. To safeguard against potential artefacts of cytotoxicity, we compared gene expression-based ranking with measured cellular growth rates from drug treatments and found no positive correlation ( Supplementary Fig. 14c). When we next compared Sox2 expression levels under all treatment conditions to determine whether the transcription factor can identify drugs that best mimic serum or BMP4, we found no positive correlation between either expression levels or transcriptomic-based rankings ( Supplementary Fig. 14d in contrast, the lowest-ranking candidate, digoxin, induced changes in gene expression, which were rather different from serum and BMP4 (Fig. 5d).
Taken together, the above results suggest unique ability of MIEL to identify molecules that shift epigenetic signature of glioblastoma TPCs towards DGCs. Critically, MIEL is capable of ranking such molecules according to their change-inducing potency and that ranking robustly correlate with global expression-based readouts of glioblastoma differentiation.

Discussion
Here we have introduced MIEL, a novel method that expands phenotypic profiling to take advantage of universal biomarkers present in all eukaryotic cells, histone modification, and exploits the patterns of chromatin organization and epigenetic marks. The pipeline we developed employs information extracted from immunofluorescence images of specific histone modifications and is geared towards drug discovery and high-throughput screening. Focusing on compounds that modulate epigenetic writers, erasers, and readers, we have shown that MIEL markedly improves detection compared to conventional intensity-based thresholding approaches and enables functional categorization of such compounds. We have demonstrated that MIEL readouts are coherent across multiple compound concentrations and cell lines and can provide information regarding drug activity levels and their mechanism of action. We have also documented MIEL ability to robustly report cellular fate and provide proof of concept for identifying and prioritizing drugs inducing differentiation of glioblastoma TPCs.
Previous studies have demonstrated that image-based profiling can distinguish between classes of compounds with very distinct functions, such as Aurora and HADC inhibitors (5). One objective of our study was to estimate the resolution of separation between categories of compounds with similar functions. We found that a single histone modification was sufficient to separate highly distinct classes .
Separating similar classes (e.g., Aurora and JAK inhibitors, which affect histone phosphorylation, or Pan and Class I HADCs, which affect histone acetylation) required staining for at least one additional histone modification. Despite their many advantages, cellular assays, including high-content assays, are often used as secondary screens for epigenetic drugs due to multiplicity of enzyme family members and an inability to determine direct enzymatic activity (54). Consequently, MIEL's ability to separate closely related functional categories on top of other advantages make this profiling approach an attractive alternative for primary screens.
Phenotypic profiling methods have been previously used to identify genotype-specific drug responses by comparing profiles across multiple isogenic lines (55). Here we show that biologic activity (i.e., serum and Bmp4) that induces glioblastoma differentiation, as well as that of 57 epigenetic compounds, was significantly correlated across four different primary glioblastoma lines. We also showed that variation in activity levels correlated with target expression levels and that the various categories can be distinguished across cell lines. Together, these suggest that MIEL could be used to identify cell lines showing an aberrant reaction to selected drugs and, therefore, aid in identifying optimal treatments for individual patients. Similar applications have previously been used to tailor specific kinase inhibitors to patients with chronic lymphocytic leukemia (CLL) who display venetoclax resistance (56).
Given the limited success of cytotoxic drugs to treat glioblastoma, we focused on alternative approaches: (1) epigenetic drugs aimed at sensitizing glioblastoma TPCs to such treatments, and (2) inducing glioblastoma differentiation. We have demonstrated MIEL's ability to rank candidate drug activity to correctly predict the best candidates for achieving the desired effect. The importance of this is highlighted in large (hundreds of thousands of compounds) chemical library screens, which typically identify many possible hits needing to be reduced and confirmed in secondary screens (57,58).

Our results uncovered a strong correlation between BET inhibitor activity (measured by MIEL)
and its ability to synergize with TMZ and reveal a previously unknown role for BET inhibitors in reducing MGMT expression. Previous studies have demonstrated upregulation of several BET transcription factors in glioblastomas (59,60), and multiple pre-clinical studies have investigated the potential of BET inhibition as a single drug treatment for glioblastoma (61)(62)(63). However, while clinical trials with the BET inhibitor OTX015 demonstrated low toxicity at doses achieving biologically active levels, no detectable clinical benefits were found (64). This prompted approaches using drug combinatorial treatments (65) such as combining HDACi and BETi (66,67). However, the mechanism by which BETi induces increased TMZ has not been described. Recently, a distal enhancer regulating MGMT expression was identified (68). Activation of this enhancer by targeting a Cas9-p300 fusion to its genomic locus increased MGMT expression while deletion of this enhancer reduced MGMT expression (68). As BET transcription factors bind elevated H3K27ac levels found in enhancers (69,70), this may be a possible mechanism for BETi-induced reduction of MGMT expression, which in turn result in increased sensitivity to the DNA alkylating agent TMZ.
Silencing the MGMT gene through promoter methylation has long been known to make TMZ treatment more responsive and to improve prognosis in patients with glioblastoma (50,51,71). Yet, clinical trials that combine TMZ and MGMT inhibitors have not improved therapeutic outcomes in such patients, possibly due to the 50% reduction in dose of TMZ, which is required to avoid hematologic toxicity (72)(73)(74). Thus, BETi offers an attractive line of research, though further studies are needed to determine whether the elevated sensitivity of glioblastoma to BETi, and its ability to reduce MGMT expression, thus synergizing with TMZ, could be exploited to improve patient outcome.
Based on our success with identifying the mechanism of BETi action, we believe that MIEL approach is well positioned to systematically analyze and identify epigenetically active compounds, then provide critical initial information for their mechanism of action.
We previously analyzed serum and BMP4, two established biologicals known to induce glioblastoma differentiation in culture (24)(25)(26), and established signatures of the differentiated glioblastoma cells based on the pattern of epigenetic marks that could be applied across several genetic backgrounds. This is the first time that a signature for glioblastoma differentiation suitable for high-throughput drug screening has been obtained. Indeed, results of previous studies using bulk glioblastoma analysis (27) or single-cell sequencing (52) could not be readily applied for highthroughput screening. As a proof of principle, we analyzed the Prestwick chemical library of 1200 approved drugs to validate MIEL's ability to select and prioritize small molecules, which mimic the effect of serum and BMP4, using global gene expression profiling. Surprisingly, we observed that the degree of reduction in endogenous SOX2 protein levels following drug treatment did not correlate with the degree of differentiation assessed by global gene expression; in contrast, MIEL-based metrics did correlate. This result, taken together with MIEL's ability to distinguish multiple cells types (iPSCs, NPCs, fibroblasts, hematopoietic lineages) across several genetic backgrounds, suggests that the MIEL approach does not only readily identify compounds by inducing desired changes in cell fate but, specifically, can be a cost-effective tool for prioritizing hundreds of thousands of compounds during the primary screenings.
By tapping into the wealth of information contained within the cellular epigenetic landscape through modern high-content profiling and machine-learning techniques, the MIEL approach represents a valuable tool for high-throughput analytical and drug discovery and is especially relevant when the desired cellular outcome cannot be readily defined using conventional approaches.

Acknowledgments
We are thankful to Harley Kornblum (UCLA) for sharing multiple primary human glioblastoma lines,       profile-based ranking and growth rates for untreated, serum-treated, Bmp4-treated, or 8-drugs-treated GBM2 cells. Euclidean distance to serum-or Bmp4-treated GBM2 cells was calculated using transcriptomic profiles (vertical axis), or growth rate after 72 hours treatment with immunofluorescence intensity (horizontal axis). Distances and growth rates were normalized to untreated and serum-or Bmp4-treated GBM2 cells. R denotes Pearson correlation coefficient. (D) Scatter plot showing the correlation of gene expression profile-based ranking and Sox2 expression for 8 candidate drugs, untreated, serum-or Bmp4-treated GBM2 cells. Euclidean distance to serum-or Bmp4-treated GBM2 cells was calculated using transcriptomic profiles (vertical axis), or Sox2 immunofluorescence intensity (horizontal axis). Distances and Sox2 levels were normalized to untreated and serum-Bmp4-treated GBM2 cells.
Immunofluorescence: Cells were rinsed with PBS and fixed in 4% paraformaldehyde in PBS for 10 min at room temperature. After blocking with PBSAT (2% BSA and 0.5% Triton X-100 in PBS) for 1 hour at room temperature, the cells were incubated overnight at 4°C with primary antibodies diluted in PBSAT. Primary antibodies are listed in Table 1, and the appropriate fluorochrome-conjugated secondary antibodies were used at 1:500 dilution. Nuclear co-staining was performed by incubating cells with either Hoechst-33342 or DAPI nuclear dyes.
Microscopy and image analysis: For MIEL analysis, cells were imaged on either an Opera QEHS high-content screening system (PerkinElmer) using ×40 water immersion objectives or an IC200-KIC (Vala Sciences) using a ×20 objective. Images collected were analyzed using Acapella 2.6 (PerkinElmer). At least 40 fields/well for Opera and 5 fields/well for IC200 were acquired and at least 2 wells per population were used. Features of nuclear morphology, fluorescence intensity inter-channel co-localization, and texture features (Image moments, Haralick, Threshold Adjacency Statistics) were calculated using custom algorithms (scripts available from www.andrewslab.ca). A full list of the features used is available from the authors. Values for each cell were generated and exported to Microsoft Excel or MATLAB for further analysis. For Sall2, Olig2, Brn2, Sox2, Oct4, and GFAP immunostaining, images were captured on an IC200-KIC (Vala Sciences) using a ×20 objective.
Between 3 and 8 fields per well were acquired and analyzed using Acapella 2.6 (PerkinElmer). For all nuclear markers, average intensities in nucleus or fold change compared to untreated cells are shown.
Unless stated otherwise, at least 3 wells and a minimum of 300 cells for each condition were compared using the unpaired two-tailed t-test. Discriminant Analysis: Quadratic discriminant analysis was conducted using the Excel add-on program xlstat (Base, v19.06). The model was generated in a stepwise (forward) approach using default parameters. All features derived from images of tested histone modification were used for analysis following normalization by z-score. Features displaying multicollinearity were reduced. Model training was done using multiple DMSO replicates and at least 2 replicates from each cell-line or drug treatment. The model was tested on at least 8 DMSO replicates and at least 1 replicate from each cell line or treatment.
SVM classification: SVM classification was conducted as previously described (30). Cell-level data in total populations (minimum 400 cells per population) were normalized to z-scores, and a subset of cells from each population being classified was randomly chosen as the training set (subset size at least 100× the population number bei ng classified). The training set was used for a SVM classifier (MATLAB svmtrain function). The remaining cells (test set) were then classified using the SVM-derived classifier to assess the accuracy of classification (MATLAB svmclassify function). Here, the accuracy of all pairwise classifications was given as the average accuracy calculated for each population. To classify the similarity of multiple cell populations, we classified known populations (e.g., different treatments or cell fates) to generate known bins and then used the same classifiers on the unknown population to categorize each cell.
Epigenetic Drug Screening: GBM2 cells were plated at 4000 cells/well and exposed to epigenetic compounds ( Table 2) at 10 µM for 1 day in 384-well optical bottom assay plates (PerkinElmer).
Negative control was DMSO (0.1%), 48 DMSO replicates per plate, 3 technical replicates (wells) were treated per compound. Cells were fixed and stained with histone modification-specific antibodies (H3K27ac & H3K27me3, H3K9me3, H3K4me1) and AlexaFluor-488-or AlexaFluor-555-conjugated secondary antibodies. DNA was stained with DAPI followed by imaging and feature extraction. To compare data from multiple plates, average feature values in each plate were normalized to DMSO.
Here, feature values of all DMSO replicates center vectors in each plate, then were used to calculate the plate-wise DMSO vector. Raw feature values for all center vectors of all populations in each plate were normalized to the plate-wise DMSO vector; normalized feature values were z-scored as above. To identify active compounds, activity level for each compound was calculated as above, and active compounds were defined as compounds for which activity z-score was >3. Compounds reducing the number of imaged cells per well below 50 were considered toxic and excluded from analysis.
Concentration Curves: GBM2 cells were plated and stained as above. For each compound (Table 3), cells were treated at 0.1, 0.3, 1.0, 3.0, 10.0 uM. Activity levels were calculated as above. Average cell count was calculated across the replicates for each compound to compare epigenetic changes and toxicity. Cell counts were z-scored against the average and standard deviation of all DMSO replicates.
Distances (z-scored) and cell counts (z-scored) were averaged for each functional class at each concentration.
RNAseq and transcriptomic analysis: Total RNA was isolated from GBM2 cells using the RNeasy Kit (Qiagen), 0.25 ug total RNA was used to isolate mRNAs and for library preparation. Library preparation and sequencing were conducted by the SBP genomics core (Sanford-Burnham NCI Cancer Center Support Grant P30 CA030199). PolyA RNA was isolated using the NEBNext® Poly(A) mRNA Magnetic Isolation Module, and barcoded libraries were made using the NEBNext® Ultra II™ Directional RNA Library Prep Kit for Illumina®(NEB, Ipswich MA). Libraries were pooled and single-end sequenced (1X75) on the Illumina NextSeq 500 using the High-Output V2 kit (Illumina). Read data, processed in BaseSpace (basespace.illumina.com), were aligned to Homo sapiens genome (hg19) using STAR aligner (https://code.google.com/p/rna-star/) with default settings. Differential transcript expression was determined using the Cufflinks Cuffdiff package (https://github.com/cole-trapnell-lab/cufflinks).  Table 4. Activity level was calculated as above. Pearson coefficient and significance of correlation for activity levels in each pair of cell lines were calculated using the Excel add-on program xlstat (Base, v19.06).

Correlation of transcriptomic and image-based profiles:
Euclidean distances were calculated using either transcriptomic data (FPKM) or texture features. Pearson's correlation coefficient (R) was transformed to a t-value using the formula (t = R × SQRT(N-2)/SQRT (1-R2) where N is the number of samples, R is Pearson correlation coefficient; the p-value was calculated using Excel t.dist.2t(t) function. For compound prioritization, Euclidean distance between the compound treated and serum-or Bmp4-treated GBM2 cells was calculated based on either FPKM)or image features. The average distance for both serum and Bmp4 treatments was normalized to the average distance of untreated cells to serum and Bmp4. Sensitization to radiation or TMZ: Cells were plated at 1500 cells/well in 384-well optical bottom assay plates (PerkinElmer). Two sets of the experiment were prepared; DMSO (0.1%) was used for negative controls at 48 DMSO replicates per plate; 3 replicates (wells) were treated per compound.
Compound concentrations used are shown in Table 5. Cells in both sets were pre-treated with epigenetic compounds for 2 days prior to cytotoxic treatment. Cytotoxic treatment, either 200uM temozolomide (TMZ, Sigma) or 1Gy x-ray radiation (RS2000; RAD Source) was carried out for 4 days on single set ('treatment set'); for TMZ treatment, DMSO control was given to the second set. A single radiation dose was given at day 3; TMZ was given twice at days 3 and 5 of the experiment. Cells were fixed, stained with DAPI, and scored using an automated microscope (Celigo; Nexcelom Bioscience).
For each compound, fold change in cell number was calculated for both the "treatment set" (Drug+Cytotox) and the "control set" (Drug), compared to DMSO-treated wells in the control set. The effect of radiation or TMZ alone was calculated as fold reduction of DMSO-treated wells in the treatment set compared to DMSO-treated wells in the control set (Cytotox). The coefficient of drug interaction (CDI) was calculated as (Drug+Cytotox)/ (Drug)X(Cytotox). For conformation experiments, the same regiment and CDI calculations were carried out on SK262, 101A, 217M, 454M, and PBT24 glioblastoma cell lines; PARPi and BETi were used at same concentration as the initial screen on GBM2 (Table 5).
Images were acquired using Perkin Elmer Opera® QEHS. MIEL analysis was conducted as described above.