Identification of the SOX2 Interactome by BioID Reveals EP300 as a Mediator of SOX2-dependent Squamous Differentiation and Lung Squamous Cell Carcinoma Growth

Lung cancer is the leading cause of cancer mortality worldwide, with squamous cell carcinoma (SQCC) being the second most common form. SQCCs are thought to originate in bronchial basal cells through an injury response to smoking, which results in this stem cell population committing to hyperplastic squamous rather than mucinous and ciliated fates. Copy number gains in SOX2 in the region of 3q26–28 occur in 94% of SQCCs, and appear to act both early and late in disease progression by stabilizing the initial squamous injury response in stem cells and promoting growth of invasive carcinoma. Thus, anti-SOX2 targeting strategies could help treat early and/or advanced disease. Because SOX2 itself is not readily druggable, we sought to characterize SOX2 binding partners, with the hope of identifying new strategies to indirectly interfere with SOX2 activity. We now report the first use of proximity-dependent biotin labeling (BioID) to characterize the SOX2 interactome in vivo. We identified 82 high confidence SOX2-interacting partners. An interaction with the coactivator EP300 was subsequently validated in both basal cells and SQCCs, and we demonstrate that EP300 is necessary for SOX2 activity in basal cells, including for induction of the squamous fate. We also report that EP300 copy number gains are common in SQCCs and that growth of lung cancer cell lines with 3q gains, including SQCC cells, is dependent on EP300. Finally, we show that EP300 inhibitors can be combined with other targeted therapeutics to achieve more effective growth suppression. Our work supports the use of BioID to identify interacting protein partners of nondruggable oncoproteins such as SOX2, as an effective strategy to discover biologically relevant, druggable targets.

metaplasia and lower grade dysplasia, which generally do not have 3q copy number gains, frequently regress spontaneously (21-23), whereas high grade dysplasia, which commonly has 3q copy number gains, is less prone to regression and more likely to progress to invasive carcinoma than earlier stages (24-27). Notably, SOX2 and PI3K (PIK3CA), within the 3q amplicon, cooperate to induce a hyperplastic squamous-committed stem cell state in basal cells (28), providing a molecular and stem cell-based explanation for how the 3q amplicon stabilizes the squamous injury response. Because of the importance of SOX2 in this process, inhibition of its activity during early stages of SQCC pathogenesis could slow disease progression. In addition, because growth of many cancers is dependent on the transcription factors that specify their lineage from stem cells (29-35), targeting SOX2 later, in frank carcinoma, may also be beneficial. Accordingly, reduction in SOX2 activity inhibits growth of SQCC cell lines with 3q copy number gains (17,36,37). Furthermore, SOX2 has also been suggested to have roles affecting SQCC malignancy that do not necessarily arise from affecting stem cell fate along the squamous lineage (37,38). Overall, SOX2 is thus a compelling target for treatment of premalignant and invasive disease.
Because SOX2 lacks an obvious small molecule binding domain, it is challenging to design an inhibitor, as compared with e.g. the estrogen and androgen receptors, where antagonists related to their natural ligands have been successfully developed to treat breast and prostate cancer, respectively (39,40). Because PI3K cooperates with SOX2 to drive the squamous injury response in stem cells (28), PI3K inhibitors could theoretically be used in the treatment of SOX2-driven neoplasias. However, in a phase II SQCC clinical trial, the PI3K inhibitor BKM120 was ineffective at its maximum tolerated dose (41). Whereas PI3K inhibitors might be more effective during preneoplasia, their effectiveness in SQCC might be improved by combining them with drugs that more directly target SOX2 activity. Such drugs could inhibit protein-protein interactions or the activity of chromatin modifying enzymes that are essential for SOX2-dependent effects on transcription. For example, small molecules that inhibit the interaction between TP53 and MDM2, or interactions between BET domain-containing proteins and acetyl-lysine residues, are in clinical trials (42) (e.g. GSK525762, https://clinicaltrials.gov). In addition, romidepsin and vorinostat are histone deacetylase inhibitors that are approved for treatment of cutaneous T-cell lymphoma (43), whereas histone methyltransferase inhibitors are in clinical trials for different cancers (e.g. EPZ-5676, https://clinicaltrials.gov).
To rationally develop effective anti-SOX2 targeting strategies, a comprehensive understanding of the SOX2 interactome and its potentially diverse functions in SQCC pathogenesis is required. Although SOX2 interactomes have been described in several previous studies (37, 44-50), no interactors have been functionally characterized in basal cells, and only TP63 has been studied as a SOX2 interactor in SQCCs (37). Furthermore, these studies have utilized standard affinity purification combined with mass spectrometry (AP-MS), a technique that is prone to false negatives for chromatin-associated proteins because of difficulties of solubilizing such polypeptides, especially as part of intact complexes. A recently developed method that circumvents this limitation is the in vivo proximity-dependent biotinylation technique, BioID (51). In this method, the "bait" protein of interest is fused to a mutant E. coli biotin ligase (BirA R118G), which releases activated biotin to covalently label nearby lysine residues. The bait is expressed in living cells, where biotin labeling is performed before lysis. Because of the covalent nature of the biotin label, the integrity of protein complexes need not be maintained post-lysis. Thus, cells can be lysed under harsher buffer conditions to maximize solubilization of chromatin-associated proteins. Labeled proteins are then purified by high affinity streptavidin precipitation and identified by mass spectrometry. This method was originally developed to help identify poorly soluble proteins such as those found in the nuclear lamina (51). BioID has since been used for a wide range of bait proteins localized throughout the cell (52-58), and head to head comparisons with AP-MS for nearly 100 baits have revealed that BioID and AP-MS generally detect complementary interactomes (53, 54, 56-58).

EXPERIMENTAL PROCEDURES
Ethics Approval-Human tracheobronchial tissue for the isolation of basal cells was obtained as surgical waste from lung transplant operations with approval of the University Health Network Research Ethics Board (08-0318-T). Primary patient lung SQCC tissue was also obtained with approval of the University Health Network Research Ethics Board (04-0557-T), and was implanted subcutaneously into NOD/SCID mice following Animal Usage Protocol 603. All animal work was carried out with the approval of the University Health Network Animal Care Committee, was registered and licensed under the province of Ontario's Animals for Research Act, and was compliant with the humane policies and guidelines of the Canadian Council on Animal Care.
Liquid Chromatography-Mass Spectrometry-Liquid chromatography (LC) analytical columns (75 µm inner diameter) and precolumns (150 µm ID) were made in-house from fused silica capillary tubing from InnovaQuartz (Phoenix, AZ), and packed with 100 Å C18-coated silica particles (Magic, Michrom Bioresources, Auburn, CA). LC-MS/MS was conducted using a 120-min reversed-phase buffer gradient running at 250 nl/min (column heated to 40°C) on a Proxeon EASY-nLC pump in-line with a hybrid LTQ-Orbitrap Velos mass spectrometer (ThermoFisher, Waltham, MA). A parent ion scan was performed in the Orbitrap, using a resolving power of 60,000. Simultaneously, up to the twenty most intense peaks were selected for MS/MS (minimum ion count of 1000 for activation) using standard CID fragmentation. Fragment ions were detected in the LTQ. Dynamic exclusion was activated such that MS/MS of the same m/z (within a 10 ppm window, exclusion list size 500) detected three times within 45 s were excluded from analysis for 30 s.
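The dynamic exclusion settings above can be illustrated with a minimal Python sketch. This is not the instrument's firmware logic: the class name, the method name, and the choice to fragment the third repeat before excluding it are assumptions made for illustration. Only the numeric parameters (10 ppm, three repeats, 45 s, 30 s, list size 500) come from the text.

```python
from collections import deque

PPM_WINDOW = 10.0        # m/z match tolerance in ppm (from the text)
REPEAT_COUNT = 3         # detections that trigger exclusion (from the text)
REPEAT_WINDOW_S = 45.0   # window for counting repeats, seconds (from the text)
EXCLUDE_S = 30.0         # exclusion duration, seconds (from the text)
LIST_SIZE = 500          # exclusion list size (from the text)


def within_ppm(mz_a, mz_b, ppm=PPM_WINDOW):
    """True if two m/z values differ by no more than `ppm` parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6 <= ppm


class DynamicExclusion:
    """Hypothetical, simplified model of repeat-count dynamic exclusion."""

    def __init__(self):
        self.history = []                        # (mz, time) of recent selections
        self.excluded = deque(maxlen=LIST_SIZE)  # (mz, expiry time)

    def should_fragment(self, mz, t):
        # Drop exclusions that have expired by time t.
        self.excluded = deque(
            ((m, e) for m, e in self.excluded if e > t), maxlen=LIST_SIZE
        )
        if any(within_ppm(mz, m) for m, _ in self.excluded):
            return False
        # Keep only selections that fall inside the repeat window.
        self.history = [(m, s) for m, s in self.history
                        if t - s <= REPEAT_WINDOW_S]
        self.history.append((mz, t))
        hits = sum(1 for m, _ in self.history if within_ppm(mz, m))
        if hits >= REPEAT_COUNT:
            # Assumption: the third repeat is still fragmented, then excluded.
            self.excluded.append((mz, t + EXCLUDE_S))
        return True
```

In this sketch a precursor selected at 0, 10, and 20 s is excluded from 20 s until 50 s, after which it becomes eligible again.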
Experimental Design and Statistical Rationale-Four BioID runs were conducted on FlagBirA*-SOX2 expressing cells, consisting of two technical replicates (n = 2) from two biological replicates (n = 2; total n = 4). Ten control runs of a BioID analysis conducted on cells expressing the FlagBirA* tag alone were used for comparative purposes. For protein identification, raw files were converted to the .mzXML format using Proteowizard (v3.0.4468) (59), then searched using X!Tandem (v2013.06.15.1) (60) against Human RefSeq Version 45 (containing 36,113 entries). Search parameters specified a parent MS tolerance of 15 ppm and an MS/MS fragment ion tolerance of 0.4 Da, with up to two missed cleavages allowed for trypsin. No fixed modifications were set, but oxidation of methionine was allowed as a variable modification. Data were analyzed using the trans-proteomic pipeline (61) via the ProHits 4.0.0 software suite (62). Proteins identified with a ProteinProphet cut-off of 0.85 (corresponding to ≤1% probabilistic FDR (63)) were analyzed with SAINT Express v. 3.3 (64,65) to identify high-confidence interactors. The ten controls were collapsed to the highest three spectral counts for each hit. Proteins identified with two or more unique peptides and scoring above a Bayesian False Discovery Rate of 1% (corresponding to a SAINT score ≥ 0.75, see (65) for details on BFDR calculation) were considered high-confidence proximity interactors. All data are publicly available and have been uploaded to the MassIVE archive (www.massive.ucsd.edu) with ID: MSV000080175 (SOX2_BioID_Interactome).
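The two filtering rules described above (collapsing the ten controls to their three highest spectral counts, then keeping hits with at least two unique peptides at a BFDR of 1%) can be sketched as follows. This does not reimplement SAINT scoring. The function names and example numbers are hypothetical, and reading "scoring above a BFDR of 1%" as BFDR ≤ 0.01 is an assumption.

```python
def collapse_controls(control_counts, keep=3):
    """Return the `keep` highest control spectral counts for one protein."""
    return sorted(control_counts, reverse=True)[:keep]


def is_high_confidence(unique_peptides, bfdr, min_peptides=2, max_bfdr=0.01):
    """Apply the peptide-count and BFDR thresholds stated in the text.

    Assumption: passing the 1% BFDR threshold means bfdr <= 0.01.
    """
    return unique_peptides >= min_peptides and bfdr <= max_bfdr


# Hypothetical spectral counts for one protein across the ten control runs.
controls = [0, 1, 5, 2, 0, 0, 3, 0, 1, 0]
print(collapse_controls(controls))    # the three largest control counts
print(is_high_confidence(3, 0.0))     # passes both thresholds
print(is_high_confidence(1, 0.0))     # fails the unique-peptide threshold
```

Collapsing to the highest control counts makes the comparison more stringent, because each bait is scored against the worst-case background rather than the average one.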
Primary Human Tracheal Basal Cells and Cell Lines-Normal basal cells were isolated from healthy adult human tracheobronchial tissue, and unless otherwise indicated, cultured on plastic dishes in LHC-9 medium, as previously described (28). Basal cells were used between passage 2 and 3, and included isolates from 3 different donors. For growth on Transwell filters, basal cells were cultured in LHC basal: DMEM (1:1) that was supplemented as described (71), which included 0.33 nM retinoic acid and 5 ng/ml EGF. Media was added to both chambers. ChaGoK1 and SW1573 were obtained from the ATCC, whereas MGH7 cells were isolated by our group (72).
Immunostaining and Proximity Ligation Assay (PLA)-Immunostaining of cultured basal cells and SQCC xenograft tissue was performed using standard methods, as described (28). PDX#77 tissue was cryopreserved in OCT and was sectioned for staining without fixation. PDX#188 tissue was fixed in formalin and paraffin-embedded. Cultured cells were fixed in 4% paraformaldehyde in PBS. PLA was done with the Duolink In Situ Red Starter Kit (Sigma).
Transient Transfection and PLA of MGH7 Cells-MGH7 cells were seeded onto glass coverslips coated with 48 µg/ml PureCol (Advanced BioMatrix, Carlsbad, CA) in 12-well dishes and transfected with the PolyJet transfection reagent (SignaGen). Cells were transfected with 1 µg of the GFP-expressing lentiviral vector pMA1 as a marker for transfected cells and 150 ng of adenoviral E1A expression vector (gifts from Ed Harlow). Three days following transfection, cells were fixed in 4% paraformaldehyde and subjected to PLA.
Lentiviral Infections-Lentiviruses were generated and used to infect cells as previously described (28), with transduction efficiencies typically in the range of 80-90%. To select for shRNA lentiviral infection, cells were treated with puromycin (3 µg/ml) 24 h after infection for 24-48 h.
AlamarBlue Growth Assays-AlamarBlue was purchased from ThermoFisher and used according to the manufacturer's instructions.
Statistical Analyses for Biological Data-Statistical significance was calculated using either 2-tailed t or Fisher's exact tests, as indicated in the figure legends.
Gene Ontology (GO) analysis revealed that 87% of the BioID hits could be classified as transcriptional regulators (Fig. 2, supplemental Table S3), consistent with BirA labeling proteins close to the transcription factor bait in vivo. The interactome largely comprised sequence-specific transcription factors, transcriptional corepressors and coactivators, chromatin modifiers and remodeling proteins, and members of the cohesin complex (Fig. 1). As expected, given the central roles of many of the SOX2-interactors in transcriptional regulation, Gene Set Enrichment Analysis (GSEA) showed enrichment of gene sets associated with multiple transcription factors (supplemental Table S4). However, some gene sets annotated as responses to specific stimuli were also enriched (Fig. 2, supplemental Table S4). These stimuli included anticancer treatments and signaling pathway agonists and receptors, suggesting potential ways of modulating SOX2 activity through extracellular treatments.
When compared with eight other SOX2 interactome studies conducted by AP-MS in neuronal, embryonic stem (ES), and TP63-expressing SQCC cell types (37, 44-50), 46 (56%) of the BioID interactors were identified in at least one other study (Figs. 1 and 2, Table I, supplemental Table S5), supporting the ability of BioID to detect genuine SOX2 interactors. Indeed, these interactors included TP63, which associates with SOX2 and coregulates gene expression in SQCC (37), as well as CHD7, a helicase that coregulates a subset of SOX2 targets in neural stem cells (46). In addition, several of the common interactors have been linked to SOX2 function in ES cells and in the reprogramming of somatic cells into ES cell-like stem cells (iPS cells). These interactors include SMC1A and SMC3, which colocalize with SOX2 at some sites in ES cells while helping maintain pluripotency (82,83); EP300, which colocalizes with SOX2, OCT4, and NANOG in many enhancer regions of ES cells (84,85); and SMARCA4, a core component of the SWI/SNF complex that enhances generation of iPS cells by SOX2 (86).
Although 36 (44%) of the interactors were not previously reported in other studies, several lines of evidence support these interactions as also being genuine and relevant to SOX2 function. For example, JMJD1C, a histone demethylase, coregulates the miR-302 promoter with SOX2 in ES cells and promotes pluripotency (87,88). BCOR, a corepressor found in some Polycomb complexes (89), was also detected. Its mutation is linked to anophthalmia/microphthalmia syndromes that are also caused by SOX2 mutation (90,91), supporting a role for it acting with SOX2 in eye development. In addition, through GSEA we found that 7 novel interactors belong to a larger group of 11 that are direct SOX2 transcriptional targets in ES cells (Fig. 2, supplemental Table S4). Although such targets do not necessarily physically interact with SOX2 to promote SOX2 function, we corroborated this possibility for at least one of the targets using a different method.
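The split between previously reported and novel interactors is a set comparison against the earlier AP-MS studies. The sketch below illustrates it, assuming the interactors and each prior study are available as sets of gene symbols. The function names and the toy gene sets are hypothetical. Only the totals (82 interactors, 46 previously reported, 36 novel) come from the text.

```python
def split_by_prior_reports(interactors, prior_studies):
    """Partition interactors into previously reported and novel sets."""
    previously = {p for p in interactors
                  if any(p in study for study in prior_studies)}
    novel = set(interactors) - previously
    return previously, novel


def percent(part, total):
    """Percentage of `total` represented by `part`, rounded to an integer."""
    return round(100 * part / total)


# Toy example using a few gene symbols named in the text; the grouping of
# symbols into two "studies" is illustrative only.
bioid_hits = {"TP63", "CHD7", "EP300", "JMJD1C", "BCOR", "ARID1B"}
prior = [{"TP63"}, {"CHD7", "EP300"}]
known, novel = split_by_prior_reports(bioid_hits, prior)
print(sorted(known), sorted(novel))

# Reported totals: 46 of 82 previously identified, 36 of 82 novel.
print(percent(46, 82), percent(36, 82))
```

The two reported counts sum to the 82 high-confidence interactors, and the rounded percentages match the 56% and 44% quoted in the text.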
We used the proximity ligation assay (PLA) to validate the SOX2 interaction with ARID1B, a SWI/SNF complex component that is both a novel interactor and a direct SOX2 target in ES cells (92). With this assay, proximity of proteins within 30 nm of each other can be visualized in situ (93). Typically, standard primary antibodies to the proteins of interest are first incubated with fixed cells, followed by binding of specialized secondary antibodies with unique covalently attached DNA strands. Connector oligonucleotides are then ligated to the DNA strands of the secondary antibodies to create a circularized template, which is amplified with a polymerase. Amplified circles are then detected with complementary fluorescent oligonucleotide probes and visualized as discrete foci. We performed PLA on endogenous SOX2 and ARID1B in MGH7 cells, a lung SQCC cell line that has high copy 3q amplification and expresses high levels of SOX2 (72, 94) (supplemental Fig. S1). Nuclear fluorescent foci were observed in many cells and were dependent on the presence of both primary antibodies (Fig. 3). These data support the PLA reaction being dependent on both SOX2 and ARID1B proteins, and are consistent with these proteins being close to each other in vivo. Overall, both the PLA data, as well as the identities of some of the common and novel SOX2 interactors, support BioID as a valid method to detect proteins relevant to SOX2 function.
EP300 is Expressed In Normal Respiratory Basal Cells and In SQCCs-To prioritize BioID hits for study in basal cells and SQCCs, we considered the presence of potentially druggable domains and previous studies with the interactor. EP300 is a BioID SOX2-interactor that acetylates histones and nonhistone proteins and promotes transcriptional activation by a number of transcription factors (95). It has several regions that are involved in protein-protein interactions (Fig. 4A), which include KIX, zinc finger (TAZ, ZZ), and LXXLL (L) domains, as well as a bromodomain (Br) that binds acetyl-lysine residues in other transcription factors and histones. EP300 also has a RING finger that autoregulates acetyltransferase activity (96). This architecture allows EP300 to be sensitive to different types of modulatory compounds including distinct research tools, natural products, and FDA-approved drugs (97-105). Although some previous work has linked EP300 to regulation of SOX2 activity, its inconsistent detection in AP-MS studies and the opposing models for its regulation of SOX2 had not led us to consider it prior to our work. EP300 acetylates SOX2 in vitro (106), but it has only been detected as a SOX2-interactor in a subset of AP-MS studies in ES cells (48,49). In ES cells, EP300 has been proposed to be both an inhibitor (106) and a coactivator, with the coactivator function depending on other pluripotency factors including OCT4 (84,107,108). Because the parental HEK293 cells used in our screen do not express OCT4 under standard culture conditions (109), our reproducible identification of EP300 as a SOX2-interactor prompted us to consider that EP300 might regulate SOX2 activity outside of ES cells, in contexts relevant to SQCC pathogenesis. Indeed, such a role for EP300 could relate to its mutation in cervical SQCCs (110) and to why its expression is associated with poor prognosis in cutaneous, laryngeal, and esophageal SQCCs (111-113), all of which are characterized by varying frequencies of 3q copy number gains (17,114,115). We thus focused on EP300 for further study.

FIG. 1. The SOX2 BioID interactome. SOX2 interactors were classified according to information compiled at GeneCards (www.genecards.org) and literature searches. Interactors uniquely identified in our study are depicted in black, whereas those identified in at least one previous AP-MS study are shown in red (see supplemental Table S5 for details). CPC = chromosomal passenger complex. Node size is proportional to the total number of peptides identified in four mass spectrometry runs.

TABLE I High-confidence SOX2 interactors identified using BioID
BioID was performed on two SOX2-BirA* biological replicates, each divided into two technical replicates for mass spectrometry. Spectral counts from each technical replicate are shown. Proteins scoring above a Bayesian False Discovery Rate of 1% (in this analysis corresponding to a SAINT score > 0.75) were considered high-confidence proximity interactors. Novel interactions (not reported in previous SOX2 AP-MS studies) are highlighted in blue.

EP300 Promotes SOX2 Activity in Basal Cells and SQCC Growth
We first examined EP300 expression in cell types relevant to lung SQCC pathogenesis. EP300 was readily detected by immunohistochemistry in all cells of normal respiratory epithelia, including basal cells (Fig. 4B), which also express SOX2 (28). EP300 was also expressed in ex vivo cultures of purified tracheobronchial basal cells that retain stem cell activity (Fig. 4C) (28,116). In primary patient SQCCs harboring SOX2 amplification that were grown as xenografts (PDXs), EP300 was expressed in most tumor cells, along with SOX2 (Fig. 4D). This expression pattern was similar to that in nonxenografted SQCC tissue, as reported by the Human Protein Atlas, which found EP300 to be expressed at moderate/high levels in five of five primary patient SQCCs (Fig. 4E, three representative images are shown). Thus, both EP300 and SOX2 are expressed in the putative normal stem cells of origin and invasive SQCC disease.
EP300 Is Proximal to SOX2 In Normal Basal Cells and SQCCs-We next used PLA to assess the proximity of SOX2 and EP300 in basal cells and SQCCs. When cultured ex vivo, basal cells enter a SOX2 Lo state that is transiently observed in vivo during recovery from injury (28). Re-expression of physiologic high levels of SOX2 in these cells drives different types of differentiation, depending on the signaling context (28). For example, when subconfluent basal cells are cultured in normal growth media, PI3K cooperates with lentivirally expressed SOX2 to induce a squamous fate (28). Under these conditions, PLA with both α-SOX2 and α-EP300 primary antibodies resulted in more fluorescent foci than single primary antibody controls (Fig. 5), supporting their proximity at a time when basal cells functionally respond to SOX2. Further evidence for the PLA-generated foci reflecting proximity of SOX2 and EP300 was obtained by using basal cells that had been transduced with empty vector instead of Lenti-SOX2. Omission of exogenous SOX2 expression in these cells significantly reduced the number of foci generated by simultaneous presence of both α-SOX2 and α-EP300 primary antibodies (Fig. 5).

FIG. 2. Summary of select characteristics that are significantly enriched in the SOX2 BioID interactome. The number of transcriptional regulators was determined by Gene Ontology (GO) analysis (supplemental Table S3), whereas other characteristics were evaluated by Gene Set Enrichment Analysis (GSEA) (supplemental Table S4).

FIG. 3. Detection of proximity between SOX2 and ARID1B in SQCC cells with 3q amplification. Cytospun MGH7 cells were fixed and subjected to the proximity ligation assay (PLA). Cells were stained with either both α-SOX2 and α-ARID1B primary antibodies, or single antibody controls. Nuclei were visualized with DAPI staining. Scale is 10 µm. Mean percentages of foci-positive nuclei ± S.E. are shown, and were calculated by scoring five fields comprising 239-582 nuclei per field. Significance was calculated using a 2-tailed t test. ***p = 8.9 × 10⁻⁹ and 8.3 × 10⁻⁹ relative to the α-SOX2 alone and α-ARID1B alone primary antibody controls, respectively.
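The PLA figure legends report mean percentages of foci-positive nuclei ± S.E. over five scored fields. A minimal sketch of that summary is shown below, assuming S.E. means the sample standard deviation divided by the square root of the number of fields (the source does not state the formula). The per-field counts are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def percent_positive(positive, total):
    """Percentage of nuclei in one field that contain PLA foci."""
    return 100.0 * positive / total


def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of per-field values."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical counts for five fields: (foci-positive nuclei, total nuclei).
fields = [(150, 250), (180, 300), (200, 400), (330, 500), (290, 580)]
per_field = [percent_positive(pos, tot) for pos, tot in fields]
avg, se = mean_and_se(per_field)
print(f"{avg:.1f} ± {se:.1f}% foci-positive nuclei")
```

Treating each field as one observation, rather than pooling all nuclei, gives five replicates per condition. That is consistent with a t test on five fields, but it is an inference from the legends rather than a stated method.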

We also used PLA to examine the proximity of SOX2 and EP300 in lung cancer cell lines with 3q copy number gains, as well as a primary patient SQCC PDX tissue that we had previously determined to have SOX2 amplification (28). The MGH7 SQCC cell line was described earlier and ChaGoK1 was derived from an undifferentiated bronchogenic carcinoma (117). ChaGoK1 has low level gain at 3q and expresses TP63 and SOX2 (supplemental Fig. S1), common SQCC markers located within the 3q amplicon (5,6,17,18,28,118). In this assay >50% of cells in culture and in SQCC tissue displayed nuclear foci that were strongly dependent on the presence of both α-SOX2 and α-EP300 primary antibodies (Fig. 6). In addition, the number of cells showing nuclear foci, as well as the number of foci per cell, were reduced when MGH7 cells (the SQCC cell line that had the highest levels of SOX2 expression) were transfected with expression vectors for adenoviral E1A oncoproteins (supplemental Fig. S2). Both the 13S and 12S E1A isoforms are well-established to bind EP300 through their common amino termini (119,120), with sequestration of EP300 thought to be a major component of their transforming properties (121). In addition, E1A has been shown to be able to sequester EP300 away from a GAL4-SOX2 chimeric transcription factor (108). Thus, the ability of both the 13S and 12S E1A isoforms to reduce the number of SOX2-EP300 PLA foci provides further evidence that these foci signify a bona fide SOX2-EP300 interaction. Overall, the PLA data are consistent with the proximity between SOX2 and EP300 observed by BioID in HEK293 cells, and by AP-MS and ChIP in ES cells (48,49,84), but provide evidence that their interaction is not limited to ES cell-like contexts and occurs in normal basal and SQCC cells.
EP300 Is Required for SOX2 Activity in Basal Cells-To determine if EP300 affects SOX2 activity in an SQCC-relevant context, we first asked whether an EP300 chemical inhibitor interferes with SOX2's ability to induce a squamous fate in basal cells. The induction of the squamous fate by SOX2 reflects a core SOX2 activity in basal cells and appears to be one of the earliest points of genetic dysregulation in SQCC pathogenesis (28). Basal cells were infected with either control vector or Lenti-SOX2 in the presence or absence of CBP30, a compound that preferentially binds to the bromodomains of EP300 and related CREBBP to inhibit their interactions with acetyl-lysine residues (100-102). After 5 days, squamous lineage marker expression was quantified (Fig. 7). CBP30 strongly suppressed induction of squamous lineage markers by Lenti-SOX2. To determine if the inhibitory effect of CBP30 on SOX2 activity was also manifested on earlier transcriptional responses to SOX2, we examined expression levels of a panel of SOX2 early response genes. After addition of Lenti-SOX2, most basal cells do not express SOX2 protein until ~36 h post-infection (28). At this time point, we previously identified a set of genes that were induced by SOX2 (28). CBP30 strongly suppressed induction of all three tested early response genes (ADH7, KIAA1199, EDN1), as well as a gene whose expression is correlated with SOX2 in SQCC that we also found is a target of SOX2 in basal cells (FOXE1) (17, 28) (Fig. 7). By contrast, PIK3CA and SMARCA4 were not induced by SOX2 and were insensitive to CBP30 (Fig. 7).

FIG. 5. Detection of proximity between SOX2 and EP300 in primary human tracheobronchial basal cells. Second passage basal cells were transduced with Lenti-SOX2 and after 2 days, cytospun, fixed, and subjected to the proximity ligation assay (PLA). Cytospins were stained with either both α-SOX2 and α-EP300 primary antibodies, or single antibody controls. Nuclei were visualized with DAPI staining. Scale is 10 µm. Mean percentages of foci-positive nuclei ± S.E. are shown, and were calculated by scoring five fields and counting 50-100 nuclei per field. Significance was calculated using a 2-tailed t test. **p = 0.00003 and 0.0007 relative to α-SOX2 alone and α-EP300 alone, respectively (in Lenti-SOX2-transduced basal cells), and 0.0009 relative to the combination of α-SOX2 and α-EP300 in empty vector control-transduced cells.
To more specifically interfere with EP300 activity, we used lentiviral shRNA constructs that target EP300, but not the related gene CREBBP. In our first screen of shEP300 constructs, we identified shEP300 #3, which reduced EP300, but not CREBBP protein levels in basal cells (Fig. 8). We later identified another shRNA construct, shEP300 #5, that also reduced EP300, but not CREBBP protein levels in basal cells (Fig. 8). To test the effects of EP300 knockdown on SOX2 activity, the shEP300 constructs or control shluciferase (shluc) were coinfected with Lenti-SOX2 in basal cells (Fig. 9). In agreement with the CBP30 data, both shEP300 constructs suppressed induction of squamous lineage marker and early response genes by SOX2 (Fig. 9A, 9B). Also in accord with the CBP30 results, SOX2-nonresponsive PIK3CA and SMARCA4 expression were not reduced by the shEP300 constructs (Fig. 9A, 9B). As expected, both shEP300 constructs reduced EP300, but not CREBBP mRNA expression (Fig. 9A, 9B). Altogether, these data support EP300 being necessary for several basal cell transcriptional responses to SOX2, as well as their biological differentiation into the squamous lineage.

FIG. 6. Detection of proximity between SOX2 and EP300 in lung cancers with 3q gains. A primary patient-derived SQCC xenograft (PDX) and two lung cancer cell lines (1 definitive SQCC), all harboring SOX2 gains, were subjected to the proximity ligation assay (PLA). Slides were stained with either both α-SOX2 and α-EP300 primary antibodies, or single antibody controls. Nuclei were visualized with DAPI staining. Scale is 10 µm. Mean percentages of foci-positive nuclei ± S.E. are shown, and were calculated by scoring five fields and counting 60-200 nuclei per field. Significance was calculated using a 2-tailed t test. ***p = 0.00007 and 0.00007 relative to α-SOX2 alone and α-EP300 alone, respectively (SQCC PDX); 5.7 × 10⁻⁹ and 7.4 × 10⁻¹¹ relative to α-SOX2 alone and α-EP300 alone, respectively (MGH7); 6.2 × 10⁻⁹ and 9.8 × 10⁻⁹ relative to α-SOX2 alone and α-EP300 alone, respectively (ChaGoK1).

FIG. 8. Identification of lentiviral shRNA constructs that reduce EP300, but not CREBBP expression in basal cells. Primary tracheobronchial basal cells were infected with lentivirus overnight. After 24 h, infected cells were selected with puromycin and protein levels quantified by Western blotting between 48-72 h. Left panel, four different shEP300 constructs were initially tested, which led to identification of shEP300 #3. Right panel, characterization of shEP300 #5. Protein levels were quantified by densitometry, with EP300 and CREBBP expression normalized to PTPN11 (loading control) levels. Normalized EP300 and CREBBP expression was then adjusted relative to control shluciferase (shluc)-transduced cells, which was assigned a value of 100. NS = nonspecific band.
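The Western blot quantification described in the FIG. 8 legend uses a two-step normalization: each band is divided by the PTPN11 loading control, and the result is expressed relative to the shluc control lane set to 100. A minimal sketch is shown below. The function name and band intensities are hypothetical, and no particular densitometry software is implied.

```python
def normalize_to_control(target, loading, control_target, control_loading):
    """Express a band as a percentage of the control lane.

    target, loading: densitometry values for the protein of interest and
        the loading control (PTPN11 in the text) in the sample lane.
    control_target, control_loading: the same two values in the control
        (shluc) lane, which is assigned a value of 100.
    """
    sample_ratio = target / loading
    control_ratio = control_target / control_loading
    return 100.0 * sample_ratio / control_ratio


# Hypothetical intensities (arbitrary units) for one knockdown lane.
ep300 = normalize_to_control(target=200.0, loading=400.0,
                             control_target=100.0, control_loading=100.0)
print(ep300)  # EP300 level as a percentage of the shluc control
```

Dividing by the loading control first corrects for unequal protein loading between lanes, so the final percentage reflects a change in the target protein rather than in the total amount loaded.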

EP300 Copy Number Gains Are Commonly Selected in Lung SQCCs-Copy number gains in SOX2 at 3q26-28 occur in 94% of lung SQCCs (17,18), the highest frequency of all studied cancers (supplemental Fig. S3). Notably, EP300 gains at 22q13 are also common in SQCCs (45% of cases), whereas only 20% of SQCCs display heterozygous loss, and no samples appear to have biallelic inactivation (Fig. 10A) (18). In SQCCs, EP300 copy number gains are correlated with increased EP300 mRNA expression (Fig. 10B), supporting the gains affecting EP300 expression. The pattern of EP300 copy number variation favoring gains over losses was specific to SQCCs as compared with lung adenocarcinomas (ADCs) (18,122). In ADCs, the opposite pattern was observed, with only 13% of samples displaying EP300 gains and 44% of cases showing heterozygous loss (Fig. 10A, 10C). When compared with 19 other major cancers, EP300 copy number gains ranked highest in lung SQCCs and were significantly more enriched than in 18 of the other cancers (supplemental Fig. S4). By contrast, heterozygous EP300 losses were not commonly selected in lung SQCCs, with the SQCC frequency being lower than that of at least 8 other cancers (supplemental Fig. S4). Similarly, EP300 mutations were observed in only 4.5% of lung SQCCs, a frequency that was statistically similar to that of 10 other cancers (supplemental Fig. S5). Together these data suggest that increased EP300 and SOX2 activity are genetically coselected in SQCCs.

FIG. 11. EP300 promotes growth of 3q-amplified MGH7 SQCC cells, but not non-3q-amplified SW1573 lung cancer cells. A, B, Cell lines were transfected overnight with the indicated siRNA pools (siCon = control non-EP300 targeting pool of siRNAs) and at 48 h, EP300 and CREBBP expression was quantified by Western blotting (A) and qRT-PCR (B). A, For Western blotting, protein levels were quantified by densitometry, with EP300 and CREBBP expression normalized to PTPN11 levels (loading control). Normalized EP300 and CREBBP expression was then adjusted relative to siCon-transfected cells, which was assigned a value of 100. NS = nonspecific band. B, For qRT-PCR, EP300 and CREBBP expression was normalized to levels of TBP and plotted relative to siCon, which was assigned a value of 1.0. Means ± S.E. of 3 replicates are shown. Significance was calculated using paired 2-tailed t tests between siCon and siEP300-transfected populations. Only p values ≤ 0.05 are indicated. For MGH7, ***p = 0.0005 and *p = 0.01. For SW1573, **p = 0.001. C, siEP300 inhibits growth of MGH7 cells. Cells were transfected with either an siEP300 or siCon pool in triplicate and growth quantified by alamarBlue after 5-6 days. Data are plotted relative to siCon transfected cells, which was assigned a value of 100. Means ± S.E. are shown. Significance was calculated using a 2-tailed t test between siEP300 and siCon cells. Only p values ≤ 0.05 are indicated. ***p = 4 × 10⁻⁶. D, E, shEP300 lentiviruses reduce EP300, but
EP300 Promotes Growth Of Lung Cancer Cells With 3q Copy Number Gains-The growth of lung and esophageal SQCC cell lines with 3q copy number gains depends on SOX2 expression (17,36,37). To investigate if 3q gains in lung cancer might also be associated with sensitivity to EP300 levels, we examined the effect of EP300 knockdown on cell growth. To this end, we first reduced EP300 expression with a pool of 4 different EP300 siRNAs in MGH7 SQCC cells, which have high level 3q amplification (94). As a comparator, we also reduced EP300 expression in SW1573 lung cancer cells. SW1573 was derived from an alveolar lung cancer (ATCC), and harbors an activating KRAS mutation, but does not have 3q copy number gains (supplemental Fig. S1A). This cell line also expresses significantly lower levels of TP63 and SOX2 mRNA than MGH7 and ChaGoK1 (supplemental Fig.  S1B), and does not express detectable TP63 and SOX2 proteins (supplemental Fig. S1C). Relative to a control non-targeting pool of siRNAs, the siEP300 pool reduced EP300 (but not CREBBP) protein and mRNA expression in both MGH7 and SW1573 cells (Fig. 11A, 11B). However, whereas MGH7 growth was severely impaired by siEP300 treatment, SW1573 growth was not affected (Fig. 11C). To further verify dependence of MGH7 growth on EP300, we also infected MGH7 cells with two shEP300 lentiviruses. Both shEP300 #5 and #3 reduced EP300 (but not CREBBP) protein and mRNA expression in MGH7 cells (Fig. 11D, 11E), and inhibited growth of this cell line (Fig. 11F). We then tested the effect of reducing EP300 expression in ChaGoK1 cells, which have low level 3q gain (supplemental Fig. S1A). Using shEP300 #3, we were able to achieve specific knockdown of EP300 protein and mRNA levels over CREBBP (Fig. 12A). As with MGH7, knockdown of EP300 in ChaGoK1 was associated with growth inhibition (Fig. 12B). Thus, growth of at least some lung can-cers with 3q copy number gains, including SQCCs, is dependent on EP300.
An EP300 Chemical Inhibitor Suppresses Growth of Lung Cancer Cell Lines With 3q Gains and Increases Sensitivity to a PI3K Inhibitor-We next explored whether EP300 chemical inhibitors suppress growth of lung cancer cells with 3q copy number gains. At 20 M, CBP30 impaired growth of both lung cancer cell lines with 3q gains by almost 60% (MGH7, ChaGoK1), whereas growth of SW1573 cells, which do not have 3q gains, was only reduced by 14% (Fig. 13A). Because of the recent failure of the PI3K inhibitor BKM120 in a lung SQCC clinical trial, it was proposed that PI3K inhibitors might be more effective if combined with other targeted therapeutics (41). We therefore, tested if growth suppression by a low dose of BKM120 could be enhanced by simultaneous treatment with CBP30. Indeed, combination of 0.5 M BKM120 with 20 M CBP30 impaired growth of the two cell lines with 3q copy number gains more strongly than single agent treatment (Fig. 13B). The combination treatment strategy was not effective in SW1573 cells (Fig. 13B). Thus, CBP30 recapitulates the sensitivity of lung cancer cells with 3q copy number gains to EP300 genetic inhibition, and enhances sensitivity of these cells to low doses of a PI3K inhibitor.
Tracheobronchial Basal Cell Growth is Dependent on EP300, but Quiescence Reduces Toxicity of EP300 Inhibitors-Because many SQCCs are likely to originate from TP63expressing basal cells, some of the sensitivity of SQCCs to EP300 inhibition may reflect an inherent dependence of the putative cell of origin on EP300 levels. To test this possibility, we measured long-term growth in tracheobronchial basal cell cultures that had been infected with one of the shEP300 lentiviruses. After 7-10 days of shEP300 infection, all basal cells had died (Fig. 14A). This extreme sensitivity to EP300 inhibition was recapitulated by CBP30 treatment, with basal cell growth reduced to 38% when exposed to 10 M CBP30 (Fig. 14B). By contrast, it has been reported that 10 M CBP30 is not profoundly toxic in 12 different normal primary cell types (101). However, in the healthy native tracheobronchial epithelium, most basal cells are quiescent, with only 1.7% expressing Ki-67, a marker of cycling cells (123). We therefore tested whether EP300 inhibition would be equally toxic in the largely quiescent cell populations that are found under normal homeostatic conditions. To recapitulate these conditions, we not CREBBP expression in MGH7 cells. Cells were infected overnight with shEP300 or shluciferase (shluc) control lentiviruses and at 24 h, selected in puromycin. At 48 h post-virus addition, EP300 and CREBBP expression was quantified by Western blotting (D) and qRT-PCR (E). D, For Western blotting, protein levels were quantified by densitometry, as described in (A). NS ϭ nonspecific band. E, For qRT-PCR, EP300 and CREBBP expression was quantified as described in (B). Means Ϯ S.E. of 3 replicates are shown. Significance was calculated using paired 2-tailed t tests between shluc and shEP300-infected populations. Only p values Յ 0.05 are indicated. ***p ϭ 0.0004 (shEP300 #5), 0.0001 (shEP300 #3). F, shEP300 lentiviruses inhibit growth of MGH7 cells. 
Cells were infected overnight with shEP300 and shluc control lentiviruses and selected in puromycin after 24 h for an additional 24 h. Selected cells were then seeded in replicate for growth assays. Following 7-10 days of culture, cell growth was quantified by alamarBlue. Data are plotted relative to control shluc-infected cells, which was assigned a value of 100. Means Ϯ S.E. from triplicate cultures are shown. Significance was calculated using a 2-tailed t test. ***p ϭ 0.0004 (shEP300 #5), 5 ϫ 10 Ϫ8 (shEP300 #3). grew basal cells on Transwell filters, where quiescence is induced at confluence (28). We then compared the effects of exposure to the combination of CBP30 and BKM120 between proliferating (plastic) and quiescent (Transwell) cultures. Over an 8-day treatment period, the drug combination reduced cell growth on plastic by 97%, but on Transwell filters, it reduced cell number by only 20% (Fig. 14C). Although these data are supportive, further studies will be required to determine if an appropriate therapeutic index might be attainable for this class or other classes of EP300 chemical inhibitors. DISCUSSION The major goals of this study were to determine if BioID could be used to characterize the SOX2 interactome and to gain insight into the mechanism and treatment of SOX2driven SQCC pathogenesis. Here we report the first SOX2 interactome characterized by BioID. This interactome has some overlap with previously reported SOX2 AP-MS studies (37, 44 -50), but also identifies 36 novel high-confidence interactions. An additional 15 interactions were largely unique to our data set as they were reported in only one other study where the maximum number of spectral counts for these interactors ranged between 1 and 2 (37). 
The additional interactors identified in our study likely reflect bona fide SOX2 interactions because a number of both common and novel interactors have known connections to SOX2 activity, including EP300, which we functionally validated outside of an ES cell context for the first time. In addition, we found that the novel interaction with ARID1B could be recapitulated by PLA. Our data thus support BioID as an effective method to interrogate SOX2 interactomes.
We also characterized roles of EP300 in early SQCC pathogenesis and growth of invasive carcinoma, processes that are regulated by SOX2. Previous work proposed contrasting functions for EP300 on SOX2 activity. Recombinant EP300 protein has been shown to acetylate SOX2 on many lysine residues in vitro, with an alanine substitution mutation at one of the lysine residues, K75A, inhibiting cytoplasmic accumulation of SOX2 in ES cells (106). Based on these findings, the authors of that study proposed that EP300 inhibits SOX2 activity through acetylation-dependent nuclear export. However, lysines can also acquire other modifications (e.g. ubiquitin), making it difficult to ascribe the effect of the K75A mutation solely to defective acetylation. Furthermore, it is not clear to what extent EP300 acetylates SOX2 in vivo, with one study suggesting that EP300 might only be a minor contributor (124). Thus, the extent to which this EP300 mechanism operates in distinct cell types is still unknown. Contrary to the inhibitor model, other studies support EP300 being a SOX2 coactivator. In ES cells, EP300 is colocalized with SOX2 at many enhancer regions (84), and it cooperates with SOX2 and OCT4 to promote transcriptional activation from an FGF4 enhancer that is active in these cells (107,108). Our findings also support a coactivator over inhibitor role for EP300 in regulating SOX2 activity in basal cells. Inhibition of EP300 activity through both chemical and genetic methods suppresses SOX2-dependent transcriptional changes and squamous differentiation of basal cells. However, there appear to be mechanistic differences between ES and basal cells in how EP300 functions with SOX2. In ES cells, both OCT4 and NANOG are necessary for global recruitment of EP300 to SOX2 sites (84), and at least at the FGF4 enhancer, OCT4 is necessary for EP300 to promote transcriptional activation by SOX2 (108). By contrast, in SQCCs, the OCT4 binding motif is not enriched at SOX2 binding regions (37), and most SQCCs do not express OCT4 protein (125,126). Thus, in basal cells and SQCCs, other factors such as TP63 may be involved in EP300 recruitment or SOX2 may be able to directly recruit EP300.

FIG. 12. EP300 promotes growth of ChaGoK1 lung cancer cells with 3q gains. A, An shEP300 lentivirus reduces EP300, but not CREBBP expression in ChaGoK1 cells. Cells were infected overnight with shEP300 #3 or shluciferase (shluc) control lentiviruses and at 24 h, selected in puromycin. At 48 h post-virus addition, EP300 and CREBBP expression was quantified by Western blotting and qRT-PCR. For Western blotting, protein levels were quantified by densitometry, with EP300 and CREBBP expression normalized to PTPN11 levels (loading control). Normalized EP300 and CREBBP expression was then adjusted relative to shluc-transfected cells, which were assigned a value of 100. NS = nonspecific band. For qRT-PCR, EP300 and CREBBP expression was normalized to levels of TBP and plotted relative to shluc, which was assigned a value of 1.0. Means ± S.E. of 3 replicates are shown. Significance was calculated using paired 2-tailed t tests between shluc and shEP300-infected populations. **p = 0.005 and *p = 0.03. B, An shEP300 lentivirus inhibits growth of ChaGoK1 cells. Cells were infected overnight with shEP300 #3 and shluc control lentiviruses and selected in puromycin after 24 h for an additional 24 h. Selected cells were then seeded in replicate for growth assays. Following 7-10 days of culture, cell growth was quantified by alamarBlue. Data are plotted relative to control shluc-infected cells, which were assigned a value of 100. Means ± S.E. from triplicate cultures are shown. Significance was calculated using a 2-tailed t test. ***p = 1 × 10⁻⁶.
Although EP300 has been suggested to have both tumor suppressor and oncogenic activities (95), such putative functions have not been clearly established for most cancers. In the TCGA somatic cancer cohorts, EP300 is mutated at <10% frequency in most types of cancer, including lung SQCCs. At least one missense mutation affects an amino acid in the catalytic HAT domain (D1399N) that, when altered to D1399Y (or D1435E in CREBBP), strongly reduces acetyltransferase activity (96,127). However, in murine hematopoietic stem and progenitor cells, similar HAT-inactivating mutations in EP300 are biological gain-of-function mutations with regard to growth of these cells (128). Thus, it remains to be seen whether in SQCCs, the EP300 mutations are passenger or oncogenic, or if they define a rare disease subclass where EP300 is a tumor suppressor. By contrast, EP300 gains are common in SQCCs, occurring in 45% of cases, which is the highest frequency among surveyed TCGA cohorts. SQCCs also have the highest frequency of SOX2 copy number gains, supporting co-selection of increases in both SOX2 and EP300 activities during SQCC pathogenesis. Consistent with this possibility, we find evidence for SOX2 and EP300 being in close physical proximity in basal cells and SQCCs. Furthermore, we find that EP300 is necessary for SOX2 activity in basal cells, including for the induction of the squamous injury response by SOX2, a key initiating event in SQCC pathogenesis. Also, we find that like SOX2 (17,36,37), EP300 promotes growth of at least some lung cancer cell lines with 3q gains.

FIG. 13. [...] B, At low doses, combination of PI3K and EP300 chemical inhibitors results in more growth inhibition than either single agent alone. Growth assays were performed as described in (A). Significance was calculated using a 2-tailed t test. ns = not significant. MGH7, **p = 0.001, ***p = 5 × 10⁻⁵; ChaGoK1, *p = 0.009, **p = 0.005; SW1573, ns, p = 0.40 (to CBP30 alone), 0.17 (to BKM120 alone).
EP300 is not generally required for all cell growth, as normal ES cells and some cancer cell lines still grow when EP300 is homozygously inactivated (129,130). However, certain cancers are very dependent on EP300 for their growth. For example, EP300 chemical inhibitors that target either the bromodomain or the catalytic HAT domain suppress growth of AML1-ETO-driven AMLs (103,105,131). They also synergistically inhibit growth of some AML cell lines when combined with the chemotherapeutic doxorubicin (103). The growth inhibitory effect of these drugs is thought to arise from EP300 inhibition because EP300 acetylates AML1-ETO and is necessary for its leukemogenic activity (132). Our studies suggest that SOX2-driven SQCCs could be another EP300-dependent cancer. This dependence could arise both through the developmental origin of the cancer from basal cells, which are also very dependent on EP300 for growth, and through the dependence of SOX2 on EP300 for some of its major functions during SQCC pathogenesis. Accordingly, it would be worthwhile to develop EP300 inhibitors for clinical trials involving SQCCs. Notably, some FDA-approved drugs that contain salicylate moieties inhibit EP300 catalytic activity, including the anti-inflammatory diflunisal, at concentrations that can be achieved in patients (105,133). In addition, natural products including curcumin, anacardic acid, and luteolin inhibit EP300 catalytic activity (97,98,104), with luteolin also being able to suppress growth of a head and neck SQCC xenograft in mice (104). As suggested by our work, compounds that target EP300 could have therapeutic benefit in lung SQCCs on their own or in conjunction with other treatments. Other treatments could include drugs, radiation, or agonists/antagonists of signaling receptors that we found affect expression of components of the SOX2 interactome, or specific targeted therapeutics such as PI3K inhibitors.
Indeed, we found that an EP300 inhibitor increases the effectiveness of a low dose of the PI3K inhibitor BKM120 in suppressing growth of lung cancer cell lines with 3q gains. Although normal basal cell proliferation is also very sensitive to this drug combination, toxicity is significantly reduced under physiologic conditions where basal cells are largely quiescent. Further work will be required to determine if a tolerable therapeutic index may be attainable for clinical grade EP300 inhibitors. Additionally, the growing number of natural products with EP300 inhibitory activity suggests potential chemoprevention strategies, which potentially can be combined with myo-inositol, a natural PI3K inhibitor that promoted regression of premalignant squamous lesions in a phase I clinical trial (134-136). In conclusion, our work supports the continued use of BioID to study SOX2, and suggests that it may be an effective way to identify druggable targets of a wide range of "nondruggable" oncoproteins.

FIG. 14. Quiescence reduces the toxicity of EP300 and PI3K chemical inhibitors in tracheobronchial basal cells. A, Knockdown of EP300 inhibits basal cell growth. Basal cells proliferating on plastic were infected overnight with shEP300 #3 and shluc control lentiviruses and selected in puromycin after 24 h for an additional 24 h. Selected cells were then seeded in replicate for growth assays. Following 7-10 days of culture, cell growth was quantified by alamarBlue. Data are plotted relative to control shluc-infected cells, which were assigned a value of 100. Means ± S.E. from triplicate cultures are shown. Significance was calculated using a 2-tailed t test. ***p = 1 × 10⁻⁸. B, Single agent EP300 chemical inhibitor treatment suppresses basal cell growth. Cells were fed every other day with the indicated CBP30 concentration and growth quantified after 7-10 days by alamarBlue. Data are plotted relative to control vehicle-treated cells, which were assigned a value of 100. Means ± S.E. from triplicate cultures are shown. Significance was calculated using a 2-tailed t test. ***p = 0.0002 (10 µM CBP30), 0.00004 (20 µM CBP30). C, Quiescence induced by growth on Transwell filters reduces toxicity of PI3K and EP300 chemical inhibitors on basal cells. Basal cells were cultured either subconfluently on plastic dishes or at confluence on Transwell filters, and were treated with fresh drugs every other day. After 8 days, growth was measured by either alamarBlue (plastic) or counting total cell number (Transwell filters). Data are plotted relative to cultures treated with control vehicle, which were assigned a value of 100. Means ± S.E. from triplicate cultures are shown. Significance was calculated using a 2-tailed t test. ***p = 9.1 × 10⁻⁷, *p = 0.03.
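The figure legends in this section report growth "relative to" control cultures that are "assigned a value of 100", summarized as means ± S.E. from triplicate cultures. The arithmetic behind that presentation can be sketched as follows. This is only an illustration: the function names and the fluorescence readings are hypothetical and are not data from this study, and the text does not state which software the authors used for these calculations.

```python
from math import sqrt
from statistics import mean, stdev


def percent_of_control(treated, control):
    """Scale each treated reading so that the control mean equals 100."""
    control_mean = mean(control)
    return [100.0 * value / control_mean for value in treated]


def mean_and_se(values):
    """Return (mean, standard error of the mean) for replicate values."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical alamarBlue readings (arbitrary units) from triplicate cultures.
control_readings = [1000.0, 1100.0, 900.0]
treated_readings = [400.0, 450.0, 350.0]

relative = percent_of_control(treated_readings, control_readings)
avg, se = mean_and_se(relative)
print(f"growth = {avg:.1f} ± {se:.1f} % of control")
```

With these placeholder readings the treated cultures come out at 40% of control, which would correspond to a 60% reduction in growth. The legends additionally describe 2-tailed t tests for significance; those are not reproduced in this sketch.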