Live-cell mapping of organelle-associated RNAs via proximity biotinylation combined with protein-RNA crosslinking

The spatial organization of RNA within cells is a crucial factor influencing a wide range of biological functions throughout all kingdoms of life. However, a general understanding of RNA localization has been hindered by a lack of simple, high-throughput methods for mapping the transcriptomes of subcellular compartments. Here, we develop such a method, termed APEX-RIP, which combines peroxidase-catalyzed, spatially restricted in situ protein biotinylation with RNA-protein chemical crosslinking. We demonstrate that, using a single protocol, APEX-RIP can isolate RNAs from a variety of subcellular compartments, including the mitochondrial matrix, nucleus, cytosol, and endoplasmic reticulum (ER), with specificity and sensitivity that rival or exceed those of conventional approaches. We further identify candidate RNAs localized to mitochondria-ER junctions and nuclear lamina, two compartments that are recalcitrant to classical biochemical purification. Since APEX-RIP is simple, versatile, and does not require special instrumentation, we envision its broad application in a variety of biological contexts.

other biological contexts. In another example, the localization of noncoding RNAs (ncRNAs) can play an architectural role in the assembly of subcellular structures, including short-range chromatin loops, higher-order chromatin domains, and large sub-nuclear structures like nucleoli and Barr bodies (Rinn and Guttman 2014; Engreitz, Ollikainen, and Guttman 2016). However, despite these examples, our general understanding of the breadth and biological significance of RNA subcellular localization remains inchoate.

Techniques that elucidate the subcellular localization of RNAs are therefore critical for advancing our understanding of RNA biology. Classically, such techniques rely either on imaging or biochemical approaches. Imaging methods, such as Fluorescence

immunoprecipitation-based approaches is highly sensitive to the antibodies and enrichment protocols used (Hendrickson et al. 2016) and captures only RNAs that are directly complexed with each target protein. Fractionation-Seq is applicable only to organelles and subcellular fractions that can be purified, and is frequently complicated by contaminants (false positives) and loss of material (false negatives). Therefore, a new technology is needed for unbiased and large-scale discovery and characterization of RNA neighborhoods, with high spatial specificity, and within cellular structures that cannot be enriched by biochemical fractionation.

Here we introduce such a technology, termed APEX-RIP, that enables unbiased discovery of endogenous RNAs in specific cellular locales. APEX-RIP merges two existing technologies: APEX (engineered ascorbate peroxidase)-catalyzed proximity biotinylation of endogenous proteins (Rhee et al. 2013), and RNA ImmunoPrecipitation (RIP) (Christopher Gilbert et al. 2004).
We demonstrate that APEX-RIP is able to enrich endogenous RNAs in membrane-enclosed cellular organelles, such as the mitochondrion and nucleus, and in membrane-abutting cellular regions such as the cytosolic face of the endoplasmic reticulum. The specificity and coverage of this approach are much higher than those obtained by traditional Fractionation-Seq. Moreover, by applying APEX-RIP to multiple mammalian organelles, we have generated high quality datasets of compartmentalized RNAs that should serve as valuable resources for testing and generating novel hypotheses pertinent to RNA biology.

Development of APEX-RIP method and application to mitochondria

APEX is an engineered peroxidase that can be targeted by genetic fusion to various subcellular regions of interest (Rhee et al. 2013) (Figure 1A). Upon addition of its substrates, biotin-phenol (BP) and hydrogen peroxide (H2O2), to live cells, APEX catalyzes the formation of biotin-phenoxyl radicals that then diffuse outward and covalently biotinylate nearby endogenous proteins. More distal proteins are not significantly labeled because the biotin-phenoxyl radical has a half-life of less than 1 millisecond (Wishart and Madhava Rao 2010). Previous work has shown that APEX-catalyzed proximity biotinylation, coupled to streptavidin enrichment and mass spectrometry, can generate proteomic maps of the mitochondrial matrix, intermembrane space, outer membrane, and nucleoid, each with <5 nm spatial specificity (Rhee et

Because most cellular RNAs exist in close proximity to proteins, we reasoned that APEX-tagged subcellular proteomes could also provide access to the nearby RNA content, if proteins and RNA could be crosslinked together in situ, immediately before or after APEX labeling.
As our first target organelle, we selected the mitochondrion because its RNA content (derived from both the mitochondrial genome and from imported, nuclear-encoded RNAs) has been extensively

to which we can compare our results. The mitochondrial matrix was also the first mammalian compartment mapped by APEX proteomics methodology (Rhee et al. 2013). As an RNA-protein chemical crosslinker, we opted for mild formaldehyde treatment, which covalently captures most protein-protein and protein-nucleic acid interactions, and can be achieved with minimal disruption of native interactions in live cells. It is for these reasons that formaldehyde is used for several RIP (Chris Gilbert and Svejstrup 2006) technologies for identifying the RNA partners of specific proteins of interest, including our own "fRIP-Seq" protocol (Hendrickson et al. 2016).

Since it was unclear a priori whether APEX-catalyzed biotinylation should precede or follow the formaldehyde crosslinking step, we explored both schemes in parallel (Figure S1A; see methods). Each protocol, applied to HEK 293T cells expressing mitochondrially-localized APEX2 ("mito-APEX2", Figures 1B-C), resulted in clear enrichment of fifteen mitochondrial-encoded RNAs, relative to the cytosolic marker GAPDH, as gauged by RT-qPCR (average of 49.3±3.5 and 60.9±4.1-fold enrichment, respectively, Figure S1A). Assuming that fixing cells prior to biotinylation would better capture transient or weak RNA-protein interactions, we selected the crosslinking-then-BP protocol for RNA-Seq analysis. While this confirmed that mitochondrial mRNAs were enriched, a sizeable "shoulder" of conspicuous off-target RNAs was also unexpectedly enriched (Figure S1B). Thus, we re-examined our labeling and crosslinking protocols, using a sampling of these off-target RNA markers (e.g., the abundant nuclear RNA XIST, and cytosol-localized RNAs HOOK2 and MAN2C1).
This more comprehensive analysis revealed that APEX labeling followed by crosslinking provides superior specificity (Figure S1C). We hypothesize that the mild formaldehyde treatment compromises membrane integrity (Fox et al. 1985), allowing BP radicals to escape to adjoining compartments when APEX labeling is performed after formaldehyde treatment.
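To make the RT-qPCR readout above concrete, the following minimal Python sketch shows one way a GAPDH-normalized fold enrichment could be computed from threshold-cycle (Ct) values. It assumes a ΔΔCt-style calculation with perfect amplification efficiency (one doubling per cycle); this is an illustrative assumption, not necessarily the quantification procedure used in this study. The function name, argument names, and Ct values are hypothetical.

```python
def fold_enrichment(ct_target_input, ct_target_pulldown,
                    ct_ref_input, ct_ref_pulldown, efficiency=2.0):
    """Enrichment of a target RNA relative to a reference RNA (e.g., GAPDH).

    Assumption: ddCt-style calculation; `efficiency` is the fold
    amplification per PCR cycle (2.0 means perfect doubling).
    A lower Ct means more template, so (input - pulldown) is positive
    when the RNA is more abundant after streptavidin pulldown.
    """
    d_target = ct_target_input - ct_target_pulldown
    d_ref = ct_ref_input - ct_ref_pulldown
    return efficiency ** (d_target - d_ref)


# Hypothetical Ct values: the target gains 2 cycles after pulldown,
# while the reference loses 4 cycles.
example = fold_enrichment(20.0, 18.0, 18.0, 22.0)
print(f"fold enrichment relative to reference: {example:.1f}")
```

With these made-up values the target is enriched 64-fold relative to the reference; real assays would also need replicate Ct values and an empirically measured primer efficiency.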

We used the optimized protocol (APEX labeling followed by crosslinking) to map mitochondrial RNAs in mito-APEX2-expressing HEK 293T cells (Figure 1D, Table 1, tab 2). Gene-level analysis, comparing RNA counts before and after streptavidin enrichment, revealed that all 13 mRNAs encoded by the mitochondrial genome were highly enriched (greater than 3.5 fold) in three independent replicates (Figures 1D and S1E, Table 1 tab 1). Enrichment was absent in negative controls with H2O2 omitted (Figure S1F). Read density plots mapped to the mitochondrial genome demonstrated that most of our captured RNAs correspond to fully-processed transcripts, including mRNAs, interstitial tRNAs, and the D-loop leader sequence from which mitochondrial transcription initiates (Figure 1E

We generated HEK 293T cells that stably express APEX2 in the nucleus (APEX-NLS) or in the cytosol (APEX-NES; NES is a nuclear export signal). The specificity of in situ biotinylation by these constructs within each compartment was confirmed by imaging (Figure 2A). Whole cell lysates prepared from each cell line also produced distinct "fingerprints" of biotinylated proteins, as assayed by streptavidin blotting (Figure S1D).

We performed APEX-RIP on both APEX-NLS and APEX-NES cells, using the biotinylation-first/crosslinking-second protocol established above, with an additional one-minute radical-quenching step in between the APEX and crosslinking steps (Figure S3A; see methods). Encouragingly, "gold standard" nuclear and cytosolic RNAs (defined from the ENCODE data as the top 1000 RNAs in each compartment; see Table 2 tab 4) were enriched from the corresponding cell lines as predicted (Figure 2B histograms and Figures S2E-F). Moreover, when directly comparing the fold-enrichments from each compartment to one another, it was apparent that APEX-NLS had effectively enriched known nuclear-localized RNAs, while APEX-NES had enriched known cytosol-localized RNAs (Figure 2B scatter plot, Table 2 tab 3). We calculated for each RNA a "nuclear preference score," defined as the minimum geometric distance of each point to the line y=x (corresponding to the set of genes which are not preferentially enriched from either compartment). Receiver Operating Characteristic (ROC) analysis of these nuclear preference scores was used to filter the data and obtain final transcript lists of 5,467 nuclear RNAs and 10,130 cytosolic RNAs from living HEK 293T cells (Table 2 tabs 1 and 2). The false discovery rates of these two lists are <0.6% and <0.4%, respectively.

When plotted by nuclear preference score, the human transcriptome displayed an overall bimodal distribution, wherein the majority of species were cytoplasmic, accompanied by a smaller right-shifted population of predominantly nuclear RNAs (Figure 2C, left). As might be predicted (Derrien et al. 2012), many of this latter group were lncRNAs, which clearly showed preferential nuclear localization (Figure 2C, middle). Most mRNAs appeared to be cytosolic in our data (Figure 2C, right).
Notably, we also observed sizeable populations of RNAs exhibiting noncanonical nuclear-cytoplasmic partitioning (Figure 2D). 3323 mRNAs (including C1orf63, for example; Figure 2D) appeared preferentially nuclear.

2016) (Figure 2D).

Our APEX-RIP nuclear and cytosolic RNA lists provide an opportunity for a head-to-head comparison with the traditional Fractionation-Seq method for mapping subcellular RNA localization. ROC analysis of the ENCODE Fractionation-Seq data yielded a list of 3,056 RNAs enriched by nuclear fractionation (Table 2 tab 5). Of these RNAs, 81% (2469) were also enriched in our APEX-RIP nuclear dataset, implying general agreement between the two technologies (Figure 2E). Notably, APEX-RIP also enriched nearly 3000 additional transcripts. These may be nuclear-localized RNAs that were opaque to the ENCODE protocol, or contaminants enriched by APEX-RIP. To address this possibility, we examined each dataset for conspicuous non-nuclear contaminants: RNAs that are known to be localized at the endoplasmic reticulum (Jan, Williams, and Weissman 2014). Satisfyingly, the APEX-RIP nuclear dataset, though larger, contained fewer ER contaminants than did the analogous fractionation-based dataset, implying that APEX-RIP produces higher specificity than Fractionation-Seq (Figure 2F, left).

To compare the coverage/sensitivity of each method (sometimes termed recall), we examined the enrichment in each dataset of lncRNAs, which are thought to be predominantly nuclear (Derrien et al. 2012

ERM-APEX2 and HRP-KDEL, and confirmed by microscopy and streptavidin blotting that each produced the expected labeling patterns (Figures 3C and D). Next, we compared the efficacy of each construct for target RNA isolation, using the biotinylation-first/crosslinking-second APEX-RIP protocol, and analyzing our results via RT-qPCR analysis of established secretome and non-secretome mRNAs 24.

Parallel experiments with APEX2-NES cells served as negative controls (Figure 3E).

Figure 3E, right). This is surprising, since proteomic experiments in HEK 293T cells expressing the identical ERM-APEX2 construct yielded highly specific enrichment of ER-localized proteins (Hung et al. 2017).

Our data strongly imply that APEX-RIP does not have the same spatial specificity as peroxidase-catalyzed proteomic labeling, and may be limited by perturbations induced by formaldehyde crosslinking. However, we were highly encouraged by the data obtained with the HRP-KDEL construct. We hypothesize that APEX-RIP with this construct is effective because formaldehyde crosslinking physically couples RNAs on the cytosolic face of the ER to protein complexes that are biotinylated within the ER lumen, thereby allowing target RNAs to be enriched by streptavidin (Figure 3A). Furthermore, we observed that the target specificity of this approach could be greatly improved by addition of a one-minute radical-quenching step in between the biotinylation and crosslinking steps in our protocol (Figure S3A). We surmise that this additional step prevents residual peroxidase-generated radicals from leaking into adjoining compartments when the integrity of the ER membrane is compromised during formaldehyde treatment.

Using this improved protocol, we performed APEX-RIP on HRP-KDEL cells (Table 3 tab 2). Gene-level analysis, comparing RNA counts before and after streptavidin pulldown, revealed a distinct population of substantially enriched RNAs (Figures 3F and S3B). Encouragingly, the majority (63.4%) of secretome mRNAs (defined as ER-proximal RNAs (Jan, Williams, and Weissman 2014) together with Phobius-predicted mRNAs, excluding nuclear-encoded mitochondrial mRNAs; see methods) resided in this set, while most (97.1%) mRNAs in a test set of known non-secreted genes were not enriched, thus demonstrating the ability of APEX-RIP to isolate ER-associated transcripts from the larger population of cellular RNAs (Figure 3F).
Using histogram and ROC analysis, we determined the optimal log2 FPKM significance threshold cutoff for each experimental replicate (Figure S3C; see methods), obtaining a final list of 2970 ERM-associated RNAs that were independently enriched in multiple experiments (

Figure 3H shows that we also de-enriched mRNAs lacking such signals. Coverage was likewise exceptional (97%), as gauged by the recall of mRNAs encoding 71 literature-curated, well-established ER-resident proteins (Table 3 tab 5; Figure 3I, see methods).

We next compared the ERM APEX-RIP dataset to analogous results obtained by subcellular biochemical fractionation (Reid and Nicchitta 2012), and by proximity-dependent ribosome profiling (Jan, Williams, and Weissman 2014) (Table 3, tabs 3 and 4, respectively). Encouragingly, APEX-RIP captures the majority of RNAs enriched by each of these alternative techniques (70% and 93%, respectively, Figure 3J), implying broad agreement between the different methodologies. To examine this further, we quantified the specificity and coverage of each approach, as above (see methods). Specificity analysis demonstrated that APEX-RIP and ribosome profiling exhibited similarly high specificity (94% and 98%, respectively). However, Fractionation-Seq was substantially noisier, such that only 90% of enriched mRNAs bore a secretory annotation (Figure 3H); the remaining 10% comprised sizeable populations of conspicuous contaminants (Figure S3E). The coverage of ER-localized mRNAs retrieved by APEX-RIP (97%) was also considerably higher than those retrieved by both Fractionation-Seq and ribosome profiling (73% and 77%, respectively, Figure 3J). We attribute the enhanced coverage of APEX-RIP to its higher sensitivity, since this method appears better suited for capturing RNAs with lower abundances than do the alternative approaches (Figure S3 F-G).
Such higher sensitivity may also explain why the set of RNAs enriched by APEX-RIP is so much larger than those obtained by fractionation and ribosome-profiling (Figure 3H). Excitingly, this further underscores the ability of APEX-RIP to recover RNAs that are opaque to other methods. While the vast majority (88.7%) of our enriched RNAs are mRNAs, we also enrich hundreds of noncoding RNA species, including antisense RNAs and lincRNAs (Figure 3G). These RNAs are not translated, and thus cannot be detected by ribosome profiling, and tend to be lowly expressed, making them difficult targets for either ribosome profiling or Fractionation-Seq.

In summary, APEX-RIP is superior to existing methods for mapping endogenous RNAs proximal to the ER membrane, and may be extensible to other membrane-abutting subcellular regions as well.
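The specificity and coverage figures quoted in this section can be understood as simple set ratios. The Python sketch below illustrates that reading under stated assumptions: "specificity" follows the usage in the text (the fraction of enriched mRNAs that bear a secretory annotation), and "coverage" is the recall of a curated true-positive list. The function names and the toy gene identifiers are hypothetical, and the exact gene sets and filters used in this study are those described in the methods.

```python
def specificity(enriched, annotated):
    """Fraction of enriched genes that carry the expected annotation."""
    if not enriched:
        raise ValueError("enriched set is empty")
    return len(enriched & annotated) / len(enriched)


def coverage(enriched, true_positives):
    """Fraction of known true-positive genes recovered (recall)."""
    if not true_positives:
        raise ValueError("true-positive set is empty")
    return len(enriched & true_positives) / len(true_positives)


# Toy example with made-up gene identifiers.
enriched = {"geneA", "geneB", "geneC", "geneD"}
secretory = {"geneA", "geneB", "geneC", "geneX"}
curated_er = {"geneA", "geneB", "geneY"}

print(f"specificity = {specificity(enriched, secretory):.2f}")
print(f"coverage    = {coverage(enriched, curated_er):.2f}")
```

Because the two ratios use different denominators, a method can score high on one and low on the other, which is why both are reported for each technique.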

Hypotheses from ER and nuclear APEX-RIP datasets

We wondered if the highly specific and comprehensive RNA subcellular localization datasets produced by APEX-RIP could be mined for new biological hypotheses. We first observed that, of the 2635 mRNAs in our ERM dataset, 141 code for mitochondrial proteins. It is thought that the bulk of the nuclear-encoded mitochondrial proteome is translated within the bulk cytosol, or in proximity to mitochondria themselves (Lesnik, Golani-Armon, and Arava 2015), raising the possibility that the translation or subsequent processing of these 141 protein products requires machinery localized to the ER. Additionally, these mRNAs may be translated at mitochondria-ER contact sites, some of which have been observed to contain ribosomes (Csordás et al. 2006). To gain initial insight into these unusual RNAs, we analyzed these 141 genes to see whether, relative to the total pool of mRNAs encoding mitochondrially-localized proteins, they were enriched in particular properties (Table 4 tab 1). Intriguingly, 57.1% of these mRNAs code for transmembrane proteins (as predicted by TMHMM), compared to only 20.4% for all mitochondrial protein mRNAs (Figure 4A). Mitochondrial subcompartment analysis showed that the ER-proximal population is enriched for proteins destined for the inner mitochondrial membrane, and is depleted for resident matrix proteins, compared to the total mitochondrial proteome (Figure 4B). Interestingly, proximity-

showed enrichment of mRNAs encoding proteins destined for the inner mitochondrial membrane. Perhaps a subset of inner mitochondrial membrane-destined proteins are locally translated at mitochondria-ER contact sites.

Next, we tested whether new insights could be gained by examining RNAs that APEX-RIP had enriched from more than one subcellular compartment. Because the ER lumen is contiguous with that of the nuclear envelope, we hypothesized that the HRP-KDEL APEX-RIP experiment, in addition to enriching RNAs proximal to the ER, might also enrich RNAs proximal to the nuclear membrane. This region within the nucleus, termed the nuclear lamina, is widely thought to play a critical role in gene repression (Kind and van Steensel 2010), and in shaping the global three-dimensional architecture of chromatin (C.-K. Chen et al. 2016). However, no exclusively laminar-resident RNAs have yet been identified. We hypothesized that we might identify such long-sought lamina RNAs by intersecting our APEX-RIP nuclear and ERM RNA lists (Figure 4C). Encouragingly, we observed 673 such RNAs in the intersection list, 34 of which are long noncoding RNAs (Figure 4D; Table 4

Weissman 2014), this approach is limited to mRNAs actively undergoing translation. It also requires biotin starvation prior to tagging, which is toxic to mammalian cells. As we have demonstrated, APEX-RIP can map diverse classes of noncoding RNA and quiescent mRNA (Figure 3G), and toxic protocols starving cells of essential nutrients for hours are not required.

The APEX-RIP methodology does have notable limitations. Cells to be analyzed must be transfected with a recombinant construct, in contrast to FISH and Fractionation-Seq, which can be performed on native tissues. APEX-RIP also gives poor spatial specificity in membrane-free subcellular regions.

The APEX peroxidase used here has also previously been used to generate

as flexible as the one-electron oxidation reaction catalyzed by APEX.

We anticipate that the initial subcellular transcriptomic map presented in this work (probing the mitochondrial matrix, cytosol, nucleus, and ER membrane of HEK293T cells) will serve as a valuable resource for cell biologists. Analysis of these data has already yielded potential insight into nuclear-retained mRNAs, cytosolic lncRNAs, putative lamina-localized RNAs, and genes that may be translated locally at mitochondria-endoplasmic reticulum junctions. Applying APEX-RIP at other subcellular compartments will further expand the depth and breadth of this map. Furthermore, given the high temporal resolution of APEX-RIP, we imagine that our technology might enable profiling of subcellular RNA pools in response to acute stimuli or drugs, or throughout stages of the cell cycle and development.

Collectively, such studies would yield insight into the biology of RNA subcellular localization at unprecedented scale.

Significance

RNA subcellular localization is a critical factor that influences a wide array of biological processes, ranging from Drosophila embryogenesis to mammalian neuronal signaling. However, while this spatial layer of transcriptome regulation has been characterized in a handful of contexts, a broader understanding of its overall extent, the factors governing its establishment, and its impact on biological function remains inchoate. The limitations hindering this understanding have been largely technical, since conventional methods, such as fluorescence in situ hybridization (FISH) and Fractionation-Sequencing ("Frac-Seq"), depend upon specialized reagents and protocols that can limit throughput and general applicability.

Funding was provided by the NIH (R01-CA186568 to A.Y.T. and U01 DA040612 to J.L.R.) and Stanford (to A.Y.T.).

Plasmids and cloning

The pCDNA3 mito-APEX plasmid was published previously (Rhee et al. 2013). The mito-APEX2 construct was cloned from this plasmid using a two-step protocol. First, the A134P mutation (Lam et al. 2014) was introduced into the APEX gene itself, using QuikChange mutagenesis (Agilent), and thereafter the APEX2 gene was moved to the lentiviral vector pLX304 via Gateway cloning (Thermo Fisher), to generate plasmid pLX304 mito-APEX2. Other APEX-fusion constructs (pLX304 APEX2-NLS, pLX304 APEX2-NES, and pLX304 ERM-APEX2) were cloned by Gibson assembly (NEB), using PCR to add targeting sequences and Gibson Assembly homology arms to the APEX2 gene, and joining the resulting insert into the pLX304 vector digested by BstBI and NheI. For HRP-KDEL, the previously published HRP C (Martell et al. 2016) was used as a template to make the HRP-KDEL-IRES-Puromycin PCR fragment. The insert was then cloned into the pCDNA3 vector digested by NotI and XbaI. Targeting sequences and restriction sites for all constructs are listed in Table S1.
supplemented as above, at 37 °C under 5% CO2.

Preparation of cell lines stably expressing APEX-fusion constructs

To prepare lentivirus, one ~70% confluent T25 plate of HEK 293T cells, grown as above, was co-transfected with 2.5 μg of APEX2 fusion plasmid, along with 0.25 μg and 2.25 μg, respectively, of the lentivirus packaging plasmids VSV-G and dR8.91 (Pagliarini et al. 2008). Transfection mixes used 10 μL Lipofectamine 2000 (Invitrogen) and were brought to a final volume of 2 mL with unsupplemented MEM. The cells were transfected for 3 hours, after which media was replaced with 2 mL of fresh growth media with FBS. After 48 hours, the lentiviral supernatant was collected by aspiration and filtered through a 0.45 μm syringe-mounted filter. This filtered supernatant was immediately used to infect cells. HEK293T cells, grown in 6-well plates as described above, were infected at ~50% confluency, grown for 2 days, followed by selection in growth medium supplemented with 8 μg/mL blasticidin for 7 days, before further analysis.

For the cells stably expressing HRP-KDEL, HEK293T cells at ~60% confluency, grown in 6-well plates as described above, were transfected with the mixture of 150 μg of plasmid and 10 μL Lipofectamine 2000 (Invitrogen) in unsupplemented MEM for 3 hours, after which media was replaced with 2 mL of fresh growth media with FBS. After 48 hours, the cells were trypsinized and replated in a T25 flask in growth medium supplemented with 1 μg/mL puromycin for 7 days, before further analysis.

In situ biotinylation and crosslinking

Stable-expression HEK 293T cells were grown to 90% confluency in 6-well plates, as described above. For the crosslinking-then-BP biotinylation protocol (Figure S1A, top), cells were washed once with 5 mL PBS, and crosslinked in 5 mL 0.1% (v/v) formaldehyde in PBS for 10 min at room temperature, with gentle agitation. The crosslinking reaction was quenched by addition of glycine (1.2 M, in PBS) to a final concentration of 125 mM, and gentle agitation for 5 minutes at room temperature. Crosslinked cells were then washed three times with PBS and incubated with 500 µM biotin-phenol (BP) (Rhee et al. 2013) in PBS at room temperature, for 30 min. Thereafter, H2O2 was added to a final concentration of 1 mM, for 1 min. The liquid phase was then removed by aspiration, and cells were washed twice with 2 mL quenching solution (5 mM Trolox, 10 mM ascorbate, 10 mM sodium azide, in PBS). Crosslinked, labeled cells were collected by scraping, pelleted by centrifugation, and either processed immediately or flash frozen in liquid nitrogen and stored at -80 °C before further analysis.

For the BP-then-crosslinking protocol (Figure S1A, bottom) used for mito-APEX2 experiments (Figure 1), cell growth media was replaced with fresh media supplemented with 500 µM BP. Cells were incubated in BP-supplemented media for 30 minutes at 37 °C, after which H2O2 was added to a final concentration of 1 mM.

After 1 min, the media was replaced with 5 mL crosslink-quench solution (0.1% (v/v) formaldehyde, 10 mM ascorbate, and 5 mM Trolox, in PBS) for one minute, to simultaneously quench the APEX2 BP labeling reaction and initiate formaldehyde crosslinking. Thereafter, cells were washed and incubated in 5 mL of fresh crosslink-quench for two additional 1-minute incubation steps, followed by a third, 8-minute wash. Thereafter, crosslinking was terminated by the addition of glycine, and cells were harvested as described above.

The BP-quench-then-crosslinking protocol (Figure S3A) used for all other subcellular compartments was identical to the BP-then-crosslinking protocol, except that, following BP-labeling, and prior to the addition of crosslink-quench solution, cells were incubated in 2 mL azide-free quenching solution (10 mM ascorbate and 5 mM Trolox, in PBS) for one minute. Subsequently, cells were subjected to only two (1 and 9 minute) treatments in crosslink-quench solution. Thereafter, crosslinking was terminated by the addition of glycine, and cells were harvested as described above.
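As a worked example of the glycine quench used to terminate crosslinking, the short Python sketch below estimates the volume of 1.2 M glycine stock needed to reach 125 mM final concentration. It assumes that the stock is added directly to the 5 mL of formaldehyde solution and that the final concentration accounts for the added volume; the protocol text does not state the pipetted volume, so the function and its name are illustrative only.

```python
def stock_volume_to_add(stock_molar, final_molar, start_volume_ml):
    """Volume of stock (mL) to add so the mixture reaches final_molar.

    Derived from conservation of solute, assuming the starting
    solution contains none of it:
        stock_molar * v = final_molar * (start_volume_ml + v)
    """
    if final_molar >= stock_molar:
        raise ValueError("final concentration must be below the stock")
    return final_molar * start_volume_ml / (stock_molar - final_molar)


# 1.2 M glycine stock, 125 mM target, 5 mL of crosslinking solution.
v = stock_volume_to_add(1.2, 0.125, 5.0)
print(f"add about {v:.2f} mL of 1.2 M glycine")
```

Under these assumptions the answer is roughly 0.58 mL; ignoring the dilution from the added volume gives about 0.52 mL, a difference that is unlikely to matter for a quench step.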

Immunofluorescence staining and microscopy

For immunofluorescence experiments (Figures 1B, 2A, and 3C), stable APEX- or HRP-expressing cells were BP-labeled and crosslinked, as above, and subsequently fixed with 4% (v/v) paraformaldehyde in PBS at room temperature for 10 min. Cells were then washed with PBS three times and permeabilized with cold methanol at -20 °C for 5 min. Cells were washed again three times with room-temperature PBS and then incubated with primary antibodies in PBS, supplemented with 1% (w/v) Bovine Serum Albumin (BSA), for 1 h at room temperature. After washing three times with PBS, cells were incubated with secondary antibodies and neutravidin-AlexaFluor647 (

excitation, 700/75 emission) and differential interference contrast (DIC) images were acquired through a 63x oil-immersion lens. Acquisition times ranged from 100 to 1,000 ms. For imaging quantitation and analysis, we used the SlideBook 6.0 software (Intelligent Imaging Innovations) to process and normalize the images.

The data in these figures (Figures 1B, 2A, and 3C) are representative of three independent experiments with ≥ 5 fields of view each.

Western and Streptavidin blotting

For blotting experiments (Figures 1C, 3D and S1D), stable APEX- or HRP-expressing cells were grown in 6-well plates. After labeling, the cells were harvested by scraping, pelleted by centrifugation at 3,000×g for 10 min, and stored at -80 °C prior to use. Thawed pellets were lysed by gentle pipetting in RIPA lysis buffer (50 mM Tris, 150 mM NaCl, 0.1% SDS, 0.5% sodium deoxycholate, 1% Triton X-100, 5 mM EDTA), supplemented with 1× protease cocktail (Sigma Aldrich) and 1 mM PMSF (phenylmethylsulfonyl fluoride), for 5 min at 4 °C. Lysates were then clarified by centrifugation at 15,000×g for 10 min at 4 °C before separation on homemade 8% SDS-PAGE gels. Gels were transferred to nitrocellulose membranes, stained by Ponceau S (0.1% (w/v) Ponceau S, 5% (v/v) acetic acid, in water) for 10 min at room temperature, and imaged. The blots were then blocked with blocking buffer (3% (w/v) BSA, 0.1% (v/v) Tween-20 in Tris-buffered saline) for 1 h at room temperature, and incubated with primary antibodies in blocking buffer for 1 h more. The dilutions of the antibodies are as follows: Mouse anti-V5 antibody (Life Technologies), 1:1000 dilution, and Mouse anti-FLAG antibody (Life Technologies), 1:800 dilution. Blots were rinsed four times for 5 min with wash buffer (0.1% Tween-20 in Tris-buffered saline), and then immersed in blocking buffer supplemented with Goat anti-Mouse IgG H + L-HRP Conjugate (1:3,000 dilution, Bio-Rad), for 1 h at room temperature. Blots were rinsed four times for 5 min with wash buffer, developed with the Clarity reagent (Bio-Rad), and imaged on an Alpha Innotech gel imaging system. Processing of streptavidin blots was similar.
Following Ponceau imaging, blots were blocked in blocking buffer for 30 min at room temperature, immersed in blocking buffer supplemented with streptavidin-HRP (1:3,000 dilution, ThermoFisher Scientific) at room temperature for 15 min, rinsed with blocking buffer five times for 5 min each, developed and imaged using the Clarity reagent and an Alpha Innotech gel imaging system.

The data in these experiments (Figures 1C, 3D and S1D) were also reproduced for quality control prior to quantitative PCR and sequencing.

Streptavidin bead enrichment of biotinylated material and RNA isolation

Unless otherwise noted, all buffers used during RNA isolation were supplemented to 0.1 U/µL RNaseOUT (Thermo Fisher), 1× EDTA-free proteinase inhibitor cocktail (Thermo Fisher) and 0.5 mM DTT, final. APEX- or HRP-expressing stable cells were grown, labeled, crosslinked and harvested as described above. Labeled cell pellets were lysed by incubation in 1 mL ice-cold RIPA buffer, supplemented with 10 mM ascorbate and 5 mM Trolox, for 5 min at 4 °C with end-over-end agitation. Samples were then sheared as described previously (

1h, followed by 55 °C for 1h, as previously described (Hendrickson et al. 2016

replicates and no samples were excluded from analysis.

The experiments for Figures S1A, S1C, 3E, and S3A were performed once. For statistical analysis on Figure 3E, the percent yield of 6 target genes was compared against the percent yield of 6 non-target genes using a paired t-test for both HRP-KDEL and ERM-APEX2. For comparison between ERM-APEX2 and APEX2-NES, 12 target and non-target genes were compared against each other using a paired t-test.

Library preparation, sequencing, and quantification

from "Incubate RFP" step. Each library was given a unique index during synthesis. Library concentration and quality were confirmed on an Agilent 2100 Bioanalyzer, using "DNA High Sensitivity" kits.

Indexed libraries were pooled in equimolar concentrations, with no more than ten libraries per pool, and subjected to 50 cycles of paired end sequencing, followed by indexing, on two lanes of Illumina HiSeq 2500 flow cells, run in rapid mode (Genomics Core, Broad Institute of Harvard and MIT).

In general, the experiments for each construct were performed in three biological replicates. The mito-APEX experiment in Figure S1B and the mito-APEX2 negative control experiment (omit H2O2) in Figure S1F

preferentially reside in either compartment. The nuclear preference score for gene i (NPSi) is therefore defined as the minimum distance between its coordinates and the line log2N = log2C. This is equivalent to calculating the distance between points (x1,y1) = (log2Ni, log2Ci) and (x2,y2) = (0.5(log2Ni+log2Ci), 0.5(log2Ni+log2Ci)). Hence:

The true and false positive gene sets needed for ROC analysis were defined as follows:

(2) For the nuclear and cytosolic partitioning experiment, true and false positive gene lists were compiled using available ENCODE human cell line (NHEK, Normal Human Epidermal Keratinocytes) nuclear-cytoplasmic fractionation data (Dunham et al. 2012). We calculated fold-enrichments for RNAs in each compartment (scaled relative to the whole cell RNA, Figure S2A), and used these values to derive Nuclear Preference Scores, as described above. True positive and true negative nuclear RNAs were then defined as the 1000 transcripts with the highest and lowest NPSs, respectively (Figure 2B; Table S2 tab 4). Using these gene lists to perform ROC analysis on the original ENCODE data produced a significance threshold cutoff at an NPS of 1.107 (Figure S2B-D), and lists of the 5467 and 10130 RNAs called as being enriched in the nucleus and cytoplasm, respectively (Table S2, tabs 4 and 5).
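The nuclear preference score and ROC-based cutoff described above can be sketched in a few lines of Python. Two points in this sketch are assumptions rather than details confirmed by the text: the distance to the line log2N = log2C is given a sign (positive when nuclear enrichment exceeds cytosolic enrichment), and the cutoff is chosen by maximizing the difference between true and false positive rates (Youden's J). All function names and example values are hypothetical.

```python
import math


def nuclear_preference_score(log2_n, log2_c):
    """Signed perpendicular distance from (log2_n, log2_c) to the line y = x.

    The magnitude equals the distance to the midpoint
    (0.5*(log2_n + log2_c), 0.5*(log2_n + log2_c)).
    Assumption: positive values indicate nuclear preference.
    """
    return (log2_n - log2_c) / math.sqrt(2)


def roc_cutoff(scores_pos, scores_neg):
    """Return (cutoff, J) maximizing TPR - FPR over observed scores.

    Assumption: Youden's J is used as the selection criterion.
    Genes with score >= cutoff are called positive.
    """
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores_pos) | set(scores_neg)):
        tpr = sum(s >= cutoff for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= cutoff for s in scores_neg) / len(scores_neg)
        if tpr - fpr > best_j:
            best_cutoff, best_j = cutoff, tpr - fpr
    return best_cutoff, best_j


# Toy data: scores for known nuclear (positive) and cytosolic (negative) RNAs.
cutoff, j = roc_cutoff([2.0, 1.5, 1.2], [0.1, -0.5, 0.3])
print(f"NPS for (3, 1): {nuclear_preference_score(3.0, 1.0):.3f}")
print(f"chosen cutoff: {cutoff}, J = {j}")
```

Other cutoff criteria, such as fixing a maximum false discovery rate, would fit the same structure by replacing the quantity being maximized.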

Coverage and Specificity analysis of nuclear, cytosolic, and ER-proximal RNAs

To estimate the coverage (recall) and specificity of APEX-RIP at each subcellular compartment, we assembled lists of established target and off-target genes tailored for that compartment.