Enhanced detection of expanded repeat mRNA foci with hybridization chain reaction

Transcribed nucleotide repeat expansions form detectable RNA foci in patient cells that contribute to disease pathogenesis. The most widely used method for detecting RNA foci, fluorescence in situ hybridization (FISH), is powerful but can suffer from issues related to signal above background. Here we developed a repeat-specific form of hybridization chain reaction (R-HCR) as an alternative method for detection of repeat RNA foci in two neurodegenerative disorders: C9orf72 associated ALS and frontotemporal dementia (C9 ALS/FTD) and Fragile X-associated tremor/ataxia syndrome. R-HCR to both G4C2 and CGG repeats exhibited comparable specificity but > 40 × sensitivity compared to FISH, with better detection of both nuclear and cytoplasmic foci in human C9 ALS/FTD fibroblasts, patient iPSC derived neurons, and patient brain samples. Using R-HCR, we observed that integrated stress response (ISR) activation significantly increased the number of endogenous G4C2 repeat RNA foci and triggered their selective nuclear accumulation without evidence of stress granule co-localization in patient fibroblasts and patient derived neurons. These data suggest that R-HCR can be a useful tool for tracking the behavior of repeat expansion mRNA in C9 ALS/FTD and other repeat expansion disorders.

Glineburg et al. Acta Neuropathol Commun (2021) 9:73
One of the key pathological hallmarks in repeat expansion diseases is the presence of repeat-containing RNA foci. These foci are thought to represent repeat RNAs coassociated with specific RNA binding proteins, although they may also arise from intra- and inter-molecular RNA-RNA interactions via RNA gelation [1, 18, 19, 30-32, 45, 61, 62, 64, 70]. The exact behavior, biophysical properties, and associated protein and RNA factors of these RNA condensates vary across different repeat expansions (reviewed in [17,33,50]). Traditionally, these foci are visualized by fluorescent in situ hybridization (FISH) (Fig. 1a) [29,32,40,75-77]. Early studies were able to identify repeat containing foci because the repetitive nature of the repeat allowed single probes to "tile" along the mRNA and enhance signal detection. However, FISH in human tissues is sometimes hindered by the low abundance of repeat containing RNAs and by high background, which arises both from tissue auto-fluorescence and from the numerous GC rich repeats in the human transcriptome [2,14,15,37,60,67,75,81].
Recently, a more sensitive method, hybridization chain reaction (HCR), was developed that uses an initiator probe recognizing the RNA of interest and a pair of fluorophore-conjugated hairpin probes to amplify the initiator signal (Fig. 1b) [6-9]. This approach significantly amplifies the signal of an individual molecule over traditional FISH, making it easier to detect low-abundance RNAs and to overcome background signal caused by auto-fluorescence and off-target binding [63]. Here, we utilized HCR to detect RNA foci associated with G4C2 repeat expansions in C9orf72, the most common genetic cause of ALS and FTD, and with CGG repeats associated with Fragile X disorders such as Fragile X-associated tremor/ataxia syndrome (FXTAS) [14,28,56]. Repeat HCR (R-HCR) provided significantly higher sensitivity for detection of repeats and allowed accurate tracking of endogenous GGGGCC repeat RNAs in patient cells in response to cellular stress activation. Taken together, these data suggest that R-HCR can be a valuable addition to analysis and imaging pipelines in repeat expansion disorders.

R-HCR is more sensitive than FISH for detecting GC rich repeats
We first compared the sensitivity and specificity of traditional FISH vs R-HCR probes. We designed fluorophore labeled locked nucleic acid (LNA) (C4G2)6 and (CCG)8 FISH probes and (C4G2)6 and (CCG)10 R-HCR initiator probes to hybridize to the corresponding G4C2 or CGG repeats. Alexa fluorophore labeled amplifier hairpin probes B1H1 and B1H2 were then used to amplify the R-HCR probe signal (Fig. 1a, b, Additional file 1: Table 2) [7,9]. We conducted a side-by-side comparison in mouse embryonic fibroblasts (MEFs) expressing (G4C2)70-NL-3xFlag or (CGG)100-3xFlag reporters [24,34]. We found more repeat positive cells by R-HCR than by FISH when using the same probe concentration (8 nM) (Fig. 1c, d). When we further increased each FISH probe concentration to 64 nM, we saw only a modest increase in the number of repeat positive cells, which remained significantly lower than that seen using R-HCR. In contrast, when we decreased each R-HCR probe concentration to 4 nM and 0.8 nM, we still observed enhanced signal compared to FISH for both G4C2 and CGG repeats (Fig. 1c, d). Furthermore, both nuclear and cytoplasmic signal was readily detected using R-HCR, while FISH primarily detected only the stronger nuclear signal. No signal was detected by either method in MEFs not expressing G4C2 or CGG repeats (Fig. 1c, d). Together, these results show that R-HCR is more sensitive than FISH for detecting exogenous GC rich repeats.
To confirm that the observed signal from R-HCR was due to probe hybridization to RNA and not DNA, we treated the (G4C2)70 and (CGG)100 transfected MEFs with DNase or RNase prior to R-HCR. While DNase robustly eliminated DAPI signal, it had no effect on the GC-rich repeat signal. Conversely, almost all probe signal was lost when cells were treated with RNase (Fig. 1e). To further validate that our R-HCR probes were specifically recognizing their target RNA, we transfected cells with antisense CCCCGG (ATG-(C4G2)47-NL-3xFlag) or CCG ((CCG)60-NL-3xFlag) reporters. We did not detect any repeat signal in Flag positive cells (Fig. 1f). Together, these results indicate that our R-HCR probes hybridize to G4C2 and CGG repeat containing RNA with high specificity and sensitivity.
We next determined whether R-HCR could distinguish between different G4C2 repeat sizes (3, 35, or 70 G4C2 repeats). Despite similar transfection efficiencies among all three conditions (as monitored by GFP cotransfection, Additional file 2: Fig. 1a-c), we observed no signal in cells expressing (G4C2)3-NL-3xFlag. The number of R-HCR positive cells was comparable for both (G4C2)35-NL-3xFlag and (G4C2)70-NL-3xFlag transfection (Fig. 2a, b). However, there was a significant correlation between R-HCR signal intensity and repeat length, with (G4C2)70-NL-3xFlag transfected cells having higher R-HCR signal intensity (Fig. 2c). We next performed similar experiments in cells transfected with CGGn-NL-3xFlag reporters with repeats of various lengths (24, 55, and 100 CGG repeats). We observed a strong positive correlation between CGG repeat length and the number of R-HCR positive cells (Fig. 2d, e). We also observed a significant increase in R-HCR signal intensity within cells expressing longer CGG repeats (Fig. 2d, f). CGG repeat expansions within the FMR1 5′UTR have previously been shown to enhance transcription, and we and others have observed a positive correlation of repeat length and RNA abundance in this setting ([4,36] and unpublished data). Thus, this repeat-length-dependent increase in R-HCR signal could result from enhanced mRNA production as well as from more CGG repeat binding sites in the longer repeat reporters. As both repeat size and expression are tightly linked to disease, we believe R-HCR can be used to qualitatively assess CGG repeat RNA burden in model systems.
Fig. 1 Hybridization chain reaction increases detection of GC rich repeats over FISH. a FISH probe with a 5′ conjugated fluorophore. Signal strength is dependent on number of RNA molecules present. b In R-HCR, two probe types are used: the initiator probe which binds directly to the RNA of interest, and 5′ fluorophore conjugated hairpin probes (H1 and H2) complementary to 3′ and 5′ extensions on the initiator probe. Upon binding, H1 and H2 unfold to reveal new binding sites for the other hairpin probe. In this way, signal from one RNA molecule is amplified > 100 fold, dramatically enhancing detection.
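The point that longer repeat reporters offer more probe binding sites can be illustrated with a small sketch. It is not part of the original analysis: it treats hybridization as exact Watson-Crick complementarity, writes the RNA target in the DNA alphabet, and counts every overlapping register in which a full-length initiator probe could pair with an uninterrupted repeat tract. The function names are hypothetical, and the counts say nothing about real hybridization efficiency, probe competition, or HCR amplification.

```python
# Illustrative only: count the registers in which a probe is perfectly
# complementary to an uninterrupted repeat tract (DNA alphabet used for RNA).

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA-alphabet sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def count_registers(probe: str, target: str) -> int:
    """Count overlapping positions where the full probe can pair with the target."""
    site = reverse_complement(probe)  # sequence the probe recognizes on the target
    last_start = len(target) - len(site)
    return sum(1 for i in range(last_start + 1) if target.startswith(site, i))


g4c2_probe = "CCCCGG" * 6   # (C4G2)6 initiator recognition sequence
cgg_probe = "CCG" * 10      # (CCG)10 initiator recognition sequence

for n in (3, 35, 70):
    print(f"(G4C2){n}: {count_registers(g4c2_probe, 'GGGGCC' * n)} registers")
for n in (24, 55, 100):
    print(f"(CGG){n}: {count_registers(cgg_probe, 'CGG' * n)} registers")
```

Under these assumptions, a 3-repeat G4C2 tract is shorter than the 36-nt probe and offers no full-length site, while the number of available registers grows linearly with repeat number for both probes.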

R-HCR is more sensitive than FISH at detecting endogenous repeat signal
We next determined if R-HCR was effective at detecting endogenous GC-rich repeat RNA. We first compared R-HCR and FISH in control and C9orf72 patient fibroblasts. Unlike in transiently transfected cells, where RNA signal via R-HCR was predominantly globular and nuclear, endogenous G4C2 repeats appeared primarily as small foci, similar to foci detected by FISH. Previous studies reported that upwards of 35% of C9orf72 patient fibroblasts contained at least one G4C2 repeat focus [42], although signal specificity was not established. Here, after normalizing to control fibroblast signal, we observed only half the number of G4C2 repeat positive cells previously reported in three different expansion cell lines (C9-C1, C9-C2, C9-C3). However, our number of foci per foci-positive cell is in agreement with previously published work (Fig. 3e) [13,42]. Using R-HCR, we detected > 2× more G4C2 positive cells than with FISH (Fig. 3a, b, d). Importantly, we also observed a significant increase in the number of foci per foci-positive cell (Fig. 3e). Intriguingly, while FISH primarily detected foci in the nucleus of G4C2 repeat expansion cell lines, R-HCR was able to detect cytoplasmic foci in ~40% of G4C2 repeat positive cells (Fig. 3f). Previous studies observed cytoplasmic foci only ~10% of the time [49]. To confirm that the signal we were observing was from RNA, we treated the G4C2 repeat expansion fibroblasts with DNase or RNase before R-HCR and found that the R-HCR signal was sensitive to RNase and resistant to DNase (Fig. 3c).
We next compared FISH and R-HCR in control and C9orf72 patient brain tissue. We looked for foci in cerebellum and frontal cortex, as those regions have been previously shown to have G4C2 repeat RNA foci and feature evidence of disease pathology [13,49]. After normalizing signal to controls, we detected G4C2 repeat positive cells in the frontal cortex and cerebellum of three C9 brains (C9-B1, C9-B2 and C9-B3) using FISH. However, foci were detected in less than 1% of granule cells in the cerebellum, and in less than 5% of cells (predominantly glia and interneurons) in the frontal cortex. In contrast, when R-HCR was performed on these same brain samples, more than 30% of granule cells were positive for G4C2 repeat RNA foci in the cerebellum and 8-21% of glia and interneurons were positive for G4C2 repeat RNA in the frontal cortex (Fig. 4a-d, f-g). These foci were absent when tissue was RNase treated and remained when tissue was DNase treated, strongly supporting that these are RNA foci (Fig. 4e). R-HCR also increased the number of foci per foci-positive cell in both the cerebellum and frontal cortex, although this difference was significant in only one case (Additional file 2: Fig. 2e, f). We also observed diffuse staining in both Purkinje cells and pyramidal neurons with both FISH and R-HCR (Additional file 2: Fig. 2a-d).
Together, these data indicate that R-HCR can be useful in detection of low-abundance endogenous G4C2 repeat RNA.
We performed a similar experiment in control and FXTAS patient fibroblasts. With the sense CGG repeat R-HCR probe, we found that CGG repeat signal was readily detectable not only in FXTAS patient cell lines, but also in premutation carriers and control lines. The pattern of the detected signal appeared predominantly nucleolar, similar to prior reports in FXTAS brain tissue [61,62,70]. We did not observe any R-HCR signal in these same cell lines when we used the antisense CCG repeat R-HCR probe or after RNase treatment, suggesting this signal was primarily CGG RNA-mediated (Additional file 2: Fig. 3a). These results indicate that the CGG repeat signals are CGG RNA-specific but not FMR1 CGG repeat expansion-specific. We next evaluated R-HCR on CGG repeats in frontal cortex and hippocampal sections, as these regions have been previously shown to express the CGG RAN product, FMRpolyG [38]. Similar to fibroblasts, we observed nucleolar-like staining in both control and FXTAS patient brain tissue with both the CGG FISH and R-HCR probes. However, the signal was stronger and occurred in a higher percentage of neurons in FXTAS samples (Additional file 2: Fig. 3b).
As the control cell lines still contain ~ 20-30 CGG repeats, we reasoned that the probe could still be specifically binding to FMR1 RNA. To investigate this further, we performed R-HCR in a transcriptionally silenced FXS iPSC line with 800 repeats, a control iPSC line with ~ 30 CGG repeats and then compared their signal intensity to an unmethylated full mutation line (TC-43) with a large (270) transcriptionally active CGG repeat expansion that supports RAN translation [26,58]. We observed significant staining with the CGG R-HCR probe in all three cell lines that was both diffuse in the cytoplasm, as well as localized to the nucleolus (Additional file 2: Fig. 3c). However, similar to observations in FXTAS brain, we saw enhanced signal in the unmethylated full mutation line compared to the WT and methylated FXS line, suggesting that this enhanced signal was due to CGG repeat expansions within FMR1 (Additional file 2: Fig. 3c-e).

G4C2 repeats accumulate in the nucleus in response to cellular stress
The ability to readily detect endogenous nuclear and cytoplasmic G4C2 repeat RNA foci using R-HCR is potentially useful for exploring its roles in disease pathogenesis. Our lab and others have observed that expression of G4C2 repeat containing reporters induces stress granule (SG) formation and the integrated stress response (ISR), and considerable evidence now suggests that this process can contribute to neurodegeneration [24,41,65,84]. Moreover, exogenous ISR activation through a variety of methods triggers a selective enhancement of RAN translation from both CGG and G4C2 repeats in transfected cells and neurons [5,24,65,80]. To investigate the behavior of endogenous G4C2 repeat RNA and foci in response to stress, we treated C9orf72 patient fibroblasts with sodium arsenite (SA) or vehicle. SA treatment for one or two hours led to a significant increase in the total number of cells with visible G4C2 repeat foci. Moreover, there was a marked re-distribution of these foci into the nucleus and out of the cytoplasm (Fig. 5a-c). This same SA-induced nuclear re-distribution of G4C2 repeat foci was also observed in C9orf72 patient derived neurons (Fig. 6a-c).
RNAs typically move into SGs and become translationally silenced in response to SA stress [35]. In contrast, G4C2 repeat RNAs remain translationally competent after stress induction. We therefore assayed whether G4C2 repeat RNA foci localized to SGs in response to stress. Consistent with their retained translational competency, we did not observe significant co-localization of G4C2 repeat foci with the SG marker G3BP1 (Fig. 7a, b), although rare co-localization events were observed. Integrated stress response activation also induces the formation of nuclear stress bodies. These TDP-43 positive structures are thought to contribute to ALS disease pathogenesis [78]. Given the greater nuclear distribution of G4C2 repeat RNA after stress induction, we evaluated whether there was any significant overlap between endogenous G4C2 repeat RNA and TDP-43, a critical factor in C9orf72 ALS/FTD pathology and ALS pathogenesis as well as a robust marker for nuclear bodies [16,78]. Our untreated C9 fibroblasts appeared to have small TDP-43 nuclear bodies, suggesting that they are inherently stressed. After 2 h of SA treatment, we saw a decrease in diffuse nuclear TDP-43 and an increase in the size of TDP-43 nuclear foci. Overall, nuclear TDP-43 showed limited (0.38-4.49%) co-localization with G4C2 repeat RNA foci at baseline. After SA induction, there was a significant increase in this co-localization, but it remained modest (2.56-8.35%) (Fig. 7c, d). Taken together, these studies suggest that nuclear retention or re-distribution of G4C2 repeat RNA foci in C9orf72 fibroblasts in response to stress is not predominantly driven by either SG or nuclear body association and may instead reflect nucleocytoplasmic transport defects elicited by stress pathway activation [3,43,52,84].

Discussion
Repeat RNA and the formation of RNA-protein and RNA-RNA condensates are thought to act as significant factors in the pathogenesis of multiple repeat expansion disorders. However, traditional detection techniques such as FISH are often limited in their sensitivity, which may cloud the roles of such repeat RNAs in disease-relevant processes. Here we used a highly sensitive RNA in situ amplification method, R-HCR, to readily detect low-expressing endogenous GC-rich repeat expansions in both patient cells and tissues. This non-proprietary method provided significantly enhanced sensitivity over RNA FISH probes with retained specificity. This increased sensitivity allowed for greater detection and appreciation of nuclear and cytoplasmic foci in patient cells. Moreover, we demonstrated that G4C2 repeat RNA foci accumulate in the nucleus in both patient fibroblasts and neurons in response to cellular stress. This tool should prove useful to the field in explorations of endogenous repeat RNA behaviors and pathology in both repeat expansion disorders and model systems.
Previously established techniques for detecting RNA in situ have limitations. FISH is limited by RNA copy number and probe specificity, while the use of MS2 and PP7 binding sites to detect low-abundance RNAs is only applicable to exogenous gene expression or to cases where these tags were inserted via CRISPR [20,66,69,82]. Recently, an alternative RNA in situ amplification method, BaseScope™, was shown to improve detection of endogenous G4C2 repeats in patient tissue [46]. R-HCR and BaseScope™ are comparable on a number of fronts. Namely, they both can be combined with IHC, have extensive signal amplification capacity, and, in the case of newer R-HCR versions, utilize split probes to eliminate nonspecific signal [8]. However, R-HCR lends itself to more universal application for a number of reasons, including minimal optimization needed, fewer steps involved, flexibility in hybridization temperature (and thus probe stringency), and the option of five different fluorophores to allow for combined R-HCR-IF with multiple probes and/or antibodies. That said, while the flexibility of fluorescent probe choice makes R-HCR more adaptable in a variety of experimental settings, the chromogenic properties of BaseScope™ may allow for better coupling with tissue stains for pathology purposes. Thus, both serve as valuable tools for detecting and investigating endogenous GC-rich repeats.
While both R-HCR and BaseScope™ are sensitive tools for detecting G4C2 repeat RNA, we urge caution in the use of these techniques for detecting CGG repeat RNA. Given the extensive use of CGG RNA FISH probes in the literature, we were surprised to find such high background signal, specifically in control and FXS human samples. The staining pattern with the CGG probe was also vastly different from the foci typically observed for other GC-rich repeat expansions. In fibroblasts, brain samples, and HEK293 cells we observed intense, large nuclear body staining, while iPSCs had weaker nuclear body staining and strong, diffuse cytoplasmic staining. This pattern is largely consistent with prior work using FISH to assay CGG repeat RNA in FXTAS [61,62]. The lack of punctate RNA foci makes it difficult to determine what signal is specific to the CGG repeat expansion in FMR1. There are 921 human genes that contain ≥ 6 CGG repeats, and this likely accounts for the high background in human cells [26,37]. However, previous studies using probes to the 3′ UTR and coding sequence of FMR1 showed similar large nuclear body staining, suggesting the signal observed with our CGG R-HCR probe could still be disease relevant [70].
Our R-HCR G4C2 probe exhibited significantly better sensitivity and specificity in human cells and tissues. The increased sensitivity of R-HCR over FISH with this probe allowed us to consistently visualize G4C2 repeat RNA in the cytoplasm, allowing us to ask questions regarding G4C2 repeat RNA activities in different subcellular compartments. As a proof of concept, we analyzed G4C2 repeat cellular localization during stress. We observed no significant co-localization with SGs, but instead a redistribution of G4C2 foci into the nucleus. This redistribution could be caused by an increase in nuclear import, a decrease in nuclear export, or a retention of repeat RNA within sub-nuclear compartments. Nucleocytoplasmic transport is inhibited basally in many neurodegenerative conditions, including C9orf72 ALS/FTD [59]. Nucleocytoplasmic transport proteins, including importin-alpha, RanGAP, and nucleoporins, are also recruited into SGs and colocalize with TDP-43 in ALS/FTD mutant cytoplasmic aggregates [10,23]. SG assembly itself inhibits nucleocytoplasmic transport by sequestering factors required for nuclear export, and thus the increased abundance of G4C2 repeat RNA may be indicative of global nuclear mRNA retention [3,43,52,84]. G4C2 repeat RNA itself is also implicated in nuclear import perturbations via binding to RanGAP1 [85], suggesting that its nuclear retention could be not only a cause of but also a contributor to stress-dependent pathology.
Alternatively, repeat RNAs may interact with nuclear stress bodies. These complexes result from nuclear relocalization of heat shock factors, including HSF1 and HSP70, as well as RNA factors, including TDP-43, with satellite III repeat RNAs [22,47,74,83]. We observed a significant increase in co-localization between nuclear TDP-43 and G4C2 RNA. However, the overall overlap between these two molecules remained modest and of unclear biological significance.
In sum, we describe the application of R-HCR to the detection of endogenous GC rich repeat RNA. This nonproprietary tool is sensitive, specific and useful in studying endogenous repeat RNA foci dynamics and should prove useful for investigators interested in the behavior of these disease-associated RNA species.

Cell line culture and clinical specimens
MEFs were received from Randal Kaufman (Sanford Burnham Prebys Medical Discovery Institute) and cultured in RPMI1640 with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin (P/S). FXTAS skin fibroblasts were from Paul Hagerman (UC Davis) and University of Michigan donors. C9orf72 ALS/FTD skin fibroblasts were from Eva Feldman (University of Michigan). Human derived skin fibroblasts were cultured in high glucose DMEM with 10% FBS, 1% nonessential amino acids (NEAA) and 1% P/S. All cells were cultured at 37 °C. Control and FXTAS human brain paraffin sections were obtained from the University of Michigan Brain Bank and have been previously described [38,71]. Human derived neurons were generated and differentiated as previously described [21,79]. Details regarding FXS human derived iPSCs have been previously published [26]. Details on these specimens are presented in Additional file 1: Table 1.

Stress treatment
Fibroblasts and neurons were treated with 500 nM sodium arsenite (SA) or vehicle (H2O) for the indicated times, then washed once in 1× PBS and fixed for ICC-R-HCR.
For human brain paraffin sections, slides were first deparaffinized with xylenes and then rehydrated through 100% ethanol, 95% ethanol, 70% ethanol and 50% ethanol to DEPC treated H2O. Rehydrated slides were incubated in 0.3% Sudan black for 5 min followed by proteinase K treatment at RT for 10 min. Permeabilization, preheating, hybridization with probes, washing and mounting are the same as above.

R-HCR
The initiator probes, (CCCCGG)6 and (CCG)10, were synthesized by OligoIDT. The fluorophore 647 labeled hairpin probes (B1H1 and B1H2) [7] were synthesized by Molecular Instruments. For transfected MEFs and fibroblasts, cells were processed according to Molecular Instruments' protocol. In brief, cells were fixed in 70% cold ethanol overnight at 4 °C. When R-HCR was performed following ICC, this step occurred prior to ICC. Cells were then preheated in hybridization buffer at 45 °C for 30 min and incubated with 0.8 nM, 4 nM or 8 nM initiator probe ((CCCCGG)6 or (CCG)10) in a 45 °C incubator overnight. Cells were washed 4 times for 5 min each with pre-warmed probe wash buffer (50% formamide, 5× SSC, 9 mM citric acid (pH 6.0), 0.1% Tween 20, 50 μg/mL heparin) at 45 °C and 2× for 5 min with 5× SSCT at RT. Cells were incubated with snap cooled hairpins B1H1 and B1H2 at room temperature for 12-16 h in amplification buffer. The amount of each hairpin (0.375 pmol, 1.875 pmol, and 3.75 pmol per well in an 8 well chamber slide) was proportional to the concentration of initiator probe used (0.8 nM, 4 nM or 8 nM, respectively). Cells were washed 5× for 5 min at RT with 5× SSCT, then mounted with ProLong Gold antifade mountant with DAPI.
For human derived brain sections, samples were processed according to Molecular Instruments' protocol. In brief, Histo-Clear II was used to deparaffinize tissue, then samples were rehydrated and treated with proteinase K as described for FISH. Slides were washed in 1× TBST, incubated in 0.2 N HCl for 20 min at RT, washed 5× in 5× SSCT, then incubated in 0.1 M triethanolamine-HCl (pH 8.0) with acetic anhydride for 10 min, and washed in 5× SSCT for 5 min. Slides were preheated and incubated in hybridization buffer with probes at 45 °C in humidity chambers for 12-16 h. The remaining steps were the same as above for R-HCR in cell culture.
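The hairpin amounts listed for cell culture scale linearly with initiator concentration (0.375, 1.875 and 3.75 pmol per well for 0.8, 4 and 8 nM initiator). As a convenience sketch, and assuming the same proportionality holds at other initiator concentrations (which the protocol above does not state), the amount for an arbitrary concentration could be computed as follows. The function and constant names are hypothetical.

```python
# Illustrative helper: scale hairpin amount with initiator concentration,
# anchored on the 8 nM -> 3.75 pmol per well condition reported above.
# Extrapolation beyond the three reported conditions is an assumption.

REFERENCE_INITIATOR_NM = 8.0    # initiator concentration used as the anchor
REFERENCE_HAIRPIN_PMOL = 3.75   # pmol of each hairpin per well at the anchor


def hairpin_pmol_per_well(initiator_nm: float) -> float:
    """Return pmol of each hairpin (B1H1 and B1H2) per well of an 8 well slide."""
    if initiator_nm <= 0:
        raise ValueError("initiator concentration must be positive")
    return REFERENCE_HAIRPIN_PMOL * initiator_nm / REFERENCE_INITIATOR_NM


for conc in (0.8, 4.0, 8.0):
    amount = hairpin_pmol_per_well(conc)
    print(f"{conc} nM initiator -> {amount:.3f} pmol of each hairpin per well")
```

The three printed values reproduce the amounts given in the protocol, which is a quick check that the stated proportionality is internally consistent.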

Imaging and analysis
Images were taken on an Olympus FV1000 confocal microscope equipped with a 40× oil objective (60× for iPSC images) and analyzed using ImageJ software. Signal for protein and repeat RNA was normalized to nontransfected cells or control samples. For brain tissue, we analyzed layers 4-6 of prefrontal cortex gray matter and cerebellar lobules 4-5, with a focus on the granule cell layer and Purkinje cell layer rather than the molecular layer. For the cerebellum, most of the foci were found in granule cells and thus we limited our analysis to granule cells in this region. We observed additional diffuse staining with some foci detected within Purkinje cells, as well as foci present at lower frequency in basket cells within the molecular layer. For one case (B3) there was not sufficient tissue to complete analysis with both FISH and HCR, so this case was quantified but not used in statistical analyses. For the cortex, there was staining in both neurons and glia, with foci more abundant by both HCR and FISH in glia and interneurons. Detectable signal was also present in pyramidal neurons, but for FISH in particular it was difficult to discern foci in these cells compared to a more diffuse signal in the nuclei and perinuclear regions. We therefore focused our comparative analysis on foci within the smaller interneuron and glial nuclei and cytoplasm.
Total cell number, protein stained cell number, RNA positive cell number, and foci number and distribution per cell were all manually counted. Signal intensities for iPSC images (Additional file 2: Fig. 3e) were calculated as mean intensity/area using ImageJ. For transfected MEFs, RNA intensity was graded into high, medium and low signal intensity. The ratio of repeat positive cells was calculated as the number of RNA positive cells divided by the number of total cells. The foci number per cell was calculated as the number of foci in all repeat positive cells divided by the number of repeat positive cells. The repeat distribution was expressed as the proportion of cells with foci in each category (only nuclear, only cytoplasmic, or both) among all repeat positive cells. The relationship between repeat foci and SGs was analyzed as the proportion of cytoplasmic G4C2 repeat foci co-localizing with G3BP1 granules among total cytoplasmic G4C2 repeat foci. Similarly, the relationship between repeat foci and NBs was analyzed as the proportion of nuclear G4C2 repeat foci co-localizing with TDP-43 granules.
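The ratios defined above can be written out as a short sketch. This is only an illustration of the definitions, assuming per-cell manual counts are recorded as nuclear and cytoplasmic foci numbers; the `CellCount` record, the function names, and the example counts are hypothetical and are not data from this study.

```python
# Illustrative implementation of the quantification ratios described above.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CellCount:
    nuclear_foci: int       # manually counted foci in the nucleus
    cytoplasmic_foci: int   # manually counted foci in the cytoplasm


def summarize_foci(cells: List[CellCount]) -> Dict[str, float]:
    """Compute positive-cell ratio, foci per positive cell, and foci distribution."""
    if not cells:
        raise ValueError("no cells counted")
    positive = [c for c in cells if c.nuclear_foci + c.cytoplasmic_foci > 0]
    n_pos = len(positive)
    total_foci = sum(c.nuclear_foci + c.cytoplasmic_foci for c in positive)
    nuclear_only = sum(1 for c in positive if c.cytoplasmic_foci == 0)
    cytoplasmic_only = sum(1 for c in positive if c.nuclear_foci == 0)
    both = n_pos - nuclear_only - cytoplasmic_only

    def share(n: int) -> float:
        # Proportion among repeat positive cells; 0.0 if none are positive.
        return n / n_pos if n_pos else 0.0

    return {
        "positive_fraction": n_pos / len(cells),
        "foci_per_positive_cell": total_foci / n_pos if n_pos else 0.0,
        "nuclear_only": share(nuclear_only),
        "cytoplasmic_only": share(cytoplasmic_only),
        "both": share(both),
    }


def colocalized_fraction(colocalized_foci: int, total_foci: int) -> float:
    """Proportion of foci in a compartment that overlap a marker granule."""
    return colocalized_foci / total_foci if total_foci else 0.0


# Hypothetical counts for five cells, for illustration only.
example = [CellCount(2, 0), CellCount(0, 0), CellCount(1, 1),
           CellCount(0, 3), CellCount(0, 0)]
summary = summarize_foci(example)
print(summary)
print(colocalized_fraction(colocalized_foci=1, total_foci=20))
```

Keeping the denominator explicit (all cells for the positive fraction, repeat positive cells for everything else) avoids mixing the two when comparing FISH with R-HCR.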

Statistical analysis
All statistical analyses were performed in GraphPad Prism software. Chi-square test was applied for categorical data, including amount of protein and repeat RNA expression in cells in transfected MEFs, repeat RNA signal intensity in transfected MEFs, and distribution of repeat RNA foci in cells. Paired t-test, unpaired t-test and one-way ANOVA were performed to analyze continuous data, including number of detectable repeat foci, foci number per cell, and the co-association rates between repeat RNA and SG or NB markers. We designated P < 0.05 as our threshold for significance with corrections for multiple comparisons.
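For readers who want to see what the chi-square test on categorical counts involves, the following is a minimal standard-library sketch of a Pearson chi-square test on a 2 × 2 table (no continuity correction, one degree of freedom). It is not the GraphPad Prism workflow used here, it does not include the multiple-comparison corrections mentioned above, and the counts are hypothetical.

```python
# Illustrative Pearson chi-square test for a 2 x 2 contingency table.
# For one degree of freedom, the p-value equals erfc(sqrt(chi2 / 2)).
import math
from typing import Sequence, Tuple


def chi_square_2x2(table: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (chi-square statistic, p-value) for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one count")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts: rows are methods, columns are (positive, negative) cells.
counts = [[20, 80],   # FISH
          [50, 50]]   # R-HCR
chi2, p = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, p = {p:.2e}")
```

For tables larger than 2 × 2, such as the three-category foci distribution, a general implementation with the appropriate degrees of freedom would be needed.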