Cellular labeling of endogenous retrovirus replication (CLEVR) reveals de novo insertions of the gypsy retrotransposable element in cell culture and in both neurons and glial cells of aging fruit flies

Evidence is rapidly mounting that transposable element (TE) expression and replication may impact biology more widely than previously thought. This includes potential effects on normal physiology of somatic tissues and dysfunctional impacts in diseases associated with aging, such as cancer and neurodegeneration. Investigation of the biological impact of mobile elements in somatic cells will be greatly facilitated by the use of donor elements that are engineered to report de novo events in vivo. In multicellular organisms, reporter constructs demonstrating engineered long interspersed nuclear element (LINE-1; L1) mobilization have been in use for quite some time, and strategies similar to L1 retrotransposition reporter assays have been developed to report replication of Ty1 elements in yeast and mouse intracisternal A particle (IAP) long terminal repeat (LTR) retrotransposons in cultivated cells. We describe a novel approach termed cellular labeling of endogenous retrovirus replication (CLEVR), which reports replication of the gypsy element within specific cells in vivo in Drosophila. The gypsy-CLEVR reporter reveals gypsy replication both in cell culture and in individual neurons and glial cells of the aging adult fly. We also demonstrate that the gypsy-CLEVR replication rate is increased when the short interfering RNA (siRNA) silencing system is genetically disrupted. This CLEVR strategy makes use of universally conserved features of retroviruses and should be widely applicable to other LTR retrotransposons, endogenous retroviruses (ERVs), and exogenous retroviruses.


Introduction
Nearly 50% of human DNA content, and equivalently vast fractions of most other animal and plant genomes, consist of sequences derived from transposable elements (TEs) [1]. TEs  selfish genetic elements whose primary adaptation is to copy themselves in germline tissue, thereby passing on de novo copies. Mutations generated by germline transposition are a significant source of genetic variability, with obvious impact on phenotypic diversity and on evolutionary adaptation [2,3]. Although replication in the germline is the means by which TEs generate inheritable de novo copies, it is increasingly clear that endogenous TEs also replicate in somatic tissues. The evidence for somatic transposition is particularly strong for Class 1 TEs, also known as retrotransposable elements (RTEs) [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21]. RTEs replicate through an RNA intermediate by the use of reverse transcription and insertion of de novo cDNA copies. Somatic replication of RTEs has the potential for tremendous impact both on normal biological properties of tissues and on human health [2,[16][17][18][19][20][21].
Historically, most studies of the mechanisms of transposition have focused on events in the germline for two main reasons. First, replication in the germline rather than somatic cells is the evolutionary adaptation that has permitted TEs to maintain their presence in the genome. Thus, from a conceptual point of view, it made great sense to study TE action there. The other reason most research focused on germline events, however, is technical. Detection of de novo replication in the germline is experimentally tractable because new insertions can be passed on to offspring, where they are present in every cell. As a result, classical molecular approaches such as Southern blots, as well as more modern genomic approaches, are each feasible means to characterize new insertions. In contrast, detection of de novo inserts in somatic tissues is far more challenging than in the germline because each new TE insertion will be present in a single cell or, at best, in one clonally related cell lineage if the insertions are followed by cell divisions. The discovery that long interspersed nuclear element (LINE)-like RTEs are able to replicate during neural development and in pluripotent stem cells in rodents [5-7, 12, 14, 22], as well as somatic cells during Drosophila embryonic development [15], however, provided the impetus to more closely examine TE replication in soma. It is increasingly clear that TEs also can be actively expressed and even mobile in somatic tissues such as the brain during normal development [5-7, 9-14, 23-26], during aging in a variety of cell types and species [4,[27][28][29][30][31][32][33], in a variety of neurodegenerative diseases and in animal models of human neurodegeneration [34][35][36][37][38][39][40][41][42][43][44], and in cancer [17,[45][46][47][48][49][50][51][52][53][54].
Single-cell whole genome sequencing approaches have provided the resolution to detect rare de novo insertions within individual somatic cells. As a whole, such studies have clearly supported the idea that some RTEs are capable of replication during brain development as well as in the context of cancer progression, although the rate of such de novo events and whether they have functional impacts remain controversial or unresolved [9,10,18,24,25,46,[55][56][57]. But these approaches are limited in some important practical and conceptual ways. First, single-cell sequencing is relatively expensive and is low throughput. In addition, the significant molecular biological and computational challenges that come with this approach have led to widely different estimates of mobilization frequency. But perhaps more importantly, detection of individual de novo replication events in single cells does not afford a platform to conduct structure/function manipulations of RTEs to dissect the biological mechanisms. Nor does it provide the opportunity to visualize or manipulate the cells in which the RTEs are replicating within a tissue.
A different approach to this problem is to engineer genetic reporters of RTE replication [4,5,58,59]. Such reporters offer the advantage that they provide the means for structure-function studies to uncover Cis-acting sites, investigation of impact of environmental or genetic perturbations, and to uncover the type and fate of the cells in which RTEs replicate within living tissues. Such reporters have been described for Ty long terminal repeat (LTR) retrotransposons in yeast [58], mammalian LINE-1 (L1) retrotransposons in cell culture and in transgenic mice, and for intracisternal A particle (IAP) elements in cell culture [5,[59][60][61].
Each of the above reporters were constructed using the insertion of heterologous introns within the engineered RTE. The intron can be removed by splicing when the RTE passes through an RNA intermediate, thereby activating a reporter upon de novo integration of replicated cDNA copies into host DNA. The L1-GFP reporter mouse has revealed that de novo mobilization events take place during mouse development [5] and provided the means to investigate the impact of behavioral and genetic perturbations [6,7]. We previously described an indirect mobilization reporter for the gypsy-LTR retrotransposon/endogenous retrovirus (ERV). This so-called gypsy-TRAP made use of a known chromosomal hot spot for gypsy integration to turn off a Gal80 repressor upon insertion of a de novo retro-element into the hot spot cassette [4]. When present together with both a Gal4 activator and a Gal4-responsive green fluorescent protein (GFP), this was sufficient to report integrations that disrupted the Gal80 cassette. This system has been successfully used to reveal de novo transposition during aging and in the context of neurodegeneration [4,8,27,28,38]. But there are several critical shortcomings of the gypsy-TRAP approach. First, it does not selectively report insertions of any single RTE. In fact, it can be activated by insertions from any gypsy family member or indeed any RTE that shares that hot spot. Second, it does not provide the means to manipulate the biology of the donor element. Third, because the gypsy-TRAP utilizes a repressor of Gal4 function, it precludes use of the powerful Gal4 system that is the primary tool for modern Drosophila genetics.
Although most of the work on somatic transposition has focused on LINE elements, an emerging literature has suggested the possibility of a far broader impact of RTEs within somatic tissues, with the possibility that both LINE-like and LTR RTEs, and ERVs may mobilize [4-14, 26-32, 34-43, 45-54, 57]. But aside from the gypsy-TRAP, whose utility is limited, no in vivo reporter system has yet been developed to monitor mobilization of donor LTR retrotransposons or ERVs in any multicellular organism. Using the Drosophila ERV gypsy as a model, we have developed a novel reporter to reveal de novo mobilization both in cell culture and in somatic brain tissue. The gypsy-cellular labeling of endogenous retrovirus replication (CLEVR) reporter becomes hyper-active with disruption to the short interfering RNA (siRNA) silencing system, which is known to help silence endogenous gypsy expression in somatic cells. Using this reporter system, we also confirm our previous report of age-dependent increases in mobilization in neurons and demonstrate similar effects in glial cells. Because the CLEVR reporter system capitalizes on highly conserved mechanisms of replication, it likely will be generalizable to LTR retrotransposons, ERVs, and even exogenous retroviruses in a variety of animal and plant species.

Engineering of the gypsy-CLEVR reporter
To generate a reporter capable of tracing gypsy retrotransposition in specific cell lineages, we capitalized on conserved features of LTR-RTE/ERV replication ( Fig 1A). Specifically, we took advantage of two universal features of retrovirus replication [62,63]. First, because the 5 0 UTR that is part of the promoter sequence at the 5 0 end of the 5 0 LTR is not transcribed, the corresponding sequences at the 5 0 end of the 3 0 LTR in the genomic RNA normally is used as a template to complete cDNA synthesis of the 5 0 end of the full-length transcript. Second, because the portion of the U5 region of the 3 0 LTR following the transcription termination signal is not transcribed, the corresponding portion of the 5 0 LTR is used as a template for replication of this region of the 3 0 LTR. As a result, each of the LTR sequences of the fully replicated cDNA are hybrids of the nucleotide sequence from the two LTR repeats. The CLEVR reporter takes advantage of this. We placed a promoter without a reporter within the U5 region of the 5 0 LTR (i) gypsy-CLEVR contains 5XUAS and the Watermelon reporter that are inserted into the U5 region of 5 0 LTR and U3 region within 3 0 LTR, respectively, with reverse orientation to gypsy element. (ii) During normal replication of LTR-RTEs and ERVs, a series of template switches transfers DNA sequence information from between subregions of the 3 0 and 5 0 LTRs. These two template switches cause a rearrangement in the replicated cDNA such that the WM reporter is placed downstream to the UAS regulatory element. (iii) The full-length cDNA reintegrates into the host genome at a distinct location. If that cell expresses Gal4, it will trigger expression of WM by binding to UAS regulatory element. The wild-type (WT), deletion (ΔPBS), and mutations (PBSm1 and PBSm2) of the primer binding site (PBS) variants of gypsy-CLEVR are shown, along with sequence traces confirming their fidelity. (B) Fluorescent images showing that WM positive cells are detected for gypsy-CLEVR with tub-Gal4, but not for deleted (gypsy-CLEVR ΔPBS ) or mutated (gypsy-CLEVR PBSm1 and gypsy-CLEVR PBSm2 ) gypsy-CLEVR and we positioned a reporter with no promoter into the U3 region of the 3 0 LTR. Thus, after full replication, the reinserted de novo element should have rearranged the heterologous promoter so that it is adjacent to the reporter at both the 5 0 and 3 0 LTRs.
As a heterologous promoter, we chose to use the Gal4-responsive 5XUAS enhancer, which is sufficient to drive strong expression under the control of the Gal4 transcription factor. This promoter is compact in size and provides maximal functional utility because its use taps into a wealth of extant Drosophila strains with Gal4 expression restricted to virtually any desired cell type or developmental stage. Importantly, the Gal4-responsive promoter faces to the left, opposite to the orientation of the gypsy promoter that lies in the LTR. Therefore, gypsy expression falls under its own control. For a reporter, we wished to provide the means to clearly mark nuclei in order to count cells and identify the location of their soma within tissues. But we also wanted to be able to see cell morphology, for which it is helpful to highlight the plasma membrane. This is particularly important in brain tissue, where glia and neurons can have extraordinarily complex morphologies. To accomplish both of these goals, we designed Watermelon (WM), a dual reporter module that encodes both a myristylated GFP that tethers to the membrane (myr-GFP-V5) and nuclear localized mCherry (H2B-mCherry-HA). The dual reporter expression relies on introduction of the porcine teschovirus-1 2A (P2A) self-cleaving peptide [64,65] sequence (myr-GFP-V5-P2A-H2B-mCherry-HA, S1 Fig). When this dual reporter is expressed, it simultaneously marks the plasma membrane with GFP and the cell nucleus with mCherry.
To test the functionality of the dual reporter, we created an upstream activating sequence (UAS)-WM construct, which we co-transfected with actin-Gal4 into Drosophila S2 cells. Indeed, this marked transfected cells with brilliant green membrane fluorescence surrounding bright red fluorescence in the nuclei (S2A Fig). We also generated UAS-WM transgenic flies, which we tested by crossing with several tissue-specific Gal4 drivers. In the presence of hedgehog (hh)-Gal4 (S2B Fig), the WM reporter marked posterior compartment cells [66] of wing discs during the larval stage with mCherry in the nucleus and GFP on the membrane. We also tested this dual reporter in the Drosophila adult brain by expressing this transgene in subperineurial glia (SPG) under the control of the moody-Gal4. Here too, the nuclei were marked by strong mCherry fluorescence that co-localized with the glial nuclear Repo marker in SPG glial cells. The SPG glial membranes were painted with strong GFP, revealing the characteristic mesh-like architecture of this cell type (S2C Fig) [67]. These results demonstrate the functionality of the WM reporter to simultaneously mark cell nuclei (red) and membranes (green).
We first tested the impact of deleting (ΔPBS) or mutating (PBSm1 and PBSm2) the primer binding site (PBS), a Cis-acting motif essential for retrovirus replication (Fig 1A). In contrast with the gypsy-CLEVR construct that contains the intact PBS, co-transfections of tubulin-Gal4 with the gypsy-CLEVR ΔPBS did not yield any labeled cells (Fig 1B and 1C). Similarly, introduced mutations in the nucleotide sequence of the PBS greatly reduced (PBSm1; 0.33% of cells labeled) or eliminated (PBSm2; 0% of cells labeled) the reporter expression (Fig 1B and 1C). To reinforce this finding, we performed reverse-transcription quantitative PCR (RT-qPCR) to check the relative expression of GFP and mCherry signals derived from gypsy-CLEVR and its PBS modified versions (gypsy-CLEVR ΔPBS , gypsy-CLEVR PBSm1 , gypsy-CLEVR PBSm2 ) ( Fig  1D). In order to detect the relative expression levels of GFP and mCherry, qPCR primer sets targeting the GFP and mCherry regions of WM transcript were separately designed and tested. To demonstrate that these primer sets functionally detect WM reporter, mRNA extracted from cells with no vector, tub-Gal4 alone, UAS-WM alone, and tub-Gal4 combined with UAS-WM reporter were used to examine the relative expression of each of the fluorescent proteins encoded by WM. As expected, GFP and mCherry were only detected in tub-Gal4 and UAS-WM (tub>WM) co-transfected cells, but not cells with no vector, or with tub-Gal4 alone or UAS-WM alone as negative controls (S2D Fig). This result demonstrates the GFP and mCherry primer sets can robustly and specifically detect WM signals. Next, we used these GFP and mCherry primer sets to detect WM expression levels in gypsy-CLEVR and its PBS modified versions (gypsy-CLEVR ΔPBS , gypsy-CLEVR PBSm1 , and gypsy-CLEVR PBSm2 ). We again compared reporter levels with co-transfection of the CLEVR-constructs and tub-Gal4 versus tub-Gal4 alone as a negative control. Here too, we see no detectable signal from WM in the tub-Gal4 alone group (Fig 1D). But in cells that were co-transfected with tub-Gal4 and the gypsy-CLEVR (tub>gypsy-CLEVR), both the GFP and mCherry transcripts were robustly detected. With each of these primer sets, the reporters were greatly reduced and, in fact, were barely detectable in the gypsy-CLEVR ΔPBS (tub>gypsy-CLEVR ΔPBS ) and gypsy-CLEVR PBSm2 (tub> gypsy-CLEVR PBSm2 ) groups ( Fig 1D). WM signals from gypsy-CLEVR PBSm1 (tub> gypsy-CLEVR PBSm1 ) also were significantly reduced when compared with gypsy-CLEVR (GFP: 14.4% and mCherry: 11.1%) ( Fig 1D). Notably, this quantitative effect seen with RT-qPCR for the PBSm1 mutation is consistent with the quantification of the number of cells exhibiting fluorescent reporter expression via confocal microscopy.

Molecular confirmation of gypsy-CLEVR retrotransposition in vivo
In order to test the efficacy and specificity of the gypsy-CLEVR reporter in vivo, we next introduced gypsy-CLEVR and its PBS modified variants (gypsy-CLEVR ΔPBS , gypsy-CLEVR PBSm1 , and gypsy-CLEVR PBSm2 ) into transgenic flies. We first sought to establish molecular confirmation that the donor RTE of the gypsy-CLEVR reporter was capable of replication, which should lead to the predicted rearrangement of the LTR sequences. The gypsy-CLEVR with an intact PBS and the variants with deletion or mutation of the essential PBS were tested in parallel. For each case, adult flies were collected directly after eclosion and were then aged for 7 days. Genomic DNA was extracted from these flies, and nested PCR was performed to examine gypsy-CLEVR rearrangement. We did not detect a visible signal in the first round of PCR (gel region indicated by green arrowhead in Fig 2A). The DNA content in the indicated region (predicted size is 4,039 nucleotides) of the gel was extracted and used as template to perform a second round of PCR. A PCR product of the predicted size (around 239 nucleotides) was only detected in the intact PBS gypsy-CLEVR tissue but not in non-transgenic control animals, and not in any of the three gypsy-CLEVR PBS-modifying variants (region indicated by red arrowhead in Fig 2A). This PCR product was purified, subcloned, and sequenced. The sequencing results from three independent clones confirmed the presence of the predicted rearrangement in the gypsy-CLEVR containing flies, and the sequence of this clone matched the predicted sequence expected to form after gypsy replication (S3 Fig). Together, these results provide strong molecular confirmation that the gypsy-CLEVR reporter is capable of replicating in vivo, and such replication is only detected at the PCR level in the presence of an intact PBS. We next tested whether the WM reporter could also be detected in these flies, and whether it too required an intact PBS.
Because of our previous finding that gypsy expression and replication is increased with age [4] and because expression may be highest in glia [35], we first tested whether we could detect gypsy-CLEVR replication in glial cells of aged flies by crossing the reporter flies with the panglial driver, repo-Gal4. This Gal4 strain is well characterized to express at high levels in virtually all glial cells. In order to independently mark all glial and neuronal nuclei, we used an antibody against the Repo nuclear protein that marks all glial nuclei, and an antibody against the nuclear Elav protein that marks all neuronal nuclei. As a second, independent label to unambiguously label glial nuclei, we added a UAS-driven nuclear LacZ transgene. This nuclear LacZ is driven from the same repo-Gal4 driver that activates the mCherry reporter, but unlike WM, its expression does not require replication of the gypsy-CLEVR construct. In 7-day-old animals, we detect strong mCherry fluorescence in many glial nuclei (Fig 2B and 2C), and we detect GFP membrane fluorescence surrounding the nuclear cherry (S4 Fig). By confocal imaging throughout the whole brain, we determined that most of the gypsy-CLEVR-positive cells were located near the brain surface, especially in glia near the anterior regions (S5 Fig). We quantified the total number of WM-labeled glia in anterior brain sections and found that, as was the case in cell culture, deletion or mutation of the PBS leads to a dramatic decrease in the number of labeled cells (Fig 2B and 2C), demonstrating that a Cis-acting motif that is essential for retrovirus replication also is required to activate the reporter. Although the replication of this construct is by design Gal4 independent, expression of the reporter is engineered to require Gal4. Indeed, we rarely detected any gypsy-CLEVR-positive cells in animals that did not contain the repo-Gal4 driver, even when these flies were aged for 30 days (S6 Fig), a time point when endogenous gypsy is expressed at high levels [4,35]. These rare labeled cells in flies that lack Gal4 likely derive from de novo replication events that insert nearby active genomic loci, where the rearranged UAS-WM may be expressed by trapping nearby enhancers, thereby obviating the need for Gal4.

Age-dependent increase of gypsy retrotransposition in glial and neuronal cells
Our previous work using the gypsy-TRAP demonstrated an age-dependent increase in gypsy replication in neurons [4], and similar effects of age were subsequently documented with that reporter in other tissues [8,27,28,38]. We therefore tested whether gypsy replication in glia was similarly age dependent and could be detected using the gypsy-CLEVR reporter. We quantified the total number of gypsy-CLEVR-positive glia in anterior brain sections and found that the number of labeled glia (32.1 ± 2.7 per anterior brain) is extremely low in young animals, just 2 days after eclosion (Fig 3A and 3B). This number greatly increases by 7 days Products of first and second rounds of nested PCR were amplified from genomic DNA extracted from 7-day-old transgenic flies containing either wildtype gypsy-CLEVR, gypsy-CLEVR ΔPBS , gypsy-CLEVR PBSm1 , or gypsy-CLEVR PBSm2 . Results from the first round of PCR (green primer set combining (118.8 ± 18.1 per anterior brain) and further increases by 30 days (404.6 ± 16.4 per anterior brain) after eclosion (Fig 3A and 3B), consistent with the previous evidence that gypsy replication is age dependent in other cell types [4,8,27,28,38]. Moreover, this age-dependent increase in gypsy-CLEVR labeling could be blocked by introducing a Gal4-driven RNAi transgene against gypsy (UAS-gypsy-IR) [42] that we previously demonstrated was capable of interfering with the age-dependent increase in gypsy expression [35]. Addition of this UAS-gypsy-IR transgene was sufficient to block the age-dependent increase in gypsy-CLEVR-labeled glia at 2-day-(21.1 ± 3.5 per anterior brain), 7-day-(54.4 ± 5.9 per anterior brain), and 30-day-(189.6 ± 10.6 per anterior brain) old animals (Fig 3A and 3B). Thus, gypsy-CLEVR labeling requires gypsy expression as predicted.
We next tested the effects of age on gypsy-CLEVR labeling in neurons. Although our recent RNAseq experiments are consistent with the interpretation that gypsy levels are higher in glia than in neurons [35], we previously demonstrated, using the gypsy-TRAP reagent, that gypsy is able to replicate in Kenyon cells (KCs), the intrinsic neurons of the mushroom body (MB), and that the number of de novo events increases with advancing age [4]. To test whether the gypsy-CLEVR reporter could also reveal age-dependent gypsy replication in these neurons, we used the MB247-Gal4 line, which expresses Gal4 in approximately 800 KC neurons per brain hemisphere [68]. Indeed, when we crossed MB247-Gal4 to gypsy-CLEVR reporters (Fig 4), we detected rare de novo events in 2-day-old animals ( Fig 4A and 4G), and the population of gypsy-CLEVR-expressing cells was significantly increased in 30-day-old animals ( Fig 4D and  4G). This result is consistent with our previous findings using the gypsy-TRAP reporter [4]. Importantly, the gypsy-CLEVR reporter is more sensitive (e.g., we detect events in younger animals) and more versatile in its applications. Together with the findings described above, and previously, these results demonstrate that gypsy retrotransposon activity is increased during aging in Drosophila adult brains, both in glia and neurons, and our gypsy-CLEVR reporter is capable to sensitively reveal these de novo retrotransposition events in vivo.

Argonaute-2 keeps gypsy replication in check in aging neurons
Previous literature has established that endogenous siRNAs loaded onto Drosophila argonaute-2 (dAgo2) plays a key role in transposon silencing in somatic cells [69][70][71][72], including brain [4]. Mutations in dAgo2 lead to increased expression of a suite of TEs in various somatic tissues [69][70][71][72] and lead to precocious expression in the brains of young animals of several RTEs, including gypsy [4]. Our previous findings on the impact of dAgo2 on brain aging were consistent with the hypothesis that elevated gypsy expression could lead to an increased rate of mobilization, but we were not able to test this directly with the gypsy-TRAP due to the practicalities of moving that multicomponent system into homozygous dAgo2 mutant animals. We Primer 1 with Primer 2 in schematic) do not detect a product. A second round of nested PCR (red primer set combining Primer 3 with Primer 4 in schematic) was conducted from product isolated from the highlighted gel region (green arrowhead at first-round PCR). Products of the correct size were detected (red arrowhead at second-round PCR) after this second round of PCR using nested primers. This product was only detected from flies with the intact gypsy-CLEVR transgene. PCR products were sequence verified (S3 Fig). (B) Gypsy-CLEVR replication was detected with confocal microscopy in adult brain glia by crossing flies containing the gypsy-CLEVR constructs with a strain containing the pan-glial repo-Gal4 and a UAS-nuclear-LacZ. Gypsy-CLEVR replication was detected using the mCherry nuclear signal (red) as a reporter. Glial nuclei were independently labeled both with a Repo antibody (cyan) and with an antibody against β-galactosidase (green). Neuronal nuclei were counterstained using the Elav antibody (dark blue). Optical sections from anterior brain regions are shown (left panels) for 7-day-old gypsy-CLEVR (wild-type, PBS deletion, and mutations) transgenic flies. Highmagnification micrographs (from the region indicated by yellow dashed box) are shown to the right. Scale bar = 20 μm. (C) Quantification of the number of mCherry-labeled glial cells per anterior brain section (mean ± SEM) reveals that the mCherry labeling of glia requires an intact PBS (mCherry-labeled glia for each group: gypsy-CLEVR: 108.3 ± 11.2, n = 16; gypsy-CLEVR ΔPBS : 5.3 ± 2.0, n = 11; gypsy-CLEVR PBSm1 : 10.0 ± 2.7, n = 10; and gypsy-CLEVR PBSm2 : 28.7 ± 5.0, n = 7) (data in S3 Data). Number of labeled glial cells is significantly higher with the wild-type gypsy-CLEVR construct ( ��� p < 0.001, unpaired t test). CLEVR, cellular labeling of endogenous retrovirus replication; LTR, long terminal repeat; PBS, primer binding site; UAS, upstream activating sequence; 2U, wilt type strain W(iso)CJ1. https://doi.org/10.1371/journal.pbio.3000278.g002

Fig 3. gypsy-CLEVR activity in glial cells is increased with age. (A)
Optical sections are shown from anterior brain regions from 2-day-, 7-day-, and 30-day-old gypsy-CLEVR transgenic flies without and with a gypsy-IR transgene. Glial nuclei are independently labeled with antibodies against the glial Repo marker (cyan). Neuronal nuclei are labeled with antibodies against the Elav marker took advantage of the simplicity and sensitivity of the gypsy-CLEVR system to test whether mutations in dAgo2 impacted gypsy replication rate per se, rather than just expression.
We used the same MB247-Gal4 driver to trace gypsy-CLEVR replication in MB KCs in ago2 mutant animals. We find that in young flies at day 2, there is no significant increase in numbers of gypsy-CLEVR-positive KC neurons in dAgo2 mutant (ago2 51B/414 and ago2 414/414 ) versus wild-type (Fig 4A-4C and 4G) animals. But in brains from 30-day-old animals, we detected a significant increase in numbers of gypsy-CLEVR-positive KC neurons in both the ago2 51B/414 and the ago2 414/414 mutant genotypes compared with wild-type controls (Fig 4D-4G). These data support the conclusion that the increase in gypsy expression levels in dAgo2 mutants leads to an increased rate of retrotransposition during aging in neurons.

Discussion
The CLEVR reporter system relies on universally conserved features of retroviral replication to activate genetic reporter expression after successful mobilization of the donor element. We demonstrate that the gypsy-CLEVR construct reports gypsy replication events both in cell culture and in several cell types in the adult Drosophila central nervous system. We present five separate lines of evidence that the activation of this reporter cassette is a consequence of replication of the gypsy reporter element. First, we found that three different mutations to the PBS each are sufficient to disrupt reporter activation both in cell culture and in vivo (the PBS is a universally conserved and essential Cis-acting site that is required to mediate priming of the first-strand cDNA synthesis by tRNA [tRNA-Lys in the case of gypsy]). This provides strong evidence that the activation of the dual reporter cannot occur in the absence of replication by some artifactual cause such as recombination. Second, we were able to verify with PCR and sequencing that the predicted replication-dependent rearrangement to place the UAS enhancer adjacent to the WM reporter actually takes place in vivo and is not detectable when the PBS is disrupted. Third, using an RNAi transgene that targets gypsy sequences, we demonstrated that activation of the reporter requires expression of gypsy. Fourth, we found that the activation of the gypsy-CLEVR reporter is age dependent in neurons and in glial cells. This dovetails with a previous literature using the gypsy-TRAP reporter, in which it was observed that gypsy de novo insertions accumulate with age in neurons [4,38], adipose tissue [27,28], and intestinal stem cells [8]. We further show evidence that the number of glial cells labeled by the gypsy-CLEVR reporter also increases with age. Fifth, we demonstrate using dAgo2 mutants that the gypsy-CLEVR reporter replication is inhibited by the siRNA surveillance system that normally stifles expression of endogenous gypsy elements. Taken together, these data strongly support the conclusion that CLEVR is both a sensitive and specific tool to reveal ERV replication in vivo.
https://doi.org/10.1371/journal.pbio.3000278.g004 Cellular labeling of endogenous retrovirus replication (CLEVR) the gypsy-TRAP does not distinguish which element has been inserted. In principle, any gypsy family member that shares a preference for the same hot spot can contribute labeled cells. Second, the gypsy-TRAP only captures the fraction of events that disrupts the particular cassette, and what fraction of the events are revealed is impossible to decipher. Third, the gypsy-CLEVR system is far less cumbersome technically because it does not rely on three separate components like the gypsy-TRAP. Fourth, the gypsy-TRAP precludes the use of the Gal4 system to separately manipulate other pathways; e.g., we were able to use UAS-RNAi against gypsy to demonstrate specificity, which would not be possible with the gypsy-TRAP. Fifth, the CLEVR system affords the ability to conduct structure-function studies, such as manipulations of the PBS used here.
These features of the new reporter have already permitted us to investigate new aspects of gypsy biology. For example, we were able to demonstrate that dAgo2 not only represses gypsy expression in somatic cells, as previously shown [4,[69][70][71][72], but that this has functionally relevant impact on the ability of gypsy to mobilize. This would have been difficult to test with the gypsy-TRAP because it would have necessitated combining five separate genetic components into one animal.
The CLEVR design in principle also should provide the modularity needed to flexibly interrogate a diverse suite of research questions. In our case, we chose to use the UAS enhancer in the 5 0 LTR because it provided us the means to query gypsy donor mobilization in different cell types and under different conditions. For example, we were able to separately query glial cells or KC neurons by selecting different Gal4 drivers. In principle, however, any relatively small promoter sequence could be used. For example, if we wanted to use the Gal4 system to separately manipulate a genetic pathway of interest and then query the mobilization of gypsy within MB KCs, we could introduce the 247-base pair fragment of the dMef gene, which is sufficient to drive Gal4 in KCs (and is the basis of the MB247-Gal4 line that we used here). CLEVR is also flexible with respect to the reporter in the 3 0 LTR. Here, we used a membrane GFP and a nuclear mCherry separated by the P2A sequence. It would be trivial to substitute reporters with different functional effects. For example, one could substitute GCaMP to image calcium or Channelrhodopsins for optogenetic manipulation to investigate functional correlates of gypsy mobilization in neurons. Finally, the CLEVR reporter relies on conserved features that are shared across all LTR-RTEs ERVs and exogenous retroviruses. Given the growing interest in the impact of ERVs on both normal and dysfunctional aspects of biology, the ability to follow insertion of proviruses within tissues should have palpable impact.

Constructs
To generate UAS-myr-GFP-V5-P2A-H2B-mCherry-HA (referred to as Watermelon based on colors of fluorescence and abbreviated as WM), the myr-GFP and H2B-mCherry were separately amplified from pJFRC12-10xUAS-IVS-myr::GFP [73] and pEV-12xCSL-H2B-mCherry [74] vectors, and V5 and HA tags were separately added to the C termini of myr-GFP and H2B-mCherry by PCR. To synthesize the final UAS-WM, the myr-GFP-V5 and H2B-mCherry-HA products were linked with P2A sequence [64,65] and then inserted into the MCS of pUAST with NotI and XhoI digestion (see S1 Fig for full WM sequences). The gypsy backbone was obtained as a gift from V. Salenko [75]. In order to generate gypsy-CLEVR reporter, gypsy backbone that was placed in the pCaSpeR5 transforming vector [76], and the 5xUAS regulatory element from pUAST plasmid were inserted into the BglII site of the U5 domain within gypsy 5 0 LTR, with antisense orientation relative to gypsy backbone. To synthesize the final gypsy-CLEVR reporter, the hsp70 promoter and the SV40 poly-A tail from pUAST were separately fused upstream and downstream of the WM reporter, and this final chimera was also placed in antisense orientation to gypsy transcript into the XhoI site of the U3 domain within 3 0 LTR (see Fig 1A for gypsy-CLEVR structure in detail). The PBS sequences (5 0 -TGGCGCCCAAC-3 0 ) of gypsy-CLEVR were deleted to generate the PBS deletion version (gypsy-CLEVR ΔPBS ) and separately mutated to generate two variant constructs, gypsy-CLEVR PBSm1 (5 0 -GTTATAAACCA-3 0 ) and gypsy-CLEVR PBSm2 (5 0 -CAATATTTGGT-3 0 ).

Immunostaining of S2 cells and fly brains
Drosophila S2 cells (R69007, Thermo Fisher Scientific) were cultured in Schneider's Drosophila Media (Thermo Fisher Scientific) supplemented with 10% fetal bovine serum (Thermo Fisher Scientific) and penicillin-streptomycin-glutamine (Thermo Fisher Scientific) in 75 cm 2 flasks. The actin-Gal4 [77] and tubulin-Gal4 [78] plasmids were used as in previous publications. Cells were transfected with 1 μg of each plasmid DNA with the Effectence transfection kit (Qiagen). After 48 hours transfection, cells were fixed in 4% paraformaldehyde and mounted on coverslips coated in 0.5mg/mL Concanavalin A and ProLong Diamond Antifade Mountant with DAPI (Thermo Fisher Scientific).
Fly adult brains and third instar wing discs were dissected, fixed, and immunostained as previously described [4,77]. The LacZ primary antibody was used at 1:500 dilution (A-11132, Thermo Fisher Scientific). Repo (8D12) and Elav (7E8A10) co-staining were performed using a 1:10 dilution (Developmental Studies Hybridoma Bank). DyLight 405, Alexa Fluor 488, and Alexa Fluor 647 conjugated secondary antibodies were used at 1:100 dilution against Repo and Elav antibodies and at 1:500 against LacZ (Jackson ImmunoResearch). All S2 cell and fly brain images were acquired on a Zeiss LSM 800 confocal microscope and processed in the Zeiss ZEN software package.

Quantification of gypsy-CLEVR-positive cells in fly brain
The gypsy-CLEVR signal in every fly brain was collected through a series Z section with glial Repo and neuronal Elav co-labeling. These two markers provided landmarks of brain architecture, permitting us to define the equivalent brain sections between fly brains. The majority of gypsy-CLEVR signal from glia during aging was from the more anterior sections (as shown in S5 Fig). The average number of gypsy-CLEVR-positive cells within the anterior sections of brains from different experimental groups was used to quantify the relative change of gypsy-CLEVR reporter activity. A similar strategy was also performed to quantify KCs, but focusing on MB247-Gal4-expressing cells within the fly central brain.

Nested PCR and sequencing
Genomic DNA was extracted from 20 flies of wild-type, gypsy-CLEVR, and gypsy-CLEVR PBS, modifying variants at day 7 by PureLink Genomic DNA Kit (Thermo Fisher Scientific). The extracted genomic DNA was followed by two rounds of standard PCR in the nested fashion. Primer 1 (5 0 -ACAATGTATTGCTTCGTAGC-3 0 ) and primer 2 (5 0 -AGATTGTTGGT TGGGCGCCA-3 0 ) were used in the first-round PCR, and the extract from first-round PCR (green arrowhead in Fig 2A) was amplified by primer 3 (5 0 -AAACTTAGTTTTCAATATTG-3 0 ) and primer 4 (5 0 -GAGCGGAGACTCTAGCAGAT-3 0 ). The predicted size range of the PCR product (red arrowhead in Fig 2A) from the second round of PCR was extracted from the gel and cloned by TOPO-TA cloning kit (Thermo Fisher Scientific). Sanger sequencing was performed by DNA Sequencing Facility at Stony Brook School of Medicine.

Statistical analysis
Cell culture data were analyzed using a Fisher exact test variant of the chi-squared analysis in order to obtain a p-value for significance. The statistical data from aging fly and genetic manipulation were analyzed by GraphPad Prism software. The unpaired t test was used to compare the different groups within the same time point, and two-way ANOVA was used to analyze the difference between groups with aging effect and genetic manipulation. Statistical values are presented as mean ± SEM throughout. UAS-WM in SPG (moody>WM). Brains were dissected from adult flies and stained with glial marker Repo (magenta). Scale bar = 20 μm. (D) An RT-qPCR approach is used to confirm the functionality of GFP and mCherry primer sets. Each value from these experimental groups was further normalized to the tub>WM group (tub-Gal4 and UAS-WM co-transfection) in order to get the relative fold change. The relative fold change of both GFP (tub>WM: 0.81 ± 0.09, tub-Gal4: undetectable, UAS-WM: 0.02 ± 0.00, No vector: undetectable) and mCherry (tub>WM: 0.96 ± 0.12, tub-Gal4: undetectable, UAS-WM: 0.03 ± 0.00, No vector: undetectable) are compared (data in S2 Data). n = 3 biological replicates ( ���� p < 0.0001, unpaired t test). GFP, green fluorescent protein; hh, hedgehog; myr-GFP, myristilated-GFP; RT-qPCR, reverse-transcription quantitative PCR; SPG, subperineurial glia; tub, tubulin promoter; UAS, upstream activating sequence; WM, Watermelon.