Plug and play modular strategies for synthetic retrotransposons
Introduction
Significant strides have been made in the mammalian retrotransposon field since long interspersed elements (LINEs) were first described in mammals [1]. The dominance of LINEs in mammalian genomes has now been confirmed directly at the nucleotide level by recent genome sequencing efforts covering representative species of placental mammals, marsupials and egg-laying monotremes. In the human genome, LINE-1s (also known as L1s) account for ∼17% of the genomic DNA, and the majority of L1 copies are 5′ truncated. Classified as non-LTR retrotransposons, full-length L1s are typically 6–7 kb and encompass an internal promoter in the 5′UTR, two non-overlapping open reading frames (ORF1 and ORF2), and a weak polyadenylation signal in the 3′UTR. Recent progress in L1 biology highlights its role as a major driving force in mammalian genome evolution [2], [3]. Two fundamental assay systems, used to assess L1 activity in cell culture and in transgenic animals, respectively [4], [5], have featured prominently in this discovery process over the past decade.
The first active L1 was isolated in 1991 [6] and its retrotransposition activity was subsequently verified in a cell-culture-based L1 functional assay [4]. The initial retrotransposition indicator cassette contains a copy of the neomycin resistance gene disrupted by an antisense intron (neoAI), and the level of retrotransposition is reported by colony formation after drug selection [4]. An enhanced green fluorescent protein-based reporter (gfpAI) has since been developed [7]. L1 functional assays in cultured cells have propelled the field forward, leading to the identification of L1 sequences that are essential or non-essential for retrotransposition, the identification of other active L1s, and the characterization of a wide range of impacts of L1 retrotransposition on mammalian genomes [8]. These assays also make it possible to probe the effect of cellular factors on L1 retrotransposition and to gain mechanistic insights into L1 replication [9].
Another boost for L1 research came from mouse models for L1 retrotransposition, which have paved the way for a thorough understanding of L1 biology in a living organism and toward L1-based tools for random in vivo mutagenesis. When placed under the control of its endogenous 5′UTR promoter, a human L1 transgene was found to be expressed exclusively in mouse testis and ovary, and its retrotransposition could be detected in the male germ line [5]. Such tissue-specific expression is consistent with previous studies on the expression of endogenous mouse and human L1 elements [10], [11], [12]. However, in a subsequent study using a similar human L1 transgene, retrotransposition was detected not only in germ cells but also in neuronal cells [13], raising the possibility that somatic L1 retrotransposition plays a role in neuronal diversity. Both human and mouse L1 transgenes can readily retrotranspose in mouse somatic cells when they are regulated by heterologous promoters [14], [15], [16]. A germ line retrotransposition frequency as high as one in every three animals has also been achieved with a synthetic mouse L1 transgene, ORFeus [15].
There are two primary challenges when working with plasmids containing L1 retrotransposons for either cell culture or animal experiments. The first challenge is frequently encountered during plasmid construction. The relatively large size of typical retrotransposon vectors (∼20 kb) makes subcloning technically demanding, as DNA fragments larger than 10 kb are notoriously inefficient to handle at almost all subcloning stages, such as DNA recovery, ligation and transformation. The choice of unique 6-base cutters is limited; 8-base cutters are prized for the assembly of complex L1 constructs, but they are frequently either absent from the recipient plasmid or inconveniently positioned. Although it is often desirable to swap certain functional elements in and out of an existing L1 vector, such substitution remains an inefficient and time-consuming practice unless design principles are carefully considered ahead of time. The second challenge is the lack of a standard protocol for mapping retrotransposition events once the engineered L1 is introduced into cultured cells or animals.
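The restriction-site constraint described above can be made concrete with a little arithmetic. The following Python sketch is illustrative only and is not part of the authors' method: it assumes a random sequence with equal base frequencies, uses a Poisson approximation, and treats the plasmid as circular. The function names and the small enzyme table are hypothetical choices made for this example; the recognition sequences themselves are the standard ones for these enzymes.

```python
import math

# Standard recognition sequences. All four are palindromic, so scanning
# one strand is sufficient. The choice of enzymes is illustrative only.
SITES = {
    "EcoRI": "GAATTC",    # 6-base cutter
    "BamHI": "GGATCC",    # 6-base cutter
    "NotI": "GCGGCCGC",   # 8-base cutter
    "AscI": "GGCGCGCC",   # 8-base cutter
}


def expected_sites(length_bp, site_len):
    """Expected site count in random sequence (assumes equal base frequencies)."""
    return length_bp / 4 ** site_len


def prob_exactly(k, mean):
    """Poisson approximation to the probability of exactly k occurrences."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


def count_sites(plasmid, site):
    """Count occurrences of a site in a circular plasmid, including the junction."""
    seq = plasmid.upper()
    # Append the first len(site) - 1 bases so sites spanning the origin are seen.
    wrapped = seq + seq[: len(site) - 1]
    return sum(1 for i in range(len(seq)) if wrapped.startswith(site, i))


def unique_cutters(plasmid, sites=SITES):
    """Names of enzymes that cut the plasmid exactly once, sorted alphabetically."""
    return sorted(n for n, s in sites.items() if count_sites(plasmid, s) == 1)


if __name__ == "__main__":
    for site_len in (6, 8):
        mean = expected_sites(20_000, site_len)
        print(site_len, round(mean, 2),
              round(prob_exactly(0, mean), 3), round(prob_exactly(1, mean), 3))
```

Under these assumptions, a 20 kb vector is expected to carry about 4.9 copies of any given 6-base site but only about 0.3 copies of an 8-base site. That is consistent with the observation that 6-base cutters are rarely unique in such vectors, while 8-base cutters are often absent altogether unless they are deliberately engineered in.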
Here we present strategies that aim to overcome the aforementioned obstacles. In Section 2, we detail a blueprint for streamlining L1 vector design. Sequence components of L1 vectors are modularized, and strategically placed restriction sites are used to facilitate cassette swapping for tailored research needs. In Section 3, we describe a step-by-step inverse PCR (iPCR) protocol that we have found to be useful for mapping de novo L1 insertions in both cultured cells and transgenic animals, especially in DNA samples containing a complex population of individual retrotransposition events.
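The general logic of inverse PCR can be illustrated in silico. The sketch below is a simplified toy model and does not reproduce the protocol of Section 3. It assumes a linear genome string, a known anchor sequence within the insertion (the name `known` is hypothetical), an enzyme that does not cut the anchor, and cleavage at the first base of each recognition site. After digestion, the fragment carrying the anchor is self-ligated into a circle, and primers pointing outward from the anchor read through the downstream flank, across the ligation junction, and into the upstream flank.

```python
def inverse_pcr_flanks(genome, known, site):
    """Toy inverse PCR: return the flanks recovered around a known anchor.

    Simplifications (assumptions of this sketch): single linear genome,
    one copy of the anchor, cleavage at the first base of each site.
    """
    g, k, s = genome.upper(), known.upper(), site.upper()

    start = g.find(k)
    if start == -1:
        raise ValueError("known sequence not found in genome")
    end = start + len(k)

    # Reject enzymes whose site lies inside or overlaps the anchor.
    if g.find(s, max(0, start - len(s) + 1), end + len(s) - 1) != -1:
        raise ValueError("enzyme cuts within or across the known sequence")

    up = g.rfind(s, 0, start)   # nearest site fully upstream of the anchor
    down = g.find(s, end)       # nearest site fully downstream of the anchor
    if up == -1 or down == -1:
        raise ValueError("no restriction site on one side of the anchor")

    upstream_flank = g[up:start]
    downstream_flank = g[end:down]

    # On the circularized fragment, outward-facing primers amplify the
    # downstream flank followed by the upstream flank.
    return {
        "upstream_flank": upstream_flank,
        "downstream_flank": downstream_flank,
        "amplicon": downstream_flank + upstream_flank,
        "junction_index": len(downstream_flank),
    }


if __name__ == "__main__":
    toy_genome = "TTGAATTCAAACCC" + "ACGTACGTAC" + "GGGTTTGAATTCAA"
    print(inverse_pcr_flanks(toy_genome, "ACGTACGTAC", "GAATTC"))
```

In this model, the position of the ligation junction within the amplicon marks where the downstream flank ends and the upstream flank begins, which is the information needed to place the insertion on a reference sequence.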
General considerations
Several synthetic biology standards for assembling complex series of “standardized parts” such as BioBricks [17] have been proposed, and some have been adopted by large segments of the synthetic biology community (e.g., the Registry of Standard Biological Parts; see http://partsregistry.org/Main_Page). The main drawback to “BioBricking” the various components of retrotransposons is that the retrotransposon assemblies can constitute combinations of ten or more “parts” and thus it is advantageous
General considerations
Mapping the integration site of a de novo L1 insertion and interrogating the inserted L1 sequence and its surrounding genomic DNA sequences have provided a rich source of information about preferences and consequences of L1 integration in both cultured cells and transgenic mice [13], [15], [16], [28], [29], [30], [31], [32]. Unlike DNA transposons and LTR retrotransposons, L1 insertions have two unique features that complicate global mapping/sequencing schemes: (1) the 5′ end of L1 varies from
Concluding remarks
We have proposed a modular design for L1 vectors that aims to streamline functional studies of L1 elements in both cell culture and transgenic animals. The eventual success of such a design scheme depends on the active yet voluntary participation of individual L1 investigators and beyond. Thus, we urge our colleagues to implement the proposed modular design whenever a new L1 plasmid or functional module is made. Such a concerted effort will ultimately benefit the overall L1 research community
Acknowledgments
This work was supported in part by start-up funds from Washington State University (W.A.) and by National Institutes of Health Grant CA16519 (J.D.B.).
References (37)
Cell (1982)
et al., Cell (1996)
et al., Cell (2008)
et al., J. Biol. Chem. (2004)
et al., Cell (1985)
et al., Cell Host Microbe (2008)
et al., Anal. Biochem. (2000)
et al., Cell (2002)
et al., Cell (2002)
et al., Gene (2000)
Bioessays
Science
Nat. Genet.
Science
Nucleic Acids Res.
Genetica
Mol. Cell. Biol.
Proc. Natl. Acad. Sci. USA