A genetic toolkit for tagging intronic MiMIC containing genes

Previously, we described a large collection of Minos-Mediated Integration Cassettes (MiMICs) that contain two phiC31 recombinase target sites and allow the generation of a new exon that encodes a protein tag when the MiMIC is inserted in a coding intron (Nagarkar-Jaiswal et al., 2015). These modified genes permit numerous applications including assessment of protein expression pattern, identification of protein interaction partners by immunoprecipitation followed by mass spectrometry, and reversible removal of the tagged protein in any tissue. At present, these conversions remain time- and labor-intensive as they require embryos to be injected with plasmid DNA containing the exon tag. In this study, we describe a simple and reliable genetic strategy to tag genes/proteins that contain MiMIC insertions using an integrated exon encoding GFP flanked by FRT sequences. We document the efficiency and tag 60 mostly uncharacterized genes. DOI: http://dx.doi.org/10.7554/eLife.08469.001


Introduction
One of the most powerful techniques for characterizing gene function is to generate transgenic animals in which an epitope tag such as GFP has been fused to the gene at its normal genomic location (Ross-Macdonald et al., 1999; Morin et al., 2001; Skarnes et al., 2004). These tagged proteins are extremely useful as they permit determination of protein localization in vivo as well as conditional, tissue-specific, temporal and reversible removal of the tagged proteins (Nagarkar-Jaiswal et al., 2015). However, previous methods for generating protein trap alleles in Drosophila have allowed only about 800 genes to be successfully tagged (Kelso et al., 2004; Buszczak et al., 2007; Quinones-Coello et al., 2007; Aleksic et al., 2009; Lowe et al., 2014).
We previously developed a flexible system for engineering the Drosophila genome using the Minos-Mediated Integration Cassette (MiMIC) transposable element. We generated 15,660 strains with a single MiMIC inserted at random within the fly genome and mapped their insertion sites (Venken et al., 2011; Nagarkar-Jaiswal et al., 2015). MiMIC carries sequences that function as a gene and protein trap when inserted in the proper orientation in a coding intron. Moreover, its content can be replaced by Recombination-Mediated Cassette Exchange (RMCE), leading to the introduction of any desired DNA, such as an artificial exon that encodes a protein tag. This approach can potentially be used to tag thousands of genes. Currently, 2854 existing insertions are located within the coding introns of 1862 distinct genes (Nagarkar-Jaiswal et al., 2015), and MiMIC-like elements can now be placed in any gene of interest by CRISPR (Zhang et al., 2014). Unfortunately, the RMCE method needed to convert these insertions into functional protein traps requires embryonic injections of an appropriate donor DNA and screening of many offspring to identify the desired events, a labor- and cost-intensive procedure that does not scale easily. We therefore developed a more efficient and economical in vivo genetic tagging methodology that can in principle be used to generate protein trap alleles of all Drosophila genes.

Results and discussion
We developed a genetic strategy that allows the desired RMCE event to take place efficiently without the need for microinjection. The method uses FLP recombinase to release a genomically integrated DNA flanked by FRT sites into the nucleoplasm where it can efficiently undergo phiC31 integrase-mediated cassette exchange, as demonstrated by Gohl et al. (2011). As shown in Figure 1A, we engineered three donor cassettes, one for each reading frame. The core, which contains a splice acceptor (SA) followed by a (GGS)4 flexible linker, multiple tags (EGFP-FlAsH-StrepII-TEVcs-3xFlag {GFSTF}), another (GGS)4 flexible linker, and a splice donor (SD), is flanked by two inverted attB sites for phiC31-mediated RMCE (Venken et al., 2011). We then cloned this cassette core between tandem FRT sites in a P-element transformation vector (Gong and Golic, 2003). FLP-mediated recombination between the tandem FRT sites excises a circular donor DNA molecule from its initial genetic locus, promoting its efficient recombination with a distal target site (Golic et al., 1997). A mini-white eye color marker gene between our donor cassette and one of the FRT sites allows us to monitor the presence or absence of the donor cassette in FLP recombinase-containing stocks.
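The donor layout described above can be written out as an ordered list of elements. The following is a minimal sketch only: the element labels are illustrative names rather than sequences, and the placement of mini-white next to the second FRT site is an assumption, since the text states only that it lies between the donor cassette and one of the FRT sites.

```python
# Illustrative layout of the donor transgene; labels are names, not DNA sequences.
LINKER = "GGS" * 4  # (GGS)4 flexible linker, written in amino acids

# Tag modules in the order given in the text (GFSTF).
GFSTF = ["EGFP", "FlAsH", "StrepII", "TEVcs", "3xFlag"]


def cassette_core():
    """Artificial exon: splice acceptor, linker, tags, linker, splice donor."""
    return ["SA", f"linker({LINKER})", *GFSTF, f"linker({LINKER})", "SD"]


def donor_transgene():
    """Core flanked by inverted attB sites, placed between tandem FRT sites.

    Assumption: mini-white is drawn next to the second FRT site; the text
    only says it sits between the donor cassette and one of the FRT sites.
    """
    rmce_unit = ["attB", *cassette_core(), "attB(inverted)"]
    return ["FRT", *rmce_unit, "mini-white", "FRT"]


if __name__ == "__main__":
    print(" - ".join(donor_transgene()))
```

Because mini-white lies inside the FRT-flanked segment in this layout, FLP-mediated excision removes it together with the donor, which is why loss of eye color reports loss of the donor from a stock.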
We created six stocks (Figure 1-source data 2), each harboring one of the three donor transgenes located on the second or third chromosome, and a heat shock-inducible FLP recombinase and a germ line-expressed phiC31 integrase on the X chromosome. Because the heat shock-inducible FLP recombinase is somewhat leaky at 18°C, the donor transgene is lost from these stocks at a low frequency, resulting in rare white-eyed flies, which we periodically discard.
To initiate RMCE, we crossed the appropriate donor flies to MiMIC-containing flies and heat shocked the resulting embryos and larvae (Figure 2). Within the primordial germ cells of some of the embryos and larvae, phiC31 integrase catalyzed recombination between attB sites in the donor and attP sites in the MiMIC transposon. The positive RMCE events were selected based on the loss of the y+ marker present in the original MiMIC (Figure 2). We confirmed the integration and orientation of the donor cassette by PCR as described in Venken et al. (2011). Typically, 50% of the integration events are in the proper orientation.
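If each integration event lands in the proper orientation with probability 0.5, as the typical 50% figure suggests, the chance that at least one of n recovered events is usable is 1 - 0.5^n. The sketch below computes this value. The assumption that events are independent, and the function name, are illustrative and not part of the described protocol.

```python
def p_at_least_one_correct(n_events: int, p_correct: float = 0.5) -> float:
    """Probability that at least one of n independent RMCE events
    carries the donor in the proper orientation."""
    if n_events < 0:
        raise ValueError("n_events must be non-negative")
    return 1.0 - (1.0 - p_correct) ** n_events


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10):
        print(n, round(p_at_least_one_correct(n), 3))
```

Under these assumptions, a line that yields only one or two events has a substantial chance of giving no correctly oriented insertion, which is one reason recovering more events per MiMIC is helpful.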
We observed one to ten RMCE events in 93 out of 113 attempts in our initial trial when we set up 3-7 crosses (Cross 2 in Figure 2). After PCR screening, 60/93 of the tested MiMICs allowed integration of at least one donor in the proper orientation to tag the endogenous gene (Supplementary file 1). In summary, we set up 3-7 vials for each starting cross and obtained 60/113 tagged genes. Since the efficiency of RMCE and the ease of detecting yellow− progeny vary between different starting sites, we recommend setting up 10-20 vials and scoring more progeny to improve the success rate. The method has been found to work for a wide variety of genes including a gene located in a telomeric region (lethal giant larvae (l(2)gl)), suggesting that there may be few limitations in its applicability.
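The counts reported above translate into per-step success rates as follows. This is a small arithmetic sketch using only the numbers given in the text; the variable and function names are illustrative.

```python
attempts = 113          # MiMIC lines tested in the initial trial
with_rmce_events = 93   # lines that yielded at least one RMCE event
tagged = 60             # lines with a donor in the proper orientation


def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


rmce_rate = pct(with_rmce_events, attempts)      # lines with any RMCE event
orientation_rate = pct(tagged, with_rmce_events)  # of those, correctly oriented
overall_rate = pct(tagged, attempts)             # overall tagging success

if __name__ == "__main__":
    print(f"RMCE events recovered: {rmce_rate}%")
    print(f"Proper orientation among those: {orientation_rate}%")
    print(f"Overall tagged: {overall_rate}%")
```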
To ensure that the expression pattern and protein distribution correspond to the endogenous protein, we costained two tagged lines for GFP and with specific monoclonal antibodies that are available for the tagged proteins: Eyes shut (Eys) (mAb 21A6) and Delta (Dl) (mAb C594.9B) (Figure 3A). In both cases, the protein recognized by the mAb colocalizes with the GFP and matches the described expression patterns (Das et al., 2013; Haltom et al., 2014). Note, however, that the GFP-tagged Eys protein is present in the cytoplasm of the photoreceptors and the inter-rhabdomere spaces (IRS) of the photoreceptors, whereas the mAb against Eys mostly localizes to the IRS (Figure 3A). These data are in agreement with what we previously observed for numerous tagged proteins (Venken et al., 2011; Nagarkar-Jaiswal et al., 2015).
We stained third instar larval brains and discs for the 60 tagged genes/proteins. The examples shown in Figure 3B include lethal (2) giant larvae (l(2)gl) (a), Delta (Dl) (b), and twins (tws) (c), whose expression patterns are consistent with published data (Kooh et al., 1993; Albertson and Doe, 2003; Chabu and Doe, 2009). Similarly, kayak/fos (kay) is expressed in wing disc nuclei (d) as described earlier (Zeitlinger and Bohmann, 1999). The expression pattern of the remaining genes has not been previously described (Figure 3B): Saposin-related (Sap-r) is expressed in a subset of cells in the larval brain (e), Rad, Gem/Kir family member 3 (Rgk3) is enriched in the mushroom body in the L3 larval brain (f), Heterogeneous nuclear ribonucleoprotein at 98DE (Hrb98DE) is expressed in the L3 larval brain (g), CG10086 is expressed in the hindgut (h), and CG5656 is expressed in the cells of the cuticle (i). The expression patterns of all these genes as well as all the genes listed in Supplementary file 1 are documented in the MiMIC RMCE database at http://flypush.imgen.bcm.tmc.edu/pscreen/rmce/.
In summary, we developed a genetic tagging strategy that will greatly facilitate the EGFP tagging of nearly 2000 genes that already carry MiMIC insertions (Nagarkar-Jaiswal et al., 2015). The same strategy can also be used for tagging genes with other protein tags. In addition, a similar strategy based on lox sites instead of FRT cassettes has recently been developed to integrate an artificial exon carrying the GAL4 gene in MiMICs inserted in coding introns (Diao et al., 2015). These insertions are mutagenic but permit the expression of the endogenous wild-type and mutant cDNAs of Drosophila as well as other species under the control of UAS. Moreover, these tagging methods can now be combined with CRISPR-directed integration of attP-carrying cassettes similar to MiMIC in coding introns to tag almost every gene in Drosophila (Zhang et al., 2014).