Genetic and Transgenic Reagents for Drosophila simulans, D. mauritiana, D. yakuba, D. santomea, and D. virilis

Species of the Drosophila melanogaster species subgroup, including the species D. simulans, D. mauritiana, D. yakuba, and D. santomea, have long served as model systems for studying evolution. However, studies in these species have been limited by a paucity of genetic and transgenic reagents. Here, we describe a collection of transgenic and genetic strains generated to facilitate genetic studies within and between these species. We have generated many strains of each species containing mapped piggyBac transposons including an enhanced yellow fluorescent protein (EYFP) gene expressed in the eyes and a ϕC31 attP site-specific integration site. We have tested a subset of these lines for integration efficiency and reporter gene expression levels. We have also generated a smaller collection of other lines expressing other genetically encoded fluorescent molecules in the eyes and a number of other transgenic reagents that will be useful for functional studies in these species. In addition, we have mapped the insertion locations of 58 transposable elements in D. virilis that will be useful for genetic mapping studies.

2005; Orgogozo et al. 2006; Cande et al. 2012; Arif et al. 2013; Peluffo et al. 2015). However, in the vast majority of cases, these studies have stopped after quantitative trait locus mapping of traits of interest. One factor that has limited further genetic study of these traits is the limited set of genetic markers available to facilitate fine-scale mapping. J. True and C. Laurie established a large collection of strains carrying P-element transposons marked with a w+ mini-gene in a w− background of D. mauritiana (True et al. 1996a,b). These have been used for introgression studies (True et al. 1996b; Coyne and Charlesworth 1997; Tao et al. 2003a,b; Masly and Presgraves 2007; Masly et al. 2011; Arif et al. 2013; Tanaka et al. 2015; Tang and Presgraves 2015) and for high-resolution mapping studies (McGregor et al. 2007; Araripe et al. 2010), demonstrating the utility of dominant genetic markers for evolutionary studies. One limitation of these strains is that the w+ marker is known to induce behavioral artifacts (Zhang and Odenwald 1995; Campbell and Nash 2001; Xiao and Robertson 2016). We have also observed that mutations in the white gene and some w+ rescue constructs cause males to generate abnormal courtship song (Y. Ding and D. Stern, unpublished data). Other pigmentation genes that are commonly used in D. melanogaster are also known to disrupt normal behavior (Bastock 1956; Kyriacou et al. 1978; Drapeau et al. 2006; Suh and Jackson 2007); therefore, it would be preferable to employ dominant genetic markers that do not interfere with normal eye color or pigmentation.
We were motivated by the phenotypic variability and genetic accessibility of these species to establish a set of reagents that would provide, simultaneously, a platform for site-specific transgenesis (Groth et al. 2004) and reagents useful for genetic mapping studies. Therefore, we set out to establish a collection of strains carrying transposable elements marked with innocuous dominant markers for four of the most commonly studied species of the D. melanogaster species subgroup: D. simulans, D. mauritiana, D. yakuba, and D. santomea. We chose the piggyBac transposable element to minimize bias of insertion sites relative to gene start sites (Thibault et al. 2004) and integrated transposable elements carrying EYFP and DsRed driven by a 3XP3 enhancer, which is designed to drive expression in the eyes (Horn et al. 2003). A large subset of the lines described here also include a ϕC31 attP landing site to facilitate site-specific transgene integration. Here, we describe the establishment and mapping of many lines of each species carrying pBac{3XP3::EYFP,attP} and pBac{3XP3::DsRed} (Horn et al. 2003). We have characterized a subset of the pBac{3XP3::EYFP,attP} lines from each species for ϕC31 integration efficiency of plasmids containing an attB sequence. In addition, we have integrated transgenes carrying the even-skipped stripe 2 enhancer to characterize embryonic expression generated by a subset of attP landing sites. We have employed CRISPR/Cas9 to knock out the 3XP3::EYFP gene in a subset of lines to facilitate integration of reagents for neurogenetics. We also describe several other genetic and transgenic reagents that may be useful to the community, including the map positions for pBac transposons integrated in the D. virilis genome.
Fluorescence could be detected easily in the compound eyes and ommatidia in all of the white− strains (D. simulans, D. mauritiana, D. yakuba, and D. virilis) using any dissecting microscope we tried with epi-fluorescence capability (Figure 1A). In flies with wild-type eye coloration, fluorescence in the compound eye is limited to a small spot of 10 ommatidia (Figure 1B). However, we found that fluorescence was very weak, and usually unobservable, in the eyes of flies with wild-type eye coloration using a Leica 165 FC stereomicroscope. This microscope uses "TripleBeam Technology" to deliver excitation light along a separate light path from the emission light. Unfortunately, the excitation light in this system appears to illuminate ommatidia adjacent to the ommatidia that are viewed for the emission light. Fluorescence can still be detected in the ocelli of these flies with this microscope, although this requires a bit more patience than when using a standard epi-fluorescence microscope to screen for fluorescence in the compound eyes.

Figure 1 (A) In flies carrying a w− mutation, fluorescence is often intense and observable throughout the compound eye and in the ocelli (arrowheads). (B) In flies carrying wild-type eye coloration, fluorescence is observed in the compound eye as small dots including 10 ommatidia (arrows) and in the ocelli (arrowheads).

Mapping of transposable element insertion sites
We mapped the genomic insertion sites of all pBac elements using both inverse PCR (iPCR) (Ochman et al. 1988) and TagMap (Stern 2016). iPCR was not ideal for our project for several reasons. First, many isolated strains appeared to contain multiple insertion events, even though they were isolated from single G0 animals. These multiple events could sometimes be detected by segregation of offspring with multiple strengths of fluorescence in the eyes. In these cases, iPCR sometimes produced uninterpretable sequences and occasionally amplified only a single insertion event. Second, many iPCR sequences were too short to allow unambiguous mapping to the genome. Third, iPCR reactions sometimes failed for no obvious reason. For all of these reasons, it was difficult to unambiguously map all of the pBac insertions with iPCR. Therefore, we developed and applied TagMap (Stern 2016) to map the insertion positions of all pBac elements. TagMap combines genome fragmentation and tagging using Tn5 transposase with a selective PCR to amplify sequences flanking a region of interest. This method provides high-throughput, accurate mapping of transposon insertions. TagMap provided transposon insertion positions for all but a few strains. Transposable element insertion sites in the D. simulans and D. mauritiana strains were mapped to D. simulans

Mapping pBac transposon insertion sites in D. virilis
We previously generated multiple pBac(enhancer-lacZ) insertions into D. virilis to study the svb gene (Frankel et al. 2012). However, none of these pBac(enhancer-lacZ) insertions have been mapped previously. These reagents may be useful for genetic mapping studies. Therefore, we have mapped positions of these inserts using TagMap. The larger scaffolds from the D. virilis CAF1 assembly project (http://insects.eugenes.org/species/data/dvir/) (Drosophila 12 Genomes Consortium et al. 2007) have been mapped to Muller elements (Schaeffer et al. 2008). We combined this information with genetic linkage data to assemble 159 Mbp of the D. virilis genome into the six Muller arms (N. Frankel and D. Stern, unpublished data). We mapped insertion sites to this unpublished version of the D. virilis genome.

Figure 2 Genomic insertion sites of pBac transposable elements in D. simulans. Each triangle represents a unique pBac element insertion. Some strains carry multiple insertion events. Some insertion sites are present in multiple strains, at least one of which contains multiple insertions. These strains were maintained to maximize the diversity of insertion sites in the collection. pBac insertions oriented forward are indicated above each chromosome and point to the right while reverse insertions are indicated below each chromosome and point to the left. Rectangles represent inserted elements whose orientation could not be determined. Yellow, green, and red indicate elements carrying 3XP3::EYFP, 3XP3::EGFP, and 3XP3::DsRed, respectively.

Generation of a D. santomea white− allele
We began to generate this collection of reagents prior to the availability of a white− strain of D. santomea. However, soon after CRISPR/Cas9-mediated genome editing became available, we generated a white− strain derived from D. santomea STO-CAGO 1482 as follows. In vitro-transcribed Cas9 mRNA, generated with an EcoRI-digested T7-Cas9 template plasmid and the mMESSAGE mMACHINE T7 Transcription Kit (Thermo Fisher Scientific), together with two gRNAs targeting the third exon of the white gene were injected into preblastoderm embryos by Rainbow Transgenics. The sequence for the T7-Cas9 plasmid is provided in File S2.
The gRNAs were generated by separate in vitro transcription reactions, using the MEGAscript T7 Transcription Kit (Thermo Fisher Scientific), of PCR-amplified products of the following forward and reverse primers: Forward primer CRISPRF-san-w12, 5′-GAA ATT AAT ACG ACT CAC TAT AGG CAA CCT GTA GAC GCC AGT TTT AGA GCT AGA AAT AGC-3′; Forward primer CRISPRF-san-w17, 5′-GAA ATT AAT ACG ACT CAC TAT AGG GCC ACG CGC TGC CGA TGT TTT AGA GCT AGA AAT AGC-3′; Reverse primer gRNA-scaffold, 5′-AAA AGC ACC GAC TCG GTG CCA CTT TTT CAA GTT GAT AAC GGA CTA GCC TTA TTT TAA CTT GCT ATT TCT AGC TCT AAA AC-3′. All PCR reactions described in this paper were performed using Phusion High Fidelity DNA Polymerase (New England Biolabs) using standard conditions. Injected G0 flies were brother-sister mated and G1 flies were screened for white eyes. Once we identified a white− strain, we backcrossed the pBac{3XP3::EYFP-attP} markers generated previously in D. santomea STO-CAGO 1482 to the white− strain. The pBac insertion sites in these new white− strains were then remapped with TagMap.
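The primer design above can be checked computationally. The following minimal Python sketch assumes that each forward primer and the shared reverse primer prime one another through a 20-nt complementary stretch at their 3′ ends and are extended into a full transcription template; the text states only that PCR products of these primers were transcribed, so this reading, the 20-nt overlap length, and the function names are illustrative assumptions. The T7 promoter sequence used in the check (TAATACGACTCACTATAG) is the standard minimal promoter and is not quoted from the text.

```python
# Sketch: assemble the expected in vitro transcription template from each
# forward primer and the shared reverse (scaffold) primer.
# Assumption: the primers anneal through their 3' ends and are extended.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the triplet spacing used in the text and upper-case the sequence."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


FORWARD_PRIMERS = {
    "CRISPRF-san-w12": "GAA ATT AAT ACG ACT CAC TAT AGG CAA CCT GTA GAC GCC "
                       "AGT TTT AGA GCT AGA AAT AGC",
    "CRISPRF-san-w17": "GAA ATT AAT ACG ACT CAC TAT AGG GCC ACG CGC TGC CGA "
                       "TGT TTT AGA GCT AGA AAT AGC",
}
REVERSE_PRIMER = ("AAA AGC ACC GAC TCG GTG CCA CTT TTT CAA GTT GAT AAC GGA "
                  "CTA GCC TTA TTT TAA CTT GCT ATT TCT AGC TCT AAA AC")

OVERLAP = 20  # assumed length of the complementary 3' ends
T7_PROMOTER = "TAATACGACTCACTATAG"  # standard minimal T7 promoter


def assemble_template(forward, reverse, overlap=OVERLAP):
    """Join a forward primer to the reverse complement of the reverse primer.

    Raises ValueError if the 3' ends are not complementary over `overlap` nt.
    """
    fwd = clean(forward)
    rev_rc = reverse_complement(reverse)
    if fwd[-overlap:] != rev_rc[:overlap]:
        raise ValueError("primers do not share the expected 3' overlap")
    return fwd + rev_rc[overlap:]


templates = {
    name: assemble_template(seq, REVERSE_PRIMER)
    for name, seq in FORWARD_PRIMERS.items()
}

for name, template in templates.items():
    print(name, len(template), T7_PROMOTER in template)
```

Under these assumptions, both forward primers share the same 20-nt 3′ end, so a single reverse primer serves for both gRNAs and only the target-specific bases between the promoter and the scaffold differ.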

Testing ϕC31-mediated integration efficiency
Different attP landing sites provide different efficiencies of integration of attB-containing plasmids (Bischof et al. 2007). We performed a preliminary screen of integration efficiency on a subset of the attP landing sites that we generated. Preblastoderm embryos were co-injected with 250 ng/µl of plasmids containing attB sites and 250 ng/µl pBS130 (Gohl et al. 2011), a heat shock-inducible source of ϕC31 integrase, and were incubated at 37° for 1 hr, beginning 1 hr after injection. G0 offspring were backcrossed to the parental line and G1 offspring were screened for the relevant integration marker. We performed this screen using a heterogeneous collection of plasmids that we were integrating for other purposes. Therefore, the integration efficiencies we report are not strictly comparable between sites. Nonetheless, we were able to identify a subset of sites that provide reasonable integration efficiency and which can be made homozygous after integration of transgenes. We report these statistics for all sites that we have tested (File S3).
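To make the screening arithmetic concrete, the Python sketch below treats integration efficiency as the fraction of fertile G0 animals that yield at least one integrant among their G1 offspring. This definition is an assumption for illustration, since the text does not define the statistic reported in File S3, and the function names are hypothetical. The second function gives an upper confidence bound on the per-G0 integration rate when no integrants are recovered, as in the D. virilis screen of 100 fertile G0 per strain reported in the Results; it assumes that G0 animals are independent trials with a common success probability.

```python
# Sketch: summarize an attP integration screen.
# Assumption: efficiency = (fertile G0 yielding integrants) / (fertile G0 screened).


def integration_efficiency(g0_with_integrants, fertile_g0_screened):
    """Fraction of fertile G0 animals that produced at least one integrant."""
    if fertile_g0_screened <= 0:
        raise ValueError("at least one fertile G0 must be screened")
    if not 0 <= g0_with_integrants <= fertile_g0_screened:
        raise ValueError("integrant count must lie between 0 and the number screened")
    return g0_with_integrants / fertile_g0_screened


def upper_bound_when_none_recovered(fertile_g0_screened, confidence=0.95):
    """Upper bound on the per-G0 integration rate after zero integrants.

    Solves (1 - p) ** n = 1 - confidence for p, the largest rate that would
    still give zero successes in n independent G0 with probability
    1 - confidence.
    """
    if fertile_g0_screened <= 0:
        raise ValueError("at least one fertile G0 must be screened")
    return 1.0 - (1.0 - confidence) ** (1.0 / fertile_g0_screened)


# Counts taken from the D. virilis screen described in the Results:
# 100 fertile G0 per strain, no integrants recovered.
observed = integration_efficiency(0, 100)
bound = upper_bound_when_none_recovered(100)
print(f"observed efficiency: {observed:.3f}; 95% upper bound: {bound:.3f}")
```

For a single strain with 100 fertile G0 and no integrants, the bound is roughly 3%, which indicates how low the true rate could be while remaining consistent with such a screen.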

Testing expression patterns and levels of transgenes integrated in different attP sites
Different attP landing sites drive different levels and patterns of transgene expression (Pfeiffer et al. 2010). We have tested a subset of the attP sites in our collection for embryonic expression of an integrated D. melanogaster even-skipped stripe 2 enhancer (Small et al. 1992). A plasmid containing the D. melanogaster eveS2-placZ was co-injected with 250 ng/µl pBS130 into 10 pBac{3XP3::EYFP-attP} strains of each species. We isolated transgenic lines for seven D. simulans, four D. mauritiana, two D. yakuba, and four D. santomea strains. We performed mRNA fluorescent in situ hybridization (FISH) and imaged mid-stage 5 embryos on a Leica TCS SPE confocal microscope (antibody staining is less sensitive than FISH at these stages due to slow production of reporter gene protein products). Embryos of all samples were scanned with equal laser power to allow quantitative comparisons of expression patterns between strains.
We performed staining experiments for all sites from each species in parallel; embryo collection, fixation, hybridization, image acquisition, and processing were performed side-by-side under identical conditions. Confocal exposures were identical for each series. Image series were acquired in a single day to minimize signal loss. Sum projections of confocal stacks were assembled, embryos were scaled to match sizes, background was subtracted using a 50-pixel rolling-ball radius, and fluorescence intensity was analyzed using ImageJ software (http://rsb.info.nih.gov/ij/).
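As an illustration of the first and last steps of this pipeline, the Python sketch below computes a sum projection of a small confocal stack and a mean intensity. It is not the ImageJ procedure: the background step here subtracts a single constant value, a deliberately simple stand-in for the 50-pixel rolling-ball subtraction used in the analysis, and the toy pixel values and function names are hypothetical.

```python
# Sketch: sum projection and mean intensity for a toy z-stack.
# The constant background subtraction is a stand-in, NOT the rolling-ball
# algorithm applied in ImageJ.


def sum_projection(stack):
    """Collapse a z-stack (list of equally sized 2D slices) by summing over z."""
    if not stack:
        raise ValueError("stack must contain at least one slice")
    rows, cols = len(stack[0]), len(stack[0][0])
    return [
        [sum(z_slice[r][c] for z_slice in stack) for c in range(cols)]
        for r in range(rows)
    ]


def subtract_constant_background(image, background):
    """Subtract one background value from every pixel, clipping at zero."""
    return [[max(0, pixel - background) for pixel in row] for row in image]


def mean_intensity(image):
    """Mean pixel value of a 2D image."""
    pixels = [pixel for row in image for pixel in row]
    return sum(pixels) / len(pixels)


# Hypothetical 2 x 2 image with two z-slices.
stack = [
    [[1, 2], [3, 4]],
    [[1, 0], [1, 6]],
]
projection = sum_projection(stack)
corrected = subtract_constant_background(projection, 2)
print(projection, corrected, mean_intensity(corrected))
```

Because a sum projection adds signal across all slices, stacks compared in this way need the same number of slices and the same acquisition settings, which is consistent with the identical exposures and laser power described above.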

Killing EYFP expression from attP landing sites
Expression of the EYFP genes associated with the attP sites may conflict with some potential uses of the attP landing sites, for example for integration of transgenes driving GFP derivatives, such as GCaMP, in the brain. Therefore, we have started generating pBac{3XP3::EYFP-attP} strains where we have killed the EYFP activity using CRISPR/Cas9-mediated targeted mutagenesis. We first built a derivative of the pCFD4-U61-U63 tandem gRNAs plasmid (Port et al. 2014) where we replaced the vermilion marker with a 3XP3::DsRed dominant marker. The vermilion marker was removed by HindIII digestion of pCFD4-U61-U63 and isolation of the 5253 bp band. The 3XP3::DsRed cassette was amplified from a pUC57{3xP3::DsRed} plasmid using the following primers: 5′-TAC GAC TCA CTA TAG GGC GAA TTG GGT ACA CCA GTG AAT TCG AGC TCG GT-3′ and 5′-TTG GAT GCA GCC TCG AGA TCG ATG ATA TCA ATT ACG CCA AGC TTG CAT GC-3′. The PCR product and vector backbone were assembled with Gibson assembly (Gibson et al. 2009) following http://openwetware.org/wiki/Gibson_Assembly to generate p{CFD4-3xP3::DsRed-BbsI}. To remove the BbsI restriction site from DsRed, which conflicts with the BbsI restriction site used for cloning gRNA sequences, we digested this plasmid with NcoI and isolated the 6 kb fragment, PCR-amplified this region with primers that eliminated the BbsI restriction site (forward primer: 5′-CGG GCC CGG GAT CCA CCG GTC GCC ACC ATG GTG CGC TCC TCC AAG AAC GTC A-3′ and reverse primer: 5′-CGC TCG GTG GAG GCC TCC CAG CCC ATG GTT TTC TTC TGC ATT ACG GGG CC-3′), and Gibson cloned the PCR product into the plasmid backbone. This yielded plasmid p{CFD4-3xP3::DsRed}.
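The restriction-site logic of this cloning step can be checked with a short script. The Python sketch below scans the two primers used to remove the BbsI site for BbsI and NcoI recognition sequences on either strand. The recognition sequences (GAAGAC for BbsI, CCATGG for NcoI) are standard enzyme properties rather than values stated in the text, and the function names are illustrative.

```python
# Sketch: count restriction enzyme recognition sites in the primers used to
# remove the BbsI site from DsRed. Both strands are searched.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq):
    """Remove the triplet spacing used in the text and upper-case the sequence."""
    return seq.replace(" ", "").upper()


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return clean(seq).translate(COMPLEMENT)[::-1]


def count_sites(seq, site):
    """Count occurrences of a recognition site on either strand of seq."""
    seq, site = clean(seq), clean(site)
    # A palindromic site equals its reverse complement, so the set holds one entry.
    targets = {site, reverse_complement(site)}
    width = len(site)
    return sum(1 for i in range(len(seq) - width + 1) if seq[i:i + width] in targets)


BBSI = "GAAGAC"  # standard BbsI recognition sequence (non-palindromic)
NCOI = "CCATGG"  # standard NcoI recognition sequence (palindromic)

PRIMERS = {
    "forward": "CGG GCC CGG GAT CCA CCG GTC GCC ACC ATG GTG CGC TCC TCC AAG AAC GTC A",
    "reverse": "CGC TCG GTG GAG GCC TCC CAG CCC ATG GTT TTC TTC TGC ATT ACG GGG CC",
}

report = {
    name: {"BbsI": count_sites(seq, BBSI), "NcoI": count_sites(seq, NCOI)}
    for name, seq in PRIMERS.items()
}
print(report)
```

Neither primer contains a BbsI site, and each contains one NcoI site, which agrees with amplifying a region bounded by the NcoI sites used to open the plasmid. The same function could be applied to a full plasmid sequence to confirm that no BbsI site remains outside the gRNA cloning position.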
To make a plasmid for mutating EYFP in fly lines, we digested p{CFD4-3xP3::DsRed} with BbsI and gel purified the 5913 bp fragment. A gBlocks Gene Fragment (IDT) (5′-CAA GTA CAT ATT CTG CAA GAG TAC AGT ATA TAT AGG AAA GAT ATC CGG GTG AAC TTC GGG TGG TGC AGA TGA ACT TCA GTT TTA GAG CTA GAA ATA GCA AGT TAA AAT AAG GCT AGT CCG TTA TCA ACT TG-3′), which contained a gRNA sequence targeting EYFP that was previously validated by direct injection of gRNA, was synthesized and Gibson assembled with the BbsI-digested fragment of p{CFD4-3xP3::DsRed} to make p{CFD4-EYFP-3xP3::DsRed}.
This plasmid contains attB and can be integrated into attP sites. We tested this by integrating this plasmid into the attP site of D. simulans line 930. This plasmid is a potent source of gRNA targeting EYFP, which we confirmed by crossing this line to a transgenic strain carrying nos-Cas9. We have generated transgenic strains of D. simulans, D. mauritiana, and D. yakuba carrying nos-Cas9 [Addgene plasmid 62208, described in Port et al. (2014)] and details of these lines are provided as File S3.

Data availability
Plasmid pBac{3XP3::EYFP-attP} is available from D. Stern upon request. The p{CFD4} derivative plasmids have been deposited with Addgene (plasmid IDs 86863 and 86864). All fly stocks are maintained in the Stern lab at Janelia Research Campus and all requests for fly stocks should be directed to D. Stern. The raw iPCR and TagMap data are available upon request from D. Stern. We continue to produce new fly strains based on the reagents described in this paper. An Excel sheet containing information about all strains in this paper and any new lines is available at http://research.janelia.org/sternlab/Strains_and_Integration_Efficiencies.xlsx. Geneious files containing genomic insertion sites for all transgenes will be updated with new strains and are available at the following sites: http://research.janelia.org/sternlab/D.simulans_mauritiana_insertions.geneious; http://research.janelia.org/sternlab/D.yakuba_santomea_insertions.geneious; and http://research.janelia.org/sternlab/D.virilis_insertions.geneious. All of these files can be accessed via our lab web page at https://www.janelia.org/lab/stern-lab/tools-reagents-data.

Mapping pBac transposon insertion sites in D. virilis
To assist with genetic experiments in D. virilis, we mapped the insertion locations for all pBac lines generated in our lab for a previously published study (Frankel et al. 2012). We mapped 58 transposon insertions from 39 pBac(enhancer-lacZ) strains plus nine new pBac{3XP3::EYFP-attP} strains. Some strains contained multiple insertions and some insertions mapped to contigs that are not currently associated with Muller arm chromosomes. These results are shown in Figure 6 and are available in a Geneious file and File S3.

Testing ϕC31-mediated integration efficiency
We tested efficiency of integration of attB plasmids into attP landing sites of multiple strains of each species. Integration efficiencies differed strongly between landing sites. Some landing sites in D. simulans, D. mauritiana, D. santomea, and D. yakuba supported integration of attB plasmids, although many landing sites did not support integration at reasonable frequency (Table 1). Details of integration efficiencies for each line are provided in File S3.
In addition, we tested nine D. virilis strains carrying pBac{3XP3::EYFP-attP} for integration of the eveS2-placZ plasmid, which contains an attB site. We screened 100 fertile G0 offspring for each of these nine strains and did not recover any integrants. This is a surprising result, and we do not yet know whether this failure of attB integration is specific to these lines or reflects a general low efficiency of attP-attB integration in D. virilis.

Testing expression patterns of transgenes integrated in different attP sites
We integrated a D. melanogaster eveS2-placZ plasmid into multiple attP landing site strains of each species to examine variability in expression at different landing sites. Levels of reporter gene expression varied between strains (Figure 7). In D. simulans, D. mauritiana, and D. yakuba, we identified at least one strain that drove strong and spatiotemporally accurate eveS2 expression. However, of the four landing sites we tested in D. santomea, none provided strong expression of eveS2 (Figure 7 and Figure 8). eveS2 transgenes often drive weak, spatially diffuse expression prior to stage 5, and all of the D. santomea strains displayed similar diffuse, weak expression at early stages. We also observed ectopic expression of the eveS2 transgene in D. santomea 2092 (Figure 8H). It is not clear if the poor expression of eveS2 in these D. santomea landing sites reflects differential regulation of the D. melanogaster eveS2 enhancer in D. santomea or suppression of expression caused by position effects of these specific landing sites.

Unmarked attP landing sites
To facilitate integration of plasmids expressing fluorescent proteins that overlap with the excitation and emission spectra of EYFP, we have generated a subset of strains in which we induced null mutations in the EYFP gene marking the attP landing sites. These strains were generated by CRISPR/Cas9-induced mutagenesis. All strains were sequenced to ensure that the mutations did not disrupt the attP landing site. We have so far generated two strains in D. mauritiana, and three strains in each of D. santomea, D. simulans, and D. yakuba (File S3).

DISCUSSION
We have generated a collection of transgenic strains that will be useful for multiple kinds of experiments. First, the 3XP3::EYFP-attP strains provide a collection of attP landing sites for each species that will facilitate transgenic assays in these species. Integration efficiencies vary widely between strains and our experiments provide some guidance to identify landing sites with the highest efficiency of integration. Second, these transgenes carry markers that will be useful for genetic mapping experiments. Several published studies have already used these reagents and illustrate the power of these strains for genetic studies (Andolfatto et al. 2011; Erezyilmaz and Stern 2013; Ding et al. 2016).
We have generated transgenic strains using these attP landing sites and found that they show variation in embryonic expression patterns (Figure 7 and Figure 8). These results provide a rough guide to which strains may be useful for experiments that require low or high levels of embryonic expression. However, these results may not be predictive of transgene expression patterns at other developmental stages and in other tissues, and we strongly encourage colleagues to test a variety of landing sites for their experiments and report their experiences to us. We plan to continue to maintain a database reporting on integration efficiencies and expression patterns, and we will periodically update the Excel file associated with this manuscript.
This collection of reagents complements the existing resources available for studying species of the genus Drosophila, including the availability of multiple genome sequences (Drosophila 12 Genomes Consortium et al. 2007) and BAC resources (Song et al. 2011). This resource will accelerate research on gene function in diverse Drosophila species and the study of evolution in the genus Drosophila.