Genomic analysis of P elements in natural populations of Drosophila melanogaster

The Drosophila melanogaster P transposable element provides one of the best cases of horizontal transfer of a mobile DNA sequence in eukaryotes. Invasion of natural populations by the P element has led to a syndrome of phenotypes known as P-M hybrid dysgenesis that emerges when strains differing in their P element composition mate and produce offspring. Despite extensive research on many aspects of P element biology, many questions remain about the genomic basis of variation in P-M dysgenesis phenotypes across populations. Here we compare estimates of genomic P element content with gonadal dysgenesis phenotypes for isofemale strains obtained from three worldwide populations of D. melanogaster to illuminate the molecular basis of natural variation in cytotype status. We show that P element abundance estimated from genome sequences of isofemale strains is highly correlated across different bioinformatics approaches, but that abundance estimates are sensitive to method and filtering strategies as well as incomplete inbreeding of isofemale strains. We find that P element content varies significantly across populations, with strains from a North American population having fewer P elements but a higher proportion of full-length elements than strains from populations sampled in Europe or Africa. Despite these geographic differences in P element abundance and structure, neither the number of P elements nor the ratio of full-length to internally-truncated copies is strongly correlated with the degree of gonadal dysgenesis exhibited by an isofemale strain. Thus, variation in P element abundance and structure across different populations does not necessarily lead to corresponding geographic differences in gonadal dysgenesis phenotypes. Finally, we confirm that population differences in the abundance and structure of P elements that are observed from isofemale lines can also be observed in pool-seq samples from the same populations. Our work supports the view that genomic P element content alone is not sufficient to explain variation in gonadal dysgenesis across strains of D. melanogaster, and informs future efforts to decode the genomic basis of geographic and temporal differences in P element induced phenotypes.


Introduction
5 discovery, annotation, and population analysis in Drosophila (Kofler, Betancourt & Schlötterer, 2012;118 Linheiro & Bergman, 2012;Cridland et al., 2013;Nakagome et al., 2014;Zhuang et al., 2014;Rahman 119 et al., 2015). Comparison of different methods for detecting TEs in Drosophila NGS data has shown 120 that they identify different subsets of TE insertions (Song et al., 2014;Rahman et al., 2015), and thus 121 determining which TE detection method is best for specific biological applications remains an area of 122 active research (Ewing, 2015;Rishishwar, Marino-Ramirez & Jordan, 2016). 123 124 To better understand the molecular basis of differences in cytotype status among populations, we 125 investigated the relationship between GD phenotypes and P element predictions in whole genome 126 shotgun sequences from three worldwide populations of D. melanogaster. By combining previously 127 published GD assay data (Ignatenko et al., 2015) with P element predictions (this study) from genomic 128 data of the same strains (Bergman & Haddrill, 2015), we show that the number of euchromatic P 129 elements is not correlated with the degree of a GD phenotype exhibited by a strain. Furthermore, we 130 show that populations can differ significantly in their euchromatic P element content, yet show similar 131 distributions of GD phenotypes. We also investigate several bioinformatics strategies for detecting P 132 element insertions in strain-specific and pooled genomic data to ensure robustness of our conclusions 133 and help guide further genomic analysis. Our work supports previous conclusions that euchromatic P 134 element copy number is not sufficient to explain variation in GD phenotypes, and informs future efforts 135 to decode the genomic basis of differences in P element induced phenotypes over time and space. 136 137

138
Gonadal dysgenesis phenotypes. We re-analyzed GD assay data from (Ignatenko et al., 2015) for 43 139 isofemale strains of D. melanogaster from three geographic regions: North America (Athens, Georgia, 140 USA), Europe (Montpellier, France), and Africa (Accra, Ghana), described in (Verspoor & Haddrill, 141 2011). Definitions of A and A* crosses in (Ignatenko et al., 2015) are inverted relative to those 142 proposed by (Engels & Preston, 1980), and were standardized prior to re-analysis here. Cross A 143 measures the activity of tester strain males mated to M-strain Canton-S females; cross A* measures the 144 susceptibility of tester females mated to a P-strain Harwich males. P, P', Q, and M-strains were defined 145 according to (Kidwell, Frydryk & Novy, 1983;Quesneville & Anxolabéhère, 1998). 146 147 Genome-wide identification of P element insertions. Whole genome shotgun sequences from the same 148 D. melanogaster strains used for GD assays were downloaded from the European Nucleotide Archive 6 (ERP009059) (Bergman & Haddrill, 2015). These genomic data were collected using a uniform library 150 preparation and sequencing strategy (thus mitigating many possible technical artifacts) and include data 151 for both individual isofemale strains and pools of single flies from isofemale strains (see (Bergman & 152 Haddrill, 2015) for details). In total, 43 isofemale strain genomes from (Bergman & Haddrill, 2015) 153 were analyzed that had GD data in (Ignatenko et al., 2015). Two pool-seq samples were analyzed for 154 each population [N. America (15 and 30 strains), Europe (20 and 39 strains), and Africa (15 and

Europe and Africa. 185
To address whether genomic data can be used to understand how cytotype status varies geographically 186 and temporally, we identified P element insertions in publicly available genome sequences (Bergman 187 & Haddrill, 2015) for a panel of 43 isofemale strains from three global regions with previously-188 published GD phenotypes (Ignatenko et al., 2015). As reported in (Ignatenko et al., 2015), isofemale 189 strains from these populations were mainly P, M and Q ( Figure 1, Table 1). Based on genomic analysis, 190 all strains in these populations that are defined phenotypically as M are actually M' (File S1). For N. 191 American and African populations, the degree of activity tends to vary more across strains relative to 192 susceptibility (Table 1, Figure S1A (Table 2). These results suggest that the filtering steps performed by 209 McClintock improve the consistency of TE predictions made by TEMP and RetroSeq on isofemale 210 strains, and that the filtered data are more likely to reflect the true P element content of these lines. 211

212
The average number of P element insertions predicted per strain for all three populations is shown in 213 Table 2. In general, McClintock filtered predictions data suggests these isofemale lines contain ~70-214 8 120 P element insertion sites, which is roughly 2-fold higher than the 30-50 copies per haploid genome 215 estimated from Southern blotting (Bingham, Kidwell & Rubin, 1982;Ronsseray, Lehmann & 216 Anxolabéhère, 1989;Bonnivard & Higuet, 1999;Itoh & Boussy, 2002). These results are consistent 217 with increased resolution of P element predictions based on genomic data plus residual heterozygosity 218 due to incomplete inbreeding in these strains (Lack et al., 2016). In contrast to the lack of population 219 difference observed at the phenotypic level, genomic data shows clear differences in the numbers of 220 euchromatic P element insertions in strains from North American populations relative to the European 221 and African populations, regardless of the TE detection method and filtering (One-way ANOVA; 222 Table 2; Figure S1C-F; Figure 3). Taken together with the GD data, these 223 results suggest that population-level differences in the abundance of P elements per strain do not 224 necessarily lead to population-level differences in the frequency of GD phenotypes. 225

226
Integrating published GD data with our genomic predictions at the level of individual strains, we 227 directly tested whether the number of P elements per strain is associated with either GD phenotype. We 228 found that neither the degree of activity (cross A) nor the degree of susceptibility (cross A*) was 229 significantly linearly correlated with the filtered number of predictions made by TEMP or RetroSeq 230 (p>0.11; Figure 3). Similar results were obtained using the raw output of these methods as well ( Figure  231 S2). These results confirm results at the population level above and suggest that there is no simple 232 relationship between the total number of euchromatic P elements and GD phenotypes at the level of 233 individual strains. 234 235

Population difference in P element insertion numbers can be observed in pool-seq samples. 236
Pooled-strain sequencing (pool-seq) is a cost-effective strategy to sample genomic variation across 237 large numbers of strains and populations (Schlotterer et al., 2014). To address whether the differences 238 among populations we observed in the number of P elements predicted in isofemale strain data are also 239 seen in pool-seq data, we predicted P element insertions in pool-seq samples from the same populations 240 (Table 3, File S2). Two pool-seq samples are available for each population that differ in the number of 241 individuals (one per isofemale strain) used: North America (n=15 and n=30), Europe (n=20 and n=39), 242 and Africa (n=15 and n=32). The smaller pools from each population include one individual from the 243 same isofemale strains analyzed above; the larger pools contain one individual from the same strains as 244 the smaller pools, plus individuals from additional isofemale strains from the same population that 245 were not sequenced as isofemale strains. Thus, the smaller pool-seq samples are a nested subset of the 246 9 larger pool-seq samples, and pool-seq samples from the same population are not fully independent 247 from one another. 248

249
The numbers of P element insertions identified by TEMP and RetroSeq in pool-seq samples are given 250 for all three populations in Table 3. In the raw output, TEMP predicted more insertions in larger strain 251 pools relative to smaller strain pools, as expected for a method designed to capture TE insertions that 252 are polymorphic within a sample (Zhuang et al., 2014). However, McClintock-filtered TEMP output 253 generated between 9-fold and 60-fold fewer insertions per sample in the pool-seq output relative to raw 254 output, as well as fewer insertions overall in the larger strain pools relative to the smaller strain pools. 255 These effects are likely because of the McClintock requirement for TEMP predictions to have the 256 proportion of reads supporting the insertion to non-supporting reads to be >10%. In contrast, 257 McClintock filtering reduced the total number of RetroSeq predictions by less than 2-fold, and fewer 258 insertion sites were predicted for the larger strain pools in both raw and filtered RetroSeq output. 259 260 When the number of strains in a pool-seq sample was used as a scaling factor, pool-seq samples yield 261 many fewer P element predictions per strain than the average number of P elements predicted from the 262 same isofemale strains ( Table 2, Table 3). This is expected because of lower sequencing per strain 263 depth in the pooled samples relative to the isofemale line samples. Similarly, the scaled data shows 264 fewer insertions were predicted per strain in the larger pools relative to smaller pools for all populations 265 regardless of method or filtering (Table 3). This effect arises because larger and smaller strain pool 266 samples contain similar numbers of reads (~44 million read pairs per sample), and thus larger strain 267 pools have fewer reads per strain. Because pooled samples contain the same strains as isofemale lines 268 and because smaller pools contain a subset of the same strains that are present in larger pools, these 269 results suggest a dilution effect for P element detection in pool-seq samples: at a fixed sequencing 270 coverage, P element insertions that are predicted in samples with higher coverage cannot be detected in 271 samples with lower coverage, even though they are in fact present in the sample. 272

273
In spite of this dilution effect, African pool-seq samples tend to have more insertions per strain than 274 North American samples (Table 3), similar to what is seen in the isofemale strain datasets (Figure 3, 275 Table 2). This result is most clearly demonstrated for the comparison between North American and 276 African samples which each had 15 strains, where the African sample has more predicted insertions 277 regardless of TE detection method and filtering. These results suggest that, if dilution effects are 278 properly controlled for, pool-seq samples can capture general trends among populations in total P 279 element insertion numbers that are seen in isofemale strain sequencing. pool-seq samples. Similarly, we found that there is a diminishing return on the number of P element 297 insertions detected per strain in pool-seq samples for a given sequencing coverage, regardless of 298 method or filtering. These dilution effects mean it will be difficult to compare P element predictions 299 from pool-seq data with those from isofemale strains or to compare pool-seq samples to each other, 300 unless the read depth per strain in the pool is carefully controlled. We note that the observation of 301 diminishing returns for a fixed level of coverage in pool-seq samples does not contradict previous 302 claims that (with increasing total sequencing coverage) there appears to be no diminishing return on 303 detection of new TE insertions in D. melanogaster pool-seq samples (Rahman et al., 2015). 304 305 Regardless of prediction method, we found no simple linear relationship between the strength of GD 306 phenotypes and the number of euchromatic P element insertions across isofemale strains (Figure 3). 307 Our results are consistent with previous attempts to connect total numbers of P elements in a genome to 308 GD phenotypes, which found weak or no correlations using Southern blotting or in situ hybridization to 309 polytene chromosomes (Todo et al., 1984;Engels, 1984;Boussy et al., 1988;Ronsseray, Lehmann & 310 11 Anxolabéhère, 1989;Rasmusson et al., 1990;Itoh et al., 1999;Bonnivard & Higuet, 1999;Itoh & 311 Boussy, 2002;Itoh et al., 2004Itoh et al., , 2007. Assuming that GD assays using single reference strains provide 312 robust insight into the GD phenotypes of these natural strains, our results are at face value consistent 313 with recent arguments that the P element may not be the primary determinant of hybrid dysgenesis 314 (Zakharenko & Ignatenko, 2014). However, our results are also consistent with GD phenotypes being 315 determined by one or more active full-length P element insertions found in specific locations in the 316 euchromatin, or by the relative abundance of full-length and truncated repressor elements, rather than 317 overall copy numbers (which includes both active and inactive copies). Alternatively, the lack of 318 correlation between the number of P element insertions and GD phenotypes may result from noise in 319 the data due to the genomic sequence data not being of sufficient depth in these samples, or the GD 320 assays having substantial experimental variation across lines. 321 322 Nevertheless, we did observe differences among populations in the number of predicted P element 323 insertions per strain (Figure 3), even though no strong differences were observed in the levels of GD 324 phenotypes across these populations. Specifically, strains from the North American population had the 325 fewest predicted P element insertions, regardless of the TE detection method or filtering (Figure 3, 326 Figure S1). This result is somewhat unexpected given that the P element is thought to have first been 327 horizontally transferred into a North American population before invading the rest of the world 328 (Anxolabehere, Kidwell & Periquet, 1988). This observation suggests that N. American populations 329 may have evolved some form of copy number control not present in other populations. Evidence for 330 fewer P element insertions per strain in the North American population could also be detected in pool-331 seq samples, especially when the number of strains per pool was controlled for (Table 3), indicating 332 that pool-seq is a viable strategy for surveying differences in P element copy number across 333 populations. Overall, our results show that it is possible to detect clear differences in euchromatic P 334 element insertion profiles among populations using either isofemale strain or pool-seq genomic data, 335 however interpreting how P element insertion site profiles relate to GD phenotypes at the strain or  Table 1. Gonadal dysgenesis (GD) levels and P-M status for isofemale strains of D. melanogaster 512 obtained from natural populations in North America, Europe and Africa. %GD for cross A (tester strain 513 males versus M-strain Canton-S females) and cross A* (P-strain Harwich males versus tester strain 514 females) are based on data reported in (Ignatenko et al., 2015). Cross A and A* labels in (Ignatenko et 515 al., 2015) are inverted relative to those proposed by (Engels & Preston, 1980) and were converted to 516 standard labels prior to analysis here. P-M status for individual strains is according to (Ignatenko et al., 517 2015). Phenotypically M-strains are in fact M'-strains based on analysis of genomic data (File S1, File 518 S2    Europe and Africa. %GD for cross A (tester strain males versus M-strain Canton-S females, vertical 537 axis) and cross A* (P-strain Harwich males versus tester strain females, horizontal axis) are based on 538 data reported in (Ignatenko et al., 2015). Cross A and A* labels in (Ignatenko et al., 2015) are inverted 539 relative to those proposed by (Engels & Preston, 1980) and were converted to standard labels prior to 540 analysis here. Each dot represents an isofemale strain. The P-M status for various sectors of GD 541 phenotypic space defined by A and A* crosses are according to (Kidwell, Frydryk & Novy, 1983;542 Quesneville & Anxolabéhère, 1998) Figure S2. Each triangle represents an isofemale strain. 556

557
File S1. Tab separated value (TSV) formatted file with %GD data from A and A* crosses, P-M 558 cytotype status, population, and numbers of predicted P elements in raw and filtered output from 559 TEMP and RetroSeq, respectively, for 43 isofemale strains from three global regions. GD data are 560 taken from (Ignatenko et al., 2015) and were standardized to definitions proposed by (Engels & 561 Preston, 1980) prior to re-analysis here. 562 563 File S2. Zip archive of browser extensible data (BED) files of predicted P element locations in genome 564 sequences from 50 isofemale strains and 6 pool-seq samples from three global regions. Each sample 565 has four BED files corresponding to raw (*raw.bed) and filtered (*nonredundant.bed) output from 566 TEMP and RetroSeq, respectively. BED files for 7 isofemale strains from (Bergman & Haddrill, 2015) 567 are included here that do not have GD data in (Ignatenko et al., 2015) but are included in the pool-seq 568 samples, allowing comparisons to be made between isofemale strains and pool-seq samples for the 569 same set of strains. 570 571 Figure S1. Distributions of %GD in A and A* crosses and numbers of predicted P element insertions 572 for isofemale strains within and between populations from three worldwide regions. Distributions are 573 shown as boxplots with black lines representing median values, boxes representing the interquartile 574 range (IQR), whiskers representing the limits of values for strains that lie within 1.5 x IQR of the upper 575 or lower quartiles, and circles representing strains that lie outside 1.5 x IQR of the upper or lower 576 quartiles. GD data are taken from (Ignatenko et al., 2015) and were standardized to definitions 577 proposed by (Engels & Preston, 1980) prior to re-analysis here. Numbers of P elements predicted by 578 TEMP or RetroSeq shown are before (raw) and after (filtered) Figure S2