The Effect of Common Inversion Polymorphisms In(2L)t and In(3R)Mo on Patterns of Transcriptional Variation in Drosophila melanogaster

Chromosomal inversions are a ubiquitous feature of genetic variation. Theoretical models describe several mechanisms by which inversions can drive adaptation and be maintained as polymorphisms. While inversions have been shown previously to be under selection, or contain genetic variation under selection, the specific phenotypic consequences of inversions leading to their maintenance remain unclear. Here we use genomic sequence and expression data from the Drosophila Genetic Reference Panel (DGRP) to explore the effects of two cosmopolitan inversions, In(2L)t and In(3R)Mo, on patterns of transcriptional variation. We demonstrate that each inversion has a significant effect on transcript abundance for hundreds of genes across the genome. Inversion-affected loci (IAL) appear both within inversions as well as on unlinked chromosomes. Importantly, IAL do not appear to be influenced by the previously reported genome-wide expression correlation structure. We found that five genes involved with sterol uptake, four of which are Niemann-Pick Type 2 orthologs, are upregulated in flies with In(3R)Mo but do not have SNPs in linkage disequilibrium (LD) with the inversion. We speculate that this upregulation is driven by genetic variation in mod(mdg4) that is in LD with In(3R)Mo. We find that there is little evidence for a regional or position effect of inversions on gene expression at the chromosomal level, but do find evidence for the distal breakpoint of In(3R)Mo interrupting one gene and possibly disassociating the two flanking genes from regulatory elements.

Drosophila inversion polymorphism eQTLs DGRP DPGP Chromosomal inversions, in which a portion of linear DNA sequence is flipped in its orientation, are a common member of the menagerie of DNA polymorphisms, and have been found in diverse organismal populations such as humans, plants, and fruit flies (Krimbas and Powell 1992;Kidd et al. 2010;Lowry and Willis 2010) . In many cases, large chromosomal inversions have profound impacts on phenotype and disease (Feuk 2010). For instance, recurrent inversions are responsible for an estimated 43% of hemophilia A cases (Lakich et al. 1993). Inversions can also have beneficial effects. A 900 kb inversion on human chromosome 17 (q21.31) has been shown to be associated with higher female fecundity in the Icelandic population (Stefansson et al. 2005). In populations of the malaria vector Anopheles gambiae, a large chromosomal inversion on chromosome 2L (2La) is associated with desiccation resistance and thus segregates at high frequencies in arid environments (Fouet et al. 2012). These examples are the very tip of the iceberg; inversion polymorphisms have been implicated in numerous phenotypic differences among a host of organisms, however little is known about the mechanisms by which inversions confer their phenotypic effects.
Perhaps the single best studied inversions are those from Drosophila, in part made famous by the pioneering work of Dobzhansky (Dobzhansky and Sturtevant 1938). Dobzhansky focused much attention on spatial and temporal variation in the frequency of large inversions of Drosophila pseudoobscura and showed in broad strokes that clear fitness differences were responsible for the regular patterns of frequency change observed. These findings in turn spurred a large body of population genetics theory to explain the establishment and selective persistence of inversions in natural populations (Levene and Dobzhansky 1958;Fraser et al. 1966;Anderson et al. 1967;Tobari and Kojima 1967). As postulated by Sturtevant (1921), crossover suppression induced in inversion heterozygotes can mean that a single adaptive allele within an inversion may suffice for the selective invasion of that rearrangement (Haldane 1957). Such lowered levels of recombination and attendant increases in linkage disequilibrium (LD) could thus present the opportunity for subsequent coadaptation of multiple genes near inversion breakpoints (Sturtevant and Mather 1938;Dobzhansky 1947). Conversely, locally adapted alleles that predate the rearrangement on the same chromosome might aid the establishment of an inversion simply because of the reduction in recombination rates between such loci (Kirkpatrick and Barton 2006). Further, inversions might have direct fitness effects, for instance by deletion or changes in gene expression near the inversion breakpoints (Kirkpatrick and Kern 2012). At present, we have precious little information as to the variants responsible for differential fitness effects associated with inversions.
In D. melanogaster, paracentric inversions spanning several megabases are common and have been found in populations across the globe (Stalker 1976(Stalker , 1980Aulard et al. 2002;Van 't Land et al. 2000;Knibb et al. 1981;Sezgin et al. 2004;Anderson et al. 2005;Umina et al. 2005). Much of this segregating inversion polymorphism is associated with latitudinal clines in D. melanogaster, a historically tropical species adapting along tropical-to-temperate climatic gradients in Australia and North America (Knibb et al. 1981;Weeks et al. 2002;de Jong and Bochdanovits 2003;Sezgin et al. 2004;Reinhardt et al. 2014;Schrider et al. 2016). Clinically varying phenotypes that are associated with inversions include heat resistance, cold tolerance, and body size (Weeks et al. 2002;Anderson et al. 2005;de Jong and Bochdanovits 2003). Clinal variation of inversion frequency in D. melanogaster has been shown via population genetic approaches to be due to selection independent of demography (Reinhardt et al. 2014;Kapun et al. 2016), though migration has been suggested to generate these patterns along with local adaptation (Bergland et al. 2016). Indeed, inversions have been observed to have a major effect on several phenotypes that vary between temperate and tropical populations across several Drosophila species, and these clines have been stable since their discovery roughly 80 yr ago (Hoffmann et al. 2004;Cogni et al. 2017). Unfortunately, while the associations are known, the molecular mechanisms at work in determining differential phenotypes as a result of inversion status are still unknown.
Recent population genomic projects in D. melanogaster, such as the DGRP and Drosophila Population Genomics Project, have opened up the opportunity to study inversions systematically as these resources have captured segregating inversions from North America and Africa Mackay et al. 2012;Pool et al. 2012;Houle and Márquez 2015).  bioinformatically mapped previously unknown breakpoints of several inversions, a task that was tedious for even single inversions prior to whole-genome sequencing (Wesley and Eanes 1994;Andolfatto et al. 1999;Matzkin et al. 2005). For instance,  discovered the breakpoints associated with numerous inversions, demonstrated the expected increase in LD near inversion breakpoints, and elevated differentiation between inverted and standard arrangement chromosomes at the nucleotide level. In parallel with the exponential increase in population genomic resources, large-scale phenotypic association studies of these same genotypes have been accumulating (Mackay et al. 2012;Telonis-Scott et al. 2016;Vonesch et al. 2016). These include numerous phenotypes previously associated with inversion poly-morphism such as body size (Weeks et al. 2002) and desiccation resistance .
A logical place to look for inversion effects that may influence suites of phenotypes would be transcript level variation. Previous findings strongly suggest that inversions could be important drivers of adaptation with gene expression variation as a potential molecular mechanism (Chambers 1991;López-Maury et al. 2008;Fraser 2013). Indeed, inversions could affect patterns of transcript variation in a number of ways: (1) genes at or near inversion breakpoints may become disabled or separated from their regulatory apparatus, thus inversions may have direct effects on transcription; (2) increased LD in inversions due to crossover suppression may increase linkage with gene expression Quantitative Trait Loci (eQTL), and thus alternative alleles of the inversion may be associated with differential expression of genes within the inversion (i.e., indirect, cis-eQTL associated with the inversion); (3) eQTL in LD with the inversion might themselves regulate genes outside of the inversion (i.e., indirect, trans-eQTL associated with the inversion); or (4) the large-scale nature of Drosophila inversions may create global changes in the organization of chromatin or nuclear localization of the chromosomes such that genes are differentially regulated between inversion and standard karyotypes. Thus, inversions may have a direct effect on global patterns of transcription and act as trans-eQTL themselves. Indeed, earlier studies of transcriptional variation in D. melanogaster have hinted at the influence of inversions on genomewide patterns of gene expression (Ayroles et al. 2009;Massouras et al. 2012;Huang et al. 2015). Here, we address the effect of two cosmopolitan inversions, In(2L)t and In(3R)Mo, on patterns of transcription by using whole-genome sequence, gene expression, and inversion call data from the DGRP.

Materials
Processed expression data previously reported in Ayroles et al. (2009) was downloaded from ArrayExpress (Kolesnikov et al. 2015). We accepted inversion state calls for each of the DGRP lines where cytological and bioinformatic inversion calls for In(2L)t and In(3R)Mo agree and removed lines from analyses where there was any disagreement Huang et al. 2014;Houle and Márquez 2015). Using the same databases, we removed individuals from lines likely heterozygous for In(2L)t or In(3R)Mo. Expression analyses were performed with 34 lines (136 individuals), with two lines homozygous for In(2L)t and seven lines homozygous for In(3R)Mo. There were 26 lines homozygous for the standard arrangement at both In(2L)t and In(3R)Mo, as one line was homozygous for both inversions. We calculated LD between In(2L)t and SNPs using 181 lines, including 19 inversion-bearing lines. We calculated LD between In(3R)Mo and SNPs using 197 lines, including 17 inversion-bearing lines. To maintain consistency with Affymetrix library files, dm3/BDGP release 5 genomic coordinates and annotations corresponding to BDGP version 5.49 were used in conjunction with the Affymetrix Drosophila 2 Release 35 library file update.
39 UTR array analysis: Correlation structure: Pairwise gene expression correlation coefficients were calculated by linear regression on all unique pairwise combinations of probe sets, excepting self-comparisons. Correlation coefficients are reported here as adjusted r 2 from the R function (lm). Gene expression modules used here were reported in Ayroles et al. (2009). Null distributions of uniquely occupied clusters for each inversion were generated by permuting the occupied cluster for each gene 100,000 times and calculating the total unique clusters occupied by IAL for each permutation.
Inversion effect on expression: Linear regressions of sex, inversion, and Line effects with expression as the response were performed for each probe set: for individual expression value, Y ijkl , in response to ith Sex, A i , jth In (2L)t state, B j , kth In(3R)Mo state, C k , and the lth Line in the jth In(2L) t state, D kl , with e ijkl as the error term in Lines. Model testing was performed using (add1) and (drop1) in R to add interaction terms or remove main effect terms from the above model, respectively. Inter-action terms were added one at a time to the main effects and tested for each probe set. An AIC is reported for each model with a lower absolute value being preferred when comparing two models. The effect of adding an interaction term between Sex and In(2L)t or Sex and In(3R)Mo varied by probe set, but the above model was the best fit for 10,082 of the probe sets. Models with and without an interaction term between inversions performed the same for all loci, thus, we chose the less complex model. All of our analyses compare results across all loci; therefore, for meaningful conclusions, we must use the same model for all loci. We chose the simpler model by simple majority rule. It is worth noting that this means that most loci are tested by the best fit model and the rest of the loci are tested by a likely overly conservative model with respect to detecting inversion effects. Similarly, the above model was the best fit for 12,994 probe sets when compared to dropping any of the main effect terms. We calculated P-values of the observed F values as percentiles of the F distributions generated by 10,000 permutations, sampling each inversion independently without replacement. Multiple testing correction was performed by calculating q-values using the R package (q value) with FDR = 0.05 (Storey and Tibshirani 2003) on the permutation-derived P-values. Proportion of variance explained by each effect was calculated as h 2 . Magnitude and direction of inversion effect was calculated as Cohen's d (Cohen 1988). Cohen's d is the difference of the means of two groups scaled to SD by means of the root mean square of the SD of the two groups. Thus, effect sizes as Cohen's d can be compared across multiple tests, as we present here, even though SD, in addition to means, may differ both within the pairwise comparison of one locus and between loci. For example, for two distributions, each with a SD = 1, Cohen's d = 1 represents a difference of one SD between the means and a 62% expected overlap given a false positive rate of 5%.
Functional annotation enrichment: Functional annotation profiling was performed using the g:Profiler online portal of g:GOSt using default settings (Reimand et al. 2016) (version r1622_e84_eg31). Ambiguous 39 UTR probe sets were resolved manually if possible, or ignored if they overlapped transcripts for . 1 gene.
SNP-inversion LD: LD was calculated as r 2 = D'/p S (1-p S )p I (1-p I ) for each diallelic SNP S and inversion I, with major allele frequencies p S and p I , using a custom bash script. For each SNP, significance was calculated as a chi-squared (x 2 ) transformation with one d.f. of r 2 as x 2 = Nr 2 , where N is the sample size. Significant LD was defined as a SNP with sample size of at least 60, minor allele frequency of 10% for both the SNP and inversion state, and x 2 . critical value (P = 0.05, d.f. = 1) with Bonferroni correction for all SNPs on that chromosome arm (n = 967774, x 2 . 29.65329 and 947970, x 2 . 29.61321 for chr2L and chr3R, respectively). The sample size and minor allele frequency cutoffs ensured that there were at least six representative lines bearing minor alleles. Genes were considered in significant LD with inversion state if at least one significant SNP was found within the annotated gene region (FlyBase v5.49).  Counts of IAL between breakpoints, on the same chromosome as the inversion but outside of the breakpoints, and those on the same chromosome as the inversion but outside and within 1 Mb of the breakpoints, as well as total IAL on each chromosome arm for each inversion. Chr, chromosome; IAL, inversion-affected loci.
IAL physical clustering: To see if IAL were physically clustered within the genome, we examined physical clustering by measuring the coefficient of variance (CV) of distances between genes by chromosome arm. Location and length of each gene was used from FlyBase v5.49. Distance between neighboring genes was calculated as the distance between ends of gene-annotated regions of neighboring genes. Distance from the most distal or proximal genes to the distal or proximal endpoint, respectively, was not included. CV was calculated as the SD divided by the mean of the distribution of intergenic distances for each chromosomal arm (Sokal and Rohlf 1995). We defined the intergenic distance between overlapping gene regions as zero. Null distributions of intergenic distances for each chromosome and each inversion were generated by 100,000 random samples, without replacement, of the same number of genes from a chromosome arm as the number of IAL for that chromosome arm and inversion, then calculating the distances between those genes. C.I.s were calculated as 2.5-97.5% quantiles from the corresponding random sample distribution.
Transcription factor (TF)-target gene (TG) interactions: TFs and TGs were defined by Drosophila Interaction Database modMine (Contrino et al. 2012. Genes in this analysis were those that are present in Affymetrix Drosophila2 genome array annotation (Release 35), the DroID TF-TG database v2015_12, and FlyBase v5.49 annotation. Over/underrepresentation of genes with significant inversion effect on expression as targets of TFs with SNPs in LD with inversion state was calculated as the probability of the observation given the hypergeometric distribution.

Data availability
Custom scripts, data, and analysis results can be found online at https:// github.com/kern-lab/lavingtonKern, including file descriptions in the AnalysisFiles.readme document.

RESULTS
To examine what influence, if any, common inversion polymorphisms have on patterns of transcription in the Drosophila genome, we combined publicly available genome sequences ( Table 1). There are a large number of loci with transcript abundance variation correlating with inversion state; however, we note that the inversion effect contribution to variance is relatively small for the vast majority of loci (Supplemental Material, File S1, Table S1, and Table S2).
One explanation for the large number of IAL found across the genome is that few loci are directly affected by the inversion and the remaining loci are affected indirectly by an expression variation correlation structure, previously described by Ayroles et al. (2009). We addressed this correlation structure by the numbers of unique expression modules occupied by, and the distribution of correlation coefficients of, IAL as compared to all genes. If a significant portion of the IAL we observe are due to expression variation correlation, then we would expect that IAL occupy fewer expression modules than the same number of genes drawn at random. We would also expect the mean correlation of IAL between IAL to be higher than the genome-wide average. We observe that IAL occupy more modules than expected at random for both In(2L)t (71 obs; 38-56 95% C.I.) and In(3R)Mo (108 obs ; 65-87 95% C.I.). For both inversions, we did not observe higher mean correlation between IAL and non-IAL or the genomewide mean (Table 2).
We then examined the inversion effect on gene expression variation for four distinct categories of effect: (1) cis-or (2) trans-inversion effects of SNPs in LD with the inversion, (3) direct effects of the inversion by interrupting genes, and (4) regional effects of chromosomal rearrangement.

Cis-inversion effect of SNPs in LD with the inversion
Our model explicitly tests the effect of chromosomal arrangement on expression variation, and here we focus on SNP variation as the main driver of the inversion effect by taking advantage of LD between the inversion state and SNP variants. LD with inversions is highest at the breakpoints and decays in both directions from each breakpoint as expected (Figure 2) (Wesley and Eanes 1994;Andolfatto et al. 2001;Langley et al. 2012;Corbett-Detig and Hartl 2012). To determine n  cis-effects of the inversions, we tested for overrepresentation of IAL among loci in LD with inversion state. This ignores the location of the loci and focuses on the correlation of SNP alleles with the inversion state. As expected, few loci in LD with the inversion were located on a different chromosomal arm [three with In(2L)t, and two with In(3R) Mo)] and IAL in LD are located only on the same chromosomal arm (Table 3). We observed a significant overrepresentation of IAL with SNPs in LD with inversion state (Table 4).

Trans-inversion effect of SNPs in LD with the inversion
Our expectation of a cis-inversion effect is dependent on SNP variation in LD with the inversion. By the same rationale, a trans-inversion effect may be detected as an IAL without SNP variation in LD with the inversion, as we observe with a majority of IAL for each inversion [181 of 192 for In(2L)t and 323 of 425 for In(3R)Mo] (see Table 4). Assuming that SNP variation is the basis of expression variation, one trivial explanation of a trans-inversion effect is SNP variation in TFs in LD with the inversion acting on downstream targets. The TFs in this case need not be IAL as SNPs in protein-coding regions of TFs can give rise to expression variation in downstream targets. However, we found no over-or underrepresentation of IAL that are targets of TFs with SNPs in LD with either inversion (Table 5). A possible example of trans-inversion effect was found by functional analysis of IAL, and is discussed below.

Direct effect of inversions by disrupting genes at breakpoints
Nucleotide sequence variation at each breakpoint of an inversion can truncate transcribed gene regions and disassociate transcribed regions from TF binding sites and other regulatory elements. Truncation most likely leads to downregulation of genes and rearrangement of regulatory elements, and can lead to up-or downregulation of genes at either breakpoint. The closest IAL to any of the four breakpoints are near the distal breakpoint of In(3R)Mo. CG1951 and bGalNAcTB are within 2.5 kb inside and outside of the inversion region, respectively, and Ssl2 is interrupted by the breakpoint. All three are downregulated in the presence of the inversion. The next closest IAL is 24 kb away. The closest IAL to the proximal breakpoint of In(3R)Mo are 32 kb away. It is also important to note that loci immediately surrounding the proximal breakpoint are not transcriptionally affected by the rearrangement, so the presumed disassociated regulatory elements from the distal end are not altering expression of loci at the proximal end. The closest IAL to the proximal and distal breakpoints of In(2L)t are 34 and 37 kb away, respectively, and thus probably too far to have been affected by direct effects of the breakpoint.

Regional effect of chromosomal rearrangement
We tested for regional effects of chromosomal rearrangement by looking for over-or underrepresentation of genes in LD with the inversion as IAL. We did observe more IAL than expected in LD with each inversion (Table 4) and a trend of transcriptional downregulation across the In (2L)t region (Figure 3, A and D), but note that the effect size is relatively small. Near the breakpoints, patterns of inversion effect direction are generally small and not likely significantly different from zero (Figure 3, B, C, E, and F). The pattern of downregulation immediately surrounding the distal breakpoint of In(3R)Mo is moderate and appears to be driven by three loci: CG1951, Ssl2, and b4GalNAcTB. This is likely a direct effect of the inversion on Ssl2, as the breakpoint interrupts this gene proximal to CG1951 (Corbett-Detig et al. 2012), and possibly disassociation of regulatory elements from coding sequence with respect to CG1951 and b4GalNAcTB.
We also examined whether IAL tend to cluster together by physical location along chromosomes, which could arise from more localized regional effects. We measured physical clustering as the CV of distances between genes, for the global CV, and between IAL by chromosome arm. For each arm, IAL for both inversions were less clustered than the distribution of all genes, although In(3R)Mo IAL are more clustered on chromosome 3R than we would expect for the same number of randomly drawn genes, but are still less clustered than the genomic background generally (Table 6).

Functional analysis
Coadapted alleles segregating with an inversion should also be in LD with the inversion breakpoints. We used gProfiler g:GOSt functional profiling to detect overrepresentation of functional groups in sets of IAL. The sets of IAL that we analyzed were those IAL in LD with the inversion ( Table 2) or targets of another gene with variation segregating with the inversion (Table 4). Functional analysis of IAL for each inversion yielded significant groups only when considering all In(3R)Mo IAL or only IAL where inversion state explains $ 15% of expression variance (Supplemental Material). Sterol transport is significant in both cases (P = 0.022 for all, P = 0.000116 for $ 15% variance) and catalytic activity term (GO:0003824) is significant when considering all IAL (P = 0.000146). We found no significant functional groups when considering any similar grouping of In(2L)t IAL.
The sterol transport group found to be enriched among In(3R)Mo IAL includes four of the eight Niemann-Pick type 2 orthologs (Npc2b, Npc2c, Npc2f, and Npc2g) (Huang et al. 2007), along with Apoltp. All five genes are upregulated in association with the inverted arrangement, suggesting an increase of sterol uptake in In(3R)Mo-bearing flies. In the context of the expression samples, most of these genes are preferentially expressed in the adult gut (modENCODE, Contrino et al. 2012). While the four NPC2s are on chromosome 3R (Figure 4), Apoltp is on chromosome 2L, and none of these genes contain a SNP in the gene region that is in significant LD with In(3R)Mo, or each other, and only Npc2f is within the inversion region ( Figure 5). It is possible that the significant upregulation of the sterol transport group is a downstream effect of n n mod(mdg4), which is an IAL near the proximal breakpoint ( Figure 4) and in LD with In(3R)Mo ( Figure 5). mod(mdg4) is a chromatin protein associated with Npc2b as well as other chromatin proteins and TFs associated with all eight Npc2 orthologs (Supplemental Material). We address this exciting finding in further detail in the Discussion.

DISCUSSION
Despite decades of research on polymorphic inversions in D. melanogaster, and despite an overwhelming consensus that inversions are maintained due to selection, we have little understanding of the targets of selection in these inversions that lead to their maintenance (cf. Kirkpatrick and Kern 2012). Here, we examine the role of transcriptional variation induced by cosmopolitan inversions to explore what effect, if any, inversions might have on gene expression that itself might be selectively favored. Besides gross rearrangement ef-fects, we examined gene disruption at inversion breakpoints, IAL in LD with the inversion, and IAL not in LD with the inversion. We assumed multiple functionally-linked IAL to be potentially coadapted so long as they contained SNPs in LD with the inverted arrangement. We also assumed that trans-inversion effects, IAL not in LD with the inversion, could be the result of these loci interacting with TFs in LD with the inversion or epistatic interactions with loci in LD. We note that while sample size of In(2L)t-bearing individuals in the expression analysis is small (8 of 136), we were still able to detect a relatively large number of significant loci across the genome. We believe that our methods were conservative and limit the number of false positives at the expense of a likely high number of false negatives. We argue that we are accounting for the small sample sizes in our analyses and interpretations. We also note that we could not address over-or underdominance in this study as we n  examined only lines known to be homozygous for either arrangement of In(2L)t or In(3R)Mo.
Of particular concern was what role, if any, expression correlation had on our findings. To explore this, we considered pairwise correlation coefficients and previously described expression modules (Ayroles et al. 2009). One could imagine that the true false discovery rate would be much higher than anticipated due to expression correlation of multiple loci from the same expression module. We found that IAL for both inversions occupied more expression modules than we would expect when drawing genes at random. Even when considering just the pairwise expression correlation, we found that IAL for both inversions were slightly less correlated on average than the genome-wide mean. These observations suggest that the inversion effects arose independently of the underlying correlation structure.
We found many loci of significant effect, yet most have a modest contribution to expression variation. Importantly, the only evidence of direct structural influence on patterns of expression was at a single breakpoint that appears to be due to the inversion mutation interrupting gene regions but not moving genes to a different region of the chromosome. That is, there was no appreciable pattern of up-or downregulation of genes along the chromosome with respect to the location of inversion breakpoints. Perhaps unsurprisingly, we did find an overabundance of expression perturbation within the inversion regions themselves. This would suggest that the effect is possibly due to genetic, rather than the structural, variation in LD with the inversion. That is, transcription variation associated with the inversion is due to genetic variation at the gene level and not the rearrangement of loci.
Inversion polymorphism can be maintained by reduced recombination between locally coadapted alleles within inversions (Dobzhansky and Dobzhansky 1970), both with (Nei et al. 1967;Pepper 2003) or without epistasis (Kirkpatrick and Barton 2006). Clinal inversion variation in D. melanogaster within In(3R)Payne in Australian populations and In(3R)Payne, In(2R)NS, and In(2L)t in American populations has been shown to be due to selection rather than demographic history and indicative of coadaptation (Kennington et al. 2006;Kolaczkowski et al. 2011;Kapun et al. 2016). We reasoned that one might be able to detect epistatic coadaptation through functional analysis of IAL. Coadaptation of noninteracting loci would likely result in either no significant functional groups or multiple, unrelated significant groups. We found functional annotation enrichment only when we considered all IALs with respect to In(3R)Mo. This would suggest that either one locus or a few coadapted, but noninteracting, loci segregate with these inversions to maintain polymorphic inversions. However, it is still possible that coadapted loci could be overlooked in our analysis; certainly, the existing annotation is incomplete. Moreover, our statistical power to detect IAL suffers due to the constraints of the number of inversions captured in the DGRP dataset. Nevertheless, our observation that there are mul-tiple IAL within the inversion argues strongly for the role that inversions play as modifiers of recombination, which may hold adaptive haplotypes together.
Our strongest functional finding, that sterol uptake associated with In(3R)Mo, appears to be driven by genetic variation in a single locus as a trans-inversion effect. Four of the five genes in this cluster are located on chromosome 3R; however, none of these genes have a SNP in significant LD with In(3R)Mo, or with each other (Supplemental Material), and only one is found within the inversion region (Npc2f). This would rule out effective coadaptation of these genes and suggest that the location of these genes on the same chromosome as the inversion is a coincidence. Assuming that the upstream effector of the sterol transport is a TF, it could contain a SNP that alters either its protein-coding sequence or expression. Either scenario would require that the genetic variant responsible for the upregulation of the four Npc2s in question would have to be in LD with In(3R)Mo. Only one TF, mod(mdg4), annotated in DroIDb as interacting with any (Npc2b) of the five sterol transport IAL, contains a SNP in LD with In(3R)Mo. Furthermore, mod (mdg4) is itself an IAL and is also located near the proximal breakpoint of In(3R)Mo. We speculate that inversion-associated expression variation detected in this functional group is under the control of mod (mdg4).
Increased sterol uptake fits nicely with the positive correlation of In (3R)Mo frequency with latitude (Kapun et al. 2014). Npc2 genes control sterol homeostasis via uptake of dietary sterols in D. melanogaster (Huang et al. 2007;Carvalho et al. 2010;Niwa et al. 2011). Two species of Drosophila differentially express Npc2 in response to cold acclimation, though the patterns differ between the two (Parker et al. 2015). Increased dietary cholesterol increases cold tolerance in D. melanogaster (Shreve et al. 2007); however, D. melanogaster takes up phytosterols more efficiently than cholesterol (Cooke and Sang 1970). Furthermore, D. melanogaster is a sterol auxotroph (Carvalho et al. 2010;Niwa et al. 2011) that utilizes dietary sterols preferentially to biosynthesizing different sterols from dietary sterols (Carvalho et al. 2012). This would suggest that In(3R)Mo carrying D. melanogaster could be cold-acclimated due to increased uptake of dietary sterols, rather than the upregulation of cholesterol production.
It is difficult to interpret results for In(2L)t without a clear functional annotation group associated with the inverted state. Assuming that In (2L)t is under selection (Kapun et al. 2016), the simplest interpretation is that In(2L)t polymorphism is maintained by only a small number of loci in LD with the inversion. It is possible that one or more loci in LD n  with In(2L)t contain protein-coding variation under selection and no appreciable transcript abundance variation with respect to chromosomal arrangement. We note that there are only 14 IAL in LD with In(2L)t (Supplemental Material). While a lack of a significant functional group may be dissatisfying, this does provide a manageable candidate list for validation of single targets.

Conclusions
We found that two different cosmopolitan inversions in D. melanogaster have some effect on the expression of hundreds of genes across the genome. While we caution that our sample sizes, particularly for In(2L) t, are very small, the permutation approach that we have taken is conservative. The genetic variation responsible for the observed transcriptional variation is only in small part due to the inversion event itself, with the majority of the variation being the result of either allelic variation in LD with the inversions, trans-effects of regulators that also are in LD with the inversions, or as of yet uncharacterized, indirect effects of the inversions. Our results mirror those of a recent report on transcriptional variation caused by inversions in D. pseudoobscura (Fuller et al. 2016). Fuller et al. (2016) demonstrated quite convincingly that the well-studied polymorphic inversions of D. pseudoobscura are modulating levels of transcription at multiple life history stages and even found hints of trans-effects. However, due to experimental design constraints, they could not examine inversion effects on unlinked chromosomes. Our findings extend this pattern to D. melanogaster and show that inversions are affecting loci genome-wide. While we have begun to parse the possible causes of IAL, our focus on available data limits the statistical power of our study. Thus, it will be important to conduct carefully designed experiments on inversion polymorphism in D. melanogaster to elucidate the true extent of influence of inversions on genome-wide patterns of transcription.

ACKNOWLEDGMENTS
We thank Dan Schrider, Steve Schaeffer, and two anonymous reviewers for feedback on the manuscript. A.D.K. and E.L. were supported in part by National Institutes of Health award no. R01GM078204.