Contributions of cis- and trans-Regulatory Evolution to Transcriptomic Divergence across Populations in the Drosophila mojavensis Larval Brain

Abstract Natural selection on gene expression was originally predicted to result primarily in cis- rather than trans-regulatory evolution, due to the expectation of reduced pleiotropy. Despite this, numerous studies have ascribed recent evolutionary divergence in gene expression predominantly to trans-regulation. Performing RNA-seq on single isofemale lines from genetically distinct populations of the cactophilic fly Drosophila mojavensis and their F1 hybrids, we recapitulated this pattern in both larval brains and whole bodies. However, we demonstrate that improving the measurement of brain expression divergence between populations by using seven additional genotypes considerably reduces the estimate of trans-regulatory contributions to expression evolution. We argue that the finding of trans-regulatory predominance can result from biases due to environmental variation in expression or other sources of noise, and that cis-regulation is likely a greater contributor to transcriptional evolution across D. mojavensis populations. Lastly, we merge these lines of data to identify several previously hypothesized and intriguing novel candidate genes, and suggest that the integration of regulatory and population-level transcriptomic data can provide useful filters for the identification of potentially adaptive genes.


Introduction
Statistical correlations between phenotypes impose fundamental constraints on phenotypic evolution (Lande 1979). As such, selection may disfavor the propagation of especially pleiotropic mutations whose causal effects alter many traits (Otto 2004). This idea has led to considerable speculation on the precise molecular effects of successful mutations. Vigorous debate regarding the relative importance of coding sequence and gene regulatory evolution hinged on claims Significance Gene expression evolution can be driven by changes in the focal gene itself (cis-regulation) or by changes in the genes that regulate it (trans-regulation). The importance of these two processes, and their contribution to adaptation specifically, remains under debate. Through a novel integration of data on genetic variation in gene expression with data on gene regulatory evolution, we found increased evidence for a primary role of cis-regulation in both total expression evolution as well as adaptive expression evolution. These results inject nuance into the discussion of how regulatory processes influence evolution within species and outline an approach for using expression data to address adaptive hypotheses.
regarding the respective pleiotropic consequences of these types (Hoekstra and Coyne 2007;Carroll 2008). Within the category of regulatory mutations, however, further distinctions are likely to be relevant in this context. Specifically, trans-regulatory changes, which are primarily a consequence of changes in expression and/or structure of transcription factors, are expected to affect large networks of target genes and therefore be highly pleiotropic (Gibson 1996;Wittkopp 2007; but see Lynch and Wagner [2008]). In contrast, cis-regulatory mutations, occurring in promoters or enhancers of the target genes themselves, might affect only single genes in specific contexts (Stern 2000;Prud'homme et al. 2007). In an early and thorough theoretical treatment of the subject, Wray et al. (2003) did not equivocate in hypothesizing that cis-regulatory evolution should primarily be responsible for the evolution of gene expression phenotypes.
In the years since that prediction, the accumulation of evidence regarding the prevalence of cis-and trans-regulatory effects in evolution has led to a far murkier picture. This may in part reflect the methodological diversity of studies approaching the question (reviewed in Signor and Nuzhdin [2018]). Some experiments, such as chromosomal substitutions (Hughes et al. 2006;Osada et al. 2006), crosses utilizing the diversity of a reference panel (Genissel et al. 2007;Fear et al. 2016;Osada et al. 2017), and eQTL mapping studies (Massouras et al. 2012;King et al. 2014) have generally, but not always (Lemos et al. 2008;Wang et al. 2008) corroborated the hypothesis, finding greater contributions of cis-effects to intrapopulation variation. On the other hand, results from another frequently used experimental design, which we will henceforth call the F 1 hybrid design, have consistently led to the opposite conclusion. The F 1 hybrid design requires expression data from two parental lines and their F 1 hybrids. Cis-regulatory effects are measured using the differential expression of allele-specific reads within the hybrid samples, whereas trans-regulatory effects are calculated by subtracting the cis-regulatory effect from the overall differential expression between the parental lines (Wittkopp et al. 2004). Usage of the F 1 hybrid design has repeatedly found that trans-regulation dominates expression variability within species, whereas cis-regulation plays a greater role in interspecific differences (Graze et al. 2009;Wittkopp et al. 2004Wittkopp et al. , 2008McManus et al. 2010;Suvorov et al. 2013;Coolon et al. 2014;Metzger et al. 2017;Glaser-Schmitt et al. 2018).
Given the power of the F 1 hybrid design and its applicability to a wide range of study systems and biological contexts, closer attention to the interpretations stemming from this approach is merited. As such, recent work has begun to approach the F 1 hybrid paradigm with increased nuance. Glaser-Schmitt et al. (2018) perform a tissue-specific study, filling an important gap given the focus of previous work on whole-body samples. Taking this one step further, Combs and Fraser (2018) estimate fine-scale spatial variation in allelespecific expression within embryos. From a different angle, two recent commentaries (Fraser 2019; Zhang and Emerson 2019) make salient points regarding potential biases in the estimation of trans-regulatory divergence given that it cannot be estimated independently of cis-regulatory and parental divergence using this approach, and stress the need for replication to mitigate this. Here, we build from these efforts and probe the initial findings from an across-population F 1 hybrid study using two simple experiments. First, we conduct a tissue-specific study in parallel with a whole-body study, to directly estimate the effects of sample heterogeneity on the estimation of regulatory type. Second, we supplement our measures of parental divergence with further sampling of genotypes from each parental population, to gain more confidence in patterns of within and between-population variation in transcription.
We apply these experiments to an investigation of gene expression evolution in larval brains across two populations of the cactophilic fly Drosophila mojavensis. This combination of organism and tissue lends itself to a strong hypothesis of predominant cis-regulatory evolution, for two reasons. First, D. mojavensis is predicted to have experienced strong differential selection pressures across populations due to variable ecological conditions. The two populations studied here, from Santa Catalina Island, CA, and the Sonoran Desert (Guaymas, Sonora, Mexico and Organ Pipe National Monument, Arizona), are genetically distinct (Reed et al. 2006) primarily utilize highly divergent cactus species, the prickly pear Opuntia littoralis and the columnar Stenocereus thurberi, respectively (Heed 1978;Ruiz et al. 1990). These host cacti form unique chemical and nutritional environments (Kircher 1982;Starmer and Phaff 1983), and detoxification genes in particular have seen substantial expression and coding sequence evolution across these populations (Matzkin et al. 2006;Allan and Matzkin 2019). In addition to selection from the host, these populations experience vastly different temperature and humidity regimes, which is expected to generate selection broadly on phenology and organismal physiology (Matzkin 2014). We choose to focus on brains here in part because we previously identified larval behavioral differences related to locomotion and pupation (Coleman et al. 2018), indicating the potential for the evolution of expression changes in the brain, as well as muscle and fat body. Second, despite this potential for selection, there are also a priori expectations that transcriptome-wide evolution should actually be reduced. Brain gene expression is highly conserved in many animals, including Drosophila (Brawand et al. 2011;Catal an et al. 2012;Uebbing et al. 2016). Additionally, gene expression in larvae is more conserved than in later developmental stages in Drosophila (Artieri and Singh 2010). The pairing of strong directional and strong stabilizing selection across genes is precisely the scenario that should result in transcriptional fine-tuning due to cis-regulatory evolution. Thus, our expectation was to uncover a greater role for cisregulatory changes than observed in other intraspecific studies using similar experimental designs.

Sample Collection and Sequencing
For initial analysis of population divergence and analyses of allele-specific expression, we used single genomesequenced isofemale lines of D. mojavensis from Santa Catalina Island, CA (Drosophila 12 Genomes Consortium 2007) and Guaymas, Sonora, Mexico (Allan and Matzkin 2019). These lines have been maintained as isofemale lines without direct inbreeding in the laboratory on bananamolasses media (Coleman et al. 2018) since 2002 and 1999, respectively. We generated F 1 hybrids between these two lines by placing 20 virgin genome-line Catalina Island males and 20 virgin genome-line Sonora females in vials containing banana-molasses media, and performed the reciprocal cross in an identical manner. For analyses of genotypic variation in expression, we selected seven additional isofemale lines from Santa Catalina Island and seven isofemale lines from the Sonoran Desert population from Organ Pipe National Monument, AZ, which were collected between 2007 and 2009 (Coleman et al. 2018) and maintained as described earlier.
We collected all samples during the third-instar wandering stage. For whole-body samples, we collected five larvae per replicate, washing each larva in deionized water before storing them on ice in tris-EDTA buffer. For brain samples, we dissected ten brains per replicate in tris-EDTA before storing them on ice. We then froze samples at À80 C for storage. We collected three biological replicates for each genome line and hybrid (brain and body) and single replicates of each additional isofemale line (brain only). We ground samples in TRIzol (Thermo Fisher, Waltham, MA) and used Qiagen RNEasy columns (Qiagen, Hilden, Germany) to extract RNA, prepared libraries using Illumina TruSeq kits (Illumina, San Diego, CA), and sequenced samples as 150-bp paired-end reads on an Illumina HiSeq. Information on sample identity and sequencing can be found in supplementary table S1, Supplementary Material online.

Bioinformatic Analysis
We removed Illumina adapters and low-quality sequence using Trimmomatic (Bolger et al. 2014) and used NextGenMap (Sedlazeck et al. 2013) with default parameters to separately map all reads to both the original Catalina Island genome (Drosophila 12 Genomes Consortium 2007; FlyBase version r1.04_FB2018_06) and the same genome templated with Sonora genomic reads (Allan and Matzkin 2019). We calculated total read counts at the gene level for each sample using HTSeq-count (Anders et al. 2015), using the reads mapped to the Catalina Island genome for analysis. We then downsampled reads to 11,908,854 reads over 13,410 genes in brain samples and 14,001,634 reads over 13,628 genes in whole-body samples, to match the lowest coverage sample in each tissue. For the additional brain isofemale lines, which were more highly covered, we downsampled to 18,665,415 reads over 13,565 genes. From these gene sets, we analyzed only genes with at least ten total reads in each sample. After comparing the consistency between biological replicates within each group using Spearman's correlation coefficients, we discarded three samples from the genome lines as outliers: one Sonora brain sample, one Sonora (f)ÂCatalina Island (m) hybrid body sample, and one Catalina Island (f)ÂSonora (m) body sample. In the analysis of genotypic variation in the brain, we discarded an additional Sonora sample as an outlier based on the same criteria. We included only a single randomly chosen replicate from each genome-sequenced line in the analysis of genotypic variation to avoid pseudoreplication.
For allele-specific counts, we used SAMtools mpileup (Li et al. 2009) and VarScan2 (Koboldt et al. 2013) to identify informative variants for allele-specific expression analysis. We first removed all SNPs where we found, in any of the parental genotype samples the allele from the other parental genome at >5% frequency (Glaser-Schmitt et al. 2018). This step helps to avoid analyzing heterozygous sites, which will lead to inaccuracy in the assignment of reads to parental genomes. We then compared the remaining SNPs from the mapping results to both reference genomes and removed sites with substantially differing allele frequencies in the two resulting data sets, following previous work (Benowitz et al. 2019). In this way, we removed sites potentially affected by mapping bias, which, although not a major problem here (supplementary table S1, Supplementary Material online), would result in overestimation of allele-specific expression of genes containing those sites. We then filtered all bam files (mapped to the Catalina Island reference) for informative reads using VariantBam (Wala et al. 2016) and output these reads as text files using sam2tsv (https://lindenb.github.io/jvarkit/ Sam2Tsv.html; last accessed July 2020). We then counted allele-specific reads overlapping each informative SNP, generating gene-level counts after accounting for reads overlapping multiple variants in R 3.4 (R Core Team 2018). We ran this pipeline independently for brain and body samples. We randomly downsampled allele-specific reads in each brain sample to 4,541,638 reads, matching the reads in the lowest coverage sample. These reads covered SNPs in 7,933 genes. For the whole-body samples, which covered 8,584 genes, we downsampled all read counts to 4,939,804 total reads, to preserve the ratio of reads per gene between brains and whole bodies. In both data sets, we analyzed only genes containing at least ten total reads in each sample. The same three samples identified as outliers above were also outliers in this data set and cis-and trans-Regulatory Evolution in a Cactophilic Fly GBE Genome Biol. Evol. 12(8):1407-1418 doi:10.1093/gbe/evaa145 Advance Access publication 11 July 2020 we accordingly discarded them for allele-specific expression analysis as well.

Statistical Analysis
We calculated per-gene parental divergence from the total (not allele-specific) expression counts as log 2 (P CI /P SON ), where P CI and P SON are either parental genome-sequenced line means (when comparing single genotypes; in brains and whole bodies) or parental population means (when comparing all genotypes; brains only). We calculated transcriptomewide differentiation across populations as 1Àq, where q is the Spearman's correlation of expression divergences between populations. We estimated 95% confidence intervals of 1Àq from 10,000 bootstrapped replicates.
We calculated cis-regulatory divergence, following previous studies, as log 2 (H CI /H SON ), where H CI and H SON are averages of allele-specific counts across all F 1 hybrid replicates. We then calculated trans-regulatory divergence as the difference between parental divergence (between the genome-sequenced lines) and cis-regulatory divergence, log 2 (P CI /P SON )Àlog 2 (H CI / H SON ). We independently assessed the contributions of cisand trans-regulatory divergence to both metrics of parental divergence using Spearman's correlation coefficient q. As above, we assessed 95% confidence intervals from 10,000 bootstrap replicates. We visualized correlations using leastsquares regression lines and 95% confidence regions around those regressions using the R package ggplot2.
We estimated differential expression between parental populations using FDR-corrected negative binomial tests using the R package NBPseq (Di et al. 2011). We also used negative binomial tests comparing allele-specific counts to assess the significance of cis-regulatory effects. To estimate the significance of trans-regulatory effects at the gene level, we used Fisher's exact tests comparing the ratio of allele-specific expression differences to total gene expression differences in the parental samples.
For evolutionary analysis of regulatory evolution in the brain, we used previously published dN/dS values across the four D. mojavensis populations (Allan and Matzkin 2019). To estimate network connectivity, we identified the closest Drosophila melanogaster ortholog for each gene and used the in-degree metric calculated in Marbach et al. (2012) and used previously in a similar analysis in Yang and Wittkopp (2017). Briefly, in-degree quantifies the number of transcription factors found to have significant regulatory interactions with each gene. We analyzed dN/dS and in-degree between genes in different regulatory categories using Mann-Whitney U test with the R function pairwise.wilcox.test, using Holm's method to correct for multiple comparisons. For these analyses, we defined trans-regulated genes via the brain analysis using multiple parental genotypes. We performed all statistical analyses in R 3.4 (R Core Team 2018).

Analysis of Single Parental Genotypes in Brains and Whole Bodies
Examining a single genotype per population, transcriptomewide expression differentiation across populations was lower in brains (1Àq ¼ 0.017; 95% CI ¼ [0.016, 0.018]) than in bodies (1Àq ¼ 0.036; 95% CI ¼ [0.034, 0.038]), as expected. In contrast, we found evidence for considerably more significantly differentially expressed (DE) genes in brains than in bodies (table 1 and supplementary table S2, Supplementary Material online). The lack of statistical support for many DE genes in bodies despite increased overall expression differences reflects substantially greater intragenotypic variation in whole-body data (supplementary fig. S1, Supplementary Material online).
We then correlated measures of divergence across all genes with measures of cis-and trans-regulatory divergence as estimated from F 1 hybrids between these two lines (Coolon et al. 2014;Metzger et al. 2017). This correlation broadly estimates the contributions of each regulatory type to total expression divergence without relying on thresholds of statistical significance. We found that trans-effects were more closely associated with parental divergence than cis-effects to parental divergence in both brains (trans: q ¼ 0.  fig. 1B). Lastly, we found more individual genes displaying evidence of regulation in trans than in cis in both brains and bodies, although this trend was much more dramatic in whole bodies (table 1 and supplementary tables  S3

Analysis of Multiple Parental Genotypes in Brains
The above comparison, using only single parental genotypes, provides a limited estimate of expression evolution across populations. To more confidently assess expression evolution across populations, we analyzed brain RNA-seq data from seven additional Catalina Island genotypes and six additional Sonoran genotypes. Specifically, we expected the inclusion of multiple genotypes to reduce sampling error and result in lower expression differentiation between populations. Indeed, parental divergence across populations was lower in this data set (1Àq ¼ 0.005; 95% CI ¼ [0.004, 0.005]) and the number of significantly DE genes was reduced (table 1  and supplementary table S2, Supplementary Material online). We then examined correlations between parental divergence using multiple genotypes with the identical cis-and trans-regulatory divergence values calculated above. We now found the opposite result: cis-effects were more related to population divergence as measured by multiple genotypes than trans-effects (cis: q ¼ 0.362; 95% CI ¼ [0.343, 0.380], trans: fig. 2). We also used the multigenotype data set to recalculate the number of trans-regulated genes, and found far fewer than in the single genotype analysis, although still considerably more than the number of cis-regulated genes (table 1 and

Evolutionary and Candidate Gene Analysis
We found that cis-regulated genes had higher evolutionary rates within D. mojavensis than did trans-regulated genes (P ¼ 0.0010) or genes that were either conserved or lacking a clear regulatory pattern (P ¼ 0.0018). We also found that the in-degree (number of transcriptional regulators), as inferred from D. melanogaster orthologs, of cis-regulated genes in our data set was lower than that of either trans-regulated genes (P ¼ 0.0034) or those with no identified regulatory type (P ¼ 1.1e À5 ). The distributions of dN/dS and in-degree values for genes in each regulatory classification are shown in figure 3. To examine potential evolutionary hypotheses on a more granular level, we also compiled a list of candidate genes displaying two criteria: differential expression between the two populations in the multiple genotype brain data set, and a statistically significant pattern of cis-and/or trans-regulatory evolution. About 68 genes met these criteria, of which 27 where cis-regulated, 35 were trans-regulated, and six had significant cis-and trans effects (table 2). Of these six, five showed evidence for compensatory (cisÂtrans) evolution, whereas only one showed evidence for combined (cisþtrans) evolution.

Discussion
The measurement of allele-specific expression in F 1 hybrid offspring has been one of the primary approaches for understanding genome-wide patterns of cis-and trans-regulatory  Bold points indicate significantly differentially expressed genes in each data set. Trend lines represent least-squares regressions surrounded by 95% confidence intervals.
cis-and trans-Regulatory Evolution in a Cactophilic Fly evolution both within and between species. Although other methodologies for quantifying these effects have been used effectively, the advantage of the F 1 hybrid approach, in our opinion, is its simplicity and potential applicability to a wide range of study systems and evolutionary contexts. However, as with any other genome-scale approach to evolution, the conclusions stemming from F 1 hybrid studies come with biases and limitations. Here, our goal was to investigate how two straightforward modifications to this common experimental design affect the evolutionary interpretations regarding the prevalence of cis-and trans-regulation in natural populations. Furthermore, we aimed to leverage the ecological and evolutionary information from our model system, D. mojavensis, to examine how successfully the integration of complex regulatory data can uncover adaptive gene expression changes across populations.

The Effects of Tissue Specificity on Estimation of Regulatory Type
Most of the original genome-wide studies of cis-and transregulatory evolution used gene expression measurements taken from whole organisms. This experimental design may blunt the ability to detect cis-regulatory changes, if those changes are only realized in a subset of tissues. Here, we performed allele-specific expression experiments in both whole bodies and brains of larval D. mojavensis in parallel, to determine if and how much the use of heterogeneous tissue samples affects quantification of regulatory type. We found clear evidence that our analysis of whole-body samples both overestimated trans effects and underestimated cis effects. This is reflected both in correlations between regulatory and parental divergences ( fig. 1) as well as in the numbers of genes statistically categorized as cis-or trans-regulated (table 1). We cannot say precisely how much of this difference is due to the problems of using heterogeneous tissue samples and how much is due to regulatory properties specific to the larval brain. A systematic data set of allele-specific expression in multiple tissues and life stages collected would be needed to robustly address this question.

The Effects of Using Multiple Parental Genotypes on Estimation of Regulatory Type
It is well known that the F 1 hybrid approach can bias estimates of trans-regulatory evolution, because they cannot be estimated independently of the measurement of parental expression divergence. Fraser (2019) pointed out how this issue, when combined with error in the estimation of allelespecific expression, can lead to overestimation of cis-trans compensatory evolution. By the same logic, we hypothesized that simple errors in the measurement of parental expression evolution might lead to inflation of the degree of trans-regulatory effects.
To address this issue, we simply compared the quantification of cis-and trans-regulatory effects in brains using two measures of parental expression divergence across populations: one measured from only the single genotype utilized in the allele-specific expression experiment, and one using seven additional genotypes from each population. Using population expression values has two obvious potential consequences for each gene. First, it should reduce noise in the estimation of population expression means coming from the small sample size of using only a few samples of a single genotype. This should lead to reduced estimates of trans-regulatory evolution. However, there will also be a subset of genes whose expression in the focal genotype substantially differs from the mean of its population. Thus, for some number of genes, our method should result in the artificial detection of trans-regulatory effects because the parental divergence will be mismatched with the allele-specific expression data.
Despite this, we find that using estimates of parental population divergence using multiple genotypes considerably reduces the genome-wide estimate of trans-regulatory evolution. Our data present mixed results, however, on the question of whether cis-or trans-regulation is primarily responsible for expression evolution between our populations. Although the correlation analysis suggests that cis-regulatory divergence is more closely related to population divergence ( fig. 2), our per-gene hypothesis tests maintain nearly twice as many genes with evidence of trans-regulatory divergence The relationship between cis-and trans-regulatory divergence and divergence between parental genotypes in brains, where parental divergence is measured using all genotypes. Cis-and transregulatory divergence data are the same as in figure 1. Bold points indicate significantly differentially expressed genes. Trend lines display least-squares regressions surrounded by 95% confidence intervals. (table 1). However, we argue that the numbers of genes displaying evidence for trans-regulation here is an overestimate for three reasons. First, as mentioned above, the inclusion of multiple parental genotypes will induce false positives in cases where intrapopulation variation in gene expression is substantial. Second, it is likely that the difference in power between the methods to detect cis-and trans effects contributes to the number of genes detected in each category (Graze et al. 2009;Coolon et al. 2014; but see Glaser-Schmitt et al. [2018]).
Third, the estimation of parental expression differences, and therefore trans-regulatory effects, are more error-prone due simply to the biology of our samples. We compared larval samples across two populations that develop at different rates (egg-pupation time [h]: CI ¼ 275.20 6 5.48; SON ¼ 315.21 6 5.96; Benowitz KM, Unpublished data), making it impossible to guarantee that sampling occurred at precisely the same exact developmental stage. Thus, some ontogenetic or environmental variation in gene expression is inevitable here. The usage of multiple genotypes should mitigate this problem but is unlikely to resolve it completely, and thus we may be generating false positives for this reason. For example, we find significant and consistent population differences in expression of two fat body proteins (Fbp1; Fbp2) and three larval serum proteins (Lsp1beta; Lsp1gamma; Lsp2) that are all clearly attributed to transregulatory evolution (table 2). Fat body proteins and larval serum proteins interact in a key pathway for nutritional storage prior to pupation (Burmester et al. 1999), and therefore could lead us to a hypothesis of adaptation via trans-regulation to variable nutritive environments. However, it is also well established that all four of these genes undergo rapid increases in expression during the third-instar wandering stage (Burmester et al. 1999). Our results are therefore equally consistent with the possibility that the expression differences were due to slight variations in the developmental stages sampled, and that expression patterns of these genes have not meaningfully evolved at all. In contrast, the estimation of cis-regulatory effects is completely controlled for any such environmental variability because it is measured within individuals (Pastinen 2010), and is therefore inherently less prone to similar errors.

Integrating Regulatory Data to Address Evolutionary Hypotheses
Recent work has sought to identify the evolutionary and structural properties associated with genes evolving via cis-and trans-regulation. Here, we demonstrate that in D. mojavensis larval brain cis-regulated genes tend to display faster rates of coding evolution. Furthermore, we show that cis-regulated genes also tend to occupy less central positions within transcriptional networks, confirming the results of Yang and Wittkopp (2017) and supporting their generality. Notably, we reached this conclusion using D. melanogaster network data, given that similar data are unavailable for D. mojavensis. Thus, our results suggest that gross network properties may be conserved across significant lengths of evolutionary time. Considered together, our findings linking regulatory type to evolutionary rate and network connectivity indicate that the genes experiencing cis-regulatory evolution are relatively unconstrained compared with trans-regulated genes. This perhaps contrasts with our predictions, which were that cis-regulation should be predominant due precisely to the presence of such constraints. However, it is not clear whether errors regarding the determination of trans-regulation discussed above may be obscuring any potential statistical relationships. Ideally, the determination of regulatory evolution will also help identify adaptively regulated genes (Fraser 2011;Delbare and Clark 2018). RNA-seq experiments in ecology and evolution nearly always result in hundreds if not thousands of DE genes, many of which are likely false positives (Todd et al. 2016;Bengston et al. 2018). Truly DE genes must have experienced cis-or trans-regulatory evolution; therefore, the corroboration provided by a statistically significant regulatory effect estimated from an independent sample may help weed out noisy or environmentally variable genes. Thus, allele-specific expression data have been used to supplement studies of gene expression adaptation at the genome-wide (Juneja et al. 2016; Verta and Jones 2019) and candidate gene (Bendesky et al. 2017) levels. We thus turned our attention to the identities of the genes displaying clear patterns of both divergence and regulation, and compared between those displaying cis-and trans-regulation.
Previous transcriptomic (Matzkin et al. 2006;Matzkin 2012;Matzkin and Markow 2013;Smith et al. 2013) and genomic (Allan and Matzkin 2019) investigations of D. mojavensis have identified detoxification and chemosensory genes as important classes of genes likely related to adaptation to the alternative chemical environments provided by their hosts. Taking this as an a priori hypothesis, we examined the identities of genes identified here to search for candidates fitting these categories. We are most interested in the cisregulated genes, which have the cleanest interpretations in this data set. Among the 33 cis-regulated candidate genes are 28 with D. melanogaster orthologs, of which 12 have described functions. Noteworthy among these is GstD1, a detoxification gene with considerable evidence for a functional role in adaptation across D. mojavensis populations (Matzkin  Matzkin 2008). Here, we find that expression differences in GstD1 are clearly attributable to cis-regulatory evolution between these populations, leading to increased expression in the Catalina Island population. Four other genes, including three cytochrome p450s and one UDPglycosyltransferase, have well-characterized roles in detoxification of plant chemicals as well (Heckel 2014). We also find a single chemosensory gene, Obp99a. Among the 19 characterized genes displaying trans-regulation, we find the detoxification gene GstE1 as well as the chemosensory genes Obp99a (regulated in cis and trans) and Obp99b. Thus, although we are less confident in the trans-regulated gene set as a whole, this confirmation suggests that the regulatory evolution of at least a subset of these genes is accurately represented in this data set.
Although it is unsurprising that chemosensory genes are expressed and have evolved specifically in the brain, it is not as immediately clear why the expression of detoxification genes should be important in brain tissue. Detoxification is usually associated with tissues such as the midgut, Malpighian tubule, and fat body (Chung et al. 2009) and the blood-brain barrier tends to shield the brain from harmful chemicals (Stork et al. 2008;Hindle and Bainton 2014). However, the Drosophila blood-brain barrier is not completely impermeable to xenobiotics (Zhang et al. 2018), and important detoxification processes have been demonstrated in the brain in other insects (Zhu et al. 2010). Thus, it is at least plausible, given the chemical cocktail that D. mojavensis is exposed to within organ pipe and prickly pear necroses (Kircher 1982;Starmer and Phaff 1983), that some compounds may enter the brain. Alternatively, it is possible that we are witnessing the consequences of indirect selection. For example, strong selection on GstD1 expression in the midgut might have resulted in a cisregulatory change to a binding site for a transcription factor that is also highly expressed in brains. If the resulting change in brain expression is neutral or nearly neutral it may then persist without having any adaptive function.
Given the ability of our approach to recapitulate a priori hypotheses about expression evolution, we also asked whether this approach might lead to novel predictions about phenotypic and genetic adaptation. Among the remaining cis-regulated genes are photorepair (phr) and CG5316 (ortholog of human aprataxin), which function to repair UVdamaged DNA (Boyd and Harris 1987;Hirano et al. 2007). In D. melanogaster, selection has generated adaptive differences in DNA repair mechanisms between tropical and temperate populations, and has resulted in both coding and noncoding genetic changes (Svetec et al. 2016). Here, both phr and CG5316 are upregulated in the Sonoran Desert population, where presumably intense UV exposure is a more pressing environmental challenge than in the cooler and wetter clime of Santa Catalina Island. Interestingly, phr was not among the D. melanogaster candidate genes differentiated in sequence or expression among populations, whereas CG5316 showed evidence of protein-coding but not expression evolution (Svetec et al. 2016). Thus, even when the predictability of DNA repair evolution pathways to similar environmental variables may extend to the gene, the type of genetic change itself may still be unpredictable.
Broadly, our usage of a replicated, tissue-specific data set and requirement of gene to display a clear regulatory pattern, especially one of cis-regulation, has led us to a manageable set of a few dozen highly intuitive genes that may be adaptively regulated across D. mojavensis populations associated with local ecological conditions. We propose that deeper understanding of patterns of regulatory evolution in ecological model systems, where there are strong predictions regarding selection, will be essential for a robust understanding of the differing roles of cis-and trans-regulation in local adaptation and evolution.

Supplementary Material
Supplementary data are available at Genome Biology and Evolution online.