Differentiation in putative male sex pheromone components across and within populations of the African butterfly Bicyclus anynana as a potential driver of reproductive isolation

Abstract Sexual traits are often the most divergent characters among closely related species, suggesting an important role of sexual traits in speciation. However, to prove this, we need to show that sexual trait differences accumulate before or during the speciation process, rather than being a consequence of it. Here, we contrast patterns of divergence among putative male sex pheromone (pMSP) composition and the genetic structure inferred from variation in the mitochondrial cytochrome oxidase 1 and nuclear CAD loci in the African butterfly Bicyclus anynana (Butler, 1879) to determine whether the evolution of “pheromonal dialects” occurs before or after the differentiation process. We observed differences in abundance of some shared pMSP components as well as differences in the composition of the pMSP among B. anynana populations. In addition, B. anynana individuals from Kenya displayed differences in the pMSP composition within a single population that appeared not associated with genetic differences. These differences in pMSP composition both between and within B. anynana populations were as large as those found between different Bicyclus species. Our results suggest that “pheromonal dialects” evolved within and among populations of B. anynana and may therefore act as precursors of an ongoing speciation process.


Introduction
As Darwin (1871) already noticed and recent studies have repeatedly demonstrated, sexual traits are often the most divergent characters among closely related species (Mendelson and Shaw 2005;Arnegard et al. 2010;Safran et al. 2012). Large differences in sexual traits have therefore been postulated as the signature of ongoing evolution of reproductive isolation and recent speciation (Allender et al. 2003;Boul et al. 2007;Ritchie 2007). In this scenario, divergence in sexual traits within a species contributes to reproductive isolation and thus speciation. However, interspecific differences in traits might accumulate after speciation and be the consequence rather than the cause of speciation (Butlin 1987). Hence, a potential link between trait divergence and speciation can only be made if we observe trait differences that have evolved before or during the speciation process, that is, at the level of populations or early stages of differentiation (Tregenza et al. 2000;Masta and Maddison 2002;Ritchie 2007;Am ezquita et al. 2009;Grace and Shaw 2012;Oneal and Knowles 2013;Schwander et al. 2013).
Among the sexual traits potentially involved in differentiation, sex pheromones, which convey information related to mate choice among conspecific individuals in many species of mammals, reptiles, amphibians, fishes, and insects (see Kikuyama et al. 2002;Johansson and Jones 2007;Smadja and Butlin 2009;Wyatt 2014 for review), may play a key role. Indeed, species specificity of sex pheromones maintains reproductive isolation between species sharing the same habitat (L€ ofstedt et al. 1991;Roelofs et al. 2002). Furthermore, there is evidence for intraspecific quantitative and qualitative variation in sex pheromones in Lepidoptera that may form a basis for intraspecific mate choice (e.g., Agrotis segetum: Toth et al. 1992;LaForest et al. 1997;Ctenopseustis obliquana: Clearwater et al. 1991; Ostrinia nubilalis: Klun et al. 1975;Roelofs et al. 1985). Among the taxa with intraspecific chemical divergence, some also display a strong genetic differentiation, which questions whether these populations are still genetically compatible, that is, populations of the same species (White and Lambert 1995;Sperling et al. 1996;Svensson et al. 2013). Therefore, we suggest that it may be helpful to compare pheromone differentiation and genetic differentiation in order to assess whether chemical signal may play a role in speciation.
The sub-Saharan butterfly genus Bicyclus (Lepidoptera, Satyrinae) is a suitable model to explore the role of pheromone divergence in the process of speciation. It includes close to 100 species, and up to 20 species can be found in sympatry in a single forest patch (Condamin 1973;Aduse-Poku et al. 2012;Valtonen et al. 2013). In this genus, males have wing patches and brushes formed by modified scales that are thought to produce and emit the male sex pheromones (MSP). The importance of chemical sexual traits in the differentiation among species was suggested by the fact that differences in the position and shape of the androconia are used for species identification in the field (Condamin 1973). Recently, we identified the major male-specific compounds (putative MSP, hereafter pMSP) in 32 field-caught Bicyclus species and unraveled a pattern of reproductive character displacement at the scale of the genus (Bacquet et al. 2015). Between the closely related species of the genus, we observed on average five differences in the presence of pMSP components (Bacquet et al. 2015). This supports the role of theses major male compounds in species recognition and potentially in Bicyclus speciation (Bacquet et al. 2015). Hence, we tested whether divergence in MSP composition accumulates before completion of genetic isolation within a species, using Bicyclus anynana (Butler, 1879) as a model. In this species, three MSP components have been identified using physiological (gas chromatography coupled to electroantennography; GC-EAD hereafter) and behavioral mate choice experiments (Nieberding et al. 2008). These three MSP components, (Z)-9-tetradecen-1-ol (hereafter MSP1), hexadecanal (MSP2), and 3: 6,10,14-trimethylpentadecan-2-ol (MSP3), are used at short range when males flicker their wings and erect their androconial hair probably favoring their dissemination (Nieberding et al. 2008). Variation in the abundance of the three MSP components among males indicates age and level of inbreeding of the males and is under sexual selection by females (Nieberding et al. 2008(Nieberding et al. , 2012San Martin et al. 2011;van Bergen et al. 2013). If the divergence of male chemical profiles is involved in speciation, we expect to observe MSP differences between, or even within, populations of the same species, that is, before completion of the speciation process.

Population sampling
We sampled four geographically distinct wild populations of B. anynana ( Fig. 1 and Table 1). Notably, the populations in Uganda are of the subspecies Bicyclus anynana centralis Condamin, 1968, while others are regarded Bicyclus anynana anynana Condamin, 1968. Among the 160 sampled individuals, a subset of 35 males and 33 females was used for the chemical analyses (Table 1). Most of the males and females in each location were sampled during a single trapping session of 1 week (Tables 1  and S1). We also sampled B. anynana individuals from two laboratory stocks, the first one originating from Malawi (Nkhata Bay, from 80 females sampled in 1988, Brakefield et al. 2009) and used in most studies on B. anynana, and the second one sampled in 2006 from South Africa (False Bay, from over 70 females, sampled in 2006, de Jong et al. 2010.

DNA sequencing
DNA was extracted with the DNeasy Blood and Tissue Kit (Qiagen) from a leg or a thorax piece kept in 100% ethanol at À20°C or in glassine envelope at À80°C. In total, we sequenced 160 and 77 individuals, respectively, for the mitochondrial gene cytochrome oxidase 1 (COI) and the nuclear gene carbamoyl-phosphate synthetase 2, aspartate transcarbamylase, and dihydroorotase (CAD) following the protocols of Monteiro and Pierce (2001) and Wahlberg and Wheat (2008, see Table 1 and www.nymphalidae.net/Nymphalidae/Molecular.htm). Some COI sequences were already available from de Jong et al. (2011). A second nuclear marker, elongation factor alpha 1 (EF1a), was sequenced for a subset of 12 individuals. The alignments were 853, 714, and 1243 base pairs long and presented 34, 51, and 18 SNPs for COI, CAD, and EF1a, respectively. Sequences are stored in GenBank (accession numbers in Table S1).

Analysis of population genetic structure and diversity
We deduced the haplotypes forming each individual CAD diploid genotypes using the Excoffier-Laval-Balding algorithm (ELB) in Arlequin software (version 3.5.1.2) with default parameter values (Excoffier et al. 2003(Excoffier et al. , 2005. Five independent runs of the ELB algorithm were compared, and we discarded individuals showing different coding phases across runs (two individuals, Table S1).
To represent the genetic distance among haplotypes, we constructed median-joining networks with the software Network 4.6.1.1 (Bandelt et al. 1999, www.fluxus-engineering.com). To ensure the consistency of the network construction, several reconstructions were performed with variable epsilon values, and the less parsimonious links were deleted to simplify the network (Polzin and Daneshmand 2003). We also tested the consistency of network links by analyzing partial datasets after removing the haplotypes occupying a central position in each full network. For the sake of comparison, we additionally reconstructed the maximum-likelihood phylogenetic trees (Appendix 2).
We investigated the distribution of the genetic diversity among populations with an analysis of molecular variance (AMOVA) based on haplotype uncorrected pairwise differences (Excoffier et al. 1992). We performed 1000 permutations of the individuals among populations to test the significance of the observed pattern of genetic variation. We also calculated Φ ST between pairs of populations, a type of F ST based on haplotype dissimilarity (Excoffier et al. 1992). We run these analyses in Arlequin (version 3.5, Excoffier et al. 2005). Extraction, analysis, and repeatability of chemical profiles Androconia were dissected from the wings of one side of freshly killed butterflies or from butterflies frozen alive and kept at À80°C. At À80°C, the chemical compounds do not evaporate enough to induce differences of extraction profiles between these two treatments (P. Bacquet, unpublished data). Each androconium was extracted in separate 2-mL screw cap vials (VWR International, Leuven, Belgium) with 100 lL of redistilled n-heptane (VWR International) containing 1 ng/lL of (Z)-8-tridecenyl acetate (Z8-13:OAc) as an internal standard. The remaining tissue of each dissected wing was also extracted in a vial with 300 lL redistilled heptane containing 0.33 ng/lL of the internal standard. The wing extracts were analyzed on Agilent (Santa Clara, CA) 5975 mass-selective detector coupled to an Agilent 6890 gas chromatograph (GC-MS), equipped with a HP-5MS capillary column (30 m 9 0.25 mm i.d., and 0.25 lm film thickness; J&W Scientific, Folsom, CA). The oven temperature was programmed from 80°C for 3 min, then to 210°C at 10°C/ min, hold for 12 min and finally to 270°C at 10°C/min, hold for 5 min. Inlet and transfer line temperatures were 250 and 280°C, respectively, and helium was used as the carrier gas with a flow of 0.8 mL/min. Compounds that eluted no later than n-tricosane were considered as the more volatile part of the chemical diversity displayed by these butterflies, and thus included in our analysis. Identification of wing compounds was performed by comparing the GC retention times and mass spectra with commercially acquired or laboratory-synthesized authentic compounds on both nonpolar (HP-5MS) and polar (INNOWax, 30 m 9 0.25 mm i.d., and 0.25 lm film  Condamin (1973) is represented in plain red. The two CAD haplotypes corresponding to B. anynana subspecies are in green and blue dashed lines. The bar plots represent the log-transformed average amount (ng) of each pMSP component in each population (AESD). The sample size is given for each graph (within brackets for the preliminary samples from Uganda). The number below each bar ("comp.") codes the pMSP component: 1: (Z)-9-tetradecen-1-ol; 2: hexadecanal; 3: 6,10,14-trimethylpentadecan-2-ol (the three active MSP identified from the laboratory stock of B. anynana, Nieberding et al., 2008); 4: unidentified anynana #19; 5: octadecan-1-ol (physiologically active in GC-EAD tests of individuals from False Bay, Nieberding et al., unpublished data); (see Table S2 and Fig. S1 for details of their identification). The abundance of each pMSP component selected once is represented for all the populations. The counts ("pres.") below the compounds number represent the number of individuals displaying the pMSP component in the population. The bars are light colored for the populations in which the compound was not selected as a pMSP component. thickness; J&W Scientific) columns. Tentative structures were assigned for compounds that were not fully identified. The chemical composition and related information are summarized in the Table S2 and Figure S1.
We determined the repeatability of full chemical profiles, that is, of all compounds found in the chemical extracts (as opposed to the pMSP composition, see below), among field-sampled individuals within and between the four populations. For this purpose, we used Spearman correlations on the amounts of compounds displaying identical mass spectra and retention times across individuals, thereby assumed to be homologous.

Selection of pMSP components from chemical profiles
The chemical profile of Bicyclus butterflies consists of many compounds (Table S2). For example, in the model species B. anynana, over a hundred chemicals with lower volatility than fatty acids have been identified on the different body parts, of which twenty significantly correlated with traits important in mate choice in this species (sex and age; Heuskin et al. 2014). As behavioral or GC-EAD assays would be unrealistic to perform for each field-caught population, we used a previously developed standardized routine to select a list of pMSP among all the compounds observed per species (Bacquet et al. 2015). The selection criteria were based on physiological (GC-EAD) and behavioral data accumulated from the stock populations of B. anynana originating from Malawi and South Africa (Nieberding et al. 2008(Nieberding et al. , 2012van Bergen et al. 2013 and unpublished data). In summary, we first applied a threshold of 10% of the internal standard such that any compound at a lower amount than 20 ng per individual was discarded from the analysis to standardize the quality of peak identification among samples. After this filtering step, the absolute amount of each compound and its repeatability level among individuals were computed. A compound was considered repeatable if present in two males out of three. As the three behaviorally and electrophysiologically active MSP components identified in the Malawi stock of B. anynana are among the top third most abundant of the repeatable, and male-specific compounds (Nieberding et al. 2008(Nieberding et al. , 2012van Bergen et al. 2013), we thus selected in every population the pMSP components following the same criteria. The female samples were used to assess the male specificity of the chemical compounds. A compound was considered as male specific if it was at least five times more abundant, than in female extracts sampled in the same population. In case a compound was selected as a pMSP in one population, its presence in other populations was also included in the analyses of the pMSP composition. This ensured not to artificially exaggerating the difference between a population where a compound is selected as a pMSP component and a population where it is not. Of note, our selection method of abundant and repeatable male-specific compounds that form the pMSP of Bicyclus populations is in accordance with observations that MSP components that experience directional sexual selection are usually more abundant than other compounds (see Steiger et al. 2011; and examples in Lepidoptera: Iyengar et al. 2001;Andersson et al. 2007; Hymenoptera: Ruther et al. 2009; lizards: Mart ın and L opez 2010; or elephants: Greenwood et al. 2005).

Analysis of chemical diversity
We applied three types of dissimilarity indices on the chemical profiles of individuals. First, the Jaccard dissimilarity described qualitative differences in composition (presence and absence of compounds). Second, the Euclidean distance discriminated chemical profiles based on the absolute amounts of compounds and, thirdly, on the relative amounts of compounds (proportion of the sum of all compounds per individual). Using these dissimilarity indices, we performed permutational multivariate analyses of variance for estimating how much chemical variation was explained by sex and geographic origin of the individuals (PerMANOVA, adonis function in R package vegan; Oksanen et al. 2011). PerMANOVA was preferred to the conventional MANOVA because its nonparametric permutational approach for testing significance is robust to the presence of large proportions of null data and is not restricted to the use of Euclidean distances (McArdle and Anderson 2001).

Comparison of genetic and chemical differentiations between pairs of populations
If, as we hypothesized, chemical divergence is a potential cause of reproductive isolation and speciation in Bicyclus butterflies, the level of pMSP differentiation should on average be as large as, or larger, than the level of neutral genetic differentiation between corresponding pairs of populations. We thus compared Φ ST between all pairs of populations based on either chemical distances or genetic CAD-based distances in the similar framework with Arlequin (version 3.5, Excoffier et al. 1992Excoffier et al. , 2005. For chemical profiles, we used distance matrices based on either presence-absence of compounds or absolute amounts of compounds or relative amounts of compounds.

Patterns of genetic structure
The mitochondrial COI gene showed a strong similarity among all individuals. This is not expected for a rapidly evolving marker, in particular when compared to the nuclear CAD which displayed a larger diversity (Figs. 2 and A2). A few sequences for a third independent marker (EF1a, nuclear) confirmed that the uniformity of the COI sequences was not reflecting the population's neutral divergence (Table A3). As a confirmation, neutrality tests H of Fay and Wu (2000) and D of Tajima (1989) were both significant for the COI but not for the CAD gene (Table A4). Altogether, this information suggests that COI underwent a selective sweep and is therefore inappropriate to describe the neutral genetic divergence of the populations in this species, leading us to discard this locus from subsequent analyses (Appendix 3; but see de Jong et al. 2011). Bicyclus anynana displayed significant genetic variation between populations based on the CAD nuclear marker (37% of the total genetic variance, P < 0.001, see AMOVA in Appendix 4 and Table A5). The median-joining network and the phylogenetic reconstructions of the CAD gene supported the isolation of a Ugandan clade, in which all but three divergent Ugandan haplotypes clustered together (n = 46, Fig. 2; Appendix 2, and see the three black arrows). The three divergent Ugandan haplotypes assembled with South African and Kenyan haplotypes. They very likely belong to hybrids which displayed both Kenyan or South African, and the typical Ugandan haplotypes (Appendix 5). Together with the observation of a selective sweep in the mitochondrial genome across populations, the presence of hybrids shows that there is ongoing gene flow between B. anynana populations.

Patterns of chemical diversity
The total number of chemical compounds per male ranged from 8 to 16 (12.0 AE 3.3; average AE SD) and from one to six in females (2.6 AE 1.2; Appendix 6). Individuals Figure 2. Networks of the COI (left) and CAD (right) genes for Bicyclus anynana. The surface of a circle is proportional to the number of individuals bearing one haplotype. Black circles represent intermediate haplotypes missing from the sampling. For the sake of representation of the COI network of B. anynana, the distance between haplotypes is twice as large as in the CAD network. We only represented the simpler networks built with a null epsilon value as other reconstruction parameters showed consistent similar results. The three arrows pinpoint the probable introgressed haplotypes forming CAD hybrids in the population from Uganda. from the same population showed similar chemical profiles (Fig. 3). These profiles were significantly more similar within populations than between (P = 0.001 in males and P = 0.04 in females, test based on 10,000 permutations, Fig. 3).
Sex was the main factor explaining the differentiation of full chemical profiles among individuals (PerMANOVA analysis; n = 73 individuals; significant R 2 of 47%, 16%, and 67% of the variance in presence, absolute, and relative abundance of the compounds explained by sex, respectively; Table 2). When performing the PerMA-NOVA analysis on the two sexes separately, the variation in the presence and abundance of compounds was mostly explained by geographic location (PerMANOVA; n = 38 males and n = 35 females; significant R 2 of 50%, 59%, and 73% for males and of 38%, 38%, and 75% for females; Table A1).
We subsequently aimed at spotting the most likely chemical compounds forming the MSP in these populations. One to four pMSP components were selected per population (Fig. 1). We found that the qualitative and quantitative variations in composition of the pMSP in males were mostly explained by geography as for full chemical profiles (PerMANOVA; n = 38; significant R 2 of 51%, 59%, and 75% for the difference in presence, absolute, and relative abundance of the compounds, respectively; Table 2). Specifically, the three MSP initially identified in the laboratory population of B. anynana originating from Malawi were present in all populations of B. anynana except in Uganda where MSP1 and MSP3 were absent from the male profile ( Fig. 1). Five additional males sampled at the same Ugandan location in 2007 lacked MSP1 and MSP3 as well which suggests it is not an effect due to chance and the small sample size. Of note, these five additional individuals were not added to the analyses because we lacked an internal standard for quantifying the compounds and we did not have the corresponding genetic data. Additional compounds (pMSP4 and pMSP5) were also selected in the South African populations (Limpopo and False Bay laboratory stock, respectively, Fig. 1). The Kenyan population showed some peculiarities regarding the chemical profiles of males. First, some pairs of Kenyan males showed a much poorer correlation (between 0.2 and 0.4) than that found on average between pairs of male chemical profiles (mean = 0.61 AE 0.16 SD, Fig. 3). We also discovered that some Kenyan individuals had the same pMSP profile as the Ugandan population and thus missed MSP1 and  PerMANOVA analyses were performed for each type of chemical distance (presence-absence, absolute, and relative abundances, respectively). In males, the analyses on both full chemical profile and putative male sex pheromone composition are presented. Significant R 2 are in bold (asterisks inform on significance of the estimates: "*" if P ≤ 0.05; "**" if P ≤ 0.01; "***" if P ≤ 0.001). MSP3, while other males did not, suggesting that in Kenya, a mixture of different chemical profiles could be present. We found an average number of 2.0 AE 0.0 (mean AE SD) differences in presence and absence of pMSP components between the Ugandan population and the Malawian and South African populations. Finally, the pMSPs and the chemical profiles of the two laboratory stocks, and those of the South African laboratory-adapted and field-caught populations, were very similar despite over 20 years of laboratory acclimation for the Malawian stock population: All compounds identified in one population were found to be produced at least in some individuals of the other population, demonstrating that the corresponding biosynthetic pathways were conserved between populations (Fig. 1). In summary, qualitative differences in the presence or the absence of population-specific pMSP components were observed between populations in B. anynana, but also within the Kenyan population.

Comparison of the patterns of genetic and chemical differentiations
The level of pMSP differentiation between pairs of B. anynana populations was higher than the corresponding level of nuclear genetic divergence, and this held true whether the patterns of presence-absence of pMSP components of absolute or of relative variations in abundance in pMSP composition were considered ( Fig. 4A-C). Although the number of data points is too limited to test this pattern statistically, it suggests that chemical divergence was concomitant, or preceded, genetic differentiation between B. anynana populations. One notable exception to this pattern was the Uganda and Kenya population pair ("Ug-Ke" in Fig. 4), wherein chemical differentiation was lower than neutral genetic divergence. This is because the Kenyan population was composed of genetically similar individuals displaying contrasted chemical profiles with some bearing the Ugandan profile and others the Malawi-South African one (see Fig. 1 and results above). This increased the average chemical differentiation within the population and decreased the average chemical differentiation between some males from Kenya and the Ugandan population.

Discussion
We compared variation in pMSPs with neutral genetic variation in an African butterfly in which pMSPs are suspected to play a role in speciation and observed a strong differentiation in both the full chemical profiles and in the pMSP composition among B. anynana populations, as well as variation within one population. The chemical differentiation involved both qualitative differences in the composition, and quantitative differences in absolute and relative abundances of chemical compounds. A large part of the variance in chemical profiles could be attributed to the geographic isolation among B. anynana populations. We observed particularly obvious differences between the two subspecies of B. anynana, B. a. centralis in Uganda, and B. a. anynana in Malawi and South Africa ( Fig. 1; Table 2). Between one and four compounds were selected as pMSP components in each of the B.  components per Bicyclus species, and~20% of the younger pairs of Bicyclus species displayed one to three pMSP differences, while all pairs of Bicyclus species differed on average by seven pMSP compounds (Bacquet et al. 2015). Thus, the qualitative differences observed between populations in this study were of the same magnitude as those observed among closely related species and that participate to reproductive isolation across the genus (Bacquet et al. 2015). Moreover, such chemical differences are of greater magnitude than those found in female sex pheromones in moths (Linn and Roelofs 1989). Overall, neutral genetic divergence between pairs of B. anynana populations was lower than variation in pMSP although the small sample size for the chemical data could inflate Φ ST values. In addition, a similar level of chemical differentiation was also found within the genetically undifferentiated population of B. anynana in Kenya (Figs. 1, 4). Furthermore, there is evidence of ongoing gene flow between all B. anynana populations as they share a similar mitochondrial haplotype, and we found three individuals in Uganda that appear to be crosses between the two subspecies. The greater chemical than genetic differentiation lends support to the hypothesis that chemical divergence was concomitant, or preceded, genetic divergence. This study is one of the few comparing chemical and genetic differentiations at the intraspecific level in animals (White and Lambert 1995;Sperling et al. 1996;Tregenza 2002;Watts et al. 2005;Eltz et al. 2008;Dopman et al. 2010;Palacio Cort es et al. 2010;Runemark et al. 2011;Khannoon et al. 2013;Svensson et al. 2013). For example, Tregenza (2002) and collaborators showed that cuticular hydrocarbon profiles that participate to reproductive isolation vary across European populations of the grasshopper Chorthippus parallelus. In several occasions, this genetic approach allowed to unravel that the hypothesized populations were actually cryptic species (White and Lambert 1995;Sperling et al. 1996;Svensson et al. 2013). Our results suggest that variation in sex pheromone composition may be responsible for the evolution of premating isolation among B. anynana populations, and hence may act as a driver for speciation.
A follow-up question would be how frequent is this important divergence of intraspecific chemical profiles in the Bicyclus genus? Preliminary data on genetic and chemical distances were gathered for three additional Bicyclus species (Appendix 1). In one of the three, Bicyclus vulgaris, we could detect qualitative differences of pMSP composition associated to geography similar to what was observed in B. anynana (Appendix 6). In addition, B. vulgaris displayed genetic differentiation at the mitochondrial level. In contrast, Bicyclus safitza and Bicyclus smithi did not display clear differences in chemical profiles between sampled populations, although the latter showed mitochondrial differentiation associated to geographic location. Despite the low sample sizes, this suggests that the accumulations of chemical and genetic differences are uncoupled in the Bicyclus genus.
Differences in sex pheromone composition among geographically distant populations have been shown to have a genetic basis in several Lepidoptera (Sappington and Taylor 1990a,b;LaForest et al. 1997;Groot et al. 2013). Additional information would be necessary to determine whether the observed chemical differences are genetically fixed or plastic in B. anynana. Indeed, we performed chemical analyses on wild individuals sampled at different locations and time periods of the year. While MSP1 and MSP2 ([Z]-9-tetradecen-1-ol and hexadecanal) in B. anynana are fatty acid derivatives and synthesized de novo (Li enard et al. 2014;Wang et al. 2014), terpenoid compounds such as MSP3 (6,10,14-trimethylpentadecan-2-ol) are likely derived from host plant compounds (Schulz et al. 2011) and potentially affected by host plant use. In addition, climatic or ecological conditions, such as alternative wet and dry seasons typically experienced by these tropical butterflies, could be responsible of part of the observed variation (Landolt and Phillips 1997;Buckley et al. 2003;Groot et al. 2009;Geiselhardt et al. 2012;Larsdotter-Mellstr€ om et al. 2012). The Bicyclus genus is known for its seasonal polyphenism with dry and wet season forms varying in life history traits and displaying differences in wing patterns (mostly studied in B. anynana; Brakefield and Reitsma 1991;Windig et al. 1994;Brakefield et al. 2009;Prudic et al. 2011). However, several observations argue that part of the observed variation of MSP in B. anynana may be genetically based. First, preliminary results in B. anynana laboratory population originating from Malawi showed that the MSP composition was neither affected by the temperature during development (the triggering factor of the two seasonal forms) nor by host plant (maize or Oplismenus sp., the natural host plant of B. anynana; Nieberding et al. 2008). Second, we did not find MSP1 and MSP3 in the population of B. anynana from Uganda in five additional males sampled in 2007, which suggests that plasticity was likely not responsible for their particular MSP composition.
To conclude, the pMSP of B. anynana displays variations of interspecific magnitude at an intraspecific level (i.e., low genetic differentiation) and is therefore a good candidate to drive reproductive isolation and speciation.

Acknowledgments
We are grateful to Marie-Claire Etong Nguele, Zacharie Manyim Mimb, Benjamin N'Gauss, Carole Ngono, and L eon Tapondjou for logistical support in Cameroon; to

Supporting Information
Additional Supporting Information may be found online in the supporting information tab for this article: Figure S1. Mass spectra of the pMSP components. Table S1. Detailed list of sampled individuals for the four Bicyclus species.  Table A1). For these, three males and two females per population were used for the chemical analyses (Table A1). The four species can be considered independent at the scale of the genus as they are relatively distant in the phylogenetic tree (10% K 2 P distance on average for the mitochondrial gene COI).

Appendix 2: Phylogenetic reconstructions for the four Bicyclus species
Maximum-likelihood (ML) trees were reconstructed using the program Garli (Zwickl 2006;Bazinet and Cummings 2011). The model of sequence evolution GTR + G + I was selected to encompass the features of the best models of all four species and the two genes to allow comparison of phylogenetic reconstructions on a common basis. The model ranking was based on corrected Akaike information criterion computed with JModeltest (Posada 2008). Final consensus trees for each species and each gene were obtained from 2000 bootstrap replicates starting from the best tree of 10 independent ML searches. We rooted the phylogenetic trees with closely related species: Bicyclus campa and B. mollitia for B. anynana; B. funebris for B. safitza; B. buea, B. golo, and B. sophrosyne for B. smithi; and B. jefferyi and B. moyses for B. vulgaris (Condamin 1973;Monteiro and Pierce 2001).
We performed the test of Shimodaira and Hasegawa (1999;S.H. test hereafter) to compare likelihood scores of trees reconstructed with different constraints in order to test more specifically the support for population (defined according to the sampling location) separation in different clades of the trees. We used positive and negative constraints to test the support of population separation in different clades. Positive constraints forced all individuals of the same population to be kept in a single clade in the trees during the reconstruction. Conversely, negative constraints forced the individuals of each population to be distributed across different clades. We used positive constraints to test the support of observed separation of some populations in distinct clades, and negative constraints to test the support of observed admixture of populations. To compare the different hypotheses, each corresponding best tree was imported into R program (Paradis et al. 2004; R Development Core Team 2011); the branch lengths were optimized with the functions pml and optim.pml for the tree likelihood to be subsequently compared with S.H. tests (package Phangorn, Schliep 2011). The likelihoods of the trees for each hypothesis and the P values of the tests are reported in the Table A2. Table A1. Summary of the geographical distribution and size of the samples for genetic and chemical analyses of Bicyclus safitza, Bicyclus smithi, and Bicyclus vulgaris. Details for each individual are given in Table S1.

Number of sequences
The ML phylogenetic reconstructions using COI supported the genetic differentiation of geographic clades in B. smithi (bootstrap support, BT hereafter, of 86% for separating Uganda from Cameroon) and B. vulgaris (BT of 71% for grouping Nigerian haplotypes, Fig. A2). The geographically intermediate B. vulgaris from Cameroon grouped with either Nigerian or Ugandan haplotypes showing the coexistence of the two mitochondrial lineages in Cameroon. For the two species, the clade separation in the COI gene was supported by S.H. tests (Table A2). In contrast, the CAD gene showed no genetic differentiation between populations of the two species. B. safitza did not display geographic clade either for COI or for CAD, despite the distance separating Nigeria and Uganda (approximately 2600 km, Figs. A1, A2). Constraining the tree for the presence of the clades did not improve the likelihood, and SH test revealed no significant differences (Table A2).

Appendix 3: Congruence between COI and the nuclear markers in B. anynana
To assess whether the COI or the CAD marker was a better signal of the evolutionary history of B. anynana, we obtained additional sequences for the nuclear marker EF1a for seven Ugandan individuals, two South African individuals from False Bay, and three from Limpopo (Tables 1 and S1). We measured the average number of pairwise differences between all individuals for each of the three genes. The average divergence between populations is larger than the average divergence within populations for the two nuclear markers, but not for the COI (Table A3). This confirmed the deep nuclear divergence found for Ugandan individuals compared to more southern populations, which is also in accordance with the description of a Ugandan subspecies, using morphological traits, B. anynana centralis, Condamin, 1968. As COI is supposed to evolve faster than the nuclear markers (EF1a is used to resolve deep nodes in phylogeny studies in butterflies), this results mean that COI is unreliable to describe the neutral genetic divergence of the populations in this species. In the three other species investigated, COI displays, as expected, a larger divergence between populations (Appendices 2 and 4).
This relative mitochondrial uniformity in B. anynana suggests a recent spread of one haplotype across populations (mitochondrial introgression). Such a pattern of mitochondrial discordance is not uncommon and justifies the use of several independent markers (Funk and Omland 2003;Bazin et al. 2006). The erasing of mitochondrial diversity can be due to drift (stronger in the mitochondrial genome with lower effective population size due to haploidy and maternal inheritance), selection of an advantageous mutation (selective sweep facilitated by the absence of recombination in this genome), or selection by symbiotic interaction with bacteria such as Wolbachia (reviewed in Ballard and Whitlock 2004). According to de Jong et al. (2011), the neutrality of the polymorphism in COI sequences was not met for several B. anynana populations, which would reject the first hypothesis of fixation by drift. We applied neutrality tests H of Fay and Wu (2000) and D of Tajima (1989) with the DnaSP software (version 5.10.01; Librado and Rozas 2009) and observed that both were significant for the COI but not for the CAD gene (Table A4). This further supported the hypothesis of a recent sweep on the mitochondrial polymorphism.  This analysis was performed with MEGA5 and the standard error computed with 1000 bootstrap (Tamura et al. 2011). The ratio between the number of differences between and within population is stronger in CAD than in COI in accordance with the ANOVA and pairwise Φ ST results.  variance explained for COI and CAD, respectively, Table A5). Of note, the divergence between the mitochondrial clades found within B. smithi and B. vulgaris (average uncorrected pairwise difference of 1.96 AE 0.34% SD and 1.53 AE 0.17%, respectively) was under the level known for other Lepidopteran subspecies (3.4% in Heliconius erato, Brower, 1994;or 2% in Spodoptera frugiperda host plant strains, Kergoat et al., 2012), suggesting that we are dealing with intraspecific genetic variability.
Appendix 5: Ugandan heterozygotes using the CAD gene of B. anynana The three CAD haplotypes sampled in Uganda but present out of the Ugandan clade were associated to typical Ugandan haplotypes in three heterozygous individuals. These three Ugandan individuals had heterozygous positions for two and three of the three positions that bore, in all other sampled B. anynana individuals, fixed differences between the Ugandan and the Kenyan and South African populations (Table A6). The three heterozygous individuals also presented a heterozygous state for two other positions fixed in Ugandan individuals but still variable in more southern populations (Table A6). Given the frequency of these mutations, the probability of being heterozygous (by mutation) for at least three of these five positions was very low in the Ugandan population (P = 0.008, 0.0004, and 8.56 9 10 À6 for three, four, and five heterozygous positions as observed for these three individuals, Table A7). It was even more surprising to observe no individual heterozygous for only one or two positions (v 2 = 5128.61, P = 0.0001 obtained from 10,000 Monte Carlo simulations; Table A8). Thus, the probability that these three individuals bear alleles both typical from Uganda and from other populations, which differ at several positions, due to mutation only is very low. It is far more likely that they inherited these heterozygous genotypes from recent hybrid matings between Ugandan and more southern individuals. Based on the strong divergence of the clades at the nuclear level, the differentiation of the populations must be ancient compared to pairs of populations investigated in the other species (Appendix 4), yet reproductive isolation is not yet complete given the presence of the three hybrids. The nucleotide positions correspond to the 714 base pairs considered for the CAD gene. In Uganda, the character states are fixed for the five positions except for three individuals ("Mu005", "Iu043", and "Iu049", arrows in Figs. 2 and A2). These five mutations are synonymous. For example, the probability of being fully homozygous for the more frequent state at these five positions is the product of the squared probability of the states at each position (that is: P(C pos.105 ) 2 9 P(T pos.212 ) 2 9 P (C pos.341 ) 2 9 P(A pos.461 ) 2 9 P(A pos.677 ) 2 ; or: 0.96 2 9 0.96 2 9 0.93 2 9 0.93 2 9 0.96 2 = 0.58).