Genetic effects on life-history traits in the Glanville fritillary butterfly

Background: Adaptation to local habitat conditions may lead to the natural divergence of populations in life-history traits such as body size, time of reproduction, mate signaling or dispersal capacity. Given enough time and strong enough selection pressures, populations may experience local genetic differentiation. However, the genetic basis of many life-history traits, and their evolution under different environmental conditions, remain poorly understood.

Methods: We conducted an association study on the Glanville fritillary butterfly, using material from five populations along a latitudinal gradient within the Baltic Sea region, which show different degrees of habitat fragmentation. We investigated variation in 10 principal components, combining a total of 21 life-history traits, according to two environmental types, and 33 genetic SNP markers from 15 candidate genes.

Results: We found that nine SNPs from five genes showed a strong trend for trait associations (p-values under 0.001 before correction). These associations, although non-significant after correction for multiple testing (1,086 tests in total), were consistent across the study populations. Additionally, these nine SNPs also showed an allele frequency difference between the populations from the northern fragmented versus the southern continuous landscape.

Discussion: Our study provides further support for previously described trait associations within the Glanville fritillary butterfly species across different spatial scales. Although our results alone are inconclusive, they are concordant with previous studies that identified these associations to be related to climatic changes or habitat fragmentation within the Åland population.

investigated over the last 20 years in the Åland islands, and the species is now considered a model organism for the dynamics and evolution of natural (meta)populations.

Allelic variation in metabolic genes is often associated with life-history traits in natural populations (Marden 2013). A much-studied example is the gene Phosphoglucose isomerase (Pgi) in butterflies but also in other insects (Dahlhoff & Rank 2000) and plants ([...] Pedersen 1998). In the Orange Sulfur butterfly (Colias eurytheme, Pieridae), Pgi allozyme alleles are associated with survival in the field (Watt 1977) and reproductive success (Watt et al. 1985), while in the Glanville fritillary butterfly (Melitaea cinxia, Nymphalidae), a SNP in Pgi is associated with flight metabolic rate (Niitepõld 2010; Niitepõld et al. 2009) and dispersal rate in the field (Hanski & Mononen 2011; Niitepõld et al. 2009), population growth rate (Hanski & Saccheri 2006), female fecundity (Saastamoinen 2007b), body temperature at flight (Saastamoinen & Hanski 2008), lifespan (Orsini et al. 2009; Saastamoinen et al. 2009) and larval development. Considering insect studies on other candidate genes, Saastamoinen et al. found genetic associations between the incidence of an extra larval instar and allelic variation in Pgi, Serpin-1 and Vitellin-degrading protease precursor genes. Importantly, many of these associations involve a significant interaction between genotype and ambient (Niitepõld 2010) or acclimation temperature, consistent with the hypothesis of temperature-dependent enzyme kinetics.

Drosophila (Walker et al. 2006 [...]

Pre-diapause larvae of the Glanville fritillary were sampled from five regional [...]

[...] proportions, and were ArcSin-transformed prior to being integrated into the PCAs.
The ten analysed PCs accounted for the majority of the total variation in the traits included in the respective PCAs (74% in developmental traits, and over 58% in male and female adult traits).
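The two preprocessing steps mentioned here can be illustrated with a minimal sketch. It assumes that "ArcSin-transformed" refers to the conventional arcsine square-root transform for proportions, which the text does not state explicitly, and it computes the share of variance retained by the leading principal components from a set of eigenvalues. The function names, the example proportions and the eigenvalues are all hypothetical and are not taken from the study.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1].

    Assumption: this is the variant meant by "ArcSin-transformed".
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"proportion out of range: {p}")
    return math.asin(math.sqrt(p))


def cumulative_variance_explained(eigenvalues, n_components):
    """Share of total variance captured by the n_components largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:n_components]) / sum(ordered)


# Hypothetical proportions (e.g. a survival fraction per family).
proportions = [0.0, 0.25, 0.5, 1.0]
transformed = [arcsine_sqrt(p) for p in proportions]

# Hypothetical eigenvalues of a PCA on five standardized traits.
eigenvalues = [3.0, 2.0, 1.0, 1.0, 1.0]
share = cumulative_variance_explained(eigenvalues, 3)
print(transformed, share)
```

The transform stretches values near 0 and 1, which is why it is commonly applied to proportions before they enter a variance-based method such as PCA.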

Normality of the ten PCs was assessed using Shapiro-Wilk tests (α=0.01). [...] Table S6).

The quality of SNPs, including deviation from the Hardy-Weinberg equilibrium (HWE), was assessed using an in-house quality control pipeline (Wong 2011). False discovery rate calculation was used to correct for multiple testing (threshold of 0.05) (Hochberg & Benjamini 1990). For the pilot study, one of the two Pgi SNPs genotyped did not reach our quality criteria, while [...] (Supplementary Table S6).

[...] interaction were used as explanatory variables in those models, while population was included as a random factor nested within the environment type. Similar models, excluding the sex factor, were used to test genetic associations with the three principal components from both PCA F and PCA M. Initial models with the interaction term were subsequently simplified by removing non-significant terms to give a final minimal adequate model.

We accounted for additive, dominant, recessive, and over-recessive/dominant effects of the alleles, and conducted false discovery rate calculation (Hochberg & Benjamini 1990) to correct for multiple testing for all tests. This means that each raw p-value was corrected for a total of 1086 tests conducted (including 4 inheritance models x 10 PCs x 33 SNPs that reached our quality criteria, Table S6). An FDR-adjusted p-value threshold of 0.05 means that about 5% of the associations declared significant are expected to be false positives.
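The inheritance models and the multiple-testing correction described above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: the genotype coding is the textbook 0/1/2 scheme, the label "overdominant" stands in for the heterozygote-versus-homozygotes contrast, and the adjustment shown is the standard Benjamini-Hochberg step-up procedure, which may differ from the exact calculation the authors applied. All function names and example p-values are hypothetical.

```python
def encode_genotype(n_copies, model):
    """Code a genotype (0, 1 or 2 copies of the tested allele) under one model."""
    if n_copies not in (0, 1, 2):
        raise ValueError(f"invalid allele count: {n_copies}")
    if model == "additive":        # effect proportional to allele count
        return n_copies
    if model == "dominant":        # one copy is enough to show the effect
        return int(n_copies >= 1)
    if model == "recessive":       # two copies are needed to show the effect
        return int(n_copies == 2)
    if model == "overdominant":    # heterozygotes differ from both homozygotes
        return int(n_copies == 1)
    raise ValueError(f"unknown model: {model}")


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four association tests.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
print([encode_genotype(g, "recessive") for g in (0, 1, 2)])
```

For scale, under this procedure a raw p-value of 1.33e-4 ranked first among 1086 tests gives p × m / rank ≈ 0.144 before the monotonicity step, which illustrates how strong raw associations can remain above a 0.05 threshold when the number of tests is large.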

One association was found between Pgi:c.331A>C and PC 1-3 (p-value = 3.84e-4, dominant effect). Individuals with one or two copies of the C allele had higher values of PC 1-3 than the AA homozygotes (Fig. 2). PC 1-3 is primarily negatively correlated with the weights of [...] Table S3).

[...] indicating that smaller males mate at an older age. Finally, PC M3 was positively correlated with flight activity (probability to fly) and with mating success (total number of copulations and age at first mating); thus males that fly more also mate more, despite starting to mate at an older age. There were no direct effects of environment on any of the PCs.

We found an association between PC M1 and the SNP c3917_est:386A>C in the Serine proteinase-like protein gene (p-value=1.64e-4, recessive effect, Fig. S6). Butterflies with at least one copy of the A allele showed lower PC M1 values (AA=-3.07, AC=0.13 and CC=15.09).

We found an association between SNPs in the SgAbd-8 gene and PC M2 (SNP [...] Fig. 4B). There were no significant differences between the genotypes in the northern fragmented environments, but the heterozygous individuals from the southern continuous environments had higher PC M3 values than the homozygous individuals (Fig. 4B).

There was also a significant association between PC M3 and the SNP hsp_1:206T>G in the Heat shock 70kDa protein coding gene (p-value=1.33e-4, recessive effect, Fig. S7A). In [...]

[...] (Table 1); however, these associations did not hold up following correction for multiple testing. One reason for the lack of significant results may be the small sample size for each population, combined with the large number of tests performed, leading to false negative results. We therefore believe that many of the positive results before correction are robust, most likely because the selection of candidate genes was largely based on previous studies (discussed below) in which these genes had already shown significant or nearly significant effects. Furthermore, we used phenotypic traits, some of which have been highly heritable in previous studies (de Jong et al. [...]

In each case there is support from previous studies on the Glanville fritillary, or from other insect studies, and to some extent from the present study, that these genes may influence [...]

8. Glucose-6-phosphate dehydrogenase (G6PD) gene, in which polymorphism has been associated with mating success in Colias butterflies (Carter 1988).

The results of the genetic association for the SNP Pgi:c.331A>C are a case in point. We found significant associations with larval development. The AA homozygotes have heavier larvae following diapause, but smaller pupae than the other genotypes. These results were consistent across the four populations, even though there are great differences in the frequency of the C allele in the populations: the C allele is much more common in the two northern populations from the fragmented environment (around 50%) than in the southern populations from the continuous environment (around 20%).