Genetic, ecological, behavioral and geographic differentiation of populations in a thistle weevil: implications for speciation and biocontrol

Abstract Because weevils are used as biocontrol agents against thistles, it is important to document and understand host shifts and the evolution of host-specificity in these insects. Furthermore, such host shifts are of fundamental interest to mechanisms of speciation. The Mediterranean weevil Larinus cynarae normally parasitizes either one of two thistle genera, Onopordum and Cynara, being locally monophagous. In Sardinia, however, both host genera are used. We used three types of data to help understand this complex host use: (i) weevil attack rates on the two host genera among 53 different populations in Sardinia and nearby Corsica, (ii) host preference in a lab setting, and (iii) genetic (allozyme) differentiation among weevil populations exploiting the same or different hosts. Using a subset of populations from northern Sardinia, we attempted to relate interpopulation differences in host preference to gene flow among populations by comparing pairwise differences in oviposition preference (Qst) and in allozyme frequencies (Fst). Overall, Qst and Fst were positively correlated. Fst was positively correlated with geographic distance among pairs of populations using the same host, but not among different-host population pairs. As mating occurs on the hosts, this result suggests reinforcement. Genetic evidence indicates Cynara as the ancestral host of the weevils from both islands and our current studies suggest repeated attempts to colonize Onopordum, with a successful shift in Corsica and a partial shift in Sardinia. This scenario would explain why in Sardinia the level of attack was higher on Cynara than on Onopordum and why, when given a choice in the laboratory, Sardinian weevils preferred Cynara even when sampled from Onopordum. The lability of host shifts in L. cynarae supports caution in using these or related weevils as biocontrol agents of exotic thistles.


Introduction
Parasitic species frequently show spatial variation in host use (Fox and Morrow 1981; Thompson 1994), even among habitats with similar arrays of potential hosts. The relationship between this variation and parasite genetic differentiation is under increasingly intensive study because of its many spinoffs for the evolution of resource use (Bernays and Chapman 1994; Feder et al. 2003; Ferrari et al. 2006; Xie et al. 2007) and speciation (Feder et al. 1988; Via 1999; McCoy et al. 2001; Dres and Mallet 2002). The speciation aspect has acquired renewed impetus from the recent discovery of several mechanisms that should promote genetic divergence between sympatric populations using different hosts. For example, the preference for a particular host can be associated with performance on that host (Singer et al. 1988; Hawthorne and Via 2001) and mate choice behavior can be directly driven by host affiliation (Feder et al. 1994; Gotoh and Kubota 1997; Funk 1998; Nosil et al. 2002, 2007). The latter can occur when mate-attraction pheromones are host-derived (Landolt and Phillips 1997; Emelianov et al. 2001) or when males and females show parallel variation in prealighting host choice (Emelianov et al. 2004).
In addition to the knowledge generated by a few model systems (Bush 1994;Via 2001), the study of the processes underlying speciation can be considerably enriched by investigating systems which are on the verge of speciating. For example, comparing the spatial patterns of host preference and of genetic differentiation may provide insight into the transition to speciation. Analyzing factors that determine the host range of an insect is facilitated in species where this host range varies among populations. In this context, the weevil Larinus cynarae and the thistles it parasitizes is an excellent system to study the evolution of specialization. Here, we address the question of the geographic scale of specialization and investigate its consequences for genetic differentiation in a system in which speciation has not (yet) occurred.
The issue of host use and genetic differentiation among populations also bears on the choice of potential biocontrol agents. In this context, the use of the weevil-thistle system is particularly relevant, as weevils related to L. cynarae have been used as biocontrol agents against thistles (Jordan 1995; Briese 1996; Louda 1998). To choose biocontrol agents wisely, we need to know how insect host ranges evolve, how predictable the direction of such evolution is, and how best to interrogate particular insects about their future evolutionary plans (Strong 1997; Singer 2004; Hufbauer and Roderick 2005; Sheppard et al. 2005). For each group of insects that contains candidates for biocontrol agents, we need to understand the evolution of their host specialization and the mechanics of the host shifts that they undertake. This will assist in predicting the characteristics of parasites, hosts, and their interactions which may make some systems more or less appropriate for biocontrol intervention.
Larinus cynarae exhibits strong geographic variation for host use. In southern France and northern Spain, the weevil feeds exclusively on Onopordum species, while it attacks only Cynara species in southern Spain, continental Italy (with a few rare exceptions on Onopordum, Briese et al. 1996) and Greece. Both host genera are present in each of these regions but only one of them is used, the weevil being thus regionally monophagous (Y. Michalakis and I. Olivieri, personal observation). Such local monophagy is well-known in herbivorous insects (Singer 1971; Fox and Morrow 1981; Sheppard et al. 2005). In contrast to this general pattern of regional monophagy, L. cynarae attacks species belonging to both genera in Corsica and Sardinia (Onopordum illyricum and Cynara cardunculus). Both host species flower at the same time and are roughly equally abundant in Sardinia, although relative abundances and phenology of the two genera vary among locations and Cynara is essentially absent from the extreme North of the island, as well as from Corsica (I. Olivieri, personal observation).
We report on three kinds of empirical data: (i) weevil attack rates on Corsican and Sardinian populations of both plant genera in the field, (ii) host preferences of experienced and naive insects under experimentally controlled conditions, and (iii) genetic differentiation, assessed by enzyme electrophoresis, among weevil populations exploiting the same or different host species. These different lines of evidence enable us to describe the geographic pattern of host exploitation in the field, to assess the potential of different insect populations to attack one or both host genera, and to investigate how host preference and spatial isolation interact to shape the population genetic structure of this weevil. We discuss the implications of our findings for biological control using this type of organisms.

Materials and methods
The natural history of L. cynarae
Larinus cynarae FAB. (Curculionidae: Cleoninae) is a univoltine, sexually reproducing weevil, which feeds, mates, and develops almost exclusively on thistles of the genera Onopordum and Cynara (Asteraceae: Cardueae) in the circum-Mediterranean area. Adult weevils become active in the spring and feed and mate on their host until early summer. During the same period, females lay eggs between the bracts of thistle flower heads. A single egg is laid into each hole that a female has drilled with her snout. In the lab, we observed that females would live for 3-6 weeks, during which they could lay up to 50 eggs (I. Olivieri, personal observation). After hatching, the larvae grow inside the capitula and feed on the developing seeds (Martelli 1948; Michalakis et al. 1993; Briese 1996). Pupation occurs in the capitulum inside a more or less loose cocoon of chewed capitulum tissue. Development from egg to adult lasts about 6 weeks. After completing development, the adults emerge from the dry capitula, often disperse away from the plants, overwinter by undergoing diapause, and appear again the following spring. Adults do not feed from the time of eclosion until breaking diapause in spring and do not survive after the reproductive period.

Attack rates in the field
When we started this study, we did not know which host(s) would be attacked in the two Mediterranean islands considered here. Both islands are slightly closer to Italy, where Cynara is the principal host of L. cynarae, than to France, where only Onopordum is attacked. The geographic proximity of the two islands suggested that gene flow should frequently occur between them. However, Cynara is very rare in Corsica, hence if weevils were present on this island, they would have to parasitize Onopordum. Finally, when we discovered that both hosts were attacked in Sardinia and that both could co-occur at the same site, it was initially unclear whether the local monophagy typical for this weevil would be maintained or whether both hosts might actually be used sympatrically. To address these issues, we developed a long-term field study. From 1995 to 2006, weevils were sampled from host plant populations at 40 sites in Sardinia (Fig. 1), including 22 pure Onopordum populations (hereafter called 'OS' populations for 'using Onopordum in Sardinia'), nine pure Cynara populations ('CS' populations for 'using Cynara in Sardinia'), and nine mixed stands where both species occurred ('OMS' and 'CMS' populations for 'using Onopordum which occurs as a Mixed stand with Cynara in Sardinia' and for 'using Cynara which occurs as a Mixed stand with Onopordum in Sardinia', respectively). At such sympatric sites, plants of both species occurred next to one another. Five additional sites were sampled in Corsica, including four Onopordum populations where only Onopordum was present ('OA' populations for 'using Onopordum in CorsicA') and one pair of sympatric plant populations, which represent the only site in Corsica where Cynara is present ('OMA' and 'CMA' populations for 'using Onopordum which occurs as a Mixed stand with Cynara in CorsicA' and 'using Cynara which occurs as a Mixed stand with Onopordum in CorsicA', respectively).
Because of the unpredictability of host phenology, not all populations were in the right stage for sampling for weevil attacks when visited. Out of 54 populations, 37 were sampled more than once (Table 1). On average each site was visited 2.9 times (SD = 2.1).
Figure 1. Identified sites are those studied in the host preference experiments and/or the allozyme study. Sites MS27, MS16 and MA53 were more particularly considered for the effect of host in sympatry, whereas populations OA22, OA23, MS11, CS16, OS27, CS27, and CS32 were used in Experiment 2 to study the effect of diet on host preference. Gray squares indicate attacked populations of O. illyricum, white squares, unattacked populations of O. illyricum, and black squares attacked populations of C. cardunculus.

Host shifts in weevils
Olivieri et al.

In July of each year, a sample of at least 50 capitula per host species (1-5 capitula per plant) was haphazardly collected from each site sampled that year and brought back to the lab for dissection (in most cases about 100 capitula were dissected). The attack rate on each host was defined as the percentage of capitula that contained at least one weevil: either a well-developed larva (L3, L4, or nymph) or an emerging adult. To compare attack rates in Corsica and Sardinia, to address the effect of the co-occurrence of both plants on host use, and to take into account temporal variation, we tested whether attack rates on the two host species were significantly variable among population types (CS, CMS, OS, OMS, OA, OMA, and CMA) and years using a generalized linear mixed-effects model (hereafter called GLMM). Population type and year were considered as fixed effects. Data from a given population (i.e. weevils sampled at a given site from a given host species) across years are not necessarily independent. To control for this potential lack of independence, and thus to avoid pseudo-replication, we considered the effect of population as a random effect, as suggested by Pinheiro and Bates (2000). The number of attack rates per population type was too small to estimate the interaction term population type:year. Models were computed with the lme4 package of R, using the lmer function (Bates and Sarkar 2007). We assumed a binomial error with a logit link function. The significance of the effects was assessed by comparing the described model with and without each fixed effect using chi-squared tests on differences in deviance; all models were fitted using unrestricted maximum likelihood estimation (method = ML) and keeping the same random effects, as suggested by Crawley (2007). Pairwise comparisons between population types were computed using the pvals.fnc function from the languageR package (Baayen 2007), which performs 10 000 MCMC simulations to estimate P-values.
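The deviance comparison used to test each fixed effect can be illustrated with a minimal Python sketch. It is not the lme4 workflow used in the study: it only shows how a difference in deviance between two nested models is referred to a chi-squared distribution. The function names and the two deviance values in the usage lines are hypothetical, and the tail probability is implemented only for 1 d.f. or an even number of d.f., which is enough for illustration.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable.

    Sketch only: closed forms exist for df == 1 and for even df;
    other cases are deliberately not covered here.
    """
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df >= 1")
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        half = x / 2.0
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i          # (x/2)^i / i!
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("odd df > 1 not covered in this sketch")

def deviance_test(dev_reduced, dev_full, df_diff):
    """Likelihood-ratio test: the drop in deviance when a fixed effect
    is added is compared with a chi-squared on df_diff d.f."""
    stat = dev_reduced - dev_full
    return stat, chi2_sf(stat, df_diff)

# Hypothetical deviances of a model without and with one fixed effect.
stat, p_value = deviance_test(120.0, 111.03, 1)
print(round(stat, 2), round(p_value, 4))
```

Both models must be fitted by maximum likelihood with the same random effects for the comparison to be meaningful, as stated above.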

Host preference experiments
To understand the observed patterns of attack in the field, we performed several host preference experiments, classified into two main types described below.

Experiment 1: host preferences of experienced insects
In June 1996, six weevil populations on O. illyricum (five Sardinian populations, OS2, OS4, OMS7, OS14, OS21, and one population from Corsica, OA22), and three on C. cardunculus in Sardinia (CS8, CS12, CMS16) were sampled during their oviposition period (see Fig. 1 for the location of these populations). Twenty to 50 adult weevils were collected haphazardly on host plants at each site and brought back alive to the CSIRO laboratory in Montpellier to be subjected to oviposition preference experiments. In the lab, each weevil was fed with the same plant species from which it had been collected. Plant material was collected in southern France. Host preference was tested by introducing one or two females (with one male added per female) into cages in which two to four fresh ramets of C. cardunculus and of O. illyricum had been transplanted in sand, about 10-20 cm apart. Each ramet bore one to three capitula in the early blooming stage and the capitula of each host had approximately the same size. Weevils were left in the cage for 2 days, and the number of eggs on each capitulum was counted at the end of this period. Overall, seven to 34 replicates were performed for each population (mean = 12.3, SD = 8.4, see Table 1 for sample size per population). The total number of females tested was 208 (109 experiments with 1 or 2 females tested) and the total number of eggs was 593. Thus, each female laid about three eggs in 2 days, close to what would be observed in natural conditions (Martelli 1948). The preference of each weevil (or pair of weevils combined) is expressed as the ratio of the number of eggs laid on Onopordum over the total number of eggs laid in the cage.
In 2000 and 2004, the same experiment was performed with females sampled in June from various populations (in 2000: OA22, OA23, OS4, OS14, CS8, CS12, CMS16, OMS16, CS32, CMS27, CMA53; in 2004: OS52, OA22, OA23, OMA53, and CMA53, see Table 1 for sample sizes). After feeding on leaves from its original plant species, each female was transferred individually to a plastic container in which she was offered a simultaneous choice between the two hosts. Each container had a single capitulum of each host species of approximately the same size. Capitula were replaced every 2 days. The old capitula were removed and the eggs on them counted.
From 2001 to 2003, as well as in 2004 for one site, the same experiment was performed, but with adult weevils that had been gathered as pupae in the previous year. As seems to occur in natural conditions, the insects did not feed prior to diapause. They were kept at 4°C till April. Diapause was broken by placing the weevils at room temperature and providing them with the host plant on which they had been sampled. Fifteen populations were studied this way with a total of four to 72 weevils per population (OA22, OA23, OS14, OS21, OS52, OMS11, OMS16, OMS27, CS13, CS32, CS78, CMS16, CMS27, CMS35, and CMS82) (Fig. 1, see Table 1 for sample size). For six other populations (CS12, OS2, OS4, OS33, OMS7, and CMS11), sample size was lower than four, but their inclusion or exclusion from the analysis did not affect the results. We analyzed the above dataset in several ways described below.
Variation for host preference
Using the above dataset, we tested the hypothesis that host preference was independent of population type (CS, CMS, OS, OMS, CMA, OMA, or OA), using a GLMM as previously described with population type as a fixed effect and year and population as random effects. To study the interaction between year and population type in a meaningful way, we would need a more balanced study. We assumed a binomial error weighted by the total number of eggs laid by each female or each replicate (pair of females).
We also tested the hypothesis that host preference was independent of the host species on which weevils had been collected, using a GLMM (see above) with host as a fixed effect and year and population as random ones. Because there was a single (sympatric) population of Cynara in Corsica, and because host preference was found to vary between the two islands (in particular between populations using Onopordum), we performed this last comparison within Sardinia only.
To compare the divergence among populations for host preference with that for allozymes (see below), we defined a 'preference distance' between two populations (Qst) as a phenotypic analog of the standardized variance of gene frequencies (Fst): if $p_i$ is the observed mean proportion of eggs laid per female or per replicate on Onopordum for population i in the preference experiments, Qst between any two populations i and j is calculated as $\mathrm{Var}(p) / [\bar{p}(1 - \bar{p})]$, with Var(p) the observed variance of p among the two populations, $\bar{p}$ the arithmetic mean of $p_i$ and $p_j$, and $\bar{p}(1 - \bar{p})$ the maximum value of Var(p). We calculated Qst among pairwise populations for the 1996 dataset, so as to compare these preference distances with geographic distances, as well as with genetic distances obtained in the allozyme study described later (Fst). As Wright (1969) has shown, when the variance of a given selectively neutral quantitative trait is determined by many additive gene effects, the genetic differentiation among populations generated by genetic drift will be equivalent to that at the underlying genes (QTL), or at any neutral locus. This theoretical background has been used to identify traits undergoing homogeneous or heterogeneous selection, for which the amount of genetic differentiation would be smaller or larger, respectively, compared to that observed for likely neutral loci (Bonnin et al. 1996; McKay and Latta 2002; Le Corre and Kremer 2003). Further, a positive correlation between Qst and Fst can be interpreted as evidence either for a genetic basis of the quantitative trait, or for a covariation between Fst and some environmental factor which also affects Qst.
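The preference distance defined above can be written as a short Python sketch. Two points are assumptions rather than statements from the text: the variance between the two populations is taken as the population variance (divisor 2), which is the convention under which the mean times one minus the mean is its maximum, and a pair of populations that both lay all their eggs on the same host is given a distance of 0. The function name and the input values are illustrative only.

```python
def preference_distance(p_i, p_j):
    """Qst-like preference distance between two populations.

    p_i, p_j: mean proportion of eggs laid on Onopordum in each population.
    Returns Var(p) / [p_bar * (1 - p_bar)].
    """
    p_bar = (p_i + p_j) / 2.0
    # Population variance over the two values (assumed divisor: 2).
    var_p = ((p_i - p_bar) ** 2 + (p_j - p_bar) ** 2) / 2.0
    max_var = p_bar * (1.0 - p_bar)
    if max_var == 0.0:
        # Both populations fixed on the same host: assumed distance 0.
        return 0.0
    return var_p / max_var

# Illustrative values, not data from the study.
print(preference_distance(0.14, 0.34))
print(preference_distance(1.0, 0.0))   # complete divergence gives 1.0
```

With this convention the distance ranges from 0 (identical preferences) to 1 (each population lays all its eggs on a different host), which mirrors the range of Fst for a biallelic locus.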
Sympatric sites: association between host plant and preference
Because we found differences between populations of weevils sampled from different hosts, we also tested for preference differences between weevils that could use different hosts in their field site. We used weevils from three sites where both hosts occurred and were attacked (Sympatric sites: MS16 and MS27 in Sardinia, and MA53 in Corsica, see Table 1). We tested for the effects of host, population, and their interaction on host preference using a generalized linear model (GLM). We used the glm function of R with a quasibinomial family as error structure and an F-test to check for the effect of host, as suggested by Crawley (2007, p. 578). Using a quasibinomial family allows the model to estimate a dispersion parameter which will scale the nominal variance to take into account departure from a true binomial error (McCullagh and Nelder 1989, p. 124-128). To study the effect of host within each population, GLMs were subsequently computed for each population.
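The role of the dispersion parameter can be made concrete with a small Python sketch. It assumes the usual moment estimator, the Pearson chi-squared statistic divided by the residual degrees of freedom, and takes the fitted probabilities as given instead of fitting a model. The function name and the egg counts are hypothetical.

```python
def pearson_dispersion(successes, trials, fitted, n_params):
    """Pearson estimate of the dispersion of a binomial-type model.

    successes[i]: eggs laid on Onopordum by replicate i
    trials[i]:    total eggs laid by replicate i
    fitted[i]:    fitted probability for replicate i (strictly between 0 and 1)
    n_params:     number of parameters estimated by the model
    """
    chi2 = 0.0
    for y, n, mu in zip(successes, trials, fitted):
        expected = n * mu
        # Squared Pearson residual: (observed - expected)^2 / binomial variance.
        chi2 += (y - expected) ** 2 / (n * mu * (1.0 - mu))
    residual_df = len(successes) - n_params
    return chi2 / residual_df

# Hypothetical replicates: two females, 10 eggs each, common fitted rate 0.5.
phi = pearson_dispersion([2, 8], [10, 10], [0.5, 0.5], n_params=1)
print(phi)
```

A value near 1 is what a true binomial error would give; larger values indicate overdispersion, in which case the nominal binomial variance is scaled up and effects are tested with an F-test, as described above.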
Experiment 2: test for induction of host preference in naive insects
We estimated the effects of diet on oviposition preferences of individuals from seven populations, with the aim of understanding the causes of the preference variation among populations revealed in Experiment 1. We used naive adult weevils that had been collected as pupae in the previous year. After diapause was broken, half the weevils from each test population were fed with Cynara and the other half with Onopordum. After they had fed and mated for about 2 weeks on their test diet, the females were transferred individually to plastic containers in which each female was offered the choice between the two hosts as in the previous experiment. There were two Corsican populations from Onopordum (OA22 and OA23), two Sardinian populations from Onopordum occurring at a sympatric site (OMS11 and OMS27) and three Sardinian populations from Cynara (CMS16, CMS27, and CS32), two of which were sympatric with Onopordum. Population OMS11 was sampled in both 2001 and 2002, and tested the following years. For this population, data across years were pooled. Populations OMS27 and CMS27 were sampled in 2003 and studied in 2004. The other populations were sampled in 2002 and studied in 2003. At least 10 females were tested per diet, except for population CMS27 where only five females were tested on Onopordum and four on Cynara (see Table 1 for sample sizes).
Because we had studied few populations for each host, we did not study the host:diet interaction. Instead we tested for the effects of population, diet, and their interaction on host preference by a GLMM as previously described for the first experiment. Here, population, diet, and their interaction are the fixed effects, and host is considered as a random effect to control for a potential confounding effect of differences among hosts. To study the effect of diet within each population, GLMs were subsequently computed for each population, with an F-test to test for the effect of diet.

Enzyme polymorphism
In July 1995, several hundred mature capitula were sampled from eight Sardinian populations (OS2, OS4, OS14, OS21 on Onopordum, and CS8, CS12, CMS16, CS20 on Cynara, see Fig. 1) known to have been attacked the previous year, as well as two Corsican populations (populations OA22 and OA23), and brought back to the laboratory in Montpellier. Emerging insects were killed in liquid nitrogen and stored frozen at -80°C until being processed for enzyme polymorphism using the methods previously described for Larinus (Michalakis et al. 1992; Briese et al. 1996). Overall, 272 weevils were scored for 10 polymorphic loci, of which seven were highly polymorphic at the level of the species, and five at the level of Sardinia (see Appendix).
Differentiation over all samples and within each host was tested using Fisher's method for combining probability tests. Unbiased estimates of the associated P-values were calculated using the Markov chain method computed by genepop version 3.4 (Raymond and Rousset 1995). Wright's F-statistic Fst (Wright 1951) was estimated by the estimator θ̂ of Weir and Cockerham (1984). We also used the gda software (Lewis and Zaykin 2001) to perform a hierarchical ANOVA, to compare the amount of variation within and among hosts in Sardinia. Confidence intervals for θS (among populations within hosts), and θP (among host species) were obtained by bootstrapping over loci (Lewis and Zaykin 2001).
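Bootstrapping over loci can be sketched in Python as follows. This is not the gda implementation: it assumes that each locus contributes a numerator (the among-group variance component) and a denominator (the total variance), that the multilocus estimate is the ratio of the summed numerators to the summed denominators, and that the interval is a simple percentile interval. All names and the per-locus values are hypothetical.

```python
import random

def multilocus_theta(components):
    """Ratio-of-sums estimate from per-locus (numerator, denominator) pairs."""
    num = sum(a for a, _ in components)
    den = sum(t for _, t in components)
    return num / den

def bootstrap_ci_over_loci(components, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval obtained by resampling loci with replacement."""
    rng = random.Random(seed)
    n = len(components)
    estimates = []
    for _ in range(n_boot):
        resample = [components[rng.randrange(n)] for _ in range(n)]
        estimates.append(multilocus_theta(resample))
    estimates.sort()
    lower = estimates[int((alpha / 2.0) * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot))]
    return lower, upper

# Hypothetical variance components for five loci: (among-group, total).
loci = [(0.010, 0.40), (0.002, 0.35), (0.015, 0.45), (0.008, 0.30), (0.004, 0.42)]
print(multilocus_theta(loci))
print(bootstrap_ci_over_loci(loci))
```

An interval that includes 0 is read as no significant differentiation at that level of the hierarchy, and an interval entirely above 0 as significant differentiation.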
The correlation between Fst and pairwise differences in host preference between populations (Qst) studied in Experiment 1 (all eight Sardinian populations studied for allozymes except population CS20), and the correlation between either of these two pairwise distance matrices and geographic distance, were tested with Mantel's test (Mantel 1967) using Pearson's correlation coefficient as the test statistic. To test the significance of the correlation between Fst or Qst and geographic distance depending on whether pairs of populations used the same or different host plants, we used a randomization test by modifying the standard Mantel's test procedure to account for the particular structure of the distance matrices being handled. For populations using different host plants, each of the two distance matrices (one for geographic distances, the other for Fst or Qst) is rectangular (with populations on Onopordum in, e.g. columns and populations on Cynara in rows), and we randomly permuted the rows and columns of one of them (1000 permutations each time). In the case of populations on the same host plant, for each distance there were two symmetric matrices (relative to the diagonal), each of them corresponding to one host plant. We independently permuted the rows and columns of both matrices for one of the distances and then combined the randomly generated matrices to calculate Pearson's coefficient. The two-sided P-value of the test is calculated as the proportion of sampled permutations where the absolute value of the correlation coefficient is greater than or equal to the observed absolute value.
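The standard Mantel procedure and the two-sided P-value described above can be sketched in Python. The sketch covers only the ordinary test on two symmetric distance matrices, not the modified randomization used for same-host and different-host pairs. Function names, the seed, and the toy distances are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def upper_triangle(matrix, order):
    """Pairwise distances above the diagonal after relabelling populations."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def mantel_test(dist_a, dist_b, n_perm=1000, seed=0):
    """Correlation between two distance matrices and its permutation P-value.

    Rows and columns of dist_a are permuted together; the P-value is the
    proportion of permutations with |r| >= |r observed|.
    """
    identity = list(range(len(dist_a)))
    b_vals = upper_triangle(dist_b, identity)
    r_obs = pearson(upper_triangle(dist_a, identity), b_vals)
    rng = random.Random(seed)
    count = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        r_perm = pearson(upper_triangle(dist_a, perm), b_vals)
        if abs(r_perm) >= abs(r_obs) - 1e-12:   # small tolerance for rounding
            count += 1
    return r_obs, count / n_perm

# Toy example: five populations on a line, second distance proportional to the first.
positions = [0.0, 1.0, 3.0, 7.0, 12.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[2.0 * d for d in row] for row in geo]
print(mantel_test(geo, gen))
```

Permuting population labels, and not the individual matrix cells, preserves the dependence among the pairwise distances that share a population, which is why an ordinary correlation test is not appropriate here.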
To interpret our results and determine the ancestral host of the Sardinian and Corsican weevils, we used enzyme data from Briese et al. (1996) on weevils from Spain, southern France, Italy and Greece, and a subset of our own data (seven loci out of 10, corresponding to the first seven in Appendix), to reconstruct a distance tree at the scale of the Mediterranean basin. The species Larinus latus, specialized on Onopordum (assumed to be the ancestral host of L. cynarae by Briese et al. 1996), was used as outgroup. We used the phylip 3.57 package (Felsenstein 1994). The program seqboot was used to produce 1000 datasets by bootstrapping over loci; gendist was used to compute the Cavalli-Sforza distance, and for each dataset the tree was constructed using the neighbor-joining method. The program consense allowed the reconstruction of the consensus tree.

Results
Attack rates in the field
The attack rate (proportion of capitula with at least one larva having reached the third instar) varied widely among years, plant species and plant populations, ranging from 0% to 100%. On average, 33% of capitula were attacked in plant populations that were used as hosts (sample size above 50, usually 100). The mean attack rate in 2001 (43.5%) was particularly high, and that in the record heat-wave year of 2003 particularly low (15.9%). As a result, the effect of year was highly significant (χ² = 456.90, 8 d.f., P < 0.0001). We also found a significant effect of population type (χ² = 41.05, 6 d.f., P < 0.0001), with Sardinian populations of C. cardunculus and pure Corsican populations of O. illyricum significantly more heavily attacked than Sardinian populations of O. illyricum (Fig. 2, shared letters among population types indicate nonsignificant differences; all significant differences had P < 0.007).
Figure 2. Mean attack rates (percent of capitula containing at least one larva or emerging adult) per population type. CMS and CS: Cynara cardunculus from, respectively, sympatric and single-species sites in Sardinia; OA: pure Onopordum illyricum from Corsica; CMA and OMA: C. cardunculus and O. illyricum from the unique sympatric site in Corsica; OMS and OS: O. illyricum from, respectively, sympatric and single-species sites in Sardinia. Each bar shows the average attack rate over 1-7 years of data. Letters over each bar indicate significant differences among population types: shared letters indicate a lack of significant difference (see text). Letters over CMA and OMA are only indicative, as these types are represented by a unique population.


Host preference experiments
Variation of host preference (Experiment 1)
Figure 3 shows the pattern of host preference over all experiments performed between 1996 and 2004 with weevils fed with the host plant from which they were sampled. The effect of population type was significant (χ² = 31.3, 6 d.f., P < 0.0001). In Corsica (dotted and hatched bars), the mean proportion of eggs laid on Onopordum varied from 59% (CMA53) to 81% (OA23). Thus, overall, weevils preferred Onopordum in Corsica. Conversely, in Sardinia, the mean proportion of eggs laid on Onopordum varied from 8% for a naturally Cynara-feeding weevil population at a sympatric site (CMS27) to 53% for a population which naturally fed on Onopordum (OS52). Thus, regardless of their original host and location, Sardinian weevils generally preferred to oviposit on Cynara, or showed no preference (z-test, z = -9.4, P < 0.0001). However, within Sardinia, there was a significant difference in preferences between weevils from the two host plant origins, with populations naturally found using Cynara more strongly preferring Cynara compared to populations naturally found on Onopordum (with average proportion of eggs laid on Onopordum of, respectively 14% and 34%; χ² = 8.97, 1 d.f., P = 0.0027).

Sympatric sites: association between host plant and preference (Experiment 1)
We specifically tested the effect of host at three sympatric sites (indicated in Fig. 3 by horizontal lines linking populations). As weevils from these three sites had significantly different preferences (F2,87 = 10.78, P < 0.001), we tested for the effect of host within each site. At sites MA53 and MS16 there was no trend for a difference in preference between weevils sampled from the two hosts (MA53: F1,14 = 0.004, P = 0.95; MS16: F1,49 = 0.027, P = 0.87), whereas at site MS27 there was a large and significant trend for weevils from one host genus to prefer that same genus in experimental preference trials (F1,21 = 5.12, P = 0.034).

Test for induction of host preference in naive insects (Experiment 2)
There was no significant main effect of the weevils' diet on their oviposition patterns (χ² = 0.64, 1 d.f., P = 0.43) across all populations (Fig. 4). However, five out of seven populations showed the same trend of increasing preference towards the host they had previously experienced as a diet. Furthermore, there was a significant interaction of population and experimentally-controlled diet (χ² = 33.31, 6 d.f., P < 0.0001). Host preference of weevils during oviposition trials was strongly and significantly influenced by previous experimentally-manipulated diet in only one population (CS32) (F1,30 = 6.55, P = 0.016). In a second population (OA22), there was a weaker and nonsignificant tendency for induction of preference (F1,21 = 2.48, P = 0.13), whereas experimental diet did not significantly influence host preference by weevils from other populations (F < 0.69, P > 0.42) (Fig. 4).

Relationship of preference distance to geographic distance in Experiment 1
Over the eight populations studied for host preference in 1996, no significant relationship between preference distance and geographic distance was found (permutation test, r = -0.03, P = 0.89) (Fig. 5A). The sign of the correlation was positive (but still nonsignificant) when only those pairs of populations collected on the same host species were considered (r = 0.28, permutation test, P = 0.54), and negative when we considered only those pairs of populations in which each member of the pair used a different host species (r = -0.44, permutation test, P = 0.26) (Fig. 5A). This last trend was essentially because of weevils at two sites (population OS14 on Onopordum and population CS8 on Cynara), which showed unusually strong preferences for the hosts that they used (Fig. 3).

Enzyme polymorphism
There was a weak though significant differentiation among populations of the two islands considered together (Fst = 0.040, Fisher probability test, P < 0.001), as well as among Sardinian populations (Fst = 0.022, Fisher probability test, P < 0.001). The average Fst among pairs of populations was larger between populations on different hosts than between populations exploiting the same host (mean Fst = 0.029 and 0.017, respectively). However, a hierarchical ANOVA (gda) suggested that among-host differentiation was not significantly different from 0 (θP = 0.006, CI obtained by bootstrapping over loci: -0.001 to 0.014) whereas within-host differentiation was significantly positive (θS = 0.025, CI: 0.004 to 0.050).

Within Sardinia, there was no significant correlation between genetic distance and geographic distance (Fig. 5B, r = −0.03, Mantel test, P = 0.91). However, Fst and geographic distance between sites became positively and significantly correlated when only those pairs of populations exploiting the same host were analyzed (Fig. 5B, r = 0.61, permutation test, P < 0.0001). On the other hand, when we considered only those pairs of populations collected on different host species, we found a (nonsignificant) negative correlation between Fst and geographical distance (Fig. 5B, r = −0.49, permutation test, P = 0.14), a pattern similar to the relationship between Qst and geographic distance (Fig. 5A). Thus, for population pairs exploiting different host plants in Sardinia, the genetic differentiation between geographically closely located populations was just as high as that between populations separated by large distances.
Although the overall differentiation was small in Sardinia, a significant positive correlation was observed between preference distance Qst and Fst (Fig. 5C, r = 0.59, Mantel test, P = 0.02), suggesting that host preference has affected the genetic structure of the weevil metapopulation, or vice versa. This relationship was even stronger when we analyzed only those pairs of populations sampled on different hosts (r = 0.67, permutation test, P = 0.05), whereas it was no longer significant when only same-host population pairs were considered (r = −0.08, permutation test, P = 0.74).
We used allozyme polymorphism to study the likely origin of Sardinian and Corsican populations, combining data from the present work with previously-published data on a subset of the same loci. We found that both Sardinian and Corsican populations were more closely related to populations specialized on Cynara sp. (western Italy and southern Spain) than to populations specialized on Onopordum sp. (N. Spain and S. France) (Fig. 6). This phylogeographic pattern offers an explanation for our finding that Sardinian weevils exhibited a general tendency to prefer C. cardunculus, regardless of the host they naturally used.

Discussion
We begin our discussion by drawing together our accumulated evidence from patterns of allozyme and preference variation to infer the current processes involved in generating spatial and temporal patterns of attack by our study species on its two host genera. Subsequently, we discuss the implications of our findings for gene flow and host-range evolution. We then ask how results such as ours may contribute to making informed decisions about the potential risks posed by exotic insects used in biocontrol programs.

Preferences of Sardinian weevils
In contrast to the regional monophagy exhibited over most of its range, L. cynarae weevils exploit both host plant genera in Sardinia. However, Cynara plants were generally more heavily attacked than Onopordum (Fig. 2). The preference experiments under controlled conditions corroborated field observations. Overall, Sardinian weevils preferred Cynara to Onopordum. However, when given the choice, weevils from populations that used Onopordum in the field laid more eggs on this species than weevils from populations using exclusively Cynara. These experimental results show that behavioral differences exist among populations.

Positive correlation between genetic distance and preference distance
The present study is the first to show a quantitative relationship between a continuously varying host preference and a continuously varying genetic divergence. The relationship we found was a positive correlation (Fig. 5C).
One can ask what mechanisms underlie this positive correlation between Qst and Fst. First, marker polymorphism could be directly involved in host preference. Indeed, there is some evidence that allozyme polymorphism might not always be neutral with respect to, e.g., assortative mating (Feder and Filchak 1999). However, the same correlation pattern was observed when we used microsatellite data (F. Justy and I. Olivieri, unpublished data). Therefore it is likely that the observed pattern does not reflect that of particular genes under selection. Another possibility is that genetic differentiation is a direct consequence of host preference: weevils from populations which exhibit different preferences may be less likely to encounter each other than insects from populations with similar preferences. In this case, gene flow among populations with different host preferences would be more restricted than that among populations with similar preferences. Alternatively, it could be argued that host preference, just as allozyme variation, is neutral and behaves just like any neutral marker (Jimenez-Ambriz et al. 2007, and references therein for examples of Fst-Qst studies). However, as differentiation at allozyme loci is much lower than differentiation for host preference, preference is most likely under diverging selection. Another possibility would be that host preference is not genetically determined and that Qst simply reflects phenotypic plasticity. However, in this case, the strong correlation between Qst and Fst would remain unexplained.
Although environmental influences on preference, such as the induction demonstrated here, are frequent in beetles, genetic influences typically exist alongside them, leading to significant heritability of oviposition preference (Tucic and Seslija 2007; and references therein) and rapid response to artificial selection (Fricke and Arnqvist 2007).
Other authors have shown how host preference might mediate genetic divergence between host-races (Rice 1985; Duffy 1996; Craig et al. 1997; Feder et al. 1997; Ferrari et al. 2006; Frantz et al. 2006). Indeed, assortative mating based on host preference is expected to lead to genetic differentiation (Feder et al. 1988, 1997; McPheron et al. 1988; Craig et al. 1993). Since L. cynarae do mate on their host plant, this mechanism is likely.

Host preference and genetic differentiation: a role for reinforcement
Assuming that Fst reflects current gene flow, our results suggest that, among populations on different hosts, gene flow among nearby sites is at least as low as that among distant sites, whereas among same-host populations isolation by distance occurs. Indeed, although the overall differentiation among populations is small, there is a tendency for pairs of populations using different hosts to be more genetically distinct than pairs using the same host. More importantly, the two types of population pairs show strikingly different patterns of association between Fst and geographic distance. In the Sardinian dataset, the significant positive correlation between Fst and geographic distance, expected under the standard isolation-by-distance scenario, is observed among same-host population pairs. However, this correlation disappears or even becomes negative when we consider only population pairs using different hosts (Fig. 5B).
The trend toward a negative correlation of Fst or Qst with geographic distance among different-host populations suggests that nearby different-host populations actually exchange fewer genes than those further apart. One possible explanation for this pattern is that increased host fidelity has been directly selected for in areas of sympatry or parapatry, as a premating barrier to lessen cross-breeding between weevils associated with Onopordum and Cynara. Thus, the pattern could correspond to a process of reproductive reinforcement (Butlin 1987; Noor 1999) to reduce the production of less fit hybrids between populations specialized on alternative host plants. Note, however, that we have no evidence yet for hybrids having a low fitness.
The results from our host preference experiments suggest that (i) learning affects host preference differently across populations, and (ii) reinforcement does not systematically occur in sympatric populations (Fig. 3). This variation may be caused by the patterns of variation of the populations themselves in the field. Indeed, thistle or weevil populations are not stable entities. Throughout the 10 years of sampling, some populations have disappeared and/or they have been (re)colonized, suggesting that local extinctions or bottlenecks of plant and/or weevil populations are frequent (I. Olivieri, personal observation). When a population becomes either very scarce or temporarily extinct, it may be recolonized by immigrants from the same host or from the alternate host. When colonization occurs from the alternate host, this may blur the effect of reinforcement. However, we expect a bias towards same-host colonizations as occurs in other oligophagous insects (Hanski and Singer 2001).
Overall, the pattern of host preference appears as one of small isolated populations displaying a mosaic of levels of attack, with repeated attempts to colonize a novel host (Onopordum), seemingly leading to selection for reproductive isolation, as suggested by the unexpected patterns of local genetic differentiation. It will be very interesting to follow the evolution of these populations, some of which might prove to be a natural example of speciation mediated by reinforcement on host preference.

Phylogeographic scenario and ongoing adaptation on alternative hosts
Over most of its range L. cynarae is monophagous on either Onopordum or Cynara, even when both hosts are available. This monophagy is brought about by strong host preferences: in experimental trials French females laid 94% of their eggs on Onopordum and females from southern Spain specialized on Cynara laid 95% of their eggs on plants of this genus (Y.D. and I.O., unpublished data). In an open-field experiment, females from Greece specialized on C. cardunculus did not lay any eggs on Onopordum (Briese et al. 1995).
The existing evidence suggests that Sardinia was colonized by Cynara-exploiting weevils. The higher field attack rates on Cynara compared to Onopordum (Fig. 2), in conjunction with the distance tree based on enzyme polymorphism (Fig. 6), indicate that these weevils were primarily adapted to Cynara. Further, most insects collected on Onopordum laid more eggs on Cynara than on Onopordum when given the choice (Fig. 3). This also supports the scenario of an ongoing host-shift from Cynara to Onopordum, as other studies have also found a host shift to be followed by a lingering preference for the traditional host among insects using the novel host (Berlocher and Feder 2002).
If L. cynarae are indeed undergoing a host-shift from Cynara to Onopordum, they are returning to the host identified as the ancestral host of their taxonomic group (according to Briese et al. 1996). This would not be surprising. Janz and Nylin (1998) showed that, in butterflies, a higher tendency to recolonize ancestral hosts helps to explain the apparent large-scale conservatism in the patterns of association between insects and their host plants, patterns which at the same time are flexible on a more detailed level. There are several other examples of such evolutionary conservatism (Thompson 1993;Futuyma et al. 1994;Futuyma and Mitter 1996;Fox et al. 1997).
Our results confirm that the members of the Curculionid taxon Cleoninae can indeed undergo multiple colonizations and radiations on the Cynaroideae, as previously suggested by Zwölfer and Herbst (1988). Geographic variation of insect diet implies its rapid evolution (Singer 1971; Funk and Bernays 2001). Altogether, our current results confirm the great flexibility and evolutionary potential of host preference in these weevils, as has been shown in other insects (e.g. see Taber 1994; Feder et al. 1997).
One of the most notorious examples of ill-advised biological control involves yet another thistle-head weevil, Rhinocyllus conicus, that was introduced against slender thistles (Carduus pycnocephalus and C. tenuiflorus) from 1968 onwards in the United States and Canada, and that was later found attacking rare, endemic species of the native American flora (Louda et al. 1997; Strong 1997; Louda 1998; Louda 2005; Russell et al. 2007).
The history of R. conicus shows the importance of understanding the ecological and evolutionary causes and consequences of host-specificity and host shifts prior to making artificial introductions. Despite this cautionary tale, biological control research is continuing unabated. When control is successful its economic impact can be enormous, as in the recent dramatic success of an introduced weevil in clearing water hyacinth from Lake Victoria (Wilson et al. 2007).
Evidently, one should be more cautious when using insects for biocontrol than were the enthusiasts who introduced R. conicus, which was already known to have a fairly wide host range (Strong 1997). To assess the risk to native species posed by biocontrol agents, we need to be able to predict their likely evolutionary trajectories. How can this be approached? Recent reviews by Hufbauer and Roderick (2005) and Sheppard et al. (2005) express considerable optimism that the problems are now well enough understood that if current knowledge were applied uniformly, attack on nontarget plants could be effectively avoided. For example, these authors note that regulations now require introductions to be made from a specific population that has been tested for its potential host range, not just from a species from which some populations have been tested.
There are still, however, some very basic questions to which we do not have answers, such as: 'is there a lower risk when a sample is taken from a strictly monophagous species than from a strictly monophagous population of a species with geographic variation of diet?' (Singer 2004). Although it might seem intuitively obvious that insects in taxonomic groups with strictly monophagous species are less likely to indulge in host shifts, this might still not be true. In groups of strictly monophagous species, each host shift must have been associated with a speciation event. This is true regardless of the direction of cause and effect, i.e., whether the host shifts trigger the speciation events or whether the speciation events facilitate the host shifts.
But this does not necessarily mean that host shifts are rarer in groups with strict monophagy. It could be, on the contrary, that these groups have higher rates of speciation but the same rate of host-shifting as groups containing regionally-monophagous species. This is a testable hypothesis (Singer 2004).
Even if we knew whether we should restrict the search for biocontrol agents to totally monophagous species or also include regional monophagy, the present study illustrates the practical difficulty of classifying species as strictly or regionally monophagous. If L. cynarae were studied superficially, it would probably be recorded as completely monophagous. If the study were extended broadly enough geographically, the weevil would be recorded as using two hosts, but always locally monophagous. It is only with luck and extensive work that one finds there are spots in its distribution where its diet is fluctuating, flexible, and probably rapidly-evolving. How many strictly monophagous species are there, and how many that are recorded as monophagous would turn out not to be so with sufficient study? In any case, it seems that weevils contain both species that are strictly monophagous and those that are regionally so, as in the present case.
In the case of L. cynarae, the more detailed the investigation undertaken, the broader and more flexible the diet appears to be. However, there are cases where the exact opposite occurs and detailed molecular investigation reveals a supposed generalist insect species as a cluster of cryptic species with narrow diets. Hebert et al. (2004) titled their DNA-barcoding study of neotropical skippers 'Ten species in one', while Fumanal et al. (2004a,b) discovered that an apparently generalist European weevil actually comprised two morphologically identical species, a generalist and a specialist. When this occurs, previously unsuspected candidate biocontrol agents can be revealed and made available for study. Overall, recent work, including that reported here, suggests that even in insect groups regarded as suitable for biological control, the factors that influence host range may not yet be well enough understood to give us the necessary confidence to predict future evolution of introduced agents. Nonetheless, we consider that pursuit of the ability to make these predictions remains a worthwhile enterprise.