Ecotypic differentiation under farmers' selection: Molecular insights into the domestication of Pachyrhizus Rich. ex DC. (Fabaceae) in the Peruvian Andes

Abstract Understanding the distribution of crop genetic diversity in relation to environmental factors can give insights into the eco‐evolutionary processes involved in plant domestication. Yam beans (Pachyrhizus Rich. ex DC.) are leguminous crops native to South and Central America that are grown for their tuberous roots but are seed‐propagated. Using a landscape genetic approach, we examined correlations between environmental factors and phylogeographic patterns of genetic diversity in Pachyrhizus landrace populations. Molecular analyses based on chloroplast DNA sequencing and a new set of nuclear microsatellite markers revealed two distinct lineages, with strong genetic differentiation between Andean landraces (lineage A) and Amazonian landraces (lineage B). The comparison of different evolutionary scenarios for the diversification history of yam beans in the Andes using approximate Bayesian computation suggests that Pachyrhizus ahipa and Pachyrhizus tuberosus share a progenitor‐derivative relationship, with environmental factors playing an important role in driving selection for divergent ecotypes. The new molecular data call for a revision of the taxonomy of Pachyrhizus but are congruent with paleoclimatic and archeological evidence, and suggest that selection for determinate growth was part of ecophysiological adaptations associated with the diversification of the P. tuberosus–P. ahipa complex during the Mid‐Holocene.


| 499
DELÊTRE ET aL. resource allocation to vegetative or reproductive parts, and select for different ecotypes adapted to contrasted environmental conditions. Determinacy is a particularly desirable trait because a compact growth habit is associated with shorter growth cycle, which is advantageous in short-season environments (Kwak, Toro, Debouck, & Gepts, 2012).
Indeterminate growth, in contrast, induces a prolonged competition for whole-plant resources between reproductive and vegetative parts, but is better adapted to tropical environments, where constant humidity allows continuous plant growth and confers a competitive advantage as plants must compete for light.
Yam beans (Pachyrhizus Rich. ex DC., Fabaceae) are a good model for studying the need for manipulating whole-plant resource allocation and its role in driving the evolution of plant architectural traits during plant domestication. Unlike most legume crops, yam beans are grown not for their seeds, which contain high levels of toxic polyphenols, but for their tuberous roots, which have high starch, vitamin, and protein contents as well as high levels of micro-and macronutrients (Ramosde-la- Peña, Renard, Wicker, & Contreras-Esquivel, 2013).
For many plants, developing tuberous roots is a common strategy to survive protracted drought, and Harris (1972) suggested that most tropical root crops were domesticated in regions with marked seasonality before they were introduced in the more humid tropics.
In root and tuber crops, most of which are propagated vegetatively, roots usually monopolize most plant resources (Kooman & Rabbinge, 1996). In legume plants, however, whole-plant resource allocation is determined mostly by the sink effect exerted by the developing seeds, the strength of which is proportional to the number of maturing pods on the plant, thereby influencing resources available for the rest of the plant (Bennet, Roberts, & Wagstaff, 2011). Because yam beans are grown only from seeds, farmers must find a compromise between producing seeds (the part used for propagation) and producing storage roots (the part used for consumption) (Høgh-Jensen, Mora, Morera, & Sørensen, 2008;Nielsen & Sørensen, 1998). The evolutionary importance of this tradeoff for the domestication and diversification of yam beans, however, has not been documented.

Yam beans have received increasing attention in recent years
for the high-quality starch and high nutritional value of their roots (Doporto, Mugridge, García, & Viña, 2011;López et al., 2010).
However, their potential for crop improvement is limited by a lack of data on genetic resources in the genus (Santayana et al., 2014) and by confusion around phylogenetic relationships between Pachyrhizus species that are still not fully resolved.
The genus Pachyrhizus comprises five species (Sørensen, 1988), three of which are native to South America: (i) P. panamensis R.T.
Clausen, which is only known in the wild; (ii) P. tuberosus (Lam.) Spreng., which is known both in the wild and as a cultivated species; and (iii) P. ahipa (Wedd.) Parodi, which is endemic to Bolivia and known exclusively as a cultivated species. The most widespread species, P. tuberosus, forms a complex of four cultivar groups: (i) "Ashipa," (ii) "Chuin," (iii) "Yushpe," and (iv) "Jíquima." Cultivars differ in leaf morphology, ecology, and traditional cultivation methods (Sørensen, Døygaard, Estrella, Kvist, & Nielsen, 1997). The most common cultivar group, Ashipas, are found in Ecuador, Peru, Bolivia, and Brazil. In Ecuador, Ashipas are known as "Iwa," "Capamu," or "Namau." In Brazil, where their cultivation is now marginal, Ashipas are known as "Jacatupé" or "Feijão-macuco" (Da Silva, Da Silva Filho, & Ticona-Benavente, 2016). Ashipas are usually grown in slash-and-burn agroecosystems, in parts of the forest that are never inundated (terra firme). In contrast, Chuins and Yushpe thrive on rich alluvial soils and are grown on floodplains (várzea) that are inundated for 4-6 months a year (Oré Balbín, Sørensen, Kvist, & Delgado Vasquez, 2007). Chuins are found along the Marañón and Ucayali rivers in Peru, while Yushpes are cultivated only along the Ucayali River (Oré Balbín et al., 2007). The last cultivar group, Jíquima, is endemic to seasonally dry tropical forests (SDTFs) of the Guayas and Manabí provinces, in coastal Ecuador. Unlike other P. tuberosus cultivars, all of which are indeterminate, the Jíquima cultivar is a bushy plant with determinate growth, very similar to the Andean yam bean, P. ahipa, another endemic to SDTFs, which is found only in the semiarid inter-Andean valleys of Bolivia and northern Argentina.
Seasonally dry tropical forests are characterized by <1,600 mm of annual rainfall and a marked dry season of 5 months at least, when total rainfall drops below 100 mm (Gentry, 1995). Environmental conditions in these regions are particularly suited for the evolution of tuberous roots in plants (Harris, 1972), and SDTFs have been proposed as important centers of origin for Andean roots and tuber crops (Debouck et al., 1995;Pearsall, 2008).
Little is known about the origins and history of South American yam beans. The origin of the Bolivian "Ajipa" (P. ahipa), in particular, is unknown, as no wild parent has yet been identified. This study represents the most comprehensive molecular study to date on the genus Pachyrhizus and aims at shedding light on the domestication history of yam beans in South America. We present new molecular evidence based on chloroplast DNA sequencing and a set of new nuclear microsatellite markers. Using a landscape genetic approach and approximate Bayesian computation (ABC), we examine correlations between environmental factors and phylogeographic patterns of genetic diversity within Pachyrhizus landrace populations, and evaluate different scenarios for the diversification of yam beans in the Peruvian Andes. Results are discussed through the lens of historical ecology.

| Sampling, genotyping, and DNA sequencing
We analyzed 244 accessions (wild and cultivated) from field and herbarium specimens spanning the whole South American distribution range of the genus (Table S1). Total genomic DNA was extracted from 20 mg of lyophilized leaf tissue using NucleoSpin 96 Plant kits (Macherey-Nagel, Hoerdt, France), following the manufacturer's instructions. Purified DNA was eluted in a final volume of 200 μl, and final concentration was checked using a NanoDrop ND-1000 spectrophotometer (Labtech, Palaiseau, France). Genetic diversity was analyzed using the 17 nuclear Simple Sequence Repeats (SSRs) designed by Delêtre, Soengas, Utge, Lambourdière, and Sørensen (2013). PCR amplification, multiplexing, and genotyping were carried out following Delêtre et al. (2013). Three loci (AIP05, AIP09, and AIP34) showed no polymorphism across all species and were discarded.

| Phylogeographic patterns of genetic diversity
We used BAYESCAN 2.5 (Foll & Gaggiotti, 2008) to carry out neutrality tests for microsatellite markers and detect outliers signaling loci under selection. BAYESCAN assumes a model in which a common migrant pool exchanges genes with a number of subpopulations that differ from the common gene pool according to a population-specific F ST . The difference in allele frequency between this common gene pool and each subpopulation is measured by a subpopulation-specific F ST coefficient. Although yam beans are predominantly self-pollinating (Sørensen, 1996), which can increase the rate of false positives, BAYESCAN was shown to perform well with selfing populations when many (small) populations are considered (De Mita et al., 2013).
Sixteen subpopulations were considered (Table S1). Default settings were used, and loci were considered outliers when their q-value was <0.01 after applying a 1% false discovery rate correction. Loci under selection were removed from analysis.
Analyses of genetic variability were carried out using an extended dataset consisting of one copy of each unique multilocus genotype (MLG) identified at each sampling site (n = 86) (Table S1). Mean number of alleles per locus (A), allelic richness corrected for sample size (A R ), unbiased expected (H E ) heterozygosity (Nei, 1978), and Fstatistics (Weir & Cockerham, 1984) were computed using FSTAT 2.9.3.2 (Goudet, 1995). Because Pachyrhizus populations were fixed for one allele at most loci, Jost's measure of estimated differentiation (D est ), which estimates genetic differentiation based on shared alleles rather than based on mean heterozygosity (Jost, 2008), was used to carry out population differentiation tests with the R package DEMEtics (Gerlach, Jueterbock, Kraemer, Deppermann, & Harmand, 2010).
The structure of multilocus genetic diversity was explored using the Bayesian clustering method implemented in STRUCTURE 2.3.4 (Pritchard, Stephens, & Donnelly, 2000). STRUCTURE uses Markov chain Monte Carlo algorithms to group individuals in K clusters, assuming Hardy-Weinberg equilibrium and linkage equilibrium within each cluster, to reconstruct hypothetical ancestral populations. A narrow dataset consisting of one copy of each unique MLG (n = 39) was used to avoid autocorrelation due to repeated MLGs. K was set to vary between 1 and 10. For each value of K, ten independent runs based on 100,000 Markov chain Monte Carlo iterations with a burn-in period length of 10,000 generations were performed, assuming admixture and uncorrelated allele frequencies. No group priors were defined.
Discriminant analyses of principal components (DAPC) were performed in R (R Development Core Team, 2016) using the package Adegenet (Jombart, 2008). DAPC is a two-step multivariate method which associates principal component analysis with discriminant analysis to achieve maximal discrimination of individuals into predefined groups (Jombart, Devillard, & Balloux, 2010). Unlike Bayesian approaches, which assumes that populations are at Hardy-Weinberg equilibrium and loci at linkage equilibrium, nonparametric methods make no assumptions and may be used to confirm results of Bayesian analyses when some conditions are not met (e.g., panmixia); DAPC is also better suited to unravel underlying genetic structuring in complex groups (Jombart et al., 2010). To infer the number of genetic clusters, DAPC uses sequential K-means clustering of principal components.
The "true" number of populations is determined in relation to a Bayesian information criterion (BIC). We retained all principal components (PCs) during the preliminary variable transformation to conserve all the variation in the original data. The number of PCs retained for the discriminant analysis was determined based on α-scores (Jombart, 2008).
Phylogenetic relationships between Pachyrhizus populations were investigated by constructing a phylogenetic network based on Nei's minimum genetic distance (D A ) (Nei, Tajima, & Tateno, 1983). POPULATIONS 1.2.32 (Langella, 1999) was used to produce the matrix of genetic distances and SPLITSTREE4 (Huson & Bryant, 2006) was used to create the split graphs using the Neighbor-Net algorithm. Finally, a minimum spanning haplotype network was constructed from the concatenated sequence data of the two chloroplast DNA (cpDNA) regions using HAPSTAR 0.7 (Teacher & Griffiths, 2011). Summary statistics were calculated using DnaSP 5.10 (Librado & Rozas, 2009). We first performed a PCA to identify a subset of key factors that captured most of the variance (Fig. S1). Based on this initial analysis, six environmental variables were retained: elevation (ALT), annual mean temperature (BIO1), isothermality (BIO3), temperature annual range (BIO7), annual precipitation (BIO12), and precipitation seasonality (BIO15). Longitude (LON) and latitude (LAT) were also included to test for isolation by distance. Additional occurrence data were compiled using herbarium specimens for which geographic coordinates were available, bringing the total to 169 unique presence points representing the 16 subpopulations (Table S1). Environmental data were extracted from the WorldClim database (Hijmans, Cameron, Parra, Jones, & Jarvis, 2005) at 30 arc-second resolution using DIVA-GIS 7.5.0 (Hijmans et al., 2012). Populations were represented by mean values of the eight bioclimatic variables. Burn-in, thinning interval, sample size, and pilot runs were set to default values. The R package beanplot (Kampstra, 2008) was used to represent the observed distribution range of Pachyrhizus landrace populations along environmental gradients.

| ABC analyses
We performed ABC analyses using the program DIYABC 2.1.0 (Cornuet et al., 2014) to compare different evolutionary scenarios for the origin of the Bolivian Ajipa (AC) and the Ecuadorian Jíquima (JI), and their relationship to wild (TW) and cultivated P. tuberosus (TC). Three sets of competing scenarios were compared (Fig. S2).
In the first set, AC, JI, and TC diverged from distinct but closely related wild ancestors (partial genealogical link; scenarios S1-S5); in the second set, AC, JI, and TC were independently domesticated from the same wild ancestor (weak genealogical link; scenarios S6-S7); and in the third set, AC and JI diverged from TC (strong genealogical link; scenario S8). In all scenarios, dispersal events were assumed to be associated with strong genetic bottlenecks.
Because herbarium specimens for wild Pachyrhizus were scarce, we introduced a fifth "ghost" population (Cornuet, Ravigne, & Estoup, 2010) to account for unsampled ancestral populations. All parameters were given wide uniform priors, except for bottleneck duration which was given a log-uniform prior. Microsatellite loci were assumed to follow a symmetric generalized stepwise mutation model (Estoup, Jarne, & Cornuet, 2002), with a maximum range of 40 contiguous allelic states and a 10 −7 -10 −4 mean mutation rate.
As most of cpDNA variation consisted of indels, which the current version of DIYABC does not consider, cpDNA data showed too little variation to be informative and analyses were carried out with nuclear SSR data only. Details on model specifications, priors for demographic parameters, and locus-specific mutation model parameters are given in Table S2.
For each scenario, 1 × 10 6 datasets were simulated and eight pairwise summary statistics were computed (mean number of alleles across loci, mean gene diversity, mean allele size variance, pairwise F ST and classification index, and δμ 2 ). Confidence in scenario choice was assessed by evaluating type I and type II error rates over 100 pseudo-observed datasets (PODs) simulated using parameter values drawn from prior distributions and LDA-transformed summary statistics (Cornuet et al., 2010). Similarly, bias estimates were assessed by simulating 100 PODs under the best-supported demographic scenario and using medians of demographic parameters drawn from the corresponding posterior distributions.

| Genetic variability among Pachyrhizus populations
Diversity indices for nuclear microsatellites and cpDNA are presented in Table 1. At the species level, genetic variability was highest among wild populations of P. tuberosus, which showed polymorphism at all loci but one (Fig. S3). In contrast, P. ahipa was fixed for one allele at 12 of the 14 loci analyzed.
Matrices of pairwise genetic differentiation tests are given in Table S4. The average fixation index for all loci was high (f = 0.907), reflecting the fact that most landraces consisted of few geographically discrete genotypes that were fixed for one allele at most loci and differed from each other by one or two alleles only ( Table 1). The Ashipa cultivar group (P. tuberosus) showed the most variability, with the highest levels of genetic diversity recorded in Ecuador and Peru. In Peru, Ashipas cultivated by Tupi-speaking groups (Cocama) in Dept. Loreto were distinct from Ashipas grown by Pano-speaking groups (Shipibo) in Dept. Ucayali (D est = 0.081, p < .01). Similarly, we found two genotypes of Chuins, one of which was predominantly found in Cocama villages (MLG 36), and the other in Shipibo villages (MLG 37). The two genotypes differed by one locus only (D est = 0.028, p < .01) and were both also detected among Ashipas.
In Bolivia, Ajipa (P. ahipa) showed similarly low levels of genetic variability. Northern twining and southern bushy morphotypes were associated with distinct genotypes but differed by only one locus (D est = 0.064, p < .01). Whereas northern accessions were fixed for AIP30 91 , southern accessions were fixed for allele AIP30 84 (Fig. S4).
Two rare alleles were also found, AIP17 48 in the north, and AIP30 85 in the south. Neither of these was detected in Peru, although both were present among wild populations from Ecuador.
Genetic analyses confirmed that all specimens collected in the Bolivian Yungas belonged to the Ashipa cultivar group (P. tuberosus), albeit to a genotype different from those cultivated in the Peruvian and Bolivian Amazon. This "Yungas type" was characterized by the fixation of two alleles, AIP17 50 and AIP31 71 , which were otherwise detected among only four Peruvian Ashipas (TC356, TC357, TC364, and TC534) and in the Brazilian "Jacatupé" (Fig. S4).
The Ecuadorian Jíquima (P. tuberosus) was found to be genetically uniform, with only one genotype detected across the cultivar's entire distribution area. Although sharing unusual morphological characters, notably a determinate growth habit, Jíquima and Ajipa were genetically distinct (D est = 0.308, p < .01). They were also both characterized by unique alleles at locus AIP30 (AIP30 84 and AIP30 91 for Ajipa, AIP30 95 for Jíquima) and a shared allele at locus AIP23 (AIP23 69 ) that was otherwise detected among wild accessions only. Genetic outlier analyses strongly suggested that this locus may be under disruptive selection (log 10 (q value) = −2.3, p > .99; Fig. S5), and AIP23 was therefore excluded from subsequent analyses. Ajipa was also characterized by a private allele at AIP28 (AIP28 32 ) and AIP31 (AIP31 66 ), while Jíquima was fixed for a private allele at AIP36 (AIP36 70 ).

| Phylogeographic structure of nuclear and chloroplast genetic diversity
Bayesian structuring analyses suggested that data were best rep-  distinguished the two haplotypes (TCW and TCA) resolved for the trnQ-rps16 region. In trnH-psbA, we detected a 9-bp inversion region flanked on both sides by 16-bp palindromic sequences (Fig. S9).
Homoplasious inversions of short sequences flanked by long inverted repeats are common at the intrageneric and intraspecific levels in trnH-psbA, which can lead to overestimation of divergence among closely related species (Whitlock, Hale, & Groff, 2010). The distribution of the two configurations of the inversion among accessions was not random, however, and distinct geographic clusters could be identified (Figure 4b). While the L form characterized lowland P. tuberosus cultivars, the reverse complement (R) was predominantly found among highland Andean accessions (Figure 4c).  (Table S1).
F I G U R E 1 Unsupervised Bayesian clustering of individual Pachyrhizus multilocus genotypes (n = 39), assuming increasing values from K = 2 to K = 4. The asterisk denotes the optimal number of clusters detected by STRUCTURE (Fig. S6)

| Influence of environmental factors on the structure of genetic diversity
The relative importance of environmental factors on genetic differentiation between Pachyrhizus landrace populations was assessed using GESTE. Excluding wild populations, generalized linear regression models showed a positive correlation between precipitation seasonality (BIO15) and genetic differentiation between landrace populations, but no significant influence of elevation (ALT) or geographic distance (LON, LAT) on the structure of genetic diversity F I G U R E 3 Unrooted phylogenetic network based on Nei's minimum genetic distance D A (Nei et al., 1983). (a) Split graph of individual Pachyrhizus multilocus genotypes (MLGs). Splits 1 and 2 indicate strongly supported clusters. The scale bar indicates genetic distance. Nodes are colored according to the main genetic clusters identified by STRUCTURE and DAPC (lineages W, A, and B). Each MLG is identified by a number (see Table S1). Varietal types are also indicated (As, Ashipa; Aj, Ajipa; Ch, Chuin; Iw, Iwa; Ja, Jacatupé; Ji, Jíquima; Yu, Yungas type; Yp, Yushpe). P. panamensis and wild P. tuberosus accessions are indicated as Pw and Tw, respectively. (b) Neighbor-joining tree based on Nei's minimum genetic distance D A and 1,000 bootstrap resampling

| Divergence time between P. ahipa and P. tuberosus
Using ABC, we analyzed phylogenetic relationships between P. ahipa and P. tuberosus. Analyses showed the strongest support for scenario S8 (posterior probability >99% (95% CI [0.988, 0.995]; Fig. S10), in which P. ahipa (AC) and Jíquima (JI) diverged from P. tuberosus (TC) (Fig. S2). Divergence time estimates suggest that TC diverged from TW ~7,600 years ago, while AC and JI diverged from TC ~3,900 and ~2,800 years ago, respectively ( Figure 6 and Table 3). Confidence in scenario choice was strong (Table S5), with a type I error rate of 20%. Type II error rates were also very low (<5%), except in the case of S6 (10%). Bias and precision of parameter estimates are given in Table S6.

| Pachyrhizus ahipa and P. tuberosus share a progenitor-derivative relationship
At present, P. ahipa is only known in cultivation in the subtropical east Andean valleys of Bolivia and northern Argentina (Ørting, Grüneberg, & Sørensen, 1996). Brücher (1989) suggested that P. ahipa might have been domesticated from wild material growing in the Bolivian Yungas, where the holotype specimen was collected (Mandon, 1856), but the species' taxonomic status remains unclear as no wild relative has been found. The striking genetic uniformity of P. ahipa throughout its entire distribution area, however, suggests ancient founder effects which, in turn, suggest that the crop may have been introduced into Bolivia only after it was domesticated. Sørensen (1988) suggested that the distinct morphological characters of P. ahipa might be the result of farmers' selection and a long cultivation history. A possible scenario, supported by the ABC analyses (Figure 6), is that P. ahipa represents a case of "progenitor-derivative speciation" (PDS) (Crawford, 2010;Gottlieb, 2003). In this model of geographic speciation, a population "buds off" from its ancestral population and becomes  (Crawford, 2010).
Although PDS has mostly been applied to wild species, it can be extended to scenarios for the domestication and diversification of crop plants. PDS offers a plausible explanation for the drastic reduction in genetic diversity we found in P. ahipa, suggesting a strong genetic bottleneck following the plant's diffusion from its likely center of origin, in Ecuador, to the southeasternmost edge of the distribution range of the genus, in Bolivia. Morphological and molecular characters are congruent with a PDS scenario. Pachyrhizus ahipa shows several unique morphological characters that distinguish it from P. tuberosus, notably short, simple racemes with few flowers, which contrast with the complex racemes found in P. tuberosus; round legume pods; and the outward curling of the wing petals following anthesis, which is characteristic of the species (Sørensen, 1988). Pachyrhizus ahipa is also unique among yam beans in that it displays a sharp geographic contrast in growth habit, with twining types predominating in the north of Bolivia and bushy erect types prevailing in the south (Ørting et al., 1996). This wide range of infraspecific morphological and physiological plasticity (Sørensen, 1996) contrasts with a very low genetic variability across the species' distribution range. Pachyrhizus ahipa was fixed for one allele at most nuclear loci and shared its unique chloroplast DNA haplotype (TCA) with P. tuberosus, which further supports the hypothesis that P. ahipa and P. tuberosus share a progenitor-derivative relationship.

| Pachyrhizus ahipa diverged from P. tuberosus through ecotypic differentiation
In most p-d species pairs, ecogeographic factors play a major role in preventing gene flow between the two species. Generalized linear regression models showed a clear segregation between P. ahipa (lineage A) and P. tuberosus (lineage B) and highlighted the importance of precipitation seasonality as the main factor affecting the structure of genetic diversity among Pachyrhizus landrace populations.
Self-pollination may have played an important role in facilitating the adaptation of Pachyrhizus cultivars to different environments by accelerating the fixation of beneficial alleles (Glémin & Ronfort, 2012;Ronfort & Glémin, 2013).
In northern Bolivia, P. tuberosus and P. ahipa are both locally known as "Ajipa," but although they are found within a similar altitudinal range (1,500-2,000 m.a.s.l.), they never occur together in farmers' fields (Sørensen, 1996). Ashipas (P. tuberosus) are grown predominantly in the Yungas, a transitional zone between the Andean highlands and Amazon lowland rainforests characterized by warm tropical climate with no dry season, while Ajipa (P. ahipa) is only cultivated in the semiarid inter-Andean valleys. Genetic analyses confirmed that all P. tuberosus accessions from the Yungas belonged to the Ashipa type. Like most Ashipas from Ecuador but unlike those from Peru, all Bolivian Ashipas carried the R form of the cpDNA inversion. Tapia and Sørensen (2003) showed that this "Yungas type" was also morphologically distinct from Peruvian Ashipas; while having the complex inflorescences typical of P. tuberosus, the growth habit, tuber shape and seed size, shape and color of Bolivian Ashipas are more similar to P. ahipa than they are to Peruvian Ashipas (Sørensen et al., 1997). Like other P. tuberosus cultivar groups, however, these Yungas accessions differ from P. ahipa by distinct flowering periods and a markedly longer vegetative growth (Sørensen et al., 1997;Tapia & Sørensen, 2003). These distinct phenological characteristics probably contributed to restrict gene flow between the two species. Hybridization experiments suggest a lack of absolute reproductive isolation between P. ahipa and P. tuberosus (Grüneberg, Freynhagen-Leopold, & Delgado-Váquez, 2003; B. Heider, E. Romero, R. Eyzaguirre, W. Grüneberg in preparation). This lack of reproductive barrier calls the taxonomical distinction between P. ahipa and P. tuberosus into question, but further supports the hypothesis that the two species share a progenitor-derivative relationship.
Disjunct flowering periods and distinct flower morphology could be by-products of selection by farmers for ecotypes adapted to different climatic conditions, that is, for shorter growth cycle, more advantageous in seasonal environments with a marked dry season. Such adaptive differentiation could have been facilitated by Pachyrhizus species being mostly self-pollinating, thus accelerating the fixation of distinct alleles and unique morphological characters between the progenitor (P. tuberosus) and the derivate species (P. ahipa).
T A B L E 2 (A) Posterior probabilities for the models including the eight bioclimatic variables (G1-G8) included in GESTE analyses. The best-supported model is shown in bold. (B) Parameter estimates for the best-supported model P, sum of posterior probabilities for the models including the six bioclimatic variables; α, variable regression coefficient for each factor; 95% HPDI, 95% highest probability density interval; σ 2 , residual variance. found, but comparable to the 400-700 mm annual precipitation rate in Bolivian inter-Andean valleys. While leaf shape differs between the two cultivars, Ajipa and Jíquima share important agromorphological characters, notably determinate growth and short racemes, but also photothermal neutrality (Sørensen, 1996), a trait associated with adaptation to higher elevations and/or drought-prone environments (Kwak et al., 2012). Farming techniques are also similar between the two cultivars, which both require reproductive pruning. In areas where rainfall is strongly constrained by seasonality, a determinate growth habit is more advantageous to farmers as it shortens the growth cycle (Huyghe, 1998), but it compels farmers to prune flower buds to force plants to mobilize resources in increasing the size of tuberous roots. In contrast, constant humidity in tropical forests allows continuous plant growth and favors late flowering indeterminate genotypes, for which reproductive pruning is unnecessary.

| Domestication and diversification of yam beans in South America
Based on the molecular evidence presented in this study, we propose a two-phase scenario for the domestication and diversification of yam beans in the Peruvian Andes (Figure 7).
Approximate Bayesian computation analyses suggest that yam beans diverged from their wild relatives ~8,000 years ago, which coincides with the Early-Mid-Holocene, a period of environmental F I G U R E 6 Best-supported demographic scenario for the diversification of Pachyrhizus in South America, based on ABC analyses. Similar results were obtained when AIP23 was included (data not shown)

TW JI TC AC
T A B L E 3 Posterior estimates (mean and 95% confidence interval) of parameter values for the best-supported evolutionary model based on ABC analyses (see Table S3 for details on priors and conditions). Divergence times are given in years, assuming a 1-year generation time F I G U R E 5 Comparison across Pachyrhizus landrace populations of their distribution range along the six environmental variables included in GESTE analyses: (a) elevation (ALT), (b) annual mean temperature (BIO1), (c) isothermality (BIO3), (d) temperature annual range (BIO7), (e) annual precipitation (BIO12) and (f) precipitation seasonality (BIO15). Fifteen landrace subpopulations were defined (Table S1) as follows: Aj, Ajipa (north and south); Yu, Yungas (lowlands and highlands); As, Ashipa (Loreto, Ucayali, Yungas); Ch, Chuin (Loreto and Ucayali); Yp, Yushpe; Ji, Jíquima; Iw, Iwa; Tw, wild (north and south). Brazilian accessions (Jacatupé), whose exact geographic origin is unknown, were not included. Wild populations are shown for comparison. Beanplots are colored according to the three main genetic clusters identified by STRUCTURE and DAPC (lineages A, B, and W). Individual observations are shown as small horizontal white lines within the estimated density trace. Dotted lines represent the median over all subpopulations transition and agricultural expansion in the Peruvian Andes (Pearsall, 2008;Piperno, 2006). Yam beans were probably domesticated in the dry forests of western Ecuador, where most wild specimens of P. tuberosus have been recorded (Sørensen et al., 1997), and where we also found the highest levels of genetic diversity.
There is strong evidence that reduced precipitations during the Mid-Holocene led to significantly drier climatic conditions in the tropical Andes than at present, causing cloud forests to be replaced by semideciduous dry forests in ecotonal areas (Mayle & Power, 2008 (Figure 7b).
Several landraces of Ashipas and Chuins have been described, the largest diversity of which is found in the Peruvian Amazon. "White Ashipas" have been recorded along the Huallaga, Marañón, and Ucayali rivers, while "yellow Ashipas" occur mainly along the Tigre River but are also found occasionally along the Nanay, Marañón, and Ucayali rivers.
Similarly, farmers distinguish between "white," "yellow," and "purple" Chuins, but also between cultivars with high dry matter (DM) content, known as "Pitichuin," and low DM varieties, known as "Cocotichuin." Purple Pitichuins are the most common, while yellow and purple  (Sørensen et al., 1997). Molecular data seem to support this hypothesis, with Chuins and Jacatupé containing only a subset of the genetic variability found in Ashipas. In turn, the partitioning of morphological and genetic diversity into local landraces suggests serial founder effects accompanying the diffusion of lowland cultivars through the Peruvian Amazon.
The persisting ambiguity about the taxonomic status of these plant remains is mainly due to the absence of cultivars with determinate growth in Peru nowadays (Sørensen et al., 1997). Yam bean agriculture in Peru is strongly linked with the development of pre-Columbian societies on the western Andean foothills during the Late Holocene. River catchment basins form oases of riparian dry forests along the Peruvian coast. Known as "lomas," these forests depend on seasonal rainfall in the headwaters of Andean highlands. Lomas have played a major role in the history of pre-Columbian cultures (Engel, 1973;Fehren-Schmitz et al., 2014), and archeological records indicate they were focal points for the early onset of yam bean agriculture in the Peruvian Andes. However, these lowland populations probably disappeared as desertification progressed on the coast of Peru starting ~AD 500, forcing settlements to relocate into the middle and upper valleys of the Andean foothills (Fehren-Schmitz et al., 2014;Schittek et al., 2015). Roots discovered in the Casma River valley and dated from ~AD 1500 (Ugent et al., 1986) et al. (1995) suggested that "ancestral" populations of P. ahipa might be found (also unconfirmed), should also be given priority for collecting germplasm which could, if it (still) exists, help resolve the history of P. ahipa in Peru.
Given the limited genetic variation in P. tuberosus and P. ahipa landrace populations, it is urgent to document and safeguard genetic resources in Pachyrhizus wild relatives. Pachyrhizus panamensis, in particular, represents an important taxonomic gap which needs to be addressed. Like Ajipa and Jíquima, P. panamensis distribution is restricted to SDTFs. Pollen grains of P. ahipa are morphologically distinct from pollen grains of P. tuberosus and support a close relationship between P. ahipa and P. panamensis (Sørensen, 1989), although this could also indicate an adaptation to similar environmental stress.
Phylogenetic relationships between P. panamensis, P. tuberosus, and P. ahipa cannot be fully resolved, however, because of the rarity of P. panamensis herbarium specimens.
In conclusion, while gray areas remain, this study sheds a new light on phylogenetic relationships between P. tuberosus and P. ahipa and offers insights into the evolutionary history of yam beans in the Andes. Molecular data suggest the two species share a progenitor-derivative relationship, with environmental factors playing an important role in driving selection for divergent ecotypes adapted to contrasted environments. Future collection missions in Peru and Ecuador might bring new elements that will help refine the scenario we propose.