Genetic structure and diversity of natural and domesticated populations of Citrus medica L. in the Eastern Himalayan region of Northeast India

Abstract Citron (Citrus medica L.) is a medicinally important species of citrus native to India and occurs in natural forests and home gardens in the foothills of the eastern Himalayan region of northeast India. The wild populations of citron in the region have undergone rapid decline due to natural and anthropogenic disturbances, and most of the remaining individuals are found in fragmented natural forests and home gardens. To assess the genetic structure and diversity of citron in wild and domesticated populations, we analyzed 219 individuals of C. medica collected from four wild and eight domesticated populations using microsatellite markers. The genetic analysis based on five polymorphic microsatellite loci revealed an average of 13.40 alleles per locus. The mean observed and expected heterozygosity values ranged between 0.220–0.540 and 0.438–0.733, respectively, among the wild and domesticated populations. Domesticated populations showed closer genetic relationships than wild populations, and pairwise Nei's genetic distance ranged from 0.062 to 2.091 among wild and domesticated populations. Analysis of molecular variance (AMOVA) showed higher genetic diversity among‐ than within populations. The analysis of population structure revealed five groups. Mixed ancestry of a few individuals from different populations indicated exchange of genetic material among farmers in the region. Citron populations in the region show high genetic variation. The knowledge gained through this study is invaluable for devising genetically sound strategies for conservation of citron genetic resources in the region.


Introduction
Citrus medica L., commonly known as citron, is native to India (Scora 1975; Mabberley 2004) and occurs as wild and semiwild populations in both primary and secondary forests in the foothills of the Himalayas in northeast India (Hooker 1875; Bhattacharya and Dutta 1956; Tanaka 1977; Nair and Nayar 1997). Citron fruits are widely used in local medicinal practices and are a socioeconomically important genetic resource of the region. Citron is considered to have been a parental contributor to several cultivated Citrus accessions, and has mostly acted as the male parent (Nicolosi et al. 2000). In combination with sour orange (Citrus × aurantium), citron contributed to the origin of lemon (Citrus limon), bergamot (Citrus bergamia), and key lime (Citrus aurantifolia) (Barkley et al. 2006; Ollitrault et al. 2010). Natural populations of citron are severely affected by harvesting and deforestation, and most of the remaining individuals are confined to home gardens and agroforestry systems in the region. Thus, conservation measures are urgently needed to prevent further decline of citron genetic resources, and information on its genetic structure and diversity is essential for formulating conservation and management strategies.
A limited number of population genetic studies of citron using RFLP (Federici et al. 1998), RAPD, SCAR, and cpDNA (Nicolosi et al. 2000), and simple sequence repeat (SSR) and ISSR (Corazza-Nunes et al. 2002; Barkley et al. 2006; Kumar et al. 2010; Garcia-Lor et al. 2015) markers are reported in the literature. Through RFLP analyses, Federici et al. (1998) reported low heterozygosity levels among three C. medica accessions in the Citrus Variety Collection (CVC) at the University of California, Riverside. Barkley et al. (2006) studied 29 citron accessions from the CVC using SSR markers and reported lower heterozygosity values among the C. medica accessions than among the other Citrus species. The low genetic diversity observed among citron accessions could be attributable to selfing, as citrons are known to produce vigorous, highly homozygous seedlings through selfing (Barrett and Rhodes 1976). Genetic studies based on ISSR data also revealed a low level of heterozygosity (Ht = 0.160) in seven accessions of C. medica from northeast India (Kumar et al. 2010). However, Luro et al. (2012) reported high diversity among citron varieties in the Mediterranean region, which could be attributable to intervarietal pollination and seed introductions from Asia. Using RAPD and cleaved amplified polymorphic sequence markers, Nicolosi et al. (2000) reported high genetic diversity among 12 varieties of citron. These studies are based on a limited number of C. medica accessions, and the genetic diversity of citron in its native habitat remains unknown.
The present study, based on extensive sampling in northeast India, is the first to assess the genetic variability of C. medica in its natural habitat. The overall objective is to assess the genetic diversity and structure of wild and domesticated populations of C. medica over a broad geographical area. The specific objectives are to (1) determine the levels of genetic diversity in wild and domesticated populations of C. medica, (2) determine whether the domestication process led to a reduction in genetic diversity, (3) assess the genetic structure and diversity of C. medica in its native habitat, and (4) infer genetic relationships among wild and domesticated populations.

Materials and Methods
Leaf samples from 219 individuals of C. medica (Fig. 1) representing four wild and eight domesticated populations in home gardens in Assam, Arunachal Pradesh and Mizoram (Fig. 2, Table 1) were collected and stored dry until further analyses. The collected samples were identified by comparing morphological characters with those of herbarium specimens and by following taxonomic monographs on Citrus (Bhattacharya and Dutta 1956; Tanaka 1977; Mabberley 2004). Citron has distinctive characteristics: thorny shrubs to small trees; leaves large (length 5-26 cm and width 2.5-9 cm), oblong, with serrate margins and short, wingless petioles; flowers large (3.5-6.5 cm), highly aromatic, mostly in axillary racemes; fruits medium to large in size (length 2.5-12.5 cm and width 1.5-12 cm; individual fruit weight 24-210 g), long-oval to ellipsoid in shape, sometimes necked, apex blunt, color green and yellow; rinds smooth to rough, fleshy and thick (peel thickness 0.50-3.5 cm); juice content low, highly acidic to slightly sweet with varied aroma; seeds numerous, with white cotyledons. A total of 20 individuals per population were sampled, with the exception of the Neairgram and Namsai populations, where only 15 and four individuals, respectively, were available. Morphological features including tree height, leaf length and width, and fruit shape, size and weight were recorded during sampling.
The total genomic DNA from leaves was extracted following the methods of Doyle and Doyle (1987) and Dayanandan et al. (1997). The quality of extracted DNA was tested through electrophoresis on 0.5% agarose gel and staining with ethidium bromide. The PCR amplification of SSR loci was carried out following Barkley et al. (2006, 2009) and Ollitrault et al. (2010) in 15 µL reactions containing 2.0 µL template DNA, 0.2 µL Taq polymerase, 1.5 µL of 10× PCR buffer, 1.5 µL of 2.5 mmol/L MgCl2, 1.5 µL of 0.2 mmol/L dNTP, 0.5 µL of the forward and reverse oligonucleotide primers (2.5 pmol each) and 0.5 µL of the M13 universal forward primer (1 pmol/µL), 0.5 µL DMSO and 6.3 µL sterile dH2O. Thermal cycling parameters consisted of initial denaturation at 94°C for 4 min followed by 35 cycles of 94°C for 1 min, 50-55°C for 45 sec (primer-specific annealing temperature, Table 2), and 72°C for 1 min, and a final extension at 72°C for 7 min. PCR reactions were performed on a GeneAmp PCR System 9700 thermal cycler.
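As a quick consistency check on the reaction recipe above, the listed component volumes can be summed and scaled to a master mix. The sketch below is illustrative only: it assumes that the primer volume means 0.5 µL each of the forward and reverse primers (the reading under which the components add up to the stated 15 µL), and the function names and the 10% pipetting overage are hypothetical choices, not part of the published protocol.

```python
# Component volumes in microlitres, taken from the protocol text.
# Assumption: forward and reverse primers are 0.5 uL each.
REACTION_UL = {
    "template DNA": 2.0,
    "Taq polymerase": 0.2,
    "10x PCR buffer": 1.5,
    "MgCl2 (2.5 mmol/L)": 1.5,
    "dNTP (0.2 mmol/L)": 1.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "M13 universal primer": 0.5,
    "DMSO": 0.5,
    "sterile dH2O": 6.3,
}


def total_volume(components):
    """Return the summed volume of one reaction in microlitres."""
    return sum(components.values())


def master_mix(components, n_reactions, overage=0.1):
    """Scale every component except the template for n reactions.

    `overage` is a hypothetical pipetting allowance (10% by default).
    """
    factor = n_reactions * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in components.items()
        if name != "template DNA"
    }


if __name__ == "__main__":
    print(f"Total per reaction: {total_volume(REACTION_UL):.1f} uL")
    for name, volume in master_mix(REACTION_UL, n_reactions=20).items():
        print(f"{name}: {volume} uL")
```

Under the stated assumption the per-reaction total comes to 15 µL, which matches the reaction volume given in the text.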
Each forward oligonucleotide primer carried an M13 tail sequence (5ʹ-CACGACGTTGTAAAACGAC-3ʹ) at the 5ʹ end for visualization of the PCR product using M13 primers labeled with IRD700 and IRD800. The amplified PCR products were diluted (1:20) with loading dye (formamide and bromophenol blue), denatured at 94°C for 5 min and cooled on ice before loading onto a 6% polyacrylamide gel on a LI-COR IR2 DNA analyzer. An aliquot of about 1 µL of each PCR product was loaded onto each lane of the gel along with three lanes containing a 50-350 bp size standard (LI-COR). The fragment size corresponding to each SSR marker of each sample was scored using the e-seq software and the bands were recorded as 1 (present) or 0 (absent) on an EXCEL sheet for further analysis.

Microsatellite data analysis
The obtained genotype data for all populations and markers were tested for Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium (LD) using POPGENE Version 1.31 (Yeh et al. 1999). Observed (Ho) and expected heterozygosity (He), as well as the mean number of alleles (MNA), allelic richness (AR), private alleles (AP), genetic differentiation (FST), and inbreeding coefficient (FIS) in each population and locus were calculated using the software programs POPGENE Version 1.31 (Yeh et al. 1999), FSTAT version 2.9.3.2 (Goudet 2001) and Arlequin Version 3.0 (Excoffier et al. 2005). The Polymorphic Information Content (PIC) for each SSR locus based on the entire set of accessions was calculated using Power Marker V3.25 (Liu and Muse 2005). Pairwise standard genetic distances (DS) among the 12 domesticated and wild populations were calculated following Nei's unbiased measures of genetic distance (Nei 1978) using the POPGENE software package, and the resulting genetic distance matrix was used for cluster analysis through the unweighted pair-group method with arithmetic averages (UPGMA). The F-statistics (FIS = inter-individuals, FIT = subpopulations and FST = total population; Wright 1978) were computed to estimate genetic differentiation among the 12 C. medica populations. POPGENE Version 1.31 (Yeh et al. 1999) was used to estimate the significance of genotypic differentiation between population pairs. All probability tests were based on the Markov chain method (Guo and Thompson 1992; Raymond and Rousset 1995) using 1000 dememorization steps, 100 batches and 1000 iterations per batch. When the null hypothesis was rejected, the FIS statistic of Wright (1951) was estimated following Weir and Cockerham (1984) and used as an indicator of heterozygote excess or deficit. The FST statistic (Wright 1951) was estimated following Weir and Cockerham (1984) and pairwise tests of differentiation were performed in FSTAT. Permutation tests were performed in FSTAT, where genotypes were randomized among samples, and the significance of the P-values from the pairwise tests of differentiation was determined using standard Bonferroni corrections.

[Figure caption, partially recovered: … Table 1. (1) Tinsukia-Assam, (2) Banskandi-Assam, (3) Itanagar-A.P., (4) Aizawl-Mizoram, (5) Sairang 1-Mizoram, (6) Sairang 2-Mizoram, (7) Motinagar 1-Assam, (8) Motinagar 2-Assam, (9) Lakhipur-Assam, (10) Sonai-Assam, (11) Neairgram-Assam, (12) Namsai-A.P.]
Analysis of molecular variance (AMOVA) (Excoffier et al. 1992) was performed in Arlequin 3.0 (Excoffier et al. 2005) to test the differentiation of the accessions in various groups, with the probability of nondifferentiation (FST not > 0) assessed over 10000 randomizations. The distribution of genetic variation within and among wild and domesticated populations was estimated using Nei's standard genetic variation (Nei 1987). Pairwise FST values between all pairs of populations were calculated and differentiation between the populations was tested in Arlequin. To examine the geographic structure of genetic variation among the C. medica populations, we tested for correlations between genetic distance and geographic distance using a Mantel test based on a pairwise matrix of Nei's (1978) unbiased genetic distances, Rousset's (1997) genetic differentiation [FST/(1 − FST)] and a pairwise matrix of geographic distances (Mantel 1967). Gene flow (Nm) among populations was estimated as the number of migrants per generation between pairs of populations. Nm was estimated according to Slatkin (1993) using the formula Nm = (1 − FST)/(4FST).
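The two FST transformations named above are simple enough to state as code. The following is a minimal sketch, not the routine used by any of the cited programs; the function names are illustrative, and the input checks are an added assumption so that undefined values (FST of 0 or 1) are rejected rather than silently returned.

```python
def gene_flow_nm(fst):
    """Indirect gene flow estimate, Nm = (1 - FST) / (4 * FST)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


def rousset_distance(fst):
    """Genetic differentiation in the form FST / (1 - FST)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("FST must lie in [0, 1)")
    return fst / (1.0 - fst)


if __name__ == "__main__":
    for fst in (0.05, 0.20, 0.50):
        print(f"FST={fst:.2f}  Nm={gene_flow_nm(fst):.3f}  "
              f"FST/(1-FST)={rousset_distance(fst):.3f}")
```

Because Nm falls as FST rises, an FST of 0.20 corresponds to exactly one migrant per generation under this formula, and larger FST values give Nm below 1.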
Genetic bottlenecks among populations were identified using the program BOTTLENECK version 1.2.02 under three different mutation models: the infinite allele and stepwise mutation models (Cornuet and Luikart 1996) and the two-phase model of mutation (Luikart et al. 1998). Both the Wilcoxon signed-rank test and a sign test were used to assess whether the observed He was significantly greater than expected under an equilibrium model.
The software program STRUCTURE version 2.1 (Pritchard et al. 2000) was used for the analysis of population structure and identification of ancestral and hybrid forms. This method follows a Bayesian clustering approach to assign individuals into clusters using multilocus genotype data and allele frequencies. This approach works on the principle that the loci selected for investigation are unlinked, independent and at linkage equilibrium among the populations under the Hardy-Weinberg principle (Pritchard et al. 2000). Different accessions were assigned to probable clusters under the assumption that all accessions were from a common ancestor and that admixing of individuals among the populations had occurred. The posterior probabilities were estimated using a Markov Chain Monte Carlo (MCMC) method. The admixture of individuals independent of the geographic locations was used for clustering all individuals from the study populations, and 15 independent runs of STRUCTURE were carried out for the total data set for K (number of clusters) values of 1-15. Simulations were carried out with the following settings: admixture model, correlated allele frequencies, and MCMC repetitions of 10,000 iterations. The final results were based on a run length of 100,000 and five iterations for each K using the admixture model with the independent frequency and correlation model. We examined ΔK values, which are derived from the second-order rate of change of the likelihood function used to determine K (Evanno et al. 2005), to provide a better estimate of the number of clusters in such conditions. For the number of clusters best represented by the data, only individuals with probabilities above the threshold q = 0.75 for a specific cluster were retained in that population.
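The ΔK statistic mentioned above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in the study: it takes the absolute second difference of the mean Ln P(D) across replicate runs and divides it by the sample standard deviation of Ln P(D) at that K, which is one common way of computing the Evanno et al. (2005) statistic. The function name and the input layout (a dictionary mapping K to a list of replicate log-likelihoods) are hypothetical, and the example values are invented.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to the Ln P(D) values of its replicate runs
    (at least two replicates per K are needed for a standard deviation).
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue  # second difference is undefined at the ends
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # deltaK is undefined when replicates are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / spread
    return result


if __name__ == "__main__":
    # Invented replicate log-likelihoods, for illustration only.
    runs = {
        1: [-3100.0, -3102.0, -3101.0],
        2: [-2900.0, -2904.0, -2902.0],
        3: [-2890.0, -2895.0, -2893.0],
        4: [-2889.0, -2896.0, -2892.0],
    }
    scores = delta_k(runs)
    best_k = max(scores, key=scores.get)
    print(scores, "best K:", best_k)
```

The K with the largest ΔK is taken as the most likely number of clusters; note that ΔK cannot be evaluated for the smallest or largest K tested.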

Results
Characteristics of the seven SSR markers used to assess genetic diversity of the 219 Citrus medica individuals are given in Table 2. Five of the seven primer pairs described by Barkley et al. (2006, 2009) and Ollitrault et al. (2010) were used for genetic analysis. Two of the seven markers, cAGG9 and CCTO1, were excluded from the analysis due to their low polymorphism and poor amplification. All SSR loci used in the present study were polymorphic and none of the loci deviated from Hardy-Weinberg equilibrium. No significant LD was found in any pair of loci, so all five SSR loci provided independent information. A total of 67 alleles were detected within the citron individuals, with allele frequencies across all loci ranging from 2.50% to 82.50%. The number of alleles generated by each SSR marker varied from eight to 20, with an average of 13.4 alleles per locus (Table 3). The highest number of alleles was scored at locus CiBE3936 (20 alleles) and the lowest at locus CiBE4796 (8 alleles) (Table 3). The effective number of alleles (Ne) for each locus ranged from 3.66 to 6.25 with an average value of 4.85. The amplified fragment size of the alleles varied from 131 (CiBE3936) to 248 (CiBE3298) bp. The PIC values ranged from 0.694 (CiBE0753) to 0.829 (CiBE3936), with a mean PIC value of 0.762 for all loci (Table 3).
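For readers unfamiliar with the per-locus statistics reported above, the sketch below shows how the effective number of alleles and PIC can be obtained from allele frequencies at a single locus. It assumes the widely used definitions Ne = 1/Σp² and the Botstein et al. (1980) form of PIC; the source does not state which formulas the software applied, so treat these as assumptions. Function names and the example frequencies are illustrative.

```python
def effective_alleles(freqs):
    """Effective number of alleles, Ne = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphic information content for one locus.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    squares = [p * p for p in freqs]
    cross = 0.0
    for i in range(len(squares)):
        for j in range(i + 1, len(squares)):
            cross += 2.0 * squares[i] * squares[j]
    return 1.0 - sum(squares) - cross


if __name__ == "__main__":
    # Invented allele frequencies at one locus (they must sum to 1).
    locus = [0.40, 0.30, 0.20, 0.10]
    print(f"Ne  = {effective_alleles(locus):.3f}")
    print(f"PIC = {pic(locus):.3f}")
```

Both statistics grow with the number of alleles and with the evenness of their frequencies, which is why a locus with many rare alleles can have a far smaller Ne than its raw allele count.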
The total number of alleles across all loci ranged between 13 in the Namsai wild population and 36 in the Banskandi domesticated population. The mean allelic richness (AR), independent of sample size, ranged from 3.83 in the Tinsukia wild population to 2.48 in the Sairang 2 domesticated population (Table 4). Overall, genetic diversity varied significantly within wild and domesticated populations located in different geographic locations. The MNA across all populations was 2.77 ± 0.17, varying between 2.60 ± 0.55 in the Namsai wild population, which had the lowest number of individuals (4), and 7.20 ± 2.95 in the domesticated Banskandi population. In general, a higher MNA was observed in the domesticated populations. Most of the alleles present in domesticated populations were also present in wild populations. Private alleles, unique to a specific population, were observed in the Itanagar domesticated population (AP = 4), as well as in the Tinsukia wild, Banskandi domesticated, Aizawl domesticated and Sairang 1 wild populations, each with two private alleles, and in the […] (Table 4). The frequencies of these private alleles ranged from 2.50% to 12.50%.
The mean observed (Ho) and expected (He) heterozygosity values varied significantly (P < 0.001) within the populations (Table 4). The highest Ho (0.540 ± 0.251) was observed in the domesticated Banskandi population, while the lowest (0.220 ± 0.160) occurred in the Tinsukia wild population. The highest He within the populations was found in the Tinsukia wild population (He = 0.733 ± 0.093), while the lowest occurred in the Sairang 2 domesticated population (He = 0.438 ± 0.217). The He values ranged from 0.500 to 0.733 for wild populations and from 0.438 to 0.706 for domesticated populations. This wide range of heterozygosity values indicates high diversity within the populations. In all cases, average observed heterozygosities were lower than the expected heterozygosities under HWE (Table 4).
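The relationship between Ho, He and the heterozygote deficit can be illustrated with a small single-locus calculation. This is a simplified sketch: it uses Nei's (1978) small-sample correction for He and the ratio form FIS = 1 − Ho/He, whereas the study estimated FIS following Weir and Cockerham (1984), so the numbers it produces are not expected to reproduce Table 4. Function names and the genotype data are hypothetical.

```python
from collections import Counter


def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes, unbiased=True):
    """Expected heterozygosity under HWE, 1 - sum(p_i^2).

    With unbiased=True, Nei's (1978) correction 2n / (2n - 1) is applied.
    """
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    homozygosity = sum((c / (2 * n)) ** 2 for c in counts.values())
    he = 1.0 - homozygosity
    if unbiased:
        he *= 2 * n / (2 * n - 1)
    return he


def inbreeding_coefficient(ho, he):
    """Simple ratio estimate, FIS = 1 - Ho / He (positive = deficit)."""
    return 1.0 - ho / he


if __name__ == "__main__":
    # Invented diploid genotypes (allele sizes in bp) at one SSR locus.
    locus = [(131, 131), (131, 135), (135, 135), (131, 135), (139, 139)]
    ho = observed_heterozygosity(locus)
    he = expected_heterozygosity(locus)
    print(f"Ho={ho:.3f} He={he:.3f} FIS={inbreeding_coefficient(ho, he):.3f}")
```

A positive FIS from this calculation corresponds to the pattern described above, in which observed heterozygosity falls below the Hardy-Weinberg expectation.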
Population differentiation (FST) values were calculated for each locus and population separately and slight variation was observed among loci (Table 3) (Table 5). Among the 12 pairs of populations, only three pairs were not significantly differentiated, viz., Banskandi (domesticated) and Tinsukia (wild), Aizawl and Itanagar (domesticated) and Sairang 1 (wild) and Sairang 2 (domesticated). All other population pairs were significantly differentiated and the significance level in most of the population pairs was P < 0.001 (Table 5). The greater and significant FST values between these population pairs may indicate greater genetic divergence in citron populations among these pairs. Inbreeding coefficient (FIS) values were significantly positive (FIS = 0.204-0.705; 0.001 < P < 0.05) for all the populations except for one wild population in which it was positive but not significant (FIS = 0.115; P > 0.05) (Table 4). In all loci, significantly positive FIS values were obtained and these ranged between 0.204 and 0.548. The average value of FIS for all loci was 0.334 and FIT was 0.511 for all accessions (Table 3). The gene flow (Nm) was calculated according to genetic differentiation and it ranged from 0.600 in the Sairang 2 domesticated population to 1.187 in the Tinsukia wild population (Table 4). The pairwise Nei's genetic distance (DS) values are summarized in Table 5. In general, domesticated populations showed closer genetic relatedness than wild populations.

Table 3. Diversity statistics of the five polymorphic simple sequence repeat loci used among 219 Citrus medica individuals. Statistics include number of alleles (Na), polymorphic information content (PIC), effective number of alleles (Ne), observed (Ho) and expected (He) heterozygosity, Nei's standard genetic distance (DS), local inbreeding coefficient (FIS), overall inbreeding coefficient (FIT), genetic differentiation (FST) and gene flow (Nm).
The pairwise DS values between populations ranged from 0.062 between the Sairang 1 wild and Sairang 2 domesticated populations in Mizoram to 2.091 between two domesticated populations, Sairang 2 (Mizoram) and Neairgram (Assam). Similar results were observed when the genetic distances of the populations in the study were determined using Nei's DA index (Nei's unbiased genetic distance). The smallest DA was observed between the Sairang 1 wild and Sairang 2 domesticated populations (0.049) and the largest DA was observed between the Neairgram and Sairang 2 domesticated populations (2.074) (data not shown). The AMOVA showed significant total genetic variation among the populations and individuals (P < 0.001) for all variance components. The genetic differences were 27.49% among individuals within populations, 24.98% among populations, and 47.53% at the individual level (Table 6).
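To make the scale of the pairwise distances above more concrete, the sketch below computes a Nei-type genetic distance from population allele frequencies. It implements the uncorrected standard distance, D = −ln(Jxy / √(Jx·Jy)), and deliberately omits the small-sample correction of Nei (1978) that the study applied, so it is an approximation for illustration only. The function name, the data layout (one allele-to-frequency dictionary per locus) and the example frequencies are hypothetical.

```python
from math import inf, log, sqrt


def nei_standard_distance(pop_x, pop_y):
    """Uncorrected Nei standard genetic distance between two populations.

    pop_x and pop_y are lists with one dict per locus mapping
    allele -> frequency; both lists must cover the same loci in order.
    """
    if not pop_x or len(pop_x) != len(pop_y):
        raise ValueError("populations must share the same non-empty loci")
    jx = jy = jxy = 0.0
    for freq_x, freq_y in zip(pop_x, pop_y):
        jx += sum(p * p for p in freq_x.values())
        jy += sum(p * p for p in freq_y.values())
        jxy += sum(p * freq_y.get(allele, 0.0) for allele, p in freq_x.items())
    if jxy == 0.0:
        return inf  # no shared alleles at any locus
    identity = jxy / sqrt(jx * jy)
    return -log(identity)


if __name__ == "__main__":
    # Invented allele frequencies at two loci for two populations.
    wild = [{"131": 0.6, "135": 0.4}, {"200": 0.5, "204": 0.5}]
    garden = [{"131": 0.2, "135": 0.8}, {"200": 0.9, "208": 0.1}]
    print(f"D = {nei_standard_distance(wild, garden):.3f}")
```

A distance of 0 means identical allele frequencies, while values above 1, as reported for some population pairs here, indicate that the populations share very little of their allele frequency profile.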
The correlation between geographic distance (km) and Nei's genetic distance among the citron populations of NE India was not significant. The geographic distance among the populations ranged from 0.01 to 535 km. The Mantel test also showed no significant correlation between geographic distance and genetic differentiation [FST/(1 − FST)] for C. medica populations in the region. Thus, genetic distances between populations are independent of the corresponding geographical distances.
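The logic of the Mantel test used here can be sketched as a permutation procedure. This is a minimal, assumption-laden illustration rather than the implementation used in the study: it computes the Pearson correlation between the upper triangles of two symmetric distance matrices, then repeatedly permutes the population labels of one matrix to build a null distribution for a one-sided test. The function names, the default of 999 permutations and the fixed random seed are hypothetical choices, and the example matrices are invented.

```python
import random
from math import sqrt


def _upper_triangle(matrix, order):
    """Flatten the upper triangle after relabelling rows/columns by order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def _pearson(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, n_perm=999, seed=0):
    """Return (r, one-sided P) for two symmetric distance matrices."""
    n = len(genetic)
    identity = list(range(n))
    geo_vector = _upper_triangle(geographic, identity)
    r_obs = _pearson(_upper_triangle(genetic, identity), geo_vector)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of one matrix
        if _pearson(_upper_triangle(genetic, order), geo_vector) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Invented 4 x 4 distance matrices, for illustration only.
    gen = [[0, 0.3, 0.9, 0.5], [0.3, 0, 0.7, 0.4], [0.9, 0.7, 0, 0.2], [0.5, 0.4, 0.2, 0]]
    geo = [[0, 10, 250, 400], [10, 0, 240, 390], [250, 240, 0, 150], [400, 390, 150, 0]]
    r, p = mantel(gen, geo)
    print(f"Mantel r = {r:.3f}, P = {p:.3f}")
```

Permuting labels rather than individual matrix cells preserves the dependence among distances that share a population, which is the reason an ordinary correlation test is not appropriate for pairwise distance data.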

Discussion
The present study is the first to quantify the amount and distribution of genetic variability in C. medica within its native geographical range. The results, based on genotypes at five selected SSR loci, demonstrate that domesticated citron populations possess slightly higher genetic diversity than wild populations, although the difference was not significant. High levels of polymorphism in the five selected SSR markers allowed us to unambiguously distinguish 219 accessions belonging to 12 geographically isolated populations.
Overall diversity values obtained in the present study differ from those found by Ollitrault et al. (2010), who reported low genetic diversity (He = 0.15, 1.44 alleles per locus). A prior study by Barkley et al. (2006) also reported lower diversity indices among citron individuals. In recent microsatellite-based studies, Ramadugu et al. (2015), working with 47 citrons from Yunnan Province of China and the Mediterranean region, and Yang et al. (2015), working with 56 citrons from southwest China, reported substantial heterozygosity and genetic diversity among citron accessions. These differences in genetic diversity between the present and previous studies could be attributable to sample size, as limited numbers of individuals were sampled in earlier studies. More importantly, sampling from different regions throughout the native range, rather than from small numbers of accessions in ex situ germplasm banks, may have resulted in a better assessment of the genetic diversity present in C. medica. These results show that there is abundant genetic variation at the molecular level among the 219 citron individuals from four wild and eight domesticated populations throughout northeast India, where the species is thought to have originated. A large number of studies suggest that the primary centre of origin of Citrus is south and south-east Asia, particularly the region extending from northeast India, eastward through the Malayan Archipelago to China and Japan, and southward to Australia (Tanaka 1958; Swingle and Reece 1967; Scora 1975; Mabberley 2004). Extensive field exploration studies and the presence of a large number of natural populations in the primary forests also indicate that the region is the center of origin of several Citrus species (Bhattacharya and Dutta 1956).
The domesticated populations of C. medica have slightly higher genetic diversity than the wild populations. In general, all the populations have lower observed heterozygosity values than expected heterozygosity values, suggesting inbreeding. The slightly higher genetic diversity among the domesticated populations suggests that the movement of cultivated individuals across large geographic distances has resulted in allele combinations that would not occur naturally (Miller and Gross 2011). The exchange of such highly valued medicinal plants in the form of seed, seedlings and mature plant cuttings, sometimes over long distances, is a common practice among tribal and nontribal communities in the region. Farmers have most likely selected individuals with desirable traits, which may have contributed to the increased genetic diversity in domesticated populations through increased mixing and gene flow among geographically isolated populations.
An average FST = 0.275 over all loci revealed significant genetic differentiation between populations. These moderate-to-high FST values are consistent with the relatively high genetic differentiation observed in some other tropical trees, such as Caryocar brasiliense (Collevatti et al. 2001), Swietenia macrophylla (Novick et al. 2003), and Dalbergia monticola (Andrianoelina et al. 2009). These results also reflect genetically distinct populations in the region differing simultaneously in allele frequencies and allele sizes, and suggest that new mutations may be contributing to the allelic diversity found in wild and domesticated citron populations. In general, wild and domesticated citron populations showed strong genetic differentiation. Domesticated populations showed a higher proportion of genetic differentiation (FST = 0.193-0.294) than wild populations (FST = 0.174-0.252). Similarly, Hamrick and Godt (1996) reported that the mean value of genetic differentiation among populations of crop species (domesticated) is higher than that of noncrop (wild) species. The observed high FST values in cultivated populations can be explained by distinct sources of germplasm used in establishing domesticated populations with limited exchange of genetic material, leading to high genetic differences among domesticated populations. The results are supported by the long cultivation history of citron in the region. Some of the domesticated populations are not far from wild habitats; therefore, migration from wild to cultivated populations by natural or artificial means may be an ongoing process. Abundant occurrences of wild and primitive relatives of citron, e.g., C. nana (Wester) Yu.Tanaka, C. odorata (Wester) Tanaka and species under the subgenus Papeda in the eastern Himalayan areas (Tanaka 1969), as well as our recent Citrus germplasm collection in northeast India, indicate their persistence and diversification in the region of origin.
Favorable environmental conditions in this area, currently part of the 'Indo-Burma biodiversity hot spot', favored its growth and further spread to other parts of the world (Tanaka 1969). In a recent palynological study, Langgut (2014) stated that citron originated in Asia, particularly India, and then gradually dispersed to other areas.
The AMOVA results revealed a high level of genetic variation among individuals (47.53% of the total variation) and a significantly (P < 0.001) lower level of variation among populations (24.98%). In most of the citron populations, seeds or cuttings of one or a few individuals were brought from the wild population, transferred to and grown in the farmers' home gardens or local agroforestry systems, and maintained generation after generation. In clonally propagated plants, separation from the wild ancestor during the domestication process reduces the chances of sexual crossing in subsequent populations (Zohary and Spiegel-Roy 1975; McKey et al. 2010). However, in many perennial plant species heterozygosity is also maintained through clonal propagation (Petit and Hampe 2006). Clonal propagation methods may have increased the homogeneity at the population level. The citron populations showed significant inbreeding coefficients (FIS) (P < 0.001-0.01), with the single exception of the Namsai wild population.
The indirect estimates of gene flow (Nm) based on population differentiation showed significant variation (P < 0.001) and ranged between 0.600 and 1.187. Population differentiation and effective population size corresponded to three different categories of Nm values: high (Nm ≥ 1.000), intermediate (0.250-0.990) and low (0.000-0.249) (Slatkin 1981, 1985). One wild population, Tinsukia, and three domesticated populations, Banskandi, Itanagar and Aizawl, showed relatively high gene flow (Nm > 1.000) and in the other populations it was intermediate (Nm = 0.600-0.918). The relatively high to intermediate levels of gene flow among populations are attributable to the movement of genetic material among farmers in the region. Genetic distances between wild and domesticated populations are smaller and admixture is more common between sympatric wild and domesticated populations than between allopatric populations, which is indicative of gene flow between sympatric populations. The presence of a few private alleles (1-4) in most of the wild and domesticated populations also shows the existence of gene flow among populations (Slatkin 1985).

[Figure caption, partially recovered: … Table 1; the Y-axis shows the proportion of alleles derived from each population. Accession assignments are as follows (population numbers and proportion): Cluster 1: #5 (34%), #6 (36%) and #7 (30%); Cluster 2: #1 (24%), #2 & #3 (26% each) and #4 (24%); Cluster 3: #2 & #7 (6% each), #8 & #9 (36% each) and #10 (16%); Cluster 4: #1 (24%), #2 (18%), #3 (26%), #4 (29%) and #5 (3%); and Cluster 5: #1 (4%), #10 (19%), and #11 & #12 (38.5% each). (B) Assignment of the 219 individual Citrus medica accessions (population number in brackets) into five distinct clusters. The Y-axis shows the proportion of alleles derived from each individual. Individuals of the same color belong to the same cluster. An individual with more than one color is shared among multiple clusters, according to the admixture proportions.]
A review by Ellstrand et al. (1999) of thirteen globally important crops including wheat, rice and maize concluded that gene flow among wild and domesticated relatives is common and unintentional, and occurs naturally whenever these relatives come into contact with each other. Viard et al. (2004) and Scurrah et al. (2008) reported similar results of gene flow among the wild and domesticated annual crop plants (beet and potato species) through seeds and clonal propagation. Similar results have also been reported for many perennial food plants (Miller and Gross 2011).
The BOTTLENECK analysis indicated that no bottleneck event occurred in citron populations of the region. It is possible that slight or past bottleneck effects may have gone undetected. A number of natural citron populations in the region have diminished, due to natural and anthropogenic disturbances and overexploitation. Until now, such disturbances have had no identifiable consequences in terms of overall genetic diversity and effective population size. Citron populations in the region are maintaining their allelic richness without any reduction in genetic diversity through either natural processes or farming methods. Future studies on larger populations and a wider selection of markers and methods are needed to detect bottleneck events.
The STRUCTURE analysis showed shared ancestry between the wild and domesticated citron populations, suggesting that gene flow has occurred between these populations. Overall, the STRUCTURE results suggest five subpopulations within the 12 wild and domesticated populations. The grouping of individuals into five distinct clusters is supported by the highest ΔK value, confirming the presence of five genetically distinct groups (Fig. 4 and Table 7). This is further supported by AMOVA, which showed that most of the total variance was distributed within individuals (47.53%) and among individuals within populations (27.49%). A few individuals of some populations were genetically related to individuals of geographically isolated populations of the region. Similar groupings in the cluster analysis also support gene flow among distant populations. The genetic diversity observed among the wild and domesticated populations did not affect the clustering of the individuals at the population level. Grouping of wild and domesticated individuals into the same cluster indicates their admixture due to the long history of cultivation in the region. The domesticated Banskandi population and the wild Tinsukia population showed similarly large amounts of genetic diversity; however, most of the individuals from these two distant populations clustered together (Cluster-1 and 3, Fig. 5). Such clustering suggests admixture of individuals among distant populations, which could be attributable to the long history of genetic material exchange. Individuals of C. medica may have spread from wild sources (i.e., the site of origin) to farmer-managed lands through the movement of people or the sharing of seeds. Further, the UPGMA dendrogram (Fig. 3) clustered the 12 populations into five groups. The cluster analysis could not clearly differentiate the wild and domesticated populations. Thus, there has been mixing of wild and domesticated populations.
The nonsignificant (P > 0.05) relationship between geographic and genetic distances between populations indicates that their genetic differences are independent of corresponding geographical distances.

Conclusion
There is a significant level of genetic diversity in the citron germplasm that could be used for sustainable utilization and conservation of this valuable genetic resource. The Himalayan northeast region of India is believed to be a center of diversity for the genus Citrus and this study reveals that a high level of genetic diversity exists in Citrus medica. This also supports the views of Vavilov (1951), who stated that plant species generally show high diversity in their place of origin and in regions with a large number of wild relatives of crop plants. A few individuals showed mixed ancestry between the wild and domesticated populations. The observed intraspecific genetic variation in the citron germplasm is valuable for selecting the most diverse populations for further improvement of fruit quality through breeding programmes and commercialization. The present study shows that the genetic diversity of Citrus medica has been maintained by the indigenous communities in their home gardens through traditional cultivation practices. This highlights the important role played by indigenous communities in conservation of genetic resources.

Table 7. Proportion of ancestry of each population in each of the gene pools as defined using the model-based clustering method from Pritchard et al. (2000). [Column headings: populations P1-P12; rows: clusters; entries: proportion of individuals in each gene pool (%). Table values not recovered.]

Supporting Information
Additional Supporting Information is available online in the supporting information tab for this article: Table S1. Citrus medica population sampling details (geographical location information is in Table 1). Table S2. Genotypes of Citrus medica populations in NE India. Table S3. Allele frequency comparison over populations. Figure S1. Relationship between geographic distance and Nei's genetic distance among the 12 populations of wild and domestic C. medica. Figure S2. Relationship between geographic distance and genetic differentiation [FST/(1 − FST)] among the 12 populations of wild and domestic C. medica. FST was calculated according to Weir and Cockerham (1984).