Genetic diversity and structure of core collection of winter mushroom (Flammulina velutipes) developed by genomic SSR markers

A core collection is a subset of an entire collection that represents as much of the genetic diversity of the entire collection as possible. The establishment of a core collection for crops is practical for efficient management and use of germplasm. However, the establishment of core collections of mushrooms is still in its infancy, and no core collection of the economically important species Flammulina velutipes has been reported. We established the first core collection of F. velutipes, containing 32 strains selected from 81 genetically different F. velutipes strains. The allele retention proportion of the core collection relative to the entire collection was 100%. Moreover, the genetic diversity parameters (the effective number of alleles, Nei’s expected heterozygosity, the observed heterozygosity, and Shannon’s information index) of the core collection showed no significant differences from those of the entire collection (p > 0.01). Thus, the core collection is representative of the genetic diversity of the entire collection. Genetic structure analyses of the core collection revealed that the 32 strains could be clustered into 6 groups, among which groups 1 to 3 were cultivars and groups 4 to 6 were wild strains. The wild strains from different locations harbor their own specific alleles and clustered strictly according to their geographic origins. Genetic diversity analyses of the core collection revealed that the wild strains possessed greater genetic diversity than the cultivars. We established the first core collection of F. velutipes in China, which is an important platform for efficient breeding of this mushroom in the future. In addition, the wild strains in the core collection possess favorable agronomic characters and produce unique bioactive compounds, adding value to the platform. More attention should be paid to wild strains in further strain breeding.


Background
A core collection is a subset of accessions that presents the maximum possible genetic diversity contained in an entire collection with minimum redundancy [1,2]. The establishment of a core collection for crops is practical for efficient management of germplasm. Core collections of most major food crops, such as Oryza sativa, Zea mays, Glycine max and Triticum aestivum, have already been established [3][4][5][6][7].
A core collection is traditionally constructed based on morphological and agronomic characters using different strategies, such as the constant allocation (C) strategy, the logarithm (L) strategy, the proportional allocation (P) strategy, and the random sampling (R) strategy [2][8][9][10]. However, most morphological and agronomic characters are quantitative traits that can be easily affected by environmental variation [11][12][13]. Therefore, phenotypic data cannot directly reflect the genetic diversity of germplasm resources [11].
Conversely, molecular markers can directly reflect a germplasm's genetic diversity at the DNA sequence level. Among molecular markers, simple sequence repeats (SSRs) are tandemly repeated DNA sequences whose repeat unit is generally 1 to 6 base pairs long. SSRs are distributed extensively throughout a genome. They are typically co-dominant, highly polymorphic, reproducible and easy to score [14][15][16]. Based on molecular marker data, Kim et al. [17] developed software named PowerCore, which applies the advanced maximization (M) strategy with heuristic searching to establish a core collection (allele mining collection) in which all alleles are captured in a minimum number of accessions [17]. It has been used successfully with many economically important crops, such as Oryza sativa, Glycine max, Olea europaea, Vigna radiata, and Sesamum indicum, and has been reported to be well suited to establishing a core collection based on molecular data [18][19][20][21][22].
However, the development of core collections of edible mushrooms is still at an early stage, and core collections have been established only for Pleurotus ostreatus and Lentinula edodes [23][24][25]. Flammulina velutipes is cultivated on a large scale in East Asia [26][27][28]. China is currently the largest producer of F. velutipes, with an annual production of 2.4 million tons [29]. In our previous study, we obtained 124 strains (110 cultivars from the spawn market of China and 14 wild strains from Yunnan, Sichuan, and Hunan provinces) and, after excluding cultivars labeled with confusing names, identified 81 strains that are genetically different [30]. To manage and use these genetically different strains efficiently, a smaller representative core collection without redundant strains is urgently needed.
In this study, we aimed to (i) establish the core collection of F. velutipes; (ii) evaluate the genetic diversity of the core collection and the entire collection; and (iii) analyze the core collection's genetic structure.

Strain materials and DNA extraction
We used 81 strains of F. velutipes in this study, including 67 cultivars and 14 wild strains (Additional file 1: Table S1). Genomic DNA was extracted from each strain with the CTAB-based method [31]. In each case, fresh mycelium was harvested from potato dextrose agar medium 10 days after inoculation and incubation at 23°C. The DNA concentration and purity were measured with a NanoDrop2000 spectrophotometer. The DNA solution of each sample was diluted to 100 ng/μl.

SSR genotyping
The 25 polymorphic SSR markers used in this study were developed in our previous study [30]. The forward primer of each SSR was labeled with a fluorescent dye (FAM) at the 5′ end (TSINGKE, Kunming). Polymerase chain reactions (PCR) were carried out in a total volume of 25 μl containing 1 μl template DNA, 1 μl bovine serum albumin, 2.5 μl reaction buffer, 0.5 μl deoxynucleoside triphosphates, 1 μl of each primer, 0.3 μl Taq DNA polymerase, and 17.7 μl ddH2O. PCR was conducted on an ABI 2720 Thermal Cycler (Applied Biosystems, Foster City, CA) or an Eppendorf Master Cycler (Netheler-Hinz, Hamburg, Germany) under the following parameters: 94°C for 4 min, then 35 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 30 s, followed by a final extension step of 72°C for 8 min. The PCR products were denatured at 98°C for 5 min, shock-chilled on ice, and run on an ABI 3730 Genetic Analyzer using GeneScan 500 Rox as a size standard (Applied Biosystems). Alleles at each locus were scored in base pairs with the GeneMapper v3.2 software package (Applied Biosystems), and the size of the PCR products for each SSR was recorded in an Excel spreadsheet.

Development of a core collection for F. velutipes
The core collection was established based on genotyping data using PowerCore software [17]. The heuristic algorithm that finds the optimum path from the initial to the final stages for sample selection was used (http:// genebank.rda).
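The idea of capturing every allele in as few strains as possible can be illustrated with a small sketch. The code below is not PowerCore's heuristic search; it is a simple greedy set-cover approximation, and the function name `greedy_core`, the strain labels, and the allele sizes are all hypothetical.

```python
def greedy_core(genotypes):
    """Pick strains until every allele in the collection is covered.

    genotypes: dict mapping strain name -> set of (locus, allele) pairs.
    Greedy approximation only; NOT the PowerCore algorithm.
    """
    all_alleles = set().union(*genotypes.values())
    covered = set()
    core = []
    while covered != all_alleles:
        # Choose the strain contributing the most uncovered alleles;
        # iterating in sorted order makes ties deterministic.
        best = max(sorted(genotypes), key=lambda s: len(genotypes[s] - covered))
        core.append(best)
        covered |= genotypes[best]
    return core


# Hypothetical toy data: four strains scored at two SSR loci.
toy = {
    "S1": {("ssr1", 150), ("ssr2", 200)},
    "S2": {("ssr1", 150), ("ssr2", 204)},
    "S3": {("ssr1", 154), ("ssr2", 204)},
    "S4": {("ssr1", 150), ("ssr2", 200)},
}
core = greedy_core(toy)
retained = set().union(*(toy[s] for s in core))
retention = 100 * len(retained) / len(set().union(*toy.values()))
print(core, retention)
```

In this toy case two of the four strains already retain all four alleles, which mirrors the logic of removing redundant accessions while keeping allele retention at 100%.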

Data analysis
The genetic diversity parameters (the effective number of alleles Ae, Nei's expected heterozygosity H, the observed heterozygosity Ho, and Shannon's information index I) and the allele frequencies of the core collection and of the entire collection were calculated with PopGene v1.31 [32]. A dendrogram of the genetic relationships among the core collection strains was constructed from the simple matching (SM) coefficient by the unweighted pair group method with arithmetic mean (UPGMA) using NTSYSpc v2.10e [33]. The genetic structure of the core collection was analyzed with STRUCTURE v2.3.4 based on an admixture model. Models were tested for K values ranging from 2 to 10, with 10 independent runs per K value. For each run, the initial burn-in period was set to 100,000, followed by 100,000 MCMC iterations. The most probable value of K was determined with the deltaK method as implemented in Structure Harvester [34,35].
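For readers unfamiliar with these parameters, the sketch below computes them for a single locus using the common textbook definitions: Ae = 1/Σp², H = 1 − Σp², Ho = the fraction of heterozygous individuals, and I = −Σp ln p, where p is an allele frequency. It assumes two alleles are scored per strain and is not claimed to reproduce PopGene's exact output; the function name and genotype data are hypothetical.

```python
import math
from collections import Counter


def locus_diversity(genotypes):
    """Diversity statistics for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per strain.
    Uses textbook formulas; an illustration, not PopGene's implementation.
    """
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]
    homozygosity = sum(p * p for p in freqs)  # sum of squared frequencies
    return {
        "Ae": 1.0 / homozygosity,
        "H": 1.0 - homozygosity,
        "Ho": sum(a != b for a, b in genotypes) / len(genotypes),
        "I": -sum(p * math.log(p) for p in freqs),
    }


# Hypothetical genotypes (allele sizes in bp) for four strains.
stats = locus_diversity([(150, 150), (150, 154), (154, 154), (150, 154)])
print(stats)
```

With two equally frequent alleles, Ae is 2, H is 0.5, and I equals ln 2, which makes the toy case easy to check by hand.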

Core collection construction
A total of 153 alleles were amplified by 25 SSRs in the 81 strains [30]. In this study, based on the PowerCore calculation, the 153 alleles could be represented by a minimum of 32 strains, including 19 cultivars and 13 wild strains (Table 1). This finding suggests that the 32 strains could serve as a core collection of the 81 strains. The sampling proportion of the core collection is about 39.5% overall: about 28.4% for cultivars (19 of 67) and about 92.9% for wild strains (13 of 14).

Genetic diversity of the core collection
Statistics describing the genetic diversity of the core collection and the entire collection for the 25 SSR markers are summarized in Table 2. The t-tests of the mean genetic diversity parameters (Ae, H, Ho, I) of the core collection and the entire collection were non-significant (p > 0.01) (Table 2), which indicates that the genetic diversity of the core collection does not differ significantly from that of the entire collection. In addition, based on a UPGMA dendrogram of the core collection and the entire collection (Fig. 1), strains from the different groups of the entire collection were evenly selected for the core collection. Thus, the core collection could represent the genetic diversity of the entire collection. However, the distributions of allele frequency differ between the entire collection and the core collection (Additional file 2: Table S2). We further investigated the genetic diversity parameters (Ae, H, Ho, I) of the cultivars and wild strains in the core collection, which showed that the wild strains possess greater genetic diversity than the cultivars (Table 3).

Genetic structure of the core collection
The admixture model-based clustering method in the STRUCTURE program was used to infer the genetic structure of the core collection. The optimum value of K was determined using delta K (ΔK). ΔK showed a strong peak at K = 6, which indicated that there were six groups in the core collection (Fig. 3). The cultivars were assigned to groups 1 to 3, and the wild strains were assigned to groups 4 to 6 (Fig. 4).
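The ΔK statistic can be sketched as follows. The version below uses the mean log-likelihood over replicate runs at each K and divides the absolute second-order difference by the standard deviation at K; whether this matches every detail of Structure Harvester is an assumption. The function name and the log-likelihood values are hypothetical, not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style delta K from replicate STRUCTURE-like runs.

    lnp_by_k: dict mapping K -> list of Ln P(D) values (>= 2 runs per K).
    Returns dict K -> delta K for every K that has both neighbours.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            sd = stdev(lnp_by_k[k])
            if sd > 0:  # skip K values whose replicates are identical
                second_diff = means[k + 1] - 2 * means[k] + means[k - 1]
                result[k] = abs(second_diff) / sd
    return result


# Hypothetical log-likelihoods for three replicate runs per K.
runs = {
    2: [-1000.0, -1002.0, -998.0],
    3: [-900.0, -901.0, -899.0],
    4: [-890.0, -892.0, -888.0],
    5: [-885.0, -887.0, -883.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

The K with the largest ΔK marks the point where the likelihood curve bends most sharply relative to its run-to-run variation, which is why a single strong peak is read as the most probable number of groups.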
A similar result was shown in the dendrogram constructed with the UPGMA method (Fig. 1). Among the cultivars, white strains were assigned to groups 2 and 3, and yellow strains were distributed throughout groups 1 to 3. The wild strains clustered in groups 4 to 6, and each group corresponded strictly to a geographic origin. The strains in group 4 were collected from Hunan Province, with the exception of strain F77, which was purchased from a spawn company in Jilin Province in northeastern China. This strain shares similar alleles with the strains collected from Hunan Province, suggesting that it may originally have been isolated from Hunan Province. The strains in groups 5 and 6 were collected from Sichuan and Yunnan Provinces, respectively.
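The UPGMA dendrogram is built from pairwise simple matching (SM) coefficients. As a minimal illustration, assuming each strain is encoded as a presence/absence vector over all alleles, the SM coefficient is the share of positions at which two strains agree (shared presences plus shared absences). The function name and vectors below are hypothetical.

```python
def simple_matching(x, y):
    """Simple matching coefficient between two 0/1 allele profiles.

    Counts both shared presences (1, 1) and shared absences (0, 0)
    as matches and divides by the number of scored alleles.
    """
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    return sum(a == b for a, b in zip(x, y)) / len(x)


# Hypothetical presence/absence profiles over five alleles.
strain_a = [1, 0, 1, 1, 0]
strain_b = [1, 1, 1, 0, 0]
sm = simple_matching(strain_a, strain_b)
print(sm)
```

Strains with identical allele profiles score 1, so a wild strain and a purchased strain that share most alleles would sit close together in the resulting tree.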

Representation of the core collection
The successful formation of a core collection depends on maximum allelic representation efficiency and elimination of redundancy from the entire collection [2]. In this study, we successfully developed a core collection of F. velutipes with 100% allelic representation at a sampling proportion of 39.5% (28.4% for cultivars and 92.9% for wild strains) based on 25 SSR markers. The genetic diversity parameters of the core collection are representative of those of the entire collection, and the strains selected for the core collection represent the different allele components of each group in the entire collection (Fig. 1). Our results support the view that the advanced M strategy is powerful in capturing 100% of allelic diversity in a core collection [17,22]. The differences in allele frequency between the entire collection and the core collection may be due to the exclusion of redundant alleles, including some homozygous loci, during core collection construction.
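The sampling proportions quoted above follow directly from the strain counts reported in this study (32 of 81 strains overall, 19 of 67 cultivars, 13 of 14 wild strains). A short check, with a hypothetical helper name:

```python
def sampling_proportion(selected, total):
    """Percentage of a collection retained in the core, to one decimal."""
    return round(100 * selected / total, 1)


# Counts taken from the text: (strains in core, strains in entire collection).
counts = {
    "overall": (32, 81),
    "cultivars": (19, 67),
    "wild": (13, 14),
}
proportions = {name: sampling_proportion(s, t) for name, (s, t) in counts.items()}
print(proportions)
```

The contrast between the cultivar and wild proportions shows how much more redundancy the cultivars carried: fewer than a third of them were needed, whereas nearly every wild strain contributed alleles not found elsewhere.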

Further strain improvement of F. velutipes based on the core collection
Most crops inevitably undergo a drastic loss of genetic diversity during cultivation, and F. velutipes is no exception [30][36][37][38]. Lower genetic diversity among cultivars may lead to inbreeding depression [39]. Thus, the core collection established in this study could effectively protect the heterogeneous germplasm resources of the cultivars and help avoid inbreeding depression in further strain improvement. Furthermore, the specific alleles harbored by different groups of cultivars may indicate different agronomic characters in each group. For example, in our previous experiment, the mycelium growth rate at 23°C of a yellow strain F1 (6.38 ± 0.1 mm/d) in group 1 was significantly greater than that of the industrialized white strain F3 (6.35 ± 0.07 mm/d) (p < 0.01) (unpublished data). Therefore, F1 could be crossbred with the industrialized white strains to accelerate mycelium growth and shorten the production time.
The wild strains harbor higher genetic diversity than the cultivars in the core collection. Moreover, several economically important traits, such as tolerance to high temperature and a rich sesquiterpene content, can also be found in wild strains [30,40]. In the cultivation of F. velutipes, temperature is usually an important limiting factor. The fruiting temperature of the cultivars must be stringently controlled below 15°C, which results in high energy costs [30]. However, we have gathered several wild strains of F. velutipes from subtropical regions of China in summer, although wild strains of this species mainly form fruiting bodies in winter. Thus, those wild strains are ideal candidates for domestication for tolerance to high temperatures. In fact, we did find a wild strain (F98), collected from Longling, Yunnan, that can grow more vigorously than cultivars at a higher temperature (18°C) [30]. Further analyses of its chemical components showed that this strain contains 15 new sesquiterpenes with various skeletons, some of which showed moderate antidiabetic and antitumor bioactivity [40]. Thus, it is essential to keep as many wild strains as possible when constructing a core collection for F. velutipes.

Conclusions
In conclusion, we have established the first core collection of F. velutipes in China, which is an important platform for efficient breeding of this mushroom in the future. The core collection is representative of the entire collection. In addition, the wild strains in the core collection possess favorable agronomic characters and produce unique bioactive compounds, adding value to the platform. More attention should be paid to wild strains in further strain breeding.