Genetic Diversity of Shanlan Upland Rice (Oryza sativa L.) and Association Analysis of SSR Markers Linked to Agronomic Traits

Shanlan upland rice, a rice germplasm unique to Hainan Island, was used to evaluate genetic diversity and the association between SSR markers and agronomic traits. A total of 239 alleles were detected in 57 Hainan upland rice varieties using 35 SSR markers, and the number of alleles per locus ranged from 2 to 19. The observed heterozygosity ranged from 0.0655 to 0.3115, and the Shannon diversity index ranged from 0.1352 to 0.4827. The genetic similarity coefficient ranged from 0.6736 to 0.9707, and 46 varieties were clustered into one group, indicating that the genetic base of the Shanlan upland rice germplasm was narrow. A total of 25 SSR markers significantly associated with plant height, effective panicle number per plant, panicle length, total grain number, filled grain number, seed setting rate, and 1000-grain weight were obtained (P < 0.01), with the percentage of the total variation explained ranging from 0.12% to 42.62%. RM208 explained 42.62% of the total variation in plant height of Shanlan upland rice. RM493 was significantly associated with 6 agronomic traits. We can speculate that RM208 may flank QTLs responsible for plant height and that RM493 may flank QTLs playing a fundamental role in the intertwined regulatory network of agronomic traits of Shanlan upland rice.


Introduction
Shanlan upland rice (Oryza sativa L.) is a landrace adapted to the tropical dryland climate, with strong drought resistance and good taste quality [1][2][3]. It is distributed in the central and western regions of Hainan Island, where the Li and Miao people live. Shanlan liquor made from Shanlan glutinous rice is known as the "Moutai of Li people" and is well received by local people. In our previous study, the coefficients of variation of agronomic traits including plant height, panicle length, seed setting rate, and 1000-grain weight were all above 10%, and the coefficients of variation of effective panicle number per plant, total number of grains, and number of filled grains reached about 30%, indicating that the Shanlan upland rice germplasm has a relatively rich diversity of agronomic traits. Many excellent Shanlan upland rice varieties were identified, including 7 varieties with panicle length exceeding 30.0 cm, 3 varieties with a total number of grains above 200.0, 10 varieties with seed setting rate exceeding 90.0%, and 23 varieties with 1000-grain weight above 30.0 g, which would provide excellent genetic resources for the high-yield breeding of rice [4].
Based on linkage disequilibrium (LD), the association between target traits and genetic markers or candidate gene mutations in a natural population can be identified. Association analysis is widely used to relate molecular and phenotypic variation and for the discovery, location, and functional analysis of genes of interest [5][6][7]. A simple sequence repeat (SSR) consists of a motif of 1 to 6 bases repeated in tandem, and SSR markers are highly polymorphic, abundant, repeatable, and codominant [8,9]. The Gramene website (http://www.gramene.org) has published more than 19,000 SSR markers in rice, which are commonly used in rice genetic diversity studies, germplasm evaluation, genetic map construction, target trait gene location, and cloning.
In previous studies, only a few Shanlan upland rice varieties were used to analyze genetic diversity [1,10], and no report on association analysis between SSR markers and agronomic traits of the Shanlan upland rice germplasm was found. The objectives of the present study were to use 57 Shanlan upland rice varieties to evaluate genetic diversity and to analyze the association between SSR markers and agronomic traits.
Table 1). From 2015 to 2017, all accessions were grown at the Yongfa Base of the Hainan Academy of Agricultural Sciences in Chengmai County, Hainan Province, and at the Tropical Crop Field in Yinggen Town, Qiongzhong County, Hainan Province. They were direct-seeded around the Dragon Boat Festival with a shallow soil cover. One hundred plants of each accession were planted with a row spacing of 25 cm and a plant spacing of 30 cm. Conventional management for upland rice planting was applied.

Agronomic and Molecular Methods in the Study. Ten plants were selected at random from each accession and evaluated for 7 agronomic traits including plant height, effective panicle number per plant, panicle length, total number of grains per panicle, number of filled grains per panicle, seed setting rate, and 1000-grain weight. Analysis of variance (ANOVA) was performed using SPSS 19.0 software.
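The ANOVA step can be illustrated with a minimal Python sketch of the one-way F statistic, where each group stands for the plants measured in one accession. The function name `one_way_anova_f` and the trait values are hypothetical and chosen only for illustration; the study itself used SPSS 19.0, whose exact model settings are not given here.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of measurements."""
    k = len(groups)                                  # number of groups (accessions)
    n_total = sum(len(g) for g in groups)            # total number of plants
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group size times squared deviation of group mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviation of each plant from its group mean
    ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical toy measurements for three accessions, three plants each
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value = one_way_anova_f(toy_groups)
print(f"F = {f_value:.2f}")
```

A large F value relative to the F distribution with (k - 1, N - k) degrees of freedom indicates that the accessions differ in the trait; the sketch stops at the statistic and does not compute a P value.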
A total of 48 SSR markers distributed on the 12 chromosomes of rice were used to survey the Shanlan upland rice germplasm for genetic diversity (see Table 2). The forward primers of the SSR markers were labeled with FAM (blue) fluorescent dye and synthesized by the BGI (Guangzhou) Company. The reverse nonfluorescent primers were synthesized by the Shenggong (Shanghai) Company.
Genomic DNA was extracted from 50 to 100 mg of fresh tender leaves sampled from individual plants of each accession at the seedling stage, using the Plant Genomic DNA Rapid Extraction Kit produced by the Shenggong (Shanghai) Company, and served as the PCR template. The PCR reactions were carried out in a 10 μL reaction solution containing 1 μL of template DNA (50 to 100 ng), 0.2 μL of each primer (10 μmol/L), 5 μL of 2x EasyTaq® PCR SuperMix produced by TransGen Biotech Company, and 3.6 μL of ddH2O. The PCR amplification used the following cycle profile: initial denaturation at 94°C for 4 min; 30 cycles of denaturation at 94°C for 45 s, annealing at 50 to 67°C for 45 s, and extension at 72°C for 1 min; and a final extension at 72°C for 8 min. The amplified products were submitted to the BGI (Wuhan) Company for capillary electrophoresis on the ABI 3730xl Genetic Analyzer, and the original data were collected using Data Collection software. A 1/0 matrix was constructed based on the presence and absence of alleles, with presence denoted as 1 and absence as 0. Genetic diversity parameters such as the number of alleles per locus, observed heterozygosity (Ho), and Shannon's diversity index (I) were estimated using the program Popgene 1.32. The genetic similarity coefficients among the accessions were calculated using NTSYS 2.1 software. The cluster analysis was carried out using the UPGMA and SHAN methods. The population structure was estimated using Structure 2.2 software.
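The construction of the 1/0 matrix and a pairwise similarity coefficient can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the accession labels and allele sizes are invented, the helper names are hypothetical, and the Jaccard coefficient is used because Jaccard's measure is named in the Results; the NTSYS settings actually applied may differ.

```python
from itertools import combinations


def presence_absence_matrix(genotypes):
    """Build a 1/0 row per accession over all observed (marker, allele size) pairs."""
    alleles = sorted(set().union(*genotypes.values()))
    matrix = {
        accession: [1 if allele in calls else 0 for allele in alleles]
        for accession, calls in genotypes.items()
    }
    return alleles, matrix


def jaccard_similarity(row_a, row_b):
    """Shared alleles divided by alleles present in at least one accession."""
    both = sum(1 for x, y in zip(row_a, row_b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(row_a, row_b) if x == 1 or y == 1)
    return both / either if either else 0.0


# Hypothetical allele calls: accession -> set of (marker, fragment size in bp)
toy_genotypes = {
    "A1": {("RM208", 170), ("RM493", 210)},
    "A2": {("RM208", 170), ("RM493", 214)},
    "A3": {("RM208", 176), ("RM493", 210), ("RM493", 214)},
}

alleles, matrix = presence_absence_matrix(toy_genotypes)
similarity = {
    (a, b): jaccard_similarity(matrix[a], matrix[b])
    for a, b in combinations(sorted(matrix), 2)
}
for pair, value in similarity.items():
    print(pair, round(value, 4))
```

A similarity table of this kind is the input that a UPGMA clustering step would work from; the clustering itself is not reproduced in this sketch.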
Association analyses were carried out using Tassel 2.1 software. The maximum of L(K) was identified as the optimum number of subpopulations, and the structure matrix (Q) was extracted from the membership probability of each genotype for the mixed linear model (MLM) analysis.
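Two ideas in this step can be sketched in Python: choosing K at the maximum of L(K), and expressing how much trait variation a marker explains. The second function is a deliberately simplified stand-in, a one-way ratio of between-allele to total sum of squares, and not the mixed linear model fitted by Tassel, which additionally accounts for the Q matrix. All function names, L(K) values, and trait values below are hypothetical.

```python
def best_k(log_likelihoods):
    """Return the K whose log-likelihood L(K) is highest."""
    return max(log_likelihoods, key=log_likelihoods.get)


def variance_explained(trait_values, allele_calls):
    """Share of total trait variation lying between allele classes (SS_between / SS_total)."""
    n = len(trait_values)
    grand_mean = sum(trait_values) / n
    ss_total = sum((y - grand_mean) ** 2 for y in trait_values)
    groups = {}
    for y, allele in zip(trait_values, allele_calls):
        groups.setdefault(allele, []).append(y)
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    return ss_between / ss_total if ss_total else 0.0


# Hypothetical L(K) values from repeated structure runs
toy_lk = {1: -3200.0, 2: -2950.0, 3: -2990.0, 4: -2995.0}
k_opt = best_k(toy_lk)

# Hypothetical plant heights (cm) and allele classes at one marker
toy_heights = [100.0, 110.0, 130.0, 140.0]
toy_alleles = ["a", "a", "b", "b"]
r2 = variance_explained(toy_heights, toy_alleles)
print(k_opt, f"{100 * r2:.2f}%")
```

Because population structure is ignored here, this ratio would generally overstate a marker's effect in a structured population, which is the reason the Q matrix is included in the MLM analysis.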

Results
3.1. Genetic Diversity Analysis. A total of 35 polymorphic SSR markers selected from the 48 SSR markers were used to screen the 57 Shanlan upland rice accessions. As shown in Table 3, a total of 239 alleles were detected. Representative allele report images are shown in Figure 1. The number of alleles per locus varied from 2 to 19 with an average of 6.8. The observed heterozygosity ranged from 0.0655 to 0.3115 with an average of 0.1702. The Shannon diversity index ranged from 0.1352 to 0.4827 with an average of 0.2826.
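The per-locus statistics reported above can be illustrated with a short Python sketch. It assumes the textbook definitions, Shannon's index I = -Σ p_i ln p_i over allele frequencies and Ho as the fraction of heterozygous individuals; Popgene's exact treatment of the 1/0-coded data may differ, and the genotype calls and the name `locus_summary` are hypothetical. The last lines only check that the reported mean of 6.8 alleles per locus agrees with 239 alleles over 35 markers.

```python
import math
from collections import Counter


def locus_summary(genotypes):
    """Return (number of alleles, observed heterozygosity, Shannon index) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    freqs = [c / total for c in counts.values()]
    shannon = -sum(p * math.log(p) for p in freqs)
    ho = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    return len(counts), ho, shannon


# Hypothetical fragment sizes (bp) for four individuals at one SSR locus
toy_calls = [(170, 170), (170, 170), (170, 176), (176, 176)]
n_alleles, ho, shannon = locus_summary(toy_calls)
print(n_alleles, ho, round(shannon, 4))

# Consistency check on the figures reported in the text
mean_alleles_per_locus = round(239 / 35, 1)
print(mean_alleles_per_locus)
```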

Genetic Similarity Coefficient and Cluster Analysis.
The genetic similarity coefficients of the 57 Shanlan upland rice accessions ranged from 0.6736 to 0.9707 with an average of 0.7889. The dendrogram resulting from the distance-based analysis of the 57 accessions with Jaccard's genetic distance is shown in Figure 2. The 57 accessions were classified into 3 clades at a genetic similarity coefficient of 0.75. Clade 1 included 46 accessions, such as M1, M3, and M6, which were classified into 6 subclades. The accession M31 constituted Clade 2 alone. Clade 3 included 10 accessions, such as M5, M7, and M53.
Figure 3. The likelihood was highest at K = 2 and then decreased, after which it became almost constant. Therefore, the structure results at K = 2 were considered the best possible partition. Using Structure 2.2 software, the posterior probability of each accession was calculated, and the 57 Shanlan upland rice accessions were divided into 2 subpopulations. The population structure diagram is shown in Figure 4. Comparing the results of the population structure analysis with those of the cluster analysis showed that the accessions in Subpopulation 1 were consistent with those in Subclade 1A, Subclade 1B, Subclade 1C, and Subclade 1D. Subclade 1E, Subclade 1F, Clade 2, and Clade 3 belonged to Subpopulation 2.
Table 4. The analysis of variance revealed significant differences
A total of 25 SSR markers significantly associated with agronomic traits such as plant height, effective panicle number per plant, panicle length, total grain number, filled grain number, seed setting rate, and 1000-grain weight were detected, with the percentage of total variation explained ranging from 0.12% to 42.62% (P < 0.01) (see Table 5). Of them, RM208 explained 42.62% of the total variation in plant height of Shanlan upland rice. The locations of the associated SSR markers on the 12 chromosomes of rice are shown in Figure 5 according to Cornell SSR 2001 (https://archive.gramene.org/).
The SSR markers significantly associated with each agronomic trait of the Shanlan upland rice germplasm were all distributed on multiple chromosomes. Many SSR markers were significantly associated with 2 or more agronomic traits. For example, 8 markers were significantly associated with 2 traits, 3 markers with 3 traits, and 3 markers with 4 traits. In particular, RM493 was significantly associated with 6 traits.

Discussion
Shanlan upland rice is a rice germplasm unique to Hainan Island. It is therefore essential to analyze its genetic diversity and explore its application in rice breeding. Through sequence analysis of the SSII, ITS, Ehd1, ndhC-trnV, and cox3 genes, Yuan et al. found that the genetic diversity of 14 Shanlan upland rice varieties was lower than that of Asian cultivated rice and common wild rice [1]. Wang et al. analyzed the genetic diversity of 23 Shanlan upland rice varieties using 22 RAPD primers and found that the genetic similarity coefficients ranged from 0.881 to 0.952 [10]. In this study, a total of 239 alleles were detected in 57 Hainan upland rice varieties using 35 SSR markers, and the number of alleles per locus varied from 2 to 19 with an average of 6.8. The genetic similarity coefficient of the 57 Shanlan upland rice varieties ranged from 0.6736 to 0.9707 with an average of 0.7889, and 46 varieties were clustered into one group, indicating that the genetic base of the Shanlan upland rice germplasm is narrow, which is similar to the results of previous studies [1,10]. The low genetic diversity of the Shanlan upland rice germplasm may be caused by factors such as its relatively narrow geographic origin and long-term continuous selection by the Li and Miao people.
In this study, the 57 accessions were classified into 3 clades, and Clade 1 was further classified into 6 subclades through cluster analysis. Through population structure analysis, the 57 accessions were divided into 2 subpopulations. Subclade
origin, clades and subclades were not concentrated in distribution, and the difference in their geographic origin was not obvious. Zheng et al. used five indicators, namely glume hair, grain phenol reaction, length of the first to second internode below the spike, grain length/width ratio, and chaff color at heading, to determine the subspecies classification of Shanlan upland rice accessions and found that most of the accessions belong to the japonica subspecies [11]. We can speculate that the subspecies structure of Shanlan upland rice has changed in the past two decades and that the proportion of the indica subspecies has increased. The area of Hainan Island is not large, and the geographical and climatic conditions, such as altitude, temperature, and sunshine, are very similar across the areas where Shanlan upland rice is cultivated. In addition, frequent exchanges between the Li and Miao ethnic groups have made the geographical boundaries between different Shanlan upland rice accessions increasingly blurred.
Rice agronomic traits are mostly quantitative traits controlled by multiple genes. Known rice plant height QTLs exist on every chromosome of rice, and the mapping results and effect values of the same QTL differ between studies [12]. Currently, nearly 90 rice dwarf genes have been discovered, and most of them are phytohormone biosynthesis defective mutations or signal transduction defective mutations [13]. Liu [19]. Rice grain weight is greatly affected by grain shape. GW2, GS3, GL3, and other genes are the major genes that control rice grain weight [20][21][22]. In this study, a total of 25 SSR markers significantly associated with plant height, effective panicle number per plant, panicle length, total grain number, filled grain number, seed setting rate, and 1000-grain weight were obtained (P < 0.01), with the percentage of total variation explained ranging from 0.12% to 42.62%. Twelve SSR markers distributed on 9 chromosomes were significantly associated with plant height. RM208 explained 42.62% of the total variation in plant height. We can speculate that RM208 may flank QTLs responsible for plant height. Four SSR markers distributed on chromosomes 1, 2, 7, and 8 were significantly associated with the effective panicle number per plant, which is partly consistent with the results of Liu et al. and Jiao et al. [14,15]. Ten SSR markers distributed on 8 chromosomes were significantly associated with panicle length. Fifteen SSR markers distributed on 10 chromosomes were significantly associated with the total number of grains or the number of filled grains. Five SSR markers distributed on chromosomes 9, 10, and 12 were significantly associated with 1000-grain weight. No marker associated with 1000-grain weight was found on chromosomes 2 and 3, which were mentioned in previous reports [20][21][22]. This may be related to the germplasm specificity of Shanlan upland rice and needs further study.
In this study, the SSR markers associated with each agronomic trait of Shanlan upland rice were all distributed on multiple chromosomes, which suggests, to a certain extent, that these agronomic traits are regulated by multiple genes. On the other hand, this study also found that many SSR markers were significantly associated with 2 or more agronomic traits. RM493 was significantly associated with 6 agronomic traits. We can speculate that RM493 may flank QTLs playing a fundamental role in the intertwined regulatory network of agronomic traits of Shanlan upland rice. The genes encoding protein GFS12 (LOC4326972), protein transport protein SEC23 (LOC4326988), and probable 2-oxoglutarate-dependent dioxygenase AOP1.2 (LOC112939546), which may play a role in the regulation of the agronomic traits of Shanlan upland rice, are located on chromosome 1 close to RM493 [23][24][25]. These genes are worthy of further study.

Conclusion
A total of 239 alleles were detected in 57 Hainan upland rice varieties using 35 SSR markers, and the number of alleles per locus ranged from 2 to 19. The observed heterozygosity ranged from 0.0655 to 0.3115, and the Shannon diversity index ranged from 0.1352 to 0.4827. The genetic similarity coefficient ranged from 0.6736 to 0.9707, and 46 varieties were clustered into one group, indicating that the genetic base of the Shanlan upland rice germplasm was narrow. A total of 25 SSR markers significantly associated with plant height, effective panicle number per plant, panicle length, total grain number, filled grain number, seed setting rate, and 1000-grain weight were obtained (P < 0.01), with the percentage of the total variation explained ranging from 0.12% to 42.62%. RM208 explained 42.62% of the total variation in plant height of Shanlan upland rice. RM493 was significantly associated with 6 agronomic traits. We can speculate that RM208 may flank QTLs responsible for plant height and that RM493 may flank QTLs playing a fundamental role in the intertwined regulatory network of the agronomic traits of Shanlan upland rice.

Data Availability
Data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
There are no conflicts of interest to disclose.   BioMed Research International