Genome-Wide Association Study Discovered Favorable Single Nucleotide Polymorphisms and Candidate Genes Associated with Ramie (Boehmeria nivea L.) Colloidal Matters

ABSTRACT Ramie (Boehmeria nivea L.) bast fiber is one of the most ancient natural fibers and can be used for textiles only after degumming. Hemicellulose, pectin, and hydrotrope are the main colloidal matters in ramie fiber and are removed during degumming. However, their genetic variation and control loci are poorly understood. Here, we analyzed the genetic variation of the colloidal matters and discovered favorable single nucleotide polymorphisms (SNPs) by a genome-wide association study using 319 ramie core germplasms. In total, 21, 5, and 10 SNPs were identified as key control loci significantly associated with hemicellulose, pectin, and hydrotrope, respectively; these SNPs changed the amino acid sequences encoded by the reference genes. Fourteen key genes involved in the biosynthesis and metabolism pathways of various sugars were identified as regulators of the hemicellulose content in ramie. Meanwhile, 7 genes related to calcium transport or binding were identified as being involved in the biosynthesis of calcium magnesium pectinate, the main component of pectin. Moreover, 306 genes related to hydrotrope were identified, and their functions are also discussed. Our study is the first to reveal the genetic variation and candidate genes related to the colloidal matters of ramie bast fiber, and it provides tools for marker-assisted selection in the improvement of agricultural traits.


Introduction
Bast fiber extracted from the stem bark of plants is one of the most important fiber types for human use. Ramie (Boehmeria nivea L.), which is native to China and one of the oldest fiber crops, has been strategy to cultivate high-quality fiber varieties. In this study, we focus on identifying single nucleotide polymorphisms (SNPs) significantly associated with ramie colloidal matters (hemicellulose, hydrotrope, and pectin) by whole-genome scanning of 319 ramie core germplasms. The related regulatory genes are screened by alignment against the ramie genome, and the functions of the predicted candidate genes are also discussed. This research provides the genetic basis of the components of ramie gum and reveals the key genetic regulatory sites, offering a significant reference for ramie molecular breeding.

Plant material and phenotypic analysis
A total of 319 ramie core germplasms collected from China, India, Indonesia, Japan, and Cuba (Table S1) were used in this study. These germplasms were planted at the Yuanjiang Experimental Station, Institute of Bast Fiber Crops, Chinese Academy of Agricultural Sciences (E112°36", N28°83"). All germplasms were propagated asexually, and each germplasm was planted in a row of six plots (one plant per plot), with 50 cm between plots and 70 cm between rows. The bast fibers of the six plants of each germplasm were harvested together in June (season 1) and August (season 2) 2018 and dried under natural conditions for the determination of hemicellulose, hydrotrope, and pectin.
Before determination, the bast fiber of each germplasm was divided into two equal parts; one part was discarded and the other was retained. The retained part was halved repeatedly in the same way until its weight was about 40-50 g. The middle part of the final retained sample was used for phenotypic analysis, with three replicates of ~3 g each per germplasm. Each sample was dried to constant weight in an oven at 105°C, immediately placed in a dryer for 30 ± 5 min, and weighed (the weights for hydrotrope, hemicellulose, and pectin determination were recorded as $W_{hy1}$, $W_{hc1}$, and $W_{p1}$, respectively). Next, the sample was placed in a conical flask filled with 150 ml of extractant: distilled water for hydrotrope, 20 g·L⁻¹ sodium hydroxide for hemicellulose, and 5 g·L⁻¹ ammonium oxalate for pectin. After soaking in the extractant for 20 min, the sample was heated for 30 min in a microwave oven and then washed with running tap water. The clean sample was dried at 105°C, cooled in a dryer, and weighed (the weights for hydrotrope, hemicellulose, and pectin determination were recorded as $W_{hy2}$, $W_{hc2}$, and $W_{p2}$, respectively). The contents of hydrotrope, hemicellulose, and pectin were calculated with the following formulas:
$$W_{hy} = 100 \times \frac{W_{hy1} - W_{hy2}}{W_{hy1}}, \qquad W_{hc} = 100 \times \frac{W_{hc1} - W_{hc2}}{W_{hc1}}, \qquad W_{p} = 100 \times \frac{W_{p1} - W_{p2}}{W_{p1}}$$
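The weight-loss calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function names and the replicate weights are hypothetical, and averaging the three replicates is an assumption about how the replicate values were combined.

```python
def content_percent(weight_before_g, weight_after_g):
    """Percent of dry mass removed by extraction: 100 * (W1 - W2) / W1."""
    if weight_before_g <= 0:
        raise ValueError("dry weight before extraction must be positive")
    if not 0 <= weight_after_g <= weight_before_g:
        raise ValueError("dry weight after extraction must lie between 0 and W1")
    return 100.0 * (weight_before_g - weight_after_g) / weight_before_g


def mean_content(replicates):
    """Average the content over (W1, W2) pairs; averaging is an assumption."""
    values = [content_percent(w1, w2) for w1, w2 in replicates]
    return sum(values) / len(values)


# Hypothetical dry weights (g) before and after sodium hydroxide extraction
# for three ~3 g replicates of one germplasm.
hemicellulose_replicates = [(3.00, 2.58), (3.02, 2.60), (2.98, 2.57)]
hemicellulose_pct = mean_content(hemicellulose_replicates)
```

The same function applies to hydrotrope and pectin; only the extractant used to obtain the second weight differs.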

GWAS
Resequencing was performed on the 319 ramie core germplasms, and 3,494,975 high-quality SNPs were obtained (Huang 2020). Using these high-quality SNPs and the phenotypic data, phenotype-genotype association analysis and allele effect calculations were performed with the TASSEL 5.0 software. The general linear model adjusted using the Q-matrix (GLM-Qmatrix model) was used in the association analysis. SNPs with P < .1 and P < .05 were defined as significant trait-associated SNPs.
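To make the idea of a Q-adjusted general linear model concrete, the sketch below tests a single SNP by comparing two ordinary least squares fits: phenotype against intercept plus Q covariates, and the same model with the SNP added. It is not TASSEL's implementation. The additive 0/1/2 genotype coding, the function names, and the toy data are all assumptions made for illustration. Only the F statistic is computed; a P value would come from the upper tail of the F distribution with 1 and n − p degrees of freedom.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def residual_sum_of_squares(design, y):
    """Fit y ~ design by least squares (normal equations) and return the RSS."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)]
           for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    beta = solve_linear_system(xtx, xty)
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def glm_q_f_statistic(phenotype, genotype, q_covariates):
    """F statistic (1 numerator df) for adding one SNP to intercept + Q.

    q_covariates should hold K - 1 of the K ancestry columns, because the
    full Q-matrix sums to 1 per sample and would duplicate the intercept.
    """
    reduced = [[1.0] + [float(v) for v in q] for q in q_covariates]
    full = [row + [float(g)] for row, g in zip(reduced, genotype)]
    rss_reduced = residual_sum_of_squares(reduced, phenotype)
    rss_full = residual_sum_of_squares(full, phenotype)
    df_residual = len(phenotype) - len(full[0])
    return (rss_reduced - rss_full) / (rss_full / df_residual)


# Hypothetical toy data: 8 accessions, one Q column, genotypes coded 0 or 2.
q_covariates = [[0.9], [0.8], [0.7], [0.6], [0.4], [0.3], [0.2], [0.1]]
phenotype = [10.1, 14.0, 9.9, 13.9, 10.2, 14.1, 10.0, 13.8]
associated_snp = [0, 2, 0, 2, 0, 2, 0, 2]
unrelated_snp = [0, 0, 2, 2, 2, 2, 0, 0]
f_associated = glm_q_f_statistic(phenotype, associated_snp, q_covariates)
f_unrelated = glm_q_f_statistic(phenotype, unrelated_snp, q_covariates)
```

In this toy data the first SNP tracks the phenotype closely and yields a large F statistic, while the second explains almost nothing once the Q covariate is in the model.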

Changes of hemicellulose, hydrotrope, and pectin in the 319 ramie germplasms
The hemicellulose, hydrotrope, and pectin contents of ramie bast fiber were measured in June 2018 (season 1) and August 2018 (season 2) using the 319 ramie core germplasms. The hemicellulose content ranged from 0.114 to 0.229, with an average of 0.142, in season 1, and from 0.055 to 0.155, with an average of 0.097, in season 2 (Table 1). The coefficient of variation (CV) of hemicellulose in season 2 was more than twice that in season 1. The hydrotrope content ranged from 0.038 to 0.161 across the two seasons, with a CV of about 19 in both seasons, indicating large variation among the 319 ramie germplasms (Table 1). The pectin content ranged from 0.033 to 0.104, with a CV of 21.11, in season 1, and from 0.091 to 0.202, with a CV of 11.09, in season 2 (Table 1). The hemicellulose, hydrotrope, and pectin contents in both seasons followed a normal distribution with good continuity (Figure 1). These data illustrate the wide variation of the measured traits in the core germplasm population, indicating that it is an excellent population for marker-trait GWAS.
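For readers who want to reproduce the variation statistic, a coefficient of variation can be computed as below. This is a sketch under two assumptions: that CV is expressed in percent, and that the sample standard deviation (n − 1 denominator) is used. The input values are hypothetical, not data from Table 1.

```python
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical pectin contents (as fractions) for five germplasms in one season.
season_contents = [0.041, 0.052, 0.060, 0.075, 0.083]
cv_percent = coefficient_of_variation(season_contents)
```

Because the CV is scale free, it gives the same value whether contents are stored as fractions or as percentages.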

Identification of favorable SNPs and candidate genes associated with hemicellulose
In total, 80 significant SNPs associated with hemicellulose were identified, of which 40 were located on chromosome 1, 19 on chromosome 5, 5 on chromosome 11, and 7 on chromosome 13 (Figure 2, Table S2). Forty-seven SNPs were located in introns of the reference genes, 21 in exons, and 12 in UTRs (Table 2, Table S2). The 21 exonic SNPs involved 14 genes, whose amino acid sequences were changed by the SNPs. For two genes (Maker00076265 and Maker00086604), SNPs in the exons caused premature termination of translation. Maker00076265 encodes a cytochrome P450 protein of 486 amino acid residues (named BnP450-1), and the corresponding mutant gene encodes a protein of 334 amino acid residues (named BnP450-M1), whose predicted structure differed from that of BnP450-1 (Figure 3(a,b,e,f)). Maker00086604 encodes an auxin response factor of 844 amino acid residues (named BnARF-1), and the corresponding mutant gene encodes a protein of 514 amino acid residues (named BnARF-M1), whose predicted structure differed from that of BnARF-1 (Figure 3(c,d,g,h)).
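The effect of an exonic SNP on the encoded protein, including the premature termination described above, can be illustrated by translating the reference and mutant codons with the standard genetic code. The sketch below is a generic illustration rather than the annotation pipeline used in this study; the function names and the short coding sequence are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_snp(cds, position, alt_base):
    """Classify a single-base substitution at a 0-based CDS position."""
    start = position - position % 3
    ref_codon = cds[start:start + 3].upper()
    offset = position - start
    alt_codon = ref_codon[:offset] + alt_base.upper() + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"  # premature stop codon, truncated protein
    if ref_aa == "*":
        return "stop-lost"
    return "missense"


# Hypothetical 5-codon coding sequence: Met-Trp-Lys-Phe-stop.
reference_cds = "ATGTGGAAATTTTAA"
effect = classify_snp(reference_cds, 5, "A")  # TGG (Trp) becomes TGA (stop)
mutant_cds = reference_cds[:5] + "A" + reference_cds[6:]
truncated_protein = translate(mutant_cds)
```

A nonsense SNP shortens the translated protein in the same way that the mutant forms BnP450-M1 and BnARF-M1 are shorter than their reference proteins.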
To identify candidate genes related to hemicellulose synthesis or metabolism, the genes located within 100 kb upstream or downstream of the trait-associated SNPs were identified and their functions analyzed. In particular, two glucose biosynthesis-related genes (BnG6PT3 and BnUGPA3) and one gene each related to trehalose (BnTPPA), fructose (BnABF1), and glucan (BnGE134) biosynthesis were identified near the SNPs Maker75072-385, Maker75196-018, and Maker75707-243, which mapped to chromosome 1 (Figure 2, Table 3). On chromosome 13, one glucose (BnG6PI), one glucan (BnGUN7), and one xylan (BnIRX9H) biosynthesis-related gene were detected near the SNPs Maker03162-140, Maker96309-904, and Maker24425-396, respectively. One galactose metabolism-related gene (BnPT225), one glucose biosynthesis-related gene (BnHXK1), one mannose biosynthesis-related gene (BnMANA1), and one glucan biosynthesis-related gene (BnGBG6) were identified on chromosomes 11, 2, 4, and 5, respectively. Glucose, mannose, glucan, trehalose, fructose, and xylan are main components of hemicellulose. The genes identified near the significant hemicellulose-related SNPs may therefore play a key role in controlling the hemicellulose content of ramie bast fiber.
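The window-based candidate gene scan described above amounts to an interval overlap query. The following sketch shows one way to do it, assuming a gene counts as a candidate when any part of it falls within 100 kb of the SNP on the same chromosome; the source does not state the exact overlap rule. All names and coordinates are hypothetical.

```python
from collections import namedtuple

Gene = namedtuple("Gene", "name chrom start end")
Snp = namedtuple("Snp", "name chrom pos")


def genes_near_snp(snp, genes, window=100_000):
    """Return names of genes overlapping [pos - window, pos + window]."""
    low, high = snp.pos - window, snp.pos + window
    return [
        gene.name
        for gene in genes
        if gene.chrom == snp.chrom and gene.start <= high and gene.end >= low
    ]


# Hypothetical annotation and SNP; coordinates are invented for illustration.
annotation = [
    Gene("geneA", "chr1", 150_000, 153_000),   # 47 kb upstream of the SNP
    Gene("geneB", "chr1", 295_000, 305_000),   # straddles the window edge
    Gene("geneC", "chr1", 450_000, 452_000),   # outside the window
    Gene("geneD", "chr13", 190_000, 210_000),  # right position, wrong chromosome
]
demo_snp = Snp("snp_demo", "chr1", 200_000)
candidates = genes_near_snp(demo_snp, annotation)
```

With many SNPs and a full genome annotation, sorting genes by start position or using an interval tree would avoid scanning every gene for every SNP.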

Identification of favorable SNPs and candidate genes associated with pectin
Twenty-nine significant SNPs associated with pectin were identified, mainly distributed on chromosome 11 (10), chromosome 13 (7), and chromosome 8 (5) (Figure 4(a,b), Table S3). Eighteen of the SNPs were located in introns of the reference genes, 5 in UTRs, and 6 in exons. The 6 exonic SNPs involved 5 genes, and the amino acid sequences encoded by 4 of these genes were changed by the SNPs (Table 4). Genes located within 100 kb upstream or downstream of the significant SNPs were scanned to identify candidate genes related to pectin biosynthesis. In total, 189 genes were identified, which mapped to chromosome 13 (59), chromosome 9 (31), chromosome 3 (29), chromosome 8 (18), chromosome 11 (16), and chromosome 12 (11) of ramie (Table S4). KEGG analysis showed that 74.0% of the identified genes in the top 20 significantly enriched pathways were involved in metabolism, including amino sugar and nucleotide sugar metabolism, phenylalanine metabolism, cyanoamino acid metabolism, and others (Figure 5(a), Figure S1). Calcium magnesium pectinate is the main component of pectin; notably, 7 genes related to calcium transport or binding were identified near the SNPs Maker36972-744, Maker86695-027, Maker58329-659, Maker09298-307, and Maker63012-296 (Table 5), suggesting that these genes play key roles in controlling the pectin component.

Identification of favorable SNPs and candidate genes associated with hydrotrope
The hydrotrope of ramie bast fiber includes a variety of substances, such as micromolecules, pigments, sugars, and other substances. In total, 52 significant SNPs related to hydrotrope in ramie bast fiber were identified, distributed on 11 chromosomes of ramie (Figure 4(c,d), Table S5). Thirty-three of the SNPs were located in introns of the reference genes, 15 in exons, and 4 in UTRs (Table S5). The 15 exonic SNPs involved 11 genes, and the amino acid sequences encoded by 10 of these genes were changed by the SNPs (Table 4). Genes located within 100 kb upstream or downstream of the significant SNPs were scanned to identify candidate genes related to hydrotrope biosynthesis and metabolism. In total, 306 genes were identified, mainly distributed on chromosome 14 (43), chromosome 13 (40), chromosome 3 (35), chromosome 6 (29), chromosome 11 (28), and chromosome 1 (27) (Table S6). The top 20 significantly enriched pathways showed that the candidate genes were relatively evenly involved in fructose and mannose metabolism, nitrogen metabolism, ABC transporters, biosynthesis of amino acids, and other processes (Figure 5(b), Figure S2), which is consistent with the multiple components of hydrotrope.

Discussion
The components of the colloidal matter of ramie bast fiber are complex, and its genetic basis and molecular regulation mechanism remain unclear. Here, we performed a GWAS with 319 ramie core germplasms to identify significant SNPs and key candidate genes associated with the regulation of hemicellulose, pectin, and hydrotrope. In total, 153 significant SNPs (80, 21, and 52 associated with hemicellulose, pectin, and hydrotrope, respectively) were identified; 88 of them were located in introns of the reference genes, 21 in UTRs, and 42 in exons, suggesting that most of the SNPs related to colloidal matter regulation lie in non-coding regions (introns and UTRs). The SNPs significantly associated with hemicellulose were mainly distributed on chromosomes 1, 5, 11, and 13 of ramie, while the SNPs associated with pectin were mainly located on chromosomes 1, 3, 6, 10, and 11. The significant SNPs related to hydrotrope were distributed on chromosomes 1, 11, and 13 of ramie. These results indicate that the genetic regulation sites of the ramie colloidal matter are mainly concentrated on chromosomes 1, 11, and 13. Because they make up about half of the total gummy components, hemicelluloses are the main target for removal in the degumming of ramie bast fiber. Hemicelluloses are polysaccharides composed of sugar moieties such as glucose, arabinose, xylose, mannose, galactose, and other monomers (Scheller and Ulvskov 2010). Because of their intricate network structure with cellulose and other biological molecules (Sella Kapu and Trajano 2014; Zhou et al. 2016), as well as their complex composition, little is known about the genetic basis and regulatory molecular mechanism of hemicellulose in ramie. Several studies have been conducted in other plant species to detect the loci controlling hemicellulose. For example, three QTLs for hemicellulose were detected in Brassica napus L. (Liu et al. 2013).
Seven and 9 significant SNPs associated with hemicellulose were identified in sorghum (Laavanya et al. 2021) and Populus trichocarpa (Porth et al. 2013), respectively. In this study, 21 SNPs associated with hemicellulose that changed the amino acid sequences encoded by the reference genes were identified in ramie. In particular, translation of the two genes BnP450-1 and BnARF-1 was prematurely terminated as a result of SNP-induced mutations in their exons. Thus, the 21 SNPs can be considered key genetic control sites regulating hemicellulose in ramie. The Arabidopsis homolog of BnP450-1 regulates leaf cell division and pollen germination via the brassinosteroid signaling pathway (Vogler et al. 2014; Zhiponova et al. 2013). The P450 family genes play various roles in plant growth, development, and stress resistance, and our study may point to a novel function of P450 family genes in the biosynthesis of hemicellulose. Meanwhile, we identified 12 key genes (Table 3) related to the synthesis and metabolism of a variety of sugars, such as glucose, glucan, xylose, and others, which are the main components of hemicellulose. These genes can be considered candidate genes regulating the hemicellulose content in ramie.

Conclusion
In conclusion, we performed a GWAS with 319 ramie core germplasms to identify, for the first time, the SNPs significantly associated with ramie gum as well as the associated genes. Key SNPs associated with hemicellulose (21), pectin (5), and hydrotrope (10) were mainly located on chromosomes 1, 11, and 13 of the ramie genome. The key regulatory candidate genes of hemicellulose (14) and pectin (7) were identified, while 306 genes regulating multiple pathways were found to be related to hydrotrope synthesis, which is consistent with its multicomponent composition. Our study provides the genetic basis of the components of ramie gum and reveals the key genetic regulatory sites, offering a significant reference for ramie molecular breeding.

Summary
• The contents of hemicellulose, pectin, and hydrotrope in ramie bast fibers were systematically studied with 319 ramie core germplasms for the first time.
• The genetic variations of hemicellulose, pectin, and hydrotrope in ramie were first revealed in this study.
• The key SNP loci and key genes associated with hemicellulose, pectin, and hydrotrope were identified, and the genetic regulation sites of the ramie colloidal matter were found to be mainly concentrated on chromosomes 1, 11, and 13. The identified loci and genes may be promising targets for genetic engineering and selection for modulating the colloidal matters in ramie.

Figure 5. Bar graph of the top 20 significantly enriched differential pathways of pectin (a) and hydrotrope (b) synthesis and metabolism related candidate genes generated by KEGG analysis.