High-Density Genetic Map Construction and QTL Mapping of Leaf and Needling Traits in Ziziphus jujuba Mill

The Chinese jujube (Ziziphus jujuba Mill., 2n = 2x = 24), one of the most popular fruit trees in Asia, is widely cultivated and utilized in China, where it is traditionally consumed as both a fresh and dried food resource. A high-density genetic map can provide the necessary framework for quantitative trait loci (QTL) analyses and map-based gene cloning and molecular breeding. In this study, we constructed a new high-density genetic linkage map via a genotyping-by-sequencing approach. For the consensus linkage map, a total of 3,792 markers spanning 2,167.5 cM were mapped onto 12 linkage groups, with an average marker interval distance of 0.358 cM. The genetic map anchored 301 Mb (85.7%) of scaffolds from the sequenced Z. jujuba “Junzao” genome. Based on this genetic map, 30 potential QTLs were detected, including 27 QTLs for leaf traits and 3 QTLs for needling length. This high-density genetic map and the identified QTLs for relevant agronomic traits lay the groundwork for functional genetic mapping, map-based cloning, and marker-assisted selection in jujube.


INTRODUCTION
Chinese jujube (Ziziphus jujuba Mill., 2n = 2x = 24) is a popular fruit tree in Asia. The fruit crop has been widely cultivated across Northern China for 7,000 years (Qu and Wang, 1993;, and nowadays there are more than 840 cultivars (Liu et al., 2014a). These cultivars are mostly landraces which have not been subjected to modern breeding. The jujube is domesticated from its wild relatives, Z. jujuba Mill. var. spinosa (Bunge) Hu ex H. F. Chow, and this long process has been suggested to be linked with human selection and natural reproduction . The wild jujube is a small shrub, typically possessing thorny branches and ovate-acute leaves with three conspicuous veins at the base and finely toothed margins (Gupta, 2004). They can withstand extreme arid conditions and produce reasonable yields. In contrast, the Chinese jujube has become greatly differentiated during the long history of evolution (Liu, 2010). Chinese jujube has diverse stipular spines (strong, weak, and absent) and leaf shape (oval, ovoid in shape, ovate-lanceolate). Importantly, leaf traits can influence the fitness of trees through biochemical, physiological, morphological, and developmental mechanisms (Donovan et al., 2011). Leaves are also the major organ of photosynthesis in plants, and photosynthesis of jujube is highly sensitive to water deficit, directly affecting development and productivity (Cui et al., 2009). An additional morphological feature of jujube is needling, which causes inconveniences in field operation of farmers and may even cause injury (Qi et al., 2009). Thus, mapping of genes controlling leaf and needling traits and development of applicable markers, are of significant value in jujube farming.
In the past 15 years, substantial progress has been made towards the development of genetic markers and construction of linkage maps in jujube (Supplementary Table S6). The first genetic map is based on the F 1 progenies from the cross between "Dongzao" and "Linyillizao, " with 128 AFLP markers, consisting of seven linkage groups (LGs), and spanning 458.66 cM (Lu et al., 2005). The map was then further developed by RAPD and SSR markers, with the marker numbers ranging from 333 to 423 by Shen (2005), Qi (2009), andXu (2012). However, the map is limited by low marker density, meaning it is often unsuitable for breeding purposes. More recently, the jujube reference genome sequences have been released (Liu et al., 2014b;Huang et al., 2016), making it possible to develop more genetic markers for jujube. With the development of next generation sequencing technologies, two sets of 2,872 and 2,540 single nucleotide polymorphisms (SNPs) were identified from two F 1 populations, "Dongzao" × "JMS2" and "Dongzao" × "Zhongningyunzao, " respectively. These reports demonstrate a robust and powerful approach for genotyping in jujube using Illumina sequencing technology (Zhao et al., 2014;Zhang et al., 2016). However, there is still no saturated genetic map for QTL localization in Chinese jujube.
Genotyping by sequencing (GBS) identifies SNPs within restriction-site-associated DNA sequences at many loci throughout the genome (Miller et al., 2007). A combination of three restriction enzymes with distinct restriction sites, MseI (TTAA), HaeIII (GGCC), and EcoRI (GAATTC), have been previously used for GBS library construction. This approach increased the tag number, sequencing depth, and genome coverage, while also providing additional opportunities to detect suitable regions for targeted fragments (Carlson et al., 2015). Recent studies have shown that GBS is an efficient and low-cost approach for SNP marker development in jujube Wu et al., 2017). Thus, we sought to use a GBS strategy to construct a new high-density genetic map for jujube.
In this study, we generated a new high-density genetic map with an F 1 population crossed by "Dongzao" × "Jinsi4. " With this genetic map, we also identified genomic regions that were associated with important horticultural traits, such as leaf length, leaf width, leaf area, leaf shape index, specific leaf weight, chlorophyll content, and needling length.
The leaf samples were collected at four weeks after sprouting from each F 1 individual and the parents for DNA isolation in May 2016. Collected samples were immediately frozen in liquid nitrogen and transferred to −80°C. Approximately 200 mg of each sample was ground in liquid nitrogen for genomic DNA isolation using a plant genomic DNA extraction kit 9768 (Takara, Dalian, China) following the manufacturer's protocol. A NanoDrop 2000 UV-Vis spectrophotometer (Thermo Fisher Scientific, USA) was used to determine the DNA concentration in each sample.

GBS and High-Throughput Sequencing
A GBS strategy was used to develop SNP markers as previously described . Briefly, approximately 0.1 to 1 μg of genomic DNA was incubated at 37°C with MseI (New England Biolabs, NEB), T4 DNA ligase (New England Biolabs, NEB), ATP (New England Biolabs, NEB), and an MseI Y adapter N containing barcodes, and the samples were then heat-inactivated at 65°C. Two additional enzymes, HaeIII and EcoRI (New England Biolabs, NEB), were simultaneously added into the MseI digestions to further digest the fragments at 37°C. Then, the digested fragments with ligations were purified with Agencourt AMPure XP beads (Beckman Coulter, Inc.) and subjected to PCR amplification with the Phusion Master Mix (New England Biolabs, NEB) using both universal primers as well as i5 and i7 index primers (Illumina). The PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, Inc.), pooled, and separated by electrophoresis on a 2% agarose gel. Fragments of 400 to 450 bp (with indexes and adaptors) were excised from the gel and purified using a gel extraction kit (QIAGEN). These purified products were further cleaned using Agencourt AMPure XP beads (Beckman Coulter, Inc.) prior to sequencing. Then, paired-end 150 bp sequencing was performed on the selected tags on the Illumina HiSeq 4000 platform.

Data analysis
To ensure that sequencing reads were reliable and without artificial bias raw data in the Fastq format was initially processed through a series of quality control (QC) procedures using in-house C scripts. QC standards were as follows: (1) reads with ≥10% unidentified nucleotides (N) were removed; (2) reads with >50% bases having a phred quality <5 were removed; (3) reads with >10 nt aligned to the adapter, allowing ≤10% mismatches were removed; (4) reads containing the Haell or EcoRI enzyme sequence were removed. BWA (Burrows-Wheeler Aligner) (Li and Durbin, 2009) was used to align the clean reads of each sample to the reference genome (settings: mem-t 4-k 32-M-R). Alignment files were converted to BAM files using the SAMtools software ) (settings: -bS -t). If multiple read pairs had identical external coordinates, only the pair with the highest mapping quality was retained.

SNP Calling and Genotyping
Variant calling was performed for all samples using the GATK software (Mckenna et al., 2010). UnifiedGenotyper was used to estimate genotype and gene frequencies. Unreliable SNPs were eliminated via a filtering process. SNP calling was performed for both parents and progenies using the SAMtools software . SNPs were filtered using a house-in Perl script. Polymorphic markers between the two parents were detected and classified into eight segregation patterns (ab × cd, ef × eg, hk × hk, lm × ll, nn × np, aa × bb, ab × cc, and cc × ab) according to the CP model in JoinMap 4.1 software (Van Ooijen and Voorrips, 2001). For the F 1 population, markers with the genotypes of hk × hk, lm × ll, and nn × np were chosen for genetic mapping.

Map Construction and anchoring Sequence Scaffolds
Prior to map construction, SNP markers were further filtered using the parameters of segregation distortion (p < 0.01), integrity (> 95%), or the presence of abnormal bases with a house-in script. Then, the filtered SNP markers were sorted using the maximum-likelihood method and corrected by Smooth algorithms in JoinMap4.1. The Kosambi mapping function was then used to calculate marker distances . The integrated maps for both the male and female parents were computed using the combined group for map integration function in the MergeMap software (Wu et al., 2008). A Perl script SVG was used to visualize the exported maps. The number and linkage distance of gaps representing the interval between two markers on 12 LGs were counted.

anchoring Sequenced Scaffolds to the Genetic Map
For the collinearity analysis, markers localized on the genetic map were anchored with the assembled scaffolds of jujube genome (NCBI accession: LPXJ00000000)  using a Perl script. The mapping results were further visualized by joining the LG and anchored scaffolds together with grey lines .

Phenotyping Traits
All measurements were performed on the 103 progenies of JS × DZ. The needles and leaf traits were investigated in May and July 2016, 2017, and 2018. Fifteen to 20 leaves were picked from middle shedding shoots of the jujube on each tree, and six major leaf traits were measured. These traits were leaf length, leaf width, leaf area, leaf shape index, specific leaf weight, and chlorophyll content. Ten to 20 needles were picked from the biennial shoots. Leaf length, leaf width, and needle length were measured using Vernier calipers, and leaf area was measured using a Li-3000c Leaf Area Meter (Li-cor Inc., USA). Chlorophyll content and leaf weight was measured by a SPAD-502 chlorophyll meter (Zhejiang Top Instrument Co., Ltd.) and leaf weight was measured using an analytical balance (Zhejiang Top Instrument Co., Ltd.). All of the measurements were repeated three times. Reported phenotypic data were the average of three years, and were analyzed using the SPSS v18.0 statistical software package .

QTL analysis
Seven phenotypic traits were subjected to QTL analysis using the MQM mapping method of the MapQTL software (Van Ooijen, 2009). A 1,000 permutation test at a 95% confidence level was used to determine the LOD thresholds (use random genotypes to associate with phenotypes, take Max in every time permutation), with significance set at p < 0.05 (Van Ooijen, 1999). After 1,000 permutation test, a LOD threshold of 3.0 was set to identify significant QTLs at the 95% confidence level. Ranges above the LOD threshold of 3.0 were identified as QTL intervals. Markers located at or flanking the peak LOD value of a QTL were identified as QTL associated markers (Sun et al., 2015).

Quality evaluation of Sequencing Data
A total of 40.31 Gb of clean reads (99.99% of total raw reads) were generated by sequencing the parents and 103 progenies. After data filtering, 99.99% of reads were of high quality, with an average Q20 ratio of 99.9% and a GC content of 35.94%. The parents were sequenced at a higher depth to enhance the chances of SNP detection. Finally, clean data covering 1,592,370,720 bp (99.99%) and 1,466,831,520 bp (99.99%) were obtained for the female and male parents, respectively. For each individual plant, the clean data ranged from 203,802,048 to 618,848,928 bp in coverage, with an average of 361,682,155 bp. The average number of total reads for parents and progenies were 1,529,724,096 and 361,713,312 bp, respectively (Supplementary Table S1).
Paired-end reads of clean data for the two parents and the F 1 progenies were computed using the BWA Comparison software (parameters: mem-t 4-k 32-M-R). Comparison results were processed with SAMtools for format conversion. The published jujube genome has 351,295,593 bp. High-quality clean reads were aligned to this genome. In our study, 11,058,130 and 10,186,330 aligned clean reads were obtained for the female and male parents, respectively. For F 1 individuals, an average of 2,511,682 clean reads was aligned to the reference genome. Mapping rates of the 103 F 1 individuals were between 96.6% and 98.15% using the Perl script (Supplementary Table S2).

SNP Calling and Genotyping
In total, 378,500 and 346,669 SNPs were detected in the female and male parents, respectively. For F 1 individuals, an average of 224,432 SNPs were discovered for each progeny. The parents exhibited a lower SNP heterozygosity rate (39.96%) than F 1 individuals (41.45%) (Supplementary Table S3).
By excluding missing information, a combined 194,976 polymorphic SNPs were detected between the two parents. These SNPs were classified into eight segregation types according to the CP model using JoinMap 4.1. Among eight detected marker patterns, four major patterns including lm × ll, aa × bb, hk × hk, and nn × np accounted for nearly 99.5%, while the other four patterns, ab × cd, ab × cc, cc × ab, and ef × eg, only accounted for 0.5%. Only segregation types lm × ll, nn × np, and hk × hk were selected for genotyping in F 1 individuals. The final number of available markers was 177,224 (Figure 1).

Genetic Linkage Map Construction
Prior to map construction, we generated 35,509 candidate markers by further filtering the abnormal bases and a rate lower than 95% integrity in each individual. By screening the threshold of segregation distortion (p < 0.01), 22,424 markers were used for the final map construction ( Table 1).
In total, 11,423 (3,336 without consistent loci) markers and 15,539 (3,850 without consistent loci) markers fell into 12 LGs on the male and female maps, respectively. The genetic lengths were 2,175.643 and 2,093.059 cM, with average marker intervals of 0.66 and 0.4875 cM (Supplementary Tables S4 and S5, Supplementary  Figures S1 and S2).
The combined map spanned 2,167.511 cM with 22,424 (3,792 without consistent loci) markers falling into 12 LGs, with an average marker interval of 0.358 cM and average chromosome length of 180.63 cM (Figure 2). Among the 12 LGs, LG11 was the largest group, with a genetic distance of 318.5 cM and 3,005 markers ( Table 2).
LG07 was the shortest group with 969 markers and spanning 57.68 cM. The average marker interval ranged from 0.24 to 0.48 cM, with an average distance of 0.358 cM ( Table 1). Among these markers, 5,774 gaps were detected. Of these, 5,749 gaps (99.6%) were less than 5 cM, 23 gaps were between 5 and 10 cM, and only two gaps were larger than 10 cM, which were on LG08.
anchoring Scaffolds of the Sequenced Jujube Genome to the Genetic Map For the collinearity analysis, 5,786 markers (without consistent loci) localized on the genetic map of JS × DZ were anchored with the assembled scaffolds of jujube genome. In total, 852 unique scaffolds representing 301 Mb (85.7%) of the total genome were localized on the 12 LGs (Figure 3). The sequenced jujube genome ("Junzao") consisted of 36,147 scaffolds covering 351 Mb of sequences. LG11 anchored the highest number of scaffolds with a physical length of 46.9 Mb. LG4 anchored the lowest number of scaffolds with a length of 16.6 Mb. The correlation between genetic and physical length was 163.07 Kb/cM on average ( Table 2).

Variations of F 1 -Generation Leaf and Needling Characteristics
The morphological characteristics of leaf and needling in F 1 -generation were measured (Supplementary Tables S7 and S8). The differences between leaf and needling parameters among F 1 individuals were all extremely significant (p < 0.0001). The coefficients of variation for the progeny leaf features were all lower than 30%. In particular, the variation in leaf area ranged from 3.69 to 19.08 cm 2 , and its coefficient of variation reached up to 27.75%. The variation in leaf shape index ranged from 1.45 to 2.66, and the coefficient of variation was 12.34%. The average specific leaf weight of F 1 -generation leaves was 49.39, with a variation range from 27.29 to 107.78. The average chlorophyll content of F 1 -generation leaves was 17.46 SPAD, which was close to that of parents (18.02 SPAD). The variation coefficient of needling was 44.98%, the variation in needling length ranged from 2.47 to 37.2 cm.

Normality Test of Leaf and Needling Characteristics
We tested the normality of leaf length, leaf width, leaf area, leaf shape index, specific leaf weight, chlorophyll content, and needling length (Supplementary respectively, which were all higher than the level of significance (p < 0.05). These results suggest the determined characters for F 1 -generation leaves and needlings conformed to a normal distribution (Supplementary Figure S3).

QTL analysis
A total of 30 QTLs distributed across seven LGs were discovered by QTL analysis using our high-density map, with a range of two to seven QTLs for each trait. LOD scores for the seven qualitative traits ranged from 3.04 to 7.08. Twentyseven QTLs were detected for leaf characteristics, including four QTLs for leaf length on LG1 and LG11; five QTLs for leaf width on LG1 and LG11; four QTLs for leaf area on LG9 and LG11; seven QTLs for leaf shape index on LG1 and LG11; five QTLs for specific leaf weight on LG2, LG8, and LG10; and two QTLs for chlorophyll content on LG5. In addition, three QTLs for needling length on LG5 were also detected by QTL. Interestingly, we found a co-localization of marker lm8132 for leaf length, leaf width, and leaf area (LL11.1, LW11.4 and LA11.3), and marker lm5435 for leaf width and leaf area (LW11.2 and LA11.2). The proportion of phenotypic variation ranged from 13.3% to 29.9%, with an average of 17.72%. Information related the detected QTLs for leaf and needling traits are shown in Table 3, Figure 4 and 5 .

Construction of the Map Population
Construction of jujube mapping population is challenging, because jujube is a self-incompatible plant with a small flower size and high seedless percentage (Lu et al., 2005). Therefore, it is difficult to generate an F 1 population. With the development of combination of SSR molecular identification and controlled pollination, a genetic mapping population was successfully constructed Liu et al., 2014a). A previous study revealed that "Dongzao" and "Jinsixiaozao" had a distant genetic relationship (Huang et al., 2015), highlighting the potential for detecting more polymorphic markers in the derived population from the cross between them. Therefore, we constructed the jujube mapping population of "Dongzao" and "Jinsixiaozao 4," expecting to generate a higher density genetic map than that currently available map by "Dongzao"× "Zhongningyuanzao."

Construction of a Genetic Map
SNP markers are an excellent tool for carrying out gene mapping experiments, as they are highly abundant and can be genotyped on a large scale (Slate et al., 2010). GBS utilizes one or multiple restriction enzymes to digest genomic DNA into fragments that can be sequenced on high-throughput platforms (Elshire et al., 2011). Multiplexing samples following the addition of unique barcodes avoids prohibitive sequencing costs (Pootakham et al., 2015). GBS has been used to construct genetic maps for many other species, such as wheat (Poland et al., 2012), bean (Schröder et al., 2016), rice (Leon et al., 2016), clam (Nie et al., 2017), grape (Wang et al., 2012), pear (Wu et al., 2014), peach (Bielenberg et al., 2015), and sweet cherry (Guajardo et al., 2015). It has also been used to construct maps for jujube . In our study, a total of 40,315,919,328 bp of raw sequencing data with 99.9% clean data were generated, and 97.8% of the cleaned data were mapped to unique positions on the reference genome . Our data provide further evidence that GBS is a low cost, high efficiency, and rapid approach for SNP development and map construction. We constructed a high-density genetic map for jujube using GBS technology to develop SNPs based on an F1 population. The integrated genetic linkage map comprised 3,792 SNP markers and spanned 2,167.511 cM, with an average marker interval of 0.358 cM. This map was divided into 12 LGs, which was consistent with the haploid chromosome number. Compared with previously reported linkage maps (Qi et al., 2009;Zhao et al., 2014;Zhang et al., 2016), we achieved a higher map density as well as shorter marker distances (0.358 cM). In addition, the mean interval distance is comparable to the currently reported high-density map with an interval distance of 0.34 cM crossed between "JMS2" and "Xing16" (Zhao et al., 2014). Therefore, our study demonstrates a highly efficient for map construction using the hybridization between two jujube cultivars with a distant genetic background.
High-density genetic linkage maps can facilitate genome assembly, and has been one of the fundamental components of genome sequencing (Gaur et al., 2012). Assisted by the high-density genetic map, we anchored 85% of the assembled scaffolds (310 Mb) of the jujube genome, higher than the genetic map of "Dongzao"× "Zhongningyuanzao" . Comparison of our map to the previously anchored genome revealed a highly collinear relationship between genetic and    physical maps. The inconsistent scaffold placement order could be explained by differences in the cultivars sequenced for the genetic map and genome sequencing. Rearrangements, translocations, gains or losses of DNA segments, and copy number variations have been widely observed in different genotypes of the same species (Swanson-Wagner et al., 2010;Zmienko et al., 2014). Alternatively, markers on different genetic maps could influence the anchoring results. Certainly, improper mapping or errors present in the genome assembly would also contribute to inconsistent placement orders (Soto et al., 2015). We calculated the relationships between the genetic and physical maps in the present study, finding that the ratio of physical/genetic distance was an average of 144.74 Kb/cM. This information will be useful for further research efforts, including gene cloning and genome structural analyses.

Quantitative Trait Locus Identification
Mapping QTLs in jujube is challenging, because jujube is a self-incompatible plant with high heterozygosity and a long growth and breeding cycle. Therefore, it is difficult to generate a population fitting for QTL mapping, such as F 2 and recombinant inbred lines. The number of samples for a crossed population is also smaller than annual crops. With the development of sequencing and its application in marker screening, high-resolution linkage maps have been successfully employed for QTL fine mapping (Chapman et al., 2012). Previously, no studies have included mapping and QTL identification using the F 1 population of "Dongzao" × "Jinsi 4." Shen (2005)   QTLs in the same population using AFLP and RAPD markers for agronomic traits of needling (long needle on trunk, short needle on trunk, long needle on branch, and short needle on branch), explaining 8.2% to 44.2% of the variance. The genetic maps these studies constructed were of lower resolution and unsaturated, and possibly unable to reliably capture QTLs. In contrast, markers in our constructed map spanned 2,167.54 cM with a shorter average marker interval of 0.358 cM, which would facilitate the accurate localization of QTLs. A total of 30 QTLs for seven traits and for the first time, for some important jujube leaf-related traits, such as specific leaf weight and chlorophyll content, have been reliably mapped. These markers are easily located in the corresponding genome sequences and can be used to study candidate genes related to traits located on those chromosomal regions. Therefore, our findings will enhance the efficiency of future gene discovery studies in jujube.
In this study, only QTLs associated with leaf length were located in the same linkage group as those in previous studies, although at different positions. The QTLs affecting leaf length were found on chromosomes LG1 and LG11 both in our study and in previous studies (Shen, 2005). However, there were additional novel QTLs detected in our study that were not previously identified. Our study also mapped QTLs for chlorophyll content for the first time in Chinese jujube. Interestingly, all of these were located on chromosome LG5. The three QTLs affecting needling were also located on chromosome LG5, which was distinct from previous studies (Qi et al., 2009). The 30 QTLs that showed stable and significant effects for phenotype of leaf and needling traits would be valuable resources for candidate gene exploration in the near future. Combined with the whole genome sequence of jujube, genes surrounding these QTLs could be investigated as candidate genes for further screening and verification.

CONCLUSIONS
In this study, a high-density genetic linkage map of Chinese jujube was constructed using GBS. This linkage map contained 12 linkage groups with a low inter-marker distance of 0.358 cM. A total of 27 QTLs associated with leaf traits, and 3 QTLs associated with needling traits were identified. The findings would be helpful in marker-assisted selection studies for jujube. The high-density linkage map will also serve as a foundation for jujube genetic improvement.