Genetic distance of inbred lines of Chinese cabbage and its relationship to heterosis

for variety protection. The number of DNA markers available that can be used to assess the purity of inbred lines is limited in B. rapa. The aim of this study is to use DNA markers to assess the genetic distance between inbred lines to examine early developmental and yield heterosis so as to develop methods for selecting the best parental lines for the production of hybrids. We screened highly polymorphic SSR and CAPS markers to assess the genetic uniformity of inbred lines and characterize their genetic relationship. We examined the early size and yield heterosis in 32 F 1 hybrids of Chinese cabbage. There was a moderate correlation in mid-parent heterosis between leaf size at 21 days after sowing and harvested biomass but not in best-parent heterosis. In contrast, there was no correlation between genetic distance and mid-parent or best-parent heterosis, indicating that genetic distance does not predict the heterosis phenotype.


Introduction
Brassica rapa L. comprises commercially important vegetable crops consumed worldwide such as leafy vegetables including Chinese cabbage (var. pekinensis), pak choi (var. chinensis), and komatsuna (var. perviridis), root vegetables including turnip (var. rapa), and oilseed (var. oleifera). Chinese cabbage forms a head with large pale-green colored leaves and wide white midribs and is an important vegetable in Asia. As the genome sequence of Chinese cabbage has been released and B. rapa is related to the model plant Arabidopsis thaliana (The Brassica rapa Genome Sequencing Project Consortium, 2011), detailed genetic and evolutionary studies have become possible.
In Japan, most commercial cultivars of Chinese cabbage are F 1 hybrids because of their agronomic benefits such as high yield, stress tolerance, disease resistance, and uniform phenotype. Hybrid breeding came from the discovery of heterosis or hybrid vigor, which is defined as the superior performance of hybrid plants over the parents (Crow, 1998). When breeding F 1 hybrid cultivars, breeders develop elite pure lines (inbred lines) as parents for hybrid production. About five to seven generations of selfing and selection based on traits relevant to the breeding objective are required for developing inbred lines as parental candidates. The level of heterosis in crosses of all possible combinations of the inbred lines is used to identify suitable parents for F 1 hybrid generation. An efficient method for predicting hybrid performance from the parental generation is desired, as hybrid production can be expensive, time consuming, and labor intensive. Genetic distance between parental lines might be a good predictor, though the relationship between genetic distance and heterosis is controversial (Barth et al., 2003; Dreisigacker et al., 2005; Flint-Garcia et al., 2009; Geleta et al., 2004; Meyer et al., 2004; Yu et al., 2005). Information concerning the genetic relationships among parental candidate inbred lines is also useful for variety protection.
Plant Gene 5 (2016) 1-7
There are various types of DNA markers such as cleaved amplified polymorphic sequences (CAPS), amplified fragment length polymorphisms (AFLP), randomly amplified polymorphic DNA (RAPD), simple sequence repeats (SSRs), single nucleotide polymorphisms (SNPs), and insertion/deletion polymorphism (InDel) markers. SSR markers have been widely used because of high polymorphism, reproducibility, co-dominant inheritance, and genome-wide coverage. In addition, SSR markers require only small amounts of DNA for PCR, and can be used for high-throughput analysis. SSR markers have been widely used for detecting genetic diversity and making genetic linkage maps, and many SSR markers are available for the genus Brassica (Guo et al., 2014; Hatakeyama et al., 2010; Lowe et al., 2004; Pino Del Carpio et al., 2011; Ramchiary et al., 2011; Suwabe et al., 2002; Suwabe et al., 2006). Sequencing technology enables us to identify SNPs easily, and SNPs are widespread in the B. rapa genome (Metzker, 2010; Rafalski, 2002). SNPs detected by RNA-sequencing (RNA-seq) in coding regions are used for developing gene-based markers (Paritosh et al., 2013). Restriction-site associated DNA sequencing (RAD-seq), in which the region flanking a specific restriction site is sequenced, is useful for developing DNA markers and high-throughput genotyping (Baird et al., 2008).
The purpose of this study is to examine the possibility of using DNA markers to enhance breeding in B. rapa. We focused on the relationship between genetic distance and heterosis. We identified highly polymorphic DNA markers for calculating genetic distance or assessment of homozygosity in inbred lines of Chinese cabbage and examined the relationship between heterosis and genetic distance calculated by DNA markers. The information obtained in this study will be useful for breeding in Brassica.
Seeds were sown on soil and plants were grown in growth chambers under a 16-h/8-h light/dark cycle at 22°C. Young leaves harvested from the F 1 and F 2 seedlings were used for genomic DNA extraction. Total genomic DNA was isolated by the cetyltrimethylammonium bromide (CTAB) method (Murray and Thompson, 1980).

Evaluation of heterosis phenotype
For examining the heterosis phenotype of early developmental stages, plants were grown in plastic dishes containing Murashige and Skoog (MS) agar medium supplemented with 1.0% sucrose (pH 5.7) in growth chambers under a 16-h/8-h light/dark cycle at 22°C. Cotyledons at 6 days after sowing (DAS), and 1st and 2nd leaves at 14 DAS were fixed in a formalin/acetic acid/alcohol solution (ethanol:acetic acid:formalin = 16:1:1). The whole cotyledon or leaf was photographed under a stereoscopic microscope, and sizes were determined with Image-J software (http://rsb.info.nih.gov/ij/). For examining the yield under field conditions, seeds were sown on multi-cell trays on 21st August 2014 and grown in a greenhouse. At 6 DAS, cotyledons were photographed, and the area of the cotyledons was determined with Image-J software. On 3rd September 2014, seedlings were transplanted to the field at Osaki, Miyagi, Japan (38°57′N, 141°00′E). Thirty plants per plot were transplanted and plot size was 13.5 × 0.7 m. Row spacing was 70 cm and planting distance was 40 cm. At 21 DAS, leaf lengths and widths of the first and second largest leaves were measured. On 13th and 14th November 2014, plants were harvested. Statistical comparisons of cotyledon area, leaf size, fresh weight of total biomass and harvested biomass were performed using Student's t-test (p < 0.05).
The ratio of heterosis performance between F 1 and mid parent value (MPV), termed rMPV, was calculated as rMPV = F 1 (mean)/MPV, where MPV is the mean of the two parents. The ratio of heterosis performance between F 1 and better parent value (BPV), termed rBPV, was calculated as rBPV = F 1 (mean)/BPV, where BPV is the mean of the better parent.
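The two ratios defined above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function name and the measurements are hypothetical, and the better parent is assumed to be the parent with the larger trait mean.

```python
from statistics import mean


def heterosis_ratios(f1_values, parent_a_values, parent_b_values):
    """Return (rMPV, rBPV) for one trait of one hybrid.

    Assumes that a larger trait value is "better", so the better parent
    is the one with the larger mean (an assumption of this sketch).
    """
    f1_mean = mean(f1_values)
    parent_a_mean = mean(parent_a_values)
    parent_b_mean = mean(parent_b_values)
    mpv = (parent_a_mean + parent_b_mean) / 2  # mid parent value
    bpv = max(parent_a_mean, parent_b_mean)    # better parent value
    return f1_mean / mpv, f1_mean / bpv


# Hypothetical cotyledon areas (arbitrary units) for one cross.
r_mpv, r_bpv = heterosis_ratios([12.0, 14.0], [8.0, 10.0], [10.0, 12.0])
print(f"rMPV = {r_mpv:.2f}, rBPV = {r_bpv:.2f}")
```

A value above 1 means the hybrid exceeds the mid parent (rMPV) or the better parent (rBPV). Because BPV is never smaller than MPV, rBPV can never exceed rMPV for the same cross.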

Detection of DNA polymorphism with SSR markers
A total of 321 SSR markers from the "BRAS", "BRMS", "BnGMS", "CB", "KBr", "Na", "Ni" and "Ol" series were used to screen for polymorphisms among F 2 individual plants derived from the F 1 hybrid cultivar 'W77' (Table S2). PCR was performed under the following conditions: 1 cycle of 94°C for 3 min, 40 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 1 min, and final extension at 72°C for 3 min. The PCR products were electrophoresed on 10% or 13% polyacrylamide gel using NA-1040 or NA-1118 (NIHON EIDO, Japan). The gel was stained with Gelstar solution (0.1 μl/10 ml; Takara Bio Inc., Japan). Primer sequences used in this study are shown in Table S3.

Detection of DNA polymorphism with CAPS markers
A total of 38 CAPS markers were used for examining the genetic distances among Chinese cabbage inbred lines. PCR was performed under the following conditions: 1 cycle of 94°C for 3 min, 35 cycles of 94°C for 30 s, 55°C for 30 s, and 72°C for 1 min, and final extension at 72°C for 3 min. Amplified DNA digested with Afa I, Hae III, Hha I, Hinf I, Mbo I, or Msp I restriction enzymes was electrophoresed on 13% polyacrylamide gel. The gel was stained with Gelstar solution (0.1 μl/10 ml; Takara Bio Inc., Japan). Primer sequences used in this study are shown in Table S3.

RAD-seq
Genomic DNA was digested using two restriction enzymes, Bgl II and Eco RI. The digested DNA fragments and two adapters (Bgl II adapter and Eco RI adapter) were ligated. The digestion and ligation were performed simultaneously at 37°C for 16 h. The reaction mixture consisted of 20 ng of genomic DNA, 5 units of Bgl II (NEB), 10 units of Eco RI-HF (NEB), 1 × NEB buffer2 (NEB), 1 × BSA (NEB), 0.2 μM Bgl II adapter, 0.2 μM Eco RI adapter, 1 mM ATP (Takara), and 300 units of T4 DNA ligase (Enzymatics). The ligation product was purified with AMPure XP (Beckman Coulter) according to the manufacturer's instructions. One tenth of the purified DNA was used in the PCR enrichment with the KAPA HiFi HS ReadyMix (KAPA Biosystems). Sequences of adapters and primers used in this study are shown in Table S4. Fragments of approximately 350 bp were selected from the PCR product with the E-Gel SizeSelect 2% (Life Technologies). Single-end 50 bp reads and the index sequence of the library were sequenced on a HiSeq2500 (Illumina) with the TruSeq v3 chemistry. The sequence data were preprocessed with Trimmomatic 0.32 (Bolger et al., 2014) with the following parameters: ILLUMINACLIP:TruSeq3-SE.fa:2:30:10 LEADING:19 TRAILING:19 SLIDINGWINDOW:30:20 AVGQUAL:20 MINLEN:51. The preprocessed sequences were analyzed with the Stacks program with default parameters (Catchen et al., 2013). We selected 288 nucleotide positions with fewer than 3 missing genotypes per site among 30 lines.
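The final site-selection step (keeping positions with fewer than 3 missing genotypes across the lines) can be illustrated as follows. This is only a sketch of the filtering logic: it does not reproduce the Stacks output format, and the data structure, the use of None for a missing call, and all names are assumptions made for the example.

```python
def select_sites(genotypes, max_missing=2):
    """Return the sites with at most `max_missing` missing calls.

    `genotypes` maps a site identifier to a list of genotype calls, one
    per line; a missing call is encoded as None (an assumed encoding).
    """
    return {
        site: calls
        for site, calls in genotypes.items()
        if sum(call is None for call in calls) <= max_missing
    }


# Hypothetical toy data: 3 sites scored in 5 lines.
toy_genotypes = {
    "site_1": ["AA", "AA", "GG", "AA", "GG"],  # 0 missing calls
    "site_2": ["CC", None, "TT", None, "CC"],  # 2 missing calls
    "site_3": [None, None, "GG", None, "GG"],  # 3 missing calls
}
kept_sites = select_sites(toy_genotypes)
print(sorted(kept_sites))
```

With the default threshold, a site is kept when it has 0, 1, or 2 missing calls, which corresponds to "fewer than 3".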

Phylogenetic analysis
Sixty-six SSR markers, 38 CAPS markers, and RAD-seq data from 288 positions were used for screening polymorphic loci among 22 inbred lines (Table S2). The software Populations version 1.2.31 (http://bioinformatics.org/~tryphon/populations/) was used to calculate the genetic distance among inbred lines and to construct a neighbor-joining phylogenetic tree with 1000 bootstraps (Nei et al., 1983). The tree was depicted using TreeView version 1.6.6 (http://tree.bio.ed.ac.UK/software/figtree/).
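To make the idea of a marker-based genetic distance concrete, the sketch below computes a pairwise distance between two lines from their genotypes. The text does not name the distance measure used in Populations; this example assumes the DA distance introduced in Nei et al. (1983), DA = 1 - (1/r) Σ_loci Σ_alleles sqrt(x·y), where x and y are the allele frequencies in the two lines and r is the number of loci. The function names and genotypes are hypothetical.

```python
from collections import Counter
from math import sqrt


def allele_freqs(genotype):
    """Allele frequencies within one diploid genotype, e.g. ("A", "B")."""
    counts = Counter(genotype)
    return {allele: n / len(genotype) for allele, n in counts.items()}


def nei_da(line_x, line_y):
    """DA distance between two lines (assumed measure, see lead-in).

    Each line is a list with one genotype (a tuple of two alleles) per
    locus; None marks a missing genotype, and such loci are skipped.
    """
    shared = [(gx, gy) for gx, gy in zip(line_x, line_y)
              if gx is not None and gy is not None]
    if not shared:
        raise ValueError("no locus is scored in both lines")
    total = 0.0
    for gx, gy in shared:
        fx, fy = allele_freqs(gx), allele_freqs(gy)
        total += sum(sqrt(fx[a] * fy[a]) for a in fx.keys() & fy.keys())
    return 1 - total / len(shared)


# Hypothetical homozygous lines scored at 4 loci; they differ at 2 loci.
line_a = [("A", "A"), ("B", "B"), ("C", "C"), ("D", "D")]
line_b = [("A", "A"), ("B", "B"), ("c", "c"), ("d", "d")]
print(nei_da(line_a, line_b))
```

For fully homozygous inbred lines, every allele frequency is 0 or 1, so under this assumption the distance reduces to the fraction of loci at which the two lines carry different alleles.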

Screening of polymorphic DNA markers between the inbred lines of Chinese cabbage
SSR markers can detect nucleotide sequence differences between F 2 individuals and determine segregation of the parental genomes in the F 2 population. To find SSR markers that can efficiently detect polymorphisms among Chinese cabbage genotypes, we screened SSR markers using an F 2 population derived from an F 1 hybrid cultivar. We assessed 321 SSR markers in a commercial F 1 hybrid plant ('W77') and 6 individual F 2 plants of 'W77' (Tables S1, S2). Fifty-nine DNA markers (18%) detected polymorphism among these 7 plants (Table S2). We selected 11 co-dominant DNA markers and confirmed their Mendelian segregation using 96 F 2 plants derived from 'W77' (Table S5).
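One common way to check Mendelian segregation of a co-dominant marker in an F 2 population is a chi-square goodness-of-fit test against the expected 1:2:1 genotype ratio. The text does not state which test was used, so the sketch below is an assumption, and the genotype counts are hypothetical.

```python
# Chi-square critical value for 2 degrees of freedom at alpha = 0.05.
CRITICAL_DF2_P05 = 5.991


def chi_square_121(n_aa, n_ab, n_bb):
    """Chi-square statistic for a 1:2:1 (AA:AB:BB) segregation ratio."""
    total = n_aa + n_ab + n_bb
    expected = (total / 4, total / 2, total / 4)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for one marker scored in 96 F 2 plants.
stat = chi_square_121(22, 50, 24)
fits_mendelian_ratio = stat < CRITICAL_DF2_P05
print(f"chi-square = {stat:.2f}, consistent with 1:2:1: {fits_mendelian_ratio}")
```

A statistic below the critical value means the observed counts do not deviate significantly from 1:2:1 at the 5% level.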
We selected 7 (BRMS007, BRMS026, BRMS027, BRMS040, BRMS163, BRMS226, and BRMS276) of the 59 SSR markers because the BRMS series are highly polymorphic markers in Brassica vegetables (Dr. Satoru Matsumoto, personal communication). To examine whether these 7 SSR markers, which can detect polymorphisms in the parental lines of 'W77', are applicable to other parental combinations of F 1 hybrid cultivars, we assessed 13 F 2 populations derived from commercial F 1 hybrid cultivars. These SSR markers showed polymorphism rates of 38% (5 of 13 F 2 populations) to 92% (12 of 13 F 2 populations) among 6 individual F 2 plants, with BRMS007 and BRMS027 showing the highest polymorphism rate (Table S6). At least 3 of the 7 DNA markers (43%) were found to detect parental polymorphisms of F 1 hybrid cultivars (Table S6).
RNA-sequencing (RNA-seq) analysis in leaves of two Chinese cabbage inbred lines, RJKB-T23 and RJKB-T24, was used for SNP analysis (Shimizu et al., 2014). More than three CAPS markers were developed on each chromosome, and their locations were spread over the reference genome (Fig. S1). In total, 38 co-dominant CAPS markers were developed, and Mendelian segregation was confirmed for 12 of the 38 CAPS markers using 96 F 2 plants derived from a cross between RJKB-T23 and RJKB-T24 (Table S5).
Genetically uniform lines are desired for both laboratory experiments and F 1 hybrid breeding. To confirm the genetic uniformity of these inbred lines, we used 7 DNA markers (BRMS007, BRMS026, BRMS027, BRMS040, BRMS163, BRMS226, and BRMS276) on 8 individual plants of each inbred line. These SSR markers did not show any polymorphism among the 8 plants in any inbred line (data not shown). We further tested the genetic uniformity by RAD-seq. The number of raw sequenced reads per inbred line ranged from 526,146 to 5,386,735 (mean = 1,774,162). Heterozygous genotypes in the 22 inbred lines accounted for less than 1.42% (mean = 0.54%) of the 288 nucleotide positions determined by RAD-seq (Fig. S1, Table S7). These results indicate that these 22 inbred lines are highly homozygous.
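The heterozygosity figure reported for each inbred line is, in essence, the share of scored positions at which the two alleles differ. A minimal sketch of that calculation follows. The genotype encoding and the example data are hypothetical, and excluding missing calls from the denominator is an assumption.

```python
def heterozygous_fraction(calls):
    """Fraction of scored sites that are heterozygous.

    `calls` holds one genotype per site as a tuple of two alleles, or
    None for a missing call; missing calls are excluded (an assumption).
    """
    scored = [call for call in calls if call is not None]
    if not scored:
        raise ValueError("no scored sites")
    n_het = sum(allele_1 != allele_2 for allele_1, allele_2 in scored)
    return n_het / len(scored)


# Hypothetical line: 10 sites, 1 heterozygous call, 2 missing calls.
example_calls = [("A", "A")] * 7 + [("A", "G")] + [None, None]
het = heterozygous_fraction(example_calls)
print(f"{het:.1%} heterozygous")
```

Here 1 of the 8 scored sites is heterozygous, so the line would be reported as 12.5% heterozygous. Values near zero, as in the inbred lines above, indicate a highly homozygous line.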

Genetic distance of inbred lines in Chinese cabbage
The genetic distance (GD) among 22 genotypes of Chinese cabbage inbred lines was examined for use in F 1 hybrid cultivars. The genotypes of one commercial cultivar, Chiifu, 3 turnip DH lines, and 4 komatsuna inbred or DH lines were also examined. We selected 66 SSR markers to calculate the GD, and a total of 163 alleles were identified among all 30 lines. There were 1 to 7 alleles (mean = 5.1 alleles) per marker among the 30 lines. The average GD among Chinese cabbage, turnip, komatsuna, and all lines was 0.38, 0.48, 0.45, and 0.44, respectively (Table S8). The GD between Chinese cabbage and turnip lines, between Chinese cabbage and komatsuna lines, and between turnip and komatsuna lines was 0.53, 0.51, and 0.46, respectively.
Next we calculated the GD using 38 CAPS markers, and a total of 143 alleles were identified in all 30 lines. There were 1 to 8 alleles (mean = 4.8 alleles) per marker among all 30 lines. The average GD among Chinese cabbage, turnip, komatsuna, and all lines was 0.42, 0.58, 0.59, and 0.50, respectively (Table S9). The GD between Chinese cabbage and turnip lines, between Chinese cabbage and komatsuna lines, and between turnip and komatsuna lines was 0.56, 0.63, and 0.59, respectively.
We selected 288 nucleotide positions determined by RAD-seq. The average scores of genetic distance among Chinese cabbage, turnip, komatsuna, and all lines were 0.07, 0.20, 0.16, and 0.10, respectively (Table S10). Scores of genetic distances between Chinese cabbage and turnip lines, between Chinese cabbage and komatsuna lines, and between turnip and komatsuna lines were 0.16, 0.15, and 0.19, respectively.
GD calculated by the three methods was compared. The correlation coefficients between SSR and CAPS, between SSR and RAD-seq, and between CAPS and RAD-seq were 0.65 (p < 0.01), 0.68 (p < 0.01), and 0.73 (p < 0.01), respectively (Fig. 1), indicating a high correlation among the three methods. Using all genotype information among the 22 inbred lines of Chinese cabbage, the genetic distance was lowest between RJKB-T05 and -T14 and highest between RJKB-T07 and -T19. A neighbor-joining dendrogram placed all lines into 5 major groups. Four groups (I, III, IV, and V) comprised Chinese cabbage inbred lines, and turnip and komatsuna lines were in group II (Fig. S2).
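Comparing two marker systems amounts to correlating their pairwise distances over the same pairs of lines. The sketch below shows one way to do this with a Pearson correlation coefficient. The text does not state which coefficient was used, so Pearson is an assumption, and the line names and distance values are hypothetical. Significance testing is not shown.

```python
from itertools import combinations
from math import sqrt


def pairwise_values(matrix, names):
    """Distances for every unordered pair of lines, in a fixed order."""
    return [matrix[a][b] for a, b in combinations(names, 2)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical upper-triangle distance matrices for three lines.
names = ["line_1", "line_2", "line_3"]
ssr_gd = {"line_1": {"line_2": 0.30, "line_3": 0.50},
          "line_2": {"line_3": 0.40}}
caps_gd = {"line_1": {"line_2": 0.35, "line_3": 0.60},
           "line_2": {"line_3": 0.45}}

r = pearson_r(pairwise_values(ssr_gd, names), pairwise_values(caps_gd, names))
print(f"r = {r:.2f}")
```

Because the pairwise distances in a matrix are not independent of one another, a permutation-based procedure such as a Mantel test is often preferred when a p-value is needed.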

Evaluation of heterosis phenotype
With future cultivar breeding in mind, we used 3 new inbred lines (RJKB-T26 to -T28) and 18 of the 22 inbred lines as seed parents and 6 new inbred lines (RJKB-T30 to -T35) as pollen parents, and obtained 32 F 1 hybrids. We examined the cotyledon area at 6 days after sowing (DAS) and the leaf area at 14 DAS in 12 F 1 hybrids grown on MS medium in a growth chamber to confirm that both heterotic and nonheterotic hybrids were included in this sample. In cotyledon area, rMPV (the ratio of heterosis performance between F 1 and mid parent value (MPV)) ranged from 0.94 to 1.96, with an average of 1.34, and rBPV (the ratio of heterosis performance between F 1 and better parent value (BPV)) ranged from 0.90 to 1.90, with an average of 1.20. In the 1st and 2nd leaf areas, rMPV and rBPV ranged from 0.52 to 2.94 and from 0.33 to 2.29, respectively, and the average rMPV and rBPV were 1.41 and 1.16, respectively (Table S11).
Next we examined the cotyledon and leaf size at an early developmental time and at the final stage in the field. The average rMPV and rBPV in the cotyledon area at 6 DAS were 1.48 and 1.27, respectively (Table S12). The average rMPV and rBPV in leaf length of the largest leaves were 1.12 and 1.06, respectively, and the average rMPV and rBPV in leaf width of the largest leaves were 1.13 and 1.06, respectively. The average rMPV and rBPV of the product of leaf length × width in the largest leaves were 1.27 and 1.14, respectively (Table S13). Finally, total biomass (whole above-ground), harvested biomass (in which the outer leaves were stripped for marketing), and height and circumference of the harvested plants were examined (Table S14). The average rMPV and rBPV of the harvested biomass were 1.28 and 1.17, respectively. The plant circumference in F 1 tended to be larger than in the parental lines, while there was little difference in plant height between F 1 and parental lines (Table S14). The F 1 hybrids using RJKB-T35 as a pollen parent tended to show higher rMPV across the 12 traits (Figs. S3, S4, Tables S12-S14).
We compared the rMPV and rBPV of 6 day cotyledon area and the 1st and 2nd leaf sizes between plants grown on MS medium in a growth chamber and plants grown on soil in the greenhouse or field. In these two comparisons, there was a significant correlation between the two experiments except for the rBPV of the 1st and 2nd leaf sizes (cotyledon, rMPV: r = 0.86**, rBPV: r = 0.83**; leaf size, rMPV: r = 0.71*, rBPV: r = 0.39).
We examined the relationships of rMPV and rBPV among 12 traits at three developmental stages. There was no correlation between the rMPV/rBPV-cotyledon area at 6 DAS and rMPV/rBPV-leaf area at 21 DAS or between rMPV/rBPV-cotyledon area at 6 DAS and rMPV/rBPV-harvested biomass (Table 1). There was a moderate correlation between rMPV-leaf area at 21 DAS and rMPV-harvested biomass, while there was no correlation between rBPV-leaf area at 21 DAS and rBPV-harvested biomass (Table 1).
RAD-seq was performed on another 9 inbred lines (RJKB-T26-T28, T30-T35) and combined with the previous RAD-seq data (Table S15) to examine the relationship between rMPV and GD or between rBPV and GD. Of 12 traits, no trait showed a positive correlation between rMPV and GD or between rBPV and GD. The leaf length of the second largest leaf at 21 DAS and plant height of the harvested plants showed a moderate negative correlation between rBPV and GD (Table 2, Fig. 2), indicating that it is difficult to predict the level of heterosis from the genetic distance between parental lines.

Screening of polymorphic SSR markers
We screened SSR markers that can distinguish between the parental alleles of commercial F 1 hybrid cultivars of Chinese cabbage. Fifty-nine of 321 SSR markers detected a polymorphism between the parental alleles of a commercial F 1 hybrid cultivar, 'W77'. We assessed 7 of the 59 SSR markers on 13 parental combinations of F 1 hybrid cultivars, and at least 3 SSR markers detected polymorphisms of parental alleles, suggesting that these highly polymorphic SSR markers can be applied to assess the genetic uniformity of inbred lines. We also developed 38 CAPS markers using the SNP information detected by RNA-seq. As the putative chromosomal positions of these CAPS markers could be predicted, these CAPS markers enable us to avoid any bias of DNA marker positions relative to SSR markers. When breeders develop inbred lines, five to seven generations of selfing are performed, but it is not clear how many generations are sufficient for generating genetically uniform inbred lines. Completion of an inbred line is evaluated by the uniformity of the traits in a field test, but this evaluation can be affected by the environment and is time consuming and expensive. Assessing the genetic uniformity of inbred lines with DNA markers is an alternative, though it is not clear how many markers are sufficient. We tested the same 7 SSR markers on 8 individual offspring of 22 inbred lines, and did not find any polymorphisms among them. Genetic uniformity was also confirmed by RAD-seq analysis, which is a more powerful method of assessing genetic uniformity. The 7 highly polymorphic SSR markers might be useful as a first test, and the use of additional DNA markers would increase the reliability of the data. Thus, the 59 SSR and 38 CAPS markers identified in this study are useful for assessing the genetic uniformity of inbred lines in Chinese cabbage, and the combination with RAD-seq gives a more precise analysis.
As there are some reports of interspecific transferability of SSR and CAPS markers in closely related species (Márquez-Lema et al., 2010), these marker sets may also apply to other varieties of B. rapa species such as turnip or komatsuna or to related species such as cabbage (B. oleracea L.).

Characterization of genetic distance among inbred lines of Chinese cabbage using SSR markers
We generated neighbor-joining trees using 23 Chinese cabbage, 3 turnip, and 4 komatsuna lines. Five major clusters were constructed and 4 of them included only the Chinese cabbage lines. Turnip and komatsuna lines were clustered into the same group, but using more turnip and komatsuna lines may separate the two varieties into different groups. When B. rapa collections such as Chinese cabbage, turnip, and pak choi are examined, Chinese cabbage lines tend to be clustered into the same group with a small number of exceptions (Pino Del Carpio et al., 2011; Takuno et al., 2007; Zhao et al., 2005), consistent with our results.
Understanding the genetic information of breeding resource collections is important for breeding programs, and the genetic relationship among genotypes can help breeders select parental lines and predict the level of heterozygosity of hybrids. As any single DNA marker type may show bias in identifying the genetic distance (Frascaroli and Landi, 2013; Hamblin et al., 2007), we assessed the genetic distance with three types of DNA markers: SSRs are multi-allelic markers, CAPS markers generated from SNP information of RNA-seq are bi-allelic markers in exon regions, and SNPs detected by RAD-seq are bi-allelic markers. The genetic distances measured using the three types of DNA markers were highly correlated (correlation coefficients of 0.65 or greater). Though the ascertainment bias of SNP markers is controversial (Hamblin et al., 2007), the three types of DNA markers together may predict the genetic distances among inbred lines.

Estimation of yield heterosis from genetic distance of parental lines or the early developmental growth
Methods for predicting hybrid performance from the parental generations would benefit breeding programs. Heterosis has been suggested to be associated with the genetic distance between parental lines of F 1 hybrids (Biton et al., 2012; Cheres et al., 2000; Flint-Garcia et al., 2009; Godshalk et al., 1990; Hua et al., 2002; Jagosz, 2011). In this study, we examined the relationship between GD calculated by three types of DNA markers and the level of mid parent and better parent heterosis in traits such as cotyledon area at 6 DAS, leaf length × width of the largest leaf at 21 DAS, and harvested biomass. No positive correlation was observed between GD and rMPV/rBPV in any parameter examined, indicating that it is difficult to predict the hybrid performance from the genetic distance of parental lines.
In our previous study in A. thaliana, the heterosis phenotype was obvious in early developmental stages and heterosis was maintained and enhanced at the later stages. We suggested that an early developmental growth advantage is important for displaying later heterosis (Fujimoto et al., 2012). Early developmental heterosis was observed in crops such as rice, maize, and wheat, and the level of heterosis is trait-dependent (Flint-Garcia et al., 2009; He et al., 2010; Li et al., 2014; Springer and Stupar, 2007; Xing et al., 2014). We considered that predicting yield heterosis from the early developmental stages could save time and labor. We examined the heterosis level at three developmental time points: cotyledon (6 DAS), seedling (21 DAS), and harvesting stages. In rMPV, we found a moderate correlation between leaf size of 21-day seedlings and harvested biomass. In many crops, seed production is important for yield, and early seedling development does not directly correlate with yield-related traits. However, early seedling development is more closely related to yield in leafy vegetables such as Chinese cabbage, suggesting that assessing the heterosis level at seedling stages will be useful for the first screening of parental combinations of F 1 hybrid cultivars during breeding. As heterosis of yield is a difficult trait to measure, multiple trials to examine the relationship between early and late stage heterosis will be required.