Efficient anchoring of alien chromosome segments introgressed into bread wheat by new Leymus racemosus genome-based markers

The tertiary gene pool of bread wheat, to which Leymus racemosus belongs, has remained underutilized due to the current limited genomic resources of the species that constitute it. Continuous enrichment of public databases with useful information regarding these species is, therefore, needed to provide insights on their genome structures and aid successful utilization of their genes to develop improved wheat cultivars for effective management of environmental stresses. We generated de novo DNA and mRNA sequence information of L. racemosus and developed 110 polymorphic PCR-based markers from the data, and to complement the PCR markers, DArT-seq genotyping was applied to develop additional 9990 SNP markers. Approximately 52% of all the markers enabled us to clearly genotype 22 wheat-L. racemosus chromosome introgression lines, and L. racemosus chromosome-specific markers were highly efficient in detailed characterization of the translocation and recombination lines analyzed. A further analysis revealed remarkable transferability of the PCR markers to three other important Triticeae perennial species: L. mollis, Psathyrostachys huashanica and Elymus ciliaris, indicating their suitability for characterizing wheat-alien chromosome introgressions carrying chromosomes of these genomes. The efficiency of the markers in characterizing wheat-L. racemosus chromosome introgression lines proves their reliability, and their high transferability further broadens their scope of application. This is the first report on sequencing and development of markers from L. racemosus genome and the application of DArT-seq to develop markers from a perennial wild relative of wheat, marking a paradigm shift from the seeming concentration of the technology on cultivated species. Integration of these markers with appropriate cytogenetic methods would accelerate development and characterization of wheat-alien chromosome introgression lines.

Mining of useful genes from wild genetic resources, especially the tertiary gene pool, through distant hybridization to broaden the genetic base of elite bread wheat cultivars is expected to continue, given the current trend in global climate change and the accompanying emergence of new strains of pests and pathogens. This strategy is, however, largely hindered by linkage drag and the low success rate of distant hybridization, both of which can be effectively managed in this age of next-generation sequencing (NGS) technologies and improved interspecific hybridization techniques. While in vitro culture techniques (for example, embryo rescue) and induction of homoeologous chromosome recombination have been employed to achieve successful distant hybridization and useful gene recombination, integrating appropriate molecular markers into breeding programs for marker-assisted backcrossing can greatly assist in selecting against deleterious genes, hence fast-tracking the process. Unfortunately, unlike their cultivated counterparts, whose genomes have been extensively analyzed, DNA sequence information and molecular markers for these wild species are limited or, in some cases, completely absent, resulting in a poor understanding of their genome structures and delays in cultivar development and adequate characterization. This dearth of information, which our research sought to address, largely accounts for the current underutilization of the rich diversity readily available for wheat breeding.
Plant breeders, in various attempts to deal with this situation, have had to resort to available expressed sequence tags (ESTs) from a few perennial grasses and heterologous markers from annual cereals, for example barley, to aid their work [18,19]. The outcomes, although informative, are hardly satisfactory because of increased species divergence arising from mutations and other genetic events during speciation. To effectively harness useful genes from these important genetic resources, their genomic information base should be continually enriched to at least include data on outstanding species that can serve as representatives for their evolutionarily close relatives. Efforts to achieve this much-needed expansion have generated enormous molecular cytogenetic data, EST-SSR markers, EST linkage maps and other useful information [12,19-23]. However, molecular markers developed from whole genome sequence information of these species are still lacking, making it difficult to adequately anchor alien chromosome segments in wheat-alien chromosome introgression lines (CILs). Also, DArT-seq genotyping has yet to be widely applied to study diversity and develop molecular markers in wild Triticeae species.
In this research, therefore, we applied PCR and DArT-seq to develop 8632 polymorphic markers [110 PCR-based and 8522 DArT-seq (SNPs)] from the genome of Leymus racemosus. We also developed an additional 1468 CIL-based SNPs, which are polymorphisms resulting from the interaction between the alien chromosomes and the background/carrier genome. We further applied 5196 (~52%) of all the markers to genotype 22 wheat-L. racemosus CILs and analyzed the transferability of the PCR-based markers to other Triticeae species, with emphasis on donor species whose genomes have not been sequenced. This is the first research reporting on the development of molecular markers from L. racemosus whole genome and RNA-seq data, and on the application of the wheat DArT-seq platform to develop DNA markers from a perennial wild relative of wheat.

Development of L. racemosus polymorphic markers
From a total of 294 primer sets screened by PCR, 164 sets (~56%) amplified the L. racemosus genome. Of the amplified markers, 110 (~67%) were polymorphic in wheat, showing either absence of bands or a difference in band size in wheat (Fig. 1a; Table 1). Six of the polymorphic markers showed size polymorphism, while 104 showed presence/absence polymorphism. Also, out of 11,570 DArT-seq SNP markers filtered based on high call and reproducibility rates, 8522 (~74%) were polymorphic in wheat (absence of SNP alleles in wheat): 8430 SNPs were absent in our wheat cultivar, CS, while 92 were present but showed presence of both reference and SNP alleles in L. racemosus (Fig. 1c; Table 1). These 92 markers form part of the polymorphisms we observed between our CS and the reference CS genome sequence on the DArT platform. Taken together, we developed a total of 8632 polymorphic markers from the L. racemosus genome.
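The percentages and totals reported in this section can be re-derived with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable and function names are illustrative and do not come from the study's own analysis scripts.

```python
# Counts as reported in the text; all names here are illustrative only.
primer_sets_screened = 294
primer_sets_amplified = 164
pcr_polymorphic = 110          # 6 size + 104 presence/absence polymorphisms
dart_snps_filtered = 11_570
dart_snps_polymorphic = 8_522  # 8430 absent in CS + 92 present in CS


def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


amplified_pct = percent(primer_sets_amplified, primer_sets_screened)
pcr_polymorphic_pct = percent(pcr_polymorphic, primer_sets_amplified)
dart_polymorphic_pct = percent(dart_snps_polymorphic, dart_snps_filtered)
total_polymorphic = pcr_polymorphic + dart_snps_polymorphic

print(amplified_pct, pcr_polymorphic_pct, dart_polymorphic_pct, total_polymorphic)
```

Running the sketch reproduces the rounded values quoted above (~56%, ~67%, ~74%) and the total of 8632 polymorphic markers.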
Characterization of wheat-L. racemosus chromosome introgression lines with L. racemosus markers
About 65% (72 markers) of the polymorphic PCR markers amplified L. racemosus chromosomes in nine wheat-L. racemosus chromosome addition lines, while [...] (Fig. 1b and d). It should be noted here that we used only SNP markers from the DArT-seq data in our analysis, as silico DArT data were less informative for analyzing the required polymorphism. This is because silico DArT data are binary (dominant), making it impossible to identify polymorphism as codominance (in our case, presence of both reference and SNP alleles). We relied mostly on this codominant signal to genotype the chromosome introgression lines, since they carry genome representations of both wheat (alien chromosome recipient) and L. racemosus (alien chromosome donor).
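The difference between codominant SNP scoring and dominant (binary) scoring can be illustrated with a minimal sketch. It assumes a simplified encoding in which each SNP is described by two presence/absence flags, one per allele; the `AlleleCall` type and the function names are hypothetical and are not part of any DArT software.

```python
from typing import NamedTuple


class AlleleCall(NamedTuple):
    """Hypothetical two-flag encoding of one SNP in one sample."""
    ref: bool  # reference allele observed
    snp: bool  # SNP (alternative) allele observed


def genotype(call: AlleleCall) -> str:
    """Codominant scoring: distinguishes all four allele combinations."""
    if call.ref and call.snp:
        return "both"       # expected when wheat and alien alleles coexist
    if call.ref:
        return "reference"  # wheat-like call
    if call.snp:
        return "snp"        # donor-like call
    return "absent"         # marker not detected in this sample


def to_dominant(call: AlleleCall) -> bool:
    """Dominant scoring: only records whether the marker was detected."""
    return call.ref or call.snp


def detects_alien(line: AlleleCall, wheat: AlleleCall) -> bool:
    """A marker flags alien chromatin if the line shows a SNP allele wheat lacks."""
    return line.snp and not wheat.snp


wheat = AlleleCall(ref=True, snp=False)
addition_line = AlleleCall(ref=True, snp=True)
print(genotype(wheat), genotype(addition_line), detects_alien(addition_line, wheat))
```

Under dominant scoring both samples above collapse to the same value (marker present), whereas the codominant call separates the wheat-only genotype from the genotype carrying both alleles.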

Development of L. racemosus chromosome-specific markers
We developed a total of 3551 chromosome-specific markers for the nine L. racemosus chromosomes in wheat genetic background, and the number of specific markers per chromosome ranged from two in Lr#E to 533 in Lr#L (Table 1; Additional files 1, 2, 3, 4, 5, 6, 7, 8 and 9). The large number of markers on each chromosome enabled us to reliably differentiate the nine wheat-L. racemosus chromosome addition lines analyzed (Table 1).
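One simple way to think about chromosome-specific markers is as markers detected in exactly one of the nine addition lines. The sketch below illustrates that idea on toy data; the marker identifiers are invented, and the selection rule is an assumption for illustration rather than the exact criterion used in the study.

```python
from collections import Counter


def chromosome_specific(markers_by_line: dict[str, set[str]]) -> dict[str, set[str]]:
    """Keep, for each addition line, the markers found in no other line."""
    # Count in how many lines each marker was detected.
    line_counts = Counter(m for markers in markers_by_line.values() for m in markers)
    return {
        line: {m for m in markers if line_counts[m] == 1}
        for line, markers in markers_by_line.items()
    }


# Toy data: "u1" is detected in every line, so it is not chromosome-specific.
toy_markers = {
    "Lr#A": {"m1", "m2", "u1"},
    "Lr#L": {"m3", "u1"},
    "Lr#N": {"m4", "m5", "u1"},
}
specific = chromosome_specific(toy_markers)
print({line: sorted(markers) for line, markers in specific.items()})
```

A marker such as the toy `u1`, detected in all lines, would instead behave like the universal markers described later in the text.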

Confirmation of homoeologous groups of L. racemosus chromosomes in wheat background
To further assess the validity of our chromosome-specific SNP markers, we exploited the correspondence of L. racemosus chromosome-specific markers with the homoeologous groups of CS chromosomes to determine the most probable homoeologous group (HG) of each L. racemosus chromosome in the chromosome addition lines (Table 2). The results revealed that the alien chromosomes are spread between HG 2 and HG 7: Lr#A and Lr#L are in HG 2, Lr#H and Lr#N are in HG 3, while Lr#F, Lr#I, Lr#K and Lr#J are in HG 4, 5, 6 and 7, respectively.
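The assignment of a homoeologous group from marker correspondence can be sketched as a vote across the CS chromosomes to which the chromosome-specific markers align. The decision rule below (plurality vote, with ties and empty inputs reported as not determined) is an assumption made for illustration; the text does not spell out the exact rule, and the input chromosome labels are toy values.

```python
from collections import Counter
from typing import Optional


def homoeologous_group(cs_hits: list[str]) -> Optional[int]:
    """Return the most frequent homoeologous group, or None if not determined.

    cs_hits holds CS chromosome labels such as "2A", "2B" or "3D"; the
    leading digit of each label is taken as the homoeologous group.
    """
    votes = Counter(int(hit[0]) for hit in cs_hits)
    if not votes:
        return None  # no informative markers
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie between the two best-supported groups
    return ranked[0][0]


print(homoeologous_group(["2A", "2B", "2D", "5A"]))
print(homoeologous_group([]))
```

With very few markers, as reported for Lr#E, a rule of this kind would naturally return a "not determined" outcome.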
Detailed characterization of chromosomes I, J, N and their respective translocation arms
As shown in Table 3, we successfully allocated I-, J- and N-specific markers to their respective arms using their respective translocation lines. A detailed analysis of the homology between each of the three chromosomes and their respective translocated arms gave a clearer picture of the structures of the translocation lines. For chromosome I, the markers were adequately allocated to the short (S) and long (L) arm translocations, revealing the proportions of chromosome I markers that differentiated each of the translocated arms and eight markers located on a segment of chromosome I that may not have been transmitted during the production of the translocation lines (Fig. 2a). However, we observed a few markers specific to the translocation lines, which are absent in the I-addition line. If no genotyping error is assumed, these markers would represent polymorphisms that may have arisen from the interactions between the translocated arms and the CS genome.
Most J-chromosome markers were found to be present only in the JS translocation, about half of which were co-located on the JL translocation (Fig. 2b). This result obviously indicates that what we hitherto regarded as JL translocation is a segment of JS translocation. The 10 unique markers (Fig. 2b) each present in the two translocation lines may have resulted from changes in each genetic background or small chromosomal rearrangements during the various production processes. As observed in chromosome I, 14 markers identified a segment of chromosome J which may have not been transmitted to the translocation lines.
Chromosome N and its translocated arms presented a scenario similar to chromosome I. Both the NS and NL arm translocations of chromosome N showed well separated markers (Fig. 2c). However, 27 markers were found to be specific to NL and nine specific to NS translocation lines (Fig. 2c), indicating unique polymorphisms which may have been acquired from interactions between the background and the translocated arms, as observed for chromosomes I and J. Also, the 13 markers located on the whole N-addition line, which are absent in the translocated arms, suggest that the NS and NL translocations lost the region of chromosome N identified by these markers (Fig. 2c).
Table 2 Determination of homoeologous groups of L. racemosus chromosomes in wheat genetic background using chromosome-specific DArT-seq SNP markers. Lr: Leymus racemosus; HG: homoeologous group; ND: not determined; a: [12]; b: [9]. The bold numbers represent the number of markers indicating the homoeologous groups of L. racemosus chromosomes.
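The allocation of markers to chromosome arms described in this section for chromosomes I, J and N amounts to comparing three marker sets: those detected in the addition line, in the short-arm translocation line and in the long-arm translocation line. The following sketch shows one way to express that comparison with set operations; the category names and toy marker identifiers are hypothetical.

```python
def partition_markers(addition: set[str], short_arm: set[str], long_arm: set[str]) -> dict[str, set[str]]:
    """Split markers into mutually exclusive categories by the lines they occur in."""
    return {
        # Addition-line markers found in only one translocation line
        "short_only": (addition & short_arm) - long_arm,
        "long_only": (addition & long_arm) - short_arm,
        # Addition-line markers shared by both translocation lines
        "both_arms": addition & short_arm & long_arm,
        # Addition-line markers missing from both translocation lines
        "addition_only": addition - short_arm - long_arm,
        # Markers seen in a translocation line but not in the addition line
        "translocation_unique": (short_arm | long_arm) - addition,
    }


# Toy data for illustration only
addition = {"a1", "a2", "a3", "a4", "a5"}
short_arm = {"a1", "a2", "a3", "t1"}
long_arm = {"a3", "a4", "t2"}
groups = partition_markers(addition, short_arm, long_arm)
print({name: sorted(markers) for name, markers in groups.items()})
```

In these terms, a large `both_arms` category, as observed for chromosome J, suggests that one translocation is a segment of the other, while `addition_only` markers point to a region not transmitted to either translocation line.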

Analysis of recombination positions of N-recombination lines
The N-specific markers helped us determine the size of the recombinant fragments and map their locations on Lr#N and the corresponding CS chromosomes, revealing the probable fraction of the CS chromosome replaced in each recombination line (Table 4; Fig. 3). N recombinant fragments 2, 3, 5 and 6 were found to be located in the short arm, while recombinant fragments 4 and 7 were found in the long arm of each of the lines. Although the two markers that specified recombinant fragment 1 were traced to NL, we are not certain about the arm location of this fragment; hence, we intend to clarify this in a future report. Recombination lines 6 and 7 were observed to have the largest fragments, with all the markers in the short arm translocation represented in recombination line 6, and all except two markers in the long arm translocation represented in recombination line 7 (Table 4; Fig. 3). Other lines were found to have relatively small fragments, which can best be described as bins of different sizes represented in recombination lines 6 or 7. With only two markers recorded for recombination line 1 (Table 4), it would appear as though there was no recombination event, although low recombination rates between wheat and alien chromosomes are not unusual [24]. However, molecular cytogenetic characterization clearly differentiated the lines (Additional file 10), indicating the importance of integrated characterization of wheat-alien CILs.
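Graphical genotyping of a recombination line can be thought of as locating runs of consecutive alien-positive markers along an ordered marker list. The sketch below assumes that the N-specific markers have already been ordered along the chromosome (the ordering itself is not shown) and that each marker has a simple present/absent call in the line; both assumptions, and the function names, are illustrative.

```python
from itertools import groupby


def alien_segments(calls: list[bool]) -> list[tuple[int, int]]:
    """Return (start, end) index pairs, inclusive, for each run of positive calls."""
    segments = []
    index = 0
    for present, run in groupby(calls):
        length = len(list(run))
        if present:
            segments.append((index, index + length - 1))
        index += length
    return segments


def alien_fraction(calls: list[bool]) -> float:
    """Fraction of ordered markers that detect the alien chromosome."""
    return sum(calls) / len(calls) if calls else 0.0


# Toy ordered calls for one recombination line
line_calls = [False, True, True, False, True]
print(alien_segments(line_calls), alien_fraction(line_calls))
```

A line with a single short run, such as one supported by only two markers, would yield one small segment and a low fraction, which is why cytogenetic confirmation remains useful in such cases.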

Leymus racemosus Chromosomes' universal markers
Two of the polymorphic markers, 21_s46518 and 333_s46518 (developed from the same DNA sequence scaffold) identified all the L. racemosus chromosomes in wheat ( Fig. 4a; Additional file 11). On sequencing PCR products generated with one of these markers, and conducting BLAST search with the official NCBI search tool (BLASTN, megablast), we observed that 26% and 16% of sequences of

Additional unique CILs-based SNPs
DArT-seq data further revealed an additional 1468 unique SNPs in the nine wheat-L. racemosus addition lines, absent in the two parents. One hundred and ninety-seven of these SNPs are common to the lines, while 1271 are line-specific, with a range of 38-355 specific markers per line (Table 5). Like the L. racemosus chromosome-specific markers, the line-specific markers also guided us in differentiating the nine addition lines. These additional SNPs account for polymorphisms acquired from the interactions between the added chromosomes and the background (CS genome), and their effects may be of agronomic significance.
To assess the applicability of L. racemosus markers in studying the genomes of other related species, we analyzed the transferability of the markers using genomic DNA from 11 other Triticeae species, alongside L. racemosus. This analysis, which utilized 164 prescreened L. racemosus PCR-based markers, showed that 75% of the markers were transferable, while the remaining 25% were L. racemosus genome-specific. Amplification frequencies were notably higher in three other important perennial Triticeae species (L. mollis, Psathyrostachys huashanica and Elymus ciliaris) than in wheat and the other species studied (Fig. 5a-d; Table 6). More importantly, the amplified markers in each of these species were found to be reasonably polymorphic in wheat (Table 6), indicating their suitability for genotyping wheat-alien CILs carrying chromosomes from these species. Interestingly, the two universal markers which identified all L. racemosus chromosomes in wheat genetic background were found to be Leymus-specific, as they amplified only the two Leymus species out of the 12 Triticeae species analyzed, revealing size polymorphism between the two Leymus genomes (Fig. 5b). 
These markers can, therefore, be applied to separate Leymus genomes from genomes of other species in the same tribe, and their (Leymus) chromosomes, if introgressed into wheat, can easily be sorted out in one PCR. We also observed informative co-amplification between the two Leymus species and Psathyrostachys huashanica (Fig. 5c), and a phylogenetic analysis using 123 markers co-amplified among the 12 Triticeae species (Fig. 6) revealed a close evolutionary relationship between the three species, which agrees with reports asserting that Leymus species are segmental polyploids with variant N-genomes from genus Psathyrostachys [13,25]. However, one highly conserved marker sequence amplified all the species, revealing size polymorphism among them (Fig. 5d).
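The transferability categories used in this section can be expressed as a small classification over an amplification table. The sketch below is a minimal illustration on toy data: the marker names are invented, and the rules (a marker is "transferable" if it amplifies in at least one species other than the donor, and "genus-specific" if it amplifies in exactly the listed species of one genus) are assumptions made for the example.

```python
def classify_transferability(amplified_in: dict[str, set[str]], donor: str) -> dict[str, set[str]]:
    """Separate markers amplifying beyond the donor from donor-specific ones."""
    result: dict[str, set[str]] = {"transferable": set(), "donor_specific": set()}
    for marker, species in amplified_in.items():
        key = "transferable" if species - {donor} else "donor_specific"
        result[key].add(marker)
    return result


def genus_specific(amplified_in: dict[str, set[str]], genus_species: set[str]) -> set[str]:
    """Markers amplifying in exactly the given species of one genus and nowhere else."""
    return {m for m, species in amplified_in.items() if species == genus_species}


# Toy amplification table (hypothetical marker names)
toy = {
    "mk1": {"L. racemosus", "L. mollis"},
    "mk2": {"L. racemosus"},
    "mk3": {"L. racemosus", "L. mollis", "P. huashanica", "T. aestivum"},
    "mk4": {"L. racemosus", "E. ciliaris"},
}
classes = classify_transferability(toy, donor="L. racemosus")
leymus_only = genus_specific(toy, {"L. racemosus", "L. mollis"})
print(sorted(classes["transferable"]), sorted(classes["donor_specific"]), sorted(leymus_only))

# Consistency check on the reported figures: 123 of 164 markers is 75%.
reported_pct = round(100 * 123 / 164)
```

The last line only checks that the 123 co-amplified markers used for the phylogenetic analysis are numerically consistent with the reported 75% of 164 markers; whether the two figures refer to exactly the same marker set is an assumption.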

Analysis of polymorphism based on DNA and RNA markers
In a bid to compare the performance of markers developed from DNA and RNA-seq, we analyzed polymorphism based on the two sources of markers. As expected, markers from genomic sequence were more polymorphic than those from RNA-seq, indicating that the polymorphisms between hexaploid wheat and other Triticeae species studied are more traceable to the variations in the repetitive sequences of the genomes (Table 6). However, the polymorphisms recorded from the RNA-seq/gene markers (Table 6), which account for variations in the genic regions, make the two approaches equally informative.

Discussion
Fast-tracking introgression breeding and wheat-alien characterization with appropriate molecular markers
Introgressive hybridization has long been identified as a means to combat the age-long genetic erosion of wheat and is currently inevitable, but the achievements are still not satisfactory, mostly because of poor understanding of the genomics of important wild relatives of wheat [3,26]. Therefore, to create the necessary platform for successful breeding of hexaploid wheat through the intermediary of its tertiary gene pool, mobilization of research resources towards genome analysis and development of molecular markers from notable Triticeae perennial species must be intensified. The projected 60% increase in wheat demand by 2050 [27], which cannot be met solely through the cultivation of high-yielding elite wheat cultivars, most of which are poorly adapted to harsh growing conditions, further justifies our opinion. From the standpoint of our results (Tables 1-4; Fig. 3), it is evident that the availability of adequate molecular markers from the genomes of potential gene donors can accelerate introgression breeding, as they can be reliably deployed to genotype wheat-alien CILs and tackle linkage drag, where necessary. The large numbers of chromosome-specific markers developed for each of the chromosome addition lines (except Lr#E) are, therefore, expected to aid breeders in conducting more stringent screening and selection in their efforts to develop cultivars with only the necessary chromosome segments to satisfy specific breeding goals within a reasonable time frame. At the moment, we attribute the few chromosome-specific markers developed for Lr#E to high homology between the wheat genome and the chromosome, having observed about 30% monomorphic markers between L. racemosus and T. aestivum genomes (Fig. 1a and c). However, there may be some form of genetic instability, cytochimerism for instance, which we intend to ascertain in the future. 
We are certain that alien chromosome loss is not the reason for the strange result, as we used the original stocks of all the addition lines in the TACBOW gene bank, which had been characterized by some of our co-authors [9,28], to confirm our results. The difference between the total number of markers developed and the number of markers that identified the aliens in wheat background (Table 1; Fig. 1b and d) accounts for the difference between the whole set of chromosomes in the L. racemosus genome and the number of L. racemosus chromosomes we studied in the nine genotypes. Another factor likely to contribute to this difference is the possibility of losing some segments of the nine chromosomes during production of the lines.
Noteworthy is that the higher proportion of PCR markers that identified aliens in wheat as compared to SNP markers (Fig. 1b and d) was expected because SNP markers are sequence-based, which are theoretically more stringent than PCR-based markers. However, the proportions of both the PCR-based and SNP markers that amplified the nine chromosome addition lines are deemed reasonably high, given that L. racemosus has 14 pairs of chromosomes, 6 of which are obviously not represented in the nine genotypes analyzed. Also, the effectiveness of the chromosome-specific SNP markers in determining the homoeologous groups of L. racemosus chromosomes further validates the importance of inclusion of appropriate molecular markers in wheat-alien cultivar development and screening. The homoeologous groups of the chromosomes determined by our analysis are highly consistent with previous reports [9,12]. Interestingly, we could clarify the homoeologous groups of chromosomes Lr#J (HG 7) and Lr#N (HG 3), which were previously not reported with certainty (Table 2). We presume that the chromosomes found to be in the same groups [Lr#A and Lr#L (HG 2); Lr#H and Lr#N (HG 3)] are homoeologous chromosomes from the two genomes that constitute L. racemosus. Our inability to determine the HG of Lr#E chromosome, as was the case with earlier reports cited here, is another pointer that the line may be genetically unstable.
In our opinion, this approach of utilizing chromosome-specific genome-based molecular markers to characterize introgression lines is faster than in situ hybridization procedures (FISH and GISH), which are traditionally employed for this purpose. Although in situ hybridization methods have proven to be reliable in characterizing CILs, they are lengthy, laborious and not without limitations. An example of such limitations was reported when two genomic in situ hybridization (GISH) procedures failed to reveal some distally located breakpoints in wheat-rye recombinant genotypes [29]. Beyond being easier, characterization of wheat-alien CILs by molecular markers, as proven by our results, brought to light detailed chromosome segments rearrangements, some of which can be likened to "zebra" chromosome  [30,31]. Additionally, the unique line-specific polymorphisms revealed by DArT-seq analysis, absent in either of the parents, would not be captured by in situ hybridization methods, as hybridization probes are usually designed to track alien segments, not polymorphisms which may arise from genome interactions. Nevertheless, we are not suggesting the replacement of hybridization procedures with molecular markers. Rather, our strong recommendation is the integration of efficient DNA markers with in situ hybridization strategies in wheat-alien breeding programs to accelerate the process and improve outcomes.

Possibility of genome or alien modification in wheat-alien translocations
The interactions between alien chromosomes and carrier genomes need to be properly dissected. Analyses of the chromosome addition and translocation lines in our study indicated the possibility of genetic modification of either the introgressed chromosomes, background (wheat genome) or both. These modifications, capable of generating additional polymorphisms, as observed in our study (Table 5; Fig. 2), may result from small chromosomal rearrangements, activation of transposable elements or any other interactive genetic event between alien materials with the genome of wheat [31,32]. Also, by graphically genotyping the recombination lines, we were able to uncover different patterns of recombination events in each line (Fig. 3). This observation indicates that the same chromosome (Lr#N) interacted with wheat genome in different ways to produce different genotypes, which are likely to result in diversity in agronomic traits. Of more importance is the potential effect of these interactions on the overall performance of the genotypes [33], necessitating detailed studies to clarify the underlying mechanism of such genetic events and their agronomic implications. Such studies would be greatly enhanced by the availability of adequate molecular markers to track aliens and unique polymorphisms which may result from genome interactions.

Association of L. racemosus chromosomes' universal markers with CACTA-family transposons
The universal markers we developed are particularly valuable since they can be applied to easily track the transmission of alien chromosomes over generations, given the possibility of alien chromosome elimination in the course of cultivar multiplication and maintenance [34-36]. Following the alignment of the sequence of one of these markers to a CACTA-family transposon in L. perenne, we speculate that this marker sequence is part of a possible CACTA-family transposon in Leymus. CACTA-family transposons, one of the most abundant superfamilies of class II transposons exclusively found in plants, have been reported to play significant roles in genome variation in Triticeae and other plants [37-41]. Although the specific role of this sequence in Leymus species is unknown at the moment, it is likely to have amplified after differentiation of the ancestral species of Leymus.
Leymus chromosome N-specific markers and biological nitrification inhibition (BNI) activity
Biological nitrification inhibition (BNI) activity in L. racemosus, a highly desirable trait with agronomic and environmental consequences, had previously been reported to be chiefly controlled by chromosome N [42]. The N-specific markers are, therefore, of particularly high value, as they can easily be applied to identify genotypes with BNI activity, avoiding the cumbersome and expensive process of root exudate analysis [43], which requires expertise that an ordinary plant breeder may not possess. Interestingly, only DNA sequences of the PCR products of L. racemosus and wheat-Lr#N generated with one of our universal markers aligned to the CACTA-family transposon in L. perenne, one of the forage grasses reported to have endogenous BNI activity [42,43]. However, whether BNI activity is linked with actions of mobile genetic elements, transposons in this case, cannot be ascertained at the moment.

Transferability of markers between L. racemosus and other Triticeae perennials
Sequencing of all the potential gene sources for wheat breeding is not expected in the near future. Hence, transferability of markers between useful species of this gene pool, as a compensatory approach to analysis, is highly desired [44-47]. Our analysis has clearly shown that markers from L. racemosus can be successfully transferred to L. mollis, P. huashanica and E. ciliaris, three other important species in the tribe Triticeae (Table 6). Also, the transferred markers were found to be reasonably polymorphic in wheat (Table 6), suggesting their suitability for characterizing wheat genotypes with alien chromosomes from the three genomes. The genera of these species, because of their recognition as profitable forage grasses and gene mines for hexaploid wheat breeding, have received fair research attention [13,14,23,25,48-50]. However, their genomes have not yet been sequenced, leaving breeders with the option of transferring markers from evolutionarily closely related species to analyze their genomes and wheat genotypes carrying their chromosomes.

Conclusion
The molecular markers developed in this study are expected to play valuable roles in hexaploid wheat breeding, particularly in the process of developing and characterizing wheat-alien CILs. Our success in applying them to unequivocally genotype 22 wheat-L. racemosus CILs validates their usefulness. Specifically, the universal and N-specific markers are of great breeding importance. While the universal markers can readily be applied to monitor and confirm alien presence and transmission, N-specific markers can find application in mapping of nucleotide sequences associated with biological nitrification inhibition (BNI) activity. The additional SNPs found on the nine chromosome addition lines would be especially useful in identifying and analyzing unique polymorphisms which may result from alien interaction with the background, while the L. racemosus markers not mapped on any of the nine chromosomes reported here would aid production of other wheat-L. racemosus CILs carrying other chromosomes of L. racemosus. Also, the remarkable transferability of the PCR-based markers to three other notable perennial Triticeae species is an added advantage, as they can be deployed to characterize wheat-alien CILs bearing chromosomes from these genomes. Since this is the first report on the development of molecular markers from this genome, and given the efficiency of the markers shown in our results, we recommend wide application of these markers in bread wheat breeding programs. Integrating the markers with in situ hybridization strategies is expected to shorten the duration of cultivar development and produce more reliable results.

Plant materials
We analyzed 22 wheat-L. racemosus CILs (Table 7), three cultivated and nine wild Triticeae species (Table 8) [...] were ready for DNA extraction. About two weeks after sowing, leaf samples were collected from each plant, immediately frozen in liquid nitrogen and stored at -80 °C until needed for DNA extraction. A cetyl trimethylammonium bromide (CTAB) miniprep extraction protocol, with some modifications, was followed to extract and purify genomic DNA from all samples, while quantification and quality checks were done with a NanoDrop 2000C Spectrophotometer (ThermoScientific, USA).

Production of wheat-L. racemosus chromosome introgression lines
Details of the production procedures and identification of L. racemosus chromosomes in the chromosome addition and translocation lines were reported in previous studies [9,28,51]. We, therefore, report here the additional steps taken to develop the N-recombination lines, which we characterized alongside the addition and translocation lines in this study. Basically, our strategy was modelled after the methodology described and adopted in the production of wheat-rye recombination lines [24]. To produce the first N-recombination line, N-recomb #1, we crossed a monosomic Chinese Spring (CS) wheat line (2n = 42 - 3B') to a disomic N-addition line (2n = 42 + N″) and selected N monosomic substitution plants (2n = 42 + N′ - 3B') in the first filial generation (F1). In the second filial generation (F2), a naturally occurring recombinant was recognized by FISH/GISH analysis and homozygous recombinants were selected and named N-recomb #1. The production of N-recomb #2 to #7, except #4, was initiated by the hybridization of the N-short arm translocation line with a CS ph1 mutant (CSph1) to enable homoeologous pairing and recombination.
To obtain young root tissues, old lignified roots were partially cut and plants were grown in hydroponic culture for two weeks. Young roots were then separately subjected to salinity stress and ammonium treatment for 12 h, to ensure gene expression for salinity tolerance and BNI activity, which are reported traits of L. racemosus [17,42]. Salinity stress was imposed by addition of 400 mM NaCl to the hydroponic medium, while ammonium treatment was achieved by replacement of 2 mM KNO3 with 2 mM (NH4)2SO4. Control plants were maintained in unaltered hydroponic medium. Root tissues were harvested, frozen in liquid nitrogen and stored at -80 °C until needed for DNA and RNA extraction.

RNA extraction and library preparation
Control-, salt- and ammonium-treated root tissues of L. racemosus were used for RNA-sequencing. Total RNA was extracted using the RNeasy mini kit with the inclusion of an on-column DNase digestion kit (Qiagen). Using the isolated RNA, mRNA-seq libraries were constructed for the three conditions using TruSeq RNA Sample Preparation. To generate 150-bp paired-end reads, the libraries were sequenced on a HiSeq2500 according to the standard protocol.

Assembly of RNA-sequencing reads
A total of 174 GB of reads was generated by mRNA-sequencing (Additional file 12). Approximately 5% of reads with low-quality scores or adapters were partially trimmed by Trimmomatic software (version 0.32) [52].
To remove non-mRNA sequences, we collected known rRNA and tRNA sequences [53,54], and after removing the reads mapped to rRNA and tRNA sequences by Bowtie2 software (version 2.2.3) [55], the remaining reads were used for construction of assembled contigs in each of the control, high ammonium and salinity stress conditions. First, assembled contigs were generated by three software packages: Velvet ver. [...] [57]. The proportions of conserved sequences were 44%, 17% and 4% in Velvet-Oases, SOAPdenovo-Trans and Trinity contigs, respectively. Here, Velvet-Oases generated the best contigs with respect to similarity to closely related species. Following the same procedure, we generated 634,480, 460,748 and 434,862 contigs longer than 500 bp in the normal, high ammonium and salinity stress conditions, respectively.

Primer design from RNA-seq
We designed primers in regions homologous between T. aestivum and L. racemosus and in L. racemosus-specific regions, resulting in two categories of primers. In the first category, both the forward and reverse primers lie in homologous regions, but the amplified DNA fragment is expected to differ in length between T. aestivum and L. racemosus by 20-1000 bp. In the second category, either the forward or the reverse primer lies in a homologous region while the other lies in L. racemosus-specific sequence, and the amplified DNA fragment is expected to range from 100 to 1000 bp. All primers were designed with Primer3 [58]. From the contigs longer than 500 bp in the normal condition, we designed 9256 and 7637 primer pairs with the former and latter strategies, respectively. To design primers from the high-salinity and high-ammonium contigs, we first removed high-salinity contigs that were similar to normal-condition contigs and then removed high-ammonium contigs that were similar to either normal or high-salinity contigs. From the remaining high-salinity contigs, applying the two strategies in the order explained above, we designed 4930 and 3339 primer pairs, respectively, and from the filtered high-ammonium contigs, we designed 5461 and 4312 primer pairs, respectively. In total, we thus designed 34,935 primer pairs to detect differences between T. aestivum and L. racemosus.
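
The two primer categories and the reported total can be expressed compactly. The sketch below is illustrative only: the function names and category labels are hypothetical, and the homology status of each primer is assumed to be known from the contig comparison. The counts are those given in the text.

```python
# Primer-pair counts reported in the text, as
# (both primers homologous, one primer L. racemosus-specific) per contig set.
PRIMER_PAIRS = {
    "normal": (9256, 7637),
    "salinity": (4930, 3339),
    "ammonium": (5461, 4312),
}


def classify_pair(fwd_homologous, rev_homologous):
    """Assign a primer pair to one of the two design categories."""
    if fwd_homologous and rev_homologous:
        return "size_polymorphic"   # category 1: length differs between species
    if fwd_homologous or rev_homologous:
        return "lr_specific"        # category 2: one primer in L. racemosus-only sequence
    return None                     # neither primer anchored in a homologous region


def passes_size_rule(category, wheat_len=None, lr_len=None):
    """Check the expected amplicon-size rule for each category."""
    if category == "size_polymorphic":
        if wheat_len is None or lr_len is None:
            return False
        return 20 <= abs(wheat_len - lr_len) <= 1000
    if category == "lr_specific":
        return lr_len is not None and 100 <= lr_len <= 1000
    return False


total = sum(a + b for a, b in PRIMER_PAIRS.values())
```

Summing the six reported counts reproduces the stated total of 34,935 primer pairs.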

Genome sequencing and assembly
Genomic DNA was extracted from root tissues of 2-week-old L. racemosus plants (control treatment) using MagExtractor™ -Plant Genome- (TOYOBO). The isolated DNA was used to prepare a sequencing library for Illumina MiSeq analysis; library construction and sequencing were carried out through a commercial service from Macrogen, Japan. A total of ~35 M paired-end reads (2 × 151 nt) was obtained. Subsequent quality trimming (Q > 30) and removal of artificial sequences were performed manually. The cleaned reads were assembled into L. racemosus genome contigs with Platanus (v1.2.4) [59]. Assemblies with different k-mer sizes (27, 29, 31, 33, 35 and 37) were run in parallel, and the results were merged into a set of unique L. racemosus genome scaffolds longer than 1000 nt. The raw sequence data were deposited in the NCBI/EBI/DDBJ Short Read Archive under accession number SRR5796629.
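
The text does not describe how the parallel assemblies were merged. One simple way to obtain a set of unique scaffolds longer than 1000 nt is sketched below, under the assumption that "unique" means no exact duplicate on either strand. The function names are hypothetical, and this is not presented as the procedure actually followed.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_assemblies(assemblies, min_len=1000):
    """Pool scaffolds from several k-mer assemblies.

    Keeps scaffolds strictly longer than min_len nt and drops exact
    duplicates, treating a sequence and its reverse complement as the same
    scaffold (an assumption; near-identical scaffolds are not collapsed).
    """
    seen = set()
    merged = []
    for scaffolds in assemblies:
        for seq in scaffolds:
            seq = seq.upper()
            if len(seq) <= min_len:
                continue
            key = min(seq, revcomp(seq))  # strand-independent key
            if key not in seen:
                seen.add(key)
                merged.append(seq)
    return merged


# Toy example: two "assemblies" sharing one scaffold on opposite strands.
s1 = "AAAC" * 300
merged = merge_assemblies([[s1, "ACGT" * 100], [revcomp(s1), "GGGA" * 300]])
```

A real merge would usually also collapse contained or near-identical scaffolds, which requires an alignment or clustering step not shown here.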

Primer design from genomic sequence
Wheat mRNA-seq data from a previous study [60] were mapped to the L. racemosus genome scaffolds with TopHat (ver. 2.0.8) [61] using the following options: "--read-realign-edit-dist 0 --b2-fast --mate-std-dev 200 -a 6 -i 8 -I 10000 --max-segment-intron 100 --min-segment-intron 3". Ten scaffolds were retained based on their length (> 2000 nt) and read mapping (no specific mRNA-seq read mapping). Scaffold polymorphisms against the wheat reference genome (v1.1) [62] were evaluated by BLASTN, and primers sensitive to relatively large (> 3 nt) gaps were designed with Primer3. From the pre-screened primers, polymorphic markers were selected and applied to genotype nine wheat-L. racemosus chromosome addition lines, and markers specific to L. racemosus chromosomes I, J and N were used to characterize two each of the I-, J- and N-translocation lines and seven N-recombination lines, respectively. To assess the transferability of L. racemosus markers to other Triticeae species, we used 164 markers that amplified in L. racemosus to genotype 12 species in the tribe (Table 8), including L. racemosus as a positive control, with the aim of analyzing polymorphism between the bread wheat genome and the genomes of the other species studied.
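
The scaffold selection criteria and the gap threshold used for primer placement can be sketched as below. The inputs are assumed: scaffold lengths and per-scaffold mapped-read counts would come from the TopHat output, and the gapped sequence from a pairwise alignment against the wheat reference. All names are hypothetical.

```python
import re


def select_scaffolds(lengths, mapped_reads, min_len=2000):
    """Scaffolds longer than min_len nt with no mapped wheat mRNA-seq reads."""
    return sorted(
        name
        for name, n in lengths.items()
        if n > min_len and mapped_reads.get(name, 0) == 0
    )


def large_gaps(aligned_seq, min_gap=4):
    """Return (start, length) for each gap run of at least min_gap columns.

    min_gap=4 corresponds to gaps larger than 3 nt; aligned_seq is one row
    of a pairwise alignment with '-' marking gap columns.
    """
    return [
        (m.start(), len(m.group()))
        for m in re.finditer(r"-+", aligned_seq)
        if len(m.group()) >= min_gap
    ]


# Toy example with made-up values.
chosen = select_scaffolds(
    {"s1": 2500, "s2": 1800, "s3": 4000},
    {"s1": 0, "s3": 12},
)
gaps = large_gaps("ACG---TT-----GA")
```

Placing a primer across or next to such a gap is what makes it sensitive to the insertion or deletion that distinguishes the two genomes.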

Sequencing and analysis of some PCR products
We applied Sanger sequencing to determine the nucleotide sequences of the PCR products generated by one of our markers, which amplified all the L. racemosus chromosomes added to wheat. All the PCR products were purified with the AxyPrep PCR cleanup kit according to the PCR cleanup spin protocol (AXYGEN Biosciences). The purified products were premixed in accordance with Macrogen's recommendation (Macrogen, Japan) and sent to the company for sequencing. The sequence from each genotype was searched against nucleotide sequences in the NCBI and Ensembl Plants databases using BLASTN (megablast). The DNA scaffold from which the marker was developed was searched in the same manner. To check for polymorphism between the chromosomes, we aligned all the sequences using the Just-Bio multiple alignment tool.

Development of markers and genotyping of wheat-L. racemosus chromosome introgression lines by DArT-seq
To complement the PCR-based markers and widen the scope of application of our markers, especially the chromosome-specific markers, we applied DArT-seq to genotype L. racemosus and the 22 CILs, alongside bread wheat to assist in data analysis and interpretation. The DArT-seq platform used a HiSeq 2500 to sequence the samples and generated 44,277 markers. This approach enabled us to develop a large number of chromosome-specific markers for the nine L. racemosus chromosomes analyzed.

Analysis of data
PCR data
PCR results were scored in a binary fashion, "0" and "1" for absence and presence of a band, respectively, while size-polymorphic bands (very few) were differentiated using 1 to designate the band size in wheat and 2 for the band size in an introgression line or another Triticeae species, depending on the case. The scores were analyzed using simple proportions to determine the percentage of screened primers that amplified in the L. racemosus genome as well as the proportion of the amplified markers that were polymorphic in wheat. The frequency of amplification of alien chromosomes in the CILs was also computed to show the proportion of the developed markers located on the alien chromosomes. Markers amplified specifically in one of the nine chromosome addition lines were designated chromosome-specific, and I-, J- and N-specific markers located specifically on any of the translocation lines were accordingly named arm-specific. Arm-specific markers of chromosome N that were amplified specifically in the seven N-recombination lines were used to determine the arm location of each recombinant fragment. Data from the screening of the 12 Triticeae species were handled in a similar manner, but with more emphasis on identifying and computing polymorphism against wheat in each case. This gave a basis for deciding the suitability of L. racemosus markers for genotyping wheat lines carrying chromosomes from these species. In addition, we used the PCR-based markers amplified in each species to compare the frequency of polymorphism between markers derived from DNA and RNA sequence information and thus assess the suitability of the two approaches.
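
The binary scoring and the designation of chromosome-specific markers can be illustrated with a short sketch. It assumes a score table keyed by marker and line name, with wheat labelled "CS"; the labels, function names and toy data are hypothetical, and size-polymorphic scores (1/2) are not handled.

```python
def chromosome_specific(scores, wheat="CS"):
    """Map each chromosome-specific marker to the single line that amplified it.

    scores: marker -> {line name: 0 or 1}. A marker qualifies when it is
    absent (0) in wheat and present (1) in exactly one addition line.
    """
    result = {}
    for marker, row in scores.items():
        if row.get(wheat, 0) != 0:
            continue  # amplified in wheat, so not polymorphic
        positive = [line for line, s in row.items() if line != wheat and s == 1]
        if len(positive) == 1:
            result[marker] = positive[0]
    return result


def percent(part, whole):
    """Simple proportion expressed as a percentage."""
    return 100.0 * part / whole if whole else 0.0


# Toy score table with made-up markers and two addition lines.
toy_scores = {
    "m1": {"CS": 0, "A": 1, "N": 0},  # specific to line A
    "m2": {"CS": 0, "A": 1, "N": 1},  # amplified in two lines
    "m3": {"CS": 1, "A": 1, "N": 0},  # not polymorphic against wheat
}
specific = chromosome_specific(toy_scores)
share_specific = percent(len(specific), len(toy_scores))
```

The same table, restricted to the translocation or recombination lines, could be used to assign arm-specific markers in an analogous way.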

DArT-seq data
DArT-seq markers in the SNP 1-Row Mapping Format, which we used for our analysis, were scored "0", "1" and "2", representing reference (Wheat_ChineseSpring04) allele only, SNP allele only and both reference and SNP