Gene Deletion in Barley Mediated by LTR-retrotransposon BARE

A poly-row branched spike (prbs) barley mutant was obtained from soaking a two-rowed barley inflorescence in a solution of maize genomic DNA. Positional cloning and sequencing demonstrated that the prbs mutant resulted from a 28 kb deletion including the inflorescence architecture gene HvRA2. Sequence annotation revealed that the HvRA2 gene is flanked by two LTR (long terminal repeat) retrotransposons (BARE) sharing 89% sequence identity. A recombination between the integrase (IN) gene regions of the two BARE copies resulted in the formation of an intact BARE and loss of HvRA2. No maize DNA was detected in the recombination region although the flanking sequences of HvRA2 gene showed over 73% of sequence identity with repetitive sequences on 10 maize chromosomes. It is still unknown whether the interaction of retrotransposons between barley and maize has resulted in the recombination observed in the present study.

The architecture of branched inflorescences in grasses depends on the developmental fate of primordia and axis orientation 1 . The rice (Oryza sativa L.) panicle generates several primary and secondary branches on which spikelets are produced. Sorghum and maize male inflorescences share a structure similar to that of rice. In barley (Hordeum vulgare L.) spikes, however, spikelets are borne directly on the main axis, the rachis, and there are no pedicels. A diagnostic feature of barley is the possession of three one-flowered spikelets at each rachis node 2,3 . Based on lateral spikelet size and fertility, barley is classified into two-rowed and six-rowed types. Two-rowed barley only has a central fertile spikelet with small and infertile lateral spikelets while the six-rowed barley has three fully-developed fertile spikelets.
The major genes that control row-type variation in barley are Vrs1 4 , Int-c 5 and Vrs4 6 . The barley domestication gene Vrs1, located on the long arm of chromosome 2H, encodes a homeodomain-leucine zipper (HD-Zip) transcription factor that suppresses the development of lateral spikelets in two-rowed barley. Mutant vrs1 results in a well-developed six-rowed phenotype 4 . Int-c, located on chromosome 4H, is an ortholog of the maize (Zea mays L.) domestication gene, Teosinte branched 1 (TB1), a member of the TCP gene family encoding putative basic helix-loop-helix DNA-binding proteins 5 . The Int-c gene modifies lateral spikelet fertility in barley, and can influence the phenotypic effect of the Vrs1 locus 7 .
Vrs4 controls row-type and spikelet determinacy in barley; an induced mutation, vrs4, can convert the two-rowed to a six-rowed phenotype 5,8 . Vrs4 is an ortholog of the maize inflorescence architecture gene RAMOSA2 (RA2), which encodes a transcriptional regulator that contains the lateral organ boundaries (LOB) domain. Expression analyses by mRNA in situ hybridization and microarray approaches showed that Vrs4 is expressed very early during inflorescence development and controls the row-type pathway through Vrs1 by negatively regulating the lateral spikelet fertility in barley. Moreover, the Vrs4 gene is an important modifier of inflorescence development. Here, we report on a new mutant, poly-row and branched spike (prbs) obtained by soaking a two-rowed barley inflorescence in maize genomic DNA from a single cross hybrid 9,10 , and characterize its genetics, report its positional cloning, and analyze its origin.

Results
Mutant prbs resulted from deletion of the Vrs4 gene. The poly-row and branched spike (prbs) barley mutant was obtained by soaking a two-rowed barley inflorescence in maize genomic DNA solution 9,10 . The prbs mutation not only changes two-rowed barley into a poly-rowed form but also adds a spikelet row, forming irregular poly-row and branched spikes (Fig. 1). Genetic analysis indicated that the mutant phenotype was caused by a recessive gene, which has an epistatic effect on Vrs1 11 . The prbs locus was initially mapped near the centromere on the short arm of chromosome 3H 11,12 , a location similar to that of vrs4. Furthermore, immature spikes of the prbs mutant viewed under a stereoscope resemble the scanning electron microscopy images of the Vrs4 immature spikes 12 . Three molecular markers (DQ327702, Cbic43, and Cbic44), closely linked with the Vrs4 gene, co-segregated with the prbs gene in the prbs/Kunlun 12 RIL and prbs/Zangqing 320 F2 populations (Supplementary Fig. 1). No fragment was amplified in the mutant plants using these three molecular markers. Similarly, three other primers covering the Vrs4 gene sequence also failed to amplify a specific DNA fragment from either the prbs mutant or its progeny 11R258-95. Expression analyses revealed that Vrs4 expression was undetectable and Vrs1 expression was significantly down-regulated in immature spikes at the lemma primordium stage of the prbs mutant (Fig. 2). These results indicated that the prbs mutant may have resulted from a large deletion around the Vrs4 gene.

Identification of deletion region in prbs mutant.
To identify the deletion region in the prbs mutant, a Morex BAC clone was identified that contains the Vrs4 gene. PCR primers were designed at 2 kb intervals from 14 kb upstream to 22 kb downstream of the Vrs4 gene and were tested on the prbs mutant, 11R258-95, Pudamai-2, and Morex. PCR primers located in the region from 3 kb upstream to 10 kb downstream of the Vrs4 gene failed to amplify a specific DNA fragment in the prbs mutant and 11R258-95 but amplified a single band in Pudamai-2 and Morex. Sequencing revealed that the amplicons represented a single product in Pudamai-2 and Morex. Primers designed from 3 to 13 kb upstream and 10 to 21 kb downstream of the Vrs4 gene amplified a single band in all tested plants, but the amplicons represented multiple products when sequenced. Because the amplicons were mixed, these results could not show whether these regions were part of a single deletion event in the prbs mutant. However, an additional primer pair, Cbic123, matching a site 14 kb upstream of Vrs4, amplified a single band in all tested plants; the PCR product had 100% sequence identity among the prbs mutant, 11R258-95, Pudamai-2, and Morex. Another primer pair, Cbic119, 22 kb downstream of Vrs4, also amplified a single band in all tested plants (the sequence was identical in all tested lines). These results indicated that the deletion in the prbs mutant was between ~13 and 36 kb in length.
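The size bounds stated above follow directly from the primer-walking coordinates. The short sketch below reproduces that arithmetic; the function and variable names are hypothetical and chosen only for illustration, and positions are given in kb relative to Vrs4 (negative values are upstream).

```python
# Illustrative arithmetic only: positions are in kb relative to the Vrs4 gene,
# negative = upstream, positive = downstream. Names are hypothetical.

# Primers in this window gave no specific product in prbs and 11R258-95.
FAILED_WINDOW_KB = (-3, 10)
# Single-copy anchors (Cbic123 upstream, Cbic119 downstream) present in all lines.
RETAINED_ANCHORS_KB = (-14, 22)


def deletion_size_bounds(failed_window, retained_anchors):
    """Return (minimum, maximum) deletion size in kb for a single deletion."""
    minimum = failed_window[1] - failed_window[0]
    maximum = retained_anchors[1] - retained_anchors[0]
    if minimum > maximum:
        raise ValueError("failed window must lie inside the retained anchors")
    return minimum, maximum


if __name__ == "__main__":
    low, high = deletion_size_bounds(FAILED_WINDOW_KB, RETAINED_ANCHORS_KB)
    print(f"deletion size between ~{low} and {high} kb")
```

The lower bound is the span over which amplification failed in the mutant, and the upper bound is the distance between the two flanking single-copy sites that are retained in all lines.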
After failure to amplify a single DNA fragment using many PCR primers in the target region, long-range PCR was used to isolate the sequence covering the prbs mutation. Based on the above PCR test results, PCR primers Cbic131 and Cbic132 were designed for this purpose; the forward primer was near the site of primer Cbic123 and the reverse primer near the site of primer Cbic119, as both have been confirmed to amplify a single copy of DNA from the control varieties and mutants. A 15 kb fragment was successfully amplified from both the prbs mutant and 11R258-95 (Fig. 3), whereas the control PCRs using DNA from Pudamai-2 and Morex as a template failed to amplify. The amplification product from prbs was 14,715 bp (accession number KU758926).
We identified a 48,951 bp sequence from Morex using the 14,715 bp prbs sequence as query in BLASTN searches of the Morex genome database (http://webblast.ipk-gatersleben.de/barley/viroblast.php). Alignment of these two sequences showed that a 27,804 bp sequence in Morex, extending from nt 15,680 to nt 43,769 and containing the entire Vrs4 gene, is deleted in prbs (Fig. 4). Sequence analysis demonstrated that the Vrs4 gene in Morex is flanked by two long terminal repeat (LTR) retrotransposons located, respectively, at nt 10,967 to 19,828 upstream and nt 38,791 to nt 47,684 downstream (Fig. 4). A search of the Triticeae Repetitive Elements (TREP) database revealed high similarity to the retrotransposon RLC BARE1 B consensus-1 (TREPACC = TREP3133) with 90% and 98% of sequence identity, respectively, for the two elements. We refer to the retrotransposon upstream of Vrs4 as BARE up, and the one downstream of Vrs4 as BARE down. The two share 90% identity, are bound by 1.8 kb LTRs, and contain the full-length open reading frame encoding Gag, aspartic proteinase (AP), integrase (IN), reverse transcriptase (RT), and RNaseH (RH) expected for canonical BARE1 elements 13 (Fig. 4).
We further sequenced part of the deletion region from the wild-type parent Pudamai-2, which contains a complete Vrs4 gene. A phylogenetic tree was constructed using the MEGA6 program 14 with the minimum evolution method. The results showed that the Vrs4 haplotype in Pudamai-2 was similar to haplotype 8 (Supplementary Fig. 2), which is characterized by an insertion of TA bases in the 5′ UTR of the gene 6 and is mainly distributed in Asia. Thus, the prbs mutant resulted from a deletion of the entire functional gene HvRA2, a barley ortholog of the maize inflorescence architecture gene RAMOSA2 (RA2), which thereby transformed a two-row barley into a poly-row branched structure.
Recombination between the integrase genes of two BAREs formed prbs. Sequence analysis of the 15 kb region from prbs identified a single BARE element of the canonical 8.9 kb length. Alignment revealed that the 5′ part of the prbs BARE (nt 1 to 4,714) was identical to BARE up, whereas the 3′ part (nt 4,972 to 8,918) was the same as BARE down (Fig. 5). The junction between the two parts lies between nt 4,715 and nt 4,971 in the prbs BARE (Fig. 5), within the integrase (IN) domain, corresponding to nt 9,402-9,690 in the cloned prbs fragment (accession number KU758926). Retrotransposon integration generates a direct-repeat target-site duplication (TSD) flanking the individual element, as a consequence of repair of the staggered cut made by the integrase 15 . BARE up is flanked by imperfect CCAAG TSDs and BARE down by a perfect pair of CTGAA motifs. The BARE in prbs is flanked by CCAAG and CTGAA, supporting the origin of this BARE by recombination. Moreover, the upstream and downstream sequences surrounding the single BARE in prbs correspond, respectively, to those upstream of BARE up and downstream of BARE down. Thus, the HvRA2 gene, flanked by two BARE elements in Pudamai-2, was deleted by recombination between them, thereby generating the prbs mutation.
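The logic used to delimit the recombination junction can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the study: it assumes three pre-aligned, gap-free sequences of equal length and a single crossover, and the toy sequences and function name are hypothetical. Only sites where the two parental elements differ are informative; the junction must lie between the last site matching the 5′ parent and the first site matching the 3′ parent.

```python
def crossover_interval(recombinant, parent_5p, parent_3p):
    """Locate a single crossover in a recombinant element.

    All three sequences must be pre-aligned and of equal length.
    Returns 0-based indices (last_5p_site, first_3p_site): the last informative
    site matching the 5' parent and the first one matching the 3' parent.
    The crossover lies somewhere between these two sites.
    """
    if not (len(recombinant) == len(parent_5p) == len(parent_3p)):
        raise ValueError("sequences must be aligned to the same length")

    last_5p_site = None
    first_3p_site = None
    for i, (r, a, b) in enumerate(zip(recombinant, parent_5p, parent_3p)):
        if a == b:
            continue  # parents agree here, so the site is uninformative
        if r == a:
            if first_3p_site is not None:
                raise ValueError("pattern implies more than one crossover")
            last_5p_site = i
        elif r == b and first_3p_site is None:
            first_3p_site = i
        # sites matching neither parent are ignored in this sketch
    return last_5p_site, first_3p_site


if __name__ == "__main__":
    # Hypothetical toy sequences; the parents differ at indices 2, 7 and 12.
    bare_up = "ACGTACGTACGTACGT"
    bare_down = "ACCTACGAACGTTCGT"
    prbs_bare = "ACGTACGAACGTTCGT"
    print(crossover_interval(prbs_bare, bare_up, bare_down))
```

In the toy example the recombinant follows the 5′ parent at index 2 and the 3′ parent from index 7 onward, so the crossover falls in the stretch between them where both parents are identical, which mirrors the way the junction in the prbs BARE is bracketed by nt 4,714 and nt 4,972.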
Search of the maize genome for sequence similarity. To investigate the possible role of maize DNA in forming the prbs mutation, we used the region spanning the recombination zone in the BARE elements (9,402-9,690 bp) to carry out a BLASTn search of the MaizeGDB B73 reference genome sequence. No maize-specific sequence was found in the region. Hence, it appears that no maize DNA has been inserted into the prbs region and that the prbs phenotype results solely from the Vrs4 gene deletion.
As an alternative to insertion, the maize DNA may have played a role through sequence similarity at the recombination point. The region of recombination in prbs corresponds to the most conserved part of the integrase gene, which is the core domain that includes the D-D-35-E active site motif 16 . The recombination itself took place in the region between the second Asp and the Glu of the active site; a BLASTn search of this region against the MaizeGDB B73 reference genome sequence found more than 120 matches with between 73% and 80% identity, containing multiple stretches of ~10 nt of perfect identity, dispersed over all maize chromosomes (Table 1). In view of their numbers and the universal presence of integrase in intact, autonomous LTR retrotransposons, it is highly likely that these BLAST matches correspond to members of the Copia superfamily, which comprises ~425-485 Mb of the maize genome 17 . Further research is required to determine whether foreign DNA may induce recombination through sequence similarity, especially when the foreign DNA is present at high concentration or in high copy number.
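The two quantities reported for the maize matches, overall percent identity and the length of perfectly identical stretches, can be computed from an aligned pair of sequences as sketched below. This is an illustrative example under simplifying assumptions (gap-free alignment of equal length); the sequences and function names are hypothetical and are not taken from the BLASTn output.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two gap-free aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def longest_identical_run(seq_a, seq_b):
    """Length of the longest uninterrupted stretch of identical positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be of equal length")
    best = run = 0
    for x, y in zip(seq_a, seq_b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    # Hypothetical 20 nt aligned pair, not real BARE or maize sequence.
    barley = "ATGGATCCGTTAGCAAGCTT"
    maize = "ATGGATCCGTAAGCTAGCAT"
    print(percent_identity(barley, maize), longest_identical_run(barley, maize))
```

A pair can have moderate overall identity while still containing short runs of perfect identity, which is the pattern described for the maize integrase matches.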

Discussion
Barley is classified as two-rowed or six-rowed based on lateral spikelet size and fertility. Two-rowed barley has a central fertile spikelet and two infertile lateral spikelets, and six-rowed barley has three fully-developed fertile spikelets. Vrs4 is an ortholog of the maize (Zea mays L.) inflorescence architecture gene RAMOSA2 (RA2), which encodes a LOB-domain-containing transcriptional regulator [18][19][20] . Vrs4 controls row-type variation and modifies inflorescence development in barley (Hordeum vulgare L.) 6 . Expression analyses of mRNAs by in situ hybridization and microarray analysis revealed that Vrs4 is expressed very early during inflorescence development and controls the row-type pathway in barley through Vrs1, a negative regulator of lateral spikelet fertility.
The prbs mutant was obtained from Pudamai-2, which has the normal Vrs4 gene and a two-rowed phenotype, by soaking the barley inflorescence in maize genomic DNA solution. In the mutant, Vrs4 is deleted through a recombination between two BARE retrotransposons on either side of the Vrs4 gene. LTR retrotransposons are known to recombine; the recombination between two LTRs of a single element, which results in deletion of the internal domain of the retrotransposon and generates a solo LTR, has been studied 16 . The process of retrotransposon replication generates LTRs that are identical at the moment of integration 21 ; the subsequent accumulation of mutations in the LTRs at the neutral rate allows the age of the integrated element to be estimated 22 . Genome-wide analyses show that the average half-life of a retrotransposon in the Copia superfamily, which includes BARE, is 859,000 years, or a rate of 1.16 × 10⁻⁶ events per element per generation, in the grass Brachypodium distachyon, which loses retrotransposons through recombination relatively rapidly 23 . In barley and other plants, the rate of solo LTR formation varies considerably between retrotransposon families and also between chromosomes and regions 20,23,24 . Analyses of the frequency of recombination events between internal retrotransposon domains, such as the one that generated prbs reported here, have not been made, and such events are difficult to identify in the absence of novel phenotypes.
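The per-generation rate quoted above is numerically the reciprocal of the 859,000-year figure if one generation per year is assumed. That conversion is an assumption made for this sketch, since the text does not state how the rate was derived; the function name is hypothetical.

```python
def events_per_element_per_generation(years_per_event, generations_per_year=1.0):
    """Convert an average persistence time in years into a per-generation rate.

    Assumes the rate is the simple reciprocal of the persistence time and that
    the number of generations per year is constant (1 by default, as for an
    annual grass).
    """
    if years_per_event <= 0 or generations_per_year <= 0:
        raise ValueError("inputs must be positive")
    return 1.0 / (years_per_event * generations_per_year)


if __name__ == "__main__":
    rate = events_per_element_per_generation(859_000)
    print(f"{rate:.2e} events per element per generation")
```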
Recombination between the LTRs of two different elements can generate a concatenated structure comprising two internal domains flanking a single, recombinant LTR, which results in the loss of the intervening genomic sequence, including any gene that happens to be there. A quantitative PCR survey of the barley genome for such structures with three LTRs and two internal domains showed that they are present in about 4.3 × 10³ copies per haploid genome 23 . While this indicates the potential for gene loss through recombination of retrotransposons flanking a gene, especially given that the gene islands 25 are flanked by retrotransposon "seas", the resulting increase in the distance between recombining retrotransposon sequences appears to be correlated with decreasing recombination frequency 20 . The question arises as to whether the maize DNA soaking procedure is connected to the recombination that generated the prbs mutant. Our procedure and other methods introduce foreign DNA into the megagametophyte before fertilization 26,27 . Whether or not any foreign DNA is integrated, the presence of extrachromosomal or cytoplasmic DNA triggers a range of defense responses in animals 28,29 , mediated by DNA recognition by proteins including STING (also called MITA, MPYS, TMEM173, or ERIS) 30 , specific toll-like receptors (TLRs) 31 , Z-DNA binding proteins (ZBP-1, DLM-1, or DAI) 32 , and Mre11 (meiotic recombination 11) 33 . Mre11 is particularly intriguing because, together with RAD50 and NBS1, it is part of the MRN complex and has been shown to play a vital role in double-strand break (DSB) repair 34 in plants, which is an intermediate step in recombination.
The maize genome contains 404,000 Copia superfamily retrotransposons 35,36 ; the integrase domains of these are very similar to the integrase core domain of BARE that underwent recombination in prbs. We speculate that the homologous maize and barley integrase sequences may have interacted with each other, mediating the recombination. Expression analyses by in situ hybridization and microarrays revealed that Vrs4 is actively expressed during inflorescence development 6 , corresponding to the stage at which the barley inflorescence was soaked in maize genomic DNA to generate the prbs mutant. Due to its transcriptional activity, this region is likely to have an open chromatin conformation, which could provide an opportunity for maize DNA to interact at the BARE integrase domains and promote the recombination. The high concentration of conserved retrotransposon sequences would make binding and recombination in a retrotransposon sequence more likely than elsewhere. While recombination between endogenous retroviruses (ERVs), which are structurally identical to LTR retrotransposons, has caused genic deletions 37 , to our knowledge there has not been an earlier demonstration of this in plants.
Horizontal gene transfer (HGT) is well documented in prokaryotic genome evolution. It is relatively clear that there are several HGT pathways, including transformation, conjugation, and transduction. In eukaryotes, direct DNA exchanges may occur during grafting 38 , symbiosis 39,40 , parasitism 41 , pathogenesis 42 , and epiphytic or endophytic association 43 . Some vectors, such as pollen 43 , fungi 44 , bacteria 45 , viruses 46 , plasmids 38 , insects 47 and transposons 43 , may also be involved in HGT. Transposable elements (TEs) have been recognized as important vectors for the horizontal movement of genes between eukaryotic genomes [48][49][50][51] . Transposons, with their inherent ability to mobilize, can proliferate and integrate into genomic DNA and generate HGTs with ease 52 . Transposons have also captured and transduced genomic DNA sequences in both Daphnia pulex 53 and Drosophila species 52 . The transfer of Mu-like transposons between Setaria and rice has been documented 48,54 . LTR retrotransposons can produce virus-like particles, which may work as more frequent vectors for HGT 52,55 . Such cases have been demonstrated for the LTR retrotransposon RIRE1 within the genus Oryza 56 and the LTR retrotransposon Route66 in Poaceae 50 . With the increasing availability of eukaryotic genome sequences, more evidence will become available that plants are also likely to undergo HGT. However, such results have been based on incongruences in molecular phylogenetic trees. On the other hand, there are numerous reports in the literature of foreign DNA being introduced directly, by injection of exogenous DNA, by direct DNA soaking, or via the pollen tube pathway, into rice, barley, wheat, sorghum, maize, cotton, oats, rye, cucumber, pumpkin, kidney bean and soybean to create new genetic variation (Supplementary Table 7). RAPD, AFLP and SSR molecular markers have been used to test DNA transfers between species in several studies [57][58][59] .
However, no study has demonstrated how the exogenous DNA causes genetic variation in other species. Our study provides preliminary evidence that LTR-retrotransposon-mediated gene deletion/insertion may play a role in direct gene transfer between different species.
In addition to acting as potential vectors for horizontal gene transfer, transposable elements are also responsive and susceptible to environmental changes. It is well documented that stresses can activate TEs to generate new genetic diversity 60 . This is especially true for LTR retrotransposons, as the LTR is sufficient in itself to activate TE transcription in response to stress. It is possible that soaking the barley spike in the maize DNA solution created stress conditions for the developing spikelets, which activated LTR-TE-mediated recombination. In this scenario, the maize genomic DNA may not be essential for the mutation. Further research is required to test this assumption by soaking developing barley spikes in water or salt solution to provide similar stresses and screening for new mutants.

Materials and Methods
Plant materials. A poly-row branched spike (prbs) barley mutant was obtained by soaking a two-rowed barley inflorescence (cv. Pudamai-2) in maize genomic DNA solution 61 . The method followed that described earlier for wheat 62 . Flowering barley spikes were soaked in total maize DNA at 1.6 μg/μL in 0.1 × SSC for 24 hours. After soaking, the head was removed from the solution and air-dried under ambient conditions. Plants were self-pollinated and seeds harvested. The mutant was identified at flowering in the next generation of plants.
Genetic mapping was conducted in two populations: one recombinant inbred line (RIL, F2:6) population consisting of 207 plants derived from a cross between the prbs mutant and a six-rowed barley cultivar Kunlun 12, and an F2 population consisting of 285 spike mutant plants derived from a cross between the prbs mutant and a six-rowed barley cultivar Zangqing 320. The prbs mutant, RIL 11R258-95 with a branched spike phenotype, Pudamai-2, and var. Morex were used for DNA sequence analysis.
Genomic DNA extraction and genotype analysis. Genomic DNA was extracted from leaves of individual plants and their parents using a modified CTAB method 63 . DNA samples were quantified using a Unicam UV300 UV/Vis spectrometer (Thermo Electron Corporation, Cambridge, UK), and then adjusted to 25 ng/μL. Because the DQ327702 marker associated with the mutant is closely linked with the Vrs4 gene 6 , new molecular markers Cbic43 and Cbic44 were designed around the Vrs4 gene using the barley genome sequence 14 . Primers covering the Vrs4 gene (AS12, AS34, and AS56) 6 were designed for Vrs4 haplotype analysis. Primers were synthesized by Shanghai Sunny Biotechnology (Shanghai, China). PCR reactions were performed in 10 μL volumes containing approximately 25 ng genomic DNA, 0.2 μM of each primer, and 5 μL 2 × Taq Master Mix (Gene Solution, Shanghai, China) using the following program: 94 °C for 3 min; 32 cycles of 94 °C for 30 sec, 55 °C for 45 sec, and 72 °C for 1 min; and a final extension at 72 °C for 5 min. PCR products were separated on 8% polyacrylamide gels.
Cloning of the deletion in mutant prbs. BAC sequences were identified by BLAST searching the Vrs4 sequence against the International Barley Genome Sequencing Consortium database (unpublished data). PCR primer pairs were designed at 2 kb intervals in the region near the Vrs4 locus using the Primer-BLAST tool (http://www.ncbi.nlm.nih.gov/tools/primer-blast/). PCR reactions were performed as described above. Annealing temperatures were optimized for each primer pair (Supplementary Table 1). PCR products were examined by electrophoresis on 1% agarose gels. Long-range PCRs were performed in 50 μL reactions containing 1 × buffer, 5 μL template DNA, 0.4 μM of each primer, 400 μM each deoxyribonucleotide, and 2.5 U LA Taq DNA polymerase (Takara, Dalian, China) using the following program: 94 °C for 3 min; 32 cycles of 98 °C for 10 sec and 68 °C for 15 min; and a final extension at 68 °C for 20 min. PCR products were examined by electrophoresis on a 0.8% agarose gel, analyzed with the Bio-Rad Quantity One gel image analysis system, and sequenced by Shanghai RuiDi Biological Technology (Shanghai, China).
Quantitative RT-PCR. RNA was extracted from immature spikes at the lemma primordium stage of the prbs mutant and the wild-type parent Pudamai-2 using the Spin Column Plant Total RNA Purification Kit (Sangon Biotech (Shanghai) Co., Ltd). cDNA was prepared from 1 μg RNA using the AMV First Strand cDNA Synthesis Kit (Sangon Biotech (Shanghai) Co., Ltd). qPCR reactions were performed using SYBR Green (SG Fast qPCR Master Mix (High Rox), BBI) and the Applied Biosystems StepOnePlus Real-Time PCR System. The real-time PCR assays were performed in triplicate for each cDNA sample. The Vrs4 6 , Vrs1 4 and HvActin 6 primer sequences were used for quantitative RT-PCR. The HvActin gene was used as the reference gene for normalization.
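Normalization against a reference gene can be illustrated with the widely used 2^(−ΔΔCt) calculation. The text does not state which calculation was applied, so the method choice here is an assumption, and the Ct values and function name below are hypothetical placeholders rather than data from this study.

```python
from statistics import mean


def relative_expression(target_ct_sample, reference_ct_sample,
                        target_ct_control, reference_ct_control):
    """Fold change of a target gene in a sample relative to a control.

    Uses the 2^(-ddCt) calculation on technical replicates, normalizing the
    target gene to a reference gene (e.g. actin) in each genotype. Assumes
    roughly 100% amplification efficiency for both primer pairs.
    """
    delta_sample = mean(target_ct_sample) - mean(reference_ct_sample)
    delta_control = mean(target_ct_control) - mean(reference_ct_control)
    return 2.0 ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # Hypothetical triplicate Ct values, for illustration only.
    fold = relative_expression(
        target_ct_sample=[27.0, 27.5, 26.5],      # target gene, mutant
        reference_ct_sample=[20.0, 20.5, 19.5],   # reference gene, mutant
        target_ct_control=[25.0, 25.5, 24.5],     # target gene, wild type
        reference_ct_control=[20.0, 20.5, 19.5],  # reference gene, wild type
    )
    print(f"relative expression: {fold:.2f}")
```

With these placeholder values the target gene is two cycles later in the mutant after normalization, which corresponds to a four-fold lower relative expression.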