Elsevier

Biochimie

Volume 87, Issues 9–10, September–October 2005, Pages 905-910
Identification of new small non-coding RNAs from tobacco and Arabidopsis

https://doi.org/10.1016/j.biochi.2005.06.001

Abstract

Small non-coding RNAs (ncRNAs) have typically been searched for in fully sequenced genomes using one of two approaches: experimental or computational. We developed a mixed method, using both types of information, which has the advantage of applying bio-computing methods to actually expressed sequences. Our method allowed the identification of new small ncRNAs in Arabidopsis thaliana and in the unfinished genome of Nicotiana tabacum. We constructed an N. tabacum cDNA library from small RNAs ranging from 20 to 30 nucleotides (nt). The sequences from 73 unique clones were compared to the A. thaliana genome and to all plant sequences using a pattern-matching approach (program Patbank). Thus, we selected 15 clones from the library corresponding mostly to A. thaliana or N. tabacum non-coding sequences. By Northern blot analyses, we confirmed the presence of most RNA candidates in Arabidopsis and in Nicotiana sylvestris with a size range of 21–100 nt. To gain more insight into the possible genesis of 21–24 nt sequences, stable folding of sRNAs with their flanking regions was predicted with the software MIRFOLD, dedicated to the folding of microRNAs (miRNAs). Stable hairpin structures were observed for some putative miRNAs.

Introduction

Small non-messenger RNAs have typically been defined as non-coding RNAs with sizes ranging from 20 to 500 nt [1]. Using the so-called experimental Rnomics approach, some small non-protein-coding RNAs, sized 50–500 nt, have been cloned in several sequenced organisms including the model plant A. thaliana [2], [3]. In addition to new members of known small ncRNAs, such as small nucleolar RNAs, other RNAs have been identified and located in nuclear intergenic regions and in the mitochondrial or chloroplast genomes. Recently, another class of eukaryotic small ncRNA has emerged in many organisms, miRNAs, which negatively regulate their complementary mRNAs at the posttranscriptional level. In plants, miRNAs base pair to their target RNAs and induce their degradation (for review, see [4]). MiRNAs are small, non-coding RNAs of 18–25 nt in length with 5′P and 3′OH. MiRNA sequences correspond to regions annotated as intergenic and arise from evolutionarily conserved hairpin precursor transcripts of variable length (60–300 nt in plants) cleaved by the RNase III-like enzyme, DICER. Another class of small ncRNAs, small interfering RNAs (siRNAs), share structural properties with miRNAs. SiRNAs are 21 nt in length; however, they are double-stranded and arise by another mechanism (for review, see [5]).

Initially, identification of the first miRNAs in A. thaliana [6], [7] involved the cloning of their cDNAs and confirmation of their expression by Northern blot analysis. Recently, purely computational approaches have been developed: they involve a series of conditions that a sequence has to fulfill to be eligible as a candidate [8], [9], [10]. Complete genomes were systematically checked against a set of rules derived from the previous instances of experimentally described miRNAs. These approaches were successful in the identification of large numbers of miRNA sequences. For example, Wang et al. [10] used several filters, such as hairpin structure, G-C content, and identity with the Oryza sativa genome, to select putative miRNAs from Arabidopsis intergenic sequences.
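The rule-based filtering described above can be illustrated with a minimal sketch. It applies only two of the kinds of rules mentioned, a length window and a G-C content window. The 18–25 nt window follows the miRNA length given in the Introduction; the G-C thresholds and all function names are hypothetical choices made for illustration and are not the values or code used by the cited studies.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G or C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def passes_filters(seq: str, min_len: int = 18, max_len: int = 25,
                   min_gc: float = 0.3, max_gc: float = 0.7) -> bool:
    """Check a candidate against illustrative length and G-C rules.

    The G-C bounds are assumptions for this sketch, not published thresholds.
    """
    return (min_len <= len(seq) <= max_len
            and min_gc <= gc_content(seq) <= max_gc)


def filter_candidates(candidates):
    """Keep only the candidates that satisfy every rule, in input order."""
    return [seq for seq in candidates if passes_filters(seq)]
```

A real pipeline would chain further tests after these, such as hairpin prediction and cross-species conservation, each one removing more candidates from the initial set.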

Here, we describe a mixed method, which allowed the identification and classification of new ncRNAs. We constructed a cDNA library from small RNAs (sRNA) ranging from 20 to 30 nt. The clones originated from N. tabacum, whose known genomic sequences are scarce. Hence, it was likely that most of our sequences would not be assigned to a known sequence in this species. However, we expected that, if miRNAs were cloned, the sequences would be conserved and similarities would be found among other plant sequences. This approach thus differs from purely computational approaches, which can be described as pipelines of tests that, starting with a whole genome, eliminate sequences step by step [8], [9], [10]. The main task of such a process is to avoid false-positive sequences, as these are likely to occur because of the size of the initial set and the absence of prior knowledge about the query sequences. Our starting point was very different, as we knew that each sequence we examined was actually expressed. In this context, the task of the bio-computing analysis was to assign a possible identity and/or role to these molecules, and to direct future experiments. Thus, even though we were obviously looking for the same type of features as the purely computational approaches, our constraints could be more relaxed.

The sequences from 73 unique clones were compared to the A. thaliana genome and to all known plant sequences, using a pattern-matching approach allowing for a limited number of mismatches (program PatBank). In this way, we selected 15 clones from the library corresponding mostly to A. thaliana or N. tabacum non-coding sequences. By Northern blot analysis, we confirmed the presence of most of the RNA candidates in Arabidopsis and/or in N. sylvestris (maternal ancestor of N. tabacum). In addition, stable folds with their flanking regions (upstream and downstream) were computationally predicted with MIRFOLD, a program specifically dedicated to the prediction of miRNA precursor structure. Stable hairpin structures were observed. Ways of characterizing the targets of these putative miRNAs are discussed.
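To make the idea of pattern matching with a limited number of mismatches concrete, the following is a minimal sketch of a brute-force search on both strands of a target sequence. It is not the PatBank algorithm, whose implementation is not described here; the function names, the default mismatch limit, and the choice of substitution-only (Hamming) distance are assumptions made for illustration.

```python
# Complement table for DNA; RNA input is converted to DNA before use.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_matches(query: str, target: str, max_mismatches: int = 2):
    """Report (start, strand, mismatches) for each approximate hit.

    Only substitutions are tolerated (no insertions or deletions).
    The scan is O(len(query) * len(target)); a genome-scale tool
    would need an index rather than this naive loop.
    """
    query = query.upper().replace("U", "T")
    target = target.upper().replace("U", "T")
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        for start in range(len(target) - len(pattern) + 1):
            window = target[start:start + len(pattern)]
            mismatches = hamming(pattern, window)
            if mismatches <= max_mismatches:
                hits.append((start, strand, mismatches))
    return hits
```

For example, `find_matches("AAGGCU", "CCAAGGCTCC", 0)` reports one exact hit on the forward strand at position 2, since the RNA query is first converted to its DNA equivalent.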

Cloning of sRNAs from tobacco

Total RNA was isolated from Nicotiana tabacum leaves 3–5 h after infiltration of harpin protein from Erwinia amylovora [11]. Small RNAs (about 500 μg) were purified on a denaturing 15% polyacrylamide gel. A gel fragment spanning the size range of 20–30 nt was excised and RNA eluted overnight in 0.3 M NaCl at 4 °C. RNA was recovered after ethanol precipitation and ligated sequentially to 5′ and 3′ RNA/DNA chimeric oligonucleotide adapters (5′ adapter: ACGGAATTCCTCACTaaa (lower case are RNA) and

Construction and characterization of a small RNA library from N. tabacum

We constructed an N. tabacum cDNA library from small RNAs isolated from leaves, as described in Llave et al. [6]. Small RNAs (20–30 nt) were size fractionated from denaturing polyacrylamide gels. After elution, they were ligated to 5′ and 3′ adapters, amplified, cloned and sequenced. We obtained sequences from 127 clones. The size distribution showed a typical bimodal curve [6] with peaks at 21 and 24 nt (not shown). A small proportion of sequences (9%) were considered too small (≤ 14 nt) for

Conclusion

We constructed a cDNA library of small ncRNAs from tobacco using a procedure used to clone miRNAs. To help identify the cloned sequences, we developed a program, Patbank, which allowed comparison with the fully sequenced genome of Arabidopsis. Indeed, we postulated that regulatory RNAs would be conserved between the two plants. We identified sRNAs localized in A. thaliana or N. tabacum intergenic regions, and in non-coding regions from organelles. Most of the sequences presented here

Acknowledgments

We warmly thank Louise Chapell (Sainsbury Laboratory) for introducing one of us (M.B.) to small RNA manipulations and cloning. M.B. warmly thanks Vincent Colot and his group (URGV, Evry) for discussions and encouragement. We thank Edouardo Rocha (CNRS, ABI) and Gillian for advice on the manuscript. We thank both referees for helpful suggestions to improve our manuscript. This work was funded by the ACI Interface physique, chimie, biologie: dynamique et reactivité des assemblages

References (25)

  • B.J. Reinhart et al. MicroRNAs in plants. Genes Dev. (2002)
  • E. Bonnet et al. Detection of 91 potential conserved plant microRNAs in Arabidopsis thaliana and Oryza sativa identifies important target genes. Proc. Natl. Acad. Sci. USA (2004)