A quantitative PCR approach for determining the ribosomal DNA copy number in the genome of Agave tequila Weber
Electronic Journal of Biotechnology

Background: Agave tequilana is of great economic importance in Mexico for the production of alcoholic beverages and bioenergy. However, knowledge of the structure and organization of the rDNAs in the genome of this species is limited, which is an obstacle to both genetic research and crop improvement. Variation in rDNA copy number per eukaryotic genome has been considered a source of genetic rearrangements. In this study, the copy number of the 18S and 5S rDNAs in the A. tequilana genome was estimated using an absolute quantitative qPCR assay together with the genome size. In addition, the rDNA copy numbers were compared with physical mapping data to confirm the results.
Results: The assay was successfully applied to determine the copy number of the 18S and 5S rDNAs in the A. tequilana genome, showing high reproducibility, with coefficient of variation (CV) values of 0.014 and 0.0129%, respectively. The copy number of the 18S rDNA was 51 times that of the 5S rDNA, and the two contributed 1.47 and 8.38 × 10⁻³% of the genome size, respectively. The data also show a linear relationship (R² = 0.992) between rDNA copy number and the signals detected for each locus by FISH. The rDNA copy number of agave differed from that of other organisms, which may be due to evolutionary ecology.
Conclusions: The results show that the proposed method a) can correctly detect the rDNA copy number, b) could provide species-specific markers and c) might help in understanding the genetic diversity, genome organization and evolution of this species.


Introduction
The genus Agave contains more than 200 species and 47 infraspecific categories; 186 taxa are distributed in Mexico, which is the center of origin of the genus [1]. Many of these species are of great economic importance because of their high content of sugar, fiber and bioactive natural products. Some agaves are mainly used to produce alcoholic beverages (Tequila and Mezcal) [2]. Natural mead or juices obtained from the cores or "piñas", fresh or cooked, can be used to obtain polysaccharides, agave fructans, high fructose syrup, biofuel or Maillard compounds. In addition, recent attention has focused on the potential of Agave species as bioenergy crops, as these plants offer many advantages for this purpose [3].
The blue agave, Agave tequilana Weber 'Azul', named by the German botanist Weber, is the only variety legally permitted by the Mexican government for the production of Tequila, and it has been propagated vegetatively for the last 200 years [4]. In economic terms, it represents $1.7 billion in annual revenue for Mexico within the United States alone [5]. In addition, this species is the most promising for bioenergy production because of its productivity, established agricultural practices, and ethanol conversion technologies [6]. A. tequilana is commercially propagated by asexual rhizomatous shoots, a procedure that allows genetically elite clones with remarkable qualities (higher sugar content, size and rapid growth) to be multiplied; as a consequence, however, genetic diversity has been reduced, even though the crop has been subject to major disease and insect pressure [2,7]. In this context, the economic importance and vulnerability of Agave crops have led to an increased interest in genetic research on the species [4,8,9]. Recent studies have described a specific relationship between ploidy level, genome size, and the number of loci for rDNAs and repetitive sequences [10,11,12,13]; they have also shown that variation in size and structural rearrangements of the genome have adaptive significance and influence the phenotype in two ways, through the expression of the gene content and through the physical effects of the mass or volume of the genome [10,14]. Therefore, morphological and genetic changes increase the possibility of selecting elite biotypes with useful features for breeding programs and biotechnology.
The nuclear ribosomal DNAs (rDNAs) encoding the 45S and 5S rRNAs are among the most important housekeeping genes and play a central role in cell metabolism and genome organization [15,16]. In plant genomes there may be from several hundred up to tens of thousands of highly homogeneous copies of each gene. A high copy number of these genes is probably important to meet the increased demand for protein synthesis during plant development and to ensure stabilization of the cell nucleus [17,18]. Each 45S rDNA unit consists of three coding regions (5.8S, 18S, 25S/26S in plants, 28S in animals), the internal transcribed spacers, and the intergenic spacer, which separates transcribed units [19]. The 45S units are organized in tandem arrays at one or several loci. The 5S rDNA, which encodes a 120-bp-long transcript, has traditionally been considered to occupy separate chromosomal locations in plants [20]; however, the 5S rDNA array may be found at more than one locus, either on the same chromosome as the 45S repeats or scattered across the genome. The rDNA arrays display substantial copy number variation within and between species [21,22,23]; this variation is functionally relevant, as it modifies chromatin states and gene expression across the genome. Moreover, rDNA copy number is relevant to our understanding of crop plant domestication and agricultural improvement [24]. It is therefore not surprising that a wide spectrum of laboratory methods has been developed to identify rDNA copy number changes [25,26,27]. Each method has particular advantages and disadvantages, and the choice of a given technique depends largely on the application, required resolution, flexibility, workload, and cost. Conventional karyotyping allows structural variations to be detected across the entire genome, but it is limited in resolution (>5-10 Mb) [28].
FISH analysis for targeted regions has been used in a routine setting for many years, and requires either metaphase chromosomes (similar to karyotyping) or interphase nuclei. However, this method can be more difficult to analyze, especially when attempting to resolve tandem duplications [29]. Quantitative PCR (qPCR) technology offers fast and reliable quantification of any target sequence in a sample [30]. It also has many advantages over alternative methods, such as low consumable and instrumentation costs, fast assay development time, and high sensitivity. Recent studies have used this method for determining the rDNA copy number per genome for a better understanding of the genome organization in different species [15,22,31,32,33,34,35]. Furthermore, the studies provide evidence of a strong relationship between genome size (DNA C-value) and the rDNA copy number.
Although the current and potential economic importance of agaves is irrefutable, surprisingly little research has been carried out on these species, especially on rDNA copy number variation in the genome. The most recent studies only identify the number and chromosomal location of rDNA sites by FISH in three species of the genus Agave, including A. tequilana Weber [36]. Furthermore, Tamayo-Ordoñez et al. [37] determined the rDNA copy number using a complex method based on the isolation of cell nuclei and the analysis of fragments by quantitative capillary electrophoresis.
The aim of this study was to develop a qPCR-based methodological tool to determine the copy number of the ribosomal DNAs (18S and 5S) in A. tequilana, as well as their contribution to genome size. The results showed a conserved pattern in the rDNA copy number per genome and in the physical distribution of the loci. This method contributes to the understanding of genome organization in Agave species.

Plant material
A. tequilana plants were grown under greenhouse conditions at a day/night temperature regime of 30/25°C in the Regional Botanical Garden of the Centre for Scientific Research of Yucatán, México (CICY). The leaves were dissected, frozen in liquid nitrogen and kept at -80°C until analysis.

Molecular analysis
Total DNA (gDNA) was extracted from 100 mg of leaf samples using the Wizard® Genomic DNA Purification Kit (Promega) according to the manufacturer's instructions and treated with RNase A (Boehringer Mannheim). The quality and quantity of the DNA were assessed spectrophotometrically (NanoDrop 2000; Thermo Scientific) by a standard procedure. To verify DNA integrity, extracts were fractionated by electrophoresis in a 1.2% agarose gel, stained with ethidium bromide, and visualized under UV light (High Performance UV Transilluminator, UVP).
Partial sequences of the 18S and 5S rDNAs were obtained by PCR using conserved primer pairs. These primers were designed with Primer Premier v5 software (Premier Biosoft) from conserved regions of sequences reported for other plant species (Table 1). PCR was performed in a Veriti Thermal Cycler (Applied Biosystems) using Taq DNA polymerase (Invitrogen) according to the manufacturer's protocol. The PCR mix contained 100 ng of DNA and 1 μL of each primer (10 μM) in a total volume of 50 μL. The cycling parameters were: 94°C (3 min); 35 cycles of 94°C (30 s), 60°C (30 s), 72°C (30 s); 72°C (10 min). Reaction products were separated by electrophoresis in a 1.2% agarose gel stained with ethidium bromide. PCR products were carefully excised and purified with a NucleoSpin Extract Kit (Macherey-Nagel), ligated into the pGEM-T Easy vector (Promega) and sequenced by Langebio (CINVESTAV, Mexico). Sequence analysis was carried out at the National Center for Biotechnology Information (NCBI, http://www.ncbi.nlm.nih.gov/) using the BlastN service. The phylogenetic tree was constructed with MEGA6 software [38] from the ClustalW alignment using the neighbor-joining method. The reliability of each node was established by the bootstrap method.

Quantitative PCR
Quantitative PCR (qPCR) was used to determine the copy number of the 18S and 5S rDNAs. qPCR reactions were run on a StepOnePlus Real-Time PCR System (Applied Biosystems) and PCR products were analyzed by means of StepOne Software v2.3 (Applied Biosystems). Each amplification reaction contained 12.5 μL of SYBR® Green PCR Master Mix (Applied Biosystems), 10 ng of gDNA and 1 μL of each primer (1 μM) in a total volume of 25 μL. Cycling parameters were: 10 min at 95°C; 40 cycles of 95°C (15 s), 60°C (1 min); and melt-curve analysis.
All primer pairs gave a single dissociation peak in all reactions, and no amplification occurred in reactions without template. PCR efficiencies for each primer pair were determined from the standard curve (R² > 0.996 for all primers and efficiencies of 104-110%). Primer sequences used for qPCR are reported in Table 1.
Known concentrations of purified plasmids (100, 10, 1, 0.1, 1 × 10⁻², 1 × 10⁻³ and 1 × 10⁻⁴ ng/μL) were used to develop standard curves for absolute quantification of the copy number of the target genes. Each standard curve was developed by plotting Ct values against the log10 values of the plasmid copy numbers used as template [39]. Three independent assays were carried out. For estimation of plasmid copy number, [Equation 1] was used.
where n = plasmid size in nucleotides; m = mass; Avogadro's number = 6.023 × 10²³ molecules/mol; and the average MW of a dsDNA base pair = 660 g/mol. Thus, plasmid copy number = plasmid concentration/m. From the standard curve, a regression equation was developed to obtain the slope (b) and y-intercept (a). The absolute copy number (CN) in each sample was estimated from the regression equation as CN = antilog10[(Ct − a)/b]. Lastly, the number of rDNA copies per genome was determined in relation to the 4C DNA content of A. tequilana [37].
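The plasmid copy number and the inversion of the standard curve described above can be illustrated with a minimal Python sketch. It assumes that Equation 1 has the usual form m = n × 660 / (6.023 × 10²³), which is consistent with the definitions given but is not reproduced in the text. The plasmid length (3,201 bp) and the intercept (38.0) are hypothetical values chosen only for illustration; the slope is one of the values reported for the standard curves.

```python
# Constants quoted in the text.
AVOGADRO = 6.023e23      # molecules per mole
BP_MOLAR_MASS = 660.0    # g/mol, average for one dsDNA base pair


def plasmid_mass_g(n_bp):
    """Mass in grams of one plasmid molecule of n_bp base pairs.

    Assumed form of Equation 1: m = n * 660 / Avogadro's number.
    """
    return n_bp * BP_MOLAR_MASS / AVOGADRO


def plasmid_copies_per_ul(conc_ng_per_ul, n_bp):
    """Plasmid copies per microlitre: concentration (converted to g) divided by m."""
    return conc_ng_per_ul * 1e-9 / plasmid_mass_g(n_bp)


def copy_number_from_ct(ct, intercept_a, slope_b):
    """Invert the standard curve Ct = a + b * log10(CN)."""
    return 10 ** ((ct - intercept_a) / slope_b)


if __name__ == "__main__":
    PLASMID_BP = 3201  # hypothetical plasmid length (vector plus insert)
    for conc in (100, 10, 1, 0.1, 1e-2, 1e-3, 1e-4):  # ng/uL, as in the text
        copies = plasmid_copies_per_ul(conc, PLASMID_BP)
        print(f"{conc:g} ng/uL -> {copies:.3e} copies/uL")

    # Hypothetical intercept and Ct; slope taken from the reported curves.
    print(f"CN = {copy_number_from_ct(18.57, 38.0, -3.238):.3e}")
```

Writing the antilog explicitly as 10 raised to (Ct − a)/b avoids the ambiguity of applying the antilog to the numerator only.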

Probes and FISH
Chromosome preparation and FISH were performed according to Robert et al. [13]. In short, metaphase chromosomes were obtained from root tips, which were treated with paradichlorobenzene (PDB) at 25°C for 3 h, fixed in 3:1 ethanol-glacial acetic acid for 12 h, and then transferred to 70% ethanol at -20°C until use. Root tips were washed in standard saline citrate buffer (SSC) and digested with 50% (v/v) Viscozyme® L (Sigma V2010) at 37°C for 25 min. Meristematic cells were dissected, gently pipetted onto slides, macerated in a drop of 60% acetic acid, and squashed under a glass coverslip. Slides were frozen in liquid nitrogen and the coverslip was flipped off with a razor blade. The 18S probe was a 1 kb fragment isolated from Triticum aestivum, which was labeled with digoxigenin-11-dUTP by nick translation and detected with anti-digoxigenin-fluorescein isothiocyanate (FITC). A probe for 5S rDNA sequences was prepared by amplifying and labeling the 120 bp genic sequence from Nicotiana rustica in the presence of biotin-16-dUTP. Biotin-labeled probes were detected with avidin-cyanine-3 (Cy3).
For FISH, the probe mixture was prepared as described by Robert et al. [13]. After overnight hybridization at 37°C, slides were given a stringent wash in 20% (v/v) formamide in 0.1% SSC at 40-42°C. Sites of probe hybridization were detected using 20 mg mL⁻¹ Anti-Digoxigenin-Fluorescein (Roche Biochemicals) and 5 mg mL⁻¹ Cy3-conjugated avidin (Amersham Pharmacia) in 4% SSC containing 0.2% (v/v) Tween-20 and 5% (w/v) bovine serum albumin. Chromosomes were counterstained with 2 mg mL⁻¹ 4′,6-diamidino-2-phenylindole (DAPI) in 4% SSC, mounted in Vectashield (Vector Laboratories) medium, and examined using a Leica DMRA2 epifluorescence microscope fitted with an Orca ER camera and Open Lab software (Improvision). All images were processed with Adobe Photoshop, with uniform color contrast and brightness adjustments. For all the probes, three to ten metaphase cells were analyzed. Densitometric units (DU) of the ribosomal DNA images were estimated using ImageJ software [40].

Cloning and sequences analysis of ribosomal DNA
Based on the highly conserved sequences of plant rDNA genes, a fragment of the 18S rDNA and the complete 5S rDNA were initially isolated by PCR with specific primers from A. tequilana leaves (Table 1). BLAST and phylogenetic analysis confirmed the close sequence homology of these amplicons with the corresponding genes in different plant species (Fig. 1). The 18S rDNA fragment (186 bp) shows 100% identity with Agave ghiesbreghtii and with the most representative genera of the Agavaceae subfamily (Anthericum, Chlorophytum, Hosta, Yucca) [41]. The 18S regions have remained relatively constant within the same family (Asparagaceae), making these DNA regions an important source of information for the study of phylogenetic relationships [42]. On the other hand, a search of public databases (GenBank/EMBL/DDBJ) showed that no 5S-ITS sequence data are available to date for the Asparagaceae family; therefore, we were unable to compare the sequences within this family. In spite of this, the 5S rDNA (505 bp) had 97-98% homology with various species of the Poaceae family, showing that the 5S sequence of A. tequilana shares conserved regions with species of economic importance (e.g. Avena sativa, Triticum turgidum and Zea mays). The complete sequence includes a conserved region (121 bp) and internal transcribed spacers (384 bp). In eukaryotic cells, the 5S rDNA encodes a 120-bp-long transcript with a molecular mass of 40 kDa [43]. Its secondary and tertiary structures are generally conserved across phylogeny [20,44]. In both cases, the partial or complete sequences of the individual rDNAs indicate the high evolutionary conservation of the amplified fragments.

Generation of standard curve and quantification assay
The establishment of a standard curve by qPCR is one of the key steps in determining the copy number of a given target sequence. Standard curves for the 18S and 5S rDNAs were generated from known starting pDNA concentrations and used to determine, by interpolation, the copy number per genome of these ribosomal genes in a given amount of sample DNA. To generate the curves, serial dilutions of the plasmids containing the 18S and 5S rDNA fragments were amplified, and a linear relationship between the level of fluorescence (ΔCt) and the input amount of pDNA was obtained (Fig. 2). The ΔCt values were then plotted against the log of the initial copy number of template pDNA, ranging from 1 × 10³ to 1 × 10¹⁰ copies/μL. The correlation (R²) between ΔCt value and log pDNA concentration was > 0.996 for the 18S and 5S rDNAs. The slopes (-3.238 and -3.095) of the regression equations indicated very efficient amplification rates (Table 1). This demonstrates a high correlation between the original amount of pDNA in the template and the Ct value obtained after amplification, as well as a high efficiency of the qPCR reaction [30,45,46]. The absolute quantification method relies on a standard plot constructed from known concentrations of standards to measure the actual copy numbers of a particular target, and is therefore considered to be more informative and reliable for comparisons [47,48]. The quantity of target in samples can be evaluated with reasonable accuracy only by using a properly characterized standard.
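The link between the slope of a standard curve and the amplification efficiency can be made explicit with a short Python sketch. It uses an ordinary least-squares fit of Ct against log10(copies) and the standard relation E = 10^(−1/slope) − 1; both are generic choices and are not claimed to be the exact procedure implemented in the instrument software. The Ct values in the example are hypothetical, while the two slopes are those reported in the text.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = a + b * log10(copies).

    Returns (intercept a, slope b, coefficient of determination R^2).
    """
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


def amplification_efficiency_percent(slope):
    """Efficiency in percent; a slope of about -3.32 corresponds to 100%."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical dilution series: log10(copies/uL) and measured Ct values.
    log_copies = [3, 4, 5, 6, 7, 8, 9, 10]
    cts = [30.4, 27.1, 23.9, 20.5, 17.4, 14.1, 10.9, 7.6]
    a, b, r2 = fit_standard_curve(log_copies, cts)
    print(f"intercept={a:.2f} slope={b:.3f} R2={r2:.4f}")

    # Slopes reported in the text for the two standard curves.
    for reported_slope in (-3.238, -3.095):
        eff = amplification_efficiency_percent(reported_slope)
        print(f"slope {reported_slope}: efficiency {eff:.1f}%")
```

Under this relation the reported slopes correspond to efficiencies of roughly 104% and 110%, in line with the 104-110% range given for the primer pairs.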

Determination of ribosomal DNA copy number in A. tequilana
The copy number of the 18S and 5S rDNAs was estimated with a fixed amount of total DNA (10 ng) from five different A. tequilana plants. Absolute quantification was performed using the ΔCt values and the constructed standard curves. Comparison of the ΔCt values for each sample shows minimal variation, based on the standard deviation and coefficient of variation (Fig. 3; Table 2). This demonstrates the reproducibility and value of qPCR for the quantitative detection of the ribosomal genes. The mean Ct values were used to determine the copy number per genome and the contribution of each rDNA to the genome size of Agave (Table 2). Copy number was based on the 4C DNA content (15.01 pg) of the A. tequilana genome [13]. The number of copies of the 18S rDNA was 51 times higher than that of the 5S rDNA, and the two contributed 1.47 and 8.38 × 10⁻³% of the genome size, respectively. This result contrasts with previous results, which reported a greater number of copies per genome of the 5S rDNA, 1.5 and 1.92 times more than the 18S rDNA in A. tequilana and in species of the genus Agave, respectively [37]. This difference may arise because in that study the rDNA copy number was determined by quantitative capillary electrophoresis, from the area under each peak (fluorescence intensity) of the different fragment sizes for each rDNA, and because of variation in the amount of DNA estimated in about 30,000 nuclei·mL⁻¹ for each Agave L. accession [37,49]. In general, these results showed a wide range of variation in rDNA gene proportion for each species tested. Meanwhile, Gomez-Rodriguez et al. [36] performed the physical mapping of 5S and 18S ribosomal DNA in three species of Agave, but this method does not have the resolution (>1 kb or tandem sequences) to determine the number of copies [25]. At present, qPCR analysis makes it possible to determine single nucleotide polymorphisms (SNPs), small insertions and deletions, and genomic variation between species, with the capability to detect copy numbers in a range of 1 to 4 in a diploid genome [25,29]. The qPCR technique offers great flexibility and adaptability and can be carried out in a closed system, which eliminates the risk of PCR and sample contamination and requires no post-processing of PCR products. Consequently, the copy number of rDNAs and DNA transposons in the genomes of different organisms (plants, algae, fungi, and animals) has been estimated by qPCR [12,22,23,50]. In this regard, the combination of qPCR, DNA C-values and analysis of target genes has been proposed as a viable methodology for estimating their contribution to the genome [12,21,35,51]. Comparison of the rDNA copy number per genome from our data with the values reported in plants shows close agreement (150 to 75,000 rDNA copies), although copy number per genome may vary greatly within a given species [35,52,53]. In a set of microalgae and unicellular algae, numbers range from 1 to 36,896 rDNA copies [32,34]; in fungi they range from 60 to 220 [54]; and in protozoa from 3385 to 315,786 [22]. Moreover, copy numbers in A. tequilana are generally higher than in both animals 300) and humans (100-12,900) [34,55]. Therefore, rDNA copy number variations suggest evolutionary mechanisms (gene duplication processes, structural rearrangements and genome mutations) acting on these essential genes in each species [21,26], providing a mechanism for cell homeostasis and quick and reversible adaptation.
Fig. 2. qPCR detection showing the development of the fluorescence signal for serial dilutions of the pDNA of 18S (a) and 5S (b) rDNA. Amplification plots were generated from serial dilutions of pDNA ranging from 100 ng to 0.1 pg with specific primer pairs. In each case, the standard curves were generated from the amplification data. Reproducibility was analyzed by evaluating the slope and correlation coefficient of the curve. Three replicates were performed for each sample.
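The conversion from copies measured in a reaction to copies per genome and to a percentage of genome size can be sketched in Python as follows. This is one plausible reading of the calculation, not the authors' documented procedure: it assumes that the 10 ng of template is divided by the 4C DNA content (15.01 pg) to give the number of genome equivalents, and that picograms are converted to base pairs with the widely used factor of 0.978 × 10⁹ bp per pg, which is not stated in the text. The measured copy total and the repeat-unit length in the example are hypothetical.

```python
PG_TO_BP = 0.978e9  # base pairs per picogram of DNA (assumed conversion factor)


def genome_equivalents(dna_ng, genome_pg):
    """Number of genome equivalents contained in dna_ng nanograms of DNA."""
    return dna_ng * 1000.0 / genome_pg  # 1 ng = 1000 pg


def copies_per_genome(total_copies, dna_ng, genome_pg):
    """Copies measured in the reaction divided by the genome equivalents present."""
    return total_copies / genome_equivalents(dna_ng, genome_pg)


def genome_fraction_percent(copies, unit_bp, genome_pg):
    """Share of the genome (in percent) occupied by `copies` units of unit_bp each."""
    genome_bp = genome_pg * PG_TO_BP
    return 100.0 * copies * unit_bp / genome_bp


if __name__ == "__main__":
    DNA_NG = 10.0        # template per reaction, from the text
    GENOME_PG = 15.01    # 4C DNA content of A. tequilana, from the text

    total_copies = 6.0e6  # hypothetical copies interpolated from a standard curve
    unit_bp = 1800        # hypothetical length of one repeat unit

    per_genome = copies_per_genome(total_copies, DNA_NG, GENOME_PG)
    share = genome_fraction_percent(per_genome, unit_bp, GENOME_PG)
    print(f"genome equivalents in reaction: {genome_equivalents(DNA_NG, GENOME_PG):.1f}")
    print(f"copies per genome: {per_genome:.0f}")
    print(f"contribution to genome size: {share:.4f}%")
```

Because the result scales linearly with the unit length chosen, the percentage depends on whether the amplicon, the gene, or the whole repeat unit is counted, so that choice should be stated alongside any reported value.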
Copy number variation mainly affects the members of large families of functionally redundant genes [29,31]. The effects of this variation on phenotype are usually modest; nevertheless, there are many cases in which they have been linked to important traits such as resistance to biotic and abiotic stress. In economically important plants, changes in the rDNA copy number may be related to biological processes (flowering time, plant height and response to stress), and genome organization (size, composition, ploidy level), allowing selection of plants with outstanding agronomic characteristics [56].

Comparison of FISH images and the rDNA copy number by qPCR
To confirm the copy number of the 18S and 5S rDNAs in the genome of A. tequilana, the physical mapping of the rDNAs was examined using dual-color fluorescence in situ hybridization (FISH) (Fig. 4). The diploid chromosome number was confirmed as 2n = 60, with a bimodal karyotype composed of 5 pairs of large chromosomes and 25 pairs of small chromosomes [13,36,57]. The 18S rDNA was associated with the secondary constriction of a large chromosome pair, while the 5S rDNA was located on both arms of a small chromosome pair. The 18S rDNA signals were also stronger than the 5S rDNA signals. Comparing the signals obtained for the 18S (151,992 ± 109 DU) and 5S (36,539 ± 3721 DU) rDNAs with the copy number per genome of A. tequilana showed a linear relationship (R² = 0.992), indicating that the qPCR estimates for the two rDNAs were consistent with the FISH data. This approach has been used to identify and characterize rDNA copy number variation in the genomes of plants and microalgae [29,34], and it establishes a good relationship between rDNA copy number and the direct visualization of DNA copy number at the level of individual cells. It also suggests that qPCR data could provide a better analysis when attempting to resolve tandem duplications, since FISH is considered a semi-quantitative technique [58] in which the size and intensity of hybridization signals are only an indicator of the copy number of the targeted sequence. On the other hand, the genus Agave exhibits different ploidy levels, ranging from diploid (A. tequilana, 2×) to octaploid (8×) [1,4,11,13]. Regardless of polyploidy, the bimodal karyotypes are maintained, which suggests that speciation in Agave was due to structural rearrangements in chromosome groups [1,58]. However, complex interactions between these factors and environmental growing conditions exist [10], which have resulted in the loss and gain of rDNA loci and probably also in repeat copy number variation in each rDNA array.
In some economically important crops such as banana, wheat and potato, these variations affect the copy number and the expression level of rDNA polymorphisms, increasing the occurrence of productive traits in these crops [10,56]. Therefore, this method could be used to study repeated regions (rDNA) in the genome, not only of A. tequilana but also of other species of the genus, helping us to understand some of the processes involved in the genome organization, genetic diversity and adaptive plasticity of these species.

Conclusions
This study reports a methodology for estimating the copy number of the 18S and 5S rDNAs and their contribution to the genome size of A. tequilana. Our estimate is based on standard curves generated by absolute quantification with qPCR and on the 4C DNA content reported for this species. The methodology proved to be fast and robust, and its variability was validated using different concentrations of pDNA and gDNA from different A. tequilana plants.
The resulting dataset showed that the copy number of the 18S rDNA per genome was greater than that of the 5S rDNA, as was its percentage contribution to genome size. These data were confirmed by FISH, and the two analyses were significantly correlated. The association between rDNA copy number and genome size provides convincing evidence of a strong relationship between the two, and among different groups of organisms. However, more research is needed to understand the genetic changes that influence genome organization in Agave species and to develop breeding programs that preserve biodiversity and make use of elite biotypes with desirable production traits.

Financial support
This work was supported by the National Council of Science and Technology of Mexico (CONACYT) with the Science Project (CB-50268).

Conflict of interest
There is no conflict of interest.