Comparative analysis of 16S ribosomal RNA of 'Candidatus Liberibacter asiaticus' associated with Huanglongbing disease of Persian lime and Mexican lime reveals a major haplotype with worldwide distribution

Centro de Investigación Científica de Yucatán, A.C (CICY), Unidad de Bioquímica y Biología Molecular de Plantas, Yucatán, México. Centro de Investigación Científica de Yucatán, A.C (CICY), Unidad de Biotecnología, Yucatán, México. Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias (INIFAP) Campo Experimental Mocochá, Yucatán, México. Centro de Investigación y Asistencia en Tecnología y Diseño del Estado de Jalisco, A.C.(CIATEJ) Unidad Sureste, México.


INTRODUCTION
Huanglongbing (HLB), or citrus greening, is considered the most destructive and devastating disease of citrus trees worldwide (Gottwald et al., 2007). The disease affects almost all major citrus fruit trees, with sweet oranges, mandarins and mandarin hybrids being most affected (Bové, 2006). HLB has spread throughout the majority of citrus-producing countries, with millions of dollars lost for growers. The disease is feared worldwide because citrus trees, once infected, will irrevocably deteriorate. For many years no effective treatment for the disease existed, and successful control relied on preventing trees from becoming infected (Teixeira et al., 2008). HLB disease management involves three principal components: control of the insect vector Diaphorina citri by chemical and biological methods, planting of pathogen-free nursery stock, and removal of the inoculum by destroying infected trees (Grafton-Cardwell et al., 2013). More recently, controlled heat treatment by continuous thermal exposure to 40 to 42°C for a minimum of 48 h was shown to be sufficient to significantly reduce the titer of 'Candidatus Liberibacter asiaticus' ('Ca. L. asiaticus') or to eliminate it entirely in HLB-affected citrus seedlings (Hoffman et al., 2013). HLB is caused by a phloem-limited bacterium that was originally described by Garnier et al. (1984) using electron microscopy as an intracellular pathogen and was placed in the α-Proteobacteria subdivision. Three identified species are the causative agents: 'Candidatus Liberibacter africanus' ('Ca. L. africanus'), 'Candidatus Liberibacter americanus' ('Ca. L. americanus'), and 'Ca. L. asiaticus' (McClean and Oberholzer, 1965; Capoor et al., 1967; Bové, 2006).
Because these bacteria are obligate pathogens and axenic cultures have been difficult to obtain, molecular techniques based on amplification of the 16S rDNA are essential tools for identifying them and analyzing their phylogeny and taxonomy. Diagnosis of HLB is made by means of PCR on leaves of diseased trees showing symptoms such as blotchy mottling, yellowing veins and green islands, in which the bacterial titer is generally high (Teixeira et al., 2008). Real-time quantitative PCR (qPCR) is another method that has been used for detection and quantification of the pathogen. In plants and insect vectors, positive amplification was achieved with as few as 10 cells per PCR reaction, so the pathogen is detected even at low levels (Li et al., 2006, 2007, 2008; Wang et al., 2006; Teixeira et al., 2008).
Before the complete genome sequence of 'Ca. L. asiaticus' (ASM2376V1) was reported, information on the genetic diversity of HLB pathogens was scarce (Duan et al., 2009). Diversity studies were restricted to the analysis of sequences from 16S/23S genes, the omp gene region, or the rplKAJL-rpoB operon (Villechanoux et al., 1992; Planet et al., 1995; Jagoueix et al., 1997; Subandiyah et al., 2000; Bastaniel et al., 2005). In particular, analysis of the 16S rDNA region has been used to estimate the genetic diversity among worldwide strains, with many Asian strains having identical 16S rRNA sequences, e.g., sequences from Japan, Taiwan, Indonesia, Philippines, Vietnam, and Thailand (Jagoueix et al., 1994; Subandiyah et al., 2000; Tomimura et al., 2009). Furthermore, numerous single nucleotide polymorphisms (SNPs) identified using restriction fragment length polymorphism (RFLP) have been reported in one Chinese and two Indian strains collected in Karnataka in the southwest of India (Adkar-Purushothama et al., 2009).
In México, the first HLB-infected tree was detected in 2009 in the municipality of Tizimin, Yucatán, and subsequently in the states of Quintana Roo, Nayarit, and Jalisco. Afterward, the occurrence of HLB was confirmed in different localities of Campeche, Colima, Sinaloa and Michoacán (Senasica-Sagarpa, 2010). We initiated a study with the aim of detecting the pathogen 'Ca. L. asiaticus' in citrus trees on the Yucatán Peninsula showing the classical symptoms of HLB. Here we report the genetic diversity of the 16S rRNA gene of 'Ca. L. asiaticus' strains from symptomatic citrus plants of the species Citrus latifolia and Citrus aurantifolia. By analyzing the sequences from Mexico and other countries, we identified a universal haplotype (H36PENINSULAR) that has a worldwide distribution and has been detected in both citrus species and the insect vector.

Plant samples
The plant samples were collected during 2010 and 2011 in plantation fields and from backyard trees of C. latifolia Tanaka (Persian lime) and C. aurantifolia Christmann (Mexican lime) located in the states of Yucatán, Quintana Roo, and Campeche, México. Leaves were sampled from citrus plants with characteristic HLB symptoms such as blotchy mottling, yellowing veins and green islands. Plant material was stored at 4°C and transported to the laboratory for DNA extraction. The leaves of five healthy citrus plants were used as a control.

Total genomic DNA extraction
Leaves were rinsed twice with sterile distilled water and twice with 95% (v/v) ethanol, and midribs were cut with a sterile scalpel. One-tenth of a gram of midribs was macerated with a pestle in a mortar containing liquid nitrogen. DNA was extracted using the CTAB method (Murray and Thompson, 1980). To eliminate impurities from the DNA preparation, samples were processed twice with one volume of phenol-chloroform (1:1) and once with chloroform. Precipitation of the DNA was performed with 3 M sodium acetate at pH 5.0 (1/10 of volume) plus 1 volume of 2-propanol.

Cloning and sequencing of the 16S rRNA gene
PCR amplification of the 16S rRNA gene was carried out using REDTaq polymerase (Sigma-Aldrich) and oligonucleotides OI1/OI2C (Jagoueix et al., 1994). The reaction was performed in a total volume of 25 µl containing 1X PCR buffer, 0.2 mM of each dNTP, 0.4 µM of each primer, 5 ng of template DNA, and 1 unit of DNA polymerase. The PCR conditions were as follows: an initial denaturation step of 94°C for 2 min; 40 cycles at 94°C for 30 s, 62°C for 30 s, and 72°C for 1 min; and a final extension at 72°C for 10 min. Amplified fragments of 1167 bp were purified using the Qiaex II Gel Extraction Kit (Qiagen) and then cloned into the pGEM T-Easy Vector (Promega). Recombinant plasmids were purified using the QIAprep Spin Miniprep Kit (Qiagen). Sequencing of the plasmids was performed on both DNA strands.
Escherichia coli cells were grown in Luria-Bertani medium, and standard procedures for the growth and transformation of the cells were used (Sambrook and Russell, 2000). Ampicillin was added at a final concentration of 100 µg/mL.

Analysis of 16S rRNA sequences of 'Ca. L. asiaticus' and phylogenetic tree construction
The sequences were assembled and trimmed using the Sequencher software, version 5.0 (Gene Codes Corporation, Ann Arbor, MI USA http://genecodes.com). Edited sequences were analyzed for similarity using the BlastN program, and 16S rDNA gene sequences were retrieved from the non-redundant NCBI database (http://blast.ncbi.nlm.nih.gov/). Selected sequences were aligned using the ClustalW software configured for highest accuracy (Larkin et al., 2007). The phylogenetic relationships were determined using the Neighbor-Joining algorithm in the Mega 4 software (Tamura et al., 2004), and the Kimura 2-parameter statistical model was applied (Kimura, 1980). The confidence of the grouping was verified using bootstrap analysis (1000 replications). Sinorhizobium meliloti RFP1 (EU271786), Rhizobium etli CFN42 (NR029184), and Escherichia coli (J01859) were used as outgroups.
Ninety-three sequences of the 'Ca. L. asiaticus' 16S rRNA gene, at least 1000 bp in size, were retrieved from the GenBank sequence database (Benson et al., 2005). Sequences were edited and phylogenetic analysis was performed as described above.
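As an illustration of the distance model named above, the following minimal sketch computes the Kimura 2-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], between two aligned sequences, where P and Q are the proportions of transitional and transversional differences. It is not the Mega 4 implementation; the function name `k2p_distance` and the pairwise skipping of gapped or ambiguous sites are assumptions made for this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites with a gap or an
    ambiguous base in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguous base: site not compared
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is the kind of input on which a Neighbor-Joining tree is built.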

Identification of single nucleotide polymorphic sites and haplotype designation
All the DNA sequences of the 16S rRNA gene were aligned using Sequencher software version 5.0. The identical sequences were separated and the group was realigned to confirm the percentage of identity. A representative sequence was used for realignment with the sequences with an identity value below 99.5%. Single nucleotide polymorphisms (SNPs) were identified visually (den Dunnen and Antonarakis, 2001). A sequence was considered a haplotype when 2 or more samples had a mutation in the same position (Arias et al., 2010). The SNPs and haplotypes were confirmed by means of the DnaSP software version 5.0 (Librado and Rozas, 2009). SNPs in the 16S ribosomal gene sequences of 'Ca. L. asiaticus' obtained from the GenBank database were defined from a multiple alignment, using the Mexican sequence as the reference to identify the positions of the SNPs.
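The comparison against a reference sequence and the "2 or more samples" criterion can be sketched as follows. This is only one possible reading of the rule, not the Sequencher or DnaSP procedure; the function names and the toy sequences in the usage note are hypothetical, and the sequences are assumed to be already aligned to the same length.

```python
from collections import Counter


def find_snps(reference, sample):
    """Return {1-based position: (reference base, sample base)} for every
    site at which an aligned sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return {
        i + 1: (r, s)
        for i, (r, s) in enumerate(zip(reference.upper(), sample.upper()))
        if r != s
    }


def shared_variant_positions(reference, samples, min_samples=2):
    """Positions that are mutated in at least `min_samples` samples.

    `samples` maps a sample name to its aligned sequence. The default of 2
    mirrors the criterion stated in the text (an interpretation).
    """
    counts = Counter()
    for seq in samples.values():
        counts.update(find_snps(reference, seq).keys())
    return sorted(pos for pos, n in counts.items() if n >= min_samples)
```

For example, with the toy reference "ACGTACGT" and samples "ATGTACGT" and "ATGTACGA", position 2 is shared by both samples, whereas position 8 occurs in only one and would not support a haplotype under this reading.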

Real-time qPCR
TaqMan amplification reactions were performed on a StepOne Real-Time PCR thermocycler (Applied Biosystems, Foster City, California, USA). PCR amplifications were performed with EXPRESS qPCR Supermix Universal (Invitrogen) in a 20 µl reaction containing 10 µl of 2X qPCR SuperMix, 25 µM ROX reference dye, 250 nM of each target primer (HLBas and HLBr), and 150 nM of target probe (HLBp). As a positive internal control, 300 nM of each internal control primer (COXf and COXr) and 150 nM of internal control probe (COXp) were used (Li et al., 2006). The cycling conditions were 95°C for 2 min followed by 35 cycles of 95°C for 15 s and 60°C for 1 min. To exclude false-positive results, control reactions were run with genomic DNA from a PCR-positive HLB-diseased plant, DNA from five healthy citrus plants, and distilled water. A second real-time PCR assay was carried out for the positive reactions.

Detection of 'Ca. L. asiaticus'
PCR amplification of the 16S rRNA gene of 'Ca. L. asiaticus' was carried out for 214 genomic DNA samples purified from leaf samples collected from HLB-symptomatic and asymptomatic citrus trees from Campeche, Yucatán, and Quintana Roo. 'Ca. L. asiaticus' was detected in 25 trees; 6 of the leaf samples were collected from citrus plants in Campeche, 7 in Quintana Roo, and 12 in Yucatán. The presence of 'Ca. L. asiaticus' in HLB-symptomatic citrus plants was detected mostly in C. latifolia trees. The HLB-positive samples were confirmed by means of real-time PCR, and CT values are shown in Table 1. This analysis, however, showed an unexpectedly low number of PCR-positive samples from HLB-symptomatic citrus plants. To increase the reliability of the results, real-time qPCR was carried out for all 214 samples. The results showed an increase in the positive samples from 25 to 70 (35 from Campeche, 15 from Quintana Roo and 20 from Yucatán). The real-time qPCR test improved the detection of diseased plants by identifying false-negative samples. These results suggest a low titer of the bacteria in the phloem tissue of the leaves and an uneven distribution in the host (Hung et al., 1999; Li et al., 2008).
In México, the presence of 'Ca. L. asiaticus' on citrus plantations was first reported in Yucatán, particularly in the municipality of Tizimin (Senasica-Sagarpa, 2009). Initially, PCR was the method of choice to detect the pathogen, but the low percentage of detection, even in HLB-symptomatic trees, made it necessary to use the real-time qPCR method. At present, although the HLB vector (Diaphorina citri) has been collected in the 23 citricultural states of México, the disease has not yet been reported in all of them. In this regard, it is worth mentioning that Mexican authorities established Official Mexican Guidelines (NOM-EM-047-FITO-2009, http://www.senasica.gob.mx/?Idioma=2&doc=9366), which specify actions, such as total destruction of trees, fruits and derivatives, that must be implemented immediately once the bacteria are detected.
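The counts reported above can be turned into overall detection rates with a short calculation. The sketch below uses only the numbers given in the text; the variable and function names are illustrative. Per-state rates are not computed because the number of samples collected in each state is not stated.

```python
TOTAL_SAMPLES = 214  # genomic DNA samples tested by both methods

# Positive samples per state, as reported in the text.
pcr_positive = {"Campeche": 6, "Quintana Roo": 7, "Yucatán": 12}
qpcr_positive = {"Campeche": 35, "Quintana Roo": 15, "Yucatán": 20}


def detection_rate(positives_by_state, total):
    """Return (number of positives, percentage of all samples)."""
    positives = sum(positives_by_state.values())
    return positives, 100.0 * positives / total


pcr_n, pcr_pct = detection_rate(pcr_positive, TOTAL_SAMPLES)
qpcr_n, qpcr_pct = detection_rate(qpcr_positive, TOTAL_SAMPLES)

print(f"conventional PCR: {pcr_n}/{TOTAL_SAMPLES} ({pcr_pct:.1f}%)")
print(f"real-time qPCR:   {qpcr_n}/{TOTAL_SAMPLES} ({qpcr_pct:.1f}%)")
print(f"positives found only by qPCR: {qpcr_n - pcr_n}")
```

On these figures, conventional PCR detected the pathogen in about 11.7% of the samples and real-time qPCR in about 32.7%, a difference of 45 samples.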

Sequence and phylogenetic analysis
DNA fragments of 1167 bp from the 16S rDNA gene amplified from the 25 conventional PCR-positive samples were cloned, and three independent transformed cell clones harboring the constructs were selected from each positive sample. Sequencing of 75 independent plasmids was performed, and individual sequences were trimmed, edited, and analyzed as mentioned earlier. The multiple alignment showed sequences with similarity of 99 to 100% to sequences of 'Ca. L. asiaticus', 96% with 'Ca. L. africanus' and 'Ca. L. solanacearum', and 94% with 'Ca. L. americanus'. A phylogenetic tree constructed with representative sequences from the Yucatán Peninsula and sequences deposited in the GenBank database is shown in Figure 1. As expected, the sequences of our strains clustered with sequences of 'Ca. L. asiaticus' from different countries. They are clearly more closely related to sequences of 'Ca. L. africanus' than to sequences of 'Ca. L. americanus'. The above alignment also showed the clustering of sequences into two groups. The first group contained 26 sequences with an identity of 100%, and the second one contained 43 sequences with 1 to 5 polymorphic sites. The most important feature of the first subset of sequences was their geographical distribution across the three Mexican states. For this reason, we considered this sequence to be the major haplotype identified on the peninsula, which we named H36PENINSULAR (GenBank accession No. JQ867409).

Worldwide distribution of the haplotype H36PENINSULAR
An analysis similar to the one described above was carried out with 93 16S rRNA sequences of 'Ca. L. asiaticus', at least 1000 bp in length, retrieved from the GenBank sequence database (Table 2). The multiple alignment showed the formation of two groups of sequences, in which the principal group contained 41 sequences with 100% similarity. For simplicity's sake, this consensus sequence was provisionally named HLB-CLas. The second group contained sequences with 1 to 15 nucleotide variations in different parts of the gene. Metadata of the members of the first group showed that all sequences originated from samples of different species of citrus trees and from the D. citri insect vector collected in the Dominican Republic, Florida (USA), Brazil, Indonesia, Vietnam, Thailand, Taiwan and Japan. Identical sequences of 16S rDNA were obtained in Asiatic strains from Japan, Taiwan, Indonesia, the Philippines, Vietnam and Thailand (Subandiyah et al., 2000; Tomimura et al., 2009). Further studies based on the analysis of the 16S rRNA gene and the omp gene region of 'Ca. L. asiaticus' found that sequences from northeastern India were more closely related to sequences from Japan, Southeast Asia, USA (Florida) and Brazil than to sequences from other Indian regions. Additionally, the study showed that common Asian strains are distributed in India together with other atypical strains (Miyata et al., 2011). Finally, a sequence comparison of (Gupta et al., 2012). Comparison between the sequences H36PENINSULAR and HLB-CLas was performed using the CLC Workbench software, version 6.1 (data not shown). Alignment of the two sequences showed a perfect match (100% identical), which suggests that the H36PENINSULAR sequence can be considered the principal haplotype with worldwide distribution, which includes the Mexican states of Yucatán, Campeche and Quintana Roo.
Therefore, we believe that this universal sequence could be used as reference for the analysis of 16S rRNA sequences of 'Ca. L. asiaticus'.

Genetic diversity of 16S rRNA sequences of 'Ca. L. asiaticus'
To analyze the genetic diversity of the 16S rRNA gene in 'Ca. L. asiaticus', a nucleotide comparison was carried out between the H36PENINSULAR sequence and the two subsets of polymorphic sequences described above. For the Mexican sequences, the SNPs were identified as mentioned in Materials and Methods and variant names were assigned. Forty-two variants were identified and their sequences were deposited in the GenBank database (Table 3). The table shows the position and type of mutation for each sequence and makes clear the predominance of sequences with 1 or 2 polymorphic sites. The modifications were transitions, transversions, insertions and deletions, with transitions dominating. In the subset of sequences from GenBank, the variations detected ranged from 1 to 15 SNPs across the gene, thus showing a diversity of single and multiple polymorphic sites (Table 4). In addition, the GenBank sequences showed a higher number of polymorphic sites than the Mexican variants; the predominant mutations were transitions, although there were also some transversions, insertions, deletions and substitutions (Figure 2).
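The mutation classes listed above can be assigned mechanically from an alignment against the reference. The following minimal sketch assumes gaps are written as "-" and that both sequences are already aligned to the same length; the function names are hypothetical and the code is not the procedure used to build Tables 3 and 4.

```python
from collections import Counter

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def classify_site(ref_base, alt_base):
    """Classify one aligned site relative to the reference base."""
    ref_base, alt_base = ref_base.upper(), alt_base.upper()
    if ref_base == alt_base:
        return "match"
    if ref_base == "-":
        return "insertion"   # base present in the sample only
    if alt_base == "-":
        return "deletion"    # base present in the reference only
    if ref_base not in "ACGT" or alt_base not in "ACGT":
        return "ambiguous"   # e.g. an undetermined nucleotide (N)
    pair = {ref_base, alt_base}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"    # A<->G or C<->T
    return "transversion"      # purine<->pyrimidine


def summarize_variants(reference, sample):
    """Count each mutation class over a pair of aligned sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    kinds = (classify_site(r, s) for r, s in zip(reference, sample))
    return Counter(k for k in kinds if k != "match")
```

Counting classes in this way for each variant gives the kind of summary needed to judge whether transitions dominate.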
In relation to the SNPs, our findings show that Mexican strains have a small number of polymorphic sites distributed in a random fashion in the 1167 bp DNA fragment. In contrast, most GenBank sequences show high variability, which is suggestive of misreads during sequencing rather than genuine SNPs. To our knowledge, the 'Ca. L. asiaticus' variants of Mexican isolates described in this study are the largest set of reported sequences with SNP sites in the 16S rDNA gene; as a result, 43 sequences were identified and added to the GenBank sequence database. Looking for possible phylogenetic relations between the polymorphic sequences, we constructed a tree using the Neighbor-Joining method (Figure 3). It was not possible to distinguish clusters of related polymorphisms, because only 3 well-supported clades were evident: a monophyletic clade that includes all the sequences corresponding to the 16S rRNA gene of 'Ca. L. asiaticus' irrespective of the geographical region, and 2 independent clades corresponding to 'Ca. L. africanus' and 'Ca. L. americanus', as expected. Other studies have analyzed the genetic diversity of the 16S rRNA sequences. Based on the RFLP technique, Adkar-Purushothama et al. (2009) identified 14 genetic lineages by analyzing the SNPs present in the 16S rRNA, which revealed a new lineage on the Indian subcontinent. Recently, all 'Ca. L. asiaticus' 16S rDNA sequences deposited in the GenBank database were reanalyzed to determine whether the discrepancy in reports of 16S variation can be resolved and whether this variation has a geographic origin (Nelson, 2012). The authors used geographic designations available in the metadata of the deposited sequences. A 302 bp segment common to 175 sequences was used to assess the SNPs and their distribution; the researchers found 118 identical sequences with different accession numbers, and another 47 records exhibited 73 SNPs, most of them corresponding to a single accession number. A few SNPs occurred in more than one database record. The authors concluded that the reanalysis does not provide sufficient confidence to confirm haplotypes of 'Ca. L. asiaticus' based on the 16S rDNA sequences, because the low percentage of SNPs in the segment studied suggested misreads during sequencing rather than genuine haplotypes. In contrast to this view, heterogeneity in 16S rRNA genes is detectable in the variable regions of the gene (V1, V2, V3, etc.). The location of the differences in the most variable part of the 16S rRNA supports the interpretation that they are true differences and not mere sequencing errors (Coenye and Vandamme, 2013).
Our results suggest that the sequence of the haplotype H36PENINSULAR may be a suitable reference sequence for the analysis of the 16S rRNA gene of novel 'Ca. L. asiaticus' strains, instead of the 16S rDNA sequences normally used for the detection of 'Ca. L. asiaticus', such as those of the Poona, Karnataka and Madikeri strains. The Poona strain has commonly been used as a reference sequence for phylogenetic analysis, even though its 16S rRNA sequence still contains undetermined nucleotides (Miyata et al., 2011). The amplification and analysis of the 16S rDNA sequences are useful for describing new nucleotide variations across the gene sequence.

Conflict of Interests
The author(s) have not declared any conflict of interests.

ACKNOWLEDGMENTS
M-E A received a scholarship grant from CONACYT No. 165309. FOMIX-Campeche No. Camp-2008-96801 and No. Camp-2008-94224 supported this study. We are grateful to Dr. Zahaed Evangelista-Martínez from CIATEJ A.C. for the critical review and correction of the manuscript.