Characterization of beta-tubulin DNA sequences within Candida parapsilosis complex

Background and Purpose: Candida parapsilosis is a common cause of candidemia in children and patients with onco-hematological diseases, and is also associated with septic arthritis, peritonitis, vaginitis, and nail and skin infections. Regarding this, the present study was conducted to evaluate intra- and inter-species variation within the beta-tubulin DNA sequence of the C. parapsilosis complex in order to establish the utility of this gene in the identification and phylogenetic analysis of the species. Materials and Methods: A total of 23 isolates representing three different species of the C. parapsilosis complex were used in this study, all of which were identified by ITS sequencing. For the amplification of the beta-tubulin gene, a newly designed set of pan-Candida primers was used, followed by bidirectional sequence analysis for pairwise comparisons, determination of multiple alignments, evaluation of sequence identity levels, counting of sequence differences, and construction of a phylogenetic tree. Results: The multiple alignment of 623-629 bp-long nucleotide (nt) sequences of the beta-tubulin gene indicated an inter-species divergence of 0-68 nt among C. parapsilosis, C. orthopsilosis, and C. metapsilosis, with a mean similarity of 84.7% among the species. Meanwhile, intra-species differences of 0-20 and 0-6 nt were found between the strains of C. parapsilosis and C. orthopsilosis, respectively. The phylogenetic tree topology was characterized by a clade made up of C. parapsilosis and C. orthopsilosis, while C. metapsilosis formed a related but separate lineage. Conclusion: Our data provide a basis for further discoveries regarding the relationships between the species belonging to the C. parapsilosis complex. Furthermore, the findings of the present study indicate the efficiency of beta-tubulin DNA sequence data in the identification and taxonomy of C. parapsilosis and other pathogenic yeasts.

Introduction
Candida parapsilosis is a common commensal of the skin that can cause candidemia in children and onco-hematologic patients, owing to its ability to adhere to vascular catheters, prosthetic devices, and the skin of health care workers [1][2][3]. This species can also cause septic arthritis, peritonitis, vaginitis, and nail and skin infections [4,5]. Early reports showed that C. parapsilosis is genetically more heterogeneous than other Candida species.
Based on techniques such as randomly amplified polymorphic DNA (RAPD), DNA sequencing, and morphotyping [6,7], this species was divided into three groups, namely C. parapsilosis I, II, and III [8]. However, molecular fingerprinting and mitochondrial genome signatures have shown that these groups correspond to three different species, namely C. parapsilosis sensu stricto, C. orthopsilosis, and C. metapsilosis [9,10].
The use of phylogenetic species concepts based on ribosomal DNA regions has greatly improved the taxonomy of yeasts. According to the ISHAM-ITS reference database (http://its.mycologylab.org/), there is intra-species diversity in the C. parapsilosis complex. Nevertheless, confirmation and refinement using other genes is long overdue. The description and characterization of new genetic markers for the C. parapsilosis complex can clarify its taxonomy and might be helpful for detection and identification purposes. Protein-coding genes, such as beta-tubulin (BT2), have proven to be powerful tools for the delimitation of closely related species and have been successfully used for the species delineation of fungal groups, such as Aspergillus [11], Penicillium [12], Scedosporium [13], dermatophytes [14,15], and Phaeoacremonium [16]. With this background in mind, the present study was conducted to compare BT2 gene sequences with ITS sequences within the C. parapsilosis complex and to investigate the resolution power of BT2 as a new genetic marker with regard to intra- and inter-species variation and its application in phylogenetic analysis, taxonomy, and identification.

Materials and Methods
A total of 23 isolates representing three different species of the C. parapsilosis complex, including 20 clinical isolates, 2 ATCC reference strains, and 1 TIMM reference strain (Table 1), were subjected to BT2 gene sequencing. The clinical isolates were selected from a collection of strains isolated from blood and other normally sterile clinical samples, which had previously been collected from children admitted to the Pediatric Intensive Care Unit of the Pediatric Medical Centers of Tehran, Iran [17].
DNA extraction and purification from the yeast colonies were accomplished using a previously described method [18]. For the preliminary identification of the strains, the ITS1-5.8S rDNA-ITS2 region was amplified using the ITS1 (5′-TCC GTA GGT GAA CCT GCG G-3′) and ITS4 (5′-TCC TCC GCT TAT TGA TAT GC-3′) primers. The polymerase chain reaction (PCR) products were subjected to digestion with the restriction enzyme MspI (Fermentas, Vilnius, Lithuania), as previously described [19]. To confirm identification, all PCR products were subjected to sequencing of the entire ITS region using both ITS1 and ITS4 primers, and the sequences were compared with valid reference sequences deposited in GenBank using BLAST (https://blast.ncbi.nlm.nih.gov/Blast).
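The restriction step can be illustrated with a small in silico digest. The following is a minimal sketch, not the procedure of reference [19]: it assumes a linear amplicon and uses the known MspI recognition site CCGG (cut between the two cytosines, C^CGG). The function name `digest` and the toy sequence are hypothetical and serve only to show how fragment sizes follow from the positions of the recognition site.

```python
MSPI_SITE = "CCGG"    # MspI recognition sequence
MSPI_CUT_OFFSET = 1   # MspI cuts after the first C: C^CGG


def digest(seq, site=MSPI_SITE, cut_offset=MSPI_CUT_OFFSET):
    """Return the fragment lengths obtained by cutting a linear
    sequence at every occurrence of the recognition site."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    # Fragment boundaries: sequence start, each cut position, sequence end.
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical toy amplicon (NOT a real ITS sequence) with two CCGG sites.
toy_amplicon = "ATGCATCCGGTTAACGATCGCCGGATAT"
fragments = digest(toy_amplicon)
print(fragments)
```

Applied to real ITS amplicons, the resulting fragment sizes would be compared with the banding patterns seen on the gel; a sequence without a CCGG site is returned as a single uncut fragment.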
For the sequence analysis of the BT2 gene, the BT2 sequences of various fungal species were obtained from GenBank and aligned using the Geneious software (http://www.geneious.com). A novel set of pan-Candida primers was designed and named BCF (5'-AAG AAT TCC CTG ATA GAA TGA TG-3') and BCR (5'-CCA ATG CAA GAA AGC TTT TCT T-3'). The PCR reactions contained 12.5 μL of premix (Ampliqon, Denmark), 2 μL (around 1 ng) of DNA template, 0.5 μM of each primer, and enough water to reach a final reaction volume of 25 μL. The reaction mixture was initially denatured at 95°C for 5 min, followed by 35 cycles of 30 sec at 94°C, 45 sec at 55°C, and 45 sec at 72°C, and a terminal extension step at 72°C for 5 min.
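As a quick sanity check on the primer pair, basic properties such as length, GC content, and an approximate melting temperature can be computed directly from the sequences given above. The sketch below is illustrative only: the helper names are hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is a crude estimate that was not reported as part of the primer design in this study.

```python
# Primer sequences as given in the text (spaces are removed before analysis).
PRIMERS = {
    "BCF": "AAG AAT TCC CTG ATA GAA TGA TG",
    "BCR": "CCA ATG CAA GAA AGC TTT TCT T",
}


def clean(primer):
    """Strip spaces and normalize to upper case."""
    return primer.replace(" ", "").upper()


def gc_percent(primer):
    """Percentage of G and C bases in the primer."""
    seq = clean(primer)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rough melting temperature by the Wallace rule:
    Tm = 2 * (A + T) + 4 * (G + C), in degrees Celsius."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, primer in PRIMERS.items():
    print(name, len(clean(primer)), round(gc_percent(primer), 1), wallace_tm(primer))
```

The two estimates differ by only 2°C, which is compatible with using a single annealing temperature for both primers. More accurate nearest-neighbor models would give different absolute values.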
For the strains that failed to amplify, a nested PCR was set up for the successful amplification of the gene using BCFN (5'-AAG AAT TCC CTG ATA GAA TGA TG-3') and BCRN (5'-CCA ATG CAA GAA AGC TTT TCT T-3') primers. Subsequently, 1 μL of the 1:50-diluted product of the first PCR was added as a template to the reaction mixture of the second PCR and subjected to the above-mentioned thermal conditions.
The PCR products were purified and sequenced in both directions with the BCFN and BCRN primers using the ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (Applied Biosystems, Foster City, CA, USA) on an automated DNA sequencer (ABI Prism™ 3730 Genetic Analyzer, Applied Biosystems) according to the manufacturer's instructions.
Forward and reverse sequences of each sample were subjected to ClustalW pairwise alignment using Geneious and MEGA6 software [20]. Furthermore, the consensus sequences were entered into BioEdit software, version 7.0.5 [21] for the determination of multiple alignments, evaluation of sequence identity levels, counting of sequence difference, and construction of phylogenetic tree.
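The idea of combining forward and reverse reads into one consensus sequence can be sketched as follows. This is a simplified illustration, not the Geneious or MEGA6 workflow: it assumes that the two reads cover exactly the same region and are already of equal length, so no alignment step is needed, and it marks every disagreement with N. The function names and toy reads are hypothetical.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def consensus(forward_read, reverse_read):
    """Build a consensus from a forward read and a reverse read.

    The reverse read is first reverse-complemented so that both reads
    are in the same orientation; positions where they disagree become N.
    """
    fwd = forward_read.upper()
    rev = reverse_complement(reverse_read)
    if len(fwd) != len(rev):
        raise ValueError("reads must cover the same region (equal length)")
    return "".join(f if f == r else "N" for f, r in zip(fwd, rev))


# Hypothetical toy reads: they disagree at a single position.
forward_read = "ATGCCGTA"
reverse_read = "TACTGCAT"  # reverse complement is ATGCAGTA
print(consensus(forward_read, reverse_read))
```

In practice, disagreements are resolved by inspecting the chromatograms rather than by masking, and overlapping reads of unequal length require a pairwise alignment first.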
A phylogenetic tree was built using the maximum-likelihood algorithm with the Tamura-Nei parameter as a substitution model in MEGA6. The reliability of the branches was assessed using the bootstrap method with 1,000 simulations. The nucleotide sequences obtained in the study and their corresponding amino acid sequences were deposited in the GenBank, under the accession numbers of MH352134 to MH352145.
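The bootstrap procedure rests on resampling alignment columns with replacement. The sketch below shows only this resampling step under simple assumptions; it does not infer trees or compute branch support, which MEGA6 performs on each replicate. The function names, the seed, and the toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample the columns of an alignment with replacement.

    `alignment` maps sequence names to aligned sequences of equal length.
    The same column indices are drawn for every sequence, so each
    resampled column is an intact column of the original alignment.
    """
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(seq[i] for i in columns)
        for name, seq in alignment.items()
    }


def bootstrap_replicates(alignment, n_replicates=1000, seed=0):
    """Generate pseudo-replicate alignments for bootstrap analysis."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Hypothetical toy alignment (not data from this study).
toy_alignment = {
    "isolate_A": "ATGCATGC",
    "isolate_B": "ATGAATGC",
    "isolate_C": "TTGAATGC",
}
replicates = bootstrap_replicates(toy_alignment, n_replicates=1000)
print(len(replicates), replicates[0])
```

A tree would then be inferred from each replicate, and the bootstrap value of a branch is the percentage of replicate trees in which that branch is recovered.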

Results
The primers designed in this study successfully amplified the target with a single band for all tested strains. Sequence analysis in BioEdit showed that the amplified fragments were 623-629 nucleotides (nt) long. The multiple alignment of the sequences indicated a mean similarity of 84.7% between the species. The sequence difference count matrix created by BioEdit showed considerable differences among the species belonging to the C. parapsilosis complex, including insertions/deletions and substitutions, indicating an interspecies divergence of 0-68 nt (Figure 1). The largest inter-species nucleotide difference (68 nt) was observed between C. metapsilosis (ATCC 96144) and C. parapsilosis clinical isolate 125. Meanwhile, intra-species differences of 0-20 and 0-6 nt were found within C. parapsilosis and C. orthopsilosis, respectively (data not shown). The intra-species differences allowed the distinction of five and six distinct BT2 genotypes in C. parapsilosis and C. orthopsilosis, respectively. Because only one strain of C. metapsilosis was tested, we could not evaluate the intra-species variation within this species. Bioinformatic analysis and nucleotide BLAST search revealed no introns in the fragments, and the region appeared to be evolutionarily conserved. Figure 2A illustrates the BT2 gene tree topology as computed by the MEGA6 software. The backbone of the tree had high bootstrap values (70%) within the species of the C. parapsilosis complex. Moreover, the interspecies relationships were evident in the clades (Figure 2A). The BT2 gene tree topology of the species was similar to that inferred from the ITS region analysis, with species clustering in similar, strongly supported clades (Figure 2B). Sequence variation between the Candida strains led to the formation of a clade consisting of C. parapsilosis and C. orthopsilosis, while the C. metapsilosis strain formed a separate lineage closely related to this clade.
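The sequence difference count matrix and the genotype counts reported above can be illustrated with a short sketch. It is not the BioEdit implementation: it assumes that the sequences are already aligned to equal length, counts a gap opposite a base as one difference, and treats each distinct sequence as one genotype. The function names and the toy sequences are hypothetical and are not data from this study.

```python
from itertools import combinations


def count_differences(a, b):
    """Number of alignment positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x != y)


def difference_matrix(seqs):
    """Pairwise difference counts for every pair of named sequences."""
    return {
        (m, n): count_differences(seqs[m], seqs[n])
        for m, n in combinations(seqs, 2)
    }


def count_genotypes(seqs):
    """Number of distinct sequences, each treated as one genotype."""
    return len({s.upper() for s in seqs.values()})


# Hypothetical toy alignment of four isolates.
toy = {
    "iso1": "ATGCATGC",
    "iso2": "ATGCATGC",
    "iso3": "ATGAATGC",
    "iso4": "TTGAATGC",
}
matrix = difference_matrix(toy)
print(min(matrix.values()), max(matrix.values()), count_genotypes(toy))
```

The minimum and maximum of such a matrix correspond to the ranges quoted in the text (for example, 0-20 nt within C. parapsilosis), and identical sequences collapse into a single genotype.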

Discussion
Candida parapsilosis is the second most common yeast involved in bloodstream infections among neonates, catheter-associated candidemia, and intravenous hyperalimentation in different regions, such as Latin America, Asia, and Europe [22][23][24]. Highly variable (ITS1 and ITS2) and conserved (18S, 5.8S and 28S) regions of ribosomal DNA have been used for the detection and differentiation of medically important Candida species. However, the improvement of the current databases by the identification of new genetic markers establishes a foundation for the better distinction of the closely related species [25].
The sequence difference count matrix of the BT2 gene observed among the members of the C. parapsilosis complex indicated that this locus may be more useful than the ITS regions (84.7% versus 89.6% similarity) for the discrimination of these three closely related species (data not shown). Meanwhile, the analysis of the hyphal wall protein 1 (HWP1) nucleotide sequence alignment revealed only 60% similarity between C. parapsilosis and C. orthopsilosis [26]. The inter-species divergence of BT2 sequences among the C. parapsilosis complex species ranged within 0-68 nt, which was similar to that of ITS sequences retrieved from GenBank (0-67 nt). This finding was in line with the results obtained through the pyrosequencing of ITS2, sequencing of ITS1, and restriction fragment length polymorphism patterns of the intergenic spacer region 1 [8,9,27,28]. Furthermore, the analysis of the intein in the vacuolar ATPase gene (VMA) in 85 strains of the C. parapsilosis complex showed that this locus is able to discriminate the members of the complex based on VMA intein sizing. In this regard, C. metapsilosis exhibits a VMA intein smaller than that of C. orthopsilosis [29]. The intra-species DNA sequence variations of 0-20 and 0-6 nt led to the identification of five and six distinct BT2 genotypes in C. parapsilosis and C. orthopsilosis, respectively.
The association between distinct genetic variants and emergence of C. parapsilosis complex in various clinical settings has not been fully elucidated yet [30]. The presence of more intra-species sequence variation in BT2 gene than in ribosomal genes makes this locus a potential candidate for the discrimination of organisms at the strain rather than the species level. Based on the in silico analysis of ITS1-5.8S-ITS2 region, C. parapsilosis and C. orthopsilosis had the intraspecific polymorphisms of 1-4 and 0-8 nt, respectively (data not shown). In addition, another analysis based on ISHAM-ITS reference database (http://its.mycologylab.org) revealed that C. parapsilosis, C. orthopsilosis, and C. metapsilosis had 2, 5, and 4 polymorphic sites, respectively.
Greater genetic variability of C. orthopsilosis in comparison to that of C. parapsilosis as observed by RAPD analysis has caused difficulties in the development of molecular techniques for the subtyping of these yeasts [9,31]. The BT2 gene tree topology constructed by means of the maximum-likelihood method revealed a cluster consisting of C. parapsilosis and C. orthopsilosis, with C. metapsilosis coming out as a separate lineage.
The phylogeny of C. parapsilosis complex based on BT2 gene is similar to the one inferred by using HWP1 gene [26], D1/D2 region of ribosomal RNA gene [32], ITS, and 26S rRNA gene [33] sequences confirming the placement of C. parapsilosis and C. orthopsilosis in 'psilosis' clade.

Conclusion
The present study is the first attempt to evaluate the BT2 gene as a new marker for the delineation of C. parapsilosis complex members. The obtained data can provide a basis for further discovery regarding the relationships of closely related yeast species. The constructed tree topologies showed a high concordance between BT2 and other markers, such as HWP1, the D1/D2 region of 26S rDNA, and ITS1-ITS2. Given the intraspecific polymorphism observed in C. parapsilosis and C. orthopsilosis, further studies on longer portions of the BT2 gene are needed to test its potential for genotyping.