Escaping introns in COI through cDNA barcoding of mushrooms: Pleurotus as a test case

Abstract DNA barcoding involves the use of one or more short, standardized DNA fragments for the rapid identification of species. A 648-bp segment near the 5′ terminus of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been adopted as the universal DNA barcode for members of the animal kingdom, but its utility in mushrooms is complicated by the frequent occurrence of large introns. As a consequence, ITS has been adopted as the standard DNA barcode marker for mushrooms despite several shortcomings. This study employed newly designed primers coupled with cDNA analysis to examine COI sequence diversity in six species of Pleurotus, the commonly cultivated oyster mushrooms, and compared these results with those for ITS. Amplification success, sequence variation within and among species, and the ability to design effective primers were tested. We compared ITS sequences to their COI cDNA counterparts for all isolates. ITS discriminated between all six species, but some sequence results were uninterpretable because of length variation among ITS copies. By comparison, complete COI sequences were recovered from all but three individuals of Pleurotus giganteus, where only the 5′ region was obtained. The COI sequences permitted the resolution of all species when partial data were excluded for P. giganteus. Our results suggest that COI can be a useful barcode marker for mushrooms when cDNA analysis is adopted, permitting identifications in cases where ITS cannot be recovered or where it offers higher resolution when fresh tissue is available. The suitability of this approach remains to be confirmed for other mushrooms.


KEYWORDS
COI, DNA barcoding, internal transcribed spacer, oyster mushrooms, taxonomic verification

| INTRODUCTION
DNA barcoding employs short, standardized DNA fragments for the rapid identification of species (Gilmore, Graefenhan, Louis Seize, & Seifert, 2009; Hebert, Cywinska, & Ball, 2003; Nguyen & Seifert, 2008; Vialle et al., 2009). This approach is particularly valuable for verifying species identifications and for evaluating taxonomic diversity in organisms with cryptic morphology such as fungi (Dentinger, Didukh, & Moncalvo, 2011). The use of molecular tools is essential for identifying and classifying the 90%-95% of fungi that remain undescribed (Blackwell, 2011; Seifert, 2009). The ribosomal internal transcribed spacer (ITS), a highly variable region between the conserved sequences of the small subunit, 5.8S, and large subunit rRNA genes, has been adopted as the primary DNA barcode marker for fungi (Schoch, Seifert, Huhndorf, et al., 2012).
The ideal DNA barcode region is easy to amplify and variable enough to discriminate species, a condition that is best met when variation within species is low and divergence between species is high, a situation that creates a "barcode gap" (Hebert et al., 2003; Lahaye et al., 2008). A 648-bp segment near the 5′ terminus of the mitochondrial cytochrome c oxidase subunit I (COI) gene has been adopted as the DNA barcode region for animals because its performance in species discrimination is high and it is usually easy to recover (Hebert et al., 2003). In contrast to animals, no single gene region has been found that serves as an ideal DNA barcode for fungi and plants. As a consequence, a multi-locus barcode approach has been adopted to improve resolution across plants and fungi (Hollingsworth et al., 2009; James et al., 2006), and ITS has been adopted as the standard barcode region for fungi (Avin, Bhassu, Shin, & Sabaratnam, 2012; Begerow, Nilsson, Unterseher, & Maier, 2010; Schoch, Seifert, Huhndorf, et al., 2012; Seifert, 2009), although studies have shown that this gene region often fails to distinguish closely related fungal species (Schoch, Seifert, Caldeira, et al., 2012). Despite the acceptance of ITS as the fungal barcode, length variation in this region makes sequence alignment difficult across divergent taxa (Dentinger et al., 2011). Additional markers beyond ITS are needed for fungal barcoding, but finding suitable loci that can be easily amplified across the diversity of fungi remains a challenge (Robert et al., 2011; Stielow et al., 2015). COI has the potential to address this gap because alignment of this locus across a divergent set of taxa is trivial (Dentinger et al., 2011).
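The "barcode gap" described above can be illustrated with a minimal Python sketch. It assumes uncorrected p-distances on pre-aligned sequences and defines the gap as the smallest interspecific distance minus the largest intraspecific distance; the function names and the toy sequences are hypothetical and are not taken from this study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)


def barcode_gap(records):
    """Return (max intraspecific, min interspecific, gap) distances.

    records is a list of (species_label, aligned_sequence) tuples.
    A positive gap means every between-species distance exceeds
    every within-species distance.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need at least one within- and one between-species pair")
    max_intra = max(intra)
    min_inter = min(inter)
    return max_intra, min_inter, min_inter - max_intra


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = [
        ("sp_A", "ACGTACGTAC"),
        ("sp_A", "ACGTACGTAT"),
        ("sp_B", "ACGAACCTAC"),
        ("sp_B", "ACGAACCTAC"),
    ]
    max_intra, min_inter, gap = barcode_gap(toy)
    print(f"max intra = {max_intra:.3f}, min inter = {min_inter:.3f}, gap = {gap:.3f}")
```

Model-corrected distances (for example, Kimura 2-parameter) are common in barcoding work; the p-distance is used here only to keep the sketch short.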
A few studies have compared the resolution of ITS and COI in sets of closely allied species. COI was more effective than ITS in Penicillium (Seifert et al., 2007), while COI and ITS were equally effective in Leohumicola (Nguyen & Seifert, 2008). In the Agaricomycotina, COI and ITS generally delivered similar resolution, but the prevalence of introns resulted in COI not being recovered from many taxa (Dentinger et al., 2011). Conversely, COI sequences showed low divergences in Fusarium (Gilmore et al., 2009) and Aspergillus (Geiser et al., 2007), although data interpretation was complicated by the apparent presence of multiple copies of COI, perhaps reflecting the recovery of nuclear pseudogenes. The strong performance of COI as a DNA barcode in animals (Hebert et al., 2003) suggests the value of exploring its use as a marker in mushrooms. Similar to the multi-locus barcode approach used in plants, COI could be used in conjunction with ITS for the identification of fungal species. There is one barrier to its implementation: the prevalence of introns in the COI gene of many fungal species, including mushrooms, is well documented (Seifert, 2009; Vialle et al., 2009). For example, nine introns occur in Pleurotus ostreatus (Wang, Zeng, Hon, Zhang, & Leung, 2008), 19 in Agaricus bisporus (Férandon et al., 2010), 15 in Trametes cingulata (Haridas & Gantt, 2010), and four in Agrocybe aegerita (Gonzalez, Barroso, & Labarère, 1998). These introns are often long, leading to extreme variation in the length of the COI gene, from approximately 1,584 bp in species lacking introns to over 22 kb in those with many introns (Férandon et al., 2010; Gonzalez et al., 1998; Haridas & Gantt, 2010; Wang et al., 2008). The presence of these introns impedes sequence recovery by conventional PCR (Seifert, 2009; Seifert et al., 2007), a factor which has supported the adoption of ITS as the sole DNA barcode for mushrooms (Vialle et al., 2009).
Although COI seems to have the potential to reliably identify taxa, there is a need for more detailed study. In particular, given the prevalence of introns and the apparent occurrence of nuclear pseudogenes, it is critical to adopt RT-PCR to properly recover and evaluate the capacity of COI sequences to resolve fungal species.
In this study, we examine the ability of the COI gene to discriminate six species of Pleurotus. We test amplification success, sequence variation within and among species, and the ability to design effective primers. We also recover ITS sequences from all isolates to allow their comparison with the sequences recovered through the analysis of cDNA from COI.

| Sample collection
The 24 strains examined in this study included representatives of six species of Pleurotus (Table 1). They were mostly obtained from mushroom farms in Malaysia or from the University of Malaya collection. A few isolates were newly collected from Malaysia, while others were imported from China or Iraq (Table 1). The species assignment for each isolate was verified by comparison of morphological traits of basidiocarps and mycelial cultures.

| DNA and RNA extraction and cDNA synthesis
Total genomic DNA was extracted from fresh mycelium by a rapid protocol (Avin, Bhassu, & Sabaratnam, 2013). Briefly, after adding sufficient 2% SDS buffer, the samples were homogenized at 65°C for 30 min. The mixture was purified twice with phenol:chloroform:isoamyl alcohol (25:24:1). DNA was precipitated with cold isopropanol, and then pelleted by centrifugation at 4°C for 15 min at 11,000 × g. The resultant DNA pellet was dissolved in TE buffer and stored at -20°C.

TABLE 1 List of species and strains used in this study and length of amplicons for COI and ITS. BOLD process IDs for the samples sequenced in this study are also indicated and are publicly available
Total RNA was isolated from fresh mycelium using Trizol (Invitrogen, USA). Briefly, sufficient Trizol was added to the homogenized mycelia, which were incubated at 25°C for 15 min and then purified with chloroform. RNA was precipitated with cold ethanol and the pellet was washed twice with 70% ethanol. The RNA pellet was then dissolved in RNase-free water and stored at -80°C. Samples that did not successfully amplify in the first round of RT-PCR were re-extracted using Nucleospin® RNA columns (Macherey-Nagel, Germany) following the manufacturer's protocol. This included a DNase treatment prior to elution in nuclease-free water.
Total cDNA was synthesized from the RNA extracts using an Access One Step RT-PCR system kit (Promega, USA). The first mixture was generated by gently mixing 1.0 μl of total extracted RNA, 1.0 μl of oligo(dT) primer, and 3.0 μl of nuclease-free H2O, and was incubated for 5 min at 70°C. The second mixture was prepared by mixing [...]. Mixtures I and II were then combined for each sample and incubated for 5 min at 25°C, 60 min at 42°C, and 15 min at 70°C before being stored at -20°C.

| Primer design
The coding sequence of COI from the mitochondrial genome of P. ostreatus (GenBank accession EF204913) was used as a reference to design primers (Figure 1). Several criteria were employed to design primers, including the generation of fragments of suitable length (800-900 bp) and the presence of enough conserved sites in the binding regions. NCBI Primer-BLAST was used to design primer pairs for two cDNA regions that spanned the coding sequence of COI (Rozen & Skaletsky, 2000; Ye et al., 2012). Figure 1 shows the location and orientation of these primers on the open reading frame of COI.
Primer ID, sequence and annealing temperatures are provided in Table 2.

| PCR and reverse transcription (RT)-PCR conditions
PCR amplification of the COI cDNA employed an initial denaturation at 95°C for 5 min, followed by 30 cycles with denaturation at 94°C, annealing at 55°C, and extension at 72°C for 1 min, followed by a final [...] Flexi DNA Polymerase (Promega, USA). We used genomic DNA to amplify and sequence the ITS region with primers ITS1 and ITS4 using standard protocols (White, Bruns, Lee, & Taylor, 1990), or with local primers ITS1-UM2 and ITS2-UM2 (Avin, Bhassu, Shin, & Vikineswary, 2014). Successfully amplified PCR products were purified using the Nucleospin Extract II Kit (Chemopharm) and bidirectionally sequenced using an ABI 3730XL automated sequencer. Sequences, along with voucher information, were deposited in the Barcode of Life Data System (BOLD Process IDs: CDB001-CDB024-15) (Ratnasingham & Hebert, 2007) and are publicly available in NCBI GenBank (Table 1).
For both genes, maximum likelihood (ML) trees were constructed in MEGA 6 (Tamura et al., 2013) under the selected model; branch topology was optimized using extensive subtree pruning and regrafting (SPR) with the branch swap filter selected. The stability of nodes was inferred by non-parametric bootstrapping (Felsenstein, 1985), using 1,000 heuristic bootstrap pseudoreplicates. DnaSP ver. 5.10 was used to generate the haplotype data file and to calculate genetic divergences (Librado & Rozas, 2009). To estimate the significance of variance within and among species, an AMOVA (analysis of molecular variance) was calculated using Arlequin ver.
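The idea behind non-parametric bootstrapping can be sketched in a few lines of Python. This is a simplified illustration and not the MEGA procedure used here: alignment columns are resampled with replacement to build each pseudoreplicate, and support is scored for a simple nearest-neighbour pairing (by Hamming distance) instead of for a node in an ML tree. The function names, the default seed, and the pairing criterion are assumptions made for the example.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by resampling columns with replacement."""
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    picks = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[i] for i in picks) for seq in alignment]


def hamming(seq_a, seq_b):
    """Number of differing positions between two equal-length sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b))


def pairing_support(alignment, i, j, replicates=1000, seed=0):
    """Fraction of pseudoreplicates in which sequence j is strictly
    closer to sequence i than any other sequence is.

    This stands in for node support; a real analysis would rebuild
    a tree from every pseudoreplicate.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = bootstrap_alignment(alignment, rng)
        d_ij = hamming(rep[i], rep[j])
        others = [hamming(rep[i], rep[k]) for k in range(len(rep)) if k not in (i, j)]
        if all(d_ij < d for d in others):
            hits += 1
    return hits / replicates
```

Calling pairing_support(alignment, 0, 1) on a list of aligned strings returns a proportion between 0 and 1, which is analogous to the bootstrap percentages reported at tree nodes.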

| RESULTS
An interpretable ITS sequence was recovered from 20 of the 24 specimens, including at least one representative of each species, with sequences varying in length from 592 to 625 bp (Table 1). A COI sequence was recovered from all specimens, but only a partial COI-5′ sequence was obtained from specimens of P. giganteus. Near full-length COI sequences were generated by aligning and assembling a consensus of the 5′ and 3′ reads for the five species with reads for both regions (Tables 1 and 3). Because the COI sequences were generated from cDNA template produced by RT-PCR, they lacked introns, while ITS was amplified using standard PCR (Figure 2 and Table 3).
The percentage of variable sites for all six species was computed for both genes (Table 3). Across all 1,516 sites for COI, 76.8% were conserved, while 23.1% were variable, with 12.3% being parsimony informative and 10.9% singletons. By comparison, 37.7% of the 715 ITS sites were conserved, while 55.2% were variable, with 38.4% being parsimony informative and 17.0% singletons (Table 3). Due to the indels in ITS, the mean divergence for all 20 sequences was higher for ITS (0.199) than for COI (0.059). Intraspecific divergences were generally slightly higher for ITS than for COI, but so too were interspecific divergences. Barcode gap analysis supports higher interspecific and intraspecific distances for ITS than for COI. Both markers indicate that P. ostreatus, P. eryngii, and P. pulmonarius are relatively close (Table 4) and fall under the 2% divergence threshold for COI and ITS (except P. eryngii). However, the use of closely related mushrooms with small sample sizes in our analysis may explain the low divergence threshold (below 2%). The maximum intraspecific distance for both COI and ITS was greatest in P. ostreatus; intraspecific distances were low for the remaining Pleurotus species with multiple representatives per species.
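The site categories reported above can be made concrete with a short Python sketch that classifies each alignment column. It assumes the usual definitions: a conserved site has one nucleotide state, a parsimony-informative site has at least two states that each occur in at least two sequences, and a singleton site is variable but not parsimony informative. Columns containing gaps or ambiguous characters are tallied separately, which is an assumption of this example and may differ from how a given program treats them. The function name and toy alignment are hypothetical.

```python
from collections import Counter


def classify_sites(alignment):
    """Count conserved, parsimony-informative, singleton, and gapped columns."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    counts = Counter(conserved=0, parsimony_informative=0, singleton=0, gapped=0)
    for column in zip(*alignment):
        bases = [b.upper() for b in column]
        if any(b not in "ACGT" for b in bases):
            counts["gapped"] += 1  # gap or ambiguity code: set aside
            continue
        freq = Counter(bases)
        if len(freq) == 1:
            counts["conserved"] += 1
        elif sum(1 for n in freq.values() if n >= 2) >= 2:
            counts["parsimony_informative"] += 1
        else:
            counts["singleton"] += 1
    result = dict(counts)
    result["variable"] = result["parsimony_informative"] + result["singleton"]
    return result


if __name__ == "__main__":
    # Hypothetical toy alignment, for illustration only.
    toy = ["ACGTA-", "ACGTCT", "ATGACT", "ATGCCT"]
    tally = classify_sites(toy)
    total = len(toy[0])
    for name, n in tally.items():
        print(f"{name}: {n} ({100 * n / total:.1f}%)")
```

Dividing each count by the alignment length gives percentages of the kind listed in Table 3.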

Figure 2a-d shows ML trees for COI and ITS with bootstrap values for each node based on 1,000 replicates. ITS (Figure 2c,d) discriminated all six species with strong support, but sequences from four of eight specimens of P. ostreatus failed. COI sequences were recovered from all specimens, albeit just partial COI-5′ sequences for P. giganteus. COI failed to distinguish between P. pulmonarius and P. giganteus when partial 5′ sequences were included, but when partial sequences were excluded, COI distinguished between these two species with strong support (Figure 2). Overall, both markers readily distinguished between species with moderate to strong support. When sampling was improved for ITS with GenBank sequences, there was strong support for the monophyly of the six Pleurotus species, results confirming our morphological identifications (Figure 2d).

| DISCUSSION
In contrast to prior studies that failed to recover COI through conventional PCR-based approaches (Dentinger et al., 2011; Vialle et al., 2009), the cDNA approach employed in this analysis recovered full COI sequences from all six species of oyster mushrooms (barring a few incomplete recoveries for P. giganteus). The past failures of standard PCR were undoubtedly due to the presence of several large introns in the COI gene of Pleurotus (Dentinger et al., 2011; Seifert, 2009; Seifert et al., 2007). However, cDNA barcoding escapes this problem, generating amplicons that are easily aligned. The present study generated 1,516-bp COI sequences from 21 of the 24 specimens, failing only to recover full sequence information from the 3′ region of P. giganteus. Our failure to amplify the 3′ end of P. giganteus reflects the need to further optimize COI primers for Pleurotus (and other mushrooms) given the diagnostic ability of the 3′ end of this gene. Alternatively, more samples of different species should be sequenced and aligned to design appropriate primer pairs in the most conserved regions. ITS sequences were recovered from all six species, but results from four of the 24 specimens were uninterpretable due to sequence length variation.

TABLE 2 List of the primers used for amplification of COI and ITS from Pleurotus species

FIGURE 2 COI and ITS phylogenetic analyses. (a-d) Phylogeny reconstruction based on maximum likelihood under a GTR+G model for COI and a TN92+I model for ITS. Numbers at the nodes indicate the percentage of bootstrap replicates supporting a given topology, although bootstrap values below 50% are not indicated. Sample IDs and species delimitations are indicated at the tips of the tree. One COI tree for Pleurotus giganteus is based on 759-bp COI-5′ fragments, while sequences for the other taxa were full length (1,516 bp). Twenty-two ITS sequences and two mitochondrial sequences were retrieved from GenBank; they are indicated in yellow
Although the number of species examined in this study was small, the success of COI in discriminating these taxa justifies a larger-scale effort to validate the effectiveness of COI as a barcode for Pleurotus and other mushrooms.
Paralogues (multiple copies) of COI and a low rate of success in species delimitation were reported in a study on the important pathogenic and commonly isolated genus Fusarium (Gilmore et al., 2009) and also in certain genera of the Agaricomycotina (Dentinger et al., 2011). These paralogues likely represent nuclear-encoded pseudogenes of COI (Gilmore et al., 2009). Although Schoch, Seifert, Huhndorf, et al. (2012) concluded that ribosomal markers (e.g., ITS) have fewer problems with PCR amplification than protein-coding markers (e.g., COI), the difficulties in generating a reliable alignment are an important drawback to the use of ITS as a DNA barcode marker (Dentinger et al., 2011; Seifert et al., 2007). Furthermore, sequence variation among paralogues can result in uncertain base calls. Despite these caveats, the availability of ITS sequences from a large number of fungal species in GenBank is a major advantage that often outweighs the complications introduced by alignment problems. The current study suggests that COI can be an additional barcode marker for particular taxonomic groups of fungi when ITS is unsuitable (e.g., some genera in Ascomycota or some species of mushrooms discussed in this study) or for examining fresh material through a cDNA-based approach. However, this approach needs to be extended to determine its suitability for other fungi. Moreover, COI sequences generated phylogenetic groupings for Pleurotus similar to those for ITS while having the advantage of being easily aligned.
These results justify the broader examination of cDNA-based analysis to test the potential of COI as a barcode marker that could complement ITS, in much the same fashion that two gene regions (rbcL, matK) have been adopted as the standard barcode regions for plants (Hollingsworth et al., 2009). Future efforts should explore the use of COI in groups where ITS is unable to deliver species-level resolution.

CONFLICT OF INTEREST
None declared.

DATA ACCESSIBILITY
The DNA sequences are available in the Barcode of Life Data System (BOLD) and the National Center for Biotechnology Information (NCBI) database, as shown in Table 1.

AUTHOR CONTRIBUTIONS
Farhat A. Avin: designed research, performed research, analyzed data, wrote the paper; Subha Bhassu: project adviser, edited the paper, analysis adviser; Dr. Tan Yee Shin: project adviser, edited the paper, analysis adviser; Thomas W. A. Braukmann: analyzed data, edited the paper; Vikineswary Sabaratnam: project leader, project financial leader, project adviser, edited the paper; Paul Hebert: project adviser, edited the paper, analysis adviser.