DNA Barcoding and Phylogenetic Relationships of Selected South Indian Freshwater Fishes Based on mtDNA COI Sequences

DNA barcoding, particularly sequence analysis of the mitochondrial cytochrome c oxidase subunit I (COI) gene, is an effective tool for identifying species across diverse taxa. In the present study, DNA barcodes were generated from 46 species of freshwater fishes covering the orders Cypriniformes, Siluriformes, Synbranchiformes and Perciformes, representing 30 genera under 9 families. The samples, which include some endemic species, were collected from diverse sites. A total of 47 COI sequences were generated. A 678 base pair fragment of COI was amplified and sequenced; trimming the primers consistently yielded a 635 base pair barcode sequence. The average Kimura two-parameter (K2P) distances within species, genera, families, and orders were 0.32%, 8.40%, 14.50%, and 18.65%, respectively. The DNA barcodes discriminated congeneric species unambiguously. The present study strongly supports the efficiency of COI as a marker for DNA barcoding of the selected freshwater fishes.


Introduction
India is rich in fishery resources: it harbours about 2508 fish species [1], of which 856 are freshwater inhabitants [2,3]. Fishes are the most diverse vertebrate group in the world, and about 40% of them live in freshwater. India is one of the mega-biodiversity countries of the world; it occupies the ninth position in terms of freshwater mega-biodiversity and contributes 11.72% of global fish biodiversity [4]. However, the actual number of fish species found in India is still not accurately documented because of prevailing taxonomic confusion [5], which stems from inadequate exploration, the difficulty of distinguishing cryptic forms, and species ambiguity in the taxonomic keys [6]. As a result, many species have been considered cryptic, and some of them may also be dormant [6,7]. Reliable characterization of Indian freshwater fishes therefore requires species scrutiny using advanced molecular methods, and there is an urgent need to assess Indian freshwater fish species through DNA barcoding.
DNA barcoding is a promising technique for species identification using a short mitochondrial DNA sequence of the COI gene [8]. The technique analyses the sequence diversity of a 5' segment of the mitochondrial COI gene to identify species [9]. Of late, DNA barcoding has been used extensively for species identification as well as species discovery in various groups of organisms [10,11]. Its effectiveness has now been validated for many groups of animals [12], among which fishes are one of the most extensively studied [13,14].
In recent years, several such molecular studies have been conducted on members of this group to better understand their relationships and to develop more accurate taxonomic classifications based on phylogeny [15][16][17][18][19][20][21][22][23][24][25]. The COI gene is considered a suitable barcode marker for discriminating closely related fish species [26][27][28][29]. The challenge in using a short DNA barcode (only 655 bp) for phylogenetic study is selecting a nearly perfect nucleotide substitution model for the dataset, so that even the weakest evolutionary signal can be correctly detected [30]. A comprehensive assessment of DNA barcodes of Indian freshwater fishes is still lacking, although a similar study has been carried out on selected Indian freshwater fishes by Lakra [8]. Therefore, the present study reports additional DNA barcodes for 46 commercially important Indian freshwater fish species belonging to 30 genera under 9 families and 4 orders.

Sample collection and morphological identification
Tissue and voucher specimens of 46 species (9 families) were collected from riverine systems of south India, namely the Cauvery and Bhavani river systems. Approximately 100 mg of muscle tissue and fin clips from two to five individuals of each species were preserved in 95% ethanol until used. Species identification and confirmation were carried out using standard literature [31,32]. Valid nomenclatural names were adopted as per the Catalogue of Fishes of the California Academy of Sciences [1,33]. Live specimens were photographed with a Canon 1100 Digital SLR camera and later preserved in 7% formalin solution for future reference. Table 1 lists the specimen details and GenBank accession numbers.

Amplification and sequencing
Genomic DNA was extracted from fin clips preserved in >95% ethanol using Invitrogen's PureLink Genomic DNA Mini Kit following the manufacturer's instructions. COI amplification was carried out in 25 μL reaction mixtures containing 1 μL template DNA, 1X reaction buffer, 2.5 mM MgCl2, 2.5 mM dNTPs, 0.5 μL of each primer, and 0.2 U Taq DNA polymerase in a PTC-200 (Bio-Rad, USA) PCR machine. The reaction mixtures were preheated at 94°C for 5 min, followed by 50 cycles of amplification (94°C for 45 sec, 48°C for 45 sec, and 72°C for 60 sec), and a final extension at 72°C for 6 min. The COI gene was amplified using the universal primer set Fish F1-5'-TCAACCAACCACAAAGACATTGGCAC-3' and Fish R1-5'-TAGACTTCTGGGTGGCCAAAGAATCA-3' [34]. Sequencing was performed using Big Dye Terminator on ABI 3500 Genetic Analyzer (Applied Biosystems, Foster City, CA). The PCR products were visualized on 1.2% agarose gels and the most intense products were selected for sequencing. Products were labeled using the BigDye Terminator V.3.1 Cycle Sequencing Kit (Applied Biosystems, Inc., Foster City, CA) and sequenced bidirectionally using an ABI 3730 capillary sequencer following the manufacturer's instructions. One individual of each species was used for the nucleotide sequence analyses.

Sequence analysis
Sequences were aligned using Clustal W [35] and then submitted to GenBank. The extent of sequence difference between species was calculated by averaging pair-wise comparisons of sequence difference across all individuals. Pair-wise evolutionary distances among haplotypes were determined by the Kimura 2-Parameter method [36] using the software program MEGA 3 (Molecular Evolutionary Genetics Analysis, MEGA Inc., Englewood, NJ) [37]. The Neighbor-Joining (NJ) tree was constructed using MEGA 3, and the robustness of its internal nodes was verified by bootstrap analysis with 1000 pseudo-replications.
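The Kimura 2-Parameter distance used above can be illustrated with a short Python sketch. This is not the MEGA implementation; it applies the standard K2P formula, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name k2p_distance and the choice to skip sites containing gaps or ambiguous bases are assumptions made for illustration only.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration; sites where either sequence has
    a gap or an ambiguous base are skipped (an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared    # proportion of transitional differences
    q = transversions / compared  # proportion of transversional differences
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))
```

Multiplying the returned value by 100 gives a percentage comparable to the distances reported in this study; averaging such values over all pairs at a given taxonomic level gives the per-level means.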

Genetic divergence and phylogenetic analysis
A total of 47 sequences were generated from 46 freshwater fish species. Sequence alignment of the COI gene after trimming of primers yielded 635 nucleotide base pairs per taxon. All the sequences were clean and unambiguous, and no insertions, deletions, or stop codons were observed in any of them. The sequence analysis revealed average nucleotide frequencies of A = 26.00%, T = 29.80%, G = 26.40%, and C = 17.90%. The average K2P distances in percentage among the different taxonomic levels were analyzed (Table 2). The average transitional pairs (si = 72) were more frequent than the average transversional pairs (sv = 56), with an average ratio of 1.30. The average genetic distances within orders, families, genera and species were 18.65%, 14.50%, 8.40%, and 0.32%, respectively. The overall average genetic distance among all the species was 23.90%. The phylogram was divided into two main clades with high bootstrap support (>50%) (Figure 1). Clade I was subdivided into six separate subgroups: subgroup 1.1 includes 11 species belonging to 6 genera (Hypselobarbus, Osteochilichthys, Barbodes, Neolissochilus, Tor and Osteochilus) of the family Cyprinidae.
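As an illustration of how summary statistics of this kind can be derived from an alignment, the following Python sketch computes percentage base composition over a set of sequences and counts transitional (si) and transversional (sv) differences for one pair of aligned sequences. The function names base_composition and ts_tv_counts are hypothetical, and the sketch is not the procedure used by MEGA, which averages si and sv over all sequence pairs.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def base_composition(seqs):
    """Percentage of A, C, G and T pooled over all sequences.

    Hypothetical helper; gaps and ambiguity codes are ignored.
    """
    counts = Counter()
    for seq in seqs:
        counts.update(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def ts_tv_counts(seq1, seq2):
    """Return (si, sv): transitional and transversional differences
    between two aligned sequences of equal length."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    si = sv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical sites, gaps and ambiguity codes do not count
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            si += 1  # purine<->purine or pyrimidine<->pyrimidine
        else:
            sv += 1  # purine<->pyrimidine
    return si, sv
```

Averaging the si and sv counts over all sequence pairs and dividing the mean si by the mean sv gives a ratio of the kind reported above.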

Discussion
Following the DNA barcoding approach of Hebert et al. [38], sequencing of a ~650 bp region of the mitochondrial cytochrome c oxidase I gene (COI) has proven to be an extremely effective method for discriminating fish species [28,34,39]. Recent research has illustrated some straightforward benefits of using standardized species-specific molecular tags derived from the COI gene for species-level identification [40]. DNA barcoding analysis has clearly discriminated freshwater fish species from India [8], Canada [28], and Mexico and Guatemala [39]. In the present study, we have effectively used partial COI gene sequences as DNA barcodes for 46 freshwater fish species from south Indian waters, representing 4 orders (Cypriniformes, Siluriformes, Perciformes and Synbranchiformes), 9 families and 30 genera. The universal primers amplified the target region in all 46 species, generating 47 COI barcodes of 635 bp. No insertions, deletions, stop codons or NUMTs were observed in any sequence, which supports the hypothesis that all the amplified sequences derive from functional mitochondrial COI genes. The present findings are in line with previous reports [34]. Although the primary objective of DNA barcoding is to identify species, phylogeographic structure among COI sequences within species became evident. DNA barcoding aims to provide a convenient, accurate and valid tool for species identification, and any candidate gene must meet this requirement. Use of a single, universal gene has many advantages, especially as barcoding applications expand to ecological questions and to the identification of illegally imported parts of organisms [41]. The study indicates that the standard barcoding marker, COI, can identify fish species [42,43,44]. The barcode sequences clearly discriminated all the studied freshwater fish species and provided apparent phylogenetic resolution.
Although intra- and inter-specific genetic divergences overlap, tree-based methods can distinguish species in unidentified samples. For the ecologist and taxonomist alike, DNA barcoding would provide a powerful tool for correct species identification, biodiversity assessment and detection of cryptic species [45,46].