Mitochondrial genome of the garfish Hyporhamphus quoyi (Beloniformes: Hemiramphidae) and phylogenetic relationships within Beloniformes based on whole mitogenomes

Mitochondrial DNA (mtDNA) can provide genome-level information (e.g. mitochondrial genome structure, phylogenetic relationships and codon usage) for analyzing molecular phylogeny and evolution of teleostean species. The species in the order Beloniformes have commercial importance in recreational fisheries. In order to further clarify the phylogenetic relationship of these important species, we determined the complete mitochondrial genome (mitogenome) of garfish Hyporhamphus quoyi of Hemiramphidae within Beloniformes. The mitogenome was 16,524 bp long and was typical of other teleosts mitogenomes in size and content. Thirteen PCGs started with the typical ATG codon (with exception of the cytochrome coxidase subunit 1 (cox1) gene with GTG). All tRNA sequences could be folded into expected cloverleaf secondary structures except for tRNASer (AGN) which lost a dihydrouracil (DHU) stem. The control region was 866 bp in length, which contained some conserved sequence blocks (CSBs) common to Beloniformes. The phylogenetic relationship between 26 fish Beloniformes species was analyzed based on the complete nucleotide and amino acid sequences of 13 PCGs by two different inference methods (Maximum Likelihood and Bayesian Inference). Phylogenetic analyses revealed Hemiramphidae as the sister group to Exocoetidae and it is a paraphyletic grouping. Our results may provide useful information on mitogenome evolution of teleostean species.


Introduction
Mitochondrial DNA (mtDNA) of teleosts is a circular genome ranging from 15 to 19 kbp in length that is generally composed of two ribosomal RNA genes (12S rRNA and 16S rRNA), 13 protein-coding genes (PCGs), 22 transfer RNA genes (tRNAs) and two typical non-coding control regions (origin of the light strand (O L ) and control region (CR)) which contain essential regulatory elements for replication and transcription [1,2]. MtDNA is commonly used for population genetics and phylogenetic molecular evolution due to maternal inheritance, rapid evolution, coding content conservation, and high substitution rates compared to the nuclear genome [3,4]. In addition, the molecular characteristics of the mitogenome, such as gene rearrangement, tRNA secondary structure and models of the control of mtDNA replication are valuable for deep phylogenetic analysis [5,6]. Garfishes (order Beloniformes), which are known for their importance to commercial and recreational fisheries, consist of approximately 260 species classified into 6 families depending on the taxonomy [7]. Identifying adult garfish is not difficult [7], but larvae identification is difficult to carry out based on morphological characters. Several partial mitochondrial CRs gene sequences from Beloniformes have been sequenced and used for systematics [8]. However, the CRs do not provide enough phylogenetic information for molecular evolution and sometimes even appear of disputation. Although other researchers had previously determined the complete mitogenomes of some species from Beloniformes and constructed a phylogenetic tree to analyse their interspecies relationship [2], we still do not understand the higher-level phylogeny of Beloniformes because of the lack of more completely sequenced mitogenomes that will allow obtaining more informationfor a deeper exploration and evolutionary relationships. So far, there are 35 described variations that deviated from conserved mtDNA organization in teleosts, although none described among Beloniformes [9]. Therefore, sequencing more Beloniformes mtDNA may show novel variations in mtDNA organization among vertebrates. To date, more than 200 complete mitogenomes have been determined from teleostean species, however, only 26 species from Beloniformes are available in the GenBank database. The garfish Hyporhamphus quoyi, which is zooplankton feeders and carnivores [10], is a widespread species in the family Hemiramphidae (Beloniformes) ranging from Southeast Asia, Oceania, the eastern Pacific Ocean and West Africa [7]. At present, the complete mitogenome of H. quoyi has not been sequenced. To understand the deeper interspecies relationships of Beloniformes, we sequenced the complete mitochondrial genome of H. quoyi and its genome organization and structure were compared with other Beloniformes fish. In addition, the phylogenetic tree has been reconstructed by the Bayesian inference (BI) and Maximum Likelihood (ML) methods to understand the evolutionary relationships among Beloniformes. The characterization of the H. quoyi mitogenome may provide more information about the evolution of teleosts and will aid in larvae identifications.

Sample collection, DNA extraction, PCR amplification and sequencing
Adult specimens of H. quoyi were collected in the Pearl River estuary (N 21˚45 0 , E 133˚36 0 ), China, in June 2017 and no specific permissions were required for this location. According to the International Union for Conservation of Nature Red List, H. quoyi were not protected or endangered species. Our study was conducted with the approval from the Institutional Animal Care and Use Committee at Jinan University. All operations were performed according to international guidelines concerning the care and treatment of experimental animals. All samples were preserved in 95% ethanol and were stored at -80˚C until use. Total genomic DNA was isolated from dorsal muscle tissue samples using proteinase K treatment, followed by the Animal Tissue Genomic DNA Extraction Kit. To sequence the H. quoyi mitogenome, several primer pairs were designed for the amplification according to the conservative sequence based on the conserved sequences which were obtained by aligning the complete mitogenome of Hyporhamphus sajori (GenBank: AB370892.1) and Hyporhamphus intermedius(GenBank: NC_026467.1) (S1 Table) [2]. PCR amplification reactions were performed with PrimeSTAR 1 GXL DNA Polymerase under the following conditions: after an initial denaturation step at 95˚C for 1 min, then 35 cycles at 95˚C for 20 s (denaturation), 55˚C for 45 s (annealing) and 72˚C for 1-5 min (elongation). PCR products were sequenced from both directions using a primer walking method.

Sequence annotation and analysis
We used the program Seqman within Lasergene software to check and assemble manually the mitogenome sequences of H. quoyi. The complete sequence and its annotation were performed by NCBI BLAST (http://blast.ncbi.nlm.nih.gov/Blast) and the DNAStar package (DNAStar Inc. Madison, WI, USA). The circular gene map of mitogenome was drawn by GCView Server [11]. The location of the 13 PCGs and the two rRNAs were primarily determined through Dual Organellar Genome Annotator (DOGMA) [12]. All of the tRNA gene sequences were identified by the tRNA-scan-SE1.21 from the website http://lowelab.ucsc.edu/ tRNAscan-SE/ using the default search mode and the 'Mito/chloroplast' source [13]. The software RNAstructure was used to draw the secondary structure of tRNA genes and O L [14]. The relative synonymous codon usage (RSCU) of the 13 PCGs was calculated by the software MEGA 6 [15]. Tandem repeats in the control region (CR) were analysed using the Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.html) [16]. The nucleotide composition skewness was measured according to the following formulas: AT skew [(A−T)/(A+T)] and GC skew [(G−C)/(G+C)] [17]. To analyse evolutionary adaptation, the rates of nonsynonymous (Ka) and synonymous (Ks) substitutions in the mtDNA among 26 garfish of Beloniformes were estimated with DnaSP 5.10.01 [18]. The complete mitochondrial DNA sequence of the H. quoyi was deposited into the GenBank database under the accession number MG851912.1.

Phylogenetic analysis
A total of 26 Beloniformes mitogenomes available in GenBank were used to investigate the phylogenetic relationships among fish ( Table 1). The mitogenome of Perciformes fish (Caesio cuning (KP874185.1), Emmelichthys struhsakeri (AP004446.1) and Banjos banjos (KT345965.1)) was used as outgroups [19][20][21]. The nucleotide and amino acid sequences of the 13 PCGs were aligned using default settings and concatenated, which were used for phylogenetic analysis via BI and ML methods by MrBayes v 3.2.4 and raxmlGUI, respectively [22,23]. Each gene was aligned separately by the software Clustal X with default settings [24]. GTR+ I + G was selected as the appropriate model for the nucleotide sequences by Modeltest 3.7 based on Akaike's information criterion (AIC) [25]. MtArt+ I+ G+ F was the appropriate model for the amino acid sequence dataset according to ProtTest 3.4 based on AIC [26]. For the Bayesian Inference, four independent runs were allowed to run simultaneously for 1,000,000 generations and each was sampled every 1,000 generations, with the first 25% discarded as burn-in. Stationarity was considered to be reached when the average standard deviation of split frequencies was much less than 0.01. In ML analysis, the default parameters were used and the node support values were assessed by bootstrap resampling (BP) estimated using 100 replicates. The resulting phylogenetic trees were drawn by FigTree v1.4.3.

Genome organization and structure
The complete mitogenome sequence of H. quoyi was a 16,525 bp circular molecule. The mitogenome was typical of other Beloniformes fish mitogenomes, including 13 PCGs (cox1-3, nad1-6, nad4L, atp6, atp8 and cytb), 22 transfer RNA genes (one for each amino acid and two each for serine and leucine), 2 rRNA genes (12S rRNA and 16S rRNA) and two non-coding regions (the control region (CR) and O L ) (Fig 1 and Table 2). Twenty-three genes were transcribed on the heavy strand (H-strand), whereas the other genes (nad6 and eight tRNA genes (Asn, Gln, Ala, Cys, Tyr, Ser (UCN), Glu, and Pro)) were oriented on the light strand (Lstrand). The organization and composition in the H. quoyi mtDNA was identical to most of Beloniformes fish sequenced to date [27,28].

Skewness, overlapping, and intergenic spacer regions
The nucleotide composition of the H. quoyi mitogenome was slightly biased towards A and T, accounting for 56.80%. The overall base nucleotide composition of the H-strand was as follows: A = 4,838 (29.28%), T = 4,550 (27.53%), G = 2,534 (15.33%), and C = 4,603 (27.86%). The highest A+T content (65.24%) was detected in the CR, which was consistent with previous reports of the skewness of teleostean species. The average AT-skew of Beloniformes mtDNA was 0.0089±0.0269, ranging from 0.0425 in Dermogenys pusilla to −0.0517 in Hyporhamphus intermedius [9,29]. The AT-skew in H. quoyi mitogenome was positive (0.0307), which was similar to most mitogenomes of Exocoetidae, Belonidae, Scomberesocidae and Zenarchopteridae (Table 1). Among all sequenced Beloniformes mitogenomes, H. quoyi has a the most negative GC-skew (−0.2980) indicating that a higher content of Cs compared to Gs. Similar GC-skew values were also detected in other Beloniformes mitogenomes, apart from Ablennes hians [2]. Additionally, the mitogenome had a 31 bp overlap between genes in eleven locations ranging from 1 to 11 bp. Two overlaps, atp8-atp6(11 bp) and nad4l-nad4(6 bp), were detected in the H. quoyi mitogenome. The same phenomenon occurred in the Metazoa [30,31]. There was a 69-bp nucleotide sequence dispersed in twelve intergenic spacers, ranging in size from 1 to 38 bp, with the longest spacer sequence located between the trnN and the trnC, which formed the origin of the light strand. Transfer RNA genes and ribosomal RNA genes A total of 22 tRNA genes in the H. quoyi mitogenome were identified successfully based on their potential secondary structures (Fig 2). With the exception of 8 tRNAs, all other tRNAs were encoded by the H-strand ( Table 2). The length of tRNAs of H. quoyi ranged from 66 bp to 74 bp in size. Most of the tRNA genes could be folded into typical cloverleaf secondary structures, except trnS2 (AUN) lacking of a DHU stem. This phenomenon occurs in most teleost mitogenomes including Beloniformes species [31][32][33]. Although almost all secondary structures of tRNAs had amino acid acceptor stem with 7 bp paired bases, the remaining trnaF, trnaV, trnaE and trnaP have a 9 bp aminoacyl acceptor stem. A total of 16 unmatched base pairs (G-U pairs) were found in the H. quoyi tRNAs, which form a weak bond. A positive AT skew (0.1209) and a negative GC skew (−0.1250) were found among the concatenated sequences of all 22 tRNAs in H. quoyi, indicating tRNAs biased toward As and Cs. Similar results had been found in the ribosomal genes. The AT skew of 12S and 16S rRNA genes were 0.1610 and 0.2425, respectively, and they had a negative GC skew (−0.0930 and −0.0992). The length of the 12S rRNA and 16S rRNA were 944 bp and 1,687 bp and A+T contents were 53.28% and 56.97%, respectively. The location of the 12 rRNAs was between trnF and trnV, and the location of the 16 rRNAs was between trnV and trnL1 (UUR), which were similar for most vertebrates [31,34,35].

Protein-coding genes
The 13 PCGs in H. quoyi mitogenome comprised 11,433 bp in total, with a A+T content of 56.50%, and ranged in size from 168 bp (atp6) to 1,839 bp (nad5). The start and stop codons of the 13 PCGs in the H. quoyi mtDNA were shown in Table 2. All but one PCGs of H. quoyi initiated with methionine (ATG) as the start codon. The only exception was the cox1 gene, which utilized GTG as a start codon. The phenomenon of alternative start codons occurs in most teleost mitogenomes [8,31,32]. The majority of the PCGs of H. quoyi had the complete termination codons TAA (nad1, nad3, nad4l, nad5, nad6, cox1, cox3, atp6 and atp8) or TAG (nad2). The remaining three genes (cox2, nad4 and cytb) utilized T as incomplete termination codons, which were presumed to be completed through post-transcriptional RNA editing mechanism in metazoan mitogenomes [36]. The AT skew and GC skew values of the PCGs were shown in Fig 3. All PCGs of GC skew and AT skew values were negative, except for nad2 and nad6, indicating most PCGs contained more Ts and Cs, which was identical to most previous observations [6,28,35]. RSCU for the H. quoyi mtDNA were shown in S2 Table and Fig 4. The value greater than 1 mean the codon more commonly used. Nine amino acid were encoded by four different codons and 13 amino acid were encoded by two codons. Excluding AGA and AGG codons, the total number of codons in PCGs of H. quoyi was 3792. The most common amino acids were Leucine 1 (Leu 1, 552), alanine (Ala, 292)and threonine (Thr, 335) in H. quoyi. In all 13 PCGs, the Ka/Ks ratio was much less than 1 (varied from 0.0192 (cox1) to 0.1618 (nad6)) ( Fig 5), indicating that all the PCGs were evolving under the purifying selection. The result suggested negative selective coefficients affected purifying selection against deleterious mutations [37]. In addition, the highest ratios were in nad5 and nad6 in the H-and L-strand, respectively, indicating that the selection pressures were relatively independent on the two strands.

Non-coding regions
The mtDNA had two long non-coding regions, O L and CR, which were used for the replication, and maintenance of the mitogenome. A 38 bp O L , which was folded into a hairpin secondary structure, was located between trnN and trnC (S1 Fig) [38]. The 866 bp long CR was found between tRNA Pro and tRNA Phe with 65.24% A+T content, which was essential for the initiation of vertebrate mtDNA replication [9,31,34]. Several conserved sequence blocks (CSBs), which could be very important roles for mitochondrial metabolism, were found in the CR of teleost fish [39]. The central conserved blocks (CSB-F, CSB-E and CSB-D) were found in the CR of H. quoyi, and the conserved sequence block domains (CSB-1, CSB-2 and CSB-3) were similarly detected (Fig 6). By comparing the recognition sites in Beloniformes species, all of the CSBs were typically present in CR of teleost fish [19,39]. The relatively similar repetitive motifs (GGTTTTT) and highly conserved motifs (CTTAATG) were found in CR of H. quoyi. Besides, tandem repeats were not recognized in H. quoyi. Beyond the genera Oryzias and Ablennes, tandem repeats did not similarly appeared in other Beloniformes fish [2,32].

Phylogenetic analysis
The phylogenetic relationships of Beloniformes were constructed by the BI and ML methods based on concatenated nucleotide and amino acid sequences of the 13 PCGs from 27 Beloniformes species and three outgroups species (Figs 7 and 8). The phylogenetic trees contained consistently three major clades, including (I) Adrianichthyidae, (II) Scomberesocidae, Belonidae and Zenarchopteridae, (III) Hemiramphidae and Exocoetidae. The best supported phylogenetic relationship of Beloniformes is as follows: (Adrianichthyidae + ((Hemiramphidae + Exocoetidae) + (Scomberesocidae + (Belonidae + Zenarchopteridae))). We sequenced H. quoyi within Hemiramphidae as the sister group to Exocoetidae, and H. quoyi in comparison to the other two Hemiramphidae species shared a close ancestry with Exocoetidae. This result may be that the mitogenome of H. quoyi more close to P. brachypterus within Exocoetidae than the other two Hemiramphidae fish based on BLAST analysis, and especially in nad2 (Fig 1). The topology relationships of Beloniformes was consistent with most phylogenetic mitogenomes research [1,19]. However, previous work based on partial mitochondrial gene (16S and cytb) and nuclear genes (Rag2 and Tmo) for phylogenetic analysis indicated that Hemiramphidae was close to Belonidae besides Exocoetidae [40]. Whether the difference in the phylogenetic analysis is due to e.g. hybridization, introgression and lineage sorting is unknown. It is worth noting that the phylogenetic placement of H. quoyi inferred here actually makes Hemiramphidae paraphyletic. Moreover, the previous research based on nuclear genes also showed that Hemiramphidae including H. quoyi and H. sajori was a paraphyletic grouping [40]. Besides, each of the family Zenarchopteridae and Scomberesocidae were only one mitogenome sequenced to date. Additional mitogenomes data from Zenarchopteridae, Scomberesocidae and Hemiramphidae fish are required to demonstrate the relationships among Beloniformes species in the future.