Addressing the complex phylogenetic relationship of the Gempylidae fishes using mitogenome data

Abstract The Gempylidae (snake mackerels) family, belonging to the order Perciformes, consists of about 24 species described in 16 genera primarily distributed in tropical, subtropical, and temperate seas worldwide. Despite substantial research on this family utilizing morphological and molecular approaches, taxonomy categorization in this group has remained puzzling for decades prompting the need for further investigation into the underlying evolutionary history among the gempylids using molecular tools. In this study, we assembled eight complete novel mitochondrial genomes for five Gempylidae species (Neoepinnula minetomai, Neoepinnula orientalis, Rexea antefurcata, Rexea prometheoides, and Thyrsites atun) using Ion Torrent sequencing to supplement publicly available mitogenome data for gempylids. Using Bayesian inference and maximum‐likelihood tree search methods, we investigated the evolutionary relationships of 17 Gempylidae species using mitogenome data. In addition, we estimated divergence times for extant gempylids. We identified two major clades that formed approximately 48.05 (35.89–52.04) million years ago: Gempylidae 1 (Thyrsites atun, Promethichthys prometheus, Nealotus tripes, Diplospinus multistriatus, Paradiplospinus antarcticus, Rexea antefurcata, Rexea nakamurai, Rexea prometheoides, Rexea solandri, Thyrsitoides marleyi, Gempylus serpens, and Nesiarchus nasutus) and Gempylidae 2 (Lepidocybium flavobrunneum, Ruvettus pretiosus, Neoepinnula minetomai, Neoepinnula orientalis, and Epinnula magistralis). The present study demonstrated the superior performance of complete mitogenome data compared with individual genes in phylogenetic reconstruction. By including T. atun individuals from different regions, we demonstrated the potential for the application of mitogenomes in species phylogeography.


| INTRODUC TI ON
The Gempylidae fishes commonly known as snake mackerels consist of 24 species described in 16 genera of which 13 are monotypic (Nakamura & Parin, 1993;Nelson, 2006). Gempylids are widely distributed in deeper tropical and subtropical waters (Blend et al., 2010;Li, Qi, et al., 2020). A number of these species are economically important (Bustamante & Ovenden, 2016;Carnevale, 2006;Griffiths, 2002). Snake mackerels belong to the taxonomically complex suborder Scombroidei, and together with trichiurids (Trichiuridae) they form the Trichiuroidae superfamily (Nakamura & Parin, 1993). Gempylids and trichiurids are among the best-known fishes anatomically and key distinguishing features for these families exist, that is, the number of nostrils, presence/absence of caudal fin, and morphology of dorsal fins (see Nakamura & Parin, 1993). Although they are treated as separate groups, most authors have presented evidence that the trichiurids represent a subgroup of gempylids (Carpenter et al., 1995;Johnson, 1986;Monsch, 2000). This may be due to the fact that several crucial family diagnostic characteristics are not uniformly distributed within each family (Carnevale, 2006). For example, the conical process on the tip of the upper/lower jaw only occurs in some species of the gempylids or trichiurids, or the anterior soft ray fins, which are underdeveloped or absent in many species of the Trichiuroidae (Nakamura & Parin, 1993). Numerous studies have discussed the systematics of these fishes and their relationships (Carpenter et al., 1995;Collette, 1984;Gago, 1997;Johnson, 1986;Monsch, 2000).
Challenges of taxonomic classification within the Trichiuroidae extend beyond superfamily level to family level. While gempylids can be identified by several characteristics considered as diagnostic for the family and the assignment of species to their genera does not seem to be in question, it is not always clear whether some genera belong to a certain tribe or another (Carnevale, 2006;Monsch, 2000;Nakamura & Parin, 1993). The variations between genera are minor and subtle, prompting questions about whether they are significant enough to warrant separation at the genus level or should be viewed as indicators of interspecific variation. Furthermore, it has been shown that some meristic traits used to designate species within the Gempylidae change with species' geographic range (Collette, 1984;Grey, 1953;Mikhailin, 1983;Parin et al., 1978;Parin & Bekker, 1972).
Although different traits for identifying specimens to species exist, the evolutionary histories of these attributes are unknown or poorly understood. As a result, past morphology-based phylogenetic investigations have failed to establish the interrelationships of Gempylidae species (Carpenter et al., 1995). Including such morphological characters in the construction of the cladogram without consideration means the cladogram cannot provide independent reference for the interpretation of family evolution but for character evolution (Block & Finnerty, 1994;Carpenter et al., 1995). To avoid this circularity some researchers have used molecular markers to understand the interrelationships of the complex gempylids Orrell et al., 2006). While mitochondrial genome (mitogenome) genes such as COI, ND2, 12S rRNA, 16S rRNA, and CYTB, are routinely used in phylogenetic and population genetics studies (Liu et al., 2016;Meng, 2019;Orrell et al., 2006;Schroeter et al., 2020;Yoon & Park, 2015), it is well known that individual genes independently often fail to properly resolve phylogenetic trees . Whole mitogenomes, on the other hand, offer more understanding and finer resolution from higher-level classification (i.e., order and family) down to closely related species than individual gene sections do (Miya & Nishida, 2000). Consequently, mitogenomic data have been widely employed to resolve complex phylogenetic relationships like for example in the Otocephala cohort (Lavoué et al., 2005) and in the Scombridae family (Friedman et al., 2019). Accordingly, a rapid increase in the number of fish mitogenomes published compelled the establishment of a whole database dedicated to fish mitogenomes. The MitoFish database (as of 08 May 2023) has 42,955 (species) complete and partial mitogenomes (http://mitof ish.aori.u-tokyo.ac.jp/), which is attributed to recent advances in high-throughput sequencing methods, where greater throughput at reduced cost has enabled the quick discovery of mitogenomes.
Additionally, they are more frequently substituted than nuclear genomes but are nevertheless substantially conserved, making them taxonomically discriminative.

| Fish sampling and DNA extraction
Tissue samples (fin clips and muscle) used in the current study were obtained from both local and international government fisheries and international research institutions. The fishes were originally caught during the 2014-2019 period from the follow- 38°41′33.0″S 77°31′33.6″E) (Figure 1). The tissue samples were preserved in 99% (v/v) ethanol until processing. Total genomic DNA (gDNA) was extracted from fin clips or muscle tissue using a standard cetyltrimethylammonium bromide (CTAB) extraction protocol (Sambrook & Russell, 2001).

| Library construction and Ion Torrent sequencing
Sequencing was carried out at the Central Analytical Facility of Stellenbosch University, South Africa. Before library construction, gDNA concentration was quantified on the Qubit™ 4 Fluorometer, and the purity was determined using the NanoDrop™ ND-2000 spectrophotometer (Thermo Fisher Scientific). Genome Quality Scores (GQS) were determined by electrophoresis using the PerkinElmer® LabChip GX Touch 24 Nucleic Acid Analyzer (PerkinElmer). Library preparation was performed using the Ion Plus Fragment Library Kit score for the last 30 bases was calculated again and if needed, the last 15 bases were trimmed again. This process was repeated until the average QV for the last 30 bases was higher than 16. If the read was trimmed to less than 25 bases the whole read was removed from the dataset.

| Mitochondrial genome assembly and annotation
Complete and partially complete mitogenomes of several Gempylidae species are available in NCBI's GenBank, therefore, a map-to-reference assembly method was used for most of the assemblies, with one exception where a combination of both the de novo and map-to-reference assembly methods was necessary. Rexea antefurcata and R. prometheoides sequence reads were mapped and assembled to R. solandri (KJ408216); Neoepinnula minetomai and N. orientalis raw reads were mapped to the closely related Epinnula magistralis (AP012943). As previously stated, four T. atun samples were sequenced, each from a distinct geographic location. The New Zealand sample with a mean length of 374 base pairs (bp) and 4,019,478 reads was considered the best candidate for draft genome assembly. Thyrsites atun NGS reads were mapped to a partially complete mitogenome of the closely related Gempylus serpens (AP012502) according to the study of Miya et al. (2013) using Geneious Prime v2020.2.5 (Biomatters Ltd.), resulting in an incomplete T. atun mitogenome missing the control region. To improve genome length coverage, a de novo assembly using the SPAdes v3.15.5 assembler (Prjibelski et al., 2020) was performed and the resulting contigs were aligned to the incomplete T. atun mitogenome. The contigs extended to the presumed control region, generating a full mitogenome. The newly constructed T. atun mitogenome was then utilized as a reference sequence for the remaining T. atun genome assemblies. Read mapping and assembly were carried out in Geneious Prime v2020.2.5 software (https:// www.genei ous.com) utilizing the Geneious mapper with medium/ low sensitivity and fine-tuning up to five iterations, followed by manual curation.
The position of the 13 protein-coding genes (PCGs), 2 rRNAs, and 22 tRNAs was determined using MitoAnnotator  and confirmed with the MITOS genome annotation pipeline (Bernt et al., 2013) using the vertebrate mitochondria genetic code. Annotations were performed in Geneious Prime and linear maps of complete mitogenomes were generated using the CGView server (https://proks ee.ca). Protein-coding sequences were translated into amino acids and checked for the presence of premature stop codons indicative of nuclear mitochondrial DNA segments (NUMTs). Nucleotide and amino acid composition, codon usage, and relative synonymous codon usage (RSCU) of the 13 protein-coding genes were analyzed using MEGA v11 (Tamura et al., 2021). Nucleotide compositional skew was calculated according to the formula: AT-skew = (A − T)/(A + T) and GCskew = (G − C)/(G + C) (Perna & Kocher, 1995). Information on the full characterization and nucleotide composition of these genomes can be found in Tables S1 and S2.

| Gempylidae phylogeny
The phylogenetic relationship of 21 gempylids representing 17 species was investigated (Table 1). Thirteen PCG sequences excluding stop codons were aligned separately with ClustalX2 (Larkin et al., 2007) using default settings. The resulting alignments were optimized manually in BioEdit v7.2.5 (Hall, 1999) to remove regions that were challenging to align because of unique insertions into one or more sequence regions. The best-fitting model for nucleotide substitution was determined in jModelTest v2.1.10 (Darriba et al., 2012) according to the Bayesian information criterion (BIC).
Descriptive statistics of single gene alignments and their predicted models of evolution are presented in Table S3. Following model testing, the 13 alignments were concatenated in the order in which they appeared in the mitogenome. Bayesian inference (BI) and maximumlikelihood (ML) were performed in MrBayes v3.2.7 (Ronquist et al., 2011) and Garli2.0 (Zwickl, 2008), respectively, using moderately gene-partitioned data. A heuristic tree search was performed using ML and the confidence level of/for each branch was evaluated by performing bootstrapping with 100 replicates. The Markov Chain Monte Carlo (MCMC) analysis was run for 20,000,000 generations to allow for adequate time for convergence with sampling at every 1000 generations. Twenty-five percent of starting trees were discarded as burn-in while the remaining trees were used to estimate the consensus tree (50% majority rule) and the Bayesian Posterior Probabilities (BPP). To ensure that stationarity had been reached, the Effective Sample Size (ESS) for all sampling parameters was checked in Tracer v1.7 (Rambaut et al., 2018). The resulting trees obtained from the BI and ML analyses were visualized and edited in FigTree v1.4.4 (Rambaut, 2012).

| Molecular divergence dating
Molecular divergence dating was performed in BEAST v2.6.6 (Bouckaert et al., 2019). An alignment of 13 PCGs was codonpartitioned to improve the data's phylogenetic signal. ModelFinder (Kalyaanamoorthy et al., 2017) implemented in IQ-TREE v2.1.3 was used to select the best partitioning schemes for the data. A package in BEAST, bModelTest, was used to infer substitution models for the identified partitions (Bouckaert & Drummond, 2017). bModelTest package allows for model prediction during the MCMC analysis.
Site models were inferred from a transversion/transition split subset (Table S4). The molecular phylogeny was calibrated using the fossil occurrence data of the most recent common ancestor (mrca) for

Thyrsitoides marleyi and Gempylus serpens, Thyrsitoides zarathoustrae.
This fossil was discovered in the fish-producing strata of Istehbanât and Elam of the late Eocene to early Oligocene 28.1-33.9 million years ago (mya) (Friedman et al., 2019). Gempylidae-specific speciation rates published by Rabosky et al. (2018) were used as birthRateY prior (Mean = 0.053, SD = 0.001). To retain species-tree relationships while predicting branch lengths, a starting tree was used. This was achievable since the current study's Bayesian inference and maximumlikelihood analysis yielded the same tree topology.
Prior to running the final analysis, we performed a series of tests to identify the appropriate parameters and priors for our data, such as testing the data for clock-likeness and comparing the performance of tree priors, specifically the Calibrated-Yule versus Birth-Death Model, and the calibration priors (normal versus log-normal distribution). The concluding MCMC analysis was conducted using the Calibrated-Yule tree model together with the uncorrelated relaxed molecular clock model while applying a normal distribution calibration prior (Mean = 31, Sigma = 0.5). We performed two independent runs each comprising 200,000,000 iterations with sampling at every 5000 generations. Before merging the results of the two runs we inspected the output log files in Tracer v1.7.2 (Rambaut et al., 2018) to check: (i) for effective sampling of estimated parameters (ESS values > 200), (ii) to confirm both runs converged to the same posterior distribution (each run should be sampling from the same distribution). Once this was confirmed, 20% burn-in was removed and the results were merged in LogCombiner v2.6.7 (Drummond & Bouckaert, 2015). A consensus tree (Maximum Clade Credibility Tree) summarizing the posterior tree distribution was constructed with TreeAnnotator v2.6.6 (Drummond & Bouckaert, 2015). The resulting tree was visualized and edited in FigTree v1.4.4 (Rambaut, 2012).

| Genome composition
The eight fish mitogenomes generated and compared in this study ranged between 16,401 and 16,550 bp in size, which falls within the expected range for most teleost mitogenomes. As in other vertebrates, all mitogenomes comprise 37 genes (13 protein-coding, 22 TA B L E 1 List of complete and partial mitochondrial sequences used in Gempylidae phylogenetic reconstruction. Trichiurus lepturus and Benthodesmus tenuis (Scombriformes: Trichiuridae) were used as outgroups.

Species
Family Source Two forms each of tRNA Leu and tRNA Ser , and the three tRNA clusters; IQM (Ile, Gln, and Met), WANCY (Trp, Ala, Asn, Cys, and Tyr), and HSL (His, Ser, and Leu) were identified, similar to other fish mitogenomes ( Figure 2). The tRNAs varied from 65 to 76 bp in size, which is consistent with prior fish studies. Similarly, to other vertebrates, the 22 tRNA genes utilized the same anticodons across all five species.

| Nucleotide composition
The mitogenomes of the five species studied were generally AT-rich, whether it was the protein-coding, tRNA, or rRNA genes being investigated. The control region D-loop had the highest A + T percentage (Table 2). Although protein-coding genes were AT-rich, they presented negative AT-skew values in all five species: nucleotide T compared with A was overly represented. All mitogenomes, whether protein-coding subsets, tRNA, rRNA genes, or control regions were considered, presented moderate negative GC-skew values, which F I G U R E 2 Complete linear mitochondrial genome maps of five Gempylidae species. Mitogenome features of Neoepinnula orientalis (OP354258), Rexea antefurcata (OP354256), Rexea prometheoides (OP354257), and Thyrsites atun (OP598802) are mapped to the largest contig of the five compared mitogenomes: that of Neoepinnula minetomai (OP359221). All mitogenomes expressed nearly identical nucleotide composition. Annotations were performed in Geneious Prime software and linear maps were generated in the CGView server.
TA B L E 2 Nucleotide composition and skewness levels calculated for the sequenced majority strands of five Gempylidae species.
The AT-richness (%) and nucleotide compositional skew per gene are presented in Figure 3. All 13 PCGs except for the ND6 gene showed a negative GC-skew. Table 3 shows sequence nucleotide distribution per codon position. Pronounced anti-G bias was observed at the second codon positions. The A + T composition of the second codon position was relatively higher compared with 1st and 3rd codon positions, which is not uncommon with vertebrates. Detailed nucleotide composition is presented in Table S2.

| Codon usage and RSCU
Codon usage analyses revealed that Gempylidae fishes used all 64 codons to encode all 20 amino acids in their PCGs. Leucine, averaging 650 counts was the most frequent amino acid, whereas Cystine (averaging 42 counts) was the least common ( Figure S1).

| Gempylidae phylogenetics
Both BI and ML tree search methodologies yielded nearly identical and well-resolved phylogenetic trees using a concatenated dataset of 13 PCGs. There were a few differences in the node support metrics used by the two approaches, with bootstrap generally providing less support than Bayesian posterior probability. Bayesian Inference identified two primary clades within the Gempylidae family ( Figure 5).

| Estimating time since divergence
The two primary clades of the Gempylidae are estimated to have di-

| DISCUSS ION
Advances in high-throughput sequencing technologies have led to increased availability of genome-scale data for evolutionary analysis.
These data provide a great opportunity for studying genetic diversity, population dynamics, and evolutionary genetics. Here we characterized the complete mitochondrial genomes for five Gempylidae species using Ion Torrent sequencing. The mitogenomes varied in size from 16,401 to 16,550 bp, which was well within the predicted range for teleost mitogenomes (Liu & Cui, 2009). Gene content and order of arrangement were conserved as typically found in other vertebrates. Most genes were encoded on the heavy strand, except ND6 and eight other tRNA genes, which occurred on the light strand.
All eight mitogenomes were AT-rich, a feature common to all teleost mitogenomes.

| Protein-coding gene features
A typical ATG initiation codon was found in all protein-coding genes except COI, which exclusively used GTG, similar to most teleosts (Li et al., 2019), as a result, COI is thought to evolve at a much slower rate than other mitochondrial genes (Li et al., 2019).
While TAA was the preferred termination codon for most protein- The presence of a tRNA gene, which serves as a punctuation marker, may allow transcription to end without using complete TA B L E 3 Nucleotide composition of five Gempylidae mitogenomes partitioned into protein-coding genes, transfer RNAs, and ribosomal RNAs.   (Satoh et al., 2016). On the contrary, only complete stop codons were used in the ND5, ND6, and COI genes. These genes were followed by a gene encoded on the opposite strand, which causes transcription without punctuation and may explain the exclusive use of complete stop codons by these genes Li, Qi, et al., 2020;Satoh et al., 2016).

| Mitogenome evolution
In fish, the mitochondria organelle plays an important role in ion regulation and the production of adenosine triphosphate (ATP) to maintain cell functioning and heat regulation. In essence, the mitochondrion allows fish to adapt, to some extent, to changes in water temperatures and salinity (Gerber et al., 2020;Lin & Sung, 2003).
The fishes of the Gempylidae family have a widespread range, Gempylids are among the best-known fishes anatomically.
Many of these studies, if not all of them, made use of features that were either common within the family or diagnostic of one or more genera. Such features cannot be utilized to infer relationships between the genera. In the current study, we demonstrated the superior performance of concatenated mitochondrial protein-coding genes in resolving complicated phylogenetic relationships such as for the Gempylidae. By utilizing multiple molecular markers our understanding of the evolutionary connections between species within the family was improved.
In the present study, gempylids were divided into two likely significant clades: Gempylidae 1 (Thyrsites, Promethichthys, Nealotus, F I G U R E 5 Gempylidae phylogeny reconstructed using the Bayesian inference and maximum-likelihood methods from concatenated dataset of 13 protein-coding genes of the mitochondria. Members of the trichiurids, Benthodesmus tenuis AP012522 and Trichiurus lepturus MK333401 were included as outgroups. The numbers on branches represent Bayesian Posterior Probabilities (BPP)/Bootstrap values (Bootstrap values below 70 are shown in red).

Diplospinus, Paradiplospinus, Rexea, Thyrsitoides, Gempylus, and
Nesiarchus), and Gempylidae 2 (Lepidocybium, Ruvettus, Neoepinnula, and Epinnula) ( Figure 5). The results of the Bayesian inference and maximum-likelihood approaches were considerably more in agreement, despite differences in node support. This is because although posterior probabilities and bootstraps are both approaches used to evaluate node support, these two metrics model different parameters, and as a result, disparities are not uncommon (Svennblad et al., 2006). Bayesian Inference is more conservative and less prone to strongly supporting a false phylogenetic hypothesis (Brooks et al., 2007;Douady et al., 2003).
According to Russo (1983) and Collette (1984), Promethichthys, together with Nealotus, and Rexea, are regarded as a monophyletic group within the gempylids and the genus Diplospinus is thought to be closely related to the genera Gempylus, Nesiarchus, and Paradiplospinus (Carpenter et al., 1995;Gago, 1997;Parin & Bekker, 1972). In the current study, these groups were nested within the Gempylidae 1 clade, which also contains Thyrsites, contrary to the classification of Russo (1983) where Thyrsites was grouped with Neoepinnula and Epinnula (Gempylidae 2 in the current study).
Interestingly, it was in the study of Russo (1983)  This finding is in line with the results of our recent work, in which we used all 13 PCGs of mitochondria to investigate the placement of T. atun within the Gempylidae family Mthethwa et al. (2023). However, the current results indicate that T. atun and T. marleyi are not closely related, even though we remain convinced in the inclusion of Thyrsites in Gempylidae 1. This could be a result of unrepresentative sampling in the latter study that only investigated the relationship between 13 gempylids while the former study likely suffered from decreased phylogenetic resolution capability of individual genes. Thyrsites atun, also known as snoek and Thyrsitoides marleyi commonly known as black snoek share several morphological similarities. Furthermore, Thyrsites and Gempylus (true sister genus of Thyrsitoides) are the only members of the Gempylidae to share a caringiform swimming mode and a high number of dorsal and anal finlets, making them evolutionary close relatives (Monsch, 2000). Lepidocybium flavobrunneum, which has consistently been identified as the most primitive member of the gempylids, was found in the basal cluster as were Ruvettus and Epinnula. The placement of Neoepinnula within Gempylidae clade 2 was expected. Neoepinnula spp. and Epinnula spp. share numerous similarities and were previously assumed to be members of the same F I G U R E 6 Time-calibrated phylogenetic tree of 17 Gempylidae species. Mean divergence times were estimated using a relaxed molecular clock model on a subset of mitochondrial genes. Two species of trichiurids, Benthodesmus tenuis AP012522 and Trichiurus lepturus MK333401 were included as outgroups. The blue disk denotes the placement of the fossil calibration.
genus. Nesiarchus nasutus is the oldest member of Gempylidae 1. Our findings are congruent with the molecular classification of Jondeung and Karinthanyakit (2010), and the molecular phylogenetic studies of Friedman et al. (2019), , and Li, Qi, et al. (2020).

| Evolution of Gempylidae-Calibrated time tree
The appearance of gempylids, scombrids, trichiuroids, and other distantly related predatory percomorphs in the early Paleogene is interpreted as a response to the extinction of major Mesozoic groups of predatory marine teleosts . The oldest Gempylidae fossil occurred in the Ypresian age (47.8-56.0 mya), an early era of the Eocene; the period after the Cretaceous-Paleogene about 20 million years after the K-Pg mass extinction. While the late Cretaceous and early Cenozoic mass extinction of K-Pg is regarded as the fundamental cause of contemporary fish diversification (Alfaro et al., 2018;Li, Qi, et al., 2020), it is reasonable to believe that a different geological event was responsible for the observed diversity in the Gempylidae family. The tectonic plate movement during the mid/late Eocene-Oligocene is one occurrence that might be related to the observed Gempylidae radiation. Tectonic plates are potentially one of the main determinants of marine biodiversity, as they modulate the distribution of seafloors through continental movement and collision over geological timescales, in turn shaping the marine food webs (Leprieur et al., 2016;Renema et al., 2008). Events linked to the shift of tectonic plates such as changes in ocean temperature, and oceanic currents have also been reported to influence biodiversity dynamics (Leprieur et al., 2016).
The gempylids and closely related sister groups such as scombrids are believed to have evolved from a deep-ocean ancestor . Primitive gempylids such as L. flavobrunneum, N. nasutus, R. pretiosus, G. serpens, T. atun, and T. marleyi

| Snoek phylogeography
While mitogenomes are routinely used for phylogenetic assessments, they are also invaluable resources for phylogeographic studies. In the current phylogenetic analyses, we included T. atun from five different geographic locations (Figure 1 Table S5).
These findings imply that currents, rather than the physical distance between the two lineages, significantly influence the gene flow of this species. The South African T. atun population occurs along the South Atlantic Ocean separated from the Indian Ocean by a barrier formed where the Benguela and Agulhas currents meet (Figure 1). This barrier plays a prominent role in limiting the dispersal of southern African coastal fishes (Maduna et al., 2016). Where two ocean currents converge, strong environmental gradients, such as temperature and salinity, have been observed. These gradients are believed to affect some mobile species' dispersion patterns and contribute to the genetic structuring of marine communities (Miller et al., 2013).
This explains why T. atun from South Africa (Atlantic Ocean) versus T. atun from Amsterdam and Saint-Paul Islands, although in geographic proximity to each other, but occurring on either side of the barrier, showed greater genetic variation than T. atun from Amsterdam and Saint-Paul Islands versus T. atun from New Zealand, which are geographically farther apart but are found on the same side of the barrier (Indian and Pacific oceans).

| CON CLUS ION
Although short-read mapping against a single reference sequence is a frequently used method for reconstructing genomes like in the present study, there is reason to believe that this method could introduce biases depending on the reference used for mapping (Valiente-Mullor et al., 2021). The majority of these errors result from genetic differences between the reference and read sequence data, and they can have an impact on the analysis that follows (Valiente-Mullor et al., 2021). The Ion Torrent, like all second-generation sequencing technologies, is incapable of sequencing long stretches of DNA; these short-read sequences frequently fail to generate sufficient overlap sequence from the DNA fragments. This constitutes a significant challenge for the de novo genome assembly (Adewale, 2020).
The authors advocate the use of third-generation sequencing technologies, long-read sequencing technologies (LSR) as they generate a reasonable length of overlap sequence for better sequence assembly (Adewale, 2020).
The current study sought to investigate the utility of mitochondrial genome data in understanding the phylogenetic relationships of the taxonomically complex Gempylidae family. Although comparative analysis of Gempylidae mitogenomes revealed a high degree of similarity at the nucleotide and amino acid level, including gene arrangement and gene content, their potential to resolve the gempylids' phylogenetic relationships with higher confidence support than any other marker used previously, is remarkable.
Even though this is the largest study to date to investigate the phylogenetic relationships of 17 (of 24 species), we were still un- was not involved in the study design, in the collection, analysis, and interpretation of data, in the writing of the report, and in the decision to submit the article for publication.

CO N FLI C T O F I NTER E S T S TATEM ENT
The authors report there are no competing interests to declare.
Submission declaration and verification: The work described in this article has not been published previously nor is it under consideration for publication elsewhere. Publication of this work is approved by all authors and explicitly by the responsible authorities where the work was carried out, and that, if accepted, it will not be published elsewhere in the same form, in English or any language, including electronically without the written consent of the copyright holder.

DATA AVA I L A B I L I T Y S TAT E M E N T
The genome sequence data that support the findings of this study are openly available in GenBank of NCBI at https://www.ncbi.