Phylogenetics and biogeography of the two‐wing flyingfish (Exocoetidae: Exocoetus)

Abstract Two‐wing flyingfish (Exocoetus spp.) are widely distributed, epipelagic, mid‐trophic organisms that feed on zooplankton and are preyed upon by numerous predators (e.g., tunas, dolphinfish, tropical seabirds), yet an understanding of their speciation and systematics is lacking. As a model of epipelagic fish speciation and to investigate mechanisms that increase biodiversity, we studied the phylogeny and biogeography of Exocoetus, a highly abundant holoepipelagic fish taxon of the tropical open ocean. Morphological and molecular data were used to evaluate the phylogenetic relationships, species boundaries, and biogeographic patterns of the five putative Exocoetus species. We show that the most widespread species (E. volitans) is sister to all other species, and we find no evidence for cryptic species in this taxon. The sister relationship between E. monocirrhus (Indo‐Pacific) and E. obtusirostris (Atlantic) indicates that the Isthmus of Panama and/or the Benguela Barrier may have played a role in their divergence via allopatric speciation. The sister species E. peruvianus and E. gibbosus are found in different regions of the Pacific Ocean; however, our molecular results do not show a clear distinction between these species, indicating recent divergence or ongoing gene flow. Overall, our phylogeny reveals that the most spatially restricted species are more recently derived, suggesting that allopatric barriers may drive speciation, but subsequent dispersal and range expansion may affect the distributions of species.


| INTRODUCTION
Marine fish habitats are typically large, continuous, and lack definitive boundaries. Fishes that inhabit the epipelagic zone are generally less taxonomically diverse than species found in other habitats (benthic, coastal, reef-associated, estuarine), possibly because the overall homogeneity of epipelagic habitats may reduce rates of speciation (Hamner, 1995). Nevertheless, some widespread and diverse fish families such as scombrids, belonids, hemiramphids, and exocoetids have circumtropical distributions that include a diversity of habitats (Gaither et al., 2015). The underlying mechanisms responsible for diversification in these fishes remain unclear, at least in part because their phylogenetic relationships are poorly resolved and their life history characteristics are little known. Phylogenetic characterizations are necessary to understand speciation because they define the sequence of lineage and species diversification. Also, phylogenies can clarify species identity when taxa are morphologically very similar (cryptic species), thereby improving understanding of species geographic distributions (Bass et al., 2005; Colborn et al., 2001; Quattro et al., 2005).
Comprehensive species phylogenies can provide key insights regarding speciation in marine lineages with high dispersal potential, wide ranges, and overlapping distributions. Exocoetus (two-wing flyingfish) is a monophyletic genus of five species found in the epipelagic waters of tropical and subtropical oceans worldwide (Lewallen et al., 2011; Parin & Shakhovskoy, 2000). Gliding on elongated pectoral fins (Figure 1) separates Exocoetus from most other members of the family Exocoetidae that can use elongated pectoral, pelvic, and sometimes dorsal fins to achieve prolonged aerial glides. As with many widely distributed fishes, Exocoetus has buoyant, pelagic eggs, and larvae that persist in the epipelagic zone during maturation, which occurs at lengths of 130-155 mm (SL) (Grudtsev et al., 1987). Exocoetus individuals live for approximately 1 year, are small [max SL ≤ 207 mm (Grudtsev et al., 1987)], slow swimming, and incapable of long-distance migrations (Parin, 1968). Curiously, the distribution of each Exocoetus species overlaps with at least one other species, suggesting they may have evolved in parapatry or sympatry. Species ranges vary from circumtropical (e.g., E. volitans) to single oceanographic regions (e.g., E. peruvianus), indicating differences in habitat specialization.
A thorough examination of genetic diversity in Exocoetus is greatly needed, considering the potential for uncovering cryptic species (especially within the globally distributed E. volitans). Here, through extensive sampling and phylogenetic analysis, we improve the resolution of evolutionary lineages within Exocoetus, thereby providing new data on how speciation occurs in the epipelagic zone. We specifically focused on the following questions: (1) What are the phylogenetic relationships within Exocoetus based on molecular data, and how do they compare to the most recent morphological hypothesis? (2) Do the currently recognized Exocoetus species represent distinct monophyletic lineages, and are there cryptic species? (3) What biogeographic patterns of speciation are revealed by phylogenetic arrangements within this genus?

| Taxon sampling
A total of 429 flyingfish specimens (422 Exocoetus and seven outgroup specimens) were collected at night using long-handled dipnets and/or donated by collaborators (Appendix S1). Animals were euthanized in an ice-water bath. Post-mortem handling included shipboard freezing in seawater, removal of lateral muscle tissue for DNA analysis (95% ethanol), whole-specimen fixation (10% formalin), and long-term museum archiving (70% ethanol). Each specimen was identified using key diagnostic characters (e.g., gill raker counts and body depth measurements) as presented in Parin and Shakhovskoy (2000). All voucher specimens are archived with catalogue numbers at the Royal Ontario Museum or Scripps Institution of Oceanography (Appendix S1). We note that 15 specimens used in the current study were included in a previous study (Lewallen et al., 2011). Also, 266 E. volitans specimens were sequenced (Cytb) for a previous population genetic analysis (Lewallen et al., 2016). Details regarding which specimens are common among studies are provided in Appendix S1.
FIGURE 1 An Exocoetus fish gliding along the surface of epipelagic water in the eastern tropical Pacific. Photo credit: EAL (first author)
FIGURE 2 Morphology-based phylogenetic hypotheses for Exocoetus. (a) Phylogenetic hypothesis presented by Parin and Shakhovskoy (2000). Illustrations of adults and juveniles were compiled from the following publications: Exocoetus volitans (Parin, 2002), Exocoetus obtusirostris (Parin, 2002), Exocoetus monocirrhus adult (Parin, 1984); Exocoetus monocirrhus juvenile (Heemstra & Parin, 1986), Exocoetus peruvianus (Parin & Shakhovskoy, 2000), Exocoetus gibbosus (Parin & Shakhovskoy, 2000). (b) Phylogenetic hypothesis using the same 11 morphological characters as Parin and Shakhovskoy (2000)
Parin and Shakhovskoy (2000) presented a series of morphological characters for Exocoetus and a phylogenetic hypothesis for the genus (Figure 2a).
However, their study used dichotomous morphological character analyses to discern species rather than explicit phylogenetic analyses. Importantly, only 11 characters in Parin and Shakhovskoy's study were informative for distinguishing species and could be clearly coded for phylogenetic analysis. To test the morphology-based phylogeny for this genus, we tabulated the characters presented in Parin and Shakhovskoy (2000) into a data matrix (Table 1).
Genomic DNA was extracted using DNeasy kits (Qiagen, Valencia, CA, USA). Portions of the Cytb and Rag2 genes were amplified using previously published primers (ExoCBFwd and ExoCBRev for Cytb; Ffly-Ch and Rfly-Ch for Rag2; Lewallen et al., 2011). One advantage of using Rag2 over some other nuclear genes is that it does not contain introns in the coding region (Peixoto, Mikawa, & Brenner, 2000). PCR conditions, internal sequencing primers (ExoFwd1 and ExoRev1 for Cytb; F16-Ch and R17-Ch for Rag2), and sequence alignment methods followed Lewallen et al. (2011).
TABLE 1 Morphological character matrix for the 11 characters described by Parin and Shakhovskoy (2000). Characters were coded as binary (1 or 0). Outgroup taxa (Fodiator acutus and F. rostratus) were added to this matrix using morphological data presented by Parin and Belyanina (2002) and Parin and Shakhovskoy (2000)
For maximum parsimony (MP) analysis of Cytb (Set 2) and Rag2 (Set 3) data, we used heuristic searches (10,000 random addition sequence replicates and TBR branch swapping). Bootstrap support was calculated with 100 bootstrap replicates and 10,000 random addition sequence replicates per bootstrap iteration. The expanded Cytb dataset (Set 5) was analyzed using a heuristic search of 1,000 random addition sequence replicates and TBR branch swapping. For this analysis, four non-Exocoetus sequences were included (two P. hillianus and two P. brachypterus) and designated as outgroups (Appendix S2).
Outgroup taxa for each dataset were the same as in MP analyses above. As in previous phylogenetic analyses of these taxa and molecular markers (Lewallen et al., 2011), a general time reversible model with invariant sites and gamma distribution (GTR + I + Γ) was determined as the best model of evolution, and was used for this study.
Using a random starting tree, 10 million MCMC generations were run, saving one of every 1,000 trees, and the first 10% of saved trees were discarded as burn-in. TRACER 1.4 (Rambaut & Drummond, 2007a) was used to view the posterior distribution of sampled trees and assess convergence, and TreeAnnotator 1.4 (Rambaut & Drummond, 2007b) was used to calculate a maximum clade credibility tree.
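The sampling scheme described above implies a specific number of retained trees, which can be checked with a short calculation. The following is only a sketch of that arithmetic: the variable names are illustrative and are not taken from any software configuration, and whether the starting state is also written to the tree file depends on the program, so the counts may be off by one in practice.

```python
# Sampling arithmetic for the MCMC run described in the text.
# All variable names are illustrative assumptions.
generations = 10_000_000   # total MCMC generations
sample_every = 1_000       # one tree saved per 1,000 generations
burnin_fraction = 0.10     # first 10% of saved trees discarded as burn-in

saved_trees = generations // sample_every            # trees written to file
burnin_trees = round(saved_trees * burnin_fraction)  # trees discarded
retained_trees = saved_trees - burnin_trees          # trees left for the summary tree

print(saved_trees, burnin_trees, retained_trees)
```

Under these assumptions, the maximum clade credibility tree would be summarized from the trees that remain after burn-in.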
Phylograms were generated using TreeView (Page, 1996), with branch lengths corresponding to substitutions per site and Bayesian posterior probabilities (BPP) presented at each node.

| Genetic distance
To estimate genetic distances among sampled individuals, mean Kimura two-parameter (K2P) values (Kimura, 1980) were calculated using MEGA 5 (Tamura et al., 2011). All possible pairwise comparisons were calculated among individuals within each species, and also between each species. Between-species genetic distance estimates were then used to obtain an overall mean for the genus.
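For readers unfamiliar with the K2P model, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The sketch below illustrates this calculation and the within-species and between-species averaging described above. It is not the MEGA implementation: the function names are hypothetical, and it assumes that sites with gaps or ambiguous bases are skipped pair by pair, which may differ from the settings used in the study.

```python
import math
from itertools import combinations, product

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases for this pair
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturation)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

def mean_within(seqs):
    """Mean K2P distance over all pairs of individuals in one species."""
    pairs = list(combinations(seqs, 2))
    return sum(k2p_distance(a, b) for a, b in pairs) / len(pairs)

def mean_between(group_a, group_b):
    """Mean K2P distance over all pairs drawn from two different species."""
    pairs = list(product(group_a, group_b))
    return sum(k2p_distance(a, b) for a, b in pairs) / len(pairs)
```

In this sketch, `mean_within` would be applied to the sequences of each species in turn, and `mean_between` to each pair of species; averaging the between-species values would give an overall mean for the genus.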

| Phylogeny of Exocoetus
Phylogenetic analyses of Exocoetus species yielded consistent, well-supported evidence for the arrangement of four monophyletic groups irrespective of the method used. First, Exocoetus is monophyletic, which corroborates the findings of other authors (Collette et al., 1984; Lewallen et al., 2011; Parin, 1961; Parin & Shakhovskoy, 2000).
Gene trees may not be congruent with a species tree when the rate of speciation exceeds the rate at which allelic polymorphisms achieve reciprocal monophyly in separated gene pools (Harrison, 1991).

| Species distinctions
Although we have not calibrated a molecular clock for this study, the very low divergence between individuals of the two putative species (E. peruvianus and E. gibbosus) is indicative of very recent divergence. A second possibility is that E. peruvianus and E. gibbosus represent a single species, with regular gene flow between two distant allopatric populations that is sufficient to prevent them from becoming reproductively isolated.
Observed morphological differences might be due to phenotypic plasticity associated with the occupation of slightly different habitats. If this is the case, these species would be better classified as regional morphotypes of the same species, a pattern that has been observed in other flyingfishes (Parin & Belyanina, 1998). Our results point to the need for additional sampling and genetic analyses to confirm whether E. peruvianus and E. gibbosus represent distinct species. Increasing the number of sampled individuals or using higher-resolution genetic markers would likely improve our ability to differentiate between the two scenarios described above. We also note that adding samples for lineages with fewer sequenced individuals would reduce any biases caused by differences in sample size across the species analyzed. For example, we are likely to have incompletely sampled the total Cytb variation of E. peruvianus and E. gibbosus, and further sequencing may provide a clearer indication of whether these putative species are genetically isolated.

| Cryptic species
Cryptic species (Bickford et al., 2007) have long posed taxonomic challenges and may be identified using anatomical, ecological, behavioral, biogeographic, and/or molecular characteristics. DNA comparisons can provide particularly useful information about species distinctiveness. Hebert et al. (2003) suggested that genetic distance estimates above 3% for the DNA "barcode" gene cytochrome oxidase I should be used as a threshold for defining species, and genetic distance estimates above 2% have been proposed for distinguishing vertebrate species using Cytb data (Avise & Walker, 1999). However, other studies have shown that model selection for genetic distance calculations can also affect species delimitation (Barley & Thomson, 2016). In our study, very low mean Cytb genetic distance estimates (K2P = 0.7-1.2%) within each species suggest an absence of cryptic species. In addition, in the case of E. volitans, an analysis across the range of the species found minimal population genetic structure at a global scale (Lewallen et al., 2016).
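The comparison made above can be written out explicitly. The sketch below uses only the values quoted in the text (the 2% Cytb threshold and the reported range of within-species means); the names are illustrative, and a fixed threshold is a heuristic rather than a formal species-delimitation test.

```python
# Threshold and distances as quoted in the text, in percent K2P divergence.
CYTB_THRESHOLD = 2.0                 # proposed for vertebrate Cytb data
within_species_range = (0.7, 1.2)    # lowest and highest within-species means

def exceeds_threshold(distance_percent, threshold_percent=CYTB_THRESHOLD):
    """True if a distance is large enough to hint at more than one species."""
    return distance_percent > threshold_percent

flags = [exceeds_threshold(d) for d in within_species_range]
print(flags)
```

Both ends of the reported range fall below the threshold, which is the basis for the inference that no species in the genus harbors cryptic lineages at this marker.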
Single widely distributed marine taxa are sometimes found to consist of morphologically cryptic, but genetically distinct, independent evolutionary lineages (species) segregated by ocean basin (Briggs, 1960) or oceanographic factors (Gaither et al., 2015). Examples include bristlemouths (Miya & Nishida, 1997), goliath groupers (Craig et al., 2009), bonefish (Bowen, Karl, & Pfeiler, 2007), ocean sunfish (Bass et al., 2005), and hammerhead sharks (Quattro et al., 2005). In contrast to these documented cases of cryptic species, we find no evidence of this phenomenon in Exocoetus, despite the multi-ocean distributions of E. volitans and E. monocirrhus. At least some widely distributed species maintain global population connectivity by dispersal (e.g., pelagic wahoo; Theisen et al., 2008). For Exocoetus, buoyant pelagic eggs likely provide an adequate mechanism for dispersal across large distances. The exact pelagic larval duration of Exocoetus is not known, but Mora et al. (2012) recently demonstrated that the larvae of many tropical reef fishes persist in the water column long enough for regular breaching of the Eastern Pacific Barrier. Thus, long-distance dispersal of pelagic eggs could explain the lack of genetic differentiation among Exocoetus populations from different oceans.

| Exocoetus biogeography
At the base of the Exocoetus tree, E. volitans is distributed throughout all tropical oceans and is sympatric with every other species in the genus (Figure 6), although fine-scale patterns of habitat preference might preclude contact among individuals from different species (e.g., seasonally sympatric/parapatric). Because E. volitans is sister to all other species in this genus, we conclude that the ancestor of Exocoetus fishes may have been similar to E. volitans in both morphology and distribution. The sister lineage to E. volitans may also have had an expansive distribution, so inferring the geographic context of divergence within Exocoetus is difficult. Sympatric diversification between two globally distributed lineages is a possibility, but equally realistic is allopatric diversification followed by significant dispersal and range expansion.
Because E. volitans individuals from the Atlantic and Indo-Pacific are not genetically diverged (Lewallen et al., 2016), we suggest that the species either dispersed between these regions very recently, or that there is regular gene flow between oceans, presumably across the Benguela Barrier. The Benguela Barrier results from the upwelling of cold waters near the tip of South Africa and can prevent dispersal between the tropical Atlantic and tropical Indian Oceans for some marine fishes (Briggs, 1995; Rocha, Craig, & Bowen, 2007). However, Rocha et al. (2005) provided compelling evidence to suggest that at least some tropical marine fishes (Gnatholepis gobies) have breached the Benguela Barrier to invade the Atlantic Ocean from the Indian Ocean. Additionally, Craig, Hastings and Pondella (2004) showed support for a sister species relationship between Caribbean and Western Indian Ocean species of the grouper genus Dermatolepis, demonstrating trans-Atlantic dispersal and crossing of the Benguela Barrier by reef fishes. The distribution of E. volitans and the minimal genetic divergence between individuals from the Atlantic and Indian Oceans suggest that Exocoetus may be capable of similar dispersals.
The two most distal nodes of the Exocoetus tree (E. obtusirostris + E. monocirrhus and E. peruvianus + E. gibbosus) provide better opportunities for determining the biogeographic context of diversification (Figure 6). In particular, previously identified marine biogeographic barriers (Rocha et al., 2007) may have provided effective boundaries to limit gene flow, resulting in speciation. On the west side of the Atlantic Ocean, the Isthmus of Panama is a well-known land barrier that has resulted in the speciation of many Atlantic and Pacific sister lineages of marine fishes (Banford, Bermingham, & Collette, 2004). According to Banford et al. (2004), at least four periods in the last 10 million years provided marine fishes with opportunities for allopatric speciation on opposite sides of the Isthmus of Panama as it gradually formed. Perhaps as a result of species-specific thermal tolerances, the cold waters off South Africa (Benguela Barrier) seem to effectively confine E. obtusirostris and E. monocirrhus to their respective tropical Atlantic and tropical Indo-Pacific distributions.
The sister relationship between E. peruvianus and E. gibbosus, in combination with their respective distributions in the Peruvian Upwelling Current and South Pacific Subtropical Gyre (Parin & Shakhovskoy, 2000), suggests that the Eastern Pacific Barrier (Lessios & Robertson, 2006) may segregate these putative species. The Eastern Pacific Barrier is an expanse of deep water (4,000-7,000 km wide) that separates coastally distributed fishes, although Lessios and Robertson (2006) showed examples of species that can cross this barrier. The lack of islands in the eastern Pacific makes it a particularly effective barrier for some reef-inhabiting organisms. However, this barrier should, in principle, only influence coastal or neritic flyingfish species (e.g., Fodiator, Parexocoetus), or species that have island-associated life stages (e.g., Cheilopogon atrisignis, Cypselurus angusticeps), and would be expected to be irrelevant to holoepipelagic species such as Exocoetus.
Thus, E. peruvianus and E. gibbosus may be separated by some other oceanographic barrier. Additional samples are required to first determine whether these taxa are distinct, and then reveal if and how gene flow occurs across this putative barrier. Given the low amount of sequence divergence between E. peruvianus and E. gibbosus, we favor the hypothesis that this species pair is the result of recent divergence and provides a rare example of incipient speciation in an epipelagic fish lineage. As such, these taxa are good candidates for addressing the mechanisms by which speciation occurs in the epipelagic zone.
FIGURE 6 Phylogeny of Exocoetus with distribution maps derived from collection localities presented in Parin and Shakhovskoy (2000). Polygons were produced using ArcMap 9.

ACKNOWLEDGMENTS
For assistance with collecting specimens, we thank the crews and re-