Mitochondrial sequence diversity reveals the hybrid origin of invasive gibel carp (Carassius gibelio) populations in Hungary

Background Invasive gibel carp, Carassius gibelio (Bloch, 1782) has become well-established in the Hungarian waters and now are spreading in the European waters. On major concern now is the potential hybridization between gibel carp and the other invasive species in the Carassius auratus complex (CAC), which may further accelerate the spread of the whole invasive species complex. The identification of gibel carp and their hybrids is difficult because of its morphological similarity to the other species in CAC. Here we carry out a genomic assessment to understand the history of gibel carp invasion and its phylogenetic relationship with the other species in CAC. Three loci of the mitochondrial genome (D-loop, CoI, Cytb) were used to determine the phylogenetic origin of individuals and relarionship among six gibel carp populations and the other species in the CAC. Methodolgy A total of 132 gibel carp samples from six locations in Southern Transdanubia (Hungary) were collected after phenotypic identification to measure the genetic diversity within and among gibel carp populations of Southern Transdanubia (Hungary). The genetic background was examined by the sequences of the mitochondrial genome: D-loop, Cytochrome c oxidase I (CoI) and Cytochrome b (Cytb). Mitochondrial genetic markers are excellent tools for phylogenetic studies because they are maternally inherited. Successfully identified haplotypes were aligned and with reference sequences in nucleotide databases (i.e., NCBI-BLAST: National Centre for Biotechnology Information and BOLD: Barcode of Life Data System). The phylogenetic relationships among gibel carp populations were then analyzed together with the reference sequences to understand the relationship and the level of hybridization with the species in CAC. Results Among the 132 aligned D-loop sequences 22 haplotypes were identified. Further examination of representative individuals of the 22 haplotypes, six Cytb and four CoI sequences were detected. The largest number of haplotypes of all three loci were found in Lake Balaton, the largest shallow lake in Central Europe. Based on the NCBI-BLAST alignment of the D-loop, haplotypes of Carassius auratus auratus and Carassius a. buergeri in CAC were identified in the C. gibelio samples. Further analysis of haplotypes with the other two mitochondrial markers confirmed the occurrence of intragenus hybridization of C. gibelio in the Hungarian waters. Conclusion By using three mitochondrial markers (D-loop, Cytb, CoI), we genomically characterized a gibel carp-complex in Hungarian waters and assessed the C. gibelio phylogenetic status between them. Hybrid origin of locally invasive Carassius taxon was detected in Hungary. It points out that invasive species are not only present in Hungary but reproduce with each other in the waters, further accelerating their spread.

In Hungary, it was mentioned in the literature the first time in 1887 by Herman, but it was most likely a misidentified specimen (Herman, 1887). The first official shipment to fish farms arrived in 1954, from Bulgaria (Szalay, 1954). Nowadays it is one of the most common generalist fish species in the lowland waters of the country (Takacs et al., 2017).
Although gibel carps are now found in most of European waters, its taxonomy remains unsolved. The species has been grouped into the Carassius auratus-complex (CAC), which also includes, for example, C. auratus, C. gibelio, C. praecipuus, C. langsdorfii, C. cuvieri, C. carassius (Rylkova et al., 2018;Takada et al., 2010). Some authors referred it as a subspecies of the goldfish, Carassius auratus (Linnaeus, 1758), but the others argued that the differences are not enough to list it as a subspecies and both belong to the same species C. auratus (Berg, 1932;Guti, 1993;Pinter, 2002). Furthermore, gibel carp has different levels of ploidy (2n = 100, 3n = 150-160) within a single population (Zhou & Gui, 2017) which makes the genetic diversity analysis of this species even more difficult. Kalous et al. (2012) have specified two neotypes of the C. gibelio. The described neotypes and knowledge of the genetic background allows tracking of the spreads of gibel carp populations and CAC and help understand the level of hybridization among them. Hybrids between gibel carp and the other species in CAC can be more of a threat to native species than gibel carp in European waters given their rapid growth and even wider environmental tolerance (Wouters et al., 2012).
The main objectives of this study are, by using different molecular markers, to identify gibel carp haplotypes, assess the phylogenetic status of the recent, locally invasive populations in Hungary, and understand their genetic diversity within and among these populations.

Bioinformatic and statistical analysis
Chromatograms were converted to FASTA file format with BioEdit Sequence Alignment Editor (Hall, 1999 (Rohl, Bandelt & Forster, 1999;Clement et al., 2002). We used the closest related species (C. auratus, C. a. buergeri, Carassius carassius, C. cuvieri, Cyprinus carpio carpio) as reference sequences along with other C. gibelio haplotypes found in the literature. In case of Cytb based on the genetic distance and the phylogenetic tree was reconstructed by MegaX software with Neighbour Joining fitting with Kimura-2 parameter and 1,000 bootstrap replication. The references were the haplotypes described in C. gibelio neotypes by Kalous et al. (2012), as well as European sequences described by Takada

D-loop
Based on the comparisons of 699 bp long sequences of the D-loop region, 22 haplotypes were identified in 132 individuals ( Table 1). Three of these (HapDl_1, HapDl_7, HapDl_21) were identified in more than 15 individuals while the others were less frequent. These haplotypes were present in maximum 6 individuals. Within the haplotypes, the number of polymorphic sites was 43 (Fig. 2). For each population, the highest haplotype diversity was in Siófok 0,83 (HD) ± 0,04 (SD). The second was the KWBPS I with 0,80 (HD) ± 0,08 (SD). The lowest value was 0,19 (HD) ± 0,11 (SD) for the population of Őszödi-berek. In the population of Hőgyész HD was 0,72 ± 0,05 (SD), in the population of Siófok-Törek 0,66 ± 0,07 and in the population of KWBPS II. HD was 0,63 ± 0,09 (SD). Thirtyfive sites were parsimony informative, based on the DnaSP analyses. Samples from all locations represented a mixture of different haplotypes except those for the sequences from Őszödi-berek, which separated into two individual groups (HapDl_21, HapDl_22) and the HapDl_2 group with seven individuals from Lake Balaton. Twelve other haplotypes were found (HapDl_3, HapDl_5, HapDl_6, HapDl_10, HapDl_11, HapDl_12, HapDl_14, HapDl_15, HapDl_17, HapDl_18, HapDl_19, HapDl_20), of which each contained only one or two samples from one location. The identified haplotypes were divided into three major groups separated on the network figure (Fig. 3). Most of the haplotypes, included the two largest groups (HapDl_1, HapDl_7), clustered together in the first group (Dl_Group 1.     (Table 2) among the populations has shown a moderate difference between each pair but most of the values were lower than 0.1, with the exception of one population (Őszödi-berek) which resulted in higher values. The lowest difference (−0.015) was identified between the populations of Hőgyész and KBWPS Stage I.

Cytochrome b
Cytb sequences were analysed to confirm the results of 22 D-loop haplotypes. Only six Cytb haplotypes were identified. The complete length of the sequence region was 1057 bp. Within the haplotypes the numbers of polymorphic sites were 27 (Fig. 4). NCBI database was used to check the evolutionary origin of the sequences. All six haplotypes were detected as Carassius gibelio (Table 3) Table 4). The group containing most of the samples (HapCoI_2) had 100% of identity coverage of the Tatia intermedia query sequence. Furthermore, based on the BOLD system HapCoI_2 and HapCoI_4 were identified as C. gibelio. HapCoI_1, as well as HapCoI_3 have shown the highest similarity with C. a. auratus. In the case of HapCoI_3, not even the first 20 hits contained C. gibelio sequences. The phylogenetic tree formed three clearly visible branches (Fig. 7). The first branch included the HapCoI_1 and HapCoI_3. The second branch contained the HapCoI_2 and the HapCoI_4 was in the third group. These haplogroups contained samples also from the geographically distant fish ponds of Hőgyész, which is attributed to be ancestral. These haplotypes are in a centroid position among others in the network system (Fig. 3). The rest of the haplotypes were found only in a small number of individuals. The largest number of haplotypes of all three loci were found in Lake Balaton, which corresponds to the size of the lake (the largest shallow lake in Central Europe Istvanovics et al., 2007). Haplotype diversity is potentially corresponded with the diverse and geographically isolated spawning habitats, situated in the inflows throughout the catchment. Young fish usually stay in these habitats until they reach 1-3 years of age (Specziár, 2010).  among the populations. The only exception is the population of Őszödi-berek, which is a completely closed water system, no fish can reach or escape from it. There, we identified only two haplotypes and they were not present in other populations. This wetland was disconnected from other water bodies during the water regulation period of the early 19th century (Zlinszky & Timar, 2013). Gibel carp was most possibly introduced to this site unintentionally by coarse angels, which has been documented in other systems in the earlier years (Docherty et al., 2017). The wetland suffered from multiple, at least partial droughts in the last 20 years, which most possibly resulted in a population level bottleneck (Ferincz et al., 2016a;Lennox et al., 2019). Similar observations were made in the cases of Gadopsis marmoratus in Australia (Coleman et al., 2018) and Eupallasella percnurus in Poland (Kaczmarczyk & Wolnicki, 2016), which reinforced that isolated populations are more vulnerable to stochastic events. Five C. a. auratus (HapDl_1, HapDl_3, HapDl_8, HapDl_14, HapDl_20) and one C. a. buergeri (HapDl_4) haplotypes were identified in 5 different populations with low frequency. The most distant haplotype (HapDl_4) was identified as C. a. buergeri (Temminck & Schlegel, 1846) but contained only 3 individuals. The genetic detachment of these individuals is clear. The presence of these haplotypes in the studied natural ''gibel carp'' populations clearly indicates interspecific hybridization among CAC species in this region. The two ornamentally bred species (C. auratus and C. a. buergeri) are not native in the European waters, but originated from different types of goldfish kept commonly by aquarists. None of the previous European studies ( Kalous et al., 2012;Rylkova & Kalous, 2013;Kalous et al., 2007;Tsipas et al., 2009) revealed the presence of C. a. buergeri from natural waters outside of Asia (Ueda & Ojima, 1978;Ojima & Yamano, 1980;Kobayasi, Ochi & Takeuchi, 1973). Our research did not find any sequence that could originate from other nonindigenous species in CAC previously described from European waters (e.g., Carassius langsdorfii) (Rylkova & Kalous, 2013;Kalous et al., 2007;Tsipas et al., 2009) or hybrids with the native Carassius carassius (Zlinszky & Timar, 2013;Rylkova et al., 2013). Nonetheless the hybridization between the members of the CAC cannot be ruled out based solely on the mitochondrial regions. Because of their maternal inheritance, only the female Carassius gibelio sequences are detectable and the paternal lines remains hidden.
To get a more comprehensive information, the sequences of Cytb and CoI loci of 22 haplotypes were examined. However, neither Cytb nor CoI has shown as many haplotypes as the D-loop did. Rylkova & Kalous (2013) also reported that Cytb showed low genetic diversity. The Cytb haplotypes supported the phenotypic identification but does not confirm the presence of C. auratus and the group, determined as C. a. buergeri. Comparing to the D-loop BLAST results none of the first 100 scored Cytb hits contained sequences from the species C. a. buergeri. The CoI was the less informative marker of the three tested types. It has identified only four haplotypes. However, based on the BLAST analysis of the standard nucleotide full database of GeneBank, all four showed unreliable results. The highly scored overlapping with the driftwood catfish (Tatia intermedia) and the amur wild carp (Cyprinus carpio haematopterus) in the database draws attention to the weakness of using online databases, which could contain incorrectly uploaded data. For example Elgin, Tunna & Jackson (2014) and Buhay (2009) both had similar experience with the CoI. The benefits of the BOLD (barcode based) system were emphasized, but it did not help to clarify our results. Based on our results, we agree with Tsipas et al. (2009) that in the future, in order to determine the status of gibel carp in taxonomic studies, it will be necessary to include nuclear markers in addition to mitochondrial markers. Nuclear genetic markers are able to widen genetic identification and may be suitable for finding foreign sequence pieces that can be used to explore the hybrid origin, which may be hidden due to maternal inheritance in the analyses of the mitochondrial genome. However, more markers (from 4 to 70) should have to be used for the proper identification of hybrids in the later or backcrossed generations (Boecklen & Howard, 1997).
The Cytb and CoI have the lowest mutation rate among the mitochondrial protein coding genes in fish. The CoI is used in BOLD system for species identification but our results are revealed that Cytb can be used more efficiently for identification the CAC complex because of more reliable databases. This marker showed the highest agreement with the phenotypes. While the D-loop showed the highest genetic variability. It is in agreement with other studies and explained by the highest mutation rate of the only larger non-coding mitochondrial region (Brown Wesley, Matthew Jr & Wilson, 1979;Parker et al., 1998). It is much more informative, than the nuclear markers when used for the analysis of closely related species, subspecies categories or populations (Brown Wesley, Matthew Jr & Wilson, 1979;Lagouge & Larsson, 2013;Ballard & Whitlock, 2004). Our results also show that this region can be efficiently used for intrapopulation analyses for identifying hybrids in CAC.

CONCLUSIONS
This study is the first genetic diversity assessment for Hungarian gibel carp populations, in which we reported the recent homogenous genetic background of the studied populations. However, the potential hybrid origins of gibel carps were identified in the studied waters. The origin of the introgressed C. auratus sequences is doubtful but denotes the unreliability of morphological based identification of taxa within the genus Carassius and the hidden presence of goldfish in natural waters of Hungary.