Population genetics of the freshwater fish Prochilodus magdalenae (Characiformes: Prochilodontidae), using species-specific microsatellite loci

Prochilodus magdalenae is a freshwater fish endemic to the Colombian Magdalena-Cauca and Caribbean hydrographic basins. The genetic structure patterns of populations of different members of Prochilodus and the historic restocking of its depleted natural populations suggest that P. magdalenae exhibits genetic stocks that coexist and co-migrate throughout the rivers Magdalena, Cauca, Cesar, Sinú and Atrato. To test this hypothesis and explore the levels of genetic diversity and population demography of 725 samples of P. magdalenae from the studied rivers, we developed a set of 11 species-specific microsatellite loci using next-generation sequencing, bioinformatics, and experimental tests of the levels of diversity of the microsatellite loci. The results evidenced that P. magdalenae exhibits high genetic diversity, significant inbreeding coefficient ranging from 0.162 to 0.202, and signs of erosion of the genetic pool. Additionally, the population genetic structure constitutes a mixture of genetic stocks heterogeneously distributed along the studied rivers, and moreover, a highly divergent genetic stock was detected in Chucurí, Puerto Berrío and Palagua that may result from restocking practices. This study provides molecular tools and a wide framework regarding the genetic diversity and structure of P. magdalenae, which is crucial to complement its baseline information, diagnosis and monitoring of populations, and to support the implementation of adequate regulation, management, and conservation policies.


INTRODUCTION
The family Prochilodontidae (Teleostei: Characiformes) comprises the genera Prochilodus, Semaprochilodus, and Ichthyoelephas, and encompasses 21 Neotropical freshwater fish species in the main river basins of South America (Castro & Vari, 2004). Most of the prochilodontids exhibit large body sizes, high fecundities, and abundances, representing around 50-80% of the biomass caught by the subsistence and commercial fisheries in some regions of their distribution area (Barroca et al., 2012b;Melo et al., 2016a). Furthermore, some members of Prochilodontidae constitute a potential resource for fish farming due to certain characteristics such as their fast growth and weight increase, rustic management, and high economic value (Flores-Nava & Brown, 2010;DellaRosa et al., 2014;Roux et al., 2015).
In addition to the economic importance, Prochilodontidae plays an important trophic role in aquatic ecosystems. These detritivorous and migratory fishes contribute to the nutrient cycling, distribution, equilibrium, and maintenance of energetic flows and support a wide trophic network for a great number of predators (Flecker, 1996). Hence, the adequate management of fisheries is crucial for the maintenance of high productivity and permanent resource availability, as well as to guarantee the stability and continuity of the aquatic ecosystems (Taylor, Flecker & Hall, 2006;Batista & Lima, 2010).
The bocachico Prochilodus magdalenae Steindachner 1878 is the most representative endemic species of the Colombian ichthyofauna, considered the emblematic fishery resource of the Magdalena-Cauca Basin, with an estimated unload for the Magdalena Basin of 2,182.67 metric tons in 2013 (Colombian fishing statistical service: SEPEC). However, between 1978 and 2012, this species experienced drastic decreases in its population densities, catches (approx. 85%), and mean catch sizes. These effects resulted from overfishing during migratory periods, violations of legislation related to mean catch sizes, and habitat disturbances including deforesting, floodplain lake desiccations, agrochemical or chemical contamination derived from farming and mining activities, sedimentation, and dam/hydropower construction, among others (Cortés Millán, 2003;Lasso et al., 2010;Mojica et al., 2012).
To counteract this detrimental situation, several state regulations were implemented for the management and conservation of P. magdalenae (Usma et al., 2009;Lasso et al., 2010;Mojica et al., 2012). Specifically, this fish resource was cataloged as critically endangered in 2002 and as vulnerable since 2012 in the Colombian Red List of freshwater fishes (Mojica et al., 2012). Additionally, national regulations of territorial entities and autonomous corporations focused their efforts on the restocking of natural stocks threatened in the last 20 years (INPA regulation 531-1995;ANLA, INCODER, AUNAP regulation 2838. However, these last-mentioned activities are not based on knowledge of the population genetics of P. magdalenae and their ecological, genetic, and sanitary impacts are unknown due to the lack of programmatic monitoring and regulation of fish farming (Povh et al., 2008;FAO, 2011).
Moreover, population genetic studies of P. magdalenae are scarce, and fragmented (López-Macías et al., 2009;Aguirre-Pabón, Narváez-Barandica & Castro-García, 2013; see Mancera-Rodríguez, Márquez & Hurtado-Alarcón, 2013;Orozco-Berdugo & Narváez-Barandica, 2014;Hernández, Navarro & Muñoz, 2017), and most of the required information regarding the origin, genetic diversity and structure of juveniles used for restocking of natural stocks remains unavailable. Hence, natural stocks of P. magdalenae are highly susceptible to experiencing disturbances of their genetic background resulting from artificial mixtures of genetic stocks with different evolutionary histories or, alternatively, from the high competition for resources among different stocks.
Since P. magdalenae performs long longitudinal migrations (ca. 1,224 km; velocity: 55.6 km/day) (López-Casas et al., 2016), it is reasonable to think that its natural stocks experience extensive gene flow. However, the observation that Prochilodus lineatus (Godoy, 1959) and Prochilodus argenteus (Godinho & Kynard, 2006) show fidelity to spawning sites ("homing") suggests that P. magdalenae may exhibit a population genetic structure even in the absence of physical barriers.

Microsatellite loci development
Low-coverage sequencing of the genomic library of one specimen of P. magdalenae from the middle section of the Magdalena River was performed using the Illumina MiSeq v2 platform (Illumina, CA, USA), the "whole genome shotgun" strategy and the Nextera library preparation kits (Illumina, CA, USA) for the sequence reads. All steps regarding the read cleaning, contig assemblage, identification of microsatellite loci, primer design, in silico alignment of primers using primer-BLAST (available in https://ncbiinsights.ncbi. nlm.nih.gov/2017/06/28/e-pcr-is-retiring-use-primer-blast/), PCR optimization, and polymorphism analysis of 52 microsatellites were performed following the methodology described by Landínez-García & Márquez (2016). A set of 21 polymorphic microsatellite loci were selected and fluorescently labeled for genotyping of 88 randomly chosen samples. Then, a subset of 11 loci were selected for further evaluation of genetic diversity and structure in 725 samples because they satisfied the criteria of clearly defined peaks, reproducibility and consistency of amplifications, absence of stutter bands, specific bands, correct motif sizes, levels of heterozygosity, and high polymorphism information content (PIC) values, among others parameters required to validate new microsatellite primers (Neff, Garner & Pitcher, 2011;Fernandez-Silva et al., 2013;Schoebel et al., 2013).

Genotyping of samples
The PCRs were conducted in a volume of 10 µl, which contained 2-4 ng/µl of template DNA isolated with the GeneJET Genomic DNA Purification kit (Thermo Scientific, Karlsruhe, Germany) following the manufacturer´s instructions, 1 × buffer (Invitrogen, CA, USA), 0.2 mM dNTPs (Thermo Scientific, MA, USA), 0.05 U/ml Platinum TM Taq DNA Polymerase (Invitrogen, CA, USA), 2.5 mM MgCl 2 , 2% formamide (Sigma-Aldrich, Steinheim, Germany), 0.35 pmoles/ml labeled forward primer (either FAM6, VIC, NED, or PET, Applied Biosystems, CA, USA), and 0.5 pmoles/ml reverse primer (Macrogen, Seoul, Korea). The PCRs were performed on a T100 thermocycler (BioRad, CA, USA) with an initial denaturation step of 95 C for 3 min followed by 32 cycles consisting of a denaturation step of 90 C for 22 s and an annealing step for 18 s using the annealing temperatures described for each primer in Table 1. The extension step and a final elongation were absent in this thermal profile. Finally, the PCRs were submitted to electrophoresis on an automated sequencer ABI 3730 XL (Applied Biosystems, CA, USA) using GeneScan 500 LIZ dye size standard (Applied Biosystems, CA, USA) as the internal molecular size. Allelic fragments were denoted according to their molecular size and scored using GeneMapper v4.0 software (Applied Biosystems, CA,
To examine other groupings of the samples, genetic differentiation among samples was tested using the Bayesian analysis of population partitioning with Structure v2.3.4 software (Pritchard, Stephens & Donnelly, 2000). Parameters included 350,000 Monte Carlo Markov Chain steps and 50,000 iterations as burn-in, the admixture model, correlated frequencies, and the LOCPRIOR option for detecting relatively weak population structure (Hubisz et al., 2009). Each analysis was repeated 20 times for each simulated K value, which ranged from 1 to n + 3 (n, number of populations compared). For a best estimation of genetic stocks (K), the web-based software STRUCTURESELECTOR (Li & Liu, 2018) was used to calculate the ΔK ad hoc statistic (Evanno, Regnaut & Goudet, 2005), the estimators MEDMEANK, MAXMEANK, MEDMEDK and MAXMEDK (Puechmaille, 2016), and to generate the graphical representation of results using the integrated Clumpak software (Kopelman et al., 2015). Based on the coancestry coefficients provided by Structure and Clumpp, the individuals were reorganized by genetic stock in sample sites that showed multiple stocks and were later submitted to the genetic analyses described above. Additionally, the occurrence of recent genetic bottlenecks of populations was evaluated by calculating the levels of heterozygosity and the M ratio using Bottleneck v1.2.02 software v3.5.2.2 (Piry, Luikart & Cornuet, 1999) and Arlequin (Excoffier, Laval & Schneider, 2005), respectively. Excess heterozygosity was assessed by employing the Wilcoxon sign-rank test (Cornuet & Luikart, 1996;Luikart & Cornuet, 1998). The M ratio-the mean ratio of the number of alleles compared to the range of allele sizeindicates that the population has experienced a recent and severe reduction in population size when its value smaller than 0.680 is (Garza & Williamson, 2001).
To explore non-neutral evolutionary forces acting on the microsatellite loci, a scanning analysis was performed using the BayeScan v2.1 software (Foll & Gaggiotti, 2008) to detect candidate loci under selection. Parameters for BayeScan analyses included 10:1 prior odds for the neutral model and 20 pilot runs consisting of 5,000 iterations each followed by 250,000 iterations with a burn-in length of 50,000 iterations (Foll & Gaggiotti, 2008).

Phylogenetic relationships among genetic groups
To explore the phylogenetic relationships among individuals sampled along the basin, partial fragments of the mitochondrial cox1 gene (~650 bp) were amplified in a subset of samples using primers and PCR conditions previously described by Ward et al. (2005) and Ivanova et al. (2007). PCR products were sequenced by the Sanger method using an automated sequencer, ABI 3730 XL (Applied Biosystems, CA, USA). The best-fit evolutionary model was determined based on the Bayesian information criterion as implemented in the jModelTest v2.1.7 software (Posada & Crandall, 1998). A Bayesian phylogenetic analysis was conducted in MrBayes v3.2.6 software (Ronquist & Huelsenbeck, 2003) including GenBank sequences of Prochilodus magdalenae, Prochilodus reticulatus, Prochilodus mariae, Prochilodus nigricans and using Ichthyoelephas longirostris as outgroup. For this purpose, we performed two independent runs of 20 million generations sampled each 1,000 generations using 25% as burn-in. The remaining values were left as default. The convergence of each parameter was checked based on a potential scale reduction factor nearing 1, an average standard deviation of the split frequencies lower than 0.010, and the visualization of the resulting trees was performed with FigTree v1.4.3 software (Rambaut, 2012). Finally, the pair-wise divergences of P. magdalenae and P. reticulatus haplotype sequences were estimated using the Kimura 2-parameters model in MEGA v10.1.8 software (Kumar et al., 2018).
A total of 21 of the 52 microsatellite loci evaluated were polymorphic and showed Hardy-Weinberg disequilibrium (Table 1)

Genetic diversity, population demography and outlier loci screening
Comparisons among rivers revealed that 8 of 11 loci satisfied the Hardy-Weinberg equilibrium expectations in at least one case (Table 2). However, the analysis across loci showed significant departures from Hardy-Weinberg equilibrium expectations in all rivers evaluated ( Table 2). The average number of alleles per locus was higher in Cauca  (Table 2).
Furthermore, comparisons among sites within each river showed similar high levels of genetic diversity (Table 3). The highest value of genetic diversity was found in the floodplain lake Palagua in the Magdalena River (Na: 17.182 alleles/locus; H E : 0.895; H O : 0.792), whereas the lowest was observed in Beté, a site of the Atrato River (Na: 9.273 alleles/locus; H E : 0.791; H O : 0.711). In addition, all sites exhibited a highly significant deficit of observed heterozygosity (Table 3) with Mata de Palma and Samaná Norte River showing the lowest and highest observed heterozygosity deficits, respectively. Inbreeding coefficients (F IS ) per site in main rivers of the different Colombian hydrographic areas were significant and ranged from 0.120 to 0.255 (Table 3). Although decreased in magnitude, the inbreeding coefficients (Table 3) remained significant even after comparing the genetic diversity according to genetic stocks in Chucurí, Puerto Berrío, and Palagua and among the Magdalena River and tributaries.
Results of the genetic bottleneck tests (Table 4) were significant for all populations under the infinite alleles model (IAM) and for most populations under the two-phase model (TPM), whereas they were non-significant under the stepwise mutation model (SMM). As it is thought that few loci follow the strict SMM (Piry, Luikart & Cornuet, 1999), the best estimation of expected heterozygosity at mutation-drift equilibrium is expected under a combination of IAM and TPM. Additionally, all values of the M ratio were substantially smaller than 0.680, indicating that all populations have experienced recent and severe reductions in population size (Table 4).
In contrast to other samples that did not show evidence of selection, BayeScan analysis revealed that 8 of 11 loci (Pma39, Pma25, Pma02, Pma35, Pma40, Pma36, Pma13 and Pma14) exhibit substantial evidence of selection in the Magdalena River (Table 5).

Microsatellite loci development
This work developed species-specific microsatellite loci using next-generation sequencing and bioinformatic analysis. Although a total of 21 of 52 microsatellite loci with tri-and tetra-nucleotide motifs were polymorphic in P. magdalenae, the consistency in the amplification in a larger sample, allelic size class distribution, and high definition peaks allowed the selection of only 11 microsatellite loci for further population genetic analysis. Most of the loci showed departures from Hardy-Weinberg equilibrium and significant observed heterozygosity deficit in the random sample. The observed heterozygosity deficit may be related to technical problems such as silent alleles; however, it remains to explore the potential variations in the primer alignment sequences, since this study sequenced the genome of a single specimen of P. magdalenae. Two non-excluding explanations may be related to the significant levels of inbreeding and the genetic structure of the samples by the coexistence of two genetic stocks (see below).
Although the levels of genetic diversity measured by the expected heterozygosities were similar, the levels of observed heterozygosity as well as the average number of alleles per locus found in this study were substantially greater than those found in the same samples by Orozco-Berdugo & Narváez-Barandica (2014). However, despite these differences, both heterologous (Orozco-Berdugo & Narváez-Barandica, 2014) and species-specific microsatellite loci (this study) revealed a general deficit of heterozygotes in all samples. In this context, the species-specific microsatellite loci developed in this study  seem to provide a good approach to study the population genetics of P. magdalenae considering that the levels of heterozygosity constitute a parameter used to estimate the genetic diversity of the populations. In addition to the applications in harvest management, stocking programs, definition of conservation units, recovery of threatened species, and management of invasive species, these tools may be useful in forensic genetics since partially degraded DNA samples are often found in this area (see Bourret et al., 2020).

Genetic diversity and population demography
Microsatellite data revealed average values of genetic diversity (H E : 0.737) among the highest values found in other Prochilodontidae species, only surpassed by those reported for P. costatus (Melo et al., 2013) and P. argenteus (Coimbra et al., 2017) (2017)). Additionally, this study found levels of observed heterozygosity higher than those found by Orozco-Berdugo & Narváez-Barandica (2014). However, the use of species-specific microsatellite loci developed in this study revealed similar values of expected heterozygosity among samples analyzed by Orozco-Berdugo & Narváez-Barandica (2014) and the remaining samples analyzed, indicating that differences between the two studies are related to the type of microsatellite loci utilized (heterologous vs. species-specific microsatellite loci).
The significant deficit of observed heterozygosity in all studied samples corroborates the previous findings for P. magdalenae from Magdalena River (Orozco-Berdugo & Narváez-Barandica, 2014); however, the magnitude of the observed heterozygosity deficit as well as the inbreeding coefficient (0.075-0.239) were substantially lower than those previously reported (0.624-0.788). Following Franklin (1980) and Soulé (1980), the values above 10% of the inbreeding coefficient indicate that these populations require careful management to avoid future detrimental effects on its populations. This point is important since it has been recommended recently that any inbreeding coefficient higher than zero will usually have an adverse fitness effect (Frankham, Bradshaw & Brook, 2014). Another non-excluding alternative is plausible considering that the significant deficit of observed heterozygosity observed in all sites analyzed may be also explained by the coexistence of genetic stocks (Wahlund effect) as this was evidenced by the genetic structure analysis (see below). Another biological cause of observed heterozygosity deficit, assortative mating, does not seem to explain the results found in this study because P. magdalenae is iteroparous and characterized by total spawning (Jaramillo-Villa & Jiménez-Segura, 2008) as described in its congeners, P. costatus (Carolsfield et al., 2004) and P. lineatus (Roux et al., 2015). Even more, in this latter species, the genetic analysis based on microsatellite loci support polygamous mating in both sexes (Ribolli et al., 2020).
On the other hand, this study also provided evidence for a population bottleneck, suggesting that P. magdalenae shows signs of erosion of the genetic pool, likely by the constant pressure from fishing and other anthropogenic activities exerted on its populations. Although paradoxical to the observed heterozygosity deficit found in all populations evaluated, this outcome is plausible considering that the Bottleneck algorithm tests not for an excess of heterozygotes (H O > H E ) but rather for an excess of heterozygosity (He > He at mutation-drift equilibrium) (Piry, Luikart & Cornuet, 1999). Besides, the combination of a population bottleneck and an observed heterozygosity deficit may result from population growth in a closed system, population genetic structure, or admixture (Barson, Cable & Oosterhout, 2009). Considering the lengths of the rivers studied, population growth in a closed system is unlikely but the last two alternatives may explain our results due to the coexistence of genetic stocks in the samples studied and the continuous restocking of natural stocks using juveniles from fish hatcheries, which may create an apparent excess of novel alleles and an incomplete allele frequency distribution. Similar results have also been found in guppies, Poecilia reticulata, in Trinidad and Tobago (Barson, Cable & Oosterhout, 2009).

Genetic structure
This study tested the hypothesis that P. magdalenae exhibits genetic stocks that coexist and co-migrate along sections of the main channel and some tributaries of the Magdalena River (Cauca, San Jorge and Cesar), Sinú, and Atrato rivers. Before testing this hypothesis, we compared the genetic structure at regional scale, finding two spatially structured populations: one in the Magdalena River (Puerto Berrío and the floodplains Chucurí and Palagua) and the other in the remaining rivers evaluated.
The geographical genetic structure may result from taxonomic differences among stocks due to the lack of regulations on the restocking of natural stocks of P. magdalenae. The phylogenetic analysis using partial sequences of cox1 gene indicates that samples do not correspond to species such as P. mariae or P. nigricans because this genetic stock is clustered with previously published sequences of P. magdalenae (Aguirre-Pabón, Narváez-Barandica & Castro-García, 2013). However, it remains to be seen whether they represent artificial mixtures of P. magdalenae and P. reticulatus because the current mitochondrial phylogenetic analysis of Prochilodontidae does not allow the two species to be discriminated (Melo et al., 2016b(Melo et al., , 2018. Moreover, the morphological and molecular similitudes have led to the proposal that P. magdalenae and P. reticulatus represent only one species with probable allopatric differentiation resulting from the uplift of the Sierra del Perijá (Melo et al., 2016b). Thus, a separated clustering of mitochondrial sequences of those stocks is not expected in the phylogenetic analysis even though they represent allopatric populations.
An alternative explanation is that the genetic differences result from eight outlier loci that are putatively under selection in three sites of the Magdalena River, suggesting that P. magdalenae experiences natural/artificial selection or local adaptation, although testing of these hypotheses is out of the scope of the present study. The explanation that outlier loci represent false positives resulting from the inclusion of severely bottlenecked populations (Teshima, Coop & Przeworski, 2006;Foll & Gaggiotti, 2008) seems unlikely because the significant excess of heterozygosity and small values of the M ratio were found even in populations that do not exhibit outlier loci. Thus, considering that those sites have been exposed to restocking since 20 years ago and since most microsatellite loci are not transcriptionally active, the outlier loci found in this study may reflect hitchhiking selection resulting from restocking using juveniles selected artificially by fish hatcheries. Alternatively, the outlier loci may result from asymmetric gene flow by unidirectional migration from hatchery stocks to wild populations. Similar results were found in Denmark in populations of three brown trout, which have been significantly admixtured with stocked hatchery trout (Hansen, Meier & Mensberg, 2010).
Although the above reasoning might explain the genetic differences between stocks, an additional justification is required to explain the restricted distribution of one genetic stock in only three sites of the Magdalena River considering the migratory abilities of these species/allopatric populations. Thus, this genetic structure seems to result from recent restocking before reproductive/feeding migrations, use of artificial barriers to avoid migration of the fish, clogging by sedimentation or vegetation, or the desiccation of access to floodplain lakes or may be a product of the intensive anthropic intervention in these territories characterized by the exploitation of hydrocarbons and livestock. This idea is concordant with the fact that degradation of preferred habitat and barriers that impede dispersal contribute to the degree of genetic differentiation among populations (Faulks, Gilligan & Beheregaray, 2011).
Excluding the genetic stock of Puerto Berrío and the floodplains Chucurí and Palagua, each river showed the coexistence of at least two genetic stocks. Homogeneous and non-homogeneous distributions of these genetic stocks along the rivers explain similarities (Cauca, Magdalena, San Jorge, Cesar and Nare) as well as geographical differences among the rivers analyzed (within Magdalena, including Puerto Berrío and the floodplains Chucurí and Palagua, Sinú and Atrato). This genetic structure also explains the significant heterozygosity deficit observed in all sites analyzed (Wahlund effect) as discussed above. Similar evidence of the Wahlund effect has been documented in the congener P. costatus, which exhibited genetic differences resulting from temporal isolation (Braga-Silva & Galetti, 2016). Although sampling in this study was not designed to detect temporal genetic structuring, genetic similarities among samples collected in different years suggest that the Wahlund effect must be more spatial than temporal. It remains to be seen whether this behavior is natural or artificial, considering that the restocking activities have been widely implemented along different Colombian rivers.

CONCLUSIONS
This study provides evidence that P. magdalenae exhibits high genetic diversity, significant inbreeding levels per genetic stock, and signs of erosion of the genetic pool and conforms a mixture of genetic stocks heterogeneously distributed along the rivers studied. Additionally, this study developed a set of 11 microsatellite loci that allows the reliable detection of levels of genetic diversity, providing a tool for monitoring changes in the genetic diversity of the species, brood stocks and juveniles used for supportive breeding and for measuring the efficacy of current population restocking activities. Management and conservation strategies need to be implemented at the level of the basins Magdalena-Cauca, Sinú and Atrato concordantly with their genetic population structure.