Historical demography and genetic differentiation of the giant freshwater prawn Macrobrachium rosenbergii in Bangladesh based on mitochondrial and ddRAD sequence variation

Abstract Macrobrachium rosenbergii, the giant freshwater prawn, is an important source of high quality protein and occurs naturally in rivers as well as commercial farms in South and South‐East Asia, including Bangladesh. This study investigated the genetic variation and population structure of M. rosenbergii sampled from four rivers in Bangladesh (sample size ranged from 19 to 20), assessing sequence variation, both in the mitochondrial cytochrome oxidase subunit 1 (CO1) gene and in 106 single nucleotide polymorphisms (SNPs) sampled randomly from the genome with double digest RAD sequencing (ddRADseq). The mitochondrial variation presented a shallow genealogy with high haplotype diversity (h = 0.95), reflecting an expansion in population size for the last ~82 kyr. Based on the CO1 variation the current effective population size (N e) was 9.7 × 106 (CI: 1.33 × 106 – 35.84 × 106) individuals. A significant population differentiation was observed with the mitochondrial CO1 sequence variation and based on the ddRADseq variation, which could be traced to the divergence of the population in the Naf River in the South‐East border with Myanmar from the other populations. A differentiation in mtDNA haplotype frequencies was also observed between the Biskhali River and the Karnaphuli Rivers in eastern Bangladesh. This study demonstrated the use of high‐throughput genotyping based on the ddRADseq method to reveal population structure at a small geographical scale for an important freshwater prawn. The information from this study can be utilized for management and conservation of this species in Bangladesh.

the fertilized eggs in the brood chamber, where eggs hatch as freeswimming zoeae and after progressing through 12 larval stages in the brackish environment, the postlarvae (PL) enter into the freshwater system to grow until sexual maturity (FAO 2002(FAO , 2004. During the last two decades, M. rosenbergii aquaculture has attracted considerable attention in Bangladesh for its export potential (Ahmed, Demaine, & Muir, 2008;Wahab, Ahmad-Al-Nahid, Ahmed, Haque, & Karim, 2012). Aquaculture operations have expanded in the South and South-western districts of Bangladesh due to the availability of PL in the coastal areas (Azad, Lin, & Jensen, 2008). Moreover, a large number of freshwater ponds in Bangladesh have high potential to culture the freshwater prawn , which mostly depends on wild caught PL (Ahamed, Hossain, Fulanda, Ahmed, & Ohtomi, 2012;Ahmed, Occhipinti-Ambrogi, & Muir, 2013;Ahmed & Troell, 2010). Concerns for the effects of wild prawn PL overfishing on coastal ecosystem biodiversity and production of other species caught as bycatch, led the Department of Fisheries (DOF) in Bangladesh to impose a ban in the year 2000 on wild prawn PL harvest (Ahmed & Troell, 2010;Department of Fisheries Bangladesh 2002). To fulfill the increased demand of PL, 27 freshwater prawn hatcheries have been established since 1992, producing about 27,000,000 PL in 2014 (Department of Fisheries Bangladesh 2015). Ovigerous females used in the hatcheries are collected directly either from the rivers or from aquaculture .
The genetic diversity of M. rosenbergii in Bangladesh is under continuous threat due to human activities (i.e. overexploitation, natural postlarvae collection, escape from aquaculture, and use of banned gears) and climate change effects (i.e. sea level rise, saline water intrusion) (Department of Fisheries Bangladesh 2013; Quader, 2010). Assessments of population structure and genetic diversity are essential to inform management of harvested populations as ignorance of structure can lead to overexploitation, and escapees from aquaculture can be a risk for locally adapted populations (Koljonen, 2001;Laikre, Palm, & Ryman, 2005;Olsson et al., 2007;Palsbøll, Berube, & Allendorf, 2007;Ward, 2000) as escaping from aquaculture occurred during cyclones and its resulted floods (Department of Fisheries Bangladesh 2008). Genetic diversity facilitates further evolution given environmental change and may in addition play a key ecosystem function (Reusch & Hughes, 2006).

Information of genetic variation in M. rosenbergii in Bangladesh
has until recently been mostly unexplored, but recent studies on Penaid shrimps in Bangladesh have uncovered high levels of genetic diversity (Alam, de Croos, & Pálsson, 2016;Alam, Westfall, & Pálsson, 2015Hurwood et al., 2014).
Several genetic studies have been applied to M. rosenbergii including allozymes (reviewed in Agarwal et al., 2016), mtDNA (De Bruyn et al., 2005;Hurwood et al., 2014), microsatellites Khan et al., 2014), and more recently identification of single nucleotide polymorphisms (SNPs) (Agarwal et al., 2016;Jung et al., 2014). The analysis of mtDNA variation revealed three distinct lineages within the species, that is, a western lineage West of the biogeographic barrier at the Isthmus of Kra, a central lineage mainly from the Sunda-Shelf region and an eastern lineage mainly in Indonesia (De Bruyn et al., 2005;Hurwood et al., 2014), but microsatellite variation showed four distinct clusters where Bangladesh samples clustered together with the samples from South-western Thailand . A study on genetic patterns within M. rosenbergii sampled from South-western Bangladesh (the Pashur and the Paira Rivers) and South-eastern Bangladesh (the Naf River), based on seven microsatellites, did not reveal significant differences among the sites, but the pairwise distances corresponded with the geographical distances (Khan et al., 2014) (Benestan et al., 2015) and three spine stickleback Gasterosteus aculeatus (Baird et al., 2008;Hohenlohe et al., 2010;Jones et al., 2012).
The aim of this study was to investigate the genetic variation and population structure of M. rosenbergii within Bangladesh by assessing genomic variation in samples from four of its main rivers: the Bishkhali in the West, the Meghna, and the Karnaphuli in the East and the Naf River at the boundary with Myanmar in South-East by applying more genetic markers than in a previous study by Khan et al. (2014) and including other rivers. This was carried out by analyzing geographical and historical patterns in mitochondrial DNA sequences and from double digest restriction-site associated DNA sequence (ddRADseq; Peterson, Weber, Kay, Fisher, & Hoekstra, 2012) variation.  (Figure 1). All samples were preserved in 96% ethanol. Total genomic DNA was extracted for mtDNA sequencing from ~1 mg pleopod tissue through overnight incubation at 56°C in a mixture of 6% Chelex and 0.2 mg/ ml proteinase K, using a Thermomixer (Eppendorf Thermomixer Compact), and for ddRADseq from ~20 mg pleopod tissue using standard Phenol-Chloroform extraction (Maniatis, Fritsch, & Sambrook, 1982). The quality and concentrations of DNA were examined with a ND-1000 spectrophotometer using ND-software (Thermo Fisher Scientific). For ddRAD sequencing 500 ng gDNA was allowed to run on 2% agarose gel to ensure quality. All gDNAs (1,000 ng) were treated with RNase to get rid of RNA before library preparation. includes the barcoding region (CO1b) and a downstream region (CO1d). The CO1b fragment was amplified using standard barcode primers LCO-1490 and HCO-2198 (Folmer, Black, Hoeh, Lutz, & Vrijenhoek, 1994), and the CO1d fragment with COIF (Palumbi & Benzie, 1991) and TL2N (Quan et al., 2001). PCR was performed in a volume of 10 μl, including 30-150 ng DNA, 0.2 mmol/L dNTP, 0.1% Tween 20, 1× Standard Taq Buffer (New England Biolabs), 0.5 mg Bovine Serum Albumin, 0.5 U Taq DNA Polymerase, and 0.34 mmol/L each of forward and reverse primers. The amplification protocol of CO1b fragment included an initial denaturation at 94°C for 4 min, and 37 cycles of denaturation at 94°C for 30 s, annealing at 45°C for 45 s, and extension at 72°C for 1 min, then a final extension at 72°C for 6 min. The PCR protocol for the CO1d region contained an initial denaturation at 94°C for 5 min, 38 cycles of denaturation at 94°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 30 s, then a final extension at 72°C for 7 min.

| Polymerase chain reaction (PCR) and sequencing
All PCR products were examined on a 1.5% agarose gel with 1 μl Bromophenyl Blue and visualized under UV light, after staining with Ethidium Bromide. An ExoSAP reaction was performed to remove excess nucleotides and primers from PCR products (5 μl) in a 10 μl reaction volume. The DNA template (1 μl) was sequenced with the Big Dye Terminator kit 3.1 (AB), precipitated with ethanol and run on a Genetic Analyser (3500xL Applied Biosystems). The sequences were edited using BioEdit Sequence Alignment Editor (Hall, 1999) and aligned by applying the ClustalW Multiple alignment.

| Genetic diversity and population differentiation
Genetic diversity of the combined CO1 fragments of M. rosenbergii, including haplotype diversity (h), nucleotide diversity (π) and the partition among sample sites with analysis of molecular variance (AMOVA) applying both the distance method (Φ) and the conventional F-statistics from haplotype frequencies were calculated using ARLEQUIN v3.5 (Excoffier & Lischer, 2011). Significance level of the genetic partition was tested by 1,000 permutations of individuals among samples. Haplotype richness (H R ) was calculated using the allele richness function in HIERFSTAT package in R (Goudet, 2005).

| Demographic history and population expansion
Population demographic changes and deviation from neutrality in M. rosenbergii were estimated by analyzing the mismatch distribution, using sum of square deviation (SSD) (Excoffier, 2004) and the raggedness index (Harpending, 1994), and with Tajima's D (Tajima,1993) and Fu's Fs (Fu, 1997) using ARLEQUIN v3.5 (Excoffier & Lischer, 2011). The time since expansion was based on the median of the mismatch distribution (τ) and the mutation rate, μ = 0.7 to 1.3% per site per Myr, for the CO1 (Knowlton & Weigt, 1998;Knowlton, Weigt, Solorzano, Mills, & Bermingham, 1993;Schubart, Diesel, & Hedges, 1998), as t = τ/(2 μl), where L is the length of the sequence. The demographic changes were further analyzed with the Bayesian Skyline Plot (BSP), based on the 83 CO1 sequences to estimate the past population dynamics from the time of sampling, assuming no population structure. The BSP analysis was implemented in BEAST v1.7.5 (Drummond, Ho, Rawlence, & Rambaut, 2007), following a strict molecular clock and the TN93 model with Invariants sites (I), derived from a PhyML Test (Guindon et al., 2010) using the APE package (Paradis, 2012) in R (R Core Team 2015). Posterior probability of the effective population size (N e ) was estimated with the BSP analysis, using MCMC procedures by moving backward until the time of the most recent common ancestor was reached (Liao et al., 2010). Markov chains were run for 5.0 × 10 7 generations and sampled every 1000. Log files were visualized for the posterior probabilities of the Markov Chain statistics using TRACER v1.5 (Rambaut & Drummond, 2009), and 10% of the samples were discarded as burn-in during "skyline" reconstruction. Skyline data were exported from TRACER v1.5, and the skyline plot was redrawn using the package APE (Paradis, 2012)
Min depth of coverage to create a stack was 7, maximum distance allowed between stacks was 2 and to align secondary reads 4. Max number of stacks allowed per de novo locus was 3.
A  (Foll, 2012) format. Given enough statistical power BayeScan enables the identification of the effect of natural selection on the population subdivision at different loci, either due to diversifying or balancing/directional selection, which is characterized by positive and negative alphas, summarized using the package BOA (Smith, 2015) in R (R Core Team 2015).
Ordination of the SNP genotypes was investigated using discriminant analysis of principal components (DAPC), and followed by the assignment of individuals to different clusters defined with and without prior information, as implemented in the ADEGENET package (Jombart et al., 2015) in R.

| Mitochondrial DNA diversity
The mitochondrial variation was characterized by a high overall haplotype diversity close to its maximum value (h = 0.90-0.99), and nucleotide diversities ranged from 0.0022 to 0.0031 for the different samples ( Table 1)

| Genetic diversity
In total 141 stacks with 106 SNPs were obtained from the ddRADseq analysis of 73 individuals with an average depth of 18.6 (SD = 20.7) using the 7 depth base calling. The deviations of expected heterozygosity from observed heterozygosity (F IS ) for the SNP dataset were overall negative, ranging from −0.22 to −0.66 within populations ( Each pie represents a haplotype and its size reflects the frequency. Distances between pies correspond to number of mutational differences between haplotypes. Shadings (black to white) denote four sampling locations, BR, MR, KR, and NR, respectively, see Figure 1) 0 ( Table 2). The expected heterozygosity for the single SNP dataset was similar among the sampling locations. (Table 2).

| Detection of selection and demographic changes
Despite the deviation from Hardy-Weinberg equilibrium in some loci, variation at all SNP markers was in compliance with the neutral expectation obtained from BayeScan. The distribution of alpha values ranged from −0.06 to 0.51 with a mean of 0.00 and was all nonsignificant as the corrected p values (q) due to multiple testing using the FDR method (Foll, 2012) ranged from 0.66 to 0.90.

| Clustering of individuals and population differentiation
The DAPC analysis with prior information of the sampling sites, based on the SNP dataset, reveal distinct differentiation among populations, with the highest overlap between KR and MR ( Figure 5a; see  Table 3). The pairwise comparisons between populations among BR, KR, MR, and NR support the result observed from the DAPC analyzes that the NR differed most from the others (Figure 7).
All comparisons with NR were significant (p < .02), whereas the others were not (Table 3)

| DISCUSSION
Information about genetic population structure and connectivity of natural populations is important for sustainable harvest of populations and the management of diversity (Olsson et al., 2007). Natural populations can be affected by human activities such as aquaculture, exploitation for consumption, and environmental changes. Application of next-generation sequencing (e.g. RAD sequencing) has proven to be successful to detect population patterns, for example, in American lobster Homarus americanus (Benestan et al., 2015) and three spine stickleback Gasterosteus aculeatus (Baird et al., 2008;Hohenlohe et al., 2010;Jones et al., 2012), but application of such intensive methods has been limited in the developing nations, particularly in South and South-East Asia where both biodiversity and threats to biodiversity    were not significant, they were almost similar to the values observed in our study (0.011-0.035). The detection of significant differentiation between the populations in the present study might have been supported by the larger number of the nuclear markers. The result from the DAPC ordination method supported the differentiation of the Naf River population but indicated that there might also be barriers to admixture between the rivers. When considering the combination of base pairs at the different sites within the genome, the most individuals sampled in one river were more similar to the individuals from the same river than to the individuals in other river.
To conclude, SNP variation revealed at least two distinct popu- has high genetic variation, prawn hatcheries could be more sensible when they use ovigerous females from the same area in order to reduce threats to the local population diversity due to accidental escape from aquaculture. Further information about the biology of the species, such as variation in time of reproduction, behavior and habitat, is warranted to evaluate whether the observed population structure can be explained by different sources rather than the geographical origins.

ACKNOWLEDGMENTS
The study was funded by the United Nations University-Fisheries Training Programme (UNU-FTP), Iceland. The first author is being supported under a Doctoral scholarship by the UNU-FTP, Iceland.