Development and characterization of novel microsatellite markers by Next Generation Sequencing for the blue and red shrimp Aristeus antennatus

The blue and red shrimp, Aristeus antennatus, is a commercially important crustacean, in the Mediterranean Sea, which has been listed as a priority species for fishery management. Hypervariable microsatellite markers could be a useful tool to identify genetic stocks among geographically close fishing grounds. Potential microsatellite markers (97) identified from next-generation sequencing of an individual shrimp using a 454 GS Junior Pyrosequencer were tested on a preliminary panel of 15 individuals representing the four worldwide genetic stocks of the species from which 35 polymorphic loci were identified and used to characterize an additional 20 individuals from the Western Mediterranean Sea. In the Western Mediterranean sample, 32 out of 35 were polymorphic loci and the number of alleles per locus ranged from 2 to 14 and expected heterozygosity ranged from 0.050 to 0.968. No linkage disequilibrium was detected, indicating the independence of the loci. These novel microsatellites provide additional tools to address questions relating to genetic diversity, parentage studies and connectivity patterns of A. antennatus populations and help develop effective strategies to ensure long-term sustainability of this resource.


INTRODUCTION
The demersal blue and red shrimp Aristeus antennatus (Risso, 1816) (Crustacea, Decapoda) is distributed in the Mediterranean Sea, eastern Atlantic waters adjacent to Gibraltar Strait and the Indian Ocean from Mozambique to Zanzibar (Fernández et al., 2013 and references therein) and has been recorded recently from the Brazilian coast (Serejo et al., 2007). The species inhabits the muddy bottoms of the continental slope along submarine canyons, ranging from 80 to 3,300 m depth, making this the most eurybathic species in the Mediterranean Sea, and has a complex demographic structure in the water column (Sardà et al., 2010 and references therein). Populations in shallower waters (<1,000 m depth) which are exploited by commercial fishing are mainly composed of large and mature females with a low but significant proportion of males that return to deeper grounds after mating takes place, principally in the summer. Populations in deeper waters (>1,000 m depth), which are not fished, have lower abundances and biomasses and are mainly composed of males and juveniles (Sardà et al., 2010).
Exploited since the 1950s, the global catch has increased annually, reaching 1,851 tonnes in 2013 (FAO, 2015). Currently the focus of an important trawl fishery because of its high value in Mediterranean fish markets (60-120 $/kg), neither international conservation strategies nor worldwide sustainable management policies have been developed for this species despite priority listing for action by the Scientific Advisory Committee (SAC) of the General Fisheries Commission for the Mediterranean Sea (FAO, 2006). An annual 60 days local closure in the fishing area of Palamós (Spanish Western Mediterranean) has been recently implemented, aimed at achieving the sustainable exploitation in this fishing ground (BOE, 2013).
Although studies have addressed A. antennatus biology and ecology, mainly of the Mediterranean populations (Fernández, 2012 and references therein), a major problem in achieving sustainable fisheries management is the identification of the biological units supporting the fishery, or the so-called genetic stocks (FAO, 1993). Mitochondrial DNA (mtDNA) diversity analyses of blue and red shrimp populations on a broad oceanographic scale have detected four genetic stocks: the Western Mediterranean Sea (WM), the Eastern Mediterranean Sea (EM), the Atlantic Ocean (AO) and the Indian Ocean (IO), but have failed to distinguish genetic divergence within the regional level (Maggio et al., 2009;Roldán et al., 2009;Fernández et al., 2011;Marra et al., 2015). Surveys using a small number of microsatellites isolated using the FIASCO protocol (Zane, Bargelloni & Patarnello, 2002) failed to distinguish any type of population structure within Italian fishing grounds (Cannas et al., 2008;Cannas et al., 2012).
Nevertheless, microsatellites have proved their utility in identifying genetic stocks in marine penaeoid shrimps around the world (Benzie, 2000;Brooker et al., 2000;Supungul, Sootanan & Klinbunga, 2000;Ball & Chapman, 2003;Maggioni, Rogers & Maclean, 2003;Waqairatu et al., 2012;Abdul-Aziz et al., 2015) and it is possible that a greater number of these might provide higher resolution in other Mediterranean and Mozambique populations. Next-generation sequencing (NGS) based on 454 pyrosequencing has enabled a high-throughput approach and has demonstrated its usefulness in the development of SSRs in non-model organisms (Schoebel et al., 2013;Sousa-Santos, Fonseca & Amorim, 2015). Long reads generated by 454 sequencing are expected to contain SSRs and their flanking regions that are needed to develop methods for SSR screening among populations.
The aim of our study was the development and characterization of novel polymorphic microsatellite loci for A. antennatus to provide additional markers necessary to analyze the structure and connectivity of blue and red shrimp populations at smaller geographical scales both geographically and bathymetrically, but particularly within the Western Mediterranean, and to provide insights into this species' reproductive behaviour through enabling parentage analysis.

DNA extraction for next-generation sequencing
Genomic DNA isolation was performed from 10 mg of muscle tissue stored in 95% ethanol of an individual (Blanes 2010; specimen-voucher: LIGUDG:Aa:872, stored at our laboratory) by means of two series of Sambrook, Fritsch & Maniatis (1989) phenolchloroform extraction protocol, with minor modifications. The sample was treated between the series with 1 µl RNase A (20 mg/ml; Invitrogen TM , Thermo Fisher Scientific) and incubated at 37 • C for 30 min. The high quality of extracted DNA was checked by resolution on a 0.8% agarose gel with ethidium bromide (0.5 mg/ml) and was quantified using both a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific) (136 ng/µl) and Picogreen reagent R (Thermo Fisher Scientific) (60 ng/µl) with a Paradigm Detection Platform (Beckman Coulter) (CEGEN-UPF Services).

Next-generation sequencing and De Novo assembly
Two shotgun whole-genome sequencing was performed on a Roche 454 GS Junior Pyrosequencer by Pompeu Fabra University's Genomics Service (Barcelona, Spain). The reads obtained were assembled into contigs using the GS De Novo Assembler program included in the Newbler software package v.2.7. (Roche 454 Life Sciences 2006-2012; Roche, Bramfort, CT, USA) using the default settings with two modifications to take into account that A. antennatus is a diploid (heterozygoteMode: true), eukaryotic organism (largeGenome: true).

Microsatellite loci identification
To avoid redundancy with previous reported microsatellites, a BLASTn search employing our entire database and using a significance threshold of e-values <10 −10 was performed against all published A. antennatus microsatellites (Cannas et al., 2008). All contigs matching with already available A. antennatus microsatellites were not included in the further microsatellite development. Remaining contigs were scanned with Tandem Repeats Finder v. 4.04 (Benson, 1999) to detect microsatellite motifs. Alignment parameters of match, mismatch and indels were set as 2, 7 and 7, respectively; minimum alignment score to report repeats was set to 30; maximum period size to 5 and matching and indel probabilities were set as 0.8 and 0.1, respectively. In addition, the option of flanking sequence was considered to record up to 500 nucleotides on each side of the repeat in order to later design specific PCR primers.
Contigs containing tandem repeats with at least 30 bp of flanking regions were used for primer design with Primer3 v. 0.4 (Untergrasser et al., 2012) using the following conditions: contiguous repeat areas that were less than 100 bp apart were considered the same locus (interrupted microsatellite), PCR product size ranged from 100 to 450 bp and PCR primers design followed the default general primer picking settings consisting of an optimal primer length of 20 bp (ranging between 18-27 bp), an optimal annealing temperature of 60 • C (ranging between 57-63 • C) and a GC content optimized between 20 and 80%.

Verification of putative microsatellites
Putative microsatellites were preliminarily screened for amplification success and polymorphism by genotyping a panel of 15 A. antennatus individuals representative of four geographic areas showing significant genetic divergences with mitochondrial markers: WM (n = 5), EM (n = 3), AO (n = 4) and IO (n = 3) (Fernández et al., 2011). For each individual, the total DNA extraction was conducted following the standard phenolchloroform procedure (Sambrook, Fritsch & Maniatis, 1989). The approach of Schuelke (2000), which included three primers in a nested PCR, was conducted for SSR genotyping. For each locus, we used a non-labelled specific forward primer with a universal 19 bp 5 -M13 tail (5 -CACGACGTTGTAAAACGAC-3 ), a non-labelled specific reverse primer and the universal fluorescently 6-FAM-labelled M13 as the third primer. This reduced genotyping costs because only a common fluorescently labelled primer was required for all putative microsatellite verifications.
Subsequently, 2 µl of amplified product was added to 10 µl of Hi-Di TM formamide with 0.1 µl of GeneScan TM 500 LIZ size standard and were loaded on to an ABI PRISM 3130 Genetic Analyzer (Applied Biosystems). Finally, allele scoring, after manually checking, was performed using Geneious v7.1.9 (Kearse et al., 2012).

Data analysis
In order to assess the usefulness for population genetics studies, all loci yielding reliable polymorphism were further evaluated by genotyping 20 individuals collected from the same WM locality (Blanes) used in the preliminary screening. For each locus, observed heterozygosity (H O ), expected heterozygosity (H E ) and inbreeding coefficient (F IS ) were estimated with Genepop v4.2.2 (Rousset, 2008) using allele identity. For loci where at least 70% of individuals were genotyped, departures from Hardy-Weinberg expectations (HWE) and linkage disequilibrium for each pair of loci were also tested using Genepop v4.2.2 (Rousset, 2008). Significance level was adjusted using Bonferroni correction for multiple testing. Micro-Checker (Van Oosterhout et al., 2004) was used to check for the presence of null alleles, stuttering or allele dropout in loci that presented significant departure from HWE. In addition, the combined non-exclusion probability test was conducted with Cervus v3.0.3 (Kalinowski, Taper & Marshall, 2007) to check for efficiency of these loci for inferring parentage analysis.

Next-generation sequencing and De Novo assembly
During the last decade, 454 pyrosequencing technology has been used widely to develop microsatellites in non-model organisms (Leese et al., 2012;Schoebel et al., 2013), as it is faster and more cost-effective than the previous methodologies (e.g., FIASCO protocol). The shotgun whole-genome sequencing with a Roche 454 pyrosequencer on the A. antennatus genome produced 165,507 reads with an average length of 343 bp (56,772,861 bp in total). Comparatively, the number of reads and total bp sequenced conforms to the values obtained by Leese et al. (2012) for other crustaceans, including decapods, using the same type of approach (49,802-186,890 and 12,098,817-55,152,741, respectively). Our average read length was clearly above the range of previous crustacean studies (211.5-265.5 bp). Up to 10,613 of the obtained reads were assembled into 2,029 contigs covering 895,246 bp of the A. antennatus genome and the number of singletons was 128,835. These contigs had an average length of 873 bp (range: 100-12,986 bp) and N50 contig size was 858 bp.

Microsatellite loci identification and development
No homology with already available A. antennatus microsatellites was found in the BLASTn search, except for our contig00595 that matched the microsatellite Ant93 described in Cannas et al. (2008) (GenBank accession number EU417952).
One hundred and ten of the putative microsatellite markers presented a minimum of 30 flanking bp, likely useful for primer design and Primer 3 software suggested specific primer pairs for 97 of these markers. These results were in the range observed in previous searches undertaken on crustaceans including decapods (25-1,420 putative markers) when stringent filters were used on massive 454 sequencing datasets (Leese et al., 2012). From the preliminary evaluation, 93 out of 97 microsatellites have amplified PCR products. After discarding primer pairs producing unexpected size fragments for microsatellite loci or which were monomorphic, 35 markers yielded reliable microsatellite variation with a total of 144 alleles ranging from 2 to 11 alleles per locus with an average of 4.1 (Table 1). Sequences of the contigs including these polymorphic markers were submitted to GenBank (Accession Numbers: KU195267-KU195299) ( Table 1 and Table S1). The proportion of success (36%) in detecting polymorphic loci for putative microsatellite markers was consistent to that achieved in decapods (15%-68%) (Meehan et al., 2003;Dao, Todd & Jerry, 2013;Dambach et al., 2013;Miller et al., 2013), in other marine species (e.g., 29% in a mollusk (Pardo et al., 2011) and also in fishes (∼40%; Sousa-Santos, Fonseca & Amorim, 2015;Williams et al., 2015)).

Population genetic analysis
Of the 35 initial loci, 32 were polymorphic in the Western Mediterranean sample analysed with a mean number of alleles per locus of 4.8 (2-14) and a mean observed and expected heterozygosity of 0.350 (0.000-1.000) and 0.521 (0.050-0.968), respectively ( Table 2). The number of alleles detected in our study resembled that previously shown in the Italian study (3-13;Cannas et al., 2008), but our estimates of observed and expected heterozygosity were broader than 0.2-0.85 and 0. 23-0.89 in Cannas et al. (2008). In decapod shrimps, levels of polymorphism are usually high, with abundant alleles and often heterozygosity levels close to 1.0 (Scarbrough, Cowles & Carter, 2002). Among the 32 polymorphic loci, no linkage disequilibrium was detected after Bonferroni correction (P = 0.0001), suggesting their independent segregation. In 28 out of 32 polymorphic loci, proper amplification was obtained for more than 70% of the analysed specimens (N = 20). In 21 out of 28 the polymorphic loci, genotype proportions were consistent with HWE expectations after Bonferroni correction (P = 0.0018), with a mean number of alleles per locus of 4.4 and a mean expected heterozygosity of 0.455, which are in line with those obtained by Cannas et al. (2008) in 9 loci (5.3 and 0.637, respectively). Micro-Checker revealed null alleles (non-amplified alleles by PCR due mainly to mutations in flanking regions) as likely involved in significant heterozygote deficiency (positive F IS values, see Table 2) observed in six loci. In addition, stuttering in two loci (Aa818 and Aa1444) contributed to genotype scoring errors. In most studies in other decapods, significant departures from HWE resulted in positive F IS values (Supungul, Sootanan & Klinbunga, 2000;Ball & Chapman, 2003;Pan et al., 2004;Cannas et al., 2008;Dambach et al., 2013;Miller et al., 2013). Despite the majority of these authors arguing that the most probable reason for HWE departures was the presence of null alleles; this suggestion was only tested and confirmed by one study (Miller et al., 2013). In regard to Aa315locus in this study, all the analyzed individuals showed the same heterozygote genotype, and Micro-Checker failed to detect any genotyping error. It may be that, this is indicative of two duplicate regions; with no variation within regions but different fixed allele when compared (Moen et al., 2008).
The application of microsatellite markers also has proven its utility to study mating systems and paternity in decapod shrimps (Bilodeau, Felder & Neigel, 2005). In our study, using the 21 loci under HWE to test their potential in parentage analysis, the combined exclusion probability in parentage assignment was 0.9786 for a candidate parent when parents were unknown and the combined exclusion probability for the identity of two unrelated individuals was 0.9999. Thus, these results indicate the effectiveness of these loci for future parentage studies in natural populations of A. antennatus.

CONCLUSIONS
Massive sequencing by a 454 pyrosequencing platform of the A. antennatus genome has allowed the development, validation and characterization of 35 new polymorphic microsatellite loci which greatly increases the number of available microsatellites from the eight microsatellites previously described (Cannas et al., 2008). This larger set of loci will provide a stronger set of tools to: (i) perform parentage studies, and (ii) examine connectivity patterns (horizontal and vertical) including examining the population structure of this species at a variety of geographical scales and, particularly, between fished populations in shallow waters and deeper unfished populations. This contribution will also assist to the responsible and sustainable management of this exploited marine resource, by means of a feasible strategy of temporal genetic monitoring.