Population connectivity of fan-shaped sponge holobionts in the deep Cantabrian Sea

Connectivity is a fundamental process driving the persistence of marine populations and their adaptation potential in response to environmental change. In this study, we analysed the population genetics of two morphologically highly similar deep-sea sponge clades (Phakellia hirondellei and the ‘Topsentia-and-Petromica’ clade, (hereafter referred to as ‘TaP clade’)) at three locations in the Cantabrian Sea and simultaneously assessed the corresponding host microbiome by 16S rRNA gene sequencing. A virtual particle tracking approach (Lagrangian modelling) was applied to assess oceanographic connectivity in the study area. We observed overall genetic uniformity for both sponge clades. Notably, subtle genetic differences were observed for sponges of the TaP clade and also their microbiomes between a canyon and bank location, < 100 km apart and with the same depth range. The Lagrangian model output suggests a strong retention of larvae in the study area with variable inter-annual connectivity via currents between the three sampling regions. We conclude that geologic features (canyons) and the prevailing ocean currents may dictate sponge holobiont connectivity and that differentiation can emerge even on small spatial scales.


Introduction
The ocean is the largest interconnected habitat on planet Earth (Ramvirez-Llodra et al., 2011). At the same time, it is a highly dynamic system showing natural fluctuations, but also discrete responses to human impacts. Connectivity studies are important for our understanding regarding the resilience of ecosystems to changing oceanic conditions, as well as for an evaluation of population vulnerability to environmental change and anthropogenic stressors (Fox et al., 2016;Kenchington et al., 2019). Understanding population-and ecosystem-connectivity in the ocean is crucial for the design of appropriate conservation measures, in particular for the design of marine protected area (MPA) networks (White et al., 2014;Gallego et al., 2017;Andrello et al., 2017;Kenchington et al., 2019). Many marine benthic organisms have pelagic larvae that ensure the maintenance of genetic diversity and allow for a colonization of new habitats (e.g. review by Levin, 2006;Cowen and Sponaugle, 2009). Larval dispersal is difficult to observe directly in the field and in particular in the deep-sea. Therefore, indirect methods such as molecular markers and virtual particle tracking are commonly applied to analyse genetic connectivity between organisms (e.g. review by Cowen and Sponaugle, 2009;Baltazar-Soares et al., 2014;Breusing et al., 2016). To conduct virtual particle tracking, biophysical models based on a Lagrangian approach (i.e. individual particle tracking in space and time; Cowen and Sponaugle, 2009)

are powerful
Abbreviations: TaP clade, Topsentia-and-Petromica clade. tools. As the term 'biophysical' implies, in these approaches biological parameters (e.g. time point of larval release and pelagic larval duration) are integrated into a physical framework (i.e. ocean current models) to acquire large ensembles of passive drift trajectories, which represent larval dispersal (Breusing et al., 2016). Information about the biology of the studied organism, including its reproductive cycle, is crucial to fine-tune respective models, but this kind of information is generally not available for the deep ocean. Along these lines, the number of population genetic studies addressing connectivity is still very small (Taylor and Roterman, 2017). Previous studies suggest that the offspring of sessile benthic invertebrates is typically transported 10-100 km distance away from the parents (Palumbi, 2004). However, identification and exploration of biophysical barriers, which may inhibit gene flow, is fundamental to understanding population connectivity. While depth is commonly known to be a main factor related to species and population structure (e.g. Van Soest and de Voogd, 2015;Taylor and Roterman, 2017;Indraningrat et al., 2019), the barriers separating regions within a similar depth range are less understood.
The geology of the seafloor is relevant for the formation and maintenance of biophysical barriers on different spatial scales. On large spatial scales, massive features such as shelf breaks, trenches and ridges direct major ocean currents. On regional scales, the seafloor may lead to up-or downwelling, or circulatory water flows. Seamounts and banks are prominent elevated geologic features on the regional scale. These settings are often habitats for diverse macroorganisms (Morato et al., 2010) and microorganisms (Busch et al., 2020a), which frequently are endemic to that location (De Forges et al., 2000). Submarine canyons are prominent plunging geologic structures which represent crucial connections between productive shelf waters and the deep open ocean (Martín et al., 2006) but may also potentially lead to retention of organic matter (Hickey et al., 1986).
Around the northern shores of Spain, the continental shelf is comparably narrow   (Fig. 1) and streaked with numerous submarine canyons. The Avilés Canyon System, which is located in front of the Asturian coast, is one of the largest submarine canyons in Europe (Rumín-Caparrós et al., 2016). Overall, the seafloor on the Cantrabrian shelf has a high structural complexity, with elevated topography (e.g. banks) occurring in close proximity to plunging geologic features (e.g. canyons). Along these lines, the Le Danois Bank (Spanish name: El Cachucho) is found within a few kilometers distance next to the Avilés Canyon System (Fig. 1). Le Danois Bank is a designated Marine Protected Area (MPA) and the Avilés Canyon System is a designated Special Area of Conservation (SAC), as both areas harbor a great diversity of benthic organisms (Sánchez et al., 2008. Sponges are prominent and abundant benthic organisms in the deep North Atlantic Ocean (Klitgaard et al., 1997;Klitgaard and Tendal, 2004;Murillo et al., 2012). Dense aggregations of sponges in the deep-sea are commonly referred to as 'sponge grounds'. Sponge grounds are often characterized as Vulnerable Marine Ecosystems (FAO 2009;FAO 2016) as they can be affected by anthropogenic activities such as bottom trawling (Kazanidis et al., 2019;Busch et al., 2020b) and oil and gas exploitation (Vad et al., 2018). These threats stand opposed to the high ecological value of deep-sea sponge grounds. The ecological value of deep-sea sponge grounds includes enhancement of local biodiversity (Beazley et al., 2013), formation of biomass hot-spots, and representation of a crucial role in biogeochemical cycling (Cathalot et al., 2015).
Sponges live in close association with dense and diverse symbiotic microbial consortia (Easson and Thacker, 2014;Hentschel et al., 2012;Thomas et al., 2016). The microbiomes of sponges are represented by diverse bacterial and archaeal clades with >63 prokaryotic phyla having been identified so far (Thomas et al., 2016;Moitinho-Silva et al., 2017a). Microbial symbionts were shown to contribute to the metabolism of the sponge, i.e., via carbon and nitrogen cycling as well as vitamin production and defense (reviewed in Pita et al., 2018). Sponges and their associated microbial communities (hereafter termed 'holobionts') are considered fine-tuned entities where both partners are tailored to function together. At the population-level, previous studies found sponge host population genetics and location to be related with the structure of the associated microbiomes at variable degrees, depending on the level of gene flow between the hosts (Griffiths et al., 2019;Díez-Vives et al., 2020;Easson et al., 2020).
The present study aimed to explore the host genetic and microbial connectivity of sponge holobionts on a small geographic gradient <100 km. We focused on fan-shaped sponges that are commonly found at the Avilés Canyon System and the Le Danois Bank of the Cantabrian Sea (Fig. 1). These sponges of the 'Topsentia-and-Petromica' clade (hereafter referred to as 'TaP clade') and Phakellia hirondellei are morphologically very similar even to the expert eye, but belong to different sponge orders. We sought to address whether genetic diversification can be resolved in deep-sea sponge holobionts sampled less than 100 km apart and what effect the geological setting (canyon versus bank) might have on population structure. In addition, oceanographic connectivity in the Cantabrian Sea was assessed by a virtual larvae tracking approach. Given that the continental shelf is comparably narrow in the study area, we were particularly interested to evaluate whether virtual sponge larvae would travel along the bathymetry. The presented findings contribute -in the long run -to a better management and preservation of vulnerable marine ecosystems.

Field work
42 fan-shaped sponge specimens were collected with a rock dredge onboard RV Á ngeles Alvariño from the area of the Avilés Canyon System (~43.9 • N; 6.0 • W) and the Le Danois Bank (~44.1 • N; 5.0 • W) in June 2017 during the expedition SponGES0617. In addition, 15 full depth CTD casts were performed at the respective and additional sampling stations. The focus of this study was on three sampling regions: region A ('Canyon'), region B ('West-Bank'), and region C ('East-Bank'), (Fig. 1). Sampling at all three sampling regions was conducted within a similar depth range (region A: 695 m, region B: 653 m, and region C: 541 m). Bathymetric data of the study area were extracted from the Bathymetry Data Portal of the European Marine Observation and Data Network (EMODnet). Samples of different sponge species were collected in the same grab. The sample details, corresponding (meta-) data and raw sequence NCBI-accession numbers for all methodological approaches of all seawater and sponge samples presented in this study were deposited in the PANGAEA data repository (https://doi.org/10.1594/PANGAEA. 923271).
After collection, sponge specimens were rinsed with fresh seawater and photographed. Tissue subsamples were taken for microbial work and instantly flash-frozen. Sponge fragments of each specimen (1-3 cm 3 ) were preserved in 96% EtOH for molecular analysis (see below), and immediately kept at − 20 • C; EtOH was replaced once after one day preservation. Small sponge fragments were also incubated in sodium hypochlorite solution (Panreac 5% w/v technical grade) and kept at room temperature until the end of the cruise for spicule analysis (see below). The remainder of the specimens not used for molecular and morphological analyses was preserved in 80% EtOH and kept at room temperature in the reference collection at the Oceanographic Centre of Gijón (IEO, Spain). Seawater samples (2 L) for microbial community analyses were derived from each CTD casts' bottom depth (~5 m above seafloor), filtered onto PVDF filter membranes (pore size = 0.22 μm and ø = 47 mm) and stored at − 80 • C until further processing.

Sponge morphological analyses
For spicule preparation, soft sponge tissue was digested with sodium hypochlorite solution (Panreac 5% w/v technical grad.). The remaining spicules were soaked 3 × 1 h with distilled water in a water bath at room temperature which was removed by repeated centrifugation (1 min at 12,000 rpm). Finally, absolute ethanol was added. A few drops of ethanol and spicules were placed on a slide which was then flamed for dehydration. A few drops of Durcupan™ ACM embedding mixture for microscopy (Sigma-Aldrich) were placed on the coverslip which was then kept in the oven at 36 • C for 24 h before visualisation of spicules was performed using a microscope Nikon Eclipse 50i.
Overlapping sequence fragments of 18S and COI were assembled and trimmed separately into consensus sequences using the software Geneious v.10.1.3 (Kearse et al., 2012). Occasionally only forward or reverse sequences were used due to the poor quality of one of the fragments (for all 18S and three COI sequences). Consensus sequences were then checked for contamination using BLAST (Altschul et al., 1990), aligned with MAFFT v.7.309 (Katoh and Standley, 2013) using sequences of closely related species, and used to construct a phylogenetic hypothesis with Maximum Likelihood using RAxML 8.1.22 (Stamatakis, 2014). The evolutionary model was selected using jModelTest (Darriba et al., 2012), resulting in GAMMAGTR for both markers. Phylogenetic analyses for 18S and COI were run separately ten times, with 100 replicates for bootstrap recovery. Our taxonomic distinction between species is then based on 18S and COI phylogenetic analyses (see below) and spicule confirmation. This procedure, of combining classical sponge taxonomy with molecular sequencing, left us with 21 unambigiously identified sponge specimens for the present study, which are comprised of 11 TaP clade individuals and 10 P. hirondellei individuals.

Sponge reproductive state analyses
An assessment of the sponge reproductive state was conducted as biological ground-truthing for the biophysical particle tracking model. Our main aim of this analysis was to derive a reasonable timepoint of larval release into the water column. For the assessment, small pieces (5 mm 3 ) of five TaP clade individuals and three P. hirondellei were fixed in a solution of 2.5% glutaraldehyde, 0.4M PBS and 0.34 NaCl onboard and stored at 4 • C (modified after Koutsouveli et al., 2018). In the home laboratory, samples were rinsed three times with buffer (0.4 M PBS-0.6 M NaCl) and then postfixed in 2% osmium tetroxide for 2 h at 4 • C. After rinsing the sample pieces three additional times (with an incubation time of 15 min per washing step, at 4 • C), partial dehydration was conducted with an ascending ethanol series (2x 30 %-1x 50 %-1 × 70%). In the next step, sponge pieces were submerged in 5% hydrofluoric acid overnight to remove any silica remnants from their skeleton. Subsequently, samples were washed carefully (8 × 70% EtOH) and dehydration was continued with an increasing gradient of ethanol (1x 90 %-2 × 100%). Once dehydrated, samples were embedded in LR-White (LRW) resin, using a gradient of LRW/EtOH (1x 70/30-1x 50/50-1x 30/70-2x 100). Afterwards, samples were transferred into fresh LRW and polymerized inside embedding capsules in an oven (57 • C) for two days. Blocks of LRW were cut using a Reichert Ultracut ultramicrotome equipped with a diamond knife (DIATOME, Switzerland) at 0.5-2 μm (semi-thin sections) and 70-90 nm (ultrathin sections) respectively. The semi-thin sections were stained with toluidine blue or Richardson solution (latter prepared as described in (Mulisch and Welsch, 2015)), while the ultra-thin sections for transmission electron microscopy (TEM) were contrasted with uranyl acetate and Reynold's lead citrate. Ultra-thin sections were visualised with a Hitachi Transmission Electron Microscope TEM (H-7650) and a Tecnai G2 Spirit Bio Twin TEM (FEI Company) at 90 kV. Semi-thin sections were visualised with an Axio Observer. Z1 microscope (Zeiss, Germany).

Sponge population genetic analysis
ddRADseq libraries were performed for a total of 21 individuals: 11 TaP clade specimens and 10 P. hirondellei. Library preparation was conducted following (Peterson et al., 2012) with the following modifications by (Combosch et al., 2017). Double-stranded genomic DNA (500 ng) was digested using the high-fidelity restriction enzymes SbfI and EcoRI (New England Biolabs). Resulting digested fragments were cleaned by manual pipetting using Agencourt AMPure beads (1.5X volume ratio; Beckham Coulter), and were subsequently quantified with a Qubit dsDNA HS assay (Life Technologies). In the next step, resulting fragments were ligated to custom-made P1 and P2 adapters containing sample-specific barcodes and primer annealing sites. Barcoded individuals were pooled into libraries, cleaned by manual pipetting using AMPure beads (1.5X volume ratio), and size-selected (range sizes 200-400 bp) using a Blue Pippin Prep (Sage Science). Each library was PCR-amplified using Phusion polymerase (Thermo Scientific) and a different set of PCR primers with barcodes in order to create multiplex libraries. The PCR program used was 98 • C/30 s -(98 • C/10 s-65 • C/30 s-72 • C/1.5 min) x 12 cycles -72 • C/10 min. Resulting PCR products were cleaned by manual pipetting using Agencourt AMPure beads (1.5X volume ratio), quantified with a Qubit dsDNA HS assay, and quality-checked on a Tapestation 2200 (Agilent Technologies). Libraries were pooled normalizing their concentration, and pooled together with RNA-seq libraries in the same flow cell. Libraries pair-end sequenced (150 bp) were run on an Illumina HiSeq 4000 (Illumina) at Macrogen Inc. (South Korea).
Quality filtering of reads and locus assembly was conducted with the Stacks pipeline, v2.4.1 (Catchen et al., 2013). RAD-tags (DNA fragments with the two appropriate restriction enzyme cut sites that were selected, amplified, and sequenced) were processed using process_radtags, where raw reads were quality-trimmed to remove low quality reads, reads with uncalled bases, and reads without a complete barcode or restriction cut site. The process_radtags rescue feature (-r) was used to recover minimally diverged barcodes and RAD-tags (-barcode_dist 3; -adapter_mm 2). The process_radtags trimming feature (-t) was used to trim remaining reads to 120 bp, in order to increase confidence in single-nucleotide polymorphism (SNP) calling. After performing these filtering steps in process_radtags, we retained a total of 74,442,388 reads from the initial 102,673,760 raw reads (72.5% of the total reads retained), with an average of 3,544,876 reads per sample.
Preliminary tests were carried out following (Jeffries et al., 2016) and (Paris et al., 2017) in order to identify optimal Stacks parameters (including m, M, and n) for our dataset. Briefly, tests were carried out for 5 sets of 3 randomly chosen individuals and, for each test, all non-test parameters were kept as default. The Stacks populations module was run to filter data with r = 0.8 for each test, and the number of assembled loci, number of polymorphic loci, number of SNPs, and coverage was compared between the tests. Final parameter values were as follows: ustacks: M = 2, m = 3; cstacks: n = 1. Mean locus coverage among all samples was 47.1 ± 13.0, ranging from 15.8 to 99.5.
The Stacks populations module was used to conduct a first filtering of the data, retaining those SNPs present in at least 70% of the individuals (r = 0.7), and just retaining the first SNP from each RAD-tag using "-write_single_SNP" in order to reduce the linkage disequilibrium among loci. A subsequent more accurate filtering was performed using the adegenet R package (Jombart, 2008;R Development Core Team, 2008;Jombart and Ahmed, 2011), assessing SNP distributions across individual samples and sampling regions, and testing different filtering thresholds in order to maximise the number of retained SNPs and minimise missing data. This approach provides significant help in defining final thresholds in comparison with the Stacks populations module. The resulting assessment resulted in no further filtering of samples.
Given the known presence of symbiotic bacteria in all sponges, the resulting set of sequences containing variable SNPs obtained after running populations were filtered for bacterial hits. This was done using -blastn comparing the aforementioned set of sequences against a nr database for bacteria extracted from NCBI (accessed on the June 17, 2018), using a e-value of 1e-6 or lower. This filtering of bacteria resulted in 0 being removed in both the P. hirondellei and TaP clade datasets. That said, only polymorphic SNPs were used.
We calculated population genetic diversity and demographic statistics separately for TaP clade specimens and P. hirondellei by grouping the samples per sampling region. Expected (He) and observed (Ho) heterozygosities, and inbreeding coefficients (F IS ) were calculated per each sponge clade per sampling region using Genodive v.3.02 (Meirmans and Van Tienderen, 2004). To assess global inbreeding within sampling regions and differentiation among them, we also calculated global inbreeding coefficient F IS in Genodive.
We assessed the population structure for each sponge clade separately using three different methods: STRUCTURE v.2.3 (Pritchard et al., 2000); the function snapclust in the adegenet R package (Beugin et al., 2018); and the discriminant analysis of principal components (DAPC) as implemented in the adegenet R package (Jombart et al., 2010). We ran STRUCTURE with 50,000 MCMC iterations using the admixture model, with a burn-in of 20,000 iterations, setting the putative K from 1 to 4 with 10 replicates for each run. We used STRUCTURE HARVESTER (Earl and vonHoldt, 2012) and CLUMPP v.1.1.2 (Jakobsson and Rosenberg, 2007) to determine the most likely number of clusters and to average each individual's membership coefficient across the K value replicates, respectively. We used the Akaike Information Criterion (AIC) to identify the optimal number of clusters in snapclust. choose.k, and then initial memberships for snapclust were chosen using the k-means algorithm (pop.ini = "kmeans"), allowing a maximum K (number of clusters) of 12 (max = 12), and a n. start.kmeans of 100 (n.start.kmeans = 100). For the DAPC analysis we grouped samples by sampling region, and retained the number of principal components analysis (PCA) axes and eigen values using the cross-validation xvalDapc function from the adegenet R package. Finally, Pairwise F ST values were estimated to measure the differentiation between pairs of sampling regions using Genodive v.3.02 (Meirmans and Van Tienderen, 2004) with 20,000 permutations.

Sponge-associated microbial community analyses
DNA extraction of all fan-shaped sponges was performed using the DNeasy Power Soil Kit (Qiagen) with approximately 0.25 g sponge tissue or half a seawater filter (for 46 seawater samples in total) used as input material. The concentration of resulting DNA extracts was checked with a NanoDrop spectrophotometer and their quality assessed by gel electrophoresis after a polymerase chain reaction (PCR) with the universal 16 rRNA gene primers 27F and 1492R. After the quality check, a onestep PCR (30x cycles: 98 • C/30 s -98 • C/9 s -55 • C/1 min -72 • C/ 1.30 min -72 • C/10 min -hold at 4 • C) was conducted on the extracts to amplify the V3 to V4 variable regions of the 16S rRNA gene (using the primer pair 341F 5 ′ -CCTACGGGAGGCAGCAG-3' (Muyzer et al., 1993) & 806R 5 ′ -GGACTACHVGGGTWTCTAAT-3' (Caporaso et al., 2011) in a dual-barcoding approach (Kozich et al., 2013)). The amplicon libraries were quality checked by gel electrophoresis, normalised with the SequalPrep Normalization Plate Kit (ThermoFisher Scientific) and pooled equimolarily. Sequencing was performed on a MiSeq platform (MiSeqFGx, Illumina) using v3 chemistry and in subsequent demultiplexing 0 mismatches were allowed in the barcode sequences.
Sequences were processed using QIIME2 (version 2018.11, (Bolyen et al., 2019), similar to the methods described in Busch et al., 2020c. Briefly, the DADA2 algorithm (Callahan et al. 2016) was applied on forward reads (truncated to 270 nt) to generate Amplicon Sequence Variants (ASVs), which in turn were used to calculate phylogenetic ASV trees with the FastTree2 plugin. Classification of representative ASV sequences (on the genus, family, order, class and phylum level) was performed by a primer-specific trained Naive Bayes taxonomic classifier, using the Silva 132 99% OTUs 16S database (Quast et al., 2013).
Statistical analyses of the microbial community were performed on all 21 sponge individuals, including eight TaP clade individuals from region A ('Canyon'), six individuals from region B ('West-Bank') (comprised of three TaP clade individuals + three P. hirondellei individuals), and seven P. hirondellei specimens from region C ('East-Bank'). Further, nine seawater samples were included in the statistical analyses (three samples from each region). Alpha diversity indices (among others Shannon index), as well as beta diversity metrices (among others weighted UniFrac distances) were calculated. Non-metric multidimensional scaling (nMDS) was performed on weighted UniFrac distances to evaluate sample separation in ordination space. To assess if groups of samples differed significantly from each other, permutational multivariate analyses of variance (PERMANOVA) were conducted in a pairwise manner on weighted UniFrac distances. Phyla differing significantly between sample groups were determined and ranked using the Linear Discriminant Analysis Effect Size (LEfSe) algorithm (Segata et al., 2011). We applied a significance level of α = 0.05 for all statistical analyses throughout this study. Plotting was performed with R (version 3.0.2 (R Development Core Team, 2008),) and further fine-tuning of visualisations conducted using Inkscape (version 0.92.3 (Harrington and Team, 2005);) and/or GIMP (version 2.8).

Larval dispersal modelling and oceanographic observations
Before running the virtual particle tracking model simulations, the following steps were performed (i) validation of technical input parameters (number of released particles), (ii) further validation of biological input parameters (timepoint of larval release), (iii) comparison of model output (TS data) with in situ measurements.

Validations with biological and physical oceanographic observations
To further validate the timepoint of larval release into the water column (as determined by the sponge reproductive state analyses), we conducted an assessment of pelagic productivity. This seemed reasonable as several previous studies have described a strong link between pulses of primary productivity and the larval release of benthic organisms (e.g. Highfield et al., 2010). We assessed productivity by two means: (a) based on in situ biological data of bacterioplankton and chlorophyll-a concentrations (snapshot in time, relying on sampling in June 2017); and (b) based on remote sensing (assessment of temporal varitions within the year 2017). The latter was performed with the help of publicly available resources, as we derived monthly chlorophyll-a concentrations of the year 2017 from satellite data accessed via the GlobColor web portal (provided by the European Space Agency, ESA). In situ productivity data was retrieved by flow cytometry using a Beckman Coulter Gallios machine. Samples for flow cytometry were derived in triplicates from bottom depths of all 15 performed CTD casts. Subsamples (4 mL seawater) were fixed with 200 μL glutaraldehyde (GDA, 25%). Sample-containing tubes were stored in vertical position at − 20 • C until processing. For measurement of bacterial cell numbers, the samples were thawed, prefiltered (syringe filters with 50 μm pore size), and sample aliquots of 400 μL were mixed with 10 μL of a SYBR Green stock solution (10,000x concentrate in DMSO). Fluoresbrite fluorescence microspheres with a diameter of 1 μm were added, followed by a product incubation time of 15 min. For measurements of phytoplankton concentrations, no staining was performed, but detection of autofluorescence conducted instead.
To validate our physical model output with in situ physical measurements (of TS data), and to get a better understanding of prevailing water mass properties, we performed in situ sensing using a Seabird 37 CTD sensor system. The bathymetry map was created with QGIS (version 3.4.4; (QGIS Development Team, 2017)). CTD data and satellite data were visualised with Python (version 3.7.3).

Biophysical modelling
We applied a Lagrangian modelling framework using the particle tracking toolbox Parcels (version 2.0.0. beta2, Delandmeter and Van Sebille, 2019) with three-dimensional velocity data from an Atlantic model based on the Nucleus for European Modelling of the Ocean (NEMO v3.6, Madec, 2016) system. VIKING20X is a successor of the Ocean General Circulation Model (OGCM) VIKING20 (Böning et al., 2016) and combines an eddy-rich (1/20 • ) grid of the whole Atlantic (69 • N -33.5 • S), two-way nested into a global 1/4 • ocean-sea-ice configuration (ORCA025), forced by the JRA55-do atmospheric forcing data set of the past decades (Tsujino et al., 2018). In this study we used the experiment VIKING20X.L46-KKG36107B. VIKING20X has 46 z-levels in the vertical, for which the layer thickness increases from 6 m at the surface to 250 m in the deep ocean. At the depths of the simulated regions (i.e. 628 m), the layer thickness is 94 m. The bathymetry in the model is based on the 2-min ETOPO2 bathymetric database and represented by partial steps. Simulated velocities are stored as daily three-dimensional averages (to allow releases on an everyday basis to assure equal integration lengths). The VIKING20 model itself has been intensively validated in previous studies (e.g. Breckenfelder et al., 2017) and used for dispersal studies (Baltazar-Soares et al., 2014;Breusing et al., 2016). To further demonstrate the reliability of the model simulations in our area of interest, we compared in situ temperature and salinity data derived from the 15 priorly mentioned full ocean depth CTD casts in June 2017, with modeled data from VIKING20X (respective data are shown in the Supplementary Material and discussed further in the Results section 3.4). Lagrangian simulations were performed on ~90,000 particles in total (1000 particles per release region and release day), which were released from three release regions (covering four model grid boxes each) during the assumed natural spawning period in June (1st of June -30th of June). The coordinates of the three virtual release regions were covering the actual sampling locations in 2017 at region A ('Canyon'), region B ('West-Bank') and region C ('East-Bank') (consider the Supplementary Material for further insights into the representation of regions in model topography as well as model velocities). The number of particles was determined based on the saturation plateau of computed rarefraction curves (respective data are shown in the Supplementary Material and discussed further in the Results section 3.4). In our study, release positions of sponge larvae were close to the seafloor (in detail a larval release was performed in the vertical centre of the last gridbox above the bottom). After release, virtual larvae were allowed to drift passively at any depth with the three-dimensional time-varying ocean velocities, and no assumptions about mortality were made at any time of the simulations. A typical pelagic larval duration of 14 days was assumed and dispersal probabilities were computed as described previously (van Sebille et al., 2018). To consider temporal fluctuations in current patterns, we modeled releases and dispersal over a period of 10 years (between 2009 and 2018). In-depth analyses on seasonality were performed for one year, in which case we chose the year of the conducted expedition (i.e. 2017). For plotting larval trajectories and probabilities of dispersal, all particle positions deeper than 2000 m (determined by a depth tracer) were flagged particularly. Those particles which, on drifting day 14 (i.e. the last day of the typical pelagic larval duration phase) were found in areas where the seafloor lies below 2000 m, were considered as 'lost' to the open ocean. In this study we present '100% -areas' and '95 %-areas' of larval drift. In the latter case, the area covered by the larvae drift refers to the area including 95% of all observed particle positions within 14 days when positions are binned in boxes of 5 km × 5 km onto a geographic grid. Due to the chosen bin size the number of particle positions considered to calculate this area is, however, up to 0.6% lower than 95%. The connectivity matrix shows the probability that a particle is present at least once in one of the three regions during the last seven days of its drift relative to the number of all particles released. All oceanographic modelling and visualising was performed within Python (version 3.7.3) running (inside Jupyter notebooks; Kluyver et al., 2016) on an Unix system.

Sponge taxonomy
Phakellia hirondellei and the TaP clade individuals are morphologically very similar in their external appearance and can be difficult to distinguish even to the expert eye. Although similar, the spicule content of both lineages differed slightly: TaP clade individuals possessed two types of strongyles and two sizes of styles, and P. hirondellei contained one type of strongyles, two sizes of oxeas and one size of styles (Supplementary Table S1). Only two individuals within P. hirondellei showed small differences in the spicule content (lack of strongyles, Supplementary Table S1). The molecular markers 18S and COI were then used to investigate their phylogenetic placement and to confirm their taxonomic assignment in combination with traditional spicule analyses. In the COI analysis, the ten individuals identified based on the spicule content as P. hirondellei (Topsent, 1890; Bubarida order) were grouped in a monophyletic clade showing relatively moderate support (Supplementary Material S1B), with Phakellia robusta as sister clade (Supplementary Material S1B). However, these same individuals were grouped together in a clade with P. robusta based on the phylogenetic analysis of 18S (Supplementary Material S1A) because of the low resolution power of this marker for sponge phylogeny at species level. Despite multiple sequencing attempts, generation of clean host sequences turned out to be difficult for the two sponge groups. Working with one single sequence direction for 18S, we observed a few nucleotide mismatches within species, which could not be properly checked, but most likely do not reflect a true variation of the marker (as concluded based on the combination of the four different approaches used to decipher sponge taxonomy). The spicule analysis for all these ten individuals (Supplementary Material S1C) supported the assignation to P. hirondellei, despite small differences in the spicule content of two individuals. Within the same clade as P. hirondellei and P. robusta, we found other species which are assigned to either the order Axinellida or Bubarida (Supplementary Material S1B). The other 11 sponge individuals were classified as belonging to the TaP clade. 18S sequencing suggested that this clade included Topsentia and Petromica species (Supplementary Material S1A). In the COI analysis, the robustly supported TaP clade appeared as sister to another clade containing Petromica, Cymbastela, and Ciocalypta (Supplementary Material S1B). The remaining 21 sponges could not be unambigiously taxonomically classified, which is why we discarded the respective samples from further analyses of connectivity.

Sponge microbial community composition
The microbiomes of TaP clade specimens and P. hirondellei clustered apart from seawater in a non-metric multidimensional scaling plot of weighted UniFrac distances ( Fig. 2A). In addition, microbiomes of TaP clade individuals and P. hirondellei fell into two distinct clusters based on weighted UniFrac distances. Alpha diversity indices were highest in seawater, intermediate in TaP clade specimens and lowest in P. hirondellei ( Fig. 2A, small insert). Further, the microbial community composition was significantly different on ASV-level between TaP clade specimens and P. hirondellei, and also different from seawater (PER-MANOVA, p = 0.001, Table 1). Notably, the seawater reference microbiomes did not differ between the three sampling regions in terms of diversity and community composition .
In terms of bacterial abundance and composition, the microbiomes of the TaP clade specimens and P. hirondellei were characteristic of low microbial abundance (LMA) sponges (according to Hentschel et al., 2006;Gloeckner et al., 2014;Moitinho-Silva et al., 2017b) with dominant Gammaproteobacteria and Bacteroidetes (Fig. 2B). However, the relative abundances of these microbial phyla differed between TaP clade specimens and P. hirondellei. While the phylum Nitrospirae was significantly enriched in P. hirondellei, TaP clade individuals showed a significant enrichment of Alphaproteobacteria and Bacteroidetes (Supplementary Material S3).
In a nMDS approach, we observed a substructuring of TaP clade microbiomes between region A ('Canyon') and region B ('West-Bank') ( Fig. 3A), which was statistically significant (PERMANOVA, p = 0.011, Table 2). Phakellia hirondellei microbiomes showed no significant spatial differences between region B ('West-Bank') and region C ('East-Bank') ( Fig. 3B). In terms of alpha diversity indices (Shannon index) the TaP clade and P. hirondellei did not show any spatial differences between sampling region (Fig. 3A and B, small inserts).

Sponge population genetics
For population structure and connectivity (SNPs) analyses a total of 3408 and 2119 SNPs were obtained for the P. hirondellei and TaP clade datasets, respectively. Population genetic statistics for the two sponge clades are given in Table 3. Overall, the expected heterozygosity (H e ), generally considered as a measure of genetic diversity, was slightly lower for TaP clade specimens (0.369) than for P. hirondellei (0.400). Within sampling regions, H e ranged from 0.366 (A:'Canyon') to 0.381 (B:'West-Bank') for TaP clade specimens, and from 0.394 (C:'East-Bank') to 0.408 (B:'West-Bank') for P. hirondellei (Table 3). Observed heterozygosity (H O ) was again lower for the TaP clade (0.535) than for P. hirondellei (0.601). Within sampling regions, H O ranged from 0.532 (A:'Canyon') to 0.557 (B:'West-Bank') for TaP clade specimens, and from 0.597 (B:'West-Bank') to 0.603 (C:'East-Bank') for P. hirondellei (Table 3). All values for the inbreeding coefficient (F IS ) were negative and significant for the two sponge clades and the different regions analysed, indicating an excess of observed heterozygotes (Table 3).
The results of the STRUCTURE and adegenet analyses revealed no significant genetic structure among the samples for the two sponge clades from the different regions ( Fig. 4A and B). This was also corroborated by the F ST values for the pairwise comparisons between regions, which was 0.023 for the comparison between TaP clade regions (p-value = 0.06) and 0.015 for the comparison between P. hirondellei regions (p-value = 0.169). Interestingly, the DAPC analysis for which samples were grouped per sampling region, did detect differences across sampling regions for the TaP clade, which showed a subtle separation between regions A ('Canyon') and B ('West-Bank'), (Fig. 4C; note that DAPC analyses are generally more sensitive than STRUCTURE and adegenet analyses). In contrast, for P. hirondellei, we did not observe this differentiation, but samples of the canyon region (region A) were not available for this species (Fig. 4D).

Validation of oceanographic modelling setup
In order to link population genetic data with oceanographic connectivity, VIKING20X was chosen as our model to simulate drifting of sponge larvae. The optimal number of particles to be released was Non-metric multidimensional scaling ( Fig. 3A and B) is based on weighted UniFrac distances (ASV-level) and symbols indicate the respective sampling regions. The violin plots presented within Fig. 3A and B shows mean (dot) and standard deviation (whiskers) of Shannon indices for the two sponge clades across sites.

Table 1
Overview of pairwise-comparisons (PERMANOVAs) based on weighted UniFrac matrixes of microbial communities between sponge clades (TaP clade and P. hirondellei) and sample types (TaP clade, P. hirondellei, seawater).    S4). When determining the optimal model starting date, we observed that current direction and velocities seemed to be governed by mesoscale processes over the whole year (Supplementary Material S5). We chose June as the starting month to release virtual sponge larvae into the water column; and due to strong temporal variability we performed releases over 10 different years. The month of June was chosen because it is two months after the main phytoplankton bloom in 2017 (Supplementary Material S6) and it is during a period when ripe reproductive stages were found in both, TaP clade and P. hirondellei individuals. In this month, the in situ chlorophyll a concentrations were comparably low at all three locations (Supplementary Material S7A-B) and corresponded well to the satellite data. Both, TaP clade members and P. hirondellei were found to contain mature gametes in mid-June, when they were collected. In TaP   ocean. This configuration rests upon the evidence-based assumption that fan-shaped sponge abundance in the study area is very low below 2000 m water depth: With the aid of a photogrammetic sledge (ROTV Politolana), fan-shaped sponges at the study area were recorded particularly within the mesopelagic zone (Sánchez et al., 2017), especially in a depth range between 425 and 1300 m, with highest densities between 450 and 550 m.

Larvae drift model
For all three regions a considerable fraction of the released particles stayed within their release region for all analysed years and interregional connectivity was low. Passive-particle drift trajectories showed a preferential retention of simulated larvae within the study area (for a particular year see Fig. 5 and Supplementary Material S10; for multi-year data see Supplementary Material S11). Most virtual particles travelled along the bathymetry and were strongly influenced by the mesoscale variability, i.e. the existence and direction of ocean eddies. Across all years and sampling regions on drifting day 14, on average 44.2 ± SE 5.7% of all particles were found in areas where the seafloor is located below 2000 m, but 0% of the particles were actually drifting at water depths below 2000 m. We observed strong inter-annual variations (between 2009 and 2018) in the geographic location of eddies (Supplementary Material S12), as well as in the average larval drift distances, and in the 100% -areas and 95 %-areas of larval drift (Supplementary Material S13). These inter-annual variations were observed at all three sampling regions. To assess inter-annual patterns of connectivity, we computed a connectivity matrix (Fig. 6). Throughout the ten analysed years region A ('Canyon') and region C ('East-Bank') were never connected. In less than half of the analysed years, region B ('West-Bank') received particles from the other two regions. Only in two years (2009 and 2011) particles from region B were transported to another

Discussion
Almost two-thirds of the Earth are considered deep-sea, which is also the least explored habitat on Earth (Costello et al., 2010;Ramirez-Llodra et al., 2010. Designing population genetic studies in such understudied regions can be challenging. In the present study half of all originally collected sponge specimens could not be used for population genetic evaluation, because the taxonomic assignment remained ambiguous despite considerable efforts using both molecular (COI, 18S, 16S) and classical taxonomic markers (spicules). Further, even for specimens identified as belonging to the same species, it can be difficult to assign a taxonomic name. Indeed, sponge systematics and phylogenic reconstructions continue to be difficult using traditional spicule analyses and/or molecular techniques (Erpenbeck et al., 2006;Philippe et al., 2011). While in our study individuals belonging to the TaP clade were identified as belonging to the same species based on COI, 18S, and spicule analyses, their assignment to a described order remains wanting. In order to be able to provide an official taxonomic name, a complete review of the orders to which Topsentia and Petromica (and other groups) belong will need to be accomplished.
Although the encountered difficulties in taxonomic identification led to a considerable reduction of our sampled dataset, we emphasize that a critical re-evaluation of sponge taxonomy needs to be at the basis of every population genetic analysis, as morphologically very similar specimens (i.e. specimens with a similar external body shape) may belong to different sponge orders. The microbial community composition based on 16S amplicon data was identified as a helpful tool to support sponge taxonomy since the microbiomes were host-specific. Indeed, for many organisms phylogenetic relatedness of the host can strongly affect the associated microbial community composition (reviewed in Moran et al., 2019). For sponges, species-specific prokaryotic communities are commonly found (e.g. Easson and Thacker, 2014;Thomas et al., 2016;Steinert et al., 2017). Besides host-specific microbiomes, both, TaP clade individuals and P. hirondellei showed microbial signatures typical for Low Microbial Abundance (LMA) sponges, as described in (Moitinho-Silva et al., 2017b), which differed in terms of relative abundances of microbial phyla. Results of our study further imply that on a subspecies level (genotypes obtained using SNPs), the sponge microbiome seems to be a sensitive parameter for evaluating emerging taxonomic structures of sponge holobionts, and particularly so when the ASVlevel is considered for microbial analyses.
Choosing the appropriate spatial scale to identify potential genetic structures of deep-sea sponges can be difficult since knowledge on population genetics/genomics of deep-sea organisms is sparse. In order to cover a maximum geographic range, the population genetic pattens the fan-shaped sponges P. robusta and P. hirondellei are examined over a few thousand kilometers in a complementary paper by Taboada et al. (in review), while the present study focussed on distances of <100 km. For deep-sea organisms, generally low genetic structuring is reported when organisms occur at similar depths (McClain and Hardy, 2010;Taylor and Roterman, 2017), while other studies highlight the importance of biophysical barriers with respect to inhibition of gene flow. A recent meta-analysis on meiofaunal biodiversity covering 21 canyons together with reference stations on the adjacent slopes, observed a high dissimilarity in taxonomic composition between canyons and slopes (Bianchelli and Danovaro, 2019). At a population genetic level, the red gorgonian Paramuricea clavata showed a strong genetic differentiation among populations in different submarine canyons in the Mediterranean Sea, presumably due to limited effective dispersal (Pérez-Portela et al., 2016). Similarly, another recent study found genetic differentiation among cold-water coral populations across sites in the near neighbourhood (Boavida et al., 2019).
In the present study, we mostly observed genetically well-connected sponge populations for both, the TaP clade, and P. hirondellei. This result was consistent across the STRUCTURE analysis ( Fig. 4A and B), adegenet analysis, and Fst values. Interestingly, individuals belonging to the TaP clade showed a subtle separation between sampling regions A ('Canyon') and B ('West-Bank') in the DAPC analysis (Fig. 4 C), despite the analysis detected an overall single panmictic population. Similarly, we observed a significant difference of TaP clade microbiomes between sampling regions A ('Canyon') and B ('West-Bank') in terms of weighted UniFrac distances. Seawater microbiomes did however not differ between the sampling regions. This supports our oceanographic in situ observations (TS data) that sampling at all three sampling regions occurred within the same watermass and under similar environmental conditions. We thus conclude that slight but distinct substructure in host population genetics and microbial community composition occur concomitantly. We further conclude that although gene flow generally occurs between both examined geologic features (the Avilés Canyon System and the Le Danois Bank), the exchange may be low enough to promote subtle genetic differentiation. It is quite remarkable that slight differentiation can even be found on scales below 100 km distance, and interestingly on both, the host and microbiome level. For P. hirondellei, we did not observe a regional differentiation, but samples of the canyon region (region A) were not available for this species. Our results are overall consistent with a previous lack in strong genetic structuring in deep-sea organisms occurring at similar depths (McClain and Hardy, 2010;Taylor and Roterman, 2017).
Our performed simulations of Lagrangian connectivity suggest that virtual sponge larvae stay in the study area and ocean currents generally allow local gene flow events, despite inter-annual variations in the connectivity between the three regions. For sessile invertebrates, 10-100 km distance is the typical range within which the majority of offspring is transported away from its parents (Palumbi, 2004). Particle trajectories calculated within our study reveal a similar range for TaP clade individuals and P. hirondellei. Larval drifting occurred in swirled patterns and exhibited a variable strength across different years, as the major circulation patterns seemed to be governed by mesoscale processes. Overall, ocean currents seem to promote virtual larvae to drift along the bathymetry in the study area. Our results thus suggest that, although the continental shelf is comparably narrow around the northern shores of Spain, ocean circulation may prevent the loss of sponge larvae towards the open ocean. Whether this will still be the case under future climate scenarios remains to be tested in future studies. Interestingly, in a recent study by Fox et al. (2016) trajectories of cold-water coral larvae were found to be strongly correlated to the North Atlantic Oscillation. The authors of that study conclude that the connectivity of the existing MPA network is vulnerable to atmospheric-driven changes in ocean circulation.
Besides the need of further studies on the variability of the physical settings, also further biological data is urgently needed to improve understanding of connectivity in the deep-sea, especially for key organisms like sponges and the ecosystems they form. As current understanding of deep-sea larval behavior is extremely limited (in general) to nonexistant (for deep-sea sponges), our particle trajectory calculations are based on the assumption of passive dispersal only. However, recent studies have shown that active larval movements can have an order of magnitude impact on dispersal (Gary et al., 2020). It will therefore be crucial for future studies to gather more information on the ecology and behavior of deep-sea sponge larvae.
In summary, we show that canyons can promote slight genetic divergence of sponges and also of their microbiomes, although more data would be needed to provide further statistical support for this observation. We further conclude that the specific prevailing ocean currents of the Cantabrian Sea dictate sponge host genetics by promoting a settlement in suitable areas within kilometers from their parent sponges and consequently allowing gene flow. However oceanographic connectivity showed a distinct variability between the simulated years. While genetic uniformity would occur in times of connectivity between the Avilés Canyon and the Le Danois Bank, genetic differentiation might happen in years of disconnection, when the deep-water currents promote drifting of larvae in different directions. These findings contribute to a better understanding of connectivity, genetic and microbial variability of deep-sea organisms. Our results imply that a design of ecologically coherent networks of MPAs is no trivial task for deep-sea sponge holobionts. We conclude for our study site (on scales < 100 km) that (i) physical connectivity between close-by sites via ocean currents has a strong temporal component, and that (ii) differences between close-by sites are stronger reflected in the sponge-associated microbial community than in the concomitant sponge-host genetics. These points reveal that it is important for a design of ecologically coherent networks of MPAs, to consider the naturally existing variability of connectivity (e.g. physical data from multiple years) and to consider the occurring microbial diversity in addition to the macrobial diversity.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.