Candidatus Abditibacter, a novel genus within the Cryomorphaceae, thriving in the North Sea

Coastal phytoplankton blooms are frequently followed by successive blooms of heterotrophic bacterial clades. The class Flavobacteriia within the Bacteroidetes has been shown to play an important role in the degradation of high molecular weight substrates that become available in the later stages of such blooms. One of the flavobacterial clades repeatedly observed over the course of several years during phytoplankton blooms off the coast of Helgoland, North Sea, is Vis6. This genus-level clade belongs to the family Cryomorphaceae and has been resistant to cultivation to date. Based on metagenome assembled genomes, comparative 16S rRNA gene sequence analyses and fluorescence in situ hybridization, we here propose a novel candidate genus Abditibacter, comprising three novel species Candidatus Abditibacter


Introduction
Marine phytoplankton fixes the same amount of CO2 as land plants despite representing only 0.2% of the global biomass of primary producers [22]. In coastal and upwelling regions, phytoplankton blooms can be initiated by increases in solar irradiation and nutrient availability and by reduced grazing pressure, particularly in spring [11]. These blooms are characterized by rapid increases in phytoplankton growth followed by a decline in population density after several days to weeks. A large part of the organic matter released by living and decaying phytoplankton during and after these bloom events is remineralized to inorganic nutrients and CO2 by heterotrophic bacteria [11,57], mostly Alphaproteobacteria, Gammaproteobacteria and Bacteroidetes [11]. In particular, members of the class Flavobacteriia within the phylum Bacteroidetes play an important role in the degradation of high molecular weight (HMW) compounds, such as proteins and polysaccharides (e.g. [21,37,42,59]). There are correlations between flavobacterial abundances and specific phytoplankton species such as diatoms or flagellates [42,49], yet the interaction of phytoplankton and bacterioplankton can also be interpreted as a recurrent substrate-based succession [59,60].
Helgoland Roads is a long term ecological research station, located off the island of Helgoland in the German Bight, where spring phytoplankton blooms have been studied in detail for one decade [12,23,29,32,59,60]. Among the flavobacterial clades recurrently responding to the diatom-dominated blooms were novel clades related to the known genera Ulvibacter, Polaribacter, Formosa and, within the Cryomorphaceae, a clade referred to as Vis6 [12,59,60]. The Ulvibacter related genus was recently described as Candidatus (Ca.) Prosiliicoccus based on metagenome assembled genomes (MAGs) and other data [23]. Polaribacter and Formosa strains have been isolated from samples from 2010 [29] and have been studied in detail by Xing et al. [66] and Unfried et al. [62]. A taxonomic and ecological study of the clade Vis6 is lacking, despite the fact that this group is recurrently abundant in nearly all spring blooms investigated at Helgoland Roads with relative abundances up to 20% [60], and 16S rRNA gene sequences related to Vis6 have also been found in other phytoplankton blooms (e.g. [34,60,64]).
Vis6 was first found during a research cruise crossing different oceanic provinces from the East Greenland current to the North Atlantic subtropical gyre in September 2006. A fluorescence in situ hybridization (FISH) probe, named Vis6-814, was designed based on 16S rRNA gene libraries [25]. FISH counts showed highest abundances (up to 3.4 ± 1.3 × 10^3 cells per ml) in the northern stations with decreasing numbers southwards. Despite substantial cultivation effort (e.g. [14,17,29]), no cultured representative is available to date and the global distribution and relevance of this clade remain unclear. Vis6 related sequences have been detected as abundant phytoplankton bloom responders (e.g. [34,60,64]), but were sometimes referred to as "Owenweeksia", due to Owenweeksia hongkongensis being at that time the closest cultured relative based on 16S rRNA gene sequence comparison [12,29]. The family Cryomorphaceae, to which both Owenweeksia and Vis6 belong, is polyphyletic and consists of nine genera [10]. It was named after the first and at that time sole member Cryomorpha ignava, which was isolated from a quartz stone sublithic cyanobacterial biofilm in Eastern Antarctica [9]. The family seems to be widely distributed as indicated by different sequence databases such as SILVA, NCBI and GTDB, where highly related sequences are listed as "uncultured" or "unclassified".
In this study, we wanted to close a gap in our knowledge about this abundant bloom-responding flavobacterial clade. Here, the Vis6 clade is characterized based on (i) metagenomic data including a functional description based on gene annotation, (ii) 16S rRNA gene sequences for phylogenetic classification and (iii) abundance data based on 16S rRNA amplicon sequencing and cell counts from fluorescence in situ hybridization (FISH) from several years. Based on these data, we propose a novel candidate genus, Candidatus Abditibacter, within the Cryomorphaceae family that includes three species: Candidatus Abditibacter forsetii, Candidatus Abditibacter vernus and Candidatus Abditibacter autumni.

Sampling
In this study, we combined previously published sequencing and cell count data with new analyses. All samples were taken off the coast of Helgoland, North Sea, at the long-term ecological research station Helgoland Roads (54° 11.3' N, 7° 54.0' E) in varying sampling intervals in the years 2009-2013 and 2016. A strong emphasis was put on the spring time of each year to analyze the bacterial responses to spring phytoplankton blooms. In 2017, one water sample was taken on September 20.
The samples were taken as described in Teeling et al. [59]. In brief, the community DNA, collected on 0.2 µm filters after prefiltration with 10 and 3 µm cut-offs, was extracted and sequenced for metagenomic analyses. Unfractionated seawater, fixed with 1% formaldehyde, was collected on 0.2 µm filters for catalyzed reporter deposition (CARD)-FISH and cell counting. From the sample from September 20, 2017, cells hybridized with a Vis6 specific FISH probe were enriched by flow cytometry prior to sequencing [28], in addition to a shotgun metagenome. Phytoplankton data, including chlorophyll a concentrations, total cell counts and CARD-FISH counts for the years 2009-2012 have been published in Teeling et al. [60]. Chlorophyll a concentrations from the years 2013 and 2016 were assessed from fluorescence data using an algal group analyzer (bbe moldaenke, Kiel-Kronshagen, Germany) [51,65].

16S rRNA phylogenetic reconstruction
A 16S rRNA reference tree was created based on the SILVA database release 132 SSU Ref (www.arb-silva.de). Sequence curation and phylogenetic tree reconstruction were done with the ARB software [38]. The alignment of Vis6 sequences (selected based on a match with probe Vis6-814 [25]) was manually improved for 381 high quality sequences of a length of >1400 bp. All type strains within the Flavobacteriia were additionally selected, resulting in 708 sequences that served as outgroup for the tree construction. With a total of 1102 sequences, two trees were calculated with the neighbor joining (NJ) method, one with an additional 30% Bacteroidetes variability filter and one without. Two further trees were calculated with the RAxML8 tree construction method [56], one with the 30% Bacteroidetes variability filter and one without any variability filter. From those four trees, a consensus tree was generated using the consensus tree option implemented in ARB and manually refined as described in Peplies et al. [47].

Amplicon data
The acquisition and processing of 16S rRNA amplicon data from 2010-2012 were previously published in Chafee et al. [12], and done in a similar manner for the years 2013 and 2016. In brief, the V4 region of the 16S rRNA gene was amplified from the 0.2-3 µm size fractions and sequenced. The sequences were clustered based on oligotyping using Minimum Entropy Decomposition [19,20]. The 3-10 µm size fraction was analyzed in the same way. Differentially abundant oligotypes were determined with the DESeq2 package [36].
Oligotypes with > 1% relative read abundance in the amplicon sequences that have been classified as Cryomorphaceae, as well as 16S rRNA gene sequences from the genome taxonomy database (GTDB; [46]) were selected. These sequences were added to the Vis6 16S rRNA tree using parsimony criteria with the tool ARB Parsimony in ARB.

Metagenome sequencing, assembly and binning
Thirty-eight metagenomes from 2010-2012 were sequenced at the DOE Joint Genome Institute as described in Teeling et al. [60]. Assembly of the reads and binning of contigs was done according to Krüger et al. [32]. Metagenome raw reads, assemblies and MAGs have been submitted to the European Nucleotide Archive (ENA) under accession number PRJEB28156. The MAGs analyzed in our study have the same identifiers as in ENA. MAGs from these analyzed metagenomes were clustered into Mash-clusters using Mash [44], which represent approximate species clusters of highly similar MAGs close to 95% average nucleotide identity (ANI). These Mash-clusters have been published in Krüger et al. [32] and two of them (mc 3 and mc 12) were classified as Vis6 clade, based on 16S rRNA gene sequences, and analyzed in this study. Briefly, fragments longer than 250 bp carrying 16S rRNA gene sequences were identified with CheckM ssu_finder [45] in the MAGs from 2010-2012 and classified by adding them to the Vis6 consensus 16S rRNA tree with parsimony criteria using ARB [38]. From a total of 38 sequences falling into the Vis6 cluster, 37 derived either from Mash-cluster mc 3 or mc 12. The flow-sorting based acquisition of MAGs from a sample taken on September 20, 2017 was described in Grieb et al. [28]. In brief, samples were sorted based on the FISH fluorescence signal of the Vis6-814 probe [25] prior to sequencing, assembly and binning. From the same sampling day, a Vis6 MAG was obtained from a shotgun bulk metagenome. This metagenome as well as the described mini-metagenomes were sequenced at the DOE Joint Genome Institute under the IMG GOLD [41] study ID Gs0130320.
Completeness, contamination and heterogeneity of the MAGs were estimated by CheckM [45]. Adopting quality thresholds from Bowers et al. [8], MAGs were classified as high, medium and low. Only the medium (<10% contamination, > 50% completeness) and high quality (<5% contamination, > 90% completeness, ≥18 tRNA) genomes were used for analyses. This included 13 high quality and 13 medium quality MAGs from mc 3, 29 high quality and 9 medium quality from mc 12 and 8 medium quality MAGs from September 2017. All MAG identifiers, their bin sizes and quality are listed in Table S1.
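The quality filter described above can be sketched in a few lines of Python. This is a minimal illustration that encodes only the thresholds stated in the text; the class `MagStats` and the function `classify_mag` are hypothetical names, and any further criteria from Bowers et al. [8] that are not listed here are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MagStats:
    """Per-MAG quality estimates (hypothetical container, values in percent)."""
    completeness: float   # estimated completeness, e.g. from CheckM
    contamination: float  # estimated contamination, e.g. from CheckM
    trna_count: int       # number of tRNAs detected in the MAG


def classify_mag(stats: MagStats) -> str:
    """Assign a quality class using only the thresholds quoted in the text."""
    # High quality: >90% complete, <5% contaminated, at least 18 tRNAs.
    if (stats.completeness > 90 and stats.contamination < 5
            and stats.trna_count >= 18):
        return "high"
    # Medium quality: >50% complete and <10% contaminated.
    if stats.completeness > 50 and stats.contamination < 10:
        return "medium"
    # Everything else is low quality and would be excluded from analysis.
    return "low"


# Illustrative values only, not taken from Table S1.
examples = [MagStats(95.0, 1.2, 20), MagStats(95.0, 1.2, 10), MagStats(40.0, 2.0, 20)]
kept = [s for s in examples if classify_mag(s) in ("high", "medium")]
```

In this sketch a MAG that meets the completeness and contamination limits for high quality but has fewer than 18 tRNAs falls back to medium quality, which is one plausible reading of the thresholds above.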

Phylogenomic reconstruction
One representative MAG (selected based on highest quality values) of mc 3, mc 12 and September 2017 was placed in a GTDB reference tree [46]. The genome based phylogeny was calculated using GTDB-Tk v0.3.1, and GTDB R89 as the reference data package [46]. The tree itself was created with the de_novo_wf pipeline, using the bacterial marker set (--bac120_ms) and the phylum p__Deinococcota as the --outgroup_taxon. The ANI and amino acid identity (AAI) between MAGs and references were calculated using ani.rb and aai.rb from the enveomics collection [53].

Temporal and spatial distribution
The relative abundance of cells targeted with the Vis6-814 probe [25] was determined by CARD-FISH in relation to the total cell counts as determined by 4',6-diamidino-2-phenylindole (DAPI) staining. CARD-FISH was done according to Pernthaler et al. [48]. Counts from 2009-2012 were taken from Teeling et al. [60], the counts from 2013 and 2016 were done as part of this study. We selected a representative MAG from both mc 3 and mc 12 and recruited the reads from the metagenomes from 2010-2012 and 2016 to these MAGs as described in Francis et al. [23]. Relative abundance was calculated as the percentage of recruited reads from the total number of reads per sampling date. For calculating the abundance of oligotypes, the relative abundance of reads of the amplicon data-set was calculated as described in Chafee et al. [12].
Data for global distribution was collected using IMNGS [33]. As query sequences, representatives of each 16S rRNA gene cluster were used. These were the sequence FQ032803 [26], the 16S rRNA gene sequences from the MAGs 20120524 Bin 102 1, 20110509 Bin 54 1 and 3300031407 1. Minimum target size was 200 and an identity threshold of 99% was used. Percent of reads in each sequencing run was calculated from the IMNGS output, and the corresponding geographic positions for each sequencing run were collected from NCBI. A cutoff of at least 10 reads matching the query was used for plotting.
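The filtering step for the distribution map can be illustrated as follows. This is a sketch under stated assumptions: the record type `RunHit` and its field names are hypothetical and do not reflect the actual IMNGS output format; only the percentage calculation and the cutoff of at least 10 matching reads come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunHit:
    """One sequencing run with reads matching a query (hypothetical layout)."""
    run_id: str          # sequencing run identifier
    matching_reads: int  # reads matching the query at the identity threshold
    total_reads: int     # all reads in the sequencing run


def percent_matching(hit: RunHit) -> float:
    """Percent of reads in a sequencing run that match the query."""
    if hit.total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * hit.matching_reads / hit.total_reads


def runs_for_plotting(hits, min_reads=10):
    """Keep runs with at least `min_reads` matching reads; map run -> percent."""
    return {h.run_id: percent_matching(h)
            for h in hits if h.matching_reads >= min_reads}


# Illustrative numbers only.
hits = [RunHit("run_A", 250, 50_000), RunHit("run_B", 9, 10_000),
        RunHit("run_C", 10, 20_000)]
plotted = runs_for_plotting(hits)  # run_B is dropped by the 10-read cutoff
```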
Additionally, MAGs were selected for annotation by the IMG pipeline [13]. These are available in the IMG/MER comparative analysis system [13] under the submission IDs 208456, 208439, 208438, 208426, 208427 and 208362. The annotations of the MAGs from September 2017 have been published in Grieb et al. [28] and are available under the GOLD study ID Gs0130320. These annotations were used to reconstruct metabolic pathways.

Taxonomic classification
Based on the 16S rRNA gene sequence analyses, Vis6 formed three clusters, referred to as clusters A, B and C, independent of the phylogenetic tree construction method. A phylogenetic consensus tree is shown in Fig. 1. Sequence identity was above 98.7% within the sequences of clusters A and B, whereas cluster C was more diverse with sequence identities > 97.4% among sequences of this group. Sequence identities between the clusters A, B and C were between 94.5% and 98.6% (Table S2), suggesting that they all originate from one genus, but comprise different species [67]. A few sequences targeted by the Vis6-814 probe were deep-branching, with 16S rRNA sequence identities < 94% to clusters A, B and C, and were therefore clearly outside the genus.
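The reasoning behind this rank assignment can be expressed as a simple threshold rule. The sketch below assumes cutoffs of 98.7% (species level) and 94.5% (genus level), which are the boundary values appearing in the identities reported above; the function name `rank_from_identity` is hypothetical, and such cutoffs are guidelines for interpretation rather than strict rules.

```python
def rank_from_identity(identity_percent: float,
                       species_cutoff: float = 98.7,
                       genus_cutoff: float = 94.5) -> str:
    """Interpret a pairwise 16S rRNA gene identity (percent) as a taxonomic rank.

    The default cutoffs are assumptions taken from the values quoted in the text.
    """
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be given in percent (0-100)")
    if identity_percent >= species_cutoff:
        return "possibly same species"
    if identity_percent >= genus_cutoff:
        return "same genus, different species"
    return "different genus"


# Identities in the ranges reported above.
within_cluster_b = rank_from_identity(99.0)  # within clusters A and B: > 98.7%
between_clusters = rank_from_identity(96.0)  # between clusters: 94.5-98.6%
deep_branching = rank_from_identity(93.5)    # deep-branching sequences: < 94%
```

Note that 16S rRNA identity above the species cutoff does not prove that two sequences belong to one species; genome-level measures such as ANI are needed to resolve that.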
From the study of Chafee and coworkers [12], three oligotypes could be classified within the genus-level Vis6 group: oligotype 6900, oligotype 6896 and oligotype 6921. The oligotype identifiers from the study of Chafee and coworkers were altered when we analyzed the data together with the sequences from 2016. Oligotype 6900 is identical to oligotype 2940 in Chafee et al. [12], which was referred to as "Owenweeksia related". In our study, this oligotype 6900 was affiliated to Vis6 cluster B, whereas oligotype 6896 and oligotype 6921 were affiliated to Vis6 cluster C. Another oligotype, oligotype 16147, was affiliated to a sequence targeted by the Vis6-814 probe, but was not affiliated to the genus-level Vis6 group. This could have been due to the short sequence length of the representative oligotype sequences. Other oligotypes that have been classified as Cryomorphaceae in Chafee et al. [12] were either not affiliated to the Vis6 group or could not be stably placed in the 16S rRNA gene tree due to short sequence lengths. MAGs of the Mash-clusters mc 3 and mc 12 included 16S rRNA gene sequences, which were affiliated to cluster B. The 16S rRNA gene sequences from the MAGs from September 2017 were affiliated to cluster C.
Fig. 1 (caption fragment). The numbers indicate how many of the four generated trees, used for the consensus tree, branched at that position. The dashed line indicates sequences targeted by the Vis6-814 probe, including few sequences that are deep-branching between Vis6 and "TMED14". Added to the tree were 16S rRNA gene sequences from MAGs of all three Vis6 candidate species, four oligotypes that were closely related to Vis6 and four closely related 16S rRNA sequences from the GTDB. The classification of GB GCA 001438615 as TMED14 was adopted from GTDB.
Analyzing the 16S rRNA gene sequences contained in the genomes from GTDB, we found that sequences from genus UBA10364 were affiliated to the Vis6 clusters B and C. The closest neighboring cluster contained the 16S rRNA gene sequences of MAG GCA 001438205.1 from candidate genus TMED14 (former "Coccinistipes").
The GTDB genus UBA10364 contained 11 species, of which we calculated the ANI values of each representative sequence to the Vis6 genome sequences from this study (Fig. 2). Sequences from mc 3 and mc 12 have been published in Teeling et al. [60] and were therefore considered when the GTDB database was created [46]. Consequently, whole genome comparison showed that a representative sequence of mc 3 (20100303 Bin 80 1) shared 99.9% ANI with UBA10364 sp002387615 (GCA 002387615.1) and a sequence of mc 12 (20100420 Bin 31 1) shared 98.7% ANI with Sum29DL08 bin30 (GCA 003045825.1). All other species from GTDB share < 95% ANI with our analyzed MAGs. Sequences from September 2017 formed a distinct cluster with no representative from GTDB.
The representative MAGs (from mc 3, mc 12 and September 2017) represent three different species based on 77-81% ANI between each other. Within the three species clusters (mc 3, mc 12 and September 2017), ANI values were > 99%. We will refer to MAGs from mc 12 as candidate species 1, to MAGs of mc 3 as candidate species 2 and to MAGs from September 2017 as candidate species 3.
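How pairwise ANI values translate into species clusters can be sketched with single-linkage grouping at a 95% ANI cutoff, the approximate species boundary mentioned in the Methods. This is an illustration only: the function `species_clusters`, the MAG labels and the ANI values are hypothetical, and the study itself used Mash-clusters rather than this procedure.

```python
from itertools import combinations


def species_clusters(names, ani, threshold=95.0):
    """Group genomes whose pairwise ANI (percent) is >= threshold.

    `ani` maps unordered name pairs, stored as tuples, to ANI values.
    Grouping is single-linkage, implemented with a small union-find.
    """
    parent = {n: n for n in names}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Illustrative values in the reported ranges (>99% within, 77-81% between).
names = ["sp1_a", "sp1_b", "sp2_a", "sp3_a"]
ani = {("sp1_a", "sp1_b"): 99.4, ("sp1_a", "sp2_a"): 80.1,
       ("sp1_b", "sp2_a"): 80.3, ("sp1_a", "sp3_a"): 77.5,
       ("sp1_b", "sp3_a"): 77.2, ("sp2_a", "sp3_a"): 78.0}
clusters = species_clusters(names, ani)
```

With these example values the four genomes fall into three clusters, mirroring the three candidate species.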
The closest cultured representative of the Vis6 sequences, based on 16S rRNA gene sequence analysis, was Phaeocystidibacter luteus with 90% sequence identity (Fig. 1). The closest cultured representative, based on whole genome analysis, was Owenweeksia hongkongensis with 50% AAI (Fig. 3).

Temporal distribution
Abundance estimations based on FISH with the genus-specific probe Vis6-814 ([60] and Table S3) indicated a growth pattern of Vis6 which peaked shortly after phytoplankton blooms. This was in particular seen in 2013, but also in 2009 and 2012 (Fig. 4). One exception was 2010, where Vis6-814 FISH counts reached 20% relative abundance already at the beginning of the spring phytoplankton bloom and had a second, smaller peak after the summer bloom.
Fig. 2. Average nucleotide identities (ANI) between three representative MAGs of each candidate species and one representative sequence of each species from genus "UBA10364" from GTDB. The ANI between the three candidate species clusters was 77-81%. The ANI within each species cluster was > 99%.
examined years. The other two oligotypes, 6896 and 6921, were detected only later in the years and were therefore only present in the datasets of 2010-2012 where sampling occurred throughout the year. Oligotype 6896 was always more abundant than oligotype 6921.
In the analyzed years, metagenomes have only been sequenced during spring. Resulting MAGs were affiliated to candidate species 1 (mc 12) and 2 (mc 3). Within the analyzed metagenomes, candidate species 1 was more abundant in 2010 and 2011, candidate species 2 peaked in 2012 and both were present in nearly equal numbers in 2016. Oligotype 6900 appears to have an abundance pattern similar to that of candidate species 1 and 2 combined.
Within the analyzed datasets based on metagenomic read recruitment, we observed that candidate species 1 and 2 only occurred in spring, but not in autumn, whereas candidate species 3 occurred in autumn, but not in spring (Table S4).

Spatial distribution
The global distribution of 16S rRNA gene sequences closely related to Vis6 shows that the majority of samples derived from coastal surface waters (Fig. 5). Based on 16S rRNA gene amplicons, we observed patterns of latitudinal separation among the Vis6 clusters. Sequences affiliated to cluster A were found primarily in polar regions, sequences affiliated to cluster B were found in polar and temperate regions and sequences affiliated to cluster C showed tendencies towards warmer, temperate zones.

Functional annotation
The estimated genome sizes of the three candidate species of Vis6 ranged between 1.7 and 2.4 Mbp and the GC content ranged between 44% and 47% (Table 1). The number of CAZymes per Mbp was between 17 and 19 and the number of peptidases per Mbp was between 35 and 43.
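The per-Mbp normalization used here is a simple density calculation, and the quotient of the two densities gives a peptidase to CAZyme ratio. The sketch below uses hypothetical function names and made-up gene counts for a 2.0 Mbp genome; the actual per-genome counts are those in Table 1.

```python
def per_mbp(gene_count: int, genome_size_bp: int) -> float:
    """Number of annotated genes per megabase pair of genome sequence."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return gene_count / (genome_size_bp / 1_000_000)


def peptidase_to_cazyme_ratio(peptidases: int, cazymes: int) -> float:
    """Ratio of peptidase to CAZyme gene counts for one genome."""
    if cazymes <= 0:
        raise ValueError("cazymes must be positive")
    return peptidases / cazymes


# Illustrative counts for a hypothetical 2.0 Mbp genome.
genome_size_bp = 2_000_000
cazyme_density = per_mbp(34, genome_size_bp)     # CAZymes per Mbp
peptidase_density = per_mbp(70, genome_size_bp)  # peptidases per Mbp
ratio = peptidase_to_cazyme_ratio(70, 34)
```

Because both densities are normalized by the same genome size, the ratio is independent of genome size and can be compared directly between MAGs of different completeness.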
Based on KEGG annotation, genes for TCA cycle, glycolysis, the non-oxidative part of the pentose phosphate pathway and fatty acid metabolism were present in all three candidate species (Table S5). No differences in presence of analyzed KEGG modules were found between the three candidate species. The annotation of transporters also yielded similar transporters for the three candidate species. Annotated genes encoded lipopolysaccharide export, multidrug efflux pumps, polysaccharide transport, peptide transport, MFS transporters, the Tol biopolymer transport system and transporters for manganese, iron, zinc, nickel, magnesium and ammonium as well as mechanosensitive channels (Table S6). Candidate species 3 had a higher number of O-antigen ligases and membrane proteins annotated than the two spring species. All three candidate species had a gene encoding proteorhodopsin. They can possibly replenish oxaloacetate for the TCA cycle by anaplerotic CO2 fixation. Genes for carbonic anhydrase, which converts CO2 to bicarbonate for use by a phosphoenolpyruvate carboxylase, were also detected. The total number of CAZymes (PL, CE and GH) per Mbp was almost the same in all three candidate species (17-19 CAZymes per Mbp) (Table S7), as was the number of GHs (Fig. 6). GH13, encoding mostly alpha-glucan degradation, and GH16 and GH17, both encoding β-glucan degradation, were found in all candidate species. Family GH74 endoglucanases were also abundant, which could indicate degradation of xyloglucans [2].
Candidate species 3 stood out by the increased presence of GH73, likely cleaving the β-1,4-glycosidic linkage between N-acetylglucosaminyl and N-acetylmuramyl moieties in the carbohydrate backbone of bacterial peptidoglycans [35]. N-acetylglucosaminidase activity has been shown for the GH73 family, but also for the GH3 family, which was present in all three candidate species [63]. GH99 (endo-mannosidase), GH100 (invertase) and GH109 (N-acetyl-galactosaminidase) [35] were annotated in candidate species 3, but absent or only very rare in the other two species. The same applied to GH37, a glycoside hydrolase family coding for trehalases [35]. Trehalase activity is also predicted for GH65, which was annotated in all three species. The family GH46 (chitosan degrading, [35]) was only annotated for candidate species 2.
The number of CBMs per Mbp was lower for candidate species 3 (7 CBMs per Mbp) compared to 12 CBMs per Mbp for candidate species 1 and 2 (Figure S1). CBM44 (cellulose and xyloglucan binding) was the most abundant CBM in all three species, followed by CBM50 (peptidoglycan binding). Of all CEs, CE1 (targeting peptidoglycan, xylan and chitin) was most often annotated in all three species (Figure S2). Candidate species 1 and 2 had a CE3, which was not annotated for candidate species 3. Candidate species 3 had only one polysaccharide lyase, PL12, a heparin-sulfate lyase (Figure S3). Candidate species 1 had only PL22 (oligogalacturonate lyase). Candidate species 2 had PL22, PL1_10 and PL9. The annotated GTs were similar between the three candidate species (Figure S4). Some of the annotated CAZymes were co-localized with each other and with SusC and SusD genes, forming PULs. A putative alpha-glucan degrading PUL, containing GH13 and GH65, was predicted for all three candidate species (Table S8 and Figure S5), as well as a putative beta-glucan degrading PUL, containing GH30, GH17, GH16 and GT4. The latter PUL type is reminiscent of a laminarin degrading PUL with GH30 removing the side chains of laminarin and GH17 degrading the polysaccharides to oligosaccharides [62]. An additional beta-glucan PUL, containing CBM6/GH5_46 and GH16, was annotated in candidate species 1 and 2. Candidate species 3 had two putative PULs, not found in the other two candidate species, of which the first contained GH97 (hydrolyzing α-glycosidic linkages; [31]) and GH37 (hydrolyzing trehalose into glucose; [35]) and the second GH92 and GH20 (containing β-N-acetylglucosaminidases and lacto-N-biosidases; [35]). No sulfatases were detected in the MAGs of candidate species 3, one sulfatase per genome was annotated for candidate species 2 and about half of the MAGs of candidate species 1 had a sulfatase encoded (16 out of 38 MAGs). All detected sulfatases were classified as sulfatase family S1 [4].
The essential proteins for gliding motility of Bacteroidetes are suggested to be GldB, GldD, GldH and GldJ in addition to the PorSS/type IX secretion system (GldK, GldL, GldM, GldN, SprA, SprE, SprT) [39]. Genes gldD, H, J, L, M and N were detected for candidate species 1 and 2 (Table S6). Genes for GldH, M, SprA, SprB and SprT were detected for candidate species 3. More than 8 unspecified gliding-motility annotations per MAG of candidate species 1 and 2 were found, which were not annotated in the MAGs of candidate species 3. A few genes involved in surface adhesion [27] were annotated, such as PKD and, for candidate species 3, additionally FG-GAP.

Ortholog groups
Half of the ortholog groups were shared by all three candidate species (1111 of 2215) (Fig. 7). The largest proportion of ortholog groups, that were only present in one of the species, was found in candidate species 3 (537). Candidate species 1 and 2 were more similar based on this analysis (sharing 207 groups) and more distant to candidate species 3. Most of the orthologs only present in candidate species 3 were of unknown function (Table S9). The kinases and integrases were notable by their higher presence in candidate species 3 compared to the other two species.
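The shared and species-specific counts reported here correspond to the regions of a three-set Venn diagram. A minimal sketch of that bookkeeping is shown below; the function `venn_partition` and the ortholog group identifiers are hypothetical, and the toy sets only illustrate the counting, not the actual data behind Fig. 7.

```python
from collections import Counter


def venn_partition(groups_by_species):
    """Count ortholog groups per exact combination of species containing them.

    `groups_by_species` maps a species label to the set of ortholog group IDs
    found in that species. The result maps each frozenset of species labels to
    the number of groups present in exactly those species.
    """
    membership = {}
    for species, groups in groups_by_species.items():
        for group_id in groups:
            membership.setdefault(group_id, set()).add(species)
    return Counter(frozenset(members) for members in membership.values())


# Toy example with made-up ortholog group identifiers.
toy = {
    "species_1": {"OG1", "OG2", "OG3"},
    "species_2": {"OG1", "OG2", "OG4"},
    "species_3": {"OG1", "OG5", "OG6"},
}
counts = venn_partition(toy)
core = counts[frozenset({"species_1", "species_2", "species_3"})]
only_species_3 = counts[frozenset({"species_3"})]
```

Summing all values of `counts` gives the total number of distinct ortholog groups, which is the denominator for statements such as "half of the ortholog groups were shared".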

Cell morphology and habitat
Cells targeted by the Vis6-814 probe were rod-shaped (Figure S7). The cell dimensions, measured on four sampling dates in three different years, were all in the same range between 0.9 ± 0.2 µm and 1.25 ± 0.2 µm in length and between 0.48 ± 0.04 µm and 0.62 ± 0.08 µm in width. Oligotypes affiliated to the candidate genus were found in both the free-living (0.2-3 µm) and the particle associated (3-10 µm) fraction, but were enriched in the free-living fraction, particularly in spring (Figure S8). Oligotype 6921 and oligotype 6896 were enriched in the particle associated fractions in autumn.

Discussion
Based on the polyphasic data, we describe here three candidate species within a new candidate genus of the family Cryomorphaceae. The description of the novel candidate genus is based on a high coherence within the MAGs of >99% ANI, and an AAI of 50-51% between the candidate genus and the genome of the closest described relative Owenweeksia hongkongensis [52]. As a name for the new candidate genus we suggest Candidatus Abditibacter, which alludes to a genus of hidden, rod-shaped bacteria, being nearly always present in the coastal marine surface waters examined, but most of the time not dominant. We suggest the renaming of the genus UBA10364 from GTDB to Ca. Abditibacter. The species sp003045825 thereby represents candidate species 1 and sp002387615 represents candidate species 2. Candidate species 3 represents a species not yet present in the GTDB.

Three species of Ca. Abditibacter with seasonal variation
We hypothesize a seasonal alternation of the analyzed Ca. Abditibacter species based on the observation that MAGs of candidate species 3 could not be retrieved from the spring metagenomes and MAGs of candidate species 1 and 2 could not be retrieved from the September 2017 metagenome. We named candidate species 1 Ca. Abditibacter vernus due to its main detection in spring. Candidate species 2 was named Ca. Abditibacter forsetii after Forseti, the god that was worshipped on Helgoland, for the sampling location off the coast of Helgoland. Candidate species 3 was named Ca. Abditibacter autumni due to its occurrence in autumn. The connection between the various data used for candidate species description is shown in Table 2.
The two spring species Ca. A. forsetii and Ca. A. vernus were not distinguishable on 16S rRNA gene sequence level. Together with the oligotype 6900 they were affiliated to the Ca. Abditibacter 16S rRNA gene cluster B with > 98.8% sequence identity, suggesting that oligotype 6900 represents both species. This assumption is supported by the observation that the combined abundance patterns of Ca. A. forsetii and Ca. A. vernus match the abundance patterns of oligotype 6900 and also the microscopic FISH counts from the Ca. Abditibacter specific probe Vis6-814.  The two oligotypes 6896 and 6921, which are 99.6% identical, were affiliated to the Ca. Abditibacter 16S rRNA gene sequence cluster C. In the same cluster C, the 16S rRNA gene sequences from the Ca. A. autumni MAGs were grouped. We therefore assume that these two oligotypes represent Ca. A. autumni. This corroborates the assumption that Ca. A. autumni flourishes in late summer and autumn.

Taxonomic classification within the family Cryomorphaceae
All described members of the Cryomorphaceae are either rods or filamentous rods with a GC content ranging from 34-45% [10]. Most described genera show gliding motility and require elevated salt concentrations for growth. Ca. Abditibacter matches these traits by being rod-shaped and dwelling in seawater. Gliding motility may be indicated for Ca. Abditibacter based on the detection of an as yet incomplete set of gliding motility genes, but these genes could also be an evolutionary relict. At 44-48%, the GC content is at the higher end of the range described for the Cryomorphaceae family.

Prevalence in coastal areas linked to phytoplankton blooms
As the name of the family Cryomorphaceae implies, first isolates were retrieved from cold waters, which could indicate an adaptation to cold conditions in this family. Indeed, the map of Vis6 16S rRNA gene sequences indicates that in particular cluster A species have been found preferentially in polar regions. Cluster C species seem to have a tendency towards warmer waters and tropical regions, but have also been detected in temperate regions. However, the 16S rRNA dataset provides only a limited view into the global distribution of Ca. Abditibacter, since the resolution of 16S rRNA is not sufficient to discriminate between the different Ca. Abditibacter species. A distribution analysis based on whole genome sequences of Ca. Abditibacter species could probably provide a more detailed picture.
In the initial FISH experiments by Gomez-Pereira et al. [25], Ca. Abditibacter abundance correlated to chlorophyll a concentrations and to total flavobacterial abundance. Our data also show that Ca. Abditibacter abundance peaks simultaneously with or shortly after chlorophyll a, which was also seen in other studies where Ca. Abditibacter related 16S rRNA gene sequences were retrieved from a diatom dominated phytoplankton bloom [34], from the deep chlorophyll maxima [55] and from a phytoplankton bloom in a mesocosm study [43]. Other isolation sources further indicate that Ca. Abditibacter spp. occur mainly in coastal surface waters, but also in relation to phytoplankton blooms in the North Atlantic [25] and the Southern Ocean [55].

Free-living polysaccharide and peptide degraders
Based on the annotation of genes encoding transporters for polysaccharides and peptides, and the respective degradation enzymes, we propose for all three candidate species a largely heterotrophic lifestyle based on polysaccharide and peptide utilization. Relatively small genomes (1.7 -2.4 Mbp) and high peptidase to CAZyme ratios (2-2.5) are typical for heterotrophic bacteria appearing during the degradation of phytoplankton blooms [7,32,66]. The proteorhodopsin gene is more commonly found in planktonic than in algae associated Bacteroidetes [66] and likely provides an advantage during phases of substrate limitation [24]. The proton gradient generated by proteorhodopsin energizes the inner membrane, for example for transport so that substrates can be used for anabolism instead of respiration and the TCA cycle can be replenished by anaplerotic bicarbonate fixation [27].
We assume a predominantly free-living lifestyle for Ca. Abditibacter based on microscopic observations and 16S rRNA gene analysis. CARD-FISH hybridization on unfractionated seawater did not indicate particle attachment of Ca. Abditibacter. Oligotype analysis also showed an enrichment of Ca. Abditibacter in the free-living fraction (0.2-3 µm), except for two out of three oligotypes in autumn. This could indicate that some Ca. Abditibacter species survive in the particle-associated fraction over autumn and winter, though we did not see microscopic evidence in our September sample.
The genomes of Ca. Abditibacter species contained PULs for the degradation of the simple storage molecules of phytoplankton. These include alpha-glucans like glycogen in cyanobacteria [3] or beta-glucans like chrysolaminarin in diatoms [5]. The CAZyme repertoires of the two spring species Ca. A. forsetii and Ca. A. vernus were more similar to each other than to the repertoire of Ca. A. autumni (Fig. 6). The latter stood out by a larger number of GHs specific for cleaving the β-1,4-glycosidic linkage between N-acetylglucosaminyl (NAG) and N-acetylmuramyl (NAM) moieties in the carbohydrate backbone of bacterial peptidoglycans. Enzymes involved in cell wall hydrolysis are needed for cell division, cell wall rearrangement and also for the recycling of cell walls during sudden carbon depletion [63]. These enzymes could also indicate a utilization of peptidoglycans present in the dissolved and particulate organic matter fraction [6,40].
Sulfatases were not (Ca. A. autumni) or only rarely (Ca. A. vernus and Ca. A. forsetii) detected in the described species, suggesting that sulfated polysaccharides are not utilized. Similar to Formosa spp., we did not detect mannitol dehydrogenases in Ca. Abditibacter, which suggests a specialization on chrysolaminarin that does not have mannitol side chains and is preferentially produced by diatoms [62]. A slight correlation of clade "Owenweeksia", including a sequence closely related to Ca. Abditibacter, with the diatom genus Pseudo-nitzschia was found by Needham and Fuhrman [42]. The PUL spectrum of Ca. Abditibacter identified in this study suggests that the association with diatoms is caused by chrysolaminarin utilization.
Besides polysaccharides, proteins and peptides seem to be an important carbon source for Ca. Abditibacter species, which is not uncommon for Bacteroidetes [21]. The metallopeptidases M1, M23, M16 and the serine proteases S9 and S41 were among the most abundant peptidase families in our dataset, which is consistent with the findings of Gomez-Pereira et al. [26], who analyzed bacteroidetal fosmids retrieved from the North Atlantic Ocean. Peptidases of family M23 lyse peptidoglycans of bacterial cell walls, as either a defensive or a feeding mechanism [50,63]. This does not necessarily point to a pathogenic lifestyle, as peptidoglycan can be taken up directly from the environment and serve as an energy source [6,61].
Ca. Abditibacter likely plays a significant role in the degradation of organic matter derived in particular from phytoplankton blooms. The 16S rRNA gene sequence analyses in IMNGS show a rather cosmopolitan occurrence of Ca. Abditibacter in coastal and open ocean surface water. The data even hint towards a polar occurrence of Ca. Abditibacter cluster A and a preference of Ca. Abditibacter cluster C for higher temperatures. This hypothesis of different growth temperature optima could also explain the seasonal distribution of the three described Ca. Abditibacter species in the temperate climate of the North Sea and could be tested in the future with isolates.

Description of Ca. Abditibacter gen. nov
Candidatus Abditibacter (Ab.di.ti.bac'ter. L. past part. abditus hidden, kept secret, concealed; N.L. masc. n. bacter a rod; N.L. masc. n. Abditibacter a hidden rod). Members of the genus Ca. Abditibacter are psychro- to mesophilic, photo-heterotrophic, marine bacteria, living primarily on peptides and polysaccharides. Three species have been defined within this genus, based on ANI values. Ca. Abditibacter cells can be visualized by the FISH probes Vis6-814 and Vis6-871. The genus was detected in densities of 1.34 × 10³ to 2.2 × 10⁵ cells ml⁻¹ in all marine surface water samples off Helgoland during spring. The GC content of the three species is between 44% and 48%. Cells are rod shaped with an approximate size of 1.1 µm × 0.5 µm. The genus Ca. Abditibacter belongs to the family Cryomorphaceae, order Flavobacteriales, class Flavobacteriia and phylum Bacteroidetes.
Description of Ca. Abditibacter spp.
This species was detected in seawater samples from the North Sea during spring phytoplankton blooms in multiple years. Its estimated genome size is 1.7 Mbp with a GC content of 45%.
This species was detected in seawater samples from the North Sea during spring phytoplankton blooms in multiple years. Its estimated genome size is 1.9 Mbp with a GC content of 44%.
Candidatus Abditibacter autumni (au.tum'ni. L. gen. n. autumni of the autumn). This species was detected in seawater samples from the North Sea in September 2017. Its estimated genome size is 2.4 Mbp with a GC content of 48%.
The three novel candidate species are summarized in Table 3.

Declarations of interest
None.