Taxonomic and functional metagenomic assessment of a Dolichospermum bloom in a large and deep lake south of the Alps

Abstract Untargeted genetic approaches can be used to explore the high metabolic versatility of cyanobacteria. In this context, a comprehensive metagenomic shotgun analysis was performed on a population of Dolichospermum lemmermannii collected during a surface bloom in Lake Garda in the summer of 2020. Using a phylogenomic approach, the almost complete metagenome-assembled genome obtained from the analysis made it possible to clarify the taxonomic position of the species within the genus Dolichospermum and helped to frame the taxonomy of this genus within the ADA group (Anabaena/Dolichospermum/Aphanizomenon). In addition to common functional traits represented in the central metabolism of photosynthetic cyanobacteria, the genome annotation uncovered some distinctive and adaptive traits that helped define the factors that promote and maintain bloom-forming heterocytous nitrogen-fixing Nostocales in oligotrophic lakes. In addition, gene clusters were identified that potentially encode several secondary metabolites that were previously unknown in the populations evolving in the southern Alpine Lake district. These included geosmin, anabaenopeptins, and other bioactive compounds. The results expanded the knowledge of the distinctive competitive traits that drive algal blooms and provided guidance for more targeted analyses of cyanobacterial metabolites with implications for human health and water resource use.


Introduction
Cyanobacteria are a group of photosynthetic prokaryotic microorganisms that are widely distributed throughout the world. In aquatic environments, cyanobacteria are essential for the sustainability of terrestrial life, accounting for ∼25% of carbon dioxide fixation (Aguiló-Nicolau et al. 2023). In phosphorus- and nitrogen-rich lake and river ecosystems, this group of microorganisms is often able to reproduce very rapidly, producing high biomasses and causing blooms (Reynolds and Walsby 1975). In addition to eutrophication, cyanobacterial blooms are favoured and intensified by high water temperatures and thermal stability of the water column (Paerl and Huisman 2009, Visser et al. 2016, Jankowiak et al. 2019). Evidence of an increase in the frequency, size and duration of cyanobacterial blooms around the world has been reported (Huisman et al. 2018, Hou et al. 2022). These phenomena are influenced by geographic location, lake and watershed characteristics and the species involved, and their documentation depends on monitoring coverage and effort (Wood et al. 2017, Hallegraeff et al. 2021, Bullerjahn et al. 2023, Mishra et al. 2023, Erratt and Freeman 2024). Given that many cyanobacteria are capable of producing a wide range of secondary metabolites that are toxic to humans and animals (Meriluoto et al. 2017), cyanobacterial blooms require special attention in terms of monitoring and risk assessment related to the use of aquatic resources for drinking and bathing purposes (Chorus and Welker 2021).
The dynamics of cyanobacterial harmful algal blooms can be highly variable, ranging from localized and episodic events over a few hours or days to persistent, large biomass accumulations over large areas for several days or weeks (Stumpf et al. 2012, Steffen et al. 2017). The intensity of these blooms depends on nutrient availability and local climatic and hydrological conditions (Wynne et al. 2010, Wu et al. 2013).
The environmental localization and impact of cyanobacterial blooms are highly species-specific, depending on the vertical accumulation of biomass, e.g. at the surface, dispersed in the water column, or forming metalimnetic blooms, as in the case of Planktothrix rubescens (De Candolle ex Gomont) Anagnostidis and Komárek (Lindholm et al. 1989, Codd et al. 1999, Boscaini et al. 2017, Zepernick et al. 2024). In turn, the ability to synthesize toxins is often strain-specific and characterized by strong geographic patterns (Kardinaal et al. 2007, Haande et al. 2008, Vico et al. 2020). In all these cases, a complete taxonomic and functional characterization of the events is essential for a comprehensive risk assessment and management of the affected waters.
The conventional taxonomic approach involves the use of microscopic observations of environmental samples, occasionally coupled with genetic characterization of isolates and/or environmental samples (Kurmayer et al. 2017). In parallel, a range of different cyanotoxins is characterized and quantified using liquid chromatography-mass spectrometry (LC-MS) or enzyme-linked immunosorbent assay (Meriluoto et al. 2017). Overall, the genetic analysis of isolates and the metabolomic profiling of isolates and environmental samples are based on targeted analyses, which remain an efficient approach to ensure the correct identification of cyanotoxin producers. However, the use of targeted analyses is often very demanding, requiring fully equipped laboratories and, in the case of isolates, long periods of time for the establishment and growth of populations. They are also restricted to a limited number of target genes and metabolites.
More recently, conventional approaches have been complemented by a number of technologies using culture-independent high-throughput sequencing (HTS) approaches (Thompson and Thielen 2023). Metabarcoding has been widely used as a fast and inexpensive tool to characterize the microbial and cyanobacterial communities (Pawlowski et al. 2018, Cordier et al. 2020, Domaizon et al. 2021), allowing the study of spatial and temporal patterns in the distribution of specific cyanobacterial oligotypes (Berry et al. 2017, Salmaso et al. 2024) and toxigenic taxa (Casero et al. 2019, Linz et al. 2023). Analogous to the classical approaches, metabarcoding is based on the targeted analysis of short DNA (deoxyribonucleic acid) amplicons, allowing a deep determination of microbial communities, but with many limitations, mainly due to the use of single marker genes per run, the limited information carried by short DNA fragments, and the incompleteness of reference databases (Malashenkov et al. 2021, Salmaso et al. 2022). Conversely, metagenomic approaches are based on DNA-targeted independent methods that allow the reconstruction of metagenome-assembled genomes (MAGs) from the analysis of any type of biological and environmental samples (Quince et al. 2017, Pérez-Cobas et al. 2020). The use of draft genomes, i.e. MAGs reconstructed with different levels of completeness and contamination (Garner et al. 2023), makes it possible to unravel the taxonomy and phylogeny of microbial assemblages (Soo et al. 2014, Dvořák et al. 2023, Pessi et al. 2023, Strunecký et al. 2023), which opens important perspectives for the determination of functional properties of species and communities (Chrismas et al. 2018, Linz et al. 2018, Alcorta et al. 2020, Tran et al. 2021, Van Le et al. 2024).
In this work, we report the results of a full-shotgun metagenomic analysis performed on a sample collected during a summer bloom of Dolichospermum detected in Lake Garda. In this context, and considering the many definitions proposed (Zepernick et al. 2024), the term bloom is applied to indicate a visible formation of scum. Blooms with the same characteristics have been recorded irregularly since the early 1990s, and the taxonomy of the unique species involved has been characterized (Salmaso et al. 2015b, Capelli et al. 2017). Our main objectives were (i) to use the MAG of Dolichospermum to characterize the taxonomic assignment of the species at the genomic level; (ii) to identify, through genome annotation, the main metabolic pathways and the presence of relevant metabolites in Dolichospermum, including cyanotoxins; and (iii) to discuss the prospects for the practical use of metagenomic approaches to complement conventional monitoring in assessing the risks posed by the development of potentially toxigenic cyanobacterial populations.

Sampling, filtration, and phytoplankton analysis
The sample for metagenomic and cyanotoxin analyses was collected at the surface using a sterilized plastic bottle during a bloom observed on the afternoon of September 1, 2020, in the shallower southeastern basin of Lake Garda, ∼3 km off the coast of the village of Bardolino (45.55 N 10.68 E; Fig. S1). The sampled layer ranged from 2 to 10 cm. The sample was kept refrigerated overnight until filtration, which was performed the next day on GF/C filters (nominal particle retention 1.2 μm) until almost clogged.
During the bloom, water temperatures were measured with multiparameter probes (Idronaut Ocean Seven 316 Plus and SBE 19plus SeaCAT). Water transparency was measured with a Secchi disk. Samples for chemical (0-2, 9-11, and 19-21 m) and phytoplankton (0-20 m) analyses were collected by the Regional Agency for Environmental Protection and Prevention of the Veneto Region (ARPAV) (Ragusa et al. 2021). The methods used have been regularly checked between the ARPAV and the Fondazione Mach of S. Michele all'Adige (FEM) laboratory as part of the activities carried out within the Long Term Ecological Research (LTER) network (Capotondi et al. 2021) and previous projects (Domaizon et al. 2021). Chemical analyses were performed according to standard methods (APHA, AWWA, and WEF 2018) and included pH, dissolved oxygen, sulfate (SO₄²⁻), nitrogen (NO₃-N, NH₄-N and TN, total nitrogen) and phosphorus (SRP, soluble reactive phosphorus and TP, total phosphorus). Phytoplankton analyses were performed using inverted microscopes (Salmaso et al. 2022). On the same day, additional field measurements and samples for chemical and phytoplankton analyses were collected in the deeper northwestern basin (45.69 N, 10.72 E), ∼20 km north of the bloom location.

DNA extraction, library preparation and sequencing
Filters were stored at -20 °C until DNA extraction, which was performed with the DNeasy PowerWater® DNA Isolation Kit (Qiagen, USA). DNA concentrations were measured with a NanoDrop ND-8000 (Thermo Fisher Scientific Inc., USA). Starting from a total amount of 100 ng, total DNA was fragmented by enzymatic reaction at 37 °C for 5 min, producing fragments of 500 bp. A paired-end library was prepared using the KAPA HyperPlus kit (Roche). Adapters from the KAPA Unique Dual-Indexed Adapter Kit (Roche) recommended for use with the KAPA HyperPlus Kit were ligated to the DNA fragments following the manufacturer's instructions.
Libraries were quantified using the KAPA Library Quantification Kits (Roche) and were sequenced for 150 bp paired-end reads on the Illumina Novaseq-6000 platform (Illumina Inc., San Diego, CA, USA).
The Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the project number PRJNA1074715.

Taxonomic assignment and phylogenomic analyses
The taxonomic analysis of the MAGs recovered from the Lake Garda bloom was based on the Genome Taxonomy Database (GTDB) 09-RS220, released in April 2024 (Parks et al. 2022). Taxonomic classifications were performed using GTDB-Tk 2.4.0 updated to use the GTDB R220 taxonomy (Chaumeil et al. 2022).
Genomes to be compared with the Dolichospermum MAG determined in Lake Garda (FEM_B0920) were selected to cover all the Dolichospermum species available in GTDB R220. Most of these genomes were obtained from metagenomic analyses of non-axenic cultures, enriched cultures, and environmental samples. Only in a few cases was DNA isolated from single cells (Dolichospermum spp., strains sed1-sed10; Woodhouse et al. 2024). In the GTDB R220 taxonomy no Dolichospermum lemmermannii (Richter) P. Wacklin, L. Hoffmann, and J. Komárek genomes were included, whereas in NCBI (Sayers et al. 2022), two genomes attributable to this species were reported. The first was D. lemmermannii CS-548, collected in 1981 from Lake Edlandsvatnet, Norway (GCA_028330815.1) and classified in the GTDB R220 under Dolichospermum sp000312705. The second was Dolichospermum SB001 (GCA_016462165.1), which was detected during an offshore bloom of D. lemmermannii in Lake Superior in August 2018 (Sheik et al. 2022); in GTDB R220, however, this genome was not included in the reference database. From this initial set, three genomes lacking NCBI genus-level classification and GTDB species classifications, and a further 12 genomes with completeness below 95% and/or contamination above 4% (as determined by CheckM2) were excluded from subsequent analyses. Similarly, Dolichospermum SB001 (87.8% completeness and 0.2% contamination) was not included in the main set of analyses. The genomes analysed are reported in Table S1.
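The quality screen described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `GenomeQuality` record and the `passes_quality` helper are hypothetical names, and the only values taken from the text are the 95% completeness and 4% contamination thresholds and the CheckM2 estimates reported for Dolichospermum SB001.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeQuality:
    """CheckM2-style quality estimates for one genome (both in percent)."""
    name: str
    completeness: float
    contamination: float


def passes_quality(genome: GenomeQuality,
                   min_completeness: float = 95.0,
                   max_contamination: float = 4.0) -> bool:
    """Return True only if the genome meets both thresholds quoted in the text."""
    return (genome.completeness >= min_completeness
            and genome.contamination <= max_contamination)


# Values reported in the text for Dolichospermum SB001.
sb001 = GenomeQuality("Dolichospermum SB001", completeness=87.8, contamination=0.2)
print(passes_quality(sb001))  # False: completeness is below 95%
```

A genome is retained only when both conditions hold, which matches the "and/or" exclusion rule stated above: failing either criterion is enough to drop it.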
The MAG of Dolichospermum recorded in Lake Garda (GCA_037075685.1) was compared with this set of genomes using the Average Nucleotide Identity (ANI) (Palmer et al. 2020) computed using pyani 0.2.12 (ANIb) (Pritchard et al. 2015), OrthoANIu 1.2 (Yoon et al. 2017), and fastANI 1.32 (Jain et al. 2018). The suggested species boundary for distinguishing between two species based on ANI values is 0.95-0.96 (Goris et al. 2007, Richter and Rosselló-Móra 2009), whereas genomes of different species generally have ANI < 0.90 and ANI values in the range 0.90-0.95 are comparatively rare (Rodriguez-R et al. 2024).
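As a reading aid, the ANI ranges quoted above can be written as a small decision function. This is only a sketch: the function name and the category labels are hypothetical wording chosen for illustration, and the cut-offs are simply the 0.90, 0.95, and 0.96 values cited in the text, applied to ANI expressed as a fraction.

```python
def interpret_ani(ani: float) -> str:
    """Map a pairwise ANI value (fraction, 0-1) onto the ranges quoted in the text."""
    if not 0.0 <= ani <= 1.0:
        raise ValueError("ANI must be expressed as a fraction between 0 and 1")
    if ani < 0.90:
        return "different species"
    if ani < 0.95:
        return "intermediate range (comparatively rare)"
    if ani <= 0.96:
        return "species boundary zone"
    return "same species"


# Illustrative inputs only; these are not values from the study.
for value in (0.89, 0.93, 0.955, 0.97):
    print(value, "->", interpret_ani(value))
```

Treating 0.95-0.96 as a separate zone, rather than picking a single cut-off, reflects the fact that the text gives the species boundary as a range.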
Phylogenomic analyses were carried out using GToTree 1.8.6 (Lee 2019) with the parameter -G set to 0.75. GToTree makes use of Muscle 5 (Edgar 2022) to align sequences. Sequence alignments were computed using the pre-packaged HMM single-copy gene set specific for Cyanobacteria (251 genes) available in GToTree. The alignment and partitions obtained with GToTree were used to build phylogenomic trees with IQ-TREE 2.3.4, using ModelFinder to select the substitution model (Nguyen et al. 2015), and with branch supports computed using ultrafast bootstrap (UFBoot) values (Minh et al. 2013) with 10 000 replicates; UFBoot 95% support values roughly correspond to a probability of 95% that a clade is true. Two phylogenomic analyses were performed, the first including only the genomes classified at the species level in the GTDB R220 taxonomy (58 genomes) and the second including all the available Dolichospermum genomes (96 genomes); besides the Dolichospermum collected in Lake Garda, in both cases, the genome of Cuspidothrix issatschenkoi CHARLIE-1 (GCF_002934005.1) was used as an outgroup, resulting in a total of 60 and 98 genomes being utilized in the respective analyses. The trees were built using iTOL v6 (Letunic and Bork 2024). Analyses were performed by calculating the alignment and trees using both protein and DNA sequences (Lee 2019), which yielded comparable results; only the trees constructed using proteins are shown. Besides the GTDB taxonomy, the clades obtained in the trees were interpreted taking into account the NCBI taxonomy and the classifications based on the ADA (Anabaena/Dolichospermum/Aphanizomenon) clade concept (Driscoll et al. 2018, Dreher et al. 2021).

Functional annotation
Functional annotation of the Dolichospermum draft genome was performed using the NCBI stand-alone software package Prokaryotic Genome Annotation Pipeline (PGAP) 2023-10-03.build7061 (github.com/ncbi/pgap) (Li et al. 2020) and finally confirmed by annotation using the PGAP service in NCBI (https://www.ncbi.nlm.nih.gov/). PGAP allows the prediction of protein-coding genes and other functional genomic entities such as structural RNAs, tRNAs, small RNAs and pseudogenes. Functional annotations have been integrated with Bakta 1.9.2, which assigns stable database identifiers from RefSeq and UniProt (Schwengers et al. 2021) and, to improve the annotation of antimicrobial resistance genes (ARGs), AMRFinderPlus (Feldgarden et al. 2019, Schwengers et al. 2021). Antimicrobial resistance (AMR) and ARGs were further predicted using ABRicate (version 1.0.1), incorporating the NCBI AMRFinder, ARG-ANNOT, ResFinder, and Card databases (github.com/tseemann/abricate) with minimum DNA identity and coverage values of 80% and 50%, respectively. The location of ribosomal RNA genes in MAGs was further evaluated using Barrnap 0.9 (github.com/tseemann/barrnap). Basic metabolism and phenotypic features of the NCBI D. lemmermannii species were defined using the Kyoto Encyclopedia of Genes and Genomes (KEGG) (Kanehisa et al. 2014). After identifying proteins with Prodigal 2.6.3 (Hyatt et al. 2010), the functional orthologs defined by K numbers (Kegg Orthology, KO, identifiers) were determined using GhostKOALA computed with the genus_prokaryotes + viruses database file (Kanehisa et al. 2016). The pathway KEGG modules (functional units of gene sets in metabolic pathways) were identified with the KEGG Mapper Reconstruct tool (Kanehisa and Sato 2020). Selected phenotypic traits were analysed using KEGG pathway maps (Kanehisa et al. 2022).
The presence of secondary metabolite biosynthetic gene clusters (BGCs) in the Dolichospermum genomes was assessed using the antibiotics and secondary metabolite analysis shell antismash 7.1.0 (default mode), which allows the detection and characterization of BGCs in microorganisms. The similarity is defined as the percentage of genes within the closest known compound that have a significant BLAST hit to genes within the current region (Blin et al. 2023).
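The similarity definition given above is a simple ratio, and a short worked sketch may make it concrete. The function below is hypothetical and is not part of antismash; it only restates the definition in the text, taking as inputs the number of genes in the closest known cluster and how many of them have a significant BLAST hit in the region under study.

```python
def bgc_similarity(genes_with_hit: int, genes_in_known_cluster: int) -> float:
    """Percentage of genes of the closest known cluster that have a significant hit."""
    if genes_in_known_cluster <= 0:
        raise ValueError("the known cluster must contain at least one gene")
    if not 0 <= genes_with_hit <= genes_in_known_cluster:
        raise ValueError("hits must lie between 0 and the size of the known cluster")
    return 100.0 * genes_with_hit / genes_in_known_cluster


# Illustrative counts only: 3 of 4 reference genes matched.
print(bgc_similarity(3, 4))  # 75.0
```

Because the denominator is the size of the known cluster, a low value can indicate either a genuinely divergent cluster or a region that captures only part of it.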

The Dolichospermum bloom
Shortly after sampling at the LTER station in the northeastern basin, an opportunistic sample was taken on the afternoon of September 1, 2020, from a bloom ∼3 km off the coast of the village of Bardolino. The bloom was observed during a period of calm winds. The bloom had the same characteristics observed in other episodes documented in previous years (Salmaso et al. 2015b), i.e. with distinct aggregates of filaments visible with the naked eye and more or less dense streaks in the first few cm of the water column (Fig. S2).

Cyanotoxins
The LC-MS analyses showed a quantifiable presence of ATX-a (0.3 μg L⁻¹). The other toxins analysed were not detected.

Environmental and light microscopy analyses
During the bloom, water temperatures in the first 20 m were between 21.0 and 23.6 °C (Table 1A). The Secchi disk depth was 4 m. pH and oxygen values were between 8.1 and 8.5, and 7.7 and 9.4 mg L⁻¹ (87%-120% saturation), respectively. Sulfate showed homogeneous concentrations in the layers analysed (10 mg L⁻¹). SRP and TP were extremely low throughout the epilimnion, below the detection limit and < 10 μg L⁻¹, respectively. In the first 10 m, nitrate nitrogen was at or below the detection limit (0.05 mg L⁻¹). In the northwestern basin, the analyses gave comparable results, with the main difference being the more homogeneous and measurable concentrations of NO₃-N (120-190 μg L⁻¹) and dissolved oxygen, and slightly higher SRP and TP concentrations in the epilimnion (Table 1B).
The microscopic analyses performed by ARPAV confirmed the presence of D. lemmermannii in the integrated sample collected between 0 and 20 m. The total biovolume contributed by the whole phytoplankton community was ∼400 mm³ m⁻³, while the contribution of cyanobacteria was ∼100 mm³ m⁻³. More than 60% of the biovolume of cyanobacteria was contributed by picoplankton, filaments of Planktothrix rubescens (De Candolle ex Gomont) Anagnostidis and Komárek and colonies of Microcystis aeruginosa (Kützing) Kützing, while the contribution of D. lemmermannii was much lower, i.e. around 8% (< 10 mm³ m⁻³). Tychonema bourrellyi (J.W.G. Lund) Anagnostidis and Komárek was detected with a fraction of biovolume around 5%. Eukaryotic phytoplankton was mainly dominated by Chlorophyceae (142 [...] No Dolichospermum blooms were visually detected at the north-western LTER station. In the 0-20 m layer of this station, the average total cyanobacterial biovolume was 106 mm³ m⁻³, of which 2 mm³ m⁻³ were contributed by D. lemmermannii (Fig. S3).

Metagenomes assembly and the Dolichospermum MAG
Novaseq sequencing generated 62 967 310 paired-end reads. In total, raw data quality processing removed around 11% of the raw reads. After resampling, the subsequent analyses were performed on 30% of the quality-checked and processed paired-end reads.
After correction with metaSPAdes, assembly with Megahit yielded 16 508 contigs larger than 1000 bp, with a total size of 76 Mbp and N50 of 18 548 bp.
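For readers unfamiliar with the N50 statistic reported above, the following sketch shows how it is conventionally computed: contigs are sorted from longest to shortest and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are illustrative assumptions; the `min_length` argument mirrors the "larger than 1000 bp" filter mentioned in the text, and nothing here reproduces the actual assembly.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int], min_length: int = 0) -> int:
    """Return the N50 of the contigs strictly longer than min_length."""
    lengths = sorted((n for n in contig_lengths if n > min_length), reverse=True)
    if not lengths:
        raise ValueError("no contigs left after length filtering")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids rounding issues with total / 2.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: the running total always reaches the total")


# Toy example: total = 290, half = 145; 80 + 70 = 150 reaches it at the 70 bp contig.
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```

Applying the same length filter before computing the statistic matters, because N50 depends on which contigs are counted in the total.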
According to the GTDB taxonomy, the MAG FEM_B0920 was identified by GTDB-Tk as Dolichospermum sp000312705. Furthermore, the Dolichospermum genome recovered from the Lake Garda bloom shared the highest average identities (ANIb > 0.960) with the group of genomes included in GTDB R220 under Dolichospermum sp000312705, which corresponded, according to NCBI taxonomy, to several species mostly assigned to Dolichospermum spp. and Anabaena spp., as well as D. lemmermannii, Dolichospermum flosaquae (Bornet and Flahault) P. Wacklin, L. Hoffmann, and Komárek and Dolichospermum circinale (Rabenhorst ex Bornet and Flahault) Wacklin, Hoffmann, and Komárek (Table 3 and Table S1). Specifically, D. lemmermannii CS-548 isolated from Lake Edlandsvatnet showed an ANIb value of 0.966. The genome recovered from the Lake Superior bloom (Dolichospermum sp. SB001; not included in Table 3) showed an ANIb value of 0.982.

Phylogenomic analyses
In the phylogenomic tree, all the genomes classified at the species level following the GTDB (D. flosaquae, D. circinale, D. heterosporum, D. gracile, and, partly, D. planctonicum) and NCBI taxonomy (D. lemmermannii) showed a clear separation into different clades (Fig. 1). D. gracile showed a relationship with the sole representative of D. compactum, but at a lower level of identity (ANIb < 0.94) compared to intraspecific differences. Excluding D. planctonicum, all groups containing distinct species in compact clades were supported by UFBoot values > 95%.
The phylogenomic analysis calculated using all the Dolichospermum genomes confirmed the results obtained with the analysis based on the species (Fig. 2). The close affinity of the two D. lemmermannii FEM_B0920 and CS-548 NCBI genomes to the Dolichospermum sp000312705 GTDB group (Table 3 and Table S1) was confirmed by the complete phylogenomic analysis, which showed the inclusion of these genomes in a unique compact cluster. While the ANIb values between the Lake Garda genome (FEM_B0920) and all other genomes in this clade were always greater than 0.96, the ANIb values calculated considering all other genomes not included in the Dolichospermum sp000312705/D. lemmermannii clade were always lower than 0.91.
Following the KEGG analysis, the complete and incomplete pathway modules found in the D. lemmermannii FEM_B0920 MAG [...] 27), whereas the remaining pathways included from 1 to 11 modules (Table S3A). A number of modules contained reactions essential for the central metabolism of photosynthetic cyanobacteria, including oxygenic photosynthesis (photosystems II and I; modules M00161 and M00163), beta-carotene biosynthesis (M00097), and phycobilisomes (allophycocyanin and phycocyanin/phycoerythrocyanin, but not phycoerythrin; Fig. S4), the reductive pentose phosphate cycle (Calvin cycle) (M00165), the TCA (tricarboxylic acid, Krebs) cycle (M00009) and glycolysis (M00001) (Table S3A). Specific pathways were relevant for diazotrophic blooming species. Besides assimilatory nitrate reduction (M00531, which included the narB nitrate reductase and nirA nitrite reductase genes), nitrogen metabolism was sustained by nitrogen fixation (M00175) (Fig. S5). Specifically, the annotation by Bakta revealed the presence of different nif genes involved in the fixation of atmospheric nitrogen (i.e. nifB, D, E, H, J, K, N, S, T, U, V, W, and X). No modules associated with dissimilatory nitrate reduction, denitrification, nitrification, and anammox were identified (Fig. S5).
Several protein components of ATP-binding cassette (ABC) membrane transporters for a wide range of nutrients, microelements, and organic molecules were identified in the D. lemmermannii FEM_B0920 genome (Fig. S6). To support the intracellular assimilatory N-reduction and N-uptake, genes encoding a nitrate/nitrite transporter were present (nrtABC), complemented by ammonium uptake (K03320; amt gene). A bicarbonate transporter (CmpABCD) was part of the carbon-concentrating mechanism (CCM). Other sets of genes encoded proteins for the selective transport of molybdate and organic molecules, such as the polyamines spermidine/putrescine, osmoprotectants, oligosaccharides (several with incomplete paths), polyols and lipids, and lipopolysaccharides. Supporting the assimilatory sulfate reduction (M00176), besides ABC transporters for sulfate/thiosulfate and alkanesulfonate (as an additional source of S), a gene encoding non-ABC lower affinity sulfate transport was detected (K03321, sulfate permease). FEM_B0920 included genes encoding active transporters for phosphate and organophosphorus compounds (phosphonate), as well as amino acids and the N-rich urea, CO(NH₂)₂. The Pst system (phosphate ABC transporter; Fig. S6) was complemented by Pho regulon components (PhoHURB; K06217, K02039, K07636, and K07657) involved in the regulation of P-uptake. A group of genes was involved in the synthesis of ABC transporters targeting growth elements such as zinc, cobalt, and nickel, in addition to molybdenum (in the form of molybdate). Although biotin (vitamin B7) can potentially be biosynthesized by FEM_B0920 (M00950), specific proteins for its transport were also potentially encoded.
Genes encoding ferrous iron transport proteins A (FeoA, K04758) and B (FeoB, K04759) were also identified.
The essential role of cofactors and vitamins in the biochemical reactions that maintain cellular functionality was reflected in the presence of several complete or nearly complete modules associated with their biosynthesis; among others, and in addition to biotin/vitamin B7, vitamins B1 (thiamine), B2 (riboflavin), B5 (pantothenate), B6 (pyridoxal-P), and B12 (cobalamin).
In the KEGG database, specific modules describing the gene cluster involved in the gas vesicle biosynthesis are not present. Different gvp genes in the D. lemmermannii FEM_B0920 genome were however identified by specific K-numbers (K23262) and by Bakta annotation.
KEGG annotation of the D. lemmermannii CS_548 genome produced results that were almost indistinguishable from those obtained with the D. lemmermannii FEM_B0920 annotation (Tables S3A-B and Fig. S6).

Genes potentially involved in the synthesis of bioactive peptides
The antismash analysis detected distinct secondary metabolite regions (Table 5). These included regions involved in the physiology of heterocytous N-fixing cyanobacteria (heterocyst glycolipids) and in the biosynthesis of GEO (Table S2).
No regions involved in the biosynthesis of the "conventional cyanotoxins" commonly identified in Lake Garda, such as MCs and ATXs (Cerasino and Salmaso 2012), were detected. On the contrary, new bioactive secondary metabolites, some with potential inhibitory/toxic activity, belonging to the classes of nonribosomal peptides (NRP) and ribosomally produced natural products (RiPP) were identified (Table 5). The first class included anabaenopeptins, scytocyclamides (laxaphycins) and a mycosporine-like compound; varlaxin was detected with very low similarity. The second class included the anacyclamides.
Besides antismash, the absence of gene clusters encoding microcystins and anatoxins in the FEM_B0920 genome was confirmed by the negative results obtained with ISeqDb searching for the presence of anaC, anaF, mcyB, sxtA, and cyrJ. The genes mcyD and mcyE were detected with two short sequences (197 and 128 bp, pident 95.9% and 99.2%) similar to fragments characterized by MITE (miniature inverted-repeat transposable elements) insertion (Fewer et al. 2011). These short fragments were also detected in other Dolichospermum genomes considered in this work.
Compared to the D. lemmermannii FEM_B0920, the genome of D. lemmermannii CS-548 isolated from Lake Edlandsvatnet, Norway, showed the presence of microcystin genes. Furthermore, besides MC, the ability of this genus to potentially synthesize ATX, STX, and CYN was well documented after the analyses by antismash (Table S1). Some apparent patterns were distinguishable, i.e. a broad exclusive presence of genes encoding MC in ADA-2; a broad exclusive presence of genes encoding ATX in ADA-3 (and ADA-8, the outgroup); the presence of genes encoding STX in D. gracile and in one genome in ADA-1; excluding two annotations with weak support, exclusive presence of genes encoding CYN in ADA-6. Genes encoding anabaenopeptins (APs) were present in ADA-2 and ADA-3, D. gracile, and all the Dolichospermum sp028658405 genomes. Genes encoding GEO were well represented in ADA-1 and ADA-3, and only sporadically in ADA-2. Noteworthy is the absence of all analysed genes encoding MC, ATX, STX, CYN, APs, and GEO in the genome of D. flosaquae (ADA-4) and, excluding GEO, D. planctonicum (ADA-1). In ADA-1, in addition to the detection of a genome containing STX genes found by antismash, the analysis of two D. circinale strains was positive for the biosynthesis of STX using analytical methods (Beers 2020), while five strains were positive for the presence of the gene encoding sxtA (Table S1).

Metagenomic analyses of the bacterial community
The MetaPhlan analysis classified the 30% of the quality-filtered reads as representative of the surface sample (scum). Cyanobacteria (28.4%) were mainly represented by Dolichospermum (28.1%), over the remaining bacterial classes, mainly represented by Gammaproteobacteria and Alphaproteobacteria. Besides Dolichospermum, the other cyanobacteria were detected with relative abundances well below 0.3% and were represented by Microcystis aeruginosa, Tychonema bourrellyi (reported as Microcoleus bourrellyi in the GTDB taxonomy), and picocyanobacteria (Synechococcus lacustris and Cyanobium usitatum; Cabello-Yeves et al. 2018). Exceedingly rare reads included Dolichospermum spp., Aphanizomenon spp., Planktothrix spp., and Cuspidothrix issatschenkoi. MAGs recovered from the binning of the contigs included representatives of the classes found by MetaPhlan, mostly belonging to Gammaproteobacteria and Alphaproteobacteria (Table S4). Representatives of the first group included Acidovorax and Rubrivivax, while the second group included Tagaea, Rhabdaerophilum and Sphingorhabdus. No genes involved in the biosynthesis of cyanotoxins and GEO were found in any of the identified bacterial contigs.
The whole set of raw contigs, including those unbinned and not included in any MAGs and those with a length < 1000 bp excluded from the binning procedure, was analysed for the presence of MC, ATX, STX, and CYN, as well as GEO-encoding genes. For anatoxin-a, anaC and anaF were identified with sequences of 113 and 346 bp, respectively, showing 100% similarity to the corresponding genes in the anatoxin-a-producing Tychonema bourrellyi B0820 isolated from Lake Garda (Salmaso et al. 2023). In addition to the sequences including the MITE insertion (previous section), further fragments of mcyB (∼200 bp) and mcyE (around 370 bp) were identified in other contigs not included in the MAGs with pident 95%-100% to uncultured cyanobacteria and Microcystis. Conversely, no other fragments of the sxtA, cyrJ, and geoA genes were identified in the entire contig set, except for geoA identified in Dolichospermum FEM_B0920.

AMR genes
Functional annotation of D. lemmermannii FEM_B0920 by Bakta and/or PGAP identified a tetracycline resistance protein, class C, and the multidrug resistance protein MexB. After protein BLAST, the sequences showed 100% QC and up to 100% and 99.6% similarity to the MFS transporter (Pasqua et al. 2019) of several Anabaena/Dolichospermum species and efflux RND transporter permease subunit (Nappier et al. 2020, Hwengwere et al. 2022, Aguiló-Nicolau et al. 2023), respectively. KEGG annotation found one ortholog (K17836) associated with beta-Lactam resistance (Bush 2013). No AMR genes were identified by ABRicate in D. lemmermannii FEM_B0920 or in the complete set of raw contigs, using the adopted thresholds for minimum identities and coverage.

Discussion
The metagenomic analysis of a surface sample collected during a cyanobacterial bloom identified in Lake Garda in late summer 2020 made it possible to confirm the nature of the organism responsible for the episode and to functionally characterize the population. The analyses clarified the phylogenomic position of D. lemmermannii in relation to other species of the same genus and ADA group, and helped to interpret the adaptive ecological traits in relation to the range of primary and secondary metabolites potentially produced by the population involved in the bloom.

Environmental conditions during the bloom
D. lemmermannii blooms in the lake district south of the Alps were first recorded in Lake Garda at the turn of the 1980s and 1990s. Gradually, blooms also appeared in the other large and deep lakes south of the Alps, namely Lakes Iseo, Como, Maggiore, and Lugano (Callieri et al. 2014, Funari et al. 2014). In Lake Garda, the whole set of microscopic and genetic analyses carried out since the 1990s confirmed the presence of a unique Nostocales in the cyanobacterial communities involved in the blooms (Salmaso et al. 2015b, Capelli et al. 2017).
The long-term historical colonization of D. lemmermannii in Lake Garda was investigated by direct counting of subfossil akinetes identified from sediment cores and by estimating the nature and abundance of filaments germinated from subfossil viable akinetes by light microscope and genetic analyses (Salmaso et al. 2015a). The application of this complementary approach made it possible to identify the onset of colonization around the mid-1960s, when the lake showed a shift from ultra-oligotrophy/oligotrophy to oligomesotrophy (Milan et al. 2015).
The analysis of long-term limnological data collected in Lake Garda since the 1990s showed that D. lemmermannii filaments always developed during the warmest months, with temperatures > 15 °C and abundances generally < 40 mm³ m⁻³ in the layer 0-20 m. Bloom formation during summer and early autumn was favoured by high temperatures, high water stability and calm weather (Salmaso et al. 2015b). Given the extremely low biomass of Dolichospermum in the epilimnetic layer, blooms were caused by the rapid upward movement and accumulation of filaments towards the surface, rather than by in situ growth. The development of this species during the warmest months coincided with the periods of minimum dissolved nitrogen concentrations, which were generally < 100-150 μg N L⁻¹. These conditions were the same as those recorded during the bloom observed in September 2020. In particular, the low concentrations of SRP and TP (< 5 and < 10 μg P L⁻¹, respectively) precluded the development of high phytoplankton biomasses in the first 20 m, whereas the low DIN concentrations (below 50 μg N L⁻¹) recorded in the first 10 m would indicate a state of nitrogen limitation, potentially favouring heterocytous nitrogen-fixing cyanobacteria (Schindler et al. 2016, Maberly et al. 2020, Chorus and Spijkerman 2021). Due to the low biomass associated with surface blooms and the strong constraints imposed by low nutrient concentrations on cyanobacterial development in the epilimnion, these episodes have been termed "oligotrophic blooms" (Salmaso et al. 2015b and references therein).

Taxonomic position within the Dolichospermum species group
Genomic analyses were performed using Dolichospermum genera and species classified by the GTDB initiative, which uses a standardized microbial taxonomy based on genome phylogeny, with genomes obtained from NCBI RefSeq (Reference Sequence Database) and GenBank (Parks et al. 2022). The GTDB taxonomy is based on genome trees inferred from aligned concatenated sets of single-copy marker proteins for Bacteria and Archaea and ANI comparisons, while the LPSN (List of Prokaryotic names with Standing in Nomenclature) (Parte et al. 2020) is used for nomenclatural reference and to establish naming priorities and nomenclature types. In this respect, the phylogenomic and ANI comparative approaches used to define ADA groups (species) are similar to those used by the GTDB, and the two classifications provide comparable results in defining clades. The use of genomic-based approaches is the only objective way to disentangle a legacy of names adopted by different laboratories to classify Nostocales. Consistent with the GTDB approach (Parks et al. 2020), there is a convergence of opinion on the possibility of homogenizing and updating the species names of Nostocales included in the same clades and ADA groups (Österholm et al. 2020, Dreher et al. 2021). In this direction, the GTDB taxonomy represents an important conceptual and practical step, but it is open to updates, as species representatives are re-evaluated with each GTDB release. At present, the main limitations are due to the poor representation in the taxonomic databases (International Nucleotide Sequence Database Collaboration, and GTDB) of several well-documented genomes of species of Nostocales (and cyanobacteria in general), which still represents an obstacle to the correct determination of species of difficult attribution (e.g. Woodhouse et al. 2024) and to the completion of the ADA taxonomy based on the adoption of genomic criteria. In addition, most genomes were obtained from only a few countries, which may have introduced a geographical bias into the results of the taxonomic and annotation analyses. For example, although well characterized, the ADA-4 clade, which included several species of Aphanizomenon flos-aquae Ralfs ex Bornet and Flahault, was reclassified under the name Dolichospermum flosaquae in the GTDB taxonomy. At the same time, the available Dolichospermum flos-aquae genomes in the NCBI database were included, according to the genomic criteria, in three different clades (i.e. Dolichospermum sp000312705/ADA-2, D. heterosporum/ADA-3 and D. planctonicum/ADA-1). These two species are validly published according to the International Code of Botanical Nomenclature, have different morphologies (Komárek 2013) and are capable of producing a different range of toxins (Bernard et al. 2017). Furthermore, the ADA-7 clade at the extreme end of the tree (Fig. 2) is composed of two benthic strains originating from the brackish waters of the Baltic Sea, questioning their inclusion in the genus Dolichospermum (see Österholm et al. 2020). Clarification of the taxonomic position of Dolichospermum within this classification scheme requires better coverage of the constituent genomes. Similar considerations apply to the other groups in the tree, including the ADA-1 clade, which, as already suggested by Driscoll et al. (2018) and Dreher et al. (2021), could be split into two distinct species/subspecies, consistent with the discrimination of the two sets of genomes originally classified under D. planctonicum and D. circinale (Komárek 2013).
The two D. lemmermannii genome assemblies classified in the NCBI taxonomy (FEM_B0920 and CS-548) showed high genomic similarity (ANIb > 0.96) with a large group of Dolichospermum and Anabaena species, which are collectively grouped within the Dolichospermum sp000312705 taxon defined in the GTDB taxonomy and within the ADA-2 group. Overall, the results would suggest a relationship between the taxa represented in this group and D. lemmermannii.

Functional annotation
The two photosystems and their associated reactions, the reductive pentose phosphate cycle, the TCA cycle and glycolysis may be considered the major core pathways that characterize cyanobacterial metabolism. Other more specific metabolic pathways are differentially present in cyanobacteria and closely associated with selective traits that promote cyanobacterial growth and bloom formation (Cao et al. 2020). In this regard, in the FEM_B0920 MAG, specific traits were associated with phenological and physiological characteristics of bloom-forming Nostocales.
Related to photosynthetic processes, the presence of genes encoding phycocyanin and allophycocyanin, which absorb far-red and red-orange light, is consistent with the development of the D. lemmermannii population in the surface epilimnetic waters of Lake Garda (Salmaso et al. 2015b). Phycoerythrin is mostly found in species that use the green-yellow region of the spectrum in low-light deeper waters and in species forming metalimnetic layers (Knapp et al. 2021).
Carbon, hydrogen, nitrogen, oxygen, phosphorus and sulfur are the six bulk macronutrients (CHNOPS) sustaining life (Fagerbakke et al. 1996, Remick and Helmann 2023). Among the CHNOPS elements, N and P are often present at low environmental concentrations and require targeted cellular transporters for their uptake (Reynolds 2006, Yang et al. 2022). Similarly, under high photosynthetic activity and high pH conditions, CO₂ and HCO₃⁻ decrease in favour of CO₃²⁻, which is not directly utilized by microalgae, leading to C-limited conditions (Stumm and Morgan 1996, Wetzel 2001). The presence of nitrogen fixation genes in the FEM_B0920 genome suggested the potential ability of the D. lemmermannii population to fix atmospheric nitrogen. Although the current practice for computational prediction of N fixation is based on the presence of the nifH and/or nifD genes (Dos Santos et al. 2012), it was suggested that the presence of a minimum set of six genes encoding structural and biosynthetic components, i.e. NifHDK and NifENB, should be verified, as in the FEM_B0920 MAG. At the ultrastructural level, the potential for N-fixation was confirmed by the identification of the complex of genes encoding heterocyte glycolipids (Garg and Maldener 2021, Pérez Gallego et al. 2023). The presence of heterocytes in the filaments of Dolichospermum observed in Lake Garda is quite common, see, e.g. Fig. S3 in Salmaso et al. (2015b), but their quantitative estimation was never performed in the sample collected in this or previous blooms. Given the evolutionary establishment and success of nitrogen fixation in bacteria, the physiological and competitive benefits are likely to outweigh the energetic costs. Nevertheless, while experimental measurements have assessed quantifiable rates of N-fixation in several lakes at different levels of environmental nitrogen (Natwora and Sheik 2021, Marcarelli et al. 2022, Ehrenfels et al. 2023) and nitrogen and CO₂ concentrations (Kramer et al. 2024), no experimental evidence has been collected by performing nitrogen fixation assays in Lake Garda. On the other hand, in addition to exogenous inorganic (nitrate, nitrite and ammonium) transporters, the ability for organic nitrogen uptake was identified in the FEM_B0920 MAG, suggesting the potential scavenging of additional sources of N compounds during the nutrient-poor summer period. Various types of amino acids, urea, putrescine and spermidine are common organic nutrient sources produced by the planktic community that can be used by microorganisms as a source of carbon and nitrogen. The elevated dissolved organic nitrogen levels observed in Lake Superior during the blooms of D. lemmermannii, coupled with a decrease in nitrate, indicated that nitrogen species conversion and cycling may have played a significant role in maintaining the blooming population (Sterner et al. 2020, Sheik et al. 2022). In Lake Garda, due to the typically low epilimnetic microalgal biomass observed during the summer months, the contribution of the external organic nutrient sources remains to be quantified.
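The six-gene criterion for predicting N fixation mentioned above (NifHDK plus NifENB) amounts to a simple set-membership check on an annotation. The following is only a minimal sketch of that idea, not the procedure used in the study: the function names and the two example gene lists are hypothetical, and real annotations would need handling of gene-name synonyms and fragmented genes.

```python
# Minimum six-gene set named in the text: structural (NifHDK) and
# biosynthetic (NifENB) components.
REQUIRED_NIF_GENES = frozenset({"nifH", "nifD", "nifK", "nifE", "nifN", "nifB"})


def missing_nif_genes(annotated_genes):
    """Return the required nif genes absent from an iterable of gene names."""
    present = {name.strip().lower() for name in annotated_genes}
    return sorted(g for g in REQUIRED_NIF_GENES if g.lower() not in present)


def has_minimum_nif_set(annotated_genes):
    """True only if all six required nif genes are annotated."""
    return not missing_nif_genes(annotated_genes)


# Hypothetical annotation lists, for illustration only (not data from the study).
example_complete = ["nifH", "nifD", "nifK", "nifE", "nifN", "nifB", "psbA"]
example_partial = ["nifH", "nifD", "psbA"]

print(has_minimum_nif_set(example_complete))
print(missing_nif_genes(example_partial))
```

A genome carrying only nifH and nifD would pass the simpler nifH/nifD test but fail this stricter check, which is the distinction drawn in the paragraph above.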
The presence of genes for the potential active uptake of P and bicarbonates is similarly indicative of adaptations to low-nutrient conditions during the summer months and blooms. In bacteria, the synthesis of the Pst phosphate transport system is promoted under low P-concentrations, as demonstrated in Nostoc punctiforme Hariot under P-starvation conditions (Hudek et al. 2016). The Pho regulon is responsible for sensing environmental phosphate levels and is therefore critical in regulating adaptive responses to P limitation, particularly given its activity under low-P conditions (Santos-Beneit 2015, Zhang et al. 2024). Additional sources of P could potentially be provided by the uptake of organophosphorus compounds, e.g. phosphonates (Xiao et al. 2022), although known genes involved in subsequent mineralization after uptake (such as C-P lyase; phnJ, K06163) were not identified in the FEM_B0920 MAG. As only a few cyanobacterial species possess genes encoding C-P lyase, the mineralization of phosphonate by the phycosphere community was described as an additional mechanism enabling organic phosphorus scavenging (Zhao et al. 2023).
Five different inorganic carbon uptake systems have been identified in different model cyanobacteria (Hagemann et al. 2021). The cmpABCD cluster in the FEM_B0920 MAG encodes an ATP-binding cassette transporter involved in HCO₃⁻ uptake (Maeda et al. 2000, Koropatkin et al. 2007). This operon is part of the CCMs in cyanobacteria, potentially mitigating the decrease in CO₂ when pH is progressively higher than 8. Inorganic carbon transporters allow high levels of HCO₃⁻ to accumulate inside cells, especially when free CO₂ is very low, and the cells are mainly consuming bicarbonate from the medium. When accumulated into the cell, bicarbonate penetrates into carboxysomes, where it is dehydrated to CO₂ in proximity to RubisCO (Burnap et al. 2015).
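The decline of free CO₂ above pH 8 can be illustrated with the standard equilibrium speciation of dissolved inorganic carbon. The sketch below is not taken from the study: it assumes textbook dissociation constants for dilute freshwater at about 25 °C (pK1 ≈ 6.35, pK2 ≈ 10.33), which shift with temperature and ionic strength, and the function name is hypothetical.

```python
def carbonate_fractions(ph, pk1=6.35, pk2=10.33):
    """Equilibrium fractions of CO2, HCO3- and CO3^2- at a given pH.

    pk1 and pk2 are assumed textbook values (dilute freshwater, ~25 degrees C).
    """
    h = 10.0 ** (-ph)
    k1 = 10.0 ** (-pk1)
    k2 = 10.0 ** (-pk2)
    denom = h * h + k1 * h + k1 * k2
    return {
        "CO2": h * h / denom,
        "HCO3-": k1 * h / denom,
        "CO3^2-": k1 * k2 / denom,
    }


for ph in (7.0, 8.0, 9.0, 10.0):
    fractions = carbonate_fractions(ph)
    summary = ", ".join(f"{name} {100 * value:5.1f}%" for name, value in fractions.items())
    print(f"pH {ph:4.1f}: {summary}")
```

Under these assumed constants, free CO₂ is roughly 2% of the dissolved inorganic carbon at pH 8 and roughly 0.2% at pH 9, while bicarbonate remains the dominant form, which is consistent with the advantage of an HCO₃⁻ transporter under bloom conditions.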
Sulfur is an essential component of the amino acids cysteine and methionine and an essential constituent of several cellular cofactors (Scott et al. 2007). Sulfur limitation reduces cyanobacterial growth, alters the cellular ultrastructure and exerts inhibitory effects on photosynthesis (Kharwar et al. 2021). In addition to sulfate, organosulfur compounds such as alkanesulfonates can provide an additional or alternative sulfur source. Once inside the cell, the sulfonate group is converted to inorganic sulfate or sulfite by specific enzymes such as alkanesulfonate monooxygenase (ssuD; K04091), identified in the FEM_B0920 genome. High-affinity sulfate transporters are induced only under sulfate deficiency (Kharwar et al. 2021, Kharwar and Mishra 2024). According to Reynolds (2006), unlike C, N and P, sulfur is usually in excess relative to phytoplankton requirements, and sulfate normally saturates the S-uptake of algae down to concentrations of 4.8 mg SO₄²⁻. In Aphanothece (Anacystis) nidulans P. Richter, Utkilen et al. (1976) and Green and Grossman (1988) reported half-saturation constants for sulfate uptake of 0.75 and 1.35 μM, indicating that, for this species, the transport of SO₄²⁻ could be limited at low concentrations down to ca. 0.1 mg L⁻¹.
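The step from half-saturation constants in μM to the mass concentration of ca. 0.1 mg L⁻¹ is a unit conversion using the molar mass of sulfate. A minimal sketch follows, assuming standard atomic masses (S ≈ 32.06, O ≈ 15.999); the function name is hypothetical.

```python
# Molar mass of SO4^2- from standard atomic masses (assumed values).
SULFATE_MOLAR_MASS_G_PER_MOL = 32.06 + 4 * 15.999  # about 96.06 g/mol


def micromolar_to_mg_per_l(conc_um, molar_mass=SULFATE_MOLAR_MASS_G_PER_MOL):
    """Convert a concentration in micromol/L to mg/L.

    micromol/L * g/mol gives microgram/L; dividing by 1000 gives mg/L.
    """
    return conc_um * molar_mass / 1000.0


# Half-saturation constants quoted in the text (in micromolar).
for k_half in (0.75, 1.35):
    print(f"{k_half} uM = {micromolar_to_mg_per_l(k_half):.3f} mg SO4 per litre")
```

The two constants correspond to about 0.07 and 0.13 mg SO₄²⁻ L⁻¹, which brackets the rounded figure of ca. 0.1 mg L⁻¹ given above.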
Along with the presence of several transporters targeting growth microelements, the presence of several complete or nearly complete modules associated with the biosynthesis of cofactors and vitamins represented a crucial factor in ensuring a wide range of metabolic processes in a wide range of changing environmental conditions (Romine et al. 2017, Żymańczyk-Duda et al. 2022, Shah et al. 2024).
The surface bloom of D. lemmermannii was controlled by the biosynthesis of gas vesicles, which is mediated by several gvp genes (Walsby 1994, D'Alelio et al. 2011, Hill and Salmond 2020), some of which have been identified in the FEM_B0920 genome. Under calm conditions and with a high rate of gas vesicle formation, D. lemmermannii filaments can reach upward vertical velocities of up to 0.7-0.9 m h⁻¹ (Walsby et al. 1991), thus explaining the sudden formation of scums. Under these conditions, with high solar radiation and O₂ availability, high production of ROS can severely damage the functionality of cells (He and Häder 2002), making the removal of ROS via enzymatic reaction a key mitigating selective factor.
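To give a sense of scale for the flotation velocities quoted above, the time needed to reach the surface can be estimated as depth divided by velocity. This is only a back-of-the-envelope sketch: it assumes a constant maximum velocity and no turbulent mixing, and the starting depths are illustrative choices (20 m is the lower limit of the layer discussed earlier), not measurements from the bloom.

```python
def rise_time_hours(depth_m, velocity_m_per_h):
    """Hours needed to float up from depth_m at a constant upward velocity."""
    if velocity_m_per_h <= 0:
        raise ValueError("velocity must be positive")
    return depth_m / velocity_m_per_h


# Velocity range from the text (m per hour); depths are illustrative assumptions.
V_SLOW, V_FAST = 0.7, 0.9

for depth in (5, 10, 20):
    fastest = rise_time_hours(depth, V_FAST)
    slowest = rise_time_hours(depth, V_SLOW)
    print(f"from {depth:2d} m: {fastest:.1f} to {slowest:.1f} h")
```

Under these assumptions, filaments distributed over the upper 20 m could accumulate at the surface within roughly a day of calm weather, without any in situ growth.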

AMR
The absence of ARGs in the Dolichospermum bloom may be related to the oligotrophic status of the lake. In the same lake district, Di Cesare et al. (2024) reported extremely low concentrations of antibiotics and other pharmaceuticals in the oligotrophic Lake Maggiore. The presence of ARGs in the Dolichospermum genome, as indicated by Bakta or KEGG, would require further analysis, considering a larger number of samples to be evaluated. This is particularly relevant as a study of 862 high-quality cyanobacterial genomes revealed a high diversity of ARGs, especially in Nostocales, which had the highest number of species with ARGs (67 out of 301) (Timms et al. 2023).

Conventional and emerging secondary metabolites
The absence of gene clusters or single genes encoding MCs and ATX in the D. lemmermannii population that caused the Lake Garda bloom in 2020 fully confirmed previous studies carried out on several strains isolated from Lake Garda and other large lakes south of the Alps (Salmaso et al. 2015b, Capelli et al. 2017, Cerasino et al. 2017). The FEM_B0920 genome contained short fragments of mcyD and mcyE with a MITE insertion (Fewer et al. 2011); their presence could indicate inactivation of the mcy gene cluster by genetic rearrangement, but proper analysis of this topic would require dedicated and complete analyses of a representative number of genomes.
The low concentrations of ATX detected in the bloom of D. lemmermannii in Lake Garda were presumably produced by T. bourrellyi, which until now was the only ATX producer isolated in Lake Garda (Shams et al. 2015, Cerasino and Salmaso 2020, Salmaso et al. 2023). This was confirmed by the identification of anaC and anaF sequences in the complete set of contigs, with 100% similarity to Tychonema bourrellyi.
Several gene regions potentially involved in the biosynthesis of secondary metabolites have been identified in the D. lemmermannii FEM_B0920 genome. GEO is a well-known terpene volatile compound produced by a wide range of bacteria and cyanobacteria in terrestrial and aquatic environments, giving soil and water an earthy odour. Although not toxic to humans via drinking water at environmentally relevant concentrations, GEO can lead to a loss of consumer confidence in water quality (Akcaalan et al. 2022, Manganelli et al. 2023). In this work, GEO-encoding genes were detected in ADA-1 and ADA-3, and partly in ADA-2. Some NRPs can be classified as emerging chemical contaminants, i.e. compounds that are not generally monitored and not subject to regulation, but which have the potential to have adverse effects on human health and ecosystems (Parida et al. 2021, Morin-Crini et al. 2022). Among these, anabaenopeptins are a family of cyclic hexapeptides that have been identified in a large number of cyanobacteria (Sterner et al. 2020, Monteiro et al. 2021, Dreher et al. 2023, Zastepa et al. 2023). Congeners of APs have been shown to have inhibitory activity against phosphatases and proteases, but their potential effects on human health remain to be evaluated (Gkelis et al. 2015, Monteiro et al. 2021). Among NRPs, scytocyclamides are laxaphycins discovered in Scytonema hofmannii (Heinilä et al. 2020). Scytocyclamides and laxaphycins have shown significant antifungal activity, usually coupled with cytotoxic activity (Fewer et al. 2021), as well as toxicity against the crustacean Thamnocephalus platyurus (Darcel et al. 2021). Varlaxin is a new NRPS aeruginosin-type inhibitor of human trypsins (Heinilä et al. 2022). A congener of this metabolite showed inhibition of human prometastatic trypsin-3, making varlaxin a potential lead molecule for drug development (Heinilä et al. 2022). This BGC showed a broad presence in the Dolichospermum genomes (data not shown), although it was detected in strain FEM_B0920 with a very low similarity value.
Among RiPPs, cyanobactins may be involved in the competition between strains or act as antimicrobial agents against bacteria (Nowruzi and Porzani 2021). Mycosporine-like amino acids (MAAs) are produced by a variety of organisms to protect against ultraviolet (UV) damage (Chen et al. 2021). Although still controversial (Hu et al. 2015), the presence of MAAs was related to the protection against UV radiation during high solar irradiances (D'Agostino et al. 2016, Yang et al. 2018, Geraldes et al. 2020, Jacinavicius et al. 2021), such as those experienced during blooms (Zhang et al. 2022).

Cyanotoxins and other encoding genes in Dolichospermum species
The distribution of genes encoding cyanotoxins, APs and GEO showed a well-distinguishable pattern in each ADA clade, suggesting a substantial relationship between genome identities within an individual species and the biosynthesis of these secondary metabolites. This is in agreement with the results of Österholm et al. (2020). Genes or gene clusters encoding STX and CYN in Dolichospermum were investigated by Ledreux et al. (2010), D'Agostino et al. (2020), and Halary et al. (2023), and by Dreher et al. (2022), respectively, while genes encoding ATX were also investigated by Wood et al. (2007) and Rantala-Ylinen et al. (2011). Studies on MC-producing strains included, among others, Rouhiainen et al. (2004) and Dreher et al. (2019).
The ability to potentially synthesize specific cyanotoxins in specific phylogenomic clades has important implications for the expected impacts and potential risks associated with the development of ADA species. Nevertheless, the very limited geographical areas of origin of the genomes (in particular ADA-1/D. circinale, ADA-4 and ADA-6) and/or the under-representation of genomes in some ADA groups could introduce a bias in the representativeness of the results.
Obtaining reliable information on the potential of microorganisms to synthesize active biomolecules using genome mining techniques requires analyses to be performed on genomes that are as complete and uncontaminated as possible. When applied to fragmented or poor-quality genome assemblies, genome annotation tools can produce inconsistent results (Skinnider et al. 2020). In this respect, while genome mining may provide a remarkable screening tool and an essential guide to assess the potential of specific cyanobacterial populations to synthesize a range of harmful compounds, a complete risk assessment procedure should always consider the inclusion of chemical analytical techniques.

From genes to functions: extending the characterization of functional traits and competitive adaptations
The analysis of the genetic characteristics of cyanobacteria gives access to explicit information on general metabolic pathways and on the specific adaptive and competitive physiological capabilities of particular groups or species/strains that are morphologically similar or indistinguishable but differ in their genetic and functional characteristics. From an ecological point of view, this represents a considerable step that integrates and substantially widens the functional characterization of cyanobacteria and phytoplankton based on structural morphometric and morphological traits such as, among others, cell size and shape, arrangement of cells, and presence of mucilage and gas vesicles (B-Béres et al. 2024). Besides common functional traits represented by the central metabolism of photosynthetic cyanobacteria, a few distinctive and adaptive traits contributed to defining the factors promoting algal blooms in oligotrophic environments, including the presence of several high- and low-affinity transporters for macro- and micronutrients and organic compounds; the possession of a gene pool for nitrogen fixation; the ability to control vertical position; adaptations to remove reactive oxygen species produced during photosynthesis; and the ability to produce MAAs involved in UV protection of cells exposed to high irradiances. All these traits delineate the set of competitive functions that D. lemmermannii can potentially express in oligotrophic lakes.

Conclusions
Cyanobacterial blooms pose a potential risk to human and environmental health and function. A reliable assessment of the risks associated with massive population development or physical accumulation of potentially toxigenic cyanobacteria requires a comprehensive assessment of the gene pool responsible for cyanotoxin production and metabolomic profiling. However, targeted analysis of individual cyanotoxins requires specific, separate laboratory protocols for both polymerase chain reaction and later Sanger sequencing, as well as individual metabolite characterization. In addition to being time-consuming and costly, this approach is generally directed towards the analysis of conventional cyanotoxins, without taking into account the high metabolomic diversity of cyanobacteria and thus ignoring other bioactive molecules and potential sources of risk. In this context, the determination of the draft genomes of the cyanobacterial and bacterial consortium provides rapid indications of both the taxonomic nature of the populations living in aquatic ecosystems and their functional profile, with a comprehensive analysis requiring a single HTS run combined with bioinformatic analyses. The application of this approach to a D. lemmermannii bloom in Lake Garda made it possible to evaluate the taxonomic position of this species within the GTDB and ADA classification schemes, identifying a clear cluster including D. lemmermannii within ADA-2, although many uncertainties remain in the definition of the whole ADA classification system due to many gaps in the coverage of species genomes in NCBI and GTDB. Genome mining allowed the discovery of a number of genes encoding specialized functions relevant to bloom-forming heterocytous Nostocales and a set of secondary metabolites previously unknown in populations of this species developing in the southern Alpine lake district. In addition to their taxonomic and ecological relevance, the results have management implications by challenging the completeness of analyses obtained using conventional approaches. In this context, and considering that the functional analyses of genomes provide information on the presence and potential expression of genes, the results obtained should also be considered as an essential guideline to better address analytical efforts in the chemical analytical determination of metabolites of interest for potential effects on human health and the characterization of compounds of pharmaceutical interest.

Figure 1. Phylogenomic tree of Dolichospermum lemmermannii FEM_B0920 together with several Dolichospermum species of the ADA group (Anabaena, Dolichospermum and Aphanizomenon) available in the Genome Taxonomy Database (GTDB). All genome names, strain identifiers and accession numbers are taken from the NCBI taxonomy. Species names are highlighted and grouped in different colors and correspond to the NCBI (D. lemmermannii; in bold) and GTDB taxonomy (D. compactum, D. gracile, D. heterosporum, D. planctonicum, D. circinale, and D. flosaquae) (see legend). The tree was rooted with Cuspidothrix issatschenkoi CHARLIE-1 as an outgroup. UFBoot, Ultrafast bootstrap values. The scale bar indicates the number of substitutions per site. Information on the individual assembled genomes is given in Table S1.

Figure 2. Phylogenomic tree of Dolichospermum lemmermannii FEM_B0920 and Dolichospermum taxa classified at either genus or species level available in the Genome Taxonomy Database (GTDB). All genome names, strain identifiers and accession numbers are from the NCBI taxonomy. For each clade, the names in red indicate the classification given by the GTDB taxonomy (excluding the Dolichospermum FEM_B0920 genome, not included in GTDB). ADA classifications are indicated by different colour codes; ADA and ADA+ refer to the classifications given in Driscoll et al. (2018) and Dreher et al. (2021), and estimated in this paper based on membership in the same clade, respectively. In ADA-2, the Dolichospermum genomes classified as D. lemmermannii in the NCBI taxonomy are highlighted in bold. The tree was rooted with Cuspidothrix issatschenkoi CHARLIE-1 as an outgroup. UFBoot, Ultrafast bootstrap values. The scale bar indicates the number of substitutions per site. Information on the individual assembled genomes is given in Table S1.

Table 1. Physical and chemical characteristics of samples collected in three discrete epilimnetic layers at the (A) southeastern and (B) northwestern stations of Lake Garda during the Dolichospermum bloom recorded in the southeastern basin.

Table 2. Summary of statistics from the Dolichospermum lemmermannii FEM_B0920 genome assembly.

Table 3. Average Nucleotide Identity (ANI) values between the Dolichospermum lemmermannii FEM_B0920 (GCA_037075685.1) and Dolichospermum genomes from GTDB, calculated using three different ANI formulations (see text). Only results with ANIb values ≥ 0.965 were included. Descending order of values according to ANIb. All the other Dolichospermum sp000312705 not included in this table have ANIb values > 0.960.

Table 4. Megablast analysis of genes of taxonomic relevance and genes involved in geosmin biosynthesis from Dolichospermum lemmermannii FEM_B0920.