Discovery of the Streamlined Haloarchaeon Halorutilus salinus, Comprising a New Order Widespread in Hypersaline Environments across the World

ABSTRACT The class Halobacteria is one of the most diverse groups within the Euryarchaeota phylum, whose members are ubiquitously distributed in hypersaline environments, where they often constitute the major population. Here, we report the discovery and isolation of a new halophilic archaeon, strain F3-133T exhibiting ≤86.3% 16S rRNA gene identity to any previously cultivated archaeon, and, thus, representing a new order. Analysis of available 16S rRNA gene amplicon and metagenomic data sets showed that the new isolate represents an abundant group in intermediate-to-high salinity ecosystems and is widely distributed across the world. The isolate presents a streamlined genome, which probably accounts for its ecological success in nature and its fastidious growth in culture. The predominant osmoprotection mechanism appears to be the typical salt-in strategy used by other haloarchaea. Furthermore, the genome contains the complete gene set for nucleotide monophosphate degradation pathway through archaeal RuBisCO, being within the first halophilic archaea representatives reported to code this enzyme. Genomic comparisons with previously described representatives of the phylum Euryarchaeota were consistent with the 16S rRNA gene data in supporting that our isolate represents a novel order within the class Halobacteria for which we propose the names Halorutilales ord. nov., Halorutilaceae fam. nov., Halorutilus gen. nov. and Halorutilus salinus sp. nov. IMPORTANCE The discovery of the new halophilic archaeon, Halorutilus salinus, representing a novel order, family, genus, and species within the class Halobacteria and phylum Euryarchaeota clearly enables insights into the microbial dark matter, expanding the current taxonomical knowledge of this group of archaea. 
The in-depth comparative genomic analysis performed on this new taxon revealed one of the first known examples of a Halobacteria representative that encodes the archaeal RuBisCO gene and has a streamlined genome, which likely underlies its ecological success in nature and explains why it had not been isolated previously. Altogether, this research sheds light on the physiology of the members of the class Halobacteria, their ecological distribution, and their capacity to thrive in hypersaline environments.

Hypersaline ecosystems are extreme environments characterized by high salt concentrations, much greater than that of seawater and often close to salt saturation (1,2). The interest of the scientific community in hypersaline environments has increased greatly because of their value as possible analogues for life on other planets such as Mars (3-5), as well as their potential applications in biomedicine and biotechnology (6-8).
These ecosystems, whether natural or manmade, impose severe stresses on microorganisms, due not only to high salt concentration but also to other factors, such as low nutrient and oxygen availability, alkalinity, high UV irradiance, and the possible presence of toxic compounds and heavy metals (2,9). These conditions have selected for low class/family diversity, generally consisting of two major lineages, i.e., the archaeal class Halobacteria and the bacterial family Salinibacteraceae, class Rhodothermia (10,11), but with relatively high species richness (microdiversity) within each class (12,13). Members of the class Halobacteria are among the organisms best adapted to the high salt concentrations encountered in those habitats (14,15), and are widely distributed in hypersaline systems, from marine salterns to salt lakes, coastal sabkhas, salt mines, hypersaline soda lakes, hypersaline soils, and salt-fermented seafood (15-17).
Although traditional cultivation techniques have led to the isolation and description of hundreds of new groups of halophilic bacteria and archaea (19, 22-30), a great number of recent culture-independent studies of hypersaline habitats have revealed the prominence of several yet-uncultivated lineages in these systems (31-34). Most of these uncultivated groups represent dominant populations in hypersaline environments (31,32,35,36), and presumably comprise slow-growing microorganisms that are difficult to isolate. Thus, persistent efforts and improvements in cultivation methodologies are still needed to bring these uncultivated lineages into culture (37). Within this context, the term "culturomics" was coined to denote the application of high-throughput culturing efforts to recover a greater diversity of culturable microorganisms (38).
As part of a culturomics-based isolation campaign in the multi-pond, manmade Isla Cristina saltern, a novel haloarchaeon, designated strain F3-133T, was isolated in pure culture. Based on the genome-aggregate average amino acid identity (AAI; ≤51.7%) and 16S rRNA gene identity (≤86.3%), strain F3-133T was shown to be only distantly related to any previously described haloarchaea. Here, we report the characteristics of its genome sequence, its environmental distribution based on available 16S rRNA gene amplicon and metagenomic data sets, as well as the ecophysiological strategies that underlie its ecological success.

RESULTS AND DISCUSSION
Isolation of strain F3-133T by culturomics. The culturomics approach was applied with the aim of isolating abundant saltern taxa that currently lack cultured representatives and characterizing their functional roles. Based on methodological approaches that were previously successful for the isolation of other elusive microorganisms (26,40), several growth media and conditions were tested. The water samples collected from the different saltern ponds of Isla Cristina (37°12′ N 7°19′ W) were serially diluted and plated on the different culture media. Plates were subsequently incubated under different conditions (aerobic, microaerophilic, or anaerobic, in dark or light) and for long periods (up to 3 months). During this period, colonies appearing on plates were carefully observed, and those with the fastest growth and largest morphology were labeled so that they could be discarded later. Afterwards, an arduous high-throughput isolation screening of the unlabeled colonies with slow growth and tiny morphology was conducted, resulting in more than 2,000 newly isolated single colonies. These isolates were identified by 16S rRNA gene amplification. Isolates likely to represent new groups of microorganisms based on 16S rRNA gene identity were selected for further characterization. Among them, one tiny, red-pigmented colony (<0.5 mm) with extremely slow growth, designated strain F3-133T, exhibited remarkably low relatedness to any previously cultivated haloarchaea. Strain F3-133T was isolated from a culture medium containing pyruvate, which has been considered a key compound for the isolation of fastidious microorganisms (48), such as the halophilic archaeon Haloquadratum walsbyi, or the streamlined halophilic and marine bacteria Spiribacter salinus and Pelagibacter ubique, among others. Due to its very slow and limited growth under laboratory conditions, several attempts were required before strain F3-133T could be successfully subcultured. 
The purity of the strain was subsequently checked by the colony morphology in the plate, microscopy and 16S rRNA gene amplification. Once its purity was confirmed, its genome was sequenced for further analyses.
Genome-based phylogeny and taxonomy. Comparison of the complete 16S rRNA gene sequence (1,470 nt) of strain F3-133T against available sequences in public databases revealed relatively low relatedness (≤86.3% identity) to any previously cultivated microorganism. Strain F3-133T proved to be most closely related to representatives of the class Halobacteria. Based on recently recommended values (e.g., 83 to 86% and 86 to 89% for the class-level and order-level classification, respectively [49,50]), such 16S rRNA gene identity values indicate that strain F3-133T represents a new order or even a new class within the phylum Euryarchaeota.
In order to further clarify the taxonomic position of this strain, a concatenated alignment of 62 single-copy genes shared between strain F3-133T and representative members of different euryarchaeotal classes was used for an approximate maximum-likelihood tree reconstruction (Fig. 1A). Based on these results, strain F3-133T represents an independent clade, a sister taxon to the class Halobacteria, with 100% bootstrap support (Fig. 1A). Consistent with this phylogeny-based picture, strain F3-133T exhibited its highest AAI values (50 to 51%) to representatives of the class Halobacteria (Fig. 2A), whereas AAI values among Halobacteria members were substantially higher (58 to 62%) (Fig. 2A). Further comparison of strain F3-133T with all type species from all orders and families of the class Halobacteria with available genomes supported its placement as a distinct phylogenomic branch within the class (Fig. 1B), which was also consistent with the AAI values among the same genomes (Fig. 2B). Collectively, it is clear enough that the evidence from 16S rRNA gene, universal protein-coding gene phylogenies and AAI analysis consistently [...] Table S4.
Streamlined New Haloarchaeon Halorutilus salinus mSystems
described in the future (Fig. S1). The genomic sequences of representatives of these groups will provide better resolution of the class status and a more thorough taxonomic description of its extant diversity in the future.
Ecological distribution. To gain insight into the ecological distribution and relative abundance of strain F3-133T in hypersaline environments from around the world, a large number of available metagenomic and 16S rRNA gene amplicon data sets were screened (Tables S1 and S2). Among ~200 metagenomes assessed (Table S2), only three provided significant read recruitment against the F3-133T genome above the limit of detection of the metagenomic sequencing effort applied (Fig. 3). Specifically, F3-133T-like populations were present in the environment of origin, the brine of a pond of the Isla Cristina saltern (42% total salts), as well as in a solar saltern in Velddrif (South Africa; 15% total salts) and a natural saline lake in the Chilean Altiplano (34% total salts) (Fig. 3), with relative abundances of 0.1%, 0.03%, and 0.02% of the total community, respectively. These results may indicate that strain F3-133T increases its population abundance from non-detectable levels (rare biosphere) to become an abundant member of the community under its optimal environmental conditions. Therefore, these results not only revealed that strain F3-133T constitutes an abundant member of the microbial community in hypersaline environments widely distributed across the globe, but also reflected its versatility, being able to inhabit intermediate-to-high salinity systems. Moreover, the high number of reads recruiting below 95% identity in all three data sets suggested that there are likely additional, closely related, yet-to-be-described species that are sufficiently abundant to be detected in the corresponding metagenomes in these environments.
Further, screening of 16S rRNA gene amplicon or clone library data sets for close matches to the 16S rRNA gene sequence of F3-133T revealed a remarkable number of sequences (1,024 amplicons and 3 clones) related at the species (>98.6% identity), genus (>95% identity), and family (>92% identity) levels in samples from inland and coastal hypersaline environments in Australia, Romania, Tunisia, or USA (Fig. 4A). The salinities of those environments ranged from 5.1% total salts (Lake Strawbridge sediment) to 34% total salts (Bajool saltern) (Table S1), which are comparable to the salinities of the metagenomes in which strain F3-133T was found to be abundant (as specified above), further supporting the widespread presence of strain F3-133T and related species in a broad range of salinity niches. With respect to 16S rRNA gene amplicon data set matches at the genus level, the highest relative abundances were found in the sediments of Strawbridge Lake (Australia) (up to 0.9% of the total community) (Fig. 4B). However, the number of matches at the species level in this environment was much lower (Fig. 4B); species-level matches reached their highest relative abundance (0.3% of the total) in the meromictic hypersaline lake of Fara Fund (Romania). This fact, together with the metagenomic-based results mentioned above, reinforces the hypothesis that aquatic niches are the preferred habitat for this group of microorganisms. Overall, these results further corroborated the worldwide distribution of F3-133T and related species in a wide range of ecological niches associated with natural and manmade salterns, and suggested that more isolation efforts are needed to explore the extant diversity of this group of archaea.
General genome characteristics. The main genome characteristics of strain F3-133T are provided in Table S3. One of the most remarkable features is the relatively small genome size of this new taxon (2.1 Mb) compared to other characterized halophilic archaeal species. To the best of our knowledge, this is among the smallest genome sizes reported for any described member of the class Halobacteria to date (together with Halodesulfurarchaeum formicicum HSR6T, Halanaeroarchaeum sulfurireducens HSR2T, and the Halobacterium salinarum NRC-1T chromosome), with the available Halobacteria genomes averaging ~4 Mb (Table S4). Small genomes in Archaea and Bacteria are frequently associated with parasitic or symbiotic lifestyles, although free-living members in environments with low nutrient availability, such as aquatic ecosystems, can also exhibit small genome sizes in comparison to those from nutrient-rich terrestrial ecosystems (51,52). Moreover, species with smaller genomes are often associated with an auxotrophic lifestyle (51, 53). These characteristics may also explain the preference of strain F3-133T and its relatives for aquatic environments (as discussed above), as well as its narrow nutritional requirements (strain F3-133T can only be cultured in a defined medium with pyruvate, and is unable to use other compounds as carbon and energy sources) that presumably led to its fastidious growth during our culturomics isolation efforts. Further, autotrophy genes were also present in the F3-133T genome, indicating that it can be truly free-living. Moreover, a single rRNA operon was identified in the strain F3-133T genome, consistent with an oligotrophic lifestyle (54). The presence of a single rRNA operon may also be related to the slow-growth phenotype (54) exhibited by strain F3-133T, which also makes its cultivation under laboratory conditions challenging. 
Notably, the genome of strain F3-133T has a relatively low DNA G+C content (59.8 mol%) in comparison with other halophilic archaea, which usually have G+C contents well above 60 mol% (Table S4), with the only known exceptions being Haloquadratum walsbyi (47.8 mol%), Halocatena pleomorpha SPP-AMP-1T (57.1 mol%), Halonotius pteroides CECT 7525T (59.5 mol%), and representatives of the Candidatus phylum Nanohaloarchaea (~40 mol%) (55).
The genome of strain F3-133T revealed a very low percentage of non-coding sequences and outstandingly short median and average intergenic spacers (Fig. S2) in comparison to any previously cultivated Halobacteria. Altogether, these unique features indicate a streamlined genomic strategy for this strain (51,56). Specifically, under some circumstances, selection may lead to a reduction in genomic complexity in organisms that have large effective population sizes and resting cell stages or periods without significant growth, resulting in ecologically successful strategies in nature, as for Pelagibacter ubique (57-59). Advances in multi-omics techniques have revealed that the number of streamlined genomes in nature was previously underestimated because of their fastidious growth under laboratory conditions (60). In addition, genome reduction is frequently associated with unusual nutritional requirements that necessitate specific cultivation approaches (56). In fact, strain F3-133T was isolated using a culture medium similar to the ones used for growing other streamlined organisms. Indeed, the streamlined genome of strain F3-133T probably accounts, at least in part, for the remarkable abundance and worldwide distribution of this new taxon in different hypersaline environments, but also for the previously unsuccessful attempts to isolate it. Lastly, organisms with small genomes are frequently associated with relatively unchanging (stable) ecological niches, in contrast with fluctuating niches, for which sigma factors play a crucial role (56). No sigma factors were identified in the isolate genome, consistent with its likely preferred distribution in aquatic ecosystems.
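The two compactness measures discussed above (fraction of coding sequence and intergenic spacer length) can be illustrated with a minimal sketch. It assumes gene coordinates on a single linear contig given as 1-based inclusive (start, end) pairs; the function name `genome_compactness` and the input layout are hypothetical and are not taken from the tools used in this study.

```python
from statistics import mean, median


def genome_compactness(genes, genome_length):
    """Summarize coding density and intergenic spacer lengths.

    genes: iterable of (start, end) tuples, 1-based inclusive, one contig.
    genome_length: contig length in bp.
    Assumption: strand is ignored and overlapping genes are merged first,
    so that no base is counted twice as coding.
    """
    merged = []
    for start, end in sorted(genes):
        if merged and start <= merged[-1][1]:
            # Overlapping gene: extend the current coding block.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    coding_bp = sum(end - start + 1 for start, end in merged)
    # Spacer = bases strictly between two consecutive coding blocks.
    spacers = [nxt[0] - cur[1] - 1 for cur, nxt in zip(merged, merged[1:])]

    return {
        "coding_fraction": coding_bp / genome_length,
        "median_spacer": median(spacers) if spacers else 0,
        "mean_spacer": mean(spacers) if spacers else 0,
    }


if __name__ == "__main__":
    toy_genes = [(1, 100), (151, 250), (241, 300), (401, 500)]
    print(genome_compactness(toy_genes, genome_length=500))
```

In this toy contig the second and third genes overlap and are merged, leaving two spacers of 50 and 100 bp. A circular chromosome would additionally need the spacer between the last and first gene, which this sketch does not handle.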
Central metabolism. Based on the bioinformatically annotated genome, strain F3-133 T encodes genes involved in gluconeogenesis, tricarboxylic acid cycle, and glyoxylate cycle (central carbohydrate metabolism; Fig. 5). However, as previously reported for other haloarchaea (40,42,61,62), the gene coding for 6-phosphofructokinase involved in the lower part of the classical Embden-Meyerhof-Parnas (EM) pathway of glycolysis is absent in strain F3-133 T , which could be alternatively replaced by the oxidative pentose phosphate pathway (Fig. 5) (63). Strain F3-133 T also possesses the pyruvate ferredoxin oxidoreductase genes (porA and porB) involved in the oxidation of pyruvate to acetyl-CoA (Fig. 5).
FIG 4 (A) Global distribution of strain F3-133T and related organisms in hypersaline environments based on 16S rRNA gene amplicon and clone library data sets. Circles denote the geographic location of the related 16S rRNA gene sequences found in different amplicon data sets, while squares represent the sequences from clone library data sets. The star symbol indicates the place where strain F3-133T was isolated. (B) Abundance of 16S rRNA gene sequences related to F3-133T relative to all amplicon sequences in the corresponding data sets (at the geographic locations highlighted with circles in [A]), assigned to the same family, genus, and species as strain F3-133T based on the identity thresholds described in the Materials and Methods section. Note that in some samples same-species sequences dominated the relatives found (e.g., Fara Fund Lake, Romania), while in other samples the related sequences were mostly assigned to different species and genera than F3-133T (e.g., Strawbridge Lake, Australia). RNA indicates transcriptomic data, while by default genomic data is used.
With respect to nitrogen metabolism, the genes encoding nitrite reductases, i.e., nirA, nirB, and nirK, involved in assimilatory and dissimilatory nitrate reduction and denitrification, respectively, were all identified in the genome (Fig. 5). However, the remaining genes involved in these pathways, such as narB, narGHI, norBC, or nosZ, were not found, indicating that F3-133T probably performs only specific steps of denitrification (i.e., in a modular fashion). Strain F3-133T also encodes the high-affinity ammonium transporter (Amt) for ammonia uptake and the ABC transporter for the nitrogen-derived compound spermidine (Fig. 5). The genes responsible for ammonia assimilation, glutamine synthetase and glutamate synthase, were also found in the genome (Fig. 5). Complete biosynthesis pathways of several amino acids (i.e., arginine, histidine, isoleucine, lysine, ornithine, proline, serine, threonine, and valine) were likewise identified. Consistent with its physiological characterization, genes encoding the archaellum were found in the genome (Fig. 5). Inorganic phosphate might be incorporated into the cell via an ABC transporter (pstSCAB). Rhodopsin-like sequences such as a sensory rhodopsin and a halorhodopsin were also identified in the genome (Fig. 5), suggesting metabolic flexibility under illuminated conditions. Additionally, strain F3-133T possesses genes encoding enzymes of the fatty acid β-oxidation pathway (Fig. 5). Strain F3-133T encodes type III-like RuBisCO, and the associated AMP phosphorylase and ribose 1,5-bisphosphate isomerase, as part of the nucleotide monophosphate degradation pathway, which links nucleoside catabolism to glycolysis-gluconeogenesis (Fig. 5). The nucleotide monophosphate degradation pathway has been proposed in Archaea as part of a cyclic CO2 fixation pathway (61,64), although this matter remains unclear, and further examination will be necessary to understand its functional role in strain F3-133T. On the other hand, this pathway has been suggested to be a relic of ancient heterotrophy, considering that ribose was likely the most abundant sugar available on early Earth (65). The nucleotide monophosphate degradation pathway has been previously described for other archaeal representatives (i.e., representatives of the Ca. Asgardarchaeota, the methanogenic Methanonatronarchaeia, the hyperthermophilic archaeon Thermococcus kodakarensis, and the DPANN superphylum [66-69]). Although several halophilic archaea encode ribose 1,5-bisphosphate isomerase (64), the presence of type III-like RuBisCO has only recently been reported in members of this group (70), making strain F3-133T one of the first known examples. Unexpectedly, the type III RuBisCO of strain F3-133T exhibited higher similarity to sequences from the archaeal phyla Candidatus Nanohaloarchaea or Division MSBL-1 than to those of Halobacteria class representatives. 
This fact reinforces the differences between strain F3-133T and the Halobacteria class clade and, together with the distant phylogenomic position discussed above, supports the hypothesis that this new taxon might even constitute a new class within the phylum Euryarchaeota. Future analyses and the discovery of new representatives of this group will be needed to corroborate these results and interpretations.
Adaptation to extreme salinity. Considering the environmental distribution of strain F3-133T and closely related sequences at different salinity ranges, the isoelectric points (pIs) and amino acid frequencies of the predicted proteins annotated in the genome of strain F3-133T were calculated to shed light on its osmoregulatory adaptation mechanism. The proteomes of the salt-in strategists Haloquadratum walsbyi, Halorubrum saccharovorum, and Salinibacter ruber, the salt-out strategist Spiribacter salinus, and the non-halophile Escherichia coli were used for comparison. The results showed that the pI profile of strain F3-133T was similar to those of the salt-in Halobacteria representatives and Salinibacter ruber, with a single peak at around 4.0, indicating a predominance of acidic residues (Fig. 6A). Accordingly, a prevalence of acidic amino acids (i.e., aspartate and glutamate) over basic amino acids (i.e., arginine and lysine) was observed for strain F3-133T, as well as for the other salt-in strategists analyzed (Fig. 6B). Therefore, it appears that strain F3-133T employs the salt-in strategy for balancing the osmotic pressure of saline environments.
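The comparison of acidic and basic residue frequencies described above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the EMBOSS pepstats procedure used in this study; the function name `acidic_basic_fractions` and the toy sequences are hypothetical.

```python
from collections import Counter

STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids
ACIDIC = ("D", "E")  # aspartate, glutamate
BASIC = ("K", "R")   # lysine, arginine


def acidic_basic_fractions(proteins):
    """Return (acidic_fraction, basic_fraction) over a whole proteome.

    proteins: iterable of amino acid sequences (one-letter code).
    Non-standard symbols (e.g. '*', 'X') are ignored in the totals.
    """
    counts = Counter()
    for seq in proteins:
        counts.update(res for res in seq.upper() if res in STANDARD)

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acid residues found")

    acidic = sum(counts[r] for r in ACIDIC) / total
    basic = sum(counts[r] for r in BASIC) / total
    return acidic, basic


if __name__ == "__main__":
    toy_proteome = ["MDEDK", "AEER*"]
    acidic, basic = acidic_basic_fractions(toy_proteome)
    print(f"acidic: {acidic:.3f}  basic: {basic:.3f}")
```

A proteome in which the acidic fraction clearly exceeds the basic fraction is the pattern that the text associates with salt-in strategists; what counts as a meaningful excess is not specified here and would have to come from comparison with reference proteomes.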
In good agreement with a (hypothesized) salt-in osmoprotection mechanism (71,72), several genes encoding secondary transporters involved in Na+ extrusion, K+ uptake, and Cl− homeostasis were also found in the genome of strain F3-133T (Fig. 5). Moreover, no genes related to the biosynthesis or transport of compatible solutes were identified, consistent with the hypothesis that a salt-out strategy is not used by strain F3-133T. Lastly, a small-conductance mechanosensitive channel from the MscS family, which confers significant protection against rapid hypoosmotic shock (i.e., rapid transition from high to moderate salinity environments) (73), was also detected in the genome.
Physiology of strain F3-133T. We carried out a detailed phenotypic characterization of the new isolate following the recommended minimal standards for describing new taxa of the class Halobacteria (74). These features included morphological, physiological, biochemical, and nutritional characteristics, as well as the determination of the membrane polar lipids, which have proven to be an important feature for the characterization of haloarchaeal genera (74,75). These results are shown in Table S5, Fig. S3 and S4, and in the new genus and species descriptions detailed below. Overall, these results are in good agreement with the metabolic behavior inferred from the genomic and metagenomic analyses, such as the preference and tolerance for a range of salt concentrations and the stringent nutritional requirements. Notably, strain F3-133T exhibits a very simplified chemotaxonomic profile, which is also consistent with its distant phylogenomic relationship with other members of the class Halobacteria. Further, the presence of the double chain length C20,C25 derived from PGP-Me as a chemotaxonomic feature (Fig. S4) is not typical of neutrophilic haloarchaeal species, which might also suggest the adaptability of this strain to environments with pH fluctuation. In the same way, the absence of glycolipids, as in strain F3-133T (Fig. S4), is a typical chemotaxonomic characteristic of haloalkaliphilic archaea. Indeed, glycolipids are associated with bacteriorhodopsin sequences, which were consistently absent from the genome of strain F3-133T (Fig. 5).
Based on the taxogenomic comparison of strain F3-133 T to available genomes and its physiology, we propose a new order, family, genus, and species within the class Halobacteria with the names Halorutilales ord. nov., Halorutilaceae fam. nov., Halorutilus gen. nov., and Halorutilus salinus sp. nov. to accommodate the new isolate as detailed below.
Description [...] (Fig. S3). Colonies are circular, entire, and intensely red-pigmented, with a diameter of 1.5 mm on solid 25% salts medium after 21 days at 37°C. Growth occurs optimally at 25% (wt/vol) NaCl, pH 7.5, and 37°C. Nitrate and nitrite are reduced. The DNA G+C content is 59.8 mol% (genome). Other features are detailed in the genus description and in Table S5. The type strain is F3-133T (= CCM 9157T = JCM 33312T), isolated from a water sample of a pond from Isla Cristina saltern (Huelva, Spain).
The GenBank/EMBL/DDBJ accession number for the 16S rRNA gene sequence of Halorutilus salinus F3-133 T is MK182271, and that of the complete genome is RKLV00000000.
Description of Halorutilaceae fam. nov. Halorutilaceae (Ha.lo.ru.ti.la.ce'ae. N.L. masc. n. Halorutilus the type genus of the family; L. suff. -aceae ending to denote a family; N.L. fem. pl. n. Halorutilaceae the family the nomenclatural type of which is the genus Halorutilus).
The properties of the family are the same as for the representative genus Halorutilus. Currently, the family is monotypic and the type genus is Halorutilus. Phylogenetically affiliated to the class Halobacteria.
Description of Halorutilales ord. nov. Halorutilales (Ha.lo.ru.ti.la'les. N.L. masc. n. Halorutilus the type genus of the order; L. fem. pl. n. suff. -ales ending to denote an order; N.L. fem. pl. n. Halorutilales the order the nomenclatural type of which is the genus Halorutilus).
The properties of the order Halorutilales are the same as for the representative genus Halorutilus. The type genus of the order is Halorutilus.

MATERIALS AND METHODS
Isolation source and culture conditions of strain F3-133T. Strain F3-133T was isolated from a pond water sample (23% wt/vol salinity and pH 7.5) collected from the Isla Cristina solar saltern (Huelva, Spain, 37°21' N 7°33'W) during an extensive sampling campaign carried out in June 2016. Following the culturomics methodology previously described (40), and after several attempts, strain F3-133T was obtained in pure culture on a solid medium containing the following salt mixture (g/L): NaCl, 195; MgCl2·6H2O, 32.5; MgSO4·7H2O, 50.8; CaCl2, 0.83; KCl, 5.0; NaHCO3, 0.21; NaBr, 0.58, and supplemented with pyruvate (1 g/L) and casein digest (5 g/L). The pH was adjusted to 7.5, and purified agar (Oxoid) was added when necessary. This medium was also used for routine growth of the strain. For long-term preservation, cultures were maintained at −80°C in this medium containing 20% (vol/vol) glycerol.
Physiological characterization. The phenotypic characterization of strain F3-133T was performed following the methodology previously described by Durán-Viseras et al. (43), and according to the minimal standards established for the taxonomic description of novel taxa of the class Halobacteria (74). The polar lipid profile of strain F3-133T was determined by high-performance thin-layer chromatography (HPTLC) and revealed using 5% H2SO4 (in water) as the spray reagent (43). Halobacterium salinarum DSM 3754T and Halorubrum saccharovorum DSM 1137T were used as reference species for polar lipid characterization.
Genomic DNA extraction, sequencing, and data curation. Genomic DNA of strain F3-133 T was extracted, purified, and quantified following the methodology previously described by Durán-Viseras et al. (39). The 16S rRNA gene was amplified by PCR using the primer pairs ArchF/ArchR (76,77) and sequenced by Sanger technology at StabVida (Caparica, Portugal). The whole genome shotgun sequencing of strain F3-133 T was also performed by StabVida (Caparica, Portugal) using a 150 bp paired-end sequencing strategy, on the Illumina HiSeq 4000 platform. Sequencing reads were quality filtered and trimmed using BBTools v.38.44 (https://sourceforge.net/projects/bbmap/), and then assembled with SPAdes v.3.12.0 (78). CheckM v1.0.5 (79) and Quast v2.3 (80) were used for assembly quality checks. Genes were predicted on the assembled contigs using Prodigal (81), and the predicted genes were then annotated with Prokka (82). BlastKOALA (83) was used to assign KO identifiers (K numbers) to orthologous genes present in the genomes analyzed and, subsequently, mapped to the KEGG pathways and KEGG modules (84) in order to perform metabolic pathway reconstructions. The Average Amino acid Identity (AAI) was determined for all-versus-all genome pairs using the tool AAI-Matrix from the Enveomics collection (85). Isoelectric points and amino acid frequencies of predicted proteins were estimated using the tools iep and pepstats, respectively, from the EMBOSS package (86).
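To make the AAI measure concrete, the sketch below computes a two-way AAI from precomputed best protein matches between two genomes. It is an assumption-laden illustration, not the AAI-Matrix implementation from the Enveomics collection: the input dictionaries, the function name `two_way_aai`, and the choice to average the two directional identities of each reciprocal pair are all hypothetical simplifications, and no identity or alignment-length filters are applied.

```python
def two_way_aai(best_ab, best_ba):
    """Average amino acid identity over reciprocal best matches.

    best_ab: dict mapping each protein of genome A to a tuple
             (best_matching_protein_in_B, percent_identity).
    best_ba: the same, searched in the opposite direction.
    Returns (aai_percent, number_of_reciprocal_pairs).
    """
    identities = []
    for protein_a, (protein_b, identity_ab) in best_ab.items():
        back = best_ba.get(protein_b)
        # Keep the pair only if B's best match points back to the same A protein.
        if back is not None and back[0] == protein_a:
            identities.append((identity_ab + back[1]) / 2)

    if not identities:
        raise ValueError("no reciprocal best matches between the two genomes")
    return sum(identities) / len(identities), len(identities)


if __name__ == "__main__":
    a_vs_b = {"a1": ("b1", 52.0), "a2": ("b2", 48.0), "a3": ("b9", 90.0)}
    b_vs_a = {"b1": ("a1", 50.0), "b2": ("a2", 50.0), "b9": ("a7", 91.0)}
    aai, pairs = two_way_aai(a_vs_b, b_vs_a)
    print(f"AAI = {aai:.1f}% over {pairs} reciprocal pairs")
```

In the toy input, the a3/b9 match is discarded because it is not reciprocal, which is why a high one-directional identity does not inflate the estimate.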
16S rRNA gene and universal protein-coding gene phylogenetic analysis. The 16S rRNA gene sequence of strain F3-133 T was compared to the sequences in the EzBioCloud server (87) and SILVA REF 138 database (88) to get the sequence identity to previously described taxa, and to identify related sequences for phylogenetic comparisons. The 16S rRNA gene sequences of the type strains of the type species of all genera of the class Halobacteria, and the identified related sequences were downloaded from GenBank/EMBL/DDBJ and SILVA databases, respectively.
For the 16S rRNA gene-based phylogenetic analysis, almost complete 16S rRNA gene sequences were aligned using the software MAFFT (89). Subsequently, the phylogenetic tree was constructed with the RAxML software (90).
For the phylogenomic analysis, core orthologous genes were determined using an all-versus-all BLASTp comparison among the translated CDS features of the annotated genomes under study, as implemented in the Enveomics collection toolbox (85). Afterwards, the translated single-copy core gene sequences were individually aligned with Muscle (91) and concatenated into a super-protein alignment. FastTreeMP v.2.1.8 (92) was used for the phylogenomic tree reconstruction by means of the approximately maximum-likelihood algorithm.
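The concatenation step described above can be sketched as follows, assuming each per-gene alignment is already available as a mapping from genome identifier to aligned sequence. The function name `concatenate_alignments` and the gap padding for genomes missing from an alignment are illustrative assumptions; with strictly single-copy core genes, as used here, no padding would be needed.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-gene alignments into one super-alignment per genome.

    alignments: list of dicts {genome_id: aligned_sequence}, one dict per gene,
                in the order in which the genes should be concatenated.
    Genomes absent from a given gene alignment are padded with gap characters.
    """
    genomes = sorted(set().union(*alignments))
    parts = {genome: [] for genome in genomes}

    for alignment in alignments:
        widths = {len(seq) for seq in alignment.values()}
        if len(widths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = widths.pop()
        for genome in genomes:
            parts[genome].append(alignment.get(genome, gap * width))

    return {genome: "".join(chunks) for genome, chunks in parts.items()}


if __name__ == "__main__":
    gene1 = {"A": "MK-L", "B": "MKAL"}
    gene2 = {"A": "GG", "B": "G-", "C": "GA"}
    for genome, seq in concatenate_alignments([gene1, gene2]).items():
        print(genome, seq)
```

The resulting dictionary can be written out as a single FASTA file and passed to a tree-building program; that step is omitted here.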
Ecological prevalence and estimated abundance of strain F3-133T in saline habitats. The abundance of strain F3-133T in available metagenomes was assessed by read fragment recruitment plots (94). For this, the genome sequence of strain F3-133T was searched against all metagenomic reads from each data set (Table S2) using stand-alone BLASTn. Read recruitment plots were constructed using the Enveomics collection (85) based on the BLAST best matches (with a cutoff of 70% query coverage).
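The read recruitment logic can be approximated with the following sketch, which keeps one best match per read, applies the 70% coverage cutoff, and separates reads at or above 95% identity (the value referred to in the Results when discussing same-species reads) from more divergent ones. The record layout, the function name `recruit_reads`, the use of bit score to choose the best match, and the simple read-fraction abundance are assumptions made for illustration; they do not reproduce the Enveomics scripts or any genome-size normalization.

```python
def recruit_reads(hits, total_reads, min_coverage=0.70, species_identity=95.0):
    """Summarize read recruitment against a reference genome.

    hits: iterable of (read_id, percent_identity, alignment_length,
          read_length, bit_score) tuples, possibly several per read.
    total_reads: number of reads in the metagenome.
    Assumption: the coverage filter is applied before picking the best match.
    """
    best = {}  # read_id -> (identity, bit_score) of the best passing match
    for read_id, identity, aln_len, read_len, bit_score in hits:
        if aln_len / read_len < min_coverage:
            continue
        if read_id not in best or bit_score > best[read_id][1]:
            best[read_id] = (identity, bit_score)

    same_species = sum(1 for identity, _ in best.values() if identity >= species_identity)
    related = len(best) - same_species
    return {
        "same_species_reads": same_species,
        "related_reads": related,
        "same_species_fraction": same_species / total_reads,
        "related_fraction": related / total_reads,
    }


if __name__ == "__main__":
    toy_hits = [
        ("r1", 99.0, 150, 150, 270.0),
        ("r1", 90.0, 150, 150, 200.0),  # weaker second match, ignored
        ("r2", 88.0, 120, 150, 150.0),  # passes coverage, below 95% identity
        ("r3", 99.0, 60, 150, 100.0),   # fails the 70% coverage cutoff
    ]
    print(recruit_reads(toy_hits, total_reads=1000))
```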
The identification and geographical distribution based on 16S rRNA gene data were determined as follows: first, the online tool IMNGS (95) was used to identify the publicly available amplicon project data sets containing 16S rRNA gene sequences related to strain F3-133T. The 16S rRNA gene of strain F3-133T was screened against all amplicon project data sets (Table S1) using BLASTn. Only BLAST best matches with a minimum query size of 250 bp, 70% query coverage, and identity ≥92, 95, and 98.6% (for family, genus, and species level relatedness, respectively) were considered. The geographical locations of the data sets that provided matching sequences are displayed on the map (Fig. 4A), and the relative abundances of matches at family, genus, and species level relatedness in each data set are shown in Fig. 4B.
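The filtering and rank assignment described above can be written as a small decision function. The thresholds (250 bp, 70% coverage, and ≥92, 95, and 98.6% identity) are taken from the text; the function name `rank_of_match` and its argument layout are hypothetical.

```python
# Identity thresholds from the text, ordered from most to least specific rank.
RANK_THRESHOLDS = (("species", 98.6), ("genus", 95.0), ("family", 92.0))


def rank_of_match(identity, alignment_length, query_length,
                  min_query_length=250, min_coverage=0.70):
    """Return the most specific rank supported by a best match, or None.

    identity: percent identity of the best BLASTn match.
    alignment_length, query_length: lengths in bp.
    """
    if query_length < min_query_length:
        return None
    if alignment_length / query_length < min_coverage:
        return None
    for rank, threshold in RANK_THRESHOLDS:
        if identity >= threshold:
            return rank
    return None


if __name__ == "__main__":
    examples = [(99.0, 250, 300), (96.2, 400, 420), (92.0, 300, 300), (91.9, 300, 300)]
    for identity, aln_len, query_len in examples:
        print(identity, rank_of_match(identity, aln_len, query_len))
```

Because the thresholds are nested, a sequence counted at the species level also satisfies the genus and family criteria; whether the abundances in Fig. 4B are cumulative or exclusive per rank is not stated here, so the sketch simply reports the most specific rank.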

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.