Trophic Specialization Results in Genomic Reduction in Free-Living Marine Idiomarina Bacteria

The streamlining hypothesis is usually used to explain the genomic reduction events in free-living bacteria like SAR11. However, we find that the genomic reduction phenomenon in the bacterial genus Idiomarina is different from that in SAR11. Therefore, we propose a new hypothesis, based on trophic specialization, to explain genomic reduction in this genus; reduction driven by such specialization may not be uncommon in nature. Not only can the trophic specialization hypothesis explain the genomic reduction in the genus Idiomarina, but it also sheds new light on our understanding of the genomic reduction processes in other free-living bacterial lineages.

database [https://www.ncbi.nlm.nih.gov/genome/]). On the one hand, bacterial genome size can increase through horizontal gene transfer and gene duplication (3–6). Bacteria that have larger genomes are assumed to be versatile in their lifestyles and are presumably adapted to changing environments. On the other hand, bacterial genome size can decrease (genomic reduction) by losing genes, and bacteria that have small genomes seem to prefer stable habitats (2). Generally, there are two kinds of genomic reduction in bacteria. One kind is related to the host-associated bacteria such as parasites and symbionts of animals and plants. Another kind is related to the free-living bacteria, such as the cultured marine bacteria Pelagibacter and Prochlorococcus (2, 7) as well as several metagenome-derived uncultured bacteria (8–11).
The streamlining hypothesis is commonly used to explain the genomic reduction events in free-living bacteria, such as the marine bacteria Pelagibacter and Prochlorococcus living in stable, nutrient-poor environments (12–14). Genome streamlining is regarded as an adaptive genomic reduction process in which selection for metabolic efficiency could drive genome reduction. In stable, nutrient-poor environments, genomic reduction will have adaptive advantages, as losing sophisticated and costly regulatory and accessory machinery can reduce the metabolic burden of the cell, for example by lowering the requirement for nitrogen and phosphorus, two elements that are rare in surface seawater. The common characteristics of streamlined organisms include small cell and genome size, a low G+C content, extremely reduced intergenic spacers, loss of accessory machinery, and very low numbers of paralogs, pseudogenes, phage genes, and regulatory genes.
Research on the streamlining hypothesis has mainly focused on the well-studied marine free-living bacterial groups, such as Prochlorococcus, roseobacters, SAR11, and SAR86. Free-living bacteria that have reduced genomes are predicted to be common in nature (15,16). Our current understanding of the relationship between bacterial genome size and their environmental adaptation relies on too few species. It is still unclear whether there are other types of genomic reduction in the free-living bacteria. Lifestyle could dramatically affect bacterial genomes, but the relationships between genome size and trophic lifestyles are complex, and generalizations are difficult to make (17,18). Studying more natural genomic reduction cases will undoubtedly improve our understanding of the evolutionary forces that drive this process in free-living bacteria.
The free-living bacterial genus Idiomarina belongs to the order Alteromonadales of the well-studied class Gammaproteobacteria (19). At this point, more than 10 genomes of strains from this genus have been deposited into public databases. All the sequenced genomes had a size less than 3 Mbp, much smaller than that of the genomes from other genera of the order Alteromonadales (all with an average genome size larger than 4 Mbp). This shows that the genomes of the genus Idiomarina had suffered dramatic genomic reduction. By comparative genomic and physiological studies, we find that the genomic reduction pattern in this genus is distinct from the pattern in the classic lineages like SAR11. We here propose a new hypothesis that trophic specialization can result in genomic reduction in the free-living bacteria of this genus.

RESULTS AND DISCUSSION
Extensive genomic reduction in the genus Idiomarina. Most species of the genus Idiomarina were isolated from deep-sea or high-salinity marine environments, and several genomes of this genus have been deposited into public databases (20–22). Ten strains of this genus with defined isolation information, nine of which belong to formally described species, are selected in this study. The bacterium Idiomarina sp. strain X4 (hereafter called X4) was isolated from the deep-sea sediment of the South China Sea at a water depth of 2,500 m by our laboratory, and we sequenced its complete genome. The average nucleotide identity and percentage of conserved protein values between strain X4 and its closest relative Idiomarina zobellii KMM 231 are 86.6% and 85.2%, respectively. On the basis of these two genomic classification standards (23,24), X4 should belong to the genus Idiomarina but represent a new species. Because strain X4 is kept in our laboratory, this strain is also included in comparative genomic analyses and used for further physiological analyses. The size of the 10 Idiomarina genomes ranges from 2.44 to 2.84 Mbp with an average size of 2.62 Mbp, which is much smaller than that of the genomes from all the other studied genera of the order Alteromonadales (with average genome size of 4.19 to 7.57 Mbp) (Fig. 1A). This shows that the strains of the Idiomarina genus have undergone extensive genomic reduction in their evolutionary history, with 37% to 65% genome size reduction compared to other genera. As no parasitic or commensal characteristics of this genus have been observed, the genus offers a unique opportunity to investigate the genomic reduction process in free-living bacteria.
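The 37% to 65% range follows from the average genome sizes quoted above. A minimal sketch of that arithmetic is given here, assuming the reduction is expressed relative to the average genome size of each comparison genus; the function name and constants are illustrative and are not taken from the study's own scripts.

```python
def relative_reduction(reduced_mbp: float, reference_mbp: float) -> float:
    """Fractional size reduction of a genome relative to a reference genome size."""
    return 1.0 - reduced_mbp / reference_mbp


# Values quoted in the text (Mbp).
IDIOMARINA_MEAN_MBP = 2.62           # average of the 10 Idiomarina genomes
REFERENCE_MEANS_MBP = (4.19, 7.57)   # smallest and largest genus averages in Alteromonadales

for reference in REFERENCE_MEANS_MBP:
    percent = 100 * relative_reduction(IDIOMARINA_MEAN_MBP, reference)
    print(f"relative to {reference:.2f} Mbp: {percent:.1f}% smaller")
```

Rounded to whole percentages, the two values reproduce the lower and upper bounds reported in the text.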
The genomic reduction pattern in the Idiomarina genus is different from that of the classical SAR11 lineage. The genomic reduction processes and characteristics were well studied in the SAR11 lineage like strain "Candidatus Pelagibacter ubique" HTCC1062 (hereafter called HTCC1062) (7). We then investigated whether Idiomarina species had genomic traits similar to those of HTCC1062. Besides HTCC1062, three Idiomarina species and three strains with larger genomes in the order Alteromonadales that were also isolated from the deep-sea environment are included for comparison (Table 1). The formally identified strain Paraglaciecola chathamensis S18K6 (hereafter called S18K6) that is a close relative to the Idiomarina genus is further used for physiological comparison (25,26). The comparison results are somewhat contradictory. The Idiomarina species have some genomic characteristics similar to those of the SAR11 lineage, such as a high percentage of coding regions, no CRISPR site, and a low number of paralog clusters. However, the Idiomarina species have some genomic characteristics that are different from those of the SAR11 lineage and some genomic characteristics that are similar to those of relatives with larger genomes, such as high G+C content, high numbers of rRNA operons, regulatory sigma factors, and mismatch repair proteins, and high percentage of extracellular proteins, like peptidase. The genomic analyses show that the genomic traits of the genus Idiomarina are partially different from those of the SAR11 lineage.
FIG 1 (A) The tree was constructed based on concatenated alignment of 685 single-copy orthologous proteins shared by all the genomes using the neighbor-joining method with 1,000 bootstrap replications. Genome size is shown in megabase pairs (Mb). The numbers in parentheses are the number of genomes used to calculate the average genome size of the genus. This tree was used to infer gene gain and loss events in the Idiomarina genus. LUCA indicates the position of the last universal common ancestor of the Idiomarina genus. The strains used for physiological analyses, namely, Idiomarina sp. strain X4 and Paraglaciecola chathamensis S18K6, are shown in red. All the species except strain X4 were formally identified. (B and C) Cells (B) and growth curves (C) of strains X4 and S18K6 cultured in 2216E medium.
In the SAR11 lineage, genomic reduction was usually accompanied by a low growth rate, small cell size, and loss of accessory structures of bacterial cells, such as flagella. However, the cells of identified Idiomarina species are usually 1 to 2 μm long and 0.5 to 1 μm wide (19, 27–29). The cell size of strain X4 (~1 by 2 μm) falls in this range, which is comparable to the cell size of strain S18K6 (Fig. 1B) and much larger than the cell size of strain HTCC1062 (~0.4 μm in diameter). In addition, a polar flagellum is observed for strain X4, which is consistent with the finding that all identified Idiomarina species have a polar flagellum and are motile in liquid environments (19, 27–29). When grown in rich medium, strain X4 can reach a high OD600 value at a rate comparable with that of strain S18K6 (Fig. 1C). The growth traits of strains X4 and S18K6 are obviously different from those of strain HTCC1062, which reaches only undetectable OD600 values in both natural and laboratory environments. Therefore, the high growth rate and cell morphology of Idiomarina species are clearly different from those of the SAR11 lineage. Taken together, the genomic and physiological characteristics of the Idiomarina genus are different from those of the classical SAR11 lineage, and a new hypothesis is therefore needed to explain the different genomic reduction phenomenon in the genus Idiomarina.
Although adaptive selection has usually been used to explain streamlined genomic reduction, Luo et al. recently proposed that genetic drift could lead to ancient genomic reduction in some marine bacterial lineages, such as Prochlorococcus, roseobacters, and SAR86 (30). They found that the ratio of radical (dR) to conservative (dC) nonsynonymous nucleotide substitutions was elevated in the streamlined lineages compared to their relatives with larger genomes. We then tested the effect of genetic drift on genomic reduction in the genus Idiomarina using the methods described by Luo et al. (30). We find that compared to the control clade, the dR/dC ratio is not significantly inflated in the Idiomarina genus (P > 0.05 [see Fig. S1 in the supplemental material]), suggesting that there is no excess of radical amino acid changes in the ancestral branch giving rise to the Idiomarina genus compared to its relatives with larger genomes. This result also shows that the mechanism and process of genomic reduction in the Idiomarina genus are different from those of other well-studied marine free-living bacterial lineages.
Reconstruction of the genomic reduction process in the genus Idiomarina. The gene gain and loss events were reconstructed to investigate the genomic reduction process in Idiomarina. The last universal common ancestor (LUCA) (containing 2,164 gene clusters) of the genus Idiomarina had gained 135 gene clusters and lost 685 gene clusters, showing that the LUCA of this genus had already undergone great genomic reduction. Approximately 45% and 64% of the gained and lost gene clusters, respectively, can be assigned a COG function.
The COG functional analyses show that high proportions of the genes lost are related to energy production and conversion (COG category C), carbohydrate transport and metabolism (G), cell wall/membrane/envelope biogenesis (M), and defense mechanisms (V), while the genes gained are enriched in functions of amino acid transport and metabolism (E), general predicted function (R), and intracellular trafficking, secretion, and vesicular transport (U) (Fig. 2A). In particular, note that 38 gene clusters related to carbohydrate transport and metabolism (category G) were lost, while no genes related to this function were gained.
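The gain and loss counts reported for the LUCA allow some simple bookkeeping. The following is a rough sketch, assuming gene cluster counts are additive along a branch (parent content = child content − gained + lost); that additivity and all names in the code are illustrative assumptions, and the COG-assignable counts are only approximations derived from the rounded percentages in the text.

```python
def branch_summary(child_clusters: int, gained: int, lost: int) -> dict:
    """Summarize one branch, assuming additive gain/loss bookkeeping."""
    return {
        "net_change": gained - lost,
        # Assumption: the parent node held the child's clusters minus gains plus losses.
        "implied_parent_clusters": child_clusters - gained + lost,
    }


# Values reported in the text for the Idiomarina LUCA.
summary = branch_summary(child_clusters=2164, gained=135, lost=685)
print(summary)

# Approximate numbers of clusters with a COG assignment (about 45% and 64%).
approx_cog_gained = round(0.45 * 135)
approx_cog_lost = round(0.64 * 685)
print(approx_cog_gained, approx_cog_lost)
```

Under this assumption, the branch leading to the LUCA shows a net loss of 550 gene clusters.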
Gene gain and loss events are further analyzed in the evolutionary paths from the LUCA to the Idiomarina loihiensis L2TR and X4 termini. A striking feature of these evolutionary paths is that both strains continued to lose carbohydrate metabolism genes in the genomic reduction process (Fig. 2B and C). Consistent with this, there are only 25 or 26 carbohydrate-active enzyme sequences in the Idiomarina genomes compared, even fewer than in strain HTCC1062 and far fewer than in related strains with larger genomes (Table 1). The Idiomarina genomes also do not harbor genes encoding high-molecular-weight sugar polymer-degrading enzymes that are encoded in other strains (Table 1). Meanwhile, both strains consistently gained genes related to amino acid transport and metabolism (Fig. 2B and C). For example, the percentage of the peptidases that can degrade proteinaceous substrates and initiate amino acid utilization is extremely high in the Idiomarina genomes (Table 1). The Idiomarina genomes encode 15 to 20 peptidases (belonging to metallo, serine, and threonine families) predicted to be located outside the cell membrane. The activity of peptidases can produce amino acids or peptides. Accordingly, genes encoding the transporters to absorb small peptides as well as some amino acids are found in the X4 genome, while the transporters responsible for absorbing small sugars are not found (Fig. 2D). This suggests that strain X4 has strong protein degradation ability and that it mainly acquires carbon and energy from proteinaceous resources, as previously reported for I. loihiensis L2TR, which was also isolated from the deep sea (31). I. loihiensis L2TR lost many carbohydrate metabolism genes, such as the genes encoding transaldolase, glucose-6-phosphate dehydrogenase, and 2-keto-3-deoxy-6-phosphogluconate aldolase (31), which are also absent in the X4 genome.
Strain X4 lost a total of 36 identified transcriptional regulator genes, including one clearly annotated as relating to sugar metabolism (COG1349). This implies that when carbohydrate utilization genes are lost, related regulatory genes are also lost.
The reconstruction of the genomic reduction process in the Idiomarina genus showed that many more genes were lost than gained. Generally, the lost and gained genes can be assigned to all COG function categories except that carbohydrate transport and metabolism genes were consistently lost (Fig. 2). Meanwhile, the Idiomarina genus gained many genes related to utilization of proteinaceous substrates.
Trophic specialization of the genus Idiomarina. Genome evolutionary reconstruction indicated that members of the genus Idiomarina, including strain X4, lost many genes related to sugar utilization and had to rely on, and even developed, the ability to use exogenous proteinaceous substrates as carbon and energy resources. This genomic prediction was further tested experimentally in strain X4. The substrate utilization profiles showed that glucose was the only sugar that supported the growth of strain X4 and that its ability to use a variety of other sugars as carbon and energy sources was limited (Table 2). The identified species of this genus also showed limited sugar utilization ability (19, 27–29), which is consistent with the genomic analyses showing that all species of the genus had lost many sugar utilization genes. Meanwhile, strain X4 has evolved a strong ability to use proteinaceous substrates. In 0.1% casein medium, strain X4 can grow more quickly and reach a higher cell density than strain S18K6 (Fig. 3A), while these two strains had comparable growth rates in rich medium (Fig. 1C). Consistent with its high growth rate, strain X4 can consume casein more quickly than strain S18K6 can (Fig. 3B). In addition, strain X4 responded and grew much more quickly than S18K6 in 0.002% (~10 mg carbon per liter) casein medium (Fig. 3C), showing that X4 can adapt to utilize low concentrations of proteinaceous substrates. Degradation of high-molecular-weight proteinaceous substrates can produce small peptides and amino acids, and accordingly, strain X4 carries genes that encode transporters for peptides and amino acids. The cysteine absorption rates of these two strains were further tested, and the results showed that X4 cells had a higher cysteine absorption rate than S18K6 cells, especially when the cysteine concentration was high (Fig. 3D).
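The substrate utilization profile in Table 2 is scored with two OD600 thresholds stated in the table footnotes: growth is positive when the OD600 exceeds 0.1 after 6 days and negative when it is below 0.05. A minimal sketch of that scoring rule follows; the handling of readings between the two thresholds is not defined in the text, so returning "indeterminate" for them is an assumption made here, and the function name is illustrative.

```python
POSITIVE_OD600 = 0.1    # Table 2: "+" when OD600 > 0.1 after 6 days' growth
NEGATIVE_OD600 = 0.05   # Table 2: "-" when OD600 < 0.05 after 6 days' growth


def score_growth(od600_day6: float) -> str:
    """Score a 6-day OD600 reading using the Table 2 thresholds."""
    if od600_day6 > POSITIVE_OD600:
        return "+"
    if od600_day6 < NEGATIVE_OD600:
        return "-"
    # Assumption: the source does not say how intermediate readings are scored.
    return "indeterminate"


# Hypothetical readings for illustration only (not measured values).
for reading in (0.32, 0.02, 0.07):
    print(reading, score_growth(reading))
```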
The genomic and experimental analyses indicate that strain X4 is specialized in using proteinaceous substrates as nutrition and energy resources, which was also reported for the deep-sea bacterium I. loihiensis L2TR (31). This implies that trophic specialization in using proteinaceous substrates in Idiomarina species may be a strategy to adapt to the deep-sea environment, as it has been reported that high-molecular-weight (HMW) dissolved organic nitrogen in the deep sea is mainly of proteinaceous origin and that specific proteases are required to degrade these HMW proteinaceous substrates (32,33).
TABLE 2 footnotes: a Symbols: +, the OD600 of cultures was >0.1 after 6 days' growth; −, the OD600 was <0.05 after 6 days' growth. b CMC-Na, carboxymethylcellulose-Na.
Correlation between genomic reduction and trophic specialization. The genomic and physiological analyses indicate that genomic reduction is related to trophic specialization in the genus Idiomarina. However, the causal relationship between genomic reduction and trophic specialization is hard to elucidate. The strain may have first lost the sugar utilization genes, then had to rely on proteinaceous resources, after which trophic specialization developed further. It also may be that the strain first developed and relied on the ability to use proteinaceous resources, and then the sugar utilization genes were less used under natural selection and were finally lost from the genome. A third possibility is that genomic reduction and trophic specialization are both the result of other selective pressures. We propose that trophic specialization evolved first. The lost sugar utilization genes such as glucose-6-phosphate dehydrogenase and 2-keto-3-deoxy-6-phosphogluconate aldolase are usually essential genes in other bacteria (34,35). The ancestor of the genus Idiomarina would not have survived if it had lost these genes without developing protein utilization ability. If the ancestor developed the protein utilization ability first, then the sugar utilization genes would not be under stringent selection and could be lost from the genome. Most strains of the genus Idiomarina were isolated from high-salinity or deep-sea environments, both of which are extreme conditions. In order to survive in these extreme environments, bacteria have to develop special adaptation abilities, of which obtaining nutrients is the most basic and important one. We therefore propose the hypothesis that adaptation to a special environment through trophic specialization, which narrows the food spectrum, could result in genomic reduction. This would not be uncommon in nature. Mutations that increase the activity of one process are likely to affect other processes, which can promote the emergence of mutants that specialize in metabolizing certain substrates (36,37). In cases where the ancestor had multiple abilities for substrate utilization, an adaptive trait could have evolved in response to a special environment through specialization for using one kind of substrate with higher efficiency.
If the remaining genes for utilization of other substrates become nonessential in this special environment, these genes would no longer be under strong purifying selection, allowing for the accumulation of deleterious mutations and eventually leading to permanent deletion of these genes and related regulatory genes, thus resulting in genome size reduction. Interestingly, a recent study showed that a Polaribacter isolate preferring to feed on proteins also had a reduced genome size compared to its close relative that had more polysaccharide utilization genes and a broader carbohydrate utilization spectrum (38).
Testing the trophic specialization hypothesis in other bacterial genomes. We searched the bacterial genomes belonging to the well-studied Gammaproteobacteria (>2,000 complete genomes) for genome sizes of 2 to 3 Mbp and found 131 genomes, including 2 Idiomarina genomes. Of these 131 bacterial genomes, ~90% are parasites or pathogens of animals or plants and belong to genera such as Actinobacillus, Coxiella, Haemophilus, Pasteurella, and Xylella. These parasites or pathogens can be considered another type of trophic specialization (see discussion below). One genus that has undergone dramatic genomic reduction and is neither a parasite nor a pathogen is Kangiella (four genomes) (Fig. 4A) (39,40). Therefore, the genomic reduction process of this genus was reconstructed. The LUCA of the genus Kangiella had gained 538 gene clusters, while it had lost 596 gene clusters, showing that the LUCA of this genus underwent a certain extent of genomic reconstruction. A striking feature is that the LUCA also lost a high proportion of genes related to carbohydrate transport and metabolism (G), including an ABC-type sugar transport system, and had gained a high proportion of genes related to amino acid transport and metabolism (E), including 15 peptidases (Fig. 4B). This implies that the genus Kangiella, like the genus Idiomarina, had lost some carbohydrate utilization ability and developed the ability to use proteinaceous substrates. Accordingly, the proportion of carbohydrate utilization genes (G) in Kangiella geojedonensis YCS-5 is comparable to that in strain X4 and much lower than that in strain S18K6 (2.5% versus 6.6%), while the differences in other functions are not as large (Fig. 4C). The physiological traits showed that K. geojedonensis YCS-5 could hydrolyze casein, gelatin, and tyrosine, but not starch (41). The other three strains, Kangiella koreensis DSM 16069, K. aquimarina SW-154, and K. sediminilitoris BB-Mw22, also could hydrolyze casein and tyrosine, but not starch (40,42). The genomic and phenotypic characteristics also indicate that genome reduction in the genus Kangiella is accompanied by a certain extent of trophic specialization.
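The reconstructed gain and loss counts for the two genera can be put side by side. The sketch below uses only the LUCA counts quoted in the text; the loss-to-gain ratio is an illustrative summary statistic chosen here, not a measure used in the study.

```python
# (gained, lost) gene clusters at the LUCA of each genus, as quoted in the text.
LUCA_EVENTS = {
    "Idiomarina": (135, 685),
    "Kangiella": (538, 596),
}


def loss_to_gain_ratio(gained: int, lost: int) -> float:
    """Number of lost gene clusters per gained gene cluster."""
    return lost / gained


for genus, (gained, lost) in LUCA_EVENTS.items():
    ratio = loss_to_gain_ratio(gained, lost)
    print(f"{genus}: net {gained - lost:+d} clusters, loss/gain = {ratio:.2f}")
```

On these numbers, the net loss at the Kangiella LUCA is much smaller than at the Idiomarina LUCA, so the similarity between the genera lies more in which functional categories were lost and gained than in the overall balance.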
Conclusion. This study illustrates that there are genomic reduction patterns and processes different from those in the classical free-living lineages such as SAR11. We propose that selection for trophic specialization in certain environments could be an important path leading to genomic reduction in the genus Idiomarina. Not only can the trophic specialization hypothesis explain genomic reduction in the genus Idiomarina, but it also sheds new light on the understanding of the genomic reduction processes in other marine free-living bacteria.

MATERIALS AND METHODS
Bacterial strains and phenotypic characteristics. Idiomarina sp. strain X4 was isolated by and maintained in our laboratory. The Paraglaciecola chathamensis S18K6 strain was purchased from the culture collection center as described previously (43). Both strains were routinely cultured using 2216E broth medium: 5 g peptone, 1 g yeast extract, 1 liter artificial seawater (containing 35 g sea salt [Sigma, USA]), pH 7.5. To investigate the ability to use proteinaceous substrates, 0.1% or 0.002% casein medium (1 g or 0.02 g casein, 1× vitamin mix, 1 liter artificial seawater, pH 7.5) was used to culture both strains. To test the substrate utilization profiles, media containing the different substrates (substrates tested are shown in Table 2), 1× vitamin mix, 1 liter artificial seawater, and, for sugar substrate tests, 0.01 g NH4NO3, at pH 7.5, were used. The growth of both strains cultured in proteinaceous and sugar-containing media at 18°C with a shaking speed of 180 rpm was quantified by measuring the optical density of the cultures at 600 nm (OD600). Cysteine absorption rates were measured by the method of Button (44). The cysteine concentration was assayed by an L-8900 amino acid analyzer (Hitachi, Japan). Micrographs of the strains were obtained in air in ScanAsyst mode by atomic force microscopy (AFM) using a Multimode Nanoscope VIII AFM (Bruker AXS, Germany) with probe NSC11 (MikroMasch, USA).
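The casein concentrations in the media recipe can be related to the approximate carbon load quoted for the 0.002% medium (~10 mg carbon per liter). The following is a small worked sketch, assuming percentages are weight per volume and that carbon makes up roughly half of the protein mass; the 0.5 carbon fraction is an assumption introduced here, not a value given in the text.

```python
import math

ASSUMED_CARBON_FRACTION = 0.5  # assumption: protein is roughly 50% carbon by mass


def percent_wv_to_mg_per_liter(percent_wv: float) -> float:
    """Convert % (w/v) to mg/liter: 1% (w/v) = 1 g per 100 ml = 10,000 mg/liter."""
    return percent_wv * 10_000


def carbon_mg_per_liter(percent_wv: float,
                        carbon_fraction: float = ASSUMED_CARBON_FRACTION) -> float:
    """Approximate carbon concentration contributed by the substrate."""
    return percent_wv_to_mg_per_liter(percent_wv) * carbon_fraction


for percent in (0.1, 0.002):
    casein = percent_wv_to_mg_per_liter(percent)
    carbon = carbon_mg_per_liter(percent)
    print(f"{percent}% casein: {casein:g} mg casein/liter, ~{carbon:g} mg carbon/liter")

# 0.002% corresponds to 0.02 g casein per liter, as in the recipe.
assert math.isclose(percent_wv_to_mg_per_liter(0.002), 20.0)
```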
Genome sequencing and bioinformatic analyses. The complete genome of Idiomarina sp. strain X4 was sequenced by using a combination of second-generation (Illumina HiSeq2000) and third-generation (PacBio RS II) sequencing platforms. The genome was annotated using the RAST annotation pipeline (45). Metabolic pathways were determined using the online KEGG mapping tool (46). Carbohydrate-active enzyme sequences and modules were analyzed using the CAZy database (47). The COG function category was analyzed by searching proteins against the updated COG database using the blastp program. The protein subcellular localization was predicted using PSORTb 3.0 (48). Transporters were identified by searching proteins against the transporter classification database (TCDB) (49). The final results of COG function, protein subcellular localization, and transporter identification were combined using custom-made Perl scripts. The orthologous clusters were grouped using OrthoMCL (50). Amino acid sequences of single-copy orthologous proteins shared by all the genomes were aligned using MUSCLE (51). The alignments were concatenated, and the phylogenetic tree was constructed using MEGA 7.0 with the neighbor-joining method (52). Count software was used to infer the gene gain and loss events in the Idiomarina and Kangiella genera with the posterior probability model (53).
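The selection of single-copy orthologous proteins shared by all genomes and the concatenation of their alignments can be illustrated with plain data structures. This is a minimal sketch under stated assumptions and not the study's pipeline: it does not call OrthoMCL, MUSCLE, or MEGA, the input layout (cluster ID mapped to genome/protein pairs) is an assumed simplification, and the toy clusters and sequences are hypothetical.

```python
from collections import Counter


def single_copy_core(clusters, genomes):
    """Return IDs of clusters holding exactly one protein from every genome."""
    wanted = set(genomes)
    core = []
    for cluster_id, members in clusters.items():
        copies = Counter(genome for genome, _protein in members)
        if set(copies) == wanted and all(n == 1 for n in copies.values()):
            core.append(cluster_id)
    return sorted(core)


def concatenate_alignments(alignments, genomes):
    """Join per-cluster alignments ({genome: aligned sequence}) into one row per genome."""
    for index, alignment in enumerate(alignments):
        lengths = {len(alignment[genome]) for genome in genomes}
        if len(lengths) != 1:
            raise ValueError(f"alignment {index} has rows of unequal length")
    return {g: "".join(a[g] for a in alignments) for g in genomes}


# Hypothetical toy input: three genomes, three orthologous clusters.
GENOMES = ["X4", "L2TR", "S18K6"]
TOY_CLUSTERS = {
    "c1": [("X4", "p1"), ("L2TR", "q1"), ("S18K6", "r1")],                # single copy, shared
    "c2": [("X4", "p2"), ("X4", "p3"), ("L2TR", "q2"), ("S18K6", "r2")],  # paralog in X4
    "c3": [("X4", "p4"), ("L2TR", "q3")],                                 # missing in S18K6
}
TOY_ALIGNMENTS = [
    {"X4": "MKL-A", "L2TR": "MKLSA", "S18K6": "MRLSA"},
    {"X4": "GT", "L2TR": "GS", "S18K6": "GT"},
]

print(single_copy_core(TOY_CLUSTERS, GENOMES))
print(concatenate_alignments(TOY_ALIGNMENTS, GENOMES))
```

Only the first toy cluster passes the filter, since the second contains a paralog and the third is absent from one genome; the concatenated rows are what a tree-building program would then take as input.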
Accession number(s). The complete genome sequence of Idiomarina sp. strain X4 was deposited in GenBank under accession no. CP025000.