Characteristics and Evolutionary Analysis of Photosynthetic Gene Clusters on Extrachromosomal Replicons: from Streamlined Plasmids to Chromids

The evolution of photosynthesis was a significant event during the diversification of biological life. Aerobic anoxygenic photoheterotrophic bacteria (AAPB) share physiological characteristics with chemoheterotrophs and represent an important group associated with bacteriochlorophyll-dependent phototrophy in the environment. Here, characterization and evolutionary analyses were conducted for 13 bacterial strains that contained photosynthetic gene clusters (PGCs) carried by extrachromosomal replicons (ECRs) to shed light on the evolution of chlorophototrophy in bacteria. This report advances our understanding of the importance of ECRs in the transfer of PGCs within marine photoheterotrophic bacteria.

carbon and energy cycling (8)(9)(10). AAPB are facultatively phototrophic, using light as an alternative energy source, which thereby reduces the respiration of organic carbon from marine primary production by ~2.4% to ~5.4% (4). In addition to their ecological significance, physiological and genomic analyses of AAPB have indicated their potential to inform on the evolution of photosynthesis (11)(12)(13). AAPB are hypothesized to have evolved after widespread oxygenation of Earth's atmosphere ~2.4 to ~2.3 billion years (Ga) ago (14,15). Purple photosynthetic bacteria are a group of anaerobic phototrophs that are thought to represent the ancestral lineage of AAPB due to the high similarity of their photosynthetic apparatuses (13,16,17). However, unlike the purple photosynthetic bacteria that grow in the absence of oxygen and are mainly autotrophic, AAPB are aerobic heterotrophs that perform phototrophy as an auxiliary energy conservation strategy, which can contribute up to 20% of their total cellular metabolic energy demands (6,18). Phylogenetic analysis of the 16S rRNA gene indicates that AAPB are scattered throughout the Proteobacteria and that they are closely related to nonphototrophic bacteria and purple nonsulfur bacteria within the Proteobacteria (12,13,18).
Similarly to purple photosynthetic bacteria, photosynthetic gene clusters (PGCs) are found in AAPB. The large PGC superoperon is approximately 35 to 50 kb in length and contains the approximately 40 genes required for the biosynthesis of bacteriochlorophyll, carotenoids, photosynthetic reaction complexes, and light harvesting complexes, in addition to other regulatory functions (19)(20)(21)(22). PGC gene sequences are highly conserved, and some essential genes within PGCs (e.g., pufL, pufM, and bchY) are typically used as molecular markers in phylogenetic analyses to classify phototrophic Proteobacteria and study their ecological diversity and evolutionary relationships (5,(22)(23)(24)(25).
Previous comparisons of closely related strains have demonstrated that PGCs can be lost from bacterial genomes. For example, Citromicrobium sp. JL1363 lost PGCs from its genome during its evolutionary history and is now reliant solely on heterotrophy (26). Furthermore, PGCs can be transferred between proteobacterial strains and among distantly related phyla. For instance, although Erythrobacter sp. AP23 shares 99.5% 16S rRNA gene sequence identity with Erythrobacter sp. LAMA915, only the former strain contains a PGC within its genome (23). Intriguingly, the PGC of Erythrobacter sp. AP23 is closely related to that of Citromicrobium (23). Gemmatimonadetes is a novel phototrophic phylum that has been suggested to have acquired its PGC from phototrophic Proteobacteria via a horizontal gene transfer (HGT) event (25).
Members of the Roseobacter clade are important ecological generalists within marine ecosystems, and many are AAPB (27)(28)(29). Inconsistencies between the phylogenetic topologies of the 16S rRNA gene and PGCs indicate that HGT events of PGCs have occurred among members of this clade (30)(31)(32)(33). Of note, PGC-containing extrachromosomal replicons (ECRs) have been identified within six strains of the Roseobacter clade, further indicating the possibility of HGT of PGCs among these species (33). ECRs are mobile genetic elements that carry many metabolic genes that are important for rapid adaptation to changing environments (34)(35)(36). Therefore, it is likely that PGCs can also be transferred via ECRs. However, only a limited number of studies have focused on the role of ECRs in the HGT of PGCs (31,33).
In this study, we analyzed 13 Roseobacter clade genomes that contain PGCs carried by extrachromosomal replicons (exPGCs). Phylogenetic and structural analyses of the exPGCs, in addition to comparisons against chromosomal PGCs (cPGCs), were used to elucidate the potential role of ECRs in the transfer of PGCs among Roseobacter clade species.

RESULTS AND DISCUSSION
General features of the 13 Roseobacter clade strains. Thirteen Roseobacter clade strains carried PGC-containing ECRs and were affiliated with seven different genera: Tateyamaria, Jannaschia, Sulfitobacter, Roseobacter, Oceanicola, Shimia, and Nereida (Table 1). The genome sizes of the 13 strains ranged from 2.89 to 4.75 Mb, while the  (37)(38)(39)(40)(41). Tateyamaria spp. have also been frequently detected in algal culture bacterial communities and can occasionally dominate such communities (42)(43)(44). The pufM gene is present in all genomes reported for this genus (as of 28 February 2019). Three strains carrying exPGCs with similar genome sizes (average, 4.4 ± 0.8 Mb) and GC contents (average, 61.30% ± 0.70%) were chosen for analysis in this study. The sizes of the ECRs carrying exPGCs ranged from 77.6 to 139.8 kb, and the exPGCs had similar sizes (average, 53.0 ± 1.0 kb). A considerable number of highly homologous genes were present in the three PGC-containing ECRs of this genus (Fig. 1). In addition, the genes present on smaller PGC-containing ECRs were mostly also present on larger PGC-containing ECRs. Phylogenetic analysis of the replication partitioning gene (parA), supported by high bootstrap values among homologs, indicated that the replication modules of the three PGC-containing ECRs were highly conserved (see Fig. S1 in the supplemental material; see also Table S1 in the supplemental material).
Jannaschia is an ecologically important genus of AAPB (45). Indeed, 25% to 30% of all AAPB 16S rRNA gene clone sequences from samples collected in the central Baltic Sea belonged to Jannaschia-related bacteria (46). Among known Jannaschia isolates, CCS1 is the only strain observed to conduct photoheterotrophy (47,48). Whole-genome sequences were available for twelve Jannaschia strains, six of which contained PGCs; three of the six carried exPGCs, and the other three carried cPGCs. The PGC-containing ECRs ranged in size from 49.5 to 87.2 kb, while their exPGCs were ~45 kb in length. The sizes of the genomes of the three Jannaschia strains carrying exPGCs ranged from 3.49 to 3.81 Mb, with GC contents ranging from 62.0% to 65.5%. The three PGC-containing ECRs of Jannaschia shared a large syntenic region comprising ~50 kb, and the larger PGC-containing ECRs contained the genes carried by the smaller PGC-containing ECRs, as observed for the ECRs in the Tateyamaria genomes (Fig. 1). In addition, their replication modules were closely related based on a phylogenetic analysis of parA genes (Fig. S1; see also Table S1).
Sulfitobacter spp. are widely distributed in different marine environments and may play important roles in organic sulfur cycling (49)(50)(51)(52)(53)(54). Culture-independent surveys of AAPB have indicated that Sulfitobacter spp. account for a significant fraction of AAPB communities in natural environments (10,55,56). However, only a few Sulfitobacter AAPB strains have been isolated and described (10,51,57). Genomes have been sequenced from only three Sulfitobacter AAPB strains, and the data revealed that their PGCs were located on ECRs. The three genomes ranged in size from 3.98 to 4.69 Mb, with GC contents ranging from 56.1% to 64.9%. The three PGC-containing ECRs were each larger than 100 kb, while the three exPGCs were 45, 50, and 51 kb.
Other than genes involved in photosynthesis, only a small number of genes were shared by PGC-containing ECRs in Sulfitobacter, which contrasted with the high degree of conservation observed for ECRs of Tateyamaria and Jannaschia. Furthermore, the parA genes within the three ECRs of Sulfitobacter were phylogenetically very distinct (Fig. S1; see also Table S1).
Identification of PGC-containing chromid-like ECRs and PGC-containing plasmid-like ECRs. Chromids are a type of ECR that was recently described as distinct from both chromosomes and plasmids (68). Because chromids typically carry essential genes, they are maintained more stably than plasmids and are considered indispensable for their bacterial hosts (68). Comparing the relative synonymous codon usage (RSCU) of a bacterial chromosome with that of its ECRs can help identify whether the ECRs are chromids, as the RSCU levels of chromids are similar to those of the corresponding chromosomes (34,68,69). Principal-component analysis (PCA) of the RSCU levels of all replicons from the 13 strains was used to classify elements as chromids or plasmids. The analysis indicated that eight PGC-containing ECRs (Tateyamaria sp. ANG-S1, Tateyamaria sp. syn59, T. omphalii DOK1-4, Sulfitobacter sp. AM1-D1, Sulfitobacter noctilucicola KCTC 32123, S. guttiformis KCTC 32187, R. litoralis Och 149, and N. ignava DSM16309) could be clearly assigned as chromid-like ECRs, while the other ECRs containing PGCs were provisionally classified as plasmid-like ECRs (Fig. 2; see also Table S2).
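To illustrate the RSCU comparison described above, the following minimal Python sketch computes RSCU values from the coding sequences of a replicon and measures how far two replicons lie apart. It is an illustration only, not the pipeline used in this study: the function names rscu and rscu_distance are hypothetical, the toy sequences are invented, and a plain Euclidean distance stands in for the PCA step. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order (amino acid assignments are the same
# in the bacterial translation table). "*" marks stop codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds_sequences):
    """Return {codon: RSCU} for in-frame coding sequences (hypothetical helper)."""
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        if aa == "*" or len(codons) == 1:
            continue  # stops and single-codon amino acids carry no usage signal
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid not observed in this replicon
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def rscu_distance(a, b):
    """Euclidean distance between two RSCU profiles (assumed stand-in for PCA)."""
    codons = set(a) | set(b)
    return math.sqrt(sum((a.get(c, 0.0) - b.get(c, 0.0)) ** 2 for c in codons))


# Toy example with invented sequences: one "chromosome" and one "ECR".
chromosome = rscu(["ATGTTTTTTTTCGGCGGCGGATAA"])
ecr = rscu(["ATGTTTTTCTTCGGAGGAGGCTAA"])
print(round(rscu_distance(chromosome, ecr), 3))
```

Under this sketch, an ECR whose profile lies close to that of its host chromosome would be a chromid candidate, whereas a distant profile would be more consistent with a plasmid-like element; any cutoff would have to be chosen from the data.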
Phylogenetic analysis. A phylogenetic analysis based on 16S rRNA gene nucleotide sequences from 13 exPGC-containing bacterial strains and 43 reference strains was conducted with photoheterotrophs and heterotrophs to show the phylogenetic distribution of the 13 strains carrying exPGCs (Fig. S2). As observed for the cPGC-containing AAPB, the 13 exPGC-containing bacterial strains did not comprise a monophyletic phylogenetic group but were instead distributed throughout the 16S rRNA phylogenetic tree.
To further investigate the phylogenetic relationships of the Roseobacter clade strains, 38 photoheterotrophic bacterial genomes were subjected to phylogenetic analysis using 29 conserved PGC genes. Comparison of phylogenies based on 16S rRNA nucleotide sequences and amino acid sequences of 29 conserved genes of the PGC revealed considerable topological differences (Fig. 3). For example, the four Tateyamaria strains clustered and were shown to be closely related to the N. ignava DSM16309 strain in the 16S rRNA gene phylogenetic analysis. However, the Tateyamaria strains were separated in the phylogeny based on the 29 highly conserved genes of the PGC, with Tateyamaria sp. Alg231-49 closely related to Thalassonium sp. R2A62 and the other three strains associated with the Roseobacter species. Similarly, the five Jannaschia strains formed a group with 86% bootstrap support based on the 16S rRNA phylogenetic analysis; however, they were clearly differentiated into two distant subtrees in the PGC-based phylogenetic analysis.
In addition, the PGC-based phylogenetic analysis indicated the presence of two phylogenetic groups corresponding to differences in their puf operon structures: a PufC-containing group and a PufX-containing group (Fig. 3b). In particular, Tateyamaria sp. syn59, T. omphalii DOK1-4, Tateyamaria sp. ANG-S1, R. litoralis Och 149, and Oceanicola sp. HL-35 contained pufC genes, while the others contained pufX genes. Five of the exPGCs in the PufC group clustered together, suggesting common ancestry for these exPGCs. Moreover, the cPGC in Roseobacter denitrificans Och114 represented a basal clade to a subtree also comprising the five aforementioned exPGCs, thereby providing evidence for chromosomal reintegration from an ECR (33). Furthermore, the externally nested position of the exPGC from Oceanicola sp. HL-35 within this subtree indicated that the exPGC was possibly transferred from other AAPB strains and could be further transferred to other distant bacterial strains via an ECR. Nine of the 13 exPGCs belonged to three genera among the seven that were analyzed. Among these, the exPGCs from strains of the same genus were closely related phylogenetically, suggesting that the transfer of exPGCs was more likely to occur among strains within the same genus.
Two types of puf operon structures were observed within the genomes of different strains within the Tateyamaria and Jannaschia genera. Among the six PGC-containing Jannaschia strains, three exPGCs were PufX types whereas the other three were PufC types. Similarly, among the four phototrophic Tateyamaria strains analyzed, one cPGC was a PufX type whereas the other three exPGCs were PufC types.
exPGC structures and arrangements. Three different structures were observed among the 13 exPGCs (Fig. 4). The 41 photosynthetic genes on the ECRs of the three Jannaschia strains were organized into one superoperon with the same structure as that of the cPGC (20).
In addition, the exPGC of Sulfitobacter sp. AM1-D1 was split by more than 100 genes located between bchIDO and hemECA. The other exPGCs all appeared to have been interrupted by the insertion of their ECR replication modules, as previously identified (31,33).
The arrangement of photosynthetic genes is also a key characteristic of PGCs in AAPB. Three forms of PGC arrangement have been observed in Roseobacter clade organisms based on a combination of two conserved regions (puh-LhaA-bchMLHNF and puf-bchZYXC-crtF) (20). The 13 exPGCs contained the same conserved gene order as the cPGCs and comprised two different types. The type I arrangement was exhibited by exPGCs of the three Jannaschia strains (J. pohangensis DSM19073, J. donghaensis CECT7802, and J. faecimaris DSM10004020), wherein genes were arranged as forward puh-LhaA-bchMLHNF plus forward puf-bchZYXC-crtF. Type II arrangements were observed for the other 10 strains, wherein arrangements followed the pattern of forward puf-bchZYXC-crtF plus forward puh-LhaA-bchMLHNF (Fig. 4).
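The two arrangement types defined above reduce to a simple rule on the order of the two conserved regions, which can be sketched in Python as follows. This is a hypothetical illustration rather than the procedure used in this study: the function name classify_arrangement and the input format (a list of region name and strand pairs in replicon order) are assumptions, and any pattern other than the two described here is reported as unclassified.

```python
PUH_REGION = "puh-LhaA-bchMLHNF"
PUF_REGION = "puf-bchZYXC-crtF"


def classify_arrangement(regions):
    """Classify an exPGC from its conserved regions (hypothetical helper).

    regions: list of (name, strand) tuples in order along the replicon,
    where strand is "+" for forward and "-" for reverse.
    """
    # Keep only the two conserved regions; other annotated blocks are ignored.
    conserved = [(name, strand) for name, strand in regions
                 if name in (PUH_REGION, PUF_REGION)]

    # Both types described in the text require both regions, each forward.
    if len(conserved) != 2 or any(strand != "+" for _, strand in conserved):
        return "unclassified"

    order = [name for name, _ in conserved]
    if order == [PUH_REGION, PUF_REGION]:
        return "type I"   # forward puh region followed by forward puf region
    if order == [PUF_REGION, PUH_REGION]:
        return "type II"  # forward puf region followed by forward puh region
    return "unclassified"


# Invented example: a replication module sits between the two regions.
example = [(PUF_REGION, "+"), ("replication module", "+"), (PUH_REGION, "+")]
print(classify_arrangement(example))
```

Ignoring intervening blocks keeps the rule applicable to exPGCs that carry a replication module inside the cluster.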
The arrangements of exPGCs of the PufC-containing group were of type II and exhibited high conservation in the direction and order of all photosynthetic genes on the exPGCs. In contrast, the exPGCs in the PufX-containing group exhibited two different types of arrangements, with unique traits present in different genera. For example, hemC and hemE genes encoding tetrapyrrole biosynthesis proteins were present only in the exPGCs of Sulfitobacter. In addition, the genomic region ranging from bchG to idd was located upstream of the puf operon in Sulfitobacter, although it is typically downstream of the ppaA and ppsR regulator genes within PGCs (20). Furthermore, the three exPGCs of the Jannaschia strains lacked cytochrome c2 (cyc2) and diphosphate delta-isomerase (idi) genes. cyc2 is involved in electron transfer, while idi is involved in isoprenoid biosynthesis. The loss of these genes is not lethal for phototrophic bacteria (70,71), but the genes are nevertheless expected to be present in the PGCs of phototrophic bacteria within the Roseobacter clade (33,72,73).

Evidence of ECR-mediated PGC transfer within the Roseobacter clade.
Recent studies have suggested that ECRs could be vehicles for HGT of PGCs (31,33), albeit with limited evidence. The idea of transfer of PGCs by ECRs was supported in our analyses by the coexistence of two different types of puf operon structures (PufC and PufX types) in different strains of two genera, Tateyamaria and Jannaschia. In particular, these two types of puf operons were located on cPGCs and exPGCs, respectively. The Global Ocean Sampling expedition metagenomes were the first to reveal that pufC could be replaced by pufX in AAPB and that pufC and pufX were present in different AAPB phylogroups (5,74). Thus, phylogenetic divergence of the two types of puf operons in strains from the same genus suggested that one or both of them were introduced from other phototrophic phylogroups. Moreover, phylogenetic congruence between whole PGCs and conserved photosynthetic operons within the PGC (i.e., bchFNBHLM-LhaA-puhABC and pufMLABQ-bchZYXC-crtF) indicates that PGCs act as entire functional units rather than being subject to partial transfer between strains (Fig. S3); this is consistent with a previous study (33). ECRs are mobile genetic elements; thus, PGCs carried by ECRs are more likely to be horizontally transferred.
The potential for transfer of PGC-containing ECRs. As described above, the 13 PGC-containing ECRs were divided into two types based on their sizes and functions. Small PGC-containing ECRs within Oceanicola sp. HL-35, Shimia sp. wx04, and J. pohangensis DSM19073 carried more than 80% of the genes coding for PGCs. These ECRs are usually present as plasmids and are likely to play an important role in the transfer of phototrophic capacities among species. This is especially probable because the transfer of small plasmids achieves higher efficiencies and the three streamlined PGC-containing ECRs still appear to confer the capability of chlorophototrophy (75,76). The acquisition of streamlined PGC-containing ECRs might enable strains to obtain new lifestyles at low costs, thereby providing advantages under certain environmental conditions (34). The other, larger PGC-containing ECRs also encoded proteins with various nonphotosynthetic functions. For example, a sox gene cluster (soxRSVYAZBCD), usually located on the bacterial chromosome, was observed on the PGC-containing chromid-like element of N. ignava DSM16309, suggesting that the sox gene cluster might also be transferred by the ECR. Most of these large ECRs were classified as chromid-like ECRs. Consequently, these PGC-containing ECRs might preferentially be maintained in bacterial hosts rather than be transferred among hosts. Notably, PGCs carried by both plasmid-like and chromid-like ECRs have been suggested to be genomically stable because most exPGCs carry an insertion of their corresponding ECR replication modules (31).
Comparison of the GC contents of the bacterial genomes, exPGCs, and PGC-containing ECRs did not reveal significant differences for any of the 13 Roseobacter clade strains (Fig. S4). Thus, the transfer of these PGC-containing ECRs into bacteria likely either occurred in the distant evolutionary past or took place only between closely related species (77).
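The GC content comparison above rests on a simple per-sequence calculation, sketched below in Python. The sketch is illustrative only: the function name gc_content and the toy sequences are invented, and it does not reproduce the statistical comparison behind Fig. S4. GC content is taken as the percentage of G and C among unambiguous bases.

```python
def gc_content(sequence):
    """Return GC content in percent, ignoring ambiguous bases such as N."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    at = sequence.count("A") + sequence.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


# Invented toy sequences standing in for one strain's three compartments.
compartments = {
    "genome": "GCGCATGCGGCCATTAGCGC",
    "PGC-containing ECR": "GCGGATGCGCCCATTTGCGC",
    "exPGC": "GCGCGTGCGGCAATTAGCGG",
}
for name, seq in compartments.items():
    print(f"{name}: {gc_content(seq):.1f}% GC")
```

A recently acquired replicon from a distant donor would often be expected to differ in GC content from its host genome, which is why similar values across the three compartments are informative.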
A scenario to explain the evolution of AAPB exPGCs in the Roseobacter clade. A previous scenario was suggested to explain the evolution of exPGCs in Roseobacter clade organisms, wherein a chromosomal PGC superoperon was transferred into an ECR, followed by integration of replication origin genes into the ECR (31,34). Our analyses support this explanation, and we further present a more detailed scenario (Fig. 5) to explain the transfer of PGCs within the Roseobacter clade after analyzing genomic and evolutionary characteristics of 13 exPGCs in this group (31). In this revised scenario, PGCs were initially translocated from chromosomes to ECRs, as represented by the superoperon structure of exPGCs from J. faecimaris DSM1004020, J. pohangensis DSM19073, and J. donghaensis CECT 7802. Three subsequent fates are possible for exPGCs: (i) exPGCs reintegrated into chromosomes and became cPGCs, as observed for most phototrophic Roseobacter clade strains; (ii) PGC-containing ECRs were lost from strains, such that the bacteria became heterotrophs; or (iii) exPGCs carried by ECRs were subjected to further recombination and became stable within ECRs. A remarkable characteristic of the majority of exPGCs is the insertion of an ECR replication module within the PGC as a result of a series of recombination events. Such an event could have helped ensure the stability of exPGCs (31).
The patchy distribution of AAPB within the Roseobacter clade has been explained by two evolutionary models that invoke either loss or gain of PGCs (11)(12)(13). Given that ECRs have played a critical role in the loss or gain of PGCs during the evolutionary history of photosynthesis, the patchy distribution of AAPB within the Roseobacter clade can be plausibly explained by ECR-mediated mechanisms. To date, exPGCs have accounted for ~20% of all PGCs in the currently available genomes of Roseobacter clade strains (Table S3), highlighting their prevalence in these organisms. It is likely that additional exPGCs carried by other strains will be identified with further generation of new bacterial genome sequences. Moreover, we suggest that the gain and loss of PGCs, as mediated by chromosomes and especially ECRs, resulted in the patchy distribution of AAPB within the Roseobacter clade.
In the present study, the genomic characteristics and evolution of 13 PGCs carried by ECRs were analyzed. The coexistence of two types of puf operon structures within strains of the same genera provided clear evidence of the horizontal transfer of PGCs mediated by ECRs. Analysis of PGC-containing plasmid-like and chromid-like ECRs indicated that exPGCs could stably exist in bacteria after transfer, highlighting the importance of phototrophic metabolism carried by ECRs for some bacteria. Furthermore, these analyses indicated that the process of gain or loss of PGCs, as mediated by ECRs, contributes to the patchy distribution of phototrophic capacities within the Roseobacter clade.

MATERIALS AND METHODS
Strain isolation. Tateyamaria sp. syn59 and Shimia sp. wx04 were isolated from the South China Sea in April 2016 using oligotrophic medium F/2 plates (78), followed by transfer onto rich organic liquid medium (Marine Broth 2216; Difco, USA) for further isolation and cultivation. All cultures were incubated at 28°C with shaking at 200 rpm in the dark. Genomic DNA from the two strains was extracted using a TaKaRa MiniBEST universal genomic DNA extraction kit (Japan).
Data availability. All data used in this study are publicly available in GenBank. Accession numbers can be found in Table S1 and Table S3.