Genome features of moderately halophilic polyhydroxyalkanoate-producing Yangia sp. CCB-MM3

Yangia sp. CCB-MM3 was one of several halophilic bacteria isolated from soil sediment in the estuarine Matang Mangrove, Malaysia. To date, no genome sequence has been reported for any member of the genus Yangia, which belongs to the family Rhodobacteraceae. In the current study, we present the first complete genome sequence of Yangia sp. strain CCB-MM3. The genome includes two chromosomes and five plasmids with a total length of 5,522,061 bp and an average GC content of 65%. Since a different strain of Yangia sp. (ND199) was reported to produce a polyhydroxyalkanoate copolymer, this ability was tested in vitro and confirmed for strain CCB-MM3. Analysis of its genome sequence confirmed the presence of a pathway for the production of propionyl-CoA and a gene cluster for PHA production in the sequenced strain. The genome sequence described will be a useful resource for understanding the physiology and metabolic potential of Yangia as well as for comparative genomic analysis with other Rhodobacteraceae.

The incorporation of 3HV into 3HB-based polymer increases the flexibility, impact resistance as well as ductility of the polymer [10] and makes the polymer suitable for many industrial applications.
Mangroves are highly productive ecosystems covering approximately 75% of the total tropical and subtropical coastlines. Apart from wood production, mangrove forests support a wide range of functions including coastline protection, nutrient cycling, habitat for endangered species and breeding ground for marine life, and have been shown to act as a natural barrier against tsunamis [11]. Matang mangrove, Malaysia is widely regarded as the best-managed sustainable mangrove ecosystem in the world. Yangia sp. CCB-MM3, analyzed in the present study, was isolated from soil samples obtained from the Matang mangrove. The sampling location was situated in an estuarine mangrove ecosystem that is under the influence of both marine conditions and the flow of freshwater. Saline environments including estuaries and coastal marine sites have been a focus of study for the halophilic organisms that flourish in these habitats. Halophiles have attracted interest as candidates for bioprocessing because of their unique properties, including the ability to grow in high-salt media, which allows fermentation processes to run contamination-free under non-sterile conditions [12].
At the time of writing, there are more than 300 genome assemblies from members of the family Rhodobacteraceae, but no complete genome from the genus Yangia has been reported. Here, we present the first complete genome of a Yangia representative and insights into the genes and pathways for polyhydroxyalkanoate (PHA) biosynthesis in this halophilic bacterium.

Classification and features
Soil sediment samples (0-10 cm) were collected from Matang Mangrove (4.85228 N, 100.55777 E), located on the west coast of Peninsular Malaysia, in October 2014 [13]. The soil samples had moderate salinity (21 ppt) and the temperature was 30°C on the day of sampling. CCB-MM3 was isolated from the soil samples on low nutrient artificial seawater medium (L-ASWM) agar plates [14]. Bacteriological characteristics of the isolate are summarized in Table 1. The isolate is a Gram-negative, motile and rod-shaped bacterium of 1-2 μm in size (Fig. 1). The strain exhibited growth at 20-40°C (optimum 30°C) and pH 5-10 (optimum pH 7.5). Transmission electron microscopy revealed the presence of discrete, electron-transparent inclusions in the cytoplasm of strain CCB-MM3, presumably containing accumulated PHA granules. There are five identical 16S rRNA gene copies in the CCB-MM3 genome. When compared to the 16S prokaryotic rRNA database available at EzTaxon [15], the 16S rRNA gene sequence of CCB-MM3 exhibited an identity of 98.8% with the type strain Y. pacifica DX5-10. A phylogenetic tree was constructed on the basis of 16S rRNA gene sequences of strain CCB-MM3 and other members of the family Rhodobacteraceae. The 16S rRNA gene sequence phylogeny placed CCB-MM3 in the same cluster as Y. pacifica DX5-10 (Fig. 2). The high 16S rRNA gene sequence similarity and distinct phylogenetic lineage with Y. pacifica DX5-10

Genome sequencing information
Genome project history
Yangia sp. CCB-MM3 was selected for genome sequencing on the basis of its physiological and phenotypic features, and was part of a study aiming at characterizing the microbiome of mangrove sediments. Genome assembly and annotation were performed at the Centre for Chemical Biology, Universiti Sains Malaysia. The genome project was deposited at GenBank under the accession PRJNA310305. Table 2 summarizes the project information in accordance with the Minimum Information about a Genome Sequence (MIGS).

Genome sequencing and assembly
Whole genome sequencing of Yangia sp. CCB-MM3 was performed using PacBio technology. In short, a library was prepared following the PacBio 10 Kb SMRTbell library preparation protocol. The final library was size selected using BluePippin electrophoresis (Sage Science, USA). The library was sequenced using two SMRT cells on the PacBio RS II platform with P6-C4 chemistry. The run generated 153,311 reads with an average length of 14.46 Kb and a total of 2.22 Gb of data. Raw reads were filtered and de novo assembled using the Hierarchical Genome Assembly Process (HGAP) v2 protocol in SMRT Analysis v2.3.0 [17]. Two rounds of genome polishing were performed using Quiver to improve the accuracy of the assembly.
Figure legend (fragment): [...] [44]. From outside to the center: genes identified by the COG on forward strand, CDS on forward strand, CDS on reverse strand, genes identified by the COG on reverse strand, RNA genes (tRNAs orange, rRNAs pink, other RNAs grey), GC content (black) and GC skew (purple/green)
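As a rough consistency check on the sequencing figures above, the following Python sketch recomputes the total yield from the read count and mean read length, and estimates the raw fold coverage of the 5,522,061 bp genome. The coverage value is not stated in the text; it is derived here from the reported raw yield (before read filtering), and the function names are illustrative only.

```python
# Values reported in the text (raw reads, before filtering).
READ_COUNT = 153_311
MEAN_READ_LENGTH_BP = 14_460       # 14.46 Kb
REPORTED_YIELD_BP = 2.22e9         # 2.22 Gb
GENOME_LENGTH_BP = 5_522_061       # total length of the assembled genome


def estimated_yield(read_count: int, mean_read_length_bp: int) -> int:
    """Total sequenced bases implied by read count and mean read length."""
    return read_count * mean_read_length_bp


def fold_coverage(total_bases: float, genome_length_bp: int) -> float:
    """Average number of times each genome position is covered."""
    return total_bases / genome_length_bp


yield_from_reads = estimated_yield(READ_COUNT, MEAN_READ_LENGTH_BP)
raw_coverage = fold_coverage(REPORTED_YIELD_BP, GENOME_LENGTH_BP)

print(f"Yield implied by reads: {yield_from_reads / 1e9:.2f} Gb")
print(f"Approximate raw coverage: {raw_coverage:.0f}x")
```

The yield implied by the read statistics agrees with the reported 2.22 Gb, and the raw data correspond to roughly 400-fold coverage, which is an upper bound on the coverage actually used after filtering.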

Genome annotation
The genome annotation was performed using Rapid Annotation using Subsystem Technology (RAST) [18]. The predicted Yangia sp. protein sequences were compared against the Clusters of Orthologous Groups (COG) database using BLASTP. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE [19], SignalP [20], TMHMM [21] and CRISPRFinder [22].

Genome properties
The genome of Yangia sp. CCB-MM3 is 5,522,061 bp long and consists of two circular chromosomes and five plasmids (Table 3 and Fig. 3). The genome has a 64.98% GC content (Table 4). There are 5027 predicted protein-coding genes and 69 RNA genes (five rRNA operons and 44 tRNAs). Of the RNA genes, 49 are found on chromosome 1 and 20 on chromosome 2. Of the predicted protein-coding genes, 3774 were assigned a putative function, while the remainder were annotated as hypothetical proteins. A total of 3945 genes were assigned to COG categories (2343 on chromosome 1; 1068 on chromosome 2; the remainder on plasmids) and a breakdown of their functional assignments is shown in Table 5. The most abundant COG functional category in strain CCB-MM3 were amino acid [...] (Table 7).
Some species from the Roseobacter clade have been characterized as essential players in the biogeochemical cycling of organic or inorganic sulfur-containing compounds [23][24][25]. The genome of Yangia sp. CCB-MM3 encodes the enzymes necessary for assimilatory sulfate reduction, including sulfate adenylyltransferase (AYJ57_25280), adenylylsulfate kinase (AYJ57_25275), phosphoadenylylsulfate reductase (AYJ57_02835) and sulfite reductase (AYJ57_02830). Interestingly, the CCB-MM3 genome also harbours the complete set of sulfur-oxidizing genes, including soxX (AYJ57_01935), soxY (AYJ57_01940), soxZ (AYJ57_01945), soxA (AYJ57_01950), soxB (AYJ57_01955), soxC (AYJ57_01960) and soxD (AYJ57_01965), for thiosulfate oxidation in vitro. SoxYZ is the carrier protein that interacts with SoxAX, SoxB and SoxCD; the SoxAX cytochrome complex is proposed to link the sulfur substrate to SoxYZ; the dimanganese SoxB removes the oxidized sulfur residue from SoxYZ through hydrolysis; and SoxCD catalyzes the oxidation of the reduced sulfur residue bound to SoxYZ [26][27][28][29]. These genes encoding essential components of the Sox multienzyme complex are organized in a single locus in CCB-MM3. Analysis of the Yangia sp. CCB-MM3 genome also revealed rhodanese-like sulfurtransferases (AYJ57_05465, AYJ57_08495, AYJ57_10220, AYJ57_16970 and AYJ57_24415) that can participate in the metabolism of thiosulfate and elemental sulfur during disproportionation.
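The single-locus organization of the sox genes can be illustrated from the locus tags alone. The Python sketch below checks whether a set of locus tags is numerically consecutive. It assumes that adjacent genes receive locus tag numbers that increase in steps of 5, a common annotation numbering convention that is not stated in the text; the helper names are hypothetical, and tag adjacency is only a proxy for physical adjacency on the replicon.

```python
# Locus tags as listed in the text.
SOX_GENES = {
    "soxX": "AYJ57_01935",
    "soxY": "AYJ57_01940",
    "soxZ": "AYJ57_01945",
    "soxA": "AYJ57_01950",
    "soxB": "AYJ57_01955",
    "soxC": "AYJ57_01960",
    "soxD": "AYJ57_01965",
}
RHODANESE_LIKE = [
    "AYJ57_05465", "AYJ57_08495", "AYJ57_10220",
    "AYJ57_16970", "AYJ57_24415",
]


def tag_number(locus_tag: str) -> int:
    """Extract the numeric part of a locus tag such as 'AYJ57_01935'."""
    return int(locus_tag.split("_")[1])


def is_contiguous(locus_tags, step: int = 5) -> bool:
    """True if the tags form an unbroken run.

    ASSUMPTION: neighbouring genes differ by `step` in tag number.
    """
    numbers = sorted(tag_number(tag) for tag in locus_tags)
    return all(b - a == step for a, b in zip(numbers, numbers[1:]))


print("sox genes contiguous:", is_contiguous(SOX_GENES.values()))
print("rhodanese-like genes contiguous:", is_contiguous(RHODANESE_LIKE))
```

Under this assumption the seven sox tags form one unbroken run, consistent with a single locus, whereas the rhodanese-like sulfurtransferase genes are scattered across the genome.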
Although the ability of Yangia to grow with free nitrogen gas as the sole nitrogen source has not yet been analyzed, all genes necessary for nitrogen fixation were identified in the genome of Yangia sp. CCB-MM3. The genome encodes the α and β subunits of molybdenum-iron nitrogenase (AYJ57_00195, AYJ57_00200) as well as its regulatory and accessory proteins (AYJ57_00310, AYJ57_00210, AYJ57_00215 and AYJ57_00315).
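The gene counts reported in this section imply a few figures that the text leaves implicit. The short Python sketch below derives them from the reported numbers only; the variable names are illustrative, and the percentages are simple ratios rather than values taken from the tables.

```python
# Counts reported in the genome properties paragraph.
TOTAL_CDS = 5027            # predicted protein-coding genes
CDS_WITH_FUNCTION = 3774    # assigned a putative function
COG_ASSIGNED = 3945         # genes assigned to COG categories
COG_CHR1 = 2343
COG_CHR2 = 1068
RNA_TOTAL = 69
RNA_CHR1 = 49
RNA_CHR2 = 20

# Genes annotated as hypothetical proteins (the "remainder").
hypothetical = TOTAL_CDS - CDS_WITH_FUNCTION

# COG-assigned genes carried on the five plasmids (the "remainder").
cog_on_plasmids = COG_ASSIGNED - COG_CHR1 - COG_CHR2

# Share of protein-coding genes with a COG assignment.
cog_fraction = COG_ASSIGNED / TOTAL_CDS

# RNA genes not located on either chromosome.
rna_on_plasmids = RNA_TOTAL - RNA_CHR1 - RNA_CHR2

print(f"Hypothetical proteins: {hypothetical}")
print(f"COG-assigned genes on plasmids: {cog_on_plasmids}")
print(f"CDS with COG assignment: {cog_fraction:.1%}")
print(f"RNA genes on plasmids: {rna_on_plasmids}")
```

By these counts, all 69 RNA genes are chromosomal, while several hundred COG-assigned genes reside on the plasmids.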

PHA metabolism
Table 7 Glycoside hydrolase genes in the genome of Yangia sp. CCB-MM3

The ability of Yangia sp. CCB-MM3 to accumulate the copolymer P(3HB-co-3HV) with 7 mol% of 3HV from a structurally unrelated carbon source was confirmed by NMR analysis (Fig. 4). In 'Nocardia corallina' and Rhodococcus ruber, P(3HB-co-3HV) is synthesized from simple carbon sources via a pathway in which the majority of propionyl-CoA is derived from the methylmalonyl-CoA pathway [30]. Similarly, genes encoding the complete methylmalonyl-CoA pathway were identified in Yangia sp. CCB-MM3 (Table 8), suggesting that this is one of the potential pathways providing propionyl-CoA in Yangia sp. Succinyl-CoA is an important intermediate of the methylmalonyl-CoA pathway. The isomerization of succinyl-CoA to (R)-methylmalonyl-CoA proceeds through the action of methylmalonyl-CoA mutase (AYJ57_16720). (R)-methylmalonyl-CoA is converted to the (S) form by methylmalonyl-CoA epimerase (AYJ57_06825). The latter is then decarboxylated to propionyl-CoA by methylmalonyl-CoA decarboxylase (AYJ57_16710). The formation of P(3HB-co-3HV) from its precursors, acetyl-CoA and propionyl-CoA, is catalyzed by three enzymes [10], and the genes encoding these enzymes were identified in the genome of CCB-MM3. The first reaction consists of either the condensation of two acetyl-CoA molecules or the condensation of acetyl-CoA and propionyl-CoA by β-ketothiolase, encoded by multiple phaA genes in CCB-MM3 (AYJ57_07995, AYJ57_09725, AYJ57_11220, AYJ57_15015 and AYJ57_20090). The resulting intermediate (acetoacetyl-CoA or 3-ketovaleryl-CoA) is reduced to 3-hydroxybutyryl-CoA or 3-hydroxyvaleryl-CoA by NADPH-dependent acetoacetyl-CoA reductase, encoded by phaB (AYJ57_01725, AYJ57_11215 and AYJ57_24165). The hydroxyacyl-CoA monomers are then incorporated into the growing polymer chain by PHA synthase, encoded by phaC [31]. The genome of Yangia sp. CCB-MM3 possesses two PHA synthase genes, phaC1Ys and phaC2Ys (AYJ57_06535 and AYJ57_14600), which are located on chromosomes 1 and 2, respectively. Both phaC1Ys and phaC2Ys encode 598-amino-acid proteins, which show 67% and 81% identity, respectively, with PhaC from Citreicella sp. SE45. These PHA synthases belong to Class I, whose members have only one subunit and show a preference for short-chain-length hydroxyacyl-CoA monomers [32].
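The monomer-supply and polymerization steps described in this section can be summarized as a small lookup structure. The Python sketch below encodes each step with its enzyme and the locus tags given in the text; the `PathwayStep` class and the helper function are hypothetical names introduced only for illustration, and the step descriptions are paraphrased from the text rather than taken from Table 8.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PathwayStep:
    stage: str                   # "monomer supply" or "polymer synthesis"
    enzyme: str
    reaction: str
    locus_tags: Tuple[str, ...]  # as listed in the text


PHA_PATHWAY: List[PathwayStep] = [
    PathwayStep("monomer supply", "methylmalonyl-CoA mutase",
                "succinyl-CoA -> (R)-methylmalonyl-CoA", ("AYJ57_16720",)),
    PathwayStep("monomer supply", "methylmalonyl-CoA epimerase",
                "(R)- -> (S)-methylmalonyl-CoA", ("AYJ57_06825",)),
    PathwayStep("monomer supply", "methylmalonyl-CoA decarboxylase",
                "(S)-methylmalonyl-CoA -> propionyl-CoA", ("AYJ57_16710",)),
    PathwayStep("polymer synthesis", "beta-ketothiolase (phaA)",
                "condensation of acetyl-CoA with acetyl-CoA or propionyl-CoA",
                ("AYJ57_07995", "AYJ57_09725", "AYJ57_11220",
                 "AYJ57_15015", "AYJ57_20090")),
    PathwayStep("polymer synthesis", "acetoacetyl-CoA reductase (phaB)",
                "NADPH-dependent reduction of the 3-ketoacyl-CoA intermediate",
                ("AYJ57_01725", "AYJ57_11215", "AYJ57_24165")),
    PathwayStep("polymer synthesis", "PHA synthase (phaC)",
                "polymerization of hydroxyacyl-CoA monomers",
                ("AYJ57_06535", "AYJ57_14600")),
]


def gene_copies_per_enzyme(pathway: List[PathwayStep]) -> dict:
    """Map each enzyme to the number of annotated gene copies."""
    return {step.enzyme: len(step.locus_tags) for step in pathway}


for enzyme, copies in gene_copies_per_enzyme(PHA_PATHWAY).items():
    print(f"{enzyme}: {copies} gene(s)")
```

Laid out this way, the contrast between the single-copy monomer-supply genes and the multi-copy phaA, phaB and phaC genes is easy to see.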
Besides genes that are directly involved in PHA biosynthesis, genes involved in other aspects of PHA metabolism, e.g. PHA depolymerase (phaZ), were annotated in the genome of Yangia sp. CCB-MM3. Since PHA is accumulated as a storage compound by its producer, some PHA producers harbour native machinery for the degradation of PHA. The synthesized PHA is catabolized by intracellular PhaZ and subsequently reutilized by the cell [33]. However, the mechanism that controls PHA biosynthesis and degradation in native producers is not yet fully understood. Two PHA depolymerase genes, phaZ1Ys and phaZ2Ys (AYJ57_12275 and AYJ57_14595), were found in CCB-MM3. A non-catalytic PHA granule-associated protein, phasin, was found to be encoded by a single copy of the phaP gene (AYJ57_14605) in CCB-MM3. Phasin has a putative role in maintaining the stability of the PHA granules formed by preventing the coalescence of separate granules [34]. The transcriptional repressor gene phaR (AYJ57_10595), which encodes a protein that regulates the transcription of phaP, was also annotated in the CCB-MM3 genome. It has been proposed that PhaR functions as a transcriptional repressor by binding to the upstream region of phaP [35].

Conclusions
At least 300 members of the family Rhodobacteraceae have publicly accessible genomes. Yangia sp. CCB-MM3, however, represents the first sequenced genome from the genus. The strain was selected for genome sequencing by our research group as part of a study focusing on characterizing the microbiome of Malaysian mangrove sediments. The genome of strain CCB-MM3 includes genes encoding the monomer-supplying and biosynthetic pathways for PHA production. Availability of the genome sequence will facilitate further study of the strain's biological potential and provide reference material for comparative genomic analysis with other Rhodobacteraceae.