Data on partial polyhydroxyalkanoate synthase genes (phaC) mined from Aaptos aaptos marine sponge-associated bacteria metagenome

We report data associated with the identification of three polyhydroxyalkanoate synthase genes (phaC) isolated from the marine bacteria metagenome of Aaptos aaptos marine sponge in the waters of Bidong Island, Terengganu, Malaysia. Our data describe the extraction of bacterial metagenome from sponge tissue, measurement of purity and concentration of extracted metagenome, polymerase chain reaction (PCR)-mediated amplification using degenerate primers targeting Class I and II phaC genes, sequencing at First BASE Laboratories Sdn Bhd, and phylogenetic analysis of identified and known phaC genes. The partial nucleotide sequences were aligned, refined, compared with the Basic Local Alignment Search Tool (BLAST) databases, and released online in GenBank. The data include the identified partial putative phaC and their GenBank accession numbers, which are Rhodocista sp. phaC (MF457754), Pseudomonas sp. phaC (MF437016), and an uncultured bacterium AR5-9d_16 phaC (MF457753).



Value of the data
These data reveal the presence of non-cultivable PHA-producing bacteria in marine sponges. They can be used for comparative studies of phaC genes isolated from marine environments, especially marine sponges, which are known bacterial hot-spots.
These data can be used in further experiments that examine the enzymatic activity of the identified phaC genes in synthesising polyhydroxyalkanoate in model organisms, such as E. coli.
These data serve as a benchmark, being the first report of the isolation of phaC genes from the South China Sea sponge Aaptos aaptos.

Data
These data provide detailed information on the isolation and identification of phaC genes from the marine bacterial metagenome of the sea sponge Aaptos aaptos at Bidong Island, Terengganu, Malaysia. Table 1 compares the sequenced phaC genes against the BLAST sequence databases. Table 2 lists the nucleotide sequences of the three putative, partial phaC genes identified from the A. aaptos marine sponge-associated bacteria metagenome. The protein identifiers assigned by GenBank to uncultured bacterium phaC 2 and 2B are ASV71961.1 and ASY93340.1, respectively. Fig. 1 shows a Neighbour-Joining phylogenetic tree of the evolutionary relationships between the identified phaC genes and known phaC genes from various sources.

Experimental design, materials and methods
The marine bacterial metagenome was extracted from the tissue of the sea sponge Aaptos aaptos, which was collected in the waters of Bidong Island, Terengganu, Malaysia, at a depth of 15 m (GPS: 5°36'48.1" N, 103°03'30.0" E) on June 16, 2016. The metagenome was extracted from 1 cm³ of sponge tissue using phenol-chloroform isoamyl alcohol (PCI), following a modified version of the protocol of Beloqui and coworkers [1]. Whole genome amplification (WGA) was then carried out on the extracted metagenome using the REPLI-g Mini Kit (Qiagen). The purity and concentration of the metagenome before and after WGA were measured using a Nanodrop™ 2000 Spectrophotometer (Thermo Fisher Scientific). The PCR reaction mixture was prepared using EconoTaq® PLUS 2X Master Mix (Lucigen) according to the manufacturer's instructions. Amplification was run on an Applied Biosystems™ Veriti 96-Well Thermal Cycler (Thermo Fisher Scientific) in the following sequence: pre-denaturation at 95°C for 3 min, denaturation at 95°C for 30 s, annealing at 56°C for 1 min, extension at 72°C for 90 s, and final extension at 72°C for 5 min. The degenerate primers used in the PCR targeted the Class I and II phaC genes and included the forward primer CF1 [2]. A semi-nested PCR was then carried out with forward primer CF2 (5′-GT(C/G)TTC(A/G)T(C/G)(A/G)T(C/G)(A/T)(C/G)CTGGCGCAACCC-3′) and reverse primer CR4, using a similar protocol, to amplify the target gene. The amplified PCR product was separated by 0.7% w/v agarose gel electrophoresis [3] using a PowerPac™ Basic power supply (Bio-Rad Laboratories) and visualised using a Gel Doc™ EZ Imager (Bio-Rad Laboratories). The amplified phaC genes were submitted to First BASE Laboratories Sdn Bhd for Sanger sequencing on an Applied Biosystems™ Genetic Analyzer, and the resulting sequences were aligned and refined using BioEdit software 7.2.6.
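The bracket notation in the CF2 primer marks positions where the synthesised primer pool contains more than one base. As an illustration only, the following Python sketch converts that notation to IUPAC ambiguity codes and counts how many distinct oligonucleotides the pool contains. The function names (`parse_primer`, `to_iupac`, `degeneracy`, `expand`) are hypothetical helpers written for this example and are not part of the original workflow. CF2 has seven two-fold degenerate positions, so the pool holds 2^7 = 128 variants.

```python
import re
from itertools import product

# IUPAC ambiguity codes for two-base choices (S, R and W occur in CF2).
IUPAC = {
    frozenset("CG"): "S", frozenset("AG"): "R", frozenset("AT"): "W",
    frozenset("CT"): "Y", frozenset("GT"): "K", frozenset("AC"): "M",
}

# CF2 forward primer as printed in the text, written 5' to 3'.
CF2 = "GT(C/G)TTC(A/G)T(C/G)(A/G)T(C/G)(A/T)(C/G)CTGGCGCAACCC"


def parse_primer(primer):
    """Return one set of allowed bases per primer position."""
    tokens = re.findall(r"\(([ACGT](?:/[ACGT])+)\)|([ACGT])", primer)
    return [frozenset(group.split("/")) if group else frozenset(base)
            for group, base in tokens]


def to_iupac(positions):
    """Collapse each position to a single base or an IUPAC ambiguity code."""
    return "".join(next(iter(p)) if len(p) == 1 else IUPAC[p]
                   for p in positions)


def degeneracy(positions):
    """Number of distinct sequences in the degenerate primer pool."""
    total = 1
    for p in positions:
        total *= len(p)
    return total


def expand(positions):
    """List every concrete sequence encoded by the degenerate primer."""
    return ["".join(combo)
            for combo in product(*(sorted(p) for p in positions))]


if __name__ == "__main__":
    positions = parse_primer(CF2)
    print(to_iupac(positions), len(positions), degeneracy(positions))
```

Listing the expanded variants with `expand` can help when checking a primer pool against a candidate template, although for highly degenerate primers the list grows exponentially.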
The query sequences were compared against the sequence databases using the BLAST tool (Table 1). The sequences were then released in the GenBank nucleotide sequence database on September 4, 2017, under accession numbers MF457754, MF457753, and MF437016 (Table 2).
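The similarity values reported from a BLAST comparison are percent identities over a local alignment. The sketch below shows that calculation for two pre-aligned sequences. It is a simplified illustration, not the BLAST algorithm itself: it assumes the alignment is already given, counts gap columns as mismatches, and divides by the alignment length. The function name `percent_identity` and the two short sequences are hypothetical and were made up for this example.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percent of alignment columns in which both sequences carry the same base.

    Both inputs must be the two rows of one pairwise alignment, with '-'
    marking gaps. Gap columns count toward the alignment length but never
    count as matches.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for q, s in zip(aligned_query.upper(), aligned_subject.upper())
        if q == s and q != "-"
    )
    return 100.0 * matches / len(aligned_query)


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the deposited sequences.
    query = "ATGGCGTT-ACC"
    subject = "ATGACGTTAACC"
    print(f"{percent_identity(query, subject):.1f}% identity")
```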
The phylogenetic tree (Fig. 1), constructed using the Neighbour-Joining method [4], shows the evolutionary relationships among the three identified, putative partial phaC genes and previously reported phaC genes with complete coding sequences (cds) released in the GenBank database. It comprises 36 ingroup nucleotide sequences and 1 outgroup nucleotide sequence. The optimal tree with the sum of branch length = 10.39764912 is shown. The percentages of replicate trees in which the associated taxa clustered together in the bootstrap test (3000 replicates) are shown next to the branches [5]. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [6] and are in units of the number of base substitutions per site. All positions containing gaps and missing data were eliminated. There were a total of 19 positions in the final dataset. Evolutionary analyses were conducted in MEGA7 [7].
Table 1 (partial): Rhodocista pekingensis phaC (AY283802.1), 74%
Table 2. Nucleotide sequences of the three putative, partial phaC genes identified from A. aaptos marine sponge-associated bacteria metagenome.
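The phylogenetic analysis removes every alignment position that contains a gap or missing data before distances are computed. The sketch below illustrates that complete-deletion step, followed by a pairwise distance matrix of the kind a Neighbour-Joining method takes as input. Two assumptions apply: the distance used here is the simple p-distance (proportion of differing sites), swapped in for the Maximum Composite Likelihood distance that MEGA7 computed, and the toy alignment and function names are hypothetical.

```python
def complete_deletion(alignment):
    """Drop every column in which any sequence has a gap or a non-ACGT symbol."""
    names = list(alignment)
    lengths = {len(alignment[name]) for name in names}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = zip(*(alignment[name].upper() for name in names))
    kept = [col for col in columns if all(base in "ACGT" for base in col)]
    return {name: "".join(col[i] for col in kept)
            for i, name in enumerate(names)}


def p_distance(seq_a, seq_b):
    """Proportion of sites at which two equal-length sequences differ."""
    if not seq_a:
        raise ValueError("no sites left to compare")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def distance_matrix(alignment):
    """Pairwise p-distances after complete deletion, keyed by name pairs."""
    cleaned = complete_deletion(alignment)
    return {(a, b): p_distance(cleaned[a], cleaned[b])
            for a in cleaned for b in cleaned}


if __name__ == "__main__":
    # Hypothetical toy alignment, not the deposited phaC sequences.
    toy = {"seqA": "ATGC-TGA", "seqB": "ATGCATGA", "seqC": "ATTCANGG"}
    print(complete_deletion(toy))
    print(distance_matrix(toy)[("seqA", "seqC")])
```

Complete deletion keeps only sites shared by all sequences, so a single short or gap-rich sequence can reduce the number of usable positions substantially.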