Classification and Identification of Petroleum Microorganisms by MALDI-TOF Mass Spectrometry

Indigenous bacteria isolated from a crude oil sample from a deep-water reservoir in the Pampo Sul Oilfield (Campos Basin-RJ, Brazil) were previously classified as strains of Bacillus pumilus. However, their enzymatic activities with fluorogenic probes and their rates of petroleum biodegradation differed markedly. Some of the bacteria depleted n-alkanes, whereas others did not. Aromatic compounds reported to be recalcitrant were also biodegraded by some of these Bacillus strains, revealing their outstanding ability to deplete petroleum. Further classification using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) followed by statistical analysis revealed that these strains could be clustered into three different groups, consistent with the evaluation of their enzymatic activity. A more accurate phylogenetic analysis using gyrB gene sequences confirmed the MALDI-TOF MS classification of three groups of strains and identified them as Bacillus safensis, B. cereus and B. thuringiensis.


Introduction
Petroleum is a complex mixture of hydrocarbons (aliphatic and aromatic) and compounds containing oxygen (carboxylic acids, phenols, ethers, etc.), nitrogen (imidazoles, carbazoles, etc.) and sulfur. 1 Hydrocarbons are economically important and are preferentially consumed during biodegradation, affording carboxylic acids and phenols, a process that adds new constituents to the petroleum (de novo synthesis). 2 Petroleum biodegradation can occur aerobically, anaerobically or both, as recently suggested by Cruz et al. 3 The mechanisms of the biodegradation processes are complex, 4 and understanding the role of each member of a microbial consortium during petroleum depletion is important, with particular emphasis on Bacillus strains, which are often present in petroleum samples. Identification of the microorganisms is therefore important, but it is cumbersome and requires digital genomic information for the detection of 16S rRNA genes to provide specific bacterial classifications. 5,6,9,10 However, this methodology fails to identify some environmental microorganisms because the current data libraries were mainly constructed using clinical microorganisms. 11 Additionally, some cases of the misidentification of microorganisms by phylogenetic techniques, such as the Bacillus safensis strains that were classified as B. pumilus by 16S rRNA, 12 prompted us to verify the potential of MALDI-TOF MS to classify and identify petroleum microorganisms.
Therefore, the aim of this work was to evaluate the reliability of MALDI-TOF MS in classifying petroleum microorganisms and its versatility in identifying them, to provide the basis for constructing a data library of petroleum bacteria.
[...] growth on nutrient agar (Difco) plates, the genomic DNA of pure cultures was isolated according to the protocol described by Pitcher et al. 14 Primers gyrB UP-1 and UP-2r were used for the amplification of DNA gyrase subunit B genes, 14 aiming at the identification of the isolates. The reaction mixtures (25 µL) contained 50 ng of genomic DNA, 2 U of Taq DNA polymerase (Invitrogen), 1× Taq buffer, 1.5 mmol L⁻¹ MgCl₂, 0.2 mmol L⁻¹ dNTP mix (GE Healthcare) and 0.4 µmol L⁻¹ of each primer. The polymerase chain reaction (PCR) amplification program consisted of 1 cycle at 94 °C for 5 min; 30 cycles of 94 °C for 1 min, 60 °C for 1 min and 72 °C for 2 min; and 1 cycle of final extension at 72 °C for 7 min in an Eppendorf thermal cycler. The PCR amplification of gyrB gene fragments was confirmed on a 1% agarose gel stained with SYBR Safe (Invitrogen). DNA fragments corresponding to gyrase gene sequences were purified using mini-columns (GFX PCR DNA and Gel Band Purification Kit, GE Healthcare) and subjected to sequencing in an automated sequencer (ABI 3500 XL Genetic Analyzer, Applied Biosystems). The sequencing reactions were performed with the BigDye Terminator Cycle Sequencing Standard Kit V3.1 (Applied Biosystems), according to the manufacturer's specifications. The primers for sequencing were UP-1 and UP-2r. 15 Identification was achieved by comparing the sequences with sequence data from type strains available in the GenBank public database (http://www.ncbi.nem.nih.gov). The sequences were aligned using the CLUSTAL X program 16 and analyzed using MEGA software v. 2.1. 17 Evolutionary distance was derived from sequence-pair dissimilarity, which was calculated as implemented in MEGA using the DNA substitution model. 18 The phylogenetic tree was prepared using the neighbor-joining algorithm, 19 with bootstrap values calculated from 1,000 replicate runs.
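As an illustration of how a distance can be derived from sequence-pair dissimilarity, the following minimal sketch computes the proportion of differing sites (p-distance) between two aligned sequences and applies a Jukes-Cantor correction. The choice of the Jukes-Cantor model, the function names and the example sequences are assumptions made for illustration only; the text does not specify which DNA substitution model was selected in MEGA.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance (assumed model, not stated in the text)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned fragments, for illustration only.
p = p_distance("ATGCATGCAT", "ATGCTTGCAT")
d = jukes_cantor(p)
```

A matrix of such pairwise distances is the input that a neighbor-joining routine would take to build the tree.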

Aerobic degradation assays
The biodegradation assays were conducted as described by Cruz et al. 13 with isolated microorganisms rather than a consortium. The experiments were performed in duplicate via incubation in a shaker (150 rpm, 28 °C) and monitored at 30 and 60 days. The biodegraded oil was extracted with CH₂Cl₂ (3 × 20 mL). The extracts were combined and dried over anhydrous MgSO₄, and the solvent was evaporated under reduced pressure. The oily residue was submitted to silica gel column chromatography (60 Å, 70-230 mesh, Sigma-Aldrich), which afforded fractions of saturated constituents (F1, eluted with hexane); heavy aromatic constituents (F2, hexane:toluene, 1:1 v/v); and resins and asphaltenes (F3, CHCl₃:MeOH, 95:5 v/v).

Gas chromatography-mass spectrometry (GC-MS)
The F1 fractions were analyzed using gas chromatography-mass spectrometry (GC-MS) with a Hewlett-Packard 6890 instrument connected to a Hewlett-Packard 5970-MSD mass spectrometer. The GC conditions were as follows: split injection (10:1) with He as the carrier gas at a flow rate of 1 mL min⁻¹. The chromatographic column was an MDN5S (30 m × 0.25 mm × 0.25 µm, Supelco). Data were obtained in both full scan and single ion monitoring (SIM) modes by electron ionization at 70 eV. The injector temperature was 300 °C. For the analysis of n-alkanes (reconstructed ion chromatogram (RIC) m/z 71), the scan mode was used with the following temperature program: 80 °C (2 min) to 270 °C at 4 °C min⁻¹ and to 300 °C at 10 °C min⁻¹ (hold for 25 min). Hopanes, homohopanes (m/z 191) and steranes (m/z 217) 1 were analyzed using the SIM mode and the following temperature program: 70 °C (2 min) to 190 °C at 30 °C min⁻¹, to 250 °C at 1.5 °C min⁻¹ and to 300 °C at 2 °C min⁻¹ (hold for 20 min). All samples were analyzed using 0.03 mg mL⁻¹ 5α-cholestan-3-one as an internal standard for biomarker quantitation.
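The run time implied by each oven program can be checked with simple arithmetic. The sketch below is only an illustration: the function name and argument layout are hypothetical, and the temperatures, ramp rates and hold times are those quoted above.

```python
def program_minutes(start_temp, initial_hold, ramps, final_hold):
    """Total duration (min) of a GC oven temperature program.

    ramps is a list of (target_temp_C, rate_C_per_min) steps applied in order.
    """
    total = initial_hold
    current = start_temp
    for target, rate in ramps:
        total += (target - current) / rate  # time spent on this ramp
        current = target
    return total + final_hold


# n-Alkane program: 80 °C (2 min), 4 °C/min to 270 °C, 10 °C/min to 300 °C, hold 25 min
alkane_run = program_minutes(80, 2, [(270, 4), (300, 10)], 25)

# Biomarker program: 70 °C (2 min), 30 °C/min to 190 °C, 1.5 °C/min to 250 °C,
# 2 °C/min to 300 °C, hold 20 min
biomarker_run = program_minutes(70, 2, [(190, 30), (250, 1.5), (300, 2)], 20)
```

Under these settings the n-alkane program lasts 77.5 min and the biomarker program 91 min per injection.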

Biomarker ratios
Biomarker ratios were calculated using the peak areas from RICs. For quantitative analysis, the response factor for the internal standard was calculated by dividing the concentration by the respective peak area. The concentrations of all compounds were determined by multiplying the respective peak area of the mass chromatograms by the response factor for the internal standard.
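The quantitation described above can be sketched as follows. The function names and all peak areas are hypothetical; only the internal standard concentration (0.03 mg mL⁻¹) comes from the text. The calculation assumes, as the description implies, that every analyte responds like the internal standard.

```python
def response_factor(is_concentration, is_peak_area):
    """Response factor of the internal standard: concentration / peak area."""
    return is_concentration / is_peak_area


def quantify(peak_areas, is_concentration, is_peak_area):
    """Convert analyte peak areas into concentrations via the internal standard.

    peak_areas maps a compound name to its peak area in the mass chromatogram.
    """
    rf = response_factor(is_concentration, is_peak_area)
    return {name: area * rf for name, area in peak_areas.items()}


# Hypothetical peak areas; 0.03 mg/mL is the internal standard concentration.
areas = {"hopane": 120000.0, "sterane": 30000.0}
concentrations = quantify(areas, is_concentration=0.03, is_peak_area=60000.0)
```

With these illustrative areas, a peak twice as large as the internal standard peak corresponds to 0.06 mg mL⁻¹.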

Preparation of bacterial cell samples for matrix-assisted laser desorption ionization-MS (MALDI-MS)
Approximately 2-5 mg of bacterial cells were suspended in 1 mL of 0.1% trifluoroacetic acid (TFA), vortexed for 3 min and centrifuged at 11000 rpm. After centrifugation, the supernatant was removed. The process was repeated 3× per sample. After extraction, 50 µL of 0.1% TFA was added, and the cells were then cooled and stored in a refrigerator.
All measurements were performed using a Synapt HDMS (Waters, Manchester, UK) mass spectrometer. MALDI(+)-quadrupole time-of-flight (QTOF) MS of cells was performed in positive ion mode using the reflectron V-mode and a 200 Hz solid state (Nd:YAG) laser. An aliquot (4 µL) of a cell suspension was added to 10 µL of acetonitrile (MeCN) with 0.1% HCO₂H and mixed (1:1) with a 10 mg mL⁻¹ solution of α-cyano-4-hydroxycinnamic acid (α-CHCA) matrix in 50:50 H₂O/MeCN with 0.1% HCO₂H. The mixture was then directly spotted (1.5 µL) onto MALDI stainless steel target plates and allowed to dry. The typical operating conditions were as follows: laser energy of 250 a.u., sample plate voltage of 20 V, and trap and transfer collision energies of 6 and 4 V, respectively. Calibration was performed using polyethylene glycol (PEG) 600/1000/2000 sodium adducts between m/z 300 and 3000.

Results and Discussion
The seven microorganism strains were isolated from a biodegraded petroleum sample (P2) from the Pampo Sul Oilfield (Campos Basin), and they exhibited different hydrolase activities (Table 1) against five fluorogenic probes (Figure 1). 20 Enzymatic hydrolyses of the epoxides (EP1 and EP2) produced diols (1) via epoxide ring opening, which were subsequently oxidized by the sodium periodate present in the experimental solution, yielding the aldehyde (2) that, by β-elimination, produced the fluorescent signal of the umbelliferyl anion (3). The intensity of this signal was used to monitor the enzymatic activity, i.e., the fluorescent signal relative to that of a standard reaction gave the enzymatic reaction yield as a percentage. Thus, more intense fluorescent signals correspond to more active enzymes. Similar processes occurred with the esters ES1, ES2 and ES3, in which the enzymatic hydrolysis of the esters produced fluorescent signals (Figure 1).
Aromatic compounds, such as phenanthrene and methylphenanthrenes, were also biodegraded. These classes of aromatic compounds are used in organic geochemistry to obtain information regarding the maturity of source rock. 1 The methylphenanthrene index (MPI-1) was evaluated in biodegradation assays, and we observed [...] (Table 3).
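The text does not state how MPI-1 was computed. The sketch below assumes the commonly cited definition of Radke and Welte, MPI-1 = 1.5 × (2-MP + 3-MP) / (P + 1-MP + 9-MP), where P is phenanthrene and MP the methylphenanthrene isomers. The function name and the peak areas are hypothetical.

```python
def mpi1(p, mp1, mp2, mp3, mp9):
    """Methylphenanthrene index MPI-1 from peak areas (assumed definition).

    p   : phenanthrene
    mp1 : 1-methylphenanthrene
    mp2 : 2-methylphenanthrene
    mp3 : 3-methylphenanthrene
    mp9 : 9-methylphenanthrene
    """
    denominator = p + mp1 + mp9
    if denominator == 0:
        raise ValueError("phenanthrene, 1-MP and 9-MP areas are all zero")
    return 1.5 * (mp2 + mp3) / denominator


# Hypothetical peak areas, for illustration only.
index = mpi1(p=100.0, mp1=40.0, mp2=50.0, mp3=45.0, mp9=50.0)
```

Because the index is a ratio of isomer abundances, selective biodegradation of any one isomer shifts its value, which is why it is informative in degradation assays.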
To identify these strains, a 16S rRNA phylogenetic dendrogram of these microorganisms was generated using their specific gene sequences (Figure 3), grouping them as the B. pumilus/B. safensis group. However, all seven strains of the B. pumilus/B. safensis group were enzymatically different, showing different preferences for the biodegradation of the petroleum constituents. These results required additional evidence to confirm that these microorganisms were distinct, and thus, we performed whole cell MALDI-TOF MS analyses.
All samples were prepared in triplicate, and the MS data were automatically acquired for 1.7 min using a spiral laser pattern over the sample. All of the acquired spectra showed good reproducibility (see Supplementary Information), so the spectral data were smoothed (Savitzky-Golay) using MassLynx® and binned using MSTable_v2 (homemade software) before statistical analysis. The binned data were mean centered and submitted to statistical analysis. Hierarchical cluster analysis (HCA) of the mass spectra using the software Pirouette® v. 3.11 revealed three different groups with ca. 50% similarity (Figure 4). Similar groups were also discriminated via principal component analysis (PCA) (Figure 5, score 1 vs. score 2). The number of principal components (PCs) was set to 2, which explained 65.49% of the variance; cross-validation was also applied (Table S1). Additionally, the loadings vs. variables plot from the PCA showed that all of the intense mass signals contributed to the sample classification (Figure 6). These results were inconsistent with the previous identification by 16S rRNA gene sequencing. Consequently, gyrB gene sequencing and a new phylogenetic analysis were undertaken, which allowed the discrimination of these bacterial isolates into distinct Bacillus species (Figure 7). The use of the gyrB gene as a phylogenetic marker confirmed the MALDI-TOF MS discrimination of the strains, indicating the effectiveness of this technique even for strains that cannot be differentiated by the 16S rRNA gene.
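The binning and mean-centering steps that precede HCA and PCA can be sketched as below. This is not the MSTable_v2 or Pirouette implementation, whose internals are not described in the text. The bin width, the m/z window and the peak lists are assumptions chosen to keep the example small.

```python
import math


def bin_spectrum(peaks, mz_min, mz_max, bin_width):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is a list of (mz, intensity) pairs; peaks outside [mz_min, mz_max)
    are ignored.
    """
    n_bins = int(math.ceil((mz_max - mz_min) / bin_width))
    binned = [0.0] * n_bins
    for mz, intensity in peaks:
        if mz_min <= mz < mz_max:
            index = min(int((mz - mz_min) // bin_width), n_bins - 1)
            binned[index] += intensity
    return binned


def mean_center(rows):
    """Subtract the column mean from every variable (one row per sample)."""
    n = len(rows)
    means = [sum(column) / n for column in zip(*rows)]
    return [[value - m for value, m in zip(row, means)] for row in rows]


# Hypothetical peak lists for two samples, binned at 1 m/z unit.
spectra = [
    [(300.2, 10.0), (300.7, 5.0), (302.5, 20.0)],
    [(300.4, 30.0), (304.9, 10.0)],
]
binned = [bin_spectrum(s, 300.0, 305.0, 1.0) for s in spectra]
centered = mean_center(binned)
```

The resulting sample-by-bin matrix, with each column centered on zero, is the form of input that clustering and PCA routines generally expect.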

Conclusions
MALDI-TOF MS was successfully applied for the characterization of seven petroleum bacteria based on whole cell MS analyses. These seven petroleum strains were clustered into three groups, consistent with the molecular identification using the gyrB gene sequence as the phylogenetic marker. These results show the reach of the technique, as the usual phylogenetic marker (16S rRNA) was not able to distinguish these strains due to their similarities.
The intrinsic characteristics of the technique, such as its analysis speed, low sample requirement and relative ease of use, make MALDI-TOF MS a promising approach for the identification of petroleum bacteria, which motivates the assembly of a data library from petroleum microorganisms obtained using MALDI-TOF MS equipment. Additionally, the commercially available programs for bacterial identification are limited when applied to a consortium, which indicates the need for the development of robust software to address mixtures of microorganisms, such as those found in petroleum.

Figure 3. Phylogenetic analysis based on partial 16S rRNA sequences obtained from the bacterial isolates belonging to the Bacillus safensis/B. pumilus group.

Figure 4. Dendrogram from HCA (mean center) of the MALDI-TOF MS analysis of whole cells from seven isolated strains preliminarily identified as Bacillus pumilus.

Figure 5. PCA (mean center) scores plot from the MALDI-TOF MS analysis of SG-X whole cells from seven isolated strains preliminarily identified as Bacillus pumilus.

Figure 7. Phylogenetic analysis based on partial gyrase B gene sequences obtained from bacterial isolates belonging to the Bacillus safensis/B. pumilus and B. cereus/B. thuringiensis groups.

Table 3. MPI-1 index values for the biodegradation assays.