Introduction

DNA phosphorothioate (PT) modification via the replacement of a non-bridging phosphate oxygen with sulfur is a well-known synthetic mechanism of stabilizing oligonucleotide therapeutic agents against nuclease degradation in serum and in cells1. However, we recently discovered that DNA PT modification occurs naturally in the DNA backbone in an RP stereo-specific and sequence-selective manner in bacteria that contain the five-member dndABCDE gene cluster2,3,4. Beginning with the original report in Streptomyces lividans that the dnd cluster conferred an unusual DNA modification that resulted in Tris-dependent DNA degradation during electrophoresis, the presence of dnd genes has since been established in more than 200 diverse species of bacteria and archaea. These genes were likely spread via horizontal gene transfer5,6.

A subset of these bacteria possess dnd gene clusters that lack dndA and possess only dndBCDE; in these strains, the DndA cysteine desulfurase activity is functionally replaced by an IscS homologue7. The DndC protein has ATP pyrophosphatase activity and shows significant homology with the PAPS reductase, while DndB has sequence homology with a group of transcriptional regulators6,8. A DndD homologue in Pseudomonas fluorescens Pf0-1, SpfD, exhibits ATPase activity and is potentially involved in DNA nicking during sulfur incorporation9. The recent X-ray crystallographic analysis of DndE revealed that the protein adopts a tetramer conformation that exhibits binding affinity to nicked double-stranded DNA10. Another three-member cluster, dndFGH, is located downstream of dndBCDE. This cluster has been identified in approximately 86 bacterial stains. The proteins encoded by this cluster exhibit similarities to classical methylation-based restriction-modification (R-M) systems, and, according to the results of a transformation assay, dndFGH provide a genetic barrier to foreign plasmids without PT11. However, approximately 125 strains have been found to possess only dndBCDE12. The absence of dndFGH cognates suggests that PT modifications and dndBCDE genes provide functions other than a typical R-M system, such as control of gene expression.

Due to the resistance of PT linkages to nuclease cleavage, DNA harboring PT modifications yields PT-linked dinucleotides and canonical nucleotides upon enzymatic digestion2. Based on this property, we have developed a sensitive HPLC-coupled tandem mass spectrometry (LC-MS/MS) technique to identify PT sequence contexts and quantify their frequencies in the bacterial genome13. To date, diverse PT sequence contexts have been identified by LC-MS/MS, including d(GPSG) in Streptomyces lividans, d(GPSA) and d(GPST) in both Escherichia coli B7A and Salmonella enterica serovar Cerro 87 and d(CPSC) in Vibrio cyclitrophicus FF7513. We recently adapted the SMRT sequencing technology for PT mapping across bacterial genomes12. In E. coli B7A, d(GPSA) and d(GPST) dinucleotides were located in complementary GPSAAC/GPSTTC sequence contexts without any apparent strict consensus sequence beyond four nucleotides12,13. Unexpectedly, only 18% of the GAAC/GTTC sites in the genome were found to be modified by PT and the PT modifications could be partial at any particular genomic site even in the presence of dndFGH, ruling out the known R-M mechanisms. PT mapping in V. cyclitrophicus FF75 lacking dndFGH revealed unexpected single-stranded PT modifications in a CPSCA motif; again only a small percentage (14%) of the CCA sites were modified12. Based on these studies, it remains unclear whether PT modifications are involved in the expression of particular genes or if they confer advantageous traits to the bacteria.

In this study, we carried out genome-wide transcriptome analysis to capture genes that were differentially transcribed in response to the loss of PT modification. The global transcriptome analysis revealed that the SOS response was activated after the loss of the DNA PT modifications. It was confirmed that the non-PT-modified DNA suffered DNA damage by the products of the dndFGH and dndI genes, triggering the SOS response, cell filamentation and prophage induction. However, the DNA cleavage activity of DndFGHI could be prevented by the expression of dndBCDE homologues that added PT modifications to the sequence GPSA and GPST. These data show that the PT modification system possesses features that are different from known R-M systems. Our results will provide a better understanding of the biological functions of physiological DNA PT modifications.

Results

Transcriptome profiling in response to PT loss

To characterize the cellular responses to PT modifications, RNA-Seq analysis was conducted to define the global changes in gene expression in response to the loss of dnd. We selected mid-log-phase cultures of XTG102, the dndBCDE in-frame deletion mutant, to perform RNA-seq analysis with wild-type S. enterica as the reference. Differential gene expression analysis in XTG102 showed that a total of 184 genes were expressed at a degree significantly greater than 2-fold (log2-ratio > 1, p-value < 0.05) different from the wild-type strain, including 89 up-regulated genes and 95 down-regulated genes (Fig. 1, Supplementary Data). To validate the RNA-seq data, the expression levels of twelve differentially expressed genes were confirmed using quantitative RT-PCR (Fig. 2).

Figure 1
figure 1

Comparison of gene expression profiles.

(A) Gene organization of dndBCDE and dndFGHI in wild-type S. enterica and mutants. White boxes indicate deleted regions. (B) Heat map illustrating changes in gene expression levels in XTG102, XTG103 and XTG104 relative to wild-type S. enterica (green represents down-regulation; red represents up-regulation). The log2 fold change ranges are shown at the upper bars. (C) Venn diagram indicating the number of individually or commonly regulated genes in XTG102, XTG103 and XTG104.

Figure 2
figure 2

qRT-PCR validation of differentially expressed genes.

Expression analysis of the 12 selected genes determined by RNA-seq (patterned bars) and validated by qRT-PCR (solid bars). The levels are presented relative to the values of the wild-type S. enterica strain. The values represent the average gene expression ± SD from three independent qRT-PCR experiments.

Notably, a number of known damage-inducible SOS genes, including lexA, recN, dinF, dinI and sulA, emerged in the clusters of up-regulated genes (Table 1). The mechanism by which the E. coli SOS regulon is induced has been intensively investigated with a variety of DNA-damaging agents (e.g., UV, mitomycin C and )14,15,16. Upon DNA damage, RecA becomes activated as a co-protease by binding to single-stranded DNA. The RecA/ssDNA co-protease mediates the cleavage of LexA, the repressor protein of the SOS genes, followed by the induction of the SOS response and cell division arrest to prevent the transmission of damaged DNA to daughter cells17. This cell division arrest is due to the induction of SulA, a bacterial SOS checkpoint protein, that later blocks the formation of the FtsZ ring and causes the formation of non-septate bacterial filaments. The expression of sulA in XTG102 exhibited a 3.33-fold (log2 fold change) increase compared to that in the wild-type strain (Table 1). In addition, twenty-nine bacteriophage-related genes, clustered in close proximity to each other, were significantly up-regulated; this was in agreement with the observation that DNA damage and the SOS response can induce prophage gene expression (Supplementary Data)18. Taken together, these features implied that DNA damage occurred in non-PT modified XTG102.

Table 1 Differentially transcribed SOS genes in XTG102, XTG103 and XTG104

In addition to the SOS and prophage genes, a cluster of genes involved in the production of the siderophore enterobactin were also significantly up-regulated in XTG102 (Supplementary Data). Enterobactin is a high-affinity siderophore that acquires iron for the bacteria from the environment. A major host defense against infection is the limitation of iron availability to prevent pathogen growth19. The ability to produce siderophores has been reported to be important for the virulence of E. coli, Salmonella typhimurium, Neisseria gonorrhoeae and Pseudomonas aeruginosa, among others20,21,22,23. A transcriptome comparison showed that the induction of enterobactin biosynthesis, suggesting that the dnd system might affect the virulence of S. enterica.

Loss of DNA PT modifications leads to cell division arrest of S. enterica

The transcriptome analysis suggests the induction of the SOS response upon the loss of the PT modification system. To elucidate the biological effects of DNA PT modifications, we first compared the morphology of the wild-type S. enterica 87 and XTG102. Fluorescence microscopy showed that XTG102 exhibited significant morphological changes from the typical rod-shape of the wild-type to an elongated filamentous phenotype. As shown in Fig. 3, uniform membranes without septa as well as large decondensed chromosomes were visualized in filamentous XTG102 stained with the lipophilic membrane dyes FM4-64 and DAPI, suggesting defects in cell division, chromosome condensation and segregation.

Figure 3
figure 3

Cell morphology and comet assay.

For cell morphology assay, cell membranes were stained with FM-64 and DNA was stained with DAPI. For comet assay, DNA was stained with ethidium bromide. The cells were visualized by fluorescence microcopy.

A restriction-dependent DNA damage phenomenon in XTG102

Many conditions have been reported to cause cell filamentation, including DNA damage, low temperature, nutrient deprivation, cell division defects and membrane aberrations16,24,25. The transcriptome analysis suggests that DNA may be damaged upon the loss of the PT modification system. To investigate the possible DNA damage in XTG102, we conducted a comet assay; this method is also referred to as the single-cell gel electrophoresis assay26. The alkaline comet assay (pH > 12.3) and the neutral comet assay (pH < 10) are employed to differentiate single-strand and double-strand breaks, respectively27. In contrast to wild-type S. enterica cells, XTG102 displayed pronounced DNA double-strand breaks in the neutral comet assay. The image obtained with XTG102 resembles a “comet” with a distinct head of intact DNA and a tail consisting of damaged or broken pieces of DNA (Fig. 3).

XTG103 and XTG104, the in-frame deletion mutants of dndBCDE-orf5-dndFGH and dndF respectively, exhibited intact DNA without a tail in the comet assay as well as normal cell shapes, indicating the involvement of dndFGH-orf5 gene products in DNA double-strand breakage (Fig. 1 & 3). We conducted an RT-PCR analysis and found that dndFGH-orf5 was co-transcribed as a single operon (Supplementary Fig. 1). The in-frame deletion of dndB-E together with the individual deletion of dndFGH-orf5 genes, yielding mutants HW1, HW2, HW3 and HW4, respectively, restored normal morphology. These results suggest that DndFGH and Orf5 were responsible for the DNA double-stranded cleavage. Considering its relationship to the dnd genes, orf5 was designated as dndI.

The transcriptome analysis revealed that the SOS genes were not induced in XTG103 and XTG104; this result was consistent with the normal cellular morphology observed for those strains. XTG103 and XTG104 displayed similar transcriptional profiles, indicating a major contribution of dndFGHI to the global transcriptional changes (Fig. 1). XTG103 and XTG104 had 229 and 216 differentially transcribed genes, respectively, compared to the wild-type S. enterica, of which 186 genes shared common transcriptional patterns. The most significantly changed genes in XTG103 and XTG104 are in the energy production and conversion, carbohydrate transport and metabolism, transcription, function unknown and unclassified COG groups (Fig. 4, Supplementary Data).

Figure 4
figure 4

COG functional categories of the differentially expressed genes in dnd mutants.

Each bar represents the actual number of genes (log2 ratio ≥ 1 and ≤−1 and with p-value<0.05). The COG categories are identified by capital letters as follows: C, energy production and conversion; D, cell cycle control, cell division and chromosome partitioning; E, amino acid transport and metabolism; F, nucleotide transport and metabolism; G, carbohydrate transport and metabolism; H, coenzyme transport and metabolism; I, lipid transport and metabolism; J, translation; K, transcription; L, replication; M, cell wall/membrane/envelope biogenesis; N, cell motility; O, posttranslational modification, protein turnover, chaperones; P, inorganic ion transport and metabolism; Q, secondary metabolites biosynthesis, transport and catabolism; R, general function prediction only; S, function unknown; T, signal transduction mechanisms; U, intracellular trafficking and secretion; V, defense mechanisms; and -, not in COGs. Note that since COG annotation groups overlap, the sum of COG annotated genes is larger than the number of total up- and down-regulated genes analyzed.

The DNA damage in XTG102 can be rescued by heterologous dnd genes

To investigate whether heterologous DNA PT system could reverse DNA damage by DndFGHI in XTG102, we constructed a plasmid, pWHU1841, carrying dnd genes from V. tasmaniensis 1F267 (Table 2). The plasmid conferred DNA PT modifications in XTG102 at the GPSA and GPST sequence contexts with the frequency of 198 ± 4 and 203 ± 3 per 106 nt, respectively, with predicted 4 bp consensus sequences of GPSAAC/GPSTTC. XTG102(pWHU1841) possessed a close level of PT modifications to wild-type S. enterica, which contained 195 ± 1 and 206 ± 4 of GPSA and GPST per 106 nt by LC-MS/MS analysis. pWHU1841 prevented DNA damage and fully restored the normal morphology of XTG102. Considering the partial PT modifications across the genome, DndFGHI might require more DNA signatures to determine the particular GAAC/GTTC sites to cleave.

Table 2 Strains and plasmids used in this study

Discussion

DNA PT modification involves in the incorporation of sulfur into the DNA backbone and has been regarded as a novel type of physiological modification. Phosphorothioation is wide-spread in bacteria in nature. However, its physiological functions are poorly characterized. In this work, we performed a global transcriptome analysis to define the biological functions of PT modifications. After PT loss, 184 genes were differentially transcribed in XTG102, while 126 genes were unique in comparison to XTG103 and XTG104 (Fig. 1). The biological roles of the differentially transcribed genes were categorized based on COG groups. The most highly up-regulated genes are in the replication (L), inorganic ion transport and metabolism (P), general function prediction only (R), function unknown (S) and unclassified COG groups, while the significantly down-regulated genes are in the energy production and conversion (C), amino acid transport and metabolism (E), carbohydrate transport and metabolism (G) and unclassified COG groups (Fig. 4). Most notable was the up-regulation of protein-encoding genes involved in the cellular SOS response in XTG102, in addition to the induction of prophage genes. Moreover, we observed that the genes involved in the biosynthesis of the siderophore enterobactin, a virulence factor in pathogenic bacteria, were up-regulated, which is unique compared to other reported SOS responses in E. coli, Pseudomonas spp. and Stapylococcus aureus28,29,30.

Generally, upon DNA damage, RecA becomes activated as a co-protease by binding to single-stranded DNA. The RecA/ssDNA co-protease mediates the cleavage of LexA, the repressor protein of the SOS genes. However, the global SOS response consists of two subpathways, one that is RecA-LexA regulated and one that is RecA-LexA independent. Compared to the reported overall global SOS response of Burkholderia thailandensis DW503 to mitomycin C, a DNA-damaging chemical, we observed both common and strain-specific differential transcription patterns in XTG102. In addition to the up-regulation of eight RecA-LexA dependent ‘classical’ SOS genes, including recA, recX, dinB, lexA, uvrA, etc., another 111 genes were found to be differentially expressed in B. thailandensis DW50331. These genes involved in energy production and conversion (C), carbohydrate transport and metabolism (G), lipid transport and metabolism (I), transcription (K), cell wall/membrane biogenesis (M), cell mobility (N), inorganic ion transport and metabolism (P) and general function prediction only (R)31. We compared the global transcriptome of XTG102 to the SOS response of Burkholderia thailandensis. In addition to the 14 ‘classical’ SOS genes in XTG102, 50 genes belong to the above COG groups (11 genes in group C, 5 genes in group G, 3 genes in group I, 6 genes in group K, 2 genes in group M, 2 genes in group N, 12 genes in group P and 9 genes in group R). That means, roughly 40% of genes (50 out of 126) could be possibly secondary to the SOS response.

DNA damage stimulates the SOS response followed by prophage induction. The well-studied model of prophage and SOS interactions is the E. coli λ phage model. The RecA/ssDNA co-protease assists in the cleavage of the CI repressor of the lambda phage, thus transforming the phage from a lysogenic to a lytic form32. Twenty-nine phage-related genes were significantly up-regulated in XTG102. Filtered supernatant from XTG102 failed to produce plaques when spotted on lawns of XTG102, XTG103 and other dnd mutants. An analysis of the sixty prophage genes did not reveal recombinase-coding genes, which are necessary for excision of the phage DNA and for the production of lytic phage particles33. Therefore, the putative prophage is likely defective and unable to form productive prophage particles, although multiple prophage genes are up-regulated.

Our study showed that dndFGH functioned together with dndI to cleave non-PT-protected DNA in the absence of dndBCDE and thus activated the cellular SOS response. A portion of the XTG102 cells,about 10% of cell populations grown in LB medium in exponential growth (OD600 = 0.8), remained the same size as the wild-type cells and displayed normal nucleoid organization. The heterogeneity of the SOS response at the single-cell level can be explained by the two-population model (TPM)17. The TPM suggests that some cells in the population express SOS at a high level while others do not express SOS at all and therefore the average expression in all of the cells leads to the observed increase in SOS gene expression17. One possible reason for the presence of two populations may be that replication fork collapse is a stochastic process and that there are multiple pathways for restarting repaired replication forks34. Thus, it is feasible that DndFGHI do not cleave consistently at a given site in a population of cells; this possibility is in agreement with the observation that PT modifications were partial at a given GAAC/GTTC site. The SOS response is only triggered in cells that have experienced replication fork stalling due to DNA damage at critical sites by DndFGHI.

Interestingly, the dndBCDE homologues from V. tasmaniensis 1F267 protected DNA against the cleavage by DndFGHI in XTG102. This observation implied that DndFGHI require GAAC/GTTC as recognition sites. Previous SMRT sequencing shows that not all of the PT modification sites are actually modified and it remains unclear how these GAAC/GTTC sites without PT could escape cleavage by DndFGHI. One possibility is that the recognition sites of DndBCDE and DndFGHI require DNA motifs in addition to the 4 bp consensus contexts, e. g., secondary structures or particular DNA signatures. Taken together, these results identified both common and unique adaptive responses of S. enterica to the DNA PT modification system of dndBCDE and dndFGHI.

Methods

Bacterial strains and growth conditions

PT-containing, enantiomerically pure dinucleotides in the RP and SP configurations were obtained from IBA Bio-Tagnology (Germany). The bacterial strains and plasmids used in this study are listed in Table 2. E. coli strains were grown in Luria-Bertani (LB) broth at 37°C. Salmonella strains were cultured in tryptic soy broth medium supplemented with 2% NaCl, while Vibrio strains were grown in Difco™ Marine Broth 2216 at 28°C. When necessary, antibiotics were used at the following concentrations: 100 μg/ml of ampicillin and 25 μg/ml of chloramphenicol. The in-frame deletion mutants of S. enterica, including HW1, HW2, HW3 and HW4, were constructed as previously described11.

Illumina RNA-seq and data analysis

Enrichment of the mRNA, fragment interruption, the addition of adapters, size selection, PCR amplification and RNA-Seq were performed by the staff at the Beijing Genome Institute (BGI) (Shenzhen, China). In general, the total RNA was extracted using RNAprotect Bacteria Reagent (Qiagen) and RNeasy Mini kits (Qiagen), followed by further purification using a MICROBExpress Kit (Ambion) to enrich the mRNA. Bacterial mRNA was fragmented using a RNA fragmentation kit (Ambion) and the resultant fragments were in a size range of 200–250 bp. Double-stranded cDNA was generated using the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen, Carlsbad, CA) and purified using a QiaQuick PCR extraction kit. An RNA-seq library was constructed using an Illumina Paired End Sample Prep kit and the sequencing was performed using the HiSeq 2000 (Illumina, CA) sequencer at BGI in Shenzhen35.

Raw reads produced from HiSeq 2000 contain dirty reads,which are adapters, unknown or low quality bases. To obtain clean reads, the dirty raw reads were discarded via the following four steps: (1) remove reads with adaptors; (2) remove reads in which unknown bases are greater than 10%; (3) remove low-quality reads (the percentage of the low quality bases of quality value ≤ 5 is more than 50% in a read); (4) clean reads were obtained. Clean reads were mapped to the Salmonella enterica genome using Tophat (version 2.0.5) software36. The assembly and quantification of transcripts were carried out using Cufflinks (version 2.1.1)37. Differentially expressed genes were identified using Cuffdiff, a part of the Cufflinks package, which calculated expression in two or more samples and reports genes and transcripts that are differentially expressed38. Genes with an adjusted p-value < 0.05 and fold change ≥ 2 were identified as being differentially expressed.

Quantitative RT-PCR analysis

The total RNA was isolated using an RNeasy Mini Kit (Qiagen). cDNA was generated using the RevertAid™ First Strand cDNA Synthesis Kit (Thermo). The primers used are listed in Supplementary Table 1. The housekeeping gene gapA, which encodes GAPDH, was used as a reference. Real-time RT-PCR was performed using the SsoFast™ EvaGreen® Supermix With Low ROX Kit (Bio-Rad) and a 7900HT Fast Real-Time PCR System (Applied Biosystems). An initial cycle at 95°C for 30 s was followed by 40 cycles of 95°C for 10 s and 58.5°C for 30 s. The RT-PCR data were analyzed using the 2−ΔΔCt method to determine the transcriptional differences between strains.

Neutral bacterial comet assay to detect DNA double-strand breaks

The comet assay was performed essentially as described by Dipesh Solanky et al39. Slides were precoated with 0.5% regular agarose and then covered with a 24 mm × 24 mm cover glass. The slides were incubated at 4°C for 10 min to cool the first layer of agarose. The coverslip was then removed and a subsequent agarose layer was added in the same manner. The first layer consisted of 100 μl of 0.5% regular agarose prepared in 0.85% NaCl. For the second layer, 2 μl of overnight-cultured bacterial cells were mixed thoroughly with 200 μl of 0.5% low-melting point agarose (Sigma). Then, 100 μl of this mixture was transferred to the slide as the second layer. A third layer comprised of 0.5% regular agarose solution containing 5 μg/ml of RNase A (Sigma), 1 mg/ml of lysozyme (Sigma) and 0.25% sodium N-lauroyl sarcosine. The slides were refrigerated for 10 min at 4°C and incubated for 30 min at 37°C. The embedded cells were then lysed by immersing the slides in a solution containing 2.5 M NaCl, 100 mM EDTA, 10 mM Tris, pH 10, 1% sodium lauroyl sarcosine and 1% Triton X-100 for 1 h at room temperature. Fresh Triton X-100 was added to the lysis solution with stirring for at least 10 min at room temperature. After lysis, the slides were transferred to an enzymatic digestion solution containing 2.5 M NaCl, 10 mM EDTA, 10 mM Tris, pH 7.4 and 1 mg/mL of proteinase K (Sigma). The slides were incubated in this solution at 37°C for 2 h. Following digestion, the slides were placed on the horizontal slab of an electrophoretic unit (Bio-Rad) and equilibrated for 20 min in buffer containing 1 mM EDTA and 100 mM Tris, pH 10. The slides were electrophoresed in this buffer for 10 min at 12 V and 100 mA. After electrophoresis, the slides were sequentially immersed in 1 M ammonium acetate prepared in ethanol for 20 min, absolute ethanol for 30 min and 70% ethanol for 10 min before being air-dried. To achieve uniform staining, the slides were pretreated with 50 μL of a solution of 5% DMSO and 10 mM NaH2PO4. The slides were then stained with 20 μL of 2 μg/mL ethidium bromide in 5% DMSO. Observations were made by using a Zeiss AxioCam HRc fluorescent microscope at 40 × magnification with the Cy3 filter (excitation 548 nm and emission 562 nm).

Fluorescence microscopy

The cells were stained and observed as previously described40,41. The cells were collected by centrifugation (5,000 × g, 5 min) and washed three times with saline (0.85% sterile NaCl solution). The washed cells were resuspended in an appropriate volume of saline. A 10 μl portion of the sample was spread on a clean glass slide and dried at room temperature. The sample was further dried and fixed completely with drops of methanol for 5 min. After fixing, the slide was washed six times with tap water in a large beaker. The slide was dried at room temperature and 10 μl of poly-L-lysine (5 μg/ml in distilled water) was spread over the cells at room temperature to fix all cells tightly to the slide. DAPI solution (10 μl of 4′, 6-diamino- 2-phenyl-indole at a concentration of 5 μg/ml in saline) was dropped onto the sample and incubated for 5 min. Next, 10 μl of FM4-64 (2 μg/ml in distilled water) was subsequently dropped onto the sample. A clean glass coverslip was put on the drop and the sample was immediately observed with a Nikon A1 confocal system. DAPI and FM4-64 were viewed using the Ex488 and Ex561 filters, respectively.

Quantification of PT modifications by LC-MS/MS

All Salmonella strains were grown in LB medium to an OD600 of 0.8 at 28°C. DNA samples were isolated, hydrolyzed and dephosphorylated as described previously13. Prior to quantitative analysis of PT-linked dinucleotides, external calibration curves containing varying amounts of d(GPSA) RP and d(GPST) RP standards (5, 10, 15, 25, 50 pmol) was plotted with 25 pmol of d(GPSA) SP as reference. The correlation coefficients (r2) were more than 0.999.