Ribosome profiling reveals the fine-tuned response of Escherichia coli to mild and severe acid stress

ABSTRACT The ability to respond to acidic environments is crucial for neutralophilic bacteria. Escherichia coli has a well-characterized regulatory network that triggers a multitude of defense mechanisms to counteract excess protons. Nevertheless, systemic studies of the transcriptional and translational reprogramming of E. coli to different degrees of acid stress have not yet been performed. Here, we used ribosome profiling and RNA sequencing to compare the response of E. coli (pH 7.6) to sudden mild (pH 5.8) and severe near-lethal acid stress (pH 4.4) conditions that mimic passage through the gastrointestinal tract. We uncovered new differentially regulated genes and pathways, key transcriptional regulators, and 18 novel acid-induced candidate small open reading frames. By using machine learning and leveraging large compendia of publicly available E. coli expression data, we were able to distinguish between the response to acid stress and general stress. These results expand the acid resistance network and provide new insights into the fine-tuned response of E. coli to mild and severe acid stress. IMPORTANCE Bacteria react very differently to survive in acidic environments, such as the human gastrointestinal tract. Escherichia coli is one of the extremely acid-resistant bacteria and has a variety of acid-defense mechanisms. Here, we provide the first genome-wide overview of the adaptations of E. coli K-12 to mild and severe acid stress at both the transcriptional and translational levels. Using ribosome profiling and RNA sequencing, we uncover novel adaptations to different degrees of acidity, including previously hidden stress-induced small proteins and novel key transcription factors for acid defense, and report mRNAs with pH-dependent differential translation efficiency. In addition, we distinguish between acid-specific adaptations and general stress response mechanisms using denoising autoencoders. 
This workflow represents a powerful approach that takes advantage of next-generation sequencing techniques and machine learning to systematically analyze bacterial stress responses.

The infective dose of enteropathogens varies significantly among bacterial genera and is dependent on the number and complexity of acid resistance mechanisms (1). Escherichia coli is equipped with a high number of defense mechanisms to survive the acidity of the stomach and, correspondingly, can have an infective dose of fewer than 50 cells (2). Enterobacteria that survive the stomach also confront mild acid stress in the colon due to the presence of short-chain fatty acids produced by obligate anaerobes (3). Other neutralophilic bacteria encounter low pH environments in a variety of settings, such as acidic soils, fermented food, or phagosomes within macrophages (1).
The cytoplasmic membrane represents a primary barrier for protons (H+). Nevertheless, at low pH, H+ can permeate into the cytoplasm via protonated water chains, ion channels, or damaged membranes (4). Upon acidification, the cytoplasmic pH transiently [...] acid-specific transcriptional adaptations by using machine learning to compare the low pH response to that of other stressors.

Examination of alterations in the translatome and transcriptome of E. coli in response to varying degrees of acid stress
To mimic natural stress conditions, such as the passage of E. coli through the gastrointestinal tract, we established the following protocol involving a sudden change to low pH, detection of a rapid response, and severe, near-lethal acid stress. Specifically, E. coli K-12 MG1655 was cultivated in unbuffered lysogeny broth (LB) medium at pH 7.6 until the exponential growth phase (OD600 = 0.5). Then, 5 M hydrochloric acid was added directly to expose the cells to a pH of 5.8 or stepwise to a pH of 4.4, corresponding to mild and severe acid stress (Fig. 1A). The final optical densities were comparable (pH 7.6: OD600 = ~1.1; pH 4.4: OD600 = ~0.7) (Table S1), and the pH values hardly changed compared to t0 (Fig. 1A; Table S2). To investigate whether a 15-min exposure to pH 4.4 was sufficient to induce cellular adaptation to severe acid stress, we examined the temporal dynamics of adiA expression by quantitative reverse transcription PCR (RT-qPCR). We detected an increase in adiA mRNA levels as early as 15 min after the shift to pH 4.4 and no substantial further increase after 30 or 60 min (Fig. S1). Additionally, we evaluated cell viability at the final experimental time points using propidium iodide (PI) staining (36) and examined colony-forming units (CFUs). The average percentage of dead cells detected by PI staining was less than 1% at pH 7.6, 5% at pH 5.8, and 18% at pH 4.4 (Fig. S2A). A positive control (5-min heat treatment at 80°C) resulted in an average of 97.2% non-viable cells, as determined by PI staining. The CFU counts confirm that at least 2 × 10^8 viable cells were collected, irrespective of pH, at the moment of sample collection for Ribo- and RNA-Seq (Fig. S2B).
Cells were harvested and lysed as previously described by whole-culture flash freezing and cryogenic grinding in a freezer mill to avoid bias from translation-arresting drugs and filtering (38). The subsequent steps of our ribosome profiling protocol were a combination of methodologies reported by Latif and colleagues (39) and Mohammad and Buskirk (40) (see Materials and Methods for details). Strand-specific Illumina sequencing yielded an average of approximately 30 million cDNA reads per sample for Ribo-Seq and 5-10 million for RNA-Seq. The next-generation sequencing data were analyzed using an extended version of the high-throughput HRIBO data analysis pipeline (37). All samples achieved sufficient coverage, with over two million reads per sample mapping uniquely to the coding regions. The rRNA contamination was higher in the pH 4.4 Ribo-Seq samples than in other conditions but accounted for less than 15% in all cDNA libraries (Fig. S3). The length distribution of the generated RPFs was broad, ranging from 15 to 45 nucleotides (Fig. S4), consistent with previous observations in other prokaryotic ribosome profiling analyses (38,39). We did not detect stress-induced increased relative ribosome occupancy in the initiation region of ORFs under acid stress compared to neutral pH. This is in contrast to previous observations in E. coli under heat stress (41) and in yeast under oxidative stress (42), which reported increased relative ribosome accumulation at start codons in stressed cells. In fact, ribosome occupancy in the translation initiation regions was slightly reduced at pH 4.4 and 5.8 compared with physiological pH (Fig. S5). This could be explained by diminished ribosome-RNA complex stability and increased ribosome drop-off under acidic conditions. The biological triplicates for each experimental condition clustered together on the first three principal components in a principal component analysis (PCA) plot (Fig. 1B). Notably, the global gene expression profiles were highly distinct at pH 4.4 compared with both pH 5.8 and 7.6.
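The replicate-clustering check described above can be illustrated with a minimal PCA sketch. This is not the pipeline used in the study: the expression matrix below is synthetic toy data, and the function name pca_scores and the choice of an SVD-based PCA are assumptions made purely for illustration.

```python
import numpy as np


def pca_scores(expr, n_components=3):
    """Project samples onto the leading principal components.

    expr: array of shape (samples, genes), e.g. log-transformed counts.
    Returns the sample scores and the fraction of variance explained.
    """
    centered = expr - expr.mean(axis=0)          # center each gene
    u, s, _vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Synthetic toy data (hypothetical): 3 conditions x 3 replicates, 50 genes.
rng = np.random.default_rng(0)
condition_means = rng.normal(0.0, 3.0, size=(3, 50))
expr = np.vstack([
    condition_means[i] + rng.normal(0.0, 0.3, size=(3, 50))
    for i in range(3)
])

scores, explained = pca_scores(expr)
print(scores.shape)   # one row per sample, one column per component
```

In such a plot, replicates of the same condition should lie close together, while samples from different conditions separate along the leading components.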

Coordinated regulation of transcription and translation in response to acid stress
The tool deltaTE (43) was used to assess transcriptional and translational changes (i.e., differential expression and differential translation efficiency) in response to mild and severe acid stress. Low-expression transcripts were filtered out, and we focused our analysis on 3,654 genes with mean reads per kilobase per million reads mapped (rpkm) values ≥5 across all investigated conditions. Our findings reveal that 702 transcripts were significantly altered at pH 5.8 compared with physiological pH [absolute mRNA log2 fold change (FC) ≥1 and false discovery rate (FDR) adjusted P ≤ 0.05], and 1,030 genes showed significant differences in mRNA levels at pH 4.4 (Fig. 2A). These results suggest that extensive transcriptional reprogramming occurred, which was influenced by the degree of acid stress. As illustrated by the Venn diagram overlaps (Fig. 2), a large number of adaptations occurred regardless of the degree of acid stress. Nonetheless, several hundred genes were differentially expressed exclusively at pH 5.8 or 4.4 (Fig. 2A). This suggests that in addition to universal adaptations at low pH, specific adaptations for mild and severe acid stress occur. We further determined the number of genes with stress-dependent alterations in RPF counts to be 679 at pH 5.8 and 1,440 at pH 4.4 (absolute RPF log2 FC ≥1 and FDR adjusted P ≤ 0.05), which was in a similar range to the RNA-Seq data (Fig. 2A and B). Accordingly, the global FC values for mRNA and RPF levels showed a high Pearson correlation coefficient (r) under both conditions (Fig. 2C and D, gray dots). This indicates that transcriptional regulation of these genes is the predominant response to acid stress. However, a subset of genes exhibited exclusive and significant regulation at either the transcriptional (red dots) or translational (blue dots) level. Specifically, at pH 5.8, 193 genes were detected to be significantly regulated exclusively by RNA-Seq, while 216 genes were exclusively affected in the Ribo-Seq data (Fig. 2C; Table S3). At pH 4.4, 127 differentially regulated genes were found exclusively by RNA-Seq and 570 genes by Ribo-Seq (Fig. 2D; Table S3). Notably, for fruA at pH 5.8 and yecH at pH 4.4, opposite changes were observed at the transcriptional and translational levels (Fig. 2C and D, yellow dots). FruA is the fructose permease of the phosphoenolpyruvate-dependent sugar phosphotransferase system (44), whereas the function of YecH remains unknown.
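The filtering and thresholding logic described above can be sketched as follows. This is a simplified illustration, not the deltaTE implementation: the rpkm formula is the standard definition, while the class GeneStats, the function names, the category labels, and all numeric values are hypothetical.

```python
from dataclasses import dataclass

RPKM_CUTOFF = 5.0   # minimum mean rpkm, as stated in the text
LFC_CUTOFF = 1.0    # absolute log2 fold change threshold
FDR_CUTOFF = 0.05   # FDR-adjusted P threshold


def rpkm(reads, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return reads * 1e9 / (gene_length_bp * total_mapped_reads)


@dataclass
class GeneStats:
    """Per-gene statistics for one stress condition versus pH 7.6."""
    rna_lfc: float    # mRNA log2 fold change
    rna_padj: float   # FDR-adjusted P for the mRNA change
    rpf_lfc: float    # RPF log2 fold change
    rpf_padj: float   # FDR-adjusted P for the RPF change


def is_significant(lfc, padj):
    return abs(lfc) >= LFC_CUTOFF and padj <= FDR_CUTOFF


def classify(gene):
    """Assign a gene to one of the categories distinguished in the text."""
    rna = is_significant(gene.rna_lfc, gene.rna_padj)
    rpf = is_significant(gene.rpf_lfc, gene.rpf_padj)
    if rna and rpf:
        # Opposite signs correspond to cases such as fruA or yecH.
        return "opposite" if gene.rna_lfc * gene.rpf_lfc < 0 else "both"
    if rna:
        return "rna_only"
    if rpf:
        return "rpf_only"
    return "unchanged"


# Hypothetical example values, for illustration only.
example_rpkm = rpkm(reads=500, gene_length_bp=1000, total_mapped_reads=10_000_000)
passes_filter = example_rpkm >= RPKM_CUTOFF
category = classify(GeneStats(rna_lfc=0.2, rna_padj=0.6, rpf_lfc=1.8, rpf_padj=0.01))
```

With these made-up numbers, the example gene passes the expression filter and falls into the category of genes that change only in the Ribo-Seq data.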
Next, we investigated translation efficiency (TE) to identify genes that undergo translational regulation in response to acidic conditions. TE provides information regarding ribosome counts per mRNA and is calculated as the ratio of RPFs over transcript counts within a gene's coding sequence normalized to mRNA abundance (43). We identified 22 genes at pH 5.8 and 89 genes at pH 4.4, which displayed significantly altered TEs (absolute log2 TE fold change ≥1 and adjusted P ≤ 0.05) (Table S4). The highest increase in TE at pH 4.4 was found for the KpLE2 phage-like element (topAI), a hydroxyethylthiazole kinase (thiM), and a palmitoleoyl acyltransferase (lpxP). In contrast, yecH and yjbE, both encoding uncharacterized proteins, and malM of the maltose regulon showed the most prominent decrease in TE at pH 4.4 (Table S4). At pH 5.8, we noted the largest increase in TE for a ferredoxin-type protein encoded by napF, an iron transport protein (feoA), and a tagaturonate reductase (uxaB). Conversely, the largest decrease was observed for a protein of the fructose-specific phosphotransferase system (fruA), a tripartite efflux pump membrane fusion protein (emrK), and an HTH-type transcriptional regulator (ydeO) (Table S4).
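As a worked illustration of the TE ratio defined above, the following sketch computes TE and its log2 fold change from normalized counts. It is a simplified assumption-laden example rather than the statistical model used by deltaTE; the function names and the count values are hypothetical.

```python
import math


def translation_efficiency(rpf_count, mrna_count):
    """TE as the ratio of normalized RPF counts to normalized mRNA counts."""
    return rpf_count / mrna_count


def log2_te_fold_change(rpf_stress, mrna_stress, rpf_control, mrna_control):
    """log2 of the TE ratio between a stress condition and the control."""
    te_stress = translation_efficiency(rpf_stress, mrna_stress)
    te_control = translation_efficiency(rpf_control, mrna_control)
    return math.log2(te_stress / te_control)


# Hypothetical normalized counts for a single gene.
te_change = log2_te_fold_change(
    rpf_stress=800.0, mrna_stress=200.0,
    rpf_control=100.0, mrna_control=100.0,
)

# Equivalent view: the log2 TE fold change is the RPF log2 FC minus
# the mRNA log2 FC.
rpf_lfc = math.log2(800.0 / 100.0)
mrna_lfc = math.log2(200.0 / 100.0)
```

In this made-up case the RPF level rises eightfold while the mRNA level only doubles, so the gene would exceed the absolute log2 TE fold change threshold of 1, provided the change is also statistically significant.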
In summary, besides extensive transcriptional reprogramming, dozens of genes exhibit significant FCs either at the transcriptional or translational level in response to acid stress. This underlines that transcription and translation are not always coupled in bacteria. Similar findings were reported by Zhang and colleagues (41), who conducted Ribo-Seq and RNA-Seq analyses for E. coli under heat stress. Overall, such differential regulation can be explained, for example, by delayed translation relative to transcript synthesis, selective recruitment or release of ribosomes, or regulation during translation initiation, elongation, or ribosome biogenesis (45)(46)(47)(48)(49), which could be beneficial under stress conditions.

Functional implications of genes with differential mRNA and RPF levels under mild acid stress
To obtain a more profound understanding of the fine-tuned response of E. coli to different degrees of acid stress, we first analyzed all genes with differential mRNA and ribosome coverage levels during mild acid stress (pH 5.8). Under this condition, the top candidates with the highest FC values for mRNA and RPF are as follows: (i) the cad operon, encoding the core components of the Cad AR system (see also below); (ii) the glp regulon, responsible for glycerol and sn-glycerol 3-phosphate uptake and catabolism (50); (iii) the mdtJI operon, encoding a heterodimeric multidrug/spermidine exporter (51); and (iv) genes encoding proteins involved in motility and flagella biosynthesis (Table 1). A comprehensive list of normalized read counts, mRNA and RPF FCs, and TEs for all E. coli genes is provided in Table S5. We tested a representative selection of differentially expressed genes by RT-qPCR. In all cases, the detected changes in mRNA levels were consistent with the data gathered by RNA-Seq (Fig. S6A). Both recA and secA were chosen as reference genes for RT-qPCR because their rpkm counts were relatively constant under the conditions tested (Table S5; Fig. S6B).
Next, we performed gene set enrichment analysis (GSEA) using clusterProfiler (52) to identify biological processes associated with differentially expressed genes at pH 5.8. Among the most enriched Gene Ontology (GO) terms for biological processes at pH 5.8 was "spermidine transmembrane transport" (Fig. 3), which corresponds to the induction of mdtJI (Table 1) and a polyamine ABC transporter encoded by potABCD (Table S5). Polyamines are crucial for survival under acid stress, as they reduce membrane permeability by blocking OmpF and OmpC porins (53)(54)(55). External spermidine supplementation also improved acid resistance in Streptococcus pyogenes (56). On the other hand, overaccumulation of polyamines can be toxic and potentially lethal for E. coli (51,57). Therefore, precise transmembrane transport of polyamines in acidic environments is critical and contributes to survival in acidic conditions. The enrichment of the GO terms "glycerol-3-phosphate catabolic process" and "glycerol catabolic process" at pH 5.8 (Fig. 3) has not yet been associated with acid stress to our knowledge. Notably, of the 14 genes with the largest increase in RPF counts at pH 5.8, 7 belong to the glp regulon (Table 1). This regulon is required for the uptake and catabolism of glycerol and sn-glycerol 3-phosphate (G3P) (50). In this pathway, G3P is converted to dihydroxyacetone phosphate by membrane-bound dehydrogenases, either aerobically via GlpD or anaerobically by the GlpABC complex (58,59). Alternatively, dihydroxyacetone phosphate can be produced directly from glycerol by GldA and the protein products of the dhaKLM operon (60). The dhaKLM operon was also induced at pH 5.8 (Table S5). It remains unclear whether glycerol and G3P catabolism directly contribute to acid tolerance or whether the glp regulon is activated as a consequence of other low pH adjustments. Expression of glp genes is regulated by the repressor GlpR, which is inactivated upon binding of glycerol or G3P (61). We hypothesize that changes in phospholipid composition under acid stress conditions (62) may release G3P, which in turn induces the glp regulon. Accordingly, the GO term "phosphatidylglycerol biosynthetic process" was enriched under acid stress (Fig. 3).
Another observation is the upregulation of de novo biosynthesis pathways for pyrimidine and purine nucleotides at pH 5.8 (Fig. 3). The induction of a large proportion of the PurR-dependent regulon involved in de novo nucleotide synthesis (Fig. 3; Table S5) suggests that E. coli requires additional nucleotides to cope with the extensive transcriptional reprogramming. Besides, intracellular acidification can lead to DNA damage, such as depurination (63), making enhanced nucleotide biosynthesis a critical compensatory mechanism. Recently, Oenococcus oeni was reported to experience a decrease in the abundance of both purines and pyrimidines under acid stress, while nucleotide metabolism and transport increased (64), suggesting a similar phenomenon in this species. Other enriched GO terms under mild acid stress include "choline transport," "siderophore transmembrane transport," "phosphate ion transmembrane transport," "ribosomal small subunit assembly," "tRNA aminoacylation for protein translation," and "bacterial-type flagellum-dependent swarming motility" (Fig. 3). pH-dependent motility has previously been observed in E. coli, Salmonella, and Helicobacter (1). These observations suggest that bacterial cells use an escape strategy to migrate to more favorable pH environments when challenged with acidic conditions.
In contrast, our findings reveal that many membrane and periplasmic proteins (18 of the 20 genes with the most diminished RPF counts, Table 2) were among the top candidates with decreased mRNA and RPF levels under mild acid stress. This affected, for example, genes encoding ABC transporters (mal regulon, dpp operon) and symporters (actP, melB, gabP), highlighting the superiority of Ribo-Seq over mass spectrometry-based approaches, namely, its independence of protein biochemistry and higher sensitivity (29). Furthermore, GSEA identified membrane transport and metabolic activities as the most downregulated biological processes in response to mild acid stress. For example, "maltose transport," "isoleucine transport," "heme transport," "putrescine catabolic process," "glycolate catabolic process," and "aromatic amino acid family catabolic process" were among the most downregulated GO terms at pH 5.8 (Fig. 3). Downregulation of H+-coupled transport processes represents a key mechanism by which E. coli restricts proton influx into cells. In addition, the downregulated metabolic processes are in many cases associated with the synthesis and conversion of amino acids and carbon sources. For example, the catabolism of aromatic amino acids and arginine was also reduced at pH 5.8 (Fig. 3). Particularly noteworthy is the downregulation of the arginine catabolic pathway, which involves the protein products of the astEBDAC operon. At pH 5.8, hardly any reads were mapped in the astEBDAC region, despite detectable expression at pH 7.6 and pH 4.4 (Table S5). Presumably, E. coli preserves the intracellular arginine pool at pH 5.8, as this amino acid serves as a substrate for the Adi system during severe acid stress (15,65).
In summary, the response of E. coli to mild acid stress is characterized by the activation of the motility machinery to escape to less acidic habitats, by induction of the cad operon, and by genes involved in polyamine transport and glycerol-3-phosphate conversion (Tables 1 and 2; Fig. 3). In addition, E. coli restricts the influx of protons and conserves energy by reducing its metabolic activities.

Functional implications of genes with differential mRNA and RPF levels under severe acid stress
Next, we analyzed genes with differential mRNA and ribosome coverage levels in response to severe acid stress (pH 4.4) compared with non-stress (pH 7.6). The genes with the largest increase in read counts that were not already upregulated at pH 5.8 were asr, encoding an acid shock protein, followed by bdm, encoding a biofilm-modulation protein, and bhsA, encoding a multiple stress resistance outer membrane protein (Table 3). Originally, Asr was classified as a periplasmic acid shock protein, although its role in acid adaptation remained unclear (66). Recently, Asr was shown to be an intrinsically disordered chaperone that contributes to outer membrane integrity and acts as an aggregase to prevent aggregation of proteins with positive charges (67). Our Ribo-Seq data clearly illustrate the enormous importance of Asr under severe acid stress in E. coli, as it is one of the most abundant proteins in the cell, with approximately 2% of all reads mapping in the asr coding region at pH 4.4 (corresponding to an ~1,000-fold upregulation compared to pH 7.6). Strikingly, almost half of the top 20 genes with increased ribosome coverage of transcripts (ydgU, yhcN, yjcB, yedR, yhdV, ybiJ, ycgZ, and ycfJ) are poorly characterized (Table 3). So far, only YhcN from the above list has been shown to be involved in the response to acid stress (68). GSEA for biological processes identified the GO terms "enterobactin biosynthetic process," "ferric-enterobactin import into cell," "siderophore-dependent iron import into cell," and "siderophore transmembrane transport" as significantly enriched at pH 4.4 (Fig. 3). Specifically, the complete enterobactin biosynthesis pathway, comprising the entCEBAH operon, entF, entH, and ybdZ, revealed significant enrichment under severe acidic conditions (Table S5). Furthermore, all subunits of the Ton complex (tonB, exbB, exbD) and its putative outer membrane receptor encoded by yncD exhibited significantly higher RPF and mRNA levels at pH 5.8 and pH 4.4 (Table S5). The Ton complex functions as a proton motive force-dependent molecular motor that facilitates the import of iron-bound siderophores (69,70). Several other iron uptake systems, including a ferric dicitrate ABC transport system (fecABCDE), an iron (III) hydroxamate ABC transport system (fhuACDB), a ferric enterobactin ABC transport system (fepA, fepB, fepCGD), and a TonB-dependent iron-catecholate outer membrane transporter (cirA), were also induced under acidic conditions (Table S5). Moreover, the GO terms "protein maturation by iron-sulfur cluster assembly" and "iron-sulfur cluster assembly" were enriched at pH 4.4 (Fig. 3). Specifically, we detected a fivefold upregulation of all genes of the isc and suf operons (Table S5), which encode components of the complex machinery responsible for iron-sulfur cluster assembly in E. coli (71). In contrast, heme transport was among the most downregulated biological processes at both pH 5.8 and 4.4 (Fig. 3), which could potentially be the cause of iron limitation. Moreover, at low pH, the solubility of iron ions increases, which can destabilize iron-sulfur clusters (72). Iron limitation would be consistent with our observation that E. coli upregulates the synthesis of iron-chelating siderophores and their transporters, as well as the components of the iron-sulfur assembly machinery. Given the better solubility of iron in a low pH environment, the question arises whether E. coli synthesizes siderophores to respond to iron limitation or rather to protect itself against an iron excess. The latter function has been demonstrated for Pseudomonas aeruginosa, where siderophores protected cells from the harmful effects of reactive oxygen species. In this case, P. aeruginosa no longer secreted siderophores into the extracellular environment but instead stored them intracellularly (73). In conclusion, these results prompt the question of whether the upregulation of the iron uptake machinery counteracts iron limitation or rather provides protection against iron excess under severe acid stress.
We also detected a significant enrichment for the GO terms "cellular response to acidic pH," "stress response to copper ion," and "copper ion transmembrane transport" at pH 4.4 (Fig. 3). These results are in line with previous studies that have suggested an interplay between resistance to copper and acid stress in Escherichia coli (74,75). This overlap between the two stress responses is further emphasized by our findings because at pH 4.4, substantial upregulation of the Cu+-exporting ATPase CopA and CusA, a component of the copper efflux system, was detected (Table S5). These results are of physiological relevance, given that copper is an important antibacterial component in the innate immune system (76,77).
Among the downregulated genes at pH 4.4, the tnaAB operon and its leader peptide (tnaC) showed the most significant decrease in terms of RPF counts (Table 4). tnaA encodes a tryptophanase, which cleaves L-tryptophan into indole, pyruvate, and NH4+, whereas tnaB encodes a tryptophan:H+ symporter (78). This finding is particularly intriguing because, in a previous study, persister cell formation in E. coli was related to a lower cytoplasmic pH associated with tryptophan metabolism (79). It is important to note that we also detected a substantial upregulation in RPFs for hipA (Table S5), which encodes a serine/threonine kinase that plays a role in persistence in E. coli (80). Therefore, our data provide further evidence for the link between internal pH and persistence.
The expression of several outer membrane proteins and porins (ompW, ompF, nmpC, lamB) was also downregulated at pH 4.4 (Table 4). This observation is consistent with the extensive restructuring of the E. coli lipid bilayers to reduce membrane permeability and limit proton entry. Similar to pH 5.8, the majority of the 20 proteins with the most reduced RPF levels compared with physiological pH are membrane proteins (Table 4). Moreover, the GO term "ATP synthesis coupled proton transport" was significantly reduced at pH 4.4 (Fig. 3). This is explained by the reduction in RPF levels of genes encoding subunits of the FOF1-ATPase (Table S5). FOF1-ATPase uses the electrochemical gradient of protons to synthesize adenosine 5′-triphosphate (ATP) from ADP and inorganic phosphate but can also hydrolyze ATP to pump protons out of the cytoplasm (81,82). As at pH 5.8, the most downregulated biological processes at pH 4.4 were almost exclusively GO terms related to transport and cellular metabolism (Fig. 3).
In summary, the response of E. coli to severe acid stress is dominated by the activation of survival strategies that limit the entry of protons into the cell, prevent protein aggregation, and maintain iron homeostasis. Severe acid stress leads to a reduction in metabolic, transcriptional, and translational activity, thereby preparing E. coli for a dormant state. Eventually, these dormant cells may be able to withstand antibiotic attack (i.e., persister cells).

Expanding the regulatory network of enzyme-based H+-consuming acid resistance systems
Recently, we have shown that the Adi and Cad AR systems are mutually exclusively activated in individual E. coli cells, indicating functional diversification and division of labor under acid stress (15). To gain further insights into the fine-tuned regulation of the three major AR systems, we first studied the mRNA and RPF levels of known enzyme-based H+-consuming AR components. The core components of the Gad system (AR2) (gadA, gadB, and gadC) and several transcriptional components (gadW, gadX, gadY, phoP, phoQ) showed an increase in mRNA and RPF levels by approximately two- to sixfold at pH 4.4, but not at pH 5.8, whereas the expression of ydeO was massively induced at pH 5.8 (particularly at the mRNA level), and RPF levels were decreased at pH 4.4 (Fig. S7). Expression of the core components of the Adi system (AR3), adiA and adiC, was induced at severe acid stress but not at pH 5.8, consistent with our previous study (15). Upregulation was not detected for regulatory components of the Adi system. A novel finding was that the levels of adiA but not adiC were significantly higher in the Ribo-Seq data than in the RNA-Seq data (Fig. S7). In fact, adiA had the sixth highest increase in TE among all E. coli genes at pH 4.4 (Table S4), indicating translational regulation by a thus far unknown mechanism. The only other component of an AR system in E. coli known to be subject to translational regulation is the major regulator CadC of the Cad (AR4) system. CadC contains a polyproline motif, and its translation therefore depends on the elongation factor P, a process that keeps the copy number of CadC extremely low (83).
As expected, expression of the core components of the Cad system (AR4), cadA and cadB, was tremendously increased at both pH 5.8 and 4.4. Genes of the Orn system (AR5) were not induced in our experimental setup (Fig. S7).
Next, we analyzed the mRNA and RPF levels of all annotated TFs to search for other potential TFs involved in the acid stress response of E. coli (Fig. 4A and B). At pH 5.8, YdeO showed by far the strongest induction at the transcriptional and translational levels, but for all other TFs, the expression levels hardly changed (Fig. 4A). At pH 4.4, the expression of numerous TFs was induced, including GadW, YdcI, and the antibiotic resistance-controlling regulator MarR. The strongest upregulation was found for the IclR-type regulator MhpR and the iron-sulfur cluster-containing regulator IscR (Fig. 4B). Notably, while most acid-induced TFs were differentially expressed and displayed constant TE, YdcI exhibited constant mRNA levels but was differentially translated in response to acid stress (Fig. 4B). The contribution of all TFs with high FC values to survival under acid stress (Table S6) was tested in an acid shock assay. Cells of the corresponding knockout mutants (84) and, for comparison, the rcsB and gadE mutants (each lacking a TF important for acid resistance) were exposed to pH 3 for 1 h. All mutants except marR and ydeO showed significantly reduced survival compared to the parental strain (Fig. 4C). For ydeO, this result was consistent with our finding that transcript abundance and occupancy with ribosomes were upregulated at mild but not severe acid stress (Fig. 4; Fig. S7). Thus, YdeO appears to be only crucial under mild acid stress (Fig. 4A). In contrast, the mhpR mutant had a low survival rate comparable to that of rcsB and gadE, and the survival rates of the iscR, ydcI, and gadW mutants were only slightly higher (Fig. 4C). These results confirm the physiological relevance of these TFs for acid resistance. As controls, we re-introduced the corresponding genes in trans using isopropyl-β-D-thiogalactopyranoside (IPTG)-inducible pCA24N plasmids from the ASKA collection (85). Complementation of the mhpR, iscR, ydcI, and gadW mutants, as well as rcsB and gadE controls, resulted in strains with survival rates comparable to the wild-type (WT) strain carrying the pCA24N control vector (Fig. S8).
Subsequently, we tested whether these TFs are involved in the regulation and interconnectivity of the Gad, Adi, and Cad systems. Therefore, we examined the promoter activities of gadBC, adiA, and cadBA in the corresponding knockout mutants (84) using transcriptional reporter plasmids (promoter-lux fusions). The cultivation conditions were the same as those used for Ribo-Seq and RNA-Seq (Fig. 1A), and luciferase activity was monitored during growth in microtiter plates. We found that YdcI significantly affected the promoter activity of gadBC (Fig. 4D). Although the LysR-type regulator YdcI has been shown to affect pH stress regulation in Salmonella enterica serovar Typhimurium and E. coli, its precise role is still unclear (86)(87)(88). Based on the data presented here, we hypothesize that the decreased survival of the ydcI mutant under severe acid stress is due to decreased expression of the Gad system. The absence of YdeO resulted in an eightfold stimulation of the adiA promoter activity (Fig. 4E). Thus, YdeO not only activates the Gad system (89) but also appears to be a repressor for the Adi system. This implies that the Adi system is regulated not only by the XylS/AraC-type regulator AdiY but also by YdeO. Thus, YdeO is the first example of a transcriptional activator shown to be involved in the regulation of more than one AR system in E. coli and might play a role in the heterogeneous activation of the Adi and Gad systems within a population. Although we observed a slight decrease in cadBA promoter activity in the ydcI mutant, the decrease was not statistically significant. Therefore, none of the tested TFs affected the Cad system (Fig. 4F). In conclusion, based on the differential expression data and lower survival of mutants during acid shock, we identified two novel TFs, namely, MhpR and IscR, which are crucial under severe acid stress (Fig. 4C) but are not associated with the Gad, Adi, and Cad systems (Fig. 4D through F). This implies that these regulators ensure the survival of E. coli in acidic habitats by inducing other defense mechanisms. Of particular interest is MhpR, which had the highest increase in RPFs of all TFs at pH 4.4 (Fig. 4B), and the corresponding mutant had the lowest survival at pH 3 (Fig. 4C). Further studies are needed to determine whether MhpR, which is a specific regulator of the mhpABCDFE operon encoding enzymes for the degradation of phenylpropionate (90,91), directly or indirectly contributes to acid resistance.

Differential expression of known and novel sORFs under mild and severe acid stress
In recent years, the annotation of many bacterial genomes has been extended by previously unknown small proteins (29), many of which are located in the membrane (92). This progress has been achieved primarily through the development of optimized detection strategies using adapted ribosome profiling and mass spectrometry protocols (28,33,93). Recently, additional sORFs were identified in E. coli using antibiotic-assisted Ribo-Seq, which captures initiating ribosomes at start codons (94,95). Advanced detection strategies also revealed novel small proteins in other species, such as the archaeon Haloferax volcanii, the nitrogen-fixing plant symbiont Sinorhizobium meliloti, Salmonella Typhimurium, and Staphylococcus aureus (33-35,96).
Among the previously known sORFs in E. coli K-12 and those discovered by Storz and colleagues (94), pH-dependent differential RPF levels were observed in our data sets for 12 and 29 small proteins at pH 5.8 and pH 4.4, respectively (Table S8). These findings validate the expression of these sORFs and highlight their physiological relevance in the acid stress response of E. coli. For example, induction of mdtU, an upstream ORF of mdtJI, was observed under mild acid stress (Fig. 5A) and corresponds to the observed upregulation of the multidrug/spermidine exporter MdtJI (Table 1). A previous study has shown that translation of MdtU is crucial for spermidine-mediated expression of the MdtJ subunit under spermidine supplementation at pH 9 (97). A similar mechanism could operate under acid stress conditions. The strongest induction of sORFs under severe acid stress was detected for ydgU (located in the same transcriptional unit as the acid shock protein-encoding gene asr) and azuC (Fig. 5B). AzuCR acts as a dual-function RNA and encodes a 28-amino acid protein, but it can also base pair as an sRNA (AzuR) with two target mRNAs, including cadA (98). AzuCR modulates carbon metabolism through interactions with the aerobic glycerol-3-phosphate dehydrogenase GlpD (98).
In addition to known sORFs, we aimed to uncover further hidden small proteins on the basis that our Ribo-Seq data were acquired under stress conditions to which E. coli is exposed in its natural habitat, the gastrointestinal tract. In particular, we searched for novel sORFs that remained undetected in previous Ribo-Seq approaches when E. coli was grown at a neutral pH.
Initial predictions for novel sORF candidates were acquired using the neural network-based prediction tool DeepRibo (99). All potential candidates were filtered based on coverage (rpkm >30 across all Ribo-Seq samples) and codon count [10-70 amino acids (aa)], with the exception of sORF15 (93 amino acids) (Table S7), which was manually discovered by inspecting the 3′ UTR of gadW. To further refine our search, we focused on sORF candidates that were significantly induced at either pH 5.8 or pH 4.4 (RPF log2 FC >2 and P-adjust <0.05) compared to pH 7.6. Predictions that overlapped with annotated genes on the same strand were excluded because Ribo-Seq signals were indistinguishable. This workflow yielded 152 candidates that were visually inspected using the web-based genome browser JBrowse2 (100). Candidates with continuous coverage across the predicted sORF, matching the ORF boundaries, and promising Shine-Dalgarno sequences were considered high-confidence candidates. In total, we identified 18 acid-induced sORF candidates (Table S7) that had not been previously detected. Of note, most of the candidates are encoded as part of operons or are located in the 3′ UTR of annotated genes. In addition, we detected one independent antisense sORF (sORF2 encoded antisense to tesA) and two upstream ORFs (leader peptides): sORF18, located upstream of the translation start site of the periplasmic chaperone-encoding gene osmY, and sORF8, located close to the glucokinase-encoding gene glk (Table S7).
Of these 18 acid-induced candidate small proteins (Table S7), 17 had higher RPF counts at pH 4.4 than at pH 5.8. This suggests that the contribution of sORFs to acid defense in E. coli is more relevant under severe acid stress. Only sORF1 showed a higher expression level in cells exposed to mild acid stress (Fig. 5C). sORF1 is located in the 3′ UTR of tsx, which encodes a nucleoside-specific channel-forming protein. This finding is consistent with the observed increased requirement for nucleotides by E. coli at pH 5.8 (Fig. 3).
For the first time, we identified two sORF candidates located within genes encoding the redundant small regulatory RNAs OmrA and OmrB (Fig. 5D). omrA and omrB are highly identical at the 3′ and 5′ ends, differ mainly in their central parts, and regulate the expression of numerous outer membrane proteins (101). Our analysis suggests that both OmrA and OmrB act as dual-function RNAs under severe acid stress and encode small proteins: a 28-amino acid protein OmrA (sORF11) and an 11-amino acid protein OmrB (sORF12) (Fig. 5D). Due to the sequence variation in the central parts, the translation of OmrB ends at an earlier stop codon. Notably, both omrA and omrB displayed higher RPF levels at pH 4.4, whereas transcription of omrA but not omrB was induced at pH 4.4 (Fig. 5D). Thus, despite the high sequence similarity, omrA and omrB do not encode identical small proteins under severe acid stress and are differentially regulated at the transcriptional and translational levels. We also detected an acid-induced sORF candidate (sORF3) in rybB (Table S7), another sRNA involved in the regulation of outer membrane proteins (102). To the best of our knowledge, the presence of OmrA, OmrB, and RybB peptides has not yet been reported.
Three new candidate sORFs potentially involved in the regulation of AR systems were detected. sORF10 is located in the 3′ UTR of a potassium-binding protein encoded by kbp and is encoded antisense to the transcriptional regulator CsiR (Fig. 5E). The latter might be involved in the regulation of the Adi system (15,103). Given the significant upregulation of sORF10 at pH 4.4 and its complete complementarity to the 3′ end of the csiR mRNA, we hypothesize that sORF10 plays a role in fine-tuning the expression of the Adi system. Strikingly, we also discovered two high-confidence candidates for sORFs located in the relatively long 3′ UTR of gadW, which encodes one of the major transcriptional regulators of the Gad system (Fig. 5F). sORF14 and sORF15 exhibit constant coverage across the predicted ORF and contain Shine-Dalgarno sequences (Table S7). These results suggest that the complex Gad system may consist of even more components.
To gain further insight into the subcellular location and features of the newly identified sORF candidates, we used PSORTb (104) and DeepTMHMM (105) for transmembrane topology prediction. Notably, sORF15 is predicted to be located in the inner membrane and has two transmembrane helices, which were predicted with a probability of >90% (Fig. S9A). Additionally, the sORF15 protein structure prediction using AlphaFold2 (106) in Google Colab (ColabFold) (107) revealed a potential third helix toward the C-terminal end (Fig. S9B). Using blastp and tblastn (108), we found homologs of sORF15 with >80% identity in Vibrio, Shigella, Klebsiella, Salmonella, Enterococcus, and Escherichia (Fig. S9C) and identified homologs with at least 60% identity for approximately half of the other candidate sORFs (sORF2, 4, 5, 6, 7, 9, 10, 11, 16, and 18). These results strengthen confidence in the correct prediction of these sORFs. However, homologs in other species often only displayed partial matches and were almost exclusively annotated as "hypothetical proteins," as illustrated for sORF15 (Fig. S9C). Moreover, we evaluated whether sORF15 is translated in the absence of the upstream gene gadW. A pBAD24-sORF15:3xFLAG plasmid, which harbors the native Shine-Dalgarno sequence of sORF15 (Table S11), and a FLAG-tagged version of sORF15 were constructed. sORF15 translation was successfully verified by Western blotting (Fig. S9D), which exemplifies that sORFs detected in this study yield detectable protein products.
In conclusion, we identified 18 high-confidence candidates for novel sORFs that are significantly induced upon exposure of E. coli to mild or severe acid stress.

Differentiation of the acid stress and general stress responses using autoencoder-based machine learning
In general, stress response mechanisms can be broadly classified into two categories: global stress responses and adaptations to specific types of stress. Global stress responses can be triggered by various stimuli and provide protection against multiple other unrelated stress factors (109). The global response often involves the activation of alternative sigma factors that affect hundreds of genes. In contrast, adaptations to specific types of stress are tailored to the specific stressor and involve a regulator that senses an environmental cue and modulates the expression of a set of genes, which counteract the stress (109,110).
Given the large number of differentially regulated genes and pathways in response to acid stress (Fig. 2 and 3), we asked which of these adaptive mechanisms are acid-specific and which are also triggered by other stressors. In order to distinguish acid-specific and general stress responses, we used denoising autoencoders (DAEs), deep learning models designed for meaningful dimensionality reduction (111,112). DAEs accomplish this by passing data through an encoder that compresses it into activations of a bottleneck layer (Fig. 6A1), with each node in the bottleneck layer interpretable as a coordinated expression program (113). For our analysis, we employed an ensemble of deep DAEs (see Materials and Methods) (113), trained on the E. coli K-12 PRECISE 2.0 compendium (114), augmented with additional stress conditions (115), as well as the acid stress conditions of the current study (Fig. 6A1). Using this method and data set, we conducted a comparative analysis of the transcriptional response of E. coli to pH 4.4 and pH 5.8, contrasted against an extensive range of other stress conditions, including heat stress (116), ethanol stress (117), osmotic stress (118), oxidative stress (119,120), low oxygen (LOX) (115), and exposure to sublethal concentrations of chloramphenicol (CAM) (115) and trimethoprim (TMP) (115).
To identify biological processes associated with a particular stress condition, we passed the associated RNA-seq data set into the encoder of each network and identified bottleneck nodes that were uniquely turned on by that data set (Fig. 6A2). We then manually turned on these nodes to generate gene sets that are associated with that condition, which can be further analyzed through GO term enrichment (Fig. 6A3 and 4). Using this procedure, we identified groups of nodes that uniquely turn on for acid stress conditions and turn off for all other stress conditions, as well as groups that are simultaneously on for both acid and one additional stress condition. We observed that there are many nodes that turn on simultaneously upon both acid and ethanol exposure (Fig. 6B). The overlap between acid stress and ethanol stress responses has been noted previously and can be explained by the fact that ethanol fluidizes the cytosolic membrane and increases the permeability for protons (121). Furthermore, there are indications of an overlap between acid and antibiotic stress (122,123), reflected in the high number of acid + CAM activating nodes (Fig. 6B).
To pinpoint which cellular adaptations cause acid-specificity for the 48 and 91 specific bottleneck nodes at pH 5.8 and 4.4 (Fig. 6B), respectively, we conducted GSEA for biological processes on each of the gene sets associated with acid-specific node groups (see Materials and Methods). The GO terms that were significantly enriched in the highest number of both pH 5.8- and pH 4.4-specific upregulating node gene sets were "siderophore transmembrane transport," "response to cold," "bacterial-type flagellum assembly," and "chemotaxis" (Fig. 6C), reflecting our previous differential RNA-seq analysis (Fig. 3). The appearance of the GO term "response to cold" might be a result of the lack of cold stress in our compendium of stressors. Additionally, it should be noted that genes associated with this GO term include cold shock proteins, which may have broader roles in the survival of stress conditions (124,125), as well as several prophage genes and ribosome biogenesis factors. We found that mild acid stress turns on nodes that correspond to gene sets associated with nucleotide and ribosome biosynthesis, including the GO terms "ribosome large subunit assembly," "ribosome small subunit assembly," and "de novo IMP biosynthetic process," while severe acid stress turns them off (Fig. 6C). These findings are consistent with our previous observations, namely, that E. coli induces nucleotide and ribosome biosynthesis to cope with mild acid stress but enters a metabolically inactive state under severe acid stress. The GO terms significantly affected in the highest number of acid-specific pH 4.4 downregulated bottleneck nodes were "proton motive force-driven ATP synthesis" and "proton-transporting ATP synthase complex" (Fig. 6C). These two GO terms exclusively involve genes encoding subunits of the FoF1 ATP synthase and can be considered paradigms for acid-specific adaptations since the FoF1 ATP synthase can also pump protons (126).
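The node selection logic described above can be illustrated with a minimal sketch. It assumes that bottleneck activations are available as one list of values per condition and that a node counts as "on" when its activation exceeds a fixed threshold; the threshold, the function names, and the toy activations are hypothetical and do not reproduce the ensemble procedure used in this study.

```python
# Minimal sketch: find bottleneck nodes that are "on" only for selected conditions.
# Assumption: a node is "on" if its activation exceeds `threshold` (hypothetical cutoff).

def nodes_on(values, threshold=0.5):
    """Return the set of node indices whose activation exceeds the threshold."""
    return {i for i, a in enumerate(values) if a > threshold}


def unique_nodes(activations, target, threshold=0.5):
    """Nodes that are on for `target` and off for every other condition."""
    selected = nodes_on(activations[target], threshold)
    for condition, values in activations.items():
        if condition != target:
            selected -= nodes_on(values, threshold)
    return sorted(selected)


def shared_nodes(activations, first, second, threshold=0.5):
    """Nodes that are on for both `first` and `second` and off everywhere else."""
    selected = nodes_on(activations[first], threshold) & nodes_on(activations[second], threshold)
    for condition, values in activations.items():
        if condition not in (first, second):
            selected -= nodes_on(values, threshold)
    return sorted(selected)


# Toy activations for four bottleneck nodes (illustrative values only).
toy = {
    "acid": [0.9, 0.8, 0.1, 0.7],
    "heat": [0.1, 0.9, 0.2, 0.0],
    "ethanol": [0.0, 0.1, 0.9, 0.6],
}
acid_only = unique_nodes(toy, "acid")
acid_and_ethanol = shared_nodes(toy, "acid", "ethanol")
```

In this toy example, node 0 responds only to acid, whereas node 3 responds to both acid and ethanol, mirroring the two kinds of node groups distinguished in the text.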
Considering that we detected a high number of genes induced by severe acid stress with unknown functions (Table 3) and lacking GO associations, we expanded our search for acid-specific adaptations from GO terms to single genes. In order to select acid-specific candidate genes, we investigated all genes associated with acid-specific bottleneck nodes and calculated the log2 FC between each acid stress and every other above-mentioned stress condition for these genes. Genes with the highest expression values under acidic conditions and log2 FCs of at least 0.5 for at least 95% of comparisons were then selected. This procedure yielded 10 candidate genes (Fig. 6D). To experimentally validate that these genes are indeed specifically upregulated under acid stress, we exposed E. coli to a variety of common stressors and performed qRT-PCR. E. coli cells were either grown under acid stress (Fig. 1A) or exposed to heat (42°C), oxidative (H2O2), osmotic (NaCl), antibiotic (chloramphenicol), or ethanol (EtOH) stress. For all investigated genes, the strongest upregulation was observed at either pH 4.4 or pH 5.8 relative to non-stress conditions (Fig. 6D), except for ycfJ, which was activated at pH 4.4 and under oxidative and ethanol stress. The remaining investigated genes were only upregulated under one other stress condition at most (Fig. 6D). Given that emrE and mdtJ encode multidrug exporters, the induction upon supplementation with sublethal concentrations of chloramphenicol is not surprising and further underscores the interplay between acid and antibiotic stress. The observed upregulation of yhcN under oxidative stress (Fig. 6D) was also reported previously (127). Nevertheless, we uncovered four bona fide examples of genes (ybiJ, hslJ, yejG, and yhjX) that displayed exclusive pH-dependent expression (Fig. 6D). Induction of yhjX, encoding a putative pyruvate transporter, might be related to the deamination of serine, which yields ammonia and pyruvate in uropathogenic E. coli (128). The precise molecular functions of YbiJ, YejG, and HslJ in the context of acid stress are currently unclear. These results highlight that our autoencoder pipeline is complementary to differential gene expression analysis, yielding biologically consistent results while also identifying expression patterns that uniquely discriminate acid stress from other stress responses.
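The selection criterion for acid-specific candidate genes (log2 FC of at least 0.5 in at least 95% of comparisons) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used here: the pseudocount, the function name, and the toy expression values are hypothetical, and expression is assumed to be given on a linear scale.

```python
import math

# Minimal sketch of the "log2 FC >= 0.5 in >= 95% of comparisons" criterion.
# Assumption: expression values are on a linear scale; a pseudocount avoids log2(0).

def is_acid_specific(acid_expr, other_exprs, min_log2fc=0.5, min_fraction=0.95,
                     pseudocount=1.0):
    """Return True if the gene is higher under acid stress than in enough other conditions."""
    if not other_exprs:
        return False
    passing = sum(
        1
        for other in other_exprs
        if math.log2((acid_expr + pseudocount) / (other + pseudocount)) >= min_log2fc
    )
    return passing / len(other_exprs) >= min_fraction


# Toy example: expression under acid stress versus 20 other stress samples.
clearly_specific = is_acid_specific(99.0, [49.0] * 20)               # every comparison passes
not_specific = is_acid_specific(99.0, [49.0] * 18 + [99.0, 120.0])   # only 90% pass
```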

Conclusions
Here, we present the first comprehensive study on the global transcriptome- and translatome-wide response of E. coli exposed to varying degrees of acid stress. Our investigation goes beyond previous research, which focused on comparing E. coli transcriptomes across different pH levels during growth (18,21). Instead, we report on rapid changes occurring upon sudden pH shifts, which are relevant for bacteria such as E. coli during passage through the gastrointestinal tract (129).
Using both Ribo- and RNA-Seq, we uncovered not only well-known acid defense mechanisms but also numerous previously undiscovered relevant genes and pathways to combat mild and severe acid stress (Fig. 7). The latter include siderophore production, glycerol-3-phosphate conversion, copper export, de novo nucleotide biosynthesis, and spermidine/multidrug export (Fig. 3 and 7). A striking number of membrane proteins and H+-coupled transporters were found to be downregulated under both mild and severe acid stress (Fig. 7; Tables 2 and 4), underscoring the importance of the cytosolic membrane and its composition as a barrier for protons. Moreover, under severe stress, many outer membrane proteins were downregulated (Fig. 7; Table 4). Notably, a large proportion of genes with yet unknown functions were strongly induced, particularly under severe acid stress (Table 3). Our approach implies that exposing E. coli to culture conditions mimicking near-lethal habitats can offer valuable insights into the molecular functions of genes with low expression levels under standard growth conditions.
Our analysis revealed two new TFs, MhpR and IscR, involved in acid stress adaptation. Furthermore, we gained new insights into the role of the TFs YdeO and MarR. YdeO controls not only the transcription of genes in the Gad system but also adiA in the Adi system (Fig. 4), suggesting that YdeO connects the regulation of two AR systems in E. coli. The observed upregulation of MarR under acid stress, combined with the low contribution of this TF to acid resistance (Fig. 4B), may provide a link to antibiotic resistance and solvent stress tolerance in E. coli (130).
In addition to the pH-dependent differential expression levels of previously identified small proteins, such as YdgU, MdtU, and AzuC, we identified 18 high-confidence, not yet annotated, sORF candidates (Fig. 5). Of particular interest are sORF14 (13 amino acids) and sORF15 (93 amino acids), which are located in a transcriptional unit with gadW and gadX, suggesting their association with the Gad AR system and a potential involvement in glutamate transport and/or glutamate decarboxylation to gamma-aminobutyrate (GABA). Considering the predicted membrane location of sORF15 and its adjacent gene mdtF, an association with either the glutamate/GABA antiporter GadC and/or the multidrug efflux pump MdtF is conceivable.
The autoencoder-based comparison with other common stressors allowed us to distinguish acid stress-specific adaptations from general stress response programs (Fig. 6). Therefore, it was possible to differentiate between direct and indirect effects triggered by protonation and/or cellular damage. Considering the growing volume of next-generation sequence data, denoising autoencoders will be an increasingly important tool for interpreting future studies in the full context of accumulating RNA-seq data sets. Colonizing the intestinal tract is a complex process that includes not only rapid pH changes but also alterations in oxygen and nutrient availability as well as competition with other bacteria. The ability of pathogenic E. coli strains to respond to such rapidly changing environments ensures their fitness advantage. We have shown here, for acid stress, the complexity of the regulatory network for ensuring survival and adaptation. The use of autoencoders, successfully tested here, could allow for the identification of physiological weak points associated with the survival of specific stresses. Targeting such weak points could lead to new classes of antibiotics or antivirulence treatments that take advantage of the unique expression patterns induced by natural stress conditions encountered in the host environment.

Strains and growth conditions
Bacterial strains and plasmids used in this study are listed in Table S9. E. coli strains were cultivated in LB medium (10 g/L tryptone, 5 g/L yeast extract, 10 g/L NaCl) and incubated aerobically in a rotary shaker at 37°C. When appropriate, media were supplemented with 15 µg/mL gentamicin, 100 µg/mL carbenicillin, or 50 µg/mL chloramphenicol.
For ribosome profiling and RNA-Seq experiments, the pH of the medium was adjusted at the indicated time points by the direct addition of 5 M HCl to the growing cultures (Fig. 1A).
For comparison with other stress conditions (heat stress, osmotic stress, oxidative stress, antibiotic stress, and ethanol stress), E. coli was initially grown in LB medium to OD600 = 0.5. Stress conditions were initiated either by moving flasks to a pre-heated 42°C incubator or by the addition of either H2O2 (2 mM), NaCl (300 mM), chloramphenicol [1.2 µg/mL (wt/vol)], or ethanol [5% (vol/vol)]. In all cases, samples were collected after 15 min of stress treatment.

Plasmid construction
Molecular methods were performed according to standard protocols or according to the manufacturer's instructions. Kits for the isolation of plasmids and the purification of PCR products were purchased from Süd-Laborbedarf. Enzymes and HiFi DNA Assembly Master Mix were purchased from New England Biolabs. To construct the reporter plasmid pBBR1-MCS5-PgadBC:lux (Table S11), 335 nt of the upstream region of gadBC were amplified by PCR using primers KSO-0131/KSO-0132 (Table S10) and MG1655 genomic DNA as a template. For the construction of the pBAD24-sORF15:3xFLAG plasmid (Table S11), the sORF15 coding region (excluding the stop codon) and an additional 15 nt upstream of the start codon (harboring the native Shine-Dalgarno sequence) were amplified using primers KSO-183/KSO-184 (Table S10) from MG1655 genomic DNA. After purification, the amplified fragments were assembled into PCR-linearized pBBR1-MCS5 or pBAD24-3xFLAG vectors via Gibson assembly (131). The pCA24N-control plasmid was obtained by excising the ydcI insert from a pCA24N-ydcI vector using restriction enzymes XhoI and SalI (NEB) and subsequent ligation using T4 DNA Ligase (NEB). Correct insertions were verified by colony PCR and sequencing.

Sample collection for Ribo-Seq and RNA-Seq
Three sets of biological triplicates of MG1655 cells were inoculated to a starting OD600 of 0.05 from overnight cultures and grown to exponential phase (OD600 = 0.5) in 200 mL of unbuffered LB medium (pH 7.6). Two sets of cultures were adjusted to pH 5.8 by direct addition of 5 M HCl, while one set was further grown for 30 min at pH 7.6 (Fig. 1A). After 15 min, one set of biological triplicates was further adjusted from pH 5.8 to pH 4.4 by the addition of 5 M HCl, while the other cultures remained at pH 5.8 or pH 7.6, respectively (Fig. 1A). Cells were grown for another 15 min, samples were collected, and Ribo-Seq and RNA-Seq were performed as described below. pH values before and after pH shifts, as well as final optical densities, were monitored throughout the experiment (Tables S1 and S2).

Ribosome profiling
Whole-culture flash freezing, cell lysis using a freezer mill, pelleting ribosomes over sucrose cushions, and ribosomal footprint isolation using a size selection gel were performed following the published protocol from Mohammad and Buskirk (40). MNase treatment, monosome recovery, RNA isolation, end-labelling by T4 polynucleotide kinase, and cDNA library construction were conducted by adapting the protocol from Latif and colleagues (39).
Briefly, 100 mL of liquid cultures and 10× lysis buffer [200 mM Tris pH 8, 1.5 M MgCl2, 1 M NH4Cl, 50 mM CaCl2, 4% (vol/vol) Triton X-100, 1% (vol/vol) NP-40] were flash frozen in liquid nitrogen. For each sample, 90 g of frozen cells was mixed with 10 g of frozen 10× lysis buffer and lysed by cryogenic grinding in a SPEX SamplePrep 6875 Freezer/Mill (10 cycles, 10 Hz, 5 min precool, 1 min run, 1 min cool). The pulverized samples were thawed, and the lysate was pre-cleared by centrifugation (9,800 × g, 10 min, 4°C) in a Beckman Coulter Optima XE-90 Ultracentrifuge using a 50.2 Ti Rotor. The supernatant was used to pellet ribosomes over sucrose cushions by centrifugation in a Beckman Coulter Optima XE-90 Ultracentrifuge using a 70.1 Ti Rotor (330,000 × g, 1.5 h, 4°C). After resuspension of pellets in resuspension buffer (20 mM Tris pH 8, 15 mM MgCl2, 100 mM NH4Cl, 5 mM CaCl2), nuclease digestion was performed using MNase (NEB) (2 h, 25°C). Monosomes were recovered using MicroSpin S-400 HR columns (GE Healthcare), and RNA was isolated using the miRNeasy Mini Kit (QIAGEN) in combination with the RNase-Free DNase Set (QIAGEN). The isolated RNA was loaded on a 15% TBE urea size selection gel, and after staining with SYBR Gold (Invitrogen), ribosomal footprints between 15 and 45 nt were excised from the gel. For elution of RNA, gel pieces were crushed by centrifugation through a hole poked with an 18 G needle into a 0.5-mL tube nested in a 1.5-mL tube. RNA was recovered by precipitation after adding elution buffer (300 mM NaOAc pH 5.5, 1 mM EDTA pH 8) to the crushed gel, overnight incubation (4°C), and centrifugation in Corning-Costar Spin-X Centrifuge Tube Filters (20,000 × g, 3 min, room temperature). The isolation of RNA fragments corresponding to ribosomal footprints of 15-45 nt size was verified using the RNA 6000 Nano Kit (Agilent) and the Agilent 2100 Bioanalyzer. Then, 5′ phosphorylation of RNA fragments was achieved using T4 polynucleotide kinase (NEB) and ATP (NEB). After RNA recovery using an RNA MinElute Cleanup Kit (QIAGEN), footprint fragments were again evaluated using the Agilent 2100 Bioanalyzer as described above. cDNA libraries were constructed using the NEBNext Small RNA Library Prep Set for Illumina (NEB) with 14 PCR amplification cycles. cDNA library quality was assessed using a High Sensitivity DNA Kit (Agilent). cDNA libraries were purified using the QIAquick PCR Purification Kit (QIAGEN) and sequenced using a HiSeq 1500 machine (Illumina) in single-read mode with a 50-bp read length.

RNA-Seq analysis
After flash-freezing the cultures for Ribo-Seq, 6 mL of the remaining culture volume was mixed with 1.2 mL of Stop Mix solution [95% (vol/vol) ethanol, 5% (vol/vol) phenol] to terminate ongoing transcription and translation. Samples were frozen in liquid nitrogen and stored at −80°C until RNA isolation. Cells were pelleted (3,000 × g, 15 min, 4°C), and total RNA was isolated using the miRNeasy Mini Kit (QIAGEN) in combination with the RNase-Free DNase Set (QIAGEN). The integrity of RNA samples was evaluated using the RNA 6000 Nano Kit (Agilent) and the Agilent 2100 Bioanalyzer. RNA was quantified using the Qubit RNA HS Assay Kit (Invitrogen). Ribosomal RNA depletion was performed using the NEBNext rRNA Depletion Kit for bacteria (NEB), and directional cDNA libraries were prepared using the NEBNext Ultra II Directional RNA Library Prep Kit for Illumina (NEB). cDNA library quality was assessed using a High Sensitivity DNA Kit (Agilent). Finally, cDNA libraries were sequenced using a NextSeq 1000 machine (Illumina) in single-read mode with a 60-bp read length.

Next-generation sequencing data analysis
E. coli Ribo-seq and RNA-seq raw sequencing libraries were processed and analyzed using the published HRIBO workflow (version 1.6.0) (37), which has previously been used for analysis of bacterial Ribo-seq data (35). HRIBO is a snakemake (132) workflow that downloads all required tools from Bioconda (133) and Singularity (134). All necessary processing steps are automatically determined by the workflow.
Quality control was performed by creating read count statistics for each processing step and RNA class with Subread featureCounts (1.6.3) (138). All processing steps were analyzed with FastQC (version 0.11.8) (139), and results were aggregated with MultiQC (version 1.7) (140). Additionally, a PCA was performed to determine whether the major source of variance in the data stems from the different experimental conditions. The PCA serves as a quality check for the downstream differential expression analysis. To generate the PCA plots, normalized read counts for all samples were generated using DESeq2 (141) normalization. To improve the clustering, a regularized log transform was applied to the normalized read counts. Standard deviations and variance were subsequently calculated using DESeq2, and the first three principal components were plotted using plotly (142).
Read coverage files were generated with HRIBO using different full read mapping approaches (global or centered) and single-nucleotide mapping strategies (5′ or 3′ end). Read coverage files were normalized using the counts per million (mil) normalization. For the mil normalization, read counts were normalized by the total number of mapped reads within the sample and scaled by a per-million factor.
Metagene analysis of ribosome density at start codons was performed as described previously (143). Here, annotated start codons of coding sequences are collected, and the density of the read coverage is determined for every position in a pre-determined window around the collected start codons.
A differential expression and translation analysis was performed using the tool deltaTE (43). The tool combines both Ribo-seq and RNA-seq data by calculating the TE of genes in order to capture changes in translational regulation when comparing different growth conditions.
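The mil normalization amounts to a simple rescaling, which can be sketched as follows. The function name and the toy numbers are hypothetical; this is an illustration of the arithmetic, not the HRIBO implementation.

```python
# Minimal sketch of counts-per-million ("mil") normalization of a coverage track.

def counts_per_million(coverage, total_mapped_reads):
    """Scale raw per-position read counts by the number of mapped reads, per million."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    scale = 1_000_000 / total_mapped_reads
    return [count * scale for count in coverage]


# Toy example: four positions from a sample with 2 million mapped reads.
raw_coverage = [0, 5, 20, 10]
normalized = counts_per_million(raw_coverage, total_mapped_reads=2_000_000)
```

Because each sample is scaled by its own library size, coverage tracks from samples with different sequencing depths become directly comparable in a genome browser.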
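A metagene profile of this kind can be sketched as an average of coverage windows around each start codon. The sketch below makes simplifying assumptions that are not taken from the cited method: it handles only forward-strand genes, uses 0-based positions, skips windows that run off the sequence, and uses hypothetical names and toy data.

```python
# Minimal sketch of a metagene profile around start codons (forward strand only).

def metagene_profile(coverage, start_positions, upstream=2, downstream=2):
    """Average read coverage at each offset in a window around the given start positions."""
    window = upstream + downstream + 1
    totals = [0.0] * window
    used = 0
    for start in start_positions:
        first, last = start - upstream, start + downstream
        if first < 0 or last >= len(coverage):
            continue  # window extends past the sequence, skip this start codon
        for offset in range(window):
            totals[offset] += coverage[first + offset]
        used += 1
    if used == 0:
        return totals
    return [total / used for total in totals]


# Toy coverage with ribosome density peaks at positions 3 and 9.
toy_coverage = [0, 1, 2, 9, 2, 1, 0, 1, 2, 9, 2, 1, 0]
profile = metagene_profile(toy_coverage, start_positions=[0, 3, 9])
```

The start codon at position 0 is skipped because its window would begin before the sequence; the two remaining windows are averaged position by position.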
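Conceptually, a change in TE is the part of the change in ribosome footprints that is not explained by the change in mRNA abundance. The sketch below illustrates this with plain log2 ratios; it is a deliberate simplification and not the deltaTE implementation, which fits a statistical model to replicate counts. The pseudocount, function names, and toy counts are hypothetical.

```python
import math

# Simplified illustration of a change in translation efficiency (TE).
# Assumption: one normalized count per condition; deltaTE itself models replicates.

def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of treated over control, with a pseudocount to avoid division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def delta_te(rpf_treated, rpf_control, rna_treated, rna_control, pseudocount=1.0):
    """Change in TE: log2 FC of footprints minus log2 FC of mRNA."""
    return (log2_fold_change(rpf_treated, rpf_control, pseudocount)
            - log2_fold_change(rna_treated, rna_control, pseudocount))


# Toy gene: footprints rise eightfold while mRNA only doubles.
te_change = delta_te(rpf_treated=799, rpf_control=99, rna_treated=199, rna_control=99)
```

A positive value indicates that translation increased beyond what the transcript level alone would predict; a value near zero indicates regulation at the transcriptional level only.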

Gene set enrichment analysis
GSEA was conducted using the tool clusterProfiler (52). To this end, the genome-wide annotation database for E. coli K-12 MG1655 (144) was used in combination with the results of the differential expression analysis. The analysis was focused on the biological process domain, omitting the cellular component and molecular function domains. Furthermore, the minimum gene set size was set to 5 and the maximum to 30. To avoid redundancy within the results, GO terms were filtered, and only terms at the bottom (lowest branch level) of the GO hierarchy were analyzed.
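The two filters on GO terms (gene set size between 5 and 30, and restriction to the lowest branch level) can be sketched as follows. The data structures, the function name, and the term identifiers are hypothetical, and the sketch assumes that a term at the lowest branch level is one without child terms; it does not reproduce the clusterProfiler workflow.

```python
# Minimal sketch: keep only leaf GO terms whose gene sets fall within a size range.
# Assumption: `children` maps a term to its direct child terms; leaves have none.

def select_gene_sets(gene_sets, children, min_size=5, max_size=30):
    """Return leaf terms with min_size <= number of genes <= max_size."""
    selected = {}
    for term, genes in gene_sets.items():
        unique_genes = sorted(set(genes))
        is_leaf = not children.get(term)
        if is_leaf and min_size <= len(unique_genes) <= max_size:
            selected[term] = unique_genes
    return selected


# Toy hierarchy with hypothetical term names: one parent and two leaves.
toy_gene_sets = {
    "term_parent": [f"gene{i}" for i in range(10)],
    "term_leaf_a": [f"gene{i}" for i in range(6)],
    "term_leaf_b": [f"gene{i}" for i in range(3)],
}
toy_children = {"term_parent": ["term_leaf_a", "term_leaf_b"]}
kept = select_gene_sets(toy_gene_sets, toy_children)
```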

Detection and differential expression of novel acid-induced sORF candidates
sORF candidates were initially detected using the neural network DeepRibo (99). Summary statistics, including TE, rpkm, codon counts, and nucleotide and amino acid sequences, for annotated and potential novel sORFs were computed using HRIBO (version 1.6.0) (37). Moreover, GFF track files were created for detailed manual inspection using the web-based genome browser JBrowse2 (100).
DeepRibo was reported to produce high numbers of false positives (145). Therefore, the high number of initial predictions (>25,000) was reduced by introducing cut-off criteria based on sORF length and Ribo-Seq coverage. Predictions with average rpkm (reads per kilobase of transcript per million mapped reads) values <30 across all Ribo-Seq samples or outside of the codon count range of 10-70 amino acids were excluded from further analysis. To specifically detect sORFs that are only detectable under acidic conditions and involved in the acid response of E. coli, the search was further restricted to predictions with deltaTE Ribo- and RNA log2 FC values of ≥2, in combination with P-adjust values of ≤0.05, at either pH 5.8 or pH 4.4, compared to pH 7.6. Additionally, DeepRibo predictions overlapping annotated genes on the same strand were excluded. The remaining 152 candidates were manually inspected in JBrowse2, and novel sORFs were included in our final candidate list (Table S7) if the coverage was even over the predicted ORF, restricted to the ORF boundaries, and potential Shine-Dalgarno sequences were detectable. The remaining candidates were manually curated, and, in a few cases, alternative sORFs in the genomic vicinity of DeepRibo predictions were selected, which matched better to the Ribo-Seq coverage signal. Differential expression values, comparing pH 7.6, pH 5.8, and pH 4.4, for novel sORF candidates, previously known sORFs, and sORFs detected by Storz and colleagues (94), were calculated using deltaTE (43).
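The automated cut-off criteria applied before manual inspection can be summarized in a short sketch. The record layout, field names, and example predictions are hypothetical, and a single log2 FC value per comparison is assumed for simplicity; the thresholds are those stated above.

```python
from dataclasses import dataclass

# Minimal sketch of the automated pre-filter for DeepRibo predictions.
# Assumption: one record per prediction with precomputed statistics (hypothetical layout).

@dataclass
class Prediction:
    name: str
    mean_rpkm: float            # average rpkm across all Ribo-Seq samples
    length_aa: int              # codon count of the predicted sORF
    log2fc_ph58: float          # log2 FC at pH 5.8 versus pH 7.6
    padj_ph58: float
    log2fc_ph44: float          # log2 FC at pH 4.4 versus pH 7.6
    padj_ph44: float
    overlaps_same_strand_gene: bool


def is_acid_induced_candidate(p, min_rpkm=30.0, min_aa=10, max_aa=70,
                              min_log2fc=2.0, max_padj=0.05):
    """Apply coverage, length, overlap, and induction criteria to one prediction."""
    if p.mean_rpkm < min_rpkm:
        return False
    if not (min_aa <= p.length_aa <= max_aa):
        return False
    if p.overlaps_same_strand_gene:
        return False
    induced_mild = p.log2fc_ph58 >= min_log2fc and p.padj_ph58 <= max_padj
    induced_severe = p.log2fc_ph44 >= min_log2fc and p.padj_ph44 <= max_padj
    return induced_mild or induced_severe


# Hypothetical predictions for illustration only.
predictions = [
    Prediction("pred_a", 120.0, 28, 0.4, 0.60, 3.1, 0.001, False),  # induced at pH 4.4
    Prediction("pred_b", 12.0, 35, 2.5, 0.01, 2.8, 0.010, False),   # coverage too low
    Prediction("pred_c", 80.0, 93, 2.2, 0.02, 2.9, 0.010, False),   # too long
    Prediction("pred_d", 95.0, 40, 2.6, 0.01, 3.0, 0.001, True),    # overlaps a gene
]
candidates = [p.name for p in predictions if is_acid_induced_candidate(p)]
```

Predictions passing this filter would still require visual inspection of coverage, ORF boundaries, and Shine-Dalgarno sequences, as described in the text.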

RNA isolation and RT-qPCR analysis
RNA was isolated using the Quick-RNA Miniprep Kit (Zymo Research) or miRNeasy Mini Kit (QIAGEN) in combination with the RNase-Free DNase Set (QIAGEN) according to the manufacturer's instructions. Total RNA was DNase digested for 30 min at 37°C using 1 µL TURBO DNase (2 U/µL) (Invitrogen). A 500-ng aliquot of the isolated RNA was converted to cDNA with the iScript Advanced Kit (Bio-Rad) according to the manufacturer's instructions. Then, 1 µL of a 1:10 dilution in nuclease-free water of the cDNA samples was mixed with 5 µL of SsoAdvanced Univ SYBR Green Supermix (Bio-Rad) and 0.8 µL of 5 µM forward and reverse primers (Table S10). The total reaction volume was adjusted to 10 µL with nuclease-free water, dispensed in triplicate in a 96-well PCR plate (Bio-Rad), and subjected to qPCR in a Bio-Rad CFX real-time cycler. Evaluation of the obtained data was performed according to the ΔΔCt method (146), using recA or secA genes as internal references.
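For orientation, the ΔΔCt calculation can be sketched as follows. The sketch assumes an amplification efficiency of 100% (a doubling per cycle) and averages the technical triplicates before subtraction; the function and argument names are illustrative only.

```python
from statistics import mean

def fold_change_ddct(target_treated, ref_treated, target_control, ref_control):
    """Relative expression by the ddCt method, assuming 100% PCR efficiency.

    Each argument is a list of Ct values (e.g., technical triplicates) for the
    target gene or the reference gene (e.g., recA or secA).
    """
    delta_treated = mean(target_treated) - mean(ref_treated)   # dCt, stressed
    delta_control = mean(target_control) - mean(ref_control)   # dCt, control
    delta_delta = delta_treated - delta_control                # ddCt
    return 2 ** (-delta_delta)
```

For example, a target whose Ct drops by three cycles relative to the reference under stress corresponds to an eightfold induction.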

Acid shock assay
Acid resistance was determined based on previously described protocols (16,147) with the following modifications: E. coli BW25113 cells were grown at 37°C in LB pH 7.6 to OD600 = 0.5, adjusted to OD600 = 1, and then shifted for 15 min each, first to LB pH 5.8 and then to LB pH 4.4. Then, cells were shifted to LB pH 3 for 1 h at 37°C. As a control, cells were cultivated at pH 7.6 throughout the experiment. After 1 h at pH 7.6 or pH 3, samples were serially diluted in 1× phosphate-buffered saline (PBS) and plated on LB agar plates to count the number of colonies. Percent survival was calculated as the ratio of colony-forming units at pH 3 and pH 7.6. For complementation of mutants from the Keio collection (84), BW25113 strains were transformed with IPTG-inducible pCA24N plasmids from the ASKA collection (Table S11) (85). Acid shock survival rates were determined as described above, with the exception that media were supplemented with 50 µg/mL chloramphenicol and different concentrations of IPTG to mimic native expression levels. pCA24N-control, gadE, and gadW vectors were induced with 100 µM IPTG, pCA24N-rcsB, and iscR with 10 µM IPTG, and pCA24N-iscR, ydcI, and mhpR with 1 µM IPTG.
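The survival calculation can be written out as a short sketch. The conversion from colony counts to colony-forming units per milliliter, including the dilution factor and plated volume, is a generic assumption and is not specified in the protocol above.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Convert a plate count to CFU/mL (dilution_factor is, e.g., 1e4 for 10^-4)."""
    return colonies * dilution_factor / plated_volume_ml

def percent_survival(cfu_ph3, cfu_ph76):
    """Percent survival: CFU after 1 h at pH 3 relative to the pH 7.6 control."""
    if cfu_ph76 <= 0:
        raise ValueError("control CFU must be positive")
    return 100.0 * cfu_ph3 / cfu_ph76
```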

Promoter activity assay
In vivo promoter activities of gadBC, adiA, and cadBA were determined using luminescence-based reporter plasmids harboring fusions of the respective promoter regions to the luxCDABE genes from Photorhabdus luminescens. BW25113 wild-type cells or corresponding mutants from the Keio collection (84) were transformed with plasmids pBBR1-PgadBC:lux, pBBR1-PadiA:lux, or pBBR1-PcadBA:lux. All strains were cultivated in LB medium supplemented with gentamicin overnight. The overnight cultures were inoculated to an OD600 of 0.05 in fresh LB medium (pH 7.6), aerobically cultivated until exponential phase (OD600 = 0.5), and shifted to LB pH 5.8. To assess the promoter activity of adiA and gadBC, the cultures were shifted again to LB pH 4.4 after 15 min of growth in LB pH 5.8. In the next step, the cells were transferred to a 96-well plate and aerobically cultivated at 37°C in LB medium at different pH values supplemented with gentamicin. Growth and bioluminescence were measured every 10 min in the microtiter plates using a CLARIOstar Plus plate reader (BMG Labtech). Data are reported as relative light units in counts per second per OD600.
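The normalization of luminescence to cell density can be sketched as follows. The minimum OD600 cut-off that guards against division by near-zero readings is an assumption added for illustration and is not part of the described protocol.

```python
def rlu_per_od(luminescence_cps, od600, min_od=0.01):
    """Relative light units (counts per second) per OD600 for each time point.

    Time points with OD600 below min_od (hypothetical cut-off) are skipped.
    """
    return [lum / od for lum, od in zip(luminescence_cps, od600) if od >= min_od]

def max_rlu(luminescence_cps, od600, min_od=0.01):
    """Maximal RLU over the time course."""
    return max(rlu_per_od(luminescence_cps, od600, min_od))
```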

Autoencoder-based identification of acid-specific genes and biological processes
The denoising autoencoders (DAEs) in this study were implemented using the Python package Keras (148). Details of the various hyperparameter and network architecture choices were taken from reference (113). In brief, an ensemble of 100 DAEs with two hidden layers of 2,000 and 1,000 nodes between the input layer and the bottleneck was used. The bottleneck layer of each network consists of 50 nodes, as this was found to be within an optimal range for DAEs trained on the PRECISE 2.0 expression compendium (114) by Kion-Crosby and Barquist (113). All layers have sigmoid activation functions; the weights of each layer were randomly initialized based on the Glorot distribution, and the bias vectors were initialized to zero. The weight matrices that make up the decoder of each network are tied together such that they consist of the transpose of the corresponding weight matrices of each encoder.
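To make the tied-weight architecture concrete, the following is a minimal forward-pass sketch in plain Python rather than Keras. It is not the implementation used in the study: the Glorot uniform variant, the separate decoder bias vectors, and the class and function names are assumptions, and the layer sizes are shrunk so that the example runs quickly.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng):
    """Weight matrix (fan_in x fan_out) drawn from the Glorot uniform range."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(x, weights, bias):
    """Sigmoid layer: weights has len(x) rows and len(bias) columns."""
    return [sigmoid(sum(xi * weights[i][j] for i, xi in enumerate(x)) + bias[j])
            for j in range(len(bias))]

def transpose(weights):
    return [list(col) for col in zip(*weights)]

class TiedAutoencoder:
    """Encoder weights are stored once; the decoder reuses their transposes."""

    def __init__(self, sizes, seed=0):
        # In the study layout, sizes would be [n_genes, 2000, 1000, 50].
        rng = random.Random(seed)
        self.weights = [glorot_uniform(a, b, rng)
                        for a, b in zip(sizes, sizes[1:])]
        self.enc_bias = [[0.0] * b for b in sizes[1:]]
        self.dec_bias = [[0.0] * a for a in reversed(sizes[:-1])]

    def encode(self, x):
        for w, b in zip(self.weights, self.enc_bias):
            x = dense(x, w, b)
        return x

    def decode(self, z):
        for w, b in zip(reversed(self.weights), self.dec_bias):
            z = dense(z, transpose(w), b)
        return z
```

Because the decoder has no weights of its own, training such a network only updates the encoder matrices and the bias vectors, which roughly halves the number of free parameters.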
All networks in the ensemble were trained using the Adam optimization algorithm. Data corruption during training was employed to improve generalizability, such that 10% of the entries of each input data point were randomly set to zero during each training step. Additionally, early stopping was employed during training; the data were randomly partitioned into an 80% training set and a 10% validation set, and training was stopped once the validation score began to worsen. A 10% test set was also set aside to determine the optimal training parameters. A local search over all training parameters, including the learning rate and batch size, was performed using the test score as a metric, and training was done with batch shuffling enabled.
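The input corruption and the data partitioning can be sketched as follows. Zeroing an exact 10% of entries (rather than zeroing each entry independently with probability 0.1) and the rounding of partition sizes are assumptions made for this illustration.

```python
import random

def corrupt(sample, rate=0.10, rng=random):
    """Masking noise: set a random fraction `rate` of the entries to zero."""
    n_zero = round(rate * len(sample))
    zeroed = set(rng.sample(range(len(sample)), n_zero))
    return [0.0 if i in zeroed else v for i, v in enumerate(sample)]

def split_indices(n, rng, fractions=(0.8, 0.1, 0.1)):
    """Randomly partition range(n) into training, validation, and test indices."""
    order = list(range(n))
    rng.shuffle(order)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])
```

In a denoising setup, the corrupted vector is fed to the network while the loss is computed against the uncorrupted original, so a fresh call to `corrupt` would be made at every training step.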
For the training of the autoencoder ensemble, the PRECISE 2.0 compendium (114) was used, as well as six additional data points (115) representing E. coli K-12 MG1655 strains grown in M9 medium and treated with various antibiotics, in addition to the nine data points from the current study. All data were converted to log transcripts per million, and features were normalized after the train/validation/test split such that the expression of each gene was scaled between 0 and 1.
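A per-gene min-max scaling can be sketched as below. The sketch assumes that the minimum and maximum of each gene are estimated on the training partition only and then applied to all partitions, which is a common way to avoid information leakage; the text does not state which partition the ranges were taken from, and the function names are illustrative.

```python
def fit_min_max(train_rows):
    """Per-gene minimum and maximum from the training samples (rows)."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, mins, maxs):
    """Scale each gene to [0, 1] relative to the fitted range.

    Genes with a constant training value are mapped to 0.0. Values outside
    the training range are not clipped in this sketch.
    """
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, mins, maxs)]
            for row in rows]
```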
Following the procedure described by Kion-Crosby and Barquist (113), the encoder of each of the 100 trained networks was used to determine which nodes turn on for the specific stress conditions of interest (e.g., pH 4.4) and simultaneously turn off for alternative stress conditions (e.g., oxidative stress, heat stress, and ethanol). This was done by passing each data point into each encoder while observing the activations of nodes at the bottleneck layer. After identifying which nodes are specific to the stress condition of interest, each corresponding decoder was utilized to generate gene expression predictions for all E. coli genes by activating each of these bottleneck nodes individually and propagating this signal through the decoder. After sorting all genes based on the average decoder output from all identified nodes, the top of this sorted vector was used to define a gene set associated with the condition of interest.
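The node selection and gene ranking can be illustrated with a simplified sketch. The activation thresholds for "on" and "off", the data layout, and the function names are hypothetical; the actual criteria are those of reference (113) and are not reproduced here.

```python
def specific_nodes(activations, target, on=0.9, off=0.1):
    """Indices of bottleneck nodes that are on for `target` and off elsewhere.

    `activations` maps a condition name to its list of bottleneck activations;
    the thresholds `on` and `off` are hypothetical.
    """
    n_nodes = len(activations[target])
    return [j for j in range(n_nodes)
            if activations[target][j] >= on
            and all(values[j] <= off
                    for cond, values in activations.items() if cond != target)]

def top_genes(decoder_outputs, genes, k):
    """Rank genes by mean decoder output across the selected nodes; keep top k.

    `decoder_outputs` holds one predicted expression vector per selected node.
    """
    means = [sum(col) / len(col) for col in zip(*decoder_outputs)]
    ranked = sorted(zip(genes, means), key=lambda pair: pair[1], reverse=True)
    return [gene for gene, _ in ranked[:k]]
```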
Additionally, GSEA was run on the outputs of each decoder for each node to determine which biological processes were associated with the condition of interest. Since these groups often consist of ~100 nodes and thousands of GO terms are evaluated, a conservative threshold of 0.0005 was set for the adjusted P-value. The adjusted P-value was obtained using the Benjamini-Hochberg (BH) method, and each P-value was computed based on 1,000,000 permutations. Finally, for the selection of the high-confidence acid-specific gene candidates, the log2 FC between each acidic condition and every other stress condition was first calculated for all genes in the acid-specific gene sets. Genes with a log2 FC of at least 0.5 in at least 95% of comparisons were then selected.
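The final selection step can be sketched as a simple filter. The input layout, a mapping from gene name to the list of log2 FC values against every other stress condition, is an assumption made for illustration.

```python
def high_confidence_genes(log2fc_by_gene, min_lfc=0.5, min_fraction=0.95):
    """Genes whose log2 FC is >= min_lfc in >= min_fraction of comparisons."""
    selected = []
    for gene, lfcs in log2fc_by_gene.items():
        if not lfcs:
            continue  # no comparisons available for this gene
        passing = sum(1 for value in lfcs if value >= min_lfc)
        if passing / len(lfcs) >= min_fraction:
            selected.append(gene)
    return selected
```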

Propidium iodide viability staining
Cells were grown at pH 7.6, 5.8, or 4.4 in the same manner as described for Ribo-Seq and RNA-Seq (Fig. 1A). As a positive control, cells were grown to an OD600 of 1 and subsequently heat-shocked for 5 min at 80°C. After cultivation, 1 mL of each culture was centrifuged (15,000 × g, room temperature) and washed with PBS. Propidium iodide (Invitrogen) was added to a final concentration of 2 µg/mL, followed by a 5-min incubation at room temperature in the dark to label dead cells. After another wash step using PBS, 2 µL of the culture was spotted on 1% (wt/vol) agarose pads, placed onto microscope slides, and covered with a coverslip. Microscopic images were taken using a Leica DMi8 inverted microscope equipped with a Leica DFC365 FX camera. An excitation wavelength of 546 nm and a 605-nm emission filter with a 75-nm bandwidth were used to detect fluorescence with an exposure of 500 ms, a gain of 5, and 100% intensity in the Leica LAS X 3.7.4 software.
To quantify the relative fluorescent intensities (RF) of single cells, phase contrast and fluorescent images were analyzed using the ImageJ (149) plugin MicrobeJ (150). Default settings of MicrobeJ were used for cell segmentation (Fit shape, rod-shaped bacteria) apart from the following settings: area, 0.1-max µm²; length, 1.2-5 µm; width, 0.1-1 µm; curvature, 0-0.15; and angularity, 0-0.25 for E. coli cells. In total, ≥1,000 cells were quantified per strain and condition, and the background of the agarose pad was subtracted from each cell per field of view. Cells with RF values ≥300 after subtraction of the background were considered dead.
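The dead-cell classification reduces to a background subtraction followed by a threshold. A minimal sketch, assuming one background value per field of view and per-cell intensities exported from the image analysis (names are illustrative):

```python
def fraction_dead(cell_intensities, background, threshold=300.0):
    """Fraction of cells with background-corrected RF at or above the threshold."""
    corrected = [value - background for value in cell_intensities]
    dead = sum(1 for value in corrected if value >= threshold)
    return dead / len(corrected)
```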

Western blot analysis
To verify the translation of sORF15, E. coli MG1655 cells harboring a pBAD24-sORF15:3xFLAG plasmid were cultivated in LB medium. Once an OD600 of 0.5 was reached, expression of sORF15 was induced for 1 h by adding L-arabinose to a final concentration of 0.2% (vol/vol). Next, culture aliquots normalized to an OD600 of 1 were collected by centrifugation (1 min, 13,000 × g, 4°C). Supernatants were removed, and pellets were resuspended in 100 µL of protein loading buffer [12% SDS (wt/vol), 30% glycerol (wt/vol), 0.05% Coomassie blue (wt/vol), 150 mM Tris-HCl pH 7]. Prior to loading, protein samples were denatured at 95°C for 5 min and then chilled on ice. Protein fractions were separated by Tricine-SDS-PAGE on a 16% gel containing 6 M urea (151). After separation, proteins were transferred onto a nitrocellulose membrane in a semidry blot chamber. The membrane was blocked for 1 h in 5% (wt/vol) skim milk in 1× TBS. After a brief wash with Tris-buffered saline with Tween (TBS-T), blots were hybridized with the primary antibody (α-FLAG, Invitrogen) for 1 h at room temperature. After three washing steps in TBS-T for 10 min each, membranes were incubated with an alkaline phosphatase-conjugated secondary antibody (α-rabbit, Rockland Immunochemicals) for 1 h at room temperature. After three washing steps conducted as described above, proteins were visualized using colorimetric detection of alkaline phosphatase activity with 5-bromo-4-chloro-3-indolyl phosphate and nitro blue tetrazolium chloride.

FIG 2
FIG 2 Genome-wide adaptations correlate at the transcriptional and translational levels in E. coli under acid stress. Weighted Venn diagrams show the total number and overlap of genes with significant FCs (absolute log2 FC ≥1 and P-adjust ≤0.05) determined by (A) RNA-Seq or (B) Ribo-Seq for cells exposed to pH 5.8 or pH 4.4, compared to the control (pH 7.6). Comparison of global RPF and mRNA log2 FC values for (C) pH 5.8 or (D) pH 4.4 vs pH 7.6. Dashed lines indicate log2 fold change values of +1 or −1. Hundreds of genes exhibited differential expression (absolute log2 FC ≥1 and P-adjust ≤0.05) at both the transcriptional and translational levels, whereas others were exclusively detected by either RNA-Seq (red dots) or Ribo-Seq (blue dots) or had significant changes in opposite directions (yellow dots). Values of the Pearson correlation coefficient (r) are indicated.

a C, cytosol; P, periplasm; IM, inner membrane.

FIG 3
FIG 3 Up- and downregulated biological processes in E. coli at pH 5.8 and 4.4. GSEA was conducted using the gseGO function in the clusterProfiler package (52) with the ribosome profiling differential expression data sorted by log2 fold change values as input. GO terms were considered up- or downregulated if P-adjust values were ≤0.05. The top 15 non-redundant GO terms were sorted in descending order by the clusterProfiler enrichment score and are shown for pH 5.8 vs 7.6 and pH 4.4 vs 7.6. The dot size represents the number of genes associated with each GO term, and the dot color represents adjusted P-values corrected for the false discovery rate.

FIG 4
FIG 4 Contribution of TFs to survival and AR induction under acid stress. Comparison of global RPF and mRNA log2 FC values of transcriptional regulators for (A) pH 5.8 vs pH 7.6 and (B) pH 4.4 vs pH 7.6. Dashed lines indicate log2 fold changes of +1 or −1. Changes detected exclusively by either RNA-Seq (red dots) or Ribo-Seq (blue dots) are colored. TFs described in panels C-F are highlighted. (C) Acid shock assay to test the survival of E. coli BW25113 (WT) and the indicated mutants (84). Cells were grown in LB pH 7.6 to OD600 = 0.5. The cultures were split and then either grown at pH 7.6 or stepwise stressed (15 min pH 5.8, 15 min pH 4.4) before being exposed to LB pH 3 for 1 h. Colony-forming units were counted, and the ratio of surviving cells was calculated. The dashed line indicates the average percentage of surviving WT cells. (D through F) Luciferase-based promoter assays. WT and the indicated mutants were transformed either with plasmid pBBR1-PgadBC:lux (D), pBBR1-PadiA:lux (E), or pBBR1-PcadBA:lux (F) and grown in LB medium (pH 7.6) until OD600 = 0.5. The medium pH was then adjusted to 5.8 to induce the Cad system or to pH 4.4 to induce the Adi and Gad systems. Luminescence and growth were determined every 10 min in microtiter plates using a CLARIOstar Plus plate reader (BMG Labtech). Data are reported as relative light units (RLUs) in counts per second per OD600, with maximal RLU shown. Dashed lines indicate the average maximal RLU values of the WT. (C through F) All experiments were performed in biological replicates (n = 3), and error bars represent standard deviations of the mean. Analysis of variance, followed by Bonferroni's multiple comparisons test, was used to compare log-transformed maximal RLU values between mutant strains and the wild type (BW25113) (*P ≤ 0.05, ***P ≤ 0.001, ****P ≤ 0.0001).

FIG 6
FIG 6 Differentiation between acid stress responses and general stress using autoencoders. (A) Schematic overview of the autoencoder ensemble training and subsequent bottleneck group identification pipeline. (B) Donut charts indicating the proportion of bottleneck nodes that turn on only under the specified conditions. Absolute numbers of specific bottleneck nodes are listed in brackets after each condition. (C) Significantly enriched GO terms associated with pH 4.4- and pH 5.8-specific nodes. Left-facing red bars indicate nodes that downregulate the corresponding GO term, while right-facing blue bars indicate upregulation. (D) Verification of acid stress-specific genes predicted by autoencoders. E. coli cells were either grown as indicated in Fig. 1A or exposed to heat (42°C), oxidative (H2O2), osmotic (NaCl), chloramphenicol (CAM), or ethanol (EtOH) stress. Total RNA was isolated, and relative mRNA levels were measured by RT-qPCR. Fold change values were determined relative to non-stress conditions and normalized using either secA or recA as reference genes. Standard deviations were calculated from three replicates (n = 3) and accounted for <10% of fold change values in all cases. PQ, paraquat; LOX, low oxygen; TMP, trimethoprim.

FIG 7
FIG 7 Overview of the fine-tuned response of E. coli to mild (pH 5.8) and severe (pH 4.4) acid stress. Acid stress counteracting mechanisms, including those revealed by Ribo-Seq and RNA-Seq, are indicated.

TABLE 1
Top 20 genes with increased RPF levels at pH 5.8 compared to pH 7.6, sorted in descending order by Ribo-Seq log2 FC values a

TABLE 2
Top 20 genes with decreased RPF levels at pH 5.8 compared to pH 7.6, sorted in ascending order by Ribo-

TABLE 3
Top 20 genes with increased RPF levels at pH 4.4 compared to pH 7.6, sorted in descending order by Ribo-Seq log2 FC values a