Epithelial cells maintain memory of prior infection with Streptococcus pneumoniae through di-methylation of histone H3

Epithelial cells are the first point of contact for bacteria entering the respiratory tract. Streptococcus pneumoniae is an obligate human pathobiont of the nasal mucosa, carried asymptomatically but also the cause of severe pneumonia. The role of the epithelium in maintaining homeostatic interactions or mounting an inflammatory response to invasive S. pneumoniae is currently poorly understood. However, studies have shown that chromatin modifications, at the histone level, induced by bacterial pathogens interfere with the host transcriptional program and promote infection. Here, we uncover a histone modification induced by S. pneumoniae infection that is maintained for at least 9 days after clearance of bacteria with antibiotics. Di-methylation of histone H3 on lysine 4 (H3K4me2) is induced in an active manner by bacterial attachment to host cells. We show that infection establishes a unique epigenetic program affecting the transcriptional response of epithelial cells, rendering them more permissive upon secondary infection. Our results establish H3K4me2 as a unique modification induced by infection, distinct from H3K4me3 or me1, which localizes to enhancer regions genome-wide. Therefore, this study reveals evidence that bacterial infection leaves a memory in epithelial cells after bacterial clearance, in an epigenomic mark, thereby altering cellular responses to subsequent infections and promoting infection.


Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.

Life sciences
Behavioural & social sciences
Ecological, evolutionary & environmental sciences

For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf

Life sciences study design
All studies must disclose on these points even when the disclosure is negative.

Sample size
Sample size was determined to reach statistical significance whenever possible.

Data exclusions
The only data excluded from analysis were in the transcriptome analysis. Replicate 3 from the 1° infection was filtered out after principal component analysis. No other data were excluded from analysis.

Replication
Three or more replicates were performed for every experiment, and all data are shown in the figures.

Randomization
In vitro experiments were randomized only by different placement of samples in multi-well culture plates. Animals were randomized into treatment groups prior to in vivo experiments.

Blinding
Blinding was not possible in our experiments as the scientist performing the experiments was also analyzing them.

nature portfolio | reporting summary (April 2023)

Reporting for specific materials, systems and methods
We require information from authors about some types of materials, experimental systems and methods used in many studies. Here, indicate whether each material, system or method listed is relevant to your study. If you are not sure if a list item applies to your research, read the appropriate section before selecting a response.

Validation
All antibodies have previously been validated by the source providers, with citations and data available on the manufacturers' websites.

Eukaryotic cell lines
Policy information about cell lines and Sex and Gender in Research

Cell line source(s)

Authentication
None of these cell lines were authenticated.

Mycoplasma contamination
We confirm that all cells used in this study are mycoplasma negative.

Commonly misidentified lines (See ICLAC register)
None

ChIP-seq

Data deposition
Confirm that both raw and final processed data have been deposited in a public database such as GEO.
Confirm that you have deposited or provided access to graph files (e.g. BED files) for the called peaks.

Data access links
May remain private before publication.

Files in database submission
Provide a list of all files available in the database submission.

Peak calling parameters
Narrow peak calling; options: no_model, cutoff = 0.1, genomeSize = hs, readLength: 65. Peak reproducibility: IDR for H3K4me3 and an intersection approach requiring at least 40% length overlap for H3K4me2.
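
The intersection approach used for H3K4me2 peak reproducibility can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names, the tuple representation of a peak, and the reading of "40% length overlap" as a reciprocal requirement (the overlap must cover at least 40% of each peak) are assumptions made for illustration only.

```python
from typing import List, Tuple

# A peak as (chromosome, start, end), half-open coordinates as in BED files.
Peak = Tuple[str, int, int]


def overlap_fraction(a: Peak, b: Peak) -> float:
    """Fraction of peak a's length that is covered by peak b."""
    if a[0] != b[0]:
        return 0.0  # different chromosomes never overlap
    overlap = min(a[2], b[2]) - max(a[1], b[1])
    length = a[2] - a[1]
    if overlap <= 0 or length <= 0:
        return 0.0
    return overlap / length


def reproducible_peaks(rep1: List[Peak], rep2: List[Peak],
                       min_fraction: float = 0.4) -> List[Peak]:
    """Keep peaks from rep1 that overlap some rep2 peak by at least min_fraction.

    ASSUMPTION: the threshold is applied reciprocally, i.e. to the length of
    both peaks. The source states only "at least 40% length overlap".
    """
    kept = []
    for p in rep1:
        for q in rep2:
            if (overlap_fraction(p, q) >= min_fraction
                    and overlap_fraction(q, p) >= min_fraction):
                kept.append(p)
                break  # one supporting peak in rep2 is enough
    return kept


# Hypothetical coordinates, for illustration only.
replicate_1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
replicate_2 = [("chr1", 150, 250), ("chr1", 590, 700)]
shared = reproducible_peaks(replicate_1, replicate_2)
```

In this toy input only the first peak of `replicate_1` is retained: it shares 50 of its 100 bases with a peak in `replicate_2`, whereas the second shares only 10 and the third lies on a different chromosome. The pairwise loop is quadratic and is meant for clarity; a genome-scale analysis would normally rely on a sorted-interval tool.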

Plots
Confirm that:
The axis labels state the marker and fluorochrome used (e.g. CD4-FITC).
The axis scales are clearly visible. Include numbers along axes only for bottom left plot of group (a 'group' is an analysis of identical markers).
All plots are contour plots with outliers or pseudocolor plots.
A numerical value for number of cells or percentage (with statistics) is provided.

Cell population abundance
Between 12 000 and 19 000 cells were imaged for Figure S1. Between 400 and 10 000 cells were quantified for Figure 4E.

Methodology
Gating strategy
In Figure S1, all cells are shown. For Figure 4E, cells were stained for the epithelial cell marker CD326 (EpCAM), CD45 and H3K4me2 for analysis by flow cytometry.
Tick this box to confirm that a figure exemplifying the gating strategy is provided in the Supplementary Information.

were detached and stained with the indicated markers; mouse lungs were dissociated and stained for analysis.

Instrument
CytoFLEX S (Beckman Coulter) and Miltenyi MACSQuant

Software
FlowJo