Spontaneous episodic inflammation in the intestines of mice lacking HNF4A is driven by microbiota and associated with early life microbiota alterations

ABSTRACT The inflammatory bowel diseases (IBD) occur in genetically susceptible individuals who mount inappropriate immune responses to their microbiota leading to chronic intestinal inflammation. Whereas IBD clinical presentation is well described, how interactions between microbiota and host genotype impact early subclinical stages of the disease remains unclear. The transcription factor hepatocyte nuclear factor 4 alpha (HNF4A) has been associated with human IBD, and deletion of Hnf4a in intestinal epithelial cells (IECs) in mice (Hnf4aΔIEC) leads to spontaneous colonic inflammation by 6–12 mo of age. Here, we tested if pathology in Hnf4aΔIEC mice begins earlier in life and if microbiota contribute to that process. Longitudinal analysis revealed that Hnf4aΔIEC mice reared in specific pathogen-free (SPF) conditions develop episodic elevated fecal lipocalin 2 (Lcn2) and loose stools beginning by 4–5 wk of age. Lifetime cumulative Lcn2 levels correlated with histopathological features of colitis at 12 mo. Antibiotic and gnotobiotic tests showed that these phenotypes in Hnf4aΔIEC mice were dependent on microbiota. Fecal 16S rRNA gene sequencing in SPF Hnf4aΔIEC and control mice disclosed that genotype significantly contributed to differences in microbiota composition by 12 mo, and longitudinal analysis of the Hnf4aΔIEC mice with the highest lifetime cumulative Lcn2 revealed that microbial community differences emerged early in life when elevated fecal Lcn2 was first detected. These microbiota differences included enrichment of a novel phylogroup of Akkermansia muciniphila in Hnf4aΔIEC mice. We conclude that HNF4A functions in IEC to shape composition of the gut microbiota and protect against episodic inflammation induced by microbiota throughout the lifespan.

IMPORTANCE The inflammatory bowel diseases (IBD), characterized by chronic inflammation of the intestine, affect millions of people around the world. Although significant advances have been made in the clinical management of IBD, the early subclinical stages of IBD are not well defined and are difficult to study in humans. This work explores the subclinical stages of disease in mice lacking the IBD-associated transcription factor HNF4A in the intestinal epithelium. Whereas these mice do not develop overt disease until late in adulthood, we find that they display episodic intestinal inflammation, loose stools, and microbiota changes beginning in very early life stages. Using germ-free and antibiotic-treatment experiments, we reveal that intestinal inflammation in these mice was dependent on the presence of microbiota. These results suggest that interactions between host genotype and microbiota can drive early subclinical pathologies that precede the overt onset of IBD and describe a mouse model to explore those important processes.


Tissue sampling and Histology
Following CO2 euthanasia, the small and large intestines were dissected out, and attached fat was trimmed away. The small intestine was then further divided into duodenum (proximal quarter), jejunum (next two quarters), and ileum (most distal quarter). Intestinal segments were Swiss-rolled (2) and fixed in zinc-buffered formalin (Thermo Scientific 5701ZF) at room temperature overnight, then processed, sectioned, and stained with hematoxylin and eosin (H&E) by the Duke Pathology Research Immunohistology Lab.

Histological image analysis and scoring
Histological scores of H&E-stained colon sections were assigned by a board-certified veterinary pathologist in a masked fashion, without knowledge of allocation group. Due to the mild, segmental, and variable phenotype, germ-free control animals were used as a baseline reference control group, and a scoring system to subjectively and semi-quantitatively assess inflammation was developed based on lamina propria inflammatory infiltrates and increased crypt hyperplasia in the proximal, middle, and distal colon. Each mouse received a score of 0 = no change, 1 = minimal, 2 = mild, 3 = moderate, or 4 = severe for both lamina propria inflammatory infiltrates (inflammation score) and crypt hyperplasia in each segment of the colon (proximal, middle, and distal). The sums of the scores across segments were taken as the total colon inflammation and total colon crypt hyperplasia scores for each mouse. The sum of the total colon inflammation and total colon crypt hyperplasia scores was taken as the pathology index for each mouse.
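The scoring arithmetic described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function and key names are hypothetical, and the example scores are invented for demonstration only.

```python
# Sketch of the pathology index: per-segment scores (0-4) are summed
# across proximal, middle, and distal colon for each feature, and the
# two totals are then added together.
SEGMENTS = ("proximal", "middle", "distal")
VALID_SCORES = (0, 1, 2, 3, 4)  # 0 = no change ... 4 = severe


def _total(scores):
    """Validate and sum one feature's scores across the three segments."""
    for segment in SEGMENTS:
        if scores[segment] not in VALID_SCORES:
            raise ValueError(f"score for {segment} must be an integer 0-4")
    return sum(scores[segment] for segment in SEGMENTS)


def pathology_index(inflammation, crypt_hyperplasia):
    """Return both totals and their sum (the pathology index)."""
    total_inflammation = _total(inflammation)
    total_hyperplasia = _total(crypt_hyperplasia)
    return {
        "total_inflammation": total_inflammation,
        "total_crypt_hyperplasia": total_hyperplasia,
        "pathology_index": total_inflammation + total_hyperplasia,
    }


# Invented example scores for a single mouse.
example = pathology_index(
    inflammation={"proximal": 1, "middle": 0, "distal": 2},
    crypt_hyperplasia={"proximal": 0, "middle": 1, "distal": 1},
)
```

With three segments scored 0 to 4 for each of two features, the index defined this way ranges from 0 to 24.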
40x scans of H&E-stained slides were made with a Leica Aperio GT450 slide scanner by the Duke Pathology Research Immunohistology Lab. Quantitative measurements of crypt length and goblet cell number were made by CK in a blinded fashion. Crypt length was measured with ObjectiveView software. Whole crypts with good cross-sectioning were selected for quantification. Goblet cells were counted by eye. Twelve crypts were quantified per proximal (innermost loops until rugae end), middle (loops in between proximal and distal loops), and distal (outermost ~2 loops) colon segment, then averaged within their colonic segment to generate a single average measurement per colon segment per mouse.
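The per-segment averaging step might look like the sketch below for one mouse. The function name, the tuple layout, and the check on the number of crypts are assumptions made for illustration; the measurements in the example are invented.

```python
from collections import defaultdict
from statistics import mean


def segment_means(crypts, expected_n=12):
    """Average crypt measurements within each colon segment.

    crypts: iterable of (segment, crypt_length, goblet_cell_count) tuples
            for a single mouse (hypothetical layout).
    expected_n: crypts expected per segment; pass None to skip the check.
    """
    by_segment = defaultdict(list)
    for segment, length, goblets in crypts:
        by_segment[segment].append((length, goblets))

    result = {}
    for segment, rows in by_segment.items():
        if expected_n is not None and len(rows) != expected_n:
            raise ValueError(
                f"{segment}: expected {expected_n} crypts, got {len(rows)}"
            )
        result[segment] = {
            "mean_crypt_length": mean(row[0] for row in rows),
            "mean_goblet_cells": mean(row[1] for row in rows),
        }
    return result


# Invented measurements, two crypts per segment to keep the example short.
demo = segment_means(
    [
        ("proximal", 100.0, 10),
        ("proximal", 120.0, 14),
        ("distal", 200.0, 20),
        ("distal", 220.0, 24),
    ],
    expected_n=2,
)
```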

Statistical analysis
Longitudinal repeated measures data in Figures 1, 3, and 4 were analyzed as follows: each outcome of interest (weight, episodic loose stools, and lipocalin) was examined with a repeated measures regression. In the antibiotic treatment experiment, % of starting body weight was examined instead of raw body weight. In the GF/CV experiment, only lipocalin was examined. Weight and lipocalin were continuous variables. Episodic loose stools was dichotomized into none vs mild/severe. For the SPF experiment, the predictors were genotype, sex, age in weeks, genotype by age in weeks interaction, and genotype by sex interaction. For the antibiotic treatment experiment, the predictors were treatment, sex, timepoint, and sex by treatment interaction. For the GF/CV experiment, the predictors were genotype, colonization status, life stage, and all the two-way and three-way interactions of these three predictors. For weight and lipocalin, a three-level genotype variable was used. For loose stools, wildtype and heterozygous were combined and a two-level genotype variable was used. Terms for the experiment number (SPF experiments only) and the cage number within experiment were included in the models. An auto-regressive correlation matrix was used to account for the repeated measures across mice over time. Analyses were performed using SAS Version 9.4 (SAS Institute, Cary, NC). P-values less than 0.05 were considered statistically significant.
To test the hypothesis that instances of Lcn2 >300 ng/g and episodic loose stools were coincident on the same day more often than expected by chance, we calculated the chance that Hnf4a ΔIEC mice would have Lcn2 >300 ng/g at any given timepoint (161/1019 = 0.158). We also calculated the chance that Hnf4a ΔIEC mice would have an episode of loose stools at any given timepoint (229/1019 = 0.2247). We then multiplied these two probabilities to calculate the expected chance of the two phenomena being coincident if they are independent (0.158 × 0.2247 = 0.0355). We used a chi-squared test to test whether the actual distribution of timepoints with only Lcn2 >300 ng/g, only loose stools, both, or neither was significantly different from the expected distribution.
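The expected-coincidence calculation and the chi-squared comparison can be sketched as below. The counts 161, 229, and 1019 come from the text; the observed number of coincident timepoints is not reported in this section, so `n_both` in the usage line is a placeholder, not a value from the study. The p-value assumes 1 degree of freedom, which treats the comparison as a 2 × 2 test of independence with expected counts derived from the marginal totals; the degrees of freedom used in the original analysis are not stated here.

```python
import math


def coincidence_test(n_total, n_lcn2, n_loose, n_both):
    """Compare observed vs expected co-occurrence under independence."""
    p_lcn2 = n_lcn2 / n_total      # chance of Lcn2 > 300 ng/g at a timepoint
    p_loose = n_loose / n_total    # chance of loose stools at a timepoint
    p_both = p_lcn2 * p_loose      # expected chance of both, if independent

    expected = {
        "both": n_total * p_both,
        "lcn2_only": n_total * p_lcn2 * (1 - p_loose),
        "loose_only": n_total * (1 - p_lcn2) * p_loose,
        "neither": n_total * (1 - p_lcn2) * (1 - p_loose),
    }
    observed = {
        "both": n_both,
        "lcn2_only": n_lcn2 - n_both,
        "loose_only": n_loose - n_both,
        "neither": n_total - n_lcn2 - n_loose + n_both,
    }

    # Pearson chi-squared statistic over the four categories.
    statistic = sum(
        (observed[k] - expected[k]) ** 2 / expected[k] for k in expected
    )
    # Survival function of the chi-squared distribution with 1 degree of
    # freedom (assumption, see lead-in): P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))

    return {
        "p_lcn2": p_lcn2,
        "p_loose": p_loose,
        "p_both": p_both,
        "expected": expected,
        "observed": observed,
        "statistic": statistic,
        "p_value": p_value,
    }


# n_both=60 is a placeholder for illustration only.
result = coincidence_test(n_total=1019, n_lcn2=161, n_loose=229, n_both=60)
```

Under independence, the expected number of coincident timepoints is 1019 × 0.0355, or roughly 36.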
All p-value calculations, excluding longitudinal repeated measures analyses, were performed in Prism 9 using the tests indicated in the figure or figure legend.

Antibiotic treatment to deplete microbiota
Adult SPF Hnf4a ΔIEC mice were randomly assigned to treatment and control groups. Then, at three timepoints prior to treatment, fecal samples were taken for later assessment of pre-treatment Lcn2 levels. Treatment group mice were then treated continuously with 1 g/L ampicillin in the drinking water, and orally gavaged every 12 hours with a solution of 50 mg/kg body weight vancomycin, 100 mg/kg neomycin, 100 mg/kg metronidazole, and 0.5 mg/kg fluconazole for two weeks. Control mice were mock gavaged with reverse osmosis (RO) water.
To determine the efficacy of antibiotic treatment, fecal samples from selected timepoints were diluted to 50 mg/mL and DNA was extracted using the Quick-DNA Fecal/Soil Microbe miniprep kit (Zymo Research #D6010) according to manufacturer's instructions. qPCR was performed on extracted DNA using universal 16S rRNA gene primers 357F (forward) CTCCTACGGGAGGCAGCAG (3) and CD (reverse) CTTGTGCGGGCCCCCGTCAATTC (4).
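As a worked example of the weight-based dosing above, the sketch below converts the stated mg/kg values into milligrams per gavage for a mouse of a given weight. It assumes the stated doses apply to each gavage rather than to each day, which the text does not specify; the function name and the 25 g body weight are illustrative only.

```python
# Doses in mg per kg body weight, as stated in the protocol text.
DOSES_MG_PER_KG = {
    "vancomycin": 50.0,
    "neomycin": 100.0,
    "metronidazole": 100.0,
    "fluconazole": 0.5,
}


def gavage_doses_mg(body_weight_g):
    """Return mg of each drug per gavage for one mouse.

    Assumes each mg/kg value is a per-gavage dose (not confirmed by
    the source).
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    body_weight_kg = body_weight_g / 1000.0
    return {drug: dose * body_weight_kg for drug, dose in DOSES_MG_PER_KG.items()}


# Illustrative 25 g mouse.
doses = gavage_doses_mg(25.0)
```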

Conventionalization of GF mice
Fecal samples collected from SPF Hnf4a fl/+ ;Vil1:Cre+ and Hnf4a fl/fl mice were homogenized in an anaerobic chamber (Coy Laboratory, 5% hydrogen, 5% carbon dioxide, and 90% nitrogen) in PBS + 0.05% cysteine + 10% glycerol to make a 10% w/v fecal slurry. The slurry was centrifuged for 2 minutes at 300 × g to pellet large particles, then aliquoted into sterile Eppendorf tubes and frozen at -80 °C. CV-to-be mice were gavaged with 120 µL of thawed slurry. GF mice were not gavaged.
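The slurry arithmetic can be illustrated with a short sketch. It approximates a 10% w/v slurry as 1 g of feces per 10 mL of buffer and ignores volume lost during centrifugation and aliquoting; both simplifications, the function names, and the 0.5 g example mass are assumptions for illustration.

```python
def buffer_volume_ml(fecal_mass_g, percent_wv=10.0):
    """Approximate buffer volume (mL) for a slurry of the given % w/v.

    10% w/v corresponds to 10 g per 100 mL, i.e. 0.1 g per mL.
    """
    if fecal_mass_g <= 0 or percent_wv <= 0:
        raise ValueError("mass and concentration must be positive")
    return fecal_mass_g * 100.0 / percent_wv


def max_gavage_doses(slurry_volume_ml, dose_ul=120.0):
    """Whole 120 uL gavage doses available from a slurry volume,
    ignoring dead volume and material lost to pelleting."""
    return int((slurry_volume_ml * 1000.0) // dose_ul)


# Illustrative: 0.5 g of pooled feces.
volume = buffer_volume_ml(0.5)
n_doses = max_gavage_doses(volume)
```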

16S rRNA gene amplification and sequencing
DNA was extracted from mouse fecal pellets using the Quick-DNA Fecal/Soil Microbe miniprep kit (Zymo Research #D6010) according to manufacturer's instructions, then used for either 16S rRNA sequencing or qPCR. The Duke Microbiome Shared Resource (MSR) determined sample DNA concentration with a Qubit dsDNA HS assay kit (ThermoFisher, Q32854). Then the V4 variable region of the 16S rRNA gene was amplified by polymerase chain reaction using the forward primer 515 and reverse primer 806, which carry unique barcodes allowing for multiplexed sequencing, following the Earth Microbiome Project protocol (http://www.earthmicrobiome.org/). Equimolar 16S rRNA gene PCR products from all samples were quantified and pooled. Sequencing was then performed by the Duke Sequencing and Genomic Technologies shared resource on an Illumina MiSeq instrument configured for a 250 base-pair paired-end SP sequencing run. Reads were converted to fastq files and demultiplexed by MSR.

Statistical analysis of 16S rRNA gene sequencing data
Raw data from 16S rRNA gene sequencing were processed using the DADA2 v1.8 package in the R statistical programming environment v3.4.0 (5,6). In brief, raw demultiplexed reads were filtered using the fastqPairedFilter function, with maxEE set to 2, rm.phix set to TRUE to remove reads that match against the phiX genome, and other settings set to their defaults. Paired-end reads were merged using the mergePairs command, after which chimeric sequences were removed using the removeBimeraDenovo command with the method "consensus". Following chimera removal, taxonomy was assigned using Silva v123, with exact species matches assigned using the addSpecies function (7). The phyloseq package was used to combine the resulting data into a phyloseq object for subsequent analysis. All statistical analyses and graphical representations of the output phyloseq object were conducted in the R statistical programming environment using the phyloseq, vegan, and ggplot2 packages (8)(9)(10). Code used for 16S rRNA gene sequencing data analysis in this manuscript is available at: https://github.com/jawah003/hnf4a-mice-manuscript-2022. Sequencing reads generated as part of this study are available at the Sequence Read Archive under BioProject ID: PRJNA945427.

Isolation and culture of A. muciniphila strains from mouse fecal samples
All A. muciniphila isolation and culture procedures were performed under anaerobic conditions (Coy Laboratory anaerobic chamber, 5% hydrogen, 5% carbon dioxide, and 90% nitrogen). Approximately 10 mg of frozen fecal pellet (half pellet) was used to inoculate 5 ml of mucin medium and incubated at 37°C for 48 h. After three sequential passages in mucin medium, a sample of the suspension was streaked on 1% agar BBL brain heart infusion (BHI, BD Biosciences; catalog 211065) plates supplemented with 0.2% mucin and incubated for 6-7 days at 37°C. Colonies were purified by restreaking on fresh BHI mucin plates and incubated for 4 days. Pure isolates were grown in mucin medium for 5 days and frozen at -80°C in 25% glycerol for storage. Total DNA was isolated, and the strains were identified by PCR-based amplification of the 16S rRNA gene using 27F (5' AGA GTT TGA TCC TGG CTC AG) and 1492R (5' GGT TAC CTT GTT ACG ACT T) primers.

Whole genome sequencing and assembly for isolated A. muciniphila strains
Strains from glycerol stocks were inoculated into synthetic media (11), grown to saturation, and pelleted. DNA was extracted from bacterial pellets using the MagAttract® high molecular weight (HMW) DNA kit (QIAGEN 67563) according to manufacturer instructions for Gram-negative bacteria with the following modifications: 1) the proteinase K incubation step was extended to 1 hour, and 2) for steps involving incubation on a mixer, a vortex was used at max speed instead of a mixer. The extracted DNA was confirmed to be A. muciniphila by PCR with universal Akkermansia primers Muc-1129 and Muc-1437 (12). If necessary, extracted DNA was concentrated by ethanol precipitation.
Library preparation using the resulting DNA, sequencing, demultiplexing of samples, conversion of the resulting BAM files to fasta files, genome assembly, and subsequent annotation and quality evaluation of genomes were performed as described in (13). Genome assembly was performed with Flye version 2.7.1.

Comparative genomics of A. muciniphila strains
All A. muciniphila genome assemblies listed in NCBI Genome with either Mus musculus listed as the host species or with an associated publication confirming that the strain originated from mice were selected for the pangenomic analysis (as of June 30, 2022; www.ncbi.nlm.nih.gov/genome/browse/#!/prokaryotes/1598/). Two assemblies in NCBI Genome were selected based on the paper describing their isolation from mice (14), although host information was not included in the BioSample metadata.
The genome database was used to compute average nucleotide identity (ANI) across the genomes using the command anvi-compute-genome-similarity with the -method pyani parameter (18). To determine which phylogroup our isolates belonged to, we included publicly available A. muciniphila genomes with previously published phylogroup assignments as a control in our pangenome and ANI analyses. Strains representing phylogroups AmI, AmII, and AmIV isolated from human donors were described in Becken et al. (13). Strains belonging to phylogroup AmIII were described in Guo et al. (19). All representative genome assemblies were retrieved from NCBI (see Table S5 for accession information). Based on the resulting analysis, phylogroup-specific gene functions and metabolic module enrichment were obtained using the commands anvi-compute-functional-enrichment-in-pan and anvi-compute-metabolic-enrichment.
Gene functions unique to AmV, missing from AmV but present in all other phylogroups, or present in AmI (the phylogroup with the majority of mouse strains) but absent in AmV were manually curated into Table S5. Metabolic pathway completeness was determined using the command anvi-estimate-metabolism, and heatmap visualization was performed using the R package pheatmap.

Figure S2: Representative images of average crypt length and inflammatory infiltrates in different groups of gnotobiotic mice.
(A-F) Representative images of crypt length, selected on the basis that their measurements fell near the mean of their respective group. (G-J) Example images showing inflammatory infiltrates selected by veterinary pathologist J.I. Everitt. We chose to show focal sites of inflammatory infiltrates in CV Hnf4a ΔIEC mice, in which they were seen most frequently, but also in CV and GF control mice, to provide examples of what we saw in mice that received non-zero scores in all groups. Mouse colonization status and genotype are labeled, and areas of inflammatory infiltrates are outlined.

Figure S1: Supplemental data related to Figures 1-4. (A) Heatmap showing the repeated measures regression p-value for the genotype by age interaction terms at matched weeks for the indicated genotype comparisons. Age in weeks is indicated along the x-axis. Non-significant p-values are colored white. (B) Representative images of the 3-level scoring system we used for episodic loose stool severity. (C) Quantification of goblet cells in the ileum of SPF Hnf4a ΔIEC and control mice (same mice as described in Figure 1). (D) Average Ct values for 16S qPCR on DNA extracted from mouse fecal samples at indicated timepoints throughout the antibiotic treatment. (E) Chi-squared test comparing actual vs expected Mendelian distribution in GF mouse litters. P-values for (C) were calculated using an ordinary one-way ANOVA followed by Tukey's multiple comparison testing. **** p<0.0001, ** p<0.01, ns = not significant.

Figure S3: Quantification of crypt length and goblet cell number in GF and CV Hnf4a ΔIEC and control mice. (A-C) Average crypt length (µm) and (D-F) goblet cells/µm of crypt. A 2-way ANOVA was performed to test whether genotype, colonization status, or genotype by colonization status interaction had a significant effect on either crypt length or goblet cell number. Genotype by colonization status interaction had no significant effect in any of the comparisons. However, when either genotype or colonization status was a significant factor in a comparison, either Sidak's or Tukey's multiple comparisons tests were performed. Those p-values are indicated in the figure panels using brackets. **** p<0.0001, ** p<0.01, * p<0.05.

Figure S4: Enrichment of A. muciniphila and other ASVs in Hnf4a ΔIEC and control mice at 52 weeks of age. (A-B) Summed relative abundance for each ASV across all samples (A) or only Hnf4a ΔIEC samples (B) is plotted on the x-axis against -Log2(Fold Change) on the y-axis to show where A. muciniphila falls compared to other ASVs along these two parameters. A. muciniphila is highlighted in light blue. (C-D) Spearman correlation of the indicated measures. (E-F) Plots of the Log10(relative abundance) of the top 8 most enriched ASVs and seq_4 (A. muciniphila) (E), as well as the top 9 most depleted ASVs (F), in Hnf4a ΔIEC mice in the endpoint 16S rRNA gene sequencing dataset, as determined by DESeq2 analysis. (G-H) PCoA of weighted UniFrac distance colored by (G) genotype and (H) Log10(Lcn2) levels. 95% confidence ellipses are color coded to genotype. PERMANOVA R² and p values are indicated at the top of each figure panel, showing the percentage of variance explained by genotype and Log10(Lcn2), respectively.

Figure S5: KEGG pathway completeness in our A. muciniphila strains. Predicted KEGG metabolic pathways present in each A. muciniphila genome were selected for inclusion here based on having a threshold of 70% completeness of a KEGG Module in at least one strain. The heatmap color scale refers to estimated pathway completion. Numerical values for the heatmap can be found in Table S5.
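The module selection rule in this legend can be sketched as a simple filter. This is an illustration rather than the authors' code: the function name is hypothetical, the module and strain labels and completeness values are invented placeholders, and treating the threshold as inclusive (at least 0.70) is an assumption.

```python
def select_modules(completeness, threshold=0.70):
    """Keep modules whose completeness reaches the threshold in at
    least one strain.

    completeness: {module_id: {strain_id: fraction between 0 and 1}}
    Returns a new dict containing only the selected modules.
    """
    return {
        module: per_strain
        for module, per_strain in completeness.items()
        if per_strain and max(per_strain.values()) >= threshold
    }


# Placeholder identifiers and values, not data from the study.
demo_completeness = {
    "module_A": {"strain_1": 0.90, "strain_2": 0.40},
    "module_B": {"strain_1": 0.50, "strain_2": 0.60},
    "module_C": {"strain_1": 0.70, "strain_2": 0.10},
}
selected = select_modules(demo_completeness)
```

Selected modules keep their values for every strain, including strains below the threshold, so the heatmap can still show between-strain differences.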