Genomic Insights into Adaptations of Trimethylamine-Utilizing Methanogens to Diverse Habitats, Including the Human Gut

Methanomassiliicoccales are less-known members of the human gut archaeome. Members of this order use methylated amines, including trimethylamine, in methane production.

previously reported phylogenies based on 16S rRNA and mcrA genes (10,20,21). A third distal clade was formed by two closely related MAGs generated in a recent massive metagenome assembly effort (22), which we labeled external (EX) (Fig. 1). We use the terminology of Borrel et al. (23), as follows: the clade including Methanomassiliicoccus is labeled free living (FL), and the clade containing "Candidatus Methanomethylophilus" is labeled host associated (HA).
As observed previously (10), the reported source of the genomes was not always consistent with the clade in which it was grouped. For instance, while publicly available genomes originally retrieved from human, baboon, elephant, and cow gastrointestinal tracts were related to "Candidatus Methanomethylophilus" (HA), this clade also contained MAGs derived from digestors and reactors (Fig. 1) reportedly not treating animal waste (see Table S1A in the supplemental material). Moreover, MAGs retrieved from pit mud of solid-state fermentation reactors used for the production of Chinese liquor were present in both the HA and FL clades (Table S1A). Similarly, "Ca. M. intestinalis" Issoire-Mx1, M. luminyensis B10, and Methanomassiliicoccales archaeon RumEn M1, all retrieved from mammal hosts, grouped in the FL clade.
Abundance of Methanomassiliicoccales clades differs in gastrointestinal and environmental samples. We assessed the abundance of species-level representative Methanomassiliicoccales taxa in publicly available metagenomes that included 145 samples from gastrointestinal tracts of nonhuman animals, such as cats, pigs, elks, cows, mice, white-throated woodrats, trout, chickens, and geese, and 160 environmental samples from sediment, ice, and diverse water and soil sources (Table S1B).
Taxa from all three clades were detected in a wide range of metagenomes from environmental and gut origin. We observed differences in environmental preference by clade. Abundance of taxa from clade EX was highest in environmental metagenomes (0.001% 6 0.0012%) (Fig. 1). These were also detected in gut samples (0.0002% 6 0.0005%), albeit with a very low abundance, and in in fecal (0.0003% 6 0.0005), large intestine (0.0001% 6 0.0002%), and stomach (0.0009% 6 0.0006%) metagenomes (Fig. 2). Given their low abundances, further analysis is focused on the FL and HA clades.
The aggregated abundance of clades FL and HA differed across biomes (Fig. 2). In agreement with their names, HA clade members were more abundant in host-associated samples, and FL in non-host-associated samples (Fig. 2). The prevalence and abundance of Methanomassiliicoccales taxa varied across animal hosts, yet the overall abundance patterns were consistent across hosts and sample types (see Fig. S2 in the supplemental material).
The combined abundance of members of clade FL was higher in samples from environmental biomes (0.01% 6 0.008%), although nonzero abundances were observed in digestive system metagenomes (0.008% 6 0.015%), with some samples containing levels comparable to that of clade HA (Fig. 2).
We further validated the differences in clade abundances across biomes by generating a dendrogram of Methanomassiliicoccales taxa using the DESeq2-based log fold change of individual taxa on gut versus environmental biomes (i.e., the effect size of the test as a measure of enrichment on a given environment). We then compared the structure of this dendrogram with that of the phylogenomic tree and found that they were positively correlated (cophenetic correlation = 0.67; P , 0.01).
Overall, we observed a low abundance of individual Methanomassiliicoccales taxa across all samples, ranging from 0 to 0.15% ( Fig. 2 and Fig. S1). Their prevalence across hosts differed; they were prevalent in animals such as elks, pigs, poultry, and cattle, while in others, such as trout and geese, they were largely absent (see Fig. S3 in the supplemental material). The enrichment analysis of individual taxa from clade FL from diverse biomes showed that while most were significantly enriched in environmental metagenomes (adjusted [adj.] P , 0.1), some taxa showed the opposite enrichment. M. luminyensis and Methanomassiliicoccus sp. UBA386 were not significantly enriched in gut or environmental biomes. "Ca. M. intestinalis" Issoire-Mx1, Methanomassiliicoccales archaeon RumEn M1, and Methanomassiliicoccus sp. UBA6 were significantly enriched in gut biomes (Fig. 1), although they were also present in multiple environmental biomes ( fig. S1).
When assessed on a per-taxon basis, the vast majority of clade HA taxa were significantly enriched in gut samples (adj. P , 0.1), with the exception of "Candidatus Methanoplasma termitum." which was highly abundant in soil samples from grasslands and water samples from intertidal zones (Fig. 1).
Genome characteristics and core genes functions differ between Methanomassiliicoccales clades. Given the tendency of clades FL and HA to be enriched in environmental or animal metagenomes, respectively, we searched for genes and genome features linked to putative adaptations of Methanomassiliicoccales to an animal gut. For this, we compared 72 genomes from Methanomassiliicoccales taxa retrieved from humans, nonhuman animals, and environmental sources.
Gene clusters enriched in clade HA evidence adaptation to the gut environment. Because of the small number of genomes that cluster within clade EX, and because these are largely absent from animal-associated samples, subsequent analyses focus on comparisons between clades FL and HA.
To identify gene clusters potentially involved in the adaptation of members of Clade HA to a host environment, we compared the gene cluster content between clades. The gene cluster frequency spectrum shows many clusters present in few genomes; 7,990 (58.3%) gene clusters were singletons, and 2,002 (14.6%) were doubletons ( Fig. S4A and B). After removing rare gene clusters by filtering those with nearzero variance, we included 2,937 clusters, which we then used to perform in phylogenetic ANOVA. Results reveal 14 gene clusters significantly enriched in HA compared to FL (adj. P , 0.1 in all cases) ( Table 1). Three gene clusters are involved in detoxification and xenobiotic metabolism, namely, those encoding bile acid: sodium symporter, bleomycin resistance protein and HAD superfamily hydrolase. Two clusters are related to shikimate or chorismate metabolism, namely, those encoding chorismate mutase II and prephenate dehydratase. Other annotated clusters include the small unit of exonuclease VII, Holliday junction resolvase Hjc, nitrogen regulatory protein PII, xylose isomerase-like protein, and metal-binding domain containing protein; four had poor or no  (Table 1). Similar results were obtained when we performed this analysis without outlier taxa and when biome enrichment was used as independent variable (not shown), further indicating that genomic adaptations differ by clade, not habitat preference. Likewise, 89 clusters were enriched in clade FL compared to HA; these are presented in Table S1E.
Genomic adaptations to the gut of members of the FL clade. To determine whether outlier taxa belonging to clade FL had similar adaptations to the gut to those of members of clade HA, we explored gene clusters that were present in these outliers and in clade HA but were rare in other members of clade FL. We selected gene clusters present in the core genome of clade HA (i.e., present in at least 40 taxa or 80% of this clade, see Text S1 in the supplemental material) and present in less than half of the FL taxa. A total of 15 gene clusters were obtained, most of them encoded by only one of the outlier taxa. Two gene clusters, ferrous iron transport proteins A and B (InterPro accession numbers IPR030389 and IPR007167), were present in three of the outliers (M. luminyensis, "Ca. M. intestinalis" Issoire-Mx1, and Methanomassiliicoccales archaeon RumEn M1). Other clusters detected in more than one outlier included an uncharacterized membrane protein (InterPro number IPR005182, in Methanomassiliicoccus sp. UBA6 and Methanomassiliicoccales archaeon RumEn M1), a putative nickel-responsive regulator (InterPro number IPR014864, in M. luminyensis B10 and Methanomassiliicoccus sp. UBA386), and an ABC transporter (InterPro number IPR037294, in M. luminyensis B10 and Methanomassiliicoccus sp. UBA386). The remaining gene clusters, detected once, corresponded to transcriptional regulators or proteins of unknown function.
The repertoire of adhesion proteins tended to differ between clades HA and FL. We compared between FL and HA clades two large groups of membrane proteins involved in adhesion, namely eukaryote-like proteins (ELPs), a series of protein families involved in microbial adherence to the host (24), and adhesin-like proteins (ALPs), a class of proteins hypothesized to be involved in the microbe-microbe interactions of Methanobacteriales in the gut (13). We aggregated the counts of gene clusters annotated as the ALP and ELP classes and performed phylogenetic ANOVA. This analysis showed a trend toward differing repertoires of adhesion proteins by clade (Fig. 3), although we did not observe significant differences in the frequency of these factors (adj. P . 0.1 in all cases). Taxa from clade HA tended to have higher mean counts of tetratricopeptide repeats (TPR) (mean 6 SD counts: HA, 16.   Interestingly, outlier taxa from clade FL had gene counts of several of the adhesion factors higher than the mean of their own clade and more characteristic of clade HA. In some cases, the gene counts were higher than the mean for clade HA. These included Methanomassiliicoccales taxa cooccur with each other, with other Archaea, and with TMA-producing bacteria in the human gut. We characterized the distribution of Methanomassiliicoccales spp. across a collection of human gut metagenomes derived from 34 studies. Together, the combined 4,472 samples represented people from 22 countries, resulting in 35 unique data sets (i.e., study-country combinations). Across the whole set, we detected just two genera, Methanomassiliicoccus (clade FL) and "Ca. Methanomethylophilus" (clade HA), both rare members of the human gut microbiota (Fig. 5). "Ca. Methanomethylophilus" was detectable in 19 out of 35 data sets; in these 19 data sets, it had a prevalence ranging from 0.5% to 41.7%, and mean abundance ranged from 4.8 Â 10 26 % to 2.2 Â 10 22 %. Similarly, Methanomassiliicoccus was detectable in 22 of the 35 data sets; in the 22 data sets, it had a prevalence range of 1% to 25.7% and a mean abundance range of 1.5 Â 10 25 % to 1.0 Â 10 22 % (Table S1D). We tested associations of these two genera with age, sex, and Westernization status of the subjects using linear mixed models that included the data set and country as random effects. Subjects from non-Westernized countries had a significantly higher prevalence of "Ca. Methanomethylophilus" (mean prevalence 6 SD: non-Westernized = 8.9% 6 28.5%, Westernized = 1.1% 6 10.3%; adj. P = 0.002). Westernized individuals were more likely to harbor higher prevalences of Methanomassiliicoccus, although differences were not significant (non-Westernized = 3.9% 6 19.4%, Westernized, 5.0% 6 21.7%; adj. P . 0.1). The age and sex of the individuals did not explain variance in the prevalence or abundance of either genus (adj. P . 0.1 in all cases).
To identify other microbial taxa positively associated with members of Methanomassiliicoccales in the human gut, we calculated a network of positively associated microorganisms (i.e., coabundant taxa) across samples (rho . 0.1 in all cases) (25). In addition, we determined which taxa were present with members of Methanomassiliicoccales at a greater prevalence than that expected by chance (i.e., cooccurring taxa) relative to a permuted null model (26). Results showed that both "Ca. Methanomethylophilus" and Methanomassiliicoccus were part of the same coabundance network, together with a third archaeal genus, Methanoculleus (order Methanomicrobiales). We did not find evidence of positive or negative abundance associations of either Methanomassiliicoccales genus with Methanobrevibacter. Coocurrence analysis showed a random association pattern between these taxa (P . 0.05 for both "Ca. Methanomethylophilus" and Methanomassiliicoccus), indicating that their ecological niches do not overlap that of Methanobrevibacter.
Analysis of the combined network of "Ca. Methanomethylophilus" and Methanomassiliicoccus revealed a large overlap between taxa associated with either genus (Fig. 6 and Table S1F): out of 119 taxa in the network, 86 (72.3%) were associated with both. Moreover, 51 taxa (42.9%) also had a significant positive cooccurrence pattern with both genera (adj. P , 0.05 in all cases). Most bacterial members of this network had low relative abundances; only Bacteroides and Parabacteroides had a mean relative abundance above 1% (range, 22.7% to 0.0005%). Interestingly, they included taxa that can potentially produce TMA, since their genomes contain genes encoding enzymes involved in its Abundance of Methanomassiliicoccales species is not concordant in monozygotic or dizygotic human twins. To evaluate whether host genetics influences the abundance of Methanomassiliicoccales in the human gut, we compared the intraclass correlation coefficient (ICC) of their abundances at the genus level using a set of 153 monozygotic (MZ) and 200 dizygotic (DZ) twin pairs from the TwinsUK cohort. As a control, we first compared the mean ICC across all taxa between MZ and DZ twins and found that ICC MZ (0.1) was significantly higher than ICC DZ (0.03) (P , 0.01). In addition, we assessed the ICC values of bacterial (Christensenella, Faecalibacterium, and Bifidobacterium) and archaeal (Methanobrevibacter) genera, and consistently found a higher correlation for MZ compared to DZ twins (Table S1G). We were only able to assess ICC values of Methanomassiliicoccus, as it was the only Methanomassiliicoccales taxon detected in the twins with a prevalence (8.64%) above the 5% cutoff (see Materials and Methods). We did not detect a significant concordance between the abundances of Methanomassiliicoccus in MZ (ICC MZ = 0.004; adj. P = 0.59) or in DZ twins (ICC DZ = 0.017; adj. P = 0.71). Given the low abundance of Methanomassiliicoccales taxa, we performed a sensitivity analysis using samples with a high sequencing depth (.12 million reads/sample); however, we did not observe differences in the abundance and prevalence of the Methanomassiliicoccales genera or in the ICC estimates (data not shown).

DISCUSSION
While the source of the members of the Methanomassiliicoccales has been noted in previous surveys of single markers such as 16S rRNA and mcrA genes (10,11), here, we searched metagenomes from host-associated and environmental samples for their relative abundances. Overall, the HA taxa were enriched in host-associated samples and the FL taxa were enriched in environmental samples; intriguingly, all taxa, regardless of clade, were detected in both biomes. This suggests that members of the order Methanomassiliicoccales are generalists with an overall habitat preference according to clade, although there were some exceptions to the general pattern. We show that members of Methanomassiliicoccales use many of the same adaptations to the gut as other methanogens. These adaptations include genome reduction and genes involved in the shikimate pathway and bile resistance. In addition, gut-enriched taxa tend to have a distinct repertoire of genes encoding adhesion factors. We observed that potential adaptations to the gut differed by clade, not preferred habitat, indicating convergence on a shared niche through different genomic solutions. In the human gut, Methanomassiliicoccales taxa correlated with TMA-producing bacteria, rather than host genetics or other host factors.
For members of the HA clade, adaptations to life in the gut included an enrichment of genes involved in bile acid transport, efflux pumps, and hydrolases, which play a role in tolerance to these compounds in the gastrointestinal tract (28). This adaptation is also shared with other members of the gut microbiota, including Methanobacteriales taxa; Methanobrevibacter smithii and Methanosphaera stadtmanae are resistant to bile salts (3,4). Other gene clusters with known functions enriched in clade HA are involved in metabolism of shikimate and chorismate. The shikimate pathway is involved in the synthesis of aromatic amino acids in plants and microbes, but it is absent in mammals. Shikimate metabolism is carried out by archaeal (29) and bacterial (30, 31) members of the animal gut microbiota and was reported as one of the most conserved metabolic modules in a large-scale gene catalogue from the human gut (32). In turn, aromatic amino acids can be transformed by the gut microbiota into active metabolites, which are involved in diverse physiological processes (33) and conditions such as cardiovascular disease (34). Indeed, plasma concentrations of microbial derivatives of tryptophan have even been shown to negatively correlate with atherosclerosis (35). It remains to be elucidated whether Methanomassiliicoccales are involved in human health through the metabolism of aromatic amino acids and associated compounds.
We observed that each clade tended to encode different adhesion factors, although without statistical significance. These factors are involved in the maintenance of syntrophic relationships of the methanogens with bacterial (12,36) or eukaryotic (37) microorganisms. Two groups of adhesion factors, proteins containing Sel1 domains and Listeria-Bacteroides repeats, have been previously studied in Methanomassiliicoccales taxa retrieved from the gut (9,23). Our assessment of these factors in the broader context of the order Methanomassiliicoccales showed that these two groups are more likely to be higher in clade HA than clade FL taxa, with the exception of the outlier taxa. Indeed, the repertoire of ELPs and ALPs was similar between species inhabiting the gut, regardless of their clade. This emphasizes the potential involvement of these proteins in the adaptation to the intestinal environment, although the exact mechanisms are yet to be elucidated.
In contrast, members of clade FL appear to be generalists that colonized the animal gut independently from the HA clade. It has been previously noted that M. luminyensis, an outlier from clade FL, could have a facultative association to the animal gut. It possesses genes involved in nitrogen fixation, oxidative stress (9), and mercury methylation (23), which are common in soil microorganisms but rare in members of the gut microbiota (38). In accordance with this, we observed that members of clade FL are widespread and abundant in soil, water, and gut metagenomes, with a preference for environmental biomes. Similarities in ELP content between gut-dwelling taxa from both clades indicate that interaction with the host or other members of the gut microbiota might be a key factor in the adaptation of these methanogens.
Analysis of the gene content of outlier taxa from clade FL showed that they tended to be more similar to members of their own clade than to taxa from clade HA, with the exception of "Ca. M. intestinalis" Issoire-Mx1, which was distinct from either clade FL and HA. In addition, there was little overlap in gene clusters commonly observed in clade HA and outlier taxa from clade FL, with the exception of the adhesion factors discussed above. These observations support the hypothesis that colonization of animal guts by members of Methanomassiliicoccales occurred in two independent events (9,23), and suggests that there is not one solution to life in the gut for these archaea, as members from two clades seem to have solved the problem with a different set of adaptations.
Characterization of the abundance of Methanomassiliicoccales taxa across human populations showed members of this group are rare in the microbiota of healthy adults. We did not detect them in all the studied populations, and, when detected, they had low prevalence and abundance. Our extensive analysis of human gut samples corroborates estimates of Methanomassiliicoccales prevalence (up to 11%) (23,39,40) and mean abundance (below 1%) (23,41). Differences in Methanomassiliicoccales carriage between Westernized and non-Westernized populations remain to be explained, and may be due to diet. While Westernized diets are richer in TMA precursors than non-Western diets (42), intake varies across populations (43).
Our analysis allowed us to assess whether Archaea in the human gut are mutually exclusive. We observed positive correlations of "Ca. Methanomethylophilus" and Methanomassiliicoccus with each other and with Methanoculleus, another rare archaeal member of the gut microbiota (45). We did not find evidence of association between members of Methanomassiliicoccales and Methanobrevibacter, positive or otherwise, confirming the previous report that these methanogens are not mutually exclusive (39); abundance of H 2 in the gut, together with differences in other substrate utilization, might result in nonoverlapping niches (46).
While genus Methanobrevibacter was consistently found to have a moderate heritability in the TwinsUK (19,39,47) and other cohorts (48,49), this was not the case for members of Methanomassiliicoccales. Similarly to humans, methane production (50) and abundance of Methanobrevibacter (51) are also heritable in bovine cattle, but Methanomassiliicoccales taxa are not (51). Thus, host genetics might be linked to particular taxa and methanogenesis pathways, not to all Archaea or to methane production as a whole.
Genera "Ca. Methanomethylophilus" and Methanomassiliicoccus cooccur with TMAproducing bacteria (27), further supporting their potential use as a way of targeting intestinal TMA (52). The exact nature of the ecological relationships each of these taxa establishes with other members of the microbiome remains to be elucidated. In a facilitation scenario between the methanogens and H 2 and TMA producers, freely available TMA and H 2 required for methylotrophic methanogenesis could be utilized by Methanomassiliicoccales taxa (53) without cost to the producer. Alternatively, the methanogens could establish syntrophic interactions with other microorganisms, whereby the consumption of these metabolites is also beneficial to the producer (53).
The present study extends our understanding of the order Methanomassiliicoccales by revealing genomic adaptations to life in the gut by members of both clades that make up this group. Furthermore, the positive correlation between the relative abundances of these TMA-utilizing Archaea with TMA-producing bacteria in the gut is a first step toward understanding how they may be harnessed for therapeutic management of gut TMA levels in the context of cardiovascular disease.

MATERIALS AND METHODS
For a detailed description of the methods, see Text S1 in the supplemental material. Genome annotation and phylogenomic tree reconstruction. We used 71 substantially complete genomes (completeness, $70%) with low contamination (contamination, ,5%) retrieved from the NCBI Assembly database (https://www.ncbi.nlm.nih.gov/assembly), plus an additional high-quality metagenome-assembled genome (MAG) corresponding to "Candidatus Methanomethylophilus alvus." Gene calling was performed using Prokka (54). A maximum-likelihood phylogenomic tree was constructed using PhyloPhlAn (55) with the 72 Methanomassiliicoccales genomes plus members of the order Thermoplasmatales as an outgroup. We used interactive Tree Of Life (iTOL) (56) to visualize the tree.
Abundance of Methanomassiliicoccales in environmental and animal gastrointestinal metagenomes. We retrieved 305 publicly available gastrointestinal and environmental metagenome samples (57) (see Table S1B in the supplemental material). To avoid multiple mapping of reads, we dereplicated the 72 genomes at a species level (95% average nucleotide identity [ANI]) using dRep (58), resulting in 29 representative genomes. We quantified the abundance of dereplicated Methanomassiliicoccales genomes in the metagenomes using KrakenUniq (59). We estimated the enrichment of each representative Methanomassiliicoccales taxon in host or environmental metagenomes using DESeq2 with the Wald test (60) on sequence counts and classifying metagenome samples as either host derived or environmental.
Comparative genomics. We grouped the predicted genes into gene clusters using panX (61) and used InterProScan (62) and eggNOG-mapper (63) for annotation. Phylogenetic signal of genome characteristics and gene cluster presence was tested using the phylosignal R package with the local indicator of phylogenetic association (LIPA) (64). The R package micropan (65) was used to create a pangenome principal-component analysis (PCA). We performed phylogenetic ANOVA using the R package phytools (66) to determine clusters enriched in clades FL or HA. We adjusted P values for multiple comparisons with the Benjamini-Hochberg method. Due to the exploratory nature of this work, tests were considered significant if they had an adjusted P value (adj. P) of ,0.1; a false-discovery-rate-adjusted P value cutoff of 0.1 implies that 10% of significant tests will result in false-positives. In cases where adjusting P values was not necessary, raw P values are provided.
Characterization of Methanomassiliicoccales distribution across human populations. We retrieved and quality controlled 4,472 publicly available human gut metagenomes from 34 independent studies (Table S1C). Reads were classified using Kraken (67) and Bracken (68) with custom databases (69). Taxa with ,100 reads in a given sample were considered absent. To determine the cooccurrence patterns of Methanomassiliicoccales in the human gut, we used the cooccur package (26); to determine their coabundance patterns, we calculated the proportionality of taxa abundance (rho) with the propr R package (70). The lme4 and lmerTest R packages (71) were used to fit linear mixed effects models to test differences of Methanomassiliicoccales genera abundance by Westernization status, age, and gender. We employed binomial linear mixed models to test differences in genera prevalence. Lists of potential TMA (27,72,73) and methanol (74) (see Text S1 in the supplemental materials) producers were compiled from the available literature.
Heritability of Methanomassiliicoccales taxa was assessed by comparing relative abundances of taxa within 153 monozygotic (MZ) and 200 dizygotic (DZ) twin pairs from the United Kingdom Adult Twin Registry (TwinsUK) (19,39,75). Absolute read counts were transformed using the Yeo-Johnson transformation and adjusted by body mass index (BMI), sex, and sequencing depth (19,39). We calculated the intraclass correlation coefficient (ICC) in MZ and DZ twins with the irr R package, and adjusted P values using the Benjamini-Hochberg method. We compared the mean ICC across all taxa between MZ and DZ twins using the Mann-Whitney test and by assessing the ICC of taxa previously reported as heritable (Methanobrevibacter, Faecalibacterium, Christensenella, and Bifidobacterium) (39,48).
Data availability. The raw sequence data are available from the European Nucleotide Archive under study accession number PRJEB40256. Jupyter notebooks are available at https://github.com/leylabmpi/ Methanomassilii. The "Candidatus Methanomethylophilus" MAG generated here can be found at http:// ftp.tue.mpg.de/ebio/projects/Mmassilii/.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. TEXT S1, DOCX file, 0.02 MB. We are also grateful to Daphne Welter, Jessica Sutter, and Albane Ruaud for the fruitful discussions and comments.
We declare no competing interests.