Fermented food metagenomics reveals substrate-associated differences in taxonomy, health-associated- and antibiotic resistance-determinants

Fermented foods have been the focus of ever greater interest as a consequence of purported health benefits. Indeed, it has been suggested that the consumption of these foods helps to address the negative consequences of the ‘industrialization’ of the human gut microbiota in Western society. However, as the mechanisms via which the microbes in fermented foods improve health are not understood, it is necessary to develop an understanding of the composition and functionality of the fermented food microbiota to better harness desirable traits. Here we considerably expand the understanding of fermented food microbiomes by employing shotgun metagenomic sequencing to provide a comprehensive insight into the microbial composition, diversity and functional potential (including antimicrobial resistance, carbohydrate-degrading and health-associated gene content) of a diverse range of 58 fermented foods from artisanal producers from around the globe. Food type, i.e., dairy-, sugar- or brine-type fermented foods, was found to be the primary driver of microbial composition, with dairy foods found to have the lowest microbial diversity. From the combined dataset, 127 high quality metagenome-assembled genomes (MAGs), including 10 MAGs representing putatively novel species of Acetobacter, Acidisphaera, Gluconobacter, Lactobacillus, Leuconostoc and Rouxiella, were generated. Potential health promoting attributes were more common in fermented foods than non-fermented equivalents, with water kefirs, sauerkrauts and kvasses containing the greatest numbers of potentially health-associated gene clusters (PHAGCs). Ultimately, this study provides the most comprehensive insight into the microbiomes of fermented foods to date, and yields novel information regarding their relative health-promoting potential.

Importance

Fermented foods are regaining popularity in Western society due in part to an appreciation of the potential for fermented food microbiota to positively impact on health. Many previous studies have examined fermented food microbiota using classical culture-based microbiological methods or older molecular techniques or, where deeper analyses have been performed, have involved a relatively small number of samples of one specific food type. Here, we have used a state-of-the-art shotgun metagenomic approach to investigate 58 different fermented foods of different type and origin. Through this analysis, we were able to identify the differences in the microbiota across these foods, the factors that drove their microbial composition, and the relative potential functional benefits of these microbes. The information presented here will provide significant opportunities for the further optimisation of fermented food production and the harnessing of their health promoting potential.

Introduction

Fermentation is a form of food preservation with origins that can be traced back to the Neolithic age [1]. Despite recent advances in food preservation and processing, fermentation continues to be widely used as a means of preservation and is the focus of renewed interest due to increased appreciation of the organoleptic, nutritive and, especially, health promoting properties attributed to many fermented foods [2,3].

Indeed, various fermented foods have been shown to have enhanced attributes relative to the corresponding raw ingredients by virtue of the microbial metabolites produced [4-8], the removal of allergens [9], other desirable biological activities [10,11] and/or the presence of microbes that have the potential to confer benefits following consumption [12,13]. Furthermore, although antibiotic use, sanitation and food processing have greatly reduced the number of deaths due to infectious diseases, these activities have also minimised our exposure to microbes and are thought to have contributed to the 'industrialisation' of the human microbiome and associated increases in chronic diseases [14,15]. It has been suggested that fermented foods offer a means of safe microbial exposure to compensate for the absence/removal of desirable host microbes [15,16].
Due to these potential benefits, and an increasing appreciation that the study of these foods provides valuable fundamental insights into simple microbial communities [17,18], developing an even greater understanding of the microbiology of these foods has the potential to be of considerable value.

Advances in high throughput sequencing technology have revolutionised the study of microbial populations, including those present in foods. Although, to date, the vast majority of studies relating to fermented foods have employed amplicon sequencing to study bacterial and fungal composition [19-36], there have been some exceptional studies in which shotgun sequencing has been employed to gain a greater insight into the taxonomy and functional potential of specific fermented foods [37-49]. Despite this, studies across a broad variety of such foods using this approach have been lacking to date. Here we address this issue by employing shotgun metagenomic sequencing to investigate the microbiota of a broad range of fermented foods, including many that were previously unexplored.

[...] Table 1), revealed that the microbiomes of these foods most significantly clustered on the basis of food substrate (i.e., dairy, such as kefir and cheese; brined, such as sauerkraut and kimchi; sugar, such as kombucha and water kefir; Table 2, Figure 1). Ten characteristics of the food microbiome were defined and differences across these characteristics were statistically examined (Table 2): 4 taxonomic levels (species, genus, family and phylum), 4 functional profiles (SuperFocus levels 1, 2 and 3, and carbohydrate functions, which were a subset of the HUMAnN2 output), the bacteriocin profile and the antimicrobial resistance profile.
Taxonomy was the most distinguishing feature of the food substrates, as measured by the R statistic, supported by NMDS plots, PLS-DA and an L. lactis phylogenetic tree (Figure 1, Figure 2, Table 2). Substrate-related differences were greatest at the family level, but were also significant at the species, genus and phylum levels (Table 2). Further analysis was implemented at strain level. Examination of Lactococcus lactis, the species present across the greatest number of food samples, revealed that strains phylogenetically cluster according to food substrate (Figure 1). There was no clustering of L. lactis strains according to any other factor. Functional analysis revealed that substrate had the most considerable impact on the functional profile of the foods (Table 2, Figure 1). Carbohydrate pathways differed most considerably across the food groups (Table 2). Indeed, of the features examined, the bacteriocin profile was the only characteristic that was not statistically different across the food substrates.

Three foods tested did not correspond to the three main food substrates or the corresponding microbiome clusters. Two of these were derived from soy-based fermentations, which are known for their alkaline fermentation environment [50], and the third was a coconut kefir, i.e., a dairy kefir grain-based fermentation but of a coconut carbohydrate. Other fermented food types, e.g., fermented meats and fish, were not considered for this study.

Starter presence/absence, solid/liquid state and producer contribute to differences in microbiota

Although less obvious from a clustering perspective, other factors such as starter presence/absence, solid/liquid state and producer were also significant drivers of microbiome differences (Supplementary Figure 1, Table 2).
The presence or absence of a starter culture was associated with differences in the family, species, carbohydrate, genus, SF3 and AMR profiles of foods (in order of descending effect size), but to a lesser extent than substrate. Solid/liquid state was significant at three taxonomic levels and for all 4 functional profiles (3 SuperFocus levels and HUMAnN2 carbohydrate pathways), but again with a smaller effect size than substrate and starter status (Table 2). However, it was the only factor that was associated with significant differences across bacteriocin profiles. The specific producer of the foods was reflected by the carbohydrate-related functions and species composition, but country of origin did not influence any of the characteristics investigated (Table 2).
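
The R statistic used to rank these factors is not defined in this section. As a minimal sketch, assuming it is an ANOSIM-style R computed on Bray-Curtis dissimilarities (an assumption, as are the function names and the toy abundance profiles), the calculation could look like this:

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def anosim_r(profiles, groups):
    """R = (mean between-group rank - mean within-group rank) / (M / 2),
    where M is the number of sample pairs."""
    pairs = list(combinations(range(len(profiles)), 2))
    dists = [bray_curtis(profiles[i], profiles[j]) for i, j in pairs]
    ranks = average_ranks(dists)
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    m = len(pairs)
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2)


# Hypothetical abundance profiles for four samples (illustration only).
toy_profiles = [[90, 10, 0], [80, 20, 0], [0, 10, 90], [0, 20, 80]]
by_substrate = ["dairy", "dairy", "sugar", "sugar"]
shuffled = ["a", "b", "a", "b"]

print(anosim_r(toy_profiles, by_substrate))
print(anosim_r(toy_profiles, shuffled))
```

An R close to 1 indicates that samples within a group are more similar to each other than to samples from other groups, whereas values near or below 0 indicate no grouping. In practice significance is usually assessed by permuting the group labels, which is omitted here for brevity.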

Microbial diversity differs between dairy foods and other food types
Overall, 476 unique species, present at above 0.1% relative abundance, were assigned to the 58 foods, of which 301 different species were detected in brine foods, 242 in sugar foods and 70 in dairy. This corresponded to an average of 11.5, 13.5 and 6.4 different species per sample for brine, sugar and dairy foods, respectively. In line with these results, alpha diversity analyses demonstrated that the microbiomes of dairy-based fermented foods had significantly lower alpha diversity than those of either brine or sugar foods (Figure 3), which did not significantly differ from one another. It was also evident that, as expected, the alpha diversity of spontaneously fermented foods was significantly higher than that of foods produced using starter cultures (Figure 4). Across the specific foods, a spontaneously fermented orange preserve contained the highest number of species (67) [...]

The brine-type foods tested comprised 26 plant substrate-derived foods fermented in a saline solution. Unlike both dairy- and sugar-type fermented foods, the majority of the brine-based foods undergo a spontaneous fermentation and, therefore, rely on fermentation by autochthonous microbes [51]. Among brine-type foods, Lactobacillus was the most abundant genus, comprising 46.8% of all reads assigned at the genus level. Lactobacillus plantarum was the most abundant species (9.6% relative abundance on average) followed by L. brevis (7.9%), L. mucosae (4.7%), L. xiangfangensis (4.1%) and L. sakei (3%). Leuconostoc mesenteroides (4.7%) and Pediococcus parvulus (4.3%) were also present in significant quantities. Across the brine-type foods Bifidobacteriaceae were detected at a relative abundance of 1.6%. At the species level, 0.8% of species were assigned to Bifidobacterium longum and 0.01% to B. breve. No other bifidobacteria were assigned at the species level.
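
The species counts and alpha diversity comparisons reported in this section rest on two simple calculations: counting species above the 0.1% relative abundance threshold and summarising the diversity of each profile. The sketch below illustrates both, assuming Shannon's index as the alpha diversity measure (the section does not name the index used) and using hypothetical relative abundance profiles.

```python
import math

THRESHOLD = 0.1  # percent relative abundance, as stated in the text


def species_above_threshold(profile, threshold=THRESHOLD):
    """Return the species whose relative abundance (%) exceeds the threshold."""
    return {species for species, pct in profile.items() if pct > threshold}


def shannon_index(profile):
    """Shannon diversity H' = -sum(p_i * ln p_i) over non-zero proportions."""
    total = sum(profile.values())
    props = [value / total for value in profile.values() if value > 0]
    return -sum(p * math.log(p) for p in props)


# Hypothetical profiles (percent relative abundance), for illustration only.
dairy_like = {
    "Lactococcus lactis": 90.0,
    "Streptococcus thermophilus": 9.95,
    "Escherichia coli": 0.05,  # falls below the 0.1% threshold
}
brine_like = {
    "Lactobacillus plantarum": 25.0,
    "Lactobacillus brevis": 25.0,
    "Leuconostoc mesenteroides": 25.0,
    "Pediococcus parvulus": 25.0,
}

print(len(species_above_threshold(dairy_like)))
print(shannon_index(dairy_like) < shannon_index(brine_like))
```

A profile dominated by a single species, as in the dairy-like example, yields a low index, while an even profile of four species reaches the maximum for that richness, ln(4).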
Several brine fermented foods were described for the first time, alongside foods that have been described before. A detailed description of these foods can be found in the supplementary material. From a functional potential perspective, 18.4% of SuperFocus level 1 (SF1) functions within the brine food microbiome were predicted to relate to carbohydrate metabolism. When functional pathways were investigated at a deeper level, xylose utilisation (0.6%, SF3), fermentation (1.4%, SF2) and response to osmotic stress (1%, SF2) were among the most common functionalities (Supplementary [...]

The microbiota composition of dairy foods is more homogeneous than that of other fermented foods

Eleven dairy-type fermented foods were studied. Information supplied by the producers established that all of these foods were produced through the use of starter cultures to initiate fermentation, thus likely contributing to their reduced diversity relative to other foods [21]. Lactococcus lactis dominated, corresponding to, on average, 44.8% of relative abundance, and was present at a relative abundance at or above 90% in 3 of the dairy foods, all of which were kefir or kefir-type foods. The next most abundant species was Streptococcus thermophilus (16%), followed by S. infantarius (5.7%), Kluyveromyces marxianus (3.7%), Escherichia coli (3.5%), Lactococcus raffinolactis (3%) and Leuconostoc mesenteroides (2.9%). It was notable that viruses (including (pro)phage) also made up a significant portion of the dairy food microbiota (7.8%). [...]

Sugar foods contained many species previously associated with alcohol-generating fermentations, such as Saccharomyces eubayanus (2.7%), Brettanomyces bruxellensis (5.2%), Hanseniaspora valbyensis (9.3%) and Oenococcus oeni (5%).
Many of the other species were well-known kombucha-associated species such as Gluconobacter oxydans (5%), Acetobacter cerevisiae (2.5%) and Komagataeibacter rhaeticus (2%). At the species level, Hanseniaspora valbyensis was the most abundant (9.3% average abundance). However, this reflects very high abundance in specific instances, e.g., relative abundance in mead was 93.7%, whereas this species was not detected in 10 of the other 18 sugar-type fermented foods. Lactobacillus was the most abundant genus (25.8%) but its abundance was lower than that found for dairy and brine foods. Within this genus, Lactobacillus mali (7.6%) and L. plantarum (5.3%) were the most common species. Acetobacter was the next most abundant genus (10.9%) and its distribution, along with that of other members of the Acetobacteraceae, made this the most abundant family (33.3%). As for brine and dairy fermented foods, the specific microbiomes of sugar foods are described in the supplementary material.

The most abundant SF1 function found in sugar foods was carbohydrate metabolism (14.5%). Resistance to antibiotics and toxic compounds (3.8%) and osmotic stress (1%) were the most common SF2 functions, while analysis of SF3 pathways highlighted the frequency of several pathways involved in the synthesis of amino acids such as methionine (0.79%), as well as purine [...]

With respect to specific AMR classes, multi-drug resistance was the most commonly assigned gene category across all three food substrates, corresponding to 2422, 293 and 133 CPMs per sample on average for dairy, brine and sugar-type foods, respectively. Beta-lactam resistance genes were the next most common class in dairy (718 CPM) and sugar (101 CPM) foods, while tetracycline resistance genes were the second most numerous category of AMR genes in brine foods (45 CPM).
It was also noted that a five-fold higher abundance of AMR genes occurred in starter culture fermentations relative to spontaneous fermentations. Multi-drug resistance genes again dominated, corresponding to 1326 CPM for starter cultures and 236 CPM for spontaneous fermentations. Beta-lactam resistance genes were the next highest in foods containing starter cultures (428 CPM), whereas tetracycline resistance genes were next highest in spontaneously fermented foods (48 CPM). The high CPMs for both dairy and starter-containing foods are consistent with the fact that dairy foods were those for which starters were most extensively used. When gene distribution was investigated from the perspective of specific food substrates, the wagashi cheese rind was found to have the highest CPM, i.e., 17381, with tempeh being next highest with 5657 CPM. AMR gene counts in kombucha and water kefirs were generally low, and no known AMR genes were identified in 9 of the 58 foods, i.e., 1 kombucha, 2 water kefirs, 3 kimchis, 1 pickled carrot, 1 pickled vegetable and 1 apple cider vinegar. Of the 9 fermented foods for which no AMR genes were assigned, 4 were sugar-type (including 2 water kefirs) and 5 were brine-type (including 3 kimchis). It was notable that very few AMR genes were assigned in the 2 other kimchis studied (<42 CPM) while across the 5 other water kefir samples, 3 contained very few AMR genes (<6 CPM) but 2 had relatively high counts (>1000 CPM). Across the two samples of kombucha, 1 did not contain assigned AMR genes while the other contained 1.6 CPM.
To provide context, the frequency with which AMR genes are detected in fermented foods was compared with that across human stool samples (Figure 5D). Human gut samples (29 random stool samples from the Human Gut Microbiome Project [52]) had significantly more AMR CPMs than fermented foods (p < 0.01) with the exception of 8 fermented foods. These 8 foods were the 2 wagashi cheese samples, tempeh, fermented ginger, 3 milk kefirs and labne. Of these 8 foods, 6 were dairy, and 7 were starter-generated foods. A further 12 foods had similar CPM of AMR genes, while 38 foods had lower AMR CPMs, when compared with the human samples.
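
The AMR comparisons above are expressed in CPM. As a minimal sketch, assuming CPM here means gene hits scaled to one million reads per sample (the normalisation is not defined in this section), the values and a starter-to-spontaneous fold change could be computed as follows. The function names and the sample read counts are hypothetical; only the two multi-drug CPM figures in the final line come from the text.

```python
def counts_per_million(gene_hits, total_reads):
    """Scale raw hit counts for each gene class to counts per million reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {gene: hits * 1_000_000 / total_reads for gene, hits in gene_hits.items()}


def fold_change(cpm_a, cpm_b):
    """Ratio of two CPM values, e.g. starter vs spontaneous fermentations."""
    return cpm_a / cpm_b


# Hypothetical sample: 5 million reads with hits to two AMR classes.
sample_cpm = counts_per_million({"multi-drug": 1200, "tetracycline": 25}, 5_000_000)
print(sample_cpm)

# Multi-drug resistance CPMs reported in the text for starter and spontaneous foods.
print(round(fold_change(1326, 236), 1))
```

Normalising to a fixed number of reads is what allows foods sequenced to different depths, and the human stool samples, to be compared on one scale.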

The presence of putative health promoting genes differs markedly across fermented foods but exceeds that of non-fermented foods

Bacteriocins are ribosomally synthesised antimicrobial peptides, many producers of which have been sourced from fermented foods. The bacteriocin-producing potential across the 58 fermented food samples was investigated, with 55 putative bacteriocin-encoding gene clusters being assigned across 54 of the foods (no gene clusters were identified in 4 samples; Supplementary Table 3). Zoocin A- and enterolysin A-like gene clusters were highly abundant across all 3 fermented food substrates. Clusters corresponding to another bacteriolysin subclass, the helveticin J-like proteins, were more frequently detected in dairy and sugar-type foods than in brine-type foods (Fig 3B). Carocin A- and colicin A-like clusters had a high abundance in brine and sugar, but not dairy, foods. As noted above, there was a significant difference in the distribution of bacteriocins between solid and liquid food types (Table 2), with liquid foods having a higher relative abundance of helveticin J-like, propionicin F-like and pediocin clusters and solid foods having more carnocin CP52-like and microcin 24-like clusters. Closer examination of the pediocin sequences revealed homology with pedA and pedB.

Given that bacteriocin production is regarded as a probiotic trait, these findings prompted an investigation of other potentially health-associated gene clusters (PHAGCs) within these fermented food microbiomes. PHAGCs were divided into 3 broad categories. Gene clusters binned as "survival" are genes that were shown to be important for surviving the low pH of the stomach or the bile salts of the small intestine [53]. Gene clusters binned as "colonisation" are genes which were shown to be vital for colonising the gut.
These included genes responsible for surface proteins and exopolysaccharide production. "Modulation" gene clusters were all of the other potentially health promoting gene clusters that did not fit the previous two bins. These genes were shown to affect the host phenotype in other ways, such as stimulating the host immune system in the case of D-phenyllactic acid [13] or the production of γ-aminobutyric acid (GABA) [54,55]. The majority of these PHAGC genes are based on studies reviewed in [53]. Shotgun metagenomic data from non-fermented foods, i.e., unpasteurized whole milk, pasteurized skimmed milk and milk powder, were used for comparative purposes. In general, the fermented foods contained considerably more PHAGCs than the non-fermented substrates. Among the fermented foods, a larger number of PHAGCs were found in brine- and sugar-type foods than in dairy foods, with several water kefirs, sauerkrauts, beet kvasses and one kombucha being the foods with the highest levels of PHAGCs (Figure 6). With respect to the individual PHAGC sub-categories, all fermented foods contained more colonisation-type PHAGCs than the non-fermented controls. In the case of the modulation and survival clusters, the number of PHAGCs in some fermented foods, such as scallion kimchi, labne, agousha and mead, was no greater than that in the non-fermented foods.

Metagenomic assembly reveals 10 putative new species

Metagenome-assembled genomes (MAGs) were assembled from the reads and quality checked. In total, 443 MAGs were assembled, with 127 genomes being above 80% completeness and having less than 10% contamination (Figure 7). Traitar [56] was used to predict the growth phenotypes of the 127 MAGs.
The outputs were concatenated into a single output for each food substrate (Figure 7) and provided intuitive results, such as a high correlation between lactose utilisation and dairy foods and high glucose oxidation potential in sugar food microbiomes. Consilience between the Traitar and taxonomic output is supported by the abundance of Lactococcus lactis in dairy and brine samples. FastANI [57] was used to assign taxonomy and to assess novelty and established that 10 of these MAGs had <95% identity to known NCBI prokaryote genomes. Seven of these novel MAGs are acetic acid bacteria, 2 are lactic acid bacteria and 1 belongs to the order Enterobacteriales (Table 3) [...] the microbes, such as a necessity for osmotic stress tolerance in both brine and sugar-type foods.

Other factors, such as the presence or absence of a starter culture, also contributed to differences. Starter culture foods had the lowest alpha diversity, likely a result of adding a community of specialist microbes to the food, which would outcompete any autochthonous microbes less adapted to such an environment. Two kefir samples made from the same starter, but using raw or pasteurized milk, respectively, highlight this point. Although we do not have data on the pre-fermented milk, the raw milk likely contained its own unique consortium compared to the relatively low bacterial load of the pasteurised sample. After 48 hours of fermentation, both samples had almost identical microbial composition. The small differences may be due to carry-over differences in the microbiota of the substrates, the stochastic differences between any two fermented samples and species falling below the 0.1% abundance threshold for inclusion (hence the appearance of 5 unique species between the 2 samples). Interestingly, P.
helleri was found at 3% in the pasteurised sample (not at all in the unpasteurised), having been isolated from raw milk in previous studies [60].

The differences in diversity between solid and liquid foods are possibly due to the selective pressures of mobility, nutrient availability (in a homogenous liquid compared to a less homogenous solid food) and moisture content in solid foods compared to liquid foods. The observed differences in diversity due to producer are more difficult to explain, but unrecorded factors such as individual fermentation practices or cross contamination between foods or from the processing environment may be the cause of these differences. Country of origin was not significant for any characteristic examined, possibly due to the cosmopolitan nature of all of the fermenting microbes. Outside of composition and top-level functionalities, other traits did vary in line with other categories, in that the bacteriocin gene cluster profile differed significantly across solid and liquid foods, and AMR-encoding genes differed across food substrate and between spontaneous and starter-type fermentations. It is unclear as to why bacteriocin gene clusters differed across solid and liquid foods, but perhaps the matrices of solid foods require different ecological tools for competitive advantage than liquid substrates.

Analysis revealed that the microbiomes of starter culture-type fermentations contain more assigned AMR-associated genes. However, this difference could represent the more extensive characterisation of starter culture microbes, and their associated genomes and AMR profiles, leading to better assignment of AMR genes from starter culture strains than those involved in spontaneous fermentations. Compared with human gut metagenomes, the majority of the fermented foods had a lower AMR CPM.
Of the 8 foods with higher AMR CPM, only 3 foods stood out as having considerably higher CPMs, 2 of which were subsamples of the same food, i.e., wagashi cheese. In contrast, kimchi and kombucha samples were notable by virtue of either lacking detectable AMR genes or having very low CPMs. Kimchi shared many taxa with other brine-type foods so the differences observed may reflect strain-level differences. Metagenomic sequencing of a larger collection of these fermented foods, coupled with antibiotic resistance assessments of isolated strains, will be necessary to determine how representative these results are.

Bacteriocin production is regarded as a probiotic trait. These peptides and, in the case of bacteriolysins, proteins, are thought to be produced by bacteria to gain a competitive advantage over other taxa, typically those occupying the same environmental niche. Bacteriocin production can contribute to the quality and safety of foods through the removal of spoilage and pathogenic bacteria, but bacteriocin production in situ in the gut can also enable the producing bacteria to become established, compete against undesirable taxa and contribute to host-microbe dialogue [61,62]. The bacteriocin profile did not differ according to food substrate, with zoocin A- and enterolysin A-like genes being most abundant across all food substrates. However, the bacteriocin-associated genes present in solid and liquid foods differed significantly from one another in that liquid foods were enriched with pediocin-like genes. Further analysis of the pediocin sequences revealed homology with pedA and pedB, which are required for production of pediocin AcH/PA-1. These bacteriocins are best known for their strong antilisterial effects [63].
Pediocin AcH/PA-1 has also been shown to be active against enterococci and staphylococci [64], and the presence of these genes potentially adds to the safety of these foods, and their potential to be health promoting. Solid foods had a higher abundance of carnocin CP52-like bacteriocins, which are known for activity against Listeria and Enterococcus, again potentially adding to the safety of these foods [65].

Across a broader range of PHAGCs, it was apparent that these gene clusters were more common in fermented than in non-fermented foods. Sugar and brine foods were found to contain the highest levels of PHAGCs. Microbes in sugar-type foods generally must persist in low pH environments, with some kombucha fermentations dropping to as low as pH 3 [66]. In contrast, although also somewhat acidic, a milk kefir fermentation is regarded as complete when the pH reaches 4.5 [67], while the pH of most cheeses is between 5.1 and 5.9. Many of the sugar foods also contained colonisation-associated PHAGCs. It was also noted that brine-type foods had the highest abundance of Lactobacillaceae, specific representatives of which have been exploited for their probiotic activity. A combination of these various factors likely contributes to the higher abundance of PHAGCs in both of these food types relative to dairy foods. However, even within the respective food substrate groups, the PHAGCs present varied considerably, with foods such as water kefirs, sauerkrauts, pickled veg, ginger, kvass and kombucha being enriched in PHAGCs. These foods all contained colonisation and survival PHAGCs at a higher frequency, e.g., glycosyltransferases for colonisation in kombucha and pickled veg, and bile salt metabolism genes in water kefir and fermented sliced ginger.
D-lactate dehydrogenase pathways were consistently identified in these foods but were absent from others, such as scallion kimchi, carrot sticks and agousha. This observation is notable as D-lactate dehydrogenase is the enzyme responsible for producing D-phenyllactic acid (D-PLA), a metabolite known to modulate the host immune system [13]. Glutamate decarboxylase, which converts glutamate into gamma-aminobutyric acid (GABA), was present in some (kombucha, kvass, coconut kefir and some water kefir samples), but not all, PHAGC-enriched foods. GABA is a well-known modulator of mood [68], while this enzymatic reaction also consumes protons and thus contributes to acid resistance [69]. Although in vivo studies are required to directly examine the health benefits of specific fermented foods, these insights can help to identify foods, and strains, that are more likely to be health promoting, facilitate the production of fermented foods optimised for health promotion and direct the experimental design of human intervention studies.

Finally, this study generated 127 high quality MAGs, of which 10 represent putative novel species. Three putative new Acetobacter species from water kefir, milk kefir and sauerkraut, a Gluconobacter from bread kvass, a Leuconostoc from sauerkraut and a Lactobacillus from boza were assembled from the shotgun data. While these species are apparently novel, the corresponding genera are found in fermented foods at a high frequency. However, MAGs representing 2 genera that have not been found in fermented foods before were also assembled, i.e., a Rouxiella species and 3 Acidisphaera species, all from water kefir samples. Rouxiella chamberiensis and Acidisphaera rubrifaciens are the only previously known members of their respective genera.
Rouxiella chamberiensis was isolated from parenteral nutrition bags and has been shown to ferment D-glucose but not sucrose [70], and Acidisphaera rubrifaciens has been found in acidic hot springs and mine drainage systems and, like many of the other sugar food taxa, is acidophilic [71]. The assembly of these and other MAGs in the future will contribute towards the building of fermented food, and other food, microbe databases, equivalent to those available for the more complex human gut microbiome [72], to enable the more accurate and rapid identification of food microbes. Such databases will be key in the application of metagenomics-based approaches on a widespread basis by the food industry.

Overall, this study combines many novel insights into fermented food microbiomes. Firstly, the taxonomic composition of the 58 foods has been described, including many foods that have not been described using NGS previously. Secondly, the functional profile of these foods has been characterised and, like the taxonomic profile, highlights the differences between starting material and microbial composition. Importantly, given the current interest in fermented foods as a healthy food choice and the role diet plays in modulating the gut microbiome, the health promoting potential of the microbes in these various foods has been explored. Finally, genomes, including potentially novel taxa, were assembled from these foods, and will contribute to the better assignment of reads from fermented food, and indeed broader food chain microbiome studies, in the future.

Methods

Fifty-eight samples of fermented foods were collected from various artisanal producers (see Table 1). For solid foods, 5 g was placed in a stomacher bag, 50 ml of sterile MRD was added to the bag and the contents were homogenised in a stomacher (BagMixer 400 from Interscience) for 20 minutes.
After this step, both solid and liquid foods were extracted using the same method. 50 ml of the homogenised solution was centrifuged at 10,000 rpm, at room temperature, for 10 minutes. The supernatant was discarded. The pellet was resuspended in 550 µl of SL buffer (from the GeneAll kit below) in a 2 ml tube. 33 µl of Proteinase K was added to the tube, which was incubated at 55°C for 30 minutes. The solution was then transferred to a bead beating tube and placed in a Qiagen TissueLyser II for 10 minutes at 20/s. The GeneAll Exgene extraction protocol was then followed from step 4 until the final elution step, where 30 µl of elution buffer (EB) was used instead of the 50 µl suggested in the protocol.