Bacterial composition and function during fermentation of Mongolia koumiss

Abstract Koumiss is a fermented mare's milk beverage that has attracted increasing attention due to its nutritional richness and important economic value. Bacteria in koumiss play a major role in pH decreasing and reducing spoilage through bacterial inhibition. The dynamic changes in nutritional content were determined firstly during fermentation, and then the metagenomics sequencing technology was applied to profile koumiss core microbiota at the species level. We also clarified the function and effect of the bacteria on the nutritional content of the final product. We also investigated active microbial function by comparing the metagenomics of representative samples collected at different time points during the fermentation process. This study dynamically revealed the bacterial composition and function of traditional koumiss during its making process. Twenty‐three major functional categories related to amino acid and fat synthesis, metabolism, and so on were identified. Functional category L (represented replication‐, recombination‐, and repair‐related functions) was one of the most important categories with the highest relative abundance in all of the 23 major functional categories. CoG category having a significant correlation with Lactococcus piscium was the most abundant. The change in metabolic activity of bacteria at different fermentation time points showed that the metabolic activity was more active in the first 24 hr and then began to stabilize. LAB play the major role in the koumiss pH decreasing and quality improvement. The functional genes of related metabolic activity of lactic acid bacteria were more active in the first 24 hr of koumiss fermentation and then began to stabilize.

. It is produced using traditional method sandan indigenous starter culture (i.e., previously made koumiss) containing lactic acid bacteria (LAB), yeast, and other organisms associated with fermentation (Mu et al., 2012;Sun, Liu, Zhang, Yu, Gao, et al., 2010;Watanabe et al., 2008;Wu et al., 2009). However, the literature concerning the dynamic changes in bacterial composition and function with respect to the nutrition of ancient koumiss is scarce.
The nutritional components of koumiss are very rich and include fat, protein, vitamins, amino acids, carbohydrates, and trace mineral elements (Zhang & Zhang, 2012). Traditionally, it was considered a functional food, not only providing rich nutrients, but also due to its medicinal properties. It is considered to be beneficial in postoperative care (Jagielski, 1877;Thompson, 1879). The Cossacks have introduced koumiss into military rations to prevent tuberculosis (Ishii & Samejima, 2001). Chen et al. (Chen et al., 2010) found that koumiss was rich in angiotensin I-converting enzyme inhibitory peptides, which have antihypertensive properties. Chaves-López et al. (Chaves-López et al., 2012) also reported angiotensin I-converting enzyme inhibitory activity in yeast strains isolated from Colombian koumiss. But, at present, it is just as functional as any other fermented dairy product (including all cheeses). Despite this, few studies have clarified the interactions among microorganisms and describe the detailed function of these bacteria with respect to metabolite formation using metagenomics analysis. Restrictions associated with these techniques have left the scientific community with a limited view of the total diversity and functionality present in complex bacterial communities. New next-generation sequencing (NGS) technologies are a powerful tool to explore microbial function globally in food samples, including genes involved in microbial metabolism, and to characterize microbial composition using marker genes. Shotgun metagenomic analysis methods can infer both taxonomic and functional information in complex microbial communities.
This culture-independent method can improve the existing knowledge about microbial community structure and function by enabling direct identification of genetic material from environmental samples (Almeida et al., 2019). In the present study, nutrient composition and bacterial biodiversity were evaluated at different time points during koumiss fermentation and storage. The function of bacteria during koumiss fermentation was elucidated using metagenomic sequencing.

| Samples collection
A total of six samples were taken, over time, from traditionally fermented koumiss from one site in Xilingol, Inner Mongolia, which is located at latitude 43.94 and longitude 116.05. These samples (B 1 , B 2 , B 3 , B 4 , B 5 , and B 6 ) were taken at different stages of fermentation (0, 12, 24, 36, 48, and 168 hr). They were collected aseptically and were kept in a tank filled with liquid nitrogen during transportation (Liu et al., 2019).

| Biochemical analysis
The fat content of koumiss samples was determined using the Rose Gottlieb method (Nos. 991.20,905.02, respectively) described by the Association of Official Analytical Chemists (AOAC 1997). The total protein content was determined using the Kjeldahl nitrogen method and calculated from total nitrogen using a 6.38 conversion factor (Kirk, 1950).
Lactic acid and fatty acid contents were determined using HPLC (Liu et al., 2019). Lactose content was determined using the FIL-IDF 147B method (FIL-IDF Standard 1998). Pretreatment of samples followed the methods described by Choi (2016). The HPLC system (Varian) consisted of a solvent delivery system (9012Q), a refractive index detector (Star 9,040), and a column oven (101). The temperature of the 150 × 7.7 mm Rezex ROA column (Phenomenex) was 60℃. The eluant was 4 mM sulfuric acid and the flow rate was 0.6 ml/min.

| Determination of vitamin composition
Levels of water-soluble vitamin B 1 (VB 1 ) and B 2 (VB 2 ) were determined using HPLC based on the methods of Zafra-Gómez et al. (Zafra-Gómez et al., 2006). Separation of vitamin E was done using reverse-phase liquid chromatography (RP-LC) according to the methods of Gámiz-Gracia et al. (Gámiz-Gracia et al., 1999) and modified for the samples collected. After saponification, 50 ml sample was extracted by petroleum ether and prepared before HPLC analysis according to GB/T 5,009.82-2003methods (AOAC 1997. Then 20 µl of the extract was analyzed by RP-LC (Liu et al., 2019).

| Quantification of amino acids
The quantity of 17 common amino acids in the samples of koumiss taken at different stages of fermentation was measured in order to determine changes in their content. Details of the method used are described in Liu et al. (Liu et al., 2019). Briefly, 1 ml from each koumiss sample was placed into special glass tubes. 10 ml 6 M HCl was added to each tube, and then the tubes were sealed after vacuum treatment. Tubes were placed in dry boxes at a constant temperature of 110℃ to hydrolyze for 24 hr, after which time they were allowed to cool. Then, the hydrolysate was diluted with distilled water to a constant volume of 50 ml; 1 ml of this solution was dried in a vacuum and ground into a powder. Each of the samples was then homogenized in 1.2 ml of 0.02 M HCl and filtered through a 0.22 μm mesh to obtain the final solution. The amino acid content was measured using a Hitachi L-8800 automatic amino acid analysis unit with an ion exchange resin separation column that had a diameter of 4.6 mm ×60 mm (ion exchange resin 2622SC). The temperature of the separation column was set at 57℃. The flow speed of the buffer solution and ninhydrin was 0.4 ml/min (pressure 7.0-8.5 MPa) and 0.35 ml/min (pressure 0.9-1.1 MPa), respectively. After acid hydrolysis, the amino acids present in the koumiss were analyzed using an amino acid analyzer (L-8900, HITACHI, Japan).

| Genomic DNA extraction
Whole genomic DNA was extracted using the methods of Gesudu et al. (Gesudu et al., 2016). The quality of extracted DNA was checked by 0.8% agarose gel electrophoresis and spectrophotometry (optical density at 260 nm/280 nm ratio) with a microultraviolet spectrophotometer to determine its concentration. All extracted DNA samples were stored at −20℃ before subsequent analysis.
Sequencing libraries were prepared with a fragment length of approximately 300 bp. Paired-end reads were generated with 100 bp in the forward and reverse directions. The sample taxonomic profile was also determined directly from the metagenomic dataset using two online software packages, Parallel-META (Su et al., 2014) and MetaPhlAn (Segata et al., 2012), following the instructions of the two software developers.

| Functional annotation
Putative functional genes and amino acid sequences predicted in our gene catalog were aligned with the proteins/domains in the Cluster of Orthologous Group (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases using BLASTP (e-value ≤6 1e-5 with a bit-score higher than 60). Each protein was assigned to the Cluster of Orthologous Groups (COGs) (Tatusov et al., 2001) or Kyoto Encyclopedia of Genes and Genomes (KEGG) orthologues group (KOs) (Kanehisa & Goto, 2000) by the highest scoring annotated hit.

| Variation in metabolic pathways at different stages of fermentation based on KEGG
In order to identify changes in metabolic activity of each pathway at different fermentation time points, changes in metabolic activity of each pathway were compared at two adjacent time points. Pathways satisfying the following conditions were considered to be different between two adjacent time points: (a) KO detected in one pathway was more than 50% of the total KOs in the pathway; (b) There was at least 80% or 90% (difference threshold) of KO in which relative content in one sample was more than that in another.

| Statistical analyses
Statistical analyses were performed mainly using the R packages, including ade4 ggplot2, (http://www.r-proje ct.org/), together with Python (Sanner, 1999), Canoco for Windows 4.5 (Microcomputer Power), and PAST (Hammer et al., 2001). Differences in the relative abundances of taxonomic groups between samples at gene level were evaluated using a Mann-Whitney test. False discovery rate (FDR) values were estimated using the Benjamini-Yekutieli method to control for multiple testing (Benjamini & Yekutieli, 2001).

| Nutritional composition and content
The content of lactose in fresh mares' milk (0 hr) was 45.75 ± 2.15 mg/L. The content of lactic acid in koumiss ranged from 8.2 ± 1.21 to 32.13 ± 1.54 mg/L and began increasing as soon as the fresh mares' milk was seeded with the starter culture from aged koumiss at the beginning of fermentation. The protein content of fresh mares' milk was 1.86 ± 0.07%, and the koumiss was 2.22 ± 0.08%; there was no significant difference between fresh milk and fermented milk. Fat content ranged from 0.74 ± 0.16% to 1.48 ± 0.08%, and there was no significant difference between fresh milk and fermented milk. VB 1 and VB 2 concentration ranged from 1.48 ± 0.12 to 4.86 ± 0.42 µg/100 g and from 2.48 ± 0.12 µg/100 g to 4.86 ± 0.32 µg/100 g, respectively, and there was no significant difference between fresh milk and fermented milk. Vitamin E (VE) concentration ranged from 44.63 ± 0.63 to 99.90 ± 0.05 µg/100 g, and there was no significant difference between fresh milk and fermented milk. The content of the 17 amino acid ranged from 2.40 ± 0.11 µg/100 g for methionine (MET) to 39.64 ± 0.01 µg/100 g for glutamate (GLU). There was significant variation in the content of these amino acids which increases significantly as fermentation time increased. However, the content of cysteine (CYS) and histidine (HIS) decreased gradually with increased fermentation time. The results are shown in Table. 1.

| Microbial diversify in the koumiss at different stages of fermentation
Changes in bacteria diversify of koumiss during fermentation are presented in Figure 1. The genus of Lactobacillus in the phylum Firmicutes was the predominant bacterial genus throughout fermentation. However, with the extension of fermentation time, Lactobacillus tended to decrease while Streptococcus tended to increase in abundance.

| Changes in microbial functional genes during fermentation
Following trimming and filtering, whole genome shotgun sequencing of six samples achieved 111.70 Gb (18.62 ± 2.09 Gb) data. After assembling, 495,259 contigs were obtained. The average N50 length of the contigs was 3,127 bp. A total of 858,433 genes were obtained after MetaGeneMark prediction, and these results are summarized in Table. 2.
The entire sequence data set has been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database (accession number: PRJNA527179).

| Identification of functional categories based on COG databases
The protein function and metabolic pathway annotation information were obtained from BLAST analysis by searching the protein sequences in the COG database, which are translated from genes. A total of 1883 COG functional units were identified from the six samples in this study. These sequences represented 42.83% of all the sequences that could be aligned to the COG database. These functional units fell into 23 major functional categories, and the relative content of the major functional categories in each sample is shown in Figure 2.
From Figure 2 We can determine that functional category L (representing replication-, recombination-, and repair-related functions) is one of the most important of the 23 major functional categories with the highest relative abundance. Function category S is the second largest category, and the function category is unknown.
Functional category R (representing general functions) was the third largest category. In addition, functional categories E (amino acid transport, metabolism), J (translation, ribosome structure, and biosynthesis), and G (carbohydrate transport and metabolism) had a relatively high abundance in these six samples.

| Bacteria function in relation to nutrition and chemical composition of fermented milk
The relationships between bacterial species and nutritional composition as well as the content of fermented milk were identified ( Figure 3a). Lactic acid content was significantly positively

| The relationship between bacteria, metabolites, and nutritional composition of fermented milk
The correlation between COG functional categories and predominant bacterial species of fermented milk was analyzed to de-

N50 length(bp)
Average read length (bp) Note: Contig represented the long fragments formed after assembly of short sequences; N50 can be described as a weighted median statistic such that 50% of the entire assembly is contained in contigs or scaffolds equal to or larger than this value.

| Changes in metabolic pathways at different fermentation time points based on the KEGG database
There were 51 pathways that differed in activity between B 1 and B 2 samples using a 90% differential threshold. All of them were more active in the B 2 sample ( Figure 5). These pathways were mainly concerned with: flagellum assembly; bacterial chemotaxis; VAL, LEU, and ILE biosynthesis; glycosylphosphatidylinositol; lipoic acid metabolism; and the biosynthesis of unsaturated fatty acids. There were 16 pathways that differed in activity between the B 2 and B 3 samples. These pathways were also more active in the B 2 sample.
These pathways were mainly associated with: N-glycine biosynthesis; autophagy regulation; glycosylphosphatidylinositol biosynthesis; and RNA transport. Differences between metabolic pathways activity in the B 3 and B 4 samples were significantly decreased. Only activity of the lipoic acid metabolic pathway was significantly higher in the B 3 sample than in the B 4 sample under the 90% difference threshold.
However, the differentially active pathways increased to 11 when the threshold decreased to 80%. Among them, lipopolysaccharide biosynthesis, flagellum assembly, and VAL, LEU, and ILE biosynthetic pathways were more active in the B 3 sample, and proteasome, steroid biosynthesis, and D-alanine metabolic pathways were more active in the B 4 samples. Differences in inactivity of metabolic pathway activity between the B 4 and B 5 samples were further reduced.
Under the 80% threshold, only ten metabolic pathways were differentially active in the two samples. Lipoic acid metabolism, D-arginine and ornithine D-metabolism, selenium metabolism, streptomycin biosynthesis, pentose and glucuronic acid and conversion of VAL, LEU, and ILE biosynthesis, and other chitosan degradation pathway metabolism were more active in B 4 samples and D-glutamyl amine and D-glutamate metabolism, cell cycle, and geranium campylobacter were more active in the B 5 sample. Differences in activity may be affected by the length of the fermentation interval, and there was a significant increase in pathway activities between the B 5 sample and the B 6 sample. Under the 90% difference threshold, there were significant differences in 25 metabolic pathways between the B 5 and B 6 samples, and these 25 metabolic pathways were all more active in the B 6 samples. The main differentially active pathways included those related to gene transcription factor, glycosylphosphatidylinositol biosynthesis, lipoic acid metabolism, geranium degradation, autophagy regulated steroid biosynthesis, transport, RNA and Darginine ornithine metabolism and N-biosynthesis, GLY, VAL, LEU, and ILE biosynthesis, and the streptomycin biosynthesis pathway.

| D ISCUSS I ON
Bacterial function during fermentation of koumiss was investigated by metagenomics in the present study. Lactococcus, Lactobacillus, and  provided that they are released through bacterial lysis (Valence et al., 1998). In this study, Lc.piscium was negatively correlated with fat content but positively correlated with lactose. The presence of Lc.piscium has also been reported in dairy products such as raw milk, using a novel multiplex PCR (Odamaki et al., 2011); and cheese, using 16S rRNA library sequencing (Carraro et al., 2011). In addition, Enterococcus was negatively correlated with protein. Wessels et al. (Wessels et al., 1990) and Bassem et al. (Bassem et al., 2002) also suggested that Enterococcus produced proteolytic enzymes involved in casein degradation. Overall, LAB played a major role in the fermentation of koumiss. Therefore, in this study is considered the role of different functional genes of LAB in different stages of fermentation from the perspective of metagenomics.
In recent years, there has been a lot of research on microbial diversity in traditionally produced yogurt. As early as 1985, Oberman and Libudzisz reported that Lactobacillus and Lactococcus were the predominant bacterial species in koumiss. It is also known that Lb.delbrueckii subsp. bulgaricus and Lb. helveticus have an important role in fermentation and flavor development (Oberman et al., 1985). Different researchers have isolated and identified bacteria and studied their diversity in traditional koumiss in Inner Mongolia (An et al., 2004;Ishii et al., 1997;Liu et al., 2019), Xinjiang , and Mongolia (Danova et al., 2005;Uchida et al., 2007).

In 2008, Watanabe et al. used PCR-RAPD identification to show that
Lactobacillus was the dominant genus in Mongolian koumiss (Watanabe et al., 2008). Xiaoqunet al. (Xiaoqun & Hesheng, 2012) studied the bacterial diversity of Xinjiang koumiss, and the results also showed that Lactobacillus was the dominant genus. In this study, we confirmed that Lactobacillus was the dominant genus in all stages of fermentation.
The functional genes of these microorganisms are bound to play an important role in the variation in type and content of metabolic products in fermented products. Thus, correlations were made between the functional genes present and nutritional content of koumiss at different stages of fermentation. Six of 23 function gene categories with high relative abundance were regarded as the major functional categories present in the six samples taken at different stages of fermentation (0, 12, 24, 36, 48, and 168 hr). Functional category L (represented replication-, recombination-, and repair-related functions) was the most abundant in all of the samples. It was found that lactose content was significantly positively correlated with the secondary metabolite' biosynthesis, transport, and catabolism. The results of the relationship between COG functional categories and nutrition in koumiss showed that categories G, K, and S were significantly positively correlated with the amino acid content of THR, PRO, ARG, LYS, PHE, TYR, LEU, ILE, SER, and ASP. From this point, we may conclude that the content of amino acids in koumiss is mainly determined by the function of bacteria related to amino acid transport and nucleic acid transport during fermentation process. We found that the functional gene categories obtained from bacterial metagenomics DNA were mostly related to LAB species in koumiss. LAB are important for decreasing pH and inhibiting spoilage bacteria. LAB are excellent candidates for preventing the growth of pathogenic bacteria in food products because they show bacteriostatic effect toward many bacterial species through various mechanisms (Ghanbari et al., 2013).
Vitamin B 1 and B 2 were significantly correlated with secondary metabolite biosynthesis, transport and catabolism, cell wall/membrane/ envelope biogenesis, and defense mechanisms. Riboflavin (VB 2 ) biosynthesis has been described both in gram-positive and gramnegative bacteria, with detailed studies performed for Bacillus subtilis (Perkins et al., 2002). Fermentation process was regarded as the major way for biosynthesis of VB 2 , and a gene rib C had been shown to code for a bifunctional riboflavin (Neuberger & Bacher, 1986) kinase/FAD synthetase (Bacher et al., 2000). Lb.helveticus between the 0 to 36 hr period of koumiss fermentation, and its relative abundance increased gradually until reaching a peak at 36 hr. Different species of Enterococcus also increased gradually throughout the fermentation process, reaching a maximum after 16 hr. The microbial composition of koumiss changed most before 36 hr of fermentation, and the predominant bacterial species were LAB. From this, we can infer changes in the function of microbes at different stages of fermentation might be induced by changes in the relative content of different enzymes.
The activity of particular metabolic pathways changed significantly between adjacent fermentation time points according to the bubble diagram ( Figure 5). If the relative activity of a metabolic pathway at one fermentation time point was significantly higher than that at another time point, then it meant that the metabolic activity of bacteria at that time point was also more active. There were more differentially active metabolic pathways in the B 2 sample than that in the B 1 sample, and in the B 2 sample compared with the B 3 sample. Overall, significantly differentially active pathways were most abundant at the B 2 and B 3 fermentation points than at the other time points. These pathways included N-glycine biosynthesis, autophagy regulation, glycosylphosphatidylinositol biosynthesis, and RNA transport. There were smaller differences of relative activity of metabolic pathway between B 4 and B 5 and between B 5 and B 6 , which meant that microbial metabolism became more active during the first 24 hr of fermentation and then began to stabilize.

CO N FLI C T S O F I NTE R E S T
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

E TH I C A L A PPROVA L
This study does not involve any human or animal testing.

I N FO R M E D CO N S E NT
Written informed consent was obtained from all study participants.