Gradient Internal Standard Method for Absolute Quantification of Microbial Amplicon Sequencing Data

To solve the problem of amplicon sequencing cannot discern the microbiota absolute abundance, we proposed a gradient internal standard absolute quantification method. We used Chinese liquor fermentation as a model system to demonstrate the reliability and accuracy of the method.

Thus, to compare samples across temporal and spatial dimensions, we must use microbiota absolute quantitative results (13).
The internal standard can be utilized to solve problems with absolute microbial quantification (12). Recently, researchers have conducted quantitative analyses of microbiota abundances through the synthetic internal standards of 16S rRNA, 18S rRNA, and internal transcribed spacer (ITS) (14) using a standard external strain, such as a fluorescently labeled strain (15). Presently, only one internal standard can be added to ecosystems with only one concentration. However, the different individual microbes in the microbiota cover a wide range of concentrations in the actual samples (16,17). For example, the concentrations of different bacteria ranged from 6.9 Â 10 4 CFU g 21 to 6.5 Â 10 8 CFU g 21 in compost (18). Typically, the closer the internal standard is to the microbial concentration, the more accurate the quantitative result is. Studies have indicated that different results are obtained using internal standards with various concentrations, and only 20 to 80% of microbes can be accurately quantified (14). Hence, to quantify different microbial absolute abundances, it was suggested to determine the same sample multiple times by adding different internal standard concentrations at various times, which is time-consuming. As a result, it remains challenging to accurately determine microbes with different concentrations by measuring a sample only once. Therefore, it is important to establish an efficient method to accurately quantify microbes with large concentration ranges in complex microbiota.
In this study, we established a gradient internal standard absolute quantification (GIS-AQ) method to absolutely quantify microbes in complex microbiota in solid-state fermentation. The gradient internal standard group concentrations were simultaneously added to the same sample to quantify microbial concentrations of different orders of magnitude. The specific primers of the 16S rRNA, ITS, and internal standards were designed on the same sequence using one sequence fragment to reduce the number of internal standards added. Moreover, the deviations from the quantitative equations of microbes and internal standards were eliminated through calibration. In addition, we built a mock community with specific primers for each organism in the communities with quantitative real-time PCR (qPCR) validation. Based on the mock community, we demonstrated the reliability and accuracy of the GIS-AQ method by microscopy quantification (the microbial gradient dilution solution was counted using optical microscopy). Finally, the GIS-AQ method was used to identify the absolute abundance of microbiota during solid-state Chinese liquor fermentation. This method is useful for quickly understanding the absolute abundance of microbial communities over different temporal dimensions and can play an immeasurable role in microbiological research.

RESULTS
GIS-AQ method principle. First, we added five different internal standards (pUC-57 plasmid) with a 10Â concentration gradient to a sample to construct a gradient internal standard group for absolute quantification (Fig. 1A). Each internal standard contained a specific pair of primers and recognition sequences ( Table 1) that were not present in the microbial genomes based on a search of the National Center for Biotechnology Information (NCBI) database and were flanked by the universal primers of the bacterial 16S rRNA V3-V4 and fungal ITS2 regions (Fig. 1A). The concentration of plasmid was calculated by the calculation equation in Materials and Methods. Second, before genomic DNA extraction, five internal standards (plasmid [ Fig. 1A]) were added to the same sample at different concentrations to construct the concentration gradient (approximately 10 4 , 10 5 , 10 6 , 10 7 , and 10 8 copies g 21 ). The concentrations were chosen based on previous studies on the microbial concentration ranges (10 3 to 10 9 copies g 21 ) in the Chinese liquor solid-state fermentation ecosystem (19)(20)(21) (Fig. 1B). Due to the specific primers and recognition sequences (see Materials and Methods) of the five internal standards, the internal standard sequences (ISS) were easy to identify during data analysis. Third, there was a positive correlation (Pearson's r . 0.9; P , 0.05) between the copies of the internal standards added and detected (Fig. 1C). The numbers of reads (log 10 reads) were normalized to the concentration (log 10 copies per gram) based on their linear correlation (raw data converted by log 10 ) to complete the individual microbial quantification.
Detection of the microbial absolute abundance in mock communities using the GIS-AQ method. To explore the possibility of quantifying bacterial and fungal communities using internal standards, we used five bacteria and five fungi to build a mock community (see Materials and Methods). Using the mock community, we established a gradient internal standard absolute quantification (GIS-AQ) method and demonstrated its accuracy using the following five steps.
First, we tested whether the five internal standards were stable and reliable (Fig. 1B, part 1; see Materials and Methods). Based on the qPCR results, the five parallel experiments demonstrate that there were no significant differences (P . 0.05) and the quantitative equation was stable and reliable (R 2 = 0.999; P , 0.001; see Fig. S1 in the supplemental material) for use in this method.
Second, we measured the recovery rates of the five internal standards, five bacteria, and five fungi of different absolute abundances during genomic DNA extraction from a solid substrate (steamed sorghum) (Fig. 1B,  Third, we calibrated the internal standards to eliminate the deviations between the quantitative equations of the internal standards, bacteria, and fungi, because we assumed that the internal standards cannot be completely equivalent to the microbes in the ecosystems. We built two staggered mock communities using 10 bacteria and fungi common in liquor fermentation. We then added five different absolute abundances of internal standards (approximately 10 4 , 10 5 , 10 6 , 10 7 , and 10 8 copies g 21 ) to each mock community (Fig. 1B, part 3; see Materials and Methods). Then, we extracted the internal standards (plasmids) and microbial genomic DNA for quantification (see Materials and Methods). The quantitative equation established by internal standards was as follows: which was calibrated to where a and b are the slope and intercept of the internal standards and microbes, respectively, c and d are the calibrated coefficients in the internal standards and microbes, respectively, x internals is the concentration of the internal standards (log 10 copies per milliliter) or microbes (log 10 copies per milliliter), and y CT-value is the cycle threshold value (CT-value) of qPCR results.
An analysis of variance (ANOVA) of the slopes (variable a in the equation) and intercepts (variable b in the equation) of the quantitative equations of the internal standards, bacteria (P a = 0.089; P b , 0.001 [ Fig. 2A]), and fungi (P a = 0.113; P b , 0.001 [ Fig. 2B]) demonstrated that there were no significant differences (P a . 0.05) between the slopes, but there were significant differences (P b , 0.001) between the intercepts. Thus, we attempted to calibrate the quantitative equations of the internal standards based on the intercepts of bacteria and fungi. Due to the lack of a significant difference between the slopes (P a . 0.05), the slope calibrated coefficients were 1 (c bacteria = 1; c fungi = 1 [ Fig. 2C and D]). The intercepts of the quantitative equations of the internal standards, bacteria, and fungi were 45.118 6 1.316, 36.251 6 0.534, and 39.529 6 0.970, respectively ( Fig. 2A and B). Therefore, the calibrated coefficients were 0.803 and 0.876  Fig. 2C and D]). When the calibrated equations were used, the deviations between the internal standards and microbes effectively disappeared (bacteria, P b = 0.982; fungi, P b = 0.985 [ Fig. 2C and D]), producing excellent agreement between the results of the quantitative equations of the internal standards and microbes. Fourth, amplicon sequencing was used to sequence the genome samples extracted in the third step. We assumed that the deviation between the quantitative results of the internal standards and microbes in the third step would also exist between the results of amplicon sequencing. Therefore, the calibrated coefficients (d bacteria = 0.803; d fungi = 0.876) obtained in the third step were applied to the quantitative equations of amplicon sequencing ( Fig. 3A and B) (see Materials and Methods). We used microscopy quantitation methods to validate the GIS-AQ method. Microscopy quantitation is the most accurate quantitative method because it can directly count the number of added individual microbes (15,22). Therefore, the accuracy of the GIS-AQ method can be validated if there was no significant difference between the quantitative results by calibrated quantitative equations and microscopy quantitation. Based on the gradient of the internal standards and calibrated coefficients, we established calibrated quantitative equations for the bacteria and fungi (Fig. S3) in each sample.
Finally, we compared the different quantitative results ( Fig. S4A and B) and found that there were no significant differences (P . 0.05) between the quantitative results using the calibrated quantitative equations and the microscopy quantitation in the mock communities ( Fig. 3C and D). These results suggest that the GIS-AQ method can accurately (P internals versus microscopy . 0.05) quantify the microbial absolute abundances.
Perturbation of quantification with single internal standards. Before we applied the GIS-AQ method to solid-state fermentation, we sought to answer two questions. (i) In the amplification sequencing process, does the use of internal standards affect the relative abundance of microbes in solid-state fermentation? (ii) Does the use of a single internal standard perturb the absolute abundance?
To determine the effect of additional internal standards on the solid-state fermentation, we only added one internal standard to three independent solid fermentation samples with 0 copies g 21 (control), approximately 10 5 copies g 21 , and 10 8 copies g 21 (Fig. S5A and B) (see Materials and Methods). To facilitate comparison between the samples, we analyzed the composition of the microbial communities after the internal standards were removed.
There were no significant differences between groups with different internal standards added in the bacterial communities (analysis of similarities   Then, genera with an average relative abundance of $1% (dominant microbiota) were selected for analysis. There were no significant differences (P . 0.05) in the relative abundances of the seven dominant bacteria (Fig. 4C) and nine dominant fungi (Fig. 4D) in the groups with different internal standards added. These results indicate that during the amplification sequencing process, the use of internal standards did not affect the relative abundances of microbes in solid-state fermentation.
The absolute abundances of bacteria and fungi were obtained based on the concentrations of the single internal standard (approximately 10 5 copies g 21 and 10 8 copies g 21 ).
The absolute abundances of all bacteria were significantly different (P , 0.05) between the two internal standards of different concentrations (except for Acetobacter) (Fig. 4E). The absolute abundances of all fungi were also significantly different (P , 0.05) between the two internal standards of different concentrations (except for Pichia) (Fig. 4F). These results indicate that adopting gradient internal standards is necessary because of the quantitative perturbation that results from using a single internal standard.
Detection of the absolute abundance of microbiota in solid-state fermentation samples using the GIS-AQ method. To explore the difference between the relative and absolute abundances, the Chinese liquor solid-state fermentation (a type of ecosystem) (see Materials and Methods) was used as a model system for quantification. Microbial compositions were constructed for the solid-state fermentation samples from the first and second batches based on 16S rRNA and ITS sequences ( Fig. S6A and B). The absolute quantitative equations of bacteria and fungi were obtained based on the calibrated quantitative equations of the internal standards (Fig. S7). Then, the absolute abundances of the communities in the fermentation were calculated based on the calibrated quantitative equations (Fig. S6C and D).
For bacteria, Lactobacillus (absolute dominant genus; $10% average abundance) was selected as an example to illustrate the difference between relative and absolute abundance. The relative abundance of Lactobacillus gradually increased until day 20 of fermentation, while the absolute abundance reached its peak on day 7 (Fig. 5A). For fungi, Saccharomyces (absolute dominant genus) was selected as an example to illustrate the difference between relative and absolute abundance. The relative abundance of Saccharomyces gradually increased until day 10 of fermentation, while the absolute abundance reached its peak on day 7 (Fig. 5B).
In summary, the results obtained from the relative and absolute abundances are different. The relative abundance is more suitable for the study of the different compositions of communities in one sample. The absolute abundance can be used to study the difference in communities between different samples, and the GIS-AQ method allows this to be done swiftly.

DISCUSSION
Characteristics of different quantitative methods. We established and implemented a gradient internal standard absolute quantification (GIS-AQ) method to determine the absolute abundances of bacteria and fungi. This method establishes a reference and extends existing quantitative methods for microbial communities. Every method has its advantages and disadvantages; therefore, we compared the GIS-AQ method with existing quantitative methods to provide context and allow the reader to choose the most appropriate method for their respective research question. To resolve the shortcomings of traditional amplicon sequencing, various methods have been proposed to assess the absolute abundance of microbiota, such as flow cytometry (11,16,17), qPCR (17), fluorescence in situ hybridization coupled with catalyzed reporter deposition (CARD-FISH) (23), and spike-in (internal standard) (12,14,15,22,(24)(25)(26). However, determining the absolute abundance using these methods remains challenging. For example, combining amplicon sequencing and flow cytometry can accurately quantify the number of cells in liquid samples. However, it is mainly suitable for specific labeled cells in the liquid samples. In the case of qPCR, the external standard used for qPCR quantification seldom has the same recovery rate as the ecosystem. The CARD-FISH method is complicated and time-consuming to operate. The internal standard approach is the most widely used quantitative method. For example, some researchers designed three internal standards (plasmids) for prokaryotic 16S rRNA, eukaryotic 18S rRNA, and fungal ITS in soil samples. To realize the microbial quantification, three internal standards were added before DNA extraction and repeated DNA extraction and sequencing, changing the concentration of internal standards in each repetition and establishing a quantitative equation, of the same soil sample (14). Researchers designed a synthetic full-length 16S rRNA gene spike-in standard for prokaryotic 16S rRNA in soil samples, and they also used the same method as before to achieve the quantification of prokaryotic 16S rRNA (26). Compared with the existing internal standard quantitative methods, the GIS-AQ method can be described as follows. (i) We constructed a gradient internal standards group, and one round of amplicon sequencing can achieve the determination of different concentrations of internal standards. The GIS-AQ method is also faster and more economical than performing amplification sequencing multiple times to establish a quantitative equation. (ii) We designed the primer information of bacteria and fungi on the same internal standard sequence, which facilitates the quantification of bacteria and fungi at the same time.
(iii) Based on the mock communities, the quantitative equations of the internal standards were calibrated to eliminate the potential deviations between the internal standards and communities in the process of genome DNA extraction. Therefore, we believe that the GIS-AQ method has advantages for quantifying microbial communities.
Application of the absolute abundance data. The relative abundance of amplicon sequencing is based on the normalization or transformation of the total read number (27). Relative abundance is suitable for statistical analysis of the same samples or samples that have similar communities, such as natural soils (28) and healthy guts (29). For samples with large differences in communities, such as fertilizing soils (25) and patients' guts, the relative abundance may easily generate information that is contrary to the actual situation (11). Therefore, at present, the analysis method of combining relative and absolute abundances is primarily used for samples with large differences in microbial communities. Specific manifestations are as follows. (i) When characterizing the changes in communities, the absolute abundance is more convincing than the relative abundance because it shows the dynamic changes in communities between different samples (such as time series samples) that cannot be seen by traditional relative abundance (11). This application represents the most important use of absolute abundance (14). (ii) Regarding the replacement of relative abundance data with absolute abundance data in traditional statistical analysis, at present, no systematic comparative studies that have shown that absolute abundance data can be used in traditional statistical analysis methods are known; however, some researchers have attempted substitution in different statistical methods, such as principal-component analysis (25), network analysis (17), and cluster patterns (heat map) (15). The scope of the application of absolute abundance in the field of microbial quantification still requires further research to support and verify current results. Thus, absolute abundance is a crucial challenge for future microbiota variance studies (24).
Applicability of the GIS-AQ method. The GIS-AQ method established in this study provides a new reference for the field of microbial quantification. As the microbial abundances, species, and substrates are different in various research fields, the GIS-AQ method can be applied to different research fields after appropriate adjustments. These adjustments mainly reflect the design of the internal standards, mock communities, and substrates. (i) The parameters of the internal standards should be designed according to the general primer, GC content, and the sequence length corresponding to the amplification regions. For example, when the amplification region is V4 (30), V4-V5 (31), or ITS1 (32), it is necessary to change the primer of the internal standards used. (ii) The mock communities and substrates should be the same or similar to the object of study because the quantitative principle of internal standards is to use the quantitative equations of the internal standards to quantify the absolute abundance of the microbes. At present, it is unknown whether differences exist between the internal standards and microbes in the process of genome extraction and amplification; however, the problem can be solved using mock communities and substrates. Therefore, when studying the oceans (33), the mock communities should have microbes with different abundances in the oceans, and the mock substrates chosen should be a sterile isotonic saline solution with a seawater composition or sterilized seawater.
Matters requiring attention in the use of the GIS-AQ method. The amount of manual labor required by the GIS-AQ method is larger than for existing quantitative research using internal standards (25). This is mainly due to the construction of the mock communities and substrates. As we assume that the internal standards cannot be completely equivalent to the microbes in the natural systems, it is necessary to eliminate the deviations between the internal standards and microbial quantitative equations. Theoretically, different mock communities and substrates need to be constructed for various research ecosystems. However, if there is no change in the ecosystem, the mock communities and substrate need only be constructed once; therefore, the extra manual labor required is limited. In the Chinese liquor solid-state fermentation ecosystem, the detection line of GIS-AQ method is 10 4 to 10 5 copies g 21 , which is a very low detection line compared with that for the dominant microbes (10 8 to 10 9 copies g 21 ) (21,34). Furthermore, the concentration ranges of the internal standards can be adjusted according to the microbial concentration ranges in different ecosystems, such as soils (10 5 to 10 9 copies g 21 ) (25) and guts (10 7 to 10 11 copies g 21 ) (17). For the use of the GIS-AQ method, the following points should be noted. (i) There should be uniformity between the internal standards, because different internal standards need to be used to represent various concentrations. (ii) The concentration gradients of microbes in mock and natural communities should be similar. (iii) When calibrating the quantitative equations, the difference in the calibration coefficients between bacteria and fungi cannot be ignored.
In summary, we developed a GIS-AQ method for bacteria and fungi. The GIS-AQ method expands upon traditional amplicon sequencing based on relative abundance. Based on the design principle, after proper modification, the GIS-AQ method is compatible with any application using any amplification region (such as V3, V4, V3-4, ITS1, and ITS2) and is thus applicable in a broad range of ecosystems, including soil, rhizospheres, and food. Therefore, the GIS-AQ method is a valuable addition to microbiology research.

MATERIALS AND METHODS
Design and construction of internal standard sequences. The internal standard sequences (ISS) were designed based on three key elements. (i) The internal standard sequences had the same universal primer binding sites (PBSs) as those of the bacteria and fungi in the amplicon sequencing samples. (ii) The synthetic stuffer sequences had a similar length and GC% as that of the microbes in the solid-state fermentation. (iii) The specific PBSs were used to recognize the internal standard sequences (Fig. 1).
First, the primers 336F/806R of the 16S rRNA V3-V4 region (35) and ITS3/ITS4 of the ITS2 region (36) were used as universal primers for bacteria and fungi for the internal standards. Second, based on previous research (21), we selected the average length of amplified fragments and GC content of the 10 microbes (five bacteria and five fungi [Table S1]) with the highest relative abundances in the solid-state fermentation process as the references to construct the random sequences (14). There was a set of random sequences ( Table 1) that satisfied the following criteria: uniform GC content (61%), no homopolymers of .10 bp, no repeats of .16 bp (as determined by BLAST), and no self-complementary regions of .10 bp (26). In addition, the optimized set of artificial sequences shared negligible identity with sequences in the National Center for Biotechnology Information (NCBI) database. Third, we designed specific primers for internal standards sequences based on the NCBI Primer-BLAST (Table S2). The specific primers satisfied the following criteria: similar length, similar melting temperature (T m ) values, similar GC contents, no homopolymers of .3 bp, and no self-complementary regions of .5 bp (26). The designed primers were used as upstream and downstream primers, and BLAST was performed in the NCBI database, which demonstrated negligible identification with known sequences in the NCBI nucleotide collection (nr/nt) database (web-BLAST performed in May 2019). The specificity of these primers was demonstrated by a PCR cross-validation. The PCR conditions were as described in a previous study (37).
Complete internal standards sequences are provided in Table 1. Full-length internal standard sequences (472 bp) were chemically synthesized and inserted into the pUC-57 plasmid cloning vector by Sangon Biotech (Shanghai, China). Plasmid cloning vectors with internal standard sequence inserts were transformed into Escherichia coli TOP10 competent cells [genotype: F 2 mcrA D(mrr-hsdRMS-mcrBC) w 80 lacZDM15D lacX74 recA1 araD139D(ara-leu)7697 galU galK rpsL (Str r )endA1 nupG] (Sangon Biotech) as per the manufacturer's instructions. Internal standard DNA was extracted from overnight liquid cultures using a SanPrep spin column and collection tube (Sangon Biotech) according to the manufacturer's instructions. The internal standard DNA concentrations were measured using a PicoGreen double-stranded DNA (dsDNA) assay kit (Invitrogen Life Technologies, Grand Island, NY) (26), and the internal standard concentration was calculated as C internal ðcopies ml À1 Þ ¼ m DNA ðng ml À1 Þ Â 6:022 Â 10 23 ðcopies mol À1 Þ DNA length ðbpÞ Â 10 9 ðng g À1 Þ Â 660 ðDa mol À1 Þ where C internal is the internal standards concentration, and m DNA is the mass of the internal standards. In our study, five internal standards (IS1 to IS5) were stored at -80°C for future processing at a concentration of 1 mg ml 21 (Table S2) for the microbes. The specificity of these primers was demonstrated using PCR cross-validation.
qPCR. The qPCR was performed on a real-time PCR system (Applied Biosystems, Foster City, CA) with a mixture of 0.4 ml (20 mM) of primers, 10.0 ml of SYBR green supermix (SYBR Premix Ex Taq II; TaKaRa, Shanghai, China), 1.0 ml of DNA templates, and 8.2 ml of double-distilled water (ddH 2 O) (38). The PCR program was as follows: preheating at 98°C for 10 min, 40 cycles at 98°C for 30 s and 60°C for 30 s, and an increase of 0.5°C every 5 s from 72°C to 95°C for the melting-curve analysis to confirm the amplification specificity (39).
The calibrated coefficients. The recovery rates of internal standards and microbes differed in the genome extraction (14). To calibrate the differences in the recovery rate, we designed the following experiments.
First, we diluted all five internal standards (IS1 to IS5) according to the gradient of approximately 10 4 , 10 5 , 10 6 , 10 7 , and 10 8 copies ml 21 . Then, five internal standards of the same concentration were mixed to obtain the internal standard concentration gradient. qPCR was performed on the mixed liquid to determine whether there was an amplification preference among the various internal standards. Second, genomic DNA was extracted from the five groups of mixed liquid using the easy nucleic acid isolation (E.Z.N.A.) soil DNA kit (Omega Bio-Tek, Inc., Norcross, GA) according to the manufacturer's instructions to determine the internal standard recovery rate. Third, we used qPCR to establish the melting curves for five internal standards, five bacteria, and five fungi to discern the differences between their recovery rates. Finally, according to the microbe recovery rates, the quantitative equation established by internal standards, provided in Results, was determined.
Mock communities. To demonstrate the GIS-AQ method, we used mock communities to mimic the taxonomic diversity of the microbes in the solid-state fermentation process. Based on previous research (21), bacteria and fungi found to be dominant in the solid-state fermentation process of Chinese liquor fermentation were used as mock communities. B. amyloliquefaciens and L. fusiformis were incubated for 24 h at 37°C and 200 rpm in a Luria-Bertani sterile liquid medium (Sangon Biotech, Shanghai, China). L. acetotolerans, W. paramesenteroides, and P. pentosaceus were incubated for 48 h at 37°C in de Man-Rogosa-Sharpe medium (Oxoid Ltd., Hants, UK). P. kudriavzevii, N. castellii, H. osmophila, W. anomalus, and S. cerevisiae were incubated for 48 h at 30°C in a yeast extract-peptone-dextrose medium (Sangon Biotech).
After incubation, the 10 cultured bacteria and fungi were eluted with phosphate-buffered saline (0.01 M, pH 7.2) at 4,000 rpm for 20 min (5804 R centrifuge; Eppendorf, Hamburg, Germany) three times. Then, the microbes and internal standards were added to the sterile sorghum substrates after microscopy quantification. The sterile sorghum substrates were preprepared via autoclaving sorghum at 121°C for 20 min to simulate the substrates in the Chinese liquor solid-state fermentation process (21). The added concentrations of microbes were as follows: approximately 10 8 copies g 21 of B. amyloliquefaciens, approximately 10 7 copies g 21 of L. fusiformis, approximately 10 6 copies g 21 of L. acetotolerans, approximately 10 5 copies g 21 of W. paramesenteroides, approximately 10 4 copies g 21 of P. pentosaceus, approximately 10 8 copies g 21 of P. kudriavzevii, approximately 10 7 copies g 21 of N. castellii, approximately 10 6 copies g 21 of H. osmophila, approximately 10 5 copies g 21 of W. anomalus, and approximately 10 4 copies g 21 of S. cerevisiae. The added concentrations of internal standards were as follows: approximately 10 8 copies g 21 of IS1, approximately 10 7 copies g 21 of IS2, approximately 10 6 copies g 21 of IS3, approximately 10 5 copies g 21 of IS4, and approximately 10 4 copies g 21 of IS5. Next, the sorghum substrates containing different absolute abundances of bacteria, fungi, and internal standards were uniformly mixed (Table S3). We performed five independent preparations for the mock communities.
Sample collection in natural solid-state fermentation. For the solid-state liquor fermentation, steamed grains were mixed with the starter at a ratio of 9:1 (wt/wt) and fermented in sealed jars for 28 days. Subsequently, the fermented grains were distilled to obtain the liquor. We collected a total of 24 samples (100 g per sample) from 3 jars (three independent preparations) in the layer center (0.5 m deep) on days 0, 3, 5, 7, 10, 15, 20, and 28 (Table S3). All samples were stored at 280°C for further DNA extraction.
Single internal standard perturbation experiment preparation. To calibrate the influence of different concentrations of the internal standards on the results, we designed a single internal standard perturbation experiment. The perturbation experiment was designed based on the samples from day 10 of fermentation. Previous studies have demonstrated that the composition of the microbiota is stable and representative on day 10 of fermentation (21). We divided the same substrates (on day 10) into three parts (10 g each) and then added the internal standards (IS1) at 0 copies g 21 (control, equivalent ddH 2 O), 10 5 copies g 21 , and 10 8 copies g 21 (Table S3) and mixed evenly. All samples were stored at 280°C for further DNA extraction. We conducted three independent preparations for all perturbation experiments.
Absolute quantitation. To detect the absolute abundance of the microbes in the mock communities and ecosystems, we used gradient internal standards to build a linear equation between the concentration (log 10 copies per gram) and the number of reads (log 10 reads) of internal standards. The quantitative equation is as follows: y reads ¼ a Â x internals 1 b which was calibrated to y reads ¼ a Â c Â x internals 1 b Â d where a and b are the slope and intercept of the internal standards and microbes, respectively, c and d are the calibrated coefficients of the internal standards and microbes, respectively, and x internals is the concentration of internal standards (log 10 copies per gram).
Based on the above equations, absolute abundances representing the microbial copy number (x internals ) in the mock and natural samples were calculated based on the number of reads (y reads ) in the amplicon sequencing.
Statistical analysis. Diversity and statistical analyses were conducted using XLSTAT (version 19.02.42992). The significant difference between recovery rates, slopes and intercepts were calculated using one-way ANOVA using SPSS Statistics (version 22). The Shannon index and Chao1 estimator were calculated using QIIME (version 1.8) (43,44). ANOSIM was used to test differences between samples based on the Bray-Curtis distance matrices (45).
Data availability. The bacterial and fungal raw sequence data were deposited in the DNA Data Bank of Japan (DDBJ) database under the accession numbers DRA010282 and DRA010283. The mock community cultures are available by request for noncommercial purposes.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only.