Keystone Genes Confer Superior Performance in Hyperthermophilic Composting

Background: Large amounts of organic solid waste originating from anthropogenic activities have imposed enormous pressure on the environment and human health. Our previous studies showed that, compared with conventional thermophilic composting (cTC), hyperthermophilic composting (hTC) performs better in organic solid waste disposal, offering higher composting temperatures, greater nitrogen conservation (NC), nitrous oxide (N₂O) mitigation and a higher germination index (GI). However, it remains unclear how hTC communities drive this improved performance. Here, we used GeoChip 5.0M coupled with high-throughput 16S rRNA gene sequencing to investigate the variations in carbon (C)-degrading and nitrogen (N)-cycling genes and microbial communities, and their linkages with selected performance indices (composting temperature, NC, N₂O emission rate and GI), in hTC and cTC in factory-scale experiments, aiming to identify the keystone biotic drivers of the improved performance.
Results: We showed that hTC significantly altered the functional composition and structure compared with cTC, which was driven by a taxonomic shift in microbial communities. Specifically, hTC significantly increased the relative abundance of C-degrading genes and decreased the relative abundance of N-cycling genes during composting. These significantly shifted genes were the keystone genes dominating the improved performance of hTC, as indicated by a random forest model. Furthermore, network and partial least squares path modeling analyses suggested that the keystone genes continued to dominantly drive the improved performance in hTC after multiple biotic drivers (community composition and other genes) were simultaneously considered.
Conclusions: Together, our study provides evidence that keystone genes potentially play a pivotal role in improving composting temperature, N₂O mitigation, NC and GI in hTC and emphasizes the importance of understanding functional variation for the targeted manipulation of composting practices.


Background
Large amounts of organic solid waste (e.g., livestock manure, sludge, crop straw and green waste) originating from anthropogenic activities have imposed enormous pressure on the environment and human health [1][2][3]. However, these problems would be greatly alleviated if organic solid waste could be recycled into useful resources through proper treatment [1,4,5]. Therefore, improving the recycling of organic solid waste is significant for environmental protection and resource conservation.
Composting is an attractive approach for recycling organic solid waste and generating a stabilized organic fertilizer, in which microorganisms convert degradable organic components into humus-like substances [2,4,6,7]. The finished product, which looks like soil, is high in carbon (C) and nitrogen (N) and is an excellent medium for growing plants [5,8]. However, conventional thermophilic composting (cTC) technology still suffers from low composting temperatures, high nitrous oxide (N₂O) emissions and unsatisfactory germination indices (GIs), which reduce the environmental benefits of composting plants [2,9,10]. Given that microorganisms play a crucial role in all events related to the biotransformation of organic substrates during composting [11], hyperthermophilic microorganisms have recently been adopted in a composting system known as hyperthermophilic composting (hTC) [12][13][14].
This new composting technique reaches extremely high composting temperatures (up to 90°C) without exogenous heating and has proved superior in organic solid waste disposal by accelerating humification and composting [13,15], decreasing N₂O and ammonia (NH₃) emissions [15,16], improving NC and GI [17], and enhancing the removal of antibiotic resistance genes and microplastics [7,12]. Previous studies indicate that these advantages are mainly provided by the distinctive microbial communities in hTC, especially dominant populations of the hyperthermophilic genera Thermus and Planifilum [12,13]. Such distinctive microbial communities are expected to have multiple implications for improving the performance of composting ecosystems. However, knowledge of microbial composition offers little explanation for the improved performance indices, as microbial communities are highly diverse and typically undergo drastic successions during hTC [7,12,13]. Thus, it remains unclear how hTC communities drive improved performance.
Given the superior performance of the hTC ecosystem, identifying keystone biotic factors related to improved performance is a prerequisite for developing targeted manipulations that increase the environmental benefits of composting practices. Predicting ecosystem performance based on functional traits, especially those provided by C-degrading and N-cycling genes, has received considerable attention in recent studies [18][19][20][21][22]. For example, the relative abundance of C-degrading genes was highly sensitive to temperature [23] and strongly correlated with ecosystem respiration and methane flux in permafrost soils [21]. Furthermore, several studies have noted that decreases in nitrification and denitrification gene abundances were significantly correlated with N₂O emissions during composting [24][25][26]. Crucially, a recent study indicated that microbial functional attributes rather than taxonomic attributes drive topsoil respiration, nitrification and denitrification [27]. Special attention should therefore be paid to characterizing variation in C-degrading and N-cycling genes and identifying its relationship with composting performance, which would elucidate the mechanisms by which microbes mediate the improved performance of hTC ecosystems.
Considering the contrasting microbial communities and composting performance indices between hTC and cTC, we hypothesize that (i) hTC exhibits C-degrading and N-cycling function patterns distinct from those in cTC and that (ii) the distinctive functional genes serve as keystone genes that drive the improved performance of hTC and therefore merit attention [28][29][30]. Here, we employed GeoChip 5.0M analysis coupled with 16S rRNA gene sequencing to investigate the variation in microbial functions and communities, and their linkages with four selected performance indices (composting temperature, NC, N₂O emission and GI), in both hTC and cTC at the factory scale, to identify the keystone biotic factors responsible for the improved composting performance in hTC.

2.1 Hyperthermophilic composting increases carbon-degrading genes and decreases nitrogen-cycling genes
To elucidate the variation in microbial community function between the hTC and cTC treatments, we performed GeoChip 5.0M analysis using 12 representative samples from both treatments at different time points, namely, day 0 (initial phase for both treatments), day 5 (hyperthermophilic phase for hTC, thermophilic phase for cTC), days 13 and 23 (thermophilic phase for both treatments), and days 37 and 44 (mature phase for both treatments), according to the composting temperature (Additional file 1, Figure S1a). The nonmetric multidimensional scaling (NMDS) plot indicated significant differences in the composition and structure profiles of the functional genes between the two treatments (P < 0.05, Additional file 1, Figure S2). Moreover, composting temperature was found to correlate strongly with NMDS1 (58% variance explained, R² = 0.48, P < 0.01), suggesting that the functional structural variation was driven by composting temperature in both treatments [7,20,21].
Considering the significant differences in microbial functional composition, we next compared the C-degrading and N-cycling genes between the two treatments, as these genes may play a vital role in the selected performance indices (composting temperature, NC, N₂O emission and GI) [12,16,25,31]. We detected 45 genes associated with the decomposition of labile or recalcitrant C substrates with relative abundances above 0.1%. Among them, 18 genes exhibited significantly higher relative abundances on day 5 in hTC (hyperthermophilic phase) than in cTC (P < 0.05, Fisher's exact test, Fig. 1a). The significantly enriched C-degrading genes in hTC included those encoding α-amylase (amyA) for starch degradation, xylanase for hemicellulose degradation, pme for pectin degradation, cellobiase for cellulose degradation, chitinase for chitin degradation and phenol oxidase for lignin degradation. More broadly, the overall relative abundance of C-degrading genes was also significantly higher on day 5 in hTC than in cTC (P < 0.01, Additional file 1, Figure S3a), suggesting that hTC might enhance the C-degrading functional capacity by enriching C-degrading taxa. At the end of the composting experiment (day 44), 12 genes associated with C degradation were significantly enriched (Additional file 1, Figure S3a). By contrast, the relative abundances of only 6 and 8 C-degrading genes were significantly lower in hTC on day 5 and day 44, respectively. Accordingly, we detected a significant positive correlation between the relative abundance of total C-degrading genes and composting temperature in the two treatments (R² = 0.629, P < 0.001, Additional file 1, Figure S4a). Consequently, the higher abundance of C-degrading capacities could contribute to enhanced heat production in composting ecosystems [27]. In addition, a significant negative correlation between the relative abundance of C-degrading genes and the GI was also observed (R² = 0.091, P < 0.05, Additional file 1, Figure S4b).
All detected N-cycling genes exhibited lower relative abundance in hTC than in cTC on day 5, but to varying degrees (Fig. 1c), implying a possible decrease in the N metabolic rate and fewer N products from microbially mediated processes escaping from the composting system, such as N₂O and N₂ from denitrification and NH₃ from the mineralization of nitrogenous compounds [32,33]. Specifically, a significant decrease (P < 0.05, Fisher's exact test) in denitrification and N mineralization genes was observed on day 5 in hTC compared with cTC, especially narG and ureC (encoding nitrate reductase and urease, respectively). At the end of the composting experiment (day 44), 10 N-cycling genes were still significantly decreased in hTC compared with cTC (Fig. 1d). A significant correlation was observed between total N-cycling genes and both NC and the N₂O-N emission rate (for NC, R² = 0.677, P < 0.001; for the N₂O-N emission rate, R² = 0.472, P < 0.001; Additional file 1, Figure S4c and S4d). The increased relative abundances of C-degrading genes and decreased relative abundances of N-cycling genes were also observed in other samples (Additional file 1, Table S3). Together, the above results indicated that hTC significantly increased the relative abundance of C-degrading genes and decreased that of N-cycling genes across the whole composting process and that these gene abundances were significantly correlated with their corresponding performance indices.
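The per-gene comparisons above rely on a two-sided Fisher's exact test. As a rough illustration only, the test can be written out for a single 2x2 table as below. The function name and the counts are hypothetical, and treating GeoChip signal as integer counts is an assumption made for the sketch; it is not a description of how STAMP handles the data internally.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two treatments; columns are "this gene" vs "all other genes".
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical counts: gene signal vs remaining signal in hTC and cTC.
p_value = fisher_exact_two_sided(30, 970, 12, 988)
```

When many genes are tested this way, the resulting P values would still need a multiple-testing correction such as the Benjamini-Hochberg procedure named in the Methods.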
2.2 Identification of keystone genes driving the superior performance of hyperthermophilic composting
We subsequently employed a random forest model to explore the extent to which the shifted genes involved in C degradation and N cycling affected the selected individual composting performance indices in both treatments. Regression models comparing the predicted with the actual composting performance indices found strong correlations in both treatments (for hTC, adjusted R² = 0.89-0.96; for cTC, adjusted R² = 0.95-0.97; Fig. 2a). The slopes of our models were also high in both treatments (for hTC, slope = 0.65-0.75; for cTC, slope = 0.65-0.87; Fig. 2a). Correspondingly, high explained variances in individual composting performance were also observed in both treatments (Fig. 2b). Collectively, these results highlight the potential of functional information to serve as a bioindicator of composting performance. The random forest model further suggested that the keystone genes related to individual composting performance were the significantly shifted genes in hTC (top 20, in descending order of mean square error, Fig. 2b and 2c). Specifically, amyA, amyX, glucoamylase, ara, mannanase, cutinase and phenol oxidase, the significantly increased C-degrading genes in hTC (Fig. 1a and 1c), were the most important positive factors influencing composting temperature; by comparison, although several C-degrading genes (amyA, pec_Cdeg, RgaE, cdh, limeh, and chitinase) were significantly correlated with composting temperature in cTC, their predictive power was much lower (Fig. 2b and 2c). Nitrification (amoA) and denitrification (narG, nirK, nirS and nosZ) genes were the most important negative factors influencing NC in both treatments, regardless of whether these genes were significantly decreased in hTC. Notably, ureC (encoding urease, which catalyzes the formation of NH₃ and carbamate), which was significantly more abundant in cTC (Fig. 1b and 1d), was the most powerful negative factor influencing NC. Similar to the NC prediction results, all denitrification genes in cTC were the most important negative factors affecting the N₂O-N emission rate; however, norB (encoding nitric oxide reductase, which catalyzes the formation of N₂O), which had a significantly lower abundance in hTC, had no significant relationship with the N₂O-N emission rate in hTC. Most of the labile C-degrading genes (amyA, amyX, cda, glucoamylase, apu, and ara) with significantly higher abundances in hTC were significant negative factors affecting GI; however, exochitinase, pec_Cdeg, norB, endopolygalacturonase and nirK were the most influential positive factors affecting GI in cTC. These results showed that the significantly (P < 0.05) shifted genes involved in C degradation and N cycling were the keystone genes driving the superior performance of hTC.

Keystone genes affect composting performance in hyperthermophilic composting more strongly than taxonomy does
Consistent with previous studies [7,13], the two composting treatments resulted in different bacterial communities within 44 days of composting, with the most remarkable shift in taxonomic diversity and composition on day 5 (Additional file 1, Figures S5, S6 and S7). We next generated a circular maximum likelihood phylogenetic tree from the most abundant operational taxonomic units (OTUs) (top 102) on day 5 to identify the phylogenetic differences between hTC and cTC communities (Fig. 3a). Strikingly, two clades constituted the majority of feature species that differed between hTC and cTC. Clade I was composed solely of Deinococcus-Thermus, with all 14 OTUs belonging to the genus Thermus and exclusive to hTC. In clade II, there were 28 and 4 OTUs exclusive to cTC and hTC samples, respectively. All OTUs in clade II (39 OTUs) belonged to Firmicutes, including the genera Bacillus, Oceanobacillus, Exiguobacterium, Atopostipes and Pseudogracilibacillus. On the basis of Procrustes and pairwise similarity (Bray-Curtis) analyses, the microbial community composition (based on OTUs) was significantly correlated with functional composition (based on GeoChip 5.0M data) in both treatments (P < 0.05, Additional file 1, Figures S8 and S9). These results indicate that hTC caused significant changes in taxonomic distribution and that these differences determined the distinct microbial functions.
Subsequently, relationships between selected performance indices (temperature, NC, N₂O-N emission rate and GI), taxonomy (genera with relative abundances above 0.1%) and microbial functions (C-degrading and N-cycling genes with relative abundances above 0.1%) were estimated by constructing correlation networks for samples from both hTC and cTC (Fig. 3b). The hTC network consisted of 66 nodes with 173 edges, with 18% positive and 14% negative edges between composting performance indices and microbial function nodes. Meanwhile, only 5% positive and 2% negative edges between composting performance indices and taxonomy nodes were found in the hTC network. In comparison, the cTC network contained 78 nodes linked by 272 edges, including 11% positive and 9% negative edges between composting performance indices and microbial function nodes and 9% positive and 6% negative edges between composting performance indices and taxonomy nodes. Surprisingly, 12 C-degrading genes (amyA, amyX, glucoamylase, ara, mannanase, apu, xyla, exopolygalacturonase, phospholipase_C, cdh, cutinase, and phenol oxidase) were significantly positively correlated with composting temperature in hTC. Meanwhile, most of these significantly correlated genes were the keystone genes identified by random forest (Fig. 2c). Only two genera (Thermus and Planifilum) were significantly correlated with composting temperature in hTC. In contrast, 13 genes involved in C degradation (amyA, pula, apu, pel_Cdeg, RgaE, rgH, cdh, limeh and chitinase) and N cycling (narG, nirS, nosZ and nifH) and nine genera (Georgenia, Bacillus, Bacteroides, Nitrospira, Acinetobacter, Alcaligenes, Thermobispora, Paenibacillus and Sporosarcina) were significantly correlated with composting temperature in cTC. Similar trends were also observed in the relationships between other performance indices, taxa and microbial functions in the two treatments.
These results indicate that the composting performance indices were more closely linked with keystone genes than with taxonomy in hTC.
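The edge percentages quoted for the two networks can be reproduced from an edge list once every node is labelled by type. The following is a minimal sketch under that assumption; the function name, the node labels and the toy edges are hypothetical and are not taken from the study's networks.

```python
from collections import Counter


def edge_shares(node_type, edges):
    """Percentage of all edges in each (node-type pair, sign) class.

    node_type: dict mapping node name -> "performance", "function" or "taxonomy"
    edges: iterable of (node_a, node_b, rho) tuples from a correlation network
    """
    counts = Counter()
    for a, b, rho in edges:
        # Sort the two types so that A-B and B-A land in the same class.
        pair = tuple(sorted((node_type[a], node_type[b])))
        sign = "positive" if rho > 0 else "negative"
        counts[(pair, sign)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: 100.0 * n / total for key, n in counts.items()}


# Hypothetical toy network, for illustration only.
node_type = {
    "temperature": "performance",
    "amyA": "function",
    "narG": "function",
    "Thermus": "taxonomy",
}
edges = [
    ("amyA", "temperature", 0.80),
    ("narG", "temperature", -0.70),
    ("Thermus", "amyA", 0.90),
    ("Thermus", "temperature", 0.75),
]
shares = edge_shares(node_type, edges)
```

Percentages are taken over all edges in the network, so the shares of the performance-function and performance-taxonomy classes can be compared directly within one treatment.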
Partial least squares path modeling (PLS-PM) analysis provided further statistical evidence that the keystone genes influence individual performance indices more than community composition does in hTC (Fig. 4). Our models explained a large share of the variation in all performance indices in both hTC and cTC (in hTC, 74% for composting temperature, 83% for NC, 95% for N₂O-N emission and 79% for GI; in cTC, 82% for composting temperature, 82% for NC, 95% for N₂O emission and 66% for GI). The abundances of both keystone C-degrading and N-cycling genes had stronger direct effects on all selected performance indices than either community composition or other genes did in hTC. However, the abundances of keystone C-degrading and N-cycling genes had relatively weaker effects than community composition on selected performance indices in cTC. Consistent with the network analysis above, we also found stronger relationships between community composition and performance indices in cTC than in hTC. Collectively, these results indicate that the keystone genes involved in C degradation and N cycling were the factors that most influenced the selected performance indices in hTC but not in cTC.

Discussion
Microbial communities have a central role in driving composting performance by mediating the transformation of organic solid waste components, but their roles are poorly understood [1,2,7,34]. Determining the keystone microbial factors related to the superior performance of hTC (improved composting temperature, NC, N₂O-N mitigation and GI) [7,12,14-16] is useful for better understanding the microbially driven mechanisms. In this study, we surveyed the variation in microbial functions and communities and their linkages with composting performance in factory-scale experiments in both hTC and cTC. We identified keystone genes involved in C degradation and N cycling that potentially play a pivotal role in the improved performance of hTC (enhanced composting temperature, NC, N₂O mitigation and GI).
Moreover, the keystone genes are more influential on performance indices than taxonomy is in hTC but not in cTC. These findings highlight the importance of keystone gene abundances in mediating composting temperature, NC, N₂O emissions and GI, providing the possibility of improving performance by regulating microbial functions in composting ecosystems.
Previous work has demonstrated that hTC results in a bacterial community composition distinct from that resulting from cTC [7,15]. Our results go further, showing that hTC markedly shifted both the functional and phylogenetic structures of microbial communities, as indicated by NMDS-based ordination for GeoChip and pyrosequencing data, respectively (Fig. 1, Fig. 3, Additional file 1, Figures S2, S3 and S7). Importantly, C-degrading and N-cycling genes experienced the greatest shifts between the two treatments (Fig. 1) and predicted selected individual performance indices accurately in both treatments (for hTC, adjusted R² = 0.89-0.96; for cTC, adjusted R² = 0.95-0.97; Fig. 2). This supports previous studies in which functional genes, such as those involved in the N cycle, were targeted to obtain insights into various aspects of performance such as the N content, N₂O-N emissions and N conservation [15,16]. Notably, our results showed that hTC not only stimulates C-degrading genes but also suppresses genes involved in N cycling processes such as nitrification, denitrification and N mineralization (Fig. 1). A potential explanation for the increased number of C-degrading genes in hTC may be that microbes have great potential to decompose C substrates under ultrahigh temperatures [35][36][37]. For instance, the amyA gene was significantly enriched in hTC; this gene encodes α-amylase, which exists widely in hyperthermophiles and is the most studied hyperthermostable enzyme [20,21]. The enriched amyA gene might catalyze starch degradation to release more monomeric or oligomeric sugars in hyperthermic conditions [38,39], which could enhance heat production by microbial metabolism and facilitate humic substance formation by the Maillard reaction (condensation between amino compounds and reducing sugars) [40].
All N-cycling genes exhibited lower relative abundance in the hTC treatment, suggesting that hTC hinders some microbial functional groups that can utilize N-containing substrates or causes the loss of certain organisms at elevated composting temperatures [9,41-43]. These results were in accordance with our finding that the relative abundance of the genus Bacillus, a strongly ammonifying taxon [44], was lower in hTC (Additional file 1, Figure S7). This shift may reflect the adaptation of microbial communities in hTC to hyperthermophilic conditions over time [45,46].
We further examined the variations in bacterial abundance and community composition by applying qPCR and high-throughput sequencing of bacterial 16S rRNA genes, respectively. The results showed that the unique hyperthermophilic stage (composting temperature above 80°C) can be sustained for at least 5 days (Additional file 1, Figure S1a), leading to lower bacterial abundance and diversity in hTC than in cTC through selection and environmental adaptation (Additional file 1, Figure S5). Consistently, the phyla Proteobacteria, Firmicutes, and Bacteroidetes dominated in cTC and were replaced with hyperthermophilic taxa (Deinococcus-Thermus and Actinobacteria) during the hyperthermophilic stage (Fig. 3a, Additional file 1, Figure S7). Hyperthermophiles have many properties that make them suitable for organic waste treatment, as they typically have high growth rates, tolerate wide ranges of environmental conditions such as temperature, salinity and pH, and have low nutritional requirements [35,36,47]. For example, the high temperatures at which thermostable C-degrading enzymes in hyperthermophiles can operate allow more substrate to dissolve, which can increase diffusion and mass transfer rates and thus shift the equilibrium [39,48]. Additionally, although a wide variety of hyperthermophiles catalyze exergonic redox reactions involving nitrogenous compounds, nitrification, denitrification and dissimilatory nitrate reduction were strongly inhibited at high temperatures during composting [9,42,49]. The correlations between the microbial community composition (Bray-Curtis distance) and functional structure were further confirmed by Procrustes tests (for hTC, P < 0.05, M² = 0.8026, R = 0.3111, 9999 permutations; for cTC, P < 0.01, M² = 0.70544, R = 0.5428, 9999 permutations; Additional file 1, Figure S8) and pairwise similarity with linear regressions (P < 0.001, similarity was calculated by Bray-Curtis distance, Additional file 1, Figure S9).
Such features might explain the distinct functional patterns in hTC compared with those in cTC.
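The pairwise comparison of taxonomic and functional structure rests on the Bray-Curtis measure. A minimal sketch of that calculation is given below, assuming each sample is represented by a vector of non-negative abundances; the helper names and sample values are hypothetical, and the study itself used R (vegan package) rather than this code.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance profiles."""
    total = sum(a + b for a, b in zip(u, v))
    if total == 0:
        # Two empty profiles are treated as identical here; this is a
        # convention chosen for the sketch, not something from the study.
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def pairwise_dissimilarities(profiles):
    """Return {(sample_i, sample_j): dissimilarity} for every pair of samples."""
    names = sorted(profiles)
    return {
        (i, j): bray_curtis(profiles[i], profiles[j])
        for i, j in combinations(names, 2)
    }


# Hypothetical OTU and gene abundance tables for three samples.
otu_table = {"d0": [6, 7, 4], "d5": [10, 0, 6], "d13": [8, 2, 5]}
gene_table = {"d0": [3, 3, 1], "d5": [5, 1, 1], "d13": [4, 2, 1]}

taxonomic = pairwise_dissimilarities(otu_table)
functional = pairwise_dissimilarities(gene_table)
# Paired (taxonomic, functional) values that a linear regression could be fitted to.
paired = [(taxonomic[key], functional[key]) for key in sorted(taxonomic)]
```

Similarity is simply one minus the dissimilarity, so either scale can be used in the regression as long as both tables use the same one.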
Microbial co-occurrence relationships associated with composting performance indices were resolved in greater detail by constructing correlation networks among microbial taxonomy, functions and selected performance indices for both treatments (Fig. 3b) [50,51]. Our results indicate that the selected composting performance indices have a much closer link with keystone genes than with taxonomy in hTC, whereas in cTC the two kinds of association were comparable. This is not particularly surprising, since the microbial community displayed drastic variations in composition while the function maintained a relatively similar structure throughout hTC (Fig. 3a, Additional file 1, Figure S7) [8,13]. This functional similarity within changing communities may be attributed to strong stoichiometric balancing between multiple metabolic pathways, the majority of which serve to decompose complex organic molecules into simpler molecules [52]. These metabolic pathways have been discovered in nearly every form of microbial life, resulting in metabolic functional structures that are relatively conserved across divergent communities [21]. Furthermore, we found that the complexity of taxonomy-function-performance networks and the numbers of highly connected taxa and genes were significantly decreased in hTC. These results are consistent with those of previous studies showing that the correlations between species and the complexity of the microbial cooperation network of a community are usually simplified under harsh environments such as oil-, mercury- and alkaline tailing-contaminated ecosystems [18,53,54]. More importantly, given that more than half of the edges were identified between microbial taxonomy and function in both treatments, the numbers of connected genes in hTC were lower than those in cTC, while the contribution of function-performance edges to co-occurrence patterns was higher in hTC.
This highlights the important role of keystone genes in influencing performance indices in hTC and suggests that functional redundancy might be reduced in hyperthermophilic communities [55,56]. The results of PLS-PM further indicated that the dominant effects of keystone genes on hTC performance indices were maintained after multiple biotic drivers (community composition and other genes, such as those involved in organic remediation, metal homeostasis, and phosphorus cycling) were simultaneously considered. Previous studies also indicated that the community composition of the macroalga Ulva australis was best explained in terms of functional information rather than taxonomic information, and that the human gut microbiota exhibits a core set of genes despite high taxonomic differences between individuals [57,58]. Similarly, microbial functional attributes were the predominant biotic factors driving soil processes, including soil respiration, denitrification and nitrification [27]. However, although the GeoChip method can provide comprehensive information about microbial functions in C degradation and N cycling, it cannot determine the active portions of communities during composting [20,55,56]. Therefore, determining the transcription related to functions might reveal stronger relationships between composting performance and microbial functions [22,59].
In conclusion, this is the first comprehensive study to demonstrate that functional gene abundances were the most influential biotic factors in composting performance in hTC. More importantly, our study provides a list of keystone genes involved in C degradation and N cycling that potentially play a pivotal role in dominating composting performance, showing that hTC altered bacterial communities and their functions, which led to improved composting temperature, NC, N₂O mitigation and GI. Understanding the functional contributions of microbial communities to composting performance may also allow us to better predict how composts will function in further fertilization. Overall, our study provides evidence for a strong relationship between microbial functions and composting performance and indicates that functional genes, rather than taxonomic information, were the predominant drivers of the improved composting temperature, N₂O mitigation, NC and GI in hTC.

Materials And Methods
Experimental setup and sample collection
The hTC and cTC experiments were conducted according to our previous studies [7,12,14]. Briefly, each pile was constructed as a mixture of 10 t of chicken manure (79.8 wt% moisture) and 1.5 t of rice husk (7.9 wt% moisture). For hTC, approximately 5 t of end-products (35.5 wt% moisture, serving as hyperthermophilic microorganism inoculants) from a previous hTC process were added to the pile. The other pile, serving as a control, was mixed with 5 t of end-products (41.8 wt% moisture) from a previous cTC process. The hTC end-products were purchased from GeoGreen Innotech Co., Ltd., Beijing, China. The composting experiments were conducted for 44 days, and the piles of both treatments were turned and weighed on days 13 and 26. All composting samples were collected from the two treatments on days 0, 5, 13, 23, 37 and 44 according to our previous studies. The physicochemical properties of the raw materials used in this study are provided in the supporting information (Additional file 1, Table S1).

Analyses of composting performance and physicochemical properties
Four composting performance indices, namely composting temperature, NC, N₂O emissions and GI, were determined. The composting temperatures were monitored daily using thermometers (Pt100; Shanghai Chekon Instrument Co. Ltd., China). The NC was calculated by quantifying the remaining compost mass and the total N contents in the two treatments according to a method in a previous study [60]. N₂O emissions were captured by a static closed chamber and measured using a gas chromatograph [61]. The GI was measured for an aqueous extract obtained from fresh compost samples by using Chinese pakchoi (Brassica campestris L. ssp. chinensis Makino) seeds [33]. Physicochemical properties such as pH, water content (WC), electrical conductivity (EC), total nitrogen (TN), NH₄⁺-N, NO₃⁻-N, and NO₂⁻-N content were also measured in the present study. WC was measured after drying in an incubator at 105°C for 24 h. EC and pH were determined using a conductivity meter and a pH meter, respectively. TN was determined with an elemental analyzer (Vario MAX cube, Hanau, Germany). NH₄⁺-N, NO₂⁻-N and NO₃⁻-N concentrations were measured by flow injection analysis (Systea, Italy).
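The exact NC formula follows reference [60] and is not restated here. Purely as an illustration of how remaining mass and TN content can be combined, the sketch below assumes a simple mass balance in which NC is the share of the initial N mass still present in the pile. The function name, the use of dry mass and all input values are assumptions made for the example.

```python
def nitrogen_conservation(dry_mass_start, tn_start, dry_mass_end, tn_end):
    """Percentage of the initial nitrogen mass still present in the pile.

    Assumed mass-balance form (not taken from reference [60]):
        NC (%) = 100 * (dry_mass_end * tn_end) / (dry_mass_start * tn_start)
    Masses must share one unit; TN must be expressed per unit dry mass.
    """
    initial_n = dry_mass_start * tn_start
    if initial_n <= 0:
        raise ValueError("initial nitrogen mass must be positive")
    return 100.0 * (dry_mass_end * tn_end) / initial_n


# Hypothetical values: dry mass in kg, TN in g per kg dry mass.
nc_percent = nitrogen_conservation(1000.0, 20.0, 600.0, 25.0)
```

Because the pile loses mass as organic matter is degraded, TN concentration alone can rise even while N is being lost, which is why the remaining mass enters the calculation.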

DNA extraction and bionomic analysis
Compost DNA was extracted from 0.25 g of each sample in triplicate using a PowerMax Soil DNA Isolation Kit (Qiagen, USA) according to the manufacturer's protocol. A NanoDrop ND-2000 spectrophotometer (Thermo Fisher Scientific, Wilmington, MA) was used to assess DNA quality and concentrations [21]. The purified DNA samples were stored at -80°C until further analysis.

Phylogenetic analysis and network construction
A maximum likelihood phylogenetic tree was constructed based on the OTUs with the top 102 relative abundances in hTC and cTC on day 5, as determined by UPARSE. MEGA 6.05 was used to construct the phylogenetic tree with MUSCLE alignment, the maximum likelihood method and 1000 bootstrap replicates. The final tree was visualized by iTOL [20].
A correlation matrix was constructed by calculating the pairwise correlation coefficients between selected composting performance indices (temperature, NC, N₂O-N emission rate and GI), taxonomy (genera with a relative abundance above 0.1%) and microbial functions (C-degrading and N-cycling genes with a relative abundance above 0.1%). Spearman correlations were calculated using R 3.5.1 according to the methods in previous studies [63]. Only correlations with a correlation coefficient (ρ-value) above 0.6 that were significant (FDR-corrected P < 0.05) were displayed in the networks. The networks were visualized in Gephi (v0.9.2).
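The filtering rule for network edges can be sketched as follows. This is an illustrative Python outline, not the R code used in the study: the function names and example edges are hypothetical, the raw P values are assumed to come from a separate significance test, and the threshold is applied to the absolute value of ρ on the assumption that negative edges were retained in the same way as positive ones.

```python
from math import sqrt


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / sqrt(sxx * syy)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def filter_edges(edges, rho_min=0.6, alpha=0.05):
    """Keep (a, b, rho, p) edges with |rho| > rho_min and adjusted P < alpha."""
    adjusted = bh_adjust([p for _, _, _, p in edges])
    return [
        (a, b, rho, q)
        for (a, b, rho, _), q in zip(edges, adjusted)
        if abs(rho) > rho_min and q < alpha
    ]


# Hypothetical candidate edges: (node, node, rho, raw P value).
candidates = [
    ("amyA", "temperature", 0.90, 0.001),
    ("narG", "NC", -0.70, 0.010),
    ("Bacillus", "GI", 0.30, 0.020),
    ("nirK", "N2O", 0.65, 0.200),
]
kept = filter_edges(candidates)
```

Applying the correction across all candidate pairs before thresholding matters, because the number of pairwise tests grows quadratically with the number of nodes.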

Statistical analysis
Differences in the abundances of genes involved in C degradation and N cycling between the two metagenomes were assessed by pairwise comparisons using two-sided Fisher's exact test with 95% confidence intervals and the Benjamini-Hochberg FDR multiple test correction in STAMP [45]. Linear regressions of random forest predictions against measured values were used to assess prediction accuracy for the composting performance indices; R2 and slope values closer to 1 indicate better models [29]. Correlations and the best random forest model were used to analyze the associations between C-degrading and N-cycling genes and the selected performance indices in the cTC and hTC treatments using the R package randomForest (v.4.6-10), with gene abundances serving as predictors of the selected performance indices [28,64]. Procrustes tests and nonmetric multidimensional scaling (NMDS) analyses (based on the relative abundances of all functional genes) were used to statistically compare variations in microbial functional structures using R (vegan package). Partial least squares path modeling (PLS-PM) was used to explore the contributions of the various biotic factors to each selected performance index (temperature, NC, N2O-N emission rate and GI) using the R package plspm (v 0.4.7) [7]. The model included the following variables: community composition (based on OTUs), keystone C-degrading genes corresponding to each performance index (identified by the random forest model), keystone N-cycling genes corresponding to each performance index (identified by the random forest model) and other genes (GeoChip data, Additional file 2).
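The model-quality check mentioned above (R2 and slope close to 1) can be sketched as an ordinary least squares fit of predicted on measured values. This is a minimal illustration, not the authors' script: it assumes the random forest predictions are already available, that predictions go on the y axis and measurements on the x axis, and it reports the plain rather than the adjusted R2. The function name `fit_predicted_vs_observed` is hypothetical.

```python
from typing import Sequence, Tuple


def fit_predicted_vs_observed(
    observed: Sequence[float],
    predicted: Sequence[float],
) -> Tuple[float, float, float]:
    """Regress predicted on observed; return (slope, intercept, r_squared)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")

    mean_x = sum(observed) / n
    mean_y = sum(predicted) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in observed)
    syy = sum((y - mean_y) ** 2 for y in predicted)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    if sxx == 0 or syy == 0:
        raise ValueError("observed and predicted values must both vary")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single-predictor fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared
```

A model whose predictions track the measurements exactly gives a slope and R2 of 1. A slope well below 1 with a high R2 would suggest predictions that are compressed toward the mean.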

Declarations
Authors' contributions
Peng Cui: designed the study, performed all experiments, and wrote the majority of the manuscript; Chaofan Ai and Zhongbing Xu: analyzed the data; Hanpeng Liao, Zhi Chen, Zhen Yu and Shungui Zhou: participated in the design of the study, provided comments and edited the manuscript.

Figures

Figure 1
Comparison of normalized signal intensities of representative genes involved in N cycling and C degradation with relative abundances above 0.1% during hTC and cTC on day 5 (a, c) and day 44 (b, d), respectively. Red bars represent the average normalized signal intensity of the probes of each gene in hyperthermophilic composting (hTC) samples, and blue bars represent those in conventional thermophilic composting (cTC) samples. The differences in functional gene relative abundances between hTC and cTC samples were tested using two-sided Fisher's exact test with 95% confidence intervals and the Benjamini-Hochberg FDR multiple test correction in STAMP, with red and blue gene names indicating significantly higher abundances in hTC and cTC samples, respectively, at P < 0.05. a and b, Abundance comparison of C-degrading genes on days 5 and 44, respectively. c and d, Abundance comparison of N-cycling genes on days 5 and 44, respectively.

Figure 2
Identification of keystone genes driving the selected composting performances in hyperthermophilic composting (hTC) and conventional thermophilic composting (cTC). a, Predicted performance indices based on random forest regression analyses versus actual values. Models were based on the relative abundances of genes involved in C degradation (45) and N cycling (16). Adjusted R2 and slope values for each linear regression are indicated on the plots. b, Keystone gene identification and performance associations of the C-degrading and N-cycling genes in the hTC and cTC treatments, evaluated by random forest predictions and correlation. The circle size represents the corresponding variable's importance (that is, the decrease in prediction accuracy, estimated with out-of-bag cross-validation). Colors represent Spearman correlations. c, Random forest predictions of the effects of keystone genes (top 20, in descending order of mean squared error) on each composting performance index in hTC and cTC.