The soil microbiome — from metagenomics to metaphenomics

Soil microorganisms carry out important processes, including support of plant growth and cycling of carbon and other nutrients. However, the majority of soil microbes have not yet been isolated and their functions are largely unknown. Although metagenomic sequencing reveals microbial identities and functional gene information, it includes DNA from microbes with vastly varying physiological states. Therefore, metagenomics is only predictive of community functional potential. We posit that the next frontier lies in understanding the metaphenome, the product of the combined genetic potential of the microbiome and available resources. Here we describe examples of opportunities towards gaining understanding of the soil metaphenome.


Introduction
Soil microbial communities carry out key ecosystem services that are vital for life on our planet, including cycling of carbon (C) and other nutrients and sustaining plant growth. Unfortunately, many beneficial functions carried out by the soil microbiome are currently threatened by changing climate and precipitation patterns, soil degradation and poor land management practices [1]. Recently there has been increased interest in manipulating soil microbiomes to restore ecosystem function [2]. Managing ecosystem services and bioprospecting soil microbial metabolism will become possible with a better understanding of how soil microbiomes interact under different conditions. Exploration and management of soil microbiomes remain a daunting task, however, because the majority of soil microbes have not yet been isolated and the molecular details underlying their functions are largely cryptic.
Here we focus on one of the biggest enigmas facing soil microbiologists, namely understanding how soil C is transformed by soil microbes. Ultimately, the soil microbiome, together with plants, determines whether C is released to the atmosphere as CO2 or CH4, or retained in soil [3]. Although molecular interactions between microbial species and their environment strongly influence the fate of soil C, details of these interactions are largely unknown. Unlike other microbial habitats (e.g. gut, water), both microbial communities and substrates in the soil are highly diverse and subject to physical protection and chemical stabilization [4]. Therefore, identifying and measuring the network of active microbial metabolic interactions in soils requires approaches adapted to the heterogeneous soil environment. Advancing soil microbiome research thus depends on identifying dominant heterotrophic pathways for C metabolism and how microbial physiology influences the relative importance of C cycling pathways in response to environmental conditions. Furthermore, because soil is one of the most diverse habitats on the planet, soil microbiomes provide rich opportunities to commandeer metabolic interactions for industrial applications, such as biofuel production, and to mine for novel bioproducts, including new antibiotics [5].

From soil metagenomes to metaphenomes
High throughput sequencing studies have succeeded in illuminating the previously unknown compositions and diversities of soil microbial communities across a variety of soil habitats without the necessity for cultivation [6]. Deep metagenome sequencing has also started to reveal the functional potential of soil communities, for example, genes involved in C cycling [7] and links between community genes and functions [8]. A current challenge is to go beyond predictive understanding of gene function based on the genome/metagenome to an understanding of the actual functions carried out by the soil microbiome in situ. This is especially important in soil environments, where metagenomes include relic DNA extracted from dead and dormant cells [9,10] and DNA that is trapped in biofilms [11]. Even viable, actively growing cells express genes only as needed, and not all genes are expressed at any given time. For example, representatives of the Verrucomicrobia are often present in soil metagenomes, but in a recent study they were shown to have low levels of gene expression based on metatranscriptome data [12]. Therefore, a soil metagenome provides an overview of potential microbial function, and other methods are needed to determine the actual functions that are carried out by viable and active cells under given environmental conditions.
Here we define the metaphenome as the product of expressed functions encoded in microbial genomes (metagenome) and the environment (resources available; spatial, biotic and abiotic constraints). To our knowledge this term has only been reported once previously for microbial communities, when describing metagenomes as genomes of communities with expression through the community 'metaphenome' [13]. The soil metaphenome is dependent on the combined genetic potential encoded by the soil member genomes, the physiological status of the member populations, their access to resources, contact with other organisms and signaling molecules, combined with their genetic capacity to respond to environmental cues. The metaphenome thus encompasses the entire 'omics' field, including the metagenome, metatranscriptome (expressed genes), metaproteome (proteins resulting from translation) and the metabolome (metabolic products) [14]. The soil metaphenome is also ultimately governed by the highly structured soil environment, resulting in a very heterogeneous availability of electron acceptors and redox chemistry, and both strong spatial and temporal variability. Therefore, the soil metaphenome remains a considerable challenge to measure and predict. Here we review the current state of the science, identify knowledge gaps and discuss future opportunities for understanding the soil metaphenome.

Influence of soil structure and connectivity on the soil metaphenome
Although understanding metaphenomes is important and relevant for understanding the functions of microbial communities in a variety of ecosystems, ranging from water to humans to soil, soil presents unique challenges. For example, physical protection of substrates may prevent their utilization by the resident soil microbiome [15]. Soil is also spatially complex with a highly dynamic and patchy distribution of C and other resources that results in distributed hot spots suitable for growth of microbial consortia, for example within microaggregates or the rhizosphere [16,17]. The spatial constraints imposed upon distinct consortia residing in individual soil microaggregates (50-200 µm in diameter) presumably constrain the types of cross-species interactions that can occur in a given soil habitat. Soil aggregates have recently been considered to be analogs of evolutionary incubators for soil microbial life [18]. Because they are isolated and tremendously abundant, soil aggregates can allow for massively parallel evolution of distinct microbial consortia. We propose that this spatial isolation could be one of the contributing factors underlying the high microbial diversity found in most soil habitats [6].
Understanding the fine scale distribution of microbes and resources is required to predict species physiology and metabolic interactions among community members that comprise the collective soil metaphenome. Given that life in soil is concentrated within micro-spatial 'islands', soil microbes have evolved to interact with each other through a variety of mechanisms that deal with spatial constraints [19]. Currently, broad scale process measurements, such as soil respiration, mask details of the molecular reactions carried out by interacting members in discrete, spatially isolated soil consortia (Figure 1). For example, as soils become drier, microbial dispersion becomes more limited and microbial life more constrained within physically protected soil pores (Figure 1). Little is known about how the resulting subpopulations or consortia distribute metabolic functions among themselves or regulate/signal other populations in response to changing environmental conditions, and how this relates to the soil metaphenome.

Influence of physiological status on the soil metaphenome
The collection of physiological responses of individual microorganisms to the environment results in a community metaphenomic response, including genetic regulation and cell-cell interactions, that determines which community genes are expressed in response to resource availability. Depending on the step in the chain of expression, different information is obtained about the physiological status of the soil microbiome. Metatranscriptomics captures transient responses to environmental conditions, whereas metaproteomics provides a more stable indication of the overall state of the environment [20]. The responses of individual soil microbes to changes in environmental conditions are highly regulated at the genetic level, resulting in a range of physiological alterations, including shifts in the fatty acids making up the cell membrane, production of specific proteins (e.g. heat shock or cold shock proteins), and reduction in respiratory activity [21]. In addition, many metabolic pathways are regulated so that the genes are only transcribed when needed. Sequencing of genes by metagenomics will include genes that are not being transcribed and therefore not translated into proteins. Should conditions change, other genes will be induced, transcribed and translated. For example, prolonged survival under subzero conditions results in a range of physiological responses across community members [21,22]. Some microbes accumulate C storage reserves and osmolytes as a resource to stay viable over extended periods of low nutrient conditions or low soil moisture levels. Another adaptive response is to decrease genome copy numbers to cope with low nutrients, as observed during a long-term soil warming experiment [23]. In general, as microbes enter a state of low activity or dormancy, their contribution to the metaphenome decreases relative to those that maintain an active metabolic state.

Influence of microbial community interactions on the soil metaphenome
At the community level the soil metaphenome includes the combined metabolic outputs of the community members. An example is cellulose degradation. Cellulose is a complex polymer that is degraded by different microorganisms with complementary metabolic traits. For example, some microbes possess glycoside hydrolases, others have transporters, and so on [24]. Different microbial community interactions might affect the fitness of microbes that have all of the enzymes to degrade cellulose to cellobiose to glucose, while others must compete for products of exocellulase enzymes [25]. However, details of soil microbial metabolic interactions during degradation of cellulose and other C compounds, as well as the myriad other soil processes that result in a given metaphenome, are not well known. Even with increasing access to soil metagenomes, we are still challenged to understand how the physiology of the interacting organisms responds to environmental conditions to define the response surface (the possible metaphenomes) given the genetic potential and range of environmental conditions (temperature, moisture) for a given ecosystem. Interpreting bulk dynamics of microbial genes and gene-products involved in the resulting soil metaphenome thus remains challenging due to high diversity and complexity that confound our ability to link mechanistic details to emergent properties.
Ecological network theory has been employed to predict species interactions and the stability of simple microbial communities [26]. However, the concept of interacting microbial networks has been difficult to test in complex, heterogeneous soil ecosystems. Nutritional interactions in soil involve interconnected metabolic webs between species and across kingdoms [27], with cross-feeding and metabolite exchange between species. The types of interactions range from metabolic cooperation between microbes in syntrophic relationships, to competition for access to limiting nutrients (Figure 2). Soil microbes communicate with each other and their environment through a variety of chemical signals [28]. Few studies, however, have determined specific metabolic and signaling interactions between members of soil microbial communities [29], including interactions across trophic levels. This knowledge is important because soil is home to highly diverse and complex communities of organisms, including bacteria, archaea, fungi, viruses [30] and higher organisms (plants, insects, protozoa, and so on) [31]. Together these different soil organisms interact in trophic food webs to break down complex organic compounds and exchange nutrients. For soil microbiome research to advance, innovative approaches are required to reveal the details underlying the myriad of interactions carried out by naturally complex soil microbiomes, and the interplay between and within different kingdoms that results in the soil metaphenome.

Current Opinion in Microbiology
Illustrative overview of biotic and environmental factors contributing to the soil metaphenome. A cross section of a field is shown with different soil moisture levels. On the right side, plant growth is constrained due to low soil moisture levels. An example of a measurable phenotype is shown (CO2, corresponding to soil respiration), which is the result of combined metabolic interactions between soil microbes and plants. Call out circles correspond to a microscale view of soil consortia residing in spatially discrete soil aggregates. Connectivity between consortia is determined by the extent of the pore volume that is water filled and available for diffusion of chemical signals and metabolites. Bacterial (purple symbols) interactions within consortia are designated with white arrows. Fungal hyphae (green filaments) may bridge spatially discrete consortia. Soil viruses (orange symbols) also play an as yet undefined role in regulating the soil metaphenome. Lower panel illustrates different types of models applicable to defining the soil metaphenome; from left to right: biochemical reaction networks (squares correspond to bacterial (purple) or fungal (green) metabolites), interspecies interaction networks, and interkingdom interactions [26,36].

Opportunities for the future

Untangling the intricate web of metabolic interdependencies
Understanding the complexity of all metabolic interactions that result in measured phenotypes within a single organism is not yet achievable. However, recent advances in genome-enabled predictions, fluxomics (determining rates of metabolic reactions), and modeling have made great strides towards deciphering specific metabolic pathways in single bacteria [32]. In particular, flux balance analysis (FBA) is a promising approach that uses metabolic models to predict phenotypic responses of microorganisms to different environmental conditions [33]. A genome-enabled approach for prediction of metabolic interdependencies between soil community members is theoretically possible in the near future. For example, soil metagenomes can be explored for genes encoding specific bioactive compounds [34], or specific functional genes [35] based on their distinct sequence signatures. Eventually reconstruction of biochemical reaction networks should be possible from annotated metagenomes, combined with stable isotope-based fluxomics and metabolic modeling [36]. When predicted metabolic pathways are incomplete, due to missing or misannotated genome sequence data, gap-filling can be employed to add the missing steps [33]. Gap-filling can be aided by knowledge of the metabolite composition of soil, using sensitive mass spectrometry platforms to predict the identities of the metabolites. This, combined with new computational approaches, has great promise to increase the number of soil metabolite assignments in databases, as recently demonstrated for human metabolites [37].
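The flux balance idea mentioned above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four-reaction toy network, its bounds and all names are hypothetical and are not taken from any cited study. Fluxes v are chosen to maximise a biomass reaction subject to steady state (S v = 0) and flux bounds. Real FBA tools use a linear programming solver on genome-scale models; here a brute-force enumeration of the vertices of the feasible region is swapped in, which only works for very small networks.

```python
from fractions import Fraction
from itertools import combinations, product


def solve_square(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination; None if singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [x / pv for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


def toy_fba(S, bounds, objective):
    """Maximise objective . v subject to S v = 0 and lb <= v <= ub.

    Enumerates candidate vertices: n - m fluxes are pinned to a bound and
    the remaining m fluxes are solved from the steady-state equations.
    """
    m, n = len(S), len(S[0])
    best = None
    for fixed in combinations(range(n), n - m):
        free = [j for j in range(n) if j not in fixed]
        for values in product(*(bounds[j] for j in fixed)):
            # Move the pinned fluxes to the right-hand side.
            rhs = [-sum(S[i][j] * val for j, val in zip(fixed, values))
                   for i in range(m)]
            A = [[S[i][j] for j in free] for i in range(m)]
            sol = solve_square(A, rhs)
            if sol is None:
                continue
            v = [Fraction(0)] * n
            for j, val in zip(fixed, values):
                v[j] = Fraction(val)
            for j, val in zip(free, sol):
                v[j] = val
            if all(bounds[j][0] <= v[j] <= bounds[j][1] for j in range(n)):
                score = sum(c * x for c, x in zip(objective, v))
                if best is None or score > best[0]:
                    best = (score, v)
    return best


# Hypothetical toy network. Rows: metabolites A, B. Columns: reactions
# uptake (-> A), conversion (A -> B), overflow (A ->), biomass (B ->).
S = [
    [1, -1, -1, 0],
    [0, 1, 0, -1],
]
BOUNDS = [(0, 10), (0, 8), (0, 100), (0, 100)]  # (lower, upper) per reaction
OBJECTIVE = [0, 0, 0, 1]  # maximise the biomass flux

score, fluxes = toy_fba(S, BOUNDS, OBJECTIVE)
print("biomass flux:", float(score))

# A different "environment": substrate uptake limited to 5 units.
limited_score, _ = toy_fba(S, [(0, 5)] + BOUNDS[1:], OBJECTIVE)
print("biomass flux with limited uptake:", float(limited_score))
```

Changing only the uptake bound changes the predicted biomass flux, which is the sense in which a metabolic model predicts a phenotypic response to an environmental condition. The optimal flux vector need not be unique, so only the objective value should be read as the prediction.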
An attractive means to achieve increased resolution of metabolic dependencies is through dissection of complex soil microbiomes into discrete functional units, or guilds, responsible for a specific phenotype (cellulose decomposition, methanogenesis, sulfate reduction, and so on). This could be accomplished by combining soil isolates with known metabolic capabilities into 'synthetic' communities [38], or by selective enrichment in liquid media containing specific resource combinations. In both cases, the simplified communities should facilitate determination of metabolic exchange between species having specific metabolic capabilities [39,40]. However, a synthetic community that is built up from isolates loses the naturally adapted interactions between community members. Also, enrichment cultures in liquid medium do not include the spatial constraints inherent to soil microbes that predominantly reside in soil microaggregates [38]. Therefore, a future opportunity for identifying community phenotypes would be to construct naturally evolved and tractable model soil consortia in a soil environment. This would allow the study of metabolic and spatial interactions, and chemical signaling between microorganisms in their natural habitat.
Microfluidics has great potential to enable experimental manipulation of the soil microbiome, with precise spatiotemporal control, to determine mechanisms underpinning specific microbial metabolic interactions and to understand and predict the influence of environmental gradients on specific microbial functions [41]. Microfluidics also lends itself to imaging in conjunction with concurrent developments in high resolution imaging platforms. For soil, it is particularly interesting to incorporate spatial heterogeneity into microfluidics, to approximate the complexity of the soil environment. We envision a future where specific interactions between soil microorganisms are visualized in a heterogeneous spatial context and at a micro-relevant scale using microfluidics and imaging. Combined with new modeling approaches, it should be possible to predict self-assembly of multispecies microbial communities on different rough surfaces mimicking a soil environment [42].

Interpreting the soil metaphenome
Functions carried out by members of the soil microbiome that result in the soil metaphenome can now be assessed using techniques such as stable isotope probing (SIP) and multi-omics approaches. For example, stable isotopes can be used to track how specific nutrients are metabolized by interacting members of soil microbiomes and across trophic levels. In one study 13C-SIP was used to determine succession over time during the assimilation of 13C-labeled xylose, suggesting that labile C traveled through different trophic levels [43]. By contrast there were fewer changes in the phylogenetic composition of cellulose degraders over time, and they corresponded mainly to abundant, but poorly characterized members of the soil microbiome.
Recent applications of metatranscriptomics [44], metaproteomics [22] and metabolomics [45] are also helping to fill gaps in our knowledge about genes that are expressed and/or translated into proteins, and metabolic interactions that are possible under a given soil resource regime.
Although there have been some recent advances in use of a multi-omics approach to decipher soil microbial community functions [22], there remain significant challenges to overcome, including functional gene annotation, and extraction and identification of biomolecules (metabolites and proteins). Future advances in mass spectrometry technologies that will facilitate higher throughput, yield and depth of coverage of proteins have enormous potential to open the current bottleneck in soil proteomics [12,14]. Also, as the depth of metagenome sequencing continues to increase, we are seeing much better metagenome assemblies, in particular when combined with long-read sequencing technologies [12]. Along with better assembled metagenomes comes a higher number of complete to near-complete genome bins from soil [12]. A future opportunity thus lies in the use of genome bins as databases for searching soil metaproteomes [46]. The advantage of using genome bins is similar to that of using isolate genomes; that is, entire operons with genes encoding complete pathways and genes encoding regulatory mechanisms are intact, and phylogeny can be coupled to function because the entire 16S rRNA gene is on the genome together with the functional genes. By application of comparative genomics approaches it should be possible to predict phenotypes directly from the genome bins [47] without the necessity to cultivate the represented species. Expression data (transcripts and/or proteins) can then be mapped to the binned genomes to determine their phenotypes [46]. The collective binned genomes, with expression data, have the potential to illuminate details of the soil metaphenome.
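As a rough illustration of mapping expression data onto binned genomes, the sketch below summarises, per genome bin, how many of its genes have transcript support. All identifiers, counts and the read threshold are hypothetical placeholders, not data from any study; in practice the gene-to-bin table would come from a binning tool and the counts from read mapping.

```python
from collections import defaultdict

# Hypothetical inputs: which bin each predicted gene belongs to, and how
# many metatranscriptome reads mapped to each gene.
GENE_TO_BIN = {
    "gene_001": "bin_A",
    "gene_002": "bin_A",
    "gene_003": "bin_A",
    "gene_004": "bin_B",
    "gene_005": "bin_B",
}
TRANSCRIPT_COUNTS = {
    "gene_001": 120,
    "gene_002": 0,
    "gene_003": 35,
    "gene_004": 0,
    "gene_005": 2,
}


def summarise_bins(gene_to_bin, counts, min_reads=5):
    """Count genes, expressed genes and mapped reads for every genome bin.

    A gene is called expressed when it has at least `min_reads` mapped
    reads; the threshold is an arbitrary illustrative choice.
    """
    totals = defaultdict(lambda: {"genes": 0, "expressed": 0, "reads": 0})
    for gene, bin_id in gene_to_bin.items():
        reads = counts.get(gene, 0)
        entry = totals[bin_id]
        entry["genes"] += 1
        entry["reads"] += reads
        if reads >= min_reads:
            entry["expressed"] += 1
    return {
        bin_id: {**entry, "fraction_expressed": entry["expressed"] / entry["genes"]}
        for bin_id, entry in totals.items()
    }


summary = summarise_bins(GENE_TO_BIN, TRANSCRIPT_COUNTS)
for bin_id, stats in sorted(summary.items()):
    print(bin_id, stats)
```

In this toy data, bin_B is present in the metagenome but has essentially no transcript support, which mirrors the distinction drawn in the text between genetic potential and expressed function.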

Conclusions
Individual microbial phenotypes, including the combined metabolic outputs of the community members, together generate the higher scale outcomes of the soil metaphenome. Interpreting bulk dynamics of microbial genes and gene-products involved in the soil metaphenome amidst the high diversity and complexity of soil microbiomes requires linking mechanistic details to emergent properties. Advances in genome-enabled predictions, fluxomics, and modeling, combined with metabolomics, SIP and imaging technologies, have great promise to identify and track the exchange of signaling molecules and metabolites among soil organisms, which will enable transitioning from the metagenome to the metaphenome. This knowledge is important for prediction of the impacts of environmental perturbations on key functions carried out by the soil microbiome and will enable development of new approaches for optimizing soil carbon cycling, managing nutrient transport, and sustaining crop production.

24. Lopez-Mondejar R, Zuhlke D, Becher D, Riedel K, Baldrian P: Cellulose and hemicellulose decomposition by forest soil bacteria proceeds by the action of structurally variable enzymatic systems. Sci Rep 2016, 6:25279. Identified several genes and proteins from forest soil bacterial isolates that are involved in cellulose degradation, including catabolic enzymes and transporters.

26. Coyte KZ, Schluter J, Foster KR: The ecology of the microbiome: networks, competition and stability. Science 2015, 350:663-666. Used models to show that cooperation between community members in host-associated microbiomes reduces community stability, whereas competition stabilized communities.