The evolution of the metabolic network over long timelines

Metabolism is executed by an efficient, interconnected and ancient biochemical system, the metabolic network. Its evolutionary origins are, however, barely understood. Here we argue that, because of niche adaptation, the evolutionary selection acting on the structure of the metabolic network differs between modern species and early life forms. Yet its basic structure has remained conserved over more than three billion years of divergent evolution. We speculate that this situation attributes key roles in metabolic network evolution to (i) the reaction properties of central metabolites, (ii) simple catalysts (e.g. metal ions, amino acids) whose importance remained unchanged during evolution, and (iii) the interconnectivity of the network, which limits its expansion. The conservation of network structure hence implies that early life forms already used metabolic reaction topologies similar to those of modern species.


Introduction
Despite their huge diversity, all forms of life are essentially built from a structurally limited set of metabolites. Amino acids, nucleotides, fatty acids, and many coenzymes are used universally. There is more variability in the biomolecules that build cell membranes and cell walls, but these, too, are made from a conserved set of precursor metabolites that originate in what is known as central metabolism, i.e. glycolysis (the Embden-Meyerhof-Parnas pathway or related glucose oxidation pathways), the pentose phosphate pathway, and the tricarboxylic acid cycle. Importantly, not only the metabolites are conserved between the kingdoms of life, but also most of the metabolic pathways that synthesize them. Metabolic pathways often possess a striking similarity in their topological structure, that is, in their biochemical reaction sequences. Interestingly, there are examples showing that the reaction sequences and metabolites can be older than the enzymes that catalyze them. For instance, Eubacteria and Archaea use glycolytic pathways that are made of similar intermediates, yet their metabolic enzymes share no sequence conservation [1]. Hence, either the two kingdoms evolved the same pathway independently, or one or both lineages have over time replaced older enzymes with modern ones. In either case, at least three billion years of divergent evolutionary pressure did not fundamentally alter many of the core metabolic reaction topologies. Here, we debate key evolutionary and biochemical constraints acting on cellular metabolism that differ, or remain constant, over these extreme evolutionary periods (Figure 1). We conclude that early life forms must have been using metabolism-like biochemical reaction sequences from very early on.
that the metabolic network is at least 3.7 billion years old, as old as the earliest fossil records. One school of thought suggests that the metabolic network could have originated in specialized extreme environments, such as hydrothermal vents, as they offer a constant supply of reducing electrons and matter to feed the emerging primitive forms of life [3]. The other possibility is that life emerged in ambient environments, such as those amenable to mesophiles, which have given rise to by far the most sophisticated life forms on Earth, including us humans. In favor of the latter, cells require a high degree of sophisticated and specialized enzyme structures to grow and survive in extreme environments. Moreover, the metabolic network contains several central metabolites, like 3-phosphoglycerate, that decompose within seconds at high temperature [4]. However, whatever the origin scenario, the earliest life forms had to adapt their metabolism to the environmental constraints in order to survive and to colonize the different niches.
The reactions that form the metabolic network have been selected out of the space of thermodynamically plausible reactions (Figure 2). However, thermodynamics alone does not suffice to explain the metabolic network structure. That a reaction is thermodynamically possible does not mean that it proceeds at an appreciable rate. Moreover, metabolism exploits only a small fraction of the thermodynamically possible space. For example, many of the Maillard reactions between amino acids and glycosides, which could be catalyzed to occur under ordinary conditions, are not part of the metabolic network [5]. Indeed, evolution does not change the thermodynamic properties of the metabolites it converts, but it can select an enzyme that lowers the activation energy of a specific interconversion and, in this way, define the structure of the metabolic network (Figure 2).
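The distinction drawn above between thermodynamic feasibility and reaction rate can be made explicit with standard textbook relations. The following is only an illustrative sketch: it assumes simple Arrhenius behavior with an unchanged pre-exponential factor A, and the 20 kJ/mol reduction in activation energy is a hypothetical value, not one taken from the cited studies.

```latex
\begin{align}
\Delta G &= \Delta H - T\,\Delta S < 0
  && \text{(reaction is thermodynamically possible)} \\
k &= A\,\exp\!\left(-\frac{E_a}{RT}\right)
  && \text{(rate constant, Arrhenius form)} \\
\frac{k_{\mathrm{catalyzed}}}{k_{\mathrm{uncatalyzed}}}
  &= \exp\!\left(\frac{E_a^{\mathrm{uncatalyzed}} - E_a^{\mathrm{catalyzed}}}{RT}\right)
  && \text{(same } A \text{ assumed)} \\
  &= \exp\!\left(\frac{20\ \mathrm{kJ\,mol^{-1}}}{2.48\ \mathrm{kJ\,mol^{-1}}}\right)
  \approx 3 \times 10^{3}
  && \text{(at } T = 298\ \mathrm{K)}
\end{align}
```

Under these assumptions the catalyst changes only the rate: the free energy difference between substrates and products, and hence the position of the equilibrium, stays the same.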
Up to 40% of modern enzymes bind a metal ion for their catalytic function [6,7]. As metal ions were prebiotically available, it is reasonable to assume that early forms of metabolism also exploited metal ions as catalysts or cofactors. Geochemistry has elaborated environmental constraints that apply to both the extreme and the mundane prebiotic niches. These constraints, that is, the chemicals available in the environment, would have been the earliest and most significant among those that shaped early metabolic structure, as they constrained the evolution of the first catalysts and their substrates.
Environmental conditions on primitive Earth comprised principally CO2, N2, H2O and CO [8]. This dysoxic environment acted as an inert atmosphere, favoring high concentrations of water-soluble Fe(II), by far the most abundant transition metal in Archean sediment [9,10]. Iron in various forms has been reported to drive many abiotic reactions reminiscent of metabolic reactions, including those found within glycolysis, the Krebs cycle, and the pentose phosphate pathway [4,11,12]. Some of these reactions are pivotal to the topological organization of the metabolic network, indicating that Fe(II) could have been one of the earliest tools in metabolism's catalyst inventory. Indeed, the omnipresence of ferrous iron would have made it difficult for early life forms to escape iron-driven reactions. Even in the oxygenated environment of today, chelators are required to deprive environmental water of life-essential iron concentrations. Hence it is plausible that early forms of metabolism were exposed to the reaction spectrum this metal catalyst provides.
Figure 1. The evolution of the metabolic network structure as expressed by complexity and time. The formation of the early network is driven by simple chemical constraints involving the reaction properties of the early intermediates, metal ions and amino acids, and later by more complex constraints involving the first enzymes acting as catalysts. The expansion of the network is further driven by increasing ecological competition, which stimulates the evolution of enzymes, enzyme complexes and the regulation of metabolic fluxes. Increased competition and network expansion distinguish the evolution of metabolism in early and modern-day metabolic networks. Symbols denote metabolites (chemical compounds), metabolic reactions (metabolite turnover) and regulation (inhibition, ⊥).
It is reasonable to assume that many environments of the primordial Earth also contained a spectrum of molecules that do not take part in essential reactions of the modern metabolic network. HCN and formaldehyde, for instance, show interesting reaction properties and are, hence, discussed in the context of the origins of life [8]. Both molecules, however, react indiscriminately with several metabolic intermediates. They hence disrupt the metabolic network structure and are toxic to metabolism even in the most sophisticated cells. Most likely, these molecules were also toxic to simpler forms of cells, which had far fewer defense mechanisms at their disposal than modern cells [13,14]. We conclude that the fact that a metabolite was prebiotically available does not necessarily mean it was involved in metabolic evolution. As in modern metabolism, a key constraint is that a molecule's reaction properties need to be compatible with the basic functional principles of metabolism, which center around its topological organization.
Early metabolic reactions could have occurred either via redox and rearrangement reactions of different inorganic species with a carbon source, or been driven by processes such as mineral serpentinization or photolysis of water, which generate the necessary potentials to reduce carbon [15-18]. Because the recently discovered iron-driven non-enzymatic reaction networks resemble metabolic network topologies [4], there is increasing support from data for a hypothesis in which the earliest life forms contained chemical reactions that resemble many extant biological pathways. This would indicate that the basic structure of the metabolic network is fundamentally driven by chemical rules that were already in place before life formed, that is, constrained by the availability of the aforementioned metal catalysts and the reaction properties of the key metabolites [19-21]. Importantly, a theory that roots the origin of the metabolic network structure in environmental non-enzymatic reaction sequences does not exclude the possibility that at least some metabolites formed in extreme environments, which can give rise to several biomolecules [22-24], were incorporated into the developing metabolic network as well. This scenario entails that these molecules reached sufficient concentrations in environments mundane enough for a metabolic network that contains many low-abundance and reactive intermediates to function and evolve.
Genetics, which eventually selects enzymes, could only emerge and thrive once a set of biomolecules diverse enough to initiate a Darwinian process could be formed. The set of metabolites essential for Darwinian selection includes amino acids for protein biosynthesis, nucleotides including ATP for forming nucleic acids and for the cyclic energy transfer that drives both transcription and translation, and lipids or at least fatty acids to build a cell membrane. We can infer that the early selection pressure on the metabolic network was (i) to form these molecules in sufficient quantity and with sufficient specificity, and (ii) to be able to degrade these molecules to maintain homeostasis, gain efficiency, and escape thermodynamic equilibrium. In addition, metabolism functions far from equilibrium with its chemical environment, which highlights the need for compartmentalization. Indeed, early metabolism could not form in a chemical environment that only builds up metabolites but does not provide the ability to metabolize them.
Figure 2. Constraints acting on the evolution of metabolic networks. The space of thermodynamically plausible reactions is large, but only a small fraction is exploited in the expansion of the metabolic network. Besides selection on the available thermodynamically plausible reactions, metabolites (indicated with circles) produced in metabolic reactions can also prevent other reactions or pathways from occurring by inhibiting enzyme catalysis (indicated by ⊥), and hence become another constraint in the evolution of the metabolic network.
For a long time it was considered that an 'RNA world' could also give rise to a metabolic network, in the sense that early metabolic reactions could have been catalyzed by ribozymes. However, there are by now substantial doubts about an RNA-based origin of metabolic catalysis. First, despite the sequencing of thousands of genomes, there is still no evidence for a considerable number of ribozymes catalyzing central metabolic reactions, indicating that ribozymes are not efficient in catalyzing the reactions found in central metabolism. Second, Ockham's razor applies: because many metabolism-like chemical reactions can be catalyzed by the (prebiotically available) amino acids and metal ions, there was no selective advantage for a much more complex ribozyme to evolve to catalyze similar reactions. Indeed, in modern cells, the vast majority of metabolic catalysis can be attributed to the reaction properties of amino acids, nucleotides, and metal ions, while polymeric ribonucleotides are absent from central metabolism.

Biological complexity gives rise to new constraints in the evolution of metabolism
A key step in the evolution of modern metabolism was, however, the appearance of new constraints with the emergence of biological complexity. Despite all the restrictions on substrate availability, early life had several hundred million years without the degree of ecological competition that modern cells are exposed to. This means that one of the key constraints of modern metabolism was less important to early life: the speed of the biochemical reactions and of substrate uptake. Early forms of the metabolic network could very well have contained much slower reactions than modern networks, provided they supplied or converted the metabolites with sufficient specificity, or with a specificity that provided sufficient selective advantage. The prime example of an enzymatic reaction in which the selective advantage outweighs a slow reaction rate and low product yields is that of ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO). Despite catalyzing one of the slowest biochemical reactions discovered so far, RuBisCO is the most abundant enzyme on Earth. It provides an autocatalytic route for fixing carbon dioxide into carbohydrates, bypassing gluconeogenesis for sugar synthesis. It is plausible that many enzymatic reactions started with slow reaction rates and low product yields. However, it is equally plausible that evolvable enzyme structures that facilitate faster reaction rates became increasingly necessary with mounting chemical and, eventually, ecological competition (Figure 1).
Computational simulation studies have shown that the organizational transition from a chemical reaction system to one that is controlled by a polymeric genetic material is plausible, although a highly simplified 'toy version' of quantum chemistry was used. Similar approaches have demonstrated that a small set of starting components, such as one might find on the early Earth, could populate large areas of the modern-day biochemical metabolic network [25-27]. In vitro, however, the scope of the prebiotic reaction networks reported so far is limited to just a few products, representing only a fraction of the plausible molecules that can be generated. A key missing aspect is control over reactivity. Without such control, flux is channeled into nonrecoverable products or trapped as metastable intermediates [28]. As many of the reactions generating these intermediates exist by virtue of the laws of chemistry, a major bottleneck would have been the rise of mutually compatible, stable catalytic structures able to manipulate different proto-metabolic reactions so as to sustain them. This brings us to the next constraint: the need for complex enzyme structures.
With the prebiotic availability of amino acids and metal ions, it seems plausible that the early metabolic enzymes were made from the very same components as modern enzymes, that is, amino acids, and that nucleotides, small organic molecules, and metal ions were sequestered as cofactors for reactions that cannot be catalyzed by the amino acid side chains themselves. These enzymes integrated into the non-enzymatic networks and, with increasing competition, provided an advantage by accelerating them (Figure 1). In a recent communication, Noda-Garcia et al. [29] introduced an integrated metabolite-enzyme coevolution model that attempts to explain how associations between emerging catalytic polymers and the existing proto-metabolic network arose. A non-enzymatic reaction, or a side product formed through the promiscuous reactivity of an existing enzyme, provides a context for evolutionary innovation. Under these circumstances, a step-wise evolution of enzymes and pathways is possible, in contrast to requiring multiple enzymes to develop simultaneously for a pathway to emerge. Such an expansion of the core network requires a high level of compatibility of the newly evolved reactions or products with the existing network: it requires sufficient substrate allocation to produce the metabolite of interest in quantities that elicit an advantageous phenotype, as well as stability against the numerous reactive species that are already part of the metabolic network. Indeed, interactions between metabolites formed as part of the metabolic network, and their potential to inhibit unrelated enzymatic reactions because of structural similarity with the individual substrates, have been studied recently [30]. It was found that inhibitory interactions by unrelated metabolites affect most enzymes, and that compartmentalization mitigates this problem in the metabolism of eukaryotic cells.
In other words, unspecific inhibitory interactions limit the complexity, and hence the expansion, of the metabolic network (Figure 2). Therefore, despite thermodynamics allowing a larger chemical space that could be the basis for a much larger set of metabolic pathways, metabolic networks are biochemically constrained in their expansion due to their intrinsic functional properties.

Niche adaptation and the continued evolution of the modern metabolic network
The process of adjusting cellular networks, such as metabolism, in response to environmental changes is known as niche adaptation and is stimulated by increasing ecological competition. Selective advantages such as the ability to use nutrients and to achieve faster metabolic rates drive this process. Thus, niche adaptation can serve as a framework to understand how networks expand, and it is used herein as a model to illustrate how the evolution of modern metabolism is constrained. Recent advances in systems biology have contributed to the predictability of the underlying evolutionary trajectories of niche adaptation. Here, we consider two trajectories of metabolic network expansion. One involves enzyme promiscuity, also referred to as underground metabolism; in the other, horizontal gene transfer and pre-adaptation serve as stepping stones to adapt to novel environments.
Besides the catalytic activities for which enzymes have evolved, referred to as native activities, enzymes frequently display the ability to catalyze the turnover of non-native substrates with low catalytic efficiency, known as underground reactions [31-33]. Typically, these products are also formed in non-enzymatic reactions that involve metabolites; these can be either spontaneous or, again, catalyzed by metabolism-typical metal ions or other biological small molecules [5]. The underground activities provide the (genetic) basis to add to the catalytic properties of environmental molecules and increase the repertoire of possible biochemical reactions, so that novel metabolic pathways can be created by patching together existing pathways [34]. The part of metabolism involving underground reactions that patch together existing pathways is thus known as underground metabolism. The novel pathways become selective traits once they enable the utilization of new nutrients [35]. Recently, an underground metabolic network of Escherichia coli has been reconstructed, and by using computational and experimental systems biology, it has been demonstrated that underground metabolic reactions are not only prevalent but also facilitate niche adaptation [36]. The first reconstructed E. coli metabolic network that includes underground metabolism consists of 260 experimentally verified underground reactions, of which half completely fit into the native metabolic network, that is, both the substrate and the product of the metabolic reaction are also present in the native network. Genome-scale metabolic modeling has predicted that underground metabolism enables or enhances growth in fifty environments [36]. Interestingly, these predictions agree with high-throughput gene overexpression experiments, illustrating the predictive power of the metabolic model.
Since gene overexpression does not prove the evolutionary capability of underground metabolism, adaptive laboratory evolution has recently been conducted [37]. The study revealed that E. coli can adapt quickly to the predicted novel environments when the carbon source is gradually changed from a preferred one to a novel one (in approximately twenty generations). In line with the observed short-term adaptation, genome sequencing revealed that only a few novel causal mutations lead to adaptation and that these mutations affect either substrate binding or the regulation of gene expression of the enzymes with the predicted underground activity [37]. Even though causal mutations directly affected the underground activity in this study, this finding cannot be generalized [38]. Besides facilitating growth on new substrates, underground metabolism also provides a selective advantage in coping with detrimental mutations in existing pathways [39,40]. Taken together, these recent studies not only demonstrate that underground metabolism plays an important role in niche adaptation, which, in turn, could trigger further niche expansion [41], but also show that adaptation is predictable with systems biology.
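The criterion used above for an underground reaction that 'completely fits' into the native network (both substrate and product already present) can be illustrated with a minimal sketch. The metabolite and reaction names below are hypothetical placeholders, not entries from the reconstructed E. coli network, and the set-based representation is an assumption made for illustration only.

```python
# Toy illustration only: metabolite and reaction names are hypothetical,
# not taken from the reconstructed E. coli network.
native_metabolites = {"A", "B", "C", "D"}

# Each underground reaction maps a set of substrates to a set of products.
underground_reactions = {
    "u1": ({"A"}, {"C"}),       # substrate and product both native
    "u2": ({"B"}, {"X"}),       # product X is not a native metabolite
    "u3": ({"Y"}, {"D"}),       # substrate Y is not a native metabolite
    "u4": ({"C", "D"}, {"B"}),  # all participants native
}


def fits_native_network(substrates, products, native):
    """True if every substrate and product already exists in the native network."""
    return substrates <= native and products <= native


def split_by_fit(reactions, native):
    """Partition reaction ids into fully embedded ones and ones that
    involve at least one non-native metabolite."""
    embedded, extending = [], []
    for rid, (subs, prods) in sorted(reactions.items()):
        if fits_native_network(subs, prods, native):
            embedded.append(rid)
        else:
            extending.append(rid)
    return embedded, extending


embedded, extending = split_by_fit(underground_reactions, native_metabolites)
print("fully embedded:", embedded)
print("involve non-native metabolites:", extending)
```

In this toy network, half of the underground reactions are fully embedded, which mirrors the proportion reported for the reconstructed network only by construction.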
Niche adaptation involves not only strategies to use novel nutrients needed for growth but also the evolution of anti-stress mechanisms to cope with changing environments [42]. Underground metabolism seems to play a role in such adaptations, as suggested by a recent study on stress responses in yeast [43]. Yeast cells import much higher lysine concentrations than needed for growth, and lysine is converted to cadaverine by an underground activity of the first enzyme of the polyamine pathway (Spe1p). In this cellular state, NADPH is not needed for the synthesis of lysine, and as a result, the cell can maintain much higher levels of reduced glutathione, along with a higher flux through the required NADPH-forming reactions. This leads to a decrease in the concentration of reactive oxygen species and to increased oxidant tolerance. We anticipate that adaptive laboratory evolution experiments in combination with omics experiments will increasingly shed light on the question of whether the anti-stress mechanism observed at high lysine concentrations is an adaptive trait, by pinpointing causal (selective) mutations that affect the regulatory and metabolic states of native and underground metabolism.
Another evolutionary trajectory of niche adaptation is known as the stepping-stone model of metabolic network expansion [44,45]. The model explains complex adaptations and relies on horizontal gene transfer and pre-adaptations in dynamically changing environments. Suppose that two enzymes (enzymes 1 and 2) are required for growth in environment A, but only one of the two (enzyme 2) is needed to enable growth in environment B. Hence, once an organism is pre-adapted to environment B, adaptation to environment A requires only the addition (transfer) of one enzyme (enzyme 1). This stepping-stone evolutionary trajectory thus allows adaptation to environment A, which would not, or would not likely, be accessible on a realistic evolutionary time scale without first adapting to environment B. Indeed, a recent systems biology approach that integrates genome-scale metabolic modeling and horizontal gene transfer predicts that such stepping-stone trajectories can happen [44]. Importantly, phylogenetic reconstruction of gene gain (transfer) patterns on the basis of hundreds of sequenced genomes, as well as laboratory experiments, is consistent with these predictions. Nonetheless, it is worth noting that the above-mentioned constraint of biochemical compatibility between the newly acquired reactions and the existing metabolic system also applies in this context [30], that is, newly introduced enzymes can only provide an advantage if they are not inhibited by the existing metabolome and if they do not disadvantage the cell by synthesizing inhibitors of existing metabolic reactions. Despite major advances in our understanding, important questions regarding the patchwork model of underground metabolism and the stepping-stone model remain [45,46]. First, which of the two models explains most of the metabolic network expansion? Second, are these two models linked, and can they be unified into an overall model?
Interestingly, a recent study suggests a link between horizontal gene transfer and underground metabolism in compensating for the deletion of L-threonine dehydratase in E. coli [47]. With advances in the development of underground metabolic networks, phylogenetic approaches, adaptive laboratory evolution, omics and genome engineering techniques [48-52], it is expected that future studies will provide answers to these open questions.
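The two-enzyme stepping-stone example given above can be sketched as a small search over single-gene gains. This is a toy model, not the genome-scale approach of the cited work: the enzyme and environment names are placeholders, and the rule that a transferred gene is retained only if it immediately enables growth in a new environment is a simplifying assumption made for this illustration.

```python
from collections import deque

# Hypothetical toy setup mirroring the two-enzyme example in the text.
requirements = {
    "A": frozenset({"enzyme1", "enzyme2"}),
    "B": frozenset({"enzyme2"}),
}
gene_pool = {"enzyme1", "enzyme2"}  # genes obtainable by horizontal transfer


def viable_environments(genome, environments):
    """Names of the environments whose required enzymes are all in the genome."""
    return {name for name in environments if requirements[name] <= genome}


def adaptation_path(start_genome, target, environments):
    """Breadth-first search over single-gene gains.

    Assumption of this sketch: a transferred gene is retained only if the
    resulting genome enables growth in an environment it could not use before.
    Returns a list of (gained gene, newly usable environments), or None.
    """
    start = frozenset(start_genome)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        genome, path = queue.popleft()
        usable = viable_environments(genome, environments)
        if target in usable:
            return path
        for gene in sorted(gene_pool - genome):
            new_genome = genome | {gene}
            gained = viable_environments(new_genome, environments) - usable
            if gained and new_genome not in seen:
                seen.add(new_genome)
                queue.append((new_genome, path + [(gene, sorted(gained))]))
    return None


direct = adaptation_path(set(), "A", environments=["A"])
stepwise = adaptation_path(set(), "A", environments=["A", "B"])
print("without environment B:", direct)
print("with environment B as stepping stone:", stepwise)
```

Under the stated retention rule, environment A is unreachable when it is the only environment available, because neither enzyme alone confers an advantage; once environment B exists, enzyme 2 is retained first and enzyme 1 then completes the adaptation to A.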

Why has the metabolic network maintained its basic structure over billions of years?
Despite niche pressure changing over time, the metabolic network seems to have maintained its basic structure over the extreme timeframe of its evolution. Several factors contribute to this situation. First, as the Russian chemist A.G. Golubev commented, 'nothing of chemistry disappears in biology' [53]. If the biochemical topology came into place not only because of the progressive evolution of ever-better enzymes but also because of the reaction properties of the metabolites in the presence of the metal ions and biological small molecules that always surround them, evolution will revert to the same topology in different environments. This is specifically true if species adapt to more extreme or resource-scarce environments, where reaction specificity becomes an increasing constraint. Second, despite the huge diversity of enzyme structures, the basic molecules that conduct catalysis have remained unchanged over billions of years of evolution. These include amino acid side chains, which act in the catalytic pockets of most modern enzymes, nucleotides, a few universal organic molecules that act as cofactors (pyridoxal phosphate, PALP, for instance), as well as just a handful of life-essential metal ions, including iron, zinc, magnesium, and copper. Together, these explain the vast majority of biochemical reactions. Finally, there is also evidence of network effects. The complexity of enzyme-metabolite interactions constrains the expansion of the metabolic network. On average, each cellular enzyme is specifically inhibited by at least five other cellular metabolites [30]. Each enzyme has evolved sufficient activity to function despite the presence of these inhibitors. Each novel metabolite added to the network will inhibit additional enzymes, creating a strong selection pressure to keep the chemical diversity, as well as metabolite concentrations, as low as possible.
This situation limits the chance for new pathways to evolve and to expand, specifically as these can only provide a selective advantage if they either allow a new adaptation or outperform the existing metabolic pathway. The latter explains why metabolic pathways have very limited redundancies.