Gene x environment interactions as dynamical systems: clinical implications

The etiology and progression of the chronic diseases that account for the highest rates of mortality in the US, namely, cardiovascular diseases and cancers, involve complex gene x environment interactions. Yet despite general agreement in the medical community on this concept, there is a widespread lack of clarity as to what the term ‘interaction’ actually means. One consequence is the use of linear statistical methods to describe processes that are biologically nonlinear, resulting in clinical applications that are often suboptimal. Gene x environment interactions are characterized by dynamic, nonlinear molecular networks that change and evolve over time, and by emergent properties that cannot be deduced from the characteristics of their individual subcomponents. Given the nature of these systemic properties, reductionist methods are insufficient for fully providing the information relevant to improving therapeutic outcomes. The purpose of this article is to provide an overview of these concepts and their relevance to prevention and interventions.


Introduction
The accumulating evidence that gene x environment interactions play a major role in chronic diseases that account for high rates of worldwide morbidity and mortality (e.g., cardiovascular diseases and cancers) has led to increasing interest in studying mechanisms of interaction related to treatment and prevention. However, despite the lip service given to the concept that the environment interacts with genotype, the actual definition of 'interaction' is still medically and scientifically fuzzy. Not only can it have different connotations depending on whether it is being viewed from a statistical or biological perspective, but the biological mechanisms are still widely misunderstood. The purpose of this paper is to clarify some of the concepts related to gene x environment interactions in chronic diseases, especially concepts related to the little discussed but immensely important topic of nonlinear dynamics in chronic disease etiology and progression.

Statistical approaches to gene x environment interactions and "heredity"
Statistically, the term interaction means that the independent variables in an equation (for example, smoking and gender) do not have a linear additive effect on the outcome variable. For instance, smoking might be a risk factor for an outcome in women but not in men. This type of interaction was found in a study of the angiotensinogen genotype AGT M235T, a polymorphism that had been reported in multiple studies to be associated with risk of hypertension. In that study, no significant association (main effect) was found in either men or women between this genotype and heart rate (HR), an endophenotype significantly associated with hypertension [1]. When significance is found in some genome-wide association studies (GWAS) but not in others, it is referred to as "failure to replicate". However, it should not be assumed that this means 'no genetic contribution'. The analyses in this particular study also included anxiety because it too had been reported in a number of studies to be associated with hypertension. But analyses of anxiety and HR also showed no main effect in men, whereas in women the association was the inverse of what would have been expected, i.e., lower HR with high anxiety. (This type of result can often indicate confounding.) When interaction terms were included in the equation, there was a gene x environment x gender interaction such that men with the TT (hypothesized risk) genotype had significantly higher HR if (but only if) they also had high anxiety, compared with low-anxiety men with that same genotype or high-anxiety men with the MM genotype. There was no such interaction in women. Thus, the hypothesized risk genotype did increase risk in men, but only in those with high anxiety, and showed no interaction in women [1]. This is an example of a 'failure to replicate' in the genetic analysis of complex diseases that does not reflect a lack of genetic risk, but results from a failure to include interaction terms that would expose context vulnerability. Since the inclusion of interaction terms in GWAS is not common, one wonders whether lack of context might be responsible for the large number of GWAS that fail to replicate [2,3] in a broad range of areas. In the type of complex diseases discussed here, environmental factors can change the expression of a gene, overriding the effect of a specific genetic polymorphism. The mechanisms through which these gene x environment interactions occur are explained further in the section on the biological meaning of interaction.
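The statistical point above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not a reanalysis of the cited study: the variable names, the sample size and the effect size of 8 beats per minute are all hypothetical. Heart rate is simulated so that the risk genotype matters only in the presence of high anxiety, and two least-squares models are compared, one with main effects only and one with an interaction term.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 4000

# Hypothetical binary predictors (illustrative only, not data from the cited study).
risk_genotype = rng.integers(0, 2, n)  # 1 = carries the hypothesized risk genotype
high_anxiety = rng.integers(0, 2, n)   # 1 = high anxiety

# Simulated heart rate: raised by 8 bpm only when genotype AND anxiety are both present.
heart_rate = 70 + 8 * risk_genotype * high_anxiety + rng.normal(0, 5, n)

ones = np.ones(n)
X_main = np.column_stack([ones, risk_genotype, high_anxiety])
X_int = np.column_stack([ones, risk_genotype, high_anxiety,
                         risk_genotype * high_anxiety])

b_main = fit_ols(X_main, heart_rate)  # additive model: one averaged genotype effect
b_int = fit_ols(X_int, heart_rate)    # model with a gene x environment term

# Genotype effect implied by each model in each anxiety context.
effect_low_anxiety = b_int[1]
effect_high_anxiety = b_int[1] + b_int[3]

print("additive model, genotype coefficient:", round(float(b_main[1]), 2))
print("interaction model, effect at low anxiety:", round(float(effect_low_anxiety), 2))
print("interaction model, effect at high anxiety:", round(float(effect_high_anxiety), 2))
```

In this toy setting the additive model reports a single genotype coefficient that is an average over contexts and describes neither subgroup, whereas the interaction model separates a near-zero effect at low anxiety from the full effect at high anxiety. How much of the averaged effect survives in a real sample depends on how common the exposing context is, which is one way a genetic association can appear in one cohort and vanish in another.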
The use of inappropriate statistical methods can happen inadvertently when there is a general lack of knowledge concerning the biology of gene x environment interactions. If one doesn't know that the biology is nonlinear and not additive, there is no reason not to use linear, additive statistical methods to define "heredity", and these methods of modeling heredity were common before the sequencing of the human genome. Unfortunately, they are still in use today. The problem arises because, although genotype remains constant throughout the lifespan, environmental and lifestyle factors change, and these factors can have major influences on the extent to which many genes are expressed. Thus, an oncogene that is usually in the "off" position can be turned "on", and a tumor suppressor gene that is "on" can be turned "off", by a change in environmental exposures, in which case environment becomes dominant over genotype [4]. Consequently, although genotype doesn't change, the genetic contribution to phenotype is not always constant, because expression can be inhibited or enhanced by changes in the surrounding microenvironment.
The statistical modeling of "heredity", which is based on phenotypic similarity between monozygotic (MZ) and dizygotic (DZ) twins, is still utilized by some researchers despite the availability of more accurate molecular genetic techniques. It assumes that genotype explains a certain "fixed" percent of the population variance related to disease prevalence and that environmental factors are added to that percent to make up the difference (so that the two sum to 100%). These models have traditionally not measured either genotype or environment, only phenotypic differences between MZ and DZ twins in the outcome variable. A simplified version of the concept can be seen below [5]:
Twin A phenotype = Twin B phenotype + Environ + (Twin B phenotype × Environ)
In practical use, the models can be considerably more complex, parsing genetic and environmental variance into many more subcomponents (e.g., shared and "non-shared" environment, additive genetic variance, dominant genetic variance, etc.). But the principle of the calculations does not change: all permutations involve linear, additive models.
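To make the additive logic concrete, the sketch below uses Falconer's formula, the classical textbook estimator for twin data. It is brought in here as an illustration and is not taken from reference [5]; the twin correlations are hypothetical numbers. The function name and the labels `h2`, `c2` and `e2` (additive genetic, shared environment, non-shared environment) are likewise illustrative.

```python
def falconer_decomposition(r_mz, r_dz):
    """Classical additive variance split from MZ and DZ twin correlations.

    Assumes purely additive genetic effects, equal environments for MZ and
    DZ pairs, and no gene x environment interaction.
    """
    h2 = 2.0 * (r_mz - r_dz)  # "heritability"
    c2 = 2.0 * r_dz - r_mz    # shared (family) environment
    e2 = 1.0 - r_mz           # non-shared environment plus measurement error
    return {"h2": h2, "c2": c2, "e2": e2}

# Hypothetical phenotypic correlations for MZ and DZ twin pairs.
parts = falconer_decomposition(r_mz=0.60, r_dz=0.40)
total = sum(parts.values())

for name, share in parts.items():
    print(f"{name}: {share:.2f}")
print(f"sum of components: {total:.2f}")
```

The three shares sum to 1 for any pair of input correlations, because the sum is an algebraic identity of the formula. The bookkeeping therefore always closes, whether or not the underlying biology is additive.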
Like all models, this one has underlying assumptions, and the problem arises when the assumptions are not met. These additive models do not take into account the nonlinear complexity inherent in the biology of gene x environment interactions [6]; i.e., they do not account for the dynamically changing, nonlinear epigenetic influences on gene expression or for the fact that gene expression depends on a combination of intracellular and extracellular factors. They also assume that there are no differences in the prenatal environments between MZ and DZ twins. In cases where MZ twins share the same placenta (i.e., are monochorionic), this assumption is not met because the twins are essentially competing for the same nutrients. Differences in nutritional status between MZ twins can in some cases be more important than genetic differences between MZ and DZ twins with respect to subsequent disease phenotypes. This occurs when differences in fetal growth and nutrition result in divergences in birthweight that can lead to lifetime differences in risk for cardiovascular disease (CVD) and other illnesses [7]. Linear additive models have also traditionally assumed no differences in shared family environments (e.g., rearing practices between siblings), an assumption that has subsequently been shown to be inaccurate [8]. Furthermore, studies of personality that measure behavioral characteristics on rank-order (ordinal) scales treat those scales statistically as if they were interval scales, i.e., as if there were an equal distance between the quantitative units, of the type found between millimeters of mercury in blood pressure measurements. This is not the case. There is no evidence whatsoever that a unit of difference between 14 and 15 on the CES-D depression scale is the same as that between 19 and 20 on that scale. This is a very real problem, since virtually all of the twin research on personality is based on these types of rank-order scales but has been incorrectly interpreted as measuring quantitative trait differences [9][10][11]. It is not surprising that studies analyzing actual genomic data related to cognitive and personality traits have often reported much lower genetic associations than the "heredity" implied by these models [12,13]. The numbers add up but, as we shall see below, the biology doesn't. There is a strong need for a more nuanced understanding in the scientific community of precisely how genes and environment interact biologically, so that more appropriate methodology can be utilized.

The biological meaning of interaction
A primary characteristic of most chronic diseases is complexity, which refers to the fact that etiology and progression are usually multifactorial (involving a combination of genetic, lifestyle, environmental and biological factors), with the individual contributors interacting in a way that makes the whole (i.e., the phenotype) more than the sum of its parts. This signifies that the phenotype has functions and characteristics not found in any of its individual subcomponents. Properties that are not reducible to those of their constituent parts are referred to as "emergent," meaning that they cannot be adequately understood with reductionist methods. This is especially important with respect to genetics and gene expression. The concept of emergence in gene x environment interactions also cannot be adequately depicted statistically by using additive linear models.
The reason that the dynamically fluctuating microenvironment is so important is the primacy of its role in gene expression. Genes are essentially passive biochemical codes, like blueprints, for assembling amino acids into proteins. Like a cookbook recipe, they cannot "read" themselves or "bake" the chocolate cake. Genes are "read" (transcribed) and translated into proteins by factors in the surrounding microenvironment. This means that gene function is to a large extent dependent on environmental cues. When DNA is not active, it is wrapped tightly around histone proteins like thread around a spool, which consolidates it for storage in the nucleus but also protects it from being activated at inappropriate times by circulating transcription factors. In order for a gene to become activated, factors in the microenvironment must initiate the unwinding of the DNA strand from the histones so that it is accessible for activation and transcription. Thus, biologically, interaction between genes and the surrounding microenvironment begins at the most fundamental molecular level. Additive statistical models simply do not reflect this reality. Furthermore, many genes have multiple functions, and the role they play at any particular point in time depends on the stimuli they encounter from the surrounding microenvironment. For instance, the p53 tumor suppressor gene can function variously in repair of DNA damage, cell-cycle arrest, differentiation, apoptosis and cellular senescence [14,15]. Macrophages, phagocytic cells of the immune system with distinct biological functions that can either fight or promote tumor development, are also strongly influenced by the microenvironment [16]. The function a macrophage assumes depends on the cues it receives from the surrounding environment, e.g., from other genes, intercellular communications, and a myriad of extracellular factors such as hormones, enzymes, etc. Therefore, the input signals (whether they inhibit or activate gene expression) are extremely important, but so also is their timing. The order in which they arrive is very important for phenotypic outcome [17]. Indeed, it has long been known in the field of molecular biology that "one gene, one function" is an outmoded concept. There are simply not enough genes to perform all the roles required to maintain a healthy system. So, the assumption that phenotype can be understood as a simple process of translating the genetic code into proteins does not reflect the dynamic, ongoing interactions inherent in the complexity of biological systems. Understanding the dynamic nature of these systems is important from the standpoint of designing research. A cell does not do the same thing in an intact animal that it does in a petri dish because the surrounding environment is completely different and elicits different responses. Nor does a genetically modified animal respond in the same way to environmental toxins or medication as one that is intact. Physiological systems in an organism are inextricably interrelated, so the removal of one gene unavoidably affects the function of more than one system.
It has been suggested that a better way to understand phenotype would be to conceptualize it as the result of processes in a reactive system [17]. That means that the relative contribution of genes and environment varies within the same person at different points in time. According to this conceptualization, a given subsystem in the body is receiving and reacting to multiple inputs simultaneously. The emergent characteristics of the resulting complex systems are responsible for a protein's ability to assume properties and functions that are not inherent in its individual amino acids. This characteristic extends to cells and the formation of tissues and organs. It also seems to apply to "life" itself. It has been demonstrated that the entire DNA can be removed from a bacterial cell and synthetic DNA inserted to reprogram it into a different type of bacterium [18]. However, the synthetic DNA does not itself have the property of "life"; it requires a living cell in order to start reproducing a different type of bacterium. Thus, the synthetic DNA is inserted into a cell without DNA that is, nevertheless, "living" (the DNA was removed to make room for the synthetic DNA). In this case, the DNA is the software that reprograms the cell, but it does not constitute the cell's "life", since removing it does not remove life. This illustrates that emergent properties cannot be understood by examining the characteristics of individual components (e.g., genes). It also implies that phenotypic "causality" is not unidirectional (e.g., starting at the molecular level and moving to the organ or systemic levels), but bidirectional: the microenvironment can influence gene expression, just as the gene can influence its microenvironment. With the exception of relatively rare monogenic diseases, in most illnesses, especially chronic diseases such as cardiovascular disease and cancers, phenotype evolves from a constant interaction between genes and the environment. With respect to carcinogenesis, many lifestyle and environmental factors influence gene expression by epigenetically activating or repressing oncogenes and tumor suppressor genes. In cases where tumor suppressor genes are inhibited or quiescent oncogenes activated, the microenvironment is dominant over genotype. Reductionism is inadequate for understanding this type of complexity.
The clinical importance of these interactions is that although knock-out animal models (e.g., those with reduced immune function) help identify the function of specific genes or signaling pathways, they are of less use as clinical models because their physiological systems do not reflect the response of an intact animal (or human).

Dynamic equilibrium
The human body is a dissipative thermodynamic system, i.e., it is an open system that exchanges energy and matter with the environment. Thus, it often functions far from thermodynamic equilibrium. Years of research in the field of nonlinear dynamics have taught us that everything in the body is in a state of continuous flux whose function is to maintain robustness and adaptability in the face of shifting needs and input signals from multiple sources. Organelles, cells, tissues and organs are systems unto themselves but also components of larger systems that interact with each other at many levels [17]. This complexity provides the flexibility for timely response to the constant but temporally varying demands being made on different cells and organs throughout the day. In healthy systems, these moment-to-moment variations serve to adjust and fine-tune responses as local needs change. Just as evolution has been a balancing act between robustness and adaptability, the human body, in order to maintain health, must be flexible enough to adapt to new circumstances (e.g., changes in diet, increased physical exertion), while remaining robust enough to resist random minor perturbations, such as exposure to bacteria or mistakes in gene transcription, that might otherwise cause ill health. Examples of physiological feedback loops that serve the purpose of health maintenance include immune mechanisms that fight invading bacteria or renegade cancer cells, mechanisms that repair or destroy DNA mutations, and renal processes that lower blood pressure. The interactions within and between these health maintenance systems are highly complex, and they are also characterized by redundancy. If one system becomes overloaded, there are multiple back-up systems in place that kick in so as to restore equilibrium. Thus, "stasis" exists only in death, and the term "dynamic equilibrium" describes systemic physiological interactions more accurately than the term "homeostasis."

Genes and networks
Genes are essentially biochemical codes (recipes) for assembling amino acids into proteins. They do not initiate action but are instead activated or repressed by other cellular factors. Like recipes in a cookbook, they are passive; i.e., they remain quiescent until switched on or off by transcription factors interacting with cis-regulatory mechanisms. They can neither "read" their own code (transcription) nor "bake the chocolate cake" (translation of the code into proteins) without wide-ranging help from the surrounding microenvironment, including cis-regulatory networks, RNA-binding proteins, RNA polymerase, ribosomes, microRNAs, transcriptional co-factors, and numerous other transcription factors [19].
Mechanisms that contribute to dynamic equilibrium by supporting redundancy at the genetic level include multiple enhancers and transcription factors for single genes that can bind to the same cis-regulatory element [19], creating a back-up in case of transcription failure. Thus, feedback loops exist not only within and between tissues and organs of larger functional systems (e.g., the sympathetic nervous system) but also within and between cells, genes, and protein networks. Like other systems in the body, gene regulation is not linear but involves complex interactions at multiple levels of molecular and systemic functioning. Not only can genes belong to multiple networks involving different functions, but genes and transcription factors are regularly modified by other microenvironmental factors (e.g., enzymes, hormones, immune factors, the basement membrane), which are in turn influenced by exchanges between the individual and the outside environment. What contributes to the nonlinear dynamics of these systems is the scale-free nature of their networks.
Scale-free networks are not random. They reflect a self-organizing capacity of organic life that, regardless of system type, creates connectivity between vertices (nodes) with a distribution that follows a power-law function [20]. These networks are characterized by a few nodes or vertices that are connected to many others, forming hubs, while the majority of nodes have very few connections. This network structure supports robustness to perturbation (knocking out one gene or protein seldom knocks out an entire system), while allowing the network to retain its adaptability. Part of the adaptability comes from the propensity of networks to expand, in a non-random, preferential manner, with new vertices attaching to already well-connected nodes [20]. The dynamic nature of these networks results in the emergence of new characteristics or functions needed by the system, contributing to both robustness and adaptability.
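The preferential growth described above can be sketched in a few lines. This is a minimal, generic preferential-attachment simulation in the style of the Barabási-Albert growth model, offered as an assumption about how such growth is commonly modeled and not as a model of any specific gene or protein network. The function name, the network size and the number of links per new node are hypothetical choices.

```python
import random
from collections import Counter

def grow_preferential_network(n_nodes, edges_per_node=2, seed=0):
    """Grow a network in which each new node links to existing nodes
    with probability proportional to their current degree."""
    rng = random.Random(seed)
    core = edges_per_node + 1
    # Start from a small, fully connected core.
    edges = [(i, j) for i in range(core) for j in range(i + 1, core)]
    # Every node appears in this list once per edge it has, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    endpoints = [v for edge in edges for v in edge]
    for new in range(core, n_nodes):
        targets = set()
        while len(targets) < edges_per_node:
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges

edges = grow_preferential_network(2000)
degree = Counter(v for edge in edges for v in edge)
degrees = sorted(degree.values())
median_degree = degrees[len(degrees) // 2]
max_degree = degrees[-1]

print("nodes:", len(degree), "edges:", len(edges))
print("median degree:", median_degree, "largest hub degree:", max_degree)
```

Most nodes end up with only a handful of links while a few early, well-connected nodes become hubs, which is the heavy-tailed degree pattern the text describes. Removing a randomly chosen node in such a network usually removes a low-degree node, which is one intuition for the robustness to random perturbation noted above.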
However, when network expansion continues long enough, it can reach a threshold where a giant connected component emerges, with distances so short that perturbations of a single gene or protein can actually propagate through the entire network, having multiple, unrelated effects. This results in an increase in flexibility and adaptability by allowing a gene to belong to multiple networks. A simplistic system that required separate proteins for every single function would be unwieldy and dysfunctional with respect to robustness and adaptability. The flexible structure of molecular networks facilitates pleiotropy, a common characteristic of protein interaction networks [21]. The p53 tumor suppressor gene mentioned earlier is a typical illustration. Expanded network membership allows it to respond to a broad range of signals and fulfill multiple functions [22,23]. To summarize, new knowledge of the interconnected nature of gene and cellular networks has led to a reassessment of gene function as something that should be shifted from an attribute of the individual gene to one of the network in which the gene participates [21].
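The threshold behavior mentioned above, where a giant connected component appears once enough links have accumulated, can be illustrated with a toy simulation. This sketch deliberately swaps in a simple random-graph model (links added between uniformly chosen node pairs) because its threshold is well characterized: a giant component appears when the average number of links per node passes about one. It is an illustrative assumption, not a model of the scale-free networks in the cited work, and all names and sizes are hypothetical.

```python
import random

def largest_component_fraction(n_nodes, n_edges, seed=0):
    """Add random links and return the share of nodes in the largest
    connected component, tracked with a union-find structure."""
    rng = random.Random(seed)
    parent = list(range(n_nodes))
    size = [1] * n_nodes

    def find(x):
        # Follow parent pointers to the component root, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    largest = 1
    for _ in range(n_edges):
        a = find(rng.randrange(n_nodes))
        b = find(rng.randrange(n_nodes))
        if a != b:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a          # merge the smaller component into the larger
            size[a] += size[b]
            largest = max(largest, size[a])
    return largest / n_nodes

n = 5000
below = largest_component_fraction(n, n_edges=int(0.25 * n))  # mean degree 0.5
above = largest_component_fraction(n, n_edges=int(1.0 * n))   # mean degree 2.0

print(f"largest component, sparse network: {below:.3f} of nodes")
print(f"largest component, denser network: {above:.3f} of nodes")
```

Below the threshold the network consists of many small, isolated clusters, so a local perturbation stays local. Above it, most nodes belong to one connected component, so a perturbation at one node has a path to much of the system.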
Conceptually, it should now be clear that the diverse exposures (lifestyle, environmental, biological) that have been associated with complex diseases such as cardiovascular diseases and cancer are mediated by influences at multiple levels: inflammation, gene transcription, the maintenance of mRNA stability, mRNA translation and protein stability [19], acting through enzymes, hormones, and metabolic processes. How input signals affect different systems depends on both their interactions with other input signals and the timing of their arrival. Thus, they can be synergistic, repressive, additive, or multiplicative, or they can cancel each other out.
Thus, biological networks, like organ systems in the body, are characterized by complexity, meaning that gene function is often dynamic and flexible [24]. The fact that molecular networks adapt and evolve can result in long-distance spatiotemporal patterns from what originated as local neighbor-neighbor interactions [25], indicating the difficulty of predicting network dynamics from single nodes [26].

The edge of criticality
The nature of complex gene, protein and cellular networks contributes to another important property of healthy, dynamic systems, namely that their most efficacious functional range lies on the edge of criticality between order and chaos [26][27][28]. That means that they tend to be dynamically stable to random perturbations but can react with global state changes to targeted perturbations [27]. It is precisely at this critical juncture of minimal information loss that they provide enough order for robustness, while retaining enough flexibility to adapt to the changes required for optimal responsiveness. Functioning "on the edge" explains the well-known adage from chaos theory that a small change can result in a major phase transition. This is exemplified by mitochondrial networks and by macrophage functionality. Depolarization of just a single mitochondrion at the edge of criticality can stimulate network collapse [25], and macrophages can undergo global gene expression change in response to targeted perturbations, coordinating complex behavior with minimal information loss [27]. Furthermore, because the structure of many biological networks is not static but evolves both structurally and functionally over time, stability and flexibility are balanced through self-organization to a dynamically critical state [29]. These adaptive networks can evolve through continued use (synaptic activity), lack of use, or evolutionary fitness (gene regulatory networks) [29].
Sudden changes are termed "phase transitions" or bifurcations and can occur when systemic load progressively increases, calling on more and more systems to be on the alert for maintenance and restoration of equilibrium. This tips the balance toward disequilibrium and disorder. Systems that are required to fulfill more than one function (their regular function plus back-up) for more than a short period of time begin to redirect some of their energy away from efficiency in order to take on a higher quantity of work. One result is that disposal of waste products becomes less efficient and disorder increases. The more the burden increases, the farther out of balance the system becomes. The term used to characterize systemic burden and "wear and tear" on the body is "allostatic load" [30][31][32][33][34].
When system overload goes from acute to chronic, it can tip the dynamics away from a healthy attractor toward dysfunctionality. The functional state of a system can only be maintained up to a certain threshold of burden (disorder) before it collapses, resulting in a phase transition that can be likened to the "straw that broke the camel's back." It bifurcates from order into chaos, where new dynamical attractors emerge that may support disease rather than health.
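The idea that a gradually increasing load can push a system through a bifurcation, from a stable attractor into chaos, can be illustrated with the logistic map, a standard toy model from nonlinear dynamics. It is used here only as an analogy and is not a physiological model. The parameter `r` stands in loosely for increasing systemic load, and the function name and the chosen values of `r` are illustrative assumptions.

```python
def logistic_attractor(r, x0=0.2, burn_in=2000, keep=64):
    """Iterate x -> r * x * (1 - x), discard the transient, and return the
    distinct long-run states (rounded) that the system keeps visiting."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    visited = set()
    for _ in range(keep):
        x = r * x * (1.0 - x)
        visited.add(round(x, 6))
    return sorted(visited)

stable = logistic_attractor(2.8)   # settles onto a single fixed point
cycle = logistic_attractor(3.2)    # after a bifurcation: alternates between two states
chaotic = logistic_attractor(3.9)  # irregular, non-repeating behavior

print("r = 2.8, distinct long-run states:", len(stable))
print("r = 3.2, distinct long-run states:", len(cycle))
print("r = 3.9, distinct long-run states:", len(chaotic))
```

Nothing about the rule changes between the three runs; only the parameter does. Small, smooth increases in `r` leave the behavior qualitatively unchanged over a wide range and then produce an abrupt change in the attractor, which is the kind of threshold effect the text likens to the straw that broke the camel's back.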

Clinical relevance
The importance of these dynamics is their direct relevance for clinical interventions. Multifactorial etiology calls for multifactor approaches to treatment and prevention [35]. Lack of a major gene for disease risk can easily mask genetic susceptibility that is expressed only in the presence of certain environmental conditions. Genes that contribute to the body's ability to metabolize toxins are a typical example. It has been reported that in utero exposure to pesticides is associated with increased risk of leukemia in simple exposure analyses [36], but in the presence of specific polymorphisms of the CYP1A1, CYP2D6, GSTT1 and GSTM1 genes, the risk increases 7-fold [37]. This has major health policy implications for prevention. The GSTT1 null genotype and the GSTP1 Val/Val genotype have also been reported to increase vulnerability to prostate cancer [38]; however, the results are inconsistent [39]. The fact that environmentally induced epigenetic changes can determine how a gene is expressed explains some of the inconsistency in studies focused solely on genetics and clarifies the importance of understanding gene x environment interactions. Just as policy measures have been implemented to reduce and prevent smoking, more focus needs to be placed on environmental factors that increase epigenetic risk for chronic diseases. Knowledge of genotype alone is not sufficient for assessing risk for complex diseases. The glutathione S-transferase (GST) family of genes is involved in regulating the metabolism of a wide range of chemicals, including carcinogens [40]. But previous research has also reported that high intake of cruciferous vegetables combined with the GSTM1 genotype contributes to a reduction of prostate cancer risk [39]. Thus, multi-level interventions involving health policy at the state and community level (e.g., regulation of pollution sources), individual behavioral factors (e.g., diet, physical activity, smoking and alcohol consumption), as well as family-level factors that can influence health behaviors, are all important for addressing the gene x environment interactions that underlie complex etiology [35]. System dynamics (network structure), as well as the type and sequence of signal inputs from the surrounding microenvironment, are important multifactorial contributors to clinical phenotype. They also illustrate why traditional pharmaceutical approaches that tend to target single mechanisms have so often failed to achieve permanent improvement [41][42][43][44].
Because complex diseases such as cardiovascular diseases and cancers progress slowly and involve innumerable gene x environment interactions, clinical care would be greatly facilitated by more detailed knowledge of systemic risk indicators. Just as timing is crucial to the interactions between microenvironmental inputs and molecular network dynamics, it is also crucial to intervention strategy. Because genes can belong to multiple networks and change their functional expression based on varying conditions, a step towards improved clinical care would be to develop assessment measures that reflect not only individual risk factors but also risk for emergent systemic dysfunction. If clinicians could be alerted to increasing systemic burden early in the subclinical disease process, heightened preventive measures and/or perturbations that reboot the system back towards the healthy attractor might facilitate reversal, avoiding an increase in disorder and bifurcation into chaos. On the other hand, the management of advanced disease, in which unhealthy attractors are already present, would be helped by additional clinical information about the network dynamics driving the dysfunction.

Conclusion
Globally, cardiovascular diseases and cancers account for 43.3% of mortality in women and 40.2% of mortality in men [45]. Unlike monogenic diseases such as Huntington's, where genotype determines phenotype, the biology of these diseases is highly complex and nonlinear. The next step in improving outcomes necessitates a more nuanced understanding of the systemic and molecular nature of interactions between genes, lifestyle, environmental factors and individual biology. It will also require the development of more accurate, nonlinear methods of analysis to identify the timing and contribution of individual components, as well as the emergent network phenotypes.