Towards a unified framework to study causality in Earth–life systems

Abstract There is considerable interest in better understanding how earth processes shape the generation and distribution of life on Earth. This question, at its heart, is one of causation. In this article I propose that at a regional level, earth processes can be thought of as behaving somewhat deterministically and may have an organized effect on the diversification and distribution of species. However, the study of how landscape features shape biology is challenged by pseudocongruent or collinear variables. Using recent examples from the literature about speciation and species richness in montane settings, I demonstrate that causal structures can be used to depict the cause–effect relationships between earth processes and biological patterns. This application shows that causal diagrams can be used to better decipher the details of causal relationships by motivating new hypotheses. Additionally, the abstraction of this knowledge into structural equation metamodels can be used to formulate theory about relationships within Earth–life systems more broadly. Causal structures are a natural point of collaboration between biologists and Earth scientists, and their use can mitigate the risk of misassigning causality within studies. My goal is that by applying causal theory through causal structures, we can build a systems-level understanding of what landscape features or earth processes most shape the distribution and diversification of species, what types of organisms are most affected, and why.


| INTRODUCTION
Imagine a version of Earth in which the movement of tectonic plates builds mountains and topography, but where there are no geological hotspots forming island archipelagos, no methane levels fluctuating through time, and no growing and shrinking of glaciers. In a world where there is only growth of topography, what does the diversity and distribution of life look like? This may be an unfair question, as any geologist would point out that it is not possible to separate geological processes in this way; as soon as topography grows, rivers flow through and incise that topography to form ridges and valleys. In fact, steeper slopes drive higher rates of river incision, a relationship governed by the stream power equation (Whipple & Tucker, 1999). Yet, I would argue that this is the basis for much of what phylogeography aims to achieve: to understand how individual geological and climatic (geoclimatic) processes shape the distribution and the diversification of life on Earth. The same is true of studies that use phylogenetic or species richness (macroecological) data coupled to features of the landscape, such as mountains or latitudinal gradients (Antonelli, Kissling, et al., 2018; Hoorn et al., 2010; Rabosky et al., 2018; Rahbek, Borregaard, Antonelli, et al., 2019).
Within statistical and comparative phylogeography, a major goal is to understand what geoclimatic factors govern the evolution and distribution of populations, whether species respond similarly or not, and why (Crandall et al., 2019; Dolby et al., 2015, 2019; Leaché et al., 2020; Myers et al., 2016; Thomaz & Knowles, 2020; Wan et al., 2021). Here, I explain how answering these questions is really about establishing causal relationships between the nonliving yet changeable landscape and species which evolve in response to it.
To do this I will first present evidence for why earth processes can be thought to impart an organized, deterministic effect on species evolution. Then, I will show that the landscape features earth processes produce, and which are commonly studied, are aggregations of variables whose effects can be teased apart using a set of tools called causal structures (sensu Grace et al., 2012). These tools represent causal hypotheses as networks and can be used to organize and restructure knowledge from individual studies to build Earth–life theory and guide new hypotheses, as I will demonstrate with examples from the literature.

| IS SPECIATION DETERMINISTIC?
Scientists have long debated how predictable life is (Kolata, 1975). In 1989, Gould introduced a now-famous thought experiment about replaying the tape of life, that is, if life were restarted from the beginning, would it result in the same outcome we see today (Gould, 1989, pp. 48-49)? Gould and others argued that life is unrepeatable because it is the product of initial starting conditions and random stochastic events (Blount et al., 2018; Gould, 1994; Raup et al., 1973; Schopf et al., 1975). For example, if life started over, there might be different mass extinction events. Or, mutations that were key to evolutionary transitions in our history may arise at a different time point and so affect a different set of organisms, or they might not arise at all. Others suggested that life is the "inevitable" product of channellizing forces such as the favourability of certain chemical reactions, or of developmental constraints engrained early on that make some biological outcomes likely to be repeated again and again (Flessa & Levinton, 1975; Morris, 2006, 2010). Debated often at a macroevolutionary level, the theme echoes at the molecular level with studies of parallel evolution (Powell & Mariscal, 2015). Work on sticklebacks has shown that genetic mutations in the same genes are responsible for the repeated loss of armoured plates as fish colonize low-predator streams (Colosimo, 2005; Schluter & Clifford, 2004).
The mc1r gene has been shown to underpin divergence in coat and plumage colour in many independent species and ecological settings (Brockerville et al., 2013; Mundy, 2005; Ritland et al., 2001; Steiner et al., 2007). Finally, the specialization of Anolis lizards into ecological microniches has been repeated across the Anolis phylogeny (Gunderson et al., 2018; Losos et al., 2003; Velasco et al., 2016). Repeated molecular or phenotypic evolution is not equivalent to repeating all of life's diversity over the last billion years. To this point, however, the stochastic vs. deterministic debate has largely omitted one observation: species evolve in response to the physical landscape, and the earth processes that shape that landscape are themselves largely deterministic (Figure 1). So although the impact of a meteor may prune the evolutionary tree at random, the everyday processes that shape the landscape life lives on have behaved in a consistent way for much (if not all) of life's history. More explicitly, they may have an organized or deterministic influence on life even if life is not determinable itself (Smith & Morowitz, 2016).

FIGURE 1 Diagram showing examples of mostly stochastic and mostly deterministic earth processes or events that can impact biology. Determinism here refers to processes whose outcomes can be moderately well predicted from initial starting conditions and knowledge of the system (i.e., quasideterministic). Stochastic events are those which are poorly predicted in time and space or which have a random distribution of effects on biology. Note that stochastic vs. deterministic here is considered a gradient, where these processes do not fall perfectly into one group or the other. For example, global temperature can be estimated from greenhouse gas concentrations, but a component of those concentrations is stochastically driven (i.e., from volcanism).
There are many examples of deterministic behaviour among earth processes. When continental plates converge, they form mountainous topography and the height of this topography can be estimated from the shear-force of the colliding plates (Dielforder et al., 2020), although climate-modulated erosional processes may also be a major control (Brozović et al., 1997; Champagnac et al., 2012; Egholm et al., 2009). The rate of river incision can be estimated from the stream power equation, and rates increase with the amount of discharge and steepness of the surrounding slopes (Whipple & Tucker, 1999). The increase in mean global temperature and loss of ice volume on Earth can be predicted from the amount of methane and other greenhouse gases that are added to the atmosphere (e.g., TFE.3 in Stocker et al., 2013). Some details of exactly how these processes unfold are still debated among Earth scientists. However, these examples show that the outcomes of geological and climatic events can be estimated to a first order and the large-scale outcomes behave quasideterministically on the biological scales discussed here.
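As a rough illustration of what "estimated to a first order" can mean, the stream power incision model discussed by Whipple & Tucker (1999) is commonly written as E = K A^m S^n, where E is the incision rate, A is upstream drainage area (a proxy for discharge) and S is channel slope. The sketch below evaluates that expression in Python. The function name and the default values of K, m and n are placeholders chosen for illustration only; they are not calibrated estimates from any study cited here.

```python
def incision_rate(drainage_area_m2, slope, k=1e-5, m=0.5, n=1.0):
    """Stream power incision rate, E = K * A**m * S**n.

    drainage_area_m2: upstream drainage area A (proxy for discharge)
    slope: dimensionless channel slope S
    k, m, n: erodibility coefficient and exponents (illustrative placeholders)
    """
    if drainage_area_m2 < 0 or slope < 0:
        raise ValueError("drainage area and slope must be non-negative")
    return k * drainage_area_m2 ** m * slope ** n


# Same catchment size, two different slopes.
gentle = incision_rate(1e6, 0.01)
steep = incision_rate(1e6, 0.05)

# Same slope, larger catchment (more discharge).
large_catchment = incision_rate(4e6, 0.01)

print(f"gentle slope: {gentle:.2e}, steep slope: {steep:.2e}")
```

Under these assumed parameters, incision increases with both slope and drainage area, which is the qualitative behaviour described in the paragraph above.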
On the biological side, it is well established by theory and empirical studies how reduction of gene flow or adaptation to different selection regimes can cause lineages to diverge (Coyne & Orr, 2004). Adapting Gould's experiment, if we imagine that a mountain range is built 100 times over within the range of a low-dispersing beetle, then with all else equal we might expect that if that beetle lineage diverges due to isolation in one iteration then it would diverge in isolation in many other iterations. In contrast, we would expect that distribution of outcomes to differ if we performed the same 100 experiments using a high-dispersing bird species. This is because we know these organisms have vastly different traits. These two outcomes are probabilistic, not deterministic, because the outcome of any trial is not perfectly predictable. However, if we agree that earth processes behave quasideterministically and the divergence response of organisms is likely to vary based on a set of biological traits, then it stands to reason that we should be able to build a set of "speciation boundary conditions" that describe what geological settings promote the origination of lineages and amongst which groups.
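A minimal sketch of this adapted thought experiment, assuming each replay of mountain building is an independent trial with a fixed, trait-dependent probability of divergence. The probabilities used below (0.8 for the low-dispersing beetle, 0.1 for the high-dispersing bird) are hypothetical values chosen only to illustrate the contrast; they are not estimates from data.

```python
import random


def count_divergences(p_diverge, n_replays=100, seed=0):
    """Count the replays in which a lineage diverges.

    Each replay of mountain building is treated as an independent trial
    in which divergence occurs with probability p_diverge.
    """
    rng = random.Random(seed)
    return sum(rng.random() < p_diverge for _ in range(n_replays))


# Hypothetical per-replay divergence probabilities (assumptions, not data).
beetle = count_divergences(p_diverge=0.8, seed=1)  # low-dispersing beetle
bird = count_divergences(p_diverge=0.1, seed=2)    # high-dispersing bird

print(f"beetle diverged in {beetle}/100 replays; bird in {bird}/100 replays")
```

No single replay is predictable, but the distribution of outcomes differs between the two organisms, which is the sense in which the response is probabilistic yet organized by traits.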
The main challenge becomes measuring individual cause-effect relationships between earth processes and evolutionary patterns.
Although this is what many phylogeographical studies seek to do, we lack an organizing framework to systematically compare individual taxonomic and geographical studies to achieve this greater synthesis. I believe one path forward is through using causal structures, particularly in more deterministic scenarios (Figure 1), which I will introduce and apply in the next sections.

| CAUSALITY IN OTHER FIELDS
Judea Pearl largely formalized the algorithmic and mathematical definition of causality (Hopkins & Pearl, 2007; Pearl, 1995, 1998, 2009; Pearl & Verma, 1995), which is key for modelling systems in a way where new knowledge can be learned beyond what is already observed. Pearl and Verma wrote, "… an intelligent system attempting to build a workable model of its environment cannot rely exclusively on preprogrammed causal knowledge, but must be able to translate direct observations to cause-and-effect relationships" (Pearl & Verma, 1995). Causal theory has since been applied widely across disciplines, for example within the social sciences, especially when variables are intangible or difficult to measure (e.g., intelligence; see references in Pearl, 1995), as well as within artificial intelligence (Pearl, 2019). Grace and colleagues have done tremendous work to adapt causal theory for use in ecological studies (Eisenhauer et al., 2015; Grace, 2006, 2010, 2015; Grace et al., 2010; Grace & Bollen, 2007; Grace & Irvine, 2020; Pugesek & Grace, 1998). A key development of this work was the translation of causal principles to be used in observational studies, whereas Pearl's original theory was developed specifically for interventionist experiments (i.e., laboratory experiments) where variables can be controlled and manipulated to establish and quantify "true" causal relationships (Pearl & Verma, 1995). However, through knowledge of a study system and careful development of causal structures (graphs) at different levels of detail, these ecological studies have relaxed this interventionist constraint; results from observational studies are then not interpreted strictly as causal inferences, but instead as estimates of causal relationships.
Despite this more limited interpretability, the use of causal structures in ecological studies has contributed substantive new knowledge about system dynamics in several settings (Eisenhauer et al., 2015; Grace, 2010; Grace et al., 2016). Within evolutionary biology, use of causal structures has been limited to the application of structural equation models to quantify genotype-phenotype relationships (Li et al., 2006; Otsuka, 2014; Scheiner et al., 2000). Here we will use the two higher order causal structures-structural equa-

| IMAGINING CAUSALITY FOR EARTH–LIFE SCIENCE
As discussed in the beginning of this article, Earth's landscape is dynamic and shaped by many processes. This presents two challenges when working to link evolutionary patterns with underlying geological process(es). The first is that earth processes are interrelated and are therefore often co-occurring (Figure 3a). Looking into the landscape history at many (perhaps most) locations on Earth will reveal that several aspects have changed over a given evolutionary period. The co-occurrence of processes means that using the age of an evolutionary event (e.g., a lineage divergence or bottleneck) is insufficient to discern which aspect of the changing landscape caused a pattern (Dolby et al., 2015, 2019), a phenomenon known as pseudocongruence (Feldman & Spicer, 2006; Lapointe & Rissler, 2005; Riddle & Hafner, 2006; Soltis et al., 2006). In such cases, population genomic data have a benefit over phylogenetic data because they not only provide information in a spatial dimension that matches the spatial nuance of the landscape, but can also be assayed for population effects and signs of local adaptation, particularly when whole genome data are used. Because some types of landscape change (e.g., differences in precipitation due to monsoon or precession cycles) are expected to drive adaptive divergence, and physical barriers may be expected to produce more "neutral" or nonadaptive divergence, the ability to interrogate both neutral and functional elements of the genome is potentially powerful. In this approach it is the structuring of types of information spatially between genetic and landscape features, rather than the coincidence of similar timings, that causally links the two systems.
The second (but related) challenge to linking geological processes with evolutionary patterns is that many of the most noticeable physiographic features of the landscape are in fact aggregations of collinear variables. Examples of aggregate features include mountain ranges, latitude and bathymetry that in the literature are long thought to control diversification and species richness patterns (Colwell & Hurtt, 1994; Hodkinson, 2005; Hoorn et al., 2010, 2013; McClain & Etter, 2005; Rabosky et al., 2018; Stevens, 1989). It is almost certain that these features are causal to the generation and/or distribution of biodiversity. However, it is the collinearity of direct variables such as temperature, precipitation and solar insolation within these aggregate features that makes it difficult to determine which of the variables exert causal control over a biological pattern (Rahbek, Borregaard, Antonelli, et al., 2019; Table 1).
To give an example, if formation of a mountain range leads a lineage to diverge, is that divergence due to differential adaptation to the gradient in atmospheric oxygen or UV burden or temperature? Or was it because the lineage was physically isolated by peaks or valleys? The distinction here matters because if it is due to a temperature gradient then there are many other instantiations of that gradient on Earth, such as across latitudes or from hydrothermal vents (Figure 4). It may seem trivial, but pinpointing the direct variable in this case would inform not only what we understand the external agents shaping evolution in the setting to be but also would direct what hypotheses to test in other settings to determine that variable's impact more broadly. If we return to the example of lineage divergence over a mountain, there is yet a trickier issue to contend with. In this example, if it was instead found that a lineage diverged due to physical isolation by ridges or valleys, would we say the mountain is causal to that divergence? Or are the rivers that did the work incising topography to form those ridges and valleys causal? Or is climate causal because, in a region devoid of rainfall, there would be no water to flow into streams to incise the topography? Or are they causally inseparable? There may not be a simple answer, but in the next sections we will see how causal structures can be used to represent complex networks of interactions to aid our thinking and discussion of complex causal networks.

| The problem of epiphenomena
An epiphenomenon is a byproduct or an associated effect of the phenomenon of interest. Examples of epiphenomena mentioned above include temperature, precipitation and solar insolation, which are direct variables found in different collinear combinations within aggregate features (Figure 4; Table 1). The relevance of epiphenomena when working to establish causal relationships is clear: epiphenomenal variables can be easily confused with true causal variables and lead to spurious inferences (Gould & Johnston, 1972). If A is causal to B and C co-occurs or covaries with A, then it may be incorrectly inferred that C is causal to B; or, that C is causal to A and B, or both A and C are causal to B (Figure 3c). This is particularly problematic when variable C is easier to observe or measure on the landscape than variable A, in which case A may be overlooked.

FIGURE 2 Summary of causal structures. (a) Causal structures defined in Grace et al. (2012) from most general (top) to most specific (bottom). Structural equation meta-models (SEMMs) are conceptual networks that describe the higher level, generalizable theory of a system. Causal diagrams (CDs) are more specified and serve to bridge higher level theory to a study of interest; CDs play a pivotal role in translating theory to the design and interpretation of a study and vice versa. The most detailed level is structural equation models (SEMs), which convey the variables measured in a study, their causal relationships and the statistics used to quantify their paths. (b) An example of how a multiple regression (A, B, C onto X) could be translated into a causal diagram that would support compound pathways (A → B → X) to allow for more nuanced depiction of a system. In this example, by not allowing compound pathways the multiple regression might overemphasize the importance of variable B on X relative to the causal diagram, or the role of B may be oversimplified.

FIGURE 3 An example of how co-occurring geoclimatic processes can interfere with the ability to accurately identify which changes in the landscape initiated a biological pattern of interest, in this case speciation of lineages in the southwestern USA. (a) Depiction of the geoclimatic events thought to have occurred over the time period when most lineages diverged (circle vs. star; Dolby et al., 2019). Stippling represents boundary uncertainty, and a question mark denotes a boundary of unknown age. Panels i-iv show cartoon representations of the geographical extent of each process. (b) A toy phylogenetic tree to show the pattern of lineage divergence for desert tortoises (Edwards et al., 2016); note that a divergence time of 5 million years roughly correlates with three of the four major processes (monsoon, river formation and flooding of the Gulf of California in (a)). (c) Causal diagrams showing how a true causal relationship can be misinferred due to collinear variables if not all variables relevant to the system are considered. Arrows depict causal relationships. The dotted circle represents an unsampled variable.

In many statistical tests, including those common to phylogeography or macroecology (e.g., testing isolation by distance or spatial associations), "no pattern" (randomness) is often used as a null hypothesis. However, we know that the distributions of species or relatedness of populations are rarely, if ever, truly random. This poses a particular risk in the context of epiphenomena and pseudocongruence. If more than one aspect of the landscape has changed in a study region, or there is collinearity amongst direct variables, then if not all relevant features or variables are tested it is possible that whatever pattern is detected, being nonrandom, will be interpreted as support for the experimental hypothesis even if the tested variable is not the "true" causal variable. This concept is well established, as many researchers have emphasized the importance of thoughtful hypothesis testing (Hickerson, 2014; Peterman & Pope, 2021). However, it becomes even more critical in the context of complex geoclimatic settings and when working to establish causal relationships between earth processes and evolution.
For instance, vicariant barriers are often more obvious features on the landscape than ecological or climatic factors. In the western USA and Mexico, it was thought for decades that river and seaway barriers drove diversification of dozens of desert species, but recent work has highlighted the importance of less visible climatic phenomena as perhaps equally or more impactful (Dolby et al., 2015, 2019; Ornelas et al., 2018; Valdivia-Carrillo et al., 2017). These findings are important because they change the hypothesized causal structure: they shift our understanding of what processes are important for shaping species diversification or distributions (Figure 3). One quickly comes to appreciate how misassigning causality in many smaller individual studies can bias our understanding of the external controls on species diversification and distributions more broadly. When contending with epiphenomena, considering the complexity of the geoclimatic setting is paramount; causal structures can help diagram these complexities.
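The A, B, C scenario described in this section can be illustrated with a small simulation. The sketch below assumes a simple linear system in which A causes B, and C is a noisy correlate of A with no effect on B. These functional forms, the sample size and the variable names are illustrative assumptions, not a model of any particular landscape. It compares the raw correlation between C and B with the partial correlation after accounting for A.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def partial_corr(x, y, z):
    """Correlation of x and y after controlling for z (first-order partial)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5


rng = random.Random(42)
n = 5000
A = [rng.gauss(0, 1) for _ in range(n)]   # true causal variable
B = [a + rng.gauss(0, 1) for a in A]      # biological pattern: A -> B
C = [a + rng.gauss(0, 1) for a in A]      # covaries with A, no effect on B

r_cb = pearson(C, B)                      # looks like support for "C causes B"
r_cb_given_a = partial_corr(C, B, A)      # association once A is considered

print(f"corr(C, B) = {r_cb:.2f}; corr(C, B | A) = {r_cb_given_a:.2f}")
```

If A is never sampled, the clearly nonrandom association between C and B could be read as support for C; once A is included, that association largely disappears in this toy system.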

| An introduction to causal structures
Before defining causal structures in detail, let us explain how they help to meet the challenges outlined in the last section. There is an increasing need for new theory to bridge the earth and life sciences (Antonelli, Ariza, et al., 2018). A primary strength of causal structures is that they simultaneously facilitate data analysis and theory development by forcing an explicit consideration of the variables relevant to a system and, importantly, their relationships (Grace et al., 2012; Pugesek & Grace, 1998). Causal structures are represented via directed acyclic graphs (Pearl, 1995, 1998) and depict cause-effect relationships at different levels of detail (sensu Grace et al., 2012) that serve different purposes. Causal structures rely on the visual representation of concepts or variables as networks (Figure 2b), which allows for the direct comparison of these relationships across different systems or studies. For instance, there are countless studies that test whether a river acts as a barrier to gene flow (Balao et al., 2017; Dolby et al., 2019; Lugon-Moulin et al., 1999; Naka & Brumfield, 2018; Peres et al., 1996; Vechio et al., 2020; Weir et al., 2015) and
CDs are the level at which most phylogeographical studies take place. At the most detailed level are SEMs, which are fully specified models that convey the precise variables measured in a study, their causal paths and the statistics used to quantify those paths. An SEM is a testable causal hypothesis for a given system, whereas a causal diagram can be specified into different SEMs based on the design of a study. SEMs can be used to quantify the pathways proposed in a CD, and testing hierarchically nested SEMs can be used to determine the level of complexity necessary to describe a system (Grace et al., 2016). The application of SEMs to Earth-life science is a worthwhile topic that requires its own consideration and will not be discussed further here.
A detailed review of causal structures and their implementation is found in Grace et al. (2012).
A starting point to defining causal structures for a system is to ask, "What variables are relevant?" When drawing connections (edges) between variables (nodes) it becomes evident that some intermediary variables are missing if a parent variable does not have direct or complete causal influence on its child. A primary strength of causal analysis is its representation of system complexity in the form of compound paths (not just A → C, but A → B → C). This is because causal analysis is based on a network graph (e.g., Figure 4). These direct variables are easily measured, and importantly, they exert a measurable effect on an organism's biology through documented and/or quantifiable mechanisms (Table 2). For example, temperature is known to impact the energy invested in behavioural or physiological thermoregulation (e.g., finding shelter/shade, shivering, sweating), enzyme activity (Feller, 2010; Low et al., 1973; Peterson et al., 2007) and mutation rate (Berger et al., 2017; Garcia et al., 2010; Matsuba et al., 2012). These effects differ from those expected in response to an oxygen gradient, which instead include changes in oxygen-haemoglobin binding affinity (Miao et al., 2017) and haemoglobin concentration (Simonson et al., 2010). Still different patterns are expected from differential adaptation to UV burden, such as divergent responses within UV radiation receptor pathways (e.g., mediated by UVR8; Tossi et al., 2019) and the induction of protective phenylpropanoids in plants (Zeng et al., 2020). While observations of the richness or relatedness of populations in a geographical/geological context are important, more detailed assays into the physiology or the genome (e.g., to assess adaptation) may be necessary to answer many causal questions.
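The idea of compound paths (not just A → C, but A → B → C) can be made concrete by storing a causal diagram as a directed graph and listing every route from a cause to an effect. The sketch below is a hypothetical diagram loosely based on the mountain example discussed earlier; the node names and edges are assumptions made for illustration and are not the diagram from any figure in this article.

```python
def all_paths(graph, source, target, _prefix=()):
    """Yield every directed path from source to target in an acyclic graph."""
    path = _prefix + (source,)
    if source == target:
        yield path
        return
    for child in graph.get(source, ()):
        yield from all_paths(graph, child, target, path)


def is_acyclic(graph):
    """Return True if the directed graph has no cycles (depth-first search)."""
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # reached a node already on the current path
        state[node] = "visiting"
        ok = all(visit(child) for child in graph.get(node, ()))
        state[node] = "done"
        return ok

    return all(visit(node) for node in list(graph))


# Hypothetical causal diagram: each key lists the variables it directly affects.
diagram = {
    "uplift": ["topography"],
    "climate": ["river incision"],
    "topography": ["river incision", "temperature gradient"],
    "river incision": ["ridges and valleys"],
    "ridges and valleys": ["physical isolation"],
    "physical isolation": ["lineage divergence"],
    "temperature gradient": ["lineage divergence"],
}

paths = list(all_paths(diagram, "uplift", "lineage divergence"))
for p in paths:
    print(" -> ".join(p))
```

In this toy diagram uplift has no direct edge to lineage divergence; it reaches it only through intermediary variables, one route via physical isolation and another via the temperature gradient, which is the kind of distinction a single regression of divergence on "mountain" would not expose.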
Often not all variables can be evaluated in a study. sensu Dawson, 2014) to isolate individual effects similar to controlled laboratory experiments (Dawson, 2014;Gould & Johnston, 1972;Morris, 1995

Note: This is not an exhaustive list and effects will vary by taxon.
If so, this could be a "rule" that describes a fundamental property of how earth processes shape life.

| Applying causal structures
In this section we will show how SEMMs and CDs can be applied to Earth-life systems and what we can learn from their application.
Over the past decade, mountain ranges have garnered tremendous attention as putative generators of biodiversity (Antonelli, Kissling, et al., 2018; Hoorn et al., 2010, 2013; Rahbek, Borregaard, Antonelli, et al., 2019). This comes from observations that many mountain ranges have high numbers of species, which suggests they either accumulate biodiversity or promote the origination of lineages in situ. This has led researchers to ask, "What is it about mountains that leads to high diversity?" According to our criteria, a mountain is an aggregate feature: it is decomposable into a suite of direct causal variables. Using our framework here we might more directly ask, "What are the direct causal controls on biodiversity within aggregate mountain systems?" Work by Antonelli, Kissling, et al. (2018) proposed the main controls on species richness in montane settings to be soil heterogeneity, temperature, and precipitation, which is depicted in a causal diagram in Figure 6a. Focusing on soil diversity, the authors proposed that soil diversity was due to lithological diversity.
Others proposed that the entrainment, uplift and exposure of partially melted oceanic crust at subduction zones provides key nutrients or leads to the development of specific soil types that require specialized adaptation for organisms to inhabit (e.g., serpentine soils; Rahbek, Borregaard, Antonelli, et al., 2019). We can make a more detailed causal diagram for the controls on soil heterogeneity based on this hypothesis. We know that soil formation would depend on the rate of erosion and exposure of the bedrock, which involve several variables (Figure 6b), and the presumed causal relationships amongst these are described in Table 3. Drawing these diagrams teaches us two main lessons.
The first lesson is that the system dynamics detailed in Figure 6b lead to several predictions. Prediction one is that there should be a correlation between soil diversity and species richness; indeed, Antonelli, Kissling, et al. (2018) showed this relationship, but it could be tested further, for example over different spatial scales. The second prediction is that mountains formed by subduction of oceanic crust should have higher richness than mountains at continent–continent collisions, which have more silica-based lithology with lower concentrations of iron- or magnesium-bearing minerals. For example, all else being equal, the Himalaya should have lower richness than the Andes mountains, and this is also consistent with their findings (Antonelli, Kissling, et al., 2018; Rahbek, Borregaard, Antonelli, et al., 2019). The third prediction is that the variability of mineral composition of oceanic crust (e.g., Arevalo & McDonough, 2010) may manifest an effect on species richness; perhaps soils could be compared from hotspot-driven ocean islands vs. subduction-driven mountains to test this prediction. Lastly, but most importantly, erosion and exhumation rates along or between mountains should play a key role because lithosphere is the first rate-limiting step in providing fresh material from which soils can form. The diagram helps us hypothesize that mountains with higher erosion rates due to faster uplift rates or greater precipitation might lead to faster soil generation and higher richness. Indeed, Antonelli, Kissling, et al. (2018) found an effect of erosion rate on richness, but more detailed study could better constrain this relationship in different regions, perhaps to decouple a signal of climate from a signal of uplift. The relationship could also be tested at different spatial scales.
From this logic, it also follows that because erosion rate is coupled to uplift rate, when uplift slows, the tectonically controlled rate of soil formation may decrease and therefore richness may also decrease. If this were true, we would expect a normal distribution of richness over time that mirrors the life of the mountain itself (Figure 6c). Alternatively, there could be a latency period in which growth of topography leads to abiotically driven soil formation, but after some time the biological community contributes to or becomes the main generator of soil. If so, the biotic system would have entered a self-perpetuating state where nutrients are recycled by the biotic community and become decoupled from, and are no longer controlled by, the exhumation or erosion of the bedrock (Figure 6d).
This second scenario implies: (i) the causal control on diversity shifts from abiotic to biotic at some critical threshold; and (ii) mountains "launch" biological diversity but diversity maintains diversity. These speculations would require explicit testing but offer predictions against which new observations can be compared (Figure 6c vs. d).
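The two expectations contrasted here (richness that tracks the life of the mountain vs. richness that is retained after a switch to biotic control) can be sketched numerically. Everything in the block below is an assumption made for illustration: the bell-shaped relief curve, the threshold value of 0.6, the arbitrary time units, and the rule that richness never declines once biotic control begins. It is a way of stating the two predictions explicitly, not a model fitted to data.

```python
import math


def topography(t, peak_time=50.0, width=15.0):
    """Hypothetical relative relief (0 to 1) over the life of a mountain."""
    return math.exp(-((t - peak_time) ** 2) / (2 * width ** 2))


def richness_abiotic(times):
    """Scenario (c): richness tracks abiotic soil formation, hence relief."""
    return [topography(t) for t in times]


def richness_with_biotic_takeover(times, threshold=0.6):
    """Scenario (d): after richness reaches a threshold, biotic control
    retains diversity even as the topography erodes."""
    out, biotic_control, retained = [], False, 0.0
    for t in times:
        r = topography(t)
        if not biotic_control and r >= threshold:
            biotic_control = True   # switch from abiotic to biotic control
        if biotic_control:
            retained = max(retained, r)
            r = retained            # diversity maintains diversity
        out.append(r)
    return out


times = list(range(0, 101, 5))      # arbitrary time units
scenario_c = richness_abiotic(times)
scenario_d = richness_with_biotic_takeover(times)

print(f"final richness, abiotic only: {scenario_c[-1]:.3f}")
print(f"final richness, biotic takeover: {scenario_d[-1]:.3f}")
```

Under these assumptions the two scenarios are indistinguishable while the mountain is growing and diverge only as the topography decays, which suggests that old, eroding ranges would be the informative places to compare observations against the two predictions.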
In  (Christian & Lewis, 1997). Upwelling of these nutrient-rich bottom waters in coastal areas causes high biomass and diversity (Pauly & Christensen, 1995), such as in kelp forests off the western coast of the Americas (Winkler et al., 2017). Likewise, waters of the open ocean are often nutrient-depleted and the blowing of aeolian dust from land and deposition of dust from icebergs into oligotrophic waters can bring trace elements, particularly iron, that fuel the patchy increase of biomass and productivity (if not diversity per se; Moore et al., 1984; Aumont et al., 2008; Raiswell et al., 2008; Maher et al., 2010). They are conceptually linked, even though they are usually separated in practice. A second interpretation from Rahbek, Borregaard, Antonelli, et al. (2019) proposed that lithological heterogeneity could lead to local soil characteristics that require special biological adaptations to inhabit (e.g., serpentine soils). This could lead to speciation by differential adaptation, thereby increasing richness. Another SEMM (Figure 6f) contextualizes this idea to formalize patchiness of abiotic conditions as another phenomenon that links terrestrial and marine systems. In marginal marine environments, work has shown that the steepness of continental shelves can restrict and isolate habitat types, leading to isolation of habitat patches, population divergence and potentially high richness (Dolby et al., 2018, 2020) as well as demographic changes (Stiller et al., 2020). By drawing this SEMM we again see that patchiness of minerals (due to lithological heterogeneity) or patchiness of land steepness are conceptually related. The main difference is that pathway 1 only implies physical isolation, although differential adaptation is possible, whereas the assumption of pathway 2 requires differential adaptation and would best be tested with genomic or common garden methods (Figure 6f).

FIGURE 6 [...] Table 3. Other relationships are possible with proper justification. Discussion of these variables and their paths is a natural point of collaboration across disciplines and study systems. (c) The expectation if species richness (green) depends on soil diversity and soil diversity is entirely abiotically generated. It would follow the birth, life, and death of montane topography. (d) Proposed expectation of species richness if a critical threshold is reached at which point soil formation switches from abiotic control (AC) to biotic control (BC) and is therefore retained following erosion of the topography. (e) A structural equation metamodel (SEMM) of how the lithological diversity (which generates soil diversity) hypothesized by Antonelli, Kissling, et al. (2018) is comparable to other abiotic processes that control nutrient fluxes. (f) An SEMM of how habitat heterogeneity (Rahbek, Borregaard, Antonelli, et al., 2019), fuelled by soil/lithosphere patchiness, could lead to genetic divergence through differential adaptation. This mechanism is comparable to population isolation due to the patchiness of marginal marine habitat caused by heterogeneous morphology of continental shelves (Dolby et al., 2020), which is expected to produce more nonadaptive divergence. Blue denotes marine processes and pink denotes terrestrial processes. Graph conventions follow Grace et al. (2012).

degree diversity and diversification patterns are only explainable through the processes of biology itself. One could imagine extend-
Employing these structures and an understanding of causal systems may be a way to formally bridge ecology, evolution and geology.
More work is needed.

Note: An alternative to path 11 can be proposed that instead connects ruggedness to species richness. Path 11 was proposed by Antonelli, Kissling, et al. (2018) but routing the pathway instead through ruggedness suggests that erosion does not directly affect species richness but does so indirectly.

| CONCLUSIONS
TABLE 3 Explanation of relationships used to justify pathways drawn in Figure 6

DATA AVAILABILITY STATEMENT
No data were generated for or used in this paper.