Integrated human-earth system modeling—state of the science and future directions

Research on humans and the Earth system has historically occurred separately, with different teams and models devoted to each. Increasingly, however, these communities and models are becoming intricately linked. In this review, we survey the literature on integrated human-Earth system models, quantify the direction and strength of feedbacks in those models, and put them in the context of other, more frequently considered, feedbacks in the Earth system. We find that such feedbacks have the potential to alter both human and Earth systems; however, there is significant uncertainty in these results, and the number of truly integrated studies remains small. More research, more models, and more studies are needed to robustly quantify the sign and magnitude of human-Earth system feedbacks. Integrating human and Earth models entails significant complexity and cost, and researchers should carefully assess the costs and benefits of doing so with respect to the object of study.


Introduction
Historically, much of the research on humans and the Earth system has occurred separately, with different teams and models devoted to each (Moss et al 2010). Links between the two communities are increasing, however, as (1) the questions being asked of researchers increasingly involve the intersection of human and Earth systems, and (2) the intricate links between these systems are increasingly recognized. For example, Palmer and Smith (2014) argued that feedbacks between humans and the environment must be modeled, as human decisions are now being altered due to changes in the physical system. In a similar vein, Motesharrei et al (2016) posited that 'Earth System Models must be coupled with Human System Models through bidirectional couplings representing the positive, negative, and delayed feedbacks that exist in the real systems.' Voldoire et al (2007) noted that two-way feedbacks with respect to land use need to be investigated due to the strength of climate-vegetation feedbacks at different spatial scales, thus requiring the incorporation of socioeconomic models into Earth system models (ESMs).
The potential for human-system changes to exert a feedback effect, however, remains largely unexplored, as by definition doing so requires some form of integrated human-earth system modeling (van Vuuren et al 2012). This is in contrast to other earth system-only feedbacks that have been explored and quantified in ESMs for decades. Early work, for example, looked at cloud (Cess et al 1989) and boreal fire (Randerson et al 2006, Stocks et al 1998) feedback effects, while more recent studies have explored permafrost (Koven et al 2015) and ocean (Randerson et al 2015) climate-carbon feedbacks, as well as albedo changes from snow (Qu and Hall 2014) and land-use change (Jones et al 2015). Understanding such feedbacks yields insight into the coupled carbon-climate system and its likely future evolution (Arora et al 2013).
Human-earth system feedbacks, in contrast, remain largely unstudied, even though the ability to model such coupled human-environment systems (and the feedbacks between them) is fundamental to fields such as vulnerability analysis (Turner et al 2003), for example with respect to vulnerable social-ecological systems such as agricultural drylands (Fraser et al 2011). In this and many other applications, vulnerability assessments must incorporate nonclimatic (human) factors and adaptive behavior (Füssel and Klein 2006). In addition, explicitly treating human-earth system feedbacks is valuable because the alternative is an implicit model, with hidden assumptions and untested internal dynamics (Epstein 2008). Formalizing such feedbacks is the mark of a developing field of research, and doing so has both heuristic and exploratory modeling benefits, even if the predictive value of the coupled model can never be entirely resolved (Oreskes et al 1994).
This review on integrated human-Earth system modeling is intended to help synthesize the state of the science, coordinate communication of these efforts, and articulate challenges and successes among different research teams. It embraces modeling efforts spanning a wide range of spatial and temporal scales, but focuses on quantitative, large-scale spatial models. We recognize that more qualitatively oriented models may sacrifice precision for accuracy (Matthewson and Weisberg 2009, Dowlatabadi 1995), and in so doing be valuable in distilling expert opinion and likely policy directions. Many biogeochemical and land-use processes, however, require a fine-grained spatial resolution and substantial numerical, temporal, and spatial precision, and it is in this direction that the field has moved over the last two decades (Moss et al 2010).
Before reviewing the state of the literature, we must first define 'integrated human-earth system modeling'.
Human models. The term 'human model' encompasses many different categories of models, including economic models and social system models. Our review found articles using integrated assessment models (IAMs), simple economic growth models, and empirical/statistical models. We include any modeling system with a human decision-making component. However, we exclude models that have human-relevant management layers, but not human responses (e.g. crop-climate models).
Earth models. ESMs are typically considered to be models that include feedbacks between the physical climate system and global-scale biogeochemistry (Flato 2011), in particular with an interactive carbon cycle component (Hurrell et al 2013). This review takes a broader view, however: we also include articles using general circulation models, earth system models of intermediate complexity (EMICs), emulators, and land surface models.
Integrated. When defining 'integrated', two primary distinctions arise: (1) how the model is developed, and (2) the type of coupling. Voinov and Shugart (2013) noted that there are different ways of developing integrated models. They used the term 'integral' to refer to models that are built 'as a whole' and 'assemblage' to refer to models developed by coupling together existing models. Here 'integrated' simply means two models exchanging information in a two-way coupling of some sort (see 2.2 below).
Integrated human-earth models are designed to explore and answer questions related to possible feedbacks between human and natural systems. For example, Thornton et al (2017) examined how feedbacks between climate and land will alter energy, agriculture, land and carbon in the future, while Beckage et al (2018) explored how interactions between the risk of extreme events and behavior could alter future climate change. Van Vuuren et al (2012) identified several areas of research where they thought integrated models were needed (cooperation type D in their review), including (1) climate-land use interactions, (2) water use and drought, and (3) mitigation to prevent specific local climate effects. Such questions have given rise to various efforts around the world to integrate these modeling tools and research efforts. These efforts are still in the nascent stages, with different models and studies around the world. Coordination among these efforts could help advance the science, as there is much that teams could learn from each other.
A number of insightful reviews of linked human-natural system models have been previously performed. Van Vuuren et al (2012) assessed the strengths and weaknesses of different coupling approaches and discussed guidelines for their use, which depends on the scientific question being examined. Because different types of IA-ESM coupling are possible, van Vuuren et al (2012) surveyed a range of such questions and research areas, suggesting appropriate coupling types for each, and noted possibilities for simplification and the importance of uncertainty. Verburg et al (2016) took a similar tack, focusing on modeling methods appropriate to the Anthropocene (Crutzen 2006), crucial characteristics of which include 'societal influences and interactions with natural processes, feedbacks and system dynamics, tele-connections, tipping points, thresholds and regime shifts.' They proposed model categories (see table 1 in that publication) and assessed the scientific domains most appropriate for each. Krey (2014) provided an overview of energy-economic models, including IAMs, and categorized models based on their system boundaries (degree of integration), level of detail, and mathematical underpinnings. Recent developments in these models tended to expand the system boundaries and increase the heterogeneity of the various components represented, and increasingly challenging questions are being asked, requiring 'coupling of IAMs to hydrological models and possibly earth system models to study feedbacks among these systems' (Krey 2014). Similarly, Weyant (2017) surveyed IAMs, dividing them into those focused on benefit-cost analysis and those focused on detailed process representations. Weyant noted that emerging challenges for IAMs include which climate change impacts to include, how to represent extremes, and how to capture feedbacks both within the human system and between the human and Earth system.
Müller-Hansen et al (2017) reviewed different approaches to representing human decision-making in ESMs. Surveying the many and diverse modeling approaches (from game theoretic frameworks to network models to representative-agent modeling), they concluded that many of these approaches may be overly complex: 'If behavioural patterns are not expected to change over the relevant timescales or feedbacks between natural and social dynamics are sufficiently weak, modelers can simply use conventional scenario approaches.' Thus Müller-Hansen et al (2017) echoed the warning of van Vuuren et al (2012) that modelers need to carefully assess the complexity needed by the question, and feedback domain, being examined. In a more limited spatial domain, Monier et al (2017) argued that modeling the Earth system in northern Eurasia requires IAM modeling, and surveyed model studies in that area.
Zvoleff and An (2014) examined different approaches to modeling human-landscape interactions, including statistical methods, spatial analysis methods, simulation models, and mixed method approaches (including IAMs). The authors identified strengths and weaknesses of each approach, concluding that the most appropriate tool depended on the question asked.
Finally, Bonan and Doney (2018) reviewed the representation of 'life' in ESMs, noting that 'climate change must be studied in terms of a myriad of interrelated physical, chemical, biological, and socioeconomic processes.' The authors noted that the inclusion of the biosphere in ESMs enabled studies of impacts and vulnerability (e.g. ESMs with embedded crop models can assess the implications of climate change on agricultural yield), but also that further integration of ESMs and impacts models is needed.
This review distinguishes itself from the above efforts by (i) focusing exclusively on modeling frameworks that feature two-way information flow between the component models, meaning that they are truly capable of examining human-Earth system feedbacks; (ii) attempting to synthesize and quantify, relative to models with no human-Earth system coupling, findings about the strength of these feedbacks; and (iii) situating those feedbacks within the context of other, more frequently considered, feedbacks in the Earth system.

Search method
We use a Web of Science search to identify articles for this review. We use three different types of search: keyword searches, articles cited by a particular study, and articles citing a particular study (see table 1). For the keyword search, we use four sets of terms. The selection of terms was an iterative process, as some combinations produced few articles, while others produced thousands. For example, a search of 'integrated human-Earth system modeling' in quotations results in only a single article. At the other extreme, searching for 'human' AND 'natural' AND 'model' returns more than 30 000 results, as 'natural' does not necessarily limit the results to Earth system or climate studies. For the cited-by and citing searches, we identify two key articles that we know are relevant to this review. We then look at all of the articles that either cite those papers or are cited by those papers. For each search, we save all metadata (e.g. title, journal, digital object identifier, etc.) provided by Web of Science, as well as information on the type of search, the search terms and the search date. These searches result in 378 total articles, and 346 distinct articles. We supplement our Web of Science searches with 11 additional articles cited in the original set of papers, identified in unrelated searches, and forwarded to us by colleagues. A full list of articles is included in the supplementary material available at stacks.iop.org/ERL/13/063006/mmedia.
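The step from total to distinct articles can be illustrated with a minimal sketch. It assumes, purely for illustration, that each saved record carries a digital object identifier and that duplicates are detected by comparing normalized DOIs; the `SearchHit` class, the helper names, and the example records are hypothetical and are not taken from the article database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchHit:
    """One saved search result (hypothetical record layout)."""
    doi: str
    title: str
    search_type: str   # "keyword", "cited_by", or "citing"
    search_terms: str
    search_date: str


def normalize_doi(doi: str) -> str:
    # DOIs are case-insensitive, so compare them lowercased and trimmed.
    return doi.strip().lower()


def deduplicate(hits):
    """Keep the first hit for each DOI, preserving the original order."""
    first_seen = {}
    for hit in hits:
        key = normalize_doi(hit.doi)
        if key not in first_seen:
            first_seen[key] = hit
    return list(first_seen.values())


# Made-up records: the same article is found by two different search types.
hits = [
    SearchHit("10.0000/example.1", "Paper A", "keyword", "human AND earth system", "2018-01-01"),
    SearchHit("10.0000/example.2", "Paper B", "keyword", "human AND earth system", "2018-01-01"),
    SearchHit("10.0000/EXAMPLE.1 ", "Paper A", "citing", "key article 1", "2018-01-02"),
    SearchHit("10.0000/example.3", "Paper C", "cited_by", "key article 2", "2018-01-02"),
]
distinct = deduplicate(hits)
print(len(hits), "total articles;", len(distinct), "distinct articles")
```

Keeping the first occurrence preserves the metadata of the search that originally found each article; other choices, such as merging the search types per article, would be equally reasonable.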

Criteria for inclusion/exclusion in review
We did an initial scan of all 357 articles identified in our searches, eliminating articles that were obviously unrelated from their titles and/or abstracts. We read all remaining articles. As mentioned in section 1.2, several different types of 'human' and 'Earth' models, as well as several different methods of 'integrating', were used in the various articles found in our searches. In this review, we use the broadest definitions of 'human' and 'Earth' system model allowable. That is, we include any articles with some representation of human behavior AND Earth system processes. Two coupling types arise in the reviewed articles: 'one-way' and 'two-way'. For one-way coupling, information is passed from one model to another, but no information is returned (e.g. Johns et al 2011). The focus of this review, however, is on two-way coupling that enables study of human-Earth system feedbacks, and thus we only include models and modeling systems with two-way feedbacks between the human and Earth system or that provide information relevant to creating a system with a two-way feedback (see section 2.3.1).
All other studies are excluded. In general, these excluded articles either do not include any modeling or describe/use a single component model (e.g. Hartin et al 2015). We also exclude gray literature and non-English sources. The excluded articles are reported in the total number of articles, as well as in figure 1 and the article database (see supplementary material). However, these articles are excluded from the rest of this review.

Categorization of articles
Categorization of included articles
We categorize the included articles into five major categories: Integrated Model, Review, Linking Tool, Commentary, and Coupling Example. Integrated Model articles describe or use an integrated human-Earth model. Review articles provide a review on a related topic (e.g. component models or systems). Additionally, early reviews on human-Earth system modeling (e.g. van Vuuren et al 2012) are included in this category. Linking Tool articles describe tools and/or processes that could facilitate human-Earth system modeling. Many of these examples describe methods for going from one spatial/temporal scale to another.
Commentary articles provide commentary related to human-Earth system modeling. Some are commentaries in the strict sense; others are studies of related topics (e.g. one-way coupling of human-Earth systems) that make statements related to integrated human-Earth system modeling. The final category is Coupling Example. These articles describe examples of coupled modeling frameworks; however, the coupling does not include two-way feedbacks between human systems and Earth systems. For example, Howells et al (2013) couple models of land, energy, and water, but incorporate climate information in a one-way fashion. All included articles are incorporated throughout this review; however, only articles from the first category (Integrated Model) are included in the figures and tables in section 3.

Categorization of integrated models and studies
For each modeling system included in the review, we saved information about the Earth component, the human component, and the data exchanged (see table 2). We categorize the studies using a particular modeling system based on the spatial domain (figure 3) and the degree of complexity in the human and Earth system component (figure 4). The spatial domain is the spatial extent of the analysis, typically either global, regional, or an individual country.

Data availability
All data and code used to generate all of the figures in this article, including our article database with categorizations, are available at https://github.com/kvcalvin/erl-hes-review.

Summary of literature
We found 19 articles describing or using an Integrated Model (figure 2). While the first was in 1993, 14 of these 19 articles were published since 2012 (figure 2). Earlier articles appeared in disciplinary journals (e.g.

Summary of modeling frameworks
These articles include eleven distinct integrated models or modeling systems (BNU-HESM, CSM, CLM*, DICE, GOLDMERGE, GUMBO, iESM, IGSM, IMAGE-CNRM, Jarvis, PRIMA). Table 2 shows the either as their Earth system model or in their Earth system model. There is more diversity on the human system side, although two modeling frameworks use DICE and two use variants of GCAM.
For most (eight out of 11) of the modeling systems, emissions (either CO2 or all GHGs) are passed from the human to the Earth system (table 2). In three, land use and land cover are also exchanged. The Earth system feedback is more variable: six systems exchanged temperature and three exchanged some indicator of land productivity. In some frameworks, data is exchanged in code (e.g. iESM) such that the entire system executes at once. In other modeling frameworks, data is passed between the two models manually, with exchanges happening until the modelers determine the system has converged (e.g. IGSM). In one framework, an existing land surface model was expanded to incorporate human dimensions, specifically the application of irrigation to crops (Leng and Tang 2014). The resulting model captured the implications of changes in precipitation on irrigation water demand (a human dynamic) and the feedback of those changes on runoff and evapotranspiration (Earth system dynamics).
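The exchange-until-convergence pattern described above can be sketched as a simple fixed-point iteration. This is a toy illustration only: `human_model`, `earth_model`, and every coefficient below are hypothetical stand-ins chosen so that the loop converges, and they do not represent any of the modeling systems in table 2.

```python
def human_model(temperature_c: float) -> float:
    """Toy human system: emissions (Gt C/yr) decline as warming increases.

    The baseline and sensitivity are made-up illustrative numbers.
    """
    baseline_emissions = 20.0
    sensitivity = 0.5  # Gt C/yr of emissions lost per degree C of warming
    return max(baseline_emissions - sensitivity * temperature_c, 0.0)


def earth_model(emissions: float) -> float:
    """Toy Earth system: warming (degrees C) is linear in emissions."""
    return 0.15 * emissions


def run_coupled(tol: float = 1e-6, max_iter: int = 100):
    """Pass emissions and temperature back and forth until they stop changing."""
    temperature = 0.0
    for iteration in range(1, max_iter + 1):
        emissions = human_model(temperature)      # human -> Earth
        new_temperature = earth_model(emissions)  # Earth -> human
        if abs(new_temperature - temperature) < tol:
            return emissions, new_temperature, iteration
        temperature = new_temperature
    raise RuntimeError("coupled system did not converge")


# One-way (uncoupled) reference: the human model never sees the climate.
uncoupled_emissions = human_model(0.0)
uncoupled_temperature = earth_model(uncoupled_emissions)

coupled_emissions, coupled_temperature, n_exchanges = run_coupled()
print("feedback effect on emissions:", coupled_emissions - uncoupled_emissions)
print("feedback effect on temperature:", coupled_temperature - uncoupled_temperature)
```

Comparing the coupled result with the one-way reference mirrors how feedback strength is reported in this review, namely as a change relative to a run without human-Earth system coupling. In a real framework each function call would be a full model run, which is why the number of exchanges matters for cost.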

Categorization of modeling studies
Most (13) of these articles are global in scope, but a significant number focused on large but sub-global regions (figure 3). These studies tend to examine regional-scale climate or effects such as runoff or evapotranspiration.
We further categorize the various frameworks by the level of complexity in the component models (human and Earth system), as well as the degree of coupling (figure 4). Note that this figure adds specificity to the conceptual diagram included as figure 1 in van Vuuren et al (2016) by including particular modeling systems. In addition to including the individual modeling systems from this review, we also include some model types as a point of comparison. For human modeling types, we include computable general equilibrium (CGE) models and generic IAMs as modeling frameworks with comparable human complexity to the iESM, IGSM, and IMAGE-CNRM frameworks, as these models included either an IAM (iESM and IMAGE-CNRM) or a CGE model (IGSM)⁴. We also include agent-based models (Manson et al 2012) on the figure to illustrate that more complex human models exist; Matthews et al (2007) provide a useful review of this approach as regards land use change, while An (2012) does the same for coupled human-natural systems. With respect to Earth models, we include ESMs, AOGCMs, EMICs, and Climate Emulators (e.g. MAGICC, Meinshausen et al 2011; Hector, Hartin et al 2015) as points of reference, as these are the types used in the Integrated Models in this review or are typically used in one-way IAM studies (Climate Emulators). We list DICE as having a simpler approach than even the Climate Emulators as it consisted of only a handful of equations and excluded non-CO2 effects.

Synthesis of key findings
Many of the studies quantified the implications of including human-Earth system feedbacks on either human or Earth systems or both. The sign and magnitude of feedbacks vary by variable and study; the results for five variables (CO2 emissions, CO2 concentration, global mean temperature, land productivity, and cropland area) are synthesized in figure 5 and figure S1 (see SM for detailed methods), putting each change in the context of the underlying uncoupled model results.
CO2 emissions. Year 2100 CO2 emissions in the RCPs ranged from ∼0 to ∼30 Gt C yr⁻¹; differences across RCPs were due to differences in population, GDP, technology, and mitigation effort (van Vuuren et al 2011). However, the effect of climate on emissions was excluded in uncoupled simulations. Three studies reported changes in CO2 emissions due to feedbacks. Yang et al (2015) reported negligible effects on 2005 emissions when feedbacks through climate damages to economic activity were considered. Thornton et al (2017) found that such feedbacks, as mediated through land productivity, had a modest effect on 2100 CO2 emissions, which were reduced by ∼0.7 Gt C yr⁻¹ (17%). Beckage et al (2018) found that feedbacks via changes in behavior due to perceived climate risk could alter CO2 emissions dramatically, with 2100 emissions ranging from 0 Gt C yr⁻¹ to nearly 300 Gt C yr⁻¹. In contrast, Schuur et al (2015) found that permafrost feedbacks, which are included in the CMIP5 models, resulted in between 37 and 174 Pg C being released to the atmosphere in an RCP8.5 scenario. We conclude, based on these very limited data, that human-earth system feedback effects on CO2 emissions have the potential to be substantial relative to other feedbacks in the earth system.
⁴ We do not attempt to evaluate complexity within a class of models. That is, we do not assess whether IMAGE is more complex than GCAM. Instead, both modeling systems are shown with the same level of complexity. The one exception is that we do classify PRIMA as having a more complex IAM than iESM, because that framework used a more highly resolved version of the same IAM (GCAM). Similarly, we do not attempt to determine whether one ESM is more complex than another, so CESM is at the same level of Earth system complexity as BNU.
CO2 concentration. CO2 concentrations in the RCPs ranged from 420 ppmv to 935 ppmv in 2100, with differences due to differences in CO2 emissions. Two studies reported changes in CO2 concentration due to feedbacks. Thornton et al (2017) found changes of only a few ppm in 2094 due to land productivity differences. Yang et al (2015), however, reported differences of 30 ppm in 2005 when the effect of climate damages on GDP was included. These changes, however, are trivial when compared to the effect of changing RCPs, economic activity (Canadell et al 2007), and other feedback effects in the earth system (Arora et al 2013). Note that Beckage et al (2018) likely had large changes in CO2 concentration due to human-Earth system feedbacks, but these were not reported.
Global mean temperature (GMT). Four studies quantified the effect of human-Earth system feedbacks on GMT, with two focusing on feedbacks via land productivity, one focusing on climate damages on GDP, and one on human behavior. In the land productivity cases, Voldoire et al (2007) showed an increase in GMT of 0.5 °C starting in 2000 in their integrated simulation, and this difference remained throughout the century. Thornton et al (2017), however, found negligible changes in GMT due to land productivity feedbacks. Yang et al (2015) found temperature reductions of 0.4 °C in 2005 due to feedbacks. While the Voldoire et al (2007) and Yang et al (2015) results are non-negligible, these values are relatively small compared to the effect of changes in human behavior, which have the potential to dramatically alter GMT. In particular, Beckage et al (2018) found differences in GMT of anywhere from −1.5 °C to +1.3 °C in 2100. These differences depended on 'the functional form of response to extreme events, interaction of perceived behavioral control with perceived social norms, and behaviours leading to sustained emissions reductions'. Similarly, Jarvis et al (2012) found differences in temperature due to feedbacks ranging from negligible effects to 4 °C; however, the authors calculated the feedback parameter needed to limit temperature to a pre-defined target.
The effects of human-Earth feedbacks in Beckage et al (2018) and Jarvis et al (2012) are comparable to the effect of moving from one RCP to another and much larger than other Earth system feedbacks that have been explored. For example, Schuur et al (2015) found that permafrost feedbacks in CMIP5 resulted in increased warming of 0.13 °C to 0.27 °C in 2100. Fire effects in the earth system have been estimated to exert a cooling (of −2.3 W m⁻², Randerson et al 2006) but Earth System Model runs incorporating realistic fire processes more often report a warming feedback, e.g. of +0.18 °C. In summary, the few studies reporting human-earth feedback effects on GMT reported a wide range of values, ranging from much smaller than other human or natural system feedbacks to effects as large as moving from one RCP to the next.
Figure 5. Change in key RCP variables (CO2 emissions, CO2 concentration, GMT, land productivity, cropland area) due to feedbacks. Lines are the original RCPs. Dots indicate the change due to feedbacks shown in each study. Colors indicate the RCP used for the reference calculation. For studies that were not based on the RCPs, we use the closest RCP in terms of 2100 global mean temperature rise and not the original reference scenario. For the land productivity results, 'High Pollution' is a high emissions scenario and 'Climate and GHGs only' is the same as 'High Pollution' but excluding the effects of ozone damage on crops (Reilly et al 2007). 'Paris Forever' includes the pledges from the UN COP-21 meeting but no new climate policy after 2030; '2 C' is a scenario limiting 2100 temperature to 2 °C (Monier et al 2018). See supplementary material for detailed methods.
Land productivity. Productivity effects can be quite large in human-earth feedback scenarios (figure 5). Differences in productivity in the Reilly et al (2007) study depended on inclusion of ozone damages. Climate and greenhouse gas emissions alone increased crop yields, due primarily to CO2 fertilization, while increases in ozone concentrations had a strong negative effect, resulting in significant yield declines. Thornton et al (2017), however, excluded ozone effects, resulting in a modest positive increase in productivity (∼10%). In contrast, Hartley et al (2017) found a difference in gross primary productivity due to differences in the distribution of plant functional types of between −12.7% and +11.2%. This is similar in magnitude to the feedbacks on land productivity found in Thornton et al (2017), but much smaller than the feedbacks from Monier et al (2018) or Reilly et al (2007).
Cropland area. Cropland area in the uncoupled RCPs ranged from 11 million km² to 20 million km², depending on population, income, diet, etc. One study quantified the effect of human-Earth system feedbacks on land use and land cover. Thornton et al (2017) found that these feedbacks resulted in a decrease in cropland area of ∼1 million km² (∼10%). This is much smaller than the effect of changing socioeconomic scenario (Popp et al 2017) and potentially smaller than the implications of mitigation, although those effects depend on policy context (Calvin et al 2014). In terms of climate, Brovkin et al (2013) found that land-use change could have a significant effect on regional climate in grid cells with at least 10% land use, land cover change (LULCC).
GDP. GDP measures the size of the global economy and is an important driver of future emissions; see for example Kriegler et al (2016). In the uncoupled RCPs, GDP ranged from $200-$300 trillion 2000$ in 2095 (van Vuuren et al 2011). However, these estimates exclude the effect of climate. Nordhaus (1993) found that damages from a changing climate reduced global output by $2 trillion 1989$ (∼$2.5 trillion 2000$) in 2095, i.e. ∼1%. This is much smaller than the direct effects of policy in mitigation scenarios (Clarke et al 2014). It should be noted that damage functions are highly uncertain and thus the effect on GDP is sensitive to the selection of model (Diaz and Moore 2017).
The existing literature on feedbacks in integrated human-Earth system modeling studies suggests that these feedback effects could be strongest for CO2 emissions, land productivity, and GMT. However, this finding is likely due to a sample bias. These variables are strongly correlated with the remaining two variables (CO2 concentration and cropland area), which were not reported in the studies with the largest feedback effect. In general, the precise effects vary by study, with different studies using different modeling systems and examining different feedback pathways. It is important to note, however, that figure 5 is not a comprehensive view of the potential for feedbacks to alter human and Earth systems, as the literature in this young field of research remains sparse: there are a limited number of studies using a limited number of models. In some cases, results from one-way feedback studies suggest a wide range of feedback effects and at times a different sign of the effect. For example, Nelson et al (2014) showed an increase in cropland area for the RCP8.5 scenario in 2050 of between 0 and 15% depending on the crop model, climate model, and economic model. This is in contrast with the 10% decline shown in Thornton et al (2017). Similarly, Diaz and Moore (2017) showed damages at 3 °C of between 0% and 2.5% of GDP, while Nordhaus (1993) found a change of $2 trillion (∼1% of GDP in the RCPs).
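The ∼1% figure in the GDP paragraph can be checked with a short worked calculation. The sketch uses only the numbers quoted in that paragraph; the variable names are illustrative, and the price-level ratio is simply the one implied by the two damage values rather than an independently sourced deflator.

```python
# Damages reported for 2095 (Nordhaus 1993), in trillions of dollars.
damage_1989_dollars = 2.0
damage_2000_dollars = 2.5  # the same damages expressed in 2000$

# Range of 2095 GDP in the uncoupled RCPs, in trillions of 2000$.
gdp_low, gdp_high = 200.0, 300.0

# Price-level ratio implied by the two damage figures (not an official deflator).
implied_ratio = damage_2000_dollars / damage_1989_dollars

# Damages as a share of GDP at each end of the RCP range, in percent.
share_at_low_gdp = 100.0 * damage_2000_dollars / gdp_low
share_at_high_gdp = 100.0 * damage_2000_dollars / gdp_high

print(f"implied 1989$ to 2000$ ratio: {implied_ratio:.2f}")
print(f"damages as share of GDP: {share_at_high_gdp:.2f}% to {share_at_low_gdp:.2f}%")
```

Both ends of the range round to roughly 1% of GDP, consistent with the approximate value quoted in the text.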

Limitations of current studies
Several challenges with integrated human-Earth system modeling were identified in the literature. These challenges offer both insight into the limited number of human-Earth system modeling studies and areas for future work. The challenges fall into four different categories: complexity, data, mathematical representation, and cost.

Selecting the appropriate level of complexity
Concerns about complexity were noted in several articles. For example, Nordhaus (1993) stated that '(o)n the whole, existing (climate) models, are unfortunately, much too complex to be included in economic models.' Voinov and Shugart (2013) noted that '(b)y integrating complexity we create even more complexity.' Transparency and tractability are more difficult when including feedbacks between models (Verburg et al 2016). Furthermore, coupling models with different levels of complexity presents additional challenges: it is 'less clear how to communicate information from one complexity level to other' (Voinov and Shugart 2013). This problem is one of the motivations behind EMICs (Claussen et al 2002). Such complexity complicates the interpretation, analysis, and use of such modeling systems (Voinov and Shugart 2013). Additionally, coupling of modeling systems via feedbacks 'makes models extremely sensitive to error propagation' (Verburg et al 2016). We agree with these analyses and suggest that the additional complexity introduced by coupling human and earth models needs to be carefully balanced against the potential analytical and inferential gains.

Mathematical representation of human behavior
Palmer and Smith (2014) identified two challenges: (1) 'describing how humans make decisions' and (2) 'describing the relationships between humans and the physical and biophysical components of the Earth system.' Similarly, Beckage et al (2018) found that the 'functional form of response to extreme events' was a key uncertainty in their modeling, and one with the potential to alter global mean temperature in future climate simulations. Further research into tractable, numerically stable methods of characterizing human-system dynamics is critically needed to support future coupled model experiments.

Computational cost
The development of new models is extremely expensive, and this has been identified as one reason why there are relatively few integrated models (Verburg et al 2016). The cost of running these modeling systems can also be prohibitive; one recent integrated modeling analysis (Thornton et al 2017), for example, required ∼500 000 processor hours per simulation on a supercomputer. As a result, the literature contains a limited number of scenarios, feedback mechanisms, and ensemble members. While this limited literature includes early indications of the strength of feedbacks, it is difficult to determine their robustness. Howells et al (2013) noted that 'fully integrated assessments may not always be fully compatible with the expediency that is sometimes required in policy analysis. The time required to develop and integrate the framework does not allow fast turnaround projects.' This problem is particularly acute with ESMs, motivating work with EMICs (and to a lesser extent climate emulators): 'In some EMICs, the number of processes and/or the detail of description is reduced for the sake of simulating the feedbacks between as many components of the climate system as feasible' (Claussen et al 2002). We argue that encouraging a diversity of model structures, with different levels of complexity and cost, holds the greatest potential for enabling robust future research in this area.
Data pose a further challenge. Verburg et al (2016) noted that one challenge for integrated human-earth system modeling is that datasets are 'collected by different disciplines, and different schools within each discipline concerned, and often for different purposes'. Furthermore, the authors noted that 'feedbacks are difficult to measure in reality.' Liverman and Cuesta (2008) noted that there are 'difficulties in gathering socio-economic information at global and regional scales, linking social data to satellite imagery, and forecasting human activities and policies.' We agree with these points but note that they are not unique to human-earth system modeling and can frequently be ameliorated by community synthesis and coordination work and by imaginative study designs.

Suggestions for future directions, as noted in the literature
One of the fundamental challenges in this review is the limited number of studies using coupled earth-human systems that have been published to date. More modeling systems, and studies using them, are needed in order to robustly assess the sign and magnitude of human-Earth system feedbacks. This is a significant challenge for the ESM and IAM modeling communities.
Furthermore, these systems need to include a 'diversity of approaches' (Verburg et al 2016) and a 'cross comparison of models' (Zvoleff and An 2014). However, the specific approach should use 'appropriate computational and conceptual frameworks' (Palmer and Smith 2014). 'As fully integrated models can become too complex, the appropriate type of model (the racehorse) should be applied for answering the target research question (the race course)' (van Vuuren et al 2016). Verburg et al (2016) suggest that 'plug-and-play component programming' could help address these limitations, enabling the development of tailored modeling systems and model intercomparisons. Regardless of the approach to integrating components, the coupled model should track the propagation and accumulation of errors (Verburg et al 2016).
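One simple way to reason about the accumulation of errors in a coupled loop is a linear idealization. The expression below is only a sketch under assumptions introduced here for illustration, not a method from the cited studies: an error ε enters one exchanged quantity and is returned scaled by a constant loop gain g on each coupling cycle, so that E_n is the accumulated error after n cycles. Both ε and g are hypothetical symbols.

```latex
E_n = \varepsilon \sum_{k=0}^{n} g^{k}
    = \varepsilon \, \frac{1 - g^{\,n+1}}{1 - g} \quad (g \neq 1),
\qquad
\lim_{n \to \infty} E_n = \frac{\varepsilon}{1 - g} \quad (|g| < 1)
```

Under this idealization the accumulated error converges only when |g| < 1, and it exceeds the original error ε whenever 0 < g < 1.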
Finally, there is a need for more interdisciplinary research (Palmer and Smith 2014), as well as improved communication among interdisciplinary teams (Newell 2012, Hibbard et al 2010). Community development could help facilitate this communication and interaction (Laniak et al 2013, Mauser et al 2013). Laniak et al (2013) also noted a need for openness, arguing that transparency and collaboration could foster access and innovation. In addition, better data and accompanying detailed metadata are needed (Verburg et al 2016, Palmer and Smith 2014).

Conclusions
In this review, we survey the integrated human-Earth system modeling literature, identifying, categorizing, and synthesizing 357 different articles. In our search, we find eleven different integrated models that have explored a variety of human-Earth system feedbacks, including climate-land interactions, temperature-GDP linkages, and water use-water availability effects. We categorize each of these models by the degree of complexity in their component models, finding a wide range across the literature.
We find a wide range of feedback effects across modeling studies and variables. In some cases, the effect of feedbacks was seemingly negligible. For example, the effect of climate damages on global output in Nordhaus (1993) was a mere ∼1% of RCP GDP, a change smaller than the effect of mitigation on GDP. In other cases, the effects were large: for example, Beckage et al (2018) find that global mean temperature could change from −1.5 to +1.3 °C due to feedbacks, a change as large as the effect of moving from one RCP to the next. Furthermore, we find that the human-Earth system feedbacks identified here can be much larger than other feedbacks included in more traditional ESM analyses.
More modeling systems and modeling studies are needed in order to robustly assess the sign and magnitude of human-Earth system feedbacks. More effort is also needed to understand the range of feedbacks across studies. For example, land productivity estimates varied widely across studies: is this due to the use of different models, different scenarios, or the inclusion of different factors? Additionally, many potential human-Earth system feedbacks were not explored in these studies. For example, none of the studies looks at the interactions between energy and climate. Van Vuuren et al (2012) hypothesized that one-way coupling is more suitable for modeling this particular interaction, but has this been tested?
Several challenges and open questions remain in integrated human-Earth system modeling. What level of complexity is required to evaluate particular human-earth system feedbacks? The use of complex ESMs is expensive, in both the development and the execution of scenarios. However, when relying on simpler modeling systems, researchers run the risk of missing important feedbacks and interactions. A related issue is whether researchers can use one-way coupling studies to identify potential feedbacks. These studies are easier to implement and understand; however, such studies have the potential to under- or over-estimate feedbacks in the fully coupled system. For example, ENSO is an emergent property of a coupled atmosphere-ocean model and would not be found in either atmosphere- or ocean-only models (Meehl et al 2001).
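The risk of mis-estimating feedbacks with one-way coupling can be illustrated with a toy system. The sketch below is a hypothetical two-variable linear model, not taken from any of the reviewed studies: the function `simulate`, the coefficients `a`, `b`, and `c`, and the baseline `e0` are all assumptions chosen for illustration. In the one-way case the human component never sees temperature; in the two-way case emissions respond to it.

```python
def simulate(steps, a, b, e0, c, two_way):
    """Iterate a toy human-Earth loop and return the final temperature.

    All parameters are hypothetical:
      a  -- warming per unit of emissions per step
      b  -- linear relaxation rate of temperature
      e0 -- baseline emissions
      c  -- sensitivity of emissions to temperature (used only if two_way)
    """
    temp = 0.0
    for _ in range(steps):
        # Human component: emissions respond to temperature only when
        # the coupling is two-way; otherwise they stay at the baseline.
        emissions = e0 - c * temp if two_way else e0
        # Earth component: warming from emissions minus linear relaxation.
        temp = temp + a * emissions - b * temp
    return temp


A, B, E0 = 0.1, 0.2, 10.0  # illustrative values only

one_way = simulate(200, A, B, E0, c=1.0, two_way=False)
damping = simulate(200, A, B, E0, c=1.0, two_way=True)      # humans cut emissions as T rises
amplifying = simulate(200, A, B, E0, c=-0.5, two_way=True)  # humans raise emissions as T rises

print(f"one-way: {one_way:.3f}")
print(f"two-way, damping response: {damping:.3f}")
print(f"two-way, amplifying response: {amplifying:.3f}")
```

In this idealization the steady-state temperature is a·e0/b for one-way coupling and a·e0/(b + a·c) for two-way coupling, so the one-way run overstates the response when the human feedback is damping (c > 0) and understates it when the feedback is amplifying (c < 0).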
Future work should include the development of a permanent, open-access database to house the results of integrated modeling studies, like those in this review. Such a database can help researchers assess the sign, magnitude, and robustness of human-Earth system feedbacks. Additionally, with the right metadata, such a database can help future modeling teams identify when to include which feedbacks in their studies. There are significant challenges with archiving the results of complex modeling exercises (Thornton et al 2005), but exploring and designing such a system could provide large benefits to the entire community.
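As a rough illustration of what a minimal record in such a database might contain, the sketch below defines a hypothetical schema and a helper that tallies the sign of a feedback across studies. The class `FeedbackRecord`, its fields, the function `sign_summary`, and the example entries ("Study A", "Model X", and the numerical values) are placeholders invented for illustration; they do not describe any existing archive or any result from the reviewed literature.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FeedbackRecord:
    """One reported feedback effect (hypothetical schema)."""
    study: str      # citation key for the source study
    model: str      # name of the integrated modeling system
    coupling: str   # "one-way" or "two-way"
    variable: str   # quantity affected by the feedback
    scenario: str   # scenario under which the effect was measured
    effect: float   # change attributed to the feedback
    units: str      # units of the effect


def sign_summary(records: List[FeedbackRecord], variable: str) -> Dict[str, int]:
    """Count positive, negative, and zero effects reported for one variable."""
    counts = {"positive": 0, "negative": 0, "zero": 0}
    for rec in records:
        if rec.variable != variable:
            continue
        if rec.effect > 0:
            counts["positive"] += 1
        elif rec.effect < 0:
            counts["negative"] += 1
        else:
            counts["zero"] += 1
    return counts


# Placeholder entries; the values are illustrative, not real results.
records = [
    FeedbackRecord("Study A", "Model X", "two-way", "global mean temperature",
                   "baseline", 0.4, "degC"),
    FeedbackRecord("Study B", "Model Y", "two-way", "global mean temperature",
                   "baseline", -0.2, "degC"),
    FeedbackRecord("Study C", "Model Z", "one-way", "cropland area",
                   "mitigation", 1.5, "percent"),
]

summary = sign_summary(records, "global mean temperature")
print(summary)
```

Recording the coupling type and scenario alongside each effect is one way the metadata could let later users separate disagreement between models from differences in study design.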
In summary, more research, more models, and more studies, including both one-way and two-way feedbacks, are needed to robustly quantify the sign and magnitude of human-Earth system feedbacks.