System dynamics simulation of transport mode choice transitions under structural and parametric uncertainty

Complex social processes introduce difficulties to validating causal parameters and identifying the correct system structure in modelling. Policy impact assessment for sustainability transitions should therefore not expend too many resources modelling any single set of assumptions about the world. Furthermore, keeping models relatively simple allows more effective communication and stakeholder collaboration. This paper presents an exploratory system dynamics model of urban mode choice. We demonstrate that, despite structural and parametric uncertainty, it is possible to rank alternative policy approaches and identify high-leverage uncertainties as targets of policy action or further analysis. We also show how different narrative theories of change can have drastically different or unintuitive outcomes for the same intervention. Simulation can benefit both impact assessment and the further scrutiny and refinement of change narratives. We argue that the following methodological choices and their synergies made our modelling approach effective: exploratory modelling, focus on endogeneity, coarse resolution and avoidance of abstract variables.


Introduction
The European Green Deal [10] aims to cut 90% of transport-related greenhouse gas emissions by 2050 with the help of the Strategy for a Sustainable and Smart Mobility [11]. One of the three key pillars of action proposed by the strategy highlights the wide availability of sustainable alternatives in a multimodal transport system. Alongside technical changes, modal choice is a key aspect of decarbonizing transport [43].
Transport systems are complex, featuring interaction between individuals, material objects, policy frames and infrastructure [3,12,15,49]. While they feature path dependencies regarding habits and policies, Marsden et al. [34] highlight that people are also far more adaptable to a major change than the current policy process assumes. Complex Adaptive Systems (CAS) has been a well-received concept for expressing complexity in sustainability transitions research, as it highlights the nonlinearities, uncertainties and unpredictable emergence of novel phenomena entailed in transition [17,31,32,48]. Sufficiently rapid and large-scale transitions likely demand nonlinear tipping-point dynamics of mutually reinforcing social changes. These include changes in the underlying norms, values and meanings of social life [45].
While model-based impact assessments have the strength of testing the outcomes of policies, their realism faces substantial challenges from system complexity. In general terms, there is a trade-off between the precision of causal parameters and their ease of empirical validation [35: 101-102]. A root issue is that empirical data in isolation from theory do not express causality. Empirical validation that begins from minimal causal theory uncovers correlations rather than causations (ibid.). Refining the causal theory increases the precision of statistically inferred causal parameters, but also increases the difficulty of validating the growing total of technical assumptions underlying results. For instance, in widely used regression models, these assumptions relate to correct identification of all causal variables and their mathematical formulations, the independence of causal variables, and the distribution of errors [16: 14-15]. Meanwhile, a CAS perspective undermines the independence of causes, the stability of the functions that (are thought to) govern change, and the ability to mechanistically and accurately describe the system. Models, however, remain attractive options for assessment given their ability to test even quite complex ideas consistently. Collaboration with stakeholders is one approach for validating uncertain models [52,53], but complicated models can be difficult to understand for non-experts. Following Ghaffarzadegan et al. [19], we propose that models should be relatively small in terms of their number of variables and causalities, particularly in the context of uncertainties. Small size also makes model building less resource-intensive. If the model technicalities allow, resources saved on validating and expressing details of one particular set of assumptions can be spent on flexibly testing and scrutinizing alternative assumptions of causality (see e.g. [37]).

Flexibly adjusting underlying assumptions of change is more typical for qualitative methods. Simulation modelling approaches meanwhile may not adopt the possibility of exploring alternative change dynamics and in most cases are not motivated by a specific desire to keep models simple and flexible. In this paper, we present a system dynamics simulation model of urban transport mode choice. It is intended as an investigation and example of the kind of support that can be offered to e.g. city-level decision-makers before major effort is expended on validating parametric detail (where validation is feasible).
Our research questions are: 1. Under uncertainty about causal structures and parameters, how can small models produce insight for policymaking aiming at modal choice transitions? 2. What are the key elements of such a model?
Our conclusions are methodological. Under uncertainty it is nonetheless possible to compare the outcomes of alternative policies, identify the most significant uncertainties, and reveal and compare alternative assumptions of causal structures based on their (sometimes drastically) different outcomes. These capabilities are possible due to exploratory parameter sampling, focus on endogeneity, avoidance of abstract variables and a coarse model resolution.
This paper first provides theoretical justifications for our approach to modelling a complex and uncertain system (Sect. 2) and discusses prior literature on mode choice research (Sect. 3). After presenting our model building process (Sect. 4), we present our system dynamics model itself as a result and demonstrate its use with starting state parameters from Helsinki, Finland (Sect. 5). Based on the results and our model building process, we discuss the model building principles that we found synergistic and that could be applied in models grounded in uncertainty (Sect. 6). Finally, we highlight key conclusions and answer our research questions (Sect. 7).

Theoretical background on modelling complex and uncertain systems
Though complexity does not have a generalized definition [36, 40: 169], it is generally considered to imply uncertainty and nonlinearity [42]. In a nonlinear system, each additional stimulus of an equal size does not lead to an equal amount of change. One source of nonlinearity is feedback, which we emphasize in this paper with our system dynamics modelling approach [25]. Feedbacks can reinforce or balance prior change. Combining reinforcing and balancing feedbacks in a single system means that even the direction of change may not be predictable without simulation tests. Including feedback in models can expand their realism as far as true feedbacks are identified, but also allows testing for the uncertainty (difficulty of prediction) that follows nonlinearity. Mechanistic prediction of outcomes would require knowledge of each variable and causality. If sustainability transitions are interpreted as CAS, uncertainty extends beyond the difficulty of prediction to an inability to completely and realistically describe underlying social processes [24]. Hanneman and Patrick [23] emphasize that models are always "artificial research environments", to distinguish them clearly from the real target of research, such as the real transport system. While realism is one guiding value in modelling, it can never be fully realized, and it competes with other values such as usefulness towards an aim, ease of understanding and communication, and the resources needed for sufficient completion.
As such, our research approach is to only model a "system of interest" [28: 22] or a bounded aspect of the whole without attempting to represent reality completely. Our system of interest highlights endogenous factors of mode choice. We expect that, since feedbacks are reinforcing or balancing, endogenous dynamics can produce meaningfully different outcomes for different interventions or assumed causal structures despite parametric uncertainty. In contrast, results would be directly determined by assumed causal parameters in an entirely linear system.
A model representing an incomplete and contestable system nonetheless allows systematically scrutinizing the outcomes of cause-effect assumptions. We argue that the type of modelling approach we present in this paper can benefit the coherent and critical development of narratives for how transport mode transitions can (or cannot) occur. In other words, we can hope to develop more coherent theories or narratives of change [22].
In the absence of a guiding theory, science turns to exploration [18]. Exploratory modelling techniques include using a variety of alternative models or parameter combinations. They have been successfully applied in a variety of disciplines ranging from physics to social science (ibid.). In our demonstration of policy impact assessment, we use randomized sampling of parameter ranges, as has been recommended for transitions modelling [21,38,46]. Exploratory modelling does not provide a single most likely or otherwise justified result, but a range of possible results that follows from the assumptions made including model structure and parameter ranges.

Previous research
In our review of previous simulation models, modal choice is usually calculated at a macro level as the result of policy packages and likely or envisioned future trends. Most papers do not focus on exploring alternative hypothetical mode choice mechanisms. The intricacy and technical solutions of models vary depending on their aims.
Costs and tax interventions are common determinants of mode choice in models (e.g. [1,2,8]). The overall need to travel to services and jobs [26], congestion [5] and intangible affective factors such as environmental consciousness and social acceptance (ibid.) also feature in model mechanisms. GDP per capita or some measure of income is often used as a determinant of car use [2,5,8,14,33]. The share of public transport may in turn be determined by maximum or average wait times, departure intervals and crowding [2,33].
Besides using a variety of causal factors, prior research has also adopted alternative modelling approaches. Of the reviewed system dynamics models, we note Barisa and Rosa [5] as featuring one of the most intricate sets of causal factors of mode choice. Pfaffenbichler et al. [47] also present high detail in this regard, and collapse their causal variables into a single measure of generalized cost to calculate optimal modes. A utility variable is another option for representing multiple overlapping effects [8,41]. Hradil et al. [26] do not opt for optimization, but rather determine car, public transport and active travel adopters hierarchically. As such, exogenous factors first determine car and public transportation use, and active travellers are the share of population left over. Kaaronen and Strelkovskii [30] offer a similar research approach to this paper in terms of explaining behaviour with relatively few variables and feedback effects. In their model, due to a social learning effect (a reinforcing feedback on cycling), the intervention of improving cycling infrastructure led very strongly to increased cycling regardless of how the parameters of the model were set.
Qualitative or narrative scenarios are also used in assessing mode choice change, and these are often more flexible in terms of underlying assumptions than numeric modelling [4,44,50]. Hanneman [22] argues that qualitative research of social change tends not to articulate detailed behavioural mechanisms while quantitative research often does not explore the implications of alternative reasonable causal assumptions or formulations that could change results. In our modelling approach, we seek to retain openness to alternative assumptions about the world while still conducting test-based research.

Model development
As a research group, we represented expertise in transport research and system dynamics modelling. Our aim was to produce a model that is simple but capable of impact assessment in an urban transport context. We built the model in an iterative manner in which the specific modelling questions and model content were flexible [52]. Iterations happened based on group discussions in weekly meetings over the course of several months. We steered model development by balancing between three aims: representing realistic and important causalities, keeping the model technically simple and flexible to alternative assumptions, keeping the model easily communicable and ensuring results have meaning despite uncertain mathematical formulations and parameters.
Our first model iteration was a qualitative causal loop diagram featuring a large number of possible feedback loops concerning mode choice. We quickly moved to building iterations of a simulation model. The simulation model building phase narrowed model scope substantially. Many feedbacks that occur through policy responses or fiscal governance were excluded for two reasons. First, if a public decision-maker were to test the effects of their own strategy with the model, it may not be intuitive to treat their decision-making as endogenous. Second, policy reactions feature much variety and it was not clear how to narrow down a manageable set of alternative cause-effect structures to represent policy and budgetary feedbacks. The feedbacks that remained in the simulation model were possible to implement with few and understandable (though uncertain) parameters. Several versions of the simulation model including alternative technical choices and variable aggregations were attempted before arriving at the model version reported here. We consider that a future model building process aiming at simplicity and flexibility would be more rapid when following the solutions and principles of this paper.

Data, indicators and scenarios
We used the Helsinki region of Finland as a case study for our impact assessment demonstrations (Sect. 5.4). The Helsinki region comprises 15 municipalities with 1.5 million inhabitants. In autumn 2018, residents of the Helsinki region made on average 4.7 million journeys within the region on a weekday, i.e., 3.5 journeys per person. Altogether, 39% of the journeys were made by car, 22% by public transport (including bus, metro and tram), 9% by bike, 29% on foot and 1% by other means of transport. The share of sustainable modes of transport (public transport, walking, cycling) in the region rose from 57% in 2012 to 60% in 2018 [7].
Starting state values were the key data inputs to the model. These include mode choices at start, divisions of trip purposes, and feasibilities of modes. We interpreted feasibility of active travel based on the shares of trips that were "short enough" to cycle. We used 10 km as a threshold of infeasibility when interpreting data, given that about 96% of trips by bike and 100% of trips on foot are shorter than 10 km in Finland [13: 61]. Public transport feasibility was an assumption of the availability of a connection for a trip purpose. Car feasibility was based on an estimate of the population's access to a car. Starting state data affect impact potentials of interventions and could be replaced by data from other regions. Our data sources were the Helsinki region transport survey 2018 [7] and the Helsinki region dataset of the Finnish National Travel Survey [51].
Our key indicator for the impact assessment demonstration was change in CO2-eq. emissions. This allowed us to consider the differing emission reduction potentials of different modes in addition to the simulated changes in mode choice. We used car trips and public transport capacity (not public transport trips by individuals) to calculate change in emissions, while active travel was without emissions. Our emission data came from the Helsinki Region Environmental Services [27]. The full set of starting state parameters and a discussion of our emission accounting method are found in Sections 1.4 and 3 of the Additional file 1.
Our baseline assumption in scenarios was that no change occurs in mode choice or emissions if no intervention is made. We intended results produced under alternative assumptions to be compared to one another rather than being interpreted in isolation. We designed two types of tests: ones that compared the same intervention under alternative endogenous dynamics, and ones that compared alternative interventions under the same set of uncertain causal parameters.

Representation of mode choice and its causes
The model simulates changes in three modes and three trip purposes, i.e. a 3 × 3 grid of purpose-mode combinations. The transport modes are car, public transport and active travel. Active travel includes walking and cycling. The mode categories serve as both targets of policies and emission impact categories. The trip purposes are commutes, errands, and leisure. Different trip types have different potential for being travelled by a given mode, informed by starting state data and assumptions. Policies and other urban change can also target different trip types. For instance, a road toll may only apply during typical commuting hours, or desired leisure activities can change.
Mode choice is affected by four endogenous causes: crowding, trends, safety in numbers and affect. Additionally, exogenous causes were included: cost, capacity (higher capacity alleviates crowding), and feasibility. Feasibility sets a maximum use of a mode for a given purpose. Since we calculate mode share from feasible trips, increasing mode feasibility also increases mode use (mode use divided by feasible trips by the mode is constant while feasible trips increase), though we do not present that scenario in this paper. Excluding feasibility, the other causes of mode use are calculated in terms of relative change since start. The degrees of their effect are governed by weight parameters, and all weighted effects are multiplied to produce a relative change in mode shares (see Sect. 6 for discussion, and Section 1.1 of the Additional file 1 for the mathematical formulation).
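The multiplicative combination described above can be illustrated with a minimal Python sketch. The power-law weighting (`effect ** weight`), the function names and all numeric values are assumptions made for illustration only; the formulation actually used in the model is given in Section 1.1 of the Additional file 1.

```python
from math import prod

def relative_change(effects, weights):
    """Multiply weighted relative effects (each equals 1.0 at the start state)."""
    # Assumed weighting: effect ** weight, so a weight of 0 switches an effect off.
    return prod(effects[name] ** weights[name] for name in effects)

def mode_trips(start_share, feasible_trips, effects, weights):
    """Trips by a mode: its share of feasible trips, capped by feasibility."""
    share = min(1.0, start_share * relative_change(effects, weights))
    return share * feasible_trips

# Hypothetical values: a 10% favourable cost effect and a 5% adverse crowding effect.
effects = {"cost": 1.10, "crowding": 0.95}
weights = {"cost": 1.0, "crowding": 0.5}
trips = mode_trips(0.30, 1000.0, effects, weights)
```

In this sketch, raising `feasible_trips` raises mode use even when the share of feasible trips stays constant, which mirrors the role feasibility plays in the text.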
One limitation is that the multiplicative form of calculating mode share could be contested. Another is that the model has no mechanism for calculating movement between specific modes: which of the two other modes are given up when one mode grows, and which mode is abandoned when one declines. Such changes need to be assumed. We make optimistic assumptions from an emissions impact perspective (see Section 1.4 of the Additional file 1).

Postulated feedback loops
Our selection of feedbacks is not intended as a default theory of mode choice dynamics, but as a demonstration of the principle that policy effects depend strongly on the assumed system and that a small dynamic model can allow meaningful comparison between different policies and theories of change. Here we narratively explain each dynamic. We also briefly explain their operationalization in the model. Each feedback is assigned a weight parameter governing its strength of effect (if any). We use parameter values with diminishing marginal effect to prevent uncontrolled exponential growth even under narratively reinforcing effects (see Section 1.2 of the Additional file 1). Figure 1 illustrates our feedback loops. In the figure, R and B refer to reinforcing and balancing loops and crossed lines indicate delayed effect.
Crowding (balancing loop): When more travellers opt for public transport or car travel, those modes (vehicles, roads etc.) become more crowded [2,20,33]. Crowding can manifest, for instance, as a loss of comfort or as concern over late arrivals. Inversely, when fewer people travel with these modes, they appear more attractive. In the model, crowding effects can be alleviated by expanding capacity. If mode use increased by 10% while capacity increased by the same amount, there would be no net crowding effect.
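The crowding logic can be sketched in a few lines of Python. The ratio of relative use to relative capacity follows the description above, whereas the power-law response and the weight value are assumptions for illustration and not the model's documented formulation.

```python
def crowding_effect(use, use_start, capacity, capacity_start, weight):
    """Relative attractiveness change from crowding (1.0 means no change)."""
    # Crowding is measured as relative use per relative capacity.
    crowding = (use / use_start) / (capacity / capacity_start)
    # Assumed functional form: more crowding lowers attractiveness (balancing).
    return crowding ** (-weight)

# Use and capacity both grow by 10%: no net crowding effect.
no_change = crowding_effect(110.0, 100.0, 55.0, 50.0, weight=0.5)
# Use grows by 10% with unchanged capacity: attractiveness falls below 1.
more_crowded = crowding_effect(110.0, 100.0, 50.0, 50.0, weight=0.5)
# Use falls by 10% with unchanged capacity: attractiveness rises above 1.
less_crowded = crowding_effect(90.0, 100.0, 50.0, 50.0, weight=0.5)
```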

Safety in numbers (reinforcing loop):
If the number of accidents increases less than proportionally to the volume of traffic (e.g. if traffic doubles, the number of accidents less than doubles), a safety-in-numbers effect may be in play [9]. A motorist is less likely to collide with a person walking or cycling if more people walk or cycle [29]. Thus, a larger number of active travellers makes active travel feel safer and encourages more active travel. In the model, a higher number of cyclists relative to start increases the safety-in-numbers effect, encouraging more cycling. Infrastructure capacity is not included as a variable for cyclists and safety in numbers does not apply to public transport and car travel.
Trends (reinforcing and balancing loops): This dynamic may represent excitement around a new travel opportunity, wanting to fit in, and being curious about current developments such as car-free lifestyles. Such a social contagion effect is typical in system dynamics models (e.g. [6]). In our model, mode popularity is affected by "recent change" in its popularity, or current mode use minus a lagged value of mode use. When the rate of increase/decrease in mode use starts slowing down, so does the reinforcing feedback, resulting in a combination of reinforcing and balancing effects.
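A minimal Python sketch of the "recent change" input described above follows. Only the difference between current and lagged mode use comes from the text; the linear `trend_effect` form, the lag length and the example series are hypothetical.

```python
def recent_change(history, lag):
    """Current mode use minus its value `lag` steps earlier."""
    # Before enough history exists, compare against the oldest known value.
    past = history[-1 - lag] if len(history) > lag else history[0]
    return history[-1] - past

def trend_effect(history, lag, weight):
    """Assumed form: weighted recent change relative to the start, around 1.0."""
    return 1.0 + weight * recent_change(history, lag) / history[0]

growing = [100.0, 104.0, 110.0]                  # growth is still accelerating
slowing = [100.0, 104.0, 110.0, 111.0, 111.0]    # growth has nearly stalled

strong = trend_effect(growing, lag=2, weight=0.5)
weak = trend_effect(slowing, lag=2, weight=0.5)
```

Here the trend effect is weaker for the stalled series than for the growing one, which is the mechanism by which a reinforcing loop turns into a balancing one once growth slows.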
Affect (reinforcing loop): Changes in affect or an underlying societal attitude regarding normal and desirable behaviour can drive social change [45]. However, affect is also challenging to conceptualize in a way that allows using '(relative) changes in affect' as a numeric input to mode choice. In the model, our solution is to understand affect as stemming from mode choice change since start. When more/fewer trips are taken with a mode, the affect effect of that mode increases/decreases. It is mathematically distinct from the safety-in-numbers effect by being formulated based on mode split per trip purpose, while the safety-in-numbers effect is based on the absolute number of active travel trips per trip purpose.
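The distinction between the two inputs can be made concrete with a short Python sketch. The function names and the scenario (total trips for a purpose growing while the mode split stays fixed) are hypothetical and chosen only to show that a share-based and a count-based input can diverge.

```python
def affect_input(mode_trips, total_trips, start_share):
    """Relative change in a mode's share of trips for one trip purpose."""
    return (mode_trips / total_trips) / start_share

def safety_in_numbers_input(active_trips, start_active_trips):
    """Relative change in the absolute number of active travel trips."""
    return active_trips / start_active_trips

# Hypothetical case: all travel for a purpose grows by 20%, split unchanged.
start_active, start_total = 300.0, 1000.0
active, total = 360.0, 1200.0

affect = affect_input(active, total, start_active / start_total)
safety = safety_in_numbers_input(active, start_active)
```

Under this scenario the share-based input stays at its starting value while the count-based input rises by 20%, so the two loops would respond differently even with identical weights.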

Dynamics of the model
Before the impact assessment demonstration using the full model, we show the dynamics that follow each endogenous factor. In the following tests, the same exogenous improvement was implemented while activating different feedbacks. We use the same weight parameter value for each feedback. The key here is to qualitatively compare the shapes of the curves rather than scrutinize alternative test settings or observe the exact y-axis value. In Section 4 of the Additional file 1, we show that results of this qualitative comparison of dynamics do not change with alternative parameter values, though naturally the numeric degree of change is affected.
Crowding: Figure 2 demonstrates the crowding dynamic. Since crowding is a balancing feedback loop, it reduces change caused by interventions (other than capacity interventions which alleviate the crowding effect). A symmetrical effect for car travel would be that when car travel is discouraged, car travel becomes less crowded which to some extent undermines the discouragement of car travel. The weight of the crowding effect also determines the effectiveness of capacity increases/decreases to encourage/discourage travel.

Safety in numbers:
The solid orange curve in Fig. 3 demonstrates the safety-in-numbers dynamic. A safety-in-numbers effect for active travel increases the impact of interventions. Whatever positive effect is put in motion gets accelerated and reaches a higher outcome.
Trends: The dashed black curve in Fig. 3 demonstrates the trends dynamic. The more weight is given to trends, the larger is the oscillation effect. When growth slows down, the trend effect declines. Since part of prior growth was due to the trend effect, growth slows down even more, eventually causing a negative trend effect. Mode decline also slows down eventually, reducing the negative trend effect, and thus the oscillation turns to an upswing.
Affect: The dashed orange curve in Fig. 3 demonstrates the affect dynamic. Affect works similarly to the safety-in-numbers effect: prior change in mode choice is amplified. However, note that the trajectories under the affect assumption and the safety-in-numbers assumption differ despite using the same weight parameters. This shows the significance of different mathematical formulations for feedback loops that narratively emerge from the same phenomenon (in this case mode choice).
Figure 4 demonstrates how alternative combinations of endogenous effects can lead to very different outcomes. All four curves in Fig. 4 feature the same intervention to make active travel easier. All activated feedback loops use the same weight parameter. The solid orange curve and solid black curve apply the trend and affect dynamics respectively. The dashed orange curve activates both effects at once. The trend and affect effects support one another: trends build up the mass of behavioural change, which generates affect, while increasing affect maintains the growth of active travel to mitigate the downward cycle of the trend oscillation effect. Growth is faster compared to the solid black curve, and an equal or higher level of active travel is achieved at all times compared to the solid orange curve. However, combining endogenous causalities can also lead to strange and adverse effects. The dashed black curve in Fig. 4 shows active travel dipping below the starting values for a moment despite a positive intervention. This result followed combining the trend effect with the safety-in-numbers effect. Safety in numbers amplifies the oscillation effect of trends by quickly removing/increasing support of active travel when the trend effect goes into a downturn/upturn.
We draw three conclusions from the combined dynamics demonstrations. First, narratively simple changes to causal assumptions can lead to qualitatively different trajectories of change that can also imply highly divergent numeric outcomes. Second, explaining or targeting rapid and large-scale behavioural change benefits from (correctly) identifying dynamics that could compound positive effects and mitigate unwanted effects. Third, constructing alternative theories of change as feedback structures for simulations allows scrutinizing and refining them. For instance, if we were to think that both trend effects and safety-in-numbers effects are key factors of transition, then we also need an explanation for why the wild oscillation of the dashed black curve in Fig. 4 would not/does not occur.

Impact assessment demonstration: emission reductions from policies directed at mode choice in Helsinki
In this section, all causal factors are used to demonstrate how the impact potential of interventions may be analysed when feedback structures are defined but many parameters of the system are highly uncertain. Discussion of our minimum and maximum weights is in Section 1.3 of the Additional file 1, and intervention descriptions in Section 2 of the Additional file 1. The principles of analysis can be understood in isolation of these precise test settings. Figure 5 shows four emission scenarios for the same set of interventions but alternative assumptions of the strength of the initial exogenous interventions and subsequent endogenous dynamics. The exogenous interventions are cost increases of car use, cost decreases of public transport use, ease increase to active travel and public transport, and capacity increase for public transport. The dashed orange curve uses maximum weights for exogenous and endogenous effects. The dashed black curve uses minimum weights. The large difference between the two curves indicates that emission effects are highly sensitive to the combined weightings of feedback effects.
Between the two extremes in Fig. 5 are intermediate cases. In these cases, weights are grouped as (arguably) social phenomena that are reactions to the behaviour of others (trends and affect) and (arguably) more individualistic reasoning about the costs, ease, and comfort (crowding and safety in numbers) of travel. When the weights of 'individualistic' factors are set to maximum and social causes to minimum (solid black curve), emissions decline more than in the inverse case (solid orange curve). One explanation is that there are a larger number of effects in the 'individualistic' category. Another would be that the 'individualistic reasoning' effects produce the initial behavioural change upon which 'social' reactions continue to expand, whatever the weighting of the latter.
Figure 6 makes the same set of interventions but samples all weights randomly between the minimum and maximum (using Latin hypercube sampling over 200 repetitions). The method assumes that all parameter values within their respective ranges are equally likely. The 50% band of results (orange shaded area) is closer to the most pessimistic than the most optimistic outcomes, meaning that the most optimistic results rely on a rather specific set of weight conditions. Observing outcomes for individual modes revealed that active travel featured clearly the highest variance in results, including particularly strong results in the most optimistic runs. If the model were accepted as a starting point, analysis could thus progress to investigate how the causes of active travel could be targeted specifically (in the real world) to promote achieving the best outcomes under uncertainty.
Another takeaway is that the set of interventions did not lead to undesirable outcomes such as increasing emissions or declining active travel under any combination of parameters, even though we showed this to be possible in principle under combined nonlinear dynamics (Fig. 4). The lowest emission reduction in the model for this set of interventions was around 10%.
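The sampling procedure described above can be sketched in Python using only the standard library. The weight names, their ranges and the `toy_emission_change` function are placeholders and do not reproduce the paper's model or its results; only the use of Latin hypercube sampling with 200 repetitions and the 50% band are taken from the text.

```python
import random
from statistics import quantiles

def latin_hypercube(ranges, n, rng):
    """Draw n samples; each parameter's range is split into n equal strata."""
    columns = {}
    for name, (low, high) in ranges.items():
        # One uniform draw inside each stratum, then shuffle the strata order.
        points = [low + (high - low) * (i + rng.random()) / n for i in range(n)]
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][i] for name in ranges} for i in range(n)]

def toy_emission_change(weights):
    """Placeholder outcome, NOT the paper's model: larger weights, larger cut."""
    return -(5.0 + 20.0 * weights["trends"] * weights["affect"])

# Hypothetical weight ranges; the paper's ranges are in the Additional file 1.
ranges = {"trends": (0.0, 1.0), "affect": (0.0, 1.0)}
samples = latin_hypercube(ranges, n=200, rng=random.Random(0))
outcomes = [toy_emission_change(s) for s in samples]

# The first and third quartiles bound the 50% band of results.
q1, median, q3 = quantiles(outcomes, n=4)
```

Stratifying each range guarantees that every part of every weight range is sampled exactly once, which is why a few hundred runs can cover a multi-dimensional range more evenly than the same number of purely random draws.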
Finally, it is possible to compare alternative policy approaches under parameter uncertainty. Using the same sampling as in the previous test, Fig. 7 shows the results for an improvement in ease to active travel and public transport. Figure 8 shows the results for cost increases to car travel and cost reductions to public transport. The ease increases led to somewhat better results in the 50% band and the most optimistic runs than the cost interventions. It is also notable that combining multiple interventions in the context of uncertainty (Fig. 6) avoided the worst possible outcomes shown in Figs. 7 and 8 while securing a better 50% band.

Discussion
The amount of variables in our model contrasts with larger and more detailed models in the literature (e.g. [5,47]). Our approach sacrificed on various types of granularity, e.g. spatial variance of outcomes, different transport user groups, or more precise emission accounting. A coarse granularity was however in practical terms synergistic with using uncertain parameters and the desire for a relatively quick model building framework. Using fewer categories of effect meant that fewer mathematical formulations needed to be designed and implemented. It also meant that fewer dimensions of uncertainty needed to be added to random sampling of parameters, which helps in model interpretation. The exploratory sampling of parameters is continuation of prior exploratory sustainability transitions modelling [39]. Highlighting feedbacks or endogeneity had synergies with using uncertain parameters. Endogeneity gives a variable a direction of change, even if the degree of that change is uncertain. Our work is similar to Kaaronen and Strelkovskii [30] in terms of producing explanations of change in modal choice that are strongly determined by feedback structures and relatively insensitive to parameters. To our knowledge, previous simulation work on modal choice feedbacks has not discussed how each assumed endogeneity implies distinct dynamics, a principle we demonstrated in Sect. 4.2. Following Hanneman [22], we argue such practices of simulating simple dynamics can help scrutinize and further develop existing narrative theories of change. There are numerous change narratives to choose from (see e.g. [4,50]), but the types of changes implied by qualitative causalities can be ambiguous or difficult to predict due to nonlinearity.
A common approach in modelling is to use utility or monetary equivalents (sometimes of non-market variables) to contain the net effect information of all causes (e.g., [8,41,47]). However, in our model building process we found that minimizing unobserved/unobservable constructs had synergies with parameter uncertainty, the lack of empirical validation and the focus on endogeneity. For instance, if we had used an affect variable to explain utility, we would have had to ask, "What is affect and how can it be formulated mathematically?", "How does affect change utility?" and "What does it mean to assume the causal parameter of affect on utility is X?" Instead, the more feasible question guiding our model building was "What endogenous factors could generate an 'affect effect' on mode choice?" Affect as such was not a distinct variable in the model; instead, we postulated that affect manifests as an information feedback from mode choice back to mode choice. Thus, the input and output of affect are both measurable and concrete, giving real-world meaning to the uncertain parameter that represents the 'affect effect' (see the interpretation of our weights given in Section 1.3 of Additional file 1). Interpreting parameter values in terms of the implied effect under different causal variable values, perhaps together with experts and stakeholders, and subjectively assessing how reasonable such implications are, may be the only way to validate causal parameters that are not empirically observed.
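The idea of an 'affect effect' as a feedback from mode choice back to mode choice, governed by a weight whose value is uncertain, can be illustrated with a toy loop. The sketch below is not the formulation given in Additional file 1: the power-law feedback, the `push` of 5%, the adjustment speed and the candidate weights are all assumptions chosen only to show that the loop fixes the direction of change while the weight governs its size.

```python
def simulate(weight, push=1.05, steps=20, adjust=0.1):
    """Toy reinforcing loop in relative units (mode use starts at 1.0).

    Hypothetical form: current mode use feeds back on the attractiveness
    of the mode through `use ** weight`; `push` is an exogenous improvement.
    """
    use = 1.0
    path = [use]
    for _ in range(steps):
        feedback = use ** weight          # assumed 'affect effect' of use on itself
        target = push * feedback          # level that mode use drifts toward
        use += adjust * (target - use)    # gradual adjustment toward the target
        path.append(use)
    return path

# Uncertain weight: three illustrative values instead of one 'true' estimate.
final_use = {w: simulate(w)[-1] for w in (0.2, 0.4, 0.6)}
for w, value in final_use.items():
    print(f"weight={w}: relative mode use after 20 steps = {value:.4f}")
```

Because both the input and the output of the loop are mode use itself, each weight can be read directly as "how much additional use follows from a given increase in use", which is the kind of interpretation that can be discussed with experts and stakeholders.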
We note the following limitations of the model. Contestable model formulations include the multiplicative form of calculating the mode adoption rate (see Additional file 1). Such technical choices are not neutral and are one source of model uncertainty across different methods [54]. For instance, since we calculated the adoption rate by multiplying effects, each increase of mode choice multiplies prior increases of mode choice. An alternative formulation could be an additive form in which multiple effects add to (or subtract from) one another without implicit compounding. We opted for the multiplicative form because we did not find an additive function form that could use inputs with relative units (starting value 1). Relative units in turn were used so that we could include causes that are unmeasured or difficult to measure. For instance, we do not need to know the state of public transport crowding at the start to simulate a crowding effect under a push toward higher public transport use. Nor do we need to know the cost of car use (per trip type or traveller segment) to implement a 10% increase in costs in the model. Though biases of presupposed function forms are not unique to our work, we acknowledge that our inability to sensitivity-test alternative function forms resulted from the inclusion of unmeasured parameters and variables. Finally, our model does not have a mechanism for determining which mode previous car users move to after opting against the car, or which mode new public transport and active travel users come from. This makes scenarios that affect multiple modes somewhat inconsistent, as assumed mode displacement and mechanistic explanations of mode use change get mixed up.
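The compounding implied by the multiplicative form can be made concrete with a small arithmetic sketch. The base rate and the two 10% effects below are hypothetical, and the summed-deviation function is included only as a contrast for the arithmetic; it is not proposed as a workable additive formulation for the model.

```python
import math

def multiplicative_rate(base_rate, effects):
    """Base rate scaled by the product of relative effects (1.0 means 'no change')."""
    return base_rate * math.prod(effects)

def summed_deviation_rate(base_rate, effects):
    """Contrast case only: deviations from 1.0 are summed, so they do not compound."""
    return base_rate * (1.0 + sum(e - 1.0 for e in effects))

BASE_RATE = 0.05        # hypothetical adoption rate per time step
effects = [1.10, 1.10]  # two illustrative 10% improvements in relative units

compounded = multiplicative_rate(BASE_RATE, effects)
summed = summed_deviation_rate(BASE_RATE, effects)
print(f"multiplicative: {compounded:.4f}, summed deviations: {summed:.4f}")
```

With two 10% effects the multiplicative form scales the rate by 1.21 rather than 1.20. The gap is small here but grows with the number and size of the effects, which is why the choice of function form is not neutral. Note that neither function needs an absolute baseline for cost or crowding, only the relative change.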

Conclusions
Our first research question asked how small models can support decision-making in the context of uncertainty regarding causal parameters and system structure. Our model case simulated transport emission reductions following urban modal choice change. The model was able to do at least the following: (1) demonstrate the dynamic implications of assumed causes or system structures; (2) identify synergies or adverse effects resulting from multiple (assumed) nonlinear causal factors; (3)