An agent-based modeling framework for examining the dynamics of the hurricane-forecast-evacuation system

Hurricane evacuations involve many interacting physical-social factors and uncertainties that evolve with time as the storm approaches and arrives. Because of these complex and uncertain dynamics, improving the hurricane-forecast-evacuation system remains a formidable challenge for researchers and practitioners alike. This article introduces a modeling framework built to holistically investigate the complex dynamics of the hurricane-forecast-evacuation system, i.e., to determine which factors are most important and how they interact across a range of real or synthetic scenarios. The modeling framework, called FLEE, includes models of the natural hazard (hurricane), the human system (information flow, evacuation decisions), the built environment (road infrastructure), and connections between systems (forecasts and warning information, traffic). In this paper, we describe FLEE's conceptualization and implementation and present proof-of-concept experiments illustrating its behaviors when key parameters are modified. In doing so, we show how FLEE is capable of examining the dynamics of the hurricane-forecast-evacuation system from a new perspective that is informed by and builds upon empirical work. This information can support researchers and practitioners in hazard risk management, meteorology, and related disciplines, thereby offering the promise of direct applications to mitigate hurricane losses.


Introduction
Hurricanes Irma (2017) and Rita (2005) demonstrate how, in the mainland US, the forecast-evacuation system is uncertain, dynamic, and complex. For example, Irma's 3-10-day forecasts indicated the storm was likely to make landfall as a major hurricane somewhere in Florida, with the most likely track near Miami, triggering the largest evacuation in US history [1]. However, the forecast track shifted slightly westward as the storm approached, with eventual landfall near Tampa Bay-St. Petersburg, a common evacuation destination in the event, while leaving Miami largely unscathed [2,3]. Similarly, uncertainties in Hurricane Rita's track and intensity forecasts, combined with the aftermath of Hurricane Katrina, led to mass evacuations and severe traffic jams in Houston-Galveston. The worst of the storm missed the area, but had Rita struck Houston-Galveston directly, the consequences could have been severe, as many evacuees were stranded on area roads [4,5].
The events are relevant since the forecasts were fairly accurate, with the westward shift of Irma's track falling within the National Hurricane Center's (NHC) cone of uncertainty [2], and Rita's forecast track being less erroneous than most [5]. However, forecasts were less successful in providing useful guidance for many affected by the events, despite being as useful as one can expect given current forecast skill. These cases illustrate the complexities of people using inevitably imperfect forecasts to make evacuation decisions well before the storm arrives, and they demonstrate how evacuations involve many interacting physical-social parts and uncertainties which evolve over time [6][7][8][9]. Because of these complex dynamics, safe and efficient evacuations can be a formidable challenge.
Empirical studies provide insight into different aspects of hurricane evacuations, such as how forecasts, warnings, and other factors influence evacuation decisions [10][11][12]. However, it is difficult to empirically study all aspects of evacuations across multiple cases. Computational models, on the other hand, provide a complementary tool where empirical knowledge can be codified and used to run virtual experiments for many different hurricane scenarios, real and synthetic [6,13]. Recent research demonstrates the potential of modeling the hurricane evacuation system together in one framework [13,14]. With that, we ask: can a modeling framework be designed to holistically investigate the complex dynamics of the hurricane-forecast-warning system, i.e., to determine which factors are important and how they interact across a range of scenarios?
To answer this question, we introduce a new modeling framework, FLEE (Forecasting Laboratory for Exploring the Evacuation-system). FLEE includes several empirically-informed models representing key, interwoven aspects of real-world hurricane evacuations: the natural hazard (hurricane), the human system (information flow, evacuation decisions), the built environment (road infrastructure), and connections between systems (forecasts and warning information, traffic, impact zones). The hurricane and forecast information are represented using data and products from the National Hurricane Center (NHC), a component of the U.S. National Weather Service which is the leading authority for real-time hurricane forecasting. Two agent-based models (ABMs) replicate 1) the flow of information and evacuee decision-making, and 2) evacuation infrastructure, routing, and traffic. These models are conceptually and numerically interconnected as shown in Fig. 1.
This paper has two primary objectives. The first is to give an overview of the conceptualization and implementation of FLEE. This includes describing the model components, which are designed to represent key aspects of real-world hurricane evacuations, while remaining sufficiently idealized to build fundamental and practical knowledge (e.g., see Refs. [13,15]; discussion in Section 2). The paper's second aim is to show results from experiments demonstrating how FLEE is uniquely positioned to examine the hurricane-forecast-warning system dynamics: that is, how it can explore the effects of altering different factors and the interactions among system components, and how large-scale patterns of evacuation can emerge from individual decisions of many heterogeneous agents interacting with each other and with their physical-informational environments.
Experiments are performed on a simplified representation of the Florida peninsula, a place frequently visited by tropical systems [16], and for Hurricanes Irma and Dorian, which affected this area in 2017 and 2019, respectively. FLEE was designed to be flexible, however, and thus the modeling framework can be modified to study other regions, hurricane scenarios, and multi-hazards, e.g., hurricanes followed by flooding, or cascading failures such as loss of power networks and damage to roads.
This research builds on previous work which models the hurricane evacuation system by expanding the components of the full system represented within the same modeling framework. For example, one body of work uses ABMs to study evacuation planning [17][18][19][20][21][22][23]. Such work focuses on evacuation traffic while using highly idealized representations of the forecast and warning information and evacuation decision-making. Meanwhile, another body of work uses ABMs and other models to study information flow and evacuation decision-making but does not include representations of evacuation routing and traffic [6,13][24][25][26][27]. Arguably the most comprehensive models of the hurricane evacuation system are those of Blanton et al. [14] and Davidson et al. [28], which integrate the forecast, evacuation decisions, and evacuation traffic into one system. However, their representations of information flow and evacuation decision-making were fairly simplistic, as these models were designed for operational use.
Modeling frameworks like FLEE, which represent the entire hurricane-forecast-warning system, can support researchers, practitioners, and policy-makers in a variety of disciplines. This includes hazard risk management, which would benefit from increased knowledge of the relative effectiveness of evacuation management strategies. The evacuation modeling community would benefit from improved understanding of evacuation, which provides better rationale for variable selection in future models. In meteorology, modeling frameworks like FLEE can provide a societally-relevant alternative to traditional measures of forecast accuracy, by showing how forecasts influence evacuation success. Lastly, by looking at the system holistically, these modeling frameworks can cultivate shared understanding across these disciplines, a need emphasized by Bostrom et al. [29].

Modeling framework and implementation
This section describes FLEE's components and design [30]. The modeling framework was developed using Fortran due to familiarity with the language but could be developed using existing agent-based software. For further details, the commented code, a model description, and input files are available for download at the CoMSES model library (https://www.comses.net/codebase-release/4cd05855-f387-48bd-8899-9d62375518cb/). FLEE can run on multiple operating systems, including MacOS, Linux, and Windows, and on computers with average memory and cores (e.g., we used computers with 2 cores and 4 GB memory). Simulations typically require 3-5 days of real time. Though FLEE cannot run in quasi-real time on a desktop computer, the paper's goal is proof of concept; improving run time is a key next step for more practical use.
Fig. 1. A conceptual overview of FLEE which includes models of the three interconnected systems of hurricane evacuations: (a) the natural hazard, (b) the human system, and (c) the built environment, represented by NHC forecast products and two ABMs, respectively (italics). Forecast and warning information (purple), evacuation traffic (light blue), and impact zones (gold) serve as conceptual links between systems. Coupling the individual models (a-c) via these links makes FLEE a hybrid agent-based and system dynamics model [54] uniquely positioned to perform experiments impossible to conduct in the real world.
The modeling framework includes a spatially explicit virtual world representing a geographical area of interest (described in section 2.1); a dynamic hurricane (and forecast information about it) that passes through that world (section 2.2); a multi-agent model where information is interpreted by millions of heterogeneous agents and used to make evacuation decisions (section 2.3); and a traffic model where agents move across the virtual world as the hurricane approaches (section 2.4).
To design and implement FLEE, we integrated across multiple relevant areas of expertise, including agent-based modeling, meteorology, emergency management, protective decision making, risk communication, social vulnerabilities, and traffic modeling. As in any modeling effort, aspects of FLEE are simplified and some real-world processes are not represented. Decisions about what to include were based on our research goals (e.g., to explore the broad system dynamics), review of relevant literature, and discussions among our research team. These decisions are discussed throughout Sections 2.1-2.4.

The virtual world
FLEE's virtual world is a 10 x 4 cellular representation of the north-south axis of Florida, an area susceptible to hurricanes [16] and which has experienced mass evacuations such as Irma (2017). The grid spacing is coarse by design (40 grid spaces of 69-km x 69-km each) as the project's goal is to explore the broader system dynamics, and to provide a starting point for more complex experiments. Census data informs the spatial distribution of agent households on the abstracted grid as well as household characteristics (which then influence evacuation decisions as discussed in section 2.3). For the built infrastructure, virtual highways and interstates designed to simulate key aspects of Florida's road network are overlaid on the model grid (section 2.4). These roads allow agents to move between grid cells for evacuation. Details regarding the construction of each model system (i.e., the natural hazard, the human system, and the built environment) and the key connections between them are provided in the next three subsections.

The natural hazard (hurricane, forecasts, and warning information)
FLEE includes a hurricane that approaches and can move through the model domain (Fig. 1a). The storm and its forecasts can be real or synthetic; here we simulate real, historical storms using archived NHC forecast products which were issued in real time. The products include information about the observed storm characteristics (Table 1) and official forecast information (Table 2), both of which update every 6 h (both in FLEE and in the real world). When taken together, the products capture the critical storm information and its evolution as the storm approaches. We chose to use NHC products in this implementation rather than meteorological model ensembles (as used in Refs. [14,28]) because they more closely resemble forecasts seen by the public [38], and can be systematically perturbed to assess the evacuation's sensitivities to the forecast. Note, the NHC products are a starting point, but FLEE can be extended to include additional or more complex information about the storm and forecasts and warnings, if desired. In this article, NHC forecast products are obtained for Hurricanes Irma (2017) and Dorian (2019), which represent forecast scenarios with different tracks, speeds, forecast errors, and subsequently, different evacuation behaviors [3,39,40].
Each time a new forecast is entered into the model, information from the NHC products is synthesized into a "light system" forecast of the three major hazards known to drive hurricane evacuation decisions: wind, storm surge¹, and rain. The approach resembles the Meteoalarm web platform (http://www.meteoalarm.eu) where hazard risks are displayed in traffic-light color-coding (green, yellow, orange, red). Reds are reserved for severe and rare events, while also capturing some degree of imminence (i.e., reds are warnings, yellows are watches) [41]. We chose to use this type of light system in the modeling system because it (1) represents a synthesis of the forecast for public consumption, much as TV personnel provide [42], and (2) provides a means to connect forecast products with the model grid where evacuation decisions are made (Fig. 1b).
Table 1. Observed storm characteristics used in FLEE and the NHC products in which the data are located. Storm characteristics include the storm's observed location, size, intensity, and forward speed as it moves across the virtual world (left). This information was taken from archived NHC forecast products (right) which were issued in real time (available at https://www.nhc.noaa.gov/gis/). Consistent with the wind speeds in the NHC data, winds are discussed here in knots (nautical miles per hour, equivalent to approximately 1.15 mph).
¹ Storm surge is defined by the National Oceanic and Atmospheric Administration (NOAA) as the abnormal rise in seawater level during a storm, measured as the height of the water above the normal predicted astronomical tide. The surge is caused primarily by a storm's winds pushing water onshore.
Light system forecasts are created with ArcGIS by overlaying products onto the 10 x 4 model grid. Then, at each grid cell, forecast products are combined and weighted to estimate risk for wind, surge, and rain. Weights are based on current knowledge of the contributions of different factors to these types of hazards ([43,44]; team expertise in meteorology and risk perception), combined with an empirical validation that the progression of hazard risks for Irma and Dorian is reasonable. Sensitivity tests on the light system weighting (not shown) indicated that shifts in the weightings of the different factors did not have a significant effect on evacuations. The exact process of combining and weighting information to create light system forecasts is provided as Supplementary Tables 1-3.
Fig. 2 presents the light system forecasts for Hurricane Irma (2017) at 24 h intervals. The early NHC forecasts depict the most likely scenario as a landfalling major hurricane near Miami. However, the forecasts shifted westward as the storm approached Florida, with the storm eventually making one mainland U.S. landfall in the Florida Keys and a second in southwest Florida near Naples. The light system captures the gradual westward shift in threats. Moreover, as the storm approaches Florida and track uncertainty decreases (confidence increases), the light system estimates increased risk focused on areas inside the narrowing cone of uncertainty. Because of these features, the light system appears to be a reasonable way of representing the risks associated with hurricane hazards and is adequate for the purposes of this study. As a result, FLEE becomes the first to use synthesized NHC products with ABMs, and alongside Watts et al. [13] and Morss et al. [6], contains one of the most sophisticated representations of hurricane forecast information in models of the hurricane evacuation system to date.
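To make the combine-and-weight step concrete, the following is a minimal Python sketch of how per-cell forecast factors might be turned into a traffic-light color. It is not FLEE's Fortran implementation: the factor names, the weights, and the color thresholds are all hypothetical placeholders, since the actual values are given only in Supplementary Tables 1-3.

```python
from typing import Dict

# Hypothetical weights and thresholds, chosen only for illustration.
# FLEE's actual values are listed in the paper's Supplementary Tables 1-3.
WIND_WEIGHTS: Dict[str, float] = {
    "wind_speed_probability": 0.5,  # chance of hurricane-force wind in the cell
    "in_forecast_cone": 0.3,        # 1.0 if the cell lies inside the cone, else 0.0
    "under_warning": 0.2,           # 1.0 if a watch/warning covers the cell, else 0.0
}

# (lower bound of score, color), checked from most to least severe.
COLOR_THRESHOLDS = [(0.75, "red"), (0.5, "orange"), (0.25, "yellow"), (0.0, "green")]


def risk_score(factors: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of factors, each assumed to be normalized to [0, 1]."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * factors.get(name, 0.0) for name, w in weights.items())
    return weighted_sum / total_weight


def light_color(score: float) -> str:
    """Map a score in [0, 1] to a traffic-light color."""
    for lower_bound, color in COLOR_THRESHOLDS:
        if score >= lower_bound:
            return color
    return "green"


if __name__ == "__main__":
    cell = {"wind_speed_probability": 0.9, "in_forecast_cone": 1.0, "under_warning": 1.0}
    print(light_color(risk_score(cell, WIND_WEIGHTS)))
```

In this sketch the same two functions could be reused for surge and rain by passing a different (again hypothetical) weight table, which mirrors the idea of estimating each hazard separately at every grid cell.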

The human system (information flow, evacuation-related decisions)
With the synthesized light system forecasts as inputs, an ABM simulates the "human system", i.e., information flow and evacuation-related decisions (Fig. 1b). This system includes two types of agents: emergency management agents who issue evacuation orders, and household agents (i.e., the public) who collect information, assess risks, and make protective decisions. This section gives an overview of the agents and their decision-making algorithms, which run every 30 min in FLEE.
As the hurricane approaches the coastline, emergency management agents (EMs) decide whether to issue evacuation orders for each grid cell. The decision-making process is represented schematically in Fig. 3 and is based on research by Demuth et al. [38], Dye et al. [45], and Bostrom et al. [29], as well as the analysis in Cutter [46]. Clearance times are subjectively assigned to FLEE's grid cells using data from the Florida Statewide Regional Evacuation Study Program [47] which accounts for available road networks and the number expected to evacuate per county (based on population density and forecast intensity). For example, high clearance times (40-60 h) are located in Miami and Tampa Bay for intense (red) surge forecasts; low clearance times (5-20 h) occur in rural areas upstate with less intense (yellow) surge forecasts. Since surge is not expected inland, only coastal EMs issue evacuation orders in FLEE.
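As an illustration only, the sketch below shows one way an EM agent's order decision could be encoded in Python. The rule itself (issue an order once the hours until storm arrival fall to or below the cell's clearance time) is an assumption made for this example; the actual decision process is the one shown schematically in Fig. 3. The `GridCell` class and its field names are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GridCell:
    is_coastal: bool
    surge_color: str          # light system surge forecast: "green" to "red"
    clearance_time_h: float   # hours needed to clear the cell (e.g., 5-60 h)
    order_issued: bool = False


def should_issue_order(cell: GridCell, hours_to_arrival: float) -> bool:
    """Hypothetical rule: order an evacuation when surge is forecast and the
    remaining lead time no longer exceeds the clearance time."""
    if cell.order_issued or not cell.is_coastal:
        return False  # only coastal EMs issue orders, and only once
    if cell.surge_color == "green":
        return False  # no surge threat forecast for this cell
    return hours_to_arrival <= cell.clearance_time_h


if __name__ == "__main__":
    miami = GridCell(is_coastal=True, surge_color="red", clearance_time_h=50.0)
    print(should_issue_order(miami, hours_to_arrival=72.0))
    print(should_issue_order(miami, hours_to_arrival=48.0))
```

Under this assumed rule, cells with long clearance times (dense coastal areas under red surge forecasts) receive orders earlier than rural cells with short clearance times.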
The second type of agent, the household agent, represents a group of 4 individuals, bringing the number of estimated households in FLEE to 4.1 million (note: the literature suggests people generally make household-based evacuation decisions, e.g., summary in Ref. [48]). This is a simplification to reduce model run-time, as the average household size in Florida is estimated at 2.7. Since the paper's goals are to describe FLEE and demonstrate its capabilities, we consider this assumption acceptable for the present study. Future experiments building fundamental knowledge of the system dynamics should accurately reflect household size.
Household agents collect information about the hurricane, assess risk posed by the storm, and decide whether the risk warrants evacuation. The design of the evacuation decision-making algorithms prescribed to these agents was adapted from conceptual models of protective decision-making for hazards, such as the Protective Action Decision Model (PADM [11]; see hurricane applications in Refs. [13,49,50]), and findings from empirical research on decision-making for hurricanes [10,12][51][52][53][54][55][56][57][58][59]. As noted in Watts et al. [13], a major challenge is to synthesize the conceptual PADM model and information from empirical analyses into simple yet sufficiently specific instructions for agents. For the purposes of our model, we are not seeking a fully realistic algorithm, but one that captures the main processes underlying public evacuation decisions in the context of the modeling system so we can examine the broader evacuation dynamics holistically.
To develop the household decision algorithm, we synthesized the relevant literature which suggests that people generally evacuate when they believe that the hurricane poses a risk to themselves or their family, and that different people perceive risk differently and have different evacuation barriers [12,49,52]. This literature also finds that factors with the strongest, most consistent influence on evacuation decisions include the risks indicated by forecast information and evacuation orders, as well as household characteristics associated with risk perceptions and evacuation barriers [10]. Thus, we construct the decision-making algorithms by combining time-varying information about the evolving risk (from light system forecasts and EMs' evacuation orders) and household characteristics related to perceived and actual risk (age, mobile home residence) to form a risk assessment. This risk assessment is then compared with evacuation barriers (socioeconomic status, car ownership) which vary across the agent population and the model grid. Undecided agents seek information and update decisions every 30 min, making agents active participants in the evacuation decision-making process [6,13,60,61]. A high-level schematic of the decision-making algorithm is presented in Fig. 4; details regarding the algorithm's variables and formulation are provided in Supplementary Table 4.
Agents' household characteristics are prescribed by subjectively projecting county-level census and social vulnerability data regarding mobile home ownership, age, car ownership, and socioeconomic status (which includes poverty rates, unemployment, and income) onto FLEE's model grid (Supplementary Figure 1; [62]). Once the geographical distribution of variables is sorted between cells, specific characteristics are stochastically assigned to individual households (Supplementary Table 5). The idea is not to perfectly represent the real-world characteristics, but to generally capture their geographical distribution, and to have an appropriately wide range of household characteristics within grid cells. This results in many heterogeneous agents with unique preferences and characteristics.
To account for complexities in how people process and value different information, factors influencing a household's risk assessment are weighted differently between households (Supplementary Table 6). For example, some agents are concerned about evacuation orders while others are not; some are concerned about their mobile home's durability while others are not, and so on. Varying the weights captures these differences. In addition, varying the weights indirectly represents other factors such as culture and worldviews which are sometimes important [63,64]. Weight distributions are stochastically generated for each household with specified ranges informed by the literature [53,58][65][66][67][68][69]. The idea is to reflect the relative importance of each factor (e.g., evacuation orders, forecast information, mobile home ownership, and age, in that order) as established in Huang et al. [10].
One noteworthy simplification of the decision-making algorithm is that households do not share forecast information with other agents. In other words, everyone has the exact same forecast and evacuation order information, i.e., it is a world with perfect, instantaneous communication of updated forecast information. Another is that they do not consider social cues, such as seeing other people evacuate, which can increase one's risk perception. We also do not consider previous experience of disasters, social-media influence, or the structural integrity of buildings, which can influence people's risk assessments and behaviors [11,52,59]. Again, the idea is to capture the main processes underlying public evacuation decisions so we can examine the hurricane-forecast-evacuation system dynamics holistically. Such features could be added in future model versions, depending on the intended research goals.
Fig. 4. ... [11], the process begins when agents combine information obtained from multiple sources (e.g., forecast information, evacuation orders, and household characteristics) into a household risk assessment, which is then compared with evacuation barriers (i.e., socioeconomic barriers, car ownership) that vary across the agent population. A household will evacuate if the household's risk assessment is greater than the household's evacuation barriers.
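The comparison of risk assessment against evacuation barriers can be sketched in a few lines of Python. This is a simplified illustration, not the formulation in Supplementary Table 4: the 0-1 encodings, the linear weighted sum, the field names, and the `NO_CAR_BARRIER` constant are all assumptions introduced for this example. Only the final rule (evacuate when the risk assessment exceeds the barriers) is taken directly from the text.

```python
from dataclasses import dataclass

NO_CAR_BARRIER = 0.5  # hypothetical added barrier for households without a car


@dataclass
class Household:
    # Hypothetical 0-1 encodings of the characteristics named in the text.
    older_adults: float
    mobile_home: float
    socioeconomic_barrier: float
    has_car: bool
    # Per-household weights (drawn stochastically in FLEE; fixed here).
    w_order: float
    w_forecast: float
    w_mobile_home: float
    w_age: float


def risk_assessment(h: Household, forecast_risk: float, under_order: bool) -> float:
    """Assumed linear combination of time-varying and household risk factors."""
    return (
        h.w_order * (1.0 if under_order else 0.0)
        + h.w_forecast * forecast_risk
        + h.w_mobile_home * h.mobile_home
        + h.w_age * h.older_adults
    )


def evacuation_barrier(h: Household) -> float:
    """Assumed additive barrier from socioeconomic status and car ownership."""
    return h.socioeconomic_barrier + (0.0 if h.has_car else NO_CAR_BARRIER)


def decides_to_evacuate(h: Household, forecast_risk: float, under_order: bool) -> bool:
    """Rule stated in the text: evacuate if risk assessment exceeds barriers."""
    return risk_assessment(h, forecast_risk, under_order) > evacuation_barrier(h)


if __name__ == "__main__":
    h = Household(older_adults=0.0, mobile_home=1.0, socioeconomic_barrier=0.4,
                  has_car=True, w_order=0.4, w_forecast=0.3,
                  w_mobile_home=0.2, w_age=0.1)
    print(decides_to_evacuate(h, forecast_risk=0.5, under_order=False))
    print(decides_to_evacuate(h, forecast_risk=0.5, under_order=True))
```

In the example household, the forecast alone does not outweigh the barriers, but the same forecast combined with an evacuation order does, which is consistent with the strong influence of evacuation orders reported in the literature cited above.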

The built environment (infrastructure, evacuation routing, and traffic)
If a household decides to evacuate, they enter another ABM, this time representing evacuation traffic, which moves the household across an idealized road network toward a (presumably) safer location (Fig. 1c). This section gives an overview of this traffic model, its vehicle agents, and the idealized road infrastructure.
FLEE's idealized road network, and its relationship with the 10 x 4 model grid, is depicted in Fig. 5. The built environment consists of two five-lane interstates (blue arrows) situated on the edges of the model grid. These interstates, representing Florida's I-75 and I-95, transport evacuees northward along FLEE's "coasts." Additionally, two east-west, three-lane interstates (purple arrows), representing Florida's I-75 and I-4, allow residents to move horizontally across the grid. For example, these interstates let households move from Miami (yellow star) towards Tampa Bay (blue star) or inland towards Orlando (orange star). Lastly, eight two-lane highways (red arrows) allow inland residents access to the interstates where they can flee northward/inland to safety. Though idealized, FLEE's built infrastructure is designed to capture the main elements of Florida's real-world road network that influence large-scale evacuation dynamics. However, future models could add complex road structures, such as local and intra-city road networks, if desired.
Evacuating households are instructed to depart within 12 h of the evacuation decision [48,70,71]. Departure times are generated stochastically within this 12 h timeframe. When it is time to depart, households are assigned a vehicle and look for spots on the nearest highway (Fig. 5; red and purple lines). Specifically, households search for any unoccupied spot along the 69 km stretch of highway corresponding to their home grid cell. If an open spot exists, they are immediately placed in this spot. If spots are unavailable due to traffic for a period of time, evacuees can lose patience, abandon the evacuation, and shelter in place instead (this process is detailed in Supplementary Table 7). In this way, the amount of evacuation traffic influences evacuation decision-making for households.
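The departure logic described above can be sketched as follows. Only the 12 h departure window comes from the text; the uniform distribution, the fixed patience threshold `PATIENCE_H`, and the state names are assumptions for illustration, and the actual patience process is the one detailed in Supplementary Table 7.

```python
import random

MAX_DEPARTURE_DELAY_H = 12.0  # departure window stated in the text
PATIENCE_H = 6.0              # hypothetical; actual value is in Supplementary Table 7


def draw_departure_delay(rng: random.Random) -> float:
    """Draw a departure delay in hours (a uniform distribution is assumed)."""
    return rng.uniform(0.0, MAX_DEPARTURE_DELAY_H)


def waiting_outcome(hours_waited: float, spot_available: bool) -> str:
    """State of a household trying to enter the highway in its home grid cell."""
    if spot_available:
        return "on_road"           # placed immediately in the open spot
    if hours_waited >= PATIENCE_H:
        return "shelter_in_place"  # gives up on evacuating because of traffic
    return "waiting"               # tries again at the next opportunity


if __name__ == "__main__":
    rng = random.Random(42)
    print(round(draw_departure_delay(rng), 2))
    print(waiting_outcome(hours_waited=7.0, spot_available=False))
```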
In regard to destinations, nearly half of the evacuees are randomly selected to evacuate out-of-state (e.g., based on [3,48]). For the remaining in-state evacuees, evacuation destinations are chosen based on where the forecast hazard risk is lower (e.g., from red to green) and where accommodations are available, which is typically in more populated areas [48]. In the case of Hurricane Irma, in-state evacuees typically moved upstate (e.g., towards Tampa Bay, Jacksonville) and inland (e.g., towards Orlando). Carless households move to local shelters, meaning they do not enter the road networks or influence traffic [3]. Regarding route selection, we simplify the complex process by assigning agents the shortest route [72]. Once assigned, evacuee routes do not change. The amount of time required to reach destinations is not considered, though this could be added in future models.
For those who enter the road, rules governing vehicle movement are simple: drivers accelerate when they can, slow down if they must, and do not accelerate at the speed limit (70 mph on interstates, 50 mph on roads) or behind another car. Lane switching is not permitted but could be added in future models. Some drivers exhibit erratic behaviors by randomly braking, potentially leading to traffic jams. Accidents are stochastically generated, with a frequency based on Robinson et al. [73]. Default settings for these parameters are described in Supplementary Table 7.
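These movement rules resemble a single-lane cellular-automaton traffic model of the Nagel-Schreckenberg type. The Python sketch below illustrates that general technique under stated assumptions; the source does not say that FLEE uses this exact formulation, and the discretization (integer cells and speeds), the function name `step`, and the parameter values are hypothetical. Accidents and fuel shortages are omitted.

```python
import random
from typing import List, Tuple

Car = Tuple[int, int]  # (position in cells, speed in cells per step)


def step(cars: List[Car], v_max: int, p_brake: float, rng: random.Random) -> List[Car]:
    """Advance all cars one time step on a single lane.

    `cars` must be sorted by increasing position; the last entry is the lead
    car. All cars are updated in parallel from the previous positions.
    """
    updated: List[Car] = []
    for i, (pos, speed) in enumerate(cars):
        # 1. Accelerate when possible, but never beyond the speed limit.
        speed = min(speed + 1, v_max)
        # 2. Slow down if needed so as not to reach the car ahead.
        if i + 1 < len(cars):
            gap = cars[i + 1][0] - pos - 1
            speed = min(speed, gap)
        # 3. Erratic drivers brake at random.
        if speed > 0 and rng.random() < p_brake:
            speed -= 1
        updated.append((pos + speed, speed))
    return updated


if __name__ == "__main__":
    rng = random.Random(0)
    cars = [(0, 0), (1, 0), (2, 0), (3, 0)]
    for _ in range(10):
        cars = step(cars, v_max=5, p_brake=0.2, rng=rng)
    print(cars)
```

Because followers react to the previous position of the car ahead, a single random braking event can propagate backward through a dense queue, which is the kind of micro-scale interaction that produces jam-like patterns in models of this type.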
An example of FLEE's evacuation traffic is shown in Fig. 6. The traffic model, which has a 1.2 s timestep, captures interactions between vehicles at micro-scales, e.g., over-reactive and/or erratic drivers cause other drivers to slow down, triggering realistic-looking traffic jams (Fig. 6; blue streaks). These interactions are important for investigating complex system dynamics such as traffic [7,9]. Congestion and slowdowns, similar to what is shown in Fig. 6, occur at intersections, in densely populated regions, surrounding accidents, or when vehicles run out of gas. In Section 4.1, we show that, before Hurricane Irma, severe traffic occurs along I-75 and I-95 northbound due to Miami and Tampa Bay being in the storm's path. During Irma's actual evacuation, severe traffic was also observed in these areas [3,74,75]. Because the traffic model captures important vehicle interactions at micro-scales, and generates reasonable traffic phenomena at regional scales, we believe FLEE's built environment represents evacuation traffic sufficiently well to examine the hurricane-forecast-evacuation system dynamics holistically.

Model validation
There are no governing equations to model human behavior. Therefore, a thorough understanding of FLEE's behavior, and validation that the behavior is as realistic as possible, must be achieved. This was accomplished in several ways. First, the modeling framework was tested throughout implementation to ensure the model code is error-free. This includes conducting sensitivity analyses on FLEE, i.e., components were perturbed, one by one, to check whether the model behaves reasonably (e.g., sensitivity tests on light system weights described in Section 2.2). Second, the model framework was calibrated against existing observational data, namely for Hurricane Irma [1,3,40,76]. These empirical studies provide an overview of Irma's evacuation behaviors, including the total number of evacuees, how Irma's evacuation rates change with time and vary spatially, and when/where significant traffic occurred. Throughout Section 4, we compare FLEE's default evacuation behaviors to these observations in an effort to validate the model framework, and in turn, demonstrate that FLEE portrays key aspects of real-world evacuation dynamics sufficiently well to be suitable for experimentation.
Table 3 provides an overview of the different experiments reported in this article. The first experiment (Table 3a) uses the default model parameters described in Sections 2.1-2.4 for Hurricane Irma. It provides a baseline of evacuation behaviors which are compared to existing observational data for validation. Based on this default simulation, we then systematically modify model parameters one by one, while holding other variables constant, to explore FLEE's behaviors and sensitivities. These experiments include varying the evacuation order timing (Table 3b), implementing contraflow (Table 3c), and changing the storm to Hurricane Dorian (Table 3d). Additional experiments changing the evacuation decision-making inputs (Table 3e) and the population density (Table 3f) are included as Supplementary Information.
Together, these proof-of-concept experiments are intended to demonstrate how FLEE can serve as a virtual laboratory uniquely positioned to advance our understanding of the hurricane-forecast-evacuation system.

Data analysis
To compare evacuation patterns and behaviors quantitatively across simulations, FLEE tracks evacuation statistics for all grid cells. The primary model outputs analyzed here are the percent of households that successfully evacuated (i.e., evacuation rates), and the percent who intended to evacuate but "gave up" due to traffic. The latter statistic provides insight into where excessive traffic may be preventing successful evacuations. In addition to displaying data by grid cells, values are broken down into multiple impact zones, designed as first-order approximations of areas likely to experience different levels of impacts based on the actual meteorological conditions produced by the storm. Here, we use four impact zones, defined by whether the grid cells: a) are coastal or inland, and b) primarily experience winds that are greater than 64 knots (hurricane-force) or less than 64 knots. Using the impact zones, we can determine who evacuated from locations that did not end up experiencing hazardous conditions. In addition, we examine compliance rates (i.e., the percentage of residents under evacuation orders who evacuated) and shadow evacuation rates (i.e., the percentage of residents who evacuated from areas not under evacuation orders; [48,77]). Note: evacuation orders are issued for entire grid cells, i.e., everyone in that grid cell either gets an evacuation order or not.
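As a worked illustration of these metrics, the Python sketch below computes compliance and shadow evacuation rates from per-cell counts. The `CellStats` record and its field names are hypothetical, and the sketch assumes that the shadow evacuation rate is normalized by the number of households in cells without evacuation orders, which is one reading of the definition above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CellStats:
    households: int    # households living in the grid cell
    evacuated: int     # households that successfully evacuated
    gave_up: int       # households that intended to leave but gave up in traffic
    under_order: bool  # orders apply to whole cells, so one flag per cell


def _percent(part: int, whole: int) -> float:
    return 100.0 * part / whole if whole else 0.0


def compliance_rate(cells: List[CellStats]) -> float:
    """Percent of households under evacuation orders who evacuated."""
    ordered = [c for c in cells if c.under_order]
    return _percent(sum(c.evacuated for c in ordered),
                    sum(c.households for c in ordered))


def shadow_evacuation_rate(cells: List[CellStats]) -> float:
    """Percent of households not under evacuation orders who evacuated."""
    unordered = [c for c in cells if not c.under_order]
    return _percent(sum(c.evacuated for c in unordered),
                    sum(c.households for c in unordered))


if __name__ == "__main__":
    cells = [
        CellStats(households=1000, evacuated=600, gave_up=50, under_order=True),
        CellStats(households=2000, evacuated=300, gave_up=0, under_order=False),
    ]
    print(compliance_rate(cells), shadow_evacuation_rate(cells))
```

The same pattern could be applied per impact zone by filtering cells on a coastal/inland flag and on whether they experienced hurricane-force winds.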
In looking at the results, we compare multiple metrics that might indicate successful outcomes in different ways. For example, high compliance rates may not be "good" if the storm ends up not having much impact in those areas, and shadow evacuation rates may not matter if those at highest risk can get out safely.
Because FLEE includes stochastic elements, it can exhibit some run-to-run variability. For example, in a series of tests where simulations were repeated five times, evacuation rates varied by 0 to 2% within grid cells. This run-to-run variability is smaller than in other agent-based evacuation simulations [13,36], likely because there are many more agents in this model (nearly 4.1 million households/vehicles). Nevertheless, when interpreting results, changes smaller than this 0 to 2% variability within grid cells are considered insignificant.
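The significance rule above can be written as a simple filter. This is a sketch in Python under the assumption that a difference of exactly 2 percentage points counts as within the noise; the function name and signature are illustrative and do not come from the FLEE code.

```python
NOISE_PCT = 2.0  # run-to-run variability within grid cells, as reported in the text


def significant_change(default_rate, experiment_rate, noise=NOISE_PCT):
    """Return the change in percentage points, or 0.0 if it lies within the noise band."""
    diff = experiment_rate - default_rate
    return diff if abs(diff) > noise else 0.0
```

Applied cell by cell to an experiment minus the default run, this leaves only the differences that exceed the stochastic variability.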

Spatial and temporal patterns of evacuation
First, we examine results from a simulation with the default FLEE configuration for Hurricane Irma (Table 3a). Comparing these results with observations of Irma's actual evacuation [1,3,40,76] provides a first-order assessment of whether agents in the model are behaving reasonably based on the processes implemented. The results also illustrate key aspects of FLEE's behavior, including the spatial and temporal patterns of evacuation, which provide a baseline for interpreting results from subsequent experiments (sections 4.2-4.4; supplementary results 1-2).
Based on the default model settings for Irma, EM agents issue evacuation orders in a pattern similar to what was observed (Fig. 7; red cells). Evacuation orders were first issued around Miami-Ft. Lauderdale 36-48 h into the simulation (Fig. 7b), and spread northward along both coastlines over the next several days (Fig. 7c-e). The last evacuation orders were issued in Jacksonville 120 h into the simulation, which coincides with the time Irma makes landfall along the southwest Florida coast (Fig. 7e). By the end of the simulation, Irma's hurricane-force winds (Fig. 7f and g; dotted cells) impacted the western two-thirds of the model, particularly the southwest and western coastlines, while leaving the east coast generally unscathed. This general progression of evacuation orders being issued from south to north along both coasts matches what occurred with Irma (e.g., see Pages 14-15 and Fig. 2 of [3] for evacuation orders by county). This increases our confidence that the EM decision-making algorithm, and the storm surge forecasts on which it is based, behave reasonably and realistically.
The percentage of households who evacuate is shown at 24 h intervals for each grid cell (Fig. 7a-f). The results depict spatial and temporal patterns that are similar to real hurricane evacuation behaviors. First, evacuation rates increase after evacuation orders are issued, showing the importance of evacuation orders to decision-making [10]. Secondly and relatedly, evacuation rates are higher along the coasts than inland [12]. Thirdly, evacuation rates are still high for most areas. This arises because the forecasts in this simulation were dire everywhere, especially before the storm's track shifted westward (Fig. 2). The dire forecasts prompted EMs to issue evacuation orders along both coasts, and as a result, many agents evacuated areas which did not experience hurricane-force winds.
FLEE's simulated evacuation rates generally match existing observational data for Irma, which suggest evacuation rates vary from 40-60% along Florida's east coast, to 60-80% across the south and west coasts, and around 5-40% inland (e.g., see breakdown of evacuation rates by region in Fig. 4 of [3]; breakdown by voting precinct in Fig. 1c of [40]). One area for improvement is that FLEE produces evacuation rates higher than realistic early in the simulation, especially in the northern part of Florida [40]. Table 4 depicts evacuation rates in different impact zones. In total, 45.1% of households on the model grid evacuate, which equals 7.38 million people. For comparison, estimates from the Florida Department of Emergency Management [1] suggest actual evacuation numbers totaled 6.9 million. For a given level of wind impact, evacuation rates are higher along the coasts than inland (52.3% coastal vs. 22.2% inland for >64 knots, 58.1% coastal vs. 36.7% inland for <64 knots). Interestingly, areas experiencing hurricane-force winds had lower evacuation rates than areas less affected by the storm. This could be due to the potentially higher than realistic evacuation rates early in the simulation. It may also partially result from the east coast receiving evacuation orders, albeit unnecessarily, which increased evacuation rates in these areas, combined with excessive traffic along the west coast. For example, 17-32% of the populated Tampa Bay-St. Petersburg area gave up evacuating due to excessive traffic (Fig. 7g). The severe congestion, which did occur with Irma's actual evacuation, also reduced evacuation rates along the southwest and southeast coasts (e.g., see traffic information in Page 15 of [1,3,76]).
A second pattern illustrated by the evacuation rates is the variability in evacuation decisions among households, i.e., some households decide to leave, but many do not, despite seeing similar information and having similar characteristics. This is consistent with real-world hurricane evacuations, and more generally with the heterogeneity exhibited by US households in the real world [24,27]. In the model, the variability arises from households' different weighting of information as well as their different characteristics and barriers, which create differences in household risk perception.
Fig. 8 illustrates the temporal evacuation patterns. Despite not receiving evacuation orders, many households (black dotted line) evacuate in the first 0-36 h. Evacuation rates increase linearly between 36 and 108 h as evacuation orders expand along the coasts. Just before the storm moves ashore around 126 h, evacuation rates decrease, while the number of households giving up due to excessive traffic (black dashed line) increases. The latter occurs because household agents' patience is influenced by the forecast arrival time of the storm. In other words, agents see the impending landfall, then decide to abandon the evacuation and stay home. These temporal patterns of evacuations, as with the spatial patterns, generally match existing empirical data, which suggests that evacuation rates increased semi-linearly throughout this period (e.g., see Fig. 6 of [3]; Fig. 2c of [40]). As a result, we believe FLEE's simulated evacuations provide a realistic baseline for interpreting results from subsequent experiments (sections 4.2-4.4).
Table 4 Evacuation rates by impact zones for Irma's default run. Successful evacuation rates are broken down into impact zones (coastal vs. inland, and areas experiencing vs. not experiencing hurricane force winds of 64+ kts), compliance rates (i.e., those instructed to evacuate via evacuation order who did evacuate), shadow evacuation rates (i.e., percentage of people not instructed to evacuate who did), and the percentage of evacuees who attempted to evacuate but "gave up" due to excessive amounts of traffic.
Table 5 Evacuation behaviors by impact zones for experiments varying evacuation order (EO) timing. Successful evacuation rates are broken down into impact zones (coastal vs. inland, and areas experiencing vs. not experiencing hurricane force winds of 64+ kts), compliance rates (i.e., those instructed to evacuate via evacuation order who did evacuate), shadow evacuation rates (i.e., percentage of people not instructed to evacuate who did), and the percentage of evacuees who attempted to evacuate but "gave up" due to excessive amounts of traffic. Note: EO is short for evacuation orders; CT is short for clearance times. Irma's default model run is included for reference.
Figure caption (beginning not recovered): ... Fig. 7f. Also expressed are the swath of hurricane force winds (dotted cells), evacuation orders (red cells), and the population by grid cell (e). These provide a frame of reference, e.g., major cities depicted include Miami-Ft. Lauderdale (yellow star), Tampa Bay-St. Petersburg (blue star), Jacksonville (green star), and Orlando (orange star). Note, run-to-run variability due to stochastic elements in the model ranges from 0 to 2% in grid cells for both evacuation rates and percent giving up due to traffic. Therefore values of −2 to 2 lie within that variability and should be ignored.

Varying timing of evacuation orders
Now we investigate the effects of changing the evacuation order timing in FLEE (Table 3b). Specifically, we conduct four experiments: 1) shifting evacuation orders 10 h earlier, 2) shifting evacuation orders 10 h later, 3) equalizing the clearance times for all grid cells, making the storm's forecasted arrival time the only factor influencing differences in evacuation order timing across grid cells, and 4) shifting evacuation orders 10 h earlier than in experiment 3. These experiments build on the results examined in section 4.1, and begin to explore interactions among the evolving forecasts, evacuation orders, and household evacuation behaviors.
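To make the experimental design concrete, the sketch below shows one way the timing shift could interact with an evolving forecast. It is a simplified illustration in Python, not the EM algorithm implemented in FLEE: it assumes an order is issued the first time the hours remaining until the forecast arrival fall within the cell's clearance time plus a shift, and the function name, arguments, and forecast values are all hypothetical.

```python
def first_order_hour(forecast_arrival_by_hour, clearance_time_h, shift_h=0.0):
    """Return the first simulation hour at which an order would be issued, or None.

    forecast_arrival_by_hour maps the current simulation hour to the forecast
    arrival hour for a cell (None when the cell is no longer threatened).
    shift_h > 0 issues orders earlier; shift_h < 0 issues them later.
    """
    for now in sorted(forecast_arrival_by_hour):
        arrival = forecast_arrival_by_hour[now]
        if arrival is None:
            continue  # forecast no longer threatens this cell
        if arrival - now <= clearance_time_h + shift_h:
            return now
    return None


# Hypothetical cell whose threat disappears once the forecast track shifts away.
forecasts = {96: 130, 102: 130, 108: 130, 114: None, 120: None}
default = first_order_hour(forecasts, clearance_time_h=30)              # 102
earlier = first_order_hour(forecasts, clearance_time_h=30, shift_h=10)  # 96
later = first_order_hour(forecasts, clearance_time_h=30, shift_h=-10)   # None
```

Under these assumptions, delaying the decision can mean that no order is issued at all if the forecast shifts away in the meantime. Equalizing clearance times corresponds to passing the same `clearance_time_h` for every cell.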
Evacuation rates, broken down by impact zones (Table 5), indicate that changing evacuation order timing in the four experiments reduces the overall evacuation rate from 45.1% in Irma's default simulation (top row) to 43.3-44.6%, which is 295,200-82,000 fewer evacuees. Similarly, the rate of evacuees giving up due to traffic increases from 10.5% in the default simulation to 10.9-13.0%, which is 65,600-410,000 more people. This is surprising, as one might expect evacuation rates to increase if evacuation orders are issued earlier, as this creates more time to evacuate.
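The conversion from percentage-point changes to numbers of people can be checked with a short calculation. The Python sketch below assumes a total modeled population of about 16.4 million, a value inferred here from the statement that 45.1% corresponds to 7.38 million people; it is not a figure quoted directly in the text.

```python
POPULATION = 16_400_000  # assumed: 7.38 million / 0.451, rounded


def people_from_rate_change(old_pct, new_pct, population=POPULATION):
    """Number of people represented by a change between two percentages."""
    return round(population * abs(new_pct - old_pct) / 100)


fewer_evacuees = (people_from_rate_change(45.1, 43.3),   # 295,200
                  people_from_rate_change(45.1, 44.6))   # 82,000
more_giving_up = (people_from_rate_change(10.5, 10.9),   # 65,600
                  people_from_rate_change(10.5, 13.0))   # 410,000
```

With this assumed population, the results match the counts quoted for the evacuation order timing experiments.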
Examining the results for every grid cell (Fig. 9) shows that, despite affecting overall evacuation rates by only 1-2%, changing the evacuation order timing has significant and sometimes opposite effects in neighboring areas. For example, shifting evacuation orders 10 h earlier (Fig. 9a) increases evacuation rates (and decreases traffic) in Tampa Bay-St. Petersburg by 4%, while decreasing evacuation rates (and increasing traffic) by 2-16% in neighboring cells to the south. This points to the importance of coordination among EMs when issuing evacuation orders within a region, and to a need for follow-up experiments to unpack these complex processes.
Shifting evacuation orders 10 h later (Fig. 9b) across all grid cells results in evacuation orders not being issued in the Jacksonville metropolitan area. This is because, during the additional 10 h in which EMs are deciding whether to issue evacuation orders, the forecast shifted westward and away from Jacksonville (Fig. 2), thus prompting EMs to decide against issuing evacuation orders for the area. These results demonstrate how the model captures the real-world tradeoffs between issuing evacuation orders earlier (when the forecast uncertainty is greater) versus waiting until closer to the storm's arrival (when the forecast uncertainty is reduced).
Fig. 9c and d show results from experiments where clearance times are equalized. Recall that clearance times are meant to account for differences in available road networks and the number of people expected to evacuate, e.g., clearance times are highest in populated metropolitan areas and in south Florida, where people travel longer distances to evacuate. Thus, equalizing the clearance times, which makes the storm's arrival time the only influence on evacuation order timing, is meant to demonstrate the importance of clearance times in EM decisions. The experiments produce a slight increase in evacuation rates for Tampa Bay-St. Petersburg (1-4%) but a general decrease in evacuation rates everywhere else. This is especially true in Miami, where evacuation rates drop by 10-18%. Overall, removing the default clearance times worsened hurricane evacuations by 1-2% in total, which is a decrease of 164,000-328,000 evacuees (Table 5). This demonstrates how evacuations can be made more successful by accounting for clearance times in EMs' evacuation order decision-making.
Fig. 10 shows the evolution of evacuation rates (and rates giving up due to traffic) with time for the different experiments.
Shifting evacuation orders 10 h earlier (green lines) than default (black lines) simply causes evacuation rates to increase earlier in the simulation, and does not otherwise meaningfully change the evacuation "shape". Similar effects are observed with the uniform clearance time experiments (orange/red lines). This suggests the model behaves as expected, and in general, the experiments demonstrate how the model can quantify and explore, in a simplified context, the effects of varying evacuation order decisions by EMs. This includes simulating the tradeoffs involved in waiting to issue evacuation orders and the resulting effect on evacuation success, which cannot be quantified using empirical methods. In addition, the results suggest the modeling system is capable of exploring the effects of evacuation strategies such as phased evacuations, which may be helpful to emergency management [22,78,79].
Figure caption (beginning not recovered): ... (Table 3a; section 4.1) is expressed (black lines), as are experiments modifying the timing of evacuation orders, specifically by a) shifting evacuation orders 10 h earlier than default (green lines), b) shifting evacuation orders 10 h later than default (purple lines), c) equalizing the clearance times, making the storm's arrival time the only influence on evacuation order timing and causing evacuation orders to be issued linearly from south to north (orange lines), and d) shifting evacuation orders 10 h earlier than in experiment c (red lines).

Implementing interstate contraflow
Next, we investigate the effects of adding contraflow to lessen evacuation traffic and improve evacuation rates in FLEE. For the experiments, we add one contraflow lane on I-95, one contraflow lane on I-75, and one contraflow lane on both interstates (Table 3c).
The results in Table 6 suggest adding contraflow lanes does improve evacuation rates and reduce traffic overall. For example, evacuation rates improve from 45.1% in the default simulation (top row) to 48.0, 47.6, and 49.8% when adding contraflow onto I-95, I-75, and both interstates, respectively. This equates to an increase of 475,600, 410,000, and 770,800 evacuees. Meanwhile, rates giving up due to traffic decrease from 10.5% to 6.6-8.3%, which is a decrease of 639,700-360,800 people. The improvements in evacuation rates, and the reduction in traffic, are not limited to particular times in the simulation; rather, the improvements are uniform throughout (Fig. 11).
When comparing the impact of the different experiments on various grid cells (Fig. 12a-d), the targeted effect of contraflow becomes clear. For example, adding contraflow onto I-95, which is located along the eastern coastline, improves evacuation rates (and reduces traffic) along the eastern half of the model grid. Adding contraflow onto I-75, which is found along the western coastline, improves evacuation rates (and reduces traffic) along the western half of the model grid. These improvements in evacuation rates are large locally, ranging from 3 to 14% along the southwest coast and 5 to 12% along the southeast coast.
The results suggest that, if given accurate forecasts, implementing contraflow in the modeling system reduces traffic and thus increases successful evacuation in targeted regions, which is what contraflow is designed to do. This provides evidence that the model can be used to investigate the potential impacts of modifying different parts of the system, such as implementing contraflow or other evacuation management strategies, and to determine their influence on the hurricane evacuation in its full context (e.g., supporting studies by Refs. [20,22,36,78-87]).
Table 6 Evacuation behaviors by impact zone when implementing contraflow. Successful evacuation rates are broken down into impact zones (coastal vs. inland, and areas experiencing vs. not experiencing hurricane force winds of 64+ kts), compliance rates (i.e., those instructed to evacuate via evacuation order who did evacuate), shadow evacuation rates (i.e., percentage of people not instructed to evacuate who did), and the percentage of evacuees who attempted to evacuate but "gave up" due to excessive amounts of traffic.
Figure caption (beginning not recovered): ... (Table 3d; Section 4.4) is also expressed (grey lines). Note, Dorian's simulation extends to 184 h while Irma's ends after 144 h.

Hurricane Dorian
Finally, we explore the modeling system's behavior when a different scenario, with a different storm and a different set of evolving forecasts, is simulated. This experiment (Table 3d), with Hurricane Dorian (2019), uses the same set of parameters as in the default Irma simulations. This experiment should be of interest to meteorologists and emergency managers, by exploring how differences in storm characteristics and forecast information can propagate through the agent-based system and translate into different patterns in evacuations. Fig. 13 shows the evolution of Dorian along with the NHC and light system forecasts. The early forecasts (0-72h into the simulation) predict the most likely scenario as a landfalling major hurricane along Florida's east coast. However, the forecasts shift northward (96-120h), significantly reducing areas under threat. After remaining nearly-stationary over the Bahamas (120-144h), the storm re-accelerates northward (>168h) narrowly missing Florida's east coast. As with Irma, the light system captures the spatial and temporal shifts in threats with Dorian. Because of the forecasts, EMs issue evacuation orders along the central east-coast by 72 h (Fig. 14; red cells). The evacuation orders spread along the coastline over the next several days, generally matching what was observed [88].
Compared to Irma, this is a fundamentally different storm with different areas at risk and fewer people under evacuation orders. As a result, evacuation rates were lower with Dorian (33.5%) than with Irma (45.1%), which is 2 million fewer evacuees (Table 7). Similarly, fewer households give up on evacuating due to traffic with Dorian (6.1%) than with Irma (10.5%). This reduction in evacuation rates in FLEE generally matches existing observational data for Dorian [39].
During the first 24-72 h, evacuation rates increase everywhere, as most areas are under threat (Fig. 14a-b). As with Irma, we suspect the model is producing evacuation rates higher than realistic during this period, especially in the northern part of Florida and inland. However, the available observational data [39] are quite limited and cannot confirm this. Beyond 48 h, evacuation rates only increase along the easternmost portions of the grid where evacuation orders are issued (Fig. 14c-f). By the end, the highest evacuation rates occur where one would expect (i.e., along the east coast, where risk is highest and where evacuation orders are issued), which is consistent with real-world evacuation behaviors [12]. With the exception of the Tampa Bay-St. Petersburg area, where evacuations occurred early in the simulation, evacuation traffic was primarily confined to the southeast coast (Fig. 14h).
The evolution of Dorian's evacuation rates with time, averaged across the model grid, is shown in Fig. 11 (grey lines). Similar to Irma's default run (black lines), evacuation rates during Dorian quickly increase due to the dire initial forecasts. Once the forecasts shift northward, the increase in Dorian's evacuation rates slows significantly, with some further increases due to the issuance of evacuation orders between 60 and 120 h. The evacuation stops by 140 h because, at this point, the storm is expected to remain offshore. The results again suggest that Dorian's evacuation is, in many respects, different from Irma's.
Robust empirical data on Dorian's evacuation rates are not publicly available. However, the available data [39] suggest the model is, to first order, generating reasonable evacuation behaviors, e.g., it captures the inland versus coastal differences in evacuations, the correct issuance of evacuation orders, and the prolonged, linear increases in evacuation rates observed for several days [39]. When combined with the results from Irma (section 4.1), the results provide further evidence that the model reasonably simulates the integrated hurricane evacuation system, and can be used to study various storm scenarios, real or imagined. Furthermore, the differences in the spatial and temporal patterns of evacuation between the two hurricanes confirm the importance of forecast information to the evacuation dynamics [10,12].

Summary and discussion
This article conceptualizes and implements a modeling framework for studying the dynamics of the hurricane-forecast-warning system. The modeling framework, called FLEE, integrates models of the natural hazard, the human system, the built environment, and connections between systems. It includes millions of agents, with behaviors and characteristics informed by empirical research, who interact with each other, with their physical environments, and with evolving, uncertain forecast information to produce evacuation decisions and generate evacuation traffic. After describing FLEE, we validate the model framework by comparing its evacuation behaviors to observations, mainly for Hurricane Irma (2017), and present a set of proof-of-concept experiments illustrating its behaviors when key parameters are modified. In doing so, we show FLEE is capable of examining the dynamics of the hurricane-forecast-evacuation system from a new perspective.
We propose several areas for future work. First, FLEE can explore how changes in forecast track, intensity, storm size, forward speed, uncertainty, and different forecast scenarios influence evacuations [89]. This provides meteorologists with a societally relevant alternative to traditional measures of forecast accuracy (need described by [90-92]), by measuring the impact of forecast elements and uncertainties on how people receive and process the information, make evacuation decisions, and physically evacuate. Second, the model can be used to address behavioral science questions, such as how future projections of population density, socioeconomic status, inequality, and car access may affect hurricane evacuations. Third, FLEE can further determine the relative effectiveness of evacuation management strategies such as contraflow, adding public transportation, evacuation order timing, and phased evacuations (building on, e.g., [17]), and how forecasts influence evacuation order decisions [28]. This benefits researchers, practitioners, and policy-makers in hazard risk management.
Fig. 14. Evacuation rates for grid cells during Hurricane Dorian (2019). Rates are expressed every 24 h (a-g). The percentage of each grid cell which intended to evacuate but could not due to traffic is also expressed (h), as are the spatial and temporal patterns of evacuation orders (red cells). In addition, the number of evacuees still en route at the various times is shown (bottom of panels a-f). Note, the hurricane force winds (>64 kts) did not impact the model grid. Also expressed is the population by grid cell (i), which provides a frame of reference, e.g., major cities depicted include Miami-Ft. Lauderdale (yellow star), Tampa Bay-St. Petersburg (blue star), Jacksonville (green star), and Orlando (orange star).
Table 7 Evacuation behaviors by impact zone when switching from Irma to Dorian. Successful evacuation rates are broken down into impact zones (coastal vs. inland, and areas experiencing vs. not experiencing hurricane force winds of 64+ kts), compliance rates (i.e., those instructed to evacuate via evacuation order who did evacuate), shadow evacuation rates (i.e., percentage of people not instructed to evacuate who did), and the percentage of evacuees who attempted to evacuate but "gave up" due to excessive amounts of traffic.
FLEE is intentionally abstracted to explore the broader evacuation dynamics. However, additional layers of complexity can be added, depending on research goals, e.g., to account for family composition, social circle's evacuation status, social-media influence, and house/building strength in evacuation decisions. FLEE can be extended to study other regions or hazards, such as hurricanes followed by flooding, loss of power networks, damage to roads, and other cascading failures. Additional in-depth comparisons with observational data can improve FLEE's realism, and subsequently, its capability to answer questions of interest. But given the sparse availability of empirical data on hurricane evacuations, new data sets are likely needed. Nevertheless, in its current form, FLEE can significantly advance our understanding of the integrated hurricane-forecast-warning system.
This new knowledge is informed by and feeds back into empirical research, and can ultimately support researchers, practitioners, and policy-makers in a variety of disciplines, thereby offering the promise of direct applications to save lives and mitigate hurricane losses.

Data availability
The model code was created using the Fortran programming language. The commented code, an ODD specification (a formal, detailed model description), and supporting input files are available for download at the CoMSES model library (https://www.comses.net/codebase-release/4cd05855-f387-48bd-8899-9d62375518cb/).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Table 3 Rain risk is calculated for each grid cell by assigning a risk score (1-4) based on the storm speed (>15 knots, 10-15 knots, 5-10 knots, and <5 knots), location within the forecast wind field (34, 50, 64+ knot intervals) which estimates the size of the rain field, location in the cone of uncertainty, and the expected arrival time of tropical storm force winds. The scores are weighted, summed, and rounded to the nearest integer to provide an overall rain threat score (1-4) expressed as green-yellow-orange-red, respectively. Note: scores for the expected category and forecast period are set to 1 if the grid cell is not situated within the cone of uncertainty and/or the forecast wind radii. When taken together, the products capture the rain's critical forecast elements (e.g., storm's track, size, forward speed, amount of uncertainty, evolution with time, imminency).
Table 4 Key variables in the household evacuation decision-making algorithm. The algorithm's inputs (i.e., forecast, evacuation orders, mobile home ownership, age) are normalized onto a 0-100 scale and summed to produce household risk assessment, which is then weighed against evacuation barriers to produce a decision.
Supplementary Table 5 Prescribing agent characteristics to individual households. At the beginning of the simulation, FLEE checks the agent's location and subsequent values in Supplementary Figure 1, then stochastically assigns household characteristics at the values established above. These variables are static, meaning they are assigned at the beginning of the simulation and do not change, but serve as inputs into the agent decision-making algorithm as detailed in Supplementary
Table 6 Weighting of key variables in a household's risk assessment. Weights are designed to reflect the relative importance of each factor (e.g., evacuation orders, forecast information, mobile home ownership, and age, in that order) as established in Huang et al. [10]. For the individual hazards, studies suggest most households perceive wind and surge as the primary threat over rain (e.g., [65]). But in general, the relative weighting is not well known.
The frequency of accidents along the two outer interstates, i.e., I-95 and I-75. These stop traffic for 10 minutes.

Supplementary Results 1 - Varying households' weighting of different types of information
Next, we investigate the effects of changing household agents' weightings of the four factors that influence their hurricane risk assessment: the forecast, evacuation orders, mobile home ownership, and age. For each experiment, we set the information weights to zero, effectively "turning off" each parameter, one-by-one, while holding the others constant (Table 3e). When compared with the default settings in Section 4.1, the experiments demonstrate the specific influence of each type of information on the evacuation behaviors, both spatially and temporally.
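The "turning off" procedure can be sketched as follows. This Python example is a simplified illustration rather than FLEE's Fortran algorithm: it assumes the risk assessment is a weighted sum of inputs normalized to 0-100 that is compared against a single barrier value, and the weight values, names, and threshold comparison are hypothetical (only the ordering of the weights follows the text).

```python
# Hypothetical weights; only their ordering (orders > forecast > mobile home > age)
# follows the description in the text.
DEFAULT_WEIGHTS = {
    "evacuation_order": 0.4,
    "forecast": 0.3,
    "mobile_home": 0.2,
    "age": 0.1,
}


def risk_assessment(inputs, weights):
    """Weighted sum of inputs that are already normalized to a 0-100 scale."""
    return sum(weights[name] * inputs[name] for name in weights)


def decides_to_evacuate(inputs, barriers, weights=DEFAULT_WEIGHTS):
    """Assumed decision rule: evacuate when assessed risk exceeds the barriers."""
    return risk_assessment(inputs, weights) > barriers


def turn_off(factor, weights=DEFAULT_WEIGHTS):
    """Copy of the weights with one factor set to zero; the others are unchanged."""
    return {**weights, factor: 0.0}


# Hypothetical household under an evacuation order with a moderate forecast threat.
household = {"evacuation_order": 100, "forecast": 50, "mobile_home": 0, "age": 0}
leaves_by_default = decides_to_evacuate(household, barriers=50)
leaves_without_orders = decides_to_evacuate(
    household, barriers=50, weights=turn_off("evacuation_order")
)
```

In this illustrative case the household evacuates under the default weights but stays once the evacuation order weight is zeroed, which mirrors the one-at-a-time design of the experiments.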
In Irma's default simulation (Section 4.1), 45.1% of households evacuate. However, turning off the information for evacuation orders, the forecast, mobile home ownership, and age, one-by-one, results in evacuation rates of 28.3%, 33.2%, 40.6%, and 44.8%, respectively. Similarly, relative to the default simulation, where 10.5% of households give up due to traffic, turning off the inputs reduces the rate to 2.6%, 8.1%, 9.8%, and 9.3%, respectively (Supplementary Table 8). In other words, the results indicate that, in the model's current formulation, evacuation rates are generally more sensitive to evacuation orders than they are to forecast information, mobile home ownership, and age. However, this is zone dependent, e.g., evacuation orders have a greater influence in coastal zones, and mobile homes have a greater influence upstate/inland. The former is due to model formulation (evacuation orders are limited to coastal zones) and the latter is due to the geographic distribution of mobile homes (e.g., as shown in Supplementary Figure 1). That said, we cannot draw conclusions (or interpret the model dynamics) based on these findings. Rather, we can say the relative importance of these factors is generally consistent with the meta-analysis of Huang et al. [10], which we used to prescribe the information weightings, thus adding confidence that the model behaves reasonably.
Breaking down the experiments by impact zones shows that, as expected, evacuation orders primarily impact evacuations along the coast. For example, turning off the evacuation order parameter decreases evacuation rates in the coastal >64 knot zone from 52.3% to 31.9%, while inland evacuation rates remain the same (Supplementary Table 8).
Table 8 Evacuation behaviors by impact zone when varying household weighting of information. Successful evacuation rates are broken down into impact zones (coastal vs. inland, and areas experiencing vs. not experiencing hurricane force winds of 64+ kts), compliance rates (i.e., those instructed to evacuate via evacuation order who did evacuate), shadow evacuation rates (i.e., percentage of people not instructed to evacuate who did), and the percentage of evacuees who attempted to evacuate but "gave up" due to excessive amounts of traffic.
Supplementary Figure 2 shows evacuation rates and traffic broken down by grid cell. Note, in this figure, rates are expressed as the departure from the default settings in Figure 7. The results further show how evacuation orders are a strong determinant of evacuation rates, as turning off the parameter reduces evacuation rates by 7-40% in places along the coast (Supplementary Figure 2b). Note that turning off evacuation orders increases evacuation rates in the inland Miami suburbs, as traffic is reduced in the surrounding coastal areas. This highlights how evacuation rates in a given grid cell are also influenced by those in other grid cells. Unlike evacuation orders, the other three parameters (Supplementary Figure 2a, c-d) exhibit a more uniform influence on evacuation rates across FLEE's grid. Areas most influenced by mobile home and age information occur in grid cells where rates of mobile home ownership are highest, and where age is expected to play a larger role (Supplementary Figure 1, see cells with higher ranking).
Though such information does not provide any new behavioral insights, it does verify that FLEE behaves as expected given the model's current configuration, and is capable of capturing complex processes (e.g., evacuation behaviors in one part of the model influencing those in other areas). These results increase our confidence that FLEE adequately represents real-world evacuations and is suitable for further experimentation.

Supplementary Figure 2. The spatial effects of "turning off" information inputs on evacuation rates and percent "giving up" from traffic for grid cells. Results are presented for experiments "turning off" the forecast information (a), evacuation orders (b), mobile home ownership (c), and age (d), one by one while holding the other parameters constant. Values are shown as the departure from the default settings in Section 4.1 and in Figure 7f-g. Also presented are the swath of hurricane force winds (dotted cells), evacuation orders (red cells), and the population by grid cell (e), which provide a frame of reference, e.g., major cities depicted include Miami-Ft. Lauderdale (yellow star), Tampa Bay-St. Petersburg (blue star), Jacksonville (green star), and Orlando (orange star). Note that run-to-run variability due to stochastic elements in the model ranges from 0-2% in grid cells for both evacuation rates and percent giving up due to traffic. Therefore, values of -2 to 2 lie within that variability and are insignificant.
Supplementary Figure 3 shows the importance of the different types of information during particular periods of the evacuation. For example, turning off evacuation orders (red lines) reduces evacuation rates compared to the default simulation (black lines), especially during the 36-102 hour period when evacuation orders were issued. Forecast information (purple lines) most influences evacuation rates between 30 and 60 hours, as forecasts indicated significant risk throughout Florida during this period. Unlike evacuation orders and forecast information, modifying the age (orange lines) and mobile home (green lines) factors does not impact any specific period of time, but simply reduces the evacuation rates overall. This is to be expected, as these parameters are defined at the start of the simulation and are not updated.
In summary, the simulations in this section illustrate how modifying the factors that influence households' evacuation decisions in the human system agent-based model propagates through FLEE's full modeling system to influence the spatial and temporal patterns of evacuation. In general, the results suggest FLEE behaves as expected given the model's current configuration, and matches patterns seen in empirical studies, which suggest forecast/warning information is a key driver for evacuations (e.g., [3,10]). Additionally, the results illustrate how modeling laboratories such as this can build our understanding of evacuation decision-making processes and how they intersect with other factors (e.g., the evolving forecast information, traffic) to produce evacuations.
Supplementary Figure 3. The temporal effects of "turning off" information inputs on the timing of evacuation rates (solid lines) and numbers giving up due to traffic (dashed lines), averaged across all grid cells. The default simulation (Table 3a; Section 4.1) is shown (black lines), as are experiments turning off the four main types of information used to assess risk: no forecast information (purple lines), no evacuation orders (red lines), no mobile home ownership (green lines), and no age (orange lines). Comparing the experiments to the default experiment (black lines) provides a general sense of the relative importance of each parameter for the overall evacuation behaviors. Also shown is the simulation where the population density is uniform (grey lines), which is further described in Supplementary Results 2.

Supplementary Results 2 - Varying geographical distribution of households
In this section, we investigate FLEE's behavior when the non-uniform geographical distribution of households in the default settings is changed to a uniform population distribution (Table 3f), i.e., where the 16.4 million residents (4.1 million households) are spread evenly across grid cells. The experiment is thus a first attempt to explore the effects of population density on evacuations, which cannot be done empirically. It also demonstrates how FLEE can be used to run different scenarios with population shifts, e.g., times of year when there are many tourists in certain areas, or projections 10+ years out of how evacuations may change as the population grows.
In total, evacuation rates increase from 45.1% in the default simulation to 49.9% when the population distribution is uniform, an increase of 786,720 people (Supplementary Table 9). Meanwhile, rates of households unable to evacuate due to excessive traffic decrease from 10.5% in the default simulation to 3.5%, a decrease of 1,147,300 people. Thus, the experiments suggest that the real-world, non-uniform population density substantially increases evacuation traffic and reduces evacuation rates.
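The head counts above follow from multiplying the change in percentage points by the total population. The Python sketch below shows that arithmetic. It is illustrative only and is not FLEE code: the function name is hypothetical, and the population of 16.39 million is an assumption, inferred because it reproduces the reported counts (the text rounds the population to 16.4 million).

```python
# Illustrative arithmetic only; not FLEE code.

# Assumption: inferred value that reproduces the reported head counts.
# The text reports the population as roughly 16.4 million.
POPULATION = 16_390_000


def people_from_rate_change(rate_before, rate_after, population=POPULATION):
    """Convert a change in a rate (in percent) into a number of people."""
    return round((rate_after - rate_before) / 100 * population)


# Evacuation rate: 45.1% (default) to 49.9% (uniform population)
more_evacuees = people_from_rate_change(45.1, 49.9)

# Giving up due to traffic: 10.5% (default) to 3.5% (uniform population)
change_giving_up = people_from_rate_change(10.5, 3.5)

print(more_evacuees)     # 786720
print(change_giving_up)  # -1147300
```

The negative sign on the second value indicates a decrease in the number of people giving up due to traffic.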
A more in-depth look reveals an interesting pattern in the spatial distribution of evacuation behaviors. In most places, evacuation rates are higher than in the default simulation while traffic is minimal (Supplementary Figure 4, bottom panel). The exception is the southern "coastal" cells, particularly around Miami, where rates of households unable to evacuate due to traffic increase by 12-17%, which reduces evacuation rates by 9-17%. One possible explanation is that the southern cells have 1) more evacuees than in the default run and 2) more evacuees downstream, i.e., the area is "last in line" to evacuate based on the available road network. It is also possible that we are seeing the impacts of clearance times being out of balance with what they would be in a world with this revised population density.
The results quantify contributions of the built environment to the evacuations. Furthermore, they illustrate the significant and potentially complex effects of population density on evacuation success, which should be explored further. The experiment also shows how, in a modeling laboratory such as this, different components can be modified systematically to isolate influences, which is impossible to do empirically. It also highlights the potential value of this type of modeling laboratory for increasing our fundamental understanding of the system dynamics and of how evacuations may change as the population grows.
Supplementary Table 9. Evacuation behaviors by impact zones when making the population uniform across the grid. Successful evacuation rates are broken down into impact zones (coastal vs. inland, and areas experiencing vs. not experiencing hurricane force winds of 64+ kts), compliance rates (i.e., those instructed to evacuate via evacuation order who did evacuate), shadow evacuation rates (i.e., percentage of people not instructed to evacuate who did), and the percentage of evacuees who attempted to evacuate but "gave up" due to excessive amounts of traffic.
Figure 7. Also shown are the swath of hurricane force winds (dotted cells), evacuation orders (red cells), and the population by grid cell (c), which provide a frame of reference, e.g., major cities depicted include Miami-Ft. Lauderdale (yellow star), Tampa Bay-St. Petersburg (blue star), Jacksonville (green star), and Orlando (orange star). Note that run-to-run variability due to stochastic elements in the model ranges from 0-2% in grid cells for both evacuation rates and percent giving up due to traffic. Therefore, values of -2 to 2 lie within that variability and should be ignored.