A Data-Intensive Approach to Allocating Owner vs. NFIP Portions of Average Annual Flood Losses

Accurate loss assessment plays a vital role in understanding the economic risk of natural hazards for planning, mitigation, and actuarial purposes. Because flooding is the most widespread and costly natural hazard, both nationally and around the world, flood loss assessment is particularly important. One shortcoming of existing flood loss models is that they do not partition the economic loss to the structure (or building) into the portion borne by the homeowner and the portion covered by flood insurance. The goal of this research is to model the loss incurred by the homeowner and that incurred by the National Flood Insurance Program (NFIP), considering flood damage, building replacement value, flood insurance coverage amount, deductible, and flood characteristics (slope and y-intercept of the loss vs. return period curve). A Monte Carlo approach is used to calculate the average annual loss (AAL) due to flood at the individual homeowner scale. Multiple linear regression (MLR) and Classification and Regression Tree (CART) models are trained to predict the owner's share of the loss. The CART model outperformed the MLR model, with lower RMSE and MSE values and a higher R² value (0.95) on the test data set. Because out-of-pocket expenses due to flood can be devastating to financial security, the results of this study support and inform the proactive decision-making process that homeowners can use to self-assess their degree of preparation for, and vulnerability to, the flood hazard.


Introduction
Flooding is the costliest natural hazard globally and nationally, in terms of loss of life and property, with impacts felt disproportionately by the economically disadvantaged.
Quantification of total flood losses is important for monitoring and mitigating the flood hazard across space and time. In light of the ever-increasing value of property at risk, policymakers are increasingly adopting the approach of integrated flood risk management, which includes the engagement of households in flood insurance and structural flood protection measures at the micro-level. Existing research tends to emphasize quantification of total loss rather than the direct economic impact on owners; it does not explicitly identify the financial contribution of the National Flood Insurance Program (NFIP) relative to that covered by the homeowner, at the individual flood-insured residence scale.
This study partitions the total modeled residential building flood damage between the out-of-pocket expense to the owner and the portion covered by the NFIP. Input variables include the Gumbel extreme-value distribution parameters (i.e., the slope (α) and y-intercept (u) of the flood depth as a function of return period), first-floor elevation above the adjacent ground surface, building value (i.e., the product of construction cost and livable space), and flood insurance coverage and deductible. Then, Monte Carlo simulation is performed to partition the average annual loss (AAL) between that borne by the owner and that reimbursed by the NFIP. The contribution of this paper is to help the individual homeowner make a more informed decision regarding the purchase of flood insurance coverage and the associated deductible. Results from this work can be incorporated into web tools or other education/outreach material made available to the general public and to realtors, homebuilders, and community leaders.
This research takes the next step forward by filling the research gap regarding the allocation of owner and NFIP shares of residential flood loss. Thus, these results will be of interest to homeowners, insurance companies, and lending institutions.

Input Parameters
The parameters used to run the Monte Carlo simulation that models the average annual loss (AAL) are described below:

Flood Hazard Parameters
Gumbel parameters (α, u) vary such that 0<α<10 and -20<u<20, with α increasing in increments of 2 feet and u increasing in increments of 4 feet. These ranges of α and u are selected because they allow the 100-year flood depth to range from -20 to 66 feet, values that encompass the entire range that could be experienced in the U.S.A.
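The parameter grid described above can be enumerated directly. The sketch below is illustrative only; it assumes the endpoints of both ranges are included in the grid (an assumption, chosen because it reproduces the stated -20 to 66 foot span) and that the 100-year depth follows the Gumbel quantile relationship u - α * ln{-ln[1 - (1/100)]}. The names `alphas`, `us`, and `depth_100yr` are hypothetical.

```python
import math
from itertools import product

# Assumed grid (feet): alpha from 0 to 10 in steps of 2, u from -20 to 20 in steps of 4.
alphas = range(0, 11, 2)
us = range(-20, 21, 4)

def depth_100yr(alpha, u):
    """100-year flood depth (ft) from the Gumbel quantile relationship."""
    return u - alpha * math.log(-math.log(1.0 - 1.0 / 100.0))

# One 100-year depth per (alpha, u) combination.
depths = [depth_100yr(a, u) for a, u in product(alphas, us)]
print(len(depths), min(depths), round(max(depths), 2))
```

Under these assumptions the grid has 66 combinations, and the smallest and largest 100-year depths fall at the two corners of the grid (α = 0, u = -20 and α = 10, u = 20).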

First-Floor Elevation
The first-floor elevation (FFE; i.e., above the adjacent surface) is calculated based on the assumption that it could range from 0 feet (i.e., at the base flood elevation, BFE) to a height of 3.5 feet above the BFE. This additional elevation, known as freeboard, is computed at increments of 0.5 feet.
For each of the α and u combinations, the BFE is computed based on the 100-year flood depth, which is typically used as the BFE across the U.S.A. (BFE = u - α * ln{-ln[1 - (1/100)]}).
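As a minimal sketch of this step, the candidate first-floor elevations for one (α, u) pair can be listed by adding each freeboard increment to the 100-year depth. The function names and the example values of α and u are hypothetical and only illustrate the arithmetic.

```python
import math

def gumbel_depth(alpha, u, return_period):
    """Flood depth (ft) at a given return period under a Gumbel distribution."""
    return u - alpha * math.log(-math.log(1.0 - 1.0 / return_period))

def ffe_options(alpha, u, max_freeboard=3.5, step=0.5):
    """First-floor elevations from the BFE up to BFE + max_freeboard (ft)."""
    bfe = gumbel_depth(alpha, u, 100)          # BFE taken as the 100-year depth
    n_steps = int(round(max_freeboard / step))
    return [bfe + k * step for k in range(n_steps + 1)]

# Hypothetical example: alpha = 2 ft, u = 4 ft.
options = ffe_options(alpha=2.0, u=4.0)
print([round(x, 2) for x in options])
```

With a 0.5-foot step and a 3.5-foot maximum, each (α, u) pair yields eight freeboard levels, the first of which sits exactly at the BFE.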

Building Value
Building value (i.e., replacement value) is defined as the product of the livable area and the per unit construction/repair cost of the structure.

Summary & Conclusion
This study allows for the development of an interactive decision-making tool in which the user could input the building value, flood hazard parameters, first-floor elevation, amount of flood insurance coverage, and flood insurance deductible for new construction. The output would then provide an estimate of the amount that the user should be prepared to incur in out-of-pocket expenses, based on the flood probabilities for the home in question and the insurance parameters provided.
The tool would support a more thoughtful selection of flood insurance coverage for the homeowner, to optimize expenses (either through elevating the home, increasing insurance coverage, or both) in consideration of the local hazard. Similar benefits would be available to community planners and leaders as the results are upscaled from the individual to the community level, as they consider options to enact policy and/or infrastructure to mitigate the hazard.
It should be noted that only the direct losses to the structure (i.e., home), such as the removal and replacement of flooring and drywall, are considered in this analysis. Losses to building contents, such as furniture, vehicles, clothing, and items of sentimental value, and indirect losses, such as time unemployed and hotel expenses incurred during renovation, are not considered. Furthermore, for convenience and ease of comparison across parameters and structures, all costs and payouts are expressed in terms of average annual values, even though it is recognized that the costs will actually be zero in most years and much greater than the mean value following a flood. In future work, the methodology can be expanded to include some or all of these indirect losses, if the relevant loss functions are known. Likewise, content loss (i.e., monetary, but not sentimental, losses to the goods within the structure) can be included in future research if loss functions are known.

Data Generation
A data set is generated using the Monte Carlo method for all combinations of the input parameters. A small sample of the owner share, NFIP share, and ratio of owner share to the total, for a few combinations of input parameters, is shown in Table 1.
A test data set is also generated using the parameter values that are not present in the training/validation data set. The data set is then tested to determine the extent to which

Regression Analysis
The generated data set is used to train regression models that can predict the ratio of owner loss to total AAL with reasonable accuracy. The scikit-learn Python library is used to preprocess the data and train the models. The data set is divided into training (80 percent of the observations) and validation (20 percent) components using the "train_test_split" function in the "sklearn.model_selection" module. The MinMax scaler is used to scale the variables. The "GridSearchCV" class in the "sklearn.model_selection" module is used to tune the models' hyperparameters. The test data set is used to check the performance of the regression models.
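The workflow described above might look roughly like the following sketch. The data here are synthetic stand-ins (six random inputs and an arbitrary target), and the hyperparameter grid, scoring choice, and random seeds are assumptions made for illustration; none of them are taken from the study.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic placeholder data: 6 input columns, target standing in for the owner ratio.
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 6))
y = np.clip(0.5 * X[:, 0] + 0.3 * X[:, 1] ** 2 + 0.05 * rng.normal(size=500), 0, 1)

# 80/20 split into training and validation components.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Multiple linear regression on min-max scaled inputs.
mlr = make_pipeline(MinMaxScaler(), LinearRegression()).fit(X_train, y_train)

# CART with a small, hypothetical hyperparameter grid tuned by cross-validation.
cart_search = GridSearchCV(
    make_pipeline(MinMaxScaler(), DecisionTreeRegressor(random_state=0)),
    param_grid={
        "decisiontreeregressor__max_depth": [3, 5, 8, None],
        "decisiontreeregressor__min_samples_leaf": [1, 5, 20],
    },
    scoring="neg_root_mean_squared_error",
    cv=5,
)
cart_search.fit(X_train, y_train)

# R^2 on the validation component for both models.
print(mlr.score(X_val, y_val), cart_search.best_estimator_.score(X_val, y_val))
```

Placing the scaler inside the pipeline means it is refit on each cross-validation fold, so the validation folds do not influence the scaling.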

Decision Rules
Four decision rules are set up to allocate the AAL between the owner and the NFIP. Specifically, 1) if the AAL exceeds the deductible but is less than the insurance coverage, then the owner's portion of the loss is the deductible; 2) if the AAL does not exceed the deductible, then the owner bears the entire loss and the NFIP's share is zero; 3) as long as the AAL is less than the coverage, the NFIP's portion is the difference between the AAL and the owner's portion; 4) if the AAL exceeds the insurance coverage, then the owner's portion of the loss is equal to the difference between the AAL and the coverage, plus the deductible, and the NFIP's share of the loss is the coverage minus the deductible (or, equivalently, the AAL minus the owner's share).
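The four rules can be written as one small function applied to a single loss amount. This is a sketch under two stated assumptions: the deductible is smaller than the coverage, and a loss exactly equal to the coverage is handled by the first branch (the fourth rule gives the same result at that point). The function name and dollar figures are hypothetical.

```python
def partition_loss(loss, coverage, deductible):
    """Return (owner_share, nfip_share) of one loss, assuming deductible < coverage."""
    if loss <= deductible:
        # Rule 2: the loss does not exceed the deductible, so the owner bears it all.
        owner = loss
    elif loss <= coverage:
        # Rules 1 and 3: the owner pays the deductible, the NFIP pays the remainder.
        owner = deductible
    else:
        # Rule 4: the owner pays the deductible plus everything above the coverage.
        owner = (loss - coverage) + deductible
    return owner, loss - owner

# Hypothetical policy: $250,000 coverage, $2,000 deductible.
for loss in (1_000, 50_000, 300_000):
    print(loss, partition_loss(loss, 250_000, 2_000))
```

In the last case the NFIP share equals the coverage minus the deductible, which matches the alternative expression given in the fourth rule.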

Monte Carlo Simulation
Monte Carlo simulation of 50,000 flood events (i, where i increases iteratively from 1 to 50,000) is then run for each combination of α, u, first-floor elevation (FFE), building value, coverage, and deductible. For each of the i runs, the simulation draws a random value between 0 and 1 to generate an exceedance probability, P(i), such that P(i) = random(0,1). Then, the flood elevation for each simulated event is calculated using α, u, and P(i), following the same Gumbel relationship used for the BFE: flood elevation(i) = u - α * ln{-ln[1 - P(i)]}. The flood elevation above the FFE is then calculated as the flood elevation minus the FFE. The function expressing the average annual loss to the structure follows the equations implemented by Gnan (2021), and the dollar value of the loss is calculated from it. At this point, the values for coverage and deductible in the scenario under consideration are input so that the loss can be partitioned into the cost borne by the owner and the cost assigned to the NFIP, as described by the four decision rules above. The values from these 50,000 runs are then averaged to calculate the AAL and the portions of the loss assigned to the owner and to the NFIP. The owner ratio is then calculated as the owner share divided by the total AAL.
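The simulation loop can be sketched as follows. The depth-damage curve used here is a hypothetical linear placeholder and is not the Gnan (2021) loss function, which is not reproduced in this text. The flood elevation is assumed to follow the Gumbel relationship u - α * ln{-ln[1 - P(i)]}, and all function names, input values, and the random seed are illustrative assumptions.

```python
import math
import random

def hypothetical_damage_fraction(depth_ft):
    """Placeholder curve (NOT Gnan 2021): 0% at the first floor, 100% at 10 ft."""
    return min(max(depth_ft / 10.0, 0.0), 1.0)

def partition(loss, coverage, deductible):
    """Owner and NFIP shares of one event loss, per the four decision rules."""
    if loss <= deductible:
        owner = loss
    elif loss <= coverage:
        owner = deductible
    else:
        owner = (loss - coverage) + deductible
    return owner, loss - owner

def simulate(alpha, u, ffe, building_value, coverage, deductible,
             n_events=50_000, seed=0):
    """Return (AAL, owner AAL, NFIP AAL, owner ratio) averaged over n_events."""
    rng = random.Random(seed)
    total = owner_total = 0.0
    for _ in range(n_events):
        # Exceedance probability, kept just inside (0, 1) so both logs are defined.
        p = rng.uniform(1e-12, 1.0 - 1e-12)
        flood_elev = u - alpha * math.log(-math.log(1.0 - p))
        loss = building_value * hypothetical_damage_fraction(flood_elev - ffe)
        owner, _nfip = partition(loss, coverage, deductible)
        total += loss
        owner_total += owner
    aal = total / n_events
    owner_aal = owner_total / n_events
    owner_ratio = owner_aal / aal if aal > 0 else float("nan")
    return aal, owner_aal, aal - owner_aal, owner_ratio

# Hypothetical scenario: alpha = 2 ft, u = 1 ft, FFE = 2 ft, $200,000 building.
aal, owner_aal, nfip_aal, owner_ratio = simulate(2.0, 1.0, 2.0, 200_000, 150_000, 2_000)
print(round(aal), round(owner_aal), round(nfip_aal), round(owner_ratio, 3))
```

Each event is partitioned before averaging, so the owner and NFIP averages always sum to the AAL, and a fixed seed makes a given scenario reproducible.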
A comparison of the model fit diagnostics for the validation and test data sets reveals that CART outperforms multiple linear regression (Table 2). Thus, no more complex models are tested, and CART is selected as the final model, with an R² value of 0.95 on the test data set.
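For reference, the fit diagnostics used in this comparison (MSE, RMSE, and R²) can be computed as in the sketch below. The observed and predicted values are made-up numbers chosen only to show the arithmetic; they are not results from the study, and the function name is hypothetical.

```python
import math

def fit_diagnostics(y_true, y_pred):
    """Return (MSE, RMSE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)           # total sum of squares
    mse = ss_res / n
    return mse, math.sqrt(mse), 1.0 - ss_res / ss_tot

# Made-up owner ratios, for illustration only.
observed = [0.2, 0.4, 0.6, 0.8]
predicted = [0.3, 0.4, 0.6, 0.7]
mse, rmse, r2 = fit_diagnostics(observed, predicted)
print(round(mse, 4), round(rmse, 4), round(r2, 2))
```

Lower MSE and RMSE indicate smaller prediction errors, while an R² closer to 1 indicates that more of the variance in the observed values is explained by the model.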