A database for evaluating the InMAP, APEEP, and EASIUR reduced complexity air-quality modeling tools

Policy analysts and researchers often use models to translate expected emissions changes from pollution control policies to estimates of air pollution changes and resulting changes in health impacts. These models can include both photochemical Eulerian grid models or reduced complexity models; these latter models make simplifying assumptions about the emissions-to-air quality relationship as a means of reducing the computational time needed to simulate air quality. This manuscript presents a new database of photochemical- and reduced complexity-modelled changes in annual average particulate matter with aerodynamic diameter less than 2.5 μm and associated health effects and economic values for five case studies representing different emissions control scenarios. The research community is developing an increasing number of reduced complexity models as lower-cost and more expeditious alternatives to full form Eulerian photochemical grid models such as the Comprehensive Air-Quality Model with eXtensions (CAMx) and the Community Multiscale Air Quality (CMAQ) model. A comprehensive evaluation of reduced complexity models can demonstrate the extent to which these tools capture complex chemical and physical processes when representing emission control options. Systematically comparing reduced complexity model predictions to benchmarks from photochemical grid models requires a consistent set of input parameters across all systems. Developing such inputs is resource intensive and consequently the data that we have developed and shared (https://github.com/epa-kpc/RFMEVAL) provide a valuable resource for others to evaluate reduced complexity models. The dataset includes inputs and outputs representing 5 emission control scenarios, including sector-based regulatory policy scenarios focused on on-road mobile sources and electrical generating units (EGUs) as well as hypothetical across-the-board reductions to emissions from cement kilns, refineries, and pulp and paper facilities. Model inputs, outputs, and run control files are provided for the Air Pollution Emission Experiments and Policy Analysis (APEEP) version 2 and 3, Intervention Model for Air Pollution (InMAP), Estimating Air pollution Social Impact Using Regression (EASIUR), and EPA's source apportionment benefit-per-ton reduced complexity models. For comparison, photochemical grid model annual average PM2.5 output is provided for each emission scenario. Further, inputs are also provided for the Environmental Benefits and Mapping Community Edition (BenMAP-CE) tool to generate county level health benefits and monetized health damages along with output files for benchmarking and intercomparison. Monetized health impacts are also provided from EASIUR and APEEP which can provide these outside the BenMAP-CE framework. The database will allow researchers to more easily compare reduced complexity model predictions against photochemical grid model predictions.


a b s t r a c t
Policy analysts and researchers often use models to translate expected emissions changes from pollution control policies to estimates of air pollution changes and resulting changes in health impacts. These models can include both photochemical Eulerian grid models or reduced complexity models; these latter models make simplifying assumptions about the emissions-to-air quality relationship as a means of reducing the computational time needed to simulate air quality. This manuscript presents a new database of photochemicaland reduced complexity-modelled changes in annual average particulate matter with aerodynamic diameter less than 2.5 mm and associated health effects and economic values for five case studies representing different emissions control scenarios. The research community is developing an increasing number of reduced complexity models as lower-cost and more expeditious alternatives to full form Eulerian photochemical grid models such as the Comprehensive Air-Quality Model with eXtensions (CAMx) and the Community Multiscale Air Quality (CMAQ) model. A comprehensive evaluation of reduced complexity models can demonstrate the extent to which these tools capture complex chemical and physical processes when representing emission control options. Systematically comparing reduced complexity model predictions to benchmarks from photochemical grid models requires a consistent set of input parameters across all systems. Developing such inputs is resource intensive and consequently the data that we have developed and shared (https://github.com/epa-kpc/RFMEVAL) provide a valuable resource for others to evaluate reduced complexity models. The dataset includes inputs and outputs representing 5 emission control scenarios, including sector-based regulatory policy scenarios focused on on-road mobile sources and electrical generating units (EGUs) as well as hypothetical across-the-board reductions to emissions from cement kilns, refineries, and pulp and paper facilities. Model inputs, outputs, and run control files are provided for the Air Pollution Emission Experiments and Policy Analysis (APEEP) version 2 and 3, Intervention Model for Air Pollution (InMAP), Estimating Air pollution Social Impact Using Regression (EASIUR), and EPA's source apportionment benefit-per-ton reduced complexity models. For comparison, photochemical grid model annual average PM 2.5 output is provided for each emission scenario. Further, inputs are also provided for the Environmental Benefits and Mapping Community Edition (BenMAP-CE) tool to generate county level health benefits and monetized health damages along with output files for benchmarking and intercomparison. Monetized health impacts are also provided from EASIUR and APEEP which can provide these outside the BenMAP-CE framework. The database will allow researchers to more easily compare reduced complexity model predictions against photochemical grid model predictions.  Table   Subject Atmospheric Science Specific subject area Regional scale air quality modeling of chemically speciated particulate matter Type of data

Value of the Data
The dataset provided in this article will make it easier for researchers to compare multiple reduced complexity models against full-scale photochemical models using consistent inputs.
The dataset includes all necessary inputs (i.e., emission changes, meteorological data, and atmospheric chemistry) needed to run each reduced complexity tool. This information can be used to replicate an existing evaluation and evaluate newer versions of these tools.

Data description
Regulatory assessments and research applications often use models to translate expected emissions changes from pollution control policies to estimates of air pollution changes and resulting changes in health impacts. Two approaches are typically used to simulate primary emitted and secondarily formed PM 2.5 in the atmosphere: "full form" photochemical modeling and reduced complexity modeling. Full form photochemical modeling captures the complexities of environmental processes (e.g., atmospheric chemical reactions, gas-particle partitioning, dispersion of pollutants, and deposition to surfaces) by including detailed representations of each mechanism in the atmosphere to quantify the relationship between emissions and ambient concentrations. Local to regional scale dispersion of emissions is affected by many factors including emissions release characteristics (e.g., height above ground) as well as local topography and meteorological variables such as temperature and wind speed. In contrast, reduced complexity models use various methods to approximate estimates from full-scale photochemical models without explicitly representing the atmospheric chemical and physical processes that impact pollutant fate and transport.
This dataset includes inputs and outputs representing 5 emission control scenarios for multiple reduced complexity and full form models. Each emissions control scenario includes a projected future year reference and control scenario set of emissions. Inputs for each of the modeling systems were developed where possible with the same domain structure (e.g., grid cell size, domain extent, and vertical structure), reference emissions, emissions changes, and meteorology to facilitate comparison. In some situations, certain model formulations precluded implementation of consistency for certain inputs and those are noted in the following sections.
Here, a dataset (https://github.com/epa-kpc/RFMEVAL) is provided to help researchers perform systematic comparison of multiple reduced complexity models. The dataset includes consistent emissions for 5 different emissions control scenarios. Inputs were developed for 4 reduced complexity tools including the Intervention Model for Air Pollution (InMAP) [1], Air Pollution Emission Experiments and Policy Analysis (APEEP) versions 2 and 3 [2], Estimating Air pollution Social Impact Using Regression (EASIUR) [3], and EPA's source apportionment benefitper-ton tool (SA-BPT) [4] (Table 1). Input files (emissions and where possible meteorology), output files (estimated changes in PM 2.5 concentrations and monetized health impacts), and necessary application files (run control and code) are provided as part of this database to facilitate model comparison for current and future versions of these tools. For comparison, photochemical grid model annual average speciated PM 2.5 output is provided for each emissions control scenario. Further, the input and output files for the Environmental Benefits and Mapping Community Edition (BenMAP-CE; https://www.epa.gov/benmap) system are also provided to allow for estimation of the monetized health impacts associated with each of these emission scenarios. Photochemical grid model output is from the Community Multiscale Air Quality (CMAQ) modeling system (https://www.epa.gov/cmaq) or the Comprehensive Air Quality Model with Extensions (CAMx; www.camx.com).
The case studies simulate policies affecting emissions from various sources and sectors (e.g., power plants and onroad vehicles) and can provide a basis for evaluating the degree to which reduced complexity models represent air quality impacts and health outcomes resulting from emissions changes for a range of policy-relevant scenarios. There are five separate policy scenarios representing emissions reductions from various sources and sectors: electrical generating units (EGUs), onroad vehicles, cement kilns, refineries and pulp and paper facilities ( Table 2). The national total changes in emissions for each of the case studies are provided in Table 2 and shown spatially for the onroad scenario in Fig. 1, EGU scenario in Fig. 2, and each of the industrial scenarios in Figs. 3e5. Table 3 provides a schematic showing how the model output species were mapped for comparability to make the comparison of total and chemically speciated components of particulate matter most consistent across modeling systems. Table 1 Overview of the input and outputs for the reduced complexity models and photochemical models provided in this database.

Model
Emissions

Reduced complexity model application
For two of the reduced complexity tools (InMAP and EASIUR), run control files were constructed to clearly indicate what model options were selected so other users can reproduce the model predictions. InMAP and EASIUR have run control files that identify relevant input files (i.e., scenario-specific emissions) and the location and naming convention for scenario-specific output files. As distributed, APEEP does not include a run control file or standardized set of input or output files, only MATLAB files that users must modify to include emission scenario-specific information. A list of MATLAB subroutines, and the subroutines themselves, are provided so users understand the sequence of subroutine execution. Some of the APEEP code was modified to direct the modeling system to use particular emission input files with scenario-specific information and to automatically generate output files of the predicted air quality surface. The code also needed to be modified to reflect scenario-specific information (e.g., value of statistical life).

Emission scenarios
The 2014 Tier 3 Motor Vehicle Emission and Fuel Standards Final Rule (Tier 3) was selected as an onroad vehicles sector policy scenario [5]. The Tier 3 fuel and vehicle standards directly reduce emissions of NO X , volatile organic compounds (VOCs), PM 2.5 , and SO 2 . The emission inventories used include a 2030 future reference case (i.e. emissions representing a 2030 future year without any Tier 3 regulation) and a 2030 control case (i.e. emissions representing a 2030 future year with emissions expected under the Tier 3 regulation). The national total emissions reductions, between the 2030 reference (2030rg_ref_v5_07e) and control (2030rg_ctl_v5_07e) cases are provided in Table 2 and spatially in Fig. 1.
One control option from the 2015 Clean Power Plan (CPP) proposal [6] was selected (option 1S) for an EGU policy scenario. This CPP proposal was intended to implement greenhouse gas emission guidelines for existing fossil fuel fired EGUs with the goal of reducing carbon dioxide (CO 2 ) emissions. Implementing the proposed CO 2 emission guidelines was predicted to have ancillary emission reductions (i.e., co-benefits) of sulfur dioxide (SO 2 ), nitrogen oxides (NO X ), and directly emitted PM 2.5 , which would lead to lower ambient concentrations of PM 2.5 . The emission inventories include a 2025 future reference case (i.e. emissions representing a 2025 future year without any CPP regulation) and a 2025 control case (i.e. emissions representing a 2025 future year with emissions characteristic of the CPP option 1S scenario). The national total emissions changes between the projected future reference scenario (2025ef_v6_11g) and the future control scenario (2025ef_ghg-1S_v6_11g) are shown in Table  2 and spatially in Fig. 2.
Multiple industrial sector case studies were developed focused on sectors with unique geographic distributions of facilities: cement kilns, refineries, and pulp and paper facilities (Figs. 3e5). The Control Strategy Tool (CoST) program (https://www.epa.gov/economic-and-cost-analysis-air-pollutionregulations/cost-analysis-modelstools-air-pollution#control strategy tool) was applied to a 2025 future reference case for each sector and pollutant with a maximum emissions reduction algorithm to find the control technology option providing the maximum emissions reduction regardless of cost. The resulting relative change in emissions for each sector were aggregated nationally by the relevant North American Industry Classification System code.
For each of the hypothetical industrial sector policy scenarios, we applied a specific percentage of precursor emission reductions to all facilities in that sector for NO X , SO 2 and PM 2.5 . The emissions inventories used the same 2025 future reference case as the CPP policy scenario (2025ef_v6_11g). For cement kilns, there was a respective 40%, 50% and 40% reduction of NO X , SO 2 and primary PM 2.5 applied to each source in the country. For refineries, there was a respective 40%, 15% and 15% reduction of NO X , SO 2 and primary PM 2.5 applied to each source. For pulp and paper facilities, there was a respective 20%, 35% and 25% reduction of NO X , SO 2 and primary PM 2.5 applied to each source. The industrial sectors national total emissions reductions are listed in Table 2.

Photochemical modeling benchmarks
CAMx version 6.10 was used to conduct the full-form air quality modeling for CPP proposal and the industrial sector policy scenarios and CMAQ version 4.7 was applied for the Tier 3 scenario. Both CMAQ and CAMx were applied with hourly emissions inputs for VOC, SO X , NO X , ammonia (NH 3 ), and primary PM 2.5 . Both CMAQ and CAMx were applied with gridded low-level emissions, location-specific elevated point emissions sources, and gridded meteorology input files. Emissions and meteorological inputs to CMAQ are based on netCDF file format and CAMx are a structured binary format. Hourly 2007 meteorological inputs are provided for CMAQ for the Tier 3 related emissions and hourly 2011 meteorological inputs are provided for CAMx for CPP proposal and the industrial sector scenarios.

Emissions input files
The user-specified emissions input required for running InMAP is a shapefile or set of shapefiles containing annual total emissions of VOCs, SO 2 , NO X , VOC, NH 3 , and primary PM 2.5 (not chemically speciated). Photochemical model emissions inputs files were converted to shapefile format for use in InMAP. Shapefiles were created for gridded annual total non-point emissions and separate shapefiles with location specific annual point source emissions. All emissions are in tons per year. Separate shapefiles of gridded 2D emissions were created that include 1) anthropogenic emissions and 2) APEEP (version 2 and 3) uses annual county total emissions as input data. EPA calculated county level annual total emissions of NH 3 , NO X , SO 2 , primarily emitted PM 2.5 (not chemically speciated), VOC (not chemically speciated) from anthropogenic sources, and VOC (not chemically speciated) from biogenic sources for each scenario. Emissions files are provided as text-format comma delimited files with emission rates for each U.S. county included in the APEEP source-receptor matrix. No emissions were included from Canada, Mexico, or offshore locations, as the APEEP model does not have relevant source-receptor relationships. All emissions were in units of tons per year. Separate files were generated for each policy scenario including emissions by varying release height: 1) "ground" level (all non-point) emissions, 2) "low" level, or point sources with effective stack height less than 200 m, 3) "medium" level, or point sources with effective stack height between 200 m and 500 m, 4) "tall" level, or point sources with effective stack height greater than 500 m, and 5) "new tall", or point sources with effective stack height greater than 500 m that were not part of the original source-receptor matrix and added later. Emissions for "new tall" point sources were included as multiple sources per county that added up to the county total. Not all U.S. counties are represented in APEEP's "tall" stack source-receptor matrix. In these situations, "tall" stack emissions were put into the "medium" stack emissions source-receptor matrix so they would be represented in the model simulation.
EASIUR emissions input files were generated for each of the scenarios matching the EASIUR 36 km grid cell resolution domain covering the contiguous U.S. and then converted to ascii text format (comma delimited files). Each file contains gridded annual emissions of NH 3 , NO X , SO 2 , and primarily emitted PM 2.5 (not chemically speciated). All grid cells that are part of the EASIUR 36 km domain were included in each file. Where a grid cell did not contain emissions, a 0 value was assigned to each species so that each grid cell has a record. Separate files for each scenario are provided by varying emission release height: 1) gridded (all non-point) emissions, 2) point sources with actual stack height less than 150 m, 3) point sources with actual stack height between 150 and 300 m, and 4) point sources with actual stack height greater than 300 m.

Meteorological and chemical input files
InMAP requires a single netCDF format input file containing 3D annual average meteorology, air quality, and deposition information. This input file includes spatially explicit annual averages of wind vectors, eddy diffusivity and convective transport coefficients (annual average coefficients calculated using temporally explicit wind speed, temperature, pressure, friction velocity, boundary layer height, and heat flux information), dry and wet deposition rates of various pollutants (annual average rates calculated using temporally explicit wind speed, land cover, stability, and precipitation information), gas/particle phase partitioning for pollutants, and parameters relevant to the calculation of emissions plume rise (annual averages of scalar windspeed; temperature; and two parameters related to atmospheric stability).
InMAP is distributed with a netCDF input file for optional use that has chemical and meteorological parameters based on values derived from a simulation using the WRF-Chem Eulerian model [7] applied with emissions from the 2005 National Emissions Inventory (NEI [8]). Alternatively, users can use annual meteorological and photochemical model simulations to develop their own annual average meteorology/chemistry/deposition input file. The input filed provided in this dataset was generated using output from 2007 WRF and CMAQ simulations for the onroad mobile emissions scenario and from 2011 WRF and CAMx simulations for the EGU and industrial sector emissions scenarios using the conversion utility distributed with InMAP (https://godoc.org/github.com/spatialmodel/inmap/ inmaputil#ConfigData.Preproc). The 2007 WRF/CMAQ and 2011 WRF/CAMx outputs were obtained from previously available model simulations that are described in Refs. [5,9] respectively. The conversion utility was updated to work with WRF and CMAQ/CAMx since the distributed version only had compatibility with WRF-Chem and GEOS-CHEM output.
APEEP, EASIUR, and SA-BPT do not accept user-supplied meteorological input files although the formulation for these models was developed using meteorological parameters. EASIUR and SA-BPT were both parameterized based on model simulations that used 2005 meteorology [3,4]. APEEP contains source-receptor matrices for the formation and transport of particulate matter to produce annual means which was generated by the Gaussian model using climatological meteorology [2,10].

Air quality model output
Annual average PM 2.5 surfaces output for each of the emissions scenarios by each of the reduced complexity and photochemical grid models are provided to allow for inter-comparison and benchmarking to ensure model inputs were correctly applied. Files are available for each modeling system and each emission scenario. The APEEP model directly outputs county level total PM 2.5 . For InMAP, the 12 km gridded total PM 2.5 was estimated by summing PM 2.5 chemical components: nitrate, sulfate, ammonium, primary PM 2.5 , and secondary organic aerosol. Full-scale hourly photochemical model PM 2.5 chemical component output was aggregated to annual average. Each reduced complexity model predicts annual average PM 2.5 so no temporal aggregation was necessary. InMAP output are provided as shapefiles, APEEP as comma-delimited text files, and the annual aggregated photochemical model output as netCDF based files.

BenMAP-CE
Model predicted annual PM 2.5 was converted for input to BenMAP-CE [11]. BenMAP input files were generated for each emissions scenario and model. Following the approach typically used in past benefits assessments, photochemical model estimated annual PM 2.5 was adjusted with ambient speciated PM 2.5 measurements from routine surface monitor networks using a statistical technique part of EPA's Software for Model Attainment Test-Community Edition to minimize areas of extreme over or under prediction tendency [12,13]. Table 3 shows how precursor emissions relate to raw model output species and adjustments made to those species for input to BenMAP to estimate monetized health benefits associated with specific precursors. InMAP and photochemical model output were converted to the comma delimited format required for input to BenMAP with annual PM 2.5 gridded to match the 12 km sized grid cell model domain. The APEEP inputs to BenMAP are county specific rather than gridded. BenMAP input and output files for each of the modeling systems part of this analysis and each of the emissions scenarios are provided as part of this database. All BenMAP outputs are text files with county specific information. Table 3 Mapping model precursor emissions to model output and adjustments to modelled output for input to BenMAP. The empirical equation used to estimate particle bound water based on sulfate, nitrate, and ammonium concentrations is provided elsewhere [13].

Model Emissions Raw model output species
Input to BenMAP

Estimating monetized health benefits for SA-BPT and EASIUR
The BenMAP tool estimates monetized health damages associated with PM 2.5 changes in each county. EASIUR and SA-BPT provide this information as a look-up table rather than by producing intermediate air quality estimates. SA-BPT estimates nationally aggregated monetized health benefits by multiplying a change in PM 2.5 precursor emissions by pre-computed marginal benefits estimated for specific sectors. Relevant SA-BPT sectors used here include onroad mobile, EGUs, cement kilns, refineries, and pulp and paper facilities [4]. The sector specific benefit-per-ton values used for this assessment were adjusted to reflect a value of statistical life (VSL) projected to 2015. Nationally aggregated emissions changes by precursor are provided for use with the SA-BPT in Table 2.
A program was developed to match the EASIUR grid-cell, stack height (surface or elevated release), and precursor emissions specific monetized health damage estimates with scenario-specific emissions information. The program adjusted the aggregated monetized health damages estimated by EASIUR to reflect a 2015 VSL.