Dataset for evaluating WRF-Chem sensitivity to biogenic emission inventories in a tropical region. Global online model (MEGAN) vs local offline model (BIGA)

This article presents a dataset comparing emissions of Biogenic Volatile Organic Compounds (BVOC) in a zone of complex topography in the tropical Andes, where elevations range from 250 to more than 4000 m above sea level within a radius of only 50 km. Two approaches were evaluated: (1) online, with the Model of Emissions of Gases and Aerosols from Nature (MEGAN) coupled with the Weather Research and Forecasting model with Chemistry (WRF-Chem), and (2) offline, applying the Biogenic Altitudinal Gradient Model (BIGA). Modeled concentrations of pollutants (mainly isoprene and tropospheric ozone) were obtained with WRF-Chem using each of these biogenic emission models. Comparing the global emission inventory (MEGAN) with the local inventory (BIGA) identified areas where BVOC emissions differ significantly. The emission factors and land cover assigned to those areas in global online biogenic models should be re-evaluated to reduce the uncertainty in the estimates. In addition, the dataset shows the impact of the biogenic emission inventories on air quality simulations in a tropical high mountain area, where vegetation is diverse and altitudinal changes influence meteorological variables.



Value of the Data
• This dataset is useful for identifying areas in the Andes region where biogenic VOC (BVOC) emissions differ most between a global emission model and a local emission inventory. The results point to areas that need further study to determine adequate emission factors or land cover classifications and so reduce uncertainty in the estimates.
• The dataset provides insights into the influence of BVOC fluxes on the atmospheric chemistry of a region of the tropical Andes, through the application of a regional air quality model (WRF-Chem) with the two biogenic emission models (MEGAN and BIGA).
• The dataset can be used to identify areas with critical levels of pollution in the studied area, supporting air quality assessment.
• The dataset can be used as a reference for future air quality simulations in areas with dense and diverse vegetation and high climatic variability caused by complex orography.
• The dataset contains the files needed to build local emission inventories in other regions using BIGA, as well as to run the WRF-Chem simulations.
• A modified version of the WRF-Chem module "module_cbmz_addemiss.F" is included in the dataset. The module was adjusted to include biogenic emissions (isoprene) from the offline biogenic model along with the anthropogenic emission inventory.

Data Description
The dataset made available through the Mendeley repository (see Specification Table) has three folders. The folder WRF-Chem_Outputs contains two NetCDF files. Each file presents gridded hourly mean values of the air quality and meteorological variables specified in Table 1, for the period of analysis (June 3, 00:00 to July 1, 00:00 UTC 2018). In particular, the file CALDAS_WRF_37_MEGAN.nc contains the simulated values when the MEGAN model is used to estimate biogenic emissions, while the file CALDAS_WRF_37_BIGA.nc contains the modeled values when the BIGA model is used to supply biogenic emissions.
The folder named BIGA_Inputs contains the information needed to replicate the local biogenic emission inventory. This includes a digital elevation model (DEM) of the area in GeoTIFF format, a land cover and use map (LCU) in GeoTIFF format, and several CSV files containing hourly records of temperature and solar radiation for the meteorological stations specified in Table 2. BIGA source code and tutorials can be downloaded at http://idea.manizales.unal.edu.co/gta/ingenieria_hidraulica/BIGA/index.php.
The folder WRF-Chem_Inputs contains the namelist.wps and namelist.input files used to run WRF-Chem. In addition, two NetCDF files containing the local anthropogenic and biogenic emission inventories are provided. Note that for simulations using MEGAN, only the file wrfchemi_anthropogenic.nc is used, as biogenic emissions are estimated online by WRF-Chem. When the local biogenic emission inventory is used instead, the anthropogenic and biogenic emission files need to be added together. This can be done with an NCO operator using the following command: "ncbo --op_typ=add wrfchemi_anthropogenic.nc wrfchemi_biogenic.nc wrfchemi_total.nc".
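The merge step can also be scripted. The sketch below only assembles the ncbo call for the three file names given in this section, written with NCO's long option --op_typ=add, and runs it if NCO is installed. The function names and the check for ncbo on the PATH are illustrative assumptions and are not part of the dataset.

```python
import shutil
import subprocess


def build_ncbo_add_command(anthropogenic, biogenic, total):
    """Return the ncbo argument list that adds same-named variables of two files."""
    return ["ncbo", "--op_typ=add", anthropogenic, biogenic, total]


def merge_emissions(anthropogenic, biogenic, total):
    """Run ncbo if it is available; return True when the merge was executed."""
    if shutil.which("ncbo") is None:
        # NCO is not installed: nothing is run and the caller decides what to do.
        return False
    subprocess.run(build_ncbo_add_command(anthropogenic, biogenic, total), check=True)
    return True


command = build_ncbo_add_command(
    "wrfchemi_anthropogenic.nc", "wrfchemi_biogenic.nc", "wrfchemi_total.nc"
)
print(" ".join(command))
```

Calling merge_emissions with the same three file names would write wrfchemi_total.nc, which is then the emission file supplied to WRF-Chem for the BIGA run.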
The figures (Figs. 1-7) and videos (Videos 1-3) summarize the main differences in isoprene emissions between the two estimation methods (MEGAN and BIGA), and the impact of these differences on surface-level isoprene and ozone concentrations according to the WRF-Chem outputs. Finally, Table 3 presents statistical performance metrics evaluating O3 forecasting accuracy with both biogenic models. The evaluation was performed against hourly ground measurements of O3 concentration obtained inside the urban area of the city (Lat: 5.06848, Lon: −75.51709). These data suggest that emission factors and land use assignment in the region need further assessment in order to reduce the uncertainty of the BVOC emissions and, consequently, improve the accuracy of air quality simulations.

Experimental Design, Materials and Methods
The WRF-Chem model version 3.7.1 was used to simulate air quality in a region of the tropical Andes (Extent: −75.8954, −74.9054, 4.6229, 5.5229) for a period of 28 days in 2018 (June 3 to July 1). The area is characterized by dense and diverse vegetation and by high climatic variability related to the drastic altitudinal changes (terrain elevations varying from 250 to more than 4000 m.a.s.l.) [2]. Average temperatures inside the simulation domain range from 28 °C in the lower elevation areas to −3 °C on the higher mountain peaks. Two simulations were made to test the sensitivity of the model to the biogenic emission inventories (MEGAN and BIGA). Details of the model inputs, WRF-Chem settings, and output postprocessing are given below.

Inputs
Initial and boundary conditions: Meteorological data were retrieved from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) Final Analysis (FNL) with a horizontal grid spacing of 0.25° and 6 h intervals (https://rda.ucar.edu/datasets/ds083.3/), and chemical data were obtained from the Community Atmosphere Model with Chemistry (CAM-Chem) simulations [6].
Anthropogenic emission inventory: Anthropogenic emissions were included in the model using a local emission inventory [5]. The inventory was disaggregated and speciated as specified in the emissions section of Cifuentes et al. [4]. Then, an emission file compatible with WRF-Chem was generated using the AAS4WRF emission preprocessor [7]. The final file is provided in the Mendeley repository.
Online/Global biogenic emission inventory: Online biogenic emissions were estimated using the MEGAN model coupled with WRF-Chem. The gridded emissions were obtained with a temporal resolution of 1 h and a spatial resolution of 1 km × 1 km.
Offline/Local biogenic emission inventory: Offline biogenic emissions were estimated using the BIGA model [2] with a temporal resolution of 1 h and a high spatial resolution of 0.1 km × 0.1 km, in order to capture the high climatic variability and varied vegetation that the large altitudinal changes produce in the study area. The emissions were later aggregated to a resolution of 1 km × 1 km to be included in the WRF-Chem simulation using the AAS4WRF emissions preprocessor [7].
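Going from 0.1 km to 1 km cells means that each coarse cell covers a 10 × 10 block of fine cells. The sketch below block-averages a 2-D grid, which is appropriate when emissions are stored as fluxes per unit area; if they were stored as mass per cell, each block would be summed instead. How the aggregation was actually carried out is not stated in the source, so the function and its factor parameter are illustrative assumptions.

```python
def aggregate_blocks(grid, factor=10):
    """Average non-overlapping factor x factor blocks of a 2-D list of fluxes."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            block = [
                grid[r][c]
                for r in range(r0, r0 + factor)
                for c in range(c0, c0 + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 20 x 20 fine grid (0.1 km cells): left half emits 1.0, right half 3.0.
fine = [[1.0 if c < 10 else 3.0 for c in range(20)] for _ in range(20)]
coarse = aggregate_blocks(fine, factor=10)
print(coarse)
```

The 20 × 20 input collapses to a 2 × 2 grid of 1 km cells, each holding the mean flux of its 100 fine cells.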
To execute BIGA, the following information was needed: (1) A DEM downloaded from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) online repository (https://asterweb.jpl.nasa.gov). The DEM was later resampled using GIS software to meet the required horizontal resolution (0.1 km × 0.1 km). (2) An LCU map for the area of interest, retrieved from the Colombian Environmental Information System (SIAC in Spanish) online repository (http://www.siac.gov.co/catalogo-de-mapas). The LCU map was converted from polygon to raster format using GIS software. (3) Ground measurements of temperature and solar radiation from sixteen meteorological stations located in the area of interest (see Table 2), obtained through the Caldas Environmental Data and Indicators Center (CDIAC in Spanish) online platform (http://cdiac.manizales.unal.edu.co/indicadores/public/searchClimatological). The records were averaged from five-minute to hourly resolution. Missing data were filled using mean values. All of this information is available in the Mendeley repository.
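The temporal processing of the station records can be sketched as follows. Five-minute readings are averaged per clock hour, and hours without any valid reading are filled with the mean of the available hourly values. The source does not say which mean was used for gap filling, so the overall-mean choice, the function names, and the sample records are assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hourly_means(records):
    """Average (timestamp, value) readings per clock hour, ignoring None values."""
    buckets = defaultdict(list)
    for stamp, value in records:
        if value is not None:
            buckets[stamp.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in buckets.items()}


def fill_missing_hours(hourly, start, end):
    """Return a gap-free hourly series on [start, end), filling gaps with the overall mean."""
    if not hourly:
        raise ValueError("no valid readings to derive a fill value from")
    fill_value = sum(hourly.values()) / len(hourly)
    series = {}
    hour = start
    while hour < end:
        series[hour] = hourly.get(hour, fill_value)
        hour += timedelta(hours=1)
    return series


# Hypothetical five-minute temperature readings (degrees C); hour 01:00 has no valid data.
t0 = datetime(2018, 6, 3, 0, 0)
records = [(t0 + timedelta(minutes=5 * i), 18.0) for i in range(12)]
records += [(t0 + timedelta(hours=1, minutes=5 * i), None) for i in range(12)]
records += [(t0 + timedelta(hours=2, minutes=5 * i), 20.0) for i in range(12)]
series = fill_missing_hours(hourly_means(records), t0, t0 + timedelta(hours=3))
for hour, value in series.items():
    print(hour.isoformat(), value)
```

In this example the empty 01:00 hour receives 19.0, the mean of the two hours that do have data. A mean taken per hour of day or per month would be an equally plausible reading of the text.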

WRF-Chem configurations
Two air quality simulations were performed, one with each of the biogenic emission inventories previously described. The model configurations were defined according to the suggestions of Cifuentes et al. [4] and are listed in Table 4. The WRF-Chem module named module_cbmz_addemiss.F was modified to include the local biogenic emissions through the same channel by which local anthropogenic emissions are introduced into the model. The modified module is available through the Mendeley repository, in the folder WRF-Chem_Inputs. The changes are found in lines 90, 130, 164-165 and 229 of the module. Note that WRF-Chem must be re-compiled after modifying the module.

Outputs postprocessing
Model outputs were obtained at an hourly time resolution for the period of simulation, leading to a time dimension of 673 records. NCO operators were used to average the data into mean values for each hour of the day (time dimension of 24 records) and to subset the variables of interest presented in Table 1. Then, Python was used to generate the visualizations presented in the article.
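The record counts quoted here follow from the simulation period: 28 days of hourly output with both end points included gives 28 × 24 + 1 = 673 records, and averaging by hour of day collapses them to 24. The sketch below reproduces that arithmetic in plain Python with a synthetic series. The authors used NCO operators for this step and the exact commands are not given in the source, so this is an illustration of the idea rather than their procedure.

```python
from collections import defaultdict
from datetime import datetime, timedelta

start = datetime(2018, 6, 3, 0, 0)
end = datetime(2018, 7, 1, 0, 0)

# Hourly time stamps from start to end, both included.
stamps = []
stamp = start
while stamp <= end:
    stamps.append(stamp)
    stamp += timedelta(hours=1)
print(len(stamps))  # 673


def mean_by_hour_of_day(stamps, values):
    """Collapse an hourly series into 24 means, one per hour of day (UTC)."""
    groups = defaultdict(list)
    for stamp, value in zip(stamps, values):
        groups[stamp.hour].append(value)
    return [sum(groups[h]) / len(groups[h]) for h in range(24)]


# Synthetic series whose value equals the hour of day, so each mean equals its hour.
values = [float(s.hour) for s in stamps]
diurnal = mean_by_hour_of_day(stamps, values)
print(len(diurnal), diurnal[0], diurnal[23])
```

Because the final 00:00 record is included, hour 0 is averaged over 29 values while every other hour is averaged over 28.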

Chemistry options
Gas-phase chemical mechanism: CBMZ
Aerosol scheme: MOSAIC (four bins)
Photolysis: Fast-J

Data Availability
WRF-Chem sensitivity to biogenic emission inventories in a tropical region. Global online model (MEGAN) vs local offline model (BIGA) (Original data) (Mendeley Data).

Ethics Statement
The work did not involve the use of human subjects, animal experiments, or data collected from social media platforms.

Declaration of Competing Interest
The authors have no conflict of interest to report.