1. Introduction
Air pollution puts people at risk of health problems, and clean air is fundamental to health. According to the World Health Organization, the burden of disease from air pollution is comparable to that of unhealthy diets and smoking. Air pollution contributes to non-communicable diseases such as ischaemic heart disease, stroke, chronic obstructive pulmonary disease, and asthma, and carries a substantial associated economic cost. Atmospheric pollution takes many forms, including gases and airborne particulate matter (which includes airborne biological material such as pollen and mold) [1]. Among these pollutants, fine airborne particulates such as PM2.5 are linked to numerous health issues, from asthma to Alzheimer’s [2,3,4,5,6,7,8,9]. Because airborne particulates are so small, they are present everywhere and can penetrate deep into human lungs, acting like Trojan horses that carry toxic chemicals across the air-blood barrier; some of their smallest constituents even cross the blood-brain barrier [1,10]. Understanding the distribution of PM2.5 at high temporal and spatial resolution is essential for addressing these health concerns. The prevailing sources of PM2.5 data are ground-based monitoring stations, such as those in the Air Quality System (AQS) of the United States Environmental Protection Agency (EPA), which includes more than 500 sites across the country. Because of their limited number and connectivity, ground-based monitoring sites cannot provide continuous spatial coverage.
To address this issue, studies have used satellite-based remote sensing to model ground-level PM2.5 concentrations and to estimate PM2.5 over a broader spatial coverage [11,12,13,14,15,16,17,18]. Aerosol optical depth (AOD) is one of the most widely used remote sensing products in PM2.5 studies. Depending on the satellite platform, AOD products can be classified as polar-orbiting or geostationary. Examples of AOD instruments and products from polar-orbiting platforms include the Moderate Resolution Imaging Spectroradiometer (MODIS), the MODIS Multi-Angle Implementation of Atmospheric Correction (MAIAC), and the Visible Infrared Imaging Radiometer Suite (VIIRS). The AOD from these sources shares relatively high spatial resolution but lower temporal resolution and noncontinuous spatial coverage, which makes it well suited to PM2.5 modeling but not to high temporal resolution estimation [15,16,17,18,19,20,21]. Geostationary AOD instruments, on the other hand, such as the Advanced Himawari Imager (AHI) on Himawari-8 and the Advanced Baseline Imager (ABI) on the Geostationary Operational Environmental Satellite GOES-16, remain approximately 35,786 km above the equator and monitor a fixed area of the Earth. These AOD products provide comprehensive spatial coverage of that area in a temporally continuous manner. Hence, in addition to modeling, geostationary AOD is capable of supporting high temporal resolution PM2.5 estimation [11,12,22,23,24,25,26].
In addition to AOD, supplemental information such as meteorological data and local weather condition data has been incorporated in PM2.5 modeling to improve accuracy [12,27,28]. AOD is a measure of optical depth, which is directly related to the particles suspended in the air, while meteorological variables such as wind, surface pressure, and humidity are important factors that affect the concentration of PM2.5. In situ monitoring stations and the European Centre for Medium-Range Weather Forecasts (ECMWF) are the two main sources of meteorological and weather condition data. To better capture the spatial and temporal variations of PM2.5, ancillary data including, but not limited to, population density, land cover, and elevation have been used in PM2.5 studies [22,29,30,31,32,33]. Among these, a comprehensive study over the US incorporated a total of 17 variables for hourly PM2.5 modeling and estimation, revealing that the PM2.5 distribution depends strongly on elevation, population density, land cover, and soil type [22].
Data fusion is an important step that integrates data from a variety of sources into a consistent spatial and temporal form for model training. Challenges exist because datasets from different sources are incompatible in both spatial and temporal scale. Specifically, weather information and PM2.5 concentration values from monitoring stations take the form of sparse points across a continuous space; they commonly have high temporal resolution but lack spatial coverage. Weather data from ECMWF, on the other hand, are reanalysis data produced by complex models and converted to regular grids with global coverage; they have relatively fine temporal resolution but coarse spatial resolution. The data fusion process includes integrating gridded data with point data on the spatial scale and matching data to a consistent time window on the temporal scale. Studies have explored different data fusion approaches, including interpolation, nearest search, and grid alignment [22,34,35]. Depending on the resolution chosen for the model, higher-resolution data can be downsampled to a lower resolution, and lower-resolution data can be upsampled to a higher resolution. However, whichever approach is used, noise and uncertainty can be introduced during the data fusion process. Hence, the native resolution of the predictor and target variables, together with the data fusion design, determines the spatial and temporal resolution of PM2.5 models.
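The spatial side of the fusion step described above, matching a gridded field to point-based station locations by nearest search, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in this study: the function name nearest_grid_value, the 0.25-degree grid, the field values, and the station coordinates are all hypothetical, and the grid is assumed to be regular in latitude and longitude.

```python
import numpy as np


def nearest_grid_value(grid, grid_lats, grid_lons, site_lat, site_lon):
    """Return the value of the grid cell whose centre is closest to a site.

    On a regular latitude/longitude grid the nearest cell (in degree space)
    can be found independently along each axis.
    """
    i = int(np.abs(grid_lats - site_lat).argmin())
    j = int(np.abs(grid_lons - site_lon).argmin())
    return grid[i, j]


# Hypothetical 0.25-degree grid of cell centres and a made-up gridded field.
grid_lats = np.array([33.00, 32.75, 32.50])
grid_lons = np.array([-97.00, -96.75, -96.50])
grid = np.array([[1.0, 2.0, 3.0],
                 [4.0, 5.0, 6.0],
                 [7.0, 8.0, 9.0]])

# Hypothetical monitoring site located between grid cell centres.
site_value = nearest_grid_value(grid, grid_lats, grid_lons, 32.78, -96.80)
print(site_value)
```

The site inherits the value of the whole cell, so any variation inside the cell is lost; this is one way in which pairing areal data with point observations can introduce noise.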
The ECMWF is a popular source of meteorological variables for PM2.5 studies, and its variables have an hourly temporal resolution [36]. These variables are widely used for PM2.5 modeling and estimation because of their comprehensive spatial coverage and satisfactory temporal resolution. However, challenges remain in the following respects. First, although the meteorological variables from ECMWF are suitable for hourly PM2.5 modeling, finer temporal modeling requires temporal up-sampling during data fusion, either by interpolation or by nearest matching, which introduces noise and uncertainty and hence degrades model accuracy. Second, many studies use site-based PM2.5 concentrations together with ECMWF variables for PM2.5 modeling. From the spatial perspective, however, the data fusion process that integrates the gridded, areal ECMWF data with point-based PM2.5 values, whether by interpolation, grid alignment, or nearest search, can also introduce noise. Third, the concentration of PM2.5 changes dynamically with processes in the atmospheric boundary layer (ABL) and with variations in meteorological parameters such as wind speed, vertical gradients in wind speed, and air temperature. Given the limitations of assimilation and modeling, parameters from reanalysis datasets are not always consistent with ground measurements [37,38,39]. Thus, the veracity of the reanalysis dataset is a concern when it is used for PM2.5 studies. Although using meteorological data from ECMWF together with site-based PM2.5 observations is a common approach in PM2.5 modeling, how the temporal and spatial incompatibility during data fusion, as well as the data veracity issues, affect site-based PM2.5 modeling accuracy remains underexplored. Moreover, few weather data sources other than the ECMWF have been examined for their suitability for high temporal resolution PM2.5 modeling.
The Next-Generation Weather Radar (NEXRAD) network consists of 160 sites, which provide important data for climatological and airborne object studies [40]. Owing to its high scanning frequency, NEXRAD provides weather monitoring data at high temporal resolution, which is particularly effective for tracking dynamic objects and weather conditions, for example monitoring bird roosts [41], detecting clusters of “biological targets”, evaluating hurricane impacts on forests [42], and forecasting weather. In addition, studies have shown that NEXRAD is responsive to pollution plumes, such as smoke from fires, and to pollen. Hence, NEXRAD can be used to support forest fire management and to estimate pollen concentration [43,44]. Although some use cases of NEXRAD have been discussed in previous studies, its application potential has not been fully explored. Because of the limits imposed by its signal frequency, NEXRAD cannot directly detect fine particulate matter such as PM2.5. Hence, the use of NEXRAD for PM2.5 modeling has not previously been investigated.
This research makes three main contributions. First, previous studies have shown that the concentration of airborne particulates, such as pollen and PM2.5, is closely related to weather conditions such as wind velocity, humidity, and temperature [16,43], but they have not explored how meteorological variables from NEXRAD can help with PM2.5 modeling. This study explores NEXRAD’s potential in ground-level PM2.5 modeling. Second, most PM2.5 studies focus on generating a model with a relatively coarse temporal resolution (monthly, daily, or hourly), limited by the coarse sampling frequency of the PM2.5 values or the predictor variables [45,46,47]. This study uses variables from NEXRAD and GOES-16 AOD for high temporal resolution PM2.5 modeling. Third, although variables from ECMWF have been widely used together with site-based PM2.5 for modeling purposes, no studies have explored the negative impact on modeling accuracy caused by the veracity issue and by the spatial-temporal incompatibility between predictor and target variables. By comparing two models that use weather variables from in situ observations and from ECMWF, respectively, this research quantitatively explores the uncertainty and error introduced by the data fusion process.
4. Discussion
Wind influences the PM2.5 concentration through saltation. Particles carried by the wind move by creep, saltation, or suspension. Saltation occurs when wind-driven particles collide with each other or with the ground surface. Researchers have found that wind plays an important role in saltation, during which fine airborne particulates including PM
and PM
can be emitted [57,58]. NEXRAD’s spectrum width and velocity products can provide information on how fast, and in which direction, the wind is moving. Hence, variables from NEXRAD are incorporated together with in situ observations and ECMWF variables for PM2.5 modeling in this study. The performance of models with and without NEXRAD is compared in group 1 (see Table 6). Because the availability of in situ weather observations is limited, variables from ECMWF are widely adopted as an alternative source of weather conditions and are commonly used in meteorological studies and PM2.5 estimation [34,59]. Therefore, group 2 replaces the in situ observations with similar variables from ECMWF to investigate model performance with and without NEXRAD. Replacing in situ observations with similar variables from ECMWF gives better spatial coverage, but at the cost of lower spatial and temporal resolution and higher uncertainty. Accordingly, comparison group 3 explores how effective the ECMWF variables are in PM2.5 modeling compared with sensor-measured in situ weather observations.
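The comparison design described above, four models arranged in three groups, can be written down compactly. The sketch below is only an illustration of that design: all feature names are hypothetical placeholders rather than the study's actual variable names, the ECMWF list simply mirrors the in situ list because the text says only that "similar variables" are used, and the shared GOES-16 and solar-angle predictors are represented by two generic entries.

```python
# All names below are hypothetical placeholders for feature families.
COMMON = ["goes16_aod", "solar_angles"]
IN_SITU = ["humidity", "temperature", "dew_point", "pressure"]
ECMWF = ["ecmwf_" + name for name in IN_SITU]  # assumed "similar variables"
NEXRAD = ["nexrad_velocity", "nexrad_spectrum_width"]

# The four models, each defined by its predictor set.
MODELS = {
    "in_situ": COMMON + IN_SITU,
    "in_situ_nexrad": COMMON + IN_SITU + NEXRAD,
    "ecmwf": COMMON + ECMWF,
    "ecmwf_nexrad": COMMON + ECMWF + NEXRAD,
}

# The three comparison groups, each pairing two of the models.
GROUPS = {
    1: ("in_situ", "in_situ_nexrad"),  # effect of NEXRAD with in situ weather
    2: ("ecmwf", "ecmwf_nexrad"),      # effect of NEXRAD with ECMWF weather
    3: ("in_situ", "ecmwf"),           # in situ versus ECMWF weather
}


def differing_features(group):
    """List the predictors that differ between the two models in a group."""
    first, second = GROUPS[group]
    return sorted(set(MODELS[first]) ^ set(MODELS[second]))


for group in GROUPS:
    print(group, differing_features(group))
```

Within each group the two models differ only in one family of predictors, so a difference in accuracy can be attributed to that family.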
A summary of the model comparison results can be found in Table 6. The results from group 1 and group 2 demonstrate that variables from NEXRAD can provide extra weather information, especially velocity, which helps to improve PM2.5 modeling accuracy. The larger accuracy improvement in group 2 than in group 1 suggests that, when in situ weather observations are lacking, NEXRAD can serve as an important supplementary information source to ECMWF for better PM2.5 modeling accuracy. Group 3 shows that ECMWF can serve as an alternative source of weather variables in place of in situ observations for PM2.5 modeling and has better spatial coverage; however, variables from ECMWF are at hourly intervals, while the other variables are available every 15 min. As a result, variables from ECMWF need to be interpolated to match the temporal resolution of the remaining variables for PM2.5 modeling, which can introduce noise and lead to a drop in model accuracy.
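The temporal up-sampling mentioned above can be illustrated with a short sketch that converts an hourly series to 15 min steps, either by linear interpolation or by nearest matching. It is a minimal example under stated assumptions, not the procedure used in this study: the function name upsample_hourly and the three hourly values are hypothetical.

```python
import numpy as np


def upsample_hourly(hourly_values, minutes=15, method="linear"):
    """Up-sample an hourly series onto a finer regular time step.

    Returns the fine time axis (minutes since the first sample) and the
    up-sampled values.
    """
    hourly_values = np.asarray(hourly_values, dtype=float)
    t_hourly = np.arange(len(hourly_values)) * 60
    t_fine = np.arange(0, t_hourly[-1] + 1, minutes)
    if method == "linear":
        return t_fine, np.interp(t_fine, t_hourly, hourly_values)
    if method == "nearest":
        # Round to the closest hour; ties (half past) go to the later hour.
        idx = np.floor(t_fine / 60 + 0.5).astype(int)
        return t_fine, hourly_values[idx]
    raise ValueError("method must be 'linear' or 'nearest'")


# Hypothetical hourly values of one weather variable over two hours.
hourly = [10.0, 14.0, 12.0]
t_fine, linear = upsample_hourly(hourly, method="linear")
_, nearest = upsample_hourly(hourly, method="nearest")
print(t_fine)
print(linear)
print(nearest)
```

Neither method recovers variation that occurred between the hourly samples: the linear version assumes a straight line between them and the nearest version holds each value constant, which is why the up-sampled predictor can disagree with what a 15 min sensor would have recorded.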
In addition to NEXRAD, ECMWF, and in situ observations, variables from GOES-16 and solar angles (see Table 5) are employed as predictor variables. Previous studies found that the relationship between AOD and PM2.5 concentration varies with the time of day [60]. Solar angles are closely related to the local time and strongly influence AOD quality, so they provide important information for PM2.5 estimation. This is consistent with Figure 9, Figure 12, and Figure 15, in which the solar angle variables contribute significantly to PM2.5 modeling accuracy.
5. Conclusions
By comparing the four established models in three groups, this study has made the following findings. First, in situ weather observations including humidity, temperature, dew point, and pressure, together with AOD from GOES-16 and solar angles, can achieve good modeling results (R of 0.83) for PM2.5 concentrations at a 15 min temporal resolution. Second, combining variables from NEXRAD with the in situ model improves the modeling accuracy (R) by 2.8%. Third, combining variables from NEXRAD with ECMWF yields a larger accuracy improvement (9.7%) than that obtained with the in situ model. Last, when in situ observations are replaced with variables from ECMWF, the model has an R score of 0.7 on the testing dataset, which is 0.13 less than the model with in situ weather observations.
These findings provide several insights for future research. Because of its high scanning frequency, NEXRAD can provide supplementary weather information that improves PM2.5 modeling, especially at high temporal resolution, even though it cannot detect PM2.5 particles directly. In addition, because of the incompatibility in resolution and the veracity issues, using weather variables from ECMWF in place of in situ observations leads to a significant drop in model accuracy. However, when in situ observations are not available, weather variables from ECMWF remain an effective source for PM2.5 modeling. Furthermore, NEXRAD improves model accuracy more in the model based on ECMWF variables than in the model based on in situ observations. Together, these results suggest that NEXRAD can be used as a supplementary weather source for high temporal resolution PM2.5 modeling, especially in the absence of in situ weather measurements.