Estimation of power plant SO 2 emissions using HYSPLIT dispersion model and airborne observations with plume rise ensemble runs

. SO 2 mixing ratio measurements were made by a research aircraft close to three power plants in North Carolina on March 26, 2019. An ensemble of dispersion simulations with HYSPLIT model v5.2.0 are carried out using a total of 15 heat release parameters ranging from 10 MW to 150 MW for the Briggs plume rise calculation to quantify the underlying modeling uncertainties. For each heat release value, a total of 72 independent HYSPLIT Lagrangian model runs with unit hourly emissions from the three point sources are made to form a transfer coefficient matrix (TCM) with the airborne observations. 5 The TCMs can be decoupled into six segments where the observations of each segment are only influenced by a single power plant in its morning or afternoon operation. Prior to estimating the power plant emissions, the simulation performance is first evaluated with the correlation coefficients between the observations and the model prediction with constant unit-emission in its morning or afternoon operations. The segment influenced by the afternoon operations of Belews Creek power plant has negative correlation coefficients for all the plume rise options and is excluded from the emission estimate when the “optimal” 10 member is selected based on the correlation coefficient. For the other segments, the plume rise runs with the highest correlation coefficients are selected for the emission estimates using the HYSPLIT inverse modeling system. In the TCM-based inverse modeling, the emission estimates are obtained by minimizing a cost function which measures the difference between logarithmic predicted and observed mixing ratios but also takes model uncertainties into account. A cost function normalization scheme is adopted to avoid spurious

The TCMs can be decoupled into six segments where the observations of each segment are only influenced by a single power plant in its morning or afternoon operation.Prior to estimating the power plant emissions, the simulation performance is first evaluated with the correlation coefficients between the observations and the model prediction with constant unit-emission in its morning or afternoon operations.The segment influenced by the afternoon operations of Belews Creek power plant has negative correlation coefficients for all the plume rise options and is excluded from the emission estimate when the "optimal" member is selected based on the correlation coefficient.For the other segments, the plume rise runs with the highest correlation coefficients are selected for the emission estimates using the HYSPLIT inverse modeling system.In the TCM-based inverse modeling, the emission estimates are obtained by minimizing a cost function which measures the difference between logarithmic predicted and observed mixing ratios but also takes model uncertainties into account.A cost function normalization scheme is adopted to avoid spurious emission solutions when using logarithmic concentration differences following Chai et al. (2018).The source estimation results of the three power plants with the morning and afternoon flight segments are compared with the Continuous Emissions Monitoring Systems (CEMS) data.Overestimations are found for all the segments before considering the background SO 2 mixing ratios.Both constant background mixing ratios and several segment-specific background values are tested in the HYSPLIT inverse modeling.The estimation results by assuming the 25th percentile observed SO 2 mixing ratio inside each of the five segments agree well with the CEMS data, with relative errors as 18%, -12%, 3%, 93.5%, and -4%.After emission estimations are performed for all the plume rise runs, least root mean square errors (RMSEs) between the predicted and observed mixing ratios are calculated to select a different set of "optimal" plume rise runs which have the least RMSEs.Identical plume rise runs are chosen as the "optimal" members for Roxboro and Belews Creek morning segments, but different members for the other segments yield smaller RMSEs than the previous correlation-based "optimal" members.It is also no longer necessary to exclude the Belews Creek afternoon segment that has negative correlation between predictions 1 Introduction Both Eulerian and Lagrangian atmospheric transport models have been widely used to provide forecasts or analyses of atmospheric components for a wide range of purposes varying from emergency responding to climate change predictions.However, in many applications, such as volcanic eruptions, wildfire events, accidental radionuclide releases from nuclear power plants, and climate change predictions, emissions are the most critical model input parameters but are mostly unknown and difficult to quantify.Even when emission inventories are made available through bottom-up approaches, some of the emissions are often associated with large uncertainties and systematic biases due to outdated databases, inaccurate emission factors, and invalid assumptions regarding operations, processes, and/or activities (throughput) during the bottom-up emission estimation.Therefore, various inverse modeling methods using so-called top-down approaches have been developed in order to estimate the emissions by combining the direct observations and the accumulated knowledge already built into the atmospheric transport models.Lagrangian particle dispersion models are particularly suited to the applications related to point source emission estimations because they effectively avoid calculation outside air pollutant plumes and do not have numerical diffusion problems suffered by most Eulerian models.Many source term estimation applications have been developed using various dispersion models and inverse modeling schemes (e.g., Stohl et al., 2012;Winiarek et al., 2012;Saunier et al., 2013;Winiarek et al., 2014;Chai et al., 2015;Bieringer et al., 2017;Hutchinson et al., 2017;Chai et al., 2018;Kim et al., 2020).
The National Oceanic and Atmospheric Administration (NOAA) Air Resources Laboratory's (ARL) HYSPLIT Lagrangian model is one of the most extensively used atmospheric transport models to simulate the atmospheric transport, dispersion, and deposition of pollutants and hazardous materials (Draxler and Hess, 1997;Stein et al., 2015).A HYSPLIT inverse system based on 4D-Var data assimilation and a transfer coefficient matrix (TCM) was developed and applied to estimate cesium-137 source from the Fukushima nuclear accident using global air concentration measurements (Chai et al., 2015).The system was further developed to estimate the effective volcanic ash release rates as a function of time and height by assimilating satellite mass loadings and ash cloud top heights (Chai et al., 2017).More recently, a HYSPLIT-based Emissions Inverse Modeling System (HEIMS) was developed to estimate wildfire emissions from the transport and dispersion of smoke plumes captured by geostationary satellite aerosol optical depth observations (Kim et al., 2020).In another HYSPLIT inverse system study with the Cross Appalachian Tracer Experiment (CAPTEX) data collected from six controlled releases, Chai et al. (2018) found that adding model uncertainty terms was able to improve source estimate results.
The source term estimation problem proves to be challenging because of the chaotic nature of the atmospheric flow.In addition, the observations from routine monitoring networks are typically sparse and often do not provide enough information https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License. to determine emission sources.Many field campaign studies have been carried out with airborne measurements by research aircraft in order to estimate certain air pollutant and greenhouse gas emission sources.Both traditional mass balance methods (e.g., Mays et al., 2009;Cambaliza et al., 2014;Ren et al., 2018), and various inverse modeling methods which take advantages of atmospheric transport models (e.g., Karion et al., 2019;Angevine et al., 2020;Pitt et al., 2022;Lopez-Coto et al., 2022) have been applied to quantify different emissions.While many inverse modeling applications have been carried out and compared with the bottom-up emission inventories, large uncertainties are still associated with the top-down estimations.Karion et al. (2019) showed an intercomparison study using both inventory scaling method and Bayesian inversion with several dispersion models and meteorological inputs for emission estimation with flight observations.They found significant variabilities (up to a factor of 3) between different models and between different days and indicated that further work was needed to evaluate and improve vertical mixing in the tracer dispersion models.To better evaluate the top-down estimates of emissions, Angevine et al. ( 2020) studied a power plant with Continuous Emissions Monitoring Systems (CEMS) data as the known emissions.
They used a model-assisted mass balance method and examined the estimate uncertainties with an ensemble of HYSPLIT runs with different meteorological inputs and concluded with reasonably large (30%-40%) uncertainties for the top-down estimates of emissions.However, a constant heat release of 85 MW as the main plume-rise parameter used in the Briggs formulation was specified for all the simulations.This could have caused an underestimation of the uncertainties.
In this study, the HYSPLIT inverse modeling system is tested with flight observations collected in 2019 by the University of Maryland Cessna 402B research aircraft to estimate SO 2 point source emissions from three power plants in North Carolina, USA.An ensemble of model runs with a range of emission heat release parameters are used to quantify the forward model simulation uncertainties due to the plume rise calculation.The paper is organized as follows.Section 2 describes the flight observations as well as the HYSPLIT model configuration, and the source term inversion method.Section 3 presents emission inversion results and a summary is given in Section 4.

Observations
A suite of airborne measurements were collected using an instrumented small research aircraft, University of Maryland Cessna  boro (36.4350°N 78.9619°W), and Belews Creek (36.2811°N 80.0603°W).Note that another power plant, Mayo (36.5278°N 78.8917°W), is also in the region, but did not operate on the day.Measurement of SO 2 mixing ratios was made with a Thermo Environment Model 43S pulsed fluorescence analyzer.Calibration of the SO 2 analyzer was conducted before and after the field study with an SO 2 standard that is traceable to National Institute of Standards and Technology (NIST) reference standards.
Additional measurements were also made, including aircraft locations, wind speed, wind direction, temperature, pressure, rel-https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License.ative humidity, and mixing ratios of several other gas species, as well as some aerosol optical properties.More details related to the aircraft instruments and measurements can be found in Ren et al. (2018).
To better compare the HYSPLIT model results with the observations, the original 1-sec data are averaged inside each fourdimensional (4-D) HYSPLIT sampling grid box, i.e., 0.01°longitude by 0.008°latitude, 100 m in altitude, and 1 minute in time in this application.It should be noted that the aircraft typically travels several three-dimensional (3-D) grid boxes within a minute.The original 1-sec data inside each 3-D grid box are averaged separately so that multiple 1-min records would result from such a 4-D averaging.For brevity, the 4-D averaged data are still referred as 1-min data hereafter.

HYSPLIT model
In this study, SO 2 plumes originated from the power plants are modeled using the HYSPLIT model (Version 5.2.0) in its particle mode in which three-dimensional (3D) Lagrangian particles released from the source location passively follow the wind field (Draxler andHess, 1997, 1998;Stein et al., 2015).A particle release rate of 20,000 per hour is used for all calculations.
The meteorological data used to drive HYSPLIT are from the Weather Research and Forecasting (WRF; version 4.0.1)model (Powers et al., 2017) at 3 km horizontal grid spacing (D03 in Fig. 2) and 5-minute temporal frequency.In the WRF simulations, 3D grid nudging of winds is applied in the free troposphere and within the planetary boundary layer (PBL).Fig. 3 shows that the WRF wind speed data mostly agree well with the aircraft observations.However, at the beginning of the afternoon flight the 1-min observations show large variations in wind direction that the 5-min WRF data cannot represent.Note that the WRF simulations of the two outer domains (D01 and D02 in Fig. 2  The dry deposition velocicy of SO 2 is calculated using the resistance method by the HYSPLIT model where molecular weight, diffusivity ratio, and effective Henry's law constant are specified as 64 g/mol, 1.9, and 1 ×10 5 mol/L/atm, respectively.Actual Henry's constant of 1.24 mol/L/atm is used to define the wet removal process for SO 2 as a soluble gas.The sampling grid is defined to be 0.01°longitude by 0.008°latitude, 100 m in altitude from surface to 2000 m above ground level.Mass mixing ratios are output every minute by setting the HYSPLIT parameter ICHEM = 6 to divide output mass by air density.
They are later converted to volume mixing ratios by multipling by the molecular weight ratio of air to SO 2 .

Plume rise
The plume rise calculation in HYSPLIT is based on the Briggs formula derived from dimensional analysis for buoyancydominated plume from power plant stacks (Briggs, 1969).Equation 1shows the formulas used in the HYSPLIT model for the final plume rise ∆H in different meteorological conditions following Arya (1999).

∆H =
where F b is the buoyancy flux term, ū is the mean wind speed, u * is the friction velocity, and s is the static stability parameter, as defined in Equation 2.
Here g is gravitational acceleration.T v is the moist air virtual temperature and θv is the mean virtual potential temperature.
The buoyancy flux term F b is approximated by Equation 3 (Briggs, 1969).flow rate and gas temperature at the stack exit are available.However, the exit gas temperature of the three stacks during the study period cannot be obtained.Note that even if Q H can be accurately estimated, the ∆H calculation through Equation 1is still subject to significant uncertainties due to some assumptions for simplification.In addition, when certain parameters are not readily available, it is preferable to assume them as unknown to allow better applicability for the source term estimation method.Thus we use a range of Q H values for plume rise height calculation to form an ensemble of dispersion runs and the "optimal" plume rise runs that best matches the observations will be selected afterwards.In the detailed studies at six Tennessee Valley Authority over many years, it was found that heat emissions ranged from 20 to 100MW per stack with one to nine stacks operating (Briggs, 1969).For each stack in operation, 15 heat emission values uniformly distributed from 10MW to 150MW are tested in HYSPLIT simulations.During the study period, only one stack was operating at each of the three power plants.

Inverse modeling method
Similar to the previous HYSPLIT inverse modeling applications (e.g., Chai et al., 2015Chai et al., , 2017Chai et al., , 2018;;Kim et al., 2020;Crawford et al., 2022), a transfer coefficient matrix (TCM) approach is used for the inverse modeling application.After a stack heat emission scenario is specified, 24 independent HYSPLIT Lagrangian model runs with unit hourly emissions starting from 0Z to 23Z on March 26, 2019 are made at each power plant to form a TCM using the 4-D averaged 1-min airborne SO 2 observations.
A transfer coefficient at row m and column n of the TCM represents the source-receptor sensitivity of observation m with respect to the nth unit emission run from a certain source location and release hour.The unknown emissions can be solved by minimizing a cost function that integrates the differences between model predictions and observations, deviations of the final solution from the first guess (a priori), as well as other relevant penalty terms if needed (Daley, 1991).Following Chai et al. (2018), a cost function normalization scheme is introduced and the cost function F is defined as, where q ij is the discretized source term at hour i and location j for which an independent HYSPLIT simulation has been run and recorded in a TCM.q b ij is the first guess or a priori estimate and σ 2 ij is the corresponding error variance.We assume the uncertainties of the release at each time-location are independent of each other so that only the diagonal term of the typical a priori error variance σ 2 ij appears in Equation 4. c h and c o denote HYSPLIT-predicted and measured mixing ratios, respectively.
The observational errors ϵ m are assumed to be uncorrelated.Since the term ϵ 2 m is essentially used to weight (c h m − c o m ) 2 terms, the uncertainties of the model predictions and the representative errors are included besides the observational uncertainties.
To consider ϵ 2 in a simplified way, it is formulated as As the additive term parameters a o and a h affect the ϵ 2 in a similar way, the representative errors caused by comparing the measurements with the predicted concentrations averaged in a grid can be included in either a h or a o .The multiplying factor https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License.applied to the second term is the normalization to avoid having zero source as a spurious solution when logarithmic metric is used in the cost function.ϵ b m is the total uncertainties when q b ij is initially used in the model predictions.The details of the normalization can be found in Chai et al. (2018).Chai et al. (2018) shows that the logarithmic metric yields better inversion results than the original air concentration metric.
In this application, the metric variable in Equation 4 is changed to ln(c), i.e., replacing is comprised of two parts, as Note that a constant small mixing ratio 10 −6 ppbv is added to denominators c o m and c h m to avoid division by zero.

Transfer coefficient matrix
As mentioned in Section 2.4, a TCM approach is used in the inverse modeling.The time varying model predictions of each independent HYSPLIT Lagrangian model run with unit hourly emission at all the receptor time and locations are recorded as the transfer coefficients (TCs).The transfer coefficients from a set of model runs can be combined to generate a transfer coefficient matrix (TCM).Roxboro.However, some of hourly emissions will not be well-constrained.For instance, the hourly emissions at 18Z from Roxboro can only be constrained by 6 1-min SO 2 mixing ratio observations and the hourly emission at 18Z from Belews Creek can only constrained by 5 observations.Figure 4 shows that each observation row has only one or two non-zero TC values.If there are two non-zero TC values for any observation row, they are in two consecutive columns which represent two HYSPLIT runs with hourly emissions at two consecutive hours.Instead of trying to estimate the emissions at the individual hours from each power plant, here we will only estimate the average emissions of the two or three consecutive hours that can be constrained by the morning or afternoon flights.With this decoupling approach, the cost function minimization becomes a very simplified problem.

Stack heat emission
As described in Section 2.3, when other meteorological parameters are fixed, the stack heat emission Q H becomes the single user input parameter to affect plume rise calculation with the Briggs formula being used in HYSPLIT.A total of 15 Q H values from an expected range of 10 MW to 150 MW are tested.For each heat emission value, 24 independent HYSPLIT Lagrangian model runs with unit hourly emissions starting from 0Z to 23Z on March 26, 2019 are made at each power plant, resulting in a total of 1080 model simulations.Figure 5 shows some of the plume rise results at the three different power plant locations.Note that the plume rise is added to the stack height listed in Table 2 for the virtual release height used in the model.The plume rise mostly goes up during the day, following the PBL development.Because Roxboro and CPI Roxboro are close to each other, both the PBL heights and the plume rise results with the same Q H = 50 MW are quite similar.Increasing heat emission from had the plume rise drastically reduced.
For each heat emission value applied to a power plant, the 24 HYSPLIT simulations with unit hourly emissions can be combined together to generate the SO 2 plume patterns for the particular power plant.Unless there are significant hourly emission variations the correlation coefficients (r) between the combined plume and the observations is a good metric to evalute the model performance without the need to estimate the emission magnitudes.TCMs.For instance, the number of non-zero rows of the TCM in Fig. 4 increases to 570 with horizontal interpolation from the previous 464 with nearest neighbor option.In addition, the interpolation helps to smooth the gridded predictions.Figure 6 shows that correlation coefficients typically improve by up to 0.1 using the interpolation option.All the results presented later Creek only reach reasonable correlation coefficients of r = 0.5 when Q H is between 120 MW and 140 MW.When Q H is below 80 MW, low and even negative correlation coefficients appear between the predictions and observations.This will be investigated later by separating the morning and afternoon flights.
Table 1 shows the correlation coefficients between 1-min aircraft SO 2 observations from the morning and afternoon flights and the model counterparts using the unit-emission HYSPLIT simulations with different heat emission from the three power plants.For Roxboro, the HYSPLIT simulation with Q H = 70 MW yields the best correlation coefficient r = 0.68 for the morning flight, but the best correlation coefficient r = 0.64 for the afternoon is obtained when Q H is given as 100 MW.In fact, the power plant emissions had variations among the operation hours during the day.The HYSPLIT predictions of the morning and afternoon flight observation are contributed by the unit hourly emission runs from 15Z to 17Z and 19Z to 21Z, respectively.
The CEMS SO 2 hourly emissions at Roxboro are 582, 345, and 360 kg/hr for hours 15Z, 16Z, and 17Z, respectively; 465 and 486 kg/hr for hours 19Z and 20Z, respectively.The lower average hourly emission (429 kg/hr) contributing to the morning flight than the higher average hourly emission (476 kg/hr) contributing to the afternoon flight suggests that a higher Q H for the afternoon flight than the morning flight since the emission of SO 2 and heat emission are expected to be proportional to each other for a particular stack.This agrees with the findings here, i.e., an higher "optimal" Q H (100 MW) is needed for a better simulation of the afternoon flight than the "optimal" Q H (70 MW) for the morning flight.
For Belews Creek plume, the model results with Q H = 80 MW seems to capture the plume pattern recorded by the the morning flight, with a correlation coefficient r = 0.87 between the 1-min observations and the HYSPLIT counterparts.However, the  Belews Creek (see Fig. 1.An attempted assimilation of aircraft wind measurements using the WRF observational nudging is not quite effective to correct the wind direction biases.In addition, successful predictions of the measured SO 2 require wind field measurements at the upwind locations in an earlier time period, which are not available for the current case.No "optimal" plume rise will be selected for this segment before Section 3.3.3.
The HYSPLIT simulations with Q H = 90 MW and Q H = 140/150 MW are found to correlate best with the CPI Roxboro SO 2 plumes measured during the morning and afternoon flights, with correlation coefficients as 0.75 and 0.44, respectively.
The CEMs SO 2 hourly emissions at Roxboro CPI are 281 and 300 kg/hr for hours 15Z and 16Z, respectively; 316 and 295 kg/hr for hours 19Z and 20Z, respectively.While the fact that the optimal Q H is higher in the afternoon corresponds well with the higher average SO 2 emission from the CPI Roxboro power plant, 306 kg/hr for 19-20Z versus 291 kg/hr for 15-16Z, the much lower correlation coefficient r = 0.44 for the afternoon plume indicates large prediction errors even with the "optimal" Q H (140/150 MW).

Inversion results
It has been shown that the current problem can be decoupled among the three different power plants.In addition, the SO 2 measurements from the power plant plumes during the morning and afternoon flights are affected by emissions of distinctive periods of two to three hours.Thus six segements are considered independently.Considering the very limited number of 1-min observations to constrain the emissions at certain hours as discussed in Section 3.1, constant emissions are assumed for each of the six segments.
When pre-processing the observations, multiple 1-second SO 2 are averaged to generate 1-min observation.The standard deviation of the multiple original 1-sec observations is calculated to represent the observational uncertainty.The parameters in Equation 5 are found using linear regression, as f o = 0.1 and a o = 0.05 ppbv.Chai et al. (2018) shows that setting the model uncertainty parameter f m = 0.2 yields good results when compared with the known emission sources in the case study.Here the model uncertainty parameter f m = 0.2 is also assumed and the additive term a m is set as 0.05 ppbv, identical to a o .

Zero background
Inversion estimations are first carried out without subtracting any background SO 2 mixing ratios from the observations.That is, the observations are assumed originating only from the three power plant sources.Emission estimation results of the three power plants obtained by minimizing the cost functions using the morning and afternoon flights separately are listed in Table 3 with 15 different assumed heat emissions.
Based on the morning flight, the estimated Roxboro SO 2 emission varies from 701.5 kg/hr with Q H = 10 MW to 473.2 kg/hr with Q H = 150 MW.With the "optimal" Q H =70 MW, SO 2 emission is estimated as 551.9 kg/hr, 29% greater than the average CEMS between 15Z and 17Z.Table 2 shows the emission at 14Z and 15Z are both 582 kg/hr, while the emissions at 16Z and 17Z decrease to 345 and 360 kg/hr before going up again to 509 kg/hr at 18Z.The average emission from 19Z to 20Z estimated based on the afternoon flight with the "optimal" Q H = 100 MW is 520.9 kg/hr, 9% larger than the average indicates that the model probably predicts the mixing ratio gradient relative well but misplaces the plume relative to actual plume.Since the model predicts higher mixing ratios when observation values are low but predicts lower mixing ratios when observation values are higher, neither lower or higher emissions would improve the agreement between the predictions and observations.Thus, no significant biases arise from such cases.

SO 2 background
With zero background SO 2 mixing ratios, the emission estimates based on the "optimal" heat emission are all greater than the CEMS emissions.This indicates that it is necessary to consider the SO 2 background mixing ratios.The HYSPLIT simulated mixing ratios are actually the enhancements over the background mixing ratios.As shown in Table 4, there are 810 1-min observations, more than half of the 1503 1-min SO 2 observations, not residing in any of the HYSPLIT simulated plumes originated from the three power plants with any of the 15 heat emissions.
At first, the median value of the missed SO 2 observations (0.199 ppbv) is assumed as the background SO 2 mixing ratio.This value is subtracted from all the observations unless the values are below this background value, where the observations are set as zero.Using the adjusted observations, the emission estimations results are listed in Table 5.Compared to the estimates with zero background mixing ratios, the estimated emissions are all reduced, as expected.The Roxboro emissions are estimated to be 436 kg/hr for 15-17Z and 403.3 kg/hr for 19-20Z.Both estimates agree much better with the CEMS than the previous estimates without considering the background SO 2 mixing ratios.The estimated Belews Creek emission of 1259.7 kg/hr is significant improved as well.The CPI Roxboro emission during the 15-16Z period is overestimated by 89%, but it is not as severe as the previous 145% overestimation.The estimated CPI Roxboro emission for the 19-20Z period is within 4% of the CEMS value.
Table 4 shows the statistical distribution values of the six different segments, i.e., the morning and afternoon plumes from the three power plants.The highest 1-min SO 2 mixing ratio of 7.249 ppbv is inside the Belews Creek plume measured during the morning flight.The observed SO 2 mixing ratios inside the Belews Creek plumes are much higher than those from the other plumes.It is beneficial to assume different background values for the six different segments of the observations.The minimum, the 5th percentile, the 10th percentile, and the 25th percentile mixing ratios of the morning and afternoon observations inside  Up to now, the best heat emission parameters are selected based on the correlation coefficients between the observations and predicted counterparts for each segment of the observations after an ensemble of HYSPLIT runs with 15 different heat emission are performed.This can be performed before the emissions are estimated since the correlation coefficients are not affected by the magnitudes of the emissions when emissions for each segment are assumed to be constant.After the emission magnitudes are estimated, model performance can be evaluated using other statistic metrics.
The correlation-based emission estimations using all the 15 different heat emission parameters by assuming the segmentspecific 25th percentile observation as the background mixing ratios are listed in Table 6.The root mean square errors (RMSEs) of the HYSPLIT predicted morning and afternoon plumes from the three power plants with all the plume rise ensemble runs are listed in Table 7.The "optimal" heat emissions that yield the best correlation coefficients also result in the smallest RMSEs for two segments, the morning plumes from Roxboro and Belews Creek.The afternoon plume from Roxboro predicted with Q H = 90 MW and the estimated emission of 389.9 kg/hr has the smallest RMSE of 0.428 ppbv.The emission is underestimated by 18%.However, for both the morning and afternoon plumes from CPI Roxboro, "optimal" heat emissions associated with the For all the other four segments, the best RMSEs are slightly larger than the median of the observations.This indicates the poor performance of the Belews Creek afternoon model simulation.However, the emission inversion with Q H = 140 MW still yields a very good estimate of 811.6 kg/hr, which is only 2% overestimated.
Figure 7 shows the comparison of both the RMSE-based and correlation-based optimal predictions with the morning and afternoon flight observations in the HYSPLIT predicted plumes from the three power plants.Identical results are obtained using the smallest RMSE and the highest correlation coefficient for the morning segments from Roxboro and Belews.For both cases, the predicted SO 2 mixing ratios agree well with the observations.Note that here the SO 2 predictions include both the predicted SO 2 enhancement with the estimated emissions and the assumed segment-specific background values which are chosen as the 25th percentile observations inside the particular plumes.For the other cases, the RMSE-based predictions tend to produce lower mixing ratios for the observed high SO 2 values.Thus the linear regression lines for the RMSE-based predictions tend to have flatter slopes.However, the RMSE-based emission can still be larger, such as the CPI Roxboro afternoon case.
The scatter plot for the Belews Creek afternoon case clearly shows anti-correlation as indicated by the negative correlation coefficients listed in Table 1.This is caused by plume misplacement due to wind direction errors.Although the predicted high and low mixing ratios are opposite to the observations, the minimization of the cost function defined by Equation 4 the inverse modeling method is not very sensitive to the plume misplacement.If Q H = 110 MW that generates the highest negative correlation of -0.68 is chosen as the "optimal" plume rise parameter, the estimated emission for Belews Creek during the 18-19Z period is 709.9 kg/hr, which is only 11% lower than the CEMS value of 794 kg/hr.It might still be possible to have 375 reasonable emission inversion results even when plumes are misplaced by the model.Table 7. RMSEs of the SO2 mixing ratios of morning and afternoon plumes from three power plants calculated using the estimated SO2 emissions from the three power plants with 15 different assumed heat emissions listed in Table 6.Bold numbers are associated with the heat emissions which generate the highest correlation coefficients between observations and unit-emission HYSPLIT predictions for the specific flight segments.and RMSE-based "optimal" predictions are shown in Figure 9 and enlarged in Figures A1-A10.For the morning flight, the "optimal" predictions of the Roxboro and Belews Creek plumes based on the highest correlation coefficient and minimal RMSE are identical.The prediction results agree well with the observed plume placement and width, as well as the mixing ratios.On the other hand, for the CPI Roxboro morning plume, the RMSE-based "optimal" prediction with Q H = 30 MW is quite different from the correlation-based "optimal" prediction with Q H = 90 MW.The center of the RMSE-based plume is at lower altitude than the correlation-based plume (Figures A1, A2, and A4).The lower-placed plume is also associated with lower mixing ratios that match the observations better.For the Roxboro plume captured during the afternoon flight, the correlation-based "optimal" prediction with Q H = 100 MW and SO 2 emission of 418.9 kg/hr shows very similar spatial structures and mixing ratios as the RMSE-based "optimal" prediction with Q H = 90 MW and 389.9 kg/hr.Figure 8 and Figures A7-A10 show little differences between the two and both agree well with the 1-min aircraft observations.Figure A7 show that both predictions underestimated the observed peak values along correlation-based solution.Figure A6 shows that it is not very different from the RMSE-based result with Q H = 140 MW.
Both cases clearly misplaced the first transect of the plume and predicted wider transects than the observations.It is found that the second transect shown in Figures A6 is well predicted with Q H = 110 MW.
For the CPI Roxboro plume observed during the afternoon flight, the correlation-based "optimal" prediction with Q H = 140 MW and the RMSE-based "optimal" prediction with Q H = 50 MW appear drastically different in Figure 8  better than the correlation-based results although the estimated emission of 389.1 kg/hr is not closer to the CEMS emission of 306 kg/hr than the correlation-based estimation of 294.1 kg/hr.In addition, Figure A7 shows that the RMSE-based solution captures an observed narrow CPI Roxboro plume transect that correlation-based solution fails to reproduce.

Summary and discussion
An ensemble of HYSPLIT runs with various heat release parameters for the Briggs plume rise algorithm are made to estimate SO 2 emissions from three power plants.Using TCM approach for the inverse modeling, independent HYSPLIT Lagrangian model runs with unit hourly emissions are carried out for each heat release value.The SO 2 emissions from the three power plants during the morning and afternoon flight periods on March 26, 2019 are estimated separately through six different segments.
Initially the "optimal" plume rise runs are selected based on the highest correlation coefficients between predictions and observations.A segment with negative correlations is excluded.It is found that the SO 2 emissions are overestimated for all the remaining segments if background mixing ratios are not considered.Several different assumptions of background values are then tested.Assuming the 25th percentile observed SO 2 mixing ratio inside each segment as the background SO 2 mixing ratios yields good emission estimates, with the relative errors as 18%, -12%, 3%, 93.5%, and -4% when compared with the CEMS data (see Figure 10).
Using the same segment-specific SO 2 background assumption, "optimal" plume rise runs are later selected to have the smallest RMSEs between the predicted and observed mixing ratios.The previously excluded segment that has negative correlation coefficients between predictions and observations is also included in the emission inversion.While identical plume rise runs are chosen as the "optimal" members for Roxboro and Belews Creek morning segments, different runs are selected for the other three segments than the previous correlation-based results.In addition, emission inversion for the previously excluded segment that has negative correlation coefficients between predictions and observations is also carried out.The relative errors as 18%, -18%, 3%, -9%, and 27% for the five segments, and 2% for the Belews Creek afternoon segment.Figure 10 shows that the RMSE-based estimate of SO 2 emission from CPI Roxboro at 15-16 Z agrees much better with the CEMS data than the correlation-based estimate does.The RMSE-based SO 2 emission estimates of Roxboro at 19-20 Z and CPI Roxboro at 19-20 Z appear to deteriorate slightly.However, the associated HYSPLIT predictions show better agreement with the observations than the correlation-based "optimal" runs because of their smaller RMSEs.
While the uncertainty of the heat emission is focused here, there are a lot of other uncertainties associated with the meteorological data input and some of the HYSPLIT turbulence parameterizations.The inverse modeling results also depend on show the ranges of the results using 10 MW above and below the "optimal" heat emissions.
the assumed background SO 2 mixing ratios and several uncertainty parameters used in the cost function formula.Additional analyses could be carried out, but will not be included here.The relative low resolution of heat emission with an increment of 10 MW for the plume rise ensemble runs may result in significant errors for some cases.For instance, Figure 10 shows large ranges of the emission estimates when using 10 MW above and below the correlation-based and RMSE-based "optimal" heat emissions for the CPI Roxboro afternoon segment.Since it is not easy to select the best performance plume rise run based on the limited observations, it is probably better to use several ensemble members to quantify the uncertainties of the model simulation as well as the emission estimates.This is indicated in Figure 10, but needs to be further explored in the future.
Negative correlation is found between predictions and observations for the Belews Creek plume captured by the afternoon flight due to the wind direction errors of the meteorological data.However, the RMSE-based SO 2 emission estimate is only 2% above the CEMS value.More surprisingly, if the plume rise run with the highest absolute correlation coefficient is selected,

UTC hour
Altitude (m) 402B on March 26, 2019.A morning flight started from 13:45 to 17:38 UTC and an afternoon flight lasted from 19:31 to 23:33 UTC.The flight tracks and the locations of the power plants are shown in Fig 1.The flights were intended to sample downwind plumes originated from three coal-fired power plants in North Carolina, Roxboro (36.4833°N 78.0731°W), CPI Rox-

Figure 1 .
Figure 1.Flight tracks of the morning (left) and afternoon (right) flights on March 26, 2019 on top of the ©Google Maps satellite image (retrieved in February 2023).Color represents the aircraft travel time of the day (UTC).The locations of Belews Creek, Roxboro, and CPI Roxboro power plants are also shown.
) are generated to provide boundary conditions for the most inner domain D03.The WRF turbulent kinetic energy (TKE) data are used to calculate the turbulent velocity variances.The ratios of https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License. the vertical to the horizontal turbulence for daytime and nighttime are set as 0.18 for both daytime and nighttime.The boundary layer stability is computed from the heat and momentum fluxes from the meteorological data.The WRF mixed layer depth is directly used in the HYSPLIT model.

Figure 3 .
Figure 3. Wind speed (top) and wind direction (bottom) comparisons between the 1-min aircraft measurements (OBS) and 5-min WRF data along the flight.Aircraft above ground level altitudes are also shown.
Figure 4 shows a TCM with 72 columns separately into three parts representing the three power plants.Each of the 24 columns for a power plant represents a HYSPLIT run with unit hourly SO 2 emission specified for a single hour on March 26, 2019.Each row indicates a 1-min 4D SO 2 observation with at least a non-zero transfer coefficient obtained from the 72 HYSPLIT runs.The stack heat emission Q H = 50 MW is specified for all the 72 runs.A total of 464 out of 1503 1-min 4D SO 2 observations are affected by the three power plants during this test period, according to this set of HYSPLIT runs.Among those 464 observations, the first 234 1-min observations belong to the morning flight and the next 230 observations are from the afternoon flight.Most of the observations with zero transfer coefficients for all the 72 HYSPLIT runs have low SO 2 mixing ratios, which are likely due to SO 2 background caused by other minor sources than the three power plants.Note that the background SO 2 mixing ratio may vary from one location to another and from one hour to the next hour.

Figure 4
Figure4shows that the emissions before 15Z or after 21Z of the day from any of the three power plants do not contribute to the predicted SO 2 plumes along the tracks of the morning or afternoon flights.For 463 of the 464 indexed observation rows in Fig.4, the non-zero transfer coefficients only appear in one of the three parts.That is, all observations except one are only affected by a single power plant for the current set of model runs.The only exception (I obs = 369) of the 1-min observations is influenced by both Roxboro and CPI Roxboro.When stack heat emission Q H = 60 MW or higher values is applied, the plumes from the three power plants are all separate without any overlapping.This implies a decoupled system in which the emission sources from the three different power plants can be solved separately.However, with lower heat emissions (Q H = 10 MW, 20 MW, 30 MW, 40 MW) some isolated 1-min observations may be influenced by both Roxboro and CPI Roxboro.The largest number of such observations appears when Q H = 10 MW is applied to all three power plants where 6 of the 479 observations with non-zero transfer coefficients are affected by both Roxboro and CPI Roxboro.It is found that estimating the emissions

Figure 4 .
Figure 4. Transfer coefficients (TCs) calculated with unit hourly SO2 emissions starting from 0Z to 23Z on March 26, 2019 at the three power plants with QH = 50MW.I obs is the index of the 1-min 4D observations ordered by their measurement time.Observations with zero transfer coefficients for all the 72 HYSPLIT runs are excluded.The first 234 1-min observations belong to the morning flight and the next 230 observations are from the afternoon flight.TC units: ppbv/(kg/hr).
Figure 5. PBL heights and the final plume rise calculated with QH = 50 MW at three different power plant locations from 5Z to 23Z on Mar. 26, 2019 (left).Plume rise calculated with QH = 100 MW at Belews Creek and QH = 20 MW at CPI Roxboro are compared with those calculated with QH = 50 MW.Both PBL heights and plume rise shown are above ground level (AGL) heights.

Figure 6 .
Figure 6.Correlation coefficients between 1-min SO2 observations and the model counterparts using unit-emission HYSPLIT simulations with different heat emissions (QH ) from the three power plants.When calculating model counterparts of the observations, both horizontally nearest neighbor (n) and interpolation (i) approaches are used.
CEMS value.Contrary to the morning flight, the estimated emissions are generally greater with increasing emission heat.The estimated emissions are 875.7 and 449.3 kg/hr with Q H = 150 and 10 MW, respectively.Using the morning flight observations, Belews Creek SO 2 emissions between 16Z and 17Z are overestimated with all 15 heat emissions.With the "optimal" Q H = 80 MW, the estimated emission is 1417.3kg/hr.Although this is 57% larger than the average CEMS emission, it is better than the estimates with other Q H values except Q H = 100 or 120 MW which yields slightly lower emissions (1417.3/1406.0kg/hr).It is also noted that estimated Belews Creek SO 2 emissions using the afternoon flight observations are mostly within a factor of two when comparing with the average hourly CEMS emission while significant negative correlations are found between the observations and the model predictions.The worst underestimations when Q H = 40-70 MW are associated with lower absolute correlations (|r| < 0.3).At Q H = 110 MW when the most extreme anticorrelation (r = −0.68)occurs, the estimated SO 2 emission of 697.3 kg/hr is very close to the average hourly CEMS emission of 794 kg/hr between 18Z and 19Z.The inverse correlation is caused by the plume misplacement mostly due to wind direction error.The high absolute correlation https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License.

330
the plumes from three different power plants are assumed as segment-specific background mixing ratios.After subtracting the assumed background values from the observations, the emission estimations results are listed in Table5.The estimated emissions decrease with increasing background values.With the segment-specific 25th percentile as the background, the Belews https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License.Creek emission estimation of 929.7 kg/hr is within 3% of the CEMS values and the other estimates are comparable to the results by assuming a constant background mixing ratio of 0.199 ppbv.
highest correlation coefficients are quite different from the heat emissions that produce the smallest RMSEs.If the model runs associated the smallest RMSEs are selected, the estimated CPI Roxboro SO 2 emissions are 265.1 kg/hr for 15-16Z and 389.1 kg/hr for 19-20Z, 9% underestimated and 27% overestimated over CEMS.While the 19-20Z emission is worse than the result based on the best correlation, the 15-16Z emission estimation is much closer to the CEMS than the correlation-based result, which is 94% overestimated.For the plume from the Belews Creek observed during the afternoon flight, Q H = 140 MW yields the least RMSE of 1.874 ppbv, which is more than three times of the median SO 2 observation in the segment.The least RMSE of 0.859 ppbv for the Belews Creek morning segment is smaller than the 25th percentile value of the observation (1.041 ppbv).
is still https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License.capable of reaching to an estimate close to the actual emission rate.The observations appear to have a good representation of its mixing ratio distribution for the plume at the distance from the source.Even if the model misplaced the plume location, predicted mixing ratios that have a similar distribution of the low and high values still has the minimal cost function.That is,

Figure 8
Figure8shows the "optimal" predictions based on the highest correlation coefficients and minimal RMSEs at 800 m above ground level at 17Z and 19Z.Continuous vertical profiles along the flight track, or "curtain" plots, of the correlation-based Figure 8(c) shows a wider CPI Roxboro plume of the RMSE-based result than the correlation-based result in Figure 8(a).The larger extent of the RMSE-based CPI Roxboro plume results in an extra appearance of the plume under the flight track in the curtain plot (Figure A3).

Figure 8 .
Figure 8.Comparison of the correlation-based (a, b) and the RMSE-based (c, d) "optimal" predictions at 800 m above ground level at 17:00Z (a, c) and 19:00Z (b, d).The morning (a, c) and afternoon (b, d) 1-min observations are overlaid as circles.Color indicates the SO2 values for both predictions and observations.The three power plants are marked with solid black circles.

Figure 9 .
Figure 9. "Curtain" plots of the correlation-based (a, b) and the RMSE-based (c, d) "optimal" predictions.In the "curtain" plots, continuous vertical profiles along the flight track are shown following the observation time.The morning (a, c) and afternoon (b, d) 1-min observations are overlaid as circles.Color indicates the SO2 values for both predictions and observations.

Figure 10 .
Figure 10.The CEMS and estimated SO2 emissions from the three power plants on March 26, 2019 during the specified hours.Error bars of CEMS emissions indicate the ranges of hourly emissions for the specified hours as well as one hour before and one hour after.Correlationbased and RMSE-based estimates are the inversion results using the "optimal" heat emission that generates the highest correlation coefficient and the smallest RMSE between observations and the HYSPLIT predications for the specific flight segment, respectively.The correlationbased Belews Creek afternoon segment is based on the highest absolute correlation coefficient.Error bars of the estimated SO2 emissions 440 https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License. the SO 2 estimate of 715.6 kg/hr is very close to the CEMS average emission rate of 794 kg/hr.We speculate that the inverse modeling is not very sensitive to the plume misplacement because the cost function minimization would favor an unbiased population distribution even when misplacement by the model is present.It has to be noted that the current dispersion simulation directly places the pollutant release points with the calculated plume rises elevated above the stacks while the actual plumes reach their apexes gradually.Thus the dispersion model is not able to accurately reproduce the exact plume shapes at locations close to the source.The afternoon flight around Belews Creek power plant is closer to the source than the other segments.This probably makes this case more difficult to simulate accurately than the other segments.https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License.

Figure A1 .Figure A2 .Figure A3 .Figure A4 .Figure A5 .
Figure A1.Enlarged "curtain" plots of the correlation-based (a) and the RMSE-based (c) "optimal" predictions in Figure9(Part 1).In the "curtain" plots, continuous vertical profiles along the flight track are shown following the observation time.The morning flight 1-min observations are overlaid as circles.Color indicates the SO2 values for both predictions and observations.Predicted plumes from Roxboro, Belews Creek, and CPI Roxboro are indicated with letters "R", "B", and "C", respectively.

Figure A6 .Figure A7 .Figure A8 .Figure A9 .Figure A10 .
Figure A6.Enlarged "curtain" plots of the correlation-based (b) and the RMSE-based (d) "optimal" predictions in Figure 9 (Part 1).In the "curtain" plots, continuous vertical profiles along the flight track are shown following the observation time.The afternoon flight 1-min observations are overlaid as circles.Color indicates the SO2 values for both predictions and observations.Predicted plumes from Roxboro,Belews Creek, and CPI Roxboro are indicated with letters "R", "B", and "C", respectively.
where c p , ρ, and T are the specific heat at constant pressure, average density, and temperature of ambient air, respectively.Q H is the heat emission from the stack.Assuming standard atmosphere, Q H is the only user input parameter besides meteorological 130 conditions that affects the final plume rise height ∆H.It is possible to calculate Q H when the relevant parameters such as the

Table 1 .
Correlation coefficients between 1-min SO2 observations from the morning and afternoon flights and the model counterparts using the unit-emission HYSPLIT simulations with different heat emissions from the three power plants.The highest correlation for each flight segment is highlighted with bold font.
Nonetheless, the negative correlations between model and observations indicate model deficiencies and require special attention.The CPI Roxboro emission estimates based on the morning and afternoon flights with the "optimal" Q H values (90 and 140 MW) are 712.8 and 384.4 kg/hr, respectively.They are overestimated over the CEMS by 145% and 26%.The CPI Roxboro emission estimates based on the morning flight increase significantly when Q H is above 100 MW.Overestimation of the SO 2 emissions by factors of 18 and 15 are found with Q H set as 140 and 150 MW, respectively.Table1shows that the two heat emissions yield correlation coefficients of 0.10, a significant drop from r = 0.75 when Q H is assumed as 90 MW.Although the highest correlation coefficient between observations and unit-emission HYSPLIT predictions for a specific flight segment may not produce the best emission estimates, a low correlation coefficient typically indicates modeling deficiencies very effectively.

Table 2 .
The power plant geolocations, stack heights, and CEMS emissions (United States Environmental Protection Agency (U.S. EPA),

Table 3 .
Estimation of SO2 emissions from the three power plants on March 26, 2019 with 15 different assumed heat emissions and the average CEMS emissions during the specified hours.The ranges of CEMS hourly emissions for the specified hours as well as one hour before and one hour after the period are shown after the average CEMS emission.The relevant CEMS hourly emissions are listed in Table 2.The bold numbers are associated with the heat emissions that generate the highest correlation coefficients between observations and HYSPLIT predictions for the specific flight segments.

Table 4 .
Number of 1-min SO2 observations and some statistics of the SO2 mixing ratios.There are overlapping between Roxboro and CPI Roxboro segments since some observations are affected by both power plants.

Table 5 .
Estimation of SO2 emissions from the three power plants onMarch 26, 2019with different background mixing ratios.The "optimal" heat emission that generates the highest correlation coefficient between observations and unit-emission HYSPLIT predictions for the specific flight segment is assumed.Complete emission estimates with all heat emissions and different background mixing ratios are listed in Tables 3, A1, A2, A3, A4, and 6.The average CEMS emissions during the specified hours are listed for reference.The relevant CEMS hourly emissions are listed in Table2.The segment-specific statistical distribution values are listed in Table4.https://doi.org/10.5194/egusphere-2023-329Preprint.Discussion started: 13 March 2023 c Author(s) 2023.CC BY 4.0 License.3.3.3Root mean square errors (RMSEs)

Table 6 .
Estimated SO2 emissions from the three power plants onMarch 26, 2019with 15 different assumed heat emissions and the average CEMS emissions during the specified hours.The segment-specific 25th percentile observations are assumed as the background SO2 mixing ratios and have been subtracted from the observations for emission inversion.The ranges of CEMS hourly emissions for the specified hours as well as one hour before and one hour after the period are shown after the average CEMS emission.The relevant CEMS hourly emissions are listed in Table2.The bold numbers are associated with the heat emissions which generate the highest correlation coefficients between observations and HYSPLIT predictions for the specific flight segments.The underlined numbers are associated with the smallest RMSEs listed in Table7.
The underlined numbers indicate the smallest RMSEs of each segment.

Table A2 .
Estimated SO2 emissions from the three power plants onMarch 26, 2019with 15 different assumed heat emissions and the average CEMS emissions during the specified hours.The segment-specific minimum observations are assumed as the background SO2 mixing ratios and have been subtracted from the observations for emission inversion.The ranges of CEMS hourly emissions for the specified hours as well as one hour before and one hour after the period are shown after the average CEMS emission.The relevant CEMS hourly emissions are listed in Table2.The bold numbers are associated with the heat emissions which generate the highest correlation coefficients between observations and HYSPLIT predictions for the specific flight segments.The underlined numbers are associated with the smallest RMSEs listed in Table7.

Table A3 .
Estimated SO2 emissions from the three power plants onMarch 26, 2019with 15 different assumed heat emissions and the average CEMS emissions during the specified hours.The segment-specific 5th percentile observations are assumed as the background SO2 mixing ratios and have been subtracted from the observations for emission inversion.The ranges of CEMS hourly emissions for the specified hours as well as one hour before and one hour after the period are shown after the average CEMS emission.The relevant CEMS hourly emissions are listed in Table2.The bold numbers are associated with the heat emissions which generate the highest correlation coefficients between observations and HYSPLIT predictions for the specific flight segments.The underlined numbers are associated with the smallest RMSEs listed in Table7.

Table A4 .
Estimated SO2 emissions from the three power plants onMarch 26, 2019with 15 different assumed heat emissions and the average CEMS emissions during the specified hours.The segment-specific 10th percentile observations are assumed as the background SO2 mixing ratios and have been subtracted from the observations for emission inversion.The ranges of CEMS hourly emissions for the specified hours as well as one hour before and one hour after the period are shown after the average CEMS emission.The relevant CEMS hourly emissions are listed in Table2.The bold numbers are associated with the heat emissions which generate the highest correlation coefficients between observations and HYSPLIT predictions for the specific flight segments.The underlined numbers are associated with the smallest RMSEs listed in Table7.