Comparison of Ozone Formation Attribution Techniques in the Northeast United States

The Integrated Source Apportionment Method (ISAM) has been revised in the Community Multiscale Air Quality (CMAQ) model. This work updates ISAM to maximize its flexibility, particularly for ozone (O3) modeling, by providing multiple attribution options, including products inheriting attribution fully from nitrogen oxide reactants, fully from volatile organic compound (VOC) reactants, equally to all reactants, or dynamically to NOx or VOC reactants based on the indicator gross production ratio of hydrogen peroxide (H2O2) to nitric acid (HNO3). The updated ISAM has been incorporated into the most recent publicly accessible versions of CMAQ (v5.3.2 and beyond). This study's primary objective is to document these ISAM updates and demonstrate their impacts on source apportionment results for O3 and its precursors. Additionally, the ISAM results are compared with the Ozone Source Apportionment Technology (OSAT) in the Comprehensive Air-quality Model with Extensions (CAMx) and the brute force method (BF). All comparisons are performed for a 4 km horizontal grid resolution application over the northeast U.S. for a selected two-day summer case study (August 9th and 10th, 2018). General similarities among ISAM, OSAT, and BF results add credibility to the new ISAM algorithms. However, some discrepancies in magnitude or relative proportions among tracked sources illustrate the distinct features of each approach, while others may be related to differences in model formulation of chemical and physical processes. Despite these differences, OSAT and ISAM still provide useful apportionment data by identifying the geographical and temporal contributions of O3 and its precursors. Both OSAT and ISAM attribute the majority of O3 and NOx contributions to boundary, mobile, and biogenic sources, whereas the top three contributors to VOCs are found to be biogenic, boundary, and area sources.

Introduction
Tropospheric O3 is a critical air pollutant that endangers human health (WHO, 2013) and sensitive vegetation (Booker et al., 2009), and contributes to climate change (Jacob and Winner, 2009).
It is produced through non-linear photochemical reactions of carbon monoxide (CO), volatile organic compounds (VOC), and nitrogen oxides (NOx = NO + NO2) with sunlight (Atkinson, 2000). In the United States, the national average ambient O3 concentration has decreased by 22% since 1990, owing to regulations such as the Clean Air Act (CAA) on NOx and VOC emissions (Simon et al., 2015). Long-term space observations have also confirmed the improvement in air quality (Duncan et al., 2013; Lamsal et al., 2015). However, many major metropolitan areas continue to exceed the O3 national ambient air quality standards (NAAQS) set by the US Environmental Protection Agency (US EPA). To continue to reduce O3 levels, it is critical to develop effective emission control strategies as has been done for other pollutants (Lefohn et al., 1998; Reitze, 2004; Cooper et al., 2015). The effectiveness of any O3 control strategy hinges on accurately quantifying the contributions of various precursor emissions to O3 formation.
Numerous techniques have been used to characterize and quantify the relationship between emission sources and O3 concentrations, including statistical methods, model sensitivity simulations, and model source apportionment approaches, each with its own set of advantages and disadvantages (Cohan and Napelenok, 2011). While some traditional receptor-based methods based on chemical mass balance (CMB, Hidy and Friedlander, 1971), such as Effective Variance solution (EV, Watson et al., 1984) and Positive Matrix Factorization (PMF, Paatero and Tapper, 1994), produce insightful results when measurements are taken at a specific receptor, they are typically applied to speciated VOC and particulate matter (PM) and are also constrained by the relative sparsity of observations in space and time, rendering them unsuitable for regional and national O3 precursor emission control strategies.
Alternatively, three-dimensional air quality models (AQM) allow for the quantification of O3 source contributions at regular intervals over longer periods and wider spatial distributions. The most basic source apportionment (SA) technique in the context of an AQM is to conduct source sensitivity simulations using the brute force (BF) method, in which several simulations are conducted, each with one source eliminated or reduced. The differences in the output fields compared to the baseline simulation are then attributed to the eliminated or reduced source (e.g., Marmur et al., 2005). BF has some limitations when used to determine total source culpability of O3 due to the pollutant's nonlinear dependence on both relative and absolute VOC and NOx concentrations. For example, removing NOx may lead to an increase of O3 concentrations in the vicinity of large NO emissions (e.g., power plants), as the result of net conversion of O3 to NO2 (Gillani et al., 1996), or at night-time when NOx titration cannot be balanced by the photolysis of NO2. In some cases, where a source contributes a substantial portion of total NOx or VOC emissions, complete source removal for the purposes of source apportionment calculation may also substantially alter the underlying chemical regime for formation of secondary pollutants such as O3. Further, to separate the contributions and interactions of n sources, Stein and Alpert (1993) showed that BF would require 2^n simulations. This quickly becomes impractical, so in practice only a subset of BF simulations is run and the interactions among sources remain unknown. As a result, summarizing the O3 change in response to multiple brute force emission source simulations can make it difficult to interpret the cumulative effect of those emissions on O3 (Kwok et al., 2015).
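To make the cost of full factor separation concrete, the following Python sketch enumerates every on/off combination of a set of sources, which is the set of simulations implied by Stein and Alpert (1993). The function name and the three-source list are illustrative assumptions and are not part of any model code.

```python
from itertools import combinations

def factor_separation_runs(sources):
    """List every on/off combination of sources (hypothetical helper).

    Each entry is the set of sources left switched on in one simulation.
    """
    runs = []
    for k in range(len(sources) + 1):
        for subset in combinations(sources, k):
            runs.append(frozenset(subset))
    return runs

# Illustrative subset of the sector names used later in this study.
runs = factor_separation_runs(["EGU", "ONROAD", "BIO"])
print(len(runs))  # 2**3 = 8 simulations
```

With the eleven emission categories tracked in this study, the same enumeration would call for 2^11 = 2048 simulations, which is why only single-source perturbations are normally run.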
Reactive tracer or tagged species SA methods for O3 have also been incorporated in AQMs. These tracers are usually additional species added to the AQM to track the contributions of pollutants from specific source categories. They undergo the same atmospheric processes as the bulk chemical species within the model (Kwok et al., 2015). As one example, OSAT within CAMx quantifies the contributions of various emission sectors, source regions, as well as initial and lateral boundary conditions, to simulated O3 concentrations (Ramboll Environ, 2015). OSAT allocates instantaneous O3 formation to either NOx or VOCs based on the ratio of hydrogen peroxide (H2O2) to nitric acid (HNO3) production (Dunker et al., 2002). O3 formation is classified as being NOx-limited or VOC-limited based on the gross production of H2O2 (PH2O2) and HNO3 (PHNO3). When the ratio (PH2O2/PHNO3) is above 0.35, the formation is classified as NOx-limited, and as VOC-limited otherwise (Sillman, 1995). If the photochemical formation of O3 (PO3) occurs in a NOx-limited regime, the NOx tracers are used to attribute PO3 proportionally to the emissions sources that contributed to the NOx concentrations. Otherwise, VOC tracers are used to attribute PO3 to the sources that contributed to the VOC concentrations (Dunker et al., 2002; Kwok et al., 2015). The OSAT formulation was recently changed (OSAT3) to track all forms of NOx to account for NOx recycling, which occurs when NOx is converted to another form of NOx (e.g., peroxyacetyl nitrate (PAN) or HNO3) and then converted back to NOx.
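The regime test and proportional allocation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CAMx implementation: the function name, argument names, and sample values are hypothetical, and tracer transport, deposition, and O3 destruction are ignored.

```python
def allocate_po3(po3, p_h2o2, p_hno3, nox_tracers, voc_tracers, threshold=0.35):
    """Split produced O3 (po3) among sources (hypothetical helper).

    nox_tracers and voc_tracers map source name -> tracer concentration.
    The regime is NOx-limited when PH2O2/PHNO3 exceeds the threshold.
    """
    # Written as a product so that p_hno3 == 0 does not divide by zero.
    nox_limited = p_h2o2 > threshold * p_hno3
    tracers = nox_tracers if nox_limited else voc_tracers
    total = sum(tracers.values())
    if total == 0.0:
        return {source: 0.0 for source in tracers}
    # Each source receives PO3 in proportion to its share of the tracer.
    return {source: po3 * conc / total for source, conc in tracers.items()}

nox = {"ONROAD": 3.0, "EGU": 1.0}
voc = {"BIO": 4.0}
print(allocate_po3(10.0, 0.5, 1.0, nox, voc))  # ratio 0.5: NOx tracers used
print(allocate_po3(10.0, 0.2, 1.0, nox, voc))  # ratio 0.2: VOC tracers used
```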
Additionally, the Integrated Source Apportionment Method (ISAM) within CMAQ has shown promising results for O3 tagging (Kwok et al., 2015). Recent ISAM experiments have quantified the contribution of O3 sources to air pollution in several major cities throughout the United States and Europe (Kwok et al., 2015; Valverde et al., 2016; Karamchandani et al., 2017; Butler et al., 2018; Pay et al., 2019). The attribution of O3 and precursors from specific sources estimated by ISAM implemented in version 5.0 of CMAQ compared well with source-specific aircraft transect measurements (Baker and Woody, 2016). The ISAM algorithms have also been updated several times following the original implementation in CMAQv5.0.2.
The ISAM updates presented in this study substantially increase the flexibility available to users of the CMAQ source apportionment model. These updates were intended to provide long-term flexibility within the model to accommodate newer chemical mechanisms, and they changed the attribution approach as detailed in the methods section. They allow more species to be apportioned and offer more methods of apportionment. Later in the manuscript, we apply the changes to CMAQ-ISAM for a northeastern U.S. O3 air quality episode and compare the results to CMAQ-BF and CAMx-OSAT.
The manuscript is organized as follows: Section 2 documents the ISAM updates in detail; Section 3 describes the methodology for this study, which includes the base modeling configurations, simulation designs for source apportionment, tracked species classes, evaluation methods, and case study development; Section 4 presents the findings, including model evaluation results and comparisons of source apportionment for several species; Section 5 documents the running speed comparisons between CMAQ-ISAM, CAMx-OSAT and CMAQ-BF; and finally, the findings and their implications for future research are discussed in Section 6.

Updates in ISAM
The ISAM implementation in the version 5.0 release of CMAQ was based on Kwok et al. (2013, 2015). That approach was then updated starting from CMAQ version 5.3 to an attribution based on integrated reaction rates and product yields (US EPA, 2019). The later versions (v5.3.2 and beyond) of CMAQ-ISAM (US EPA, 2022a) employ an apportionment scheme that assigns products of each chemical reaction to sources based on reactant stoichiometry. For example, the isoprene peroxy radical (ISO2) reacts with nitric oxide (NO) to produce several different stable and radical species as represented in the CB6R3 chemical mechanism by the following reaction R1.
ISO2 + NO = 0.1*INTR + 0.9*NO2 + 0.673*FORM + 0.9*ISPD + 0.818*HO2 + 0.082*XO2H + 0.082*RO2 (R1)
In addition to nitrogen dioxide (NO2), the products include isoprene nitrate (INTR), formaldehyde (FORM), hydroperoxy radicals (HO2), alkoxy radicals (XO2H), peroxy radicals (RO2), and other isoprene reaction products (ISPD). ISO2 is a product of the oxidation of isoprene, which originates overwhelmingly from biogenic sources. NO is typically emitted from anthropogenic combustion processes, with a much smaller natural component originating from lightning strikes and microbial soil processes on the global scale (Jacquemin et al., 1990; Yienger et al., 1995). Thus, the reactants are approximately half from biogenic and half from anthropogenic sources, so the reaction's products have the same attribution distribution. However, source attribution approaches, both receptor-based (such as PMF) and source-based (such as ISAM), are often used to understand how originally emitted NOx and VOC from particular sources ultimately contribute to model-predicted O3 production.
The loss of source identity through processes such as the NOx cycle and the role of organic peroxy radicals from sources not controlling O3 production make it difficult to determine the culpability of emission sources. In the preceding example, the NO2 produced by R1 is assigned a source that is approximately 50% biogenic and 50% anthropogenic. These source assignments propagate quickly when catalytic processes cause NO2 to cycle back to NO through photooxidation and radical oxidation. Because NOx cycling is fast in regional air pollution models, anthropogenically emitted nitrogen species can be assigned to biogenic (or other nearby) sources downwind, so the original source identity is not retained. R1 is just one example that illustrates the complex relationship between precursors and subsequent source identities of secondary pollutants. Many such reactions exist in modern chemical mechanisms. Some source apportionment applications, such as O3 source attribution assessments, focus on how sources induce O3 production above background levels. In those applications, nitrogen molecules should retain their original source signatures. This approach is used by other apportionment models such as OSAT, earlier ISAM implementations (Kwok et al., 2015), and other tagging methods (Butler et al., 2018; Grewe et al., 2010).
Because attribution objectives may vary based on scale (e.g., global compared to urban) or purpose (e.g., policy or tracing chemical reactions), ISAM has been enhanced to provide additional configuration options for the user to define how secondarily formed gaseous species are assigned to sources of parent reactants (Table 1) (US EPA, 2022b). The existing scheme based on stoichiometrically proportional product attribution introduced in CMAQ version 5.3.2 has been retained as ISAM option 1 (ISAM-OP1). Four new options have been added so the user can configure their simulation based on the application's goal. Each option allows for greater retention of source identity based on subsets of species in the chemical mechanism. ISAM-OP2 apportions products according to the source identity of reactive nitrogen species, including NO, NO2, nitrate radical (NO3), nitrous acid (HONO), HNO3, dinitrogen pentoxide (N2O5), and aerosol nitrate (ANO3). For example, CB6R3 contains the following reaction between the methyl peroxy radical (MEO2) and NO:
MEO2 + NO = FORM + HO2 + NO2 (R2)
In the original ISAM-OP1 configuration, the products of R2 (FORM, HO2, and NO2) inherit source identities proportional to the source identities of the reactants (MEO2 and NO). However, ISAM-OP2 apportions the products to the source identity of NO (presumed predominantly anthropogenic), because NO is a weighted nitrogen-containing species. When a reaction's reactants do not include any of the weighted species, products are apportioned to source identities using the same methodology used in OP1.
ISAM-OP3 expands OP2's list of weighted species to include VOC species identified as important to O3 production. In CB6R3, this includes aldehydes (ALD2 and ALDX), FORM, acetone (ACET), lumped ketones (KET), peroxy operators (XO2 and XO2H), ISO2, and acetyl peroxy radicals (C2O3 and CXO3). Therefore, products of reactions containing these VOCs in addition to the nitrogen species of OP2 as reactants would inherit these species' source identities. For example, ALD2 reacts with NO3 as follows in CB6R3.

ALD2 + NO3 = C2O3 + HNO3 (R3)
The reaction's products, C2O3 and HNO3, inherit identities equally divided between the sources of the reactants because both ALD2 and NO3 are on the list of OP3 species. As in OP2, reactions with no weighted species among their reactants have their products apportioned to sources using OP1's methodology.
ISAM-OP4 lists only the VOC species and daughter products instrumental in O3 chemistry as defined in OP3. In the R1 example, the products are apportioned to the source identity of ISO2, because the other reactant, NO, is not on the list of weighted species. Similarly, the products of R3 are attributed to the source identity of ALD2. As in options 2 and 3, products of reactions without any listed species (such as R2) are attributed using OP1's method.
Finally, ISAM-OP5 was added to account for the instantaneously calculated O3 formation regime or limiting case. The regime is determined using the ratio of PH2O2/PHNO3. The transition point between regimes has a default value equal to 0.35 (Sillman, 1995). For the NOx-limited regime (PH2O2/PHNO3 > 0.35), source identity is passed from the nitrogen species of OP2, while for the VOC-limited regime (PH2O2/PHNO3 ≤ 0.35), source identity is passed from the organics of OP4. These CMAQ-ISAM options, including the regime threshold value (or transition point), are accessible at runtime through the standard model run script.
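The five options can be summarized in a short Python sketch that assigns source fractions to the products of a single reaction. It is a simplified illustration under stated assumptions, not the CMAQ-ISAM code: the function names and the example source labels are hypothetical, every reactant is assumed to enter with unit stoichiometry, and the reactants that pass on their identity are assumed to share the products equally.

```python
# Weighted species lists as described in the text for CB6R3.
NITROGEN = {"NO", "NO2", "NO3", "HONO", "HNO3", "N2O5", "ANO3"}
VOC = {"ALD2", "ALDX", "FORM", "ACET", "KET", "XO2", "XO2H", "ISO2", "C2O3", "CXO3"}

def weighted_species(option, p_h2o2=None, p_hno3=None, transition=0.35):
    """Return the species that pass on source identity (hypothetical helper)."""
    if option == "OP1":
        return set()
    if option == "OP2":
        return NITROGEN
    if option == "OP3":
        return NITROGEN | VOC
    if option == "OP4":
        return VOC
    if option == "OP5":
        # NOx-limited when PH2O2/PHNO3 > transition; both rates must be given.
        nox_limited = p_h2o2 > transition * p_hno3
        return NITROGEN if nox_limited else VOC
    raise ValueError("unknown option: " + option)

def product_sources(option, reactants, **regime):
    """reactants maps species -> {source: fraction}; returns product fractions."""
    weighted = weighted_species(option, **regime)
    # Fall back to all reactants (the OP1 behavior) when none is weighted.
    chosen = [sp for sp in reactants if sp in weighted] or list(reactants)
    share = 1.0 / len(chosen)  # assumption: equal split among chosen reactants
    out = {}
    for sp in chosen:
        for source, frac in reactants[sp].items():
            out[source] = out.get(source, 0.0) + share * frac
    return out

# R1 with ISO2 taken as fully biogenic and NO as fully on-road (illustrative).
r1 = {"ISO2": {"BIO": 1.0}, "NO": {"ONROAD": 1.0}}
for opt in ("OP1", "OP2", "OP3", "OP4"):
    print(opt, product_sources(opt, r1))
print("OP5", product_sources("OP5", r1, p_h2o2=0.5, p_hno3=1.0))
```

For R1 this reproduces the behavior described in the text: OP1 and OP3 split the products evenly between the two sources, OP2 passes on the identity of NO, OP4 passes on the identity of ISO2, and OP5 follows OP2 or OP4 depending on the regime.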

Table 1. Columns: CMAQ ISAM option; reaction product source identity assignment; representative CB6R3* species.

OSAT description
The source apportionment approach implemented in CAMx is briefly recapped here. Detailed updates of all OSAT versions can be found in the CAMx official user guide (https://camx.com/Files/CAMxUsersGuide_v7.10.pdf). All available versions of OSAT (including OSAT3) in CAMx separately solve for production and destruction of O3, with production being attributed to either NOx or VOC emissions, depending on which is estimated to be limiting O3 production. When the ratio of PH2O2/PHNO3 exceeds 0.35, the produced O3 is attributed to NOx emissions; otherwise, it is attributed to VOC emissions. The CAMx source apportionment implementation includes an option (OSAT-APCA) that allows for a redirection of attribution to anthropogenic emissions in situations where the limiting precursor is biogenic. In CAMx-OSAT, O3 attributed to NOx and VOCs is tracked as separate tracer groups. O3 tracers are first adjusted to account for O3 destruction processes and subsequently for net O3 production, which is defined as the difference between O3 production and O3 destruction based on a subset of photochemical reactions that result in O3 destruction. In situations where the net O3 production is negative (destruction reactions dominate), all the O3 tracers are proportionally decreased. When net O3 production is positive, production is assigned proportionally to the sources of those emissions (NOx and VOC precursor tracers) at the time and place where O3 was made. OSAT includes a group of tracers that track odd-oxygen that is consumed when O3 reacts with NO to form NO2 that can quickly photolyze and reform O3 through a reaction with oxygen.
In this situation, the O3 removed from the O3 tracers due to the NO + O3 reaction is moved to the odd-oxygen tracers (which have separate NOx and VOC tracer groups). When NO2 is photolyzed and O3 is formed, a proportional amount of O3 is taken from the odd-oxygen tracers and moved to the O3 tracers.

Base model configurations
e Horizontal diffusion fluxes for transported pollutants were parameterized using eddy diffusion theory. The horizontal diffusivity coefficients were formulated using the approach of Smagorinsky (1963).
f KZMIN was turned on in CMAQ as default.
g Vertical diffusivity coefficients were calculated with the Yonsei University (YSU) bulk boundary layer scheme (Hong et al., 2006) and were adjusted with the KVPATCH, which is comparable to the KZMIN approach in CMAQ.

Source apportionment simulation designs
As discussed in Section 2, ISAM has been updated to include a user option with five possible configurations for the source apportionment approach. Here, we conduct CMAQ source apportionment simulations for all these options: ISAM-OP1, ISAM-OP2, ISAM-OP3, ISAM-OP4 and ISAM-OP5, hereafter referred to as OP1, OP2, OP3, OP4 and OP5. The OSAT3 approach was also used in the CAMx v7.10 base model for comparison with the five ISAM simulations. Hereafter OSAT3 is referred to as OSAT. A brute force method (zeroing out the entire emission stream for tracked sources in CMAQ, hereafter referred to as CMAQ-BF) was also used to compare with the ISAM options and OSAT. Eleven different emission source categories were tracked using each apportionment technique.
The source categories comprise four point-source categories, including electricity generating units (EGU), non-electricity generating units (NONEGU), fires (FIRE), and commercial marine vessels (CMV), and six area-source categories, including on-road mobile (ONROAD), non-road mobile (NONROAD), biogenic (BIO), railway (RAIL), airports (AIRP), and other (AREA). Additionally, OILGAS was tracked as a mixed category (both point and area) of emissions from the oil and natural gas industry in the domain. Total emissions from the above sectors are shown in Table 3.
Finally, three predefined tracers for lateral boundary conditions (BCON), initial conditions (ICON), and other sources (OTHR) were also tracked for O3 and its precursors. OTHR is used for all remaining untagged emission categories. For example, when there are a total of ten emission streams but only five of them are tracked in ISAM, the remaining five emission streams will be defined as OTHR. In this study, all emissions sectors listed above were tracked for OSAT and ISAM. For CMAQ-BF, a unique CMAQ simulation for each emission source category listed above was performed by fully removing the category's entire emission stream. CMAQ-BF apportionment was then calculated by subtracting the resulting pollutant fields from a base model simulation. However, for ICON and BCON, each was reduced by 50%, and the output field difference with the base model was scaled up by a factor of 2 to avoid numerical issues associated with very low model ICON and BCON values. As for OTHR, there is no suitable way to retain an appropriate chemical state of the troposphere after subtracting the necessary emission categories and the initial and boundary conditions from an original CMAQ simulation. Thus, OTHR is not compared among CMAQ-BF, ISAM and OSAT in this study.
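The two CMAQ-BF calculations described above amount to simple differences between model output fields. The Python sketch below illustrates them on plain lists; the function names and the sample concentrations are hypothetical, and real output fields would be gridded arrays rather than short lists.

```python
def zero_out_contribution(base, without_source):
    """Contribution of a sector: base field minus the field with it removed."""
    return [b - w for b, w in zip(base, without_source)]

def half_perturbation_contribution(base, halved):
    """Contribution of ICON or BCON from a 50% reduction, scaled up by 2."""
    return [2.0 * (b - h) for b, h in zip(base, halved)]

base_o3 = [40.0, 50.0]          # illustrative base-case O3 at two grid cells
no_onroad_o3 = [35.0, 44.0]     # illustrative run with ONROAD removed
half_bcon_o3 = [30.0, 38.0]     # illustrative run with BCON reduced by 50%

print(zero_out_contribution(base_o3, no_onroad_o3))           # [5.0, 6.0]
print(half_perturbation_contribution(base_o3, half_bcon_o3))  # [20.0, 24.0]
```

Because the chemistry is nonlinear, contributions computed this way are not guaranteed to sum to the base concentration, and they can be negative where removing a source raises the modeled concentration.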

Tracked species classes
O3, NOx and VOC species were tracked by each method. As mentioned above, ISAM tracks individual oxidized nitrogen and VOC species based on the selected chemical mechanism in CMAQ, whereas OSAT tracks tracer families for each. To facilitate the comparison between the two models, the ISAM species were aggregated in the same fashion as OSAT (Table 4). However, some differences still exist since species representations between the two models are not completely the same. The nitrogen groupings NOy and RNOx (Table 4) were added to better elucidate the behavior of each model under different O3-producing chemical regimes.

Evaluation method and case study development
Although identical emissions and meteorological inputs are used for CAMx and CMAQ (Table 2), potential differences still exist in multiple scales and processes. Shu et al. (2017, 2022) have reported that deposition is one of the largest uncertainties between the two models when other processes are constrained. For inter-comparing ISAM and OSAT, it is not feasible to constrain all process uncertainties. Thus, rather than comparing ISAM and OSAT throughout the entire simulation period, we established criteria to choose representative days based on the performance of their parent models, in order to reduce the differences that the parent models may introduce.
We initially required the correlation (R2) of maximum daily 8-hour averaged (MDA8) O3 between CMAQ and CAMx to be above 0.7 to ensure that the performance of the two parent models is comparable. MDA8 O3 was also used as the indicator for case study selection since ISAM and OSAT are normally used in regulatory applications with this metric. We assessed the mean bias (MB) of MDA8 O3 for every day to choose the days on which both models have the lowest MB for predicted MDA8 O3. To do so, CMAQ and CAMx simulated ambient concentrations were paired in space and time with observed data from the Air Quality System (AQS, https://www.epa.gov/aqs) monitoring network. Hourly concentrations of total O3, NO and NO2 were also compared to the AQS observations, and their bias statistical metrics were calculated as well.
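The selection metrics named above can be computed as in the following Python sketch. It is illustrative only: the function names are hypothetical, R2 is assumed to be the squared Pearson correlation coefficient, the normalized mean bias (NMB) is assumed to be the summed bias divided by the summed observations, and MDA8 is simplified to the largest 8-hour running mean within the supplied hourly series rather than the full regulatory definition.

```python
def mda8(hourly):
    """Largest 8-hour running mean in an hourly series (simplified MDA8)."""
    windows = [sum(hourly[i:i + 8]) / 8.0 for i in range(len(hourly) - 7)]
    return max(windows)

def mean_bias(model, obs):
    """Mean of (model - observation) over paired values."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)

def normalized_mean_bias(model, obs):
    """Summed bias as a percentage of summed observations."""
    return 100.0 * sum(m - o for m, o in zip(model, obs)) / sum(obs)

def r_squared(model, obs):
    """Squared Pearson correlation between paired values."""
    n = len(obs)
    mm, om = sum(model) / n, sum(obs) / n
    cov = sum((m - mm) * (o - om) for m, o in zip(model, obs))
    var_m = sum((m - mm) ** 2 for m in model)
    var_o = sum((o - om) ** 2 for o in obs)
    return cov * cov / (var_m * var_o)

# Illustrative paired MDA8 O3 values (ppbv) at three monitors.
model = [42.0, 38.0, 51.0]
obs = [40.0, 40.0, 48.0]
print(mean_bias(model, obs), normalized_mean_bias(model, obs), r_squared(model, obs))
```

Under the criteria in this section, a day would qualify when the R2 between the two models' MDA8 O3 exceeds 0.7 and both models show a low MB against the AQS observations.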

Model performance evaluation and case study selection
Figure 1 shows observed site-averaged MDA8 O3 and its corresponding biases predicted by CMAQ and CAMx over paired AQS sites for the entire episode. Observed site-averaged MDA8 O3 ranges from 30 to 50 ppbv. The performance of the two models for predicting MDA8 O3 varies by paired day and monitor site, with biases ranging from approximately -23 to 35 ppbv. Table S1 summarizes R2 and MB of MDA8 O3 for each day for both models. Based on the criteria introduced in Section 3.4, there are 13 days on which the two models show very good correlation. Among these days, both models perform well in predicting MDA8 O3, with the closest MB, on Aug 09th (CMAQ/CAMx = 3.09/2.99 ppbv) and 10th (CMAQ/CAMx = 2.42/2.61 ppbv). On other days, either both models have higher MB (> 10 ppbv), or their predictions do not agree well with each other, with a difference in MB of up to 8 ppbv. Therefore, Aug 09th and 10th were selected as a two-day case study for source apportionment comparisons. Additional evaluations of hourly O3, NO and NO2 are available in Fig. S1 of the supplemental information (SI). From Fig. 2 The differences of MB, NMB and R2 between the two models also diminish for MDA8 O3 but increase for hourly O3 from the monthly episode to the two-day episode. The statistical metrics of hourly O3 and MDA8 O3 demonstrate that the selected two-day case is suitable for a source apportionment comparison in which CAMx and CMAQ not only both have the least-biased predictions compared to observations but also show good agreement with each other.
R2 ranges from 0 to 1, with 1 indicating perfect correlation and 0 indicating an uncorrelated relationship.

Temporal variations of sector contributions
To better understand how the ISAM model apportionment approach simulated source contributions at each time step, time-series comparisons for each source were examined for O3 and its precursors, RNOx and VOC, for the two-day case study. Figure 3 shows hourly variations of domain-averaged predicted total O3 (bulk) concentrations and sector contributions for seven source apportionment simulations (OSAT, BF, ISAM OP1 to OP5). In Fig. 3, CMAQ and CAMx predict similar O3 concentrations during the day, but differences appear at night, with a maximum difference of 5 ppb. This disparity was discussed in Section 4.1 and can be mitigated by employing the MDA8 O3 metric. The seven source apportionment simulations yield similar diurnal trends in total concentrations, but they apportion concentrations to each sector somewhat differently.
Comparisons of the five ISAM options reveal significant variability. OP1, which apportions uniformly according to stoichiometry, shows similar trends of apportionments for each sector as OP4, an option that always allocates products to sources with reactive VOCs and their radicals. They both apportion more O3 to BCON and BIO but less to all other sectors than the other three ISAM options (OP2, OP3 and OP5). Results of OP1 and OP4 would likely overestimate sensitivity to emissions of these reactants because VOCs are often available in excess. OP2 always allocates products to sources with nitrogen reactants, which prevents the attribution of NOx to non-nitrogen reactants. Typically, these non-nitrogen reactants are common in transported (e.g., BCON) or natural sources (e.g., isoprene in BIO). As a result, OP2 decreases BCON and BIO contributions while increasing contributions from other sectors relative to OP1 and OP4.
OP5 assigns products to either reactive VOCs or NOx based on the ratio of PH2O2/PHNO3, placing O3 contribution results for all sectors between the previous four ISAM options. OSAT, which uses a methodology similar to OP5, shows diurnal patterns of domain-averaged total O3 and sector contributions consistent with the ISAM options, but with varying magnitudes. OSAT has the largest BCON O3 but the smallest contributions from AREA, BIO and FIRE. The rest of the OSAT sector contributions are between the ISAM options. Consistent with earlier findings, CMAQ-BF estimates systematically smaller O3 contributions for all sectors besides EGU and BCON (Kwok et al., 2015). While ISAM and OSAT appear to retain bulk mass as intended, CMAQ-BF shifts the chemical system into a different nonlinear O3 response to source change.
In Fig. 4, CAMx and CMAQ predict comparable total RNOx except for the first 12 hours of the two-day example, when OSAT values deviate from those of the other six simulations. As the total concentrations of the two models converge, OSAT exhibits similar patterns to OP2 and OP3. OP1, OP4 and OP5 show comparable results, with increased BCON and BIO RNOx but decreased contributions from other sectors. CMAQ-BF shows results comparable with OSAT, OP2 and OP3 except for BCON and BIO, which are negative for CMAQ-BF, suggesting that removing these source sectors results in a slight rise in RNOx. Previous source sensitivity and allocation investigations have shown that BF may have limitations when the model response contains an indirect effect coming from the influence of substances other than the direct precursors (Kwok et al., 2015; Burr and Zhang, 2011; Koo et al., 2009; Jimenez and Baldasano, 2004; Zhang et al., 2009). This would be particularly true in situations where emissions are a large percentage of total NOx or VOC in a particular area. The nonlinear impacts on gas-phase chemistry realized in a source sensitivity model simulation would not be a relevant representation of culpability from that same source group.
Figure 5 illustrates the hourly variability of domain-averaged VOC concentrations and sector contributions. CAMx only gives pre-lumped VOC (Table 4) for OSAT outputs. For consistency, VOC for CMAQ ISAM and BF has also been carbon-weighted by summing all individual VOC species in CMAQ outputs using the same method as OSAT (Table 4). In Fig. 5, CAMx consistently simulates higher total VOC concentrations than CMAQ, with a maximum difference of 30 ppb. These larger CAMx VOC concentrations are also reflected in apportioned OSAT sectors, particularly those with substantial contributions, such as BCON and BIO. Given that the difference is present in the total concentration, it is unlikely to be caused by the different source apportionment formulations in CMAQ and CAMx. As CAMx only gives pre-lumped VOC, it is challenging to compare individual VOC species between CMAQ and CAMx to explain this difference at the current stage. Another possible reason is that the models have different internal treatments for advection and diffusion, which can impact surface-level concentrations and indirectly impact chemical reactions. The five ISAM options have comparable diurnal patterns for most sectors, with the exception of CMV, EGU, and RAIL; however, the magnitudes for these three sectors are relatively minor, which is consistent with earlier findings (Kwok et al., 2015). CMAQ-BF estimates notably lower sector contributions for VOCs, which is similar to the O3 results (Fig. 3), with negative contributions for small sectors (e.g., CMV, EGU, and RAIL). Additional figures of other grouped nitrogen species tracked in Table 4 (e.g., RGN, HNO3 and NOy) can be found in SI.
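The carbon weighting of CMAQ VOC species mentioned above can be illustrated with a short Python sketch. The species subset and the function name are assumptions made for illustration; the carbon numbers are the nominal values for these Carbon Bond species, and the full species list used in this study is the one defined in Table 4.

```python
# Nominal carbon numbers for a few Carbon Bond species (illustrative subset).
CARBON_NUMBER = {
    "PAR": 1, "ETH": 2, "ETHA": 2, "OLE": 2, "ISOP": 5,
    "FORM": 1, "ALD2": 2, "TOL": 7, "XYL": 8,
}

def carbon_weighted_voc(conc_ppb):
    """Sum species concentrations (ppb) weighted by carbon number (ppbC)."""
    return sum(CARBON_NUMBER[sp] * c for sp, c in conc_ppb.items())

# Illustrative concentrations for one grid cell and hour.
print(carbon_weighted_voc({"ISOP": 2.0, "FORM": 3.0}))  # 5*2.0 + 1*3.0 = 13.0
```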

Spatial distribution of source apportionment simulations
Spatial patterns of total and sector contributions of MDA8 O3 (Fig. 6), RNOx (Fig. 7) and VOC (Fig. 8) have been examined for the seven simulations. In Fig. 6, OSAT exhibits the same spatial distribution of MDA8 O3 total concentrations as other CMAQ-based simulations (OP1, OP2, OP3, OP4, OP5, and CMAQ-BF), with the exception of OSAT's relatively high marine and offshore total concentrations (> 5 ppbv), which could be explained by the difference in planetary boundary layer dynamics or different marine chemistry configuration between the two parent models. CMAQ CB6R3 uses a rough parameterization for full marine halogen chemistry to destroy O3, depending only on land-use category and sunlight (Sarwar et al., 2015, 2019), whereas CAMx CB6R4 handles O3 depletion in the marine boundary more efficiently by including the 16 most important reactions of inorganic iodine (I-16b, Emery et al., 2016b). According to a sensitivity test conducted by Emery et al. (2016b), I-16b could reduce O3 depletions by 2-5 ppbv in comparison to full halogen chemistry. Regarding sector concentrations, the spatial distributions of the seven simulations are comparable. They can all capture geographic contribution hot spots from each sector, although their magnitudes vary. OP2 stands out with fewer contributions from BIO than the other four ISAM options, and subsequently assigns larger concentrations to other sectors, particularly over east coastal regions, as shown in Figs. 3 and 6. Since OP2 assigns all products to sources with nitrogen reactants, the influence of reactants from biogenic sources is diminished, as intended.
Figure 7 depicts the corresponding results for RNOx. Except for BCON, the seven simulations produce geographically and quantitatively consistent findings. From the spatial distributions, we conclude that, compared to O3, RNOx is governed more by local sources than by long-range transported sources. Anthropogenic RNOx is concentrated either in urban areas (e.g., AREA, NONEGU, NONROAD), around the gasoline industry (OILGAS) and electric facilities (EGU), or along transportation routes (e.g., AIRP, ONROAD, CMV and RAIL). Biogenic RNOx is more prevalent in rural locations with vegetation. It should be noted that OP1, OP4 and OP5 show more BCON RNOx across the entire domain because of the way products are assigned in nitrogen-related reactions (Section 2). OP1, OP4 and OP5 also show local hotspots of RNOx attributed to BCON. Since there is no physical reason to expect boundary-derived hotspots over urban areas, we conclude that these contributions represent RNOx attributed on the basis of VOC or oxidants transported from the boundary. Figure 8 depicts the corresponding results for VOC.
The higher VOC concentrations from CAMx already shown in Fig. 5 originate primarily from Virginia and North Carolina (OSAT bulk). As CMAQ and CAMx both use the same BEIS inventory data, the difference in total VOC concentrations may result from other differences between the two models, such as chemistry or deposition, which in turn lead to higher biogenic contributions in CAMx (BIO). For the remaining sectors, OSAT and the ISAM options are fairly consistent, except that OP2 predicts larger contributions from EGU, CMV and RAIL. CMAQ-BF predicts consistently lower source contributions for MDA8 O3, RNOx, and VOC, as shown in Section 4.2.1. This again illustrates that brute force represents an integrated sensitivity, while OSAT and ISAM represent attribution at a point in the nonlinear chemical system. Monthly averaged spatial maps for MDA8 O3, RNOx, and VOC are included in Fig. S4(a-c) and are consistent with the two-day averaged maps. This indicates that the case study is appropriate: the selected days are representative, and uncertainties from the parent models (CMAQ and CAMx) are kept small. Additional figures of other grouped nitrogen species tracked in Table 4 (e.g., RGN, HNO3 and NOy) can also be found in the SI.
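The distinction between an integrated sensitivity and a point attribution can be illustrated with a toy calculation. The sketch below is not CMAQ, CAMx, or any real chemistry: `toy_ozone` is a hypothetical concave response function chosen only so that the numbers are easy to follow, and the two sources "A" and "B" are invented for illustration. It shows why zero-out (brute-force) differences need not sum to the base concentration in a nonlinear system, whereas a tagged attribution sums to the total by construction.

```python
import math

def toy_ozone(precursor_total):
    """Hypothetical concave response: 'ozone' grows with the square root
    of the total precursor amount. Purely illustrative, not real chemistry."""
    return math.sqrt(precursor_total)

def zero_out_contributions(emissions):
    """Brute-force style estimate: base run minus a run with one source removed."""
    total = sum(emissions.values())
    base = toy_ozone(total)
    return {name: base - toy_ozone(total - amount)
            for name, amount in emissions.items()}

def proportional_attribution(emissions):
    """Tagging style estimate: split the base value by each source's share
    of the precursor, so the contributions sum to the base value."""
    total = sum(emissions.values())
    base = toy_ozone(total)
    return {name: base * amount / total for name, amount in emissions.items()}

# Two invented sources in arbitrary units.
emissions = {"A": 9.0, "B": 16.0}
base_value = toy_ozone(sum(emissions.values()))
bf = zero_out_contributions(emissions)
sa = proportional_attribution(emissions)

print("base:", base_value)
print("zero-out:", bf, "sum:", sum(bf.values()))
print("attribution:", sa, "sum:", sum(sa.values()))
```

In this toy case the zero-out differences sum to less than the base value, while the proportional attribution recovers it. The direction and size of such a gap in a real model depend on the chemical regime.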

Model Simulation Time
The CPU time required to complete a source apportionment simulation in a 3D AQM is an important consideration for usability. For a 4 km x 4 km simulation domain encompassing the northeast U.S., the model run times for OSAT and ISAM are similar. Base CMAQ (without ISAM) and CMAQ-ISAM simulations (11 source categories) are tested using 128 processors. Base CMAQ requires around 60 minutes per simulation day (24 hours), whereas CMAQ-ISAM requires approximately 120 minutes. If the number of processors is increased to 256, the simulation time for CMAQ-ISAM can be reduced by 30 minutes, showing good scalability. It is worth noting that our CMAQ-ISAM simulations simultaneously track all additional species classes, such as sulfate, nitrate, ammonium, elemental carbon, organic carbon, and chloride. Simulation times would be shorter if only the species related to O3 were tracked. Base CAMx (without OSAT) and CAMx-OSAT are also tested with 128 processors, taking 37 and 67 minutes, respectively. CAMx also provides an optional tool for particles that can be applied simultaneously, similarly to ISAM (PSAT, Yarwood et al., 2007). When additional pollutants are selected for tracking (e.g., sulfate, primary PM2.5 species, etc.), total simulation time will increase for both ISAM and OSAT/PSAT. The CMAQ-BF run time is estimated from the base CMAQ simulation: 60 min/day x (1 base + 11 sources + 1 boundary condition + 1 initial condition + 1 other) = 900 min/day.
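The run-time comparison above can be reproduced with a few lines of arithmetic. The sketch below uses only the per-day timings quoted in the text (128 processors); the function and variable names are illustrative and do not correspond to any CMAQ or CAMx utility.

```python
# Wall-clock minutes per simulated day, as quoted in the text (128 processors).
BASE_CMAQ_MIN = 60
CMAQ_ISAM_MIN = 120
BASE_CAMX_MIN = 37
CAMX_OSAT_MIN = 67

def brute_force_minutes(base_minutes, n_sources, n_extra_runs=3):
    """Minutes per simulated day for a full zero-out set: one base run,
    one run per tracked source, plus the extra runs listed in the text
    (boundary condition, initial condition, other)."""
    return base_minutes * (1 + n_sources + n_extra_runs)

def overhead_ratio(with_apportionment, base):
    """Run time with source apportionment relative to the base model."""
    return with_apportionment / base

bf_total = brute_force_minutes(BASE_CMAQ_MIN, n_sources=11)
isam_overhead = overhead_ratio(CMAQ_ISAM_MIN, BASE_CMAQ_MIN)
osat_overhead = overhead_ratio(CAMX_OSAT_MIN, BASE_CAMX_MIN)

print("CMAQ-BF minutes/day:", bf_total)
print("ISAM overhead factor:", isam_overhead)
print("OSAT overhead factor:", round(osat_overhead, 2))
print("BF cost relative to one ISAM run:", bf_total / CMAQ_ISAM_MIN)
```

With these inputs, a single tagged run roughly doubles the base run time in either model, while the brute-force set costs several times more than one CMAQ-ISAM run because every source requires its own simulation.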

Discussions and Conclusions
Source attribution approaches are generally intended to determine the culpability of precursor emission sources for ambient pollutant concentrations. Source-based apportionment approaches such as ISAM and OSAT provide similar types of information, specifically an estimate of which sources or groups of sources (e.g., a sector) contributed to the air quality measured or estimated at a particular location. The assumptions in each technique have implications for interpretation in the context of air quality management.
Source attribution of secondarily formed pollutants cannot be explicitly measured, which makes evaluation of source apportionment approaches challenging. Here, the ISAM approach was evaluated by 1) a comparison with a source apportionment approach implemented in a different photochemical modeling system and 2) a comparison with a simple source sensitivity (brute-force difference) approach in the same modeling system, which is most comparable to source apportionment in more linear systems and less useful when formation and transport are nonlinear. Further, this section notes the qualitative consistency between the spatial nature of sector emissions and the attribution of precursors and O3 as another way to build confidence in these approaches.
In this study, comparisons among multiple apportionment approaches show common features but still reveal wide variations in predicted sector contributions and in species dependency. The attribution to sources emitting NOx and VOC is consistent with the spatial nature of these sources, which provides confidence in the approach. However, nitrogen species (e.g., NOx) are more sensitive to the choice of ISAM option than VOC. For example, although the attribution of NOx to EGUs matches the location of these sources (e.g., New York urban area) for all ISAM options, OP1, OP4 and OP5 predict more BCON NOx. This is because the fast NOx cycling process assigns anthropogenically emitted nitrogen species to other sources, as the original emitted source identity is not retained through these complex reactions. Further, sources located entirely offshore, such as commercial marine vessels, do not have culpability assigned to distant inland regions of the model domain. In most cases, the amount of attribution to a given sector depends on the amount of emissions from that sector, how far away those emissions are, and whether the prevailing winds carried emissions from those places to the monitor or grid cell where air quality was predicted.
The five ISAM options are designed to maximize flexibility, particularly for modeling source apportionment of O3 and its precursors, but the choice of option depends on the target species. Among all ISAM options, OP5, which makes the assignment decision based on the ratio of PH2O2 to PHNO3, is expected to predict spatial and temporal patterns for O3 that are generally similar to those of the OSAT source apportionment approach implemented in CAMx. However, it still shows disparities for some sectors (e.g., biogenic sectors for O3). This may be because the OSAT formulation differs from the ISAM options presented here. OP5 was also similar to the brute-force sensitivity estimates predicted in CMAQ, with the exception of source groups that dominate regional emissions or O3, such as biogenic VOC and O3 introduced into the model through boundary inflow. In those situations, it is not reasonable to expect a source sensitivity approach to provide a useful comparison for source attribution, given the highly nonlinear change in atmospheric chemistry. By assigning products to sources emitting nitrogen reactants, OP2 predicts RNOx attributions that are more comparable to OSAT and BF. This demonstrates that OP2 works better for RNOx because it makes it easier to trace the original source and lessens the influence of other sources when these species cycle quickly through an integrated chemical reaction system. Unlike O3 and RNOx, the VOC contribution for the majority of source categories depends very little on the ISAM option. We expect that users will choose OP5 for O3 and OP2 for RNOx, but this is not a firm recommendation. Instead, we leave this choice to the user so that ISAM can be used for a wide range of purposes.
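The attribution rules discussed here can be sketched for a single product formed from one nitrogen-containing reactant and one VOC reactant. This is a minimal illustration under stated assumptions, not the ISAM source code: the function name, the rule labels, and the example source fractions are hypothetical, and the threshold on PH2O2/PHNO3 is left as a caller-supplied parameter because its value is not given in this section. The direction of the ratio test (larger ratio treated as NOx-limited, so the product follows the nitrogen reactant) is also an assumption of this sketch.

```python
def attribute_product(nitrogen_fracs, voc_fracs, rule,
                      p_h2o2=None, p_hno3=None, threshold=None):
    """Return source fractions for a product of one nitrogen reactant and
    one VOC reactant. Each input maps source name -> fraction (sums to 1).

    rule: "nitrogen" (inherit from nitrogen reactant, as described for OP2),
          "voc" (inherit from VOC reactant),
          "equal" (split equally between reactants),
          "ratio" (choose nitrogen or VOC from PH2O2/PHNO3, as described
                   for OP5; threshold and direction are assumptions here).
    """
    if rule == "ratio":
        if p_h2o2 is None or p_hno3 is None or threshold is None:
            raise ValueError("ratio rule needs p_h2o2, p_hno3 and threshold")
        # Compare without dividing so that p_hno3 == 0 is handled safely.
        rule = "nitrogen" if p_h2o2 > threshold * p_hno3 else "voc"

    weights = {"nitrogen": (1.0, 0.0), "voc": (0.0, 1.0), "equal": (0.5, 0.5)}
    if rule not in weights:
        raise ValueError(f"unknown rule: {rule}")
    w_n, w_v = weights[rule]

    sources = sorted(set(nitrogen_fracs) | set(voc_fracs))
    return {s: w_n * nitrogen_fracs.get(s, 0.0) + w_v * voc_fracs.get(s, 0.0)
            for s in sources}

# Invented example fractions for the two reactants.
nitrogen_fracs = {"ONROAD": 0.75, "BCON": 0.25}
voc_fracs = {"BIO": 0.5, "BCON": 0.5}

by_nitrogen = attribute_product(nitrogen_fracs, voc_fracs, "nitrogen")
by_equal = attribute_product(nitrogen_fracs, voc_fracs, "equal")
# The threshold below is a placeholder value chosen only for this example.
by_ratio = attribute_product(nitrogen_fracs, voc_fracs, "ratio",
                             p_h2o2=1.0, p_hno3=1.0, threshold=0.35)
print(by_nitrogen)
print(by_equal)
print(by_ratio)
```

Under the nitrogen rule the biogenic share of the product is zero, which mirrors the reduced BIO contribution noted for OP2, whereas the equal split passes part of the VOC reactant's biogenic and boundary shares to the product.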
By comparing the multiple approaches in the Northeast U.S., we found that both OSAT and ISAM attribute the majority of O3 and NOx contributions to boundary, mobile, and biogenic sources, whereas the top three VOC contributions are attributed to biogenic, boundary, and area sources.
However, comparisons of OSAT and ISAM have limitations, especially because they run under two different parent models, CAMx and CMAQ. Although we have tried to reduce the differences between the two models by making most configuration options as similar as possible, some uncertainties cannot be eliminated at the current stage of this study (e.g., an imperfect match of chemical mechanisms, and different internal treatments of advection, diffusion, and deposition processes). Further, it is worth noting that the results of this study are based on a limited duration and a specific region, and they may not reflect all situations. Given that the source attribution of secondary pollutants cannot be explicitly measured, these inter-comparisons between ISAM and OSAT are still useful for reference. Further efforts that combine field experiments and model evaluations over longer periods and multiple regions are needed to better understand source attribution, given the highly nonlinear nature of O3-NOx chemistry.
not track INO3
2 ISAM does not track OPAN
3 ISAM does not track CRON
4 OSAT VOC has been pre-calculated as equation in Table 4

Fig. 1 Observed site-averaged MDA8 O3 and its corresponding biases predicted by CMAQ and CAMx over paired AQS sites for the entire episode. R² shows the correlation between CMAQ and CAMx.

Fig. 3 Total and attributed O3 concentrations to various sectors as a function of hour of day and apportionment technique.

Fig. 4 Total and attributed RNOx concentrations to various sectors as a function of hour of day and apportionment technique.

Fig. 5 Total and attributed VOC concentrations to various sectors as a function of hour of day and apportionment technique.

Table 1. Expanded CMAQ-ISAM options.
Emery et al., 2015) for chemistry. Similarly, all base meteorological and emissions inputs for CAMx were identical to those for CMAQ but were processed using CAMx appropriate data pre-processors (https://www.camx.com). The CAMx model was configured with Carbon Bond 6 version 4 (CB6R4, Emery et al., 2016a) chemical mechanism. It is (Henderson et al., 2014) 5.3.2 with modified ISAM and CAMx version 7.10 with OSAT3, are used to simulate a one-month period during the summer of 2018 (July 29th to August 30th). The summary of the two model configurations is presented in Table 2. Both models are applied to the same horizontal modeling domain with 4 km x 4 km resolution encompassing the northeastern United States. This domain is nested within a larger 12 km domain that encompasses the entire contiguous United States, which is used for providing simulation boundary and initial conditions (BC and IC) for the 4 km domain. BCs were generated for the 12 km simulations using a hemispheric application of the GEOS-Chem model (Henderson et al., 2014) that was run for 2018. Identical ICs and BCs were applied to the two models. Anthropogenic emissions were based on version 1 of the 2016 National Emission