Chemical identification of new particle formation and growth precursors through positive matrix factorization of ambient ion measurements

Abstract. In the lower troposphere, rapid collisions between ions and trace gases result in the transfer of positive charge to the highest proton affinity species and negative charge to the lowest proton affinity species. Measurements of the chemical composition of ambient ions thus provide direct insight into the most acidic and basic trace gases and their ion-molecule clusters, compounds thought to be important for new particle formation and growth. We deployed an atmospheric pressure interface time-of-flight mass spectrometer (APi-ToF) to measure ambient ion chemical composition during the 2016 Holistic Interaction of Shallow Clouds, Aerosols, and Land Ecosystems (HI-SCALE) campaign at the United States Department of Energy Atmospheric Radiation Measurement facility in the Southern Great Plains (SGP), an agricultural region. Cations and anions were measured for alternating periods of ~24 hours over one month. We use binned positive matrix factorization (binPMF) and generalized Kendrick analysis (GKA) to obtain information about the chemical formulas and temporal variation in ionic composition without the need for averaging over a long timescale or a priori high-resolution peak fitting. Negative ions consist of strong acids including sulfuric and nitric acid, organosulfates, and clusters of NO₃⁻ with highly oxygenated organic molecules (HOMs) derived from monoterpene (MT) and sesquiterpene (SQT) oxidation. Organonitrates derived from SQTs account for most of the HOM signal. Combined with the diel profiles and back trajectory analysis, these results suggest that NO₃ radical chemistry is active at this site. SQT oxidation products likely contribute to particle


Introduction
Ambient gas-phase ions, both molecular ions and ion-molecule clusters, determine atmospheric electrical properties and promote new particle formation (NPF) (Hirsikko et al., 2011; Shuman et al., 2015; Kirkby et al., 2016). In the troposphere, ambient ions are typically present at low concentrations (~10² to 10³ cm⁻³) and have lifetimes of ~10² to 10³ s controlled by loss to aerosols and ion-ion recombination (Shuman et al., 2015). Recently, ambient ion measurements have been of particular interest because of the insights they can provide into NPF (e.g., Kirkby et al., 2016; Bianchi et al., 2016; Jokinen et al., 2018; Yin et al., 2021). Measurements of ionic composition, however, have long been used to provide insight into trace gas chemical composition (Perkins and Eisele, 1984; Eisele and Tanner, 1990; Viggiano, 1993; Möhler et al., 1993; Krieger and Arnold, 1994), with much of the early work laying the foundation for the development of chemical ionization mass spectrometry for measurements of gas-phase neutral compounds.
Ion chemistry in the lower troposphere is mostly driven by Brønsted acid-base and ligand switching reactions, with protons transferred from species with lower gas-phase proton affinities to species with higher gas-phase proton affinities (Shuman et al., 2015). Therefore, anions are commonly derived from strong acids such as nitric acid and sulfuric acid (Perkins and Eisele, 1984; Eisele and Tanner, 1990; Eisele, 1988). Highly oxygenated organic molecules (HOMs) derived from monoterpenes (MTs) have frequently been observed clustered with HSO₄⁻ or NO₃⁻ (Ehn et al., 2010, 2012; Bianchi et al., 2017; Beck et al., 2022). In the boreal forest, HOMs derived from sesquiterpenes (SQTs) have also been observed as naturally charged clusters (Jokinen et al., 2016). HOMs are of interest due to their role in NPF and growth; however, measurements of HOMs in diverse ecosystems are limited. Ambient cations are typically bases such as ammonia, alkylpyridiniums, and alkylamines (Eisele and Tanner, 1990; Perkins and Eisele, 1984; Eisele, 1983, 1988). Ambient ion measurements of positive ions have also detected series of high mass ions at m/z 250-450 (Eisele, 1988; Junninen et al., 2010; Ehn et al., 2010; Frege et al., 2017). These compounds are likely organic bases containing an odd number of nitrogen atoms (Eisele, 1988; Ehn et al., 2010) or clusters of organic compounds with ammonium (Frege et al., 2018), but their exact molecular formulas remain unknown. Measurements of atmospheric reduced nitrogen compounds remain limited (Lee, 2022), and better characterization of these ions will inform our understanding of organic reduced nitrogen chemistry.
Although measurements of ambient ion chemical composition have been a powerful tool to understand small ionic clusters and probe neutral trace gases, extracting chemical information from ambient ion measurements at high time resolution (minutes) presents a challenge because of the low signal-to-noise ratio. Additionally, charge competition presents challenges for using ion data for interpretation of neutral species because changes in signal intensity for a given ion may be due to changes in the composition and/or concentration of other species competing for charges rather than changes in the concentration of the neutral species corresponding to that ion. Nevertheless, ambient ion measurements are advantageous in that they are easier to perform than active ionization techniques. The compounds observed as ambient ions will be the most acidic and basic species and their clusters, providing direct insight into molecules that are crucial to understanding NPF, particle growth, and reduced nitrogen chemistry. Circumventing low signal-to-noise and increasing time resolution of ambient ion measurements could enable the use of more powerful data analysis techniques, such as positive matrix factorization, providing greater insight into these processes.
While croplands and rangelands account for ~42% of global land area (Ellis et al., 2020), the atmospheric chemistry of trace gases and aerosols remains relatively understudied for these land use categories. In general, agricultural emissions and their impacts on aerosols, air quality, and the nitrogen cycle are insufficiently understood. Agricultural activities and processes are known to emit reduced nitrogen, sulfur, and carbon species that are thought to play a role in NPF; however, large uncertainties remain in the emissions inventories and chemical fates of these compounds (Aneja et al., 2009).
Agricultural regions are a promising location to investigate the potential importance of atmospheric bases to NPF because reduced nitrogen compounds used in fertilizers represent the largest source of anthropogenic nitrogen (Fowler et al., 2013), and some organic reduced nitrogen compounds, particularly amines, have been identified in regions of the atmosphere impacted by agriculture (Ge et al., 2011). Although several bases have been hypothesized to contribute to NPF (Glasoe et al., 2015; Jen et al., 2016; Olenius et al., 2017; Myllys et al., 2019; Cai et al., 2022), many of these bases have yet to be measured in the ambient atmosphere. Emissions of certain VOCs, such as methanol and acetone, have been well studied in agricultural regions (Loubet et al., 2022). However, it is the more reactive, albeit typically less abundant, VOCs and their associated oxidation products that will be most important for aerosol budgets, and these remain poorly understood.
Measurements that constrain reduced nitrogen and reactive carbon species are therefore required for a more complete understanding of NPF, aerosol growth, and overall organic aerosol mass in agricultural regions.
The Department of Energy Atmospheric Radiation Measurement Southern Great Plains (SGP) research station, located in an agriculturally intensive region, hosts an extensive array of instruments for atmospheric measurements. Previous measurements at the site have investigated aerosol chemical composition (Parworth et al., 2015; Chen et al., 2018; Liu et al., 2021a; Vandergrift et al., 2022) and growth pathways (Hodshire et al., 2016), and have suggested that diamines may contribute to NPF at this location (Jen et al., 2016). Gas-phase measurements of compounds thought to be important for NPF and growth, such as extremely low volatility organic compounds and a broad array of reduced nitrogen species, are lacking, and thus these species remain poorly constrained.
In this work, we use measurements made with an atmospheric pressure interface time-of-flight mass spectrometer (APi-ToF; Junninen et al., 2010) at the SGP site to provide insight into trace gas species that contribute to reactive nitrogen and aerosol formation and growth. We demonstrate that binned positive matrix factorization (binPMF; Zhang et al., 2019), particularly when coupled with generalized Kendrick analysis plots (Alton et al., 2022) for visualization of mass spectra, is an effective method to extract molecular composition from low signal-to-noise datasets on a rapid timescale and without the need for a priori high-resolution peak identification and fitting. We identify a variety of reduced nitrogen compounds, including bases such as alkylpyridiniums and amines, higher m/z organonitrate (ON) species, and HOMs derived from both MTs and SQTs. With our binPMF results and back trajectory analysis, we postulate sources and chemical processing pathways that are controlling the temporal variation of ambient ion chemical composition.

HI-SCALE 2016 Campaign
The Holistic Interaction of Shallow Clouds, Aerosols, and Land Ecosystems (HI-SCALE) campaign occurred between 24 April 2016 and 23 September 2016 with two 4-week intensive observational periods. The goal of the campaign was to improve the understanding of how interactions between land and clouds impact the atmospheric radiation budget and hydrologic cycle (Fast et al., 2019). The data presented here were collected during the second intensive (28 August 2016 - 24 September 2016) at the guest instrumentation facility at the Atmospheric Radiation Measurement Southern Great Plains site. The site has been described in detail elsewhere (Sisterson et al., 2016). It is surrounded by agricultural land used mainly for livestock pastures and the cultivation of winter wheat and soybeans (Mills et al., 2016; USDA-NASS, 2016). Oil and natural gas are extracted in the area, mostly to the west of the site (Pritchett, 2014). There are several small cities (population < 50,000) within 100 km of the site. The larger metropolitan areas of Tulsa, Oklahoma City, and Dallas-Fort Worth are all at least 100 km from the site; however, the site experiences aged anthropogenically influenced air masses under favorable transport conditions (Parworth et al., 2015).

APi-ToF Measurements
The chemical composition of ambient ions was characterized using an atmospheric pressure interface long time-of-flight mass spectrometer (APi-LToF, Tofwerk AG and Aerodyne Research Inc; Junninen et al., 2010). Ambient air was sampled through a 0.9 m long, 2.54 cm outer diameter stainless steel tube at a flow rate of 9 SLPM. From the 9 SLPM flow, 3 SLPM was subsampled through an 8 cm stainless steel tube with a 10 mm outer diameter. Ambient ions entered the instrument through a 0.3 mm pinhole. For the first half of the campaign (31 August 2016 - 11 September 2016), flow into the instrument was 0.8 SLPM. Partway through the campaign the inlet became clogged. The clog could not be completely resolved, and the sample flow was reduced to 0.5 SLPM during the second half of the field campaign (13 September 2016 - 23 September 2016). The pressure in the small segmented quadrupole (SSQ), the first chamber of the APi-ToF, was adjusted to be approximately constant before and after the clog. The potential effects of the clog on expansion and declustering within the APi-ToF were evaluated using water clusters in the positive mode and found to be minimal (see Sect. S1). A slight decrease (~10%) in total positive ion counts was observed while total negative ion counts remained approximately constant.
Slight changes in total ions do not have direct implications for the results of this work, which are based on changes in ion composition that would not result from a clog. We discuss the changes and the potential impacts on the measurements and interpretation in greater detail in Sect. S1. Mass spectra were collected over a range of m/z 10 to 1700 at a rate of 0.1 Hz. The APi-ToF was operated in "high mass" mode, which enhances transmission of ions at higher (> 200) m/z. It was switched between positive and negative polarity approximately every 24 hours. Mass resolving power was approximately 7100 for positive ion and 6500 for negative ion data. We note that direct comparisons of measurements by APi-ToF instruments are difficult because the tuning of each instrument will result in unique m/z transmission functions and differing degrees of cluster fragmentation. A transmission correction was not applied to the data presented in this work.

binPMF
Measurements were post-processed in IGOR Pro 8.04 (Wavemetrics, Lake Oswego, OR, USA) using Tofware v3.2.2 (Stark et al., 2015). Data were averaged to a 15-minute timescale and analyzed using binned positive matrix factorization (Zhang et al., 2019). Positive matrix factorization (PMF) is a dimensionality reduction technique that has been used extensively with mass spectrometric data to investigate organic aerosol sources (e.g., Ulbrich et al., 2009; Sun et al., 2014; Parworth et al., 2015) and more recently for understanding gas-phase measurements from chemical ionization mass spectrometers (e.g., Yan et al., 2016; Olin et al., 2022). Most commonly, applications of PMF use either unit mass resolution (UMR) or high-resolution peak fit (HR) data. The disadvantages of these approaches include loss of chemical identification, particularly for isobaric ions, when using UMR data and the time-intensive nature of HR analysis. Additionally, HR analysis of low signal-to-noise data, such as used here, is subject to additional noise resulting from the peak fitting unless longer time averaging is first performed. In binPMF, the recorded mass-spectral signal is divided into "bins" much smaller than one unit mass, with each of these bins then used in the PMF analysis. Thus, no a priori chemical information is required and, instead, information about chemical composition is obtained from the distribution of signal among bins following PMF analysis (Zhang et al., 2019). A potential drawback of binPMF is that signals from different isotopic compositions of the same ion (e.g., the ¹³C₁C₂H₄O₄ isotopologue of C₃H₄O₄) are not removed prior to analysis. Removal of such signals is common practice when performing PMF analysis on high-resolution datasets. In theory each isotopologue should vary identically in time, but in practice there will be slight differences due to random noise. Thus, these signals add no additional chemical information but can be detrimental to PMF analysis because they add noise that may be interpreted as signal. The low signal-to-noise of our APi-ToF dataset avoids this obstacle because only the monoisotopic ion is detected for most species. A notable exception is that the H₂³⁴SO₄ isotopologues of sulfuric acid clusters are detected in the negative ion data during the day and are sorted into the same binPMF factors as the most abundant (H₂³²SO₄) isotopologue.
Binning was performed in Tofware using the Binned Data Export v3.2.5 workflow contained in the Tofware software package. The workflow follows the description given by Zhang et al. (2019) and helps with exporting binned raw mass spectral data. It allows the user to reduce the sizes of data matrices by specifying the m/z range, the region to be binned at each m/z, and the bin size. The data can be output as Igor binary waves or text files for further processing using existing PMF tools. Prior to binning, mass calibration and baseline subtraction were performed. Positive ion data were binned over the window from -0.20 to +0.50 around each nominal mass from m/z 10 to 610. Several ions were removed from the positive ion data before performing PMF because they provide little chemical insight. These ions were N₂⁺, O₂⁺, Ar⁺, NO⁺, ammonium-water clusters ((H₂O)ₙNH₄⁺, n = 1-3), and water clusters ((H₂O)ₙH⁺, n = 1-5). The peak at m/z 240, identified as (C₁₃H₂₁NO₃)H⁺, was also removed because it has a very strong signal that dominates PMF results and obscures the behavior of other species (further discussion of m/z 240 in Sect. 3.5). All ions in the negative data were retained, and the signal was binned over the window from -0.20 to +0.40 around each nominal mass from m/z 60 to 560. Bin sizes of 0.02 Δm/z were used for both positive ion and negative ion data. The nominal mass range and the binning region surrounding each unit mass were selected to include all observed peaks. PMF calculations were performed using the PMF Evaluation tool PMF2 v3.05A (Ulbrich et al., 2009).
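The binning step can be illustrated with a short sketch. This is not the Tofware Binned Data Export workflow; the function name `bin_spectrum`, its arguments, and the representation of a spectrum as paired lists of m/z and signal values are assumptions made for the example. The defaults mirror the positive-ion settings given above (window of -0.20 to +0.50 around each nominal mass, m/z 10 to 610, bin width 0.02).

```python
import math

def bin_spectrum(mz, signal, mz_min=10, mz_max=610,
                 lo=-0.20, hi=0.50, width=0.02):
    """Sum signal into fixed-width bins around each nominal mass.

    Returns {(nominal_mass, bin_index): summed_signal}. Samples that fall
    outside every binning window are discarded. Hypothetical helper for
    illustration only, not Tofware code.
    """
    n_bins = round((hi - lo) / width)      # 35 bins per nominal mass by default
    binned = {}
    for m, s in zip(mz, signal):
        nominal = math.floor(m - lo)       # nominal mass whose window may hold m
        offset = m - nominal               # position relative to the nominal mass
        if not (mz_min <= nominal <= mz_max) or offset >= hi:
            continue                       # outside the mass range or the window
        idx = min(int((offset - lo) / width), n_bins - 1)
        key = (nominal, idx)
        binned[key] = binned.get(key, 0.0) + s
    return binned
```

Each `(nominal_mass, bin_index)` pair would then form one column of the matrix passed to PMF, in place of a unit-mass or fitted-peak signal.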
To identify exact masses in the binPMF factors, we fit a Gaussian function in m/z space to the bins at each nominal mass with signal of sufficient intensity. The width of the Gaussian was not constrained. We selected a Gaussian function rather than the peak shape determined using Tofware to avoid the potential for bias associated with the long averaging times required to achieve a well-defined shape in Tofware. To evaluate the error introduced by this fitting method, synthetic peaks were generated using Gaussian functions in time-of-flight space (Sect. S2). The synthetic peaks in ToF space were transformed to m/z space, binned, and fit with a Gaussian peak shape to determine the peak center. Additional errors include non-Gaussian peak shapes and mass calibration uncertainty for the measurements. Differences between Gaussian fitted peaks and real peak shapes were determined to be minor and are discussed further in Sect. S2 and shown in Figs. S1 and S2. To investigate the contribution of m/z calibration error, a Monte Carlo analysis of the ToF space to m/z space transformation of synthetic data was performed using a range of m/z calibration parameters consistent with the observed dataset (Sect. S2). By comparing the peak positions of the synthetic peaks before and after processing, the error introduced by the fitting process was found to be ≤50 ppm. Most of the error is due to the mass calibration, with a small contribution (<1%) from Gaussian peak fitting. These uncertainties were taken into consideration when evaluating potential formula assignments. The fitted binPMF peak centers and corresponding assigned formulas for all peaks highlighted below can be found in Tables S1 and S2 (Sect. S3).
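A minimal sketch of locating a peak center from binned signal is given below. It is not the fitting code used in this work: instead of a nonlinear Gaussian fit, it fits a parabola to the logarithm of the signal by linear least squares, which recovers the center and width exactly for a noise-free Gaussian and needs only the standard library. The function names `gaussian_center` and `ppm_error` are hypothetical.

```python
import math

def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def gaussian_center(x, y):
    """Estimate (center, sigma) of a Gaussian from bin centers x and signal y.

    Fits ln(y) = a + b*u + c*u**2 with u = x - mean(x); only bins with y > 0
    are used. Assumes a single, roughly Gaussian peak in the window.
    """
    pts = [(xi, math.log(yi)) for xi, yi in zip(x, y) if yi > 0]
    x0 = sum(p[0] for p in pts) / len(pts)   # shift x to keep the sums well scaled
    s = [0.0] * 5                            # sums of u**k, k = 0..4
    t = [0.0] * 3                            # sums of ln(y) * u**k, k = 0..2
    for xi, ly in pts:
        u = xi - x0
        for k in range(5):
            s[k] += u ** k
        for k in range(3):
            t[k] += ly * u ** k
    a_mat = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    d = _det3(a_mat)

    def swapped(col):
        # Normal-equation matrix with one column replaced by the right-hand side
        return [[t[r] if c == col else a_mat[r][c] for c in range(3)]
                for r in range(3)]

    b = _det3(swapped(1)) / d                # Cramer's rule for b and c
    c = _det3(swapped(2)) / d
    return x0 - b / (2 * c), math.sqrt(-1 / (2 * c))

def ppm_error(measured, exact):
    """Relative mass error in parts per million."""
    return (measured - exact) / exact * 1e6
```

With real, noisy bins the low-intensity tails dominate a logarithmic fit, so a weighted or nonlinear fit restricted to the bins near the maximum would be preferable in practice.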
Our goal in peak fitting was only to determine which compounds were present, not to attribute signal to each specific ion. Thus, we were concerned with identification of the peak centers rather than precise quantitation of peak areas. However, fitting a Gaussian to a peak consisting of insufficiently resolved isobaric ions may cause the fitted peak center to be shifted. Such peaks would be identifiable because they would be wider than expected. Peak width as a function of m/z was evaluated by comparing the fitted widths of the binPMF peaks to the peak width calculated within Tofware, and the full widths at half maximum were found to agree to within 20%. The agreement of peak widths therefore indicates that the peaks in the binPMF spectra mostly contain signal from one ion. This is expected because isobaric ions occur at much lower frequency in APi spectra than in CIMS spectra. We note that the use of Tofware peak shapes and peak widths would be preferable for datasets in which peaks are asymmetrical, interference of isobaric ions is significant, and/or peak area allocation is desired.
Positive matrix factorization (PMF) minimizes the sum of squared residuals weighted by uncertainty, so an appropriate estimation of error is critical (Ulbrich et al., 2009). Given the low signal levels inherent in ambient ion measurements, error due to electronic noise dominates over error from counting statistics, meaning an error estimate that is independent of signal intensity is likely to be a good approximation. An error value for each nominal m/z was calculated using the bins with the lowest and highest masses at a given m/z. The ranges of masses for binning were selected so that noise dominates the signal in these two bins. The standard deviation of the signal in each of the two bins was calculated over the course of the campaign. The average of these two standard deviations was used as the error for every bin at the given nominal m/z. Thus, the error value is a function of m/z but is independent of time and signal intensity. In addition to the technique described above, several other methods (e.g., using the standard deviation of very high m/z bins where no peaks are observed to estimate error for all bins at each time point, and using the standard deviation of high m/z throughout the campaign to estimate a single error value) were used to estimate the error (Sect. S4). We found PMF solutions to be insensitive to exact error values for error values of the correct order of magnitude, and error estimation dependent on signal intensity did not produce significantly different PMF solutions (Abdelhamid, 2020).
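The edge-bin error estimate and the weighted residual that PMF minimizes can be sketched as follows. The function names and the data layout (one time series per bin, ordered by m/z) are assumptions for illustration, and the use of the sample standard deviation is a choice made here rather than one stated in the text.

```python
from statistics import stdev

def edge_bin_error(bin_series):
    """Constant error for all bins at one nominal mass.

    bin_series: list of time series, one per bin, ordered by m/z. The first
    and last bins are assumed to contain only noise. Returns the mean of
    their standard deviations over the full record.
    """
    return 0.5 * (stdev(bin_series[0]) + stdev(bin_series[-1]))

def weighted_q(data, model, sigma):
    """Sum of squared residuals scaled by the error estimate.

    data and model are equal-length sequences for one bin; sigma is the
    constant error returned by edge_bin_error for that nominal mass.
    """
    return sum(((d - m) / sigma) ** 2 for d, m in zip(data, model))
```

Because the error is constant in time, scaling it by a common factor rescales the weighted residual of every bin equally, which is consistent with the reported insensitivity of the solutions to the exact error value.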

Generalized Kendrick Analysis Plots
For visualization of mass spectra we used generalized Kendrick analysis (GKA; Alton et al., 2022), a technique related to resolution-enhanced Kendrick mass defect plots (Fouquet and Sato, 2017). Kendrick mass defect plots (Kendrick, 1963) are a popular tool to analyze complicated mass spectra that contain many chemically related compounds. Kendrick masses are calculated by redefining the IUPAC mass of a base unit (R), typically ¹²CH₂ (14.0157), to its nucleon number (14 for ¹²CH₂). In Kendrick mass space, all species related by the base unit have the same Kendrick mass defect (KMD), the difference between the exact Kendrick mass and the nominal Kendrick mass. In a plot of KMD versus exact mass, these species will fall along a horizontal line. Building on Kendrick mass analysis plots, generalized Kendrick analysis introduces a positive integer scaling factor (X) that replaces the nucleon number as shown in Eq. 1 (Alton et al., 2022).
This scaling factor increases the resolution over KMD plots by spreading the mass defects across more of the available mass defect range (-0.5 to 0.5). The choice of the scaling factor X affects how different chemical species are arranged in the resulting plot, but species related by the base unit will still be aligned horizontally. Thus, GKA increases the resolution of KMD plots without the need for improved mass spectral resolution (Fouquet and Sato, 2017). The increased resolution of GKA helps visualize chemical trends that would not be apparent in raw mass spectra and provides an effective method to characterize binPMF factors.
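The calculation behind a GKA plot can be sketched as below, following the description above: the mass is rescaled by X/R and the nearest integer is subtracted. The sign and rounding convention, the function name `gka_defect`, and the default of an oxygen base unit with X = 14 are assumptions of this sketch and may differ from the published definition.

```python
import math

O_MASS = 15.9949146  # monoisotopic mass of 16O in u

def gka_defect(mz, base=O_MASS, x=14):
    """Generalized Kendrick mass defect, in the range [-0.5, 0.5).

    mz   : exact m/z of the ion
    base : exact mass of the base unit R
    x    : positive integer scaling factor X
    """
    scaled = mz * x / base
    return scaled - math.floor(scaled + 0.5)   # subtract the nearest integer

def oxygen_series(mz0, n):
    """m/z values of n ions that differ only by successive oxygen additions."""
    return [mz0 + k * O_MASS for k in range(n)]
```

Adding one base unit shifts the scaled mass by exactly X, an integer, so every member of a series such as `oxygen_series(339.0681, 6)` has the same defect and plots on one horizontal line. Setting `base` to the mass of ¹²CH₂ and `x` to 14 gives a conventional Kendrick mass defect under the same sign convention.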

Supporting Analysis
To characterize the sources of the observed binPMF factors, we used the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Stein et al., 2015) to calculate air mass back trajectories and back trajectory clusters. For each hour of the campaign, 24-hr back trajectories were calculated with a starting height of 10 m and using the Weather Research and Forecasting (WRF) model 27-km resolution hourly meteorology. To search for external tracers that may explain variations in the binPMF factors, we used SO₂ monitor data (Trojanowski, 2016) and meteorological measurements including radiation, temperature, and relative humidity (Zhang, 1997) routinely made by instruments at the SGP site. Aerosol size distributions and select trace gases were monitored using a scanning mobility particle sizer (SMPS, cutoff mobility diameter of 14 nm) and a quadrupole proton transfer reaction mass spectrometer (PTRMS), respectively, as part of the HI-SCALE campaign (Liu and Shilling, 2016a, b). PTRMS measurements have been previously described in Liu et al. (2021). A variety of other instruments were deployed to measure trace gases, aerosols, clouds, and meteorological conditions during HI-SCALE. Specific results from these instruments were not used in this work and can be found elsewhere (Fast et al., 2019). A sulfuric acid proxy was calculated according to the method described by Mikkonen et al. (2011) and is described in detail in Sect. S5. Surrounding cropland was assessed using the USDA-NASS Cropland Data Layer (USDA-NASS, 2016; for details see Boryan et al., 2011). Ozone data for Newkirk and Seiling, OK, two monitoring locations within 150 km of the SGP site, were obtained from the US Environmental Protection Agency Air Quality System API (US EPA, 2020).

Negative Ion Chemical Composition
For the negative ions, we select a four-factor solution as the best description of the measurements; the basis for selection is further described in Sect. S6. Figure 1  acid dimer, (H₂SO₄)HSO₄⁻, at m/z 195 (Fig. 1d). Exact m/z were used in identifying chemical composition; we report the nominal masses for clarity. The sulfuric acid trimer, (H₂SO₄)₂HSO₄⁻, is present at approximately 2% of the intensity of the dimer. Other sulfur anions include bisulfate (HSO₄⁻) and its water cluster ((H₂O)HSO₄⁻), sulfur pentoxide (SO₅⁻), and HSO₄⁻ clustered with nitric acid ((HNO₃)HSO₄⁻). Other intense ions include m/z 155, 253, and 337, which also appear in other factors and are tentatively attributed to organosulfates (further details in Sect. 3.3).
The SA dimer factor signal intensity follows the diel profile of solar radiation, presumably reflecting sulfuric acid production via hydroxyl radical oxidation of SO₂, and is zero from 19:00 local time (UTC -5) to 07:00. The factor shows a weak correlation with SO₂ measured at the site (r = 0.34) and a stronger correlation with a sulfuric acid proxy (r = 0.62) calculated from SO₂ concentrations, solar radiation, and condensation sink (details in Sect. S5). The dominance of sulfuric acid in the daytime spectra is expected based on photochemical production of sulfuric acid, competition for limited charges available in the ambient atmosphere, and the strong acidity of sulfuric acid, and is consistent with previous observations (e.g., Eisele and Tanner, 1990; Ehn et al., 2010; Bianchi et al., 2017; Beck et al., 2022).
In the "sulfur species" factor, the most intense signals are HSO₄⁻, SO₅⁻, (H₂O)HSO₄⁻, and (H₂CO₃)NO₃⁻ at m/z 97, 112, 115, and 124, respectively (Fig. 1c). The SA dimer is present, but at a lower intensity than any of these ions. The diel profile of this factor peaks in the morning around 8:00-9:00, declines slightly during the middle of the day, and increases again briefly during the afternoon. This behavior is consistent with increasing production of H₂SO₄ during the day shifting signal into the sulfuric acid dimer factor. Although the factor is approximately two to four times more intense during the day, it is non-zero at night, reflecting the small HSO₄⁻ signal that is observable at night.
The remaining two factors are characterized by clusters which have NO₃⁻ as the dominant charge carrier rather than HSO₄⁻ (Figs. 1b, 1e, and 1f). The "low m/z NO₃⁻" factor (Figs. 1b, 1e, and S6c) is consistently present with a minor morning peak, a midday dip, and a higher evening peak. It remains elevated throughout the night. This diel behavior is consistent with species that are produced by photochemical reactions because the morning increase in intensity and evening decrease in intensity correspond to the respective increase and decrease in solar radiation (Fig. 1b). The midday dip is likely the result of charge competition with H₂SO₄ and not a decrease in the concentration of neutral species corresponding to the low m/z NO₃⁻ factor. The intensity of the low m/z NO₃⁻ factor (Figs. 1b and S6c) is similar to that of the sulfuric acid dimer factor (Figs. 1a and S6b) during the day, suggesting that sulfuric acid is relatively low and nitric acid fairly abundant. Although charge competition complicates interpretation of the neutral species related to the observed NO₃⁻ clusters, the NO₃⁻ factors reveal atmospheric composition in a similar manner to an NO₃⁻ chemical ionization mass spectrometer (CIMS).
The low m/z NO₃⁻ factor is mostly composed of peaks between m/z 100 and 300, with major ions attributed to clusters of NO₃⁻ with non-nitrate organic compounds (e.g., (C₃H₄O₄)NO₃⁻ at m/z 166 and (C₅H₆O₄)NO₃⁻ at m/z 192) and clusters of NO₃⁻ with inorganic acids (e.g., (H₂CO₃)NO₃⁻ at m/z 124 and (HNO₃)NO₃⁻ at m/z 125). Also present are clusters containing two nitrogen atoms, which are likely clusters of NO₃⁻ with small ONs (e.g., (C₅H₇NO₇)NO₃⁻ and (C₅H₉NO₇)NO₃⁻ at m/z 255 and 257). Both C₅H₇NO₇ and C₅H₉NO₇ have been identified as ON products of isoprene oxidation by hydroxyl radicals in the presence of NOₓ (Ng et al., 2008; Lee et al., 2016). The peak at m/z 288 is also intense and is likely (C₅H₁₀N₂O₈)NO₃⁻, which can be produced by hydroxyl radical oxidation of isoprene in the presence of NOₓ (Lee et al., 2014; Xu et al., 2020). C₅H₉NO₇ and C₅H₁₀N₂O₈ are also produced by NO₃ radical-initiated oxidation of isoprene (Ng et al., 2008).
It should be noted that (C₁₀H₁₀O₆)NO₃⁻ is also a possible formula for the ion at m/z 288. While we consider this formula to be less likely because of the high number of double bond equivalents, C₁₀H₁₀O₆ would correspond to chorismic acid, a central branching point in plant cell metabolism (Tzin and Galili, 2010).
The remaining factor, termed the high m/z NO₃⁻ factor, is characterized by intense signals above m/z 300. The diel profile of the high m/z NO₃⁻ factor shows a rapid increase at approximately 19:00 and peaks at midnight. It gradually declines in the early morning and reaches zero by late morning. Figure 2 shows the GKA plot of the high m/z region of this factor. The mass-to-charge ratios plotted in the figure are the centers of the Gaussian fits to binPMF results. The Kendrick base unit used for this plot is oxygen with a scaling factor of 14, and the most intense fitted peaks fall along horizontal lines, demonstrating that the formulas are related by the addition of oxygen atoms. There are two series of odd m/z peaks with the most intense peaks separated by Δm/z of 16, consistent with formulas that vary by the addition of an oxygen atom. The most intense series of peaks in the range of m/z 339 to 419 are found at odd masses and thus likely contain an even number of nitrogen atoms.
We attribute these peaks to (C₁₀H₁₅NO₈₋₁₃)NO₃⁻, which are consistent with NO₃⁻ clusters with ON HOMs derived from MTs (Ehn et al., 2012; Bianchi et al., 2017). Other chemical formulas within our estimated mass calibration error for these peaks include (C₁₃H₁₁NO₆₋₁₁)NO₃⁻ and (C₁₄H₁₅NO₅₋₁₀)NO₃⁻. The formulas with 13 and 14 carbon atoms have ten and nine degrees of unsaturation, respectively, and do not fit the data as well as the formulas with 10 carbon atoms and thus are deemed unlikely. Within a similar mass range (m/z 340-404) there is also a series of less intense peaks at even masses. These peaks contain only one nitrogen atom and, based on similar reasoning, are assigned (C₁₀H₁₄O₉₋₁₃)NO₃⁻, formulas which correspond to NO₃⁻ clusters with non-nitrate HOMs (Ehn et al., 2012; Bianchi et al., 2017). The second region of intense peaks spans m/z 423 to 535 with the most intense ions occurring at odd m/z. We assign these peaks as (C₁₅H₂₃NO₉₋₁₆)NO₃⁻; other possible formulas include (C₁₁H₂₃NO₁₂₋₁₉)NO₃⁻ and (C₁₉H₂₃NO₆₋₁₃)NO₃⁻ and are rejected based on carbon number and degrees of unsaturation. The numbers of carbon and hydrogen atoms in the C15 formulas correspond to HOMs derived from SQTs (Jokinen et al., 2016; Richters et al., 2016). Because all the most intense peaks identified in the high m/z NO₃⁻ factor are HOM clusters with NO₃⁻, these HOMs and those detected in the low m/z NO₃⁻ factor are discussed together in the following section.
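The comparison of candidate formulas against a fitted peak center can be illustrated with a short calculation of monoisotopic m/z values. This is a sketch rather than the assignment procedure used in this work; the function names are hypothetical, formulas are written as the total elemental composition of the cluster (for example, (C₁₀H₁₅NO₈)NO₃⁻ becomes `C10H15N2O11`), and the atomic masses are standard monoisotopic values rounded to seven or eight significant figures.

```python
import re

# Monoisotopic atomic masses in u (rounded standard values)
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
        "O": 15.9949146, "S": 31.9720707}
ELECTRON = 0.00054858  # electron mass in u

def anion_mz(formula):
    """Monoisotopic m/z of a singly charged anion.

    formula: total elemental composition without parentheses, e.g. "H3S2O8"
    for the sulfuric acid dimer (H2SO4)HSO4-.
    """
    total = 0.0
    for elem, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MASS[elem] * int(count or 1)
    return total + ELECTRON              # one extra electron for the -1 charge

def ppm_apart(mz_a, mz_b):
    """Separation of two m/z values in parts per million of mz_b."""
    return abs(mz_a - mz_b) / mz_b * 1e6

# Candidate clusters for the odd-mass peak at nominal m/z 339
candidates = {
    "(C10H15NO8)NO3-": anion_mz("C10H15N2O11"),
    "(C13H11NO6)NO3-": anion_mz("C13H11N2O9"),
    "(C14H15NO5)NO3-": anion_mz("C14H15N2O8"),
}
```

The three candidates share a nominal mass and lie within tens of ppm of one another, which is why mass accuracy alone cannot separate them and chemical arguments such as carbon number and degrees of unsaturation are needed.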

3.2 HOMs
The HOM signals we observe differ from those reported in previous APi-ToF measurements in three ways: i) the intensity of C15 ion clusters is greater than the intensity of C10 ion clusters, ii) the ON HOMs clustered with NO3- (odd m/z) are higher in intensity than non-nitrate HOMs clustered with NO3- (even m/z), and iii) signals from HOM dimers (e.g., C20 and C30 molecules) are absent from the spectra. In this section we explore potential explanations for these three observations and discuss the implications for particle growth at this site.
We attribute C10 species to MT oxidation products and C15 species to SQT oxidation products. Clusters of MT non-nitrate HOMs with NO3- (Ehn et al., 2012) and MT ON HOMs with NO3- (Bianchi et al., 2017) have been observed previously in APi-ToF measurements, but we do not observe the C16-C20 MT dimers that have been previously detected.
Similarly, SQT non-nitrate HOMs have been observed as neutral compounds and as ions in the boreal forest (Jokinen et al., 2016).In that study, SQT HOMs contributed only a minor fraction (0.2%) to the total ion signal compared to up to 12% of the total signal observed here.Additionally, C29 and C30 dimers were observed previously, but not in this work.To our knowledge, SQT ON HOMs have not been reported previously in ambient measurements.Although C15H24Ox compounds can be formed from cross reactions of isoprene and MT peroxy radicals (Heinritzi et al., 2020), we would expect the C10 ions to be relatively more intense than the C15 ions if cross-reactions were important.The attribution of the C15 compounds to SQT oxidation products is also consistent with measurements of SQT and MT emissions from agricultural crops.For herbaceous crops such as alfalfa, a species grown in the area (USDA-NASS, 2016), SQT emissions are greater than MT emissions (Ormeño et al., 2010).In general, measurements of SQTs are challenging due to their low concentrations and short lifetimes.Thus, while knowledge of SQT emissions and HOM yield is of most relevance for understanding particle formation and growth, observations of SQT oxidation products are informative.Given the low detection limits of APi-ToF, measurements of SQT oxidation products through ambient anion cluster observations may represent a useful methodology for better understanding the atmospheric fate of SQTs and their ultimate contribution to particle growth.
The high m/z NO3- factor accounts for the majority of the HOM signal. In this factor, both C10 and C15 ON HOMs (odd masses) are more abundant than non-nitrate HOMs (even masses). Approximately 55% of the signal in the lower m/z series of peaks (C10 compounds) and 60% of the signal in the higher m/z series of peaks (C15 compounds) is found at odd m/z. Based on the assigned formulas for the C15 series, the singly substituted 13C isotopologue ion would account for 25-50% of the signal at the next nominal mass. It thus appears likely that only part of the observed signal at even m/z in the C15 series is due to clusters of NO3- with SQT non-nitrate HOMs. Therefore, the portion of signal due to ON species is even greater than the percentage of odd m/z signal indicates.
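The size of the isotopologue contribution can be estimated from the natural abundance of 13C. The sketch below is a simplified illustration, not the calculation used in this work: it assumes a 13C abundance of about 1.07%, neglects the smaller contributions of 15N, 2H, and 17O, and applies the series-wide 60/40 odd-to-even split as if it held for a single pair of neighboring peaks. The function names are hypothetical.

```python
C13_ABUNDANCE = 0.0107  # approximate natural abundance of 13C (assumed value)


def m_plus_one_ratio(n_carbon, p=C13_ABUNDANCE):
    """Ratio of the singly 13C-substituted peak to the all-12C peak."""
    return n_carbon * p / (1.0 - p)


def isotopologue_share_of_even(odd_fraction, n_carbon):
    """Share of even-m/z signal explained by 13C isotopologues of odd-m/z ions."""
    even_fraction = 1.0 - odd_fraction
    return m_plus_one_ratio(n_carbon) * odd_fraction / even_fraction


ratio_c15 = m_plus_one_ratio(15)
share_c15 = isotopologue_share_of_even(0.60, 15)
print(f"M+1/M for 15 carbons: {ratio_c15:.3f}")
print(f"share of even-m/z signal: {share_c15:.2f}")
```

Under these assumptions the isotopologue peak is about 16% of its parent, which corresponds to roughly a quarter of the even m/z signal for a 60/40 split, close to the lower end of the 25-50% range quoted above; individual peak pairs with weaker even m/z neighbors would give larger shares.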
To explain the higher intensity of C10 and C15 ON HOMs compared to non-nitrate HOMs in the high m/z NO 3 - factor, we hypothesize that NO 3 radical chemistry is important for producing the HOMs observed in this factor.The diel profile of the high m/z NO 3 -factor is high at night and decreases to near zero during the day.This profile is broadly consistent with compounds produced via nighttime chemistry, although, as will be discussed below, other explanations are possible for low HOM signals during the day.If ozonolysis were the main HOM production route, we would expect to see higher signals of the non-nitrate HOMs.Production of HOMs (particularly ON HOMs) by NO 3 radical chemistry is also supported by the greater intensity of C10 non-nitrate HOMs than C10 ON HOMs in the low m/z NO 3 -factor.The low m/z NO 3 -factor exhibits increasing intensity in the morning and evening with a dip in the middle of the day.As discussed earlier (Sect.3.1), this diel profile is suggestive of photochemical production with charge competition leading to the midday dip.
HOMs are observed clustered with NO 3 -in both the high m/z NO 3 -and low m/z NO 3 -factors, and thus the change in the ratio of C10 ON HOMs and C10 non-nitrate HOMs between day and night suggests a change in the ratio of the corresponding neutrals.Since ON yields from NO 3 radical oxidation of MTs and SQTs are larger than ON yields from OH oxidation in the presence of NO x (Lee et al., 2016), the changing ratio of ON to non-nitrate HOMs over the course of the day supports our hypothesis that NO 3 radical chemistry is important for HOM formation at night.The low HOM signal during the day is likely a result of charge competition but may also be due to inefficient clustering of the HOMs with HSO 4 -compared to NO 3 -.Although previous ambient ion measurements have found that both ON and non-nitrate MT oxidation products cluster with NO 3 -and with HSO 4 - (Bianchi et al., 2017), in our work both C10 and C15 species are only observed as clusters with NO 3 -.We are unable to unequivocally explain the lack of HSO 4 -clusters.
One possibility is that, as we propose in Sect. 3.1, nitric acid is relatively abundant compared to sulfuric acid at the SGP site.
The lack of C15 HOMs during the day could potentially be because increased boundary layer height and a short lifetime of SQT HOMs result in SQT oxidation product mixing ratios that are too low to be observed, particularly when coupled with charge competition.While some questions regarding the production pathways and speciation of daytime HOMs remain, the changing ratio of C10 ON HOMs to non-nitrate HOMs between day and night is strong evidence for the importance of NO3 radical chemistry for nighttime HOM formation.Our observations highlight how combining measurements of ambient ion composition with measurements of neutral HOMs is important for fully understanding the HOMs budget.
The lack of C20 and C30 dimers in the high m/z NO 3 -factor implies that RO 2 -RO 2 cross reactions are infrequent at night during the measurement period.Because the observed C15 ONs are likely formed at night when NO is low, the suppression of dimer formation is probably not due to reactions of RO 2 radicals with NO.However, RO 2 radicals terminated by reaction with HO 2 radicals could result in the observed monomers.Measurements of HO 2 radicals were not made during the campaign.Nonetheless, HO 2 radicals are produced at night through oxidation of VOCs by ozone or NO 3 radicals (Stone et al., 2012), and both mechanisms may contribute to nighttime HO 2 radicals present at the SGP site.Although dimers formed via cross-reactions are thought to be important for NPF, monomers can play a role in the growth of new and/or small particles.
A few brief particle growth events were identified during the campaign, but several of these events were detected by aircraft observations and cannot be directly related to APi-ToF observations made at the surface (see Fast et al., 2019 for a detailed description of NPF observations). The small number of events and the fact that instrument polarity was switched every 24 hours mean that negative ion data coverage is insufficient during these events to draw conclusions about how HOMs contributed to particle growth at SGP during the measurement period. However, we can assess the potential partitioning behavior of the identified HOM species by using the volatility parametrization described by Li et al. (2016) to estimate saturation mass concentrations (c*) for the species corresponding to the neutral C15 formulas (without the NO3- anion). We calculate a c* of ~10^-8 to 10^-9 µg/m3 for the most intense observed species (C15H23NO12 and C15H23NO13). These values are several orders of magnitude smaller than the ELVOC saturation mass concentration used by Hodshire et al. (2 × 10^-4 µg/m3) to model particle growth at the SGP site. This suggests that these ONs could condense onto particles if they were present during a nucleation or growth event, but it should be noted that the anion clusters related to the neutral ONs were detected only at night whereas most NPF and growth occurs during the day. Future investigations of particle growth should consider the influence of SQT oxidation products, including daytime oxidation products which were not observed in this work, at this site and potentially in agricultural regions more generally.

Organosulfates
We tentatively attribute several peaks to organosulfates.Diel profiles of these ions are shown in Sect.S8.In the sulfuric acid dimer factor, ions at m/z 155 and 253 are assigned as C 2 H 3 SO 6 -and (H 2 SO 4 )C 2 H 3 SO 6 -, formulas consistent with glycolic acid sulfate and its cluster with sulfuric acid.C 2 H 2 O 2 -is not observed.The signal at m/z 155 was also present in the sulfur species factor and the low m/z NO 3 -factor.While the proposed formula could be attributed to a cluster of glyoxal with HSO 4 -, quantum chemical calculations by Ehn et al. (2010) show that this cluster is too weakly bound to be plausible and would likely break apart in our instrument.Their calculations also suggest that the cluster of glycolic acid sulfate with sulfuric acid is stable and that the glycolic acid sulfate ion has a lower proton affinity than HSO 4 -and thus is likely the charge carrier in the cluster.Glycolic acid sulfate is thought to be formed through a multiphase reaction of glycolic acid with acidic sulfate aerosol (Liao et al., 2015); however, there is evidence that organosulfates may be formed photochemically in the gas phase (Friedman et al., 2016).Glycolic acid sulfate has been previously observed in the gas-phase (Ehn et al., 2010;Le Breton et al., 2018).In addition to gas-phase measurements, glycolic acid sulfate has been detected in SOA derived from isoprene and its oxidation products in both ambient samples (Surratt et al., 2008;Safi Shalamzari et al., 2013;Wach et al., 2019;Vandergrift et al., 2022) and chamber experiments (Surratt et al., 2008;Galloway et al., 2009;Wach et al., 2019).
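The nominal masses of the two glycolic acid sulfate ions can be checked directly from their formulas. The sketch below is a minimal illustration with a hypothetical flat-formula parser (no parentheses or charges) and standard monoisotopic masses; it is not the peak assignment procedure used in this work.

```python
import re

# Monoisotopic masses (u); the electron mass accounts for the negative charge.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
                "O": 15.9949146, "S": 31.9720707}
ELECTRON = 0.00054858


def parse_formula(formula):
    """Count atoms in a flat formula such as 'C2H3SO6' (no parentheses)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def anion_mz(formula):
    """Monoisotopic m/z of a singly charged anion written without its charge."""
    counts = parse_formula(formula)
    return sum(MONOISOTOPIC[el] * n for el, n in counts.items()) + ELECTRON


# Formulas from the text, with the cluster written as a flat string.
assigned = {"C2H3SO6": 155, "H2SO4C2H3SO6": 253}
for formula, nominal in assigned.items():
    print(formula, round(anion_mz(formula), 4), nominal)
```

Both ions carry a negative mass defect (exact m/z slightly below the nominal value), as expected for sulfur- and oxygen-rich species.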
The signals at m/z 302, 337, and 479 appear to be a series of related organosulfate peaks and are present in both the sulfuric acid dimer factor and the low m/z NO3- factor. The formulas assigned to these peaks are C7H12NSO10-, C7H13S2O11-, and C14H23S2O14-. While the structure of these clusters remains unknown, one possibility is clusters of a C7 organosulfate with NO3- ((C7H12SO7)NO3-), HSO4- ((C7H12SO7)HSO4-), and itself ((C7H12SO7)C7H11SO7-). This organosulfate has been observed previously in ambient SOA (Hettiyadura et al., 2017, 2019) and has been shown to be produced by reactions of acidic sulfate with MT HOMs (Surratt et al., 2008; Mutzel et al., 2015; Hettiyadura et al., 2017) or the isoprene oxidation products methyl vinyl ketone and methacrolein (Nozière et al., 2010; Hettiyadura et al., 2019). If these signals are indeed due to a C7 organosulfate, this would be, to our knowledge, its first observation in the gas phase.

Negative factor back trajectory clusters
HYSPLIT back trajectory cluster analysis was performed to assess potential sources of the species identified in the negative binPMF factors. HYSPLIT was used to combine trajectories into clusters before concentration analysis in Igor. Four clusters were used for the analysis because the large increase in total spatial variance when using only three clusters indicates that using three clusters requires combining very distinct trajectories. Figure 3a shows the back trajectory cluster results over a map of MODIS 500 meter resolution leaf area index (8-day, 2016/09/05-2016/09/12; Myneni et al., 2015) for 24-hour back trajectories. The northeast cluster contains 45 trajectories, 33% of which arrive during the day (8:00-18:00). The percentages of trajectories arriving during the daytime for the other clusters are 50% in the east cluster (n = 72), 45% in the southeast cluster (n = 118), and 49% in the south cluster (n = 43). The lack of trajectories coming from the west is expected for the SGP site during the late summer and is consistent with other studies that have modelled back trajectories for the site at similar times of year (e.g., Parworth et al., 2015; Liu et al., 2021). Figure 3b shows the average signal of each binPMF factor for each HYSPLIT cluster. binPMF factors were averaged from a 15-minute timescale to a one-hour timescale to match the frequency of calculated back trajectories. The reduction of instrument sample flow that began on 13 September does not appear to have an effect on the intensities of the binPMF factors (see Sect. S1).
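The averaging step described above, from 15-minute factor intensities to hourly values and then to a mean per trajectory cluster, can be sketched as follows. This is an illustration only: the analysis in this work was carried out in Igor, and the function names, timestamps, intensities, and cluster labels below are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_means(samples):
    """Average (timestamp, value) samples into hourly bins keyed by hour start."""
    bins = defaultdict(list)
    for timestamp, value in samples:
        bins[timestamp.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: mean(values) for hour, values in bins.items()}


def cluster_means(hourly, cluster_of_hour):
    """Average hourly values over the hours assigned to each trajectory cluster."""
    grouped = defaultdict(list)
    for hour, value in hourly.items():
        if hour in cluster_of_hour:  # skip hours without a trajectory
            grouped[cluster_of_hour[hour]].append(value)
    return {cluster: mean(values) for cluster, values in grouped.items()}


# Hypothetical 15-minute factor intensities and hourly cluster labels.
samples = [
    (datetime(2016, 9, 10, 10, 0), 1.0), (datetime(2016, 9, 10, 10, 15), 3.0),
    (datetime(2016, 9, 10, 11, 0), 2.0), (datetime(2016, 9, 10, 11, 45), 4.0),
    (datetime(2016, 9, 10, 12, 30), 6.0),
]
labels = {
    datetime(2016, 9, 10, 10): "east",
    datetime(2016, 9, 10, 11): "east",
    datetime(2016, 9, 10, 12): "south",
}
result = cluster_means(hourly_means(samples), labels)
print(result)
```

Averaging to the hour first gives each trajectory arrival time equal weight, so hours with more valid 15-minute samples do not dominate a cluster mean.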
The intensity of the sulfuric acid dimer factor is similar for all the back trajectory clusters, a trend consistent with sulfuric acid as the primary charge carrier during the day. The intensities of the other two factors that are present during the day, the sulfur species and low m/z NO3- factors, are both constant for the northeast and east clusters and then exhibit opposing trends in the other two clusters. The sulfur species factor is enhanced in the south cluster, whose trajectories pass near urban centers including Dallas-Fort Worth, and is low in the southeast trajectories, while the low m/z NO3- factor is low for the trajectories from the south and higher in the trajectories from the southeast. Given that daytime total negative charges are approximately constant (~20% daily variation in total negative ions 10:00-14:00 LT), we interpret the change in signal partitioning between the low m/z NO3- and sulfur species factors as reflecting changes in the relative abundance of nitric acid and SO2/H2SO4. We hypothesize that air masses from the southeast have a higher nitric acid to sulfur ratio than do air masses from the south; however, it is likely that nitric acid is enhanced under both the south and southeast trajectories, as indicated by the higher intensity of the nighttime high m/z factor in both the south and southeast clusters compared to the other two clusters. We note that the ion data tell us only about the ratio of these species and not about the absolute changes; fully testing this hypothesis would require measurements of neutral nitric acid and sulfuric acid, which are not available.
Additionally, our measurements were not corrected for transmission efficiency as a function of m/z and thus, while we can evaluate changes in the ratio of signals, the ratio of signals itself does not directly reflect the ratio of ion concentrations. The increasing abundance of sulfuric acid with back trajectories from the south is consistent with the results of Hodshire et al. (2016), who demonstrated that sulfuric acid contributes significantly to SGP particle growth when air masses originate from the south. It is also broadly consistent with the recent work of Vandergrift et al. (2022), which showed that many organosulfates in SOA were unique to air masses originating from the south. The increase in the low m/z NO3- factor in the southeast cluster could result from increased biogenic emissions to the east of the site. However, these biogenic compounds would be aged during the 24 hours that the trajectories take to reach the site, and local biogenic emissions are likely more homogeneous throughout the campaign. An increase in local biogenic emissions when the low m/z NO3- factor is enhanced later in the campaign is supported by the time series of MTs measured by the PTRMS (Sect. S9).
The high m/z NO3- factor is highest to the south and is also increased in the southeast cluster relative to the other clusters. This factor may be increased because of interactions between biogenic and anthropogenic emissions. While SQTs are the proposed precursors of the high m/z species in this factor, they are likely emitted near the site, and the increased intensity in these clusters may be due to the presence of other precursors, e.g., NOx and NO3, transported from the urban areas to the south of the site.

Identification of Positive Ion binPMF factors
A four-factor binPMF solution was selected for the positive ion data (details in Sect. S10) with the factors being identified as the alkylpyridinium factor, the C18 factor, the nighttime factor, and the daytime high m/z factor. Figure 4 shows the average diel profiles and mass spectra of the factors. Hourly coverage is approximately constant with 34 ± 3.3 (average ± standard deviation) total observations during each hour (minimum of 27 and maximum of 37 observations). Time series of these factors can be found in Sect. S11. The most intense peaks in the positive ion data are all found at low m/z (<150), and the lower m/z regions of the different factors exhibit similarities while the higher m/z species tend to be more distinct. In the low mass region, the most prominent series of ions in all the factors is a set of peaks related by CH2 units between m/z 94 and 150 having formulas (C5H5(CH2)(1-5)N)H+, which are consistent with alkylpyridiniums and/or aromatic amines. Other intense and ubiquitous peaks include (C6H9NO)H+, (C7H11NO)H+, (C8H13NO)H+, and (C9H15NO)H+ at m/z 112, 126, 140, and 154, which we attribute to water clusters of the alkylpyridinium ions. While we cannot distinguish between alkylpyridiniums and aromatic amines, we will refer to these species as alkylpyridiniums due to previous attributions (Ehn et al., 2010; Junninen et al., 2010). Although present in all the factors, the alkylpyridinium factor accounts for the majority of the alkylpyridinium ion signal with some contribution from the nighttime factor before sunrise. The diel profiles of select individual alkylpyridinium ions are presented in Sect. S12.
Alkylpyridine species have relatively long atmospheric lifetimes against OH oxidation, on the order of ten days for methylpyridine and five days for ethylpyridine (Yeung and Elrod, 2003) and thus the midday depletion in signal is likely the result of boundary layer dynamics.Sources of alkylpyridines are uncertain.Various nitrogen heterocycles have been detected in biomass burning studies (Hatch et al., 2015;Coggon et al., 2016;Hatch et al., 2019); however, a biomass burning source is unlikely here since only a weak correlation (r < 0.33) with biomass burning tracers measured by the PTRMS (acetonitrile) is observed.Other possible sources include various industrial sources and pesticide usage (Sims et al., 1989).
Pesticides in particular may account for alkylpyridiniums at this agricultural site.
The alkylpyridinium factor contains two unique low mass peaks at m/z 74 and 75. The ion at m/z 74 corresponds to (C4H11N)H+, a C4 alkylamine, while m/z 75 is attributed to (C2H6N2O)H+. The ion at m/z 75 is the only intense signal observed in the positive ion data containing two nitrogen atoms; we do not observe any strong signals that are consistent with diamines. Diamines have been hypothesized to contribute to NPF at SGP (Jen et al., 2016) and are sufficiently basic (e.g., butanediamine has a proton affinity of 1006 kJ/mol; Hunter and Lias, 1998) that we would expect to observe them if they were present. The (C2H6N2O)H+ formula may correspond to N-nitrosodimethylamine or glycinamide. The gas-phase basicity is higher for glycinamide than for N-nitrosodimethylamine (900 vs. 850 kJ/mol; Li et al., 2004; Crestoni and Fornarini, 2004), and N-nitrosodimethylamine would likely require higher NOx conditions to form, making glycinamide the more likely attribution for this cation. While diamines may participate in particle nucleation when they are present, our results suggest that monoamines and other single-nitrogen species were far more abundant during HI-SCALE.
A C3 amine (m/z 60, (C3H9N)H+) and its water cluster (m/z 78, (C3H11NO)H+) are also present in all the factors; however, unlike the alkylpyridinium ions, no one factor captures the majority of the signal. The diel profiles of these ions (Sect. S13) have a peak in the early morning, like the nighttime factor, and a second peak in the afternoon. The morning peak is captured by the nighttime factor while the afternoon peak is captured first by the increase in the C18 factor around 17:00-19:00, then by the nighttime factor at 20:00 and later in the night. The remaining signal is represented mostly by the alkylpyridinium factor. The diel profile is broadly consistent with ethanol CIMS measurements of C3 amines at the site during HI-SCALE, which increase due to emissions and decrease due to daytime oxidation. Protonated C1 amines would appear at m/z 32, which coincides with O2+, a large signal that was removed before binPMF analysis. Protonated C2 amines would be at m/z 46, but no intense signal is detected at this nominal mass. C1 and C2 amines have gas-phase basicities of 865 kJ/mol and 878 kJ/mol, respectively, which are slightly lower than the gas-phase basicity of C3 alkylamines (884 kJ/mol; Hunter and Lias, 1998). The small difference in gas-phase basicity between C2 and C3 amines likely does not account for all of the difference in signal intensity and suggests that the C2 amine is likely less abundant than the C3 amine.
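The nominal masses at which the protonated C1 to C3 alkylamines would appear follow from the monoisotopic mass of each neutral amine plus the mass of a proton (about 1.0073 u). The worked arithmetic below uses standard monoisotopic masses and is given only as a check on where these ions fall relative to O2+; it is not part of the original analysis.

```latex
\begin{align*}
m/z\,[(\mathrm{CH_5N})\mathrm{H}^+]    &\approx 31.0422 + 1.0073 = 32.0495 \\
m/z\,[(\mathrm{C_2H_7N})\mathrm{H}^+]  &\approx 45.0578 + 1.0073 = 46.0651 \\
m/z\,[(\mathrm{C_3H_9N})\mathrm{H}^+]  &\approx 59.0735 + 1.0073 = 60.0808 \\
m/z\,[\mathrm{O_2^+}]                  &\approx 31.9898 - 0.0005 = 31.9893
\end{align*}
```

The protonated C1 amine and O2+ therefore share the same nominal mass and differ by only about 0.06 in exact m/z.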
While the low m/z regions of each factor contain similar long-lived and ubiquitous species, the C18, nighttime, and daytime high m/z factors all exhibit distinct high mass regions. In the C18 factor the most intense peaks are the series of peaks at m/z 306, 308, and 310, which are related by H2 units and have formulas (C18H(27,29,31)NO3)H+. The lower intensity peaks at m/z 270 and 346 have the respective formulas (C15H27NO3)H+ and (C21H31NO3)H+. The C18 factor shows a rapid increase at 17:00, a peak at 18:00, and then a decrease to an intensity of nearly zero. This factor is present early in the campaign on 7 September and increases in intensity after 13 September. Although no significant correlations with external tracers were identified, the time series of MTs measured by the PTRMS (Sect. S9) shows a significant increase at a similar point in the campaign, and both the C18 factor and MTs peak around 18 September. While MTs are not necessarily the precursors of the species observed in the C18 factor, the increase in MTs suggests that real chemical changes in the atmosphere led to an increase in biogenic compounds.
The nighttime factor, as the name suggests, is present mostly at night. It exhibits an early morning maximum at 6:00-7:00 just before sunrise, which occurred around 7:00-7:20. It decreases to near zero by 10:00 and then increases again at 19:00, about an hour before sunset (~20:00-19:30), and remains elevated overnight. The diel profile is suggestive of local emissions followed by photochemical loss and dilution from boundary layer evolution. Like the C18 factor, the nighttime factor is present at highest intensity later in the campaign. In the low mass region, the nighttime factor differs from those discussed thus far by the presence of a C6 amine, (C6H15N)H+, at m/z 102. The most intense high m/z peaks in the nighttime factor are at m/z 298 and 312. They are assigned the formulas (C17H31NO3)H+ and (C18H33NO3)H+, which are related by a CH2 unit.
The daytime high m/z factor begins to increase around 7:00-8:00 as the sun rises and increases more strongly in the afternoon starting around 13:00 then reaches its peak at 17:00 before decreasing rapidly as the sun sets.The diel profile suggests a small emission source and a larger photochemical source.While there is only a weak correlation between the daytime high m/z factor and ozone (r = 0.35), the afternoon peak in the intensity of this factor roughly coincides with the peak in the diel profile of ozone concentration.Chemically, it is characterized by a series of peaks between m/z 250 and 450.
Series of high m/z peaks similar to the ones observed in this factor have been previously detected in positive ion APi-ToF measurements but have not yet been identified (Junninen et al., 2010; Ehn et al., 2010). Junninen et al. (2010) used mass defect analysis to constrain formulas and suggested that the most intense species contain one nitrogen atom and 2-6 degrees of unsaturation. GKA of the binPMF peaks helps reveal the character of these species. A basis of CH2 and a scaling factor of 35 for the GKA plot of the daytime high m/z factor (Fig. 5) were selected such that all species at even nominal masses have negative GKA values (i.e., they appear in the lower half of the plot) and all species at odd nominal masses have positive GKA values (i.e., they appear in the upper half of the plot). A detailed description of how the selection of this scaling factor splits even and odd masses can be found in Alton et al. (2022). The purpose of the GKA plot in Fig. 5 is to illustrate the trend in even and odd masses; a GKA plot which provides insight into the chemical composition of the ions and their relative intensities is included in Sect. S14.
Figure 5 demonstrates that the majority of peaks, and all of the highest intensity peaks, are found at even nominal masses. This means that these formulas contain an odd number of nitrogen atoms and, based on the most likely possible formulas, most contain a single nitrogen atom. Several peaks in the daytime factor differ by CH2 units, but some peaks are related by units of oxygen and H2. A GKA plot with CH2 basis and a scaling factor of 13, which spreads the species more evenly across the plot, was used to propose formulas for this series of peaks (see Sect. S14). Rather than falling along horizontal lines that run across the entire plot, the peaks shift toward more negative GKA values at higher m/z, which could be due to error in the mass calibration or a change in the composition of the observed species (i.e., they are not separated only by units of CH2). Although the exact composition cannot be determined, the most reasonable formulas for all observed peaks are organic compounds with one nitrogen atom, 13-24 carbon atoms, 2-5 oxygen atoms, and 2-7 double bond equivalents. All formulas have O/C ≤ 0.35, meaning these compounds are not considered HOMs. The identified carbon numbers are inconsistent with clusters of ammonia with isoprene, MT, or SQT oxidation products, and therefore we propose that they are organic reduced nitrogen compounds. The range of double bond equivalents is similar to the range of double bond equivalents for the most intense peaks observed by Junninen et al. (2010). However, the species we observe show increased intensity at higher m/z and have more negative Kendrick mass defects, suggesting that they are more highly oxygenated than those observed by Junninen et al. (2010). Although formulas with one nitrogen atom can explain most of the peaks, formulas with three nitrogen atoms fit well for some peaks and may account for some of the observed signal. The proton affinities required for species to be present as ambient cations are much higher than those required to be present as ambient anions, so it is unsurprising that these species do not appear to be chemically related to the highly oxidized isoprene, MT, and SQT oxidation products observed in the negative ion data.
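Double bond equivalents and O/C ratios of candidate formulas can be computed directly from atom counts. The sketch below uses the standard expression DBE = C − H/2 + N(v − 2)/2 + 1, where v is the assumed valence of nitrogen; the function names are hypothetical. Because the individual daytime-factor formulas are given only in the supplement, the example uses the neutral C18-factor formulas from the text (C18H(27,29,31)NO3) with trivalent nitrogen. As a further check, counting nitrogen as pentavalent (as in a nitrate group) reproduces the ten and nine degrees of unsaturation quoted earlier for the rejected C13 and C14 formulas; that the authors used this convention is an inference, not something the text states.

```python
def double_bond_equivalents(c, h, n=0, n_valence=3):
    """DBE = C - H/2 + N*(valence - 2)/2 + 1; divalent O and S do not contribute."""
    return c - h / 2 + n * (n_valence - 2) / 2 + 1


def o_to_c(c, o):
    """Oxygen-to-carbon ratio of a formula."""
    return o / c


# Neutral C18-factor formulas from the text: C18H(27,29,31)NO3
for h in (27, 29, 31):
    dbe = double_bond_equivalents(18, h, n=1)
    print(f"C18H{h}NO3: DBE = {dbe:.0f}, O/C = {o_to_c(18, 3):.2f}")

# Rejected negative-ion candidates, nitrogen counted as pentavalent (assumption)
print(double_bond_equivalents(13, 11, n=1, n_valence=5))  # C13H11NO(6-11)
print(double_bond_equivalents(14, 15, n=1, n_valence=5))  # C14H15NO(5-10)
```

The C18 formulas fall within the 2-7 double bond equivalent range and well below the O/C threshold of 0.35 stated above.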
As with the negative data, HYSPLIT back trajectory analysis was performed to investigate possible sources and is presented in Sect. S15. We refrain from interpreting the back trajectories in detail and instead focus on general trends since we are unable to attribute the ions in these factors to specific chemical processes or sources. Despite these limitations, the C18, daytime, and nighttime factors all increase towards the end of the measurement period (see Sect. S11), as does the signal at m/z 240 (which was removed prior to calculating binPMF solutions), identified as a C13 compound: (C13H21NO3)H+. MTs are also observed to increase during this period (Sect. S9); however, there is no correlation between MTs and the C18 factor (r = 0.13) or the daytime high m/z factor (r = -0.14), and only a moderate correlation with the nighttime factor (r = 0.61).
The diel profile of MTs shows a peak in the morning, like the nighttime factor, but unlike the nighttime factor MTs do not increase near sunset (17:00) and instead remain near zero until after midnight. Although we are unable to identify the precursors of the observed C13, C17, and C18 species, the transient nature of these signals suggests that they may be tracers of specific chemical processes, e.g., emissions from plants and soils or agricultural practices near the site. Nitrogen-containing compounds such as indoles (Erb et al., 2015) and oximes (Sorensen et al., 2018) are emitted by a wide variety of plants, including crops. Both classes of compounds are volatile wound compounds that play a role in plant defense and are emitted in plant communication (Erb et al., 2015; Sorensen et al., 2018), and therefore may be produced in large quantities in response to specific agricultural processes, such as cutting and harvesting crops.
Except for amines, and possibly the compound at m/z 75 with two nitrogen atoms, it is unlikely that the identified species are important for NPF or particle growth.For instance, pyridine is much less effective in particle nucleation than would be expected from its basicity (Berndt et al., 2014).However, they are still of interest due to their possible role in the nitrogen cycle.Organic reduced nitrogen species have been recognized as an important player in the global nitrogen cycle, but organic nitrogen remains largely unconstrained and many compounds have not yet been identified (Altieri et al., 2012).
The formulas we have identified are generally similar to organic nitrogen species found in rainwater (Altieri et al., 2009, 2012), although the maximum number of nitrogen atoms in the formulas we propose is somewhat lower than has been observed in precipitation. The species we identify provide insight into the composition of reduced nitrogen species in an agricultural context and contribute to a more complete picture of nitrogen cycling.

Conclusions
We deployed an APi-ToF at the agriculturally influenced SGP site for one month to measure the chemical composition of ambient anions and cations.These measurements indicate that, at least during this period, SQTs are the major HOM nighttime precursors and NO3 radical chemistry is an important nighttime oxidation pathway.Products of NO 3 radical chemistry such as the SQT ONs are predicted to have sufficiently low volatility to partition effectively onto particles, suggesting that future studies of particle growth in this and likely other agricultural regions should account for these species.
Measurements of positive ions show that nearly all the positive ions contain only one nitrogen atom. Diamines were not observed. Further work should be done to identify precursors of the unique C13, C17, and C18 bases, which may be useful as tracers of specific chemical processes. In particular, the C13 compound ((C13H21NO3)H+) observed at m/z 240 is consistently among the most intense cations but has highly variable signal intensity that could be indicative of emissions or chemistry yet to be identified. More generally, our work demonstrates how ambient ion measurements, when combined with binPMF analysis, can be a powerful tool for elucidating trace gas chemistry. In the negative ion measurements, binPMF enabled the separation of HSO4- and NO3- anions as charge carriers, with the NO3- factors providing insight into diel variation in trace gases that could contribute to particle growth. It is well known that the strong acids, strong bases, and HOMs we detect as ambient ions are important for NPF and growth. Given the chemical and physical complexity of NPF, a more complete understanding of the precursors of particle formation is required to better predict NPF, even in the absence of NPF events. Given the strong signature of sulfuric acid-base clusters typically observed in NPF, we anticipate that binPMF would resolve NPF events, providing insight into atmospheric composition favorable to NPF, given the appropriate dataset. Although measurements using active ionization will provide a more straightforward and quantitative way to interpret the sources and abundance of such gases since they will not be influenced by charge competition, ambient ion measurements offer advantages for longer term and remote deployments given their relative ease.
Figure 1: (a) Hourly diel plots of the sulfate factors and shortwave radiation over the whole campaign. Markers represent median values and shaded regions show the range between the first and third quartiles. (b) Hourly diel plots of the two NO3- factors over the whole campaign. (c)-(f) Mass spectra of the four factors. Note that the scale of the y-axes varies in (c)-(f).
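The diel statistics described in this caption (hourly medians with first-to-third quartile ranges) can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the function name `diel_profile` is hypothetical, the timestamps are assumed to already be in local time, and the quartiles use Python's "inclusive" interpolation, which may differ slightly from the method behind the published figures.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median, quantiles


def diel_profile(timestamps, signal):
    """Return {hour: (q1, median, q3)} for each hour-of-day bin.

    timestamps: iterable of datetime objects (assumed local time).
    signal: iterable of factor intensities, one per timestamp.
    """
    by_hour = defaultdict(list)
    for t, s in zip(timestamps, signal):
        by_hour[t.hour].append(s)

    profile = {}
    for hour, values in sorted(by_hour.items()):
        if len(values) < 2:
            continue  # skip bins with too few points for quartiles
        # quantiles(..., n=4) returns the three quartile cut points
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        profile[hour] = (q1, median(values), q3)
    return profile


if __name__ == "__main__":
    # Synthetic example: five days of one measurement at 03:00
    ts = [datetime(2016, 9, day, 3, 0) for day in range(1, 6)]
    print(diel_profile(ts, [1.0, 2.0, 3.0, 4.0, 5.0]))
```

Plotting the median as markers and filling between the two quartiles for each hour gives a panel of the kind described in (a) and (b).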
, and cannot be definitively ruled out. While we cannot distinguish between clusters of ONs with NO3- (ON•NO3-) and clusters of CxHyOz with nitric acid and NO3- (R•HNO3•NO3-), we suggest that clusters of ONs with NO3- are more likely for these formulas, both because we do not observe each of the clusters that would correspond to the organic compound clustered with NO3- (R•NO3-) and because the ON species have been previously identified in the atmosphere. The remaining ions identified in the low m/z NO3- factor are less intense and include deprotonated organic acids like C3H3O4- at m/z 103 and C5H7O5- at m/z 147. The peaks at m/z 340 and 372 are likely MT-derived non-nitrate HOMs. HOM observations in this factor and in the high m/z NO3- factor are discussed in Sect. 3.2. The series of peaks at m/z 302, 337, and 479 are tentatively assigned as (C7H12SO7)NO3-, (C7H12SO7)HSO4-, and (C7H12SO7)C7H11SO7- and discussed further in Sect. 3.3.

Figure 2: Generalized Kendrick analysis plot of negative binPMF high m/z NO3- factor. The basis is O and the scaling factor is 14. Formulas in blue are tentatively assigned as the most probable. Light brown formulas are carbon-13 isotopes of the most probable assigned formulas one m/z lower than the predicted isotopes. Purple and green formulas were rejected. Formulas are presented as clusters with the NO3- anion. Marker size corresponds to peak intensity. Error bars are ±50 ppm from the fitted peak position.
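For readers unfamiliar with the quantities in this caption, the sketch below shows one way a scaled Kendrick mass defect and a ±50 ppm acceptance window could be computed. It is an illustration under stated assumptions, not the analysis code used here: it assumes the scaled mass is m × X / R (X the integer scaling factor, R the exact mass of the base unit) and takes the defect as the scaled mass minus its nearest integer. Sign and rounding conventions differ between publications, and the function names `gkmd` and `within_ppm` are hypothetical.

```python
# Monoisotopic masses of the base units named in the GKA captions (u)
O_MASS = 15.9949146    # 16O
CH2_MASS = 14.0156501  # 12C + 2 x 1H


def gkmd(mz, base_mass, scaling):
    """Scaled Kendrick mass defect (assumed convention, see lead-in).

    Ions that differ by n base units differ by exactly n * scaling in
    scaled mass, so they share the same defect and line up horizontally.
    """
    scaled = mz * scaling / base_mass
    return scaled - round(scaled)


def within_ppm(candidate_mz, fitted_mz, tol_ppm=50.0):
    """True if a candidate formula mass lies within tol_ppm of the fitted peak."""
    return abs(candidate_mz - fitted_mz) / fitted_mz * 1e6 <= tol_ppm


if __name__ == "__main__":
    mz = 300.1  # arbitrary illustrative m/z, not a measured peak
    # Same defect before and after adding one oxygen (basis O, scaling 14)
    print(gkmd(mz, O_MASS, 14), gkmd(mz + O_MASS, O_MASS, 14))
    # Same idea for a CH2 basis with scaling factor 35
    print(gkmd(mz, CH2_MASS, 35), gkmd(mz + CH2_MASS, CH2_MASS, 35))
```

With this convention, a homologous series in the chosen base unit falls on a horizontal line, and candidate formulas whose exact mass lies outside the ppm window of a fitted peak can be rejected.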

Figure 3: (a) HYSPLIT clusters calculated from back trajectories. The color scale shows leaf area index measured by MODIS. (b) Average signal intensities of factors for each HYSPLIT back trajectory cluster.

Figure 4: (a) Hourly diel plots of the alkylpyridinium and C18 factors and shortwave radiation over the whole campaign. Markers represent median values and shaded regions show the range between the first and third quartiles. (b) Hourly diel plots of the nighttime and daytime high m/z factors over the whole campaign. (c)-(f) Mass spectra of the four factors. Note that the scale of the y-axes varies in (c)-(f).

Figure 5: Generalized Kendrick analysis plot of positive binPMF daytime high m/z factor. The basis is CH2 and the scaling factor is 35. Marker color corresponds to peak intensity.