Extreme isomeric complexity of dissolved organic matter found across aquatic environments

The natural aquatic environment contains an enormous pool of dissolved reduced carbon, present as ultra‐complex mixtures that are constituted by an unknown number of compounds at vanishingly small concentrations. We attempted to separate individual structural isomers from several samples using online reversed‐phase chromatography with selected ion monitoring/tandem mass spectrometry, but found that isomeric complexity still presented a boundary to investigation even after chromatographic simplification of the samples. However, it was possible to determine that the structural complexity differed among samples. Our results also suggest that extreme structural complexity was a ubiquitous feature of dissolved organic matter (DOM) in all aquatic systems, meaning that this diversity may play similar roles for recalcitrance and degradation of DOM in all tested environments.

Dissolved organic matter (DOM) is the dominant form of organic carbon in most aquatic environments. Upon mineralization, it is an important precursor of CO₂ outgassing from inland waters (Tranvik et al. 2009), it carries substantial amounts of nutrients and energy from land to sea (Medeiros et al. 2016), and it persists in the deep ocean for millennia (Dittmar and Stubbins 2014). It is an ultra-complex mixture of compounds that provides the ultimate test of the capabilities of analytical chemistry (Rodgers et al. 2005; Dittmar and Stubbins 2014), and investigations into its nature have been confounded by its extreme molecular complexity. High-resolution mass spectrometry (HRMS) can resolve many thousands of molecular masses from complex natural mixtures of organic compounds (Marshall et al. 1998; Riedel and Dittmar 2014; Hendrickson et al. 2015), but cannot differentiate between structural isomers of a molecular formula.
Most recent research has utilized advanced visualization or multivariate statistical approaches to interpret HRMS data (Wu et al. 2004; Sleighter et al. 2010; Kellerman et al. 2015), but when the HRMS data are not combined with more structurally sensitive techniques (Lam et al. 2007; Arakawa and Aluwihare 2015; Arakawa et al. 2017), the average biogeochemical behavior of several structural isomers is measured without knowledge of how diverse the isomeric mixture actually is (Zark et al. 2017). The study of organic matter in the environment relies on detecting and characterizing subtle changes to the complex mixture, but most of the complexity of the mixture is hidden behind this isomeric averaging.
*Correspondence: jeffrey.hawkes@kemi.uu.se. Author Contribution Statement: JAH designed the research and wrote the MATLAB code. JAH, CP, and PJRS conducted the experiments. LJT and JB acquired equipment and funding. JAH wrote the paper with contributions from all authors. JAH is accountable for the integrity of the data.
The isomeric complexity within each molecular formula may be probed by chromatography or by fragmentation within the mass spectrometer. These techniques have previously been used separately to confirm that numerous structural isomers contribute to any obtained molecular formula (Leenheer et al. 2001; Witt et al. 2009; Capley et al. 2010; Arakawa and Aluwihare 2015; Cortés-Francisco and Caixach 2015; Brown et al. 2016; Zark et al. 2017). Chromatographic separation of complex mixtures of DOM has revealed variability in the primary and tertiary structure of their components (Saleh et al. 1989; Woelki et al. 1997; Reemtsma and These 2003; Namjesnik-Dejanovic and Cabaniss 2004; Woods et al. 2011; Arakawa et al. 2017), and several recent studies have led to tremendous advances in our understanding of the compositional complexity of terrestrial humic and fulvic acids via chromatographic separation and subsequent high-resolution MS analysis (Koch et al. 2008; Stenson 2008; Gaspar et al. 2010; Brown et al. 2016; Sandron et al. 2017). Fragmentation has revealed the importance of carboxylic acid and alcohol functionality in fulvic acids (Leenheer et al. 2001; Witt et al. 2009; Capley et al. 2010; Zark et al. 2017) and has led to estimates that at least 28 structural isomers are present at any particular molecular mass in seawater (Zark et al. 2017). When used in combination with chromatography, fragmentation has allowed the separation and measurement of specific biomarkers, metabolites, or contaminants within a complex mixture (Wu et al. 2015; Longnecker and Kujawinski 2017), but has rarely been used to probe the isomeric complexity of the bulk organic mixture (Brown et al. 2016).
Online coupling of high performance liquid chromatography (HPLC) with HRMS or HRMS/MS has been underexploited in the investigation of natural ultra-complex mixtures (Patriarca et al. 2017; Petras et al. 2017). This is partly due to the nature of electrospray ionization, the ionization method most commonly coupled to HRMS, which is sensitive to inorganic buffers and requires close control of the solvent mixture for stable ionization efficiency, meaning that typical highly buffered solvent gradients are inappropriate (Koch et al. 2008). Also, the signal-to-noise ratio is typically quite low in individual transients (Brown et al. 2016), making the scan rate a critical limiting feature of online HPLC-HRMS. Collection of fractions followed by pre-concentration is a sensible, though time-consuming, way of improving sensitivity (Capley et al. 2010; Brown et al. 2016). However, subtle detail in the elution gradient is lost when eluted material is combined into fractions (Brown et al. 2016).
Here, we use an online HPLC-tandem HRMS method to explore the isomeric complexity of individual molecular masses in several natural samples. Previous studies have demonstrated that similar structural complexity is found at every tested molecular mass in such complex mixtures (Witt et al. 2009; Capley et al. 2010; Zark et al. 2017), so here we do not explore numerous masses, leaving this for future investigation according to individual research questions. Instead, we go into greater detail at one particular nominal mass, [M−H]⁻ = 369, which is near the average molecular mass in most aquatic systems, and for which three soluble and commercially available compounds could be purchased and compared to the natural mixtures (Fig. 1). We test the method on extracted organic matter from several aquatic environments, ranging from a headwater stream to the deep ocean.

Reagents and instrumentation
All solvents were high purity grade (Supporting Information) and glassware was muffled at 450 °C for at least 4 h prior to use. The HPLC was an Agilent 1100 with binary pump and autosampler. The Orbitrap was an LTQ-Velos-Pro (Thermo Scientific, Germany). Model compounds A and B (Fig. 1) were purchased from Sigma Aldrich (Sweden) and compound C was purchased from Toronto Research Chemicals (Canada), as powders. Suwannee River fulvic acid (SRFA) and Nordic Reservoir natural organic matter (NRNOM) reference materials were purchased from the International Humic Substances Society (U.S.A.).

Sample preparation
Samples were collected from a headwater stream (HW), a river (RIV) and a fjord (FJO) in Sweden, the deep Caribbean Sea (MAR), and from two terrestrial reference materials (SRFA and NRNOM). DOM from the samples was concentrated to 500 ppm C in 0.1% formic acid by solid phase extraction. The model compounds A-C were added at 10 ppb to the SRFA sample. Details about sample collection and preparation can be found in the Supporting Information.

Chromatographic methods
Separation was conducted with an Agilent PLRP-S poly(styrene/divinylbenzene) column, similar in chemistry to Agilent PPL sorbent, with dimensions 1.0 × 150 mm, 3 µm bead size, 100 Å pore size, with a pre-column filter (0.5 µm, Supelco ColumnSaver). The mobile phases were 0.1% formic acid in water (mobile phase A) and 0.1% formic acid in acetonitrile (mobile phase B), in a stepped gradient from 5% to 90% B over 20 min at 50 µL min⁻¹ as shown in Supporting Information Table S1.

Orbitrap-MS/MS analysis
The Orbitrap LTQ-Velos was calibrated and tuned to maximize the peak at 369.1 in SRFA (see Supporting Information). Ions were filtered to a 1 Da mass window (368.5-369.5) in a dual pressure ion trap, and fragmentation was conducted by collision induced dissociation (CID; activation q 0.25, time 10 ms) with nitrogen gas at normalized collision energies of 0%, 21%, and 27%. The resulting mixture of ions was transferred to the Orbitrap for analysis at resolution setting 100,000, with an accumulation time set to a maximum of 2 s and a maximum of 5 × 10⁴ ions.
Fourteen precursor peaks with nominal m/z 369 were identified in the analysis with no excitation voltage. The exact masses of possible fragments from these precursors with neutral losses of up to five CO₂ and/or H₂O molecules were calculated to make a target mass list for assignments in the analyses with excitation voltage applied. Assignments were allowed with mass error < 3 ppm (see Supporting Information for more detail on assignment). These losses (CO₂, H₂O) are the only ones that can be unambiguously assigned to a precursor peak, owing to the high likelihood of their loss and to previous experiments demonstrating their dominance from single isolated ions (Witt et al. 2009). These losses made up more than 70% of total intensity (Fig. 2).
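The target-list construction and the 3 ppm assignment window described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB code: the function names are hypothetical, the atomic masses are standard monoisotopic values rather than values from the article, and "up to five CO₂ and/or H₂O" is assumed here to mean up to five of each loss, restricted to combinations the ion has enough atoms to support.

```python
from itertools import product

# Monoisotopic masses in Da (standard values, not taken from the article).
MASS = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON = 0.00054857990946

def neutral_mass(c, h, o):
    """Monoisotopic mass of a neutral CcHhOo species."""
    return c * MASS["C"] + h * MASS["H"] + o * MASS["O"]

CO2 = neutral_mass(1, 0, 2)
H2O = neutral_mass(0, 2, 1)

def deprotonated_mz(c, h, o):
    """m/z of the singly charged [M-H]- ion: remove one H atom, add one electron."""
    return neutral_mass(c, h, o) - MASS["H"] + ELECTRON

def target_list(c, h, o, max_losses=5):
    """Map (n_CO2, n_H2O) -> fragment m/z for one precursor formula.

    Assumption: up to `max_losses` of each neutral loss is considered.
    """
    precursor = deprotonated_mz(c, h, o)
    targets = {}
    for n_co2, n_h2o in product(range(max_losses + 1), repeat=2):
        if n_co2 == 0 and n_h2o == 0:
            continue  # no loss: this is the precursor itself
        # Skip losses that need more C, H, or O atoms than the ion contains.
        if n_co2 > c or 2 * n_h2o > h - 1 or 2 * n_co2 + n_h2o > o:
            continue
        targets[(n_co2, n_h2o)] = precursor - n_co2 * CO2 - n_h2o * H2O
    return targets

def assign(observed_mz, targets, tol_ppm=3.0):
    """Return the (n_CO2, n_H2O) loss closest to observed_mz within tol_ppm, else None."""
    best = None
    for loss, theoretical in targets.items():
        error = abs(observed_mz - theoretical) / theoretical * 1e6
        if error < tol_ppm and (best is None or error < best[1]):
            best = (loss, error)
    return best[0] if best else None
```

For the model-compound formula C₁₆H₁₈O₁₀, `deprotonated_mz(16, 18, 10)` reproduces the negative ion mass quoted in the article (369.0827), and `assign` can then be applied to each measured fragment peak against `target_list(16, 18, 10)`.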

Separation of precursor peaks by liquid chromatography
The 14 precursor ions with masses ranging from 369.0099 to 369.2435 and their main fragments were easily mass-resolved by the Orbitrap under the analytical conditions, but were not fully chromatographically resolved by our HPLC method. Rather, they were partially separated with substantial overlap (Fig. 1a). The formulas with higher oxygen abundance in each series eluted first, as is expected due to their higher oxygen functionality (e.g., carboxylic acid and alcohol groups) and resulting higher average polarity (Saleh et al. 1989; Namjesnik-Dejanovic and Cabaniss 2004). Three purchased compounds with the molecular formula C₁₆H₁₈O₁₀ (deprotonated mass 369.0827) and functionality that resembles moieties found in natural organic matter (Hertkorn et al. 2006; Lam and Simpson 2009; Woods et al. 2011) were obtained to compare with the compounds with the same formula in the natural mixture of SRFA. In contrast to the broad humps found eluting from the complex natural mixture, the model compounds gave relatively sharp, well-resolved peaks (Fig. 1b). This showcases the extreme isomeric complexity of the natural mixture (Stenson 2008; Capley et al. 2010; Arakawa et al. 2017; Sandron et al. 2017), revealing that each precursor peak has a continuum of structural isomers that can be somewhat smeared out on such a hydrophobicity/acidity gradient. As a result, some isomers of one isomeric mixture may elute later than isomers of the next-eluting isomeric mixture. It is unclear from these results how many isomers exist per molecular formula, but our results support the estimate of some large number > 30 computed by Zark et al. (2017), assuming that individual isomers elute as well-resolved chromatographic peaks, as do the model compounds (Fig. 1b and Supporting Information Fig. S3).
Fig. 1 (partial caption, panel b). Summed intensity of C₁₆H₁₈O₁₀ (third peak from a) and assigned fragments in SRFA (500 ppm C) at 21% normalized energy in the CID cell, with the model compounds added (10 ppb C each). Fragments that are specific to model compounds B and C are shown. No unique fragment was found for compound A, so a common fragment that is particularly abundant for compound A is shown. This highlights the full chromatographic separation of compound A from B and C, which overlap with each other. In contrast, the natural isomeric mixture has no easily identified features. A-C: Model compounds with formula C₁₆H₁₈O₁₀ and negative ion mass 369.08272.
Collection of fractions followed by secondary, orthogonal separation on a second column may allow further separation of these isomers (Arakawa and Aluwihare 2015; Brown et al. 2016; Arakawa et al. 2017; Sandron et al. 2017), but this is beyond the scope of this article. Here, we consider the fragmentation patterns of each isobaric peak at [M−H]⁻ 369 in this faster (34 min), more automated method.

Fragmentation of precursor ions
The fragmentation patterns (the relative intensities of the various fragments) were dominated by neutral losses of CO₂ (−44) and H₂O (−18), as has been observed previously (Leenheer et al. 2001; Plancque et al. 2001; Witt et al. 2009; Brown et al. 2016; Zark et al. 2017) (Fig. 2). Samples that had been stored for several years at −20 °C in methanol also gave substantial neutral losses of methanol (−32), indicative of methylation of carboxylic acid groups. Such samples were not further used in this study.
Model compounds B and C had glycosidic bonds, leading to fragmentation at the glycoside linkage (Fig. 1). This type of fragmentation was not observed in high abundance in the natural aquatic samples using CID (Fig. 2), suggesting that this functionality is not a major component of the natural material (Witt et al. 2009). Solid phase extraction at pH 2 is selective toward humic substances over sugars, so this may reflect a bias in our sample treatment, but we have found that glycoside compounds retain well on reversed-phase sorbents at low pH, provided they have some acidic or phenolic functionality. Due to the high lability of carbohydrates, it seems likely that natural mixtures are dominated by molecules with aromatic or aliphatic hydrocarbon backbones substituted with carboxylic acid and alcohol groups, which are more recalcitrant to biotic degradation (Lam and Simpson 2009; Witt et al. 2009; Dittmar and Stubbins 2014; Arakawa and Aluwihare 2015; Arakawa et al. 2017) and more like model compound A. This has been suggested previously by Witt et al. (2009) for the same model compound, and somewhat supports the concept that natural humic and fulvic acids are composed of substituted monomeric species which form weak aggregates in natural waters (Peuravuori 2005; Hertkorn et al. 2006).
The more apolar, oxygen poor precursor ions required more energy to fragment (Capley et al. 2010), leading to a higher relative intensity of precursor peak remaining after fragmentation at a given energy (Fig. 3). However, contrary to our expectations, the fragmentation pattern hardly changed for any particular precursor peak over the course of the chromatographic separation (Fig. 3).
The fragmentation pattern changed little over the polarity separation. There was often a slight increase in 2CO₂ loss and sometimes an equivalent decrease in H₂O loss. Loss of CO₂, usually the most abundant fragment, typically stayed surprisingly uniform. The increased loss of 2CO₂ was likely due to an increase in the more acidic isomers at higher retention times, as these compounds are likely to have a higher number of labile carboxylic acid groups, similar to model compound A, which had a high retention time, four such functional groups and a very large 2CO₂ loss peak (Figs. 1b, 3d).
The CID fragmentation pattern of peaks in natural organic matter is often found to be dissimilar to any purchased model compound, as has been discussed previously (Leenheer et al. 2001; Witt et al. 2009; Capley et al. 2010; Zark et al. 2017). The most recent theory to explain this is that so many isomers are contributing to the signal that an average result is obtained, statistically described as the "central limit theorem" (Zark et al. 2017). The fragmentation pattern then takes on this average signal as the sum of all the constituent fragmentation patterns (Capley et al. 2010; Zark et al. 2017). The implication of our result is that this isomeric averaging continues even after polarity separation, so that not only is the central limit of fragmentation patterns obtained at any particular retention time, but every retention time has a similar amount of isomeric averaging and a similar result. In this case, the total number of compounds present in seawater (100,000) calculated by Zark et al. (2017) may be at least an order of magnitude too low (a result that they do not rule out).
Fig. 2. The number of fragments assigned and unassigned per sample (points) and their % contribution to the total intensity (bars). Typically, there were more fragment peaks unassigned than assigned, but the unassigned peaks made up < 30% of total intensity.
Alternatively, our results may be explained by similar functional group chemistry of the various structural isomers leading to similar fragmentation patterns, and chemical differences that lead to the polarity distribution being present on the carbon backbone (Capley et al. 2010). This issue is difficult to resolve with our online technique. Generally, there is not enough material for MS³ or MS⁴ fragmentation at any chromatographic time to probe deeper structural differences (Leenheer et al. 2001; Capley et al. 2010). The two glycoside model compounds we analyzed (B and C) had largely different structures and different carboxylic acid abundance (0 vs. 2), but very similar retention times. Model compound A had a much longer retention time, suggesting that it is rather hydrophobic when neutralized (Supporting Information Fig. S2). The parallel retention of acidic and purely hydrophobic functionality perhaps obscures and complicates potentially important trends in the fragmentation data. A stationary phase with a different selectivity such as hydrophilic interaction liquid chromatography may help in future work (Woods et al. 2011). However, we can state that the natural mixture is complex and yet gives a surprisingly consistent fragmentation pattern that can be significantly disrupted by simply adding in a small concentration of a known compound (Fig. 3d).
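The isomeric-averaging explanation can be illustrated with a toy simulation. The sketch below is not derived from the measured data: it assumes that each isomer has an independent, randomly drawn fragmentation pattern over three loss channels and that all co-eluting isomers contribute equally, and all names are hypothetical. Under those assumptions, the summed pattern at different retention-time slices becomes increasingly uniform as the number of co-eluting isomers grows.

```python
import random
import statistics

def random_pattern(rng, n_channels=3):
    """One isomer's fragmentation pattern: random fractions summing to 1."""
    raw = [rng.random() for _ in range(n_channels)]
    total = sum(raw)
    return [x / total for x in raw]

def mixture_pattern(rng, n_isomers, n_channels=3):
    """Average pattern of n_isomers co-eluting isomers (equal abundance assumed)."""
    patterns = [random_pattern(rng, n_channels) for _ in range(n_isomers)]
    return [sum(p[i] for p in patterns) / n_isomers for i in range(n_channels)]

def spread_across_slices(n_isomers, n_slices=50, seed=0):
    """Standard deviation of the first channel's fraction across retention-time
    slices, each slice containing a fresh random set of n_isomers isomers."""
    rng = random.Random(seed)
    first_channel = [mixture_pattern(rng, n_isomers)[0] for _ in range(n_slices)]
    return statistics.pstdev(first_channel)

if __name__ == "__main__":
    for n in (1, 30, 300):
        print(n, round(spread_across_slices(n), 4))
```

In this toy model the slice-to-slice spread falls roughly with the square root of the number of isomers, so a near-constant pattern across the chromatogram is what many isomers per slice would produce. It does not rule out the alternative explanation of shared functional group chemistry.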
Up to this point, we have only considered the most important neutral losses (combinations of CO₂ and H₂O) because we could not unambiguously assign other atomic combinations. For example, a loss of vinyl ketene (C₄H₄O, 68.026 Da) was observed for some peaks, indicating the presence of cyclic ketone functionality (Harris et al. 1967), but this may also be due to loss of carbon suboxide (C₃O₂, 67.990 Da) (Huber et al. 2007) from the previous precursor peak. These and other more exotic neutral losses may only be investigated with clarity with more advanced techniques that can isolate a single precursor peak (Witt et al. 2009; Brown et al. 2016). These pre-selection techniques would come with a necessary loss in time resolution, but would greatly improve the scope of the technique that we used. It is possible that individual compounds could be investigated this way based on their specific losses.
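The ambiguity between the two losses can be checked with a short calculation. The sketch below assumes, for illustration only, that the neighbouring precursor differs from C₁₆H₁₈O₁₀ by exchange of one O for CH₄ (the hypothetical formula C₁₇H₂₂O₉, which falls in the same 1 Da isolation window). The atomic masses are standard monoisotopic values and the function names are not from the article.

```python
# Monoisotopic masses in Da (standard values, not taken from the article).
MASS = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON = 0.00054857990946

def neutral_mass(c, h, o):
    """Monoisotopic mass of a neutral CcHhOo species."""
    return c * MASS["C"] + h * MASS["H"] + o * MASS["O"]

def deprotonated_mz(c, h, o):
    """m/z of the singly charged [M-H]- ion."""
    return neutral_mass(c, h, o) - MASS["H"] + ELECTRON

vinyl_ketene = neutral_mass(4, 4, 1)     # C4H4O
carbon_suboxide = neutral_mass(3, 0, 2)  # C3O2

# Two co-isolated precursors at nominal m/z 369 (the second is a hypothetical example).
precursor_low = deprotonated_mz(16, 18, 10)  # C16H18O10
precursor_high = deprotonated_mz(17, 22, 9)  # C17H22O9, i.e. O replaced by CH4

# Both routes end at the same fragment formula (C13H18O8, deprotonated).
fragment_via_suboxide = precursor_low - carbon_suboxide
fragment_via_ketene = precursor_high - vinyl_ketene

if __name__ == "__main__":
    print(f"C4H4O loss: {vinyl_ketene:.3f} Da, C3O2 loss: {carbon_suboxide:.3f} Da")
    print(f"precursor spacing: {precursor_high - precursor_low:.4f} Da")
    print(f"fragment m/z difference: {abs(fragment_via_ketene - fragment_via_suboxide):.2e}")
```

Because the difference between the two losses equals the spacing between the two precursors, the resulting fragments are isobaric at any resolving power. Only isolation of a single precursor peak before fragmentation can separate the two routes.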
It was possible to identify some chromatographically resolved fragments from the complex mixtures without assigning them to specific precursor ions. Such peaks were a small minority compared with the typical broad peaks and were generally low in abundance. Example chromatograms are available in the Supporting Information. Particularly interesting were fragments that were only identified in seawater samples, such as C₁₀H₁₇O₅⁻ at 17.27 min, C₁₄H₇O₁₁⁻ at 9.46 min, and C₁₃H₅O₅⁻ at 17.02 min, as these peaks presumably originate from molecules that are not present in terrestrial waters, and so are specific to seawater primary production or degradation products of terrestrial compounds. It was our expectation that more features like this would be visible after chromatographic separation, but in reality, the vast majority of fragment ion intensity took on the type of broad, average pattern that analytical chemists are accustomed to seeing in natural organic mixtures, with any analytical technique. We find this to be a fascinating result that indicates that the composition of natural organic matter is transformed to an ultra-complex mixture in every aquatic environment, the diversity of which is only constrained by structural possibilities and probability.

Comparison of various aquatic samples
The six samples analyzed had different abundances of the molecular masses with m/z 369 (Fig. 4). The freshwater samples had more of the highly oxidized and unsaturated molecular masses (high O/H ratio, e.g., C₁₅H₁₄O₁₁) compared with the samples from seawater. These highly oxidized and unsaturated molecules tend to come from terrestrial organic matter, and are known to be gradually diluted in seawater (Medeiros et al. 2016). Moreover, new supply of more saturated and reduced compounds by marine primary production may have contributed to the relative increase in these compounds in marine waters.
The polarity-dependent fragmentation pattern of these different precursor peaks was astonishingly similar in the various samples, which were taken from six completely different environments (Fig. 5), despite a roughly 100-fold variation in dissolved organic carbon (DOC) concentration and several hundreds of years of difference in aquatic residence time (Catalán et al. 2016) and therefore biogeochemical processing. It may be said that the marine samples showed greater loss of water than the terrestrial samples for more oxygen-rich peaks, possibly indicating an increase in alcohol functionality, but this trend is rather weak and not consistent for all precursors (Fig. 5). The overall fragmentation similarity suggests that the individual structural components of the isomeric mixture are either the same across aquatic environments, or that the distributions of structures are centered around similar average patterns (Reemtsma et al. 2006). It also shows that all of the samples are extremely complex to the extent that this technique cannot readily distinguish between them.
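The kind of per-scan comparison plotted in Fig. 5 can be sketched as below. This is an illustration under stated assumptions rather than the authors' processing code: relative intensity is taken here as fragment intensity divided by the total ion count for the precursor in the same scan, scans are masked unless that total exceeds 5% of its maximum over the run (the threshold given in the Fig. 5 caption), and the function and variable names are hypothetical.

```python
def relative_fragment_intensities(precursor_tic, fragments, threshold=0.05):
    """Per-scan relative intensity of each fragment for one precursor peak.

    precursor_tic: total ion count per scan for the precursor and its fragments.
    fragments: dict mapping a loss label to its per-scan intensities.
    Scans whose total does not exceed threshold * max(total) are returned as None.
    """
    if not precursor_tic:
        return {name: [] for name in fragments}
    cutoff = threshold * max(precursor_tic)
    relative = {name: [] for name in fragments}
    for i, tic in enumerate(precursor_tic):
        for name, series in fragments.items():
            relative[name].append(series[i] / tic if tic > cutoff else None)
    return relative

if __name__ == "__main__":
    # Made-up numbers purely to show the call; not measured data.
    tic = [10.0, 200.0, 1000.0, 40.0]
    frags = {"-CO2": [5.0, 100.0, 400.0, 20.0]}
    print(relative_fragment_intensities(tic, frags))
```

Masking low-intensity scans keeps noisy ratios at the edges of each broad elution hump from dominating a comparison between samples.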
It is possible to assess the functional complexity of individual samples based on the absolute number of different fragments generated, even though most of the low abundance fragments could not be unambiguously assigned to a precursor. The samples taken from closer to the marine endmember (from the coastal end of a river, a fjord and seawater) had the highest number of fragments, and for this reason we propose that these three samples had the highest functional complexity (Fig. 2). This result contrasts with recent evidence that fresher waters (i.e., headwater streams) are more complex, based on the number of assigned formulas (Mosher et al. 2015). These two results can be reconciled if either the larger molecular diversity of fresher samples is degraded into a smaller range of masses but with an increased level of structural diversity, or if subsequent new formation by indigenous primary production in marine waters causes increased functional diversity. These effects would not be measurable by broadband mass spectrometry alone.
DOM persistence in the environment and recalcitrance to bacterial degradation is related to its diversity (Kellerman et al. 2015). Deeper insight into this diversity is crucial to understand the persistence and reactivity of DOM. It has been suggested that, in oceanic waters, most of the DOM is below the concentration threshold allowing bacterial degradation due to the implied low concentration of the thousands or millions of individual compounds present (Arrieta et al. 2015). However, within the concentration range typical for freshwater environments, bacterial exploitation of DOM does not appear to be constrained by concentration (Eiler et al. 2003), although we here demonstrate diversity of compounds similar to that in seawater. There is therefore a need for deeper information into how the huge chemodiversity of DOM translates into functional diversity such as the diversity of enzymatic pathways required for initial breakdown.

Conclusion
Natural DOM from diverse environments is too complex for HPLC-MS/MS to distinguish fragmentation patterns from individual isomers from a molecular mass. Furthermore, the fragmentation pattern obtained bears resemblance to the average signal obtained when the bulk sample is infused, rather than chromatographically separated. The pattern is dissimilar to purchased compounds and may be an average signal of numerous structural isomers. Together, these results suggest that isomeric complexity is at least an order of magnitude greater than previously realized. Samples from diverse aquatic environments have similar levels of complexity and similar fragmentation signals, confirming that complexity is an inherent and ubiquitous feature of natural DOM. This feature of DOM should be carefully considered in all studies that use mass spectrometry to study the environmental processing of DOM without additional analysis that is sensitive to molecular structure, particularly direct infusion mass spectrometry, but also after one-dimensional chromatographic separation.
Fig. 5. Relative intensities of different fragments representing neutral losses of combinations of CO₂ and H₂O over the chromatographic run at 27% normalized CID energy from three precursor peaks (left to right) in the six samples (shown as different colors, see legend). Cooler colors represent samples with longer aquatic residence times. Data are only shown where the total ion count for the respective precursor peak exceeds 5% of the maximum value, as in Fig. 3. Features caused by model compound A are indicated by black arrows. The red arrow indicates a possible feature found in the marine samples, with specific fragment C₁₄H₂₀O.