Inconsistencies in sulfur dioxide emissions from the Canadian oil sands and potential implications

Satellite-derived and reported sulfur dioxide (SO2) emissions from the Canadian oil sands are shown to have been consistent up to 2013. Post-2013, these sources of emissions data diverged, with reported emissions dropping by a factor of two, while satellite-derived emissions for the region remained relatively constant, with the discrepancy (satellite-derived emissions minus reported emissions) peaking at 50 kt(SO2) yr−1 around 2016. The 2013–2014 period corresponds to when new flue-gas desulfurization units came on-line. Previous work has established a high level of consistency between at-stack SO2 emissions observations and satellite estimates, and surface monitoring network SO2 concentrations over the same multi-year period show similar trends as the satellite data, with a slight increase in concentrations post-2013. No clear explanation for this discrepancy currently exists. The implications of the discrepancy towards estimated total sulfur deposition to downwind ecosystems were estimated relative to 2013 emissions levels, with the satellite-derived values leaving the area of regional critical load exceedances of aquatic ecosystems largely unchanged from 2013 values, 335 000 km2, and reported values potentially decreasing this area to 185 000 km2.


Introduction
Space-based sensors have been widely used to derive or constrain emissions of air pollutants since the mid-1990s (e.g. Streets et al 2013). Most of these early 'topdown' efforts involved a form of inverse modeling in which emissions are derived such that simulation of the satellite observations from an atmospheric model agree with the satellite observations. More recently, direct approaches have been developed, in which the satellite observations are paired with winds from a meteorological reanalysis (Beirle et al 2011, Fioletov et al 2015. This newer approach is best suited for short(er)-lived species, but it has also proved useful for intermediate (CO;Pommier et al 2013) and longlived (CO 2 ; Nassar et al 2017) pollutants.
One successful application of this direct approach has been in deriving emissions of sulfur dioxide (SO 2 ) from the ozone monitoring instrument (OMI; Levelt et al 2006, Krotkov et al 2016, Levelt et al 2018. A global catalogue consisting of roughly 500 sources and their annual emissions was compiled using this methodology , NASA 2019, is updated annually and includes sources not captured in bottom-up inventories , and recently was merged with a leading bottom-up gridded inventory (Liu et al 2018). The ability of OMI to capture annually varying emissions has been demonstrated for many sources globally, including power plants , smelters (Ialongo et al 2018), and volcanos .
This methodology was applied to the Athabasca oil sands region (AOSR) in the Canadian province of Alberta, and in particular the surface mining region in the northwest corner of the AOSR (roughly 57 • N, 111 • W; just north of the community of Fort McMurray). Large deposits of bitumen (a viscous form of oil) reside within the AOSR. Extraction of bitumen, and its subsequent upgrading to a synthetic crude oil, has increased rapidly in recent years. In 2018, production from the AOSR was (the equivalent of) 3.0 million barrels of oil per day (mBPD) from bitumen, a number expected to rise to 4.2 mBPD by 2028 (AER 2019). The process of upgrading can lead to the release of large amounts of SO 2 (McLinden et al 2012, and other pollutants (Gordon et al 2015. According to the bottomup reported emissions of the Canadian National Pollutant Release Inventory (NPRI), the surface mining region emitted 90-100 kt(SO 2 ) yr −1 (hereafter kt yr −1 will be used) prior to 2013, but less than half of that in subsequent years due to the installation of control devices. In contrast, and as will be elucidated below, our evaluation using OMI observations finds emissions similar before and after installation of the devices, in broad agreement with measurements of surface concentrations in the vicinity of the emissions sources.
In this work, SO 2 data from multiple sourcesemissions monitoring, surface in situ, surface remote sensing, and satellites-are brought together in an attempt to understand this discrepancy. Previous work (Makar et al 2018) found exceedances of critical loads associated with AOSR emissions were occurring over a region greater than 320 000 km 2 : the extent to which subsequent emissions levels have been reduced thus has a critical impact on potential long-term environmental damage in the region. Here, a critical load is a quantitative estimate of an exposure to one or more pollutants below which significant harmful effects on specified sensitive elements of the environment do not occur (Nilsson and Grennfelt 1988). Estimates of the potential impacts of the top-down versus bottom-up emissions estimates are provided as part of the analysis.

Oil sands emissions data
Reported SO 2 emissions in the oil sands include hourly emissions measured by continuous emissions monitoring systems (CEMS) on the main stacks (AG 1998), which emit most of the SO 2 at upgrading facilities, as well as engineering estimates for emissions from other sources SO 2 , such as flaring. Facilitylevel emissions data were retrieved from the Canadian National Pollutant Release Inventory (NPRI 2019). Further, monthly and hourly emissions estimates and related quantities from CEMS and industrial monitoring reports, which include CEMS, flaring estimates and other SO 2 emission sources, were also used (AG 2016). The emissions data in the NPRI and the industrial monitoring reports are generated using similar datasets and methodologies and are therefore consistent. Figure 1 shows an image of the oil sands surface mining region. There are smaller amounts of SO 2 emitted from mobile sources, but the vast majority of emissions come from three upgraders with the two largest responsible for more than 90% of total emissions in the area over this 2005-2018 period. These are Syncrude-Mildred Lake (referred to here as SML; NPRI ID 2274; main stack height is 183 m) and Suncor (SUN, NPRI ID 2230; main stack height is 137 m), shown in figure 1, and emit in roughly a 3:1 (SML:SUN) ratio. The height of secondary and flaring stacks, and those at the smaller CNRL upgrader, are in the range of 50-110 m. In the mid-2000s SML undertook the sulfur emission reduction plan (SERP), an initiative to retrofit flue gas desulfurization facilities, or 'scrubbers' , into the operation of Syncrude's two original cokers. Completed and brought online in late 2013, SML reported emissions fell from 73 kt yr −1 in 2012-28 kt by 2014. A decline at SUN was also reported, 22-13 kt yr −1 , between 2010 and 2015 which is attributed to a series of improvement initiatives and plant optimizations. Since 2009 or so, the total amount of bitumen mined at SUN and SML, a good proxy for bitumen upgraded, has remained roughly constant (AER 2020).

In situ monitoring
The Wood Buffalo Environmental Association (WBEA 2019) monitors the environment of the Regional Municipality of Wood Buffalo in northeastern Alberta, including robust passive and continuous surface monitoring networks in and around the surface mines (Hsu 2013, Percy 2013, Bari and Kindzierski 2015. Identified in figure 1, there are a dozen stations within a 50 km radius of the SML/SUN upgraders outfitted with continuous Thermo Scientific 43i SO 2 analyzers (sampling height of 4 m), including four stations within 10 km (Lower Camp, Mildred Lake, Buffalo Viewpoint, and Mannix). The peak levels of SO 2 (annual 99th percentile of hourly averages) increased sharply in 2014, primarily at Lower Camp, prompting the government of Alberta to study the issue (AEP 2018). WBEA also operates a network of Maxxam Analytics passive SO 2 monitors (sampling height are variable, see SM; Tang 2001), which are deployed for approximately 1-2 months during which air pollutants are adsorbed onto a filter. The ambient concentration of SO 2 is calculated from the mass of SO 2 on the filter, analyzed in a laboratory, and an estimated rate of uptake. Annual average SO 2 from passive samplers was compared against continuous analyzers across Alberta and are in good agreement, as discussed in the supplemental material (SM).

Satellite observations and emissions
Satellite remote sensing can be used to derive SO 2 abundances by measuring the intensity of backscattered sunlight in the ultraviolet where SO 2 absorbs. The two nadir-viewing (down-looking) spectrometers used in this work are OMI on the Aura satellite (2004-present;Levelt et al 2006Levelt et al , 2018 and the tropospheric monitoring instrument (TROPOMI) on-board the Sentinel-5 precursor (2017-present; Veefkind et al 2012). Retrievals are performed by first matching laboratory-measured SO 2 absorption cross-sections and other relevant parameters to these observed spectra which provide a determination of the SO 2 slant column densities (SCDs), or the SO 2 number density integrated along the path of the sunlight through the atmosphere. SCDs were then converted to the more physically meaningful vertical column density (VCD), the vertically integrated SO 2 number density, using an air mass factor (AMF) which quantifies the sensitivity of the satellite to a particular scene. In practical terms, a multiple-scattering model is used to calculate AMFs (Palmer et al 2001) which depends on factors such as solar and viewing geometry, the presence of clouds, scene reflectivity and the vertical distribution of the SO 2 . VCDs are then determined through VCD = SCD/AMF. OMI observations date back to late 2004, and hence are extremely useful to track the temporal evolution of SO 2 and its emissions. With a spatial resolution of 13 × 24 km 2 at best, and more typically 15 × 35 km 2 -distances that are comparable to the entirety of the surface mining region-data spanning multiple years are analyzed together. The OMI SO 2 product used here is version 3.1, which utilizes the principle-component analysis retrieval algorithm developed by Li et al (2013). The standard NASA OMI SO 2 product uses a spatial and temporally invariant AMF, calculated for summertime conditions in the eastern US (AMF = 0.36; Li et al 2013). TROPOMI, the successor to OMI, has a much shorter data record (effectively March 2018 to present) but with its much higher spatial resolution (3.5 × 7 km 2 ) it can better capture details of the spatial distribution. The official European Space Agency TROPOMI SO 2 data product is used here (RPRO version 010105) (Theys et al 2017) and is calculated using scene-specific AMFs based on input information at a spatial resolution of 1.0 • .
To improve the effective spatial resolution, OMI and TROPOMI AMFs were reprocessed using higher resolution input information (McLinden et al 2014) as discussed further in the SM. Also, our analysis was limited to observations between April and October where Sun angles are high (see figure S4 (available online at https://stacks.iop.org/ERL/16/014012/mmedia)). A further examination of this approximation and details on data screening are given in the SM.
Satellite observations were also used to derive emissions using methods developed for point-sources (Fioletov et al 2015) and multiple or area sources (Fioletov et al 2017). As is expanded upon in the SM, these methods are based on the observed VCDs combined with coincident winds from a meteorological reanalysis. Emissions were obtained by fitting the satellite observations and winds to an exponentially modified Gaussian plume function (Fioletov et al 2015) based on an effective lifetime, derived as part of the analysis, of 4.0 h. Multiple years of observations and winds are analyzed together in order to reduce noise. Further details on the emissions calculations and a detailed error budget are provided in the SM.

Pandora spectrometer
In addition to the traditional in situ ground-based SO 2 instruments, a Pandora spectrometer has been operating at the Oski-Otin (see figure 1) monitoring site in Fort McKay since August 2013 (with intermittent gaps) . Pandora is a remote sensing instrument that measures direct UV-visible sunlight transmitted through the atmosphere in the ultraviolet (Herman et al 2009). Unlike the satellitesensors, conversion of direct-Sun SCD to VCD is straightforward and based purely on geometry, making it a more accurate method. One clear advantage of observing VCD as opposed to surface concentrations is that it is sensitive to plumes aloft.
Here observations from Pandora #104 (August-October 2013; October 2014-February 2016) and #122 (September 2017-December 2018, except May 2018) were considered. While #122 replaced #104, the monthly-calibration procedure  ensures there are no appreciable inter-instrument offsets and no distinction will be made between the two instruments in our interpretation. It is noted that the Pandora monitors are located to the N/NW of the SML/SUN upgraders, and that winds at the Pandora site originate from that direction during 15%-20% of the data record.

Results
Figure 2 shows reported annual SO 2 emissions from the AOSR surface mining area, total and flaring. In the second half of 2013, additional scrubbers came online at SML with the impact of these scrubbers being the inferred cause for the reported factor of three decrease beginning in 2014, and a factor of two decrease in overall emissions from the upgraders. However, this decrease in reported emissions is not matched by a decrease in ambient SO 2 levels. Figure 2 also shows the time series of annual mean SO 2 from several continuous WBEA surface stations in the area with the multi-station averages indicating a slow decline between 2002 and 2013, broadly consistent with emissions. Subsequent to this, however, there is an increase of 20%-30%, peaking in 2016, before a subsequent decrease back to 2013-2014 levels. These surface concentration trends are followed whether one considers the four continuous stations in the immediate vicinity of the upgraders (<10 km) or all continuous stations throughout the surface mining area. Nearby WBEA passive SO 2 sensors are broadly consistent with the continuous monitoring although their post-2013 increase is less pronounced.
In an effort to understand this apparent discrepancy, which was also noted by Edgerton et al (2019)  and then averaged onto the same 2 × 2 km grid using an inverse distance-squared weighting. When both continuous and passive observations were made at the same station, their mean was used, and only grid-boxes where three or more stations were within 30 km were retained. These interpolated maps show the same general characteristics as the OMI maps, including a hot spot around the SML and SUN upgraders, as well as a slight increase in the latter period.
In addition to mapping spatial distributions, OMI observations were used to derive emissions, thereby enabling a quantitative comparison with reported emissions. Running 3 yr emissions time series from OMI are compared with the reported emissions in figure 4(a). The three different OMI emissions algorithm variants, as mentioned above described in more detail in the SM, were used. These are (a) assuming a single point source located between SUN and SML, (b) a multi-source approach assuming emissions from the three (SML, SUN, and CNRL) upgrader locations, and (c) a multi-source approach allowing for a 3 × 3 grid of potential emission locations which does not assume the location of emissions but lets the algorithm determine where to place them to best match observations. These three variants produce emissions that are all quite consistent with each other, and with the two multisource approaches, (b) and (c), well within the pointsource uncertainty estimate, for which a detailed error budget was developed (see table S2). It is worth noting here that OMI has demonstrated its ability to track changes in SO 2 emissions at other locations (e.g. Fioletov et al 2016b, Ialongo et al 2018). Two specific examples at a latitude comparable to the oil sands, shown in figure S3, are a copper smelter in Flin Flon, Manitoba, Canada, in which OMI captures the rapid decrease resulting from its decommissioning, and a nickel smelter in Thompson, Manitoba where both OMI and NPRI indicate an approximate 5% yr −1 decline.
From figure 4(a), up until 2013, OMI compares well with the reported emissions which suggests there are no significant systematic errors in the OMI emissions. This is important as the largest potential source of uncertainty in OMI emissions are systematic, and manifest as a relative error that is largely time independent. The decrease in reported emissions beginning in 2014 is not apparent in the OMI-estimated emissions. Rather, OMI emissions estimates show a   There is one additional source of multi-year SO 2 observations from the area: Pandora spectrometer observations from the Oski-Otin monitoring site in Fort McKay, roughly 20 km N/NW of the upgraders (see figure 1). When analyzed as a function of wind direction, as in figure S7, there is a clear peak in VCD only for winds originating from the general direction of the SML and SUN upgraders (Fioletov et al 2016a).   between P1 and P2. This might appear as evidence of the expected drop in emissions from the scrubbers except that (a) for the large majority (if not all) of P1 the scrubbers were on-line and (b) this appears to be sampling artifact based on the specific months the Pandora was operating for each period. Considering only the months of Pandora operation, the average monthly reported emissions were virtually unchanged between P1 and P2. By contrast, the average amount of bitumen produced at SML and SUN-generally an excellent proxy for emissionsdropped by 16% from P1 to P2, making it consistent with Pandora within uncertainties. Sampling the four continuous WBEA stations nearest the upgraders in the same way, there is a 19% decrease between P1 and P2. When annual totals or averages are considered, bitumen production was flat while the four continuous WBEA stations showed an increase (see, e.g. figure 2) from P1 to P2, substantiating the claim that Pandora sampling was an issue. All told, Pandora is able to corroborate that significant emissions originate only from the SML/SUN direction and is consistent with the other SO 2 observations, but given a lack of observations from the pre-scrubber period, it cannot specifically address the question of how emissions changed as a result.
While OMI is ideal to track the evolution of SO 2 due to its long data record, the TROPOMI satellite sensor, owing to its superior spatial resolution, is much better suited to help isolate the location of the current emissions. Figure 5 shows the TROPOMI 2018 (April-October) average SO 2 . Here SCDs are first presented as converting SCD to VCD requires an assumption about the location of the sources. Thus, SCD is preferred when there may be some uncertainty as to the source location(s). To better delineate the location of origin, averages were calculated considering only wind speeds (between 950 and 900 hPa) below the median value to limit how far the emitted SO 2 can travel before chemical transformation or deposition. The distribution of TROPOMI VCD, shown in figure 5(b), is very similar to SCD. This combination of higher resolution and low winds indicates the location of all significant sources of SO 2 to be isolated to the immediate vicinity of the Syncrude and Suncor upgraders, or somewhere in between, within roughly a 10 × 10 km 2 box, as shown in figure 5(c). This source area is completely consistent with the sector identified by the Pandora, also shown in figure 5(c).

Discussion
The atmospheric observations considered here, collectively, suggest that total SO 2 emissions in the surface mining region did not decline in 2014 as suggested by the emission reports, although there is mixed evidence that a more modest decline occurred around 2018. They also confirm that any sources of SO 2 are limited to a small region in the immediate vicinity of the SUN or SML upgraders. While, at present, no satisfactory explanation exists that reconciles the reported and top-down emissions, it is nonetheless worthwhile exploring the potential explanations.
One possible reason, differences in meteorological conditions, can be ruled out immediately as distributions of wind speeds and direction show no significant differences for the 2010-2013 and 2014-2017 periods. This is true when examining the meteorological reanalyses and the WBEA station data (see figure S5). The efficacy of the scrubbers is considered next. The majority, roughly 80%-90% (see figure 2), of the reported SML and SUN SO 2 emissions are from the stacks in which SO 2 has been scrubbed and monitored using CEMS, a direct and reliable measurement of stack emissions. According to monthly emission reports (AG 2016), the remaining SO 2 is emitted from flaring stacks in which the high temperature of emissions prevents the use of CEMS, and emissions estimates for these sources are consequently based on engineering estimates rather than direct observation, and have a higher degree of uncertainty. Since the SML CEMS indicates an approximate three-fold decrease in emissions, the scrubbers appear to be operating as expected, reducing SO 2 reaching the main, non-flaring stacks.
This decrease and the broad consistency between atmospheric observations and reported emissions up until 2014 suggests that any explanation must include some kind of transition towards higher emissions from non-CEMS sources in 2013-2014. One possibility is that a completely new source emerged around this time, with a magnitude sufficient to cancel whatever reductions were gained from the scrubbers. For example, tailings ponds are known to emit a mixture of air pollutants (Galarneau et al 2014), as do the surface mines themselves (Liggio et al 2016). However, this possibility seems unlikely given the required magnitude of the source, tens of kt yr −1 , and the need for an explanation for its absence prior to 2014 given similar types of activities in the mines before and after that time. Another non-CEMS source is flaring. This source cannot, at present, be discounted but yet there is no clear evidence pointing to this possibility.
It is reiterated that (a) both the surface and satellite SO 2 observations employed here agree in their trends before and after 2014 and (b) the CEMS monitors are reporting the SO 2 concentrations with significant reductions from upstream emissions. It is further assumed that (c) no new significant sources of SO 2 emerged in 2013-2014 and (d) that the process of upgrading contributes most of the facilities' SO 2 emissions (as evidenced by the regulatory requirement of scrubbers). If these statements are all valid, they suggest that a significant fraction of SO 2 upgrading emissions originate from a pathway not connected to the scrubbers.
Concentrations of SO 2 in the region are controlled by emissions, and hence both dry and wet total sulfur deposition. The latter are used to determine exceedances of critical loads, which are used to assess risk of potential future ecosystem damage associated with acidifying deposition (Makar et al 2018). A difference of 60% in emissions levels has a significant impact on the size of the region at risk of future ecosystem damage, as we demonstrate here. Previous work (Makar et al 2018) combined observationcorrected modeled deposition fields, and internationally established protocols for the creation of critical loads, to estimate the potential damage to downwind terrestrial and aquatic ecosystems associated with acidifying deposition. Exceedances of critical loads for aquatic ecosystems (a measure of the potential for eventual aquatic ecosystem damage, if emissions continue at a given level) were predicted for a large region (334 000 km 2 for aquatic ecosystems; approximately half of the size of the provinces of Alberta or Saskatchewan; see figure 6(a)). However, the study made use of 2013 emissions data, i.e. just prior to the large decrease in the reported SO 2 emissions. Using this combination of observationcorrected model output and critical loads, an estimate of the size of the area in exceedance resulting from a 40% reduction in emissions (the percent reduction between 2013 and 2016 in the reported emissions data) is shown in figure 6(b). The latter is an approximation with the assumptions that SO 2 surface concentrations (and hence SO 2 deposition fluxes) will be linearly proportional to SO 2 emissions levels, and that all SO 2 emissions in the model domain were reduced by 40%. The area in exceedance would decrease by 44.5% relative to 2013, a substantial reduction in potential environmental impacts associated with the emissions. However, if no change in emissions has occurred (as implied by the satellite-derived emissions and the surface concentration monitoring network data), then the area in exceedance of critical loads will be the same as predicted in figure 6(a).
Finally, we note that recent work examining sulfur uptake to multiple forms of vegetation in the oil sands region on a yearly basis between 2009 and 2016 is in accord with our results (Weider et al 2020). In that work, no decrease in sulfur uptake was observed between the years 2014 and 2016, despite the factor of two decrease in reported SO 2 emissions. We thus have three independent sources of information (satellite observations, surface concentration observation, and sulfur uptake in vegetation) which show no evidence of a decrease in SO 2 loading within the 2014-2016 time period.