Excess methane emissions from shallow water platforms elevate the carbon intensity of US Gulf of Mexico oil and gas production

Significance
Decisions on future energy production in the US Gulf of Mexico depend on climate impact assessments. We present an approach to calculate the carbon intensity of oil and gas production in the Gulf of Mexico using atmospheric observations of carbon dioxide and methane. We find that excess methane is emitted compared to government inventories. Platforms in shallow water have notably poor climate performance compared to either deep water or typical global oil production. Targeted shallow water mitigation measures for current or future production would have substantial climate benefits. The approach outlined here, including the use of observations of both greenhouse gases and attribution to both fossil products, could be widely applied to assess climate impacts of different production basins.


Background
Methane emissions from onshore oil and gas facilities are often intermittent and can change by orders of magnitude over the course of hours to days. For this reason, short duration measurements of these facilities can find widely different fluxes across observation periods (1). Such intermittency makes it harder to build an accurate emissions profile of a site. A chance quantification of an infrequent event could bias the estimated mean high. Conversely, infrequent sampling or small sample sizes may miss intermittent emissions that contribute disproportionately to the mean, biasing the estimated mean low. Multiple studies have discussed how the difficulty of resolving intermittency complicates comparisons between inventories and short duration observations (1-4).
Intermittent emission events are present offshore in the US Gulf of Mexico (GOM), especially at shallow water central hub facilities. Emissions have been documented to change by >1000 kg CH4/hr across consecutive days of observation (5,6). It is statistically unlikely that these are rare events (5). The sources of these events include cold venting, tanks, and unidentified sources (6). Ayasse et al. (2022) estimated that these sources are persistent where present, with frequencies of 0.75 for venting, 0.58 for tanks, and 0.65 for unidentified (6). Cold venting should be monitored by meters, but may not be fully accounted for due to faulty or absent meters. In some cases, operators may be under-reporting venting. This explanation is supported by a recent probe that found a prolific GOM operator to be venting in excess of regulations for years (7).
Are intermittent high emissions at central hubs actually more frequent than represented in the BOEM GOADS inventory? There are not enough observations per site to evaluate this question at the site-level. However, our large sample size allows us to evaluate this question at the sub-population-level and confirm whether we can adequately represent basin emissions. In the next sections, we a) evaluate whether intermittent emissions are accounted for in the BOEM inventory and b) explain how our method of aggregation to population-level emissions is robust without resolving site-level intermittency.

Comparison of Intermittent Fluxes between GOADS and Observations
Are intermittent emission events accounted for in the BOEM GOADS inventory? To answer this question, we compare hourly CH4 fluxes between observations and the inventory for the 34 federal water central hub facilities currently sampled and present in GOADS. GOADS reports emissions by emission process per piece of equipment per month. We can further separate these average monthly emissions into intermittent hourly fluxes using the hours the piece of equipment was active as reported in GOADS. Reference (3) used a similar approach to reconcile the GOADS inventory with boat measurements of CH4 from Gulf of Mexico platforms collected by Yacovitch et al. (2020) (8). They showed that using emissions reported at the site level to calculate average emissions could lead to overestimated annual fluxes. At the population level, however, the emissions distributions were relatively similar between GOADS and observations. Here we conduct a similar investigation using the first large sample of central hubs.
We generate 100 random hourly fluxes for 34 central hubs from a) the GOADS inventory for the month of October and b) 118 observations. We focus on October because the majority of our samples were collected during the fall deployment by Ayasse et al. (2022). Our results should not be sensitive to season, since there is little seasonal variability in total GOADS emissions and production operations are generally kept constant except in emergencies (e.g., hurricane shutdowns). Hourly emission rates for observations are generated by randomly choosing a sample of the facility and drawing a flux value from the normal distribution defined by that sample's estimated mean and standard deviation. These hourly intermittent emission rates can then be summed to generate a distribution of 100 total emission rates, each corresponding to a random hourly snapshot.
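The observation side of this simulation can be sketched roughly as follows. This is a minimal illustration, not the code used in the study: the function name, the facility IDs, and the (mean, standard deviation) pairs are hypothetical, the sum is assumed to run over facilities, and clipping negative draws to zero is an assumption the text does not state.

```python
import random

def simulate_totals(samples_by_facility, n_draws=100, seed=0):
    """Return n_draws simulated totals (kg CH4/hr) summed over facilities.

    samples_by_facility maps a facility ID to a list of (mean, sd) pairs,
    one pair per observation of that facility.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_draws):
        total = 0.0
        for samples in samples_by_facility.values():
            mean, sd = rng.choice(samples)   # pick one observation at random
            flux = rng.gauss(mean, sd)       # draw a flux from its normal distribution
            total += max(flux, 0.0)          # assumption: negative draws are clipped to zero
        totals.append(total)
    return totals

# Hypothetical observations: (mean, sd) in kg CH4/hr for three facilities.
hubs = {
    "hub_A": [(20.0, 5.0), (1200.0, 300.0)],
    "hub_B": [(5.0, 2.0)],
    "hub_C": [(60.0, 15.0), (40.0, 10.0), (800.0, 200.0)],
}
totals = simulate_totals(hubs)
print(len(totals), min(totals), max(totals))
```

Fixing the seed makes the draws reproducible. The inventory side would follow the same pattern with the GOADS-derived hourly fluxes in place of the normal draws.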
The fat-tail distribution of high hourly intermittent emission rates from the observations is poorly represented in the inventory. Figure S4 compares facility-level simulated intermittent emissions. High emission events are present in both observations and inventories, but are more common in observations. For example, emission events of >500 kg CH4/hr are far more frequent in the observations (15% of all hourly fluxes) than in the inventory (0.5% of all hourly fluxes). Figure S5 shows the same data as a probability density function of hourly fluxes. Both distributions are dominated by low values and have a fat tail of high emission events, but the probability of high emission events is higher in the observations.
The simulated total emissions from the sum of random hourly fluxes are higher in the observations than in the inventory. Figure S6 compares the histograms of total emissions. The total calculated from observations can vary widely, and the GOADS inventory sits at the very low end of this distribution. We ran a two-sample t-test to test whether the observation totals are statistically greater than the inventory totals. The results of the test (t(108) = 20, p < 2.2e-16) indicate that we can strongly reject the null hypothesis in favor of the alternative hypothesis that the observation total is greater than the inventory total.
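As an illustration of the test statistic, the sketch below computes t and the degrees of freedom for Welch's unequal-variance form of the two-sample t-test. The text does not say which variant was run, so the choice of Welch's form is an assumption, and the input totals are hypothetical. The p-value is omitted because it needs a t-distribution CDF, which the Python standard library does not provide.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t, df) for Welch's two-sample t-test of mean(a) - mean(b)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb                                   # squared standard error
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical simulated totals (kg CH4/hr), not values from the study.
obs_totals = [5200.0, 6100.0, 4800.0, 7300.0, 5900.0]
inv_totals = [2100.0, 2000.0, 2200.0, 2050.0, 2150.0]
t, df = welch_t(obs_totals, inv_totals)
print(round(t, 2), round(df, 2))
```

A large positive t with a one-sided alternative corresponds to the conclusion that the observation totals exceed the inventory totals.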
Independence of Method to Estimate Basin-level Emissions from Resolving Site-level Intermittency
Our method to aggregate emissions to the basin-level does not require us to resolve site-level intermittency. We resample from the distribution of observations for all platforms in an infrastructure category (see methods). This approach is agnostic to assumptions on the intermittency for a given site. Instead, intermittency is directly embedded with the observed distribution.
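A minimal sketch of this kind of category-level resampling is given below. It is not the study's implementation: the category names, flux values, and platform counts are hypothetical, and the percentile-based 95% interval is one assumed way to summarize the simulated totals.

```python
import random

def resample_basin_total(obs_by_category, n_platforms, n_sims=1000, seed=0):
    """Simulate basin totals by drawing one observed flux per platform, with replacement."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        total = 0.0
        for category, count in n_platforms.items():
            # Each platform in the category receives a random observed flux,
            # so intermittency enters through the observed distribution itself.
            total += sum(rng.choices(obs_by_category[category], k=count))
        totals.append(total)
    totals.sort()
    mean = sum(totals) / n_sims
    lo = totals[int(0.025 * n_sims)]        # assumed percentile interval
    hi = totals[int(0.975 * n_sims) - 1]
    return mean, lo, hi

# Hypothetical observed fluxes (kg CH4/hr) and platform counts per category.
observed_flux = {
    "central_hub": [0.0, 5.0, 12.0, 40.0, 900.0, 1500.0],
    "satellite": [0.0, 0.5, 1.0, 3.0, 20.0],
}
platform_counts = {"central_hub": 34, "satellite": 200}
mean, lo, hi = resample_basin_total(observed_flux, platform_counts)
print(mean, lo, hi)
```

No per-site emission frequency appears anywhere in the sketch, which is the sense in which the approach does not depend on resolving site-level intermittency.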
We make two assumptions to use this approach. First, it assumes that the platforms in a given category show similar emission behavior to one another. Figure S9 shows we likely meet this assumption since the CH4 distributions across studies for each platform category are most comparable within that category. The distribution for central hub facilities is extremely wide and it would be preferable to further separate this platform category into "high emitting facilities" and "low emitting facilities". To this end, we have explored whether any other characteristics could explain different emission rates between central hubs. Figure S11 compares emissions across some of the obvious explanatory traits: age, venting rates, and gas and oil production. None of these explains the variation across platforms. Therefore, aggregating by generic central hub design remains our best predictive trait.
Second, it assumes that we have gathered a representative sample of the real-world distribution of platform emissions. This is especially important for central hub facilities, where there is the widest variability in intermittent emission events (Figure …). This assumption requires that our sample is not biased toward high-emitting facilities. To check this, we resample the BOEM GOADS inventory for only the sites sampled and then compare the resulting distribution with that of the complete population in BOEM GOADS. Figure S8 shows that the resampled distribution matches the true distribution and Figure S7 shows that the total emissions are similar.
Replacing the temporal average of a site with a spatial average of the population is similar to the idea behind the Birkhoff Ergodic Theorem (9). The Ergodic Theorem posits that for a dynamical system where most points eventually revisit the set, the time average of one point will be the same as the average over the full space. In our case, this assumes that the temporal distribution of emissions over one site is the same as the spatial distribution of the population.
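For reference, the standard textbook statement of the theorem can be written as below. The symbols are the generic ones and are not defined in this study: T is an ergodic, measure-preserving transformation on a probability space (X, \mu), and f is an integrable function on X. Reading f as a site's emission rate and \mu as the population distribution is only the loose analogy described above.

```latex
\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f\left(T^{k} x\right)
  = \int_{X} f \, d\mu
  \qquad \text{for } \mu\text{-almost every } x \in X .
```

The left-hand side is the time average along the trajectory of a single point, and the right-hand side is the average over the full space.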
Proving that the distributions follow the Ergodic Theorem is difficult without actually comparing a temporal average with a spatial average. Nevertheless, previous work supports the use of the Ergodic Theorem for estimating aggregate basin emissions, even if site-level intermittency is not resolved (Chen and Sherwin et al. (2022) (10); (3)). The primary difference between their data set and ours is sample size, particularly for central hub platforms.

Production
We link sub-sea well production data to (1) production platforms and (2) central hub facilities. For federal waters, we gather production data from BOEM OGOR-A, available at https://www.data.boem.gov/Main/Default.aspx. We link well production to platform complex ID by using the BOEM borehole data set, which includes both well ID information and platform complex ID. This links the well to the first above-water facility that handles the volumes. For state waters, we use the offshore and coastal Enverus well data set. In cases where production is gathered by a central hub, we next aggregate production to this facility as an estimate of throughput. We associate satellite facilities with central hubs visually, using pipeline connections (Figure S17).
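The two-step linkage described above can be sketched as a pair of lookups. All identifiers and volumes here are hypothetical, the function name is illustrative, and the platform-to-hub mapping stands in for the visual pipeline association, which is a manual step in the source.

```python
from collections import defaultdict

def aggregate_throughput(well_production, well_to_platform, platform_to_hub):
    """Sum well production up to platforms, then roll satellite platforms up to hubs."""
    by_platform = defaultdict(float)
    for well_id, volume in well_production.items():
        platform = well_to_platform.get(well_id)
        if platform is not None:             # wells without a platform match are skipped
            by_platform[platform] += volume
    by_facility = defaultdict(float)
    for platform, volume in by_platform.items():
        # A platform that is not linked to a hub keeps its own production.
        by_facility[platform_to_hub.get(platform, platform)] += volume
    return dict(by_facility)

# Hypothetical well volumes and ID mappings.
well_production = {"W1": 100.0, "W2": 50.0, "W3": 25.0, "W4": 10.0, "W5": 5.0}
well_to_platform = {"W1": "P1", "W2": "P2", "W3": "P3", "W4": "H1"}
platform_to_hub = {"P1": "H1", "P2": "H1"}
throughput = aggregate_throughput(well_production, well_to_platform, platform_to_hub)
print(throughput)
```

Here the hub H1 receives its own well plus the two satellites, while P3 stays a standalone production platform.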

Gas Composition, Heating Values, and Joule Production
We estimate joules of energy produced as the sum of joules in natural gas and crude oil (equation S1). For crude oil, we use a higher heating value of 5.8 × 10^6 btu/bbl (11) (American Petroleum Institute 2021, Table 3-8). For natural gas, we consider the heat released from the combustion of both CH4 (the primary constituent) and other constituents (including ethane and propane). Ideally, we would use the measured gas composition of the Gulf of Mexico, but this is unavailable. Therefore, we use a generic higher heating value for unprocessed natural gas of 1,236 btu/ft^3 ((11), Table 3-8). This value is reported by the American Petroleum Institute (API) Compendium of Greenhouse Gas Emissions Methodologies for 2021 ((11), Table 3-8) and is used as an official generic heating value by the EPA for greenhouse gas calculations in the Code of Federal Regulations (12). This should correspond to what the API reports for the generic raw composition of gas (80% CH4 content by volume) ((11),
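The energy sum can be illustrated with a short calculation. This is a sketch under stated assumptions rather than a reproduction of equation S1: the two heating values come from the text, the btu-to-joule factor is a standard approximate conversion that the text does not quote, and the production volumes are hypothetical.

```python
BTU_TO_JOULE = 1055.06         # approximate joules per btu (assumed, not given in the text)
OIL_HHV_BTU_PER_BBL = 5.8e6    # crude oil higher heating value, from the text
GAS_HHV_BTU_PER_FT3 = 1236.0   # unprocessed natural gas higher heating value, from the text

def joules_produced(oil_bbl, gas_ft3):
    """Energy produced (joules) as the sum of the crude oil and natural gas terms."""
    oil_joules = oil_bbl * OIL_HHV_BTU_PER_BBL * BTU_TO_JOULE
    gas_joules = gas_ft3 * GAS_HHV_BTU_PER_FT3 * BTU_TO_JOULE
    return oil_joules + gas_joules

# Hypothetical volumes: 1,000 barrels of oil and one million cubic feet of gas.
energy = joules_produced(oil_bbl=1000.0, gas_ft3=1.0e6)
print(f"{energy:.3e} J")
```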

Loss Rates
Loss rates are estimated for (1) equivalent natural gas production that is lost and (2) equivalent joules of oil and gas production that is lost. Loss rates can be estimated two ways: loss/production or loss/(production+loss). While arguments can be made for both, we choose the first approach for two reasons. First, we believe it better contextualizes the efficiency of production. This is clearest in the extreme case where emissions exceed production (>100% loss rate). Using the second approach would lower the loss below 100%, even if the facility emits much more than it produces. Second, we are methodologically required to define the loss rate as the ratio of loss/production since our CH4 simulations multiply reported production by loss rates.
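The difference between the two definitions is easiest to see with numbers. The sketch below uses hypothetical values in arbitrary but matching units, and the function names are illustrative only.

```python
def loss_rate_vs_production(loss, production):
    """First definition (used here): loss / production. Can exceed 1."""
    return loss / production

def loss_rate_vs_throughput(loss, production):
    """Second definition: loss / (production + loss). Always below 1."""
    return loss / (production + loss)

# Hypothetical facility that emits more than it produces.
loss, production = 150.0, 100.0
rate_a = loss_rate_vs_production(loss, production)
rate_b = loss_rate_vs_throughput(loss, production)
print(rate_a, rate_b)  # 1.5 0.6

# Multiplying reported production by the first rate recovers the loss,
# which is why simulations that scale production need this definition.
recovered = production * rate_a
```

With the first definition the facility shows a 150% loss rate, while the second reports 60% for the same emissions.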
We estimate natural gas loss rate by first converting CH4 emissions to natural gas emissions (equation S2) and then finding the ratio of natural gas emissions to natural gas production (equation S3). This requires the CH4 composition of natural gas. We use a generic raw unprocessed natural gas composition of 80% CH4 by volume ((11), Table 5-1) since there is no available data in the Gulf (see more discussion in gas composition section). To estimate joule loss rates we convert CH4 emissions to joule emissions and then find the ratio of joule emissions to joule production (equation S4). We estimate joules emitted from the energy content of the estimated natural gas emitted (see equation S2). This includes the energy content of other constituents, besides CH4, in the natural gas (see section on gas composition, heating values, and joule production).

Figure S4.

Figure S7. Total CH4 emissions for the Gulf of Mexico estimated by inventories and observations. Totals are shown for years that correspond to the most recent inventory for federal waters (2017 BOEM GOADS inventory) and state waters (2019 reported in the 2021 EPA GHGI). Observationally informed emissions are shown as a mean and 95% confidence interval for the resampling of absolute flux rates approach (resampling approach A). As a check to see if the sites we resample are representative of the full population, we resample the inventory using the same approach and same sites we used to resample observations, displayed as "Resampled Inventory". Note, we scale the state water inventory upward using production to match with the production we define as state waters.

Figure S8. Cumulative distribution of emitting sites (top) and emissions (bottom) for federal waters. We show the distribution from the true BOEM GOADS 2017 inventory, resampled GOADS inventory using sites sampled in-situ, and resampled observations for 2017 using absolute flux rates (resampling approach A). Frequent values are darker and less frequent values are more transparent. The resampled inventory tracks with the true inventory, suggesting that there is no obvious bias introduced by the samples used for the stratified resampling approaches. The resampled observations do not track with the true inventory because of the presence of high emission rates that are unaccounted for in the inventory.

Figure S9. Average facility-level CH4 emissions by field deployment separated into broad platform categories. The corresponding production (joules of crude oil and natural gas) and equivalent percent of joules lost are included. Percentages of joules lost include instances that were equivalent to values above 100%.

Figure S10. Marginal, moderate, and high production are observed at emitting central hubs. We compare facility-level production at central hub facilities to average facility CH4 emissions by field deployment. The red line shows the low production category cut-off used by Omara et al. (2022) (≤ 15 boed averaged over the year) (13). It is possible that throughput is higher than we think for unknown reasons.

In Figure S14, we show yearly groupings of spud dates around four central hub platforms in federal waters. Drilling has occurred around these facilities for decades including in the last five years.

Figure S16. Example of how the three CH4 resampling methods simulate observations. Here we compare one of 1,000 simulations of CH4 emissions from state water central hub platforms to actual observations. A) Comparison for approach A showing a histogram of simulated emissions (top) and a histogram of daily observations (bottom). B) Comparison for approach B showing observed average facility natural gas loss rates for each field campaign and simulations. C) Comparison for approach C showing observed average facility joule loss rates for each field campaign and simulations. Loss rates are shown as a fraction of production and can reach values above 1.

Figure S17. Map showing how production is linked to central hub facilities. Federal water satellite platforms (white circles), which are already associated with well production via ID information in the BOEM borehole data set, and state water wells from Enverus (blue circles) are linked to central hubs (large red and orange circles) via visual association using pipelines (white lines).