Equipment leak detection and quantification at 67 oil and gas sites in the Western United States

RESEARCH ARTICLE

Introduction
Recent studies have utilized a variety of on-site and off-site measurement methods to characterize regional and site-level methane emissions from oil and gas infrastructure in the United States (US), including techniques such as direct measurement of individual sources (e.g. Allen et al., 2013; Kuo et al., 2015; Thoma et al., 2017), ground-based mobile surveys (e.g. Brantley et al., 2014; Robertson et al., 2017) and regional box-model flights (e.g. Schwietzke et al., 2017). In addition to the development of equipment or component-specific emission factors for inventory development, studies with on-site, direct measurements are important for informing off-site measurement approaches by identifying the actual sources of emissions (Vaughn et al., 2017), although some variation in site-level emission rate estimates would be expected when comparing off-site and on-site measurement approaches (Bell et al., 2017). Direct measurements can also be used to constrain possible source-specific emission rates for emission inventory development studies that use statistical distributions of emission rates rather than average emission factors to characterize regional emissions from the oil and gas sector (e.g. Zavala-Araiza et al., 2015; Allen et al., 2017).
For the onshore oil and gas production and gathering and boosting segments in the US EPA Greenhouse Gas Reporting Program (GHGRP), the most commonly-used emission estimation approach for equipment leak emissions (US EPA, 2017b) is based on a count of major equipment at the basin-level (e.g. number of separators) by operator, default component count estimates per piece of major equipment (e.g. number of valves per separator), and a default component average, or population, emission factor (e.g. standard cubic feet per hour [scfh] per valve). For 2017 reporting under the GHGRP, a second type of emission estimation method was available for the first time for sites that undertook a qualifying leak detection and repair (LDAR) survey and was based on a count of identified leaking components and a default emission factor per leaking component, or leaker emission factor (e.g. scfh per leaking valve). Both population and leaker emission factors for GHGRP reporting are based on a study (Hummel et al., 1996) that used field measurement data (API, 1993) from more than 20 years ago, which may not be representative of current equipment leak emissions in the oil and natural gas industry due to changes in equipment design, operation, and monitoring since the completion of that former work.
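As a rough illustration of the two GHGRP estimation approaches described above, the sketch below contrasts a population-based estimate (major equipment count times default components per piece of equipment times a population emission factor) with a leaker-based estimate (count of leaking components times a leaker emission factor). The class name, function names, and every numeric value are hypothetical placeholders chosen for illustration; they are not EPA default factors.

```python
from dataclasses import dataclass


@dataclass
class ComponentType:
    name: str
    per_equipment: float   # assumed default component count per piece of major equipment
    population_ef: float   # assumed scfh per component, leaking or not
    leaker_ef: float       # assumed scfh per component identified as leaking


def population_estimate(equipment_count, components):
    """Population approach: equipment count x components per equipment x population factor."""
    return sum(equipment_count * c.per_equipment * c.population_ef for c in components)


def leaker_estimate(leaking_counts, components):
    """Leaker approach: number of leaking components found in a survey x leaker factor."""
    return sum(leaking_counts.get(c.name, 0) * c.leaker_ef for c in components)


# Hypothetical example: 4 separators, with made-up factors for valves and connectors.
components = [
    ComponentType("valve", per_equipment=10, population_ef=0.1, leaker_ef=5.0),
    ComponentType("connector", per_equipment=50, population_ef=0.01, leaker_ef=1.0),
]
print(population_estimate(4, components))                        # population-based total, scfh
print(leaker_estimate({"valve": 2, "connector": 3}, components)) # leaker-based total, scfh
```

The population approach needs only equipment counts, whereas the leaker approach needs the results of a leak detection survey, which is why the two can diverge for the same site.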
A previous direct measurement study (Allen et al., 2013) conducted measurements of leaking components at 150 new gas well sites throughout the US. Based on a national extrapolation of the study data, Allen et al. (2013) estimated that equipment leak emissions were higher than the value reported in the US Greenhouse Gas Inventory (GHGI) for 2011. The Allen et al. (2013) study did not collect the component count metadata that would be necessary to make comparisons to current GHGRP emission estimation approaches, and it utilized a single approach, optical gas imaging (OGI), for identifying leaking components on well sites.
Modeling approaches (Kemp et al., 2016; Ravikumar et al., 2017; Ravikumar et al., 2018) have been developed to assess the effectiveness of different leak detection technologies for identifying leaking components on oil and gas sites. These studies have typically utilized distributions of the magnitude and frequency of equipment leaks compiled from direct measurement studies (e.g. Allen et al., 2013) to estimate the emission reduction potential of LDAR programs. Additional information on the magnitude and frequency of leaking components should help to improve such models and can serve as an independent validation of the appropriateness of model assumptions with respect to the effectiveness of different leak detection methods, such as OGI and FID, using side-by-side comparison data between methods under actual field and survey conditions.
This study focuses on direct measurement of equipment leaks from oil and natural gas production and gathering and boosting sites in the western US based on GHGRP geographic definitions. Equipment leaks are emissions from on-site piping and equipment components, such as valves and flanges. Equipment leaks accounted for more than 40% of industry-reported methane emissions from the US onshore production and gathering and boosting segments in 2016 (US EPA, 2017a) as part of the GHGRP. In addition, identification of leaking components is a typical feature of voluntary and regulatory LDAR programs. Such programs may include the use of optical gas imaging (OGI), handheld flame ionization detector (FID), and audio, visual, and olfactory (AVO) techniques to distinguish leaking components from those operating as designed. Previous studies have noted that emissions from other equipment on oil and gas sites, such as tanks or malfunctioning pneumatic controllers (Thoma et al., 2017), may be present and detectable with similar tools on some oil and gas sites. This paper, however, focuses on one category of emissions, equipment leaks, and does not characterize emissions from other potential oil and gas sources.
The purpose of this study is to compare existing EPA default emission and component count factors used in the GHGRP for the Western US to equipment leak emissions observed at oil and gas sites. The study also seeks to compare the in-field, side-by-side results of actual OGI and FID-based surveys for upstream oil and gas sites, a comparison that has not yet been reported in the peer-reviewed literature.

Basin and site selection
Equipment leak surveys and quantification measurements were conducted between June 2015 and December 2015 in four basins, as defined by the American Association of Petroleum Geologists (AAPG): Permian (AAPG Basin 430), Anadarko (AAPG Basin 360), Gulf Coast (AAPG Basin 220 that includes the Eagle Ford shale), and San Juan (AAPG Basin 580). Based on 2016 GHGRP data (US EPA, 2017c), operators in these four basins accounted for 47.2% of the total reported methane emissions from equipment leaks in the production and gathering and boosting segments. These four basins were also the top four basins nationally for reported equipment leaks, as shown in the Supporting Information (SI) Figure S1. Within these basins, eight companies volunteered sites for this study. No metadata was collected as part of this study to understand the fraction of production or well-counts represented by these volunteer companies.
Before the measurement team arrived on site, each participant company provided a list of assets in the basin, which were classified into four facility categories based on the types of major equipment, as defined in 40 CFR 98 Subpart W Table W-1B and Table W-1C, located on the site: well site, well production, central production, and gathering and boosting, as outlined in Table S1. While these four facility categories are not related to GHGRP reporting terminology, they were used to ensure that sites selected for this study had a variety of major equipment and levels of hydrocarbon processing. Before arriving in the basin, the GHD Services Incorporated (GHD) field team selected measurement sites from the asset list and communicated the selections (typically 5 primary sites and 5 back-up sites, in case the primary sites were not accessible) to the company field office approximately two weeks before sampling occurred so that arrangements could be made to provide operational support for those sites. Attempts were made to select sites from the range of facility categories that were present in the company's assets in the basin. Some sites that were selected were geographically clustered to minimize the driving time between sites and maximize the number of measurements that could be undertaken while the GHD team was in the field.

Onsite detection and measurement methods
For each site, FID and OGI-based equipment leak detection surveys were conducted independently by different trained GHD field technicians to minimize the bias in leak determination between the two methods. Typically, the OGI survey was started while the FID was being calibrated, and the OGI technician noted the location of leaks but did not identify the leaks until after the FID-based survey was completed in the same area of the facility. OGI was conducted with a FLIR Model GF-320 infrared (IR) camera, as outlined in 40 CFR §60.18 (g) work practice for identifying fugitive emissions. Both conventional and high sensitivity modes on the FLIR Model GF-320 camera were used, based on the experience of the trained camera operator.
FID-based surveys were conducted with a Thermo Scientific TVA-1000B that was calibrated with methane following procedures outlined in 40 CFR 60 Appendix A, Method 21. For the FID-based surveys, the probe was moved around the seal surface of each component until the maximum concentration value was located, and any maximum concentration reading that exceeded 500 parts per million (ppm) for a component was tagged and designated for additional quantification measurements. With FID-based approaches, different ppm thresholds can be used to distinguish a component that is leaking and designated for repair from a component with a screening value below the repair threshold for the program. Unless otherwise stated in this work, a 500-ppm leak threshold was used to take a conservative approach to leak identification. Additional details on instrument calibration and field procedures are outlined in the SI.
For each leak that was identified by OGI or FID, the emission rate of whole gas was quantified using a high-volume sampler. Briefly, measurements with this technique involve vacuum-sampling from a nozzle or bag loosely fitted over the emission source. The sample consists of the gas from the leak and any ambient air drawn into the enclosure, and its combustible gas concentration is measured with either a catalytic oxidation (0-5% hydrocarbon gas) or a thermal conductivity (5-100% hydrocarbon gas) detector. The combination of gas concentration and sample flow measurements is used to compute the gas emission rate. Details on high-volume sampler operation are available in the SI and references such as Lamb et al. (2015) and Thoma et al. (2017). In this study, both the commercial Bacharach high-volume sampler and a custom Indaco high-volume sampler were used for leak measurements. Whole gas emission rates were determined by adjusting the measured combustible gas concentration by the high-volume sampler response to site-specific gas compositions, which were provided by the participant companies based on their most recent gas analysis for the site. This was intended to follow protocols that companies would use for greenhouse gas emission reporting, which do not include compositional analysis for each leak or process stream on a site. Instrument-specific response factors to methane, ethane, propane, and butane were developed by GHD and are outlined in the SI. If the site gas composition was not available, the average gas composition from other sites in this study in the same AAPG basin was used. The average instrument response factor used in this study was 1.028; individual values varied based on the site gas composition and the high-volume sampler or back-up detector that was used for the concentration measurement in the emission calculation.
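The calculation described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's actual procedure (which is documented in the SI): it assumes the whole gas emission rate is the sample flow multiplied by the measured gas fraction, that the response factor is applied as a multiplier, and that the detector changeover sits exactly at 5% gas. The function names and the example flow rate are hypothetical.

```python
def detector_for(concentration_pct):
    """Pick the detector range by measured gas concentration (percent by volume).

    The 0-5% and 5-100% ranges come from the text; treating exactly 5% as the
    thermal conductivity range is an assumption made for this sketch.
    """
    if not 0 <= concentration_pct <= 100:
        raise ValueError("concentration must be between 0 and 100 percent")
    return "catalytic oxidation" if concentration_pct < 5 else "thermal conductivity"


def whole_gas_rate_scfh(sample_flow_scfh, concentration_pct, response_factor=1.0):
    """Assumed form: flow x gas fraction x response factor (multiplier is an assumption)."""
    return sample_flow_scfh * (concentration_pct / 100.0) * response_factor


# Hypothetical reading: 300 scfh sample flow at 2% combustible gas,
# adjusted with the study-average response factor of 1.028.
print(detector_for(2.0))
print(whole_gas_rate_scfh(300.0, 2.0, response_factor=1.028))
```

A real implementation would also need to account for background gas concentration and the instrument-specific, composition-dependent response factors described in the SI, both of which are omitted here.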
During the field work for this study, a leak was discovered at a cracked fitting on a glycol pump for a dehydration system. This leak was identified by sound and a buildup of ice on the system piping. The presence of gas in the area surrounding the leak, which exceeded the lower explosion limit (LEL) action level, required the crew to stay away from the leak, and the source was determined to be unsafe to quantify. Since this emission source would have been identified and quickly repaired by operations personnel during any site visit due to the audible and visual evidence for the emission source as shown in Figure S4, the estimated emissions from the source are not included in the study estimates for equipment leak emissions that focused on components, such as valves and flanges. Additional descriptions of the source and upper bound emission calculations are presented in the SI.
Recent concerns have been raised in the literature (Howard, 2015; Howard et al., 2015) about the performance of the Bacharach high-volume sampler during field campaigns at oil and natural gas production sites. For this study, an augmented protocol was utilized to ensure that the instrument was performing accurately and transitioning properly between the two measurement modes on the device. The augmented protocol included at least daily calibration and the use of a secondary hydrocarbon detector in the high-volume sampler exhaust to verify the accuracy of the hydrocarbon sensors and the transition between catalytic oxidation and thermal conductivity modes. More information on the augmented high-volume sampler protocol is available in the SI.
In addition to collecting information on the number of leaking components and their associated emissions for each site, the GHD field team also conducted an actual component count (i.e. number of valves, flanges, etc.) and assigned each component to a major piece of equipment (i.e. wellhead, separator, etc.). A database with study component counts, component count assignments to on-site equipment, and information on leaking components is available as described in the Data Accessibility Statement. The study team did not collect metadata information on the existence of voluntary or regulatory LDAR programs at the sites.

Sampled population
Component count surveys were conducted at a total of 65 sites (41 gas and 24 oil, as defined by the site operator) with 83,960 components inventoried. Leak detection surveys were conducted at 67 sites with OGI and at 65 sites with FID. For two sites (GHD0011 and GHD0012), an FID-based detection survey was not conducted due to time constraints for the GHD field team. Hydrocarbon leaks were identified, by one or both methods, and measured at 52 sites. No leaks were identified at 15 of 67 sites surveyed with OGI, and no leaks were identified at 13 of 65 sites with the FID. The surveyed population of components at the 65 sites included connectors (70%), flanges (15%), valves (14%), open-ended lines (OEL) (1%), and pressure relief valves (PRV) (1%), including components in both gas (75%) and oil (25%) service. More information on the classification of sites included in each of the four basins is available in Table S2.

Results and discussions

Leaks detected and method comparisons
For the 67 sites in this study and 83,960 total components screened, a total of 331 components were identified as leaking during on-site leak detection surveys by either visual identification with OGI, an FID screening value of 500 ppm or greater, or both methods. The population of leaking components in this study had whole gas emission rates that ranged from 0.006 scfh to 83.6 scfh. The study average whole gas emission rate per leaking component was much higher (4.4 scfh) than the median (0.3 scfh), suggesting that the distribution of leaks was skewed, and the top 10% of individual leaking components accounted for 73% of the measured emissions in the study. As shown in Table S6, connectors (flanged and threaded) and valves represented 62% and 16%, respectively, of the total leaks detected by any emission survey method.
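The gap between the mean and the median, and the share of emissions carried by the largest leaks, are the usual indicators of a skewed distribution. The sketch below shows one way such summary statistics might be computed from a list of measured leak rates. The function name and the sample data are hypothetical, and the rounding rule used to size the "top 10%" group is an assumption, not the study's stated method.

```python
import statistics


def skew_summary(rates_scfh, top_fraction=0.10):
    """Return mean, median, and the share of total emissions from the largest leaks."""
    ordered = sorted(rates_scfh, reverse=True)
    # Size of the top group; rounding to the nearest whole leak is an assumption.
    n_top = max(1, round(len(ordered) * top_fraction))
    total = sum(ordered)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "top_share": sum(ordered[:n_top]) / total,
    }


# Made-up data: nine small leaks and one large leak.
example = [0.1] * 9 + [9.1]
print(skew_summary(example))
```

In this made-up example the single largest leak carries most of the total, and the mean sits well above the median, which is the same qualitative pattern reported for the study data.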
Summary statistics shown in Table S7 for the leaking components identified at the 67 sites indicate that leaks detected in oil and gas service had similar emission characteristics. For the 65 sites in this study which had component counts, a higher percentage of components in gas service (0.49%) were identified as leaking than components in oil service (0.11%). These two findings suggest that leaking components identified in oil service were similar to leaking components identified in gas service, but a smaller fraction of components in oil service had emission rates that reached a detection threshold for OGI or FID-based surveys.
In this study, OGI and FID-based surveys for equipment leaks identified different populations of leaking components. The two methods overlapped identification for 23% of the total count of leaking components and 56% of the total emissions from identified components. As shown in Figure 1 for sites with both survey methods present, OGI identified 33% of leaking components while FID-based survey methods identified 90% of leaking components when utilizing a 500-ppm leak definition threshold. However, after quantification measurements were completed on each of the components that were identified as leaking by either method, OGI identified similar overall emissions (80%) to the FID-based survey (79%) at a 500-ppm threshold for sites on which both survey methods were conducted. The comparison between OGI and FID-based surveys was broadly consistent with a previous report (Concawe, 2015) that compared the performance of these two leak detection methods in oil refinery applications.
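The count-based percentages above follow a simple set relationship: each leaking component was found by OGI only, by FID only, or by both. A minimal sketch of that bookkeeping is shown below; the function name and the component identifiers are hypothetical, and the example data are constructed only to mirror the rounded percentages reported above.

```python
def detection_shares(ogi_ids, fid_ids):
    """Fractions of all identified leaks found by each method, alone and together."""
    ogi, fid = set(ogi_ids), set(fid_ids)
    n = len(ogi | fid)  # every leak identified by at least one method
    return {
        "ogi": len(ogi) / n,
        "fid": len(fid) / n,
        "both": len(ogi & fid) / n,
        "ogi_only": len(ogi - fid) / n,
        "fid_only": len(fid - ogi) / n,
    }


# Constructed example with 100 leaks: 33 seen by OGI, 90 by FID, 23 by both.
shares = detection_shares(range(0, 33), range(10, 100))
print(shares)
```

With the rounded values reported above (33% by OGI, 90% by FID, 23% by both), the same arithmetic implies roughly 10% of leaking components were found by OGI alone and roughly 67% by FID alone.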
The distribution of leaking component emission rates identified in this study is shown in Figure 2. Among the 17 components in the top 5% of leaks in the study, 4 were identified uniquely by OGI, and these components represented 14% of total emissions measured in the study. By contrast, 67% of measured leaks had emission rates of less than 1 scfh, and most of these leaks (83%) were identified uniquely by the FID survey using a 500-ppm threshold for leak definition. Emissions from these 184 components, however, only accounted for 3.3% of the total emissions from leaking components that were measured in the study. Summary statistics shown in Table S8 indicate that OGI and FID-based detection methods may be identifying different populations of components as leaking. For all monitored components in oil and gas service, respectively, OGI identified a smaller percentage of leaking components (0.02% and 0.17%) than FID-based methods (0.11% and 0.43%) across the sites in this study with a component count.
The types of leaks identified in this work by OGI and FID-based leak detection methods indicate that FID-based approaches may identify more leaking components overall, but many of the additional leaks detected with an FID will have emission rates on the lower end of the leak distribution that may not, in aggregate, constitute a significant percentage of total equipment leak emissions from a site. Conversely, OGI-based surveys may uniquely identify some leaking components that are on the upper-end of the overall leaking component distribution and may have a larger emission reduction potential upon repair than the lower volume leaks that were identified with FID-based methods alone. There are several potential explanations for identification of leaks with OGI and not with FID. Since OGI surveys are broader in scope than component-specific FID surveys, OGI may identify emissions from components that are not included in FID surveys or from elevated locations on a site. In addition, the relative geometry of a leak location and the FID probe location at the component may, at times, involve sufficient dispersion such that the FID reading falls below a repair threshold for the program. This study was designed to compare the performance of the two leak screening methods under real field conditions and not to fully assess the differences as to why certain leaks were detected or not by individual methods.
Figure 1 shows the number (left) and emission rate (right) of equipment leak emissions detected in the field campaign using optical gas imaging (OGI), flame ionization detection (FID) following EPA Method 21 with a 500 parts per million (ppm) threshold, or both detection methods. Results indicate that the FID-based surveys identified a larger count of leaks but that OGI-based surveys detected a similar percentage of overall emissions. Emissions and leak counts labeled in yellow were from two sites where only OGI emission surveys were undertaken. DOI: https://doi.org/10.1525/elementa.368.f1
Different LDAR programs have established a range of thresholds for which an FID reading is considered a leak that should be designated for repair. Comparisons of FID-based survey results are presented in Figure 3 on a leak count and emission basis, relative to the performance of OGI for the leaks identified in this study. For this study, regardless of the leak definition threshold (between 500 and 10,000 ppm) that is selected for the FID-based survey, leaking components identified by FID readings were always greater in count but lower in cumulative emissions than leaking components identified by OGI. This suggests that the FID and OGI-based surveys conducted in this work identified fundamentally different populations of leaking components and that it would be difficult to establish a simple ppm leak definition threshold that implied equivalency between the two methods.
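The threshold comparison described above can be sketched as a sweep over candidate leak definitions. In the sketch below, the record layout (`fid_ppm`, `ogi`, `rate_scfh`), the function name, the intermediate threshold values, and the example data are all hypothetical; only the 500 ppm and 10,000 ppm endpoints come from the text.

```python
def fid_vs_ogi_ratios(leaks, thresholds=(500, 1000, 2000, 5000, 10000)):
    """For each FID threshold, return (count ratio, emission ratio) relative to OGI.

    Each leak is a dict with an FID screening value in ppm (None if the component
    was not flagged by FID), an OGI detection flag, and a measured rate in scfh.
    """
    ogi = [leak for leak in leaks if leak["ogi"]]
    ogi_count = len(ogi)
    ogi_emissions = sum(leak["rate_scfh"] for leak in ogi)
    ratios = {}
    for ppm in thresholds:
        fid = [leak for leak in leaks
               if leak["fid_ppm"] is not None and leak["fid_ppm"] >= ppm]
        ratios[ppm] = (len(fid) / ogi_count,
                       sum(leak["rate_scfh"] for leak in fid) / ogi_emissions)
    return ratios


# Made-up records for illustration only.
example = [
    {"fid_ppm": 600, "ogi": False, "rate_scfh": 0.2},
    {"fid_ppm": 800, "ogi": False, "rate_scfh": 0.3},
    {"fid_ppm": 12000, "ogi": True, "rate_scfh": 5.0},
    {"fid_ppm": 300, "ogi": True, "rate_scfh": 20.0},
]
for ppm, (count_ratio, emission_ratio) in fid_vs_ogi_ratios(example).items():
    print(ppm, round(count_ratio, 2), round(emission_ratio, 2))
```

A count ratio above 1 paired with an emission ratio below 1 at a given threshold corresponds to the pattern described above, where FID flags more components but OGI captures more of the emissions.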

Number of leaks identified per site in surveys
Information on the number of leaking components identified per site with different leak detection approaches is important for understanding the effectiveness of those approaches, for modeling the cost effectiveness of different LDAR program options, and for informing models (Kemp et al., 2016; Ravikumar et al., 2017; Ravikumar et al., 2018) that estimate the frequency and size of different types of fugitive emission sources, including equipment leaks, at oil and gas sites in the US.
On average for the sites that were surveyed in this study, OGI and FID-based survey methods using a 500-ppm threshold identified 1.7 and 4.5 leaks per site, respectively. As shown in Figure 1, OGI surveys identified a higher volume of emissions from site equipment leaks. The number of leaking components identified per site with OGI in this study (1.7) is comparable to previous emission detection surveys conducted with OGI as part of the Allen et al. (2013) study, which found 1.9 leaks per site. Notably, the distribution of leaks detected per site with OGI is also a skewed distribution, which is a commonly-observed feature from direct measurement studies of oil and gas production sources (e.g. Allen et al., 2015). As shown in Figure S5, 45% of sites did not have any leaking components identified by OGI while 7% of total sites accounted for 42% of the total leaks identified by OGI. A weak correlation (R² = 0.41) was observed between the component count at a site and the number of leaking components identified on that site by OGI, as shown in Figure S6. This suggests that additional factors besides site size influenced the number of leaking components present during the study leak detection surveys in this work.
Figure 2 shows the emission rate in standard cubic feet per hour (scfh) that was measured with the high-volume sampler for each leak identified by one or both emission detection methods. Leaking components identified only by OGI techniques tended to be higher on the overall distribution of measured equipment leaks in the study while those identified uniquely by FID surveys tended to be lower in emission magnitude. DOI: https://doi.org/10.1525/elementa.368.f2
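For readers who want to reproduce a correlation statistic of the kind reported above from the per-site data, the sketch below computes the coefficient of determination for a simple linear relationship between two variables. It assumes the reported value is the square of the Pearson correlation coefficient, which the text does not state explicitly; the function name and example data are hypothetical.

```python
def r_squared(x, y):
    """Square of the Pearson correlation between x and y (simple linear fit)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Made-up per-site data: component counts and OGI-identified leak counts.
component_counts = [1, 2, 3, 4]
ogi_leak_counts = [1, 3, 2, 4]
print(r_squared(component_counts, ogi_leak_counts))
```

A value well below 1 means that site size alone leaves much of the site-to-site variation in leak counts unexplained.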

Comparison to current EPA emission estimation approaches
In the United States, the most common equipment leak emission estimation method in the GHGRP involves population-average emission factors for components and default average component counts by major equipment. Due to the location of the four basins in this study, all comparisons in this work were to default values for the western United States rather than the eastern United States (US EPA, 2017b) for gas and light oil services. Note that some components identified as leaking, such as pressure regulators, did not have an exact match to components outlined in GHGRP reporting. The emissions from these sources, however, are included in site-level emission estimates developed in this study. Figure 4 and Table S9 compare site-level emission estimates for equipment leaks based on major equipment counts and the actual measured emissions from all identified leaking components at the site level. For the 65 sites in the study for which a component count was completed, major equipment count-based estimates (Figure 4) would have predicted aggregate equipment leak emissions of 2241 scfh; however, direct measurements of all identified leaking components from FID and OGI-based surveys indicate that actual emissions from equipment leaks were 36% lower with total emissions of 1433 scfh. As a sensitivity analysis, each leaking component at the 65 sites was randomly assigned an emission rate value from the study leak distribution. Over 1000 trials, the 95% interval for the measured equipment leak emissions ranged between 1127 scfh and 1780 scfh, and none of the trials exceeded the value from application of major equipment counts and GHGRP factors for the 65 sites.
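The headline comparison above is straightforward arithmetic: (2241 - 1433) / 2241 is approximately 0.36, or 36% lower. The sensitivity analysis can be sketched as a resampling exercise, shown below. This is a rough illustration only: it assumes emission rates are drawn with replacement from the pooled list of measured leak rates and that the 95% interval is read from the 2.5th and 97.5th percentiles of the trial totals. The function names, the seed, and the example data are hypothetical.

```python
import random


def resampled_totals(measured_rates, n_leaking, trials=1000, seed=0):
    """Total emissions for each trial when every leak gets a randomly drawn rate."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = [sum(rng.choices(measured_rates, k=n_leaking)) for _ in range(trials)]
    return sorted(totals)


def percentile_interval(sorted_totals, level=0.95):
    """Simple rank-based interval taken from already sorted trial totals."""
    n = len(sorted_totals)
    lower = sorted_totals[int((1 - level) / 2 * n)]
    upper = sorted_totals[min(n - 1, int((1 + level) / 2 * n))]
    return lower, upper


# Made-up leak rates in scfh; a real analysis would use the measured study distribution.
rates = [0.05, 0.1, 0.3, 0.3, 0.8, 1.5, 4.0, 12.0, 40.0]
totals = resampled_totals(rates, n_leaking=50)
print(percentile_interval(totals))
print(round((2241 - 1433) / 2241, 2))  # fraction by which measured emissions were lower
```

Because the leak distribution is skewed, the width of the interval is driven mainly by how often the few largest rates are drawn in a given trial.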
As shown in Figure 4, there is some variation on an individual site basis with 11 sites (17% of the sites in the study sample) exceeding the emission estimate from major equipment count-based approaches when all major equipment on site is counted. Most sites (54 of 65) had measured equipment leak emissions below those predicted by methods based on site-level major equipment counts. The two sites that had leak detection surveys and quantification but no major equipment or component counts had measured total equipment leak emissions of 7.3 scfh and 0.6 scfh. Having a component count for those sites would not be expected to significantly affect the results of this study.
Analysis of average population emission factors per component surveyed (Table S10) and component counts (Table S11) gives insight into a likely explanation for why equipment leak emissions measured in this study are lower than would have been predicted by the most common emission estimation method used in the GHGRP. Over the 65 sites in the study, the total number of components counted was 55% higher (Table S11) than would have been predicted with default EPA component count estimates. Component count estimates per major equipment are also presented in Table S12 and show a similar trend to aggregate components from across all sites in the study. EPA leaker emission factors in the GHGRP, by contrast, are lower than those developed from data in this study (Tables S13 and S14), especially for components in gas service. For the 331 leaking components identified in this study, the total measured emissions were 1441 scfh compared to an estimate of 859 scfh that would have been calculated with current EPA leaker emission factors in the GHGRP for leaking components in oil and gas services. Furthermore, the population-average emissions factors per component (Table S10) in this study, except for OELs in gas service, were much lower than values in the default EPA emission factors. As shown in Table S13 for gas service, the leaker emission factors from this study are lower in magnitude than previous EPA analysis (EPA, 2016) of data from Allen et al. (2013) and the Fort Worth Air Quality Study (City of Fort Worth, 2011). A sensitivity analysis was conducted using emission factors (EPA, 1995) for non-leaking components in light oil and gas service. Non-leaking components would add an estimated 48.9 scfh (3.4%) to the equipment leak measurements from the study that were made on components identified as leaking. This impact would likely be an upper bound since the emission factors were developed for components with an FID screening value of less than 10,000 ppm, while this study measured all components with a screening value of 500 ppm or greater. Inclusion of emissions from the non-leaking components does not explain the observed difference between the results of this study and current estimates from GHGRP methodologies.
Figure 3: Ratio of leak count and emissions for different FID-based leak definition thresholds compared to OGI. Figure 3 displays, for the 50 sites with measured leaks and both survey techniques, the ratio of the count of leaking components as well as the associated emissions by FID-based surveys that would have been identified, relative to those identified with OGI, based on the range of leak detection definitions in parts per million (ppm) that are commonly-adopted for LDAR programs. Results show that OGI surveys for the sites identified greater emissions but fewer total leaks than FID-based surveys with different leak threshold definitions. DOI: https://doi.org/10.1525/elementa.368.f3
In this study, leaking components occurred less frequently than leaking components in the previous study (Hummel et al., 1996) on which current EPA equipment leak emission factors are based. The Hummel et al. (1996) study is based on a prior American Petroleum Institute (API) publication (API, 1995) that published total component and leaking component counts based on FID surveys that identified ' emitters' (10-9,999 ppm) and 'leakers' (≥10,000 ppm) at a total of 24 industry sites including four light crude and four gas production sites with 48,652 and 32,534 screened components, respectively. In the prior API study, 0.67% of components at onshore crude production sites and 1.61% of components at gas production sites had an FID screening value that  Figure 4 compares measured equipment leaks from identified leaking components at sites surveyed in this study to a default major equipment count-based approach that is typically utilized for industry-reported emissions data under the GHGRP. The gray area indicates site-level equipment leak emissions where measured emissions exceeded estimates, while the blue area indicates sites for which the default approach would have overestimated emissions. Calculated emissions are based on site major equipment counts and population-average emission factors for components. Overall measured emissions were 36% lower across all sites than would have been predicted from using EPA's major equipment count methodology. DOI: https://doi.org/10.1525/elementa.368.f4 was 10,000 ppm or greater. More broadly, 1.77% of all screened components at the eight oil and gas production sites had an FID reading greater than or equal to 500 ppm. By contrast, only 0.17% and 0.35% of components screened in this study had an FID reading that exceeded 10,000 ppm and 500 ppm, respectively, and 0.39% of components overall were identified as leaking with OGI or an FID reading exceeding 500 ppm (Table S15). 
This study, however, was not designed to understand the extent to which factors such as existing LDAR programs, improved facility design, or differences in emission detection technology deployment may be driving this lower frequency of observed leaking components. Similarly, a California study of equipment leaks at natural gas sites (Kuo et al., 2015) identified 378 leaking gas service components (over a 100 ppm leak threshold), or 0.47% of the 80,423 nonwelded components screened. That frequency is similar, but not directly comparable, to the results of this study, since components in that work spanned the natural gas value chain from production through distribution and were screened with an initial leak detection method that differed from the methods used in this study.
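The leak frequencies quoted above are simple ratios of leaking components to screened components. As a minimal sketch (the function name is illustrative and not part of the study's methods), the Kuo et al. (2015) figure can be reproduced from the counts given in the text:

```python
def leak_frequency_percent(leaking_components, screened_components):
    """Return leaking components as a percentage of screened components."""
    if screened_components <= 0:
        raise ValueError("screened_components must be positive")
    return 100.0 * leaking_components / screened_components


# Counts reported in the text for Kuo et al. (2015), 100 ppm leak threshold.
kuo_frequency = leak_frequency_percent(378, 80_423)
print(f"Kuo et al. (2015) leak frequency: {kuo_frequency:.2f}%")
```

Frequencies computed this way are only comparable between studies when the leak definition threshold and the screening method match, which is why the comparison above is made with caution.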

Study uncertainty
While direct measurement studies are important to understand potential emission rates associated with certain types of equipment on oil and gas sites, there are several sources of uncertainty for this study. As noted in Allen et al. (2013), direct measurement study sample sizes tend to be small relative to the national population of sources. For example, there are more than one million active oil and gas wells (US EPA, 2017a) in the United States, and the 67 sites in this study make up only a small fraction of all sites. While the study team attempted to gather data from a variety of site types, this study does not claim that its results are nationally representative. Other sources of uncertainty include the measurement techniques, the representativeness of single emission snapshots at the visited facilities, and the statistics around the underlying skewed distribution for the emission sources measured in this work. Quantifying values for such uncertainty is beyond the scope of this study. It should be noted that the study (Hummel et al., 1996) that underlies current emission factors in the US would have similar questions around uncertainty. In addition, this study was not designed to understand the extent to which factors such as prior LDAR surveys on the sites, improved facility design, or differences in emission detection technology deployment may explain differences in emission estimates from this study compared to current US emission factors.

Comparison to other direct measurement studies
The most appropriate comparisons for the equipment leak measurements in this study are with previously published studies that provide disaggregated information for leaking components, since this study did not make measurements on other on-site sources, such as tanks or liquid unloading, that would be necessary to compare results to studies that report site-level emissions (e.g. Bell et al., 2017).
Figure 5 compares the equipment leaks that were identified with OGI in this field campaign to measured equipment leaks from the Allen et al. (2013) study. That study also employed OGI-based approaches to identify leaking components, and the emission rates from all identified leaking components were quantified. Other studies that have been used in numerical simulations of fugitive emissions (Kemp et al., 2016) were not selected for comparison since they typically relied on FID-based approaches to identify leaking components. As shown in Figures 2 and 3, FID-based screening likely identifies a fundamentally different set of leaking components than OGI, which is the most common method used in industry LDAR surveys at oil and gas sites in the United States. In addition, several other field studies were not selected for comparison because they involved different instrumentation for leak identification (Kuo et al., 2015) or because the leak detection survey only included a subset of the components at each site (City of Fort Worth, 2011).
In general, the distribution of equipment leaks identified in this study by OGI had higher emissions per leaking component than the equivalent percentile of leaks identified in Allen et al. (2013), except for the two highest emitting components in that 2013 field campaign, as shown in Figure 5. The largest two emission sources that were identified in the Allen et al. (2013) field campaign were both located in the Appalachian region in the northeastern United States, which is not one of the four basins where measurements were made in this study. As a result, a sensitivity analysis is also shown in Figure 5 that compares the percentiles of leaking components identified with OGI in this study with only the data sets from the Midcontinent, Gulf Coast, and Rocky Mountain regions in the Allen et al. (2013) study. Comparing the regionally consistent datasets between the two studies, the maximum measured individual leaking component had a similar order of magnitude [78 scfh in Allen et al. (2013) and 84 scfh in this study], but the average emission per leaking component identified by OGI is larger in this study (10.3 scfh) than in that previous field campaign (4.7 scfh).

High-volume sampler measurement
Results from this field campaign indicate the importance of using the augmented protocol with a secondary verification instrument in the high-volume sampler exhaust. In this study, the GHD field team noted six instances in which the Bacharach high-volume sampler failed to transition between measurement modes, and all instances occurred on sites with less than 90% methane content in the sales gas, as suggested by Brantley et al. (2015). These six measurements represent 2.1% of overall equipment leak measurements in the study with the Bacharach high-volume sampler and 2.4% of equipment leak measurements with the Bacharach high-volume sampler on sites with less than 90% methane content in the provided gas composition. For these six measurements, the average emission rate increased by 10.3 scfh per leaking component with the use of the hydrocarbon concentration from the Heath DPIR backup detector. If the uncorrected high-volume sampler concentration readings for those six measurements had been used, overall emissions from equipment leaks in the study would have been 1379 scfh, or 4.3% less than the 1441 scfh estimated from all leaking components measured in this study.
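The arithmetic behind this sensitivity can be checked directly from the totals quoted above. The following is a minimal sketch that uses only the rounded values reported in the text; the variable and function names are illustrative and are not taken from the study's data files:

```python
# Study totals reported in the text, in standard cubic feet per hour (scfh).
CORRECTED_TOTAL_SCFH = 1441.0    # using the backup detector concentration
UNCORRECTED_TOTAL_SCFH = 1379.0  # using uncorrected sampler readings
FAILED_TRANSITIONS = 6           # measurements with a transition failure


def percent_lower(reference, value):
    """Return how far `value` falls below `reference`, in percent."""
    return 100.0 * (reference - value) / reference


# Emissions that would have gone unaccounted for without the backup detector.
unaccounted_scfh = CORRECTED_TOTAL_SCFH - UNCORRECTED_TOTAL_SCFH
# Average correction per affected measurement.
per_measurement_scfh = unaccounted_scfh / FAILED_TRANSITIONS
# Relative shortfall of the uncorrected study total.
shortfall_percent = percent_lower(CORRECTED_TOTAL_SCFH, UNCORRECTED_TOTAL_SCFH)

print(f"Unaccounted emissions: {unaccounted_scfh:.0f} scfh")
print(f"Average correction per measurement: {per_measurement_scfh:.1f} scfh")
print(f"Uncorrected total is {shortfall_percent:.1f}% lower")
```

Because the inputs are rounded, the per-measurement value only approximates the 10.3 scfh average reported from the underlying measurements.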
This result shows that high-volume sampler transition failures can occur within upstream direct measurement studies and that unaccounted emissions can be a notable fraction of overall equipment leaks, as posited in previous work (Howard, 2015; Howard et al., 2015). For this study, the effect of this phenomenon was not sufficient to fundamentally change the order of magnitude for equipment leak emissions, which agreed with a previous modeling-based assessment. It should be noted that the firmware and calibration schedule of the high-volume sampler used in this study were different from those used in the studies noted by Howard (2015).
From a post-field campaign laboratory characterization of the response factors for several different models of hydrocarbon detectors that could be used as backup instruments for these types of measurements, it was also noted that some backup devices may have elevated response to higher molecular weight hydrocarbons in field gas, such as propane. Thus, the response factors of the high-volume sampler and the back-up instrument must be well-characterized to develop accurate emission rates from field high-volume sampler measurements.
There is an additional source of uncertainty in this study related to the use of instrument response factors to account for the differential response of the concentration measurements used in this study for emission rate quantification. For a sensitivity analysis, emissions from all measured components in the study were estimated without the application of response factors. Under this sensitivity analysis, total emissions would have been 1768 scfh, which is 23% greater than the total study emissions with the application of the correction factors for the measurements. For sites with a major equipment count available (Figure S4), total emissions without the response factor applied (1761 scfh) would still be 22% less than emissions estimated based on GHGRP major equipment count methods for those sites (2241 scfh).

Conclusions
Figure 5 compares the distribution of measured emissions in this study from components that were identified with optical gas imaging (OGI) to a previous field campaign (Allen et al., 2013) that utilized a similar in-field procedure to identify leaking components. Emission rates of components that were identified as leaking with OGI in this study were typically larger than the equivalent percentile of leaking components in the Allen et al. (2013) study except for the largest two emission sources from the Allen study, which were greater than 100 scfh and were both located in the Appalachian region. Since the Appalachian region was not included in this study, a sensitivity analysis was conducted to compare Allen et al. (2013) measurements in similar regions to the measurements in this study. DOI: https://doi.org/10.1525/elementa.368.f5
Results of a field campaign to conduct direct measurements of emissions from equipment leaks at oil and gas production and gathering and boosting sites in four producing basins in the western United States indicate that the current default emission estimation approach for this source category (US EPA, 2017b), which is based on major equipment counts, would have overestimated detected and measured emissions from leaking components at these 65 sites by 22% to 36%, depending on the handling of instrument response factors. This difference was driven by a lower fraction of leaking components identified in this study compared to the data set on which current EPA emission estimation approaches in the GHGRP are based. Using a 10,000 ppm leak definition threshold for FID-based methods, this study identified 0.18% of components at gas sites and 0.15% of components at oil sites as leaking, while the previous study (API, 1995) had identified 1.61% of components at gas sites and 0.67% of components at oil sites as leaking.
Results from this study also indicate that the two most prominent identification methods for leaking components, FID and OGI, identified different populations of leaking components under field conditions in this study, but the two methods had similar performance based on total emissions identified across all sites in this study, even with an FID leak detection threshold of 500 ppm. This emissions equivalency occurred despite the total count of leaks identified by OGI being only 36% of the total count of leaks identified with FID-based methods. This result suggests a need for emerging methane detection technologies to be evaluated under a different framework than a comparison of the counts of leaks detected. Emission detection equivalency may be achieved through a variety of different approaches that could include considerations related to the number of detections, the volume of emissions, and the duration over which the leaks would likely have occurred. Such conclusions are consistent with previous modeling (Kemp et al., 2016) and field study results (Schwietzke et al., 2018) comparing different methane emission detection strategies.
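One way to see why leak counts alone are a poor basis for comparison is to ask what the count ratio implies about the average size of a detected leak. The sketch below is illustrative only: it assumes, as a simplification, that the total emissions identified by the two methods were exactly equal, whereas the study reports only that they were similar, and the function name is hypothetical.

```python
def implied_mean_rate_ratio(count_ratio, total_emissions_ratio=1.0):
    """Ratio of mean emission rate per detected leak, method A over method B.

    count_ratio: leaks detected by A divided by leaks detected by B.
    total_emissions_ratio: total emissions identified by A divided by those
        identified by B (1.0 is an assumed simplification, not a measured value).
    """
    if count_ratio <= 0:
        raise ValueError("count_ratio must be positive")
    return total_emissions_ratio / count_ratio


# OGI identified 36% as many leaks as FID-based methods in this study.
ogi_vs_fid = implied_mean_rate_ratio(count_ratio=0.36)
print(f"Implied mean rate per OGI-detected leak: {ogi_vs_fid:.1f}x the FID mean")
```

Under that assumption, the average OGI-detected leak would be nearly three times as large as the average FID-detected leak, which is consistent with the two methods finding different populations of leaking components.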

Data Accessibility Statement
Datasets produced in this work are available online, free of charge, as part of the supplemental material.

Supplemental files
The supplemental files for this article can be found as follows: • Text S1.