Imperfect detection of spills: A case study in U.S. onshore oil drilling

This paper specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of pollution in a multi-rig onshore drilling organization. In addition to conventional models, detection-controlled models are also specified to explicitly control for the potential for imperfect reporting. The results suggest that continuity in operations and supervision act to reduce the likelihood of pollution. Additional variables such as site complexity are also significant. The results are largely consistent with related research on personal safety incidents. While the analysis was completed for one organization in one geographic area, the results may be applicable to similar regions and organizations. The results can be used to drive decisions regarding operating practices and managerial policies.


Introduction
The consequences of pollution have been studied extensively, and the importance of prevention is self-evident; see, for example, Assaf et al. (1986), Cohen (1986), Sovacool (2006), Douglas (2011), and Harzl and Pickl (2012). In an oil company, engineers and analysts are involved in identifying pollution risks, estimating the probability of incidents, and advising decision-makers on options for elimination, mitigation, and control of these risks. This work often involves the quantitative analysis of historical pollution performance. To this end, this investigation specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of pollution in a multi-rig onshore drilling organization. The results of this investigation provide information that can be used for resource allocation and the definition of operating practices and managerial policies.
Quantitative analysis of pollution poses special challenges because there is typically no theoretical basis for assumptions regarding the functional form of pollution incident phenomena, historical incident data is often unbalanced (few incidents), data typically is not collected in cases when there are no incidents, and incidents are not always reported. Some of these and other challenges in pollution risk analysis are described in Stewart and Leschine (1986). However, many of these challenges can be overcome with common-sense assumptions, improved data collection strategies, and advanced modeling methods.
The models specified in this investigation include conventional regression-based models, but also include detection-controlled models that explicitly control for the potential for imperfect reporting. Specifically, we investigate the drivers for loss of primary containment (LOPC) incidents. An LOPC event is defined here as an unplanned release at the drilling location from the well or drilling-related equipment into the environment via air releases, spills, leaks, tank overfills, etc., irrespective of measures to protect the environment (e.g. safely capturing the release) or the fact that the release was removed immediately. Excluded are supplied water, and planned or anticipated flaring or venting related to drilling operations.

Regression model specification
This investigation specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of LOPC incidents. In addition to conventional models, the authors specify detection-controlled models to explicitly control for the potential for imperfect reporting. This is an important aspect of the investigation. As described in Pransky et al. (1999), Deissenberg et al. (2001a), Leigh et al. (2004), Phimister et al. (2004), Rosenman et al. (2006), Probst et al. (2008), and Probst and Estrada (2010), imperfect reporting of incidents in the workplace occurs across many sectors. There are various reasons for underreporting: some are intentional (evasion), while others are unintentional (ignorance). It is also acknowledged that overreporting is possible, for example, fraudulent reports of personal safety incidents that did not occur. With respect to LOPC incidents, however, it is reasonable to assume there is no overreporting.
The notion of imperfect reporting (and/or detection) of pollution was introduced in Epple and Visscher (1984), and there is a growing body of empirical work on the subject of incomplete detection based on the subsequent seminal work of Feinstein (1989, 1990). Feinstein's model of detection-controlled estimation (DCE) has been applied in various contexts. Studies have been completed in tax compliance (Erard, 1997), health diagnosis (Bradford et al., 2001; Kleit and Ruiz, 2003), political science (Scholz and Wang, 2006), and safety in oil and gas drilling (Jablonowski, 2007, 2011). The present investigation specifies and estimates a detection-controlled model of pollution in oil and gas drilling, and will add to the existing detection-controlled literature in environmental compliance (Brehm and Hamilton, 1996; Helland, 1998; Stafford, 2003).

Implications of imperfect reporting
Imperfect reporting distorts the observations of incident data. A simple example demonstrates the impacts of imperfect reporting, assuming that no fraud occurs. Consider 100 hypothetical pollution outcomes in Table 1. The columns represent whether or not a pollution incident occurred, while the rows represent whether or not the incident was reported. In this unobservable "truth" case, the imperfect reporting is evident. In practice, however, the underreported incidents are counted with the actual non-incidents. Thus, the analyst observes the data as depicted in Table 2.
Depending on the levels of imperfect reporting, the implications can be severe. The true frequency of an incident in this example is equal to 14/100, while the analyst computes a value of 10/100. Of course the conditional probabilities are also affected. It is clear that in the presence of imperfect reporting, use of the data in Table 2 will bias any qualitative or quantitative analysis. However, when the imperfect reporting is modeled explicitly, more accurate assessments can be made of the true incident phenomena. Also, the analyst can investigate factors that affect the reporting rate.
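The arithmetic behind this example can be sketched in a few lines of Python. The cell counts below are not copied from Table 1; they are inferred from the stated frequencies (14/100 true, 10/100 observed) together with the no-fraud assumption, and the variable names are illustrative only.

```python
# Cell counts inferred from the stated 14/100 and 10/100 frequencies and the
# no-fraud assumption; they are an assumption, not a copy of Table 1.
total = 100
reported_incidents = 10     # incident occurred and was reported
unreported_incidents = 4    # incident occurred but was not reported
false_reports = 0           # no fraud: no reports of incidents that did not occur
true_non_incidents = total - reported_incidents - unreported_incidents - false_reports

# Frequency in the unobservable "truth" case versus what the analyst computes.
true_frequency = (reported_incidents + unreported_incidents) / total
observed_frequency = (reported_incidents + false_reports) / total

# Conditional reporting rate, Prob(Report | Incident).
reporting_rate = reported_incidents / (reported_incidents + unreported_incidents)

# The analyst sees unreported incidents pooled with the true non-incidents.
observed_zeros = unreported_incidents + true_non_incidents
hidden_share_of_zeros = unreported_incidents / observed_zeros

print(f"true frequency:        {true_frequency:.2f}")
print(f"observed frequency:    {observed_frequency:.2f}")
print(f"reporting rate:        {reporting_rate:.3f}")
print(f"hidden share of zeros: {hidden_share_of_zeros:.3f}")
```

The last quantity is the share of apparent non-incidents that are in fact unreported incidents, which is the same kind of false-negative probability that a detection-controlled model estimates observation by observation.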
The unit of observation is defined as one well. Data is collected for each well $i$ on each rig $r$ in the study period. There are $r = 1, \ldots, R$ rigs and $i = 1, \ldots, N_r$ wells on each rig, $x_{ri}$ is a $1 \times h$ vector of independent variables for well $ri$ believed to affect LOPC incidence, and $\beta$ is defined as an $h \times 1$ vector of coefficients to be estimated. An ordinary least squares model is specified first in Equation (1) where $\varepsilon_{ri}$ is a random error term, $\varepsilon_{ri} \sim \text{iid}(0, \sigma^2)$. A second model specifies the incidence function as Poisson where $\ln(\mu_{ri}) = x_{ri}\beta$. The probability for observation $y_{ri}$ is represented as shown in Equation (2) with the resulting log-likelihood equation shown in Equation (3). A likelihood ratio test indicated that the null hypothesis of equidispersion in the Poisson model could not be rejected.
$$\Pr(Y_{ri} = y_{ri}) = \pi(y_{ri}; \mu_{ri}) = \frac{e^{-\mu_{ri}} \mu_{ri}^{y_{ri}}}{y_{ri}!} \qquad (2)$$
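Equations (1) and (3) are referenced in the text but do not appear in it. Under the definitions given here, they would presumably take the standard forms below. This is a reconstruction from the surrounding definitions (a linear model with iid errors, and a Poisson model with a log link), not a quotation of the original equations.

```latex
% Eq. (1): linear model estimated by ordinary least squares
y_{ri} = x_{ri}\beta + \varepsilon_{ri},
\qquad \varepsilon_{ri} \sim \text{iid}(0, \sigma^2) \tag{1}

% Eq. (3): Poisson log-likelihood, summed over all rigs and wells,
% obtained by taking the logarithm of the probability in Eq. (2)
L = \sum_{r=1}^{R} \sum_{i=1}^{N_r}
    \left[ -\mu_{ri} + y_{ri}\ln(\mu_{ri}) - \ln(y_{ri}!) \right],
\qquad \ln(\mu_{ri}) = x_{ri}\beta \tag{3}
```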

Model of imperfect reporting
It is assumed that the probability of a reported incident, Prob(Report), is the product of two sequential events. First, an incident either occurs or does not occur, denoted as Prob(Incident). Second, an incident either is reported or not reported, denoted as Prob(Report|Incident). Because it is assumed that there is no overreporting, Prob(Incident|Report) = 1. One set of independent variables is specified for the incidence function, while another set is specified for the conditional reporting function. When more than one incident occurs, there are three potential outcomes for reporting. One outcome is that all of the incidents are reported, a second outcome is that none of the incidents are reported, and a third outcome is that there is partial under-reporting and a subset of incidents is reported. In the derivation below, it is assumed that for each observation of the dependent variable, incidents are either all reported or all not reported, simplifying the computations.
The incidence function is specified as Poisson, and the reporting function is specified as a binary probit model (see the Appendix for development of the probit model). The variable $z_{ri}$ is a $1 \times j$ vector of independent variables for well $ri$ believed to affect reporting, and $\delta$ is defined as a $j \times 1$ vector of coefficients to be estimated. The probability that observation $y_{ri}$ on the dependent variable takes on a value greater than zero is shown in Equation (4) with the resulting log-likelihood function for all non-zero observations, $m$, shown in Equation (5).
The probability that observation $y_{ri}$ on the dependent variable takes on a value equal to zero is the sum of the probability that no incident occurred plus the probability that an incident occurred but was not reported, and this is shown in Equation (6) with the resulting log-likelihood function for all zero observations, $n - m$, shown in Equation (7).
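Equations (4) through (7) are referenced in the text but do not appear in it. The forms below are a reconstruction that follows the verbal description (Poisson incidence, probit reporting, and all-or-nothing reporting for each observation) and agrees with the denominator of Equation (8). They should be read as an assumed form rather than the original typesetting; $\Phi$ denotes the standard normal cumulative distribution function.

```latex
% Eq. (4): a non-zero count is observed only if the incidents occur AND are reported
\Pr(Y_{ri} = y_{ri} \mid y_{ri} > 0) = \pi(y_{ri}; \mu_{ri})\,\Phi(z_{ri}\delta) \tag{4}

% Eq. (5): log-likelihood over the m non-zero observations
L_m = \sum_{y_{ri} > 0}
      \left[ \ln \pi(y_{ri}; \mu_{ri}) + \ln \Phi(z_{ri}\delta) \right] \tag{5}

% Eq. (6): a zero is observed if no incident occurs, OR one occurs and is not reported
\Pr(Y_{ri} = 0) = \pi(0; \mu_{ri})
      + \left(1 - \pi(0; \mu_{ri})\right)\left(1 - \Phi(z_{ri}\delta)\right) \tag{6}

% Eq. (7): log-likelihood over the n - m zero observations
L_{n-m} = \sum_{y_{ri} = 0}
      \ln\!\left[ \pi(0; \mu_{ri})
      + \left(1 - \pi(0; \mu_{ri})\right)\left(1 - \Phi(z_{ri}\delta)\right) \right] \tag{7}
```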
The log-likelihood for the sample is $L = L_m + L_{n-m}$ and is maximized numerically. The asymptotic covariance matrix is estimated by evaluating the negative inverse of the Hessian at the maximum likelihood estimates. The identification conditions for this family of models are derived and explored in Feinstein (1990). The derivations and discussion are substantial and are not repeated here. An important condition is that $x_{ri}$ and $z_{ri}$ each contain at least one unique variable. As described in the definition of independent variables and in the discussion of results, this condition is satisfied.
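A minimal Python sketch of this sample log-likelihood is shown below, using only the standard library. It assumes the Poisson-times-probit form for non-zero observations and the "no incident, or incident but not reported" form for zero observations, as described in the text. The function names and the tiny data set are hypothetical and are not the authors' implementation. In practice the function would be negated and passed to a numerical optimizer to obtain the estimates.

```python
import math


def normal_cdf(v):
    """Standard normal CDF, Phi(v), via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def poisson_log_pmf(y, mu):
    """ln pi(y; mu) = -mu + y ln(mu) - ln(y!)."""
    return -mu + y * math.log(mu) - math.lgamma(y + 1)


def dce_log_likelihood(beta, delta, X, Z, y):
    """Detection-controlled log-likelihood L = L_m + L_{n-m}.

    X rows drive incidence through ln(mu) = x.beta (Poisson);
    Z rows drive reporting through Phi(z.delta) (probit).
    """
    total = 0.0
    for x_row, z_row, y_i in zip(X, Z, y):
        mu = math.exp(sum(b * x for b, x in zip(beta, x_row)))
        p_report = normal_cdf(sum(d * z for d, z in zip(delta, z_row)))
        if y_i > 0:
            # Incidents occurred and were reported.
            total += poisson_log_pmf(y_i, mu) + math.log(p_report)
        else:
            # No incident, or at least one incident that went unreported.
            p_none = math.exp(-mu)
            total += math.log(p_none + (1.0 - p_none) * (1.0 - p_report))
    return total


# Hypothetical data: an intercept plus one regressor in each function.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 1.0]]
Z = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
y = [0, 1, 0]
print(dce_log_likelihood([-1.5, 0.8], [1.0, -0.5], X, Z, y))
```

When the reporting index is very large, so that the reporting probability approaches 1, the expression collapses to the ordinary Poisson log-likelihood, which is a convenient check on any implementation.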

Data set and pollution incident hypotheses
Observations were collected from eight drilling rigs over a recent ∼24 month period from onshore oil and gas development assets in the Permian basin in the U.S. For each well ri, the dependent variable is defined as the number of LOPC events that occurred on the well. In this data set, 22 LOPC events occurred on 143 wells which yields a frequency of ∼15%. While the dependent variable is somewhat unbalanced, this aspect of the data set does not appear to impede the regression analysis in this case. When defining the hypotheses and independent variables, the emphasis was placed on those operational and managerial attributes that are controlled by the organization and thus subject to modification (although this criterion does not apply to every variable, all of the attributes are amenable to mitigation activities). For each independent variable defined below, the hypothesis regarding the directional impact (sign) of the variable on incidence and reporting is stated, along with the expectation for statistical significance (at the 95% confidence level).
The first three variables reflect attributes of the work and site.
Rig Move: This binary variable takes a value of 1 if the well was drilled on a different pad than the rig's previous well and 0 otherwise. This variable tests the hypothesis that the first well after rig-up on a new location increases the likelihood of LOPC incidents (e.g. during mechanical shakedown). The expectation is that the sign of this variable will be positive and significant in the incidence function. Also, the rig move and gap in normal drilling operations are believed to disrupt established practices, policies, and behavioral norms with respect to LOPC incidence prevention and reporting. Therefore, the expectation is that the sign of this variable will be negative and significant in the reporting function.
Well Type: This binary variable takes a value of 1 for development wells and 0 for exploration wells. Previous research has indicated that differences between well types in engineering design, operations, and site attributes increase the likelihood of incidents on development wells (Jablonowski, 2011). The difference is believed to result from additional congestion and complexity on development sites. The expectation is that the sign of this variable will be positive and significant in the incidence function. There are no hypotheses for this variable with respect to reporting.
Drilling Days: This variable is defined as the count of days from the start of the well to the end of the well. It is intended only as a control for exposure time and thus the expectation is that the sign of this variable will be positive and significant in the incidence function. There are no hypotheses for this variable with respect to reporting.
Supervision and monitoring are important factors in driving compliance with procedures as discussed in Epple and Visscher (1984), Embrey (1992), Viladrich-Grau and Groves (1997), Cohen (2000), Deissenberg et al. (2001b), and Skaugrud et al. (2012). Variables were specified to test hypotheses along two dimensions of supervision: concentration and turnover. The results can be used to adjust policies on allocation of supervisory resources.
Foreman Concentration: This variable is defined as the sum of individual foreman-days divided by DrillingDays. For example, if Foreman#1 worked 10 days and Foreman #2 worked 15 days on a 15 day well, then this variable would equal (10 + 15)/15. This variable is intended to measure the concentration of supervision. It is believed that a larger concentration of supervision leads to better oversight, a decrease in LOPC incidence, and an increase in reporting because events are more likely to be observed. Thus, the expectation is that the sign of this variable will be negative and significant in the incidence function, and positive and significant in the reporting function.
Foreman Turnover: This variable is defined as the sum of individual foreman-days of those foremen who worked on the previous well, divided by the sum of all foremen-days on the current well. Thus, a large value indicates that much of the supervision on the current well is the same as the previous well (less turnover). This consistency is believed to decrease the likelihood of LOPC events because the foremen can maintain and strengthen previously established practices, policies, and behavioral norms. The expectation is that the sign of this variable will be negative and significant in the incidence function, and positive and significant in the reporting function.
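The two supervision variables can be sketched as follows. The concentration example reproduces the (10 + 15)/15 case given above. The turnover example is hypothetical, and it assumes that the numerator counts foreman-days on the current well for foremen who also worked on the previous well. The function and argument names are illustrative.

```python
def foreman_concentration(foreman_days, drilling_days):
    """Sum of individual foreman-days on the well divided by DrillingDays."""
    return sum(foreman_days.values()) / drilling_days


def foreman_turnover(foreman_days, previous_well_foremen):
    """Share of current-well foreman-days worked by foremen from the previous well.

    A value near 1 means little turnover; a value near 0 means the
    supervision on the current well is almost entirely new.
    """
    total_days = sum(foreman_days.values())
    carried_over_days = sum(
        days for name, days in foreman_days.items()
        if name in previous_well_foremen
    )
    return carried_over_days / total_days


# Example from the text: two foremen on a 15 day well.
current_well = {"Foreman #1": 10, "Foreman #2": 15}
print(foreman_concentration(current_well, drilling_days=15))

# Hypothetical: only Foreman #2 also worked on the previous well.
print(foreman_turnover(current_well, previous_well_foremen={"Foreman #2"}))
```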
All of the rigs in this study were governed by the operator's safety management system (SMS). Simply put, the SMS defines expectations and requirements for practically all aspects of well operations. One hypothesis of great interest to safety and environmental practitioners is whether there is improvement over time with respect to SMS compliance. It is an important hypothesis because if indicated to be true, the result may affect procurement strategy when picking up and dropping rigs from the fleet. The following variable is defined to test this hypothesis.
Rig-SMS Maturity: This variable is defined as the cumulative well count on each rig up to and including the current well. It is intended to model the effect of experience in working with the operator under its SMS, both in the passage of time and in the completion of additional wells. This variable tests the hypothesis that over time, the operator's SMS becomes more established and ingrained in the rig crew, and that this improves LOPC incidence performance and reporting. The expectation is that this variable will be negative and significant in the incidence function, and positive and significant in the reporting function. Note, for model identification purposes in the detection-controlled model, an alternate specification of this variable is defined as the cumulative number of months each rig has been working for the operator up to and including the current month (correlation coefficient with the primary variable is 0.97).
The operator invests considerable resources to ensure that all rigs are compliant with the SMS (e.g. assigning multiple foremen on site). However, the drilling rig contractor employs its own rig manager to manage the detailed activities of the rig crew. These individuals also may have an impact on pollution incident performance and reporting. The following suite of binary variables is defined to control for this potential effect.
Rig "X": Binary variables are defined for each of the eight rigs in the data set. These variables are intended to model differences in pollution incident and reporting performance between rigs not captured by other variables. The expectation is that these variables will be insignificant in the incidence and reporting functions.
There are some unavoidable regrets in the hypothesis tests. For example, because the rigs were all drilling similar types of wells using similar procedures, variables such as well design and operational practices that have been shown to reduce LOPC incidents like underbalanced or managed pressure drilling could not be tested (Jablonowski and Podio, 2011). Also, all of the rigs were provided by the same drilling contractor and outfitted in a similar way (e.g. similar levels of automation), and the drilling sites all shared the same geography and degree of site remoteness, thus it was not possible to test hypotheses about these variables.

Regression analysis and discussion
The models of perfect reporting were estimated first to identify probable drivers of incidence and/or reporting. When one observes a statistically significant variable in these models, it is not discernible whether the effect is attributable to incidence or reporting behavior. However, it is a sign that the variable is probably important in one or both functions and careful attention is warranted in the model of imperfect reporting. Conversely, when a variable is not significant in the model of perfect reporting, one cannot ignore the variable in the model of imperfect reporting, because the incidence and reporting behaviors can "cancel out" and thus go unobserved in the model of perfect reporting. Table 3 presents a summary of regression results. Columns A and B report the results from the ordinary least squares model, where column A contains the Rig binary variables and column B does not. The model was re-estimated without the Rig binaries because none of the Rig binary variables were statistically significant. The same structure is used in columns C and D where the Poisson model results are presented. Column E contains the detection-controlled estimates.
The Rig Move variable is not significant in any of the perfect reporting models, contrary to expectations. However, when the detection-controlled model is estimated, this variable is significant in the incidence function, and significant (at the 90% level) in the reporting function, consistent with expectations. This result suggests that after a rig move, there is the potential for a breakdown in incidence and reporting performance. For this variable, the perfect reporting models and the detection-controlled model appear to be inconsistent. However, recall that in the models of perfect reporting, the coefficient estimate represents a mash-up of two underlying effects (incidence and reporting): instead of observing increased incidence and decreased reporting, the effects cancel out and there is no apparent effect. The detection-controlled model disentangles these two effects. Beyond the obvious value of this result and how it may affect mitigation activities, the magnitude of the coefficient estimate also matters. That is, if one were estimating the likelihood of an LOPC event after a rig move, the appropriate coefficient to use would be the one from the detection-controlled model, which is two to three times as large as the estimate from the Poisson model of perfect reporting.
The Well Type variable is significant in all models, suggesting that additional congestion and complexity on development sites increase the likelihood of LOPC incidence, consistent with expectations.
The control variable Drilling Days is significant in the detection-controlled model as expected, but not in the models of perfect reporting. A diagnostic detection-controlled regression was estimated with Drilling Days in the reporting function to test whether this result was similar to the mash-up effect discussed above with respect to the Rig Move variable, and it was not. Thus, the reason for this inconsistency between the models is probably attributable to a spurious correlation in the data set.
The Foreman Concentration variable is not significant in any of the models, suggesting that a larger concentration of supervision and oversight does not necessarily result in a decrease in LOPC incidence or increase in reporting. This result is contrary to expectations. However, additional diagnostics revealed that while the variable ranges from 1.5 to 3, the observations are tightly clustered at a value of 2, and this lack of dispersion in the data may constrain the ability to estimate the coefficient with precision.
The Foreman Turnover variable is negative and significant in the models of perfect reporting and in the incidence function of the detection-controlled model, consistent with expectations. This result suggests that well-to-well consistency in supervision decreases the likelihood of LOPC incidents. The variable is not significant in the reporting function of the detection-controlled model, contrary to expectations.
The Rig-SMS Maturity variable is negative and significant in the models of perfect reporting and in the incidence function of the detection-controlled model, suggesting that cumulative experience in working with the operator under its SMS serves to improve LOPC incidence performance. While this result is consistent with expectations, the variable is not significant in the reporting function of the detection-controlled model, which suggests that reporting behavior is independent from experience.
As already mentioned, none of the Rig binary variables were statistically significant in the models of perfect reporting (note, two of the Rig variables were excluded because of collinearity issues). This result is consistent with expectations because the operator invests considerable resources to ensure that all rigs are compliant with the SMS, and there is consistency across rigs. Diagnostic detection-controlled regressions were estimated that included the Rig variables, but there were no noteworthy outcomes and these results are not reported.
As reported in Table 3, there is only one variable (Rig Move) that affects the likelihood of reporting, and it is significant at only a 90% confidence level. The intuitive interpretation of this result is that imperfect reporting is probably not a significant problem in this asset, except of course for the biases introduced in the coefficient estimates (i.e. compare Columns D and E in Table 3). This is not surprising: LOPC events are difficult not to report because they are often witnessed by more than one person, and some releases linger and take time and resources to clean up (e.g. chemical spill), increasing the likelihood of detection. It is possible to use the detection-controlled model to compute the probability of a false negative for each zero observation, Prob(Incident|No Report), or P(I|NR). The result provides an indication of whether imperfect reporting is a significant problem. The probability is defined in Equation (8).
$$P(I \mid NR) = \frac{P(I)\,P(NR \mid I)}{P(NR)} = \frac{(1 - \pi(y_{ri} = 0; \mu_{ri}))(1 - \Phi(z_{ri}\delta))}{\pi(y_{ri} = 0; \mu_{ri}) + (1 - \pi(y_{ri} = 0; \mu_{ri}))(1 - \Phi(z_{ri}\delta))} \qquad (8)$$
The average estimated probability for all zero observations is less than 4%. This result, combined with the lack of statistical significance in the reporting function, suggests that imperfect reporting of LOPC incidents is not a significant problem in this asset. Future analysis probably can be completed without the more complex detection-controlled models without introducing significant bias. However, for definitive results, the detection-controlled estimates are always recommended.
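The false-negative probability in Equation (8) can be evaluated for a single zero observation as in the sketch below. The inputs are the fitted Poisson mean and the fitted probit index for that well. The function names and the example values are hypothetical and are not estimates from Table 3.

```python
import math


def normal_cdf(v):
    """Standard normal CDF, Phi(v), via the error function."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def prob_incident_given_no_report(mu, z_delta):
    """P(I | NR) from Equation (8) for one zero observation.

    mu      : fitted Poisson mean for the well, exp(x.beta)
    z_delta : fitted probit index for the well, z.delta
    """
    p_no_incident = math.exp(-mu)  # pi(y = 0; mu)
    p_unreported = (1.0 - p_no_incident) * (1.0 - normal_cdf(z_delta))
    return p_unreported / (p_no_incident + p_unreported)


# Hypothetical fitted values for three wells with no reported LOPC events.
fitted = [(0.10, 1.8), (0.25, 1.2), (0.40, 0.9)]
probabilities = [prob_incident_given_no_report(mu, zd) for mu, zd in fitted]
average = sum(probabilities) / len(probabilities)
print(probabilities, average)
```

Averaging the per-observation values over all zero observations gives the summary figure discussed in the text. The probability rises with the expected incident count and falls as the reporting index increases.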

Conclusion
This paper specifies and estimates regression models to test several hypotheses about the operational and managerial drivers of pollution in a multi-rig onshore drilling organization. The overarching theme in these results is that consistency is important in reducing the likelihood of LOPC incidents, and potentially in reporting. This applies to consistency in operations (Rig Move), in supervision (Foreman Turnover), and in the rig crew (Rig-SMS Maturity). Thus, disruptions should be taken seriously and mitigation activities should be defined and implemented when disruptions cannot be avoided. Site complexity was also shown to be an aggravating factor in LOPC incidence. The results are largely consistent with related research on personal safety incidents. While the analysis was completed for one organization in one geographic area, the results may be applicable to similar regions and organizations. The results can be used to drive decisions regarding operating practices and managerial policies. In addition to conventional models, the authors estimated a detection-controlled model to explicitly control for the potential for imperfect reporting. The detection-controlled model provides incremental value over models of perfect reporting by disentangling the drivers of incidence and reporting phenomena. In this case, the detection-controlled model is also able to demonstrate that imperfect reporting probably is not a significant problem in this asset.
A final note of caution is needed regarding the use of this kind of information. When a relationship between LOPC incidents and a variable is identified and an intervention plan or policy is enacted to reduce risk, then over time the relationship between LOPC incidents and the variable will degrade and ultimately be eliminated if the intervention plan or policy is effective. Such a disappearance indicates that the intervention policy was successful and should be continued, not that the variable has ceased to matter. This same phenomenon also makes it difficult to identify risk factors that are already being mitigated by some policy. In this case, the lack of statistical evidence would not be a sufficient reason to alter or cancel an existing mitigation policy that is otherwise believed to be working.