Introduction

In placebo-controlled acute post-surgical pain studies, provisions must be made for study subjects to receive adequate analgesic therapy. As such, most protocols allow study subjects to receive a pre-specified regimen of open-label analgesic drugs (rescue drugs) on an as-needed basis [1–6, 8–10, 13–15, 17, 19, 20, 25–27, 29–34, 36, 38–40]. These drugs are administered only if the blinded study drug (active or placebo) fails to adequately relieve the subject’s pain.

When designing experiments, researchers must select a rescue regimen for their protocol. This regimen will specify (1) the selected first-line rescue drug, including its dose and frequency, and (2) any allowed second-line rescue drugs. The selection of an appropriate rescue regimen is a critical experimental design choice [37], because a rescue regimen that is too liberal may lead to all study arms receiving similar levels of pain relief [11] (thereby confounding experimental results), while a regimen that is too stringent may lead to a high subject dropout rate (giving rise to a preponderance of missing data) [4].

Despite the importance of the rescue regimen as a study design feature, there exist no published review articles or meta-analyses focusing specifically on the topic. Therefore, when selecting a study protocol’s rescue regimen, researchers generally provide therapy that would be considered clinically reasonable. However, the goal of analgesic therapy in a clinical situation (to relieve pain) is different from the goal of rescue analgesic therapy in an experimental situation (to ethically provide some pain relief while maintaining study assay sensitivity). As such, the best clinical choice of analgesic therapy may not be the best experimental choice of rescue therapy in a study.

Based on the points above, we assert that a scientific discussion and analysis of rescue therapy as an experimental design feature is needed. An ideal data set that might allow one to isolate rescue therapy as an independent variable of interest would need to control for (1) study model, (2) study design, and (3) data imputation techniques. In an attempt to procure such a data set, the authors performed an extensive literature search encompassing several acute and chronic pain models (low back pain, osteoarthritis, diabetic peripheral neuropathy, dental extraction, bunionectomy, and joint replacement surgery). The authors concluded that, among these models, bunionectomy was the best candidate for exploration because (1) it is a common model used for analgesic clinical trials, (2) it employs a standardized methodology such that study design features are tightly homogenized between experiments, and (3) it is generally used for registration studies (as such, statistical results utilizing a variety of data imputation techniques are available on FDA.gov).

Seven bunionectomy studies suitable for review, comparison, and discussion were identified. The authors acknowledge that while the data set is too small to be subjected to standard meta-analysis techniques, it is large enough to permit a detailed, relevant, and critical conceptual review. The function of this manuscript, then, is not to draw definitive conclusions (as a classic meta-analysis would) but rather to (1) create a framework for discussion and future exploration of rescue as a methodological study design feature, (2) discuss the interplay between data imputation techniques and rescue drugs, and (3) inform the readership regarding the impact of data imputation techniques on the validity of study conclusions.

The concepts laid out in this manuscript are iterative. As such, each section of the manuscript addresses a different topic by presenting that topic’s relevant results and discussion simultaneously.

Compliance with Ethics Guidelines

This article is based on previously conducted studies and does not involve any new studies of human or animal subjects performed by any of the authors.

Data Imputation (Background and Nomenclature)

Missing data can be imputed using a variety of statistical techniques [12, 21, 35]. Techniques that truncate data and lead to a high degree of missingness (absent data) are unreliable [21, 35]. Scientific consensus continues to evolve regarding the proper handling of missing data; new techniques are frequently being employed and evaluated [12, 21, 35]. As such, the nomenclature regarding data imputation techniques is not firmly established and continues to change. To streamline discussions in this review, several common imputation techniques are defined, named, and discussed.

Figure 1a depicts hypothetical data from a single subject participating in a study in which scheduled pain intensity assessments are collected every 4 h for 48 h. This subject received rescue on six occasions (red arrows). Prior to each dose of the rescue drug, an unscheduled pain score was collected (blue X’s). The dotted line sketched across the graph corresponds to the subject’s baseline pain intensity score. The summed pain intensity difference (SPID) is determined by calculating the time-weighted difference in pain intensity scores from the initial baseline score. Panels 1b, 1c, and 1d build on the raw data set in panel 1a to illustrate how different data imputation methods affect SPID calculations.

Fig. 1 Hypothetical data set illustrating different data imputation methods and their effects on SPID48 values. SPID48, summed pain intensity difference over 48 h after first dose of study medication

Figure 1b represents a SPID48 applying no data imputation secondary to rescue. When no imputation is used, the impact of rescue drugs on a subject’s pain relief is ignored. Note that pre-rescue pain intensity scores (blue X’s) are ignored when performing this non-imputed SPID calculation.
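The non-imputed calculation can be sketched in code. This is a minimal illustration rather than the statistical method of any reviewed study. It assumes that the pain intensity difference is taken as baseline minus current score (so that larger SPID values indicate more pain relief, consistent with the usage in this review) and that each scheduled score is weighted by the hours elapsed since the previous scheduled assessment. The function name `spid` and all scores below are hypothetical.

```python
def spid(baseline, times_h, scores):
    """Time-weighted sum of (baseline - score).

    Assumption: each scheduled score is weighted by the number of hours
    elapsed since the previous scheduled assessment (time 0 = first dose).
    """
    total = 0.0
    previous_time = 0.0
    for time_h, score in zip(times_h, scores):
        total += (baseline - score) * (time_h - previous_time)
        previous_time = time_h
    return total


# Hypothetical subject: scheduled assessments every 4 h for 48 h.
baseline = 7
scheduled_times = list(range(4, 49, 4))  # 4, 8, ..., 48
scheduled_scores = [6, 5, 5, 4, 4, 3, 3, 3, 2, 2, 2, 2]

# Unscheduled pre-rescue scores are never passed in, so they are ignored.
spid48_no_imputation = spid(baseline, scheduled_times, scheduled_scores)
print(spid48_no_imputation)
```

Because only scheduled scores enter the sum, any pain relief produced by rescue doses is counted as if it came from the blinded study drug.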

Figure 1c represents a single imputation technique [last observation carried forward (LOCF) after first dose of rescue medication]. This subject reported an unscheduled pain intensity score of 7 prior to their first dose of rescue. In order to ensure that the rescue drug does not statistically impact this subject’s SPID, the pre-rescue pain intensity score (7) is carried forward for the remainder of the 48-h efficacy evaluation period. When a single imputation technique is used, the efficacy of the rescue drug cannot impact the statistical outcome of the study, because as soon as a subject receives even a single dose of rescue therapy, their data are statistically truncated. Single imputation techniques give rise to significant missing data and are therefore no longer favored or accepted by the Food and Drug Administration (FDA) [37].
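A minimal sketch of LOCF after the first dose of rescue follows. It is illustrative only: the function names, the data layout (rescue events as pairs of time and pre-rescue score), and the choice to replace only scheduled assessments taken strictly after the first rescue dose are assumptions, and all scores are hypothetical. The `spid` helper uses the same assumed weighting as a simple rectangular sum from baseline.

```python
def spid(baseline, times_h, scores):
    """Time-weighted sum of (baseline - score); weight = hours since previous assessment."""
    total, previous_time = 0.0, 0.0
    for time_h, score in zip(times_h, scores):
        total += (baseline - score) * (time_h - previous_time)
        previous_time = time_h
    return total


def locf_after_first_rescue(times_h, scores, rescues):
    """Carry the first pre-rescue score forward to the end of the evaluation period.

    rescues: list of (time_h, pre_rescue_score) pairs (hypothetical layout).
    Assumption: only scheduled assessments strictly after the first rescue
    dose are replaced.
    """
    if not rescues:
        return list(scores)
    first_time, carried_score = min(rescues, key=lambda event: event[0])
    return [carried_score if t > first_time else s for t, s in zip(times_h, scores)]


# Hypothetical subject: first rescue at hour 6 with a pre-rescue score of 7.
baseline = 7
times = list(range(4, 49, 4))
raw_scores = [6, 5, 5, 4, 4, 3, 3, 3, 2, 2, 2, 2]
rescue_events = [(6, 7), (14, 6)]

imputed_scores = locf_after_first_rescue(times, raw_scores, rescue_events)
print(imputed_scores)
print(spid(baseline, times, raw_scores), spid(baseline, times, imputed_scores))
```

In this hypothetical case, every observed score after hour 6 is discarded, which shows how heavily a single imputation rule can truncate a subject's data.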

Figure 1d represents a common multiple imputation technique (windowed LOCF). With this technique, unscheduled pre-rescue pain intensity scores are carried forward for a window of time equivalent to the expected duration of action of the rescue therapy. In this hypothetical case, the window selected is 3 h. The theory behind this technique is that scheduled pain intensity scores gathered within 3 h of administration of rescue are contaminated (i.e., artificially lowered) by the rescue drug. Therefore, these contaminated pain intensity assessments are ignored when the imputed SPID value is calculated.
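The windowed approach can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any specific study: the function name, the data layout, and the boundary rule (a scheduled score is treated as contaminated when it falls after a rescue dose and no later than the end of that dose's window) are hypothetical choices, and all scores are invented.

```python
def windowed_locf(times_h, scores, rescues, window_h=3.0):
    """Replace scheduled scores that fall inside a rescue window.

    rescues: list of (time_h, pre_rescue_score) pairs (hypothetical layout).
    Assumption: a scheduled score at time t is contaminated when
    rescue_time < t <= rescue_time + window_h; it is then replaced by the
    pre-rescue score of the most recent such rescue dose.
    """
    imputed = []
    for time_h, score in zip(times_h, scores):
        value = score
        for rescue_time, pre_rescue_score in sorted(rescues):
            if rescue_time < time_h <= rescue_time + window_h:
                value = pre_rescue_score  # later doses overwrite earlier ones
        imputed.append(value)
    return imputed


# Hypothetical subject over the first 24 h, with a 3-h window.
times = [4, 8, 12, 16, 20, 24]
raw_scores = [6, 5, 5, 4, 4, 3]
rescue_events = [(6, 7), (17, 6)]

windowed_scores = windowed_locf(times, raw_scores, rescue_events, window_h=3.0)
print(windowed_scores)
```

Only the assessments at hours 8 and 20 fall inside a window here, so the remaining observed scores are retained rather than truncated.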

Methods

Search Criteria

A PubMed search was conducted of all articles through May 2015 using the search term “bunionectomy.” This list was limited to randomized controlled trials. Abstracts obtained using these search criteria were reviewed, and manuscripts that met the following criteria were included in our data set: (1) in English, (2) placebo-controlled, (3) randomized, (4) double-blind, (5) bunionectomy model data presented, and (6) pain as the primary or secondary endpoint. References from papers obtained via this search were assessed for any other relevant manuscripts.

Next, a search of the FDA database (drugs@fda) was conducted for Summary Basis of Approval (SBA) and Statistical Reviews for all drugs described in the collected manuscripts, as well as for any other FDA-approved drugs for the treatment of acute pain. This search yielded one additional study (identified as “Jensen 2013 [41]”).

In an effort to obtain a homogeneous multi-dose data set, we pruned articles to those matching the following secondary criteria: (1) provided an SPID48 and (2) were postoperative day 1 (POD1) studies. SPID48 was chosen, as it is the most common endpoint in this model and contains multi-dose information. POD1 (a study in which the first dose of study medication is administered on the day following surgery) was chosen, as it is also the most common design and involves a specific pain trajectory that is not comparable to that of studies done on postoperative day 0 (POD0; studies in which the first dose of study drug is administered on the same day the surgery is completed). Table 1 provides a summary of all investigations included in the analysis.
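The primary and secondary inclusion criteria described above amount to a two-stage screen, which can be sketched as follows. The record layout, field names, and example entries are hypothetical and are used only to make the logic explicit; they do not correspond to actual manuscripts in the data set.

```python
from dataclasses import dataclass


@dataclass
class Manuscript:
    """Hypothetical screening record for one candidate manuscript."""
    label: str
    in_english: bool
    placebo_controlled: bool
    randomized: bool
    double_blind: bool
    bunionectomy_data: bool
    pain_endpoint: bool   # pain as the primary or secondary endpoint
    reports_spid48: bool
    first_dose_pod: int   # 0 = POD0, 1 = POD1


def meets_primary_criteria(m):
    return all([m.in_english, m.placebo_controlled, m.randomized,
                m.double_blind, m.bunionectomy_data, m.pain_endpoint])


def meets_secondary_criteria(m):
    return m.reports_spid48 and m.first_dose_pod == 1


def screen(manuscripts):
    """Return labels of manuscripts passing both screening stages."""
    return [m.label for m in manuscripts
            if meets_primary_criteria(m) and meets_secondary_criteria(m)]


# Invented examples: only "Study A" satisfies every criterion.
candidates = [
    Manuscript("Study A", True, True, True, True, True, True, True, 1),
    Manuscript("Study B", True, True, True, True, True, True, True, 0),   # POD0
    Manuscript("Study C", True, True, True, False, True, True, True, 1),  # not double-blind
]
print(screen(candidates))
```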

Table 1 Description of studies included in analysis

Data Extraction

SPID48 values were extracted from manuscripts and SBA Statistical Reviews based on the type of data imputation method used. Data were sorted by imputation method for presentation in this manuscript. Scores from manuscripts that employed a 100-mm visual analog scale (VAS) were converted to a 10-point numeric pain rating scale (NPRS) by dividing by 10.
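The scale conversion is simple arithmetic and can be sketched as below. The function names and the range check are illustrative assumptions. Because SPID is a time-weighted sum of score differences, a SPID computed from VAS scores in millimeters rescales by the same factor of 10.

```python
def vas_mm_to_nprs(vas_mm):
    """Convert a 100-mm VAS score to the 10-point NPRS scale by dividing by 10."""
    if not 0 <= vas_mm <= 100:
        raise ValueError("VAS score must lie between 0 and 100 mm")
    return vas_mm / 10


def vas_spid_to_nprs_spid(vas_spid):
    """Rescale a SPID computed from VAS scores; linearity gives the same factor."""
    return vas_spid / 10


print(vas_mm_to_nprs(65))           # hypothetical VAS score in mm
print(vas_spid_to_nprs_spid(1720))  # hypothetical VAS-based SPID48
```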

Discussion

Acute Pain Studies: Rescue Is Frequent and Prevalent

In these acute pain studies, rescue was quite prevalent in both the placebo and active treatment arms. In our data set, 95.0% of placebo subjects required protocol-mandated rescue drugs (an average of 4.6 doses over 48 h; Table 2). A significant proportion of subjects allocated to the treatment arms also received rescue drugs (85.6%); the mean number of doses received per subject was 2.8 (Table 2).

Table 2 Rescue medication use during first 48 h for POD1 studies

Typical immediate-release rescue drugs [e.g., ibuprofen, acetaminophen (APAP), hydrocodone/APAP, oxycodone/APAP] have at least 4 h of moderately potent analgesic efficacy [38, 22,23,24]. After multiple doses, the efficacy of these drugs may be prolonged secondary to accumulation (i.e., steady-state plasma levels achieved) [38, 22,23,24]. Acute pain studies are generally short (48–72 h). Therefore, subjects enrolled in such studies routinely receive multiple doses of efficacious rescue drugs within a brief evaluation period.

Placebo-Arm Pain Relief Is Influenced by Rescue

The amount of pain relief experienced by subjects in the placebo arm of acute pain studies is, in part, a function of the rescue regimen utilized in the experiment. This fact is not surprising considering the frequency and prevalence of rescue usage amongst placebo subjects (as discussed above).

In Table 3, the rescue regimen for each experiment in our data set is displayed alongside the corresponding SPID48 value for the placebo arm. The SPID48 values in the table were calculated with no imputation secondary to rescue (non-imputed data permit an objective assessment of the effect of rescue on the pain trajectory of subjects receiving placebo). Studies that allowed no rescue or weak rescue generally had low placebo SPID48 values (small placebo response), while studies that allowed liberal rescue had high SPID48 values (large placebo response; Table 3).

Table 3 Rescue medication and SPID48 values for placebo

Treatment Response May Be Less Influenced by Rescue

Rescue drugs are received by subjects in both placebo and treatment study arms in an investigation (Table 2). Therefore, one could argue that both the placebo and treatment response are influenced by rescue, and to the same degree. If this were true, the observed net treatment effect (treatment response–placebo response) would not be negatively impacted by efficacious and/or frequent rescue. However, if rescue disproportionately affects the placebo arm, then efficacious rescue will erode net treatment effect.
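The two scenarios can be made concrete with a small arithmetic sketch. All SPID48 values below are hypothetical, and the assumption that rescue simply adds a fixed amount of pain relief to an arm's SPID48 is a deliberate simplification used only to illustrate the argument.

```python
def net_treatment_effect(treatment_spid48, placebo_spid48):
    """Net treatment effect = treatment response minus placebo response."""
    return treatment_spid48 - placebo_spid48


# Hypothetical responses with no rescue allowed.
treatment_no_rescue = 120.0
placebo_no_rescue = 60.0
rescue_gain = 30.0  # assumed additive pain relief attributable to rescue

effect_without_rescue = net_treatment_effect(treatment_no_rescue, placebo_no_rescue)

# Scenario 1: rescue benefits both arms to the same degree.
effect_equal_rescue = net_treatment_effect(
    treatment_no_rescue + rescue_gain, placebo_no_rescue + rescue_gain)

# Scenario 2: rescue benefits only the placebo arm.
effect_placebo_only_rescue = net_treatment_effect(
    treatment_no_rescue, placebo_no_rescue + rescue_gain)

print(effect_without_rescue, effect_equal_rescue, effect_placebo_only_rescue)
```

Under these assumptions, the net effect is preserved when both arms gain equally and is halved when only the placebo arm gains.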

In the data set, there are two studies that can be compared to allow one to draw preliminary conclusions regarding the presence or absence of differential impacts of rescue on placebo versus treatment study arms. During the development of tapentadol, two large (n = 901 and n = 603) replicate bunionectomy studies were performed [4, 8]. The study design features were essentially identical except for the selection of rescue regimen. One of the studies [7] prohibited any use of rescue, while the other [4] allowed rescue of up to two 1000-mg doses of APAP. Figure 2 displays the salient results of these two investigations. One can see that while pain relief in the placebo arm is doubled by the provision of APAP rescue (versus no rescue), treatment response remains unaltered.

Rescue Regimen Is Correlated with Efficacy Dropouts

In acute pain studies, efficacy dropouts are correlated with the rescue regimen allowed by the protocol (Table 4). Liberal use of rescue reduces efficacy dropouts (Table 4), but may negatively impact assay sensitivity (Fig. 2). Stringent rescue regimens give rise to unacceptably high placebo dropout rates (29.0% for weak rescue and 50.4% for no rescue) but may improve assay sensitivity. An ideal rescue regimen balances these two phenomena by providing enough pain relief to prevent high placebo dropout rates, but not so much pain relief that the experiment is unduly confounded (Fig. 3).

Table 4 Dropout rates

Fig. 2 Differential impacts of rescue drug potency on placebo vs. treatment study arm. mg, milligram; SPID48, summed pain intensity difference in 48 h following first dose of study drug

Fig. 3 Balancing the amount and frequency of rescue medication is crucial in experimental design

Published Literature Can Be Misleading

In much of the currently published acute pain literature, the reported results have been calculated using single imputation techniques [1, 2, 8, 14, 25, 28, 29, 34, 36]. With single imputation techniques (like LOCF), all data after the first use of rescue are replaced with imputed data (Fig. 1c). This statistical technique masks the impact of efficacious rescue regimens on assay sensitivity [12, 21, 35]. Studies utilizing single imputation techniques can therefore use rescue with impunity. Even if rescue regimens are gratuitous, assay sensitivity will not be diminished.

There is growing scientific consensus that single imputation methods are unreliable in multi-dose studies [37, 35]. Modern scientific standards require the use of imputation techniques that minimize missing data (e.g., windowed LOCF; Fig. 1d). Researchers designing clinical trials that will employ modern imputation techniques often select rescue regimens that have been utilized with success in previous experiments. However, previously successful experiments that utilized liberal rescue regimens and calculated results via a single imputation technique may not have been successful if their results had been calculated with a currently acceptable multiple imputation technique. Therefore, modern-day researchers should pay special attention to the data imputation techniques utilized in previous investigations before relying on those investigations to guide their choice of rescue regimen.

Windowed Imputation May Not Be the Answer

Windowed imputation is a technique that is commonly employed to account for and mathematically reverse the impact of rescue therapy on placebo arm pain relief (Fig. 1d). In Fig. 4, non-imputed data are compared to data that have been analyzed using a windowed technique. One can see that in our data set, windowed techniques have mixed results. In three of five cases, they resulted in lower placebo SPID48 values (the desired and predicted statistical outcome; Fig. 4). In two circumstances, however, windowed techniques led to an increase in placebo SPID48 values (a counterintuitive and surprising outcome; Fig. 4).

Fig. 4 Comparison of SPID48 values [mean ± standard error (SE)] using no imputation or windowed imputation techniques. Data sets from five of seven articles included in this analysis; two remaining manuscripts did not provide windowed imputation data to allow for comparisons. Altman [1] and Jensen [41] used windowed baseline observation carried forward (6 h); Altman [14], Singla [31], and Singla [32] used windowed last observation carried forward (6 h). SPID48, summed pain intensity difference in 48 h following first dose of study drug

The Ethics of Rescue Therapy

Subjects in analgesic clinical trials must have adequate mechanisms for obtaining pain relief. When left untreated, pain can lead to significant postoperative sequelae (e.g., hypertension, impaired wound healing) [16, 18]. The clinical situation, determined by the surgical insult and the study population’s expected comorbidities, will dictate the rescue options that are adequate for the study in question.

It is important to understand that the provision of pain-relieving drugs to a subject in a research study can be accomplished in two ways: first, through administration of rescue medication as allowed and described in the study protocol, and second, through subject withdrawal. In other words, if the subject has received protocol-mandated rescue but has not experienced adequate pain relief, they can and should withdraw from the study. After study withdrawal, subjects can receive any analgesic deemed appropriate per the investigator’s clinical discretion.

The ethical implications of rescue should be considered with this two-tiered approach in mind. For example, in the Daniels Upmalis [7] study, no rescue therapy was allowed. While this provision may seem unethical, it is important to consider the clinical situation for subjects on postoperative day 1 after bunionectomy (they are generally young, healthy, and lucid), as these subjects are able to effectively verbalize their desire to withdraw from the study.

In contrast, a no-rescue-allowed approach would be inappropriate for subjects undergoing cardiac surgery, because (1) the sequelae of untreated pain (e.g., hypertension) may be quite serious, and (2) subjects post-cardiac surgery may not be entirely lucid and, as such, may be unable to express a desire for withdrawal in a timely fashion. Therefore, researchers must consider the clinical implications of their experimental models when adjudicating the ethics of any potential rescue paradigm.

Limitations and Conclusions

The rescue regimen utilized in an acute surgical pain study has an impact on the experimental outcomes. Despite the importance of rescue as a protocol design feature, researchers currently lack the evidence required to select ideal rescue regimens for their experiments. In this review, we attempted to analyze a homogeneous set of studies; as such, our sample size was small and included only one surgical model. Therefore, we acknowledge that our results must be interpreted with caution. Further work from colleagues and collaborators will be needed to validate the preliminary conclusions we have drawn in this manuscript.