Medications for Acute and Chronic Low Back Pain: A Review of the Evidence for an American Pain Society/American College of Physicians Clinical Practice Guideline

Literature Reviewed
We reviewed 1292 abstracts identified by searches for systematic reviews. Of these, 21 appeared potentially relevant and were retrieved. We excluded 7 outdated reviews of NSAIDs (28), antidepressants (29-31), and multiple drugs (9, 32, 33) (Appendix Table 4). We also excluded 3 reviews that did not clearly use systematic methods (34-36) and 4 systematic reviews that evaluated target medications but did not report results specifically for patients with low back pain (37-39). We included 7 systematic reviews (Appendix Table 5) of NSAIDs (40, 41), antidepressants (42, 43), skeletal mus

In the United States, low back pain is the fifth most common reason for all physician office visits and the second most common symptomatic reason (1,2). Medications are the most frequently recommended intervention for low back pain (1,3). In 1 study, 80% of primary care patients with low back pain were prescribed at least 1 medication at their initial office visit, and more than one third were prescribed 2 or more drugs (4).
The most commonly prescribed medications for low back pain are nonsteroidal anti-inflammatory drugs (NSAIDs), skeletal muscle relaxants, and opioid analgesics (4-7). Benzodiazepines, systemic corticosteroids, antidepressant medications, and antiepileptic drugs are also prescribed (8). Frequently used over-the-counter medications include acetaminophen, aspirin, and certain NSAIDs.
A challenge in choosing pharmacologic therapy for low back pain is that each class of medication is associated with a unique balance of benefits and harms. In addition, benefits and harms may vary for individual drugs within a medication class. Previous reviews found only limited evidence to support use of most medications for low back pain. For example, a systematic review published in 1996 found insufficient evidence to support use of any medication for low back pain other than NSAIDs (good evidence) and skeletal muscle relaxants (fair evidence) (9).
This article reviews current evidence on benefits and harms of medications for acute and chronic low back pain. It is part of a larger evidence review commissioned by the American Pain Society and the American College of Physicians to guide recommendations for management of low back pain (10).

Data Sources and Searches
An expert panel convened by the American Pain Society and the American College of Physicians determined which medications would be included in this review. The panel chose acetaminophen, NSAIDs (nonselective, cyclooxygenase-2 selective, and aspirin), antidepressants, benzodiazepines, antiepileptic drugs, skeletal muscle relaxants, opioid analgesics, tramadol, and systemic corticosteroids.
We searched MEDLINE (1966 through November 2006) and the Cochrane Database of Systematic Reviews (2006, Issue 4) for relevant systematic reviews, combining terms for low back pain with a search strategy for identifying systematic reviews. When higher-quality systematic reviews were not available for a particular medication, we conducted additional searches for primary studies (combining terms for low back pain with the medication of interest) on MEDLINE and the Cochrane Central Register of Controlled Trials. Full details of the search strategies are available in the complete evidence report (10). Electronic searches were supplemented by hand searching of reference lists and additional citations suggested by experts. We did not include trials published only as conference abstracts.

Evidence Selection
We included all randomized, controlled trials that met all of the following criteria: 1) reported in the English language, or in a non-English language but included in an English-language systematic review; 2) evaluated nonpregnant adults (>18 years of age) with low back pain (alone or with leg pain) of any duration; 3) evaluated a target medication, either alone or in addition to another target medication ("dual therapy"); and 4) reported at least 1 of the following outcomes: back-specific function, generic health status, pain, work disability, or patient satisfaction (11,12).
We excluded trials that compared dual-medication therapy with therapy using a different medication, medication combination, or placebo. We also excluded trials of low back pain associated with acute major trauma, cancer, infection, the cauda equina syndrome, fibromyalgia, and osteoporosis or vertebral compression fracture.
Because of the large number of trials evaluating medications for low back pain, our primary source for trials was systematic reviews. When multiple systematic reviews were available for a target medication, we excluded outdated systematic reviews, which we defined as systematic reviews with a published update, or systematic reviews published before 2000. When a higher-quality systematic review was not available for a particular intervention, we included all relevant randomized, controlled trials.

Data Extraction and Quality Assessment
For each included systematic review, we abstracted information on search methods; inclusion criteria; methods for rating study quality; characteristics of included studies; methods for synthesizing data; and results, including the number and quality of trials for each comparison and outcome in patients with acute (<4 weeks' duration) low back pain, chronic/subacute (>4 weeks' duration) low back pain, and back pain with sciatica. If specific data on duration of trials were not provided, we relied on the categorization (acute or chronic/subacute) assigned by the systematic review. For each trial not included in a systematic review, we abstracted information on study design, participant characteristics, interventions, and results.
We considered mean improvements of 5 to 10 points on a 100-point visual analogue pain scale (or equivalent) to be small or slight; 10 to 20 points, moderate; and more than 20 points, large or substantial. For back-specific functional status, we classified mean improvements of 2 to 5 points on the Roland-Morris Disability Questionnaire (scale, 0 to 24) and 10 to 20 points on the Oswestry Disability Index (scale, 0 to 100) as moderate (13). We also considered standardized mean differences of 0.2 to 0.5 to be small or slight; 0.5 to 0.8, moderate; and greater than 0.8, large (14). Some evidence suggests that our classifications of mean improvements and standardized mean differences for pain and functional status are roughly concordant in patients with low back pain (15-20). Because few trials reported the proportion of patients meeting specific thresholds (such as >30% reduction in pain score) for target outcomes, it was usually not possible to report numbers needed to treat for benefit. When those were reported, we considered a relative risk (RR) of 1.25 to 2.00 for the proportion of patients reporting greater than 30% pain relief (or a similar outcome) to indicate a moderate benefit.
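As an illustration only, the magnitude thresholds described above can be written as a short Python sketch. The function names are hypothetical, and several choices are assumptions rather than part of the review's methods: a value falling exactly on a shared boundary (10 points, or a standardized mean difference of 0.5) is assigned to the higher category, values below the lowest threshold get a placeholder label, and the sign of the standardized mean difference is ignored.

```python
def classify_pain_improvement(points: float) -> str:
    """Label a mean improvement on a 100-point pain scale.

    Thresholds follow the text (5-10 small, 10-20 moderate, >20 large).
    Assigning exactly 10 to "moderate" is an assumption, because the
    stated ranges share that boundary.
    """
    if points > 20:
        return "large"
    if points >= 10:
        return "moderate"
    if points >= 5:
        return "small"
    # The text does not name this category; the label is a placeholder.
    return "below small threshold"


def classify_smd(smd: float) -> str:
    """Label a standardized mean difference (0.2-0.5, 0.5-0.8, >0.8).

    Using the absolute value is an assumption: the direction of benefit
    depends on how each source defines the difference.
    """
    magnitude = abs(smd)
    if magnitude > 0.8:
        return "large"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "small"
    return "below small threshold"
```

Under these assumptions, an 18-point difference on a 100-point pain scale is labeled moderate, as is a standardized mean difference of 0.60 in either direction.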
Two reviewers independently rated the quality of each included trial. Discrepancies were resolved through joint review and a consensus process. We assessed internal validity (quality) of systematic reviews by using the Oxman criteria (Appendix Table 1, available at www.annals.org) (21,22). According to this system, systematic reviews receiving a score of 4 or less (on a scale of 1 to 7) have potential major flaws and are more likely to produce positive conclusions about effectiveness of interventions (22,23). We classified such systematic reviews as "lower quality"; those receiving scores of 5 or more were graded as "higher quality." We did not abstract results of individual trials if they were included in a higher-quality systematic review. Instead, we relied on results and quality ratings for the trials as reported by the systematic reviews. We considered trials receiving more than half of the maximum possible quality score to be "higher quality" for any quality rating system used (24,25).
We assessed internal validity of randomized clinical trials not included in a higher-quality systematic review by using the criteria of the Cochrane Back Review Group (Appendix Table 2, available at www.annals.org) (26). We considered trials receiving more than half of the total possible score (≥6 of a maximum 11) "higher quality" and those receiving less than half "lower quality" (24,25).
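The two quality cutoffs described above can be sketched as follows. This is a minimal illustration, not the reviewers' tooling: the function names are hypothetical, and the handling of a score that is exactly half of an even maximum is an assumption, since the text addresses only "more than half" and "less than half".

```python
def rate_systematic_review(oxman_score: int) -> str:
    """Oxman score on a 1-to-7 scale: 4 or less is lower quality, 5 or more higher."""
    if not 1 <= oxman_score <= 7:
        raise ValueError("Oxman score must be between 1 and 7")
    return "higher quality" if oxman_score >= 5 else "lower quality"


def rate_trial(score: int, max_score: int = 11) -> str:
    """Trial score: more than half of the maximum is higher quality.

    The default maximum of 11 matches the Cochrane Back Review Group
    criteria cited in the text; other rating systems can pass their own.
    """
    if not 0 <= score <= max_score:
        raise ValueError("score must be between 0 and max_score")
    if 2 * score > max_score:
        return "higher quality"
    if 2 * score < max_score:
        return "lower quality"
    # Exactly half (possible only for an even maximum) is not covered
    # by the text; this label is a placeholder.
    return "exactly half (not specified)"
```

With the 11-point scale, a score of 6 is rated higher quality and a score of 5 lower quality, matching the cutoff stated above.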

Data Synthesis
We assessed overall strength of evidence for a body of evidence by using methods adapted from the U.S. Preventive Services Task Force (27). To assign an overall strength of evidence (good, fair, or poor), we considered the number, quality, and size of studies; consistency of results among studies; and directness of evidence. Minimum criteria for fair- and good-quality ratings are shown in Appendix Table 3 (available at www.annals.org).
Consistent results from many higher-quality studies across a broad range of populations support a high degree of certainty that the results of the studies are true (the entire body of evidence would be considered good quality). For a fair-quality body of evidence, results could be due to true effects or to biases operating across some or all of the studies. For a poor-quality body of evidence, any conclusion is uncertain.
To evaluate consistency, we classified conclusions of trials and systematic reviews as positive (the medication is beneficial), negative (the medication is harmful or not beneficial), or uncertain (the estimates are imprecise, the evidence unclear, or the results inconsistent) (22). We defined "inconsistency" as greater than 25% of trials reaching discordant conclusions (positive vs. negative), 2 or more higher-quality systematic reviews reaching discordant conclusions, or unexplained heterogeneity (for pooled data).
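One possible reading of this inconsistency rule is sketched below. The text does not spell out how the 25% figure is computed, so the sketch assumes that it refers to the share of all trials whose conclusion falls in the less common direction (positive or negative), with uncertain trials counted in the denominator. The function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Iterable


def minority_share(conclusions: Iterable[str]) -> float:
    """Share of trials in the less common direction (positive vs. negative).

    Assumption: trials labeled "uncertain" count toward the denominator.
    """
    labels = list(conclusions)
    if not labels:
        return 0.0
    counts = Counter(labels)
    return min(counts["positive"], counts["negative"]) / len(labels)


def is_inconsistent(
    trial_conclusions: Iterable[str],
    reviews_discordant: bool = False,
    unexplained_heterogeneity: bool = False,
) -> bool:
    """Apply the three criteria from the text; any one is sufficient.

    reviews_discordant: 2 or more higher-quality systematic reviews
    reached discordant conclusions.
    unexplained_heterogeneity: pooled data showed heterogeneity that
    could not be explained.
    """
    return (
        minority_share(trial_conclusions) > 0.25
        or reviews_discordant
        or unexplained_heterogeneity
    )
```

Under this reading, 2 positive trials and 1 negative trial would be inconsistent (one third discordant), whereas 3 positive and 1 negative would not, because exactly 25% does not exceed the threshold.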

Role of the Funding Source
The funding source had no role in the design, conduct, or reporting of this review or in the decision to publish the manuscript.

Acetaminophen
Six unique trials of acetaminophen were included in a Cochrane review of NSAIDs (40, 41) and a systematic review of multiple medications for low back pain (47). From 134 potentially relevant citations, we identified 3 other trials of acetaminophen that met inclusion criteria (49-51). The longest trial of acetaminophen for acute or chronic low back pain lasted 4 weeks. We excluded 2 trials that did not evaluate efficacy of acetaminophen specifically for low back pain and 11 trials that compared dual therapy with acetaminophen plus another medication to a different medication, medication combination, or placebo.
For acute low back pain, 1 lower-quality trial included in the Cochrane review found no difference between acetaminophen (3 g/d) and no treatment (52). Four trials (3 of acute low back pain and 1 of mixed-duration back pain) found no clear differences in pain relief between acetaminophen at dosages up to 4 g/d and NSAIDs (40, 41).
For chronic low back pain, 1 higher-quality trial found acetaminophen inferior to diflunisal for patients reporting good or excellent efficacy after 4 weeks (53). Several other higher-quality systematic reviews of patients with osteoarthritis (not limited to the back) consistently found acetaminophen slightly inferior to NSAIDs for pain relief (standardized mean difference, about 0.3) (54-57).
Adverse events associated with acetaminophen for low back pain were poorly reported in the trials. Data on potentially serious harms, such as gastrointestinal bleeding, myocardial infarction, and hepatic adverse events, are particularly sparse.

NSAIDs
A total of 57 unique trials of NSAIDs were included in 3 systematic reviews (40,41,47,48). From 74 potentially relevant citations for aspirin and 85 potentially relevant citations for celecoxib (the only cyclooxygenase-2-selective NSAID available in the United States), we identified 1 trial of aspirin that met inclusion criteria (60). We excluded 1 trial that did not evaluate aspirin specifically for low back pain (61), 10 trials that evaluated selective NSAIDs not available in the United States, and 3 trials that evaluated celecoxib in postoperative settings.
For acute low back pain, a higher-quality Cochrane review (51 trials) found nonselective NSAIDs superior to placebo for global improvement (6 trials; RR, 1.24 [95% CI, 1.10 to 1.41]) and for not requiring additional analgesics (3 trials; RR, 1.29 [CI, 1.05 to 1.57]) after 1 week of therapy (40, 41). For chronic low back pain, an NSAID (ibuprofen) was also superior to placebo in 1 higher-quality trial (62). A second, higher-quality systematic review that included fewer (n = 21) trials reached conclusions consistent with the Cochrane review (47). For back pain with sciatica, 1 higher-quality systematic review found no difference between NSAIDs and placebo on a combined outcome of effectiveness (3 trials; odds ratio, 0.99 [CI, 0.6 to 1.7]) (48).
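For readers less familiar with the notation, the sketch below shows how a relative risk and its 95% CI are conventionally derived from a single two-group comparison, using the standard log-scale approximation. It is not the pooling method of the Cochrane review, whose estimates combine several trials. The function name is hypothetical, and the counts in the usage note are invented for illustration only.

```python
import math
from typing import Tuple


def relative_risk(
    events_treated: int,
    n_treated: int,
    events_control: int,
    n_control: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) for one two-group comparison.

    Uses the usual large-sample standard error of log(RR):
    sqrt(1/a - 1/n1 + 1/c - 1/n2). Requires nonzero event counts.
    """
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

With hypothetical counts of 60 of 100 patients improved on treatment versus 48 of 100 on placebo, `relative_risk(60, 100, 48, 100)` gives an RR of 1.25 with an interval that spans 1.0, so a single comparison of that size would not by itself show a statistically significant benefit.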
The Cochrane review found no evidence from 24 trials that any nonselective NSAID is superior to others for pain relief (40, 41). It also found no clear differences in efficacy between NSAIDs and opioid analgesics or muscle relaxants, although trials were limited by small sample sizes (6 trials, 1 higher-quality; 16 to 44 patients) (40, 41). Use of NSAIDs also was no more effective than nonpharmacologic interventions (spinal manipulation, physical therapy, bed rest).
The Cochrane review found that nonselective NSAIDs were associated with a similar risk for any adverse event compared with placebo (RR, 0.83 [CI, 0.64 to 1.08]) (40, 41). However, the trials were not designed to evaluate risks for less common but serious gastrointestinal and cardiovascular adverse events (63-65). Data on long-term benefits and harms associated with use of NSAIDs for low back pain are particularly sparse. Only 6 of 51 trials included in the Cochrane review were longer than 2 weeks in duration (the longest evaluated 6 weeks of therapy) (40, 41).
We found insufficient evidence from 1 lower-quality trial to accurately judge benefits or harms of aspirin (acetylsalicylic acid) for low back pain (60). Evidence regarding gastrointestinal safety of aspirin is primarily limited to trials of aspirin for prophylaxis of thrombotic events (66,67).

Antidepressants
Ten unique trials were included in 3 systematic reviews of antidepressants (42,43,47). In all of the trials, the duration of therapy ranged from 4 to 8 weeks. From searches for the serotonin-norepinephrine reuptake inhibitors duloxetine or venlafaxine, we identified no relevant trials from 14 citations.
For chronic low back pain, 2 higher-quality systematic reviews (1 qualitative [43] and 1 quantitative [42]) consistently found antidepressants to be more effective than placebo for pain relief. Effects on functional outcomes were inconsistently reported and did not indicate clear benefits.
Pooling data for all antidepressants, the quantitative systematic review (9 trials) estimated a standardized mean difference of 0.41 (CI, 0.22 to 0.61) for pain relief. However, effects on pain were not consistent across antidepressants. Tricyclic antidepressants were slightly to moderately more effective than placebo for pain relief in 4 (43) and 6 (42) trials (2 higher-quality) included in the systematic reviews, but paroxetine and trazodone (antidepressants without inhibitory effects on norepinephrine uptake) were no more effective than placebo in 3 trials. Maprotiline, the only tetracyclic antidepressant evaluated in trials included in the systematic reviews, is not available in the United States. There was insufficient evidence from 1 lower-quality trial (which found no differences) (68) to directly judge relative effectiveness of tricyclic antidepressants versus selective serotonin reuptake inhibitors.
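The standardized mean difference reported above is a pooled estimate across trials. As a reminder of what the unit means, the sketch below computes an unadjusted standardized mean difference (Cohen's d with a pooled standard deviation) for one hypothetical two-group comparison. The function name and all numbers are assumptions for illustration; the systematic review may have used a different estimator or a small-sample correction.

```python
import math


def standardized_mean_difference(
    mean_a: float, sd_a: float, n_a: int,
    mean_b: float, sd_b: float, n_b: int,
) -> float:
    """Difference in group means divided by the pooled standard deviation."""
    pooled_variance = (
        (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    ) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_variance)
```

For example, if pain improved by a mean of 30 points with a drug and 22 points with placebo, with a standard deviation of 20 points and 50 patients in each group, the function returns 0.4, which falls in the range this review labels small or slight.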
One systematic review found that antidepressants were associated with significantly higher risk for any adverse event compared with placebo (22% vs. 14%), although harms were generally not well reported (42). Drowsiness (7%), dry mouth (9%), dizziness (7%), and constipation (4%) were the most common adverse events. The trials were not designed to assess risks for serious adverse events, such as overdose, increased suicidality, or arrhythmias.

Benzodiazepines
Eight trials of benzodiazepines were included in a higher-quality Cochrane review of skeletal muscle relaxants (45,46). The trials ranged from 5 to 14 days in duration.
For acute low back pain, 1 higher-quality trial found no differences between diazepam and placebo (69), but another, lower-quality trial found diazepam superior for short-term pain relief and overall improvement (70). For chronic low back pain, pooled results from 2 higher-quality trials (71,72) found tetrazepam to be associated with a lower likelihood of not experiencing pain relief (RR, 0.71 [CI, 0.54 to 0.93]) or global improvement (RR, 0.63 [CI, 0.42 to 0.97]) after 8 to 14 days. A third, lower-quality, placebo-controlled trial of diazepam for chronic low back pain found no benefit (73).
In head-to-head trials included in the Cochrane review, efficacy did not differ between diazepam and tizanidine (1 higher-quality trial of acute low back pain [74]) or cyclobenzaprine (1 lower-quality trial of chronic low back pain [73]). For acute low back pain, a third, higher-quality trial found diazepam inferior to carisoprodol for muscle spasm, functional status, and global efficacy (global rating of "excellent" or "very good," 70% vs. 45% of patients) (75). One study that pooled data from 20 trials (n = 1553) found no difference between diazepam and cyclobenzaprine for short-term (14 days) global improvement (both were superior to placebo) but was excluded from the Cochrane review because it included patients with back or neck pain (mixed duration) (76).
Central nervous system events, such as somnolence, fatigue, and lightheadedness, were reported more frequently with benzodiazepines than with placebo (45,46).

Antiepileptic Drugs
We identified no systematic reviews of antiepileptic drugs for low back pain. From 94 citations, we identified 2 trials of gabapentin (77,78) and 2 trials of topiramate (79,80) that met inclusion criteria (Appendix Table 7, available at www.annals.org). The trials ranged from 6 to 10 weeks in duration. We identified no other trials of antiepileptic drugs for low back pain.
For low back pain with radiculopathy, 3 small (41 to 80 patients) trials found gabapentin (2 trials [78], 1 higher-quality [77]) and topiramate (1 higher-quality trial [79]) to be associated with small improvements in pain scores compared with placebo (or diphenhydramine as active placebo [79]). One trial reporting functional outcomes found no differences (79). For chronic low back pain with or without radiculopathy, 1 higher-quality trial found topiramate moderately superior to placebo for pain, but only slightly superior for functional status (80). There was no clear difference between gabapentin and placebo in rates of withdrawal due to adverse events. However, drowsiness (6%), loss of energy (6%), and dizziness (6%) were reported with gabapentin (77). Compared with diphenhydramine (active placebo), topiramate was associated with higher rates of withdrawal due to adverse events (33% vs. 15%), sedation (34% vs. 3%), and diarrhea (30% vs. 10%) in 1 trial (79).

Skeletal Muscle Relaxants
Thirty-six unique trials of skeletal muscle relaxants (drugs approved by the U.S. Food and Drug Administration for treatment of spasticity from upper motor neuron syndromes or spasms from musculoskeletal conditions) were included in 4 systematic reviews (44-48). The duration of therapy in all trials was 2 weeks or less, with the exception of a single 3-week trial.
For acute low back pain, a higher-quality Cochrane review found skeletal muscle relaxants moderately superior to placebo for short-term (2 to 4 days' duration) pain relief (at least a 2-point or 30% improvement on an 11-point pain rating scale) (45,46). The RRs for not achieving pain relief were 0.80 (CI, 0.71 to 0.89) at 2 to 4 days and 0.67 (CI, 0.13 to 3.44) at 5 to 7 days. There was insufficient evidence to conclude that any specific muscle relaxant is superior to others for benefits or harms (45,46). However, there is only sparse evidence (2 trials) on efficacy of the antispasticity drugs dantrolene and baclofen for low back pain. Tizanidine, the other skeletal muscle relaxant approved by the Food and Drug Administration for spasticity, was efficacious for acute low back pain in 8 trials. Only 1 trial of patients with chronic low back pain (a lower-quality trial of cyclobenzaprine that did not report pain intensity or global efficacy) evaluated a skeletal muscle relaxant available in the United States (73).
Two other systematic reviews had a smaller scope than the Cochrane review but reached consistent conclusions (44,47). One of the systematic reviews included 2 additional lower-quality trials of cyclobenzaprine for chronic or subacute low back or neck pain that reported mixed results compared with placebo (44). Another systematic review (48), which focused on interventions for sciatica, found no difference between tizanidine and placebo in 1 higher-quality trial (81).
Skeletal muscle relaxants were associated with a higher total number of adverse events (RR, 1.50 [CI, 1.14 to 1.98]) and central nervous system adverse events (RR, 2.04 [CI, 1.23 to 3.37]) compared with placebo, although most events were self-limited and serious complications were rare (45,46).

Opioid Analgesics
We identified no systematic reviews of opioids for low back pain. From 600 potentially relevant citations, we identified 9 trials of opioid analgesics that met inclusion criteria (Appendix Table 8, available at www.annals.org) (59,(82)(83)(84)(85)(86)(87)(88)(89). Twelve trials were excluded because they evaluated dual therapy with an opioid plus another medication compared with another medication or medication combination, 1 trial because it evaluated single-dose therapy, 2 trials because they did not report efficacy of opioids specifically for low back pain, and 2 trials because they did not evaluate any included outcome.
For chronic low back pain, a single higher-quality trial found that sustained-release oxymorphone or sustainedrelease oxycodone was superior to placebo by an average of 18 points on a 100-point pain scale (87). However, opioids were titrated to stable doses before randomization, so poorer outcomes with placebo could have been due in part to cessation of opioid therapy and to withdrawal. Two lower-quality trials reported no significant differences between propoxyphene and placebo for back pain of mixed duration (83) or codeine and acetaminophen for acute back pain (59).
Two systematic reviews of placebo-controlled trials of opioids for various noncancer pain conditions (most commonly osteoarthritis and neuropathic pain) found opioids to be moderately effective, with a mean decrease in pain intensity with opioids in most trials of at least 30% (38), or a standardized mean difference for pain relief of −0.60 (CI, −0.69 to −0.50) (39). In 1 of the reviews, opioids were also slightly superior for functional outcomes (standardized mean difference, −0.31 [CI, −0.41 to −0.22]) (39). Estimates of benefit were similar for neuropathic and nonneuropathic pain.
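For readers unfamiliar with the standardized mean difference, the sketch below shows one common way to compute it for a single trial: the difference in group means divided by the pooled standard deviation (Cohen's d), with an approximate 95% confidence interval. The function name and input values are hypothetical, and the exact estimator and pooling method used in the cited review (39) are not described here, so this is an assumption-laden illustration rather than a reproduction of that analysis.

```python
import math


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c, z=1.96):
    """Return (SMD, lower, upper) using the pooled standard deviation.

    A negative value means the treated group scored lower than the control
    group, which favors treatment when the outcome is pain intensity.
    """
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    smd = (mean_t - mean_c) / math.sqrt(pooled_var)
    # Large-sample approximation to the standard error of the SMD
    se = math.sqrt((n_t + n_c) / (n_t * n_c) + smd**2 / (2 * (n_t + n_c)))
    return smd, smd - z * se, smd + z * se


# Hypothetical trial: pain scores on a 100-point scale at follow-up.
smd, ci_low, ci_high = standardized_mean_difference(
    mean_t=40, sd_t=20, n_t=50,   # opioid group
    mean_c=52, sd_c=20, n_c=50,   # placebo group
)
print(f"SMD {smd:.2f} (CI, {ci_low:.2f} to {ci_high:.2f})")
```

Because the difference is expressed in standard deviation units, trials that used different pain scales can be combined on a common metric.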
There was no evidence from 5 lower-quality trials that sustained-release opioid formulations are superior to immediate-release formulations for low back pain on various outcomes (84-86, 88, 89). In addition, different long-acting opioids did not differ in 2 head-to-head trials (82,87).
In 1 higher-quality trial, 85% of patients with low back pain randomly assigned to receive opioids reported adverse events, with constipation and sedation as the most frequent symptoms (87). Trials of opioids were not designed to assess risk for abuse or addiction and generally excluded higher-risk patients. In addition, with the exception of 2 longer-term (16 weeks and 13 months) studies (82,88), all trials lasted fewer than 3 weeks.

Tramadol
Three trials of tramadol (90-92) were included in a systematic review of various medications for low back pain (47). From 147 potentially relevant citations, we identified 2 other trials of tramadol that met inclusion criteria (93,  (99). For chronic low back pain, tramadol was moderately more effective than placebo for short-term pain and functional status after 4 weeks in 1 higher-quality trial (92). Evidence from 2 trials (1 higher-quality) (90,91) was insufficient to judge efficacy of tramadol versus the combination of acetaminophen plus codeine or dextroprofen-trometamol (an NSAID not available in the United States). Two other lower-quality trials found no differences in benefits or harms between sustained-release and immediate-release tramadol for chronic low back pain (93,94). No trial compared tramadol with acetaminophen or opioid monotherapy, or with other NSAIDs. Tramadol was associated with similar rates of withdrawal due to adverse events compared with placebo (92) or the combination of acetaminophen plus codeine (91).

Systemic Corticosteroids
We identified no systematic reviews of systemic corticosteroids for low back pain. From 418 potentially relevant citations, we identified 4 trials that met inclusion criteria (Appendix Table 9, available at www.annals.org) (100-103). We excluded 3 trials that evaluated systemic corticosteroids in operative or postoperative settings and 1 German-language trial.
For acute sciatica or sciatica of unspecified duration, 3 small (33 to 65 patients), higher-quality trials consistently found systemic corticosteroids associated with no clinically significant benefit compared with placebo when given parenterally (single injection) or as a short oral taper (100,102,103). For patients with acute low back pain and a negative result on a straight-leg-raise test, a fourth trial found no difference in pain relief through 1 month between a single intramuscular injection of methylprednisolone (160 mg) and placebo (101).
A large (500-mg) intravenous methylprednisolone bolus was associated with 2 cases of transient hyperglycemia and 1 case of facial flushing in 1 trial (100). Another trial found a smaller (160-mg) intramuscular methylprednisolone injection associated with no cases of hyperglycemia requiring medical attention, infection, or gastrointestinal bleeding (101). Adverse events were poorly reported in the other trials.

Dual-Medication Therapy
Five trials comparing dual therapy with a skeletal muscle relaxant plus an analgesic (acetaminophen or an NSAID) versus the analgesic alone were included in a systematic review of skeletal muscle relaxants (45,46). One other trial evaluated an opioid plus an NSAID versus an NSAID alone (88). We identified no other trials evaluating dual-medication therapy versus monotherapy from any of the other systematic reviews or searches.
A higher-quality Cochrane review of skeletal muscle relaxants (45,46) found tizanidine combined with acetaminophen or an NSAID to be consistently associated with greater short-term pain relief than acetaminophen or NSAID monotherapy in 3 higher-quality trials. However, 2 lower-quality trials found no benefits from adding orphenadrine to acetaminophen or cyclobenzaprine to an NSAID. Compared with acetaminophen or an NSAID alone, adding a muscle relaxant was associated with a higher risk for adverse events of the central nervous system ( For chronic low back pain, 1 small (36 patients) trial found an opioid with naproxen slightly superior to naproxen alone for pain (5 to 10 points on a 100-point scale), anxiety, and depression after 16 weeks, but results are difficult to interpret because doses of naproxen were not clearly specified (88).

DISCUSSION
This review synthesizes evidence from systematic reviews and randomized, controlled trials of medications for treatment of low back pain. Main results are summarized in Appendix Tables 10 (acute low back pain), 11 (chronic or subacute low back pain), and 12 (low back pain with sciatica) (available at www.annals.org).
We found good evidence that NSAIDs, skeletal muscle relaxants (for acute low back pain), and tricyclic antidepressants (for chronic low back pain) are effective for short-term pain relief. Effects were moderate, except in the case of tricyclic antidepressants (small to moderate effects). We found fair evidence that acetaminophen, tramadol, benzodiazepines, and gabapentin (for radiculopathy) are effective for pain relief. Interpreting evidence on efficacy of opioids for low back pain is challenging. Although evidence on opioids versus placebo or nonopioid analgesics specifically for low back pain is sparse and inconclusive, recent systematic reviews of opioids for various chronic pain conditions found consistent evidence of moderate benefits (38,39). For all medications included in this review, evidence of beneficial effects on functional outcomes is limited. We found good evidence that systemic corticosteroids are ineffective for low back pain with or without sciatica. We could not draw definite conclusions about efficacy of other medications for sciatica or radiculopathy because few trials have specifically evaluated patients with this condition. One systematic review identified only 7 trials evaluating medications for sciatica (48).
Assessing comparative benefits between drug classes was difficult because of a paucity of well-designed, head-to-head trials. Gabapentin, for example, has been evaluated in only 2 small, short-term, placebo-controlled trials, and no trials directly compared potent opioids with other analgesics. One exception is acetaminophen, which was slightly but consistently inferior for pain relief compared with NSAIDs, although this conclusion assumes that estimates of pain relief from trials of osteoarthritis can be applied to patients with low back pain (54-57).
We also found little evidence of differences in efficacy within medication classes. However, head-to-head trials between drugs in the same class were mostly limited to NSAIDs and skeletal muscle relaxants. Among skeletal muscle relaxants, we found sparse evidence on efficacy of the antispasticity medications baclofen and dantrolene. Among antidepressants, tricyclics are the only class shown to be effective for low back pain, although other drugs with effects on norepinephrine uptake (such as duloxetine and venlafaxine) have not yet been evaluated.
In contrast to limited evidence of clear differences in benefits, we found clinically relevant differences between drug classes in short-term adverse events. For example, skeletal muscle relaxants, benzodiazepines, and tricyclic antidepressants are all associated with more central nervous system events (such as sedation) compared with placebo. Opioids seem to be associated with particularly high rates of short-term adverse events, particularly constipation and sedation. Data on serious (life-threatening or requiring hospitalization) adverse events associated with use of medications for low back pain are sparse. For NSAIDs, this is a critical deficiency because much of the uncertainty regarding their use centers on relative gastrointestinal and cardiovascular safety (63). For opioids and benzodiazepines, reliable evidence on such risks as abuse, addiction, and overdose is not available. Among skeletal muscle relaxants, clinical trials have shown no clear differences in rates of adverse events, but carisoprodol is known to be metabolized to meprobamate (a scheduled drug), dantrolene carries a black box warning for potentially fatal hepatotoxicity, and observational studies have found both tizanidine and chlorzoxazone to be associated with usually reversible and mild hepatotoxicity (104).
Our evidence synthesis has several potential limitations. First, because of the large number of published trials, our primary source of data was systematic reviews. The reliability of systematic reviews depends on how well they are conducted. We therefore focused on results from higher-quality systematic reviews, which are less likely than lower-quality reviews to report positive findings (22,23). In addition, overall conclusions were generally consistent between multiple higher-quality systematic reviews of a medication. Second, we only included randomized, controlled trials. Although well-conducted randomized, controlled trials are less susceptible to bias than other study designs, nearly all are "efficacy" trials conducted in ideal settings and selected populations, usually with short-term follow-up. "Effectiveness" trials or well-designed observational studies could provide important insight into benefits and harms of medications for low back pain in real-world practice. Third, high-quality data on harms are sparse. Better assessment and reporting of harms in clinical trials would help provide more balanced assessments of net benefits (105). Fourth, reporting of outcomes was poorly standardized across trials. In particular, the proportion of patients meeting predefined criteria for clinically important differences was rarely reported, making it difficult to assess clinical significance of results. Fifth, language bias could affect our results because we included non-English-language trials only if they were included in English-language systematic reviews. However, only 2 systematic reviews restricted inclusion solely to English-language trials (42,44). Finally, the systematic reviews included in our evidence synthesis did not assess for potential publication bias. Formal assessments of publication bias would be difficult to interpret because of small numbers of studies and clinical diversity among trials (106).
We also identified several research gaps that limited our ability to reach more definitive conclusions about relative benefits and harms of medications for low back pain. First, no trials formally evaluated different strategies for choosing initial medications. In addition, evidence is sparse on effectiveness of dual-medication therapy relative to monotherapy or sequential treatment, even though patients are frequently prescribed more than 1 medication (4). There is also little evidence on long-term (>4 weeks) use of any medication included in this review, particularly with regard to long-term harms.
In summary, several medications evaluated in this report are effective for short-term relief of acute or chronic low back pain, although each is associated with a unique set of risks and benefits. Individuals are likely to differ in how they prioritize the importance of these various benefits and harms. For mild or moderate pain, a trial of acetaminophen might be a reasonable first option because it may offer a more favorable safety profile than NSAIDs. However, acetaminophen also seems less effective for pain relief. For more severe pain, a small increase in cardiovascular or gastrointestinal risk with NSAIDs in exchange for greater pain relief could be an acceptable tradeoff for some patients, but others may consider even a small increase in these risks unacceptable. For very severe, disabling pain, a trial of opioids in appropriately selected patients (107-109) may be a reasonable option to achieve adequate pain relief and improve function, despite the potential risks for abuse, addiction, and other adverse events. Factors that should be considered when weighing medications for low back pain include the presence of risk factors for complications, concomitant medication use, baseline severity of pain, duration of low back symptoms, and costs. As in other medical decisions, choosing the optimal medication for an individual with low back pain should always involve careful consideration and thorough discussion of potential benefits and risks.

Appendix Table 1. Quality Rating System for Systematic Reviews
Criteria for Assessing Scientific Quality of Research Reviews* Operationalization of Criteria

1. Were the search methods reported?
Were the search methods used to find evidence (original research) on the primary questions stated? "Yes" if the review states the databases used, date of most recent searches, and some mention of search terms.
The purpose of this index is to evaluate the scientific quality (i.e., adherence to scientific principles) of research overviews (review articles) published in the medical literature. It is not intended to measure literary quality, importance, relevance, originality, or other attributes of overviews.

2. Was the search comprehensive?
Was the search for evidence reasonably comprehensive? "Yes" if the review searches at least 2 databases and looks at other sources (e.g., reference lists, hand searches, queries of experts).

3. Were the inclusion criteria reported?
Were the criteria used for deciding which studies to include in the overview reported?

4. Was selection bias avoided?
Was bias in the selection of studies avoided? "Yes" if the review reports how many studies were identified by searches, numbers excluded, and appropriate reasons for excluding them (usually because of predefined inclusion/exclusion criteria).
The index is for assessing overviews of primary ("original") research on pragmatic questions regarding causation, diagnosis, prognosis, therapy, or prevention. A research overview is a survey of research. The same principles that apply to epidemiologic surveys apply to overviews: A question must be clearly specified; a target population identified and accessed; appropriate information obtained from that population in an unbiased fashion; and conclusions derived, sometimes with the help of formal statistical analysis, as is done in meta-analyses. The fundamental difference between overviews and epidemiologic studies is the unit of analysis, not the scientific issues that the questions in this index address.

5. Were the validity criteria reported?
Were the criteria used for assessing the validity of the included studies reported?

6. Was validity assessed appropriately?
Was the validity of all the studies referred to in the text assessed by using appropriate criteria (either in selecting studies for inclusion or in analyzing the studies that are cited)? "Yes" if the review reports validity assessment and did some type of analysis with it (e.g., sensitivity analysis of results according to quality ratings, excluded low-quality studies).
Because most published overviews do not include a methods section, it is difficult to answer some of the questions in the index. Base your answers, as much as possible, on information provided in the overview. If the methods that were used are reported incompletely relative to a specific question, score it as "can't tell," unless there is information in the overview to suggest that the criterion was or was not met.

7. Were the methods used to combine studies reported?
Were the methods used to combine the findings of the relevant studies (to reach a conclusion) reported? "Yes" for studies that did qualitative analysis if report mentions that quantitative analysis was not possible and reasons that it could not be done, or if "best evidence" or some other grading of evidence scheme used.

8. Were the findings combined appropriately?
Were the findings of the relevant studies combined appropriately relative to the primary question the overview addresses? "Yes" if the review performs a test for heterogeneity before pooling, does appropriate subgroup testing, appropriate sensitivity analysis, or other such analysis.
For question 8, if no attempt has been made to combine findings, and no statement is made regarding the inappropriateness of combining findings, check "No." If a summary (general) estimate is given anywhere in the abstract, the discussion, or the summary section of the paper, and it is not reported how that estimate was derived, mark "No" even if there is a statement regarding the limitations of combining the findings of the studies reviewed. If in doubt, mark "Can't tell."

9. Were the conclusions supported by the reported data?
Were the conclusions made by the author(s) supported by the data and/or analysis reported in the overview?
For an overview to be scored as "Yes" in question 9, data (not just citations) must be reported that support the main conclusions regarding the primary question(s) that the overview addresses.

10. What was the overall scientific quality of the overview?
How would you rate the scientific quality of this overview?
The score for question 10, the overall scientific quality, should be based on your answers to the first 9 questions. The following guidelines can be used to assist with deriving a summary score: If the "Can't tell" option is used 1 or more times on the preceding questions, a review is likely to have minor flaws at best and it is difficult to rule out major flaws (i.e., a score ≤4). If the "No" option is used on question 2, 4, 6, or 8, the review is likely to have major flaws (i.e., a score ≤3, depending on the number and degree of the flaws).

The number of participants who are included in the study but did not complete the observation period or were not included in the analysis must be described and reasons given. If the percentage of withdrawals and dropouts does not exceed 15% and does not lead to substantial bias, a "yes" is scored.

Yes/No/Don't Know
J. Was the timing of the outcome assessment in all groups similar?
Timing of outcome assessment should be identical for all intervention groups and for all important outcome assessments.

Yes/No/Don't Know
K. Did the analysis include an intention-to-treat analysis? "Yes," if <5% of randomly assigned patients were excluded.
All randomly assigned patients are reported/analyzed in the group they were allocated to by randomization for the most important moments of effect measurement (minus missing values) irrespective of nonadherence and co-interventions.