Physical activity and the prevention, reduction, and treatment of alcohol and other drug use across the lifespan (The PHASE review): A systematic review

The aim of this review is to systematically describe and quantify the effects of PA interventions on alcohol and other drug use outcomes, and to identify any apparent effect of PA dose and type, possible mechanisms of effect, and any other aspect of intervention delivery (e.g. key behaviour change processes), within a framework to inform the design and evaluation of future interventions. Systematic searches were designed to identify published and grey literature on the role of PA for reducing the risk of progression to alcohol and other drug use (PREVENTION), supporting individuals to reduce alcohol and other drug use for harm reduction (REDUCTION), and promoting abstinence and relapse prevention during and after treatment of alcohol and other drug use (TREATMENT). Searches identified 49,518 records, with 49,342 excluded on title and abstract. We screened 176 full text articles from which we included 32 studies in 32 papers with quantitative results of relevance to this review. Meta-analysis of two studies showed a significant effect of PA on prevention of alcohol initiation (risk ratio [RR]: 0.72, 95% CI: 0.61 to 0.85). Meta-analysis of four studies showed no clear evidence for an effect of PA on alcohol consumption (standardised mean difference [SMD]: −0.19, 95% CI: −0.57 to 0.18). We were unable to quantitatively examine the effects of PA interventions on other drug use alone, or in combination with alcohol use, for prevention, reduction or treatment. Among the 19 treatment studies with an alcohol and other drug use outcome, there was a trend towards promising short-term effects, but information about intervention fidelity and exercise dose was limited and the risk of bias was moderate to high. We identified no studies reporting the cost-effectiveness of interventions. More rigorous and well-designed research is needed.
Our novel approach to the review provides a clearer guide to how future research can achieve this, and to the research questions that need to be addressed to inform policy and practice for different populations and settings.


Background
Globally, 5.9% and 1% of deaths are attributable to alcohol and other drug use, respectively (World Health Organisation, 2014). Worldwide, alcohol-attributable deaths increased from 3.8% in 2004 (World Health Organisation, 2011) to 5.9% in 2012 (World Health Organisation, 2014), and alcohol use is the seventh leading risk factor for both death and disability-adjusted life years (Griswold et al., 2018). Mental and addictive disorders affected more than one billion people worldwide in 2016, accounting for 7% of the global burden of disease (Rehm & Shield, 2019), with alcohol and other drug use accounting for 33.0 and 6.9 deaths per 100,000 people, respectively (Peacock et al., 2018). Globally, substance use did not decline between 2005 and 2010 (United Nations Office on Drugs and Crime, 2012), and the global lifetime prevalence of substance use disorder (SUD) is 3.5% (Degenhardt et al., 2019).
In the UK, alcohol harms are associated with an economic annual cost of around £21 billion (£3.5bn in healthcare (Public Health England, 2013)). Other drug use in the UK has an economic cost of around £15 billion (The Centre for Social Justice, 2013) (£488 million through healthcare (The National Treatment Agency for Substance Misuse, 2012)). Alcohol and other drug use is the leading behavioural risk factor for death in the UK for those aged 15-49 (Public Health England, 2017) and there has been an increase in levels of substance use in the UK in recent years with nearly 1 in 10 adults aged 16-59 in England and Wales having used substances in the past year (Public Health England, 2014).
Physical activity (PA, defined as any bodily movement produced by skeletal muscles that requires energy expenditure, inclusive of organised sport (World Health Organisation, 2017)) may affect alcohol and other drug use through various psychological mechanisms, including an acute reduction in cravings and urges, improved positive affect and mood, and sustained improvement in co-morbid depression and anxiety, which are often linked to alcohol and drug use (Linke & Ussher, 2015). Involvement in exercise may help people avoid drug use cues and provide exposure to new environments that offer diversionary, safe, and immediately rewarding experiences (Linke & Ussher, 2015). Behaviourally, meaningful participation in structured and rewarding activities is a key part of overcoming SUDs, and some physical activities may offer the chance for identity transformation through exposure to such activities, informal social controls, and promoted personal agency (Landale, 2012). Physiologically, animal studies suggest neurobiological changes associated with exercise (Linke & Ussher, 2015; Lynch, Peterson, Sanchez, Abel, & Smith, 2013), which may help to explain the consistent evidence that exercise acutely reduces consumption of cocaine, morphine, nicotine, and alcohol (Engelmann et al., 2014; Hashemi Nosrat Abadi, Vaghef, Babri, Mahmood-Alilo, & Beirami, 2013; Lynch et al., 2013; Sanchez, Moore, Brunzell, & Lynch, 2013; Thanos et al., 2013). The combination of cognitive, behavioural and physiological processes involved in engaging in PA interventions may provide a valuable approach to prevention, harm reduction and treatment for alcohol and drug use.
There is broad interest in the role of PA as a treatment and reduction strategy for alcohol and other drug use (Taylor, Oh, & Cullen, 2013; Volkow, 2011) and in how PA can be used to influence other health behaviours (Taylor, 2014; Thompson, Lambert, Greaves, & Taylor, 2018; Ussher, Faulkner, Angus, Hartmann-Boyce, & Taylor, 2019). PA interventions could impact on the prevention, reduction, and treatment of alcohol and other drug use, but previous reviews have been conceptually confusing and methodologically weak. For example, existing reviews have included studies with a focus on alcohol, substance use and smoking in a single meta-analysis, and have not discriminated between studies with a focus on efficacy (i.e., does a dose of exercise work?) and those with a focus on effectiveness (conducted in different settings with different intervention engagement issues, and possible mechanisms by which exercise may influence outcomes) (Hallgren, Vancampfort, Giesen, Lundin, & Stubbs, 2017; Wang, Wang, Wang, Li, & Zhou, 2014; Zschucke, Heinz, & Strohle, 2012). The reviews have also not clearly differentiated between studies with a focus on prevention, harm reduction and treatment (Hallgren et al., 2017; Simonton, Young, & Johnson, 2018), have included nicotine and cigarette smoking with alcohol and substances in the same analyses alongside studies reporting no outcome related directly to level of substance use (Simonton et al., 2018), and have used search strategies that were not comprehensive and only included randomised controlled trials (Zschucke et al., 2012).
Given the commonly different treatment approaches and prevalence levels for alcohol and other drug use, this review considers the influence of PA on each separately, even though there may be common potential processes involved. We include studies concerned with prevention, reduction and treatment of both alcohol and substance use in this one review because, while the co-morbidity of the behaviours has long been recognised (Stinson et al., 2005), services and interventions for both have become aligned in the UK and globally. Physical activity and exercise are rarely prioritised for promotion within cognitive-behavioural therapies for alcohol and other drug use, and a single review is more likely to provide a useful resource for professionals and policy makers, who are often working with clients with one or both conditions. Also, holistic approaches are routinely used and we are keen for the review to have as broad a reach as possible. For example, those working in the broader field of mental health treatment and prevention are likely to recognise the potential for physical activity to impact on common mechanisms such as depression and anxiety, which also co-exist with the uptake of, use of, and recovery from alcohol and other drug use.
We used the framework shown in Fig. 1 (Taylor & Faulkner, 2014) to structure our review of quantitative literature and to describe the complex interaction of the effects of PA on alcohol and drug use, and potential mechanisms of change. The intention is to describe processes and outcomes that will be of direct relevance to those designing and implementing interventions with PA to address alcohol and other drug use. The framework highlights six important research questions: (1) Do PA and sedentary behaviour (SB) contribute to preventing alcohol and other drug use? (2) What do we know about how dose of behaviour influences these effects? (3) What do we know about the physiological, psychological and social processes/mechanisms that mediate any effects of PA on alcohol and other drug use? (4) What approaches are most effective in supporting changes in PA and SB in the context of influencing alcohol and other drug use? (5) How much support, and in what form and duration, is needed to optimise changes in PA and SB? (6) What processes of behaviour change and psychological mediators are most important to include in an intervention to change PA and SB in the context of influencing alcohol and other drug use? It also highlights how dose of intervention may affect behaviour change processes, and how dose of PA may affect psychological and physical mechanisms.
The present review is restricted to addressing these questions by considering studies in which quantitative data have been reported. A separate paper details our findings from the synthesis of qualitative studies (Horrell et al., under review).

Aims
Linked to Fig. 1, our aims were as follows: Primary aim: to assess the effects of PA interventions, versus no PA control, on alcohol and other drug use outcomes.
Secondary aim: to describe any apparent effect of PA dose, possible mechanisms of effect, impact of intervention type and dose, and significant intervention components on alcohol and other drug use (including behaviour change processes), and to report and analyse any cost-effectiveness or economic data.

Methods
Detailed methods for this review have been published elsewhere, and the review is registered with PROSPERO (registration number CRD42017079322).
The scope of the review was to locate studies relating to PA and its potential to impact our three outcome domains: 1. Reduce the risk of progression to alcohol and other drug use; by preventing the initiation of use among those who are abstinent and preventing the progression to a disorder (PREVENTION); 2. Support individuals to reduce alcohol and other drug use for harm reduction; among the general population who do not have a diagnosed disorder or are not seeking structured treatment programmes (REDUCTION), and; 3. Promote abstinence and relapse prevention during and after structured treatment programmes for SUD (including AUD) (TREATMENT).
To differentiate between reduction and treatment, studies involving participants who were diagnosed with SUD or who were actively seeking treatment were classified as 'treatment' studies. Studies involving participants who were neither seeking treatment nor diagnosed with SUD were classified as reduction studies. Prevention studies were those that involved an intervention that explicitly aimed to prevent participant initiation of use, or had outcomes reflecting progression from non-use to use in those who have previously not used alcohol and/or other drugs. The differentiation between these three categories was developed by an expert advisory group and consultation with practitioners and people with lived experience, based on differing population characteristics and different treatment pathways and options.

Eligibility criteria
Searches were not limited by country and 1978 was chosen as a cutoff point based on a frequency analysis on a subsample of relevant literature.
We included quantitative studies (RCTs, quasi-RCTs, non-randomised controlled trials, controlled before and after studies, prospective or retrospective cohort studies that include a control group, historically controlled trials, nested case-control studies, case-control studies, and before-and-after comparisons). No restriction was placed on the setting or country, who delivered an intervention or in what format, and the participant characteristics.
We included studies evaluating or comparing interventions that included the promotion of PA (either as a sole focus or a substantial part of a multi-component intervention) either explicitly or implicitly targeting alcohol and other drug use. The comparator could be no intervention, treatment as usual (e.g. pharmacotherapy and psychological therapies), or alternative PA interventions (e.g. running vs walking).
For prevention studies we looked at rates of progression from non-use to use (initiation) and prevalence rates of alcohol or other drug use. For reduction we included reductions in level of use measured and reported in various ways (e.g. drinks per day, percent days abstinent). For treatment outcomes we included reduction outcomes as well as abstinence and relapse rates. Table 1 summarises population, intervention, comparator, and outcomes against our three domains of PA impact.

Information sources
A highly sensitive search strategy of published and grey literature was developed using background scoping searches, previously identified relevant research, and in consultation with subject experts and public and patient involvement. The strategy included searches of the following sources:

Search
The search strategy was intentionally kept broad to encompass the three aims of the review (i.e. prevention, reduction, and treatment). The strategy included searches of the above databases from Jan 1, 1975 to June 2, 2017 and was peer reviewed using the PRESS checklist (McGowan et al., 2016) prior to execution. No additional search filters or limiters were used. The strategy was translated for use in each database. An updated search was conducted on March 1, 2018. The updated search was refined based on included studies from the original search in line with best practices (Garner et al., 2016). A supplementary search was completed immediately prior to publication to identify any significant subsequent studies published after the updated search.
See Appendix 1 for a sample search strategy.

Grey literature search
Extensive grey literature searching was conducted to ensure maximum coverage of the subject area. The grey literature strategy encompassed focused searches in Google, Google Scholar, and several specialized databases. We also conducted backwards and forwards citation chaining of all included studies, and directly contacted known experts in the field and the lead authors of key publications for knowledge of any other relevant work. Any paper which described a relevant intervention and reported outcome data was considered for inclusion, including white papers, conference proceedings, or PhD theses. Masters theses were excluded.
See Appendix 2 for a sample grey literature search strategy.

Data management
Exported citations from traditional databases were imported and deduplicated in EndNote X8 (Clarivate Analytics). Grey literature citations were manually imported into EndNote or, where available, captured through a browser-based citation management plug-in (such as Zotero [https://www.zotero.org/]) then imported into EndNote. Using a structured and piloted data extraction form, we extracted relevant outcome data, study characteristics, and participant characteristics from each included paper. Data were extracted by one reviewer and checked by another (JH and TT).

Selection process
Two phases of study selection took place. Titles and abstracts were screened by two reviewers (TT/JH) independently, and disagreements were resolved by discussion or, where necessary, by a third reviewer (AT). Title and abstract screening was conducted using Rayyan software (QCRI; Doha, Qatar; https://rayyan.qcri.org/). Two initial subsets of 500 results were screened by two reviewers, and inclusion and exclusion discrepancies were discussed following each subset in order to ensure good agreement between reviewers. Following this, a set of 1000 was completed and discussed before the remaining results were screened independently by two reviewers. This helped ensure reliable and consistent screening. Full texts were obtained for studies appearing to meet the criteria above and screened by two reviewers (each paper was reviewed by one member of the team and checked by another; JH/TT).

Appraisal of studies (quality and bias)
Studies were assessed for bias based on the primary outcome of interest within the aims of this review. Randomised controlled trials were assessed for quality and risk of bias using the Cochrane Risk of Bias Tool 2.0 (Higgins et al., 2016), which was developed to assess behavioural interventions more appropriately. Non-randomised studies were assessed using the ROBINS-I tool. The GRADE approach (Guyatt et al., 2008) was used to indicate the overall quality of the evidence for each domain of prevention, reduction, and treatment.

Data synthesis
Where data allowed (e.g. data on the same outcome from at least two studies of similar design, intervention, and population), we conducted meta-analyses to estimate the overall effect and consistency of the intervention effect across studies. As the population and setting of studies were different, a random effects model was used to obtain the summary result as an estimate of the average intervention effect, rather than the common effect estimated from a fixed effects model (Borenstein, Hedges, Higgins, & Rothstein, 2010).
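The distinction between the two models can be illustrated with a minimal sketch. It assumes inverse-variance weighting and the DerSimonian-Laird estimator of between-study variance, which is a common default but is not stated in the text above; the function name and inputs are illustrative only.

```python
def pool_effects(estimates, variances):
    """Pool study effects (e.g. log risk ratios) by inverse-variance weighting.

    Returns (fixed_effect_estimate, random_effects_estimate, tau_squared).
    The DerSimonian-Laird estimator used here is an ASSUMPTION, not
    necessarily the estimator used in the review.
    """
    weights = [1.0 / v for v in variances]
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)

    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    random_effects = (
        sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    )
    return fixed, random_effects, tau2
```

When the estimated between-study variance is zero, the two sets of weights coincide and both models return the same point estimate; as it grows, the random-effects weights become more equal across studies.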
Data from non-randomised trials which used different study designs, or data from randomised trials and non-randomised trials, were not combined in meta-analyses (Higgins & Green, 2011). In these cases, where suitable numerical data were not available for pooling, or if pooling was considered inappropriate, we used other approaches to provide a systematic summary of the studies, including: tabulation, transformation of data into a common rubric (e.g. days abstinent), groupings and clusters (e.g. different populations to assess the influence of country, age, socioeconomic status, type/intensity of intervention, and setting), and textual descriptions including a detailed narrative synthesis (Popay et al., 2006). We present dichotomous data as risk ratios with their associated 95% confidence intervals (CI).
For continuous data, we calculate the mean differences (MD) for outcomes measured by the same scale, or the standardised mean differences (SMD) for outcomes measured by different scales, and present both with a 95% CI.
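As a hedged illustration of these two effect measures, the sketch below computes a mean difference and a standardised mean difference from group summary statistics. It assumes Cohen's d with a pooled standard deviation and a large-sample standard error; the text does not say whether a small-sample correction (Hedges' g) was applied, and the function names are illustrative.

```python
import math


def mean_difference(mean_a, mean_b):
    """Difference in means for outcomes measured on the same scale."""
    return mean_a - mean_b


def standardised_mean_difference(mean_a, sd_a, n_a, mean_b, sd_b, n_b, z=1.96):
    """Cohen's d with an approximate 95% CI (ASSUMED formula, no Hedges correction)."""
    pooled_sd = math.sqrt(
        ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    )
    d = (mean_a - mean_b) / pooled_sd
    # Large-sample approximation to the standard error of d.
    se = math.sqrt((n_a + n_b) / (n_a * n_b) + d ** 2 / (2 * (n_a + n_b)))
    return d, d - z * se, d + z * se
```

Dividing by the pooled standard deviation is what allows outcomes recorded on different scales (for example, drinks per week and questionnaire scores) to be placed on a common unit-free scale.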
If outcomes were collected at multiple time points, we attempt to present a summary effect over all time points.
Methods relating to issues of unit of analysis, dealing with missing data, assessing statistical heterogeneity, assessing reporting bias, sensitivity analysis, confidence in cumulative evidence, and external validity/generalisability are reported in the protocol.

Identification of studies (quantitative and qualitative)
Database searches (June 2017 and March 2018) resulted in 42,826 results after de-duplication. Grey literature searching returned 672 results from un-indexed databases, and 2660 results were screened from search engines. Other sources (citation chasing and contacting experts) returned 14 results. Following title and abstract screening (and readings of grey literature without abstracts) by two reviewers (TT and JH), 45,996 results were excluded, resulting in 176 full texts which were retrieved for further consideration. Following full text screening against the predefined inclusion criteria by two reviewers (TT and JH), with disputes resolved with the help of a third reviewer (AHT), 69 texts were included for review (see Fig. 1 for a flow diagram of identification of studies). One other paper was included following an additional search immediately prior to submission, bringing the total number of studies to 70.
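The screening arithmetic in this paragraph can be checked with a short tally. This is a minimal sketch: the variable names are illustrative, and all figures are those reported in the paragraph above.

```python
# Records screened at title and abstract stage, by source (figures from the text).
records_by_source = {
    "databases_after_deduplication": 42826,
    "unindexed_grey_literature_databases": 672,
    "search_engines": 2660,
    "citation_chasing_and_experts": 14,
}

total_screened = sum(records_by_source.values())
excluded_on_title_abstract = 45996
full_texts_retrieved = total_screened - excluded_on_title_abstract

included_after_full_text = 69
added_by_final_search = 1
total_included = included_after_full_text + added_by_final_search

print(total_screened, full_texts_retrieved, total_included)
```

The four sources sum to 46,172 records, which leaves the 176 full texts reported once the 45,996 exclusions are subtracted.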

Studies included in quantitative synthesis
Of the 70 identified texts, 32 are considered within this paper as containing quantitative data relating to the primary outcome of alcohol and other drug use in relation to the three domains of PA (see Fig. 2).
Of the 32 studies considered within this paper, 11 studies present data on both alcohol and drug use (with the outcomes often being inseparable and measured via a single tool), 14 focus on just alcohol use, and seven focus on just other drug use. Table 2 presents a breakdown of the number of studies by domain and number of randomised controlled trials.
Characteristics of included studies are given in Table 3.

Populations
All included studies took place in the USA and were conducted among adolescents. All participants were of middle or high school age (average age range 10.9-17.5 years), and interventions were set within schools or in community settings such as community counselling agencies or National Guard community sites.

Interventions
Interventions varied greatly in terms of type, intensity, mode of delivery and duration. Some interventions contained no direct PA. One intervention (Werch et al., 2003), delivered as part of a three-arm experimental design, consisted of a one-off consultation promoting an active lifestyle and emphasising the conflict between being active and consuming alcohol. In the comparator conditions, this was supplemented by a further consultation addressing risk/protective factors for alcohol use, and by mailed cards sent once a week for five weeks to the parents of the adolescents. Another intervention (Velicer et al., 2013) was a 30-min tailored computer-based intervention accessed five times over a three-year period, aimed at increasing PA levels (with a partially tailored element for increasing fruit and vegetable consumption and decreasing time spent watching television). Two studies focussed on more vigorous physical fitness (Collingwood et al., 1991, 2000), consisting of exercise training for one and a half hours three times a week for 9-12 weeks, and one study (Butzer et al., 2017) examined the impact of 35-min Kripalu Yoga sessions delivered once or twice a week over six months.

Controls/comparators
Two studies were of pre-post within-subjects design and therefore had no control or comparison condition (Collingwood et al., 1991, 2000). The remaining three studies all used a randomised design: one compared yoga to physical education as usual (Butzer et al., 2017); another used a cluster (school) design, with a comparator of computer-based tailored substance use prevention (with no PA component) equal in contact time (Velicer et al., 2013); and the other used a three-arm individually randomised design comparing three versions of the same one-off short counselling-type intervention for increasing PA (Werch et al., 2003). None of the studies incorporated what could be considered a true control condition, as they all used either an experimental intervention or an alternative PA intervention as the comparator. It is not clear whether one study actually decreased the level of PA completed by using yoga as an intervention in place of standard physical education (Butzer et al., 2017).

Outcomes
The main outcome of interest was subsequent levels of alcohol and other drug use representing a transition from not using to using (initiation). We also included studies which reported levels of use across time to show a reduction in use.

Initiation.
The three studies of randomised design reported measures relating to alcohol use initiation (Butzer et al., 2017; Velicer et al., 2013; Werch et al., 2003). One study collected data at the end of the intervention period (Velicer et al., 2013), one at three months post intervention (Werch et al., 2003), and the other at one year post intervention (Butzer et al., 2017). All three used a different inventory for collecting the status of participants. One study reported significantly lower levels of alcohol use initiation at 12, 24 and 36 months in the "energy balance" intervention aimed at increasing PA compared to a comparison intervention aimed explicitly at reducing other drug use (Velicer et al., 2013), with 10.1% of the energy balance group having initiated alcohol use compared to 14.4% in the other drug use prevention intervention. A study with three experimental arms reported a reduction in 30-day heavy drinking (F(1,441) = 4.05, p = 0.04) and alcohol initiation (F(1,441) = 4.27, p = 0.03) at 3 months post intervention across all groups, with no effect of group (Werch et al., 2003). One preliminary randomised study investigating the effects of a yoga intervention reported no difference between arms, at any time point, in the proportion of participants who reported having ever tried a sip of alcohol (Butzer et al., 2017).

Reduction.
One study reported a reduction in 30-day heavy drinking (F(1,441) = 4.05, p = 0.04) across all three experimental arms of the study with no difference between groups (Werch et al., 2003). Two non-randomised studies examining physical fitness interventions with pre-post measures showed non-significant trends for reduction in alcohol and other drug use levels (Collingwood et al., 1991, 2000), and adherence was poor. Generally, all studies reported increases in reported levels of PA as a result of different modalities of intervention. Adherence is unknown due to lack of reporting. One study observed greater reductions in self-reported substance use patterns amongst those who demonstrated quantifiable improvements in physical fitness (Collingwood et al., 1991) compared to those who showed no improvement.
No serious adverse events related to PA were reported in any of the studies.

Meta-analyses
Of the five studies considered for prevention, only two studies had an outcome which could be combined for meta-analysis after conversion (Butzer et al., 2017;Velicer et al., 2013) and focussed on initiating alcohol use during or following intervention.
Of the two studies included, one (Butzer et al., 2017) reported the number of participants who had ever used alcohol at the end of the intervention, and the other (Velicer et al., 2013) reported the percentage of people who had initiated alcohol use in each group at the end of the intervention. Multiplying this percentage by the total number of participants in each group gave the number of participants who had initiated alcohol use at the end of the intervention for each group.
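The conversion described here, and the risk ratio it feeds into, can be sketched as follows. The percentages are those reported for Velicer et al. (2013) earlier in this section; the group sizes are hypothetical placeholders chosen for illustration, and the Wald interval on the log scale is an assumed (standard) method rather than one confirmed by the text.

```python
import math


def events_from_percentage(percent, n_participants):
    """Convert a reported percentage into a whole number of participants."""
    return round(percent / 100 * n_participants)


def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A versus group B with a Wald CI on the log scale."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# HYPOTHETICAL group sizes: the trial's actual numbers are not given here.
N_PA_GROUP = 1000
N_COMPARISON_GROUP = 1000

pa_events = events_from_percentage(10.1, N_PA_GROUP)
comparison_events = events_from_percentage(14.4, N_COMPARISON_GROUP)
rr, lower, upper = risk_ratio(
    pa_events, N_PA_GROUP, comparison_events, N_COMPARISON_GROUP
)
```

A risk ratio below 1 indicates fewer participants initiating alcohol use in the PA group than in the comparison group; the width of the interval depends on the real group sizes, so the placeholder values should not be read as the study's result.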
The summary data from these two studies were combined in a meta-analysis. The overall risk ratio of alcohol use initiation was 0.72 (95% CI: 0.61 to 0.85) as estimated from the random-effects meta-analysis, and 0.72 (95% CI: 0.61 to 0.84) as estimated from the fixed-effect meta-analysis (Fig. 3). There is evidence for an effect of the intervention on reducing alcohol use initiation in middle school students. There is little between-study heterogeneity (I² = 0.0%).
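For readers less familiar with the heterogeneity statistic, the sketch below shows one common way I² is derived from Cochran's Q. It assumes the usual definition, I² = max(0, (Q − df) / Q) × 100, and takes log risk ratios and their variances as input; the function name is illustrative and the review's own software may differ in detail.

```python
def i_squared(estimates, variances):
    """Percentage of variability attributed to between-study heterogeneity."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100
```

Whenever Q does not exceed its degrees of freedom, I² is truncated to 0%, which is the situation in which fixed-effect and random-effects point estimates coincide, as they do for the two prevention studies pooled here.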

Risk of bias and quality appraisal
Two studies of randomised design were judged at high risk of bias (Butzer et al., 2017; Velicer et al., 2013), with some concerns over the risk of bias for the other (Werch et al., 2003). Two studies of non-randomised design (Collingwood et al., 1991, 2000) were judged to have a severe risk of bias (see Supplementary file 1).

Quality of the evidence
Using the GRADE approach for assessing the quality of evidence (Guyatt et al., 2008), the evidence for PA and the prevention of alcohol and other drug use was consistently downgraded due to study limitations, inconsistency of results, indirectness of evidence and imprecision. Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect (Table 4). Further research is likely to have an important impact and change our findings.
For details of included studies, see Table 5.

Populations
Five of the studies (four in the USA and one in Australia) recruited undergraduate university students (Correia et al., 2005; Murphy et al., 1986; Oaten & Cheng, 2006; Weinstock et al., 2014, 2016), one study conducted in Greece recruited heavy drinking male volunteers (Georgakouli et al., 2017), another study in the USA recruited females with at least sub-threshold post-traumatic stress disorder (Reddy et al., 2014), and another recruited "native" adolescents in Canada (Scott & Myers, 1988).

Interventions
Interventions varied in intensity and duration from a one-off provision of oral and written instruction to increase and self-monitor PA in the next 28 days (Correia et al., 2005) to 8 weeks of three-times-weekly running sessions of 70 min with instructions to run "some other time of the week" on their own (Murphy et al., 1986). One study of within-subject pre-post design reported no more information than an eight-week supervised exercise intervention of moderate intensity at 50-60% of heart rate reserve (Georgakouli et al., 2017). Another intervention involving sedentary undergraduate students included three to four sessions per week over eight weeks of gym-based exercise tailored to the individual (Oaten & Cheng, 2006) and also incorporated a "thought suppression" task. Two studies recruiting alcohol-drinking college students in the USA used a counselling-based intervention: one compared motivational enhancement therapy (MET) with MET plus contingency management (Weinstock et al., 2014); the other (Weinstock et al., 2016) consisted of an intervention including two motivational interviewing (MI) sessions with eight weekly exercise contracting sessions. The latter study compared two forms of contingency management: in one group, reinforcement was offered for completing physical activities, and in the other, for attending the exercise contracting sessions. One study recruited women with at least sub-threshold PTSD and offered twelve 75-min sessions of trauma-sensitive hatha yoga incorporating elements of mindfulness and dialectical behavioural therapy (Reddy et al., 2014).

[Table residue: GRADE quality of evidence rated ⊕⊕◯◯ (low). Table notes: a reduction represents fewer participants reporting having initiated alcohol use at follow-up; high and severe risk of bias was found in most studies, related to confounding, missing data, and outcome measurement. A further table entry reads: no change in desire to drink, but changes in amount drunk.]

Controls/comparators
Only three studies had what could be considered a true control condition. One three-arm randomised trial (Murphy et al., 1986) compared running to either meditation or a no-treatment control group asked to keep journals of their behaviour. Another three-arm randomised trial (Correia et al., 2005) compared oral and written advice to exercise with oral and written advice to reduce frequency of substance consumption and with a control group. Another two-arm RCT compared hatha yoga with an assessment control group (Reddy et al., 2014). One further study used a waiting list control phase (Oaten & Cheng, 2006). The remaining studies consisted of two experimental arms (Weinstock et al., 2014, 2016).

Outcomes
Two studies reported data collected using the timeline follow-back method (Weinstock et al., 2014, 2016). Other studies reported data collected using a variety of questionnaires designed to capture alcohol consumption on a daily or weekly basis, although one study collected data using the Alcohol Use Disorders Identification Test (AUDIT) and the Drug Use Disorders Identification Test (DUDIT) (Reddy et al., 2014). The majority of follow-up data were collected at the end of the intervention period, with the longest follow-up at six months post baseline (Weinstock et al., 2016).
Of the randomised studies with a control group, one study of three times weekly running demonstrated a significant reduction in alcohol consumption of 60% at longest follow-up compared to control, and a non-significant reduction compared to the meditation arm (Murphy et al., 1986). Providing written and oral information on increasing PA, compared to information on other drug use reduction, significantly reduced the number of drug use days (including alcohol and other drugs) from baseline to follow-up (Correia et al., 2005), by approximately two days over the previous 28 days in the activity group. An intervention comparing yoga to a control group (Reddy et al., 2014) saw improvements in both AUDIT and DUDIT scores one month post intervention, and although the scores worsened for the control group, the differences were not statistically significant. One study using a wait list control design with three cohorts saw a reduction of five standard drinks per week during the intervention phase compared to the control phase. One small within-subjects study (n = 11) reported a significant reduction at the end of the intervention in all variables relating to alcohol consumption, most notably a reduction from 19.00 (SD = 3.20) alcohol units at baseline to 11.64 (SD = 3.03) at follow-up (Georgakouli et al., 2017). However, this study reported no change in variables related to desire to reduce or stop drinking. One study examining MET compared to MET plus contingency management found no differences in alcohol use for either time or condition (Weinstock et al., 2014) despite increases in self-reported levels of PA. However, this pilot study had a limited sample size (n = 31) and a baseline imbalance (the MET plus contingency management group had significantly more drinking days at baseline than the MET alone group).
The other study comparing two experimental conditions involving PA (Weinstock et al., 2016) showed that it made no difference whether reinforcement was given for attending an exercise contracting session or for completing contracted exercise activities: significant reductions in the number of weekly drinks and weekly binge drinking episodes were seen at all follow-ups in both arms, with no difference between groups. The authors also reported that changes in PA levels were not predictive of the changes observed in alcohol drinking behaviour.

Secondary outcomes.
Several studies (Georgakouli et al., 2017; Murphy et al., 1986; Scott & Myers, 1988; Weinstock et al., 2014) reported improvements in physical fitness measures at follow-up as a result of the intervention. Self-reported levels of PA were also shown to increase across a variety of interventions (Correia et al., 2005; Georgakouli et al., 2017; Weinstock et al., 2014). Adherence data relating to PA were sparse and rarely reported. One study reported 100% completion of the intervention, with participants running an average of 3.4 times per week over an eight-week intervention (Murphy et al., 1986), and another reported an average attendance of 6.94 out of a possible eight exercise contracting sessions over eight weeks, with around 60% of planned activities subsequently verified as completed. No studies reported mediating effects of PA on alcohol and substance use outcomes.
No serious adverse events related to PA were reported in any of the studies.

Meta-analyses
Only one analysis with four studies focussed on alcohol consumption was possible, presented as changes in total drinks per week at end of intervention.
Correia et al. (2005) reported a three-arm randomised trial with two intervention groups and one control group. Because both intervention groups involved PA, we combined them into a single group using the methods described in the Cochrane Handbook.
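The pooling of two arms into one group can be sketched as follows. This is a minimal illustration of the standard Cochrane Handbook formula for combining the sample size, mean and standard deviation of two groups; the function name and the input values are hypothetical and are not taken from Correia et al. (2005).

```python
import math

def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two arms into a single group; returns (n, mean, sd)."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # The pooled variance adds a term for the gap between the two arm means.
    var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
           + (n1 * n2 / n) * (m1 - m2) ** 2) / (n - 1)
    return n, mean, math.sqrt(var)

# Hypothetical arms: (n, mean drinks per week, SD)
print(combine_groups(30, 12.0, 6.0, 32, 10.5, 5.5))
```

The result is identical to the sample size, mean and standard deviation that would be obtained if the raw data from both arms were analysed as one group.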
Three studies (Correia et al., 2005; Weinstock et al., 2014, 2016) reported mean units of drinks per week at baseline and at the end of the intervention. We calculated the change in mean units of drinks per week and its associated standard deviation for both intervention and control groups. The standard deviation was calculated by assuming a correlation coefficient of 0.5 between baseline and final measurements (Fig. 4). We carried out a sensitivity analysis assuming correlation coefficients of 0.7 (Fig. 5) and 0.9 (Fig. 6).
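The imputation of a change-score standard deviation from an assumed correlation can be sketched as below. The formula is the usual one for the standard deviation of a difference between two correlated measurements; the function name and the baseline and final SDs are hypothetical values chosen only to show how the result moves across the three correlation coefficients used in the sensitivity analysis.

```python
import math

def change_sd(sd_baseline, sd_final, r):
    """SD of (final - baseline) given the SD at each time point and correlation r."""
    return math.sqrt(sd_baseline ** 2 + sd_final ** 2
                     - 2 * r * sd_baseline * sd_final)

# Hypothetical SDs of drinks per week at baseline and at end of intervention
for r in (0.5, 0.7, 0.9):
    print(r, round(change_sd(8.0, 7.0, r), 2))
```

A higher assumed correlation gives a smaller change-score standard deviation, and therefore a larger standardised effect for the same mean change, which is why the choice of correlation is worth testing in a sensitivity analysis.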
The overall standardised mean difference (SMD) in total drinks per week was −0.27 (95% CI: −0.69 to 0.15) from the random-effects meta-analysis and −0.24 (95% CI: −0.51 to 0.03) from the fixed-effect meta-analysis. Both intervals include zero, suggesting insufficient evidence for an effect of exercise on reducing total drinks per week.
A subgroup analysis of three randomised controlled trials (Correia et al., 2005; Weinstock et al., 2014, 2016) showed an overall SMD of −0.15 (95% CI: −0.52 to 0.22), indicating insufficient evidence for an effect of the intervention on reducing total drinks per week. There was one within-subject study (Georgakouli et al., 2017), for which the SMD was −1.01 (95% CI: −1.90 to −0.11).
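The difference between the fixed-effect and random-effects pooled estimates reported above can be illustrated with a short inverse-variance sketch. The review does not state which random-effects estimator was used, so the DerSimonian-Laird estimator here is an assumption, as are the function names and the example effect sizes and variances.

```python
import math

def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate with a 95% CI."""
    weights = [1 / v for v in variances]
    est = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    return est, est - 1.96 * se, est + 1.96 * se

def pool_random_dl(effects, variances):
    """Random-effects pooled estimate (DerSimonian-Laird, assumed here)."""
    if len(effects) < 2:
        return pool_fixed(effects, variances)
    weights = [1 / v for v in variances]
    fixed_est = pool_fixed(effects, variances)[0]
    # Cochran's Q measures how far the studies scatter around the fixed estimate.
    q = sum(w * (y - fixed_est) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-study variance
    return pool_fixed(effects, [v + tau2 for v in variances])

# Hypothetical SMDs and their variances for four studies
smds = [-0.10, -0.30, 0.05, -1.00]
variances = [0.08, 0.10, 0.06, 0.21]
print(pool_fixed(smds, variances))
print(pool_random_dl(smds, variances))
```

When the studies disagree more than chance would predict, the estimated between-study variance is added to each study's variance, which widens the random-effects interval relative to the fixed-effect one.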

Risk of bias and quality appraisal
Of five randomised controlled trials, one study was judged at high risk of bias (Murphy et al., 1986) and four had some concerns over risk of bias (Correia et al., 2005; Reddy et al., 2014; Weinstock et al., 2014, 2016). Of three studies of non-randomised design, two were judged to be at severe risk of bias (Oaten & Cheng, 2006; Scott & Myers, 1988), and one at critical risk of bias (Georgakouli et al., 2017). (See supplementary file 1).

[Table 7 (continued): details of included studies]

Quality of the evidence
Using the GRADE assessment of quality (Guyatt et al., 2008), the evidence for PA and reduction of alcohol and other drug use was consistently downgraded due to study limitations, inconsistency of results, indirectness of evidence and imprecision. We have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect. Further research is needed to draw any conclusions (Table 6).
For details of included studies see Table 7.

Populations
All studies included adults who were either diagnosed as alcohol or substance dependent (substance use disorder, SUD) or were engaged in specialist treatment programmes across a variety of settings (e.g. inpatient, outpatient, community, or residential programmes); one focussed solely on currently abstinent patients with severe alcohol dependence (Giesen et al., 2016). Only one study included participants not receiving specialist alcohol or other drug treatment (non-treatment seeking cannabis dependent adults; Buchowski et al., 2011). One study included people attending heroin assisted treatment (Colledge et al., 2017) and two studies included those engaged in methadone maintenance therapy (Cutter et al., 2014; Shaffer et al., 1997). One study included inactive alcohol dependent women with depressive symptoms (Abrantes et al., 2017), one included veterans of a residential programme for homelessness and other drug abuse (Burling et al., 1992), and one included veterans attending an outpatient drug and alcohol programme (Linke et al., 2019). Along with alcohol, other drugs used included heroin, opioids, methamphetamine, cocaine, stimulants, and polysubstance use. The majority of the studies were conducted in the USA (n = 12), with one study conducted in each of Switzerland, Scotland, Germany, Sweden, Norway, Denmark, and Canada.

Interventions
Interventions varied greatly across studies, including: supervised treadmill exercise sessions three times a week for 12 weeks, aimed at achieving a predefined 'dose' of PA (Trivedi et al., 2017); ten supervised treadmill sessions at 60-70% of heart rate reserve over two weeks (Buchowski et al., 2011); three 30-min treadmill sessions at 75% of maximum heart rate with concurrent computerised CBT modules, with up to $700 and new running shoes, socks, shorts and a t-shirt for attending all sessions (De La Garza et al., 2016); one-to-one consultations with a PA counsellor weekly or biweekly for 12 weeks and provision of a FitBit activity tracker to guide progressive goal setting (Abrantes et al., 2017); a 12-week multicomponent intervention of group psychoeducation for exercise, gym membership and weekly group activity, and provision of a FitBit activity tracker (Linke et al., 2019); community-based participation in a softball team playing matches once a week with training sessions twice a week for six months (Burling et al., 1992); a 12-week multicomponent intervention consisting of moderate intensity aerobic exercise, group behavioural treatment, and an incentive system (Brown et al., 2014); 12 weeks of parallel groups chosen by the participant, including either moderate to vigorous varied activities (e.g. climbing, badminton, strength training, or dance) or walking and coordination games for those less physically able or with a dislike of sports (Colledge et al., 2017); eight weeks of active video gameplay (Wii Fit Plus) for 20-25 min five times per week (Cutter et al., 2014); 10 weeks of weekly 90-min yoga sessions combined with instruction to complete yoga at home once per day (Hallgren et al., 2014); and twice weekly supervised running sessions for 24 weeks.

Controls/comparators
Of the 10 randomised controlled trials (three of which had three arms), six employed control conditions matched to the PA intervention in terms of time, frequency, and duration but without a PA element (Colledge et al., 2017; Cutter et al., 2014; De La Garza et al., 2016; Donaghy, 1997; Rawson et al., 2015; Trivedi et al., 2017), one used brief advice to exercise as the comparator (Brown et al., 2014), and three included control arms consisting of treatment as usual (Hallgren et al., 2014; Shaffer et al., 1997). Three prospective studies of non-randomised design compared PA with treatment as usual (Burling et al., 1992; Giesen et al., 2016; Sinyor et al., 1982), and one retrospective controlled study compared those who chose PA as a contingency management plan for preventing relapse with those who chose non-PA activities (Weinstock et al., 2008).

Alcohol.
Of the studies with a control group, only one study (Brown et al., 2014), with 26 participants, reported statistically significant beneficial effects of a multicomponent PA intervention on self-reported alcohol use (number of drinking and heavy drinking days) compared to usual care during the intervention period, but the difference was not sustained at 12 weeks follow-up. Three other studies showed a non-significant trend in favour of a PA intervention compared to a control (Giesen et al., 2016; Hallgren et al., 2014; Sinyor et al., 1982), and two studies showed no differences between arms. One study of pre-post design showed a significant reduction in drinking days following a 12-week multicomponent PA intervention, with 44% remaining abstinent throughout (Abrantes et al., 2017).

Other drugs.
All other drug use studies (n = 7) involved a control condition. Only one study reported significant reductions in levels of use of the targeted drug (cannabis) compared to control, at the end of two weeks of ten treadmill running sessions among 12 participants (Buchowski et al., 2011). One three-arm study showed a non-significant trend for lower levels of cocaine use throughout the study in favour of walking and running conditions compared to a sitting condition (De La Garza et al., 2016); however, when data from the walking and running groups were combined, there were significant differences in favour of the active group compared to passive sitting. One other study showed a non-significant trend in favour of a PA intervention on abstinence rates up to 6 months post discharge from residential care for methamphetamine use (Rawson et al., 2015). Four other studies reported no significant differences between arms (Colledge et al., 2017; Cutter et al., 2014; Shaffer et al., 1997; Trivedi et al., 2017).

Comorbid alcohol and other drug use.
Two studies with a control condition reported significant results. One demonstrated a higher rate of abstinence at three months post discharge from outpatient care following participation in a softball team (Burling et al., 1992). The other, a retrospective control comparison, showed that those who chose to participate in a range of physical activities achieved longer periods of abstinence than those who did not (Weinstock et al., 2008).
Two studies of pre-post design reported significantly lower levels of use following a 12-week multicomponent PA intervention compared to baseline (Brown et al., 2010; Linke et al., 2019), but in one of these studies, with a three-month follow-up, the effect diminished (Brown et al., 2010). One other pre-post study reported no effect on screening scores for alcohol and other drug use of an adjuvant PA intervention during outpatient treatment (Mamen et al., 2011).

Secondary outcomes.
PA levels were reported in a variety of ways, and in some cases not at all. Two studies using FitBit activity trackers reported significant increases in the number of daily steps from baseline to follow-up (Abrantes et al., 2017; Linke et al., 2019). Two studies reported higher self-reported levels of PA in the experimental arm compared to a comparison group (Cutter et al., 2014; Donaghy, 1997). Four studies reported measures of improved physical fitness associated with the intervention (Brown et al., 2010; Donaghy, 1997; Linke et al., 2019; Sinyor et al., 1982).
Adherence data were reported in the majority of studies, but in a wide variety of formats. Some were very detailed (e.g. the Fitbit was worn for 73% of days and participants completed an average of 4.7 out of 6 scheduled telephone PA counselling sessions) (Abrantes et al., 2017), and some were particularly vague (e.g. "the majority of participants attended twice a week") (Giesen et al., 2016). An adherence rate of around 60-70% of planned intervention sessions was the most commonly reported range (Abrantes et al., 2017; Brown et al., 2010, 2014; Cutter et al., 2014; Linke et al., 2019; Rawson et al., 2015; Trivedi et al., 2017). Some studies conducted sensitivity analyses related to adherence to PA (regardless of allocation) or to the intervention, and showed that higher levels of adherence were frequently associated with better outcomes related to alcohol or substance use (Brown et al., 2010, 2014; Rawson et al., 2015; Trivedi et al., 2017).
One study reported better outcomes of the PA intervention in patients with less severe methamphetamine use than in those with higher levels of use (Rawson et al., 2015).
A dose-response relationship between PA and alcohol use was suggested by another study, reporting a 4% reduction in alcohol use for each extra exercising day.
No serious adverse events related to PA were reported in any of the studies.

Meta-analyses
Only two studies had outcome data on alcohol abstinence suitable for the meta-analysis. One other study (Giesen et al., 2016) reported relapse rates but the data were too small to incorporate.
One study (Donaghy, 1997) reported the number of people who relapsed back to alcohol use at the end of intervention. The other study (Sinyor et al., 1982) reported the abstinence rate from alcohol, based on which we calculated the number of people who relapsed. The results from these two studies were consistent, with overlapping confidence intervals.
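The conversion from a reported abstinence rate to a relapse count, and a between-arm comparison of relapse, can be sketched as follows. The review does not state which effect measure was used for this analysis, so the risk ratio with a log-scale 95% confidence interval is an assumption, and the function names and counts are hypothetical rather than taken from Donaghy (1997) or Sinyor et al. (1982).

```python
import math

def relapsed_from_abstinence(n, abstinence_rate):
    """Number who relapsed, given group size and the proportion abstinent."""
    return round(n * (1 - abstinence_rate))

def risk_ratio(events_tx, n_tx, events_ctl, n_ctl):
    """Risk ratio with a 95% CI; requires at least one event in each arm."""
    rr = (events_tx / n_tx) / (events_ctl / n_ctl)
    se_log = math.sqrt(1 / events_tx - 1 / n_tx + 1 / events_ctl - 1 / n_ctl)
    lower = math.exp(math.log(rr) - 1.96 * se_log)
    upper = math.exp(math.log(rr) + 1.96 * se_log)
    return rr, lower, upper

# Hypothetical arms: group size and proportion abstinent at end of intervention
relapsed_pa = relapsed_from_abstinence(40, 0.70)
relapsed_control = relapsed_from_abstinence(38, 0.50)
print(risk_ratio(relapsed_pa, 40, relapsed_control, 38))
```

With small trials the interval around the risk ratio is wide, so two studies can have overlapping intervals and still leave the pooled effect uncertain.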
The results from random-effect meta-analysis accounted for the between-study heterogeneity, indicating insufficient evidence for the effect of PA interventions on alcohol abstinence.

Quality of the evidence
Using the GRADE assessment of quality (Guyatt et al., 2008), the evidence for PA and the reduction of alcohol and other drug use was consistently downgraded due to study limitations, inconsistency of results, indirectness of evidence and imprecision, giving very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect. Further research is needed to draw any conclusions (Table 8).

Economic data
Whilst we had intended to locate economic evaluations with a view to conducting an analysis of economic data, no useable data were found in any of the included studies.

Discussion
This is the most comprehensive review of evidence on the effects of PA interventions for the prevention, reduction, and treatment of alcohol and other drug use. The broad scope ensured maximum coverage of what is a relatively under researched field, while maintaining a clear framework to help the reader identify how research links to specific questions and phases in developing an evidence-base for efficacy, effectiveness, and mechanisms.

Prevention
The concept of using PA to prevent the initiation of alcohol and other drug use was the least well researched area with only five studies identified and all taking place in the USA. This is likely due to the challenges and cost of designing experimental studies requiring a long duration of follow up to evidence any effect. The multi-component nature of interventions which largely were targeted at younger people in education added to the complexity, as isolating the specific impact of PA was problematic. Whilst the meta-analysis completed within this review suggests a modest favourable effect of PA on subsequent initiation of alcohol and other drug use outcomes, this should be treated cautiously due to the heterogeneity of the interventions. One study focussed on "at risk" youth and showed favourable outcomes (Collingwood et al., 1991), particularly among those who increased their levels of physical fitness. It may be that interventions focussed on those who are at risk of initiating alcohol and other drug use are more beneficial than those that are broader in their aim. More robust larger studies accurately describing changes in PA levels (perhaps alongside structured exercise programmes) and subsequent outcomes are needed with appropriate statistical analysis to single out the effects of PA from other possible influences on outcomes.
The framework shown in Fig. 1 highlights the limited research currently available on which PA related behaviours can influence uptake of alcohol and other drug use, while also laying out future research questions to be answered. For example, do structured exercise programmes and/or daily physical activity help to reduce the risk of initiating alcohol and other drug use and, if so, is there a dose response and what are the cognitive, affective and physiological mechanisms? Related questions emerge in terms of what interventions can have sustainable effects on behaviour, and how much support, and of what type, is needed to influence behaviour change processes.

Reduction
Research relating to PA and the reduction of alcohol and other drug use among those not seeking treatment nor diagnosed with a use disorder was found to lack sufficient consistency to draw conclusions relating to its effect, highlighted by only four studies included in the meta-analysis for one common outcome. The studies identified were of different designs, and of low quality with all being considered at high risk of bias. The majority of studies included college students which limits generalisability, and the heterogeneity of interventions limits any inference about specific effective components and design.
With reference to Fig. 1, there remains uncertainty about which PA related behaviours can help reduce alcohol and other drug use. There is insufficient data to address our secondary aims in terms of PA dose, mechanisms of change, and detail related to effective intervention components. There are indications that alcohol consumption increased during COVID-19 lockdown in some cultures (Da, Im, & Schiano, 2020) and future research may help to understand if those who were less sedentary and did more daily physical activity or exercise were less likely to increase alcohol consumption to harmful levels and if so, was there a dose response relationship and what mechanisms were involved? We may also learn more about if and how individuals developed strategies to use PA to self-regulate alcohol and substance use during lockdown, though these processes may be best understood through mixed methods.

Treatment
The most well researched area was PA in the treatment of alcohol and other drug use amongst those seeking treatment or diagnosed with a use disorder, yet conclusions are again difficult to make due to heterogeneity of study design, populations, interventions, and outcomes. Studies were a mix of efficacy and effectiveness studies and of generally poor quality (no studies were rated at low risk of bias). As no adverse events were reported, and no studies reported increases in levels of use, there is no indication that PA interventions pose any risk to people using alcohol and other drugs. With reference to our framework in Fig. 1, only four studies included details of a tailored motivational element of the intervention (i.e. one-to-one or group motivation sessions) (Abrantes et al., 2017; Brown et al., 2014; Brown, Prince, Minami, & Abrantes, 2016; Linke et al., 2019), rather than or in addition to a structured exercise programme. Such a tailored approach may offer those in the treatment setting greater scope to develop and use behaviour change skills to manage PA and an addiction when they are ready to make changes. In the studies that did report PA engagement during an intervention there appeared to be a dose-response relationship, which implies interventions need a stronger focus on overcoming barriers to being physically active, especially for those with higher levels of drug use. Among the within-subjects studies, only one did not report changes in levels of use at the end of the intervention; this was a yoga-based intervention with no data on adherence to PA (Shaffer et al., 1997), suggesting the modality and dose of PA may be important in producing positive outcomes. The one study with longer-term outcomes reported a return to pre-intervention levels of use. More understanding of how to create long-term changes in PA is needed, and consideration of this should be built into intervention design.
Only one RCT included appropriate power calculations to detect changes in the primary outcome (Trivedi et al., 2017) but failed to detect significant changes in levels of drug (stimulant) use. The intervention was prescriptive in nature with no tailored motivational support: participants were required to exercise regularly on a treadmill to complete a 'dose' of PA equivalent to the weekly guidelines of 150 min of MVPA per week. This suggests that empowering individuals to complete PA through the tailored approaches outlined above may have more success. There was also limited evidence to suggest PA may be more effective for people with lower levels of drug use, which may have implications for when PA interventions are introduced: they may be best suited to periods when levels of use are stable or under control. Studies varied in terms of setting, whether as an adjunct to an inpatient or outpatient programme, and no clear trend was evident to suggest how an intervention might be best placed. Considerable heterogeneity in study design, populations, interventions, and outcomes limits any firm conclusions on the effects of PA on alcohol and other drug use. The majority were pilot studies or exploratory in nature with speculative interpretation and conclusions.

[GRADE table fragment: 1399 (19); ⊕⊝⊝⊝ very low (a); effect is uncertain. (a) High and severe risk of bias found in most studies related to confounding, classification of interventions, deviation from intended interventions, reporting bias, and missing data.]
The treatment studies included in this review represent a wide variety of PA modalities and approaches, the majority of which have shown potentially promising effects on alcohol and other drug use outcomes. However, no studies included in the review included an objective measure of physical activity. Some studies did assess fitness change as an indicator of exercise adherence (and these studies had a tendency to report positive outcomes on alcohol and other drug use) but further research is needed that can quantify both sedentary behaviour and different intensities of PA.

Limitations
The range of literature we identified, in terms of study and intervention design, added to the challenge of conducting this review. We hope the way the manuscript is presented adds clarity, but we accept that the framework used and the focus on prevention, reduction and treatment may have involved oversimplification in describing which populations were involved, what the intervention and control conditions involved, and how the outcomes were defined. Despite the inclusion of numerous studies under each of the aims, meta-analyses were limited by the lack of comparable data reported in the literature, meaning any findings should be treated with caution. Because the review identified some studies which involved a multi-component intervention, we had to make some difficult decisions about whether or not to include the study in a review of the effects of PA on our outcomes of interest. As a result, some multicomponent interventions were excluded where PA was not deemed to be of sufficient focus or of sufficient intensity and participation was not evident.

Conclusions
This is the first rigorous systematic review to examine the effects of PA interventions for the prevention, reduction and treatment of alcohol and other drug use, and to determine if there is evidence for an optimal dose of PA and for any specific mechanisms involved in how PA influences alcohol and other drug use. Overall, the quality of evidence is limited, and only a few studies were adequately designed to provide any confidence in the findings. Firm conclusions are difficult to make due to the numerous limitations in the research identified. The review revealed that further research, with more rigorous approaches as outlined in this review, is needed to understand how to make PA interventions acceptable and feasible (in terms of delivery mode and dose) in the context of prevention, reduction and treatment, and to test the effects of such interventions on alcohol and other drug use in well-designed studies.

Declaration of competing interest
JN has received, through her University, research funding from Mundipharma Research Ltd and Camurus AB to study novel opioid pharmacotherapy delivery systems and nasal naloxone.