Wilderness adventure therapy effects on the mental health of youth participants

Adventure therapy offers a prevention, early intervention, and treatment modality for people with behavioural, psychological, and psychosocial issues. It can appeal to youth-at-risk who are often less responsive to traditional psychotherapeutic interventions. This study evaluated Wilderness Adventure Therapy (WAT) outcomes based on participants’ pre-program, post-program, and follow-up responses to self-report questionnaires. The sample consisted of 36 adolescent out-patients with mixed mental health issues who completed a 10-week, manualised WAT intervention. The overall short-term standardised mean effect size was small, positive, and statistically significant (0.26), with moderate, statistically significant improvements in psychological resilience and social self-esteem. Total short-term effects were within age-based adventure therapy meta-analytic benchmark 90% confidence intervals, except for the change in suicidality which was lower than the comparable benchmark. The short-term changes were retained at the three-month follow-up, except for family functioning (significant reduction) and suicidality (significant improvement). For participants in clinical ranges pre-program, there was a large, statistically significant reduction in depressive symptomology, and large to very large, statistically significant improvements in behavioural and emotional functioning. These changes were retained at the three-month follow-up. These findings indicate that WAT is as effective as traditional psychotherapy techniques for clinically symptomatic people. Future research utilising a comparison or wait-list control group, multiple sources of data, and a larger sample, could help to qualify and extend these findings. © 2016 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Engagement and treatment of adolescents is a key challenge for mental health clinicians. Mental health disorders affect a greater proportion of young people than other age groups (Department of Health and Ageing, 2013; World Health Organization, 2001), with 26% of 16-24 year olds in Australia experiencing a mental illness in the previous 12 months (Australian Bureau of Statistics [ABS], 2007). Teenagers are susceptible to vulnerabilities arising from social and cultural changes, including unstructured home environments, growth in single-parent families, and media saturated with sex, violence, and pleasure-seeking (Barrett & Ollendick, 2004; Dowd, Singer, & Wilson, 2006; Villani, 2001). Adolescents face a myriad of other challenges including learning problems, disengagement from education, family issues, homelessness, delinquency, substance abuse, and unemployment (Perkins & Borden, 2003).
Although there is a wide range of treatment options, most interventions for teenagers are based on approaches which were originally developed for adults (Crisp, 1997; Rutter et al., 2008). However, adolescents' needs differ substantially from adults' needs (Rutter & Taylor, 2005). Teenagers who have mental health issues are often hesitant to seek help and can be difficult to engage in traditional treatment modalities (Rickwood, Deane, & Wilson, 2007). Therapeutic approaches for teenagers should be developed with an understanding of adolescent developmental needs in order to address mental health problems in a way that decreases stigma and promotes growth in fundamental domains of competency and performance, responsibility, judgement, social orientation, motivation, and identity (Crisp, 2014). Teenagers can benefit from modulated exposure to risk, stress, and precipitants of mental health problems, thereby increasing resiliency and developing protective strategies against future problems (Crisp, 1997; Rutter et al., 2008). Adventure therapy is one such option to address at-risk teenagers' mental health problems (Schell, Cotton, & Luxmoore, 2012; Tucker, Javorski, Tracy, & Beale, 2012).

Adventure therapy
Adventure therapy uses experiential learning activities in outdoor environments for assessment and intervention at an individual and group level, in order to effect psychological and/or behavioural therapeutic change (Gass, Gillis, & Russell, 2012; Norton, Carpenter, & Pryor, 2015). Adventure therapy utilises an eclectic therapeutic approach, drawing on aspects of cognitive-behavioural, systemic, existential, psychodynamic, and occupational therapy (Association for Experiential Education, 2012). Adventure therapy may be used as a form of brief intervention or embedded in a broader case management approach. Adventure therapy empowers participants by providing fun and engaging activities that involve real obstacles which, although often appearing to be impossible to overcome, are attainable. Activities are sequenced for success in order to provide participants with a sense of self-efficacy and mastery. Adventure therapy activities can include problem-solving activities, ropes challenge courses, outdoor adventure activities (such as rock climbing, abseiling, rafting, caving, and bushwalking), and extended overnight expeditions involving backpacking, canoeing and rafting, ski-touring and/or snow camping. Meanings derived from participating in adventure therapy programs are intended to be incorporated back into the participant's individual and social world.
Although a relatively small field, research to date has shown adventure therapy to be effective in treating a range of behavioural and mental health problems. A recent meta-analysis of 197 studies of adventure therapy program outcomes (Bowen & Neill, 2013a) found that adventure therapy is moderately effective in facilitating positive short-term change in psychological, behavioural, emotional, and interpersonal domains and that these changes appear to be maintained in the longer term. The overall short-term adventure therapy standardised mean effect size (ES) was moderately positive and statistically significant (g = 0.47, p < 0.05), and larger than for alternative treatment (0.14) and no treatment (0.08) comparison groups. In this meta-analysis, adventure therapy participants also reported small, positive, and not statistically significant change during the lead-up period (0.09) and maintenance of the moderately positive short-term improvements during the follow-up period (0.03). Of the eight major outcome categories, clinical outcomes (e.g., anxiety, depression, emotional stability, locus of control, resilience) had the highest ES (0.50). The only significant moderator was participant age. An ES of approximately 0.5 is recommended as a benchmark for adventure therapy programs; ESs between 0.3 and 0.5 are more typical of programs for 9-17 year olds, and ESs between 0.5 and 0.7 are more typical of participants aged 18 years and over (see Bowen & Neill, 2013b for specific outcomes by age). Despite these promising results, there is a lack of well-established adventure therapy models that are supported by rigorous evaluations of their effectiveness (Newes, 2000).

Wilderness adventure therapy
Wilderness Adventure Therapy (WAT) is a clinical psychology treatment model developed by Simon Crisp (Crisp, 1997, 1998; Crisp, Noblet, & Hinch, 2004). The WAT model is based on principles of best practice service design and includes a systemic framework and theoretical paradigm, client psychological assessment, intake processes and treatment planning, group composition, psychological safety procedures, therapeutic group procedures, monitoring of client outcomes, therapist skill training, management of ethical issues, and research evaluation (Crisp, 1997, 1998). The WAT model emphasises development of social-emotional competencies and coping skills through group-based adventure experiences that are facilitated by a psychologist. The WAT model has undergone three phases of development:
1. 1992-2000: WAT was designed as a multi-systemic, group therapy intervention to treat adolescents with severe psychological, behavioural and psychiatric problems (Crisp & O'Donnell, 1998; Kingston, Poot, & Thomas, 2000). Between 1992 and 1996, 115 adolescents completed the program (Kingston, Poot, & Thomas, 1997).
2. From 2000: A WAT program was provided as a "stand-alone" out-patient treatment at the Barwon Health Adolescent Mental Health Service, Geelong, Victoria, Australia (Crisp, 2003) targeting a broad range of adolescent out-patients and their parents who required more intensive out-patient treatment than traditional approaches.
3. ... Community Health Service, Melbourne, Victoria, Australia, to work with local government high schools and community youth and family agencies and to provide early intervention and prevention, and to treat psychological, behavioural, and family-based problems in adolescents before they required referral to a clinical service.
Nine WAT programs were implemented in this phase, with 36 adolescents completing the WAT treatment: two early intervention programs involved Year 9 and 10 students at risk of educational non-completion or school failure, one program was for families with one or more adolescent children who had experienced substantial domestic violence that significantly affected the adolescent's and family's ability to function, one program was for female adolescents who had experienced sexual assault, one program was for female adolescents who had significant body image issues or who were in the early stages of an eating disorder, two programs were conducted for families with adolescents in the early stages of known or suspected substance abuse, and two programs involved families with adolescents who had evidence of significant disruption to parent-adolescent attachment. Since 2003, other WAT programs have been conducted in independent and government schools as community and early-intervention programs.
The WAT intervention is a 10-week, manualised, part-time program, which is typically facilitated by three WAT practitioners for six to eight participants. The manual includes protocols for implementing the program with different target groups, including early intervention for at-risk clients, community counselling where clients have already sought help and WAT is part of on-going case management, and clinical treatment that may be part of a multi-pronged approach within a comprehensive range of clinical services.
The WAT program has four components. Intake (Week 1) includes screening, assessment, engagement, orientation, and negotiation of client goals. Treatment (Weeks 2-9) involves seven day-based adventure activities (e.g., bushwalking, abseiling, cross country skiing, and white water rafting), a two-day overnight training expedition, and a five-day expedition. Parents, teachers, and support workers also participate in up to eight weekly indoor adventurous problem-solving activities incorporated within group therapy sessions. Termination (Week 10) includes a review of goals and unresolved needs/issues, identification of post-treatment goals and strategies, and enlisting of psycho-social supports. Follow-up includes liaison with other agencies, group reunion, and school or placement outreach follow-up. A sample program structure is included in Table 1.
There have been several previous evaluations of WAT programs (Crisp, 2003; Crisp & Aunger, 1998; Crisp & Noblet, 2001; Crisp & O'Donnell, 1998; Kingston et al., 1997). An evaluation of 101 clients during Phase 1 found statistically significant improvements in self-reported difficulties with social, attentional, and attitudinal problems, self-esteem, and task leadership skills (Kingston et al., 1997). There were non-significant improvements in sociability, appearance, family, school, emotions, parents, and crisis problems, achievement motivation, emotional control, social competence, time management, withdrawal, anxiety/depression, use of some coping strategies (e.g., self-blame, social action, ignoring, keeping problems secret, and relaxation), social supports, physical recreation, and hard work (Kingston et al., 1997). In addition, there were non-significant increases in not coping and somatic problems.
A case study of a 15-year-old girl (Susan) with psychological and social problems, who completed WAT during Phase 1, was provided by Crisp and Aunger (1998). Upon program completion, Susan showed increased self-confidence, personal insight, assertiveness, sociality, and clarity of future plans. At the two-week follow-up, Susan had returned to mainstream schooling with perfect attendance and psychometric evaluation showed an increase in coping skills, a reduction in reported emotional problems, and a slight increase in self-esteem. At the six-month follow-up, Susan was continuing to attend school full-time and was involved in a range of extra-curricular activities. Crisp and O'Donnell (1998) presented three other case vignettes of participants who completed WAT programs. Reported outcomes included an increased sense of hope, perseverance, problem-solving, self-sufficiency, willingness to access support, and returning to school.
An evaluation of WAT during Phase 1 (n = 75) found that post-treatment levels of symptoms were reduced but the differences were not statistically significant. However, at a five-year follow-up, mental health symptoms and the frequency of non-productive coping had significantly reduced.
Outcomes for 36 WAT Phase 2 participants have been evaluated previously. Overall results indicated a statistically and clinically significant reduction in mental illness symptoms (from clinical levels to non-clinical levels following the WAT intervention) which remained at three-month and two-year follow-ups. Furthermore, there were statistically significant improvements in general and social self-esteem following treatment, with non-significant improvements in life-threatening attitudes, productive and non-productive coping, social competence, school functioning, psychological resilience, and family relationships.
Crisp and Noblet (2001) evaluated a Phase 3 WAT program for eight clients who were at risk of school failure and experiencing mental health problems. Results indicated clinically significant improvements in mental health, behaviour problems, and life-threatening attitudes (i.e., future high-risk behaviour). Non-significant improvements were evident for expectation of receiving help and increased connectedness to, and trust in, peers, communication, and overall functioning.
Program evaluation results for 36 clients from six WAT Phase 3 programs who were identified as showing risk factors including school failure, poor body image and eating problems, substance abuse, being victims of sexual abuse or assault, and/or family dysfunction indicated a statistically and clinically significant reduction in mental illness symptoms which remained at follow-up (3 months later). Furthermore, there were statistically significant improvements in general and social self-esteem following treatment, and school self-esteem showed a delayed statistically significant improvement at follow-up. There were also non-statistically significant improvements in life-threatening attitudes, productive and non-productive coping, social competence and school functioning, psychological resilience, and family relationships.

Table 1 Example WAT program structure.
Week Activity
-4 Appraisal of referrals
-3 Contacts parents
-2 Parent and student information session; Student interviews
-1 Teacher

Therapeutic programs for youth that utilise innovative and non-traditional approaches often do so in isolation and with limited knowledge about how to maximise their effect. A critical task for adventure therapy program developers, and for advancing the adventure therapy movement as a whole, is the use of high quality program design along with research and evaluation. Although WAT Phase 1-3 programs have been evaluated, the data should be independently analysed by non-program staff, expressed as effect sizes and benchmarked against adventure therapy meta-analytic findings, and formally peer-reviewed. Due to the varying target groups and level of symptom severity of WAT Phase 1 and Phase 3 programs, as well as the small sample sizes, it was impractical to include data from these programs. Therefore, this study builds on previous WAT research by evaluating the participant outcomes for WAT Phase 2 programs using standardised mean effect sizes and compares the outcomes to adventure therapy meta-analytic findings. It was hypothesised that the WAT treatment would be associated with statistically significant short-term improvements in psychological and behavioural symptomatology, that these changes would be maintained at a three-month follow-up, and that the amount of change would be at least equivalent to comparable adventure therapy benchmarks.

Participants
There were 36 participants (21 females and 15 males) who were out-patients of a regional state-based adolescent mental health service in Victoria, Australia (Barwon Health) who required more intensive out-patient treatment than traditional approaches. Participants were aged between 12 and 18 years (M = 14.6; SD = 1.6). Participants completed one of six standardised WAT interventions during 2000 and 2001. The only other treatment option of equal or greater intensity was admission to a psychiatric in-patient ward.

Materials
Six self-report questionnaires, consisting of 226 closed-ended items, were completed by participants on up to three occasions: pre-program (Time 1), post-program (Time 2), and at a three-month follow-up (Time 3).

Resilience Questionnaire
The Resilience Questionnaire (RQ) is designed to measure psychological resilience (Crisp, 2001). It consists of 14 items. Four items are rated on a dichotomous scale (True (1 point) or False (2 points); e.g., "I can change things in my life if I really try."). The other 10 items are measured on a seven-point Likert scale. For example, the item "How close, or connected to any person your own age do you feel?" is rated from 1 (Very Close/Connected) to 7 (Not Very Close/Connected). A total score was calculated by adding the scores from the true-false and Likert scale items, ranging from 14 to 78. Higher scores signify lower resilience. For the current study, the internal consistency was strong (Cronbach's α = 0.83).
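The scoring rule described above can be sketched as follows. This is a minimal illustration only: the function name and the way responses are passed in are assumptions, not part of the published instrument.

```python
def score_resilience_questionnaire(true_false_items, likert_items):
    """Sum RQ responses into a total score (14-78; higher = lower resilience).

    true_false_items: four responses coded 1 (True) or 2 (False).
    likert_items: ten responses coded 1-7.
    """
    if len(true_false_items) != 4 or len(likert_items) != 10:
        raise ValueError("expected 4 true-false items and 10 Likert items")
    if any(value not in (1, 2) for value in true_false_items):
        raise ValueError("true-false items must be coded 1 or 2")
    if any(not 1 <= value <= 7 for value in likert_items):
        raise ValueError("Likert items must be coded 1 to 7")
    return sum(true_false_items) + sum(likert_items)
```

The minimum (4 × 1 + 10 × 1 = 14) and maximum (4 × 2 + 10 × 7 = 78) follow directly from the item codings, which matches the stated range.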

Beck Depression Inventory-II
The Beck Depression Inventory-II (BDI-II; Beck, Steer, & Brown, 1996) assesses severity of depressive symptomology. There are 21 items measured using a four-point ordinal scale (0 = I Do Not Feel Sad; 1 = I Feel Sad Much of the Time; 2 = I Am Sad All the Time; 3 = I Am So Sad or Unhappy That I Can't Stand It). Items are summed to form a total score that ranges from 0 to 63, with higher scores indicating greater depressive symptomology (Minimal = 0-13; Mild = 14-19; Moderate = 20-28; Severe = 29-63). The BDI-II has high internal consistency amongst a variety of populations (α = 0.91-0.93) and the one-week test-retest reliability is strong (r = 0.93; Antony & Barlow, 2010).
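As an illustration of the severity bands listed above, the following sketch sums 21 item responses and looks up the band. The names `BDI_II_BANDS` and `bdi_ii_severity` are hypothetical and used only for this example.

```python
# Severity bands as stated in the text: (lowest total, highest total, label).
BDI_II_BANDS = [
    (0, 13, "Minimal"),
    (14, 19, "Mild"),
    (20, 28, "Moderate"),
    (29, 63, "Severe"),
]


def bdi_ii_severity(item_scores):
    """Return (total score, severity label) for 21 items each scored 0-3."""
    if len(item_scores) != 21:
        raise ValueError("the BDI-II has 21 items")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item must be scored 0, 1, 2, or 3")
    total = sum(item_scores)
    for low, high, label in BDI_II_BANDS:
        if low <= total <= high:
            return total, label
    raise AssertionError("unreachable: totals always fall within 0-63")
```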

Youth Self-Report
The Youth Self-Report (YSR; Achenbach, 1991) assesses participants' level of behavioural and emotional functioning. There are 112 items measured on a three-point ordinal scale (0 = Not True, 1 = Somewhat or Sometimes True, 2 = Very True or Often True). The YSR comprises eight core syndrome scales: Withdrawn, Somatic Complaints, Anxious/Depressed, Social Problems, Thought Problems, Attention Problems, Aggressive Behaviour, and Delinquent Behaviour. The YSR also includes two higher order scales: Internalising (Withdrawn, Somatic Complaints, Anxious/Depressed) and Externalising (Aggressive Behaviour, Delinquent Behaviour) problems, and a Total score (includes all 8 core syndrome scales; Achenbach & Rescorla, 2001). Raw scores are converted to a T-score, with higher scores reflecting poorer behavioural and emotional functioning. T-scores 67 or higher for the syndrome scales and 60 or higher for the higher order scales are clinically significant (Achenbach, 1991). The YSR core syndrome scales have reasonable internal consistency amongst a variety of populations (α = 0.71-0.89). One-week test-retest coefficients are generally good, with all scales being above 0.70, except for Withdrawn/Depressed. Seven-month test-retest correlations are generally in the 0.50 range (Achenbach & Rescorla, 2001).
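The clinical cut-offs above lend themselves to a simple classification step. The sketch below is illustrative only: the function names, the participant identifiers, and the `scale_type` labels are assumptions introduced for the example.

```python
# Cut-offs as stated in the text (T-scores at or above the value are clinical).
CLINICAL_CUTOFFS = {"syndrome": 67, "higher_order": 60}


def is_clinical(t_score, scale_type):
    """True if a YSR T-score is in the clinical range for its scale type."""
    return t_score >= CLINICAL_CUTOFFS[scale_type]


def split_by_clinical_range(t_scores, scale_type):
    """Split {participant_id: T-score} into (clinical, non_clinical) id lists."""
    clinical = [pid for pid, t in t_scores.items() if is_clinical(t, scale_type)]
    non_clinical = [pid for pid, t in t_scores.items() if not is_clinical(t, scale_type)]
    return clinical, non_clinical
```

For example, `split_by_clinical_range({"p1": 72, "p2": 58}, "higher_order")` places the hypothetical participant `p1` in the clinical group and `p2` in the non-clinical group.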

CORE Family Functioning Questionnaire
The CORE Family Functioning Questionnaire (Author Unknown, 2001) is designed to assess family functioning. The CORE FFQ consists of five items measured on a five-point Likert scale ranging from 1 (Very Well) to 5 (Poor). A sample item is "How well has your family communicated together over the past month?". Total scores can range from 5 to 25, with higher scores reflecting poorer family functioning. For the current study, the internal consistency was 0.92.

Life Attitudes Schedule - Short Form
The Life Attitudes Schedule - Short Form (LAS-SF; Rohde, Lewinsohn, Seeley, & Langhinrichsen-Rohling, 1996) measures adolescents' suicide-proneness. The dichotomous response options for the 24 items (e.g., "I avoid unnecessary risks") were False (0) or True (1). The 12 positively worded items were reverse-scored and summed with the 12 negatively worded items to create a total score that can range from 0 to 24, with higher scores indicating greater risk of suicidality. The LAS-SF has reasonable internal consistency (KR-20 = 0.80), and the 30-day test-retest reliability was moderate (0.73; Rohde et al., 2003).
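The reverse-scoring step can be sketched as below. Which 12 items are positively worded is not stated here, so the caller supplies their positions; the function name and the zero-based indexing are assumptions made for illustration.

```python
def score_las_sf(responses, positive_items):
    """Total LAS-SF score (0-24; higher = greater suicide-proneness).

    responses: 24 values coded 0 (False) or 1 (True).
    positive_items: zero-based positions of the 12 positively worded items
        (hypothetical; take these from the instrument's scoring key).
    """
    positive = set(positive_items)
    if len(responses) != 24:
        raise ValueError("the LAS-SF has 24 items")
    if len(positive) != 12 or not all(0 <= i < 24 for i in positive):
        raise ValueError("expected 12 distinct positions between 0 and 23")
    if any(value not in (0, 1) for value in responses):
        raise ValueError("responses must be coded 0 or 1")
    total = 0
    for position, value in enumerate(responses):
        # Reverse-score positively worded items so that 1 always signals risk.
        total += (1 - value) if position in positive else value
    return total
```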

Procedure
The questionnaires were administered in a standardised manner on a one-to-one basis prior to the first program session (Time 1), following the final program session (Time 2), and three months after the completion of the program (Time 3). Additional assistance and/or verbal administration were provided when required (e.g., poor attention or literacy skills). On average, it took 45-60 min for participants to complete the six questionnaires on each occasion.

Analysis
Thirty-six participants provided pre-program (Time 1; T1), post-program (Time 2; T2), and follow-up (Time 3; T3) data. However, 24% of the data values were missing due to participant non-completion of some of the survey questions, with lower completion rates for measures collected in the latter half of the test battery. Thus, multiple imputation was used to replace the missing data. Multiple imputation uses a regression-based procedure to generate multiple copies of the data set, each of which contains different estimates of the missing values (Enders, 2010). Based on Graham's (2012) recommendation, the Markov-Chain Monte Carlo (MCMC) method in SPSS 22 was used to generate 40 imputed data sets, with 50 iterations of MCMC between each imputation. Methodologists regard multiple imputation as a "state of the art" missing data technique because it improves the accuracy and the power of the analyses relative to other missing data handling methods (Schafer & Graham, 2002).
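The text does not say how results were combined across the 40 imputed data sets. A common choice is Rubin's rules, sketched below as an assumption rather than a description of the authors' procedure: the pooled estimate is the mean of the per-data-set estimates, and the total variance adds the between-imputation variance (inflated by 1 + 1/m) to the average within-imputation variance. The function name is hypothetical.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine per-imputation estimates and their variances with Rubin's rules.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputed data sets")
    pooled = mean(estimates)
    within = mean(variances)          # average sampling variance
    between = variance(estimates)     # sample variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total
```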
Scale means were reflected, where necessary, so that increased scores over time signified improvement. Short-term (T1-T2) and follow-up (T2-T3) changes were investigated using descriptive statistics and standardised mean ESs (Hedges' g). A commonly referred to rule of thumb for interpreting standardised mean ESs is 0.20 (small), 0.50 (medium), and 0.80 (large; Cohen, 1988). Ninety percent confidence intervals (CIs) for ESs are also reported. If an ES CI excludes zero, then the ES is statistically significant (Ellis, 2010). However, ES CIs should be interpreted with caution as power for a two-tailed test was estimated to be 47%, based on the overall average ES (0.26), a sample size of 36, and a 90% confidence level. ESs are also expressed as estimated percentages of participants who improved, using the Binomial Effect Size Display (BESD; Randolph & Edmondson, 2005) and Cohen's (1988) U3. Using the BESD, an effect size of 0.2 is equivalent to a 10% increase in the outcome of interest, whilst an effect size of 0.4 is equivalent to a 20% increase in the outcome of interest. Using Cohen's U3, an effect size of 0.2 is equivalent to 58% of participants who received treatment being better off than an equivalent group who did not participate, whilst an effect size of 0.4 means that 66% of participants who received treatment are likely to be better off than an equivalent group who did not participate.
Participants were an out-patient group of adolescents with diverse reasons for referral, thus it was important to consider the outcomes in clinical terms, that is, the presence or absence of pathology (Kazdin, 1999). Additional analyses (Glasser, 2014) were conducted for the two questionnaires (BDI-II and YSR) in which clinical cut-offs were available. Short-term and follow-up changes for participants who reported clinical and non-clinical levels of depression and behavioural and emotional dysfunction pre-program were investigated using descriptive statistics and standardised mean ESs. Short-term and follow-up descriptive statistics and ESs were also calculated for the most severe symptom for each participant, as indicated by the most elevated subscale on the YSR for each participant before treatment.
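The quantities named in this paragraph can be sketched with standard textbook formulas. The text does not give the exact formulas used, so everything below is an assumption: a pooled-SD Hedges' g with the usual small-sample correction, a large-sample standard error for the CI, the d-to-r conversion r = d / sqrt(d² + 4) for the BESD, the standard normal CDF for U3, and a normal approximation for power. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardised mean change from time 1 to time 2, bias-corrected."""
    df = n_1 + n_2 - 2
    pooled_sd = sqrt(((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df)
    d = (mean_2 - mean_1) / pooled_sd
    return d * (1 - 3 / (4 * df - 1))


def es_confidence_interval(es, n_1, n_2, level=0.90):
    """Approximate CI for an ES; significant if the interval excludes zero."""
    se = sqrt((n_1 + n_2) / (n_1 * n_2) + es ** 2 / (2 * (n_1 + n_2)))
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return es - z * se, es + z * se


def besd_percent(es):
    """BESD: ES converted to a correlation, read as a percentage difference."""
    return 100 * es / sqrt(es ** 2 + 4)


def cohens_u3_percent(es):
    """Cohen's U3: percentage of the comparison distribution below the treated mean."""
    return 100 * STD_NORMAL.cdf(es)


def approx_power(es, n, level=0.90):
    """Normal-approximation power of a two-tailed one-sample test."""
    z_crit = STD_NORMAL.inv_cdf(0.5 + level / 2)
    shift = es * sqrt(n)
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)
```

Under these assumptions, `besd_percent` and `cohens_u3_percent` round to the 10%/20% and 58%/66% figures quoted above for effect sizes of 0.2 and 0.4, and `approx_power(0.26, 36)` is close to the stated 47%. Values from `hedges_g` and `es_confidence_interval` may differ slightly from the reported ones depending on the exact variant used.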
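Selecting each participant's most severe symptom amounts to taking the highest-scoring YSR subscale at pre-program. The sketch below assumes subscale T-scores held in a dictionary keyed by subscale name; the function name and data layout are hypothetical, and ties resolve to the first subscale listed.

```python
def most_elevated_subscale(t_scores):
    """Return (subscale name, T-score) for the highest pre-program T-score."""
    if not t_scores:
        raise ValueError("no subscale scores supplied")
    name = max(t_scores, key=t_scores.get)
    return name, t_scores[name]
```

The change over time for that one subscale would then be tracked for each participant at post-program and follow-up.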

Overall
Descriptive statistics for participant responses to each of the six questionnaires at T1-T3 are presented in Tables 2-4, along with standardised mean ESs with CIs for short-term changes (T1-T2) and during the follow-up period (T2-T3). Overall and questionnaire total results are shown in Table 2. Subscale results for the CSEI (self-esteem) are shown in Table 3 and the YSR (behavioural and emotional functioning) subscale results for participants with clinical and non-clinical range pre-program scores are shown in Table 4. Tables 2 and 3 also show short-term ES adventure therapy benchmarks for 10-17 year olds from Bowen and Neill's (2013b) meta-analysis. Statistically significant short-term ESs and meta-analytic benchmarks for all participants and clinical range participants are presented in Fig. 1.

Table 3 Descriptive statistics for T1, T2, and T3 self-esteem (CSEI) subscales, with effect sizes, 90% confidence intervals for WAT participants, and comparative benchmarks (N = 36).

The overall short-term ES for WAT program participants was small, positive, and statistically significant (0.26), which represents a 13% overall improvement and is akin to 60% of participants reporting improvements at the end of the program. The total short-term ESs were statistically significant for two of the questionnaires (p < 0.10; RQ (resilience) ES = 0.49; BDI-II (depression) ES = 0.46) and non-significant for the other four questionnaires (p > 0.10; YSR (behavioural and emotional functioning) ES = 0.36; CSEI (self-esteem) ES = 0.20; CORE FFQ (family functioning) ES = 0.12; LAS-SF (suicide-proneness) ES = -0.06; see Table 2).
The overall follow-up period (T2-T3) ES was very small, negative, and not statistically significant (-0.06), representing a 3% loss, and thus indicating overall retention of the short-term gains. During the follow-up period, the total ES was statistically significant for one questionnaire (suicide-proneness ES = 0.43) and not statistically significant for the other five measures (family functioning ES = -0.58; resilience ES = -0.30; depression ES = 0.00; behavioural and emotional functioning ES = 0.03; and self-esteem ES = 0.06; see Table 2). The follow-up result for suicide-proneness represents a 21% improvement and is akin to 67% of participants reporting improvements in suicidality during the three-month follow-up period.

Self-esteem
The results for the CSEI (self-esteem) subscales are reported in Table 3. The short-term ESs were statistically significant for one of the subscales (Social ES = 0.40) and non-significant for the other three subscales (Home/Parents ES = 0.34; General ES = 0.21; School/Academic ES = -0.18). During the follow-up period, the ES for one of the subscales was statistically significant (General ES = 0.39), while the ESs for the other three subscales were non-significant (School/Academic ES = 0.08; Social ES = -0.03; Home/Parents ES = -0.22).

Clinical participants
Twenty-three participants reported clinical levels of depressive symptoms pre-program, as measured by the BDI-II (T1 M = 34.03, SD = 12.84; Severe range; see Fig. 2). Means are reported on the reflected scale, so higher values indicate fewer depressive symptoms. For these participants, the short-term change was large, positive, and statistically significant (ES = 0.80; 37% change; CI = 0.30:1.31; T2 M = 44.19, SD = 12.45; Mild range). Change during the follow-up period was very small, positive, and not statistically significant (0.03; 1% change; CI = -0.46:0.51; T3 M = 44.51, SD = 11.31; Mild range), indicating retention of the short-term gains. The average short-term effect is akin to 79% of clinical range participants reporting improvements in mental health at the end of the program and at follow-up.

Non-clinical participants
Thirteen participants reported non-clinical levels of depressive symptoms pre-program (T1 M = 57.00, SD = 3.99; Minimal range; see Fig. 2). For these participants, the short-term ES was small, positive, and not statistically significant (ES = 0.26; 13% change; CI = -0.39:0.91; T2 M = 58.1; Minimal range). During the follow-up period for these participants, the ES was small, positive, and not statistically significant (ES = 0.22; 11% change; CI = -0.42:0.87; T3 M = 59.1; Minimal range). Thus, for these non-clinical participants, there was small, positive and non-significant change from pre-program to post-program, and from post-program to follow-up.

Clinical participants: total
Twenty-six participants reported clinical levels of behavioural and emotional dysfunction pre-program, as measured by the YSR (see Table 4). For these participants, the overall short-term ES was large, positive, and statistically significant (0.70; 33% change). The follow-up period ES was very small, positive, and not statistically significant (0.02; 1% change), indicating retention of the short-term gains. The average overall short-term effect is akin to 76% of clinical range participants reporting improvements in mental health at the end of the program and at follow-up.

Non-clinical participants: total
Ten participants reported non-clinical levels of overall behavioural and emotional dysfunction pre-program (see Table 4). For these participants, the overall short-term ES was small, negative, and not statistically significant (-0.16; -8% change). During the follow-up period, the ES was small, positive, and not statistically significant (0.11; 4% change). Thus, for these non-clinical participants, there was minimal change from pre-program to post-program and from post-program to follow-up.

Clinical participants: sub-scales
The short-term ES for the 26 participants who reported clinical levels of Internalising problems was large, positive, and statistically significant (0.82; 38% change) and for the 21 participants who reported clinical levels of Externalising problems, the ES was moderate, positive, and not statistically significant (0.42; 21% change). During the follow-up period, the change in Internalising and Externalising problems was very small, positive, and not statistically significant (ES = 0.12; 6% change).
Analyses of changes for participants with clinical-range pre-program scores for each of the YSR subscales found large to very large statistically significant short-term ESs for five subscales (Thought Problems ES = 1.73, n = 8; Somatic Complaints ES = 1.36, n = 11; Anxious/Depressed ES = 0.84, n = 23; Aggressive Behaviour ES = 0.82, n = 10; Social Problems ES = 0.70, n = 18) and three of the subscales were non-significant (Attention Problems ES = 0.59, n = 16; Withdrawn ES = 0.59, n = 7; Delinquent Behaviour ES = 0.38, n = 15; see Fig. 1). The YSR subscale ESs during the follow-up period were non-significant and ranged from moderate and positive (Delinquent Behaviour, ES = 0.48; 23% change; n = 16) to moderate and negative (Thought Problems, ES = -0.56; 27% change; n = 8). Thus, there was clinically significant change for those within the clinical ranges for five out of eight YSR subscales (internalising problems, social problems, aggressive behaviour, anxiety/depression, somatic complaints and thought problems), which were retained at follow-up.

Non-clinical participants: sub-scales
The short-term ES for participants who reported non-clinical levels of Internalising and/or Externalising problems was moderate, negative, and not statistically significant (ES = −0.36; 18% change; n = 10 and ES = −0.32; 16% change; n = 15, respectively). During the follow-up period, the change in Internalising problems was very small, positive, and not statistically significant (ES = 0.02; 1% change), whilst the change in Externalising problems was small, negative, and not statistically significant (ES = −0.13; 1% change). The eight core syndrome subscale short-term ESs were non-significant and ranged from small and positive (Somatic Complaints, ES = 0.16; 8% change; n = 25) to small and negative (Anxious/Depressed and Thought Problems, ES = −0.18; 9% change; n = 13 and 28, respectively). The eight core syndrome subscale ESs during the follow-up period were non-significant and ranged from small and positive (Withdrawn, ES = 0.23; 11% change; n = 29) to small and negative (Somatic Complaints, ES = −0.25; 12% change; n = 25). Thus, for these non-clinical participants, there was minimal change from pre-program to post-program and from post-program to follow-up.

Severest symptoms
Short-term and follow-up ESs were also calculated for the most severe symptom for each participant, as indicated by the most elevated subscale on the YSR for each of the 36 participants before treatment (T1 M = 24.58, SD = 12.34; N = 36; clinical range; see Fig. 3). The short-term ES for the most severe symptoms was moderate, positive, and statistically significant (ES = 0.57; 27% change; CI = 0.18:0.97; T2 M = 32.29, SD = 14.55; clinical range). The follow-up ES was very small, negative, and not statistically significant (ES = −0.02; 1% change; CI = −0.40:0.37; T3 M = 32.06, SD = 14.24; clinical range), indicating retention of the short-term gains. The average short-term ES of 0.57 is akin to 72% of program participants reporting short-term improvements in the most problematic aspect of their behavioural and emotional functioning.
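The step from an ES of 0.57 to "72% of participants" can be illustrated with a short calculation. The sketch below assumes, rather than confirms from the source, that the percentage corresponds to Cohen's U3: the value of the standard normal cumulative distribution function at the ES, which holds only for normally distributed scores with equal variance at both time points. The function name `cohen_u3` is hypothetical and introduced here for illustration only.

```python
import math


def cohen_u3(effect_size: float) -> float:
    """Standard normal CDF evaluated at a standardised mean effect size.

    Under the (assumed) normal, equal-variance model, this is the share of
    the post-program distribution that lies beyond the pre-program mean.
    """
    return 0.5 * (1.0 + math.erf(effect_size / math.sqrt(2.0)))


# Effect sizes reported for the most severe symptom.
short_term_es = 0.57
follow_up_es = -0.02

print(f"Short-term U3: {cohen_u3(short_term_es):.0%}")  # 72%
print(f"Follow-up U3:  {cohen_u3(follow_up_es):.0%}")
```

Under these assumptions, an ES of zero maps to 50%, so the follow-up value close to 50% is in line with the reading that little further change occurred after the program ended.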

Discussion
This study examined the short- and longer-term changes in WAT participants' psychosocial, behavioural, and psychological functioning. These changes are interpreted here in the context of adventure therapy and psychotherapeutic meta-analytic outcome research. Lessons learned and limitations of the present study are also considered.

Overall short-term effect
This study provides mixed evidence about the effectiveness of WAT as a psychotherapeutic treatment. Overall, there was a small, positive, statistically significant short-term impact on the measured outcomes (ES = 0.26). This result is similar to meta-analytic results for 10-17-year-old adventure therapy participants (Bowen & Neill, 2013b), but is lower than findings from psychotherapy outcome research. There were moderate, statistically significant, short-term improvements in participants' psychological resilience and depression, moderate, non-statistically significant improvements in behavioural and emotional functioning, and small, non-statistically significant improvements in self-esteem and family functioning. There was little to no evidence of short-term improvements in suicidality. All the short-term effects were within the adventure therapy age-based meta-analytic benchmark CIs, except for suicidality, which was lower than the comparable benchmark.

Overall follow-up effect
The overall follow-up ES was very small, negative, and not significant (ES = −0.06), indicating no substantial change during the three-month follow-up period. This finding is consistent with the very small, positive, non-statistically significant follow-up ES for adventure therapy (ES = 0.03) reported by Bowen and Neill (2013a). The follow-up effect was not significant for any of the questionnaire totals or subscales, with the exception of family functioning, for which participants reported reduced functioning, and suicide-proneness, which decreased. As family functioning can be a significant factor in the development and maintenance of mental health disorders as well as social and school functioning (Greenberg & Lippold, 2013; Patel, Flisher, Hetrick, & McGorry, 2007), future WAT interventions could consider using a multi-family group format and/or explicitly aiming to teach effective family functioning strategies as an overall treatment goal. As adolescent suicide is a significant public health concern (Sakinofsky et al., 2007), the statistically significant reduction in suicidality at follow-up is noteworthy. The delayed change in suicidality may reflect a process that begins with improvements in resilience and levels of depressive symptoms that take time to generalise to suicide-proneness. This finding may also suggest that reductions in suicidality and self-harm risk take several months to emerge. Adventure therapy programs such as WAT could also consider offering one or more follow-up "booster" sessions to help generalise and integrate changes.

Effect on self-esteem
The short-term ES for Social self-esteem was moderate, positive, and statistically significant. Self-esteem subscale effects for Social, Home/Parents, and General self-esteem were moderately positive and within the expected age-based benchmark CI, whereas the School/Academic effect was lower than expected. The follow-up effect for each of the self-esteem subscales was not significant, with the exception of General self-esteem, for which participants reported a significant increase. The delayed change in General self-esteem may reflect a process that begins with improvements in social self-esteem.

Effect on depression
The overall short-term effect on depression was small and not significant; however, the short-term effect for the 23 participants who reported clinical levels of depressive symptoms pre-program was large, positive, and statistically significant, and this was retained in the longer-term. Consistent with a previous evaluation of WAT, the pre-program mean for these participants was in the Severe range, moved to the Mild range by the end of the program, and remained in the Mild range at the three-month follow-up. There were no statistically significant short- or longer-term changes in depressive symptoms for the 13 participants who were not in the clinical range pre-program. Thus, it seems that clients with depressive symptoms respond well, with clinically meaningful reductions in symptoms. Such magnitude of benefit appears comparable to the most efficacious treatments reported in the literature (Klein et al., 2007; Michael & Crowley, 2002; Singh & Reece, 2014).

Effect on behavioural and emotional functioning
The overall short-term effect for the 26 participants who reported clinical levels of behavioural and emotional dysfunction pre-program was large, positive, and statistically significant, and was retained in the longer-term. Short-term effects for participants who reported clinical levels of symptoms were positive, large, and statistically significant for 6 out of the 10 YSR subscales (Social Problems, Aggressive Behaviour, Internalising Problems, Anxious/Depressed, Somatic Complaints, Thought Problems). These short-term changes were retained at follow-up. At the three-month follow-up, 7 out of the 10 subscale means had moved out of the clinical range. This result is important as it shows that participants experienced clinically meaningful reductions in symptoms. For the 10 participants who reported non-clinical levels of behavioural and emotional dysfunction pre-program, there were no statistically significant short- or longer-term changes. All participants who were in the non-clinical range pre-program remained in the non-clinical range for all YSR subscales at the three-month follow-up.
The short-term WAT program effect on the most severe area of participants' symptoms was moderate, positive, and statistically significant and this change was maintained in the longer-term. This result is important as it shows that in the area of greatest need, clients experienced reduced symptoms. Although the mean T-score of the most severe area of participants' symptoms improved from pre-program to the three-month follow-up, it remained in the clinical range.

Summary of effects
Overall, these findings suggest that WAT participants experienced a small, positive, statistically significant overall improvement in psychosocial functioning, with moderate, positive, statistically significant improvements in psychological resilience, depression, and social self-esteem, and non-significant improvements in psychological, emotional, and behavioural functioning. For the most part, the changes appear to have been retained at a three-month follow-up. Although there was no short-term change in suicidality, there was a moderate, statistically significant longer-term reduction in suicidality.
As there were diverse presenting issues, additional analyses focused on participants who were in clinical ranges pre-program. These analyses revealed large, positive, and statistically significant improvements in depressive symptomology and behavioural and emotional functioning, which were retained at a three-month follow-up. In comparison, participants who were in the non-clinical range pre-program, for the most part, experienced relatively little short- or longer-term change.

Lessons learned
The WAT program is an experiential, adventure activity-based approach to prevention, early intervention, and treatment for adolescents, delivered within a case management framework. Adventure therapy is not a panacea, but it can be useful in a variety of settings and for a broad spectrum of clients. Overall, there were statistically significant short-term improvements in resilience, social self-esteem and depression, and statistically significant improvements in the follow-up period for suicidality and general self-esteem. These results suggest that WAT can help adolescents to improve dysfunctional beliefs and attitudes, maladaptive behaviours, coping strategies, inadequate problem-solving methods, and to develop greater resilience to overcome risks and avoid negative outcomes in the future. However, the short-term change for academic self-esteem, when compared to the relevant meta-analytic benchmark, was lower than expected. Thus, the WAT program may be less well suited to improving adolescents' academic self-esteem.
Importantly, participants in clinical ranges pre-program reported clinically and statistically significant improvements in depression, internalising and externalising problems, somatic complaints, anxiety/depression, withdrawal, social problems, thought problems, and overall behavioural and emotional functioning. These results suggest that WAT may be especially well suited to treating participants with clinical psychological symptoms. However, given the notable differences in results between participants in clinical and non-clinical ranges, it may be that WAT is an appropriate treatment modality for participants with clinical symptoms, but less well suited to prevention and early intervention. Alternatively, it may be that the evaluation methodology was not sufficiently targeted at indicators of efficacy with regard to prevention and early intervention.
The statistically significant improvements in suicidality and general self-esteem during the follow-up period may reflect a process that begins with improvements in resilience, alleviation of depressive symptoms, and enhanced social self-esteem but that takes time to generalise to suicide-proneness and general self-esteem. Further research could investigate the reasons for, and mechanisms of, these delayed changes, as well as effects experienced by participants at even later post-program timepoints.
Despite the promise of adventure therapy and the WAT model, more in-depth and rigorous program evaluation could be considered. This could take the form of clinical trials of adventure therapy programs tailored to homogeneous client groups (e.g., depressed or conduct-disordered adolescents) that use quasi- or fully-experimental designs, including wait-list control groups or cross-over designs with conventional treatments such as CBT. It could also be helpful to investigate which components of adventure therapy programs are most effective and which components could be improved. For example, exit or follow-up interviews with participants could be conducted to systematically document and analyse participants' responses to each of the program components. Additionally, interviews could investigate both intended and unintended effects of programs in order to gain a more holistic understanding of the obtained benefits. Case studies could also be informative in this respect, especially longitudinal case studies to further investigate longer-term changes. Future adventure therapy interventions could also be informed by investigating the relative benefits of adventure therapy for different client types, presenting problems, and other individual differences.

Limitations
This study found that WAT programs are reasonably effective; however, several limitations should be considered, including the evaluation design, reliance on self-reported data, small sample size, regression to the mean, missing data, and use of non-validated questionnaires.
There was no control or comparison group, so conclusions about causality cannot be drawn. Use of a comparison and/or wait-list control group could be useful in future evaluations, and would be necessary to demonstrate the effectiveness of WAT interventions. Inclusion of comparison and/or control groups would also help to deal with the potential for regression to the mean (Barnett, van der Pols, & Dobson, 2005), given that participants in this study were pre-selected as being at-risk. Additionally, future studies of this kind should consider including multiple long-term follow-up data collection points.
This study relied exclusively on empirical, self-reported data. Adventure therapy program evaluation studies could be strengthened by triangulating self-reported data with interviews with participants, and ratings and/or interviews with observers such as parents and teachers. Incorporation of other existing data, such as school attendance and behaviour records may also be helpful.
The small sample size (N = 36) limited the current study's statistical power. Future studies with larger sample sizes would help to validate the findings of this study. Utilising a consistent, minimal set of measurement tools for every participant would allow for easier integration into larger data sets.
In applied, longitudinal research, missing data is inevitable. The current study is no exception, with only a small percentage of participants completing all measures at each time point. Thus, multiple imputation was used and can be recommended for use in future adventure therapy research.
Two non-validated questionnaires (the RQ for resilience and the CORE FFQ for family functioning) were used, potentially limiting the reliability and validity of findings. Where possible, existing psychometrically validated assessment tools should be used and the psychometric properties of instrumentation should be reported. A task for future adventure therapy research is the development of purpose-built, multi-dimensional assessment tools using the best available psychometric techniques.

Conclusion
The current study contributes to the research literature about psychotherapeutic effects of adventure therapy treatment for adolescents by evaluating self-reported outcomes for 36 adolescent WAT program participants. Overall, results indicated small, statistically significant improvements (ES = 0.26), which are within the expected range of effects for adventure therapy programs with similar target groups. Large, positive, statistically significant changes were evident for participants in clinical ranges for depression and behavioural and emotional functioning. Importantly, the changes appear to have been retained at a three-month follow-up. These findings indicate that WAT offers a potentially viable alternative treatment modality to more traditional psychotherapeutic approaches for youth-at-risk with clinically significant symptoms. More in-depth investigation using triangulated data, qualitative and quantitative data, evaluation of specific program components, comparison or wait-list control groups, and a larger sample would help to better understand the obtained benefits (intended and unintended), as well as what works, how it works, and what could be improved.