A systematic review of interventions targeting physical activity and/or healthy eating behaviours in adolescents: practice and training

ABSTRACT Despite the many health benefits of physical activity (PA) and healthy eating (HE), most adolescents do not meet current guidelines, which poses future health risks. This review aimed to (1) identify whether adolescent PA and HE interventions show promise in promoting behaviour change and maintenance, (2) identify which behaviour change techniques (BCTs) are associated with promising interventions, and (3) explore the optimal approaches to training deliverers of adolescent PA/HE interventions. Nine databases were searched for randomised controlled, or quasi-experimental, trials targeting 10–19-year-olds, with a primary aim to increase PA/HE, measured at baseline and at least six months post-intervention, in addition to papers reporting training of deliverers of adolescent PA/HE interventions. Included were seven PA studies, three HE studies and four studies targeting both, with two training papers. For PA studies, two were promising post-intervention with two promising BCTs, and five were promising for maintenance with two promising BCTs. For HE studies, three were promising at post-intervention and four at maintenance, both with four promising BCTs. There is preliminary evidence that interventions support adolescents to improve their PA and HE behaviours over a period of at least six months.


Introduction
Current UK guidelines advise that adolescents undertake 60 min of moderate-to-vigorous physical activity (MVPA) every day (UK Chief Medical Officers, 2019). However, Sport England (2019) found that only 43% of 11-16-year-olds are meeting this guideline, with 35% doing less than 30 min per day, and with boys more active than girls (49% vs 42%, respectively, meeting the guideline). During adolescence there is a decline in physical activity (PA; Brooke et al., 2016; Ortega et al., 2013). This could have negative consequences given that rates of PA during adolescence tend to continue, or decline, into adulthood (Corder et al., 2019; Hayes et al., 2019), and low levels of PA during adulthood put individuals at risk of cognitive impairment, depression, heart disease, type 2 diabetes and certain cancers (Physical Activity Guidelines Advisory Committee, 2018). PA during adolescence offers physical health benefits such as better bone strength (Christoffersen et al., 2015) and metabolism (Bell et al., 2018), in addition to positive effects on mental health including fewer depressive symptoms (Rothon et al., 2010), better executive functioning (Vanhelst et al., 2016) and higher self-esteem (Kristjánsson et al., 2010).
In addition to declining rates of PA, adolescents' diet quality also tends to deteriorate (Lytle et al., 2000) as they start making their own dietary decisions (Wang & Fielding-Singh, 2018). Healthy diets are considered to contain a balance of all the food groups, including the consumption of a minimum number of fruits and vegetables per day, in addition to the limited intake of free sugars, total sugars, fats and salt (NHS, 2019b; World Health Organisation, 2020). A healthy diet provides protection against some health conditions including cancer (Key et al., 2020; Liese et al., 2015) and cardiovascular disease (Hartley et al., 2013), in addition to being required for optimal growth and development during adolescence (Salam et al., 2016). One measure of a healthy diet is the intake of fruit and vegetables, with current UK guidelines advising the consumption of at least five different portions per day (NHS, 2019a). However, data shows that only 8% of 11-18-year-olds are meeting this recommendation (Public Health England, 2018), a figure which has remained quite stable for the last nine years (Bates et al., 2014, 2016, 2019; Public Health England, 2018). As with PA, eating behaviours formed during adolescence can also continue into adulthood (Movassagh et al., 2017).
With increasing autonomy and independence from parents, adolescence is an opportune time to intervene and support the development of healthful behaviours. Some reviews have concluded that school-based PA interventions are effective (Carlin et al., 2016; Metcalf et al., 2012). Others have concluded that there is limited (Dobbins et al., 2013; Hynynen et al., 2016) or no evidence for effective school-only interventions, but strong evidence for school-based interventions which actively involve the family (Van Sluijs et al., 2007). Some reviews have found that effectiveness is limited to PA conducted during school hours and does not transfer to leisure time PA (De Meester et al., 2009), while others have found that leisure time PA is improved through school-based interventions (Kriemler et al., 2011).
Fewer reviews have been conducted on HE interventions, and these have tended to combine children and adolescents (Chaudhary et al., 2020; Racey et al., 2016), or adolescents and young adults (Chau et al., 2018). Others have considered HE in the context of either obesity prevention or treatment (Al-Khudairy et al., 2017; Quelly et al., 2015). However, reviews investigating dietary quality in adolescents, including fruit and vegetable consumption, have found inconclusive evidence for nudge strategies (Nørnberg et al., 2016) or practical sessions on preparing and cooking food (Calvert et al., 2018), but promising evidence for website interventions (Rose et al., 2017) or the use of media within face-to-face interventions (Calvert et al., 2018). One review found mixed evidence for involving families (Murimi et al., 2018) while another determined that more effective interventions involved peers (Calvert et al., 2018). Finally, one review found that the majority of ineffective studies targeted more than one dietary behaviour (Calvert et al., 2018).
The inconsistent evidence base for both PA and HE could be due to the differing content between individual interventions. The development of the BCT taxonomy v1 (BCTT v1; Michie et al., 2013) now provides a standardised vocabulary for describing intervention content. This allows for the identification of intervention content which can then be compared and replicated in future studies. The taxonomy is frequently used to identify promising BCTs as part of a review (e.g., Martin et al., 2013) or from empirical research (e.g., Ojo et al., 2019), which can then be applied to new behaviour change interventions (e.g., Howlett et al., 2017).
Another reason for the inconsistent evidence base could be differences in the methodologies of the interventions (e.g., who delivers them). Some studies utilise their own research staff to deliver the intervention while others train professionals already in contact with participants, such as teachers or nurses. The training of professionals is an under-researched area but has the potential to greatly affect the success of an intervention. Knowing what skills professionals are developing, and their proficiency in using those skills to deliver the intervention as intended (fidelity), can provide insights into why some interventions are successful and others are not. However, the level and quality of training provided are rarely considered or explored in terms of best practice for future interventions. A taxonomy related to the BCTT v1 has subsequently been developed by Pearson and colleagues (2020) and provides a standardised vocabulary to identify the ingredients of training programmes that may impact on the ability of professionals to deliver the intervention with optimal fidelity to planned content. Fidelity is the subject of a range of recent research (e.g., McGee et al., 2018), supported by the development of the Template for Intervention Development and Replication checklist (TIDieR; Hoffmann et al., 2014). TIDieR is a standardised reporting tool to allow for intervention methodologies, content and fidelity to be reported and replicated. Knowing how well an intervention was delivered according to the manual can provide valuable insights into the success or failure of different trials.
This review sought to explore the short and longer-term promise of PA and HE behaviour change interventions for young people, the associated promising BCTs at both time points and best practice of training professionals to deliver these types of programmes. None of the reviews highlighted have systematically considered the distinction between short-term effects on behaviour ('behaviour change') and longer-term effects ('maintenance'). To experience the benefits of PA and HE through to adulthood, the intervention needs to be able to produce effects after it has ended. During this maintenance phase, further behaviour change (from post-intervention) may be achieved, behaviour change may be sustained (from baseline or post-intervention), or new change from baseline may occur. All scenarios are evidence of the intervention having a positive longer-term effect.
Therefore, this review will consider behaviour change outcomes at post-intervention as well as maintenance, of at least six months after the end of the intervention, as per a previous review of adult PA literature (Howlett et al., 2019). To our knowledge this is the first review to consider only studies which have assessed both behaviour change and maintenance of PA and HE behaviours in adolescents, and the behaviour change techniques that are associated with effective PA or HE interventions at both time points. Additionally, this review will assess included studies against the TIDieR checklist to allow for an assessment of reporting standards. A further unique contribution of this review is the inclusion of papers discussing approaches to the training of deliverers in interventions. By considering the training approaches used within PA and HE interventions, it is possible to gain insights into potential best practice leading to better outcomes for the professionals being trained and any knock-on effects for young people.
The aims of this review are: (1) to assess whether interventions targeting PA and/or HE among adolescents show promise in promoting behaviour change and maintenance; (2) to identify which BCTs from the BCTT v1 are associated with promising interventions targeting PA and/or HE behaviour among adolescents in relation to both behaviour change and maintenance; (3) to investigate the optimal approaches to training deliverers of interventions targeting PA and/or HE behaviour among adolescents.

Methods and materials
This review has been reported in accordance with the PRISMA Statement (Page et al., 2021; Supplementary Table 1). The review was registered with PROSPERO in May 2020 (registration number: CRD42020175245).

Eligibility criteria and search strategy
Both the inclusion/exclusion criteria and the search terms were built around the PICOS domains, as shown in Supplementary Table 2. Intervention studies were eligible for inclusion where they used an RCT or quasi-experimental design with an active or passive control group. Participants had to be aged 10-19 years old with no chronic health conditions. Interventions had to primarily target PA and/or HE and be delivered in community settings. A primary outcome measure of PA or HE had to be used at baseline and at a minimum of six months post-intervention (referred to as 'maintenance'). A measurement at post-intervention (referred to as 'behaviour change') was not required. In addition to intervention studies, the review also included papers which reported on the training of professionals to deliver PA or HE interventions to young people aged 10-19 years old. It was not required that these papers relate to intervention studies included in the review.

Information sources
The following databases were included in the search: EMBASE, PsycINFO and PsycEXTRA all accessed through Ovid, CINAHL Plus and SPORTDiscus both accessed through EBSCOhost, Cochrane Central Register of Controlled Trials (CENTRAL) via cochranelibrary.com/central, PubMed via pubmed.ncbi.nlm.nih.gov/ and Scopus via scopus.com. Grey literature was searched using PsycEXTRA (Ovid) and OpenGrey via opengrey.eu. Searches were conducted by title and abstract and were limited to publications written in English. Results included all entries from database inception to 20 July 2021. Reference lists of included studies were searched manually.

Selection process
Search results were exported into EndNote X8. Deduplication was conducted via EndNote and through manual screening and deletion. Records were initially screened by title and abstract (HA-W) according to the inclusion and exclusion criteria, with a random 10% screened independently by NH. Those which met the criteria were then assessed for eligibility according to the full text. Full text screening was conducted independently by two reviewers (HA-W, NH). During full text screening the authors of five papers were contacted to seek clarification of a study element, and one further author was contacted for results corresponding to a published protocol. Three authors provided the requested information; where no response was received the study was excluded from the review.

Data collection process and data items
Data was extracted (by HA-W and independently moderated by NH) and collated into a pre-piloted Excel spreadsheet using the following headings: general, study characteristics, participants, intervention features, outcomes and results (see Supplementary Table 3).

Study risk of bias assessment
For RCTs (see Figure 2), risk of bias was assessed using version 2 of the Cochrane risk-of-bias tool (RoB 2; Sterne et al., 2019). This tool assesses RCTs in five domains as low risk, some concerns or high risk before an overall rating is made. Assessments were made using the RoB 2 Excel tool, which automatically applies an algorithm to calculate level of risk dependent upon reviewers' responses to signalling questions. HA-W and NH independently assessed risk of bias, with an agreement of .82 (Krippendorff's alpha). Risk of bias in quasi-experimental studies was assessed using the Risk Of Bias In Non-randomised Studies of Interventions tool (ROBINS-I; Sterne et al., 2016). The ROBINS-I tool assesses risk across seven domains as low, moderate, serious, critical or no information. All risk of bias assessments were conducted independently by two reviewers (HA-W, NH).

TIDieR
Intervention studies were also assessed against the TIDieR checklist (see Table 2), which contains items prompting the reporting of an intervention in sufficient detail to allow replication (Hoffmann et al., 2014). The 12 items were rated as present, missing, incomplete or not applicable. All studies were rated by one reviewer (HA-W) and five (38%) were rated independently by a second reviewer (NT), showing an inter-rater reliability of .89 (Krippendorff's alpha). The remaining studies were moderated by NT and NH.

Behaviour change techniques
Intervention content reported in any published paper (including protocols) relating to the experimental and active control groups of included intervention studies was coded using the BCTT v1 (Michie et al., 2013). For studies targeting both PA and HE, BCTs were coded separately for the PA and HE content. Where BCTs were referred to generically, they were assumed to be present for both PA and HE content. Training papers were coded using a modified taxonomy aimed at coding the training content for healthcare providers (Pearson et al., 2020). Coding was conducted independently by two reviewers (HA-W, NH), consulting a third (AC) to resolve discrepancies. Krippendorff's alpha showed an inter-rater reliability of .93.
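The inter-rater reliability figures reported in the Methods are Krippendorff's alpha values. The review does not state which software or level of measurement was used, so the following is only a minimal sketch, assuming nominal codes (e.g., a BCT coded as present or absent) and the standard coincidence-matrix formulation. The function name and the example codes are hypothetical and are not data from this review.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data.

    `units` is a list of tuples, one per coded unit, holding the value
    assigned by each coder (None marks a missing code).
    """
    # Build the coincidence matrix: every ordered pair of values given to
    # the same unit by different coders, weighted by 1 / (m - 1).
    coincidences = Counter()
    for values in units:
        present = [v for v in values if v is not None]
        m = len(present)
        if m < 2:
            continue  # units coded by fewer than two coders carry no information
        for a, b in permutations(present, 2):
            coincidences[(a, b)] += 1 / (m - 1)

    # Marginal totals per value and overall number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("not enough pairable codes")

    # Observed and expected disagreement (nominal metric: 1 if values differ).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one value is ever used")
    return 1 - observed / expected


# Hypothetical example: two coders rating five units as present (1) or absent (0).
example_codes = [(1, 1), (1, 1), (0, 0), (0, 0), (1, 0)]
alpha = krippendorff_alpha_nominal(example_codes)
```

In this hypothetical example the coders disagree on one of five units, which gives an alpha of 0.64; identical codes throughout would give 1.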

Synthesis methods
Given the heterogeneity of outcome measures, a meta-analysis was not possible. Therefore, the promise of intervention studies was considered in terms of presented results at post intervention and follow up.
Promise ratios were used to explore whether studies achieved statistically significant improvements in primary outcomes and whether study promise was related to specific BCTs. All intervention studies were eligible for the calculation of promise ratios, which are presented in table format and were calculated using the method of Gardner et al. (2016), based on original work by Martin et al. (2013). In brief, BCTs had to appear in at least two studies and had to be unique to a study's intervention group, i.e., not in the control group as well. Ratios were calculated as the number of times the technique appeared in a very or quite promising intervention divided by the number of times it appeared in non-promising interventions. An intervention was considered to be very promising when results on any measurement of the primary outcome showed both a within and between group significant difference in favour of the experimental group. Quite promising interventions were ones that showed either a within or between groups significant difference in favour of the experimental group, while those that showed neither of these were considered non-promising. BCTs were considered promising when used in at least twice as many promising as non-promising interventions, i.e., their ratio was ≥2. Promise ratios were calculated separately for PA and HE interventions, and further split into behaviour change measured at post-intervention and behaviour change maintenance using follow up results.
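The promise classification and promise ratio described above can be illustrated with a short sketch. It is not the authors' code: the function names, the dictionary keys and the example studies are hypothetical, and it assumes that the "at least two studies" threshold is counted over studies in which the BCT was unique to the intervention group. BCTs that appear only in promising studies have no defined ratio (division by zero), so the sketch reports them with a ratio of None and flags them as promising, mirroring how the review reports such BCTs by count (k) instead.

```python
from collections import defaultdict


def classify(within_sig, between_sig):
    """Very promising: both effects significant; quite: one; non: neither."""
    if within_sig and between_sig:
        return "very"
    if within_sig or between_sig:
        return "quite"
    return "non"


def promise_ratios(studies, min_studies=2):
    """Return a promise ratio per BCT from a list of study dictionaries."""
    counts = defaultdict(lambda: {"promising": 0, "non": 0})
    for study in studies:
        label = classify(study["within_sig"], study["between_sig"])
        # Only BCTs unique to the intervention group are eligible.
        unique_bcts = set(study["intervention_bcts"]) - set(study["control_bcts"])
        for bct in unique_bcts:
            counts[bct]["non" if label == "non" else "promising"] += 1

    results = {}
    for bct, c in counts.items():
        if c["promising"] + c["non"] < min_studies:
            continue  # BCT must appear in at least `min_studies` studies
        ratio = c["promising"] / c["non"] if c["non"] else None
        promising_bct = c["promising"] > 0 and (ratio is None or ratio >= 2)
        results[bct] = {
            "k_promising": c["promising"],
            "k_non": c["non"],
            "ratio": ratio,
            "promising_bct": promising_bct,
        }
    return results


# Hypothetical studies (not data from this review).
example_studies = [
    {"within_sig": True, "between_sig": True,
     "intervention_bcts": ["5.1", "3.2"], "control_bcts": []},
    {"within_sig": False, "between_sig": True,
     "intervention_bcts": ["5.1", "3.2", "1.1"], "control_bcts": ["1.1"]},
    {"within_sig": False, "between_sig": False,
     "intervention_bcts": ["3.2", "4.1"], "control_bcts": []},
    {"within_sig": True, "between_sig": False,
     "intervention_bcts": ["4.1", "5.1"], "control_bcts": []},
]
ratios = promise_ratios(example_studies)
```

With these hypothetical inputs, BCT 3.2 appears in two promising studies and one non-promising study (ratio = 2, promising), BCT 5.1 appears in promising studies only, BCT 4.1 has a ratio of 1 (not promising), and BCT 1.1 is dropped because it also appears in a control group.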
To explore best practice in the training of deliverers, it was intended to compare the content of training as identified with the BCT taxonomy by Pearson et al. (2020) between promising and nonpromising studies. However, given the nature of the studies relating to the included training papers this was not possible and a narrative synthesis of the BCTs used and outcomes of the training was conducted instead.
Ten studies reported the intervention being based on theories of behaviour change, with half of these citing Social Cognitive Theory (Cui et al., 2012; Jemmott et al., 2011; Meydanlioglu & Ergun, 2019; Prins et al., 2012; Ridgers et al., 2021). Six studies used an active control group, all of which were the same duration as the experimental group (Ardic & Erdogan, 2017; Jago et al., 2006; Jemmott et al., 2011; Prado et al., 2020; Prins et al., 2012; Taymoori et al., 2008). Only two studies used an objective measure of PA, accelerometers (Corder et al., 2020; Jago et al., 2006); the others used self-reported questionnaires or an activity log. All but four studies were conducted within schools during term time; one was conducted online (Ridgers et al., 2021), one in local parks plus other unspecified locations (Prado et al., 2020), one online plus unspecified locations (Jago et al., 2006) and the other took place in local education facilities during school holidays (Kuroko et al., 2020). Interventions ranged in duration from six days to six months, with an average of 11 weeks in PA studies and 9 weeks in HE studies. The mean duration of follow up post intervention was 10 months for PA studies and 12 months for HE studies.

RCTs
As seen in Figure 2, only two RCTs were assessed as having a low risk of bias in all domains (Corder et al., 2020; Ridgers et al., 2021). Four studies were considered to be at high risk of bias overall (Isensee et al., 2018; Jago et al., 2006; Kuroko et al., 2020; Viggiano et al., 2015), and six at some risk (Cui et al., 2012; Jemmott et al., 2011; Lin et al., 2017; Prado et al., 2020; Prins et al., 2012; Taymoori et al., 2008). The domains most consistently rated as low risk were deviations from the intended intervention and missing outcome data (9/12 studies each), while the domains with the highest risk across all studies were randomisation and deviations from the intended intervention (2/12 studies each).

Quasi-experimental intervention studies
One study was rated as being at moderate risk of bias overall (Meydanlioglu & Ergun, 2019) and the other at serious risk (Ardic & Erdogan, 2017). Both studies were rated at low risk in four domains (confounding, selection of participants, classification of interventions, deviations from the intended interventions), moderate risk for measurement of outcomes, whilst insufficient information for selection of the reported result meant both studies were rated as No Information (NI). In the missing data domain, one study was rated as serious risk (Ardic & Erdogan, 2017) and the other as NI (Meydanlioglu & Ergun, 2019).

TIDieR
Intervention studies ranged in the number of items they reported from one to ten, with an average of six. The most frequently reported items were brief name (n = 13), procedure (n = 12), rationale and when and how much (both n = 11). All studies were rated as incomplete for materials, usually due to not providing copies of, or access to, the materials. All three studies that provided tailored interventions reported the details (Jemmott et al., 2011; Prins et al., 2012; Taymoori et al., 2008). Only two studies reported both planned and measured adherence/fidelity (Corder et al., 2020; Prins et al., 2012), while three studies partially reported adherence/fidelity despite not reporting the intention to do so (Jago et al., 2006; Kuroko et al., 2020; Prado et al., 2020). A further two studies reported some aspect of adherence/fidelity after either planning to do so (Ridgers et al., 2021), or partially discussing the plan (Cui et al., 2012). A summary of included TIDieR items can be found in Table 2.

Promise ratios - PA studies
Table 3 shows the promise classifications for all studies. Jemmott et al. (2011) measured outcomes at 6, 12, 42 and 54 month follow ups. Given the 42 and 54 month follow ups were of a far greater [...] With regard to behaviour change, 14 BCTs were assessed. Of these, one showed promise: 5.1 information about health consequences (used in promising studies only). Promise ratios for maintenance were calculated for 18 BCTs. Two of these were found to be promising: 3.2 social support (practical) (ratio = 2), and 5.1 information about health consequences (ratio = 4). Promise ratios for all BCTs can be seen in Table 4.

BCTs - HE interventions
The number of BCTs reported in HE studies varied from none to eight, with an average of 4.9. The most common BCTs were 4.1 instruction on how to perform the behaviour and 5.1 information about health consequences (both k = 4), and 1.2 problem solving and 8.1 behavioural practice/rehearsal (both k = 3). In total there were 20 unique BCTs identified in HE studies, occurring 34 times. Table 3 shows the promise classifications for all HE studies. The Jemmott et al. (2011) study was treated the same as for PA promise ratios, meaning it was considered very promising at follow up but did not have a post-intervention outcome. Likewise, Lin et al. (2017) did not measure outcomes at post-intervention and was excluded from analysis at that time point.

Promise ratios - HE studies
As shown in Table 4, for behaviour change, four BCTs showed promise: 4.1 instruction on how to perform the behaviour and 8.1 behavioural practice/rehearsal (both ratio = 2), and two that were used only in promising studies, 3.1 social support (unspecified) and 5.1 information about health consequences (both k = 2). Promise ratios to assess maintenance showed four BCTs used only in promising studies: 5.1 information about health consequences (k = 4), 1.2 problem solving (k = 3), 1.4 action planning and 2.3 self-monitoring of behaviour (both k = 2).

Study characteristics - training papers
Table 5 shows study characteristics for papers reporting on the training of professionals to deliver interventions. As can be seen, both sets of training were delivered by research staff in Australia.

Narrative summary - training papers
Kennedy et al. (2019) delivered a one-day professional development workshop to 27 teachers delivering the intervention, with 17 in a control group. They measured teachers' confidence and perceived personal fitness at baseline and six months later. Results showed that teachers in the experimental group reported significant increases on both measures in comparison to the control group with medium to large effect sizes (partial eta squared 0.19 and 0.13, respectively). Lonsdale et al. (2019) used a combination of face-to-face workshops and online resources to train teachers to deliver more active PE lessons. A total of 94 teachers took part in the study, split equally between experimental and control groups. At post-intervention, significant effects were found for all measures of teacher behaviours in favour of the experimental group. Large effect sizes were found for maximising movement, reducing transition time, building students' competence and supporting students (Cohen's d 1.96, 4.36, 1.67 and 1.92, respectively). This was reflected in significant increases in MVPA during PE lessons for students whose teachers were in the experimental group (Cohen's d .85).
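For readers less familiar with the effect sizes quoted above, Cohen's d expresses the difference between two group means in pooled standard deviation units. The sketch below shows the textbook pooled-variance calculation only; the training papers may have derived their values differently (for example, from model estimates), and the function name and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups using the pooled sample variance."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical scores (not data from the training papers).
experimental_scores = [2, 4, 6]
control_scores = [1, 3, 5]
d = cohens_d(experimental_scores, control_scores)
```

Here the group means differ by 1 and the pooled standard deviation is 2, so d = 0.5. By Cohen's conventional benchmarks, values around 0.2, 0.5 and 0.8 are read as small, medium and large effects.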

BCTs - training papers
One paper used nine BCTs to train deliverers while the other used four. One code was discussed with the third reviewer. Both studies used feedback on behaviour, behavioural practice/rehearsal and adding objects to the environment. Additionally, Kennedy et al. (2019) used instruction on how to perform the behaviour, while Lonsdale et al. (2019) included goal setting (behaviour), problem solving, action planning, self-monitoring of behaviour, social support (unspecified) and demonstration of the behaviour.

Discussion
This review synthesised studies of interventions across PA and HE for adolescents that included a measure of longer-term behaviour change, in addition to the training of professionals to deliver interventions. Regarding the first aim, this review has found mixed evidence of behaviour change post-intervention for both PA and HE behaviours, with preliminary evidence showing positive longer-term effects at maintenance. For PA studies, half showed short-term promise at post-intervention and continued to show promise longer-term at maintenance. One further study did not measure outcomes at post-intervention but showed promise longer-term. The situation was similar for HE studies. Here, four studies were promising post-intervention and two continued to demonstrate promise at follow up. A further two studies that did not measure at this time point showed longer-term promise. These results show that while some studies resulted in positive behavioural change for participants, better effects of the intervention were seen during longer-term maintenance after the support had ceased. In line with the definition of maintenance used in this review, the results show that for some studies behaviour change occurred both during the intervention and over the longer term. For others, changes to behaviour were noted over the longer term. This suggests that change can occur both within and after intervention studies, from which adolescents will benefit.
Consistent with previous reviews of health behaviour interventions, the majority of studies included in this review showed some-to-high risk of bias. The nature of behaviour change interventions often precludes researchers from blinding participants. Thus, studies utilising self-report measures are considered to be at some risk of bias due to the knowledge of intervention allocation, which can result in response bias and subsequent over-inflated effects. The use of objective outcome measures, such as accelerometers, can help mitigate this bias.
Contrary to De Meester et al. (2009), this review found that studies targeting both PA and HE did not appear to be less effective than those targeting only one behaviour. Given that this review included only one school-based PA study that actively involved family members, this review is unable to either support or disagree with the conclusions of Van Sluijs et al. (2007), who reported that there was strong evidence for these types of interventions being effective. However, this review found similar results to Murimi et al. (2018), as the included HE studies that involved parents demonstrated mixed evidence for effectiveness. This review agrees with the overall conclusions from other reviews (e.g., Dobbins et al., 2013; Hynynen et al., 2016) in that evidence for school-based PA interventions is mixed. Within this review, 80% of PA interventions were school-based and only half were effective for maintenance, with less than half effective for initial behaviour change. Therefore, it would be prudent for future research to test delivery within other locations such as the community, or at the least adopt a multilevel approach by including both school and parents, which has shown promise (Murimi et al., 2018; Van Sluijs et al., 2007).
The second aim, through the calculation of promise ratios, identified a number of promising BCTs. For PA interventions, behaviour change was associated with the use of information about health consequences, while maintenance was associated with practical social support and information about health consequences. For HE interventions, behaviour change was associated with unspecified social support, instruction on how to perform the behaviour, information about health consequences and behavioural practice/rehearsal. Maintenance was associated with problem solving, action planning, self-monitoring of behaviour and information about health consequences. As this is the first review to calculate promise ratios for PA and HE interventions in adolescents using the BCTT v1 rather than the CALO-RE taxonomy (e.g., Martin et al., 2013), it is not possible to draw direct comparisons with the findings of other reviews. However, although using different methodologies, many of the promising BCTs identified in this review were found to be effective in a review of PA interventions for 5-18-year-olds (Carlin et al., 2016). Additionally, information about health consequences was present in 100% of potentially effective interventions in a review of mother-daughter PA interventions, though again the methodologies differed.
The final aim of this review was to investigate the training of professionals to deliver PA and HE interventions to adolescents. The two included training papers found that staff reported increased confidence and improved performance to facilitate MVPA during lessons. Given the scarcity of training papers available, there is little that can be gleaned in terms of best practice. Therefore, the knowledge and skills of practitioners that lead to better outcomes for trained professionals and participants remain unknown. The training of professionals and the subsequent impact on outcomes remains an area for future studies to explore. This can be facilitated through including training considerations as an integral part of intervention design, evaluation and reporting to allow the exploration and identification of best practice.

Implications
This review has highlighted several studies contributing to knowledge of change and maintenance of PA and HE behaviours in adolescents, though there is room for improvement in the design and reporting of such studies. Of the included studies, nine were published after the TIDieR checklist was released in 2014. However, none of the included studies documented all the recommended items; on average they included half of applicable items, meaning that none of them could be explicitly replicated to be used in practice within public health services. The worst reported TIDieR items were on fidelity, and two things should be noted here. Firstly, without measuring fidelity, i.e., whether components were delivered as they were intended, it is impossible to know what the deliverers did and what the participants actually received. The fact that professionals face challenges in using skills consistently (Moore et al., 2012) and may skip parts of the content which they are not comfortable delivering, or do not believe to be necessary or beneficial (e.g., Whiteside et al., 2016), suggests that some participants may receive an intervention different to the one intended. Secondly, fidelity and adherence are combined in the TIDieR checklist despite measuring different things. Within this review, some studies were assessed as having reported this item when they had only discussed adherence, which tells us nothing about fidelity. We recommend separating adherence and fidelity within TIDieR to ensure both are considered and reported.
In coding BCTs, there were difficulties around using the taxonomy due to both ambiguous language in describing intervention content and lack of clarity in the application of BCTs. For example, this review found instances of BCTs referenced to the PA or HE component of dual targeted interventions, but also instances where the BCT was not related to a target behaviour. Therefore, authors are encouraged to list BCTs separately for each behaviour targeted. Authors are also encouraged to use language that is consistent with the BCT taxonomy to facilitate straightforward coding. One way to achieve this, as demonstrated by Corder et al. (2020), is to report which BCTs have been used, alongside an explanation of how the intervention components fulfil the criteria for each technique.
Another difficulty with coding concerned the definitions in the taxonomy itself. To illustrate, BCT 1.1 (goal setting, behaviour) states 'set or agree on a goal' (Michie et al., 2013, supplementary data pg.11). As stated, the BCT could be coded as present if the goal is set by the deliverer without the agreement of the participant. However, NICE recommends for behaviour change interventions that deliverers should 'agree goals for behaviour' (NICE, 2014, pg.15) while the competency framework for the delivery of behaviour change interventions specifies deliverers have an 'ability to agree goals' (Dixon & Johnston, 2010, pg.26). It could be argued that if someone does not agree with a goal it is unlikely that they will work towards it. This suggests the need to refine the BCT taxonomy to specify who set the goals and whether they were set, and agreed, at an intervention level (e.g., deliverers set a goal for every participant to complete 150 mins of MVPA per week) or at the individual level (e.g., personalised goals based on each participant's abilities). Until such a refinement occurs, researchers should note whether the participants have agreed to behavioural, or outcome, goals.
Evaluation of training programmes for deliverers is rarely reported in the literature, despite this review including nine studies that trained people other than the research team to deliver the interventions. In the field of mental health, the training and skills of professionals are frequently investigated in relation to patient outcomes (Liness et al., 2019), and as much as 16% of the variance in outcomes has been attributed to the relationship between patient and professional alone (Del Re et al., 2012). The field of behaviour change should likewise consider the role the deliverer plays in participant outcomes.

Strengths and limitations
To our knowledge this is the first systematic review to distinguish between initial behaviour change and maintenance phases, code against the TIDieR checklist and BCTT v1, and review training papers for PA and HE interventions in adolescents. It should be noted that this review was conducted using only published material, an approach used in many other systematic reviews (e.g., De Meester et al., 2009; Martin et al., 2013). This, combined with the lack of reporting of fidelity, means that BCTs (1) may have been omitted from coding because they were not reported in full, (2) may have been erroneously coded to both target behaviours due to lack of specificity in reporting, and (3) may have been planned to be delivered but not actually received by participants. The identification of promising BCTs should therefore be considered preliminary.
Because this review included only studies with a measure of maintenance (i.e., longer-term behaviour change), a relatively small number of articles were included. This is not a limitation of this review; rather, it reflects a sparse literature base on this topic. Similarly, the requirement for studies to use a primary outcome of PA or HE reduced the number of studies eligible for inclusion. This highlights how few behaviour change interventions primarily aim to measure actual behaviour change. Instead, many studies aim to measure the outcomes of behaviour change, usually anthropometric measures such as BMI or weight. While this can provide useful information, it is the changes in behaviour that drive changes in outcomes.
Additionally, it should be noted that BCTs do not appear in isolation and it is possible that groups of BCTs have synergistic effects. Similarly, coding for BCTs does not account for dose: some BCTs may naturally occur multiple times, e.g., social support, whereas some may only be done once, e.g., goal setting. Finally, BCTs have been identified as promising on the basis of promise ratio calculations, the methodology of which combines interventions classified as quite promising with those considered very promising. In doing so, the process is in danger of artificially inflating the promise of some BCTs which might be considered ineffective if only very promising interventions were considered. In this review only three studies were considered quite promising at any time point. Reviews with larger numbers of quite promising studies may be more susceptible to overinflation of results and therefore should stay mindful of this issue when drawing conclusions.
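The inflation risk described above can be illustrated with a minimal sketch. It assumes a common form of the promise ratio calculation, in which a BCT is deemed promising when it appears in at least two promising interventions and in at least twice as many promising as non-promising interventions. These thresholds are not stated in this section, so they are treated here as assumptions. The study data, the names `Intervention`, `promise_ratio` and `is_promising`, and the choice to count quite promising studies as non-promising in the stricter analysis are all hypothetical and for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intervention:
    name: str
    rating: str        # "very", "quite" or "non" (promising)
    bcts: frozenset    # BCT labels coded as present


def promise_ratio(bct, interventions, promising=("very", "quite")):
    """Count promising and non-promising interventions featuring a BCT.

    Any rating not listed in `promising` is counted as non-promising.
    """
    with_bct = [i for i in interventions if bct in i.bcts]
    n_promising = sum(i.rating in promising for i in with_bct)
    n_non = len(with_bct) - n_promising
    if n_non == 0:
        # No non-promising studies: the ratio is undefined, shown as infinity.
        ratio = float("inf") if n_promising else 0.0
    else:
        ratio = n_promising / n_non
    return n_promising, n_non, ratio


def is_promising(bct, interventions, promising=("very", "quite"),
                 min_ratio=2.0, min_promising=2):
    """Assumed decision rule: enough promising studies and a high enough ratio."""
    n_promising, _, ratio = promise_ratio(bct, interventions, promising)
    return n_promising >= min_promising and ratio >= min_ratio


# Hypothetical studies, invented purely to show the effect of pooling.
STUDIES = [
    Intervention("A", "very", frozenset({"goal setting", "social support"})),
    Intervention("B", "quite", frozenset({"goal setting"})),
    Intervention("C", "quite", frozenset({"goal setting"})),
    Intervention("D", "non", frozenset({"goal setting"})),
]

# Pooling quite and very promising studies: 3 promising vs 1 non-promising.
pooled = is_promising("goal setting", STUDIES)
# Stricter analysis counting only very promising studies: 1 vs 3.
strict = is_promising("goal setting", STUDIES, promising=("very",))
print(pooled, strict)
```

In this invented example the same BCT passes the assumed threshold when quite promising studies are pooled with very promising ones, and fails it when only very promising studies count, which is the sensitivity the paragraph above cautions about.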

Conclusions
This review provided a synthesis of the international evidence on adolescent PA and HE behaviour change interventions and the maintenance of their effects for at least six months. It was found that some adolescent PA and HE interventions are promising at post-intervention, with stronger evidence for maintenance. Given the importance of sustaining behaviour change, the BCTs found to be associated with maintenance should be considered for inclusion in future studies where possible. For PA interventions these are practical social support and information about health consequences, while for HE interventions they are problem solving, action planning, self-monitoring of behaviour and information about health consequences. Although the BCT taxonomy helps identify what goes into an intervention, we argue that evaluating fidelity is equally important in future studies. Alongside this, an improvement in the reporting of interventions and training is warranted, to be addressed by both authors and editors. Additionally, commissioners would be well served to consider the impact of fidelity on future bids for public health tenders. Future efforts should go into refining the BCT taxonomy and TIDieR checklist in light of the issues raised in this review. The small number of studies eligible for inclusion in this review highlights the maintenance of adolescent PA and HE behaviour change as an area of priority for future research.