The effectiveness of school-based run/walk programmes to develop physical literacy and physical activity components in primary school children: A systematic review

ABSTRACT The objectives of this review were to systematically review the research on school-based run/walk programmes and their measurements of physical literacy (PL) and physical activity (PA)-related components and to assess the different intervention methods and their impact on encouraging PL and PA. To be included in the review, studies had to satisfy all inclusion criteria. An electronic search was conducted on six databases; the last search date was 25 April 2022. All outcome measures were grouped using the Shearer et al. (2021) PL checklist and additional PA-related outcomes. Ten studies were included in the final review. Five different run/walk interventions were identified and six studies followed or referred to The Daily Mile (TDM) protocol. Outcomes relating to the physical domain were most commonly explored, and no studies explored the cognitive domain. Four studies reported significant differences in cardiovascular endurance measures. Positive findings were also reported for outcomes relating to motivation and self-perception/self-esteem in the affective domain. Overall, run/walk programmes appear to provide promising results in favour of physical and affective development in PL. However, further high-quality studies are needed to draw firm conclusions. This review highlights the popularity of TDM and its potential to contribute to PL development.


Introduction
Schools are identified as essential environments for contributing to children's daily physical activity (PA) levels (Jones et al., 2020; Naylor et al., 2015; Public Health England, G. U, 2020b; Shah et al., 2017; Taymoori & Lubans, 2008) and prove a popular environment to roll out PA-based initiatives as children spend a large portion of their day in school (Jones et al., 2020; Naylor et al., 2015; Shah et al., 2017). Children and young people should be engaging in an average of 60 minutes per day of moderate to vigorous physical activity (MVPA) across the week, and at least 30 of their 60 active minutes per day should be achieved during school time (UK Chief Medical Officer Physical Activity Guidelines, 2019). However, with increasing timetable pressures on schools and physical education (PE), the opportunity for active play is often not prioritised (Norris et al., 2015; Youthsporttrust.org, 2018). School-based PA programmes are offered as an opportunity for pupils to be active throughout the school day outside of PE lessons, including during breaktime or in-class activities (Jones et al., 2020). These PA-focused initiatives are often introduced to combat rising childhood obesity and sedentary behaviours in young children by increasing daily PA at school (Chalkley et al., 2020b; Jones et al., 2020).
In recent years, school-based run/walk programmes have gained popularity (Chalkley et al., 2018b, 2020a) and involve walking, jogging, or running a route on school grounds for either a set distance or time (Chalkley et al., 2020b; Sherar et al., 2020). School-based running programmes are often also referred to at a policy level as "active mile initiatives", which typically entail running for approximately 15 minutes at a self-selected pace until a one-mile distance is covered (Chalkley et al., 2018a; Public Health England, G. U, 2020; The Daily Mile, 2022). Because the pace is self-selected and varies, the interventions are referred to in the present review as "run/walk" interventions rather than solely "running interventions". Several national and local policies feature school-based interventions with specific attention on run/walk programmes such as The Daily Mile™ (TDM; Public Health England, G. U, 2020; The Daily Mile, 2022). The United Kingdom (UK) Government Childhood obesity: A plan for action report, Chapter 2 (Department of Health and Social Care, 2019) and the School Sport and Physical Activity Plan (Department for Education, Department for Digital, Culture, Media & Sport and Department of Health and Social Care, 2019) both promote the implementation of "active mile initiatives" (Public Health England, G. U, 2020). Existing work on programmes like TDM (Marchant et al., 2020) and Marathon Kids (MK; Chalkley et al., 2018a, 2018b) has found that the programmes offer schools a flexible and straightforward approach to encouraging daily PA without needing additional equipment, staff training or funding to implement, all of which have repeatedly been noted as limitations in other forms of school-based activities. Since TDM's launch in Stirling in 2012, the programme has grown in popularity across the globe. 
The programme received £1.5 million as part of the Sport England National Lottery funding to help primary schools in England implement TDM and, to date, the initiative is taking place in approximately 87 countries, with over 3,175,000 children completing TDM daily (Sherar et al., 2020; The Daily Mile, 2022). Much discussion has been raised on how far these programmes go in establishing positive physiological change, mental health and improved academic attainment and, in turn, encouraging long-term PA participation (Fairhurst & Hotham, 2017; Thorburn, 2020). Daly-Smith et al. (2019) and Thorburn (2020) both noted that current TDM studies driving policy have lacked quality and clarity in their findings and called for further investigation to confirm conclusions. Since its initial launch, research into TDM has developed; however, there are still no known reviews of this specific intervention type that aim to draw firm conclusions about the potential outcomes of participation.
The concept of physical literacy (PL) has been discussed as a way to encourage and maintain lifelong engagement in PA. PL can be described as a "multifaceted concept" that consists of affective, physical and cognitive domains that interlink (Cornish et al., 2020; Edwards et al., 2017; Shearer et al., 2021; Whitehead, 2001). Embodying a PL perspective in research looks beyond exclusively physical outcomes and provides a holistic perspective that considers psychological and cognitive elements. Globally, many definitions of PL are adopted by leading sports associations (Edwards et al., 2017). The Whiteheadian perspective is thought to cover a variety of movements and physical and psychological skills that go beyond solely competitive sports participation, and represents a holistic approach to PA, considering the lifelong processes associated with participation (Edwards et al., 2017; Whitehead, 2001). The International Physical Literacy Association (IPLA) also uses this popular definition. According to the IPLA website, PL can be defined as "The motivation, confidence, physical competence, knowledge and understanding to value and take responsibility for engagement in physical activities for life" (IPLA, 2017). While PL is lifelong, initiatives implemented during childhood are popular because childhood is a critical stage for developing PL attributes and lifelong PA participation (Belanger et al., 2018; Shearer et al., 2021). In the School Sport and Physical Activity Action Plan (Department for Education, Department for Digital, Culture, Media & Sport and Department of Health and Social Care, 2019), PL is also included as a core feature of children's school experiences and PA participation. 
It is thought that creating positive daily PA habits in schools could contribute to developing PL components in children and, in turn, increase the likelihood of developing a lifetime habit of PA (Department for Education, Department for Digital, Culture, Media & Sport and Department of Health and Social Care, 2019).
Given its inclusion of the three domains, PL can capture a broad range of processes that contribute to lifelong learning and engagement in physical activities (Cairney et al., 2019; Whitehead et al., 2013). Investigating PL could be more worthwhile than focusing solely on physical factors relating to health, such as body mass index (BMI) or motor skills, which do not necessarily consider the broader processes associated with lifelong engagement (social, cognitive, affective etc.). Children with greater PL are thought to be more likely to meet daily PA guidelines (Cornish et al., 2020) and often, these elements are seen in conjunction with one another. Engaging in meaningful PA experiences will provide children with the opportunity to develop and nurture their PL; in doing so, they also contribute to developing regular PA habits (Durden-Myers et al., 2018; Whitehead, 2001). Nevertheless, there are currently no known reviews assessing the influence of school run/walk programmes alongside PL and PA-related outcomes. Exploring the domains of PL with this intervention type may also provide an opportunity to address potential concerns about the efficacy of the interventions for children's PA participation.
Therefore, the aims of this review were: 1) to systematically examine the research on school-based run/walk programmes and the measurements of PL constructs and PA-related components; and 2) to assess the different intervention methods of school-based run/walk programmes and their impact on encouraging PA participation or developing PL.

Materials and methods
The reporting of this systematic review followed the PRISMA 2020 guidelines. The project was registered with PROSPERO (CRD42021253675), and ethical approval was granted by the institutional ethics committee.

Search strategy
An electronic search was conducted on six databases: 1) SPORTDiscus, EBSCOhost; 2) MEDLINE, EBSCOhost; 3) Sage journals, EBSCOhost; 4) PubMed; 5) ScienceDirect; 6) APA PsycINFO, EBSCOhost. The last date of the search for all databases was 25 April 2022. The databases used covered areas relevant to PL and PA components. A subject librarian and two researchers (SA and LS) developed the search strategy. The search of databases included a combination of keywords and subject headings for intervention research on children, school-based run/walk programmes and PL or PA components based on the Population Intervention Control Outcome (PICO) framework displayed in Table 1.

Inclusion
The study characteristics used to determine eligibility for inclusion were based on the PICO framework (Higgins et al., 2019). The inclusion and exclusion criteria for the systematic review are shown in Table 2. Studies had to satisfy all inclusion criteria to be included in the review; this included assessing at least one outcome in the PL checklist (Shearer et al., 2021). The checklist is a recent tool developed to identify PL qualities in outcome measures. The tool is based on existing PL research and considers the different definitions of PL that have been adopted internationally (Shearer et al., 2021).

Selection and extraction
Two researchers (SA and LS) conducted the data screening independently. All studies returned from the initial searches were screened in two stages in accordance with the review's inclusion and exclusion criteria (Higgins et al., 2019). First, all titles and abstracts were reviewed, and duplicates were removed. After stage one, assessors met to discuss any disagreements. At stage two, full texts were screened. Both researchers reviewed the resources twice for the familiarisation process. All studies and records of the selection process were logged on a shared standardised Microsoft Excel form to reduce selection and publication bias (Higgins et al., 2019). Studies that did not meet the study criteria were removed. Any disagreements between the researchers were resolved by discussion between the two until consensus was reached.
In order to answer the review objectives and facilitate the risk of bias assessments, the basic characteristics were extracted from each study, as recommended by Higgins et al. (2019) and Mengist et al. (2020) and recorded in Microsoft Excel. One reviewer (SA) extracted data from the articles and a second verified the data (LS). The extracted data included year of publication, study type, country, sample size, and intervention characteristics.

Quality and reporting
The PEDro Scale was used to assess the risk of bias in each study included in the review. The tool has been validated and used widely in sports and exercise as a tool for quality assessment (Cashin & McAuley, 2019; Yamato et al., 2017). Three researchers (SA, LW & EE) independently assessed the quality of the studies. The included studies were scored on 11 criteria, and points were awarded when a criterion was satisfied. Studies were scored as "criteria met" (✓) or "criteria not-met" (×). The final quality scores are displayed in Table 3 and detailed criteria scoring is displayed in Table 4. Total scores of between 9 and 11 were considered "excellent", 6 to 8 "good", 4 to 5 "fair" and less than 3 "poor" (Moseley & Pinheiro, 2022). Inter-rater reliability for the PEDro risk of bias indicated strong reliability between assessors (SA, LW k = 0.76) (SA, EE k = 0.92). Any disagreements were resolved via discussion between SA, LW and EE until consensus was reached.

Table 1 (excerpt). Search terms for outcomes.
Physical literacy - Physical outcomes: AND "physical literacy" OR "Physical activity" OR exercise OR "physical fitness" OR sports OR sedentary OR cardiovascular OR activity OR aerobic OR "motor control" OR coordination OR performance
Physical literacy - Affective outcomes: AND "physical literacy" OR "Affective well-being" OR affective OR self-efficacy OR self-confidence OR confidence OR behaviour OR motivation OR Enjoyment OR emotion OR attitude OR belief
Physical literacy - Cognitive outcomes: AND "physical literacy" OR "Cognitive function" OR cognitive OR well-being OR Knowledge OR understanding OR value
Other outcomes - relation to physical activity and public health research: OR Obesity OR obese OR weight OR "weight loss" OR "weight reduction" OR "weight management" OR "weight maintenance" OR BMI OR "body mass index" OR "academic achievement" OR "body composition"
Note: Searches were conducted by combining codes (1,2 and 3A, 1,2 and 3B, 1,2 and 3C, 1,2 and 3D).

Table 2 (excerpt). Inclusion and exclusion criteria.
PL checklist outcomes (Shearer et al., 2021): confidence, motivation, emotional regulation, enjoyment, persistence/resilience/commitment, adaptability, willingness to try new activities, autonomy, self-perception/self-esteem, perceived physical competence, object-control, stability, locomotor, movement skills - land, movement skills - water, moving using equipment, cardiovascular endurance, muscular endurance, co-ordination, flexibility, agility, strength, reaction time, speed, power, rhythmic ability, aesthetic/expressive ability, sequencing, adapt movement strategies to the situation/environment, progression from simple to complex skills, knowledge and understanding of benefits of physical activity, knowledge and understanding of importance of physical activity, knowledge and understanding of effects of physical activity on the body, knowledge and understanding of opportunities to be active, knowledge and understanding of sedentary behaviour, ability to identify and describe movement, creativity and imagination in application of movement, decision making (ability to think, understand and make decisions, knowing how and when to perform), ability to reflect and improve own performance, including setting optimal challenges, knowledge and understanding of tactics, rules and strategy, knowledge and understanding of safety considerations and risk.
Inclusion criteria:
- Be based in a primary school setting
- Adhere to the definition of school-based run/walk programmes as per (Chalkley et al., 2020b; Sherar et al., 2020). This included: walking, jogging, or running a route on school grounds for either a set distance or time. The intervention should take place in addition to PE and throughout the school week or term.
- Include participants aged between 4-16 years (pre-adolescents) in primary school
Exclusion criteria:
- Conference reports or readings, editorial and forewords
- Non-experimental research design, qualitative methods only, process evaluations or protocols
- Interventions relating to medical illness and/or physical disability or specific health conditions
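The quality bands and the inter-rater agreement statistic described in this section can be illustrated with a short sketch. It is not the authors' analysis code: the function names `pedro_band` and `cohens_kappa` and the two rating lists are hypothetical, and the bands are encoded exactly as worded in the text, which leaves a total score of 3 without a band.

```python
from collections import Counter


def pedro_band(score):
    """Map a total quality score (0-11) to the band wording used in the text."""
    if 9 <= score <= 11:
        return "excellent"
    if 6 <= score <= 8:
        return "good"
    if 4 <= score <= 5:
        return "fair"
    if score < 3:
        return "poor"
    # As worded ("less than 3"), a score of exactly 3 falls in no band.
    return "unclassified"


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items categorically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items given the same rating by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions per category.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical criterion ratings (1 = criteria met, 0 = criteria not-met).
rater_1 = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
rater_2 = [1, 1, 0, 1, 0, 1, 0, 0, 0, 1]
print(pedro_band(7), round(cohens_kappa(rater_1, rater_2), 2))
```

With these made-up ratings the two raters agree on 9 of 10 items and chance agreement is 0.5, so kappa works out to 0.8; the values reported in the review come from the authors' own ratings, which are not reproduced here.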

Data synthesis
The outcome measures assessed in the included studies were grouped under the relevant PL domain (physical, affective or cognitive) using the PL checklist developed by Shearer et al. (2021). In line with the study's aims, PA outcomes were also grouped. PA-related outcomes were deemed as any outcome measure that related to, or connected with, participation in PA and exercise, specifically measures that had been used in public health research and did not meet the PL checklist criteria (Biddle et al., 2019; Chaput et al., 2020; Hills et al., 2015). The items were scored as "assessed" (✓), "not-assessed" (×) or "Unclear" (?). Researcher one (SA) categorised each paper and the second researcher (LS) verified the groupings. Meta-analysis was not performed because the interventions varied in study design, outcome measures, assessment tools and study/method quality. Therefore, findings were synthesized narratively and results are presented in outcome measure tables. The included studies were then assessed based on their study aims and outcomes and the relation to PL development.
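The grouping step described above can be sketched as a simple lookup. This is an illustrative sketch only, not the review's procedure: the names `PL_DOMAINS`, `classify` and `domain_grid` are hypothetical, only a handful of checklist items are listed, the assignment of the single cognitive item to the cognitive domain is an assumption, and the "Unclear" (?) score is not modelled.

```python
# A few checklist items per domain, for illustration only (not the full checklist).
PL_DOMAINS = {
    "physical": {"cardiovascular endurance", "muscular endurance", "power", "stability"},
    "affective": {"motivation", "emotional regulation", "self-perception/self-esteem"},
    # Assumed placement of this checklist item in the cognitive domain.
    "cognitive": {"knowledge and understanding of benefits of physical activity"},
}
PA_RELATED = "PA-related"


def classify(outcome):
    """Return the PL domain for an outcome, or 'PA-related' if it is not a listed item."""
    name = outcome.strip().lower()
    for domain, items in PL_DOMAINS.items():
        if name in items:
            return domain
    return PA_RELATED


def domain_grid(outcomes):
    """Score each domain as assessed (✓) or not-assessed (×) for one study."""
    found = {classify(o) for o in outcomes}
    columns = list(PL_DOMAINS) + [PA_RELATED]
    return {c: ("✓" if c in found else "×") for c in columns}


# Hypothetical study entry with one physical, one affective and one PA-related outcome.
example = domain_grid(["Cardiovascular endurance", "Emotional regulation", "BMI"])
print(example)
```

In this sketch an outcome such as general cognitive function falls under PA-related outcomes because it is not a listed checklist item.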

Study selection
A total of 25,780 papers were recorded in the initial search, of which 11,268 duplicates were removed, and 3,244 were removed for not meeting the journal article inclusion criteria. After screening the titles and abstracts of the remaining 11,268 articles, 59 full-text articles were reviewed at screening stage two. Following stage two, 10 studies were deemed eligible for inclusion in the review and the remaining 49 were excluded. Figure 1 displays a full breakdown of the exclusion of studies.
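The screening counts reported above can be checked with simple arithmetic. The sketch below uses only the figures stated in the text; the variable names are hypothetical, and the number excluded at the title and abstract stage is derived here rather than reported.

```python
# Counts as stated in the text.
records_identified = 25_780
duplicates_removed = 11_268
removed_not_journal_article = 3_244
full_texts_reviewed = 59
studies_included = 10

# Records left for title and abstract screening after the two removal steps.
titles_abstracts_screened = (
    records_identified - duplicates_removed - removed_not_journal_article
)

# Derived (not reported in the text): records excluded at title/abstract stage.
excluded_title_abstract = titles_abstracts_screened - full_texts_reviewed

# Full texts excluded at stage two.
full_texts_excluded = full_texts_reviewed - studies_included

print(titles_abstracts_screened, excluded_title_abstract, full_texts_excluded)
```

The number of records screened matches the 11,268 stated in the text and the stage-two exclusions match the 49 reported, so the flow is internally consistent.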

Characteristics and quality
For each study, the use of control, sample size, study type, quality tool and score are presented in Table 3, and Table 4 includes a breakdown of all quality criteria and scored outcomes. The intervention characteristics extracted from each study include the study location, intervention type and intervention characteristics (length of study, frequency of completion and duration of intervention), with data presented in Table 5.

Design
Five different intervention types were included in the review (Table 5). Three studies implemented TDM as described on the TDM website, and three implemented TDM with some form of variation from its initial protocol. Breheny et al. (2020) permitted teachers to adapt TDM in ways they thought would be "motivational" for the pupils. This could include integrating maths classes or using reward tools. Two studies had an intervention group (IG) and an intervention-plus group (IPG) who performed a modified intervention. In De Jonge et al. (2020), the IG performed TDM as usual, and the IPG performed TDM and received additional teacher support. Teacher support included visits to the school within the first 2 weeks of implementation, regular contact through WhatsApp such as weather reports and motivational support, and discussions every 3 weeks between teachers and support staff on the potential barriers and issues with TDM (De Jonge et al., 2020). Brustio et al. (2020) varied the frequency between intervention groups. Subgroups were as follows: the 2_Times subgroup (IG2), which performed TDM less than 2.5 times per week, and the 3_Times subgroup (IG3), which performed TDM more than 2.5 times per week. One intervention, reportedly "inspired by" TDM (Brustio et al., 2018), was a walking intervention in which pupils walked 1 km along a marked school path for approximately 10 min. Similarly, Booth et al. (2020) assessed the impact of 15 min bouts of self-paced activity, such as TDM, compared to more intense running schemes. The TDM protocol was neither explicitly mentioned nor adhered to in Booth et al. (2020), but the interventions shared similar qualities. Mønness and Sjølie (2009) implemented 20 min of daily walking and was the only study to note that the intervention took place across varied terrains; the course involved "varied gradient, steepness, climbing and balancing". Garnett et al. (2017) investigated the intervention "Move it! Move it!" 
in which pupils, their families and school teachers voluntarily attended a morning run/walk

Booth et al. (2020) was the only study to record immediate pre- and post-intervention effects after one performance of the intervention. All other interventions lasted from 3 to 12 months. All studies stated the time of year of implementation and data collection; however, no studies investigated the potential seasonal impact on adherence or affect. Two studies conducted pre-, mid- and post-assessments (Breheny et al., 2020; Brustio et al., 2020). The frequency of participation ranged from one completion (Booth et al., 2020) up to five times per week, with durations ranging from 10 to 25 min (Breheny et al., 2020; Brustio et al., 2018, 2020; Chesham et al., 2018; Marchant et al., 2020; Mønness & Sjølie, 2009). Mønness and Sjølie (2009) was the only study to report that a replacement activity took place on missed intervention days. The intervention did not take place on three school days due to bad weather, when it was replaced with an indoor PA.

Training
Five of the 10 studies provided some form of basic training or introduction to the intervention (Breheny et al., 2020; Brustio et al., 2019, 2020; Chesham et al., 2018; De Jonge et al., 2020). The training tools used included leaflets (Brustio et al., 2019; Chesham et al., 2018) or guidance to online information (TDM website; Breheny et al., 2020; Brustio et al., 2020), staff or public meetings (Brustio et al., 2020), and welcome packs (De Jonge et al., 2020). The welcome packs provided in De Jonge et al. (2020) included a "how-to" poster, temporary tattoos, flyers for parents, intervention instruction manuals for teachers, and a calendar that can be used to track participation. One study mentioned that no specific intervention training was required due to the simplicity of the intervention design (Brustio et al., 2018). The two remaining studies either did not mention training tools or did not specify intervention-specific training (Garnett et al., 2017); in one case, general study participation information was provided (Brustio et al., 2018).

Table 5 (excerpt). Location: UK. Intervention type: TDM*. Length: 12 months. Frequency: 5×/week. Duration: ~15 min/1 mile distance. Description: 15 min self-selected pace walk/jog/run around school play area. Delivered by class teachers as a break from lessons, not to replace PE or any PA in the day. Teachers were also allowed to adapt the implementation using motivational material. Training: General TDM information and guidance to official website.

Physical literacy-related outcomes
According to the Shearer et al. (2021) checklist, no studies explored all three domains of PL; all studies explored either the physical or affective domains, and no studies explored the cognitive domain (Table 6). No unified PL method of assessment was identified; all methods of assessment were distinct, and no studies aimed to measure or track PL.

Physical domain
Cardiovascular endurance (CE) was the most frequently explored area in the physical domain (n = 7; Table 6). The methods of assessment varied across the majority of studies. However, all focused on the outcome result (e.g., distance run) and not competence scoring (e.g., technique). The tools included a British Athletics Linear track test (Breheny et al., 2020), step test (Mønness & Sjølie, 2009), 6-min run test (Brustio et al., 2019, 2020), and shuttle run tests (Chesham et al., 2018; De Jonge et al., 2020; Marchant et al., 2020). The magnitude of CE change was inconsistent across the findings, although all studies reported some form of positive difference from baseline, as shown in Table 7. In total, four studies found significant differences in CE (Brustio et al., 2020; Chesham et al., 2018; De Jonge et al., 2020; Mønness & Sjølie, 2009). Interestingly, three of the studies used control groups (CG) and performed TDM intervention, the third ( Similarly, Brustio et al. (2019) found that IG increased between baseline and 3-month follow-up (estimated difference (ED) 25.15 m, standard error (SE) 6.39 m, p < 0.001, percentage change (PC) = 3.1%) compared to CG (ED 4.44 m, SE = 6.69 m, p = 0.911, PC = 0.5%). After adjusting for age and sex, De Jonge et al. (2020) found a significant intervention effect on shuttle run tests in favour of both intervention groups (IG 1.1 stages, 95% confidence interval (CI) 0.8 to 1.5; IPG 0.6, 95% CI 0.5 to 1.0). Breheny et al. (2020) was the only study to conduct a mid-point follow-up assessment at 4 months and a post-intervention assessment at 12 months, and had the longest intervention duration. The study identified small differences in CE but in favour of the CG at both time points (4 months mean difference (MD) 5.96, 95% CI 113.81 to 9.94, p = 0.436) (12 months MD −65.61, 95% CI 113.81 to 17.21, p = 0.048). 
Overall improvements were observed in CE results in both groups, but these were statistically non-significant, although there was a large amount of missing data reported. Marchant et al. (2020) reported no differences but there were seasonal differences in data sets.

Affective domain
In total, four studies explored the affective domain of PL (Booth et al., 2020; Breheny et al., 2020; Brustio et al., 2018; Garnett et al., 2017). Table 6 shows the areas assessed, and Table 7 shows the outcomes of the affective domain. Small positive effects were found in emotional regulation. Breheny et al. (2020) used Self-Reported Quality of Life and Well-being (Child Health Utility Dimension) and Child Well-Being (Middle Years Development instrument) tools to assess the outcome. The authors found small non-significant differences between groups in favour of the IG for quality of life (MD 0.003, 95% CI −0.05 to 0.05, p = 0.894) and well-being (MD 1.90, 95% CI −3.07 to 6.87, p = 0.499) at 12 months. Brustio et al. (2018) investigated motivation using The Participant Observation Questionnaire. Overall, the study found that participating in the intervention could positively influence motivation orientations towards PA participation. After controlling for age, Brustio et al. (2018) observed significant interactions between group and time in social status (F(1, 237) = 4.852, p = 0.028), team (F(1, 273) = 6.015, p = 0.015) and energy release (F(1, 273) = 8.527, p = 0.038). Specifically, a significant decrease was observed in social status and increases in team and energy release in IG. For CG, an increase was observed in social status and a decrease in team and energy release. Booth et al. (2020) observed the impact of different classroom break activities on cognition and well-being. The adapted Children's Feeling Scale and Felt Arousal Scale were used to assess self-perception/self-esteem, and results for the two measures were recorded as "affect" and "alertness". Statistically significant correlations were observed between change in alertness and affect associated with all physical activities performed. This included 15 min self-paced activity (SPA), CG and bleep test. 
Specifically, statistically significant differences were identified between SPA and CG for affect and alertness in linear mixed model regression analysis for unadjusted data (affect MD 0.21 ± 0.07, 95% CI 0.05 to 0.37, p = 0.006, ES 0.06; alertness MD 0.32 ± 0.04, 95% CI 0.22 to 0.41, p = 0.000, ES 0.15) and fully adjusted models (affect MD 0.21 ± 0.07, 95% CI 0.05 to 0.38, p = 0.005, ES 0.06; alertness MD 0.31 ± 0.04, 95% CI 0.22 to 0.41, p = 0.00, ES 0.15), although effect sizes were small. Similarly, statistically significant differences in change scores were also observed in IG for affect and alertness in unadjusted (affect MD 0.28 ± 0.07, 95% CI 0.11 to 0.44, p = 0.000, ES 0.07; alertness MD 0.19 ± 0.04, 95% CI 0.10 to 0.28, p = 0.000, ES 0.08) and fully adjusted scores (affect MD 0.27 ± 0.07, 95% CI 0.10 to 0.44, p = 0.001, ES 0.07; alertness MD 0.19 ± 0.04, 95% CI 0.10 to 0.28, p = 0.001, ES 0.07). Interestingly, no difference in change scores was observed in affect between the bleep test group and CG. CG alertness scores were significantly lower than those of the bleep test group. Garnett et al. (2017) also explored self-perception/self-esteem through the self-regulated learning tool. However, the study found no statistically significant positive correlation between self-esteem and miles run (r = 0.6, p = 0.46).

BMI
For BMI outcomes, all studies reported non-significant differences between groups following intervention participation. Brustio et al. (2019) noted no significant difference between IG and CG in BMI (p > 0.05) at 3 months. The authors did note a −0.6% change between IG baseline (M 17.5 kg·m⁻², 95% CI 17.3 to 17.7) and post-test results (M 17.4, 95% CI 17.2 to 17.6), although the same change was also recorded between CG baseline (M 17.3, 95% CI 17.2 to 17.7) and post-test results (M 17.3, 95% CI 17.0 to 17.6). After correcting for age and gender, no significant group × time interactions were observed for BMI (F(1, 793) = 0.792, p = 0.374). At 12 months, Breheny et al. (2020) recorded a small difference in favour of IG compared to CG, but this was not statistically significant (MD −0.036, 95% CI −0.085 to 0.013, p = 0.146). Brustio et al. (2020) observed no difference in BMI and the waist-to-height ratio at any time point. Neither group × gender × time nor group × time interactions were significant for BMI (F = 1.393, partial η² = 0.005, p = 0.234 and F = 1.280, partial η² = 0.004, p = 0.275, respectively). Garnett et al. (2017) used pre-existing BMI scores from school records and the authors reported a non-significant negative correlation between miles run and BMI (r = −0.07, p = 0.39).

Booth et al. (2020) assessed cognitive function through three computer tasks: inhibition using an adapted stop-signal task, visual-spatial working memory using an adapted static box task and verbal working memory using a reading span task. The study found significant improvements with small effects in all measures after one performance of a self-paced activity compared to CG in unadjusted and fully adjusted models (ES 0.04-0.17, p < 0.05). Brustio et al. (2019) and Brustio et al. (2020)

Discussion
This is the first review to examine the current research on school-based run/walk programmes and their potential impact on PL and PA. Ten articles were identified, and results showed limited exploration of all domains of PL. No studies attempted to chart overall PL progress, but using the Shearer et al. (2021) checklist, it was possible to investigate individual domains and group the outcomes assessed within these. The results of the review suggest that participating in run/walk interventions contributes to improved performance in components of the physical (CE, power, stability and muscular endurance) and affective domains (motivation and emotional regulation) as well as some PA-related outcomes (MVPA, body composition and cognitive function) but no studies investigated all three PL domains nor was the cognitive domain explored at all. The limited exploration suggests missed opportunities to identify intervention functions that optimise PL development.
In line with existing research, the outcomes assessed most commonly met the criteria for the physical domain of PL. Cornish et al. (2020) and Edwards et al. (2018) both found frequent exploration of the physical domain and a lack of investigation within the remaining areas, particularly the cognitive domain. Whilst there has been some exploration into the impacts of TDM on cognitive function (Booth et al., 2020; Hatch et al., 2021; Morris et al., 2019), these studies did not meet the inclusion criteria for the PL outcomes in the review as they focused solely on cognitive development rather than the knowledge and understanding of participation that is addressed in the Shearer et al. (2021) checklist and PL definition. This measure is often under-investigated in research but is equally important in understanding PL and PA participation, so should be considered in future research (Cornish et al., 2020; Edwards et al., 2018). According to Whitehead (2001, 2013), the domains of PL are equal and should not be separated. Recently, research aiming to measure PL has shown that domains are often separated and assessed as individual constructs (Cornish et al., 2020; Edwards et al., 2018). One suggestion is that the strong focus on the physical domain (PA, CE and BMI), rather than a holistic PL approach, in current investigations is a result of research being driven by a sport, rather than a health, perspective (Cornish et al., 2020; Edwards et al., 2018). The lack of assessment is also reflected in the measurement tools available that aim to assess PL as a whole, as these are often also focused on physical outcomes (Shearer et al., 2021).
There is not yet a recognised standardised assessment that measures PL within young children. A recent review by Shearer et al. (2021) investigated the current measures of the PL domains (physical, affective and cognitive) and highlighted that there are still only three assessment tools that explicitly aim to measure all elements of PL: the Canadian Assessment of Physical Literacy (CAPL), the Physical Literacy Assessment for Youth (PLAY tools) and Passport for Life (PFL), although these are not all internationally recognised. The lack of uptake for these tools could be due to the considerable debate surrounding the assessment of PL (Jean de Dieu & Zhou, 2021; Longmuir et al., 2015). With PL being a multifaceted concept, some believe that "assessing" PL is not an appropriate reflection of the concept, and Whitehead et al. (2013) instead suggested "charting" PL progression as a more suitable approach. Given that PL is an individualised journey, charting individual progression could be considered a more appropriate approach than comparison to norms, enabling research to capture the individualised experiences that embody PL rather than producing an "end result" (Whitehead et al., 2013). Many other PL tools exist but do not adopt the multifaceted nature of PL and tend to favour certain domains; research has shown that these measures tend to focus on motor skills or fundamental sports skills (physical domain) (Jean de Dieu & Zhou, 2021; Longmuir et al., 2015). The lack of clarity surrounding a unified definition of the concept, and the charting of PL, has led to different interpretations and measurements of PL. This uncertainty could explain the imbalance in this review's exploration of the concept and its domains and the limited uptake of interventions looking to investigate the impacts on PL.
Conclusions could not be drawn on the concept of PL due to the variation in domain assessment featured in this review, but promising results were identified for physical and affective related outcomes.
Firstly, all studies within this review reported positive findings for outcome measures under the physical domain, but not all of these findings were statistically significant. The physical domain was most commonly assessed utilising CE (n = 7) rather than motor skills such as coordination, or locomotor or object control competence, which are listed within the Shearer et al. (2021) checklist and noted as equally important elements when capturing the domain (Edwards et al., 2017). Four studies reported significant positive changes in CE following completion of TDM (Brustio et al., 2020; Chesham et al., 2018; De Jonge et al., 2020) or a walking intervention (Mønness & Sjølie, 2009), and three reported beneficial but non-significant changes (Breheny et al., 2020; Brustio et al., 2019b; Marchant et al., 2020), but none reported negative associations. The studies that reported significant findings lasted no longer than 6 months: either 3 months (Brustio et al., 2019; De Jonge et al., 2020) or 6 months (Brustio et al., 2020; Chesham et al., 2018; Mønness & Sjølie, 2009). The only study to track CE over a longer period of 12 months reported non-significant improvements (Breheny et al., 2020). The non-significant change over this longer period may suggest that participation intensity could decline over a year, leading to less impact on CE. Although it is not possible to draw conclusions at this stage, future research may benefit from comparing CE over longer periods (6-12 months) and also assessing fidelity in participation to determine whether there is a decline in performance intensity that may influence CE outcomes.
Braaksma et al. (2018) recommended that PA interventions should last a minimum of 6 weeks and be performed three or four times per week to achieve cardiovascular fitness benefits. All studies that reported significant improvements in CE also reported a weekly frequency similar to that recommended by Braaksma et al. (2018) (three or four times per week), although only one study compared intervention frequencies (Brustio et al., 2020). The authors concluded that performing TDM more than 2.5 times per week (IG3) was more beneficial for CE than performing TDM 2 times per week (Brustio et al., 2020). Physical fitness is generally seen as a stable trait of PA in young children (Chen et al., 2018; Raistenskis et al., 2016), and CE is found to have positive associations with cognitive and academic performance (Marques et al., 2018; Ruiz-Ariza et al., 2017) and strong associations with PL (CAPL; Lang et al., 2018). Interventions that can improve physical fitness and its components (CE) at a young age are considered crucial in reducing the risks of cardiovascular diseases and other factors like poor mental health and chronic pain that are associated with low CE (Rodrigues et al., 2013). These review findings, taken in combination with intervention research, are promising in promoting this type of school-based intervention and its potential benefits on health-related outcomes associated with CE improvements. Future research should investigate the connection between CE and PL in run/walk programmes specifically and in wider populations (Lang et al., 2018). This research may help to inform current and future policy and school guidance that currently focuses on encouraging PL development and PA through school-based interventions.
With many school-based run/walk programmes originating as public health initiatives, they are often driven by PA and obesity rather than PL, which is a more novel concept. The public health-driven perspective could explain the lack of exploration of physical elements like motor skills and the more dominant investigation of PA-focused measures like CE and BMI, which were two of the most commonly measured outcomes in this review. With the interventions also focusing only on locomotor movements (walk/jog/run), the lack of investigation into other outcome measures like fundamental movement skills (FMS) and coordination is expected. However, these are equally important components that contribute to PA participation and other health-related outcomes, so should not be overlooked (Brusseau et al., 2020). The development of FMS is considered vital in the refinement of more specific motor patterns for young children, and run/walk programmes can provide children with the opportunity to freely practise their skills within a supportive school environment (Sherar et al., 2020). Therefore, it may be of benefit for research to consider broader physical-related outcomes like FMS in these settings that could contribute to wider skill development in children.
In affective measures, there were significant improvements in components of motivation following a walking intervention (n = 1; Brustio et al., 2018) and in self-efficacy/self-esteem after one performance of SPA (n = 1; Booth et al., 2020). Specifically, motivational benefits in terms of "social status" and "team" were identified, which is promising given that existing research demonstrates that social support from friends is important in developing autonomous motivation, which is positively associated with PA participation (K. B. Owen et al., 2017). Similar qualitative studies such as Chalkley et al. (2020a) reported positive pupil experience after participating in the MK programme. The study found that autonomy to participate, perceived benefits, and a supportive school environment facilitated pupils' enjoyment of MK. There was limited evidence available in relation to the affective domain in this review, with only four studies measuring affective-related outcomes (motivation: Brustio et al., 2018; self-efficacy/self-esteem: Booth et al., 2020, and Garnett et al., 2017; emotional regulation: Breheny et al., 2020). Nevertheless, these study findings, alongside similar qualitative research, do provide positive insights that address research concerns surrounding run/walk programmes and pupils' experiences, such as potential boredom. However, effects were small and most often non-significant, so larger-scale research is needed. It is recommended that research also looks to clarify any causality between outcome measures and intervention methodologies in order to draw firm conclusions on run/walk programmes and affective outcomes (Dale et al., 2019; Liu et al., 2015).
In terms of PA-related outcomes, there was a focus on weight-related measures (Breheny et al., 2020; Brustio et al., 2019, 2020; Chesham et al., 2018; Garnett et al., 2017) and little or no exploration of other outcomes like daily PA, cognitive function or physical and mental fitness. All five studies assessed either body composition (Chesham et al., 2018), BMI (z; Breheny et al., 2020; Brustio et al., 2019, 2020; Chesham et al., 2018; Garnett et al., 2017) or waist-to-height ratio (Brustio et al., 2019, 2020). As previously mentioned, many school-based run/walk programmes have developed as public health initiatives where the focus is on reducing obesity and increasing PA participation, so this is not unexpected. Nevertheless, identifying other activity outcomes like daily PA and sedentary behaviours could be important contributors to understanding the long-term impact of intervention participation (Sherar et al., 2020). There was also limited evidence in this review to support the impact of school-based run/walk programmes on BMI despite its popularity in investigations, and no studies reported any significant changes in BMI over three (Brustio et al., 2019), six (Brustio et al., 2020; Chesham et al., 2018) or 12 months (Breheny et al., 2020). Collectively, these findings and existing reviews (Mei et al., 2016; Waters et al., 2011) suggest that BMI may only be reduced through multi-structured longitudinal interventions, although further research is needed (Demetriou & Höner, 2012; Jacob et al., 2021; Mei et al., 2016). It appears that self-selected pace programmes (Booth et al., 2020; Chesham et al., 2018) are effective at improving cognitive function, daily PA, and body composition over time, all of which are promising for contribution to PA participation. Measures of PA outcomes overall were limited within the review so it is not possible to draw conclusions on run/walk programmes in general.
Future research may benefit from evaluating the contribution that run/walk programmes, including varied implementations (self-selected pace, etc.), have on these measures.
TDM was the most commonly investigated intervention: three studies performed TDM in accordance with the website (Brustio et al., 2019; Chesham et al., 2018; Marchant et al., 2020) and a further three performed TDM with some variation (Breheny et al., 2020; Brustio et al., 2020; De Jonge et al., 2020). The remaining studies all shared similar characteristics to TDM, including being adaptations of the intervention (Brustio et al., 2018) and having similar qualities, such as frequency (Mønness & Sjølie, 2009), time, distance (Booth et al., 2020) and self-selected pace (Garnett et al., 2017). One reason for the success of TDM could be the pace of the intervention. All studies that used TDM focused on a "self-selected" pace (n = 6), whilst some of the remaining interventions (n = 2) in this review focused on a prescribed pace, such as walking only or jogging and running only. Research shows that PA that provides children with choice can promote autonomy (Roemmich et al., 2012; Teixeira et al., 2012). The choice of self-selected pace in TDM could promote autonomy in children and, in turn, benefit intrinsic motivation and PA participation. Therefore, a specific focus on self-selected pace programmes that are integrated into curricula would be beneficial.
The included studies that focused on "self-selected pace" reported more favourable findings than other interventions when the same outcomes were assessed. However, there was inconsistency within the findings for the TDM intervention, which could be due to the quality of the available data. Only one study using TDM scored "good" (Breheny et al., 2020), and all other studies that reported more favourable findings in CE scored either "fair" or "poor" (Brustio et al., 2019, 2020; Chesham et al., 2018; De Jonge et al., 2020). Interestingly, the study of higher quality also reported smaller intervention effects on CE compared to studies that used similar methodology. Most of the studies (n = 8) in this review were categorised as "fair" or lower for quality scoring (scored <5; see Table 3).
Interventions implemented within school settings face certain challenges that can negatively impact upon quality scoring. Firstly, it is not possible for participants or intervention administrators to be blinded, and allocation is often difficult to conceal. For example, it may not be possible for randomised interventions to take place within the same school due to the chance of crossover effects and contamination, as other pupils may observe the intervention and copy it. However, it is possible to blind the assessors to the groupings and limit the risk of bias that can be caused by knowledge of groups during data collection and analysis (Forbes, 2013), yet only one study within the review was noted to have done this (Breheny et al., 2020). It is recognised that blinding may be a barrier to this type of intervention study. However, it is recommended that future research blinds aspects of study design where possible, such as study assessors.
Only one study implemented randomised groups (Breheny et al., 2020), and all other study groupings were predetermined (e.g., schools chose to take part in the intervention or not). Without a CG, research is unable to discount the potential effects of confounding variables and determine the extent to which findings can be attributed to intervention participation (Polgar & Thomas, 2013). Often, within the recruitment for intervention studies, schools opt to be either experimental or control groups, so research teams are unable to randomise the allocations. There are also potential ethical concerns that need to be considered (Polgar & Thomas, 2013). Specifically, Mønness and Sjølie (2009) noted that it would be unethical to split classes from the same school and that it was instead more suitable to control for age effects by using growth curves to estimate natural improvement. Randomisation within one school could also lead to crossover effects between CG and IG that would be difficult to manage. Almost all studies in this review were also based within one school, so randomisation may not have been appropriate.
According to the quality scores in Table 4, only three studies reported that groups were similar at baseline regarding important prognostic indicators (Booth et al., 2020;Breheny et al., 2020;Brustio et al., 2018). Without controlling for baseline differences, any reported impact of participation could be misleading. Often participants were treated as one group regardless of initial baseline scores for variables such as fitness or BMI and the reported measures were based on participant mean scores. This could lead to interpretation bias within the findings as the participant response to treatment may vary based on the initial baseline findings.
Given the nature of the included interventions, participants' self-selected pace to complete their distance could also vary the treatment response experienced. The lack of clarity around treatment integrity could lead to bias within the interpretations of the findings. For example, schools could implement the intervention with a focus on running rather than self-selected jog, run or walk, which would influence the extent of impact on participant-reported measures like fitness. In order to understand the true extent of the "self-selected" nature of the intervention and differences within or between participant groups, it is recommended that future research is clear on the interpretation of baseline groups and intervention integrity. These suggestions are in line with similar reviews where it was stated that clarity on intervention implementation and integrity is needed before firm conclusions can be drawn on outcomes (Love et al., 2019).
Finally, many studies also reported missingness in follow-up data, which could indicate issues with adherence to the intervention or with research design. Often this can be down to participants leaving schools or not being present on research days; however, large values were reported within this review, with missingness ranging from 39.5% (Brustio et al., 2019) up to 56% (Breheny et al., 2020). One reason for the large values recorded could be poor compliance with the intervention. Breheny et al. (2020) performed multiple imputation to complete the data set, and no significant difference was observed between the imputed and complete case analyses. However, no other studies within the review reported data imputation to reduce the risk of bias; it is recommended that studies report both complete and imputed case analyses. The present review is unable to determine if the missingness of data is related to intervention adherence. Therefore, future research would benefit from the inclusion of process evaluations in addition to intervention studies.
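As a minimal illustration of the recommendation to report both complete case and imputed analyses, the sketch below computes the proportion of missing follow-up data and compares a complete case estimate of mean change with a pooled estimate from repeated random imputation. All data values and function names are hypothetical assumptions made for illustration. The crude resampling of observed changes stands in for a proper multiple imputation model and is not the procedure used by Breheny et al. (2020) or any other included study.

```python
import random
from statistics import mean

# Hypothetical (baseline, follow_up) fitness scores; None marks a missing follow-up.
records = [(20, 24), (18, None), (25, 27), (22, None),
           (30, 33), (19, 22), (27, None), (21, 25)]


def missingness_rate(rows):
    """Proportion of participants without a follow-up measurement."""
    return sum(1 for _, follow in rows if follow is None) / len(rows)


def complete_case_change(rows):
    """Mean change using only participants with both measurements."""
    return mean(follow - base for base, follow in rows if follow is not None)


def pooled_imputed_change(rows, n_imputations=20, seed=0):
    """Mean change pooled over several crudely imputed data sets.

    Assumption: each missing change is drawn at random from the observed
    changes, and the per-data-set means are averaged to give the pooled
    point estimate. A real analysis would use a proper imputation model.
    """
    rng = random.Random(seed)
    observed = [follow - base for base, follow in rows if follow is not None]
    estimates = []
    for _ in range(n_imputations):
        changes = [
            (follow - base) if follow is not None else rng.choice(observed)
            for base, follow in rows
        ]
        estimates.append(mean(changes))
    return mean(estimates)


if __name__ == "__main__":
    print("Missingness:", missingness_rate(records))
    print("Complete case mean change:", complete_case_change(records))
    print("Pooled imputed mean change:", pooled_imputed_change(records))
```

Reporting the two estimates side by side lets readers judge how sensitive a finding is to the handling of missing follow-up data. Because this simple resampling assumes that data are missing completely at random, it cannot correct for dropout that is related to intervention adherence.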

Limitations
The findings of the review were limited to the search terms and strategy conducted. The review was restricted to quantitative studies; therefore, other types of studies (qualitative, process evaluations, etc.) and grey literature exploring PL, PA or this intervention type were not included in the review. Qualitative studies were excluded for not meeting the criteria and/or checklist included in the review. Often the methodologies of qualitative studies were limited and firm conclusions could not be drawn with regard to intervention implementation or the outcomes explicitly assessed. Qualitative studies and forms of process evaluation may, however, have captured psychological elements, PA and other domains of PL from a holistic perspective and should be considered. It is recommended that future research reports detailed intervention methodology to prevent exclusion from other PL-related reviews. During the writing of the review, a consensus statement on PL in England was started, which may result in a new definition of PL and/or approach in research. To date, there is no "gold standard" for the definition of PL or for methods of monitoring or charting PL. It is recognised that this may have limited the inclusion of studies in the review.
Intervention adherence was not explicitly reported, nor did it feature in the inclusion criteria of the review. Variation in intervention adherence, such as the number of times per week the intervention was completed, could contribute to the variance in results between studies included in this review. It is recommended that future intervention studies report fidelity measures where possible.

Conclusion
The present study is the first known review to offer an insight into run/walk programmes and their implementation in school settings. Participation in run/walk programmes showed promising benefits for the physical and affective domains of PL, including CE, motivation and self-perception/self-esteem. However, no studies assessed PL as a whole, nor was the cognitive domain of PL explored. Positive findings were also reported for PA-related outcomes such as daily PA, waist-to-height ratio and cognitive function. TDM was the most commonly implemented intervention, and studies in which pupils participated on average 3 times per week for a minimum of 3 months showed positive PL- or PA-related outcomes. The findings of this review can be used to support current and future policy recommendations on the implementation of run/walk programmes and their contribution to potential PL development. It is recommended that further research considers all domains of PL and methods to "chart" PL progress, particularly over longer periods of time, in order to provide a detailed account of progression.