Normative values and changes in range of motion, strength, and functional performance over 1 year in adolescent female football players: Data from 418 players in the Karolinska football Injury Cohort study

Objective: To study normative values of range of motion (ROM), strength, and functional performance and investigate changes over 1 year in adolescent female football players. Design: Cross-sectional. Participants: 418 adolescent female football players aged 12–17 years. Main outcome measures: The physical characteristic assessments included (1) ROM assessment of the trunk, hips, and ankles; (2) strength measures (maximal isometric and eccentric strength for the trunk, hips, and knees, and strength endurance for the neck, back, trunk and calves), and (3) functional performance (the one-leg long box jump test and the square hop test). Results: Older players were stronger, but not when normalized to body weight. Only small differences in ROM regarding age were found. ROM increased over 1 year in most measurements with the largest change in hip external rotation, which increased by 6–7° (Cohen's d = 0.83–0.87). Hip (d = 0.28–1.07) and knee (d = 0.38–0.53) muscle strength and the square hop test (d = 0.71–0.99) improved over 1 year. Conclusions: Normative values for ROM and strength assessments of neck, back, trunk, hips, knees, calves and ankles are presented for adolescent female football players. Generally, fluctuations in ROM were small with little clinical meaning, whereas strength improved over 1 year. © 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Football is the most popular female sport in the world with more than 13 million players in organized clubs, and there are more than 3 million players in the age group under 18 years (FIFA, 2019). Female adolescent football players have a high risk of injury and most injuries are located in the lower limbs followed by the trunk and upper limbs (Robles-Palazón et al., 2021). Screening tests to identify players who are at high risk for injury and, in turn, guide injury prevention measures, often involve a combination of clinical measures, functional performance tests, and patient-reported outcomes (G.J. Davies, McCarty, Provencher, & Manske, 2017). Screening tests are also used to measure performance in athletes (Bishop, Read, McCubbine, & Turner, 2021), for progression during rehabilitation and for return to sport decisions (van Melick et al., 2020). Therefore, it is important as a researcher, clinician, or coach to have appropriate normative reference values for different defined populations (e.g., according to sex, age, or sport) to make it possible to assess and evaluate normal and abnormal values when screening athletes (Risberg et al., 2018; Sankar, Laird, & Baldwin, 2012). Normative values can also be used as a comparison tool for primary care physicians and other professions, to set rehabilitation goals, and for research. It is also important to know how these values may change over time. However, it is important to be specific when reporting normative values since data may differ with regard to, e.g., sex, age, body weight and sport (Harbo, Brincks, & Andersen, 2012; Onate et al., 2018).
To our knowledge, there are no studies providing normative data, including potential changes over a longer time, for female adolescent football players on range of motion (ROM), strength measures and functional performance. Therefore, the aim of this study was to investigate and establish normative values in different screening tests and to investigate if these values change over 1 year in adolescent female football players.

Methods
Details about the study design, definitions and data collection procedures in the Karolinska football Injury Cohort (KIC) study are reported elsewhere (Tranaeus et al., 2022), and are therefore only briefly described here.

Design
This study reports cross-sectional baseline and follow-up data.

Participants
Twenty-eight local Swedish football teams from a metropolitan area with adolescent female football players aged 12–19 years from the two highest divisions were asked to participate in the study. One team and 4 players from different teams declined to participate; a total of 418 players from 27 teams were included and performed the tests at baseline. In addition, the first 106 players included from 11 teams were re-tested after 1 year. All players and their parents or legal guardians (players <15 years) received oral and written information about the study and signed written consent. The study was approved by the Swedish Ethical Review Authority (Dnr 2016/1251-31/4).

Procedures
The tests were conducted in indoor facilities during weekends at different times in the football season. Before the testing session, players completed a standardized 7-min warm-up programme comprising 4 min of jogging, 10 squats, 10 squat jumps and 10 unilateral lunges. The tests took approximately 60 min per player to complete. The tests included ROM assessment (of the trunk, hips, and ankles), maximal strength measures (isometric and eccentric for the hips, isometric for the trunk and knees, and endurance for neck, back, trunk, and calves) and functional performance with the one-leg long box jump (OLLBJ) test (G.J. Davies et al., 2017; van Melick et al., 2020), and the modified square hop test (Caffrey, Docherty, Schrader, & Klossner, 2009; Gustavsson et al., 2006). All tests are described in detail in Table 1. Intrarater and interrater reliability and minimal detectable change (MDC) were also calculated for all the tests and are provided in detail in the Supplementary Material.

Statistical analyses
Descriptive statistics are presented as means ± standard deviation (SD) or 95% confidence interval (CI) for the total cohort (n = 418), divided into age groups: 12 years (n = 97, 23%), 13 years (n = 157, 38%), 14 years (n = 91, 22%), 15–17 years (n = 73, 17%) and for the sub-cohort followed for 1 year (n = 106). The sub-cohort included players aged 12 years (n = 19, 18%), 13 years (n = 21, 20%), 14 years (n = 39, 37%), 15–17 years (n = 27, 25%) at baseline. Normality and homogeneity of variance were evaluated for continuous data. The mean value for the 3 different measures was used for ROM and maximum values were used for the strength measurements. Values are reported separately for the dominant (preferred kicking leg) and non-dominant leg. The strength measurements were also normalized to body weight and reported in Newton/kg. Paired-sample t tests, Wilcoxon's test (OLLBJ test) and McNemar's test (on the number that completed the cranio-cervical flexion test) were used to compare differences between baseline and the 1-year follow-up. Changes from baseline to follow-up are reported as mean percentage values. Effect sizes (ESs) are presented as Cohen's d or odds ratios (ORs), where d = 0.2 indicates a small effect, d = 0.5 is a medium effect, and d = 0.8 is a large effect. Intrarater and interrater reliability were calculated for the tests with intraclass correlation (ICC) and Cohen's kappa (for the cranio-cervical flexion test) to detect the conformity of the intrarater and interrater measurements. The standard error of measurement (SEM) for each ICC estimate was calculated as SD × √(1 − ICC). The SEM was used to calculate the minimal detectable change with the

Outcome measures Description and scoring
Range of motion (ROM)
Trunk
Active trunk rotation ROM was measured in a modified seated rotation test and in a half-kneeling (lunge position) rotation test on a gym mat graded in 5-degree increments. The player was instructed to rotate maximally, alternating between right and left: (1) in a cross-legged position and (2) in a lunge position on the dominant and non-dominant leg, with the rotational degrees measured at the end range. In the 3 separate positions, 3 repetitions were performed in each direction. The mean value for each position was used for analysis.
Hip
Passive hip ROM was measured with a universal goniometer in the supine position for (1) flexion and (2) abduction, and in the prone position for (3) extension, (4) internal rotation and (5) external rotation. The endpoint for measuring ROM was defined as the point at which a firm end feel was achieved, indicated by motion of the pelvis. Three consecutive measurements for each position were performed for both the dominant and the non-dominant leg. The mean value for each position was used for analyses.
Foot
Weight-bearing lunge ankle dorsal flexion (DF) ROM was measured with the player's foot placed on a metric ruler 10 cm away from a wall. The player was instructed to lunge forward until contact with the wall was achieved without allowing the heel to lift off the ground. Three warm-up trials were performed to familiarize the player with the test before 3 trials were measured. The maximal DF ROM was measured in degrees with a digital inclinometer (Clinometer, Plaincode, Stephanskirchen, Germany) and the distance from the wall to the great toe was measured in centimetres. The mean value of the 3 measured trials was used for the analysis.
Strength measures: endurance
Neck
Deep neck flexor muscle endurance was assessed through a modified version of the cranio-cervical flexion test with a pressure sensor (Stabilizer Pressure Bio-Feedback, Chattanooga Group Inc, Hixon, TN).
The test consists of a pre-test and an endurance test. In the pre-test, the player was positioned in a supine position on an examination table and instructed to perform a gentle cranio-cervical flexion to increase the pressure starting from a baseline target pressure (TP) of 20 mmHg and then maintain the pressure for 3 × 3 s, with a 3-s rest between each contraction. If the player was able to perform this task, she was instructed to increase the pressure to 22 mmHg and keep the pressure for another 3 × 3 s. This was repeated with a 2-mmHg increase until the player reached 30 mmHg. If the player was able to perform the pre-test, the endurance test was then performed with the same setup. However, the player was instructed to hold each contraction at the TP for 3 × 10 s with a 10-s rest between contractions. The highest completed TP with a full set of contractions (3 × 10 s) was registered and later used for analysis.
Back
Isometric back extensor endurance was assessed by the modified Biering-Sørensen test. The player's lower body was supported on an examination table in prone position with 3 straps and the anterior superior iliac spine was aligned with the edge of the table. Before the assessment, the player completed a shorter warm-up trial to orient the desired sagittal plane target angle. The player was instructed to keep her arms folded across the chest throughout the procedure and isometrically maintain the upper body in a horizontal position until failure, when the time elapsed was registered. A digital inclinometer (Clinometer, Plaincode, Stephanskirchen, Germany) was placed on a metric ruler at the level of T5 in the thoracic spine to monitor sagittal plane movement.

Calf
Ankle plantar flexion muscle endurance was investigated using unilateral weight-bearing calf heel raises.
The maximal height that the player achieved during one barefoot calf heel raise was marked with a metric ruler. The player was then instructed to perform maximum unilateral barefoot heel raises continuously until the player failed to reach the marked maximal height, guided by a metronome to standardize the pace (1 s concentric, 1 s eccentric contraction). The same procedure was then conducted on the opposite foot. The number of repetitions accomplished was used in the analysis.
Strength measures: isometric and eccentric
Trunk
Isometric trunk rotational strength was measured in a modified standing wood chopper test utilizing a force gauge to evaluate force output (RS Pro Digital Force Gauge, RS Components Ltd., Corby, UK). In this modified test, the player held a handle attached to the force gauge at shoulder height in a standing position. The player was instructed to generate force through her trunk and rotate for 5 s while maintaining straight arms. Three consecutive repetitions were performed in each direction and the maximal force output was used for analysis.
corresponding 95% CI (MDC95), as 1.96 × SEM × √2. The significance level was set at p < 0.05. Statistical analyses were performed using SPSS Statistics for Windows (IBM SPSS Statistics for Windows, Version 27.0. IBM, Armonk, NY). We used R version 4.02 and the psych package for ICC and Cohen's kappa.
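The SEM and MDC95 formulas stated in this section (SEM = SD × √(1 − ICC) and MDC95 = 1.96 × SEM × √2) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the authors' analysis code (they report using SPSS and R); the function names and the example SD and ICC values are hypothetical.

```python
import math


def sem(sd: float, icc: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - ICC)."""
    if icc > 1.0:
        raise ValueError("ICC cannot exceed 1")
    if sd < 0.0:
        raise ValueError("SD cannot be negative")
    return sd * math.sqrt(1.0 - icc)


def mdc95(sd: float, icc: float) -> float:
    """Minimal detectable change at the 95% level: 1.96 * SEM * sqrt(2)."""
    return 1.96 * sem(sd, icc) * math.sqrt(2.0)


# Hypothetical example values (not taken from the study):
# a test with SD = 10 units and ICC = 0.84.
example_sem = sem(sd=10.0, icc=0.84)
example_mdc = mdc95(sd=10.0, icc=0.84)
print(f"SEM = {example_sem:.2f}, MDC95 = {example_mdc:.2f}")
```

Under these assumed inputs, an observed change smaller than the computed MDC95 could not be distinguished from measurement error, which is how the comparisons against the MDC in the Results can be read.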

Results
The characteristics of the 418 adolescent female football players at baseline and specific to the different age groups and the sub-cohort of 106 players before and after the 1-year follow-up (12.2 ± 0.7 months) are presented in Table 2. None of the 106 players followed for 1 year changed club during the follow-up.
Normative values and changes in the sub-cohort from baseline to the 1-year follow-up for the different tests are presented in Tables 3–6. Generally, there were no age-related differences in ROM except for external and internal hip rotation. Older players were stronger in the hip and knee muscles, but not when normalized to body weight. There were no significant differences between the dominant and non-dominant legs in the 418 players. The intrarater and interrater reliability for all tests ranged from 0.40 to 1.00 and 0.30 to 0.98, respectively (Supplementary Material).

Changes in ROM over 1 year
Trunk ROM decreased in all measurements, except for the in-lunge rotation test with rotation to the left and right, left leg in front (Cohen's d = −0.27 to −0.51) (Table 3). Hip, knee, and foot ROM increased slightly in both dominant and non-dominant legs in all directions (Cohen's d = 0.22–0.89) with the largest change seen in external hip rotation, which increased by 6–7° (19%, Cohen's d = 0.83–0.87) (Table 4). All the changes were below the MDC (Supplementary Material).

Changes in strength measurements over 1 year
Trunk strength (wood chopper test) to the left increased slightly (Cohen's d = 0.37). The number of players who completed the cranio-cervical flexion test decreased significantly (p < 0.001; OR, 5.33) (Table 3). Hip muscle strength increased in both the dominant and non-dominant legs in all directions (9%–24%, Cohen's d = 0.23–1.07) and so did knee extension strength in both the dominant and non-dominant legs (22%–26%, Cohen's d = 0.38–0.53) (Table 5). All the changes were below the MDC (Supplementary Material).
Strength normalized to body weight increased in hip extension

4) abduction strength as well as (5) eccentric hip abduction and (6) adduction strength was measured with a hand-held dynamometer (HHD) (MicroFet2, Hoggan Health Industries Inc., West Jordan, UT, USA). Isometric (7) knee extension strength was measured with an HHD with the player in a seated position with the knee joint in 90 degrees of flexion. Before executing the strength tests, 2 submaximal isometric contractions were performed in each direction to familiarize the player with the procedures. Three isometric contractions with gradually increasing power output for 5 s, and 3 maximally eccentric contractions for 3 s were performed in the isometric and eccentric tests, respectively, with a 10-s rest between contractions. The maximal power output for each position was used for analysis.

Functional performance
The one-leg long box jump test (OLLBJ)
In the OLLBJ, the starting position was calculated by dividing the player's height (cm) by 1.6 (height/1.6). The player was then instructed to stand on 1 leg at the starting position, jump on 1 leg to land inside the boundaries of the square, and maintain balance after landing. Three warm-up trials and 5 consecutive test trials were performed on each leg. The total number of approved trials on each leg was used in the analysis.
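As a small worked illustration of the OLLBJ scoring described above, the sketch below computes the starting distance as height/1.6 and counts approved trials out of the 5 test trials. The function names and the example player are hypothetical; only the divisor 1.6 and the 5-trial scoring come from the test description.

```python
def ollbj_start_distance_cm(height_cm: float) -> float:
    """Starting distance for the OLLBJ: player's height (cm) divided by 1.6."""
    if height_cm <= 0:
        raise ValueError("height must be positive")
    return height_cm / 1.6


def ollbj_score(trial_approved: list[bool]) -> int:
    """Number of approved trials out of the 5 consecutive test trials on one leg."""
    if len(trial_approved) != 5:
        raise ValueError("the OLLBJ is scored on exactly 5 test trials per leg")
    return sum(trial_approved)


# Hypothetical player: 160 cm tall, 4 of 5 trials approved on one leg.
distance = ollbj_start_distance_cm(160.0)
score = ollbj_score([True, True, False, True, True])
print(f"start distance = {distance:.1f} cm, score = {score}/5")
```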

Square hop test
The player was instructed to jump in a clockwise direction on 1 leg in and out of the square as many times as possible for 15 s. The player performed 2 warm-up trials on each foot before executing the test.

Discussion
Extensive normative data, reference values and changes over 1 year in adolescent female football players for common clinical tests measuring ROM, strength, and functional performance were established and presented in this study. We chose to include these field-friendly tests for different joints, and for the neck, back, and trunk, because although most injuries are located in the lower extremities, injuries to the groin and lumbar spine are also common (Clausen et al., 2014). To our knowledge, there are no studies describing normative data for this specific cohort and for the tests used, reported by age categories. Previous studies differ regarding sex, age, and sports as well as in terms of the joints and muscle groups being investigated. In addition, measuring instructions and techniques differ or are not reported in detail among studies. Therefore, it is difficult to compare the results for normative values in the present study with previous values in the literature.
However, the present study can serve as a reference for future studies in the field on adolescent female football players. Research on adolescent female football players, including risk factor studies, is expected to increase rapidly because football is the world's biggest sport for girls and is growing fast.
Data were presented separately for the dominant and non-dominant legs for clinical purposes and for comparability with previous studies that reported normative values (Daloia, Leonardi-Figueiredo, Martinez, & Mattiello-Sverzut, 2018; Risberg et al., 2018). We did not find any significant differences between dominant and non-dominant legs in any of the tests. Previous studies have shown conflicting results, with greater isometric strength on the dominant side in a Brazilian population of girls aged 5–15 years (Daloia et al., 2018), whereas elite female handball and football players demonstrated no clinically important difference between the dominant and non-dominant legs in isokinetic quadriceps and hamstrings strength (Risberg et al., 2018). Therefore, in future reports on normative values, the values for the dominant and non-dominant legs could probably be averaged and reported as a single value for both ROM and strength measurements in adolescent female football players. When screening individual players, however, large side differences in leg strength could be present, especially after an injury, and it is important to detect and report this (Gustavsson et al., 2006).

Normative ROM data
No differences were found regarding ROM depending on age, except for the external and internal hip rotation tests, where older players had 5–7° less ROM. The clinical relevance of this finding is unclear because the minimum clinically important difference for external and internal hip rotation in youth baseball players has been reported previously to be 7.5° and 5.1°, respectively (Bullock, Beck, Collins, Filbay, & Nicholson, 2021). Normative data for ROM measurements have been reported previously for hip and ankle ROM in different cohorts (McKay et al., 2017; Onate et al., 2018; Sankar et al., 2012). The normative values reported in a general population of girls aged 11–17 years were almost identical to our reported data (Sankar et al., 2012). Compared with our data, lower values for ankle ROM and for external and internal hip rotation, but similar values for hip flexion, were reported in a general female population aged 10–19 years in Australia (McKay et al., 2017). These differences were probably due to active ROM being measured (McKay et al., 2017) instead of passive ROM, as in our study. Ankle dorsiflexion measured with a weight-bearing lunge test was greater (13 vs 10 cm) in our cohort than reported in a cohort of high school students aged 13–19 years who played basketball, football, lacrosse, or football (Onate et al., 2018). This highlights the importance of being specific when reporting normative values regarding measuring performance, sex, age, and sport.
To our knowledge, our study is the first to report normative data for trunk rotation tests. The tests have been used to identify risk factors or the relationship between a shoulder injury and trunk rotation flexibility in collegiate softball players (Aragon, Oyama, Oliaro, Padua, & Myers, 2012) and adolescent elite handball players (Asker et al., 2017), but no normative values have been reported.

Normative strength data
Normative data have been presented previously for strength around the hip and knee measured with a hand-held dynamometer (Daloia et al., 2018; Thorborg et al., 2013). In one study, results for knee extension strength were similar to our results (Daloia et al., 2018). In our cohort, older players were generally stronger, especially for the hip muscles and knee extensors, but normalized to body weight this difference disappeared. Strength has been previously reported to be related to both age and body mass (Harbo et al., 2012). Using a hand-held dynamometer for isometric hip and knee strength measurements has been reported previously to be suitable for evaluating and monitoring athletes with hip, groin, and hamstring injuries, which are common injuries in football (Thorborg et al., 2013). Our results indicate that it is important to consider age, but also body weight, in future evaluations of hip and knee strength in young female football cohorts.
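The effect of normalizing strength to body weight (reported in Newton/kg in this study) can be illustrated with a short sketch. The two players and their force and body mass values below are hypothetical and chosen only to show how a difference in absolute force can disappear after normalization; the function name is likewise an assumption.

```python
def normalize_to_body_mass(force_newton: float, body_mass_kg: float) -> float:
    """Return strength expressed in N/kg (force divided by body mass)."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return force_newton / body_mass_kg


# Hypothetical values, not taken from the study tables:
# an older, heavier player produces more absolute force than a younger one.
younger = normalize_to_body_mass(force_newton=120.0, body_mass_kg=40.0)
older = normalize_to_body_mass(force_newton=180.0, body_mass_kg=60.0)
print(f"younger: {younger:.2f} N/kg, older: {older:.2f} N/kg")
```

In this assumed example the older player is 60 N stronger in absolute terms, yet both players have the same strength relative to body mass.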
Normative data for the Biering-Sørensen test have been reported previously in women of different ages and varied between 142 and 220 s (Moreau et al., 2001). In girls aged 15–18 years, results for the Biering-Sørensen test ranged from 148 to 228 s (Dejanovic, Cambridge, & McGill, 2014) compared with a mean of 146 s in our 15- to 17-year-old players. One explanation for our values falling in the lower range of those in previous studies could be differences in age and low motivation among our players to reach their perceived limit of fatigue. An increase in endurance strength is expected with rapid growth during puberty and in our cohort, strength increased with age but was not correlated to body mass. Psychological outcome measures for motivation and perceived effort during isometric low back testing should also be evaluated further (Moreau et al., 2001). Extensor muscle endurance of the back seems to play an important role in prevention, rehabilitation and the risk of future back pain, and the Biering-Sørensen test might be of value as a screening tool for preventive measures (Moreau et al., 2001). Thus, it is important that clinicians have appropriate normative data to use, especially when baseline assessments are unavailable or inappropriate due to long test–retest intervals (Merritt et al., 2017).

Normative data of functional performance
The functional tests used, the OLLBJ and the square hop tests, are rarely described in the literature and we did not find any previous normative values for these tests. The single-leg hop for distance is more commonly used for evaluating hop performance (W.T. Davies, Myer, & Read, 2020). Our aim was to evaluate both hop performance and the ability to land and stop in a pre-specified area using one test. The OLLBJ is a modified single-leg hop-and-hold (van Melick et al., 2020) and takes the height of the player into account (G.J. Davies et al., 2017). However, most of the players achieved 4 out of 5 valid hops for OLLBJ. Thus, the test was probably too easy, and the players almost reached a ceiling effect. The square hop test includes multi-directional movements, which are characteristic for football, but this test is also described and performed in different ways (Caffrey et al., 2009; Gustavsson et al., 2006). In our study, the players jumped for 15 s in a square of 40 cm. Before the start of the study, a pilot study of the square hop test was performed. In this pilot, 65 players performed the test for 30 s, but players lost concentration and function so the test was changed to 15 s. Therefore, the best functional hop test to assess the player's unilateral jump performance for this cohort should be described and evaluated further.

Change in test results over 1 year
A sub-cohort was followed for 1 year and performed all the tests once more to analyse potential within-player changes. The sub-cohort had less ROM at baseline, especially in external and internal hip rotation, compared with the total cohort. The reason is unclear, but could be due to systematic measurement differences because the sub-cohort comprised the first 106 players included in the study. There were small increases in almost all ROM tests, but most of them probably had no clinical importance (Bullock et al., 2021) and were below the MDC. ROM is reported to decrease with age (McKay et al., 2017), but follow-up for longer than 1 year seems to be needed in this age group. A decreasing trend in ROM for almost all measurements in the hip with older age is reported, but this decline was less apparent among girls (Sankar et al., 2012).
The change in the test value reported as a percentage will help the clinician to interpret the results together with the ES. However, change in the percentage should be interpreted with caution for tests with low values (e.g., OLLBJ and heel raises), because a small change will result in a big percentage change. The strength in both the dominant and non-dominant legs increased in the knee extension (22%–26%) and in all directions in the hip (9%–24%), which could be a clinically important increase in strength. However, normalized to body weight, the increase was smaller, with the highest percentage change in hip abduction (13%–14%). The increase in strength was below the MDC. ES indicated mostly small to medium effects. Moreover, the large MDC could be explained in some cases by the wide range of performance and thus a wide range in SD in the test values among the players. Puberty, with physical, psychological and social maturation, could affect the strength test results, and the normative values for strength were also generally higher with older age. At the end of puberty, the girls are expected to develop increased strength due to increased height and body mass. This is important to bear in mind given the rapid growth and maturation of our studied population. Therefore, we also presented strength values normalized to body weight. The participants' understanding of the importance of the tests and motivation could also affect the test results. Factors such as learning effects of the tests would not be relevant because of the time interval between the tests (1 year).
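The caution about percentage change for tests with low values can be made concrete with a short sketch. The formula below is the ordinary relative change from baseline; whether the study averaged individual percentage changes or computed the change between group means is not stated, so this is an assumption, and all numbers are hypothetical.

```python
def percent_change(baseline: float, follow_up: float) -> float:
    """Relative change from baseline to follow-up, in percent."""
    if baseline == 0:
        raise ValueError("percentage change is undefined for a baseline of 0")
    return (follow_up - baseline) / baseline * 100.0


# Hypothetical values: the same absolute change of 1 unit in two tests.
low_value_test = percent_change(baseline=2, follow_up=3)           # e.g. approved hops
high_value_test = percent_change(baseline=200.0, follow_up=201.0)  # e.g. force in N
print(f"low-value test: {low_value_test:.1f}%, high-value test: {high_value_test:.1f}%")
```

A one-unit improvement looks a hundred times larger in the low-value test under these assumed numbers, which is why the percentage is best read together with the effect size.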

Strengths and limitations
We included a large, homogeneous cohort of young female competitive football players, which enabled analyses stratified by sex and sports. We used a longitudinal design to assess changes within individual players over a football season. The purpose was to report on young competitive female football players in general. Therefore, the players were tested at different time points during the season to avoid seasonal variations. We did not exclude players with injuries, but players were instructed to refrain from certain tests that evoked pain or provoked ongoing injuries or other health-related issues. We used clinical tests measuring ROM, strength, and functional performance that are simple, quick and low cost, and that can be used by sports medicine clinicians in the field. However, simple and quick measurement techniques could also be associated with potential sources of error. The assessments included several challenges such as fixation of surrounding joints and tissues, standardizing the starting position, standardized instructions, isolated movements, goniometer/inclinometer/hand-held dynamometer placement, and rater dependence. Several different test leaders performed the measurements, which could be a weakness, but also a strength, because it reflects reality in the clinic. However, the measurements were also tested for intrarater and interrater reliability and MDC. Most of the tests had good or excellent intrarater and interrater reliability with ICC values > 0.75, indicating that the methods were reliable.

Conclusions
The present study provides clinicians and coaches with reference normative values to be used in the evaluation of ROM, strength and functional performance in adolescent female football players. The ROM and strength measurements normalized to body weight did not differ between the age groups. The test results changed slightly over 1 year with improvements especially in hip abduction strength and in the square hop test.

Table 1
Description and scoring of the different tests.

Table 2
Characteristics of the adolescent female football players at baseline (n = 418) and at the 1-year follow-up (n = 106 players).
Values are reported as means ± standard deviation or n (%). The values regarding training/match are means of the preceding 6 months. a Missing value from 1 player.

Table 3
Results from the neck, back and trunk tests. Values are reported as the mean (range of motion) and max (strength) value of 3 repetitions. p values in bold type are significant. Effect size measured as Cohen's d, where d = 0.2 indicates a small effect, d = 0.5 indicates a medium effect, and d = 0.8 indicates a large effect. a Missing value from 0 to 1 player in the different tests both at baseline and at follow-up.
A. Fältström, E. Skillgate, U. Tranaeus et al. Physical Therapy in Sport 58 (2022) 106–116

Table 4
Results from the tests for lower limb range of motion.

Table 6
Results from the tests for lower limb strength normalized to body weight and reported in Newton/kg body weight.
Values are reported as max value of 3 repetitions. p values in bold type are significant. Effect size measured as Cohen's d, where d = 0.2 indicates a small effect, d = 0.5 indicates a medium effect, and d = 0.8 indicates a large effect. CI, confidence interval. a Missing value from 0 to 2 players in the different tests both at baseline and at follow-up.
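The effect-size benchmarks quoted in the table notes (d = 0.2 small, 0.5 medium, 0.8 large) can be turned into a simple labelling helper. Treating the benchmarks as lower bounds of bins and using the absolute value of d are assumptions made for this sketch; the paper only states the three reference points. The example d values are ones reported in the Results.

```python
def cohens_d_label(d: float) -> str:
    """Label an effect size using the 0.2 / 0.5 / 0.8 benchmarks.

    Assumption: benchmarks are used as lower bounds and the sign of d is ignored.
    """
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# d values reported in the Results: external hip rotation ROM (0.83),
# trunk strength to the left (0.37), and a trunk ROM decrease (-0.51).
for d in (0.83, 0.37, -0.51):
    print(d, cohens_d_label(d))
```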