Establishing Reference Data for Fitness Assessment of Law Enforcement Officers Using a Qualitative Systematic Review

Physical fitness tests are a standard means of evaluating the competence of police officers. This qualitative review aims (i) to document, compare, and examine the reference values available in the current literature regarding fitness tests for Law Enforcement Officers (LEOs), and (ii) to define reference values for the most used fitness tests to assess and predict police officer performance. A total of 1879 records were collected for review from two major literature databases, PubMed and ScienceDirect. After applying our exclusion criteria, a total of 19 studies were considered. All studies demonstrated acceptable methodological quality in fitness assessment, and the most used components were muscle strength, muscular endurance, muscle power, aerobic and anaerobic capacity, flexibility, and agility. This review provides (i) a methodological definition for the physical fitness assessment that helps select the most used fitness tests, (ii) a standardised methodology for establishing reference data for fitness tests appropriate for LEOs; and (iii) aggregate reference values for selected fitness tests. This may improve selection and retention procedures, considering that this group performs its duties in an environment and under conditions that differ from those of other occupational groups. Complementarily, this qualitative review also provides a foundation for developing effective interventions to improve each aspect of fitness testing for police officers.


Introduction
In recent years, the demand for emergency services and first responders in public security has increased significantly to protect society from crimes and violence. This has led to a greater emphasis on the physical abilities of officers, highlighting the need for proper fitness testing and training programs.
The work of law enforcement officers (LEOs) can be physically and mentally demanding. Officers may be required to perform various physical tasks, such as apprehending subjects, running up and down stairs, pushing their body over obstacles, dragging objects, and engaging in a foot chase. It has been shown that the tasks performed by LEOs to protect society from hazards and eliminate threats in real time require adequate physical fitness to be performed efficiently and safely [1][2][3]. Current literature suggests that a large variety of demographic and physical fitness variables are correlated with law enforcement physical ability, including age, body mass index, anaerobic and aerobic capacity, upper-body muscular endurance, lower-body power, and agility [1][2][3].
Many LEO agencies use physical fitness testing as part of the recruitment process to ensure that recruits have the necessary skills to perform academy training [4][5][6]. However, physical fitness also takes on particular importance when results depend on physical fitness

Database | Search terms | Filters (sort by) | Results
--- | --- | --- | ---
PubMed | "Police" OR "Law enforcement" AND "Fitness test" OR "Physical fitness" AND "health" | Best Match | 177
ScienceDirect | "Police" AND "Fitness test" AND "health" | Relevance | 1702

Healthcare 2023, 11, 1253

We aimed to increase the relevance of our search results by applying filters that reflected the study eligibility criteria in each database, where available. These criteria were then applied to the full text of articles that passed the initial title and abstract screening to make a final selection of eligible articles for this qualitative review. The PRISMA flow diagram (Figure 1) [17] documents the search, screening, and selection results. Studies were included if they measured physical fitness and health in individuals from law enforcement. Exclusion criteria were (i) studies older than 15 years, (ii) studies examining only body composition, (iii) studies addressing instrument development, (iv) studies addressing only weight bearing, (v) studies addressing only screening instruments, (vi) validity studies, and (vii) reliability studies. After collecting all studies, duplicates were removed.

Critical Appraisal
We used nine questions of the Critical Appraisal Skills Programme (CASP) checklist to evaluate the methodological quality of each study [18]. Each question had three possible answers: "yes", "cannot say", or "no". Question ten was left blank because it is subjective. To avoid bias, two authors assessed the methodological quality independently. The results of this quality assessment can be found in Table 2.

Data Extraction
After critical analysis of the full text of the selected articles, the following data were extracted: (i) authors and year of publication; (ii) study population (country where the study was performed, participants' gender, age, and intervention groups); (iii) physical capacity evaluated (aerobic capacity; agility; flexibility; muscular endurance; muscular power; muscular strength); and (iv) fitness tests (fitness test results presented as mean ± standard deviation). Table 3 shows the extracted data.

Meta-Analysis and Data Aggregation
The fitness assessment results collected from female or male LEOs were subjected to a meta-analysis to establish reference data. We combined the mean estimates and standard deviations of fitness test parameters across several studies. We only aggregated fitness data collected using the same acquisition protocol and from participants of the same sex and LEO group (cadets or officers). Accordingly, the sample size (n), mean estimate (M), and standard deviation (SD) of the fitness test results in each selected study were used as effect size estimates. Aggregated effect sizes were calculated with a random-effects model, which allows the study outcomes to vary between studies according to a normal distribution, and were reported as statistically combined measures with 95% confidence intervals (CI). The restricted maximum likelihood (REML) estimator was used to estimate the between-sample variance (τ², tau-squared).
The heterogeneity test results should be considered alongside a qualitative assessment of the combinability of studies in a systematic review. To measure the inconsistency of studies' results, Cochran's Q (a classical measure of heterogeneity) and I² were considered [32]. I² describes the percentage of variation across studies that is due to heterogeneity rather than chance, i.e., it expresses the inconsistency of studies' results: I² = 100% × (Q − df)/Q. The classification used to evaluate I² is as follows: 0-40%, might not be important heterogeneity; 30-60%, moderate heterogeneity; 50-90%, substantial heterogeneity; 75-100%, considerable heterogeneity (these cut-offs are not absolute, and the interpretation of I² considers the context and clinical relevance of the studies being analysed).
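The pooling and heterogeneity statistics described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in this review: for brevity it estimates τ² with the DerSimonian-Laird moment estimator rather than REML, so its results may differ slightly from REML-based output. The function name pool_random_effects and the input values are hypothetical and serve only as an illustration.

```python
import math


def pool_random_effects(means, sds, ns):
    """Pool study means under a random-effects model.

    Assumption: tau^2 comes from the DerSimonian-Laird moment estimator,
    a simpler stand-in for the REML estimator named in the text.
    """
    # Sampling variance of each study mean: SD^2 / n
    variances = [sd ** 2 / n for sd, n in zip(sds, ns)]
    weights = [1.0 / v for v in variances]

    # Fixed-effect (inverse-variance) mean, needed for Cochran's Q
    fixed = sum(w * m for w, m in zip(weights, means)) / sum(weights)
    q = sum(w * (m - fixed) ** 2 for w, m in zip(weights, means))
    df = len(means) - 1

    # Between-study variance tau^2, truncated at zero
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2 = 100% x (Q - df) / Q, truncated at zero
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0

    # Random-effects weights add tau^2 to each sampling variance
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * m for w, m in zip(re_weights, means)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)

    return {"pooled": pooled, "ci": ci, "q": q, "df": df,
            "tau2": tau2, "i2": i2}


# Hypothetical push-up results (repetitions) from three same-sex samples
result = pool_random_effects(means=[32.0, 38.5, 35.1],
                             sds=[9.0, 11.2, 10.4],
                             ns=[120, 85, 60])
print(result)
```

Each study enters with weight 1/(SD²/n + τ²), so larger and more precise samples contribute more to the pooled mean, while a large τ² pulls the weights towards equality.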
Results of the meta-analysis are also presented in forest plots for matched LEO groups if significant heterogeneity was observed in some fitness tests. Articles that report more than one LEO group of participants within the same sex are written as separate observations in the model. The size of the points on the forest plot is a function of the precision of the outcome, more precise estimates are more prominent in the plot, and their area corresponds to the weight they received in the random effect model. Statistical analysis and forest plots were performed using the Statistical Package for the Social Sciences (IBM Corp. Released 2021. IBM SPSS Statistics for Windows, Version 28.0. Armonk, NY, USA: IBM Corp).

Search Results
A total of 1879 studies were found during the initial search of the two databases. After removing duplicates and screening by title and abstract, the full-text versions of 51 studies were compiled for review. These studies were then assessed against the inclusion and exclusion criteria, leaving 19 studies for critical review (Table 3). A summary of the screening and selection process and the literature search results can be found in the PRISMA flow diagram [17] (Figure 1). Of the 19 studies, three referred to Portuguese police officers, and the remaining studies referred to police officers from around the world (Brazil, Canada, Germany, Ireland, Korea, Serbia, and the USA). Fifteen studies examined male and female participants, while four included only male participants. The mean participant age across the studies was 34.59 ± 5.58 years.
In addition, some of the studies with participants of both sexes did not present the fitness test results separately for males and females (i.e., only the combined average performance is given). Table 4 identifies the studies where this occurs.
The effect of LEO group (cadets and officers) as a moderator of fitness test results was evaluated. The mixed-effects model indicates a statistically significant moderator effect only for the female sit-and-reach (Q_M (df = 1) = 9.21, p < 0.001), i.e., performance in push-ups, sit-ups, handgrip (dominant), 1 RM bench press, vertical jump, and 2.4-km run does not differ significantly between the LEO groups. However, the small sample sizes of LEO cadets may have reduced the statistical power to detect differences among samples.
The aggregated fitness test results from the meta-analysis, including the subgroup analysis (LEO: cadets and officers), are summarised for females in Table 5 and for males in Table 6.

Discussion
This qualitative review aimed to document, compare, and examine the reference data available in the literature regarding fitness tests for LEOs. All studies showed acceptable methodological quality in the assessment of fitness attributes.
This review also provides a detailed analysis of existing data and objective reference data for essential physical skills in the components of fitness for LEO cadets and officers. One of the strengths of this study is the pioneering methodology used to establish reference data for the fitness assessment of LEOs.
Our data provide a basis for developing effective measures to improve each aspect of police officer fitness testing. The test battery includes assessments of muscular endurance, strength, power, aerobic capacity, agility, and flexibility, the essential skills for the job.
The tests have acceptable technical measurement errors and high reproducibility, and they are assumed to be applicable in our setting without interference.
Physical fitness testing is a valuable tool for assessing an individual's health status, identifying health-related risk factors, and determining job readiness and suitability.
The primary objective of physical fitness testing is to optimise functional fitness. To achieve this, it is crucial to understand the physical fitness requirements for the occupation and design or use tests that effectively measure the fitness level of recruits and officers. The results of these tests can guide exercise prescription and goal setting, which can help optimise adherence to the program, reduce injury risk, and enhance both physical and mental job performance.
It is thus evident that profiling fitness tests for LEOs can improve physical and overall job performance. Nevertheless, when selecting a physical assessment battery, it is essential to consider several variables, including the test population, available time, equipment and resources, and the specific information to be gathered from the tests.
Moreover, the standard scores obtained from fitness tests are essential for establishing health-related norms that help individuals set performance goals and that serve as motivational tools. Fitness tests can also positively affect individuals by fostering personal growth, reducing anxiety, and increasing motivation and confidence. Therefore, proper analysis and selection of the testing battery can help optimise the physical fitness of individual LEOs and positively impact their overall well-being.
According to the literature, Orr et al. [34] showed that, in female police officers, all fitness measures have moderate to strong significant relationships and influence officer performance. However, the meta-analysis conducted in this study found significant heterogeneity in the results of push-ups and sit-ups among female LEOs, suggesting that performance in these fitness tests may differ among female LEOs from different populations. This variability may be attributed to several factors, including differences in physical fitness levels, variations in training programs, and cultural and social factors that may affect an individual's level of physical activity. For example, it is hypothesised that female LEOs may face barriers to physical activity and fitness due to workplace sexism and a lack of peer and supervisor support. Employment in a non-traditional occupation, as is the case for female LEOs, where training is often delivered by males, may also contribute to this disparity because males and females may approach task performance differently. On the other hand, there were no significant differences in the performance of push-ups, sit-ups, handgrip (dominant), 1 RM bench press, vertical jump (Sargent/Abalakov), and 2.4-km run between LEO cadets and officers, suggesting that training level or experience did not significantly affect the performance of these fitness tests.
The proposal to develop a battery of fitness tests stems from the need to assess and diagnose the physical fitness of LEOs. Given the physical demands of the police profession, specific assessment tests and the development of norm tables are needed to verify the relevance of these assessment results. The normative reference approach evaluates the performance of incumbents and officials against a normative sample, and a statistical procedure is used to establish a standard. However, a critical step in conducting a fitness test is establishing a minimally acceptable standard. It is important to note that standard setting should be reasonable and involves complex legal considerations. To ensure that standards are reliable and valid, professionals with relevant expertise should be involved in setting them. They can use various methods, such as job analyses and evidence-based research, to establish appropriate standards. When developing the standards, it is also essential to consider the specific job requirements and characteristics of the people being tested. This also applies to the presentation of results. As expected, the number of tests and the reported outcome variables show significant variability in how the fitness attributes of LEOs are tested. Although many personal factors can influence the results of a fitness assessment, this study attempted to account for unique characteristics to obtain homogeneous samples. In addition, most studies differ in the protocols used to measure components of fitness, or in how the results of the same protocol are presented for police populations. Therefore, comparing results between studies is difficult due to differences in assessment methods.
The second main objective of this qualitative review was to establish reference values for the main fitness tests adapted for LEOs. Nevertheless, comparing the normative means of the studies raises some questions about the methodology, applicability, and presentation of the results. In other words, some of the literature provided preliminary results and had several limitations: some authors presented male and female average values of fitness assessments together [15,19,24,26,29,30,31], others did not use the same units of measurement, and some authors presented few results or differentiated according to different age groups, which made the definition of reference values very difficult.
The meta-analysis showed heterogeneity in some fitness test results among LEO groups, possibly due to differences in fitness levels, training programs, and cultural and social factors. The lack of homogeneity in the presentation of reference values and the absence of complete results are significant limitations of this study. For this reason, this work also aims to define scoring rules for establishing and developing reference values adapted to LEOs in the future: (i) all tests must be performed with the same methodology and recorded in the same units of measurement; (ii) results must be reported in the most used units of measurement for each function, according to Massuça et al. [9] (muscular endurance: repetitions; muscle strength: kg; muscle power: centimetres for the vertical jump or metres for the medicine ball throw; aerobic capacity: metres, minutes, or maximum rate of oxygen consumption (VO2max); agility: seconds; flexibility: centimetres); (iii) all results must be reported by gender (males or females) and by four age groups (<29 years; 30-39 years; 40-49 years; >50 years). In this way, as more studies follow these criteria, it will become possible to compile multiple international results, use them in a way that is more appropriate for LEOs, and define reference values for setting cohort boundaries for assessment and career advancement as positive baseline values. It is suggested that further research be conducted to evaluate these criteria, as we have been able to define good cut-off points.
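As an illustration only, the three reporting rules above could be checked programmatically before a result is added to a shared reference dataset. The Python sketch below is hypothetical: the names ALLOWED_UNITS, age_group, and check_record are not part of this review, and the handling of ages 29 and 50 (which the stated age bands leave unassigned) is an assumption made solely for the example.

```python
# Reporting units per fitness component, following rule (ii) above.
# The dictionary keys and unit labels are illustrative naming choices.
ALLOWED_UNITS = {
    "muscular endurance": {"repetitions"},
    "muscular strength": {"kg"},
    "muscular power": {"cm", "m"},        # vertical jump / medicine ball throw
    "aerobic capacity": {"m", "min", "VO2max"},
    "agility": {"s"},
    "flexibility": {"cm"},
}


def age_group(age_years):
    """Map an age in years to one of the four reporting bands of rule (iii)."""
    # Assumption: ages 29 and 50 are not covered by the bands in the text;
    # here they are placed in the youngest and oldest band, respectively.
    if age_years <= 29:
        return "<29"
    if age_years <= 39:
        return "30-39"
    if age_years <= 49:
        return "40-49"
    return ">50"


def check_record(component, unit, sex):
    """Return a list of reporting problems (empty if the record complies)."""
    problems = []
    allowed = ALLOWED_UNITS.get(component)
    if allowed is None:
        problems.append(f"unknown fitness component: {component}")
    elif unit not in allowed:
        problems.append(
            f"{component} should be reported in one of {sorted(allowed)}"
        )
    if sex not in ("male", "female"):
        problems.append("results must be reported separately by sex")
    return problems


# Hypothetical record: a sit-and-reach result reported in inches
print(age_group(34), check_record("flexibility", "in", "female"))
```

A check of this kind would flag results pooled across sexes or reported in non-standard units, the two reporting problems identified above as obstacles to aggregation.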

Conclusions
The risks associated with policing have numerous complex and long-lasting consequences that can affect the effectiveness of police operations and activities. It is critical to maintain optimal physical fitness over time, monitor changes in police officer health, and provide timely information about the positive and negative effects of irresponsible management of these issues by police officers and police management.
This qualitative review highlights the importance of optimal fitness in LEOs. It provides (i) a methodological definition for the physical fitness assessment that helps select the most used fitness tests, (ii) a standardised methodology for establishing reference data for fitness tests appropriate for LEOs; and (iii) aggregate reference values for selected fitness tests.
The battery of fitness tests should include assessments of muscular endurance, strength, power, aerobic capacity, agility, and flexibility, which are essential occupational skills.
Proper classification of fitness results to establish reference values raises awareness of optimal, salient, or diminished fitness attributes in LEOs with higher scores than the general population.
In sum, our study seems to provide a basis for developing effective interventions (to improve fitness testing interpretations for LEOs) and to improve the selection and reintegration procedures (considering that this professional group performs its duties in an environment and under conditions that differ from those of other occupational groups).