What is Known About Muscle Strength Reference Values for Adults Measured by Hand-Held Dynamometry: A Scoping Review

Highlights
• Existing literature regarding handheld dynamometer (HHD) strength reference values is scarce.
• The current literature includes gaps relating to strength units used and well-described protocols.
• There is a critical need to develop HHD reference values in adults.
• Considering the increased availability of high quality HHD, this issue is urgent.

Muscle strength is a central component of function and movement. As such, it is essential to accomplishing daily living tasks and maintaining autonomy. [1][2][3] Muscle strength is known to be a good predictor of functional capacity among the general adult population, and strength deficits are associated with physical limitations. 3,4 For these reasons, evaluating this variable is a key component of physiotherapists' work; muscle strength reference values obtained from healthy adults allow clinicians to detect muscle weakness, to identify and quantify neuromuscular impairment by comparing the values obtained with those of a healthy individual of the same age group and sex, to objectify patients' progress, and to determine treatment effectiveness. [1][2][3][4][5] Many tools have been developed to obtain objective measurements of muscle strength, a component of muscle power, which is an important function of the neuromusculoskeletal system and movement according to the International Classification of Functioning, Disability and Health (B730 Muscle power functions ICF, https://apps.who.int/classifications/icfbrowser/). Manual muscle testing (MMT) is the most accessible and commonly used method. Although clinically feasible and quick to perform, this subjective method has poor psychometric properties and demonstrates significant limitations in detecting changes in strength over time. [6][7][8][9][10] For example, Hébert et al 6 showed that even when MMT is used by clinicians who have several years of experience and are using a more sensitive tool such as a 10-point scale, it cannot accurately classify patients and discriminate between patients with mild and severe impairments. Moreover, in patients with muscular dystrophy type 1 presenting with the late-onset phenotype, quantified muscle testing using a handheld dynamometer (HHD) revealed a strength loss of as much as 20.4%, whereas MMT suggested normal strength. 9
At the other end of the spectrum, isokinetic dynamometry is a method with sound psychometric properties and is considered the criterion standard measurement of muscle strength. However, the equipment is costly, requires considerable space, and demands extensive training of users. 11 An interesting compromise between MMT and isokinetic dynamometry is quantified muscle testing using an HHD. The HHD is accessible, user-friendly, affordable, and has excellent psychometric properties, rendering it a top choice for the assessment of muscle strength impairments. [11][12][13][14][15][16] Maximal isometric muscle strength (MIMS) values obtained in some muscle groups with HHD are highly correlated with values obtained with isokinetic dynamometry, indicating good to excellent validity of both methods. 15 However, it should be understood that the use of HHD is inevitably linked to different sources of measurement error depending on the muscle group assessed, the experience and training of the evaluators, and the standardization of the protocols. [17][18][19] The most recent generation of HHDs, which can measure in both compression (push) and distraction (pull) modes, such as the Medup a or the Chatillon, b is frequently used in clinical settings. 20 Make and break tests are commonly used to measure muscle strength with HHD. Performing a "make" test implies that the evaluator holds the HHD stationary while the participant exerts a maximal force against it; for a "break" test, the evaluator has to exert enough force to break the isometric contraction produced by the person. In this study, we were only interested in "make" test protocols because "break" tests have questionable reliability according to our clinical experience and the literature, and this type of test exposes participants to a higher risk of injuries. 21,22
Currently, to draw conclusions on the presence or absence of significant muscle impairments, MIMS values obtained from the affected muscle group are compared with those of the same muscle group on the contralateral side, assuming that the latter is healthy and experiences no neuromuscular impairment. However, this practice becomes problematic when individuals present bilateral strength deficits or when the supposed healthy side is not perfectly free of impairments. In these circumstances, the values obtained from the contralateral side cannot provide a valid comparator and, therefore, an "external" comparison according to muscle group may be necessary to identify muscle weakness. Moreover, even in the presumed absence of impairments, it remains difficult to determine if the muscle strength of the healthy side is appropriate and considered normal for a given individual of a given age and sex. Few studies have reported normative values of muscle strength in healthy populations for some muscle groups, making it difficult to address this important question. For example, Hogrel et al 23 and Danneskiold-Samsøe et al 24 established normative strength values of several upper and lower limb muscle groups and the trunk with an isokinetic dynamometer and a force gauge fixed to an external structure. Unfortunately, as these devices are quite different from HHDs used by physiotherapists and are mainly used as research tools in conditions inaccessible to clinicians, these values cannot be used as a reference. Moreover, the protocols used in both studies differ from that developed with push-pull HHD, which considerably limits the clinical applicability of the reference values established by these authors. Hébert et al 20 and Beenakker et al 25 established reference values for several muscle groups of upper and lower limbs using push-pull HHD in the pediatric population, limiting the use of these values to individuals younger than 18 years old. 
It would therefore be relevant to know if similar clinically applicable data exist in the literature for adults. As a first-view approach to examine the research activity in this field, we conducted a scoping review avoiding the methodological shortcomings often found in rehabilitation scoping reviews. 26 The main purpose of this scoping review was to map the existing literature regarding reference values of MIMS of upper and lower limb muscle groups obtained with HHD in healthy adults. The review will also serve to identify potential gaps in the literature and guide future research. Our principal hypothesis was that the current literature is incomplete, as it lacks reference values of MIMS for several muscle groups in adults using push-pull HHD.

Methods
This scoping review was performed using the framework methodology presented in Khalil. 30 Our review complies with reporting guidance for the conduct of scoping reviews (ie, Preferred Reporting Items for Systematic Reviews and Meta-Analyses [PRISMA] extension for Scoping Reviews). In the literature on muscle strength assessment, the terms "reference values" and "normative values" are often considered synonymous. These values refer to the data set of muscle strength measurements that are expected in a group of functional and healthy people. They allow comparisons to be made with measurements taken in the clinic so that the results obtained can be interpreted objectively. Therefore, to include all of the literature relevant to our scoping review, our search focused on both normative and reference values. However, for the purpose of this scoping review, the term reference values was defined as the value of a property obtained by observation or measurement on a reference individual and not in the context of randomized controlled trials or studies comparing healthy people with people with impairments and disabilities. In this scoping review, the studies considered were the ones using the following concept for reference values: isometric muscle strength reference values correspond to quantifiable data of isometric muscle strength gathered from a large sample representative of the general population. These values, measured several times in the same individual, must be obtained under carefully described conditions, allowing interpretation within the limits of their known metrological properties, and they represent what we would expect as muscle strength data in healthy adults.

Research question
This scoping review aimed to improve our knowledge regarding the existence of reference values of quantified MIMS in healthy adults. The following questions were addressed in the review: (1) Is there consensus and consistency in the use of the terms "reference values" vs "normative values"? (2) What is known about quantified MIMS obtained with HHD in healthy adults? (3) Is there consensus concerning the protocols and methodology used for muscle testing with HHD to obtain reference values? These questions were built using the Population, Context, and Concept model in which healthy adults were the population, reference values of muscle strength were the concept, and the evaluation of muscle strength with HHD was the context.

Data sources and searches
To identify the relevant literature, PubMed, EMBASE, CINAHL plus, PEDro and Cochrane databases were searched. The search strings were "reference values/normative values," "isometric muscle strength," and "handheld dynamometry" (see supplemental fig 1, available online only at https://www.sciencedirect.com/journal/archives-of-rehabilitation-research-and-clinical-translation, for complete list of terms). After consulting and extracting articles from the databases, gray literature was searched in the RehabData and Proquest Dissertations databases, using the same search terms. The search strategy was reviewed and validated by a health sciences information specialist. After the initial search, duplicates were removed. The systematic literature search of databases was undertaken before January 13, 2020 and the search in the gray literature before May 1, 2020.

Study selection
Two independent reviewers (D.L. and P.B.) completed an initial screening of article titles and abstracts according to the inclusion and exclusion criteria. The selected articles were kept for further analysis. To be included in the study, the articles had to concern a testing protocol using HHD for the purpose of establishing reference values in healthy adult populations aged 18 years and older (ie, without any history of medical, neurological, and musculoskeletal impairments or any condition that could affect torque measurements), be written in French or English, and be available in full text. Studies addressing the following themes or populations were excluded: (1) animals, high-level athletes, adults with pathologies or any other condition affecting muscle integrity; (2) measurements of spine force, nonisometric strength (isokinetic or isotonic methods) or hand grip strength; (3) studies where a "break test" approach was used; (4) case studies; (5) studies using a device other than an HHD; and (6) studies in which strength values of healthy participants were obtained in the context of randomized controlled trials or when comparing healthy individuals with those with impairments and disabilities. After the initial screening, the remaining articles were read in their entirety and screened twice by the same independent reviewers (D.L. and P.B.) to ensure their eligibility. Disagreements regarding eligibility were discussed by both reviewers and resolved by consensus, with recourse to a third reviewer (J.B.) when needed. References of selected articles were checked to identify other eligible articles not retained following the initial database search. Because scoping reviews do not entail the appraisal and exclusion of articles based on the quality of research methodology, no risk of bias assessment was undertaken. 27

Data extraction
Data of the selected studies were extracted and charted by 2 independent reviewers (M.M. and L.J.H.) using a data extraction grid to ensure method standardization (table 1). A beta version of the extraction grid was tested on 2 articles before the final grid was produced. The data from the extraction grids completed by the 2 independent reviewers were subsequently merged to produce the complete final extracted data.

Data synthesis and analysis
The results were summarized in table format under 2 main themes: protocol variables and positioning descriptions for muscle testing. The protocol variables were subdivided into 5 items: HHD, units of measurement, testing procedure, muscle groups assessed, and positioning. The positioning item was subdivided into 5 categories for each muscle group tested: subject position (during the test), tested limb position, anatomic landmark for HHD placement, stabilization, and whether or not gravity was eliminated (limb placed in a neutral position with regard to gravity to eliminate the effect of segment weight). Extracted data were analyzed, classified, and interpreted to map the breadth of the current existing knowledge regarding the research questions and to specify future research needs.
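As an illustration only, the 5 protocol items and 5 positioning categories described above could be charted with a structure such as the following minimal sketch. The class names, field names, and the example study are hypothetical and are not taken from the extraction grid actually used in this review.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Positioning:
    """Positioning description for one muscle group (5 categories)."""
    subject_position: str
    tested_limb_position: str
    hhd_landmark: str                   # anatomic landmark for HHD placement
    stabilization: str
    gravity_eliminated: Optional[bool]  # None when the study does not report it


@dataclass
class ProtocolRecord:
    """Protocol variables charted for one study (5 items)."""
    study: str
    hhd_device: str
    units: str
    testing_procedure: str
    muscle_groups: list[str] = field(default_factory=list)
    # Keyed by muscle group name; hypothetical layout.
    positioning: dict[str, Positioning] = field(default_factory=dict)

    def undocumented_positioning(self) -> list[str]:
        """Muscle groups listed as tested but lacking a positioning entry."""
        return [m for m in self.muscle_groups if m not in self.positioning]


# Hypothetical example record, not data from any included study.
record = ProtocolRecord(
    study="Example study",
    hhd_device="Example HHD",
    units="N",
    testing_procedure="make test, 3 trials",
    muscle_groups=["knee extensors", "hip abductors"],
)
record.positioning["knee extensors"] = Positioning(
    subject_position="sitting",
    tested_limb_position="knee flexed at 90 degrees",
    hhd_landmark="anterior distal tibia",
    stabilization="strap around the thigh",
    gravity_eliminated=False,
)
print(record.undocumented_positioning())
```

A helper such as `undocumented_positioning` makes gaps in protocol reporting visible while charting, which is the kind of information summarized in the positioning tables.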

Results

Relevant literature identification
As shown in figure 1, a total of 5021 studies were identified with the initial search in scientific literature databases and 336 papers were found in the gray literature by searching the Proquest Dissertations and Theses website. As 1342 duplicates were identified and excluded, 4015 studies were screened. Of these, 43 studies were selected based on titles and abstracts. Three articles were added after verification of references. During full-text screening of the remaining 46 articles, 35 papers were excluded by the 2 reviewers in accordance with the inclusion and exclusion criteria (see fig 1 for reasons for exclusion). Eleven articles were selected for the final data extraction. Two studies, Bohannon 31 and Bohannon, 32 were excluded, as they were a systematic review and a meta-analysis, respectively. These 2 studies included articles that were either already included in our scoping review or were excluded according to our eligibility criteria. Finally, the data from 9 articles were extracted, analyzed, and discussed.
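The study flow reported above can be checked with simple arithmetic. The short sketch below recomputes each stage from the counts given in the text; the function and key names are illustrative choices, not part of the review protocol.

```python
def screening_flow(database_hits, gray_hits, duplicates,
                   kept_on_title_abstract, added_from_references,
                   excluded_full_text, excluded_reviews):
    """Recompute each stage of the study selection from the reported counts."""
    identified = database_hits + gray_hits
    screened = identified - duplicates
    full_text = kept_on_title_abstract + added_from_references
    eligible = full_text - excluded_full_text
    extracted = eligible - excluded_reviews
    return {
        "identified": identified,
        "screened": screened,
        "full_text": full_text,
        "eligible": eligible,
        "extracted": extracted,
    }


# Counts as reported in the text.
flow = screening_flow(
    database_hits=5021,
    gray_hits=336,
    duplicates=1342,
    kept_on_title_abstract=43,
    added_from_references=3,
    excluded_full_text=35,
    excluded_reviews=2,
)
print(flow)
```

With these inputs the computed stages are 4015 studies screened, 46 full texts assessed, 11 articles eligible, and 9 articles extracted, which agrees with the figures reported.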

Study characteristics and data summary
Information regarding the selected studies is presented in table 2. The data regarding the protocol variables are summarized in table 3, and the data for the positioning for muscle testing are summarized in tables 4 (upper limb muscle groups) and 5 (lower limb muscle groups), respectively.

Normative or reference values
Different terms were used to identify the maximal muscle strength data obtained from groups of individuals presenting with similar characteristics. Two studies used the term "normative values," 33,34 1 study used the term "reference values" only, 35 and 3 studies used both terms as synonyms. [36][37][38] Two studies used the terms "preliminary baseline databases" or "preliminary information" to describe the obtained strength values, 39,40 and 1 study reported them as data. 41 No study provided a definition of the terms "normative" and "reference" values.

Instruments and measures
In the included studies, measures of MIMS were collected using 8 different HHD devices: Accuforce II, c MicroFET

Testing procedures
Protocols varied greatly between studies. All protocols used isometric "make" tests in compression mode. For most protocols, muscle strength evaluations were performed in gravity-neutralized positions for all muscle groups tested, with the exception of 3 studies in which some or all muscle groups were tested against gravity. [33][34][35] The duration of the maximal isometric voluntary contraction for each trial varied across studies from 3 to 7 seconds, whereas the resting time varied from 10 seconds to 2 minutes. The number of repeated trials per muscle group ranged between 1 and 5 maximal isometric voluntary contractions. Verbal encouragements and stimuli were given during measurements in only 3 studies. 33,35,38 The strength measures were performed by only 1 evaluator in 4 studies, 37,[39][40][41] 2 evaluators in 4 studies, [33][34][35]38 and 3 evaluators in 1 study. 36 The experience of the evaluators was not specified in half of the studies, and for those that reported it, experience level differed greatly (3-10y using HHD).

Muscle groups
There is considerable variability in the muscle groups tested in the 9 studies analyzed. Two studies reported strength measurements of upper limb muscle groups only, 34,40 3 reported lower limbs only, 33,35,39 and 4 studies recorded data for both upper and lower limbs. [36][37][38]41 Muscle groups tested in upper limbs included flexors/extensors, abductors/adductors and internal/external rotators of the shoulder, and elbow and wrist flexors and extensors. Regarding lower limbs, tested muscle groups were the flexors/extensors, abductors/adductors and internal/external rotators of the hip, the flexors/extensors of the knee, and the dorsi/plantar flexors of the ankle. In the 9 studies included, strength data were available for both sexes in all muscle groups at least once, except for the wrist flexors, which were only available for women. Plantar flexors, shoulder and hip adductors, and wrist flexors are the muscle groups for which strength data are poorly documented.

Participants
Convenience samples of participants were recruited for all studies included in the scoping review. In most of them, ethnicity was not specified. In the study by Al-Abdulwahab, 39 39 and the other studies reported reference values of muscle strength for both sexes. [34][35][36][37][38]

Positioning and protocol reproducibility
Seven of the included studies provided sufficient information to reproduce the protocol used, particularly the position of the participant for muscle testing, the limb and joint positions during the measurement process, the anatomic landmarks used for the placement of the dynamometer, and the stabilization of the segments. Additionally, McKay et al 38 described the evaluator's position. Only 4 studies included pictures. [34][35][36]37 Four studies referred to other published article protocols by the same research group where all the information needed to reproduce the protocol is available. 37,38,40,41 Most studies provided sufficient details to reproduce the protocol used, which allowed us to determine that there does not seem to be a consensus on standard protocols to measure maximal muscle strength. Tables 4 and 5 present the positioning for each muscle group in each study.

Discussion
The aim of this scoping review was to identify and map the existing body of literature regarding MIMS reference values of upper and lower limb muscle groups obtained with HHD in healthy adults. Only 9 studies met the inclusion criteria and were included in the scoping review and further analysis. In light of the results of these studies, certain MIMS reference values were established in healthy men and women aged 18 to 101 years using an HHD protocol for a variety of muscle groups of the upper and lower limbs. Unfortunately, these studies present several shortcomings that significantly restrict their use as valid reference values.
The first research question of this study was to identify whether consensus or consistency exists in the use of the terms "reference value" vs "normative value." This scoping review suggests that there is indeed no consensus in this regard in the literature. To determine if muscle strength is considered "normal" for a given individual of a given age and sex, the measured value must be compared with a value considered to be the norm. This reflects an unfounded assumption that there is a certain universality to the construct of muscular strength. In addition, it is to be noted that the terms "reference values" and "normative values," which are often used as synonyms in the literature, are 2 distinct concepts that are worthy of discussion. Normative values are defined as values "of, relating to, or determining norms or standards," which in turn are defined as "a set standard of development or achievement usually derived from the average or median achievement of a large group." 42 Such values should be obtained from a very large cohort. Most of the studies included in this scoping review involved specific and fairly homogeneous samples of the population, with distinct characteristics. The term "reference values" is defined as the values obtained from individuals presenting conditions that are similar to that of the tested subject and well described, in circumstances that are well controlled, thus allowing adequate comparison and interpretation of the values obtained from the test. 43 It may therefore be more appropriate to identify the values obtained from MIMS testing as reference values to be used for comparisons with individuals showing similar characteristics.
Regarding the second research question of this study, although one would expect muscle strength in adults to be well documented, this does not appear to be the case in manual dynamometry; there are many gaps in the studies published on the subject. Several limitations are related to the type of devices used to collect strength measurements as well as the procedures surrounding their use. As mentioned above, the type of device used was highly diverse. Eight different HHD devices were used in the included studies, all with different characteristics (units of measurement, upper force limit, device design [attachments, handles], compression, or traction mode), restricting comparison of the values obtained with each. Consequently, it is impossible to claim that the reference values with one device or another would be equivalent without knowledge of the concomitant validity between tools. This severely limits the clinical use of the existing reference values presented in these studies. The upper force measuring limit of the devices, also highly variable (250-1959 N), compromises the accuracy of measures in muscle groups with capacity that exceeds the measurement ceiling, as is the case for the knee extensors. Some studies included participants who generated forces above the dynamometer's upper limit of measurement, creating a ceiling effect that invalidates the mean values obtained for the muscle group involved. 36,37,40 Therefore, these values cannot be taken into consideration for comparison.
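The ceiling effect described above can be illustrated numerically. The sketch below uses hypothetical true forces and the lowest upper limit mentioned in this review (250 N); it is not a model of any specific device, only a demonstration that capped readings pull the group mean downward.

```python
from statistics import mean


def recorded_forces(true_forces_n, device_limit_n):
    """Readings of a device that cannot display more than its upper limit."""
    return [min(force, device_limit_n) for force in true_forces_n]


# Hypothetical true knee extensor forces (N), not data from any included study.
true_forces = [180.0, 220.0, 260.0, 300.0, 340.0]
limit = 250.0  # lowest upper measuring limit reported among the devices

recorded = recorded_forces(true_forces, limit)
at_ceiling = sum(1 for force in recorded if force == limit)

print(mean(true_forces))  # mean of the true forces
print(mean(recorded))     # mean of the capped readings
print(at_ceiling)         # number of participants at the ceiling
```

In this example the true mean is 260 N, but the mean of the recorded values is 230 N because 3 of the 5 participants exceed the limit. A reference value built from such readings would underestimate the strength of the group.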
[Table 4: Positioning for muscle testing (upper limb muscle groups). Column headings: Muscle Groups; Andrews et al 36; Bohannon 41; Bohannon 37; Bohannon 40; McKay et al 38; Riemann]

Another major limitation in the current literature on HHD strength values is that these values are reported in units of force (kg or N) rather than torque (Newton-meters), making it impossible to use these values for comparison purposes, which is the main reason for establishing reference values. Indeed, no included study considered the anthropometric characteristics of the participants, which have an important influence on the torque that could be generated. The length of the lever arm (ie, the perpendicular distance between the placement of the HHD and the axis of rotation of the tested segment) is an important parameter as it takes individual differences in body segment length into account in the determination of the tensile force generated. For example, Alvarenga et al 33 showed stronger hip flexors than hip extensors, which is unlikely considering that when controlling for lever arm and muscle length, the hip extensors are almost twice as strong as the hip flexors in isometric or in low velocity testing conditions. 20,23,24 This observed difference could be explained by the more proximal placement of the dynamometer for the hip flexors than the hip extensors, resulting in a shorter lever arm for the flexors and therefore a greater force measurement in newtons on the dynamometer. Had torque been calculated, results could have been quite different. This example demonstrates the importance of measuring the lever arm and of expressing results in torque rather than in units of force. Also, and surprisingly, some studies report strength data as a percentage of body weight. The rationale for doing so is not explained, and the clinical meaning of using such a ratio or percentage should be clearly described to make this percentage a significant biomarker of muscle impairments.
From this scoping review, it appears that reference values are not available for both sexes for muscle groups such as the radial and ulnar deviators of the wrist, the ankle evertors/invertors, and the flexors/extensors and abductors/adductors of the fingers. This highlights the lack of muscle strength reference values for distal muscle groups in the literature. In addition, no MIMS reference values were found for the wrist flexors in men. 
Although these muscle groups are less often evaluated in clinical settings, they can be a good indicator of weakness and diagnostic criteria for several neuromuscular diseases or musculoskeletal disorders. This supports the importance of paying closer attention to these muscle groups.
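The force versus torque argument made earlier in this discussion can be sketched as a short calculation. Torque is taken here as the force read on the dynamometer multiplied by the lever arm (torque = force × lever arm, in N·m), assuming the force is applied perpendicular to the segment. The forces and lever arms below are hypothetical values chosen for illustration and are not taken from Alvarenga et al or any other included study.

```python
STANDARD_GRAVITY = 9.80665  # m/s^2, used to convert kilogram-force to newtons


def kgf_to_newtons(force_kgf):
    """Convert a dynamometer reading in kilogram-force to newtons."""
    return force_kgf * STANDARD_GRAVITY


def torque_nm(force_n, lever_arm_m):
    """Torque (N·m) from a perpendicular force (N) and a lever arm (m)."""
    return force_n * lever_arm_m


# Hypothetical readings: proximal HHD placement for the hip flexors
# (short lever arm) and distal placement for the hip extensors.
flexor_force_n, flexor_lever_m = 250.0, 0.25
extensor_force_n, extensor_lever_m = 200.0, 0.40

flexor_torque = torque_nm(flexor_force_n, flexor_lever_m)
extensor_torque = torque_nm(extensor_force_n, extensor_lever_m)

print(flexor_force_n > extensor_force_n)  # flexors look stronger in newtons
print(flexor_torque, extensor_torque)     # extensors are stronger in N·m
```

In this example the flexors produce the larger reading in newtons (250 vs 200 N) but the smaller torque (62.5 vs 80 N·m), so the ranking of the 2 muscle groups reverses once the lever arm is taken into account.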
One of the research questions of this scoping review was to determine whether consensus exists regarding the protocols and methodologies used for muscle testing with HHD to obtain reference values. Although most of the studies provided a description of the protocols, some of the muscle testing positions present measurement biases, such as evaluation of MIMS of certain muscle groups in positions against gravity or with insufficient joint stabilization. In addition to increasing the evaluator's role in achieving stability of the participant and the presence of cocontractions, testing muscle strength against gravity leads to an underestimation of the strength values obtained. In such a case, the weight of the limb or segment evaluated should be subtracted from the force exerted to obtain a valid result, which is clinically impractical. Alvarenga et al 33 and de Oliveira et al, 35 who tested hip muscle groups against gravity, as well as Riemann et al, 34 who tested the external rotators of the shoulder in prone position, did not take the weight of the segment into account. Such methods render the reference values obtained invalid for between-subject comparisons, especially for comparisons with other studies where gravity was eliminated.
Stabilization of the subject and the HHD is essential to ensuring good content validity of maximal values obtained in an isometric condition. When stabilization is insufficient, certain compensatory movements that influence the amount of force generated by the person can be observed. In addition, the balance between the force exerted by the subject and the rater's ability to properly resist it is not maintained, inducing a subtle movement of the joint and the segment. Therefore, the muscle length and consequently the strength values are modified. Some muscle groups like the knee extensors or the hip flexors, extensors, and abductors are very strong, and it is unlikely that a clinician would have the capacity to resist the force generated by these muscle groups in compression mode without any additional stabilization. 20,23 Indeed, in some studies (eg, Al-Abdulwahab et al 39), the evaluator used straps to stabilize the segment and minimize unwanted hip, pelvic girdle, and trunk movements during knee extension testing. For the same muscle group, Andrews et al 36 and Bohannon 37 had an assistant to help stabilize the trunk for the same reasons. Yet, these procedures do not increase the ability of the evaluator to resist the force exerted by the individual. [36][37][38][39] Only de Oliveira et al 35 used a belt strap made from inelastic material for better positioning of the HHD and minimization of the evaluator's effort during strength measurement of the hip flexors, extensors, and abductors. However, the landmark for the positioning of the strap was not described in the paper, limiting the reproducibility of the protocol.
Other characteristics of the strength measurement protocols could also lead to measurement biases, such as the absence of verbal stimulation/motivation during the measurements, the duration of rest periods between each trial, and the contraction time. Many studies included in the review did not use verbal stimulation during the strength measurement or did not mention it; yet motivation can affect the force generated by the participant, increasing maximal strength values. Indeed, Jung et al 44 showed that static grip strength was significantly higher with the use of verbal encouragement. Furthermore, there is no consensus among studies concerning optimal rest time between trials. De Salles et al 45 showed that when executing repeated maximal strength assessments, 1 minute rest intervals are sufficient to then complete a second attempt of a 1 repetition maximum bench press or back squat. However, these concentric exercises require a high level of neuromuscular coordination and cannot be compared with maximal isometric contractions. No evidence has been found in the literature about repeated maximal isometric voluntary contractions. In this scoping review, some studies used an intertrial rest time of less than 1 minute; this may have affected recovery, but more research on the subject is needed. 33,35,38
Regarding the characteristics of the participants, although the study samples included participants aged between 18 and 101 years, some studies did not report the values according to decade, 40 and others stratified the values into large age groups. 35 This latter approach represents a way of reporting reference values that may tend to underestimate strength values of the younger participants and overestimate the values of the older, reducing the external validity of the data collected. 
Some studies did not specify the activity level of the participants, which is another limitation considering that the training volume and types of activity practiced can significantly affect muscle strength capacity.
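The effect of stratifying values into large age groups, discussed above, can be illustrated with a small sketch. The ages and strength values below are hypothetical and are not data from the included studies; the function name and the band widths are illustrative choices.

```python
from collections import defaultdict
from statistics import mean


def mean_by_band(records, band_width):
    """Mean strength per age band; bands start at multiples of band_width."""
    groups = defaultdict(list)
    for age, strength in records:
        lower_bound = (age // band_width) * band_width
        groups[lower_bound].append(strength)
    return {lower: mean(values) for lower, values in sorted(groups.items())}


# Hypothetical (age in years, strength in N) pairs showing an age-related decline.
records = [
    (22, 300.0), (27, 290.0),
    (34, 280.0), (38, 270.0),
    (45, 250.0), (49, 240.0),
    (53, 220.0), (58, 210.0),
]

by_decade = mean_by_band(records, 10)     # keys 20, 30, 40, 50
by_wide_band = mean_by_band(records, 40)  # keys 0 (ages 0-39) and 40 (ages 40-79)

print(by_decade)
print(by_wide_band)
```

Here the wide band covering participants younger than 40 years gives a single reference of 285 N, which is lower than the mean of the participants in their twenties (295 N) and higher than that of the participants in their thirties (275 N). Reporting by decade preserves this difference.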