Introduction

Osteoporosis and related fractures pose a major public health and economic burden, especially when there is a dramatic demographic shift toward an aging population. Approximately 30% of individuals with a hip fracture will die within the first year of the index fracture (Goldacre et al. 2002; Roberts and Goldacre 2003) and many more will experience significant functional loss (Boonen et al. 2004; Randell et al. 2000). Treating fractures is also very costly: a typical patient with a hip fracture incurs US $40,000 in direct medical costs in the first year following hip fracture and almost US $5,000 in subsequent years (Burge et al. 2007). Vertebral compression fractures are less costly but have a substantial negative impact on the patient’s function and quality of life (Tosteson et al. 2001). A recent study revealed that improvements in the surgical and medical management of hip fracture have resulted in a decline in hip fracture rates and subsequent mortality among persons 65 years and older, although comorbidities among these patients have increased (Brauer et al. 2009). As a consequence, health care systems may be further burdened by fractures and their associated comorbidity.

A number of clinical tools have been devised to screen for subjects at risk of osteoporotic fracture. The use of bone mineral density (BMD) alone has been proven to be unsuitable because of its low sensitivity: most fractures occur in the much larger group of individuals with BMD T-scores above the cutoff value of −2.5 for osteoporosis (Siris et al. 2004). To overcome this limitation, a number of fracture risk prediction tools have been developed that incorporate both clinical risk factors and BMD data, such as FRAX®. In 2008, the World Health Organization (WHO) task force introduced FRAX®, a country-based risk calculator for 10-year risk for hip fracture and major osteoporotic fractures. FRAX® is modeled on data derived from nine population-based cohorts with ethnic, gender, and geographic diversity and has been validated in 11 independent cohorts representing over one million person-years of observation. The WHO FRAX model utilizes ten clinical risk factors, with or without BMD, for fracture risk prediction. In areas where BMD measurement is unavailable, BMD is replaced by BMI as they share a similar risk profile for fracture prediction.

In addition to FRAX, a number of population-specific clinical tools have identified risk factors other than those in FRAX for fracture prediction (Hippisley-Cox and Coupland 2009; Kung et al. 2007; Lewis et al. 2007; Nguyen et al. 1996; Tsang et al. 2011). Incidence of fall, a well-established non-BMD clinical risk factor for fracture (Jarvinen et al. 2008; Karinkanta et al. 2010; Woolf and Akesson 2003), is one of those currently under evaluation. Handgrip strength (HGS) is a less studied but potentially useful objective parameter to predict fractures. HGS has often been used as an indicator of general muscle strength since it is an objective parameter that is quick, easy to determine, independent of observer variation, and inexpensive. It is also associated with markers of frailty other than chronological age (Syddall et al. 2003). Therefore, we hypothesized that HGS could be a useful predictor of fracture, and its action could be independent of BMD.

Previous studies reported a significant association between HGS and fracture. However, these studies were limited to young perimenopausal women (Sirola et al. 2008) or postmenopausal women (Karkkainen et al. 2008). The association between HGS and fracture in men and women aged ≥50 years therefore remains largely unknown. Moreover, the predictive power of HGS for incident fracture has not been studied. The purpose of this study was to examine the relation between HGS and major clinical fragility fractures at the spine, hip, distal forearm, and humerus in a population-based cohort and to assess the usefulness of HGS in predicting major fragility fractures in a prospective cohort.

Materials and methods

Study design

This was a cross-sectional and prospective study conducted at the Queen Mary Hospital in Hong Kong.

Study subjects

This study formed part of the Hong Kong Osteoporosis Study, which was initiated in 1995. The population cohort participants were community-dwelling southern Chinese men and women recruited from public road shows and health fairs held in various districts of Hong Kong. From 1998 to 2009, a total of 9,353 southern Chinese men and women were recruited to Queen Mary Hospital, Hong Kong. Subjects with known skeletal disease or prescribed medication that would affect bone mineral metabolism were excluded. Among them, 4,649 individuals with missing data on HGS or BMD or prevalent fractures or any covariates were also excluded along with those under the age of 50 years (n = 1,883). Moreover, subjects with missing data on prevalent fracture and history of fall in the past 12 months were also excluded (n = 28). A total of 1,217 men and 1,576 women were analyzed in the present study. Among these 2,793 subjects, a subset of 1,702 subjects agreed to participate in the prospective Hong Kong Osteoporosis Study (Kung et al. 1999). The duration of follow-up corresponded with the time from baseline to the occurrence of fracture, death, or the last follow-up visit. The last follow-up data were collected by June 2010.

All participants gave informed consent, and the study was conducted according to the Declaration of Helsinki. The study protocol was approved by the Institutional Review Board of the University of Hong Kong and the Hospital Authority Hong Kong West Cluster Hospitals.

Health history, lifestyle, and demographic data

In brief, baseline demographic data on anthropometric measurements, socioeconomic status, education level, and medical and reproductive history were obtained using a structured questionnaire administered by a research assistant. In addition, self-history and family history of osteoporosis and low-trauma fractures at the spine, hip, distal forearm, and proximal humerus after the age of 45 years were obtained. Lifestyle and dietary habits, including smoking, alcohol consumption, and physical activity, were also recorded. Details of this have been described previously (Kung et al. 2007; Tsang et al. 2011).

Handgrip strength

Baseline HGS (in kilograms) was measured using a dynamometer (Smedley Hand Dynamometer, Stoelting Co, Wood Dale, IL). The test was administered by a trained nurse, and the mean score of three measures in the dominant hand was used in the analysis since it has been suggested that the mean of three trials was more reliable than that of one trial (Mathiowetz et al. 1984).

We computed a score for standardized HGS using the formula: standardized T-score = (value − young reference mean)/young reference standard deviation. The age group with maximum mean HGS served as the reference group for the other age groups. A previous study suggests that HGS is influenced by bodyweight (Foley et al. 1999). A T-score for standardized HGS per unit weight (HGS-WT) was thus also computed to compare HGS when corrected for body weight.
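The T-score formula above can be illustrated with a minimal sketch. The reference readings, the subject's values, and the use of the sample standard deviation are assumptions made for illustration only; they are not data or software from this study.

```python
from statistics import mean, stdev


def t_score(value, reference_values):
    """T-score of `value` relative to a young reference group.

    Implements (value - reference mean) / reference SD.
    The sample SD is used here; this choice is an assumption.
    """
    return (value - mean(reference_values)) / stdev(reference_values)


# Hypothetical HGS readings (kg) from the age group with the highest mean HGS.
young_reference_hgs = [30.0, 34.0, 38.0, 42.0, 46.0]

# Hypothetical older subject: HGS of 26 kg.
hgs_t = t_score(26.0, young_reference_hgs)

# The same function applies to HGS per unit body weight (HGS-WT):
# divide each reading by the corresponding (hypothetical) body weight first.
young_reference_weights = [55.0, 60.0, 62.0, 70.0, 72.0]
young_reference_hgs_wt = [
    h / w for h, w in zip(young_reference_hgs, young_reference_weights)
]
hgs_wt_t = t_score(26.0 / 58.0, young_reference_hgs_wt)

print(f"HGS T-score: {hgs_t:.2f}")
print(f"HGS-WT T-score: {hgs_wt_t:.2f}")
```

A negative T-score indicates a value below the young reference mean, in the same way as a BMD T-score.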

Dual-energy X-ray absorptiometry and fracture assessment

BMD was measured at the L1–L4 lumbar spine, femoral neck (FN), and total hip region using dual-energy X-ray absorptiometry (DXA; Hologic QDR-4500). Instruments were calibrated daily. The in vivo precisions of DXA at the lumbar spine, FN, and total hip were 1.2%, 1.5%, and 1.8%, respectively (Mei et al. 2001). Baseline thoraco-lumbar spine X-rays were assessed for radiographic evidence of spinal fractures. All DXA measurements were performed by two licensed technologists who had completed training by the equipment manufacturers and were accredited by the International Society for Clinical Densitometry. Bone mass measurements were expressed in both absolute units (grams per square centimeter) and BMD T-score according to local reference data (Kung et al. 1999). Osteoporosis was defined according to the WHO classification of BMD T-score less than or equal to −2.5 at either the lumbar spine or hip.

Major clinical fragility fractures

For the prospective cohorts, occurrence of incident low-trauma fractures was determined through yearly telephone interview using a structured questionnaire. In this study, the primary end point was the occurrence of a major clinical fragility fracture at the spine, hip, distal forearm, or proximal humerus. Only low-trauma (defined by a fall from standing height or less) fractures were included in the analyses. The information was subsequently corroborated using the computerized patient record system of the Hong Kong Hospital Authority that manages outpatient clinics and hospitals attended by the majority (94%) of the Hong Kong population. For those patients who did not attend Hospital Authority clinics, clinical outcome information was verified by their attending physician. Clinical fractures were verified by X-ray.

Statistical analyses

In the cross-sectional analysis, we assessed the association of HGS with prevalent osteoporotic fractures in the population-based cohort using univariate and multivariate logistic regression models.

A Cox proportional hazards regression model was applied to determine whether HGS is a predictor of major incident clinical fracture at the spine, hip, distal forearm, or proximal humerus. The model was adjusted for the following potential confounders, all measured at baseline: age, BMI, presence of diabetes (categorical: yes/no), presence of prevalent fracture (categorical: yes/no), current smoker (categorical: yes/no), current drinker (categorical: yes/no), history of fall in the past 12 months (categorical: yes/no), exercise >1 h/week (categorical: yes/no), and FN BMD T-score. The results of the univariate and multivariate models are presented. Receiver operating characteristic (ROC) curves were constructed and the area under the curve (AUC) of the diagnostic test was obtained. Cox regression and ROC calculations were performed using SPSS V16.0.2 software. The p value for the difference in AUC between two ROC curves was calculated using the freeware ROCKIT (http://www-radiology.uchicago.edu/krl/KRL_ROC/software_index6.htm; Dorfman et al. 1992). A two-sided p value ≤0.05 was considered statistically significant. We also assessed the potential benefit of using a combination of HGS T-score and FN BMD T-score (by simple addition of the HGS T-score to the BMD T-score) to identify fractures using six performance measures: sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR+), and negative likelihood ratio (LR−). The likelihood ratio measures a test result’s ability to modify the pretest probability: it is used to convert the estimated probability of the suspected diagnosis before the test result is known (pretest probability) into a posttest probability that takes the result into account (Pewsner et al. 2004).
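As an illustration of the six performance measures and of the combined T-score obtained by simple addition, the following minimal sketch computes them from a 2 × 2 table. The function names, the cutoff passed to the classifier, and all counts are hypothetical; they are not the study data, and the study's own calculations were done in SPSS.

```python
def combined_t_score(hgs_t, fn_bmd_t):
    """Combined T-score: simple sum of HGS T-score and FN BMD T-score."""
    return hgs_t + fn_bmd_t


def is_test_positive(hgs_t, fn_bmd_t, cutoff):
    """A subject is test-positive when the combined T-score is at or below the cutoff."""
    return combined_t_score(hgs_t, fn_bmd_t) <= cutoff


def diagnostic_measures(tp, fp, fn, tn):
    """Six accuracy measures from true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Hypothetical counts: 50 subjects with fracture, 950 without.
measures = diagnostic_measures(tp=20, fp=100, fn=30, tn=850)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

An LR+ above 1 raises the probability of fracture after a positive result, whereas an LR− below 1 lowers it after a negative result.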

Results

The mean age of men and women in the population-based cohort was 67.9 and 64.1 years, respectively (Table 1). HGS was normally distributed in both males and females. As expected, men had a higher BMD and HGS than women across all age groups (Tables 1 and 2), with the highest HGS recorded in the 30- to 39-year age group for both men and women (Fig. 1). HGS showed moderate but significant correlations with BMD at both the hip and spine with higher correlation coefficient detected at the hip (r = 0.408, p < 0.001 and r = 0.298, p < 0.001, respectively).

Table 1 Baseline characteristics (mean ± SD) of participants in the cross-sectional study (N = 2,793)
Table 2 Baseline characteristics (mean ± SD) of participants in the prospective study (N = 1,702)
Fig. 1 Mean and SD of HGS (a) and HGS per unit body weight (b) in different age groups: (1) 20–29 years; (2) 30–39 years; (3) 40–49 years; (4) 50–59 years; (5) 60–69 years; (6) 70–79 years; (7) 80 years and older

Among postmenopausal women and men older than 50 years, prevalences of osteoporosis, as defined by BMD T-score −2.5 or less at the spine or hip, were 6.6% and 38.2%, respectively. At the baseline visit, a total of 592 clinical major fragility fractures at the spine, hip, distal forearm, and proximal humerus were recorded in 10.5% of men and 29.4% of women. In all subjects studied, 21.8% had a history of fall in the last 12 months, 13.5% had diabetes, 18.6% were current smokers, 15.1% were current drinkers, and 55.7% exercised >1 h/week.

The relationship between HGS and fractures with adjustment for different covariates in the cross-sectional cohort is shown in Table 3. In the univariate model, each unit (1 SD) decrease in HGS T-score and BMD T-score was associated with 2.20-fold (95%CI = 1.98–2.43) and 3.32-fold (95%CI = 2.97–3.71) increased odds for fracture, respectively (both p < 0.001). The multivariate model with adjustment for clinical risk factors including age, sex, BMI, history of fall, presence of diabetes, current smoking, current drinking, physical activity, HGS T-score (in BMD analysis), and FN BMD T-score (in HGS analysis) revealed 1.24-fold (95%CI = 1.09–1.42, p < 0.001) and 2.13-fold (95%CI = 1.84–2.46, p < 0.001) increased odds for fracture with each unit decrease in HGS T-score and BMD T-score, respectively. The same observation was made when HGS-WT T-score, instead of HGS T-score, was analyzed (data not shown).

Table 3 Binary logistic regression analysis of each SD reduction in HGS with major clinical osteoporotic fracture at baseline

To evaluate the association of HGS T-score with fracture risk prospectively, a prospective study of 1,702 subjects (51.8% men, mean age 67 years (SD = 9.5); 48.2% women, mean age 60.9 years (SD = 8.4)) was performed (Table 2). All 1,702 subjects completed follow-up. During a mean follow-up of 2.9 ± 1.4 years and a total follow-up of 4,855 person-years, 43 confirmed fragility fractures were recorded, giving an overall fracture incidence of 886 per 100,000 person-years.
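The incidence figure follows directly from the number of events and the person-years of follow-up reported above. A minimal sketch of the arithmetic (the function name is ours, for illustration only):

```python
def incidence_per_100k(events, person_years):
    """Crude incidence rate expressed per 100,000 person-years."""
    return events / person_years * 100_000


# Values reported in the text: 43 fractures over 4,855 person-years.
rate = incidence_per_100k(events=43, person_years=4855)
print(f"{rate:.0f} per 100,000 person-years")
```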

In the univariate analysis, each SD reduction in FN BMD and HGS was associated with an increased risk of fracture with a hazard ratio (HR) of 2.92 (95%CI = 2.14–3.98, p < 0.001) and 2.58 (95%CI = 1.88–3.52, p < 0.001), respectively (Table 4). In the multivariate adjusted model, each unit reduction in HGS T-score and FN BMD T-score was associated with an increased risk of fracture events with an HR of 1.57 (95%CI = 1.06–2.33, p = 0.024) and 1.77 (95%CI = 1.15–2.71, p = 0.009), respectively (Table 4).

Table 4 Multivariate Cox regression analysis of HRs (95%CI) for clinical fracture in the prospective cohort (n = 1,702)

Our prospective analysis confirmed that HGS T-score is a predictor of fracture, independent of FN BMD. Therefore, we hypothesized that the combined T-scores of HGS and FN BMD (combined T-score) may have increased power to identify subjects with fracture. ROC analyses were then performed to examine the power of different variables in predicting incident fracture. The AUCs of HGS T-score, FN BMD T-score, and combined T-score at baseline in predicting fracture were 0.735, 0.778, and 0.801, respectively (all p < 0.001; Table 5). After inclusion in the model of clinical risk factors such as age, sex, BMI, history of falls, presence of diabetes, current smoker, current drinker, and presence of prevalent fractures, the AUCs of HGS T-score, FN BMD T-score, and combined T-score in predicting fracture were 0.853, 0.853, and 0.859, respectively (all p < 0.001). The difference in AUC between two ROC curves was tested using ROCKIT, and there was no significant difference between the AUC of HGS T-score and that of FN BMD T-score. Although the combined T-score had a higher AUC value than HGS T-score or FN BMD T-score, the difference was also not statistically significant (p > 0.05).
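For readers unfamiliar with the AUC, the following minimal sketch shows one standard way to compute it: as the proportion of case–control pairs in which the subject with fracture has the lower T-score (the Mann–Whitney formulation), with ties counted as one half. The T-scores below are hypothetical and chosen only to make the calculation easy to follow; the AUCs reported here were obtained with SPSS, not with this code, and the sketch does not test differences between curves as ROCKIT does.

```python
def auc_lower_is_riskier(case_scores, control_scores):
    """AUC as the fraction of case-control pairs ranked correctly.

    A pair is ranked correctly when the case has the lower score;
    tied pairs count as one half.
    """
    correct = 0.0
    for case in case_scores:
        for control in control_scores:
            if case < control:
                correct += 1.0
            elif case == control:
                correct += 0.5
    return correct / (len(case_scores) * len(control_scores))


# Hypothetical (HGS T-score, FN BMD T-score) pairs.
cases = [(-2.0, -2.5), (-1.0, -1.0), (-0.5, -2.0)]                 # with fracture
controls = [(-1.5, 0.0), (0.0, -1.2), (0.5, 0.0), (1.0, -0.5)]      # without fracture

auc_hgs = auc_lower_is_riskier([h for h, _ in cases], [h for h, _ in controls])
auc_bmd = auc_lower_is_riskier([b for _, b in cases], [b for _, b in controls])
# Combined T-score: simple addition of the two T-scores per subject.
auc_combined = auc_lower_is_riskier(
    [h + b for h, b in cases], [h + b for h, b in controls]
)

print(f"HGS: {auc_hgs:.3f}, FN BMD: {auc_bmd:.3f}, combined: {auc_combined:.3f}")
```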

Table 5 Area under the curve (95%CI) from the ROC analysis for incident fracture

Based on the ROC analysis, a combined T-score of −2.69 had the highest sum of sensitivity and specificity (Electronic supplementary material (ESM) Table 1). We also observed that a combined T-score of −4.21 had the same specificity as an FN BMD T-score of −2.5 or less in predicting fractures. The accuracy of FN BMD T-score ≤−2.5 and of combined T-score ≤−2.69 and ≤−4.21 in predicting fracture in all subjects is provided in Table 6; the corresponding ROC curve and coordinate points are provided in ESM Fig. 1 and ESM Table 1, respectively. As expected, FN BMD T-score ≤−2.5 as a cutoff yielded a high specificity of 0.91 (95%CI = 0.896–0.923) but a low sensitivity of 0.386 (95%CI = 0.257–0.534) in predicting incident fractures (Table 6). In contrast, a combined T-score of −4.21 or less as a cutoff yielded the same specificity as FN BMD T-score ≤−2.5 but a higher sensitivity, PPV, NPV, and LR+ and a lower LR− (Table 6). Using a combined T-score of −2.69 or less as a cutoff provided the highest sum of sensitivity and specificity, with a sensitivity of 0.837 (95%CI = 0.687–0.927) and a specificity of 0.653 (95%CI = 0.630–0.676).

Table 6 Accuracy of various cutoffs in predicting incident fracture in the prospective cohort

Discussion

HGS is significantly associated with fracture, even after adjustment for age, BMI, history of falls, FN BMD T-score, the presence of diabetes, being a current smoker, being a current drinker, exercising >1 h/week, and, particularly, the presence of prevalent fracture. This implies that the effect of HGS is independent of BMD and of the presence of fragility fracture. This study also demonstrated, for the first time, that the combined T-score can identify more subjects at risk of fracture without loss of specificity. In addition, it is the first study to compare the predictive ability of HGS and BMD. This study provides evidence that HGS is a predictor of future fracture risk and may be applied in addition to BMD as a diagnostic tool in assessing risk of fracture.

The relationship of HGS with BMD and fracture has been a subject of controversy, with conflicting data (Aydin et al. 2006; Bevier et al. 1989; Dixon et al. 2005; Foley et al. 1999). Positive (Dixon et al. 2005), negative (Aydin et al. 2006; Foley et al. 1999), and sex-specific (Bevier et al. 1989; Dixon et al. 2005) associations have all been reported, although these studies were small, limited to women (Dixon et al. 2005; Foley et al. 1999; Karkkainen et al. 2008), or confined to a specific disease population (Aydin et al. 2006).

To define the relation between HGS and osteoporosis and fractures, we evaluated two cohorts, a cross-sectional and a prospective cohort. Our cross-sectional study confirmed the association of low HGS T-score with increased risk of fracture, although it is arguable that the presence of prevalent fractures may lead to physical disability and hence reduced muscle mass and strength. Nevertheless, our prospective study confirmed that HGS is a predictor of future fracture risk and that its effect is independent of BMD and other clinical risk factors. This observation is in accordance with that of Karkkainen et al. (2008), who reported a significant association between HGS and hip fracture in a cross-sectional study of 2,928 Finnish postmenopausal women. A prospective study of 971 Finnish women also reported that HGS was a significant predictor of fracture in subjects with normal BMD (T-score greater than −1; Sirola et al. 2008), although the study was performed in young perimenopausal women with a relatively small number of fracture events.

Our study revealed that HGS per se is strongly associated with BMD and fracture. In the univariate analysis, HGS was strongly associated with osteoporosis at the hip. Both BMD and HGS were strongly associated with age and BMI, and the association remained significant after adjustment for age and BMI. This suggests that the association of HGS with BMD was independent of the effects of age and BMI.

HGS is a good predictor of major clinical fracture. A combination of HGS and FN BMD had better predictive power than either HGS or FN BMD alone. Accurate identification of individuals at risk of osteoporotic fracture facilitates the clinical decision of when and how to treat. A number of fracture risk prediction tools have thus been developed that incorporate both demographic and BMD data. The 10-year fracture risk calculator FRAX®, which was developed by the World Health Organization task force as a country-specific fracture risk assessment tool, was based on data from nine large population-based cohorts with ethnic, gender, and geographic diversity. At present, it is argued that population-specific risk factors rather than a common set of clinical risk factors should be adopted for fracture evaluation. In our study, more than 50% of fractures occurred in individuals with T-scores above −2.5. It is therefore essential to identify other predictors to increase the accuracy of fracture prediction. Our findings suggest that HGS could be one such risk prediction factor, independent of BMD and other clinical risk factors. Although the occurrence of fall is a well-established non-BMD clinical risk factor for fracture, the circumstances in which falls occur are highly variable. In contrast, HGS is an objective and reproducible parameter that reflects overall strength and hence risk of fall. Future studies will be required to validate our findings and determine whether they can be generally applied or whether there is variability across populations or difficulty in classification and standardization.

Although BMD is the most widely used parameter in assessing risk of fracture, DXA scanning is expensive and not available in many health centers, especially in developing countries. In our study, we showed for the first time that the AUC value using HGS T-score and eight risk factors (age, sex, BMI, history of fall, presence of diabetes, current smoker, current drinker, physical activity) reached 0.859. Our findings suggest that assessing HGS using a relatively cheap dynamometer may be an alternative approach to predicting fracture in regions where DXA is not available.

In our prospective study, we demonstrated that the combined T-score was more accurate than the currently agreed threshold of BMD T-score less than −2.5 alone. Fewer than 40% of subjects with incident fractures in our study had an FN BMD T-score less than −2.5. A combined T-score cutoff of −4.21 can identify additional incident fractures with the same specificity. In addition, although our estimate may be affected by the limited number of incident fractures, we observed similarly high sensitivity and specificity when the same strategy was applied in discriminating prevalent fractures in our cross-sectional cohort (data not shown). Improved PPV, NPV, LR+, and LR− were also noted when a combined T-score of −4.21 was used to predict fractures, suggesting that the combined T-score improved the overall accuracy in predicting incident fractures. Notably, the high-risk subjects identified by a combined T-score of −4.21 had an LR+ of 5.14, meaning that their probability of fracture was ∼10 percentage points higher than the pretest probability, given that 2.6% of subjects in our prospective cohort had an incident fracture (Grimes and Schulz 2005). If we assume a pretest probability of fracture of 20%, the probability of fracture in these subjects would increase by ∼36.2 percentage points. Nevertheless, a large-scale prospective study is required to confirm our findings and estimations.
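The conversion from pretest to posttest probability used in the preceding paragraph follows the standard odds form of Bayes' theorem: posttest odds = pretest odds × likelihood ratio. A minimal sketch, using the LR+ of 5.14 and the two pretest probabilities quoted in the text (the function name is ours, for illustration only):

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability into a posttest probability.

    Steps: probability -> odds, multiply by the likelihood ratio,
    then odds -> probability.
    """
    pretest_odds = pretest_probability / (1 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1 + posttest_odds)


LR_POS = 5.14  # LR+ reported for combined T-score <= -4.21

for pretest in (0.026, 0.20):
    posttest = posttest_probability(pretest, LR_POS)
    print(
        f"pretest {pretest:.1%} -> posttest {posttest:.1%} "
        f"(absolute increase {posttest - pretest:.1%})"
    )
```

With these inputs the absolute increases are about 9.5 and 36.2 percentage points, consistent with the approximate figures quoted in the text.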

The Hong Kong Osteoporosis Study is relatively homogeneous: our study was confined to people from southern China, therefore limiting the heterogeneity of results due to the large differences among people with different lifestyles and/or genetic components. The study included subjects with various age ranges, with at least 100 subjects in each age group (except the age group of 20–29 in men), suggesting that the T-score computed in our study and our population-based findings should be representative. It has been suggested that an allometric scaling should be used to study HGS by removing the influence of body weight on HGS (Foley et al. 1999). To overcome this problem, we also computed a T-score using HGS per unit weight scale. The findings using HGS-WT were similar to those using HGS, suggesting that our findings were not influenced or mediated by the effect of body weight.

Although the prevalence of fractures in the cross-sectional study was high, the number of incident fractures in the prospective study was relatively small; this may have led to low statistical power and a consequently inflated false negative rate. For example, although the combined T-score had a higher AUC value than HGS T-score or FN BMD T-score, the difference was not statistically significant. The small number of incident fractures could be one of the underlying explanations. Moreover, because of the low incidence of fracture in the prospective cohort, the data may not be generalizable to high-risk individuals, who are of major clinical interest. In addition, the estimated accuracy of HGS in predicting fracture was likely to be affected by the small number of incident fractures. A large-scale multicenter prospective study is required to determine the optimal diagnostic criterion of HGS in fracture prediction.

Our prospective study is ongoing. We expect the strong association between HGS and fracture risk to persist in future large-scale analysis. In addition, although our prospective fractures were all validated by physicians, the self-reported fractures in our cross-sectional study may have been less accurate, and misclassification of cases may have led to invalid results. Nevertheless, the questionnaire was validated and the fracture information was cross-validated from the patients’ record. Although this study primarily examined the relationship of HGS with bone strength, other important bone parameters, such as bone geometry (Cheung et al. 2010), were not included in our analysis. The relationship of HGS with other bone parameters remains to be identified. Lastly, our study was performed in southern Chinese only and so may not be applicable to other ethnic populations.

This is the first study to compare the predictive ability of HGS and BMD for incident fractures. HGS can predict future fracture, and the prediction is independent of BMD. It serves as an objective test with multiple end points to evaluate subjects at risk of fall and fractures. Together with six simple clinical risk factors, it may be used to predict risk of fracture in regions where DXA is not available.