Introduction

Patient-reported outcome measures (PROMs) aim to assess patients' experiences, such as pain, quality of life or satisfaction with care. There is an increasing emphasis on patient-centered research, and patient perspectives are now recognized as a critical element in the evaluation of health interventions. Indeed, PROMs capture essential aspects of patient-relevant treatment effectiveness. Regulatory agencies such as the Food and Drug Administration (FDA) and the European Medicines Agency (EMA) have encouraged the appropriate use of PROMs in regulatory studies [1, 2]. Health-related quality of life (HRQoL), one of the most commonly measured patient-reported outcomes, can be measured using generic HRQoL questionnaires, such as the SF-36 or the EQ-5D, as well as disease-specific instruments. As their name suggests, disease-specific HRQoL questionnaires measure elements specific to the disease in question, can detect subtle effects of a disease on HRQoL, and are therefore theoretically more sensitive to treatment-related changes than generic HRQoL measures [3]. Because of these characteristics, many disease-specific HRQoL questionnaires have been developed in the past few years, such as the diabetes-specific quality of life questionnaire (DMQoL), the osteoarthritis knee and hip quality of life questionnaire (OAKHQOL) and the rheumatoid arthritis quality of life questionnaire (RAQoL).

According to the 2nd European Working Group on Sarcopenia in Older People (EWGSOP2) [4], sarcopenia can be defined as a “progressive and generalized skeletal muscle disorder that is associated with increased likelihood of adverse outcomes including falls, fractures, physical disability and mortality”. Some recent meta-research syntheses have highlighted the increased likelihood of functional decline, falls, fractures, hospitalizations and even death in individuals with sarcopenia [5,6,7,8]. While these investigations have mainly focused on so-called “hard clinical outcomes”, there has also been a growing interest in the lived experience of people with sarcopenia. Until 2015, HRQoL in individuals with sarcopenia was only measured using generic instruments. Some studies reported reduced HRQoL in individuals with sarcopenia, but this was mainly observed in particular domains of HRQoL, chiefly those related to physical function and mobility [9,10,11,12]. These results suggested that the quality of life of people with sarcopenia may be affected in specific domains that are directly related to the disease and therefore to muscle function. To complement the information obtained from these generic tools and to obtain a more specific measurement of HRQoL in this population, a group of experts decided in 2015 to develop the first sarcopenia-specific HRQoL questionnaire, namely the Sarcopenia & Quality of Life (SarQoL) questionnaire [13]. Because a diagnosis code for sarcopenia was added to the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) one year later, in 2016 [14], the availability of a specific HRQoL assessment for sarcopenia is particularly relevant. To date, the SarQoL questionnaire (http://www.sarqol.org) is the only validated sarcopenia-specific HRQoL instrument available for older people in the scientific literature [13, 15]. Eight years after its development, SarQoL is being used in many epidemiological and interventional studies worldwide.

This narrative review aims to provide an update on the characteristics and validated implementation of the SarQoL questionnaire that is relevant to researchers, clinicians, regulators, the pharmaceutical industry and other stakeholders. Based on an in-depth literature search in the Scopus and Medline bibliographic databases, it presents the SarQoL-related scientific literature published up to January 2023.

Development of the SarQoL questionnaire

The Sarcopenia Quality of Life (SarQoL) self-administered questionnaire was developed in 2015 by a research team of thirteen French-speaking experts from Belgium, France and Switzerland. This team comprised geriatricians, rheumatologists, specialists in physical medicine and rehabilitation, researchers in the field of sarcopenia, French linguists, experts in questionnaire methodology and statisticians. The questionnaire was developed in four steps, based on the scientific guidelines and literature on the development of quality-of-life instruments available at the time [16, 17]. In the first step (i.e., item generation), a systematic literature review and interviews with five individuals with sarcopenia (diagnosed according to EWGSOP1 criteria [18]) and seven experts in the field were conducted to generate a list of items related to HRQoL in sarcopenia. This initial list comprised 180 items, which was considered too extensive for a questionnaire to be completed by older adults. Therefore, a second step, item reduction, was undertaken to reduce this list to the most relevant items to be included in a PROM. Twenty-one individuals with sarcopenia and seven experts in the field of sarcopenia were invited to participate in this item reduction phase and were asked to select the items they considered most relevant from the 180 proposed items. Using a cut-off point of 0.5 (frequency × importance) and expert consensus, a final list of 55 items was obtained, divided into seven domains. In the third step, the expert panel organised these items into 22 questions. Finally, the questionnaire was pre-tested on a sample of 20 older individuals with sarcopenia, who were asked about the relevance and comprehensiveness of each question.

The final SarQoL questionnaire therefore consists of 55 items structured into seven domains of HRQoL and organised into 22 questions rated on 3-, 4- or 5-point Likert scales of frequency and intensity (the English version of SarQoL is available in the Appendix). The seven domains of HRQoL are Physical and Mental Health, Locomotion, Body Composition, Functionality, Activities of Daily Living, Leisure Activities and Fears. The total score of the SarQoL questionnaire ranges from 0 to 100, and individual scores can also be generated for each domain (the SarQoL scoring system can be obtained by contacting the lead authors at info@sarqol.org). A lower score indicates a lower QoL. The questionnaire can be used free of charge for all unsponsored academic or clinical studies.

Validated translations of SarQoL

So far, SarQoL is available in 35 different languages. The lead developers of SarQoL were contacted before each translation and provided instructions based on Beaton’s recommendations [19] and COSMIN guidelines [20]. Of those 35 translated versions, 19 have currently been validated in a population of individuals with sarcopenia. The following versions of SarQoL have been validated: Brazilian [21], Chinese [22], Dutch [23], English [24], French [15], Greek [25], Hungarian [26], Korean [27], Lithuanian [28], Persian [29], Polish [30], Romanian [31], Russian [32], Serbian [33], Spanish [34, 35], Taiwanese [36], Turkish [37] and Ukrainian [38].

Psychometric properties of SarQoL

The psychometric properties of SarQoL have been measured in 24 different populations: 19 validation studies (number of participants = 3657, of whom 776 (21.2%) were sarcopenic) and five studies aimed at assessing specific psychometric properties (number of participants = 1150) [39,40,41,42,43].

In 2022, an independent systematic literature review was published to analyse the structural characteristics and psychometric properties of SarQoL in different languages [44]. At that time, the authors were able to include 14 individual studies reporting on the psychometric properties of SarQoL. Surprisingly, they did not report the psychometric properties of the French version. Moreover, the Hungarian, Brazilian, Taiwanese and Persian versions of SarQoL were published afterwards.

In these different publications on the psychometric properties of SarQoL, the discriminant validity, reliability (i.e., internal consistency, test–retest reliability, measurement error), validity (i.e., content validity, construct validity), responsiveness and floor/ceiling effects of SarQoL were reported (Table 1).

Table 1 Studies supporting psychometric validation of SarQoL

Capacity of SarQoL to detect differences in HRQoL between individuals with and without sarcopenia

As SarQoL is an instrument specifically designed for use in populations with sarcopenia, the ability of the questionnaire to discriminate HRQoL between individuals with and without sarcopenia must be considered. A systematic review of the literature identified 20 cross-sectional studies, published up to December 2022, that used SarQoL to compare the quality of life of individuals with sarcopenia, diagnosed according to a consensually accepted definition, with that of individuals without sarcopenia [45]. Pooling the results of these 20 studies in a meta-analysis (random-effects model) showed a lower HRQoL in individuals with sarcopenia than in individuals without sarcopenia [mean difference of −15.01 points/100 (95% CI −19.00; −11.01)] (Fig. 1). The discriminant validity of SarQoL with regard to HRQoL under different diagnostic criteria for sarcopenia was further confirmed in two individual studies [39, 41].

Fig. 1 Discriminative power of SarQoL: data reported from 20 individual studies identified from a systematic literature review (figure from Beaudart et al. 2023 [45])
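To make the pooling step concrete, the following is a minimal sketch of a random-effects meta-analysis of mean differences. It assumes the DerSimonian–Laird estimator of between-study variance; the review states only that a random-effects model was used, so the choice of estimator, the function name and the example numbers are illustrative assumptions rather than the data or method of [45].

```python
import math


def random_effects_pool(effects, std_errors, z=1.96):
    """Pool study-level mean differences with a DerSimonian-Laird random-effects model.

    effects    -- mean difference reported by each study
    std_errors -- standard error of each mean difference
    Returns (pooled estimate, (lower, upper) confidence limits, tau squared).
    """
    # Fixed-effect (inverse-variance) weights and estimate, needed to compute Q
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and the DerSimonian-Laird between-study variance (tau squared)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau squared to each within-study variance
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, (pooled - z * se_pooled, pooled + z * se_pooled), tau2


# Hypothetical study-level differences in SarQoL total score (not the data of [45])
differences = [-12.0, -18.5, -9.8, -20.1]
standard_errors = [2.5, 3.0, 2.2, 4.1]
pooled, ci, tau2 = random_effects_pool(differences, standard_errors)
print(f"pooled difference = {pooled:.2f}, 95% CI = ({ci[0]:.2f}; {ci[1]:.2f})")
```

A negative pooled difference whose confidence interval excludes zero corresponds to the pattern reported above, i.e. lower SarQoL scores in individuals with sarcopenia.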

Reliability

Internal consistency: All 19 different validated versions of SarQoL showed an internal consistency that was considered excellent (Cronbach’s alpha > 0.8). Most validation studies also measured the effect of removing one domain at a time on global internal consistency. This statistical analysis allows the identification of a potential domain that could introduce some heterogeneity into the scale. None of the studies reported a significant change in internal consistency when a domain was removed from the total score.
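As an illustration of the two analyses described above, the sketch below computes Cronbach's alpha from domain scores and then recomputes it with each domain removed in turn. The function names and the example data are hypothetical; the individual validation studies may have used different software or item-level rather than domain-level input.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k domain scores."""
    k = len(scores[0])
    # Sample variance of each domain across respondents
    domain_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the summed score across respondents
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(domain_vars) / total_var)


def alpha_if_domain_deleted(scores):
    """Alpha recomputed after dropping each domain in turn."""
    k = len(scores[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != dropped] for row in scores])
        for dropped in range(k)
    ]


# Hypothetical domain scores (0-100) for five respondents and three domains
example = [
    [55.0, 60.0, 52.0],
    [70.0, 74.0, 69.0],
    [40.0, 45.0, 47.0],
    [82.0, 80.0, 85.0],
    [63.0, 58.0, 66.0],
]
print(round(cronbach_alpha(example), 3))
print([round(a, 3) for a in alpha_if_domain_deleted(example)])
```

A domain whose removal markedly raises alpha would be a candidate source of heterogeneity; according to the text above, no such domain was identified in the validation studies.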

Test–retest reliability: As SarQoL is a self-administered questionnaire, only test–retest reliability was measured, not inter-rater reliability. Almost all validation studies (except those of the Romanian, Hungarian and Serbian versions) provided a measure of test–retest reliability. Intraclass correlation coefficients (ICCs), which are expected to be higher than 0.7 to reflect acceptable reliability of the tool, ranged from 0.93 for the Russian [32] version to 0.99 for the Ukrainian [38], Chinese [22], Persian [29], Spanish [34] and Polish [30] versions, reflecting the high reliability of SarQoL whatever the version used.
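For readers unfamiliar with the statistic, the following sketch computes a test–retest ICC from two administrations of a questionnaire. It assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1); the validation studies cited above do not necessarily use this form, and the function name and scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a list of subjects, each a list of k repeated scores."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of administrations (2 for test-retest)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, administrations, residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical SarQoL total scores at test and retest for five stable participants
test_retest = [[54.2, 55.0], [71.8, 70.9], [43.5, 45.1], [80.3, 79.6], [62.0, 63.4]]
print(round(icc_2_1(test_retest), 3))
```

Values above 0.7 would be read as acceptable reliability under the criterion stated in the text.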

Standard error of measurement (SEM): The SEM indicates the amount of measurement error in an instrument and is consequently an indicator of reliability. Pooling data from 9 different cohorts (278 individuals with sarcopenia) reporting values of test–retest reliability, Geerinck et al. reported in 2019 an SEM of 2.65 points (on a scale of 0–100 points) [43]. This value means that one can be 68% confident (± 1 SEM) that the ‘true’ score of a subject lies within 2.65 points of the observed score. The lowest SEM value was found in the Lithuanian validation study [28] (i.e., SEM = 0.18) and the highest in the English validation study [24] (i.e., SEM = 4.2).

Smallest detectable change (SDC): The SDC is defined as the smallest change in the instrument’s score that exceeds measurement error, and it depends on the SEM. In the aforementioned publication by Geerinck et al. [43], which pooled data from 9 individual validation studies, an SDC of 7.35 points (on a scale of 0–100) was found. This value means that the overall HRQoL score of an individual would have to change by at least 7.35 points before the observed change can be considered a true change in the HRQoL of that individual rather than a potential result of measurement error. The lowest SDC value was found in the Lithuanian validation study [28] (i.e., SDC = 0.49) and the highest in the English validation study [24] (i.e., SDC = 11.65).
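The link between the SEM and the SDC can be sketched as follows, assuming the conventional individual-level formula SDC = 1.96 × √2 × SEM. The review does not state the formula used in [43], but this one reproduces the pooled value of 7.35 from the pooled SEM of 2.65. The function names are illustrative.

```python
import math


def smallest_detectable_change(sem, z=1.96):
    """Individual-level SDC, assuming SDC = z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem


def true_score_band(observed, sem):
    """Observed score plus or minus 1 SEM, the 68% band described in the text."""
    return observed - sem, observed + sem


print(round(smallest_detectable_change(2.65), 2))  # 7.35, matching the pooled SDC
print(true_score_band(60.0, 2.65))
```

Applied to the rounded SEMs of the Lithuanian (0.18) and English (4.2) studies, the same formula gives values within about 0.01 of the reported SDCs of 0.49 and 11.65; the small differences are presumably due to rounding of the published SEMs.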

Validity

Content validity: According to the COSMIN guidelines for the assessment of content validity published in 2016 [46], two steps should be taken to assess the content validity of a scale. The first step is to evaluate the quality of PROM development, which requires showing that a sample of the target population was involved in developing the items of the scale and that a pre-test was conducted to assess the comprehensibility and comprehensiveness of the scale. Both requirements were met during the development phase of SarQoL. Indeed, during the development of the SarQoL questionnaire [13], a sample of 5 individuals with sarcopenia took part in qualitative interviews to develop the concepts and items to be included in the questionnaire. The SarQoL team also sought the opinion of experts on the items that should be included in the scale. A list of 180 potential items was generated from the literature review and the interviews with experts and patients. Experts and individuals with sarcopenia were then asked to review this list and select the items they considered most relevant. Once the SarQoL questionnaire was developed, a pre-test was conducted with 21 individuals to ensure a good understanding of all questions. The second step is to evaluate the quality of content validity studies of the PROM, in which patients and professionals are asked about the relevance of each item included in the scale and about the comprehensibility and comprehensiveness of the questionnaire [46]. To date, only one study (i.e., Mahmoodi et al., Persian translation of SarQoL [29]) has conducted a content validity analysis according to the COSMIN guidelines. The authors asked professionals about the relevance of the scale and reported a content validity ratio of 0.8–1, depending on the question, and a content validity index of 1, both of which are above the values indicating adequate and acceptable content validity. However, patients were not interviewed in this study, and the content validity of the Persian SarQoL, as defined by the COSMIN guidelines, was therefore not entirely confirmed.

Construct validity

Construct validity can be measured through convergent and divergent validity. All 19 validation studies compared SarQoL with the generic instruments SF-36 and EQ-5D. Hypotheses were made about the correlation of SarQoL, or some of its specific domains, with the subdomains of the SF-36 questionnaire and the EQ-5D instrument that were expected to have a similar (i.e., convergent validity) or different (i.e., divergent validity) construct to SarQoL. Construct validity is usually considered adequate when more than 75% of the pre-defined hypotheses are confirmed [47]. Except for the Greek [25], Russian [32] and Ukrainian [38] versions of SarQoL, where lower-than-expected correlations were obtained, all publications reported consistent construct validity of the SarQoL questionnaire for measuring HRQoL in sarcopenia. In most studies, SarQoL correlated well (r > 0.5) with the physical functioning, role limitation due to physical problems, bodily pain, general health, and vitality domains of the SF-36 questionnaire. Lower correlations (r < 0.5) were generally found with mental health and role limitation due to mental health problems. For the EQ-5D, high correlations were found with the mobility and usual activities subscales. Lower correlations were generally found with the other domains (i.e., self-care, pain/discomfort, anxiety/depression).

Responsiveness

The responsiveness of SarQoL has been reported in two different prospective studies [38, 40]. In one of these publications, Geerinck et al. compared the ability of the specific SarQoL questionnaire and of generic tools (i.e., SF-36 and EQ-5D) to detect change over time in a population of individuals with sarcopenia (n = 42). Good responsiveness was observed, as the authors confirmed eight out of the nine hypotheses developed a priori, which is well above the 75% confirmation threshold [48]. The standardised response mean of the total SarQoL score was significantly higher than that of the SF-36 Physical Component Summary, the EQ-5D Utility Index and the EuroQol visual analogue scale. The second publication, by Witham et al., aimed to assess the responsiveness of SarQoL in a population of individuals with probable sarcopenia (n = 147) and the suitability of SarQoL as an outcome measure in clinical trials. Using an anchor-based method, the authors reported a minimum clinical improvement after six months of follow-up that ranged from 5 to 21 points, giving trial sample size estimates of 25–100 participants and demonstrating that SarQoL is sufficiently responsive for use in clinical trials in sarcopenia. The authors also mentioned that the responsiveness of SarQoL may allow smaller sample sizes to be used in trials than those implied by the use of some generic tools (for example, the EQ-5D typically requires sample sizes of 200–300 to detect the minimum clinically important difference of 0.074 points).
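The two quantities used in the first study above can be sketched briefly. The standardised response mean (SRM) is taken here in its usual form, the mean change divided by the standard deviation of the change, and the hypothesis check simply compares the share of confirmed hypotheses with the 75% threshold. The function names and the example scores are hypothetical and are not data from the cited studies.

```python
from statistics import mean, stdev


def standardised_response_mean(baseline, follow_up):
    """SRM = mean of individual changes / standard deviation of those changes."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def hypotheses_supported(confirmed, total, threshold=0.75):
    """True when the share of confirmed a priori hypotheses reaches the threshold."""
    return confirmed / total >= threshold


# Hypothetical SarQoL total scores at baseline and follow-up for five participants
baseline_scores = [52.0, 61.5, 47.3, 70.2, 58.8]
follow_up_scores = [49.1, 60.0, 41.9, 69.5, 53.2]
print(round(standardised_response_mean(baseline_scores, follow_up_scores), 2))
print(hypotheses_supported(8, 9))  # eight of nine hypotheses is about 89%
```

A larger absolute SRM for one instrument than for another, measured in the same participants over the same period, indicates that the first instrument registers the change more strongly relative to its variability.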

To date, the responsiveness of SarQoL following an intervention aimed at improving key parameters of sarcopenia (i.e., muscle mass, muscle strength, physical performance) has not been reported. According to a systematic review of the literature, eight interventional studies aimed at the treatment (pharmacological or non-pharmacological) of sarcopenia reported a measure of HRQoL as a secondary outcome [49,50,51,52,53,54,55,56]. Of these eight trials, only one used the SarQoL questionnaire [49]. In this study, Tsekoura et al. [49] proposed a 12-week exercise programme to improve sarcopenia. Three groups were defined: home-based exercise, supervised exercise, and control. The results showed a significant group × time interaction for quality of life. Both exercise interventions improved HRQoL compared to the control group (supervised exercise group +7.28 points, home-based exercise group +3.41 points and control group −2.19 points after 12 weeks, p < 0.05). Exercise interventions also improved physical performance and muscle strength compared to the control group. Although this study was not designed to measure the responsiveness of SarQoL, the improvement of both HRQoL and sarcopenia parameters by the intervention may be considered an indication of the sensitivity to change of SarQoL.

Floor & ceiling effects

Neither floor nor ceiling effects were reported in the 19 translation validation publications. None of the 3657 participants in the 19 language translation studies achieved the maximum or minimum score on the SarQoL questionnaire.
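A check of this kind can be sketched as below. The 15% threshold is an assumption taken from a commonly used convention in the measurement literature (a floor or ceiling effect is flagged when more than 15% of respondents obtain the lowest or highest possible score); the review does not state which criterion the validation studies applied, and the function name and scores are hypothetical.

```python
def floor_ceiling_effects(scores, min_score=0.0, max_score=100.0, threshold=0.15):
    """Share of respondents at the scale limits and whether each exceeds the threshold."""
    n = len(scores)
    floor_share = sum(1 for s in scores if s <= min_score) / n
    ceiling_share = sum(1 for s in scores if s >= max_score) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical SarQoL total scores: nobody sits at 0 or 100, as reported in the studies
sample_scores = [35.2, 48.9, 61.0, 72.4, 90.3]
print(floor_ceiling_effects(sample_scores))
```

When no participant reaches either limit, as in the 19 validation studies, both shares are zero and no effect is flagged regardless of the threshold chosen.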

Short form SarQoL

While the original developers of SarQoL estimated, based on the results of a pre-test in the target population, that most people would take approximately 10 min to complete SarQoL, in practice a significant number of respondents take longer. This was further confirmed in the study by Witham et al., who reported that completion might take longer in people with more functional limitations. Given that most clinical trials involve many tests and questionnaires, it seems worth considering whether the burden on trial participants could be reduced by shortening the questionnaire. With this intention, Geerinck et al. developed a shorter version of the SarQoL questionnaire in 2021, hereafter referred to as SF-SarQoL [57]. Following a two-stage item reduction process, the full SarQoL was reduced from 55 to 14 items (i.e., a 75% reduction). The authors investigated the clinimetric properties of this new version and confirmed that equivalence with the full questionnaire was achieved. Indeed, SF-SarQoL discriminated well between participants with and without sarcopenia and had excellent internal consistency (α = 0.915, ω = 0.917) and excellent test–retest reliability (ICC = 0.912 [0.847–0.942]). For this new format, the authors also investigated the structural validity of the questionnaire and examined the item parameters with a graded response model from item response theory (IRT). A unidimensional model was fitted, with no misfitting items and well-performing response categories.

Discussion / expert commentary

Currently, SarQoL is the only specific HRQoL questionnaire for individuals with sarcopenia available in the literature. Another sarcopenia-specific PROM, the Age-Related Muscle Loss Questionnaire (ARMLQ), was developed in 2011 [58]. Although both SarQoL and ARMLQ provide information on the patient’s perspective, only SarQoL evaluates HRQoL, while the ARMLQ restricts its domains of interest to the functional impact of reduced muscle strength. Moreover, the psychometric properties of the ARMLQ have not yet been reported.

The current review presents an overview of all the psychometric properties of SarQoL measured in different translation and validation studies. It is important to note that most translated versions of SarQoL showed similar psychometric properties. Demonstrating that a tool is consistently valid and reliable in different populations from different countries using different diagnostic criteria for sarcopenia makes the available evidence robust.

In addition, a recent systematic review that aimed to analyse the structural characteristics and psychometric properties of translated versions of the SarQoL questionnaire supports the conclusion of the current review. After carefully assessing the psychometric properties of the translated versions of SarQoL based on the COSMIN guidelines [20, 59], the authors concluded that the analysed versions have psychometric properties that can be qualitatively classified as good to excellent. They state that SarQoL is valid for assessing the quality of life in people with sarcopenia in different countries [44]. The authors, however, regret the absence of content validity measurement across the different validation analyses. Since 2021, SarQoL has been recognised by the European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO) as the official tool for measuring HRQoL in sarcopenia. Furthermore, the recent EWGSOP2 revision recommends using the SarQoL questionnaire in clinical care and research studies [4].

Despite the available evidence of the suitability of SarQoL to measure HRQoL in individuals with sarcopenia, research on the psychometric properties of the SarQoL questionnaire should continue.

  • First, the responsiveness to change of SarQoL has still not been measured in the context of interventional studies. It is therefore still uncertain whether SarQoL is sensitive enough to detect HRQoL changes associated with improvements in muscle mass or muscle strength following a pharmaceutical or non-pharmaceutical intervention. The only clinical trial that used the SarQoL questionnaire as a secondary outcome showed an improvement of HRQoL following a non-pharmaceutical intervention, which suggests that SarQoL may be sensitive to change [49]. However, this is currently only an assumption based on a small amount of preliminary evidence, and no statistical measure of the sensitivity to change using anchor questions was performed in that trial.

  • Second, as highlighted in the systematic review by Martinez-Fernandez et al. [44], the content validity of SarQoL has rarely been studied so far. Because content validity is a psychometric property recognized by COSMIN, future studies should investigate this property in different populations. Currently, researchers interested in translating and validating SarQoL in another language are requested to follow the guidance provided by the developers of SarQoL. This guidance has now been updated to include an explicit measurement of content validity. It is now recommended to conduct patient interviews or focus groups using open-ended methods to elicit patients’ input. Additional evidence regarding this psychometric property should therefore be available in the next few months.

  • Third, most of the available evidence summarized in this review was obtained from cross-sectional studies. Very few longitudinal studies have provided data on the evolution of HRQoL in individuals with sarcopenia. We encourage the use of SarQoL in prospective studies to provide new evidence on the impact of sarcopenia on HRQoL and strengthen the sensitivity to change analyses.

  • Fourth, no cut-off score currently exists to define a low HRQoL in sarcopenia. The SarQoL score is currently only used as a continuous value. Providing a cut-off for the definition of a low HRQoL is scientifically challenging. Nevertheless, it may be relevant to develop such a cut-off to identify individuals with sarcopenia and a particularly low HRQoL, for whom specific health actions targeting HRQoL could be proposed.

  • Fifth, SarQoL has only been used in community-dwelling older individuals with sarcopenia and has never been studied in populations such as those living in nursing homes or those with severe cognitive impairment. In order to extend the applicability of SarQoL to other population profiles, it may be interesting to test the psychometric properties of this questionnaire in these different populations as well.

Conclusion

Although studies are underway or planned to further characterise the psychometric properties of SarQoL, this literature review shows that SarQoL can be used in observational and interventional studies to validly assess sarcopenia-specific HRQoL in older individuals with sarcopenia. Disease-specific instruments such as SarQoL should be used to complement measurements from generic questionnaires, as generic measures are still essential to assess broader health status in older people, who usually suffer from multiple conditions.