When and how to assess quality of life in chronic lung disease

Since the early 1980s there has been increasing awareness of the importance of quantifying health-related quality of life (HRQL) in patients with chronic respiratory disorders included in clinical trials. HRQL scores are clearly complementary to functional assessments, and have been shown to be better predictors of use of health resources (hospital readmissions, GP consultations, exacerbations) than pulmonary function tests alone.
Two types of HRQL score are available. "Generic scores" cover a wide array of items and allow comparison of patients suffering from various medical conditions; they may however lack responsiveness and therefore underestimate changes in HRQL induced by pharmacological or non-pharmacological management. "Disease-specific scores" are more responsive and sensitive to changes, and thus more suitable for assessing the impact of management on HRQL. The choice of an HRQL instrument must take into account its validity, reliability, and responsiveness in the population studied; it must also be adapted to the severity of respiratory impairment, to ensure optimal discriminant potency. In clinical trials, the use of a "generic" score combined with a "disease-specific" score is recommended for optimum assessment. This is however too time-consuming for clinical practice: the use of short HRQL tools quantifying specific items such as resting and exertional dyspnoea, activities of daily life and emotional status appears more appropriate in this setting; furthermore, these items correlate better with HRQL scores than pulmonary function tests.

Introduction
Since the beginning of the 1980s the importance of quantifying quality of life (QoL) in clinical trials has been increasingly recognised: the first standardised evaluations of QoL in chronic respiratory diseases were reported in the Nocturnal Oxygen Therapy Trial (NOTT) and the IPPB (Intermittent Positive Pressure Breathing) trial; subsequently, measurement of QoL appeared in studies assessing the efficacy of bronchodilators, theophylline, inhaled steroids, patient education and pulmonary rehabilitation programmes [1, 2]. QoL is currently considered a clinical endpoint per se. QoL is, however, influenced by many factors other than health (e.g. marital status, income, job satisfaction, social opportunities). Accordingly, the concept of "health-related quality of life" (HRQL), i.e. that part of QoL which is related to an individual's health status and can potentially be improved through better health care, progressively replaced that of QoL. A practical definition of HRQL suggested by PW Jones is "quantification of the impact of disease on daily life and well-being in a formal and standardised manner" [3]. Recently, the concept of "health status" has been suggested as an alternative to HRQL.
At the end of the 1980s disease-specific questionnaires were developed to quantify HRQL in chronic lung diseases [4, 5]. Disease-specific questionnaires for patients with asthma appeared in the early nineties [6-8]. At present approximately 800 different questionnaires are available for measurement of quality of life: of these, an estimated 30 questionnaires have been used to quantify HRQL in chronic respiratory disease.
Several reports have described the poor correlation between the usual measurements of functional impairment and HRQL scores, thus supporting the use of HRQL scores as independent contributors to a better evaluation of patients [3, 9, 10]. Ferrer et al., for example, showed that patients with mild COPD (ATS stage I disease, FEV1 > 50% of predicted) already had markedly abnormal HRQL scores, suggesting a very early impairment of quality of life in COPD [11].
Recently, health care administrators have become particularly interested in HRQL scores as measurements of care quality and clinical effectiveness [12]. The fact that HRQL scores may be better predictors of use of health resources (hospital re-admission, outpatient physician consultations, frequency of exacerbations) than pulmonary function tests further emphasises the benefit of using HRQL scores in clinical studies [13-15].
This review addresses technical issues relative to HRQL measurement and discusses the most frequently used generic or disease-specific HRQL scores for patients with either COPD or asthma, together with their advantages and limitations as they emerge from recent publications on chronic respiratory disorders. The availability of validated translations in French, Italian, or German is given in table 1 [16].

Validity, reliability, reproducibility, responsiveness and sensitivity (figure 1): necessary properties of HRQL scores
An HRQL instrument is considered valid if it measures what it claims to measure. For example, the oxygen cost diagram (OCD), which measures a patient's perception of his or her tolerance to exertion, can be considered valid inasmuch as OCD ratings correlate closely with the results of a 6 min walk test in patients with chronic respiratory disease [17]. Validity is difficult to assess because of the difficulty of establishing a "gold standard"; items of an HRQL instrument are expected to correlate with indicators of disease severity and previously validated scales.
Reliability is an important property since it determines the threshold above which a change in HRQL may be considered clinically relevant: it includes test-retest reproducibility, interobserver reproducibility, and internal consistency (usually assessed by measuring Cronbach's reliability coefficient). The coefficient of variation is one of the components of a scale's reliability [4].
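As an illustration of how internal consistency can be quantified, the following minimal sketch computes Cronbach's coefficient from a table of item scores. It assumes the standard textbook formula (number of items, sum of item variances, variance of total scores); the function name and the data layout are hypothetical and are not taken from any of the questionnaires discussed here.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for a list of respondents.

    `responses` is a list of rows, one row per respondent, each row
    holding that respondent's score on every item (hypothetical layout).
    """
    if len(responses) < 2:
        raise ValueError("at least two respondents are required")
    k = len(responses[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")

    # Transpose rows (respondents) into columns (items).
    items = list(zip(*responses))
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

With items that rank respondents identically the coefficient approaches 1; values closer to 0 indicate that the items do not vary together.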
Responsiveness and sensitivity are key features of HRQL instruments. Responsiveness is the measure of the association between a change in QoL (∆Q, figure 1) and the change in the HRQL score (∆Z) after inducing a change in a variable C expected to influence QoL. Sensitivity to change (Z(1) - Z(0), figure 1) is also central to the choice of an HRQL instrument and can be markedly influenced by "floor" or "ceiling" effects. This has been well illustrated in a comparative study of two disease-specific questionnaires (MRF-28 and SGRQ) and a generic instrument (SIP) in patients with chronic hypoxia or hypercapnia: most of the patients studied were in the low range of the SIP scores ("floor effect"), which was not the case with the MRF-28. In this population the discriminant properties of the MRF-28 were far better than those of either the SGRQ or the SIP [18].
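A simple way to screen a data set for "floor" or "ceiling" effects is to look at the share of patients whose scores cluster at either end of the scale. The sketch below is one possible approach under stated assumptions: the function name, the `margin` parameter and the example scores are hypothetical, and no particular cut-off for an unacceptable share is implied by the studies cited above.

```python
def floor_ceiling_shares(scores, lowest, highest, margin=0.0):
    """Return the fractions of scores lying near each end of the scale.

    `lowest` and `highest` are the minimum and maximum possible scores
    of the instrument; `margin` widens the band counted as "near" an
    extreme (hypothetical parameter, 0 means exactly at the extreme).
    """
    if not scores:
        raise ValueError("no scores supplied")
    n = len(scores)
    near_floor = sum(1 for s in scores if s <= lowest + margin) / n
    near_ceiling = sum(1 for s in scores if s >= highest - margin) / n
    return near_floor, near_ceiling


# Hypothetical scores on a 0-100 scale, for illustration only.
example_scores = [0, 0, 5, 50, 100]
floor_share, ceiling_share = floor_ceiling_shares(example_scores, 0, 100, margin=5)
```

A large share at one extreme suggests that the instrument cannot discriminate between patients, or register further change, in that part of its range.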

Generic vs. disease-specific questionnaires
There are basically two different types of instrument for measurement of HRQL (table 1). The first type is the general health questionnaire or "generic" questionnaire. General health questionnaires allow comparisons between different populations of patients, i.e. groups of subjects suffering from different medical conditions; their reproducibility and validity have been verified in various diseases and populations. They are more likely to be appropriate if it is desired to assess the impact or side effects of a given treatment on a wide array of HRQL domains or items. Their disadvantage, however, is that they may not be sensitive enough for a specific disease and may lack responsiveness to changes induced by a given treatment [10].
The second type of instrument is "disease-specific". Disease-specific questionnaires focus on the domains most relevant to the disease or condition under study and on the characteristics of patients in whom the condition is most prevalent. Their advantage is their increased responsiveness to changes. They are therefore most appropriate for clinical trials in which specific therapies are being evaluated [19]. They do not, however, permit comparisons between populations with different illnesses [3].
The most frequently used HRQL instruments in chronic lung disorders are listed in table 1 and reviewed below [16].

The Sickness Impact Profile (SIP)
The SIP is a self-administered 136-item questionnaire (estimated duration: 30 min) covering 12 aspects of HRQL: sleep and rest, eating, work, home management, recreation and pastimes, ambulation, mobility, body care and movement, social interaction, alertness behaviour, emotional behaviour and communication [20]. The SIP has been widely used in patients with chronic respiratory failure to assess the value of long term oxygen therapy, home mechanical ventilation, or intermittent positive pressure breathing [1, 2, 21-24]. Although described as reliable and responsive, it appears to be relatively insensitive to mild or moderate disease in patients with COPD [3, 10]. Also, a marked "floor effect" has been demonstrated with the SIP in patients with chronic respiratory failure, suggesting a loss of discriminant properties and responsiveness to therapeutic measures in these patients [18].

The Nottingham Health Profile (NHP)
The NHP contains 45 statements which are weighted to obtain 6 component scores (sleep, pain, energy, physical mobility, social isolation and emotional reactions) and a total score; it can be self-administered and is completed in approximately 10-15 min [25]. It has been used in descriptive studies in patients with COPD, and in clinical trials of bronchodilators or inhaled steroids [11, 26]. The NHP is reliable and valid, but its responsiveness in COPD is not well established [10].

The Short-Form 36 (SF-36)
Derived from the "Medical Outcomes Study" (MOS) questionnaire, the SF-36 measures nine different health concepts through 36 questions. "Physical functioning" describes the extent to which health interferes with activities such as bathing, dressing, shopping, walking or climbing stairs. "Role physical" and "Role emotional" quantify the impact of disease on daily activities in terms of physical or emotional problems. The "Bodily pain" score refers to the extent of bodily pain over the past 4 weeks. "Vitality" quantifies subjective well-being (energy or tiredness). "Social functioning" describes the extent to which health interfered with social activities, such as visiting friends or relatives, during the preceding month. "Mental health" describes the general mood or affect, including depression, anxiety and psychological well-being, during the past month. "General health" provides an overall rating for current health in general [27]. A database of normal values has been published for the US population by sex and age group [27]. Reference values for the French (and Swiss) populations are also available.
The SF-36 has been extensively used in patients with either obstructive or restrictive lung diseases [12, 28-31]. It is described as valid (in asthma, COPD, and interstitial disorders), with good discriminatory potency in interstitial diseases [31] and high internal consistency; test-retest results showed, however, rather wide differences (significantly different for "General Health" and exceeding 10 points on a 0-100 scale in 5 domains). Furthermore, in outpatients with COPD (FEV1/FVC < 70%), several domains of the SF-36 showed either "floor" or "ceiling" effects, indicating limitations in measurement of changes in health from baseline [32]. Indeed, a recent study of pulmonary rehabilitation in 151 COPD patients showed an increase in 6 min walk distance and improved scores using the CRQ, a disease-specific HRQL instrument, but no impact of rehabilitation in any of the domains measured by the SF-36 [33]. More recently, 2 shorter versions have been proposed ("SF-12" and "SF-8"); however, no studies have been published to date using these versions in patients with chronic lung disorders.
Several other generic instruments are available, although they are rarely used in pulmonary disorders (such as the Quality of Well-Being scale [QWB] or the Symptom Check List [SCL-90]) and thus will not be discussed in this review.

Disease-specific instruments
As previously mentioned, disease-specific instruments are more responsive to changes than generic instruments, and are therefore more likely to be sensitive to small changes in a therapeutic trial.

The St George's Respiratory Questionnaire (SGRQ)
The SGRQ [5] was initially designed to allow direct comparisons of gain in HRQL with different types of therapy in both asthma and COPD. It can be self-administered, is completed in approximately 15 min and contains 76 items divided into three sections: "Symptoms" relates to respiratory symptoms, their frequency and severity; "Activity" relates to activities that cause or are limited by breathlessness; and "Impacts" covers social functioning and psychological disturbances resulting from respiratory disease. A database of SGRQ scores for normal subjects without a history of respiratory disease is supplied by the authors of the SGRQ. Higher scores relate to increasing impairment or severity of symptoms (range: 0-100). Changes of more than 4 units for each score are considered clinically relevant. The SGRQ is widely accepted as having good reliability and being responsive (sensitive to change) in COPD and asthma [5]. It has been used in patients with severe respiratory limitation under long term oxygen therapy [34], and in patients under NIV for COPD or predominantly restrictive diseases [18, 30, 31, 35, 36].
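The 4-unit threshold can be applied mechanically when comparing two SGRQ scores from the same patient. The sketch below is only an illustration of that rule as stated above (0-100 range, higher scores indicating greater impairment, change of more than 4 units considered relevant); the function name and the wording of the returned labels are hypothetical.

```python
SGRQ_THRESHOLD = 4.0  # units; "more than 4" is treated as strictly greater


def classify_sgrq_change(baseline, follow_up, threshold=SGRQ_THRESHOLD):
    """Classify the change between two SGRQ scores (0-100 scale).

    Higher scores mean greater impairment, so a fall in score is an
    improvement and a rise is a deterioration.
    """
    for score in (baseline, follow_up):
        if not 0 <= score <= 100:
            raise ValueError("SGRQ scores range from 0 to 100")
    delta = follow_up - baseline
    if delta < -threshold:
        return "clinically relevant improvement"
    if delta > threshold:
        return "clinically relevant deterioration"
    return "no clinically relevant change"
```

For example, a fall from 50 to 45 would be labelled an improvement, whereas a fall from 50 to 46 would not exceed the threshold.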

The Chronic Respiratory Questionnaire (CRQ)
The CRQ [4] is a disease-specific HRQL questionnaire widely used in chronic airway disease although available only in English. It contains a total of 20 questions, each answered on a seven-point scale; it is interviewer-administered and takes 25-30 min to complete (20 min on subsequent evaluations). The questions, covering 4 domains (dyspnoea, fatigue, emotional function, and mastery), refer to specific activities in daily living identified by the patient (i.e. the questions are individualised for relevance to each patient during the first session). It has been shown to be highly reliable and very responsive, but the fact that the questions are individualised renders comparisons between patients theoretically invalid: CRQ scores are to be compared before and after a given treatment in the same patient. Changes are considered significant only if above 2.5 points for "dyspnoea", 2.0 for "fatigue", 3.5 for "emotion" and 2.0 for "mastery". This is important in interpreting clinical trials: for example, although statistically significant, the changes in CRQ scores attributed to pulmonary rehabilitation in a large (and frequently cited) randomised study by Goldstein are not clinically relevant for two of the four domains of this HRQL scale [37].
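The domain thresholds quoted above lend themselves to a simple within-patient check. The following sketch assumes that domain scores are available as plain numbers before and after treatment; the function name, the dictionary layout and the example values are hypothetical and do not reproduce data from any cited trial.

```python
# Thresholds as quoted in the text: a change must exceed these values.
CRQ_THRESHOLDS = {
    "dyspnoea": 2.5,
    "fatigue": 2.0,
    "emotion": 3.5,
    "mastery": 2.0,
}


def relevant_crq_changes(before, after, thresholds=CRQ_THRESHOLDS):
    """Return, per domain, whether the change exceeds its threshold.

    `before` and `after` map each domain name to the same patient's
    score at the two time points (hypothetical layout).
    """
    result = {}
    for domain, threshold in thresholds.items():
        change = after[domain] - before[domain]
        result[domain] = abs(change) > threshold
    return result


# Illustrative values only.
before = {"dyspnoea": 15, "fatigue": 14, "emotion": 30, "mastery": 18}
after = {"dyspnoea": 18, "fatigue": 15.5, "emotion": 33, "mastery": 21}
flags = relevant_crq_changes(before, after)
```

In this made-up example the change exceeds the threshold for "dyspnoea" and "mastery" only, which shows how a set of changes can be judged relevant in some domains and not in others.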

The Maugeri Foundation Respiratory Failure item set (MRF-28)
The MRF-28 was recently developed in Italy by the "Quality of Life in Chronic Respiratory Failure Group" and published by Carone et al. [18]. The questionnaire includes 30 items encompassing activities of everyday life, cognitive function, emotional status, perception of general health, invalidity and respiratory health. The results of the MRF-28 correlate well with either the SIP or the SGRQ in patients with chronic respiratory failure (COPD under home oxygen therapy or patients with kyphoscoliosis treated by home mechanical ventilation). The MRF-28 was specifically developed for use in patients with chronic respiratory failure, is applicable to patients with both obstructive and restrictive pulmonary disorders, and is less time-consuming than the SGRQ and CRQ (self-administered, approximately 10 minutes to complete). Moreover, the distribution of scores for the MRF-28 is wider than that for either the SGRQ or the SIP in chronic respiratory failure, suggesting that the MRF-28 may perform better in discriminating between different levels of impaired health status than other available questionnaires in this population.

Instruments focusing on HRQL in asthma
Approximately 20 questionnaires have been developed to date for measurement of HRQL in asthma, most of which exist only in English. The previously described SGRQ and CRQ were both initially developed for quantification of HRQL in asthmatics as well as in COPD, and have thus been widely used in studies of asthmatic patients. The three asthma-specific HRQL scores most frequently referred to are listed in table 1. The "Living with Asthma Questionnaire" (LAQ) [8] is a self-administered 68-item questionnaire covering 11 domains of everyday life, including items particularly relevant for asthmatics such as frequency of colds or impact on sports and sleep. Although responsive, the LAQ may lack sensitivity and is thus of uncertain worth in evaluating the impact of a therapy [38]. The 32-item "Asthma Quality of Life Questionnaire" (AQLQ) is the most widely used HRQL score for asthmatics [38] and has been described as more responsive and sensitive than the LAQ. The "Air Index" was developed in France [7] and comprises 63 items covering 4 subscales or dimensions: "Psychological", "Physical Activity", "Physical Symptoms", and "Social". Its validity, high internal consistency and high test-retest repeatability are established, but its responsiveness and sensitivity have yet to be determined.

Dyspnoea scores
Elaborate scales such as the Baseline Dyspnea Index (BDI) and the Transition Dyspnea Index (TDI) are available for clinical studies focusing on dyspnoea, but are time-consuming for the clinician and do not exist in validated translations in either French, German, or Italian.

Other HRQL instruments relevant to clinical practice
Dyspnoea can be quantified using weighted visual analogue scales such as the Borg scale [39] or the "Oxygen Cost Diagram" (OCD) [40].
The OCD is a 100 mm vertical line with efforts involving increasing "oxygen cost" listed on either side: patients are asked to indicate the level of effort beyond which their breathlessness would not allow them to go. The OCD correlates well with the results of the SF-36 in chronic lung disease [28]; it also correlates with the results of a 6 min walk test or average daily ambulation measured by pedometer in patients with severe chronic respiratory disorders, and is thus considered valid [34]. Its responsiveness is, however, questionable: the OCD did not adequately depict changes in objective performance over time in a group of patients with chronic respiratory failure on home ventilation [17].

Scores for emotional disturbances
Amongst the scores for emotional disturbances used in patients with respiratory disorders, the simplest and the most widely used is the Hospital Anxiety and Depression scale (HAD). The HAD, published in 1983 by Zigmond and Snaith, is simple to use and can be self-administered in less than 10 min; it contains 14 multiple choice-type questions, seven of which are oriented towards detection of anxiety disorders and seven towards detection of depression [41].
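The two-subscale structure of the HAD can be sketched as a pair of sums. The example below rests on assumptions that are not stated in this review: each answer is taken to be coded from 0 to 3, any reverse-worded items are assumed to have been recoded beforehand, and the positions of the seven anxiety items are supplied by the caller rather than fixed in the code. The function name is hypothetical.

```python
def score_had(answers, anxiety_items):
    """Return (anxiety, depression) subscale totals for one patient.

    `answers` holds the 14 coded answers (assumed 0-3 each);
    `anxiety_items` lists the 0-based positions of the seven anxiety
    questions, the remaining seven being counted as depression items.
    """
    if len(answers) != 14:
        raise ValueError("the HAD contains 14 questions")
    if any(a not in (0, 1, 2, 3) for a in answers):
        raise ValueError("answers are assumed to be coded 0 to 3")
    anxiety_positions = set(anxiety_items)
    if len(anxiety_positions) != 7 or not anxiety_positions <= set(range(14)):
        raise ValueError("exactly seven valid anxiety positions are required")

    anxiety = sum(answers[i] for i in anxiety_positions)
    depression = sum(answers) - anxiety
    return anxiety, depression
```

Keeping the item positions as a parameter avoids hard-coding a question order that should be checked against the version of the questionnaire actually administered.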

When and how should we measure HRQL?
HRQL scoring is most useful in clinical trials assessing the impact of pharmacological or non-pharmacological management (e.g. patient education programmes, pulmonary rehabilitation, ventilatory support devices) [10, 19]. In recent studies, even a modest impact of a treatment on HRQL scores (such as the slower decline in health status in COPD patients treated with fluticasone in the ISOLDE trial) has become a major argument for drug promotion [42]. Assessment of health status in clinical trials is best performed using a combination of a generic and a disease-specific HRQL score. The choice of an HRQL score must take into account its discriminant properties in the population studied as well as its responsiveness and sensitivity when available: the deleterious impact of "floor" or "ceiling" effects has been mentioned previously [19]. In commenting on the results, particular care must be taken to emphasise not only statistically significant but also clinically significant changes, based on either threshold values determined by the authors of the HRQL scores or the distribution of normal reference values (standard deviations or confidence intervals of normal values, or percentiles). Results of test-retest studies are important in distinguishing significant changes over time from differences compatible with the variability of sequential measurements. Detailed descriptions of the performance and limitations of HRQL scores are unfortunately seldom published in the "methods" section of clinical studies and are not always taken into account.
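One way to use test-retest data for this purpose is to derive limits within which differences between two measurements in stable patients are expected to fall, and then to ask whether an observed change lies outside them. The sketch below uses limits of agreement (mean difference plus or minus 1.96 standard deviations), which is a technique chosen here for illustration and not a method described in this review; the function names and the example data are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Return (lower, upper) limits for test-retest differences.

    `test` and `retest` are paired scores from the same stable
    patients; at least two pairs are required.
    """
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need at least two paired measurements")
    differences = [second - first for first, second in zip(test, retest)]
    centre = mean(differences)
    spread = 1.96 * stdev(differences)
    return centre - spread, centre + spread


def exceeds_variability(change, limits):
    """True if a change lies outside the test-retest limits."""
    lower, upper = limits
    return change < lower or change > upper


# Made-up paired scores, for illustration only.
limits = limits_of_agreement([50, 60, 70, 40], [52, 58, 71, 39])
```

A change that exceeds these limits is unlikely to be explained by measurement variability alone, but whether it is clinically relevant still depends on the thresholds proposed for the instrument.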
Very little has been published on the use of HRQL scores in clinical practice. The use of HRQL scores in everyday clinical practice is limited by several factors. One obvious limitation is time: most questionnaires are time-consuming and therefore incompatible with everyday clinical practice. Moreover, HRQL scales are not at present sensitive enough to be used as determinants of clinical decisions. The scores obtained are meaningful only in population studies, but do not per se precisely quantify an individual's HRQL [10, 14]. However, the use of HRQL scales may provide a means of eliciting information on areas of distress for patients who are otherwise reluctant to address the issue of the emotional impact of illness, and may also identify areas of concern which are not evident during routine visits. For this purpose, the clinician needs short, self-administered, reliable, valid and responsive HRQL instruments.
An alternative for the clinician is the use of "HRQL tools" to assess specific items of health status such as anxiety, depression, resting or exertional dyspnoea and activities of daily life (ADL): these items can easily be tested in clinical routine with valid instruments such as the Borg scale for resting dyspnoea [39], the Hospital Anxiety and Depression scale (HAD) for emotional disorders [41] and ADL scales; they offer a simple means of detecting symptoms which may have been overlooked during routine clinical visits. Furthermore, these items show stronger correlations with HRQL scores than routinely performed pulmonary function tests. The contribution of new short generic HRQL scores such as the SF-8 or SF-12 has yet to be determined.

Conclusion
A standardised description of health status or HRQL has become essential in clinical trials and an endpoint per se when studying either pharmacological or non-pharmacological measures such as pulmonary rehabilitation, patient education programmes or respiratory support in chronic respiratory failure. A large number of HRQL instruments are currently available, with more than 30 questionnaires addressing the HRQL of patients with respiratory disorders.
The recommended combination is a generic plus a disease-specific HRQL instrument, to ensure sufficient responsiveness and sensitivity in the population studied. The choice of the HRQL instrument must take into account the population for which it was designed, to avoid a "ceiling" or "floor" effect with loss of responsiveness or sensitivity and discriminant potency, especially in subjects with either very mild or very severe disease.
The use of HRQL scales in clinical practice is time-consuming and probably questionable at present, although certain items may contribute to better assessment of patients and a wider appreciation of the impact of illness on everyday life. The use of individualised HRQL items, such as simple dyspnoea scales, ADL scores and short scores for emotional disturbance, appears at present to be the best choice for the clinician.

Figure 1
Properties of an HRQL scale essential for measuring quality of life. Q(0): actual initial quality of life; Q(1): actual quality of life after an intervention which modifies a relevant variable C. Z(0) and Z(1): measured quality of life, before and after the intervention. The reliability of a given HRQL score is determined by the variability of successive measurements performed in stable conditions (Z ± E). Responsiveness is determined by the presence of a change in Z (∆Z) when a change in actual quality of life (∆Q) effectively occurs. The sensitivity of the response is the amplitude of the change in Z [Z(1) - Z(0)] for a given change in Q. Adapted from Testa et al., N Engl J Med 1996; 334: 835-40.

Table 1
Most frequently used Health-related quality of life (HRQL) questionnaires, authors, and availability in French, German or Italian (May 2001).