Introduction

The Dermatology Life Quality Index (DLQI) is the most commonly used quality of life instrument in dermatology [1, 2]. The 10-item questionnaire covers the following aspects of health-related quality of life (HRQoL): symptoms and feelings, daily activities, leisure, work or school, personal relationships and treatment [1]. Each question is scored on a four-point scale: 0 (not at all/not relevant), 1 (a little), 2 (a lot) and 3 (very much). The final score, which ranges from 0 for the best health state to 30 for the worst health state, is obtained by summing the item scores; thus, each item receives the same weight in the total score.
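The equal-weight scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the official scoring algorithm: the function name `dlqi_total` and the two example profiles are hypothetical, and item-specific response wordings are ignored.

```python
# Response labels and their scores as described in the text.
RESPONSE_SCORES = {
    "not at all": 0,
    "not relevant": 0,  # scored the same as "not at all"
    "a little": 1,
    "a lot": 2,
    "very much": 3,
}


def dlqi_total(responses):
    """Sum the ten item scores with equal weights (possible range 0-30)."""
    if len(responses) != 10:
        raise ValueError("the DLQI has exactly 10 items")
    return sum(RESPONSE_SCORES[r] for r in responses)


# Two hypothetical profiles (not taken from the study) with the same total.
profile_a = ["a lot"] * 3 + ["not at all"] * 7
profile_b = ["a little"] * 6 + ["not relevant"] * 4

print(dlqi_total(profile_a), dlqi_total(profile_b))
```

Both profiles yield the same total although they differ in the number of affected items and in the level of impairment, which is exactly the property of the total score that is examined in this study.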

The DLQI is widely applied as an outcome measure in clinical dermatology practice and in a variety of dermatological research, including randomised controlled trials (RCTs). In RCTs, the European Medicines Agency recommends its use as a secondary or tertiary endpoint to assess efficacy of treatment [3]. Owing to the increasing availability of new, highly effective but very costly treatments in dermatology, such as biological drugs, the DLQI has become a reference point for clinical and financial decisions (e.g. treatment indications, treatment goals, hospitalisation decisions and drug reimbursement eligibility criteria). Biological drugs are genetically engineered protein molecules that interfere with the inflammatory response of the immune system in psoriasis and other chronic inflammatory diseases. The European Consensus group on ‘Treatment goals for psoriasis’ defines the two main severity categories of psoriasis, mild and moderate-to-severe, based on the Psoriasis Area and Severity Index (PASI), body surface area (BSA) and DLQI [4, 5]. Disease meeting (BSA > 10 or PASI > 10) and DLQI > 10 can be considered moderate-to-severe, and treatment with phototherapy or systemic treatments including biologicals is recommended [4, 5]. Furthermore, in several European countries, such as the UK, Sweden, Denmark, the Czech Republic, Croatia, Hungary, Romania and Poland, national reimbursement criteria on financing biologicals in psoriasis are based on DLQI and PASI scores [6–8]. The severity scores required for reimbursement vary across jurisdictions. For example, reimbursement requires PASI ≥ 10 and DLQI > 10 in the UK, PASI > 15 and DLQI > 10 in Hungary, PASI > 18 and DLQI > 10 in Poland, and PASI > 15 and/or BSA > 15 and/or DLQI > 15 in Croatia [7, 8]. In the UK, similarly to biologicals for psoriasis, oral alitretinoin (a derivative of vitamin A) is financed for patients with severe chronic hand eczema who reach a DLQI ≥ 15 [9].

On the other hand, the appropriateness of the DLQI as an outcome measure is debated [10–15]. A few studies have reported that it exhibits differential item functioning; for instance, results are influenced by the age, gender, diagnosis and nationality of patients [13–15]. In addition, the unidimensional nature of the DLQI has been disputed. Twiss et al. [15] have suggested that certain items of the measure depend upon each other, and hence, the total score is not able to provide valid information about the respondent.

Despite its prominent role in clinical as well as in financial decision-making, the suitability of the DLQI for these purposes has not yet been scrutinised from the viewpoint of health economics. As the DLQI is used in decisions about allocating health resources, its use is expected to support cost-effectiveness. In general, utility values for different health states are used for the calculation of quality-adjusted life years (QALYs) in cost-effectiveness analyses of health interventions. A significant difference between utilities for health states with the same DLQI score would influence the outcomes of a cost-effectiveness analysis; for example, a treatment might result in a different cost/QALY in patients whose baseline and post-treatment dermatology-specific HRQoL are identical. Similarly, the absence of a significant difference between utility values for health states whose DLQI scores differ by more than the minimal clinically important difference (MCID) implies that a clinically meaningful HRQoL improvement is not necessarily accompanied by a health gain. If such discrepancies existed, they would call into question the use of the DLQI in medical and reimbursement decision-making.

Therefore, the purpose of this study is to assess utility values for health states defined by the ten items of the DLQI using the time trade-off (TTO) method. We investigate whether different health states with identical DLQI total scores are weighted equally and, conversely, whether health states with substantially different DLQI scores receive different utility values.

Methods

Study overview

An Internet-based cross-sectional survey was developed for this study. A convenience sample of university students and staff was recruited at the campus of a Budapest university. The survey’s online link was published in the university’s daily newsletter on 5 weekdays in March 2015. The link could be opened from any computer, laptop or mobile device. Individuals were invited to participate regardless of whether they had any dermatological condition at the time of the survey. Participation was voluntary and anonymous, and no remuneration was offered. Hungarian-speaking subjects aged 18 years or over were included in the study. Ethical approval was obtained from Semmelweis University Regional and Institutional Committee of Science and Research Ethics (Reference No. 58./2015).

The questionnaire was divided into three parts, each presented to the subjects on a separate page. The first part covered demographics (gender, age, level of education and employment) and a question on whether the participant had any dermatological condition(s) diagnosed by a physician at the time of the survey. On the second page, a practice TTO task on binocular blindness was presented to familiarise the participants with the method. The third part of the questionnaire included the TTO utility assessment of the dermatologic health states described by the items of the DLQI. Each respondent was asked to value three DLQI-defined health states in a randomised order.

Health state descriptions

In forming the health states to be evaluated, the items of the DLQI were used. Taking into account that each of the 10 items of the DLQI has 4 response levels, there are 4^10 = 1,048,576 possible health states. Undoubtedly, this enormous number of health states cannot be evaluated. For the purposes of this study, we selected seven different health states, each corresponding to a DLQI total score of 6 (moderate effect on HRQoL), 11 (lower cut-off value for very large effect on HRQoL) or 16 (very large effect on HRQoL) (Table 1) [16]. Health states were unlabelled and participants were informed that they did not refer to any specific dermatologic condition. Generally, a DLQI score >10 indicates that the skin disease is having a very large effect on the patient’s life and is considered to be strong supportive evidence of the need for active patient intervention [16]. Based on this, when developing the questionnaire, we first selected the total score of 11 to be evaluated in this study. Then, we decided to set the difference between health states to 5 points as it exceeds the MCID for general inflammatory skin conditions (4 points) [2, 17]. Thus, we selected the scores of 6 and 16.
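To illustrate the size of the descriptive system, the count above can be reproduced, and the number of distinct profiles behind each total score can be tallied, with a short Python sketch. The function `profiles_per_total` is a hypothetical helper introduced only for this illustration; it treats ‘not at all’ and ‘not relevant’ as a single level scored 0, as the text does.

```python
def profiles_per_total(n_items=10, max_score=3):
    """Count item profiles for every possible total score.

    Returns a list where index t holds the number of distinct profiles
    (one score from 0..max_score per item) whose scores sum to t.
    """
    counts = [1]  # zero items: one empty profile with total 0
    for _ in range(n_items):
        new_counts = [0] * (len(counts) + max_score)
        for total, n_profiles in enumerate(counts):
            for score in range(max_score + 1):
                new_counts[total + score] += n_profiles
        counts = new_counts
    return counts


counts = profiles_per_total()
print(sum(counts))  # total number of health states, i.e. 4**10
for total in (6, 11, 16):
    print(total, counts[total])  # profiles sharing each studied total score
```

Each total score examined in this study is shared by a large number of profiles, so the seven selected health states are only a very small subset of the states that map to the same totals.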

Table 1 Health states defined by DLQI

Three health states of 11 points (L1–L3, where L stands for large impact on HRQoL), three others with a total score of 6 (M1–M3, where M refers to moderate impact on HRQoL), and one with 16 points (S, for the most severe health state) were selected. Only one health state of 16 points was applied, since we assumed that this degree of HRQoL impairment was severe enough that different health states with the same score were unlikely to receive significantly different utilities. Amongst the 6- and 11-point states, our intention was to select health state profiles that were as different as possible in terms of both the number of negatively affected items and the severity level of impairment, within the limits of keeping the health states imaginable for participants.

The description of health states was presented to participants from the second-person point of view. No changes were made to the original 10 items (including the bolded words) with the exception of the order of the questions. In order to facilitate the understanding of the differences between health states, we rearranged the 10 items. For each health state, the items were classified into two to four blocks based on the severity level of impairment (‘Appendix’). Thus, items in which a certain health state featured ‘very much’ impairment or ‘prevented work or studying’ were moved to the top, followed by items affected ‘a lot’, ‘a little’ and finally ‘not at all’. In this study, ‘not relevant’ responses were disregarded.

Time trade-off (TTO)

The study was designed in accordance with the checklist for utility assessment proposed by Stalmeier et al. [18].

As the DLQI is used in over 30 different dermatological conditions [2], we opted to perform the utility assessment on a general population sample, in order to avoid bias in selecting a patient population with a particular diagnosis. Another reason to derive the utilities from the general public is that such utilities are expected to be used for resource allocation decisions in health care in many jurisdictions including the US, Canada and the UK [19–21]. Moreover, general population samples are commonly used for the estimation of utilities for disease-specific measures, for example, they have previously been successfully used in asthma, inflammatory bowel diseases, dementia and myelofibrosis [22–25].

An example TTO task is provided in ‘Appendix’. We decided to set the time frame of the TTO at 10 years as the EQ-5D health states were valued using this time horizon in the Measurement and Valuation of Health study [26]. The subjects had to imagine that they were in a DLQI-defined chronic health state for the next 10 years. Then, they were asked to indicate their point of indifference between two options: the first was living 10 more years in a certain imperfect health state characterised by dermatological symptoms or concerns (L1–L3, M1–M3, S), while the second was a shorter remaining lifetime in perfect health. We followed the protocol for the self-completion TTO method developed by the University of York [27]. A top-down titration procedure was applied, starting from the upper anchor (10 years) and descending to 0 years. The smallest tradable unit of time was 6 months between 10 and 9 years of remaining lifetime; below that, it was set at 1 year between 9 and 0 years. Only health states better than dead were elicited and no visual aids were used.

Utilities (U) for each health state were calculated by dividing the number of required years in perfect health by 10 years. Therefore, utilities ranged from 0 for the value of death to 1 for perfect health. For instance, if a respondent indicated 10 years in health state ‘L1’ is equal to 7 years in perfect health, this yielded U L1 = 7/10 = 0.7.
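The utility calculation and the titration grid described above can be written out as a minimal Python sketch. The names `tto_utility` and `titration_grid` are hypothetical and only illustrate the arithmetic; they do not reproduce the survey software.

```python
TIME_FRAME_YEARS = 10  # time horizon of the TTO task in this study


def tto_utility(years_in_perfect_health, time_frame=TIME_FRAME_YEARS):
    """Utility = years in perfect health judged equivalent / time frame."""
    if not 0 <= years_in_perfect_health <= time_frame:
        raise ValueError("indifference point must lie within the time frame")
    return years_in_perfect_health / time_frame


def titration_grid():
    """Top-down answer options: 6-month step from 10 to 9, then 1-year steps."""
    return [10.0, 9.5] + [float(years) for years in range(9, -1, -1)]


print(tto_utility(7))    # the worked example from the text
print(titration_grid())  # possible indifference points in years
```

With this grid, the only utilities a respondent can express are 1, 0.95 and the tenths from 0.9 down to 0.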

Those TTO responses which met any of the following criteria were excluded from the study:

  • missing answer;

  • inconsistent answers in the TTO task indicating that the participant clearly did not understand the task (e.g. more than one indifference point with large gaps between them where the participant was able to choose between the health states).

Moreover, participants who provided no valid answer for any of the three health states were excluded from the whole study.

Statistical analysis

A sample size calculation was performed to ensure a sufficient number of subjects to detect a significant difference between utilities assigned to different health states. To detect an expected difference of 0.10 with an assumed SD of 0.25 between utilities [28] and to achieve a power of 80 % at α = 0.05 (two-tailed test), 100 valid responses were needed per health state. However, health utilities are typically not normally distributed because they are bounded by the limits of the scale (here: 0, 1) [29]. The comparison of such data requires the application of non-parametric tests. To account for this, we increased the estimated sample size by 15 %, as suggested in the literature [30]. Thus, we aimed to reach 115 observations per health state.
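The order of magnitude of this calculation can be checked with a short Python sketch. It assumes the standard normal-approximation formula for comparing two independent means, which is not necessarily the exact method used here; the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Normal-approximation sample size per group for a two-sided test.

    Assumed formula: n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_power) * sd / delta) ** 2


def inflate_for_nonparametric(n, percent=15):
    """Increase a sample size by a given percentage, rounding up."""
    return ceil(n * (100 + percent) / 100)


approx_n = n_per_group(delta=0.10, sd=0.25)
print(round(approx_n, 1))              # close to the 100 responses reported
print(inflate_for_nonparametric(100))  # target after the 15 % increase
```

Under these assumptions the formula gives slightly fewer than 100 responses per health state, in line with the rounded figure reported above, and the 15 % increase leads to the target of 115.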

Descriptive statistics of demographics were computed, and utilities were calculated for each health state. The differences between utilities assigned to health states, and between respondents with or without any dermatological condition at the time of the survey, were compared using the Mann–Whitney U test. As a sensitivity analysis, we removed all responses from participants with any dermatological condition and repeated all analyses.
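For readers who wish to reproduce this kind of comparison outside SPSS, the following is a minimal pure-Python sketch of a two-sided Mann–Whitney U test using the normal approximation with a tie correction and no continuity correction. It is an illustration under those stated assumptions, not the SPSS procedure, and the utility values in the example are invented.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value by normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled = sorted((value, group) for group, sample in enumerate((x, y))
                    for value in sample)
    values = [value for value, _ in pooled]

    # Assign midranks to tied values and accumulate the tie-correction term.
    ranks = [0.0] * len(values)
    tie_term = 0
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        midrank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        tie_term += (j - i) ** 3 - (j - i)
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    n = n1 + n2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / sqrt(var_u)
    return u_x, 2 * (1 - NormalDist().cdf(abs(z)))


# Invented TTO utilities for two health states, for illustration only.
state_a = [1.0, 0.9, 0.8, 0.8, 0.7, 0.95, 0.6, 1.0]
state_b = [0.5, 0.6, 0.7, 0.3, 0.0, 0.8, 0.4, 0.5]
u_stat, p_value = mann_whitney_u(state_a, state_b)
print(u_stat, p_value)
```

Because TTO utilities take only a few distinct values, ties are frequent, which is why the tie correction is included in the variance term.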

All statistics were two-tailed at the 0.05 significance level. Data were analysed using SPSS 22.0 (Armonk, NY: IBM Corp. 2013).

Results

Altogether 516 participants completed the online survey, of whom 15 were under the age of 18 years and consequently were excluded from the analysis. A further 193 were excluded according to the exclusion criteria. Of these, 175 returned the TTO part of the questionnaire blank and 18 provided inconsistent answers in all three health states evaluated. Thus, the answers of 308 respondents were analysed. Their mean age was 27.4 (SD 10.3, minimum–maximum 18–75) years and 210 (68.6 %) were female (Table 2). The majority of the sample consisted of university students (57.5 %). A total of 54 (17.6 %) subjects reported having a dermatological condition diagnosed by a physician at the time of the survey. Amongst these, non-atopic dermatitis (3.9 %), acne (2.6 %) and psoriasis (2.3 %) were the most frequent diagnoses.

Table 2 Characteristics of the study population

Each of the seven health states was assessed by 124–130 individuals, which exceeds the original target. The mean utilities ranged from 0.56 to 0.75 (Fig. 1). Most respondents were willing to participate in the TTO task and traded life years. Out of the 882 observations, there were 71 (8 %) ‘0’ utility values (i.e. the respondent considered the health state as bad as dead) and 130 (14.1 %) ‘1’ utility values, indicating that the respondent refused to trade any time. The moderate health state ‘M1’ was regarded as being as bad as dead by 12 % of the respondents, while this rate was only 4.6 % for ‘M2’. In accordance with the mean utilities, the highest rate of ‘1’ answers was found for state ‘M2’ (22.3 %), and the lowest for the most severe health state ‘S’.

Fig. 1

Mean utilities (95 % CI) for the seven health states. Note ‘M1–3’ refers to DLQI total score of 6, ‘L1–3’ to 11 and ‘S’ to 16

Amongst the three 11-point health states (very large health impact: L1, L2 and L3), a significant difference was found between U L1 = 0.66 (SD 0.31) and U L3 = 0.59 (SD 0.29), but not between U L1 and U L2, or between U L2 and U L3 (Table 3). Regarding the three 6-point moderate health states, both U M1 = 0.64 (SD 0.32) and U M3 = 0.62 (SD 0.30) were found to be significantly lower than U M2 = 0.75 (SD 0.27). However, no statistically significant difference was found between U M1 and U M3.

Table 3 Time trade-off utilities for the health states defined by DLQI

The lowest mean utility was elicited for the most severe health state ‘S’ [U S = 0.56 (SD 0.29)], and this was significantly lower than utilities for all other health states, except for ‘L3’ and ‘M3’. The highest mean utility was attached to health state ‘M2’ [U M2 = 0.75 (SD 0.27)], and this was significantly higher than the utilities for all other health states. Although both U M1 and U M3 referred to health states scored 6 on the DLQI, they did not differ significantly from U L1, U L2 or U L3, which all represented a DLQI total score of 11.

When considering all the 882 observations, the mean utilities assessed by respondents with any dermatological condition were higher compared with those who did not have any skin problem [U = 0.68 (SD 0.30) and U = 0.63 (SD 0.29), p = 0.029]. In contrast, mean utilities for binocular blindness (applied as a practice TTO task) were similar in these two groups [U = 0.49 (SD 0.30) and U = 0.50 (SD 0.27), p = 0.796]. The number of respondents with any dermatological diagnosis was too small for the single health states (n = 18–28) to detect a significant within-group difference except for health state ‘L2’ [U = 0.75 (SD 0.26) and U = 0.62 (SD 0.28), p = 0.036].

After removing responses of participants with any dermatological conditions, only minor changes were observed in mean utilities and in the significance of the differences between health states (Table 3).

Discussion

In this paper, we presented estimated preferences for seven selected health states described by the 10 items of the DLQI using the time trade-off method. The experiment was undertaken on a convenience sample of adults regardless of whether they had any dermatological condition. On a cardinal scale of 0–1, where 0 indicates death and 1 indicates perfect health, the mean utilities for the 6-, 11- and 16-point DLQI score health states were 0.62–0.75, 0.59–0.66 and 0.56, respectively.

In three out of our six pairwise comparisons, significantly different utilities were attached to health states with identical total DLQI score. This suggests that certain items of the DLQI might have a greater impact on HRQoL than others; and hence, judging HRQoL of dermatological patients from their DLQI total score itself might not be adequate. Recently, Twiss et al. [15] reported problems with individual items in the scale, such as differential item functioning, response dependency between certain items and redundancy of others on a mixed sample of psoriasis and atopic dermatitis patients.

In eight out of the 15 pairwise comparisons, utilities for health states whose DLQI total scores differed by more than the MCID were not significantly different. This implies that a clinically meaningful DLQI improvement reached as a result of a treatment (e.g. a patient moves from health state ‘S’ to health state ‘M3’ (Table 1), indicating a 10-point reduction in DLQI score, more than twice as large as the MCID value) is not always followed by a significant improvement in terms of utilities.

Our findings raise several theoretical concerns regarding the appropriateness of using the DLQI as a benchmark in clinical and financial decisions. We use the example of psoriasis to discuss the possible issues related to the discrepancies observed between DLQI scores and utilities.

According to the European Consensus group on ‘Treatment goals for psoriasis’ and the reimbursement criteria of many European countries, biologicals are recommended in moderate-to-severe psoriasis patients who meet (BSA > 10 or PASI > 10) and DLQI > 10, after being treated for at least 6 months with traditional systemic therapy, or if these therapies are not tolerated or are contraindicated [4–8]. However, the HRQoL of these patients expressed in utilities could be diverse, even though their total DLQI scores are equal. Correspondingly, we assume that the average utility gain achieved with biological therapy, as well as the cost-effectiveness, might vary widely between psoriasis patients who initially indicated a DLQI score of 11 but differed in terms of the number of negatively affected items and/or the levels of impairment.

On the other hand, although utilities for health states with DLQI total scores of 6 and 11 are not necessarily different, biological therapy is only recommended for those who reach 11 points on the DLQI. This draws attention to a methodological weakness in the scoring of the DLQI: out of the 10 items, 8 have a possible response level of ‘not relevant’, which is scored the same as a ‘not at all’ answer. Hence, some patients who differ in terms of their age, employment status, way of life or culture (e.g. those who are unemployed, retired, do not do any regular exercise or are not sexually active) might be adversely affected by the scoring system, because it is much harder for them to fulfil the DLQI criteria set out by the guidelines.

In some European countries, to be eligible for maintenance biological treatment, psoriasis patients are required to improve by at least 5 points on the DLQI within a 10- to 16-week period [5, 8]. However, as we observed, this improvement may not be associated with a significant (or any) health gain. We therefore assume that, when the DLQI is used in clinical and financial guidelines, biological therapy might not be given to the patient population in which the largest utility gain and the lowest cost-effectiveness ratio could be achieved.

Utilities in this study reflect the preferences of a general population sample (i.e. social utility values) rather than those of patients. It is well known, however, that preferences of patients and the general public often differ considerably [31]. In the current study, 17.6 % reported having a dermatological condition diagnosed by a physician at the time of the survey. Considering all the 882 responses, individuals with any skin disease gave up significantly fewer life years for the DLQI health states, but not for binocular blindness, compared with those without any dermatological condition. Some argue that one reason for the difference is that people without the experience of a particular health state cannot accurately predict its impact on HRQoL [32]. On the other hand, patients often adapt to their illnesses, and members of the general public do not take this into account when evaluating health states [33]. In countries with publicly funded healthcare systems, it may be expected that the allocation of healthcare resources should take into account social preferences.

Future research should expand this experiment to selected patient groups, especially those whose clinical management relies heavily on the DLQI, such as patients with psoriasis or atopic dermatitis. Given that Twiss et al. [15] found that the DLQI could work differently in various diagnoses, we presume that differences in utilities for the same DLQI health state profiles, such as those found in this study, might also exist when assessed by patients with different dermatological conditions. This would provide valuable information on patients’ preferences and might contribute to the development of clinical and financial guidelines for dermatological conditions.

This study has some limitations to consider. First, only a small number of health states were assessed, which does not allow the development of a model that can predict utilities for each health state profile of the DLQI. However, we find it important to point out that this was not the purpose of the study. Despite the small number and the arbitrary selection of health states, we were able to detect significant differences between health states with equal DLQI scores. Considering the number of possible health states of the DLQI descriptive system, presumably, many more discrepancies exist. The largest difference between health states was set to 10 points, but based on our findings, even larger differences might yield similar utilities. Secondly, the DLQI is treated as a unidimensional measure, and no factor analysis of the 10 items was performed in this study. Previous factor analyses, however, have questioned this unidimensionality [12, 14, 15]. This suggests that the differences found between utilities for health states with identical DLQI scores stem from the fact that some items of the DLQI are weighted differently by respondents. Thirdly, the DLQI is a dermatology-specific measure and it is likely to be more sensitive to small but clinically relevant changes in HRQoL that the TTO is not able to capture. This might partly explain why utilities for health states that differed by more than the MCID were not statistically significantly different. Fourthly, due to the online recruitment, participants with dermatological conditions were slightly overrepresented, because these individuals were likely to be more interested in a survey related to their illness and therefore more inclined to participate in the study. Nonetheless, the sensitivity analysis revealed that after removing all responses of these participants, only minor changes occurred (Table 3). Finally, the description of some health states might have seemed unrealistic and hard to imagine for some participants, as we did not provide information regarding the extent of skin lesions, affected body sites, appearance of the skin, or the type or the name of a particular skin disease. Nonetheless, as these aspects are not covered by the 10 items of the DLQI, we felt that they could have biased the results.

The benefits of the DLQI as an outcome measure should be recognised. Being the first dermatology-specific HRQoL measure, it has largely contributed to a shift in the minds of dermatologists about the importance of HRQoL issues. Its simplicity, multilingual availability and more than 20 years of experience with its use apparently support its widespread application as a reference for treatment decisions. Nevertheless, from the perspective of health economics, it is time to reconsider its usefulness in medical and reimbursement decision-making to promote cost-effective management of dermatological conditions and effective allocation of scarce resources in healthcare budgets.