A comparison of health utility scores calculated using United Kingdom and Canadian preference weights in persons with Alzheimer’s disease and their caregivers

Background The use of the EQ-5D to assess the economic benefits of health technologies has led to questions about the cross-population transferability of preference weights to calculate health utility scores. The aim of this study is to investigate whether the use of UK and Canadian preference weights will lead to the calculation of different health utility scores in a sample of persons with Alzheimer’s disease (AD) and their primary informal caregivers.
Methods We recruited 216 patient-caregiver dyads from nine geriatric and memory clinics across Canada. Participants used the EQ-5D-3L to rate their health-related quality-of-life (HRQoL). EQ-5D-3L responses were transformed into health utility scores using UK and Canadian preference weights. The levels of agreement between the two sets of scores were assessed using intraclass correlation coefficients (ICCs). Bland-Altman plots depicted individual-level differences between the two sets of scores. Differences in health utility scores were tested using the Wilcoxon signed rank sum test. A generalized linear model with a gamma distribution was used to examine whether participants’ socio-demographic characteristics were associated with their health utility scores.
Results The distributions of health utility scores derived from both the UK and Canadian preference weights were skewed to the left. The intraclass correlation coefficient was 0.94 (95 % CI: 0.92, 0.95) for persons with AD and 0.92 (95 % CI: 0.88, 0.94) for the caregivers. The Canadian weights yielded slightly higher median health utility scores than the UK weights for caregivers (median difference: 0.009; 95 % confidence interval: 0.007, 0.013). This finding persisted after stratifying by disease severity. Few socio-demographic characteristics were associated with the two sets of health utility scores.
Conclusions Health utility scores exhibited small and clinically unimportant differences when calculated with UK versus Canadian preference weights in persons with AD and their caregivers. The original UK and Canadian population samples used to obtain the preference weights valued health states similarly.


Background
Alzheimer's disease (AD) is a chronic neurodegenerative condition that accounts for 60 to 70 % of all cases of dementia. Cognitive impairment, functional decline, and behavior and mood problems are the core features of AD [1]. AD and other dementias are the seventh leading cause of mortality and disability and the fourth leading cause of disease burden in high-income countries [2].
Health-related quality of life (HRQoL) is an individual's dynamic perception of the impact of a health state upon physical, emotional, and cognitive function, social role performance, well-being, and life satisfaction [3]. HRQoL is an important means of assessing the impact of AD treatments because available therapies mitigate the symptoms of cognitive decline, but do not alter the progression of the disease [4].
The EQ-5D-3L is one of the most frequently used generic instruments to measure HRQoL [5][6][7]. Algorithms (preference weights) can be used to convert EQ-5D-3L responses into health utility scores (range: 0 [equivalent to death] to 1 [equivalent to full health]), which are employed in cost-utility analyses to calculate quality-adjusted life-years (QALYs). The original preference weights for the EQ-5D-3L were derived from the general UK population using the time trade-off (TTO) method [8]. Researchers generated a Canadian set of preference weights for the EQ-5D-3L using the TTO method and a sample of 1145 participants who belonged to a market research panel [9]. In the UK and Canadian studies, the researchers chose different sub-sets of health states from the 243 total possible health states on the EQ-5D-3L. These sub-sets were further divided into smaller groups for each participant to value using the TTO method. Regression analyses were employed to develop a set of beta coefficients that would serve as the preference weights to convert EQ-5D-3L responses into health utility scores.
This study investigated whether the use of UK and Canadian preference weights would lead to the computation of different health utility scores in a sample of persons with AD and their primary informal caregivers. The topic is important because studies based in populations without domestic sets of preference weights will often draw upon the preference weights of other populations, regardless of whether the other populations' weights are transferable. Unless transferability is assessed, researchers cannot be certain whether another population's weights will provide unbiased health utility scores in their population of interest.
This issue is important within the context of AD because health utility scores are essential components of cost-utility analyses. These analyses can influence reimbursement decisions for AD pharmacotherapies, as evidenced in 2006 when the results of a cost-utility analysis prompted the United Kingdom's National Health Service to delist coverage of cholinesterase inhibitors for persons with mild AD. The impact of cost-utility analyses on treatment decision-making highlights the importance of ensuring the unbiased nature of the underlying health utility scores.
Recent work in Canada has echoed our sentiments about the use of adequate preference weights in cost-utility analyses [10]. Lien et al. point out that differences in country-specific preference weights could lead to differences in cost-utility results [10]. Similar concerns have also been raised in Italy [11]. In the context of Canada's publicly-funded healthcare system, decisions regarding the efficient allocation of limited resources require support from unbiased analyses of cost-utility data. The issue extends beyond Canada to include any jurisdiction without a locally or domestically available set of preference weights. An examination of this issue may raise awareness among regulatory agencies that do not mandate the use of local or domestic preference weights.
To round out our objectives, we also explored whether socio-demographic factors might be associated with the health utility scores calculated in this study.

Subjects and data collection
Data were collected between November 2008 and August 2011 [12]. Two hundred sixteen persons with AD and their primary informal caregivers were recruited from nine memory or geriatric clinics across Canada. Eligible participants had a diagnosis of AD, as defined by the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision criteria [13] or the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer's Disease and Related Disorders Association (NINCDS-ADRDA) criteria [14]. We included persons with mild or moderate AD to ensure that participants would be cognitively capable of answering the study questions. The physicians who ran the recruiting clinics assessed disease severity using the Functional Assessment Staging in Alzheimer's Disease Scale [15]. Participants' primary informal caregivers also had to agree to participate in the study. All participants had to speak English or French.
We conducted separate one-on-one interviews with each participant. The interviews included socio-demographic questions (i.e., age, gender, education level, occupation and household annual income [Canadian dollars]) and the EQ-5D-3L. Each participant rated her or his own HRQoL. Caregivers did not provide proxy HRQoL ratings for persons with AD.
Prior to commencing each interview, participants read an information package about the study and they could ask the interviewer questions. The interviews began after participants signed an informed consent form. The study received ethics approval from the Hamilton Health Sciences/McMaster Health Sciences Research Ethics Board (project number 08-179) and from the local research ethics boards governing each of the nine recruitment sites.
We calculated health utility scores using the EQ-5D-3L responses and the UK [8] and Canadian preference weights [9]. We did not compute health utility scores for participants who failed to answer one or more of the EQ-5D-3L questions. Participants with missing health utility scores were excluded from statistical analyses involving these scores.
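The scoring step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published UK or Canadian algorithm: the coefficients in `ILLUSTRATIVE_WEIGHTS` are hypothetical placeholders, the function and dimension names are our own, and real value sets may include additional terms. The sketch shows the general idea of subtracting decrements from full health (1.0) and returning no score when any of the five questions is unanswered.

```python
from typing import Optional, Sequence

# EQ-5D-3L dimensions, in the order the responses are supplied.
DIMENSIONS = (
    "mobility",
    "self_care",
    "usual_activities",
    "pain_discomfort",
    "anxiety_depression",
)

# HYPOTHETICAL placeholder coefficients for illustration only.
# These are NOT the published UK [8] or Canadian [9] preference weights.
ILLUSTRATIVE_WEIGHTS = {
    "constant": 0.05,  # subtracted once if any dimension is above level 1
    "level2": (0.05, 0.05, 0.03, 0.08, 0.06),
    "level3": (0.15, 0.12, 0.08, 0.20, 0.15),
}


def utility_score(
    responses: Sequence[Optional[int]],
    weights: dict = ILLUSTRATIVE_WEIGHTS,
) -> Optional[float]:
    """Convert five EQ-5D-3L responses (levels 1 to 3) into a utility score.

    Returns None when any response is missing or invalid, mirroring the
    rule that incomplete questionnaires receive no health utility score.
    """
    if len(responses) != len(DIMENSIONS):
        return None
    if any(level not in (1, 2, 3) for level in responses):
        return None
    if all(level == 1 for level in responses):
        return 1.0  # full health
    score = 1.0 - weights["constant"]
    for index, level in enumerate(responses):
        if level == 2:
            score -= weights["level2"][index]
        elif level == 3:
            score -= weights["level3"][index]
    return round(score, 3)
```

Scoring the same responses with two different weight dictionaries (one per value set) would yield the paired scores that the remaining analyses compare.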

Statistical analysis
Socio-demographic characteristics were summarized using medians and interquartile ranges for continuous variables, and frequencies for categorical variables. We used 1000 bootstrap samples to calculate bias-corrected and accelerated 95 % confidence intervals (CIs) for all median health utility scores. We assessed the statistical significance of the median differences between the UK and Canadian health utility scores using the Wilcoxon signed rank sum test. Hodges-Lehmann's methods for paired groups were employed to calculate the median differences and the 95 % confidence intervals for the median differences.
The overall agreement between the UK and Canadian health utility scores was assessed with the intraclass correlation coefficient (ICC), specifically the ICC(3,1) [16]. Bland-Altman plots [17] were created to graphically depict the difference in each participant's health utility score (the score based on UK weights minus the score based on Canadian weights).
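As a rough illustration of the Hodges-Lehmann point estimate for paired data, the sketch below computes the median of the Walsh averages of the within-pair differences. It is a minimal example using only the Python standard library; the function name is our own, and it does not reproduce the confidence interval calculation or the exact software routine used in the study.

```python
from itertools import combinations_with_replacement
from statistics import median
from typing import Sequence


def hodges_lehmann_paired(x: Sequence[float], y: Sequence[float]) -> float:
    """Hodges-Lehmann estimate of the median paired difference (x - y).

    The estimate is the median of all Walsh averages (d_i + d_j) / 2
    with i <= j, where d_i is the i-th within-pair difference.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(x, y)]
    walsh_averages = [
        (d_i + d_j) / 2
        for d_i, d_j in combinations_with_replacement(diffs, 2)
    ]
    return median(walsh_averages)
```

For n pairs this builds n(n + 1)/2 Walsh averages, which is manageable for a sample of a few hundred dyads.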
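The quantities behind a Bland-Altman plot can be sketched as below. This is a minimal example under the conventional assumption that the limits of agreement are the mean difference plus or minus 1.96 standard deviations of the differences; the function name and return structure are our own, and plotting is left out.

```python
from statistics import mean, stdev
from typing import Dict, List, Sequence, Tuple


def bland_altman(uk: Sequence[float], ca: Sequence[float]) -> Dict[str, object]:
    """Return the bias, 95 % limits of agreement, and plot coordinates.

    Differences are computed as UK score minus Canadian score, matching
    the direction described in the text. Each plot point is
    (mean of the two scores, difference between the two scores).
    """
    if len(uk) != len(ca) or len(uk) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [u - c for u, c in zip(uk, ca)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation
    points: List[Tuple[float, float]] = [
        ((u + c) / 2, u - c) for u, c in zip(uk, ca)
    ]
    return {
        "bias": bias,
        "lower": bias - spread,
        "upper": bias + spread,
        "points": points,
    }
```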
We tested the association between socio-demographic factors (i.e., age, gender, education level, occupation, and annual household income) and each set of health utility scores using a generalized linear model (GLM) with a gamma distribution and bootstrap 95 % CIs for the regression coefficients. Residual plots showed that this type of GLM fit the data better than a generalized additive model, quantile regression, or analysis of variance. We employed the Akaike Information Criterion (AIC) to choose the optimal model from among these approaches. Based on the literature [18], we considered a change in health utility score of 0.074 to be a minimum clinically important difference (MCID).
Most analyses were carried out using R v3.2.0 (R Foundation for Statistical Computing, Vienna, Austria). To calculate the ICC(3,1), we used a two-way mixed-effects analysis of variance model and absolute agreement in SPSS v19 (IBM Corp., Armonk, NY).
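The model-selection step can be illustrated with the standard AIC formula, AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood; the model with the lowest AIC is preferred. The sketch below is a minimal example: the function names are our own, and any log-likelihood values passed in are hypothetical inputs rather than results from this study.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_by_aic(candidates: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the candidate model with the lowest AIC.

    `candidates` maps a model name to (log_likelihood, n_params).
    """
    if not candidates:
        raise ValueError("at least one candidate model is required")
    return min(candidates, key=lambda name: aic(*candidates[name]))
```

A call such as `best_by_aic({"gamma_glm": (120.0, 6), "anova": (110.0, 6)})` would return `"gamma_glm"`, since a higher log-likelihood at equal complexity gives a lower AIC.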
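For readers who want to see how a single-measure ICC is built from a two-way analysis of variance, the sketch below computes both the consistency form and the absolute-agreement form from the usual mean squares. It is a minimal standard-library example with hypothetical function and key names; it does not reproduce SPSS output or its confidence intervals.

```python
from typing import Dict, Sequence


def icc_single_measure(data: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Single-measure ICCs from a two-way ANOVA without replication.

    `data` holds one row per participant and one column per scoring
    method (e.g., [uk_score, canadian_score]).
    """
    n = len(data)          # participants (rows)
    k = len(data[0])       # scoring methods (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need a rectangular table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(row[j] for row in data) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((value - grand) ** 2 for row in data for value in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    consistency = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    agreement = (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
    return {"consistency": consistency, "absolute_agreement": agreement}
```

The two forms differ only in the column (method) term of the denominator, so a systematic offset between the two value sets lowers the absolute-agreement ICC but not the consistency ICC.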

Results
A total of 216 persons with AD and their primary informal caregivers were included in the study (Table 1). The median age was 80 years for persons with AD and 69 years for caregivers. One hundred five persons with AD (48.6 %) were female and the majority (n = 143, 66.2 %) of caregivers were female. Most persons with AD (n = 112, 51.9 %) did not exceed a high school education and all except one were retired. Most caregivers (n = 147, 68.1 %) had a post-secondary education and 140 (64.8 %) were retired. Most (n = 175, 81.0 %) persons with AD were diagnosed with mild AD and the rest were diagnosed with moderate AD. Over 50 % of participants in both groups reported no problems in all five EQ-5D-3L dimensions (Table 2).
For persons with AD and caregivers, the distributions of health utility scores derived from the UK and Canadian preference weights were left-skewed (Figs. 1 and 2). Half of the persons with AD had health utility scores above 0.85 (UK weights) or 0.84 (Canadian weights). Similarly, half of the caregivers had scores above 0.80 (UK weights) or 0.83 (Canadian weights). We could not compute health utility scores for three persons with AD because they did not answer all of the questions on the EQ-5D-3L.
The difference between the two sets of health utility scores was not statistically significant in persons with AD (p = 0.63) (Table 3). However, the difference was statistically significant in caregivers. The Canadian utilities were higher than the UK utilities in the caregiver group when the caregivers were stratified according to the disease severity (mild, moderate) of the persons under their care (Table 4). Median differences were 0.009 (95 % CI: 0.007, 0.013) in the mild subgroup and 0.013 (95 % CI: 0.007, 0.028) in the moderate subgroup. For persons with AD, the UK and Canadian health utility scores did not differ significantly.
The overall agreement between health utility scores using UK and Canadian preference weights was high. The ICC(3,1) values were 0.94 (95 % CI: 0.92, 0.95) for persons with AD and 0.92 (95 % CI: 0.88, 0.94) for caregivers. According to the Bland-Altman plots, 95 % of the individual differences in health utility scores (UK minus Canadian) fell within a range of -0.12 to 0.12 for persons with AD and within -0.16 to 0.12 for caregivers (Fig. 3). For persons with AD, only 15 (7 %) of the individual differences in score exceeded the MCID of 0.074; for caregivers, the number was also 15 (9 %).
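The comparison of individual differences against the MCID can be sketched as a simple count. This minimal example assumes that "exceeded" refers to the absolute value of each UK minus Canadian difference, which is our interpretation rather than a rule stated in the text; the function name is also our own.

```python
from typing import Sequence, Tuple

MCID = 0.074  # minimum clinically important difference used in the study [18]


def count_exceeding_mcid(
    uk: Sequence[float],
    ca: Sequence[float],
    mcid: float = MCID,
) -> Tuple[int, float]:
    """Count pairs whose absolute score difference exceeds the MCID.

    Returns (count, proportion of all pairs).
    """
    if len(uk) != len(ca) or len(uk) == 0:
        raise ValueError("uk and ca must be non-empty and of equal length")
    count = sum(1 for u, c in zip(uk, ca) if abs(u - c) > mcid)
    return count, count / len(uk)
```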
With respect to socio-demographic characteristics, only some education categories were associated with the two sets of health utility scores (Table 5). The scores for persons with AD who had a post-graduate education were an average of 0.1 points lower (UK weights) or 0.059 points lower (Canadian weights) than the scores for persons who had a high school education or less. The scores for caregivers with a post-graduate education were an average of 0.116 points lower (UK weights) or 0.04 points lower (Canadian weights) than the scores for caregivers with a high school education or less. Caregivers with a university education also had average scores that were 0.131 points lower than caregivers with a high school education or less. No other socio-demographic variables were statistically significantly associated with either set of health utility scores. Three of the statistically significant associations exceeded the MCID of 0.074.

Discussion
In this study of persons with AD and their primary informal caregivers, health utility scores derived from UK and Canadian preference weights exhibited slight differences from one another. Based on the MCID, these differences were not large enough to be considered clinically important. We could not find other studies that compared health utility scores calculated with UK and Canadian preference weights.
Evidence suggests health utility scores can be similar for people across countries with comparable socio-demographic characteristics (e.g., UK, Holland, and Germany [19]; Spain and Germany [20]). Given the socio-demographic parallels between the UK and Canadian populations, the samples used to derive the UK and Canadian preference weights [8,9] may have simply valued their health states similarly to one another. Thus, the health utility scores derived from each set of preference weights in this study did not differ appreciably from one another. On the other hand, not every population with similar socio-demographic characteristics will place comparable value on the same health states. Representative samples of the United States (US) and UK populations valued 42 EQ-5D-3L health states and the adjusted mean difference in health utility scores was 0.10 points higher in the US population [21]. An earlier study involving the same sample as in the present study compared health utilities calculated with US and Canadian preference weights [12]. This comparison showed that Americans and Canadians agreed on the types of health states that should be considered 'good' or 'bad', but Canadians tended to place a lower value than Americans on most of these health states. When calculated with Canadian versus American preference weights, the mean health utility score was 0.06 points lower (95 % CI: -0.07, -0.06) in persons with AD and 0.05 points lower (95 % CI: -0.06, -0.04) in caregivers.
The findings reported in the literature raise a caution for researchers who wish to calculate health utility scores for study samples drawn from populations for which preference weights do not exist. The default practice in these situations has been to use preference weights derived from similar populations, but this approach could lead to over- or under-estimates of health utility scores. In the absence of preference weights for the population of interest, one can never be certain whether another population's weights will provide unbiased results.
The caution about transferability of weights also applies to the EQ-5D-5L, which measures the same five health dimensions as the EQ-5D-3L, but expands the number of response options from three to five [22]. Recent work involving the EQ-5D-5L suggests that one set of preference weights may not capture inter-regional differences in a single country's population if the population is spread over a large geographic area [23].
In our study, only a small number of socio-demographic characteristics affected the UK and Canadian health utility scores. Perhaps disease-specific factors, rather than socio-demographic characteristics, could better explain the scores in AD samples. In persons with AD, some studies have shown that health utility scores are affected by levels of depression and functional ability [24,25]. Other work has suggested that a more complex series of socio-demographic and disease factors combine to modify the relation between disease severity and health utility scores [26]. For caregivers, health utility scores may be influenced by the extent to which patients are dependent on care, the perceived burden of being a caregiver, and the time involved in providing care [27,28]. Researchers should consider these additional variables when they design studies to explain health utility scores in AD samples. Our study found that most caregivers' health utility scores were lower than the scores of persons with AD. The burden of caring for a person with AD may adversely affect a caregiver's HRQoL [29]. Meanwhile, the effects of cognitive impairment might prevent persons with AD from perceiving the full impact of disease on their lives, thereby leading to high ratings of HRQoL [30,31].
Few studies could be found that assessed the HRQoL of AD caregivers. One study conducted on the Canary Islands reported that the frequency of having at least some problems on each EQ-5D-3L dimension was greater in 237 AD caregivers compared to the islands' general population [27]. However, the authors did not convert EQ-5D-3L responses into health utility scores. The caregiver utility scores in our study were somewhat lower than the scores reported in two studies (i.e., 0.87 [32], 0.88 [33]) of general adult populations in Canada, and about the same as the scores reported in a third Canadian study (i.e., 0.85 female, 0.81 male) [34].
Readers should take certain issues into account when interpreting the results of this study. The participants were recruited from geriatric or memory clinics, so they are unlikely to be representative of the average person with AD or the average caregiver. The participants with AD may be a healthier subset of all patients and the caregivers may be more informed about AD.

Conclusion
Health utility scores exhibited some small yet clinically unimportant differences when calculated with UK versus Canadian preference weights in a sample of persons with AD and their caregivers. The UK and Canadian populations used to obtain the preference weights valued health states similarly to one another.