A discrete-choice experiment to assess patients' preferences for osteoarthritis treatment: An ESCEO working group
Seminars in Arthritis and Rheumatism

Objective: To evaluate the preferences of patients with osteoarthritis (OA) for treatment.
Methods: A discrete-choice experiment was conducted among adult OA patients who were presented with 12 choice sets of two treatment options and asked in each to select the treatment they would prefer. Based on literature reviews, expert consultation, a patient survey and an expert meeting, treatment options were characterized by seven attributes: improvement in pain, improvement in walking, ability to manage domestic activities, ability to manage social activities, improvement in overall energy and well-being, risk of moderate/severe side effects and impact on disease progression. A random parameters logit model was used to estimate patients' preferences and a latent class model was used to explore preference classes.
Results: 253 OA patients from seven European countries were included (74% women; mean age 71.3 years). For all seven treatment attributes, significant differences were observed between levels. Given the range of levels of each attribute, the most important treatment attribute in this group was impact on disease progression (29.5%) followed by walking improvement (17.1%) and pain improvement (16.3%). The latent class model identified two preference classes. In the first class (probability of 56%), patients valued impact of disease progression the most (39%). In the second class, walking improvement and improvement in overall energy and well-being were the most important (23%).
Conclusion: This study suggests that all seven treatment attributes were important for OA patients. Overall, given the range of levels, the most important outcomes were impact on disease progression and improvement in pain and walking.


Introduction
The patient perspective is becoming increasingly important in clinical and regulatory decision-making [1,2]. Information about what patients prefer and how they value various aspects of a health intervention can be useful when designing, evaluating and implementing healthcare programs [3]. Such insights can also help when establishing treatment guidelines and should be taken into consideration when designing new drugs or other interventions [4]. There has therefore been a growing interest in obtaining patients' preferences for healthcare interventions. In particular, the use of stated-preference methods (including discrete-choice experiments (DCE)) has markedly increased in recent years [5,6]. A DCE is a quantitative method used to elicit and quantify the preferences of participants [7], who are asked to repeatedly choose between hypothetical options that systematically differ in several attributes of interest.
Given the significant challenges and lack of therapeutic options for osteoarthritis (OA) [8], some stated-preference studies have been conducted to elicit preferences for OA treatment [4]. OA is the most common form of arthritis in later life and most frequently affects the knee, hand, and/or hip [9]. OA is predominantly characterized by pain and has been shown to substantially reduce the patient's mobility and quality of life and to represent a significant contributor to disability in the elderly [10]. Current OA treatments aim primarily to reduce joint pain, maintain and improve joint mobility and enhance quality of life. Treatment options (including surgery, pharmacological and non-pharmacological treatment) may however differ in benefits and risks, emphasizing the need to assess patients' preferences for different aspects of OA treatment [4].
Previous DCEs in OA were mainly conducted to assess preferences for the characteristics of OA drug treatment. They suggest that DCE is a reliable method in this patient population, but revealed some contrasting results. Some studies reported that benefit attributes (e.g. pain improvement, physical improvement) were the most important for patients [11–13], while others suggested that patients were mainly concerned by side-effects [14–16]. However, few studies have investigated the trade-offs between all potential OA treatment outcomes including impact on pain, physical mobility, joint structure and side-effects. Further insight into patients' preferences for OA treatment, as well as investigation of potential differences in preferences between patients and countries, would thus be worthwhile. The European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO) has therefore set up a working group to conduct a DCE to further elicit patients' preferences in OA across European countries.
The aim of this study was therefore to assess the preferences of European patients for OA treatment outcomes using a DCE. Furthermore, as secondary objectives, we aimed to assess heterogeneity in responses by determining different preference profiles of participants and revealing potential differences in preferences between the participating European countries.

Methods
A DCE was used to examine patients' preferences for OA treatment. In the DCE survey, patients were presented with a series of choices and asked in each to select the treatment (Treatment A or Treatment B) they would prefer. The two hypothetical treatments were described by a set of attributes, which were further specified by attribute levels (see below). Good research practices for stated-preference studies were followed [3,17].

Attributes and levels
The identification and prioritization of important OA treatment attributes was conducted following a 4-stage process: (1) two scoping reviews (i.e. an exhaustive review of preference studies in OA, and a review of outcomes in RCTs of OA treatment) to generate an initial list of OA treatment outcomes; (2) interviews with OA patients (n=20) to identify any missing outcomes and an expert consultation (n=17) to review and validate initial outcomes; (3) a patient survey (n=56) with OA patients from Belgium, Italy, Portugal, Netherlands, Switzerland and the United Kingdom to rank the most important outcomes; (4) an expert meeting including one OA patient (n=17) to identify the most important OA treatment outcomes for inclusion in the DCE based on the survey results, and to select levels for each chosen outcome. More details about the findings of each stage are presented in Appendix A. Finally, the following seven attributes were included: improvement in pain, improvement in walking, ability to manage domestic activities, ability to manage social activities, improvement in overall energy and well-being, risk of moderate/severe side effects and impact on disease progression (see Table 1). For each attribute, three or four levels were agreed upon by the expert group.

Experimental design
A main-effects efficient design, obtained by maximizing D-efficiency, was used to select the subset of choice sets to be presented to the patients using Ngene software (Version 1.1.1, http://www.choice-metrics.com). A total of 24 choice tasks were designed and blocked into two versions of the questionnaire containing 12 choice tasks each. A dominance test, i.e. a choice set with one hypothetical treatment that is clearly better than the other (a treatment associated with higher improvement in pain, walking and overall energy, better ability to conduct domestic and social activities, similar side-effects risk and lower impact on disease progression), was added to assess the reliability of respondents' choices [18]. Each respondent therefore received 13 choice tasks. An example of a choice task is shown in Fig. 1. All choice tasks for one version of the questionnaire are available in Appendix B (the dominance test is choice task number 10). We did not include an opt-out in the DCE because interviews with OA patients and experts suggested that participants would be willing to trade off between hypothetical treatment options, which were moreover not directly connected to drug options. In addition, to avoid the loss of power that would result from a large number of respondents choosing an opt-out option, we decided to use binary choices and thus forced a choice.
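The dominance check described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the attribute keys and the integer coding of levels (a higher code is assumed to be more desirable, so a lower side-effect risk or less structural degradation gets a higher code) are assumptions introduced here.

```python
# Hypothetical attribute keys; each level is coded as an integer where a
# higher value is assumed to be more desirable for the patient.
ATTRIBUTES = (
    "pain", "walking", "domestic", "social",
    "energy", "side_effects", "progression",
)

def dominates(a, b):
    """Return True if option a is at least as good as b on every attribute
    and strictly better on at least one."""
    no_worse = all(a[k] >= b[k] for k in ATTRIBUTES)
    better_somewhere = any(a[k] > b[k] for k in ATTRIBUTES)
    return no_worse and better_somewhere

def passed_dominance_test(chosen, option_a, option_b):
    """chosen is "A" or "B"; a respondent passes by picking the dominant option."""
    if dominates(option_a, option_b):
        return chosen == "A"
    if dominates(option_b, option_a):
        return chosen == "B"
    raise ValueError("neither option dominates the other")

# Illustrative dominance task: A is better everywhere except side effects (equal).
treatment_a = dict(pain=3, walking=3, domestic=2, social=2, energy=3,
                   side_effects=1, progression=3)
treatment_b = dict(pain=0, walking=0, domestic=0, social=0, energy=0,
                   side_effects=1, progression=0)

print(passed_dominance_test("A", treatment_a, treatment_b))
print(passed_dominance_test("B", treatment_a, treatment_b))
```

A respondent who picks the dominated option fails the check; in this study such respondents were excluded from the main analysis.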

Questionnaire design
The questionnaire was paper-based. The attributes and levels were first described, and an example of a completed choice task was included. After respondents had completed the 13 choice tasks, they were asked how difficult they found the tasks on a seven-point Likert scale. Data on participants' demographic and socioeconomic characteristics (i.e. age, gender and education) and some medical characteristics (i.e. knee and/or hip OA, treatment for OA, hip/knee replacement) were also collected.
The questionnaire was developed in English by MH and ED, and reviewed and approved by the ESCEO working group (n=17). The working group consisted of clinical scientists in the field of OA and DCE experts who were selected by the Scientific Advisory Board of ESCEO. The English version of the questionnaire was then pilot tested with some clinicians and 20 OA patients to check interpretation problems, face validity and the length of the questionnaire. Only minor changes to layout were made. The questionnaire was then translated into additional languages (French, Spanish, German and Italian), covering the languages spoken across the countries in our sample. Each language version was checked and approved by at least one additional native speaker. One version of the questionnaire is available in Appendix B.

Study population, data collection and sample size
The study was conducted in seven European countries (Belgium, France, Italy, the Netherlands, Portugal, Spain and the United Kingdom) between May 2018 and July 2019. The DCE was conducted in OA patients who were recruited at specialist outpatient clinics or from cohort studies. Participants who were not cognitively able to understand and fill out the questionnaire or who lacked understanding of the local language were excluded. The questionnaire was completed by the participant at the clinic or at home. In the latter case, it was returned in a postage-paid envelope. Sample size calculation for stated-preference studies is difficult as it depends on the true values of the unknown parameters estimated in the DCE. Hence, a minimum of 200 respondents was targeted for the whole sample, which was sufficient based on common rules-of-thumb for minimum sample size [19] and similar to other DCEs [6].
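The text does not say which rule of thumb was applied. One widely cited candidate is the rule attributed to Johnson and Orme, N ≥ 500c / (t × a), where c is the largest number of levels of any attribute, t the number of choice tasks per respondent and a the number of alternatives per task. The sketch below applies it to this design (at most four levels, 12 tasks, two alternatives); treating this as the relevant rule is an assumption.

```python
import math

def rule_of_thumb_min_sample(max_levels, tasks, alternatives):
    """Minimum respondents under the rule N >= 500 * c / (t * a).

    Assumed here for illustration; the paper only refers to
    "common rules-of-thumb" without naming one.
    """
    return math.ceil(500 * max_levels / (tasks * alternatives))

# Design values taken from the Methods: up to 4 levels per attribute,
# 12 choice tasks per respondent, 2 treatments per task.
n_min = rule_of_thumb_min_sample(max_levels=4, tasks=12, alternatives=2)
print(n_min)
```

Under this assumed rule the minimum is 84 respondents, so the target of 200 would leave a clear margin for the pooled sample, though not necessarily for each country analysed separately.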

Ethics approval
Approval for this study was obtained from the Medical Ethics Committee of the University of Liège that coordinated the project. Participants gave informed written consent. Additional local ethics approval was obtained at those participating centers that required ethics approval for a DCE questionnaire study.

Statistical analyses
Data analysis was carried out using Nlogit software, version 5.0. Data of participants who failed the dominance test were excluded. The available patient data were analysed with various recommended statistical methods [17].
First, a random parameters logit model was used, which captures preference heterogeneity by estimating the standard deviation of each parameter's distribution. The model provides mean coefficients as well as a measure of the distribution around the mean coefficient in the form of a standard deviation. If the standard deviation is significantly different from zero, this is interpreted as evidence of significant preference heterogeneity for the attribute/level in the sample. We used a panel model to account for the panel nature of the data, as each participant completed 12 choice sets. To take scale heterogeneity into account, which relates to potential differences between countries in the randomness of choice behaviour, a normally distributed random component was added for each country.
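To make the idea of a random coefficient concrete, the sketch below simulates a binary-logit choice probability by averaging over normally distributed draws of a single coefficient, using a base-2 Halton sequence. It is a deliberately reduced illustration: one coefficient, one choice task, hypothetical mean and standard deviation. It is not the panel model with country-level scale terms that was estimated in Nlogit.

```python
import math
from statistics import NormalDist

def halton(index, base=2):
    """Radical-inverse Halton point in (0, 1) for a 1-based index."""
    result, fraction = 0.0, 1.0 / base
    i = index
    while i > 0:
        result += fraction * (i % base)
        i //= base
        fraction /= base
    return result

def simulated_choice_prob(delta_x, mean, sd, n_draws=1000):
    """Average the logit probability of choosing A over draws of one
    normally distributed coefficient (delta_x = x_A - x_B)."""
    normal = NormalDist(mean, sd)
    total = 0.0
    for r in range(1, n_draws + 1):
        beta = normal.inv_cdf(halton(r))  # map uniform draw to N(mean, sd)
        total += 1.0 / (1.0 + math.exp(-beta * delta_x))
    return total / n_draws

# Hypothetical coefficient distribution, for illustration only.
p = simulated_choice_prob(delta_x=1.0, mean=1.0, sd=0.5)
print(round(p, 3))
```

Halton draws cover the unit interval more evenly than pseudo-random draws, which is why they are commonly used to reduce simulation noise for a given number of draws.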
All variables were included as effects-coded categorical variables that we assumed to be normally distributed. The model was estimated using 1,000 Halton draws and no interaction terms were included. With effects coding, the mean effect of each attribute is normalized to zero and preference weights are relative to the mean effect of the different levels of the attribute. The sign of a coefficient reflects whether an attribute level leads to an increase (positive) or a decrease (negative) in patients' utility. The value of each coefficient represents the importance patients assign to an attribute level. P-values represent the statistical difference between the preference weight of the attribute levels and the mean effect of the same attribute [17]. If the 95% confidence intervals around two levels did not overlap, the difference between the preference weights was considered statistically significant. A priori, we expected that the attribute levels with large improvements, heightened ability to conduct activities, lower risk of side-effects and improvement in joint structure would have a positive effect on utility (i.e. a positive sign).
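Effects coding can be illustrated with a short Python sketch. It assumes, for illustration only, that the first level of an attribute is the omitted reference; the paper does not state which level was omitted. The pain coefficients used at the end are those reported in the Results.

```python
def effects_code(level, levels):
    """Effects-coded row for `level`; the first entry of `levels` is the
    (assumed) omitted reference level and is coded -1 in every column."""
    reference, *others = levels
    if level == reference:
        return [-1] * len(others)
    return [1 if level == other else 0 for other in others]

def reference_coefficient(estimated):
    """Coefficient of the omitted level: minus the sum of the estimated
    ones, so that all levels of an attribute sum to zero."""
    return -sum(estimated)

pain_levels = ["none", "mild", "moderate", "large"]
print(effects_code("none", pain_levels))      # [-1, -1, -1]
print(effects_code("moderate", pain_levels))  # [0, 1, 0]

# Reported coefficients for mild, moderate and large pain improvement.
implied_none = reference_coefficient([-0.22, 0.25, 0.61])
print(round(implied_none, 2))
```

The implied value for "no pain improvement" is about -0.64, which matches the reported -0.63 up to rounding and shows why each coefficient reads as a deviation from the attribute's mean effect.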
Preference estimates from the model were then used to calculate the conditional relative importance of attributes overall and per country. Using the range method [17], the range of each attribute is calculated as the difference between the highest and lowest coefficient for the levels of that attribute. The conditional relative importance is then calculated by dividing the attribute-specific range by the sum of the ranges of all attributes. The relative attribute importance calculated with this method always depends on the range of levels chosen per attribute and on the other attributes included in the experiment. Given the large effect of the level "severe degradation", we also estimated the conditional relative importance by excluding this level.
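The range method can be sketched as follows. The pain coefficients are those reported in the Results; the second attribute and its values are made up solely to show how excluding an extreme level changes the shares, and do not correspond to any estimate in the study.

```python
def level_range(coefs, excluded=()):
    """Highest minus lowest coefficient of one attribute, optionally
    ignoring some levels."""
    kept = [value for level, value in coefs.items() if level not in excluded]
    return max(kept) - min(kept)

def relative_importance(model, excluded=None):
    """Share of each attribute's level range in the sum of all ranges."""
    excluded = excluded or {}
    ranges = {attr: level_range(coefs, excluded.get(attr, ()))
              for attr, coefs in model.items()}
    total = sum(ranges.values())
    return {attr: r / total for attr, r in ranges.items()}

model = {
    # Pain coefficients as reported in the Results.
    "pain": {"none": -0.63, "mild": -0.22, "moderate": 0.25, "large": 0.61},
    # Hypothetical attribute with one extreme level (illustration only).
    "attribute_x": {"worst": -1.0, "middle": 0.3, "best": 0.7},
}

full = relative_importance(model)
trimmed = relative_importance(model, excluded={"attribute_x": {"worst"}})
print({k: round(v, 3) for k, v in full.items()})
print({k: round(v, 3) for k, v in trimmed.items()})
```

In this toy example the share of the hypothetical attribute drops from roughly 58% to roughly 24% once its extreme level is left out, which mirrors the sensitivity the authors check for the "severe degradation" level.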
Second, a latent class model was used to determine preference classes. Latent class models make it possible to identify the existence and number of classes in the population based on their treatment preferences [20]. To determine the number of classes, we selected the model with the best fit based on the Akaike information criterion. The association between selected patient characteristics and latent class membership was then determined using a multivariable logistic regression model. The multivariable model was considered exploratory and was limited to the variables with different probability between latent classes. This analysis was conducted with IBM SPSS 24.
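Selecting the number of classes by the Akaike information criterion amounts to computing AIC = 2k - 2 ln L for each candidate model and keeping the lowest value. The sketch below shows the mechanics only; the log-likelihoods and parameter counts are invented placeholders, not values from this study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate better fit
    after penalising the number of estimated parameters."""
    return 2 * n_params - 2 * log_likelihood

# Placeholder fits: number of classes -> (log-likelihood, parameter count).
# These numbers are made up for illustration.
candidates = {
    2: (-1500.0, 39),
    3: (-1490.0, 59),
    4: (-1485.0, 79),
}

scores = {n_classes: aic(ll, k) for n_classes, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print(best)
```

With these placeholder values the small gains in log-likelihood from adding classes do not offset the penalty for the extra parameters, so the two-class model is retained.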

Participants' characteristics
A total of 285 questionnaires were completed and returned. Of those, 32 participants failed the dominance test and were excluded from the final analysis. Participants who failed the dominance test did not differ in age, gender and education level from those who passed, and inclusion of these patients in an additional analysis did not affect the results and conclusions. The final sample consisted of 253 participants (47 from Belgium, 31 from France, 45 from Italy, 15 from the Netherlands, 42 from Portugal, 51 from Spain, 22 from the United Kingdom). The respondents had a mean age of 71.3 years, and 74% were female. Sample characteristics are shown in Table 2. Patients had mainly lower extremity OA (93%), with 57%, 14% and 21% of patients reporting having knee OA, hip OA and both knee and hip OA, respectively. On average, the task difficulty was seen as moderate with an average score of 4.1, based on responses to a seven-point scale (one for extremely easy and seven for extremely difficult).

Patients' preference
The random parameters logit model results are presented in Table 3 and Fig. 2. For all seven selected OA treatment outcomes, significant differences were observed between levels (as the 95% CIs did not overlap), meaning that all attributes are important for patients. All relationships were in the expected direction, as improved levels for each attribute were associated with higher coefficients (for example, -0.63 for no pain improvement, -0.22 for mild pain improvement, 0.25 for moderate pain improvement and 0.61 for large pain improvement). Patients thus preferred a treatment with greater improvement in pain, walking and/or energy, heightened ability to conduct social and domestic activities, lower risk of side-effects, and lower structural progression. As previously mentioned, each coefficient is estimated relative to the mean attribute effect. For example, the negative coefficient for mild improvement in pain means that this level has a lower effect on utility than the average effect of the pain attribute.
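One way to read these coefficients is to convert a utility difference into a choice probability with the binary logit formula. The sketch below compares two treatments that differ only in pain improvement (large versus none), using the mean coefficients quoted above. It ignores preference heterogeneity and scale differences between countries, so it is an illustrative simplification rather than a prediction from the fitted model.

```python
import math

def logit_choice_probability(utility_a, utility_b):
    """Binary logit probability of choosing option A over option B."""
    return 1.0 / (1.0 + math.exp(-(utility_a - utility_b)))

# Mean pain coefficients reported in the text; all other attributes are
# assumed equal across the two treatments, so they cancel out.
PAIN = {"none": -0.63, "mild": -0.22, "moderate": 0.25, "large": 0.61}

p = logit_choice_probability(PAIN["large"], PAIN["none"])
print(round(p, 3))
```

Under these simplifying assumptions, a treatment offering large pain improvement would be chosen over an otherwise identical treatment with no pain improvement in roughly 78% of choices.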
Overall, given the range of levels of each attribute, the most important treatment attribute for OA patients was impact on disease progression (29.5%), followed by walking improvement (17.1%), pain improvement (16.3%), ability to conduct domestic activities (11.5%), improvement in energy and well-being (10.2%), ability to conduct social activities (8.2%) and risk of severe side-effects (7.4%). When the level 'severe degradation' was excluded from the calculation, the conditional relative importance of the impact on disease progression was only 12%. Given the significant standard deviation for at least one level per attribute, variations in preferences between participants were observed for all attributes, and were especially marked for impact on disease progression (see Table 3).
The conditional relative importance of attributes per country is shown in Appendix C. Given the range of levels of each attribute, the impact on disease progression was the most important OA treatment outcome in all countries except in Italy where pain improvement was the most important outcome. In all countries, significant differences between levels for all seven OA treatment outcomes were observed.

Latent class model
The latent class model identified two preference classes with class probabilities of 56% and 44%, respectively (see Table 4). In the first class (probability of 56%), participants valued impact of disease progression the most (39%), and the other attributes had a conditional relative importance between 5% and 14%. In the second class, walking improvement and improvement in overall energy and well-being were the most important attributes (conditional relative importance of 23%), while impact on disease progression was the second-to-last attribute. Patients from Belgium, Portugal and France were more likely to be in the class concerned mainly by the impact on disease progression. Women, patients with hip OA and patients with OA treatment were also more likely to be in this group. The multivariable logit model revealed three significant results: patients from France and women were significantly more frequent in the class concerned by impact on disease progression, and patients from Italy were more often in the class with a focus on walking and overall energy. At a significance level of 10% only, patients with knee OA were also more likely to be in the class with a focus on walking and energy improvement.

Discussion
This study demonstrates that, on average, all the seven treatment attributes we selected were important for OA patients. Overall, given the range of levels included in our experiment, the most important outcomes to patients were impact on disease progression and improvement in pain and walking, although variations in preferences were observed between patients and countries. Two preference classes were identified in the latent class analysis: one with a stronger emphasis on disease progression, and the other on walking and pain improvement.
Overall, the conditional relative importance was highest for the impact on disease progression. This finding should however be interpreted with great caution, as the relative importance of an attribute is based on the range between the highest and lowest coefficient for its levels. A level 'severe degradation' was included and was associated with a strong negative preference. Excluding this level from the calculation of the relative importance of disease progression (i.e. estimating a difference between mild degradation and improvement in joint structure) placed the impact on disease progression as only the second least important attribute.
Although the levels for the risk of severe side-effects were significantly different and thus important, side-effects were less important than treatment benefits. This finding is in contrast to the study of Hauber et al. [21], where the risk of side-effects, expressed as heart attack or stroke risk, was considered more important than benefits in ambulatory pain and doing domestic activities. The framing of specific side-effects could potentially be more sensitive for patients than a general description of moderate/severe side-effects. In the latent class analysis, patients from France and women were significantly more likely to be in the class concerned mainly by the impact on disease progression, while patients from Italy were more likely to be in the second latent class, concerned by improvement in walking and overall energy.
This study could have implications for research, clinical and regulatory decisions. Our study reinforces the need to develop treatments that could influence joint structure. OA remains an unmet medical need: some treatments have been developed, but attempts to obtain joint structure-modifying treatments for OA have failed [22]. As current treatment options (including surgery, pharmacological and non-pharmacological treatment) may differ in benefits and risks, assessing patients' preferences and involving patients in decision-making could help to improve disease management. The findings of this study highlight the importance of investigating individual preferences and incorporating them in a shared decision-making process.
Although this study followed good research practices, some potential limitations exist. First, some participants who were not cognitively able to fill out and understand the questionnaire were excluded. Our sample could thus be slightly "over-educated" compared to the average OA patient. In addition, our patients were mainly recruited from specialist outpatient clinics and thus may not be representative of all OA patients visiting general practitioners. We also do not have information about how many patients were not cognitively able to understand the questionnaire or fill it in. Thus, selection bias and limitations in generalizability of our results cannot be excluded. Further, patients in our study had mainly lower extremity OA (at the hip and/or knee); preferences for OA treatment may be different for hand or shoulder OA and would thus require further investigation. In addition, the sample size was lower in some countries and, although significance was observed in each country, a larger sample would be needed to confirm findings per country. The sample size was targeted for the whole sample and not for individual countries to reveal potential differences between countries. In our study, we therefore pooled data from all countries, which could be a potential limitation. Second, although sound methodology was used to select and define attributes, including large patient involvement, it is possible that additional attributes may play a role, at least in some countries or for some patients. In particular, costs could be an important attribute in countries/populations where patients have an out-of-pocket contribution. The cost attribute did not emerge as an important attribute in our survey, which mainly included patients aged over 65 years who do not have to pay for drugs. To avoid presenting a cost attribute that would not have been relevant for most patients, we therefore agreed not to include a cost attribute in the DCE.
Similarly, mode of administration was not considered an important attribute in our preliminary survey and, given the constraint to limit the number of attributes, it was not included in our list, which already contained seven attributes. We however acknowledge that some patients may have preferences for route of administration [23] (topical vs oral vs injection) and this would require further investigation. To maintain consistency across countries, the same list of attributes and levels and the same design were used in all countries. Third, although the working group did not consider some combinations of levels as implausible, some could not be expected at first glance (for example, managing all domestic activities with difficulty and having a severe impact on disease progression, but managing all social activities without difficulty). To avoid decreasing design efficiency, the working group agreed to not include implausible combinations in the design. The pilot test with about 20 OA patients further did not reveal any problems with the task. Fourth, back and forward translations of the questionnaire were not performed, and a pilot study was not conducted in all countries. Finally, while DCEs are widely used, an inherent limitation is that respondents are evaluating hypothetical options. Therefore, what respondents declare they will do may differ from what they would actually do if faced with the choice in real life.

Conclusion
This study suggests that all seven treatment attributes selected were important for OA patients. Overall, the most important outcomes to OA patients were the impact on disease progression and improvement in pain and walking, although variations in preferences were observed between patients. Excluding the level "severe degradation" from the calculation of the conditional relative importance substantially decreases the relative importance of the impact on disease progression.

Author contributions
Conception and design: MH, ED and JYR. Analysis and interpretation of data: all authors. Drafting of the article: MH. Critical revision of the article for important intellectual content: all authors. Final approval of the article: all authors. Authors who take responsibility for the integrity of the work as a whole: MH (m.hiligsmann@maastrichtuniversity.nl).

Declaration of Competing Interest
Professor Bruyère reports grants from Biophytis, IBSA, MEDA, Servier, SMB, Theramex, outside the submitted work. Professor Cooper reports personal fees from Alliance for Better Bone Health, Amgen, Eli Lilly, GSK, Medtronic, Merck, Novartis, Pfizer, Roche, Servier, Takeda and UCB. Professor Conaghan has received consulting fees or done speakers bureaus for AbbVie, EMD Serono, Flexion Therapeutics, Galapagos, Novartis, Pfizer, Samumed and Stryker. Professor Dennison has received consulting fees from UCB and Pfizer. Dr. Herrero-Beaumont reports grants from Novartis, grants from Sandoz, grants from Pfizer, grants from Amgen, grants from Mylan, grants from Servier, outside the submitted work; in addition, Dr. Herrero-Beaumont has a patent for the use of 6-shogaol in osteoporosis treatment (Spanish patent issued) and a patent on the use of osteostatin in osteoarthritis treatment (issued). Professor Reginster reports grants and personal fees from IBSA-GENEVRIER, grants and personal fees from MYLAN, grants and personal fees from RADIUS HEALTH, personal fees from PIERRE FABRE, grants from CNIEL, personal fees from DAIRY RESEARCH COUNCIL, outside the submitted work. Professor Thomas reports personal fees from Abbvie, grants and personal fees from Amgen, personal fees from Arrow, personal fees from Biogen, personal fees from BMS, grants and personal fees from Chugai, personal fees from Expanscience, personal fees from Gilead, personal fees from Grunenthal, grants and personal fees from HAC-Pharma, personal fees from LCA, personal fees from Lilly, personal fees from Medac, grants and personal fees from MSD, grants and personal fees from Novartis, grants and personal fees from Pfizer, personal fees from Sanofi, personal fees from Theramex, personal fees from Thuasne, personal fees from TEVA, grants and personal fees from UCB, grants from Bone therapeutics, personal fees from Nordic, outside the submitted work.
Professor Rizzoli reports personal fees from Amgen, CNiEL, Danone, Mylan, Nestle, Radius health, Sandoz, TEVA/Theramex, outside the submitted work. The other authors have no conflict of interest relevant to the content of this study.

Funding
This work was supported by the European Society for Clinical and Economic Aspects of Osteoporosis, Osteoarthritis and Musculoskeletal Diseases (ESCEO).