Key Points for Decision Makers

Health technology assessment (HTA) bodies want patient preference studies to investigate attributes related to benefits, risks, and administration.

They are willing to incorporate patient preferences as supportive evidence in HTA.

While the weight of patient preferences in decision making is uncertain, it is expected to have an impact.

1 Introduction

Due to the combination of an infinite demand for health technologies and a finite budget, it is important to pay for technologies that offer real improvements for patients. Health technology assessment (HTA) therefore systematically evaluates health technologies to inform payer decisions. HTA procedures and decision-making elements differ per country. According to European network for health technology assessment’s (EUnetHTA’s) HTA Core Model®, HTA elements can include (a) health problem and current use of technology, (b) description and technical characteristics, (c) safety, and (d) clinical effectiveness, and can further be expanded to include (e) costs and economic evaluation, (f) ethical analysis, (g) organizational aspects, (h) patient and social aspects, and (i) legal aspects [1].

Currently, the patient perspective is not systematically included in cost-effectiveness analysis (CEA). CEA often uses quality-adjusted life-years (QALYs) as a key measure of benefit. QALYs are calculated by multiplying the utility of a health state by the years that a patient lives in that state. To this end, patients’ health states can be identified through the use of health-related quality of life (HRQoL) instruments such as Short Form 36 (SF-36) and EuroQol–5 Dimensions (EQ-5D) [2]. However, these tools do not identify how patients value that state, and the value (i.e., utility) of patients’ health states is often determined using societal valuations [3].
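The QALY calculation described above (utility of a health state multiplied by the years lived in that state) can be illustrated with a minimal sketch. The function name, the example patient, and all numbers are hypothetical and serve illustration only; discounting and other adjustments that a full CEA may apply are not shown.

```python
def total_qalys(health_states):
    """Sum utility x duration over a sequence of (utility, years) pairs."""
    return sum(utility * years for utility, years in health_states)


# Hypothetical patient: 2 years at utility 0.8, then 3 years at utility 0.5.
example_states = [(0.8, 2.0), (0.5, 3.0)]
example_qalys = total_qalys(example_states)  # 0.8 * 2 + 0.5 * 3
```

In this sketch the result depends entirely on the utility values supplied, which is why it matters whether those values come from societal valuations or from patients themselves.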

Considering the patient voice in payer decision making could lead to more cost-effective outcomes as this could result in reimbursement (i.e., coverage) of therapies that patients need and accept [3]. Patients can provide insights about the impact of conditions and treatments on their lives, outcomes that matter to them, and their needs and fears [3, 4]. Currently, patients are sometimes directly involved in discussions [5, 6]. However, direct patient involvement is thought to be subjective, potentially biased, and of limited representativeness [7,8,9,10].

To explore patients’ perspectives in a robust manner, patient preferences (PP) can be investigated. Patient preference studies (PPS) can provide preference evidence from a group of patients on the importance of, and trade-offs they are willing to make between, certain treatment features (attributes) or health states. PP have been defined by the US Food and Drug Administration (FDA) as “qualitative or quantitative assessments of the relative desirability or acceptability to patients of specified alternatives or choices among outcomes or other attributes that differ among alternative health interventions” [11]. PP can be obtained using different preference exploration (qualitative) or elicitation (quantitative) methods [12,13,14,15]. Exploration methods can be used for concept exploration and to obtain in-depth knowledge on the value of medical products. Examples of preference exploration methods include semi-structured interviews and focus groups [12, 16, 17]. Elicitation methods can quantify personal preferences, allow for statistical analysis, and possibly the detection of preference heterogeneity among patients. Examples of preference elicitation methods include discrete choice experiments (DCE), swing-weighting, and the threshold technique [12, 13, 18].

Multiple publications suggest that it may be beneficial to include PP in HTA [10, 16, 19, 20], especially when there is uncertainty in the available evidence and there are multiple alternatives with different benefit–risk profiles (i.e., preference-sensitive situations). This was also highlighted by the Patient Preferences in Benefit–Risk Assessments during the Drug Life Cycle (PREFER) project that aims “to strengthen patient-centric decision-making throughout the life cycle of medicinal treatments by developing expert and evidence-based recommendations on how patient preferences should be assessed and inform decision-making” [21]. While two literature reviews, 143 semi-structured interviews, and eight focus groups were previously conducted within PREFER on the design, conduct, and use of PPS [19, 22,23,24,25], some questions regarding the integration of PP in HTA and differences between healthcare systems remained unanswered.

Actual application of PP in HTA remains limited and not systematic [22, 26]. Huls et al. [27] identified issues of conceptual, normative, structural, procedural, or methodological nature currently blocking the integration of PP in HTA. Procedural issues were found to be HTA-specific and related to when in the HTA process PP can be used, the weight that PP can receive, the impact that PP can have on decision making, and how quality of PPS can be evaluated. Moreover, possibilities and processes to implement PP in HTA and payer decision making may be different per country as current HTA systems also vary between countries. The aim of this study therefore was to investigate HTA representatives’ opinions on whether and how to incorporate PP in HTA in their respective countries. Moreover, HTA representatives’ beliefs were explored regarding the potential weight and impact of PP on decision making, and on how quality of PPS should be evaluated.

2 Methods

Following PREFER work, questions remained on the use of PP in HTA regarding practical integration in HTA processes and assessments, as well as differences between healthcare systems. As a result, additional focus groups were organized to address these. Focus groups were conducted with HTA representatives in Germany, Belgium, and Canada following a pre-defined guide that covered HTA-specific challenges to the use of patient preferences. Methods are further described below and the consolidated criteria for reporting qualitative research (COREQ) checklist was completed (Supplementary material I, see electronic supplementary material [ESM]) [28].

2.1 Focus Group Guide Development

A focus group guide was developed (Supplementary material II, see ESM). Included questions could all be linked to the HTA-specific challenges mentioned by Huls et al. [27], namely ‘HTA stage’ (including use in multi-criteria decision analysis, MCDA), ‘weight’, ‘impact’, and ‘quality’. MCDA is any method that establishes criteria, weights them in terms of importance, and scores each alternative on each criterion to create an overall assessment of value. Moreover, questions were included on the current HTA procedures and criteria used to assess therapeutic value of treatments. In addition, a case of a PP-sensitive situation (gene therapy for the treatment of hemophilia) was added to allow for discussion on a concrete example. While only one criterion needs to be met for a situation to be preference sensitive, the context of gene therapy decision making in hemophilia meets multiple criteria that are stated by the FDA [11] to contribute to preference sensitivity, including:

  1. “multiple treatment options exist and there is no option that is clearly superior for all patients” as for hemophilia, alternative treatment options exist (e.g., intravenous factor replacement therapy) and it is not clear what therapy would be superior for all patients;

  2. “the evidence supporting one option over others is considerably uncertain or variable” as the evidence supporting gene therapy is considerably uncertain regarding long-term outcomes, contributing to the perception that no option exists that is superior for all patients;

  3. Additional contexts mentioned in the FDA PP guidance as gene therapy in this case intends to yield significant health and appearance benefits, could directly affect HRQoL, is developed to fill an unmet medical need or treat a rare disease, offers alternative benefits to those already marketed, and uses a novel technology.
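To make the definition of MCDA given earlier in this section more concrete, the sketch below shows one common form, an additive weighted-sum model. This is an assumption made for illustration: the definition above does not prescribe a particular aggregation rule, and the criteria, weights, alternatives, and scores are all hypothetical rather than taken from the focus groups or from any HTA body.

```python
def mcda_scores(weights, performance):
    """Return an overall value per alternative using a weighted sum.

    weights: criterion -> importance weight (normalized here to sum to 1).
    performance: alternative -> {criterion -> score on a 0-1 scale}.
    """
    total_weight = sum(weights.values())
    normalized = {c: w / total_weight for c, w in weights.items()}
    return {
        alternative: sum(normalized[c] * scores[c] for c in normalized)
        for alternative, scores in performance.items()
    }


# Hypothetical importance weights, e.g., as might be derived from a PPS.
example_weights = {"benefit": 5, "risk": 3, "administration": 2}

# Hypothetical scores of two alternatives on each criterion (higher is better).
example_performance = {
    "therapy_a": {"benefit": 0.8, "risk": 0.4, "administration": 0.9},
    "therapy_b": {"benefit": 0.6, "risk": 0.9, "administration": 0.5},
}

example_values = mcda_scores(example_weights, example_performance)
```

The single overall value per alternative is what makes such a model transparent, but also what allows it to be read as a calculation rather than as one input to deliberation.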

2.2 Participant Recruitment

Countries with different HTA procedures were selected based on their value assessment criteria and processes. From an initial selection of the UK, France, Germany, Belgium, Sweden, Poland, and Canada, the UK (National Institute for Health and Care Excellence [NICE]) and France (Haute Autorité de santé [HAS]) were excluded due to their involvement in the European Medicines Agency/European network for health technology assessment (EMA/EUnetHTA) qualification of the PREFER project’s PPS framework, as participation in this study could have been perceived as a conflict of interest in the EMA/EUnetHTA qualification procedure [21]. To cover three heterogeneous healthcare systems, Germany, Belgium, and Canada were selected. In Germany, the Federal Joint Committee (G-BA) commissions the Institute for Quality and Efficiency in Health Care (IQWiG) to perform technology assessments, which are then appraised by the G-BA. German HTA and payer decision making mainly focuses on the added therapeutic benefit of treatments [29]. IQWiG has performed PP pilot projects in the past [30, 31]. While efforts were made to recruit G-BA representatives, the G-BA contact person stated that due to lack of familiarity with the topic and high workload, no G-BA focus group could be realized. Therefore, an IQWiG focus group was set up. In Belgium, the Federal Health Care Knowledge Centre (KCE) can perform HTA for the National Institute for Health and Disability Insurance (RIZIV-INAMI), which gives advice on reimbursement decisions to ministers of the federal government [32]. Belgian HTA does not focus on one main criterion [33]. KCE has recently published their patient involvement report [34], and one representative is involved in the PREFER project. In Canada, HTA is performed on a national level (except for Quebec) by the Canadian Agency for Drugs and Technologies in Health (CADTH), while coverage decisions are made by 19 different payers [35]. Perspectives of patients form a key criterion in Canadian assessments [36]. CADTH recently organized internal discussions on the use of PP and one representative was involved in the PREFER project.

Country-specific focus groups of three to seven HTA representatives (not payers) were organized. HTA representatives were recruited through purposive sampling via the professional network of the researchers and the respective HTA agencies. The HTA agencies selected the candidates based on their HTA, patient involvement, and patient preference experience and expertise.

2.3 Conduct of Focus Groups

Focus groups were all conducted by the same moderator (EvO) and assistant (VF) and followed the predefined focus group guide. The Belgian and German focus groups were conducted face to face, and the Canadian one via teleconference. Focus groups lasted approximately 2 h, were conducted in English, audio-recorded, transcribed, and pseudonymized.

2.4 Analysis

Thematic analysis was performed by two researchers (EvO and VF) to minimize variability of interpretation (full analysis plan available in Supplementary material III, see ESM) [37]. First, the researchers familiarized themselves with the content of the focus groups by moderating or assisting and reading the transcripts. Familiarization resulted in an overview of the collected data, and the researchers became aware of key themes and concepts. Subsequently, the researchers independently identified topics throughout the transcripts and agreed upon a list of initial themes. The researchers reread the transcripts and formulated a list of final themes. Themes were critically assessed and necessary revisions were discussed among researchers to finalize the themes. Definitions were given to these themes (Supplementary material IV, see ESM). The text of the transcripts was organized along the themes using NVivo 12. Sections of text that corresponded to a theme were indexed, and placed in the respective column. These columns were then analyzed and interpreted for experiences and opinions of participants. Results were drafted and reviewed by participants to ensure correct interpretation; minor adaptions were made according to their feedback.

3 Results

The Canadian focus group consisted of five CADTH representatives, the German focus group of three IQWiG representatives, and the Belgian focus group of seven KCE representatives. Because of the important differences between the healthcare systems of the three countries, results are presented per focus group (except for the gene therapy case). A cross-country comparison of results is made in the discussion section.

3.1 Patient Preferences in Canadian HTA

3.1.1 Current HTA Procedures and Value Assessment Criteria

Canadian participants explained that CADTH has two drug committees: the Canadian Drug Expert Committee and the pan-Canadian Oncology Drug Review (pCODR) Expert Review Committee. Furthermore, CADTH has a Health Technology Expert Review Panel for non-drug technologies. According to participants, the committees have different deliberative frameworks, but all use comparable criteria: (a) clinical effectiveness, (b) cost effectiveness, (c) patient perspectives, and (d) other considerations such as ethics, implementation, or feasibility (Supplementary material V, see ESM). Consistency across assessments was found to be crucial for drugs, but for non-drug technologies a more tailored approach is taken. Regarding current patient involvement, participants explained that for drugs, a call for patient input is launched by CADTH in advance of the company’s anticipated date of submission. For non-drug technologies, a formal systematic review of patient perspectives literature is done in addition to engaging with one or more patients.

3.1.2 HTA Stage

A consensus was reached among Canadian participants that results from PPS could be integrated in early dialog with companies, to justify unmet medical need or selection of clinical outcomes, or to serve as supportive information in the assessment of clinical evidence. To that end, participants mentioned PPS should investigate (a) trade-offs between benefits and risks, and (b) the importance of the overall burden of a technology on patients’ lives (e.g., administration schedules, travel time, and travel expenses). One participant wondered how similar PP would be to current patient input (i.e., direct patient involvement). Another stated that, since patient input and PP answer different questions using different sample sizes, they cannot validate or contradict each other. Participants expressed they wanted to keep PP separate from the QALY as there is “already a lot we’re trying to potentially put into it” (CAN_1) and indicated they also were not supportive of using PP to weigh criteria in a multi-criteria decision analysis (MCDA). One participant explained “We want to have the flexibility to see what’s driving the assessment and I feel like MCDA, though it’s not meant to be used as a calculation, it can be misused that way” (CAN_1).

3.1.3 Weight

When asked what weight PP could receive in drug and non-drug technology assessments, participants struggled to provide an answer and expected the weight to be dependent on the situation (e.g., strength of clinical evidence, interaction of patients with the technology, or burden for the patient). A participant explained: “When asking people to use dialysis at home, PP would have more weight. While as when we are looking at two diagnostic technologies with a difference in diagnostic accuracy, clinicians’ perspectives would have more weight” (CAN_5).

3.1.4 Impact

Canadian participants agreed that the impact of PP would depend on the quality of the PPS. They felt that PP could potentially have an impact when clinical evidence is uncertain. However, one participant argued: “often I hear that we want to see a recommendation change, while for me, it is more about increasing the confidence of the decision” (CAN_2). To evaluate the impact, the participants suggested to compare confidence of people making recommendations on products with and without PP evidence.

3.1.5 Quality

To evaluate the quality of PPS, Canadian participants said they would first look at the preference method used and said a reliable tool to evaluate these methods is needed. Secondly, they would look at who is executing the study, with preference for an independent party. If performed by companies, the company should incorporate advice of the HTA body into the design. Thirdly, participants would look at how generalizable the results might be to Canada (e.g., representativeness of the sample to different Canadian regions). If executed outside of Canada, they would first have to evaluate if the patients could have similar perspectives as Canadian patients.

3.2 Patient Preferences in Belgian HTA

3.2.1 Current HTA Procedures and Value Assessment Criteria

As comprehensive HTAs in Belgium are mostly performed for Class 1 pharmaceuticals (i.e., with added therapeutic value), the discussion focused on those assessments. Participants explained that added therapeutic value is assessed based on efficacy, effectiveness, side effects, user-friendliness and applicability (Supplementary material V, see ESM). Besides therapeutic value, other criteria for value assessments are price, importance in clinical practice, budget impact, and cost effectiveness. Regarding current patient involvement, participants explained that, within the early scientific advice context, patient representatives are invited to discuss clinical trial research questions and outcomes. However, participants mentioned that both time and patients are often lacking to realize this involvement.

3.2.2 HTA Stage

Belgian participants agreed that results from PPS could be useful in early dialog with companies, to ensure patient-relevant outcomes are considered in clinical trial design. Interest in a PPS that covers multiple diseases and frequently used endpoints was observed, as it was believed that such a study could inform multiple HTAs with similar endpoints. For example, a PPS in oncology on survival versus quality of life (QoL) versus progression-free survival rate. Some participants argued that PPS could also inform therapeutic value assessment; “Why not with a two-control step? One at the RCT and one at the reimbursement?” (BEL_2). They wanted PPS to investigate the importance of (a) therapeutic effects and side effects, (b) user-friendliness, and (c) impact of the drug on daily life (e.g., the burden of administration schedules on a patient’s job). In contrast, some participants were more hesitant toward the use of PP, arguing that “we still shouldn’t be paying more as a society for something that has no added value regarding safety and efficacy” (BEL_3). Participants all seemed to agree that PP should not be integrated in the QALY. Although one participant opined that PP could be used to order assessment criteria according to their importance, others overall were not supportive of using PP to weigh criteria in MCDA. They argued that decisions will also be based on criteria not investigated in PPS, like cost and budget impact.

3.2.3 Weight

Opinions on the weight that PP should receive strongly differed between participants, ranging from almost no weight to more weight than other stakeholders. These differences in opinion were also expressed regarding the weight of PP compared with other assessment criteria.

3.2.4 Impact

Opinions on the impact of PPS differed between participants. One participant stated “I am questioning if the outcome would really be different if you add it” (BEL_3), while another representative said “For me, it will increase the empowerment of patients in decision making, also at higher levels like for reimbursement. A good study would be of much higher value than to put one patient in your expert group” (BEL_4). An example was given where a drug was reimbursed that delays the need for dialysis by 10 years. Participants argued that if a PPS had examined the acceptability for patients regarding the need to urinate every hour while using this drug, the drug would have never been reimbursed. Several suggestions were given on how the impact of PP could be evaluated. First, decisions by countries using and not using PP could be compared, notwithstanding differences between populations. Secondly, patient satisfaction could be assessed before and after adding PPS. Lastly, an observation could be made of patients’ acceptance of technologies reimbursed without looking into PP.

3.2.5 Quality

When discussing the evaluation of quality of PPS, Belgian participants explained that they would first question whether a PPS is needed. If so, they would then use their established quality assessment grid for qualitative or quantitative studies, evaluating (a) initiator of the study, (b) description and selection of methods, (c) illustration of results with quotes, and (d) representativeness. However, the latter was seen as very difficult to achieve as selection bias can arise.

3.3 Patient Preferences in German HTA

3.3.1 Current HTA Procedures and Value Assessment Criteria

German participants explained that for their assessments, clinical effectiveness is the main criterion. The added therapeutic benefit is measured by the amount and the probability of gains in patient-relevant outcomes like mortality, morbidity, and QoL (Supplementary material V, see ESM). For non-drugs, they also look at non-inferiority. Potential harm is another essential component, measured by the amount and the probability of harms such as side effects. Lastly, participants also explained that IQWiG could be asked to perform CEAs, but that so far there has only been one commission. Regarding current patient involvement, participants explained that for most early drug assessments (according to the German AMNOG law), information on relevant outcomes and existence of patient subgroups is gained through a questionnaire that is completed by patient organizations. For non-drugs, patients are invited to IQWiG for a face-to-face discussion to identify important outcomes.

3.3.2 HTA Stage

German participants indicated that the limited assessment timeframes may pose a challenge for the integration of PP. However, within the 15-month timeframe for non-pharmaceutical assessments, they have recently found that this challenge can be overcome. Within the 3-month timeframe for early drug assessments, other options are being considered, such as QoL surveys and other patient evidence. Participants mentioned that PPS should investigate (a) burden of administration routes, (b) acceptability of adverse events, (c) trade-offs between benefits and harms, and (d) importance of these benefits and harms. A participant gave the example of prostate cancer screening: “We know that the cancer-specific mortality might reduce; but are patients willing to have the harms of the diagnostic cascade and the overtreatment?” (GER_2). Combining PP with the QALY or using PP to give weights to criteria in an MCDA were said to be possible in theory. However, as IQWiG does not usually execute CEAs, this was not considered of use in practice. Using PPS to inform an MCDA was found to be difficult, as only at the end of the assessment process is it known what outcomes (15 on average) can be supported with clinical data of acceptable quality and will be considered in the assessment.

3.3.3 Weight

Participants explained that the assessment by IQWiG is solely focused on patient-relevant outcomes as their assessment is independent of preferences from all other stakeholders. Besides, participants also said that at the level of IQWiG, discussions on the weight of PP versus other assessment criteria are not relevant as the G-BA decides on this in their appraisal.

3.3.4 Impact

Participants expected the G-BA to consider evidence from PPS in their decisions. One participant gave an example of a breast cancer biomarker test that predicts the probability of relapse and would influence the treatment (chemotherapy) decision. The participant added: “We had to choose a target and it would have been very valuable to know what women in this situation would accept as a risk of relapse” (GER_3).

3.3.5 Quality

To assess the quality of PPS, German participants said they would look at the robustness of the method, independence of the study, representativeness of the sample, selection of attributes, and information given to patients. Moreover, they would assess whether the results are consistent across multiple PPS.

3.3.6 A Patient Preference-Sensitive Situation: Gene Therapy

Regarding the gene therapy example, Canadian and German participants said they would assess whether clinical efficacy is convincing, because if not, it would be difficult for a PPS to change the recommendation. If efficacy is proven, PPS could, according to Canadian, Belgian, and German participants, investigate the acceptability of issues such as adverse events and uncertainties, and the importance of outcomes. Canadian, Belgian, and German participants would describe the appraisal and assessment of the PP evidence in a separate section in their report, and might integrate findings as supportive evidence into other parts of the report and the discussion. Examining subgroups (e.g., age groups) through PPS was found to be interesting by Belgian and German participants, especially for therapies with uncertain long-term consequences. Belgian participants also wondered if PPS could be used to calculate the budget impact of high-cost drugs like gene therapy.

4 Discussion

Through the conduct of focus groups, this study investigated similarities and differences in organizational considerations for the use of results from PPS across countries with different HTA systems.

4.1 HTA Stage

The focus groups showed two main applications for PP in HTA, confirming areas mentioned by the UK HTA body NICE [38, 39]. First, a role in early scientific advice was suggested by Canadian and Belgian HTA representatives to justify unmet medical need and selection of clinical trial endpoints. IQWiG is not involved in early scientific advice and therefore no statements were made on this topic by German participants. Secondly, participants overall agreed that PPS could be used as supportive information to complement clinical evidence during HTA. An assessment of clinical evidence was found necessary before PP could be considered, with PP only being considered if evidence is convincing, as also stated by NICE [39]. Participants from all focus groups wanted PPS to investigate attributes related to benefits, risks, and administration (Table 1). In addition, Canadian representatives were also interested in out-of-pocket costs. With regard to administration attributes, Canadian participants, besides administration routes and schedules, focused on the implications of travel to other regions. This result was not confirmed in the other focus groups and may be of sole importance to Canada due to the size of the country. Regarding the gene therapy example, participants also mentioned that investigating preference heterogeneity between subgroups in PPS could be informative, confirming the statement of other studies and NICE on the importance of investigating preference heterogeneity [19, 38, 40].

Table 1 Overview of attributes that HTA representatives requested to be investigated in PPS

Participants would integrate the results of PPS in a separate section of the assessment report or in the discussion. According to Mott [26], PP can be incorporated into economic evaluations by using them within the calculation of the QALY. NICE recently stated that they do not want to use PP to value health states in CEA as this valuation needs to reflect societal values, rather than patient values, but that PP “could potentially provide additional support that would enable decision makers to make more informed recommendations in cases where the clinical trial data in isolation might not provide a clear demonstration on precisely what the value proposition is for patients” [38, 39]. Results from our Belgian and Canadian focus groups confirm this statement; participants prefer to consider PP separately as supportive evidence. German representatives said that, although integrating PP in the QALY could be possible in theory, it would not be of practical use for IQWiG as they do not often execute CEAs. Another option to incorporate PP into HTA decisions would be to use PP to assign weights to decision-making criteria in MCDA, based on their relative importance to patients [10]. While in previous research this option was identified in theory, HTA representatives until now remained uncertain on its practical use [22]. In the current study, however, neither Canadian, Belgian, nor German HTA representatives were supportive of using PP to assign weights to criteria. HTA representatives wanted to maintain flexibility and control over what is driving the assessment. Moreover, they believed that weighing criteria according to PP would be challenging and explained that only at the end of the assessment process does it become clear what criteria will be considered in the assessment. This means that at the moment that criteria are established, it is too late to conduct a PPS on these criteria to inform HTA. In addition, many criteria (15 on average in Germany) are considered in assessments and it may not be possible to include all in one PPS due to the quantity and their nature (e.g., societal cost and budget impact).

4.2 Weight

Participants overall struggled to determine the weight PP could get in decision making, for different reasons. For the Canadian drug committees, the weight would depend on the situation, even though this conflicts with their aim for consistency. The need for consistency seemed to be less important in device assessments. Among Belgian participants, a high variability in opinions regarding weight of PP was observed, ranging from almost no weight to a substantial weight. In Germany, participants explained that the assessment by IQWiG is solely focused on patient-relevant outcomes and that the decision on the weight of assessment criteria is made during appraisal by the G-BA.

4.3 Impact

HTA representatives from all countries felt PPS could have an impact on recommendations and payer decisions. PP was not found likely to change decisions, but rather to ensure confidence in the decision. Therefore, it was suggested that the impact of PPS could be evaluated by comparing the confidence of people making recommendations on products with and without PP evidence. Previous PREFER work explored which decision-making situations can be sensitive to PP [19, 22, 23]. Therefore, this topic was not revisited in the current study. Instead, a PP-sensitive case was discussed (gene therapy) and it was concluded that in this case PPS could provide additional supportive evidence. Results of current focus groups align with the statement from NICE that the impact of PPS also depends on their quality [38].

4.4 Quality

Participants across all focus groups emphasized that the quality of PPS is crucial. Quality criteria coincided between countries—independence (relating to the initiator), robustness of the method, selection of attributes, information given to patients, and representativeness of the sample; representativeness and independence were also mentioned by NICE as important criteria [38, 39]. To mitigate introduction of bias in the design of PPS by sponsors, Canadian participants suggested that HTA bodies could provide scientific advice on PPS design, confirming the need for collaboration with these stakeholders as previously identified by van Overbeeke et al. [22], and in statements and recent activities by NICE [38, 41].

4.5 Strengths and Limitations

The design of the focus group guide was informed by research conducted previously within the PREFER project [19, 22, 23, 24, 25] and the study of Huls et al. [27]. The interpretation of results was validated by participants. Focus groups were all moderated and assisted by the same researchers (EvO and VF).

Focus groups by nature provide subjective evidence. Therefore, these results reflect how HTA representatives believe PP can be used in HTA, which may differ somewhat from how current and future procedures allow PP to be considered in HTA. Moreover, due to the limited number of participants included in this study, results may not be generalizable to every HTA body or every representative of the HTA bodies and countries represented in this study. As we were unable to recruit G-BA participants, insights gathered during the IQWiG focus group on the potential use of PP by the G-BA could not be validated. Results might be informative for countries with comparable healthcare systems. However, the number of countries involved in this study is limited and therefore generalizability to other countries remains uncertain. Finally, due to the limited pool of participants, it was only possible to perform one focus group per HTA body.

The researchers acknowledge that the number of participants in the German focus group (n = 3) was limited. Group sizes of five to eight participants are generally considered to be ideal for noncommercial focus groups [42, 43]. Small focus groups can be informative if the purpose of the study is to understand an issue or behavior, the topic is complex, and participants per focus group are homogeneous in background and perspectives and have a high level of experience or expertise [42]. Moreover, ‘mini focus groups’ with a minimum of two participants can be informative when there is a small pool of experts who are difficult to recruit [44, 45]. In recruitment for the German focus group, the researchers noticed that the respective HTA body was selective in its recruitment and wanted to ensure that very knowledgeable participants were included. During the conduct of the focus groups, this high level of expertise was confirmed and one participant even had hands-on experience with the conduct of a PPS. Therefore, the German focus group was considered an expert mini focus group and was included in the analysis. Participants in the Belgian focus group overall had less experience with PPS. Nevertheless, one representative was an HTA preference expert involved in the PREFER project and the other participants were very knowledgeable in their areas of expertise (CEA and patient involvement). The Belgian HTA preference expert helped the researchers to first educate the other HTA representatives on PP before any questions were asked.

4.6 Future Research

HTA bodies may not be willing to combine results of PPS with other evidence in HTA. However, they find this information important and would use it as separate supportive evidence. To allow for this use, we think it may be necessary to step away from, or postpone, the discussion on combining PP with other evidence in HTA and to focus on what HTA bodies need in order to use PP in ways they value and accept. To strengthen discussions, we believe it is crucial to invest in training of HTA representatives on PPS, as we noticed that there were only a few candidate participants with expertise in PPS.

5 Conclusions

Across all HTA bodies, an interest in the use of PP was observed for scientific advice and value assessments. PP may not receive a fixed weight in assessments, but is likely to have an impact on payer decision making if PPS are of acceptable quality. Findings of this study were overall similar between HTA bodies. Small differences were observed in the type of attributes that HTA representatives want PPS to investigate, but in general they highlighted attributes related to benefits, risks, and administration. In the near future, it may be impossible to achieve structural integration of PP with other evidence in HTA, but HTA bodies are willing to incorporate PP evidence in separate HTA sections and more efforts should be made to meet their needs.