Public responses to the sharing and linkage of health data for research purposes: a systematic review and thematic synthesis of qualitative studies

The past 10 years have witnessed a significant growth in sharing of health data for secondary uses. Alongside this there has been growing interest in the public acceptability of data sharing and data linkage practices. Public acceptance is recognised as crucial for ensuring the legitimacy of current practices and systems of governance. Given the growing international interest in this area, this systematic review and thematic synthesis represents a timely review of current evidence. It highlights the key factors influencing public responses as well as important areas for further research. This paper reports a systematic review and thematic synthesis of qualitative studies examining public attitudes towards the sharing or linkage of health data for research purposes. Twenty-five studies were included in the review. The included studies were conducted primarily in the UK and North America, with one study set in Japan, another in Sweden and one in multiple countries. The included studies were conducted between 1999 and 2013 (eight studies selected for inclusion did not report data collection dates). The qualitative methods represented in the studies included focus groups, interviews, deliberative events, dialogue workshops and asynchronous online interviews. Key themes identified across the corpus of studies related to the conditions necessary for public support/acceptability, areas of public concern and implications for future research. The results identify a growing body of evidence pointing towards widespread general, though conditional, support for data linkage and data sharing for research purposes. Whilst a variety of concerns were raised (e.g. relating to confidentiality, individuals’ control over their data, uses and abuses of data and potential harms arising), in cases where participants perceived there to be actual or potential public benefits from research and had trust in the individuals or organisations conducting and/or overseeing data linkage/sharing, they were generally supportive. The studies also find current low levels of awareness about existing practices and uses of data. Whilst the results indicate widespread (conditional) public support for data sharing and linkage for research purposes, a range of concerns exist. In order to ensure public support for future research uses of data, greater awareness raising combined with opportunities for public engagement and deliberation is needed. This will be essential for ensuring the legitimacy of future health informatics research and avoiding further public controversy.


Background
Since the publication of the World Medical Association's Declaration on Ethical Considerations regarding Health Databases in 2002, which stated that "databases are valuable sources of information" for health research, quality assurance and risk management [1], there has been steady and significant growth in the sharing of health data for 'secondary uses'. The Medical Research Council (MRC) and Wellcome Trust ([2], p.6) note that "recent years have brought many calls for the optimisation of data sharing for research, with the intention of deriving maximal societal benefit".
Recently this commitment to expanding research uses of data has led to growing interest in the public acceptability of data sharing and data linkage practices (e.g. [3]). This relates, in part, to the recognition of the importance of ensuring that data uses align with public interests or preferences. Recent highly publicised controversies (for example relating to care.data in England) have drawn attention to the importance of ensuring public support for the ways that data are used. Thus, there is increasing attention to public acceptability of secondary uses of data and to ensuring that these uses are understood and supported by the wider public (from whom the data originate). This may be crucial for ensuring the legitimacy of current practices and systems of governance. As Bradwell and Gallagher [4] have suggested, "personal information use needs to be far more democratic, open and transparent" and this means "giving people the opportunity to negotiate how others use their personal information in the various and many contexts in which this happens" (pp. 18-19).
Previously it was noted that the literature in this area was dominated by practitioner perspectives and public views were underrepresented or underreported [5]. However, over the last decade there has been a steady increase in the number of studies exploring public attitudes or acceptability of secondary uses of data. Such studies have been conducted in a range of contexts and in relation to various research practices. Qualitative studies in the field of medical and healthcare research have, historically, tended to receive less attention than quantitative studies. However, despite qualitative studies usually being based on small sample sizes that prohibit claims to being statistically representative [6,7], they can provide rich insights and a deeper understanding of the complexities or nuances of public opinions and experiences. They also allow for public views to be interpreted in a way that can effectively inform policy and practice issues [8]. Recently, reports discussing public views toward data sharing or data linkage for research purposes have principally used qualitative methods [3,9,10], exemplifying the value of such approaches for exploring the challenges and complexities of this topic.
Data-sharing and data-linkage refer to two distinct processes which are used in different ways. Data-sharing involves information moving from one organisation or department to another, whereas data-linkage is defined as: "the bringing together from two or more different sources, data that relate to the same individual, family, place or event" [11]. Increasing amounts of health research are conducted through data-linkage; for example, health-related records have been linked with population registries [12], alcohol and drugs services [11], genealogical registries [11], the census [13,14], the education system [15] and the prison service [16]. Such linkages have enabled, among other things, examination of relationships between social factors and health or access to health services. This paper reports the results of a systematic review and thematic synthesis of qualitative studies which have explored public attitudes to data-sharing or data-linkage for research purposes. The study aimed to address the following research question: What are the key issues of public responses in data-sharing and data-linkage for research, and how have these been characterised?
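To make the definition of data-linkage quoted above more concrete, the following is a minimal illustrative sketch in Python. It is not drawn from any of the included studies: the record fields, the values and the `link_records` helper are all hypothetical, and it assumes the simplest possible case in which both sources already share a common identifier.

```python
# Hypothetical records held by two separate sources.
# All field names and values are invented for illustration only.
health_records = [
    {"person_id": "A1", "diagnosis": "asthma"},
    {"person_id": "B2", "diagnosis": "diabetes"},
]
education_records = [
    {"person_id": "A1", "highest_qualification": "degree"},
    {"person_id": "C3", "highest_qualification": "school"},
]


def link_records(left, right, key="person_id"):
    """Bring together records from two sources that relate to the same individual."""
    # Index the second source by the shared identifier for quick lookup.
    right_by_key = {row[key]: row for row in right}
    linked = []
    for row in left:
        match = right_by_key.get(row[key])
        if match is not None:
            # Merge the two records into one combined record.
            linked.append({**row, **match})
    return linked


linked = link_records(health_records, education_records)
print(linked)
```

Only the individual who appears in both sources yields a linked record. Data-sharing, by contrast, would correspond simply to passing one of the lists from one organisation to another without combining it with anything.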
This paper reports key themes that have emerged through this thematic synthesis and discusses their relevance for current debates around secondary uses of data for health research. Given the growing international interest in this area this represents a timely review of current evidence. It highlights the key factors influencing public responses and in doing so identifies particular topics of salience which it will be important to examine further.
Throughout this paper the terms 'review', 'researcher(s)', 'participant(s)' and 'author(s)' will be used to refer to this systematic review, the authors of the included studies, the research participants of each study and the authors of this paper, respectively.

Search strategy and inclusion criteria
A systematic literature search was conducted of five electronic databases (CINAHL Plus, EMBASE, Medline, Scopus and Web of Science) on 4 April 2014. Table 1 displays the key search terms that were tailored for all databases using both free-text terms and subject headings where possible (see Appendix 1 for an adapted search strategy for Medline). In addition, searches were conducted through Google Scholar and Open Grey as well as scanning references of included papers and contacting experts for a more inclusive result. There were no limitations on publication dates, languages or geographical locations.
The initial database searches revealed 1502 papers. Two authors (M.A. and J.S.J.) separately screened titles and abstracts and read eligible full texts before reconvening to discuss their results and resolve any discrepancies. Figure 1 shows the search and selection outcomes for each stage of the process. An additional 19 papers were identified through other sources (hand-searching references, expert communications and grey-literature searches). Papers were included if they met all inclusion criteria (see Table 2).
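The counts reported for the search can be cross-checked with simple arithmetic. The short Python sketch below uses only figures stated in this paper (1502 database hits, 19 papers from other sources, 25 included studies); the variable names are illustrative.

```python
# Counts reported in the text of this review.
database_hits = 1502      # papers found by the initial database searches
other_sources = 19        # hand-searching, expert communications, grey literature
included_in_review = 25   # studies included in the review

total_identified = database_hits + other_sources
not_included = total_identified - included_in_review
share_included = round(100 * included_in_review / total_identified, 1)

print(total_identified, not_included, share_included)
```

The total agrees with the 1521 studies reported as identified from the systematic searches, and the included studies amount to roughly 1.6% of all records identified.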

Quality appraisal and data extraction
Each included study was individually appraised and data were extracted by the same two authors. Table 3 displays the main characteristics of each study including the study aim, date of data collection, setting, sample characteristics, sampling and method of data collection. The Critical Appraisal Skills Programme [17] checklist was used to critically assess the qualitative research. It was agreed that all studies were of sufficient quality to be included in the study. The CASP checklist represented a valuable tool for facilitating critical reflection on each of the studies.

Synthesis
A thematic synthesis approach was adopted using Thomas and Harden's [18] three-step technique: free line-by-line coding of the included studies, the emergence of descriptive themes from the codes and the development of analytical themes. Independently, M.A. and J.S.J. coded the included studies using an inductive approach without a priori codes.
All authors met to discuss the codes/themes and to resolve any discrepancies. Three authors (R.J., C.P. and S.C.B.) were each assigned three articles (totalling nine) from the included studies to validate the findings of M.A. and J.S.J. At this stage ten further studies were excluded from the synthesis for not reporting participants' verbatim views or first-order constructs (Britten et al. 2002); not reporting detailed qualitative findings (Sandelowski and Barroso 2003); not reporting findings relevant to the research topic; or for not including public responses. A list of descriptive themes (referred to as 'sub' themes) was agreed and organised by analytical ('key') themes. The key themes were identified and interpreted in relation to the research question (see Table 4). From the included studies, M.A. and J.S.J. extracted first- and second-order constructs (the latter being the original researchers' interpretations of the participants' constructs) including any reciprocal or refutational translations (comparable or opposing views) [19].
While three authors (M.A., S.C.B. and C.P.) work directly in the field of public engagement regarding health informatics research, the remaining two authors (J.S.J. and R.J.) were not previously familiar with the literature or debates in this area. The involvement of authors without prior understandings or perspectives on the literature was valuable for ensuring an inductive approach. The authors discussed and deliberated the coding and analysis to ensure that the findings emerged from the included studies rather than being shaped by or confirming the expectations of authors who are actively engaged in this subject matter.

Included Studies
A total of 1521 studies were identified from the systematic searches. From these, 25 studies were included in the review. The research was conducted primarily in the UK (five studies in Scotland, four in England, one in Wales and two across the UK) and in North America (seven studies in the USA and three in Canada) with one study set in Japan, another in Sweden and one worldwide. Data were collected from 1999 to 2013, though eight studies did not report data collection dates. The research participants included patients, service users, carers, surrogate decision-makers, lay persons and the general public ranging from 18 years of age to over 75 years. Six studies reported expert opinions from healthcare professionals, managers, health service staff and diabetes specialists in addition to the views of members of the wider public or patient groups. The qualitative methods of data collection included focus groups, interviews, deliberative events, dialogue workshops and asynchronous online interviews. Six studies included mixed methods using surveys or structured questionnaires. Additionally, three studies reported both primary and secondary research including a literature or policy review or systematic review.

Table 1 Key search terms
(lay OR patient* OR public OR citizen$1) AND (attitude* OR view$1 OR perspective* OR opinion$1) AND (data OR record$1) AND (share$ OR sharing OR link$2 OR linkage) AND Research AND (access OR purpose$1) AND (qualitative OR ethnograph* OR "grounded theory" OR "in depth interview$1" OR "structured interview$1" OR "focus group$1" OR "case study" OR "case studies" OR "case series" OR "citizen$2 juries" OR vignette* OR observation*)
Asterisks ("*") are used as a wildcard to allow for any given search term to be truncated or remain the same.
Dollar signs ("$") followed by a number refer to the number of additional characters allowed for each selected term.
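The truncation notation described in the notes to Table 1 can be illustrated with a small Python sketch that translates a single search term into a regular expression. This is only an approximation of the behaviour described in the notes, not the syntax of any particular database interface. The helper names `term_to_regex` and `matches` are hypothetical, "additional characters" are assumed to mean word characters, and a bare "$" with no number (as in "share$") is assumed to behave like "*".

```python
import re


def term_to_regex(term):
    """Translate one search term written in the Table 1 notation into a regex."""
    parsed = re.fullmatch(r"(.*?)(\*|\$(\d*))", term)
    if parsed is None:
        # No truncation symbol: the term must match exactly.
        return re.compile(re.escape(term), re.IGNORECASE)
    stem, symbol, limit = parsed.groups()
    if symbol == "*" or not limit:
        # "*" (and, by assumption, a bare "$"): any number of extra characters.
        suffix = r"\w*"
    else:
        # "$n": at most n additional characters after the stem.
        suffix = r"\w{0,%d}" % int(limit)
    return re.compile(re.escape(stem) + suffix, re.IGNORECASE)


def matches(term, word):
    """Return True if the whole word is covered by the search term."""
    return term_to_regex(term).fullmatch(word) is not None


for term, word in [("citizen$1", "citizens"), ("link$2", "linked"),
                   ("link$2", "linkage"), ("attitude*", "attitudes")]:
    print(term, word, matches(term, word))
```

Under these assumptions "link$2" covers "links" and "linked" but not "linkage", which may be why "linkage" appears as a separate term in the search string.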
Seven key themes were identified across the included studies: Widespread Conditional Support; Conditions for Support; Benefits; Control and Consent; Uses and Abuses of Data; Private Sector Involvement; and Trust and Transparency.

Key Themes

Widespread Conditional Support
The included studies point to a clear trend of widespread, albeit conditional, support for uses of data in health research. 1 This is typically expressed in relation to a view that health research, or research more broadly, is "in the public interest" or is expected to bring about benefits for "the greater good". 2 For example, one participant in study number 25 stated:
"I think the medical research is going to be of general benefit to the general population and if my records can help; I think personally I would be quite willing to participate in any medical study that is of general benefit to the population. I just feel it is worthwhile to participate in these studies" (Patient 4, Willison et al. 2003: 2)
Uses of data for health or medical research were often conceptualised in relation to the potential for discovery of new cures or treatments, or the improvement of healthcare services. In several studies participants were reported as being surprised that data are not already more widely used, with questions being asked such as: "Doesn't this happen already?!". 3 Many studies reported that participants considered research uses of data to be in the public interest and conversely that not using data was against the public interest since this was a resource which should be used, not wasted. 4
Despite broad agreement that using health data for medical research is generally a good thing, across the studies it is evident that support for these data uses was never unconditional. A number of factors were identified as being important conditions for public support or acceptance.

Conditions for Support
In a large number of studies assurances of individuals' confidentiality were reported as crucial for public support. 5 Whilst confidentiality may be assured through various mechanisms, in the included studies this was largely associated with anonymisation of data. Public preferences for data to be anonymous were widely reported; 6 for example, in one study 7 a participant stated: "[The public need] reassurance about anonymity because that's what people worry about". Some individuals expressed a view that if the data are anonymous "what does it matter?!". 8 However, others noted that anonymisation is not an absolute guarantee of confidentiality 9 and in a number of studies participants recognised that the anonymisation process is imperfect and therefore does not fully or adequately protect individuals' confidentiality. 10 For example, it was said:
"I think you're right enough, it's anonymised. But then if you're dealing with particular areas, that again kind of cuts in to the anonymous factor, because if you're looking at maybe, let's say, a housing estate, so there's only so many people, so it's not…I don't think there's anything that's truly anonymous; I think everything can be found out if you've got the wherewithal and the curiosity to find things out." 11
In a number of studies participants made a distinction between "plain stats" and more detailed qualitative information, with the former largely considered not to be concerning while the latter raised greater issues relating to confidentiality and privacy. 12 Assurances of safeguards to protect against misuse or abuse of data were also widely considered important for ensuring public support/acceptability. 13 Similarly, members of the public often expressed a preference for strong accountability mechanisms to be in place. 14 However, there was generally found to be low public awareness of current research practices 15 and in particular, of current governance or ethics processes. 16 As such, in a number of studies it was reported that public acceptance increased after participants were informed about existing safeguards and governance mechanisms.
Assurances of data security were also found to be important for public acceptance of the use of health data in research 17 and across the studies concerns about data security were widely identified. 18 Such concerns related to the fallibility of IT systems to protect against breaches 19 as well as to human error. Media reports of "laptops left on trains" or misplaced data were widely called upon to illustrate this latter point. 20 However, in a number of studies it was reported that participants regarded breaches of security as always being possible, yet security risks were sometimes regarded as tolerable or acceptable where individuals valued the purpose and potential benefits of research. 21 A further condition for public support was that data would only be used for legitimate purposes. Whilst the term "legitimate" was not always referred to explicitly, the included studies often suggested or concluded that the extent to which members of the public perceived uses of data to be legitimate influenced their responses or preferences. 22 However, there were varying views on how, or by whom, legitimacy was to be defined.

Table 2 Inclusion and exclusion criteria
Included: All studies that included first-order constructs directly reporting participants' responses, such as quotations [19]. All qualitative studies that provided exploratory, descriptive and/or explanatory findings [38].
Excluded: All studies based on secondary evidence. All studies that were solely based on quantitative research. All studies that did not include first-order constructs directly reporting participants' responses, such as quotations [19].
Study Population
Included: All studies researching public (including patient/lay) perspectives. Studies involving public and expert opinions were included if public and expert responses were reported separately.
Excluded: All studies that did not report public perspectives.
Research Topic
Included: All studies that discussed the sharing and linkage of data in research.
Excluded: All studies that did not discuss the sharing and linkage of data in research.

Table 3 Main characteristics of each study (fragments)
Sampling: Quota sampling. Method of data collection: Primary and secondary research methods: Literature reviews (N = 2) and a series of deliberative events (4 half-day events with members of the general public and a separate, smaller scale event of LGBT people).
Sampling: Purposive and convenience sampling. Method of data collection: Focus groups with patients (N = 7); additional focus group with health service researchers (N = 1); semi-structured interviews with health service staff (N = 17).
Study aim: To explore public attitudes, predominantly of African Americans, toward data sharing in genetic and/or genomic research and the possible impact of said practices on research involvement.

Table 4 Key themes and sub-themes (study numbers follow each sub-theme)
Health research or more general research is typically "in the public interest" or will benefit "the greater good". Studies: 1, 3, 11, 14
Assurances of safeguards in place to protect against misuse or abuse of data were considered important for ensuring public support and/or acceptability.
Concerns about data security were widely identified. Studies: 1, 5, 6, 8, 11, 14, 20, 22, 24
Concerns about data security related to the fallibility of IT systems to protect against breaches. Studies: 5, 8, 11
Concerns about data security related to human error were widely called upon. Studies: 5, 6, 8, 11, 23
Breaches of security were regarded as always being possible, yet security risks were sometimes said to be tolerated or accepted where individuals valued the purpose and potential benefits of research. Studies: 8, 10, 22
Public support was conditional if data would only be used for legitimate purposes. Studies: 8, 11, 19, 21, 22, 25
Benefits
Key condition for public support for research using individuals' data was that such research must have public benefits.
Public acceptance of opt-out models in recognition of the challenges or practical limitations of opt-in. Studies: 10, 19
Public preference for varied or flexible consent models which would enable individuals to set limits on their consent, or to indicate particular preferences or objections. Studies: 4, 5, 12, 18, 22, 23, 24
Public objections to one-time consent models which would not allow individuals to review or change their consent preferences. Studies: 10
Consent regarded as important in relation to named or identifying data. Studies: 5, 20, 21
Consent regarded as important in relation to qualitative information rather than "plain stats". Studies: 3
Consent regarded as important in relation to research using genetic data. Studies: 18, 19, 24
Consent regarded as important in relation to where a commercial entity is involved in research. Studies: 18
Consent was widely viewed to be important and in this regard, represented as an act of courtesy. Studies: 17, 21, 23
Concerns about the proliferation of data within modern societies and increasing surveillance through data collection: "Big Brother Society". Studies: 5, 11, 13, 14, 22
Concern related to the potential for stigma or discriminatory treatment to result from research which would label or categorise groups within society. Studies: 1, 5, 6, 8, 11, 14, 19
Concerns relating to potential indirect negative impacts on individuals from participating in research (e.g. increased or denied insurance premiums due to information being accessed from medical records, etc.). Studies: 1, 5, 6, 9, 11, 12, 17, 24
Participants made differentiations between types of data and regarded some as more sensitive, and concerning, than others (e.g. mental health, sexual health, sexuality and religion).
Distinctions were made between research perceived to be "for profit" and research perceived to be "for the greater good". Studies: 6, 7, 10, 11, 21, 22, 25
Distinctions were made between "research purposes" and "commercial gain". Studies: …, 22
Participants wanted assurances that public benefits would be prioritised over profit. Studies: 6, 10, 14, 18, 21, 25
Participants wanted assurances that individuals' privacy would be prioritised over profit. Studies: 18
Participants wanted assurances that profits would be shared or reinvested so as to create public/societal benefits. Studies: 6
Participants felt it was appropriate that private sector organisations pay for access to public sector data. Studies: 6, 11, 22
Acceptance of private sector organisations paying for access to public sector data if the revenue generated is appropriately re-invested in the public sector. Studies: 17, 25

Public Benefits
Another key condition for public support for research using individuals' data was that such research must have public benefits. 23 Whilst in some cases perceived personal benefits, or personal relevance of research, was reported to motivate participation in research, 24 benefits of research were largely conceptualised in terms of benefits to wider society, or "the greater good". For example, study participants said: "…We wouldn't have the national health service, we wouldn't have drugs, we wouldn't have anything, if it hadn't have been for people being allowed to try things […]"

In many cases it was reported that concerns relating to personal privacy were balanced with recognition of the importance of societal benefits anticipated to come from research. 26 Moreover, in two studies it was reported that some participants prioritised societal benefits over personal privacy. 27 Assurances that research would bring about public benefits, or at least that it had the potential to bring about such benefits, were widely reported to be fundamental for ensuring public support or acceptance. If research is perceived to be focussed primarily on benefitting individual researchers (e.g. through advancing their careers or raising their profile), as having no clear practical application or "real-world" value, or as being conducted solely for profit, this leads to concerns about, and opposition to (or at least less support for), research uses of data. 28

Table 4 (continued)
Widespread concern about private sector involvement in research balanced by recognition that private sector involvement in research can be important or valuable. Studies: 8, 11, 21, 22
Private sector involvement represented as a "necessary evil". Studies: 6, 7, 8
The private sector was not regarded as a homogenous entity, but rather distinctions were made between private sector organisations. Studies: 6, 8
Private sector involvement was acceptable as long as commercial actors did not have access to data. Studies: 15
Concerns about private sector organisations as funders of research and the implications this may have for the integrity or objectivity of the research.
Higher levels of trust in the public sector compared to the private sector, largely related to greater confidence in accountability and data protection mechanisms within the public sector. Studies: 6, 11, 21
High levels of public trust in primary healthcare providers.
Preference that data-sharing and research uses of data to be overseen within, and governed by the public sector. Studies: 5, 6, 11
Preference that such processes are overseen and controlled by healthcare professionals (e.g. known/familiar individuals). Studies: 5, 14, 15
To oversee and govern data-sharing and research uses of data may be overly burdensome to healthcare professionals and take valuable time and resources away from the provision of healthcare. Studies: 12, 14, 18, 21
The importance of awareness raising to build trust and public support is emphasised in certain studies. Studies: 6, 7, 11, 14, 15, 18, 19
There is public interest and enthusiasm for more meaningful forms of public engagement/involvement. Studies: 5, 6, 11
Public engagement/involvement is essential for ensuring accountability.

Control and Consent
Perceived autonomy, or individual control over how data are used, was found to be a key factor shaping public responses in a number of studies. 29 It was reported that members of the public valued having control over their own data. 30 Such control relates to what data are collected, who has access to them, how and with whom data are shared and for what purposes the data are used. In a number of studies participants explicitly referred to this control in terms of individual or human rights. 31 Whilst perceived individual control clearly emerged as a key factor shaping public attitudes or acceptance of research uses of data, there was no clear consensus (across or within the studies) regarding what this control implied or necessitated. In some studies there was a clear link between levels of trust in research organisations or data controllers and desired level of individual control. 32 This suggests that where individuals trust organisations handling their data they are less likely to favour more stringent forms of control. Conversely, when this trust is lacking individuals want to have greater control over their own data.
Preferences for control are also influenced by wider attitudes towards the value of research. In a number of studies it was found that, whilst individual control was highly valued, participants did not want this control to come at the cost of creating barriers to research. Thus it was often found that participants felt that individual control needs to be balanced with efficiency of research. 33 Across the included studies control is largely discussed in relation to consent. There is evidence that members of the public also made this association and recognised consent as a mechanism for facilitating individual control. 34 However, both between and within studies there were varied views on consent and what form this should take. 35 Some studies indicated public preferences for explicit opt-in consent models, 36 whilst an acceptance of opt-out models was also reported due to recognition of the challenges or practical limitations of opt-in. 37 In a significant number of studies there was a clear preference for varied or flexible consent models which would enable individuals to set limits on their consent or to indicate particular preferences or objections. 38 Similarly, some studies reported that participants objected to one-time consent models which would not allow individuals to review or change their consent preferences. 39 This relates to the fact that public opinions or preferences are not fixed but change and adapt in response to information, deliberation, events or circumstances. 40 Whilst consent was widely valued as a mechanism for facilitating individual control in many studies, it was also recognised to be problematic. 41 In particular participants in the studies acknowledged the potential for selection bias or low participation rates if explicit opt-in consent is required. Such recognition led to some individuals becoming more inclined to support opt-out consent models or non-consented uses of data; however, this trend was certainly not universal and others maintained that consent was always important.
The included studies highlight a number of areas where consent was regarded as particularly important, for example in relation to named or identifying data, 42 qualitative information rather than "plain stats", 43 research using genetic data 44 or where a commercial entity is involved in research. 45 Where consent was acknowledged to be problematic and/or where individuals reported that they were largely unconcerned about research uses of data, consent was nevertheless widely viewed to be important. In a number of studies consent was in this regard represented as an act of courtesy with participants suggesting that they would be happy to allow their data to be used for research but that this should nonetheless not be used without their permission. 46

Uses and Abuses of Data
A key area of concern regarding research uses of data related to the potential for data to be misused or abused. 47 In some cases this related to concerns that individuals with access to data would use it maliciously or inappropriately, for example it was stated that:
"there are some people, [that] regardless of the consequences will defy rules and regulations to justify their existence or to prove they can do it…" (Damschroder et al. 2007: 231)
In other instances these concerns related to data being sold or passed on to third parties 48 and used for commercial purposes, e.g.:
"What I don't like is any information being passed on to a third party, for promotion purposes. Say you've got a particular problem then it goes to a drugs supplier or something like that, that I would object to." (Participant 4, group 1, Hill et al. 2013: 6)
There was also concern about data being used for political purposes, 49 e.g.:
"If the Government are using the details for the benefit of society, I think that's okay. But if the Government are using that data to then look at their next election campaign, or look at the independence campaign by looking at the demographics of a particular area, then I don't know if that's as acceptable. They'[d] simply be using our data for their own goals" (Female, aged 18-34, Glasgow, Davidson et al. 2013).
Some participants in the studies expressed concerns about potential future uses of data. 50 While current uses or research objectives may be regarded as acceptable participants expressed scepticism that such uses would remain clearly defined and limited. Some study participants were worried about potential "slippery slopes" with more and more information becoming accessible 51 or with data being used for purposes other than those which were originally described. 52 There were also concerns about the proliferation of data within modern societies and increasing surveillance through data collection. For some these concerns were expressed in relation to the creation of a "Big Brother Society", 53 e.g.: A significant area of concern related to the potential outcomes or implications of research. In particular, study participants were concerned about the potential for stigma or discriminatory treatment to result from research which would label or categorise groups within society, 54 e.g.: "I think research maybe tends to lump everybody together, and there must be individuals that would be totally different […] so it could lump everybody together and maybe that's not what we want." (Tayside-Female4, Aitken 2011: 12) "Some universities might feel: 'we don't want to involve people from areas of deprivation, because we know they are less likely to finish their course and that's bad for us, for our figures'" (Male, oldest age group, Edinburgh) (Davidson et al. 2013: 70) There were also concerns relating to potential indirect negative impacts on individuals from participating in research. 55 For example, a frequent concern related to potential for insurance premiums to increase or be denied as a result of information being accessible from medical records. Additionally there was concern that employers may gain access to information which could be used to the detriment of individual employees. 
Participants were concerned that data which was shared could be accessed and used in ways which could be harmful for individuals. Such concerns were particularly salient in relation to more sensitive forms of data. Across the studies it was reported that participants differentiated between types of data and regarded some as more sensitive, and more concerning, than others. 56 Examples of particularly sensitive forms of data include data relating to mental health, sexual health, sexuality and religion.

Private Sector Involvement
Across the studies there was significant concern about private sector involvement in research using individuals' data. 57 Such concerns largely related to two key factors: low levels of public trust in the private sector 58 and a perception that private sector organisations are primarily, or solely, motivated by profit. 59 Across the studies participants often made distinctions between research which was perceived to be "for profit" and research perceived to be "for the greater good". 60 Similarly, distinctions were made between "research purposes" and "commercial gain" 61 as if they were opposing motivations. As noted above, the creation of public benefits from research was widely regarded as an essential prerequisite for public support or acceptance. Therefore, where participants regarded research to be conducted for purposes other than creating public benefits, this raised concerns.
However, such concerns did not necessarily mean outright opposition to private sector involvement in research. Profit-creation resulting from research was regarded as acceptable under certain conditions. Notably, the included studies indicated that participants wanted assurances that public benefits would be prioritised over profit, 62 that individuals' privacy would be prioritised over profit 63 and that profits would be shared or reinvested so as to create public/societal benefits. 64 Additionally, while there were concerns about individuals' data being sold, studies which explored private sector access to public sector data found that participants often felt it was appropriate that private sector organisations pay for access to these data 65 and that this would be regarded as acceptable on the condition that revenue generated is appropriately re-invested in the public sector. 66 While there was widespread concern about private sector involvement in research, this was often balanced by a recognition that private sector involvement in research can be important or valuable. 67 In some cases private sector involvement was represented as a "necessary evil", 68 e.g.: "… the drug companies are just trying to make money, and yes of course they are, it's all about money in the end of the day but if they don't find the research for some of these the less interesting or less topical things then they, there will not be research into those things… we need to get funding from drug companies anyway, if they're the ones with the money." Thus profit-creation was regarded by some study participants as an incentive for private sector organisations to conduct valuable research in the public interest.
Overall, the included studies demonstrate that members of the public hold nuanced and complex views regarding private sector involvement. It is noteworthy that the private sector was not regarded as a homogenous entity, but rather distinctions were made between private sector organisations. 69 There was also acknowledgement of the different roles that private sector organisations can play in research. For example, it was reported in one study that private sector involvement was acceptable as long as commercial actors did not have access to data. 70 Other studies reported concerns about private sector organisations as funders of research and the implications this may have for the integrity or objectivity of the research. 71 Whilst low trust in private sector actors is frequently reported, the included studies also demonstrate complex or ambivalent relationships of trust in actors from other sectors. For example, several studies identified ambivalent views on government research 72 and concern about government access to data. 73 Additionally, whilst some studies reported high levels of trust in universities and academic researchers, 74 one reported a lack of trust in university researchers. 75 Thus relationships of trust are not straightforward and there does not appear to be a clear or static hierarchy of trusted organisations/sectors.

Trust and Transparency
Trust is a key theme running through all of the included studies (both implicitly and explicitly). A number of studies indicated that the level of trust individuals place in research organisations, oversight bodies or government informs their level of support for research uses of data. 76 The included studies indicate that trust is essential for ensuring public acceptance and/or participation in research. 77 As noted above, relationships of trust are nuanced and complex. However, the included studies indicate generally higher levels of trust in the public sector compared with the private sector, largely related to greater confidence in accountability and data protection mechanisms within the public sector. 78 There is also evidence of particularly high levels of public trust in primary healthcare providers. 79 This reflects a trend of higher levels of trust in known or familiar individuals or organisations, 80 which was exemplified in study participants' confidence in particular healthcare professionals to make good judgements on access to patients' data: "I know my physician well enough to have a good feel for the types of things he would be involved with". It also leads to individuals preferring to be contacted only by healthcare professionals, or known individuals: "I am happy to have personal contact with our hospital, GP or the health professionals who knows me, but I am not happy being contacted by a Pfizer company, or whatever" (MRC & Ipsos-MORI 2007: 19).
Participants in the included studies often expressed a preference that data-sharing and research uses of data be overseen within, and governed by, the public sector. 81 In some instances there was a preference for such processes to be overseen and controlled by healthcare professionals (i.e. known/familiar individuals). 82 However, some study participants acknowledged that this may be overly burdensome and take valuable time and resources away from the provision of healthcare. 83 The importance of relationships and familiarity to trust is indicative of a broader desire for greater transparency about research practices. The included studies overwhelmingly suggest an appetite among study participants for more information about current research practices and uses of data. 84 Transparency about how data is used in research is considered crucial for building public trust, and thereby securing public support. 85 Moreover, many of the included studies point towards the importance of awareness raising for building trust and public support. 86 However, the included studies highlight that the public should not be conceived of as simply subjects of information provision relating to research uses of data. Rather, several studies indicate public interest in, and enthusiasm for, more meaningful forms of public engagement/involvement. 87 Such involvement was considered essential for ensuring accountability. 88

Differences between studies
It is not possible to make clear or consistent comparisons between the findings of the included studies due to different social and cultural contexts. For example, in a Japanese study 89 participants were reported to describe "unequal relationships" between patients and doctors with patients belonging to a "lower rank". This may reflect (actual or perceived) traditional doctor-patient relations in Japan that are more hierarchical and paternalistic [20]. However, discussions of unequal relationships in other studies were not explicitly reported, though some study participants may have implicitly referred to them. Diverse study populations also limit the comparability of the findings. These smaller populations include U.S. veterans reporting higher levels of trust and greater support for research by Veterans Affairs; 90 African Americans expressing lower willingness to engage in genetic/genomic research due to past abuses; 91 and LGBT participants in the U.K. concerned about the misuse of data, particularly identifiable data, that could lead to discriminatory opinions and behaviour. 92 These findings build on previous research reporting concerns over the underrepresentation of minority populations in research, such as African Americans [21][22][23] and LGBT people [24]. While these views may not be comparable to other contexts, they are essential to understanding the needs of different social groups to better inform a wide variety of policies and practices. Despite variations in opinion, the overall views of these study populations were consistent with the general findings of the thematic synthesis.
A further limitation of the review was the underrepresentation of young people across the studies. The few studies that compared age groups reported variations in opinion. Two studies reported that younger participants expressed greater concerns about privacy and a desire for control over research data. 93 Another noted that some felt "anxious" about their data being held while others believed they had little control over their own information. 94 In contrast, older participants were reported to favour less individual control 95 or to be less worried about the possible loss of confidentiality. 96 Previous research by Buckley et al. [25] equally commented on the lack of participation of younger people in their study. The few who responded were more cautious about the use of their medical information compared to older participants. However, the researchers were wary of these results due to the unrepresentativeness of the sample. Additionally, there are some contradictory findings; for example, King et al. [26] found that younger participants and older respondents over the age of 60 were less concerned about the privacy of their health information compared to participants in the mid age range. King et al. [26] suggested this may be due to the "carefree" nature of younger generations who were perceived to be more willing to share their personal information (e.g. on social-networking sites) and older respondents who are no longer invested in their career and therefore under less scrutiny. More recently the Wellcome Trust [3] found a non-linear relationship between acceptance of commercial access to health data and age and noted that young people are not automatically more supportive/accepting. These varying and, at times, conflicting findings point to the need for further research to explore the variations in perceptions and opinions across age groups.
Finally, the authors conducted a broad search of public responses to data sharing and data linkage in research that included studies looking at genetic data 97 and medical-records data. 98 These topics were considered together with other papers discussing health, personal or administrative data or information for statistical, health, social or other research purposes. Some studies suggested that genetic data is particularly sensitive 99 or personal/potentially identifying. 100 In one study, participants perceived genetic data to be potentially less sensitive than information from medical records (e.g. information relating to reproductive or mental health). 101 Participants in another study reported no real variation in attitudes toward the use of medical records and biological samples. 102 In some studies, linking medical records data to biological samples raised concerns. 103 However, overall opinions were largely consistent with the key themes of this review.

Discussion
The included studies point towards widespread support for uses of data in research, including for practices of data-linkage and data-sharing. However, this support is never unconditional. Key conditions for public support or acceptance relate to the research being in "the public interest" or for "the greater good" and to public trust in researchers or organisations handling/accessing their data. The themes of public benefits and public trust run through all the studies (explicitly or implicitly) and underpin all other areas of concern or interest. As has been noted elsewhere [27], trust, or trustworthiness, is increasingly recognised as being central in shaping public responses. However, the included studies do not point to clear relationships or hierarchies between particular areas of concern or conditions for support, and there is a lack of evidence relating to the ways in which trade-offs might be made or how preferences would be formed in reality. This may represent a valuable area to explore further in future research.
As the literature in this area has frequently observed, confidentiality is a key area of public concern and assurances of confidentiality appear to be important for ensuring public support. However, in the wider literature relating to secondary uses of data in health research there has been much debate about the value and implications of anonymisation, which is frequently described as presenting significant challenges [28][29][30]. For example, it is argued that a certain amount of identifying information is needed in order to allow updating, linkage or validation of data [30,31]. Ohm has argued that 'data can either be useful or perfectly anonymous but never both' [32] (p.1704). Despite these challenges relating to anonymisation, confidentiality is largely discussed and understood in terms of anonymisation. The included studies which explored public attitudes towards confidentiality typically focussed on attitudes towards anonymisation of data.
Anonymisation is generally understood as the process of removing key identifiers such as names and dates of birth from personal data, thus rendering the identification of subjects highly unlikely. However, anonymisation is not straightforward and, as the MRC & Wellcome Trust suggest: 'Because identifiability runs a spectrum, anonymisation is relative' [2] (p.10). The UK Information Commissioner's Office (ICO) has stated that '[i]n reality it can be difficult to determine whether data has been anonymised or is still personal data' [33] (p.16). This ambiguity around anonymisation has implications for understanding public responses in this area. As Haddow et al. note, where studies have explored public attitudes 'it is often unclear whether the research into publics' views relates to fully anonymised data, the use of weaker forms of anonymisation or indeed fully identifiable data' [34] (p. 1141). Therefore, whilst studies have reported public attitudes towards anonymisation, it is not always clear what members of the public understand anonymisation to mean, or what they perceive it to require.
There is evidence within the included studies that assurances of anonymisation may be important for members of the public. However, those studies which enabled greater reflection on the implications or practicalities of anonymisation (e.g. through deliberative methods) typically uncovered more nuanced positions, with members of the public often acknowledging that anonymisation is imperfect as a mechanism for protecting confidentiality and/or problematic for facilitating valuable research. Thus, anonymisation is not regarded as a panacea for addressing public concerns and it may be fruitful to explore further public attitudes towards confidentiality, and the ways that this might be ensured, beyond anonymisation of data.
Similarly, whilst the extant literature in this area has focussed heavily on the role and challenges of consent in relation to data-sharing or data-linkage for research purposes, the included studies highlight that this may not be a fundamental requirement for public acceptability. Rather, the studies indicate that whilst autonomy, or individual control over one's data, is highly valued, consent is acknowledged to be problematic. As in discussions of anonymisation, where study participants have had opportunities to reflect on and discuss consent, views typically shift from an initial preference for explicit opt-in consent towards more flexible models of either opt-out or varied consent. In some cases where study participants have been convinced of the value of research and the potential for public benefits, consent has been regarded as non-essential. However, the degree of control individuals describe as necessary relates to the extent to which they trust the institutions, organisations or individuals involved in processing or accessing their data. A recent study conducted by Ipsos MORI on behalf of the Wellcome Trust found that whilst participants in their deliberative workshops initially tended to express preferences for opt-in consent models, through the deliberative process they shifted to a position where they "felt that if they knew more about the processes and safeguards in place they might feel more empowered, and hence more open and trusting in the decision-making process around data collection and sharing (and may not, therefore, need to opt-in)" [3] (p.13). Control may be facilitated through transparency and public engagement rather than direct or specific opt-in consent.
As such, the findings reported in the included studies suggest that rather than focussing on which consent mechanisms are most favoured by members of the public, it may be more valuable to focus on how relationships of trust are built up (and conversely eroded) and how trust can be facilitated within research and data-sharing or data-linkage processes, including through public/patient engagement or involvement.
This represents an important finding of this review. The literature has often suggested consent may be a requirement for public acceptability, whilst simultaneously arguing that requirements for consent present obstacles to effective and necessary health research and/or surveillance [29,30,35]. One alternative to consent which is currently used in the United Kingdom and elsewhere is authorisation. In England, for example, the Confidentiality Advisory Group (CAG) advises on requests to access data for research where neither consent nor anonymisation is deemed practicable. Similarly, the Public Benefit and Privacy Panel (PBPP) in Scotland is responsible for advising on data access requests involving personal data held by the Information Services Division (ISD) of NHS National Services Scotland (NSS) and National Records of Scotland (NRS). Authorisation is now a widely used governance mechanism and authorising bodies play a significant role within the data sharing landscape. However, this review has found that to date the literature has not engaged with the subject of authorisation and there is a lack of evidence on public awareness of, or responses to, authorisation as a governance mechanism. The findings that individual-level consent may not be crucial for public acceptance, and that trust in organisations and institutions may be more important in shaping public responses, point to the salience of public engagement relating to authorisation approaches. Future research ought to explore public responses to authorisation.
As well as highlighting important conditions for public support, the included studies also indicate a number of areas of public concern about research uses of data. These relate largely to the purposes the research is perceived to serve, and the extent to which it is considered to be in the public interest or likely to yield public benefits. There is significant concern about potential misuse or abuse of data with negative implications for individuals; however, there are also concerns about the potential for wider negative impacts from the outcomes of research. These relate to: the potential for data-sharing or data-linkage to enable or perpetuate mass surveillance and a perceived "Big Brother Society"; the potential for individuals or groups within society to be labelled as a result of data-linkage research and for such labelling to result in stigma or discriminatory treatment; and the potential for research based on analysis of large data-sets to be used to inform policies or practices designed "for the masses" rather than reflecting individual circumstances and needs. What is apparent in relation to all these concerns is the underlying questioning of whether the research and its potential impacts/outcomes are perceived to be in the public interest or likely to bring about public benefits. The potential for research to lead to harm (directly or indirectly) is an area of significant concern.
The studies identified in this review reveal generally lower levels of trust in private sector actors compared with public sector actors alongside concern about private sector involvement in research. These concerns are often related to profit creation from use of individuals' data and/or perceptions that data is routinely sold or passed on within the private sector. However, the studies do not suggest widespread opposition to private sector involvement; indeed, many study participants acknowledged the important role of private sector actors in conducting or facilitating valuable research. Public support/acceptance of private sector involvement was largely conditional on the extent to which the research was perceived to be in the public interest or to lead to public benefits (as has recently been found by the Wellcome Trust [3]). Profit creation was largely not perceived as a problem so long as public benefits were prioritised over profits. The extent to which this was expected to be the case depended on the level of trust study participants had in the individuals or organisations handling/accessing data.
An important observation to emerge from this thematic synthesis is the public's appetite for more information about current research and data-sharing or data-linkage practices. Many of the included studies reveal that there is generally very low public awareness of current research practices and governance systems or safeguards in place. There is evidence that those studies which used deliberative methods and provided participants with opportunities to learn more about current or planned practices led to greater support/acceptance of, or less concern about, research uses of data. Additionally, almost all included studies reported that participants expressed a desire for more information and/or greater transparency about the ways in which data are used in research and the safeguards in place to protect against misuse/abuse or harms. This is significant and indicates not only that more awareness raising is needed but also that there may be significant enthusiasm amongst the public to engage more directly with and in these forms of research. Awareness raising should not be approached as a simple process of one-way information provision but rather requires a more engaged approach in order to ensure that it addresses public interests, concerns or uncertainties. The findings reported in the literature indicate that greater transparency may be needed; however, as we have previously noted, "research/researchers will be more likely to be perceived as trustworthy if transparency and public engagement involve open dialogue with members of the public and opportunities for deliberation, rather than controlled dissemination of information" [27] (p.9).
Within the included studies members of the public have been conceptualised in a number of ways. Some studies have suggested that the use of data in research, and particularly data-linkage, is a complex area which is difficult for members of the public to understand or meaningfully engage with. This leads to suggestions that awareness raising should be used to reassure members of the public through simple information provision and reflects a deficit model approach to public understanding of science [36]. 104 However, those studies which involved deliberative methods have demonstrated that members of the public were able and enthusiastic to engage in discussions on this subject and were competent and valuable deliberators. 105 The nuanced positions described within the included studies highlight the value of qualitative methods for not only revealing but also informing and developing public attitudes. In this way qualitative methods themselves, as forms of public engagement, may have a role to play in building trust, which in turn may underpin greater support for secondary uses of data; increased use of qualitative methods might thus be a building block for support. Such public engagement and qualitative research are increasingly frequent components of large science projects and represent, in part, efforts to increase public trust and to ensure Responsible Research and Innovation (RRI) [27].
Overall, this thematic synthesis has also revealed that there is great scope for qualitative methods to be used more fully or effectively in this area. This thematic synthesis has focussed only on qualitative studies, or qualitative findings reported within mixed methods studies, yet in some cases qualitative methods had been used primarily to inform the design of quantitative studies. 106 Moreover, ten studies were excluded at the final stage due to their limited reporting of qualitative findings or their narrow, structured approach (e.g. qualitative methods being used to examine public responses to narrowly defined questions/hypotheses). Therefore it appears there may be a tendency for qualitative methods to be used largely as a means for informing subsequent quantitative methods, which in turn suggests an under-appreciation of the value of qualitative methods. Indeed, there is some evidence that qualitative methods may at times not be recognised as research methods. The authors found that just over half of the included studies 107 explicitly referred to ethical review procedures relating to the qualitative research, while researchers in one study specifically stated that ethical approval was not required. 108

Study Limitations
Qualitative studies are sometimes criticised for their limited generalisability due to small and/or unrepresentative samples; such criticisms might be levelled at the included studies within this thematic synthesis. The sample sizes ranged from 14 to 217 participants with the average being 54.84. Moreover, many of the included studies focussed on particular groups such as those with particular health conditions/susceptibilities, 109 particular sociodemographic groups 110 or those with previous experience of research and/or data-sharing. 111 Additionally, it is important to note that while random or quota sampling was often used, 112 the qualitative methods relied upon people volunteering to participate in the research, which often involved a significant time commitment. Thus it might be speculated that those individuals who participated in these studies were more likely to be supportive of, or at least interested in, research, and that individuals who are less supportive or more sceptical of research might have been less inclined to participate. Whilst these factors mean that the studies cannot be taken as being representative of the views of the wider public, they remain valuable as indicators of the range of views within the public and particularly as illustrating how opinions are expressed and how they may be informed or influenced. This synthesis of the included studies has addressed some of the criticisms directed at qualitative studies by giving increased breadth through synthesising findings from a large (total) number of study participants and in a variety of contexts.

Conclusion
With ever-growing interest in secondary uses of data for health research, including practices of data linkage and data sharing, there has increasingly been attention directed at public acceptability of these practices. Public acceptability is recognised as crucial for ensuring the legitimacy of current practices and systems of governance. This systematic review and thematic synthesis has highlighted a growing body of evidence pointing towards widespread general, though conditional, support for data linkage and data sharing for research purposes. It has found that whilst a variety of concerns are raised (e.g. relating to confidentiality, individuals' control over their data, uses and abuses of data and potential harms arising), where members of the public perceive there to be actual or potential public benefits arising from research and where they have trust in the individuals or organisations conducting and/or overseeing data linkage/sharing, they are generally supportive. However, the thematic synthesis has also highlighted current low levels of awareness about existing practices and uses of data, and it points towards the need for greater awareness raising combined with opportunities for public engagement and deliberation. This will be important for ensuring the legitimacy of future health informatics research and for avoiding further public controversy.