Introduction

This paper was composed by members of the Ethics and Policy Committee (EPC) of the International Cancer Genome Consortium (ICGC), a large-scale genomics consortium that coordinates 74 research projects across 17 jurisdictions to investigate over 50 cancer types and sub-types.1, 2 The ICGC also conducts a variety of benchmarking initiatives, which are increasingly used for standardization in genomic research. However, two recent activities of this type posed a dilemma for the EPC.

The first activity, the Somatic Variant Calling Pipeline Benchmark, involved sharing cancer patients’ whole-genome sequence data and associated metadata with participating centers in order to improve the data generation and analysis methods. Each group analyzed the data separately and reported results back to a central analysis team. Some centers expressed interest in publishing their findings,3 which at the time raised some concern that the benchmarking activity might be better categorized as research.

The second activity, the DREAM Somatic Mutation Calling Challenge, was a global competition meant to define standard methods for identifying cancer-induced mutations in whole-genome sequencing data. Data representing pairs of normal and tumor genomes were stored on a cloud computing repository and results were also returned to the cloud. Multiple metrics, including balanced accuracy, specificity, and sensitivity, were used to determine the challenge’s winner. The global results were then published in a collaborative scientific paper.4

Access to the ICGC data is overseen by the Data Access Compliance Office (DACO) in order to maintain robust privacy standards. Both activities were initially presented to DACO as quality assurance (QA) projects, but shared some common features of research, such as confidentiality risks and the documentation of novel bioinformatics techniques. After considerable debate, the EPC decided that for data protection purposes both activities should be considered human subjects research. The EPC also recognized that ethics oversight should not pose a ‘burden’ so large as to dissuade investigators from having their simple QA activities reviewed.5 Therefore, a strict controlled access procedure was applied: the DACO committee reviewed each applicant’s credentials, collaborators, research plan, ethics approval, and potential risks to data privacy in the same manner as conventional research applications before granting them access to the ICGC data. However, given the limitations of an international consortium, the choice to obtain local ethics approval from an IRB or adopt additional oversight procedures was left up to each participant in accordance with their jurisdiction’s laws and policies. This experience convinced the EPC that a more systematic approach was needed to help investigators and policymakers assess complex projects and select the appropriate oversight mechanisms.

In theory, QA requires less rigorous ethics review because it differs substantially from traditional biomedical research.6 Research is generally defined as ‘an undertaking intended to extend knowledge through a disciplined inquiry or systematic investigation’,7 whereas QA is the ‘systematic monitoring and evaluation of the various aspects of a project, service or facility to ensure that standards of quality are being met’.8 International standards indicate that best practices in biobanking and genetic data collection should include QA mechanisms.9, 10, 11, 12, 13 These are especially important in light of current emphases on personalized medicine and open science, which have led to massive amounts of genomic data being stored in biobanks and shared with the scientific community.14 Although such data sharing generally involves only minor psychosocial risks, in practice nobody can be aware of all future challenges raised by these projects.11, 15, 16, 17, 18 They can pose different risks from clinical research, such as privacy infringements and future sharing of the data in unanticipated ways.19, 20 Indeed, research and QA undertaken at the same biobank often have similar risks of confidentiality breach.11, 21

Given these considerations, it can be unclear how to review QA projects that share similarities with research, especially in the context of data-intensive activities. Both can begin with a clear question or problem, use systematic methods of data gathering, generate questions to inform future research, and use the same large data sets.22, 23 QA misclassified as research may lack rigor and fail to comply with requirements regarding study design, participants’ rights, and other applicable laws and policies.24, 25, 26 In addition to unnecessary bureaucratic delays, this sort of confusion has caused ‘criticism by regulatory authorities, rejection of manuscripts by journals for lack of informed consent procedures, and feelings of considerable frustration’.27 Conversely, classifying complex projects as QA risks offering too little protection for participants. As such, classifying ambiguous projects poses an increasingly important dilemma, one whose resolution has historically been hampered by a lack of established guidelines.21

Our goal therefore involved three steps: a literature review to determine the relevant factors for differentiating QA and research in genomics; an international comparison of policy frameworks and the review pathways they allocate to each type of project; and the integration of those results in the development of an effective decision-making tool to help researchers and policymakers alike classify genomics projects into the appropriate ethics review streams.

Scholarship on the distinction between research and QA

Our literature review began with a keyword search for ‘quality assurance’ or ‘quality improvement’ and ‘research’ using the tools Google Scholar and Web of Science. Further papers were identified from their references through a snowballing method until no more were accessible. One challenge for this review was that ‘quality assurance’, ‘quality improvement’, ‘quality activities’, ‘quality studies’, and even ‘audit’ are often used interchangeably to describe any kind of routine knowledge-generating process used to mitigate risky practices.15, 17, 28 Some papers compare these categories (see, for example, refs 6, 26, 29, 30, 31, 32), whereas others group them as functionally equivalent (see, for example, refs 28, 33, 34). We describe them all more generally as QA, as many of the distinctions referred to clinical care and were not relevant to our purposes. Indeed, nearly all of the sources comparing QA and research do so in a clinical context,35 including significant documents like the United States’ Common Rule.36 This may be attributable to the fact that much of the existing literature is over a decade old, predating the present importance of genomics. After excluding criteria considered to be outdated or inapplicable to the ICGC, like sample selection, rigor of analysis, and choice to publish, we were left with six primary criteria.

Generalizability

QA and research are often defined by their ‘intent’ or ‘purpose’, based on whether they aim at producing local improvements or generalizable knowledge.19, 22, 23, 25, 28, 31, 32, 34, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50 According to the Council for International Organizations of Medical Sciences, ‘the defining attribute of research is that it is designed to produce new generalizable knowledge as distinct from knowledge pertaining to a particular individual or program’.51 Indeed, several sources describe generalizability as the consensus criterion.32, 38, 42 A similar formulation asks who benefits directly from the activity: the institution or system, the participants, or society in general.6, 19, 23, 25, 37, 52, 53 However, generalizability on its own fails to consider important factors like risks, methodologies or the types and sources of the data collected.26, 42 Furthermore, intent is unenforceable and subjective, and can vary over the course of a single project.6, 26, 45

Risks and benefits

Many authors classify projects based on their levels of or increases in risks, burdens, and chances of failure, with research being more risky than QA.6, 20, 23, 31, 32, 35, 37, 44, 45, 47, 53, 54, 55 A related distinction asks whether risks are limited to privacy or confidentiality breaches or extend to broader emotional, psychological, social, financial, or physical harms.19, 20, 22, 37, 45, 56

Novelty

Although several authors suggest considering whether an activity departs from standard practice,52, 54, 55 a similar criterion asks whether it builds on previous research, such as by comparing performance to best practices, or whether it investigates a new question or area, such as by testing a new technology.22, 23, 28, 32, 37, 39, 43, 44, 54 QA is often described as occurring in response to a problem, whereas research is more forward thinking.32, 34, 37, 43, 45 Knowledge gained through research is used to confirm standards and best practices, whereas QA measures practices against those standards once they are established.22, 23, 24, 28, 33, 37, 44, 47, 54, 57, 58

Speed of implementation

Another common criterion asks whether the project’s results are applied immediately, in the case of QA, or after a delay, in the case of research.19, 22, 32 Reasons for this disparity include differences in methodology and intended outcome as well as the fact that research is typically disseminated to the broader scientific community for critique and replication before being put into action.19, 23, 40 In comparison, QA often functions as an ongoing process, consisting of rapid small-scale cycles in a sort of feedback loop.25, 26, 46, 48, 59

Theory and method

Research tends to generate or test a theory using the scientific method and a formal and explicit hypothesis.19, 20, 22, 25, 32, 33, 34, 37, 38, 39, 40, 43, 44, 45, 46, 47, 48 Yet, although QA methods are seen as more flexible and iterative,19, 23, 37, 43, 48 QA increasingly uses cohorts and other methods typical of research.27, 46 GWAS and other types of large-scale genomic research can also lack clear hypotheses, which can make this particular criterion challenging to use in this context.

Involvement of researchers

QA and research may be conducted by different people:39, 57 QA data collection is more typically performed by those who have routine access to the data and who may be investigating their own practices,25, 32, 45, 47 whereas research data are collected by trained investigators whose involvement may end at publication.32, 34, 39, 44 Research is also more likely to have external funding along with the associated requirements and risks of bias.20, 39, 44

Country-specific approaches to the review of QA projects

In order to identify potential review pathways for QA, we decided to compare models from various countries, since genomics projects often involve collaboration on an international level. We selected Canada, the United States, the UK, and Australia since each had previously considered the QA-research issue and their relevant ethics documents were available in English.

Canada

The Tri-Council Policy Statement (TCPS2) is the national guide mandating ethics review and approval for all human subjects research funded by Canadian agencies. It distinguishes research that requires IRB review from ‘non-research’ activities, including QA ‘used exclusively for assessment, management, or improvement purposes’.7 Although the TCPS2 does not describe a clear assessment process, Canada’s Interagency Advisory Panel on Research Ethics suggests that concerned researchers consult an IRB or base their decision on study generalizability, intent to publish, or other elements considered essential to QA.8 In Canada, QA projects presenting minimal risk generally receive either an exemption or expedited review. In an expedited review, the IRB chair and another delegated member review the proposal and can recommend a full review, if appropriate.60 Administrative approval is used if QA proposals ‘raise ethical issues that would benefit from careful consideration by an individual or body capable of providing independent guidance’ other than an IRB.7 This allows QA to be evaluated by department heads or other institutional representatives, but its specifics are not described in the TCPS2.

According to the Alberta-based network A pRoject Ethics Community Consensus Initiative (ARECCI), projects with human participants should first be sorted by ‘primary purpose’. Research aims to ‘contribute to the growing body of knowledge regarding health and/or health systems that is generally accessible through standard search procedures of academic literature’, whereas QA aims to ‘assess or improve the quality of a treatment, service or program’.61 Each project is then screened by risk level: QA projects posing more than minimal risk are vetted through the full IRB process, while the rest receive an exemption or expedited review. Although ARECCI does not define minimal risk, the TCPS2 describes it as when ‘the probability and magnitude of possible harms implied by participation in the research is no greater than those encountered by participants in those aspects of their everyday life that relate to the research’.7

United States

Regulations in the United States also focus on the purpose of an activity. The National Bioethics Advisory Commission describes research as ‘undertaken to test a new, modified or untested intervention, service or program’, whereas QA assesses the quality of an established program.62 The Common Rule defines research as ‘a systematic investigation, including research development, testing and evaluation, designed to develop or contribute to generalizable knowledge’,36 which does not necessarily exclude QA. It further requires that federally funded or affiliated projects with human participants undergo IRB review. However, the IRB may use an expedited procedure for research with ‘no more than minimal risk’.36 Although ‘quality assurance methodologies’ are also eligible for expedited review,63 a more precise definition of QA is not provided in the Common Rule. According to the Office for Human Research Protections (OHRP), QA is not subject to Common Rule regulations if its purpose is limited to delivering, improving, or collecting performance data for health care. However, some types of QA may be considered research if, for instance, they are also meant to establish proof of efficacy. Intent to publish is not considered a decisive factor in classifying these activities.64

United Kingdom

In the UK, research falling outside the National Health Service (NHS) generally requires review by an independent or institutional IRB, whereas research involving patients and users of the NHS or other services from the Department of Health requires review by an NHS IRB. Health Research Authority (HRA) guidelines stipulate that NHS IRBs should not consider activities like clinical audit, service evaluation, or public health surveillance as research.65 Audits and service evaluation projects that impose minimal additional risks are defined as QA and subjected to administrative approval. Recognizing that QA may overlap with research, the HRA also provides clear guidance for distinguishing them using four key determinants: intent; treatment/service; allocation; and randomization. The intent of research is ‘to find out what you should be doing’, whereas the intent of QA is to find out ‘whether it is working’.66 The HRA emphasizes that this criterion depends on the primary purpose, and even provides an online tool to help make the distinction.67 QA projects that do not require the involvement of an NHS IRB may be reviewed by research ethics committees from universities or other institutions. Although IRB members may be consulted during this process, those performing the QA are responsible for considering ethical issues themselves.65

Australia

Australia’s 2007 National Statement on Ethical Conduct in Human Research has been revised to include guidance for review of low-risk projects. It acknowledges that ‘there is a great deal of uncertainty about the appropriate levels of governance for such activity’, that review pathways are unclear, and that processes meant for research may be too onerous to apply to QA.68 Indeed, Australia’s Human Research Ethics Committee (HREC) review process for QA activities has been criticized as overly time consuming.69 Although activities involving ‘more than low risk’ require review by an HREC,68 QA is described mostly as having negligible or low risks.69 ‘Negligible risk’ projects have no foreseeable risks greater than inconvenience and ‘use existing collections of data or records that contain only non-identifiable data’; these may be exempted from review.68 ‘Low-risk’ projects may cause foreseeable discomfort68 and do not use non-identifiable existing data. The draft document recommends that these be reviewed by non-HREC-level review bodies, which include department heads, departmental committees, delegated review groups that report to an HREC, or HREC subcommittees.69

Discussion

The laws and policies that guide ethical review and oversight of human research in Canada, Australia, the UK, and the USA are consistent on the need for a nuanced, tailored approach to the review of QA projects in the context of research. Depending on the project, they recommend one or more of four broad pathways: full review; exemption; expedited review; and administrative approval. Unless otherwise exempted by national laws and policies, it was agreed that clear ‘research on human participants’ should undergo ethics review by an IRB or similar committee authorized to review medical research projects. The IRB should have the authority to approve, reject, require modifications, terminate, or suspend approval of any proposed or ongoing research.

Although each country also presented criteria for the exemption of certain studies, requiring instead only a waiver or a letter of exemption, there was no consensus between them on those criteria. Full exemption generally depended on the types of data collected and the levels of risk to the participants. Even in such cases, all countries recommend careful consideration of ethical issues through administrative approval by a departmental committee chair, or minimally by the investigators before and during the activity. It was not seen as problematic for the investigators’ peers to review QA projects given the lack of significant associated ethical issues.

Although projects using QA methodologies that pose minimal risk were generally seen as fit for expedited review, there was little agreement on the types of QA that should qualify for a less onerous process or how they should be evaluated. Given the increasing variety of sophisticated QA in biomedical research, there is a need for greater coherence between the approaches described. Furthermore, these laws and policies are mostly directed towards the clinical context. More thought should be given to QA in the context of non-clinical research.

The decision tool

These reviews of the literature and policy environment were followed by a discussion among the members of the EPC. On the basis of our research, we decided that the most difficult borderline cases ought to be resolved using the specific criterion of risk levels. Although international consortia like the ICGC have very limited influence over review at the national level, they can insist that member projects meet minimal ethics requirements and can use a controlled access strategy like DACO to ensure greater oversight over data sharing activities with privacy risks. With these considerations in mind, we developed a tool that can help researchers and policymakers determine how activities should be reviewed locally while resolving some of the difficulties posed by differing standards of project classification (Figure 1).

Figure 1

A decision tool for assessing proposed QA projects in a research context. A decision tree to facilitate the classification of activities undertaken in genomics projects. The colored bar represents the spectrum between QA and research, with activities of uncertain type falling in the middle. The six criteria determining where a project falls are listed above the bar. Below each extreme of the spectrum, the flowchart indicates the appropriate level of ethics review, whereas the middle section proceeds to a second level of review. This step uses the criterion of risk level to divide projects between the use of exemption, expedited review, or administrative approval and the use of an independent ethics review.

The first step of the tool uses the six criteria discussed above (generalizability, risk, novelty, speed of implementation, methodology, and scope of involvement) to identify which projects clearly represent QA, which are clearly research, and which share characteristics of both. Generally, those which compare interventions between groups, extend biomedical knowledge, incur some risks to the participants, and produce generalizable results should fall on the research side. Those which aim at measuring and more immediately improving local performance with respect to a standard of quality, and have minimal risks and burdens, should fall on the QA side. Projects that exhibit overlap are then sorted based on whether they have low foreseeable risk and a favorable risk-benefit ratio (as in ref. 6). In such cases, the amount of ethics oversight should correspond to the degree of actual, rather than hypothetical, risk to participants,70 independent of the project’s other characteristics. As complex projects of this type are more properly envisaged on a spectrum, this sorting is not meant to suggest that only binary classification is possible; it is only meant to indicate the appropriate level of ethics review.
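For readers who prefer a procedural view, the two-step logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool itself. The names `Lean`, `classify`, and `review_pathway` are hypothetical. The rule that a project is "clearly" QA or research only when all six criteria agree is our simplification. The pathway assigned to each clear-cut end of the spectrum (a lighter pathway for clear QA, independent ethics review for clear research) is likewise assumed here rather than taken from Figure 1.

```python
from enum import Enum
from typing import Mapping


class Lean(Enum):
    """Side of the QA-research spectrum that one criterion points to."""
    QA = "qa"
    RESEARCH = "research"
    UNCLEAR = "unclear"


# The six criteria named in the text (identifier spellings are illustrative).
CRITERIA = (
    "generalizability",
    "risk",
    "novelty",
    "speed_of_implementation",
    "methodology",
    "scope_of_involvement",
)

LIGHTER_PATHWAY = "exemption, expedited review, or administrative approval"
INDEPENDENT_REVIEW = "independent ethics review"


def classify(leans: Mapping[str, Lean]) -> Lean:
    """Step 1: place a project on the spectrum.

    Assumption: a project counts as clearly QA (or clearly research)
    only if every criterion leans the same way; anything else is
    treated as sharing characteristics of both.
    """
    missing = set(CRITERIA) - set(leans)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    observed = {leans[name] for name in CRITERIA}
    if observed == {Lean.QA}:
        return Lean.QA
    if observed == {Lean.RESEARCH}:
        return Lean.RESEARCH
    return Lean.UNCLEAR


def review_pathway(
    leans: Mapping[str, Lean],
    low_foreseeable_risk: bool,
    favorable_risk_benefit: bool,
) -> str:
    """Step 2: choose a review pathway; risk decides the unclear cases."""
    category = classify(leans)
    if category is Lean.RESEARCH:
        return INDEPENDENT_REVIEW  # assumed pathway for clear research
    if category is Lean.QA:
        return LIGHTER_PATHWAY  # assumed pathway for clear QA
    if low_foreseeable_risk and favorable_risk_benefit:
        return LIGHTER_PATHWAY
    return INDEPENDENT_REVIEW
```

In this sketch, a benchmarking activity that looks like QA on most criteria but documents novel techniques would be classified as unclear in the first step, and its pathway would then depend only on the two risk judgments supplied by the reviewer.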

Given the lack of policy uniformity on this topic, parts of this framework may be difficult to reconcile with some countries’ existing laws or ethics guidelines. In these countries, and in those with insufficiently developed ethics frameworks, our tool and the broad approach it proposes could be used to help guide policy reform, promote normalization, and achieve greater international coherence on QA review.

Conclusion

Although many projects are clearly QA and should not be subjected to unnecessarily lengthy ethical review, others share aspects of both QA and human subjects research. It can be difficult to place these activities clearly in either category, and categorizing them arbitrarily can result in a suboptimal level of protection or in excessively burdensome regulation for projects with little likelihood of harm.55 As members of the ICGC’s multidisciplinary EPC, we were recently confronted with several projects of this type. On the basis of an international comparative policy review and a scoping literature review, we decided to devise an assessment tool that could satisfy members of the consortium while helping international research groups, academic institutions and researchers meet their responsibilities for ethical review and oversight of these activities. Our proposed framework uses a two-step approach that enables investigators and policymakers to classify complex projects as research or QA using the traditional characteristics of both as well as the evaluation of the actual risk posed by more hybrid projects. Although projects will not necessarily need to be evaluated by a formally designated IRB, they will always require independent ethical debate regarding the risks involved and the protection of human participants.