Race and ancestry in biomedical research: exploring the challenges

The use of race in biomedical research has, for decades, been a source of social controversy. However, recent events, such as the adoption of racially targeted pharmaceuticals, have raised the profile of the race issue. In addition, we are entering an era in which genomic research is increasingly focused on the nature and extent of human genetic variation, often examined by population, which leads to heightened potential for misunderstandings or misuse of terms concerning genetic variation and race. Here, we draw together the perspectives of participants in a recent interdisciplinary workshop on ancestry and health in medicine in order to explore the use of race in research from the vantage point of a variety of disciplines. We review the nature of the race controversy in the context of biomedical research and highlight several challenges to policy action, including restrictions resulting from commercial or regulatory considerations, the difficulty in presenting precise terminology in the media, and drifting or ambiguous definitions of key terms.

The race dilemma in research
At the heart of ongoing debates about the value and use of racial categories in biomedical research are disagreements about the underlying rationale (and motivation) for stratifying study cohorts and what to do with resulting observations. Although there is considerable interest in using social or political categories in the descriptive assessment of health outcome similarities and differences, several scholars have suggested that the subsequent attribution of causality to those categories is unjustified and potentially harmful [13,14]. So, although there is much agreement that race (and other forms of social identification) matters to health, there is little agreement about why or how race matters, how best to study its effects and how to translate and communicate research results from racially stratified studies (see Box 1).
Persistent disagreements about how best to understand race as an object of scientific inquiry complicate matters further. Racial definitions can fluctuate according to social context, geographic location, historical period and personal experience. Indeed, it is not uncommon for the same individual to report their racial identity differently in different contexts and at different points in their lives [15][16][17]. For these and related reasons many scholars view racial identity as primarily a social construct [18][19][20][21][22], and one that can misdirect the categorization of participants in biomedical research. Others see racial identity as correlated with a mix of social and biological risk factors that should be recognized and disentangled, even used to advantage, in an effort to explain and address health disparities [23][24][25][26].
Despite research correlating population genetic identity with geographic proximity [27-30], many researchers hold that self-identified racial or ethnic identity is a poor proxy for underlying genetic relatedness [31][32][33][34]. As a consequence, scientists with an interest in identifying genetic associations with disease are turning to DNA-based estimates of 'ancestry' as a basis for stratifying study samples and controlling for background genetic differences unrelated to disease risk [35][36][37]. Geneticists have also begun to use ancestry informative markers (AIMs) to identify groups and individuals subject to recent genetic admixture and use this information in methods such as admixture mapping [38][39][40]. In both gene-association and admixture studies, assessing ancestry is seen as preferable because it circumvents the problems of self-reporting, although major axes of differentiation continue to be drawn along continental lines, recapitulating previously used racial distinctions: African, European, Native American, and so on.
Although ancestry estimation has the potential to control biases due to genetic confounding, in isolation its use defers, rather than addresses, the important problem of how social and biological risk factors interact in the context of health, including the production of racial health disparities. Research that simultaneously assesses both genetic and environmental contributions to disease risk, drug response and other health-related variation, and that deliberately puts such findings in the context of self-identified race, is urgently needed [13,41-43]. In the absence of such additional evidence, and despite its amorphous nature, the multi-dimensional and contested concept of race will probably continue to have an important place in biomedical research for many years to come.
The continuing salience of race as a research variable places researchers in the unenviable position of having to negotiate complicated, and often controversial, terrain. Given the potential for misinterpretation and misapplication of research findings, great care must be used in the characterization of study samples and the interpretation of observations (Box 2). Available research tells us that such rigor is often absent in the reporting of race and ethnicity in the biomedical literature [44][45][46][47][48][49][50]. In addition, researchers must remain aware of the manner in which their work could be translated, both clinically and in the popular press (Box 2).
Challenges to change
How can we move forward? Many journals, research entities and academic commentators have provided relevant recommendations (Box 2). Yet concerns persist, and race and related concepts continue to be used in an inconsistent and potentially misleading manner within biomedical research [44,48-50].
The concept of race has a long and complex social history [51], and the research community operates within the constraints imposed by this history and its associated social structures. This overriding reality is one of the primary reasons why race remains a controversial and uneasy concept in research. However, there are other challenges that make progress difficult, despite the numerous policy recommendations. Understanding these tendencies, trends and social forces may help us to more effectively use existing recommendations and address social concerns. What follows, although not comprehensive, is a list of some of the most salient challenges that emerged from the workshop.
Commercial and regulatory imperatives
Decisions on whether or how to use race in biomedical research and clinical practice do not take place in isolation. Often they are shaped by commercial and regulatory imperatives that reward or require the use of racial categories in particular ways that may not serve constructive purposes. Much biomedical data, for example, is produced as a result of regulatory mandates that direct the collection of data using social categories of race derived from such sources as national census tables [13,46,52,53]. Such 'racialized' data necessarily raise questions of how best to manage the relationship between social census categories of race and the biomedical data being produced by researchers and clinicians. Moreover, once introduced into the biomedical arena, race can take on a life of its own, leading to the retrospective framing of data and/or the prospective design of product development in 'racialized' terms that were not originally contemplated by the researchers [54][55][56]. It also seems likely that market forces will push toward terminology that captures a larger population and has more immediate public recognition [57]. Narrowly defined terms, such as ancestry, are likely to have less public recognition than race.

Box 1. Examples of concerns with the use of 'race' in genetic research

Examples of research concerns
Stratification by race is being used on the assumption it can serve as a proxy for genetic similarity, but there is disagreement regarding the degree to which race correlates with genetic variation [23,27-33,70,71].
There is a lack of agreement both in the public sphere and among researchers on what is meant by the term 'race'. In genetic research it is not being defined or applied consistently, nor is a rationale for the analysis of race in studies being consistently provided. This leads to a lack of clarity about the groups being investigated, hindering reproducibility and generalizability between studies and slowing scientific progress [65,72-76].

Examples of social concerns
Stratification by race in genetic research can over-emphasize the role of genetics as the basis for health disparities, deflecting research funding and attention away from the substantial socio-economic and political determinants of inequities [74,77-80].
The use of race to categorize groups in genetic research can lead to over-emphasis of the relative magnitude of genetic differences between populations and to the 'reification' of race as a natural genetically determined system of human classification (leading to 'racialization' and a belief in genetic underpinnings for social inequities and differences between groups) [54,66,79,81-83].
The use of racial or population groups in studies to identify the genetic variation underlying disease susceptibilities can lead to 'racialization' of disease, whereby the disease state becomes irrevocably identified and linked with that group. This can lead to several secondary outcomes, including the discrimination and stigmatization of members of the group in question, and decreased access to information, surveillance and treatment that could be valuable to other groups [65,77,84,85].

Example of clinical/healthcare concerns
The descriptive use of race in genetic and biomedical research can lead to racial stereotyping in clinical practice. For example, the use of perceived or self-identified race as a proxy for genotype in prescribing most often overly simplifies the concept of pharmacogenomics. Diagnosis or assessment of disease risk on the basis of race can similarly result in serious medical errors [13,86-89].
Media representations
The popular press is an important source of health information, particularly for the general public [58]. Although the relationship between media representations and public perceptions of biomedical research is complex [59], there is some evidence that the media can influence social perceptions and attitudes, even about race [60]. There are certainly examples of news reports that include a thorough examination of the challenges associated with using race in biomedical research [61], but media representations often simplify the science and use concepts such as race without explaining how the social category relates to the research outcome [62]. Given the limited space and time available to write most science stories, this is hardly surprising. The research community, in an effort to translate the research results to the lay public, can also use terminology that does not accurately reflect or represent the research conducted (Box 2). For example, a study might have used ancestry as a variable, but racial descriptors are then used to describe the significance of its findings in the popular press [63].
In response to the social concerns associated with the notion of race, new terminology has been suggested, the hope being that this new terminology will be both more scientifically precise and have less historical and social baggage. For example, the term ethnicity emerged as an alternative to race [64]. However, there is often a migration back to the original term, or the new term simply comes to be understood to mean the same thing as the old one [65]. Given that the social category of race has the most cultural resonance, this slippage is likely to be from the more specific terminology toward the broader, and perhaps more inaccurate, notion of race [26,62].

Definition ambiguity
The final challenge is the need to strategically tolerate the ambiguity of racial identity. Because, as described above, the relevance of race and of race categories far exceeds the arena of scientific discourse and becomes the concern of government regulation, media accounts and language debate, science cannot independently dictate its meaning or invent new terms to replace it. Moreover, the features that make race socially useful (its fluidity, ambiguity and contingency), and that feed its social ubiquity and thus contribute to its scientific utility, also work against tidy definitions. These features of race cannot be reasoned away. Nor, however, can they be used as an excuse to ignore standard scientific requirements for explaining research terms and justifying design choices. Instead, they need to be recognized and self-consciously engaged, as part of an iterative process directed at clarifying the import of human genetic variation in the long term and of using genetic insights to help eliminate, rather than reinforce, disparities in health status.
Mitigation strategy
Race is best understood as the result of a process informed by social values and institutional practices that imbue superficial differences between groups, such as skin color, eye shape or language, with unwarranted significance. Historically, this has been informed by hierarchical thinking, in which group differences and social inequalities are naturalized and rearticulated as biological realities [66]. Genomic research that uses racial categories in the investigation of genetic contributions to disease can also inadvertently support such 'racialization' and influence how findings of group differences are interpreted and, in turn, translated into clinical care and health policy.
Clearly, although the recognition that certain susceptibility variants are more prevalent in certain groups can have health benefits, such observations should not validate the politically and historically charged concept of race or support assumptions that the entire range of attributes ascribed to race have a biological basis. There is a need to develop strategies to mitigate the inappropriate and potentially inaccurate use of categorizing terminology. The available recommendations, outlined in Box 2, have merit. But there are numerous social forces and tendencies, such as those outlined above, that challenge progress towards constructive change.
Future policy and social-science work should focus on exploring the influence of these social forces. For example, although some research in this area has already been done [62,67], a more nuanced understanding of how data on human genomic variation are interpreted by media, and in turn assimilated by the public, is needed. Recommendations for communication strategy could be used to raise the awareness of researchers as to how their work is apprehended by lay audiences (Box 2), especially journalists. Engagement and education of both scientists and media on their social and ethical responsibilities would be a related actionable strategy (Box 2). In the arena of clinical trials and genetic epidemiological research, the social impact and scientific utility of alternative methods of subject identification and reporting of findings could be explored. Most importantly, consideration should be given to ensuring not only that relevant policy recommendations are effectively implemented, but also that they are being followed.
Finally, it must be recognized that the discussion and analysis surrounding the use of race, ethnicity or ancestry in medicine is, for the most part, flowing from scholars in North America and Europe. However, a number of countries, including Mexico, India [68,69], Thailand and South Africa, are already doing, or planning to undertake, projects studying human genetic diversity within their own populations; and in many others, such as Brazil, extensive admixture has created a continuum of ancestral proportions among individuals that challenges racial classification. It will therefore be important that experts from communities in both the emerging economies and developing countries also contribute to this very important debate.

Additional data files
The following file is available: Additional file 1, consisting of an extended version of Box 1 that includes supporting quotations from the cited references for each example concern.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions