Ethical Challenges Associated with Pathogen and Host Genetics in Infectious Disease

The Covid-19 pandemic has demonstrated the potential of genomic technologies for the detection and surveillance of infectious diseases. Pathogen genomics is likely to play a major role in the future of research and clinical implementation of genomic technologies. However, unlike in human genetics, the specific ethical and social challenges associated with the implementation of infectious disease genomics have received comparatively little attention. In this paper, we contribute to this literature, focusing on the potential consequences for individuals and communities of the use of these technologies. We concentrate on challenges related to privacy, stigma, discrimination and the return of results in the cases of the surveillance of known pathogens, metagenomics and host genomics.

England's then-Chief Medical Officer (Davies 2017). The authors of the chapter on pathogen genomics pointed to the potential role of pathogen whole genome sequencing (WGS) in detecting mutations associated with drug resistance in HIV and Mycobacterium tuberculosis, in recognizing and tracing outbreaks of foodborne or nosocomial illness, and in the management of international outbreaks of infectious disease. While genetic sequencing of pathogens has taken place in some form since the early 1990s, it is in the last decade that pathogen sequencing has begun to play a central role in public health disease control and epidemiology more widely. Targeted genomic surveillance tools have supported public health responses to cholera (Dorman et al. 2020), Zika (Grubaugh et al. 2017), Ebola and Yellow Fever outbreaks (Gardy and Loman 2018). Most recently, genomic sequencing of SARS-CoV-2, the agent causing Covid-19, has been integral to the public health response to the pandemic. The energies of major genomic research centres were redirected to rapidly scale up sequencing capacity (Midgley 2020), and efforts to identify and track novel variants became part of the everyday pandemic lexicon, as policymakers, public health officials and the population adapted to the emergence and spread of novel variants.
Through the pandemic and beyond, genomics offers public health possibilities through the identification and discovery of novel infectious agents and improved understanding of their interaction with hosts. Metagenomics, for example, the study of the 'collective genomes' of a sample, offers the possibility of rapid identification of microbial populations and novel pathogens in the field, speeding the public health response. Such 'pathogen-agnostic' approaches present a powerful promise of pathogen genomics, albeit one that remains further from routine implementation (Govender et al. 2021). Furthermore, the cases of HIV/AIDS and SARS-CoV-2/Covid-19 highlight that considering infectious disease genetics requires attention to interactions between humans, animals and environment (Gardy and Loman 2018), and between host and pathogen genomes.
Given the growing importance and potential of pathogen genomics, there have been surprisingly few analyses of the ethical questions posed by the field's development and its intersection with existing discussions related to both human genetics and infectious disease control. There is, however, a small and growing body of literature in this area (see for example, Geller et al. 2014, Boyce and Garibaldi 2019, Johnson and Parker 2019, Mutenherwa et al. 2019, Walker et al. 2021). In an interdisciplinary expert review drawing on a workshop with primarily North American-based experts in public health, law, biobanking, genetic epidemiology, philosophy and ethics, Walker et al. (2021) highlight challenges related to the risks in reporting to government authorities, return of individual research results, and resource allocation, and the need to balance commitments to public health and individual privacy in relation to each. However, although the ethical questions associated with a range of tools and technologies contributing to the public health response to Covid-19 have received considerable attention (see for example Gasser et al. 2020, Lucivero et al. 2020), there has been relatively little consideration of the implications of the pandemic experience for the future of pathogen genomics. In this paper, we seek to contribute to this discussion. Decisions made during infectious disease outbreaks are made urgently and in the context of scientific uncertainty, social disruption and amid fear and distrust (WHO 2016). In setting out some key areas that require attention or consideration in the implementation of pathogen genomics, we aim to describe salient features of the discussion and facilitate the balancing of competing ethical values by decision-makers operating in these circumstances. Our discussion revolves around three specific applications of infectious disease genomics, drawing on examples from the Covid-19 pandemic and beyond.
These encompass and extend beyond the genome of the pathogen and ethical implications related to privacy, utility and the return of results, and discrimination, justice and equity. We first consider questions associated with genomic surveillance, particularly phylogenetics. We then discuss metagenomics, and the potential for incidental findings associated with pathogen genomics. Finally, we touch on the implications of host genome sequencing and the genetics of resistance and vulnerability.

Genomic surveillance
It seems hard to remember that only a short time ago, some familiarity with the genetics laboratory would probably have been needed to recognize the acronym PCR or to have any notion of the role played by polymerase chain reaction techniques in amplifying and identifying nucleic acids from pathogens and diagnosing disease. PCR-based assays are at the heart of genetic pathogen surveillance techniques, particularly for viral diseases, and generate information that has value not simply for diagnosis but for orienting public health policy, not least in informing the introduction of restrictions on individual and community liberty.
Genomic surveillance (the use of sequencing to detect the presence of a pathogen, identify new variants and monitor trends in circulating variants) raises ethical questions for both individuals and communities. For example, the surveillance of SARS-CoV-2 through the analysis of wastewater for the presence of the virus offers opportunities to detect the emergence of outbreaks in the absence of individual testing regimes. However, the use of wastewater in this way requires balancing of the potential benefits against the implications for privacy and autonomy, given the limited ability to make an informed decision about participation in such screening, and the risk that targeting or identifying communities on the basis of this information may lead to restrictions on their liberty or stigmatization (Gable et al. 2020).
Further challenges associated with the use of pathogen genomes for surveillance arise from the construction of phylogenetic trees and their potential use in establishing chains of transmission. Phylogenetic analyses compare pathogen sequences from different sources in order to establish the evolutionary relationships associated with them, trace the origins and path of outbreaks and provide 'real time' tracking and visualization of the evolution and transmission of viruses. Such phylogenetic analyses have been used in public health surveillance programmes of pathogens including HIV, to identify transmission clusters or hotspots and target interventions (Mutenherwa et al. 2019). In the case of SARS-CoV-2, such analyses facilitated the early identification of variants of potential concern and the introduction of targeted restrictions on movement and activity aimed at preventing their spread.
For both communities and individuals, phylogenetic analyses are accompanied by potential challenges associated with the enhanced ability to track disease. For the former, a specific risk relates to the historical association of infectious disease outbreaks with xenophobia and efforts to restrict or contain the movements of those perceived to be different. Narratives of infectious disease transmission have often revolved around 'out of Africa' and 'out of Asia' tropes (Leach and Dry 2010) that emphasize differences between the Global North and the Global South and reproduce 'enemy figures of mutating, slippery, re-assorting pathogens, of free-roaming super-spreaders and patient zeros' associated with the emergence and transmission of new pathogens (Hinchliffe et al. 2016, p. 9). These narratives are reinforced by the explicit association of diseases with places or peoples. The potential negative impacts of disease naming were recognized by the World Health Organization in 2015 best practice guidance to avoid the use of geographic locations, people's names, animal species or cultural, population or occupational references (WHO 2015). In the case of the SARS outbreak of 2003, the linking of SARS in Canada with travel from China led to avoidance of Chinese businesses (Singer et al. 2003).
While in the Covid-19 pandemic the formal naming of the disease followed best practice, the popular and political association of the disease and viral variants with specific places contributed to stigmatizing places and populations, in ways that often intersected with and reproduced historic relations of oppression, exclusion and discrimination (Ganguli-Mitra et al. 2020). In early 2020, researchers writing to The Lancet reported individual and collective forms of aggression and discrimination against Chinese individuals (Devakumar et al. 2020), including the exclusion of Chinese people from restaurants. Although these initial associations were not linked to the application of pathogen genomics, newly identified Covid variants repeatedly became linked to specific places and peoples, from India to Brazil, the U.K. and South Africa. The associated targeting of restrictions on travel and trade led to concerns about discrimination that, as Meier et al. point out, 'exacerbated political divisions, blocked essential goods and deflected from established mitigation measures' (2022, p. 178). In doing so, they jeopardized a global response to the pandemic based on solidarity, a commitment to carry some costs (for example, in the form of funding or sharing of information) to assist others in the same situation (cf Prainsack and Buyx 2017).
From a genomics perspective, a key part of this solidaristic response is the sharing of samples and data. In the case of diseases such as HIV, the development of phylogenetics reflects what Crane describes as the molecular politics of global health, in which the available materials and tools, including reference data, reflect global inequalities 'at the most minute scale' (2011, p. 145). Despite the development of global disease surveillance systems around zoonotic disease and influenza, inequalities are reflected in the distribution and availability of the tools of phylogenetics and the over-representation of viral genomes from the Global North. The rapid development of global capacity and the growth of open data sharing (WHO 2016) may help address this imbalance, as long as it is accompanied by appropriate and equitable sharing of the benefits of genomic analyses, as well as funding and resources (Johnson and Parker 2019, Bedeker et al. 2022).
At an individual or community level, the use of phylogenetic data to map chains of transmission may enable the attribution of moral or legal responsibility for infection. For example, if the viral genomes sampled from two individuals are identical, it may be that one passed the virus to the other, or that both were in contact with the same third individual. In contrast, if there is a significant difference between genomes, it is unlikely that they were part of the same chain of transmission. The potential for genomic surveillance programmes that make use of such phylogenetic inferences about transmission is substantial, for example in the control of pathogens such as Methicillin-resistant Staphylococcus aureus in hospital contexts (Coll et al. 2017). Such information can allow targeted interventions that identify and remove potential sources of infection. However, establishing individuals, communities or areas as disproportionately implicated in transmission events risks breaching their privacy and introduces the potential for discrimination (Geller et al. 2014, Johnson and Parker 2019).
In the case of Covid-19, individual superspreaders were highlighted within reporting on the early spread of the disease, such as the U.K. businessman thought to have contracted the virus on a business trip to Singapore in February 2020 and identified in news reports, or 'Patient 31', thought to have been at the centre of a large cluster of infections at a church in Daegu, South Korea. Such responses show how the public health interest in genomic information can exist in tension with, and have consequences for, individual or community desires for privacy. The response of both Patient 31 and the Shincheonji Church of Jesus to public health authorities prompted widespread public condemnation of both (Yang 2022). This had consequences for the privacy of the individual, whose travel details were publicly released when she refused to participate in contact tracing, and wider implications for her religious community: maps and apps were developed to identify Shincheonji communities around the country, and public officials were identified as members of the increasingly controversial religious group on the basis of their SARS-CoV-2 infection.
The balance between individual privacy and public health benefit is a persistent tension in infectious disease management, and particularly in pandemic conditions. However, the ongoing community implications of the use of phylogenetic analyses are further elaborated in Molldrem and Smith's (2020) discussion of equity and justice concerns related to the use of genetic sequence information initially gathered to inform surveillance of drug resistance. They describe how, in the U.S.A., this information has been used to identify people living with HIV in clusters of others with genetically similar strains of the virus. Molldrem and Smith (2020) suggest that the use of genetic data in this way represents a disproportionate impact of a public health intervention on marginalized groups and that its implementation relies on a potentially unjustified optimism about the public health potential of the approach. In addressing the potential consequences of genomic surveillance for communities, Juengst and van Rie (2020) propose the value of an approach which targets public health interventions by the importance of the disease burden in the targeted group, that shares authority with the groups targeted, that uses the minimum possible data and that, critically, is proactively transparent to avoid 'inaccurate public attributions of responsibility' (p. 2).
It is unlikely that phylogenetics alone can establish the likelihood of transmission between two individuals as there is too much uncertainty to establish this without additional demographic or epidemiological information being provided. As a result, while a phylogenetic analysis can exclude the possibility of transmission between two persons it cannot be used to prove whether a suspect infected a complainant (CPS n.d.). Nevertheless, there remains a substantial research interest in mapping and modelling transmission networks that may enable patterns of infection to be established and genetic data rarely exist in a vacuum without other data being available (Molldrem and Smith 2020).
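The asymmetry described above, in which sequence comparison can help exclude a transmission link but cannot prove one, can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions rather than a method used in any of the studies cited: the sample names, the toy ten-base sequences and the threshold of two differences are all hypothetical, and real analyses rely on full-genome alignments, pathogen-specific mutation rates, sampling dates and epidemiological context.

```python
from itertools import combinations


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count differing positions between two aligned sequences.

    Positions where either sequence holds a non-ACGT character
    (for example a gap or an ambiguous 'N') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def interpret(distance: int, threshold: int) -> str:
    """Map a distance to a deliberately cautious interpretation."""
    if distance > threshold:
        return "direct transmission unlikely"
    # Similar genomes never show who infected whom: both people may
    # share a common source, or belong to a longer unsampled chain.
    return "link cannot be excluded (not proof of transmission)"


def pairwise_report(samples: dict, threshold: int) -> dict:
    """Return {(name_a, name_b): (distance, interpretation)} for all pairs."""
    report = {}
    for name_a, name_b in combinations(sorted(samples), 2):
        distance = snp_distance(samples[name_a], samples[name_b])
        report[(name_a, name_b)] = (distance, interpret(distance, threshold))
    return report


# Hypothetical toy data: not real pathogen sequences.
SAMPLES = {
    "case_1": "ACGTACGTAC",
    "case_2": "ACGTACGTAC",
    "case_3": "ACGTTCGTAC",
    "case_4": "TCGAACTTGC",
}
THRESHOLD = 2  # assumed cut-off, chosen only for illustration

if __name__ == "__main__":
    for pair, (distance, verdict) in pairwise_report(SAMPLES, THRESHOLD).items():
        print(pair, distance, verdict)
```

Note that the sketch never returns a verdict of "transmission confirmed": the most a small distance can support is that a link has not been ruled out, which is why additional demographic or epidemiological information is needed before any attribution of responsibility.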
There are multiple examples of legal cases in which phylogenetics has been used to support or contest claims of HIV spread, documented in detail by Abecasis et al. (2018). As early as 1991, genetic analysis was at the heart of a civil legal case in Florida related to allegations of malpractice against a dentist on the part of patients who had been infected with HIV, and of the U.S. Centers for Disease Control's conclusions that he had been the source of infection. By the late 1990s, genetic analysis was used to establish sources of HIV communication in a Scottish prison, leading to the first prosecution for reckless HIV transmission in Scotland (Abecasis et al. 2018). In other cases, phylogenetics has been used to investigate deliberate infection in rape and sexual assault cases and cases of reckless community transmission. Beyond HIV, viral phylogenetics was used alongside medical records and interviews in the conviction of an anaesthetist accused of infecting hundreds of patients in Valencia, Spain with hepatitis C (González-Candelas et al. 2013).
Such efforts to assign both legal and moral responsibility for spreading disease require caution (cf Johnson and Parker 2019). This is doubly the case given the potential consequences of introducing sources of apparent certainty into complex epidemiological narratives. In discussing the Valencian hepatitis C case, for example, Vandamme and Pybus (2013) caution that the cultural weight afforded to genetic information may lead to an over-emphasis by jurors of phylogenetic evidence, echoing longstanding concerns about the so-called CSI effect in wider discussions of forensic genetics (Klentz et al. 2020). Although, as Klentz et al. (2020) argue, it is unclear whether such an effect exists, Vandamme and Pybus' concern reflects the possibility that perceptions of pathogen genomics reflect broader narratives of the power of genetics and genetic exceptionalism, not least in the aftermath of viral genomics' pandemic visibility.

Metagenomics and the challenge of incidental/additional findings
Targeted laboratory assays can allow the confirmation of diagnosis, the tracking of infection chains and the identification of pathogen variants. They may be of less value, however, in the case of previously unknown pathogens or those that were not included in original testing algorithms. As a result, it has been argued that genomic surveillance can be improved by the use of methods that are pathogen-agnostic. Such metagenomics approaches involve the sequencing and analysis of all nucleic acids found in a sample. This may arise as the result of a wide pursuit of information about pathogens, or even where the initial focus of sample collection was for the analysis of the human/host genome. The most significant recent implementation of a metagenomic approach was in the identification of SARS-CoV-2 in Wuhan in late 2019. The existence of the novel coronavirus was established by Zhu et al. (2020) using high throughput sequencing of bronchoalveolar fluid samples from patients with the novel respiratory disease. However, metagenomics has a potential range of applications in public health and clinical microbiology (Chiu and Miller 2019).
Metagenomics again, however, presents ethical challenges as applied to communities and individuals. At an individual level, one key concern in this context is the potential for incidental or additional looked-for findings. For example, Hall and colleagues (2015) present the hypothetical case of a stool sample sent to a medical laboratory for norovirus testing and analysed using a metagenomic approach. As they point out, if the patient in this case is HIV-positive, then it is probable that the metagenomics data will show the presence of HIV sequences. This presents challenges for patients and clinicians, who need to be aware of the potential for such findings in advance and should make decisions about what can and should be reported and, if such reporting is not desired, filtered from the data. The possibility of additional findings, and what to do with information that may be health-relevant but not actionable, present important challenges for major genomics initiatives as they move into metagenomics (cf Geller et al. 2014, Gardy and Loman 2018).
There are few examples, to date, of large-scale genomics initiatives engaging with these questions. However, an analysis of these challenges, and a useful policy for engaging with them, were set out by a research group working on data from the U.K.'s 100,000 Genomes project (Magiorkinis et al. 2019). It was recognized that analysis of all nucleic acid present in a biological sample using high throughput sequencing had the potential to identify a range of microbial nucleic acids with potential clinical significance in a manner that 'bypasses the steps of clinical scrutiny and targeted testing' (Magiorkinis et al. 2019, p. 1). The project convened a group of experts in medical microbiology and infectious disease, sequencing, bioinformatics and bioethics to examine the potential for incidental microbial findings arising from the analysis of human genomic DNA metagenomic data, and to decide on a strategy for communicating such findings. The group proposed that some incidental microbial findings that might impact the health and wellbeing of participants and families could be considered as high-priority findings, and recommended that these should be assessed by a clinical microbiologist for reporting. Such high-priority findings, they proposed, should initially be limited to a small number of pathogens: HIV, hepatitis B virus (HBV), hepatitis C virus (HCV) and Human T-Lymphotropic Virus.
There is growing interest in the public health potential of metagenomics, its role in understanding the shared microbiome and mapping diverse microbial environments to provide insight into the commensal spaces of global health and detect emerging pathogens. One high-profile example is that of the 2015 analysis of samples from surfaces across the New York City subway system, which demonstrated the potential for metagenomics to identify an array of bacteria and viruses, as well as capture the predicted ancestry of human DNA (Afshinnekoo et al. 2015). In this study, DNA was identified from organisms from over 1500 taxa, but it was the apparent presence of Yersinia pestis (bubonic plague) and Bacillus anthracis (anthrax) that captured attention, even as the authors emphasized uncertainties about whether these were, in fact, present, the lack of association with reported disease and their conclusion that the subway is 'primarily a safe surface' (2015, p. 81). The reporting of the study highlighted the ability of metagenomics to potentially associate specific sites, cities or groups with pathogens and disease risk and, again, the importance of communication around the implications of genomic analyses.
The specificity of analysed metagenomic data to individuals and sites, not least the home, has implications for privacy that are likely to become more important as metagenomics finds wider uses. The example of wastewater analysis is again illustrative for applications of metagenomics in addition to the targeted testing of wastewater for SARS-CoV-2. Metagenomic surveillance of wastewater from hospitals has been proposed as a means of monitoring the presence of antimicrobial resistance (Karkman et al. 2018), while water companies in the U.K. have entered into commercial partnerships with environmental genetic sequencing companies to monitor bacterial populations and the performance of sewage treatment (Yorkshire Water 2022). As such metagenomic tools expand and their findings are reported, care will be needed to ensure that communities whose wastewater is monitored in this way are informed of the potential implications, that it benefits those communities most in need, and that any subsequent public health response does not disproportionately affect and stigmatize already marginalized or disadvantaged groups.

Host genomics: resistance and vulnerability
The genomics of infectious disease are not simply those of pathogens. For many diseases, the genomes of vectors such as ticks or mosquitos are also highly pertinent, and come with distinctive ethical questions, such as those associated with gene drives (Annas et al. 2021). While these relationships lie beyond the scope of the current paper, we focus on another set of genomes, those of the host, and the role of variation in this genome in shaping the relationship between pathogen and host. In particular, we are interested in the consequences for individuals and health systems of understanding susceptibility to infection and its consequences as genetically influenced.
Numerous examples of the potential importance of host genetics have been identified (Tian et al. 2017). Studies of Covid-19 host genetics have suggested variants that predispose individuals to severe consequences of infection and provide insights into potentially 'druggable' mechanisms (Kousathanas et al. 2022). Other host genetic characteristics may provide protection against infection, such as HBB, which can reduce susceptibility to infection by the malaria parasite, or FUT2, which confers some protection against norovirus infection. In the case of HIV/AIDS, it has been known since the mid-1990s that individuals who are homozygous for an allele of the CCR5 gene called Δ32 (carried by around 1% of Northern European ancestry individuals) have substantial resistance to infection with the dominant R5 strain of HIV (Samson et al. 1996, Greely 2021). CCR5 is involved in virus entry and cell-to-cell spread and has been a key target for the development of drugs to protect those exposed to infection, and for commercial stem cell and gene therapy trials (Kirksey 2021). Notoriously, CCR5 also formed the focus of the first known case of human germline gene editing in 2018, reportedly in an attempt to protect against potential future HIV infection and the associated stigma and discrimination (Greely 2021). As in the case of genomic surveillance, this emphasizes how applications of pathogen genomics are shaped both by individual and community experiences of infectious disease in social, cultural and historical context, in which the attribution of vulnerability often exacerbates persisting injustices (Ganguli-Mitra et al. 2020).
The importance of situating the ethics of host genetics in such socio-historical context is reflected in the case of Covid-19. While studies of host genomics offer value in understanding why some individuals are particularly susceptible to the consequences of infection, and in highlighting potential biological pathways for drug development, differences in vulnerability to infection and susceptibility to disease are most affected by existing health inequalities associated with the distribution of social and economic resources (Dahlgren and Whitehead 2021). Nevertheless, it has been suggested that information about vulnerability and susceptibility might have practical value, for example in identifying health workers most at danger of severe infection, or those who should be excluded from challenge trials (Gyngell et al. 2021). In that sense, host genomics information might be considered comparable to other uses of genetic information for workplace safety. However, such uses also raise the possibility that some individuals may be unreasonably excluded, particularly where genetic influences on vulnerability are only part of a wider picture, where the ability to avoid exposure is limited, and where the infectious agent is itself in a process of change (Milne 2020). In the case of CCR5, it has also been suggested that informing individuals that their CCR5 genotype might mean they are at lower risk of HIV infection may result in riskier behaviour, although there appears to be no evidence of this (Walker et al. 2021).
In addition, Walker et al. (2021) highlight further implications of host genetics in relation to the likelihood that an individual will transmit disease and the potential restrictions on individual freedoms that may result. As described above, concerns about 'superspreaders' have come to form part of the narrative of infectious disease outbreaks including Covid-19. Walker et al. (2021) suggest that if host genetic variations were associated with an increased risk of transmitting infection, public health authorities might potentially require such genetic information to be reported for cases of that infection, or that this information could be used to preemptively identify those most likely to spread infection. Such action that disproportionately burdens some individuals would clearly raise concerns related to autonomy and justice and require justification, while also potentially creating a reluctance to test, making outbreak control more challenging.
Finally, and along similar lines, the ability to identify individuals who may be more or less vulnerable to infection or susceptible to disease presents possibilities in terms of precision public health and the more efficient targeting of therapies and preventative strategies. In the case of hepatitis C, for example, individuals who are thought more likely to clear the virus from their bodies or respond well to interferon-based treatments might be deprioritized from the phenomenally expensive curative therapies (Walker et al. 2021). As Geller and colleagues (2014) point out, in many such circumstances it would clearly be unethical to withhold or ration treatment on the basis of genetic information, for example rationing access to HIV medication on the basis of CCR5 homozygosity. Furthermore, as Fouzia et al. point out, pharmacogenetic responses to infectious disease treatments are not equally distributed: lower responses to interferon-based hepatitis C therapies are found at higher frequency in populations of African descent than those of European descent, while as many as 50% of individuals of African descent carry an allele of CYP2B6 that reduces the metabolism of the antiviral efavirenz (Fouzia et al. 2020).

Conclusions
Applications of genomics to infectious diseases have considerable public health potential, but attention to the wider social and ethical context in which these tools are applied is needed in order to realize this potential and ensure their trustworthy use. Because genomics was a core public health measure in the Covid-19 pandemic, the framing of, and response to, ethical questions associated with infectious disease genomics, particularly those related to privacy and the tracing of infection pathways, have taken place in the context of the urgent need to react to the threat presented by SARS-CoV-2. In this paper, we have begun to outline some of the wider challenges faced by this field as it continues to develop in areas ranging from the targeted application of genomic surveillance to the nature of vulnerability and susceptibility associated with the encounter between pathogens and hosts. There remains substantial work to establish robust frameworks for responsible use of pathogen genomics in global health in both emergency and ongoing contexts. There is, though, considerable opportunity to learn from the experience of the Covid-19 pandemic alongside the longer history of both human genetics and infectious disease epidemiology to develop equitable, privacy-preserving and proportionate approaches to the implementation of pathogen and host genomics.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Notes on contributors
Richard Milne is a sociologist of science, technology and medicine. He is Head of Research and Dialogue at Wellcome Connecting Science and Deputy Director of the Kavli Centre for Ethics, Science and the Public at the University of Cambridge.
Christine Patch is Principal Staff Scientist in Genomic Counselling in the Engagement and Society group, Wellcome Connecting Science. She was formerly Clinical Lead for Genetic Counselling at Genomics England and is a past President and Board Member of the European Society of Human Genetics.

ORCID
Richard Milne http://orcid.org/0000-0002-8770-2384
Christine Patch https://orcid.org/0000-0002-4191-0663