Cohort study protocol: Bioresource in Adult Infectious Diseases (BioAID)

Introduction: Infectious diseases have a major impact on morbidity and mortality in hospital. Microbial diagnosis remains elusive for most cases of suspected infection, which affects the use of antibiotics. Rapid advances in genomic technologies combined with high-quality phenotypic data have great potential to improve the diagnosis, management and clinical outcomes of infectious diseases. The aim of the Bioresource in Adult Infectious Diseases (BioAID) is to provide a platform for biomarker discovery, trials and clinical service developments in the field of infectious diseases, by establishing a registry linking clinical phenotype to microbial and biological samples in adult patients who attend hospital with suspected infection.
Methods and analysis: BioAID is a cohort study which employs deferred consent to obtain an additional 2.5 mL RNA blood sample from patients who attend the Emergency Department (ED) with suspected infection when they undergo peripheral blood culture sampling. Clinical data and additional biological samples including DNA, serum and microbial isolates are obtained from BioAID participants during hospital admission. Participants are also asked to consent to be recalled for future studies. BioAID aims to recruit 10,000 patients from 5-8 sites across England. Since February 2014, >4000 individuals have been recruited to the study. The final cohort will be characterised using descriptive statistics, including information on the number of cases that can be linked to biological and microbial samples to support future research studies. Ethical approval and section 251 exemption have been obtained for BioAID researchers to seek deferred consent from patients from whom an RNA specimen has been collected. Samples and meta-data obtained through BioAID will be made available to researchers worldwide following submission of an application form and research protocol.
Conclusions: BioAID will support a range of study designs spanning discovery science, biomarker validation, disease pathogenesis and epidemiological analyses of clinical infection syndromes.


Introduction
Adult infectious diseases have a major impact on morbidity and mortality in hospitals worldwide [1][2][3]. These trends are driven by demographic changes associated with an ageing population, widespread use of immunosuppressive therapies and complex surgery in routine healthcare, and the emergence of new and drug-resistant pathogens. Although the use of molecular diagnostics has brought advances in the management of infectious diseases [4,5], the microbial cause of most unselected febrile illnesses in hospital remains elusive. Consequently, empiric treatment decisions are based on clinical and epidemiological knowledge of infectious disease syndromes. Technological advances in genomics, transcriptomics, proteomics and metabolomics combined with high-quality data on clinical phenotype have great potential to improve the diagnosis and management of infectious diseases. Bio-resources have already been established for specific infections such as HIV and HCV [6,7]. To our knowledge, no other studies have set out to recruit unselected patients presenting acutely to hospital with suspected infection. Few studies have investigated diagnosis of infection in the emergency department (ED) population [8-10], reflecting the difficulty of obtaining samples and consent in the acute setting. This difficulty creates an imbalance in the capacity to undertake research in the field of acute infection. Innovation in this field is contingent on access to high-quality clinical data linked to prospective collection of biological samples.
BioAID has been established as a registry of unselected patients who present to the ED with suspected infectious diseases. The Bioresource is part of the Department of Health's National Institute for Health Research Bioresource Programme which provides a registry of healthy volunteers and patients who have been consented for recall by virtue of genotype or phenotype to participate in secondary studies.

Methods and analysis
Aim of the study
BioAID has established a network of UK hospitals which provide the infrastructure for infectious disease research by linking clinical phenotype to microbial and biological samples. This will support the development and evaluation of novel diagnostic and risk stratification tools based on the application of emerging technologies for biomarker discovery and the conduct of clinical trials and service developments. The Bioresource also provides a platform for studies which investigate the genetic and immunological basis for host susceptibility to infectious diseases, which is likely to have a bearing on vaccine development.

Choice of study design
A cohort design with deferred consent was selected due to the need to sample patients prospectively, the difficulties associated with obtaining genuine informed consent in the ED, and our desire to sample an unselected group of patients with suspected infectious diseases. Clinical data were obtained by a combination of medical note review and extraction from electronic health records, which was deemed to be the most efficient and cost-effective method of obtaining detailed and reliable information from a large number of participants across multiple sites.

Patient recruitment and consent
Individuals aged > 16 years with suspected infection are eligible for inclusion in BioAID provided they undergo peripheral blood sampling for microbial culture, which is part of routine clinical assessment, contemporaneous with collection of an additional 2.5 mL RNA blood sample in the ED (Figure 1). Within the following 72 hours, clinical research staff approach all participants from whom an RNA sample has been collected in order to invite them to take part in the study, provide detailed information on the study, and obtain informed written consent. Informed consent is sought by telephone from patients who have not been admitted to hospital but have provided an RNA sample, as well as from those who have been admitted but discharged before consent could be obtained. Importantly, informed consent is also sought by telephone from the next of kin of patients who have died but provided an RNA sample at the time blood cultures were drawn. Consent is also sought for participants to be recalled to participate in future research studies. The acceptability of this approach was investigated at University College London Hospital (UCLH) in the FEVER study [11], and it was deemed to be acceptable to both patients and their relatives.
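The consent pathways described above can be summarised in a short Python sketch. It is illustrative only and is not part of the BioAID protocol or software: the status categories, function names and returned strings are hypothetical labels chosen for this example, and only the 72-hour window and the three routes (in person, telephone to the patient, telephone to the next of kin) come from the text.

```python
from datetime import datetime, timedelta
from enum import Enum

# 72-hour window for seeking deferred consent, as stated in the protocol.
CONSENT_WINDOW = timedelta(hours=72)


class Status(Enum):
    """Hypothetical patient status categories used only for this sketch."""
    INPATIENT = "inpatient"
    NOT_ADMITTED = "not_admitted"
    DISCHARGED = "discharged"
    DECEASED = "deceased"


def consent_route(status: Status) -> str:
    """Return how deferred consent would be sought for a given status."""
    if status is Status.INPATIENT:
        return "written consent in person"
    if status is Status.DECEASED:
        return "telephone consent from next of kin"
    # Not admitted, or discharged before consent could be obtained.
    return "telephone consent from patient"


def consent_deadline(sample_time: datetime) -> datetime:
    """Latest time at which deferred consent should be sought."""
    return sample_time + CONSENT_WINDOW
```

For example, `consent_route(Status.DISCHARGED)` returns the telephone route, and `consent_deadline` adds 72 hours to the time of RNA sampling.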
Collection and retrieval of biological samples
Serum samples obtained as part of routine care at the time of admission, but surplus to diagnostic requirements, are also retrieved for the BioAID collection. DNA and an additional RNA sample are collected during the patient's admission to hospital, usually within 72 hours (Table 1). RNA samples will be used primarily to evaluate host responses to infection at the transcriptional level. The paired RNA samples will be used to investigate whether the timing of the sample affects blood transcriptional profiles. Serum samples will be available to evaluate host responses to infection by proteomic and metabolomic profiling, as well as serological assays and quantitation of cytokine responses. DNA extraction provides the opportunity for future research studies investigating host genetic variants associated with the response to infection. Microbial isolates derived from specimens obtained during admission are retrieved from the laboratory and stored, laying the foundation for future studies of infection surveillance, diagnosis, pathogen evolution and genomics, disease pathogenesis and host-pathogen interaction.
Data are transcribed by the research team into the research database, which has been developed using REDCap electronic data capture tools hosted at UCLH and ICHT. REDCap (Research Electronic Data Capture) is a secure, web-based application designed to support data capture for research studies, providing 1) an intuitive interface for data entry; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for importing data from external sources [12].
Statistical analysis and sample size
BioAID aims to recruit 10,000 patients over 8 years across 5-8 participating centres. These estimates are based on pilot data from the FEVER study at UCLH [11]. Studies of novel genetic variants associated with infection-related phenotypes have commonly required sample sizes ranging from 1000 to 5000 individuals in order to detect clinically meaningful associations between variant and phenotype [13]. Larger sample sizes would be expected to enable detection of smaller genetic effects (risk ratios < 2.0). This Bioresource would therefore be expected to contribute to such studies assessing susceptibility to infection or survival following sepsis. Based on preliminary data from the Bioresource, we anticipate recruiting approximately 2500 cases of respiratory tract infection, 2000 cases of urinary tract infection syndromes and 1000 cases with bacteraemia.
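The relationship between effect size and required sample size can be illustrated with a minimal Python sketch. It is not the calculation used to plan BioAID: it assumes a simple two-group comparison of proportions with the standard normal-approximation formula, and the baseline risk, significance level and power shown are illustrative assumptions rather than values from the protocol.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(baseline_risk: float, risk_ratio: float,
                alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of individuals needed per group to detect a given
    risk ratio when comparing two proportions (two-sided test).

    baseline_risk is the outcome risk in the reference group; the risk in the
    comparison group is baseline_risk * risk_ratio. All inputs are assumptions
    supplied by the user, not BioAID estimates.
    """
    p1 = baseline_risk
    p2 = baseline_risk * risk_ratio
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("risks must lie strictly between 0 and 1 and differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Smaller risk ratios require progressively larger groups.
    for rr in (2.0, 1.5, 1.2):
        print(rr, n_per_group(baseline_risk=0.10, risk_ratio=rr))
```

Under these assumptions the required group size grows quickly as the risk ratio approaches 1. A genetic association study would typically apply a much stricter significance threshold to account for multiple testing, which can be explored by passing a smaller `alpha`.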
The Bioresource has been established to address a range of research questions from discovery science through to clinical trials, and consequently we have not stipulated a single outcome measure. We will first summarise the clinical and epidemiological characteristics of the cohort, specifying the primary outcome as the proportion of individuals in whom a microbial diagnosis is achieved, stratified by clinical infection syndrome (respiratory, urinary tract, skin and soft tissue, systemic with no focus of infection). Secondary outcomes will include: estimating the sensitivity, specificity, positive and negative predictive values of a syndromic diagnosis of infection in the ED compared to diagnostic coding at discharge from hospital; identification of clinical and epidemiological factors associated with adverse outcomes including length of stay, admission to intensive care and death; and predictors of bacteraemia. Future studies based on data and/or samples from this cohort will be required to submit a full proposal and analysis plan before being granted access to data and/or samples.
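The diagnostic accuracy measures named among the secondary outcomes can be computed from a 2x2 table, as in the following Python sketch. It treats diagnostic coding at discharge as the reference standard, as the text describes; the class and function names are hypothetical, and the counts in the usage note are invented purely to show the arithmetic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TwoByTwo:
    """Counts comparing ED syndromic diagnosis with discharge coding."""
    tp: int  # ED diagnosis positive, discharge coding positive
    fp: int  # ED diagnosis positive, discharge coding negative
    fn: int  # ED diagnosis negative, discharge coding positive
    tn: int  # ED diagnosis negative, discharge coding negative


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def accuracy_measures(t: TwoByTwo) -> dict:
    """Sensitivity, specificity, PPV and NPV of the ED syndromic diagnosis."""
    return {
        "sensitivity": _ratio(t.tp, t.tp + t.fn),
        "specificity": _ratio(t.tn, t.tn + t.fp),
        "ppv": _ratio(t.tp, t.tp + t.fp),
        "npv": _ratio(t.tn, t.tn + t.fn),
    }
```

For instance, `accuracy_measures(TwoByTwo(tp=80, fp=10, fn=20, tn=90))` gives a sensitivity of 80/100 and a specificity of 90/100. In the cohort these measures would be calculated separately for each clinical infection syndrome.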

Patient and public involvement
The BioAID protocol was developed with advice from the UCL Partnership Public Engagement Patient Panel. This group provided particular input on the issue of deferred consent and have continued to be involved as members of the BioAID Advisory Board.

BioAID governance
BioAID is overseen by an executive committee comprising the PIs from each participating site. The committee meets quarterly to review progress and recruitment and to process applications for access to data and samples.

Ethical approvals
A specific aim of BioAID is to obtain RNA samples before treatment commences, because transcriptional profiles can be modified by antimicrobial or other treatments and there is increasing interest in host responses as part of diagnosis in infection [14][15][16]. Given the clinical necessity to initiate treatment for patients urgently (typically within 1-2 hours), and the fact that patients may have impaired consciousness or be distressed, it is rarely possible to obtain genuine informed consent from patients before collecting blood for RNA. Where it is possible, restricting recruitment to those patients would introduce significant biases. Ethical approval and section 251 exemption have therefore been obtained for BioAID researchers to seek deferred consent from patients from whom an RNA specimen has been collected (or from their relatives/nominated consultee) within 72 hours of blood sample collection (REC ref: 14/SC/0008).

Governance, data protection and data management
The BioAID dataset is pseudo-anonymised. Participants are allocated a unique identification number (UIN) and at each participating site a separate electronic and hard copy file is maintained linking the UIN with the patient's hospital number, other identifiers and contact details. The local Principal Investigator has access to the linkage codes. The live BioAID database is held within the NHS firewall.
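The separation between the research record and the linkage file can be sketched in Python as follows. This is a simplified illustration, not the BioAID implementation: the field names, the UIN format and the use of an in-memory dictionary as the linkage file are all assumptions made for the example.

```python
import secrets

# Hypothetical names for fields that identify a patient directly.
IDENTIFYING_FIELDS = ("hospital_number", "name", "phone")


def new_uin(linkage: dict, site_code: str) -> str:
    """Generate a random identifier that is not already in the linkage file."""
    while True:
        uin = f"{site_code}-{secrets.token_hex(4).upper()}"
        if uin not in linkage:
            return uin


def pseudonymise(record: dict, linkage: dict, site_code: str) -> dict:
    """Split a record into a research copy keyed by UIN and a linkage entry.

    The linkage dictionary stands in for the separate file held at each site;
    only the returned research copy would enter the study database.
    """
    uin = new_uin(linkage, site_code)
    linkage[uin] = {k: v for k, v in record.items() if k in IDENTIFYING_FIELDS}
    research = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
    research["uin"] = uin
    return research
```

Keeping the linkage entries in a store that is separate from the research records means that re-identification requires access to both, which is the property that the site-level linkage file is intended to provide.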

Data management and access
The BioAID database will be curated by the research team. Histograms will be plotted to investigate the distribution of continuous variables and rules will be applied to identify likely outliers based on laboratory reference ranges and errors in dates and age. Samples collected through the Bioresource and associated meta-data will be made available to researchers worldwide. To qualify for access, an application form including a research protocol should be submitted to the study coordinator for consideration by the BioAID Executive Committee (m.noursadeghi@ucl.ac.uk). Interested researchers are expected to cover the processing costs of sample aliquots from the Bioresource. Access to samples will be subject to a material transfer agreement. All studies using BioAID data and samples will be required to submit annual reports to the Executive Committee and a copy of all the derived data must be deposited within the BioAID database. Publications arising from use of the Bioresource are expected to acknowledge the support of the NIHR Biomedical Research Centres and to recognise BioAID investigators.
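The rule-based curation checks described above might look like the following Python sketch. The field names and numeric limits are placeholders invented for illustration and are not BioAID's actual rules; in practice the limits would be derived from laboratory reference ranges. The lower age bound reflects the eligibility criterion of age > 16 years.

```python
from datetime import date

# Placeholder plausibility limits; real limits would come from laboratory
# reference ranges and are not specified in the protocol.
PLAUSIBLE_RANGES = {
    "crp_mg_l": (0.0, 700.0),
    "wbc_10e9_l": (0.0, 500.0),
}
MAX_AGE_YEARS = 120  # assumed upper bound for a plausible age


def flag_record(record: dict) -> list:
    """Return a list of data-quality flags for one participant record."""
    flags = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is not None and not low <= value <= high:
            flags.append(f"{field} outside plausible range")
    age = record.get("age_years")
    if age is not None and not 16 < age <= MAX_AGE_YEARS:
        flags.append("age outside eligible or plausible range")
    admitted = record.get("admission_date")
    discharged = record.get("discharge_date")
    if admitted and discharged and discharged < admitted:
        flags.append("discharge date precedes admission date")
    return flags
```

A record such as `{"age_years": 45, "crp_mg_l": 50.0, "admission_date": date(2015, 1, 1), "discharge_date": date(2015, 1, 5)}` produces no flags. Flagged records would be reviewed against the source notes rather than altered automatically.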

Dissemination of findings
Anonymised data will be made available at the time of peer-reviewed publications, or by 12 months after completion of the project. Raw sequencing, genotyping data and linked metadata will be made available through quality-controlled public repositories to maximise their use by the scientific community. Specifically, we will use the European Bioinformatics Institute ArrayExpress repository for genome-wide transcriptomic data, and the European Genome-phenome Archive for genotypic and phenotypic data. Processed and analysed data sets will also be made available through supplementary online content associated with peer-reviewed scientific publications. All new computational analysis software that we develop in the course of this project will be made publicly available on the Bioconductor platform. Research findings will be communicated to the scientific community via open access peer-reviewed publications and presentation at conferences. BioAID investigators will work with the UCL Partnership Public Engagement Patient Panel to disseminate research findings to patients and the public.

Study status
Ethical approval was granted for BioAID in February 2014 and recruitment began shortly afterwards. To date, > 4000 participants have been recruited across two NHS Trusts; a third site will join in 2018.

Conclusions
The purpose of BioAID is to support a large number of collaborative projects and associated research publications. To date, BioAID has been used primarily for the development and validation of transcriptomic gene signatures for bacterial infection [7], but the sample collection also provides unprecedented opportunities to evaluate proteomic and metabolomic biomarkers. In the future, there is scope to use BioAID as a recruiting framework for inpatient clinical trials or as a means of identifying candidates for studies investigating host susceptibility to infection or host-pathogen interactions. As the number of sites participating in BioAID increases, we anticipate a broad range of applications to use this dataset.

Data availability
No data are associated with this article.

Competing interests
No competing interests were disclosed.

Grant information

Referee report
The BioAID resource provides a much-needed biocollection of well-curated, prospectively collected biological samples from patients presenting to one of several large teaching hospitals with a suspected infection syndrome. Importantly, samples will be obtained at presentation alongside blood cultures, often prior to the administration of antimicrobial therapy. It aims to provide researchers with access to large numbers of samples, providing a route to meaningful studies of biomarker identification, host genetics and pathogen identification. The scale of the resource will mean that genetic associations with outcome will be sufficiently powered for meaningful interpretation.
Judging by the listed authors, the design and the stated broad aims, the current focus appears to be to address bacterial infection, initially aiming to identify 2500 cases of respiratory infection, 2000 cases of UTI and 1000 cases of bacteraemia. However, the potential for identification of other pathogens, for example viral pathogens, is also present.

Minor suggestions
Unfortunately, the case reporting form (Supplementary Figure 1) is not available for review (weblink error) but is likely to provide essential follow-up information on patient demographics and symptoms as well as outcome. I assume it will also include a travel history. While it is clear that "microbial isolates and sera obtained from the patient during admission" will be retrieved, there might also be an opportunity to gather other relevant samples e.g. CSF or urine which could be used for pathogen discovery at a later date. Sputum and respiratory samples from the 2500 cases of respiratory infection would allow a search for viral as well as bacterial pathogens and identification of co-infections.
The ethics for the collection of samples is carefully considered and appropriate with prospective collection and retrospective approval (in the case of death by the next-of-kin). I wonder, however, if the possibility of returning results to the relevant patient has been discussed. I am referring, in particular, to HIV (and other chronic viral infections such as HCV and HTLV) which are likely to be identified if RNA sequencing is planned. There may be a role for returning such a result to the patient following clinically validated testing.

Summary
In summary, this is an exciting, large-scale, unique and much needed bioresource with the potential to unite advances in metabolomics and genomics with clinical pathogen diagnosis, host susceptibility and identification of predictive biomarkers of disease. It is highly likely to provide a platform for the development of multiple meaningful studies and will facilitate the introduction of technological advances into the health service to directly improve patient care.

Is the rationale for, and objectives of, the study clearly described? Yes

Are sufficient details of the methods provided to allow replication by others? Yes
Are the datasets clearly presented in a useable and accessible format? Yes

Competing Interests: No competing interests were disclosed.
Referee Expertise: Next generation sequencing, clinical infectious diseases
I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.