The Covid Response Study (COVRES, NCT05548829) aims to carry out an integrated multi-omic analysis of factors contributing to host susceptibility to SARS-CoV-2 in a cohort of 1000 patients from the geographically isolated island of Ireland. Due to differences in site, governance, and timelines, the protocol below describes the study carried out in Northern Ireland (NI-COVRES) by Ulster University; the Republic of Ireland component (Trinity College Dublin) will be described separately.
Figure 1 shows an overview of the main stages and timeline, with data collected for each participant (n = 519) on: i) Disease status ii) Genome iii) Transcriptome iv) Proteome v) Methylome vi) Microbiome vii) Immune response viii) Patient history ix) Mental health x) Electronic care record, and longitudinally for n = 40 at 1, 3, 6 and 12 months post positive PCR to assess persistent inflammatory and immune responses.
Status and timeline of the study
Main-stage recruitment was completed in March 2021, except for the longitudinal cohort (ongoing); integration of ECR data was completed in January 2022, and at the time of writing omics samples are being processed (Fig. 1).
1. Ethical approval
Standard operating procedures (SOPs) and participant response questionnaires were prepared, including SOPs for saliva sample kit preparation, blood collection and processing, downstream sample processing, website management, data protection, and participant contact. The COVRES study was subsequently approved by the Health and Care Research Wales Ethics service on 14 July 2020 (REC ref 20/WA/0179).
2. Social Media outreach
Social media content (Twitter, Facebook) and webpage visuals, including a range of infographics and short explanatory texts, were designed by the project Principal Investigators with input from recovered patients. Information was circulated to local and national news outlets (TV, radio, newspaper) across Northern Ireland.
3. Participant recruitment with inclusion and exclusion criteria
Patients had to be >18 years of age but could have any body mass index (BMI) or ethnic origin. Patients were classified as hospitalised if they attended or were admitted to hospital within 14 days of a positive PCR result. Patients were also classified on the World Health Organisation (WHO) scale (19). Note that WHO scores reflect the highest score reached during infection, regardless of hospitalisation status. For example, a patient may have an overall WHO score of 5 and still be classified as non-hospitalised because they attended hospital >14 days after their positive PCR result. Patients were excluded if <18 years of age or if any intellectual disability was present. After receiving a Participant Information Sheet (PIS), patients interested in participating gave written informed consent and were enrolled. A self-report questionnaire established demographic information, lifestyle choices, family history of clinical disorders, and COVID-19 severity and symptoms. This was followed by a general health questionnaire (GHQ-12) to help ascertain the patient’s mental health after COVID-19 infection (Figure 2). These data were securely digitised into a bespoke database, CovresNIdb, built on the REDCap platform (20) to comply with the terms of the ethical approval, the Human Tissue Act, and the General Data Protection Regulation (GDPR). This process is being repeated for cohort b (prospective, n = 40) with stricter timelines (1, 3, 6 and 12 months).
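The hospitalisation rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's own code: the function name and labels are hypothetical, and it assumes the 14-day window is inclusive and applies to hospital contact on either side of the PCR date, which the protocol does not specify.

```python
from datetime import date
from typing import Optional

HOSPITAL_WINDOW_DAYS = 14  # window stated in the protocol text


def classify_hospitalisation(pcr_positive: date,
                             hospital_contact: Optional[date]) -> str:
    """Label a participant as hospitalised or non-hospitalised.

    hospital_contact is the date of attendance or admission, or None
    if the participant never attended hospital.
    """
    if hospital_contact is None:
        return "non-hospitalised"
    days_apart = abs((hospital_contact - pcr_positive).days)
    # Assumption: inclusive window, counted in either direction
    # from the positive PCR date.
    if days_apart <= HOSPITAL_WINDOW_DAYS:
        return "hospitalised"
    return "non-hospitalised"
```

Because the WHO score is recorded separately as the highest score reached during infection, a participant can carry a high WHO score and still receive the non-hospitalised label from this rule.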
4. Biological sample processing
The Western Health and Social Care Trust (WHSCT) recruitment team coordinated sample collection appointments at hospital wards, Clinical Translational Research and Innovation Centre (CTRIC) clinic rooms or home visits. Participants and related study code numbers were predetermined according to severity and logged in encrypted clinical data sheets on a secure server to ensure full data traceability. All whole blood and saliva processing was carried out in a Category III containment hood with full PPE. Samples were not deactivated upon receipt or prior to processing. Participants provided 3 x 10 ml of whole blood and 2 saliva samples of approximately 2 ml each (Figure 2). Blood was drawn using 21G Vacuette® safety needles (Greiner Bio-One Ltd, Gloucestershire) into 3 x 10 ml EDTA-coated Vacuette® tubes and centrifuged at 4000 rpm (4 °C) for 15 minutes. The buffy coat was extracted, washed, and stored for RNA sequencing (Figure 2). All samples were frozen at -80 °C; time to freezer was <2 hrs and none showed signs of haemolysis. Saliva was collected using one DNA Genotek (DNA Genotek, Ottawa) Oragene DNA (OG-500) and one RNA (CP-190) collection tube per participant; samples were considered deactivated once lysed. Peripheral blood mononuclear cells (PBMCs) were isolated using Ficoll gradient separation as per (21).
[Figure caption: includes recruitment numbers, sample collection types, sample processing and downstream analysis. n numbers refer to specific patient numbers for specific omics analysis.]
4.1 Immune assays
Whole blood was analysed at 1 and 3 months post positive PCR test. Using the FACSAria III high speed cell sorter (Becton Dickinson, Oxford, UK, software version 9) with an 85 µm nozzle fitted, whole blood and PBMC samples were stained for T, B and NK cell populations using CD45 PerCP-Cy5.5, CD3 FITC, CD8 APC-Cy7, CD4 PE-Cy7, CD19 APC and CD16/CD56 PE (BD) before erythrocyte lysis by PharmLyse (BD) according to manufacturer’s instructions. T cell subpopulations were measured using two defined panels. Panel 1: CD3 FITC, CD4 PE-Cy7, CD8 BV605, CD30 APC, CD45RA V450, CD45RO BV786, CD183 BB700; Panel 2: CD3 FITC, CD4 PE-Cy7, CD8 BV605, CD69 APC, CD45 V450, CD127 BV786, CD152 BB700, CD25 R718 and FoxP3 PE. Cell-surface staining was performed prior to fixing, permeabilizing and FoxP3 labelling using the Transcription Factor Buffer Set (BD Pharmingen).
4.2 DNA isolation
Saliva samples (WGS, methylome, microbiome) were incubated for 2 h at 56 °C, followed by DNA isolation using PrepIT.L2P (DNA Genotek, Canada). DNA from whole blood (methylome) was isolated from 200 µl of whole blood using the DNA Blood 200 360 prefilling H96 Kit (CMG-717, Perkin Elmer, UK) on the Chemagic 360 system (Perkin Elmer, UK). Microbial DNA was extracted from saliva aliquots using a protocol modified from Teng et al (2018) (22) with the DNeasy Blood and Tissue kit (Qiagen, UK). All extracted DNA was evaluated using the Qubit® 3.0 fluorometer (Thermo Scientific, UK) and NanoDrop 1000 spectrophotometer (Thermo Scientific, UK) and, if intended for sequencing, using the Invitrogen™ Quant-iT™ PicoGreen™ dsDNA Assay Kit (P7589) on the Hamilton Microlab Star before storage at -80 °C.
4.3 RNA isolation
RNA from saliva was isolated using the Oragene RNA purification protocol and Qiagen RNeasy micro kit (Qiagen, UK), and RNA from whole blood using the Chemagic 360 system (Perkin Elmer, UK) with the Chemagic RNA Tissue 360 H96 Kit (CMG-1212). Purity and quantity were assessed as above for DNA but with the Invitrogen Quant-iT RiboGreen Assay Kit (R11490). Integrity (RIN) was determined using the Agilent 4200 TapeStation and RNA ScreenTape (5067-5366) before storage at -80 °C.
5. Electronic Care Record data collection
5.1 Northern Ireland Electronic Care Records (NIECR)
Each patient consented to the use of their electronic health information (NIECR) held by the NHS. PCR positive dates, severity (hospitalised due to COVID-19 infection, or recovered from COVID-19 infection at home), lab results (full blood count, blood pressure, lipids, CRP, GFR, troponin), treatment administered, drugs prescribed within the last six months and co-/multimorbidities held on record for each patient were recorded.
5.2 CovresNIdb development
The pseudonymised information, with Personally Identifiable Information (PII) removed by the project’s data controller as per GDPR guidelines, was recorded securely on the REDCap platform to develop a clinical data capture tool (CovresNIdb) at UU for analysis. The same protocol is being followed for all prospective appointments (ongoing).
5.3 General Health Questionnaire and Health and lifestyle questionnaire
The GHQ-12 is a self-administered 12-item screening tool designed to detect current mental state disturbances in primary care settings; a score of ≥ 2 indicates a disorder. The Health and lifestyle questionnaire (HLQ) was designed by UU to capture key health-related data not present on the ECR. Fields included: COVID-19 risk factors, medications, comorbidities, hospitalisation information, symptoms at admission, lab tests, family history, drinking status, occupation. Data were transferred into CovresNIdb using the REDCap platform. The same protocol is being followed for all prospective appointments (ongoing).
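As an illustration of how the GHQ-12 threshold might be applied, the sketch below scores one completed questionnaire. It assumes the binary GHQ scoring method (responses coded 0-0-1-1 across the four options per item), which the protocol does not state explicitly; the function names are hypothetical.

```python
GHQ12_ITEM_COUNT = 12
CASE_THRESHOLD = 2  # from the protocol: a score of >= 2 indicates a disorder


def ghq12_score(responses):
    """Return the total GHQ-12 score for one participant.

    responses: 12 integers, each 0-3, giving the index of the chosen
    response option per item. Assumption: binary 0-0-1-1 scoring, so
    options 0 and 1 score 0 and options 2 and 3 score 1.
    """
    if len(responses) != GHQ12_ITEM_COUNT:
        raise ValueError("GHQ-12 requires exactly 12 item responses")
    if any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("each response must be an integer from 0 to 3")
    return sum(1 for r in responses if r >= 2)


def meets_threshold(responses):
    """True if the total score reaches the protocol's cut-off of 2."""
    return ghq12_score(responses) >= CASE_THRESHOLD
```

If Likert scoring (0-1-2-3) were used instead, the total would range from 0 to 36 and a cut-off of 2 would not be comparable, so the scoring method should be confirmed before reuse.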
6. Merging and QC
Consent, self-reported questionnaires, demographics and NIECR data were merged in the database in line with GDPR guidelines. Data were subjected to quality control by two independent researchers against the original sources.
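A check of merged records against their original sources could look like the following sketch. The field names are hypothetical and the protocol does not describe how the two researchers performed their comparison; this only shows one simple way to list disagreements for review.

```python
def find_discrepancies(database_record, source_record):
    """Compare one merged database record with its original source.

    Both arguments are dicts mapping field name to value. Returns a dict
    of field name -> (database value, source value) for every field that
    differs or is missing on one side.
    """
    fields = sorted(set(database_record) | set(source_record))
    return {
        field: (database_record.get(field), source_record.get(field))
        for field in fields
        if database_record.get(field) != source_record.get(field)
    }


# Hypothetical example record, for illustration only
merged = {"study_code": "X001", "age": 54, "hospitalised": True}
original = {"study_code": "X001", "age": 45, "hospitalised": True}
flagged = find_discrepancies(merged, original)
```

Running the same comparison independently for each reviewer and reconciling the two lists of flagged fields would mirror the two-researcher check described above.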
7. Omics analyses
7.1 Genome
Whole genome library preparation was performed using the Illumina TruSeq PCR Free Library Prep protocol (20015963) with an input amount of 1 µg on a Hamilton NGS Star robotic workstation. Quality was assessed using the Roche KAPA Library Quantification Kit (7960298001) before pooling and sequencing (150 bp paired end (PE)) on an Illumina NovaSeq 6000 instrument using the NovaSeq 6000 S4 Reagent Kit v1.5 (20028312) to a mean coverage of 30X, as described previously (23). Sequences are being uploaded to the European Genome-phenome Archive (EGA).
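To give a sense of scale for the 30X target, the sketch below estimates how many 150 bp read pairs are needed per genome. It assumes a haploid human genome size of roughly 3.1 Gb and ignores duplicates, unmapped reads and uneven coverage, so it is a lower bound and not a figure taken from the study.

```python
def read_pairs_for_coverage(target_coverage, genome_size_bp, read_length_bp):
    """Minimum number of paired-end fragments for a given mean coverage.

    Mean coverage = (number of pairs * 2 * read length) / genome size,
    rearranged for the number of pairs and rounded up.
    """
    bases_needed = target_coverage * genome_size_bp
    bases_per_pair = 2 * read_length_bp
    return -(-bases_needed // bases_per_pair)  # integer ceiling division


# Assumption: ~3.1 Gb human genome; 150 bp PE reads and 30X from the protocol
pairs = read_pairs_for_coverage(30, 3_100_000_000, 150)
print(f"{pairs:,} read pairs")  # 310,000,000 read pairs
```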
7.2 Methylome
Methylation analysis was performed on DNA samples from saliva (n = 450) and whole blood (n = 40) using the Illumina Infinium MethylationEPIC array, largely as described previously (23). Data were adjusted for known epigenetic covariates and surrogate variable analysis was performed via the sva inference module (24). Our in-house developed tool CandiMeth (25) will be employed to streamline methylation analysis for gene lists of interest.
7.3 Transcriptome
RNA-Sequencing library preparation used the Illumina TruSeq Stranded Total RNA Library Prep Globin kit (20020612) with an input amount of 100-1000 ng. Library preparation was automated and processed using a Hamilton NGS Star and quality was assessed using the Roche KAPA Library Quantification Kit (7960298001) and GX Caliper HS Assay (CLS760672, 760517), run on Roche Lightcycler 480 II and Perkin Elmer LabChip GX Touch analysers, respectively. Libraries were pooled and sequenced (75 bp PE) on an Illumina NovaSeq 6000 instrument using the NovaSeq 6000 S2 Reagent Kit v1.5 (20028314), targeting 50M paired reads. Raw data (BCL format) were demultiplexed and converted to FASTQ format using bcl2fastq (Illumina). Adapters were trimmed using Skewer (26) and QC was assessed using FastQC. STAR (27) was used to align reads to the reference genome (GRCh38/hg38) as well as to the transcriptome (GENCODE v. 25). The quality of the RNA alignment was assessed using Picard QC. Gene and isoform quantification will be performed using RSEM (28), with T-cell receptor sequencing of prospective patients (1 and 3 months) completed following flow cytometry.
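The alignment and quantification steps can be sketched as command lines assembled in Python. This is a hedged illustration and not the study's actual invocation: all paths, the sample name and the thread count are hypothetical, the trimmed FASTQ files are assumed to be gzip-compressed, and the STAR and RSEM indices are assumed to have been built beforehand. The functions only build the argument lists; they do not run anything.

```python
def star_command(read1, read2, genome_dir, out_prefix, threads=8):
    """Build a STAR call that writes a sorted genome BAM plus a
    transcriptome-coordinate BAM suitable for RSEM."""
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--readFilesIn", read1, read2,
        "--readFilesCommand", "zcat",          # assumption: gzipped FASTQ input
        "--outSAMtype", "BAM", "SortedByCoordinate",
        "--quantMode", "TranscriptomeSAM",     # also emit transcriptome alignments
        "--outFileNamePrefix", out_prefix,
    ]


def rsem_command(transcriptome_bam, rsem_reference, sample_name, threads=8):
    """Build an RSEM call that quantifies genes and isoforms from an
    existing paired-end alignment file."""
    return [
        "rsem-calculate-expression",
        "--paired-end",
        "--alignments",
        "-p", str(threads),
        transcriptome_bam,
        rsem_reference,
        sample_name,
    ]


# Hypothetical file names, for illustration only
align = star_command("s1_R1.fastq.gz", "s1_R2.fastq.gz", "star_index/", "s1_")
quant = rsem_command("s1_Aligned.toTranscriptome.out.bam", "rsem_ref/grch38", "s1")
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the tools are installed. Strandedness options are not set here and would need to match the stranded library kit.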
7.4 Microbiome
16S rRNA gene amplicons for sequencing on the Illumina MiSeq system (Illumina, USA) were prepared from the V3 and V4 regions as described in Klindworth et al (2013), with sequencing performed in-house.
7.5 Proteome
Protein analysis of 400 baseline plasma samples (186 non-hospitalised, 214 hospitalised) and 40 prospective samples (20 non-hospitalised, 20 hospitalised; 1 and 3 months) was outsourced to OLINK proteomics (OLINK, Uppsala, Sweden) using the Explore® 384 Inflammation panel (Protein Proximity Extension assay). EDTA plasma samples were thawed at room temperature (20 °C) and 45 µl of each plasma sample was pipetted at random into LightCycler® 480 Multiwell Plate 96-well, white PCR plates (Roche Molecular Systems Inc, Charles Avenue, Burgess Hill, West Sussex, UK; Product no. 04729692001), with 8 wells left empty on each plate for internal controls to be added at OLINK. Samples were inactivated as per OLINK’s protocol and shipped on dry ice (CO2, -78 °C). Only samples above 0.2 Normalised Protein Expression (NPX) and samples that deviate less than 0.3 NPX passed QC.
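Random allocation of samples to 96-well plates with 8 wells reserved for controls leaves 88 sample wells per plate. The sketch below shows one way to generate such a layout; the protocol does not say how randomisation was performed or which wells were left empty, so the seeded shuffle and the function name are assumptions.

```python
import random

WELLS_PER_PLATE = 96
CONTROL_WELLS = 8                                     # left empty, per the protocol
SAMPLES_PER_PLATE = WELLS_PER_PLATE - CONTROL_WELLS   # 88


def randomise_plates(sample_ids, seed=None):
    """Shuffle sample IDs and split them into plates of at most 88.

    A fixed seed makes the layout reproducible for the plate map that
    accompanies the shipment.
    """
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return [
        shuffled[start:start + SAMPLES_PER_PLATE]
        for start in range(0, len(shuffled), SAMPLES_PER_PLATE)
    ]


# Hypothetical IDs standing in for the 400 baseline plasma samples
plates = randomise_plates([f"S{i:03d}" for i in range(400)], seed=1)
print(len(plates))  # 5
```

With 400 baseline samples this gives four full plates and a fifth holding the remaining 48.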
The MSD plasma multi-Spot assay system comprising V-PLEX COVID-19 serology panel 11, ‘total IgG’ and ‘ACE2 neutralisation’ assays were used to determine viral variant prevalence. Samples were prepared at 1:10 (ACE2) and 1:5000 (neutralisation) for specific assays, then treated essentially as in (29).
The Roche COBAS Elecsys SARS-CoV-2 spike (S) protein receptor binding domain (RBD) assay was used to determine SARS-CoV-2 antibody presence as per manufacturer’s instructions.
8. Statistics
8.1 Univariate and multivariate analysis
We considered the following risk factors: gender, age, BMI, and disease subgroups. First, univariate analyses (Table 1) were performed to identify risk factors. Variables with a p value < 0.001, i.e. gender, age (<50 yrs, >50 yrs), and the cardiovascular, respiratory, endocrine, and musculoskeletal disease subgroups, were considered clinically relevant and entered into the multivariable logistic regression model (Table 3). This and further analysis is being undertaken in base R (version 4.2.2) using the visdat package.
8.2 Demographics Table
The demographic table below (Table 1) of COVRES data (n = 519) was generated using IBM SPSS Statistics for Windows, version 27 (IBM Corp., Armonk, N.Y., USA) (30). Statistical analysis for the contingency table was undertaken using Fisher's exact test (two-sided) to obtain the required P-values, and confidence levels were set at 95 %.
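For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2x2 contingency table can be written directly from the hypergeometric distribution. The sketch below is a plain Python illustration of the calculation, not the SPSS procedure used in the study; it sums the probabilities of all tables with the same margins that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        # Hypergeometric probability of x counts in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    p_value = sum(
        table_probability(x)
        for x in range(lowest, highest + 1)
        if table_probability(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, p_value)
```

For example, rows could be hospitalised and non-hospitalised participants and columns the presence or absence of a comorbidity, with each cell holding a participant count.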
8.3 Bioinformatics analysis
Bioinformatic analyses will focus on using computational approaches to identify genomic, transcriptomic, proteomic and clinical correlates of severity. Planned analyses primarily include the identification of clinical features, host gene variants/eQTLs, transcriptomic signatures, and cytokine profiles associated with disease severity, as well as differential methylation among the host genomes of the severity groups. Whole genome sequencing and transcriptomics data are to be deposited in the EGA [EGAS pending] and shared in collaboration with the International COVID-19 Host Genetics Initiative.