Evaluation of antibody testing for SARS-CoV-2 using ELISA and lateral flow immunoassays

Background: The SARS-CoV-2 pandemic caused >1 million infections during January-March 2020. There is an urgent need for robust antibody detection approaches to support diagnostics, vaccine development, safe individual release from quarantine and population lock-down exit strategies. The early promise of lateral flow immunoassay (LFIA) devices has been questioned following concerns about sensitivity and specificity. Methods: We used a panel of plasma samples designated SARS-CoV-2 positive (from SARS-CoV-2 RT-PCR-positive individuals; n=40) and negative (samples banked in the UK prior to December 2019; n=142). We tested plasma for SARS-CoV-2 IgM and IgG antibodies by ELISA and using nine different commercially available LFIA devices. Results: ELISA detected SARS-CoV-2 IgM or IgG in 34/40 individuals with an RT-PCR-confirmed diagnosis of SARS-CoV-2 infection (sensitivity 85%, 95%CI 70-94%), vs 0/50 pre-pandemic controls (specificity 100% [95%CI 93-100%]). IgG was detected in 31/31 RT-PCR-positive individuals tested ≥10 days after symptom onset (sensitivity 100%, 95%CI 89-100%). IgG titres rose during the 3 weeks post symptom onset and began to fall by 8 weeks, but remained above the detection threshold. Point estimates for the sensitivity of LFIA devices ranged from 55-70% versus RT-PCR and 65-85% versus ELISA, with specificity 95-100% and 93-100% respectively. Within the limits of the study size, the performance of most LFIA devices was similar. Conclusions: The performance of current LFIA devices is inadequate for most individual patient applications. ELISA can be calibrated to be specific for detecting and quantifying SARS-CoV-2 IgM and IgG and is highly sensitive for IgG from 10 days after symptom onset.

INTRODUCTION
The first cases of infection with a novel coronavirus, subsequently designated SARS-CoV-2, emerged in Wuhan, China on December 31st, 2019. 1 Despite intensive containment efforts, there was rapid international spread and three months later, SARS-CoV-2 had caused over 1 million confirmed infections and 60,000 reported deaths. 2 Containment efforts have relied heavily on population quarantine ('lock-down') measures to restrict movement and reduce individual contacts. 3,4 To develop public health strategies for exit from lock-down, diagnostic testing urgently needs to be scaled-up, including both mass screening and screening of specific high-risk groups (contacts of confirmed cases, and healthcare workers and their families), in parallel with collecting robust data on recent and past SARS-CoV-2 exposure at individual and population levels. 2 Laboratory diagnosis of infection has mostly been based on real-time RT-PCR, typically targeting the viral RNA-dependent RNA polymerase (RdRp) or nucleocapsid (N) genes using swabs collected from the upper respiratory tract. 5,6 This requires specialist equipment, skilled laboratory staff and PCR reagents, creating diagnostic delays. RT-PCR from upper respiratory tract swabs may also be falsely negative due to quality or timing; viral loads in upper respiratory tract secretions peak in the first week of symptoms, 7 but may have declined below the limit of detection in those presenting later. 8 In individuals who have recovered, RT-PCR provides no information about prior exposure or immunity.
In contrast, assays that reliably detect antibody responses specific to SARS-CoV-2 could contribute to diagnosis of acute infection (via rises in IgM and IgG levels) and to identifying those infected with or without symptoms and recovered (via persisting IgG). 9 Receptor-mediated viral entry to host cells occurs through interactions between the unique and highly conserved SARS-CoV-2 spike (S) glycoprotein and the ACE2 cell receptor. 10 This S protein is the primary target of specific neutralising antibodies, and current SARS-CoV-2 serology assays therefore typically seek to identify these antibodies (Figure 1A-C). Rapid lateral flow immunoassay (LFIA) devices provide a quick, point-of-care approach to antibody testing. A sensitive and specific antibody assay could directly contribute to early identification and
It is made available under a CC-BY 4.0 International license. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

Enzyme-linked immunosorbent assay (ELISA)
We used a novel ELISA. Recombinant SARS-CoV-2 trimeric spike protein was constructed, 13 tagged and purified. Immunoplates coated with StrepMAB-Classic were used to capture the tagged soluble trimeric SARS-CoV-2 S protein and were then incubated with test plasma. Antibody binding to the S protein was detected with ALP-conjugated anti-human IgG or anti-human IgM. (Further details in Supplementary Material.)

Lateral flow immunoassays (LFIA)
We tested LFIA devices designed to detect IgM, IgG or total antibodies to SARS-CoV-2 produced by nine manufacturers short-listed as a testing priority by the UK Government Department of Health and Social Care (DHSC), based on appraisals of device provenance and available performance data. Individual manufacturers did not approve release of device-level data, so device names are anonymised.
Testing was performed in strict accordance with the manufacturer's instructions for each device. Typically, this involved adding 5-20 µl of plasma to the sample well, and 80-100 µl of manufacturer's buffer to an adjacent well, followed by incubation at room temperature for 10-15 minutes. The result was based on the appearance of coloured bands, designated as positive (control and test bands present), negative (control band only), or invalid (no band, absent control band, or band in the wrong place) (Figure 1C).
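The read-out rule described above can be written as a small decision function. This is a minimal sketch and not software used in the study: the function name `classify_lfia` and its boolean parameters are hypothetical, and it assumes that any band in the wrong place makes the result invalid regardless of the other bands.

```python
def classify_lfia(control_band: bool, test_band: bool, misplaced_band: bool = False) -> str:
    """Map the bands seen on a device to a read-out (hypothetical helper)."""
    # No bands at all, an absent control band, or a band in the wrong
    # place all count as an invalid result.
    if misplaced_band or not control_band:
        return "invalid"
    # With a valid control band, the test band decides the result.
    return "positive" if test_band else "negative"


if __name__ == "__main__":
    print(classify_lfia(control_band=True, test_band=True))
    print(classify_lfia(control_band=True, test_band=False))
    print(classify_lfia(control_band=False, test_band=True))
```

Checking the control band first means a test band without a control band is never reported as positive.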
We recorded results in real-time on a password-protected electronic database, using pseudonymised sample identifiers, capturing the read-out from the device (positive/negative/invalid), operator, device, device batch number, and a timestamped photograph of the device.

Testing protocol
We tested 90 samples using ELISA to quantify IgM and IgG antibody in plasma designated SARS-CoV-2 negative (n=50) and positive (n=40). We included all positive samples and an unstratified random sample of negative plasma from healthy blood donors (n=23) and organ donors (n=27). We tested the nine different LFIA devices using between 39 and 165 individual plasma samples (8-23 and 31-142 samples designated SARS-CoV-2 positive and negative, respectively; Table S2). Total numbers varied according to the number of devices supplied to the DHSC; samples were otherwise selected at random.
medRxiv preprint (not certified by peer review), this version posted April 20, 2020; https://doi.org/10.1101/2020.04.15.20066407

Statistical analysis
Analyses were conducted using R (version 3.6.3) and Stata (version 15.1), with additional plots generated using GraphPad Prism (version 8.3.1). Binomial 95% confidence intervals (CI) were calculated for all proportions. The association between ELISA results and time since symptom onset, severity, need for hospital admission and age was estimated using multivariable linear regression, without variable selection. Non-linearity in relationships with continuous factors was included via natural cubic splines. Differences between LFIA devices were estimated using mixed effects logistic regression models, allowing for each device being tested on overlapping sample sets. Differences between devices were compared with Benjamini-Hochberg corrected p-value thresholds. (Further details in Supplementary Material.)
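To illustrate the binomial confidence intervals mentioned above, the following sketch computes an exact (Clopper-Pearson) 95% interval using only the Python standard library. The choice of the exact method is an assumption, since the text says only "binomial" intervals, and the helper names `binom_cdf`, `_solve` and `exact_ci` are hypothetical; the authors' own analyses were run in R and Stata.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target):
    """Bisection on [0, 1] for a function f(p) that decreases in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at larger p
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n (assumed method)."""
    # Lower limit solves P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else _solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper limit solves P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else _solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


if __name__ == "__main__":
    for k, n in [(34, 40), (50, 50), (31, 31)]:
        lo, hi = exact_ci(k, n)
        print(f"{k}/{n}: {lo:.1%} to {hi:.1%}")
```

With 50 of 50 controls testing negative, the lower limit from this method is about 93%, which shows why a perfect observed specificity in a panel of this size still leaves appreciable uncertainty.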
As safe individual release from lock-down is a major application for serological testing, we chose OD thresholds that maintained 100% specificity (95%CI 93-100%), while maximising sensitivity. Using thresholds of 0.07 for IgM and 0.4 for IgG (3 and 5 standard deviations above the negative mean respectively; Figure 2A,B), the IgG assay had 85% sensitivity (95%CI 70-94%; 34/40) vs. RT-PCR diagnosis. All six false-negatives were from samples taken within 9 days of symptom onset (Figure 2D). IgG was detected in 31/31 RT-PCR-positive individuals tested ≥10 days after symptom onset (sensitivity 100%, 95%CI 89-100%). The IgM assay sensitivity was lower at 70% (95%CI 53-83%; 28/40). All IgG false-negatives were IgM-negative. No patient was IgM-positive and IgG-negative.
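The idea of a threshold placed a fixed number of standard deviations above the negative-control mean can be sketched as follows. The OD readings below are invented purely for illustration and are not study data; the helper names `od_threshold` and `sensitivity` are hypothetical, and the use of the sample standard deviation and a strict "greater than" comparison are assumptions not stated in the text.

```python
from statistics import mean, stdev


def od_threshold(negative_ods, k):
    """Mean of negative-control ODs plus k sample standard deviations."""
    return mean(negative_ods) + k * stdev(negative_ods)


def sensitivity(positive_ods, threshold):
    """Fraction of known-positive samples with OD above the threshold."""
    return sum(od > threshold for od in positive_ods) / len(positive_ods)


# Invented OD readings for illustration only; NOT data from the study.
negatives = [0.10, 0.12, 0.08, 0.11, 0.09]
positives = [0.15, 0.60, 1.20, 0.90]

cutoff = od_threshold(negatives, k=5)
specificity = sum(od <= cutoff for od in negatives) / len(negatives)

if __name__ == "__main__":
    print(f"cutoff = {cutoff:.3f}")
    print(f"specificity = {specificity:.0%}")
    print(f"sensitivity = {sensitivity(positives, cutoff):.0%}")
```

Raising `k` moves the cutoff further from the negative distribution, trading sensitivity for specificity, which is the trade-off the threshold choice in the text is managing.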
Considering the relationship between IgM and IgG titres and time since symptom onset (Figures 2C,D), univariable regression models showed IgG antibody titres rising over the first 3 weeks from symptom onset. The lower bound of the pointwise 95%CI for the mean expected titre crosses our OD threshold between days 6 and 7 (Figure 2D). However, given sampling variation, test performance is likely to be optimal from several days later. IgG titres fell during the second month after symptom onset but remained above the OD threshold. No temporal association was observed between IgM titres and time since symptom onset (Figure 2C). There was no evidence that SARS-CoV-2 severity, need for hospital admission or patient age were associated with IgG or IgM titres in multivariable models (p>0.1, Table S3).
Of 50 designated negative samples tested by both ELISA and the nine different LFIA devices, nine separate samples generated at least one false-positive, on seven different LFIA devices (Figure 3). Four samples generating false-positive results did so on more than one LFIA device, despite the absence of quantifiable IgM or IgG on ELISA, potentially suggesting a specific attribute of the sample causing a cross-reaction on certain LFIA platforms but not ELISA.

DISCUSSION
Here we present the performance characteristics of a novel ELISA and nine LFIA devices for detecting SARS-CoV-2 IgM and IgG using a panel of reference plasma. After setting thresholds for detection using 50 negative (pre-pandemic) controls, 85% of 40 RT-PCR-confirmed positive patients had IgG detected by ELISA, including 100% of patients tested ≥10 days after symptom onset. A panel of LFIA devices had sensitivity between 55 and 70% against the reference-standard RT-PCR, or 65-85% against ELISA, with specificity of 95-100% and 93-100% respectively. These estimates come with relatively wide confidence intervals due to constraints on the number of devices made available for testing. Nevertheless, this study provides a benchmark against which to further assess the performance of platforms to detect anti-SARS-CoV-2 IgM/IgG, with the aim of guiding decisions about deploying antibody testing and informing the design and assessment of second-generation assays.
LFIA devices are cheap to manufacture, store and distribute, and could be used as a point-of-care test by healthcare practitioners or individuals at home, offering an appealing approach to diagnostics and evaluating individual and population-level exposure. A positive antibody test is currently regarded as a probable surrogate for immunity to reinfection. Secure confirmation of antibody status would therefore reduce anxiety, provide confidence to allow individuals to relax social distancing measures, and guide policy-makers in the staged release of population lock-down, potentially in tandem with digital approaches to contact tracing. 14 As a diagnostic tool, serology may have a role in combination with RT-PCR testing to improve sensitivity, particularly of cases presenting some time after symptom onset. 15,16 Reproducible methods to detect and quantify vaccine-mediated anti-SARS-CoV-2 antibodies are also crucial as vaccines enter clinical trials, for evaluating the magnitude and durability of immunogenicity.
Appropriate thresholds for sensitivity and specificity of an antibody test depend on its purpose, and must be considered when planning deployment. For diagnosis in symptomatic patients, high sensitivity is required (generally ≥90%). Specificity is less critical as some false-positives could be tolerated (provided other potential diagnoses are considered, and accepting that over-diagnosis causes unnecessary quarantine or hospital admission).
However, if antibody tests were deployed as an individual-level approach to inform release from quarantine, then high specificity is essential, as false-positive results return non-immune individuals to risk of exposure. For this reason, the UK Medicines and Healthcare products Regulatory Agency has set a minimum 98% specificity threshold for LFIAs. 17 Appraisal of test performance should also consider the influence of population prevalence, acknowledging that this changes over time, geography and within different population groups (e.g. healthcare workers, teachers). The potential risk of a test providing false reassurance and release from lock-down of non-immune individuals can be considered as the proportion of all positive tests that are wrong, as well as the number of incorrect positive tests per 1000 people tested. Based on the working 'best case' scenario of an LFIA test with 70% sensitivity and 98% specificity, the proportion of positive tests that are wrong is 35% at 5% population seroprevalence (19 false-positives/1000 tested), 13% at 20% seroprevalence (16 false-positives/1000) and 3% at 50% seroprevalence (10 false-positives/1000) (Figure 4). However, more data are needed to investigate antibody-positivity as a correlate of protective immunity.
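The dependence of false reassurance on seroprevalence can be reproduced with simple arithmetic. The sketch below assumes that sensitivity and specificity are fixed properties of the test and that seroprevalence is the true proportion of antibody-positive people among those tested; the function name `false_positive_burden` is hypothetical. Under these assumptions the calculation gives roughly 35%, 10% and 3% of positive results being wrong at 5%, 20% and 50% seroprevalence, so the 13% quoted for 20% seroprevalence may rest on inputs not stated in the text.

```python
def false_positive_burden(sensitivity, specificity, prevalence, n_tested=1000):
    """Return (false positives per n_tested, proportion of positives that are wrong)."""
    true_pos = n_tested * prevalence * sensitivity
    false_pos = n_tested * (1 - prevalence) * (1 - specificity)
    return false_pos, false_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # 'Best case' LFIA scenario from the text: 70% sensitivity, 98% specificity.
    for prevalence in (0.05, 0.20, 0.50):
        fp, wrong = false_positive_burden(0.70, 0.98, prevalence)
        print(f"seroprevalence {prevalence:.0%}: "
              f"{fp:.0f} false positives per 1000, {wrong:.0%} of positives wrong")
```

The number of false positives per 1000 falls only modestly as seroprevalence rises, whereas the proportion of positives that are wrong falls steeply because true positives become far more numerous.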
Indeed, pre-existing IgG could enhance disease in some situations, 18 with animal data demonstrating that SARS-CoV anti-spike IgG contributes to a proinflammatory response associated with lung injury in macaques. 19
Our data on the kinetics of antibody responses to SARS-CoV-2 infection build upon studies of hospitalised patients in China reporting a median 11 days to seroconversion for total antibody, with IgM and IgG seroconversion at days 12 and 14 respectively; 15 another similar study reports 100% IgG positivity by 19 days. 16 Our ELISA data show IgG titres rose over the first 3 weeks of infection and that IgM testing identified no additional cases. Methods to enhance sensitivity, especially shortly after symptom onset, could consider different sample types (e.g. saliva), different antibody classes (e.g. IgA), 20 T-cell assays or antigen detection. 21 In contrast to others, 16,22-24 we did not find evidence of an association between disease severity and antibody titres. We observed several LFIA false positives, which may have resulted from cross-reactivity of non-specific antibodies (e.g. reflecting past exposure to other seasonal coronavirus infections).
The main study limitation is that numbers tested were too small to provide tight confidence intervals around performance estimates for any specific LFIA device. Expanding testing across diverse populations would increase certainty, but given the broadly comparable performance of different assays, the cost and manpower to test large numbers may not be justifiable.
Demonstrating high specificity is particularly challenging; for example, if the true underlying value was 98%, 1000 negative controls would be required to estimate the specificity of an assay to +/-1% with approximately 90% power. Full assessment should also include a range of geographical locations and ethnic groups, children, and those with immunological disease including autoimmune conditions and immunosuppression.
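A rough sense of why so many negative controls are needed comes from the normal-approximation (Wald) half-width of a 95% confidence interval around a proportion. This sketch does not reproduce the power calculation referred to above, whose details are not given in the text; it shows only how the expected interval width shrinks with sample size, and the function name `wald_half_width` is hypothetical.

```python
from math import sqrt


def wald_half_width(p, n, z=1.96):
    """Approximate half-width of a 95% CI for a proportion p estimated from n samples."""
    return z * sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # Assumed true specificity of 98%, as in the example in the text.
    for n in (50, 150, 1000):
        print(f"n = {n}: about +/-{wald_half_width(0.98, n):.2%}")
```

With 1000 negative controls the approximate half-width is a little under one percentage point, whereas with a panel of 50 it is several percentage points. The Wald approximation is known to be poor for proportions close to 100%, so these values are indicative only.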
In summary, antibody testing is crucial to inform release from lockdown. This study offers insights into the performance of both a novel ELISA and a panel of LFIA devices that have been made widely available, but to date with limited systematic validation. Our findings suggest that while current LFIA devices may provide some information for population-level surveys, their performance is inadequate for most individual patient applications. The biobank of samples assembled for this study continues to be expanded and will provide a valuable resource for developing the next generation of ELISA and lateral flow assays. The ELISA we describe is currently being optimised and adapted to run on a high-throughput platform and provides promise for the development of reliable approaches to antibody detection that can support decision making for clinicians, the public health community, policy-makers and industry.

DATA AVAILABILITY
Results generated for all samples and relevant metadata are provided in Table S6.

ACKNOWLEDGEMENTS
This work uses data and samples provided by patients and collected by the NHS as part of their care and support. We are extremely grateful to the frontline NHS clinical and research staff and volunteer medical students, who collected this data in challenging circumstances; and for the generosity of the participants and their families for their individual contributions.
grants from NIHR, during the conduct of the study. No other author has a conflict of interest to declare.

Any positive test for IgG, IgM, both or total antibody is shown as positive; please see Figure S2 for a more detailed breakdown. Grey blocks indicate missing data as a result of insufficient devices to test all samples and one assay on one device with an invalid result. Samples in both panels are ranked from left to right by quantitation of IgG (as indicated in panel A).
[Figure axis labels (individual sample identifiers) omitted.]