Qualification of ELISA and neutralization methodologies to measure SARS-CoV-2 humoral immunity using human clinical samples

In response to the SARS-CoV-2 pandemic many vaccines have been developed and evaluated in human clinical trials. The humoral immune response magnitude, composition and efficacy of neutralizing SARS-CoV-2 are essential endpoints for these trials. Robust assays that are reproducibly precise, linear, and specific for SARS-CoV-2 antigens would be beneficial for the vaccine pipeline. In this work we describe the methodologies and clinical qualification of three SARS-CoV-2 endpoint assays. We developed and qualified Endpoint titer ELISAs for total IgG, IgG1, IgG3, IgG4, IgM and IgA to evaluate the magnitude of specific responses to the trimeric spike (S) antigen and total IgG specific to the spike receptor binding domain (RBD) of SARS-CoV-2. We also qualified a pseudovirus neutralization assay which evaluates functional antibody titers capable of inhibiting the entry and replication of a lentivirus containing the Spike antigen of SARS-CoV-2. To complete the suite of assays we qualified a plaque reduction neutralization test (PRNT) methodology using the 2019-nCoV/USA-WA1/2020 isolate of SARS-CoV-2 to assess neutralizing titers of antibodies in plasma from normal healthy donors and convalescent COVID-19 individuals.


Introduction
In December 2019, a series of severe viral pneumonia cases were detected in Wuhan, Hubei Province, China, and later discovered to be induced by a novel coronavirus (Zhou et al., 2020). Since then, the severe acute respiratory syndrome corona virus 2 (SARS-CoV-2) has spread into a pandemic (Cucinotta and Vanelli, 2020). As SARS-CoV-2 continues to devastate the world and its economies, the pressure for effective and equitable vaccine distribution remains a global priority (Wang et al., 2020;Schwartz, 2020). Many vaccine and therapeutic candidates are designed to induce a robust humoral immune response against the SARS-CoV-2 spike glycoprotein, which binds to human angiotensin-converting enzyme 2 (ACE2) and mediates viral entry into pulmonary alveolar epithelial cells (Letko et al., 2020;Ou et al., 2020;Shang et al., 2020;Yan et al., 2020). These vaccines attempt to neutralize the receptor-binding domain (RBD) (Roy et al., 2020) of spike and reduce viral entry into host cells upon exposure (Poland et al., 2020), a strategy similar to that employed for SARS-CoV (Du et al., 2009). When measuring the efficacy of therapeutics and vaccines, highthroughput and clinically qualified reproducible immunological assays provide crucial standardized information regarding the quantity and quality of antibody and cell-mediated responses (Andreasson et al., 2015;Burd, 2010). Specifically, the quantification of antibodies in a spike antigen-based endpoint titer (EPT) enzyme-linked immunosorbent assay (ELISA) in tandem with efficacy measurements via plaque reduction neutralization tests (PRNT) or pseudovirus neutralization can be leveraged to evaluate responses to SARS-CoV-2 infections and interventions (Liu et al., 2020;Okba et al., 2020;Henss et al., 2020;Perera et al., 2020;Sholukh et al., 2020;Earle et al., 2021).
Indeed, many ELISAs have been developed against SARS-CoV-2 (Amanat et al., 2020;Mohit et al., 2021), including those that characterize seroconversion in response to infection and quantify the magnitude of the humoral response (Roy et al., 2020). However, few are clinically qualified using trimeric SARS-CoV-2 spike antigen as our studies report. In order to meet this need, we have developed and qualified a comprehensive suite of high-throughput 384-well format ELISAs for detecting abundance of total human immunoglobulin G (IgG), IgG subclasses (IgG1, IgG3, IgG4), IgM, and IgA recognizing SARS-CoV-2 spike antigen and total IgG specific for RBD. Like other research groups (Mazzini et al., 2021;Chen et al., 2020) we were unable to identify a sufficient number of samples with robust IgG2 responses to qualify the assay for that subclass. Importantly, we have directly compared our calculated endpoint titer values with WHO international reference standards to make future comparisons between labs more accessible (WHO/BS/2020.2403). In tandem we have qualified clinical SARS-CoV-2 PRNT and pseudovirus neutralization assays evaluating the ability of antibodies in a given sample to reduce viral cell entry as a quality gold-standard surrogate for vaccine-mediated immunogenicity and efficacy (Sholukh et al., 2020;Gach et al., 2014). Each assay was qualified across precision (Chesher, 2008), linearity (Jhang et al., 2004) and specificity (Burd, 2010) endpoints to establish they 1) were accurate and repeatable across users and days, 2) showed a linear response with respect to sample concentrations, and 3) were explicitly detecting responses to SARS-CoV-2. We believe this combination of clinically qualified serology assays will be valuable in comparing responses from various vaccines (Polack et al., 2020;Keech et al., 2020;Sadoff et al., 2021;Erasmus et al., 2020;Folegatti et al., 2020), cutting-edge boost regimens (Logunov et al., 2021;Tanne, 2021), as well as evaluating preclinical vaccine candidates (Jain et al., 2020;Rice et al., 2020) poised to enter clinical trials. Furthermore, these assays will serve to aid the scientific community by allowing the assessment long term vaccine durability (Widge et al., 2020;Dan et al., 2021) and, with adaptation, the intermediate vaccine efficacy against circulating and emerging viral variants (Williams and Burgers, 2021).

Specimens and controls
Positive controls were obtained through the Seattle Children's Research Institute's (SCRI) Center for Global Infectious Disease Research Biorepository. These samples were drawn from Seattle Children's workforce and adult community members diagnosed with COVID-19 by detection of SARS-CoV-2 nucleic acids in nasopharyngeal specimens, who subsequently enrolled into the Seattle Children's SARS2 Recovered Cohort with informed consent and Institutional Review Board approval through SCRI. Commercially sourced (Bloodworks Northwest) positive controls were also used from confirmed and recovered COVID-19 patients, termed convalescent plasma. Commercial convalescent plasma was also pooled (CCPP) as a standardized positive control across assays. A sample of pooled plasma from 11 SARS-CoV-2 positive patients from the National Institute for Biological Standards and Controls (NIBSC 20/136), given an arbitrary WHO international standard value of 250 international units (IU) of binding antibody activity per ampule, was also obtained for use in selected assays. Negative plasma originated from historical in-house biorepository of samples isolated from individuals before November 2019, predating the SARS-CoV-2 pandemic and thereby reducing the opportunity for existing cross-reactive antibody responses. Commercially available Normal Human Plasma (Boston Biomedical, Cambridge, MA) was pooled from several donors (NHPP) and used as a standard negative control. Switzerland) with 0.05% Tween. Each well received 100 μL of blocking buffer for all plates. Plates were incubated for at least 2 h, or wrapped in plastic wrap and incubated at 4 • C overnight.

Sample addition
After blocking, plates were washed three times with 1× Wash Buffer A and subsequently 50 μL of diluent (mixture of 1.0 g bovine serum albumin, 500 mL 1× Wash Buffer A and 500 mL PBS) was added to every well of the washed plates. Next, plasma samples were diluted 1:20 in diluent (5 μL of plasma to 95 μL of diluent for total IgG, other Ig sample dilutions listed in Table 1) in a 96-well master block before their addition to the 384-well ELISA plate. Using a multichannel pipette, 12.5 μL of the master block sample dilution was manually pipetted into the first column of each sample and subsequently diluted 1:5 from left to right across the plate, discarding 12.5 μL from the final dilution column. Plate layouts varied depending on which aspect of the qualification was being examined. Plates were incubated overnight at 4 • C.

Development
After incubation, plates were removed from the refrigerator and left to warm to room temperature. Once warmed, plates were washed five times with 1× Wash Buffer A. HRP conjugated rec-Protein G antibodies (Invitrogen, 101,223) were diluted (diluent mixture described above) by a factor of 4000, HRP conjugated IgM antibodies (ThermoFisher, A18835) were diluted by a factor of 2000, HRP conjugated IgA antibodies (ThermoFisher, PAI-74395) were diluted by a factor of 8000, HRP conjugated IgG1 antibodies (Invitrogen, MH1715) were diluted by a factor of 1000, HRP conjugated IgG3 antibodies (Invitrogen, 05-3620) were diluted by a factor of 1000, and HRP conjugated IgG4 antibodies (ThermoFisher, MH1742) were diluted by a factor of 2000 (Table 1). Plates were incubated for one hour in the dark at room temperature,  (Coler et al., 2018;Day et al., 2020). The EPT value calculated for each duplicate plate is then averaged for a final EPT run value.

Variations for competition ELISA
The competition ELISA followed procedures outlined as above except for the following changes. Before adding plasma samples to coated and blocked plates, they were pre-incubated with SARS-CoV-2 Spike protein or RBD at decreasing concentrations. In a 96-well setup plate, soluble SARS-CoV-2 spike or RBD antigens were serially 5-fold diluted. Plasma samples were diluted 1:250 in diluent, and 100 μL was added to the setup plate containing spike protein or RBD. The setup/co-incubation plate was incubated at room temperature for 4 h with gentle rotation. The plasma/soluble spike or soluble RBD mixtures were added to spike or RBD-coated and blocked plates and the remainder of the ELISA methodology was followed as described above.

Cell lines and growth conditions
Vero E6 and HEK293T cells were obtained from ATCC and HEK293T cells stably transfected with human Angiotensin-Converting Enzyme 2 (HEK293T-hACE2) were obtained from BEI Resources (NR-52511). All cells were maintained in Dulbecco's Modified Eagle Media (DMEM) containing 10% fetal bovine serum (FBS), 1% penicillin/streptomycin, and 2 mM L-glutamine (complete DMEM; cDMEM) in a 37 • C + 5% CO 2 incubator. Cells were maintained 25 to 90% confluent and used for the assay up to but not past 14 passages.

Pseudovirus titration and neutralization assay
One day prior to infection, black-walled 96-well plates were coated with 0.01% poly-L-lysine for 5 min. Wells were washed twice with sterile water, and HEK293T-hACE2 cells were plated at 2.5 × 10 4 cells/well in 50 μL cDMEM. Plates were incubated at 37 • C + 5% CO 2 overnight. To titer the pseudovirus, aliquots were thawed and 2-fold dilutions of pseudovirus in cDMEM were prepared in a 96-well plate. Virus dilutions (100 μL) were added to the HEK293T-hACE2 cells followed immediately by addition of polybrene at a concentration of 5 μg/mL. Cells were incubated at 37 • C + 5% CO 2 . After 48 h, 100 μL of supernatants were removed from the wells, and cells were lysed with 30 μL Bright Glo Luciferase Reagent. Relative luminescence units (RLU) were measured immediately on a Spectramax i3x plate reader, as a surrogate for pseudovirus entry and replication.
For pseudovirus neutralization assays, plasma samples were diluted in cDMEM in a 10-point 3-fold dilution series in a 96-well setup plate. SARS-CoV-2 Spike pseudovirus was diluted to 1 × 10 5 RLU/mL and added to wells of the setup plate in a 1:1 ratio. The setup plate was incubated at 37 • C + 5% CO 2 for 1 h prior to addition to HEK293T-hACEs cell as above. Percent inhibition was calculated by dividing RLU values in each well to the average RLU value of virus only wells. Pseudovirus inhibition curves and the concentration of plasma required to inhibit pseudovirus entry by 50% (IC 50 ) was determined by plotting the data and fitting a curve using the neutcurve Python package (https://jbloomlab.github.io/neutcurve/).

SARS-CoV-2 virus production
SARS-CoV-2 isolate 2019-nCoV/USA-WA1/2020 was obtained from BEI Resources and housed under standard BSL-3 laboratory conditions. SARS-CoV-2 virus was propagated and titered by plaque assay in Vero E6 cells. The original stock virus was added to cells at an MOI of 0.1 and incubated at 37 • C + 5% CO 2 for 72 h. Supernatants were harvested, spun at 1,200 rpm for 20 min, and the supernatant was aliquoted and frozen at − 80 • C to create a Passage 1 (P1) virus stock. P1 virus was titered and propagated a second time as above to create Passage 2 (P2) virus. Aliquots of the P2 virus were diluted to 4.5 × 10 3 plaque-forming units (PFU)/mL to create a PRNT virus stock.

PRNT assay
One day prior to infection, Vero E6 cells were plated at 4 × 10 5 cells/ well in 2 mL cDMEM in a 6-well plate. The following day, heatinactivated serum/plasma samples (incubated at 56 • C for 1 h) were diluted in dilution media (DMEM +1% FBS) in a 6-point 2-fold dilution series in a 96-well setup plate to a total volume of 115 μL per well. SARS-CoV-2 PRNT stocks were thawed and diluted 5-fold in dilution media and 115 μL added to the setup plate (9.0 × 10 2 PFU/mL). Virus and plasma were co-incubated for 1 h at 37 • C + 5% CO 2 . Media was carefully removed from the 6-well plates and 200 μL of the virus/plasma mixture was added to the cells. The plates were incubated at 37 • C + 5% CO 2 with rocking every 15 min. After 1 h, 2.0 mL overlay media (dilution media +0.2% agarose, kept at a minimum of 45 • C) was added to the wells and plates were subsequently incubated at 37 • C + 5% CO 2 for 72 h. After 72 h, cells were fixed by the addition of 2.0 mL of 10% formaldehyde and incubated for 30 min. Fixative and overlay were removed, wells were stained for 20 min with 1.0 mL of 0.3% Crystal Violet (BD) and washed once with 1.0 mL PBS per well. Plates were inverted and plaques indicating cell lysis from viral infection were counted for each well and recorded. The lowest dilution of plasma to reduce PFU by 80% compared to virus only wells was deemed the PRNT 80 . The calculation for the threshold is as follows

Qualification
Precision is the reproducibility of an assay across different variables, including between users and days of execution. Given sample endpoint and neutralization titers are unknown, they are not compared to a known standard EPT value but rather evaluated for a consistent performance of the assay to return a similar value for specific control samples across conditions. Intra-assay variability measures the variation of the assay under the same conditions, while inter-assay variability measures variation between different operators and days. Intra-assay variability was measured across three runs performed by a single technician on the same day with like samples. Inter-day variability was measured across three runs performed by a single technician across three separate days with like samples. Finally, for inter-operator variability each of three operators generated their own independent plates and reagents and performed the assay on the same day with like samples. Plate layouts and samples used remained consistent across each measurement of variability. This evaluates the reproducibility and repeatability of the assay and demonstrates that the procedure generates consistent results. When evaluating precision, the inverse or imprecision is actually measured and reported as described below (Andreasson et al., 2015).

ELISA precision endpoint
Each sample must return a coefficient of variation (CV) no greater than 20% for any variable (operator, day, run) evaluated. Coefficient of variation is calculated as where EPT 1-3 are the calculated endpoint values for an individual sample across the three runs (operator, day) being evaluated. For example, CCPP sample EPT values across each of the three operator runs must not exceed 20% CV. This value is partially based on European Medicines Agency (EMA) guidelines that state precision and repeatability must be below this 20% CV threshold, as well as the U.S. Food and Drug Administration's Bioanalytical Method Validation guidance for Industry (2018) (Meesters and Voswinkel, 2018) where precision within and between runs should be below 20% CV.

Pseudovirus neutralization precision endpoint
The average from duplicate samples on any given plate must be within 2-fold (+/− ) of the geometric mean within the respective category (operator, run, reproducibility). For example, the average IC 50 value from duplicate samples for CCPP on each plate within an intraoperator variability test must be within 2-fold of the geometric mean IC 50 value of all CCPP samples tested within the intra-operator variability test.

PRNT precision endpoint
Each sample on any given plate must be within 2-fold (+/− ) of the geometric mean within the respective category. For example, each PRNT 80 value for CCPP within an intra-operator variability test must be within 2-fold of the geometric mean PRNT 80 value of all CCPP samples tested within the intra-operatory variability test. Linearity helps evaluate how well an assay is able to discern variable concentrations of an analyte and if the detected quantity well represents a value across a dilution series. These samples do not have a reference 'true value' so instead we leveraged known positive samples across a standardized dilution curve to determine if the detected response was proportional to the dilution. For ELISA assays, curves were created using O.D. values for positive samples from precision plates and analyzed by linear regression (Jhang et al., 2004) with calculated R 2 values for individual samples as well as an aggregate analysis of an adjusted multiple R 2 following established calculation methods (Wherry, 1931). Only samples with three or more data points were included. Similarly, for PRNT and pseudovirus neutralization assays precision plate dilutions were used from positive samples analyzed by linear regression (Jhang et al., 2004) with calculated R 2 values.

Linearity endpoint
A graphical analysis of the line of best fit (by least squares criterion) coupled with an R 2 value of 0.90 or greater by linear regression analysis, demonstrates the acceptable linearity of the assay.
Specificity is determined in two distinct ways to ensure SARS-CoV-2 antigen-specific responses are being captured, including: a robust screening of presumed negatives, and competition assays with soluble antigen pre-competing for antibody binding. This helps promote that responses are not due to cross-reactive humoral responses that are present in unexposed blood donors. A set of 92 (used for each SARS-CoV-2 spike antibody ELISA) or 82 (used for total IgG RBD) negative samples from a pre-COVID-19 biorepository and four positive SARS-CoV-2 convalescent samples (2 positives for IgG4) were used in determining specificity of our ELISA methodology. For total IgG, IgA and IgM ELISAs, samples were diluted 5-fold starting at 1:100 for a 4-point dilution. For IgG1, IgG3, and IgG4 ELISAs, samples were diluted 5-fold starting at 1:10 for a 4-point dilution. A single plate was run by one technician. A competitive ELISA was also performed to further demonstrate specificity of the immune response to SARS-CoV-2. In this specificity assay, we are expecting serum or plasma antibodies to bind and be sequestered from the coated plate by higher concentrations of soluble spike or RBD in solution. A positive sample was run in duplicate on the same plate by a single operator for this analysis.
We evaluated the specificity of serum/plasma samples to neutralize SARS-CoV-2 via a robust negative sample size. We used 20 negative plasma samples from a historical (pre-COVID) biorepository and 4 positive control samples from patients who became infected and recovered from COVID-19 as a characterization of the pseudovirus neutralization and PRNT assay specificity. For pseudovirus neutralization, each sample was assessed at a 1:20 dilution and was run in duplicate by a single operator on the same day. In addition, we tested plasma from a convalescent patient, as well as a pool of plasma from a set of convalescent patients, to evaluate specificity against a non-specific pseudovirus harboring the VSV G cell entry protein instead of the SARS-CoV-2 spike protein. For PRNT specificity, 20 negative samples from a pre-COVID-19 biorepository and four positive samples were used. Each sample was assessed at a 1:40 dilution for and run in duplicate by a single operator.

ELISA specificity endpoints
For ELISAs ≥90% of negative sample O.D.s must be lower than the collective average of all negative samples +3 SD (plate limit of detection [LOD]), while all 4 positives must exceed this value. For the competition ELISA the results must demonstrate that as the amount of exogenous SARS-CoV-2 Spike antigen or RBD decreases, there will be an increase in positive signal measured by O.D. 450 nm-570 nm. Graphical analysis of O.D. values versus dilution values and a linear relationship with an R 2 value >0.90 demonstrate an acceptable competition specificity qualification of the ELISA. Samples are recorded as pass-fail for the negativescreening specificity criteria.

Neutralization specificity endpoints
For pseudovirus neutralization specificity, all 20 negative samples must not achieve an IC 50 value at the 1:20 dilution of plasma, whereas all 4 positive samples must meet this threshold. Samples are recorded as pass-fail for these criteria. For the VSV G pseudovirus specificity test, all COVID-positive samples must not achieve an IC 50 value at the 1:20 dilution of plasma. For PRNT, the observed PFU from all 20 negative samples must not achieve a PRNT 80 at the 1:40 dilution of plasma, whereas all 4 positive samples must meet this threshold.

Precision
Assay repeatability, or the degree to which an assay will produce the same result when identical test samples are evaluated under the same operating conditions, was assessed in terms of run to run variation. Samples from 14 subjects and 2 pooled controls (CCPP and NHPP) were run by a single operator in triplicate across three separate days (interday), by a single operator in triplicate on the same day (intraday), and by three individual operators on the same day (interoperator). Representative data for Total IgG SARS-CoV-2 spike antigen-specific responses for each precision test is shown in Fig. 1 including two individual convalescent samples (M and N) and pooled controls (CCPP, NHPP). Fig. 2 shows representative samples assess for SARS-CoV-2 RBD responses across the different variables. The CV for each sample (16 total) across the precision variables evaluated are reported in Tables 2, 3 and 4, and cumulative EPT graphs containing all data points are in Supplemental Fig. 1. Importantly, across the robust set of samples tested, all were below the 20% CV endpoint established for this assay, demonstrating the assay is reproducible and precise.

Intraday variability
One operator completed three runs, totaling three sets of duplicate plates on the same day using like samples (Sample ID A-N, CCPP and NHPP). Endpoint titers were calculated as described above to evaluate the magnitude of total IgG, IgM, IgA, IgG1, IgG3 and IgG4 present in each sample (Supplemental Fig. 2) and the CV was evaluated across the three runs executed (Table 2). Importantly, the % CV for all samples tested and the average across samples was well below our endpoint of  ND: not done, sample values below the limit of detection or not used for this assessment. ND: not done, sample values below the limit of detection or not used for this assessment. ND: not done, sample values below the limit of detection or not used for this assessment.

Interday variability
One operator performed three runs with like samples over three different days (Sample ID A-N, CCPP and NHPP). Endpoint titers were calculated as described (Supplemental Fig. 3) and the CV was evaluated across the three runs executed (Table 3). Importantly, all samples and the average across samples %CV was well below our endpoint of <20% variability (Table 3).

Interoperator variability
Three different technicians ran two duplicate plates using like samples on the same day. Similarly, endpoint titers were calculated (Supplemental Fig. 4) and CV evaluated for each antibody ELISA across the three runs executed (Table 4). These data demonstrate that the EPT ELISA methodology yields results within an acceptable range of variance, all below 20%, meeting the EMA guidelines.

Linearity
We used a linear regression analysis of O.D. compared to dilution to calculate the R 2 for a positive samples across each antibody ELISA evaluated for spike and RBD (Total IgG only) responses. Importantly, only sample containing three data points in the linear range were used to calculate R (Cucinotta and Vanelli, 2020), and those values in plateau ranges were excluded (Fig. 3). The adjusted multiple R 2 values calculated for spike antigen-specific responses were as follows: total IgG 0.930, IgM 0.862, IgA 0.927, IgG1 0.863, and IgG3 0.882 (Fig. 3). Given the low responder rate for IgG4 linearity requirements for sample inclusion were not met. The adjusted multiple R 2 value calculated for total IgG RBD-specific responses was 0.903. Adjusted multiple R 2 values for spike specific IgA, total IgG and RBD specific total IgG responses exceeded the 0.90 linearity qualification endpoint, demonstrating acceptable linearity for this methodology. While spike specific IgM, IgG3 and IgG1 adjusted multiple R2 values did not meet 0.90, they did approach this threshold. Importantly, the positive plate control CCPP exceeded an R 2 value of 0.90 for each antibody class, except IgG4, suggesting these assays are linear.

Specificity
We evaluated the effectiveness of our assay to specifically distinguish antibodies in samples that recognize SARS-CoV-2 spike or RBD. Each antibody was screened against 92 total presumed negative (prepandemic) samples and 4 convalescent samples (or 2 in the case of IgG4) for spike antigen (Table 5), while 82 negative samples were leveraged for RBD ELISA analysis (Table 6). Each antibody evaluated met our endpoint of having ≥90% of negative samples screened return true negative values (below 3 SD of the plate LOD, while all SARS-CoV-2 convalescent samples evaluated returned 100% positive values (Table 5 and Table 6), establishing the specificity of our methodology.

Correlation to WHO reference standard
We ran a WHO reference positive standard (NIBSC 20/136) with an assigned antibody binding activity of 250 IU per ampule alongside a set of samples used for qualification to equate IU antibody binding activity to EPT. We observed a Total IgG endpoint titer of 4.73 and 4.47 against SARS-CoV-2 trimeric spike and RBD, respectively for the WHO sample (Fig. 4). Not surprisingly we saw a consistently lower EPT against RBD than Spike for the same plasma samples, including pooled controls (CCPP and NHPP) as well as the WHO sample (Fig. 4).

Precision
Assay repeatability was assessed in terms of run to run variation.
Samples from 2 subjects and 2 pooled plasma controls (CCPP and NHPP) were run by a single operator in duplicate across three separate days (interday), by a single operator in 3 different runs on the same day (intraday) and by three individual operators on the same day (interoperator). Data from individual samples (Positive, Negative) and pooled plasma controls (CCPP, NHPP) for all intraday, interday, and interoperator precision tests are shown in Fig. 5A, B, and C, respectively. Data from all 3 precision tests are compiled in Fig. 5D. In each individual test and in the compiled data shown in Fig. 5D, all samples fall within 2-fold of the geometric mean, demonstrating the assay is reproducible and precise.

Linearity
RLU values from the Positive plasma samples of the interoperator precision run were used to evaluate assay linearity using linear regression analysis between Log-transformed RLU values and reciprocal plasma dilution concentrations. Only reciprocal plasma dilution concentrations within the linear range were used to generate representative graphs and calculate R 2 values (Fig. 6). As the plasma became more   dilute, the PFU values plateaued, and the assay was no longer in the linear range (data not shown). Duplicate samples for each operator were averaged for an R 2 of 0.939, 0.929, and 0.975 for Operators 1, 2, and 3, respectively. The adjusted multiple R 2 value across all positive samples in all qualification runs came to 0.896, demonstrating the pseudovirus neutralization assay methodology is linear.

Specificity
Across our robust negative sample screen (pre-COVID-19 samples) for specificity, all 20 samples (100%) were negative for SARS-CoV-2 pseudovirus neutralization. All four positive samples (100%) inhibited SARS-CoV-2 pseudovirus by ≥50%. Neither of the two tested COVIDconvalescent samples inhibited the VSV G pseudovirus by 50% at any plasma concentration (Fig. 7), demonstrating the specificity of the assay for SARS-CoV-2 Spike antigen and not overlapping with an unrelated viral antigen.

Precision
To examine the sources of variation present in the assay, we determined the geometric mean of the PRNT 80 values for each sample within each category of precision evaluated (run, operator, day). Data from individual samples (Positive, Negative) and pooled plasma controls (CCPP, NHPP) for all intraday, interday, and interoperator precision tests are shown in Fig. 8A, B, and C, respectively. Data from all 3 precision tests are compiled in Fig. 8D. In each individual test and in the compiled data shown in Fig. 8D, all samples fall within 2-fold of the geometric mean, demonstrating the assay is reproducible and precise.

Linearity
We evaluated the ability of the assay to return PFU values that were proportional to the dilution being examined using linear regression. Only the first four dilutions from each operator (1:80, 1:160, 1:320, and 1:640) were used to generate representative graphs and calculate R 2 Fig. 7. Inhibition of VSV G pseudovirus entry and replication in HEK293T-ACE2 cells as a function of plasma concentration from healthy human donors (NHPP and Negative) or COVID-19 convalescent plasma samples (CCPP and Positive).   (Fig. 9). As the plasma became more dilute, the PFU values plateaued, and the assay was no longer in the linear range (data not shown). Positive control samples from each of the three operators were assessed using a simple linear regression. Each operator's positive sample demonstrated a linear relationship (R 2 = 0.922, 0.965, 0.950) between Log PFU and Log reciprocal dilution (Fig. 9). These data demonstrate that the PRNT assay generates results that are linear with respect to sample dilutions and that this feature is consistent across assay operators.

Specificity
We next evaluated the specificity of serum/plasma samples to SARS-CoV-2 using a robust number of negative samples. Across our negative sample screen (pre-pandemic samples) for specificity, all 20 samples (100%) were negative for SARS-CoV-2 neutralization. All four positive samples (100%) inhibited SARS-CoV-2-PFUs by ≥80%. Both of these metrics meet our required endpoints for qualification of specificity of the methodology.

Assay alignment
Many researchers have demonstrated that higher magnitude antibody responses often correlate with enhanced protection in endpoint assays (Earle et al., 2021;Lee et al., 2020;Wajnberg et al., 2020;Salazar et al., 2020;Padoan et al., 2021). Consistent with recent reports (Sholukh et al., 2021), we aimed to assess this correlation in our own suite of qualified assays. Pooled plasma controls (CCPP and NHPP) were used across all assays and a single convalescent plasma sample was use in the pseudovirus neutralization and total IgG ELISA for RBD and Spike responses. We used these overlapping samples to demonstrate a correlation between EPT values and neutralization for High, Medium and Low responder samples (Fig. 10, Supplementary Fig. 5). We observed that CCPP consistently returned High responder values, the single convalescent samples consistently returned Medium responder values and NHPP was consistently Low across all endpoint assays evaluated (Supplemental Fig. 5). Furthermore, we are able to demonstrate a significant correlation between Reciprocal Titer IC 50 from pseudovirus neutralization and both Total IgG Spike EPT (adjusted R 2 = 0.9623) and Total IgG RBD EPT (adjusted R 2 = 0.8921), as well as a correlation between Total IgG for Spike and RBD EPT (adjusted R 2 = 0.9814) (Fig. 10).

Conclusions
Our collective understanding of the required magnitude and composition of a protective immune response against SARS-CoV-2 continues to develop. Early investigations of infection-induced responses for diagnostic purposes included evaluation of IgA, IgM and IgG humoral immunity (Ma et al., 2020;Bichara et al., 2021;Maeda et al., 2021) and many commercial kits have been developed for these endpoints. SARS-CoV-2 specific IgA seems to be a high performing early diagnostic endpoint (Ma et al., 2020;Sterlin et al., 2021), while the kinetics of post-infection IgM and IgG better reflect peak and waning immunity in convalescent cohorts (Bichara et al., 2021;Duysburgh et al., 2021;De Greef et al., 2021). Interestingly, some studies have observed pro-inflammatory IgG subsets (IgG1 and IgG3) trend higher with severity of clinical COVID-19 disease while others (IgG2 and IgG4) are rarely detected (Yates et al., 2021;Luo et al., 2021). Many clinical vaccine immunogenicity and efficacy studies have focused on total IgG as a primary endpoint, and recent evaluations suggest this is a quality correlate with vaccine efficacy rates (Earle et al., 2021). Given the heterogenous health and infection history of the global population, we decided to include total IgG, IgG subsets, IgA and IgM in our qualifications to provide a comprehensive breadth of primary and secondary humoral immunity endpoints.
Here we have described robust methodologies qualified to evaluate the essential protective features of the humoral response against SARS-CoV-2. The EPT ELISA, pseudovirus neutralization and PRNT assays were all determined to be precise, linear, and specific for antibodyspecific SARS-CoV-2 responses. These assays can be reliably used across days and by different trained operators to measure the effectiveness of clinical vaccine candidates both in the pipeline and approved for emergency use. In addition, the use of a WHO reference standard makes these qualified assays more accessible to other labs and researchers for use in comparing SARS-CoV-2 vaccine candidates. We have also established robust high, medium and low responder samples or sample pools that can be leveraged for future work.

Author contribution statement
SEL, BJB, TP, EC and RNC designed experiments; SEL, BJB, TP, EC, BDW, and EJ carried out experiments and SEL, PQ, BJB and BPB executed statistical analysis; LC, SW, EK, CS and NPK provided reagents; SEL, BJB and EC wrote the manuscript and prepared figures and tables with assistance from PQ, BPB, SLB, and RNC. All authors reviewed the manuscript.

Additional information
All authors declare no financial or alternative competing interests with the data described in this manuscript.

Declaration of Competing Interest
None. for their commitment to science and resilience during a pandemic. The authors would like to thank their long-time colleague Dr. Jesse Erasmus for invaluable technical help when adopting and optimizing neutralization assays.
The datasets generated and/or analyzed during the study presented here are available from the corresponding authors on reasonable request.
Research reported here was supported by: National Institute of Allergy and Infectious Diseases (NIAID) of the National Institutes of Health under award numbers 1UM1AI148373-01 and 3UM1AI148373-01S1, a Bill & Melinda Gates Foundation (BMGF) grant to N.P.K. (OPP1156262), and B.P⋅B is supported through award number F32HD102290 from the National Institute of Child Health and Human Development (NICHD). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or BMGF.