SARS-CoV-2 Antibody Rapid Tests: Valuable Epidemiological Tools in Challenging Settings

ABSTRACT During the last year, mass screening campaigns have been carried out to identify immunological response to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and establish a possible seroprevalence. The obtained results gained new importance with the beginning of the SARS-CoV-2 vaccination campaign, as the lack of doses has persuaded several countries to introduce different policies for individuals who had a history of COVID-19. Lateral flow assays (LFAs) may represent an affordable tool to support population screening in low-middle-income countries, where diagnostic tests are lacking and epidemiology is still widely unknown. However, LFAs have demonstrated a wide range of performance, and the question of which one could be more valuable in these settings still remains. We evaluated the performance of 11 LFAs in detecting SARS-CoV-2 infection, analyzing samples collected from 350 subjects. In addition, samples from 57 health care workers collected at 21 to 24 days after the first dose of the Pfizer-BioNTech vaccine were also evaluated. LFAs demonstrated a wide range of specificity (92.31% to 100%) and sensitivity (50% to 100%). The analysis of postvaccination samples was used to describe the most suitable tests to detect IgG response against S protein receptor binding domain (RBD). Tuberculosis (TB) therapy was identified as a potential factor affecting the specificity of LFAs. This analysis identified which LFAs represent a valuable tool not only for the detection of prior SARS-CoV-2 infection but also for the detection of IgG elicited in response to vaccination. These results demonstrated that different LFAs may have different applications and highlighted the possible risks of their use in high-TB-burden settings.
IMPORTANCE Our study provides a fresh perspective on the possible employment of SARS-CoV-2 LFA antibody tests. We developed an in-depth, large-scale analysis comparing LFA performance to enzyme-linked immunosorbent assay (ELISA) and electrochemiluminescence immunoassay (ECLIA) and evaluating their sensitivity and specificity in identifying COVID-19 patients at different time points from symptom onset. Moreover, for the first time, we analyzed samples of patients undergoing treatment for endemic poverty-related diseases, especially tuberculosis, and we evaluated the impact of this therapy on test specificity in order to assess possible performance in TB high-burden countries.

Vaccines Global Access (COVAX) program (1,2). In the current scenario, in which the lack of vaccine doses has persuaded several countries to introduce different policies for individuals who had a history of SARS-CoV-2 infection (a decision that has not been fully endorsed by WHO) (3), access to reliable data about the serological status of individuals could gain new importance (4). However, whereas serological mass screening in high-income countries could be feasible using automatic, high-throughput technologies (5,6), this may not be a practical option in several challenging diagnostic settings where the prevalence of SARS-CoV-2 infection is still widely unknown (7). In these countries, lateral flow assays (LFAs) for the identification of SARS-CoV-2 antibodies may represent an affordable and practical tool to perform epidemiological evaluations, but only a few serological surveys have been conducted to date employing LFAs, associated or not with enzyme-linked immunosorbent assays (ELISA) (8,9).
Although several studies have demonstrated their variable performance (10)(11)(12)(13), SARS-CoV-2 antibody detection LFAs are currently considered a homogeneous group with comparable specificity and sensitivity. This has contributed to a common feeling of distrust toward them in the scientific community (14), and a consensus on which LFA could be employed as an effective epidemiological tool has not been reached. To date, WHO recommends their use only in research for possible epidemiological employment (15,16), and no LFA has received a WHO emergency use listing.
An accurate evaluation of which LFA offers the highest reliability in identifying SARS-CoV-2 infected individuals or in monitoring the immune response to the vaccine could be helpful to select an effective and cheap tool to be used in challenging diagnostic settings.
In this regard, our laboratory evaluated the performance of 11 different LFAs and one ELISA in detecting SARS-CoV-2 infection, analyzing plasma and serum samples collected from 350 subjects.
Moreover, to evaluate which tests could be useful in identifying an immune response elicited by the COVID-19 vaccine, 57 plasma samples were collected at 21 to 24 days from the first dose of the BNT162b2 Pfizer-BioNTech vaccine. These samples were examined using the 11 different LFAs and an electrochemiluminescence immunoassay (ECLIA) measuring IgG against spike protein receptor binding domain (RBD).
The number of false positives increased dramatically in the group of samples collected before 2019 from patients receiving therapy for tuberculosis (TB) (PreK). There was at least one indeterminate or false positive for IgM and/or IgG for these samples across all LFAs (Fig. 1a).
Logistic mixed-effects models were used to evaluate the differences among PreH, PostH, and PreK groups.
For IgM, the analysis showed that the results were significantly less likely to be negative in tuberculosis subjects (PreK) compared to healthy subjects (PreH) (odds ratio [OR] = 0.12; P value = 0.0115) (see Table S2 in the supplemental material). For IgG, no statistically significant differences were observed (see Table S3 in the supplemental material).
In the models, the effects of sex, age, and ethnicity were also assessed, and no differences were observed. The group analyzed, including COVID-19 patients and negative controls, had a median age of 33 years (range, 18 to 84 years); males and females were, respectively, 59% and 41%; 79.8% were Caucasian, 6.1% Hispanic, 4.3% Asian, and 9.8% black.
As different times of seroconversion have been reported in the literature (17), the sensitivity of the tests was assessed in samples collected across different periods in relation to symptom onset. The results of the evaluation are shown in Fig. 1b. Overall, the capability of the LFAs to identify infected individuals was evaluated against SARS-CoV-2 infection confirmed by real-time reverse transcriptase PCR (rRT-PCR) performed on nasopharyngeal swab (NPS). Logistic mixed-effects models, followed by post hoc pairwise comparisons (Bonferroni-corrected for multiple comparisons), were used to evaluate the differences among groups defined according to the time of sampling. For IgM, significant differences emerged only between samples collected at ≤7 days from symptom onset and samples collected at 15 to 35 days (P = 0.0072) (see Table S4 in the supplemental material). For IgG, significant differences emerged between samples collected at ≤7 days from symptom onset and samples collected at 15 to 35 days (P = 0.0004) (see Table S5 in the supplemental material) as well as between samples collected at 8 to 14 days from symptom onset and samples collected at 15 to 35 days (P = 0.0188) (see Table S5). IgM and IgG kinetics in SARS-CoV-2 infection are still not completely understood (17). Our study confirmed the tendency of IgM and IgG to rise at the same time in COVID-19 patients, as well as a lower specificity of IgM in identifying the infected individuals (10,17).
Indeterminate analysis. Indeterminate results were observed for all of the tests in the analysis. IgM indeterminate results were reported more frequently than IgG, and each indeterminate test was repeated once. After repetition, the two tests with the highest number of IgM indeterminate results were BTNX (30/395) and RightSign (29/395). RightSign also had the highest number of IgG indeterminate results (21/395) (Fig. 2).
Indeterminate results could be interpreted as positive or negative if an indeterminate cutoff is not clearly defined before proceeding with the exam. Therefore, we evaluated whether the capability of the tests to identify true negative and true positive samples changed when the indeterminate results were all considered positive or all considered negative. The effects of these reclassifications on the sensitivity and specificity of each LFA were examined (Fig. 3 and 4).
Compared with the sensitivity calculated excluding the indeterminate results, sensitivity increased by 2% to 8%, depending on the test, when the indeterminate results were considered positive and decreased by 2% to 7% when they were considered negative (Fig. 3). Specificity also varied (Fig. 4), decreasing by 2% to 4% when indeterminates were considered positive and increasing by 2% to 3% when they were considered negative.
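The three conventions compared above (indeterminates excluded, counted as positive, or counted as negative) amount to simple changes in the numerator and denominator. The following Python sketch illustrates the arithmetic only; the function names and the counts in the usage note are hypothetical and are not taken from the study data.

```python
def sensitivity_scenarios(pos, neg, indet):
    """Sensitivity among rRT-PCR-confirmed cases under three conventions.

    pos, neg, indet: numbers of clearly positive, clearly negative, and
    indeterminate LFA readings among confirmed cases (hypothetical inputs).
    """
    total = pos + neg + indet
    return {
        "excluded": pos / (pos + neg),         # indeterminates dropped
        "as_positive": (pos + indet) / total,  # indeterminates count as hits
        "as_negative": pos / total,            # indeterminates count as misses
    }


def specificity_scenarios(pos, neg, indet):
    """Specificity among known-negative samples under the same conventions."""
    total = pos + neg + indet
    return {
        "excluded": neg / (pos + neg),
        "as_positive": neg / total,            # indeterminates become false positives
        "as_negative": (neg + indet) / total,  # indeterminates become true negatives
    }
```

For example, with hypothetical counts of 80 positive, 15 negative, and 5 indeterminate readings among confirmed cases, `sensitivity_scenarios(80, 15, 5)` gives 0.85 when indeterminates are counted as positive and 0.80 when they are counted as negative.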
Concordance between tests. The agreement between LFAs and ELISA for IgG has been estimated (Fig. 5). The concordance level between LFAs and ELISA in the COVID-19-positive cohort remained at each time point higher than 70%, reaching 100% only for one test (VivaDiag) at one time point (15 to 35 days from symptom onset).
For the PostH group, the IgG results provided by BioMedomics, BTNX, Coretests, Orient Gene, Perfectus, RightSign, and SD agreed at 100% with ELISA results. In the PreH group, the maximum concordance rate between LFAs and ELISA was 99%. The agreement level was lower for the PreK cohort, reaching a peak of 98.8% for Innovita and VivaDiag. The concordance rate between different LFAs plotted for IgM and IgG in PreH, PostH, and PreK groups is provided in Fig. S1A and S1B in the supplemental material. The overall level of agreement for IgM is higher than 92% in the PostH cohort and higher than 95% in the PreH one. In the PreK cohort, instead, the BTNX concordance rate with the other tests was low and reached its highest point at 87.8% with Orient Gene. Notably, BTNX was the only test that reported in its instructions for use that TB drugs (rifampicin, 78.1 µmol/liter; isoniazid, 292 µmol/liter; ethambutol, 58.7 µmol/liter) were tested as possible interfering substances without affecting the test's performance. Nonetheless, in the PreK group, we observed a dip in BTNX specificity for IgM, which dropped under 95% (93.51%; 95% CI, 85.49 to 97.86).
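The concordance rates reported here are percentages of samples on which two assays gave the same result. A minimal Python sketch of that calculation follows; the function name and the result labels in the usage note are illustrative assumptions, not part of the study's analysis code (which was written in R).

```python
def percent_agreement(results_a, results_b):
    """Percentage of paired samples on which two assays give the same call.

    results_a, results_b: equal-length sequences of per-sample results,
    listed in the same sample order (labels are hypothetical).
    """
    if len(results_a) != len(results_b):
        raise ValueError("both assays must be read on the same samples")
    if not results_a:
        raise ValueError("at least one paired sample is required")
    concordant = sum(a == b for a, b in zip(results_a, results_b))
    return 100.0 * concordant / len(results_a)
```

With four hypothetical paired readings, `percent_agreement(["pos", "neg", "neg", "pos"], ["pos", "neg", "pos", "pos"])` returns 75.0, since three of the four calls match.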
Seroconversion pattern. The seroconversion pattern was evaluated by sampling 45 individuals at two different time points. Of 16 patients sampled at ≤7 days from symptom onset, 3 were reanalyzed between 15 and 35 days and 13 at ≥36 days; of the 23 patients first sampled between 8 and 14 days, 3 were sampled again between 15 and 35 days and 20 at ≥36 days; finally, 6 were sampled at 15 to 35 days and then at ≥36 days. Decay of the antibody response has been reported for IgM starting from 23 days after symptom onset, while IgG titers appeared to be stable for up to 79 days (18). In our study, IgM seroreversion was observed within 30 days from symptom onset with BioMedomics, Innovita, QuickZen, Orient Gene, Perfectus, Coretests, and Tigsun (Fig. 6a). Moreover, IgG decay was observed for three samples with Innovita and two with QuickZen within 1 month of symptom onset. Interestingly, one patient who tested positive for IgG at ≤7 days from symptom onset tested negative when retested 30 days after the first sampling with all of the LFAs in the study, apart from VivaDiag and Tigsun. For only this patient, the seroreversion observed with LFAs was confirmed with ELISA. The patient was a 61-year-old male with a history of hypertension and diabetes who developed acute respiratory distress syndrome (ARDS) during his hospitalization but was not admitted into the intensive care unit (ICU). The same patient tested positive for IgM with all LFAs in the study immediately after admission into hospital, but 30 days later, at the second sampling, an IgM decay was also observed with 7/11 LFAs. The other four tests provided an indeterminate result. The occurrence of seroreversion for both IgG and IgM does not appear to be related to any specific preexisting condition or to the severity of the illness for any of the subjects in the analysis.
Health care workers. Of the 57 health workers sampled at 21 to 24 days from the first dose of the Pfizer vaccine, all showed a positive titer of IgG against SARS-CoV-2 spike protein RBD (≥0.8 U/ml), detectable by ECLIA (Table 1). Of the 11 LFAs used to detect an IgM response, only two, Tigsun and BTNX, failed to show a positive response in any of the samples tested, while Orient Gene had an IgM positivity rate of 12.24% (6/49) and the highest IgG detection rate (85.71%; 48/56) (Table 2).
The rate of positivity to IgG of the different LFAs was evaluated in comparison to the quantitative results obtained by ECLIA. As shown in Table 1, VivaDiag and Innovita did not detect positive IgG for ECLIA titers of >2,500 U/ml. The lowest positive titer was correctly identified by Orient Gene (4.04 U/ml).

DISCUSSION
In this study, we analyzed the performance of 11 different LFAs and one commercial ELISA in detecting SARS-CoV-2-specific IgG and IgM. Specificity was assessed in three cohorts as follows: historic samples collected before 2019 in healthy donors, in patients who were on TB treatment, and in individuals who tested RT-PCR negative for SARS-CoV-2.
The overall sensitivity of the tests (positive to IgM and/or IgG) was calculated by evaluating their capability to correctly identify individuals confirmed to have been infected with SARS-CoV-2 by rRT-PCR. The obtained data allowed us to identify Coretests as the test with the highest specificity and sensitivity at ≥36 days from symptom onset. Hence, this test appeared to be more appropriate for a serological mass screening, because its high specificity lowers the risk of false-positive results.
Furthermore, Orient Gene demonstrated the highest sensitivity in identifying a positive titer of IgG against protein S RBD, proving itself a possible test to evaluate an immunological response after the vaccine. Considering the limited number of cases analyzed, an in-depth evaluation on a wider cohort is needed to assess the effective reliability for this purpose of Orient Gene in comparison to other LFAs. Interestingly, VivaDiag and Innovita, even though they demonstrated good specificity (both 100%; 95% CI, respectively, 95.55 to 100 and 95.60 to 100) and sensitivity (VivaDiag, 94.12%; 95% CI, 71.31 to 99.85; Innovita, 86.67%; 95% CI, 59.54 to 98.34) in identifying positive subjects at 15 to 28 days from symptom occurrence, did not detect the IgG response at 21 days from the vaccination. Indeed, more information by LFA manufacturers on the antigenic targets of their tests would help to perform a more on-point evaluation of these tests. A further study, including more time points from symptom onset, could be useful to evaluate the reliability of LFAs to identify a previous infection of SARS-CoV-2 at 60 and 90 days from the infection. The main limitation in the sensitivity assessment is due to the lack of asymptomatic and paucisymptomatic patients in our cohort to perform an evaluation of the rate of positivity in association with the severity of the developed disease.
FIG 5 Concordance rate between ELISA and LFAs for IgG. The concordance rate between LFAs and ELISA in samples from COVID-19 patients remained at each time point higher than 70%. The highest agreement level with ELISA was observed between 15 and 28 days from symptom onset for all of the LFAs with the exclusion of BioMedomics, Innovita, and QuickZen. BioMedomics and QuickZen demonstrated 93.6% of agreement at more than 36 days. Innovita showed 88.9% of agreement at less than 7 days. The only test that reached 100% of concordance with ELISA was VivaDiag at 15 to 35 days from symptoms onset.
The cohort of patients in therapy for TB was included to evaluate the possible effects of TB drugs on LFA performance, since one test (BTNX) listed rifampicin, ethambutol, and isoniazid in its product insert as possible interfering substances. The manufacturer declared that sensitivity and specificity of the BTNX test were not affected by the presence of these drugs at therapeutic concentrations in blood. Nonetheless, the level of agreement of BTNX with other LFAs for IgM is the lowest in the PreK group; therefore, TB therapy could have had an effect on the test despite what is declared by the manufacturer. Although it is well known that rifampicin can cause false-positive immunoassay results for urine opiates (19), to our knowledge, this is the first report suggesting that TB medicines can affect SARS-CoV-2 LFAs for antibody detection. This observation deserves an in-depth analysis to identify the possible mechanisms for cross-reactivity, and it should be taken into account if LFAs are used in countries with a high TB prevalence.
A higher number of indeterminate results was overall observed for IgM than for IgG. The identification of these faint bands affected the general efficiency and reliability mainly of two of the LFAs in analysis, BTNX and VivaDiag. The repetition of the indeterminate exams did not provide a clear positive or negative result in the majority of cases for BTNX IgM, as 30/395 still remained not interpretable. Previous studies have suggested that all faint identified bands should be considered negative to improve the specificity of the test (10). However, our analysis demonstrated that considering all of the indeterminate samples as negative would result in a major decrease in the sensitivity of the tests (up to 7%) compared with a minimal gain in specificity (2% to 3%). Moreover, the definition of "faint" is based on a subjective evaluation of the band intensity, which in the future may be made objective through the use of automatic readers or, at least, through careful training of the readers (20).
In conclusion, the tests analyzed demonstrated different performances and different levels of reliability in identifying IgM and IgG against SARS-CoV-2. Therefore, great care should be taken to select the most accurate point-of-care (POC) serological tests to evaluate local epidemiology as well as to verify the development of an immunological response after the vaccine, especially in diagnostically challenging settings. The need for reader training as well as the possible interference of TB therapy with the test results have been identified by our study as two of the main limiting factors for the use of these tests in low-middle-income countries. Finally, in a period of scarcity of vaccine doses, when several European countries, including Italy and France, are recommending a single dose of vaccine for individuals who were positive for SARS-CoV-2 in the previous 6 months, the tests with the highest specificity may be used to determine a prior infection and therefore deeply influence the vaccination campaign (3,4).

MATERIALS AND METHODS
Study design: setting and population. This study included two different sampling phases that took place, respectively, between April and June 2020 and January and February 2021 at San Raffaele Research Hospital in Milan, Italy. The different patient cohorts analyzed are summarized in Fig. 7.
During the first phase, in the spring of 2020, 128 symptomatic COVID-19 patients who tested positive by rRT-PCR performed on nasopharyngeal swab (NPS) participated in the study, and from 45 of them, two samples were collected at different time points. All patients were hospitalized. Their symptoms included cough (58.13%), dyspnea (55.40%), and fever (89.01%). Moreover, 49.71% of the patients developed acute respiratory distress syndrome (ARDS), and 16.18% died. None of the COVID-19 patients had a history of chronic obstructive pulmonary disease (COPD), but other chronic diseases such as diabetes and hypertension were reported (respectively, 9.82% and 32.94%). All clinical data were extracted from the San Raffaele Research Hospital internal database. At the same time point, 26 plasma samples were collected from volunteers who tested negative for SARS-CoV-2 infection by rRT-PCR performed on NPS (designated in tables and figures as the PostH group). To evaluate the specificity of the tests, 196 samples collected and stored before 2019 were included in the analysis; 82 were from patients in therapy for tuberculosis (TB), and 114 were from healthy donors (in tables and figures, they are referred to as the PreK group and PreH group, respectively). The patients in therapy for TB were included to evaluate any possible interfering effect of TB drugs on test specificity.
All samples were collected by venipuncture in serum tubes with spray-coated silica or in K2EDTA tubes, stored at +4°C, and aliquoted for freezing at −80°C within 1 week of the blood draw. Serum and plasma were used interchangeably for all tests, except for the Euroimmun ELISA, which is applicable only to serum samples.
The clinical and demographic characteristics of this population are summarized in Tables 3 and 4.
To evaluate the capability of the tests for identifying an immune response elicited by the COVID-19 vaccine, a cohort of 58 health workers who had received the first dose of BNT162b2 Pfizer-BioNTech vaccine 21 to 24 days before was surveyed between January and February 2021. One clotted sample was excluded from the analysis. They were all adults, and females and males were equally represented. None of them reported any relevant comorbidity or previous immunological disease or allergic reaction to drugs and/or vaccines. From this cohort, all samples were collected by venipuncture, stored at +4°C, and analyzed in the 24 h following the collection using 11 different LFAs and an ECLIA measuring IgG against spike protein RBD.
Immunochromatographic LFAs. Eleven LFAs were utilized according to the manufacturer's instructions (see Table S1 in the supplemental material).
In brief, the appropriate sample volume was added on the indicated sample port, followed by a defined amount of the provided diluent. The cartridges were then incubated at room temperature for the recommended time. Result reading was performed by two independent observers. In case of disagreement, a third reader was consulted, and the final result was given by two concordant readings.
Samples were considered negative if the control band was present and the test band absent and positive if both bands were clearly observed. An indeterminate result was given if the control band was identified together with a faint test band, with an intensity definitely lower than that of the control band, that could not be clearly associated with a positive reaction. The test was considered invalid if the control band was not identified.
All indeterminate and invalid results were repeated once. If a clear interpretation of the test was still not possible because of the presence of a low-intensity band identified by both readers, the test result was confirmed as indeterminate. None of the samples tested invalid a second time for any of the tests in analysis.
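The reading rules above can be summarized as a small decision procedure. The Python sketch below is only an illustration of that logic: the function names and string labels are hypothetical, and the handling of three mutually discordant readings is an assumption, since the text does not describe that case.

```python
def read_cassette(control_band_present, test_band):
    """Classify one reading; test_band is 'absent', 'faint', or 'clear'."""
    if not control_band_present:
        return "invalid"        # no control band: the test is not interpretable
    if test_band == "absent":
        return "negative"
    if test_band == "clear":
        return "positive"
    if test_band == "faint":
        return "indeterminate"  # band clearly weaker than the control band
    raise ValueError(f"unknown test band description: {test_band!r}")


def adjudicate(reader1, reader2, reader3=None):
    """Final result from two independent readers, with a third on disagreement."""
    if reader1 == reader2:
        return reader1
    if reader3 is None:
        raise ValueError("readers disagree: a third reading is required")
    if reader3 in (reader1, reader2):
        return reader3          # two concordant readings decide the result
    return None                 # assumption: three discordant readings stay unresolved
```

For instance, `adjudicate("positive", "indeterminate", "positive")` returns `"positive"`, reflecting the rule that the final result is given by two concordant readings.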
ELISA. Euroimmun anti-SARS-CoV-2 ELISA for the detection of IgG against the SARS-CoV-2 S1 domain was carried out according to the manufacturer's instructions. In brief, 10 µl of serum was diluted 1:101 in the provided sample buffer. Then 100 µl of the diluted samples, calibrator, and positive and negative controls were transferred into the precoated microplate wells according to the provided pipetting protocol and incubated at 37°C for 60 min. Following the washing step, conjugate and then substrate incubations were performed before the addition of the stopping solution and the consequent photometric measurement.
ECLIA. Elecsys anti-SARS-CoV-2 S (Roche) is an ECLIA for the determination of IgG against the SARS-CoV-2 spike (S) protein receptor binding domain (RBD). The assay, based on a double-antigen sandwich assay format, has been performed according to the manufacturer's instructions on a cobas e 411 analyzer on the 57 samples collected from health workers after the first vaccination dose.
Statistical analysis. Descriptive statistics of continuous variables were presented as median and interquartile range, while for categorical variables, frequencies were reported. In the absence of a gold-standard test for serological detection, sensitivity and specificity were estimated using surrogate reference standards. Sensitivity was estimated using samples collected from patients confirmed by rRT-PCR to have been infected with SARS-CoV-2, while specificity was estimated using samples from healthy negative controls and patients receiving therapy for tuberculosis collected prior to the circulation of SARS-CoV-2. Binomial exact 95% confidence intervals were calculated for all estimates. Logistic mixed-effects models were used to evaluate differences among groups, since the data consist of repeated measurements of the same subjects. The agreement between assays was evaluated by computing the percentage of concordant results. All statistical analyses were performed using R statistical software version 4.0.4 (www.r-project.org).
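As an illustration of the binomial exact (Clopper-Pearson) interval mentioned above, the following Python sketch computes the bounds by bisection on the binomial tail probabilities, using only the standard library. It is not the authors' R code; the function names are hypothetical, and the sample counts in the usage note (16 of 17 and 81 of 81) are assumptions back-calculated from the reported percentages and intervals rather than figures stated in the text.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iterations=100):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for x successes in n trials."""
    if not 0 <= x <= n or n == 0:
        raise ValueError("require n > 0 and 0 <= x <= n")
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper
```

Under the stated assumption about counts, `clopper_pearson(16, 17)` gives bounds of roughly 0.713 and 0.999, in line with the interval of 71.31 to 99.85 reported for a sensitivity of 94.12%, and `clopper_pearson(81, 81)` gives a lower bound of roughly 0.956 with an upper bound of 1.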
Ethical approval. This study was approved by the ethical committee and institutional review board of San Raffaele Research Hospital in Milan, Italy (protocol number COVID-19 IA evaluation). All patients and healthy controls agreed to the study by signing the informed consent.