Original Approach for Automated Quantification of Antinuclear Autoantibodies by Indirect Immunofluorescence

Introduction. Indirect immunofluorescence (IIF) is the gold standard method for the detection of antinuclear antibodies (ANA), which are essential markers for the diagnosis of systemic autoimmune rheumatic diseases. For the discrimination of positive and negative samples, we propose here an original approach named Immunofluorescence for Computed Antinuclear antibody Rational Evaluation (ICARE), based on the calculation of a fluorescence index (FI). Methods. We compared FI with visual evaluation on 237 consecutive samples and on a cohort of 25 patients with SLE. Results. FI showed very good analytical performance (95% sensitivity, 98% specificity, and a kappa of 0.92), even in a subgroup of weakly positive samples. A significant correlation between FI and IIF ANA titers was found (Spearman's ρ = 0.80, P < 0.0001). Clinical performance of ICARE was validated on a cohort of patients with SLE, supporting FI as an attractive alternative for the evaluation of antibody titer. Conclusion. Our results represent a major step toward automated quantification of IIF ANA, opening attractive perspectives such as rapid sample screening and laboratory standardization.


Introduction
Antinuclear antibodies (ANA) are essential biological markers for the diagnosis [1], classification, and disease activity monitoring [2] of systemic autoimmune rheumatic diseases. Given this central role, ANA screening should be accurate and reproducible. For several decades, indirect immunofluorescence (IIF) on HEp-2 cells has been the reference technique for ANA testing. Although newer techniques [3,4] such as ELISA or multiplexing solid phase technologies have been proposed to replace IIF, the American College of Rheumatology (ACR) still recommends IIF as the gold standard method for ANA detection [5]. The main drawback of this technique is the subjectivity of IIF reading, with intra- and interlaboratory variability complicating the standardization expected in modern laboratories.
Recently, commercial automated systems for ANA IIF reading and interpretation have become available and were described in the literature [6][7][8][9][10][11]. Most of them are based on data mining and supervised machine learning methods [12]. In addition to their complexity, they share a common weakness in the detection of weak positivity.
In this work, we describe an original algorithm named Immunofluorescence for Computed Antinuclear antibody Rational Evaluation (ICARE) for automation of IIF ANA evaluation offering excellent analytical performance and an attractive quantitative approach for positive/negative discrimination. We assess the quantification of the fluorescence intensity as an alternative to antibody titer evaluation and validate our approach in a population of patients with systemic lupus erythematosus (SLE).

Patients and Serum Samples.
We collected serum samples from 2 cohorts of patients: a "routine cohort" and a "SLE cohort" (Table 1). Based on visual IIF ANA analysis, the routine cohort was split into ANA negative (n = 103) and ANA positive (n = 134) sera. As expected, there were significantly more women in the ANA IIF positive group than in the ANA IIF negative group (70% versus 51%, P = 0.01).

ANA Testing.
ANA in patients' sera were detected by a commercial ANA HEp-2 indirect immunofluorescence assay. An automated instrument (PhD system, Bio-Rad Laboratories, Hercules, CA) was used for IIF slide preparation. Samples diluted in phosphate buffered saline were incubated on HEp-2 cells fixed on glass slides (Kallestad HEp-2 Cell Line Substrate, 12-well slides, Bio-Rad Laboratories, Hercules, CA) for 30 minutes at room temperature (RT). The screening dilution was 1 : 100. After washing, bound antibodies were detected by incubation with fluorescein isothiocyanate (FITC) conjugated sheep anti-human immunoglobulin (Kallestad FITC conjugate, Bio-Rad Laboratories, Hercules, CA) for 30 minutes at RT. Subsequently, slides were washed, mounted in a 4′,6-diamidino-2-phenylindole (DAPI) containing medium (Vectashield, Vector Laboratories Inc., Burlingame, CA), and visually assessed with a fluorescence microscope (Leica DM1000, Leica Microsystems, Germany) by two experienced observers. For each positive sample, fluorescence pattern and titer were visually assigned. The visual cutoff titer was 100, corresponding to sera with weak positivity. Based on visual ANA analysis, three patient groups were created: positive ANA (titer ≥ 200), weakly positive ANA (titer = 100), and negative ANA groups. The titer was defined as the reciprocal of the highest dilution of serum that still showed immunofluorescent nuclear staining.

Image Capture.
For each patient, two images of the same central microscopic field were captured with a 20x objective at two different excitation wavelengths: 480 nm for FITC stain and 360 nm for DAPI stain. Captures were performed with a fluorescence microscope (Leica DM1000, Leica Microsystems, Germany) equipped with 360 nm and 480 nm LEDs for excitation (FluoLed, Fraen Corporation Srl, Settimo Milanese, Italy). Images were captured at a resolution of 1392 × 1040 pixels with a color CCD camera (Infinity 2, Lumenera Corporation, Ottawa, Canada). Exposure times for FITC and DAPI captures were 200 ms and 300 ms, respectively. All captured color images had a 24-bit depth and were saved in Tagged Image File Format (TIFF) for subsequent analysis.
As an example, Figure 1 shows the IIF microscopic images obtained from one positive and one negative serum.

ICARE Algorithm Description for Image Analysis.
First, using image analysis software, we split the RGB color channels and kept the blue channel for DAPI images and the green channel for FITC images. Then, the DAPI image was used to determine nucleus position. This was performed using a thresholding method based on image histogram analysis. We defined the background intensity of the DAPI image as the first peak of the DAPI histogram. A threshold defined as twice this background intensity allowed appropriate segmentation and selection of the nucleus region of the DAPI image. This nucleus region selection was then superimposed on the FITC image, which allowed measurement of the mean fluorescence intensity of the nucleus region of the FITC image (MFI n). Then, inverting the selection allowed measurement of the mean fluorescence intensity of the non-nucleus background region of the FITC image (MFI b).

ICARE Index Calculation.
For each captured well, we defined a nondimensional index called fluorescence index (FI) and calculated as follows: FI = (MFI n)/(MFI b).
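The segmentation and index calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in the study: the images are represented as flat lists of 8-bit channel values, the "first peak" is taken as the lowest-intensity local maximum of the raw histogram (the peak-finding rule is not specified in the text), pixels strictly above the threshold are treated as nucleus, and the function names `first_histogram_peak` and `fluorescence_index` are hypothetical.

```python
from statistics import mean


def first_histogram_peak(values, levels=256):
    """Return the lowest intensity that is a local maximum of the histogram.

    Assumption: the source only says "first peak"; no smoothing is applied here.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    for i in range(levels):
        left = hist[i - 1] if i > 0 else 0
        right = hist[i + 1] if i < levels - 1 else 0
        if hist[i] > 0 and hist[i] >= left and hist[i] > right:
            return i
    raise ValueError("image has no pixels")


def fluorescence_index(dapi_blue, fitc_green):
    """Compute FI = (MFI n)/(MFI b) from two aligned lists of pixel values.

    dapi_blue:  blue channel of the DAPI capture (used only for segmentation)
    fitc_green: green channel of the FITC capture (used for intensity)
    """
    if len(dapi_blue) != len(fitc_green):
        raise ValueError("both captures must cover the same field")
    background = first_histogram_peak(dapi_blue)
    threshold = 2 * background  # twice the DAPI background intensity
    nucleus = [g for b, g in zip(dapi_blue, fitc_green) if b > threshold]
    non_nucleus = [g for b, g in zip(dapi_blue, fitc_green) if b <= threshold]
    if not nucleus or not non_nucleus:
        raise ValueError("segmentation produced an empty region")
    return mean(nucleus) / mean(non_nucleus)


# Toy 12-pixel "image" (made-up values): 8 background pixels, 4 nucleus pixels.
dapi = [10] * 6 + [12] * 2 + [200] * 4
fitc = [50] * 8 + [90, 110, 100, 100]
print(fluorescence_index(dapi, fitc))
```

In this toy field the nucleus region is twice as bright as the background on the FITC channel, so the index comes out as 2; real captures would be read from the TIFF files with an imaging library.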
The reproducibility of FI was tested. A single sample with weakly positive ANA (titer = 100) was tested 10 times in 10 wells each day on 3 consecutive days. Coefficients of variation of FI on the 3 days were 8.6%, 8.7%, and 5.9%.
We also evaluated the effect of camera exposure time on FI by studying FI as a function of exposure time (50-300 ms) for positive ANA patients (data not shown). No variation was observed, indicating that FI values were independent of exposure time.

Statistics.
Analytical performance of the ICARE algorithm was evaluated by calculating sensitivity (Se.), specificity (Spe.), positive predictive value (PPV), and negative predictive value (NPV). Accuracy was defined as the proportion of the total number of correct predictions by FI. The Mann-Whitney test was used to compare the mean values of FI, and Spearman's rank correlation coefficient was used to study the correlation between FI and IIF ANA titers. The agreement between visual and algorithmic interpretation was evaluated using Cohen's kappa coefficient, which takes the value (i) zero if there is no more agreement between two tests than expected by chance and (ii) 1 if there is perfect agreement. Kappa values below 0.4 were considered as poor agreement, values between 0.4 and 0.75 as fair to good agreement, and values higher than 0.75 as excellent agreement, as described [13]. Data were analyzed and curves plotted using R statistical software (R Foundation for Statistical Computing, Vienna, Austria) and Microsoft Excel 2007. The threshold for statistical significance was set at P = 0.05.
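For reference, the standard definitions behind these measures can be written as follows, taking visual IIF reading as the reference. The symbols TP, FN, FP, TN (true/false positives and negatives of the FI classification), N (total number of samples), p_o, and p_e are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
\mathrm{Se}  &= \frac{TP}{TP + FN}, &
\mathrm{Spe} &= \frac{TN}{TN + FP}, \\
\mathrm{PPV} &= \frac{TP}{TP + FP}, &
\mathrm{NPV} &= \frac{TN}{TN + FN}, \\
\mathrm{Accuracy} &= p_o = \frac{TP + TN}{N}, &
\kappa &= \frac{p_o - p_e}{1 - p_e}, \\
p_e &= \frac{(TP + FN)(TP + FP) + (TN + FP)(TN + FN)}{N^2}.
\end{align*}
```

Here p_o is the observed agreement and p_e the agreement expected by chance from the marginal totals, so that kappa is 0 when p_o = p_e and 1 when p_o = 1.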

Index Cut-Off Determination and Analytical Performance of ICARE Algorithm Performed on Routine Cohort.
FI was calculated for the 237 patients from the routine cohort. As shown in Figure 2(a), FI was significantly higher in ANA positive patients compared to ANA negative patients (mean value: 2.06 ± 1.18 versus 1.13 ± 0.06, P < 0.0001). To further test the ICARE algorithm performance in weakly positive samples (ANA titer of 100, corresponding to the visual cutoff), we compared samples with weakly positive ANA to samples with negative ANA. Interestingly, FI was significantly higher in patients with weakly positive ANA than in patients with negative ANA (mean value: 1.33 ± 0.11 versus 1.13 ± 0.06, P < 0.0001) (Figure 2(b)). Cut-off determination of FI was performed using ROC analysis (Figure 3(a)) and an accuracy curve (Figure 3(b)). The FI cut-off value was set at 1.246, and the area under the curve (AUC) was 0.991, attesting to the excellent performance of the algorithm.
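One way to picture the cut-off determination is the sketch below, which scans candidate cut-offs, keeps the one with the highest accuracy, and computes the AUC as the fraction of positive/negative pairs ranked correctly. It is an illustration only: the study used R, the FI values are made up, the rule "FI at or above the cut-off is positive" is an assumption, and the function names are hypothetical.

```python
def accuracy_at(cutoff, fi_values, is_positive):
    """Fraction of samples classified correctly when FI >= cutoff means positive."""
    correct = sum((fi >= cutoff) == pos for fi, pos in zip(fi_values, is_positive))
    return correct / len(fi_values)


def best_cutoff(fi_values, is_positive):
    """Scan midpoints between consecutive distinct FI values (accuracy curve)."""
    distinct = sorted(set(fi_values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(candidates, key=lambda c: accuracy_at(c, fi_values, is_positive))


def auc(fi_values, is_positive):
    """Area under the ROC curve as the probability that a positive sample
    has a higher FI than a negative one (ties count one half)."""
    pos = [fi for fi, p in zip(fi_values, is_positive) if p]
    neg = [fi for fi, p in zip(fi_values, is_positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical FI values with visual IIF reading as the reference label.
fi = [1.05, 1.10, 1.15, 1.20, 1.30, 1.35, 1.60, 2.40]
visual_positive = [False, False, False, False, True, True, True, True]

cutoff = best_cutoff(fi, visual_positive)
print(cutoff, accuracy_at(cutoff, fi, visual_positive), auc(fi, visual_positive))
```

On real data several cut-offs may reach the same accuracy; this sketch simply returns the lowest of them, which is a design choice rather than something stated in the text.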
For the whole ANA IIF positive group (including weakly positive samples), sensitivity and specificity for positive/negative discrimination were 95% and 98%, respectively. The concordance between visual and algorithmic evaluation was also excellent, with a Cohen's kappa of 0.923 (Table 2). For weakly positive samples only, ICARE algorithm performance was also very good, with 86% sensitivity, 98% specificity, and a kappa coefficient of 0.86.

Result Comparison between ICARE Algorithm and Visual ANA IIF.
Concordant results between ICARE algorithm and visual evaluation of ANA by IIF were obtained for 228/237 routine samples (96%) (Table 3). The 9 remaining samples showed discrepant results: 2 were classified as weakly positive by the ICARE algorithm and negative by visual examination (false positive) and 7 were classified as negative by the ICARE algorithm but visually recognized as weakly positive by the expert (false negative). None of the false negative samples were associated with positive extractable nuclear antigen (ENA) or anti-dsDNA antibodies, and among them, 3 were drawn in an infectious context, 1 was from a 75-year-old patient, and 1 was from a patient treated for psoriatic arthritis with previously negative ANA. Importantly, no false negative was observed for samples with an ANA IIF titer ≥ 200.
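The reported performance figures can be checked from the counts given in the text. The 2 × 2 table below is derived rather than quoted: with 134 visually positive and 103 visually negative sera, 7 false negatives and 2 false positives imply 127 true positives and 101 true negatives (127 + 101 = 228 concordant results). The function name `screening_metrics` is hypothetical.

```python
def screening_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV, NPV, accuracy, and Cohen's kappa
    from a 2x2 table, with visual IIF reading as the reference."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n  # observed agreement
    # Agreement expected by chance, from the row and column totals.
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        "kappa": (accuracy - p_expected) / (1 - p_expected),
    }


# Counts derived from the routine cohort as explained above.
metrics = screening_metrics(tp=127, fn=7, fp=2, tn=101)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# kappa rounds to 0.923 and accuracy to 0.962, matching the reported values
```

Sensitivity (127/134) and specificity (101/103) round to the 95% and 98% reported for the routine cohort.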

Quantification of Fluorescence Index as an Alternative to ANA Titer.
To assess the usefulness of FI as a quantitative alternative to antibody titer, we first investigated the effect of sample dilution on FI values. Twenty ANA positive samples with 5 different staining patterns (speckled, homogeneous, centromeric, nucleolar, and nuclear dots) were diluted from 1 : 100 to 1 : 800. FI was evaluated for each dilution of a given sample (Figure 4). For all samples tested, FI decreased as the dilution factor increased, whatever the staining pattern. To confirm the relationship between FI values and antibody levels, we then analyzed FI as a function of titer in 87 ANA positive speckled samples (Figure 5). A significant correlation between FI and ANA titers was found (Spearman's ρ = 0.78; P < 0.0001).
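Because titers take only a few discrete values (100, 200, 400, and so on), many samples share the same titer, and a rank correlation has to handle ties. The sketch below computes Spearman's coefficient as the Pearson correlation of average ranks. It is an illustration with made-up FI/titer pairs, not the study data or the R code used by the authors, and the function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical FI values and visually assigned titers (not study data).
fi = [1.3, 1.4, 1.9, 1.7, 2.6, 3.4]
titer = [100, 100, 200, 400, 400, 800]
print(spearman_rho(fi, titer))
```

A coefficient close to 1 means FI increases almost monotonically with titer, which is the property that would let FI stand in for serial dilution.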

Clinical Validation on Patients with SLE.
To evaluate the clinical performance of the ICARE algorithm in connective tissue diseases, patients from the SLE cohort were tested. In agreement with our previous results, a significant correlation was found between FI and ANA titer (Spearman's ρ = 0.8; P < 0.0001) (Table 4). Interestingly, a significant correlation was also observed between FI and anti-dsDNA antibody levels in this cohort (Spearman's ρ = 0.47; P < 0.01).

Discussion
In this study we propose an original approach for automation of ANA IIF based on the calculation of a fluorescence index to discriminate positive and negative samples in a reproducible and nonobserver-biased way. We demonstrate excellent analytical performance of ICARE algorithm in comparison to the gold standard IIF visual method. Moreover, we show that FI could be used as a quantitative value to evaluate ANA titers. Last, we show that ICARE has potential interest in the monitoring of ANA in SLE patients.
Our approach is based on a quantitative strategy that mimics the routine analysis of ANA IIF. In routine practice, the first step of this analysis consists of positive/negative screening, which allows rapid reporting of the 60-70% of results that are negative and require no further investigation. The second step is pattern recognition, which is under development.
In this study, for the screening of 237 samples, ICARE reached a sensitivity of 95% and a specificity of 98% and showed excellent concordance with the visual method (accuracy: 96.2%, kappa = 0.923). To analyze ICARE performance more closely, we examined the ANA IIF subgroup with weak positivity at the visual cutoff (titer = 100). In this more demanding setting, performance of ICARE was also very good, with 86% sensitivity, 98% specificity, and an excellent coefficient of concordance (kappa = 0.86). This performance suggests that ICARE could replace the routine screening step with an automated approach. Several systems are commercially available for automated analysis of ANA by IIF: Aklides (Medipan, Berlin, Germany), G-Sight (Menarini, Florence, Italy), and EuroPattern (Euroimmun, Lübeck, Germany). In routine activity, good performance for ANA screening is reported. However, performance was not specifically evaluated in low positive ANA samples, which could bias the reported results in favor of strongly positive samples (high endpoint titers). Additionally, for the Aklides system, Egerer et al. found a screening sensitivity of 94% for the whole population studied [6], while on the same system, Melegari et al. [7] found a sensitivity of only 72% and suggested reassessing the cut-off for the detection of weakly positive samples. In the literature, the percentage of concordance between visual IIF and automated measures varied from 86% to 99%. With 96% of concordant results, our method is thus among the best performing. In our study, the only discrepant results (3.8%) were at the visual cut-off titer. The majority (7/9) were found weakly positive by the visual method. It is well known in laboratory practice that IIF visual reading becomes highly subjective and variable between observers when fluorescence intensity is around the cut-off titer.
Moreover, none of the 7 samples were associated with positive extractable nuclear antigen (ENA) or anti-dsDNA antibodies, and other clinical settings than autoimmune disease may explain these visual low levels of ANA in some of them. Low titers of ANA may indeed be present in healthy aged subjects and patients with infections or with cancer [14]. This suggests higher performance of ICARE compared to visual methods and promotes an automated evaluation of ANA screening.
Only one system in the literature presents an "index," but it is statistical, not quantitative, and is defined as a probability index. Indeed, for screening purposes, the G-Sight system provides a probability of positivity based on statistics from a set of previous training samples [10]. The fact that the ICARE method for automated evaluation of IIF ANA is based on a quantitative evaluation opens attractive perspectives. We showed a significant correlation between the fluorescence index and ANA titers of the samples, suggesting that FI reflects the antibody level. Titer prediction with an automated system could improve cost efficiency by removing the need for serial dilution and speeding up the reporting of results. This quantitative index could give a comparative scale between laboratories, allowing, in the future, a possible standardization of methods. Moreover, we validated this quantitative approach in a population of SLE patients. Interestingly, we also found a significant correlation between FI and anti-dsDNA antibody levels, which suggests a possible interest of FI in SLE disease activity monitoring.
In conclusion, the automated discrimination between positive and negative results represents a major step for automated evaluation of antinuclear autoantibodies by indirect immunofluorescence. Although ICARE algorithm should be tested in a multicenter analysis, it already presents several benefits such as the detection of weakly positive samples and a quantitative fluorescence index determination.