Reproducibility of Serologic Assays for Influenza Virus A (H5N1)

Results for clade 1 viruses were more consistent among laboratories when a standard antibody was used.

Influenza viruses agglutinate erythrocytes by binding to cell surface sialic acid. Agglutination may be blocked by strain-specific antibody detectable in hemagglutination-inhibition (HI) tests (1). Because serum HI titers correlate with protection (2), they are used to evaluate immunogenicity of influenza vaccines (3)(4)(5). However, conventional HI is generally insensitive for the detection of antibody to avian strains (6,7). Alternative serologic assays, including neutralization and HI with horse erythrocytes (hHI), are used to evaluate pandemic vaccines (7)(8)(9). HI sensitivity for avian influenza increases when erythrocytes that express sialic acid containing α2,3-galactose linkages are used; these erythrocytes are preferentially recognized by avian hemagglutinin (8,9). Virus neutralization can be developed for any influenza subtype, although use of live virus may require heightened biocontainment.
Variability of influenza serologic assay results is partly attributed to differences in protocols and expression of endpoints (10,11). Assay variability limits comparison of candidate influenza virus subtype H5N1 vaccines in different clinical trials, posing challenges for licensure, particularly if specific seroprotective titers are required as endpoints (3)(4)(5). The use of bioassay standards to improve interlaboratory agreement is well recognized (12,13). However, the antigenic diversity of subtype H5N1 viruses (14) may pose challenges in maintaining relevant strain-specific antibody standards. We assessed the reproducibility of neutralization and hHI tests for influenza virus A (H5N1) and evaluated the suitability of a standard (freeze-dried plasma pool, obtained from persons vaccinated with clade 1 subtype H5N1, called 07/150) for detection of antibody.

Serum Samples
We used 14 serum samples (coded A-N) from persons who had received nonadjuvanted or adjuvanted split-product vaccine derived from reassortant clade 1 virus (A/

Standard 07/150 contained freeze-dried plasma from 9 persons who had received inactivated whole-virus A/Vietnam/1194/NIBRG-14 vaccine. Four donations (total volume 3 L) were obtained from Omninvest, Hungary (vaccine contained aluminum phosphate), and 5 donations (total volume 2 L) were obtained from Sinovac, People's Republic of China (vaccine contained aluminum hydroxide). Persons gave informed consent after studies had received approval by appropriate ethics committees. Donations were negative for antibodies to HIV-1, HIV-2, hepatitis B surface antigen, and hepatitis C RNA. Plasma was pooled and freeze dried at the National Institute for Biologic Standards and Controls, according to standard procedures (15) to produce 1-mg ampoules and stored at -20°C. Stability studies found no significant change in titers after 8 months at -20°C, +4°C, or +20°C when compared with samples stored at -70°C.

Virus Reagents
Reassortant subtype H5N1 influenza viruses were prepared by reverse genetics from wild-type viruses, amplified in 10-day-old embryonated hens' eggs, and stored at -80°C. Each virus passed internationally approved safety testing (16).

Study Design
Fifteen laboratories from 9 countries agreed to participate and were assigned a code from 1 to 15. One additional laboratory returned titers from 1 neutralization and 1 pseudotype assay and was excluded from analysis.
The participating laboratories were sent reagents on solid CO2, asked to store serum at -20°C and viruses at -70°C, and instructed to reconstitute 07/150 with 1 mL distilled water and to test it and the serum for antibodies to each antigen, using hHI and neutralization, on at least 3 separate occasions. Suggested protocols were supplied, but participating laboratories could use in-house assays.

Statistical Analyses
Neutralization and hHI data consisted of replicate absolute titers, expressed as the reciprocal of serum dilution, and represented the last dilution giving a positive response from a doubling-dilution series. If the initial dilution did not give a positive response, the titer was recorded as less than the reciprocal initial dilution, e.g., <10 if the starting dilution was 1:10. Serum was interpreted as negative if no titer was detected and positive if any titer was detected. For calculation, negative titers were assigned the value of half the minimum detectable titer, and titers greater than the final dilution were assigned a value 2× the largest titer. These values represent the hypothetical adjacent dilution steps in the doubling-dilution series. This convention enables comparison of overall mean titers among groups on a consistent basis.
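The coding convention described above can be sketched in Python. This is a minimal illustration and not the study's own software; the function name `numeric_titer` and the string format assumed for reported titers (for example `"<10"` or `">1280"`) are hypothetical.

```python
def numeric_titer(reported: str) -> float:
    """Convert a reported titer into the numeric value used for calculation.

    Assumed input format (not specified in the source):
      "<10"   -> no positive response at the initial 1:10 dilution
      ">1280" -> still positive at the final 1:1280 dilution
      "80"    -> ordinary endpoint titer
    """
    text = reported.strip()
    if text.startswith("<"):
        # Negative: half the minimum detectable titer.
        return float(text[1:]) / 2
    if text.startswith(">"):
        # Beyond the series: twice the largest titer.
        return float(text[1:]) * 2
    return float(text)


# Hypothetical replicate results for one serum sample in one laboratory.
replicates = ["<10", "20", ">1280"]
coded = [numeric_titer(r) for r in replicates]
```

Both substituted values sit one doubling step outside the tested range, which is why they can be averaged on a log scale together with ordinary endpoints.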
We calculated the geometric mean titer (GMT) for each serum, virus, and assay combination. Overall titers were calculated as the GMT of the individual laboratory means. Interlaboratory variation was expressed as percentage geometric coefficient of variation (%GCV) between the individual laboratory GMTs. The distribution of hHI or neutralization titers does not represent a continuous variable, and the results from using different viruses within laboratories are not independent. Thus, use of parametric modeling techniques, such as analysis of variance, to characterize intra- and interlaboratory variability was precluded.
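These summary statistics can be sketched as follows. The text does not state the formula used for %GCV; the version below, 100 × (10^s − 1) with s the standard deviation of the log10 laboratory GMTs, is one commonly used definition and is an assumption here. The laboratory names and titers are invented for illustration.

```python
import math
import statistics


def gmt(titers):
    """Geometric mean titer: antilog of the mean of the log titers."""
    return math.exp(statistics.fmean([math.log(t) for t in titers]))


def percent_gcv(lab_gmts):
    """Percentage geometric coefficient of variation between laboratory GMTs.

    Assumed definition: 100 * (10**s - 1), where s is the sample standard
    deviation of the log10-transformed laboratory GMTs.
    """
    s = statistics.stdev([math.log10(g) for g in lab_gmts])
    return (10 ** s - 1) * 100


# Hypothetical replicate titers for one serum, virus, and assay combination.
lab_replicates = {
    "lab1": [40, 80, 40],
    "lab2": [160, 160, 320],
    "lab3": [20, 40, 40],
}
lab_gmts = [gmt(values) for values in lab_replicates.values()]
overall_titer = gmt(lab_gmts)          # GMT of the individual laboratory GMTs
between_lab_gcv = percent_gcv(lab_gmts)
```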
To assess intralaboratory variation, we calculated the percentage of endpoints of replicate tests for identical serum samples A and L that differed >2-fold or >4-fold for each antigen and assay in each laboratory. We also compared the percentage of replicate tests returned for all serum samples and postvaccination samples that differed by >2-fold or >4-fold for each antigen and assay.
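One plausible reading of this intralaboratory measure is sketched below: for each set of replicate endpoints, take the ratio of the highest to the lowest titer and report the percentage of sets whose ratio exceeds the threshold. The source does not spell out the exact calculation, so the function `percent_exceeding` and the example data are assumptions.

```python
def percent_exceeding(replicate_sets, fold):
    """Percentage of replicate sets whose max/min titer ratio exceeds `fold`."""
    ratios = [max(titers) / min(titers) for titers in replicate_sets]
    return 100 * sum(ratio > fold for ratio in ratios) / len(ratios)


# Hypothetical replicate titers for four samples in one laboratory.
replicate_sets = [
    [40, 80, 40],    # 2-fold spread
    [20, 160, 40],   # 8-fold spread
    [10, 40, 20],    # 4-fold spread
    [80, 80, 80],    # identical replicates
]
over_2_fold = percent_exceeding(replicate_sets, 2)
over_4_fold = percent_exceeding(replicate_sets, 4)
```

Because the thresholds are strict (">2-fold"), a replicate set that spans exactly one doubling dilution is not counted as discrepant.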
To assess interlaboratory variation, we compared differences between hHI and neutralization GMTs for 07/150 by different laboratories by using a paired nonparametric Wilcoxon signed-rank test for each antigen separately. For each laboratory, the difference in GMT between hHI and neutralization for NIBRG-14 was calculated, and these differences were compared with zero by using the Wilcoxon signed-rank test. Similarly, the results for hHI with each antigen were compared, taking the laboratory differences between the hHI GMT for NIBRG-14 and NIBRG-23 and the differences between NIBRG-14 and IBCDC-RG5 and comparing these differences with zero. The same was done for neutralization assays. We also compared differences among overall (for all laboratories) mean GMT for all serum samples by using a paired nonparametric Wilcoxon signed-rank test for each antigen separately; e.g., for NIBRG-14, the difference between the overall mean GMT for hHI and neutralization was calculated for each sample, and these differences were compared with zero. The nonparametric tests use the ranks of observed titers to calculate the significance of differences among groups and are unaffected by the value chosen to represent titers below the initial dilution or greater than the highest dilution used in the individual assays.
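The paired comparison can be illustrated with a small, dependency-free sketch of the Wilcoxon signed-rank test. It computes an exact two-sided p-value by enumerating every possible pattern of signs, which is practical for the 15 or fewer laboratories in a study of this size. The source does not say which software or which variant (exact or normal approximation) was used, so this is an assumption, and the laboratory GMTs below are invented.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def signed_rank_exact(first, second):
    """Return (W+, exact two-sided p) for paired values; zero differences are dropped."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2                      # expected W+ under the null
    observed = abs(w_plus - centre)
    # Count sign patterns whose W+ lies at least as far from the centre.
    as_extreme = sum(
        abs(sum(r for r, keep in zip(ranks, signs) if keep) - centre) >= observed - 1e-12
        for signs in product((0, 1), repeat=len(ranks))
    )
    return w_plus, as_extreme / 2 ** len(ranks)


# Hypothetical GMTs for 07/150 in six laboratories, one antigen.
neutralization = [160, 320, 80, 640, 320, 160]
hhi = [40, 80, 40, 160, 160, 80]
w_plus, p_value = signed_rank_exact(neutralization, hhi)
```

In routine work a maintained statistics package would normally be preferred; the enumeration here only serves to make the logic of the test visible.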
To assess a standard's ability to improve interlaboratory agreement, we expressed titers relative to 07/150 by taking the ratio of the GMT for a sample to the GMT for 07/150 and multiplying it by an assigned value for 07/150. The assigned value was the overall GMT by hHI and neutralization. The effect on interlaboratory agreement and %GCV is independent of the value chosen.
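A short sketch may help show why the assigned value does not affect %GCV. All numbers are invented, the helper names are hypothetical, and the %GCV formula, 100 × (10^s − 1), is an assumed definition because the source does not give one.

```python
import math
import statistics


def percent_gcv(values):
    """Assumed %GCV definition: 100 * (10**s - 1), s = SD of log10 values."""
    s = statistics.stdev([math.log10(v) for v in values])
    return (10 ** s - 1) * 100


def relative_titers(sample_gmts, standard_gmts, assigned_value):
    """Per-laboratory sample GMT divided by that laboratory's GMT for the
    standard, multiplied by the value assigned to the standard."""
    return [s / r * assigned_value for s, r in zip(sample_gmts, standard_gmts)]


# Hypothetical GMTs from four laboratories whose assays differ mainly by a
# laboratory-specific multiplicative offset.
sample_gmts = [40.0, 80.0, 160.0, 320.0]
standard_gmts = [70.0, 140.0, 280.0, 280.0]

absolute_gcv = percent_gcv(sample_gmts)
relative_140 = relative_titers(sample_gmts, standard_gmts, 140.0)
relative_1000 = relative_titers(sample_gmts, standard_gmts, 1000.0)
gcv_140 = percent_gcv(relative_140)
gcv_1000 = percent_gcv(relative_1000)
```

Multiplying every laboratory's ratio by the same constant shifts all log values equally, so their standard deviation, and therefore %GCV, is unchanged. Any reduction in %GCV comes from dividing out each laboratory's own reading of the standard.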
To evaluate improvement in interlaboratory agreement from expressing titers relative to 07/150 (or sheep antiserum), we calculated %GCV between laboratory GMTs, both absolute and relative, for each sample. We then calculated the difference between the %GCV of the laboratory GMT of absolute titers and the %GCV of the laboratory GMT of relative titers. Using the Wilcoxon signed-rank test for each antigen separately, we compared these differences with zero.

Assays
All participating laboratories returned at least 3 replicates by both assays, except laboratory 11, which did not perform hHI. Negative serum M was excluded because all titers were negative, except in laboratory 3, which reported 1 positive (titer 45) and 2 negative neutralization titers.

Reproducibility of Relative Titers: 07/150 or Sheep Serum as Standard
To evaluate the ability of 07/150 to improve interlaboratory agreement, GMTs were expressed relative to 07/150 for each sample (Table 2) and then summarized for all samples (Table 3). For all serum, interlaboratory reproducibility improved significantly for NIBRG-14; the median %GCV for hHI decreased from 125% to 61% (p = 0.001) and for neutralization from 183% to 81% (p = 0.002, Wilcoxon signed-rank test). However, for clade 2 viruses, interlaboratory variation did not change significantly. For sheep antiserum, the interlaboratory variability was increased because some laboratories reported negative hHI titers, resulting in high %GCV when test serum samples were expressed relative to them (Table 3). However, when these laboratories were excluded from analysis, the interlaboratory variation for NIBRG-14 by hHI became comparable to that found for 07/150. Laboratory 5 reported negative hHI titers for serum P; when that laboratory was excluded from analysis, the range of %GCV by hHI improved from 689%-796% to 51%-71%. Laboratories 5, 6, and 12 reported negative hHI titers for serum O; when they were excluded, the range of %GCV improved from 306%-442% to 39%-113%. When neutralization titers were expressed relative to serum O, interlaboratory variation to NIBRG-14 was reduced, in contrast with serum P, for which interlaboratory variation by hHI or neutralization did not improve for any antigen, even when laboratory 5, which failed to detect antibody in this sample, was excluded. Because a serum HI titer ≈40 is considered seroprotective (2), the establishment of a consistent equivalence factor between an hHI titer of 40 and neutralization would be useful. However, the relationship of hHI and neutralization is dependent on the virus-serum-laboratory combination and cannot be generalized.
Equivalence factors display large differences of 0.1-40.3 based on absolute titers and 0.3-6.3 based on titers relative to 07/150 for NIBRG-14 (online Appendix Table 1, available from www.cdc.gov/EID/content/15/8/1250-appT1.htm).

Assay Operating Protocols
Thirteen laboratories supplied hHI protocols. Although similar (online Appendix Table 2, available from www.cdc.gov/EID/content/15/8/1250-appT2.htm), they differed in some respects: pretest serum hemabsorption, erythrocyte suspension concentration (<1% vol/vol or >1% vol/vol), and time and temperature of erythrocyte-settling period (60 or >120 min, 4°C or room temperature). Although no relationship between protocol and intralaboratory reproducibility was found, laboratories that used lower erythrocyte concentrations or read plates at 4°C tended to report higher titers. Laboratories that performed pretest hemabsorption tended to report lower titers.
Thirteen laboratories supplied neutralization protocols (online Appendix Table 3, available from www.cdc.gov/EID/content/15/8/1250-appT3.htm) that were grouped into 3 broad methods: use of cell suspension for virus infection with short incubation time to endpoint (<26 hours), use of cell suspension with long incubation (>3 days), and use of cell monolayer for infection with long incubation (>3 days). Although no parameters were clearly associated with reproducibility, laboratories that used monolayers tended to report lower titers than those that used cell suspensions, and those that used longer incubation times had more interlaboratory variation by more frequently reporting titers at either end of the range (i.e., highest or lowest) than laboratories that used shorter times. Expression of initial serum dilution varied among laboratories as dilution of test serum was calculated either before or after the addition of virus.

Discussion
Having effective vaccines against influenza virus A (H5N1) is a public health priority. However, interlaboratory assay variation limits comparison of vaccine strategies without direct comparative studies. We compared the reproducibility of hHI and neutralization against a candidate standard. Overall, both assays were consistent, although neutralization displayed more intralaboratory variability than did hHI; 3 of 15 laboratories reported >2-fold differences in >25% of identical replicates. Titers determined by neutralization were higher and had a greater range than those determined by hHI, which suggests that neutralization may be more sensitive, particularly with low-titered serum. However, for some prevaccination serum, e.g., sample N, 6 (40%) laboratories reported neutralization titers of 20-160 but negative hHI titers, which suggests nonspecific reactivity or that neutralization detects functionally different antibodies than HI. This finding is consistent with findings of seroprevalence surveys in which titers to influenza virus subtype H5N1 may be detected by neutralization but not HI or Western blot among some persons with no exposure to subtype H5N1 (7). Sample K was from a person who had no known exposure but had detectable antibodies against H5. Most (93%) laboratories detected anti-H5 reactivity to NIBRG-14 by neutralization in this sample, but fewer (21%) detected antibodies to IBCDC-RG5. Studies suggest that antibodies against subtypes H1N1 and H3N2 detected by neutralization may be more strain specific than those detected by HI (10,17); however, we did not observe this difference.
Although HI is straightforward, most laboratories preferred their own assays. Variable parameters that may influence hHI include pretest serum hemabsorption (lowers titers) and erythrocyte suspension (higher concentration lowers titers). Because no common neutralization protocols exist, laboratories have developed their own protocols, which creates potential for variability. Because operator inexperience may have influenced reproducibility of assays for subtype H3N2 (10), laboratories were selected for expertise in serologic testing for H5. Although most used microneutralization based on an assay described by the World Health Organization (18), protocols differed by starting dilution of serum; preparation of cells; and virus inoculation, incubation, and endpoint estimation. Laboratories that performed assays with virus infection of cell monolayers generally reported lower titers than those that used suspensions. Assays with long incubation times and non-ELISA endpoints (e.g., cytopathic activity) were associated with greater interlaboratory variation than ELISAs with shorter incubation times. A biostandard should reduce variation associated with assay differences because standardization of protocols may be limited by local availability of reagents. Expression of the initial serum dilution, which clearly influences absolute titers, should be standardized. Although HI titers are typically expressed as the serum starting dilution before any addition of virus, calculation of starting dilutions for neutralization varies among laboratories. We propose that the calculated starting dilution for seasonal and avian influenza neutralization be expressed as serum dilution before the addition of virus (e.g., 5 µL serum in 45 µL diluent plus 50 µL virus solution is considered as 1:10) as it is with HI.
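The effect of the two conventions on the reported starting dilution can be shown with a few lines of arithmetic. The function name `starting_dilution` is hypothetical; the volumes are those of the example in the text.

```python
def starting_dilution(serum_ul, diluent_ul, virus_ul=0.0):
    """Reciprocal serum dilution: total volume divided by serum volume.

    Leave `virus_ul` at 0 to express the dilution before virus is added.
    """
    return (serum_ul + diluent_ul + virus_ul) / serum_ul


before_virus = starting_dilution(5, 45)       # 1:10, the proposed convention
after_virus = starting_dilution(5, 45, 50)    # 1:20 if virus volume is counted
```

With equal volumes of diluted serum and virus, counting the virus doubles every reported titer, a 2-fold difference between laboratories that arises from bookkeeping alone.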
Because the correlation between serum antibodies detected by hHI and protective efficacy against influenza subtype H5N1 is unclear, by default, immunogenicity criteria established for seasonal vaccines (3)(4)(5) are used for subtype H5N1 vaccines despite the lack of established immune correlates for neutralizing antibodies. Although hHI and neutralization titers correlate closely (9,19), this and other studies (10) find that the relationship depends on individual laboratory-antigen-serum combinations and cannot be generalized.
A potential limitation to this study is that 07/150 was derived from recipients of adjuvanted whole-virus vaccine but test serum samples were obtained from persons who received plain or adjuvanted split-product vaccines. Interlaboratory agreement improved when NIBRG-14, but not heterologous antigens, was used, which suggests that 07/150 is clade specific. Although no association between vaccine formulation and %GCV was noted in test serum, the quality and cross-reactivity of antibodies induced by whole-virus vaccine may differ from quality and cross-reactivity induced by alternative formulations including adjuvanted, subunit, or recombinant vaccines. To reduce potential variation in antibody isotypes, we obtained day-42 postvaccination samples when possible; however, the avidity of antibody to hemagglutinin or presence of antibody against denatured viral proteins after whole-virus vaccination (20) could influence the effectiveness of 07/150 against test serum. Differences among vaccine formulations should be examined, if possible, during evaluation of clade 2 standards; however, because production requires substantial donations of plasma, providing separate standards for specific vaccine formulations is impractical.
The overall reproducibility of sheep antiserum raised against clade 1 H5 hemagglutinin was poor; reported titers ranged widely. Because some laboratories failed to detect antibodies in sheep antiserum, the expression of relative titers did not reduce %GCV. When these laboratories were excluded from analysis, sheep serum improved interlaboratory agreement to NIBRG-14 by hHI but not by neutralization or for clade 2 antigens. This finding suggests that if assays can detect antibodies, sheep antiserum is a useful internal control; however, its role as an international standard is limited if some hHI assays appear unable to detect antibody titers. The reason for this discrepancy is unexplained because no clear association with assay method has been found. The antibody repertoire induced by cleaved hemagglutinin in Freund adjuvant in sheep antiserum will differ from that induced in humans by purified antigens. An alternative animal source and/or production method may be more reliable.
The World Health Organization Expert Committee on Biologic Standards has accepted 07/150 as an antibody standard for clade 1 H5 hemagglutinin and has assigned an arbitrary value of 1,000 IU. The assigned value of 1,000 IU is equivalent to an hHI titer of 140 (i.e., GMT to NIBRG-14 found across study laboratories), giving a seroprotective titer for 07/150 of ≈285 IU. For neutralization, a standard value of 1,000 IU for 07/150 would be equivalent to a neutralization GMT of 518. Because the relationship between hHI and neutralization is inconsistent and immune correlates are lacking, assigning a seroprotective level to neutralization is not possible. Useful information may be obtained by retesting serum from completed trials of clade 1 subtype H5N1 vaccine candidates against 07/150. Regulators will be required to discuss the interpretation of a standard before vaccine licensure for clinical use.
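The arithmetic behind these unit conversions is a simple proportion, sketched below. The function name `titer_to_iu` is hypothetical; the GMTs of 140 (hHI) and 518 (neutralization) and the assigned value of 1,000 IU are taken from the text.

```python
def titer_to_iu(titer, standard_gmt, assigned_iu=1000.0):
    """Express a titer in IU, given the GMT obtained for 07/150 in the same
    assay and the IU value assigned to 07/150."""
    return titer / standard_gmt * assigned_iu


# An hHI titer of 40 against a study-wide hHI GMT of 140 for 07/150.
seroprotective_iu = titer_to_iu(40, 140)

# By construction, the standard itself maps to 1,000 IU in either assay.
hhi_standard_iu = titer_to_iu(140, 140)
neutralization_standard_iu = titer_to_iu(518, 518)
```

The first value works out to about 285.7 IU, which the text reports as ≈285 IU. In practice each laboratory would substitute its own GMT for 07/150 rather than the study-wide figure.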
For standardizing serologic assays that use different influenza (H5N1) clades, a reliable animal serum source would be most convenient, but failure of some laboratories to detect antibody in sheep antiserum limits its use. The production of clade-specific standards for subtype H5 viruses will require human plasma donations, which can be obtained only after initial clinical trials have been conducted. This requirement must be considered in future vaccine studies.