Response to comment on 'AIRE-deficient patients harbor unique high-affinity disease-ameliorating autoantibodies'

In 2016, we reported four substantial observations of APECED/APS1 patients, who are deficient in AIRE, a major regulator of central T cell tolerance (Meyer et al., 2016). Two of those observations have been challenged. Specifically, ‘private’ autoantibody reactivities shared by only a few patients but collectively targeting >1000 autoantigens have been attributed to false positives (Landegren, 2019). While acknowledging this risk, our study-design included follow-up validation, permitting us to adopt statistical approaches to also limit false negatives. Importantly, many such private specificities have now been validated by multiple, independent means including the autoantibodies’ molecular cloning and expression. Second, a significant correlation of antibody-mediated IFNα neutralization with an absence of disease in patients highly disposed to Type I diabetes has been challenged because of a claimed failure to replicate our findings (Landegren, 2019). However, flaws in design and implementation invalidate this challenge. Thus, our results present robust, insightful, independently validated depictions of APECED/APS1, that have spawned productive follow-up studies.


Introduction
In 2016, in a paper published in Cell, we made at least four substantial observations concerning APECED/APS1 syndrome patients who are defined by deficiency in the AIRE gene, a major regulator of central T cell tolerance: i) that such patients share autoantibodies to a small subset of proteins, including Type I IFNs, Interleukin-(IL)À17, and IL-22, that was previously reported (Kisand et al., 2010;Meager et al., 2006); ii) that, quite surprisingly, many such naturally-arising antibodies are conformation-specific and of extremely high affinity, potentially explaining their powerful neutralizing capacity; iii) that most APECED/APS1 patients additionally harbor 'private reactivates' collectively targeting very many autoantigens; and iv) that strong antibody-mediated neutralization of IFNa correlated significantly with an absence of Type I diabetes (T1D) in patients otherwise highly disposed to it (Meyer et al., 2016). These observations formed the basis for several substantial follow-up papers in peer-reviewed journals (Rodero et al., 2017a;Frémond et al., 2017;Fishman et al., 2017).
Our 2016 paper discussed the close alignment of our first substantial observation with contemporaneous work by Landegren and co-workers that likewise featured a protein microarray screen of patient versus control sera (Landegren et al., 2016). Our second substantial observation could not be compared because Landegren and co-workers did not investigate the properties of the autoantibodies that were identified. Nonetheless, Landegren and other co-workers now dispute our remaining two claims (Landegren et al., 2019). While we welcome open, constructive discourse about science, we are disappointed by this dispute because we believe it reflects simple but important differences between our approaches that could have been easily resolved, had Landegren and coworkers approached us directly. Those important differences are explained below. Based on biological significance, they are considered in reverse order to their consideration in Landegren et al. (2019) (hereafter referred to as the comment).

Results and conclusions
No association between neutralizing autoantibodies to interferons and Type 1 diabetes in APECED/APS1 The comment disputes our observed correlation of strongly neutralizing IFNa autoantibodies with reduced incidence of T1D, claiming to have essentially repeated our experiment, but finding no difference in the IFN neutralization capacity of sera from APECED/APS1 patients with or without T1D.
In fact, the comparison that is described in the comment of IFN neutralization in two APECED/ APS1 patient cohorts defined simply as with or without diabetes did not repeat our experiment. We did not claim that differential Type I IFN neutralization is the sole regulator of T1D incidence. Rather, we hypothesized that the significance of differential Type I IFN neutralization might relate to the differential disease development in patients uniformly displaying pathognomonic features of T1D. Thus, we compared IFNa neutralization in two patient sub-cohorts: one presenting with T1D and one not, but all of whom either harbored and/or had harbored disease-associated, anti-islet antibodies, for example anti-GAD65, anti-GAD67, that are widely-utilized clinical indicators of T1D-risk. Supporting our hypothesis, we observed a statistically significant correlation of low neutralization and T1D, consistent with several other experiments in which we established the capacity of APECED/ APS1 autoantibodies to ameliorate immunopathology.
It is also unfortunate that an imperfect study design was employed in the comment. As presented to us, the comparison of different sera in the comment was made at a single [high] concentration (10%), which is inappropriate because it will most often saturate the assay, thereby failing to appropriately discriminate low versus high neutralizing activities. To illustrate this point, we re-examined IFNa neutralization, using a dose-dependent, cell-based assay that measures IFN-stimulated release of alkaline phosphatase (AP), as we previously described (Meyer et al., 2016), but mimicked the comment in examining only the effects of 10-fold diluted sera. This masked any significant differences between GADA-seropositive patients with or without T1D ( Figure 1A). By contrast, when sera were serially diluted so as to imbue the assay with appropriate sensitivity (as described in the Meyer et al., 2016), the patients' broad dynamic range of IC50 values was revealed, with clear segregation of the patients with and without T1D ( Figure 1B and C). Indeed, among 13 patients without T1D, the serum of only one ('N'; Figure 1C) showed low IFNa neutralization, comparable to that of all the patients with T1D.
The authors of the comment employed a phospho-STAT1 induction assay (Gupta et al., 2016). This is an inherently less sensitive assay, but nonetheless when we adopted it in another new experiment, we obtained the same pattern of results as we obtained with the AP assay. Namely, at high concentrations, the sera of patients with and without T1D showed comparable activities, but at lower, sub-saturation concentrations [50-fold dilutions], the cohort without T1D showed significantly greater capacity to limit IFNa activity ( Figure 1D). Thus, because their measurements were insufficiently sensitive to discriminate low neutralizers from high neutralizers, we believe that the experimental design employed in the comment was not appropriate to compare IFN neutralization by the sera of patients with and without T1D: as such, the comment provides no experimental basis on which to dispute the fourth substantial observation of Meyer et al. (2016).
Finally, the observations of Meyer et al. (2016) are germane to an important clinical issue. Specifically, the delayed onset and relatively rare incidence (~15%) of T1D in APECED/APS1 patients is puzzling given that: insulin is a prototypic AIRE-regulated tissue-specific autoantigen; there is defective negative selection of b-cell antigen-specific T cells; pancreatic b-cells are notoriously vulnerable to autoimmune attack; and idiopathic T1D commonly occurs in children and adolescents (Perheentupa, 2006;Anderson et al., 2002;Wolff et al., 2014;Sabater et al., 2005). In this context, the observations of Meyer et al. (2016) suggest that IFN-neutralizing antibodies may delay T1D onset in APECED/APS1 patients and may prevent it completely in those with very high neutralizing titres. This is consistent with longitudinal assessment, albeit limited, reported by in Supplementary Figure 7 of Meyer et al. (2016). When combined with increasing numbers of studies implicating Type I IFN as pathogenic in patients at genetic risk to develop T1D (Ferreira et al., 2014;Kallionpää et al., 2014;Foulis et al., 1987), our data compel us to disagree with the assertion made in the comment that there is insufficient evidence to "embark on in-depth investigations of targeting Type 1 IFNs for the treatment or prevention of Type 1 diabetes." IFNα2 Neutralization (nonlinear dose-response curve fitting) No evidence for widespread autoantibody reactivity in APECED/APS1 patients

GADA + APS-1 Patients
The comment disputes our observations that individual APECED/APS1 patients harbor small numbers of 'private' specificities shared by few other patients, but collectively comprising a very large number of proteins. In fact, our observations conspicuously mirror a key clinical aspect of APECED/APS1, namely that each patient is highly individual in the type and range of symptoms; the rate and course of disease progression; and, to some extent, the time-of-onset. Hence, it makes biological sense that individual patients would harbor correspondingly diverse antibodies as causes and/or biomarkers of discrete clinical courses.
Nonetheless, the comment argues that the private specificities comprise stochastic, irreproducible signals reflecting the high risk of false positives inherent in the statistical methods that we employed to analyse our ProtoArray data.
Importantly, all statistical approaches need to reflect a study's goals. For example, clinical trials use one type of statistical method to minimize type one errors (false positives) that might misleadingly indicate drug efficacy, while employing different statistical methods to minimize type two errors (false negatives) that might obscure adverse event(s). The approach advocated in the comment (and in Landegren et al., 2016) parallels the former, scoring signals in patients by comparison to mean and standard deviation (SD) of controls, and then additionally adding a Fisher's exact test to exclude signals that did not confer statistical difference to the whole patient group. This is wellsuited to defining how APECED/APS1 patients differ as a group from healthy controls.
By contrast, Meyer et al. (2016) sought to characterize the nature of auto-reactivities in APECED/APS1 patients, including private reactivities that might mirror individual clinical presentations. The Fisher's exact test filter would exclude 'private targets' as outliers or false positives because they are insufficiently frequent to significantly influence the distribution of reactivities across the whole group (see below). Anticipating this, and knowing that no existing standard data-analysis method can unequivocally discriminate private specificities from false positives, we compared the signal to mean and SD of controls without the additional filter (Meyer et al., 2016).
Although, this is a standard, widely-used approach, we do not dispute that it can be confounded by unwarranted assumptions about the behavior of the control cohorts, coupled with an imbalance in the numbers of controls (21) and patients (81) that we examined (Meyer et al., 2016). Hence, false positives can and will arise. Nonetheless, we consciously employed this approach because the validation of real reactivities versus false positives was to be made by a spectrum of additional, independent, serological, biochemical, and biological methods that were employed in Meyer et al. Examples of validation are as follows. First, several private anti-cytokine reactivities were validated by ELISA and by LIPS (Luciferase Immunoprecipitation -an unrelated assay platform using independent sources of target proteins displayed in native conformation), and have since been validated independently (Meyer et al., 2016;Fishman et al., 2017;Sng et al., 2019). Furthermore, we molecularly cloned and fully characterized such autoantibodies, for example anti-IL32g (Meyer et al., 2016), anti-BAFF (unpublished data), and anti-IL20 (Meyer et al., 2016) detected in five, four, and two patients, respectively, but in none of the sampled controls.
Third, Fishman et al. (2017) applied very stringent criteria to the data of Meyer et al. (2016), including a further filtration of private reactivities into those shared by >3 patients. Still there were~1000 reactivities: 490 shared by only three patients; 245 shared by 4 but not five patients; 111 shared by 5 but not six patients; 116 shared by >6 patients. These reactivities individually and collectively displayed five conspicuous traits: (1) correlations with clinical phenotypes, for example pernicious anemia or vitiligo; (2) more reactivities in patients with more complex clinical phenotypes; (3) a correlation of the average number of reactivities per patient with the severity of the AIRE gene mutation; (4) reactivities assessed longitudinally over relatively short time-frames correlated more closely than those sampled over longer time-frames (e.g. 10 years); and (5) reactivities mostly increased with duration of disease (Fishman et al., 2017).
Fourth, the reactivities described by Meyer et al. (2016) were conspicuously enriched in geneproducts of two sub-classes: a) those expressed in lymphoid tissues and with no known connection to AIRE function, but which comprise some of the strongest reactivities (as agreed by Landegren et al., 2016 and the comment); b) diverse tissue-restricted antigens (TRAs), which were strikingly enriched in those expressed by AIRE-expressing medullary thymic epithelial cells (Fishman et al., 2017). Consistent with this, male antigens were also targeted in females Table 1. Testis-and cancer-associated non-cytokine targets screened by LIPS.   (Fishman et al., 2017), whereas non-CT-antigen members of the MAGE family that are expressed in all tissues were not observed as targets (Meyer et al., 2016;Fishman et al., 2017). False positives could not meet any of these four sets of criteria, let alone all of them. In sum, the potential for a signal to be a false positive does not establish that it is, particularly when its validity is attested to by multiple independent means. The comment ignores another pitfall of ProtoArrays, which is the under-estimation of reactivities to proteins that are not displayed well, as we and others have noted (Kärner et al., 2016;Schnack et al., 2008). Critically, ProtoArrays should serve as guides for subsequent experiments, as was the case for Meyer et al. (2016) and a number of later studies (Rodero et al., 2017b;Frémond et al., 2017;Fishman et al., 2017).
In this regard, we note that a co-author of the comment recently published a study (Sng et al., 2019) describing a loss of B cell tolerance in APECED patients that was associated with a broad spectrum of autoantigen reactivities, including several new non-cytoline specificities. This aligns with the depiction of APECED/APS1 patients provided by Meyer et al. (2016).
Conceding, nonetheless, that we may have exaggerated some patient reactivities, we applied a more conservative statistical approach to Meyer et al. (2016): namely we based z-scores on the mean of the controls and SD across all patients plus controls. SD will now be increased by positive reactivities in controls and/or patients, thereby reducing the risk of false positives. Interestingly, this approach identified reactivities overlapping 81% with our original study: again, these comprised broadly shared autoantigens and from~30 to~100 private specificities that collectively composed a substantial fraction of the proteome. Moreover, when this same statistical approach was applied to an additional study in which we used an earlier version of the ProtoArray (v5.0) to interrogate sera from 23 patients examined by Meyer et al. (2016) but with eight different healthy controls, the overlap across the two independent studies (and platforms) was substantial and highly significant (p<1e-06), far exceeding any overlap obtained from 100,000 random permutations of patients and controls.
We conclude that our published and ongoing studies (Meyer et al., 2016;Fishman et al., 2017) accurately depict the serological status of APECED/APS1 patients, viewed collectively and individually. While we acknowledge that the limited numbers of patients and appropriate controls make it difficult to reach a precise estimate of the numbers of private specificities, there is no basis for disputing the four central findings of Meyer et al. (2016), consistent with which those findings have formed a basis for rigorous follow-up work by us and by others (Rodero et al., 2017a;Frémond et al., 2017;Fishman et al., 2017;Sng et al., 2019;Rice et al., 2018;Llibre et al., 2018;Dhir et al., 2018;Rodero et al., 2017b) that will inform our understanding of APECED/APS1 and of autoimmune diseases more generally. Reporter cell assay

Materials and methods
The IC50 values of IFNa neutralization of serum samples were tested with the help of HEK-BlueTM IFN-a/b reporter cells (InvivoGen) that express alkaline phosphatase (AP) under the inducible ISG54 promoter after ISGF binding to the IFN-stimulated response elements in the promoter. The cells were grown in DMEM supplemented with heat inactivated 10% FBS and 30 g/ml blasticidin (Invivo-Gen) and 100 g/ml Zeocin (InvivoGen). Cells were stimulated with IFNa2a (12.5 U/ml, Miltenyi Biotech) that was preincubated for 2 hr with serial dilutions of recombinant antibodies or one fixed concentration (10%) of serum. QUANTI-Blue TM (InvivoGen) colorimetric enzyme assay was used to determine AP in the cell culture supernatants after 21 hr of incubation. OD was measured at 620 nm with Multiscan MCC/340 (Labsystems) ELISA reader and IC50 values were calculated from the doseresponse curves using GraphPad Prism eight software.

Phospho-STAT1 assay
Peripheral blood mononuclear cells (PBMC) from a healthy control were isolated with density gradient centrifugation and aliquoted by 500 000 cells to test tubes containing IFN-a2a (10 000 U/ml) pre-incubated with serum dilutions for 2 hr. Tubes with or without IFN alone served as positive and negative controls. After 15 min of stimulation of PBMCs at 37˚C, the cells were fixed immediately with Cytofix buffer, permeabilized with Perm Buffer III and stained with PE-conjugated antibody to phospho-STAT1 (Y701; all from BD Biosciences). Data were acquired with LSRFortessa (BD Biosciences) and analyzed with FCS Express (De Novo Software). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Data availability All data generated or analysed in this study are included in the manuscript and supporting files. Pro-toArray data have been deposited at Array Express (E-MTAB-5369).

Author contributions
The following previously published dataset was used: Author (