Pathogenic T Cells in Celiac Disease Change Phenotype on Gluten Challenge: Implications for T‐Cell‐Directed Therapies

Abstract
Gluten‐specific CD4+ T cells being drivers of celiac disease (CeD) are obvious targets for immunotherapy. Little is known about how cell markers harnessed for T‐cell‐directed therapy can change with time and upon activation in CeD and other autoimmune conditions. In‐depth characterization of gluten‐specific CD4+ T cells and CeD‐associated (CD38+ and CD103+) CD8+ and γδ+ T cells in blood of treated CeD patients undergoing a 3‐day gluten challenge is reported. The phenotypic profile of gluten‐specific cells changes profoundly with gluten exposure and the cells adopt the profile of gluten‐specific cells in untreated disease (CD147+, CD70+, programmed cell death protein 1 (PD‐1)+, inducible T‐cell costimulator (ICOS)+, CD28+, CD95+, CD38+, and CD161+), yet with some markers being unique for day 6 cells (C‐X‐C chemokine receptor type 6 (CXCR6), CD132, and CD147) and with integrin α4β7, C‐C motif chemokine receptor 9 (CCR9), and CXCR3 being expressed stably at baseline and day 6. Among gluten‐specific CD4+ T cells, 52% are CXCR5+ at baseline, perhaps indicative of germinal‐center reactions, while on day 6 all are CXCR5−. Strikingly, the phenotypic profile of gluten‐specific CD4+ T cells on day 6 largely overlaps with that of CeD‐associated (CD38+ and CD103+) CD8+ and γδ+ T cells. The antigen‐induced shift in phenotype of CD4+ T cells being shared with other disease‐associated T cells is relevant for development of T‐cell‐directed therapies.

Participants for flow cytometry and RNA-seq analysis
Table S2. Differentially expressed RNA-seq-derived genes expressed on the surface of Tetramer+ cells
Table S3. Mass cytometry staining panel for Tetramer+/− CD4+ T cells
Table S4. Participants for mass cytometry analysis Tetramer+ cells
Table S5. Untreated celiac disease participants for mass cytometry analysis
Table S6. Ten-fold cross-validation of the markers that define Tetramer+ cells best at d6 of gluten challenge
Table S7. Ten-fold cross-validation of the markers that define Tetramer+ cells best at d6 of gluten challenge and in untreated CeD
Table S8. Ten-fold cross-validation of the markers that define Tetramer+ cells best at baseline before gluten challenge

Gluten challenge and participants
We first included nine HLA-DQ2.5+ adult CeD patients for gluten challenge and RNA-seq (study design depicted in Figure 1A, participants in Table S1, Supporting Information). Material from three additional patients was omitted from assessment due to technical issues. All participants had been diagnosed according to the current guidelines [1] and were in clinical and serological remission, indicating that they were well treated on a gluten-free diet. [2] In one participant, IgG-anti-DGP was not completely normalized despite eight months of adherence to a gluten-free diet (Table S1, Supporting Information). These initial participants were challenged by ingestion of one gluten-containing cookie, containing 8 g of gluten, on each of three consecutive days, as described earlier. [3] For mass cytometry studies of Tetramer+/− CD4+ T cells, we included six additional individuals (Table S4, Supporting Information) and used residual blood samples from four of the participants recruited for RNA-seq. Seven individuals were recruited, but one was HLA-DQ2.2+ and was therefore included only in the studies of CD8+ and γδ+ T cells summarized in Figure 5. The seven additional participants recruited for mass cytometry were challenged with four slices of white bread daily, each containing approximately 2 g of gluten, three days in a row, as described earlier. [4] Gluten ingestion started on day 1, and blood samples were drawn at baseline and on day 6 (d6). Finally, we recruited four untreated CeD patients (Table S5, Supporting Information) to compare the phenotype of cells during gluten challenge and in treated CeD patients with the previously published phenotype in untreated CeD patients. [5] Healthy controls were not included in this study, as we have previously shown that HLA-DQ2.5+ individuals without CeD do not have gluten-specific CD4+ memory T cells detectable with Tetramers. [4,6] We measured serum IgA-anti-TG2 (QUANTA Lite® h-tTG IgA ELISA) and IgG-anti-DGP (QUANTA Lite Gliadin IgG II) at the Department of Medical Biochemistry, and typed all participants for HLA-DQA1 and HLA-DQB1 alleles (LABType SSO, ONE LAMBDA) at the Department of Immunology, Oslo University Hospital.

Patient-reported outcome measures
Symptoms were scored daily on a Visual Analogue Scale (VAS) for gastrointestinal symptoms over a period before, during and after the gluten challenge. Scores were obtained for pain, bloating, flatulence, nausea, stool consistency and overall symptoms. We also recorded a modified Gastrointestinal Symptom Rating Scale for patients with irritable bowel syndrome (GSRS-IBS) [7] on the last three days at baseline (BL) and on days 3 and 6 (Figure S1, Supporting Information).

Mass cytometry staining
For mass cytometry analysis of Tetramer+/− CD4+ T cells, we analyzed 10 paired baseline and d6 samples (Table S1 and S4, Supporting Information). We used a novel antibody panel (see data analysis), HLA-DQ2.5:gluten tetramers and our previously established staining protocol. [5,10] For barcoding, we used anti-CD45 coupled with 89Y, 113In, 106Pd or 110Pd. [11] Paired baseline and d6 samples were thawed and barcoded separately before they were merged and stained in one tube. After enrichment of HLA-DQ2.5:gluten tetramer-stained cells, we paired 1 million cells from the baseline and d6 samples of the already barcoded Tetramer-depleted PBMCs and stained these cells in a total of 100 µl CyTOF buffer with a modified mass cytometry staining panel (Table S9, Supporting Information). Although different mass cytometry staining panels were used for CD4+ T cells versus CD8+ and γδ+ T cells (Table S3 and S9, respectively), we used the same staining approach as previously explained. [5]

Data analysis
Transcript abundance was estimated from fastq files against cDNA for the human reference genome (GRCh38, Ensembl release 91) using the kallisto aligner [12] and aggregated at the gene level using tximport. [13] Differentially expressed (DE) genes were identified using DESeq2. [14] We adjusted for donor variance and used an adjusted p-value threshold of <0.01. Uncertain log fold change (FC) estimates were reduced using the lfcShrink function (DESeq2 package). [15] We mapped the list of DE genes against a cell surface protein atlas, [16] generating a list of cell-surface-expressed proteins on Tetramer+ cells (Table S2, Supporting Information), for which we identified 15 corresponding, commercially available antibodies. We then designed a mass cytometry panel that also contained 16 antibodies (including anti-CD45RA, anti-CD62L and anti-integrin β7) used to characterize gluten-specific CD4+ T cells in untreated CeD. [5] We used FlowJo version 10.6 (FlowJo, LLC) to analyze mass and flow cytometry data and GraphPad Prism 8 (GraphPad Software) for statistical analysis. Mass cytometry-derived pre-Tetramer-enriched and Tetramer-enriched samples were gated as previously shown, [5] and downsampled to a maximum of 100 and 10 000 cells, respectively, before visualization with UMAP and t-SNE and extraction of mean T-cell marker expression for visualization (Figure 2, Figure 3, Figure S4, Supporting Information). The median log2 FC for expressed T-cell markers was visualized as heatmaps in R with ggplot (Figure 2C-F) and plotted against the corresponding RNA-seq-derived median log2 FC in GraphPad Prism (Figure S5, Supporting Information). In parallel, the CD8+ and γδ+ T cells were downsampled to a maximum of 15 000 cells before visualization with UMAP (Figure 5D-E, 5H-I) and extraction of mean T-cell marker expression for each patient.
The median log2 FC for expressed markers on CD8+ T cells (CD103+/CD38+ T cells versus all CD8+ T cells) was plotted against that of γδ+ T cells and against that of Tetramer+ CD4+ T cells on day 6, respectively (Figure 5J, 5K).
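The downsampling and median log2 FC steps described above can be illustrated with a minimal Python sketch. This is not the authors' code (their analysis used FlowJo, R and GraphPad Prism). The function names, the fixed random seed, the pseudocount, and the assumption that the fold change is computed per patient from mean marker expression before taking the median across patients are all illustrative choices that the text does not specify.

```python
import math
import random
from statistics import median


def downsample(cells, max_cells, seed=0):
    """Return at most max_cells cells, sampled without replacement.

    Samples that are already small enough are returned unchanged.
    The seed is an assumption made here for reproducibility.
    """
    cells = list(cells)
    if len(cells) <= max_cells:
        return cells
    return random.Random(seed).sample(cells, max_cells)


def median_log2_fc(means_a, means_b, pseudocount=1e-6):
    """Median across patients of log2(mean_a / mean_b) for one marker.

    means_a and means_b hold one mean expression value per patient for
    the two populations being compared. The pseudocount (an assumption)
    guards against division by zero for unexpressed markers.
    """
    log_ratios = [
        math.log2((a + pseudocount) / (b + pseudocount))
        for a, b in zip(means_a, means_b)
    ]
    return median(log_ratios)


# Hypothetical per-patient mean expression of one marker in two populations.
population_a = [4.0, 8.0, 2.0]
population_b = [1.0, 2.0, 2.0]
print(round(median_log2_fc(population_a, population_b), 3))
```

Taking the median across patients rather than pooling all cells keeps a single patient with many cells from dominating the summary.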
For the ranking and correlation studies visualized in Figure 4, we preprocessed and read fcs files in R as previously described. [5] The fcs files were exported from FlowJo, pre-gated on CD4+, gut-homing Tetramer+ TEM cells and CD4+ Tetramer− cells, respectively, and downsampled to a maximum of 300 cells per sample. For Figure 4B, we used a standard Pearson correlation analysis on the already balanced, downsampled dataset to identify commonly co-expressed surface proteins in the Tetramer+ population, with Tetramer− cells as a backdrop.
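As an illustration of the pairwise correlation step, the sketch below computes Pearson's r between every pair of markers across a set of cells. It is written in plain Python rather than R and is not the authors' implementation. The data layout (one dictionary per cell), the marker names and the toy values are assumptions, and how the Tetramer+ and Tetramer− cells were combined for the correlation is not detailed in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant marker")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(cells, markers):
    """Pearson r for every marker pair.

    cells is a list of dicts mapping marker name -> expression value
    (a hypothetical layout chosen for this sketch).
    """
    columns = {m: [cell[m] for cell in cells] for m in markers}
    return {
        (a, b): pearson(columns[a], columns[b])
        for a in markers
        for b in markers
    }


# Toy example with invented expression values for three markers.
toy_cells = [
    {"CD38": 1.0, "CD70": 2.0, "CXCR5": 3.0},
    {"CD38": 2.0, "CD70": 4.1, "CXCR5": 2.0},
    {"CD38": 3.0, "CD70": 5.9, "CXCR5": 1.0},
]
matrix = correlation_matrix(toy_cells, ["CD38", "CD70", "CXCR5"])
print(matrix[("CD38", "CXCR5")])
```

Marker pairs with r close to 1 are candidates for co-expression; pairs close to -1 tend to be mutually exclusive.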
For the analysis in Figure 4C, 4D, we used the tidymodels framework in R to fit several different classification models, but discovered that a simple logistic regression model was fully able to differentiate the two cell populations. We then iteratively fitted logistic regression models on this dataset and used 10-fold cross-validation with three repeats at each step to evaluate model performance by ROC-AUC. At each step, the weakest predictor (the one with the highest p-value) was removed, until a single predictor was left.
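The stepwise procedure can be sketched as follows. This Python outline is not the authors' tidymodels code: the logistic regression fit is deliberately left as a caller-supplied function (`fit_and_score`, a hypothetical name) that returns a cross-validated ROC-AUC and a p-value per predictor. The fold assignment here is a plain shuffled split; whether the original analysis stratified folds by class is not stated.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC as the probability that a randomly chosen positive cell
    scores higher than a randomly chosen negative cell (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def repeated_kfold(n_samples, k=10, repeats=3, seed=0):
    """Yield (train_indices, test_indices) for every fold of every repeat."""
    rng = random.Random(seed)
    for _ in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        folds = [indices[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [
                i
                for fold_id, fold in enumerate(folds)
                if fold_id != held_out
                for i in fold
            ]
            yield train, test


def backward_elimination(predictors, fit_and_score):
    """Drop the predictor with the highest p-value until one is left.

    fit_and_score(predictors) is a hypothetical callback that fits the
    model on the given predictors and returns (mean_auc, p_values), where
    p_values maps each predictor to its p-value.
    Returns a list of (predictors, mean_auc), one entry per step.
    """
    history = []
    current = list(predictors)
    while current:
        mean_auc, p_values = fit_and_score(current)
        history.append((list(current), mean_auc))
        if len(current) == 1:
            break
        weakest = max(current, key=lambda name: p_values[name])
        current.remove(weakest)
    return history
```

In a full analysis, `fit_and_score` would fit a logistic regression on each training split from `repeated_kfold`, score the held-out cells with `roc_auc`, and average over the 30 resulting folds.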

Figure S9. Gating strategy for identification of celiac disease-associated (CD103+ CD38+) CD8+ and γδ+ T cells. Numbers indicate gating order of PBMCs, here exemplified by gating of celiac disease-associated CD8+ T cells on day 6.

List of differentially expressed genes (Ensembl gene ID and gene name), log2 fold change (lfc) and adjusted p-value (padj) for genes that code for proteins expressed on the cell surface as mapped against an atlas of cell-surface proteins. [16]

The panel includes metal tags for sample barcoding (anti-CD45), secondary anti-phycoerythrin staining for identification of HLA-DQ2.5:gluten tetramer-binding cells, a

Table S7. Ten-fold cross validation of the markers that define Tetramer + cells best at d6 of gluten challenge and in untreated CeD
The weakest predictor was removed stepwise, giving one list for each number of markers from 31 down to 1. The estimate column is the maximum likelihood estimate of the log odds ratio for each regression term. The std.error column indicates the standard error of the estimated regression term in the previous column. The statistic column contains the t-statistic for the hypothesis that the regression term is non-zero. The p.value column contains the two-sided p-value associated with the t-statistic.

Panel for staining of tetramer-depleted PBMCs, which was used to study CD8+ and γδ+ T cells. The panel includes metal tags for sample barcoding (anti-CD45), which was performed prior to tetramer staining (see panel in Supplementary Table S3), viability staining (195Pt) and nucleated cell staining (191/193Ir). a) Final concentrations are stated in μg/ml when using self-conjugated antibodies or per volume 100 when the concentration was not available from the manufacturer. b) Novel markers based on RNA-seq analysis that were included in the mass cytometry analysis. c) Markers that were changed compared to the panel made for CD4+ T cells (see Supplementary Table S3).