Main

T cells control viral infections and provide immunological memory that enables long-lasting protection (refs. 1,2,3). Whereas CD4+ helper T cells orchestrate the immune response and enable B cells to produce antibodies, CD8+ cytotoxic T cells eliminate virus-infected cells. For both, recognition of viral antigens in the form of short peptides presented on HLAs is fundamental. In consequence, characterization of such viral T cell epitopes (refs. 4,5,6) is crucial for the understanding of immune defense mechanisms, but also a prerequisite for the development of vaccines and immunotherapies (refs. 3,7,8,9).

The SARS-CoV-2 coronavirus causes COVID-19, which has become a worldwide pandemic with dramatic socioeconomic consequences (refs. 10,11). Available treatment options are limited, and despite intensive efforts a vaccine is so far not available. Knowledge obtained from the two other zoonotic coronaviruses SARS-CoV-1 and MERS-CoV indicates that coronavirus-specific T cell immunity is an important determinant for recovery and long-term protection (refs. 12,13,14,15). This T cell-mediated immune response is even more important as studies on humoral immunity to SARS-CoV-1 provided evidence that antibody responses are short-lived and can even cause or aggravate virus-associated lung pathology (refs. 16,17). With regard to SARS-CoV-2, very recent studies (refs. 18,19,20) described CD4+ and CD8+ T cell responses to viral peptide megapools in donors who had recovered from COVID-19 and in individuals not exposed to SARS-CoV-2, the latter being indicative of potential T cell cross-reactivity (refs. 21,22). The exact viral epitopes that mediate these T cell responses against SARS-CoV-2, however, were not identified and characterized in detail in these studies, but are a prerequisite (1) to delineate the role of post-infectious and heterologous T cell immunity in COVID-19, (2) to establish diagnostic tools to identify SARS-CoV-2 immunity and, most importantly, (3) to define target structures for the development of SARS-CoV-2-specific vaccines and immunotherapies. In this study, we define SARS-CoV-2-specific and cross-reactive CD4+ and CD8+ T cell epitopes in a large collection of SARS-CoV-2 convalescent as well as nonexposed individuals and assess their relevance for immunity and the course of COVID-19 disease.

Results

Identification of SARS-CoV-2-derived peptides

A new prediction and selection workflow, based on the integration of the algorithms SYFPEITHI and NetMHCpan, identified 1,739 and 1,591 auspicious SARS-CoV-2-derived HLA class I- and HLA-DR-binding peptides across all ten viral open-reading frames (ORFs) (Fig. 1a and Extended Data Fig. 1a,b). Predictions were performed for the ten and six most common HLA class I (HLA-A*01:01, -A*02:01, -A*03:01, -A*11:01, -A*24:02, -B*07:02, -B*08:01, -B*15:01, -B*40:01 and -C*07:02) and HLA-DR (HLA-DRB1*01:01, -DRB1*03:01, -DRB1*04:01, -DRB1*07:01, -DRB1*11:01 and -DRB1*15:01) allotypes covering 91.7% and 70.6% of the world population with at least one allotype, respectively (refs. 23,24) (Extended Data Figs. 1c and 2a). To identify broadly applicable SARS-CoV-2-derived T cell epitopes, we selected 100 SARS-CoV-2-derived HLA class I-binding peptides comprising ten peptides per HLA class I allotype across all ten viral ORFs for immunogenicity screening (range 3–20 peptides per ORF, mean 10; Fig. 1b,c, Extended Data Fig. 1d–m and Supplementary Table 1). In addition, 20 SARS-CoV-2-derived promiscuous HLA-DR-binding peptides across all ORFs from peptide clusters of various HLA-DR allotype restrictions representing 99 different peptide-allotype combinations were included (Fig. 1d,e, Extended Data Fig. 2b–k and Supplementary Tables 2 and 3). Of these HLA-DR-binding peptides, 14 of 20 (70%) contained embedded SARS-CoV-2-derived HLA class I-binding peptides for 7 of 10 HLA class I allotypes. The complete panel of 120 SARS-CoV-2-derived peptides comprised 10% of the total SARS-CoV-2 proteome (57% and 12% of nucleocapsid and spike protein, respectively; Extended Data Fig. 2l) and showed an equally distributed origin of structural ORF proteins (61 of 120 (51%)) encompassing spike, envelope, membrane and nucleocapsid proteins as well as nonstructural or accessory ORFs (59 of 120 (49%)). 
The broad HLA class I and HLA-DR allotype restriction of the selected SARS-CoV-2-derived peptides covering ten common HLA class I and six common HLA-DR allotypes allowed for a total coverage of at least one HLA allotype in 97.6% of the individuals of the world population (Fig. 1f). Recurrent mutations of SARS-CoV-2 (refs. 25,26) affected only a minority of selected SARS-CoV-2-derived peptides with 14 of 120 (12%) sequences (1.7% at anchor position), including reported mutation sites (Supplementary Fig. 1 and Supplementary Tables 4 and 5). Taken together, we predicted for the most common HLA allotypes SARS-CoV-2-derived peptides across all ten viral ORFs and selected 100 HLA class I- and 20 HLA-DR-restricted epitope candidates for further immunological characterization.
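The combined coverage of 97.6% can be reproduced with a back-of-the-envelope check. The sketch below combines the reported class I (91.7%) and HLA-DR (70.6%) coverages under the assumption that carrying a covered class I allotype and carrying a covered HLA-DR allotype are independent events. This independence assumption is ours (HLA loci are in linkage disequilibrium), and this section does not state how the combined value was actually computed; the function name `combined_coverage` is hypothetical.

```python
def combined_coverage(*coverages: float) -> float:
    """Probability of being covered by at least one group of allotypes.

    Assumes (our simplification) that coverage by each group is independent,
    so the uncovered fractions multiply.
    """
    uncovered = 1.0
    for coverage in coverages:
        uncovered *= 1.0 - coverage
    return 1.0 - uncovered


# Coverages reported in the text: HLA class I 91.7%, HLA-DR 70.6%.
total = combined_coverage(0.917, 0.706)
print(f"{total:.1%}")  # prints 97.6%
```

Under this simplification the result agrees with the reported 97.6%, which suggests, but does not prove, that the two coverages were combined in a similar way.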

Fig. 1: Identification and selection of SARS-CoV-2-derived HLA class I- and HLA-DR-binding peptides.

a, Schematic overview of our prediction and selection approach and workflow to identify and finally select 120 broadly applicable SARS-CoV-2 HLA class I- and HLA-DR-binding peptides for further screening and validation as T cell epitopes. b, Selected HLA class I-binding peptides for the ten most common HLA class I allotypes. Each color represents a distinct ORF. spi, spike protein; env, envelope protein; mem, membrane protein; nuc, nucleocapsid protein. c, HLA class I peptide distribution within the ORF9 nucleocapsid protein (for ORF1–ORF8 and ORF10, refer to Extended Data Fig. 1e–m). Each color represents a distinct HLA class I allotype. d, Selected HLA-DR-binding peptides for the six most common HLA-DR allotypes. Each color represents a distinct ORF. e, HLA-DR peptide cluster distribution within the ORF9 nucleocapsid protein (for ORF1–ORF8 and ORF10, refer to Extended Data Fig. 2c–k). Each color represents a distinct HLA-DR allotype. f, Population coverage achieved with the selection of ten common HLA class I and six common HLA-DR allotypes for SARS-CoV-2 T cell epitope screening as compared to the world population. The percentage of individuals within the world population carrying up to five HLA class I or HLA-DR allotypes (x axis) are indicated as gray bars on the left y axis. The cumulative percentage of population coverage is depicted as black dots on the right y axis.

Characterization of SARS-CoV-2-derived T cell epitopes

Interferon (IFN)-γ ELISPOT screening of in vitro amplified T cells from patients convalescing from SARS-CoV-2 (SARS group 1, n = 116, Table 1 and Supplementary Table 6) and donors never exposed to SARS-CoV-2 (PRE group A, n = 104, samples collected before SARS-CoV-2 pandemic; Table 1 and Supplementary Table 7) validated 29 of 100 (29%) SARS-CoV-2-derived HLA class I-binding peptides (3 of 10 HLA-A*01; 2 of 10 HLA-A*02; 3 of 10 HLA-A*03; 2 of 10 HLA-A*11; 5 of 10 HLA-A*24; 2 of 10 HLA-B*07; 4 of 10 HLA-B*08; 0 of 10 HLA-B*15; 5 of 10 HLA-B*40; and 3 of 10 HLA-C*07) and 20 of 20 (100%) HLA-DR-binding peptides as naturally occurring T cell epitopes (Figs. 2a,b, 3a,b, Tables 2 and 3, Supplementary Figs. 2 and 3 and Supplementary Table 8). Additional flow-cytometry-based analyses for selected SARS-CoV-2-derived T cell epitopes revealed that T cell responses directed against HLA class I-binding peptides were mainly driven by CD8+ T cells and that HLA-DR-binding peptides were recognized by CD4+ T cells, notably in one single donor also by CD8+ T cells (Fig. 3c,d and Supplementary Table 9). Amplified CD4+ T cells often showed multifunctionality (expressing IFN-γ, tumor necrosis factor (TNF) and CD107a), whereas CD8+ T cells mainly produced only IFN-γ upon peptide stimulation (Fig. 3e and Supplementary Table 9). Twelve of 29 (41%) and 11 of 20 (55%) SARS-CoV-2-derived CD8+ and CD4+ T cell epitopes were dominant epitopes (recognized by ≥50% of SARS donors) with recognition frequencies up to 83% (A01_P01) and 95% (DR_P16), respectively (Fig. 2a,b and Tables 2 and 3). T cell responses showed high inter-individual as well as inter-peptide intensity variation (in terms of spot counts per 5 × 10^5 cells). Overall, the intensity of HLA-DR-specific T cell responses in the SARS group was more pronounced compared to those directed against HLA class I T cell epitopes (Fig. 4a,b). 
All SARS-CoV-2-derived HLA-DR-binding peptides were found to be immunogenic, independently of the source ORF. SARS-CoV-2-derived HLA class I T cell epitopes showed an equally distributed origin from structural (13 of 29 (45%)) and nonstructural or accessory (16 of 29 (55%)) ORFs (Table 2). However, ORF-specific differences regarding the proportion of validated HLA class I T cell epitopes were observed, revealing the highest frequencies for ORF9 (50%, nucleocapsid protein), ORF1 (45%) and ORF3 (38%; Fig. 4c). The highest recognition in SARS donors was observed for HLA class I T cell epitopes derived from ORF2 (55%, spike protein), ORF5 (52%, membrane protein) and ORF3 (45%), as well as for HLA-DR T cell epitopes derived from ORF5 (95%, membrane protein), ORF8 (68%) and ORF4 (55%, envelope protein; Fig. 4d). In summary, we identified and characterized multiple dominant and subdominant SARS-CoV-2-derived HLA class I and HLA-DR T cell epitopes in patients convalescing from COVID-19.
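The epitope-level definitions used here can be written down compactly. The following is a minimal sketch, not the authors' analysis code: recognition frequency is the number of donors with a T cell response divided by the number of tested donors, an epitope is dominant when at least 50% of SARS donors respond, and (following the Fig. 2 legend) it is SARS-CoV-2-specific when responses occur exclusively in the SARS group and cross-reactive when any PRE donor responds. The function names and the donor counts in the example are hypothetical and chosen only for illustration.

```python
DOMINANCE_THRESHOLD = 0.5  # dominant: recognized by >=50% of SARS donors


def recognition_frequency(responders: int, tested: int) -> float:
    """Donors with a T cell response divided by all tested donors."""
    if tested <= 0:
        raise ValueError("at least one donor must be tested")
    return responders / tested


def classify_epitope(sars_responders, sars_tested, pre_responders, pre_tested):
    """Return (dominance, reactivity) labels for one validated epitope."""
    sars_freq = recognition_frequency(sars_responders, sars_tested)
    recognition_frequency(pre_responders, pre_tested)  # validates PRE counts
    dominance = "dominant" if sars_freq >= DOMINANCE_THRESHOLD else "subdominant"
    reactivity = "cross-reactive" if pre_responders > 0 else "SARS-CoV-2-specific"
    return dominance, reactivity


# Hypothetical counts (SARS responders, SARS tested, PRE responders, PRE tested);
# these are NOT values from the study.
example_counts = {
    "epitope_X": (10, 12, 0, 15),
    "epitope_Y": (3, 14, 4, 16),
}
for name, counts in example_counts.items():
    print(name, classify_epitope(*counts))
```

With these illustrative counts, `epitope_X` is labeled dominant and SARS-CoV-2-specific, and `epitope_Y` subdominant and cross-reactive.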

Table 1 Donor characteristics
Fig. 2: Validation of SARS-CoV-2-derived HLA class I and HLA-DR T cell epitopes.

a,b, Recognition frequency- and allotype-sorted pie charts of SARS-CoV-2-derived HLA class I (a) and HLA-DR (b) T cell epitopes. Recognition frequency of T cell epitopes (donors with T cell responses/tested donors) in groups of HLA class I-matched convalescent donors of SARS-CoV-2 infection (SARS group 1, total n = 116, left pie chart, red) and donors never exposed to SARS-CoV-2 (PRE group A, total n = 104, right pie chart, blue) were assessed by ELISPOT assays after 12-d in vitro pre-stimulation. Dominant (immune responses in ≥50% of SARS donors) and subdominant T cell epitopes are marked with dark gray and light gray backgrounds, respectively. SARS-CoV-2-specific T cell epitopes with responses detected exclusively in the SARS group are marked with a red frame, cross-reactive epitopes with immune responses detected in the PRE group are marked with a blue frame.

Fig. 3: Immunological characterization of SARS-CoV-2-derived HLA class I and HLA-DR T cell epitopes.

a–d, IFN-γ ELISPOT assay (a,b) and flow cytometry-based characterization (c,d) of peptide-specific T cells from convalescent SARS donors after 12-d in vitro pre-stimulation with SARS-CoV-2-derived HLA class I- (a,c) and HLA-DR-binding (b,d) peptides. Flow cytometry data of indicated cytokines and surface markers are shown for CD8+ (c) and CD4+ (d) T cells. T cell responses were considered positive when mean spot counts in ELISPOT assays or detected frequency in intracellular cytokine staining was at least threefold higher than the negative control. ELISPOT data are presented as a scatter dot plot with mean. Neg. pep., negative control using an irrelevant HLA-matched peptide. e, Percentage of IFN-γ+, IFN-γ+TNF+ or CD107a+CD8+ (for HLA class I peptides) or CD4+ T cells (for HLA-DR peptides) across multiple donors. Indicated percentage depicts the frequency in the sample stimulated with the test peptide minus the frequency of the negative control stimulated with an irrelevant control peptide. Each data point represents one single donor analyzed within one single experiment. Horizontal lines indicate mean with s.d. (error bar). The gating strategy applied for the evaluation of flow-cytometry-acquired data presented in this figure is provided in Supplementary Fig. 5.
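The positivity criterion stated in this legend (signal at least threefold higher than the negative control) can be sketched as follows. This is an illustration under stated assumptions, not the authors' evaluation pipeline: the function name `is_positive` and the replicate values are hypothetical, and the requirement of a nonzero test mean is our own guard for the zero-background case, which the legend does not address.

```python
from statistics import mean


def is_positive(test_values, negative_values, fold=3.0):
    """Apply the 'at least threefold over negative control' rule.

    test_values / negative_values: replicate read-outs (mean spot counts in
    ELISPOT or cell frequencies in intracellular cytokine staining).
    The nonzero-signal guard is an assumption, not taken from the source.
    """
    test_mean = mean(test_values)
    negative_mean = mean(negative_values)
    return test_mean > 0 and test_mean >= fold * negative_mean


# Hypothetical replicate spot counts, for illustration only.
print(is_positive([30, 36], [10, 10]))  # mean 33 vs threshold 30
print(is_positive([25, 27], [10, 10]))  # mean 26 vs threshold 30
```

The first call meets the threefold threshold and the second does not.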

Table 2 Immunogenic SARS-CoV-2-derived HLA class I T cell epitopes
Table 3 Immunogenic SARS-CoV-2-derived HLA-DR T cell epitopes
Fig. 4: Intensity of T cell responses against SARS-CoV-2 HLA class I and HLA-DR T cell epitopes and immunogenicity of different SARS-CoV-2 ORFs.

a,b, Intensity of T cell responses in terms of calculated spot counts in IFN-γ ELISPOT assays after 12-d pre-stimulation against the respective SARS-CoV-2 HLA class I (a) and HLA-DR (b) T cell epitopes using peripheral blood mononuclear cells (PBMCs) from convalescent SARS-CoV-2-infected donors (SARS) as well as unexposed donors (PRE). Dots represent data from individual donors. Bars represent mean with s.d. (error bar). c, Frequency of validated HLA class I T cell epitopes for structural (dark gray) and nonstructural/accessory (light gray) ORFs. d, Mean recognition frequency of HLA class I and HLA-DR T cell epitopes by SARS (red) and PRE donors (blue) within the different ORFs.

Cross-reactive T cell responses in unexposed individuals

Upon screening the PRE group A, cross-reactive T cell responses to 9 of 29 (31%) of the validated HLA class I and to 14 of 20 (70%) HLA-DR T cell epitopes were detected. Recognition frequencies (donors with T cell responses normalized to all tested donors) of single SARS-CoV-2 HLA class I and HLA-DR T cell epitopes in the PRE group A were lower compared to that of SARS group 1 (up to 27% for B08_P05 and 44% for DR_P01; Fig. 2a,b, Tables 2 and 3 and Supplementary Fig. 4). Recognition frequencies of HLA class I and HLA-DR T cell epitopes in individual donors differed profoundly between the PRE and the SARS group within the different ORFs. ORF1-derived HLA class I (9%) and ORF8-derived HLA-DR (25%) T cell epitopes showed the highest recognition frequencies in the PRE group, whereas none of the T cell epitopes from ORF5 (membrane protein) and ORF10 that were frequently recognized in SARS donors were detected by T cells in PRE donors (Fig. 4d). Donor-specific recognition rates (recognized peptides/tested peptides) of HLA class I and HLA-DR SARS-CoV-2 T cell epitopes were significantly lower in the PRE group (HLA class I, mean 26 ± 9; HLA-DR, mean 10 ± 5) than in the SARS group (HLA class I, mean 52 ± 23; HLA-DR, mean 52 ± 23; Fig. 5a). Alignments of the SARS-CoV-2 T cell epitopes recognized by unexposed individuals revealed similarities to the four seasonal human common cold coronaviruses (HCoV-OC43, HCoV-229E, HCoV-NL63, HCoV-HKU1) with regard to amino acid sequences, physiochemical and/or HLA-binding properties for 14 of 20 (70%) of the epitopes, thereby providing clear evidence for SARS-CoV-2 T cell cross-reactivity (Fig. 5b, Supplementary Tables 10 and 11 and Supplementary Data 1). Together, cross-reactive T cell responses to SARS-CoV-2 HLA class I and HLA-DR T cell epitopes were identified in unexposed individuals. 
These cross-reactive peptides showed similarity to common cold coronaviruses, providing functional basis for heterologous immunity in SARS-CoV-2 infection.

Fig. 5: Detection and characterization of T cell responses to SARS-CoV-2-derived HLA class I and HLA-DR T cell epitopes in unexposed individuals.

a, Recognition rate of HLA class I and HLA-DR SARS-CoV-2 T cell epitopes (recognized peptides/tested peptides) in samples of donors from the SARS group 1 (n = 116) and PRE group A (n = 104), respectively (data shown for donors with T cell responses, mean with s.d. (error bars), two-sided Mann–Whitney U-test). b, Representative sequence and physiochemical property alignments of the cross-reactive SARS-CoV-2 T cell epitope A24_P02 with the four seasonal human common cold coronaviruses (HCoV-OC43, HCoV-229E, HCoV-NL63, HCoV-HKU1; for other cross-reactive peptides refer to Supplementary Tables 10 and 11 and Supplementary Data 1). Physiochemical properties were calculated by the PepCalc software. Column directions (up versus down) indicate hydrophilicity according to the Hopp–Woods scale. c, Schematic overview of the definition of SARS-CoV-2-specific and cross-reactive ECs for standardized evaluation of SARS-CoV-2 T cell responses in a group of convalescent individuals from SARS-CoV-2 infection (SARS group 2, n = 86) and a group of unexposed individuals (PRE group B, n = 94). d,e, Recognition frequency (donors with T cell responses/tested donors) of cross-reactive (d) and SARS-CoV-2-specific (e) ECs by T cells in the SARS group 2 and PRE group B. f,g, Calculated spot counts for SARS-CoV-2-specific (HLA class I, n = 68; HLA-DR, n = 78) (f) and cross-reactive ECs (g) in the SARS group 2 (HLA class I, n = 51; HLA-DR, n = 86) and PRE group B (HLA class I, n = 15; HLA-DR, n = 73) (boxes represent median and 25th to 75th percentiles, whiskers are minimum to maximum, two-sided Mann–Whitney U-test).

T cell responses in convalescent and unexposed individuals

Epitope screening in SARS and PRE donors enabled the identification of SARS-CoV-2-specific T cell epitopes recognized exclusively in convalescent patients after SARS-CoV-2 infection and of cross-reactive T cell epitopes recognized by both convalescent patients and SARS-CoV-2 unexposed individuals. To allow for standardized evaluation and determination of T cell response frequencies to SARS-CoV-2, we designed broadly applicable HLA class I and HLA-DR SARS-CoV-2-specific and cross-reactive T cell epitope compositions (ECs) (Fig. 5c and Extended Data Fig. 6). These ECs were utilized for IFN-γ ELISPOT assays after 12-d in vitro pre-stimulation in groups of convalescent patients (SARS group 2, n = 86; Table 1 and Supplementary Table 6) and unexposed donors (PRE group B, n = 94; Table 1 and Supplementary Table 7). Of the SARS donors, 100% showed T cell responses to cross-reactive and/or specific ECs (HLA class I 86%, HLA-DR 100%; Fig. 5d,e), whereas 81% of PRE donors showed HLA class I (16%) and/or HLA-DR (77%) T cell responses to cross-reactive ECs (Fig. 5d). In line with the findings obtained with the screening group (SARS group 1), the intensity (in terms of spot counts per 5 × 10^5 cells) of HLA class I T cell responses was significantly lower compared to HLA-DR T cell responses, both for specific (median calculated spot count HLA class I 379, HLA-DR 760) and cross-reactive ECs (median calculated spot count HLA class I 86, HLA-DR 846; Fig. 5f,g). In line with the differences in recognition rates observed between SARS group 1 and PRE group A, the intensity of T cell responses to cross-reactive ECs was significantly lower in the PRE group (median calculated spot count HLA class I 14, HLA-DR 346) compared to the SARS group (Fig. 5g).

In addition, we evaluated SARS-CoV-2 T cell responses to our ECs ex vivo without 12-d pre-stimulation. Whereas the low-frequency pre-existing SARS-CoV-2 T cells detecting the cross-reactive ECs could not be delineated without pre-stimulation in PRE donors (0 of 42), ex vivo T cell responses to SARS-CoV-2 cross-reactive and/or specific ECs were observed in 96% (45 of 47) of SARS donors (58% HLA class I, 96% HLA-DR; Extended Data Fig. 3a,b). The intensity of T cell responses (in terms of spot counts per 5 × 10^5 cells) was lower in ex vivo analyses, showing a significant expansion of SARS-CoV-2-specific T cells upon pre-stimulation (Extended Data Fig. 3c–f). In addition to our convalescent SARS collection, including mainly donors with a mild course of COVID-19, we further evaluated SARS-CoV-2 T cell immunity in a group of hospitalized SARS donors (n = 21; Extended Data Fig. 3g). In 81% (17 of 21) of the severely ill patients, T cell responses targeting our specific (71%) or cross-reactive (76%) ECs could be detected ex vivo (Extended Data Fig. 3h). Compared to the ex vivo analyzed donors of SARS group 2, recognition frequencies in the hospitalized group differed most in cross-reactive EC HLA-DR (94% nonhospitalized versus 71% hospitalized). Taken together, SARS-CoV-2 T cell epitopes enabled detection of post-infectious T cell immunity in 100% of individuals convalescing from COVID-19 and revealed pre-existing T cell responses in 81% of unexposed individuals.

Relationship of SARS-CoV-2 T cell and antibody responses

Anti-SARS-CoV-2 IgG responses in SARS donors were analyzed in two independent assays. The anti-SARS-CoV-2 S1 IgG ELISA assay directed against the S1 domain of the viral spike protein, including the immunologically relevant receptor binding domain, revealed 149 of 178 (84%), 7 of 178 (4%) and 22 of 178 (12%) donors with positive, borderline and no anti-S1 response, respectively (Fig. 6a). Of the borderline/nonresponders, 18 of 29 (62%) were also negative in a second, independent anti-nucleocapsid immunoassay (Fig. 6b). However, SARS-CoV-2-specific CD8+ and/or CD4+ T cell responses after a 12-d in vitro pre-stimulation were detected in 10 of 18 (56%) of the ‘antibody double-negative’ donors (Fig. 6c). The intensity of SARS-CoV-2-specific and cross-reactive HLA-DR T cell responses correlated with antibody titers (Fig. 6d,e), whereas no correlation was observed with HLA class I T cell responses (Extended Data Fig. 4a,b). No correlation between antibody titers directed against the nucleocapsid of human common cold coronaviruses (HCoV-229E, HCoV-NL63 and HCoV-OC43), as determined by bead-based serological multiplex assays and the intensity of cross-reactive CD4+ and CD8+ T cell responses in the SARS group, was detected (Extended Data Fig. 4c–h). In conclusion, SARS-CoV-2-specific peptides enable the detection of post-infectious T cell responses, even in seronegative convalescents.
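The serology numbers in this paragraph form a simple funnel, which the short sketch below recomputes from the counts given in the text (178 donors, 29 borderline or nonresponders, 18 antibody double-negative donors, 10 of them with SARS-CoV-2-specific T cell responses). It is only an arithmetic check; the helper name `percent` is hypothetical.

```python
def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest integer, as reported in the text."""
    return round(100 * part / whole)


total_donors = 178
anti_s1 = {"positive": 149, "borderline": 7, "negative": 22}
assert sum(anti_s1.values()) == total_donors  # the three categories add up

borderline_or_negative = anti_s1["borderline"] + anti_s1["negative"]
double_negative = 18    # also negative in the anti-nucleocapsid assay
t_cell_responders = 10  # double-negative donors with specific T cell responses

for status, count in anti_s1.items():
    print(f"anti-S1 {status}: {count}/{total_donors} = {percent(count, total_donors)}%")
print(f"double negative: {double_negative}/{borderline_or_negative} = "
      f"{percent(double_negative, borderline_or_negative)}%")
print(f"T cell responders: {t_cell_responders}/{double_negative} = "
      f"{percent(t_cell_responders, double_negative)}%")
```

The recomputed values (84%, 4%, 12%, 62% and 56%) match the percentages quoted in the paragraph.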

Fig. 6: SARS-CoV-2-directed antibody and T cell responses in the course of COVID-19.

a,b, SARS-CoV-2 serum IgG S1 ratio (EUROIMMUN) in SARS donors (n = 178) (a) and anti-nucleocapsid antibody titers (Elecsys immunoassay) of SARS donors with borderline/negative responses in EUROIMMUN assay (n = 29) (b). Donors with negative and borderline responses are marked in white and gray, respectively. c, The pie chart displays T cell responses (positive, n = 15; negative, n = 3) to SARS-CoV-2-specific (n = 10) and cross-reactive (n = 5) T cell epitopes in donors without antibody responses (n = 18, assessed in two independent assays). d,e, Correlation analysis of IgG ratios (EUROIMMUN) to SARS-CoV-2 with spot counts assessed by ELISPOT assays for HLA-DR SARS-CoV-2-specific (n = 78) (d) and cross-reactive (n = 86) (e) ECs in SARS group 2 (dotted lines, 95% confidence level, Spearman’s rho (ρ) and P value). f,g, IgG antibody response (EUROIMMUN) to SARS-CoV-2 (n = 178) (f) and T cell response to SARS-CoV-2-specific (HLA class I, n = 68; HLA-DR, n = 78) (g) and cross-reactive ECs (HLA class I, n = 51; HLA-DR, n = 86), respectively, in SARS donors with low and high SC (combining objective (fever ≥38 °C) and subjective disease symptoms) in the course of COVID-19. h, Recognition rate of T cell epitopes (recognized peptides/tested peptides) in SARS donors (group 1) with low and high SC in the course of COVID-19 (n = 84). Boxes represent median and 25th to 75th percentiles, whiskers are minimum to maximum, two-sided Mann–Whitney U-test (f,g); boxes represent median and 25th to 75th percentiles, whiskers are minimum to maximum, one-sided Student’s t-test (h).

Association of antibody and T cell responses with COVID-19

Finally, we determined the association of anti-SARS-CoV-2 antibody and T cell responses after a 12-d in vitro pre-stimulation with disease severity, as assessed by a combinatorial symptom score (SC) of objective (fever ≥38.0 °C) and patient-subjective disease symptoms (Table 1). As in critically ill patients (ref. 27), and independently of age, high antibody ratios were significantly associated with disease severity in our collection of convalescent SARS donors (n = 180, group 1 and 2), who in general were in good health and had not been hospitalized (Fig. 6f and Extended Data Fig. 5a). Neither the intensity of SARS-CoV-2-specific nor of cross-reactive T cell responses to HLA class I or HLA-DR ECs correlated with demographics (sex, age or body mass index; Supplementary Tables 12 and 13) or disease severity (Fig. 6g). Rather, diversity of T cell responses in terms of recognition rate of SARS-CoV-2 T cell epitopes (number of recognized epitopes normalized to the total number of tested epitopes in the respective donor) was decreased in patients with more severe COVID-19 symptoms (Fig. 6h and Extended Data Fig. 5b), providing evidence that development of protective immunity requires recognition of multiple SARS-CoV-2 epitopes.

Discussion

This study reports the characterization of multiple broadly applicable SARS-CoV-2-specific and cross-reactive T cell epitopes of various HLA allotype restrictions across all viral ORFs identified in two large collections of donors recovered from SARS-CoV-2 infection as well as unexposed individuals. Our findings aid SARS-CoV-2 research with regard to the understanding of SARS-CoV-2 post-infectious and heterologous T cell responses, but also regarding the development of prophylactic and therapeutic measures.

To allow for the detection of even very small SARS-CoV-2 epitope-recognizing T cell populations, especially in unexposed donors, where SARS-CoV-2 cross-reactive T cells were below the detection limit in ex vivo analyses, epitope definition was based on a 12-d pre-stimulation protocol before a routine 18–24-h ELISPOT assay. The requirement of this pre-stimulation protocol is further supported by recent work characterizing human cytomegalovirus-derived T cell epitopes, which showed a loss even of dominant human cytomegalovirus-derived T cell epitopes when T cell responses were analyzed ex vivo without previous amplification (ref. 5). However, as in vitro culture might distort cytokine production or proportions of specific T cell subsets, further studies have to evaluate the physiological cytokine profile and phenotype of SARS-CoV-2-specific T cells in more extensive ex vivo studies. Further validation of the proposed T cell epitopes requires confirmation of MHC binding of the respective peptides, which could be achieved by refolding experiments to build monomers (MHC–peptide complexes) followed by tetramer staining of T cells or by cytotoxicity experiments utilizing, for example, SARS-CoV-2-infected cell lines.

At present, determination of immunity to SARS-CoV-2 relies on the detection of SARS-CoV-2 antibody responses. However, despite the high sensitivity reported for several assays, there is still a substantial percentage of patients with negative or borderline antibody responses and thus unclear immunity status after SARS-CoV-2 infection (ref. 28). Our SARS-CoV-2-specific T cell epitopes, which are not recognized by T cells of unexposed donors, allowed for detection of specific T cell responses even in donors without antibody responses, thereby providing evidence for T cell immunity upon infection. In additional analyses of T cell immunity in hospitalized donors, we also demonstrated SARS-CoV-2 T cell responses in severely ill patients with COVID-19.

In line with previous data on acute and chronic viral infection (refs. 29,30), our data indicate an important role of SARS-CoV-2 CD4+ T cell responses in the natural course of infection, with the identification of multiple dominant HLA-DR T cell epitopes that elicit more frequent and intense immune responses in SARS donors compared to the HLA class I T cell epitopes. This guides selection of T cell epitopes for vaccine design, also in light of the CD4+ T cell–dependent stimulation of protective antibody responses.

Cross-reactivity of T cells for different virus species or even among different pathogens is a well-known phenomenon (refs. 31,32) postulated to enable heterologous immunity to a pathogen after exposure to a nonidentical pathogen (refs. 21,22,33). This heterologous immunity facilitated by cross-reactive T cell responses can mediate either beneficial or adverse effects (refs. 34,35), such as in Epstein–Barr virus infection, where influenza immunity and the cross-reactive T cell antigen receptor repertoire can lead to protective immunity to Epstein–Barr virus infection (ref. 36) or to severe symptoms of infectious mononucleosis (ref. 37). Using predicted or random SARS-CoV-2-derived peptide pools, very recent studies reported pre-existing SARS-CoV-2-directed T cell responses in small groups of unexposed individuals as well as individuals who are seronegative for SARS-CoV-2, thereby suggesting cross-reactivity between human common cold coronaviruses and SARS-CoV-2 (refs. 18,19,20). In our study we identified and characterized the exact T cell epitopes that govern SARS-CoV-2 cross-reactivity and demonstrated similarity to human common cold coronaviruses regarding individual peptide sequences, physiochemical and HLA-binding properties (refs. 38,39). Notably, we detected SARS-CoV-2 cross-reactive T cells in 81% of unexposed individuals after a 12-d pre-stimulation. Furthermore, evidence was provided for a lower recognition frequency of cross-reactive HLA-DR EC in hospitalized patients compared to donors with mild COVID-19 course, which might suggest a lack of pre-existing SARS-CoV-2 T cells in severely ill patients. To determine whether expandable, cross-reactive T cells indeed mediate beneficial heterologous immunity and whether this explains the relatively small proportion of severely ill or, even in general, infected patients during this pandemic (refs. 40,41), a dedicated study using, for example, a matched case control or retrospective cohort design applying our cross-reactive SARS-CoV-2 T cell epitopes would be required. 
Moreover, it has to be emphasized that the approach of sequence alignments using National Center for Biotechnology Information (NCBI) BLAST (refs. 42,43) mainly allows for the detection of cross-reactive epitopes with high sequence similarity, while cross-reactive epitopes with similarities in physiochemical properties within other ORFs of human common cold coronaviruses as well as in other human viruses such as influenza (ref. 44) might not be identified.

Our observation that intensity of T cell responses and recognition rate of T cell epitopes was significantly higher in convalescent patients compared to unexposed individuals suggests that not only expansion, but also a spread of SARS-CoV-2 T cell response diversity occurs upon active infection.

The pathophysiological involvement of the immune response in the course of COVID-19 is a matter of intense debate. We showed a correlation of high antibody titers with enhanced COVID-19 symptoms in our cohort of nonhospitalized patients. This finding is in line with recent data describing a correlation of high antibody titers with disease severity in hospitalized patients (ref. 27). Our data, together with a recently published study (ref. 20), provide evidence that, on the other hand, the intensity of T cell responses does not correlate with disease severity. This finding is of high relevance for the design of vaccines, as it provides evidence that disease-aggravating effects might not hamper the development of prophylactic and therapeutic vaccination approaches aiming to induce SARS-CoV-2-specific T cell responses. In contrast to the intensity of the T cell response, we showed that recognition rates of SARS-CoV-2 T cell epitopes by individual donors were lower in individuals with more severe COVID-19 symptoms. This observation, together with our data on increased T cell epitope recognition rates after SARS-CoV-2 infection compared to pre-existing T cell responses in unexposed individuals and reports from other active or chronic viral infections associating diversity of T cell response with antiviral defense (refs. 45,46,47), provides evidence that natural development and vaccine-based induction of immunity to SARS-CoV-2 requires recognition of multiple SARS-CoV-2 epitopes. Confirmation of this observation in a larger SARS cohort, including more hospitalized patients, is warranted and requires single epitope-based methods to determine T cell epitope recognition rates, as enabled by our SARS-CoV-2 T cell epitopes. Moreover, our data underline the high importance of the identified T cell epitopes for further studies of SARS-CoV-2 immunity, but also for the development of preventive and therapeutic COVID-19 measures. 
Using the SARS-CoV-2 T cell epitopes we are currently preparing two clinical studies (EudraCT 2020-002502-75; EudraCT 2020-002519-23) to evaluate a multi-peptide vaccine for induction of broad T cell immunity to SARS-CoV-2 to combat COVID-19.

Methods

Patients and blood samples

Blood and serum samples as well as questionnaire-based assessment of donor characteristics and disease symptoms from convalescent volunteers after SARS-CoV-2 infection were collected at the University Hospital Tübingen and the Cancer Research Department Rhein-Main (Hospital Nordwest) from April to July 2020 (SARS collection, n = 180). The collection of unexposed individuals (PRE collection, n = 185) includes samples of healthy blood donors (blood donations for research purposes from the Department of Transfusion Medicine, University Hospital Tübingen) who were never exposed to SARS-CoV-2, as the PBMCs of these donors were isolated and stored (Department of Immunology, Tübingen) before the SARS-CoV-2 pandemic (June 2007 to November 2019). Informed consent was obtained in accordance with the Declaration of Helsinki protocol. The study was approved by and performed according to the guidelines of the local ethics committees (179/2020/BO2, MC 288/2015). Out of the SARS (n = 180) and PRE (n = 185) collections, two groups were formed for (1) single-peptide-based T cell epitope screening (SARS group 1 and PRE group A) and (2) standardized immunity evaluation of ECs using IFN-γ ELISPOT assays after in vitro expansion as well as directly ex vivo (SARS group 2 and PRE group B). Donors were assigned to groups according to time of sample acquisition and available sample cell number. Some donors were analyzed in both groups (1 and 2 or A and B for SARS or PRE, respectively). In addition, samples from hospitalized severely ill SARS donors were collected for ex vivo T cell immunity evaluation. SARS-CoV-2 infection was confirmed by PCR test after nasopharyngeal swab. SARS donor recruitment was performed by online and paper-based calls. Sample collection for SARS donors of groups 1 and 2 was performed approximately 3–8 weeks after the end of symptoms and/or negative virus smear. Sample collection of hospitalized SARS donors was performed 5–112 d after positive SARS-CoV-2 PCR. 
PBMCs were isolated by density gradient centrifugation and stored at −80 °C until further use. Serum was separated by centrifugation for 10 min and supernatant was stored at −80 °C. HLA typing was carried out by Immatics Biotechnology GmbH and the Department of Hematology and Oncology at the University Hospital Tübingen. SC was determined by combining objective (fever ≥38.0 °C) and subjective disease symptoms (no/mild/moderate versus severe, reported by questionnaire) of individual donors. Donors with severe disease symptoms and/or fever were classified as ‘high SC’ and all others as ‘low SC’. Detailed SARS and PRE donor characteristics as well as information on allocation of the donors to the experimental groups are provided in Table 1, Supplementary Tables 6 and 7 and Extended Data Fig. 3g.

Data retrieval

The complete highly conserved and representative annotated proteome sequence of SARS-CoV-2 isolate Wuhan-Hu-1 containing ten different ORFs was retrieved from the NCBI database with the accession number MN908947 (ref. 48). The amino acid sequence is identical to the reference sequence (EPI_ISL_412026) defined by Wang et al. conducting multiple sequence alignments and phylogenetic analyses of 95 full-length genomic sequences25.

Prediction of SARS-CoV-2-derived HLA class I-binding peptides

The protein sequences of all ten ORFs were split into 9–12 amino acid-long peptides covering the complete proteome of the virus. The prediction algorithms NetMHCpan 4.0 (refs. 49,50,51) and SYFPEITHI 1.0 (ref. 52) were used to predict the binding of peptides to HLA-A*01:01, -A*02:01, -A*03:01, -A*11:01, -A*24:02, -B*07:02, -B*08:01, -B*15:01, -B*40:01 and -C*07:02. Only peptides predicted as HLA-binding peptides by both algorithms (SYFPEITHI score ≥60%, NetMHCpan rank ≤2) for the respective allotype were further examined. Peptides containing cysteines were excluded to avoid dimerization in a potential subsequent vaccine production process. Peptides derived from the ORF1 polyprotein that span the cleavage sites between its constituent protein chains were excluded. An averaged rank combining NetMHCpan- and SYFPEITHI-derived prediction scores was calculated and peptides were ranked for each allotype and ORF separately. Through rank-based selection, one peptide was selected for each ORF and each allotype. For peptides with equal averaged ranks, peptides with higher SYFPEITHI scores were nominated. For some HLA allotypes not every ORF gave rise to an appropriate HLA-binding peptide. To obtain ten peptides per HLA allotype, remaining slots were filled with additional peptides from the ORF9 nucleocapsid protein, the ORF2 spike protein and ORF1.
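The selection steps described above can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the predictor outputs are assumed to be supplied as plain dictionaries (the hypothetical arguments `syf_score` and `net_rank`), and the "averaged rank" is interpreted here as the mean of each peptide's rank position under the two predictors, which the text does not spell out. The exclusion of ORF1 cleavage-site peptides is not modeled.

```python
def tile_peptides(protein, lengths=range(9, 13)):
    """Return every contiguous peptide of the given lengths (9-12 by default)."""
    return [protein[i:i + k]
            for k in lengths
            for i in range(len(protein) - k + 1)]


def select_candidates(peptides, syf_score, net_rank, syf_min=60.0, net_max=2.0):
    """Keep cysteine-free peptides passing both cutoffs, best candidate first.

    syf_score: peptide -> SYFPEITHI score in percent (higher is better)
    net_rank:  peptide -> NetMHCpan percentile rank (lower is better)
    Both mappings are hypothetical inputs for this sketch.
    """
    kept = [p for p in peptides
            if "C" not in p
            and syf_score.get(p, 0.0) >= syf_min
            and net_rank.get(p, 100.0) <= net_max]
    # Rank position under each predictor (1 = best), assumed interpretation.
    by_syf = {p: r for r, p in
              enumerate(sorted(kept, key=lambda p: -syf_score[p]), 1)}
    by_net = {p: r for r, p in
              enumerate(sorted(kept, key=lambda p: net_rank[p]), 1)}
    # Equal averaged ranks are resolved in favor of the higher SYFPEITHI score.
    return sorted(kept,
                  key=lambda p: ((by_syf[p] + by_net[p]) / 2, -syf_score[p]))
```

In practice this would be run once per ORF and per HLA allotype, taking the first element of the returned list as the nominated peptide.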

Prediction of SARS-CoV-2-derived HLA-DR-binding peptides

For HLA-DR predictions all ten ORFs were split into peptides of 15 amino acids, resulting in a total of 9,561 peptides. The prediction algorithm SYFPEITHI 1.0 was used to predict the binding to HLA-DRB1*01:01, -DRB1*03:01, -DRB1*04:01, -DRB1*07:01, -DRB1*11:01 and -DRB1*15:01. The 5% (2% for ORF1) top-scoring peptides of each ORF (based on the total length of each ORF) and each HLA-DR allotype were selected. Position-based sorting of peptides within each ORF revealed peptide clusters of promiscuous peptides binding to several HLA-DR allotypes. Through cluster-based selection, peptide clusters of promiscuous peptides with a common core sequence of nine amino acids were selected. Thereby, ten and two clusters were selected for the ORF9 nucleocapsid and the ORF2 spike protein, respectively, as well as one cluster for each of the remaining ORFs. From each selected cluster, one representative peptide was selected for immunogenicity analysis, excluding cysteine-containing peptides.
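The windowing and top-fraction steps can be illustrated with a short Python sketch. It assumes that scores come from an external predictor and are passed in as `(start, peptide, score)` tuples, and that the number of peptides kept is the stated percentage of the ORF length rounded up; the rounding rule is an assumption, as the text does not state it.

```python
def fifteen_mers(protein):
    """All overlapping 15-residue windows with their 1-based start position."""
    return [(i + 1, protein[i:i + 15]) for i in range(len(protein) - 14)]


def top_scoring(scored, orf_length, percent):
    """Keep the top `percent` % of peptides, sized by ORF length.

    scored: list of (start, peptide, score) tuples (hypothetical input format)
    percent: integer percentage, e.g. 5 (or 2 for ORF1)
    """
    n_keep = -(-orf_length * percent // 100)  # integer ceiling (assumed rounding)
    best = sorted(scored, key=lambda item: item[2], reverse=True)[:n_keep]
    # Return in positional order so that neighboring hits appear as clusters.
    return sorted(best)
```

Sorting the retained peptides by start position mirrors the position-based sorting in the text: overlapping 15-mers that score well for several allotypes end up adjacent to one another.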

Sequence and physiochemical property alignments to human common cold coronaviruses

Potential cross-reactive epitopes of SARS-CoV-2-derived peptides from the four seasonal human common cold coronaviruses (HCoV-OC43, HCoV-229E, HCoV-NL63 and HCoV-HKU1) were identified by sequence alignments of the SARS-CoV-2-derived peptide sequences with the sequences of the common cold coronaviruses using NCBI BLAST42,43. The HLA binding of the common cold coronavirus-derived peptides to the HLA allele of the corresponding SARS-CoV-2 peptide was predicted by the algorithms NetMHCpan 4.0 (refs. 49,50,51) and SYFPEITHI 1.0 (ref. 52). Physiochemical property alignments of the SARS-CoV-2-derived peptide sequences with the human common cold coronaviruses were performed by PepCalc (https://pepcalc.com/).

IFN-γ ELISPOT assay following 12-d in vitro stimulation or ex vivo without pre-stimulation

Synthetic peptides were provided by EMC Microcollections and INTAVIS Bioanalytical Instruments. For the 12-d in vitro stimulation, PBMCs were pulsed with HLA class I or HLA-DR peptide pools (1 μg ml−1 per peptide for class I or 5 μg ml−1 for HLA-DR) and cultured for 12 d adding 20 U ml−1 interleukin-2 (Novartis) on days 3, 5 and 7. Peptide-stimulated (expanded/in vitro pre-stimulated) or freshly thawed (ex vivo) PBMCs were analyzed by enzyme-linked immunospot (ELISPOT) assay in duplicates (if not mentioned otherwise). A total of 2–8 × 10^5 cells per well were incubated with 1 μg ml−1 (class I) or 2.5 μg ml−1 (HLA-DR) single peptides in 96-well plates coated with anti-IFN-γ (clone 1-D1K, 2 μg ml−1, MabTech). PHA (Sigma-Aldrich) served as positive control, irrelevant HLA-matched control peptides as negative control (negative control peptides are listed in Supplementary Table 14). After 22–24 h incubation, spots were revealed with anti-IFN-γ biotinylated detection antibody (clone 7-B6-1, 0.3 μg ml−1, MabTech), ExtrAvidin-alkaline phosphatase (1:1,000 dilution, Sigma-Aldrich) and BCIP/NBT (5-bromo-4-chloro-3-indolyl-phosphate/nitro-blue tetrazolium chloride, Sigma-Aldrich). Spots were counted using an ImmunoSpot S5 analyzer (CTL) and T cell responses were considered positive when mean spot count was at least threefold higher than the mean spot count of the negative control. The intensity of T cell responses is depicted as calculated spot counts, which were calculated as the mean spot count of duplicates normalized to 5 × 10^5 cells minus the normalized mean spot count of the respective negative control. In contrast, the recognition frequency of T cell responses within a donor group indicates the relative number of donors that can recognize the respective peptides or ECs (positive donors/tested donors) (Figs. 2a,b, 4d and 5d,e). 
The frequency (recognition rate) for single donors represents the number of recognized SARS-CoV-2-derived peptides (positive peptides/tested peptides) (Figs. 5a and 6h). For HLA-C*07-restricted peptides, screening in PRE donors was performed using HLA-B*07+ samples due to unavailable HLA-C typing and the known linkage disequilibrium of HLA-B*07 and -C*07 (refs. 53,54).
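The ELISPOT readouts defined above reduce to a few lines of arithmetic. The following Python sketch is illustrative only; the function names are hypothetical, and the handling of a negative control with zero spots is not specified in the text and is not addressed here.

```python
from statistics import mean


def calculated_spot_count(peptide_spots, control_spots, cells_per_well,
                          reference_cells=5e5):
    """Mean spot count normalized to 5 x 10^5 cells, minus normalized control."""
    scale = reference_cells / cells_per_well
    return mean(peptide_spots) * scale - mean(control_spots) * scale


def is_positive(peptide_spots, control_spots, fold=3.0):
    """Positive when the mean spot count is at least threefold the control."""
    return mean(peptide_spots) >= fold * mean(control_spots)


def recognition_rate(flags):
    """Fraction of positive results (positive/tested).

    Pass one flag per donor for the recognition frequency within a group,
    or one flag per peptide for the recognition rate of a single donor.
    """
    return sum(flags) / len(flags)
```

For example, duplicate wells with 40 and 44 spots at 2.5 × 10^5 cells per well and a negative control with 2 and 4 spots give a calculated spot count of 78 per 5 × 10^5 cells and a positive call.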

Intracellular cytokine and cell surface marker staining

Peptide-specific T cells were further characterized by intracellular cytokine and cell surface marker staining. PBMCs were incubated with 10 μg ml−1 of peptide, 10 μg ml−1 brefeldin A (Sigma-Aldrich) and a 1:500 dilution of GolgiStop (BD) for 12–16 h. Staining was performed using Cytofix/Cytoperm solution (BD), APC/Cy7 anti-human CD4 (1:100 dilution, BioLegend), PE/Cy7 anti-human CD8 (1:400 dilution, Beckman Coulter), Pacific blue anti-human TNF (1:120 dilution, BioLegend), FITC anti-human CD107a (1:100 dilution, BioLegend) and PE anti-human IFN-γ monoclonal antibodies (1:200 dilution, BioLegend). PMA (5 μg ml−1) and ionomycin (1 μM, Sigma-Aldrich) served as positive control. Viable cells were determined using Aqua live/dead (1:400 dilution, Invitrogen). All samples were analyzed on a FACS Canto II cytometer (BD) and evaluated using FlowJo software v.10.0.8 (BD). The gating strategy applied for the evaluation of flow cytometry-acquired data is provided in Supplementary Fig. 5.

SARS-CoV-2 IgG ELISA

The 96-well SARS-CoV-2 IgG ELISA assay (EUROIMMUN, 2606A_A_DE_C03, as constituted on 22 April 2020) was performed on an automated BEP 2000 Advance system (Siemens Healthcare Diagnostics) according to the manufacturer’s instructions. The ELISA assay detects anti-SARS-CoV-2 IgG directed against the S1 domain of the viral spike protein and relies on an assay-specific calibrator to report a ratio of specimen absorbance to calibrator absorbance. The final interpretation of positivity is determined by ratio above a threshold value given by the manufacturer: positive (ratio ≥1.1), borderline (ratio 0.8–1.0) or negative (ratio <0.8). Quality control was performed following the manufacturer’s instructions on each day of testing.
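The interpretation rule can be written as a small Python function. Note that the thresholds quoted above leave ratios between 1.0 and 1.1 undefined; this sketch assumes that every ratio from 0.8 up to, but not including, 1.1 is reported as borderline. The function name is hypothetical.

```python
def classify_igg_ratio(ratio):
    """Map a specimen/calibrator absorbance ratio to an interpretation.

    Assumption: ratios in the unspecified gap (1.0 to <1.1) count as borderline.
    """
    if ratio >= 1.1:
        return "positive"
    if ratio >= 0.8:
        return "borderline"
    return "negative"
```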

Elecsys anti-SARS-CoV-2 immunoassay

The Elecsys anti-SARS-CoV-2 assay is an electrogenerated chemiluminescence immunoassay (Roche Diagnostics) and was used according to manufacturer’s instructions (v.1.0, as constituted in May 2020). It is intended for the detection of high-affinity antibodies (including IgG) directed against the nucleocapsid protein of SARS-CoV-2 in human serum. Readout was performed on a Cobas e411 analyzer. Negative results were defined by a cutoff index of <1.0. Quality control was performed following the manufacturer’s instructions on each day of testing.

Generation of expression constructs for the production of viral antigens

The complementary DNAs encoding the nucleocapsid proteins of HCoV-OC43, HCoV-NL63 and HCoV-229E (NCBI gene bank accession numbers YP_009555245.1; YP_003771.1; NP_073556.1) were produced with an N-terminal hexahistidine (His6)-tag by gene synthesis (Thermo Fisher Scientific) and cloned using standard techniques into NdeI/HindIII sites of the bacterial expression vector pRSET2b (Thermo Fisher Scientific).

Protein expression and purification

To express the viral nucleocapsid proteins the respective expression constructs were transformed into Escherichia coli BL21(DE3) cells. Protein expression was induced in 1 l TB medium at an optical density (OD600) of 2.5–3 by addition of 0.2 mM isopropyl-β-d-thiogalactopyranoside for 16 h at 20 °C. Cells were collected by centrifugation (10 min, 6,000g) and pellets were suspended in binding buffer (1× PBS, 0.5 M NaCl, 50 mM imidazole, 2 mM PMSF, 2 mM MgCl2, 150 μg ml−1 lysozyme (Merck) and 625 μg ml−1 DNase I (Applichem)). Cell suspensions were sonicated for 15 min (Bandelin Sonopuls HD70, power MS72/D, cycle 50%) on ice, incubated for 1 h at 4 °C in a rotary shaker and sonicated again. After centrifugation (30 min at 20,000g) urea was added to a final concentration of 6 M to the soluble protein extract. The extract was filtered through a 0.45-μm filter and loaded on a pre-equilibrated 1-ml HisTrapFF column (GE Healthcare). The bound His-tagged nucleocapsid proteins were eluted by a linear gradient (30 ml) ranging from 50 to 500 mM imidazole in elution buffer (1× PBS, pH 7.4, 0.5 M NaCl, 6 M urea). Elution fractions (0.5 ml) containing the His-tagged nucleocapsid proteins were pooled and dialyzed (D-Tube Dialyzer Mega, Novagen) into PBS. All purified proteins were analyzed via standard SDS–PAGE, followed by staining with InstantBlue (Expedeon) and immunoblotting using an anti-His antibody (1:1,000 dilution, QIAGEN) in combination with a donkey anti-mouse antibody labeled with AlexaFluor647 (1:2,000 dilution, Invitrogen) on a Typhoon Trio (GE Healthcare, excitation 633 nm, emission filter settings 670 nm BP 30) to confirm protein integrity.

Preparation of beads for serological multiplex assay

Antigens were covalently immobilized on spectrally distinct populations of carboxylated paramagnetic beads (MagPlex Microspheres, Luminex Corporation) using 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide/sulfo-N-hydroxysuccinimide chemistry. For immobilization, a magnetic particle processor (KingFisher 96, Thermo Fisher Scientific) was used. Bead stocks were vortexed thoroughly and sonicated for 15 s. A 96-deep-well plate and tip comb were blocked with 1.1 ml 0.5% (v/v) Triton X-100 for 10 min. Afterwards, 83 μl of 0.065% (v/v) Triton X-100 and 1 ml bead stock were added to each well. Finally, each well contained 0.005% (v/v) Triton X-100 and 12.5 × 10^7 beads of one single bead population. The beads were washed twice with 500 μl activation buffer (100 mM Na2HPO4, pH 6.2, 0.005% (v/v) Triton X-100) and beads were activated for 20 min in 300 μl activation mix containing 5 mg ml−1 1-ethyl-3-(3-dimethylaminopropyl) carbodiimide and 5 mg ml−1 sulfo-N-hydroxysuccinimide in activation buffer. Following activation, the beads were washed twice with 500 μl coupling buffer (500 mM MES, pH 5.0 + 0.005% (v/v) Triton X-100). Antigens were diluted to 39 μg ml−1 in coupling buffer and incubated with activated beads for 2 h at 21 °C to immobilize antigens on the surface. Antigen-coupled beads were washed twice with 800 µl wash buffer (1× PBS + 0.005% (v/v) Triton X-100) and finally, were resuspended in 1 ml storage buffer (1× PBS + 1% (w/v) BSA + 0.05% (v/v) ProClin). The beads were stored at 4 °C until further use.
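The stated final detergent concentration follows from a simple dilution calculation, sketched here in Python. The sketch assumes that the bead stock itself contributes no Triton X-100 and that the blocking solution was removed beforehand; neither point is stated explicitly in the text. The function name is hypothetical.

```python
def final_concentration(stock_percent, stock_ul, diluent_ul):
    """Concentration after mixing a stock solution with a detergent-free volume."""
    return stock_percent * stock_ul / (stock_ul + diluent_ul)


# 83 ul of 0.065% (v/v) Triton X-100 added to 1 ml (1,000 ul) bead stock
triton_percent = final_concentration(0.065, 83, 1000)
print(f"{triton_percent:.4f}% (v/v)")
```

Under these assumptions the result is approximately 0.005% (v/v), consistent with the value given for each well.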

Bead-based serological multiplex assay

To detect human IgG directed against nucleocapsid proteins from three different coronavirus species (HCoV-229E, HCoV-NL63 and HCoV-OC43), a bead-based multiplex assay was performed. All antigens were immobilized on different bead populations as described above. The individual bead populations were combined in a bead mix. A total of 25 μl of diluted serum sample was added to 25 μl of the bead mix, resulting in a final sample dilution of 1:400, and incubated for 2 h at 21 °C. Unbound antibodies were removed by washing the beads three times with 100 μl wash buffer (1× PBS + 0.05% (v/v) Tween20) per well using a microplate washer (Biotek 405TS, Biotek Instruments). Bound antibodies were detected by incubating the beads with PE-labeled goat-anti-human IgG detection antibodies (Jackson Dianova) at a final concentration of 5 μg ml−1 for 45 min at 21 °C. Measurements were performed on a Luminex FlexMap 3D instrument using Luminex xPONENT Software v.4.3 (sample size, 80 μl; 100 events; gate, 7,500–15,000; reporter gain, standard PMT). Data analysis was performed on mean fluorescence intensity.

Software and statistical analysis

The population coverage of HLA allotypes was calculated by the IEDB population coverage tool (www.iedb.org). Flow cytometric data were analyzed using FlowJo v.10.0.8 (BD). Data are displayed as mean with s.d., box plots as median with 25th and 75th percentiles and min/max whiskers. Continuous data were tested for distribution and individual groups were tested by use of an unpaired Student’s t-test, Mann–Whitney U-test or Kruskal–Wallis test and corrected for multiple comparisons as indicated. Spearman’s rho (ρ) was calculated for correlation between continuous data. A logistic regression model was used to calculate odds ratios and 95% confidence intervals. Factors before the outcome and measured continuous variables were included in the model. Missing data were included in tables and in descriptive analysis. Graphs were plotted using GraphPad Prism v.8.4.0. Statistical analyses were conducted using GraphPad Prism v.8.4.0 and JMP Pro (SAS Institute, v.14.2) software. P values <0.05 were considered statistically significant.
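For readers unfamiliar with Spearman's rho, the statistic is the Pearson correlation of the rank-transformed data. The following pure-Python sketch illustrates the calculation with average ranks for ties; it is not the software used in this study (GraphPad Prism and JMP Pro), and the function names are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation coefficient computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

The function is undefined when all values of one variable are identical, because the rank variance is then zero.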

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.