HLA-A*01:01 allele diminishing in COVID-19 patients population associated with non-structural epitope abundance in CD8+ T-cell repertoire

In mid-2021, the SARS-CoV-2 Delta variant caused the third wave of the COVID-19 pandemic in several countries worldwide. Pivotal studies focused on changes in the efficiency of neutralizing antibodies to the spike protein. However, much less attention was paid to the T-cell response and the presentation of virus peptides by MHC-I molecules. In this study, we compared the features of the HLA-I genotype in symptomatic patients with COVID-19 in the first and third waves of the pandemic. We identified a decrease in the frequency of HLA-A*01:01 carriers in the third wave and demonstrated the unique properties of this allele. In particular, HLA-A*01:01-binding immunoprevalent epitopes are mostly derived from ORF1ab. A set of epitopes from ORF1ab was tested, and their high immunogenicity was confirmed. Moreover, analysis of the results of single-cell phenotyping of T-cells in recovered patients showed that the predominant phenotype in HLA-A*01:01 carriers is central memory T-cells. The predominance of T-lymphocytes of this phenotype may contribute to forming long-term T-cell immunity in carriers of this allele. Our results can be the basis for highly effective vaccines based on ORF1ab peptides.


INTRODUCTION
The Delta variant of SARS-CoV-2 caused the third wave of the COVID-19 pandemic in mid-2021 in many countries, including Russia (Klink et al., 2022). The surge in incidence was associated with the high transmissibility of this strain compared to the Alpha variant (Callaway, 2021). The increase in transmissibility was mainly due to the rise in the number of viral particles exhaled at the peak of infection by an infected person (up to 6 times compared to the Alpha variant) (Earnest et al., 2022). Aside from the increased risk of hospitalization, the Delta variant also increases the risk of death in unvaccinated COVID-19 patients (Bast et al., 2021).
In the Delta variant, 18 protein-level mutations significantly changed the course of the disease (Nersisyan et al., 2022a). Five were located in the spike protein and significantly decreased the effectiveness of humoral immunity formed either naturally (recovered patients) or after vaccination (Bian et al., 2021). In addition, one of these mutations (D614G) increased the affinity of the receptor-binding domain (RBD) of the spike protein to the ACE2 receptor (Ozono et al., 2021). Finally, it should be noted that the rate of virus replication has also changed: the Delta variant replicates two times slower than the Alpha variant in the first 8 h after infection in vitro (Shuai et al., 2022). This circumstance is crucial since the accessory protein ORF8 produced by SARS-CoV-2 can directly interact with major histocompatibility complex class I (MHC-I) molecules and suppress their maturation. As a result, the export of MHC-I molecules to an infected cell's surface almost completely stops after 18 h of ORF8 expression.
MHC-I molecules are among the key mediators of the first steps of a specific immune response to COVID-19. Right after entering a cell, SARS-CoV-2 induces the translation of its proteins. Some of these proteins enter the proteasomes of the infected cell, are cleaved into short peptides (8-12 amino acid residues), and bind to MHC-I molecules. After binding, the complex consisting of the MHC-I molecule and the peptide is transferred to the infected cell's surface, where it can interact with the T-cell receptor (TCR) of naïve, central memory, or effector memory CD8+ T-cell subpopulations. In response to the interaction, the CD8+ T-lymphocyte destroys the infected cell using perforins and serine proteases (Wherry & Ahmed, 2004). The crucial role of long-term CD8+ T-cell activation in the immune response to SARS-CoV-2 has been recently studied in a cohort of patients with different severity of disease (Gattinger et al., 2021; Titov et al., 2022).
There are three main types of MHC-I molecules, encoded by the HLA-A, HLA-B, and HLA-C (Human Leukocyte Antigen) genes. Each gene is present in two variants (alleles), one inherited from each parent. Thousands of HLA alleles exist, and every allele encodes an MHC-I molecule with an individual ability to bind various self- and non-self antigens (Chowell et al., 2019). Individual combinations of HLA class I alleles are strongly associated with the severity of multiple infectious diseases, including malaria (Lima-Junior & Pratt-Riccio, 2016), tuberculosis (Mazzaccaro et al., 1996), HIV (Goulder & Watkins, 2008), and viral hepatitis (Wang et al., 2009). Previously, we demonstrated the important role of MHC-I peptide presentation in the development of a specific immune response to the Wuhan-Hu-1 variant (GISAID accession EPI_ISL_402125).
Over more than two years of the COVID-19 pandemic, the scientific community has accumulated a significant amount of information on the actual epitopes of various SARS-CoV-2 variants (Vita et al., 2019), features of the formation of T-cell memory (Francis et al., 2022), and trends in the frequency of mutations in the virus (Nersisyan et al., 2022b). We pay special attention to the analysis of epitopes derived from the ORF1ab region of SARS-CoV-2. The SARS-CoV-2 genome comprises the large ORF1ab region, four structural protein genes (S, E, M, N), and several ORFs coding for accessory proteins (Arya et al., 2021). The ORF1ab gene encodes several non-structural proteins involved in various molecular processes that are essential for SARS-CoV-2 biology, including genome expression and replication (Badua, Baldo & Medina, 2021). Right after translation, the resulting polypeptide chain is cleaved by viral proteases, yielding 16 non-structural proteins (nsp1-nsp16). The ORF1ab gene is known to be highly conserved compared to other SARS-CoV-2 genomic regions (Jaroszewski et al., 2021; Lubin et al., 2022). ORF1ab was therefore of particular interest to our research, as we considered the action of memory T-cells to be one of the keys to understanding the observed phenomena.
The associations between HLA genotypes and the course of COVID-19 were mainly analyzed using data from the first wave of the pandemic and the initial virus variant (Tavasolian et al., 2021; Lorente et al., 2021; Langton et al., 2021; Venet et al., 2022). Moreover, in studies of the associations between the severity of COVID-19 and HLA-I genotypes, the age of the recovered patients was rarely considered. It should be noted that age is a significant factor in the immune response to COVID-19 (Promislow, 2020; Sanchez-Vazquez et al., 2021; McGroder et al., 2021). In particular, it has been shown that in people over 60 years of age, telomere lengths of naïve T-lymphocytes decrease significantly, which leads to an almost 10-fold drop in their ability to divide upon activation. Also, the T-cell receptor (TCR) repertoire is reduced in older people (Britanova et al., 2014).
Previously, in a cohort of first-wave COVID-19 patients, we showed that the low number of peptides presented by an individual's HLA-I genotype significantly correlates with COVID-19 severity only in patients under the age of 60. In this study, we compared the features of the HLA-I genotypes of COVID-19 patients under 60 between the first and the third waves. In addition, we assessed the influence of mutations in the SARS-CoV-2 variants on the immunogenic epitopes of CD8+ T-lymphocytes.

MATERIALS AND METHODS
Design and participants
Three groups of patients were enrolled in the study. First, the Population group of 428 volunteers was established using electronic HLA genotype records of the Federal Register of Bone Marrow Donors (Pirogov Russian National Research Medical University). The group of 147 patients of the first wave of COVID-19 was formed from May to August 2020 (Wave 1 group). Of these, a subset of 28 COVID-19 convalescent donors of the first wave was enrolled in a prospective trial (CPS group). Finally, the group of patients of the third wave was formed from June to July 2021 (Wave 3 group). The 219 COVID-19 patients in the Wave 3 group were enrolled at the O.M. Filatov City Clinical Hospital (Moscow, Russia).
All patients had at least one positive test result for SARS-CoV-2 by reverse transcription quantitative PCR (RT-qPCR) from nasopharyngeal swabs or bronchoalveolar lavage. Patients with pathologies that led to greater morbidity or who had additional immunosuppression (patients with diabetes, HIV, active cancer treated with chemotherapy, immunodeficiency, autoimmune diseases treated with immunosuppressants, and transplants) were not included in the study. A medical practitioner collected blood (2 mL) in an ethylenediaminetetraacetic acid (EDTA) tube for HLA genotyping.
The severity of the disease was defined according to the classification scheme used by the US National Institutes of Health (COVID-19 Treatment Guidelines Panel, 2022): asymptomatic (lack of symptoms), mild severity (fever, cough, muscle pain, but without respiratory difficulty or abnormal chest imaging), and moderate/severe (lower respiratory disease at computed tomography scan or clinical assessment, oxygen saturation (SaO2) >93% on room air, but lung infiltrates less than 50%).
The study protocol was reviewed and approved by the Local Ethics Committee at the Pirogov Russian National Research Medical University (Meeting No. 194 of March 16, 2020, Protocol No. 2020

Human Leukocyte Antigen Class I Genotyping with targeted next-generation sequencing
Genomic DNA was isolated from frozen anticoagulated whole blood samples using the QIAamp DNA Blood Mini Kit (QIAGEN GmbH, Hilden, Germany). HLA genotyping was performed with the HLA-Expert kit (DNA Technology LLC, Russia) by amplifying exons 2 and 3 of the HLA-A/B/C genes and exon 2 of the HLA-DRB1/3/4/5/DQB1/DPB1 genes. Prepared libraries were run on an Illumina MiSeq sequencer using a standard flow cell with 2 × 250 paired-end sequencing. Reads were analyzed using HLA-Expert Software (DNA Technology LLC, Russia) and the IPD-IMGT/HLA database 3.41.0 (Robinson et al., 2020). The minimum sequencing depth of each exon was 130x.
HLA genotyping of convalescent donors was performed using the One Lambda ALLType kit (Thermo Fisher Scientific, USA), which uses multiplex PCR to amplify full HLA-A/B/C gene sequences and the region from exon 2 to the 3' untranslated region of the HLA-DRB1/3/4/5/DQB1 genes, as described previously (Titov et al., 2022). Prepared libraries were run on an Illumina MiSeq sequencer using a standard flow cell with 2 × 150 paired-end sequencing. Reads were analyzed using One Lambda HLA TypeStream Visual Software, version 2.0.0.27232, and the IPD-IMGT/HLA database 3.39.0.0 (Robinson et al., 2020). Processed genotype data for the Wave 1 and Wave 3 groups are presented in Table S1. Genotype data for the Population group were published previously (Table S1, "Control" sheet).

Peripheral blood mononuclear cell (PBMC) isolation
30 mL of venous blood from CPS group donors was collected into EDTA tubes (Sarstedt, Newton, NC, USA) and subjected to Ficoll (PanEco, Berg am Irchel, Switzerland) density gradient centrifugation (400×g, 30 min). Isolated PBMCs were washed with PBS containing 2 mM EDTA and used for assays or frozen in fetal bovine serum containing 7% DMSO.

T-cell expansion
Full details of the T-cell expansion are provided elsewhere (Titov et al., 2022). Briefly, PBMCs sampled from COVID-19 convalescents were expanded for 12 days with the 94 pre-selected SARS-CoV-2 peptides (final concentration of each = 10 µM). On days 10 and 11, an aliquot of the cell suspension was used for anti-IFN-γ ELISA to identify responses to individual peptides.

Cell stimulation with individual peptides
After 10 days of expansion, an aliquot of the cell culture was washed twice in 1.5 mL of PBS, transferred to AIM-V medium (Thermo Fisher Scientific, Waltham, MA, USA), plated at 1 × 10^5 cells per well in 96-well plates, and incubated overnight (12-16 h) with smaller pools of peptides that together spanned the composition of the initial peptide set. The following day, the culture medium was collected and tested for IFN-γ as described below. If the cells reacted positively to one or several pools, we sampled an aliquot of the cell culture again on days 11-12 and stimulated it as described above, individually with each peptide (2 µM) from those peptide pools. Only peptides predicted to bind to each individual's HLA were tested.

Anti-IFN-γ ELISA
Culture 96-well plates containing cells incubated with peptide pools or individual peptides were centrifuged for 3 min at 700 × g. Then 100 µL of the medium was transferred to ELISA plates, and IFN-γ was detected as described previously (Titov et al., 2022). Optical density (OD) was measured at 450 nm (OD450) on a MultiScan FC instrument (Thermo Fisher Scientific, Waltham, MA, USA).
Test wells (medium from cells incubated with peptides) were compared with negative control wells (cells incubated with the solvent for peptides). Test wells with the ratio OD450_test_well / OD450_negative_control ≥ 1.25 and the difference OD450_test_well − OD450_negative_control ≥ 0.08 were considered positive. Peptides with a ratio between 1.25 and 1.5 were tested again up to three times as biological replicates to confirm their response, and peptides with two or three positive results were considered positive.
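The positivity rule above can be written as a small decision function. The following is a minimal Python sketch, not the authors' analysis code: the function names are hypothetical, and treating the retest window as 1.25 ≤ ratio < 1.5 is an assumption, since the text does not state whether the bounds are inclusive.

```python
def well_is_positive(od_test, od_negative, min_ratio=1.25, min_diff=0.08):
    """Apply both thresholds from the text to a single test well."""
    ratio = od_test / od_negative
    diff = od_test - od_negative
    return ratio >= min_ratio and diff >= min_diff


def is_borderline(od_test, od_negative, low=1.25, high=1.5):
    """Flag wells whose ratio falls in the retest window.

    Assumption: the window is low <= ratio < high; the text does not
    say whether the bounds are inclusive.
    """
    ratio = od_test / od_negative
    return low <= ratio < high


def call_retested_peptide(replicate_calls):
    """Final call for a retested peptide: positive with two or three positive replicates."""
    if len(replicate_calls) > 3:
        raise ValueError("the text describes at most three replicates")
    return sum(bool(call) for call in replicate_calls) >= 2
```

For example, a well with OD450 of 0.13 against a negative control of 0.10 passes the ratio threshold (1.3) but fails the difference threshold (0.03), so it is called negative.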

Peptides
Peptides (at least 95% purity) were synthesized either by Peptide 2.0 Inc. or by the Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS.

Analysis of the binding affinity and immunogenicity of SARS-CoV-2 epitopes
To assess the immunoprevalent epitopes of HLA-A*01:01 and HLA-A*02:01, we queried the Immune Epitope Database (IEDB) (Vita et al., 2019) on May 17, 2022 for epitopes with positive MHC-I binding and positive T-cell assays, using "Severe acute respiratory syndrome-related coronavirus 2" as "Organism". Epitopes with a response rate of more than 50% among carriers of the respective allele were considered immunoprevalent. Mutation analysis was performed using data extracted from the T-cell COVID-19 Atlas (Nersisyan et al., 2022a) (accessed May 30, 2022) and included 16 variant of concern (VOC) strains.
The data were obtained from the supplementary materials of Francis et al. (2022). The experiment conducted by the authors consisted of single-cell RNA sequencing (scRNA-seq) of T-cells bound to DNA-barcoded peptide-HLA tetramers. This approach allowed us to extract information on both TCR sequences and their specific HLA-epitope pairs. Francis and coauthors used a comprehensive library of SARS-CoV-2 epitopes and epitopes derived from SARS-CoV-1, cytomegalovirus, Epstein-Barr virus, and influenza. For the analysis of the distribution of epitopes across the SARS-CoV-2 genome, data file S3 was used. It includes the list of SARS-CoV-2 reactive clonotypes with their specific epitopes and their antigen source. In addition, the list of SARS-CoV-2 epitopes was extracted, and duplicate entries were removed.
Data file S7 was used for the analysis of the T-cell phenotypes. It contains information about the phenotypes of T-cells reacting to particular epitopes. The phenotypes of T-cells were computationally assigned by the authors based on the analysis of differentially expressed genes. The authors note that they identified several distinct cell clusters, but except for naïve cells, central memory cells, and fully activated cytotoxic effectors, all other clusters had mixed properties, making it impossible to determine their phenotypes unambiguously. Therefore, we included only the unambiguously determined cell phenotypes in our analysis: naïve cells, central memory cells, and fully activated cytotoxic effector cells.

Bioinformatics analysis of HLA/peptide interactions
Protein sequences of SARS-CoV-2 variants were obtained from the GISAID portal (Elbe & Buckland-Merrett, 2017). The T-CoV pipeline was executed to analyze HLA/peptide interactions (Nersisyan et al., 2022a; Nersisyan et al., 2022b). At the core of the pipeline, binding affinities for all possible 8- to 12-mers from viral proteins and 169 frequent HLA class I molecules were predicted using NetMHCpan 4.1 (Reynisson et al., 2020). An IC50 below 50 nM was used as the threshold for tight-binding peptides.
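The two mechanical steps of this procedure, enumerating every 8- to 12-mer of a protein and keeping the peptides below the 50 nM threshold, can be illustrated with a short Python sketch. It does not call NetMHCpan or reproduce the T-CoV pipeline: the function names, the toy sequence, and the IC50 values are all made up for illustration.

```python
def enumerate_kmers(protein_seq, k_min=8, k_max=12):
    """Yield (start, peptide) for every k-mer with k_min <= k <= k_max.

    start is the 1-based position of the peptide's first residue.
    """
    for k in range(k_min, k_max + 1):
        for i in range(len(protein_seq) - k + 1):
            yield i + 1, protein_seq[i:i + k]


def tight_binders(predicted_ic50, threshold_nm=50.0):
    """Keep (allele, peptide) pairs whose predicted IC50 is below the threshold."""
    return {pair: ic50 for pair, ic50 in predicted_ic50.items() if ic50 < threshold_nm}


# Toy sequence and hypothetical IC50 values (nM), for illustration only.
toy_protein = "ACDEFGHIKLMN"
peptides = list(enumerate_kmers(toy_protein))
hypothetical_ic50 = {
    ("HLA-A*01:01", "ACDEFGHIK"): 12.5,
    ("HLA-A*01:01", "CDEFGHIKL"): 480.0,
    ("HLA-A*02:01", "DEFGHIKLM"): 49.9,
}
binders = tight_binders(hypothetical_ic50)
```

A protein of length n contributes n − k + 1 peptides for each length k, so the 12-residue toy sequence yields 5 + 4 + 3 + 2 + 1 = 15 candidate peptides.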

Statistical analysis
Allele frequencies in considered cohorts were estimated by dividing the number of occurrences of a given allele in individuals by the doubled total number of individuals (i.e., identical alleles of homozygous individuals were counted as two occurrences). The following functions from the stats library in R were used to conduct statistical testing: fisher.test for Fisher's exact test, wilcox.test for Mann-Whitney U test. In addition, the Benjamini-Hochberg procedure was used to perform multiple testing corrections. Plots were constructed with ggpubr and pheatmap libraries.
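The allele frequency definition and the multiple testing correction can be sketched as follows. This is an illustrative Python version, whereas the analysis itself was run in R; the function names and the example genotypes are hypothetical. The adjustment follows the standard Benjamini-Hochberg step-up formula, which is also what R's p.adjust computes with method = "BH".

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies for one locus.

    genotypes: list of (allele_1, allele_2) tuples, one per individual.
    A homozygous individual contributes two occurrences of the allele.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total_chromosomes = 2 * len(genotypes)
    return {allele: n / total_chromosomes for allele, n in counts.items()}


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical genotypes for three individuals at the HLA-A locus.
example = [("A*01:01", "A*02:01"), ("A*01:01", "A*01:01"), ("A*02:01", "A*03:01")]
freqs = allele_frequencies(example)
```

In the example, A*01:01 occurs three times among six chromosomes, giving a frequency of 0.5.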

RESULTS
Distribution of HLA Class I Alleles in the cohorts of patients of the first and the third waves of COVID-19
We performed HLA class I genotyping of 147 patients who tested positive for COVID-19 during the first wave of COVID-19 in Moscow (from May to August 2020, Wave 1 group). In addition, 219 COVID-19 patients were genotyped from June to July 2021 (Wave 3 group). The demographic and clinical data of these groups are summarized in Table 1. We did not find significant differences between the groups in patient age or gender ratio. The fraction of vaccinated patients (two doses of Sputnik V vaccine) in the Wave 3 group was small (8.7%), lower than the citywide vaccination rate of 15% for June 2021 (Boguslavsky, Sharova & Sharov, 2022). There was a significant increase in the proportion of patients with obstructive pulmonary disease (Fisher's exact test p = 0.01), obesity (Fisher's exact test p = 0.01, OR = 3.8), and hypertension (Fisher's exact test p = 0.003, OR = 2.4) in the third wave of COVID-19. We also assessed the contribution of comorbidities to the risk of death from COVID-19. Interestingly, heart disease (Fisher's exact test p = 5.2e−5, OR = 13) and hypertension (Fisher's exact test p = 0.045, OR = 3.5) were significant death risk factors in the first wave. At the same time, no similar effects were observed for the third wave patients (Table S2).
First, we tested whether the frequency of a single allele can differentiate individuals from three groups: COVID-19 patients of the first wave, COVID-19 patients of the third wave, and the population group. The distribution of major HLA-A, HLA-B, and HLA-C alleles in these three groups is summarized in Fig. 1. Fisher's exact test was used to make formal statistical comparisons. Across all possible group comparisons, only one of the dozen most frequent alleles showed an odds ratio that remained statistically significant after multiple testing correction. Specifically, the frequency of the HLA-A*01:01 allele decreased from 17.3% in the Wave 1 group to 9.8% in the Wave 3 group (Fisher's exact test adj. p = 0.025, OR = 0.5). Some other alleles were differentially enriched only if no multiple testing correction was applied (Table S3).
We hypothesized that the decrease in the frequency of HLA-A*01:01 carriers among patients of the third wave group could be attributed to the characteristics of the T-cell response. To test this hypothesis, computational methods were used to analyze the interactions between the HLA-I molecules and the viral peptides and the effects of mutations in VOCs on these interactions. We also used the results of experimental testing of the immunogenicity of peptides in patients who recovered from COVID-19 in the first wave.

Analysis of the binding affinity and immunogenicity of SARS-CoV-2 epitopes
The nonstructural proteins encoded by the ORF1ab gene are translated first (V'kovski et al., 2021) and undergo proteasomal digestion before structural and accessory proteins (including ORF8, which suppresses MHC-I maturation). Given that, we analyzed the ability of MHC-I molecules encoded by the 12 most common alleles in the European population (Gonzalez-Galarza et al., 2019) (Fig. 2) to bind and present ORF1ab and non-ORF1ab peptides. The ability of HLA-A*01:01 to interact with peptides of both structural and non-structural proteins of the virus was moderate. Namely, it was predicted to interact with an affinity of less than 50 nM (tight binding peptides) with 28 peptides from ORF1ab and 8 peptides from structural and accessory proteins. Another allele, HLA-A*02:01, was the most frequent in the population and had 207 predicted tight binders from ORF1ab and 56 tight binders from the other proteins. We also compared the ability to present viral peptides between genotypes that include HLA-A*01:01 and genotypes that do not include this allele (Fig. S1A). HLA-I genotypes that included HLA-A*01:01 had, on average, fewer high-affinity peptides than those of non-carriers, regardless of wave (Mann-Whitney U test p < 0.02). Of note, carriers of the HLA-A*02:01 allele (Fig. S1B), in contrast to carriers of HLA-A*01:01, had a higher number of high-affinity viral peptides (Mann-Whitney U test p < 3.8e−08).
Next, we analyzed the immunogenic epitopes of SARS-CoV-2 using the data from the IEDB portal for HLA-A*01:01 and HLA-A*02:01 carriers. At the time of the analysis (May 2022), the database contained information on the T-cell responses to 365 viral peptides. Two shared epitopes were found for both alleles: S 136−144 (CNDPFLGVY) and M 170−178 (VATSRTLSY). For the HLA-A*01:01 molecule, the immunogenicity of 50 [...] least 50% of the tested samples). For the HLA-A*01:01 molecule, 10 immunoprevalent epitopes from ORF1ab and only three epitopes not from ORF1ab were identified. For the HLA-A*02:01 allele, 51 immunoprevalent epitopes from ORF1ab and 59 from non-ORF1ab were identified. The ratio of immunoprevalent epitopes from ORF1ab for the HLA-A*01:01 molecule was 3.8 times higher than for the HLA-A*02:01 molecule (Fisher's exact test p-value = 0.04). This result indicates that epitopes from ORF1ab, which is characterized by a low mutation rate, predominate among the immunoprevalent epitopes for the HLA-A*01:01 molecule.
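The comparison of ORF1ab and non-ORF1ab epitope counts (10 vs. 3 for HLA-A*01:01, 51 vs. 59 for HLA-A*02:01) corresponds to a 2 × 2 contingency table. The sketch below is a self-contained Python illustration of a two-sided Fisher's exact test on these counts; it is not the code used in the study, which relied on R's fisher.test. One labeled difference: the function returns the sample odds ratio (ad/bc), whereas fisher.test reports a conditional maximum likelihood estimate, so the two may differ slightly.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value). The p-value sums the
    hypergeometric probabilities of all tables that are no more likely
    than the observed one, with the margins held fixed.
    Note: the odds ratio is undefined (ZeroDivisionError) if b * c == 0.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, row1)

    def prob(x):
        return comb(col1, x) * comb(n - col1, row1 - x) / denom

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    odds_ratio = (a * d) / (b * c)
    return odds_ratio, p_value


# Counts from the text: rows are alleles, columns are ORF1ab / non-ORF1ab.
odds_ratio, p_value = fisher_exact_two_sided(10, 3, 51, 59)
```

The sample odds ratio is (10 × 59) / (3 × 51) ≈ 3.86, which matches the roughly 3.8-fold difference quoted above, and the p-value should land near the reported 0.04.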

Validation of ORF1ab epitopes in a subset of convalescent HLA-A*01:01 and HLA-A*02:01 carriers
To validate the hypothesis that first-wave convalescent carriers of the HLA-A*01:01 allele had a high number of immunogenic epitopes from ORF1ab proteins, we analyzed the T-cell responses of 28 patients with a history of confirmed COVID-19 during the first wave who carried at least one of the two most common alleles in the European population: HLA-A*01:01 (13 patients) and HLA-A*02:01 (15 patients). The individual immunogenicity of 15 validated epitopes from SARS-CoV-2 ORF1ab (Table S4) was tested for each T-cell sample. The peptides from this panel did not induce a T-cell response in patients who had not previously had COVID-19 (Titov et al., 2022). Seven epitopes were tested for HLA-A*01:01 and eight for HLA-A*02:01. Since the time from symptom onset to the first measurement could affect the strength of the T-cell response, we compared these times between the groups of HLA-A*01:01 and HLA-A*02:01 carriers. For HLA-A*01:01 carriers, the median time after symptom onset was 34 days, and for HLA-A*02:01 carriers, it was 41 days; the difference was not significant (Fig. S2). To assess the strength of T-cell immunity formed by ORF1ab epitopes in carriers of the HLA-A*01:01 and HLA-A*02:01 alleles, true positive rates (TPR) for each of the epitopes were evaluated. This indicator reflects the ratio of true positive responses to the expected number of positive responses. TPR values for these alleles differed significantly for epitopes from ORF1ab (Fig. 3, Table S5). Three of the seven HLA-A*01:01 epitopes tested were immunoprevalent (ORF1ab 1637−1646 TTDPSFLGRY, ORF1ab 1636−1646 HTTDPSFLGRY, ORF1ab 1321−1329 PTDNYITTY), but there were no immunoprevalent epitopes among the eight tested for HLA-A*02:01.
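The TPR described above can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the expected number of positive responses is taken to be the number of tested carriers of the restricting allele, the immunoprevalence cutoff is taken as a response rate above 50%, and the counts in the example are hypothetical rather than values from Table S5.

```python
def true_positive_rate(observed_positive, expected_positive):
    """Ratio of observed positive responses to the expected number.

    Assumption: expected_positive is the number of tested carriers of
    the allele that restricts the epitope.
    """
    if expected_positive <= 0:
        raise ValueError("expected_positive must be greater than zero")
    return observed_positive / expected_positive


def is_immunoprevalent(tpr, cutoff=0.5):
    """Assumed rule: an epitope is immunoprevalent if its TPR exceeds the cutoff."""
    return tpr > cutoff


# Hypothetical counts, for illustration only.
tpr_a = true_positive_rate(9, 13)   # 9 responders among 13 carriers
tpr_b = true_positive_rate(3, 15)   # 3 responders among 15 carriers
```

With these made-up counts, the first epitope (TPR ≈ 0.69) would be called immunoprevalent and the second (TPR = 0.2) would not.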

Phenotype analysis of CD8+ T-lymphocytes of convalescent patients
Aside from IEDB and our own epitope validation data, we analyzed ex vivo scRNA-seq data on CD8+ T-cells of convalescent individuals activated by single peptides (Francis et al., 2022). The considered dataset included the set of phenotyped T-cell clones responding to a comprehensive set of SARS-CoV-2-derived epitopes associated with four major HLA alleles. First, we examined the distribution of SARS-CoV-2-derived epitopes that elicited a T-cell response according to the genomic region from which they originated. In concordance with the results mentioned above, the majority of HLA-A*01:01 epitopes were from ORF1ab. The same tendency was not observed for the other alleles (Fig. 4A). Next, we analyzed the phenotypes of responding T-cell clones (Fig. 4B). HLA-A*01:01-associated epitopes elicited responses mainly from the T central memory (Tcm) subpopulation. At the same time, the proportion of Tcm responding cells for the other alleles was significantly smaller (Fisher's exact test pairwise comparisons of HLA-A*01:01 with HLA-A*02:01, HLA-B*07:02, and HLA-A*24:02 result in p-values < 0.02). Together with the observation that most of the known HLA-A*01:01-associated epitopes originated from the conserved ORF1ab region, these results imply that people bearing HLA-A*01:01 may have a higher chance of eliciting a robust immune response upon secondary exposure to SARS-CoV-2 due to pre-existing immune memory.
[...] (Table 2). HLA-A*01:01 had 51 high-affinity peptides from ORF1ab and 13 high-affinity peptides from other proteins (structural and accessory) for the Wuhan-Hu-1 strain. A distinctive feature of this allele was that its relatively high number of high-affinity peptides remained almost unaffected by VOC mutations (Fig. 5). Of the 64 high-affinity HLA-A*01:01 peptides, only one significantly changed its affinity in the 16 VOCs analyzed.
At the same time, the HLA-A*24:02 molecule, which had a similar number of tight binders, had 27 altered peptides from ORF1ab (Fisher's exact test p-value = 3e−09, OR = 0.02). It should be noted that the high-affinity peptides from ORF1ab for all analyzed alleles were less affected by the mutations compared to the rest of the peptides (Mann-Whitney U test p-value = 4.3e−05).

DISCUSSION
We conducted a comparative study of the HLA-I genotypes of symptomatic COVID-19 patients of the pandemic's first and third waves. The genotypes of 147 patients (first wave) and 219 patients (third wave) were studied. We found a significant increase in the proportion of patients with obstructive pulmonary disease, obesity, and hypertension in the third wave of COVID-19. This circumstance may be associated not only with the peculiarities of the Delta variant but also with possible fatigue of the population from complying with anti-epidemic measures, which led to the infection of risk groups that had previously adhered more strictly to social distancing (Chan et al., 2021). We tested whether the frequency of HLA alleles can differentiate individuals from three groups: COVID-19 patients of the first wave, COVID-19 patients of the third wave, and the population group. We found that across all possible group comparisons, only one of the most abundant alleles had a statistically significant odds ratio after multiple testing correction. Namely, the frequency of the HLA-A*01:01 allele in the Wave 3 group fell by nearly half relative to the Wave 1 group (from 17.3% to 9.8%). Previously, the HLA-A*01:01 allele was considered a risk allele for infection and severe course of COVID-19 in the first wave (Pisanti et al., 2020; Ishii, 2020; Shkurnikov et al., 2021; Naemi et al., 2021). At the same time, a protective role of this allele against the development of severe bilateral pneumonia caused by COVID-19 was also reported (Suslova et al., 2022).
We suggested that the decrease in the frequency of HLA-A*01:01 carriers among patients of the third wave could be associated with the characteristics of the previously formed T-cell responses. It is known that up to 50% of cases of COVID-19 are asymptomatic (Byambasuren et al., 2020;Alene et al., 2021) and lead to the formation of neutralizing antibodies and multispecific T-cells (Reynolds et al., 2020). While the efficiency of neutralizing antibodies decreases for VOCs (Dupont et al., 2021), the formed pool of multispecific T-cells in most cases can provide an immune response regardless of viral mutations (Nersisyan et al., 2022b).
To test this hypothesis, we first analyzed the number of viral peptides interacting with the 12 most common alleles in the European population with bioinformatics methods. HLA-A*01:01 allele was one of the alleles with a moderate ability to interact with peptides of both structural and non-structural proteins of the virus. We also found that HLA-A*01:01 carriers have fewer predicted high-affinity peptides compared to the non-carriers, regardless of the wave of COVID-19.
Analysis of the confirmed immunogenic epitopes of SARS-CoV-2 according to the IEDB database showed that 10 immunoprevalent epitopes from ORF1ab and only three not from ORF1ab were identified for HLA-A*01:01. In turn, there were 51 immunoprevalent epitopes from ORF1ab and 59 epitopes not from ORF1ab for the HLA-A*02:01 molecule. Thus, the ratio of immunoprevalent epitopes from ORF1ab for the HLA-A*01:01 molecule was 3.8 times greater than for the HLA-A*02:01 molecule. Epitopes from ORF1ab therefore predominate among the immunoprevalent epitopes for HLA-A*01:01. The stability of this region suggests that they will retain their immunogenicity in new SARS-CoV-2 VOCs. We additionally validated these data by analyzing the T-cell responses of 28 carriers of the HLA-A*01:01 and HLA-A*02:01 alleles. In concordance with the IEDB data, the true positive rates for ORF1ab epitopes were significantly higher for HLA-A*01:01 than for HLA-A*02:01. One possible explanation for these data may be the higher proportion of central memory CD8+ T-cells formed in HLA-A*01:01 carriers compared to HLA-A*02:01, HLA-B*07:02, and HLA-A*24:02 carriers.
Moreover, the ORF1ab-derived HLA-A*01:01-restricted epitope TTDPSFLGRY was shown in multiple studies to induce an exceptionally high frequency of T-cells (magnitude of response) in comparison to multiple other immunoprevalent epitopes (Snyder et al., 2020; Gangaev et al., 2021; Saini et al., 2021; Kared et al., 2021). Furthermore, it was shown that the response of PBMCs of HLA-A*01:01-positive convalescents to epitopes derived from non-structural proteins was higher than to epitopes from structural proteins. This was not observed for HLA-A*01:01-negative convalescents (Titov et al., 2022). This further suggests the high importance of an ORF1ab-focused T-cell response for carriers of HLA-A*01:01.
Not a single immunoprevalent HLA-A*01:01 epitope significantly changed its presentation affinity due to mutations in the current VOCs: Delta G/478K.V1 and Omicron (BA.1-BA.4). At the same time, other alleles such as HLA-A*02:01 were affected by the mutations. Interestingly, HLA-A*01:01 was the only allele with a relatively high number of high-affinity peptides unaffected by the mutations. In agreement with the reports on the lower mutation rate in ORF1ab (Vilar & Isom, 2021), high-affinity peptides from ORF1ab for all analyzed alleles were less affected by mutations compared to those from the rest of the proteins.
In this study, we demonstrate a significant reduction in the frequency of HLA-A*01:01 allele carriers among hospitalized patients during the third wave of the COVID-19 pandemic. The computational prediction of binding affinity between MHC-I molecules and SARS-CoV-2 peptides revealed a possible reason for the decrease in the frequency of the HLA-A*01:01 allele. Different genes, and therefore SARS-CoV-2 proteins, mutate at different rates. The ORF1ab gene is highly conserved compared to other SARS-CoV-2 genes. Carriers of HLA-A*01:01 have a significant number of high-affinity epitopes from this gene. In a cohort of convalescent patients of the first wave of COVID-19, we confirmed the results of computer modeling and demonstrated a higher persistence of immunoprevalent epitopes from the ORF1ab gene of SARS-CoV-2 in HLA-A*01:01 carriers compared to epitopes from this gene in HLA-A*02:01 carriers. Moreover, analysis of the results of single-cell phenotyping of T-cells in recovered patients showed that the predominant phenotype in HLA-A*01:01 carriers is central memory T-cells. The predominance of T-lymphocytes of this phenotype may contribute to forming long-term T-cell immunity in carriers of this allele. Our results can be the basis for highly effective vaccines based on ORF1ab peptides.

ADDITIONAL INFORMATION AND DECLARATIONS

Author Contributions
• Milena Chekova analyzed the data, prepared figures and/or tables, and approved the final draft.
• Fedor Polyakov analyzed the data, prepared figures and/or tables, and approved the final draft.
• Aleksei Titov conceived and designed the experiments, performed the experiments, authored or reviewed drafts of the article, and approved the final draft.
• Dmitriy Doroshenko conceived and designed the experiments, performed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.
• Valery Vechorko conceived and designed the experiments, authored or reviewed drafts of the article, and approved the final draft.
• Alexander Tonevitsky conceived and designed the experiments, authored or reviewed drafts of the article, and approved the final draft.

Human Ethics
The following information was supplied relating to ethical approvals (i.e., approving body and any reference numbers): The study protocol was reviewed and approved by the Local Ethics Committee at the Pirogov Russian National Research Medical University (Meeting No. 194 of March 16, 2020, Protocol No. 2020

Data Availability
The following information was supplied regarding data availability: The raw data is available in the Supplemental Files.

Supplemental Information
Supplemental information for this article can be found online at http://dx.doi.org/10.7717/peerj.14707#supplemental-information.