High Resolution HLA-A, HLA-B, and HLA-C Allele Frequencies in Romanian Hematopoietic Stem Cell Donors

The HLA genes are associated with various autoimmune pathologies, and control of the immune response is also significant in organ and cell transplantation. The aim of the study was to identify the HLA-A, HLA-B, and HLA-C allele frequencies in the analyzed Romanian cohort. We performed HLA typing using next-generation sequencing (NGS) in a Romanian cohort to estimate class I HLA allele frequencies up to a six-digit resolution. A total of 420 voluntary donors from the National Registry of Voluntary Hematopoietic Stem Cell Donors (RNDVCSH) were included in the study for HLA genotyping. Peripheral blood samples were taken and brought to the Fundeni Clinical Institute during 2020–2021. HLA genotyping was performed using the Immucor Mia Fora NGS MFlex kit. A total of 109 different alleles were detected in 420 analyzed samples, of which 31 were for HLA-A, 49 for HLA-B, and 29 for HLA-C. The most frequent HLA-A alleles were HLA-A*02:01:01 (26.11%), HLA-A*01:01:01 (12.5%), HLA-A*24:02:01 (11.67%), HLA-A*03:01:01 (9.72%), HLA-A*11:01:01, and HLA-A*32:01:01 (each with 8.6%). For the HLA-B locus, the most frequent allele was HLA-B*18:01:01 (11.25%), followed by HLA-B*51:01:01 (10.83%) and HLA-B*08:01:01 (7.78%). The most common HLA-C alleles were HLA-C*07:01:01 (17.36%), HLA-C*04:01:01 (13.47%), and HLA-C*12:03:01 (10.69%). Follow-up studies are ongoing to confirm the detected results.


Introduction
The major histocompatibility complex (MHC) has garnered significant scientific attention in recent decades due to multiple studies conducted on the HLA (human leukocyte antigen) genes, which are part of the MHC. The HLA system is one of the most polymorphic genetic systems found in the human genome [1].
In anthropological research, as well as in the field of organ and stem cell transplantation, precise HLA allele identification is crucial [1][2][3][4][5]. Next-generation sequencing (NGS) HLA genotyping is required due to the increasing number of HLA alleles discovered through multiple studies [6,7]. Resolving allelic ambiguities and establishing updated allele frequencies are two benefits of using NGS for HLA typing [6][8][9][10][11]. These benefits will help with more accurate and comprehensive applications of HLA types in the fields of research and clinical medicine [11][12][13][14][15].
Very few studies have analyzed HLA allele frequencies in the Romanian population. Furthermore, because of the low resolution or insufficient locus description in these studies, detailed HLA information was not presented. A precise and comprehensive description of the HLA allele distribution, based on high-resolution HLA typing, is therefore needed.
While previous research has examined the distribution of HLA alleles in the Romanian population, the objective of this study was to use next-generation sequencing (NGS) for high-resolution HLA typing (three-field) and to ascertain the HLA-A, -B, and -C allele frequencies in the Romanian population.
The aim of the current study is to identify the frequencies of the various HLA-A, HLA-B, and HLA-C alleles for the analyzed Romanian cohort.

Allele Frequencies
A total number of 109 different alleles were detected in the 420 analyzed samples, out of which 31 were for HLA-A, 49 for HLA-B, and 29 for HLA-C.
For HLA-A, the eight most frequent alleles together account for 80% of all alleles, each with a frequency of approximately 3% or higher.
Seven of the top eight alleles showed frequencies of 5% or higher, together making up more than three-quarters (approximately 77%) of all detected alleles. Three alleles had frequencies higher than 10%, together accounting for more than half (50.28%) of all counted alleles (362/720 alleles), while the fourth most frequent allele had a frequency close to 10% (9.72%).
Of the remaining 23 lower-frequency HLA-A alleles, 15 were each detected at a frequency below 1%, together accounting for 5.73% of all counted alleles.
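The percentages in this section follow from direct counting: each typed allele is counted once, and the count is divided by the total number of alleles at the locus (720 in the text). The Python sketch below illustrates that calculation under stated assumptions; the function names and the three toy genotypes are hypothetical and are not cohort data. Only the final figure (362 of 720) is taken from the text.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency in percent} by direct counting.

    `genotypes` is a list of (allele_1, allele_2) pairs for one locus;
    a homozygous donor contributes the same allele twice.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: 100.0 * n / total for allele, n in counts.items()}


def percent(count, total):
    """Share of `count` alleles out of `total`, in percent, rounded to 2 decimals."""
    return round(100.0 * count / total, 2)


# Hypothetical toy genotypes (NOT cohort data): three donors, six alleles.
toy = [
    ("A*02:01:01", "A*01:01:01"),
    ("A*02:01:01", "A*02:01:01"),
    ("A*24:02:01", "A*02:01:01"),
]
toy_freqs = allele_frequencies(toy)

# Figure quoted in the text: the three most frequent alleles cover 362 of 720 alleles.
top_three_share = percent(362, 720)
print(toy_freqs)
print(top_three_share)
```

The same `percent` helper can be used to re-check any count and percentage pair quoted in the results.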

HLA-B Alleles
HLA-B genotyping results are shown in Table 2. A total of 49 different HLA-B alleles were identified, the most frequent being the B*18:01:01 variant (allele frequency 11.25%).
A total of 45 alleles (approximately 92% of all observed HLA-B alleles) had frequencies lower than 5%, with a combined frequency of approximately 65%.
Of these alleles, 21 were rare, with frequencies lower than 1% and a combined frequency of approximately 9% (67 of the 720 alleles).
Five of the rare variants were observed in a single individual each (0.14%), while another six were detected in two persons each (0.28%) and three were detected in three individuals (0.42%).
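The results group alleles by the frequency thresholds of 1%, 5%, and 10%. As a rough illustration, the sketch below converts an allele count into a percentage and assigns it to one of those bands. The band labels and function names are assumptions made for this example; the total of 720 alleles per locus is the figure quoted in the text.

```python
TOTAL_ALLELES = 720  # total allele count per locus, as quoted in the text


def count_to_percent(count, total=TOTAL_ALLELES):
    """Allele frequency in percent, rounded to 2 decimals."""
    return round(100.0 * count / total, 2)


def frequency_band(count, total=TOTAL_ALLELES):
    """Assign an allele to a band using the 1%, 5%, and 10% thresholds.

    The band labels are illustrative names, not terms defined by the study.
    """
    af = 100.0 * count / total
    if af < 1.0:
        return "rare (<1%)"
    if af < 5.0:
        return "low (1% to <5%)"
    if af <= 10.0:
        return "common (5% to 10%)"
    return "top (>10%)"


# Alleles seen once, twice, or three times among 720 alleles.
singletons_to_triples = {n: count_to_percent(n) for n in (1, 2, 3)}
print(singletons_to_triples)
print(frequency_band(1), "|", frequency_band(8))
```

With 720 alleles, one, two, and three observations correspond to the 0.14%, 0.28%, and 0.42% reported for the rare HLA-B variants.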

HLA-C Alleles
The HLA-C alleles identified in our study are shown in Table 3. A total of 29 HLA-C variants were detected. For two HLA-C alleles (C*12:12 and C*15:72), the NGS results were reported at four digits only, possibly because the NGS reference library for these variants is not yet complete.
Eight HLA-C alleles had frequencies of 5% or higher (Table 3). These eight most frequent variants add up to 532 of the 720 detected HLA-C alleles, a total of 75.25%. Of these variants, three had frequencies higher than 10%, while the other five each accounted for between 5% and 10% of all identified HLA-C alleles (Table 3).
The top three HLA-C alleles, with frequencies higher than 10%, sum to more than 41% of all the detected HLA-C variants, with the other 26 alleles having a combined frequency of approximately 59% (Table 3).
Apart from these three variants with frequencies of more than 10%, the C*06:02:01 variant also presented a high allele frequency at 9.44%, combining for a total of approximately 51% with the previously mentioned top three alleles.
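The combined frequencies quoted above are running sums of the individual allele frequencies. A minimal sketch of that arithmetic, using the four HLA-C frequencies reported in the text (the function name is an assumption for illustration):

```python
# Frequencies (percent) of the four most frequent HLA-C alleles, as reported in the text.
top_hla_c = [
    ("C*07:01:01", 17.36),
    ("C*04:01:01", 13.47),
    ("C*12:03:01", 10.69),
    ("C*06:02:01", 9.44),
]


def cumulative_frequencies(alleles):
    """Return [(allele, running total in percent)] in the given order."""
    running = 0.0
    out = []
    for name, af in alleles:
        running += af
        # Round to 2 decimals to match the precision used in the text.
        out.append((name, round(running, 2)))
    return out


cumulative = cumulative_frequencies(top_hla_c)
for name, total in cumulative:
    print(f"{name}: {total}%")
```

The third running total (41.52%) corresponds to the "more than 41%" for the top three alleles, and the fourth (50.96%) to the "approximately 51%" once C*06:02:01 is included.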
A total of 21 HLA-C variants had frequencies lower than 5%, adding up to 178 of the 720 analyzed alleles, with a combined frequency of approximately 25%. Of these, 12 were rare variants (frequencies < 1%).
In total, these 12 rare HLA-C variants had a combined frequency of approximately 5.3% (38 of the 720 tested alleles). Five of these variants were identified in a single individual each (0.13%), while the other rare alleles showed frequencies of 0.41-0.97%.
In conclusion, the frequencies of all detected top alleles (HLA-A, HLA-B, and HLA-C variants with AF > 10%) can be viewed in Figure 1.

Discussion
The current study aimed to identify updated frequencies of HLA alleles at the level of six digits of resolution.
The obtained data can be used as additional information for identifying cases in which the same four-digit or two-digit HLA types present different characteristics. This information is useful in studies of HLA alleles associated with certain pathologies, analyses of adverse drug reactions, immunological interaction studies, anthropological genetics, and studies related to the Romanian population.
The immunogenetic profile of the studied Romanian cohort in terms of allele frequencies has been characterized [5]. Thus, the most frequent HLA alleles were the following:

HLA-A Alleles
As in the current research, the most frequent HLA-A allele in previously studied European populations (including past Romanian studies) was HLA-A*02, with one exception: the Norway Sami population, where the most prevalent variant is HLA-A*03, the fourth most common in Romanians.
Most of these analyses were low resolution (France, England, Albania, Austria, Bosnia and Herzegovina, Croatia, Norway, Romania, Serbia, Slovakia, Sweden, and Switzerland). In countries with four-digit results, the HLA-A*02:01 allele was the most frequent (Belgium, Czech Republic, Finland, Germany, Greece, Italy, the Netherlands, Poland, Portugal, Kosovo, and Spain), while just one study (Bulgaria) was performed at high resolution, showing the same most frequent HLA-A allele as found in our Romanian cohort (HLA-A*02:01:01) [12][13][14][26].
Comparable findings can be seen for the second most frequent allele in our cohort, HLA-A*01:01:01 (AF 12.5%). Frequencies in this range were found, again, in populations such as the Russian Belgorod and Vologda Region populations (AFs of 9.5% and 10.5%), Brazilian Caucasians (AF approximately 10%), the Spanish Canary Islands population (AF 10.5%), and Polish or USA Caucasians (both with AFs of approximately 14%), but also in other populations, such as Moroccan or South African Indian (AFs of 13-14%) [15].
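Comparing six-digit results with four-digit or two-digit studies requires reducing allele names to a common resolution and summing the frequencies of alleles that collapse to the same name. The sketch below shows one way this could be done; the function names are assumptions, the `HLA-A*02:05:01` entry and its frequency are hypothetical values added only to show the merging step, and expression suffixes (such as a trailing `N`) are not handled.

```python
def truncate_allele(name, fields):
    """Reduce an HLA allele name to its first `fields` colon-separated fields.

    "HLA-A*02:01:01" with fields=2 gives "HLA-A*02:01" (four-digit),
    and with fields=1 gives "HLA-A*02" (two-digit).
    """
    if fields < 1:
        raise ValueError("fields must be >= 1")
    locus, sep, rest = name.partition("*")
    if not sep:
        raise ValueError(f"not an HLA allele name: {name!r}")
    return locus + "*" + ":".join(rest.split(":")[:fields])


def collapse_frequencies(freqs, fields):
    """Sum the frequencies of alleles that share the same truncated name."""
    collapsed = {}
    for name, af in freqs.items():
        key = truncate_allele(name, fields)
        collapsed[key] = collapsed.get(key, 0.0) + af
    return collapsed


six_digit = {
    "HLA-A*02:01:01": 26.11,  # reported in the text
    "HLA-A*01:01:01": 12.5,   # reported in the text
    "HLA-A*02:05:01": 1.0,    # HYPOTHETICAL entry, for illustration only
}
two_digit = collapse_frequencies(six_digit, fields=1)
print(truncate_allele("HLA-A*02:01:01", 2))
print(two_digit)
```

Collapsing in this direction is always possible, whereas a low-resolution result cannot be expanded back to a specific six-digit allele, which is why low-resolution studies limit the comparison.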
The HLA-A*03:01 allele is considered a risk factor for multiple sclerosis, playing an important role in the initiation phase of the disease [28,29], while the HLA-A*26:01 allele is associated with an increased predisposition for Behcet disease (BD) and the HLA-A*29:02 is involved in a higher susceptibility for autoimmune uveitis [28][29][30][31][32][33].
In our study group, the frequency of HLA-A*03:01:01 is one of the highest detected, close to 10% (9.72%), while the HLA-A*29:02:01 and HLA-A*26:01:01 variants are less frequent, being identified in 2.22% and 4.58%, respectively, of all analyzed individuals. These findings, if confirmed population-wide through further and larger studies on Romanian cohorts, would indicate an approximately 10% population-specific susceptibility for multiple sclerosis. This might support screening programs that include genetic testing for the HLA-A*03:01:01 variant.

HLA-B Alleles
The HLA-B allele frequencies vary amongst the different European populations. The most prevalent alleles observed in previous studies were the HLA-B*07 variant (in France, Austria, Belgium, Finland, Germany, the Netherlands, Norway Sami population, Norway, Poland, Sweden, and Switzerland), the HLA-B*51 variant (Albania, Bulgaria, Greece, Italy, Portugal, and Kosovo), the HLA-B*44 allele (in England, Bosnia and Herzegovina, the Czech Republic, Slovakia, and Spain), and the HLA-B*35 variant (in Croatia and Serbia and a previous low-resolution HLA study on a Romanian cohort) [14].
Comparably high frequencies of the HLA-B*18:01:01 variant can be observed in certain populations that also showed HLA-A allele frequencies similar to those in the current research, such as the Russian Belgorod and Novgorod Region (AF 12.7% and 6.8%), Polish (7.3%), or South African mixed ancestry (7%) populations [15].
For the other frequent variants reported in the different European populations, the frequencies vary for Romanians, but that may be caused by the low-resolution results not being able to reveal the exact HLA allele.
Even though the HLA-B*18:01:01 is the most frequent standalone allele, no other HLA-B*18 variants were detected.
The HLA-B*07:02 variant is involved in viral clearance and plays an important role in the development of antitumor cells, being associated with tumor regression [33][34][35][36].
The HLA-B*27:05 allele plays an important role in long-term protection against viral infection. In addition, when HLA-B*27:05 is in complex with certain peptides, KIR3DL1 is unable to recognize this allele, which results in increased activation of NK cells during viral infection [37][38][39][40].
These findings and their immunological and pathological correlations still need to be verified for the Romanian population. The implications of the HLA-B*51 allele in the development of Behcet disease are yet to be fully understood, the involvement of the HLA-B*18:01 variant in the antitumor immune response requires much more research, and no correlations have yet been determined for HLA-B*35 alleles.

HLA-C Alleles
The most frequently detected HLA-C alleles in previously analyzed European populations are HLA-C*07 (in most of the countries: England, Albania, Belgium, Croatia, Finland, Germany, Italy, the Netherlands, Norway and the Norway Sami population, Poland, Serbia, Sweden, and Switzerland), while the HLA-C*04 variant is the most common in Greece, Kosovo, and Spain, and the HLA-C*06 (C*06:02) just for the Czech population [14].
The frequency of the HLA-C*07:01:01 variant identified in our cohort is one of the highest determined in any previous study worldwide. If the results of the current research are validated through further studies, this frequency (17.36%) would be the fourth highest globally, being surpassed only by the Mexico Hidalgo Mezquital Valley/Otomi population (AF 29%), the USA San Francisco Caucasian population (AF 21%), and the Northern Ireland population (AF 19%). AFs comparable to that detected in our cohort can be observed in populations such as England Blood Donors of Mixed Ethnicity (16.2%), USA Eastern European and Italian Ancestry, Spanish Canary Islands, or Morocco Nador Metalsa populations (all with AFs of approximately 16%), but also in the Polish population (AF approximately 14%) [15].
HLA-C*12:03:01 (10.69%), like HLA-C*07:01:01, was identified at a high frequency in our cohort compared to other analyzed populations. This result would place the frequency observed in the current Romanian study group for the HLA-C*12:03:01 allele as the third highest worldwide, with only the Russian Belgorod region (14.4%) and the China Jingpo Minority (12%) populations revealing greater AFs. The Polish (10.4%) and the Russian Federation Vologda Region (9%) populations also show high frequencies [15].
Even though HLA-C*04:01:01 is the second most common allele in our cohort, it is the only HLA-C*04 allele identified, while the HLA-C*06:02 variant, which is the most frequent in the Czech population, also has a high frequency in Romanians, being the fourth most observed HLA-C allele (HLA-C*06:02:01, 9.44%).
HLA-C alleles have many immune functions, including antiviral immunity and an important role in reproduction.
The involvement of HLA-C alleles, in association with KIR receptor variants, in infertility or fertility problems has become a reason for genetic testing in patient couples, with recommendations from geneticists, gynecologists, and fertility or IVF specialists. In this respect, knowing the frequencies of the various HLA-C/KIR constellations for the Romanian population has already become a subject of interest in these medical fields and patient groups. Confirming the results of the current and of previous Romanian studies on larger cohorts would represent an important step towards this goal.
The almost 10% predisposition for psoriasis identified in our cohort through the presence of the HLA-C*06:02:01 allele would single out this HLA-C*06:02 variant for genetic testing in individuals with a diagnosis of, a family history of, or clinical indications for this disorder.

Materials and Methods
A total of 420 voluntary donors (Romanians/Caucasians, 61% male, age 43.3 ± 7.7 years old) registered in the National Registry of Voluntary Hematopoietic Stem Cell Donors (RNDVCSH) were included in the study for HLA typing.
Healthy donors who voluntarily registered for stem cell donation in the RNDVCSH between 2020 and 2021 were included in the current research, which was conducted at the Fundeni Clinical Institute in the Medical Analysis Laboratory 2. Written consent from voluntary donors was sought for the processing of evidence and personal data in compliance with the Declaration of Helsinki. The Fundeni Clinical Institute's Ethics Committee examined and approved this study.
The research team that worked on the project extracted, processed, and statistically examined medical data from each donor's medical file. Participants in our study were willing donors who did not have any underlying medical conditions. In accordance with the national protocol, we examined each donor's medical history from their personal medical record. We also examined their biochemical parameters and viral status after donating blood.
The DNA used in this study was obtained from peripheral blood collected in vacutainers containing the anticoagulant EDTA (ethylenediaminetetraacetic acid). DNA was extracted manually using the QIAamp® DNA Blood Mini extraction kit (QIAGEN, Hilden, Germany). This quick and simple method is based on silica (silicon dioxide) membranes and allows for the purification of total DNA (genomic, mitochondrial) from bone marrow, cell cultures, leukocyte concentrate, and whole blood.
Each blood sample was thoroughly vortexed, combined with lysis buffer and protease, and then heated to 56 °C for 10 min in a thermoblock to promote quick lysis. Following the breakdown of the cell membranes, the DNA was free in the lysate, and 80% alcohol was added to cause it to precipitate. The lysate was then placed into tubes containing silica membranes, to which DNA adheres because the two materials carry different electrical charges. The DNA was purified by several washings and then separated from the silica membrane by adding the elution buffer, which neutralizes the electrical charges. Before being used, DNA was aliquoted into tubes and kept at −18 °C. An IMPLEN nanophotometer was used to measure the concentration and purity of DNA; the criteria were an A260/A280 ratio between 1.7 and 1.9, which certifies solution purity, and a DNA concentration > 20 ng/µL.
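The purity and concentration criteria above (A260/A280 ratio between 1.7 and 1.9, concentration > 20 ng/µL) can be expressed as a simple check. The sketch below is an illustration only: the function name, the inclusive treatment of the ratio bounds, and the example readings are assumptions, not part of the laboratory protocol.

```python
def passes_dna_qc(a260, a280, conc_ng_per_ul,
                  ratio_range=(1.7, 1.9), min_conc=20.0):
    """Check a DNA sample against the purity and concentration criteria.

    Assumes the ratio bounds are inclusive and the concentration
    threshold is strict (> 20 ng/uL), as worded in the text.
    """
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    ratio = a260 / a280
    low, high = ratio_range
    return low <= ratio <= high and conc_ng_per_ul > min_conc


# Hypothetical absorbance readings, for illustration only.
good_sample = passes_dna_qc(a260=0.90, a280=0.50, conc_ng_per_ul=35.0)  # ratio 1.8
impure_sample = passes_dna_qc(a260=0.75, a280=0.50, conc_ng_per_ul=35.0)  # ratio 1.5
dilute_sample = passes_dna_qc(a260=0.90, a280=0.50, conc_ng_per_ul=20.0)
print(good_sample, impure_sample, dilute_sample)
```

A sample must satisfy both criteria to pass, so a pure but dilute sample fails the check.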
Genotyping of HLA class I alleles (HLA-A, -B, -C) at 6 digits of resolution was performed using the Mia Fora NGS MFlex kit from Immucor.
The MIA FORA NGS MFlex HLA kit (MIA FORA™ NGS MFlex) from Immucor (Werfen, France) was used to conduct HLA genotyping using next-generation methods. The procedure has three key steps: long-range PCR, library building, and sequencing with data analysis. In the long-range PCR step, we amplified the most relevant HLA genes. During library construction, adenine nucleotides are added to the ends of each fragment, which improves the ligation of the unique index adapters. To facilitate simple identification during sequencing, every fragment is barcoded. DNA fragments of 500-900 base pairs were selected using the Pippin Prep system, and a final amplification of the size-selected library was then necessary to guarantee sufficient cluster generation. The concentration was measured with a Qubit® fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) prior to final library preparation and adjusted in accordance with the protocol.
Once prepared using Illumina reagents, the NGS sequencing library was loaded onto a MiniSeq sequencer manufactured by Illumina (San Diego, CA, USA). Once sequencing was finished, data were interpreted using the MIA FORA NGS FLEX program (Sirona Genomics, Inc., Mountain View, CA, USA) and two reference databases, Sirona Genomics and IMGT.

Conclusions
Although the characteristics of HLA class I and II alleles and haplotypes in the Romanian donors are similar to most previously studied European populations, they still retain unique characteristics. Data from this study will be useful in anthropology, immune-mediated diseases, transplantation therapy, and drug hypersensitivity. As the current research was limited to 420 individuals, and the MHC region is the most polymorphic of the human genome and can be highly affected by various population parameters, larger studies are needed to confirm these findings and extend them to the entire Romanian population.

Figure 1. Top HLA-A, -B, and -C alleles identified in the analyzed Romanian cohort.

Table 3. HLA-C alleles identified through the current study.