Selective HLA restriction enables the evaluation and interpretation of immunogenic breadth at levels comparable to those observed with a broader HLA distribution

Existing approaches to identifying predicted T-cell epitopes have traditionally used either 2-digit HLA superfamilies or, more commonly, autologous HLA alleles to guide the predictions. However, these criteria may not reflect the HLA representation within a given target population. Here we propose a modification to the concept of using autologous HLA, whereby subsets of individuals are selected for their specific HLA allele profiles and the representation they provide within a given population. This selective approach to HLA selection, together with the linkages to specific individuals, may enable the design of more targeted experimental strategies.


1.1 Restricted sub-populations represent the HLA diversity of larger cohorts
Assessment of individuals' HLA diversity within populations has traditionally been interrogated at two distinct levels: comparison of HLA digits and comparison of HLA haplotype sequences. Both approaches have appealing characteristics, including intrinsic simplicity and direct linkage to the individuals. However, both also have flaws. At the four-digit level, two individuals may have similar allele phenotypes (for example HLA-B*57:02 and HLA-B*57:03) but very different peptide recognition patterns, and therefore represent high epitope presentation diversity [8]. At the level of MHC amino acid sequence diversity the reverse applies: two HLA alleles may have very divergent amino acid sequences and yet very similar binding properties, and therefore represent low epitope presentation diversity [9].
An alternative metric of population HLA diversity, one that may better reflect the complexity of HLA-restricted epitope presentation across individuals, is to use an epitope prediction algorithm. With these algorithms it is possible to generate predicted HLA-restricted epitopes and then assess the affinity and stability of each predicted epitope for that HLA [10]. These values can then be represented as a relative measurement of the "distance" between HLA alleles.
As a theoretical proof of concept for this technique to characterize HLA diversity, a subset of 13 volunteers enrolled within IAVI Protocol C, a longitudinal prospective study of 613 HIV-positive volunteers [11,12], was evaluated for its representative frequency and distribution against the full cohort by 4-digit characterization and HLA binding profile. The data indicated that, at the level of allele frequency, these volunteers represented >80% of total HLA-A, -B and -C frequency [13]. An HLA binding profile was then computed for each allele by predicting the binding affinity for 9mer HIV gag peptides against the binding profile of the total HLA allele distribution from within Protocol C.
From these profiles, the overall binding characteristics of each volunteer based on their HLA restriction were compiled. By evaluating the different affinities calculated by the algorithm, it is possible to assign a distance between each allele and each volunteer based on their predicted ability to iteratively bind gag peptides. The distances are then used to cluster alleles/volunteers and visualize HLA diversity [13]. Utilizing these concepts, it is then possible to identify, within a population, individuals with different allele frequencies and distributions that can represent the total HLA diversity within the same population.
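The distance idea described above can be illustrated with a minimal sketch. It is not the algorithm used for Protocol C: the IC50 values are invented, the log-scaled score and the Euclidean distance are assumptions chosen for simplicity, and the rule that a volunteer's profile is the best score among their alleles for each peptide is likewise hypothetical.

```python
import math
from itertools import combinations

# Invented predicted binding affinities (IC50, nM) of each allele for the same
# ordered list of four 9mer peptides. Real values would come from a predictor.
AFFINITY_NM = {
    "A*02:01": [25.0, 4000.0, 310.0, 12000.0],
    "A*02:05": [40.0, 3500.0, 280.0, 15000.0],
    "B*57:03": [9000.0, 30.0, 20000.0, 45.0],
}


def binding_profile(ic50_values, max_nm=50000.0):
    """Map IC50 values to 0..1 scores (assumed transform: 1 - log(IC50)/log(max_nm))."""
    return [min(1.0, max(0.0, 1.0 - math.log(v) / math.log(max_nm)))
            for v in ic50_values]


def allele_distance(a, b):
    """Euclidean distance between the binding profiles of two alleles."""
    return math.dist(binding_profile(AFFINITY_NM[a]),
                     binding_profile(AFFINITY_NM[b]))


def volunteer_profile(alleles):
    """Per-peptide best score across a volunteer's alleles (an assumed rule)."""
    profiles = [binding_profile(AFFINITY_NM[a]) for a in alleles]
    return [max(scores) for scores in zip(*profiles)]


def volunteer_distance(alleles_x, alleles_y):
    """Euclidean distance between two volunteers' combined profiles."""
    return math.dist(volunteer_profile(alleles_x), volunteer_profile(alleles_y))


def distance_matrix(alleles):
    """Pairwise allele distances as a dict keyed by (allele, allele)."""
    return {(a, b): allele_distance(a, b) for a, b in combinations(alleles, 2)}
```

A matrix produced by `distance_matrix(list(AFFINITY_NM))` could then be passed to any standard hierarchical clustering routine to group alleles or volunteers. In this toy data the two A*02 alleles sit close together and B*57:03 sits far from both, which is the kind of structure the clustering is meant to expose.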

Significance Statement
In this viewpoint article we propose the conceptual hypothesis that re-evaluating the selection of HLA alleles and the interpretation of empirical immunogenic breadth may provide an additional perspective from which to design and interpret experimental data. The use of predictive algorithms for identifying potential T-cell epitopes requires the selection of HLA alleles against which the protein sequences are to be considered. This selection would typically be guided by the haplotypes of specific individuals or by the use of superfamilies to increase HLA coverage. HLA diversity has traditionally been evaluated through either HLA digit resolution or haplotype sequences. We propose that adding an evaluation of the binding potential of HLA haplotypes to a reference set of epitopes, and combining all three metrics, may provide a more comprehensive measurement. Secondly, we propose that any measurement of the breadth of immunogenic responses should consider immunogenic breadth at the broader population level as well as in specific individuals.

Re-defining experimental immunogenic breadth
A widely used definition of T-cell restricted immunogenic breadth is that the greater the number of positive responses to discrete linear peptide epitopes, the greater the breadth of the adaptive response. This paradigm functions effectively within the restricted boundaries of an experimental approach, for instance ELISPOT, but struggles once the complexities of multiple individuals tested against multiple sources of potential experimental target inputs are included, as is the case when analyzing specific cohorts or populations. Understanding immunogenic breadth at the population level requires linking the experimental target inputs of the immunoassay to the population from which they are derived. How this linkage is defined is not always straightforward, although it has been made for some disease models [14] and techniques [15]. Our interest is in designing and developing an HIV CD8 T-cell restricted candidate vaccine that accounts for the relationships and linkages between an individual's immune repertoire and the prevalent HIV transmission sequences as key parameters for ensuring immunogenic breadth.
Traditional assessments of HIV sequence diversity rely on sequence clustering to define breadth, and vaccine strategies utilizing this concept are advancing to clinical evaluation [16]. These approaches, however, make some restrictive assumptions that may undermine their ability to appropriately reflect breadth, namely:
• They assume that areas of sequence conservation are equally represented in the context of HLA-restricted epitope presentation
• They assume that sequence clustering, as a measurement of diversity, accurately represents the immune profile within individuals
• They assume that an empirical determination of immunogenic breadth will capture any linkages of the experimental target input to the population, so that the greater the number of responses the greater the "breadth"
• They equally weight the sequence inputs and make no allowance for the possibility that any sequence may fail to represent the total antigenic breadth

FIGURE 1 Scatter plots comparing epitope to sequence distance. (A) Distance comparing random 9mers to all 9mers. (B) Distance comparing epitope 9mers to all 9mers. (C) Distance comparing epitope 9mers to random 9mers

Points that lay on or near the dashed line were similarly distant by the two metrics, while points above or below the line indicate a difference in the metrics.
As expected, a distance based on all 9mers is similar to a distance based on 77 random 9mers (Figure 1A). However, a distance based on the 77 predicted epitopes was systematically shorter than a distance based on all 9mers or on 77 random 9mers (Figure 1B,C). This suggests that sequences are more similar from the perspective of HLA recognition, and potentially from the perspective of the CD8+ T cells. By comparing the epitope-based distance to 100 different metrics, each based on 77 random 9mers, we can compute a p-value for this effect.
Out of 100 random sets of 9mers, none produced distances that were shorter on average than the epitope-based distance (p < 0.01).
This result implies that mutations are generally more common in non-epitopes, that is, that site-wise entropy should be higher in regions that are not predicted epitopes, and that such regions should therefore be excluded when determining immunogenic breadth.
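The comparison against random 9mer sets, and the related site-wise entropy check, can be sketched as follows. The text does not specify the distance metric, so the one used here (the fraction of selected 9mer windows at which two aligned sequences differ) is an assumption, as are all function names. The add-one correction in the p-value is a common conservative convention, not necessarily the calculation used for Figure 1. Under that convention, 0 of 100 random sets gives 1/101, which is consistent with the reported p < 0.01.

```python
import math
import random
from collections import Counter
from itertools import combinations

K = 9  # peptide length used in the text


def kmer_distance(seq_a, seq_b, starts, k=K):
    """Fraction of the chosen k-mer start positions where two aligned sequences differ."""
    return sum(seq_a[s:s + k] != seq_b[s:s + k] for s in starts) / len(starts)


def mean_pairwise_distance(seqs, starts, k=K):
    """Average kmer_distance over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(kmer_distance(a, b, starts, k) for a, b in pairs) / len(pairs)


def empirical_p_value(seqs, epitope_starts, n_random=100, k=K, seed=0):
    """Share of random start-position sets whose distance is no longer than
    the epitope-based distance, with the add-one correction (r + 1) / (n + 1)."""
    rng = random.Random(seed)
    all_starts = range(len(seqs[0]) - k + 1)
    observed = mean_pairwise_distance(seqs, epitope_starts, k)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_starts, len(epitope_starts))
        if mean_pairwise_distance(seqs, sample, k) <= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)


def site_entropy(column):
    """Shannon entropy (bits) of the residues observed at one alignment column."""
    n = len(column)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(column).values())


def mean_site_entropy(seqs, sites):
    """Average site-wise entropy over the given residue positions."""
    return sum(site_entropy([s[i] for s in seqs]) for i in sites) / len(sites)
```

Here `seqs` is a list of equal-length aligned sequences and `epitope_starts` lists the start positions of the predicted epitopes. Comparing `mean_site_entropy` over residues inside and outside the predicted epitopes gives a direct check of whether variation concentrates in non-epitope regions.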
Capturing the diversity of the HLA within a population and the subsequent linkages that exist between that population and any associated experimental target inputs are essential for capturing and interpreting immunogenic breadth. By developing quantitative assessments of HLA diversity and relating these same profiles to the immunological relationships of any key disease targets, it would be possible to design restricted experimental strategies that do not introduce bias and can therefore be evaluated as a proxy for the complete population sample.

CONFLICT OF INTEREST
The authors declare no conflict of interest.