Introduction

Effective immunosurveillance of infected cells by αβ CD8+ T cells depends on T-cell receptor (TCR) recognition of specific epitopes that are derived from proteasome-degraded cytosolic proteins and presented by major histocompatibility complex (MHC) class I molecules. Remarkably, only a small fraction of the peptides that potentially can be generated from a given antigen—on the order of 1–3 %—are capable of being processed and binding with sufficient affinity to stabilize the surface peptide–MHC complex (Assarsson et al. 2007). Despite this limitation, the potential vulnerability of CD8+ T-cell defenses to immunoevasion through the mutation of epitope-encoding sequences is greatly reduced by the highly polymorphic nature of classical class I (Ia) molecules. In humans, for example, >4,500 human leukocyte antigen (HLA) class Ia alleles are known (Robinson et al. 2011). Because different alleles preferentially bind peptides with specific amino acid types at anchor positions (the binding motif), the collective pool of variants provides much greater coverage of the pathogen proteome than the restricted repertoires of individual alleles would suggest. Class Ia diversity is therefore a barrier within the population to the spread of intracellular organisms that are selected under immune pressure to evade detection. Even within an individual, protective polymorphisms are in force, as class Ia molecules are usually polygenic, with high levels of heterozygosity and co-dominant expression of alleles. For example, nucleated human cells typically display six different class Ia molecules—two alleles each at the HLA-A, HLA-B, and HLA-C loci—that collectively can present an array of pathogen-derived peptides.

Surprisingly, despite the long-standing value of the domestic cat as a model for human retroviral immunity (reviewed in Elder et al. 2010), the understanding of the MHC class I complex in this species remains comparatively rudimentary. Class I gene products can be detected on the surface of feline cells (Pollack et al. 1988), and transcripts can be found in a variety of tissues (Yuhki et al. 1989). An early estimate of gene number by Southern blotting suggested 20–30 class I sequences per haploid genome (Yuhki and O'Brien 1988). Polymorphisms in class I genes have been found by isoelectric focusing of immunoprecipitated lymphocyte proteins (Neefjes et al. 1986), restriction fragment length polymorphism assay (Yuhki and O'Brien 1988), and serologic analysis using lymphocytotoxic alloantibodies (Winkler et al. 1989). Initially, such polymorphisms were thought to originate from a single (Yuhki et al. 1989), or possibly two or more (Neefjes et al. 1986), unnamed loci. In a later study, comparison of seven distinct feline class I cDNA sequences obtained from a T-cell lymphoma line and three unrelated cats suggested two loci, designated A and B (Yuhki and O’Brien 1990). The homology of these clones to HLA-A2 permitted identification of the eight domains characteristic of class Ia molecules, as well as polymorphic residues within the antigen recognition site (ARS). Since then, additional class I alleles have been reported in the cat with tentative assignment to the A and B loci or without regard to specific origin (so-called Z locus) (Smith and Hoffman 2001). More recently, the genomic structure of the feline leukocyte antigen (FLA) complex has been detailed (Beck et al. 2005; Yuhki et al. 2007), and large sequences have been annotated (Yuhki et al. 2008). Nineteen class I genes, designated FLAI-A through FLAI-S, have been identified. 
Of these, seven were gene fragments, nine were assigned as nonclassical genes, and three loci—FLAI-E, FLAI-H, and FLAI-K—were considered to be class Ia genes. The assignment of classical status was based on the presence of full-length exons, intact splice sites, specific invariant residues in the α1/α2 domains, and a complete upstream promoter motif (enh.A-ISRE-W/S-X1-X2/site γ-Y/enh.B) known to regulate MHC class Ia expression.

There are a number of viruses afflicting humans, including HIV and HTLV, that have naturally occurring homologs in cats (reviewed in O'Brien et al. 2012). Moreover, some of these feline viruses threaten wild cat species, such as the lion, panther, and lynx (Brown et al. 2008; Meli et al. 2009; Roelke et al. 2009). Given the potential usefulness of the domestic cat to understanding adaptive immune responses to such important pathogens, we sought to better characterize the three FLAI loci that presumably restrict antiviral CD8+ T-cell effectors by evaluating polymorphisms and transcription of these genes. Data obtained from 12 cats of various breeds showed high diversity at each locus, with 12, 11, and 10 alleles at FLAI-E, FLAI-H, and FLAI-K, respectively, and cDNAs from all three genes were found in diverse tissue samples. These results constitute strong support for the designation of FLAI-E, FLAI-H, and FLAI-K as class Ia genes and provide the groundwork for the study of epitope-specific, cytotoxic T-cell responses in the cat.

Materials and methods

Isolation and preparation of nucleic acids from cats in the study

Anticoagulated venous blood samples (200 μL) were obtained with owner consent from clinical specimens collected from 11 cats during evaluation at the North Carolina State University Veterinary Teaching Hospital. Blood and tissue samples were harvested from a feral adult domestic shorthair (DSH) cat (DSH7) euthanized at a community animal shelter. Genomic DNA (gDNA) was extracted using a QIAamp DNA Blood Mini Kit (Qiagen, Valencia, CA, USA), and RNA was purified with an RNeasy Plus Mini Kit (Qiagen) from peripheral blood mononuclear cells (PBMCs) that had been isolated by density gradient centrifugation over Histopaque-1077 solution (Sigma-Aldrich, St. Louis, MO, USA). Tissue samples from DSH7 were placed in RNAlater solution (Ambion/Life Technologies, Carlsbad, CA, USA) and stored at −80 °C until homogenization and RNA isolation were performed.

PCR amplification and cloning of FLAI sequences

Genotyping primers (Supplemental Table S1) were designed using NCBI PrimerBlast software (Rozen and Skaletsky 2000) to generate complete reads of exons 2 and 3 from FLAI-E, FLAI-H, and FLAI-K, based on the genomic sequence EU153401 (Yuhki et al. 2008), as well as earlier sequences and exon assignments (Yuhki and O’Brien 1990). Amplifications were performed in 25 μL reaction volumes using 1 μM primers (Invitrogen/Life Technologies) and a HotStar HiFidelity DNA Polymerase Kit with Q solution (Qiagen). Cycling conditions are specified in Supplemental Table S2. To minimize in vitro recombination artifacts, we monitored template quality, kept the amount of template per reaction (40 ng) as low as possible, used extension times no longer than recommended, and avoided excessive cycles (Judo et al. 1998; Kanagawa 2003; Lenz and Becker 2008; Lukas and Vigilant 2005). Gel-purified amplimers were TA-cloned using the pGEM-T Easy Vector System (Promega, Madison, WI, USA). Plasmid DNA samples isolated from transformed Escherichia coli clones (GC10 Chemical Competent Cells, Genesee Scientific, San Diego, CA, USA) were screened via EcoRI restriction digest. Insert-positive plasmids were sequenced in both directions by Eurofins MWG Operon (Huntsville, AL, USA) using either standard SP6 and T7 primers or internal sequencing primers flanking FLAI exons 2 and 3.

For obtaining FLAI full-length (fl) sequences, primer pairs fl-E and fl-H/K (Supplemental Table S1) were designed using PrimerBlast, with the forward primer at the exon 1 initiating ATG and reverse primers in the 3′ untranslated region (UTR). Amplimers were generated in two separate PCRs, using each primer pair and 1 μL cDNA prepared from 250 ng PBMC RNA with Superscript III RT (Life Technologies, Grand Island, NY, USA), and then TA-cloned.

Evaluation of tissue-specific gene transcription

For expression analysis, universal (“U”) primers were designed to anneal to conserved sequences in exons 1 and 4 and amplify most, if not all, FLAI genes predicted to have full-length transcripts. Test sequences from gDNA demonstrated that products from at least seven different loci (data not shown; GenBank accession KC763068–KC763076), including FLA-E, FLA-H, and FLA-K, were recovered. cDNA was prepared using Superscript III RT from 2.5 μg total RNA isolated from each of eight different tissues from DSH7. Amplimers of tissue-specific cDNAs and gDNA from this same individual were cloned as described above. A minimum of 36 insert-positive cDNA clones from each tissue were screened by both StyI and PstI restriction digests and grouped by digestion band patterns. For each tissue, at least three clones from each pattern group, as well as any nonconforming clones, were sequenced, and a minimum of 11 insert sequences were obtained for analysis.

Data analyses and allele nomenclature

Sequences were analyzed and edited with A Plasmid Editor (ApE v2.0.41; Davis 2012) and Geneious v5.6.3 (Drummond et al. 2012). To validate a novel sequence, we always sought to obtain ≥3 representative clones per cat and, ideally, to recover that allele from more than one individual, or minimally, from more than one PCR. In a few cases, despite repeated attempts, these goals were not met (see caption, Supplemental Fig. S1). Sequences ultimately represented by a single clone were discarded. Obtaining six clones of the identical allele from an individual was considered sufficient to declare homozygosity at that locus with an acceptable degree of confidence (>96 %; [1 − 0.5^(n − 1)] × 100, where n is the number of clones), with the assumption that amplification and allele recovery were unbiased. In our analyses, sequences initially were assumed to originate from the locus that was targeted by the specific primer pair; to confirm this, we compared identity bit scores to EU153401 using a BLASTN search (Zhang et al. 2000). If the highest score was to the target locus, then the assignment was considered valid. If the sequence had a very high identity to a locus other than FLAI-E, FLAI-H, or FLAI-K, then that nontarget identity was assigned. FLAI-E, FLAI-H, and FLAI-K are more similar to each other than to all other FLAI loci (see Fig. 1), so if assignment to one of the three was ambiguous, then the originating locus was deduced by elimination, considering all other sequences discovered in that individual and assuming that the sequence in question belonged to one of the putative class Ia genes, with no more than two alleles at any locus.
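The homozygosity criterion above can be checked with a short calculation. The sketch below assumes the confidence expression is 1 − 0.5^(n − 1): in a heterozygote, the first clone fixes one of the two alleles, and each further clone matches it with probability 0.5 when amplification and recovery are unbiased. The function name is illustrative only and is not part of the original analysis.

```python
def homozygosity_confidence(n_clones: int) -> float:
    """Percent confidence that a cat is homozygous at a locus after
    recovering n_clones identical clones.

    Assumes unbiased amplification and cloning, so that a heterozygote
    would yield n identical clones with probability 0.5 ** (n - 1).
    """
    if n_clones < 1:
        raise ValueError("at least one clone is required")
    return (1.0 - 0.5 ** (n_clones - 1)) * 100.0


if __name__ == "__main__":
    # Show how confidence grows with the number of identical clones.
    for n in range(1, 9):
        print(f"{n} clones: {homozygosity_confidence(n):.2f} %")
```

Under this assumption, six identical clones give 96.875 %, which agrees with the >96 % threshold stated in the text, whereas five clones would give 93.75 %.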

Fig. 1

Neighbor-joining consensus tree of FLAI sequences that were amplified from the gDNA of DSH1 with putative locus-specific primer pairs relative to the 12 class I genes with full-length exons in the cat genomic sequence, EU153401. For clarity, allele names are not shown for reference sequences, but would be 00101 at each locus, according to our convention. Definitive names for alleles isolated from DSH1 were assigned at the end of the study, once relationships to other alleles were established. Sixteen clones were generated by primer pair E1; seven clones were FLA-E*00101, eight were E*00701, and one was H*00701. Ten clones were produced by primer pair H; six clones were FLA-H*00701, and four clones were H*008011. Ten clones were generated by primer set K; three clones were FLA-K*00302, five clones were K*00303, and two clones aligned with FLAI-O, a putative nonclassical gene (amplification of FLAI-O with this primer pair was later found in five additional cats—see Supplemental Fig. S3), and are designated FLAO3. For phylogenetic analysis, all sequences were trimmed to the region spanning exon 2 through exon 3. The tree was constructed in Geneious (Drummond et al. 2012) using the Tamura–Nei genetic distance model with bootstrap resampling, and outgrouped to HLA-A2. Numbers at nodes are consensus support (percent). Excluding the minor exceptions noted above, the alleles obtained with each locus-specific primer pair partition the reference locus sequence with strong consensus support

Alleles were named according to the five-digit nomenclature system developed for the dog leukocyte antigen (DLA) (Kennedy et al. 1999, 2001). Specifically, residues in the hypervariable regions (HVRs) define a type (digits 1–3), expected to recapitulate the serotype, and coding differences outside the HVRs define the subtype (digits 4–5). A sixth digit differentiates alleles that have noncoding differences, and a lower case “n” is used to denote nonfunctional sequences. At each locus, we used 00101 to designate the allele matching the genomic sequence EU153401, e.g., FLA-E*00101 (note: the “I” in FLAI signifying class I is omitted in allele names). Initially, alleles were provisionally named according to sequence of discovery; after feline HVRs were established by variability analysis of our larger data set, the correct names were assigned for use throughout the paper.
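For readers who handle these allele names programmatically, the naming rules in this paragraph can be expressed as a small parser. This is a minimal sketch based only on the convention described here (three type digits, two subtype digits, an optional sixth digit for noncoding differences, and an optional trailing "n" for nonfunctional sequences). The class name, field names, and regular expression are illustrative assumptions and not part of any published tool.

```python
import re
from typing import NamedTuple, Optional


class AlleleName(NamedTuple):
    locus: Optional[str]              # e.g. "E", "H", "K"; None if not given
    allele_type: str                  # digits 1-3, defined by HVR residues
    subtype: str                      # digits 4-5, coding differences outside HVRs
    noncoding_variant: Optional[str]  # optional sixth digit
    is_null: bool                     # trailing "n" marks a nonfunctional sequence


# Accepts forms such as "FLA-E*00101", "H*008011", "K*00601n".
_ALLELE_PATTERN = re.compile(r"^(?:FLA-)?(?:([A-Z])\*)?(\d{3})(\d{2})(\d)?(n)?$")


def parse_allele(name: str) -> AlleleName:
    """Split an allele name into the fields of the five-digit convention."""
    match = _ALLELE_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognized allele name: {name!r}")
    locus, allele_type, subtype, noncoding, null_flag = match.groups()
    return AlleleName(locus, allele_type, subtype, noncoding, null_flag is not None)


if __name__ == "__main__":
    for example in ("FLA-E*00101", "H*008011", "K*00601n"):
        print(example, "->", parse_allele(example))
```

Grouping parsed names by `locus` and `allele_type` gives the allele types (for example, E*003) independently of subtype.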

Sequences were deposited in GenBank, and alleles that meet acceptance criteria will be submitted to the Immune Polymorphism Database (IPD)-MHC database (http://www.ebi.ac.uk/ipd/mhc/) (Robinson et al. 2005). Observed and expected heterozygosities were calculated using Arlequin (Excoffier et al. 2005). To test for positive selection of amino acid sites, the exon 2 and 3 nucleotide sequences of all known FLAI alleles were aligned by codons and subjected to Bayesian analysis, using MrBayes 3.1.2 (Huelsenbeck and Dyer 2004). Two parallel runs of 10^6 cycles were performed, with the burnin percentage set to 10 %. Codons were presumed to be under positive selection when posterior probability values, analyzed by Tracer 1.5 (Rambaut and Drummond 2007), exceeded 0.95.
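The heterozygosity statistics mentioned above can be illustrated with a brief sketch. It is not the Arlequin implementation: it assumes that observed heterozygosity is the fraction of heterozygous individuals and that expected heterozygosity is the commonly used unbiased gene diversity estimator, (n / (n − 1)) × (1 − Σ p_i^2), where n is the number of gene copies and p_i are allele frequencies. The genotypes in the example are invented placeholders, not data from this study.

```python
from collections import Counter
from typing import Sequence, Tuple


def heterozygosities(genotypes: Sequence[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (observed, expected) heterozygosity for one diploid locus.

    genotypes: one (allele, allele) pair per individual.
    Expected heterozygosity uses the unbiased estimator
    n / (n - 1) * (1 - sum(p_i ** 2)), with n = number of gene copies.
    """
    n_individuals = len(genotypes)
    if n_individuals == 0:
        raise ValueError("at least one genotype is required")

    # Observed: proportion of individuals carrying two different alleles.
    observed = sum(1 for a, b in genotypes if a != b) / n_individuals

    # Expected: derived from allele frequencies across all gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    n_copies = 2 * n_individuals
    sum_sq = sum((count / n_copies) ** 2 for count in counts.values())
    expected = (n_copies / (n_copies - 1)) * (1.0 - sum_sq)
    return observed, expected


if __name__ == "__main__":
    # Hypothetical genotypes for illustration only.
    toy = [("a1", "a2"), ("a1", "a1"), ("a3", "a4")]
    h_obs, h_exp = heterozygosities(toy)
    print(f"observed = {h_obs:.3f}, expected = {h_exp:.3f}")
```

An observed value well below the expected value corresponds to a deficit of heterozygotes.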

Results

The domestic cat genome contains 19 MHC class I gene candidates, of which 12 are predicted, by comparison to clone FLAA24, to maintain full-length exons, and three of those loci are anticipated to encode classical molecules: FLAI-E, FLAI-H, and FLAI-K (Yuhki et al. 2008). The goal of this study was to assess this claim by analyzing polymorphisms and transcription of these genes in tissues. Secondarily, our objectives were to examine the degree of diversity at these loci in a number of cats comprising several breeds and to characterize the relationship of newly discovered FLAI alleles to those found in earlier studies.

Development and evaluation of FLAI-E, FLAI-H, and FLAI-K locus-specific PCR primer pairs

Polymorphisms in classical MHC genes, which may be driven by one or more evolutionary mechanisms, such as selection of rare advantageous alleles or overdominant positive selection (reviewed in Piertney and Oliver 2006), are concentrated accordingly in the ARS. The sequences of the α1/α2 domains, which contain the ARS, are therefore sufficient for allelic differentiation and, typically, are the only reported portion of the allele (Bjorkman et al. 1987; Castro-Prieto et al. 2011b; Kennedy et al. 1999; Parham et al. 1995). As in other species, FLAI exons 2 and 3 encode these domains and contain the majority of polymorphic sites (Yuhki and O'Brien 1990). To efficiently genotype our cats, we sought to design locus-discriminatory primers by identifying unique sites that flanked these exons and avoided areas of nucleotide variability. The obvious limitations to this effort were that only a single representative allele for each gene was available from genomic data, precluding any estimates of intronic variability; the originating locus for each of the handful of reported full-length cDNA clones (Yuhki and O'Brien 1990) was uncertain; and nucleotide differences between the three loci at almost all candidate primer sites were minimal, often only one or two base pairs. Ultimately, five primer pairs (designated E1, E2, E3, H, K; Supplemental Table S1) were designed that appeared likely to have good discriminatory power, although some degree of cross-locus annealing or resistance to amplification was anticipated. We assessed the capability of our first set of primer pairs (E1, H, K) with gDNA from cat DSH1. A phylogram of all sequences obtained from this individual demonstrates tight clustering of the alleles retrieved with each primer pair with the reference sequences for their expected locus, with a few exceptions (Fig. 1). 
After analysis of additional samples, the E1 primer pair was found to have a fairly high rate of cross-locus amplification and was discarded in favor of the more specific E2 and E3 pairs (Supplemental Table S1, right column). Overall, the E2/E3, H, and K primer pairs, particularly when used as a set, preliminarily appeared capable of discriminating between these extremely similar genes.

Characterization of allelic variation at FLAI-E, FLAI-H, and FLAI-K

We genotyped an additional 11 unrelated cats and, in total, found 33 distinct FLAI sequences (Table 1). Eight of these alleles have been reported previously (Table 2) (Smith and Hoffman 2001; Yuhki et al. 2008; Yuhki and O'Brien 1990), while 25 are novel (Supplemental Fig. S1). In no cat did we obtain more than two alleles at any locus. In our samples, FLAI-E had the highest number of alleles, 12, encoding 12 unique amino acid sequences. At the H locus, we detected 11 alleles, encoding nine unique amino acid sequences. Ten alleles were found at the K locus. Two of these alleles, K*00601n and K*00602n (which differ by 1 bp), have a single nucleotide substitution in exon 2 that introduces a premature stop at position 63 in the α1 domain. These variants otherwise conform to the expected class Ia pattern of polymorphic and conserved sites and, hence, appear to be mutations (null [“n”] alleles), not pseudogenes. Excluding these two, eight FLAI-K amino acid sequences were found. Our data show that FLAI-E, FLAI-H, and FLAI-K are polymorphic genes, consistent with their putative classical status.

Table 1 MHC class Ia genotypes of cats in this study (GenBank accession KC763013–KC763045)
Table 2 Previously discovered FLAI alleles

Two common allele types were observed: FLA-E*003 and FLA-K*003. The FLA-E*003 representatives, E*00302 and E*00303, were found in six individuals; E*00303 was carried by 5 of the 12 cats. The FLA-K*003-type alleles, K*00302 and K*00303, were also detected in six cats. Their prevalence is further supported by the fact that two previously reported clones belong to these allele types: FLAA10 (designated E*00301 in our analysis) and FLAB2 (designated as K*00301) (Yuhki and O'Brien 1990). Sixteen alleles were not shared among the cats in our study population (Table 1, boldface type), although one (E*00101) has been previously identified (Yuhki et al. 2008). In fact, all three reference alleles found in the RPCI86 cat BAC library were recovered here.

For the four cats that were homozygous at two of the three loci, eight haplotypes could be deduced; of these, two (E*00303, H*00101, K*00801 and E*01001, H*00101, K*00801, both carried by DSH2) were potentially shared with another individual, DSH5. The Persian and Himalayan cats also shared a haplotype, as these individuals only varied by a single FLAI-E allele. The two breeds are very closely related (Himalayans are a “variant breed” originally selected from Siamese–Persian pairings, and occasionally outcrossed to Persians; Menotti-Raymond et al. 2008), so common alleles are not surprising. Moreover, it is possible that class I allelic diversity is restricted within feline breeds (purebred cats have lower overall genetic variation than do outbred cats; Lipinski et al. 2008), although perhaps not to the same degree as observed in other domestic animal breeds, such as the dog (Ross et al. 2012) or miniature swine (Ando et al. 2003), where breeding barriers are stronger. As a measure of genetic diversity at each locus, we computed the observed and expected heterozygosities for FLAI-E, FLAI-H, and FLAI-K (Table 3). These values are similar to an average heterozygosity of 0.85 calculated for ten polymorphic simple tandem repeat loci in outbred cats (Menotti-Raymond et al. 2012). For two loci, FLA-E and FLA-H, there was a significant departure from Hardy–Weinberg equilibrium (a deficit of heterozygotes), which could reflect, in part, our small sample size (Pruett and Winker 2008) or amplification resistance of some alleles, but presumably also is due to inbreeding in cats. Consistent with this supposition, genetic variance in outbred cats is lower than that in humans (Lipinski et al. 2008).

Table 3 Heterozygosity at classical FLA class I loci

Evaluation of locus assignment

To assess whether our sequences were accurately assigned to their locus of origin, we constructed a phylogenetic tree of the 33 FLAI alleles, using exon 2–intron 2–exon 3 nucleotide sequences; if attributed correctly, then alleles from the same locus should partition into monophyletic clusters (Gu and Nei 1999). Figure 2 shows that all H alleles segregate into a highly supported clade, while partitions between the tentative E and K alleles are less well resolved. Notably, one allele, K*00801, which was found in three individuals, can be seen to group with other presumed E locus origin sequences (Fig. 2, arrow) and, therefore, would seem to be misplaced, or possibly, to represent a PCR amplification artifact—an unlikely explanation, as this allele has been reported previously as a full-length transcript (FLAB9) (Yuhki and O'Brien 1990). In two of the cats (DSH2 and DSH5), K*00801 was carried with E*01001, which is contained within a well-supported clade of other E alleles and is identical to FLAA23, a clone that clearly originates from a locus (A, by the previous designation) distinct from that of FLAB9 (B locus). Moreover, both cats possessed a second allele (E*00303) with strong homology to the E locus. Collectively, these observations argue that K*00801 is the product of a non-E locus. In further support, the other B locus origin clone identified in the study of Yuhki and O'Brien (1990), FLAB2, best matches the FLAI-K alleles found in this study. Therefore, we maintained the assignment of K*00801 to the K locus, recognizing that its phylogenetic placement suggests that this is a hybrid allele generated by interlocus (intergenic) recombination. In fact, this process may explain the imperfect segregation of the FLAI-E and FLAI-K alleles; the expectation for monophyly of same locus alleles assumes that interlocus recombination or gene conversion is a very small contributor to diversity.
MHC class I intronic data can also be phylogenetically informative (Cereb et al. 1995). In intron 2, we observed 35 variable sites among the FLAI alleles, including a 5-bp insertion/deletion (indel) in the 3′ GGG trinucleotide splicing enhancer region (Supplemental Fig. S1, nucleotides 458–462). However, when a phylogram was constructed only with our intronic sequence data (i.e., disregarding those regions expected to be under positive selection pressure), most partitions were poorly supported, with only seven H alleles forming a distinct clade (98 % consensus support; data not shown). This intron homogenization supports the idea that interlocus recombination has occurred. Further, an analysis of a limited number (seven) of class I transcripts found clear evidence for interlocus (and interallelic) recombination in FLA, specifically identifying in the first α-helix region (in the α1 domain) a mosaic pattern of two shared polymorphic sequences separated by a conserved 23-bp stretch (Supplemental Fig. S1, bp 206–228) (Yuhki and O'Brien 1990). This presumptive recombination hotspot is also found in nondomestic felids—ocelot, cheetah, lion, tiger, and extinct saber-toothed cat—and therefore is likely ancient in origin (Pokorny et al. 2010; Sachdev et al. 2005; Yuhki and O’Brien 1994). These same motifs are observed in our sequences. Two motifs are found: one exemplified by FLAA10/E*00301 and common to the E origin alleles in one partition (Fig. 2, marked *) and a second pattern represented by FLAA23/E*01001 that is found in a separate partition shared by E and K origin alleles (marked **). Hence, the incomplete partitioning of our alleles originating from the FLA-E and FLA-K loci may be the result of interlocus recombination that was previously observed in their A and B analogs (and in two clones of seeming hybrid lineage, given an “AB” locus designation; Yuhki and O’Brien 1990; 1994). 
Consequently, there is an acknowledged uncertainty for our locus assignments that lies somewhere between humans, where virtually all alleles cluster by loci, and mice, where locus specificity is not phylogenetically discernible (Gu and Nei 1999). Such difficulty in definitively identifying the originating locus is commonplace in MHC class I studies in animals, due to interlocus recombination and haplotype-variable gene expression (Babiuk et al. 2007; Ellis et al. 1999; Holmes et al. 2003; Miltiadou et al. 2005; Tallmadge et al. 2010); in cattle and sheep, for example, alleles are designated with the prefix N to reflect that uncertainty.

Fig. 2

Neighbor-joining consensus tree of all FLAI sequences that were amplified from gDNA samples from the 12 study cats with locus-specific primer pairs. Sequences were trimmed and the phylogram constructed as described in Fig. 1. Numbers at nodes are consensus support (percent). Most alleles partition into locus-specific clades with good consensus support. Alleles FLA-E*00701, E*01101, E*01201, and K*00801 had ambiguous identity (by BLASTN search; Zhang et al. 2000) to the intended reference locus, but only K*00801 (arrow) clustered with alleles from a different putative locus

Assessment of structural features common to classical MHC class Ia molecules

The 29 unique predicted amino acid sequences found in this study are shown in Fig. 3. All alleles conform to the expected pattern of polymorphic and conserved sites in the α1/α2 domains, as detailed in Table 4. All important structural residues are present, including those involved in TCR contact; the C101 and C164 residues that form the disulfide linkage between the β-sheet and the α2 domain α-helix; and amino acids in the ARS that are conserved across murine and human class Ia heavy chains, with the exception of an L160A substitution in two alleles, E*00501 and E*01001 (Bjorkman and Parham 1990; Kaufman et al. 1994; Yuhki and O’Brien 1990). Similarly, 13 of the 19 residues in the α1 and α2 domains of HLA-A2 that contact β2M are found in all our feline alleles (Saper et al. 1991). A few exceptions are noted: at F9, all but one allele has a Y, and at Y116, most residues use D or E. These same position 9 and 116 variants are also observed in the canine classical class Ia molecule, DLA-88 (Ross et al. 2012). Conversely, humans and dogs use a T at position 94, while our feline alleles possess N or S at this site. In addition to evaluating α1/α2 sequences, we sought to verify other characteristic MHC class Ia structural elements (leader peptide; α3 extracellular, transmembrane and cytoplasmic domains) in one or more representative FLAI alleles. To make this determination, we amplified full-length transcripts from PBMCs. Alignment of the recovered sequences with FLAA23 shows that all expected domains are present (Supplemental Fig. S2). We also generated two full-length FLAI-E clones from two additional DSH cats (not shown); the same structural components were identifiable.

Fig. 3

The alignment of the predicted translations of all FLAI alleles amplified from gDNA samples from the 12 study cats with locus-specific primer pairs. Exons were deduced by comparison to cDNA sequences (from our data and Yuhki and O'Brien 1990). Dots signify identities to the 00101 allele (top rows) representing each locus, while letters indicate amino acid differences. Domain names are shown in the gray bar above the sequences. The asterisk symbol indicates sites expected to be conserved. The plus symbol indicates positions identified as highly polymorphic in felines (Castro-Prieto et al. 2011a, b; Pokorny et al. 2010; Yuhki and O'Brien 1990; 1994), i.e., those positions having at least three different amino acid substitutions. Residues in bold are predicted to constitute the ARS (Bjorkman et al. 1987; Yuhki and O'Brien 1994)

Table 4 Conserved and polymorphic sites in the α1/α2 domains of FLA class Ia alleles

Evaluation of tissue expression of FLAI-E, FLAI-H, and FLAI-K genes

MHC class I cDNA sequences with >99 % identity to FLAI-H and FLAI-K have been obtained from a cell line derived from the Abyssinian cat used in the feline genome project (Yuhki et al. 2008), demonstrating that these two loci are transcribed. By definition, classical MHC molecules are widely expressed, and accordingly, we sought to determine whether transcripts from any or all of the three putative class Ia loci could be isolated from multiple feline tissues. Using the U primer pair, FLAI-E and FLAI-K alleles were amplified from the cDNAs of all eight tissue samples obtained from DSH7, and an FLAI-H allele was retrieved from seven tissues (Table 5). An additional sequence that was identified in two clones isolated from the ovary matched E*00501 from the beginning of exon 1 to midway in exon 2, while the remainder matched K*00701. Most likely, this sequence was a PCR chimera, but given its discovery in ovarian tissue, this hybrid could also have resulted from a meiotic crossover event. Thus, all three loci are expressed across a range of tissue types, supporting the prediction that FLAI-E, FLAI-H, and FLAI-K are class Ia loci in the domestic cat.

Table 5 Recovery of FLAI transcripts from tissues [the identical FLAI-E, FLAI-H, and FLAI-K alleles, as well as sequences representing FLAI-F and FLAI-O (GenBank accession KC763059–KC763067), were amplified from this cat’s gDNA]

Characterization of other FLAI loci

Based on genomic analysis, there are five additional full-length FLAI genes, FLAI-F, FLAI-J, FLAI-L, FLAI-M, and FLAI-O, which have most, but not all, of the 31 residues in the α1 and α2 domains that are conserved across human and feline class Ia molecules (Yuhki and O’Brien 1990). Hence, Yuhki et al. (2008) designated these loci as nonclassical (class Ib)—functional MHC molecules with limited polymorphisms and tissue-restricted expression (reviewed in Rodgers and Cook 2005); however, the true status of each of these feline genes is unknown. While some have complete class I promoter motifs, such as FLAI-O, no cDNA clones representing these loci have been reported. Further, we found no evidence that these genes were transcriptionally active in tissue samples when using primers (U) capable of amplifying FLAI-F, FLAI-J, FLAI-O, and FLAI-Q origin sequences from gDNA. Smith and Hoffman (2001) reported an FLAI allele “1” from an undetermined (Z) locus that was carried by three of seven domestic cats, as well as by four other species of felids, including the caracal and Geoffroy’s cat. This sharing of a single allele of a normally highly polymorphic MHC gene across the three major cat lineages was recognized as “unusual” and was therefore considered to denote high selective pressure for locus conservation. With the benefit of subsequent sequencing and annotation of the domestic cat MHC, it now appears that allele 1 is FLAI-O. With the U or K primer pairs, we retrieved sequences from the gDNA of 6 of the 12 cats (each represented by ≥2 colonies) that were virtually identical to FLAI-O, although most had 1 to 3 bp deletions or additions in exon 3. An alignment of our FLAI-O sequences, as well as the homologous Z locus origin sequences reported by Smith and Hoffman (2001), is shown in Supplemental Fig. S3. 
Collectively, the low variation between individual cats and across species, the coding errors, and the lack of evidence of expression suggest that FLAI-O may be an inactive, class I relic (also referred to as a “dead” gene) (Nei et al. 1997; Zemmour et al. 1990). Similarly, additional sequences represented by two or more colonies were recovered from DSH7 gDNA that were homologous to FLAI-F and FLAI-J—both full-length and variants with premature stops—which may suggest that these loci, too, are no longer active. Of course, the possibility that FLAI-F, FLAI-J, and FLAI-O are (nonclassical) class Ib genes expressed exclusively in tissues that were not sampled, or at levels below our detection capabilities, cannot be excluded.

Assessment of diversity across new and established classical FLAI sequences

Finally, we wished to investigate patterns of diversity within and between all known class Ia alleles of the domestic cat. Before this study, a number of putative classical FLAI sequences had been deposited in GenBank (Table 2); 12 were not observed in our cats. When aligned to the full-length class I genes, each of the 12 matched one of the reference FLAI-E, FLAI-H, or FLAI-K sequences most strongly, although none had been specifically designated as products of these loci. Combined with the alleles found here, there are thus a total of 41 unique amino acid sequences discovered in the cat to date that presumably represent classical molecules. When genetic diversity was evaluated in HLA class Ia genes by comparing polymorphisms in exons 2, 3, and 4 of 39 HLA-A, HLA-B, and HLA-C alleles, three clusters of high amino acid variability (now known as HVRs) could be identified, one in the α-helix of the α1 domain and two in the α-helix and β-strand of the α2 domain. Not surprisingly, the polymorphic residues of these HVRs pointed into, or up from, the ARS, likely interacting with the bound peptide or TCR (Parham et al. 1988). The same HVRs can be observed in DLA-88 (Graumann et al. 1998), and the boundaries of the canine HVRs (positions 62–77, 91–116, 152–158) (Kennedy et al. 1999) contain the highly polymorphic positions specifically identified in seven feline class I clones (Yuhki and O'Brien 1990), suggesting that these regions also constitute HVRs in cats. To confirm this supposition, we used the data set of 41 pooled alleles to calculate the variability index (vi) (Wu and Kabat 1970) for the α1 and α2 regions. Overall, as seen in Fig. 4, 15 positions had high variability (defined as vi ≥ 4), with a pattern of polymorphisms that was similar to that observed in the class I alleles of tigers (Pokorny et al. 2010).
Three HVRs, which are defined as amino acid sequences containing and bounded at either end by highly variable residues, are apparent: HVR1, positions 62–81; HVR2, 94–116; and HVR3, 152–156. These HVRs occupy virtually the same residue stretches as those of HLA and DLA, and similarly, many (27) of the feline polymorphic sites are part of the ARS (Yuhki and O'Brien 1994). If a variability plot is generated from the alleles at each locus separately, then the E locus has 21 positions where vi is ≥4; the K locus has 8 positions, and the H locus, 6 (data not shown). The H locus origin alleles are mainly variable in HVR1 only, and the diversity in HVR3 is due almost entirely to substitutions in E locus origin alleles. The dominant contribution of one locus to variability at particular positions is also observed in HLA-A and HLA-B molecules and may indicate that interlocus recombination is limited (Parham et al. 1988).

Fig. 4

The variability index (vi) plotted versus residue position in the α1 and α2 domains of classical FLA class Ia molecules. For all 41 unique amino acid sequences, vi was calculated according to the equation: vi = (# of different residues at position x) / (frequency of most common residue at position x per total alleles considered). An invariant position has a vi = 1; a highly variable position was defined as vi ≥ 4 (dotted line). Three HVRs are observed. Residues indicated by a star were determined by Bayesian analysis to be positively selected with >95 % probability
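
As a rough illustration of the variability index defined in the Fig. 4 legend, the following Python sketch computes vi for each column of a set of aligned, equal-length amino acid sequences and flags positions with vi ≥ 4. The function names and the toy sequences are hypothetical and are not data from this study; positions are numbered simply by 1-based column order, not by mature-protein residue number.

```python
from collections import Counter


def variability_index(column):
    """Wu-Kabat variability for one alignment column.

    vi = (number of different residues) / (frequency of the most common residue),
    where frequency = count of the most common residue / number of sequences.
    """
    counts = Counter(column)
    n_different = len(counts)
    most_common_count = counts.most_common(1)[0][1]
    frequency = most_common_count / len(column)
    return n_different / frequency


def variability_profile(sequences):
    """Return vi for every column of a list of aligned, equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [variability_index(col) for col in zip(*sequences)]


# Toy alignment (hypothetical, for illustration only).
toy_alleles = ["GSHSL", "GSHSM", "GPHSM", "GAHTM"]
profile = variability_profile(toy_alleles)

# 1-based positions meeting the "highly variable" threshold used in Fig. 4.
high_positions = [i + 1 for i, vi in enumerate(profile) if vi >= 4]
print(profile)
print(high_positions)  # [2]
```

In the toy alignment, column 2 holds three different residues and the most common one occurs in half of the sequences, giving vi = 3 / 0.5 = 6; invariant columns give vi = 1, in agreement with the legend.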

The preferential accumulation of nonsynonymous nucleotide substitutions in the ARS is considered strong evidence of overdominant selection as a substantial factor in maintaining MHC class I diversity. This observation has been reported previously in cats: by using a pairwise comparison between seven FLAI transcripts, a biased distribution of nonsynonymous mutations was detected in the ARS, but not in the regions of the α1/α2 domains that lay outside of the ARS (Yuhki and O'Brien 1990). We wished to confirm this finding in our larger set of 43 nucleotide sequences (including K*00601n and K*00602n). Using Bayesian analysis, five codons were identified in the HVRs that had a mean probability >95 % of positive selection: positions 62, 63, 81, 97, and 156 (Fig. 4 [starred residues]). As expected, all positions occurred in the ARS. Interestingly, studies in the tiger and leopard have found evidence of positive selection at three of the same sites (residues 63, 97, and 156), as well as at other positions in the HVRs (residues 66, 67, 70, 74, 75, 77, 116, 152, and 155) (Castro-Prieto et al. 2011a; Pokorny et al. 2010).

We then investigated the phylogenetic relationship between all 41 FLAI alleles. For this analysis, the 12 previously reported class Ia sequences not detected here were renamed according to locus homology and interallelic relationships defined by synonymous and nonsynonymous substitutions within and outside the HVRs, as described in the “Materials and methods” section. When the amino acid sequences of the α1/α2 domains were grouped together in a consensus tree, most alleles segregated with >50 % support into locus-specific clades (Fig. 5). As in the previous analysis that had been limited to our new nucleotide sequences, two E locus lineages are visible, and K*00801 is again placed among the E locus alleles (arrow). Seven sequences fall outside of locus groupings. These alleles include four sequences that are available only as GenBank submissions, so the details of their discovery are unpublished. One of these alleles, H*00901 (B*no5), carries a T substitution at K146, which is a highly conserved position (Yuhki and O'Brien 1990 and our data), suggesting either potential errors in sequencing or the possibility that this variant encodes a class Ib molecule. H*00901 and the other ungrouped sequences could also represent chimeric alleles, resulting from either in vitro or genuine interlocus or interallelic recombination.

Fig. 5

Neighbor-joining consensus tree of deduced FLAI amino acid sequences obtained in this study and from the GenBank database. A phylogram of the α1 and α2 domains was constructed with Geneious, using the Jukes–Cantor genetic distance model with bootstrap resampling, and was rooted by including HLA-A2 as an outgroup. Numbers at nodes are consensus support (percent). In this analysis, FLA-H*00801 represents allele subtypes H*008011 and H*008012, and FLA-H*00301 represents allele subtypes H*003011 and H*003012—these subtypes are distinguished by noncoding nucleotide substitutions. Most alleles partition into locus-specific clades with good consensus support, although seven sequences appear ungrouped; K*00801 is indicated with an arrow
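
For readers unfamiliar with the distance model named in the Fig. 5 legend, the sketch below shows the general Jukes–Cantor correction applied to the proportion p of differing sites between two aligned sequences: d = −((b − 1)/b) ln(1 − b·p/(b − 1)), where b is the number of character states (20 for amino acids, 4 for nucleotides). It is an assumption that this generalized form matches the tree-building software's exact implementation, including its treatment of gaps, and the function names are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns where either sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable (ungapped) sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(seq_a, seq_b, n_states=20):
    """Jukes-Cantor corrected distance for an alphabet of n_states characters."""
    p = p_distance(seq_a, seq_b)
    scale = (n_states - 1) / n_states
    if p >= scale:
        # The correction is undefined once sequences are saturated with changes.
        return math.inf
    return -scale * math.log(1 - p / scale)


# Toy peptides (hypothetical): one difference in eight sites.
print(jukes_cantor_distance("GSHSLRYF", "GSHSMRYF"))
```

The corrected distance is always at least as large as the raw proportion of differences, because it allows for multiple substitutions at the same site; a matrix of such pairwise distances is the input to neighbor-joining.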

Discussion

The objectives of this study were to evaluate the prediction that FLAI-E, FLAI-H, and FLAI-K are classical MHC class Ia genes in the domestic cat and to describe polymorphisms at these loci. Our design aligned with the goals advocated as necessary to characterize MHC diversity: define the number of expressed loci; assign alleles to loci and assess heterozygosity; and determine the extent of sequence variation in structurally important regions, i.e., the ARS (Piertney and Oliver 2006). Using sets of locus-specific primer pairs to amplify FLAI gene segments from 12 cats, we identified 25 novel alleles and verified the sequences of 8 others residing in the GenBank database. One unexpected finding is that we failed to recover transcripts from nonclassical genes in our expression analysis, even though there may be as many as nine full-length class Ib loci (Yuhki et al. 2008), and our primer pair was capable of amplifying at least four of those genes. Of course, this negative result may simply reflect a technical limitation; to find such evidence may require the design of primers specific for these genes or isolation of RNA from other tissues. Analysis of genomic FLAI-F, FLAI-J, and FLAI-O sequences suggested that these three genes do not encode nonclassical class Ib molecules, but rather have become inactive relics.

This work continues the compilation and clarification of FLAI alleles in the domestic cat. Most sequences from earlier studies appeared to be restricted to only two loci of origin, potentially due to probe or primer mismatch, so presumably, alleles were previously underreported. The PCR conditions and primers developed here should prove to be a valuable tool for class Ia genotyping, although it should be emphasized that we also may have failed to detect some alleles, leading to an underestimation of heterozygosity, as our primer pairs were designed using a limited data set. Additionally, analogous studies in other species have established that some alleles are preferentially amplified (Miltiadou et al. 2005) and that rare transcripts may be missed by conventional Sanger sequencing methods (Budde et al. 2010; Kita et al. 2012). On the other hand, the use of multiple primer pairs with different locus specificities can potentially improve the recovery of sequences from genes transcribed at lower levels (Bettinotti et al. 2003).

Further work will be necessary to confirm the provisional locus assignments for FLAI alleles that were made here. Identifying the originating locus solely from the gene regions encoding the highly polymorphic α1/α2 domains can be difficult. For example, when neighbor-joining trees of ten HLA loci were constructed based on sequences from either the 5′ flanking region, introns, exons 2 and 3, or exons 4 through 8 (which encode the α3, transmembrane, and cytoplasmic domains), topologies were similar, but the confidence in partitioning based on exon 2 and 3 sequences was lower than the others (Sawai et al. 2004). Locus-specific motifs have been shown to reside in other regions of class I genes, including the 5′ UTR (Cereb and Yang 1994), introns (Cereb et al. 1995, 1997), exons 4 through 8 (Birch et al. 2006; Ellis et al. 2005; Yuhki and O'Brien 1990), and the 3′ UTR (Koller et al. 1984; Tallmadge et al. 2010), and therefore, definitive FLAI locus interpretation will ultimately require full-length data. Ideally, next generation sequencing methods can be used to expediently obtain complete heavy chain sequences; these techniques will also facilitate scanning larger samples of the domestic cat population to identify prevalent haplotypes and promote the discovery of rare class Ia transcripts (Budde et al. 2010), which can clarify assignments. Finally, it should be noted that, despite the current uncertainty with locus assignment, this attribution is of secondary importance: as long as allele recovery is robust, then the downstream immunologic applications of such data—epitope discovery; disease association studies—can proceed unimpeded.

The final step in defining FLAI-E, FLAI-H, and FLAI-K as classical loci will be to demonstrate their restriction of CD8+ T-cell activity. If highly prevalent alleles are used in making such determinations, then immunodominant cytotoxic T-cell responses shared among cats carrying that allele can be defined, a critical prerequisite to understanding adaptive immunity to viruses in this species. The cat remains an important model of oncoviral and lentiviral diseases (O'Brien et al. 2008, 2012). Cytotoxic T lymphocyte (CTL) responses to experimental infection and vaccination have been investigated in feline sarcoma virus (McCarty and Grant 1983), feline leukemia virus (Flynn et al. 2000), and most notably, feline immunodeficiency virus (Bonci et al. 2009; Burkhard et al. 2001; Flynn et al. 1996; Gupta et al. 2007; Koksoy et al. 2001; Leutenegger et al. 2000; Pu et al. 1999; Song et al. 1992), and the role of these CTL in controlling infection has been demonstrated by challenge protection (Flynn et al. 1996, 2000) and adoptive transfer of immunity (Flynn et al. 2002; Pu et al. 1999). Despite this body of work, the MHC class I molecules that control such CTL are unknown. Importantly, in this study, we identified not only prevalent alleles, but also common allele types (E*003 and K*003). While not genotypically identical, the alleles that make up these types have the same HVRs and potentially have overlapping peptide binding preferences and present the same epitopes. Such so-called supertypes (del Guercio et al. 1995; Sidney et al. 1995) effectively increase the prevalence of functionally equivalent alleles in populations under study, and hence, their recognition is valuable in developing peptide-based vaccines or studying peptide-specific T-cell responses with the broadest possible coverage.

Defining and surveying classical class Ia molecules in the domestic cat may also benefit wild cats. Recent viral epidemics with high mortality rates have adversely impacted a number of big cat populations, and there is concern for cross-species disease transmission (Brown et al. 2008; Heeney et al. 1990; Meli et al. 2009; O'Brien and Johnson 2005; Pearks Wilkerson et al. 2004; Roelke-Parker et al. 1996). Consequently, effective immunization programs are considered a critical part of the conservation efforts for these species. With a more complete understanding of FLA, the domestic cat could play a role in such vaccine development. Much of the genetic variation in the Felidae MHC appears to have been set before speciation (Castro-Prieto et al. 2011a; Wei et al. 2010; Yuhki and O'Brien 1994, 1997), and some polymorphisms are ancient. Therefore, it seems likely that there are similarities in MHC class Ia alleles, and in the CD8+ T-cell responses they restrict, across species and lineages of cats.