Landscape of IGH germline genes of Chiroptera and the pattern of Rhinolophus affinis bat IGH CDR3 repertoire

ABSTRACT The emergence and re-emergence of abundant viruses from bats that impact human and animal health have resulted in a resurgence of interest in bat immunology. Characterizing the immune receptor repertoire is critical to understanding how bats coexist with viruses in the absence of disease and developing new therapeutics to target viruses in humans and susceptible livestock. In this study, IGH germline genes of Chiroptera including Rhinolophus ferrumequinum, Phyllostomus discolor, and Pipistrellus pipistrellus were annotated, and we profiled the characteristics of Rhinolophus affinis (RA) IGH CDR3 repertoire. The germline genes of Chiroptera are quite different from those of human, mouse, cow, and dog in evolution, but the three bat species have high homology. The CDR3 repertoire of RA is unique in many aspects including CDR3 subclass, V/J genes access and pairing, CDR3 clones, and somatic high-frequency mutation compared with that of human and mouse, which is an important point in understanding the asymptomatic nature of viral infection in bats. This study unveiled a detailed map of bat IGH germline genes on chromosome level and provided the first immune receptor repertoire of bat, which will stimulate new avenues of research that are directly relevant to human health and disease. IMPORTANCE The intricate relationship between bats and viruses has been a subject of study since the mid-20th century, with more than 100 viruses identified, including those affecting humans. While preliminary investigations have outlined the innate immune responses of bats, the role of adaptive immunity remains unclear. This study presents a pioneering contribution to bat immunology by unveiling, for the first time, a detailed map of bat IGH germline genes at the chromosome level. This breakthrough not only provides a foundation for B cell receptor research in bats but also contributes to primer design and sequencing of the CDR3 repertoire. Additionally, we offer the first comprehensive immune receptor repertoire of bats, serving as a crucial library for future comparative analyses. In summary, this research significantly advances the understanding of bats’ immune responses, providing essential resources for further investigations into viral tolerance and potential zoonotic threats.

respiratory syndrome coronavirus-2 (10,11).Studying the mechanisms of immune tolerance in bats could lead to new approaches to improving human health (12).
Bats carry highly pathogenic viruses without symptoms, which should be attributed to their special innate and adaptive immune responses.The composition and function of Toll-like receptor (13,14), interferon (15), and a variety of innate immune response genes have been preliminary elaborated in bats, and the mechanism of interferon in bats and humans is different (16), suggesting that bats have a stronger innate antiviral response and can control viral replication early (17,18).However, it is not clear what role the adaptive immune response of bats plays in this process.
Revealing the mechanism of B cell response and antibody production in bats will help to clarify the mechanism of asymptomatic bats carrying viruses.In 1982, IgM, IgG, and IgA were isolated from the serum of Artibeus lituratus and P. giganteus and which were homologous with that of human immunoglobulin (19).In 2010, the representative immunoglobulin heavy chain variable region (VH) genes of Pteropus Alecto and Pteropus vampyrus antibodies were found, involving all three VH families (I, II, and III) (20).In 2011, Butler et al. found the transcriptomic evidence of IgM, IgE, IgA, and IgG subclasses in Chiroptera (21), and bats showed high diversity of VH, DH, and JH genes (22).In 2021, Larson et al. annotated 66 IGHV genes, 8 IGHD genes, and 9 IGHJ genes at the IGH locus of Egyptian rousette bats using bacterial artificial chromosome (23).Although these previous studies provide a basis for understanding the humoral immune response of bats, further exploring bat B cell-mediated adaptive immune response depends on the annotation and application of bat IG germline genes at the chromosome level.
With the completion of genome sequencing and chromosome assembly in a few bats, Rhinolophus ferrumequinum (RF), Rousettus aegyptiacus, Phyllostomus discolor (PD), Myotis myotis, Pipistrellus pipistrellus (PP), and Molossus molossus (24), we have finished annotation and preliminary application of the TR in RF (25).Now, we unveiled a detailed map of Chiroptera IGH germline genes on chromosome level and provided the first immune receptor repertoire of bat.

Location of V, D, and J genes of IGH locus
The whole-genome sequence information of RF (GCA_00415265.2),PD (GCA_004126475.3),and PP (GCA_903992545.1)was obtained from the NCBI website (https://www.ncbi.nlm.nih.gov/).The classic IMGT_ LIGMotif (26) and 12/23RSS (27) approaches were adopted to identify bat's IGH germline genes.The chromosomal location of IGH loci was determined by comparing mammals' IGHC genes that are available on the IMGT website (https://www.imgt.org/genedb/)with the whole-genome sequence of three bat species.Similarly, mammalian IGHV, IGHD, and IGHJ sequen ces downloaded from the IMGT website were mapped with the chromosomes deter mined by bat's IGHC gene to obtain possible germline genes, and these genes were labeled with Geneious Prime software.Next, the sequences from IGHV to IGHC were selected with 10 KB as a group and were dropped into the Ig BLAST website (https:// www.ncbI.nlm.nih.gov/igblast/) to screen possible germline genes.Moreover, the Meme website (http://meme-suite.org/)was applied to screen the possible RSS sequences for finding unlabeled IGHV, IGHD, and IGHJ genes and verifying all germline genes.Finally, the genes with complete initiator codon, splice site, sequence length greater than 271 bp, and proper RSS at the end of the sequence were identified as IGHV genes.

Characteristic analysis and nomenclature of bat germline genes
The characteristics of each germline gene of the three bat species were analyzed according to IMGT guidelines (28).The labeled V, D, and J gene sequences were uploaded to IMGT/V-Quest (http://www.imgt.org/IMGT_vquest/input)and were defined as functional genes (F), open reading frame (ORF), and pseudogene (P) according to IMGT functional classification principles.The similarity of amino acids and nucleotides of three bat germline genes was assessed using Geneious Prime software.Additionally, the anchor positions, which define the beginning and end of each CDR (complementar ity-determining region) in the V and J genes across the three bat species, were identified and labeled using Geneious Prime software and IMGT/V-Quest.
According to the naming rules of human IGH in IMGT, those with nucleotide similarity ≥75% were classified into the same family, the IGHV genes of three bats were clustered with human IGHV gene families, and those with nucleotide similarity ≥75% were classified into the same family, which was named uniformly with that of human.Phylogenetic trees of IGHV/J genes of three bat species were also constructed, respec tively, in MEGA version 7 by the neighbor-joining method.The Logo graph was drawn to analyze the composition characteristics and conserved of RSS using the Weblogo website (https://weblogo.threeplusone.com/create.cgi).

Construction of bat IGH reference data set
Due to the high homology of IGH V, D, J, and C sequences of the annotated three bat species, the bat IGH reference gene bank was constructed for the first time by using the annotated IGH germline genes of the three bat species, with a total of 179 IGHV genes (including 57 pseudogenes), 27 IGHD genes, 19 IGHJ genes, and 12 IGHC genes.These V and J genes that have been included in the reference data set were divided into four framework regions (FR1, FR2, FR3, and FR4) and three variable regions (CDR1, CDR2, and CDR3) according to the amino acid conserved sites which been recognized by MiXCR (29).Finally, bat IGH reference data set was constructed and used for bat IGH repertoire analysis.

Sample preparation and IGH CDR3 repertoire sequencing
The muscle and spleen of bats were collected in Zunyi, Guizhou Province, China.The muscle tissues were used to extract genomic DNA, and the Cytb gene was amplified to determine the genotype of bats (Table S1).In this experiment, three Rhinolophus affinis were selected for the IGH CDR3 repertoire construction and characteristic analysis.The spleen tissues were used to extract total RNA.The construction and sequencing of the library were conducted by Hangzhou ImmuQuad Biotechnologies Ltd using the 5' RACE method.The primers were designed in the conserved region (Fig. S1), which was obtained by comparing all available bat's IGHC genes.In order to control the quality of library construction, two groups of primers with different specificity were designed for IGG, and the efficiency of high-frequency sequence amplification was analyzed between the two groups.After quality control of sequencing raw data, MiXCR software was applied for subsequent analysis using the bat IGH reference data set we created.
We also compared the characteristics of bat IGH CDR3 repertoire with those of human and mouse.Peripheral blood of three healthy volunteers (male, 19-25 years old) and bone marrow of three mice (2 months old) were collected for construction and sequencing of IGH CDR3 repertoire.Moreover, we also downloaded the public IGH CDR3 repertoire data for further comparative analysis (including human and mouse), both of which were obtained by 5' RACE method.The human accession number for the data is ERR3445161, and the mouse accession number is ERR5556766_1.Both data sets have been deposited in the NCBI repository.

Analysis of the IGH CDR3 repertoire
The MiXCR, VDJtools, and immunarch software were used to analyze the composition of each BCR CDR3 sequence, including nucleotide, amino acid (AA), count (reads), frequency count (%), CDR3 length, the V-J rearrangement of the CDR3 repertoire, the proportion and frequency of unique CDR3 sequences, CDR3 repertoire clonality, CDR3 amino acid length, CDR3 amino acid usage, V deletion and J deletion, and dominant V-J combination gene segments were also calculated in different samples.To assess the clone frequency of CDR3 region, the inverse Shannon index was performed.
In addition, we conducted an analysis of the cleavage/insertion rate and average length of V, D, and J genes in the CDR region using Excel software.The cleavage rate of V sequences was calculated by summing cleavage events for each unique V sequence in the raw data of a sample and dividing the total by the number of unique V sequences.Similarly, the insertion rate of V sequences was determined by summing insertion events for each unique V sequence and dividing by the number of unique V sequences.This methodology was then applied to calculate both cleavage and insertion rates for J sequences.To ascertain the average length of V, D, and J genes within the CDR region, pseudogenes were excluded, and the total length of annotated V, D, and J genes (measured from the TGT-end) in the CDR region was summed.This sum was then divided by the number of annotated V genes, excluding pseudogenes.For the average insertion rate in CDR3, the number of inserted amino acids in V and D (or D and J) genes was summed across various CDR3 sequences.Subsequently, this total was divided by the number of distinct CDR3 sequence types.

Statistical analysis and graphing
R package "ggplot2, " "Venn Diagram, " and GraphPad Prism (version 5) were used to plot the figures.Data analysis was performed by R studio (v3.3.3) and GraphPad Prism (version 5) software.P-values were calculated with the aid of the t test.P < 0.05 was considered statistically significant.
IGHV genes of three bat species were classified and named (Table S3).The 41 IGHV genes of RF were divided into six gene families, of which only one pseudogene was a monogenic family, and the other five were polygenic families.The 81 IGHV genes of PD and the 57 IGHV genes of PP were divided into eight polygenic families, respectively, and each contained two pseudogene families.The number of non-functional V genes of the three bat species were 6 (RF), 33 (PD), and 23 (PP), respectively (Table S4).
Comparing the bat IGHV genes with more than 20 species in the IMGT database, bats showed great genetic differences with other species in the number of IGHV families, functional genes, pseudogenes, ORFs, and the composition of polygenic families.However, as shown in the phylogenetic tree (Fig. 1D through F), the IGHV genes of the three bats were divided into three clans: clan I, clan II, and clan III, which were similar to those of mammals such as human and mouse.We further analyzed the evolutionary relationship of V genes between bat and human, pike, and cow and found no species with significant convergence in bats.Compared to carnivores, primates, and artiodactyls, bats exhibited distinct variations in the quantity and family distribution of IGHV genes (Fig. S3).

Nomenclature and amino acid composition of IGHJ gene
Six, seven, and six IGHJ genes were identified in IGH loci of RF, PD and PP, respectively, and all of them were functional genes.The amino acid sequence alignment showed that the number and characteristics of the IGHJ genes of the three bat species were consistent with those of human beings.The J genes of the three bats had conserved WGQG and VTVS structures except for one amino acid changed in IGHJ2 and IGHJ4 of the PP (Fig. 2A).

The characteristics of bat 12/23 RSS
A total of 23 RSS are located behind the IGHV genes, and each RSS contains a hep tamer and a ninomer.All the heptamer sequences of three bat species were relatively conserved (Fig. 2B), while the nucleotides at the fourth position of the ninomer were diverse.Also, 12 RSS were located on front of IGHJ genes and before and after IGHD genes, and each RSS contained a heptamer and a ninomer.The GTG nucleotides at the last three positions of heptamer and the TTTTT nucleotides at positions 3-7 of ninomer in the IGHJ pre-12 RSS sequence of the three bats were relatively conserved.For the pre-12 RSS of IGHD genes, the relatively conserved nucleotide sites were G at position 8 in the ninomer, CA at positions 1 and 2, and TG at positions 6 and 7 in the heptamer in the three bat species (Fig. 2C).These conserved sites had not undergone any mutations in the three bats.The conserved nucleotide of the post-12 RSS of IGHD genes in the three bat species was the third C in heptamer and the last four AACC in the ninomer.

Construction of RA IGH CDR3 repertoire
As expected, the construction of IGHM, IGHG, IGHA, and IGHE showed obvious peaks at the 1,500 bp, suggesting that the constructions were successful.Although there were variations in the total unique IGH CDR3 sequences and each subclass's CDR3 sequences among the three RA samples (Table 1), all sequences met the analysis criteria for CDR3 sequences (unique clone sequence/total functional sequence <10%).Moreover, quality control was performed on the IgG of each bat by designing two sets of primers.The number of sequences obtained was identical, and the common high-fre quency sequences proportion in each sample reached more than 52%.Interestingly, the sequence composition of RA IGH was high homology with that of the annotated RF, and only a few V and J gene families were homologous with that of the annotated PD and PP (Fig. 3).

The comparison of RA IGH CDR3 subclass
The IGH CDR3 sequences from human and mouse were also included in this study for a comparative analysis, aiming to contrast the characteristics of IGH CDR3 between bats and other species (Table S5).The IGH CDR3 subclass with the largest number of sequences in RA was IgG, followed by IgM, IgA, and IgE.At the transcriptome level in humans and mice, IgM exhibited the highest representation, followed by IgA and IgG.This trend aligns with findings from published articles and databases sourced from healthy human and mouse populations.

V/J access and pairing of IGH CDR3
The V and J usage in the IGH CDR3 of RA, humans, and mice is illustrated in Fig. 4A.The three species were biased toward IGHV1.Bats and people also favor IGHV4, and IGHV3 was only available in bats at high frequencies.The frequency of other IGHV gene families in the three species was very low.IGHJ4 appeared frequently in all three species, and the frequency of IGHJ family access in bats was almost the same as that in humans, in descending order of access frequency, were IGHJ4, IGHJ6, IGHJ5, IGHJ3, IGHJ1, and IGHJ2.Mice had no preference for the IGHJ families.Moreover, the access trend of V and J was consistent with shared data (Fig. S4A), and the V and J utilization of IGH subclass (IGM and IGE) was also similar to the overall (Fig. S5A and B).V/J pairing of RA was identical to that of human and mouse (Fig. 4B through D; Fig. S4B and S6), in which IGHV1-IGHJ4 pairing with high frequency was detected.

Length distribution of IGH CDR3
The CDR3 length of IgA, IgG, and IgM was bell shaped (Fig. 4E).Bat was centered on 13, 15, 14 AA, mouse was centered on 15, 15, 13 AA, and human was centered on 16, 17, 16 AA.The difference of the composition of V, D, and J (Fig. 4F) and deletion/insertion (Fig. 4G) of CDR3 regions in bats, humans, and mice were also carried out.The deletion and insertion of bats at V3' and J5' ends were higher than those of mice and humans.The terminal length of the V gene in the CDR3 region was the longest among the three bat species studied, while the D gene and the J gene exhibited the shortest lengths.In addition, the IgG showed a longer AA distribution than IgA and IgM in all three species.The length distribution and AA access of shared IGH CDR3 data also match the above results (Fig. S4C).

The clones of IGH CDR3 repertoire
Fewer than 100 CDR3 clones were defined as rare clones, and the proportion of rare clones in bats was significantly lower than that in humans and mice (Fig. 5A and B).The cloning frequencies of human and mouse were similar to the whole, while bats showed high individual differences (Fig. S7A and B).The Shannon index suggests that the IGH CDR3 repertoire diversity in bats was comparatively lower than that observed in humans and mice, yet this variance did not reach statistical significance (Fig. 5C).
The IGH CDR3 unique clones overlapped among bats were low, and the highest was that of mice (Fig. 5D).For IG subclass, the IGG unique clones overlapped among humans were most, and it was inconsistent with the total IGH.The shared unique clones of IGA and IGM were similar with the total IGH in the three species (Fig. S7C through E).Notably, clonotype tracking showed 14 shared clones between bats and mice (Fig. 5E).The AA intake of IG subclasses (IgG, IgA, IgM) in bats was highly consistent with that of humans and mice (Fig. 6A), with high-frequency intake of Y, G, A, R, W, and D. The IGH CDR3 motif showed specificity in the three species (Fig. S8), and we counted the top 10 motifs of each species and found that bats and mice have multiple same motifs with the high frequency (Fig. 6B), such as YFDYW, AMDYW, YAMDY, YYFDY, and YYAMD.There was only one intersection in the top 10 motifs of bats and humans, YFDYW.Moreover, the top 10 motifs of IG subclasses (IgG, IgA, IgM, and IGE) were also analyzed.IgA, IgG, and IgM have no significant special high-frequency motifs in the three species except that the order of motif frequency was slightly different (Fig. S9).In bats and mice, IgE exhibits distinct differences in specific high-frequency motifs (Fig. S10).We further analyzed the four subclasses of bats and found that IgA and IgE had multiple unique high-frequency motif, while IgM and IgG displayed almost the same high-frequency motifs.Moreover, each analyzable sequence contained a complete J region.The SHM of bat J region was significantly higher than that of human and mouse (Fig. 6C).

DISCUSSION
Bats are carriers of highly pathogenic viruses, which have caused massive damage to human health and pose a huge risk to the spread of viral diseases in the future (30).It is urgent to clarify the relationship between bat immune system and virus.Rabies virus (31) and Australian bat lyssavirus (32) induce clinical symptoms in bats, indicating an immune system response.We have finished the TR germline genes annotation of RF (25).In this study, we annotated the IGH germline genes of three bat species completely and explored the characteristics of bat's IGH CDR3 repertoire.Compared with bat T cell response, studying the mechanism of bat B cell response and antibody production will better elaborate the reason why bats coexist with virus.Early studies found that the intensity and duration of neutralizing antibody reaction of Eptesicus fuscus remained lower than that of Cavia porcellus and rabbits (33).The Artibeus jamaicensis experimen tally infected with Venezuelan encephalitis virus produced strong neutralizing antibody, but the detectable antibody response of P. discolor was slower and of lower magnitude and shorter duration than that of Artibeus (34).Neutralizing antibodies against Ebola virus (35), Hendra virus, and SARS-like coronavirus (8) were also detected in wild bats.These studies suggest that the annotation of bat Ig and its application to the study of the mechanism of bat B cell response will play a vital role in clarifying bat-specific immune response.The characteristics of BCR repertoire of human, mice, and other mammals are available and have applied in basic research and clinical diagnosis, while the composition and diversity of BCR libraries of bats are almost unknown.
This study presents the first comprehensive annotation of the IGH germline genes in RF, PD, and PP.The length of IGH heavy chain in most mammals recorded by IMGT is similar, but there are obvious differences among the three bats (430, 2,500, and 350 kb), which may be related to the long-distance distribution of IGHV genes in PD.The IGH germline genes of the three bats are highly homologous and also have high homology with the Egyptian rousette bats recently annotated using bacterial artificial chromosome (23).Comparing the germline genes of bats with those of human, mouse, cow, and dog, bats did not show significant homology with one of the species, but the V, D, and J genes on the IGH chain of the three bats showed a clustered arrangement of similar genes.Interestingly, there are 22 reverse IGHV in front of the D/J genes in the PD, which rarely occurs in the annotated IG and even TR genes.This abnormality of the PD and whether there is a similar arrangement in other bat species will be of great interest.
Myotis lucifugus, E. fuscus, Carollia perspicillata, and Cynopterus sphinx have 73, 20, 16, and 15 IGHV genes, respectively (21).In this study, 57, 41, and 81 IGHV genes were identified in PP, RF, and PD, respectively, and were different from Egyptian rousette bats (23).The variation observed in the number of V genes across different bat species suggests a distinction from other mammals.This disparity may indicate a significant expansion or diffusion of the IGHV gene within bats.We mapped the evolutionary tree of IGHV genes of three bats, which is consistent with the IGHV family of mammals such as human and mouse, and can be classified into clans I, II, and III, suggesting that the evolution of bats' IGHV genes is consistent with that of human and mouse.
Among the various species shared by IMGT, the proportion of IGHV pseudogenes are high, 63.3%, 35%, and 41.6% in human, pig, and mouse, respectively.In this study, 31.6%,39.5%, and 12.2% pseudogenes were found in PP, PD, and RF, respectively.Pseudogenes are slightly lower in bats but do not show significant difference compared to other species, suggesting that there is no significant difference in the process of IGHV gene becoming pseudogene by random and extensive mutation in different species.
The length range of IGHJ genes is generally 37-63 bp among the species recorded in the IMGT database.In this study, the length range of IGHJ genes of the three bat species is 45-65 bp, suggesting that bats are similar to other species in the length of IGHJ genes.We analyzed the IGHV and IGHJ sequences in three bat species and found that Cys23, Trp41, and Cys104 were highly conserved in the IGHV sequences, and IGHJ gene has highly identical amino acid-conserved sequences.In a total of 19 IGHJ genes, except for two genes which have one mutation, the other IGHJ sequences have WGQG and VTVS structures, which are basically consistent with that of other mammals recorded in the IMGT database, suggesting that the IGH genes of bats conform to the standard pattern of real mammals but are different from birds or protomammals.
RSS is one of the most critical components of adaptive immune evolution.RSS before and after V and J genes found in this study are classic RSS, and the conserved sites of RSS sequences of the three bats are similar, whether it is 7-mer or 9-mer.Compared to human and mouse, the conserved positions of nucleotides are basically the same, suggesting that the RSS sequences of bat annotated in this study are consistent with the classic RSS sequences of mammals, and have not changed greatly in the process of species evolution.However, whether bats have nonclassical RSS, such as spacer 12 ± 1 bp or 23 ± 1 bp, remains to be further explored.Some bat species may lose IgD isotype in evolution.The transcribe genes encoding IgA, IgG, IgM, and IgE subclasses were found in Cynopterus sphinx, Carollia perspicillata, M. lucifugus, E. fuscus, and two short-nosed fruit bat, but the IgD transcripts were only recovered from insectivorous bats and were comprised CH1, CH3, and two hinge exons (21).Moreover, no transcripts of IgD were detected in P. alecto (17).In this study, we analyzed the characteristics of the C-region of the IGH chain in RF, PD, and PP; similarly, there are four C-regions: IGHM, IGHG, IGHE, and IGHA but no IGHD.
Many evidences show that different bat species have common ancestors in the evolution (18,30,36).According to the high homology of the annotated IGH in RF, PD, and PP, we have established a method to analyze RA IGH CDR3 repertoire and made a preliminary comparison with that of human and mouse.
The sequencing for bat IGH CDR3 repertoire was successful.According to the IGH CDR3 sequence and isotypes composition of RA, we found a high homologous between RA and RF, suggesting that the adaptive immune response of bats can be studied at the level of "family." There are 18 families in bats, and the evolution of B cell IGH loci is quite different.The study of adaptive immune responses in bats may be more complex than other mammals.The IGH CDR3 isotype of bats differs significantly at the transcriptome level from human and mouse, such as the extremely low proportion of IgA.Moreover, the serum IgA of healthy bats was significantly lower than expected, suggesting that higher quantities of IgG in mucosal secretions may be compensation for this low abundance or lack of IgA (37).The characteristics of bats that differ from humans and mice in isotypes may be the reason for the bat special immune response.
The length of IGH CDR3 in each species is mainly caused by the differences of IGHV terminal, IGHD, IGHJ front-end, and insertions and deletions in the rearrangement.In general, the length of IGH CDR3 is positively correlated with body size, and B cells may also change CDR3 length during self-tolerance selection.In our previous study, we found that the shear of V3' and J5' ends of human in IGH CDR3 is higher than that of mouse (38).However, the results of this study are opposite, and the consistency of bat and mouse is higher than that of human, suggested that the insertion and deletion of IGH may be more frequent in the development and tolerance of B cells in bat and mouse species than in human beings.
The clonality and diversity of IGH CDR3 repertoire are related to the intensity and breadth of adaptive immune response.In the context of RA, there is a low prevalence of rare clones and a high prevalence of ultra-high clones, demonstrating stark differences from observations in human and mouse immune responses.The possible reason is that bats show a strong B cell response to a small number of antigens, but the breadth of response to antigens is lower than that of humans and mice.Certainly, small samples size is a defect of this study, and a consistent sequencing depth is also needed to further explore the diversity of IGH CDR3 repertoire in bats.
In general, there are abundant common T cell clones and B cell clones between different individuals of human or mouse (39).However, the source and effect of these shared clones are not clear at present.Among the three species in this study, mice have the highest rate of shared clones, which may be consistent with the genetic background of Balb/c mice.Notably, bats and mice share some clones, suggesting that the character istics of overlapping clones can be used to explore the response differences between bats and other species.
The AA uptake in IGH CDR3 region was highly consistent among bat, human, and mouse in this study, suggesting that the IGH CDR3 region in mammals has great commonality in composition, structure, and antigenic determinant combination.In P. lecto, the IGHV region was rich in Arg and Ala, and the amount of Tyr is small, which may lead to the evolution of antibodies in bats, whose polymerizing reactivity is low and only weakly associated with antigens (20).However, the Tyr content of the three RA in this study is almost the same as that of human and mouse.
The composition and conformation of motif in CDR3 region are important for B/T cells binding to the corresponding antigen.Among the top 10 high-frequency motifs in IGH CDR3, bats and mice showed higher similarity, with five same motifs.The high-frequency motifs of each subclass are basically consistent with the total IG, but bat IGE and mouse IGE showed less common high-frequency motifs than other subclasses.Bat IGA and IGE showed multiple unique high-frequency motifs, while IGM and IGG show almost the same high-frequency motifs, revealing that the classification conversion mechanism of IgE and IgA in bats is inconsistent.
Different bat species may have different SHM.In black flying fox (20) and fruit-eating bat (40), only few SHM were found, while significantly more mutations were detected in RA IGH J sequences compared to human and mouse in this study.Bat's higher SHM may be closely related to its tolerance or response to the virus, which is a breakthrough point to further explore the mechanism of bat B cell response and whether it produces high-affinity antibodies to respond to the virus through SHM.
Many studies on bats' high heart rate and metabolism, long life span, low tumor incidence, and asymptomatic ability to carry and transmit highly pathogenic viruses have been carried out, including the cell lines establishment of pteropid bat (41), the preparation of polyclonal antibodies of bat IgG, IgM, and IgA (37), the sequencing and assembly of bat genome (24), the establishment of the Bat1K genome consortium unites (42), etc.These efforts will provide the basis and technology for elucidating the innate and adaptive immune responses of bats.This study displayed the IGH germline genes of three bat species at the chromosome level and analyzed the characteristics of bat IGH CDR3 repertoire, which provided a new technology and basic data for studying the IGH characteristics and the mechanism of antiviral immune response in bat.

FIG 1
FIG 1 Structure of bat IGH loci and the three clan of V genes.(A) The IGH locus of Rhinolophus ferrumequinum.(B) The IGH locus of Phyllostomus discolor.(C) The IGH locus of Pipistrellus pipistrellus.(D) The phylogenetic tree of Pipistrellus pipistrellus.(E) The phylogenetic tree of Phyllostomus discolor.(F) The phylogenetic tree of Rhinolophus ferrumequinum.Green segment is IGHV gene; light blue segment is IGHD gene; yellow segment is IGHJ gene; dark blue segment is IGHC gene.

FIG 2
FIG 2 The structure of IGHJ genes and RSS sequence.(A) Sequence comparison of all IGHJ genes in three bat species.(B) RSS characteristics of V and J genes in bat, human, and mouse.(C) RSS characteristics of D genes in bat, human, and mouse.

FIG 3
FIG 3 Alignment of Rhinolophus affinis sequence with that of annotated three bat species germline genes.(A) Partial sequences of Rhinolophus affinis.(B) The proportion of V, J, and C genes of three annotated bats in the Rhinolophus affinis.(C) The comparison of C region between Rhinolophus ferrumequinum and Rhinolophus affinis.(D) Sequence comparison of Cytb gene in four bats.All the sequences of Rhinolophus affinis were obtained by sequencing.

TABLE 1
The sequencing statistics of Rhinolophus affinis IGH CDR3