The polymorphism analysis and therapy vaccine target epitopes screening of HPV-35 E6 E7 among the threaten α-9 HPV in Sichuan area

High-risk human papillomavirus (HR-HPV) persistent infection is closely associated with the development of cervical cancer and squamous intraepithelial lesions (SIL). The α-9 HPVs, which are predominantly HR-HPV types, account for 75% of HR-HPV infections in Sichuan. The oncoproteins E6 and E7 of HPV play a crucial role in tumor initiation and progression. Notably, HPV-35 is the only HR-HPV type within the α-9 genus that is not included in the nine-valent HPV prophylactic vaccine. Cervical cell samples were collected in Sichuan for HPV detection and genotyping. Among the 406 HPV-positive samples, 31 were positive for HPV-35, and 24 HPV-35 E6 and 26 E7 sequences were successfully amplified and sequenced. Five nucleotide mutations were detected in E6 and three in E7; of these, T232C and T434G in E6 (W78R, I145R) and C67T and G84T in E7 (H23Y, L28F) were non-synonymous mutations. PAML 4.8 was used to detect positive selection sites in HPV-35 E6 and E7; the positive selection site identified in E6 was W78R. Phyre2 was used to predict and analyze protein structures, and W78R influenced the protein structure. IEDB was used to screen epitopes as therapeutic vaccine targets for HPV-35 infection, and five HPV-35 E6 and three HPV-35 E7 candidate epitopes were obtained. The most promising peptides for therapeutic vaccine design were 79-91 YRYSVYGETLEKQ, 45-60 FACYDLCIVYREGQPY, and 124-135 RFHNIGGRWTGR of E6, and 3-19 GEITTLQDYVLDLEPEA, 38-47 TIDGPAGQAK, and 70-88 VQSTHIDIRKLEDLLMGTF of E7. W78R mainly decreased epitope affinity. Conclusions: Amino acid substitutions at the positive selection sites of the HPV-35 E6 and E7 genes were found to influence protein structure and to decrease the overall affinity of antigen epitopes. This observation aligns with the evolutionary significance of positive selection sites, which may confer advantages to the virus by making infected cells more difficult for the immune system to detect, thereby enhancing HPV's adaptability to the host environment.
The polymorphism analysis of HPV-35 E6 and E7 contributes to the enrichment of α-9 HPV data in Sichuan, China, which is instrumental in improving the effectiveness of clinical detection. Furthermore, these findings provide a relevant theoretical foundation for the prevention and treatment of HPV-related diseases.

Supplementary Information: The online version contains supplementary material available at 10.1186/s12985-024-02357-3.


Introduction
Cervical cancer, like liver cancer, is a malignancy associated with viral infection; 99.7% of cases are linked to persistent infection with HR-HPVs. The α-9 genus HPVs predominantly comprise mucosal HR-HPVs and represent approximately 75% of HR-HPV infections in Sichuan. This genus includes HPV types 16, 31, 33, 35, 52, and 58. Notably, HPV-35 is the only HR α-9 HPV that has been omitted from the nine-valent prophylactic vaccine, and it has not been extensively studied. The scarcity of data on HPV-35 hinders our understanding of its characteristics. Since HPV characteristics can vary by geography and ethnicity, it is particularly important to conduct research in Sichuan to provide a theoretical basis for the prevention and treatment of HPV-35 infections.
HR-HPVs are known to induce the occurrence and progression of cervical cancer and other malignant tumors primarily through the action of their E6 and E7 oncoproteins [1]. Protein structure determines function. HPV E6 consists of an N-terminal region (residues 1-36), a C-terminal region (residues 147-158), and two zinc finger domains (residues 37-73 and 110-146). These zinc finger motifs are characterized by a consensus CXXC-(29x)-CXXC amino acid sequence and are critical for binding zinc ions, which are essential for various cellular processes, including RNA transcription, transformation, cell immortalization, and protein-protein interactions with host cell proteins. Moreover, the two zinc finger domains fold into a deep pocket, which, through their common alpha-helical structure, is involved in the ubiquitination and degradation of the tumor suppressor protein p53 [2-7]. Distinct from low-risk HPVs (LR-HPVs), the carboxyl terminal of HR-HPV E6 contains a highly conserved and unique amino acid sequence that serves as a PDZ-binding motif. This motif is known to interact with PDZ domain-containing proteins, which are involved in intracellular cell adhesion, cell junction formation, and cell polarization. The binding of E6 to these specific proteins can confer a more aggressive phenotype on cells or maintain high levels of protein expression [8,9]. The PDZ-binding motif (residues 145-149) is a key target for the E6 protein's involvement in cellular transformation, and the carboxy-terminal half, principally responsible for p53 binding, appears to be crucial for the induction of cellular immortality [10]. HPV E7, on the other hand, promotes cell immortalization and inhibits apoptosis in infected cells by binding to the retinoblastoma protein (pRB). The E7 protein is divided into three conserved regions (CR1: 1-15, CR2: 16-37, CR3: 38-98). The CR2 region contains a strictly conserved LXCXE sequence, with the amino acids at positions 21-29 comprising the pRB-binding region [11]. The CR2 region exhibits a high affinity for the RB protein and is considered the active region of the E7 protein. Additionally, the LXCXE sequence serves as a phosphorylation site for casein kinase (CK) II [12], further modulating the protein's interactions and functions within the host cell. The CR3 region of HPV E7 contains two CXXC sequences separated by 29 and 30 amino acids, respectively, which constitute the zinc finger binding domain. The amino acids at positions 58, 61, 91, and 94 of the HPV E7 protein serve as the Zn-binding sites. The zinc finger binding domain is involved in the interaction with a variety of functional proteins [13]. Substitutions within this region have the potential to alter the conformation, and consequently the function, of the E7 protein to some degree.
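The consensus patterns quoted above (CXXC-(29x)-CXXC for the zinc finger and LXCXE for the pRB-binding motif) can be expressed as regular expressions. The following is a minimal sketch rather than the method used in this study; the two sequences are synthetic toy strings, not HPV-35 E6 or E7, and the helper name `find_motif` is hypothetical.

```python
import re

# Consensus patterns quoted in the text; "." stands for any residue (X).
ZINC_FINGER = re.compile(r"C..C.{29}C..C")  # CXXC-(29x)-CXXC
LXCXE = re.compile(r"L.C.E")                # pRB-binding motif in E7 CR2


def find_motif(pattern, protein):
    """Return (1-based start, 1-based end, matched text) for each hit."""
    return [(m.start() + 1, m.end(), m.group()) for m in pattern.finditer(protein)]


# Synthetic toy sequences for illustration only (assumption: NOT real HPV-35 data).
toy_zinc_finger = "MA" + "CAAC" + "G" * 29 + "CTTC" + "K"
toy_cr2 = "DLYCYEQ"

zf_hits = find_motif(ZINC_FINGER, toy_zinc_finger)
rb_hits = find_motif(LXCXE, toy_cr2)

for start, end, _ in zf_hits:
    print(f"zinc finger pattern at residues {start}-{end} ({end - start + 1} residues)")
for start, end, text in rb_hits:
    print(f"LXCXE pattern at residues {start}-{end}: {text}")
```

A full CXXC-(29x)-CXXC match spans 37 residues, which agrees with the length of the E6 zinc finger ranges given above (37-73 and 110-146).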
Human leukocyte antigens (HLAs) are molecules present on the surface of all human cells that play a crucial role in distinguishing immune responses by recognizing autoantigen epitope targets and either stimulating or modulating these responses. This mechanism enables the body to resist and eliminate HPV infections. Currently, there is no specific drug available to cure HPV infection; instead, the primary clearance mechanism is the immune system, with T-cell immunity being the key component. In HPV infection-related immune responses, CD8+ cytotoxic T lymphocytes (CTLs) and CD4+ helper T (Th) lymphocytes are the primary functional cells. Activated and proliferating CTLs can directly kill tumor cells and secrete cytokines to combat tumors, making them the primary effectors against HPV infection and cervical cancer cells [14,15]. Activated CD4+ Th cells can produce lymphocytokines that enhance the function of CTLs and natural killer (NK) cells, activate antigen-presenting cells (APCs), and assist in the elimination of HPV-infected and cervical cancer cells [16]. The E6 and E7 proteins are early proteins, distinct from the targets of preventive vaccines such as L1 and L2, which are expressed in large quantities following infection to activate the immune response. These oncoproteins are considered important targets for therapeutic vaccines. T-cell epitopes of E6 and E7 have been found in the C-terminal segment and the zinc finger region, and HLA-A epitopes may also be present in the N-terminal.
The E6 and E7 oncoproteins of HR-HPVs are critical for the induction of cervical cancer. These proteins not only possess transforming and carcinogenic capabilities but also trans-activate viral genes and cellular gene transcription. Upon HPV infection, the immune system recognizes, processes, and presents foreign proteins. Non-synonymous nucleotide mutations can lead to amino acid substitutions in the protein sequence, altering the physical and chemical properties, structure, and immunogenicity of the protein. These changes can affect the immune response to the virus and its pathogenicity.
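How a nucleotide position maps onto an amino acid position can be checked with simple arithmetic. The sketch below assumes that nucleotide positions are counted from the first base of each open reading frame (an assumption, not stated in the text) and uses the four non-synonymous changes reported in this study; the function names are hypothetical.

```python
def codon_number(nt_position: int) -> int:
    """1-based codon index that contains a 1-based coding-sequence position."""
    return (nt_position - 1) // 3 + 1


def codon_offset(nt_position: int) -> int:
    """Position of the base inside its codon: 1, 2, or 3."""
    return (nt_position - 1) % 3 + 1


def label_number(label: str) -> int:
    """Extract the number from a label such as 'T232C' or 'W78R'."""
    return int(label[1:-1])


# Nucleotide change -> amino acid substitution, as reported in this study.
REPORTED = {"T232C": "W78R", "T434G": "I145R", "C67T": "H23Y", "G84T": "L28F"}

for nt_label, aa_label in REPORTED.items():
    pos = label_number(nt_label)
    print(f"{nt_label}: codon {codon_number(pos)}, base {codon_offset(pos)} of the codon "
          f"(reported substitution: {aa_label})")
```

Under that numbering assumption, each nucleotide position falls in the codon whose number matches the reported residue (78, 145, 23, and 28).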
Positively selected mutations can result in a stable increase in gene frequency, enhancing the adaptability of the virus to its environment. This adaptability can lead to changes in infection rates, allowing the variants that are better suited to the natural environment to survive and thrive. Examples include the specific mutations L83V (L90V) and D25E (D32E) in HPV-16 E6 [17].
Epidemiological data indicate that HPV-35 is one of the most prevalent genotypes in cervical cancer cases and is associated with a risk of cervical cancer second only to that of HPV-16, making it a significant HR-HPV type. Although research data on HPV-35 are available from North America, Canada, and other places, relevant research on HPV-35 is lacking in China, especially in Sichuan. The types of HPV and the prevalence of specific mutations can vary between regions and ethnic groups. Consequently, it is imperative to investigate the polymorphism of E6 and E7, positive selection sites, epitopes, and protein structures of HPV-35 in Sichuan. Such research will aid in identifying therapeutic vaccine targets and provide crucial data for the prevention and control of HPV-35 infection-related diseases within this region.

Samples resource
Cervical cell specimens were randomly collected from January 2019 to December 2022 at Sichuan Provincial Maternity and Child Health Care Hospital, the Affiliated Hospital of the Sichuan Reproductive Health Research Center, Chengdu Jinsha hospital, the Infertility Hospital Affiliated to Chengdu Medical College, and Chengdu Song zi niao Sterility Hospital. The specimens were stored in antiseptic buffer at -20 ℃. Prior to sample collection, written informed consent was obtained from all patients or their legal guardians. The study was approved by the Medical Ethics Committee of the Sichuan Provincial Maternity and Child Health Care Hospital, and stringent measures were implemented to ensure patient privacy and confidentiality.

Genomic DNA extraction and HPV typing
HPV DNA was extracted and genotyped using the Human Papillomavirus Genotyping Test Kit (Hailes Bio, Ningbo, China), in accordance with the manufacturer's instructions.

PCR amplification
Primers specific to the HPV-35 E6 and E7 gene regions were designed with PRIMER version 5.0 software and validated with the NCBI (National Center for Biotechnology Information) Primer-BLAST tool, based on the reference sequence of HPV-35 (GenBank No: HQ537730.1). The primers were synthesized by TSINGKE (Chengdu, China). The primer sequences and the composition of the polymerase chain reaction (PCR) mix are detailed in Supplementary Table S1. The PCR products were visualized by gel electrophoresis in a 2% agarose gel (Sangon Biotech Co., Ltd.), and purified PCR products were subjected to bidirectional DNA sequencing by TSINGKE (Chengdu, China) at least twice to ensure accuracy.

HPV-35 E6, E7 optimal T-cell antigen epitopes selection
Based on the major histocompatibility complex database (dbMHC), HLA alleles with an average frequency of over 5% in the Chinese population were designated as high-frequency alleles for the selection of T-cell epitopes in Sichuan. This selection was performed with the Immune Epitope Database and Analysis Resource (IEDB) (http://www.iedb.org/) (Supplementary Table S1). For potential vaccine design targets, epitopes with a lower peptide percentile rank (PR) were considered to have better affinity. Therefore, HLA-I epitopes with PR < 1.0 and HLA-II epitopes with PR < 5.0 were chosen as potential vaccine design targets [27].
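The percentile-rank cut-offs described above (PR < 1.0 for HLA-I and PR < 5.0 for HLA-II) amount to a simple filter over prediction results. The following is a minimal sketch under stated assumptions: the `Prediction` record, its field names, and the example rows are hypothetical placeholders with made-up values, not IEDB output from this study and not the IEDB export format.

```python
from dataclasses import dataclass

# Thresholds stated in the text.
HLA_I_MAX_PR = 1.0
HLA_II_MAX_PR = 5.0


@dataclass
class Prediction:
    peptide: str            # placeholder peptide label
    allele: str             # placeholder allele label
    hla_class: str          # "I" or "II"
    percentile_rank: float  # lower rank is taken to mean better affinity


def passes_threshold(p: Prediction) -> bool:
    """Apply the class-specific percentile-rank cut-off (strictly less than)."""
    limit = HLA_I_MAX_PR if p.hla_class == "I" else HLA_II_MAX_PR
    return p.percentile_rank < limit


# Made-up rows for illustration only.
predictions = [
    Prediction("PEPTIDE_A", "ALLELE_I_1", "I", 0.4),
    Prediction("PEPTIDE_B", "ALLELE_I_1", "I", 1.0),    # rejected: not below 1.0
    Prediction("PEPTIDE_C", "ALLELE_II_1", "II", 3.2),
    Prediction("PEPTIDE_D", "ALLELE_II_1", "II", 7.5),  # rejected: above 5.0
]

selected = [p for p in predictions if passes_threshold(p)]
# Sort so the lowest percentile rank (taken as best affinity) comes first.
selected.sort(key=lambda p: p.percentile_rank)
for p in selected:
    print(p.peptide, p.hla_class, p.percentile_rank)
```

The strict inequality follows the wording "PR < 1.0" and "PR < 5.0"; a rank exactly on the boundary is excluded in this sketch.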

Evolutionary and positive selection sites analysis of HPV-35 E6, E7
The phylogenetic tree constructed from the HPV-35 sample sequences, reference sequences, and known sequences is presented in Fig. 1. In this study, no novel mutation emerged, and all sequenced mutant strains were identified. Notably, the sample sequences QV31639 and QV29782 exhibited the highest homology to the reference sequence. Similarly, the sequences HPV-35 E6/E7 01 and HPV-35 E6/E7 10 demonstrated the highest homology to the reference sequence, suggesting that these mutant strains may have originated from reference sequences. The HPV-35 A1 and A2 line variants previously identified by Chen Zigui et al. were not included in this analysis. However, the HPV E610 and HPV E620 samples were sequenced and are numbered in Fig. 1. These sequences have been deposited in GenBank under the names HPV-35 E6 01 to HPV-35 E6 24 and HPV-35 E7 01 to HPV-35 E7 24.
Analysis with the codeml program, using the Naive Empirical Bayes (NEB) and Bayes Empirical Bayes (BEB) methods, identified W78R as the positive selection site of HPV-35 E6.

Non-synonymous mutation effects on alternative splice site, structure of HPV-35 E6, E7 RNA
In the HPV-35 E6 gene, the nucleotide mutation T232C (W78R) changed the Intron GC, Alt./Cryptic, Constitutive, and Confidence values of the sequence splice site at 220 bp, and the Score, Alt./Cryptic, Constitutive, and Confidence values of the sequence splice sites at 233 bp and 244 bp. T434G (I145R) changed the Alt./Cryptic, Constitutive, and Confidence values of the sequence splice site at 220 bp and the Constitutive and Confidence values of the sequence splice site at 233 bp. T232C and T434G indicated that c.T272G mutation influences the RNA splice site (Table 2). In HPV-35 E7, G84T (L28F) created two new sequence splice sites at 93 bp and 98 bp.

The effects of non-synonymous mutation on HPV-35 E6, E7 protein structure, structural stability, and hydrophilicity
In the E6 protein, the mutations W78R and I145R were not situated within an alpha-helix or beta-sheet. W78R decreased the lowest free energy of the mutant protein from -80.20 to -82.10 kcal/mol, indicating enhanced hydrophilicity but decreased protein structural stability. The mutation I145R led to an increase in the lowest free energy from -40.20 to -38.50 kcal/mol, again suggesting enhanced hydrophilicity and reduced protein structural stability (Fig. 3 and Table 1).
In the E7 protein, the mutation H23Y caused the disappearance of the alpha-helix located between the 15th and 89th amino acids, which is remote from the site of substitution. This mutation increased the lowest free energy of the mutant protein from -58.40 to -56.60 kcal/mol, indicating decreased hydrophilicity and increased protein structural stability. Conversely, the mutation L28F resulted in a decrease in the lowest free energy of the mutant protein from -70.00 to -71.10 kcal/mol, suggesting enhanced hydrophilicity and decreased protein structural stability (Fig. 3 and Table 1).

Fig. 2 The effect of HPV-35 E6, E7 polymorphism on RNA structure
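The direction and size of each free-energy change quoted for the E6 and E7 substitutions can be tabulated directly from the reported pairs of values. The sketch below assumes that the first value of each pair belongs to the reference sequence and the second to the mutant (the text gives them as "from ... to ..."); it only does the subtraction and does not reproduce the folding prediction itself.

```python
# (reference, mutant) lowest free energy in kcal/mol, as quoted in the text.
LOWEST_FREE_ENERGY = {
    "E6 W78R": (-80.20, -82.10),
    "E6 I145R": (-40.20, -38.50),
    "E7 H23Y": (-58.40, -56.60),
    "E7 L28F": (-70.00, -71.10),
}


def energy_change(reference: float, mutant: float) -> float:
    """Mutant minus reference, rounded to two decimals (kcal/mol)."""
    return round(mutant - reference, 2)


def direction(change: float) -> str:
    """Describe the sign of the change in plain words."""
    if change < 0:
        return "decrease"
    if change > 0:
        return "increase"
    return "no change"


changes = {name: energy_change(*pair) for name, pair in LOWEST_FREE_ENERGY.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.2f} kcal/mol ({direction(change)})")
```

A negative change means the mutant value is lower (more negative) than the reference value.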
The HLA-II epitope binding core is a key region for CD4+ T-cell recognition, binding, and activation, residing in the peptide-binding groove of HLA-II molecules. Both CD4+ and CD8+ T cells participate in the HPV-related immune response, and CD8+ T cells are considered the primary effectors in eliminating HPV-infected and cervical cancer cells. Consequently, optimal epitopes, that is, potential MHC-I epitopes that contain at least one core region of an HLA-II-binding peptide, are valuable. These epitopes can potentially activate both CD4+ and CD8+ T cells, thereby enhancing the breadth and effectiveness of the immune response against HPV. Finally, the optimal epitopes were selected as follows:
1. Epitopes that do not contain mutation sites were selected.
2. Epitopes located in the region common to the HLA-I and HLA-II predicted epitopes were selected.
3. Epitopes that contain at least one HLA-II binding core region peptide were selected.
4. Epitopes that bind more HLA-I alleles and have a lower percentile rank were preferred.
5. The integrated HLA-I epitope peptides in the region common to the HLA-I and HLA-II predicted epitopes are also excellent vaccine target peptides and are called optimal peptides.
In HPV-35 E6, five optimal epitopes were selected: 52-60 IVYREGQPY, 79-88 YRYSVYGETL, 79-90 YRYSVYGETLEK, 80-90 RYSVYGETLEK, and 81-90 YSVYGETLEK. The optimal peptides were 79-91 YRYSVYGETLEKQ, 45-60 FACYDLCIVYREGQPY, and 124-135 RFHNIGGRWTGR. In HPV-35 E7, the optimal epitopes were 3-13 GEITTLQDYVL, 5-13 ITTLQDYVL, and 11-19 YVLDLEPEA. The optimal peptides were 3-19 GEITTLQDYVLDLEPEA, 38-47 TIDGPAGQAK, and 70-88 VQSTHIDIRKLEDLLMGTF.

Note: * is one of the constituent symbols of alleles

Fig. 4 The effect of non-synonymous mutation on HPV-35 E6, E7 protein structure
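The first selection step (epitopes must not contain a mutation site) can be checked mechanically against the residue ranges of the optimal epitopes. The sketch below is illustrative only: it covers step 1 plus a sanity check that each sequence length matches its residue range, and it does not implement the HLA-I/HLA-II overlap, binding-core, or allele-count steps. The function names are hypothetical.

```python
# Non-synonymous substitution sites reported in this study (residue numbers).
MUTATION_SITES = {"E6": [78, 145], "E7": [23, 28]}

# (start, end, sequence) of the optimal epitopes listed in the text.
OPTIMAL_EPITOPES = {
    "E6": [
        (52, 60, "IVYREGQPY"),
        (79, 88, "YRYSVYGETL"),
        (79, 90, "YRYSVYGETLEK"),
        (80, 90, "RYSVYGETLEK"),
        (81, 90, "YSVYGETLEK"),
    ],
    "E7": [
        (3, 13, "GEITTLQDYVL"),
        (5, 13, "ITTLQDYVL"),
        (11, 19, "YVLDLEPEA"),
    ],
}


def length_matches(start: int, end: int, sequence: str) -> bool:
    """Check that an inclusive residue range has as many residues as the sequence."""
    return end - start + 1 == len(sequence)


def is_mutation_free(protein: str, start: int, end: int) -> bool:
    """Step 1: the inclusive range must not cover any reported substitution site."""
    return not any(start <= site <= end for site in MUTATION_SITES[protein])


results = {
    (protein, start, end): length_matches(start, end, seq) and is_mutation_free(protein, start, end)
    for protein, epitopes in OPTIMAL_EPITOPES.items()
    for start, end, seq in epitopes
}
for (protein, start, end), ok in results.items():
    print(f"{protein} {start}-{end}: {'passes' if ok else 'fails'} step 1")
```

An E6 range that started at residue 78 instead of 79 would fail this check, because it would cover the W78R site.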

Discussion
The paucity of data on HPV-35 hinders our understanding of this virus. Because HPV characteristics vary across geographical regions and ethnic groups, research in Sichuan is needed to establish a theoretical framework for the prevention and treatment of HPV-35. HPV-35 is the only HR α-9 genus HPV that is understudied, largely due to its historically low prevalence. International research has identified HPV-35 as a significant risk factor for cervical cancer, ranking it second only to HPV-16 in terms of pathogenicity, with a threat to human health comparable to that of HPV-16. The limited data on HPV-35, coupled with its low infection rate, can lead to a tendency to overlook its risks, yet latent HPV-35 may pose an even greater threat than HPV-16. HPV-35 isolates are genetically related to HPV-31 and HPV-16, and they share a recent common ancestor with HPV-33 and HPV-58 isolates [28]. In our study, five nucleotide mutations were detected in E6 (two non-synonymous mutations, T232C (W78R) and T434G (I145R), and three synonymous mutations, T18C, T27C, and G262C) and three in E7 (two non-synonymous mutations, C67T (H23Y) and G84T (L28F), and one synonymous mutation, C114T). Owing to the limited sample size, some mutation sites may not have been identified.

The variants of HPV-35 E6 and E7 in Sichuan, China, are distinct and do not belong to the well-known A1 and A2 series. They exhibit strong regional characteristics. The homology between these variants and the reference sequences is notably higher for the E6 variants than for the E7 variants. Furthermore, the E7 variants appear to have a more distant evolutionary lineage. HPV-35 E6/E7 01 and HPV-35 E6/E7 10 were found to have the highest degree of homology with the reference sequence among the HPV-35 E6/E7 sample sequences. These isolates likely originated from a reference sequence, with HPV-35 E6/E7 01, HPV-35 E6/E7 10, and the A1 line belonging to the same clade, while the remaining sequences also grouped within one clade. As observed by Chen Zigui and Basto Diogo Lisboa et al., all variants of HPV-35 are highly conserved, display minimal intratype genetic diversity, and fail to meet the criteria for classification into multiple lineages. The highest infection rate for HPV-35 was found in the 56-years age bracket (42.9%) [29]. These findings suggest that the prevalence pattern may be influenced by the existence of another variant lineage of HPV-35 that could be present in isolated and/or unsampled populations. Intratype mutation within host cells may be infrequent, and the virus may have a longer incubation period, making it more challenging to detect. The mutant strains in the Sichuan area of China exhibit distinct regional characteristics, which should be taken into account when designing vaccines and probes specific to the Sichuan population. The mutation rate in the E6 gene was higher than that in E7 (78.94% vs. 52.63%). Consequently, the more stable E7 should be prioritized when designing probes for HPV-35.

Table 6 The T-cell optimal epitopes of HPV-35 E6, E7
HR-HPV E6 and E7 are critical oncoproteins that differ from those of LR-HPV. They are capable of interacting with the RB and p53 tumor suppressor proteins and several cell cycle regulatory proteins. Non-synonymous mutations in E6 and E7 can result in alterations to the amino acid composition, protein structure, and function. For instance, the mutations W78R and I145R in E6 do not reside within an alpha-helix or beta-sheet region. However, I145R is located in the highly conserved carboxyl terminal, which is involved in binding to PDZ domain-containing proteins that play roles in intracellular cell adhesion, cell linkage, and cell polarization. W78R is situated near a zinc finger and within the groove formed by the two zinc fingers, which is crucial for p53 binding. In E7, the mutations H23Y and L28F are not located within alpha-helix or beta-sheet regions either. The region spanning amino acids 16-37 of E7 exhibits a high affinity for the RB protein and is considered the active region of the E7 protein. Residues 21-29 constitute the pRB-binding region, with the strictly conserved LXCXE motif serving as the phosphorylation site of casein kinase (CK) II. Both H23Y and L28F are positioned within this functional region of E7. Additionally, H23Y induces a change from an alpha-helix to a coil conformation at positions 15 and 89, which are distant from the site of substitution. The structural change at the 89th amino acid caused by H23Y is in close proximity to amino acids 58, 61, 91, and 94 of HPV E7, which are zinc-binding sites involved in the binding of various functional proteins. The zinc fingers, the carboxyl terminal, and the PDZ domain-binding motif are essential structural and functional domains of E6 and E7. Substitutions in these regions can lead to differences in the binding abilities of these oncoproteins to host proteins such as p53 and pRB, which are considered key factors in their carcinogenic potential.

HPV E6 and E7, which are early proteins, are considered prime candidates for therapeutic vaccine targets. When these proteins are digested and presented to the body's immune system, they can activate an immune response that eliminates HPV infection and reduces the risk of HPV-related diseases. E6 and E7 T-cell epitopes have been found in the C-terminal segment and the zinc finger region, with HLA-A epitopes potentially also present in the N-terminal. Research has indicated that mutations can impact the effectiveness of HPV vaccines, particularly those designed to target specific epitopes, as efficacy may be compromised if mutations are not considered. Non-synonymous mutations can alter epitope characteristics. For instance, the W78R mutation in E6 has been shown to increase the affinity and number of HLA-I epitopes, potentially leading to the emergence of new and improved epitopes. Similarly, the I145R mutation enhances HLA-I epitope affinity and number. H23Y in E7 increases HLA-I epitope affinity as well. The E7 protein exhibits a greater number and affinity of epitopes compared to E6, suggesting that E7 may not be as readily recognized by host cells, and its carcinogenic potential may exceed that of E6. These hypotheses require confirmation in subsequent studies.

Positive selection sites play a crucial role in enhancing a species' adaptability to its environment by increasing the frequency of mutated genes [30]. Analysis with the PAML software package revealed that the HPV-35 E6 positive selection site W78R (mutation risk: 22/24) is a frequent non-synonymous mutation, indicating that W78R contributes to the adaptation of HPV-35 E6 and has become widely prevalent. The conformational and functional changes caused by this substitution may influence the virus's pathogenicity [31]. Specific intratypic mutations in HPV E6 can result in differences in viral infection ability and pathogenicity. For example, the positive selection site D32E of HPV-16 has been confirmed to be associated with the development of cervical cancer [17]. Similarly, W78R in HPV-35 E6, located in an active region, affects protein conformation and function and may reduce the immunogenicity of the peptide containing this site. Substitutions at positive selection sites can impact protein structure, decrease epitope affinity, make HPV-infected cells more difficult for the immune system to detect, enhance HPV's adaptability to the environment, improve the efficiency of virus infection, and potentially promote the development of cervical cancer.
This study represents the first investigation into the interplay between the protein structure, positive selection sites, epitopes, and pathogenicity of HPV-35 E6 and E7 in Sichuan Province. It aims to elucidate the relationship between HPV E6 and E7 and the varying abilities of HPV to infect, with the ultimate goal of advancing the development of therapeutic vaccines targeting HPV-35 E6 and E7 in Sichuan Province.

Conclusion
HPV-35 variants exhibit high conservation and can remain latent in host cells for extended periods. Consequently, the design of detection probes for HPV-35 should be tailored to enhance the detection rate. Early detection and treatment of HPV-35 infections carry significant clinical and social importance. The enrichment of HPV-35 polymorphism data will provide a theoretical foundation for the development of effective prevention and treatment strategies.

a Scores of the preprocessing models reflecting splice site strength, i.e. a PSSM for putative acceptor sites, and an MDD model for putative donor sites. Intron GC values correspond to 70 nt of the neighboring intron.
b Activations are output values of the backpropagation networks used for classification. High values for one class with low values of the other class imply a good classification. Confidence is a simple measure expressing the differences between output activations. Confidence ranges from zero (undecided) to one (perfect classification).

Table 1
The polymorphism and protein function effect analysis of HPV-35 E6 and E7
REF means nucleotide of this site in the reference sequence; Stability changes upon mutation from the protein sequence or structure; Window means Folding Window; Region means Local region; Substitution means Amino substitution

Table 2
Effect of non-synonymous mutation on RNA splicing pattern

Table 5
The HLA-I, HLA-II distribution analysis of HPV-35 E6 and E7