Unique LCR variations among lineages of HPV16, 18 and 45 isolates from women with normal cervical cytology in Ghana

In addition to being useful for classification, sequence variations of human Papillomavirus (HPV) genotypes have been implicated in differential oncogenic potential and a differential association with the different histological forms of invasive cervical cancer. These associations have also been indicated for HPV genotype lineages and sub-lineages. In order to better understand the potential implications of lineage variation in the occurrence of cervical cancers in Ghana, we studied the lineages of the three most prevalent HPV genotypes among women with normal cytology as baseline to further studies. Of previously collected self- and health personnel-collected cervical specimen, 54, which were positive for HPV16, 18 and 45, were selected and the long control region (LCR) of each HPV genotype was separately amplified by a nested PCR. DNA sequences of 41 isolates obtained with the forward and reverse primers by Sanger sequencing were analysed. Nucleotide sequence variations of the HPV16 genotypes were observed at 30 positions within the LCR (7460 – 7840). Of these, 19 were the known variations for the lineages B and C (African lineages), while the other 11 positions had variations unique to the HPV16 isolates of this study. For the HPV18 isolates, the variations were at 35 positions, 22 of which were known variations of Africa lineages and the other 13 were unique variations observed for the isolates obtained in this study (at positions 7799 and 7813). HPV45 isolates had variations at 35 positions and 2 (positions 7114 and 97) were unique to the isolates of this study. This study provides the first data on the lineages of HPV 16, 18 and 45 isolates from Ghana. Although the study did not obtain full genome sequence data for a comprehensive comparison with known lineages, these genotypes were predominately of the Africa lineages and had some unique sequence variations at positions that suggest potential oncogenic implications. These data will be useful for comparison with lineages of these genotypes from women with cervical lesion and all the forms of invasive cervical cancers.


Background
With an estimated 3052 new cases of cervical cancer in Ghana in the year 2012, cervical cancer is one of the two leading cause of cancer incidence and mortality in Ghana and in most sub-Saharan African countries. Cervical cancer is a significant health concern in Ghana in light of the facts that in spite of its high burden, cervical cancer screening programmes are still lacking. In the absence of empirical cancer registry based data, the estimated age-standardized cervical cancer incidence rate for Ghana (35.4 per 100,000) indicates a possible difference in the cervical cancer burden among close neighbours; Benin (27.6 per 100,000), Togo (21.5 per 100,000), Burkina Faso (23.3 per 100,000) and Cote d'Ivoire (21.7 per 100,000) [1].
These geographical variations in the incidence of cervical cancer are also seen in the prevalence of its etiological agent, which is the human Papillomavirus (HPV). In spite of these variations, HPV genotypes 16,18 and 45 together are responsible for more than 80% of cervical cancer globally [2][3][4]. Molecular studies of each of these genotypes have shown sequence variations in some genes that have been the basis for their classification into phylogenetic linages and sub-lineages [5]. Conventionally, if the DNA sequence of the L1 gene of two HPV isolates defer by between 1 and 10%, then these two HPV isolates are variants (lineage) of a genotype. A difference of between 0.5 and 1% makes the HPVs sublineage (sub-variant) of a lineage [6,7]. Furthermore, the level of nucleotide sequence identities or differences in other parts of the HPV genome (including the LCR, E6 and E7) and mostly in the full genome sequence have resulted in similar classification [5][6][7][8][9][10][11].
For HPV18 variants, two phylogenetic lineages (previously known as non-African and African) were identified based on full genome sequence analysis in an older classification. The non-African lineages had two sub-lineages namely, European (E) and Asian-American (AA) while the African lineage also had two sublineages, namely African 1 (Afr-1) and African-2 (Afr-2) [15]. However, with a more recent classification, three lineages have been identified; the previous non-Africa lineage was identified and designated as lineage A with five sub-lineages, designated A1 to A5. Part of the previous African lineage was identified as a separate lineage and designated lineage B with three sub-lineages, designated B1 to B3. The rest of the previous African lineage were identified as a new lineage and designated lineage C [6,10,15]. Also based on full genome sequence analysis, HPV 45 variants have been classified into two phylogenetic lineages, namely A and B. while lineage A has at least three sub-linages; A1, A2 and A3, lineage B has two sub-lineages, B1 and B2 [6,8,15].
The long control region (LCR) of the HPV genome is a 850 bp regulatory region that includes a section for regulating the transcription of HPV genes through interactions with many cellular and viral factors [13]. Therefore, sequence variation in the LCR in addition to being useful for classification has been implicated in the different extent of persistence and progression and oncogenic potential of HPV variants [15,16]. It has also been shown that HPV genotypes (which vary in their LCR sequences) are differentially associated with different histological forms (squamous cell carcinoma (SCC); adenocarcinoma (ADC) and adenosquamous carcinoma (ADSC)) of invasive cervical cancer [17][18][19][20][21][22]. Specifically for invasive cervical cancer (ICC) reported in Ghana, HPV16 and closely related HPV types such HPV31 were common in SCC, while HPV18 and closely related types such HPV45 and HPV59 were common in ADC and ADSC [23,24]. Such differential specificities have also been indicated for HPV variants/lineages [5,14,[25][26][27][28][29].
It is therefore important for our population and in order to enhance cervical cancer screening, to study the variants of the reported most prevalent HPV genotypes in Ghana, to better understand the potential of developing cervical cancers. Although there are limited empirical data on the prevalence of human Papillomavirus (HPV) in the general population, the most prevalent HPV genotypes in Ghana are HPV 16, 18 59, and 45. Few hospital based studies on HPV and its distribution in cancers have reported, for instance, a 10% HPV prevalence among 75 women screened for cervical cancer in a tertiary hospital [30]. Three other studies, using formalin-fixed paraffin embedded (FFPE) cervical tissue Fig. 1 An evolutionary relationship of HPV16 isolates based on sequences between the genome positions 7469 and 7840 within the long control region (LCR.). The evolutionary history was inferred using the Neighbour-Joining method [37]. The optimal tree with the sum of branch length = 0.04170288 is shown. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [38] and are in the units of the number of base substitutions per site. The analysis involved 16 nucleotide sequences. Codon positions included were 1 st + 2 nd + 3 rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 876 positions in the final dataset. Evolutionary analyses were conducted in MEGA 6.0 [39] samples of women diagnosed with high grade cervical lesion or invasive cervical cancer (at the same hospital) reported HPV prevalence of 93.9% [31], 98.0% [32] and 89.8% [23,24]. Overall, only two of these studies reported the genotype specific prevalence, which indicated that HPV 18, 59, 16 and 45 were among the common genotypes in Ghana [23,32]. However, there are no data on the variants of these common HPV genotypes in Ghana. With increasing evidence of HPV genotype, subtype and lineage specificity and association with different forms of invasive cervical cancer, subtype and lineage determination may become important in risk assessment. Additionally, kits development may depend on sequence data of HPV variants for specific targeting, particularly for the HPV 16, 18 and 45. Therefore, our study, focused on the determination of the lineages of HPV 16, 18 and 45 isolated from women participating in a community-based cervical cancer screening activity, all of whom were negative by Pap smear testing (normal cytology results).

Cervical specimen
Previously collected cervical specimen of women with normal cytology results, during a community based cervical cancer screening programme, were used for this study. The specimen positive for HPV16, 18 and 45 (either as a single infection or in a multiple infection) were selected and the long control region (LCR) of each HPV was separately amplified by a nested PCR to confirm the genotype present in the specimen. A total of 20 HPV16 positive specimens, 15 HPV18 positive specimens and 18 HPV45 positive specimens were used.

DNA extraction
DNA was extracted from these specimen on a MagNA Pure LC automated system (Roche Molecular Systems, Inc, Pleasanton, USA) using 200 μL of the cell suspension and the MagNA Pure DNA Extraction Kit (Roche Molecular Systems, Inc, Pleasanton, USA) according to the manufactures specification with a modification of an addition of 15 μL of RNase to the lysis buffer for each specimen. The negative controls included for each set of extraction were PBS, DNase/RNase free water, and DNAGard, while the positive control was a suspension of SiHa and HeLa cells with an integrated HPV16 and HPV18 genome respectively. The specimen/DNA quality was assessed by the amplification of the human RNase H gene in 10 μL of DNA extract and analysed with Light  7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 No. Isolate

NC001526_(A1) A A G A G T A C C G T A A T C T C C T T A A A T G T G C G A A HPV16_control
.
Each column and the numbers in first four rows represents a single specific nucleotide position in the LCR. The fifth row shows the nucleotide (indicated by the standard letter of its nitrogenous bases; (A) adenine, (C) cytosine, (G) guanine, and (T) thymine) at each specific position for the LCR of the prototype HPV16 isolates, NC001526. The sixth row is the sequence of the SiHa positive control. Alphanumeric in column are NCBI BLAST obtained isolates of known lineage. The following were the indication: variation unique to isolates of this study, variations that differentiate known HPV16 lineages, same nucleotide as prototype isolate, deletion of nucleotide at that position. Isolates' GenBank accession numbers are shown in Additional file 1: Table S1 Cycler 480 and related software (Roche Molecular Systems, Inc, Pleasanton, USA).
Nested PCR amplification of HPV16, 18 and 45 LCR Using 5 μL of DNA extracted from each of the selected specimen, first round PCRs were performed with the primer sets as indicated in Table 1  The second round PCR used 5 μL of the first round product with the second round primer sets indicated in Table 1 for the amplifications. The 100 μL final reaction volume contained 200 μM of each dNTP, 1 X PCR buffer (200 mM Tris-HCl pH 8.4; 500 mM KCl), 2.5 mM of MgCl 2 , 1.25 μM of each of the two primers, and 6.3 U of Taq DNA polymerase. The amplifications were achieved with the same cycling conditions as that of the first round PCR.

Visualization of second round PCR product
For all the second round PCR products of HPV16, 18 and 45, 20 μL were resolved on a 1.5% agarose gel in 1X TBE buffer with 1 μg/mL ethidium bromide at 120 V for an hour. The gels were documented with GelDoc™ XR+ System (Bio-Rad Laboratory, Mississauga, Canada).

Ethical approval and consent to participants
The study that collected the specimen from which the HPV isolates were obtained received ethical approval form the Ethics Review Committee of the Ghana Health Service (ID No GHS-ERC: 06/11/10) and worked within the guidelines of the Ethical Review Committee and ). The evolutionary history was inferred using the Neighbor-Joining method [37]. The optimal tree with the sum of branch length = 0.04170288 is shown. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [38] and are in the units of the number of base substitutions per site. The analysis involved 16 nucleotide sequences. Codon positions included were 1 st + 2 nd + 3 rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 876 positions in the final dataset. Evolutionary analyses were conducted in MEGA 6.0 [39] conducted the study in accordance to the ethical standards as declared in Declaration of Helsinki and the Belmont Report. Assurances were given and efforts were made to protect the privacy and confidentiality of the participants enrolled and their collected data. As such personal/ demographic data as well as data that will be useful for identifying the participants were not included in this study.

Results
The obtained partial LCR nucleotide sequences for each of the HPV genotypes were of an average size of 375 bp. These have been deposited in NCBI GenBank and the accession number shown in the Additional file 1: Table S1. The obtained sequences were aligned with the LCR positions of the complete genome of the HPV16 prototype NC0015626.2 [33,34], the HPV18 prototype AY262282_A1 [35] and the HPV45 prototype X74479.1 [36]. The alignments also included selected HPV variants with very similar sequences obtained by an NCBI BLAST of the isolates' sequences. A neighbour joining phylogenetic tree was constructed for each HPV type (Figs. 1-3). All 20 HPV16 specimen were confirmed by positive results for the LCR HPV16 type specific PCR. Nucleotide sequence variations were observed at 30 positions within the LCR (Table 2). Of these 30 positions, 19 were the known variations often observed for African variants, while the other 11 positions had variations unique to the HPV16 isolates of this study ( Table 2). The phylogenetic tree showed that about 85% of the HPV16 isolates were of the lineage C, 5% of lineage B, both of which were previously known as African variants (Fig. 1) and the lineages of two isolates (10%), isolate-866 and isolate 618 were not clearly identified, although they were most likely to belong to lineage B.
Of the 15 specimen positive for HPV18 isolates, 13 were confirmed as positive by the LCR HPV18 type specific PCR, one of which gave poor sequence data. Nucleotide variations were determined in 35 positions, 22 of which were known variations of the lineages B and C (Africa variants) of HPV 18. The other 13 variations were unique to the isolates of this study ( Table 3). The phylogenetic tree (Fig. 2) showed that most of the HPV18 isolate were of the lineages B2 (6 of 12), B3 (2 of 12), B1 (1 of 12) and C (1 of 12) (African variants) and only two isolates (isolates 613 and isolate 858) were of the non-African lineage (lineage A). Of the 18 specimen confirmed as HPV45 genotypes, 13 gave amplification products and 9 has good sequences. (Table 4). Compared to the HPV45 prototype, nucleotide sequence variations were determined in 35 positions, 2 of which were unique to the sequences of 8 isolates of this study. The phylogenetic tree showed that all the HPV45 were of the lineage A1 (Fig. 3).

Discussion
This long control region of the HPV genome has been shown to be very informative and therefore very useful in the study of HPV variants [7,[12][13][14], as such it was used in this study. A close look at the phylogenetic tree ( Fig. 1) suggests that Isolate-937, isolate-866 and isolate-618, may to belong to the HPV16 lineage B (Africa type 1 variants), although it showed that isolates-866, isolate-618 and 618_2 clustered slightly differently. A complete genome sequencing, which was not within the reach of this study would have clarified the lineages of these isolates [5]. Although not clearly understood, repeated PCR and subsequent sequencing of a single specimen pro-  7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 No. of isolates 1 1 1 1 1 2 3 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 7 7 7 7 7 7 7 8   1 4 6 6 7 7 2 2 6 8 0 2 2 5 0 0 2 5 6 0 2 6 9 3 3 4 7 7 7 7 5 1 2 7 9 4 3 2 9 3 1 1 4 0 9 2 1 3 3 4 6 7 7 3 9 9 0 5 4 9 3 0 3 5 9 4 6 3 8 7 Each column and the numbers in first four rows represents a single specific nucleotide position in the LCR.. The fifth row shows the nucleotide (indicated by the standard letter of its nitrogenous bases; (A) adenine, (C) cytosine, (G) guanine, and (T) thymine) at each specific position for the LCR of the prototype HPV45 isolates, X74479. The sixth row is the sequence of the HaLa positive control. Alphanumeric in column are NCBI BLAST obtained isolates of known lineage. The following were the indication: variation unique to isolates of this study, variations that differentiate known HPV45 lineages, same nucleotide as prototype isolate, deletion of nucleotide at that position duced the isolate-618 and isolate-618_2 with varied branch length and at two positions (Fig. 1). More importantly, the phylogenetic tree showed that the length of sequence obtained in this study were not enough to separate the sub-lineages of the HPV16 lineage B, since the nucleotide positions obtained in this study (Table 2) did not include the position (7438) which distinguishes the sub-lineages B1 and B2 (Af-1a and Af-1b respectively). This was evident by the close clustering of the previously described HPV16 sub-lineage B1 (KF466576) and HPV16 sub-lineage B2 (KF466592) in the phylogenetic tree (Fig. 1). These two variants were distinguished from each other by the substitution A7438C in an earlier study [7]. Therefore, this may imply that the clustering of isolates 866 and 618 may have been different with a longer sequence.

X74479_(A1) C G T A T G C A A G T T A T A G C G C G T A T G G C T T A A T T C C A
A more detail investigation of the variations in the sequences ( Table 2) showed how very similar these isolates were and the bases for the differences seen in the phylogenetic tree. It also showed nucleotide variations unique to the isolates of this study. The occurrence of most of these unique variability specific to the isolates of this study, occurred between the genome positions 7801 and 7828. These may have implication for transcription activities in pathogenic pathways and therefore in oncogenesis, since this range of positions include the binding sites for transcription factors, activator protein 1 (AP1) and Octamer-binding protein 1 [13] which are involved the expression of the genome of the HPV.
The phylogenetic tree presented as Fig. 2 indicates that the sub-lineages B1 and B2 (African sub-lineage 1, old Fig. 3 An evolutionary relationship of HPV45 isolates based on sequences between the genome positions 7074 and 7858 within the long control region (LCR.). The evolutionary history was inferred using the Neighbor-Joining method [37]. The optimal tree with the sum of branch length = 0.04170288 is shown. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [38] and are in the units of the number of base substitutions per site. The analysis involved 16 nucleotide sequences. Codon positions included were 1 st + 2 nd + 3 rd + Noncoding. All positions containing gaps and missing data were eliminated. There were a total of 876 positions in the final dataset. Evolutionary analyses were conducted in MEGA [39] nomenclature) of the HPV18 isolates were the commonest. The sub-lineage B3 and lineage C (African sub-lineage 2, old nomenclature) were also common. Interestingly, only a few were of the A1, A2, and A3 sub-lineages (European sub-lineage (Eur), old nomenclature). A detailed comparison with sequences obtained by an NCBI BLAST and the HPV18 prototype AY262282_A1 [35] showed variations that were unique to the sequence of the isolates of this study, in addition to variations well known for the identified lineages. In all, 6 of the 11 isolates identified in this study had not been described by other studies while the other have been [6,15]. Most of the nucleotide variations specific to isolates of this study were located within the genome positions 7769 to 7714 and 7795 to 7814; regions where the binding sites for the transcription factors NF1, YY1 and AP1 are located. These changes may have implication for transcription activities in pathogenic pathways [13] and therefore the oncogenic potential of the isolates of this study at the level of sub-lineages. However, it must be noted that these HPVs were isolated from women with normal cytology results. Therefore, a follow-up study may provide more information relating to the potential relationship between these sequence variations and oncogenicity.
All the 9 HPV45 genotypes sequenced were identified without difficulty and were determined to belong to the HPV45 lineage A1 (Table 4). Although the lineages of most of the isolates of this study were identified much clearly, the primary limitation of this study was that in the absence of a complete genome sequence, or the sequence of other genes, such as the E6 and E7 gene, the lineages of some isolate were not clearly identified and the few isolates that did not cluster within known lineage cannot be said to indicate new lineages.

Conclusion
This study provides the first DNA sequence data on the variants/lineages of HPV 16, 18 and 45 isolates from Ghana and has shown that predominately, these genotypes are of the Africa lineages. The observance of specific sequence variation within region associated with transcription binding, indicates a potential for differential oncogenicity of the isolate. The fact that these were obtained from the specimen of women with normal cytology, these data will be useful for comparison with variants of these genotypes from women with the different form of ICC and different grades of cervical lesions to improve our understanding of relationship between lineage and oncogenicity. However, there is need for additional gene or a full genome sequencing to better clarify the classification of a few of the isolates that did not cluster with most of the isolates within a clear lineage.