Bovine leukemia virus long terminal repeat variability: identification of single nucleotide polymorphisms in regulatory sequences

Limited data are available on the incidence of variations in nucleotide sequences of long terminal repeat (LTR) regions of Bovine Leukemia Virus (BLV). Consequently, the possible impact of SNPs on BLV LTR function are poorly elucidated. Thus, a detailed and representative study of full-length LTR sequences obtained from sixty-four BLV isolates from different geographical regions of Poland, Moldova, Croatia, Ukraine and Russia were analyzed for their genetic variability. Overlap extension PCR, sequencing and Bayesian phylogenetic reconstruction of LTR sequences were performed. These analyses were followed by detailed sequence comparison, estimation of genetic heterogeneity and identification of transcription factor binding site (TFBS) modifications. Phylogenetic analysis of curated LTR sequences and those available in the GenBank database reflected the acknowledged env gene classification of BLV into 10 genotypes, and further clustered analysed sequences into three genotypes - G4, G7 and G8. Additional molecular studies revealed the presence of 97 point mutations distributed at 89 positions throughout all 64 LTR sequences. The highest rate of variability was noted in U3 and U5 subregions. However, the variability in regulatory sequences (VR) was assessed as lower than the variability within non-regulatory sequences (VNR) for both, U3 and U5 subregions. In contrast, VR value for R subregion, as well as for the total LTR, was higher than the VNR suggesting the existence of positive selection. Twelve unique SNPs for these LTR sequences localized in regulatory and non-regulatory elements were identified. The presence of different types of substitutions lead to the abrogation of present or to the creation of additional TFBS. This study represents the largest study of LTR genetic variability of BLV field isolates from Eastern part of Europe. Phylogenetic analysis of LTRs supports the clustering BLV variants based on their geographic origin. The SNP screening showed variations modifying LTR regulatory sequences, as well as altering TFBS. These features warrant further exploration as they could be related to proviral load and distinctive regulation of BLV transcription and replication.


Background
Bovine leukemia virus (BLV) is the etiological agent of enzootic bovine leucosis (EBL), a chronic, lymphoproliferative disease associated with persistent lymphocytosis and B-cell lymphomas. BLV, together with human T-cell leukemia viruses type 1 and 2 (HTLV-1, HTLV-2), belong to the genus Deltaretrovirus of the family Retroviridae. BLV virions contain a diploid RNA genome that is reverse transcribed into doublestranded DNA, which is ultimately integrated into the host genome in the form of a provirus. The BLV genome consists of gag, pol and env, structural genes and a region X which contains several open reading frames for Tax, Rex, R3 and G4 regulatory proteins [1]. The Tax protein is involved in activation of transcription of viral mRNA, while Rex protein regulates the synthesis of BLV structural proteins [2,3].
Whole BLV genome is flanked at both 5′ and 3′ ends by long terminal repeat (LTR) sequences [4,5]. Each LTR is composed of three regions called U3, R and U5. U3 plays a crucial role in the induction of BLV transcription, since its contains the viral promoter and several regulatory elements critical for the modulation of promoter activity [6][7][8][9]. The major regulatory elements are three repeated 21-bp Tax-responsive elements (TRE) that include imperfectly conserved cyclic AMPresponsive elements (CRE) matching the consensus sequence 5′-TGACGTCA-3′. CRE sites are used by cellular transcription factors ATF-1 and ATF-2, as well as cyclic AMP response element binding (CREB) proteins [10]. The virus-encoded Tax transactivator increases the DNA binding activity of CREB/ATF proteins by interacting with their bZip domains, which positively regulates the activation of BLV transcription. Besides the CRE elements, each TRE sequence includes an E box as a target sequence for the AP4 transcription factor. Additionally, the U3 region contains several other response elements, such as: a nuclear factor-κB (NF-κB) binding site, which permit activation of transcription in presence of NF-κB p50 and p65 proteins, a glucocorticoid response element (GRE) conferring responsiveness to dexamethasone in the presence of the Tax, and PU.1/Spi-B binding sites for ETS transcription factor family proteins [7][8][9]. Other elements which were involved in regulation of virus transcription are located downstream of transcription initiation site in R-U5 regions and include an upstream stimulatory factor (USF) binding site, downstream activator sequence (DAS) and an interferon regulatory factor (IRF) binding site, respectively [11][12][13].
Several studies have demonstrated that sequence variations in retroviral LTRs may affect interactions with transcription factors, leading to altered expression of viral genes and ultimately virus replication. Indeed, it was shown that mutations in lentiviral LTRs caused distinct transcriptional activities of HIV-1 and small ruminant lentiviruses [14,15]. Genetic diversity within LTRs was also associated with decreased pathobiology in lungs or increased neurovirulence in sheep infected with ovine lentivirus [16,17]. A correlation was also found between mutations within the Tax-responsive element of human HTLV-1 and increased virus replication [18]. Together, this points to subtle differences in transcription factor biding sites potentially impacting the global regulation and clinical outcomes of retrovirus infection.
Analysis of BLV LTR sequences is limited to few studies; however, the most extensive nucleotide analysis focused on the env gene, with the identification of ten distinct genotypes thus far [19][20][21]. Significant variability within 5′ LTR sequences were found in BLV isolates from North America, Costa Rica, Japan and Belgium, and grouped in 7 genetic groups. Notably, these LTR sequences were more conserved than non-regulatory regions [22]. Recently, phylogenetic analysis of 5′ LTR sequences of BLV isolates from Brazil showed that these sequences clustered into five well-defined groups, with accumulation of nucleotide variability in the R-U5 region [23]. LTR variability has been correlated with altered BLV virulence and pathogeny [24]. In addition, nucleotides polymorphisms found within the LTR region was associated with the emergence of two distinct infection profiles [25].
Sero-epidemiological surveys revealed that BLV infection is widely disseminated throughout the world, with a high prevalence in North and South America, as well as some Asiatic and Middle Eastern countries [26]. Poland has declared freedom from EBL in 2017; however, the infection is still present in East European countries like Ukraine, Moldova and Russia. A common economic zone with these countries still creates a potential for negative impact on the protection of the cattle populations and maintenance of BLV-free herd status in Poland. EBL risks in dairy herds must be considered given the development of trade with Eastern European countries.
In this study, full-length LTR sequences obtained from 64 BLV isolates from different geographical regions of Poland, Moldova, Croatia, Ukraine and Russia were analyzed for their genetic variability. We have found new SNPs modifying LTR regulatory sequences and provided further evidence for positive selection pressure in particular regions of the LTR.

DNA isolation
PBLs were isolated from 10 mL of blood by centrifugation at 1500 g for 25 min. Erythrocytes were haemolysed by osmotic shock with H 2 O and 4.5% NaCl. Afterwards, the cells were washed twice with PBS, aliquoted (5 × 10 6 cells), and stored as pellet at − 80°C until DNA extraction. Five archived samples collected in 2009 from BLV positive cows from Poland were stored as DNA samples at − 20°C until further use. Genomic DNA was extracted using a DNeasy Blood & Tissue Kit (Qiagen), following the manufacturer's instructions. For each sample, genomic DNA concentration was measured using a nanophotometer (IMPLEN) and the samples were stored at − 20°C until examination.

Overlap extension PCR
To amplify the BLV LTR sequences, an overlap extension PCR technique was developed. The general principle of the overlap extension PCR is shown in Fig. 1. This technique includes initial amplification of two fragments: 979 and 571 bp long. A 979 bp fragment was amplified using the following primers: P7736, 5′-TCGA TACCCTCCTTGTGGACC-3′; P8693: 5'-TGTTTGCCG GTCTCTCCTGGCC-3′. The conditions of amplification were as follows: 500 ng of template DNA, 1 x KAPA Buffer, 0.2 mM dNTP s , 0.4 μM each primer, 2 mM MgCl 2 and 0.5 U of KAPA Taq Polymerase. The amplification was run in Biometra T-Personal Thermocycler under following conditions: initial denaturation 95°C 5 min, subsequent steps 95°C 30 s, annealing 65°C 45 s, extension 72°C 1 min, 38 cycles total, final additional extension 72°C 7 min, hold at 4°C. The second fragment a 571 bp long was amplified using the following set of primers: P1: 5'-TGTATGAAAGATCATGCCG A-3′; P609: 5′-GACCCAAAATGCCGCCGAG-3′, at the same conditions except the annealing condition at 52°C 45 s and a 40-cycle run for the amplified product.

Sequencing of full-length LTR
The PCR products were purified using a NucleoSpin Extract II Kit (Marcherey Nagel GmbH & Co) and subsequently, three independent PCR amplification products of each BLV isolate were sent to Genomed SA Company (Warsaw, Poland) for DNA sequencing in both directions, using a 3730xl DNA Analyzer (Applied Biosystems) and a Big Dye Terminator v3.1 Cycle Sequencing Kit. Raw sequence data were viewed and proofread in Geneious Pro version 5.5.9 (Biomatters Ltd) [27], the reads of each sample were aligned. A consensus sequence was determined for each BLV isolate. The data were deposited in the GenBank database under accession numbers included in Table 2.

Sequence data analysis
The sequences generated from 64 field isolates from Poland (52), Moldova (4), Russia (5), Croatia (2) and Ukraine (1) were aligned with 17 sequences of previously characterized BLV isolates from other geographical locations using ClustalW algorithm implemented in Geneious Pro 5.5.9 ( Table 1). The multiple alignment was submitted to the Modeltest version 0.1.10 for the best model selection according to the Akaike information criterion (AIC). As a result, the HKY85 substitution model was applied in Geneious Pro to infer a phylogenetic tree according to Bayesian method. Genetic distance analysis between newly obtained sequences and average nucleotide substitution per site were calculated using the Jukes-Cantor model in MEGA version 5.2.2. Statistical analyses were performed using STATISTICA version 10 (StatSoft, Dell Software, USA).
To detect selection on noncoding DNA we used similar approach to Hahn et al. and Zhao et al. [22,28]. A standard indication of selective pressure is the ratio of   In further study, the consensus sequence was calculated based on the multiple alignment and it was used to determined nucleotide variations within each LTR in respect to known BLV genotypes (G1-G10). To analyze the transcription factor binding site (TFBS) modifications related to specific mutations, the multiTF server was used (http://multitf.dcode.org). Table 2 shows average genetic distances estimated for the 64 sequences obtained in this study, as well as 17 sequences from other studies (insert reference here), representing named reference sequences from other  The number of base differences per site from averaging over all sequence pairs between groups is shown. The analysis involved 81 nucleotide sequences. All positions containing gaps and missing data were eliminated. There was a total of 526 positions analyzed in the final dataset. The number of isolates in each group (n) is listed geographical locations. The level of divergence ranged between 0.012 to 0.037, when the two populations of sequences were analyzed. Genetic distances calculated for the 64 sequences varied between 0.008 and 0.029 and were comparable to those noted for reference sequences.

Analysis of the sequence variation within BLV LTRs
To identify the extent of genetic variations within the 64 LTR sequences, a consensus sequence was constructed (Additional file 1). Sequence variations were found amongst all BLV isolates. We found 97 (0.285%) point mutations distributed at 89 positions, along the 531 bp LTR amplicons of all 64 sequences. Among them, 46 were present in more than one isolate. The highest rate of variability was noted in U3 and U5 subregions (0.173 and 0.264, respectively), while the variability found in R subregion was relatively low (0.140). In fact, statistically significant differences were noted between the U5/R (p-value = 0.001155) and U3/R (p-value = 0.000002).
The most frequent changes were observed in regulatory elements of the U3 subregion and were characterized by substitution of G(− 133)A/C in the TRE2 of seven sequences, substitution T(− 65)C in the GRE seen in 28 sequences, and substitution of T(− 41)A, T(− 37)A and T(− 36)C in the TATA Box, noted in 4, 9 and 17 sequences, respectively.
The remaining variations were noted in DAS regulatory element of R subregion, and were characterized by substitution of A(+ 150)G found in 8 sequences, T(+ 161)C found in 3 sequences, and substitution of TC(+ 188/9)CT and T(+ 190)C recognized in 11 sequences.
Despite the high variability within the U5 subregion, no accumulation of specific mutations within regulatory elements was noted. Additionally, the LTR sequence analysis revealed the presence of insertions and deletions in LTR. Four deletions at position C(− 72)del of GRE, T(− 11)del of CAP site, TC(+ 188/9)del of BoxC and A(+ 320)del in non-regulatory site were observed. Two insertions at T (+ 191) ins and C (+ 192) ins were found in DAS and a non-regulatory site, respectively.

Selection pressure in BLV LTR sequences
A substantial number of SNPs found in regulatory sequences of LTR suggested the possibility of positive selective pressure on BLV LTRs. To examine this hypothesis, the variability in regulatory (V R ) versus non-regulatory (V NR ) sequences was calculated and compared using Wilcoxon matched pair test. The respective results were shown in Table 3. Variability in regulatory sequences (V R ) was lower than the variability within non-regulatory sequences (V NR ) for both, U3 and U5 subregions. V R versus V NR was lower than 1 showing a high probability of negative selection (p< 0.05 for U3 and p< 0.0005 for U5). Conversely, V R value for R subregion, was higher than the V NR suggesting the existence of positive selection (p< 0.0001). For total LTR non-significant positive selection was predicted (p < 0.900).

Phylogenetic analysis
Based on the multiple sequence alignment of 64 sequences and 17 reference sequences, a phylogenetic tree was constructed using the Bayesian method. The topology, with high posterior probabilities, indicated that all sequences were clearly classified into ten distinct groups (Fig. 2). This topology fully matched the phylogenetic tree which was constructed on the basis of a 444 bp fragment of the env gene, representing the same BLV isolates, with the presence of ten known BLV genotypes (G1-G10).
Analyzed sequences were segregated into three main groups, corresponding to the G4, G7 and G8 genotypes. Forty-five sequences representing Polish isolates and sequences from Moldova, Russia and Ukraine were clustered together within G4 genotype with a posterior probability of 0.62. Further, they were clearly separated into distinct subgroups, designated as G4-I, G4-II, G4-III and G4-IV. G4-III was cosmopolitan group including sequences from Poland, Ukraine, Russia and Moldova, while the remaining G4 subgroups represented only sequences from Poland.
The genotype G8, with high posterior probability score 1.00, grouped nine sequences from Poland and two sequences from Croatia, creating two separated subgroups G8-I and G8-II, respectively. All eight remaining sequences were found to create a G7 genotype with high posterior value 1.00. Five out them exclusively represented sequences from Poland forming a G7-I subgroup, while three sequences from Moldova were assigned to the G7-II subgroup.

Selection pressure in genotypes and subtypes
The variability in V R versus V NR sequences was further calculated to estimate any selection pressure on LTR regulatory elements in individual genotypes and subgroups detected in this study. The respective results were shown in Table 4. For most of the phylogenetic groups, the LTR sequences were under negative or neutral selection. However, positive selection was estimated in the R region of the LTR sequences representing genotype G4 (p < 0.006039) and G8 (p < 0.000002). In and additional reference sequences in GenBank (dark red color, n = 17) were inferred by Bayesian analysis of LTR sequences, based on the HKY85 substitution model. Novel and known genotypes and/or subtypes found in this study are indicated at the right by vertical lines particular, it was noted for the G4-III (p < 0.029411) cosmopolitan subgroup including Polish, Ukrainian, Moldavian and Russian sequences, as well as for G8-I (p < 0.011078) subgroup of Polish sequences. Positive selection was also detected in the U3 subregion (p < 0.000020) and total LTR (p < 0.000001) of G4-I subgroup which included Polish sequences.

Genotype-specific variability
Based on the analysis of the multiple sequence alignment of 64 and 102 BLV LTR sequences collected in GenBank database (Additional file 2) it was possible to identify 15 unique SNPs for each of known genotype (G1-G10) and 6 subgroup-specific SNPs, which are presented in Table 5. Overall, 21 genotype and subgroupspecific SNPs at 19 positions were found. Nine out of 21 SNPs identified were located in TRE1, TRE2, GRE/ TRE3, TATA and DAS regulatory elements of the LTR. The remaining 12 were equally distributed among the non-regulatory sequences of the U3, R and U5 subregions.
Here we report that genotype/subgroup-specific variations appeared to be geographically stratified throughout locations of the donor cattle. Some SNPs were found exclusively in isolates from certain geographic areas: e. The detection of SNPs in regulatory elements suggested that there may be differential BLV expression in host infected with particular genotypes and/or subgroups. Five out of 9 SNPs identified in regulatory elements corresponded with G4, G7 and G8 genotypes. Another six out of 12 SNPs typical for these genotypes were identified in non-regulatory binding sites of the LTR.
To analyze transcription factor binding site (TFBS) modifications related to genotype and subgroup-specific SNPs for G4, G7 and G8 genotypes we used multiTF software. The data were evaluated for potential cellular transcription factors binding sites using an in silico analysis, in conjunction with SNPs that possibly abrogated existing TFBS or created a new TFBS (Table 6).
Six other SNPs that were not identified in common regulatory binding motifs were present in TFIID, GATA-1, Pit-1a, Sp1, TBP, AFP1 and PR binding putative sites determined by in silico analysis. Variations disrupted the predicted sites for TFIID, GATA-1, TBP and AFP1, and may have created new AP-1, PR, ACF binding putative sites. Future studies are required to assess the functional capacity of BLV promoters from distinct genotypes with divergent TF binding sites.

Discussion
In this study we determined the nucleotide sequences for LTRs of BLV proviruses isolated from naturally-infected cattle from Poland, Moldova, Ukraine, Russia and ȣ -indicates p-value where V R was significantly higher than V NR , positive selection; p-value not marked with an ou mean that V R was significantly lower than V NR , negative selection; p-value > 0.05 mean not significant difference between V NR and V R ; the 68_G_P, 146_W_P and 297WS_M sequences were not classified to the subtypes giving 61 sequences to the subtype analysis Croatia. The 64 sequences analyzed proved to be 96. .2% identical to the other BLV sequences isolated from cattle living in Japan, Myanmar, North, Central and South America. This data revealed that the genetic distances between BLV LTR sequences were very low among infected cattle and were comparable to that observed for coding sequences within the BLV genome. It was further shown that the homology of the env gene was 94.5-97.7% amongst BLV isolates representative of the ten genotypes. Comparison the 64 LTR sequences with the previously described sequences from American and Asian countries reveals numerous SNPs which conferred differences throughout most of the length of the LTR. We propose that some of these differences may account for noted variability in viral transcription and replication. Expression of BLV is dependent on three imperfect 21 bp repeats (TRE), which are critical elements for both basal expression and for transactivation by the virally-encoded Tax protein. These TRE motifs contain imperfectly conserved cyclic AMP-responsive elements (CRE), specifically AGACGTCA, TGACGGCA, and TGACCTCA sites, that repress transcriptional activation by cellular CREB/ ATF factors. This strategy avoids TRE site recognition during a host immune response [10]. We noted TRE variations which were located outside, G(− 133)A/C, T(− 65)C, and within the corresponding CRE octanucleotide, C(− 53)A, which created an imperfect CRE (TGACCTCA). Besides, each TRE contains an E-box sequence (CAGCTG, CAGCTG and CACCTG) which overlaps the CRE motif. E boxes are can exert a negative cooperative effect on the activity of the BLV promoter through the CREs, most likely by sterically inhibiting the binding of CREB/ATF to these sites [31]. We found E Table 5 Unique LTR nucleotide differences which permit genotype/subtype classification ′Genotype/subtype′ column include pattern of the nucleotide sequence for each genotype and subtype, G1-G10 as well as G4-I, G4-II, G4-III, G4-IV, G7-I, G7-II, G8-I and G8-II, respectively. Nucleotides are represented by single letters and numbered from the beginning of 5′ to 3′ end of LTR. Blue squares represent unique SNPs for G4, G7 and G8 genotypes and/or subtypes. Underlined nucleotides represent specific SNPs for genotypes G2-G10 study might also introduce discrepancies in promoter activity. Specifically, these were included in three conserved boxes, which were found to be identical with HTLV-1 DRE 1, that is essential for up regulation of the HTLV-1 promoter [12,33]. Kashanchi et al. revealed that the 45 nt sequence DRE 1, located at the boundary of the R/U5 region of the LTR, is required for basal HTLV-1 transcription and plays an important role in the initiation and maintenance of virus replication [33]. There were also observed T(+ 256)C, T(+ 259)C/G and C(+ 264)T changes in IRF binding site of the U5 region, which may also play role in initiation of virus replication [13].
Despite the overall differences described above, the sequences of several regulatory elements of the LTR were conserved. These included the TRE1, CRE2, NF-κB-like motif and PU.1/Spi-B binding site of U3. Another interesting feature of BLV LTRs is the presence of a conserved kappa B (κB) site. Constitutive expression of NF-κB in B cells can induce low level of transcription during BLV infection, which can be up-regulated following immunological activation and thus initiate a positive feedback loop involving the Tax protein [13]. Two SNPs, T(− 104)C and T(− 84)C, were located within κB site and PU.1/Spi-B binding site, respectively. Notably, Xiao et al. revealed that TC(− 83 to − 84)CT substitution in the PU.1/Spi-B binding site increased BLV promoter function fivefold [32].
In our work, we investigated possible natural selection over the entire BLV LTR sequence, as well as particular regions of the LTR (U3, R and U5), from 64 new BLV isolates and found evidence of negative selection for most of the regulatory elements. These results are consistent with the observations of Zhao et al., who showed significantly higher variability of non-regulatory than regulatory sequences in LTRs based on the analysis of 52 BLV isolates from USA, Japan and Costa Rica [22]. Nevertheless, in contrast to their findings of no evidence for positive selection for all LTR regions, we found evidence that regulatory sequences within R region were under positive selection, specifically the CAP site, USF, TFIID Transcription factor II D, GATA-1 Erythroid transcription factor, Pit-1a Growth hormone factor 1, AP-1 Activator protein 1, Sp1 Specificity protein 1, PR Progesterone, XBP-3 X-box binding protein 1, CREM-tau2 Transcription factor cAMP-response element modulator protein, CIIIB1 Nuclear factor, TMF TATA element modulatory factor, NFE2 TF interacting with CREB-binding protein, RAR-gamma2 Retinoic acid receptor gamma, control peptide, TBP TATA-binding protein, AFP1 Hepatoma nuclear factor that binds to the alpha-fetoprotein enhancer and promoter, ACF ATP-dependent chromatin-assembly factor, Tf-LF1 Liver-specific transcription factor DAS: BoxA, BoxB, BoxC and Poly (A) sites. The essential difference in selection between these studies could be dictated by including the DAS element in this study, in contrast to Zhao and coworkers, who excluded it from their analysis. DAS is a 64 bp long sequence present at the 3′ end of the R region, which contributes to the enhancement of viral gene expression [12]. Interestingly, BLV DAS region harbors both progesterone and glucocorticoid binding sites, indicating a putative role in RNA synthesis. Asin et al. demonstrated that in HIV-1-infected cells, progesterone and estradiol regulate HIV-1 replication most likely by directly altering HIV-1 transcriptional activation and in the absence of the viral regulatory protein Tat [34]. Our analysis revealed that, unlike viral essential cis-acting promoter control sequences contained within U3 region (TRE1, CRE2, κB and PU_Box), R region can undergo continuous substitutions due to host selection pressure. This sequence variation of target transcription factor binding sites in R region may reduce the binding sites recognition and potentially be detrimental to the efficacy of transcription factors on the BLV LTR and/or hormone responsiveness. Since glucocorticoid (GR) and progesterone (PR) receptors are known to drive the activity of the HIV-1 and select endogenous retroviruses, we believe they may likely also modulate the activity of BLV 5′ LTR [8,34,35]. Another finding emerging from the in silico analysis is the possibility that many additional cellular transcription factors could contribute to the regulation of BLV transcription. We found that genotype and subgroup-specific SNPs can lead to creation or abrogation several transcription factors in the LTR. The putative binding sites for transcription factors, unique for particular genotypes/subgroups, were observed notably for AP-1, PR, CREM-tau2, NF-E2, RAR-gamma2, GATA-1 and ACF. We postulate that these observed variations can lead to genotype/subtype-specific differences in viral gene expression.
In support of this concept, Jeeninga et al. revealed that the change from κB to GABP binding site in the LTR of HIV-1 subtype E did not lead to loss of promoter function; instead, a gain of GABP binding to this mutated κB binding site enhanced Tat-mediated gene expression and improved virus replication in select cell lines [36]. We found that the SNP G(− 133)C, specific for G7-II, located in TREx2 regulatory element, was found to introduce CREM-tau2 and AP-1 potentially leading to abrogation of the XBP-3 and GATA-1 binding putative sites. Interestingly, CREM isoforms are transcription factors belonging to the CREB/CREM/ATF family, which are known to cooperate with activator protein 1 (AP-1), were upregulated by Tax protein [37]. AP-1 transcription factor family include FOS, JUN, MAF, and ATF proteins, which are significantly upregulated in the presence of Tax [37,38]. This AP-1 can form different AP-1 dimers, which control transcriptional activation or suppression of many genes.
The SNP T(− 41)A in the TATA Box, typical for the G7-I subtype, harbors a shift from an GATA-1/TMF to NF-E2/RAR-gamma2 binding putative sites. Interestingly, transcription factor NF-E2 is known to bind with a specific DNA sequence in order to modulate transcription by RNA polymerase II. These newly determined, in silico predicted TFBS can be useful for further experiments directed at defining the physical interactions between viral and host regulatory proteins, and may help define the biochemical processes of basal transcription and Tax transactivation in B cells.

Conclusion
Our study represents the largest study of the LTR genetic variability of BLV field isolates from Eastern part of Europe. The SNP screening showed variations modifying LTR regulatory sequences, as well as altering existing TFBS. Different sets of mutations for LTRs from different genotypes and subtypes suggests that multiple, geographically localized changes, may be involved in viral adaptation to recent selective pressures. These adaptations may underlie differential clinical outcomes of BLV infection, with potential implications for agricultural practice and economy [39][40][41].