Viruses, stemness, embryogenesis, and cancer: a miracle leap toward molecular definition of novel oncotargets for therapy-resistant malignant tumors?

Recent breakthrough studies documented consistent activation of specific endogenous retroviruses in human embryonic stem cells and preimplantation human embryos and demonstrated the essential role of the sustained retroviral activities for maintenance of pluripotency and embryonic stem cell identity. Present analysis has led to the hypothesis that activation of the human stem cell-associated retroviruses (SCARs), namely LTR7/HERVH and LTR5_Hs/HERVK, is likely associated with the emergence of clinically lethal therapy resistant death-from-cancer phenotypes in a sub-set of cancer patients diagnosed with different types of malignant tumors.

In human cells, retrotransposons' activity is believed to be suppressed to restrict the potentially harmful effects of mutations on functional genome integrity and to ensure the maintenance of genomic stability. Human embryonic stem cells (hESCs) and human embryos seem markedly different in this regard. In recent years, multiple reports demonstrate that retrotransposons' activity is markedly enhanced in hESC and human embryos and most active transposable elements (TEs) may be found among human endogenous retroviruses (HERV). Kunarso et al. [1] identified LTR7/HERVH as one of the most overrepresented TEs seeding NANOG and POU5F1 binding sites throughout the human genome. HERV subfamily H (HERVH) RNA expression is markedly increased in hESCs [2,3], and an enhanced rate of insertion of LTR7/ HERVH sequences appears to be associated with binding sites for pluripotency core transcription factors [1,2,4] and long noncoding RNAs [5]. Expression of HERVH appears regulated by the pluripotency regulatory circuitry since 80% of long terminal repeats (LTRs) of the 50 most highly expressed HERVH are occupied by pluripotency core transcription factors, including NANOG and POU5F1 [2]. Furthermore, TE-derived sequences, most notably LTR7/HERVH, LTR5_Hs/HERVK, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites (TFBS) in the hESC genome [4].
The LTR7 subfamily is rapidly demethylated and upregulated in the blastocyst of human embryos and remains highly expressed in human ES cells [6]. In human ESC and induced pluripotent stem cells (iPSC), LTR7 sequences harboring the promoter for the downstream full-length HERVH-int elements, as well as LTR7B and LTR7Y sequences, were expressed at the highest levels and were the most statistically significantly upregulated retrotransposons in human stem cells [7]. LTRs of HERVH, in particular, LTR7, function in hESC as enhancers and HERVH sequences encode nuclear noncoding RNAs, which are required for maintenance of pluripotency and identity of hESC [8]. Transient hyper activation of HERVH is required for reprogramming of human cells toward induced pluripotent stem cells, maintenance of pluripotency and reestablishment of differentiation potential [9]. Failure to control the LTR7/ HERVH activity leads to the differentiation-defective phenotype in neural lineage [9,10]. The continuing activity of L1 retrotransposons may be relevant as well because significant activities of both L1 transcription and transposition were recently reported in humans and other great apes [11] and L1HS was implicated in the creation of human-specific TFBS in the hESC genome [4].
Single-cell RNA sequencing of human preimplantation embryos and embryonic stem cells [12,13] facilitated identification of specific distinct

Research Perspective
populations of early human embryonic stem cells, which were defined by distinct patterns of marked activation of specific retroviral elements [14]. Consistent with definition of increased LTR7/HERVH expression as a hallmark of naive-like hESCs, a sub-population of hESCs and human induced pluripotent stem cells (hiPSCs) manifesting key properties of naive-like pluripotent stem cells can be genetically tagged and successfully isolated based on elevated transcription of LTR7/HERVH [15]. Targeted interference with HERVH activity and HERVH-derived transcripts severely compromises selfrenewal functions [15]. Transactivation of LTR5_Hs/ HERVK by pluripotency master transcription factor POU5F1 (OCT4) at hypomethylated long terminal repeat elements (LTRs), which represent the most recent genomic integration sites of HERVK retroviruses, induces HERVK expression during normal human embryogenesis, beginning with embryonic genome activation at the eight-cell stage, continuing through the stage of epiblast cells in preimplantation blastocysts, and ceasing during hESC derivation from blastocyst outgrowths [16]. Grow et al. [16] reported unequivocal experimental evidence demonstrating the presence of HERVK viral-like particles and Gag proteins in human blastocysts, consistent with the idea that endogenous human retroviruses are functionally active during early human embryonic development.
Significantly, expression of HERVH-encoded long noncoding RNAs (lnc-RNAs) is required for maintenance of pluripotency and hESC identity [8]. In human ESC, 128 LTR7/HERVH loci with markedly increased transcription were identified [8]. It has been suggested that these genomic loci represent the most likely functional candidates from the LTR7/HERVH family playing critical regulatory roles in maintenance of pluripotency and transition to differentiation phenotypes in humans [4]. Conservation analysis of the 128 LTR7/HERVH loci with the most prominent expression in hESC demonstrates that none of them are present in Neanderthals genome, whereas 109 loci (85%) are shared with Chimpanzee [4]. Considering that Neanderthals' genomes are ~40,000 years old and Chimpanzee's genome is contemporary, these results are in agreement with the hypothesis that LTR7/HERVH viruses were integrated at these sites in genomes of primates' population very recently. Distinct patterns of expression of different sub-sets of transcripts selected from 128 LTR7/HERVH loci hyperactive in hESC are readily detectable in adult human tissues, including various regions of human brain [4]. These observations support the idea that sustained LTR7/HERVH activity may be relevant to physiological functions of human embryos and adult human organs, specifically human brain.
Taken together, these breakthrough experiments conclusively established the essential role of the sustained, tightly controlled in the temporal-spatial fashion activity of specific endogenous retroviruses for pluripotency maintenance and functional identity of human pluripotent stem cells, including hESC and iPSC ( Figure 1). Is this true for cancer stem cells as well and activation of human stem cell-associated retroviruses (SCARs) occurs in malignant tumors? Activation of the stemness genomic networks in human malignant tumors was linked with the emergence of clinically-lethal death-from cancer phenotypes in cancer patients, which are consistently associated with significantly increased likelihood of therapy failure, disease recurrence, and development of distant metastasis [17][18][19][20][21][22][23][24][25][26]. Gene expression signatures of the hESC genomic circuitry successfully identified therapy-resistant tumors in cancer patients diagnosed with multiple types of epithelial tumors [25,26]. Since these genotype/phenotype relationships between activation of stemness genomic networks and clinically-lethal therapy-resistant phenotypes of human cancers are readily detectable in the early-stage tumors from cancer patients diagnosed with multiple types of malignancies [17][18][19][20][21][22][23][24][25][26], it seems likely to expect that emergence of these tumors may be triggered by (or associated with) activation of endogenous human SCARs in cancer stem cells. One of the key molecular mechanisms of SCARs-mediated reprogramming of genomic regulatory networks is likely associated with functions of SCARsderived long noncoding RNAs and human-specific TFBS (Figure 1). Consistent with this idea, recent experimental evidence revealing the mechanistic role in human cancer development of one of the LTR7/HERVH-derived long noncoding RNAs, namely linc-RNA-ROR [5,27,28], are beginning to emerge [29][30][31][32][33]. Comprehensive catalogues of transcriptionally active SCARs [8,[12][13][14][15][16], chimeric transcripts [14][15][16], and human-specific TFBS in the hESC genome [4] should greatly facilitate the followup mechanistic studies of human cancer. Collectively, these findings instantly open exciting new diagnostic and therapeutic opportunities for individualized targeted experimental and clinical interventions aiming at early detection and eradication of clinically-lethal sub-sets of malignant tumors in cancer patients.