The interplay of long non-coding RNAs and MYC in cancer

Long non-coding RNAs (lncRNAs) are a class of RNA molecules that are changing how researchers view eukaryotic gene regulation. Once considered to be non-functional products of low-level aberrant transcription from non-coding regions of the genome, lncRNAs are now viewed as important epigenetic regulators, and several lncRNAs have been demonstrated to be critical players in the development and/or maintenance of cancer. Similarly, the emerging variety of interactions between lncRNAs and MYC, a well-known oncogenic transcription factor linked to most types of cancer, has caught the attention of many biomedical researchers. Investigations exploring the dynamic interactions between lncRNAs and MYC, referred to as the lncRNA-MYC network, have proven to be especially complex. Genome-wide studies have shown that MYC transcriptionally regulates many lncRNA genes. Conversely, recent reports identified lncRNAs that regulate MYC expression at both the transcriptional and post-transcriptional levels. These findings are of particular interest because they suggest roles of lncRNAs as regulators of MYC oncogenic functions and the possibility that targeting lncRNAs could represent a novel avenue to cancer treatment. Here, we briefly review the current understanding of how lncRNAs regulate chromatin structure and gene transcription, and then focus on the new developments in the emerging field exploring the lncRNA-MYC network in cancer.


Introduction
In recent years, investigations into how long non-coding RNAs (lncRNAs) influence epigenetic modifications and chromatin structure have shifted the paradigm in our fundamental understanding of how transcription is regulated in higher eukaryotes. Once considered to be "transcriptional noise" inherent to the large genomes of higher eukaryotic organisms, lncRNAs are now viewed as critical regulators of complex genomes and have added another layer of complexity to the molecular mechanisms that govern gene regulation. In humans, there are roughly twice as many genes that produce lncRNAs (an estimated ~48,000 lncRNA genes [1]) as protein-coding genes, and only a very small fraction of these lncRNA genes have been characterized [1].
While lncRNAs are only a subset of the non-coding transcriptome, over the last several years these mysterious RNAs have stepped into the limelight. In particular, a topic of great interest has been how dysregulation of lncRNAs leads to the inappropriate epigenetic regulation of critical genes that are involved in the development and/or maintenance of cancer. Recent evidence suggests that MYC, a well-studied oncogenic transcription factor that is deregulated in most types of cancer and controls many cellular processes, including cell growth, metabolism, proliferation, differentiation and apoptosis [2][3][4][5][6], is an important mediator in the transcription of lncRNAs [7,8]. In turn, new evidence suggests that lncRNAs can also control the expression of MYC [9]. In this review, we briefly discuss our current understanding of the basic features of lncRNAs and how they regulate the epigenetic landscape and then focus on the emerging dynamic relationships between MYC and several lncRNAs as they pertain to cancer.

Characteristics of lncRNAs: Structure and Function
The generally accepted definition of a lncRNA is an RNA molecule longer than 200 nucleotides that does not code for a protein [10]. With the arrival of genome-wide platforms, such as microarrays and next-generation sequencing, and more sophisticated computational analyses of genome-wide data, the exploration of the non-coding transcriptome has developed a strong foothold in molecular research laboratories [11][12][13][14]. In a recent publication, it was estimated that there are ~110,000 different lncRNA transcripts within the human genome, with ~80,000 of these considered to be high confidence lncRNA transcripts, representing ~48,000 genes [1]. Through the use of sophisticated computational analyses, these high confidence lncRNA transcripts were shown to have very limited or no appreciable coding potential [1]. LncRNAs share many similarities with protein-coding transcripts, including similar co-transcriptional and post-transcriptional processing. Many lncRNA transcripts are transcribed by RNA polymerase II (although some are transcribed by RNA polymerase III) [15]. They share the same canonical splice sites and polyadenylation terminal signals and frequently contain a 5′ cap and a polyadenylated 3′-end [16,17].
Importantly, there are also some notable dissimilarities between lncRNAs and mRNAs. LncRNAs can undergo some unconventional processing [18,19,20] and tend to have a higher degree of tissue-specific expression relative to protein-coding genes [14,21,22]. Additionally, the primary sequences of lncRNAs tend to be less conserved across species [23]. These data suggest that the structure (rather than the sequence) of lncRNAs may be of greater importance when exploring their function, but this area of RNA biology remains challenging. LncRNAs that have been extensively studied, such as HOTAIR and MALAT1, have provided some of the initial insights into the importance of the structural features of lncRNAs [23][24][25][26][27]. Recent evidence has shown that evolutionarily conserved secondary structural elements of lncRNAs contain important protein-binding domains [27]. Several methodologies have been developed in an effort to determine the secondary and tertiary structures of RNA molecules [28]. Some of the most noteworthy techniques have used either specific nucleolytic enzymes or chemical modifications of the RNA molecules followed by sequencing [29][30][31][32][33]. Once sequenced, the secondary structure of the RNA molecules can be determined from the ends of the reads using advanced computational tools (Figure 1). For example, the technique frequently referred to as Structure-seq employs dimethyl sulfate (DMS), which penetrates cells and methylates N1 of adenines and N3 of cytosines when they are not involved in Watson-Crick base pairing [29,35]. During reverse transcription of the DMS-methylated RNA, reverse transcriptase stops at the methylated bases, generating cDNAs of different lengths that are then sequenced. From these data, the secondary structure of RNA molecules can be determined on a genome-wide level using computational models.
However, most techniques for the structural analysis of RNA molecules involve in vitro conditions that may not retain the structural characteristics of lncRNAs in vivo [36]. In spite of the challenges in RNA structural biology, the elucidation of lncRNA structure appears to be of vital importance, if we are to fully understand the functional capabilities of lncRNAs.
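As an illustration of how reverse transcriptase stop sites can be turned into a per-nucleotide signal, the following is a minimal sketch and not the published Structure-seq pipeline. It assumes that each read has already been reduced to a 0-based index of the last RNA nucleotide copied, that the modified base lies one nucleotide 5′ of that index, and that an untreated control library is available. The function names, the simple subtraction of normalized counts and the threshold are hypothetical choices made for this example only.

```python
from collections import Counter


def dms_reactivity(sequence, treated_stops, control_stops):
    """Estimate a crude DMS reactivity for each A and C in `sequence`.

    `treated_stops` and `control_stops` are lists of 0-based RNA indices
    marking the last nucleotide copied by reverse transcriptase. The
    modified base is assumed to sit one nucleotide 5' of each stop.
    Returns a list with a float for A/C positions and None for G/U.
    """
    # Shift every stop by one position to point at the putative modified base.
    treated = Counter(s - 1 for s in treated_stops if s >= 1)
    control = Counter(s - 1 for s in control_stops if s >= 1)

    # Normalize by library size so the two samples are comparable.
    treated_total = sum(treated.values()) or 1
    control_total = sum(control.values()) or 1

    reactivity = []
    for i, base in enumerate(sequence.upper()):
        if base not in "AC":
            # DMS reports only on adenines and cytosines.
            reactivity.append(None)
            continue
        signal = treated[i] / treated_total - control[i] / control_total
        # Negative values are treated as background and clipped to zero.
        reactivity.append(max(0.0, signal))
    return reactivity


def likely_unpaired(reactivity, threshold):
    """Return indices whose reactivity meets an arbitrary threshold."""
    return [i for i, r in enumerate(reactivity) if r is not None and r >= threshold]


if __name__ == "__main__":
    rna = "GACUAG"
    scores = dms_reactivity(rna, [2, 2, 2, 2, 5, 5, 5, 3], [3, 5, 1, 6])
    print(scores)
    print(likely_unpaired(scores, 0.1))
```

In practice, positions flagged in this way would be supplied as constraints to a structure prediction program rather than interpreted on their own, and the normalization used by real pipelines is more elaborate than the subtraction shown here.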
LncRNAs have a myriad of functions within eukaryotic cells. One of the best-studied and best-understood functions of lncRNAs is the modulation of gene expression. LncRNAs have been described as "fine-tuners" of gene regulatory networks, regulating gene expression at both the transcriptional and post-transcriptional levels via a variety of distinct mechanisms [37,38]. LncRNAs can be cis-acting, regulating chromatin structure and transcription of neighboring genes, and/or trans-acting, regulating the transcription of genes at distant locations within the genome [10,38]. There are three broad functional classifications for lncRNAs: they can serve as decoys, scaffolds and/or guides (Figure 2). In simple terms, lncRNAs that serve as decoys can associate with both regulatory RNAs and proteins, such as miRNAs, DNA-binding proteins and histone-modifying enzymes, and prevent their binding to specific target mRNAs or chromatin loci and/or inhibit their enzymatic activity [10,37,39,40]. In a recent example, MALAT1 was shown to modulate the mRNA levels of serum response factor (SRF), an important transcription factor in myogenesis, by acting as a sponge or competing endogenous RNA (ceRNA) for miR-133 [40]. Scaffold lncRNAs provide a platform on which different molecular interactions can occur, such as protein-protein interactions and interactions between distant chromatin loci [41]. For example, HOTAIR has been demonstrated to aid in protein ubiquitination by acting as a scaffold for the E3 ubiquitin ligases Dzip3 and Mex3b, which contain RNA-binding domains, and their substrates, Ataxin-1 and Snurportin-1, respectively [41]. Lastly, guide lncRNAs aid in the recruitment of protein complexes to specific locations within the cell, such as recruitment of epigenetic modifying enzymes to specific genes [10,37,38,42]. The Xist lncRNA and its role in dosage compensation is a prime example of a guide lncRNA.
Xist has been shown to interact with the SHARP-SMRT complex and recruit it to the X chromosome, thereby activating HDAC3 leading to histone deacetylation and exclusion of RNA polymerase from the X chromosome [42]. As more lncRNAs are characterized, we suspect these functional classifications will change with the discovery of novel lncRNA cellular functions. All three of these broad functional classifications of lncRNAs are found within the lncRNA-MYC network, described below.

The lncRNA-MYC Network
Over the last decade, as lncRNAs have drawn the attention of more researchers, so too has the lncRNA-MYC network. Two recent reviews have described the lncRNA-MYC network, either by examining how lncRNAs influence MYC expression [9] or by selectively summarizing some of the interactions seen within the human lncRNA-MYC regulatory network [43]. Here, we expand on what is known about the lncRNA-MYC network by providing a comprehensive summary of the molecular interactions within this regulatory network (Table 1), with an emphasis on recent developments in the field demonstrating functional relationships between cancer-associated lncRNAs and MYC.

PVT1
We begin by describing recent findings suggesting a reciprocal relationship between MYC (formerly c-MYC) and a well-known lncRNA, PVT1. Both the MYC and PVT1 genes are located in the 8q24 chromosomal region, which is frequently referred to as a "gene desert" because it contains few protein-coding genes (Figure 3). However, several lncRNA genes have been discovered within this region. The 8q24 region has also been of particular interest because it is a frequent site of genomic alterations, including amplifications and translocation breakpoints, in several different types of cancer [44]. Moreover, aberrant overexpression of PVT1 has been discovered in many different human cancers [45]. As previously mentioned, MYC is an oncogenic transcription factor and can either activate or repress transcription [46]. In a recent study, PVT1 was shown to contain two non-canonical MYC-binding sites, which were found to be important for the binding of both MYC and its paralog MYCN (formerly N-MYC) to the promoter region of PVT1, with changes in H4 acetylation and PVT1 RNA production correlating with changes in MYCN occupancy at the PVT1 promoter [47]. As suggested by the authors, this study demonstrates that PVT1 is a likely downstream target of MYCN. Conversely, the PVT1 lncRNA has been shown to be important in the regulation of MYC expression. Through the use of chromosome engineering in mice and both loss-of-function and gain-of-function analyses in different human cancer cell lines, it was demonstrated that PVT1 is required for high MYC protein expression, via its capacity to protect MYC from phosphorylation and subsequent degradation [48]. PVT1 is an exceptionally interesting lncRNA, both in its ability to physically interact with and regulate MYC and in its pivotal roles in many cancers, making it an attractive therapeutic target. For a more extensive discussion of PVT1 and its oncogenic features, we refer the reader to a recent review by Colombo et al. [45].

The CCAT family
The colon cancer associated transcripts (CCATs) are a collection of lncRNAs located on different chromosomes that have been both associated with and functionally demonstrated to be involved in the development of human colorectal cancers (CRC). Three of the best-characterized CCAT lncRNAs are CCAT1 (also known as CARLo-5), CCAT2 and CCAT6. CCAT6, also known as MYCLo2, will be discussed below in the MYCLos section. While the CCAT6 gene is located on chromosome 7, CCAT1 and CCAT2 are located in the gene desert region of 8q24, near MYC and PVT1. Genome-wide association (GWA) studies have implicated the 8q24 region in CRC [49,50,51]. Following these GWA studies, CCAT1 was identified and characterized as a highly specific marker for CRC [52].
The interplay between MYC and CCAT1 involves many complex molecular interactions. Contained within the 8q24 region are several chromatin-looping interactions that have been shown to be tissue-specific [53] and have been suggested to regulate MYC expression [53][54][55][56][57][58]. One of the most studied structural elements found in the 8q24 region is an enhancer region located ~335 kb upstream of MYC, frequently referred to as MYC-335 [55,56,57]. Located ~180 kb upstream of MYC-335 is CCAT1, and this region is considered to be a super-enhancer (Figure 3). A recent study showed a long-range physical interaction between MYC-335 and the promoter of CCAT1, suggesting that MYC-335 is important for CCAT1 expression [54]. It was later demonstrated that a long isoform of CCAT1, referred to as CCAT1-L, is important in the maintenance of this chromatin interaction via its role in the recruitment of the transcription factor CCCTC-binding factor (CTCF) [58]. CCAT1 has also been suggested to regulate MYC post-transcriptionally. Deng et al. found that CCAT1 was deregulated in hepatocellular carcinoma, and that CCAT1 expression correlated with the progression of the malignancy and poor prognosis [59]. With the use of RNA immunoprecipitation, CCAT1 was discovered to function as a let-7 miRNA sponge, thereby disinhibiting MYC [59]. Adding to the complexity, MYC has been shown to bind to the promoter of CCAT1, upregulate its expression, and promote proliferation and invasion of colon and gastric cancer cells [60,61].
Also important to the regulation of MYC expression is the CCAT2 lncRNA. CCAT2 is transcribed from MYC-335, described above, is overexpressed in CRC, and has been shown to promote tumor growth and metastasis [62]. In the same study, CCAT2 was also shown to upregulate transcription of MYC through a physical interaction with TCF7L2 [62]. However, further investigation is needed to determine mechanistically how CCAT2 stimulates TCF7L2-mediated transcription of MYC. Recently, additional CCAT lncRNAs have been discovered; however, it remains unclear whether these novel CCAT lncRNAs are part of the lncRNA-MYC regulatory network [63]. Altogether, the CCAT lncRNA family is proving to be complex and important in colorectal cancer, and possibly other cancers, and in the regulation of MYC expression.

MYCLos
MYCLos is a collective term for several lncRNAs included within the CCAT family, coined by a research group examining the importance of these lncRNAs in human CRC. In the original study, conducted by Kim et al., a microarray analysis was used to profile ~33,000 lncRNAs in both normal and CRC samples [63]. Their results revealed thousands of lncRNAs to be differentially expressed, including the CCAT1 and CCAT2 lncRNAs that had previously been demonstrated to be important in several stages of CRC [52,54,58,62,64,65]. To narrow their search and isolate the lncRNAs that were both differentially expressed in CRC and regulated by MYC, they examined the effects of MYC knockdown in different CRC cell lines. From these experiments, they identified three lncRNAs, referred to as MYCLo1, MYCLo2 (also known as CCAT6), and MYCLo3, that were transcriptionally upregulated by MYC. They later confirmed that these MYCLos had influential roles in cell proliferation and cell cycle progression by regulating the expression of CDKN1A and CDKN2B, known gene targets of MYC. In a follow-up study by the same research group, three additional lncRNAs, named MYCLo4, MYCLo5, and MYCLo6, were identified and found to be repressed by MYC. Similar to MYCLos 1-3, MYCLos 4-6 were also found to influence cell proliferation and cell cycle progression by regulating the expression of MYC target genes [66]. Collectively, MYCLos are a newly identified class of MYC-regulated lncRNAs, with some of them having an oncogenic role (MYCLos 1-3) and others having a tumor suppressor role (MYCLos 4-6). In the future, it will be important to determine how universal these lncRNAs are to the functions of MYC and whether a similar regulation of MYCLos by MYC is observed in other cancers.

The PCAT family
The prostate cancer associated transcripts (PCATs) are another class of lncRNAs within the lncRNA-MYC network. Three of the better-characterized PCAT lncRNAs are PCAT1, PCAT8 (also known as PRNCR1 and CARLo-3) and PCAT9 (also known as PCGEM1). While PCAT9 is located on chromosome 2, PCAT1 and PCAT8 are located ~715 kb and ~645 kb upstream of MYC, respectively. As mentioned above, lncRNAs have been shown to regulate the transcription of the MYC gene and the stability of the MYC protein. More recent evidence suggests that lncRNAs may also influence MYC protein expression at the level of the MYC mRNA. In a recent study in prostate cancer cells, it was shown that PCAT1 attenuates the downregulation of MYC protein expression (but not mRNA amount or stability) by interfering with miR-34a [67], a miRNA known to regulate MYC expression by targeting the MYC mRNA 3′UTR [68,69,70]. Although many lncRNAs act as sponges to sequester miRNAs away from their mRNA targets [71,72], the investigators were unable to identify any putative miR-34a binding site in PCAT1. Therefore, it was suggested that PCAT1 indirectly affects the miR-34a post-transcriptional regulation of MYC [67]. While PCAT1 does appear to be directly involved in the lncRNA-MYC network, it is unclear whether PCAT8 is also part of this network; however, PCAT8 has been associated with both prostate and colorectal cancers [73,74].
In a study by Hung et al., PCAT9, also known as prostate cancer gene expression marker 1 (PCGEM1), was found to be an important transcriptional mediator of many metabolic pathways in prostate cancer cells [75]. With chromatin isolation by RNA purification (ChIRP), a technique developed to examine specific RNA-DNA interactions [76], it was demonstrated that PCAT9 physically interacts with the promoters of metabolic genes, and that PCAT9 expression affected cell-cycle progression and proliferation [75]. In addition, PCAT9 was found to bind to MYC, and knockdown of PCAT9 diminished the recruitment of MYC to metabolic genes [75]. To our knowledge, PCAT9 is the only lncRNA that has been shown to bind to MYC and promote its transactivation activity, thereby affecting the metabolism of cancer cells.

GAS5
The growth arrest-specific 5 (GAS5) lncRNA is a functionally diverse lncRNA [77] that is transcribed from chromosome 1. GAS5 has been suggested to be a tumor suppressor and has been implicated in several human cancers [78][79][80][81][82]. Whereas PCAT1 disinhibits the translation of MYC mRNA, GAS5 has been demonstrated to interfere with it. In a recent study, GAS5 was shown to bind to both the eIF4E translation initiation factor and the MYC mRNA, thereby inhibiting translation of MYC [82]. However, further investigation is needed to determine mechanistically how GAS5 suppresses the translation of MYC mRNA. This study provides another example of the diversity of mechanisms by which lncRNAs regulate the expression of MYC.

GHET1
Gastric carcinoma proliferation enhancing transcript 1 (GHET1) is an unspliced lncRNA transcribed from chromosome 7 that has been implicated in gastric and bladder cancers [83,84]. First discovered by Yang et al., GHET1 was found to be upregulated in gastric carcinoma clinical samples and higher levels of GHET1 expression correlated with a poor survival rate [84]. Knockdown of GHET1 was shown to inhibit proliferation rates of gastric carcinoma cells. Conversely, overexpression of GHET1 promoted cell proliferation rates in vitro and tumor growth in vivo. With the use of different immunoprecipitation techniques, GHET1 was shown to physically interact with insulin-like growth factor 2 mRNA binding protein 1 (IGF2BP1) and also promote the binding of IGF2BP1 to MYC mRNA aiding in its stabilization [84]. The MYC mRNA is a very unstable mRNA that is rapidly degraded, and IGF2BP1 is part of a protein complex that has been shown to promote its stability [85,86]. In the future, it would be interesting to see if GHET1 maintains this same mechanistic relationship with MYC in other malignancies.

H19
Imprinted maternally expressed transcript, known as H19, is a lncRNA expressed only from the maternal allele on chromosome 11 that has been shown to be essential for human tumor growth and metastasis [87,88]. Moreover, H19 has been demonstrated to be functionally important in several human cancers [89][90][91][92][93][94]. While it has been known for many years that H19 is a key player in many human malignancies, it was only recently that a functional link between H19 and MYC was discovered. MYC was found to bind to E-boxes located in the H19 promoter and assist in histone acetylation, thereby promoting H19 expression [94].
Many of these findings were recapitulated in a later study [93]. Interestingly, H19 is predominantly a cytoplasmic lncRNA, and it has recently been demonstrated to be important in muscle differentiation by acting as a molecular sponge for the let-7 miRNA [96]. Furthermore, the role of H19 in metastasis was later elucidated in ovarian cancer cells, where H19 was discovered to interfere with let-7-mediated downregulation of MYC mRNA and protein levels [97]. Collectively, H19 is one of the most pervasively dysregulated lncRNAs seen in human cancer, and to date it is one of only a few lncRNAs that feed into a positive feedback loop with MYC, being transcriptionally upregulated by MYC and post-transcriptionally relieving let-7-mediated repression of MYC.

TUSC8
A relatively uncharacterized lncRNA, referred to as tumor suppressor candidate 8 (TUSC8) and located on chromosome 13, has also been suggested to modulate the expression of MYC. In a study by Liao et al., TUSC8 was found to be downregulated in cervical cancer, and TUSC8 expression was found to correlate with the progression of cervical cancer and with patient survival rate. In HeLa, SiHa and HCC94 cells, overexpression of TUSC8 was discovered to diminish both MYC mRNA and protein levels and decrease proliferation rates, while knockdown of TUSC8 had the opposite effect on both MYC expression and proliferation rates [98]. However, the mechanism by which TUSC8 regulates the expression of MYC is unclear, and these observed effects on MYC expression could potentially be indirect.

Conclusion
Our understanding of the dynamic regulatory relationship between lncRNAs and MYC remains in its infancy. However, just within the past year there have been several studies exploring this potentially invaluable relationship found within many human cancers. It is not surprising that MYC would transcriptionally regulate many lncRNAs, and it is especially interesting that MYC oncogenic functions could be mediated through the regulation of specific lncRNAs. Given the crucial role of MYC in many cancers, these findings suggest that MYC-regulated lncRNAs, and also lncRNAs that regulate MYC, could be potentially valuable targets in the treatment of many human cancers. MYCN is another interesting protein of the lncRNA-MYC network that is garnering attention. New studies have explored a functional connection between MYCN and lncRNAs implicated in cancer [99][100][101][102][103]. Given its importance in the nervous system and mesenchymal tissues [104,105], MYCN, like MYC, could also mediate some of its oncogenic functions through the regulation of lncRNAs. Currently, we are still left with many unanswered questions concerning the importance of the lncRNA-MYC regulatory network in the development and/or maintenance of cancer. Specifically, it would be interesting to know how pervasive these regulatory networks are and whether the same or distinct molecular interactions exist in different malignancies. Given the sheer number of different lncRNA genes/loci, which give rise to an even larger number of lncRNA transcripts, and the fact that many of these lncRNAs are expressed in both a temporal and tissue-specific manner [14,21,22], one could postulate the existence of many more lncRNAs that could be regulated by MYC in a context-dependent manner. Altogether, future investigations into this complex regulatory network could provide critical insights into the biology underlying the many different types of cancer.
Figure 1. Overview of two methodologies used to determine RNA secondary structure. A. Parallel analysis of RNA structure (PARS) uses an in vitro enzymatic treatment with single-strand (S1 nuclease, red scissor) and double-strand (RNase V1, blue scissor) cutters to generate two pools of digested RNA. After digestion, adaptor sequences are ligated to the cleavage sites, and the fragments are converted into a cDNA library and subjected to next-generation sequencing (NGS). Cleavage sites, identified from the sequencing data, provide the locations of double-stranded RNA regions (seen from the RNase V1 cleavage sites) or single-stranded regions (seen from the S1 nuclease cleavage sites). Collectively, from these data the secondary structure of RNA molecules can be determined. B. An in vivo chemical treatment, named Structure-seq, uses DMS to selectively methylate available adenines and cytosines (denoted by red letters). Reverse transcriptase activity stops one nucleotide before reaching the methylated adenine or cytosine. A cDNA library is constructed and subjected to NGS. As a result, the signature of discernible stop sites can be used to infer secondary structure from NGS data.

Figure 2. Functional categories of lncRNAs. There are three broad functional classifications for lncRNAs. A. LncRNAs can act as decoys, sequestering proteins and/or regulatory RNAs, such as miRNAs, away from their targets or cellular locations. B. LncRNAs can also be key players in the recruitment of proteins, such as chromatin-modifying enzymes, to specific genomic locations, thereby influencing transcriptional events. C. LncRNAs can provide a platform or scaffold to facilitate different molecular interactions, such as protein-protein interactions.

Figure 3. The cancer-specific molecular interactions of the lncRNA-MYC network at the 8q24 genomic region. CCAT1 regulates MYC expression both transcriptionally (through chromatin interactions) and post-transcriptionally (through titration of let-7). CCAT2 stimulates TCF7L2-mediated transcription of the MYC locus. PCAT1 prevents miR-34a-mediated translational repression. PVT1 binds to MYC, preventing threonine-58 phosphorylation by glycogen synthase kinase 3 (GSK3) and subsequent MYC degradation. Collectively, all of the flanking lncRNAs promote the accumulation of MYC; therefore, when these lncRNAs are inappropriately upregulated, MYC-dependent malignancies can develop.