Genetic tools to dissect functions of long noncoding RNAs

LncRNAs represent an abundant group of noncoding transcripts, some of which carry out important regulatory functions. To survey the biological and molecular roles of lncRNAs, reliable strategies for their genetic inactivation are required. Several lncRNA features make them challenging to target by genome editing. First, lncRNA loci often span large genomic distances. As such, full or partial deletion alleles are not always easy to generate and interpret as they might affect DNA regulatory elements. Second, in contrast to proteins, lncRNA transcripts are usually resistant to the minimally invasive approach of point substitutions. Third, lncRNA sequences exhibit rapid evolutionary turnover, impeding prediction and targeting of the specific functional sequence elements. Nonetheless, advances in genome editing and comparative genomics have expanded the repertoire of genetic strategies to dissect lncRNA functions in model organisms and cell lines. In this review, we discuss several approaches that have been used to generate lncRNA mutant alleles, focusing on vertebrate lncRNAs. We also briefly highlight comparative genomics approaches to identify conserved lncRNA sequence motifs, which represent attractive target sequences to abrogate lncRNA functions and to pinpoint functional contributions of these elements.

KEYWORDS
genetic inactivation, long noncoding RNAs, mutant alleles, rescue experiments, sequence motifs

| INTRODUCTION
Abbreviations: ASO, antisense oligonucleotide; KO, knockout; lncRNA, long non-coding RNA; MO, morpholino oligomer; poly(A), polyadenylation; PVT1, plasmacytoma variant translocation 1; TSS, transcription start site; XCI, X-chromosome inactivation; Xist, X inactive-specific transcript.
The discovery of regulatory long noncoding RNAs (lncRNAs) such as PVT1, Xist, H19, and Airn 1-6 marked the beginning of a new era of RNA research, in which lncRNAs arose as important regulators of development, physiology and disease. [7][8][9] lncRNAs represent a vast and functionally heterogeneous group of RNA polymerase II-transcribed, capped, polyadenylated and usually spliced transcripts, with little or no protein-coding potential. 7 Depending on the stringency of lncRNA annotations, the estimated number of human lncRNA transcripts equals or even exceeds the number of protein-coding genes, 10,11 raising questions about their biological functions. Whereas the majority of lncRNAs are likely to be functionally inert, there is an increasing number of lncRNAs with characterized biological and molecular functions. To understand the common principles underlying various roles of lncRNAs, classification of these transcripts into relevant sub-groups is key. However, assigning lncRNAs to distinct sub-groups has been hampered by the high evolutionary turnover of lncRNA sequences. In general, many lncRNAs are lineage-specific and do not have sequence orthologs in other species. 12 Only a small set of mammalian lncRNAs exhibits detectable sequence conservation to the base of vertebrates, 13,14 making the identification of their potential functional sequences very challenging. An alternative approach to sequence comparisons is to base lncRNA categorization and grouping on their known biological and molecular functions exerted either by the mature lncRNA transcript, the act of lncRNA locus transcription, or the regulatory DNA elements embedded in the lncRNA locus 7 (Figure 1).
In addition, some lncRNAs are translated into functional micropeptides [15][16][17][18] (Figure 1). Different genetic strategies have been employed to uncouple RNA-mediated effects from those mediated by DNA regulatory elements, transcription, or micropeptide translation.
In the past two decades, lncRNAs have been investigated primarily in mammalian cell lines and in the mouse model. 19 More recently, a number of lncRNA mutant alleles have also been generated and characterized in zebrafish. [20][21][22][23][24] Regardless of the animal or cellular system of choice, the generation of lncRNA mutant or null alleles is the first important step to understanding their roles. The ease of CRISPR-Cas9-based genome editing in cell lines and model organisms boosted lncRNA research. It promoted the development of different complementary strategies to distinguish between RNA-based regulation, regulation driven by the act of transcription, and regulation by DNA elements embedded in the lncRNA loci. In this review, we focus on currently available approaches to functionally interrogate lncRNAs in mammalian cell lines and vertebrate model organisms. We discuss several complementary genetic strategies to disrupt lncRNA expression and potential caveats associated with each of the strategies. We briefly survey recently developed comparative genomics approaches for the identification of conserved lncRNA sequence elements, which facilitate their targeting for functional analyses. We also cover rescue strategies of mutant alleles that will further contribute to the discovery of functional elements of lncRNA transcripts.

| GENOME EDITING STRATEGIES FOR LNCRNA INACTIVATION
To genetically abrogate lncRNA expression, multiple strategies have been used in mammalian cell lines and vertebrate model organisms, ranging from whole-gene deletion to minimally invasive approaches (Figure 2). Today, the majority of genetic tools are based on the CRISPR-Cas9 genome editing technology, which enables easy, targeted alterations at specific genomic loci. Prior to CRISPR-Cas9-based technology, several alleles that were key to our understanding of the biological functions of lncRNAs were generated by classical homologous recombination. In this section, we survey the main strategies to eliminate or diminish lncRNA functions including (a) removal of lncRNA
One of the most straightforward, but rather crude, approaches to generate lncRNA null alleles is the complete removal of the DNA sequence from which the lncRNA is transcribed. Alternatively, the lncRNA sequence can be replaced by insertion of various reporter genes or antibiotic resistance cassettes. 25,26 This approach serves as a valuable starting point for assessing the functionality of the locus. If the full gene ablation leads to a mutant phenotype, the targeted locus can be further investigated with orthogonal, less invasive methods to distinguish between RNA- or DNA-dependent effects on the phenotypic outcome.
One of the most studied lncRNAs is Xist (X inactive-specific transcript), a master regulator of mammalian X-chromosome inactivation (XCI) and dosage compensation between females and males. [27][28][29] Pioneering work on the roles of Xist during mouse development largely relied on the insertion of resistance cassettes, flanking either full or partial lncRNA gene sequences. 5,30 A ~15 kb genomic deletion of the Xist locus removed the vast majority of the gene, while leaving its promoter intact. 30 Female mice carrying the Xist mutation inherited from the paternal X chromosome exhibited severe growth retardation and died early in embryogenesis, suggesting the requirement of the Xist locus for female dosage compensation. 30 To unequivocally establish the functionality of the Xist transcript, multiple other alleles including additional deletions, transgene approaches and peptide nucleic acid (PNA)-based interference 31 have been generated over the past two decades. These additional alleles have been extensively reviewed elsewhere. 32
An analogous strategy of full gene ablation was utilized in a pioneering medium-throughput study in mice to explore the in vivo relevance of 18 lncRNAs, selected on the basis of functional in cellulo assays and in silico data. 25 The sequences of the selected lncRNAs were replaced by the lacZ cassette in mice, thus inactivating the lncRNA transcript while keeping their respective promoters transcriptionally active. Among the examined candidates, inactivation of five lncRNAs (Fendrr, Peril, Mdgt, Pint, and Brn1) led to an observable phenotype at the organismal level, with three (Fendrr, Peril, and Mdgt) resulting in lethality in early development. 25 Gene replacement by a reporter cassette offers full gene inactivation combined with simultaneous in vivo tracking of its spatiotemporal expression.
By inserting the lacZ reporter cassette into the lncRNA knock-out (KO) loci, expression of 13 selected lncRNAs has been examined in the developing and mature mouse brain. 26 Insertion of the lacZ reporter also facilitated the in vivo identification of functional elements of the Linc-p21 locus. 33 Replacing the entire gene body and a part of the promoter region of Linc-p21 by the lacZ reporter enabled tracking of Linc-p21 expression in mouse tissue. Significant changes in gene expression were observed even in tissue with no detectable lacZ reporter, suggesting an RNA-independent regulation of gene expression by DNA enhancer elements embedded in this locus. 33
FIGURE 2 Experimental approaches that have been used to inactivate lncRNAs in cell lines and animals
Using a full- or partial-gene deletion approach, several lncRNA mutant alleles have also been generated in zebrafish. 23 One of these mutants was lnc-phox2bb−/− which, upon removal of the whole gene body, displayed several developmental abnormalities and did not survive to adulthood. 23 In contrast, zebrafish carrying a homozygous deletion of the transcriptional start site (TSS) of lnc-phox2bb developed normally, suggesting the presence of enhancer elements embedded in the lnc-phox2bb locus. 23
Taken together, full deletion of a lncRNA locus or its replacement by a reporter gene offers a valuable strategy for a preliminary assessment of lncRNA locus functionality. However, interpretation of the phenotypic deficits caused by whole-gene deletions requires careful analysis, including characterization of additional alleles generated by less invasive methods.

| Partial deletions and targeted removal of functional domains of lncRNAs
Detailed structure-function analyses of some well-studied lncRNAs have defined the domains and sequence elements that carry out distinct functions. A prominent example of such a lncRNA is Xist. Within its ~17 kb RNA sequence, Xist encompasses six conserved regions, called repeats A through F, that exert different functions during Xist-mediated gene silencing. 4,27,[30][31][32][34][35][36][37][38] In the last two decades, the functional contribution of different Xist regions to XCI has been intensively investigated through multiple complementary approaches, including partial deletions of the Xist transcript either at the endogenous locus or in the context of autosomal cDNA-inducible systems. Deletion of the A-repeat located at the 5′ end of Xist abolishes its silencing function in mouse ES cells. 39 Consistent with the role of the A-repeat in ES cells, a mouse deletion allele of the A-repeat fails to undergo X-inactivation. 34 However, this deficit was due to the abolished expression of the Xist transcript in mice rather than a defect in the silencing function of the mutated RNA. 34 Whereas the deletion of the B-repeat impairs the maintenance of Xist-mediated gene silencing at later differentiation stages, 35 combined deletion of the B- and C-repeats abolishes recruitment of Polycomb repressive complex 1 and complex 2 (PRC1 and PRC2) to mark the future inactive X chromosome. 36 Removal of the E-repeat leads to loss of its interaction with CIZ1 (CIP1-interacting zinc-finger protein 1), which is required for Xist localization at the inactive X chromosome, and which prevents Xist diffusion throughout the nucleus. 37,38 Combined with other methods, partial deletion analyses established distinct functional modules within the Xist transcript and their differential contribution to XCI.
Targeted removal of specific lncRNA domains is a powerful tool to define lncRNA sequence elements required for particular cellular and molecular functions. However, this approach requires prior knowledge of lncRNA sequences, whereas the majority of lncRNAs do not exhibit easily identifiable functional domains. Moreover, partial removal of the endogenous lncRNA sequence may result in unpredictable changes in expression levels of the residual transcript as demonstrated for partial deletions of the endogenous Xist sequences 34,40 and for several partial deletion alleles in zebrafish leading to unexpected overexpression or impaired expression of truncated lncRNA transcripts. 22

| lncRNA sequence inversions
Mutant alleles carrying inverted lncRNA sequences can facilitate the interpretation of deletion alleles, while minimizing effects resulting from the removal of DNA regulatory motifs. By combining the Cre/loxP system and sequence inversions in mouse, an inducible partial inversion of the Xist sequence demonstrated the importance of the Xist transcript for the silencing of X-linked loci in cis. 41 The Xist transcript harboring the inversion remained localized to the X chromosome in cis, but at reduced efficiency, leading to a decrease in its activity.
Whereas targeted sequence inversions require complex genetic strategies, spontaneous inversions are not uncommon and often occur when generating CRISPR-Cas9-based deletions. We applied this strategy of searching for spontaneous inversions to the zebrafish lncRNA libra. 21 We generated both a full-locus deletion allele and an inversion of ~5.5 kb of the most conserved part of the libra transcript in zebrafish. The inversion allele was a spontaneous byproduct of the intended partial deletion of the region. When we compared both alleles, the inversion allele fully recapitulated the phenotype of the full-locus deletion, namely altered explorative and anxiety-like fish behavior, serving as additional evidence for an RNA-regulated phenotypic outcome. 21 Together, lncRNA sequence inversions can serve as an additional approach that is complementary to other lncRNA inactivation strategies. Sequence inversion alleles are particularly useful when the expression level of the inverted transcript remains comparable to its wild-type counterpart, enabling direct comparisons and ruling out the impact of additional functional lncRNA elements. In addition, partial sequence inversions of lncRNA alleles can be employed to distinguish specific contributions of the lncRNA transcript or the DNA regulatory elements embedded in its locus, making this a very powerful tool for lncRNA studies.
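The difference between a deletion allele and an inversion allele at the sequence level can be illustrated with a minimal Python sketch. The locus sequence and the coordinates are toy values chosen for illustration and do not correspond to libra or any real locus. The sketch only shows that an inversion keeps the length of the locus and the spacing of flanking elements, while the inverted segment is replaced by its reverse complement; it makes no statement about how any allele discussed above was designed.

```python
# Toy illustration: deletion vs. inversion of a segment within a locus.
# The sequence and coordinates are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def delete_segment(locus: str, start: int, end: int) -> str:
    """Remove locus[start:end] (0-based, half-open coordinates)."""
    return locus[:start] + locus[end:]


def invert_segment(locus: str, start: int, end: int) -> str:
    """Replace locus[start:end] by its reverse complement."""
    return locus[:start] + reverse_complement(locus[start:end]) + locus[end:]


locus = "ATGGCCTTAGGCATCCGTAA"  # hypothetical 20-nt "locus"
start, end = 5, 15              # hypothetical segment to delete or invert

deletion_allele = delete_segment(locus, start, end)
inversion_allele = invert_segment(locus, start, end)

print("wild type:", locus)
print("deletion :", deletion_allele)
print("inversion:", inversion_allele)
```

In this toy case the deletion shortens the locus, whereas the inversion leaves its length unchanged, which is the property that helps to separate RNA-dependent effects from effects caused by removing or displacing DNA elements.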

| Removal of transcriptional start sites of lncRNAs
Deletion of lncRNA TSSs and upstream promoter regions is a popular strategy for lncRNA inactivation that has been successfully applied to multiple lncRNA genes in cell lines and model organisms. 23,[42][43][44][45][46][47][48] Because the deletions are relatively small, TSS removal is one of the least invasive approaches to eliminate transcription of the entire lncRNA, with a reduced risk of affecting DNA regulatory elements. One of the key challenges associated with this approach is the reliable identification of the TSS. lncRNA genes are often transcribed from multiple TSSs, generating numerous isoforms even in the same cell or tissue. Another challenge of lncRNA TSS removal, particularly affecting model organisms, is the potential activation of tissue-specific alternative TSSs, which maintain lncRNA expression and generate hypomorphic mutants. 22 One of the most striking examples of alternative TSS usage was found in zebrafish lnc-sox4a ΔTSS mutants. The lnc-sox4a ΔTSS allele was generated by a deletion of ~200 nt surrounding the TSS of lnc-sox4a. Examination of lnc-sox4a expression in lnc-sox4a ΔTSS embryos and adult tissues showed efficient ablation of lnc-sox4a expression. However, in the adult lnc-sox4a ΔTSS brain, lnc-sox4a remained robustly expressed at the level of the wild-type transcript due to the usage of an alternative TSS. Remarkably, this brain-specific TSS was located in an intron, ~70 kb downstream from the main TSS, and was activated only upon the removal of the main TSS. 22 While the ablation of lncRNA expression by TSS removal is easily trackable in cell lines, interpretation of TSS deletion alleles requires particularly thorough analyses in model organisms.
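When planning a TSS deletion, it can be useful to check which annotated TSSs would remain outside the deleted window. The following Python sketch is a minimal illustration under stated assumptions: the coordinates, the symmetric flank size and the helper names (`Interval`, `tss_deletion_window`, `surviving_tss`) are hypothetical and are not taken from any of the cited studies. Such a check can only flag annotated alternative TSSs; a cryptic TSS that becomes active only after removal of the main TSS, as described above, would still have to be detected by expression analyses in the mutant.

```python
# Minimal sketch: which annotated TSSs survive a small TSS deletion?
# All coordinates and names are hypothetical.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Interval:
    start: int  # 0-based, inclusive
    end: int    # exclusive

    def contains(self, position: int) -> bool:
        return self.start <= position < self.end


def tss_deletion_window(tss: int, flank: int) -> Interval:
    """Symmetric deletion window of 2 * flank nt centred on the TSS."""
    return Interval(max(0, tss - flank), tss + flank)


def surviving_tss(annotated_tss: List[int], window: Interval) -> List[int]:
    """Annotated TSS positions that are NOT removed by the deletion."""
    return [pos for pos in annotated_tss if not window.contains(pos)]


# Hypothetical locus: a main TSS, a nearby TSS, and a distant intronic TSS.
main_tss = 10_000
annotated = [10_000, 10_040, 80_000]

window = tss_deletion_window(main_tss, flank=100)  # ~200-nt deletion
remaining = surviving_tss(annotated, window)

print("deleted window:", window)
print("annotated TSSs left intact:", remaining)
```

Any TSS reported as intact is a candidate source of residual expression and would justify checking lncRNA levels in each relevant tissue of the mutant.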

| Insertions of the premature transcription termination signals
The insertion of premature polyadenylation (poly(A)) cassettes in various positions with respect to the lncRNA locus is a minimally invasive approach for studying lncRNAs, particularly those that are transcribed from complex loci enriched for multiple TSSs, cis regulatory elements and/or are antisense to protein-coding genes. In addition, the insertion of poly(A) cassettes is a powerful tool to differentiate the individual contributions of the lncRNA transcript, the act of transcription and the DNA elements embedded in the locus. This strategy has been applied to studies of Airn or Haunt, lncRNAs influencing their respective neighboring gene clusters, and the lncRNA Lockd, transcribed from an enhancer of the Cdkn1b gene. 47,49,50 In addition to cell lines, this approach is also being increasingly used in lncRNA studies in animal models. The functional contribution of Fendrr, Upperhand (Uph), and Charme to heart development was revealed by replacing early exons with several copies of the poly(A) cassette to produce the respective null alleles in mice. [51][52][53] In the case of Uph, this approach not only preserved the integrity of the important developmental regulator Hand2, whose gene is antisense to Uph, but also served as a basis to demonstrate that Uph acts in cis to regulate Hand2 transcription via modification of its chromatin signature. 52 In zebrafish models, for which genetic tools are not as elaborate as in mouse, this approach was established for inactivation of malat1 gene expression. The malat1 lncRNA is one of the most abundant transcripts in the cell and its locus contains multiple TSSs and clustered enhancers. 22,54 Following pioneering work in human cells, 55 premature insertion of the poly(A) cassette in the 5′ region of the malat1 gene abrogated the expression of the lncRNA, establishing poly(A) insertions for gene inactivation in zebrafish. 22
Although the insertion of poly(A) signals is a valuable approach to maximally preserve locus integrity, the main downside of this approach is the potential leaky expression of the remaining transcript, as reported for Haunt or Lockd. 47,50 To minimize the possibility of leaky expression when designing the mutant allele, one should consider the position and number of copies of the inserted polyadenylation cassettes, the expression levels of the lncRNA in question, as well as the global architecture of the locus.

| Insertions of transcript-destabilizing elements
The idea of inserting self-cleaving ribozymes into a target transcript arose as an elegant alternative for lncRNA inactivation. When inserted into a transcript, a self-cleaving ribozyme causes premature, co-transcriptional cleavage and degradation of the nascent transcript. The application of self-cleaving ribozymes for inactivation of lncRNA transcripts offers several advantages over other knock-down approaches, including very low off-target effects, minimal invasiveness of the genomic locus due to the small size of ribozymes, as well as the potential prolonged and reversible effect on gene inactivation. In addition, insertion of the mutated sequence coding for inactive ribozyme can be used as a suitable control to assess nonspecific binding of a target RNA. Alternatively, RNA expression can be partially rescued by an antisense oligonucleotide (ASO) against the ribozyme sequence via steric inhibition of the ribozyme. 56 Tagging of endogenous lncRNA transcripts with ribozymes has been applied to a set of lncRNAs in mouse ES cells. 56,57 A knock-in of the Hammerhead ribozyme into 15 selected lncRNA genes in mouse ES cells demonstrated that the efficiency of ribozyme-mediated lncRNA depletion varies depending on the locus and/or the site of ribozyme insertion, ranging from nearly complete to modest downregulation of the transcript. 56 A knock-in of the Twister (TWI) ribozyme into the first exon of lincRNA-p21 resulted in degradation of ~90% of the lincRNA-p21 transcript in mouse embryonic fibroblasts (MEFs) isolated from lincRNA-p21 TWI/TWI mice. 58 Whereas ribozyme-based transcript degradation offers an elegant approach to abolish or diminish lncRNA expression, its widespread application, especially in model organisms, is currently limited by our incomplete understanding of the efficiency of the ribozyme-mediated knock-down.

| Minimally invasive sequence perturbations targeting specific lncRNA elements
lncRNAs exert their molecular functions through interactions with DNA, other RNA molecules, and RNA-binding proteins, depending on secondary structure or short and oftentimes highly conserved motifs. 59 Targeted disruption of these short sequence elements, sometimes affecting only a few nucleotides, presents a minimally invasive approach when studying both full length lncRNAs and their individual sequence elements. In this section, we highlight some of the studies in which the utilization of such elegant approaches proved crucial in uncovering specific interactions and phenotypes.
Multiple studies have sought to distinguish the phenotypic contributions of the act of transcription and of the mature lncRNA transcript, and were recently reviewed elsewhere. 60 As an example, we highlight the impact of the lncRNA Haunt (HOXA upstream noncoding transcript) and its corresponding locus on the expression of the neighboring HOXA cluster in mouse ES cells. 47 HOX gene clusters play crucial roles in animal development and their expression is tightly regulated at the spatiotemporal level. 61 In addition, both HOX clusters and their surrounding loci, several of which produce lncRNAs, [62][63][64] are strongly enriched in various DNA elements and binding sites for several pluripotency factors, making them particularly challenging to study. Multiple mutant Haunt alleles were generated by serial deletions ranging from 131 bp to 29 kb. 47 Subsequent individual knock-ins of termination signals or the Haunt cDNA into its own locus uncovered unique and opposing contributions of the mature Haunt transcript and its genomic locus to the expression of HOXA cluster genes in retinoic acid-induced neural differentiation. 47 A similar approach was employed to further decipher the role of the lncRNA Braveheart (Bvht) in mouse ES cell differentiation. 65,66 Bvht is a mouse-specific lncRNA with important roles in directing cellular commitment to the cardiovascular lineage. 65 Structural analyses through several approaches revealed the presence of a short, G-rich internal loop (AGIL), whose targeted excision impacted cardiovascular lineage commitment without affecting the overall stability of the mutant transcript. Through further biochemical approaches, this was shown to be a consequence of selective abrogation of the Bvht interaction with CNBP, an RNA-binding protein with known roles in development, 67 underscoring the importance of a small structural motif for an important developmental process. 66
The introduction of small, targeted deletions that maximally preserve the locus while compromising the function of the studied lncRNA transcript is becoming an increasingly utilized method of choice in lncRNA studies with animal models, particularly when the whole-locus deletion causes lethality at early developmental stages. A prominent example of such a strategy is the functional study of the lncRNA veal2 in zebrafish. 24 While the whole-locus deletion led to lethality at larval stages, a targeted 8 bp deletion in the 5′ end of the transcript did not affect animal viability, and revealed the importance of veal2 for vascular integrity maintenance. 24 The impact of minimal deletions in lncRNA sequences on the whole organism was highlighted in a recent study that aimed to decipher the metabolic disorder phenylketonuria (PKU) using a mouse model. 68 The lncRNA Pair (PAH-activating lincRNA) was inactivated through the deletion of only two base pairs comprising an exon-intron junction, leading to the destabilization of the whole transcript and causing a PKU-reminiscent phenotype. This finding was further corroborated by the identification and exploration of the Pair interaction with the PAH protein, whose mutations are a leading cause of PKU in humans. Secondary structure and conservation analyses of Pair identified a conserved stem loop in which a single nucleotide substitution abrogated its interaction with PAH. 68 Such minimally invasive sequence alterations demonstrated that the lncRNA Pair modulates the enzymatic activity of PAH, thereby advancing our knowledge of PKU and providing the basis for the development of specific therapeutics.
Another example of such a strategy, applied in parallel in both zebrafish and mice, is the deeply conserved libra/Nrep transcript. A remarkable feature of this transcript is its near-perfect microRNA binding site for miR-29, conserved throughout the vertebrate lineage. 21 Nrep miR-29scr alleles were generated by scrambling the miR-29 binding site through a set of point substitutions in mouse and mouse ES cells. Analyses of the Nrep miR-29scr alleles showed that the near-perfect miR-29 binding site spatially and quantitatively restricts miR-29b expression by directing miRNA degradation. The failure to destabilize miR-29b in the cerebellum has physiological consequences and results in impaired balance and motor learning in mice. 21 Remarkably, another near-perfect miRNA binding site, for miR-7, was identified in the deeply conserved lncRNA Cyrano. 13,69 Similar to Nrep, a series of miR-7 site mutant alleles showed that Cyrano promotes efficient degradation of miR-7 in the mouse brain and in human cells through its highly complementary binding site. 69 Taken together, targeted alterations of specific sequence elements offer one of the most powerful strategies to precisely pinpoint biological and molecular functions of lncRNA sequences. However, knowledge of the functional motifs is a prerequisite for the design of minimally invasive genetic strategies. In the next section, we survey computational approaches that facilitate identification of lncRNA sequence elements.
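The logic of disrupting a miRNA binding site while leaving the rest of the transcript untouched can be sketched in a few lines of Python. This is an illustration under explicit assumptions and not the design used for the alleles described above: the miRNA and transcript sequences are invented, the site is assumed to be perfectly complementary to the miRNA, and the site is disrupted by shuffling its nucleotides (which preserves base composition) rather than by a hand-picked set of point substitutions. The only criterion checked is the loss of the canonical seed match, that is, the target sequence complementary to miRNA nucleotides 2-8.

```python
# Minimal sketch: scramble a miRNA binding site until the seed match is lost.
# Sequences are hypothetical; shuffling is an assumed design choice.
import random

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(rna: str) -> str:
    """Reverse complement of an RNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def seed_match(mirna: str) -> str:
    """Target-site sequence complementary to miRNA nucleotides 2-8."""
    return reverse_complement_rna(mirna[1:8])


def scramble_site(transcript: str, start: int, end: int,
                  mirna: str, rng: random.Random) -> str:
    """Shuffle transcript[start:end] until no seed match remains anywhere."""
    target = seed_match(mirna)
    site = list(transcript[start:end])
    for _ in range(1000):
        rng.shuffle(site)
        candidate = transcript[:start] + "".join(site) + transcript[end:]
        if target not in candidate:
            return candidate
    raise ValueError("could not remove the seed match by shuffling")


mirna = "UAGCAGGAUCUGAAACCGUGUA"        # hypothetical 22-nt miRNA
site = reverse_complement_rna(mirna)   # fully complementary site (assumption)
left_flank, right_flank = "GGCAUC", "AAUGCC"
transcript = left_flank + site + right_flank

start, end = len(left_flank), len(left_flank) + len(site)
scrambled = scramble_site(transcript, start, end, mirna, random.Random(0))

print("seed match in wild type :", seed_match(mirna) in transcript)
print("seed match in scrambled :", seed_match(mirna) in scrambled)
```

Because the scrambled allele keeps the transcript length and nucleotide composition of the site, differences from the wild-type allele are easier to attribute to the loss of miRNA pairing, although effects on RNA structure or stability would still need to be tested experimentally.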

| COMPARATIVE GENOMICS APPROACHES TO IDENTIFY AND TARGET CONSERVED lncRNA SEQUENCE ELEMENTS
In the last two decades, comparative genomics approaches have emerged as a powerful tool for multilevel comparisons of lncRNAs from various species. The development of computational methods, increasing availability of sequencing data and improved resolution at the nucleotide scale confirmed that while lncRNA sequences are generally poorly conserved throughout vertebrate evolution, certain patterns of conservation are detectable including synteny, short primary sequence or secondary structures (Figure 3).
Since the pioneering work on the identification of zebrafish lncRNAs with putative mammalian orthologs, the advancement of data resolution and in silico tools has represented a crucial step in lncRNA research. Several bioinformatic pipelines and algorithms such as PLAR, SEEKR, as well as the recently developed lncLOOM or lncEVO have become essential for the identification of functional elements within lncRNAs [12][13][14]17,41,[63][64][65][66][67][68][69][71][72][73][74][75] (Figure 3). Recently, such approaches have also revealed that several lncRNA loci contain putative small ORFs, often embedded in their deeply conserved regions. 12,15,16,64,70,[76][77][78][79][80][81][82][83][84][85] As our review aims at discussing genetic approaches to lncRNA editing, for which the use of bioinformatic pipelines has become essential, we invite the reader to consult the respective publications, as well as a recent detailed review on the topic. 86 Adopting advanced computational approaches as a basis for understanding lncRNA functions goes hand in hand with elegant, minimally invasive and polyvalent approaches at the genomic level, which often rely on several of the orthogonal methods reviewed above. 86 Such multilevel approaches were demonstrated in recent studies on Firre, correlating its expression dynamics to global gene expression changes, or in the identification of sequence-specific drivers of its differential subcellular localization in human and mouse cells. 87,88
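The idea of searching for short sequence elements shared between otherwise divergent orthologs can be illustrated with a deliberately naive Python sketch. It is not an implementation of PLAR, SEEKR, lncLOOM or lncEVO, which use considerably more elaborate models; it simply reports the k-mers present in every input sequence. The three sequences and the species labels are toy examples invented for illustration.

```python
# Naive sketch: k-mers shared by all orthologous sequences.
# Toy sequences; not the algorithm of any published tool.
from typing import Dict, Set


def kmers(sequence: str, k: int) -> Set[str]:
    """All distinct substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shared_kmers(sequences: Dict[str, str], k: int) -> Set[str]:
    """k-mers that occur in every sequence of the input dictionary."""
    kmer_sets = [kmers(seq, k) for seq in sequences.values()]
    return set.intersection(*kmer_sets)


def motif_positions(sequences: Dict[str, str], motif: str) -> Dict[str, int]:
    """0-based position of the first motif occurrence in each sequence."""
    return {name: seq.find(motif) for name, seq in sequences.items()}


# Hypothetical orthologs that share one short element.
orthologs = {
    "human": "ACGUUGGUGCUAGGCA",
    "mouse": "CCAUGGUGCUAUUGC",
    "zebrafish": "GAUUGGUGCUACAAU",
}

conserved = shared_kmers(orthologs, k=8)
print("shared 8-mers:", conserved)
for motif in conserved:
    print(motif, motif_positions(orthologs, motif))
```

Elements found in this way are only candidates: exact k-mer matching misses motifs with substitutions and ignores motif order and secondary structure, so any hit would still need to be tested with the targeted alleles described in the previous section.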

| STRATEGIES TO RESCUE lncRNA MUTANT ALLELES
The interpretation of mutant lncRNA alleles can be facilitated by rescue experiments establishing a formal association between a transcript and a phenotype. One of the most commonly used rescue strategies is the expression of the mature, full-length lncRNA from a plasmid or by genetic knock-in. Both types of rescue experiments have been successfully carried out for several lncRNAs, assessing not only if the activity of the locus depends on the lncRNA transcript per se but also if the examined lncRNA acts in cis or in trans. Genetic inactivation of the lncRNA Fendrr either by whole-gene ablation or by insertion of the premature poly(A) cassette leads to embryonic lethality in mice. 25,51 Expression of the Fendrr transgene from a BAC clone rescues embryonic lethality, suggesting that Fendrr is acting in trans. Similar to Fendrr, rescue experiments established a trans-acting mode of the lncRNA Firre. 89 Full deletion of the Firre locus including its promoter region leads to deficits in hematopoiesis in mice. Overexpression of the Firre transgene from a different chromosome in a Firre-deficient background rescues physiological and molecular phenotypes, suggesting that Firre acts in trans via its lncRNA transcript. 89 On the other hand, there are several examples where overexpression of a lncRNA could not rescue the observed phenotype. 47,52,90,91 Scenarios in which overexpression of the lncRNA does not rescue or only partially rescues the phenotype are rather complex to interpret. In such cases, additional analyses should be employed to discern whether the observed lack of rescue is due to mechanistic (transcription-dependent or cis-acting lncRNAs), biological (cell type, developmental stage, or specific isoform expression) or experimental (expression levels, promoter choice, expression of the primary lncRNA transcript) factors.
FIGURE 3 Different layers of lncRNA conservation identified by comparative genomics approaches
Using the same rescue strategy as above, functional conservation of orthologous lncRNA transcripts can be tested by re-introducing a lncRNA transgene from a different species in the null lncRNA background (Figure 4a,b). The functional equivalence of the orthologous lncRNA transcripts has been examined for the lncRNA THOR, whose sequence is conserved from fish to human. 20 Deletion of the most conserved part of THOR leads to fertilization defects in zebrafish. Furthermore, THOR−/− fish show resistance to cutaneous melanoma formation when using an inducible NRAS 61K-based melanoma system. Remarkably, introduction of human THOR in the THOR−/− zebrafish expressing the NRAS 61K oncogene reverted the effect of THOR loss on melanoma progression, resulting in a significant increase in the onset of melanoma development and the size of the tumors. 20 The conserved function of THOR was further corroborated by identification of the conserved molecular interaction of orthologous THOR transcripts with the IGF2BP1 protein. 20 Expanding on the idea of inter-species rescue experiments, one could envision generation of "hybrid" transcripts by genetically stitching lncRNA sequence elements from different species or from different transcripts to test their functional equivalency (Figure 4c). This strategy of generating a hybrid rescue allele has been tested in mouse ES cells that express A-repeat deficient Xist. 39 The A-repeat of Xist binds the SPEN protein, which is absolutely essential for XCI. 92 In mouse ES cells, SPEN also binds and silences endogenous retroviral elements (ERVs), which show structural similarity to the A-repeat. 93 Hypothesizing that ERVs may complement the A-repeat deletion of Xist, 39 a hybrid allele was generated in mouse ES cells, in which multiple ERV-derived SPEN binding sites were inserted into an Xist transcript that lacks the A-repeat. 93
Remarkably, the ERV elements inserted into the A-repeat deficient Xist transcript restored SPEN binding to Xist and rescued local gene silencing in cis. 93 While hybrid alleles can be very informative for determining functional sequence elements of lncRNAs, one potential risk when designing such hybrid alleles is unpredicted changes in hybrid transcript levels. Examination of several alleles in zebrafish showed that removal of relatively small lncRNA fragments might result in unexpected changes, such as overexpression of the residual lncRNA transcript as demonstrated for cyrano. 22 As our understanding of intrinsic sequence and structural elements regulating lncRNA stability remains limited, the same unexpected effects might occur when generating hybrid alleles.
Another variation of the rescue strategy above can be used for lncRNAs with well-established molecular mechanisms of action. As the majority of lncRNAs carry out their cellular and molecular functions through interactions with RNA-binding proteins (RBPs), engineered recruitment of an RBP to a lncRNA using a high-affinity tethering system may help to precisely define the functions of the lncRNA and/or its functional elements. Several tethering systems, including the bacteriophage MS2 stem loops-MS2 coat protein system, have been successfully used in the RNA field for decades. 94 The technique is based on the interaction of two recombinant molecules: an RNA of interest tagged with specific stem loops, and a target protein fused to a small peptide that binds these stem loops with high affinity (Figure 4d). As mentioned above, SPEN binding to Xist is essential for initiation of gene silencing on the X chromosome in ES cells and preimplantation mouse embryos. 95 Engineered tethering of only the SPOC domain of SPEN to Xist was sufficient to mediate gene silencing in mouse ES cells. 95 Adapting tethering experiments to lncRNA biology, in combination with careful analysis of lncRNA mutant alleles, may be key for defining the functions of lncRNA transcript elements and their protein interactors.

5 | TRANSIENT ALTERATIONS OF lncRNA EXPRESSION
While this review is focused mainly on the generation and interpretation of lncRNA genetic mutants, transient alterations of lncRNA expression by antisense oligonucleotides or next-generation CRISPR methodologies have also been used for lncRNA function discovery. Transient perturbations of lncRNA expression provide an invaluable tool for fast and often reversible assessment of the lncRNA contribution to a phenotype or mechanism observed and/or investigated through classically engineered genetic mutants. In addition, transient approaches are advantageous over genetic approaches in terms of time and resources in high-throughput studies that aim to identify candidates contributing to a particular mechanistic process or phenotype. As this review focuses on genetic approaches to lncRNA studies, we will only briefly summarize past and currently used strategies for transient alteration of lncRNA expression, together with their advantages and potential limitations.
Early transient approaches to lncRNA studies largely relied on morpholino oligomers (MOs), oligonucleotides in which the ribose moiety is replaced by a morpholine ring. MOs have been routinely used for knockdown in both zebrafish and mouse models, as well as in cell lines. 13,85,96 Alternatively, short RNA or DNA oligonucleotides became a method of choice for lncRNA knockdown; the former rely on the cellular RNAi pathway, and the latter on targeted RNase H-based degradation or on steric hindrance that modulates the splicing pattern or poly(A) site usage. 24,53,97-100 A clear advantage of such approaches, compared to MOs, is that base pairing with the transcript of choice leads to its destruction by the cellular machinery, making the knockdown efficiency easily quantifiable. In addition, antisense oligonucleotides (ASOs) are often chemically modified to increase their stability and specificity while decreasing toxicity, and can be used at lower concentrations to achieve the desired effect in both cell lines and animal models. 101,102 Importantly, ASOs can be used to target lncRNAs whose locus complexity or genomic localization, as in the case of TERRA, does not allow DNA editing. 103 Oligonucleotides targeting mature transcripts can be employed to distinguish the contribution of the mature lncRNA transcript from that of the act of its transcription, although some evidence indicates that this approach should be used with caution. 104 An alternative strategy to achieve RNA-targeted knockdown is the recently developed CRISPR-Cas13 system. 105,106 In contrast to Cas9, which is routinely used for gene editing, Cas13 is an RNA-guided RNA endonuclease that specifically cleaves single-stranded RNA and has no affinity for DNA. 105,106 This system has demonstrated increased precision compared to base-pairing strategies, and has been successfully applied in both mouse and zebrafish. 107,108 In addition to transcript-targeting strategies, a widely used method for transient regulation of lncRNA expression relies on an engineered CRISPR-Cas9 system in which a version of Cas9 that binds but is unable to cut its targets (dCas9) is fused to repressor or activator domains. 102,109 The inducible nature and relatively low off-target effects of this system make it well suited for targeted gene silencing or activation to study complex spatiotemporal contributions of lncRNAs. 64,110

6 | CONCLUSIONS AND PERSPECTIVES
The characterization of functional elements of lncRNA loci still poses a challenge. This can be due to a number of genomic constraints, such as the composition or location of the lncRNA-bearing loci, or to technical constraints, in the form of missing or inappropriate annotations of full lncRNA loci or their individual elements, such as TSSs. In this review, we have summarized the knowledge on currently available and frequently used techniques for genetic characterization of lncRNAs (Figure 1).
With the growing number of lncRNA-centric studies, it has become clear that a thorough characterization of lncRNAs requires a multidimensional strategy that combines the orthogonal approaches discussed in this review and employed in recent studies in mammalian cell lines as well as animal models. 21,46,47,58 In addition, combinatorial approaches using in silico analyses along with several of the above-mentioned strategies, particularly small sequence deletions or insertions, have recently been employed to dissect the contribution of several potential micropeptides whose ORFs are embedded in lncRNA sequences. 15,16,18,77 When considering different strategies to design lncRNA mutant or null alleles, detailed evaluation of the genomic landscape of the targeted loci may help to minimize the risk of DNA-dependent effects. Genomic datasets revealing the positions of active gene promoters (by chromatin marks such as H3K4me3), active enhancers (by chromatin marks such as H3K27ac), transcription start sites (TSSs) (by Cap Analysis of Gene Expression; CAGE) and polyadenylation sites (by 3P-seq or similar methods) are now available for multiple cell lines and model organisms, including mouse and zebrafish. Combining these datasets with RNA-seq and sequence conservation analyses may help in choosing the most suitable way to disrupt lncRNA expression in a minimally invasive manner.
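As a purely illustrative sketch of how such datasets could be combined when planning an allele (it is not a method taken from the studies cited here), the Python snippet below checks which annotated features a candidate deletion would remove. The `Feature` record, the function names, the feature labels, and all coordinates are hypothetical and invented for this example; intervals are assumed to follow the half-open, 0-based convention used by BED files.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feature:
    """One annotated genomic interval (hypothetical record, BED-style half-open)."""
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    kind: str   # illustrative labels: "promoter_peak", "enhancer_peak", "CAGE_TSS", "polyA_site"
    name: str


def overlaps(chrom: str, start: int, end: int, feature: Feature) -> bool:
    """Return True if [start, end) on chrom shares at least one base with the feature."""
    return chrom == feature.chrom and start < feature.end and feature.start < end


def features_hit_by_deletion(chrom: str, start: int, end: int,
                             features: List[Feature]) -> List[Feature]:
    """List every annotated feature that a planned deletion would at least partly remove."""
    if start >= end:
        raise ValueError("deletion must have start < end")
    return [f for f in features if overlaps(chrom, start, end, f)]


# Toy annotation with invented coordinates, standing in for real peak/TSS/poly(A) calls.
TOY_FEATURES = [
    Feature("chr1", 1000, 1500, "promoter_peak", "lncRNA_promoter"),
    Feature("chr1", 1200, 1201, "CAGE_TSS", "lncRNA_TSS"),
    Feature("chr1", 4000, 4600, "enhancer_peak", "intronic_enhancer"),
    Feature("chr1", 9000, 9001, "polyA_site", "lncRNA_3prime_end"),
    Feature("chr2", 4000, 4600, "enhancer_peak", "unrelated_enhancer"),
]

# A small deletion inside the gene body that avoids all annotated elements.
small_hits = features_hit_by_deletion("chr1", 2000, 3000, TOY_FEATURES)

# A larger deletion that also removes the promoter, the TSS and an enhancer.
large_hits = features_hit_by_deletion("chr1", 1100, 4100, TOY_FEATURES)

print([f.name for f in small_hits])
print([f.name for f in large_hits])
```

In this toy case the smaller deletion touches no annotated element, whereas the larger one removes a putative enhancer in addition to the promoter and TSS, so any phenotype of the larger allele could not be attributed to the transcript alone. For real datasets, an established interval tool would normally be preferable to this linear scan.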
Distinguishing between functions mediated by lncRNA transcripts, by the act of their transcription, or by overlapping DNA regulatory elements may require analyses of several independent lncRNA alleles. To this end, cell lines are better suited than model organisms for detailed structure-function analyses of lncRNA transcripts. Thus, using complementary cellular and animal systems can be particularly powerful for addressing lncRNA functions and defining their functional elements. In addition, rescue experiments enable the association of a phenotype with the lncRNA transcript. The ease of CRISPR-Cas9-based genetic manipulations opens countless possibilities for designing rescue alleles in cell lines and model animals, including inter-species comparisons in which orthologous lncRNA transgenes are expressed to pinpoint their conserved functional elements.