Sequence-specific DNA labelling for fluorescence microscopy

The preservation of nuclear structure during microscopy imaging is a top priority for understanding chromatin organization, genome dynamics, and gene expression regulation. In this review, we summarize the sequence-specific DNA labelling methods that can be used for imaging in fixed and/or living cells without harsh treatment and DNA denaturation: (i) hairpin polyamides, (ii) triplex-forming oligonucleotides, (iii) dCas9 proteins, (iv) transcription activator-like effectors (TALEs) and (v) DNA methyltransferases (MTases). All these techniques are capable of identifying repetitive DNA loci, and robust probes are available for telomeres and centromeres, but visualizing single-copy sequences is still challenging. We envision a gradual replacement of the historically important fluorescence in situ hybridization (FISH) by less invasive and non-destructive methods compatible with live cell imaging. Combined with super-resolution fluorescence microscopy, these methods will open the possibility to look into the unperturbed structure and dynamics of chromatin in living cells, tissues and whole organisms.


Introduction
More than 300 years ago, the nucleus was first described by Antonie van Leeuwenhoek, who constructed light microscopes that could resolve cells and subcellular structures (Lane, 2015). The intricacy of nuclear organization can be appreciated when trying to imagine the almost 2 m long DNA complexed with multiple protein and RNA species, tightly folded into a micrometre-sized nucleus. This highly sophisticated nucleoprotein complex is called chromatin. Historically, electron microscopy (EM) was instrumental in revealing several levels of chromatin organization, such as the famous "beads on a string" arrangement of nucleosomes. Multiple levels of chromatin organization are beyond the resolution of diffraction-limited light microscopy (~250 nm) (Boopathi et al., 2020). However, EM is of limited utility for characterizing dynamic structures (Maeshima et al., 2019), and most insights into chromatin dynamics are obtained by relatively lower resolution fluorescence microscopy (Heun et al., 2001). Staining DNA with intercalating dyes or labelling high abundance chromatin proteins such as histone H2B visualizes global chromatin structure; nevertheless, a true understanding of the role of chromatin organization in the life of the cell requires visualizing specific genome loci. Numerous methods are available to obtain labelled synthetic DNA, but sequence-specific labelling of native DNA still represents a challenge. The early studies on chromatin dynamics based on fluorescence microscopy used green fluorescent protein (GFP)-tagged lactose (Lac) repressor (Heun et al., 2001). To achieve high contrast, hundreds of copies of Lac operator sequences had to be integrated into the genome locus under investigation, thus potentially interfering with its natural dynamics. Later, a less destructive DNA-labelling method was developed, which utilizes the clustered regularly interspaced short palindromic repeats nuclease Cas9 (CRISPR-Cas9) to insert a short (<1 kb) non-repetitive DNA sequence (ANCH). This sequence is specifically detected by the bacterial partition protein ParB fused with GFP (Germier et al., 2017). A similar method, SHACKTeR (Short Homology and CRISPR/Cas9-mediated Knock-in of a TetO Repeat), allows imaging of specific chromosomal regions in living cells without the need for a pre-existing repetitive sequence (Tasan et al., 2018). However, these approaches rely on genetic engineering of the locus to be observed.
Currently, the most common methods for studying chromatin structure are a large family of sequencing-based chromosome conformation capture (3C) techniques and microscopy-based fluorescence in situ hybridization (FISH) variations (Jerkovic and Cavalli, 2021; Kempfer and Pombo, 2020). The resolution of 3C techniques can reach down to several base pairs, but a visual model is difficult to build and requires considerable computing power. Recent developments towards FISH automation and combination with super-resolution microscopy allowed researchers to highlight the details of chromatin structure ranging from whole chromosomes to kilobase-sized interactions between neighbouring cis-regulatory elements (Boettiger and Murphy, 2020). These methods uncovered the relationships between chromatin structure and epigenetics (Boettiger and Murphy, 2020; Wooten et al., 2020; Xu et al., 2018, 2020), the nature of topologically associated domains (TADs) (Barth et al., 2020; Xiang et al., 2018), and the enhancer-promoter interactions underlying transcriptional regulation (Brandao et al., 2021; Brown et al., 2018).
However, 3C and FISH methods require cell fixation and permeabilization, which are highly likely to introduce artefacts. Furthermore, the harsh conditions of DNA denaturation in FISH inevitably destroy local chromatin structure. Such changes might be largely obscured by the light diffraction limit in confocal microscopy, but become visible in super-resolution imaging (Markaki et al., 2012). Consequently, less invasive DNA labelling methods are needed to gain deeper insights into genome dynamics, such as the formation and rearrangement of TADs during the cell cycle, pathogenesis, or external stimulation.
In this review, we focus on the methods for highlighting specific sequences in natural DNA that do not require modification of the target locus and could be employed for super-resolution imaging of fixed/living cells. We discuss triplex forming oligonucleotides (TFOs) and hairpin polyamides as less destructive alternatives to standard FISH methods and programmable DNA binding proteins (dCas9 and transcription activator-like effectors (TALEs)) that offer many opportunities for monitoring dynamics of the specific chromatin loci in living cells. In addition, we outline the approaches combining bacterial DNA methyltransferases and fluorescent cofactor analogues that are used for optical genome mapping.
Fig. 1 (caption fragment). Example of sequence-specific dsDNA recognition and schematic representation of Py-Im polyamides. D. Schematic representations of tandem dimer, trimer, and tetramer Py-Im polyamide probes labelled with TAMRA targeting telomeric repeats and imaging of telomeres. Adapted with permission from (Kawamoto et al., 2016). Copyright 2016 American Chemical Society. E. Schematic structure representation and mitochondrial localization of MITO-PIP-TAMRA (red); nuclei and mitochondria were stained with Hoechst 33342 (blue) and CellLight Mitochondria-GFP BacMam 2.0 technology (green). Scale bars: 10 μm. Adapted with permission from (Hidaka et al., 2017). Copyright 2017 American Chemical Society.

Labelling using synthetic molecules
Small synthetic molecules provide an opportunity to highlight specific DNA sequences in living cells and organisms without genetic modification. The most advanced methods involve hairpin polyamides and triplex forming oligonucleotides. These methods have become even more attractive as chemical synthesis has reached a high level of automation in recent years.

Sequence-specific DNA labelling using pyrrole-imidazole (Py-Im) polyamides
Natural antibiotic and antiviral agents netropsin and distamycin A are small organic compounds that preferentially bind to the minor groove of adenine/thymine (A/T)-rich double-stranded B-form DNA (B-DNA) in a 1:1 or 2:1 ligand:DNA ratio (Kopka et al., 1985; Pelton and Wemmer, 1989). Their molecular structures contain two or three pyrrole (Py) moieties, respectively (Fig. 1A). Hydrogen bonding between the amide (NH) and the N3 atom of adenine or the O2 atom of thymine drives DNA complex formation, while van der Waals interactions serve as supporting forces (Coll et al., 1987; Pelton and Wemmer, 1989). Inspired by these natural molecules, Dervan and co-workers spent a couple of decades grasping the fundamentals of polyamide sequence-specific binding. Introduction of an imidazole (Im) unit instead of Py enabled hydrogen bond formation between the exocyclic guanine amino group and N3 of imidazole, thus permitting recognition of guanine/cytosine (G/C) pairs (Dervan, 2001; Lown et al., 1986). This first generation of linear Py-Im polyamides binds the targeted DNA sequence preferentially in a 2:1 ratio. Development of tert-butyloxycarbonyl (Boc) (Baird and Dervan, 1996) and fluorenylmethyloxycarbonyl (Fmoc) (Wurtz et al., 2001) solid-phase synthesis of a wide range of Py-Im polyamide variants facilitated extensive screening of structural modifications and evaluation of their binding properties to double-stranded (ds) DNA. Dervan and co-workers identified a basic DNA binding hairpin motif consisting of two Py-Im polyamides connected via γ-aminobutyric acid (γ-turn) in head-to-tail fusion (Mrksich et al., 1994) and described the principal DNA pairing rules (Fig. 1B).
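The pairing rules referred to above can be illustrated with a minimal sketch. It assumes the rules as they are commonly stated for Py-Im polyamides (Im/Py reads G·C, Py/Im reads C·G, and Py/Py reads A·T and T·A without distinguishing them); the function names are hypothetical, and the γ-turn and tail positions of a real hairpin are ignored.

```python
# Ring pair (ring facing the target strand, ring facing the opposite strand)
# placed against each base of the target strand. Assumed pairing rules:
RING_PAIR_FOR_BASE = {
    "G": ("Im", "Py"),  # Im/Py reads G·C
    "C": ("Py", "Im"),  # Py/Im reads C·G
    "A": ("Py", "Py"),  # Py/Py reads A·T ...
    "T": ("Py", "Py"),  # ... and T·A alike (degenerate)
}


def ring_pairs(target):
    """Return the ring pair chosen for each base of a 5'->3' target strand."""
    return [RING_PAIR_FOR_BASE[base] for base in target.upper()]


def matching_sequences(target):
    """Count the sequences read identically by the same ring pairs.

    Every A or T position is two-fold degenerate because Py/Py does not
    distinguish A·T from T·A.
    """
    return 2 ** sum(base in "AT" for base in target.upper())


print(ring_pairs("GC"))              # [('Im', 'Py'), ('Py', 'Im')]
print(matching_sequences("AGGGTT"))  # 8
```

The degeneracy count gives a rough feeling for why short polyamides can bind closely related sequences: every A/T position doubles the number of sites that the same design would read.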
Hairpin polyamides bind DNA in 1:1 mode and are the most commonly used motif due to their versatility and high affinity to DNA (Fig. 1B). Examples of Py-Im polyamide applications involve up- and downregulation of gene expression (Gottesfeld et al., 1997; Mapp et al., 2000; Oyoshi et al., 2003; Pandian et al., 2012), transcriptional activation (Ren et al., 2017; Richier and Salecker, 2015), bio-sensing (Vaijayanthi et al., 2012), cancer therapeutics (Hargrove et al., 2015; Szablowski et al., 2016) and, more recently, use as probes in bio-imaging (Kawamoto et al., 2016; Tsubono et al., 2020). The diversity of applied structural motifs extends far beyond the principal pairing rules outlined here and was summarized in detail in a recent review. We will focus on recent examples of fluorescent polyamide applications in fluorescence microscopy.
Py-Im polyamides are by nature non-fluorescent compounds, and in order to be used as imaging agents a fluorophore has to be attached at the C/N (carboxy-/amino-) terminus or at the γ-turn. Fluorescent polyamides were applied to detect specific DNA sequences and to investigate mismatched sequences using their ability to recognize mismatched base pairs. The team of U.K. Laemmli demonstrated the first applications of fluorescent polyamides in fluorescence microscopy. In the early 2000s, they achieved selective staining of AT-rich satellites, IIassociated regions and 5′-(GAGAA)n-3′ repeats of satellite V in Drosophila melanogaster using fluorescein-labelled oligopyrrole compounds (Janssen et al., 2000). A year later, the same group reported specific visualization of insect 5′-(TTAGG)n-3′ and vertebrate 5′-(TTAGGG)n-3′ telomeric repeats in fixed cells and on chromosomal spreads with Texas Red labelled tandem hairpin Py-Im polyamides (Maeshima et al., 2001). These pioneering papers outlined the prospects of imaging DNA tandem repeat sequences, as the high local concentration of binding sites attracts many reporter molecules and provides a strong signal at a specific genomic locus.
Sugiyama et al. used fluorescent polyamides to label telomeres. They combined solid phase and solution based synthetic approaches to efficiently obtain tandem hairpin dimer, trimer and tetramer polyamides recognizing 12 base pairs (bp) (Hirata et al., 2014; Kawamoto et al., 2013), 18 bp (Kawamoto et al., 2015), and a record-breaking 24 bp (Kawamoto et al., 2016; Tsubono et al., 2020). It was found that tetramethylrhodamine (TAMRA) labelled polyamides were superior to cyanine 3 (Cy3) or Texas Red conjugates in terms of contrast. The specificity of staining was significantly improved after optimization of the hinge length between the hairpins (Hirata et al., 2014) and with an increased number of targeted base pairs (Fig. 1C) (Kawamoto et al., 2013, 2015, 2016). A tandem hairpin polyamide labelled with the TAMRA fluorophore was successfully applied to telomere visualization in mouse and human tissue sections (Kawamoto et al., 2016). Recently, an interesting application of TAMRA-linked polypyrrole for genomic DNA optical mapping was demonstrated (Lee et al., 2018). A fluorescence increase upon binding the 5′-AGGGTT-3′ human telomere repeat sequence was demonstrated by substituting TAMRA with the far-red fluorogenic silicon-rhodamine (SiR) fluorophore (Tsubono et al., 2020). The authors visualized telomeres in living human osteosarcoma cells (U-2 OS) using a peptide-based delivery reagent, and hypothesized that the SiR-TTet59B probe can stain telomeres in a length-dependent manner and could be applied to the measurement of telomere sizes (Tsubono et al., 2020). In addition, the SiR-TTet59B probe should be compatible with stimulated emission depletion (STED) microscopy, as the fluorescent marker, silicon-rhodamine, is ideally suited for this purpose, but the authors did not demonstrate that.
Boutorine et al. showed that smaller fluorescein-labelled hairpin polyamides are cell permeable; however, their localization was highly dependent on the polyamide labelling approach. For example, a hairpin labelled with fluorescein isothiocyanate was suitable for the visualization of pericentromere regions (Nozeret et al., 2015). In contrast, labelling with N-hydroxysuccinimide ester or an azido group based "click" reaction yielded fluorescent polyamides that mainly localized in the cytosol (Nozeret et al., 2018). Localization of some polyamides in the cytosol, instead of the nucleus, was also reported by Dervan and co-workers (Nickols et al., 2007), who used the fluorescein isothiocyanate (FITC) conjugation method. Introduction of cyclohexylalanines (Cha) and arginines (Arg) induced mitochondrial localization and allowed selective mitochondrial DNA imaging (Fig. 1D) (Hidaka et al., 2017). Recently, it was demonstrated that fluorescent Py-Im polyamides can stain DNA in a living mouse (Fig. 1E) (Inoue et al., 2018).
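The relation between the number of tandem hairpins and the recognized sequence length can be made explicit with a small calculation. This is only a sketch: the value of 6 bp per hairpin is inferred from the 12, 18 and 24 bp figures quoted above for dimers, trimers and tetramers, and the site count is an idealized upper bound that assumes a perfectly repetitive telomere and full, non-overlapping occupancy. The function names are hypothetical.

```python
def recognition_length(n_hairpins, bp_per_hairpin=6):
    """Base pairs read by a tandem polyamide with n_hairpins hairpin units.

    bp_per_hairpin=6 is inferred from the dimer/trimer/tetramer values
    (12, 18 and 24 bp) quoted in the text.
    """
    return n_hairpins * bp_per_hairpin


def max_probe_sites(telomere_bp, n_hairpins):
    """Upper bound on non-overlapping probe sites along a telomere."""
    return telomere_bp // recognition_length(n_hairpins)


for n in (2, 3, 4):
    print(n, recognition_length(n), max_probe_sites(10_000, n))
```

For a hypothetical 10 kb telomere the tetramer would find at most 416 sites, which illustrates why repetitive loci give a strong signal and why staining intensity could scale with telomere length.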
Py-Im polyamides can be easily synthesized by machine-assisted solid-phase peptide synthesis and conjugated to a fluorescent dye to yield a fluorescent probe targeting a desired DNA sequence. Machine-assisted synthesis is advantageous over solution phase synthesis, as it allows a larger variety of structurally diverse polyamides to be obtained in a short time. Their binding to DNA relies on the polyamide fitting into the minor groove of dsDNA and establishing a network of hydrogen bonds. It has been demonstrated that polyamides can recognize specific DNA sequences ranging from 4 to 24 base pairs. However, achieving high specificity is still a major problem, as smaller polyamides can bind off-target to closely similar sequences or even to other biomolecules (Lin and Nagase, 2020). In addition, imaging of less abundant genomic DNA sequences has not yet been demonstrated due to off-targeting and low signal-to-background ratio. Fluorescence signal amplification by attachment of two or more fluorophores to a single polyamide usually results in a loss of binding affinity due to increased steric hindrance (Lee et al., 2018). In general, specific DNA sequence labelling with polyamides seems to be a promising technology, but currently it is still at an early "proof of principle" stage.

Triplex forming oligonucleotides (TFOs)
In the native state, double-stranded genomic DNA is hardly accessible for hybridization with complementary oligonucleotide probes. A simple solution is denaturation at high temperatures, as in FISH applications. This technique has been routinely used to label specific DNA sequences in interphase and metaphase nuclei for research and medical diagnostics. In short, the FISH protocol includes chemical fixation of cells followed by denaturation at 70-95 °C in the presence of chaotropic agents such as formamide. Next, labelled oligonucleotides are hybridized to the denatured single-stranded DNA. The denaturation of target DNA is one of the most damaging steps in the FISH technique, as was clearly visualized by super-resolution scanning near-field optical microscopy (SNOM) (Krufczik et al., 2017; Winkler et al., 2003). Therefore, less destructive FISH protocols are sought. For example, modified FISH methods, like fast-FISH and low temperature FISH, employ hybridization without the use of formamide or thermal denaturation, which results in a better-preserved morphology of the chromosomes observed by SNOM at sub-hundred nanometre resolution (Krufczik et al., 2017; Winkler et al., 2003). Recently, exonuclease digestion has been used in a method called Resolution After Single-strand Exonuclease Resection (RASER)-FISH, which avoids heat denaturation (Brown et al., 2022). An excellent alternative to FISH are methods based on the formation of a triple helix, which preserve classical Watson-Crick base pairing.
The ability of nucleic acids to form triple helices was discovered in 1957 by Felsenfeld and Rich (Felsenfeld et al., 1957). The biological significance of triplexes was recognized in 1968, when Morgan and Wells showed that a three-stranded complex between a dsDNA and a single-stranded RNA (ssRNA) inhibited transcription by E. coli RNA polymerase (Morgan and Wells, 1968). Later, a number of studies identified several triplex binding proteins and indirectly proved the existence of these structures in multiple species: bacterial Tn7 protein, yeast CDP1 protein, Drosophila GAGA transcription factor, murine HMG protein, and human Orc4 and XPA-RPA proteins (Jimenez-Garcia et al., 1998; Kusic et al., 2010; Musso et al., 2000; Rahman et al., 2007; Suda et al., 1996; Thoma et al., 2005).
The third strand binds in the major groove of oligopurine or oligopyrimidine tracts of double-stranded DNA (dsDNA) (Faria and Giovannangeli, 2001). Early studies utilized TFOs composed of ribose or deoxyribose nucleotides, but the lower stability of unmodified ribonucleic acid (RNA) led to the introduction of probes containing 2′-O-methyl-ribose and other modifications (Asensio et al., 1999; Cuenoud et al., 1998; Maciaszek et al., 2015; Rahman et al., 2007; Sasaki et al., 2004). In the DNA double helix, the two strands are held together by Watson-Crick base pairs (Adenine (A)-Thymine (T) and Guanine (G)-Cytosine (C)). The third strand is added by forming Hoogsteen or reverse Hoogsteen hydrogen bonds with free acceptor or donor groups in the dsDNA major groove (Fig. 2A and B). Under normal conditions, only purine bases on the target DNA can establish two extra hydrogen bonds with the TFO base without disruption of the Watson-Crick hydrogen bonds (Fig. 2C) (Frank-Kamenetskii and Mirkin, 1995). Typically, TFOs recognize 12-30 bp oligopurine or oligopyrimidine tracts, where dsDNA tends to adopt an A-form conformation in the triplex region and the junction between the duplex and triplex can be either A-form or B-form (Vasquez and Glazer, 2002). Cytosine (C) requires protonation, and the C:G×CH+ triplex can form at acidic pH (Boutorine et al., 2013; Lee et al., 1979; Vasquez and Glazer, 2002). In contrast, TFOs containing the purine nucleotides (A and G) and the pyrimidine thymine (T) form pH-independent triplexes (Cooney et al., 1988; Faria and Giovannangeli, 2001). Chemical modification of the TFO backbone can enhance triplex stability. These modifications include N,N-diethylenediamine phosphoramidate (DEED), N3′→P5′ phosphoramidate (NP), phosphorothioate (PS) or peptide nucleic acid (PNA) backbones (Faria and Giovannangeli, 2001).
G-rich TFOs containing AG and TG nucleotides and T-rich TFOs containing TC and TG nucleotides are more likely to manifest in a cellular environment. Along with this, some G-rich oligonucleotides are prone to form four-stranded G-quartet structures and intermolecular homoduplexes (GA repeats) (Noonberg et al., 1995; Olivas and Maher, 1995). These additional structures compete with triplex formation between the oligonucleotides and dsDNA. This can be overcome by replacing guanines with synthetic base analogues such as 6-thioguanine or 7-deazaguanine. In addition, 7-deazaxanthine (c7X) is preferred in place of A or T in anti-parallel triplexes, where it forms the c7X·A:T triplet (Faruqi et al., 1997).
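Because TFOs require oligopurine or oligopyrimidine tracts, a first step in probe design is to scan the target region for such runs. The sketch below assumes a plain DNA string and uses the lower bound of 12 bp quoted above as the default minimum; the function name and the test sequence are hypothetical, and a real design pipeline (such as the one used for COMBO-FISH) would additionally check genome-wide uniqueness.

```python
import re


def find_tfo_tracts(seq, min_len=12):
    """Return (start, end, kind) for maximal homopurine or homopyrimidine runs.

    Coordinates are 0-based and end-exclusive. Only the given strand is
    scanned; a homopyrimidine run here is a homopurine run on the
    complementary strand.
    """
    pattern = re.compile(r"[AG]{%d,}|[CT]{%d,}" % (min_len, min_len))
    tracts = []
    for match in pattern.finditer(seq.upper()):
        kind = "purine" if match.group()[0] in "AG" else "pyrimidine"
        tracts.append((match.start(), match.end(), kind))
    return tracts


# Hypothetical sequence with one 14-nt purine run flanked by mixed sequence.
example = "ACGT" + "AGGAGAAGGAGGAA" + "CATG"
print(find_tfo_tracts(example))  # [(4, 18, 'purine')]
```

Runs longer than 30 bp are reported whole; a probe of the desired length can then be chosen within the run.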
PNAs are uncharged synthetic oligonucleotide analogues in which the nucleobases are attached to a pseudopeptide backbone and hybridization follows the traditional base pairing rules: A-T and G-C (Fig. 2D). The uncharged PNAs form more stable triplexes than nucleic acids with a phosphodiester backbone at physiological salt concentrations. However, triplex formation competes with strand invasion, which results in opening of the duplex (Fig. 2E) (Hansen et al., 2009). A further increase in concentration results in binding of two PNA oligomers to the DNA target strand, leaving the complementary strand displaced (Faria and Giovannangeli, 2001; Vasquez and Glazer, 2002). This gives rise to PNA/DNA/PNA triplexes (2:1 PNA to DNA ratio): one PNA strand binds to the DNA strand via regular Watson-Crick bonds, whereas the other employs Hoogsteen hydrogen bonds (Fig. 2E) (Bentin and Nielsen, 1999). In homopyrimidine PNA oligomers, substituting pseudoisocytosine for cytosine in combination with (oligo)lysine or 9-aminoacridine conjugation results in PNA-dsDNA triplex formation (Bentin and Nielsen, 1999).
The shortcomings of conventional FISH are addressed in COMBinatorial Oligo-nucleotide FISH (COMBO-FISH), which omits the need for thermal denaturation of target DNA and the use of any chaotropic agents. The improved method uses a computationally optimized approach to design a set of fluorescently labelled triplex forming oligonucleotide probes (15-25mers) that uniquely co-localize at the target site, or single oligo stretches that are repetitive but bind exclusively to the target (Schmitt et al., 2012). Schwarz-Finsterle and co-workers found better-preserved chromatin structures when comparing COMBO-FISH to the standard FISH method by quantitative microscopic image analysis of Abelson murine leukemia (ABL) and breakpoint cluster region (BCR) positions in nuclei of lymphocytes and chronic myeloid leukemia (CML) blood cells (Schwarz-Finsterle et al., 2007). In addition, COMBO-FISH can utilise PNA oligonucleotides that form a stable triplex and are metabolically more stable in living cells. These properties of PNA oligonucleotides enable COMBO-FISH to be applied for in vivo labelling. Jin-Ho Lee and co-workers applied COMBO-FISH in combination with immunostaining and successfully demonstrated the possibility to measure nuclear architecture by 3D confocal microscopy and chromatin nano-architecture by single molecule localization microscopy (SMLM) (Lee et al., 2019). Here they studied the spatial distribution of genomic Alu elements around chromosome 9 centromeres (Fig. 2F). Furthermore, the low temperature conditions paved the way for concurrent COMBO-FISH and H3K9me3 antibody immunostaining (Fig. 2G).
One of the early applications of fluorescent PNA in microscopy was described by Molenaar et al. (2003). They used Cy3-labelled PNA to probe telomere structure. PNAs do not penetrate living cells and require laborious microinjection, permeabilization by pore-forming toxins (such as streptolysin-O) or other invasive cell delivery methods (Flierl et al., 2003). In another application, PNA oligonucleotide probes were used for the COMBO-FISH technique in combination with Spatially Modulated Illumination (SMI) microscopy to measure the size of the ABL gene region in 3D conserved blood cell nuclei (Muller et al., 2010). To demonstrate binding specificity and efficiency, two-colour experiments with OregonGreen 488 and TexasRed were performed. The probes were specific for centromere 9. They were separately hybridized on fixed peripheral blood lymphocyte nuclei and metaphase spreads. This was followed by another experiment to verify whether the labelled chromosome was actually chromosome 9. Here, the lymphocyte samples were labelled with a PNA COMBO-FISH probe carrying OregonGreen 488 and a 9q subtelomere (9qtel) DNA standard FISH probe labelled with rhodamine dye. This showed two chromosomes with red and green spots, suggesting that the subtelomere ends and the centromere region of chromosome 9 were marked by the DNA and PNA probe, respectively (Fig. 2H).
In summary, the main advantage of DNA labelling by triplex formation is that it does not require opening of the target DNA duplex and thereby causes less damage to chromatin structure than conventional FISH. However, the background fluorescence of non-bound oligonucleotides is high and intense washing is required before imaging. A possible solution to this problem is the generation of fluorogenic TFO probes, which only fluoresce upon DNA binding. For example, this can be achieved by conjugating intercalating agents such as acridine, benzopyridoindole and monomethine cyanines to oligonucleotides (Silver et al., 1997; Sun et al., 1989; Walsh et al., 2018). These modifications were shown to increase the stability of the triple helical complexes significantly, which is an important factor for imaging experiments (Silver et al., 1997). Another important aspect of triplex-forming PNAs is that they show higher affinity for dsRNA, and such off-targeting could lead to false signals or increased background (Sato et al., 2016). Furthermore, PNA-dsDNA triplex formation competes with strand invasion, which partially destroys the local structure of dsDNA. This issue could be addressed by incorporating nucleotide analogues and/or varying the PNA probe length (Hansen et al., 2009). The main drawback of TFOs and PNAs is their poor cell permeability, which limits their application in living samples.

Labelling based on protein-DNA interactions
Exploiting proteins that interact with DNA for labelling purposes is a relatively straightforward approach, and it is particularly attractive for the observation of dynamic processes in living cells. The use of dCas9 and TALEs offers higher specificity and flexibility compared to small molecule-based approaches. Less specific DNA methyltransferases provide an opportunity to covalently link a fluorescent label to a nucleobase in the double helix with little perturbation. We will discuss these methods in the following sections.

Chromatin imaging using catalytically deficient dCas9 nuclease
The discovery of the CRISPR antiviral bacterial defence not only allowed scientists to edit target genes in genomes, but also opened new possibilities for endogenous genome imaging (Doudna and Charpentier, 2014; Ishino et al., 1987; Makarova et al., 2006). The effector protein of the CRISPR system, Cas9, is programmable with a single guide RNA (sgRNA) that recognizes the target DNA sequence by formation of a Cas9-sgRNA-dsDNA ternary complex (Gasiūnas et al., 2012; Jinek et al., 2012). Catalytic activity is unnecessary for DNA binding; thus, the endonuclease-dead Cas9 (dCas9) allows generation of RNA-programmable complexes without the ability to cleave the DNA (Chen et al., 2013; Qi et al., 2013). Both the protein and the RNA component of this complex can carry a reporter tag and can be used for chromatin labelling. High contrast is indispensable for locating specific loci in the crowded nuclear environment; therefore, we focus our discussion on the developments aimed at improving the signal-to-background ratio (S/B).
The first application of dCas9-sgRNA for imaging specific DNA sequences in living cells was demonstrated by Chen et al., who co-expressed enhanced green fluorescent protein (EGFP)-dCas9 and the respective sgRNAs to label telomeres and tandem repeats in the MUC4 gene (Fig. 3A) (Chen et al., 2013). Hundreds of tandem repeats in these targets attracted multiple EGFP-dCas9 molecules and ensured efficient labelling and good contrast. To detect non-repetitive sequences, the authors co-expressed EGFP-dCas9 with multiple sgRNAs tiling a non-repetitive 5 kbp region of the MUC4 gene. They showed that at least 36 distinct sgRNAs were required for sufficient contrast. Later, a multicolour imaging scheme was devised by fusing dCas9 orthologues from Streptococcus pyogenes (Sp), Neisseria meningitidis (Nm) and Streptococcus thermophilus (St1) to 3 copies of GFP, red fluorescent protein (RFP) and blue fluorescent protein (BFP) (Ma et al., 2015). Pairs of differently coloured 3 × fluorescent protein (FP)-dCas9/sgRNAs were used to determine the distance between loci on different chromosomes. Early on, it was noticed that sgRNA is a limiting factor in DNA labelling with CRISPR (Chen et al., 2013). Therefore, S/B could be improved not only by adding multiple copies of FPs, but also by stabilizing the sgRNA and enhancing its interaction with dCas9 (Chen et al., 2013; Ma et al., 2015).
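Tiling a non-repetitive region with many sgRNAs starts with listing candidate target sites. The following sketch is not taken from the cited studies: it assumes the well-established requirement of S. pyogenes Cas9 for a 5′-NGG-3′ protospacer adjacent motif (PAM) directly downstream of a 20-nt protospacer, scans only the given strand, and performs no off-target filtering. The function name and test sequence are hypothetical.

```python
import re


def find_spcas9_sites(seq, protospacer_len=20):
    """List (start, protospacer, pam) for N20-NGG sites on the given strand.

    A zero-width lookahead is used so that overlapping sites are reported.
    The reverse strand would be covered by scanning the reverse complement.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2))
            for m in pattern.finditer(seq.upper())]


# Hypothetical 24-nt sequence containing a single valid site.
print(find_spcas9_sites("C" + "A" * 20 + "TGG"))
# [(1, 'AAAAAAAAAAAAAAAAAAAA', 'TGG')]
```

With 36 guides spread over 5 kbp, the tiling described above corresponds to roughly one sgRNA every 140 bp, so candidate sites from such a scan would have to be selected at about that density.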
Introduction of a small multivalent epitope called SunTag (Super-Nova tag) at the C-terminus of dCas9 reduces the size of the fusion protein and improves S/B. The tag is a 24-mer of the GCN4 peptide recognized by a highly specific single-chain variable fragment antibody fused to GFP (scFv-GCN4-GFP), resulting in enormous fluorescence signal amplification at the target sequence (Tanenbaum et al., 2014). Similarly, the dCas9-SunTag system fused with superfolder GFP (sfGFP), mNeonGreen and 3 × mNeonGreen was successfully used for imaging telomeres in HEK293 cells (Fig. 3B).
An alternative to dCas9 modification with FPs is sgRNA modification with the phage-derived RNA aptamers MS2 (Emesvirus zinderi) or PP7 (Pseudomonas phage). The fluorescent label is brought in by the respective binding proteins fused to FPs: MS2 coat protein (MCP) and PP7 coat protein (PCP). The MS2 and PP7 systems are orthogonal and allow multiplexing. A 2-fold enhancement of S/B was achieved by modifying the sgRNA tetraloop and stem loop 2 with MS2 or PP7 aptamers, compared to sgRNA extension with the corresponding aptamers (10 versus 5) or two unmodified sgRNAs (80 versus 40) (Fig. 3C) (Shao et al., 2016; Wang et al., 2016). S/B can be enhanced further by modifying the sgRNA with RNA aptamers (MS2/PP7) that bind FP-tagged tandem dimer (td) MS2 coat (tdMCP) and PP7 coat (tdPCP) proteins (2016). The much faster exchange rates of tdMCP and tdPCP compared to dCas9 compensate for photobleaching and make this approach suitable for long-term tracking of telomeres and centromeres (Shao et al., 2016). To improve the brightness of labelling with MS2, thermostable octets were rationally designed (Fig. 3D) (Ma et al., 2018). Careful optimization of the sgRNA structure and insertion of 16 × MS2 allowed imaging of a non-repetitive genome locus with as few as 4 different sgRNAs. This approach allowed tracking of targeted loci over the course of the cell cycle (Qin et al., 2017).
A recently described CRISPR-mediated fluorescence in situ hybridization amplifier (CRISPR FISHer) system employs a very interesting principle of signal amplification at the targeted locus (Lyu et al., 2022). It consists of untagged dCas9, sgRNA labelled with at least 2 copies of the PP7 aptamer, and the T4 fibritin trimeric motif foldon fused to GFP-PCP. Foldon-GFP-PCP forms stable trimers. When recruited to dCas9-sgRNA-2 × PP7 bound to the target DNA, foldon-GFP-PCP seeds rapid aggregation of multiple sgRNA-2 × PP7 and foldon-GFP-PCP molecules, resulting in exponential signal amplification, which is sufficient for detecting non-repetitive sequences with a single sgRNA.
The number of colours was increased by introducing a third aptamer-binding protein pair, boxB and affinity-enhanced λN22. Thereby, the CRISPRainbow system was created, which can detect 6 loci simultaneously. It consists of sgRNA modified by a pair of hairpins from the MS2/PP7/boxB set and the respective binding proteins fused to FPs of different colours (RFP, GFP or BFP). In such a system, each of the six targeted loci recruits a different combination of FPs, creating red, green, blue, magenta, yellow, and cyan spots (Ma et al., 2016).
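The combinatorial logic behind the six colours can be sketched as follows. The assignment of a particular aptamer to a particular fluorescent protein colour is a hypothetical choice made for illustration, as are all names in the code; only the scheme of placing a pair of hairpins from the MS2/PP7/boxB set on each sgRNA is taken from the description above.

```python
from itertools import combinations_with_replacement

# Hypothetical assignment of each aptamer's binding protein to an FP colour.
APTAMER_COLOUR = {"MS2": "green", "PP7": "red", "boxB": "blue"}

# Additive mixing of two primary colours at one locus.
MIXED_COLOUR = {
    frozenset(["red", "green"]): "yellow",
    frozenset(["red", "blue"]): "magenta",
    frozenset(["green", "blue"]): "cyan",
}


def spot_colour(hairpin_pair):
    """Apparent colour of a locus whose sgRNA carries the given hairpin pair."""
    colours = frozenset(APTAMER_COLOUR[h] for h in hairpin_pair)
    if len(colours) == 1:          # both hairpins recruit the same FP
        return next(iter(colours))
    return MIXED_COLOUR[colours]   # two different FPs are co-recruited


palette = {pair: spot_colour(pair)
           for pair in combinations_with_replacement(APTAMER_COLOUR, 2)}
for pair, colour in palette.items():
    print(pair, colour)
```

Three aptamers taken two at a time with repetition give exactly six distinguishable hairpin pairs, which matches the six spot colours listed above.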
An important aspect of improving S/B is suppression of the background originating from unbound fluorescent component(s). One such example is bimolecular fluorescence complementation (BiFC) based on split Venus FP assembly on the multivalent SunTag peptide. The system consists of three components: a single-chain antibody fused to the C-terminal part of Venus (scFv-Venus C), an MCP fusion to the complementary N-terminal part of Venus (MCP-Venus N) and 24 copies of the GCN4 peptide fused to dCas9. Fluorescence is switched ON only after correct binding of all components, resulting in up to 3-fold S/B improvement compared to the SunTag system (6 versus 2) (Hong et al., 2018). Another example is the self-complementing tripartite split sfGFP system. Here, sfGFP regains fluorescence only after GFP 1-9 (the first component) is complemented with dCas9-SunTag, which is fused to scFv-GFP 10 (the second component), and sgRNA-MS2, which is fused to MCP-mCherry-GFP 11 (the third component) (Chaudhary et al., 2020).
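The S/B values quoted in this section can be compared with a short calculation. The definition used here (mean intensity of the spot divided by mean intensity of the surrounding nucleoplasm) is an assumption, since the exact definition varies between studies; the pixel values and function names are hypothetical.

```python
from statistics import mean


def signal_to_background(spot_pixels, background_pixels):
    """Mean spot intensity divided by mean background intensity (assumed definition)."""
    return mean(spot_pixels) / mean(background_pixels)


def fold_improvement(sb_new, sb_reference):
    """How many times larger one S/B value is than a reference S/B value."""
    return sb_new / sb_reference


# Hypothetical pixel intensities for one labelled locus.
print(signal_to_background([600, 620, 580], [100, 90, 110]))  # 6.0

# S/B values quoted in the text.
print(fold_improvement(6, 2))    # BiFC versus SunTag: 3.0
print(fold_improvement(10, 5))   # internal aptamers versus sgRNA extension: 2.0
```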
As an alternative to aptamer/binding-protein pairs, the RNA-binding domains of Pumilio/fem-3 binding factor (PUF) were employed in the Casilio system (Cheng et al., 2016). These protein domains can be programmed to bind specific linear 8-mer RNA sequences, called PUF binding sites (PBS). Multiple copies of PBS can be attached to sgRNAs without negative effects on their expression, allowing for high signal amplification. The programmable specificity of PUF domains offers ample opportunities for multiplexing. In the all-in-one Casilio system, all the components (dCas9, PUF-mClover and sgRNA-25 × PBS) are encoded in just one plasmid for telomere and major satellite imaging (Zhang and Song, 2017). A recent study demonstrated the capability of the Casilio system to simultaneously image up to three non-repetitive loci with a single sgRNA for each (Clow et al., 2022). In this study, sgRNAs were tagged with 15-25 copies of PBS. To increase the number of colours, PBS and aptamers can be used together (Maass et al., 2018).
Most of the discussed dCas9-based methods rely on lentivirus-based delivery systems: a) delivery by lentiviral plasmids, which allows simultaneous transfection of all components; b) delivery by lentivirus infection, which requires lentivirus production for every plasmid of the imaging system and sequential infection-selection cycles. To simplify these procedures, dCas9-sgRNA complexes can be preassembled in vitro and used as imaging reagents in vitro or in vivo. For example, in vitro assembled dCas9-mNeonGreen/sgRNA targeted to the RHM2 and RHM3 genes and telomere repeats was used to monitor mitotic chromatid formation of sperm nuclei in metaphase-arrested Xenopus egg extract. High S/B was ensured by targeting 100 separate loci within a 3.4 megabase region on Xenopus chromosome 4 with multiple sgRNAs. In addition, chromatin labelling in fixed cells was performed by the CasFISH method, applying dCas9-C-Halo-tag labelled with fluorescent JF646/JF549 substrates and sgRNA-DY547/Cy5 targeting major satellites, minor satellites, telomeres or a coding gene in fixed mouse embryonic fibroblasts (Deng et al., 2015). Another method for labelling with the dCas9 system is RNA-guided endonuclease-in situ labelling (RGEN-ISL), which employs S. pyogenes dCas9, crRNA and a dye (5′-Alexa488 or 5′-ATTO 550) on tracrRNA to label telomeres and centromeres in fixed animal and plant cells (Nemeckova et al., 2019). Mild conditions of sample preparation allow easy combination of these methods with immunostaining. Preassembled dCas9-sgRNA complexes can be delivered into living cells by electroporation or microinjection in a method called CRISPR LiveFISH (Wang et al., 2019). Either sgRNAs or dCas9 can be labelled with fluorophores (Geng and Pertsinidis, 2021; Wang et al., 2019). Interestingly, free sgRNAs are quickly degraded inside the cell, while those in complex with dCas9 and complementary DNA are protected, which ensures good contrast when using labelled sgRNAs.
While CRISPR LiveFISH greatly simplifies delivery of multiple sgRNAs, electroporation is an invasive procedure prone to disturbing cell physiology and generating artefacts.
In summary, CRISPR-based chromatin labelling is a versatile method that works in living and fixed cells. Importantly, elaborate signal amplification and background suppression methods allow labelling of low-repeat-containing loci and truly unique sequences. On the downside, it is a multicomponent system, and a lot of effort must be put into tool development before the real biological question can be addressed. This is particularly the case when labelling non-repetitive sequences, where efficient delivery and subsequent expression of dCas9, multiple sgRNAs and fluorescent reporters within a single cell must be ensured. The fast pace at which this technology is developing suggests that more user-friendly reagents and protocols can be expected in the near future.

Applications of TALEs for fluorescence microscopy
TALEs are transcriptional activators encoded by plant-pathogenic bacteria of the genus Xanthomonas (Bogdanove et al., 2010). During infection, they are injected into the cytoplasm through the type III secretory system and are then imported into the nucleus to upregulate host susceptibility genes (Bogdanove et al., 2010). TALEs recognize specific sequences via their central domain, which is composed of 11-20 nearly perfect repeats, each 33-35 amino acids long. Within the repeats, the most variable positions are 12 and 13, which are called the repeat-variable diresidue (RVD). Each repeat folds into a helix-loop-helix motif, and the whole domain forms a right-handed superhelix that wraps around DNA following the sense strand (Deng et al., 2012) (Fig. 4A). The RVD is located inside the loop facing DNA; residue 13 makes specific contacts with a base in the major groove, while residue 12 stabilizes the local conformation. RVDs bind to DNA bases in a one-to-one ratio and encode the specificity of the TALE (Fig. 4B) (Boch et al., 2009; Bogdanove et al., 2010). The most common RVDs are HD (histidine-aspartate), NG (asparagine-glycine), NI (asparagine-isoleucine), and NN (asparagine-asparagine) (Boch et al., 2009; Moscou and Bogdanove, 2009). They recognize C, T, A, and G bases in DNA, respectively. Importantly, the specificity of the RVD-base interaction is largely context-independent, which allows programming TALEs to recognize almost any sequence of interest (Boch et al., 2009). Multiple bioinformatics tools are available for TALE specificity design and off-targeting prediction. The programmable code has led to the adoption of TALEs for a wide range of genome manipulations: control of gene expression, genome editing, depositing epigenetic code and DNA imaging (Becker and Boch, 2021).
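The one-to-one RVD code described above lends itself to a simple lookup. The following minimal sketch translates a target DNA sequence into the corresponding series of RVDs and back, using only the four common RVDs named in the text. The function names are hypothetical, and the sketch deliberately ignores practical design constraints such as repeat-number limits or off-target prediction, for which dedicated tools exist.

```python
# One-to-one RVD code from the text: HD -> C, NG -> T, NI -> A, NN -> G.
RVD_TO_BASE = {"HD": "C", "NG": "T", "NI": "A", "NN": "G"}
BASE_TO_RVD = {base: rvd for rvd, base in RVD_TO_BASE.items()}


def design_rvds(target):
    """Return the list of RVDs (one per base) that would read `target`."""
    target = target.upper()
    unknown = set(target) - set(BASE_TO_RVD)
    if unknown:
        raise ValueError(f"unsupported characters in target: {sorted(unknown)}")
    return [BASE_TO_RVD[base] for base in target]


def decode_rvds(rvds):
    """Return the DNA sequence recognized by a series of RVDs."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


if __name__ == "__main__":
    rvds = design_rvds("CTAG")
    print(rvds)
    print(decode_rvds(rvds))
```

Because the RVD-base interaction is largely context-independent, a per-base lookup of this kind is a reasonable first approximation of TALE programming.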
In microscopy applications, engineered TALEs targeting the sequences of interest are fused to FPs or self-labelling tags at the N- or C-terminus. Multiplexing with FPs of different colours is readily possible. Such reporters can be expressed in vivo from the transfected plasmid(s) or delivered into the cells via mRNA or protein microinjection (Boutorine et al., 2013; Miyanari et al., 2013). TALEs also bind DNA in fixed cells, which allows specificity validation of the engineered constructs via co-localization with FISH probes (Boutorine et al., 2013; Miyanari et al., 2013).
TALE-FPs were used to study the structure, localization and dynamics of repetitive sequences in human, mouse, Drosophila and plant cells (Boutorine et al., 2013; Fujimoto et al., 2016; Miyanari et al., 2013; Thanisch et al., 2014; Yuan et al., 2014). TALE-mClover targeted to major satellite repeats allowed investigation of phase separation in heterochromatin in mouse embryonic stem cells (Novo et al., 2022). Establishing stable Arabidopsis cell lines expressing TALE-FPs targeted to centromeres, telomeres and rRNA genes allowed imaging of these sequences in multiple plant tissues: roots, hypocotyls, leaves, and flowers (Fujimoto et al., 2016). TALE-FPs turned out to be sensitive indicators of the size and structure of repetitive loci: staining intensity correlated with telomere length in a series of human cell lines (Boutorine et al., 2013); expansion of compact signals was observed during replication of satellite DNA in Drosophila cells (Yuan et al., 2014); and as little as one polymorphism in a 15 bp target was sufficient to discriminate against TALE-FP binding (Miyanari et al., 2013). Thereby, TALE-FPs helped to confirm previous knowledge and provided new insights into the dynamics of repetitive loci.
In Arabidopsis and human cells, the observed number of telomeres was consistently lower than expected from the respective karyotypes (Boutorine et al., 2013; Fujimoto et al., 2016), suggesting telomere clustering that was also observed by other methods (Adam et al., 2019). Super-resolution fluorescence microscopy imaging with TALE-based probes would provide a good tool for investigating telomere-clustering mechanisms in vivo. A proof-of-principle experiment of telomere imaging with TALE-HaloTag by STED microscopy has already been demonstrated (Fig. 4C) (Bucevičius et al., 2019). A very extensive study targeted TALEs to telomeric, centromeric, and ribosomal DNA loci in a series of tumor cell lines, stem and differentiated cells and even in living mice (Ren et al., 2017). FP-TALEs efficiently detected the well-known telomere attrition upon aging and in premature aging models. For the first time, this work demonstrated attrition of ribosomal DNA repeats as a molecular marker for human aging. This study has some interesting technical aspects. It rigorously validated TALE specificity by FISH, expressed FP-TALEs as fusions with thioredoxin to prevent their aggregation in vivo, and compared telomere staining in Henrietta Lacks (HeLa) cells by EGFP-TALE and dCas9-sgRNA. The latter experiment found the EGFP-TALE signal to be stronger and with lower background. The structure of nucleolar organizer region-related ribosomal DNAs was investigated with structured illumination microscopy-transmission electron microscopy (SIM-TEM) (Ren et al., 2017).
5-methylcytosine (5mC) is a prominent epigenetic mark in higher eukaryotes, with >70% of CpG dinucleotides methylated (Jaenisch and Bird, 2003). The canonical RVD recognizing cytosine (HD) is sensitive to 5mC, which may compromise TALE binding to CpG-containing sequences (Bultmann et al., 2012). This can be overcome by substituting HD with a truncated artificial RVD of lower specificity that can accommodate 5mC (Giess et al., 2018; Valton et al., 2012). Furthermore, comparison of staining by TALEs with the canonical and the lower-specificity RVD at the position of interest can be used to probe the methylation status of a particular CpG. This approach was validated in a model system: cells expressing a DNA methyltransferase targeted to SATIII sequences (Munoz-Lopez et al., 2020). Interestingly, an RVD with a clear preference for 5mC has been engineered recently and, in theory, could be used to report directly the presence of 5mC at the target CpG (Munoz-Lopez et al., 2021). It remains to be determined whether such tools can be useful outside model experiments.
Most of the studies using TALEs focus on imaging repetitive sequences, where attracting multiple copies of the reporter ensures a high signal-to-background ratio. However, even this might not be sufficient: when labelling plant telomere sequences, one of the TALEs employed had to be fused to three copies of GFP to provide sufficient signal (Fujimoto et al., 2016). To the best of our knowledge, so far only one study has succeeded in visualizing a non-repetitive locus: a single integrated HIV provirus (Ma et al., 2017). In order to discriminate between bound and unbound reporters, a pair of TALEs targeting two separate HIV loci was constructed, expressed with two different tags in cells and conjugated with quantum dots of different colours in vivo via bioorthogonal ligation reactions. Co-localization of both colours in 3D images indicated the presence of the HIV provirus. Consistent with FISH data, 1.5 ± 0.5 proviruses per cell were found in chronically infected HIV-1 promonocytic (U1) cells.
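The two-colour co-localization logic used to separate bound from unbound reporters can be sketched as follows. The example takes 3D spot coordinates from two channels and reports the pairs that lie within a chosen distance of each other. The function name, the coordinate units and the distance threshold are assumptions for illustration and are not taken from the cited study.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def colocalized_pairs(spots_a, spots_b, max_distance):
    """Return index pairs (i, j) where spot i of channel A and spot j of
    channel B are no further apart than max_distance (same units as the
    coordinates)."""
    pairs = []
    for i, a in enumerate(spots_a):
        for j, b in enumerate(spots_b):
            if dist(a, b) <= max_distance:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    # Hypothetical spot centres (x, y, z) in micrometres.
    channel_a = [(1.0, 1.0, 1.0), (5.0, 5.0, 2.0)]
    channel_b = [(1.1, 1.0, 1.0), (9.0, 0.0, 0.0)]
    # Hypothetical threshold of 0.3 micrometres.
    print(colocalized_pairs(channel_a, channel_b, max_distance=0.3))
```

Only spots seen in both channels at nearly the same position are counted, so freely diffusing single-colour reporters do not contribute to the provirus count.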
In summary, TALEs are single-component genetically encoded constructs that work in living and fixed cells with few restrictions on the target sequence. Programming TALEs is rather straightforward, they can be delivered to the cell by transfection or by mRNA or protein injection, and imaging can be multiplexed easily. Using TALE-FPs has already provided novel insights into the biology of repetitive genomic sequences; however, the popularity of this tool in the imaging community remains limited. Due to their highly repetitive sequence, TALEs are not amenable to conventional cloning techniques, and their construction relies on sets of preassembled building blocks that are combined by a series of restriction-ligation reactions, Golden Gate or ligation-independent cloning (see (Sakuma and Yamamoto, 2017) for an overview). Several such sets are available from the non-profit plasmid repository Addgene (Kamens, 2015). The existence of such kits streamlines TALE assembly; however, establishing and maintaining collections of building blocks is a labour-intensive task with little appeal for those who prefer focusing on biological questions rather than on tool development. Achieving good image contrast and visualizing non-repetitive genome loci is still an unsolved challenge. Signal amplification methods, similar to those applied to the dCas9 protein, need to be established in order for TALEs to be adopted by a wider community.
Fig. 4. A. Overall structure of a complex between DNA and synthetic TALE dHax3 (pdb code 3v6t (Deng et al., 2012)). The 34-amino-acid-long repeats are coloured from the N- to the C-end from blue to red. B. Interaction of three RVDs with DNA bases. A possible van der Waals interaction between the methyl group of thymine and the Cα-atom of glycine is shown. Serine and aspartate form H-bonds (cyan) to the respective DNA bases. Note that residues at position 12 of the shown RVDs (two asparagines and histidine) point away from DNA and are important for stabilization of the local conformation. Figures were prepared with UCSF Chimera (Pettersen et al., 2004). C. Schematic representation of the telomere-specific TALE-HaloTag fusion and STED microscopy image of telomere staining in U-2 OS cells expressing this construct (Bucevičius et al., 2019). Note that the first T of the target is recognized by the TALE N-terminus, and not by an RVD. HaloTag signal is shown in red-hot, DNA stained with 5-580CP-Hoechst is shown in grayscale. Scale bar: 5 μm.

DNA labelling with methyltransferases
Methylation of DNA is an essential reaction for many biological processes in both mammals and prokaryotes (Jeltsch et al., 1999). This reaction is performed by DNA methyltransferases (MTases), which recognize specific 2-8 bp target sequences and use the cofactor S-adenosylmethionine (AdoMet) as the methyl donor to modify a cytosine or adenine therein (Fauman et al., 1999; Katz et al., 2003; Szulik et al., 2020). Based on their target nucleobase, DNA MTases are divided into three classes: adenine-N6, cytosine-N4, and cytosine-C5 (Jeltsch and Jurkowska, 2016). The reaction in the active site proceeds via a bimolecular nucleophilic substitution (SN2) mechanism (Wu and Santi, 1987). This reaction can be utilized to transfer groups larger than methyl to functionalize DNA (Liutkevičiūtė et al., 2009; Lukinavičius et al., 2007; Pljevaljcic et al., 2004) (Fig. 5A). The first successful attempt to use MTases for DNA functionalization used aziridine cofactors (Pljevaljcic et al., 2003). In this method, called Sequence-Specific Methyltransferase-induced Labelling (SMILing), the MTase facilitates covalent coupling of the entire cofactor to the DNA target site via aziridine-ring opening (Fig. 5A) (Pljevaljcic et al., 2003, 2004; Zhang et al., 2006). However, this approach produces a locked DNA-MTase-cofactor complex and results in a single-turnover reaction (Fig. 5A).
Fig. 5 (partial caption). DNA molecules were combed using mixed hydrophilic/hydrophobic surface-modified cover slips. The green channel shows the unspecific DNA intercalator YOYO-1 (a cyanine dye, excitation: 488 nm) and the red channel shows the TAMRA labelling (a rhodamine dye, excitation: 561 nm). The highest number of labels was obtained for non-methylated DNA (left panel), while the number of labels decreases with increasing CpG methylation levels (middle to right panel).
Several years later a collaborative work between Weinhold and Klimašauskas groups showed that a double or a triple carbon-carbon bond at β-position to sulfonium centre facilitates the transfer of groups larger than methyl to specific DNA sequence sites by the MTases (Dalhoff et al., 2006). This method is known as "MTase-directed Transfer of Activated Groups (mTAG)" (Fig. 5B) (Lukinavičius et al., 2007). In contrast to SMILing, mTAG produces labile DNA-MTase-cofactor complex and allows multiple enzyme turnovers.
The cofactor binding pocket of MTase TaqI (M.TaqI) is extremely promiscuous; the enzyme is therefore highly active with a wide range of AdoMet analogues and remains the enzyme of choice in multiple studies (Dalhoff et al., 2006; Weinhold and Chakraborty, 2021). However, the activity of other wild-type (wt) enzymes with AdoMet analogues is low to moderate (Kriukienė et al., 2013; Lukinavičius et al., 2012; Vranken et al., 2014). In some cases, engineering of the cofactor binding pocket can be employed to enable utilization of AdoMet analogues, as demonstrated for the enzymes M.HhaI, M.SssI, M.TaqI and eukaryotic Dnmt1 (Deen et al., 2017; Heimes et al., 2018; Kriukienė et al., 2013; Lukinavičius et al., 2007; Stankevičius et al., 2022; Staševskij et al., 2017). These engineering efforts are facilitated by the highly conserved structure of DNA MTases and an extensive understanding of their reaction mechanism (Gerasimaitė et al., 2009; Lukinavičius et al., 2012; Merkienė and Klimašauskas, 2005).
MTase-directed labelling can have various applications, including imaging of DNA via attachment of fluorophores to target sites, functionalization of plasmid DNA for gene delivery, mapping epigenetic modifications and genotyping (Pljevaljcic et al., 2004). The most common application of DNA MTases is in the field of DNA optical mapping (OM). Established by Schwartz et al. in 1993, it is a microscope-based technique where MTases are used to fluorescently label specific DNA sequences and their images are used to create one-dimensional maps of stretched DNA molecules (Schwartz et al., 1993; Yuan et al., 2020). This part of the method is well developed and commercialized thanks to advancements in nano- and microfluidics (Yuan et al., 2020). OM is a powerful tool complementing whole genome sequencing, as it aids genome assembly by providing a scaffold for aligning repetitive regions (Bogas et al., 2017). It has also been shown to be helpful in discovering and characterizing rare structural variants in heterogeneous tumor samples (Bocklandt et al., 2019; Lam et al., 2012). In addition, the possibility of using optical mapping for virus identification has been explored (Wand et al., 2019) and its utility for taxonomic analysis of bacterial strains has been demonstrated (D'Huys et al., 2021). Most recently, combining OM and a 523-gene next generation sequencing (NGS) panel was proposed by Sahajpal et al. as a cost-effective way of profiling myeloid cancers to improve clinical outcomes (Sahajpal et al., 2022). Although the resolution of OM is still low, making it unsuitable for maps shorter than 100-150 kb, it is a valuable complementary method in genomic studies (Yuan et al., 2020).
mTAG can be applied to epigenome studies, because the activity of bacterial MTases is blocked by the presence of 5-methylcytosine in their target sequences. Consequently, they attach fluorophores only to the sites lacking the 5mC epigenetic mark. Multiple studies exploit mTAG for unmethylated DNA enrichment in sequencing-based epigenome analysis (Kriukienė et al., 2013; Stankevičius et al., 2022; Staševskij et al., 2017). The same principles can also be applied to OM. Thereby, profiling the unmethylome of human cells using M.TaqI was demonstrated (Fig. 5D) (Sharim et al., 2019).
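The principle of a methylation-sensitive optical map can be illustrated in silico. The minimal sketch below scans a DNA sequence for an MTase recognition site and keeps only those sites whose CpG cytosine is not marked as methylated. It assumes the M.TaqI recognition sequence 5'-TCGA-3', which is not stated in the text, and converts base-pair positions to contour length using the B-DNA rise of 0.34 nm per base pair, which ignores any stretching of the molecule. All function names are hypothetical.

```python
BP_RISE_NM = 0.34  # rise per base pair of B-DNA, in nanometres


def find_sites(seq, site="TCGA"):
    """Return 0-based start positions of all (overlapping) occurrences of site."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def labelled_sites(seq, methylated_c=frozenset(), site="TCGA"):
    """Return site starts that would still be labelled.

    methylated_c holds 0-based indices of methylated cytosines; a site is
    skipped when the cytosine of its CpG is in that set.
    """
    cpg_offset = site.index("CG")
    return [p for p in find_sites(seq, site) if p + cpg_offset not in methylated_c]


def to_nm(positions):
    """Convert base-pair positions to distances along the DNA in nanometres."""
    return [p * BP_RISE_NM for p in positions]


if __name__ == "__main__":
    dna = "AATCGAGGTCGATT"
    print(find_sites(dna))
    print(labelled_sites(dna, methylated_c={9}))
    print(to_nm(find_sites(dna)))
```

Comparing the label pattern of a sample with the pattern predicted for fully unmethylated DNA then indicates which sites carry 5mC.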
Further important mTAG developments moved towards reversible tagging, which allows installation and on-demand removal of various functionalities on natural DNA. For example, photo-reversible labelling, termed re-mTAG, was demonstrated using photocaged cofactor analogues and M.TaqI (Anhauser et al., 2018; Heimes et al., 2018). These groups can be used to photoregulate gene expression at the plasmid DNA target sites (Heimes et al., 2018). Recently, an AdoMet analogue equipped with an acyl hydrazone linker and a terminal azide was used to tag, untag, and permanently tag DNA. It allows two fluorescent dyes or affinity tags to be introduced sequentially into natural DNA (Wilkinson et al., 2020).
One of the main limiting factors for the use of DNA MTases for DNA labelling and imaging in living cells is the cell membrane impermeability of cofactor analogues (Sohtome et al., 2021). This issue can be overcome by employing chemo-enzymatic synthesis in situ. The most popular example exploits methionine adenosyltransferase (MAT), which uses adenosine triphosphate (ATP) and methionine as substrates (Fig. 5C) (Park et al., 1996; Singh et al., 2014; Vranken et al., 2016; Wang et al., 2013). Michailidou et al. demonstrated that MATs can synthesize AdoMet analogues with photocaging groups, which can be used to detect the labelled DNA sites (Michailidou et al., 2021). Furthermore, a propargyl AdoMet analogue can be generated from a methionine analogue and ATP inside the cells (Huber et al., 2020; Sohtome et al., 2021). An alternative possibility, delivering cofactor analogues into the cell via electroporation, has been demonstrated for epigenetic profiling and imaging of protein methylation sites (Doll et al., 2019; Gade et al., 2021; Stankevičius et al., 2022).
Poor stability of cofactor analogues is another obstacle hampering their use for DNA labelling and microscopy applications. There are two most common degradation pathways for these analogues: intramolecular cyclization and depurination (Huber et al., 2016). To overcome the latter, 7-deaza AdoMet analogues were created by removing nitrogen from the 7 position of the adenine to prevent depurination, and aziridine-based versions of these cofactors were used in sequence-specific labelling of DNA with M.HhaI for electron microscopy analysis (Huber et al., 2016).
In summary, MTase-based labelling of DNA is a promising technique for generating a predictable labelling pattern along large natural DNA molecules. Small label size opens the way to fully exploit the potential of super-resolution fluorescence microscopy, where localization precision reaches down to nanometres and structure distortion by a label becomes a potential issue. However, there is still a long way to go before it becomes a benchmark technique for microscopy.

Conclusions and future perspectives
Site-specific DNA labelling that introduces minimal perturbations to the DNA double helix is an indispensable tool for chromatin research (Boettiger and Murphy, 2020). The advantages of these techniques become apparent in super-resolution fluorescence microscopy images. Long-standing denaturation-based DNA labelling methods are slowly being substituted by minimally disruptive methods based on hairpin polyamides and triplex-forming oligonucleotides (TFOs). Recent advances in genome editing provided researchers with the programmable proteins dCas9 and TALEs, which offer a lot of flexibility for both in vivo and in vitro imaging. Despite multiple developments, sequence-specific DNA labelling is far from ideal. Limitations shared among most labelling methods include off-targeting, probe aggregation in vivo, as well as the risk of disrupting the local chromatin structure and affecting the process under study. The methods discussed in this review can readily detect repetitive DNA loci, such as telomeres and centromeres, but they often struggle with visualization of single-copy sequences. It is worth noting that dCas9-based methods are the most advanced in this direction. Thanks to elaborate signal amplification approaches, they demonstrate the potential to replace classical FISH with mild, highly specific CasFISH.
All the aforementioned developments resonate with current advances in super-resolution fluorescence microscopy methods, which are achieving astonishing sub-nanometre (nm) localization precision of fluorophores in complex biological samples (Balzarotti et al., 2017; Schmidt et al., 2021; Weber et al., 2021, 2022). This could allow identification of DNA conformations and structural forms (A- and B-DNA, left-handed Z-DNA, cruciforms, intramolecular triplexes, quadruplex DNA, slipped-strand DNA, parallel-stranded DNA, and unpaired DNA), provided that an appropriate labelling scheme is available (Fig. 6A). However, the Nyquist criterion has to be satisfied to take full advantage of the resolving power, meaning that the labelling density should be at least 2- or 3-fold higher than the resolution of the microscope, i.e. neighbouring labels should be spaced at no more than one half to one third of the resolvable distance. This imposes physical limitations on the label size, which must allow sufficient spacing between separate reporter molecules (Liu et al., 2022). For example, probes with sizes of a few nanometres are required for MINFLUX (minimal photon fluxes) or MINSTED (minimal photon fluxes-stimulated emission depletion) microscopes that attain resolution in the range of 1-2 nm (Balzarotti et al., 2017; Weber et al., 2021, 2022). One could think that labelling each base pair on the DNA would be beneficial because of the high labelling density (every 0.34 nm), but this would certainly destroy the structure of interest by distorting DNA geometry and interfering with protein-DNA interactions. On the other hand, these imaging techniques allow extremely fast tracking of sparsely labelled structures. This can be achieved using all sequence-specific DNA labelling methods discussed in this review (Fig. 6B). Furthermore, adapting optical mapping principles to high-resolution 3D images of the entire nucleus could endow the microscopy images with sequence context.
However, this is a truly challenging task that will require optimization of labelling and super-resolution imaging regimes to be able to reliably resolve and register individual fluorophores or fluorophore groups in a very densely packed nucleus, as well as sophisticated machine-learning approaches for data processing. For the future, we anticipate that the research community seeking to unravel molecular mechanisms underlying chromatin functions will adopt the highlighted methods more and more. Particularly powerful approach would be combination of super-resolution microscopy and sequencing methods which allows precise positioning of specific DNA sequences in the spatial context of the nucleus. We envision that synergistic approaches would allow visualization of the fine details of chromatin structure not only in single cells seeded on glass coverslips, but also in tissues or even whole organs.
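The Nyquist requirement mentioned above can be turned into numbers with a minimal sketch. It computes the largest allowed spacing between neighbouring labels for a given resolution and expresses it in base pairs, using the 0.34 nm rise per base pair quoted in the text. The function names and the default oversampling factor of 2 are assumptions for illustration.

```python
BP_RISE_NM = 0.34  # distance between neighbouring base pairs, in nanometres


def max_label_spacing_nm(resolution_nm, oversampling=2.0):
    """Largest label-to-label spacing that still satisfies the Nyquist
    criterion for the given resolution (oversampling of 2 to 3)."""
    if resolution_nm <= 0 or oversampling <= 0:
        raise ValueError("resolution and oversampling must be positive")
    return resolution_nm / oversampling


def max_bp_between_labels(resolution_nm, oversampling=2.0):
    """Same limit expressed in base pairs along straight B-DNA."""
    return max_label_spacing_nm(resolution_nm, oversampling) / BP_RISE_NM


if __name__ == "__main__":
    for resolution in (250.0, 2.0, 1.0):  # nm
        spacing = max_label_spacing_nm(resolution)
        bp = max_bp_between_labels(resolution)
        print(f"{resolution:6.1f} nm resolution: spacing <= {spacing:.2f} nm (~{bp:.1f} bp)")
```

At a resolution of 1-2 nm the allowed spacing shrinks to a few base pairs at most, which illustrates why label size and DNA distortion become limiting for MINFLUX- or MINSTED-type imaging.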

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability
No data was used for the research described in the article.