Epigenetic regulatory layers in the 3D nucleus

SUMMARY


INTRODUCTION
In 1956, 3 years after the publication of the double helix structure of DNA, Crick introduced a concept that became known as the central dogma of molecular biology. In "Ideas on protein synthesis," 1 Crick set out the major axiom of the directionality of sequence information transfer during protein biogenesis, that is, how information is passed from the DNA to messenger RNA (mRNA) and to proteins (Figure 1A). Crick's central dogma states that "once information has got into a protein it can't get out again," where the term "information" refers to a given sequence and its transfer from one substrate, being either deoxyribonucleotides (in DNA), ribonucleotides (in RNA), or amino acids (in proteins), into another. Whereas DNA can be copied into RNA by the process of transcription, proteins are generated by the translation of an RNA intermediate (generally mRNA), which converts the nucleotide sequence into its amino acid counterpart, following a correspondence referred to as the genetic code. Other types of information transfer exist, namely, DNA replication, reverse transcription of RNA into DNA (such as for retroviruses prior to their integration in the genome), and RNA self-replication.
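As a minimal sketch of the sequence information transfer described above, the following toy example copies a DNA coding strand into mRNA and translates it into amino acids. The function names, the example sequence, and the six-codon table are illustrative assumptions; the table is only a small hand-picked subset of the standard genetic code, and real transcription and translation involve far more than string conversion.

```python
# Toy illustration of DNA -> mRNA -> protein information transfer.
# CODON_TABLE is a deliberately tiny subset of the standard genetic code
# (hypothetical helper for this sketch, not a complete table).
CODON_TABLE = {
    "AUG": "M",   # methionine (start codon)
    "GCU": "A",   # alanine
    "UGG": "W",   # tryptophan
    "AAA": "K",   # lysine
    "GAA": "E",   # glutamate
    "UAA": None,  # stop codon
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence: the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read the mRNA codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon} is not in the toy table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCTTGGAAAGAATAA"
mrna = transcribe(dna)
protein = translate(mrna)
print(mrna)     # AUGGCUUGGAAAGAAUAA
print(protein)  # MAWKE
```

The sketch has no function that recovers a nucleotide sequence from the protein string, which mirrors the directionality that the central dogma asserts.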
Whereas the central dogma still explains the directional transfer of sequence information between DNA, RNA, and protein, we are far from understanding how regulatory mechanisms arise from the activity of nuclear factors within the complex nuclear landscape (Figure 1B). Genome regulation involves crosstalk and feedback loops between DNA, RNAs, and proteins that occur in the three-dimensional (3D) cell nucleus (Figure 1C). Chromatin adopts a hierarchy of folding states, from whole chromosomes occupying preferred territories to specific chromatin interactions, and to local nucleosome positioning. 2,3 Nuclear macroscale properties also contribute to gene regulation and include diverse membraneless microenvironments composed of proteins and RNAs. Finding general rules of genome regulation remains especially difficult because 3D genome structure and gene regulation are highly dependent on the genomic context of each gene, are specific to cell type and to developmental and disease states, and can vary over time. Deciphering how the homeostatic control of gene activation and repression is achieved in a timely and spatially defined manner is fundamental because the first events that dictate cellular function and identity occur through gene regulation and have the greatest impact on development, disease onset, and progression.
In this article, we explore how proteins, RNAs, and regulatory DNA elements organize the genome and regulate the expression of genes. Distinguishing co-observed events that have functional, cause-and-effect implications from coincidental co-occurrences remains challenging. The advent of multimodal approaches that simultaneously measure different parameters in the same cell promises to deliver molecular information that can connect, at the single-cell level, genome structure and function to address current challenges.

DISCOVERING REGULATORY DNA ELEMENTS AND FUNCTIONAL ncRNAs
Core to our current knowledge of genome function are the seminal discoveries on DNA sequence elements, or cis-regulatory elements (CREs), which hold regulatory activity and are found throughout the genome (Figure 2). The release of the first assemblies of mammalian genomes (e.g., the human genome 4) made evident that protein-coding sequences account for only a minor fraction of the genome. The number of protein-coding genes in humans is currently estimated at about 20,000. Within genes, only a small fraction of DNA sequences code for amino acids (within exons), and the remainder (introns) can contain CREs that regulate the same or other genes. Less than 1.2% of the total 3 billion base pairs in the haploid human genome sequence encodes proteins. 5 In contrast, model organisms, such as fly, worm, and yeast, have smaller genomes more densely packed with protein-coding sequences, and their gene expression is more locally regulated.
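As a rough worked example, the figures quoted above can be turned into absolute numbers. The sketch below uses only the values stated in the text (about 3 billion base pairs, less than 1.2% coding, about 20,000 protein-coding genes); the variable names are illustrative, and because the inputs are rounded, the results should be read as order-of-magnitude upper bounds rather than measurements.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the text.
GENOME_BP = 3_000_000_000       # haploid human genome, ~3 billion bp
CODING_PER_MILLE = 12           # "less than 1.2%" treated as an upper bound
PROTEIN_CODING_GENES = 20_000   # approximate number of protein-coding genes

# Integer arithmetic avoids floating-point rounding in the product.
coding_bp_upper = GENOME_BP * CODING_PER_MILLE // 1000
noncoding_bp_lower = GENOME_BP - coding_bp_upper
mean_coding_bp_per_gene = coding_bp_upper // PROTEIN_CODING_GENES

print(f"coding sequence (upper bound): {coding_bp_upper:,} bp")
print(f"non-coding sequence (lower bound): {noncoding_bp_lower:,} bp")
print(f"implied mean coding length per gene: {mean_coding_bp_per_gene:,} bp")
```

Under these rounded inputs, at most about 36 Mb of sequence encodes protein, leaving more than 2.9 Gb in which regulatory information could reside.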
The development of microarray and next-generation sequencing technologies has been instrumental in deciphering how gene regulatory information is stored in the genome. Major efforts such as the Encyclopedia of DNA Elements (ENCODE) Consortium have revealed that most (>90%) of the genome is transcribed in at least one of the analyzed cell types, although sometimes at low levels. 6 Notably, the fraction of the genome transcribed in individual cell types is around 15%. 7 Genome-wide studies also found that the transcribed genome comprises many newly discovered non-coding RNAs (ncRNAs) that hold regulatory functions, such as long ncRNAs, enhancer RNAs, microRNAs, and circular RNAs.
Apart from the major types of ncRNAs, around 95% of all transcribed genomic locations serve as a template for transcripts whose possible functions remain obscure. 6,8 Tens of millions of distinct transcripts emanating from several hundreds of thousands of genomic regions have been identified across all the assayed tissue types. Pervasive intergenic transcripts (i.e., falling in neither exons nor introns) display similar tissue-specific expression patterns between human and chimpanzee, suggesting that such transcription events are under positive selection and have biological functions. 9 Moreover, short pervasive transcripts associated with gene promoters and termini correlate with the expression of the protein-coding genes that they delineate and have conserved synteny (maintenance of linear genomic arrangement) between human and mouse. 10 The potential biological significance of "pervasive transcription" remains highly debated owing to the difficulty in assessing it unequivocally. It is possible that the process of transcription, rather than the transcripts themselves, holds regulatory roles, for instance, through chromatin remodeling as a result of RNA polymerase II (RNA Pol II) transcription. 7 New approaches are needed to assess the function of pervasive transcription from a mechanistic viewpoint and in a genome-wide manner, to distinguish whether specific ncRNAs or their transcription regulate gene expression, or are simply by-products of other events.

Figure 1. Ideas on protein synthesis (Crick, 1956) and ideas on gene regulation (1885-present)

TOWARD COMPLETE ATLASES OF CIS-REGULATORY DNA ELEMENTS
The ENCODE project has mapped candidate CREs in the human genome. Most CREs were categorized as promoters or enhancers, if they are directly adjacent to their target genes or regulate distal genes, respectively. 11 [13][14] In the current model of CRE function, specific transcription factors (TFs) known as pioneer factors bind to high-affinity DNA motifs, thought of as "anchors," to trigger the opening of chromatin, forming a "core." Subsequent binding and assembly of other TFs and co-factors around the core, through weaker protein-DNA interactions, multivalent associations between proteins, or with ncRNAs (Figure 3), are thought to further modulate the transcriptional dynamics of specific genes. 13
The latest ENCODE report assigned approximately 8% of the human genome as putative cis-regulatory sequences. 15 The total fraction of the genome that contains cis-regulatory information remains under intense debate, partly because we lack a complete understanding of how different regulatory activities act either directly on DNA (e.g., TF binding), through transcription itself, or through non-coding transcripts. 5,7,16
In sum, evidence accumulated in the last 20 years suggests that most of the genome, rather than being "junk," has regulatory functions in at least one cell type or tissue, either by encoding functional RNAs, providing binding sites for TFs, or through other less understood genome regulatory effects, such as pervasive transcription. Regulatory regions are selectively deployed in individual cellular or tissue contexts to modulate the transcription of specific protein-coding genes and provide a fertile ground for new regulation to evolve. Understanding the regulatory genome is also crucial beyond basic research, for healthcare applications, because disease-associated sequence variants often overlap with cell-type-specific putative CREs in non-coding DNA. 17

SPATIAL ORGANIZATION OF THE GENOME: WHEN 3D STRUCTURE MEETS FUNCTION
Transcriptional regulation occurs within the 3D space of the nucleus, where hierarchically organized layers of genome conformation have evolved to allow for physical contacts between non-coding regulatory DNA elements and their cognate promoters, for spatial segregation of active and inactive chromatin regions, and for preferred positions of genomic sequences relative to different biochemical nuclear microenvironments. These layers ensure appropriate expression of housekeeping and cell-identity genes in time and space and allow for adequate cellular responses to differentiation cues and other stimuli. In the following sections, we introduce different layers of 3D genome structure and review their roles in gene regulation.

TADs
[20][21][22] Each topologically associating domain (TAD) is flanked by boundary regions, which are enriched for binding sites of the TF CCCTC-binding factor (CTCF) and for cohesin binding, short repetitive sequences, and housekeeping genes. 18,19,23 Most studies on TAD formation have focused on the roles of the cohesin complex and CTCF. 24,25 Cohesin is a multi-protein complex that extrudes DNA 26 until it encounters DNA-bound CTCF, especially at binding sites with convergent orientations, which are often present at TAD boundaries, 27 thereby forming loops.
[31][32] Single-cell imaging shows that TAD-like structures exist in single cells; yet, they are positioned at a wide range of different genomic locations in different cells with average positions that match ligation-based results from bulk cell populations. 33 This suggests that TADs reflect conformations of highest probability, while allowing contacts between genomic regions located in different TADs to some degree. Supporting this view, cohesin depletion leads to loss of preferred TAD positions, whereas variable TAD-like domains are still seen in single cells.
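The loop-extrusion picture described in this section can be sketched as a toy one-dimensional simulation. Everything in the example is an illustrative assumption rather than a published model: the genome is a line of 100 bins, cohesin extrudes symmetrically one bin per step, it never unloads, and a CTCF site blocks with certainty only when its motif points toward the approaching cohesin arm (forward "+" sites stop the leftward-moving arm, reverse "-" sites stop the rightward-moving arm), which is one simple way to encode the preference for convergent sites.

```python
# Toy 1D loop extrusion: cohesin loads at one bin and reels in chromatin on
# both sides until each arm is stopped by a CTCF site in blocking orientation.
# All positions and rules are hypothetical simplifications for illustration.
GENOME_LENGTH = 100
CTCF_SITES = {20: "+", 40: "-", 60: "+", 80: "-"}  # bin -> motif orientation


def extrude(load_site: int, ctcf_sites: dict, length: int) -> tuple:
    """Return the (left, right) anchors of the loop formed from load_site."""
    left = right = load_site
    left_stalled = right_stalled = False
    while not (left_stalled and right_stalled):
        if not left_stalled:
            # The leftward arm is blocked by a forward-oriented site
            # or by the end of the simulated region.
            if ctcf_sites.get(left) == "+" or left == 0:
                left_stalled = True
            else:
                left -= 1
        if not right_stalled:
            # The rightward arm is blocked by a reverse-oriented site
            # or by the end of the simulated region.
            if ctcf_sites.get(right) == "-" or right == length - 1:
                right_stalled = True
            else:
                right += 1
    return left, right


for site in (30, 50, 70):
    print(site, extrude(site, CTCF_SITES, GENOME_LENGTH))
# 30 (20, 40)
# 50 (20, 80)
# 70 (60, 80)
```

In this sketch, every loop ends at a convergent pair of sites, and cohesin loaded between the divergent sites at bins 40 and 60 passes both of them. Making blocking probabilistic and averaging over many simulated cells would be one way to mimic the cell-to-cell variability of TAD-like domains noted above.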

E-P communication within and between TADs
The contact enrichment of TADs and barrier activity of their boundaries are thought to relate to the capacity of insulator DNA sequences to restrict enhancer-promoter (E-P) communication. 34 In this view, enhancers preferentially communicate with genes in the same TAD while being isolated from genes in adjacent or distant TADs. 35 [38] However, the TAD-centric view for general cis-regulation is challenged by observations that many E-P (and promoter-promoter) interactions occur across domain boundaries and can efficiently activate gene expression. 32,39,40 Furthermore, TAD disruption after acute cohesin or CTCF depletion in the range of hours results in only modest changes in nascent transcription in mouse embryonic stem cells (approximately 50 differentially expressed genes, mostly associated with differentiation and pluripotency), 41 whereas cohesin or CTCF removal for several days affects the expression of hundreds to, at most, a few thousand genes. 24,25 In line with these observations, loop extrusion may promote contacts between only a subset of gene promoters, especially those directly bound by CTCF, and their enhancers, whereas a number of enhancers and promoters interact independently of CTCF/cohesin. 42,43
Cis-regulatory communication is increasingly recognized as a complex phenomenon, often specific to genomic context. 44 E-P interactions can occur in the absence of gene expression, independently of tissue or developmental stage, 45,46 implying that productive E-P communication relies on additional events, such as enhancer activation from primed or poised enhancer states. Conversely, enhancer activity is often broader in time and space than the activity of the target genes, and gene activation at the right time and place can depend on physical E-P interactions. 47,48 Gene activation may also occur without changes in E-P proximity, 49 or coincide with increased E-P distance. 50 Spatiotemporal E-P regulatory communication is far from being fully elucidated, may be more transient or dynamic than initially thought, and likely consists of a combination of different context-dependent molecular mechanisms governed by diverse chromatin factors, including both proteins and RNAs.

Figure 2. Gene regulatory information code
CREs cover a substantial fraction of the genome and regulate transcription of target genes. CREs can be grouped into enhancers and promoters, which regulate transcription of target genes across large linear genomic distances. CREs are either cell-type specific or common to multiple cell types and can be located in extragenic or intragenic genomic regions. Panel labels: information code; cell types 1, 2, and 3; genes A, B, and C; active CRE; active gene; inactive gene.

Heterochromatin and euchromatin states
A growing body of work suggests that membraneless nuclear sub-compartments play important roles in organizing chromatin and are associated with gene activation or repression (Figure 4). 51
On a global level, chromatin is often classified as euchromatin and heterochromatin. [53][54] The main epigenetic features of euchromatin are acetylation of histone H3 at residue K27 and trimethylation of H3K4. [60][61][62] Heterochromatin is characterized by higher levels of compaction, tends to contain silent genes, and acts as a repressive epigenetic environment. 63 In most cell types, heterochromatin predominantly associates with the nuclear lamina, a mesh of proteins located at the inner nuclear membrane, and with the surface of nucleoli. 64 Heterochromatin is further divided into constitutive and facultative sub-types, which have distinctive molecular properties and sub-nuclear localization.

Constitutive heterochromatin
Constitutive heterochromatin is present around centromeric and telomeric regions of chromosomes, is enriched in repetitive sequences, and is largely invariant between cell types or throughout differentiation. 63 At the molecular level, constitutive heterochromatin is dependent on H3K9 methylation and the chromodomain-containing heterochromatin protein 1 (HP1). 65,66 HP1 proteins interact with Suv39h1 and Suv39h2, the enzymes that deposit H3K9me3, thereby stabilizing them onto chromatin for efficient heterochromatin formation and/or maintenance. 67 [70][71] HP1α interacts with H3K9me3 with an affinity and specificity that are modulated by phosphorylation of an intrinsically disordered region (IDR) in the HP1α protein, 72,73 which also interacts with nuclear RNAs. 74,75 [77][78] HP1α can form phase-separated liquid droplets and compact DNA in vitro, which are also dependent on phosphorylation of its IDR. 79 The phase separation potential of HP1α has been proposed to confer a single mechanism by which constitutive heterochromatin forms and excludes other factors, namely, components of the transcriptional machinery, 80 depending on their chemical properties. 81 However, recent in vivo evidence using cultured mouse fibroblasts shows that HP1α can efficiently mediate repression of a reporter gene independently of droplet formation. 82 The compacted state of heterochromatin may instead result from bridging-type interactions that generate a collapsed chromatin conformation through polymer phase separation. 83,84

Facultative heterochromatin and Polycomb repression mechanisms
In contrast with constitutive heterochromatin, facultative heterochromatin is more scattered throughout the genome, is highly specific to cell type, and dynamically changes during development and in disease. 54 Facultative heterochromatin occurs at genomic regions occupied by Polycomb-group proteins and their associated histone modifications, H3K27me3 and H2AK119ub1, and is a typical feature of developmentally regulated genes (e.g., Hox loci) that are dynamically activated or repressed in response to developmental cues. 85 CBX2, a chromodomain-containing subunit of Polycomb repressive complex 1 (PRC1), is thought to have an orchestrating role in facultative heterochromatin, similar to the role of HP1α in constitutive heterochromatin. 86,87 Canonical PRC1 complexes contain one of the five different CBX proteins: CBX2, CBX4, CBX6, CBX7, or CBX8. 88 CBX7/8 are recruited to H3K27me3, which is deposited by PRC2, and mediate PRC1 targeting to this mark. 89 PRC1 catalyzes the monoubiquitination of histone H2A at residue K119 (H2AK119ub1), which stimulates PRC2 recruitment and further deposition of H3K27me3, 90 creating a feedback loop that promotes the formation of facultative heterochromatin domains. 91 [100] Phase separation has also been shown to participate in facultative heterochromatin through CBX2. The IDR of CBX2 is required for the formation of liquid-like PRC1 condensates in vivo 87,101 and for chromatin compaction in vitro, 102 and is essential for normal axial patterning in mice, 103 although loss of PRC1 condensates does not interfere with pre-existing compacted chromatin. 86 Collectively, these observations suggest a stepwise process in which PRC1-CBX2 condensates indirectly mediate the compaction of facultative heterochromatin. First, PRC1 is recruited to H3K27me3-marked regions via CBX7/8 (or via lncRNAs), and subsequently mobilized into nearby PRC1-CBX2 condensates. 87 This second event is thought to be mediated by another component of canonical PRC1, the Polyhomeotic subunits, which can bridge PRC1 complexes with one another. 104,105 Lastly, deposition of H2AK119ub1 by PRC1 and/or further H3K27me3 through feedback with PRC2 within the condensate would trigger chromatin compaction. 86 A complete characterization of the compaction mechanism remains to be established, especially whether chromatin segments harboring repressive histone modifications spontaneously compact in vivo, or require molecular adaptors to bridge modified nucleosomes.

Figure 4. Heterochromatin and euchromatin exhibit distinct compaction states and epigenetic signatures, host different regulatory complexes, and associate with different nuclear environments, which fine-tune embedded cis-regulatory elements. Proximity to repressive nuclear landmarks such as the lamina and the nucleolus preserves gene inactivity. Proximity to active nuclear landmarks such as nuclear speckles that contain RNA binding proteins and splicing factors is correlated with transcription potential. Chromatin occurs in a loosely packed, decondensed (or "melted") state or in a more compact, condensed state, which relates to transcriptional activity. Panel labels: chromatin type; inactive epigenetic signatures; chromatin density; proximity to repressive nuclear landmarks; transcriptional activity; Pol II and TF density; decondensation.

Mapping the nuclear position of heterochromatin domains
The sub-nuclear spatial distribution of heterochromatic domains has been extensively investigated thanks to sequencing techniques such as DNA adenine methyltransferase identification (DamID), which uses a fusion between a DNA adenine methyltransferase and a protein of interest to detect its close chromatin association in a genome-wide manner. 106 Genomic stretches that preferentially localize close to the nuclear lamina, referred to as "lamina-associated domains" or LADs, were mapped using DamID of nuclear lamins or their interacting proteins. 107
LADs are contiguous genomic stretches of 0.01-10 Mb in length in mammals, which can cover 30%-50% of the genome depending on cell type. 108 [110][111] LADs are also partially related to domains that associate with the surface of nucleoli, called "nucleolus-associated domains" or NADs, [111][112][113][114] which were initially identified by DNA sequencing methods in biochemically isolated nucleoli. 63,115 For brevity, we focus below on the case of LADs, although similar principles are thought to apply to NADs.
[117][118] Facultative LADs (fLADs) represent more than half of all LADs, contain on average more genes than constitutive LADs (cLADs), and their cell-type-specific dynamics likely play important roles in gene regulation. [119][120][121] However, artificially tethering loci to the lamina has only modest effects on gene expression, often leading to incomplete repression, 122 variable outcomes from gene to gene, 123 or no effect. 124 Furthermore, gene promoters located at LADs that are less tightly embedded within the lamina can evade repression to varying degrees. 125 Lastly, forcing expression of genes within LADs causes detachment of those transcription units from the lamina, whereas, conversely, gene inactivation can result in their lamina engagement, suggesting that the act of gene transcription may also counteract local lamina association. 126 An appealing possibility is that the global organization of LADs (and possibly of NADs) could result from the biophysical properties of heterochromatin and the sub-nuclear environment, such as mutual affinity between H3K9me2/3-rich chromatin and the lamina, 51 which would nonetheless remain susceptible to additional forces at the local level, especially transcriptional activity. The nuclear (or nucleolar) periphery may also provide a non-homogeneous medium for gene repressiveness, at different depths or locations along the periphery, which could explain the variable consequences of lamina association or dissociation on gene activity.

Radial gene positioning and proximity to splicing speckles
Expression at specific genes also correlates more generally with nuclear radial positioning of loci. 62,127,128 Typical examples are the progressive relocation of the β-globin gene to a more internal nuclear position during erythroid maturation, when its expression increases, 62 of the immunoglobulin genes and other immune loci during lymphocyte differentiation, 129,130 and of the brain-derived neurotrophic factor (Bdnf) gene during neuronal activation. 131 Other studies did not find a clear correlation between radial chromatin positioning and gene expression status, 132 suggesting that it is not a general determinant for all genes or in all systems and requires further study.
The relative distance of genomic regions to the nuclear lamina and to nuclear splicing speckles has been recently mapped genome-wide by tyramide signal amplification sequencing (TSA-seq), a sequencing-based approach that uses antibody-enzyme conjugates directed against specific nuclear landmarks and quantitatively labels DNA according to its physical distance to the landmark. 133,134 TSA-seq showed that distances of genes to nuclear splicing speckles are inversely correlated with expression levels in various human cell types, with even small distance shifts being closely correlated with gene expression changes between different cell types. Of all genes upregulated in one cell type compared with another, about 10% map to regions that increase their proximity to speckles in the cell type in which they are most expressed.

Chromosome territories
Chromosome territories represent the largest hierarchical layer of chromatin organization in the eukaryotic cell nucleus. [139] The radial positions of chromosome territories in both pluripotent and differentiated cells correlate with gene density in nearly spherical nuclei (e.g., embryonic stem cells and lymphocytes) [140][141][142] or chromosome size in non-spherical nuclei. 137 [145] Chromosome territories are not strictly separated in mammalian cells, where they intermingle with one another as seen by FISH performed on ultrathin nuclear cryosections combined with confocal imaging or electron microscopy, 2 and by chromosome conformation capture experiments followed by polymer modeling. 146 Intermingling varies between cell types, being lower in human embryonic stem cells and cancer cell lines compared with peripheral blood lymphocytes. 2,146 The extent and specificity of chromosome intermingling change during activation of resting peripheral blood lymphocytes and upon transcriptional inhibition. 2,147 It remains unclear whether transcription occurs preferentially at the surface or throughout the volume of chromosome territories 2,148,149 and to what degree trans-chromosomal intermingling is cell-type specific 147 or variable within cell populations.
Preferred proximities between different chromosomes have been associated with the propensity for chromosomal translocations in specific cell types, 2,150 a process relevant for unstable cancer genomes and autism spectrum disorders. 151 The mechanisms that drive preferential chromosome territory arrangements may include nuclear shape, preferred associations to the nuclear lamina, nucleoli, or speckles, and transcriptional activity. Further investigation is required to elucidate the specificity and general mechanisms of trans-interactions between chromosomes and their functional implications.

Gene positioning relative to chromosome territories
The position of individual genes relative to their chromosome territories changes with gene activation and depends on cell type. 148,152,153 [154][155][156] At the territory surface, loci may be exposed to neighboring chromosomes, 2,153 to the nuclear lamina, or to membraneless nuclear bodies, such as the nucleolus or speckles.
Although gene looping out from chromosome territories can coincide with transcriptional activation and extensive decondensation of the protruding chromatin segment, the three processes are not strictly coupled. 148,153,157 For example, the insertion of the β-globin locus control region (LCR) into a gene-rich genomic region of the Rad23a locus was sufficient to reposition the locus away from the periphery of its chromosome territory, resulting in increased expression of some, but not all, genes adjacent to the LCR insertion. 158 Conversely, the activation of the plasminogen activator, urokinase (PLAU) gene in hepatoblastoma-derived HepG2 cells treated with phorbol esters led to the looping of PLAU out from its chromosome territory but without an increased probability of transcriptional activity. 148 In contrast, the transcriptional silencing of PLAU was highly correlated with its internal location within its territory. These findings suggest a complex relationship between the positioning of genes within territories and their transcriptional regulation. For some loci, repositioning within their territory may be actively regulated and core to the activation process, possibly by modulating the exposure of specific genomic loci to regulatory cues present in different nuclear microenvironments. 159 The process of gene activation or transcription may itself be sufficient to promote the decondensation and relocation of loci between different sub-regions of each chromosome territory. 160

Large-scale chromatin decondensation (melting) events
Large-scale chromatin decondensation was first observed in the form of chromosome puffs that accompany high transcription levels in polytene chromosomes of fly salivary gland cells. 161,162 A connection between chromatin decondensation and transcription induction was also reported in tandem arrays of the mammary tumor virus promoter and after retinoic-acid-induced Hoxb expression. 154,163 More recently, chromatin decondensation events have been detected genome-wide at hundreds of long genes by genome architecture mapping (GAM). 32 Decondensation or "melting" of long genes is highly cell-type specific and, in neurons, tends to occur when genes are most expressed and have the highest chromatin accessibility levels. Extensive decondensation of long and highly expressed genes has also been explored by FISH in terminally differentiated human cells. 160 Both studies suggest that chromatin decondensation may result from high RNA Pol II occupancy and facilitate access of the transcriptional machinery. Long melting genes also tend to be sensitive to topoisomerase I inhibition, 32,164 which suggests that their high expression levels are particularly dependent on resolution of DNA supercoiling. Chromosome recondensation can accompany gene repression, which likely contributes to a reduced probability of subsequent activation by modulating access and binding of nuclear factors and enzymatic activities. 165,166

Higher-order transcriptional assemblies involving TFs and super-enhancers
TFs hold key roles in the dynamic interplay between 3D genome organization and gene expression control. 167 Many TFs undergo weak multivalent interactions through their IDRs, producing condensates that can have characteristics of liquid-liquid phase separation (LLPS), 168 and have been proposed to concentrate general components of the transcriptional machinery, such as Mediator and RNA Pol II. 169 Multivalent interactions of TFs and co-factors may help stabilize them onto DNA for efficient gene activation. 170 TF-mediated microenvironments often occur at, and are possibly driven by, groups of CREs known as super-enhancers 171,172 and can be modulated by RNAs. 173 Higher-order chromatin associations between super-enhancer elements and highly transcribed genomic segments spanning distances of tens of megabases have also been observed, 39,174 but the roles and the preferred partners of such long-range associations remain unexplored.

THE PAST, PRESENT, AND FUTURE AVENUES IN GENOME RESEARCH
In this final section, we elaborate on ongoing efforts to address some of the remaining ''missing links'' needed to gain a comprehensive understanding of the general processes governing genome structure and function at the molecular level.
Imaging techniques pioneered the field of genome biology for decades, providing invaluable insights, such as detailed preferential genome positioning in the 3D space of the nucleus, and uncovering cell-to-cell variability of genome conformations within cell populations. Newer methods are beginning to increase imaging throughput, providing parallel measurements of a larger number of genomic regions, RNAs, and proteins. In parallel, the rapid development of sequencing-based approaches has revolutionized the 4D nucleome field. 175 The entire genome can now be interrogated across its unidimensional linear sequence with respect to gene expression, histone modifications, chromatin binding, accessibility, and association with specific nuclear landmarks, and in three dimensions in terms of multiway chromatin interactions with the rest of the genome and positioning within the nuclear landscape. 176 Most of the emerging sequencing approaches are now adaptable to application in single cells, giving new insights into dynamic and cell-type-specific events. A major remaining challenge in 3D genome research lies in creating, whenever possible, robust frameworks to integrate data from different experimental approaches and resolve contradictions (Figure 5).
Genome-wide interactions of a gene measured by ligation-based techniques are routinely integrated with histone modification data in sequence space, to infer the candidate target genes of putative CREs and vice versa. However, cell-to-cell variability inevitably causes interpretation issues. For example, a specific E-P contact may be detected in a given cell population in which the gene is globally active; yet, it is not trivial to determine whether the E-P contact occurs in cells where the enhancer and/or gene are active, or in other cells where one or both are silent.
Other limitations come with the integration of different types of data (e.g., imaging and sequencing-based), where shared "reference points" may be sparse or even absent. Computational modeling using physics theory and/or machine learning aspires to fill the gap between 3D genome organization and 1D features along the linear sequence by estimating how linear genome features are folded in 3D, 58,177,178 by inferring the coherent binding patterns of chromatin factors along the linear sequence from chromatin organization data, 179,180 or by deriving models that jointly consider linear features and 3D conformation. 181 Further work is needed to directly test predictions made through modeling approaches beyond correlations with in vivo data.
To capture the complex relationship between genome structure and gene regulation, multimodal technologies are needed for simultaneous localization and quantification of DNA interactions, RNAs, and proteins, including post-translational modifications (PTMs), in single cells, ideally with spatial resolution in real tissues or biopsies. Exciting developments are ongoing in multimodal ligation-based methods (e.g., methyl-Hi-C and, more recently, combining Hi-C with RNA sequencing [RNA-seq]) to enable measurements of pairwise chromatin contacts in parallel with DNA methylation (DNAme) or RNA abundance. 182,183 Sequential FISH has recently mapped genome organization, multiple RNAs, and proteins using successive rounds of probe hybridization and imaging in mouse embryonic stem cells and in brain cells. 57,184 Split-pool recognition of interactions by tag extension (SPRITE) and GAM are also well suited to incorporate measurements of both RNA and protein modalities. SPRITE is a ligation-free technique that relies on splitting, barcoding, and pooling cross-linked chromatin clusters a number of times, followed by their sequencing to measure various aspects of genome organization. 174 SPRITE is applicable in single cells, 185 and can incorporate simultaneous RNA measurements within each chromatin cluster. 186 GAM is a single-cell assay based on sequencing genomic DNA extracted from ultrathin nuclear cryosections from structurally preserved cells, while retaining tissue spatial information, 32,39 and is compatible by design with parallel measurements of RNAs 187 and proteins. Multimodal genome structure data will be especially valuable to train models of nuclear architecture and/or gene regulation to increase their predictive power, with applications in early prognostics of complex non-coding genetic disorders.
Figure 5 (legend). Methods compared with respect to specific features (from top clockwise): required biological input material, preservation of tissue context information, depth of sub-nuclear spatial resolution, throughput relative to entire genomic sequence, potential for parallel measurements of DNA, RNAs, and proteins (or other biological phenomarkers), breadth of captured 3D genome structures (incl. pairwise and multiway chromatin contacts, radial positioning, and association with nuclear microenvironments; for more details, see Kempfer and Pombo 176).
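The split-pool barcoding principle described for SPRITE above can be sketched in a few lines. This is a toy illustration only, not the published SPRITE pipeline: the function names, the well and round counts, and the fragment labels are all hypothetical, and real data additionally require read alignment, tag error handling, and cluster-size filtering. The sketch shows only the core idea that fragments cross-linked in the same complex travel through every round together and therefore end up sharing one composite barcode.

```python
import random
from collections import defaultdict


def split_pool_barcode(complexes, n_rounds=4, n_wells=96, seed=1):
    """Assign composite barcodes by repeated split-and-pool tagging.

    complexes: dict mapping a complex id to the list of DNA fragments
    cross-linked within it (hypothetical toy input).
    Returns a dict mapping each fragment to its barcode tuple.
    """
    rng = random.Random(seed)
    barcodes = {cid: () for cid in complexes}
    for _ in range(n_rounds):
        for cid in complexes:
            # Split: the whole complex lands in one random well, so all of
            # its fragments receive the same tag in this round.
            barcodes[cid] += (rng.randrange(n_wells),)
        # Pool: complexes are mixed again before the next round (implicit).
    return {
        frag: barcodes[cid]
        for cid, frags in complexes.items()
        for frag in frags
    }


def group_by_barcode(fragment_barcodes):
    """Recover clusters by grouping sequenced fragments that share a barcode."""
    clusters = defaultdict(set)
    for frag, barcode in fragment_barcodes.items():
        clusters[barcode].add(frag)
    return dict(clusters)


# Hypothetical cross-linked complexes, including one multiway interaction.
complexes = {
    "c1": ["chr1:100", "chr1:5000", "chr3:42"],
    "c2": ["chr2:10", "chr2:900"],
    "c3": ["chr5:7"],
}

fragment_barcodes = split_pool_barcode(complexes)
clusters = group_by_barcode(fragment_barcodes)
for barcode, frags in clusters.items():
    print(barcode, sorted(frags))
```

Because no pairwise ligation is involved, a cluster of any size is recovered as a single group, which is why this kind of design naturally captures multiway contacts. In this toy model, two distinct complexes could by chance receive the same barcode; increasing the number of rounds or wells makes such collisions less likely.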
The development of novel integrative and multi-omics approaches also promises invaluable insights into the contributions of TFs to 3D genome structure and gene expression programs, their dynamics during development, and their alteration in disease. The actions of TFs, and how they cooperate to modulate chromatin architecture and achieve productive communication of enhancers with their target genes, remain largely unknown. Studies on TF function in the context of the spatially organized nucleus represent an exciting direction for the future of research in genome biology, especially how changes in the abundance of developmental TFs act on 3D genome structure throughout lineage commitment. 188
Almost 70 years ago, Francis Crick set one of the biggest milestones in molecular biology by proposing a simple set of rules that define the transfer of sequence information between DNA, RNA, and protein, at a time when fundamental concepts such as the genetic code had not yet been described. The advent of single-cell imaging and unbiased epigenomics has taught us that gene expression is regulated by proteins, RNAs, DNA sequences, and chromatin structure acting together in the 3D space of the nucleus with dynamic properties at the single-cell level. Understanding the cooperativity and interdependency between all these mechanisms and factors remains an important task for the next decades, both to deepen our knowledge of cellular decision-making and homeostasis and to gain insight into human disease inception and response to therapies that will shape tomorrow's medicine.

Figure 4. Transcriptional regulation mechanisms class II: regulation via chromatin state and biophysical properties

Figure 5. Comparison of modern techniques to study genome organization