3D genome perspective on cell fate determination, organ regeneration, and diseases

Abstract The nucleosome is the fundamental subunit of chromatin. Nucleosome structures are formed by the combination of histone octamers and genomic DNA. Through a systematic and precise process of folding and compression, these structures form a 30‐nm chromatin fibre that is further organized within the nucleus in a hierarchical manner, known as the 3D genome. Understanding the intricacies of chromatin structure and the regulatory mode governing chromatin interactions is essential for unravelling the complexities of cellular architecture and function, particularly in relation to cell fate determination, regeneration, and the development of diseases. Here, we provide a general overview of the hierarchical structure of chromatin as well as of the evolution of chromatin conformation capture techniques. We also discuss the dynamic regulatory changes in higher‐order chromatin structure that occur during stem cell lineage differentiation and somatic cell reprogramming, potential regulatory insights at the chromatin level in organ regeneration, and aberrant chromatin regulation in diseases.



| INTRODUCTION
If stretched end-to-end, the diploid human genome is about 2 m long, yet it fits inside a nucleus that is only a few microns across. To achieve this, chromatin is folded and assembled into a complex hierarchical structure. [1][2][3] Transcriptional regulation relies on precise 3D genomic architecture. 4,5 To capture chromatin structure, numerous chromosome conformation capture (3C) techniques and their derivatives, followed by high-throughput sequencing, have been developed in the past two decades. 6,7 The analysis of genome-wide chromatin contact maps has provided a fresh perspective on the 3D hierarchical organization of the genome and has revolutionized our knowledge of chromatin organization as well as our understanding of the transcriptional control of gene expression. Chromatin architectural proteins, such as CCCTC-binding factor (CTCF) and the cohesin complex, play essential roles in the formation of the 3D genome. 4,8 Cell fate changes dynamically during normal development, cell lineage differentiation, organ injury and regeneration, and disease progression. Moreover, cell fate determination is frequently regulated by transcription factors (TFs), chromatin accessibility and modifications, as well as higher-order chromatin structure. [9][10][11] Investigating multidimensional chromatin architecture therefore provides theoretical guidance for understanding the basic organization of life activities, the regeneration of organs, and the occurrence and development of diseases.

Hongxin Zhong and Jie Zhang contributed equally to this work.
In this review, we present a summary of the hierarchical organization of the 3D genome, discuss the development of 3C techniques and their derivatives in studying the 3D genome, introduce recent discoveries about higher-order chromatin structure during stem cell lineage differentiation and somatic reprogramming, review the role of chromatin regulation in organ regeneration and diseases, and provide new insights and perspectives for future investigation of these topics.

| HIERARCHICAL STRUCTURE OF THE 3D GENOME
Functional elements in eukaryotic genomes, including coding genes, non-coding genes, cis-regulatory elements, repetitive elements, and others, interact specifically within dynamic higher-order structures rather than simply being arranged in linear order. The '3D genome', which normally refers to higher-order chromatin structure, affects numerous biological processes, including gene expression, DNA replication, DNA damage repair, cell differentiation, and embryonic development. 5,10,12,13 The multidimensional genome organization of eukaryotes can be described in the following order, from large to small: chromosome territories (CTs), chromatin compartments, topologically associating domains (TADs), and chromatin loops (Figure 1). 14 These structures are associated with gene expression in multiple dimensions and thereby influence the progression of cell fate transitions.

FIGURE 1 The hierarchy of the 3D genome. During interphase, each chromosome is located in a specific nuclear region, known as a chromosome territory. Chromatin can be further divided into active A compartments and relatively inactive B compartments. At the sub-megabase scale, topologically associating domains (TADs) of local chromatin are maintained by architectural proteins such as CTCF and the cohesin complex. According to the loop extrusion paradigm, chromatin loop structures, whose formation is mainly mediated by CTCF and the cohesin complex, are enriched in TADs. Chromatin loops can also be maintained by specific transcription factors.

| Chromosome territories
During interphase, each chromosome occupies a distinct region within the nucleus known as a CT (Figure 1). 15 It has been reported that the nuclear topological arrangement of CTs has remained stable over 30 million years of primate evolution. 16 Several factors contribute to the positioning of CTs, including chromosome size and the number of genes present on specific chromosomes. Gene-dense CTs are enriched in the nuclear interior, whereas gene-poor CTs are located at the nuclear periphery. 17 The spatial arrangement of CTs influences the frequency of genomic translocations. Correctly positioned CTs can shield the genome from potentially harmful translocations in the event of DNA damage. 18

| Chromatin compartments
The genome is further separated into active A compartments and inactive B compartments (Figure 1). 19 The A compartments, which are mainly located in the interior of the nucleus, have a relatively open chromatin structure compared with the B compartments and are enriched with housekeeping genes, short interspersed nuclear elements (SINEs), and histone marks such as H3K27ac and H3K36me3 that indicate active transcription. In contrast, the B compartments are primarily located at the nuclear periphery or around the nucleolus and are associated with tissue-specific genes, long terminal repeats, long interspersed nuclear elements, and repressive histone modifications such as H3K9me3 and H3K27me3 that are involved in transcriptional silencing. 20,21 Nuclear re-localization and A/B compartment switching are regulated by chromatin-associated proteins such as TFs and chromatin-modifying enzymes, as indicated by functional analysis of specific genomic loci. 22,23 The A/B compartments are spatially polarized structures that may contribute to the local concentration of transcriptional machinery and epigenetic regulators and thereby increase the efficiency with which these biological resources are used. 24

| Topologically associating domains
Genomic compartments can be further subdivided into TADs at the sub-megabase scale (Figure 1). 25 TADs are separated by distinct boundaries and are characterized by frequent chromatin contacts within the same TAD but infrequent interactions across TADs. This means that TADs frequently function as autonomous domains of gene control. [26][27][28] CTCF, cohesin complexes, SINE elements, tRNA genes, housekeeping genes, and active histone modifications such as H3K4me3 and H3K36me3 are abundant at TAD boundaries. These elements are typically closely related to transcriptional activity and epigenetic chromatin modification signatures. 26 As an insulator at the boundary, CTCF prevents the interaction of regulatory elements located in two neighbouring TADs. 29 Furthermore, TADs can be divided into smaller sub-topological domains (sub-TADs) that are highly conserved across species and stable across diverse cell types. 27 TADs are functional units of chromatin, and co-regulated genes located in a given TAD exhibit comparable patterns of expression during differentiation. 25
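The defining property of TADs described above (frequent contacts within a domain, infrequent contacts across its boundary) can be illustrated with a toy insulation-style score. The following is a minimal sketch under stated assumptions, not a method taken from the studies cited here: the contact matrix is synthetic, and the function names `insulation_scores` and `toy_contact_matrix`, the window size, and the contact values are hypothetical choices made only for illustration.

```python
def insulation_scores(matrix, window):
    """Mean contact count between the `window` bins upstream and the
    `window` bins downstream of each candidate boundary position.

    Position i denotes the boundary between bin i-1 and bin i.
    A low score means that few contacts cross that position.
    """
    n = len(matrix)
    scores = {}
    for i in range(window, n - window + 1):
        total = 0
        for a in range(i - window, i):      # bins upstream of the position
            for b in range(i, i + window):  # bins downstream of the position
                total += matrix[a][b]
        scores[i] = total / (window * window)
    return scores


def toy_contact_matrix(domain_sizes, inside=10, outside=1):
    """Build a symmetric toy matrix with dense contacts inside each domain.

    `inside` and `outside` are arbitrary illustrative contact counts.
    """
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    return [[inside if la == lb else outside for lb in labels] for la in labels]


matrix = toy_contact_matrix([4, 4])        # two hypothetical 4-bin domains
scores = insulation_scores(matrix, window=2)
boundary = min(scores, key=scores.get)     # position with the fewest crossing contacts
print(scores)
print("strongest boundary lies before bin", boundary)
```

In this toy matrix the score drops to its minimum at the position separating the two domains, which is the behaviour expected at a TAD boundary. Real Hi-C data would additionally require normalization and a distance-dependent background, which this sketch omits.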

| Chromatin loops
At the kilobase scale, chromatin fibres fold into loop-like structures known as chromatin loops (Figure 1). 32 The 'loop extrusion model' is currently one of the best-known models explaining how chromatin loops are formed. In this model, the cohesin complex slides along chromatin, extruding a loop as it goes, and is halted when it encounters CTCF-occupied sites. 33 The chromatin loops mediated by CTCF and the cohesin complex within a specific chromatin environment are intimately associated with gene expression. 34 Direct evidence for the loop extrusion paradigm has been provided by in vitro single-molecule imaging showing that the cohesin-NIPBL complex can compact DNA through ATP-driven loop extrusion. 35 At a finer scale, the cohesin-associated loop extrusion machinery can be further assisted by RNA polymerase II when the transcriptional machinery and the cohesin complex collide, which may lead to the formation of enhancer-promoter or promoter-promoter loops. 36 Chromatin loops are closely related to the regulation of gene expression. 37 Our recent study revealed that CTCF has an alternative splicing isoform that skips Exons 3 and 4, causing translation to start at Exon 5 and resulting in a shortened CTCF protein (CTCF-s). 38 Competition between CTCF-s and classical CTCF for DNA binding disrupts CTCF-mediated chromatin loops and promotes apoptosis through activation of interferon-inducible protein 6 (IFI6). CTCF, along with other potential factors, governs long-range chromatin interactions. We screened for factors that may be involved in the regulation of chromatin loop formation and showed that basic helix-loop-helix family member e40 (BHLHE40) regulates CTCF binding and thereby affects the stability of chromatin loops. 39 Additionally, Yin Yang 1 (YY1) functions as a chromatin structural protein by mediating the interaction between enhancers and promoters and thus regulates gene expression. 40 However, recent investigations have shown that acute depletion of architectural proteins (CTCF, cohesin, WAPL, and YY1) using a degradation system had relatively moderate effects on the expression of most genes even though these proteins mediate chromatin loops, 41,42 indicating that dynamic changes in higher-order chromatin structure are not simply positively correlated with gene expression. Therefore, the relationship between higher-order chromatin structure and gene expression requires further investigation.

| The development of 3C method and its derivatives
The spatial conformation of chromatin is one of the essential components of transcriptional regulation. Advances in 3C and 3C-based derivative technologies over the past two decades have made it easier to study 3D genome organization (Figure 2). 19,43 3C, first reported in 2002, can be used to deduce the spatial architecture of chromatin based on quantitative measurement of the frequency of interactions between two loci. 43 Building on the 3C method, circular chromosome conformation capture or chromosome conformation capture-on-chip (4C) technology, which captures all loci that interact with a single specific locus, 44,45 and chromosome conformation capture carbon copy (5C) technology, which detects all interactions within a specific locus, 46 were rapidly developed. Hi-C technology, which makes genome-wide chromatin interaction analysis possible, was developed in 2009. 19 Despite the shortcomings of the initial version of the Hi-C technique, including high cost, complicated and time-consuming experimental procedures, substantial noise from randomly ligated DNA, and the lack of a simple way to evaluate that noise, Hi-C marks the beginning of the era of mapping genome-wide chromatin interactions.

FIGURE 2 A timeline and table depicting the development of crucial chromosome conformation capture and the derived technologies.

| Derivative techniques based on Hi-C
The most notable improvement of the Hi-C technique is the invention of in situ Hi-C technology. 32 In situ Hi-C employs a four-base cutter enzyme to fragment DNA, which increases the number and diversity of DNA fragments and improves sequencing coverage and resolution relative to Hi-C using a six-base cutter enzyme. Furthermore, in situ Hi-C performs proximity ligation and the preceding steps inside the nucleus, thereby capturing chromatin interactions closer to their natural state. In situ Hi-C has overcome the shortcomings of traditional Hi-C and has been widely adopted and improved upon in subsequent versions. For example, sisHi-C, 47 an assay that captures information from a small number of cells by optimizing the in situ Hi-C procedure, was used for mapping chromatin interactions in early embryonic samples. Bridge-Linker Hi-C, an improved method in which biotinylated linker sequences are used, makes the technique more efficient and sensitive in capturing active chromatin interactions. 48 Micro-C significantly improves the resolution of the Hi-C method by using micrococcal nuclease (MNase) to cleave the genome. 49 In addition to the methods described above for capturing genome-wide chromatin interactions, techniques for capturing local chromatin interactions are being developed. Micro-Capture-C (MCC) 50 and Tiled-MCC 51 technologies, both of which are based on Micro-C, were developed for high-resolution analysis of chromatin interactions at specific loci. As needed, other derivative technologies centred on specific factors (e.g., TFs or histones) or specific elements (e.g., promoter or enhancer regions) were established, such as ChIA-PET, 52 HiChIP, 53 Capture Hi-C, 54 ChIATAC, 55 and others (Figure 2). Furthermore, the development of single-cell Hi-C technology has made it possible to detect thousands of chromatin contacts in single cells. 56 Optimized single-cell versions such as Dip-C 57 and Methyl-HiC 58 emerged subsequently.
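The resolution gain from a four-base cutter noted above can be made concrete with a back-of-the-envelope calculation. The sketch below assumes an idealized random sequence with equal base frequencies, in which a recognition site of k bases occurs on average once every 4^k bp; real genomes deviate from this, and the genome length used here (3 × 10^9 bp) is only a round illustrative figure. The function names are hypothetical.

```python
def expected_fragment_length(site_length):
    """Average spacing (bp) between recognition sites of `site_length` bases,
    assuming a random sequence with equal frequencies of A, C, G and T."""
    return 4 ** site_length


def expected_site_count(genome_length, site_length):
    """Approximate number of recognition sites in a genome of `genome_length` bp
    under the same idealized assumption."""
    return genome_length / expected_fragment_length(site_length)


GENOME_LENGTH = 3_000_000_000  # illustrative round number, not a measured value

for k in (4, 6):
    print(
        f"{k}-base cutter: ~{expected_fragment_length(k)} bp per fragment, "
        f"~{expected_site_count(GENOME_LENGTH, k):,.0f} sites"
    )

fold_more_fragments = expected_fragment_length(6) // expected_fragment_length(4)
print("fold increase in fragment number:", fold_more_fragments)
```

Under these assumptions a four-base cutter yields fragments of roughly 256 bp rather than roughly 4096 bp, that is, about 16 times as many fragments, which is consistent with the finer contact maps obtained with in situ Hi-C.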

| Other ligation-free methods
Simultaneously, ligation-free approaches have been developed to avoid the inherent ligation bias of standard proximity-ligation methods; these include genome architecture mapping (GAM) 59

Here, we mainly focus on the role of CTCF-mediated chromatin regulation in cell fate determination.

84 The reorganization of chromatin during somatic reprogramming is also associated with TF-driven phase separation. 86 For example, OCT4 modulates TAD reorganization by altering CTCF binding at TAD boundaries through phase separation. CTCF plays a vital role in 3D genome organization during the reprogramming process. Moreover, we recently discovered that CTCF represses the expression of somatic genes by acting as a chromatin insulator and functions as a chromatin remodeller to maintain the accessibility of pluripotency genes. 87

| CHROMATIN STRUCTURE AND ORGAN REGENERATION
Organ regeneration is a major goal of regenerative medicine, which includes the regeneration and repair of damaged tissues as well as the rebuilding of organs using stem cell technology. The essence of regenerative biology at the cellular level is associated with cell fate transition. Therefore, it is crucial to examine how programmes involved in regeneration are affected by changes in the chromatin environment. The regulatory mechanisms that operate at the chromatin level during regeneration events have been extensively studied using model animals such as Hydra, Drosophila, zebrafish, and mice. 11 To offer fresh perspectives on chromatin organization during regeneration processes, we mainly focus on the multidimensional regulation of chromatin during regeneration events; this multidimensional regulation involves chromatin remodelling, epigenetic alterations, and regulatory elements that participate in regeneration.

| Chromatin remodelling
By changing the physical spacing of the nucleosomes, ATP-dependent chromatin remodelling complexes (SWI/SNF, ISWI, CHD, and INO80) make DNA more accessible to particular protein factors and thereby change gene expression programmes. 88 In studies of liver regeneration in mice, SMARCA4, a component of the SWI/SNF remodelling complex, was found to activate the Wnt/β-catenin pathway by interacting with β-catenin protein and thereby promote hepatocyte proliferation. Deletion of SMARCA4 inhibits liver regeneration and impairs survival after hepatectomy. 89 In contrast, deletion of the SWI/SNF complex component ARID1A significantly improves organ regeneration after liver injury in mice. 90 Therefore, the balance of chromatin remodelling complex components may be one of the factors limiting tissue regeneration. The role of chromatin remodelling factors in maintaining differentiation-related programmes and activating regeneration-related programmes at the level of chromatin conformation remains to be further studied.

SMARCA4 has been shown to bind the enhancers of Myc genes in leukaemia cells and is required for the regulation of distant chromatin interactions. 91 In human mammary epithelial cells, deletion of SMARCA4 leads to significant changes in higher-order chromatin structure. 92 Our recent study showed that SMARCA5, the ATPase of the ISWI complex, cooperates with CTCF to maintain chromatin accessibility and promote the reprogramming of mouse fibroblasts into iPSCs. 87 The regulatory regions of proliferation-related genes in regenerating hepatocytes display increased chromatin accessibility and are enriched with ELK1 and CTCF, according to a study of dynamic chromatin architecture in regenerating liver. 93

| Epigenetic modifications
Studies of the role of the 3D genome in regulating regeneration remain rare; however, epigenetic modifications are closely connected to 3D genomic regulation. 94,95 Epigenetic modifications such as histone modification and DNA methylation are also involved in organ regeneration after injury. Polycomb group (PcG) proteins form classical epigenetic repressive complexes that are divided into two main types, PRC1 and PRC2. 96 These complexes mediate the ubiquitination of histone H2A on lysine 119 (H2AK119ub), catalysed by RING1A/B, and the trimethylation of histone H3 on lysine 27 (H3K27me3), catalysed by EZH1/2, respectively. 96 In regeneration-related studies, EZH2-deficient zebrafish failed to regenerate after caudal fin transection injury, suggesting that EZH2 and the associated histone modifications may play a role in the regeneration process. 97 In contrast, during cardiac regeneration in mice, EZH1, but not EZH2, is required to activate cardiac regeneration-related genes, 98 suggesting that PcG may play a role in both transcriptional repression and transcriptional activation during regeneration events. Several recent studies have also shown that, in contrast to its classical repressive role, PcG can activate gene expression programmes during cell fate transition. [99][100][101] Comparison of multi-omics data from quiescent and regenerating hepatocytes in mouse liver revealed that pro-regeneration-related genes tended to be found in the active region of the genome and were enriched in the H3K27me3 modification in the resting state; this modification was erased during regeneration, and proliferation-associated gene expression programmes were quickly activated. 102 The multiple regulatory actions of PcG may be explained by the existence of a higher-dimensional chromatin structural network. Indeed, detailed models of the role of PcG in 3D genome folding have been proposed. 103 Thus, it may be possible in the future to determine how the PcG complex orchestrates the regeneration programme at the level of chromatin structure.

| Regenerative regulatory elements
Chromatin regulatory elements (e.g., promoters and enhancers) play an essential role in gene expression. With the development of multi-omics technologies, chromatin regulatory elements associated with regeneration events have been identified in Hydra, Drosophila, zebrafish, and other organisms by integrating omics information on chromatin state and histone modifications. [104][105][106] The results show that some elements are relatively conserved among different species. For example, a class of tissue regeneration enhancer elements (TREEs) identified in regenerating tissues of zebrafish can trigger the repair programme at the injury site. TREEs can also induce reporter gene expression at injury sites in mice. 106 Recent studies have used TREEs and recombinant adeno-associated virus (AAV) vectors to achieve precise repair of damaged areas of the mouse heart, demonstrating the great potential of TREEs in regenerative medicine applications. 107 The spatial interaction of chromatin regulatory elements is an important part of gene expression, but the mechanism by which TREEs drive regeneration-related gene expression remains to be studied. Is an appropriate 3D chromatin structure between TREEs and regeneration-related genes the basis for their function? We propose that a stable chromatin conformation is required for the initiation and maintenance of a regeneration programme and that this can be achieved by the coordinated action of TFs, remodelling complexes, mediators, epigenetic modifications, TREEs, and architectural proteins (Figure 3). Therefore, further analysis of the multidimensional chromatin environment near TREEs will help us to understand the regulatory process that occurs during regeneration events and further guide clinical application.

| CHROMATIN STRUCTURE AND DISEASES
Recently, a number of studies have taken advantage of 3C-derived technologies combined with high-throughput sequencing to study the relationships between disease-related genetic variation and chromatin conformation. Abnormal regulation of chromatin structure in diseases has become a focus of current research.

However, under pathological conditions in which disease-related structural variants (SVs) affect TAD boundary elements (e.g., deletion of a boundary element may lead to TAD fusion), the regulatory elements of adjacent TADs begin to interfere with one another, and this can disrupt gene regulation and cause disease. To a certain extent, single nucleotide polymorphisms (SNPs) in regulatory factor binding motifs and mutations in architectural proteins can also produce aberrant chromatin conformational patterns and pathogenic phenotypes.
issues. However, further study using relevant mutant cell models or disease animal models is required to elucidate the details.
Understanding the occurrence and progression of diseases from the standpoint of chromatin structural variation will thus be a future direction for improving disease diagnosis. On the one hand, more sophisticated detection methods and computational tools are required to discover relevant chromatin structural alterations. On the other hand, gene editing approaches could be used to create equivalent cell or animal models to explore the possible regulatory processes. Disease simulation and mechanism studies could benefit from human iPSC-derived organoid technology, which has shown great promise in disease modelling and regenerative medicine in recent years.