Single-cell protein-DNA interactomics and multiomics tools for deciphering genome regulation

The emergence of single-cell genomic and transcriptomic sequencing accelerates the development of single-cell epigenomic technologies, providing an unprecedented opportunity for decoding cell fate decisions largely encoded in the epigenome. Recent advances in single-cell multimodality epigenomic technologies facilitate direct interrogation of the regulatory relationship between multi-layer molecular information in the same cell. In this review, we discuss recent progress in the development of single-cell multimodality epigenomic technologies and their applications in elucidating cellular diversification in development and disease, with a focus on protein-DNA interactomics and regulatory links between the epigenome and transcriptome. Further, we provide a perspective on the future direction of single-cell multiomics tool development as well as the challenges ahead.


Introduction
Deciphering the epigenetic regulation of dynamic gene expression underlying cell-fate decisions is a longstanding theme in biomedical studies. Insights into the kinship of neighboring cells in space and time have been gained through imaging-based tools such as time-lapse microscopy and genetic fate mapping [1][2][3][4][5]. Emerging single-cell sequencing tools hold great promise to boost our understanding of cellular heterogeneity and the molecular regulation of dynamic cell states during development and disease. Single-cell mRNA sequencing (scRNA-seq) has revolutionized biomedical research by resolving cellular heterogeneity and inferring cell differentiation trajectories in pseudotime [6][7][8][9][10][11][12]. However, single-cell transcriptomic sequencing datasets yield limited insights into the regulatory principles of cell fate specification, i.e., how the epigenome is dynamically sculpted in cells diversifying from one another upon sti-formation has been forged into single-cell multi-omics measurements. We describe state-of-the-art progress and present future perspectives in this flourishing field bridging fundamental biology and biomedicine.

Challenges and solutions in interrogating the epigenome in single cells
Gene activity and cell identity are largely encoded in the multi-layer epigenomic landscape, including transcription factor binding, chromatin accessibility, histone modifications, DNA modifications, and three-dimensional chromosome conformation in diverse biological processes [13,14]. Genome-wide mapping of epigenetic features in bulk samples has provided valuable information for identifying cis-regulatory elements, such as promoters and enhancers, and their roles in gene regulation. Nevertheless, averaged signals in bulk assays provide no cell-type-specific features in heterogeneous tissues.
Since there are only two copies of genomic DNA in a diploid cell, single-cell epigenomic sequencing technologies suffer from more challenges than scRNA-seq dealing with multiple copies of transcripts. To improve the detection efficiency of epigenome in single cells, it is necessary to optimize the experimental protocols, including designing new biochemistry. As an example, Buenrostro et al. [15] successfully developed single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) based on Tn5 transposase-mediated tagmentation and simultaneous addition of sequencing adaptors for DNA fragments in the library preparation. Single cell reduced representation bisulfite sequencing (scRRBS-seq) enables the detection of approximately 40% of the CpG sites in bulk RRBS by limiting the experimental steps in a single tube before PCR amplification to avoid sample loss [16][17][18].
The low-throughput strategy for achieving single-cell resolution relies on manual pipetting or fluorescence-activated cell sorting (FACS) to sort individual cells or nuclei into multi-well plates with well-specific barcodes [6]. By contrast, droplet-based microfluidic platforms improve the throughput to thousands of cells in one assay [9,10]. Recently, combinatorial indexing has emerged as a promising strategy for high-throughput single-cell sequencing technologies [19][20][21]. In addition, droplet-based microfluidic platforms in combination with combinatorial indexing not only greatly facilitate automation and ease of adoption but also further increase the throughput, aiming for hundreds of thousands to millions of cells per experiment [22].
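The scaling behind combinatorial indexing can be illustrated with a little arithmetic. The sketch below is not taken from any of the cited protocols: the function names and the three-round, 96-well design are hypothetical, and it assumes that every cell receives its barcode combination uniformly and independently at random.

```python
from math import prod


def barcode_space(wells_per_round):
    """Number of distinct barcode combinations after successive indexing rounds."""
    return prod(wells_per_round)


def expected_collision_rate(n_cells, n_barcodes):
    """Expected fraction of cells that share a barcode combination with
    at least one other cell (assumes uniform, independent assignment)."""
    if n_cells < 1:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# Hypothetical design: three rounds of 96-well indexing.
space = barcode_space([96, 96, 96])  # 884,736 combinations
rate = expected_collision_rate(10_000, space)
print(f"{space} combinations; collision rate for 10,000 cells: {rate:.2%}")
```

Under these assumptions, adding an indexing round multiplies the barcode space, which is why more rounds allow more cells to be loaded at a given tolerated collision rate.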

Single-cell protein-DNA interactomics
Gene activity is largely gated through chromatin states, including chromatin accessibility, histone modifications, and chromatin protein binding. Genome-wide measurement of protein-DNA interactions (interactomics) concerns a root layer of epigenetic regulation. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is a widely adopted tool for examining genome-wide protein-DNA interactions as well as histone modifications [23,24]. Traditional ChIP-seq bears major limitations such as the requirement for a large amount of starting material, careful validation of ChIP-grade antibodies, low signal-to-noise ratio, and low resolution [25]. Efforts to overcome these challenges have promoted the rapid development of the field, aiming at improving the signal-to-noise ratio, increasing the resolution, and reducing the input cell number. In the past few years, considerable progress has been made in detecting protein-DNA interactions with markedly improved sensitivity using as few as 100 cells, with methods including STAR ChIP [26], CUT&RUN [27], nano-ChIP [28], ULI-NChIP [29], MOWChIP [30], and ChIL-seq [31]. Furthermore, the Bernstein laboratory first reported a prototype single-cell ChIP-seq procedure using a droplet-based microfluidic platform (scDrop-ChIP) in 2015, and three subpopulations in mouse embryonic stem cells were identified based on single-cell H3K4me2 ChIP-seq data. However, since the experimental protocol is largely built on traditional ChIP-seq, scDrop-ChIP generates only ~800 unique reads per cell, hampering its potential applications in complex biological samples [32]. Recently, we and the Henikoff laboratory concurrently reported a widely applicable, low-cost and robust strategy, dubbed CoBATCH [33] and CUT&Tag [34], respectively, to profile histone modifications and transcription factor binding in both low-input samples and single cells based on a Protein A-Tn5 (PAT) fusion protein.
CoBATCH yields as many as 12,000 unique reads per cell. Owing to the targeted tagmentation and enrichment of DNA fragments and the obviation of immunoprecipitation, CoBATCH and CUT&Tag profoundly improve the sensitivity and precision of detection with a high signal-to-noise ratio. Similarly, by fusing Protein A with micrococcal nuclease (MNase), scChIC-seq [35], uliCUT&RUN [36] and iscChIC-seq [37] have been reported to exhibit high signal-to-noise profiles. iscChIC-seq adopts TdT-tailing-based combinatorial indexing (~11,000 reads per cell for 7800 individual cells). Moreover, genome-wide measurement of protein-DNA interactions in single cells may reach higher throughput through droplet-based and combinatorial indexing strategies (Figure 2). Two studies coupled the scCUT&Tag protocol with the widely available 10x Genomics Chromium platform to explore the epigenetic heterogeneity of mouse brain and dynamic changes in chromatin silencing in patients with brain tumors [38,39]. Although droplet-based scCUT&Tag may be used to identify major subpopulations in heterogeneous tissues, it suffers from severe data sparsity (i.e., low coverage), with unique reads per cell ranging from 98 to 453 across the histone modifications profiled in mouse brain [38]. Therefore, researchers are confronted by a trade-off between cell throughput and genomic coverage when performing single-cell protein-DNA interactomics profiling. Of note, split-and-pool-based single-cell methods are not compatible with contexts in which cell numbers are limited, such as early embryo development. As a complementary approach, the He laboratory has recently reported a versatile technique, dubbed single-cell itChIP-seq [40].
After cell fixation with formaldehyde, itChIP-seq applied the detergent SDS treatment for relaxing/opening chromatin prior to Tn5 transposition genome wide, enabling the profiling of histone modifications or transcription factor binding with starting materials as few as 50 cells and throughput up to 2000 cells.
With a completely different design, DNA adenine methyltransferase identification (DamID) is also able to capture contacts between proteins of interest (POIs) and chromatin on a genome-wide scale at single-cell resolution [41]. Recently, EpiDamID has been reported to simultaneously measure histone modification and transcriptome in single cells by fusing Dam to POIs, engineered chromatin reader domains, or single-chain variable fragments of antibodies [42]. However, DamID relies on transgenic expression of POIs, largely precluding its potential applications in histone modifications and clinical samples.
It is exciting to witness the emergence of new techniques such as scMulti-CUT&Tag [43], MulTI-Tag [44], nano-CT [45] and NTT-seq [46] for profiling multiple histone modifications at one time in single cells. In scMulti-CUT&Tag, different epitopes on chromatin are recognized by PAT-antibody conjugates carrying antibody-specific barcodes. The authors suggested that purification of the PAT-barcode-antibody complex may be the key step to avoid swapping of antibody barcodes during targeted tagmentation. In MulTI-Tag, by contrast, barcoded adaptors are first covalently conjugated to the antibody before being preloaded with PAT, and antibody-tethered tagmentation is carried out sequentially to avoid cross-contamination between different histone modifications. However, these methods suffer from poor data quality (for example, ~200 unique reads per cell for scMulti-CUT&Tag and ~1000 unique reads per cell for MulTI-Tag) and a limited number of histone modifications profiled at a time. Thus, methods for profiling the co-occupancy of multiple chromatin proteins with high sensitivity and high efficiency are in high demand.
Together, these pioneer single-cell epigenomic technologies, while still in infancy, offer us the capacity to unbiasedly unlock the regulatory relationship between a myriad of dynamic epigenetic proteins reading out regulatory elements and transcriptional outcome. Our intention is to provide a quick overview for newcomers to the field of single-cell protein-DNA interactomics as well as deeper description with details of the leading technologies to date in light of different biological contexts.

Single-cell multimodality epigenomic technologies
Although single-cell unimodal measures enable identification of heterogeneities in cell types and states, these approaches provide relatively limited values in connecting epigenomic regulation with phenotypic outcome when one layer of molecular information is assayed in single cells at a time. Emerging single-cell multi-Omics present an unprecedented opportunity to explore the cell type-specific gene regulation and intricate relationship between different modalities in tissues and various conditions from a more comprehensive view.
Distinguishing different molecular modalities in single cells in resulting sequencing data is key to realize single-cell multiomics profiling. Hence, a highly effective procedure, compatible to simultaneous measurements of different molecular modalities, is required to capture and discern each of them. In general, varying strategies include: (1) physical separation of molecules. For example, scMT-seq [45] and scTrio-seq [46] adopted physical separation by micropipette manipulation to separate cytosolic mRNA and nuclear DNA methylome. scTrio-seq2 [47] further improved this approach by using magnetic beads to separate nucleus (DNA) and cytoplasm (mRNA). These methods were applied to unravel the connection between the epigenome and transcriptome in neural lineages and colorectal cancer. (2) Biochemical separation. Namely, although different layers of molecules are derived from the same cell lysates in the multi-omics protocol, they could be distinguished and independently analyzed by virtue of unique biochemical modifications. G&T-seq [48] enabled parallel sequencing of single-cell genome and transcriptome. In brief, as with biotin-labeled oligo-dT primers for mRNA capture and reverse transcription, cDNA and genomic DNA (gDNA) were separated for the respective library preparation after incubation with streptavidin-coupled magnetic beads.
(3) Information conversion. Two studies applied similar principles to profile DNA methylation and nucleosome positioning in the same cell. In these cases, through bisulfite conversion, DNA methylation and chromatin accessibility can be further analyzed after the treatment of CpG methylation and GpC methylation, respectively [49,50]. (4) Pre-library split. In the protocol of DR-seq [51], gDNA and cDNA were separated for the following library preparation after reverse transcription and quasilinear whole-genome amplification rather than in the initial step of component separation for individual cells. Similar strategy was also used in the droplet workflow of ISSAAC-seq [52], where pre-library was separated for ATAC and RNA library construction, respectively. Collectively, these strategies are widely applied in most current single-cell multiomics approaches. However, problems such as the incompatibility of experimental conditions in different single-cell unimodal methods largely affect the data quality. More efforts are still warranted to refine the experimental protocols of multimodal single-cell technologies.
Despite these challenges, great strides have been made in the development of multimodal single-cell omics technologies with high throughput (Figure 3, Table 1). For example, single-cell multiomics technologies allowing for joint detection of chromatin accessibility and gene expression in the same cell have been developed to elucidate epigenetic mechanisms of genome regulation. scCAT-seq [53] enables simultaneous profiling of chromatin accessibility and transcriptome through physical separation of nuclear DNA and cytosolic mRNA by mild lysis and centrifugation, albeit at low cell throughput. In comparison, SNARE-seq [54], sci-CAR [55], Paired-seq [56], and SHARE-seq [57] achieved high throughput on different platforms. Moving forward, both the Ren and He laboratories have recently described high-throughput platforms for measuring chromatin occupancy along with transcriptome in thousands to hundreds of thousands of single cells in one experiment. CoTECH [58] adopted both a modified Smart-seq2 and CoBATCH to capture the transcriptome and chromatin protein binding. Tagmented DNA or mRNA was indexed by different combinations of T5-T7 barcodes or well-specific oligo-dT primers. After pre-amplification, amplicons were separated into DNA and RNA partitions. For the DNA partition, well-indexed primers were used to introduce the second-round barcodes. For the RNA partition, full-length cDNA was tagmented by Tn5 followed by the second-round indexing. The DNA and RNA profiles were linked by corresponding pairs of indexing primers. Paired-Tag [59] adopted four rounds of combinatorial indexing to dramatically scale up the throughput. Similar to histone modifications or transcription factors, whose genomic distribution can be captured by Tn5 tagmentation, the abundance of cell surface and intracellular proteins can also be translated into digital sequencing counts.
CITE-seq [60] labeled the antibodies with oligonucleotides to translate the information of the surface protein abundance into sequencing-based readout. The antibody oligos and mRNA were captured by oligo-dT primers on the beads in one droplet. The amplified antibody-derived tags and cDNA were separated by size and sequenced. Built on CITE-seq and mtscATAC-seq [61], ASAP-seq [62] was developed to profile chromatin accessibility, mtDNA and cell surface/intracellular proteins simultaneously on 10x Genomics Chromium single-cell ATAC Kit platform. Bridge oligos with unique bridge identifiers were used during GEM incubation to ensure that each antibody-oligo complex could be bridged only once. Overall, advances in single-cell multi-Omics, aiming at simultaneously detecting the different modalities including gene expression, chromatin states, chromosome conformation, copy number variation, lineage, cell surface protein, and spatial information in single cells, represent a new frontier for characterization of cell states and cellular dynamics in human health and diseases.

Single-cell multi-omics technologies unleash the power for unbiased characterization of cell types and states
Cellular heterogeneities have been extensively addressed by scRNA-seq analysis. Though highly informative, single-cell transcriptome itself in some cases is insufficient to phenotypically reflect cell types/states. For example, the abundance of cell surface proteins rather than RNA expression is most commonly used in the identification of CD8+ T cells in the immune system [63]. It is necessary to unbiasedly define cell types/states to gain comprehensive understanding of cell fate regulation. The Satija laboratory has reported the weighted nearest neighbor (WNN) algorithm to integrate information of multiple modalities in single cells based on relative importance of each modality [63]. The scNMT-seq is able to jointly detect transcriptome, DNA methylation, and chromatin accessibility in the same cell, and multi-omics factor analysis (MOFA) can be applied to integrate a multi-modal mouse gastrulation atlas [64]. Clearly, the mounting single-cell multi-omics data and the regulatory relation to cell function or phenotype have greatly facilitated the precise annotation of cell types and states.

Single-cell multi-omics for decoding cell lineage potentials
Exploring gene regulatory mechanisms in different biological processes remains challenging. Genome-scale assays such as single-cell multi-omics are now available for probing gene transcription and its regulation during cell differentiation. For example, SHARE-seq reveals that chromatin accessibility foreshadows gene expression during lineage differentiation in adult mouse skin. The He laboratory developed CoTECH to interrogate the relationship between the bivalent histone modifications H3K4me3/H3K27me3 and transcriptome in mouse embryonic stem cells and uncovered a context-specific regulatory interplay from naive to primed mESCs. We expect that, as with SHARE-seq, future studies harnessing knowledge of the kinetics of various histone modifications during cell differentiation could also open a new avenue to predict prospective cell potential in a given cell population in development and disease.

Single-cell epigenomic and multi-omic data analysis
The development and maturation of single-cell epigenomic and multimodality omic technologies necessitate new computational methods to analyze single-cell sequencing data [65]. Single-cell sequencing data analysis faces major challenges such as dropout events, high dimensionality, batch effects and detection noise. While single-cell RNA-seq analytic tools are relatively mature, with standardized bioinformatic pipelines, single-cell epigenomic analysis remains challenging due to data sparsity. Therefore, it is important to accurately distinguish real biological signal from noise. To circumvent these limitations, computational approaches have been developed to handle single-cell epigenetic data in the form of sparse matrices. Scasat [66] and SnapATAC [67] binarize the raw count matrix and perform cell clustering, which ignores the co-activity and variable peaks among different regulatory elements. In scRNA-seq analysis, highly variable genes that might be biologically important are identified before dimensional reduction. However, this is not the case for single-cell epigenetic analyses, since variable peaks cannot be obtained from binary features. Cusanovich et al. [21] first took the most accessible features as input to the latent semantic indexing (LSI) algorithm, a common workflow in scATAC-seq data processing borrowed from topic modeling. LSI results in high degrees of noise and low reproducibility when running multiple samples. Another topic-based method, cisTopic [68], enables the classification of genomic regions into regulatory modules to handle the intrinsic sparsity. Pliner et al. [69] introduced Cicero to identify co-accessible elements from chromatin hubs, linking dynamic cis-regulatory elements to their target genes. More recently, ArchR provides an intuitive and user-focused interface for complex downstream analyses [70].
In addition, chromVAR globally predicts enrichment of TF activity based on scATAC-seq data, and is thus instructive for lineage specification during cell fate decisions [71]. Other analytical tools including SnapATAC [67], Signac [72], scATAC-pro [73], and SCALE [74] are also commonly used in scATAC-seq analyses, with a comparable set of features including doublet removal, cell clustering, identification of differential peaks, and visualization of marker genes. Of note, a sparsity similar to that of scATAC-seq data is also seen in single-cell profiling of histone modifications and TF binding. Hence, several analytic ideas for scATAC-seq, such as dimension reduction and cell clustering, may also be applied to the analysis pipeline of single-cell histone modification and TF binding profiling. However, single-cell protein-DNA interactomics and scATAC-seq data analyses differ in many aspects (e.g., active and repressive histone modifications exhibit unique features of genomic distribution), implying that existing consensus scATAC-seq analysis pipelines require further optimization when applied to single-cell protein-DNA interactomics datasets. More specialized algorithms, including cell and feature selection, dimension reduction, and cell clustering for single-cell histone modification and TF binding analysis, are still lacking.
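The binarize-then-LSI workflow described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used by any of the tools cited above: the function name `lsi_embedding` is hypothetical, the TF-IDF weighting is one common variant among several, the input is a simulated cell-by-peak count matrix, and dropping the first component reflects the common observation that it often tracks sequencing depth.

```python
import numpy as np


def lsi_embedding(counts, n_components=10, drop_first=True):
    """Binarize a cell-by-peak matrix, apply TF-IDF, and reduce it with SVD."""
    binary = (np.asarray(counts) > 0).astype(float)   # binarize raw counts
    binary = binary[:, binary.sum(axis=0) > 0]        # drop peaks seen in no cell
    row_sums = binary.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0                     # guard against empty cells
    tf = binary / row_sums                            # term frequency per cell
    idf = np.log1p(binary.shape[0] / binary.sum(axis=0))  # rarer peaks weigh more
    u, s, _ = np.linalg.svd(tf * idf, full_matrices=False)
    embedding = u * s                                 # cells in component space
    start = 1 if drop_first else 0
    return embedding[:, start:start + n_components]


# Simulated sparse data: 50 cells by 200 peaks (illustrative only).
rng = np.random.default_rng(0)
toy_counts = rng.poisson(0.3, size=(50, 200))
embedding = lsi_embedding(toy_counts, n_components=5)
print(embedding.shape)  # (50, 5)
```

The resulting low-dimensional embedding would typically be passed to a clustering or visualization step; real datasets would use sparse matrices and a truncated SVD rather than the dense decomposition shown here.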

Integrative analysis of single-cell multi-omics data
To make full use of the wealth of information in single-cell multiomics data, Haghverdi et al. [75] and Butler et al. [76] described two computational approaches, mutual nearest neighbors (MNN) and canonical correlation analysis (CCA), respectively. In brief, these methods integrate different single-cell modalities by identifying a shared structure across datasets, which are then aligned for correction of batch effects and for downstream analyses such as unbiased cell clustering and cell type annotation. Nevertheless, different types of single-cell modality data do not share common features in some cases, leading to an analytic problem when integrating multiple single-cell datasets. For example, the correspondence between features in chromatin accessibility and DNA methylome data is unclear [77]. To this end, Welch et al. [78] developed an algorithm, named MATCHER, to infer single-cell multi-omic profiles from transcriptomic and epigenetic measurements without prior knowledge of cell correspondences, through projection of cells from different modalities onto a common pseudotime space. Using this method, the authors analyzed the correlation and dynamics between gene expression and DNA methylation during human iPSC reprogramming.
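The core idea of mutual nearest neighbors can be illustrated with a short sketch. It covers only the pair-finding step, not the subsequent batch correction, and it rests on assumptions that are not part of the cited method description: both datasets are taken to live in the same feature space, Euclidean distance is used, and the function name and toy coordinates are hypothetical.

```python
import numpy as np


def mutual_nearest_neighbors(a, b, k=3):
    """Return (i, j) pairs where cell i of dataset a and cell j of dataset b
    are each among the other's k nearest neighbors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_a, n_b).
    dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    nn_of_a = np.argsort(dist, axis=1)[:, :k]      # k nearest b cells per a cell
    nn_of_b = np.argsort(dist, axis=0)[:k, :].T    # k nearest a cells per b cell
    return [
        (i, int(j))
        for i in range(a.shape[0])
        for j in nn_of_a[i]
        if i in nn_of_b[j]
    ]


# Toy example: two small datasets sharing two similar cell states.
batch_a = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
batch_b = [[0.1, 0.0], [5.1, 5.0], [9.0, 9.0]]
pairs = mutual_nearest_neighbors(batch_a, batch_b, k=1)
print(pairs)  # [(0, 0), (2, 1)]
```

Cells without a mutual partner (here, the second cell of `batch_a` and the third cell of `batch_b`) are left unpaired, which is what lets the approach tolerate populations present in only one dataset.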
While the single-cell research field enters the era of single-cell multiomics, integrative computational algorithms are still in their infancy [77]. The following integrative analyses we discussed mainly focus on paired single-cell multiomics datasets, i.e., different modalities are profiled from the same cells.
(1) Reference-mapping. Most of single-cell multi-omics approaches include transcriptomic profiling. Accordingly, utilizing transcriptome as an anchor to link different modality data across different experiments is a choice. Xiong et al. [58] established an analytical framework using CoTECH data to investigate the relationship of multiple histone modifications and explore the bivalency dynamics from naive to primed mESCs. A similar work took advantage of the transcriptomic information to reconstruct the developmental trajectory of OGC maturation and analyzed epigenetic regulatory programs of H3K4me1, H3K4me3, and H3K27ac. Instead of applying transcriptome as a reference, scCUT&Tag-pro enabled integrative analyses ranging from measurements of chromatin accessibility, protein-DNA interactions (six targets), gene expression and surface proteins by harmonizing into a common space based on the protein abundances [79].
(2) Combined effects. Given the contribution of each layer to cellular heterogeneity, single-cell data from all modalities are combined and represented by one graph. For example, Argelaguet et al. [80] developed a computational method, MOFA, to identify a set of factors and discover the sources of variation in single-cell multi-omics data. Using this strategy, the authors performed co-assays of the single-cell transcriptome, chromatin accessibility and DNA methylation during gastrulation in mouse embryos, and found that pluripotent epiblast cells are epigenetically primed for an ectoderm fate, which occurs earlier than the emergence of cell fate bias toward endoderm and mesoderm [81]. Similarly, computational integration of single-cell multimodal profiling data with methods such as MNN, scAI, and totalVI may also lead to more accurate characterization of cell states [63][81][82][83].
Additionally, computational methods are required to impute missing values in some cases [84]. A number of bioinformatic tools have been developed for accurate prediction of single-cell DNA methylation [85,86] and epigenomic cell-state dynamics [87]. However, computational approaches may inevitably lead to misassignments or inaccuracy. Stuart et al. [88] performed CCA using the ATAC and RNA profiles from the same cells separately, compared the inferred pairing with the correct (measured) coupling, and found an accuracy ranging from 36.7% to 74.9%. It is worth noting that such computational errors may also result from the asynchrony between different molecular layers.
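The benchmark logic described above, in which a computationally inferred pairing is scored against the measured coupling, can be made concrete with a small sketch. The function name and the toy assignment are hypothetical; the sketch assumes that cell i in one modality truly corresponds to cell i in the other, as is the case when both profiles come from the same cell.

```python
def pairing_accuracy(inferred_partner):
    """Fraction of cells whose inferred partner is the true partner.

    inferred_partner[i] is the index of the cell in modality B that an
    integration method assigned to cell i of modality A; because both
    modalities were measured in the same cells, the true partner is i.
    """
    if not inferred_partner:
        raise ValueError("no cells to evaluate")
    correct = sum(1 for i, j in enumerate(inferred_partner) if i == j)
    return correct / len(inferred_partner)


# Toy example: cells 2 and 3 were swapped by the hypothetical integration.
accuracy = pairing_accuracy([0, 1, 3, 2, 4])
print(f"{accuracy:.1%}")  # 60.0%
```

A score of this kind only measures exact cell-to-cell matches; assignments to a neighboring cell of the same type count as errors, which is one reason reported accuracies can look low even when cell-type-level integration is reasonable.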

Future outlook
In summary, the international single-cell omics community is working together to push the boundary of the field by inventing novel single-cell epigenomics and multi-omics tools, aiming at unbiasedly defining cell identities and states and exploring the regulatory mechanisms. Although much progress has been made toward reconstructing a comprehensive single-cell atlas that links spatiotemporal dynamics in individual cells of a multicellular organism (Figure 4A and 4B), current single-cell epigenomic and multi-omics approaches are far from mature due to data sparsity, limited recovery rate from input cells and relatively low detection efficiency of molecules, not to mention full coverage at the single-molecule level. Therefore, it is critical to distinguish true biological signals in individual cells from potential technical noise. Optimization in molecular biology is required to improve the recovery of transcriptomic and genomic features. For example, the Adey laboratory developed a symmetrical strand sci ("s3") strategy to overcome the 50% yield limitation of standard Tn5 tagmentation by using a uracil-based adapter switching approach, achieving significant improvements in usable reads per cell [89]. The data quality of other transposase-based single-cell techniques may also be improved via this strategy. Experimentally, single-cell sequencing technologies are moving forward in the following directions: (1) measuring multiple layers instead of a single layer in one cell; (2) measuring multiple factors at a time; (3) achieving ultra-high throughput at low cost; (4) achieving compatibility with commercial automation platforms.
Spatial localization manifests at scales ranging from local (intercellular crosstalk) to global (tissue and organ patterning), strongly impacting cell function and cell fate decisions [90]. However, most single-cell omics protocols require single-cell dissociation and suspension, eliminating spatial information. Spatially resolved transcriptomic methods have been developed and improved rapidly in recent years based on in situ sequencing and single-molecule FISH strategies. For example, technologies including Stereo-seq [91], smFISH [92], MERFISH [93], Slide-seq [94], seqFISH [95], and Tomo-Seq [96] have begun to achieve measurement of spatial gene expression at high resolution. Moving beyond the spatially resolved transcriptome, the Fan laboratory successfully developed Spatial-CUT&Tag [97] and spatial-ATAC-seq [98] for genome-wide profiling of histone modifications and chromatin accessibility, deepening our understanding of epigenetic regulation in spatial chromatin states. In addition, sciMAP-ATAC successfully characterized the spatially progressive nature of chromatin changes in a cerebral ischemia model system [99]. These elegant tools help identify key regulatory elements with specific spatial information. Further efforts should focus on extended applications such as tumorigenesis and the cancer microenvironment to realize the full potential of the spatial epigenome in biomedical research.
Computational methods have emerged to reconstruct lineage relationships and infer the underlying dynamics of differentiation in cell fate decisions [100]. Most single-cell pseudotime algorithms are designed for single-cell transcriptomic data analysis, and less has been reported on depicting differentiation potential and developmental trajectories from the perspective of the epigenome. Inspired by the ability of chromatin potential and chromatin velocity to recapitulate differentiation processes [101], we believe that chromatin states captured by single-cell epigenomic and multi-omics technologies are equally important and have the potential to infer cell lineages and predict future cell states. Importantly, several elegant technologies such as live imaging have surfaced and greatly expanded our toolkit for studying cell lineage during mammalian organogenesis [102][103][104]. From our point of view, the experimental and computational combination of single-cell multi-omics, live imaging, and genetic lineage tracing will provide an unprecedented opportunity to reveal precise and complete lineage definition and cell identity (Figure 4C).
The field of single-cell sequencing methods is undergoing transformative development from measurements of cellular molecules to large-scale genetic perturbations using the CRISPR-Cas9 system. Early investigations in large-scale cellular function screens were limited to scRNA-seq readout. For example, Perturb-seq, CROP-seq, and CRISP-seq directly linked pooled CRISPR screens with transcriptomic output in thousands to hundreds of thousands of individual cells [105][106][107][108]. To facilitate high-throughput functional exploration of epigenetic mechanisms, Perturb-ATAC-seq and droplet-based Spear-ATAC-seq have been developed to characterize the effect of CRISPR/Cas9-based perturbations on chromatin accessibility and examine how epigenomic heterogeneity contributes to transcriptional regulation [109,110]. The integration of single-cell CRISPR-Cas9 functional characterization and epigenomic profiling allows for identification of cell type-specific cis-regulatory elements. So far, functional readout through single-cell epigenomic profiling remains particularly challenging, and few tools are available for extrapolating cellular functions from large-scale single-cell omics data. In the future, more efforts are expected to focus on dissecting gene regulatory networks by combining modified genome editing toolkits and single-cell multiomics technologies to demystify the roles of cis-regulatory elements and trans factors in genome regulation. Breakthroughs in technology would create the opportunity to associate epigenetic configurations with cell potential and realize precise manipulation of cell fate, with great implications for regenerative medicine and for the diagnosis and treatment of human diseases [111].

Author contributions
A.H. conceived the structure of the manuscript; A.H., H.X., and R.W. wrote the paper; H.X. and R.W. prepared the figures and tables.