Simultaneous Transcriptional and Epigenomic Profiling from Specific Cell Types within Heterogeneous Tissues InVivo

Epigenomic mechanisms direct distinct gene expression programs for different cell types. Various in vivo tissues have been subjected to epigenomic analysis; however, these studies have been limited by cellular heterogeneity, resulting in composite gene expression and epigenomic profiles. Here, we introduce "NuTRAP," a transgenic mouse that allows simultaneous isolation of cell-type-specific translating mRNA and chromatin from complex tissues. Using NuTRAP, we successfully characterize gene expression and epigenomic states of various adipocyte populations in vivo, revealing significant differences compared to either whole adipose tissue or in vitro adipocyte cell lines. We find that chromatin immunoprecipitation sequencing (ChIP-seq) using NuTRAP is highly efficient, scalable, and robust with even limited cell input. We further demonstrate the general utility of NuTRAP by analyzing hepatocyte-specific epigenomic states. The NuTRAP mouse is a resource that provides a powerful system for cell-type-specific gene expression and epigenomic profiling.


In Brief
Roh et al. introduce a transgenic mouse model, named ''NuTRAP,'' for the isolation of cell-type-specific nuclei and mRNA and characterize gene expression and epigenomic states of pure populations of adipocytes in vivo. This approach is applicable to different cell types and is highly robust even with limited cell inputs.

INTRODUCTION
Although they share the same genome, the many unique cell types that make up complex tissues perform specialized functions, as directed by cell-type-specific gene expression programs. These distinct transcriptional profiles are established and maintained by epigenomic mechanisms that emerge during cellular differentiation; additional epigenomic processes regulate gene expression in response to environmental cues without changing cellular identity (Zhu et al., 2013). Thus, characterizing the chromatin state of cells and their associated gene expression profiles has become a major goal of biologists studying developmental, physiological, and pathophysiological processes. This has resulted in a large and growing catalog of data from humans and model organisms, exemplified by the Encyclopedia of DNA Elements (ENCODE), modENCODE, and Roadmap Epigenom-ics consortia (Bernstein et al., 2010;Celniker et al., 2009;ENCODE Project Consortium, 2012).
Despite the advances made in this area, significant gaps in our knowledge remain. One particularly vexing issue is cellular heterogeneity of whole tissues, such that tissue-level transcriptional and epigenomic profiles reflect an integration of data from all cell types within a sample. This issue can be compounded when looking at processes that involve changes in the proportion of constituent cells in the tissue under study; examples include the profound alterations in the immune cell composition of adipose tissue that accompanies obesity or the selective loss of specific neurons in various neurodegenerative disorders (Cildir et al., 2013;Saxena and Caroni, 2011). As a result, cell-type-specific epigenomics has focused on cultured primary cells or cell lines in vitro, where a pure population can be obtained. The fidelity with which these models represent the in vivo state, however, is often limited or unclear.
To overcome the issue of cellular heterogeneity within tissue samples, several methods have been developed. For example, laser-capture microdissection has been utilized to isolate pure populations of rare cell types (Cheng et al., 2013). However, this method requires high-level expertise and expensive equipment and is limited by extremely low throughput. Cell sorting can also be used to isolate certain types of cells if a unique cell surface marker is known, or a fluorescent marker can be activated using a Cre-lox dependent method followed by tissue dissociation and fluorescence-activated cell sorting (FACS); this approach can be limited, as the dissociation procedure itself can often alter cellular state (Richardson et al., 2015).
TRAP (translating ribosome affinity purification) exploits a GFP-tagged ribosomal protein expressed in a specific cell type and thus allows biochemical isolation of ribosome-bound mRNA from the target cell type within complex tissues without tissue dissociation (Heiman et al., 2008). TRAP has been used to characterize expression profiles of diverse cell types in many organisms (Thomas et al., 2012;Tryon et al., 2013;Watson et al., 2012;Zhou et al., 2013). Similarly, cell-type-specific epigenomic analysis has been enabled by nuclear labeling strategies followed by flow cytometry or bead-based affinity purification (Bonn et al., 2012;Jiang et al., 2008). An example of this approach is INTACT (isolation of nuclei tagged in specific cell types) in which a labeled nuclear membrane protein (RanGAP1) is expressed in a target cell type followed by affinity purification of the tagged nuclei (Deal and Henikoff, 2010). INTACT has been utilized to characterize chromatin state in plants, worms, fruit flies, and mice (Deal and Henikoff, 2010;Mo et al., 2015;Steiner et al., 2012).
We have developed a transgenic mouse line combining the ribosome-tagging strategy from the TRAP method and the nuclear tagging strategy from INTACT into a single polycistronic element targeted to the Rosa26 locus. This mouse line, which we call NuTRAP (nuclear tagging and translating ribosome affinity purification), enables simultaneous isolation of mRNA and nuclei from any cell type for which a Cre line exists. We utilize these mice to establish coincident transcriptional and epigenomic maps from two complex tissues of metabolic relevance: adipose tissue and liver. We demonstrate that chromatin immunoprecipitation sequencing (ChIP-seq) profiling using this method is robust even with low input samples, minimizing the number of animals required for such studies. The NuTRAP mouse will thus be a powerful tool for studies of cell-type-specific genomic and epigenomic profiles in vivo.

Generation of the NuTRAP Mouse
We first generated a cassette containing three components: (1) the E. coli biotin ligase BirA, (2) the mouse nuclear membrane RanGAP1 protein tagged with a biotin ligase recognition peptide (BLRP) and fused to mCherry, and (3) the 60S ribosomal subunit L10a fused to EGFP, each separated by a self-cleaving viral 2A peptide ( Figure 1A). Co-tagging of RanGAP1 with BLRP and mCherry allows nuclear isolation by both affinity-and fluorescence-based purification. The multifunctional cassette was targeted into the Rosa26 locus, preceded by a loxP-stop-loxP sequence. Upon crossing with a cell-type-specific Cre line, the cassette is expressed, enabling cell-type-specific nuclear and ribosomal labeling and subsequent purification. (A) Scheme of NuTRAP mouse. Upon expression of Cre recombinase, the NuTRAP mouse coexpresses BirA, BLRP-and mCherry-fused RanGAP1 and EGFP-fused ribosomal protein L10a by 2A-mediated self-cleavage. BLRP-and mCherry-fused RanGAP1 protein is biotinylated by BirA, allowing nuclear membrane labeling with mCherry and biotin. EGFP-L10a enables tagging of the translating mRNA polysome complex. (B) Fluorescence whole-mount images of eWAT of Ad-NuTRAP mice. mCherry shows nuclear membrane labeling, GFP shows ribosomal labeling, and DAPI visualizes nuclei. Scale bar, 20 mm. (C) Fluorescence images of nuclei isolated from eWAT of Ad-NuTRAP mice. Isolated nuclei were incubated with streptavidin-conjugated dynabeads and washed, and nuclei from the bound and unbound fractions were visualized; unbound nuclei lack GFP and mCherry labeling, whereas bound nuclei have both. Dynabeads are autofluorescent and visualized in all channels. Scale bars, 10 mm. (D) Western blot analysis of the bound and unbound nucleus fractions after streptavidinconjugated dynabead pull-down. Streptavidinconjugated horseradish peroxidase (HRP) and anti-GFP antibody detects biotinylated RanGAP1 and EGFP-L10a proteins, respectively, only in the bound fraction. NS indicates a non-specific band. (E) Flow cytometry analysis of isolated nuclei from eWAT of Ad-NuTRAP mouse. Adipocyte nuclei are positive for mCherry, while non-adipocyte nuclei are negative. (F) Quantitative analysis of nuclei from adipocytes and non-adipocytes in different fat depots of Ad-NuTRAP mouse. Bars indicate mean value ± SEM (n = 3). See also Figures S1 and S2. . Gene expression is normalized by setting the value of the input/total sample equal to 1. Bars indicate mean values ± SEM (n = 3 for input, n = 4 for TRAP, n = 5 for flotation) (*p < 0.05; **p < 0.01; ***p < 0.005). (B) MA plot of RNA-seq results. The x axis indicates transcript abundance, shown as log2 values of average counts per million (CPM), and y axis indicates log2 of fold change (TRAP/input). Each dot indicates an individual transcript, and significantly changed (false discovery rate [FDR] < 0.25) transcripts are highlighted in red.
(legend continued on next page) Adipose tissue is a highly complex tissue composed of adipocytes, fibroblasts, preadipocytes, endothelial cells, and a wide variety of immune cells, whose relative proportions are known to change in different nutritional settings (Cildir et al., 2013). We crossed NuTRAP mice with our adipocytespecific adiponectin-Cre (Ad-Cre) (Eguchi et al., 2011). In Ad-NuTRAP mice, EGFP-fused L10a and mCherry-fused RanGAP1 were expressed in brown adipose tissue (BAT) and white adipose tissue (WAT), but not in kidney or liver (Figure S1A). A majority (>80%) of the protein was completely cleaved by the 2A peptide ( Figure S1A). Fluorescence microscopy revealed mCherry-fused RanGAP1 on the nuclear surface of lipid-filled adipocytes and EGFP-fused L10a found in the cytoplasm and nucleolus ( Figure 1B). Immunohistological analysis using anti-GFP showed that adipocytes were specifically labeled and that the efficiency of Cre excision was >97% in WAT and BAT ( Figure S1B). Ad-NuTRAP mice displayed normal body weight, and their blood glucose and insulin levels were indistinguishable from those of control mice (Figures S1C-S1E).
To test the affinity purification component, we isolated nuclei from adipose tissues and performed purification using streptavidin-conjugated magnetic beads. We observed that mCherry-and EGFP-labeled adipocyte nuclei were bound to autofluorescent magnetic beads, while nuclei from nonadipocyte cells remained unbound ( Figure 1C). Purified nuclei from adipocytes, but not non-adipocytes, expressed biotinylated mCherry-fused RanGAP1 protein and EGFP-fused L10a ( Figure 1D), indicating that the affinity-based labeling system can be used for nuclear purification. Additionally, mCherry-labeled adipocyte nuclei were clearly distinguishable and sortable from the non-adipocyte population ( Figure 1E). Sorted adipocyte nuclei were of high purity (>96%) and quality; >91% of the mCherry-positive nuclei displayed good nuclear morphology ( Figure S1F). We quantified the fraction of adipocytes in epididymal WAT (eWAT), inguinal WAT (iWAT), and BAT as $50%, $30%, and $65%, respectively ( Figure 1F).
Of note, we also generated an additional transgenic mouse (H2B-TRAP)  in which mCherryfused H2B was substituted for BirA-RanGAP1 ( Figure S2A). When crossed to Ad-Cre to generate Ad-H2B-TRAP mice, mCherry-H2B and EGFP-L10a labeled the nucleus and cytoplasm of adipocytes, respectively, in BAT ( Figure S2B). However, we observed that Ad-H2B-TRAP mice were profoundly lipodystrophic in all WAT depots and failed to gain weight on a high-fat diet ( Figure S2C). Based on these results, we stopped using H2B-TRAP mice for work in adipocytes. It is worth noting, however, that expression of H2B-TRAP driven by other Cre lines, such as albumin-Cre (Alb-Cre) or AgRP-Cre, does not cause any recognizable abnormalities (data not shown), and this line has been be useful for non-adipocyte applications (Ye et al., 2016).

NuTRAP-Mediated Cell-Type-Specific Gene Expression Profiles
To characterize adipocyte-specific gene expression, we conducted TRAP experiments in the eWAT of Ad-NuTRAP mice using an anti-GFP antibody. Compared to input from whole eWAT, TRAP samples showed enrichment of adipocyte-selective gene expression; Fabp4, Adipoq, Plin1, and Pparg were enriched by $5-fold, 5-fold, 2-fold, and 2-fold, respectively (Figure 2A, top). In contrast, the endothelial cell markers Pecam1 and Vcam1, and the macrophage marker Emr1, were substantially depleted by $3-fold, 12-fold, and 9-fold, respectively (Figure 2A, top), indicating that TRAP efficiently depletes the mRNA of nonadipocyte cell types. Comparison with the traditional flotation method (i.e., modified Rodbell protocol) showed that the enrichment/depletion efficiency of adipocyte and non-adipocyte markers was not as robust as that seen with TRAP ( Figure 2A, bottom).
To generate adipocyte-specific gene expression profiles, we conducted RNA sequencing (RNA-seq) with TRAP-isolated RNA with total input RNA as a control. TRAP significantly enriched 2,771 genes and depleted 2,283 genes. Of note, depleted genes displayed greater fold changes than enriched genes (Figure 2B), as expected when isolating a cell type that makes up a significant proportion of the whole tissue and demonstrates the high specificity of the TRAP method. Gene set enrichment analysis revealed that adipocyte-enriched genes were involved in oxidative phosphorylation; lipid, glucose, and amino acid metabolism; mitochondrial function; and insulin, adipokine, and PPAR signaling ( Figures 2C and 2D), consistent with the known functions of adipocytes. On the other hand, depleted genes were involved in immune signaling, cytokine and chemokine signaling, cell adhesion, and cytoskeleton ( Figures 2E and 2F), as expected for cells known to populate the stromal-vascular fraction of the fat pad. The complete list of adipocyte-enriched and adipocyte-depleted genes is presented in Table S1. Taken together, our results indicate that the NuTRAP mouse enables the generation of cell-type-specific gene expression profiles.

NuTRAP-Mediated Cell-Type-Specific Epigenomic Profiles
We next assessed epigenome-wide active cis-regulatory element landscapes in mCherry-positive and mCherry-negative nuclei sorted from the WAT and BAT of Ad-NuTRAP mice using H3K27ac ChIP-seq. To compare epigenomic profiles of isolated nuclei with whole fat tissue, we also generated H3K27ac ChIP-seq data from whole WAT and used previously published ENCODE H3K27ac ChIP-seq data from murine whole BAT (Stamatoyannopoulos et al., 2012). Notably, our whole WAT H3K27ac ChIP-seq data displayed enrichment not only at promoters of adipocyte marker genes such as Fabp4, Adipoq, and Plin1, as expected, but also from promoters of macrophage-enriched genes such as Emr1 and Ptprc, as seen in ENCODE data of bone-marrow-derived macrophages (BMDM) (C-F) Gene set enrichment analysis using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and DAVID Gene Ontology (GO) biological process (BP) terms. All the genes that are significantly enriched (C and D) or depleted (E and F) in adipocytes in TRAP/RNA-seq analysis are used for the analysis. Bars indicate statistical significance shown as Àlog10 of p value. See also Table S1.  . Each column represents the promoter region of a gene locus. Each track shows a different sample; WAT/BAT adipocytes and WAT/BAT non-adipocytes indicate mCherry-positive and -negative nuclei isolated by flow cytometry from WAT/BAT of the Ad-NuTRAP mouse, respectively. Whole WAT from this study and whole BAT from the ENCODE project are presented. Bone-marrow-derived macrophages (BMDMs) data from ENCODE are presented. Our H3K27ac data with sorted adipocyte or non-adipocyte nuclei are normalized and presented in the same scale. Because the whole WAT sample was separately processed and sequenced in lower depth, it is visualized in an independent scale after 3-pixel window smoothing. The scale of the ENCODE whole BAT and BMDMs are automatically determined by the genome browser for comparable visualization. (B and D) Heatmap of adipocyte-enriched (B) and adipocyte-depleted (D) H3K27ac peaks in adipocyte and non-adipocyte nuclei. (C and E) KEGG pathway analysis with the genes associated with adipocyte-enriched (C) and adipocyte-depleted (E) H3K27ac peaks. Bars indicate statistical significance shown as Àlog10 of p value.
(legend continued on next page) (Stamatoyannopoulos et al., 2012) ( Figure 3A) and endothelial cell-enriched genes such as Cd34 and Tek. Similarly, the ENCODE H3K27ac whole BAT ChIP-seq data display peaks from both adipocyte marker genes and non-adipocyte marker genes ( Figure 3A). In contrast, H3K27ac ChIP-seq data from labeled white and brown adipocyte nuclei (mCherry positive) were generally devoid of signal from non-adipocytes while showing enrichment at adipocyte promoters compared to whole fat. Strong peaks were noted at thermogenic gene promoters (e.g., Ucp1 and Dio2) in brown adipocytes, but not white, while the reverse pattern was seen for white adipocyte-selective genes (e.g., Cfd and Lep). We also noted that mCherry-negative nuclei displayed weak signal at adipocyte marker promoters ( Figure 3A), possibly due to incomplete labeling of adipocytes by the Ad-NuTRAP system. Alternatively, these cells may have recombined properly but show weak mCherry labeling of nuclei. Regardless, these results indicate that whole-tissue ChIP-seq data, including data in major repositories, represent profiles of mixed cell types and must be interpreted cautiously.
Differential enrichment of H3K27ac signal between mCherrypositive and mCherry-negative nuclei was assessed ( Figures  3B and 3D), followed by pathway analysis on genes in proximity to differentially enriched peaks. Adipocyte-enriched peaks were associated with genes involved in insulin signaling, PPAR signaling, and lipid metabolism ( Figures 3C and S3A). By contrast, adipocyte-depleted peaks were associated with genes involved in actin cytoskeleton organization, cell adhesion, and immune responses ( Figures 3E and S3B). We also integrated the H3K27ac ChIP-seq and RNA-seq data to determine whether the activity of cis-regulatory elements is correlated with gene expression. We defined global differences in gene expression and H3K27ac peak activity between adipocytes and non-adipocytes by comparing the fold changes of gene expression (TRAP/ input) to those of H3K27ac peak activity. In order to avoid bias, we selected the strongest peak for each gene within ±100 kb of the transcription start site (TSS) without filtering for peak directionality. Even with this liberal approach, a majority (>76%) of differentially regulated genes are directionally correlated with H3K27ac peaks ( Figure S3C). These results demonstrate that H3K27ac ChIP-seq in NuTRAP mice allows for the identification of cell-type-specific cis-regulatory regions and the biological processes associated with them.
Cell-type-enriched regulatory elements identified from H3K27ac ChIP-seq can be used to identify transcription factors that play important roles in those cell types (Kang et al., 2015;Mikkelsen et al., 2010). To identify transcription factors that may function in adipocytes, we performed motif enrichment analysis in regulatory elements with increased activity in mCherry-positive nuclei. Assuming that functionally important transcription factors must be expressed at reasonable levels, we prioritized motifs by transcription factor abundance from our RNA-seq data. The most enriched motifs were the wellknown pro-adipogenic transcription factors C/EBPa and PPARg ( Figure 3F), again confirming that ChIP-seq from NuTRAP generates high-quality, biologically meaningful data. We also identified NFE2L1::MAFG and NR1H2::RXRA as significantly enriched transcription factor motifs ( Figure 3F), suggesting that these transcription factors may play an important role in adipocyte functions. NR1H2 is also known as LXRb, which along with its related factor LXRa has been shown to affect adipose lipogenesis and browning of white fat (Beaven et al., 2013;Miao et al., 2015). NFE2L1 and MAFG, on the other hand, have not been studied in the context of adipose biology; our data suggest that they may have been overlooked until this point. We also conducted motif enrichment analysis on ''non-adipocyte'' peaks and identified significant motifs such as KLF4 and E2F1/3/4 ( Figure S3D), which are involved in immune cell function and preadipocyte proliferation (Alder et al., 2008;Humbert et al., 2000;Taura et al., 2012).
In order to test whether the NuTRAP system allows sufficient yield and purity for the determination of transcription factor cistromes, we conducted ChIP-seq for PPARg in isolated brown adipocyte nuclei. PPARg emerged in our motif enrichment analysis and is a known driver of adipogenesis and adipocyte gene expression (Rosen et al., 2000). Overall, we identified 36,148 PPARg peaks, similar to the number of peaks reported in whole BAT (Rajakumari et al., 2013) and cultured adipocytes in vitro (Siersbaek et al., 2012). PPARg peaks are found throughout the genome but are most common immediately proximal to the transcription start sites of genes ( Figure S3E). Unbiased de novo motif discovery analysis recovered a PPARG::RXR motif virtually identical to the known consensus ( Figure S3F). Strong PPARg peaks were noted in the promoter regions of adipocyte marker genes and also the brown adipocyte marker Ucp1 ( Figure 3A, bottom track). We also compared the overlap between brown adipocyte PPARg peaks and H3K27ac peaks from brown adipocytes and non-adipocytes. While 65% of the H3K27ac peaks in brown adipocytes overlapped with PPARg peaks, the degree of overlap in non-adipocytes was only 4% ( Figure 3G).

Comparison of Chromatin State between In Vivo and In Vitro Adipocytes
Our data from isolated adipocytes in vivo provide an unusual opportunity to compare how chromatin state differs between cells cultured in monolayer versus cells of the same type isolated from their native context. The murine 3T3-L1 line was chosen for this comparison because it is the most widely used adipocyte cell culture model and its chromatin state has been previously characterized (Mikkelsen et al., 2010). To systematically compare in vitro and in vivo epigenomic states while minimizing potential (F) Motif enrichment within adipocyte-enriched H3K27ac peaks. Each dot represents a transcription factor in the JASPAR database. The x axis indicates the statistical significance shown as Àlog10 of p value, and the y axis indicates the abundance (fragments per kilobase of transcript per million [FPKM]) of the transcription factors from the total eWAT in the RNA-seq data. Several transcription factors with the highest statistical significance and gene abundance are highlighted in red, and their binding sequence position weight matrices (PWM) are presented on the right. (G) Overlap between PPARg binding sites in BAT adipocytes and BAT adipocyte-enriched or non-adipocyte-enriched H3K27ac peaks. The bars indicate the percentage of H3K27ac peaks that also contain a PPARg peak. See also Figure S3.
confounding experimental effects of each system, we took a differentially regulated peak comparison approach. First, we identified differentially regulated H3K27ac peaks between mCherry-positive and mCherry-negative nuclei (in vivo) and between 3T3-L1 adipocytes and pre-adipocytes (in vitro) and then determined how well these correlate with one another. Overall, we noted a positive correlation between the models (R 2 = 0.5785) ( Figure 4A). Gene set enrichment analysis showed that peaks upregulated in mature adipocytes both in vivo and in vitro (top right, first quadrant) were related to metabolic processes, PPAR, insulin and adipokine signaling pathways; specific examples include the adipocyte genes Fabp4, Plin1, and Adipoq. Downregulated peaks (bottom left, third quadrant) were associated with cytoskeletal organization and cell adhesion ( Figures 4A and 4B). To better delineate how in vivo and in vitro models differ, we also determined H3K27ac peaks that were discordant. Peaks that were upregulated in vitro but downregulated in vivo were associated with genes involved in metabolic processes ( Figure 4A, top left, second quadrant), suggesting that the two systems may be metabolically distinct. Intriguingly, we noted that peaks that were upregulated in vivo but downregulated in cultured adipocytes (bottom right, fourth quadrant) were associated with genes involved in inflammatory response and chemokine signaling ( Figure 4A). Motif enrichment analysis  identified an NFKB1 binding site as enriched in these peaks (Figure S4A). These results suggest that chemokine signaling and inflammatory response mediated by the NF-kB pathway may be more important in vivo than in vitro.
Furthermore, some H3K27ac peaks were unique to either the in vitro or in vivo system; examples include the in-vivo-selective peaks at the Lep, Nnat, Mc2r, and Igf1 loci and the in-vitro-selective peaks at the Mdm2, Hes1, and Slc35e3 loci ( Figure 4B). In particular, the Lep locus is notable for its lack of the active H3K27ac mark in 3T3-L1 cells, consistent with previous reports that Lep is not expressed in 3T3-L1 adipocytes despite high expression in vivo (Wrann et al., 2012). Gene set enrichment analysis showed that the chemokine signaling pathway was associated with in-vivo-specific peaks ( Figure S4B). Pathways associated with in vitro adipocyte-selective peaks include cellcycle control ( Figure S4C), consistent with the fact that 3T3-L1 cells must undergo mitotic clonal expansion prior to differentiation, while cells in vivo do not (Cristancho and Lazar, 2011).

Epigenomic Profiles of Pure Hepatocyte Population In Vivo
In order to establish the generalizability of the NuTRAP system, we also tested its efficacy in the liver by crossing the NuTRAP mouse with the hepatocyte-specific Alb-Cre line. Most hepatocytes of Alb-NuTRAP mice were labeled with GFP (>96%; Figure S5A), and mCherry-positive hepatocyte and mCherrynegative non-hepatocyte nuclei were isolated by flow cytometry (Figures 5A-5C). Hepatocytes contribute approximately half of the total cell population in the liver ( Figure 5B) and have nuclei with a range of size and ploidy; $70% of hepatocyte nuclei were larger in size and polyploid ( Figures 5C, S5B, and S5C), confirming previous findings (Guidotti et al., 2003).
Our H3K27ac ChIP-seq data from isolated hepatocyte nuclei showed strong H3K27ac activity in hepatocyte marker gene loci but no activity in immune cell or endothelial cell marker loci, whereas non-hepatocyte nuclei showed the opposite pattern ( Figure 5D). Gene set enrichment analysis confirmed that hepatocyte-enriched H3K27ac peaks were associated with genes involved in known hepatocyte functions (xenobiotic and drug metabolism, steroid synthesis, and lipid and amino acid metabolism) ( Figures 5E and S5D), while hepatocytedepleted peaks were associated with genes involved in immune response (chemokine signaling, cell adhesion, and cytoskeleton organization) ( Figures 5F and S5E), functions of non-hepatocyte liver cells. Furthermore, motif enrichment analysis of hepatocyte-specific peaks identified transcription factors that function in hepatocytes, including HNF1A/B and HNF4A/G ( Figure 5G).
ChIP-seq profiles of H3K4me3 (for promoter marks) and H3K4me1 (for active or poised enhancer marks) were also obtained from white and brown adipocytes and hepatocytes ( Figure S6A), demonstrating broad applicability of the NuTRAP system to define epigenomic landscapes of specific cell types.

Sensitivity of ChIP-Seq Using NuTRAP
ChIP-seq with rare cell types has been challenging, because the standard ChIP-seq protocol requires a large amounts of input material, often exceeding one million cells. To determine whether the NuTRAP mouse can be used with a low amount of cell input, we conducted a titration curve of H3K27ac ChIP-seq using between 10,000 and 400,000 isolated brown adipocyte nuclei. Visual inspection of ChIP-seq data showed that all tested input levels generated high-quality profiles of H3K27ac peaks at the Pparg locus ( Figure 6A). To systematically examine ChIP-seq peak signals in these titrations, we first analyzed correlation between the highest input with 400,000 nuclei and each of the lower inputs. All comparisons displayed a strong positive correlation, although correlations were somewhat weaker with lower input ( Figure 6B). Visualization of peaks revealed highly consistent H3K27ac ChIP-seq peak profiles across all input depths (Figure 6C). Strong overlaps between peaks was observed across input levels ( Figure 6D), with $50% of the peaks found in 400,000 nuclei identifiable when only 10,000 nuclei were used. Finally, ChIP-seq efficiency was high, ranging from 46% of reads contributing to peak signals with 400,000 nuclei to 14% with 10,000 nuclei ( Figure S6B). Taken together, our data indicate that the NuTRAP system is highly sensitive and specific will be highly useful for cell types in limiting quantity.

DISCUSSION
Integrated analysis of gene expression and chromatin state from specific cell types can lead to insights into the core mechanisms underlying cellular differentiation, physiology, and pathophysiology. Studies of tissues in vivo have been hampered, however, by cellular heterogeneity, which results in complex patterns of gene expression and epigenomic signals that largely reflect the relative abundance of individual cell types in the tissue being studied. This poses a serious challenge for analyzing any tissue or organ, but even more so when the cellular population is dynamic. Adipose tissue provides an excellent example, as it contains multiple cell types at baseline, the relative proportions of which become profoundly altered during the development of obesity.
Here, we introduce the NuTRAP mouse, which enables simultaneous cell-type-specific isolation of translating mRNA and nuclei from heterogeneous in vivo tissues. Using the NuTRAP mouse, we successfully isolate mRNA and nuclei from adipocytes in vivo, characterize their profiles of gene expression and epigenomic states, and compare these to data from whole fat tissue and adipocyte cell lines cultured in vitro. We further cells and housekeeping genes are shown. Isolated hepatocyte and non-adipocyte H3K27ac data are presented in the identical scale, and the scale of the ENCODE liver is automatically determined by the genome browser for comparable visualization. (E and F) KEGG pathway analysis with the genes associated with hepatocyte-enriched (E) and hepatocyte-depleted (F) H3K27ac peaks. Bars indicate statistical significance shown as Àlog10 of p value. (G) Motif enrichment analysis with hepatocyte-enriched H3K27ac peaks. The most significant motifs are presented with their sequence PWM and statistical significance shown as Àlog10 of p value. See also Figure S5. demonstrate the general usefulness of the NuTRAP mouse to other non-adipose tissues by analyzing hepatocytes. Finally, we find that ChIP-seq using NuTRAP is highly robust to limited cell input, suggesting that it can be used for studies of low-abundance cell types.
The NuTRAP mouse has several advantages over other models developed to deal with cellular complexity. First, and most importantly, by combining the ribosomal and nuclear tagging approaches of TRAP and INTACT, respectively, it allows for the generation of transcriptional and epigenomic profiles from a single cross. This feature can be particularly useful to characterize transcriptional and epigenomic variations among individual animals. While different strategies have been employed to analyze multiple parameters from the same samples (Bock et al., 2016), our approach uses separate fractions for RNAseq and ChIP-seq. We expect that our method can be further modified to analyze RNA and DNA from the same cells. Another advantage of the NuTRAP mouse is that by exploiting a dual-labeling system using mCherry fluorescence and biotin, labeled nuclei can be isolated using either FACS or affinity purification. The NuTRAP model allows for the isolation of nuclei and ribosomes from any cell type for which a Cre line exists. Importantly, the Cre line used does not need to be expressed in only a single cell type in the body; it is only necessary for it to be unique to a single cell type in the organ under investigation.
Adipose tissue is highly heterogeneous and composed of many different cell types, including adipocytes, fibroblasts, endothelial cells, and immune cells. Our quantitative analysis using FACS found that while the fraction of adipocytes varies in different depots, non-adipocytes generally account for more than half of all cells in WAT. This is consistent with prior work in human adipose tissue suggesting that only 30%-50% of cells in white adipose tissue are adipocytes (Lee et al., 2013). Adipose tissue is somewhat unique in that mature adipocytes can be separated from the rest of the so-called stromal-vascular fraction by virtue of the fact that they float when dispersed. Nonetheless, it has been shown that even this relatively simple method can alter the gene expression profile of the cell (Ruan et al., 2003). By contrast, the NuTRAP mouse allows one to immediately isolate adipocyte-specific mRNA from fat tissues in vivo without enzymatic digestion. Furthermore, since TRAP isolates actively translating mRNA, it likely provides a profile more representative of protein-coding genes than total RNA obtained by flotation. RNA-seq from adipocytes demonstrated that a large number of genes involved in immune and stromal cell functions were dramatically depleted by TRAP, indicating that gene expression data from whole fat tissues are substantially affected by non-adipocyte cell types. Here, we provide a list of adipocyte-enriched and adipocyte-depleted genes in mice, a useful resource for workers in this field.
We also characterized the profiles of adipocyte-specific cisregulatory element states using H3K27ac ChIP-seq. While previous studies using whole fat tissue showed mixed profiles of different cell types, our results reflect a pure population of adipocytes. Gene set enrichment analysis showed that adipocyte-enriched and adipocyte-depleted cis-regulatory elements were highly associated with functional biological pathways in adipocytes and non-adipocytes, respectively, suggesting that cell-type-specific epigenomic profiling can identify relevant biological processes. Furthermore, motif enrichment analysis combined with RNA-seq expression analysis identified transcription factors known to play an important role in adipogenesis and/or adipocyte function, including C/EBPa, PPAR::RXRa, and NR1H2/LXRb (Beaven et al., 2013;Rosen et al., 2000). In this light, the identification of the NFE2L1:MAFG composite motif as significantly over-represented in adipocyte-specific peaks is especially interesting. While NFE2L1 has been reported to have metabolic functions in liver and pancreatic b-cells (Hirotsu et al., 2012;Zheng et al., 2015), our results suggest that these or related factors may play an important role in adipocyte biology. In this study, we focused on adipocyte-specific profiles in healthy conditions. Adipose tissue is known to undergo substantial changes in cellular composition in states of overnutrition, involving depletion of some cell types (e.g., eosinophils, T regulatory cells) and influx of others (e.g., CD4 + T lymphocytes and M1 macrophages). Ad-NuTRAP can be used to characterize adipocyte-specific gene expression and epigenomic changes in those settings without confounding by altered cellular distribution.
Cultured adipocyte cell lines have been employed as another means to study adipocytes in isolation, and although it has been hard to draw direct comparisons, studies have suggested that there is reasonable concordance between in vitro models such as 3T3-L1 adipocytes and adipocytes in vivo (Soukas et al., 2001). Our results provide an unusual opportunity to identify similarities and differences between the two systems. Both in vivo and in vitro adipocyte systems displayed strong H3K27ac peaks in cytoskeleton-related genes and metabolism-related genes in the precursor and mature states, respectively, suggesting that 3T3-L1 adipocytes recapitulate in vivo adipogenesis and adipocyte cellular function in a broad sense. Interestingly, we found that chemokine and inflammatory signaling is a key difference between in vitro and in vivo models, consistent with the fact that adipocytes in vivo are constantly interacting with other nearby cell types, including immune cells. Our motif enrichment analysis identified NFKB1 as one of the motifs enriched in the peaks that were upregulated in vivo but downregulated in cultured adipocytes. Together with previous reports that the NF-kB pathway is resistant to lipopolysaccharide (LPS) stimulation in 3T3-L1 adipocytes (Berg et al., 2004;Kumari et al., 2016), these results suggest that the NF-kB-mediated inflammatory pathway may act differently in vivo than in vitro. Additionally, we found that cell-cycle-related pathways were enriched in 3T3-L1 adipocytes compared to in vivo adipocytes, consistent with the proliferative potential of immortalized cells generally and the known reliance of 3T3-L1 cells on mitotic clonal expansion prior to differentiation (Cristancho and Lazar, 2011). Interestingly, 3T3-L1 cells recapitulate many if not most of the functions of adipocytes in vivo, with a few notable exceptions. One of these is the expression of the adipokine leptin (MacDougald et al., 1995;Wrann et al., 2012). Our results indicate that while many adipocyte-specific loci show similar chromatin state both in vivo and in vitro, the Lep locus lacks active H3K27ac peaks in 3T3-L1 cells.
In order to demonstrate the generalizability of our model, we applied our approach using the NuTRAP mouse for the in vivo analysis of hepatocytes and successfully characterized hepatocyte-specific chromatin states, associated biological processes, and potentially important transcription factors. We also characterized promoters and active or poised enhancers by profiling H3K4me3 and H3K4me1, respectively. Accordingly, we suggest that the NuTRAP mouse will be broadly applicable to a wide range of cell-type-specific genomic and epigenomic studies. One potential concern with this type of approach in vivo is the need for large numbers of cells to generate chromatin state maps. Standard ChIP-seq protocols often require at least one million cells for many transcription factors and histone marks, which is not possible with rare cell types without pooling large numbers of animals. To overcome this problem, several lowinput ChIP-seq methods have been previously developed, often requiring additional PCR amplification (Adli and Bernstein, 2011) or multiple additional enzymatic steps (Shankaranarayanan et al., 2011), which may introduce artifacts. More recently, new ChIP-seq methods using Tn5 transposase (Schmidl et al., 2015) and microfluidics (Rotem et al., 2015) have been developed for low inputs and single cells, respectively, but have yet to be thoroughly tested for rare cell types in vivo.
Here, we achieved strikingly high ChIP-seq efficiency using as few as 10,000 nuclei and achieving efficiencies similar to those seen with 20 million cells . These results indicate the robustness of our assay and suggest that NuTRAP can be used to characterize cells present in limiting quantities without the need for pooling large number of animals. It may be possible to further increase the sensitivity of NuTRAP by incorporating other low-input ChIP methods, including Tn5 transposasemediated ChIPmentation (Schmidl et al., 2015) or single-cell ChIP-seq methods (Rotem et al., 2015). In conclusion, we introduce a new mouse model that provides a powerful system for cell-type-specific gene expression and epigenomic profiling that can be applied to the analysis of even rare cell types.

Animals
All animal studies were performed according to procedures approved by the Beth Israel Deaconess Medical Center (BIDMC) Institutional Animal Care and Use Committee (IACUC). The generation of the NuTRAP mouse is described in detail in Supplemental Experimental Procedures. Adipocyteand hepatocyte-specific studies involved crossing NuTRAP mice with Ad-Cre (Jackson Laboratory, 010803) (Eguchi et al., 2011) or Alb-Cre (Jackson Laboratory, 003574) mice, respectively. Mice were maintained on a standard chow diet (6.4% w/w fat; Harlan Teklad, 8664) under a regular 12-hr light/ 12-hr dark cycle at constant temperature (23 C). NuTRAP (catalog #029899) and H2B-TRAP (catalog #029789) mice have been deposited at Jackson labs.
TRAP RNA Isolation TRAP was performed as previously described  with modifications. Small pieces (50-100 mg) of frozen white adipose tissues were Dounce homogenized in 4 mL homogenization buffer (50 mM Tris [pH 7.5], 12 mM MgCl 2 , 100 mM KCl, 1% NP-40, 100 mg/mL cycloheximide, 1 mg/mL sodium heparin, 2 mM DTT, 0.2 U/mL RNasin, and 1x Complete EDTA-free protease inhibitor; Roche). After centrifugation at 13,000 rpm for 10 min, the lipid layer was removed and the supernatant was collected and incubated with anti-GFP antibody (5 mg/mL; Abcam, ab290) for 1 hr at 4 C. Protein G dynabeads were washed twice in low-salt wash buffer (50 mM Tris [pH 7.5], 12 mM MgCl 2 , 100 mM KCl, 1% NP-40, 100 mg/mL cycloheximide, and 2 mM DTT), added to the homogenates with antibody, and subsequently incubated for 30 min. Dynabeads with immunoprecipitates were washed three times in high-salt wash buffer (50 mM Tris [pH 7.5], 12 mM MgCl 2 , 300 mM KCl, 1% NP-40, 100 mg/mL cycloheximide, and 2 mM DTT). Following the last wash, RLT buffer with b-mercaptoethanol was added to dynabeads, and RNA was extracted using a QIAGEN Micro RNeasy kit according to the manufacturer's instructions. For input RNA, 5% of homogenates were mixed with TRIzol and processed according to the manufacturer's instructions to extract total RNA. Isolated RNA was quantified by Qubit, and RNA integrity was analyzed using an Agilent Bioanalyzer.

RNA-Seq
Extracted RNA (100 ng) was processed for rRNA removal using the Epicenter rRNA depletion kit according to the manufacturer's instructions. rRNAdepleted RNA was subsequently used to generate paired-end sequencing libraries using the Illumina RNA TruSeq Library Kit according to the manufacturer's instructions. The quantity and quality of RNA-seq libraries were analyzed using Qubit and an Agilent Bioanalyzer, respectively, and the libraries were pooled at a final concentration of 12 pM and sequenced by HiSeq2500.

Nucleus Isolation and Sorting
Tissues were Dounce homogenized in nucleus preparation buffer (NPB; 10 mM HEPES [pH 7.5], 1.5 mM MgCl 2 , 10 mM KCl, 250 mM sucrose, 0.1% NP-40, and 0.2 mM DTT). Homogenates were filtered through 100-mm cell strainers and cross-linked with 1% paraformaldehyde (PFA) at room temperature for 4 min while shaking and then quenched by 125 mM glycine for 10 min. Cross-linking upon homogenization improved nucleus isolation yield and subsequent ChIP-seq efficiency. Cross-linked homogenates were centrifuged at 1,000 3 g for 10 min, and the nuclear pellets were resuspended in NPB and centrifuged at 1,000 3 g for 10 min. Nuclear pellets were thoroughly resuspended in nucleus sorting buffer (NSB) containing 10 mM Tris (pH 7.5), 40 mM NaCl, 90 mM KCl, 2 mM EDTA, 0.5 mM EGTA, 0.1% NP-40, and 0.2 mM DTT) and filtered through 40-mm cell strainers. Isolated nuclei were sorted using a BD FACS Aria II; single nuclei were gated by forward scattered (FSC) and side-scattered (SSC), and mCherry-positive and mCherry-negative populations were collected. GFP fluorescence in the nucleolus can be also used for sorting. WAT samples were prepared by pooling eWAT and iWAT to generate common white adipocyte profiles.
Extracted DNA (1-10 ng, or all if less) was used to generate sequencing libraries by following the ''on-bead'' sequencing library preparation method as previously described (Kang et al., 2015). Briefly, DNA was processed through end repair/phosphorylation using the End-It DNA End-Repair Kit (Epicenter), A-tailing using the Klenow Fragment (New England Biolabs, M0212) and index adaptor ligation using the Quick Ligase (NEB, M2200). AMPure XP beads were left in all the reactions to clean up the DNA using PEG (Polyethylene Glycol 8000)/NaCl solution. After ligation, DNA was eluted from AMPure XP beads and then PCR-amplified using the PfuUltra II Hotstart PCR Master Mix (Agilent Technologies, 600850) for 14-16 cycles for histone ChIP and 18 cycles for low-input titration and PPARg ChIP. Gel electrophoresis and extraction was performed using E-Gel EX Agarose Gels (Invitrogen) and MinElute Gel Extraction (QIAGEN) to select library fragments between 250 and 600 bp. The quantity and quality of the libraries were analyzed by Qubit and Agilent Bioanalyzer, respectively, and the libraries were pooled at a final concentration of 12 pM and sequenced by HiSeq2500 or NextSeq 500 systems.

Statistics
Pairwise comparisons were analyzed by two-tailed unpaired Student's t test, and p < 0.05 was considered statistically significant, unless otherwise specified.

ACCESSION NUMBERS
The accession number for the raw and processed data reported in this paper is GEO: GSE92590.