Transcriptional profiling at whole population and single cell levels reveals somatosensory neuron molecular diversity

The somatosensory nervous system is critical for the organism's ability to respond to mechanical, thermal, and nociceptive stimuli. Somatosensory neurons are functionally and anatomically diverse but their molecular profiles are not well-defined. Here, we used transcriptional profiling to analyze the detailed molecular signatures of dorsal root ganglion (DRG) sensory neurons. We used two mouse reporter lines and surface IB4 labeling to purify three major non-overlapping classes of neurons: 1) IB4+SNS-Cre/TdTomato+, 2) IB4−SNS-Cre/TdTomato+, and 3) Parv-Cre/TdTomato+ cells, encompassing the majority of nociceptive, pruriceptive, and proprioceptive neurons. These neurons displayed distinct expression patterns of ion channels, transcription factors, and GPCRs. Highly parallel qRT-PCR analysis of 334 single neurons selected by membership of the three populations demonstrated further diversity, with unbiased clustering analysis identifying six distinct subgroups. These data significantly increase our knowledge of the molecular identities of known DRG populations and uncover potentially novel subsets, revealing the complexity and diversity of those neurons underlying somatosensation. DOI: http://dx.doi.org/10.7554/eLife.04660.001


Introduction
The somatosensory nervous system comprises diverse neuronal subsets with distinct conduction properties and peripheral and central innervation patterns, including small-diameter, unmyelinated C-fibers, thinly myelinated Aδ-fibers, and large-diameter, thickly myelinated Aα/β-fibers Abraira and Ginty, 2013). Distinct sets of somatosensory neurons are thought to mediate different functional modalities, such as tactile sensation, proprioception, pruriception and nociception. During development, precise expression of neurotrophic receptors and transcription factors at different times controls the differentiation and connectivity of these diverse sensory afferent populations (Marmigere and Ernfors, 2007;Abraira and Ginty, 2013). Detection of thermal, mechanical, and chemical stimuli in the external or internal environment by the somatosensory neurons is mediated by expression of specific molecular transducers at their peripheral nerve terminals. For example, transient receptor potential (TRP) ion channels are activated in response to heat, cold, reactive chemicals, leading to cation influx and action potential generation Dib-Hajj et al., 2010;Dubin and Patapoutian, 2010;Julius, 2013). Given the high degree of cellular diversity of the somatosensory system defined at developmental, anatomical, and functional levels, a classification scheme of different somatosensory neuron subtypes based on the comprehensive set of genes they express is so far lacking. Determining the detailed molecular organization of specific somatosensory neuron subtypes is however necessary for our understanding of their specification, normal function and contribution to disease.
Cell-type specific transcriptome analysis is increasingly recognized as important for the molecular classification of neuronal populations in the brain and spinal cord (Okaty et al., 2011). Fluorescence activated cell sorting (FACS) and other neuron purification strategies coupled with transcriptional profiling by microarray analysis or RNA sequencing has allowed detailed molecular characterization of discrete populations of mouse forebrain neurons (Sugino et al., 2006), striatal projection neurons (Lobo et al., 2006), serotonergic neurons (Wylie et al., 2010), corticospinal motor neurons (Arlotta et al., 2005), callosal projection neurons (Molyneaux et al., 2009), proprioceptor lineage neurons , and electrophysiologically distinct neocortical populations (Okaty et al., 2009). These data have uncovered novel molecular insights into neuronal function. Transcriptional profiling technology at the single cell level is transforming our understanding of the organization of tumor cell populations and cellular responses in the immune system (Patel et al., 2014;Shalek et al., 2014), and has begun to be applied to neuronal populations (Citri et al., 2012;Mizeracka et al., 2013). This technology has been proposed as a useful approach to begin mapping cell diversity in the mammalian CNS (Wichterle et al., 2013).
To begin to define the molecular organization of the somatosensory system, we have performed cell-type specific transcriptional profiling of dorsal root ganglion (DRG) neurons at both whole population and single cell levels. Using two reporter mice, SNS-Cre/TdTomato and Parv-Cre/TdTomato, together with surface Isolectin B4-FITC staining, we identify three major, non-overlapping populations of DRG neurons encompassing almost all C-fibers and many A-fibers. SNS-Cre is a BAC transgenic mouse line expressing Cre under the Scn10a (Nav1.8) promoter (Agarwal et al., 2004) which has been eLife digest In the nervous system, a network of specialized neurons-known as the somatosensory system-carries information about sensations including touch, muscle position, temperature and pain. Distinct sets of somatosensory neurons are thought to carry information about the different types of sensations. In young animals, the precise switching on, or 'expression', of genes controls the formation of the network of neurons. However, it is not known exactly which genes are expressed in what types of neurons, where, or when.
Here, Chiu et al. used a technique called flow cytometry using different fluorescent markers to isolate a group of cells called Dorsal Root Ganglion (DRG) neurons in mice. These neurons have long thread-like fibers that extend from the spinal cord to the skin, muscles and joints all over the body. These fibers carry sensory information to the spinal cord, where it can be relayed to the brain and processed. The experiments compared three distinct types of DRG neuron and found that they differed in their ability to send information to other cells.
Chiu et al. analyzed the expression of all the genes in the three types of DRG neurons. Each type of neuron had distinct groups of genes that were being expressed. Also, several genes that are known to be important for sensation were expressed at different levels in the different types of cells. Next, large numbers of single cells were analyzed to find out the finer details about the three types of neuron. These findings made it possible to further divide the DRG neurons into six distinct subsets that matched previously known groups of somatosensory neurons, and also identified new ones.
Chiu et al.'s findings reveal the complexity and diversity of the neurons involved in carrying information about sensations towards the brain. This is an important step in classifying the nervous system, and uncovers many genes previously not linked to sensation. The next challenges lie in understanding how the expression of these genes in each type of neuron relates to their unique roles. DOI: 10.7554/eLife.04660.002 shown to encompass DRG and trigeminal ganglia nociceptor lineage neurons, and in conditional gene ablation studies affects thermosensation, itch, and pain Lopes et al., 2012;Lou et al., 2013). A widely used Nav1.8-Cre knock-in mouse line also exists (Stirling et al., 2005;Abrahamsen et al., 2008), but differs to some extent from the transgenic SNS-Cre mouse line. We find, for example, that SNS-Cre/TdTomato reporter mice label 82% of total DRG neurons, which is slightly greater than Nav1.8-Cre/TdTomato reporter mice (75%) (Shields et al., 2012), implying capture of a larger neuronal population. Both the SNS-Cre lineage and Nav1.8-Cre lineage neurons include a large proportion of C-fibers and a smaller population of NF200 + A-fibers (Shields et al., 2012). As expected, the majority of TdTomato + cells (90%) in the SNS-Cre/TdTomato line expressed Scn10a transcript encoding Nav1.8 when tested by RNA in situ hybridization . Our second reporter line used Parv-Cre, a knock-in strain expressing Ires-Cre under the control of the Parvalbumin promoter, which has been used in the study of proprioceptive-lineage (large NF200 + A-fiber) neuron function (Hippenmeyer et al., 2005;Niu et al., 2013;de Nooij et al., 2013). Finally we used IB4, which labels the surface of non-peptidergic nociceptive neurons (Vulchanova et al., 1998;Stucky et al., 2002;Basbaum et al., 2009).
Using these mice and the labeling strategies, we were able to FACS purify three major, nonoverlapping populations of somatosensory neurons: (1) IB4 + SNS-Cre/TdTomato + , (2) IB4 − SNS-Cre/ TdTomato + , (3) Parv-Cre/TdTomato + neurons, and analyze their whole transcriptome molecular signatures. Differential expression analysis defined transcriptional hallmarks in each for ion channels, transcription factors and G-protein coupled receptors. Further analysis of hundreds of single DRG neurons identifies distinct somatosensory subsets within the originally purified populations, which were confirmed by RNA in situ hybridization. Our analysis illustrates the enormous heterogeneity and complexity of neurons that mediate peripheral somatosensation, as well as revealing the molecular basis for their functional specialization.

Characterization of distinct DRG neuronal subsets for molecular profiling
To perform transcriptional profiling of the mouse somatosensory nervous system, we labeled distinct populations of DRG neurons. We bred SNS-Cre or Parv-Cre mice with the Cre-dependent Rosa26-TdTomato reporter line (Madisen et al., 2010). In SNS-Cre/TdTomato and Parv-Cre/ TdTomato progeny, robust fluorescence was observed in particular subsets of neurons in lumbar DRG (Figure 1-figure supplement 1).

FACS purification of DRG neuron populations
We performed FACS purification of distinct neuronal populations isolated from both adult (7-20 week old) male and female mice. To avoid multiple rounds of amplification of small quantities of RNA, which would arise from less-abundant neuronal populations such as Parv-cre/TdT + , we chose to pool DRGs from cervical to lumbar regions (C1-L6). DRG cells were enzymatically dissociated and subjected to flow cytometry following DAPI staining to exclude dead cells, and gating on TdTomato hi populations ( Figure 3). This allowed for purification of TdTomato + neuronal somata with minimal contamination from fluorescent axonal debris and non-neuronal cells ( Figure 3A). Analysis of our flow cytometry data showed SNS-Cre/TdT + vs Parv-Cre/TdT + DRG cells matched the proportions ascertained by NeuN co-staining in DRG sections ( Figure 3B). It also illustrates that a large percentage of DAPI − live cells are non-neuronal. IB4-FITC surface staining allowed us to simultaneously purify the distinct IB4 + and IB4 − subsets within the SNS-Cre/TdT + population ( Figure 3C). Forward and side scatter light scattering properties reflect cell size and internal complexity, respectively. SNS-Cre/TdT + neurons displayed significantly less forward scatter and side scatter than Parv-Cre/TdT + neurons ( Figure 3-figure supplement 1). For RNA extraction, DRG populations were sorted directly into Qiazol to preserve transcriptional profiles at the time of isolation.

Transcriptional profile comparisons of purified neurons vs whole DRG
In total, 14 somatosensory neuron samples were FACS purified consisting of 3-4 biological replicates/ neuron population (Table 1). We also analyzed RNA from whole DRG tissue for comparison with the purified neuron samples. Because of the small numbers of cells from individual sensory ganglia and to eliminate the need for significant non-linear RNA amplification, total DRGs from three mice were pooled for each sample; following purification, RNA was hybridized to Affymetrix (Santa Clara, CA) microarray genechips for transcriptome analysis.
Transcriptome comparisons showed few molecular profile differences between biological replicates, but very large inter-population differences (Figure 3-figure supplement 2). Importantly, whole DRG molecular profiles differed substantially from the FACS purified neurons. Myelin associated transcripts (Mpz, Mag, Mpz, Pmp2) that are expressed by Schwann cells, for example, showed significantly higher expression in whole DRG tissue than in purified subsets when expressed as
We also found that many transcription factors were differentially expressed across these three neuron populations (Top 60 most variably expressed TFs, Figure 7C). Many of these have not yet been explored in the somatosensory system, and could play roles in neuronal differentiation and maintenance of cell-type specification during adulthood. For example, Klf7 and Isl2 were expressed at high levels and enriched in SNS-Cre/TdT + neurons (>1.5-fold, p < 0.01, >5000 expression). Based on this analysis, these two transcription factors have now been used to facilitate the trans-differentiation of fibroblasts into nociceptor-like neurons in conjunction with other transcription factors (Wainger et al., 2014).

Research article
Single cell analysis reveals log-scale gene expression heterogeneity We next performed single cell level transcriptional analysis of the three globally characterized DRG populations using Fluidigm parallel qRT-PCR gene expression technology. Distinct transcriptional hallmarks for each FACS purified population were first defined by their differential expression in the microarray datasets (threefold enrichment, Figure 10). Taqman assays were chosen corresponding to these enriched markers, and including two housekeeping genes (Gapdh and Actb), a complete group of 80 assays was used for single cell expression profiling ( Table 2). We first used these assays to analyze 100-cell and 10-cell FACS sorted groups of each neuronal population (Figure 10-figure  supplement 1), confirming the enrichment of various marker transcripts.
We next FACS sorted individual IB4 + SNS-Cre/TdT + , IB4 − SNS-Cre/TdT + , and Parv-Cre/TdT + neurons into 96-well plates for Fluidigm analysis. A total of 334 individual neurons were purified and analyzed (IB4 + SNS-Cre/TdT + cells, n = 132; IB4 − SNS-Cre/TdT + cells, n = 110; and Parv-Cre/TdT + cells, n = 92, Table 1). We found that the expression levels for specific transcripts across single cell datasets often displayed a log-scale continuum (Figure 11). Some transcripts were highly enriched in one subset of single cells (e.g., Mrgprd, Trpv1, P2rx3), but were often nonetheless expressed at detectable levels in other neuronal groups. This continuum of gene expression made it difficult to set 'thresholds' for assigning the presence or absence of a particular transcript. Thus, we focused our definition of distinct Hierarchical clustering of single cell data reveals distinct subgroups Spearman-rank hierarchical clustering was performed on the Fluidigm expression data normalized to gapdh expression (columns represent single cells, Figure 12). This analysis revealed a high degree of heterogeneity of transcriptional expression across the three DRG populations. The vast majority of single cells showed distinct patterns of expression of at least one neuronal transcript, including voltage-gated ion channels (Scn10a, Scn11a, Kcnc2, Kcnv1), ligand-gated channels (P2rx3, Trpv1, Trpa1), and Parvalbumin (Pvalb) indicating minimal amplification noise (Figure 12-figure supplement 1). Unbiased spearman rank analysis revealed seven distinct neuronal subgroups (Figure 12). Six out of seven groups had 24 or more individual cells (group I, 115 cells; group II, 50 cells; group III, 4 cells; group IV, 24 cells; group V, 24 cells; group VI, 24 cells; group VII, 93 cells). We chose one level of sample segregation to analyze, but other cellular subclasses are likely present at lower levels of clustering ( Figure 12). Importantly, when hierarchical clustering was performed on data normalized to To perform Fluidigm single cell analysis, Taqman assays were chosen to cover four categories of population-enriched transcripts first identified by microarray whole transcriptome analysis: (1) SNS-Cre/TdT + (total population) enriched markers (vs Parv-Cre/TdT + neurons), (2) IB4 + SNS-Cre/TdT + enriched markers (vs other 2 groups), (3) IB4 − SNS-Cre/TdT + markers (vs other 2 groups), and (4) Parv-Cre/TdT + markers (vs other 2 groups). Taqman assays for housekeeping genes Gapdh and Actb were also included. DOI: 10.7554/eLife.04660.018 Research article Actb, neuronal subgroups based on gapdh normalization segregated in a similar manner (data not shown). Principal components analysis showed distinct separation of the single cell subgroups along different principal components ( Figure 13A), with Groups I and VII on disparate arms of PC2 (∼5% variation), while Group V neurons segregated along PC3 (∼1.88% variation). Parv-Cre/TdT + neurons mainly fell within group VII (96.7% of the cells, Figure 13B). IB4 + SNS-Cre/TdT + and IB4 − SNS-Cre/TdT + neurons were distributed among subgroups II-VI ( Figure 13B). Therefore, this analysis has uncovered potentially novel subgroups distributed across the SNS-Cre/TdT + population that are not captured by the presence or absence of IB4 staining.

Major characteristics of distinct single cell subgroups
We next analyzed the major characteristics of each DRG single cell subgroup ( Figure 12). Group I neurons were mostly IB4+ nociceptors enriched for Pr2x3, Scn11a, and Mrgprd, markers for nonpeptidergic nociceptors. Our analysis found a large number transcriptional hallmarks for Group I neurons that were as well enriched as the known marker genes, including Grik1, Agtr1a, Pde11a, Ggta1, Prkcq, A3galt2, Ptgdr, Lpar5, Mmp25, Lpar3, Casz1, Slc16a12, Lpyd1, Trpc3, Moxd1, Wnt2b ( Figure 12, and Figure 12-figure supplement 2). Nearest neighbor analysis across all single cells found 13 transcripts with Pearson correlation >0.5 for Mrgprd, further showing a large cohort of genes that segregate in expression within group I neurons ( Figure 14). Group II neurons expressed high levels of Ntrk1 (Trka), Scn10a (Nav1.8), and Trpv1. We also found that they expressed significant levels of Aqp1 (Aquaporin 1), and a major proportion of Group II neurons also expressed Kcnv1 (Kv8.1). Group III consisted of only four cells and we thus did not consider it a true neuronal subclass. Group IV neurons were characterized by the absence of Scn10a (Nav1.8) but the presence of Trpv1 expression (Figure 14-figure supplement 1). Although Group IV neurons were all labeled by SNS-Cre/TdTomato, they did not all show Scn10a gene expression, likely reflecting transient transcription of this transcript that is shutdown in some neurons during development .
Group VI neurons were a distinct population characterized by co-expression of Nppb and IL31ra ( Figure 14). Nppb is a neuropeptide mediator of itch signaling from DRG neurons to spinal cord pruritic circuitry (Mishra and Hoon, 2013). IL31 is a T cell cytokine associated with pruritus, and DRG neurons express the IL31 receptor (Bando et al., 2006;Sonkoly et al., 2006) Co-expression of IL31ra with Nppb suggests that these neurons may be specialized in mediating itch. Group VI neurons also showed high level expression of Scn10a, Scn11a, and Trpv1 relative to the other subsets (Figure 14figure supplement 1, 2).
Fluorescence ISH analysis of subgroup-specific characteristics RNA in situ hybridization (ISH) was used to confirm specific localization of novel Group I, VI, and VII enriched transcripts ( Table 3). The Group I marker Lysophosphatidic acid receptor 3 (Lpar3) labeled a subset of SNS-Cre/TdT + neurons that did not overlap with Parv-Cre/TdT + expression ( Figure 15). We found similar results for Prkcq (PKCθ), another Group I marker (Figure 15-figure supplement 2). The Group VI marker Il31ra also labeled a distinct subset of SNS-Cre/TdT + neurons and did not colocalize with Parv-Cre/TdT + neurons ( Figure 15). By contrast, the group VII marker Gpcr5b did not stain SNS-Cre/TdT + neurons but co-localized well with Parv-Cre/TdT + proprioceptors ( Figure 15). Double ISH found that itch-related Group VI marker IL31ra did not colocalize with group I markers Prkcq or Lpar3, nor with group VII marker Gpcr5b (Figure 15). In confirmation of the Fluidigm data, double ISH found that IL31ra colocalized well with Nppb (Figure 15), thus confirming co-expression of two itch-related markers in the same neuronal subset. Thus, expression profiling at single cell resolution reveals an unsuspected degree of complexity of sensory neurons with elucidation of many new markers and of different neuronal subtypes.

Discussion
Mapping neuronal circuitry and defining the molecular characteristics of specific neurons is critical to understanding the functional organization of the nervous system. The somatosensory system, all of whose primary sensory neurons are of neural crest origin, is highly complex, innervating diverse peripheral tissues and encoding thermal, mechanical, and chemical modalities across a broad range of sensitivities, from innocuous to noxious with different dynamic ranges (Marmigere and Ernfors, 2007;Basbaum et al., 2009;Dubin and Patapoutian, 2010;Li et al., 2011). Sensory neurons are currently classified based on myelination and conduction properties (i.e., C-, Aα/β-or Aδ-fibers) or their selective expression of ion channels (e.g., Trpv1, P2rx3, Nav1.8), neurotrophin receptors (e.g., TrkA, TrkB, TrkC, Ret), cytoskeletal proteins (e.g., NF200, Peripherin), and GPCRs (e.g., Mrgprd, Mrgpra3). However, combining these different classification criteria can result in complex degrees of overlaps, making a cohesive categorization of distinct somatosensory populations challenging. Transcriptome-based analysis has become recently a powerful tool to understand the organization of complex populations, including subpopulations of CNS and PNS neurons (Lobo et al., 2006;Sugino et al., 2006;Molyneaux et al., 2009;Okaty et al., 2009Okaty et al., , 2011Lee et al., 2012;Mizeracka et al., 2013;Zhang et al., 2014). In this study, we performed cell-type specific transcriptional analysis to better understand the molecular organization of the mouse somatosensory system.
Our population level analysis revealed the molecular signatures of three major classes of somatosensory neurons. There were vast transcriptional differences between SNS-Cre/TdTomato + and Parv-Cre/TdTomato + neurons, potentially reflecting their developmental specification into neurons with quite different functional attributes and targets. As SNS-Cre is expressed mainly within TrkAlineage neurons Liu et al., 2010), while Parv-Cre is expressed mainly in proprioceptor-lineage neurons (Hippenmeyer et al., 2005), these two populations reflect archetypical C-and Aα/β-fibers, respectively. Bourane et al previously performed SAGE analysis of TrkA deficient compared to wild-type DRGs, which revealed 240 differentially expressed genes and enriching for nociceptor hallmarks (Bourane et al., 2007). Our FACS sorting and comparative population analysis identified 1681 differentially expressed transcripts (twofold), many of which may reflect the early developmental divergence and vast functional differences between these lineages. While C-fibers mediate thermosensation, pruriception and nociception from skin and deeper tissues, Parv-Cre lineage neurons mediate proprioception, innervating muscle spindles and joints (Marmigere and Ernfors, 2007;Dubin and Patapoutian, 2010). Almost exclusive TRP channel expression in SNS-Cre/TdT + neurons vs Parv-Cre/TdT + neurons may relate to their specific thermosensory and chemosensory roles. We also found significant molecular differences between the IB4 + and IB4 − subsets of SNS-Cre/TdT + neuronal populations. Our analysis identified many molecular hallmarks for the IB4 + subset (e.g., Agtr1a, Casz1, Slc16a12, Moxd1) that are as enriched as the currently used markers (P2rx3, Mrgprd), but whose expression and functional roles in these neurons have not yet been characterized. This analysis of somatosensory subsets covered the majority of DRG neurons (∼95%), with the exception of TrkB + Aδ cutaneous low-threshold fibers (Li et al., 2011), which are NF200 + but we find are negative for SNS-Cre/TdTomato and Parv-Cre/TdTomato (Data not shown).
Single cell analysis by parallel quantitative PCR of hundreds of neurons demonstrated large heterogeneity of gene expression within the SNS-Cre/TdT + neuron population, much greater than the current binary differentiation of peptidergic or non-peptidergic IB4 + subclasses. Interestingly, we found a log-scale continuum for many transcripts, including nociceptive genes (e.g., Trpv1, Trpa1) showing high expression in IB4 + and IB4 − subsets and with lower but not absent levels in Parv-Cre/TdT + cells. This may reflect transcriptional shut-down of genes during differentiation. Unbiased hierarchical clustering analysis of single cell data revealed at least six distinct neuronal subgroups. These findings reveal new molecular characteristics for known neuron populations and also uncover novel neuron subsets: Group I neurons consist of Mrgprd + Nav1.8 + P2rx3 + Nav1.9 + cells, which are polymodal non-peptidergic C-fibers, for which we identify a panoply of new molecular markers. Group II consists of Trka hi Nav1.8 + Trpv1 + Aquaporin + neurons, matching known characteristics of thermosensitive C-fibers; many of these expressed Kcnv1. Group V consists of Th + Nav1.8 + Trka − Trpv1 − cells, matching characteristics of C-fiber low-threshold mechanoreceptors (C-LTMRs) (Li et al., 2011). Group VII consists of Pvalb + Runx3 + Etv1 + neurons, which are mostly proprioceptor-lineage neurons for which we identified 12 molecular markers. Lee et al recently performed transcriptome analysis of purified TrkC-lineage proprioceptive neurons in the presence or absence of NT-3 signaling  and we note that Group VII neurons were similar to TrkC lineage cells in gene expression (Pth1r, Runx3, Pvalb). Group IV consists of Trpv1 + Nav1.8 − neurons, which may represent a unique functional subgroup; Wood et al found that mice depleted for Nav1.8-lineage neurons retained a TRPV1 responsive subset (Abrahamsen et al., 2008). We uncover a new subset of neurons, Group VI, which appears to represent pruriceptive neurons based on their co-expression of IL31ra and Nppb.  -ATGTTCCTGGT  5′-TCACCAATGGTG  1233   Lpar3  5′-TTGTGATCGTCCTGTGCGTG  5′-GCCTCTCGGTATTGCTGTCC  870   TdTomato 5′-ATCAAAGAGTTCATGCGCTTC  5′-GTTCCACGATGGTGTAGTCCTC  615   Prkcq  5′-TCTTGCTGGGTCAGAAGTACAA  5′-TCTGTGGTTGAGTGGAATTGAC  919   Nppb  5′-TGAAGGTGCTGTCCCAGATGATTC 5′-GTTGTGGCAAGTTTGTGCTCCAAG 545   Il31ra  5′-CTCCCCTGTGTTGTCCTGAT  5′-TTCATGCCATAGCAGCACTC  559 Probesets used for RNA in situ hybridization analysis. Listed are gene symbols, sequences for forward and reverse primers, and resulting probe lengths. DOI: 10.7554/eLife.04660.027 While preparing this manuscript, several papers performing expression profiling of postnatal adult somatosensory neurons were published (Goswami et al., 2014;Thakur et al., 2014;Usoskin et al., 2014). We note that each study utilized distinct methodologies from our work: Goswami et al profiled Trpv1-Cre/TdTomato + neurons compared to Trpv1-diptheria toxin depleted whole DRG tissue (Goswami et al., 2014). Thakur et al performed magnetic bead selection to remove DRG nonneuronal cells, performing RNA-seq on residual cells enriched for neurons (Thakur et al., 2014). Usoskin et al performed an elegant single cell RNA-seq on hundreds of DRG neurons that were picked in an unbiased fashion robotically (Usoskin et al., 2014). We believe that our study possesses has unique features and certain advantages, as well as limitations, in relation to these studies. In our study, we performed whole population analysis of three major DRG subsets, which we followed by single cell granular profiling of hundreds of cells from the same populations. We believe advantages of beginning with a differential analysis of well-defined populations is that this facilitates correlation of the data back to function and enables a highly specific comparative analysis to be performed between major neuronal populations. Further definition of each population by shifting to a single cell strategy then allows identification of functionally defined groups of cells. The same advantages of a population based strategy is also a caveat, in that it could introduce pre-determined bias, which Usoskin et al purposely avoided by randomly picking single DRG neurons as a starting point. We note that our analysis is the only one so far to utilize parallel qRT-PCR of single cells, which we demonstrate is able to detect logscale differences in expression (Figure 11), and may have better detection sensitivities than single cell RNA-seq. In a comparison of the overall datasets, we produce some similar findings with Usoskin et al, including the finding of a distinct pruriceptive population (IL31ra + Group VI). However, our analysis showed higher definition of markers present in Group I and Group VII neurons, as well as Group IV neurons (which was not previously described), while Usoskin et al detected TrkB + neurons whereas we did not, as these cells are not included in our sorted populations. We believe that our study and these recently published papers will be useful foundation and resource for future analysis of the molecular determinants of sensory neuron phenotype.
Somatosensory lineage neurons subserve multiple functions: nociceptive, thermoceptive, pruriceptive, proprioceptive, and tactile. It is likely that additional granular analysis at the single cell level will further refine these subsets and uncover new molecular subclasses of neurons. As genomic technologies and single cell sorting methodologies evolve current limitations (e.g., RNA quantity) will be overcome and future analysis of thousands of single cells from distinct anatomical locations, developmental time-points, or following injury/inflammation will begin to reveal even more critical information about the somatosensory system. This transcriptional analysis illustrates an unsuspected degree of molecular complexity of primary sensory neurons within the somatosensory nervous system. Functional studies are now needed to analyze the roles of the many newly identified sensory genes in neuronal specification and action. As we begin to explore the function, connectivity and plasticity of the nervous system we need to recognize this needs a much more granular analysis of molecular identity, since even the presumed functionally relatively simple primary sensory neuron, is extraordinarily complex and diverse.

Mice
Parvalbumin-Cre (Hippenmeyer et al., 2005), ai14 Rosa26TdTomato mice (Madisen et al., 2010) were purchased from Jackson Labs (Bar Harbor, ME) and bred in the animal facility at Boston Children's Hospital. SNS-Cre transgenic mice (Agarwal et al., 2004) were from Rohini Kuner (University of Heidelberg, Germany). All animal experiments were conducted according to institutional animal care and safety guidelines at Boston Children's Hospital and at Harvard Medical School.

Ethics statement
All studies were conducted under strict review and guidelines according to the Institutional Animal Care and Use Committee (IACUC) at Boston Children's Hospital, which meets the veterinary standards
For whole mount imaging, lumbar dorsal root ganglia, trigeminal ganglia, sciatic nerve, plantar skin, abdominal walls were dissected and mounted in PBS under glass coverslips. Confocal microscopy was conducted using a LSM700 laser-scanning confocal microscope (Carl Zeiss, Germany), using a 10× Zeiss EC plan-NEOFLUAR dry and a 40× Zeiss plan-APOCHROMAT oil objectives, with Z-stacks imaged at 1 μm steps, collected of up to 200 μm total. Three dimensional reconstructions were rendered as maximum projection images using Volocity software (Perkin Elmer, Waltham, MA).

Neuronal cultures and electrophysiology
For electrophysiological analysis of Parv-Cre/TdTomato and SNS-Cre/TdTomato neurons, DRGs were dissected, placed in HBSS, incubated for 90 min with 5 mg/ml collagenase, 1 mg/ml dispase II at 37°C. Cells were triturated in the presence of DNase I inhibitor, centrifuged through 10% BSA, resuspended in 1 ml of neurobasal medium, 10 μM Ara-C (Sigma-Adrich), 50 ng/ml NGF, 2 ng/ml GDNF (Life Technologies), and plated onto 35-mm tissue culture dishes coated with 5 mg/ml laminin. Cultures were incubated at 37°C under 5% CO 2 . Recordings were made at room temperature within 24 hr of plating. Whole-cell recordings were made with an Axopatch 200A amplifier (Molecular Devices, Sunnyvale, CA) and patch pipettes with resistances of 2-3 MΩ. The pipette capacitance was decreased by wrapping the shank with parafilm and compensated using the amplifier circuitry. Pipette solution was 5 mM NaCl, 140 mM KCl, 0.5 mM CaCl 2 , 2 mM MgCl 2 , 5 mM EGTA, 10 mM HEPES, and 3 mM Na 2 ATP, pH 7.2, adjusted with NaOH. The external solution was 140 mM NaCl, 5 mM KCl, 2 mM CaCl 2 , 2 mM MgCl 2 , 10 mM HEPES, and 10 mM D-glucose, pH 7.4, adjusted with NaOH (Sigma-Aldrich). Current clamp recordings were made with the fast current-clamp mode. Command protocols were generated and data digitized with a Digidata 1440A A/D interface with pCLAMP10 software. Action potentials (AP) were evoked by 5 ms depolarizing current pulses. AP half width was measured at halfmaximal amplitude. 500 nM Tetrodotoxin (TTX) were applied to block TTX-sensitive Na + currents.

Flow cytometry of neurons
DRGs from cervical (C1-C8), thoracic (T1-T13), and lumbar (L1-L6) segments were pooled from different fluorescent mouse strains, consisting of 7-20 week age-matched male and female adult mice (see Table 1). DRGs were dissected, digested in 1 mg/ml Collagenase A/2.4 U/ml Dispase II (enzymes from Roche), dissolved in HEPES buffered saline (Sigma-Aldrich) for 70 min at 37°C. Following digestion, cells were washed into HBSS containing 0.5% Bovine serum albumin (BSA, Sigma-Aldrich), filtered through a 70 μm strainer, resuspended in HBSS/0.5% BSA, and subjected to flow cytometry. Cells were run through a 100 μm nozzle at low pressure (20 p.s.i.) on a BD FACS Aria II machine (Becton Dickinson, Franklin Lakes, NJ, USA). A neural density filter (2.0 setting) was used to allow visualization of large cells. Note: Initial trials using traditional gating strategies (e.g., cell size, doublet discrimination, and scatter properties) did not eliminate non-neuronal cells. An important aspect of isolating pure neurons was based on the significantly higher fluorescence of the Rosa26-TdTomato reporter in somata compared to axonal debris, allowing accurate gating for cell bodies and purer neuronal signatures. For microarrays, fluorescent neuronal subsets were FACS purified directly into Qiazol (Qiagen, Venlo, Netherlands). To minimize technical variability, SNS-Cre/TdTomato (total, IB4 + , IB4 − ) and Parv-Cre/TdTomato neurons were sorted on the same days. FACS data was analyzed using FlowJo software (TreeStar, Ashland, OR, USA). For Fluidigm analysis, single cells or multiple cell groups from different neuronal populations were FACS sorted into individual wells of a 96-well PCR plate containing pre RNA-amplification mixtures. For microscopy, fluorescent neurons or axons were FACS purified into Neurobasal + B27 supplement + 50 ng/ml NGF, plated in poly-d-lysine/laminin-coated 8-well chamber slides (Life Technologies) and imaged immediately or 24 hr later by Eclipse 50i microscope (Nikon). Flow cytometry was performed in the IDDRC Stem Cell Core Facility at Boston Children's Hospital.

Single neuron analysis
Flow cytometry was used to purify 100 cell groups, 10 cell groups, or single cells into 96-well plates containing 9 μl of a pre-amplification containing reaction mix from the CellsDirect One-Step qRT-PCR Kit (Life Technologies) mixture with pooled Taqman assays (purchased as optimized designs from Life Technologies). Superscript III RT Taq mix (Life Technologies) was used for 14 cycles to pre-amplify specific transcripts. We found that not every FACS sorted-well contained a cell; thus, a pre-screening method was utilized, where 2 μl from each well was subjected to two-step quantitative PCR (qPCR) for Actb (β-Actin) using fast SYBR green master mix (Life Technologies) on an Applied Biosystems 7500 machine (Applied Biosystems, Waltham, MA) using the following primers: 5′-acactgtgcccatctacgag-3′ and 5′-gctgtggtggtgaagctgta-3′. Wells showing Actb Ct values <20 were picked for subsequent analysis. Using the Biomark Fluidigm microfluidic multiplex qRT-PCR platform, pre-amplified well products were run on 96.96 dynamic arrays (Fluidigm, San Francisco, CA) and assayed against 81 Taqman assays (Life Technologies). Specific assays were chosen based on differential expression by microarray analysis, functional category, and housekeeping genes ( Table 2). Ct values were measured by Biomark software, relative transcript levels determined by 2 −ΔCt normalization to Gapdh or Actb transcript levels. For each transcript, outliers of 5 standard deviations from the mean were excluded (set to 0) from our analysis. A total of 334 single cells were analyzed, consisting of IB4 + SNS-Cre/TdT + (n = 132), IB4 − SNS-Cre/TdT + (n = 110), Parv-Cre/TdT + (n = 92) neurons. Spearman rank average-linkage clustering was performed with the Hierarchical Clustering module from the GenePattern genomic analysis platform and visualized using the Hierarchical ClusteringViewer module of GenePattern (MIT Broad Institute). A specific level of hierarchical clustering was used to ascertain clustered neuron subgroups. The Population PCA tool was used for principal components analysis-http://cbdm.hms.harvard.edu/LabMembersPges/SD.html. Pearson correlation analysis of specific transcripts to all 80 probes across the single cell expression dataset was generated using nearest neighbor analysis by the GenePattern platform. Histogram plots of single cell data were generated in Excel (Microsoft, Redmond, WA, USA). Dot plots showing single cell transcript data across subgroups was generated in Prism software (Graphpad).

Statistical analysis
Sample sizes for experiments were chosen according to standard practice in the field. 'n' represents the number of mice, samples, or cells used in each group. Bar and line graphs are plotted as mean ± standard error of the mean (s.e.m.). Data meet the assumptions of specific statistical tests chosen, including normality for parametric or non-parametric tests. Statistical analysis of electrophysiology, neuronal cell counts, and flow cytometry were by One-way ANOVA with Tukey's posttest or by unpaired, Student's t test. Data was plotted using Prism software (Graphpad).
RNA processing, microarray hybridization and bioinformatics analysis RNA was extracted by sequential Qiazol extraction and purification through the RNeasy micro kit with on column genomic DNA digestion according to manufacturer's instructions (Qiagen). RNA quality was determined by Agilent 2100 Bioanalyzer using the Pico Chip (Agilent, Santa Clara, CA, USA). Samples with RIN >7 were used for analysis. RNA was amplified into cDNA using the Ambion WT expression kit for Whole Transcript Expression Arrays (Life Technologies), with Poly-A controls from the Affymetrix Genechip Eukaryotic Poly-A RNA control kit (Affymetrix, Santa Clara, CA, USA). The Affymetrix Genechip WT Terminal labeling kit was used for fragmentation and biotin labeling. Affymetrix GeneChip Hybridization control kit and the Affymetrix GeneChip Hybridization, wash, stain kit was used to hybridize samples to Affymetrix Mouse Gene ST 1.0 GeneChips, fluidics performed on the Affymetrix Genechip Fluidics Station 450, and scanned using Affymetrix Genechip Scanner 7G (Affymetrix). Microarray work was conducted at the Boston Children's Hospital IDDRC Molecular Genetics Core. For Bioinformatics analysis, Affymetrix CEL files were normalized using the Robust Multi-array Average (RMA) algorithm with quantile normalization, background correction, and median scaling. Hierarchical clustering and principal-component analysis (PCA) was conducted on datasets filtered for mean expression values greater than 100 in any population (Mingueneau et al., 2013), with elimination of noisy transcripts with an intra-population coefficient of variation (CoV) <0.65. Spearmanrank average linkage analysis was conducted on the top 15% most variable probes across subsets (2735 transcripts) using the Hierarchical Clustering module, and heat-maps generated using the Hierarchical ClusteringViewer module of the GenePattern analysis platform (Broad Institute, MIT). The Population PCA tool was used (http://cbdm.hms.harvard.edu/LabMembersPges/SD.html). For pathway enrichment analysis, pairwise comparisons of specific neuronal datasets (e.g., Parv-Cre/TdTomato vs SNS-Cre/TdTomato) were conducted. Differentially expressed transcripts (twofold, p < 0.05) were analyzed using Database for Annotation, Visualization and Integrated Discovery (DAVID) (http://david.abcc.ncifcrf.gov). Pathway enrichment p-values for GO Terms (Biological Processes) or Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were plotted as heat-maps using the HeatmapViewer module of GenePattern. Differentially expressed transcripts were illustrated using volcano plots, generated by plotting fold-change differences against comparison p-values or −log (p-values). Transcripts showing low intragroup variability (CoV < 0.65) were included in this differential expression analysis. Specific gene families, including ion channels (calcium, sodium, potassium, chloride, ligand-gated, TRP and HCN channels), GPCRs and transcription factors were highlighted on volcano plots.

Major datasets
The following datasets were generated: