Targeted molecular profiling of rare cell populations identifies olfactory sensory neuron fate and wiring determinants

Determining the molecular properties of neurons is essential to understand their development, function and evolution. We used Targeted DamID (TaDa) to characterize RNA polymerase II occupancy and chromatin accessibility in selected Ionotropic Receptor (IR)-expressing sensory neurons in the Drosophila antenna. Although individual populations represent a minute fraction of cells, TaDa is sufficiently sensitive and specific to identify the expected receptor genes. Unique Ir expression is not linked to substantial differences in chromatin accessibility, but rather to distinct transcription factor profiles. Heterogeneously-expressed genes across populations are enriched for neurodevelopmental factors, and we identify functions for the POU-domain protein Pdm3 as a genetic switch of Ir neuron fate, and the atypical cadherin Flamingo in segregation of neurons into discrete glomeruli. Together this study reveals the effectiveness of TaDa in profiling rare neural populations, identifies new roles for a transcription factor and a neuronal guidance molecule, and provides valuable datasets for future exploration.


Introduction 68 69
Nervous systems are composed of vast numbers of neuron classes with specific 70 structural and functional characteristics. Determining the molecular basis of these 71 properties is essential to understand their emergence during development and 72 contribution to neural circuit activity, as well as appreciating how neurons exhibit 73 plasticity over both short and evolutionary timescales. 74 The olfactory system of adult Drosophila melanogaster is a powerful model 75 to identify and link molecular features of neurons to their anatomy and physiology. 76 The Drosophila nose (antenna) contains ∼50 classes of olfactory sensory neurons 77 (OSNs), which develop from sensory organ precursors cells specified in the larval 78 antennal imaginal disc 1-3 . Each OSN is characterized by the expression of a 79 unique olfactory receptor (or, occasionally, receptors), of the Odorant Receptor 80 (OR) or Ionotropic Receptor (IR) families. These proteins function -together with 81 broadly-expressed, family-specific co-receptors -to define neuronal odor-82 response properties 4-7 . Moreover, the axons of all neurons expressing the same 83 receptor(s) converge to a discrete and stereotyped glomerulus within the primary 84 olfactory center (antennal lobe) in the brain, where they synapse with both local 85 interneurons (LNs) and projection neurons (PNs) 3,8,9 . While the unique properties 86 of individual OSN classes are assumed to reflect distinct patterns of gene 87 expression, it is unclear how many molecular differences exist between sensory 88 neuron populations, which types of genes (beyond receptors) distinguish OSN 89 classes, and whether gene expression differences reflect population-specific 90 chromatin architecture. 91 Exploration of the molecular properties of specific neuron types has been 92 revolutionized by single-cell RNA sequencing (scRNA-seq) technologies 10 . 93 However, the peripheral olfactory system of Drosophila poses several challenges 94 to single-cell isolation: OSNs are very small (~2-3 µm soma diameter) and up to 95 four neurons are tightly embedded under sensory sensilla (a porous cuticular hair 96 that houses the neurons' dendrites) along with several support cells that seal 97 each neuronal cluster from neighboring sensilla 11,12 . Furthermore, each OSN 98 class represents only a very small fraction of cells of the whole antenna, as few 99 3 as 5-10 neurons of a total of ~3000 neurons and non-neuronal cells 11 . Recent 100 work 13 -discussed below -successfully profiled RNA levels in OSNs at a mid-101 developmental time-point (42-48 h after puparium formation (APF)), before these 102 cells become encased in the mature cuticle. 103 To characterize the molecular profile of specific populations of OSNs we 104 sought to use the Targeted-DNA adenine methyltransferase identification (TaDa) 105 method 14 . TaDa entails expression of an RNA polymerase II subunit (Rp215, 106 hereafter PolII) fused to the E. coli DNA adenine methyltransferase (Dam). This 107 complex (or, as control, Dam alone) is expressed within a specific cell population 108 ("targeted") under the regulation of a cell type-specific promoter using the 109 Gal4/UAS system. When bound to DNA, Dam:PolII methylates adenine within 110 GATC motifs; enrichment of methylation at genes compared to free Dam control 111 samples provides a measure of RNA PolII occupancy, which is correlated with 112 genes' transcriptional activity 14 ( Figure 1A). In addition to indicating RNA PolII 113 occupancy within a given cell type, TaDa datasets can also provide information 114 on chromatin accessibility (CATaDa) 15 ( Figure 1A). Analysis of the methylation 115 patterns across the genome generated by untethered Dam is assumed to reflect 116 the openness of chromatin, analogous to -and in good agreement with -ATAC-117 seq datasets 15,16 . 118 The selectivity of expression of Dam:PolII or Dam avoids the requirement 119 for cell isolation prior to genomic DNA extraction. Previous applications of TaDa 120 in Drosophila have profiled relatively abundant cell populations in central brain 121 tissue, the intestine or the male germline 14,[17][18][19] . Although OSN classes can be 122 effectively labeled using transgenes containing promoters for the corresponding 123 receptor genes, it was unclear whether TaDa would be sufficiently sensitive to 124 measure transcriptional activity from an individual population of sensory neurons 125 within the antenna. We therefore set out to profile several classes of  expressing OSNs to test the approach and identify novel factors that distinguish 127 the properties of these evolutionarily closely-related neuronal subtypes.

129
Results 130 131 Targeted DamID of OSN populations 132 133 To test the feasibility of TaDa for profiling Drosophila OSNs, we first focused on 134 the Ir64a population, which comprises ~16 neurons that are housed in sensilla in 135 the sacculus (a chamber within the antenna), where they detect acidic odors 20 . 136 We first verified that Ir64a promoter-Gal4 (hereafter, Ir64a-Gal4) induction of Dam 137 or Dam:PolII did not impact the expression or localization of the IR64a receptor 138 ( Figure 1B) or the projection of these neurons to the antennal lobe ( Figure 1C). 139 We proceeded to collect triplicate samples of ~2000 adult antennae and 140 processed these for TaDa (see Methods). This experimental design is expected 141 to integrate PolII occupancy patterns from mid-pupae (when Ir-Gal4 expression 142 initiates) to mature OSNs. As a positive control, we first examined the mapping of 143 reads of methylated DNA fragments to the Ir64a gene. Dam:PolII consistently 144 accessed this gene region at higher levels than Dam alone, particularly near the 145 5' end of the locus ( Figure 1D), consistent with previously-described genome-146 wide PolII occupancy patterns 14 . 147 We extended the TaDa approach using drivers to target six additional 148 individual OSN classes, including all acid-sensing Ir populations (Ir31a, Ir75a, 149 4 Ir75b, Ir75c, Ir84a) 21,22 , as well as the hygrosensory Ir40a neurons 23,24 ; these 150 populations range from ∼8 to ∼25 cells per antenna 25 . As expected, the 151 enrichment of Dam:PolII access to the Ir64a gene was specific to the Ir64a 152 neurons ( Figure 1E). We quantified "PollI occupancy scores" (OSs) by calculating 153 the log 2 ratio of the (normalized) number of reads from the Dam:PolII samples 154 compared to the (normalized) number of reads in control Dam samples across 155 the entire gene body. In Ir64a neurons, the Ir64a gene OS was significantly 156 positive (OS = 0.20; FDR = 4.26e-10), while it was not significantly different from 157 zero in the other six neuron populations ( Figure 1E, bottom).

158
Similarly, OSs for other Ir genes were significantly positive in the 159 corresponding neuronal population, but not in those in which the receptors are not 160 expressed ( Figure 1F). We noted two exceptions: first, in Ir31a neurons, Ir40a 161 had a positive OS (0.12; FDR = 1.93e-13), although it was much lower than that 162 of Ir31a (OS = 0.41; FDR = 2.67e-6) ( Figure 1F). Second, the Ir75a, Ir75b and 163 Ir75c genes had positive OSs in all three of the corresponding neuron populations 164 ( Figure 1F). This latter result is most likely due to these tandemly-clustered genes 165 having overlapping transcription units 25 .

166
To further examine the specificity of TaDa in reporting on neuron 167 population-specific gene expression, we plotted OSs for all members of the Ir 168 repertoire, as well as Or and Gustatory receptor (Gr) families. The vast majority of 169 these genes are expressed only in other antennal sensory neuron populations, 170 distinct chemosensory organs and/or life stages [26][27][28] . Concordantly, only very few 171 genes show significant OSs in any of the seven populations of Ir neurons 172 analyzed by TaDa ( Figure 1F and Figure S1). Two prominent examples are Ir8a 173 and Ir25a, which encode co-receptors for subsets of tuning IRs 29 . Ir8a has a 174 significant OS in all Ir populations except for Ir40a neurons, which is consistent 175 with the selective expression and function of IR8a in acid-sensing OSNs but not 176 hygrosensory neurons 29 ( Figure 1F). IR25a is broadly-expressed in most antennal 177 neurons 29,30 -although it may have a role in only a subset of these -which is 178 reflected in a significant OS for Ir25a in all seven populations ( Figure 1F). IR40a 179 is co-expressed with another receptor, IR93a 23,24 , but we did not detect a 180 significant OS for Ir93a in these neurons ( Figure S1 and Table S1). This negative 181 result may reflect the observation that while IR93a protein is readily detected 23 , 182 Ir93a transcripts levels (and so potentially PolII occupancy) are extremely low 30,31 . 183 Of the other chemosensory receptors displaying unexpected significant 184 OSs in these Ir neurons, many of these genes are located in the introns of other 185 genes (e.g., Ir47a, within the slowpoke 2 locus, which encodes a nervous-system 186 expressed potassium channel 32 ) or have overlapping transcripts with other genes 187 (e.g., Gr2a, which overlaps with futsch, a gene encoding a neuronal microtubule-188 binding protein 33 ) ( Figure S1). In these cases, calculation of a specific OS for the 189 chemosensory genes is impossible, and we suspect that in many cases the OS 190 reflects transcription of the non-receptor gene. OSs in any neuron population ( Figure 1G). Similarly, markers for other support 204 cell types, such as nompA (thecogen cell) 37,39 , and the glial marker repo are not 205 significantly occupied ( Figure 1G).

206
Electrophysiological experiments indicate that antennal OSNs are 207 cholinergic 40 . In line with this functional property, genes encoding the vesicular 208 acetylcholine transporter (VAChT) and choline acetyltransferase (ChAT) have 209 significant OSs in all populations, while glutamatergic and GABAergic neuron 210 markers (vesicular glutamate transporter (VGlut) and glutamic acid decarboxylase 211 1 (Gad1)) do not ( Figure 1G). We note, however, that a GABA receptor subunit 212 (encoded by GABA-B-R2) -which is heterogeneously-expressed and mediates 213 population-specific presynaptic gain control in different Or OSNs 41 -displays 214 significant, but variable, OSs in different Ir populations ( Figure 1G). datasets, representing 20-27% of all genes ( Figure 2A; Dataset S1). To 232 investigate patterns of similarity in global OSs among these populations, we 233 carried out hierarchical clustering of the mean OSs for all genes in the genome 234 ( Figure 2B). This analysis revealed an interesting correspondence in the 235 clustering of OSs with the phylogeny of the corresponding receptor genes 25 : 236 neurons expressing the most recently-duplicated receptor genes, Ir75b and Ir75c, 237 were clustered together, with the next most similar neuron population, Ir75a, 238 expressing the likely ancestral member of this tandem gene array 25 . Ir40a and 239 Ir64a neurons clustered together away from the other acid-sensing populations, 240 possibly reflecting the development of these neurons within the sacculus, rather 241 than in sensilla located on the antennal surface. 242 Different OSN populations are characterized both by the sensory receptors 243 they express and by their distinct morphological properties, notably their axonal 244 projections to discrete glomeruli within the antennal lobe. Genetic studies have 245 identified several secreted and transmembrane proteins that play roles in different 246 steps of the neuronal guidance process 3,8,9 , although our understanding remains 247 incomplete and many key molecules likely remain to be discovered. To gain 248 insight into the potential complexity and neuron-type specificity of the cell surface 249 6 proteome, we examined OSs of ∼2500 cell surface/secreted proteins (CSSPs) 250 annotated in the genome (from the FlyXCDB database 44 ) ( Figure 2C and Table  251 S2). Each neuronal population is predicted to express 244-389 CSSP genes 252 (Table S3): these are likely to encode proteins involved in diverse structural and 253 functional properties of OSNs, including sensory transduction, dendrite 254 morphogenesis, axon guidance, and synaptic maturation and transmission. We 255 did not attempt to further categorize the expressed genes, however, as most of 256 them have unknown in vivo roles. While each neuron population had a unique 257 combination of PolII-occupied CSSP genes, only a small number of these were 258 unique to a single dataset ( Figure 2C; Table S4). One exception is the Ir40a 259 population (70 uniquely occupied genes); this observation could reflect the 260 distinct sensory modality mediated by these neurons (hygrosensation) and/or 261 their unusually distributed glomerular innervations in the antennal lobe 22 . 262 The expression of both chemosensory receptors and guidance molecules 263 depends upon regulatory networks of transcription factors 2,45 . We therefore 264 assessed global patterns of PolII occupancy of the set of 754 predicted 265 Drosophila transcription factors across the seven Ir populations (using the FlyTF 266 database 46 ; Table S5). Half of these genes (379/754) displayed significant 267 occupancy in at least one population of Ir neuron, with individual OSN classes 268 displaying 190-286 occupied transcription factor genes ( Figure 2D and Table S6-269 S7). There was considerable variation in Dam:PolII occupancy of transcription 270 factor genes across the neuron populations ( Figure 2D), consistent with each 271 neuronal subtype possessing a unique and complex gene regulatory network to 272 support its specific differentiation properties.

273
Many of the transcription factors with known function in OSN development 274 are broadly-expressed across OSNs, such as Acj6, Fer1 and Onecut 45,47 ; 275 consistently, these genes display positive OS in all seven neuron populations 276 (Table S6). One exception is Engrailed (En), which is expressed and required in 277 just a subset of OSNs 48,49 . In our TaDa datasets, we observed that en displays 278 significant OSs in Ir31a, Ir40a and Ir64a populations ( Figure S2A), consistent with 279 the robust detection of En protein in only these three Ir neuron classes ( Figure  280 S2B-C).

282
Comparative chromatin accessibility in OSNs 283 284 We next performed CATaDa analysis through investigation of the Dam-alone 285 datasets, by using these to generate chromatin accessibility maps similar to 286 ATAC-seq and FAIRE-seq 15 . Genome-wide visualization of the peaks of 287 methylated sequences across the genome (see Methods) revealed 288 heterogeneous patterns of chromatin accessibility, with the expected overall 289 decrease in open chromatin towards the heterochromatic centromeric regions of 290 the chromosome arms ( Figure 3A).

291
To examine chromatin accessibility patterns at a gene level, we focused on 292 CATaDa signals at the Ir loci that distinguish different populations -i.e., the 293 seven tuning Ir genes and Ir8a -whose cell type-specific transcriptional activity is 294 well-established 25,29,30 . For each gene, peaks were present at the presumed 295 promoter region, as well as in upstream regions, which may represent enhancers.

296
Comparisons across neuron populations revealed that peak distribution was very 297 similar across all eight genes ( Figure 3B). This observation indicates that neuron-298 specific differences in Ir gene transcription are not due to differences in promoter 299 7 accessibility and also not reflecting dramatic differences in enhancer accessibility. 300 The only exceptions were the presence of peaks upstream of Ir75a and Ir75c that 301 were unique to their respective neuron populations (q-value < 0.05). Notably, 302 these peaks lie within sequences that recapitulate receptor expression patterns in 303 reporter transgenes 20,22,25 ( Figure 3B), raising the possibility that they reflect 304 enhancers that are accessible/functional only in these Ir neuron populations.

306
Identifying genes displaying differential expression in Ir populations 307 308 The analyses of genes encoding chemosensory receptors, transcription factors 309 and CSSPs demonstrate the use of TaDa OSs to identify genes that are 310 expressed (or not expressed) within specific OSN populations. While differences 311 in OS for a given gene across populations may suggest heterogeneous 312 expression levels, we sought to test for PolII occupancy differences between 313 populations. We implemented a different analysis method that used variation in 314 read-depth in the triplicate paired Dam-alone/PolII:Dam TaDa datasets along the 315 gene body ( Figure 4A and Methods). We identified candidates that had at least 316 two GATC motifs for which the excess of PolII:Dam read-depth compared to the 317 Dam-alone read-depth significantly varied in the same direction between two or 318 more population datasets (see Methods). With this criterion, 1694 genes exhibited 319 significant variation across the seven populations, indicating that only a relatively 320 small proportion of genes (~8%; Dataset S2) differ among these neuron types. 321 This number is likely to be an overestimate as it includes overlapping and/or 322 intronic genes that, as described above, cannot be parsed further using TaDa  323 data. Importantly this subset encompasses all of the expected Ir genes (see 324 Figure 4D and Dataset S2). To examine how each individual Ir neuron dataset 325 varied in comparison to the six others, we also quantified significant differences 326 between each pair of neural populations ( Figure 4B; Dataset S3). Despite the 327 methodological differences with the TaDa analysis that produces OSs, the 328 proportion of genes contributing to the total number of pairwise difference 329 approximates the clustering of global OSs ( Figure 2B). For example, comparisons 330 among the closely clustered Ir75a/Ir75b/Ir75c datasets result in very few of the 331 total differences (0.4-1.7%). Conversely, slightly more than half (51%) of the total 332 differences in the Ir84a dataset arise from comparisons with the distantly 333 clustered Ir40a dataset. 334 We investigated the general characteristics of the 1694 differentially-335 occupied genes by performing Gene Ontology analyses, finding highly significant 336 enrichment in neural, neural development, and signaling categories ( Figure 4C). 337 This enrichment is consistent with these genes underlying the known unique 338 physiological and anatomical properties of these neuron types, and indicates that 339 our comparative TaDa approach would be valuable in identifying candidates that 340 contribute functionally to these unique properties.

341
To establish a priority list for follow-up experiments, we ordered these 342 genes by the total number of differentially-occupied GATC motifs, reasoning that 343 this would highlight those displaying differential occupancy along a broad region 344 of the gene body ( Figure 4D and Table S8). While this "ranking" is biased towards 345 longer genes (which on average will have more GATC motifs), we emphasize that 346 this does not exclude interest in genes with fewer GATC motifs; for example, Ir 347 genes are distributed widely in this list ( Figure 4D). We subsequently focused on 348 experimental validation of the top-ranked transcription factor gene, pou domain 349 8 motif 3 (pdm3), and the top-ranked CSSP gene, flamingo (fmi; also known as 350 starry night (stan), flybase.org/reports/FBgn0024836), as described below.

352
Pdm3 acts as a genetic switch to distinguish Ir75a and Ir75b neuron fate 353 354 The robust signal displayed by pdm3 for between-population differences is 355 consistent with it having a significant OS only in Ir75a neurons ( Figure 4E). 356 Notably, from the pan-transcription factor survey, pdm3 is the sole gene that is 357 uniquely occupied in this neuron population ( Figure 2D and Table S4). Pdm3 is a 358 nervous-system enriched transcription factor 50-52 , although its precise in vivo 359 binding specificity is unknown. In another olfactory organ, the maxillary palp, 360 Pdm3 is expressed in multiple classes of Or neurons and functions in controlling 361 receptor expression and axon guidance 52 , but its role in Ir neurons is unknown. 362 Using Pdm3 antibodies, we first analyzed protein expression in antennae in which 363 we co-labeled each of the Ir populations with the driver lines used in the TaDa  364 experiments. Only Ir75a neurons displayed robust nuclear Pdm3 365 immunofluorescence, although a subset of Ir40a neurons had above-background 366 signals ( Figure 5A -B).

367
To examine the role of pdm3 in Ir neurons we performed transgenic RNA 368 interference (RNAi) using peb-Gal4. This driver begins to be expressed very 369 broadly in OSNs after their terminal divisions (∼16 h APF) and throughout the 370 initiation of olfactory receptor expression (∼40-48 h APF) 13,35 . pdm3 RNAi led to loss 371 of most IR75a expression, with the remaining detectable neurons expressing very 372 low levels of this receptor ( Figure 5C-D). By contrast, robust expression of all 373 other Irs was maintained ( Figure 5C-D). We noted, however, that the number of 374 cells expressing IR75b increased in pdm3 RNAi antennae ( Figure 5D). These 375 observations suggest that Pdm3 functions (directly or indirectly) to promote IR75a 376 expression and repress IR75b expression. 377 The novel IR75b-expressing cells in pdm3 RNAi antennae are mainly located 378 in the proximal region of the antenna near the sacculus, where Ir75a neurons are 379 normally found ( Figure 6A). This distribution suggested that loss of Pdm3 results 380 in a switch of receptor expression from IR75a to IR75b. We investigated this 381 possibility by examining the activity of transcriptional reporters for these 382 receptors, encompassing minimal regulatory DNA sequences fused to 383 CD4:tdGFP (hereafter, GFP) 25 . Both Ir75a-GFP and Ir75b-GFP transgenes are 384 faithfully expressed in IR75a and IR75b protein-expressing neurons ( Figure 6B-385 C). In pdm3 RNAi antennae Ir75a-GFP expression is lost, while Ir75b-GFP is 386 expressed in many ectopic cells ( Figure 6B-D).

387
We further used these Ir-GFP transgenes to examine the glomerular 388 projection of Ir75a and Ir75b neurons in pdm3 RNAi animals. In controls, Ir75a-GFP 389 and Ir75b-GFP label exclusively the DP1l and DL2d glomeruli, respectively 390 ( Figure 6E), concordant with previous analyses 22, 25 . In pdm3 RNAi animals the 391 Ir75a-GFP DP1l signal is almost completely lost ( Figure 6E), consistent with the 392 lack of reporter expression in the antennae. By contrast, the Ir75b-GFP signal in 393 the DL2d glomerulus is greatly intensified, and the glomerulus is enlarged ( Figure  394 6E), presumably reflecting the increased number of Ir75b neurons ( Figure 6C-D). 395 This simplest interpretation of these observations is that in the absence of Pdm3, 396 Ir75a neurons adopt Ir75b neuron fate encompassing both its receptor identity 397 and glomerular innervation pattern. 398 9 While OSN fate is established in early pupae 1-3 , the expression of Pdm3 in 399 Ir75a neurons in adult antennae ( Figure 5A -B) suggests that it has a persistent 400 role in these cells. To test this idea, we drove pdm3 RNAi using Ir8a-Gal4 29 , which 401 is active only from the second half of pupal development (∼48 h APF) by which 402 time OSN axonal convergence on glomeruli has occurred. In adult Ir8a>pdm3 RNAi 403 antennae, we found that the number of IR75a-expressing cells was unchanged 404 compared to controls, although protein levels were lower in several cells, arguing 405 that Ir75a neuron fate had been largely correctly established ( Figure 6F-G). 406 However, as observed with peb-Gal4-induced pdm3 RNAi , IR75b was still detected 407 in many ectopic cells ( Figure 6F). Notably, we detected several cells expressing 408 both IR75b and IR75a, which are never observed in control antennae ( Figure 6G targeting in other regions of the nervous system 53-58 . To characterise the 419 endogenous expression of Fmi in the olfactory system, we used an antibody to 420 profile protein levels in OSNs. We first examined antennae from mid-pupal stages 421 (48 h APF) but observed only trace levels of immunoreactivity across OSN soma 422 ( Figure S3A). At 24 h APF this signal was slightly stronger, but much more 423 prominent in the OSN axons as they left the antenna ( Figure S3A), suggesting 424 that this protein is predominantly transported to the axonal compartment of these 425 neurons. Indeed, Fmi protein was strongly detected in the OSN axons as they 426 enter the antennal lobe at 24 h APF ( Figure 7A). Fmi was still robustly expressed 427 at 48 h APF as these OSNs coalesced onto individual glomeruli with broad, but 428 slightly heterogeneous, expression in glomeruli across the antennal lobe ( Figure  429 7A). Subsequently, Fmi levels began to decrease to undetectable levels in some 430 glomeruli in the adult stage, while reproducibly persisting in others ( Figure 7B). 431 Most of these glomeruli are innervated by Or neurons, but we could detect Fmi in 432 DL2d (Ir75b) and DL2v (Ir75c), but not the neighboring DP1l (Ir75a) ( Figure 7B), 433 in partial concordance with the TaDa data ( Figure 4E).

434
Fmi was also detected in a group of ∼16 cells located on the antero-dorsal 435 region of the antennal lobe from later developmental stages (∼48 h APF) ( Figure  436 7A). These cells are likely to be LNs, based upon their expression of Elav ( Figure  437 S3B) and presence of stained processes that innervate the antennal lobe but do 438 not project to higher olfactory centers. Numerous LN subtypes exist, and these 439 often have broad innervation patterns across many glomeruli 59,60 . To examine the 440 specific contribution of OSNs to the Fmi immunoreactivity observed in the 441 antennal lobe, we depleted Fmi in LNs by RNAi using a combination of Gal4 442 drivers (1081-Gal4 and 449-Gal4 60 ) that cover the vast majority of these 443 interneurons ( Figure S3C). At mid-pupal development (48 h APF), differential 444 glomerular expression of Fmi was much more marked compared to control 445 animals ( Figure 7C). Although we could not confidently define all glomerular 446 identities at this earlier developmental stage, prior to full maturation of the 447 antennal lobe, Fmi protein levels did not appear to correspond precisely with the 448 To determine the function of Fmi in the developing olfactory system, we induced 461 fmi RNAi in developing antennal tissue first using a constitutive driver (ey-462 Flp,act>stop>Gal4 62 ), which should deplete Fmi expression from the earliest 463 stages of development), and examined adult brains stained with the general 464 neuropil marker nc82 (Bruchpilot). The antennal lobes of these animals showed 465 fully penetrant, severe morphological defects: the stereotyped glomerular 466 boundaries observed in the lobes of control animals were largely lacking, with 467 only a few, large glomerulus-like subregions detected ( Figure 8A). This 468 phenotype was reproduced with an independent RNAi line that targets a distinct 469 region of the coding sequence ( Figure 8A) 470 We further examined this phenotype by visualizing the projection patterns 471 of specific classes of OSNs labeled by receptor promoter-driven fluorescent 472 reporters ( Figure 8B-C). Neurons labeled by Ir75a-GFP,  were no longer constrained to discrete glomeruli ( Figure 8B), although the 474 mistargeting was relatively mild within the antennal lobe as a whole. The limited 475 defects of individual neuron populations suggests that the lack of Fmi does not 476 disrupt long-range targeting properties of OSNs, but rather impacts local 477 segregation of OSNs to discrete glomerular targets. Consistently, double labeling 478 of neurons that normally project to adjacent glomeruli (Ir75a-GFP and Ir75b-479 Tomato (Tom); Ir84a-GFP and Ir31a-Tom) revealed frequent overlap between 480 these normally-segregated populations ( Figure 8C). 481 The constitutive driver is expressed broadly across the antennal disc 62 . 482 The observed phenotypes could therefore potentially be due to a requirement for 483 Fmi in disc development rather than in OSNs. To limit fmi RNAi to OSNs, we 484 used peb-Gal4, which produced an equally strong defect in antennal lobe 485 glomerular segregation ( Figure 8D). In these animals the LN-expressed Fmi was 486 still detectable but, in contrast to the expression in OSNs, Fmi was apparently 487 homogeneously distributed across antennal lobe glomeruli ( Figure S3D). This LN 488 source of Fmi does not appear to contribute to glomerulus formation as 489 LN>fmi RNAi animals did not exhibit the same antennal lobe morphological defects 490 ( Figure 8D). Similarly, when we targeted fmi to the synaptic partners of OSNs, the 491 PNs -which pre-pattern the glomerular organization of the antennal lobe 3 -using 492 the GH146 driver, antennal lobe morphology appeared normal ( Figure 8D). The 493 absence of defects is consistent with the lack of detectable expression of Fmi in 494 these second-order neurons ( Figure 7A). Together, these results indicate that Fmi 495 functions principally, if not exclusively, within OSNs to correctly pattern glomerular 496 formation. 497 To understand the genesis of this defect during OSNs development we 498 used a temperature-sensitive repressor of Gal4 (Gal80 ts ; inactivated at 30°C) to 499 temporally-control the onset of peb>fmi RNAi . When animals were shifted to 30°C 500 10 h APF (prior to initial expression of peb-Gal4), all animals displayed antennal 501 lobe morphological defects of varying degrees, and Ir75b-GFP-labeled neurons 502 did not coalesce to a discrete glomerular unit ( Figure 8E). When shifted at 20 h 503 APF, the severity of the phenotype was diminished ( Figure 8E). When shifted at 504 40 h APF, the antennal lobe and Ir75b neuron targeting were indistinguishable 505 from controls ( Figure 8E). These experiments indicate an early developmental 506 function for Fmi. 507 We examined this early requirement in more detail by imaging the OSN 508 axons as they invade the antennal lobe. During 20-25 h APF, pioneer OSNs 509 reach the antennal lobe in a unique bundle that separates into ventromedial and 510 dorsolateral bundles, which project around the lobe's border 9,63 ( Figure 8F). In 511 fmi RNAi animals, OSNs reach the antennal lobe, but bundle organization is highly 512 disrupted, with subsets of axons apparently defasciculating prematurely in the 513 central area of the lobe ( Figure 8F). 514 Fmi (and its mammalian homologs) have roles in multiple biological 515 processes in the nervous system as well as in the establishment of epithelial 516 planar cell polarity 58,64,65 . Genetic and biochemical studies have identified a 517 number of interaction partners of Fmi in different biological contexts. We tested 518 mutations and/or multiple independent RNAi lines for several of these genes to 519 determine whether they function with this cadherin in OSNs ( Figure S4). 520 Mutations in two core components of the planar cell polarity pathway, frizzled and 521 dishevelled 66 , had no apparent impact on antennal lobe formation. Similarly, loss 522 of frazzled (encoding a Netrin homolog which function with Fmi in midline axon 523 guidance 67 ), golden goal (with which Fmi collaborates in photoreceptor axon 524 guidance 68 ), or espinas (which encodes a LIM domain protein that binds Fmi 525 intracellularly and functions in dendrite self-avoidance 57 ) did not reproduce the fmi 526 phenotype. Finally, although the intracellular signaling mechanism of Fmi is not 527 well understood, the Fmi GPCR-domain has been suggested to act through the 528 Gαq subunit to regulate dendrite growth 69 ; however, we found no evidence that 529 this signaling mechanism operates in OSNs ( Figure S4). These observations 530 suggest that Fmi functions through other cellular mechanisms to control OSN 531 axon segregation. 532 533 Discussion 534 535 Determination of the molecular composition of neurons is essential to understand 536 the development and function of the nervous system. While scRNA sequencing is 537 an undoubtedly powerful approach to address this challenge 70-74 , it requires often 538 technically-difficult cell isolation, and can be biased towards abundant cell types 539 and highly-expressed genes 75 . Moreover, it is unclear to what extent neurons 540 change their expression properties during tissue dissection (as reported for other 541 types of cell 76 ) or through inevitable loss of dendritic and axonal processes 542 (where certain transcripts may be enriched 77 ). Our study was initially motivated by 543 the desire to test the TaDa method for directed, genome-wide molecular profiling 544 of very small populations of neurons that are tightly embedded within a highly 545 heterogeneous tissue. Moreover, by applying TaDa to a set of functionally-related 546 OSNs, we aimed to identify cell type-specific differences that would point towards 547 12 mechanisms underlying the development and evolution of novel neuronal 548 populations. 549 The unique olfactory receptor expression pattern of OSNs (together with 550 other chemosensory genes) offered several positive-and negative-control loci 551 that allowed us to validate the specificity and sensitivity of TaDa. One caveat 552 emerging from this analysis is that TaDa cannot easily discriminate PolII 553 occupancy of interleaved genes, an issue that may be most pronounced in the 554 relatively gene-dense genome of Drosophila. Nevertheless, our results indicate 555 that this method could be effective in defining candidate receptors in neuronal 556 classes for which a genetic driver is available. This possibility is of particular 557 interest in the Drosophila gustatory system where many populations of neurons 558 can be labeled and physiologically profiled 28 , but incomplete knowledge of the 559 combinatorial receptor expression properties that are characteristic of this transcription factor genes). However, many other experimental and/or analytical 575 differences in the two methods are likely to impact these global numbers. 576 Genome-wide analysis of TaDa datasets from different Ir populations 577 revealed an intriguing positive relationship between the similarity of global 578 patterns of PolII occupancy between neuronal populations and the phylogenetic 579 distance of the olfactory receptors they express, notably the tandem array of 580 Ir75a, Ir75b and Ir75c genes. Similar analysis of Or OSN populations is limited by 581 the broad phylogenetic distribution of the ORs whose corresponding neuronal 582 transcriptomes have been identified 13 . However, through examination of these 583 data 13 we identified two pairs of closely-related (though not tandemly-arranged) 584 Or genes whose neuronal transcriptional profiles also cluster (Or42b/Or59b and 585 Or9a/Or47a). The simplest interpretation of these observations is that neuronal 586 populations expressing distinct, relatively-recent receptor gene duplicates derive 587 from a common ancestral neuronal population (which may have originally co-588 expressed the duplicated receptors); these extant populations therefore retain 589 correspondingly close proximity in their transcriptomic profile. The principle of 590 OSN "sub-functionalization" -which has not, to our knowledge, previously been 591 recognized empirically -has important implications for understanding the 592 evolution of novel olfactory pathways 80 . 593 One important advantage of the TaDa profiling method over scRNA-seq is 594 that it can also provide information on chromatin state, which remains difficult to 595 characterize through standard chromatin immunoprecipitation-based methods 81 . 596 We recognize that CATaDa provides only a relatively coarse-grained view of 597 13 chromatin state (similar to ATAC-seq 15 ), and variant approaches of TaDa (e.g., 598 Dam fusions to chromatin-binding proteins 82 ) are very likely to reveal finer-scale 599 differences between populations. Nevertheless, our findings of similar chromatin 600 accessibility at Ir genes across Ir populations are in-line with studies in the 601 developing Drosophila embryo where chromatin accessibility is comparable 602 between anterior and posterior region, despite substantial differences in patterns 603 of gene transcription across this body axis 83 . We suggest that the specific 604 developmental properties of OSNs are more likely to be driven by the unique 605 combination of TFs they express. Indeed many of the differentially occupied 606 genes between Ir populations encode known or predicted TFs. In agreement, we 607 validate the selective expression and function of Pdm3 in Ir75a neurons, 608 revealing it to be a key factor -and potentially the sole transcription factor -that 609 distinguishes the closely-related Ir75a and Ir75b neuron populations. Future 610 identification of Pdm3 targets in these neurons will be necessary to determine 611 whether it regulates these Irs directly, and to identify other downstream genes 612 that control the innervation of the corresponding neurons to distinct glomeruli.

613
Our TaDa datasets also revealed a rich diversity in the predicted cell 614 surface proteome of Ir neurons. Because of the temporal breadth of TaDa  615 profiling, these proteins are likely to contribute to differences of these neurons in 616 cilia/dendritic morphogenesis, axon targeting and synapse formation/plasticity. 617 Our initial characterization of one of these, the atypical cadherin Fmi, reveal it to 618 be a dynamically and heterogeneously-expressed, axon-localized protein.

619
Importantly, Fmi is not restricted to Ir neurons, but found throughout the 620 peripheral olfactory system, concordant with the dramatic loss of glomerular 621 architecture in the antennal lobe in the absence of this protein. All authors contributed to experimental design, data analysis and interpretation of 677 data. Figure  Competing interests 684 685 The authors declare no competing interests 686 687

689
Drosophila culture 690 691 Flies were maintained on a standard wheat flour/yeast/fruit juice diet at 25ºC in 12 692 h light:12 h dark conditions. Published mutant and transgenic D. melanogaster 693 strain are described in Table S9. For the temporal control of fmi RNAi with 694 Gal80 ts , flies of the desired genotype were cultivated at 19ºC; animals were 695 staged by selecting white pupae (designated as 0 h after puparium formation 696 (APF)) and shifting these animals to 29ºC after 10, 20 or 40 h, to permit induction 697 of  (Table S5). 772 For testing the hypothesis that OSs are equal among datasets (neuron 773 populations), we used DESeq2 92 . Unlike the DamID pipeline, this approach 774 focused on individual GATC fragments within each gene. To quantify read 775 coverage of each fragment, we used the same bam files that were outputted by 776 the DamID pipeline (above). We converted these to bed files using BEDTools v2 777 "bamToBed" utility and calculated the coverage of GATC fragments using the 778 Bedtools' "coverage" utility (Dataset S4). Differential occupancy was tested over 779 the full dataset using a Likelihood Ratio Test, where the full model specified cell.pop + cond. We also carried out pairwise tests using a Wald Test within 784 DESeq2. Genes that were identified as candidates for differential expression 785 were required to have ≥2 GATC fragments contributing to the difference (that 786 were also significantly PolII occupied). 787 Gene Ontology (GO) enrichment analyses were performed on the subset 788 of genes that showed differential occupancy using clusterProfiler (v3.10.1) 93 . We 789 compared these genes against the three independent, controlled vocabularies 790 provided by GO Consortium that model Biological Processes, Molecular Function 791 and Cellular Component 94 . 792 793 CATaDa analysis 794 795 CATaDa was performed on the seven populations of neurons using the same 796 sets of sequencing reads (i.e., Dam-alone data) acquired for TaDa analysis. The 797 bioinformatics pipeline followed the general concepts of a previously described 798 pipeline 15 , but with modifications. For each neuron population, the genomic 799 sequencing reads from all three biological replicates were aligned to release 6.21 800 of D. melanogaster genome with "very-sensitive" settings (Bowtie2 v2.3.0). 801 Following pooling of aligned reads from the replicates (samtools v1.7), the 802 chromatin accessibilities were represented as the coverage of reads per bin of 10 803 bases, scaled to 1× average genomic coverage (~142.57 Mb) and averaged over Immunofluorescence and RNA FISH on whole-mount antennae or antennal 814 cryosections were performed as described 95 . Immunofluorescence on whole-815 mount brains was performed essentially as described 22,96 . Antibodies used are 816 listed in Table S10. RNA probes for Ir31a and Ir84a were previously described 30 . 817 Quantification of En and Pdm3 immunofluorescence signals in OSN soma 818 nuclei ( Figure S2B-C and Figure 5A-B) was performed in Fiji 97 . In brief, regions-819 of-interest (ROIs) were manually defined using the GFP signal (which 820 circumscribes the nucleus) in a single optical slice where the cross-section of a 821 given nucleus has the largest area, followed by automated measurement of the 822 average En or Pdm3 immunofluorescence signal within this ROI. Background 823 fluorescence signals in the tissue were measured in the same way within 824 equivalently-sized, arbitrarily-chosen ROIs that did not overlap with GFP signals 825 or any other OSN nuclei (visualized with DAPI).

827
Statistics statement 828 829 Unless stated otherwise, statistical significance thresholds were p < 0.05 and the 830 tests carried were two-tailed. Measurements were taken from distinct samples, 831 unless otherwise stated, along with a corresponding correction for multiple tests 832 833 Data availability 834 835 TaDa sequencing data and experimental information have been deposited in the 836 ArrayExpress database at EMBL-EBI (www.ebi.ac.uk/arrayexpress) and will be 837 available under accession number E-MTAB-8935. Raw image data for Figures 1,  838 5-8, S2-S4 are available upon request.

845
Biological material availability 846 847 All unique biological material generated in this work is available from the 848 corresponding author upon request. 849 850 . 851  Table S1.

896
(G) OS heatmap of diverse positive-and negative-control genes in the seven Ir 897 neuron populations (i.e., with known or expected expression/lack of expression).

898
See text for definitions of abbreviations. 899 900 901 y-axis represents the fold enrichment for the peak summit against random 922 Poisson distribution of the small local region (1000 bp showing the top ten over-represented GO terms. The x-axis represents the 954 number of genes annotated to a particular GO term in the input subset. The p-955 value indicated by the colored bar is the probability of observing at least the same 956 number of genes associated to that GO term compared to what would have 957 occurred by chance.

958
(D) Ranking of the 1694 differentially-occupied genes ordered by the number of 959 GATC motifs contributing to their between-neuron differences (uncorrected for 960 gene length).

78
Only significant OSs are listed. Values correspond to those plotted in Figure  79 S1. 80 81  Table S2. Set of CSSP genes.

82
The gene list is from the D. melanogaster extracellular domain database 83 (FlyXCDB) database 2 , with additional gene symbols appended. 84 85 Table S3. CSSP genes with significant OSs in Ir neuron populations.

86
OSs correspond to those plotted in Figure 2C. 87 88 Table S4. CSSP genes with significant OSs unique to each population.

90
Table S5 Filtered set of transcription factors genes. 91 The gene list was extracted from the FlyTF.org database 3 . 92 93 Table S6. Transcription factor genes with significant OSs in Ir neuron 94 populations 95 OSs correspond to those plotted in Figure 2D.