Tuning of delta-protocadherin adhesion through combinatorial diversity

The delta-protocadherins (δ-Pcdhs) play key roles in neural development, and expression studies suggest they are expressed in combination within neurons. The extent of this combinatorial diversity, and how these combinations influence cell adhesion, is poorly understood. We show that individual mouse olfactory sensory neurons express 0–7 δ-Pcdhs. Despite this apparent combinatorial complexity, K562 cell aggregation assays revealed simple principles that mediate tuning of δ-Pcdh adhesion. Cells can vary the number of δ-Pcdhs expressed, the level of surface expression, and which δ-Pcdhs are expressed, as different members possess distinct apparent adhesive affinities. These principles contrast with those identified previously for the clustered protocadherins (cPcdhs), where the particular combination of cPcdhs expressed does not appear to be a critical factor. Despite these differences, we show δ-Pcdhs can modify cPcdh adhesion. Our studies show how intra- and interfamily interactions can greatly amplify the impact of this small subfamily on neuronal function.

How does this relatively small gene family mediate these varied effects? While significant effort has been devoted towards characterizing the role of individual δ-Pcdhs in neural development, almost nothing is known regarding how multiple family members function together. The δ-Pcdh subfamily has been further divided into the δ-1 (Pcdh1, Pcdh7, Pcdh9, and Pcdh11x) and δ-2 (Pcdh8, Pcdh10, Pcdh17, Pcdh18, and Pcdh19) subfamilies based on differences in the number of extracellular domains and also within the intracellular domain (Vanhalst et al., 2005). Double label RNA in situ hybridization studies indicate individual neurons express more than one δ-Pcdh (Etzrodt et al., 2009; Krishna-K et al., 2011). This suggests a model where different combinations of δ-Pcdhs may be expressed within different populations of neurons. Whether such combinations exist or how many δ-Pcdhs may be expressed per neuron is still not known. It seems reasonable, however, to postulate that combinatorial expression would greatly enhance the impact of δ-Pcdhs on cellular function. If such combinations exist, how they would influence or modify δ-Pcdh-mediated adhesion is also unknown.
The importance of examining intrafamily δ-Pcdh interactions was recently underscored by a study examining the role of δ-Pcdh adhesion in PCDH19-GCE (girls clustering epilepsy), a form of epilepsy limited to females. Pederick et al. demonstrated that mutations in PCDH19, a δ-2 family member, affected cell sorting in both in vitro aggregation assays and in brains of mice. Furthermore, they also showed that humans with PCDH19-GCE exhibit abnormal cortical folding patterns (Pederick et al., 2018). Importantly, they noted that PCDH19 is likely to be co-expressed with other δ-Pcdh family members, and tested how expressing PCDH10 and/or PCDH17 with PCDH19 affected sorting behavior in aggregation assays. In each case, the observed cell sorting behavior varied depending upon which δ-Pcdhs were co-expressed.
This study demonstrated the importance of defining intrafamily interactions in order to understand how loss of Pcdh19 influences function. However, it did not define the extent of such combinations in vivo. It also did not establish any guiding principles for δ-Pcdh adhesion, or how different combinations influence adhesion.
Here, we uncover principles used by the δ-Pcdhs to regulate combinatorial adhesion. We first used single color and double label RNA in situ hybridization to show that olfactory sensory neurons (OSNs) are likely to express different combinations of δ-Pcdhs. We next employed single cell RNA analysis to establish the scope of these combinations, and find individual OSNs express between zero and seven δ-Pcdhs. We then systematically address the impact of this combinatorial diversity on intrafamily interactions by utilizing cell aggregation assays. In striking contrast to what has been seen for the clustered protocadherins (cPcdhs; Thu et al., 2014), we observed a range of potential adhesive behaviors. We were able to define fundamental principles that regulate these outcomes. In combination, these principles provide cells with a powerful means of fine tuning their adhesive interactions with other cells. Finally, we show that δ-Pcdhs can also modify the adhesive function of cPcdhs, which have been shown to be important for neuronal survival, dendritogenesis, synapse formation, and self-avoidance (Lefebvre et al., 2012; Molumby et al., 2016; Wang et al., 2002; Weiner et al., 2005). These results provide an initial glimpse into interfamily interactions among protocadherin subfamilies. Our studies therefore provide a framework for determining how combinations of δ-Pcdhs mediate adhesion, and also lay the foundation for understanding how different cadherin subfamilies integrate to regulate cell-cell adhesion.

eLife digest
Multicellular life depends on cells being able to stick together. The human body, for example, consists of trillions of cells grouped into tissues and organs. The brain alone contains some 87 billion neurons organized into complex networks. To stay together, cells use proteins on their surface called cell adhesion molecules (CAMs). There are four major families of CAMs, each with multiple members, and the CAMs on one cell recognize and interact with the CAMs on another.
But how does this process work? One possibility is that different combinations of CAMs allow different cells to stick together. Bisogni et al. tested this idea by studying a family of CAMs called the delta-protocadherins. This family has nine members, each with its own gene. Before cells can use a gene to produce a protein, they must first use the gene's DNA as a template to build an RNA molecule. By counting the number of different types of RNA molecules inside individual cells, Bisogni et al. showed that sensory neurons in the mouse each produce up to seven different delta-protocadherins.
Further experiments revealed that cells fine-tune their interactions by varying the number, type and combination of delta-protocadherins on their surface. In addition, the delta-protocadherins also alter interactions between members of a related gene family, the clustered protocadherins. This further increases their ability to regulate how cells interact.
In contrast to previous studies that focused on single molecules, Bisogni et al. have shown how combinations of molecules work together to influence cell adhesion. Deciphering this combinatorial code is key to understanding how interactions between cells go awry in disease.
Mutations in the genes for CAMs often impair brain development. The reported findings may provide insights into how such mutations disrupt the CAM combinatorial code and alter cell to cell interactions.

Defining combinatorial expression of δ-Pcdhs in single neurons
We first performed single color RNA in situ hybridization to examine δ-Pcdh expression in the olfactory epithelium (Figure 1-figure supplement 1A-G). All detectable δ-Pcdhs were expressed in a punctate pattern, indicating differential expression among OSNs. Interestingly, the expression pattern for any given δ-Pcdh was not uniform throughout the epithelium. For example, Pcdh1 is more highly expressed in the lateral epithelium, and more weakly medially (Figure 1-figure supplement 1B,C). In both regions, the expression was clearly punctate, but greater numbers of OSNs in the lateral epithelium expressed Pcdh1. In contrast, other δ-Pcdhs, such as Pcdh9 and Pcdh17, show the opposite pattern, and are more strongly expressed medially with relatively low expression laterally (Figure 1-figure supplement 1D-G). Differences between δ-1 and δ-2 family members could not be distinguished based upon these patterns. These patterns are essentially maintained as development proceeds, although subtle changes in expression did occur. One exception was Pcdh10, whose expression we previously demonstrated to be dependent upon odorant-mediated activity (Williams et al., 2011).
The δ-Pcdhs are therefore expressed in regional patterns that overlap one another, suggesting combinatorial expression. We used double label RNA in situ hybridization to begin testing this hypothesis (Figure 1A). We systematically assayed all expressed pairs to show that 5-35% of olfactory sensory neurons (OSNs) co-express any two δ-Pcdhs (Figure 1-figure supplement 1H). Interestingly, the degree of co-expression varied within the family. For example, Pcdh1 and Pcdh7 were only co-expressed 8% of the time, while Pcdh8 and Pcdh9 were co-expressed 35% of the time.
OSNs expressing the same odorant receptor project to common targets within the olfactory bulb (Ressler et al., 1994; Vassar et al., 1994). Mutant analysis of members of the δ-Pcdh and cPcdh subfamilies has previously shown these genes are important for OSN targeting (Hasegawa et al., 2008; Mountoufaris et al., 2017; Williams et al., 2011). Interestingly, however, not all OSN populations were equally affected. Why some populations expressing a particular odorant receptor were more strongly affected in the mutant than those expressing a different receptor is unknown. We theorized that different OSN populations may express different combinations of δ-Pcdhs. Changes in these combinations would therefore affect cell adhesion mediated by the δ-Pcdhs. We therefore performed a second double label RNA in situ hybridization series to survey which δ-Pcdhs are coexpressed among OSNs expressing a given odorant receptor. For any one δ-Pcdh, we examined on average ~70 cells expressing a given odorant receptor to determine the degree of overlap (Figure 1B,C).
Confocal analysis showed all five OSN populations surveyed express varying proportions of different δ-Pcdhs (Figure 1B,C). There were striking differences in expression of δ-Pcdhs among the different OSN populations, arguing for the presence of specific combinations of δ-Pcdhs within each population. Interestingly, we did not find a simple one-to-one correspondence between odorant receptor expression and δ-Pcdh expression. Instead, different OSN populations varied in the proportion of δ-Pcdhs they expressed. For example, Pcdh9 was expressed by more than half of all OSNs expressing Olfr558. In contrast, ~12% of Olfr557 OSNs expressed Pcdh9. The variation in δ-Pcdh expression within OSN populations indicates additional levels of regulation must exist. Nevertheless, different OSN populations clearly possess differences in the proportion of δ-Pcdhs expressed by those OSNs. Such differences could be important for defining how δ-Pcdhs mediate targeting.
We next used the NanoString nCounter platform (Geiss et al., 2008) to more precisely define the extent of co-expression. We isolated 50 randomly selected OSNs, and performed single neuron RNA analysis for δ-Pcdhs and a subset of other genes. A heat map of the raw NanoString data showed strong heterogeneity among OSNs (Figure 1D, Figure 1-source data 2). To classify δ-Pcdhs as being 'on' or 'off' in a neuron, we used a constrained gamma-normal mixture model (Figure 1-figure supplement 1I). This revealed that individual OSNs express between zero and seven δ-Pcdhs (Figure 1E), far exceeding prior estimates based on RNA in situ studies. We were unable to determine if there was any preference for co-expression among or between the δ-1 or δ-2 subfamilies.
We performed several validation experiments (see Validation of NanoString data, Figure 1F, and Figure 1-figure supplement 1J), including qRT-PCR on individual OSNs. The observed 'on' or 'off' expression pattern of this particular validation experiment was highly similar to our NanoString results (Figure 1F). We chose NanoString because we hypothesized a targeted approach would be more sensitive than single cell RNA-seq, which is often limited by low capture efficiency of mRNA (Islam et al., 2011; Marinov et al., 2014). Subsequent comparison with single OSN RNA-seq data sets confirmed this hypothesis (

δ-Pcdhs are homophilic cell adhesion molecules
To determine how δ-Pcdh combinations affect adhesion, we used K562 cell aggregation assays. K562 cells are commonly used to study adhesion mediated by cadherins because they are non-adherent and are believed not to express endogenous cadherins (Ozawa and Kemler, 1998; Schreiner and Weiner, 2010; Thu et al., 2014). Our initial experiments showed extracellular and transmembrane domain (ECTM) constructs were easier to express than full-length constructs. Importantly, the ECTM domain was sufficient to drive homophilic adhesion (Figure 2-figure supplement 1A). As our goal was to isolate the effects of adhesion on cell-cell interactions, we chose to use the ECTM domain for all subsequent experiments. As expected, the exogenously expressed protocadherins localized to sites of intercellular contact (Figure 2-figure supplement 1B). We also confirmed that δ-Pcdh adhesion is highly sensitive to EDTA, consistent with being members of the calcium-dependent cadherin superfamily (Figure 2-figure supplement 1C). Although all expressed δ-Pcdhs induced cell aggregation (Figure 2A), Pcdh10 formed very small aggregates relative to the others. We titrated the amount of DNA to try to normalize aggregate size (Figure 2B). However, varying the amount of Pcdh10 DNA had little impact on aggregate size. We therefore excluded Pcdh10 from further experiments.
We performed pair-wise assays by mixing cells expressing one δ-Pcdh (fused to P2A-GFP) with those expressing another (fused to P2A-RFP). We found that cells expressing the same δ-Pcdh intermix (Figure 2C, center diagonal) while cells expressing different δ-Pcdhs segregate from one another. We interpret these results to indicate that δ-Pcdh adhesion is strictly homophilic. Identical results were found for the cPcdh subfamily using the same assay (Thu et al., 2014).

δ-Pcdhs
To determine how combinatorial expression of δ-Pcdhs affects adhesion specificity, we next performed mismatch coaggregation assays. In these experiments, cells expressing a single δ-Pcdh are mixed with a second population of cells expressing the same δ-Pcdh plus an additional, 'mismatched' δ-Pcdh. Prior studies on cPcdhs using this approach showed that a single mismatch causes one population to segregate from the other, even when several cPcdhs are expressed in common (Thu et al., 2014). In contrast, this same assay suggested adhesive outcomes may be dependent on which δ-Pcdhs were co-expressed (Pederick et al., 2018).
To systematically define how mismatched δ-Pcdhs influence adhesive outcomes, we screened 42 possible mismatch pairs. We discovered a range of outcomes that could be grouped into three broad categories (Figure 3A-D). In the first, the two populations intermixed (Figure 3A,B). In the second, the populations interfaced (Figure 3C), and in the last, the populations segregated from one another (Figure 3D). We also noticed that interfacing and intermixing behaviors were not binary, but instead appeared to exist on a continuum.
To better capture these differences, we developed a novel metric called the CoAggregation Index (CoAg) to quantify the degree of coaggregation (see Materials and methods). Briefly, the index measures the proportion of red and green cells that share a common boundary within a given confocal image. In general, CoAg values below 0.1 indicate segregation, whereas values between 0.1 and 0.2 are typical of populations that interface. Above 0.2, aggregates display increasingly higher degrees of intermixing. Thus, the CoAg index captures subtle differences in aggregation behavior not easily identified by eye. Ordering the CoAg values from our screen from high to low revealed a surprisingly linear range of behavior (Figure 3E; mean CoAg values for a given experiment are indicated in the corner of each representative image). For comparison, the first column shows the CoAg value for Pcdh1 cells mixed with Pcdh7 cells (i.e. complete segregation), as expected from cPcdh mismatch assays (Thu et al., 2014). The red bar indicates complete mixing by matched populations.
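The qualitative ranges quoted above can be written as a small lookup. This is a minimal sketch, not the authors' analysis code: the function name `classify_coag` and the handling of values falling exactly on 0.1 or 0.2 are assumptions, and the text itself presents the cut-offs only as general guides along a continuum.

```python
SEGREGATE_BELOW = 0.1   # "CoAg values below 0.1 indicate segregation"
INTERMIX_ABOVE = 0.2    # "Above 0.2 ... increasingly higher degrees of intermixing"


def classify_coag(coag: float) -> str:
    """Map a CoAg value onto the three broad categories described in the text.

    Assumption: values of exactly 0.1 or 0.2 are treated as "interface".
    """
    if coag < 0:
        raise ValueError("CoAg is a proportion and cannot be negative")
    if coag < SEGREGATE_BELOW:
        return "segregate"
    if coag <= INTERMIX_ABOVE:
        return "interface"
    return "intermix"


if __name__ == "__main__":
    # Values reported for the Pcdh1 mismatch assays in Figure 3F.
    for label, value in [("Pcdh1 vs Pcdh1+Pcdh7", 0.11),
                         ("Pcdh1 vs Pcdh1+Pcdh8", 0.27)]:
        print(label, classify_coag(value))
```

Because the behaviors lie on a continuum, a label of this kind is best read alongside the underlying CoAg value rather than in place of it.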
Reordering the CoAg values into a heat map strongly argued that different δ-Pcdh combinations produced different coaggregation behaviors (Figure 3F). For example, we combined Pcdh1 cells with cells expressing Pcdh1+Pcdh7 or Pcdh1+Pcdh8. In the first case, cells interfaced (CoAg = 0.11; row 1, column 2), but in the second, they intermixed (CoAg = 0.27; row 1, column 3). Although Pcdh1 was expressed by all populations, the presence of Pcdh7 vs. Pcdh8 led to differing behaviors. This suggested that, unlike the cPcdhs, the identity of the δ-Pcdh being tested is important for the outcome. This is further reinforced by the fact that strong asymmetry is observed across the diagonal in the heat map. For example, Pcdh19 cells segregate from Pcdh19+Pcdh7 cells (CoAg = 0.02; Figure 3G). However, 'across the diagonal,' Pcdh7 cells intermix with these same Pcdh19+Pcdh7 cells (CoAg = 0.40). Similarly, Pcdh19 cells intermix with Pcdh19+Pcdh9 cells (CoAg = 0.23) but across the diagonal, Pcdh9 cells segregate (CoAg = 0.07). These results strongly suggest that coaggregation is dependent upon the identity of the mismatched δ-Pcdh. We obtained similar results using full-length constructs that could be expressed to generate an aggregation behavior (data not shown). To compare how different δ-Pcdhs influence mismatch coaggregation, we generated a net mismatch score that revealed a potential hierarchy among δ-Pcdhs (Figure 3H, see Materials and methods).
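The asymmetry across the diagonal can be made explicit with a short calculation on the four CoAg values quoted above. This is an illustrative sketch only: the table layout and the helpers `asymmetry` and `favoured_member` are hypothetical, and the difference computed here is not the net mismatch score of Figure 3H, which is defined in Materials and methods.

```python
# Key (a, b): CoAg for cells expressing only `a` mixed with cells expressing a+b.
# Values are those quoted in the text for Figure 3F,G.
COAG = {
    ("Pcdh19", "Pcdh7"): 0.02,
    ("Pcdh7", "Pcdh19"): 0.40,
    ("Pcdh19", "Pcdh9"): 0.23,
    ("Pcdh9", "Pcdh19"): 0.07,
}


def asymmetry(table, a, b):
    """CoAg(a cells vs a+b cells) minus CoAg(b cells vs a+b cells)."""
    return table[(a, b)] - table[(b, a)]


def favoured_member(table, a, b):
    """Return the member whose single-expressing cells mix better with a+b cells."""
    diff = asymmetry(table, a, b)
    if diff == 0:
        return None
    return a if diff > 0 else b


if __name__ == "__main__":
    for pair in [("Pcdh19", "Pcdh7"), ("Pcdh19", "Pcdh9")]:
        print(pair, round(asymmetry(COAG, *pair), 2), favoured_member(COAG, *pair))
```

A symmetric heat map would give differences near zero for every pair; the large differences of opposite sign for the two Pcdh19 pairs are what the text describes as asymmetry.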

Differential mismatch coaggregation outcomes persist after normalizing surface expression
We next considered if these variable behaviors were caused by differential surface expression of coexpressed δ-Pcdhs. Some prior studies control for overall expression (e.g. from whole cell lysates), but not surface expression. To address this, we generated ECTM constructs fused to FLAG, GFP, or RFP, and used a cell-impermeant biotinylation reagent to label surface protein in live cells. Labeled proteins were then affinity purified and analyzed by western blotting for the various tags (Figure 4-figure supplement 1A). Antibody signal intensities were calibrated to allow for cross-antibody comparisons.
We re-tested all possible combinations of Pcdh1, Pcdh7, and Pcdh17, as these three had the strongest net mismatch scores in our initial screen (Figure 3H). For Pcdh1+Pcdh7 mismatch assays, we controlled for surface expression by carefully titrating DNA input (Figure 4A), and examined aggregation behavior at 18, 22, 26, and 44 hr post electroporation. As seen in our initial screen, Pcdh7 cells intermixed with Pcdh1+Pcdh7 cells across all time points, whereas Pcdh1 cells interfaced (Figure 4B,C). We used 26 hr for all further tests, given no obvious differences in behavior beyond this point.
We repeated the assay for Pcdh1+Pcdh17, and found that Pcdh1 cells segregated (CoAg = 0.07, Figure 4D-F), while Pcdh17 cells intermixed (CoAg = 0.42). Interestingly, these results differ from our preliminary screen, where both Pcdh1 and Pcdh17 cells interfaced with Pcdh1+Pcdh17 cells. These results argue that controlling for surface level is important for interpreting coaggregation behavior, an aspect we explore below. Finally, we repeated our mismatch assay with Pcdh7 and Pcdh17. We again found differences in behavior (Figure 4-figure supplement 1B-D). However, we found that this pair was particularly sensitive to DNA input, as small changes could alter the result despite minor effects on surface expression (Figure 4-figure supplement 1D). For one DNA input condition, Pcdh17 cells interfaced (CoAg = 0.29), while in the other they segregated (CoAg = 0.08). In contrast, Pcdh7 cells shifted towards intermixing. Nevertheless, these results confirm that differences in aggregation are dependent on δ-Pcdh identity.

Coaggregation behaviors can be modulated by altering relative surface expression levels
Our results argue that controlling for surface expression is important for understanding and interpreting differences in δ-Pcdh coaggregation behavior. In addition, our expression data (Figure 1A,B and Figure 1-figure supplement 1A-G) suggest that δ-Pcdh expression levels vary both within and between neurons. To further explore the role of expression, we established conditions where gradients of low, medium and high surface levels for Pcdh1, Pcdh7, and Pcdh17 could be reproducibly generated (Figure 5A and Figure 5-figure supplement 1A). Medium levels were similar to those used in Figure 4.
Our mismatch assays involve mixing cells that express a single δ-Pcdh with those expressing two or more. We first asked what would happen if we altered surface expression in cells expressing a single δ-Pcdh. We found that Pcdh1 (low, medium, and high) cells all still interfaced with Pcdh1+Pcdh7 cells (Figure
We next asked if altering the relative proportion of δ-Pcdh expression within cells expressing two δ-Pcdhs would affect coaggregation. We created populations with high and low DNA input values for each δ-Pcdh (e.g. Pcdh1^High+Pcdh7^Low and Pcdh1^Low+Pcdh7^High cells). We note that our goal was to simply alter the relative proportion of surface expression in these cells, and not to establish conditions where one δ-Pcdh was necessarily higher in expression than another. We found that varying the ratio of expression clearly altered coaggregation outcomes (Figure 5D,E). Differences in coaggregation behavior are most easily seen by comparing results column by column. For example, in Figure 5D (column 1), Pcdh1 cells intermix with Pcdh1^High+Pcdh7^Low cells, but segregate from Pcdh1^Low+Pcdh7^High cells. The coaggregation behavior of Pcdh1 cells is therefore clearly affected by the ratio of Pcdh1:Pcdh7 in the co-expressing cells. In the complementary experiment (column 2), Pcdh7 cells intermixed with both Pcdh1^High+Pcdh7^Low and Pcdh1^Low+Pcdh7^High cells. However, intermixing was clearly reduced in Pcdh1^High+Pcdh7^Low cells.
In column 3, Pcdh1^High+Pcdh7^Low cells intermixed with Pcdh1^High+Pcdh7^Low cells, but less well with Pcdh1^Low+Pcdh7^High cells. The converse (column 4) was observed for Pcdh1^Low+Pcdh7^High cells. Thus, relative surface levels of co-expressed δ-Pcdhs can influence aggregation behavior, even when there are no mismatches between populations.
We tested eight additional pairs using this high/low DNA input approach, and found similar results (Figure 5-figure supplement 1H). We confirmed a relative difference between high and low surface expression for a subset of pairs (Figure 5-figure supplement 1I). We conclude that changing the relative ratio of expression in cells expressing two δ-Pcdhs has a much greater effect on coaggregation than varying expression in cells expressing one δ-Pcdh.
Because differences in δ-Pcdh coaggregation behavior persisted despite controlling for surface expression, we next asked whether they possess differences in apparent adhesive affinity. Such differences have been argued to mediate segregation among classical cadherins, such as N- and E-cadherin (Harrison et al., 2010; Katsamba et al., 2009). We hypothesized that we could detect these potential differences by subjecting aggregates to higher shear forces. Cells expressing δ-Pcdhs with weaker apparent adhesive affinities should dissociate prior to those expressing δ-Pcdhs with stronger affinities.
We generated cells expressing Pcdh1, Pcdh7 or Pcdh17 at high surface levels (Figure 5A, Figure 5-figure supplement 1A), and subjected them to gradual increases in rotational speed (15-220 RPM). Images were analyzed for aggregate size using custom-written code (Aggregate Size Measurement). These populations began dissociating as speed increased. However, Pcdh7 cells maintained larger aggregates than Pcdh1 or Pcdh17 cells at all speeds (Figure 6A,B). Furthermore, while Pcdh1 and Pcdh17 cells appeared to fully dissociate by ~200 RPM, Pcdh7 aggregates were still present even at 220 RPM. Because Pcdh1, Pcdh7, and Pcdh17 were at one end of our hierarchy (Figure 3H), we compared Pcdh1 and Pcdh19 using the same approach. Similarly, we found that Pcdh1 cells maintained larger aggregates than Pcdh19 cells at all speeds (Figure 6-figure supplement 1A-C).
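The Aggregate Size Measurement code itself is not described in this section. As a rough sketch of one common way such a measurement could be made, and not a reconstruction of the authors' method, the example below assumes each image has already been thresholded into a binary mask and reports the pixel area of every connected foreground region. The function name `aggregate_sizes` and the choice of 4-connectivity are assumptions.

```python
from collections import deque


def aggregate_sizes(mask):
    """Return pixel areas of 4-connected foreground regions, largest first.

    `mask` is a list of equal-length rows; truthy entries are foreground.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(area)
    return sorted(sizes, reverse=True)


if __name__ == "__main__":
    toy_mask = [
        [1, 1, 0, 0, 0],
        [1, 0, 0, 1, 0],
        [0, 0, 0, 1, 1],
        [0, 1, 0, 0, 0],
    ]
    print(aggregate_sizes(toy_mask))
```

Comparing the distribution of these areas across rotational speeds would give a size-versus-speed curve of the kind summarized in Figure 6A,B, under the stated assumptions.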
Varying expression levels also accentuated these differences. We generated cells expressing Pcdh7 or Pcdh17 at low, medium and high levels (Figure 5A and Figure 5-figure supplement 1A). As expected, we found that higher surface levels generated larger aggregates that could better withstand increasing rotational speeds (Figure 6-figure supplement 1D-G). We also found that Pcdh7 cells produced larger aggregates at all speeds compared to Pcdh17 cells. Even at 220 RPM, Pcdh7^Low cells still maintained some aggregates.
If Pcdh1 has weaker apparent adhesive affinity than Pcdh7, this difference could explain why Pcdh1 cells interface with Pcdh1+Pcdh7 cells while Pcdh7 cells intermix in mismatch assays. Such differences should be accentuated by increasing shear force on aggregates. To test this, we repeated the Pcdh1+Pcdh7 mismatch assay. After allowing aggregates to form at 15 RPM, we increased the speed to 120 RPM. Despite the increased speed, Pcdh7 cells still intermixed with Pcdh1+Pcdh7 cells. However, Pcdh1 cells now segregated (Figure 6C,D), consistent with weaker apparent adhesive affinity.
To examine structural differences that could account for this varying behavior among δ-Pcdhs, we performed multiple sequence comparison by log expectation (MUSCLE) alignments. We found low sequence identity among δ-Pcdhs in extracellular domains (EC) 1-4 (~35%; Figure 6-figure supplement 1H). Prior work had shown that the adhesive interface of Pcdh19 was localized to EC1-4 (Cooper et al., 2016). To test the importance of EC1-4 in adhesion mediated by other subfamily members, we deleted these domains (Δ1-4) from Pcdh1, Pcdh7 and Pcdh17. Although the truncated proteins were still transported to the surface, they were unable to mediate adhesion (
Figure 6 (partial caption). (C). Error bars indicate ±SEM, * indicates p 0.05, ** indicates p 0.01. Results for each assay were determined from three independent electroporations. (E) Monte Carlo simulations incorporating affinity and relative expression level capture most, but not all, mismatch assay results. We modeled the behavior of a given mismatch assay (e.g. row 1, Pcdh1+Pcdh7). The Y-axis represents the CoAg Index (simulated (solid black and red lines)
Our results argue that differences in apparent adhesive affinity and relative surface expression regulate coaggregation behavior. We therefore performed Monte Carlo simulations using a custom program (cellAggregator, Ghazanfar, 2018) to see if we could model these factors in silico. We successfully captured the behavior of a subset of our experiments. The model performed best in predicting which cells will intermix. For example, the model correctly predicted that cells expressing identical δ-Pcdhs will intermix. Furthermore, the model also predicted the behavior of cells known to intermix in mismatch coaggregation assays. However, the model could not precisely recapitulate conditions where mismatched cells interfaced or segregated (Figure 6E, far right column; for example mixing Pcdh1 cells with Pcdh1+Pcdh7 cells).
Varying affinity differences, relative expression levels, or both still did not completely capture these behaviors. We anticipate other, as yet uncharacterized effects (e.g. intracellular δ-Pcdh-δ-Pcdh interactions [Pederick et al., 2018]) must be incorporated into the model to better capture cell adhesive behavior.
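To make the idea of such a simulation concrete, the sketch below shows a generic lattice Monte Carlo (Metropolis) scheme in which adhesion depends on apparent affinity and relative surface level. It is not cellAggregator and makes no claim about how that program works. The energy function (shared δ-Pcdhs contribute affinity times the lower of the two surface levels), the wrap-around grid, the swap move, and every name and number used are assumptions chosen for illustration.

```python
import math
import random


def pair_adhesion(a, b, affinity):
    """Homophilic adhesion between two cells: only Pcdhs on both cells count."""
    return sum(affinity[p] * min(a[p], b[p]) for p in a.keys() & b.keys())


def neighbors(i, j, n):
    """Right and lower neighbours on an n x n grid with wrap-around edges."""
    return ((i, (j + 1) % n), ((i + 1) % n, j))


def total_energy(grid, profiles, affinity):
    """Negative sum of adhesion over all neighbouring pairs (lower = stickier)."""
    n = len(grid)
    return -sum(
        pair_adhesion(profiles[grid[i][j]], profiles[grid[y][x]], affinity)
        for i in range(n) for j in range(n) for y, x in neighbors(i, j, n)
    )


def mixed_fraction(grid):
    """Fraction of neighbouring pairs made of two different cell populations."""
    n = len(grid)
    pairs = [(grid[i][j], grid[y][x])
             for i in range(n) for j in range(n) for y, x in neighbors(i, j, n)]
    return sum(a != b for a, b in pairs) / len(pairs)


def simulate(grid, profiles, affinity, steps, temperature=0.0, seed=0):
    """Metropolis swaps of two randomly chosen cells; returns (grid, energy)."""
    rng = random.Random(seed)
    grid = [row[:] for row in grid]          # work on a copy
    n = len(grid)
    energy = total_energy(grid, profiles, affinity)
    for _ in range(steps):
        i1, j1, i2, j2 = (rng.randrange(n) for _ in range(4))
        if grid[i1][j1] == grid[i2][j2]:
            continue                          # swapping identical cells changes nothing
        grid[i1][j1], grid[i2][j2] = grid[i2][j2], grid[i1][j1]
        new_energy = total_energy(grid, profiles, affinity)
        delta = new_energy - energy
        accept = delta <= 0 or (
            temperature > 0 and rng.random() < math.exp(-delta / temperature)
        )
        if accept:
            energy = new_energy
        else:                                 # undo the rejected swap
            grid[i1][j1], grid[i2][j2] = grid[i2][j2], grid[i1][j1]
    return grid, energy


if __name__ == "__main__":
    # Hypothetical surface levels and affinities, for illustration only.
    profiles = {"A": {"Pcdh1": 1.0}, "B": {"Pcdh1": 0.5, "Pcdh7": 0.5}}
    affinity = {"Pcdh1": 1.0, "Pcdh7": 2.0}
    start = [["A" if (i + j) % 2 == 0 else "B" for j in range(6)] for i in range(6)]
    final, energy = simulate(start, profiles, affinity, steps=2000, temperature=0.1)
    print(energy, mixed_fraction(final))
```

The `mixed_fraction` readout is only a crude stand-in for the CoAg index, and a toy model of this kind shares the limitation noted in the text: affinity and expression level alone need not reproduce interfacing or segregation.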

Increasing combinatorial δ-Pcdh expression and interactions with a cPcdh family member
Our single cell RNA analysis showed individual OSNs express up to seven δ-Pcdhs. To test the impact of increasing the number of co-expressed δ-Pcdhs on mismatch aggregation, we generated populations of cells that co-expressed Pcdh7 with one to four additional δ-Pcdhs. To confirm changes in the relative expression of Pcdh7 vs the other co-expressed δ-Pcdhs, we measured surface expression levels (Figure 7A) and performed coaggregation assays with cells expressing only Pcdh7. We found that each additional δ-Pcdh co-expressed with Pcdh7 led to a corresponding decrease in the CoAg index (Figure 7B). Pcdh7 only cells shifted from intermixing towards interfacing as the relative proportion of Pcdh7 decreased. Quantification of surface expression showed that the percent of Pcdh7 with respect to total surface expression decreased from ~50% to 25%, almost perfectly mirroring the decline in CoAg index (R² = 0.94; Figure 7C). We repeated the experiment with Pcdh1, and found a similar effect (Figure 7-figure supplement 1A,B). In this case, increasing the number of co-expressed δ-Pcdhs shifted the behavior of Pcdh1 cells from interfacing to segregation.
Finally, although we have focused on how δ-Pcdh subfamily members function in combination, individual neurons are likely to co-express multiple cadherin subfamily members. How δ-Pcdhs and these other subfamily members interact is not well understood. We first confirmed that cPcdh Pcdhb11 cells completely segregate from cells expressing δ-Pcdhs, demonstrating strict homophilic adhesion (Figure 7-figure supplement 1C). We then generated populations coexpressing Pcdh7 and Pcdhb11 at three different relative expression levels for use in mismatch coaggregation assays (Figure 7D). At the first two relative levels (DNA input ratio of 3:4 and 1:2), surface levels of Pcdh7 were ~45% of total (Figure 7E). Under these conditions, Pcdh7 cells strongly intermixed while Pcdhb11 cells segregated (Figure 7F,G). However, at a DNA ratio of 1:4 (Pcdh7 ~20% of total), Pcdh7 cells still intermixed but Pcdhb11 cells could now interface. Thus, δ-Pcdhs influence the aggregation behavior of cells expressing this particular cPcdh. This raises the intriguing possibility that the two subfamilies may work in concert to specify adhesion.
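The two quantities in this comparison, the share of Pcdh7 in total surface expression and the R² of its relationship with CoAg, can be computed as sketched below. This is an illustration under stated assumptions rather than the analysis behind Figure 7C: the helper names are hypothetical, R² is taken to be the squared Pearson correlation of a simple linear fit, and the per-condition measurements are not given in this section, so none are reproduced.

```python
def surface_fraction(levels, target):
    """Share of `target` in total surface expression, from calibrated levels."""
    total = sum(levels.values())
    if total <= 0:
        raise ValueError("total surface expression must be positive")
    return levels[target] / total


def r_squared(xs, ys):
    """Coefficient of determination for a simple linear fit of ys on xs.

    For one predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical equal surface levels: one partner gives 50%, three give 25%.
    print(surface_fraction({"Pcdh7": 1.0, "Pcdh1": 1.0}, "Pcdh7"))
    print(surface_fraction({"Pcdh7": 1.0, "Pcdh1": 1.0, "Pcdh9": 1.0, "Pcdh17": 1.0}, "Pcdh7"))
```

With measured Pcdh7 fractions as `xs` and the matching mean CoAg values as `ys`, `r_squared(xs, ys)` would yield the kind of statistic reported in the text.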

Discussion
Our results provide a foundation for understanding how a small gene family can exert unexpectedly complex influences on cell adhesion. Despite the wide range of combinatorial expression observed within single neurons, we identified fundamental principles that help dictate intrafamily interactions. First, we found that cells can vary the number of δ-Pcdhs expressed per cell. Second, we showed that individual δ-Pcdhs possess differences in apparent adhesive affinity. Third, we further demonstrated that these differences can be modulated by varying relative surface expression levels. Together, these principles dramatically augment the range of adhesive interactions mediated by this small subfamily. Despite the fact that there are only a limited number of δ-Pcdhs, these principles provide cells with the ability to carefully fine tune their adhesive profiles. Even if cells express the same combination of δ-Pcdhs, varying the levels of each expressed family member provides additional flexibility in modulating adhesion. These principles contrast with those defined for the cPcdhs. However, our results also provide an initial glimpse into how these two families can interact with one another to affect adhesion.
Figure 6 (caption, continued). and observed (thick dashed line with standard error represented by thin dashed lines). Solid lines represent simulations where the relative expression level of the two δ-Pcdhs has been varied (from 1:1 to 20:1). The X-axis represents increasing differences in apparent adhesive affinity (e.g. the left most point on the X-axis represents conditions where both δ-Pcdhs are of equal apparent adhesive affinity). In all three simulated coaggregation assays, the model predicted intermixing conditions (e.g. CoAg index above 0.2), but was not able to precisely model segregation or interfacing behaviors (compare right most graph in each row against the other two). DOI: https://doi.org/10.7554/eLife.41050.014

Differences in apparent adhesive affinity among δ-Pcdhs
The range of apparent adhesive affinities suggests that neurons can fine tune their overall adhesive profile by varying the repertoire of δ-Pcdhs expressed. One caveat is that we did not directly measure affinity using purified proteins. As our efforts are aimed at understanding how δ-Pcdhs mediate cell-cell interactions, we use the term apparent adhesive affinity to describe the functional impact of δ-Pcdhs on adhesion. Biophysical studies will be required to fully define such affinity differences. However, structural studies show cPcdhs possess varying adhesive affinities (Goodman et al., 2016; Rubinstein et al., 2015). Despite this, such differences do not appear to have a major impact in K562 assays (Thu et al., 2014).
While cell aggregation assays have been used for decades, the technical details have never been standardized. For example, cell type, speed of rotation, time of mixing, surface expression, and mode of quantitation all differ among past studies. We note that very few studies control for or report these factors, which in our hands are important for reproducible adhesive behavior. While such controls may not be necessary when cells essentially completely segregate from one another (e.g. as for cPcdhs), such reproducibility was essential to our ability to identify and quantitate differences in adhesive outcomes among δ-Pcdh family members.
Our aggregation assay results clearly contrast with a prior study of cPcdhs (Thu et al., 2014). In that study, two populations would only fully intermix if they expressed the same combinations of cPcdhs. If even one cPcdh differed between the two, the populations would completely segregate, regardless of the identity of the mismatched cPcdh. The observed results were always binary in nature, producing either complete intermixing or complete segregation. In contrast, we were able to observe a range of coaggregation behaviors. This spectrum of adhesive outcomes illustrates how a comparatively small gene family can still have complex effects on cellular behavior. Biophysical analysis of complex formation may better illuminate the mechanism behind such differences.
We note we did not identify any obvious differences between members of the δ-1 and δ-2 subfamilies in our assays. Members of both groups were expressed in overlapping patterns within the epithelium (Figure 1-figure supplement 1). In situ hybridization, NanoString, and qRT-PCR analyses also showed no obvious differences between subfamilies (Figure 1). In our mismatch aggregation assays, δ-1 and δ-2 members were distributed along the spectrum of our net mismatch score (Figure 3). For example, Pcdh1, a δ-1 family member, had a roughly equivalent net mismatch score to Pcdh17, a δ-2 family member. However, we note that δ-1 and δ-2 members are often coexpressed within neurons, leading to potential intracellular interactions that may not be captured in these assays. Further, how the varying number of extracellular domains between the two subfamilies influences adhesion is not known. Further structural studies will be needed to better define how these differences affect cell-cell interactions.

δ-Pcdh adhesion can be tuned by varying relative expression level
We showed that a simple way to moderate the effect of δ-Pcdhs with high apparent adhesive affinity is to vary their relative expression level. These results are reminiscent of principles defined for classical cadherins. Steinberg's differential adhesion hypothesis provides a commonly used framework for understanding how classical cadherins mediate cell sorting. In this model, cells sort from one another to reach an optimal thermodynamic equilibrium. This sorting can be driven by differences in adhesive affinity between cells, and/or by differences in expression level (Foty and Steinberg, 2005; Friedlander et al., 1989; Steinberg and Takeichi, 1994). Thus, δ-Pcdhs appear to use some of the same principles as classical cadherins. However, Steinberg and colleagues typically focused on N- and/or E-cadherin, and did not, to our knowledge, examine the behavior of multiple classical cadherins in combination. The principles we define here therefore confirm similarities between the classical cadherins and δ-Pcdhs, and extend these canonical studies of cadherin function.
We chose to use the ECTM domain for these experiments because expressing the full-length construct in K562 cells proved practically difficult. However, we demonstrated that the ECTM domain mediated homophilic adhesion to a degree similar to that of the full-length construct (Figure 2-figure supplement 1). As our goal was to study adhesive interactions among co-expressed family members, this allowed us to separate adhesion from intracellular signaling. In addition, the ECTM domain is typically used to study δ-Pcdh adhesion (Chen et al., 2007; Cooper et al., 2016; Emond et al., 2011). Still, it is clear there are many aspects of δ-Pcdh function that are not addressed by this reductionist approach. Intracellular signaling events, heterologous extracellular interactions, and regulation of δ-Pcdh gene expression can all further tune the impact of δ-Pcdhs on cell-cell interactions. Indeed, our Monte Carlo simulation indicates we can capture many, but not all, behaviors associated with combinatorial expression. Most notably, not all interface or segregation behaviors could be adequately modeled (Figure 6E). We expect that other, uncharacterized intracellular or extracellular interactions may explain these differences. In particular, Pederick et al. showed δ-Pcdhs can interact in cis (Pederick et al., 2018). Such cis interactions have previously been proposed to be critical for cPcdh function (Rubinstein et al., 2017; Thu et al., 2014). If these cis interactions are also important for δ-Pcdh function, we anticipate that they may contribute towards adhesion of δ-Pcdhs in trans.
Nevertheless, our studies lay the foundation for new models that can integrate these principles with those defined for other cadherin subfamilies, ultimately leading to a more complete determination of cadherin function within the nervous system. Our results represent a functional genomic approach towards understanding how combinations of cadherin expression identified via transcriptomic approaches impact cellular function.

Implications for δ-Pcdh function in vivo
Our reductionist approach to understanding δ-Pcdh function has the fundamental advantage of allowing us to systematically test different combinations for their impact on adhesion. Such studies would be extremely difficult to execute in vivo, given the varied chromosomal locations of δ-Pcdhs and the technical complexity of manipulating multiple genes at once. Further, although K562 cells have been used extensively to study protocadherin function, they are not a neuronally derived line. An appropriate question would be to ask how our results apply towards understanding δ-Pcdh function in vivo.
We believe there are two major applications of this study for understanding δ-Pcdh function. First, while δ-Pcdhs have been suspected to be expressed in combination in vivo based on double-label RNA in situ data, there has been no prior evidence demonstrating the extent of this expression. Our single cell NanoString and qRT-PCR data (Figure 1D-F) clearly demonstrate that multiple δ-Pcdhs are expressed per neuron, and show the variety and extent of such expression. Our round-robin RNA in situ hybridization studies (Figure 1-figure supplement 1H) are also consistent with this combinatorial expression. Further, our study of δ-Pcdh and odorant receptor overlap showed OSNs known to project to different targets clearly express different proportions of δ-Pcdhs (Figure 1B). While the expression of δ-Pcdh vs. a given odorant receptor is not a simple, one-to-one correlation, there nevertheless were clear differences among OSNs expressing different odorant receptors. Thus, the combinatorial expression of δ-Pcdhs is not an entirely random event, as has been suggested for the cPcdhs (Goodman et al., 2016; Hirano et al., 2012). This is further supported by our single label RNA in situ studies, which clearly show spatially restricted expression of δ-Pcdhs within the olfactory epithelium (Figure 1-figure supplement 1B-G). Our results therefore demonstrate that δ-Pcdhs are combinatorially expressed in vivo, that 0-7 family members can be co-expressed within OSNs, and that this expression pattern is not stochastic.
Second, our studies addressed the question of how these combinations could influence δ-Pcdh function. Our results argue that the particular combination expressed within a cell has a major impact on its adhesive profile. We therefore predict mutations in any one δ-Pcdh will not have uniform effects on all cells that express that particular δ-Pcdh, simply because different cells are likely to express different combinations. For example, we previously showed that mis- and over-expression of Pcdh10 in the olfactory system caused defects in glomerular target formation by OSNs expressing the Olfr9 odorant receptor, but not by those expressing Olfr17 (Williams et al., 2011). A recently generated Pcdh19 mutant mouse in our lab also shows targeting defects in a subset of OSN populations (data not shown). If Pcdh10 and Pcdh19 are expressed by multiple OSN populations (Figure 1B), why is only a subset of OSNs affected in these mutants?
We speculate that this variation is due in part to the interactions between the mutated δ-Pcdh and the other, co-expressed δ-Pcdhs within a neuron. Furthermore, the two populations may express different levels of Pcdh19, leading to different effects when Pcdh19 is mutated. A true understanding of how mutations in δ-Pcdhs mediate their effects would therefore be dependent on defining at a minimum what other δ-Pcdhs are co-expressed within affected cells. Loss of any one δ-Pcdh would alter the combination of δ-Pcdhs expressed and change the relative expression of co-expressed protocadherins. The changes that would occur as a result of these intrafamily interactions would therefore vary based on what δ-Pcdhs were co-expressed within the cell.
This same K562 assay was used to examine a mouse mutant of Pcdh19 to understand why apparent cell sorting defects occurred in the cortex (Pederick et al., 2018). Critically, this study postulated that co-expressed δ-Pcdhs might influence the observed sorting behavior. They found that K562 cell adhesion was indeed affected by different δ-Pcdh combinations. Although they did not correct for surface expression or draw any particular conclusions about principles that mediate their observed phenotypes, their results are consistent with ours in demonstrating the integral role of combinations in cell sorting.
Our results therefore emphasize the importance of understanding what combinations exist within neurons in order to understand observed phenotypes. However, defining the particular combination of δ-Pcdhs expressed per neuron has been problematic. Single cell RNA-seq studies have been unable to adequately address what combinations are expressed within individual neurons. Our own analysis of three single OSN RNA-seq datasets (Hanchate et al., 2015; Saraiva et al., 2015; Tan et al., 2015) shows an average detection of ~1 δ-Pcdh per neuron, while our NanoString approach detects ~3.5 (Figure 1-figure supplement 1K,L). Furthermore, our NanoString results were consistent with orthogonal validation assays using qRT-PCR and in situ hybridization. Thus, higher sensitivity approaches, similar to those used here, may be necessary to fully address what combinations are present within neurons.
We would also like to highlight the importance of potential interfamily interactions. We demonstrated that co-expression of Pcdh7 with Pcdhb11 inhibits Pcdhb11 cells from intermixing with Pcdh7 + Pcdhb11 cells (Figure 7F,G). If, however, expression of Pcdh7 is reduced relative to Pcdhb11, then these cells begin to display interfacing behavior. Thus, δ-Pcdhs can modify the behavior of other, coexpressed subfamily members. It seems reasonable that δ-Pcdhs, classical cadherins, cPcdhs, and other subfamily members are all likely to be co-expressed within individual neurons. How would interfamily interactions influence neuronal behavior in vivo?
Studies on cPcdhs have emphasized the sheer number of possible stochastic combinations that can be generated with this family. Our studies demonstrate that even greater adhesive complexity can be generated by superimposing the effects of δ-Pcdhs on cells expressing cPcdhs. Although we and others have begun establishing rules governing intrafamily interactions, it is likely that further complexity can be added via interactions between subfamilies. For example, δ-Pcdhs can bind and regulate classical cadherins (Chen and Gumbiner, 2006; Chen et al., 2009; Emond et al., 2011). Such interfamily interactions may well help to explain certain mutant phenotypes associated with the cPcdhs. In the retina, deletion of cPcdhs leads to neuronal death and to defects in dendritic self-avoidance. Interestingly, interactions between cPcdh subfamilies accentuate these effects (Ing-Esteves et al., 2018), again underscoring the impact of combinatorial subfamily interactions. However, in the cortex, deletion of cPcdhs disrupts dendritic branching due to a failure to promote arborization (Molumby et al., 2016). Thus, the same family has distinct effects in different regions of the nervous system. These differences were proposed to be due to context-dependent effects. However, it is conceivable that interfamily interactions, such as those between the δ-Pcdhs and the cPcdhs, may also play a role in explaining these varying phenotypes. The fundamental principles defined here therefore enable new hypotheses to be generated regarding how mutations in protocadherins influence neuronal function.

enhancer) or Cells-to-Ct buffer (containing DNase I). As OSN isolation was performed at room temperature, neurons were collected from a given coverslip within 30 min. Cells processed in CellsDirect buffer were stored at −80°C until processing. Cells processed in Cells-to-Ct buffer were vortexed and then incubated at room temperature for five minutes. An additional 0.5 μL of stop solution was added and incubated for 2 min at room temperature before being stored at −80°C until further processing.

Amplification and quality control of single OSNs
Amplification reactions were done using the CellsDirect kit (Thermo-Fisher) essentially according to manufacturer's instructions, with the following modifications. The 31 gene multiplex primer set was added to individual lysates (100 nM final) in a final volume of 10 μL. Tubes were heated at 80°C for 10 min and chilled on ice for 3 min. 10 μL of 2x reaction buffer and 1 μL of SuperScript III/Platinum Taq (Thermo-Fisher) were added and tubes were reacted in a PCR machine at 50°C for one hour, followed by 85°C for 15 min to inactivate the reverse transcriptase. PCR amplification was then performed with an initial activation at 94°C for 2 min, followed by 18 cycles of 94°C for 30 s, 60°C for 30 s, and 72°C for 30 s. After amplification, 20 μL of 10 mM Tris 7.5 was added to each sample to bring up the total volume to 40 μL. Four μL of each sample was then screened by quantitative PCR to determine expression levels of Gapdh (indicating successful capture and amplification) and Ncam1 (indicating an OSN). Taqman primers were designed to amplify regions internal to the 31 gene multiplex primer sequence, and samples were run on an ABI 7500 (Thermo-Fisher). Only cells with Ct values 25 for both genes were used for the NanoString analysis (Seattle, WA). See Figure 1-source data 1.

NanoString nCounter processing and validation
A custom codeset of 31 genes was designed that would detect a select subset of known axon guidance genes (see Figure 1-source data 1). Single cell cDNA was hybridized to the codeset in collaboration with NanoString. Genes were determined to be positively expressed using a constrained gamma-normal mixture model approach. Briefly, 'negative' control genes (e.g. Notch2, Gfap and Cdh13) were used to estimate the distribution of the unexpressed or lowly expressed genes across all cells. Following this, for each cell a constrained gamma-normal mixture model was fit using the Expectation Maximization (EM) algorithm, constrained in the sense that the mean and variance of the unexpressed or lowly expressed component for that particular cell were the same as across all cells, allowing the highly expressed component to vary as required. This constrained gamma-normal mixture model allowed for 'sharing' of information across multiple cells, reducing the possibility of ill-fitting distributions to the cells' expression patterns. Following model fitting, cells and genes were classed as 'expressed' if the corresponding posterior probability was 0.5 or above, and 'not expressed' otherwise. After this analysis, some cells were found to be Notch2 positive, and were discarded from further study. Data from four codeset genes generated no useful information and were not utilized.
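The classification rule described above can be written out as a two-component posterior. This is only a sketch of the general form: the symbols (x_{gc} for the count of gene g in cell c, \pi_c for the mixing weight, f_low and f_high for the component densities) are notation assumed here for illustration, and the text does not state which component is gamma and which is normal.

```latex
P(\text{expressed} \mid x_{gc})
  = \frac{\pi_c \, f_{\mathrm{high},c}(x_{gc})}
         {(1-\pi_c)\, f_{\mathrm{low}}(x_{gc}) + \pi_c \, f_{\mathrm{high},c}(x_{gc})},
\qquad
\text{gene } g \text{ is called expressed in cell } c
\iff P(\text{expressed} \mid x_{gc}) \ge 0.5 .
```

Here f_low carries no cell index because, under the constraint described, its mean and variance are shared across all cells, while f_{high,c} is refit for each cell.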

Single cell qPCR validation
OSNs were isolated and amplified in a manner identical to those used for NanoString analysis. Two μL of amplified cDNA from each single cell were used as template for each Taqman assay (Gapdh, Ncam1, Notch2, and the δ-Pcdhs; Figure 1-source data 1). All primer sets displayed efficiencies between 93-100%, except for Pcdh1 which had 83% efficiency (improvement was not observed with multiple primer designs). Probes were designed to bind to regions distinct from those detected with the NanoString codeset. Genes were considered 'on' if we observed a Ct value less than or equal to 30.

Plasmid construction
EGFP-N1 (Clontech) vectors were modified to incorporate the TagRFP fluorophore and/or a P2A sequence. FLAG constructs were created in a pHAN vector modified to include a FLAG sequence at the 3' terminus of the polylinker. ECTM domains of δ-Pcdhs and Pcdhb11 were then cloned into the appropriate vector.

K562 aggregation assay
K562 cells were purchased from ATCC (ATCC CCL-243) and tested mycoplasma negative. Low passage number cells (4-10 passages) were maintained in RPMI + L-glutamine with 10% calf bovine serum (Gemini Bio, Sacramento, CA). Cells were grown to a density between 250,000 and 500,000 cells/mL prior to electroporation. For the electroporation, one million cells were removed, concentrated by centrifugation, and resuspended in 100 μL of Ingenio Electroporation Solution (Mirus Bio, Madison, WI). Five to eight μg of cesium chloride or midi prepped (Omega) DNA for each δ-Pcdh to be expressed was added, and the cells electroporated using an Amaxa Nucleofector II (Lonza; program T-016, Cologne, Germany). Cells were allowed to recover for one hour at 37°C by immediate addition of 2 mL of CO2 equilibrated media. After recovering, valproic acid (VPA, 4 mM final; Sigma, St. Louis, MO) was added to promote expression. Preliminary control experiments showed VPA did not affect cell adhesion, as cells electroporated with vector only remained non-adherent up to four days. For coaggregation experiments, equal volumes of cells from a given electroporation were mixed immediately following the recovery period and placed in individual wells of a 6-well (2 mL/well) or 24-well (0.5 mL/well) plate. Cells were gently and continuously agitated at 15 RPM overnight in a tissue culture incubator at 37°C with 6% CO2.

Cell aggregation imaging
For initial aggregate size titration, 15-20 images were taken of each replicate using an inverted fluorescent microscope (Nikon, Tokyo, Japan) with a 10x objective. For speed and aggregate size experiments, ~6 fields of view were captured at each speed for each replicate using a confocal microscope (Zeiss LSM 510) with a 5x objective. For all other aggregation experiments, ~10-15 confocal images were captured of each replicate using a 10x objective.

CoAggregation index (CoAg)
To generate the Coaggregation Index, confocal images were analyzed using custom code ('CoAg') written in Mathematica (Wolfram Research, Champaign, IL). Briefly, each confocal image of an aggregate is parsed into squares just slightly larger than the area of a single cell. After removing all black squares from the image (those containing no cells), the remaining squares are analyzed to calculate the percent of squares that contain more than one color. As a result, cells that completely segregate from one another will have a very low CoAg index because few squares will contain more than one color. In contrast, cells that interface will have higher CoAg indices as green and red cells abut one another, while those that intermix will have the highest index.
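The tiling procedure described above can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the authors' Mathematica 'CoAg' code: the image is assumed to be already reduced to a grid of labels (0 = background, 1 = green, 2 = red), the function name `coag_index` and the `tile` parameter are hypothetical, and the result is returned as a fraction rather than a percent.

```python
def coag_index(image, tile):
    """Fraction of non-empty tiles that contain both colors.

    image: 2D list of pixel labels (assumed encoding: 0 = background,
           1 = green, 2 = red).
    tile:  tile edge length in pixels, assumed slightly larger than one cell.
    """
    rows, cols = len(image), len(image[0])
    occupied = 0  # tiles containing at least one cell pixel
    mixed = 0     # tiles containing more than one color
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            colors = {
                image[r][c]
                for r in range(r0, min(r0 + tile, rows))
                for c in range(c0, min(c0 + tile, cols))
            } - {0}
            if not colors:
                continue  # black tile (no cells): excluded from the count
            occupied += 1
            if len(colors) > 1:
                mixed += 1
    return mixed / occupied if occupied else 0.0


# Toy images: fully segregated halves vs. a checkerboard of the two colors.
segregated = [[1, 1, 2, 2] for _ in range(4)]
intermixed = [[1, 2, 1, 2], [2, 1, 2, 1], [1, 2, 1, 2], [2, 1, 2, 1]]
print(coag_index(segregated, 2), coag_index(intermixed, 2))
```

In these toy cases the segregated image gives the lowest possible value and the checkerboard the highest, matching the ordering described in the text (segregation < interfacing < intermixing).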

Aggregate size titration assay
K562 cells were electroporated and, following a one hour recovery period, allowed to form aggregates at 15 RPM overnight. At 24-26 hr, images were captured of each replicate. To determine size of aggregates, images were analyzed using the particle size plugin in ImageJ. Aggregates smaller than three cells were removed from the analysis to prevent dividing cells and single cells not participating in aggregation from skewing the results. Aggregate pixel size was compared to the pixel area of one cell to approximate the number of cells per aggregate.
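The size estimate described above amounts to dividing each aggregate's pixel area by the pixel area of one cell and discarding aggregates below three cells. A minimal sketch, assuming the areas have already been exported from the particle analysis; the function name, argument names, and example numbers are hypothetical.

```python
def cells_per_aggregate(aggregate_areas_px, single_cell_area_px, min_cells=3):
    """Approximate cell counts per aggregate, dropping aggregates < min_cells."""
    estimates = [area / single_cell_area_px for area in aggregate_areas_px]
    # Remove dividing cells and single cells not participating in aggregation.
    return [n for n in estimates if n >= min_cells]


# Made-up areas in pixels, with one cell assumed to cover 100 pixels.
sizes = cells_per_aggregate([100, 250, 300, 1200], single_cell_area_px=100)
print(sizes)
```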

Speed aggregation assay
K562 cells were electroporated and, following the 1 hr recovery period, allowed to form aggregates at 15 RPM overnight. At 24-26 hr, images were captured to establish a 15 RPM baseline. Plates were then returned to the incubator and the speed increased for 1 hr to 120 RPM. Images were then acquired, and this process was repeated at 160, 200 and 220 RPM. Each image was then analyzed using custom-written code ('Aggregate Size Measurement') in Mathematica (Wolfram Research, Champaign, IL) to measure the pixel size of each aggregate, and aggregate pixel size was then converted to microns.

Statistical analyses
For mismatch coaggregation assays, paired t-tests were performed between each paired population to determine statistical significance in Prism (GraphPad, La Jolla, CA). For aggregate speed and size analyses, analysis of variance (ANOVA) was performed in R.

Biotinylation assay
Surface biotinylation of live K562 cells was performed using the Pierce Cell Surface Isolation Kit (Thermo-Fisher) essentially as recommended. Volume of cell resuspension was reduced to 1 mL, and an additional 150 μL of lysis buffer was added to ensure complete mixing during incubation.

Western blot analysis
Western blots were performed by loading 8 μL (roughly 15% of the total elution from each biotinylation experiment) onto 10% SDS polyacrylamide gels. All primary antibodies used were monoclonal in origin, and carefully titrated to establish working dilutions of equivalent detection so that samples across antibodies could be compared. To achieve this, we calibrated working monoclonal concentrations with purified RFP and GFP proteins. We also electroporated cells with the same δ-Pcdh fused to different tags to optimize antibody dilution to account for variation in signal intensity. The antibodies used were mouse anti-GFP (1:4,000, Thermo-Fisher MA5-15256), mouse anti-RFP (1:2,000, Thermo-Fisher MA5-15257) and mouse anti-FLAG (1:6,000, Thermo-Fisher MA1-91878). We used the transferrin receptor (TfR) as a loading control for surface protein (1:1,000, Thermo-Fisher 13-6800). All antibodies were diluted in 20% glycerol upon receipt to promote cryostability. Estimation of band intensity was carried out using ImageJ.

Monte Carlo simulation (cellAggregator)
To investigate the aggregation behavior of cell populations expressing δ-Pcdhs of varying apparent adhesive affinities and expression, we performed Monte Carlo based simulations to describe cell binding interactions as a dynamic cell-cell network across discrete time steps using custom code (cellAggregator). Two cell populations, green (n = 25) and red (n = 25), were assigned properties of two hypothetical genes named A and B, corresponding to the coaggregation assay experiments conducted. For example, green cells could be designated as expressing high levels of A and low levels of B, and red cells as expressing low levels of A and high levels of B. The genes A and B were each also assigned binding affinities; for example, A possesses two times greater apparent adhesive affinity than B. The initial cell-cell network consists of the green and red cells as nodes in the network, and edges represent cell-cell binding interactions.
For each simulation, 100 time steps were performed. At each discrete time point, the cells are mixed and allowed to bind to other cells according to a 'speed dating' setup, where the majority of cell pairs (arbitrarily set at 75%) result in a cell-cell interaction. Allowing the majority (as opposed to all cell pairs) to bind avoids oscillatory network behavior. The probability that two cells would 'speed date' increased as the Euclidean distance between them in the force-directed network projection onto two dimensions decreased; that is, nodes more closely connected were more likely to 'speed date'. Once 'speed dating' begins, the cell pair would bind via the genes expressed by each cell, with unbound genes selected at random with a probability corresponding to the expression level. The duration of interaction (number of time steps) depended on the identity of genes. A-B interactions persisted for only a single time step, while B-B interactions persisted for three time steps, and A-A interactions persisted for three time steps multiplied by the affinity ratio. This differential length of time for cell-cell interactions is based on the idea that non-homophilic protocadherin interactions are unstable and do not persist (A-B), and that some protocadherins may have different levels of apparent adhesive affinity, leading to more persistent or stable binding time (e.g. A-A lasts more time steps than B-B if A is assigned greater affinity than B). The green or red color of the cells did not affect the binding of cell pairs.
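The gene selection and duration rules above can be illustrated with a short Python sketch. It is not the cellAggregator code: the function names are hypothetical, expression levels are assumed to be given as a simple gene-to-level mapping, and the bookkeeping of which gene copies are already bound is omitted for brevity.

```python
import random


def pick_gene(expression, rng):
    """Choose which gene a cell binds with, weighted by expression level.

    expression: dict mapping gene name -> relative expression level.
    """
    genes = list(expression)
    weights = [expression[g] for g in genes]
    return rng.choices(genes, weights=weights, k=1)[0]


def interaction_duration(gene_1, gene_2, affinity_ratio):
    """Number of time steps a bond persists, per the rules in the text."""
    pair = {gene_1, gene_2}
    if pair == {"A", "B"}:
        return 1                   # non-homophilic: unstable, one step
    if pair == {"B"}:
        return 3                   # B-B: baseline homophilic duration
    if pair == {"A"}:
        return 3 * affinity_ratio  # A-A: scaled by the A:B affinity ratio
    raise ValueError("genes must be 'A' or 'B'")


rng = random.Random(0)
green = {"A": 20, "B": 1}  # hypothetical: high A, low B
red = {"A": 1, "B": 20}    # hypothetical: low A, high B
bond = (pick_gene(green, rng), pick_gene(red, rng))
print(bond, interaction_duration(*bond, affinity_ratio=2))
```

With an affinity ratio of 2, an A-A bond lasts six steps against three for B-B, which is how higher apparent adhesive affinity translates into more persistent contacts in this scheme.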
Instantaneous network coaggregation was measured by calculating the average proportion of different-color to same-color binding partners across all cells in the network for any one time step. Cells with no network partners were not included in this calculation. The in silico coaggregation behavior for the entire simulation was then determined as an average of all instantaneous network coaggregations in the simulation. This value did not include initial time steps (arbitrarily set at 25% of the 100 total time steps) to allow for the network to stabilize following the initial state of all cells being unconnected. This resulted in a single overall in silico coaggregation index value determined for the simulation scenario. A total of 100 time steps were simulated for each scenario, and each scenario was repeated five times. To model varying affinity between genes, the affinity values were allowed to range between 1 (same affinity) and 10.
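A minimal sketch of the coaggregation measure follows. It assumes one reading of the description above, namely that each cell contributes the fraction of its binding partners that are of the other color; the function names and data layout are hypothetical and this is not the authors' implementation.

```python
def instantaneous_coag(colors, partners):
    """Mean fraction of different-color partners over all connected cells.

    colors:   dict mapping cell id -> 'green' or 'red'.
    partners: dict mapping cell id -> set of currently bound cell ids.
    """
    fractions = []
    for cell, bound in partners.items():
        if not bound:
            continue  # cells with no network partners are excluded
        different = sum(1 for other in bound if colors[other] != colors[cell])
        fractions.append(different / len(bound))
    return sum(fractions) / len(fractions) if fractions else 0.0


def simulation_coag(per_step_values, burn_in_fraction=0.25):
    """Average per-step values after discarding the initial time steps."""
    start = int(len(per_step_values) * burn_in_fraction)
    kept = per_step_values[start:]
    return sum(kept) / len(kept)


colors = {1: "green", 2: "green", 3: "red", 4: "red"}
partners = {1: {2, 3}, 2: {1}, 3: {1}, 4: set()}
print(instantaneous_coag(colors, partners))
print(simulation_coag([0.0] * 25 + [0.5] * 75))
```

In the toy network, cell 1 has one partner of each color, cell 2 has only a same-color partner, cell 3 has only a different-color partner, and cell 4 is unconnected and therefore ignored.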

Validation of NanoString data
Pcdh18 data was discarded due to an error in the codeset. However, Pcdh18 was not detected by RNA in situ hybridization experiments in the epithelium nor in subsequent single OSN qPCR experiments. Negative controls (e.g. water or media only) showed no signal following amplification, indicating a lack of contamination. To validate the NanoString data, we first performed a 'pool-split' experiment to determine technical reproducibility. RNA from 12 single cells was pooled and then split into multiple aliquots. Each aliquot was separately amplified and processed to assess technical reproducibility. Samples showed good correlation (R² = 0.62; data not shown). Second, we asked if averaging the expression patterns from single neurons approximated the pattern seen using bulk epithelial RNA. We found strong correlation (R² = 0.65) despite the fact we only analyzed 50 cells, and bulk RNA contains neurons, glia, and other cell-types (data not shown). Finally, multiple discriminant analysis (MDA) showed that pool-split samples clustered with single cells while the water and bulk samples formed separate, discrete clusters (data not shown).
To address the concern that dissociation of whole epithelia would affect δ-Pcdh expression, we generated a proxy for in vivo expression by performing single color RNA in situ hybridization studies (Figure 1-figure supplement 1A-G; no signal was detected for Pcdh11x or Pcdh18). Interestingly, the pattern of expression was clearly variable among neurons, and unevenly distributed within the epithelium (Figure 1-figure supplement 1B-G). We used this RNA in situ data to estimate the proportion of OSNs that express each δ-Pcdh (Figure 1-figure supplement 1J; see Materials and methods). We found that our single neuron data and these in vivo estimates followed similar trends (R² = 0.58), suggesting dissociation did not have an appreciable impact on our NanoString data.