How clustered protocadherin binding specificity is tuned for neuronal self-/nonself-recognition

The stochastic expression of fewer than 60 clustered protocadherin (cPcdh) isoforms provides diverse identities to individual vertebrate neurons and a molecular basis for self-/nonself-discrimination. cPcdhs form chains mediated by alternating cis and trans interactions between apposed membranes, which has been suggested to signal self-recognition. Such a mechanism requires that cPcdh cis dimers form promiscuously to generate diverse recognition units, and that trans interactions have precise specificity so that isoform mismatches terminate chain growth. However, the extent to which cPcdh interactions fulfill these requirements has not been definitively demonstrated. Here, we report biophysical experiments showing that cPcdh cis interactions are promiscuous, but with preferences favoring formation of heterologous cis dimers. Trans homophilic interactions are remarkably precise, with no evidence for heterophilic interactions between different isoforms. A new C-type cPcdh crystal structure and mutagenesis data help to explain these observations. Overall, the interaction characteristics we report for cPcdhs help explain their function in neuronal self-/nonself-discrimination.

To explain how about 60 cPcdh isoforms can provide a level of neuronal diversity comparable to, or even greater than, that of 19,000 Dscam isoforms, Rubinstein et al. (2015) proposed that interacting cPcdhs located on apposed membrane surfaces would form an extended, linear zipper-like lattice through alternating cis and trans interactions (Figure 1D). In self-interactions, between two membranes with identical cPcdh repertoires, these chains would grow to form large structures, limited mainly by the number of molecules (Brasch et al., 2019; Rubinstein et al., 2015). However, in non-self-interactions, between two membranes with differing cPcdh repertoires, such large linear assemblies would not form, since even a single mismatch between expressed isoforms would terminate chain assembly (Brasch et al., 2019; Rubinstein et al., 2017; Rubinstein et al., 2015). This "isoform-mismatch chain-termination model" for the "barcoding" of vertebrate neurons envisions the assembly of long cPcdh chains between sites of neurite-neurite contact to represent the signature of "self", and to be translated by downstream signaling that leads to self-

2) The assumption that cis interactions are promiscuous is based in large part on the fact that α-cPcdhs and γC4 cannot reach the cell surface without binding to another "carrier" isoform (Bonn

quantitatively in a small number of cases so that the term "promiscuous" is qualitative at best. In fact, as compared to γB and β cPcdh isoforms, most γA-Pcdhs do not form measurable cis homodimers in solution or, in some cases, form only weak cis homodimers with a KD > 100 µM, as in the case of γA3 EC3-6. Nevertheless, all γA-Pcdhs are still able to reach the cell surface when expressed alone. This observation can be understood if the cis dimerization affinity of γA-Pcdhs is large enough to enable them to dimerize in the 2D membrane environment (Goodman et al., 2016b; Wu et al., 2013).
Nevertheless, their weak dimerization affinities suggest, more generally, that cPcdhs may exhibit a range of cis dimerization affinities. We establish below that a wide range of affinities does in fact exist and, strikingly, most homophilic cis interactions are weaker than their heterophilic counterparts. We consider the functional implications of this novel observation in the discussion.

3) Structures have not yet been determined for complete C-type cPcdh ectodomains. Yet these isoforms play unique functional roles, some of which have no apparent connection to isoform diversity. For example, a single C-type isoform is sufficient for tiling, which can be simply understood in terms of the formation of zippers containing identical homodimers so that all interacting neurons will avoid one another (Chen et al., 2017). Moreover, Garrett and coworkers recently discovered that neuronal survival and postnatal viability are controlled solely by γC4, suggesting a function that is unique to this isoform (although it presumably requires β and/or other γ carriers to reach the cell surface) (Garrett et al., 2019). Below we report extensive biophysical interaction studies of C-type isoform ectodomains and the first crystal structure of a trans dimer formed by γC4. Our findings reveal that the specialized functions of C-type cPcdhs probably do not involve unique structural or biophysical properties of their ectodomains.
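The logic of the isoform-mismatch chain-termination model described above can be illustrated with a deliberately simplified sketch. It is not the simulation used by Rubinstein et al. (2015). It assumes that trans binding is strictly homophilic, that every cis pairing within a membrane is equally likely (an idealization, since cis preferences exist), and that a chain grows one cis dimer at a time, alternating between the two membranes. The function name, the isoform labels and the `max_units` cap, which stands in for the finite number of molecules, are all hypothetical.

```python
import random


def grow_chain(repertoire_a, repertoire_b, max_units=500, seed=0):
    """Count the cis-dimer units added before a mismatch ends a toy chain.

    Each unit is a cis dimer on one membrane. One protomer binds the previous
    unit in trans; the other protomer (the "outgoing" isoform) must find an
    identical isoform on the apposed membrane for the chain to continue.
    """
    rng = random.Random(seed)
    membranes = (sorted(set(repertoire_a)), sorted(set(repertoire_b)))
    side = 0  # start with a cis dimer on membrane A
    # Assumption: the cis partner is drawn uniformly from the repertoire.
    outgoing = rng.choice(membranes[side])
    units = 1
    while units < max_units:
        apposed = membranes[1 - side]
        if outgoing not in apposed:
            break  # strict homophilic trans binding: a mismatch ends the chain
        side = 1 - side
        outgoing = rng.choice(membranes[side])
        units += 1
    return units


# Hypothetical repertoires: identical ("self") versus one isoform different.
self_cell = ["a7", "b6", "gA8", "gB2", "gC3"]
other_cell = ["a7", "b6", "gA8", "gB2", "gC5"]
print(grow_chain(self_cell, self_cell))   # grows until the max_units cap
print(grow_chain(self_cell, other_cell))  # ends once a mismatched isoform is drawn
```

With identical repertoires the toy chain is limited only by the cap, whereas a single differing isoform gives every growth step a chance of termination, which is the qualitative behavior the model requires.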

Overall, in accordance with the requirements of the isoform-mismatch chain-termination model, we find that trans-homophilic interactions are remarkably precise, with no evidence for heterophilic interactions between different cPcdh isoforms. In contrast, cPcdh cis interactions are largely promiscuous but with relatively weak intra-subfamily and, especially, homophilic interactions. Possible implications of this somewhat surprising finding are considered in the discussion. Our study reveals how the extraordinary demands posed by the need to assign each neuron a unique identity are met by an unprecedented level of protein-protein interaction specificity.

isoforms (α7, β6, β8, γA8, γA9 and γB2), which include the most closely related isoforms by sequence identity from the β and γA subfamilies (β6/8 and γA8/9) (Rubinstein et al., 2015). These molecules were coupled over independent Neutravidin-immobilized flow cells, and trans-interacting ectodomain fragments of multiple members of each cPcdh subfamily, including the C-types (α4, α7, α12, β6, β8, γA4, γA8, γA9, γB2, γB4, γB5, αC2, γC3, γC4, and γC5), were then flowed over the six cPcdh surfaces to assess their binding. The SPR binding profiles reveal strictly homophilic binding (Figure 2A). Remarkably, no heterophilic binding was observed for any of the analytes over any of the six surfaces (Figure 2A). Even β6/8 and γA8/9, which have 92% and 82% sequence identities respectively in their trans-binding EC1-4 regions, exhibit no heterophilic binding. Mutations designed to disrupt α7, β6, and γA8 trans interaction inhibited homophilic binding, demonstrating that the observed binding occurs via the trans interface (

behavior is unlike that of other adhesion receptor families where, whether they display homophilic or heterophilic preferences, the signal is never as binary as the one shown in Figure 2 (Honig and Shapiro, 2020).
We estimate that, even for these pairs, the lower limit for KD would be ~200 µM.

Much of the original evidence as to homophilic specificity was based on cell aggregation assays (Schreiner and Weiner, 2010; Thu et al., 2014), and it is of interest to compare the results obtained from these assays to those obtained from SPR. We do this in the context of examining the heterophilic binding specificity between the β6 EC1-4 and β8 EC1-4 trans fragments, which share 92% sequence identity and differ at only five residues (

binding interfaces. Each of these residues was mutated individually and in combination. Figure 2-figure supplement 2B and C display SPR profiles and cell aggregation images, respectively, for wild-type β6 and β8 and for the various mutants. We first note that changing all five residues in β6 to those of β8 generates a mutant protein with essentially wild-type β8 properties: it binds strongly to β8 but not to β6 in SPR, and also forms mixed aggregates with β8 but not β6. In contrast, most of the single-residue mutants retain β6-like properties in both assays, whereas double and triple mutants exhibit behavior intermediate between β6 and β8. These results demonstrate that, despite the 92% sequence identity between β6 and β8, their highly specific homophilic properties can be attributed to five interfacial residues. Moreover, the cell aggregation assays are consistent with the heterophilic binding traces measured by SPR; cells expressing mutants that generate strong SPR signals with either wild-type β6 or β8 also form mixed aggregates with cells expressing the same wild-type protein.

Of note, trans-interacting fragments of all four C-type cPcdhs tested showed no binding over the alternate cPcdh SPR surfaces (Figure 2B).
To test whether C-type cPcdhs also show strict homophilic specificity with respect to each other, we coupled biotinylated trans-interacting fragments of αC2, γC3, γC4, and γC5 to SPR chips and passed the same four fragments alongside alternate cPcdh trans fragments over these four surfaces. Only homophilic binding was observed, with each of the four C-type fragments binding to its cognate partner and no other isoform (Figure 2B). Disrupting the γC5 trans interaction with the S116R mutation (Rubinstein et al., 2015) inhibited binding to the γC5 surface, demonstrating that the observed binding occurs via the trans

Since all the cPcdh trans fragment molecules used in these SPR experiments homodimerize, our SPR data cannot be used to determine accurate binding affinities (Rich and Myszka, 2007). We therefore used AUC to measure the trans-homodimer KDs (Figure 2-source data 1), revealing a wide range of binding affinities from 2.9 µM (α7 EC1-5) to >500 µM (γC4 EC1-4). Strict homophilic specificity is therefore not contingent on the strength of the homophilic interaction.

although it is comparatively divergent from the δ1 ncPcdh1 trans dimer (Modak and Sotomayor, 2019). This is not particularly surprising since, by sequence identity, γC4 is as similar over its EC1-4 domains to the δ2 non-clustered Pcdhs as it is to the alternate and other C-type cPcdhs (~40-

µM. This may be due to the presence of two buried charges in the interface, E78 and D290 (Figure 3-figure supplement 2).

Clustered protocadherin cis interactions are promiscuous with a range of interaction strengths

To systematically investigate cPcdh cis interactions, we coupled cis-interacting fragments of β9, γA4, γA9, γB2, αC2, γC3, and γC5 to SPR chip surfaces.
Cis-interacting fragments of three members from each of the β, γA, and γB subfamilies (β1, β6, β9, γA3, γA4, γA9, γB2, γB5, γB7) alongside αC2, γC3, and γC5 fragments were flowed over the seven surfaces to detect their heterophilic binding (Figure 4A). α-Pcdhs, αC1, and γC4 were not included in this study since EC6-containing fragments of these molecules cannot be expressed, although an α7 EC1-5/γC3 EC6 chimera was included among the analytes to assess the role of α7 EC5 (

The data clearly demonstrate a wide range of cis dimerization affinities, with strong heterophilic binding signals (500-2000 RU) and much weaker homophilic binding responses, typically between 100-140 RU. The strongest heterophilic cis interactions are in the sub-micromolar range; for example, γC3/β9 cis-dimerize with a KD of 0.22 µM, while β9 EC3-6, αC2 EC2-6 and γC5 EC2-6 homodimerize with AUC-determined KDs of 9-35 µM. In addition to uniformly weak homophilic interactions, within-subfamily cis interactions were consistently among the weakest observed, although a number of inter-subfamily interactions were also relatively weak (Figure 4A). For example, for the β9 surface, comparatively weak binding was observed for all tested β and γA isoforms except γA3, with the monomeric β1, γA4 and γA9 producing low responses that could not be fit to a binding isotherm to calculate accurate KDs (Figure 4B, Figure 4-figure supplement 1B). In contrast, robust binding to the β9 surface was observed for all γB and C-type isoforms.

These data are consistent with the binding responses when β9 was used as an analyte over the other six surfaces, with weak to no binding observed over the γA4 and γA9 surfaces and robust responses over the γB2, αC2, γC3, and γC5 surfaces (Figure 4A). The γA4 and γA9 surfaces showed a similar pattern of binding behaviors, with weak to no binding observed for the γA and αC2 analytes, and robust binding for the γC-cPcdhs, with KDs for γC3 EC3-6 of 2.73 and 9.60 µM respectively over each

asymmetric nature of the cis interaction implies that for each dimer interaction there are two possible arrangements: one with protomer "1" forming the EC5-6 side and protomer "2" forming the EC6-only side, and the second where protomer "1" forms the EC6-only side and "2" the EC5-

supplement 1B,C). Two of these residues, V555 and S595, result in a potential loss of EC6-only interface buried surface area and are shared with α-cPcdhs, which cannot occupy the EC6-only

Residues which were mutated in panel B are circled in red. γB7 crystal structure numbering is used for both γA4 and γC3 residues. See methods for γA4 and γC3 alignment. (B) Top, SEC-MALS data for an equimolar mixture of wild-type γA4 EC3-6 and γC3 EC3-6 showing dimer formation. Plot shows size exclusion absorbance at 280 nm trace (left axis), molecular weight of the eluant peaks (right axis), and the monomer molecular weights of γA4 EC3-6 and γC3 EC3-6 measured by mass spectrometry (54.5 kDa and 56.5 kDa respectively) as dashed grey lines. Average molecular weights of the molecules in the dimer and monomer eluant peaks are labeled. Middle, SEC-MALS data for V560R mutants, which target the EC6-only side of the interface. Bottom, SEC-MALS data for residue 558 mutants. The γC3-like K558R mutation in γA4 inhibits heterodimer formation with wild-type γC3. Similarly, the γA4-like R558K in γC3 inhibits dimerization with wild-type γA4.
(C) SPR binding profiles for γB7 EC3-6 wild type and cis interface mutants flowed over three individual wild-type cis fragment surfaces. Each of the two mutations specifically targets one side of the cis interface.

γA4 V560R did not dimerize with wild-type γC3, whereas γC3 V560R could still dimerize with wild-type γA4 (Figure 5B). Therefore, impairing γA4's EC6-only interface blocks γA4/γC3 dimer formation while impairing γC3's EC6-only interface does not (although the dimerization appears to be weaker compared to the wild-type γA4/γC3 cis-interacting pair). We also generated a γC3-like mutant of γA4, K558R, which also targets the EC6-only interface. Like γA4 V560R, γA4 K558R did not dimerize with wild-type γC3 in MALS, a result replicated in SPR experiments (Figure 5B, Figure 5-figure supplement 2B). The reverse mutation in γC3, R558K, inhibited dimerization with wild-type γA4 (Figure 5B). Therefore, like the α-specific R560 residue, γC3-specific R558 has distinct effects on dimerization when in γA4 or γC3, inhibiting heterodimerization when mutated into γA4 but promoting heterodimerization in γC3. Together these data suggest that the γA4/γC3 dimer has a preferred orientation, with γA4 predominantly occupying the EC6-only position and γC3 the EC5-6 side. Our data also account for the fact that neither isoform homodimerizes in solution, since the EC5-6 side would be impaired in the γA4 homodimer while the EC6 side would be impaired in the γC3 homodimer.

Next, we sought to test whether γA4 and γC3 preferentially adopt these specific positions in cis interactions with a γB isoform. To accomplish this, we generated mutants of γB7 individually targeting the EC6-only interaction surface (γB7 Y532G) and the EC5-6 side (γB7 A570R) (Goodman et al., 2017) (Figure 4-source data 1). In SPR, γB7 Y532G had only a small impact on γA4 binding, while γB7 A570R abolished γA4 binding (Figure 5C).
In contrast, γB7 Y532G prevented γC3 binding while γB7 A570R showed robust γC3 binding (Figure 5C). These results suggest that γA4/γB7 and γC3/γB7 cis heterodimers also have preferred orientations, with γA4 and γC3 maintaining their preferences for the EC6-only and EC5-6 positions respectively. Additionally, SPR data for the γB7 mutants over the αC2 surface suggest that αC2 preferentially occupies the EC6-only side in αC2/γB7 dimers (Figure 5C). This is notable since αC2 forms robust cis-homodimers and therefore, like γB7, can presumably readily occupy both positions in its homophilic interactions, implying that the αC2/γB7 orientation preference could be specific to the particular heterodimer pairing.

The cis binding preferences indicated by our data can be largely understood in terms of the asymmetric interface discussed above. Specifically, different isoforms preferentially form one side of the cis dimer: for example, the EC6-only side for Pcdh-γA4 and the EC5-6 side for Pcdh-γC3. Homodimerization requires participation of a single isoform on both sides of an interface, posing challenges in the optimization of binding affinities since, in some cases, the same residue must participate in different intermolecular interactions. Given significant sequence conservation in all members of an alternate cPcdh subfamily (Figure 4-figure supplement 3), even intra-subfamily heterophilic interactions are more difficult to optimize relative to inter-subfamily heterodimerization, where there are no constraints on the two interacting surfaces.
Additionally, the robust cell surface delivery of many cPcdhs in cells expressing only a single isoform also suggests that all carrier isoforms (β-, γA-, and γB-cPcdhs, plus C-types αC2, γC3, and γC5) can fill both the EC6 and EC5-6 roles, as cis-dimer formation is thought to be required for cell surface

One possible advantage of weak homophilic cis interactions would be to ensure that, once reaching the cell surface, a diverse set of cis dimers forms. This explanation implicitly assumes that most isoforms (except for α-Pcdhs and γC4) reach the surface as homodimers that must then quickly dissociate and form more stable heterodimers. Another possible explanation posits that zippers consisting of homodimers are easier to form in a kinetic sense than heterodimeric zippers, and that this would reduce the diversity required in the chain termination model since, in this model, it is essential that all isoforms be incorporated into a growing zipper. The formation of long homophilic zippers might lead to a repulsive phenotype even when mismatches are present.

However, these explanations would not easily account for interfamily heterophilic preferences. Our results for C-type isoforms suggest that other factors may be involved. C-type cPcdhs have different functions than alternate cPcdhs and these are reflected in different expression patterns. For example, αC2 can be alone responsible for tiling (Chen et

Figure 4A, discussed below) were tested at six concentrations (24, 8, 2.667, 0.889, 0.296, and 0.099 µM), similarly prepared using a three-fold dilution series. γC3 EC3-6 binding over β9 EC3-6 (Figure 4A) was tested at five concentrations from 8 to 0.099 µM.

For all experiments, analyte samples were injected over the captured surfaces at 50 µL/min for 40 s, followed by 180 s of dissociation phase, a running buffer wash step and a buffer injection at 100 µL/min for 60 s.
Protein samples were tested in order of increasing concentration, and the entire concentration series was repeated to confirm reproducibility. Every three binding cycles, buffer was used as an analyte instead of a protein sample to double reference the binding responses by removing systematic noise and instrument drift. The resulting binding curves were normalized for molecular weight differences according to data provided by mass spectrometry for each molecule. The data were processed using Scrubber 2.0 (BioLogic Software). To provide an estimate of the number of possible heterophilic binding pairs, we used a cut-off of 40 RU, which is the lowest signal observed for a homodimeric cis fragment pair, γB2 EC3-6.

In Figure 4A, β6 EC1-6 and β9 EC3-6 were tested over γC3 EC3-6 at six concentrations ranging from 900 to 3.7 nM, which is 27-fold lower than for the other interactions, prepared using a three-fold dilution series in a running buffer containing increased concentrations of imidazole (100 mM) and BSA (0.5 mg/mL) to minimize nonspecific interactions. For these two interactions, although analyte samples were injected over the captured surfaces at 50 µL/min for 40 s, the dissociation phase was monitored for 300 s to provide additional time for complex dissociation. Nevertheless, higher analyte concentrations produced binding profiles that were not reproducible, most likely because the bound complexes could not dissociate completely at these concentrations.

For the calculation of heterophilic KDs for the monomeric cis fragments β1 EC3-6, γA4 EC3-6, γA9 EC3-6 and γC3 EC3-6 over each of the six surfaces, except β9 EC3-6, the duplicate binding responses were fit globally using a 1:1 interaction model, and a single KD was calculated as the analyte concentration that would yield 0.5 Rmax, with the fitting error indicated in brackets. KDs < 24 µM were calculated using an independent Rmax.
For KDs > 24 µM, the Rmax was fixed to a global value determined by the Rmax of a different cPcdh analyte tested over the same surface during the same experiment that showed binding above 50% and therefore produced a more accurate Rmax. For KDs > 50 µM, a lower limit is listed since, at the analyte concentrations used, we could not measure accurate KDs even when the Rmax is fixed. The binding curves of γC3 EC3-6 over the β9 EC3-6 surface did not come to equilibrium during the time-course of the experiment, so a kinetic analysis was performed to calculate a KD (Figure 4-figure supplement 1A). Binding of γC3 EC3-6 was tested using a concentration range of 900-0.411 nM prepared using a three-fold dilution series in a running buffer containing increased concentrations of imidazole (100 mM) and BSA (0.5 mg/mL) to minimize any nonspecific interactions. Protein samples were injected over the captured surfaces at 50 µL/min for 90 s, followed by 420 s of dissociation phase, a running buffer wash step and a buffer injection at 100 µL/min for 60 s. Protein samples were tested in order of increasing concentration in triplicate to confirm reproducibility. Every three binding cycles, buffer was used as an analyte instead of a protein sample to double reference the binding responses by removing systematic noise and instrument drift. The binding data were analyzed using a 1:1 interaction model to calculate the kinetic parameters and the KD.

conducted. The higher resolution (2.4 Å) crystal form 2 crystal structure (see below) was used as a reference model in later rounds of iterative model-building and refinement to guide the local geometry choices in this lower resolution structure. Final refinement statistics are given in Figure 3-source data 1.
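The equilibrium SPR analysis described above (a three-fold dilution series, a 1:1 interaction model, and a KD defined as the analyte concentration giving 0.5 Rmax) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Scrubber 2.0 procedure used for the reported values. The function names are hypothetical, the responses are synthetic and noise-free, the Rmax of 100 RU is an arbitrary placeholder, and the least-squares search assumes a single minimum within its bounds.

```python
def dilution_series(top_conc, n_points, factor=3.0):
    """Concentrations starting at top_conc, each `factor`-fold lower than the last."""
    return [top_conc / factor ** i for i in range(n_points)]


def response_1to1(conc, kd, rmax):
    """Equilibrium response of a 1:1 interaction: R = Rmax * C / (KD + C)."""
    return rmax * conc / (kd + conc)


def fit_kd_fixed_rmax(concs, responses, rmax, log_lo=-3.0, log_hi=4.0, iters=200):
    """Least-squares KD with Rmax held fixed, by ternary search on log10(KD).

    Assumes the sum of squared errors has a single minimum between
    10**log_lo and 10**log_hi (same units as concs).
    """
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((response_1to1(c, kd, rmax) - r) ** 2
                   for c, r in zip(concs, responses))

    lo, hi = log_lo, log_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if sse(m1) < sse(m2):
            hi = m2  # minimum lies left of m2
        else:
            lo = m1  # minimum lies right of m1
    return 10.0 ** ((lo + hi) / 2.0)


# Six-point series from 24 uM, as used for most analytes.
concs_uM = dilution_series(24.0, 6)
print([round(c, 3) for c in concs_uM])  # [24.0, 8.0, 2.667, 0.889, 0.296, 0.099]

# Synthetic responses for KD = 9.6 uM and a placeholder Rmax of 100 RU.
synthetic = [response_1to1(c, 9.6, 100.0) for c in concs_uM]
print(fit_kd_fixed_rmax(concs_uM, synthetic, rmax=100.0))  # recovers about 9.6
```

Fixing Rmax, as in the sketch, mirrors the treatment of weak interactions in the text: when the highest analyte concentration is below the KD, the curve does not approach saturation, so Rmax and KD cannot both be determined reliably from the same data. The same helper also reproduces the nanomolar series: a six-point three-fold series from 900 nM ends near 3.7 nM.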

Cis interface mutants
In our studies of Pcdh cis interactions we have found that mutagenesis of the cis interface commonly has a deleterious impact on protein expression levels in our system (Goodman et al., 2017). We assume this is because cis interaction is required for robust cell-surface delivery/secretion (Thu et al., 2014), although this has not been specifically addressed in our HEK293 protein expression system.

To test our structure-guided hypotheses regarding γA4 and γC3's cis interactions and side preferences, we tried to make a number of different cis interface mutants and were able to obtain four different mutants (see

(C) (i) Schematic of the γC3 EC6/γA4 EC5-6 cis dimer. (ii) Model of the γC3 EC6/γA4 EC5-6 cis dimer. The model suggests that this orientation for the γA4/γC3 cis dimer interaction will be disfavored. Unfavorable residue differences between γB7 and γA4/γC3 in this orientation are noted in red.
γA4 to make it more like γC3 prevents γA4/γC3 cis-heterodimerization

(A) SEC-MALS data for wild-type γA4 EC3-6, wild-type γC3 EC3-6, and γC3 EC3-6 V560R showing that all three molecules are monomeric in SEC-MALS, consistent with their behavior in sedimentation equilibrium AUC. Plots show size exclusion absorbance at 280 nm trace in blue (left axis), molecular weight of the eluant peak in black (right axis), and the monomer molecular weight of γA4 EC3-6 or γC3 EC3-6 measured by mass spectrometry (54.5 kDa and 56.5 kDa respectively) as dashed grey lines. Average molecular weights of the molecules in the eluant peaks are labeled. (B) SPR binding profiles for γA4 EC3-6 wild type and γA4 EC3-6 with the γC3-like cis interface mutation K558R flowed over immobilized wild-type γC3 EC3-6. Loss of γC3 EC3-6 interaction in the presence of the K558R mutation is consistent with the SEC-MALS results shown in Figure 5.