Divergent Evolution of a Protein–Protein Interaction Revealed through Ancestral Sequence Reconstruction and Resurrection

Abstract The postsynaptic density extends across the postsynaptic dendritic spine with discs large (DLG) as the most abundant scaffolding protein. DLG dynamically alters the structure of the postsynaptic density, thus controlling the function and distribution of specific receptors at the synapse. DLG contains three PDZ domains and one important interaction governing postsynaptic architecture is that between the PDZ3 domain from DLG and a protein called cysteine-rich interactor of PDZ3 (CRIPT). However, little is known regarding functional evolution of the PDZ3:CRIPT interaction. Here, we subjected PDZ3 and CRIPT to ancestral sequence reconstruction, resurrection, and biophysical experiments. We show that the PDZ3:CRIPT interaction is an ancient interaction, which was likely present in the last common ancestor of Eukaryotes, and that high affinity is maintained in most extant animal phyla. However, affinity is low in nematodes and insects, raising questions about the physiological function of the interaction in species from these animal groups. Our findings demonstrate how an apparently established protein–protein interaction involved in cellular scaffolding in bilaterians can suddenly be subject to dynamic evolution including possible loss of function.


Introduction
The postsynaptic density extends across the postsynaptic dendritic spine and is composed of receptors, signaling enzymes, cytoskeletal structural elements, and cytoplasmic scaffolding proteins. A major role of the postsynaptic density is to stabilize and anchor glutamate receptors such as AMPA and NMDA. Proteins from the discs large (DLG) family such as postsynaptic density protein-95 (PSD-95), also called DLG4, are involved in different protein-protein interactions in the postsynaptic density. Here, they control molecular organization and regulate synaptic strength by altering the function and distribution of AMPA receptors at the synapse . DLG4 contains five folded domains: Postsynaptic density-95/discs large/Zonula occludens (PDZ)1, PDZ2, PDZ3, Src homology 3 (SH3), and guanylate kinase like (GK). DLG4 is one of four paralogs in vertebrates together with SAP97 (DLG1), PSD-93 (DLG2), and SAP102 (DLG3), respectively. These four proteins arose as a result of two consecutive whole-genome duplications in the vertebrate lineage $440 Ma (McLysaght et al. 2002;Putnam et al. 2008). The amino acid sequences of the three PDZ domains found in each of these proteins are well conserved. In fact, the identity and similarity are high also for PDZ domains in the corresponding orthologous proteins in evolutionarily distantly related animals such as Drosophila melanogaster and the fresh-water polyp Hydra vulgaris ( fig. 1A).
The PDZ domain fold is ancient and based on sequence similarity it can be traced in animals as well as in fungi and plants (Emes and Grant 2011), which contain a permuted variant (Ivarsson et al. 2008). Within the animal kingdom, there are examples of genomes containing hundreds of distinct PDZ domains, showing that its function as a protein-protein interaction module has been under positive selection following gene or genome duplications. This abundance of PDZ domains in the animal proteome is likely facilitated by its structural architecture allowing easy integration into other proteins (Harris and Lim 2001). PDZ domains consist of 80-100 residues and have a compact globular fold usually containing five to six antiparallel b strands and two a helices. However, DLG PDZ3 contains a third a helix at the C-terminus (a 3 ) ( fig. 1C).
Protein ligands of PDZ domains usually bind via their respective intrinsically disordered C-terminus to a binding groove in the PDZ domain. This groove is shaped by a conserved motif in the b 1 b 2 loop (GLGF or variants of it), the b 2 strand and the a 2 helix. In most cases, the C-terminus of the protein ligand arranges as an antiparallel b strand in the groove together with the b 2 strand ( fig. 1C). The part of the protein ligand that binds to the PDZ domain is called the PDZ-binding motif (PBM). Affinity is primarily dictated by the protein ligand residues denoted P À2 and P 0 , where P 0 is the C-terminal residue and P À2 the third residue from the C-terminus ( fig. 1D). Early studies (Songyang et al. 1997;Stricker et al. 1997) led to classification of PBMs as type I (T/S-X-/-COOH), type II (/-X-/-COOH), or type III (D/E-X-/-COOH), based on the nature of the residues at P À2 and P 0 . However, the classification of PDZ domains only depending on two residues of the binding partner is an over simplification. Not only is the specificity of PDZ-ligand interactions usually dependent on more than P À2 and P 0 residues of the ligand (Ernst et al. 2014) but also several properties of the PDZ domains themselves modulate binding, for example, residues outside of the binding groove ) and extensions to the canonical PDZ domain such as a 3 in DLG PDZ3 (Zeng et al. 2016).
One protein ligand for DLG PDZ3 is cysteine-rich interactor of PDZ3 (CRIPT) (Niethammer et al. 1998). This is an interaction that is well studied both from a structural and biophysical point of view, since it was the first structure solved of a PDZ domain with peptide ligand (Doyle et al. 1996). Full affinity of CRIPT (K d in the low mM range for the Homo sapiens CRIPT:DLG4 PDZ3 interaction) is obtained with the six last amino acid residues (Saro et al. 2007; Gianni et al. 2011;Toto et al. 2016). CRIPT has a neurological function and is required for DLG to promote dendrite growth by AMPA activation (Beique et al. 2006;Zhang et al. 2008Zhang et al. , 2017. However, certain animals including the model organisms Dr. melanogaster (where DLG was first identified) and Caenorhabditis elegans have CRIPT proteins that lack the classical PBM ( fig. 1B and supplementary fig. 1, Supplementary Material online). In Dr. melanogaster, DLG is involved in a range of processes, including synaptic clustering of Shaker potassium channels (Tejedor et al. 1997), junction structure, cell polarity, and localization of membrane proteins. Recent work shows that CRIPT clusters next to DLG in the synapse, thereby promoting dendrite growth (Zhang et al. 2017). Furthermore, it was reported from knock down experiments that CRIPT is essential for DLG dependent dendrite growth (Zhang et al. 2017). Thus, there is some ambiguity whether CRIPT from, for example, Dr. melanogaster, binds PDZ3 domains when it lacks the classical PBM and whether the biological action of DLG and CRIPT is partially independent of the PDZ3:CRIPT interaction (Zhang et al. 2017  MBE Evolutionary biochemistry, that is, the combination of phylogenetic reconstruction of ancestral sequences with expression and experimental characterization of the ancient proteins with different methods is a powerful approach for understanding protein function (Hochberg and Thornton 2017), including protein-protein interactions (Hultqvist et al. 2017;Jemth et al. 2018;Wheeler et al. 2018). To better understand the evolution and hence function of the PDZ3:CRIPT interaction, we reconstructed sequences of ancestral phylogenetic tree-matched variants of PDZ3 and CRIPT and compared their binding affinity and stability to PDZ3 and CRIPT from seven extant species ( fig. 2). We show that the PDZ3:CRIPT interaction has evolved such that the affinity has been maintained or slightly increased in most extant animal phyla (chordates, hemichordates, mollusks, and cnidarians), whereas affinity seems to be lost among most nematodes and insects. Our findings raise two questions: Is CRIPT the natural ligand for DLG PDZ3, and is control of AMPA and NMDA receptors independent of the PDZ3:CRIPT interaction, at least in nematodes and insects?

Reconstruction of Ancestral Sequences
Protein sequences for DLG family members (including DLG4 and its vertebrate paralogs DLG1, DLG2, and DLG3) and CRIPT proteins were recovered from the NCBI database (NCBI Resource Coordinators) (NCBI Resource Coordinators). The final set of sequences for ancestral reconstruction consisted of 309 unique sequences of CRIPT from 498 different species and 249 unique sequences of DLG family PDZ3 domains from 324 different species (supplementary figs. 1 and 2, Supplementary Material online). The number of different species exceeded the number of unique sequences due to 100% sequence identity between several closely related species. Although all of the PDZ3 sequences were from Metazoan (animal) species, CRIPT sequences also included other kingdoms since homologous proteins with high sequence similarity were found in both fungi and plants. Within the animal kingdom, the similarity between homologous sequences of both DLG PDZ3 and CRIPT was very high (supplementary figs. 1 and 2, Supplementary Material online). Availability and quality of sequences were variable so the widest most representative data set was constructed using search methods, sequence alignment, filtering, and sorting using Python programing and manual curation. A good multiple sequence alignment is key for a reliable ancestral sequence reconstruction (Vialle et al. 2018). The high sequence identity and similarity for both PDZ3 (86.2% and 93.6%, respectively) and CRIPT (82% and 73%, respectively, for the C-terminal six residues) and, importantly, absence of insertions and deletions provided a high-quality alignment. Moreover, the high sequence similarity yielded high posterior probabilities (>0.8) for the maximum likelihood estimate (ML) of reconstructed ancestral sequences for most reconstructed positions: 94-97% in PDZ3 (except Hexapoda ancestor, 83%) and five or six out of six residues in the PBM of CRIPT (except Deuterostomia CRIPT, four out of six) (supplementary data excel files 1 and 2, Supplementary Material online). We were able to reconstruct PDZ3 from five historical time points, namely from the common ancestors of all extant bilaterians, deuterostomes, protostomes, and hexapods (insects), respectively, as well as from the common ancestor of all extant vertebrates containing four DLG paralogs (figs. 1A and 2). The PDZ3 from the latter ancestor is denoted 1R PDZ3, since the four DLG paralogs arose as a result of two consecutive whole-genome duplications called 1R and 2R, respectively (McLysaght et al. 2002;Putnam et al. 2008). The C-terminus, that is, the PBM of CRIPT was reconstructed at six historical time points, namely from the common ancestors of all extant eukaryotes, bilaterians, deuterostomes, protostomes, vertebrates (1R), and hexapods, respectively (figs. 1B and 2 and supplementary fig. 3, Supplementary Material online). In addition to the ancient ones, we expressed and characterized DLG PDZ3 and CRIPT variants from the following extant species representing different distantly related animal groups: The nonbilaterian animal Hy. vulgaris (a cnidarian), Octopus bimaculoides (California two-spot octopus, a mollusk, and protostome), Loa loa (eye worm-a nematode or roundworm causing the disease loiasis, protostome), Dendroctonus ponderosae (mountain pine beetle, an insect, protostome), Dr. melanogaster (fruit fly, insect, protostome), Saccoglossus kowalevskii (acorn worm, a hemichordate, deuterostome), and Ho. sapiens (representing chordates, deuterostome) (figs. 1A and B and 2 and supplementary fig. 3, Supplementary Material online). We use "native interaction" to denote interactions between treematched or species-matched PDZ3:CRIPT interactions. Based on sequence alignment and structure prediction, it appears that all ancestral and extant DLG PDZ3 domains have a similar structure with the usual PDZ fold consisting of six b strands and two a helices, but also including a third Cterminal a helix (a 3 ), present in the crystal structures of human DLG4 PDZ3 (Doyle et al. 1996) (supplementary fig. 4, Supplementary Material online). During revision, we received comments on our preprint that RAxML is not optimized for ancestral sequence reconstruction. We therefore repeated the reconstruction of both PDZ3 and CRIPT with two other software, RAxML-NG and PAML, respectively, and with two different trees (CommonTree and OpenTree). Overall, the results were consistent (supplementary data excel file 2, Supplementary Material online), but in particular one reconstructed PDZ3 variant (the deuterostome ancestor) differed from that predicted by RAxML. The new ML variant predicted by PAML was therefore expressed, purified, and subjected to the same experiments as the other PDZ3 variants described below. The PAML variant of deuterostome PDZ3 was found to exhibit close to identical properties as that predicted by RAxML MBE online). Among viridiplantae, plants and mosses show a striking and complete conservation of a type 1 motif (YKQSNV), whereas green algae have a polar residue at position P 0 . The majority of animal phyla including vertebrates, annelids, and mollusks (except Euthyneura, i.e., snails and slugs) have CRIPT with a classical type 1 PBM. However, fungi and arthropods (including insects, spiders, and crustaceans) appear to lack a PBM, although there are a few exceptions. Interestingly, CRIPT from the cnidarian Hy. vulgaris ends with a Cys (fig. 1B).
In accordance with the alignment and phylogeny, CRIPT from the last common ancestor of all extant Eukaryotes probably contained a classical type 1 PBM (YKQSSV) ( fig. 2 and  supplementary fig. 3, Supplementary Material online). A type 1 PBM was likely also present in the ancestor of all animals but has since mutated in distinct animal phyla and orders. Clearly, the ancestor of bilaterian animals contained a type 1 PBM and so did the ancestors of the two major bilaterian groups, protostomes and deuterostomes. However, the last FIG. 2. Evolution of DLG PDZ3 and the PDZ-binding motif in CRIPT. Simplified tree of organismal groups used for ancestral sequence reconstructions of CRIPT and PDZ3. The nodes in the phylogenetic tree are highlighted in the color code used throughout the article for extant and reconstructed ancestral CRIPT and PDZ3 domain variants. Evolution of CRIPT C-terminal-binding motif is highlighted by red (type I: T/S-X-U) to gray (undefined motif). Ancestral sequence reconstructions of CRIPT and PDZ3, respectively, were performed based on multiple sequence alignments and a species tree (CommonTree and OpenTree). In the reconstruction, 309 unique sequences of CRIPT from 498 different species and 249 unique sequences of PDZ3 from the DLG protein family from 324 different species were included. Divergent Evolution of a Protein-Protein Interaction . doi:10.1093/molbev/msaa198 MBE common ancestor of today's arthropods likely carried a mutated PBM in CRIPT, ending with Ala or Thr, instead of the canonical Val residue.
One intriguing notion is that the C-terminal residues P À1 to P À5 are highly conserved even among CRIPTs with Ala or Thr at position P 0 . This raises the question whether CRIPTs without a canonical type 1 PBM still retain binding to their native DLG PDZ3? To address this question, PDZ3 and CRIPT from seven extant and five ancestral species were subjected to binding experiments using ITC and stopped-flow spectroscopy (figs. 3 and 4).

MBE
The binding of Ho. sapiens CRIPT to Ho. sapiens DLG4 PDZ3 is well characterized (Doyle et al. 1996;Niethammer et al. 1998;Saro et al. 2007;Gianni et al. 2011;Toto et al. 2016), and we report a similar binding affinity (0.40 mM) as previous studies (figs. 3 and 5; tables 1 and 2). Furthermore, PDZ3 from Hy. vulgaris (with a C-terminal Cys residue), O. bimaculoides, S. kowalevskii, and the common ancestors of bilaterian animals, protostomes, deuterostomes, and vertebrates (1R), respectively, were all found to bind to their tree-matched native CRIPT with low mM affinity (figs. 3-5; tables 1 and 2; and supplementary fig. 6, Supplementary Material online). On the other hand, PDZ3 domains from Dr. melanogaster, De. ponderosae, and the ancestor of hexapods bind poorly to their respective native CRIPT, which lacks a type 1 PBM. Although the interactions were too weak for stopped-flow spectroscopy, ITC experiments provided a rough estimate of the affinities (K d !10-30 mM) (table 2; figs. 4 and 5). More surprisingly, PDZ3 from the nematode L. loa was also found to bind poorly to its native CRIPT even though it contains a type 1 PBM with a C-terminal Val ( fig. 3).
To directly compare each PDZ3 domain with regard to binding of CRIPT with a type 1 PBM, the affinity of CRIPT from the ancestor of all eukaryotes (YKQSSV) was measured with all PDZ3 domains included in the study. The ancestral Eukaryota CRIPT was found to bind to all resurrected and present-day PDZ3 domains, including insect PDZ3 domains, in the low mM range (figs. 3-5; table 1; and supplementary fig. 6, Supplementary Material online). However, the binding of Eukaryota CRIPT to the nematode L. loa PDZ3 was somewhat weaker than for other PDZ3 domains (17 mM), suggesting that L. loa PDZ3 has a binding groove with a slightly different character than PDZ3 domains from the other investigated extant and ancestral species. Thus, although nematodes, the      (red) is a single exponential from which the observed rate constant k obs was obtained. k off was determined in a separate experiment as explained in Materials and Methods section, and K d calculated as k off /k on . Experiments were performed at 25 C for ITC and at 10 C for stoppedflow. Laursen et al. . doi:10.1093/molbev/msaa198 MBE sister phylum of arthropods, has retained the type 1 PBM in CRIPT the high affinity of the PDZ3:CRIPT interaction is lost, suggesting that it may not be functional in either group of animals. Another option is that the PDZ3:CRIPT interaction is functional but interactions outside the canonical groove are essential for full affinity among Ecdysozoa as observed for some other PDZ:ligand complexes (Pan et al. 2011;Li et al. 2014;Zeng et al. 2016;Ye et al. 2018) or that the PDZ3:CRIPT interaction is posttranslationally regulated (Sundell et al. 2018). Obviously, lower affinity could have evolved as a response to a certain course of events leading to for example high local cellular concentrations of DLG and CRIPT. Finally, a trivial explanation is that the PDZ3 domains used in the experiments were not properly folded. Although it is challenging to test intracellular expression levels, we attempted to partially address the other issues.
Mutational Analysis to Resolve the Low CRIPT Affinity of PDZ3 from L. loa The native phylogenetic tree-matched affinity for each interaction is shown in color code text at each node and the affinity between each PDZ3 and Eukaryota CRIPT is shown in black text. For the native interaction in the deuterostome node, two CRIPT peptides were tested due to low posterior probability for Ser at position P À2 . K d values were determined by stopped-flow experiments at 10 C. N.D., not determined because of too low affinity.  8F-J, Supplementary Material online). Thermodynamic stability generally correlates with the thermal midpoint of unfolding, which could not be well determined due to precipitation. All PDZ3 domains in the study unfolded by an apparent two-state mechanism in equilibrium urea denaturation experiments. However, because of uncertainties in curve fitting, as described below, we were careful in the interpretation regarding how the thermodynamic stability has changed during evolution. All resurrected PDZ3 domains (without an extended a 3 helix) were found to be more stable in comparison to extant PDZ3 domains, except Ho. sapiens PDZ3. Generally, we observed a slight increase in the urea midpoint for the deuterostome lineage from bilaterian to Ho. sapiens DLG4 PDZ3. However, there is some ambiguity in the actual stability of the PDZ3 variants due to uncertainty in determination of the m D-N value, which reflects the change in solventaccessible hydrophobic surface area upon denaturation and which is directly related to the stability, that is, the difference in Gibbs free energy between the denatured and the native state: DG D-N =[Urea] 50% Â m D-N . It is fair to assume that all PDZ3 variants of similar sequence length also have a similar m D-N value. Therefore, we fitted the data with a shared m D-N value, which gives a more robust estimate of the differences in DG D-N between PDZ3 variants. In addition, some PDZ3 variants did not show complete unfolding at the highest urea concentration, resulting in a lower accuracy of the fitted parameters. Nevertheless, it is clear that the extant and ancient PDZ3 domains included in the study display unfolding free energies spanning from 2.6 to 7.6 kcal mol À1 (Hexapoda ancestor likely higher). This wide range suggests that for this particular PDZ domain there is no overall evolutionary trend regarding stability, in the time window from the most recent  MBE common ancestor of bilaterian animals ($550-600 My) until today. However, in specific lineages, stability has been maintained (human, 7.6 kcal mol À1 ), whereas in other lineages such as the one leading to Dr. melanogaster, stability has decreased (2.6 kcal mol À1 ).

The a 3 Extension Increases the Stability of PDZ3
Most studies of DLG4 PDZ3 have been performed on a construct based on the original crystal structure (Doyle et al. 1996). However, in DLG4, PDZ3 is part of the supramodule PDZ3-SH3-GK in which an extended a helix (a 3 ) connects PDZ3 and SH3 ( fig. 7A), but only 7 of the 19 amino acid residues of the linker are present in the constructs investigated in the present work and in previous publications. Recent studies have reported that the entire a 3 and a peptide ligand longer than six residues are required for specific PDZ:ligand interactions (Zeng et al. 2016(Zeng et al. , 2018Ye et al. 2018). The a 3 extension does not significantly affect the affinity between Ho. sapiens PDZ3-SH3-GK and a 6-mer or 15mer CRIPT peptide (Laursen et al. 2020). However, as compared with the ancestral bilaterian PDZ3, the a 3 region is highly conserved in Ho. sapiens PDZ3 with only two substitutions, whereas De. ponderosae, Dr. melanogaster, and L. loa PDZ3 have four, eight, and nine substitutions, respectively, in the 19 residue long primary structure of a 3 ( fig. 7A). We calculated the a helix propensity of isolated a 3 (residues 395-413) for all PDZ3 variants in the present study using the AGADIR software (Munoz and Serrano 1994). The helical propensity varied from 2% to 17% ( fig. 7A), suggesting that the stability of a 3 may have changed during evolution. (The helical propensity refers to an isolated peptide but it could correlate with stability in the context of the folded protein.) Interestingly, L. loa, Dr. melanogaster, and De. ponderosae PDZ3 have the highest helix propensity in a 3 . Therefore, we decided to express, purify, and analyze PDZ3 from these three species with a full-length 19 residue a 3 to assess the effect on binding and stability. We observed a slightly higher thermodynamic stability for the longer variants of PDZ3 in comparison to PDZ3 domains without an extended a 3 such that they approach the stability of human PDZ3 ( fig. 7B-D and table 3).
Like the PDZ3 variants without a 3 extension, the affinities between the extended a 3 PDZ3 variants from L. loa, Dr. melanogaster, and De. ponderosae and their respective native CRIPT ligands (15-mer) were too low to be measured by stopped-flow spectroscopy. But, with ITC, it was again possible to obtain a rough estimate of the K d values. The affinity between Dr. melanogaster PDZ3 and CRIPT was not affected by either the a 3 extension or a longer CRIPT peptide (table 2). Similarly, although the affinity seems to increase 2fold for the respective native De. ponderosae and L. loa PDZ3:CRIPT interactions, this effect is within error of the experiment (table 2). However, the a 3 extension increased the affinity of L. loa PDZ3 for the Eukaryota CRIPT significantly, by 10-fold (table 1), suggesting that the a 3 extension of PDZ3, perhaps even in the context of a supramodule, needs to be considered if a weak interaction is analyzed.

Negative Regulation by Phosphorylation
Reversible phosphorylation results in positive or negative regulation of protein-protein interactions and therefore plays an important role for cellular signaling. Protein phosphorylation mainly occurs at Ser, Thr, and Tyr residues. Thus, with regard to the PDZ3:CRIPT interaction, phosphorylation at P À2 (Thr) in CRIPT disables binding to PDZ3 (Sundell et al. 2018).    (Yokoyama and Radlwimmer 2001) or using Bayesian sampling to reconstruct a set of sequences by sampling amino acids over the posterior probability distribution (Yang and Rannala 1997). However, an easier approach is to reconstruct a "worst-case scenario" variant. Here, one sequence is reconstructed in which every amino acid residue with a posterior probability below a certain threshold, for example, 0.8, is replaced by the residue with the second-highest probability at that position. Such a variant, denoted AltAll (alternative residue at all positions), represents a very unlikely worst-case scenario that can be used to assess how robust conclusions are to errors in the reconstructed ML sequences (Anderson et al. 2016b;Eick et al. 2017) simply by comparing the ML and AltAll proteins. If they display the same affinity for a ligand, that affinity is very likely to be the true ancestral affinity ( fig. 1A and Laursen et al. . doi:10.1093/molbev/msaa198 MBE same experiments as the ML variants to validate the conclusions based on the resurrected ML proteins. The C-termini of ancestral CRIPT could be reconstructed with high confidence due to the strong conservation observed in extant species. Therefore, AltAll CRIPT variants were only tested in two cases, for the ancestor of hexapods and ancestor of deuterostomes, respectively, both of which had one ambiguous position ( fig. 1B and supplementary figs. 3 and 6, Supplementary Material online). AltAll variants of PDZ3 contained three to five substitutions for 1R, Deuterostomia, Protostomia, and Bilateria PDZ3, at positions outside the canonical-binding groove ( fig. 1A). The primary structure of AltAll for the ancestral Hexapoda PDZ3 contained 16 substitutions in comparison to the ML sequence. This is because PDZ3 at the Hexapoda node is resurrected from relatively few sequences and from species with a large number of amino acid substitutions. However, all substitutions were situated outside of the canonical-binding groove. Overall, the AltAll and ML PDZ3 domains showed a similar binding affinity for Eukaryota and native CRIPT (supplementary figs. 6 and tables 1 and 2, Supplementary Material online). Furthermore, secondary structure monitored by circular dichroism was similar for ML and AltAll variants, showing that the folding is robust (supplementary fig. 8A In conclusion, the AltAll and ML proteins displayed similar properties with regard to stability and binding, and therefore, other proteins in the ensemble of likely variants will most probably do that too, including the true ancestral PDZ3 variants (Eick et al. 2017).

Neuroligin with Type 1 PBM Is Common among Deuterostome but Not Protostome Animals
The low affinity of the PDZ3:CRIPT interaction in nematodes and insects inspired us to look at another proposed biological ligand of DLG4 PDZ3, namely neuroligin. Like CRIPT, neuroligin interacts with PDZ3 via its C-terminus at synaptic junctions (Irie et al. 1997;Jeong et al. 2019). To compare the PBMs of neuroligin to those of CRIPT, the NCBI protein sequence database was searched for homologs of Ho. sapiens (UniProt: Q8N2Q7) and C. elegans (UniProt: Q9XTG1) neuroligin-1, as described for CRIPT in Materials and Methods section. Ho. sapiens and C. elegans neuroligin sequences were used for the search, because they were both of high quality and represented neuroligin sequences from deuterostomes and protostomes, respectively. The search resulted in 1,725 sequences in 437 species-419 metazoan species and 18 fungal species. Among the neuroligin sequences, type 1 PBM, like the one of CRIPT, was found to be predominant in deuterostomes, whereas type 2 PBM (/-X-/-COOH) was predominant in protostomes, especially hexapods (supplementary fig.  9, Supplementary Material online). Thus, we observe an even more dynamic evolution of the PBM in neuroligin as compared with CRIPT. It appears likely that bilaterian neuroligins recognize different PDZ domains in different species, whereas most of the neuroligin sequences from fungi and cnidariansthe only species outside of bilaterians that have neuroligin homologs-lack a PBM. However, the lower number of highquality neuroligin sequences makes these results less conclusive than for CRIPT.

Discussion
We have used ancestral sequence reconstruction and resurrection to better understand the connection between structure, function, and evolution of the PDZ3:CRIPT interaction and its biological function. Already at the time of the last common ancestor of all extant metazoa, CRIPT contained a type I PBM with high affinity (2-3 mM) to most present-day DLG PDZ3 domains, and likely to many other PDZ domains with type I PBM specificity. This affinity has remained constant or slightly increased (0.4-3 mM) among diverging lineages until today. But, at the beginning of the evolutionary lineage leading to present-day arthropods ($550-600 My) a point mutation in the C-terminal residue of CRIPT resulted in substantially decreased affinity for DLG PDZ3, and again, likely all PDZ domains with type I specificity. What is the role of CRIPT and how would this mutation influence function? Large regions of CRIPT are highly conserved among extant species (supplementary fig. 10, Supplementary Material online) and mutation in CRIPT is very uncommon among humans, and associated with rare Mendelian disorders (Leduc et al. 2016). A recent study including in vivo experiments indicated an essential role for CRIPT in dendritic growth (Zhang et al. 2017). Experiments were performed in the nematode C. elegans, which, like Dr. melanogaster, has a Thr at the C-terminus of CRIPT. A severely mutated CRIPT (deletion of around 130 residues) resulted in abnormalities in dendritic growth, which could be rescued by expression of Ho. sapiens CRIPT. It was not possible to knock down CRIPT in mouse, suggesting a nonredundant and critical cell biological function in vertebrate species (Zhang et al. 2017). On the other hand, it has been possible to knock down one or several of the four DLG family members because these proteins can compensate for each other (Howard et al. 2010;Bonnet et al. 2013). In light of their experiments, and the lack of a type I PBM in the C-terminus of CRIPT from C. elegans and Dr. melanogaster, Zhang et al. (2017) suggested that the Divergent Evolution of a Protein-Protein Interaction . doi:10.1093/molbev/msaa198 MBE critical biological action of CRIPT could be partially independent of the PDZ3:CRIPT interaction in these animals. In the present article, we corroborate this idea by showing that PDZ3 domains from extant species of nematode and insect phyla bind their native CRIPT with significantly lower affinities than ancestral PDZ3:CRIPT pairs and those from extant vertebrates, even when CRIPT contains a large hydrophobic residue at the C-terminus, such as Leu in L. loa CRIPT. (There are also no obvious noncanonical internal PBMs in CRIPT.) The fact that PDZ3 domains from nematodes and insects have retained the affinity for type I PBM (K d ¼0.7-1.6 mM) reinforces the notion that the PDZ3:CRIPT interaction is nonfunctional in these species.
By necessity, many protein interaction studies are performed with domains cut out from highly modular, often large proteins, which are hard to work with in pure form. However, effects from regions outside the binding surface or domain are increasingly considered in protein-protein interaction studies. Indeed, extensions of PDZ domains are associated with several potential functions: 1) protein dynamicsbased modulation of target-binding affinity, 2) provision of binding sites for macromolecular assembly, 3) structural integration of multidomain modules, and 4) expansion of the ligand-binding pocket (Wang et al. 2010). We analyzed the a 3 extension in PDZ3 that possibly extends the ligand-binding surface beyond the canonical-binding groove. Interestingly, although this a 3 extension had only minor effects on the low-affinity native PDZ3:CRIPT interactions (L. loa, Dr. melanogaster, and De. ponderosae), it indeed had a significant effect on binding to the high-affinity ancestral eukaryote CRIPT peptide to L. loa PDZ3, demonstrating high sequence dependence in the interaction, upon addition of a 3 . The sequence dependence was also reflected in the thermodynamic stabilities of PDZ3 with and without a 3 , where only L. loa PDZ3 displayed a clear stabilization in presence of the helix. With this in mind, we note that PDZ3 stabilities vary across present-day animals and that ancestral versions of PDZ3 were likely a little more stable. It may be assumed that random mutations, neutral for function, generally lead to a decrease in stability until a functional threshold is reached. It is therefore tempting to speculate that the relatively high stability (>6 kcal mol À1 ) of several ancestral and modern DLG4 PDZ3 domains is the result of positive selection.
Relatively few studies have used evolutionary biochemistry to assess protein-protein interactions (Wouters et al. 2003;Anderson et al. 2016a;Wheeler et al. 2018) and in particular attempting to reconstruct both interacting proteins (Hultqvist et al. 2017;Jemth et al. 2018). Wheeler et al. (2018) found that the extant S100A5 and S100A6 proteins display distinct specificity profiles toward a panel of peptide ligands and that this specificity was established early on during their respective evolutionary trajectory. However, the ancestral S100A5/A6 was promiscuous with regard to the tested peptide ligands. We observed a similar scenario for the interaction between two protein domains, NCBD and CID, from the transcriptional coregulators CBP/p300 and NCOA, respectively (Hultqvist et al. 2017). An ancestral NCBD domain displayed low affinity toward ancient and modern CIDs (but it displayed present-day affinities for other protein ligands). However, at the time point where we could reconstruct contemporary NCBD and CID (440 My), the interaction had the same affinity as modern complexes. These findings corroborate the hypothesis that protein interactions evolve toward higher specificity (Wheeler et al. 2016), but it is clear that "functional affinity" likely evolves relatively fast, in particular when it involves short disordered interaction motifs of three to ten residues, like for PDZ3:CRIPT. The affinity is then maintained by positive selection as long as it provides a benefit, or is lost by purifying selection if it is disadvantageous.
Evolutionary change of affinity and specificity is expected when new interactions arise and mutational equilibrium is likely established on a relatively short evolutionary time scale. For PDZ3, we do not observe the initial optimization of the affinity since the interaction is likely older than our analysis. The most ancient contemporary PDZ3:CRIPT pair we could reconstruct (Bilateria, 550-600 My) displays a 1.3-1.6 mM affinity. The K d values of modern PDZ3:CRIPT pairs with a derived Thr at position À2 are slightly lower (0.4-0.5 mM) and might reflect adaptation toward increased specificity for Thr over Ser at position À2 ( fig. 5). Although any functional advantage of the increase in specificity from the bilaterian ancestor is speculative, it is clear that affinity for the CRIPT PBM was lost in several protostome lineages by substitution of a Val for Ala or Thr at position 0 in CRIPT in an ancestral arthropod. Or even more curiously, by adaptations in PDZ3 as observed in the nematode lineage for L. loa ( fig. 5 and  supplementary fig. 1, Supplementary Material online). Finally, an interesting notion from a structure-function point of view is that not only Hy. vulgaris (native interaction) but also human PDZ3 binds CRIPT with a C-terminal Cys residue equally well as CRIPT with the native Val. Clearly, affinity is not everything; A Cys residue can be easily modified and therefore might impose a disadvantage at this position and be subjected to purifying selection.

Phylogenetic Reconstruction
To collect protein sequences, the NCBI database (NCBI Resource Coordinators) was searched for CRIPT and DLG sequences using search terms and the BLAST tool (Altschul et al. 1990) with the Ho. sapiens DLG4 PDZ3 sequence. Sequences were then aligned using Clustal Omega software (Sievers et al. 2011) and filtered. Nonredundant sequences with information on species of origin, homologous to Ho. sapiens DLG4 PDZ3 or Ho. sapiens CRIPT, respectively, were kept. Sequences were filtered for redundancy, and, in the case of identical sequences among related species, they were combined into one sequence identifier. The final sets of sequences were aligned using Clustal Omega and manually curated. The resulting alignments were used to infer the ancestral sequences of DLG family PDZ3 domains and of CRIPT. For this sequence reconstruction, species trees were downloaded using the CommonTree tool of the Taxonomy Browser in the NCBI database. The taxonomy trees were modified before the ancestral reconstruction due to software Laursen et al. . doi:10.1093/molbev/msaa198 MBE requirements-the software could only read a tree that has exactly two branches per node, and the downloaded taxonomic trees sometimes had several branches per node, usually among species of the same genus. CommonTree does not provide branch lengths, but these are calculated by RAxML for the supplied tree during ancestral reconstruction. The PDZ3 tree was adjusted according to DLG family homology; due to evidence of genome duplication in early chordates, vertebrate sequences were split into DLG1, DLG2, DLG3, and DLG4 clades (McLysaght et al. 2002;Putnam et al. 2008). In the resulting CRIPT and PDZ3 trees, species names were replaced with sequence names to match the identifiers in the alignment. The RAxML software (Stamatakis 2014) was used to infer ancestral sequences for PDZ3 and CRIPT. Because of low posterior probabilities for some positions in the reconstructed ML sequences (supplementary data excel file 1, Supplementary Material online), we reconstructed AltAll sequences for each ML sequence. In the AltAll sequences, the amino acid residues with the second-highest probabilities were included for each position where the probability of the ML residue was <0.8.
Subsequently, we learned that RAxML version 8.0.0 implemented ancestral sequence reconstruction methods differently to other phylogeny software (Stamatakis A, Kozlov A, personal communication). Therefore, the ancestral sequence reconstruction was repeated using the newer RAxML-NG (Kozlov et al. 2019) and also with PAML (Yang 2007) software. Due to possible phylogenetic inaccuracies in the taxonomic tree, these programs were also tested with a second tree from OpenTree tree of life (Redelings and Holder 2017).
Overall, CRIPT ancestors reconstructed by RAxML-NG and PAML and using both trees were consistent with the RAxMLinferred ancestors, with some minor deviations (supplementary data excel file 2, Supplementary Material online). There was a variation between K/R in P À4 and S/T in P À2 , and these conservative substitutions are unlikely to modulate binding, so the RAxML-inferred sequences were deemed valid for the analysis. The same analysis was carried out with PDZ sequences. For most PDZ sequences, amino acids in all structural and functional positions matched the originally used sequences. However, the ancestral deuterosomal PDZ3 sequence resurrected using PAML and RAxML-NG differed from the RAxML reconstruction in potentially crucial residues. This sequence was therefore expressed, purified, and subjected to experiments, where we demonstrated that binding and stability of the PAML-and RAxML-predicted variants were similar (supplementary fig. 5, Supplementary Material online).
For sequence similarity calculations, we used a matrix developed by Stephenson and Freeland (2013).

Protein Expression and Purification
cDNA encoding DLG PDZ3 from the following extant species were cloned into a modified pRSET vector (Invitrogen) and transformed into Escherichia coli BL21(DE3) pLys cells (Invitrogen) for expression: Ho. sapiens (DLG4), S. kowalevskii, Dr. melanogaster, De. ponderosae, L. loa, O. bimaculoides, and Hy. vulgaris. Similarly, reconstructed cDNA encoding the following ancestral DLG PDZ3 (both ML and AltAll variants) were expressed: 1R, Deuterostomia, Bilateria, Hexapoda, and Protostomia. The expression construct contained an N-terminal 5xHis-tag. Cells were grown in LB medium at 37 C and overexpression of protein was induced with 1 mM isopropyl-b-D-thiogalactopyranoside overnight at 18 C. Cells were harvested by centrifugation at 4 C and the pellet resuspended in 50 mM Tris, pH 7.8, 10% glycerol, and stored at 20 C. Pellets were thawed and sonicated 2Â4 min followed by centrifugation. The supernatant was filtered and added to a pre-equilibrated (50 mM Tris, pH 7.8, 10% glycerol) Nickel Sepharose Fast Flow column (GE Healthcare). Proteins were eluted with 250 mM imidazole and dialyzed into 50 mM Tris, pH 7.8, 10% glycerol buffer overnight. Proteins were then loaded onto a Q sepharose column for further purification and eluted with a 500 mM NaCl gradient in 50 mM Tris, pH 7.8, 10% glycerol. Protein purity and identity were analyzed by SDS-PAGE and MALDI-TOF mass spectrometry.

Circular Dichroism and Urea Denaturation Experiments
Experiments were performed in 50 mM sodium phosphate, pH 7.45, 21 mM (I ¼ 150) at 10 C on a JASCO J-1500 spectrapolarimeter. Far-UV spectra for all PDZ3 domains were recorded from 260 to 200 nm with 10 mM protein (the average of 16 scans). Thermodynamic stability of PDZ3 domains was determined by urea denaturation (0-8.1 M with 0.3 M steps) and monitored by molar ellipticity at 222 nm with 22 mM PDZ3. Data were fitted to a two-state unfolding process using both free fitting of each parameter or a shared m D-N to obtain the midpoint (denoted [Urea] 50% , where 50% of the protein is folded).
where F is the observed spectroscopic signal; a N is the signal of the native state in 0 M urea; b N is the slope of the native baseline; and a D and b D are the corresponding parameters for the denatured state (Clarke et al. 1993).

Kinetic Experiments
All kinetic experiments were performed in 50 mM sodium phosphate, pH 7.45, 21 mM KCl (I ¼ 150) at 10 C. Binding and dissociation experiments were performed at SX-17 MV stopped-flow spectrometer (Applied Photophysics) and carried out as described previously (Chi et al. 2010). Briefly, the association rate constant (k on ) was obtained from binding experiments. PDZ3 (1 mM) was mixed rapidly with dansyl labeled (D) CRIPT peptide (concentration range: 2-20 mM). Excitation was done at 345 nm and emission was recorded >420 nm using a long-pass filter. The change in fluorescence signal obtained from D-CRIPT was monitored over time and fitted to a single exponential function to deduce the observed rate constant (k obs ). Observed rate constants were plotted against the corresponding D-CRIPT concentration. Data were fitted to an equation for a reversible bimolecular association Divergent Evolution of a Protein-Protein Interaction . doi:10.1093/molbev/msaa198 MBE reaction (Malatesta 2005) to extract the association rate constant without considering pseudo first-order conditions. k obs ¼fk 2 on ð½PDZ 0 À CRIPT ½ 0 Þ 2 þ k 2 off þ 2k on k off ð½PDZ 0 þ CRIPT ½ 0 þ ½PDZ 0 Þg 0:5 The dissociation rate constant (k off ) was obtained from a separate displacement experiment. An equilibrium complex was preformed between PDZ3 and D-CRIPT (2:10 mM) and mixed rapidly with high excess of unlabeled CRIPT (100, 150, and 200 mM, respectively). The change in fluorescence signal obtained from dansyl-labeled CRIPT was monitored over time and fitted to a single exponential function to deduce the observed rate constant (k obs ). The average of the three k obs values is reported as the dissociation rate constant k off .

Isothermal Titration Calorimetry Experiments
All isothermal titration calorimetry (ITC)-binding experiments were performed in 50 mM sodium phosphate pH 7.45, 21 mM KCl (I ¼ 150) at 25 C on an iTC200 microcalorimeter (Malvern Instruments). PDZ3 and CRIPT peptide were dialyzed against the same buffer to minimize artifacts from buffer mismatch in the titrations. CRIPT peptide was titrated 16 times into the cell containing PDZ3. For every titration, the heat release decreased from the binding of CRIPT to PDZ3, due to complex formation and PDZ3 saturation. Change in heat over time was integrated to kcal mol À1 and fitted to a one-site binding model to obtain binding stoichiometry (n), equilibrium binding constant (K d ), and the enthalpy change upon binding (DH).

Data Availability
The authors declare that data supporting the findings of this study are available within the article and its supplementary information file, Supplementary Material online, including probability tables for reconstruction of CRIPT and PDZ3 sequences. Additional raw data are available from corresponding authors upon request.

Supplementary Material
Supplementary data are available at Molecular Biology and Evolution online.