Identification of novel human WW domain-containing proteins by cloning of ligand targets.

A recently described protein module consisting of 35-40 semiconserved residues, termed the WW domain, has been identified in a number of diverse proteins including dystrophin and Yes-associated protein (YAP). Two putative ligands of YAP, termed WBP-1 and WBP-2, have been found previously to contain several short peptide regions consisting of PPPPY residues (PY motif) that mediate binding to the WW domain of YAP. Although the function(s) of the WW domain remain to be elucidated, these observations strongly support a role for the WW domain in protein-protein interactions. Here we report the isolation of three novel human cDNAs encoding a total of nine WW domains, using a newly developed approach termed COLT (cloning of ligand targets), in which the rapid cloning of modular protein domains is accomplished by screening cDNA expression libraries with specific peptide ligands. Two of the new genes identified appear to be members of a family of proteins, including Rsp5 and Nedd-4, which have ubiquitin-protein ligase activity. In addition, we demonstrate that peptides corresponding to PY and PY-like motifs present in several known signaling or regulatory proteins, including RasGAP, AP-2, p53BP-2 (p53-binding protein-2), interleukin-6 receptor-alpha, chloride channel CLCN5, and epithelial sodium channel ENaC, can selectively bind to certain of these novel WW domains.

The recognition and elucidation in recent years of various modular protein domains, along with their specific peptide ligands, have spurred remarkable progress in our understanding of their role in signal transduction and other fundamental cellular processes (1,2). Analysis of the SH (Src homology) domains, SH2 and SH3, present in a wide variety of proteins involved in cellular signaling and transformation has been particularly fruitful. The ligand specificity of many different SH2 and SH3 domains has been defined using combinatorial peptide libraries. SH2 domains bind with high affinity to phosphotyrosine residues within a specific sequence context (3). In contrast, SH3 domains bind to proline-rich peptides that share a conserved PXXP motif (4,5).
A newly described protein module, termed the WW domain, has been reported (6-8). The WW domain consists of 35-40 amino acids and is characterized by four well conserved aromatic residues, two of which are tryptophan. The secondary structure of the WW domain has recently been determined and consists of a slightly bent three-stranded antiparallel β-sheet (9). This domain has been reported in a wide variety of proteins of yeast, nematode, and vertebrate origin, including Rsp5, Yes-associated protein (YAP),¹ human and murine Nedd-4, FE65, Pin1, and a human RasGAP-related protein (10-14). Although the precise physiological role of the WW domain remains undetermined, its presence in diverse proteins involved in signaling, regulatory, and cytoskeletal functions, as well as its rapidly emerging role in signaling mechanisms that underlie several human diseases, clearly underscores its importance (15,16). Two ligand proteins for the YAP WW domain, WBP-1 and WBP-2 (WW domain-binding protein), have recently been cloned and found to contain a run of prolines followed by a tyrosine residue (PY motif) that mediates specific binding to the YAP WW domain (17). It is likely that WW domains have distinct ligand specificities, as the PY motifs of WBP-1 and WBP-2 did not bind to the dystrophin WW domain. Furthermore, the PY motif appears to be distinct from the PXXP ligand consensus sequence of SH3 domains. These observations suggest a direct role for the WW domain in mediating specific and distinct protein-protein interactions.
Recently, we described a method termed COLT (cloning of ligand targets), which enables the rapid cloning of ligand-binding modular protein domains using peptide ligand sequences as probes for screening cDNA expression libraries (18). While COLT has been used to clone several novel SH3 domain-containing genes using proline-rich SH3 domain ligand peptides as probes, in this report, we describe its use in identifying three novel human WW domain-containing genes. In examining the ligand specificity of the individual WW domains, we demonstrate that they bind distinct PY motif peptide ligands with differential specificity and relative affinity. In addition, we demonstrate that peptides containing PY and PY-like motifs present in a variety of signaling or regulatory proteins can selectively bind to these novel WW domains.

EXPERIMENTAL PROCEDURES
Isolation and Characterization of WW Domain-encoding cDNA Clones by COLT-All peptides used were synthesized with an N-terminal biotin-SGSG linker and purified by high pressure liquid chromatography, and their structures were confirmed by mass spectroscopy and amino acid analysis. Multivalent peptide-streptavidin/alkaline phosphatase complexes were assembled as described (18) with the exception of the phosphotyrosine-containing peptide (pWBP-1), for which a streptavidin-horseradish peroxidase conjugate was used (Sigma). Human bone marrow and brain cDNA libraries (CLONTECH, Palo Alto, CA) were plated at a density of ~1 × 10⁵ plaque-forming units/10-cm plate, and positive plaques were detected and purified as described previously (15). Bound streptavidin-horseradish peroxidase conjugate was detected with the IBI Enzygraphic™ Web (Kodak Scientific Imaging Systems) as described by the manufacturer. DNA was sequenced on both strands using ABI PRISM™ dye terminator cycle chemistry (Perkin-Elmer) on an ABI 373A automated DNA sequencer.
* This work was supported by a grant from Cytogen Corp. (Princeton, NJ). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

RESULTS AND DISCUSSION
Isolation of Novel WW Domain-containing Proteins-In an effort to identify novel WW domain-containing proteins, four putative WW domain peptide ligand sequences including WBP-1 (PGTPPPPYTVGPGY), WBP-2A (YVQPPPPPYPGPM), WBP-2B (PGTPYPPPPEFY), and a PY peptide segment of RasGAP (GGGFPPLPPPPYLPPLG) were used as a mixed probe to screen human brain and bone marrow cDNA expression libraries by COLT. Thirteen positive clones were identified, including sibling and overlapping clones originating from the same mRNA. These clones were sequenced and found to encode three distinct proteins, WWP1, WWP2, and WWP3 (WW domain-containing protein), containing a total of nine novel WW domains (Fig. 1). Data base homology searches revealed that both WWP1 and WWP2 contain four tandem WW domains and share a similar modular domain architecture with Nedd-4 and Rsp5 (10). The smaller partial clone, WWP3, was isolated from the brain cDNA library and contains a single WW domain.
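The four probe sequences listed above can be checked for the presence of a PY motif with a short script. The following is a minimal sketch, not part of the original study: it assumes that a "PY motif" can be approximated as a run of two or more prolines immediately followed by a tyrosine, and the names `PROBES`, `PY_MOTIF`, and `find_py_motifs` are hypothetical.

```python
import re

# Probe peptides exactly as listed in the text above.
PROBES = {
    "WBP-1": "PGTPPPPYTVGPGY",
    "WBP-2A": "YVQPPPPPYPGPM",
    "WBP-2B": "PGTPYPPPPEFY",
    "RasGAP": "GGGFPPLPPPPYLPPLG",
}

# Assumption: a PY motif is a run of >= 2 prolines directly followed by Tyr.
# The text only says "a run of prolines followed by a tyrosine residue".
PY_MOTIF = re.compile(r"P{2,}Y")


def find_py_motifs(sequence):
    """Return (0-based start, matched text) for each proline run ending in Tyr."""
    return [(m.start(), m.group()) for m in PY_MOTIF.finditer(sequence)]


if __name__ == "__main__":
    for name, seq in PROBES.items():
        hits = find_py_motifs(seq)
        print(name, hits if hits else "no PY motif under this definition")
```

Under this working definition, WBP-1, WBP-2A, and the RasGAP segment each contain one match, whereas WBP-2B, in which the tyrosine precedes the proline run, contains none.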
In addition to the high overall amino acid homology (55%), alignment of the nine novel WW domain sequences with several previously identified domains reveals two significant blocks of homology flanking the core of the domain (Fig. 2). These blocks include N-terminal tryptophan and C-terminal proline residues that are absolutely conserved in all WW domains identified to date. The WW domains of WWP1, WWP2, WWP3, Nedd-4, Rsp5, and YAP are more similar to each other than to WW domains found in other proteins (21). Furthermore, contrary to what one would expect from a recent evolutionary duplication event of the WW domain within a gene, individual WW domains from WWP1 and WWP2 show a greater similarity to the corresponding WW domains in either Nedd-4 or Rsp5 than to each other. The above observations suggest that these WW domains are perhaps functionally similar and that multiple WW domains within the same protein are not redundant but may have evolved to perform divergent specialized roles. Interestingly, we did not isolate a human YAP cDNA from these screens. However, murine YAP was detected in a screen of a mouse embryonic cDNA library with the WBP-2A peptide as a COLT probe.²
In addition to the WW domains, primary structure analysis of the clones revealed several other interesting features. Complete and partial C-terminal HECT (homologous to the E6-associated protein carboxyl terminus) domains of ~300 amino acids (Fig. 3) are contained within clones WWP2 and WWP1, respectively. This domain has been shown in vitro to have E3 ubiquitin-protein ligase activity in several proteins including rat p100, yeast Rsp5, and human papilloma virus E6-AP (22,23). Encoded within the last 40 amino acids of the HECT domain is a conserved cysteine residue that is the likely site for ubiquitin thioester formation. The presence of a HECT domain is noteworthy since structurally and functionally related E3 ubiquitin-protein ligases are thought to serve a major role in defining the substrate specificity of the ubiquitin degradation system (24). In fact, Rsp5 was recently shown to be involved in the induced degradation of several nitrogen permeases in yeast (25). WWP2 also encodes an N-terminal C2-like domain characteristic of a large family of proteins including protein kinase C (26) and synaptotagmins (27). The C2 domain has been shown to bind membrane phospholipids in a calcium-dependent manner and is thought to function in the intracellular compartmentalization of proteins (28). Although the different modular domains present within WWP1 and WWP2 are highly homologous to those found in Nedd-4 and Rsp5, there is no significant homology among these proteins in regions flanking these domains. Also of interest is the presence in clone WWP3 of a partial N-terminal guanylate kinase-like domain that is involved in GMP binding in several membrane-associated proteins including the human erythrocyte membrane protein p55 (29) and rat postsynaptic density protein PSD-95 (30).
² R. Gao and B. K. Kay, unpublished data.
WW Domain-Peptide Ligand Selectivity-To examine the peptide ligand binding preferences of all nine individual novel WW domains, an enzyme-linked immunosorbent assay-based cross-affinity map experiment was performed with each domain expressed as a GST fusion protein and WBP-derived peptides (Fig. 4). Peptides WBP-1, WBP-2A, and WBP-2C bound to several individual WW domains to varying degrees. The WBP-2B peptide with an N-terminal tyrosine residue relative to the run of prolines had no binding activity, indicating the necessity for a C-terminal tyrosine in the PY motif. The relative importance of individual proline residues within the PY motif for binding to various WW domains was assessed by alanine substitution for both the WBP-1 and WBP-2A peptides. All of the variant WBP-1 peptides, with the exception of the third proline substitution (WBP-1-Pro3), retained binding activity to WW domains present in clones WWP1 and WWP2, suggesting a critical role for the third proline residue. Interestingly, substitution of the second proline residue (WBP-1-Pro2) did not abolish binding to WW domains WWP1.1 and WWP2.3. This was unanticipated in light of the results observed for binding of the WBP-1 protein PY motif to the YAP WW domain, in which both the second and third proline residues are crucial for binding (9,17). This difference suggests that WW domains WWP1.1 and WWP2.3 possess a more promiscuous binding specificity than does the YAP WW domain. Similarly, proline substitution of the WBP-2A peptide indicates that the third proline residue (WBP-2A-Pro3) is absolutely essential for WW domain binding, whereas substitution of the second proline (WBP-2A-Pro2) is not.
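The alanine-substitution series described above can be enumerated programmatically. The sketch below is illustrative only: the function name `alanine_scan` is hypothetical, and the labels Pro1 to Pro4 assume that prolines are numbered from the N-terminal end of the proline run that directly precedes the tyrosine, which may not match the numbering used for the synthesized peptides.

```python
def alanine_scan(peptide, run_start, run_length):
    """Replace each proline in peptide[run_start:run_start + run_length] with Ala.

    Returns a dict mapping a label ("Pro1", "Pro2", ...) to the variant sequence.
    Labels count prolines from the N-terminal end of the window (an assumption).
    """
    variants = {}
    proline_number = 0
    for i in range(run_start, run_start + run_length):
        if peptide[i] == "P":
            proline_number += 1
            # Substitute a single residue; the rest of the peptide is unchanged.
            variants[f"Pro{proline_number}"] = peptide[:i] + "A" + peptide[i + 1:]
    return variants


# WBP-1 probe sequence from the text; the PPPP run occupies 0-based positions 3-6.
WBP1 = "PGTPPPPYTVGPGY"
WBP1_VARIANTS = alanine_scan(WBP1, run_start=3, run_length=4)

if __name__ == "__main__":
    for label, sequence in WBP1_VARIANTS.items():
        print(f"WBP-1-{label}: {sequence}")
```

Each variant differs from the parent peptide at exactly one position, so any loss of binding in the assay can be attributed to the single substituted proline.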
The specificity of individual WW domains for PY motif sequences was demonstrated by the ability to discriminate between peptides containing SH3 domain PXXP ligand consensus sequences (Src and Crk) as well as generally proline-rich control peptides derived from several proteins including acetylcholine receptor M4 and c-Abl. In addition, none of the PY motif peptides bound to either full-length Fyn or Lyn (which contain both SH3 and SH2 domains) in filter binding assays. Taken together, these results suggest that the PY motif represents a distinct binding sequence for WW domains.
The presence of a critical tyrosine residue in the PY motif raised the question of whether tyrosine phosphorylation could modulate WW domain binding. Although it is not known whether PY motifs are phosphorylated in vivo, the presence of a phosphotyrosine residue in the pWBP-1 peptide abolishes WW domain binding. Moreover, binding of the pWBP-1 peptide could be restored by removal of the phosphate moiety with prior treatment of either the free peptide or peptide bound to a streptavidin-horseradish peroxidase conjugate with alkaline phosphatase (data not shown). These results suggest a potential regulatory role for tyrosine phosphorylation in modulating WW domain-ligand interactions.
Potential WW Domain-PY Motif Interactions-Data base searches revealed that PY and PY-like motif sequences are found in a wide variety of regulatory proteins. Included among these proteins are the GTPase-activating protein RasGAP, the AP-2 transcription factor, p53BP-2, renal chloride channel CLCN5, the dystrophin-interacting molecule β-dystroglycan, the interleukin-2 receptor and interleukin-6 receptor-α, and the retroviral Gag proteins from human T-cell lymphotropic virus type 1 and Rous sarcoma virus type 1 (30-36). We tested the ability of peptides containing these motifs to bind to the novel WW domains (Fig. 5). Interestingly, although all of these peptides displayed an ability to bind WW domains in general, differences in specificity and relative binding were evident. For example, of all the peptides tested, only the CLCN5 peptide showed appreciable binding to the WWP1.4 and WWP2.4 domains. The observation that PY motif-containing peptides from several other proteins did not bind to any WW domain tested indicates that these interactions are specific and potentially biologically relevant (data not shown).
Of particular note is the demonstration that the human T-cell lymphotropic virus type 1 and Rous sarcoma virus type 1 peptides derived from the Gag protein proline-rich "L domain" bound to several WW domains. L domain regions are highly conserved in retroviruses and have been shown to function in a positionally independent manner essential for retroviral budding (37). Our results, coupled with a recent report demonstrating the interaction of the YAP WW domain with the L domain of Rous sarcoma virus (38), suggest a direct role for a WW domain(s)-Gag protein interaction in this process. The interaction of a β-dystroglycan peptide with several WW domains is also of interest. β-Dystroglycan, which contains a C-terminal PY motif, was previously shown to interact with the single WW domain present in dystrophin (15). Our results suggest that perhaps several different WW domain-containing proteins can interact with the β-dystroglycan C-terminal PY motif. Recently, a 12-amino acid proline-rich region of Formin, a protein encoded by the mouse limb deformity locus (39), was shown to bind to both SH3 domain- and several novel WW domain-containing proteins (40). Significantly, a peptide encompassing the same proline-rich region of Formin did not bind to any of our novel WW domains. Since this peptide does not contain a PY motif, this suggests that our WW domains, unlike those present in the Formin-binding proteins, require a PY or PY-like motif for binding.
Taken together, the above observations suggest that interactions between these proteins and WW domain-containing proteins may play a role in the former's regulation in vivo. For example, given the likelihood that WWP1 and WWP2 function as E3 ubiquitin-protein ligases, one could invoke a simple model whereby initial substrate-specific recognition occurs via WW domain-substrate protein interaction followed by ubiquitin transfer and subsequent proteolysis. On a general level, the identification of WW domain PY motif ligand sequences in various candidate proteins can lead to testable predictions of specific WW domain-mediated interactions in vivo.
WW Domain-Epithelial Na⁺ Channel Interactions-The demonstration that peptides containing PY-like motifs derived from the cytoplasmic domains of both the wild-type β- and γ-subunits of the epithelial Na⁺ channel (ENaCβ-WT and ENaCγ-WT) bind to several WW domains is of particular interest (Fig. 6). Recently, a number of mutations in both ENaCβ and ENaCγ have been demonstrated in patients with an autosomal dominant form of hypertension characterized by elevated renal Na⁺ reabsorption, termed Liddle syndrome (41). Specifically, several nonsense mutations leading to the truncation of the cytoplasmic domain of both subunits, in addition to two missense mutations (P616L and Y618H) contained within a conserved proline-rich segment of the cytoplasmic domain of the β-subunit, have been identified in patients (42-44). Moreover, expressed proteins containing these mutations resulted in a 3-8-fold increase in ENaC activity that was directly related to an increase in the total number of active channels. These results suggest the hypothesis that specific cytoplasmic regions of the β- and γ-subunits are involved in the normal negative regulation of channel activity via interactions with a modulatory protein(s). In fact, Nedd-4 was recently identified as a binding partner with the C terminus of rat ENaCβ using the yeast two-hybrid system (45,46). In addition, we have recently isolated WWP1 in COLT screens using ENaCβ and ENaCγ peptides (data not shown).
Our observation that mutant peptides (ENaCβ-P616L and ENaCβ-Y618H) containing missense substitutions found in Liddle syndrome patients do not bind to the WW domains in clones WWP1 and WWP2 is consistent with the above hypothesis. This result also confirms the observation that the third proline residue and the tyrosine within the PY motif are critical for binding to the WW domain. Other substitutions of the β-subunit PY motif and flanking sequences were also shown to diminish binding to specific WW domains. Thus, substitution of the second proline residue of the core PY motif completely abrogated WW domain binding. In addition, mutation of specific residues flanking the C terminus of the PY motif also led to diminished WW domain binding. These results directly correlate with the activity of various ENaCβ mutants measured by a functional assay in Xenopus oocytes (47). A PY motif-containing peptide from the cytoplasmic domain of the wild-type α-subunit of ENaC (ENaCα-WT) was also shown to bind to several WW domains, suggesting that this subunit may also be regulated by a WW domain-mediated interaction(s). Taken together, the above observations suggest a mechanism whereby a WW domain-mediated interaction(s) of a Nedd-4 family member(s) leads to the eventual ubiquitin-mediated degradation and negative regulation of the Na⁺ channel.
We have used the COLT approach to identify two new members (WWP1 and WWP2) of a family of human Nedd-4-like proteins associated with ubiquitin-protein ligase activity. Moreover, we demonstrate that individual WW domains of these proteins can clearly bind distinct PY motif peptide ligands with differential specificity and relative affinity. In addition to demonstrating the expanded utility of COLT methodology, it appears likely that additional proteins containing the modular WW domain remain to be found. In fact, we have recently isolated a third novel Nedd-4-like family member, containing three WW domains, from a human prostate cDNA library.³ Our results, coupled with knowledge of the NMR structure of the WW domain-ligand complex, provide a framework with which to examine the peptide ligand specificity and structure/function activity of individual WW domains. In this regard, optimal ligand preferences of the WW domains are currently being deduced by screening combinatorial phage display peptide libraries.⁴
Given the small size and high degree of sequence conservation of the WW domain, it is extraordinary that exquisite ligand selectivity is observed. The NMR structure of the human YAP WW domain and its peptide ligand reveals that the hydrophobic residues Leu-190, His-192, and Trp-199 (see Fig. 2) form a binding site in contact with the ligand (9). In light of these data, it is interesting to note that domains WWP1.4 and WWP2.4, which contain a C-terminal phenylalanine residue instead of a tryptophan, display a more restrictive ligand binding preference. In addition, the presence of a valine or isoleucine residue instead of Leu-190 may also play a role in determining the distinct ligand specificity of the novel WW domains. The presence of multiple WW domains with distinct ligand specificities in WWP1 and WWP2 suggests that these proteins may bind to a broad range of cellular targets. Alternatively, multiple WW domains may confer additive binding affinity to target molecules that contain multiple PY motif ligands.
Finally, the interaction of peptides containing PY and PY-like motifs from several proteins with the WW domains in WWP1 and WWP2 directly suggests a role for the ubiquitin-mediated degradation of these proteins. In this respect, it is noteworthy that several cell membrane proteins, including the platelet-derived growth factor receptor (48) and yeast α-factor receptor Ste2p (49), are subject to ubiquitination and eventual degradation upon ligand binding. This is particularly relevant in light of the observed interaction of PY motif-containing peptides from ENaC subunits with specific WW domains and may lead to an understanding of the molecular pathology of Liddle syndrome.