A Human Ubiquitin Conjugating Enzyme (E2)-HECT E3 Ligase Structure-function Screen*

Here we describe a systematic structure-function analysis of the human ubiquitin (Ub) E2 conjugating proteins, consisting of the determination of 15 new high-resolution three-dimensional structures of E2 catalytic domains, and autoubiquitylation assays for 26 Ub-loading E2s screened against a panel of nine different HECT (homologous to E6-AP carboxyl terminus) E3 ligase domains. Integration of our structural and biochemical data revealed several E2 surface properties associated with Ub chain building activity: (1) net positive or neutral E2 charge, (2) an “acidic trough” located near the catalytic Cys, surrounded by an extensive basic region, and (3) similarity to the previously described HECT binding signature in UBE2L3 (UbcH7). Mass spectrometry was used to characterize the autoubiquitylation products of a number of functional E2-HECT pairs, and demonstrated that HECT domains from different subfamilies catalyze the formation of very different types of Ub chains, largely independent of the E2 in the reaction. Our data set represents the first comprehensive analysis of E2-HECT E3 interactions, and thus provides a framework for better understanding the molecular mechanisms of ubiquitylation.

Molecular & Cellular Proteomics 11: 10.1074/mcp.O111.013706, 329-341, 2012.
Ubiquitin (Ub) is a conserved polypeptide that is covalently and reversibly conjugated to other proteins, altering their function in a variety of ways. Ub conjugation (ubiquitylation) is a highly regulated process, consisting of a sequential series of E1-E2-E3 activation, conjugation, and ligation reactions. An E1 enzyme must first activate a mature Ub polypeptide via the formation of a high-energy thiol-ester bond with the Ub carboxyl-terminal Gly residue. The activated Ub polypeptide is then transferred to a Cys residue of an E2 conjugating protein. Finally, via an E3 ligase, the Ub polypeptide is covalently conjugated to a target protein (reviewed in 1-5). Monoubiquitylation (i.e. the conjugation of a single Ub molecule to a target protein) has been implicated in a number of biological processes including transcriptional control, endocytosis, plasma membrane receptor recycling, and DNA damage signaling (6,7). However, Ub itself contains seven lysine residues, all of which can be ubiquitylated to form polyubiquitin oligomers, or Ub "chains" (2-5, 8). Ub chains of varying lengths and linkage types can confer very different biological outcomes to a targeted protein substrate. For example, the best-characterized function of Ub chains (in this case, consisting of at least four Ub polypeptides linked via K48) is the targeting of a protein substrate for 26S proteasome-dependent degradation (9). By contrast, K63-linked Ub chains play roles in the DNA damage response, epsin-mediated endocytosis and aggresome formation (10,11).
Although only two human Ub E1 activating proteins have been identified (12), 40 E2s (including both active E2 proteins and inactive E2 variants) are encoded in the human genome (Table I). All E2 proteins share a conserved "core" ubiquitin conjugating (UBC) domain of ~150 amino acid residues, and many E2s possess additional N- and/or C-terminal protein sequences that can govern intracellular localization, confer regulatory properties, or provide specificity for interactions with particular E3 ligases (13,14).
Ub E3 ligases facilitate the transfer of Ub from an activated E2 to a substrate protein or another Ub molecule. Members of the RING (really interesting new gene) type E3 family recruit activated E2~Ub complexes to substrates, resulting in direct Ub transfer from the E2 to the target (2), whereas members of the HECT (homologous to E6-AP carboxyl terminus) domain E3 family form a thiol-ester linkage with Ub prior to its transfer to a target protein (15,16). The human genome encodes hundreds of RING type E3s, but only 28 HECT domain-containing E3 ligases (16). E3s clearly demonstrate specificity for subsets of E2 proteins, and different E2-E3 combinations can generate different types of Ub chains (e.g. 17-20). However, the molecular determinants involved in E2-HECT E3 interaction specificity are not well defined, the physical properties associated with processivity (i.e. the synthesis of long versus short Ub chains) are not well understood, and the molecular mechanisms involved in the specification of different types of Ub linkages remain cryptic.
Because of the modest affinity of most E2-E3 interactions, techniques such as co-immunoprecipitation have been largely unsuccessful in the characterization of functional E2-E3 pairs. Although yeast two hybrid screening has identified a number of putative E2-E3 functional interactions (e.g. 21,22), this methodology does not provide information concerning processivity or Ub chain linkage types generated by each pair. In vitro E2-E3 ubiquitylation reactions performed with purified proteins must be used to obtain this type of information.
Functional interactions between a number of RING domain E3s and a smaller number of E2 proteins have been investigated previously (e.g. 17-20), but no large-scale study has focused on the HECT E3 ligases. Here, we present the first comprehensive human E2-HECT E3 structure-function analysis.

Plasmids-Human E2 open reading frames were amplified by PCR from various templates (see NM numbers in Table I), and inserted using the Infusion system (BD Biosciences) into a pET28 vector with a 6xHis-tag and a thrombin or TEV-cleavage site located upstream of the cDNA insert. Human HECT domain proteins (Table I) and Ub were similarly cloned. The library of E. coli E2 expression clones will be made available through Addgene and/or other nonprofit sources.
Protein Purification, Size-Exclusion Chromatography and Crystallization-Proteins were expressed in E. coli BL21 (DE3) Gold (Stratagene, La Jolla, CA) grown in TB medium in the presence of 50 µg/ml kanamycin at 37°C to an OD600 of 4-5, induced with 2 mM isopropyl-1-thio-D-galactopyranoside and further grown for 16-18 h at 15°C. Recombinant E2 and HECT domain proteins were purified using standard metal-affinity chromatography with TALON resin (Stratagene), according to the manufacturer's instructions. All proteins for the Ub assays were dialyzed against 20 mM Tris-Cl pH 8.0, 150 mM NaCl, 10% glycerol, 2 mM dithiothreitol and stored at −80°C. For crystallization of the E2 proteins, the N-terminal 6xHis-tag was removed by incubation with thrombin (1 unit/mg protein, 2 h at 21-23°C). Proteins were further purified by gel filtration on a HiLoad 16/60 Superdex 200 column (GE Healthcare); fractions containing the target protein were pooled and concentrated by ultrafiltration using an Amicon Ultra centrifugal filter with 10 kDa cutoff (Millipore, Billerica, MA) to a final protein concentration of 20-40 mg/ml.
Structure Determination-E2 protein crystals were grown at 18°C using the hanging drop method. Suitable crystals were immersed in well solution supplemented with cryoprotectant prior to dunking and storage in liquid nitrogen. Diffraction data were collected either in-house using an FRE system or at synchrotron beamlines at the Argonne Photon Source or the Cornell High Energy Synchrotron Source. Data were processed with HKL-2000 (23), the XDS program package (24), or d*TREK (25). All structures, other than those of UBE2H, UBE2Q2 and BIRC6, were solved by molecular replacement methods using AMoRE (26), Phaser (27) or MOLREP (28). The structures of UBE2H, UBE2Q2 and BIRC6 were solved by single wavelength anomalous diffraction using either the program BnP (http://www.hwi.buffalo.edu/BnP/) or SOLVE (29).
In some cases automatic model building was carried out using either ARP/wARP (30) or RESOLVE (31). Iterative manual model building was conducted using the graphics program Coot (32) combined with refinement with REFMAC5 (33), CNS1.1 (34), or PHENIX (35). Extensive use was also made of the CCP4 program suite (65). All structures have been deposited and statistics can be found in supplemental Table S1.
Autoubiquitylation and E2 Loading Assays-E2 loading assays were carried out in a volume of 10 µl containing 1 µg of E1, 1 µg of E2, and 5 µg of 6xHis tagged Ub in a buffer consisting of 10 mM HEPES pH 7.5, 100 mM NaCl, 40 µM ATP and 2 mM MgCl2. Reaction mixtures were incubated for 10 min at 30°C, stopped by the addition of non-reducing SDS-PAGE sample buffer, separated by 4-20% gradient SDS-PAGE gels (Invitrogen), and visualized by Western blotting, using a mouse monoclonal antibody directed against the 6xHis epitope tag (Qiagen, Valencia, CA), an HRP-conjugated goat anti-mouse secondary (GE), and ECL (BioRad, Hercules, CA).
Autoubiquitylation reactions (in the presence of E3s) were performed in a volume of 20 µl in a buffer of 50 mM Tris pH 7.6, 5 mM MgCl2, 2 mM ATP, and 2 mM dithiothreitol, containing E1 (50 ng), E2 (100 ng), ubiquitin (5 µg), and E3 (6xHis tagged HECT domain proteins, 0.5 µg). After incubation at 30°C for 90 min, reactions were stopped by the addition of SDS-PAGE sample buffer and resolved on 7% SDS-PAGE gels. In the absence of E3, reaction conditions were 50 mM Tris pH 7.6, 50 mM NaCl, 50 mM KCl, 10 mM MgCl2, 5 mM ATP and 0.1 mM dithiothreitol, with 100 ng E1, E2 (200 ng of UBE2R1 or 1 µg of UBE2K) and 5 µg Ub. Reactions were incubated at 30°C for 3 h. Ubiquitylated proteins were evaluated by Western blotting using monoclonal antibodies directed against 6xHis (Qiagen), as above.
Structural models were prepared for electrostatic potential calculations by determining partial charges at a pH of 7.6 and van der Waals radii using PDB2PQR (37) with the PARSE (38) force field. Electrostatic potentials were calculated using the linearized Poisson-Boltzmann equation,

$$-\nabla \cdot \left[\varepsilon(\mathbf{r})\,\nabla\varphi(\mathbf{r})\right] + \bar{\kappa}^{2}(\mathbf{r})\,\varphi(\mathbf{r}) = \frac{e^{2}}{\varepsilon_{0} k_{B} T}\sum_{i} z_{i}\,\delta(\mathbf{r}-\mathbf{r}_{i})$$

where $\mathbf{r}$ represents discrete grid point positions within and around the protein, $\varepsilon(\mathbf{r})$ is the dielectric coefficient, $\varepsilon_{0}$ is the vacuum permittivity, $\bar{\kappa}(\mathbf{r})$ is the ion accessibility function, $\varphi(\mathbf{r})$ is the electrostatic potential, $e$ is the electron charge, $k_{B}$ is the Boltzmann constant, $T$ is the temperature, and $z_{i}$ is the unit or partial charge located at position $\delta(\mathbf{r}-\mathbf{r}_{i})$ (39). The Adaptive Poisson-Boltzmann Solver (APBS) software package calculates electrostatic potential by embedding each E2 enzyme in a grid, and solves the Poisson-Boltzmann equation to determine electrostatic potential at each grid point based on assigned charge, dielectric coefficient, and ion accessibility at that position (40). The dielectric surface was defined using a sphere probe with a radius of 1.4 Å, and the ion accessibility surface was defined using a sphere probe with a radius of 2.0 Å. To ensure a proper comparison, all of the E2 enzymes were superimposed and grid dimensions (129 × 97 × 97 points) were selected to fully enclose each enzyme when the calculated isopotential contour surfaces were plotted at ±1 $k_{B}T/e$. Electrostatic potentials were visualized using the molecular graphics software Chimera (41). Comparison of the spatial distributions of electrostatic potentials of the E2 enzymes was performed by generating a similarity distance matrix according to the metric

$$D_{A,B} = \frac{1}{N}\sum_{i,j,k}\frac{\left|\varphi_{A}(i,j,k)-\varphi_{B}(i,j,k)\right|}{\max\left(\left|\varphi_{A}(i,j,k)\right|,\left|\varphi_{B}(i,j,k)\right|\right)}$$

where $\varphi_{A}(i,j,k)$ and $\varphi_{B}(i,j,k)$ are the electrostatic potentials of proteins A and B, respectively, at a common grid point (i,j,k), and N is the number of grid points.
This method implies that proteins having a distance of 0 have identical spatial distributions of electrostatic potentials, whereas those having a distance of 2 have completely different electrostatic potential spatial distributions.
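As an illustration of how such a similarity distance behaves, the following minimal Python sketch assumes the metric is the average normalized difference between two potentials sampled on a common grid (the absolute difference at each point divided by the larger of the two absolute potentials), which is bounded between 0 and 2. The function name, the flat-list grid representation, and the handling of points where both potentials are zero are assumptions made for this sketch, not details taken from the analysis pipeline.

```python
from typing import Sequence


def electrostatic_similarity_distance(
    phi_a: Sequence[float], phi_b: Sequence[float]
) -> float:
    """Average normalized difference between two potentials on a common grid.

    Returns 0.0 for identical potentials and 2.0 when the potentials are
    equal in magnitude but opposite in sign at every grid point.
    """
    if len(phi_a) != len(phi_b):
        raise ValueError("both potentials must be sampled on the same grid")
    if len(phi_a) == 0:
        raise ValueError("at least one grid point is required")

    total = 0.0
    for a, b in zip(phi_a, phi_b):
        scale = max(abs(a), abs(b))
        # Assumption: a point where both potentials are exactly zero
        # contributes nothing to the sum (avoids division by zero).
        if scale > 0.0:
            total += abs(a - b) / scale
    return total / len(phi_a)


# Hypothetical potentials flattened from a small grid (arbitrary units).
identical = electrostatic_similarity_distance([1.0, -2.0, 3.0], [1.0, -2.0, 3.0])
opposite = electrostatic_similarity_distance([1.0, -2.0], [-1.0, 2.0])
```

Applying the function to every pair of superimposed E2 grids would fill the symmetric distance matrix that is passed to the clustering step.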
Agglomerative hierarchical clustering of the similarity distance matrix was performed with a Euclidean metric and average linkage using the pvclust library (42,43), which performs average-linkage hierarchical clustering while assessing its uncertainty through bootstrap resampling and providing "approximately unbiased" (AU) probability values (p values) and "bootstrap probability" (BP) values. Our analysis was performed using 1,000,000 bootstrap replications. The hierarchical clustering dendrogram generated by pvclust displays AU p values (in red) calculated using multiscale bootstrap resampling, edge numbers (in gray), and a red box outlining clusters strongly supported by data (i.e. edges with a multiscale bootstrap resampling AU p value greater than 90%). The AU p value represents the frequency that a particular cluster appears in the bootstrap replicates. A discrete cluster having high probability (AU > 0.9; edge 21) excludes UBE2R1 and UBE2R2 from the other 21 E2 enzymes, which are further divided into two high probability clusters (AU > 0.9; edges 16 and 19). These two high probability clusters correspond to a net negative charged cluster (UBE2B, UBE2A, UBE2G1, UBE2C, UBE2H, UBE2K, UBE2J1, UBE2N, UBE2G2, UBE2S, and UBE2U) and a net positive charged cluster (UBE2E3, UBE2D4, UBE2D1, UBE2D3, UBE2E2, UBE2D2, UBE2E1, UBE2T, UBE2J2, and UBE2L3). In addition to the calculation and clustering of electrostatic potential distribution for the full 23 domains, additional calculations were performed on subsets of residues within the E3-binding interface, and two ubiquitin-binding interfaces.
Mass Spectrometry-Autoubiquitylation reactions were scaled up threefold for mass spectrometric analysis, and subjected to 4-12% gradient SDS-PAGE. Gels were stained with Coomassie brilliant blue for visualization, and the region containing proteins migrating at >125 kDa was processed as in (44) for mass spectrometry. The digested peptide mixture was subjected to nLC-ESI-MS/MS, performed using an Orbitrap Velos instrument (Thermo Fisher Scientific) coupled to a Proxeon nanoHPLC system (Odense, Denmark). Reaction products from each E2-E3 pair were analyzed twice. Analytical columns were prepared in-house from 10 cm fused silica capillaries (75 µm inner diameter; InnovaQuartz, Phoenix, AZ) and packed with C18 coated silica particles (300 Å pore size, 5 µm particle size; Michrom Bioresources, Auburn, CA). Peptides were first injected onto a 2 cm (100 µm inner diameter) C18 precolumn, and chromatographic separation was achieved using a 120 min gradient, from 100% buffer A (5% acetonitrile with 0.1% formic acid) to 40% buffer B (95% acetonitrile with 0.1% formic acid) running at a constant flow rate of 250 nl/min. The mass spectrometer was operated in data-dependent acquisition mode: one survey (400-1800 m/z) MS scan (at 60,000 resolution) was performed, and the forty most intense ions were chosen for fragmentation using collision-induced dissociation in the ion trap. Target ions for which two previous collision-induced dissociation scans had been collected (within 30 s) were dynamically excluded for 60 s.
Thermo .raw files were converted to the .mzXML format with ReadW software v.3.5.1 (45), and data were searched using both (a) automated database search software X!Tandem (46,47) against the Homo sapiens ENSEMBL Genome Reference Consortium assembly GRCh37 database (75,126 entries), and (b) spectral matching against our previously published Ub/Ubl spectral library (48), supplemented with additional consensus spectra derived from commercial (Boston Biochem) di-ubiquitin K6, K27, K29, and K33 linkages. Search parameters for X!Tandem specified a parent MS tolerance of 10 ppm and an MS/MS fragment ion tolerance of 0.4 Da, with up to two missed cleavages allowed for trypsin. A +114.0429 Da modification of lysine was specified as a variable search parameter to identify the ubiquitin-derived diglycine motif. Oxidation of methionine (+15.995) and deamidation of Gln (+0.985) were also allowed as variable modifications. A GPM expect score of -2 was used as a cutoff, corresponding to a calculated false discovery rate of 0.80%. SpectraST (49,50) was used for spectral matching, with a dot product of ≥0.7 used as a cutoff, corresponding to a calculated false discovery rate of 0.66%. We have uploaded all files relevant to MS analysis of in vitro autoubiquitylation assays to Tranche, with an associated hash tag.
To correct for inherent differences in Ub linkage signal intensity, Proteome Discoverer (Thermo, Ver. 1.3) was used to search and analyze three replicate MS analyses of trypsin digested equimolar mixes of all seven di-ubiquitin polypeptides (Boston Biochem). Files were searched using the Sequest search algorithm and the ipi.HUMAN.v3.83 database, with a 10 ppm precursor mass tolerance and 0.5 Da fragment mass tolerance. Oxidation of M, deamidation of N or Q, and ubiquitylation of lysines (+114.043) were also allowed. Sequest results were analyzed using the Percolator algorithm, with a target FDR of 0.01.
Precursor ion area detection was enabled, using a precision of 4 ppm, and the AUC for each Ub linkage was calculated (for all observed charge states). A "detection bias" ratio was then calculated as the ratio of the AUC for each linkage type with respect to the linkage type with the lowest AUC, K29. An average detection bias ratio was calculated from the three replicate MS runs, and this conversion ratio applied to all data (supplemental Table S2).
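The detection bias correction can be sketched in a few lines of Python. The sketch below assumes that each replicate is represented as a mapping from linkage type to AUC, that the reference linkage is passed in by name (K29 in the text), and that "applying" the conversion ratio means dividing an observed signal by the averaged ratio; the last point in particular is an assumption of this sketch. All function names and numerical values are hypothetical.

```python
from typing import Dict, List


def detection_bias_ratios(auc: Dict[str, float], reference: str = "K29") -> Dict[str, float]:
    """AUC of each linkage relative to the reference (lowest-AUC) linkage."""
    ref_value = auc[reference]
    if ref_value <= 0:
        raise ValueError("reference AUC must be positive")
    return {linkage: value / ref_value for linkage, value in auc.items()}


def average_ratios(replicates: List[Dict[str, float]]) -> Dict[str, float]:
    """Mean detection bias ratio per linkage across replicate MS runs."""
    linkages = replicates[0].keys()
    return {k: sum(rep[k] for rep in replicates) / len(replicates) for k in linkages}


def correct_signal(observed: Dict[str, float], ratios: Dict[str, float]) -> Dict[str, float]:
    """Assumption: correction divides the observed signal by the bias ratio."""
    return {k: observed[k] / ratios[k] for k in observed}


# Hypothetical AUCs from two equimolar di-ubiquitin runs (arbitrary units).
runs = [
    {"K29": 1.0, "K48": 4.0, "K63": 2.0},
    {"K29": 2.0, "K48": 4.0, "K63": 8.0},
]
mean_ratios = average_ratios([detection_bias_ratios(run) for run in runs])
corrected = correct_signal({"K29": 5.0, "K48": 30.0, "K63": 60.0}, mean_ratios)
```

Because the standards are equimolar, any linkage whose ratio exceeds 1 is over-detected relative to the reference, and the correction scales its observed signal down accordingly.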

RESULTS
E2 evolutionary relationships-Based on the presence of a conserved catalytic domain of ~150 residues (domain ID #CD00195), we constructed a phylogenetic model depicting putative evolutionary relationships among the human E2 proteins (Fig. 1A). Interestingly, although this approach successfully grouped closely related E2s (e.g. the UBE2D and UBE2E subfamilies), the E2s that mediate the conjugation of ubiquitin-like proteins (Ubls) are not readily distinguished from those that mediate Ub conjugation (see also the very similar phylogenetic tree in 12). For example, UBE2I (Ubc9), the E2 for the Ubl SUMO (small ubiquitin-related modifier; 51, 52), is grouped with two known Ub E2s, UBE2A and UBE2B. Similarly, UBE2M and UBE2F, E2s for the Ubl NEDD8 (neural precursor cell expressed, developmentally down-regulated 8; 53), are placed on the same evolutionary branch as UBE2L3, a Ub E2 (54). Overall sequence similarity and putative evolutionary relationships thus appear to have somewhat limited value in predicting E2 functional properties.
Construction of an E2 Library-To better understand the human E2s, we constructed a recombinant E2 protein library. Full-length and "core" (UBC domain) versions of the 40 H. sapiens E2 proteins were cloned into a 6xHis expression vector (Table I). Using standard methods, both core and full-length polypeptides for 29 different E2s were successfully expressed in E. coli, and purified to homogeneity. The remaining full-length proteins were not expressed or were insoluble, but we were able to express the core domains of eight of the remaining E2s (Table I). These 66 purified E2 proteins (covering 37 of the 40 UBC domains in the human genome) form the basis of our library, representing the most complete collection of recombinant human E2 enzymes available to date.
A conserved E2 structural core-Although the structures of a number of E2s have been solved (Table I), those from several of the main branches of the E2 evolutionary tree (Fig. 1A) have not been previously characterized. To better understand E2 structure-function relationships, we determined high-resolution three-dimensional structures of 15 additional human E2 core domains (Table I, supplemental Fig. S1, and supplemental Table S1). These structures double the number of solved E2 UBC domains, and, combined with those that have been previously characterized, provide nearly complete structural coverage of the human E2 core.
FIG. 1. The UBC domain. A, Dendrogram of human UBC domains, depicting putative evolutionary relationships. A phylogenetic tree was generated from a cladogram derived from a ClustalW2 alignment of the minimal UBC fold. Nodal distances and relationships have been modified for clarity. NCBI gene nomenclature is shown above in larger font, and aliases below. Protein structures solved in this study are colored blue, and previously published structures are depicted in orange. B, Ribbon diagram of UBE2D1 (PDBID: 2C4P). Helices are labeled as α1-α6, and strands as β1-β4. Helix α2 is not observed in UBE2D1, but is contained in the structures of UBE2Q1 (2QGX) and UBE2Q2 (1ZUO), and is located between β2 and β3. The E3 ligase binding region (blue) and the catalytic cleft (pink) encompassing Cys85 at the active site are also indicated. C, A surface and ribbon representation of UBE2D1, with the E3-binding region colored in blue. Acidic residues on the negatively charged surface are also indicated. D, An electrostatic surface representation of UBE2D1 in the same orientation. Locations of the acidic trough and catalytic Cys residue are indicated.

As expected, most of the new structures display a canonical E2 fold, composed of a four-stranded, anti-parallel curled β-sheet surrounded on three sides by α-helical segments. The core domains of these E2s share a remarkable degree of similarity in three-dimensional structure, with an average root mean square deviation (RMSD) of 2 Å over aligned Cα atoms of 145 residues (relative to UBE2D1), despite sequence identity as low as 15% (supplemental Fig. S2). Interestingly, the RMSD values of the ubiquitin-like protein E2s (UBE2I, UBE2F and UBE2M) are also ~2.0 Å, indicating that the core polypeptide backbone structure does not diverge significantly for these conjugating enzymes. Alternative features such as surface properties or additional structural elements must therefore be important for specificity.
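For readers less familiar with the RMSD statistic quoted above, the following minimal Python sketch computes it for two equal-length lists of Cα coordinates. It assumes the two structures have already been superimposed and that the atoms are paired by a prior alignment; the optimal superposition step itself (for example, a Kabsch fit) is deliberately left out. The function name and the coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """Root mean square deviation between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must contain the same number of atoms")
    if len(coords_a) == 0:
        raise ValueError("at least one atom pair is required")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Two hypothetical two-atom "structures", offset by 2 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
deviation = rmsd(reference, shifted)  # 2.0
```

Because the deviation is averaged over the aligned atom pairs only, the value depends on which residues the structural alignment includes.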
An E2-HECT E3 Functional Screen-Our entire E2 panel was next subjected to an in vitro "Ub loading" assay, to assess the ability of each conjugating enzyme to form a thio-ester~Ub intermediate. In the presence of Ub, ATP and the E1 enzyme Uba1, we found that 26 of the recombinant human E2s were capable of loading Ub (supplemental Fig. S3). This result is in accordance with previous studies (e.g. 12), with one minor difference: UBE2U was able to load Ub (albeit weakly) in our assay. Our Ub loading screen also revealed that three E2s, UBE2K, UBE2R1 (Cdc34), and UBE2R2, do not require the presence of an E3 protein to carry out Ub conjugation (i.e. the formation of an isopeptide bond with the epsilon amine group of a lysine residue in one of the proteins in the reaction) under our standard in vitro reaction conditions. As expected, we did not observe Ub loading for the known ubiquitin-like protein E2s UBE2I, UBE2F, UBE2M, and UBE2L6, nor the inactive Ub E2 variants (UEVs, which lack a catalytic Cys) TSG101, TMEM189, UBE2V1, UBE2V2, or AKTIP. UBE2Z also had no activity in our assay, consistent with a previous report demonstrating that this E2 is loaded specifically by an alternative Ub E1, UBA6 (12) (not tested here).
Functional interactions between the entire set of Ub loading E2s and a number of HECT E3 ligases were next characterized (Fig. 2A and supplemental Fig. S4). Based on Ub adduct migration in SDS-PAGE, we classified each of the 234 E2-E3 autoubiquitylation reactions into one of three groups: (1) those containing long Ub chains (MW >125 kDa), (2) those containing shorter Ub oligomers or (multi-)monoubiquitylation (MW <125 kDa), and (3) those with no apparent products (Fig. 2B). UBE2L3, UBE2D1-4, UBE2E1-3, and UBE2J1-2 displayed functional interactions with all of the NEDD4 subfamily HECT domains, as well as with HERC4 and UBE3A (E6-AP). A majority of these interactions were highly productive, resulting in the synthesis of long Ub chains. UBE2S was able to catalyze the synthesis of long chains with WWP2 and ITCH, and short chains or multimonoubiquitylation with several other E3s. UBE2C, UBE2G1, UBE2T, and UBE2W also functionally interacted with several of the HECT domains to produce short chains or multimonoubiquitylation. UBE2A, UBE2B, UBE2H, UBE2Q1, and UBE2U displayed only low levels of activity in our screen, and no interactions were observed with UBE2G2. These data, for the first time, define a human E2-HECT functional interaction landscape.

TABLE I. Human E2 (including both active and inactive E2 variants) and HECT domain protein information. Indicated are gene name, aliases, protein length, PDBID code (if structure has been solved) and NCBI protein and gene accession numbers. "Protein Note" indicates whether we were able to produce soluble full length or core recombinant protein. "Ub loading" indicates the ability of each recombinant E2 protein to form a thio-ester bond with Ub, in our assay.
In a few cases, we observed significant differences in activity between the core and full-length versions of an E2. For example, the core domains of UBE2J1 and UBE2W displayed increased activity compared with their full-length counterparts in some autoubiquitylation reactions. Conversely, the full-length UBE2S protein was more active in some reactions than the core domain. These data suggest that additional sequences outside the core can act as positive or negative regulatory domains and/or provide additional E3 specificity.
E2 Surface Charge is Associated With Ub Chain-Building Processivity-To identify physicochemical characteristics that confer Ub chain building activity and functional interactions with HECT domains, surface properties of each E2 structure were next analyzed. Charge, dielectric coefficient and ion accessibility were calculated for each E2 protein, and the Poisson-Boltzmann equation was used to determine electrostatic potential at >1,200,000 grid points/structure (Fig. 3, see Methods for further detail). Interestingly, the majority of the highly active E2s identified in our HECT autoubiquitylation screen displayed an overall net positive or neutral charge, whereas the remaining E2s displayed a net negative charge. Closer inspection revealed that most of the highly active E2s possess a clearly defined "acidic trough" adjacent to the catalytic cysteine residue (corresponding to residues Asp42, Asp87, Asp112, Asp116, and Asp117 in UBE2D1, located in the α5 and β2-β3 loop; Figs. 1B-1D), surrounded by extensive basic regions. Indeed, agglomerative hierarchical clustering of distance matrix values grouped the majority of the active E2s, based on the spatial distribution of electrostatic potential within 6 Å of the ubiquitin donor interaction interface (corresponding to residues E51, N97, R101, D102, C118, I121, S127, E139, and Y141 in UBE2S, the entire region encompassing a total of 53 aa). The most active E2s in HECT-containing reactions thus share a high degree of similarity specifically on the Ub donor binding surface, along with a higher net charge, and these properties clearly differentiate them from other E2 proteins. By contrast, the surface electrostatic potentials of the E3 binding site and Ub acceptor region (supplemental Fig. S5) were not good predictors of activity in our assays.
Several previous reports have indicated that E2 acidic residues may contribute to catalysis. Asp117 and the nearby HPN (where N = Asn77) motif in UBE2D2 were shown to assist catalysis by stabilizing the E2-Ub oxyanion intermediate (55). Other negatively charged residues adjacent to the catalytic cleft may also contribute to the binding of the positively charged residues in the region surrounding the carboxy-terminus of Ub (56). To test whether the acidic trough plays a role in HECT-mediated ubiquitylation, we generated a series of UBE2D1 proteins in which individual amino acids in this region were converted to basic residues. As a measure of E2 activity, the amount of free (unconjugated) Ub remaining after a standard autoubiquitylation reaction with the ITCH HECT domain was monitored (Fig. 4A). Wild type (WT) UBE2D1 incorporated ~90% of the Ub in the reaction mix into conjugates within 90 min (compare lane 1, no E2, to lane 2). Consistent with previous findings, mutation of residues important for E2-E3 interactions (K4 and F62) greatly attenuated Ub conjugation (lanes 3 and 5). Conversion of individual UBE2D1 acidic trough residues to basic amino acids (D42R, D87R, D112R, D116R, and D117R; see Fig. 4B for residue locations) also resulted in markedly lower activity in this assay (lanes 4, 6, 9, 10, and 11). Similar results were observed with other HECT domains (data not shown). These data are consistent with an important role of the acidic residues surrounding the catalytic Cys (as was recently also observed for UBE2S; 17).
A HECT Binding Signature Correlates with Ub Chain-Building Activity-Although most E2s can catalyze ubiquitylation with RING E3 ligases, UBE2L3 is preferentially utilized by HECT E3s (54,57). The HECT domain protein E6-AP (UBE3A) selectively interacts with UBE2L3 and UBE2L6 with a Kd of ~5 µM, and binds to other E2s with reduced affinity (54). The UBE2L3 residues responsible for binding to the E6-AP HECT domain were previously identified on helix α1, β3-β4 loop 4, and α3-α4 loop 7. We aligned this region through all of the active Ub E2s in our screen, and scored them for potential HECT binding affinity, based on the binding energies associated with a number of critical UBE2L3 residues (54) (Figs. 4C and 4D). Notably, the active E2s highlighted in our HECT screen also scored highest in this analysis, suggesting that these proteins possess structural features more favorable for HECT domain binding than other human E2 proteins. A higher degree of structural similarity to the E6-AP binding domain of UBE2L3 was thus an excellent predictor of Ub chain building activity in in vitro reactions with HECT domains.

... sequences. Residues previously found to play a role in HECT binding are highlighted. Column color corresponds to the change in binding energy observed when this residue was mutated (39). A "conservation score" for each E2 was assigned as follows: identical residues were assigned a score according to binding energy, where red residues = +4, orange = +3, mustard = +2, light yellow = +1; conserved mutations at the same location (e.g. R to K, or S to T) were assigned half scores. Similar amino acids (e.g. aliphatic, charged or small amino acids) were assigned a score of 0. Dramatically altered amino acids at the same position were penalized with a negative score corresponding to the binding energy of each UBE2L3 amino acid. Raw scores were summed for each E2, and the conservation score was determined by dividing by the UBE2L3 score of 32.
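The conservation score described in the legend text can be illustrated with a minimal Python sketch. The rules follow the legend: full weight for an identical residue, half weight for a conservative substitution, zero for a merely similar residue, and a penalty equal to the negative weight for a dramatically altered residue, with the sum normalized by the reference score. The specific residue groupings, the example sequence, and the weights below are hypothetical placeholders, not values from the study; only the R/K and S/T pairs are named in the legend.

```python
from typing import Sequence

# Conservative substitutions explicitly named in the legend (R to K, S to T).
CONSERVATIVE_PAIRS = {frozenset("RK"), frozenset("ST")}
# Assumption: illustrative "similar" classes (aliphatic, charged, small).
SIMILARITY_GROUPS = [set("AVLI"), set("DEKRH"), set("GAST")]


def position_score(ref: str, query: str, weight: float) -> float:
    """Score one aligned position against the reference residue."""
    if ref == query:
        return float(weight)
    if frozenset((ref, query)) in CONSERVATIVE_PAIRS:
        return weight / 2
    if any(ref in group and query in group for group in SIMILARITY_GROUPS):
        return 0.0
    return -float(weight)


def conservation_score(ref: str, query: str, weights: Sequence[float]) -> float:
    """Summed position scores divided by the reference self-score."""
    if not (len(ref) == len(query) == len(weights)):
        raise ValueError("aligned residues and weights must have equal length")
    raw = sum(position_score(r, q, w) for r, q, w in zip(ref, query, weights))
    # The reference scored against itself plays the role of the divisor
    # (32 for UBE2L3 in the legend).
    return raw / sum(weights)


# Hypothetical four-residue signature with binding-energy weights 4, 3, 2, 1.
weights = [4, 3, 2, 1]
same = conservation_score("KFEK", "KFEK", weights)          # 1.0
conservative = conservation_score("KFEK", "RFEK", weights)  # 0.8
disrupted = conservation_score("KFEK", "KAEK", weights)     # 0.4
```

Under this scheme the reference always scores 1.0, so scores for other E2s read directly as a fraction of the reference binding signature retained.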
To test whether this signature is important for E2 activity in HECT-containing autoubiquitylation reactions, we generated a series of UBE2L3 proteins in which individual residues in the E6-AP HECT interaction domain were mutated, and assessed their activity in vitro with the ITCH E3 ligase HECT domain (Fig. 4A). UBE2L3 proteins mutated at residues in the E6-AP binding motif (K9E, F63A, E93R, K96E, and K100E; see Fig. 4C for residue locations) displayed a consistent decrease in activity with ITCH. These data suggest that the affinity of the E2 protein for HECT domains also plays an important role in HECT-mediated ubiquitylation.
Characterization of Ub Chain Linkages in E2-HECT Reactions-Finally, we utilized mass spectrometry to characterize the Ub chain linkages generated by a number of functional E2-E3 pairs. Autoubiquitylation reaction products were separated via SDS-PAGE, and proteins migrating at >125 kDa were subjected to trypsin proteolysis. The resulting peptides were analyzed using nanoflow liquid chromatography-electrospray ionization-tandem mass spectrometry (nLC-ESI-MS/MS). Ub-Ub linkages were identified using two methods: (1) standard database searching, with the inclusion of a +114.0429 Da mass shift (corresponding to the Ub tryptic GG remnant on lysine) as a variable modification (58), and (2) spectral matching, using a Ub/Ubl spectral library that we recently developed, and which now contains consensus spectra derived from all seven Ub chain linkage types (59,48). When UBE2D2, UBE2E3 or UBE2J2 were combined with the RING E3s Mdm2 or Ro52, a mix of Ub linkages was generated, in which K63 ≈ K48 ≈ K11 >> all other linkage types (Fig. 5 and supplemental Table S2). UBE2L3 was not functional with either of these RING proteins. In the presence of the same E2s, the HECT domains derived from ITCH, WWP2, and NEDD4L also catalyzed the formation of a mix of K63-, K48- and K11-linked Ub products, where rank order of abundance was K63 > K48 > K11 >> all other linkage types (Fig. 5 and supplemental Table S2). In contrast, the HERC4 HECT domain gave rise to a very different spectrum of Ub chains with the same set of E2s, synthesizing Ub oligomers that were markedly enriched in K48 linkages (K48 >> K63 > all other linkages). Thus, very different linkage types were observed in reactions containing the same E2, but different HECT E3s.
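The 114.0429 Da variable modification used in the database search can be checked with a short calculation. Trypsin cleaves Ub after Arg74, leaving the C-terminal Gly75-Gly76 attached to the modified lysine through an isopeptide bond, so the remnant has the composition of two glycine residues (C4H6N2O2). The Python sketch below is illustrative only: the function and variable names are hypothetical, and the 0.01 Da matching tolerance is an arbitrary assumption rather than a parameter taken from the paper.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.9949146196,
}

# One glycine residue (glycine minus water) is C2H3NO.
GLY_RESIDUE = {"C": 2, "H": 3, "N": 1, "O": 1}


def formula_mass(composition):
    """Sum monoisotopic masses for an element -> count mapping."""
    return sum(MONOISOTOPIC[element] * count
               for element, count in composition.items())


# The tryptic Ub remnant on lysine is two glycine residues.
GG_REMNANT = 2 * formula_mass(GLY_RESIDUE)


def is_gg_shift(delta_mass, tolerance=0.01):
    """True if an observed lysine mass shift matches the GG remnant.

    The default tolerance is an illustrative assumption.
    """
    return abs(delta_mass - GG_REMNANT) <= tolerance


print(round(GG_REMNANT, 4))
```

Rounded to four decimal places, the computed value agrees with the mass shift quoted in the text.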
To further delineate the role of HECT domains in Ub linkage specification, UBE2R1 (Cdc34) and UBE2K, both of which have been previously demonstrated to generate Ub chains enriched in K48 linkages (60,61), were analyzed in reactions containing: (1) no E3, (2) the RING E3s Mdm2 or Ro52, and (3) HECT domains derived from WWP2, ITCH, NEDD4L, and HERC4. Both of these E2s catalyzed the synthesis of Ub oligomers highly enriched in K48 linkages in the absence of E3, or when combined with RING E3s (Fig. 5 and supplemental Table S2). A similar linkage distribution was observed in reactions containing HERC4. However, when UBE2R1 or UBE2K were combined with NEDD4 subfamily HECT E3s (n.b. NEDD4L was inactive with UBE2K), the composition of Ub linkages in the reactions was similar to that observed with UBE2D2, UBE2E3, UBE2J2, and UBE2L3; i.e. mixed chain products, with significant levels of K63 and K48 linkages, along with a low level of K11-linked Ub oligomers (Fig. 5 and supplemental Table S2).
Consistent with earlier studies (62,63), our data thus confirm that: (1) HECT domains can govern linkage composition in autoubiquitylation reactions, largely independent of E2 linkage specificity, and (2) different HECT domains display different linkage preferences. Specifically, we found that HERC4 catalyzes the synthesis of Ub oligomers enriched in K48 linkages, whereas the NEDD4 subfamily HECT domains NEDD4L, ITCH and WWP2 synthesize Ub chains consisting primarily of K63, K48, and K11 linkages.
DISCUSSION
Structural genomics of the human E2 family confirmed that the UBC fold is stable through a large range of sequence variations, and highlighted an "acidic trough" near the catalytic cysteine, surrounded by extensive basic regions. This surface area is likely involved in a number of important roles in Ub/Ubl transfer. The area of the trough near the catalytic Cys specifically bonds with and desolvates the nucleophilic epsilon amino group, contributing to the activation of the substrate lysine (64). A previous study demonstrated that mutation of two Asp residues in this region (D100A and D127A) of UBE2I (Ubc9) significantly affects the transfer of SUMO from UBE2I to a target lysine (64). The area of the trough farther from the nucleophilic cysteine may be involved in recruiting and/or positioning the Ub/Ubl polypeptide during catalysis; in both the SUMO-UBE2I-RanGAP1 and the yeast UBC8-Ub (PDBID:1FXT) structures, this patch of the E2 mediates direct contact with the Ub/Ubl C terminus (56,64). Consistent with this model, we found that mutation of acidic residues in the UBE2D1 trough decreased its activity in HECT domain-containing reactions. These observations do not extend to all E2s, however. For example, UBE2R1 was highly active in our autoubiquitylation assays, yet displays very different surface charge characteristics. Other structural elements apparently confer a similar function in this case. Previous reports have indicated that a long acidic loop in UBE2R1 plays an essential role in the assembly of polyubiquitin chains (60).
We also found that similarity to the previously defined HECT binding region of UBE2L3 was a good predictor of activity for all E2s in HECT-containing in vitro reactions (Fig. 4). Eletr and Kuhlman (54) first defined the regions of UBE2L3 involved in binding to the HECT E6-AP (UBE3A), and the NEDD4L-UBE2D2 (55) cocrystal structure highlighted the same region. We found that the E2s sharing the highest similarity with UBE2L3 in this region also displayed higher levels of Ub chain building activity in autoubiquitylation assays, with a number of different HECT domains. Mutational analysis of the HECT binding signature in UBE2L3 suggested that this motif likely plays an important role in Ub chain-building activity with all HECT proteins.
Both NEDD4L and E6-AP contact the E2 protein via helices H7 and H8 (55). However, E6-AP appears to make more extensive interactions with UBE2L3 via the loop between helix H7 and S5. Interestingly, E6-AP and HERC4 possess similar E2 binding sites, whereas NEDD4L, WWP2, SMURF1 and ITCH diverge in this region. This could explain the variation in chain linkage types synthesized by the NEDD4 subfamily versus HERC4 with the same E2 proteins, if the more robust HERC4 binding to E2s stabilizes the E2-E3 pair in a conformation that favors the formation of K48-linked Ub oligomers. Consistent with this model, HERC4-containing autoubiquitylation reactions were also less productive overall than NEDD4 subfamily-containing reactions with the same set of E2s (supplemental Fig. S4, and see Ub linkage counts in supplemental Table S2), suggesting that the E2 off rate for HERC4 could be slower. Further study will be required to test this model.
Previous publications have suggested that HECT E3s determine linkage specificity in autoubiquitylation reactions (62,63), because they possess a catalytic Cys residue. However, many different E2-E3 interaction models can be envisioned. For example, (1) once a single Ub molecule is transferred to the HECT Cys residue, Ub chains could be built upon it by E2s. In this case, the E2 would specify the linkage type. Or, (2) Ub chains built by one or more E2s could be transferred en bloc to a HECT before conjugation to a target. Linkage specificity would be entirely determined by the E2 in this model. Or, (3) if an E2 interacts strongly with a HECT E3, it is possible that it could influence chain types produced by the E2-E3 pair. Unlike any other study that we are aware of, here we challenged several different HECT domains with two E2s that synthesize Ub chains specifically enriched in K48 linkages (UBE2R1/Cdc34 and UBE2K (19,63)). We found that the chain types generated in HECT-containing autoubiquitylation reactions were quite different from those observed in RING-containing reactions (or in reactions lacking an E3) with these E2s, indicating that the HECT domain is the primary determinant of Ub-Ub linkages in this context.
Recent work from the Rape laboratory indicated that residues in UBE2S coordinate with residues of Ub itself to produce K11-specific Ub chain linkages (17). This interaction involves the Ub β1, β2 and β4 strands (K11 is in β2) interacting with the active site and helix α3 of UBE2S. Three side chains in UBE2S α3 (E131, R135 and E139) participate in hydrogen bonds or salt bridges with Ub. To assess whether other E2s may function in a similar manner, we aligned all known E2 structures, and analyzed conservation in this region (supplemental Fig. S6). Notably, no other E2s possess similar residues at the α3 sites that make contact with Ub in UBE2S. Structural analysis therefore supports our current understanding that K11 specificity (at least using a mechanism like that of UBE2S) is a trait likely pertinent only to UBE2S.
Finally, it is important to note that functional E2-HECT pairs that can only extend Ub chains on mono-ubiquitylated targets, or those that require one or more cellular co-factors, would not be detected in our assay. Nevertheless, our new set of E2 core structures, combined with the construction of a nearly complete E2 protein library, a comprehensive E2-HECT functional interaction screen, and a mass spectrometry-generated Ub linkage data set, represents an important resource for better understanding the structural properties that mediate E2-HECT interactions, processivity, and the synthesis of different types of Ub chain linkages.