The Crystal Structure of an Algal Prolyl 4-Hydroxylase Complexed with a Proline-rich Peptide Reveals a Novel Buried Tripeptide Binding Motif*

Plant and algal prolyl 4-hydroxylases (P4Hs) are key enzymes in the synthesis of cell wall components. These monomeric enzymes belong to the 2-oxoglutarate-dependent superfamily of enzymes characterized by a conserved jelly-roll framework. This algal P4H has high sequence similarity to the catalytic domain of the vertebrate, tetrameric collagen P4Hs, whereas there are distinct sequence differences with the oxygen-sensing hypoxia-inducible factor P4H subfamily of enzymes. We present here a 1.98-Å crystal structure of the algal Chlamydomonas reinhardtii P4H-1 complexed with Zn2+ and a proline-rich (Ser-Pro)5 substrate. This ternary complex captures the competent mode of binding of the peptide substrate, which is bound in a left-handed poly(L-proline) type II conformation in a tunnel shaped by two loops. These two loops are mostly disordered in the absence of the substrate. The importance of these loops for the function is confirmed by extensive mutagenesis, followed up by enzyme kinetic characterizations. These loops cover the central Ser-Pro-Ser tripeptide of the substrate such that the hydroxylation occurs in a highly buried space. This novel mode of binding does not depend on stacking interactions of the proline side chains with aromatic residues. Major conformational changes of the two peptide binding loops are predicted to be a key feature of the catalytic cycle. These conformational changes are probably triggered by the conformational switch of Tyr140, as induced by the hydroxylation of the proline residue. The importance of these findings for understanding the specific binding and hydroxylation of (X-Pro-Gly)n sequences by collagen P4Hs is also discussed.

4R-Hydroxyproline (4Hyp) is an uncommon amino acid produced by the addition of a hydroxyl group to the C4 carbon atom of the proline pyrrolidine ring. 4Hyp residues have an essential role in the extensive collagen family, where they are necessary for the formation of stable triple helical molecules (1-3). 4Hyp residues are also found in many plant and algal hydroxyproline-rich glycoproteins (HRGPs), such as extensins and arabinogalactan proteins, which are the major structural components of their cell walls (4,5). In addition to these structural roles, 4Hyp has a key role in the regulation of gene expression in an oxygen-dependent manner via the hypoxia-inducible transcription factor (HIF) (6-8).
The formation of 4Hyp in the above proteins is catalyzed by the prolyl 4-hydroxylases (P4Hs), which are 2-oxoglutarate (2OG) dioxygenases and also require Fe2+ and O2 (Fig. 1A) (1-3,9). Two P4H families, the collagen P4Hs (C-P4Hs) and the HIF-P4Hs, each having three isoenzymes, are responsible for the hydroxylation of collagens and HIF, respectively (1-3,6-9). The vertebrate C-P4Hs are α2β2 tetramers in which the α-subunits are responsible for the hydroxylation and the β-subunits are identical to protein-disulfide isomerase. The tetrameric assembly is required for stability and full activity (1-3). In contrast, the plant P4Hs and probably also the HIF-P4Hs are monomers (2,10). Plant P4Hs have ~30% sequence identity to the catalytic domain of the C-P4H α-subunits (11,12), and they also resemble C-P4Hs in that they hydroxylate proline-rich polypeptides and are located in the lumen of the endoplasmic reticulum (2). HIF-P4Hs, on the other hand, are cytoplasmic and nuclear enzymes (8,9) that act on proline residues in -Leu-X-X-Leu-Ala-Pro- motifs in HIF-α (13,14). C-P4Hs hydroxylate the central proline of the -X-Pro-Gly- repeats of collagen polypeptides, typically generating about 100 4Hyp residues in polypeptides with a length of about 1000 amino acids. Plant P4Hs can also hydroxylate peptides with -X-Pro-Gly- repeats in vitro, but generally much less effectively than peptides representing their physiological substrates, the HRGPs, which are rich in serine and proline and can fold into a left-handed fibrous poly(L-proline) type II (PPII) helix conformation (15,16). The only known exception is isoenzyme 1 of the large Arabidopsis thaliana P4H (At-P4H) family, which efficiently hydroxylates a collagen-like (Pro-Pro-Gly)10 peptide, with a Km only 3-fold higher than that of human C-P4H-I (17).
Plant P4Hs also accept poly(L-proline) as a substrate, which is not hydroxylated by any of the C-P4Hs, but instead acts as an efficient competitive inhibitor of C-P4H-I (2). Several 2OG dioxygenases have now been crystallized, and despite their low overall amino acid sequence identity the structures have revealed that the catalytic sites are always located at the same site of a common framework: a double-stranded β-helix (jelly-roll) fold that consists of 8 antiparallel β-strands (18,19) (Fig. 1B). The catalytic sites of all P4Hs have the conserved -His-X-Asp-…-His- catalytic motif for Fe2+ coordination and a basic residue that binds the C5 carboxylate moiety of 2OG. The crystal structures of two P4Hs have been solved, namely those of human HIF-P4H-2 (10) and a plant P4H from Chlamydomonas reinhardtii (Cr-P4H-1) (12). The structures of these two enzymes share the jelly-roll core fold preceded by an N-terminal part that contains two long α-helices in both structures, and also three β-strands in Cr-P4H-1 and two β-strands in HIF-P4H-2 (10,12). In each of these two structures the extra β-strands extend the major β-sheet of the jelly-roll fold (Fig. 1B), and the helices of the N-terminal part shield this major sheet from bulk solvent. The proximal histidine and the aspartate of the Fe2+ coordination motif are located in the βII-strand of the jelly-roll, the distal histidine of this motif is in the adjacent βVII-strand, whereas the basic 2OG-binding residue (lysine in Cr-P4H-1 and arginine in HIF-P4H-2) is in the βVIII-strand. Both enzymes have been crystallized in the presence of an active site metal ion and a 2OG analogue, but not with a peptide substrate (10,12).
We present here the first crystal structure of a P4H complexed with a proline-rich peptide substrate. Cr-P4H-1 was cocrystallized with a 10-residue peptide substrate, (Ser-Pro)5, that adopts the PPII helix conformation. The structure reported here reveals an entirely novel binding mode for proline-rich peptide substrates that is also expected to be utilized in C-P4Hs but not in HIF-P4Hs.
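As a compact reference for the reaction introduced above (Fig. 1A), the overall stoichiometry can be written schematically as follows. This is only a summary sketch of the standard 2OG dioxygenase reaction; the Fe2+ over the arrow denotes the required metal cofactor rather than a consumed reactant.

```latex
\[
\text{peptidyl proline} + \text{2-oxoglutarate} + \mathrm{O_2}
\;\xrightarrow{\;\mathrm{Fe^{2+}}\;}\;
\text{peptidyl }trans\text{-4-hydroxyproline} + \text{succinate} + \mathrm{CO_2}
\]
```

One oxygen atom of O2 ends up in the hydroxyl group of the product and the other in succinate, which is formed by oxidative decarboxylation of 2OG.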

EXPERIMENTAL PROCEDURES
Preparation of the Protein Samples and Activity Measurements-A truncated Cr-P4H-1 (Val29-His253) with an N-terminal His6 SUMO fusion partner was expressed in Escherichia coli and purified to homogeneity as described previously (12). The tag was cleaved by digestion overnight with SUMO protease (12). Mutations were introduced using a QuikChange™ site-directed mutagenesis kit (Stratagene). The wild-type and mutant variants of Cr-P4H-1 were purified for the kinetic analyses as above, with the exception that the His6 SUMO fusion partner was not cleaved. The catalytic properties of the wild-type and mutant Cr-P4H-1 were measured using poly(L-proline) (Mr 5000), (Ser-Pro)5, and (Pro-Pro-Gly)10 as substrates as described previously (11).
FIGURE 1. The reaction and the crystal structure of Cr-P4H-1. A, reaction catalyzed by P4Hs. The 2OG molecule coordinates the Fe2+ via the C1-carboxylate and C2-keto oxygen atoms. The product is a trans-4-hydroxyprolyl residue. The two oxygen atoms of O2 (red) are used for the hydroxylation of the proline and the oxidation of 2OG, which is decarboxylated by releasing the C1-carboxylate group as CO2. B, schematic representation of the Cr-P4H-1 crystal structure. The jelly-roll fold is shown in darker brown, the major sheet being in the front and the minor sheet at the back. The two substrate binding loops are shown with dashed lines. C, stereoview of the ribbon diagram showing the mode of (Ser-Pro)5 peptide substrate (in stick representation) binding to the jelly-roll motif of Cr-P4H-1. The zinc ion, indicating the active site region, is shown as a gray ball. The β-strands of the jelly-roll core are labeled using roman numerals as in B. The two major flexible loops, β3-β4 and βII-βIII, wrapping the peptide substrate are also indicated.
Crystal Structure Determination and Validation-A dataset to 1.98-Å resolution was collected from a Cr-P4H-1 crystal using synchrotron radiation at the X12 beamline, EMBL, DESY, Hamburg, at 100 K using a 0.932-Å wavelength. The data were collected using a strategy from BEST (21) based on two reference images collected 90 degrees apart and processed by MOSFLM (22). The final dataset was processed using XDS (23). The data collection statistics are summarized in Table 1. The structure was solved by molecular replacement using Phaser (24) and molecule A of the SeMet(apo) form of Cr-P4H-1 (PDB ID 2V4A) (12), excluding the flexible loop regions, as a model. The structures of Cr-P4H-1 in molecules A-D and of the (Ser-Pro)5 peptides at the active sites of molecules A and C were built with iterative cycles of manual building using COOT (25). A restrained refinement with translation, libration, and screw-rotation displacement (TLS) (26) was accomplished using Refmac5 (27) from the CCP4 package (28). The (Ser-Pro)5 at the active site of molecule A consists of residues Pro4-Ser9, whereas in molecule C the residues Pro2-Ser9 could be built in the electron density. A total of 14 zinc atoms were included in the model, one at each of the four active sites and the rest mostly at the molecule-molecule interfaces. There were strong densities close to the bound Zn2+ in each molecule in the 2OG-binding pocket, but these were not sufficient to build the PDC molecule. Acetate ions fitted perfectly in these densities and refined with normal B-values. Finally, 468 water molecules were added to the structure using the ARP/wARP (29) option in Refmac. The final refinement statistics are shown in Table 1. Noncrystallographic symmetry restraints were not used during the refinement. The TLSANL program (30) was used to obtain the isotropic and the anisotropic B-values for each individual atom.
Ramachandran plot analysis showed 99.5% of the residues to be in favored regions and 0.5% in allowed regions according to MolProbity (31). The figures were prepared using PyMOL (DeLano Scientific LLC).
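To relate the data-collection parameters quoted above, Bragg's law (λ = 2d sin θ) gives the scattering angle at which the 1.98-Å resolution limit is reached with 0.932-Å radiation. The following is a minimal sketch for orientation only; it is not part of the processing described here, and the function name is illustrative.

```python
import math

def two_theta_deg(wavelength_A: float, d_spacing_A: float) -> float:
    """Scattering angle 2-theta (degrees) for a d-spacing, from Bragg's law with n = 1."""
    sin_theta = wavelength_A / (2.0 * d_spacing_A)
    if not 0.0 < sin_theta <= 1.0:
        raise ValueError("d-spacing cannot be reached at this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))

# Values quoted in the text: 0.932-Å wavelength, 1.98-Å resolution limit
print(round(two_theta_deg(0.932, 1.98), 1))  # about 27.2 degrees
```

The detector distance and size needed to record reflections out to this angle are not stated in the text and are therefore not modeled.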

RESULTS AND DISCUSSION
Structure Determination-A truncated Cr-P4H-1 lacking the 29 N-terminal amino acids was prepared as described (11,12) and crystallized in the presence of the inhibitors Zn2+ and PDC, and the peptide substrate (Ser-Pro)5. PDC is a homologue of 2OG (Fig. 1A) having two identically placed carboxylate groups, and the (Ser-Pro)5 peptide is a shortened version of a (Ser-Pro)19 motif present in the GP1 protein of the C. reinhardtii cell wall, which is a potent substrate for Cr-P4H-1 (11,16). Four independent molecules A-D are present in the resulting crystal form of Cr-P4H-1, and the final model was refined to 1.98-Å resolution (Table 1). Zn2+ is present at each of the four active sites, while the (Ser-Pro)5 peptide is found only in molecules A and C (Table 2). The Cr-P4H-1 molecules containing both Zn2+ and the peptide are referred to hereafter as the Zn-peptide complex. PDC is not found in any of the Cr-P4H-1 molecules, but instead an acetate molecule is placed at each active site close to the Zn2+ in the 2OG-binding pocket. The acidic crystallization conditions (pH 5.5) together with the presence of acetate ions in the buffer solution have apparently favored binding of the acetate ion in the 2OG-binding pocket instead of PDC. This is the second crystal form of Cr-P4H-1 obtained in the presence of Zn2+. A Cr-P4H-1 ternary complex with Zn2+ and PDC (referred to as the Zn-PDC complex) was crystallized earlier at pH 8.5 in Tris-HCl buffer (PDB entry 2JIG) (12). The new structural data on the Zn-peptide complex will first be described and compared with the previous Zn-PDC complex. This structure will subsequently be compared with the structures of corresponding ternary complexes of the other superfamily members and discussed also in the context of point mutation studies probing the functional importance of residues in the flexible loops.
The Overall Structure-The (Ser-Pro)5 peptide adopts a typical PPII helix conformation in the Zn-peptide complex (Fig. 1C). Ser1 is disordered in molecule C, Ser1-Ser3 are disordered in molecule A, and Pro10 is disordered in both molecules. Molecule C is therefore used as the reference molecule. The middle -Ser5-Pro6-Ser7- tripeptide region of the bound peptide has the lowest B-factors, and its geometry is well defined by the electron density map (Fig. 2). Pro6 is deeply buried and points toward the catalytic site (Figs. 2 and 3A). The peptide is bound to the edge strands of the jelly-roll fold, namely the βII-strand of the minor sheet of the jelly-roll fold and the β5-strand that extends the major sheet (Fig. 1, B and C). The N terminus of the peptide is bound close to the minor sheet, which also provides the three conserved metal ion-binding residues, namely His143, Asp145, and His227 (Fig. 2). The peptide subsequently crosses over to the major sheet, where Ser9 interacts with Trp99 of the β5 edge strand (Fig. 2). The bound peptide also interacts with the side chains of βVII (Leu226, near Pro2) of the minor sheet and of βI (Glu127 and Gly128, near Ser9) and βVIII (Trp243, near Pro6) of the major sheet (Fig. 3B). The -Ser5-Pro6-Ser7- region of the bound peptide is completely covered by two loops, the β3-β4 loop (Val75-Ser95) and the βII-βIII loop (Tyr146-Gly158), which protrude out of the jelly-roll fold (Figs. 1C, 3A, and 4). The β3-β4 loop in particular interacts tightly with the bound peptide via the residues Ser78-Val79-Val80 at the entrance to the loop and Ser87 and Arg93 at the exit (Fig. 3, A and B). These two loops, which are disordered in molecules B and D of this crystal form, are known to have large conformational flexibility (12) (Fig. 5 and Table 2).
The β3-β4 loop, which was only seen in an ordered open conformation in the Zn-PDC complex (12), now adopts an ordered closed conformation in which its tip has moved 14 Å toward the tip of the βII-βIII loop (Fig. 5). The βII-βIII loop conformation of the Zn-peptide complex is referred to as an ordered-extended conformation (Table 2), previously also observed in the Zn-binary complex (12). In the other structures, this loop is partially disordered or observed in an ordered, compact conformation (Table 2). In the latter conformation the βII-βIII loop binds in the peptide-binding groove (Fig. 5).
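The PPII conformation referred to above is usually recognized from backbone dihedral angles. As a rough sketch, assuming the commonly quoted ideal PPII values of about φ = -75° and ψ = +145° and an arbitrary ±30° tolerance (both are assumptions of this example, not values measured from the present structure), a residue could be classified as follows:

```python
# Assumed reference values for an ideal PPII helix (degrees); illustrative only
PPII_PHI = -75.0
PPII_PSI = 145.0

def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles in degrees (handles wrap-around)."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def is_ppii(phi: float, psi: float, tol: float = 30.0) -> bool:
    """True if (phi, psi) lies within +/- tol degrees of the assumed PPII reference point."""
    return angle_diff(phi, PPII_PHI) <= tol and angle_diff(psi, PPII_PSI) <= tol

print(is_ppii(-70.0, 160.0))  # close to the reference point
print(is_ppii(-60.0, -45.0))  # alpha-helical region, not PPII
```

A rectangular tolerance window is a simplification; validation tools use empirically derived Ramachandran regions instead.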
TABLE 2 (footnote). "In" refers to the active conformation of Tyr140. In the "Out" conformation, the main chain atoms of Tyr140 also change. The βII strand that contains the metal binding -His-X-Asp- motif is always ordered in the Zn2+ complexes and adopts an identical conformation in these four structures.
SEPTEMBER 11, 2009 • VOLUME 284 • NUMBER 37 JOURNAL OF BIOLOGICAL CHEMISTRY 25293
In the Zn-peptide complex the tips of the β3-β4 and βII-βIII loops interact with each other, but not with the peptide substrate, as highlighted in Fig. 3A. Both tips adopt an α-helical-like conformation with a strong hydrogen bond between the main chain oxygen of Asp81 and nitrogen of Gly85 and a weak hydrogen bond between the main chain oxygen of Asp149 and nitrogen of Ala153. Two sequence motifs present in these Cr-P4H-1 loop tips are conserved in plant P4Hs and C-P4Hs, namely -Asp/Asn-X-X-Ser/Thr-Gly- (Asp81-Gly85) in the β3-β4 loop of Cr-P4H-1 and -Asp/Glu-X-X-Asn/Asp- (Asp149-Asn152) in the βII-βIII loop (Fig. 6). In these motifs, the side chains of Asp81 and Ser84 as well as those of Asp149 and Asn152 are hydrogen-bonded to each other (Fig. 3A). The side chains of Asp81 and Ser84 point into the bulk solvent, whereas the Asp149 and Asn152 side chains point to the partner loop (Fig. 3A). Nevertheless, only one direct hydrogen bond exists between the two loop tips, namely Asn152-Gly85 (Fig. 3A). The residue Gly85 is fully conserved in the plant P4H and C-P4H sequences (Fig. 6), as its phi/psi values (82°/2°) do not favor a side chain in this position. In addition, there are water (W)-mediated hydrogen bonding interactions between the two loops via three well-defined water molecules, W40, W37, and W81 (Fig. 3, A and B). W37, W40, and W81 are also hydrogen-bonded to the (Ser-Pro)5 peptide, in particular to the side chain oxygen of Ser5 (Fig. 3B). There is also a hydrophobic interaction between the two loops, mediated by Val80 and Phe147 (Fig. 3, A and B). The β3-β4 and βII-βIII loops have high conformational flexibility (Fig. 5), and the observed conformations will thus be affected by crystal contacts. For this reason a systematic mutagenesis study of the conserved motifs of the two loops was performed to confirm the importance of these two partner loops for the biocatalytic function of Cr-P4H-1 (Table 3). In each of these loop variants, i.e. the D81A, S84A, G85A, D149A, D149N, and N152A mutants, the Km values for the substrate poly(L-proline) were markedly increased and a 4-15-fold decrease was observed in the kcat values, although none of these residues directly interacts with the substrate. In the N152A variant the only hydrogen bond between the two loops is lost, causing the most drastic effect on the kinetic constants. The mutagenesis data show that modest residue changes, even at the tips of the loops (Table 3), which affect the loop-loop interactions, greatly reduce the catalytic efficiency.
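The two conserved loop-tip motifs described above can be expressed as simple patterns for scanning sequences. The sketch below translates -Asp/Asn-X-X-Ser/Thr-Gly- and -Asp/Glu-X-X-Asn/Asp- into regular expressions over one-letter codes. The helper name and the example sequence are invented for illustration; the sequence is not that of Cr-P4H-1.

```python
import re

# One-letter translations of the motifs named in the text (X = any residue)
B3_B4_TIP = r"[DN]..[ST]G"    # -Asp/Asn-X-X-Ser/Thr-Gly-
BII_BIII_TIP = r"[DE]..[ND]"  # -Asp/Glu-X-X-Asn/Asp-

def find_motif(pattern: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every match, including overlapping ones."""
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]

hypothetical = "AVVDGKSGLRFDAENAT"  # made-up fragment, for demonstration only
print(find_motif(B3_B4_TIP, hypothetical))     # [(4, 'DGKSG')]
print(find_motif(BII_BIII_TIP, hypothetical))  # [(12, 'DAEN')]
```

Because the second pattern is only four residues long with two free positions, it will match many unrelated sequences by chance, so a hit is meaningful only in the context of an alignment such as that in Fig. 6.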
The bound (Ser-Pro)5 peptide has adopted the PPII helix conformation, as generally observed in complexes of proline-rich peptides with proteins (32). The interactions between (Ser-Pro)5 and Cr-P4H-1 are unique in two other respects, however: (i) there are no stacking interactions between the proline side chains of the bound peptide and aromatic residues of the protein, and (ii) the central -Ser5-Pro6-Ser7- tripeptide is completely buried in the complex, being shielded from the bulk solvent by the β3-β4 and βII-βIII loops (Fig. 4). This substrate-binding mode of Cr-P4H-1 is also unique within the 2OG dioxygenase superfamily. Most of the family members with known structures are microbial enzymes involved in antibiotic biosynthesis that use small molecule substrates (18). Only factor inhibiting HIF (FIH), which hydroxylates a specific asparagine residue in HIF-α, has been co-crystallized with its peptide substrate, but the substrate is not proline-rich and thus has no PPII helix conformation and is not bound in a tunnel (33). A loop region corresponding to the β3-β4 loop in Cr-P4H-1 also participates in the peptide binding in FIH (33), however, and it has been proposed that a topologically similar loop may be involved in the substrate binding of HIF-P4H-2 (34), the oxidative DNA/RNA repair enzyme AlkB from E. coli (35), and phytanoyl-CoA hydroxylase (36). However, in the corresponding loops of these homologues the characteristic -Asp/Asn-X-X-Ser/Thr-Gly- sequence motif is not conserved. Moreover, the βII-βIII loop is absent in FIH and in HIF-P4H-2. The fact that the latter loop is not present in any of the three HIF-P4H isoforms (6) indicates that this subfamily of P4Hs must have a different strategy for peptide substrate binding than plant P4Hs and C-P4Hs, which all have the elongated βII-βIII loop (Fig. 6).
Peptide Binding and Substrate Specificity-The interactions between (Ser-Pro)5 and Cr-P4H-1 are highlighted in a schematic way in Fig. 3B.
The free main chain NH and carbonyl oxygen groups of the peptide are hydrogen-bonded to the side chains and main chain of Cr-P4H-1, and to well defined water molecules. The central proline, Pro6, of the peptide points to the […] (Figs. 2 and 3, A and B). The side chains of the catalytic -His-X-Asp- motif, the Zn2+ ion and two arginines, Arg93 and Arg161, are also located nearby (Figs. 2 and 3). Arg93, which is stacked with the side chain of Tyr140, interacts with the (Ser-Pro)5 via a water molecule, W174, whereas Arg161 forms a direct hydrogen bond with the backbone oxygen of Pro6 (Fig. 3). In addition, Tyr140 forms a hydrogen bond with the backbone oxygen of Pro4 of the peptide (Fig. 3, A and B). These three amino acids, Arg93, Tyr140, and Arg161, are fully conserved in all P4Hs, and mutation of any of these residues to alanine leads to complete inactivation of Cr-P4H-1 (12). Two main chain-main chain hydrogen bonds are formed between the -Ser78-Val79-Val80- region of the β3-β4 loop of Cr-P4H-1 and the -Ser5-Pro6-Ser7- region of the peptide (Fig. 3). In addition, the side chain oxygens of Ser78 and Ser87 of the β3-β4 loop are hydrogen-bonded to the side chain oxygen of the peptide Ser7 (Fig. 3). Although these four residues of the β3-β4 loop, Ser78, Val79, Val80, and Ser87, closely interact with (Ser-Pro)5, only Val79 is highly conserved in the animal C-P4Hs and plant P4Hs (Fig. 6).
The structural data obtained here provide a first rationale for understanding the substrate specificity of the animal C-P4Hs for (X-Pro-Gly)n sequences. Given the notable sequence identity between Cr-P4H-1 and the catalytic domain of C-P4Hs (12), it is predicted that the mode of binding of the peptide in Cr-P4H-1 and C-P4Hs is the same, implying that the bound collagen peptide in the C-P4H complex will also adopt the PPII conformation. As shown in Fig. 7, A and B, both a collagen peptide taken from the structure of a synthetic collagen triple helix and the poly(L-proline) peptide superimpose well with the bound (Ser-Pro)5 peptide substrate in Cr-P4H-1. In particular the residues of the central tripeptide region, referred to as being at the X, Y, and Z positions (Pro-Hyp-Gly in the collagen peptide and Pro-Pro-Pro in poly(L-proline)), superimpose well with the corresponding -Ser5-Pro6-Ser7- tripeptide in both cases. This indicates that the hydrogen bonding interactions that exist between the main chain atoms of the central tripeptide of the (Ser-Pro)5 substrate and Cr-P4H-1 will also exist between the C-P4Hs and their -X-Pro-Gly- substrates.
The structure presented here shows that each of the three positions of the central tripeptide of the bound (Ser-Pro)5 has unique features. Firstly, the serine in the Z-position, Ser7, points upwards and is hydrogen-bonded to the side chain oxygen atoms of Ser78 and Ser87 of the β3-β4 loop (Fig. 3B). The residues corresponding to these two serines in C-P4Hs are highly conserved threonine and leucine residues, respectively (Fig. 6). These more bulky and more hydrophobic residues apparently favor a glycine in the Z-position of the tripeptide bound to C-P4Hs. This is fully consistent with the fact that these enzymes solely hydroxylate the proline in the Y position of the (Pro-Pro-Gly)10 peptide. This preference of C-P4Hs for a glycine also suggests that the peptide-protein main chain-main chain interaction between N(Ser7) and O(Ser78) is preserved in the C-P4H peptide substrate complexes, and this could be an important substrate specificity determinant in C-P4Hs.
The function of the hydrogen-bonding interactions between the two serines and the serine residue in the Z-position of the peptide substrate (Ser7) was probed by Cr-P4H-1 mutagenesis followed up by enzyme kinetic studies using three substrates, poly(L-proline), (Pro-Pro-Gly)10, and (Ser-Pro)5. Poly(L-proline) is used as a substrate by Cr-P4H-1 and At-P4H isoenzymes 1 and 2, their Km values for poly(L-proline) (Mr 5000-10000) being 140, 2, and 30 µM, respectively (11,17,37,38). Poly(L-proline) is not hydroxylated by C-P4Hs, but instead acts as an efficient competitive inhibitor of C-P4H-I with respect to the collagen substrate, the Ki being 0.5 µM (39). The (Pro-Pro-Gly)10 collagen peptide substrate has low affinity for Cr-P4H-1 and At-P4H isoenzyme type 2, the Km values being >1500 µM and 2800 µM, respectively, but binds tightly to At-P4H isoenzyme type 1 and C-P4Hs (11,17,37,38). Ser78 and Ser87 were mutated to threonine and leucine, respectively, to mimic the C-P4H sequences (Fig. 6). Wild-type Cr-P4H-1 hydroxylates the (Ser-Pro)5 and poly(L-proline) peptides with similar efficiency, whereas the (Pro-Pro-Gly)10 peptide is hydroxylated with lower catalytic efficiency (Table 3). The Cr-P4H-1 S78T, S87L, and S78T/S87L mutations had no effect on the hydroxylation efficiency of (Pro-Pro-Gly)10 (Table 3). The S78T and S87L mutations reduced the kcat values by 1.5-fold and about 4-fold, respectively, when poly(L-proline) was used as a substrate, while the effects were even milder, if any, with (Ser-Pro)5 (Table 3). The double mutation S78T/S87L reduced the kcat values obtained with poly(L-proline) and (Ser-Pro)5 10-fold and 3-fold, respectively (Table 3). These data indicate that the interactions between the Ser78 and Ser87 side chains and the peptide substrate provide only a moderate contribution to the substrate specificity of Cr-P4H-1.
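The kinetic quantities used above (Km, kcat, and Ki) enter the standard Michaelis-Menten rate law, in which a competitive inhibitor raises the apparent Km by the factor (1 + [I]/Ki). A minimal sketch of that textbook relation follows; the function and argument names are illustrative, and no values from Table 3 are reproduced.

```python
from typing import Optional

def mm_rate(s: float, km: float, vmax: float,
            inhibitor: float = 0.0, ki: Optional[float] = None) -> float:
    """Michaelis-Menten rate with optional competitive inhibition.

    All concentrations (s, km, inhibitor, ki) must share one unit, e.g. micromolar.
    """
    # A competitive inhibitor leaves vmax unchanged and scales the apparent Km
    km_app = km if ki is None else km * (1.0 + inhibitor / ki)
    return vmax * s / (km_app + s)

# Substrate at its Km gives half-maximal rate in the absence of inhibitor
print(mm_rate(s=10.0, km=10.0, vmax=1.0))                         # 0.5
# Inhibitor present at a concentration equal to Ki doubles the apparent Km
print(mm_rate(s=10.0, km=10.0, vmax=1.0, inhibitor=0.5, ki=0.5))
```

With the Ki of 0.5 µM quoted for poly(L-proline) and C-P4H-I, this relation implies that 0.5 µM inhibitor would double the apparent Km for the collagen substrate, assuming purely competitive behavior.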
The peptide-protein interactions in the X-position will also be important for the substrate specificity. The serine side chain in the X-position, Ser5, points downwards to the βII-βIII loop and makes two hydrogen bonds with main chain oxygen atoms (Tyr144 and Tyr146) of this loop (Fig. 3). Furthermore, Ser5 is hydrogen-bonded to water molecules involved in water-mediated loop-loop (β3-β4/βII-βIII) interactions (Fig. 3). As shown in Fig. 6, there are more amino acid differences between the Cr-P4H-1 and C-P4H βII-βIII loops than between their β3-β4 loops, the former also being 2 residues longer in the C-P4Hs. These differences correlate with different substrate specificities in the X-position of the central tripeptide, which is preferably a proline in the C-P4Hs (1). As shown in Fig. 3, the side chains of the βII-βIII loop point upwards toward the β3-β4 loop, and they are therefore also critically important for the water structure between these two loops and therefore for the substrate specificity. For example, Asp149 of the βII-βIII loop is hydrogen bonded to water W40 and subsequently to water W37 and the main chain nitrogen of Ser5 (Fig. 3).
Active Site and Reaction Mechanism-A characteristic feature of the structure of this competent complex is the "in" orientation of the side chain of Tyr140 (Table 2), pointing inwards into the 2OG binding cavity (Fig. 8A), and being sandwiched between the side chains of Arg93 and His143 (Fig. 3A). This geometry is observed in the Zn-peptide complex and in the Zn-PDC complex. In these two structures the Zn2+ ion is complexed with a solvent molecule (acetate) or PDC, respectively. The acetate ion in the Zn-peptide complex mimics the binding of PDC in the Zn-PDC complex in that one acetate oxygen atom replaces the 1-carboxylate oxygen and the second acetate oxygen replaces the nitrogen atom of the inhibitor (Fig. 8A). In the Zn-peptide complex two water molecules are bound at the bottom of the 2OG binding cavity. These water molecules replace the two 5-carboxylate oxygens of PDC (Fig. 8A).
TABLE 3 (caption). Kinetic studies of wild-type and mutant variants of Cr-P4H-1 using poly(L-proline), Mr 5000, as a substrate. The Ser78 and Ser87 mutants were also analysed with the (Ser-Pro)5 and (Pro-Pro-Gly)10
The geometry at the Zn2+ ion binding site of the Zn-PDC complex and the Zn-peptide complex shows the position of the three protein ligand atoms of the -His-X-Asp-…-His- motif and the two PDC/acetate atoms (Fig. 8A). The remaining sixth coordination position for molecular oxygen is predicted to be opposite the proximal histidine (12). In the Zn-PDC complex, a 31-Å3 cavity is calculated to occur in this region with ICM (MOLSOFT, LLC). This cavity is lined by the carboxylate moiety of Asp145, and by the hydrophobic side chain moieties of Thr164, Leu166, Phe212, Thr241, and Trp243 (Fig. 8A). Each of these residues protrudes out of the major sheet. In the Zn-peptide complex the side chains of Thr241, Trp243, and Leu166 have adopted slightly different orientations (Fig. 8A). Thr241 is hydrogen bonded to the inhibitor molecule in the Zn-PDC complex, but has rotated in the Zn-peptide complex, like the Leu166 and Trp243 side chains, causing the cavity to disappear in this complex. This finding is consistent with the fact that in the catalytic cycle molecular oxygen binds last to the active site (2), implying that both the substrate and the cofactor (2OG) are present when oxygen binds. It has been suggested that the O2 binding site may also be located opposite the proximal histidine in clavaminate synthase (40) and in AlkB (35), but interestingly, the currently available structures of 2OG dioxygenases also suggest an alternative site for O2 binding, opposite the distal histidine of the -His-X-Asp-…-His- motif (18).
The hydroxylation reaction is known to proceed via an FeIV(=O) ferryl intermediate, which is formed after the activated molecular oxygen has reacted with 2OG, by which succinate is also formed (Fig. 1A) (2,18,19). The geometry of all available complexes suggests that the oxygen of the FeIV(=O) ferryl intermediate is bound to the metal ion in the position opposite the distal histidine, corresponding to the O1 position in the Cr-P4H-1 inhibitor complex (Fig. 8A). In this catalytic position the ferryl oxygen can abstract the C4 (Pro6) hydrogen atom, as predicted by spectroscopic studies on a P4H from Paramecium bursaria Chlorella virus 1, in which an FeIII(OH) radical and a C4 (Pro6) radical are formed (41). Spectroscopic studies performed on taurine dioxygenase (TauD) from E. coli have provided very similar information on this mechanism (42). These mechanistic studies therefore suggest that the oxy-ferryl oxygen ligand migrates to its catalytic position after the decarboxylation of 2OG in the P4H reaction. In the next step of the P4H reaction cycle the OH-radical is donated back to the C4 (Pro6) radical, resulting in the hydroxylated product. The geometries of the Cr-P4H-1 Zn-peptide complex and the TauD Fe-2OG-taurine complex (PDB entry 1OS7) are compared in Fig. 8B. The TauD complex includes an Fe2+ ion, 2OG, and the substrate taurine, but the reaction does not proceed, as the crystals have been grown under anaerobic conditions (43). In both TauD and the P4Hs O2 binding occurs after binding of the 2OG and the substrate (41,42). The substrates are bound in topologically identical positions in the TauD and Cr-P4H-1 complexes (Fig. 8B). In Cr-P4H-1 the target C4 of the substrate Pro6 points to the acetate oxygen at a distance of 3.8 Å. This acetate oxygen superimposes on the O1-atoms of the PDC and 2OG molecules of the Cr-P4H-1 Zn-PDC complex and the TauD 2OG-taurine complex, respectively (Fig. 8, A and B). This geometry is suitable for the formation of the 4-R-OH product of the proline.
The Pro6 is in the down-puckering (C4-endo) conformation (Fig. 9), which was recently shown to be the favored prolyl substrate conformation for C-P4Hs (44) and HIF-P4Hs (45). However, the preferred conformation of the 4-R-OH-proline is the up-puckering (C4-exo) conformation (see PDB entries: 1V4F, 1V6Q, 1V7H, 01YM, 1LM8) (44-48), and therefore it is expected that the 4Hyp will adopt the up-puckering conformation on completion of the chemical conversion. In this conformation it will clash with the side chain of Tyr140, as is visualized in the superimposition analysis (Fig. 9). Consequently, it can be predicted that the hydroxylation of Pro6 will cause Tyr140 to flip to its out conformation, as seen in the Zn-binary complexes of this study (Table 2) and in our previous study (12). As Tyr140 in the Zn-peptide complex is stacked with the Arg93 of the β3-β4 loop, the conformational change in Tyr140 will also directly affect the conformation of that loop. Concomitantly with the flipping out of Tyr140, which causes the βI-βII loop to become disordered, the Glu141 side chain flips inwards and overlaps with the position of the β3-β4 loop in the Zn-peptide complex. Interestingly, the Y140A mutation completely inactivates Cr-P4H-1 (12), whereas the mutation to Phe has only minor effects on the kinetic constants (Table 3). Our model predicts that the Y140A mutation will disable the conformational switch of Tyr140, apparently causing the enzyme to become inactive, whereas the clash with the hydroxylated product in the Y140F variant will be the same as in the wild-type enzyme (Fig. 9). Evolutionary pressure has apparently generated the β3-β4 and βII-βIII loop-loop interactions to be sufficiently strong to stabilize the protein-peptide interactions required for catalysis, while at the same time these loop-loop interactions are not too tight, allowing loop opening on completion of the hydroxylation step and thereby facilitating product release.

Prolyl 4-Hydroxylase Complexed with a Peptide Substrate
FIGURE 8. The active site geometry of Cr-P4H-1. A, superimposed stereoview of the active sites of the Cr-P4H-1 Zn-peptide complex (light brown) and Zn-PDC complex (violet). The zinc ions at the active sites are shown as gray balls, and the dashed lines highlight the atoms directly interacting with the zinc ion in the Zn-peptide complex. The (Ser-Pro)5 is shown using gray bond colors. The amino acid side chain conformational differences at the proposed oxygen binding site opposite to the proximal histidine (below the zinc ions) are shown. The 1-carboxylate and 5-carboxylate ends of PDC are labeled. The acetate molecule (Ac) and two water molecules (light red) in the 2OG-binding pocket of the Zn-peptide complex are also shown. B, superimposed stereoview of the active sites of the Cr-P4H-1 Zn-peptide complex (light brown) and TauD (PDB ID 1OS7). The iron ion (brown) bound to TauD and the taurine substrate are also shown. The hydroxylated atom of the taurine molecule is marked with an asterisk.
Concluding Remarks-The structural and enzymological data on Cr-P4H-1 obtained here show that the β3-β4 and βII-βIII loops define the substrate binding tunnel. The bound peptide adopts the PPII conformation and the central -Ser-Pro-Ser- tripeptide is bound in this tunnel in such a manner that the first serine (in the X-position) points down toward the βII-βIII loop, the proline in the Y-position points toward the catalytic site, and the serine in the Z-position points up toward the β3-β4 loop. The -Asp/Asn-X-X-Ser/Thr-Gly- and -Asp/Glu-X-X-Asn/Asp- motifs at the tips of these loops are important for stabilizing the closed, liganded conformation, whereas the flanking regions of the loops participate in the determination of substrate specificity.
It is proposed that the hydroxylation of the proline in the Y-position at the end of the catalytic cycle induces a conformational switch in which Tyr140 flips out, forcing Arg93 and the β3-β4 loop to adopt an unliganded, open (disordered) conformation. In this conformation the succinate produced by decarboxylation of 2OG can be released and replaced again by 2OG, and the hydroxylated peptide can translocate in the binding groove so that a new, unhydroxylated tripeptide is ready to be hydroxylated in the next catalytic cycle. The structure and sequence comparisons indicate that the PPII mode of binding of the central tripeptide is preserved in the C-P4H enzyme family, and it is proposed that this complex enzymatic mechanism is also a common feature of these enzymes.