Structural and mutational analysis of the PhoQ histidine kinase catalytic domain. Insight into the reaction mechanism.

PhoQ is a transmembrane histidine kinase belonging to the family of two-component signal transducing systems common in prokaryotes and lower eukaryotes. In response to changes in environmental Mg²⁺ concentration, PhoQ regulates the level of phosphorylated PhoP, its cognate transcriptional response-regulator. The PhoQ cytoplasmic region comprises two independently folding domains: the histidine-containing phosphotransfer domain and the ATP-binding kinase domain. We have determined the structure of the kinase domain of Escherichia coli PhoQ complexed with the non-hydrolyzable ATP analog adenosine 5'-(β,γ-imino)triphosphate and Mg²⁺. Nucleotide binding appears to be accompanied by conformational changes in the loop that surrounds the ATP analog (ATP-lid) and has implications for interactions with the substrate phosphotransfer domain. The high resolution (1.6 Å) structure reveals a detailed view of the nucleotide-binding site, allowing us to identify potential catalytic residues. Mutagenic analyses of these residues provide new insights into the catalytic mechanism of histidine phosphorylation in the histidine kinase family. Comparison with the active site of the related GHL ATPase family reveals differences that are proposed to account for the distinct functions of these proteins.

Two-component signaling systems are used ubiquitously by prokaryotes and also by a number of lower eukaryotes to sense and respond to various environmental conditions. These systems consist of a histidine kinase that acts as the sensor of environmental stimuli and a response regulator that mediates the cellular response, generally at the level of transcriptional control (1). As with many signaling pathways, protein phosphorylation is used as a means to transmit information; however, unlike the majority of phosphoproteins found in higher eukaryotes, in which tyrosine, serine, or threonine serves as the substrate for phosphorylation, histidine kinases autophosphorylate a histidine residue from which the phosphoryl group is subsequently transferred to a conserved aspartate residue in the response regulator. The catalytic mechanism is reasonably well understood for aspartyl phosphorylation, whereas far less is known about the autokinase reaction. This lack of information is due in part to the relative scarcity of detailed structural information for the histidine kinases. Recently, structural information has become available for the CheA (2,3) and EnvZ (4) histidine kinases. These structures reveal that the catalytic ATP-binding domain is an autonomously folding α/β-sandwich that shares structural homology with a family of ATPases that includes Hsp90, DNA gyrase B, and MutL (5). Although these structures provide some insight into function, they have not allowed the assignment of catalytic residues. Here we describe the 1.6-Å resolution crystal structure of the catalytic domain of the PhoQ histidine kinase complexed with the nucleotide analog adenosine 5'-(β,γ-imino)triphosphate (AMPPNP). PhoQ is a transmembrane histidine kinase that is involved in Mg²⁺ homeostasis and/or pathogenesis of a number of Gram-negative bacteria (for review see Ref. 6). PhoQ responds to limiting concentrations of extracellular Mg²⁺ by increasing the net phosphorylation of the PhoP transcriptional response-regulator. The cytoplasmic portion of PhoQ consists of independently folded phosphotransfer and ATP-binding catalytic domains. The high degree of resolution reveals a detailed view of the nucleotide-binding site that provides new insights into the catalytic mechanism.

EXPERIMENTAL PROCEDURES
Plasmids-The plasmids used for expression of the PhoQ cytoplasmic domain (residues 219-486, which contains the phosphotransfer and catalytic domains) and the PhoQ catalytic domain (residues 331-486) were constructed as follows. For the cytoplasmic domain, DNA corresponding to codons 219-486 was PCR-amplified from plasmid pLPQ2 (7), which contains a full-length copy of the phoQ gene, cut with NdeI and EcoRI (synthetic sites were introduced in the PCR primers), and cloned into the NdeI-EcoRI backbone of pAED4 (8). The resulting plasmid, pAED4QTR, has an initiating ATG codon (at the synthetic NdeI site) fused to codon 219 of phoQ and all of the remainder of the gene including the native translational termination codon. In this plasmid, expression of the PhoQ cytoplasmic domain is controlled by a plasmid-borne T7 φ10 promoter and ribosome binding site. Variants of pAED4QTR were constructed in which the Lys-392, Arg-434, or Arg-439 codons were independently replaced with alanine codons. These mutant genes were constructed by encoding the mutant codon in a PCR primer that also contained a nearby unique restriction site, PCR-amplifying a portion of the gene, and cloning the mutated PCR product into the pAED4QTR plasmid. The phoQ catalytic domain was cloned into pAED4 as described above for the cytoplasmic domain, except that the primer used fuses an initiating ATG codon to codon 331 of phoQ and amplifies DNA from that codon. In the resulting plasmid, pAED4QKD, expression of the C-terminal PhoQ catalytic domain (residues 331-486) was controlled by the T7 φ10 promoter and ribosome binding site. Details of plasmid construction are available upon request.
Protein Purification-The PhoQ cytoplasmic domain (residues 219-486) was purified from Escherichia coli strain BL21 containing pAED4QTR (or variants with alanine substitutions at residues 392, 434, or 439) that had been treated with 1 mM isopropyl-1-thio-β-D-galactopyranoside to induce expression of the protein. Cells were harvested and lysed by sonication, and the debris was removed by centrifugation. Ammonium sulfate was added (15% w/v), and following centrifugation, the pellet was resuspended and dialyzed against 20 mM Tris·Cl (pH 8.0), 50 mM KCl, 10 mM β-mercaptoethanol. This solution was loaded onto a MonoQ anion-exchange column (Amersham Pharmacia Biotech) and eluted with a gradient from 50 mM to 0.5 M NaCl in 20 mM Tris·Cl (pH 8.0), 10 mM β-mercaptoethanol. Fractions containing the PhoQ cytoplasmic domain (as assayed by gel electrophoresis and Coomassie Blue staining) were pooled, concentrated, and dialyzed against 20 mM Tris·Cl (pH 8.0), 0.1 M NaCl, 10 mM β-mercaptoethanol. This solution was loaded onto a Superdex 75 gel filtration column (Amersham Pharmacia Biotech) and eluted in 20 mM Tris·Cl (pH 8.0), 0.1 M NaCl, 10 mM β-mercaptoethanol. Fractions containing the cytoplasmic domain were pooled and judged to be greater than 95% pure, as assessed by gel electrophoresis and Coomassie Blue staining. The PhoQ catalytic domain (residues 331-486) was purified from E. coli strain BL21 containing pAED4QKD as described above for the cytoplasmic domain with the following exception. Following removal of cellular debris, 20% ammonium sulfate (w/v) was added and the precipitate removed by centrifugation. Ammonium sulfate was then added to the supernatant to 35% (w/v), and the protein was pelleted and purified as for the PhoQ cytoplasmic domain. Selenomethionine-containing protein was produced by a nonauxotrophic protocol (9) and purified following the same protocol.
Protein concentration was calculated using an extinction coefficient of 9576 M⁻¹ cm⁻¹ at 280 nm for the cytoplasmic domain based on the presence of 8 tyrosines in the sequence and 4788 M⁻¹ cm⁻¹ at 280 nm for the catalytic domain based on the presence of 4 tyrosines in the sequence (10).
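The concentration calculation above can be sketched as a Beer-Lambert estimate. This is a minimal illustration, not the authors' procedure: the per-tyrosine value of 1197 M⁻¹ cm⁻¹ is inferred from the two quoted coefficients (9576/8 and 4788/4), and the function names, the 1-cm path length, and the example absorbance are assumptions.

```python
# Beer-Lambert estimate of protein concentration from absorbance at 280 nm.
# EPSILON_PER_TYR is inferred from the coefficients quoted in the text
# (9576 / 8 = 4788 / 4 = 1197); it is not stated explicitly in the source.
EPSILON_PER_TYR = 1197.0  # M^-1 cm^-1 per tyrosine


def extinction_coefficient(n_tyr: int) -> float:
    """Molar extinction coefficient at 280 nm from the tyrosine count alone."""
    return n_tyr * EPSILON_PER_TYR


def molar_concentration(a280: float, n_tyr: int, path_cm: float = 1.0) -> float:
    """Concentration in mol/L: c = A / (epsilon * l). Path length is an assumption."""
    return a280 / (extinction_coefficient(n_tyr) * path_cm)


eps_cytoplasmic = extinction_coefficient(8)  # cytoplasmic domain, 8 tyrosines
eps_catalytic = extinction_coefficient(4)    # catalytic domain, 4 tyrosines

# Hypothetical reading: A280 = 0.4788 for the catalytic domain in a 1-cm cuvette.
example_conc = molar_concentration(0.4788, n_tyr=4)
```

With these assumptions, a hypothetical absorbance reading converts directly to a molar concentration for either construct.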
Autokinase Assays-All reactions were performed at 22°C in 20 mM Tris·Cl (pH 8.0), 50 mM KCl, 1 mM MgCl₂. [γ-³²P]ATP (3.5-7.0 µCi at >5000 Ci/mmol) and varying amounts of non-labeled ATP (25, 50, 100, 200, and 400 µM) were incubated with 1-2 µM PhoQ cytoplasmic domain in a 35-µl volume, and the autophosphorylation reaction was allowed to proceed. Aliquots were removed and the reaction stopped by addition of SDS-PAGE sample buffer. The samples were then subjected to gel electrophoresis to separate the protein from free nucleotide, and phosphorylated protein was quantitated by phosphorimaging. The rate of protein phosphorylation was calculated for each ATP concentration and plotted versus rate/free [ATP] (Eadie-Hofstee plot). Km (−slope) and Vmax (y axis intercept) were derived from the plots, and kcat was calculated as Vmax/[enzyme]. Although the autokinase reaction does not strictly follow Michaelis-Menten steady state kinetics because phosphorylation of the protein effectively reduces the concentration of the enzyme-substrate complex, we believe that the derived parameters are still useful in comparing mutant and wild type variants. The reaction can be assumed to be near steady state since the amount of phosphorylated protein was always less than 5% of the total.
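The Eadie-Hofstee analysis described above can be sketched as a least-squares line through rate versus rate/[ATP]. This is an illustrative sketch only: the function names are ours, and the rates below are synthetic values generated from a hypothetical Km of 100 µM and Vmax of 2.0 (arbitrary units), not data from Table II.

```python
# Eadie-Hofstee fit: v = Vmax - Km * (v / [S]).
# Plotting v (y) against v/[S] (x) gives slope = -Km and intercept = Vmax.


def eadie_hofstee(substrate, rates):
    """Return (Km, Vmax) from an ordinary least-squares fit of v versus v/[S]."""
    x = [v / s for v, s in zip(rates, substrate)]
    y = list(rates)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept


def k_cat(v_max, enzyme_conc):
    """kcat = Vmax / [enzyme], as defined in the text."""
    return v_max / enzyme_conc


# ATP concentrations used in the assay (µM); rates are SYNTHETIC, computed from
# hypothetical parameters purely to exercise the fit.
atp_uM = [25.0, 50.0, 100.0, 200.0, 400.0]
HYPOTHETICAL_KM = 100.0
HYPOTHETICAL_VMAX = 2.0
rates = [HYPOTHETICAL_VMAX * s / (HYPOTHETICAL_KM + s) for s in atp_uM]

km, vmax = eadie_hofstee(atp_uM, rates)
```

Because the synthetic rates obey the Michaelis-Menten equation exactly, the fit returns the hypothetical parameters; with measured rates, scatter in v affects both axes, which is a known limitation of this plot.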
Crystallization and Data Collection-Crystallization was carried out at 4°C using the hanging-drop vapor diffusion method. The protein concentration was 10 mg/ml in 10 mM Tris-HCl (pH 8.5), 100 mM NaCl. Native and selenomethionine-containing kinase domain crystals were obtained in 1 week after mixing the protein and reservoir buffer (100 mM cacodylate (pH 6.5), 20-24% polyethylene glycol 1500, 100-200 mM magnesium acetate) in a 1:1 ratio. Crystals were in space group P2₁2₁2₁ with unit cell dimensions of a = 43.5 Å, b = 45.0 Å, c = 71.8 Å. Heavy atom derivatives were generated by soaking crystals in reservoir solution plus 10 mM HgCl₂ (36 h), 2 mM lead acetate (96 h), or 5 mM K₂PtCl₄ (144 h). For cryo experiments, crystals were transferred to 35% polyethylene glycol 1500, 7.5% ethylene glycol, 100 mM cacodylate (pH 6.5), and 150 mM magnesium acetate for 1-2 min and flash-frozen at 100 K in a nitrogen Oxford Cryosystem. All x-ray data sets were collected at 100 K, either in-house on a Mar345 image plate detector using a Rigaku RTP300 rotating-anode x-ray generator or at the NSLS Beamline X4A on a CCD Quantum 4 detector. The diffraction data were processed and reduced with Denzo and Scalepack (11). Statistics for all data sets used are given in Table I.
Structural Determination and Refinement-The structure was solved by a combination of multiwavelength anomalous diffraction (MAD) and multiple isomorphous replacement (MIR) data. Coordinates of two selenium atoms (out of three possible positions) were located from the MAD data using the MADSYS (12) and SOLVE (13) programs. The calculated electron density map showed a clear solvent boundary but not secondary structural elements. The phases derived from MAD data were used to locate the heavy atom positions in difference Fourier maps. MAD and MIR data were combined, and atom parameters were refined using isomorphous and anomalous differences, with the maximum likelihood method incorporated in MLPHARE (14). The resultant phases to 2.75 Å were further improved by solvent flattening, and the resolution was extended to 1.6 Å using DM (14). The resulting map was interpretable, showing several secondary structure features.
The initial model, containing 120 of the 156 residues, was built using the ARP/wARP program (15). Combining the partial structure phases with experimental phases allowed manual tracing of the remaining residues with the program O (16). Three N-terminal and five C-terminal residues are missing from the final structure. The model was subjected to iterative cycles of manual rebuilding and conjugate gradient minimization, simulated annealing, and individual B-factor refinement using the programs O and CNS (17). Both anisotropic B-factor and bulk solvent corrections were applied. A molecule of AMPPNP, one Mg²⁺ cation, and 189 water molecules were located during the refinement. Eleven residues adopt double conformations (residues 345, 348, 349, 351, 370, 371, 407, 414, 425, 465, and 471) in the final model. Stereochemistry checks indicate that the refined model is in good agreement with expectations for models within this resolution range (18). Statistics for the final model are given in Table I. Atomic coordinates for PhoQ-KD have been deposited with the Protein Data Bank, with accession code 1ID0 (19).
Structure and Sequence Comparisons-Structures were superimposed based on α-carbon atoms alone, and only atom pairs identified as equivalent were used for r.m.s. deviation calculations. The least-squares superimpositions were calculated using LSQMAN (20). The overall best fits between the structures were determined using the "Brute" option in the program with a cut-off distance criterion of 3.5 Å and a minimum fragment length of three consecutive residues. The coordinates were taken from the Protein Data Bank with entry codes: CheA-KD, 1b3q (2); EnvZ-KD, 1bxd (4); GyrB, 1ei1 (21); MutL, 1b62 (22); and Hsp90, 1byq (23). Sequence alignments among these proteins were based on structural superimposition. The multiple alignments of each protein with their respective families were taken from the Pfam database (Ref. 24; www.pfam.wustl.edu) with accession numbers: histidine kinase family, PF00512; GyrB family, PF00204; MutL family, PF01119; and Hsp90 family, PF00183.
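The r.m.s. deviation over equivalent α-carbon pairs can be illustrated with a short sketch. It assumes the two coordinate sets are already superimposed and paired residue by residue; it does not perform the least-squares fit, does not apply the three-residue fragment criterion, and is not a reimplementation of LSQMAN. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd_equivalent_pairs(coords_a, coords_b, cutoff=3.5):
    """R.m.s. deviation (Å) over paired C-alpha atoms lying within `cutoff` Å.

    coords_a and coords_b are equal-length sequences of (x, y, z) tuples for
    structures that are ALREADY superimposed. Pairs farther apart than the
    cut-off are treated as non-equivalent and excluded.
    Returns (rmsd, number_of_pairs_used).
    """
    squared = []
    for p, q in zip(coords_a, coords_b):
        d2 = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
        if d2 <= cutoff ** 2:
            squared.append(d2)
    if not squared:
        raise ValueError("no atom pairs within the cut-off distance")
    return math.sqrt(sum(squared) / len(squared)), len(squared)


# Toy coordinates (hypothetical): two pairs 1 Å apart, one pair 10 Å apart.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 10.0)]
toy_rmsd, toy_pairs = rmsd_equivalent_pairs(toy_a, toy_b)
```

Excluding pairs beyond the cut-off means the reported deviation describes only the structurally equivalent core, which is why the number of pairs used should be quoted alongside it.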

RESULTS AND DISCUSSION
Overall Fold-The crystal structure of the C-terminal PhoQ catalytic domain (residues 331-486) complexed with the nonhydrolyzable ATP analog, AMPPNP, was solved by combining experimental phases from MIR with phases from MAD, and this structure was refined at 1.6-Å resolution (Table I). The structure as shown in Fig. 1A is a two-layer α/β sandwich fold composed of a flat, mixed, five-stranded β sheet and three α helices (from left to right: βB, βD, βG, βF, and βE, and α1, α2, and α3, respectively), with dimensions 35 × 50 × 35 Å (Fig. 1). There is a deep cavity at one end where the AMPPNP molecule is located, and the opposite end is closed by a small antiparallel β-sheet formed by the βA and βC strands. In the final model, segments at the N terminus (residues 331-335) and the C terminus (residues 481-486) are disordered and extend into the solvent. The region comprising residues 433-439 has high B-factors (51.4 Å² on average) compared with the average for the protein (23.4 Å²), indicating an intrinsic flexibility. Nevertheless, the electron density is sufficiently well defined to trace the region (Fig. 1C). This flexible loop is part of a long polypeptide segment that extends away from the rest of the molecule and has been termed the ATP-lid in the homologous EnvZ and CheA proteins (5). The proximity of this region to the bound AMPPNP molecule and the flexibility indicated by high B-factors suggest important roles in ATP binding and interactions with the phosphorylation site in the histidine phosphotransfer domain (see below).
Comparison with the Histidine Kinase Domains of CheA and EnvZ-The detailed three-dimensional structures of the homologous kinase domains of CheA and EnvZ have been reported previously (2,4). The structure of PhoQ-KD is shown with the crystal structure of the Thermotoga maritima CheA kinase domain (2.6-Å resolution) and with the NMR structure of the E. coli EnvZ kinase domain (complexed with AMPPNP) in Fig. 2B, and an alignment of the amino acid sequences based on the superimposition of the structural elements is shown in Fig. 2A. CheA-KD, which is longer than the two other kinase domains (189 residues compared with 156 and 160 residues, respectively, in PhoQ-KD and EnvZ-KD), possesses two extra α helices, one between α2 and βD, and another before the ATP-lid loop (Fig. 2). Roughly 70% of the superimposed residues are identical for all three proteins, and these primarily cluster in the N, G1, F, and G2 boxes that have been classically defined in alignments of the histidine kinase superfamily (25) and in the recently defined G3 box (5). The remaining identical residues are mostly hydrophobic in character and are distributed among all of the structural elements, where they generally participate in forming the core of the molecule (Fig. 2A). Surprisingly, the r.m.s. deviation between the backbone α-carbon atoms of the PhoQ-KD and CheA-KD models is lower (1.62 Å) than the r.m.s. deviation between the backbone traces of the more closely related (by sequence and structural organization) PhoQ-KD and EnvZ-KD proteins (1.84 Å). Although we cannot rule out the possibility that the greater deviation from the EnvZ-KD is genuine, it seems more likely that this results from ambiguities in the overlaid NMR structures. Consequently, detailed structural comparisons were carried out with the CheA-KD structure. Besides the two additional helices in CheA-KD, the principal structural difference between them is the conformation of the highly flexible loop of the ATP-lid (Figs. 2B and 3).
ATP-lid mobility is facilitated by the presence of three conserved glycine residues (441, 443, and 445 in PhoQ) of the G2 box and one following the F box (Gly-432 in PhoQ; Figs. 2A and 3). The loop is anchored at its N terminus by the conserved phenylalanine residue (Phe-429 in PhoQ) for which the F box is named and at its C terminus by a conserved hydrophobic residue (Leu-446 in PhoQ and Met-507 in CheA) that begins the α3 helix (Fig. 3). These two residues interact with each other as part of a larger hydrophobic patch in which Ile-428, Leu-446, and Ile-460 form a small pocket where Phe-429 is inserted (Fig. 3). The superimposition of CheA-KD shows a similar hydrophobic patch composed of Phe-487, Leu-486, Met-507, and Met-521 (Fig. 3). The chemical similarity of these residues and the similar organization in the EnvZ-KD model (Phe-387, Leu-386, Ile-408, and Leu-420; Fig. 2A) suggest that this hydrophobic cluster is a general structural feature of histidine kinases. In the nucleotide-free CheA-KD model, the loop is extended toward the solvent in an "open" conformation (Fig. 3). In contrast, the PhoQ loop is in close contact with the main β-sheet, forming a "closed" conformation. The largest distance between equivalent residues in the ATP-lid of PhoQ and CheA is 10 Å (Gly-441 in PhoQ and Gly-502 in CheA; Fig. 3). A 30° rotation of the loop about the hydrophobic patch anchor would produce an open conformation in PhoQ that would be structurally similar to that seen in CheA-KD. Three loop residues (Arg-434, Arg-439, and Gln-442) make extensive contacts with the phosphate groups and the chelated Mg²⁺ ion of the bound AMPPNP molecule in the PhoQ-KD structure, suggesting that nucleotide binding may induce the closed conformation. Apparently, the interactions with the AMPPNP phosphates play the principal role in the loop reorganization since the hydrophobic patch superimposes with minimal differences between the two structures (Fig. 3). Mutagenesis studies of the proposed hinge of the ATP-lid in EnvZ have shown that this region is essential for kinase activity (26).
The alternate disposition of the β-hairpin between the βF and βG strands represents a second, but minor, structural difference between the PhoQ-KD and CheA-KD structures (Fig. 3). This β-hairpin corresponds to the recently defined G3 motif (5), which includes a conserved glycine residue (Gly-469 in PhoQ) that immediately precedes the βG strand (Fig. 2A). Structurally, the β-hairpin lines the back of the nucleotide-binding site and provides solvent-mediated contacts with the AMPPNP ring. The β-hairpin is displaced in the PhoQ-KD structure relative to the CheA-KD structure (Fig. 3), and this displacement exposes two hydrophobic residues (Met-466 and Leu-467 in PhoQ) to the solvent. Together with Phe-397 in the βD strand, these residues form a solvent-exposed hydrophobic region. Perhaps ATP binding induces a shift in the position of the β-hairpin that, in turn, allows the hydrophobic patch to interact with the substrate histidine phosphotransfer domain, which is absent in this structure. These hydrophobic residues are not, however, conserved among histidine kinases, suggesting that the solvent exposure of the hydrophobic patch in PhoQ may simply reflect the space requirements of the ATP-ring pocket rather than a mechanistic function. Additional experiments will be necessary to address these possibilities.
The Nucleotide Binding Site-The ATP-binding pocket of PhoQ-KD involves not only absolutely conserved residues from the N, G1, F, G2, and G3 boxes, but also partially conserved residues from these motifs and from the ATP-lid loop (Figs. 2A and 4). In particular, residues from the G1, F, and G3 boxes provide the principal contacts with the adenosine moiety, whereas the N and G2 boxes contact both the adenosine moiety and the Mg²⁺-phosphates. The ATP-lid interacts exclusively with the phosphates and the divalent cation. The hydrogen bond between the N6 amino group of AMPPNP and the carboxyl side chain of the conserved Asp-415 in the G1 box is the single direct protein interaction with the adenine ring (Fig. 4). Water molecules mediate additional hydrogen bonds between the protein and the adenine nitrogen atoms (Fig. 4). Specifically, the N6 amino group interacts with the main-chain carbonyl group of Val-386 through water W3. The endocyclic N1 atom of the adenine ring makes a bidentate water-mediated hydrogen bond to the main chain nitrogen atom of conserved Gly-419 and the side chain of Asp-415. Additionally, the adenine N7 atom forms a water-mediated hydrogen bond with the side chain oxygen of conserved Asn-389, which makes an additional water-mediated interaction with the adenine N1 atom. Additional elements responsible for adenine base binding are Tyr-393, which makes an aromatic stacking interaction on one side, and Ile-420, which makes van der Waals contacts with the other face of the adenine ring (Fig. 4). Tyr-393 is held in place by hydrophobic interactions between the aromatic ring and the aliphatic portion of the Lys-392 side chain, which lies parallel to the tyrosine residue (Fig. 4). The ribose moiety of the AMPPNP molecule makes only weak interactions with the protein. Its 2′- and 3′-hydroxyl groups are more solvent-exposed than the rest of the sugar moiety, with the O3′ atom forming a hydrogen bond with the hydroxyl group of Tyr-393 (Fig. 4).
FIG. 1 (legend, partial). The bound AMPPNP molecule is shown in ball-and-stick representation, and the Mg²⁺ ion is represented as a cyan sphere. B, stereo Cα trace. Every tenth amino acid residue is indicated as a sphere and labeled with its residue number. The high B-factor segment is shown with dashed lines. The AMPPNP-Mg²⁺ molecule is drawn in a gray ball-and-stick representation. The orientation is the same as in panel A. C, simulated annealing 2Fo − Fc omit map (residues 433-442 removed) for the flexible loop contoured at 1σ (cyan) and 2σ (orange). The AMPPNP molecule is shown without density. Carbon, nitrogen, oxygen, and phosphate are drawn in yellow, blue, red, and green, respectively. Water molecules are omitted for clarity.
Although residues in the two conserved boxes (N and G2) interact with the triphosphate moiety of AMPPNP, many of these contacts are provided by amino acids that are only partially conserved among histidine kinases. The γ-phosphate is hydrogen-bonded to the side chains of Gln-442 and Arg-439 of the ATP-lid and to Lys-392 and Tyr-393 of the N box (Figs. 2A and 4). The side chain of Arg-434 forms a hydrogen bond with the β-phosphate, and the α-phosphate interacts with the side chains of Asn-385 and Asn-389, for which the N box is named, and with the peptide nitrogen atom of Leu-446 in the G2 box (Figs. 2A and 4). Of these residues, only Asn-389 is absolutely conserved among histidine kinases, whereas the others are only partially conserved. Apparently, the molecular details of the nucleotide-binding site vary for different histidine kinases. The divalent cation is coordinated by the three phosphate groups of the nucleotide via three non-bridging oxygen atoms (Fig. 4). The remaining three octahedral coordination sites of the cation are occupied by the carboxamide oxygens of Asn-385 and Gln-442 and by a water molecule. Surprisingly, the coordination distances between the metal cation and its ligands (about 2.45 Å) are more appropriate for a bound Mn²⁺ than a Mg²⁺ ion (27). The high concentration of Mg²⁺ present in the crystallization solution (~150 mM as acetate salt) and the fact that Mg²⁺ is the most common divalent cation required for enzymatic activity by histidine kinases make it most likely that Mg²⁺ is the ion present in the structure. However, Mn²⁺ has been reported as a preferred cation in some bacterial (28) and plant (29) histidine kinases. Future enzymatic assays will be necessary to elucidate the cation preferences of PhoQ.
The nucleotide has a compact conformation, with a C3′-endo sugar pucker and the γ-torsion angle in the +sc conformation (as defined in Ref. 30). Although this "closed" conformation is less common than the "extended" conformation (30), it is present in the structurally related MutL, GyrB, and Hsp90 protein-nucleotide complexes (21,22,31). The AMPPNP triphosphate moiety is "curled" due to its α,β,γ-tridentate coordination to Mg²⁺, which brings the γ-phosphate near the ribose ring. The α,β,γ-tridentate Mg²⁺ coordination is also unusual in a metal-nucleotide-protein complex, but is also present in the MutL and GyrB structures. Additional protein structures showing this tridentate coordination are phosphoglycerate kinase (32), pyruvate kinase (33), cyclin-dependent kinase 2 (34), chaperonin GroEL (35), and a protein of unknown function from Methanococcus jannaschii (36). In PhoQ-KD, this closed and curled conformation places the γ-phosphate facing inward toward the α2 helix, a position where phosphate transfer would be obstructed (Fig. 4). Consequently, a γ-phosphate movement (possibly a rotation) must occur to allow histidine phosphorylation (see below).
Understanding of the catalytic mechanism for the autokinase reaction of histidine kinases has been hindered by the lack of highly conserved residues that might have an enzymatic role and by a lack of structural definition. Disorder in the binding site of the EnvZ catalytic domain structure precluded such analysis. Although parts of the ATP-lid also are flexible in this PhoQ complex, the side chains from three residues that interact directly with the Mg²⁺-nucleotide are the best defined parts of this flexible segment (Fig. 1C). The side-chain B-factor values of these residues (31.2, 34.3, and 33.1 Å² for Arg-434, Arg-439, and Gln-442, respectively) are comparable with the average for the side chains overall (25.8 Å²). Therefore, the detail observed in the nucleotide binding site of PhoQ allows us to identify potential catalytic residues. As shown in Fig. 4, three basic residues are placed at the phosphotransfer site; Lys-392 and Arg-439 interact at the γ-phosphate side, and Arg-434 interacts with the β-phosphate group. To examine how these groups may contribute to cleavage and/or to transition state stabilization, each residue was individually substituted by alanine in the context of the entire cytoplasmic fragment (including the histidine substrate-containing phosphotransfer domain) and the kinetic parameters were measured. Results of the mutational analysis are shown in Table II.
FIG. 5 (legend, partial). ... (20). Structural elements are shown for PhoQ and MutL only. Identical and similar residues are colored in red and green, respectively, for residues conserved across all family members and in blue when only conserved among GHL members. Active site and substrate-binding residues are highlighted in yellow. The catalytic Glu in GHL members is boxed in red. The numbering is indicated in magenta in the end or middle of each row (note that some of the GHL structure elements are juxtaposed with respect to the primary structure of these elements in PhoQ). B, superimposition of the MutL (yellow) and PhoQ (blue) active sites. Carbon, nitrogen, oxygen, and phosphate are drawn in gray, blue, red, and green, respectively. Mg²⁺ cations are shown as spheres, and hydrogen bonds are depicted as dotted lines.
An alanine substitution for Lys-392 produces a strong effect on Km (roughly 40-fold increase) and a weaker effect on kcat (about 10-fold reduction), which suggests roles in both nucleotide binding and catalysis. Binding may come both from hydrophobic interactions between the aliphatic portion of the lysine side chain and Tyr-393, which appears to buttress the aromatic residue for optimal stacking with the adenine base, and possibly also from amino group interactions with the γ-phosphate group. Electrostatic neutralization of the transition state provided by the ε-amino group may account for the effect on catalysis seen in the kcat reduction. In contrast, substitution of the arginine at residue 434 reduces kcat by 2 orders of magnitude but has little effect on Km, which suggests that Arg-434 is critical to catalysis. Its guanidinium group may function in transition state stabilization and/or as a general acid to protonate the β-phosphate leaving group. Substitution of Arg-439 has only modest effects on either kinetic parameter.
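The approximate fold changes quoted above can be combined into a change in catalytic efficiency (kcat/Km). The sketch below does that arithmetic and, as an additional assumption not made in the text, converts a fold change into an apparent free energy difference using ΔΔG = RT ln(fold) at the 22°C assay temperature. The fold values are the rounded figures from the paragraph, not the entries of Table II, and the function names are ours.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T_KELVIN = 295.15   # 22 °C, the assay temperature given in the text


def efficiency_fold_loss(km_fold_increase, kcat_fold_decrease):
    """Fold reduction in kcat/Km when Km rises and kcat falls by the given factors."""
    return km_fold_increase * kcat_fold_decrease


def ddg_kcal_per_mol(fold_loss, temperature=T_KELVIN):
    """Apparent free energy equivalent of a fold change: RT * ln(fold).

    ASSUMPTION: treating a fold change as a free energy difference is a common
    convention, but it is not an analysis performed in the source.
    """
    return R_KCAL * temperature * math.log(fold_loss)


# Rounded values from the text: Lys-392 to Ala, ~40-fold Km increase and ~10-fold kcat decrease.
k392a_loss = efficiency_fold_loss(40.0, 10.0)
# Arg-434 to Ala: ~100-fold kcat decrease, Km taken as unchanged (assumption).
r434a_loss = efficiency_fold_loss(1.0, 100.0)
r434a_ddg = ddg_kcal_per_mol(r434a_loss)
```

On these rounded figures, the two substitutions have efficiency losses of the same order, but they arise from different terms: mainly Km for Lys-392 and kcat alone for Arg-434.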
It may seem surprising that Arg-434, which interacts with the β-phosphate in the structure, and not Arg-439, which interacts with the γ-phosphate, is critical to catalysis. However, the triphosphate moiety is under the ATP-lid and oriented with the γ-phosphate group directed inward toward the α2 helix, where it would be inaccessible to the phospho-accepting histidine substrate (Fig. 4). We presume, therefore, that the γ-phosphate group of ATP in PhoQ must reorient from the conformation seen here before phosphorylation can take place, perhaps to one more like that seen in GHL ATPases such as MutL (see below). After such rearrangement, Arg-439 would no longer make phosphate contacts but Arg-434 could remain critically placed to assist in phosphotransfer.
Despite marked structural differences between histidine kinases and other protein kinases, the phosphorylation mechanism of PhoQ could be analogous to that observed in the serine/threonine or tyrosine kinases. The catalytic centers of these hydroxyl-directed protein kinases present a general base (an aspartate residue) that deprotonates the phosphate-acceptor group and two basic residues, one that activates departure of the leaving group and another that neutralizes the charge (37). Our structure-directed mutational analysis of PhoQ implicates two basic residues in analogous functions. Arg-434 could serve as a general acid to activate the leaving group departure, and Lys-392 could neutralize the negative charge. The catalytic domain has no obvious candidate residue for the role of catalytic base, but this function may reside in the histidine-substrate domain. The phospho-accepting histidine residue is on a helix in the phosphotransfer domain, and the residue immediately following it is almost always either glutamic or aspartic acid, which would make this the best candidate for the general base.
Structural Similarities with GHL ATPases and Mechanistic Implications-The kinase domain of the histidine kinases belongs to a larger family of proteins, termed the GHKL family, whose members include GyrB, Hsp90, histidine kinases, and MutL (5). Unlike the histidine kinases, which catalyze the phosphotransfer from ATP to a histidine substrate, the other members of the family are ATPases that use the energy of hydrolysis to mediate the movement of protein subunits that perform various functions. A structurally based sequence alignment of PhoQ-KD, MutL, DNA gyrase B, and Hsp90 is shown in Fig. 5A. PhoQ-KD presents the four conserved motifs characteristic of the superfamily (5), three of which correspond to the classical N, G1, and G2 histidine kinase conserved motifs (Fig. 2A). Topologically, the five β strands and the three α helices that define the structural core of PhoQ-KD superimpose with homologous elements in the other members, although the N-terminal α1 helix and βB strand elements are in the C-terminal sequence of MutL, GyrB, and Hsp90 rather than the N-terminal sequence as in PhoQ-KD (Fig. 5A). Pairwise sequence comparisons between PhoQ-KD and the other three proteins give identities between 14 and 23% (GyrB) for these alignments, which at the high end approaches that observed in comparisons among histidine kinase family members. Fig. 5B shows an overlay of the PhoQ-KD and MutL structures with a superimposition of the nucleotide molecules. The nucleotides have similar conformations except at the γ-phosphate. In the PhoQ-KD·AMPPNP complex, the γ-phosphate group faces in an orthogonal direction to that found in the MutL structure (Fig. 5B). The orientation of the γ-phosphate in the MutL structure may be representative of the orientation of the γ-phosphate in PhoQ after it undergoes a proposed rotation (see above) to allow exposure to the histidine substrate. An aspartate residue and structural water molecules in all cases recognize the adenine ring. 
However, a GHL-conserved threonine residue is bound to one of these water molecules in the ATPases. The differences between the PhoQ-KD active site and that of the three ATPases are most striking when it comes to the nature of the residues interacting with metal-triphosphate (Fig. 5). The absolutely conserved catalytic glutamate residue that serves as general base for water activation in ATP hydrolysis in the ATPases (Glu-29 in MutL) is replaced in histidine kinases by a conserved asparagine (Asn-385 in the PhoQ-KD structure, Fig. 5). The equivalent of Asn-389 in PhoQ is absolutely conserved in all members of the GHKL superfamily but interacts with the divalent cation in the ATPase structures. Surprisingly, Asn-389 does not coordinate the Mg2+ cation in PhoQ-KD; instead, Asn-385 and Gln-442 are the two protein residues that interact with Mg2+ in the PhoQ-KD structure. The space occupied by Gln-442 in PhoQ is taken up by a conserved lysine residue in the GHL family (Lys-307 in MutL, Fig. 5) that comes from a separate domain contiguous to the nucleotide-binding domain and forms hydrogen bonds with the γ-phosphate (Fig. 5B). This lysine residue has been implicated in playing a key role in transition state stabilization in MutL and GyrB (22,38). Neither the cytoplasmic fragment nor the isolated catalytic domain of PhoQ possesses detectable ATPase activity (data not shown). It would be interesting to see if a basic substitution at residue 442 in conjunction with a glutamate residue in position 385 would confer ATPase activity on PhoQ.
Variation of Catalytic Mechanism among Histidine Kinases-Although histidine kinases have characteristic amino acid sequences, they nevertheless show considerable variability even within their more conserved N, F, and G boxes; 11 distinct subgroups have been defined (39). Having identified residues that are involved in catalysis in PhoQ, we are now able to investigate histidine kinase variability in the context of catalytic mechanism. As described above, Tyr-393 in PhoQ stacks with the adenine base of the bound nucleotide and Lys-392 is proposed to help position the tyrosine for more optimal nucleotide binding and to stabilize the transition state. Arg-434 plays a critical role in catalysis and is less important in binding. EnvZ is identical to PhoQ at these positions. Analysis of 467 histidine kinases reveals two major classes with respect to these three catalytic residues.
The predominant class of histidine kinases is like PhoQ in having a basic/aromatic pair at positions 392/393. Fifty-eight percent of the histidine kinases have either lysine or arginine in the position corresponding to Lys-392 in PhoQ, and members of this class nearly always have an aromatic or a histidine residue (86 and 13%, respectively) in place of Tyr-393. Presumably, the members of this class utilize a binding arrangement similar to that observed in the PhoQ-KD structure. Kinases in this class typically either have a basic residue (41%) as in PhoQ or a glutamine (47%) residue at the position corresponding to Arg-434. Clearly, those kinases that lack a basic residue at this position must utilize a different basic residue or mechanism to substitute for the role Arg-434 plays in catalysis in PhoQ. The high frequency with which glutamine is found at this position suggests that it plays an important role in the function of those proteins.
Histidine kinases of a second major class (34% of all) have either aspartate or glutamate (22%) or asparagine or glutamine (12%) in the position corresponding to Lys-392 in PhoQ, and in this class the residue corresponding to Tyr-393 is usually an alanine (45%) or a histidine (30%). This second major class usually has three residues displaying the motif of Thr/Ser (68/25%)-Thr/Gly/Ser (72/13/8%)-Lys/Arg (66/28%) aligned with residues 434-436 of PhoQ. CheA belongs to this second class (Fig. 2A). The sequence preferences of the second class are strikingly similar to those of the GHL ATPases, which have Asp and Ala/Glu, respectively, at the positions corresponding to Lys-392 and Tyr-393, and a Thr/Gly-Ser/Gly/Thr-Lys triad in place of Arg-434 (Fig. 5A). The lysine residue is in all cases hydrogen-bonded to the nucleotide β-phosphate (Lys-79 in MutL), and in MutL Thr-77 and Ser-78 interact with the ribose moiety with the α-phosphate and Ser-78 (Fig. 5B). Presumably, similar interactions occur in the non-PhoQ-like class of histidine kinases and the binding energy contributed by these interactions may compensate for the lack of the tyrosine-adenine stacking interaction observed in the PhoQ-KD structure.
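The residue preferences described for the two classes can be summarized as a simple lookup on aligned positions. The following is a minimal sketch and not the procedure used to analyze the 467 sequences. It assumes that the residues aligned with PhoQ positions 392, 393, and 434-436 have already been extracted from an alignment, that "aromatic" means Phe, Trp, or Tyr, and that class membership is decided by position 392 alone. The function names, class labels, and the second example input are hypothetical.

```python
BASIC = frozenset("KR")
AROMATIC = frozenset("FWY")          # assumption: His is counted separately, as in the text
ACIDIC_OR_AMIDE = frozenset("DENQ")


def catalytic_class(res392):
    """Assign a class from the residue aligned with PhoQ Lys-392."""
    if res392 in BASIC:
        return "predominant"         # PhoQ-like: Lys or Arg
    if res392 in ACIDIC_OR_AMIDE:
        return "second"              # Asp, Glu, Asn, or Gln
    return "unassigned"


def has_typical_393(cls, res393):
    """Check the residue aligned with PhoQ Tyr-393 against the usual preference."""
    if cls == "predominant":
        return res393 in AROMATIC or res393 == "H"
    if cls == "second":
        return res393 in ("A", "H")
    return False


def matches_second_class_triad(triad):
    """Thr/Ser - Thr/Gly/Ser - Lys/Arg motif aligned with PhoQ residues 434-436."""
    return (len(triad) == 3
            and triad[0] in "TS"
            and triad[1] in "TGS"
            and triad[2] in "KR")


# PhoQ itself: Lys-392 and Tyr-393 (from the text)
phoq_class = catalytic_class("K")
print(phoq_class, has_typical_393(phoq_class, "Y"))

# Hypothetical input chosen to fit the second-class preferences; not a real sequence
example_class = catalytic_class("D")
print(example_class, has_typical_393(example_class, "A"),
      matches_second_class_triad("TTK"))
```

Keeping the position-392 assignment separate from the position-393 and triad checks mirrors the way the frequencies are reported: the classes are defined by one position, and the other preferences are tendencies within each class rather than strict rules.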
Orthodox histidine kinase proteins with a typical domain organization, as exemplified by PhoQ and EnvZ, have a transmembrane N-terminal sensor domain linked through the histidine phosphotransfer domain to the kinase domain. Atypical histidine kinase proteins, such as CheA, have the histidine domain remote from the kinase domain and separate from the sensor domain. There also exist hybrid kinases in which an additional pair of phosphoacceptor and phosphodonor sites intervenes, either contiguously on the kinase protein or as separate proteins, between the kinase and the response regulator to form a phosphorelay system (1). PhoQ happens both to be a typical orthodox kinase and to have the predominant catalytic configuration, whereas CheA is an atypical kinase having the secondary class of catalytic configuration. These associations are not general, however; members of the different classes of catalytic configuration are distributed variously among the different types of domain organization.
The crystal structure of T. maritima CheA-KD complexed with a variety of ATP analogs was published just before submission of this article (3). Together, the present work and the results of Bilwes et al. provide both corroboratory and complementary views of the histidine kinase active site. The repositioning of the ATP-lid to a closed conformation in response to nucleotide binding seen in the case of CheA-KD is similar to that deduced here from the PhoQ-KD structure. There do appear to be significant differences in the geometry of nucleotide binding, however, as expected from the analysis above which places PhoQ and CheA in different catalytic classes. A detailed comparison between these structures when the molecular coordinates of CheA-KD become available in the PDB will likely provide important insights into the catalytic mechanisms utilized by these two proteins. Mutational studies to evaluate the roles of candidate catalytic residues in CheA will also be of interest.