High-resolution crystal structure of Streptococcus pyogenes β-NAD+ glycohydrolase in complex with its endogenous inhibitor IFS reveals a highly water-rich interface

The crystal structure of the complex between the C-terminal domain of Streptococcus pyogenes β-NAD+ glycohydrolase (SPN) and its endogenous inhibitor, the immunity factor for SPN (IFS), was determined at 1.70 Å resolution. It reveals that the interface between the two proteins is unusually rich in water molecules.


Introduction
The Gram-positive bacterium Streptococcus pyogenes causes a variety of human diseases, ranging from superficial infections (pharyngitis and impetigo) to life-threatening conditions (toxic shock syndrome and necrotizing fasciitis) (Cunningham, 2000; Sachse et al., 2002). The virulence of S. pyogenes is enhanced by its toxin β-NAD+ glycohydrolase (SPN; also known as Nga) (Sumby et al., 2005). SPN possesses only β-NAD+ hydrolytic activity and has no ADP-ribosyltransferase or ADP-ribosyl cyclase activity (Ghosh et al., 2010; Stevens et al., 2000). SPN is injected across the host cell membrane into the cytoplasm through streptolysin O (SLO), a member of a large family of pore-forming toxins, the cholesterol-dependent cytolysins. Once translocated into the cytoplasm of the host cell, SPN contributes to virulence by depleting the intracellular NAD pool and producing the potent vasoactive compound nicotinamide, but not by ADP-ribosylation of protein substrates (Ghosh et al., 2010; Stevens et al., 2000; Bricker et al., 2005).
The β-NAD+ glycohydrolase hydrolyzes β-NAD+, an important cofactor in numerous redox and energy-producing biological reactions, to produce nicotinamide and adenosine diphosphoribose (ADP-ribose) (Tatsuno et al., 2007; Michos et al., 2006). Strict β-NAD+ glycohydrolases are incapable of further catalysis of the products from the initial reaction (Ghosh et al., 2010). Other enzymes that can hydrolyze β-NAD+ are multifunctional and can be classified into two groups on the basis of the additional reactions they catalyze (Ghosh et al., 2010). ADP-ribosyltransferases catalyze the hydrolysis of β-NAD+ and the transfer of an ADP-ribose moiety onto target proteins (Holbourn et al., 2006). ADP-ribosyl cyclases convert β-NAD+ into cyclic ADP-ribose, a potent second messenger for calcium mobilization (Karasawa et al., 1995).
SPN comprises two domains. The amino-terminal 190 residues of SPN are required for translocation of SPN into the host cell via the cytolysin-mediated translocation (CMT) pathway (Ghosh & Caparon, 2006). The C-terminal domain (residues 191-451) alone is active as the β-NAD+ glycohydrolase, but it is also indispensable for translocation (Ghosh & Caparon, 2006; Ghosh et al., 2010). SPN is also toxic to bacterial cells; therefore S. pyogenes carries the ifs gene, which encodes the immunity factor for SPN (IFS) as an endogenous antitoxin (Meehl et al., 2005; Kimoto et al., 2006). IFS functions as a competitive inhibitor of the substrate β-NAD+ by blocking the active site in the C-terminal domain of SPN. It was reported that increasing concentrations of IFS decreased the rate of β-NAD+ hydrolysis, with a K_I,app of 2.0 ± 0.3 nM (Meehl et al., 2005). IFS is essential for the viability of SPN-producing strains of S. pyogenes because it protects the bacterium from the toxic β-NAD+ glycohydrolase activity of SPN that fails to be secreted (Meehl et al., 2005).
To provide structural details of the interactions of SPN with IFS, and thus of the inhibition mode, we employed limited proteolysis to isolate a crystallizable complex of SPN and IFS, which consists of the SPN C-terminal domain (residues 193-451; SPNct) and the full-length IFS (residues 1-161). We solved the crystal structure of this SPNct-IFS complex by single-wavelength anomalous diffraction (SAD) and refined the model at 1.70 Å resolution. The overall complex structure is highly similar to the previously reported structure that was refined at 2.80 Å (PDB entry 3pnt; Smith et al., 2011). The much higher resolution of the data used in this study allowed us to identify many more water molecules bound to the protein complex, in particular at the interface between SPNct and IFS. Unexpectedly, our higher-resolution structure reveals that the interface between SPNct and IFS is highly rich in water molecules, placing it among the wettest protein-protein interfaces. Many interactions between SPNct and IFS are water-mediated. As an example, the protruding Arg40 of IFS blocks the β-NAD+ binding site of SPNct through extensive water-mediated interactions. If the IFS-inhibited SPN has to dissociate from IFS before the free SPN is translocated across the cell envelope, the wet SPNct-IFS interface may facilitate such dissociation (Ghosh & Caparon, 2006).

Cloning, expression and purification of the SPNct-IFS complex
The contiguous spn (SpyM3_0128) gene covering residues 38-451 and the full-length ifs (SpyM3_0129) gene of S. pyogenes M3 were PCR-amplified and cloned into the pET-28b(+) vector (Novagen) using the NdeI/XhoI restriction enzymes. This construct added a hexahistidine-containing 21-residue tag (MGSSHHHHHHSSGLVPRGSHM) at the N-terminus of SPN. The two proteins were co-expressed in Escherichia coli Rosetta2 (DE3) cells using Terrific Broth culture medium. Protein expression was induced with 0.5 mM isopropyl β-D-thiogalactopyranoside, and the cells were incubated for an additional 18 h at 303 K following growth to mid-log phase at 310 K. The cells were lysed by sonication in a lysis buffer [20 mM Tris-HCl at pH 8.5, 500 mM NaCl and 5% (v/v) glycerol] containing 5 mM imidazole, followed by centrifugation to remove cellular debris. The supernatant was applied to a HiTrap Chelating HP affinity chromatography column (GE Healthcare). The protein was eluted with the lysis buffer containing 300 mM imidazole, and the eluted sample was further purified by size-exclusion chromatography using a HiLoad 16/60 Superdex 200 prep-grade column (GE Healthcare). The elution buffer was 20 mM Tris-HCl at pH 8.5, 200 mM NaCl and 0.1 mM tris(2-carboxyethyl)phosphine. We confirmed complex formation between the two proteins by SDS-PAGE. However, we noticed that the 49 kDa band corresponding to SPN was slowly degraded. Thus, a limited proteolysis experiment was carried out to obtain a proteolysis-resistant core of the complex. After extensive testing of proteases (trypsin and chymotrypsin) at different concentrations (mole ratios of 1 : 100, 1 : 1000 and 1 : 10000), incubation times (30 min, 1 h, 3 h, 6 h and 20 h) and temperatures (295 K and 310 K), the best condition was established to be α-chymotrypsin (Sigma catalog No. C4129) at a mole ratio of 1 : 1000 for 20 h at 310 K. After the α-chymotrypsin treatment, the complex was purified by size-exclusion chromatography using a HiLoad 16/60 Superdex 200 prep-grade column.
The selenomethionine (SeMet)-labeled complex protein was expressed and purified as above, except that we used the M9 cell culture medium that contained extra amino acids including SeMet.

Crystallization and X-ray data collection
The protein complex was concentrated to 50 mg ml⁻¹ for crystallization using an Amicon Ultra-15 centrifugal filter unit (Millipore). Crystals were grown by the sitting-drop vapor-diffusion method at 295 K. Each sitting drop, prepared by mixing 1 µl each of the protein solution and the reservoir solution, was placed over 100 µl of the reservoir solution. The best crystals of both the SeMet-labeled and the native SPNct-IFS complex were obtained with a reservoir solution of 20% (w/v) tacsimate at pH 4.0 and 20% (w/v) polyethylene glycol 3350. Crystals were transferred to a cryoprotectant solution, which contained 20% (v/v) glycerol in the reservoir solution. Single-wavelength anomalous diffraction (SAD) data were collected from a crystal of the SeMet-substituted SPNct-IFS complex at 100 K on an ADSC Quantum 315 CCD detector system (Area Detector Systems Corporation, Poway, CA, USA) at the BL-4A experimental station of Pohang Light Source, Korea. Raw data were processed using the program suite HKL2000 (Otwinowski & Minor, 1997).

Phasing and refinement
The structure of the SPNct-IFS complex was solved by Se SAD phasing. Phase calculation, density modification and initial model building were carried out using PHENIX AutoSol and AutoBuild (Adams et al., 2010). PHENIX AutoSol located all 30 expected selenium atoms of the two complex molecules in the asymmetric unit. Subsequent manual model building was conducted using the program COOT (Emsley & Cowtan, 2004), and the model was refined with the programs REFMAC (Murshudov et al., 1997) and PHENIX (Adams et al., 2010), including the bulk solvent correction. 5% of the data were randomly set aside as the test data for the calculation of R free (Brünger, 1992). Water molecules were added using the program COOT and were manually inspected. The quality of the refined model was assessed by MolProbity. Crystallographic and refinement statistics are summarized in Table 1. The coordinates and structure factors have been deposited in the Protein Data Bank (PDB) under the accession code 4kt6.
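The cross-validation step described above can be illustrated with a minimal sketch. It assumes the standard definition R = Σ||Fobs| - |Fcalc|| / Σ|Fobs|, evaluated over the working set (R work) or over the reserved test set (R free). The function names `split_test_set` and `r_factor` and the toy amplitudes are hypothetical; scaling between observed and calculated amplitudes, which refinement programs handle internally, is omitted.

```python
import random


def split_test_set(n_reflections, fraction=0.05, seed=0):
    """Randomly flag a fraction of reflection indices as the R-free test set."""
    rng = random.Random(seed)
    n_test = round(n_reflections * fraction)
    return set(rng.sample(range(n_reflections), n_test))


def r_factor(f_obs, f_calc, indices):
    """R = sum(||Fobs| - |Fcalc||) / sum(|Fobs|) over the chosen reflections."""
    numerator = sum(abs(abs(f_obs[i]) - abs(f_calc[i])) for i in indices)
    denominator = sum(abs(f_obs[i]) for i in indices)
    return numerator / denominator


if __name__ == "__main__":
    # Toy amplitudes only; these are not data from the structure.
    f_obs = [100.0 + 5.0 * i for i in range(200)]
    f_calc = [0.9 * f for f in f_obs]

    test = split_test_set(len(f_obs), fraction=0.05)
    work = [i for i in range(len(f_obs)) if i not in test]

    print("test-set size:", len(test))
    print("R work:", r_factor(f_obs, f_calc, work))
    print("R free:", r_factor(f_obs, f_calc, sorted(test)))
```

Because the test reflections are never used in refinement, R free gives a less biased estimate of model quality than R work.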

Preparation of the SPNct-IFS complex and its structure determination
We co-expressed the mature SPN (residues 38-451) and its endogenous inhibitor IFS (residues 1-161) from S. pyogenes, but we could not crystallize the whole complex because the SPN component was slowly degraded. Therefore, we optimized the conditions for limited proteolysis to isolate a readily crystallizable SPNct-IFS complex. The SPNct-IFS complex consisted of the C-terminal residues 193-451 of SPN and all residues (1-161) of IFS. Under our optimized proteolysis conditions, chymotrypsin cleavage occurred only before Gly193. The loss of the SPN N-terminal region was supported by mass analysis of trypsin-digested peptide fragments of the denatured complex. Previously, an enzymatically active C-terminal domain of SPN (residues 191-451) was identified and co-expressed with the full-length IFS (residues 1-161) to obtain an SPNct-IFS complex (Smith et al., 2011). The structure of that complex was determined at 2.80 Å (Smith et al., 2011), with the model accounting for residues 196-445 for both chains A and C of SPN, and residues 1-161 or 2-161 for chains B or D of IFS.
Our crystals of the purified SPNct-IFS complex diffracted to high resolution and allowed us to solve the structure by the Se SAD method. The model of the SPNct-IFS complex was refined to yield R work and R free values of 19.7% and 23.5%, respectively, for 20.0-1.70 Å data. The model includes 830 residues in two copies of the complex (residues 193-446 of SPN and residues 1-161 of IFS) and 596 water molecules. The C-terminal residues 447-451 of SPN (in both chains A and C) are likely disordered in the crystal. The two independent heterodimeric complexes in the P1 unit cell are highly similar to each other, with a root-mean-square (r.m.s.) deviation of 0.27 Å for 415 Cα atoms (254 residues of SPNct and 161 residues of IFS) in the model. The two chains of SPNct in the asymmetric unit are highly similar to each other, with an r.m.s. deviation of 0.22 Å for 254 Cα atoms; the two chains of IFS in the asymmetric unit are also highly similar to each other, with an r.m.s. deviation of 0.22 Å for 161 Cα atoms.
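The r.m.s. deviations quoted above compare equivalent atoms in two superposed copies of a chain or complex. As a minimal sketch, the deviation for N matched atoms is the square root of the mean squared distance between paired positions. The function name `rmsd` and the toy coordinates are hypothetical, and the sketch assumes the two coordinate sets have already been optimally superposed (the least-squares superposition step itself is not shown).

```python
import math


def rmsd(coords_a, coords_b):
    """R.m.s. deviation (in the input units) between paired 3D positions.

    Assumes both lists hold equivalent atoms in the same order and that
    the two structures are already superposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates in Å; not taken from the deposited model.
    chain_1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    chain_2 = [(0.3, 0.0, 0.0), (4.1, 0.0, 0.0), (7.9, 0.0, 0.0)]
    print(f"r.m.s. deviation: {rmsd(chain_1, chain_2):.2f} Å")
```

In practice the paired positions would be the Cα atoms of equivalent residues, for example the 254 residues of the two SPNct chains.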
Our 1.70 Å structure of the SPNct-IFS complex is highly similar to the previously reported structure determined at 2.80 Å (Smith et al., 2011).

Table 1
Statistics for data collection, phasing and model refinement.

The interface between SPNct and IFS is highly rich in water molecules
In our high-resolution structure of the SPNct-IFS complex, IFS interacts with SPNct through numerous hydrogen bonds and electrostatic interactions, many of which are water-mediated. The predominant contacts between SPNct and IFS involve an 1-2 loop, 6, 6-2 loop, 8-3 loop and 9-4 loop of SPNct; 1, 2-3 loop, 5, 7a-7b loop and 7b of IFS. The complex buries a large surface area at the interface between SPNct and IFS (3210 Å² and 3280 Å² for the A : B and C : D interfaces, respectively). Our higher-resolution (1.70 Å) structure reveals that the interface is very rich in water molecules (Fig. 1); 67 and 71 water molecules are identified at the A : B and C : D interfaces, respectively. Many of these water molecules are conserved and common to both interfaces. The interface waters have B-factors ranging from 20.2 to 50.0 Å² for the A : B interface and from 21.3 to 47.8 Å² for C : D. The mean B-factor of interface waters (32.2 Å² for the A : B interface and 33.0 Å² for C : D) is slightly higher than that of non-hydrogen protein atoms (26.9 Å² for A/B chains and 30.0 Å² for C/D chains) but is lower than the overall B-factor of other waters (36.0 Å²). The previous complex structure was determined at 2.80 Å resolution (Smith et al., 2011) and shows essentially identical buried surface areas at the interface (3260 Å² and 3270 Å² for the A : B and C : D interfaces, respectively). However, only a small number of water molecules could be located in that structure owing to insufficient resolution. A total of 153 water molecules were identified per two complex molecules in the asymmetric unit, with only 14 and 10 at the A : B and C : D interfaces, respectively (Fig. 1). Nearly all of these interface water molecules are present in our higher-resolution structure.
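Counting interface waters requires a geometric criterion, which the text above does not state. The following sketch shows one common style of criterion purely as an assumption: a water is counted as an interface water if its oxygen lies within a distance cutoff (3.5 Å here, a hypothetical value) of at least one atom from each of the two chains. The function names and toy coordinates are illustrative only and are not the procedure used for the reported counts.

```python
import math


def is_interface_water(water_xyz, chain_a_atoms, chain_b_atoms, cutoff=3.5):
    """True if the water is within `cutoff` Å of an atom in BOTH chains.

    The cutoff and the both-chains rule are assumptions for illustration.
    """
    near_a = any(math.dist(water_xyz, atom) <= cutoff for atom in chain_a_atoms)
    near_b = any(math.dist(water_xyz, atom) <= cutoff for atom in chain_b_atoms)
    return near_a and near_b


def interface_waters(waters, chain_a_atoms, chain_b_atoms, cutoff=3.5):
    """Filter (xyz, b_factor) water records down to interface waters."""
    return [
        (xyz, b)
        for xyz, b in waters
        if is_interface_water(xyz, chain_a_atoms, chain_b_atoms, cutoff)
    ]


def mean_b_factor(waters):
    """Mean B-factor (Å^2) of a non-empty list of (xyz, b_factor) records."""
    return sum(b for _, b in waters) / len(waters)


if __name__ == "__main__":
    # Toy data only: one atom per chain and three waters.
    chain_a = [(3.0, 0.0, 0.0)]
    chain_b = [(-3.0, 0.0, 0.0)]
    waters = [
        ((0.0, 0.0, 0.0), 25.0),   # 3.0 Å from both atoms
        ((0.0, 1.0, 0.0), 35.0),   # about 3.16 Å from both atoms
        ((10.0, 0.0, 0.0), 40.0),  # far from chain B
    ]
    selected = interface_waters(waters, chain_a, chain_b)
    print("interface waters:", len(selected))
    print("mean B-factor:", mean_b_factor(selected))
```

With a criterion of this kind fixed, the same selection can be applied to both the A : B and C : D interfaces so that the counts and mean B-factors are comparable.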
Water is often indispensable for specific recognition of two proteins as an integral part of protein-protein interfaces (Ladbury, 1996; Levy & Onuchic, 2004). Analyses of water molecules at protein-protein interfaces showed that, on average, the interfaces of complexes and homodimers contain about ten water molecules per 1000 Å² of interface area, and crystal packing interfaces about 15 (Rodier et al., 2005; Reichmann et al., 2008). Moreover, interfaces of weak and highly transient complexes contain more waters than are found in high-affinity complexes (Rodier et al., 2005; Reichmann et al., 2008). In our SPNct-IFS complex structure, ~21 water molecules are found per 1000 Å², making it one of the wettest protein-protein interfaces. The water-rich interface may be advantageous for facilitating the dissociation of IFS from the complex immediately before translocation across the cell envelope.
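The density of about 21 waters per 1000 Å² follows directly from the interface water counts and buried surface areas reported in this section. The short sketch below reproduces that arithmetic for both structures; the function name `waters_per_1000_A2` is hypothetical, and the inputs are the values quoted in the text.

```python
def waters_per_1000_A2(n_waters, buried_area_A2):
    """Interface water density, in waters per 1000 square angstroms."""
    return 1000.0 * n_waters / buried_area_A2


# (water count, buried surface area in Å^2) as reported in the text.
interfaces = {
    "A:B, this work (1.70 A)": (67, 3210.0),
    "C:D, this work (1.70 A)": (71, 3280.0),
    "A:B, 3pnt (2.80 A)": (14, 3260.0),
    "C:D, 3pnt (2.80 A)": (10, 3270.0),
}

if __name__ == "__main__":
    for name, (n_waters, area) in interfaces.items():
        density = waters_per_1000_A2(n_waters, area)
        print(f"{name}: {density:.1f} waters per 1000 A^2")
    # Prints 20.9 and 21.6 for the 1.70 Å structure,
    # and 4.3 and 3.1 for the 2.80 Å structure.
```

Both interfaces of the 1.70 Å structure come out at roughly twice the average of about ten waters per 1000 Å² cited for complexes and homodimers, whereas the counts from the 2.80 Å structure reflect waters that could be located at that resolution rather than a drier interface.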
Our high-resolution crystal structure of the SPNct-IFS complex reveals that many interactions between SPNct and IFS are water-mediated. A prominent example is the 2-3 loop of IFS, which points toward the NAD-binding cavity of SPN. Compared with the unbound IFS structure (PDB entry 3qb2; Smith et al., 2011), the 2-3 loop of IFS bound to SPNct is shifted considerably toward the active-site cavity of SPN in our SPNct-IFS complex, with an r.m.s. deviation of 0.52-0.94 Å for 20 Cα atoms. The side chain of Arg40 on the IFS 2-3 loop protrudes into the NAD-binding cavity of SPNct (Fig. 2), blocking the binding of the substrate β-NAD+. Arg40 of IFS interacts with SPNct through an extensive water-mediated hydrogen-bond network (Fig. 2) involving residues located on 2, 8, 8-3 loop and 9-4 loop of SPNct (Gln216 on 2; Ile328 and Lys329 on 8; Gly330 and Asp332 on 8-3 loop; Gly368, Asn370, Asn373, Ile374, Gln378, Thr379, Trp380, Glu389 and Glu391 on 9-4 loop).

… ribosyltransferase superfamily is missing in SPN. SPN has an α-helical linker subdomain, which is absent in other ADP-ribosyltransferase superfamily enzymes (Smith et al., 2011). The ARTT motif is important for the substrate specificity and recognition of the ADP-ribosyltransferase superfamily (Han & Tainer, 2002). The Q/E-X-E sequence of the ARTT motif provides the key catalytic glutamic acid to stabilize an oxocarbenium ion intermediate (Han & Tainer, 2002; Holbourn et al., 2006). The second Gln or Glu (Q/E), located two positions upstream from the catalytic Glu in the ARTT loop, is essential for the ribosyltransferase activity of ADP-ribosylating toxins. It may be important for recognizing the target residue of substrate proteins (Nagahama et al., 2000; Wilde et al., 2002; Han et al., 2001). It was suggested that the different conformation of the ARTT loop in SPN, as well as SPN's unique α-helical linker subdomain, does not allow accommodation of protein substrates in the canonical mode of other ADP-ribosyltransferases (Smith et al., 2011).
It appears that the distinct side-chain orientation of SPN Glu389 (Fig. 3), the second Q/E in the ARTT motif, is responsible for the lack of ADP-ribosyltransferase activity in SPN. In our present complex structure, as well as in the previously reported structure (Smith et al., 2011), the conformation of the ARTT loop of SPNct is considerably different from that in ADP-ribosyltransferases. Furthermore, the side chain of Glu389 is stretched into the interior and is surrounded by α-helices 7 and 8 (Fig. 3b). The side chain of Glu389 points in an almost opposite direction from the corresponding residues of ADP-ribosyltransferases (Fig. 3a). When we modeled the NAD molecule of B. cereus VIP2 (PDB entry 1qs2) into the active site of SPNct by superimposing the two structures, a water molecule in the SPNct structure lies between Glu389 and the susceptible glycosidic bond of NAD+ (Fig. 3b). This water may be activated by Glu389 to act as a nucleophile for the hydrolytic reaction catalyzed by SPNct (Ghosh et al., 2010; Robertus et al., 1998). Unlike Glu389, the side chain of the catalytic Glu391 of SPNct overlaps well with the corresponding residues of ADP-ribosyltransferases (Fig. 3a).

Figure 3
Structural comparison of the second Glu of the ARTT motif between SPNct and ADP-ribosyltransferases. (a) Superimposition of the active site residues (Glu389 and Glu391) of SPN with those of ADP-ribosyltransferases such as B. cereus VIP2 complexed with NAD, B. cereus VIP2, C. botulinum C2 at pH 3.0, C. botulinum C2 at pH 6.1, C. perfringens iota toxin, V. cholerae CT and E. coli heat-labile enterotoxin (LT). Other secondary structure elements except for the region containing the ARTT motif were removed for clarity. The regions containing the second Glu of the ARTT motif and β-NAD are marked with a dotted square. (b) A slightly different view of the dotted square in (a) for the superposition of SPNct and VIP2 complexed with NAD is shown in the box. A water molecule present in our SPNct structure is denoted by W (a purple sphere). Secondary structure elements of SPNct are labeled.

Figure 2
Stereoview of water-mediated interactions between Arg40 of IFS (gray surface) and the active site of SPN (green ribbon). Arg40 of IFS is represented as a ball-and-stick model inside the surface in teal color. Residues of SPNct interacting with IFS Arg40 are shown as ball-and-stick models. Purple spheres are water molecules and dotted lines denote hydrogen bonds.