Stabilization of Escherichia coli ribonuclease HI by strategic replacement of amino acid residues with those from the thermophilic counterpart.

Thermus thermophilus ribonuclease H is exceptionally stable against thermal and guanidine hydrochloride denaturations as compared to Escherichia coli ribonuclease HI (Kanaya, S., and Itaya, M. (1992) J. Biol. Chem. 267, 10184-10192). The identity in the amino acid sequences of these enzymes is 52%. As an initial step to elucidate the stabilization mechanism of the thermophilic RNase H, we examined whether certain regions in its amino acid sequence confer the thermostability. A variety of mutant proteins of E. coli RNase HI were constructed and analyzed for protein stability. In these mutant proteins, amino acid sequences in loops or terminal regions were systematically replaced with the corresponding sequences from T. thermophilus RNase H. Of the nine regions examined, replacement of the amino acid sequence in each of four regions (R4-R7) resulted in an increase in protein stability. Simultaneous replacements of these amino acid sequences revealed that the effect of each replacement on protein stability is independent of each other and cumulative. Replacement of all four regions (R4-R7) gave the most stable mutant protein. The temperature of the midpoint of the transition in the thermal unfolding curve and the free energy change of unfolding in the absence of denaturant of this mutant protein were increased by 16.7 degrees C and 3.66 kcal/mol, respectively, as compared to those of E. coli RNase HI. These results suggest that individual local interactions contribute to the stability of thermophilic proteins in an independent manner, rather than in a cooperative manner.


One of the main purposes of protein engineering is to develop methods for designing protein variants with higher thermostability. Various strategies to enhance protein stability have been proposed (1-7). These strategies, as well as the findings that the effects of amino acid substitutions on protein stability are additive (8-10), encourage us to use protein engineering technology for industrial applications. However, general methods to increase protein stability have not yet been established. More information on the structure-stability relationships of proteins is required.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) KO052 and 60507.

Present address: Basic Research Laboratories, Toray Industries, Inc., 1-111 Tebiro, Kamakura, Kanagawa 248, Japan.
Elucidation of the mechanisms by which thermophilic enzymes acquire their unusual thermostability will not only yield much information on the structure-stability relationships of proteins, but also allow evaluation of strategies that have been proposed to improve protein stability. It has been shown that the thermostability of mesophilic enzymes can be dramatically enhanced by replacing a few amino acid residues or a portion of the amino acid sequence with those from the thermophilic counterpart (11-16). In some cases, thermostabilization of proteins has been rationally designed (11, 12). However, the differences in stability between mesophilic and thermophilic proteins may reflect the sum of the various forces or the interactions that stabilize the proteins. In addition, different proteins may adopt unique mechanisms for stabilization. Therefore, it is important to thoroughly investigate the differences in protein stability for a given pair of mesophilic and thermophilic proteins.
RNase H (EC 3.1.26.4) is an enzyme that specifically degrades the RNA moiety of DNA/RNA hybrids (for a review, see Ref. 17). The enzyme is distributed widely in various organisms, including Escherichia coli (18, 19) and Thermus thermophilus (20 ...

EXPERIMENTAL PROCEDURES

... (rB-, mB-), recA13, ara-13, proA2, lacY1, galK2, rpsL20 (Smr), xyl-5, mtl-1, supE44, λ-) were from Takara Shuzo Co., Ltd. Cells were grown in Luria broth medium (34) ... (24). Other chemicals were of reagent grade.

Plasmid Construction-Plasmid pJAL600N for the overproduction of E. coli RNase HI was constructed as follows. The rnhA gene in plasmid pAK600 was amplified using PCR. The 5'- and 3'-primers were 5'-GGCATATGCTTAAACAGGTAG-3' and 5'-GGGTCGACCAATTCGCAGGCGGTTGG-3', respectively. The ATG initiation codon of the rnhA gene lies within the NdeI site (CATATG) of the 5'-primer, and the 3'-primer carries the SalI site (GTCGAC). After digestion of the PCR product with NdeI and SalI, the resultant 500-base pair DNA fragment was ligated with the large NdeI-SalI fragment of pJLA503 to construct plasmid pJAL600N. In this plasmid, transcription of the rnhA gene is initiated by the bacteriophage λ promoters PR and PL in tandem and terminated by the bacteriophage fd terminator (Fig. 1). The cIts857 gene of bacteriophage λ is also present on the plasmid. PCR was performed in 25 cycles with a Perkin-Elmer Cetus DNA Thermal Cycler (Model PJ2000) using a Gene Amp kit (Takara Shuzo Co., Ltd.) according to the procedure recommended by the supplier.
All oligodeoxyribonucleotides were synthesized with an Applied Biosystems Model 380A automatic DNA synthesizer by the phosphoramidite method (35).
Mutagenesis-Alteration of the rnhA gene was carried out by site-directed mutagenesis using PCR or by cassette mutagenesis. All the codons for amino acid residues that were changed in the rnhA gene in this experiment are listed in Table I. Relevant restriction enzyme sites of pJAL600N are shown in Fig. 1. Alteration of the codons for the amino acid residues located at the NH2 or COOH terminus was carried out by cassette mutagenesis. The 30-base pair NdeI-BglII fragment or the 100-base pair SstII-SalI fragment from pJAL600N was replaced with the chemically synthesized mutant fragment. Alteration of the amino acid codons located between the BglII and SstII sites was carried out with PCR as described by Higuchi (36). Briefly, two primary PCR products, which overlap in sequence, were first obtained from an appropriate DNA template. One was generated with the 5'-primer containing the BglII site and the 3'-mutagenic primer, and the other was obtained with the 5'-mutagenic primer and the 3'-primer containing the SalI site. The two PCR products were denatured, annealed, and reamplified with only the outer 5'- and 3'-primers. The resultant secondary PCR product was digested with BglII and SalI and ligated to the large BglII-SalI fragment of pJAL600N. For the construction of the pJAL600N derivatives with multiple region replacements, the relevant restriction enzyme sites were used to generate a chimeric gene composed of the two different mutant rnhA genes. Alternatively, a pJAL600N variant, instead of pJAL600N itself, was used as a DNA template for mutagenesis with PCR. Since either the 5'- or the 3'-mutagenic primer or both were usually designed to create a new restriction enzyme site within the gene, the mutants were initially screened by restriction enzyme mapping of plasmid DNA prepared from Ampr transformants of E. coli HB101. The production and purification of the mutant proteins from HB101 transformants were carried out as described for the wild-type protein from E. coli N4830-1 harboring plasmid pPL801 (24). The synthesis levels and the purities of the mutant proteins were analyzed by sodium dodecyl sulfate-15% polyacrylamide gel electrophoresis (37). Each mutation was confirmed by amino acid sequence analysis as previously described (38).

Protein Concentration-The protein concentration was determined from UV absorption with an absorption coefficient of A280(0.1%) = 2.02, which assumed that all mutant proteins, except for those in which both Trp118 and Trp120 were replaced by Phe, had the same absorption coefficient as that of wild-type E. coli RNase HI (38). For the mutant proteins in which Trp118 and Trp120 were replaced by Phe, the absorption coefficient was revised to A280(0.1%) = 1.42 by determining the protein concentration by the method of Bradford (39) using wild-type E. coli RNase HI as the standard protein. This value was in good agreement with the A280(0.1%) value of 1.48 calculated by using ε = 1576 M-1 cm-1 for Tyr (×5) and 5225 M-1 cm-1 for Trp (×4) at 280 nm (40).
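The calculated value of 1.48 can be reproduced with a short arithmetic sketch. The scaling procedure shown here is an assumption, not a method stated in the text: the wild-type coefficient (2.02) is multiplied by the ratio of the summed Tyr and Trp molar absorption coefficients of the mutant (5 Tyr, 4 Trp) to those of the wild type (5 Tyr, 6 Trp, inferred from the loss of two Trp residues). The small change in molecular weight is neglected, and the function names are hypothetical.

```python
# Molar absorption coefficients at 280 nm quoted in the text (M^-1 cm^-1).
EPS_TYR = 1576.0
EPS_TRP = 5225.0


def summed_epsilon(n_tyr, n_trp):
    """Sum of the Tyr and Trp contributions to the absorption at 280 nm."""
    return n_tyr * EPS_TYR + n_trp * EPS_TRP


def scaled_a280(a_wild_type, n_tyr, n_trp_wild_type, n_trp_mutant):
    """Scale the wild-type A280(0.1%) by the chromophore ratio.

    Assumption (not from the source): the mutant coefficient is estimated by
    scaling the measured wild-type value, ignoring the mass change.
    """
    ratio = summed_epsilon(n_tyr, n_trp_mutant) / summed_epsilon(n_tyr, n_trp_wild_type)
    return a_wild_type * ratio


# Wild type: 5 Tyr and 6 Trp (assumed); mutant: Trp118 and Trp120 replaced by Phe.
estimate = scaled_a280(2.02, n_tyr=5, n_trp_wild_type=6, n_trp_mutant=4)
print(f"Estimated A280(0.1%) of the mutant: {estimate:.2f}")
```

Under these assumptions the estimate rounds to 1.48, matching the calculated value quoted above and lying close to the value of 1.42 obtained by the Bradford method.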
Enzymatic Activity-The RNase H activity was determined by measuring the radioactivity of the acid-soluble digestion product from the substrate, a 3H-labeled M13 DNA/RNA hybrid, as previously described (25). One unit of enzymatic activity is defined as the amount of enzyme producing 1 pmol of acid-soluble material/min at 37 °C. The specific activity is defined as units of enzymatic activity/milligram of protein.
Circular Dichroism Spectra-The CD spectra were measured on a J-600 spectropolarimeter (Japan Spectroscopic Co., Ltd.). Spectra were obtained at 25 °C in 10 mM sodium acetate buffer (pH 5.5) containing 0.1 M NaCl. The protein concentration was 0.15 mg/ml, and the optical path length was 2 mm.
Thermal Denaturation-Thermal denaturation curves were determined by monitoring the CD value at 220 nm as the temperature was increased. Proteins were dissolved in 10 mM glycine-HCl buffer (pH 3.0) containing 1 mM dithiothreitol or in 20 mM sodium acetate buffer (pH 5.5) containing 1 M GdnHCl and 1 mM dithiothreitol. The protein concentration was 0.15 mg/ml, and the optical path length was 2 mm. The temperature of the cuvette containing the sample solution was raised at a rate of approximately 0.7 °C/min. The temperature of the sample solution was directly measured by a Takara D641 thermistor. All mutant proteins examined reversibly unfolded in a single cooperative fashion under these experimental conditions. On the assumption that ...

GdnHCl Denaturation-GdnHCl denaturation curves were obtained at 25 °C by monitoring the CD values at 220 nm with variation of the GdnHCl concentration. Proteins were dissolved in 20 mM sodium acetate buffer (pH 5.5) containing the appropriate concentration of GdnHCl and incubated at 25 °C for at least 30 min prior to the measurement. The protein concentration was 0.13 mg/ml, and the optical path length was 2 mm. All mutant proteins reversibly unfolded in a single cooperative fashion. On the assumption that the unfolding equilibrium of those proteins follows a two-state mechanism, the pre- and post-transition base lines were linearly extrapolated, and the difference in free energy between the folded and unfolded states (ΔG) was calculated as described by Pace (43). The free energy change in H2O (ΔG(H2O)) and the dependence of ΔG on the GdnHCl concentration (m) were determined by a least-squares fit of the data from the transition region to the following equation:

$$\Delta G = \Delta G_{\mathrm{H_2O}} - m\,[\mathrm{GdnHCl}]$$

The midpoint of the GdnHCl denaturation curve ([D]1/2) was the concentration of GdnHCl at which the ΔG value became 0.
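The least-squares step of the linear extrapolation analysis can be sketched as follows. This is a minimal illustration, not the authors' analysis program. It assumes the standard linear form ΔG = ΔG(H2O) - m[GdnHCl], the function name is hypothetical, and the data points are synthetic values generated from assumed parameters rather than measurements from this study.

```python
def fit_linear_extrapolation(concentrations, delta_g_values):
    """Fit dG = dG_H2O - m * [D] by ordinary least squares.

    Returns (dG_H2O, m, D_half), where D_half is the denaturant
    concentration at which dG equals 0.
    """
    n = len(concentrations)
    if n < 2 or n != len(delta_g_values):
        raise ValueError("need at least two paired data points")
    mean_x = sum(concentrations) / n
    mean_y = sum(delta_g_values) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("concentrations must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, delta_g_values))
    slope = sxy / sxx
    dg_h2o = mean_y - slope * mean_x   # intercept at zero denaturant
    m_value = -slope                   # m is defined as the negative slope
    d_half = dg_h2o / m_value          # dG = 0 at the transition midpoint
    return dg_h2o, m_value, d_half


# Synthetic transition-region data (hypothetical): dG_H2O = 10 kcal/mol,
# m = 5 kcal/mol/M, so the midpoint should fall at 2.0 M GdnHCl.
conc = [1.6, 1.8, 2.0, 2.2, 2.4]
dg = [10.0 - 5.0 * c for c in conc]
dg_h2o, m_value, d_half = fit_linear_extrapolation(conc, dg)
print(dg_h2o, m_value, d_half)
```

Only points within the transition region should be passed to such a fit, because ΔG cannot be estimated reliably where the protein is almost fully folded or fully unfolded.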

RESULTS
Mutagenesis-Comparison of the three-dimensional structure of E. coli RNase HI with the hypothetical structure of T. thermophilus RNase H indicates that the amino acid substitutions between these two enzymes are mostly localized in the COOH-terminal region; the region consisting of the NH2 terminus and the loop between βB and βC; and the region consisting of αII, αIII, αIV, and βE (23). In this study, the amino acid sequences of E. coli RNase HI in these regions were replaced with the corresponding sequences from T. thermophilus RNase H. In addition, the amino acid sequences forming the loop structures were also chosen as target sequences to be replaced. The amino acid residues involved in the loop regions are generally exposed to the solvent and are expected to contribute to protein stability in an independent manner, rather than in a cooperative manner. Finally, as shown in Fig. 2, ... The localizations of these regions in the crystal structure of E. coli RNase HI are shown in Fig. 3.
The mutant proteins of E. coli ...

Enzymatic Activity-The specific activities of the mutant proteins are summarized in Table II. It is notable that the mutant enzymes G80b-RNase H and R9-RNase H exhibited extremely low specific activities as compared to that of the wild-type protein. In addition, the specific activities of the mutant proteins in which the amino acid sequences in region R6 or R7 or both were replaced with those of T. thermophilus RNase H were considerably decreased as compared to that of the wild-type protein. All other mutant proteins exhibited specific activities similar to that of the wild-type protein. The specific activities of the mutant proteins with multiple region replacements were similar to or slightly less than the hypothetical values, which were calculated assuming that the effect of each region replacement on the enzymatic activity is cumulative.

Fig. 4 (legend fragment). ... respectively, are shown in comparison with that of the wild-type protein (---). All spectra were measured as described under "Experimental Procedures."

CD Spectra-The far-ultraviolet (200-250 nm) CD spectra of the mutant proteins were classified into four groups (WT and types A-C) based on the shape of the troughs (Fig. 4). The type of CD spectrum for each mutant protein is summarized in Table II. The group designated as WT represents the spectrum identical to that of the wild-type protein. Type A represents the spectrum that has a deep minimum at 217 nm. This minimum position was shifted by 3-4 nm to a longer wavelength as compared to the position of the minimum in the spectrum of the wild-type protein. Types B and C represent the spectra in which the slope of the trough in the 220-240 nm region was sharper than that observed in the spectrum of the wild-type protein. The type B spectrum is distinguished from the type C spectrum by the position of the minimum. Whereas the former has a minimum at 214 nm, the latter has a minimum at 217 nm.
Type A and B spectra were only observed for the mutant proteins that involve the replacement of the amino acid sequences in regions R5 and R7, respectively. Type C spectra were only observed for the mutant proteins that involve the replacement of the amino acid sequences in both of these regions. These results suggest that the effect of each region replacement on the CD spectrum is cumulative. However, the differences in the CD spectra were faint, and the spectra of all the mutant proteins basically resembled one another. This suggests that none of the mutant proteins is markedly changed in its tertiary structure.

Thermal Stability-To examine whether replacement of the amino acid sequence in each region with the corresponding thermophilic RNase H sequence enhances the stability of E. ... replacements with that of the wild-type protein indicates that only the replacement of the amino acid sequence (or an amino acid residue) in each of the three regions R4-R6 resulted in an increase in the thermal stability at either pH. Replacement of the amino acid sequence in region R7 increased the thermal stability at pH 5.5, but decreased it at pH 3.0. All the Tm values of the other mutant proteins with single region replacements were equal to or below that of the wild-type protein at either pH. Notably, R9-RNase H became extremely unstable. In addition to the region replacements, a few single amino acid substitutions were also introduced to examine their effects on protein stability. Neither the insertion of Gly80b nor the Gln113 → Pro substitution significantly affected protein stability. However, the His119 → Glu substitution substantially affected protein stability. The His119 → Glu mutation in R7-RNase H gave R7E119-RNase H. The Tm value of R7E119-RNase H remained unchanged at pH 5.5, but increased by 4.6 °C at pH 3.0 as compared to that of R7-RNase H.

S. Kanaya, personal communication.
For mutant proteins with multiple amino acid sequence replacements, hypothetical ΔTm values were obtained by simply adding the ΔTm values determined for mutant proteins with a single replacement of the amino acid sequence in each region and are listed in Table III. Only for the mutant proteins in which the amino acid sequences in regions R6 and R7 were both replaced were the observed ΔTm values slightly larger (by 0.8-1.8 °C) than the hypothetical values. All other observed ΔTm values were equal to or below the hypothetical ones.
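The additivity test described above amounts to a simple sum and comparison, sketched below. The ΔTm values in the example are hypothetical placeholders chosen for illustration only; they are not the values of Table III, and the function names are likewise assumptions.

```python
def hypothetical_delta_tm(single_delta_tm, regions):
    """Sum the single-replacement dTm values for the listed regions."""
    return sum(single_delta_tm[region] for region in regions)


def deviation_from_additivity(observed_delta_tm, single_delta_tm, regions):
    """Observed dTm minus the additive expectation.

    A positive value means the combined mutant is more stable than the
    simple sum predicts; zero means strictly independent effects.
    """
    return observed_delta_tm - hypothetical_delta_tm(single_delta_tm, regions)


# Placeholder dTm values in degrees C (hypothetical, NOT from Table III).
single = {"R4": 2.0, "R5": 5.0, "R6": 4.0, "R7": 3.0}

expected = hypothetical_delta_tm(single, ["R6", "R7"])
excess = deviation_from_additivity(8.0, single, ["R6", "R7"])
print(expected, excess)
```

A deviation near zero for a combination of regions corresponds to the independent, cumulative behavior reported for most mutants, whereas a small positive deviation corresponds to the slight excess reported for combinations that include both R6 and R7.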
Stability against GdnHCl Denaturation-The stabilities against GdnHCl denaturation of the mutant proteins with higher Tm values than that of the wild-type protein were analyzed at pH 5.5. GdnHCl-induced denaturation curves and plots of ΔG versus GdnHCl concentration around the midpoint of the denaturation are shown in Fig. 6, A and B, respectively. The parameters characterizing the GdnHCl denaturation are summarized in Table IV ...

Table III (legend fragment). The melting temperature (Tm) is the temperature of the midpoint of the thermal denaturation transition shown in Fig. 5. The difference in the melting temperature between the wild-type and mutant proteins (ΔTm) is calculated as Tm (mutant) - Tm (wild type). The hypothetical ΔTm value (ΔTm,hyp) was calculated for each mutant protein with multiple region replacements assuming that the effect of each amino acid sequence replacement on the protein stability is independent and cumulative. For example, ΔTm,hyp of R1/Rn-RNase H was calculated by adding the ΔTm values of R1-RNase H and Rn-RNase H. ΔHm and ΔSm are the enthalpy and entropy changes of unfolding at Tm, respectively, which were calculated by van't Hoff analysis. The difference between the free energy change of unfolding of the mutant proteins and that of the wild-type protein at the Tm of the wild-type protein (ΔΔGm) was estimated by the relationship given by Becktel and Schellman (42) as described under "Experimental Procedures." Errors are within ±0.3 °C for Tm, ±12 kcal/mol for ΔHm, and ±0.03 kcal/mol/K for ΔSm for the wild-type protein and were determined from four independent experiments.

Fig. 6 (legend fragment). ... (43). In all cases, ΔG was found to vary linearly with the GdnHCl concentration. Only the GdnHCl unfolding curves of the mutant proteins whose thermal denaturation curves are shown in Fig. 5 are presented. They are designated by the same symbols as those described in the legend to ...

Heat Inactivation-To examine whether the mutant proteins with higher Tm and [D]1/2 values than those of the wild-type protein are more stable against irreversible heat inactivation than the wild-type protein, we measured the residual activities of R4/R5/R6/R7-RNase H after heating at various temperatures for 10 min as described previously (23). The temperatures at which the wild-type and mutant proteins lost 50% of their activities (T1/2) were approximately 50 and 65 °C, respectively. The difference in these T1/2 values is comparable with the difference in the Tm values, indicating that this mutant protein is more stable against irreversible heat inactivation than the wild-type protein as well.

DISCUSSION

Strategy to Identify Structural Elements Responsible for Thermal Stabilization-The construction of chimeric genes between the structural genes of the thermophilic and mesophilic proteins, followed by the determination of the stabilities of these gene products, has been successfully employed as a general strategy to determine the structural elements responsible for the unusual stability of thermophilic proteins (14, 16, 45). Such a strategy should be effective if no information on the three-dimensional structure is available for either the thermophilic protein or the mesophilic counterpart. However, the possibility exists that a decrease in protein stability, which is caused by an unfavorable interaction between the amino acid residues from the mesophilic and thermophilic origins, could offset an increase in protein stability, which is caused by the introduction of a structural element responsible for thermal stabilization. Larger replaced sequences within the chimeric protein increase the possibility of an unfavorable interaction due to the larger proportion of amino acid residues of different origin. Such unfavorable interactions could be reduced if the size of the amino acid sequence to be replaced were limited. We have therefore selected nine different amino acid sequences of E. coli RNase HI with limited sizes, including those which form the loops, and replaced them individually or in combination with the corresponding T. thermophilus RNase H sequences (Fig. 2). This approach, while rational, may not necessarily be the only one; because of the enormous number of possible amino acid substitutions at any given position, there could be many proteins of equivalent stabilities generated using a different approach.
Analyses of the stabilities of the enzymes generated in this study allow us to suggest that four of the nine sequences (those in regions R4-R7) almost independently contribute to the unusual stability of the thermophilic RNase H, but that the amino acid sequences in regions R1-R3 and R8 do not (Table III). To identify the amino acid substitutions that determine the unusual stability of the thermophilic protein, the strategy presented here, which involves the systematic replacement of amino acid sequences of limited size, might therefore be more useful than one that involves the construction of a chimeric protein with larger replaced segments. A slight cooperativity was seen between regions R6 and R7 for stabilization against thermal (Table III) and GdnHCl (Table IV) ...

Role of Proline Residues-T. thermophilus RNase H has more Pro residues than E. coli ... (Table III). The single Gln113 → Pro substitution slightly decreased the stability (Table III), probably because the conformation of the loop between αIV and βE in E. coli RNase HI might be considerably different from that in T. thermophilus RNase H, and the Gln113 → Pro substitution alone may not be sufficient to alter the nature of this loop.

Effect of Gly80b Insertion and His119 → Glu Reverse Mutation-G80b-RNase H was constructed to examine the effect of the insertion of Gly80b on the stability of E. coli RNase HI. Since the increase in the Tm value of G80b-RNase H relative to the wild-type protein was much smaller than that of R5-RNase H (Table III), the insertion of Gly80b alone cannot account for the increase in the stability of R5-RNase H. The great reduction in the enzymatic activity caused by the insertion of Gly80b may be due to some alteration in the geometry around αII and αIII, which are proposed to contribute to the formation of the substrate-binding site (28). The reverse substitution from His119 to Glu in R7-RNase H resulted in an increase in the Tm value by 4.6 °C at pH 3.0, but not at pH 5.5, as compared to the Tm value of R7-RNase H.
Since the O-ε1 atom of Glu119 forms a hydrogen bond with the N-ε2 atom of His127 in E. coli RNase HI (28), this result suggests that the decrease in the stability of R7-RNase H at pH 3.0 is due to an electrostatic repulsion between His119 and ... The relatively low enzymatic activity of R7E119-RNase H and R7-RNase H (Table II) is probably due to a slight deformation in the active-site geometry resulting from the modification of the hydrophobic interactions between αII and βE in these mutant proteins. Further mutagenesis experiments will be required to limit the amino acid substitutions to those that are responsible for the increase in the stability of R5-, R6-, or R7-RNase H.
Effect of COOH-terminal Sequence Replacement-The COOH termini of the E. coli and T. thermophilus proteins, which show no sequence similarities (Fig. 2), are likely to fold uniquely and interact with different parts of the protein molecule. The dramatic decrease in both the enzymatic activity (Table II) and the protein stability (Table III) of R9-RNase H therefore strongly suggests that the replacement of the COOH-terminal sequence introduces significant strain into the COOH-terminal peptide or within the region with which the COOH-terminal peptide interacts. In fact, the mutant protein in which the larger COOH-terminal peptide of E. coli RNase HI (Met142-Val155) was replaced with Glr~'~'-Ala'~~ of T. thermophilus RNase H was recovered in an insoluble form in cells (see Footnote 3). An increase in the unfavorable interactions caused by the replacement of the large COOH-terminal peptide may disturb the proper folding of the protein molecule.
Correlation between Stabilities against Heat and GdnHCl-The increase in the [D]1/2 values (Table IV) determined from the GdnHCl denaturation curves at pH 5.5 was comparable to the increase in the Tm values at pH 5.5 (Table III) for all mutant proteins, except for R7-RNase H, which gave a [D]1/2 value similar to that of the wild-type protein. Additivity in the [D]1/2 values was also observed for the mutant proteins with multiple region replacements, as was seen in the Tm values. The differences in the ΔG(H2O) values were not proportional to the differences in the [D]1/2 values for R6-RNase H and R5/R6-RNase H (Table IV). These two mutant proteins had lower ΔG(H2O) values than expected because their m values (the dependence of ΔG on the GdnHCl concentration) were smaller than those of the wild-type and other mutant proteins. This result suggests that the cooperativity of the GdnHCl-induced unfolding for these mutant proteins is weaker than that for the wild-type protein. Shortle et al. (49) have shown that the absolute value of the change in m is correlated with the loss of protein stability. In contrast, R5-RNase H and R5/R6-RNase H were more stable even though the m values of these mutant proteins were decreased. A mechanism different from that proposed by Shortle et al. may be involved in the stability of these mutant proteins. Careful calorimetric measurements will be required to precisely determine the thermodynamic values for these mutant proteins.
Stability-Activity Relationship-Controversy still remains as to whether an increase in the thermal stability is accompanied by a reduction in the enzymatic activity due to a decrease in conformational flexibility. In fact, T. thermophilus RNase H exhibited lower enzymatic activity than E. coli RNase HI under physiologically mild conditions (23). In this study, we have shown that some mutants of E. coli RNase HI could be stabilized without losing enzymatic activity. A typical example can be seen with R4/&-RNase H. Whereas this mutant protein has a Tm value that is 8.8 °C higher than that of E. coli RNase HI at pH 5.5, it retained 75% of the enzymatic activity of the E. coli protein. This result supports the previous finding that enzymes from thermophilic origins are catalytically indistinguishable from their mesophilic counterparts (50). It also suggests that the thermostability of mesophilic enzymes can be enhanced by protein engineering technology without loss of enzymatic activity.