Journal of Molecular Biology
The Limit of Accuracy of Protein Modeling: Influence of Crystal Packing on Protein Structure
Introduction
The densely packed environment of globular proteins in crystal structures is likely to affect protein structure. Crystal contacts bury a significant portion of the solvent accessible surface of a protein1, 2, 3, 4 and this might induce structural changes. What are the characteristics and the extent of such changes? A direct way to approach this question would be to compare X-ray structures and those modeled by NMR (see, for example, Smith et al.5). Unfortunately, there are relatively few examples of proteins resolved by both methods, nor is it yet clear how best to compare a multitude of NMR models with a single X-ray model. An indirect way is to analyze differences between structures of the same protein in different crystal environments (i.e. when the arrangement of molecules in the crystal lattice is different). This has been done for several small datasets6, 7, 8 or for several specific proteins.4, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 In some of these, different crystal packing resulted in rigid body motion of large structural units9, 17 or loop conformational changes.18 In most others, crystal packing had a role in local structural differences. Sometimes, the structural difference was much larger than that originating from a point mutation.11
The few database studies carried out regarding the influence of crystal packing on protein structure deal mainly with side-chain conformation. Bower et al.20 examined the side-chain angle differences between lysozyme crystallized in different space groups in order to derive an upper limit for prediction accuracy of side-chains in modeling programs. They found that about 25% of the lysozyme residues differ in χ1 by >40° in different crystal environments, and about 40% differ in χ1 or χ2. Jacobson et al.21 found that side-chains with close intermolecular contacts tend to have different conformations more often if their crystal environment is different. This might result in an underestimation of predictive accuracy, especially for surface residues. Moreover, it is not clear whether regions of protein structures that participate in intermolecular crystal contacts are more or less mobile than other surface regions, since it has been claimed both that the average B-factor is larger in the region of contacts11 and that it is smaller.2
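The χ1 comparison described above can be made concrete with a short sketch. It is only an illustration, not the procedure used by Bower et al.: the function names are hypothetical, the 40° threshold is the one quoted in the text, and the angle values are invented toy numbers rather than measured data. The point it shows is that dihedral angles are periodic, so the difference must be wrapped into the range 0° to 180° before it is compared with the threshold.

```python
def chi_difference(a_deg, b_deg):
    """Smallest absolute difference between two dihedral angles, in degrees (0 to 180)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def fraction_differing(chi_a, chi_b, threshold_deg=40.0):
    """Fraction of residues whose chi angle differs by more than threshold_deg."""
    if len(chi_a) != len(chi_b):
        raise ValueError("both structures must list the same residues")
    if not chi_a:
        return 0.0
    n_diff = sum(1 for a, b in zip(chi_a, chi_b)
                 if chi_difference(a, b) > threshold_deg)
    return n_diff / len(chi_a)


# Invented chi1 values (degrees) for the same four residues in two crystal forms.
form_1 = [-60.0, 175.0, 65.0, -170.0]
form_2 = [-65.0, -175.0, -60.0, 180.0]

print(chi_difference(175.0, -175.0))       # 10.0, not 350.0
print(fraction_differing(form_1, form_2))  # 0.25
```

Without the wrap-around step, a pair such as 175° and −175° would be counted as a large change even though the two conformations are only 10° apart.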
The growth of the Protein Data Bank (PDB) now makes it feasible to arrive at statistical conclusions regarding structural effects of crystal packing. Here, we analyze proteins whose crystals have more than one molecule in the asymmetric unit or whose structures were determined at least twice by X-ray crystallography. The comparison of different structures of the same protein, in identical or different structural environments, is the main tool available for examining the amount of structural variability associated with crystal packing. However, a major problem is that in almost all cases there is some degree of dependence between structures resolved more than once. The potential limits of accuracy of structure prediction, as evaluated by crystal structure comparisons, are discussed.
Results
Figure 1 illustrates the crystal packing phenomenon we are addressing, using basic fibroblast growth factor as an example. At least two different crystal forms of this protein exist.22, 23 In these two crystal forms, different regions of the protein surface are involved in crystal contacts (Figure 1(a)). In fact, in the case of pancreatic ribonuclease and its six crystal forms, it was found that almost any surface residue can be involved in crystal
Crystal packing effects
The results of our database study show that, on average, crystal environment influences protein structure. A variety of parameters not previously examined at a statistical level were used. These include local backbone and hinge-like motions, side-chain flexibility, B′ values, and positional conservation of associated water and ligands. The variability between proteins resolved in different crystal forms was clearly higher than between those having the same form. However, despite the different
Datasets
Datasets for this study were created by first extracting all pairs of PDB36, 37 chains (February 2004 version) that are identical in sequence and whose structures were determined by X-ray crystallography to a resolution of 2.5 Å or better. Pairs were then separated into those whose partners share, or do not share, the same structural environment (determined from the space group, unit cell dimensions and atomic contacts between the molecules27).
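The pairing and separation steps described above might be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: the Chain record, the function names, the 1% unit-cell tolerance and the toy entries are all hypothetical, and the atomic-contact comparison mentioned in the text is omitted. Because of that omission, the sketch cannot separate chains from the same asymmetric unit, which share a space group and unit cell but may still have different neighbours.

```python
from dataclasses import dataclass
from itertools import combinations

MAX_RESOLUTION = 2.5  # angstroms; the cutoff stated in the text


@dataclass(frozen=True)
class Chain:
    entry: str          # placeholder identifier, not a real PDB code
    chain: str
    sequence: str
    resolution: float   # angstroms
    space_group: str
    cell: tuple         # (a, b, c, alpha, beta, gamma)


def identical_pairs(chains):
    """All pairs of chains with identical sequence and acceptable resolution."""
    ok = [c for c in chains if c.resolution <= MAX_RESOLUTION]
    return [(x, y) for x, y in combinations(ok, 2) if x.sequence == y.sequence]


def same_environment(x, y, rel_tol=0.01):
    """Assumed criterion: same space group and unit cell within rel_tol.

    The atomic-contact check used in the paper is not reproduced here.
    """
    if x.space_group != y.space_group:
        return False
    return all(abs(p - q) <= rel_tol * max(abs(p), abs(q))
               for p, q in zip(x.cell, y.cell))


def split_pairs(chains):
    """Separate identical-sequence pairs into same and different environments."""
    same, different = [], []
    for x, y in identical_pairs(chains):
        (same if same_environment(x, y) else different).append((x, y))
    return same, different


# Toy records with invented values, for illustration only.
chains = [
    Chain("toy1", "A", "MKTAYIAK", 1.8, "P 21 21 21", (50.0, 60.0, 70.0, 90.0, 90.0, 90.0)),
    Chain("toy2", "A", "MKTAYIAK", 2.0, "P 21 21 21", (50.2, 60.1, 69.8, 90.0, 90.0, 90.0)),
    Chain("toy3", "A", "MKTAYIAK", 2.2, "C 2", (80.0, 40.0, 55.0, 90.0, 105.0, 90.0)),
    Chain("toy4", "A", "MKTAYIAK", 3.0, "C 2", (80.0, 40.0, 55.0, 90.0, 105.0, 90.0)),
]
same, different = split_pairs(chains)
print(len(same), len(different))  # 1 2 (toy4 is excluded by the resolution cutoff)
```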
We derived two lists of pairs having different
Acknowledgements
We thank Dr Zippora Shakked for valuable discussions.
References (47)
- et al. Conservation helps to identify biologically relevant crystal contacts. J. Mol. Biol. (2001)
- et al. Refined crystal structures of subtilisin novo in complex with wild-type and two mutant eglins. J. Mol. Biol. (1991)
- et al. Crystal packing in six crystal forms of Pancreatic Ribonuclease. J. Mol. Biol. (1992)
- et al. Crystal structure of human class mu glutathione transferase GSTM2-2. Effects of lattice packing on conformational heterogeneity. J. Mol. Biol. (1994)
- et al. Protein flexibility and adaptability seen in 25 crystal forms of T4 lysozymes. J. Mol. Biol. (1995)
- et al. Open structures of MurD: Domain movements and structural similarities with folylpolyglutamate synthetase. J. Mol. Biol. (2000)
- et al. Two structures of cyclophilin 40: folding and fidelity in the TPR domains. Structure (2001)
- et al. Prediction of protein side-chain rotamers from a backbone-dependent rotamer library: a new homology modeling tool. J. Mol. Biol. (1997)
- et al. On the role of the crystal environment in determining protein side-chain conformations. J. Mol. Biol. (2002)
- et al. Heterogeneity and inaccuracy in protein structures solved by X-ray crystallography. Structure (2004)