The Three-dimensional Structure of Ricin at 2.8 Å*

The x-ray crystallographic structure of the heterodimeric plant toxin ricin has been determined at 2.8-Å resolution. The A chain enzyme is a globular protein with extensive secondary structure and a reasonably prominent cleft assumed to be the active site. The B chain lectin folds into two topologically similar domains, each binding lactose in a shallow cleft. In each site a glutamine residue forms a hydrogen bond to the OH-4 of galactose, accounting for the epimeric specificity of binding. The interface between the A and B chains shows some hydrophobic contacts in which proline and phenylalanine side chains play a prominent role.

A variety of plants contains cytotoxic proteins which are among the most poisonous compounds known. Perhaps the best known of these toxins is ricin, isolated from the seeds of the castor plant Ricinus communis. The protein was first isolated by Stillmark in the 1880s (1) and has been extensively studied since then. A comprehensive review of the voluminous work on ricin was published in 1982 by Olsnes and Pihl (2).
Ricin is isolated as a heterodimer, consisting of a 32,000-dalton A chain glycoprotein linked by a disulfide bond to a 32,000-dalton B chain glycoprotein (3). The B chain is a lectin with a binding preference for galactosides (4), although it binds much more strongly to complex galactosides from cell surface carbohydrates than to simple sugars (5). Binding of the B chain to cell surface receptors appears to trigger endocytotic uptake of the protein (6). Recent studies have suggested that A and B chains have a strong affinity for one another, mediated by hydrophobic forces, and that association of the chains is necessary for toxicity. The disulfide link between the chains does not appear to be critical for toxicity except in maintaining protein-protein interactions at very low toxin concentration (7). There is recent evidence to suggest that the B chain also facilitates escape of the A chain from endosomes into the cytoplasm (8). These same workers also showed that, in addition to the galactoside binding uptake mechanism, ricin can be taken up by and intoxicate those cells which possess mannose receptors. In this case, the mannose residues of ricin are themselves recognized by a cellular receptor (8).
* This work was supported by Grant GM30048 from the National Institutes of Health and by a gift from the Cetus Corporation of Emeryville, CA (both to J. D. R.). Area detector data were collected at the national facility at the University of California at San Diego, supported by Grant RR01644 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
** To whom correspondence should be addressed.
Once the A chain of ricin reaches the cytoplasm of a typical eucaryotic cell, it enzymatically attacks the 60 S ribosomal subunit and disrupts protein synthesis (2). Ricin has been shown (9) to have a Km = 0.1 μM for ribosomes and shows a kcat = 1500 min⁻¹. Bacterial ribosomes are not susceptible to ricin intoxication (10). The exact nature of the catalytic event is unknown. No cofactors are required, and the reaction is not similar to that of diphtheria toxin, which attacks soluble elongation factors. It has been shown that ricin has a very weak ribonuclease activity against rRNA isolated from the 60 S subunit of reticulocyte ribosomes (11). Despite many efforts, there is no evidence as yet for any gross alteration, including rRNA cleavage, of ricin-intoxicated ribosomes (2). It seems reasonable, however, that the ribonuclease activity may be an aberration of normal ricin action and that ricin action may involve the binding of rRNA in some fashion.
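As a rough illustration of what these kinetic constants imply, the following Python sketch computes the specificity constant kcat/Km and the per-enzyme turnover at a chosen ribosome concentration. It assumes the quoted values are Km = 0.1 μM and kcat = 1500 min⁻¹ and that simple Michaelis-Menten kinetics apply, which the paper does not state; the function names are hypothetical and chosen only for this example.

```python
# Sketch: Michaelis-Menten quantities from the kinetic constants quoted for ricin.
# Assumptions (not from the paper): simple Michaelis-Menten behaviour;
# all helper names below are hypothetical.

KM_MOLAR = 0.1e-6        # Km = 0.1 uM, expressed in mol/L
KCAT_PER_MIN = 1500.0    # kcat = 1500 min^-1


def specificity_constant_per_s(kcat_per_min, km_molar):
    """Return kcat/Km in M^-1 s^-1."""
    return (kcat_per_min / 60.0) / km_molar


def turnover_per_min(substrate_molar, kcat_per_min, km_molar):
    """Ribosomes inactivated per enzyme molecule per minute at a given ribosome concentration."""
    return kcat_per_min * substrate_molar / (km_molar + substrate_molar)


if __name__ == "__main__":
    efficiency = specificity_constant_per_s(KCAT_PER_MIN, KM_MOLAR)
    print(f"kcat/Km = {efficiency:.2e} M^-1 s^-1")
    # At [S] = Km the turnover is half of kcat by definition.
    half_max = turnover_per_min(KM_MOLAR, KCAT_PER_MIN, KM_MOLAR)
    print(f"turnover at [S] = Km: {half_max:.0f} min^-1")
```

Under these assumptions kcat/Km comes out in the 10⁸ M⁻¹ s⁻¹ range, which is consistent with the description of ricin as an extremely efficient catalyst.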
In addition to its biochemically interesting properties, ricin has attracted considerable interest as a potential antitumor agent. The protein has been shown to be roughly 10-100 times more toxic to certain transformed cell lines than to normal cells (6). However, the most exciting use of ricin has been in the form of immunotoxins. These are conjugates of ricin with cell-specific antibodies which act to target the toxin (12-14). Although the use of immunotoxins as anticancer agents has been reasonably encouraging, there have been problems due to lack of target cell specificity. It has been observed that if the B chain cell surface binding function is reversibly blocked, as by galactose, then immunotoxin specificity increases (13). This has led to the hope that genetic engineering of ricin could produce more specific therapeutic reagents.
The amino acid sequences and carbohydrate compositions of both ricin chains have been determined by chemical methods (15,16). More recently, Lord and Butterworth (17) have shown that both chains of ricin are coded by a single mRNA. The nucleotide sequence for cDNA derived from this message has been determined (18). That sequence revealed that ricin is synthesized as a proenzyme, including a leader sequence upstream of the A chain and a 12-amino acid peptide linking the A chain to the B chain. The leader peptide and the linking peptide are both processed away to form the heterodimer. The nucleotide sequence of the genomic DNA has also been determined (19). It confirms the proenzyme structure of ricin and also reveals that the gene has no introns.
In this paper we report the three-dimensional structure of ricin, as determined by x-ray crystallography at 2.8-Å resolution.

EXPERIMENTAL PROCEDURES
Ricin was prepared and crystals grown as described previously (7,20). The crystals are orthorhombic, space group P2₁2₁2₁, with a […] 6, and c = 114.7 Å, containing one ricin molecule per asymmetric unit. Work on the ricin structure has been underway in our laboratory for over 9 years and until recently has been very frustrating. High resolution data for native ricin and at least five heavy metal derivatives were previously collected by oscillation photography and also by automated diffractometry. The positions of heavy atoms were determined by difference Patterson analysis, heavy metal site parameters were refined, and phases calculated. The resulting maps were encouraging in that many features of the protein, particularly α helices in the A chain, were discernible. Electron density for many side chains was also well resolved, but it was impossible to follow the backbone unambiguously for either peptide chain.
Table I. Phase refinement statistics as a function of resolution. PtTP refers to the 2-hydroxyethanethiolate(2,2',2"-terpyridine) platinum dinitrate derivative. MeHg refers to methyl mercury. f_H is the root mean squared heavy atom structure factor, E is the root mean squared lack of closure, and fom is the average figure of merit.
Attempts were made over several years to improve the maps by a variety of partial model refinements and phase combination methods (21-23). Although initially encouraging, these methods did not produce an interpretable chain tracing. The problems undoubtedly centered around poor starting phases. The most likely reason for this is the fact that ricin crystals change cell parameters very readily, even responding to changes in humidity. The changes are most pronounced along the b axis and may be as large as 1%. Since traditional data collection methods required six to eight crystals for a high resolution data set, the data were never completely isomorphous and tended to show an $R_{\mathrm{sym}}$ of ~12% for a given data set, where $R_{\mathrm{sym}} = \sum |I_h - \langle I_h \rangle| / \sum I_h$. This is quite large compared to the difference anticipated between native and derivative data sets, upon which phase quality depends.
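To make the merging statistic concrete, the following Python sketch computes an R factor of this kind from repeated measurements of each reflection. It assumes the conventional definition (sum of absolute deviations from the mean intensity of each reflection, divided by the sum of all intensities); the function name `r_sym` and the toy intensities are invented for illustration and are not data from this work.

```python
# Sketch: merging R factor over repeated measurements of each reflection.
# Assumption: conventional definition, sum|I - <I>| / sum(I); toy data are made up.
from statistics import fmean


def r_sym(observations):
    """observations maps a reflection index (h, k, l) to a list of measured intensities."""
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_intensity = fmean(intensities)
        # Accumulate the absolute deviation of every measurement from its reflection mean.
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


if __name__ == "__main__":
    toy = {
        (1, 0, 0): [100.0, 110.0],  # two measurements that disagree by 10%
        (0, 1, 0): [50.0, 50.0],    # two measurements in perfect agreement
    }
    print(f"R_sym = {100 * r_sym(toy):.1f}%")
```

A value near 12% means the repeated measurements scatter by an amount comparable to the native-derivative intensity differences that isomorphous replacement must detect, which is why such data gave poor phases.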
Recently we collected data for native ricin and two isomorphous derivatives using the Mark II multiwire area detector system at the University of California-San Diego (24). The derivatives were methyl mercury acetate and 2-hydroxyethanethiolate(2,2',2"-terpyridine) platinum dinitrate. Each was prepared by soaking crystals in a 1-2 mM solution for 10 days. Typically, we collected 215,000 observations of 16,800 unique reflections to 2.8 Å, with a merging $R_{\mathrm{sym}}$ = 5.5%.
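The observation counts quoted here imply a high average multiplicity per reflection, which the short Python sketch below works out. The function name is hypothetical, and the calculation assumes the observations are spread evenly over the unique reflections, which real data sets only approximate.

```python
# Sketch: average multiplicity (redundancy) implied by the quoted data-collection counts.
# Assumption: observations are spread evenly over the unique reflections.

N_OBSERVATIONS = 215_000  # total measurements, as quoted in the text
N_UNIQUE = 16_800         # unique reflections to 2.8 A, as quoted in the text


def average_multiplicity(n_observations, n_unique):
    """Mean number of measurements per unique reflection."""
    return n_observations / n_unique


if __name__ == "__main__":
    print(f"average multiplicity = {average_multiplicity(N_OBSERVATIONS, N_UNIQUE):.1f}")
```

Each unique reflection was therefore measured about a dozen times on average, which helps explain the much lower merging R factor obtained with the area detector.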
Anomalous dispersion differences were measured and proved useful for phasing reflections to about 4 Å. Heavy atom sites were located by difference Patterson analysis and the heavy atom parameters refined by standard methods. Table I shows the heavy atom refinement statistics appropriate for the calculation of phases for the multiple isomorphous replacement electron density map.
The resulting map appeared to be an improvement over our earlier maps, but still could not be traced unambiguously. The multiple isomorphous replacement phases were then refined by the method of Wang (25). These refined phases allowed a new map to be made which was interpretable. The electron density corresponding to an extended β sheet structure in the A chain was substantially improved and was recognized as such for the first time. Most improved, however, was the density for the B chain, which is nearly devoid of secondary structures.
The protein backbone was initially traced from a minimap displayed on Plexiglas sheets. An α carbon backbone was then fitted to the electron density using the computer program FRODO (26) and an Evans and Sutherland MPS graphic system. The Cαs were positioned near the base of side chain density when present and were constrained to a bond length of 3.8 Å. These atoms then were used as a guide in building the complete molecular model on the graphics system. Fig. 1 shows a stereogram of the α carbon tracing of ricin.
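The 3.8 Å constraint reflects the nearly fixed distance between consecutive α carbons joined by a trans peptide bond. The Python sketch below shows one way such a constraint could be checked on a traced backbone; it is not the procedure used in FRODO, and the tolerance, function names, and toy coordinates are assumptions made for illustration.

```python
# Sketch: flag consecutive C-alpha pairs whose spacing departs from the 3.8 A constraint.
# Assumptions: the 0.1 A tolerance, the helper names, and the toy coordinates are hypothetical.
import math

CA_CA_TARGET = 3.8  # Angstroms, the constraint quoted in the text


def ca_ca_distances(coords):
    """Distances between successive C-alpha positions given as (x, y, z) tuples."""
    return [math.dist(a, b) for a, b in zip(coords, coords[1:])]


def constraint_violations(coords, target=CA_CA_TARGET, tolerance=0.1):
    """Return (index, distance) for each consecutive pair outside target +/- tolerance."""
    return [
        (i, d)
        for i, d in enumerate(ca_ca_distances(coords))
        if abs(d - target) > tolerance
    ]


if __name__ == "__main__":
    # Toy trace: the last step is deliberately too long.
    trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (3.8, 3.8, 4.5)]
    for index, distance in constraint_violations(trace):
        print(f"pair {index}-{index + 1}: {distance:.2f} A")
```

A check of this kind catches skipped or doubled residues early, before side chains are built onto the trace.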

RESULTS AND DISCUSSION
The B chain occupies the lower left portion of the figure. It is roughly 70 Å long, about 30 Å wide, and is composed of two separate folding domains. Each domain is a roughly 30-Å sphere and binds one lactose disaccharide, as predicted earlier (27).
The A chain is a more globular protein, situated in the upper right of Fig. 1. The protein has the shape of an oblong disc, approximately 55 Å long, 45 Å wide, and only about 35 Å thick. Fig. 2 is a schematic ribbon drawing of the protein; the A and B chains have been separated slightly for clarity. The carboxyl-terminal 57 or so residues of the A chain fold into a protruding domain that interacts with the B chain by sliding into the cleft between the two B chain domains. A disulfide bond is formed between residue 259 of the A chain and residue 4 of the B chain. The amino-terminal 8-10 residues of the B chain form a long arm which interacts with the carboxyl-terminal domain of the A chain. As a result, the interchain disulfide is in an exposed region quite far removed from the main body of the B chain.
We now consider each chain in slightly greater detail. The A chain of ricin exhibits a substantial amount of secondary structure. It contains seven α helices totaling about 80 amino acid residues, meaning that about 30% of the protein is helical.
There also exists a reasonably pronounced, five-stranded β sheet composed of residues 57-116, although not all the residues in that stretch are directly involved in the sheet. In all, the A chain is about 15% β structure.
The A chain folds into three somewhat arbitrary domains, as shown schematically in Fig. 3. The amino-terminal 117 residues comprise the first domain (shown in Fig. 3a), which is dominated by the five-stranded β sheet. This flat domain forms the bottom of the A chain as we usually view it down the crystallographic a axis. The second domain (Fig. 3b) is composed of residues 118-210 and is dominated by five α helices. This domain generally sits over the first domain and slightly to the left. The only free sulfhydryl group in the protein, Cys-171, lies on the second helix in this domain. As expected, it bonds to the single methyl mercury of that isomorphous derivative. The third domain, residues 211-267 (shown in Fig. 3c) […]
Figure legend (fragment): The B chain, at the lower left, is an elongated, two-domain structure. The two lactose molecules bound to the protein are indicated at either end of the B chain. The view is directly down the crystallographic a axis, which minimizes overlap of the two chains.
[…] of A chain action. In fact, even the substrate is unknown, although the large subunit of eucaryotic ribosomes is the target (2). The protein does have a reasonably prominent cleft, however, as seen in the Cα drawing of Fig. 4. It is very likely that this is the active site cleft, or latent active site, since a conformational rearrangement may take place when the A chain is released from the B chain. As yet, no active site residues have been identified by chemical studies. Further work, perhaps by site-directed mutagenesis, will be required to define the role of residues in this vicinity. The putative active site cleft is created at the interface between all three domains. This cleft runs roughly horizontally and downward across the center of the A chain as viewed in Fig. 1. The cleft has good access to solvent in the heterodimer, suggesting that a conformational change may occur when A chain is released from the B chain since the heterodimer itself is inactive.
The B chain is a lectin with an affinity for galactosides. The protein has been known for some time to bind two galactosides in a noncooperative fashion (28,29). In one study the two sites had different galactoside affinities […] more nearly equal (29). Chemical studies have implicated Tyr-248 and an unknown Trp in the two binding sites (30,31). We have previously shown that the B chain is a gene duplication product, showing about 32% amino acid identity between the two halves (27). Each domain is built around a pair of disulfide loops, and we expected it to form two domains, each binding one lactose. This is indeed true, and the two domains have identical folding topologies, as can be seen in Fig. 2. The amino-terminal domain, consisting of residues 1-135, can be related to the carboxyl-terminal domain (residues 136-267) in this view by a rotation of roughly 90° counterclockwise around the axis normal to the page. The lactose binding site of each domain is shown as a pair of circles in Fig. 2.
The native crystals contain the disaccharide lactose, and the two lactose binding sites are quite clearly defined in the native electron density map. (In addition, we have produced a 4.5-Å difference density map between parent and lactose-free ricin.) The native electron density map for the carboxyl-terminal domain, with the bound lactose and several key residues superimposed, is shown in Fig. 5a. It is clear that even in this unrefined model, key residues can be positioned quite accurately.
The electron density shows that in each domain the galactose moiety of the bound lactose lies in a shallow pocket. […] A hydrogen bond also appears to be formed between Gln-35 and OH-6 of the galactose. The sugar binding arrangement for domain 1 is shown in Fig. 5b. In domain 2, Asn-255 is crucial to epimeric specificity, bonding to the galactose OH-4. The position of Asn-255 is stabilized by a hydrogen bond with Asp-234. A water molecule, bound to Asn-255, forms a hydrogen bond to OH-3 and is in a position analogous to Lys-40 of domain 1. There is no analog of Gln-35 in the domain 2 galactose binding site. The key elements of this binding site are shown in Fig. 5c.
It is interesting to note that the two domains of the B chain are the product of a gene duplication, have similar folding topologies, and bind galactose in similar ways; yet the residues involved in that binding are not directly related by the gene duplication. The residues binding galactose in the amino-terminal domain are found in the first third of the domain peptide, whereas those of the carboxyl-terminal domain are in the last third of that peptide. The implications of the two apparently nonhomologous binding sites in two clearly homologous domains for the evolution and function of B chain will be addressed in a separate paper.
Having described the topology of the ricin structure, we can now briefly address certain other questions. It is generally considered that the A and B chains are held together by nonpolar forces (7,32). The measurements of Lewis and Youle (7) suggest that the free energy of this association is roughly -10 kcal/mol. The free energy of moving various chemical groups from an aqueous to a nonpolar environment has been cataloged by Kauzmann (33). The movement of a methane, similar to an alanine side chain, is favorable by -2.6 kcal/mol, whereas movement of a benzene ring is favorable by -4.1 kcal/mol. Together, this information suggests that the ricin A-B interface need involve hydrophobic contacts between only relatively few residues. Our initial analysis shows only a few residues interacting, and they are largely the side chains of prolines and phenylalanines. For example, prolines 7 and 9 of B chain contact Pro-229 of A chain. Also, Pro-260 of B chain, together with phenylalanines 218 and 219, contacts Pro-250 of A chain.
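The arithmetic behind this argument can be sketched in a few lines of Python. The example divides the quoted association free energy by the quoted per-group transfer free energies, and also converts the association free energy into a dissociation constant. The conversion assumes a temperature of 25 °C and a 1 M standard state, neither of which is stated in the text, and the function names are hypothetical.

```python
# Sketch: how many buried nonpolar groups could account for about -10 kcal/mol of association.
# Assumptions (not from the paper): T = 298.15 K, 1 M standard state, additive group contributions.
import math

GAS_CONSTANT = 1.987e-3   # kcal mol^-1 K^-1

DG_ASSOCIATION = -10.0    # kcal/mol, A-B association (Lewis and Youle, as quoted)
DG_METHANE = -2.6         # kcal/mol, methane, water -> nonpolar (Kauzmann, as quoted)
DG_BENZENE = -4.1         # kcal/mol, benzene ring, water -> nonpolar (Kauzmann, as quoted)


def groups_needed(dg_total, dg_per_group):
    """Number of group transfers whose summed free energy equals dg_total."""
    return dg_total / dg_per_group


def dissociation_constant(dg_association, temperature_k=298.15):
    """Kd in mol/L from the standard free energy of association."""
    return math.exp(dg_association / (GAS_CONSTANT * temperature_k))


if __name__ == "__main__":
    print(f"methane-like groups: {groups_needed(DG_ASSOCIATION, DG_METHANE):.1f}")
    print(f"benzene-like rings:  {groups_needed(DG_ASSOCIATION, DG_BENZENE):.1f}")
    print(f"Kd at 25 C:          {dissociation_constant(DG_ASSOCIATION):.1e} M")
```

Under these assumptions, roughly four methyl-sized groups or two to three aromatic rings would suffice, which agrees with the small number of proline and phenylalanine contacts seen at the interface.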
It is also known that ricin is a glycoprotein (15,16). There are two glycosylation sites on the B chain and two potential sites on the A chain, although A chain is often incompletely glycosylated (34). The two carbohydrate moieties at positions 95 and 135 of B chain are very clear in our present map, and density for several sugars can be seen. There is electron density adjacent to each A chain site (10 and 236), but it is very weak, and we hope that improved phases from crystallographic refinement will allow us to visualize those sugars better.
Refinement will also facilitate detailed comparison of the ricin A chain with trichosanthin. Trichosanthin is a single-chain plant toxin, which has an amino acid sequence ~34% identical with A chain (35), for which the peptide backbone has been traced at 4 Å resolution (36). Our preliminary inspection shows that the overall folding pattern is similar to that of A chain, although some features of secondary structure appear to differ.
Finally, it is well known that the A chain of ricin is not enzymatically active unless it is released from the B chain (2); it is also known that the release exposes a nonpolar area on the A chain which may aid in membrane transport (37). It is quite likely that release from the B chain causes a conformational change in A chain which activates the enzyme function and also alters the nonpolar surface to facilitate membrane binding. We have recently crystallized A chain expressed from a gene cloned into Escherichia coli (38). The crystals diffract well, and a comparison between that structure and the A chain seen here in ricin may shed light on this putative conformational change.