Crystal Structure of Trimeric Carbohydrate Recognition and Neck Domains of Surfactant Protein A*

Surfactant protein A (SP-A), one of four proteins associated with pulmonary surfactant, binds with high affinity to alveolar phospholipid membranes, positioning the protein at the first line of defense against inhaled pathogens. SP-A exhibits both calcium-dependent carbohydrate binding, a characteristic of the collectin family, and specific interactions with lipid membrane components. The crystal structure of the trimeric carbohydrate recognition domain and neck domain of SP-A was solved to 2.1-Å resolution with multiwavelength anomalous dispersion phasing from samarium. Two metal-binding sites were identified, one in the highly conserved lectin site and the other 8.5 Å away. The interdomain carbohydrate recognition domain-neck angle is significantly less in SP-A than in the homologous collectins, surfactant protein D and mannose-binding protein. This conformational difference may endow the SP-A trimer with a more extensive hydrophobic surface capable of binding lipophilic membrane components. The appearance of this surface suggests a putative binding region for membrane-derived SP-A ligands such as phosphatidylcholine and lipid A, the endotoxic lipid component of bacterial lipopolysaccharide that mediates the potentially lethal effects of Gram-negative bacterial infection.

Pulmonary surfactant is a mixture of phospholipids and proteins that lines the distal airways and stabilizes the gas-exchanging alveolar units of the lung (1). Surfactant membranes are decorated with a calcium-dependent phospholipid-binding protein called surfactant protein A (SP-A) (2,3), one of the four known surfactant proteins. Reports that SP-A is required for the formation of tubular myelin (4,5) and other large surfactant aggregates (6) and for the preservation of low surface tensions in the presence of serum protein inhibitors (7) suggest that the protein serves multiple roles in surfactant function. However, recent evidence that SP-A is also a pulmonary host defense protein includes reports that SP-A binds to, aggregates, opsonizes, and permeabilizes microorganisms (8) and that the SP-A-deficient gene-targeted mouse is susceptible to lung infection with multiple organisms (9). One view that reconciles these apparently divergent functions of SP-A is that its high affinity for surfactant membranes is a mechanism for concentrating the protein at the front lines of defense against inhaled pathogens (8).
SP-A belongs to the structurally homologous family of innate immune defense proteins known as collectins (10,11), so named for their collagen-like and lectin domains. Collectins compose a subset of the C-type (calcium-dependent) lectins, a functionally diverse and phylogenetically ancient family of carbohydrate binding proteins (12). The binding of collectins to glycoconjugates on the surface of microorganisms results in enhanced clearance through a variety of mechanisms, including reduction of epithelial adherence, agglutination, and opsonization (13). All known collectin family members bind more avidly to mannose and glucose, which most microbial ligands contain, than to galactose, fructose, and sialic acid, which are common in glycoproteins from higher organisms. Three collectins are known in humans: pulmonary proteins SP-A, surfactant protein D (SP-D), and serum mannose-binding protein (MBP).
Structural details of more than a dozen C-type lectins (14-16), including SP-D (17) and MBP (18,19), have been characterized through x-ray crystallography. In the collectins, the conserved C-type lectin motif (20), known as the carbohydrate recognition domain (CRD), at the C terminus is linked to an extended collagen-like domain of tripeptide Gly-X-Y repeats. The collectin linker, an extended α-helical "neck," forms a coiled coil with the necks of two other monomers and serves as the nucleus for trimerization in the N-terminal direction through the collagen domain (21,22). Larger aggregates form by means of disulfide bond formation within a short N-terminal segment containing two cysteine residues (23). Electron micrographs show that SP-A and MBP assemble as six trimers arrayed in parallel and in register, resembling a bouquet of tulips (24), whereas SP-D is cruciform, formed by radial arrangements of four trimers (25).
SP-A and SP-D exhibit high affinity binding interactions with lipid as well as carbohydrate ligands. SP-A binds specifically to dipalmitoylphosphatidylcholine, the most abundant phospholipid species in surfactant (26), whereas SP-D binds to the minor surfactant component phosphatidylinositol (27,28). Both bind to the lipopolysaccharide moieties that decorate bacterial surfaces; SP-A recognizes the lipid A component of lipopolysaccharide (29,30), whereas SP-D binds to the core oligosaccharide and O-oligosaccharides (31). Electron micrographs and other data indicate that the globular CRD and collagen-like domains in both surfactant proteins are oriented proximal and distal, respectively, to the membrane surface (32,33). However, little is known of the molecular details of the membrane association.
The present study was undertaken to determine the three-dimensional structure of the SP-A CRD and neck domain and to address questions related to its functional properties, especially with respect to specific interactions with surfactant phospholipids and membrane components, such as the endotoxin lipid A, derived from pathogenic bacteria.

EXPERIMENTAL PROCEDURES
A recombinant rat SP-A containing only the amphipathic linking domain and the CRD and lacking the consensus for asparagine-linked glycosylation (ΔN1-P80,N187S) was synthesized in insect cells using baculovirus vectors, as described previously (34). The ΔN1-P80,N187S protein was purified by mannose-Sepharose affinity chromatography, dialyzed against 5 mM Tris and 50 mM NaCl, and concentrated to at least 3 mg/ml. Protein concentrations were determined with the bicinchoninic protein assay kit (BCA, Pierce) using bovine serum albumin as the standard. P6₃ crystals (a = 97.680 Å, c = 44.538 Å) containing one monomer per asymmetric unit were grown by vapor equilibration in sitting droplets at 17°C over mother liquor containing 1.6 M Li2SO4, 10 mM CaCl2, 7% (w/v) glycerol, 100 mM MES buffer, pH 6.5, with a 1:1 molar ratio of SP-A:trimannose. Crystallographic data were collected at the X8C beam-line at Brookhaven National Labs. Native and Sm MAD data sets were collected from the same crystal. For the native data set, an additional 5% (v/v) glycerol was added to ensure cryoprotection, and data were collected to 2.07 Å using a wavelength of 1.072 Å. The native crystal was then recovered and thawed into a 50-µL sitting drop of mother liquor/5% glycerol. To this droplet were added 0.5-µL aliquots of 200 mM SmCl3, layered on the surface to float and gradually diffuse through the drop down to the crystal. Five aliquots were added over a 1-h period, giving a final concentration of 10 mM SmCl3. The droplet was then allowed to sit undisturbed for 7 h. The Sm-soaked crystal was refrozen in the beam, and MAD data were collected at three wavelengths (Table I). The crystal still diffracted to 2.0 Å with good mosaicity (0.56 versus 0.65° for the native data set). However, the detector distance required to accommodate the long wavelength of the Sm edge gave only a 2.8-Å resolution at the edge of the imaging plate, limiting MAD phasing to that resolution.
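The stated final SmCl3 concentration can be checked with a short dilution calculation. The sketch below is illustrative only: the function name and the flag for including the added volume are hypothetical, and the only inputs taken from the text are the 200 mM stock, the five 0.5-µL aliquots, and the 50-µL drop. Evaporation and uptake by the crystal are ignored.

```python
def final_concentration_mM(stock_mM, aliquot_uL, n_aliquots, drop_uL,
                           include_added_volume=True):
    """Concentration in the drop after adding n aliquots of stock solution.

    Hypothetical helper for illustration; assumes ideal mixing and no
    evaporation.
    """
    added_uL = aliquot_uL * n_aliquots
    # Optionally neglect the small volume contributed by the aliquots.
    total_uL = drop_uL + added_uL if include_added_volume else drop_uL
    return stock_mM * added_uL / total_uL


# Values from the soaking protocol: 5 x 0.5 uL of 200 mM SmCl3 into 50 uL.
nominal_mM = final_concentration_mM(200.0, 0.5, 5, 50.0,
                                    include_added_volume=False)
exact_mM = final_concentration_mM(200.0, 0.5, 5, 50.0)

print(f"nominal: {nominal_mM:.2f} mM, volume-corrected: {exact_mM:.2f} mM")
```

Neglecting the added volume reproduces the nominal 10 mM; accounting for the extra 2.5 µL gives a slightly lower value of about 9.5 mM.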
Data were processed using DENZO and SCALEPACK (48). Phases were determined for the Sm MAD data with SOLVE (49), which located two Sm sites per asymmetric unit.
The solvent-flattened MAD-phased electron density map calculated at 2.8-Å resolution presented very well defined electron density in which secondary structure elements, particularly the three α-helices and longer β-strands, were clear and unambiguous. Secondary structure was fit manually in O (50). After the chain tracing was completed in O, the model was positionally refined using the Crystallography and NMR System software (51). The model was then further refined to 2.1-Å resolution against the native data using simulated annealing protocols. Iterative cycles of rebuilding and refinement, using difference Fourier and simulated annealing omit maps, were used to produce the final model. Broken density at the N terminus precluded unambiguous positioning of residues 81-83, which were eventually omitted. For the structure of the SP-A/samarium complex, 2.5-Å resolution data at the remote wavelength (λ = 1.771 Å) were reprocessed and refined against the fully refined coordinates of the native SP-A structure. The few differences between the two models as observed in composite omit and difference Fourier maps were localized to the metal ion binding loops, which were carefully rebuilt. In both final models (native SP-A and its complex with samarium), 92.7 and 7.3% of residues occur in the core and allowed regions, respectively, of the Ramachandran plot, as calculated in PROCHECK (52). Refinement statistics are presented in Table I. Figures (except for Fig. 1A) were prepared using Swiss PDB Viewer (53) and POVRAY. Fig. 1A was prepared using ALSCRIPT (54).

RESULTS AND DISCUSSION
The crystal structure is derived from the Δ1-80/N187S fragment, which encompasses the extended, helical neck region (residues 81-108) and the globular CRD (residues 111-228) of the SP-A monomer. The N187S substitution eliminates the only site of N-linked glycosylation in the CRD. The fragment forms a stable non-covalent trimer in solution (34). In the crystal, the monomers of the trimer are related by crystallographic symmetry.
The secondary structure of the CRD includes three α-helices and 11 short (3-7 residue) strands of β-sheet (Fig. 1). The overall fold of the CRD in SP-A is similar to that of SP-D and MBP. The main structural differences are restricted to the region that includes surface loops and calcium binding sites. Excluding the variable regions, which contain insertions or deletions, the root-mean-square deviation for Cα atoms in SP-A is 0.67 Å when compared with SP-D (over 104 atoms representing 92% of the structure) and 1.29 Å compared with MBP (96 atoms, or 85%). The neck region consists of a single amphipathic α-helix (residues 84-108) culminating in a reverse turn (residues 108-111) that leads into the first β-sheet of the CRD. The three N-terminal residues (81-83) of the fragment are disordered because of high mobility and are omitted from the structure. The neck regions of three monomers are intertwined around a 3-fold rotational symmetry axis to form the coiled-coil trimer (Fig. 1). The trimer is held together primarily by means of the hydrophobic face of the amphipathic helix from the neck of each monomer. This hydrophobic face is composed of the side chains of Leu-87, Leu-91, Ile-94, Ile-98, Thr-101, Met-102, and Leu-105. The reverse turn is situated around the three-fold symmetry axis such that the polar side chains of Gln-108 and Ser-110 form H-bonds with their counterparts in the other two monomers. Three intermolecular salt bridges linking Glu-90 and Lys-95 of adjacent monomers provide additional stabilization. Intramolecular contacts between the neck and CRD are minimal in the SP-A monomer.
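For readers unfamiliar with the root-mean-square deviation quoted above, the following minimal sketch shows how it is computed over a set of matched Cα positions. It assumes the two structures have already been superposed and that the atom pairing is known; it does not perform the superposition itself, and it is not the procedure used to obtain the values reported here. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD (in the input units) between two equal-length lists of
    (x, y, z) positions that are already superposed and paired."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy data: three Calpha atoms, each displaced by 1 A along z.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
mobile = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
toy_rmsd = rmsd(reference, mobile)
print(f"toy RMSD = {toy_rmsd:.2f} A")
```

Because every atom in the toy example is shifted by the same 1 Å, the RMSD is 1 Å; for real structures the value depends on which atoms are included, which is why the number of matched atoms is reported alongside each deviation.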
In the SP-A trimer, the interface between domains does not have the appearance of a freely mobile hinge region as it does in the monomer. Major contacts between the two domains are primarily intermolecular and mostly hydrophilic. In one such contact, the carboxylate group of the C terminus, Phe-228, of one monomer forms a salt bridge and an H-bond with the side chains of neck residues His-96 and Gln-100, respectively, of another monomer. All three residues are invariant in mammalian SP-A sequences. The neck and CRD domains are oriented nearly perpendicularly to each other in SP-A, in contrast to both SP-D and MBP, where the interdomain angles are greater (i.e. "T" versus "Y" shapes, respectively) (Fig. 2). This difference in tertiary structure likely plays a role in the recognition of different classes of macromolecular substrates by these otherwise homologous proteins: in particular, the extensive planar surface of SP-A is better suited for binding membrane lipids than that of the other collectins.
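The "T" versus "Y" distinction can be made concrete as the angle between a vector along the neck helix and a vector pointing from the neck-CRD junction toward the CRD. The sketch below is purely illustrative: the vectors are idealized, hypothetical directions rather than values measured from any of the crystal structures, and the convention (neck axis pointing toward the N terminus) is an assumption made here for the example.

```python
import math


def angle_deg(u, v):
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical idealized geometry: neck axis points "down" toward the N terminus.
neck_axis = (0.0, 0.0, -1.0)
crd_t_shape = (1.0, 0.0, 0.0)   # CRD perpendicular to the neck ("T")
crd_y_shape = (1.0, 0.0, 1.0)   # CRD tilted away from the neck ("Y")

t_angle = angle_deg(neck_axis, crd_t_shape)
y_angle = angle_deg(neck_axis, crd_y_shape)
print(f"T-like: {t_angle:.0f} deg, Y-like: {y_angle:.0f} deg")
```

Under this convention a perpendicular arrangement gives 90° and a tilted arrangement gives a larger angle, matching the qualitative description of a smaller interdomain angle in SP-A.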
Calcium Binding Sites-Crystallographic and solution studies have identified two classes of calcium-binding sites in collectin CRDs. SP-D and MBP crystal structures show a single, high affinity C-lectin type site (designated herein as the primary site), composed of highly conserved acidic or amide ligands from the protein. Lower affinity (designated auxiliary) sites are situated 8 and 12 Å from the primary site. The SP-A crystal structure was determined in the presence of calcium. One bound calcium ion was observed, and that is in the primary site. This site is very similar overall to those found in SP-D and MBP (Fig. 3). The SP-A protein donates five oxygen ligands: Glu-195 (side chain), Arg-197 (main chain), Asn-214 (side chain), and Asp-215 (main chain and side chain). All are highly conserved except for Arg-197, which is replaced by asparagine in MBP and SP-D. Yet the Arg-197 carbonyl oxygen assumes the coordination position that is, in the other two collectins, occupied by the asparagine side-chain carbonyl oxygen, thus preserving the common architecture of the site despite the substitution. The coordination of calcium via the main-chain oxygen in Arg-197 suggests that substitutions at this position (e.g. by alanine in human SP-A or by site-directed mutagenesis) will retain calcium binding capabilities even if other properties are affected (35). The present structure also shows that the highly conserved residue Pro-196 adopts the cis conformation associated with calcium binding, as demonstrated for MBP (36). Isomerization of the equivalent proline in MBP from trans to cis permits the flanking residues (equivalent to Glu-195 and Arg-197 in SP-A) to adopt the conformation required to coordinate the calcium ion (37).
In the lectin sites of collectins, the calcium coordination shell is completed by exogenous non-protein ligands (38). In complexes of MBP-mannose, two coordination positions are occupied by vicinal carbohydrate oxygens. In structures of MBP or SP-D in the absence of carbohydrate, the corresponding calcium ligands are solvent water molecules. In the SP-A crystal structure, mannose binding is not observed, despite its presence in the crystallization medium, presumably because of competition with the cryoprotectant glycerol (39). The electron density at these positions is poorly defined, however, and has been modeled in the present structure as two water molecules.
Although calcium binding only occurs at the C-lectin (primary) site in the SP-A/Ca2+ complex, an additional (auxiliary) site is revealed in the SP-A/Sm3+ complex. Samarium ions, which provided MAD phasing for the SP-A crystal structure, occupy both the primary site and an auxiliary site 8.5 Å away (Fig. 4A). The primary and auxiliary sites in SP-A are situated on opposite sides of the highly flexible 198-203 loop. The location and architecture of the SP-A auxiliary site differ from those observed in SP-D and MBP, which are homologous with each other but are not conserved in SP-A (Fig. 4B).
The crystallographic occupancy of the "auxiliary" Sm3+ ion is approximately half that of the "primary" Sm3+ ion, and its coordination shell contains fewer ligands. The lanthanide is bound by the main-chain carbonyl oxygen of Lys-201 and in a bidentate manner by the Glu-171 carboxylate moiety. The oxygen-Sm3+ bond lengths are all between 2.66 and 2.75 Å. It is conceivable that under ionic conditions more favorable for metal ion binding, additional oxygen ligands from neighboring residues could be recruited for metal ion coordination. The flexible 198-203 loop surrounds this auxiliary site, and a conformational change could bring the backbone oxygens of Gly-198, Gly-200, or Glu-202 into closer proximity to the metal ion.
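Assigning ligands to a metal site amounts to listing the oxygen atoms that lie within bonding distance of the ion. The sketch below illustrates that bookkeeping under stated assumptions: the 3.0-Å cutoff is an arbitrary choice for the example, the helper name is hypothetical, and the coordinates are invented so that the three distances fall in the 2.66-2.75 Å range quoted above. They are not taken from the deposited model.

```python
import math


def ligands_within(metal_xyz, atoms, cutoff=3.0):
    """Return (name, distance) pairs for atoms within `cutoff` angstroms
    of the metal position, sorted by increasing distance."""
    hits = []
    for name, xyz in atoms:
        d = math.dist(metal_xyz, xyz)
        if d <= cutoff:
            hits.append((name, d))
    return sorted(hits, key=lambda pair: pair[1])


# Hypothetical coordinates (angstroms), with the metal placed at the origin.
sm_auxiliary = (0.0, 0.0, 0.0)
candidate_oxygens = [
    ("Lys-201 O", (2.70, 0.0, 0.0)),
    ("Glu-171 OE1", (0.0, 2.66, 0.0)),
    ("Glu-171 OE2", (0.0, 0.0, 2.75)),
    ("distant O", (5.0, 0.0, 0.0)),  # beyond the cutoff, not a ligand
]

shell = ligands_within(sm_auxiliary, candidate_oxygens)
for name, d in shell:
    print(f"{name}: {d:.2f} A")
```

With a more permissive cutoff, or after a loop rearrangement that moves additional backbone oxygens closer, the same search would report a larger coordination shell, which is the scenario proposed for the 198-203 loop.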
The SP-A complexes with either calcium or samarium indicate that the primary site is filled before the auxiliary site, whereas the reverse is true of MBP (39). Metal ion binding in the SP-A auxiliary site affords some stabilization of the conformation of the flexible 198-203 loop. This stabilization in turn affects the primary metal binding site. The major difference between the calcium and samarium complexes at the primary site involves Glu-202. If metal ion is not bound at the auxiliary site, as in the SP-A complex with calcium, the Glu-202 side-chain position is not close enough to coordinate the primary metal ion. In contrast, in the SP-A complex with samarium, where both metal binding sites are occupied, the Glu-202 side chain can be clearly identified as one of the samarium-coordinating ligands. The finding that alanine substitution of Glu-202 blocks binding to calcium, phospholipid, and carbohydrate ligands is consistent with an important ligand-binding role for this residue (40,41) and suggests that the fully functional state contains calcium bound at both sites.
Solution binding studies have suggested that SP-A contains an additional calcium-binding site outside the CRD (42). The crystallographic data suggest that a potential location for this site is a distinctive anionic patch in the neck region. The patch is composed of Asp-84, Glu-85, and Glu-86, invariant residues in mammalian SP-A sequences that are not conserved in SP-D or MBP. These acidic residues are not involved in intra- or intermolecular contacts, suggesting they may play a more specialized role in SP-A function. Neither calcium nor lanthanide is observed crystallographically to bind at this location; however, it is possible that further oligomerization or macromolecular ligand binding may be required to complete a functional calcium-binding site at this location.
Lipid Binding by SP-A-Mutagenesis studies have revealed that the phospholipid binding domain of SP-A overlaps with the carbohydrate binding domain (40,43-45). Findings that SP-A discriminates between saturated and unsaturated acyl chains and exhibits specificity for the choline head group (26) imply that discrete binding sites exist for these moieties on the SP-A molecule. Surface electrostatics calculations (Fig. 5) of SP-A and SP-D show that, as might be expected, calcium binding neutralizes some of the negative charge observed in the protein surfaces when calcium is omitted from the model. However, when calcium is included in the model, the SP-A CRD surface is considerably less polar than that of SP-D. This distinction may be consistent with the preference of SP-A for hydrophobic ligands.
In considering possible epitopes on SP-A for phospholipid binding, it is useful to note that many of the short stretches of highly variable sequence (Fig. 1) that distinguish the collectins from each other occur on one face of the CRD surface, proximal to the C-lectin binding site. One such sequence is Y-N-N-Y (residues 161-164). The Tyr-161 ring, together with those of Tyr-164, Tyr-208, and Tyr-192, forms a strip of aromatic rings across the CRD surface (Fig. 6) that could serve as a binding site for lipid acyl chains. None of these four residues is conserved in SP-D or MBP (Fig. 1A), which lack this extended hydrophobic surface. It is interesting that in the SP-A crystal structure, a MES buffer molecule binds so that its morpholino ring stacks over that of Tyr-164. The zwitterionic MES molecule bears a striking resemblance to phosphocholine in its overall size, shape, and charge (Fig. 6, inset), and MES binding at this location might be indicative of choline binding near Tyr-164. However, this suggestion remains speculative until structural information can be obtained from SP-A lipid complexes.
Further evidence supporting a role for this hydrophobic strip in SP-A/lipid interactions is provided by the crystal structure of the bacterial outer membrane protein, FhuA, bound to lipopolysaccharide (46) (PDB accession code 1PFC). In this complex, the acyl chains of the lipid A portion of lipopolysaccharide bind noncovalently to the FhuA surface via a very similar aromatic region composed of four phenylalanine side chains (analogous to Tyr-161, Tyr-164, Tyr-208, and Tyr-192 in SP-A). As in SP-A, the aromatic residues in FhuA belong to discontinuous β-sheet structure forming the floor of the lipid A-binding surface. Numerous electrostatic interactions are formed between the phosphorylated glucosamine and other polar moieties of lipid A and basic or amine side chains on FhuA (47). These include a cluster composed of two arginines and a glutamine (residues 382, 384, and 353, respectively). In SP-A, a similarly situated cluster is created by the side chains of Arg-216, Arg-222, and Gln-220, and additional basic residues are present in the vicinity. Although these similarities do not presuppose any evolutionary relationship between the two proteins, they point to possible regions on SP-A that could mediate high affinity binding of the common substrate, lipid A.