Solution structure of the fibronectin type III domain from Bacillus circulans WL-12 chitinase A1.

Growing evidence suggests that horizontal gene transfer plays an integral role in the evolution of bacterial genomes. One of the debated examples of horizontal gene transfer from animal to prokaryote is the fibronectin type III domain (FnIIID). Certain extracellular proteins of soil bacteria contain an unusual cluster of FnIIIDs, which show sequence similarity to those of animals and are likely to have been acquired horizontally from animals. Here we report the solution structure of the FnIIID of chitinase A1 from Bacillus circulans WL-12. To the best of our knowledge, this is the first tertiary structure to be reported for an FnIIID from a bacterial protein. The structure of the domain shows significant similarity to FnIIIDs from animal proteins. Sequence comparisons with FnIIIDs from other soil bacteria proteins show that the core-forming residues are highly conserved and, thus, are under strong evolutionary pressure. Striking similarities in the tertiary structures of bacterial FnIIIDs and their mammalian counterparts may support the hypothesis that the evolution of the FnIIID in bacterial carbohydrases occurred horizontally. The total lack of surface-exposed aromatic residues also suggests that the role of this FnIIID is different from those of other bacterial beta-sandwich domains, which function as carbohydrate-binding modules.

The fibronectin type III domain (FnIIID) is one of the most common folds in modular proteins. It was initially characterized in fibronectin, and since then has been found in ~2% of all animal proteins. Although most of these are extracellular proteins, FnIIIDs are also found in membrane receptor proteins as well as in intracellular proteins (1). X-ray crystallography (2-8) and NMR spectroscopy (9-12) have been used to elucidate the structures of several animal FnIIIDs, all of which adopt Greek key β-sandwich folds built from a three-stranded and a four-stranded sheet, consisting of 80-100 amino acid residues in total. The functions of many FnIIIDs are still unclear. Each FnIIID comprises domain-intrinsic and domain-specific regions (13). The former, made up of relatively conserved residues, are responsible for forming the FnIIID scaffold, which comprises a hydrogen-bond network and a hydrophobic core. The scaffold is common to all FnIIID structures and endows the domain with its mechanical extensibility under tension and its high refolding speed (14,15). By contrast, the domain-specific regions are formed by exposed residues that are not well conserved across the FnIIID family. These residues often form the recognition site for the FnIIID of an interacting partner protein (4,9).
FnIIIDs have also been found in a restricted set of carbohydrases from soil bacteria (16,17). It is well known that, unlike eukaryotes, bacteria acquire a significant proportion of their genetic diversity through foreign sequences from distantly related organisms (18,19). In particular, soil bacteria are noted for their mosaic genomes that reflect extensive recombination (20). The domains occur in different locations and in different numbers in the carbohydrases of bacteria; in addition, the bacteria that possess FnIIIDs appear to be broadly distributed between Gram-positive and Gram-negative bacteria. As FnIIIDs appear sporadically in bacterial phylogenetic trees and have a high sequence similarity to those of animals, the presence of this domain in bacteria is regarded as the most convincing example of horizontal gene transfer from animal to prokaryote (17,21).
The first bacterial FnIIID to be reported was found in chitinase A1 from Bacillus circulans WL-12 (16), the structure of which is described in this paper. B. circulans WL-12 is a Gram-positive soil bacterium and was identified through its lysis of the cell walls of yeast and fungi. To hydrolyze the β-1,4-glycosidic linkages of chitin, this bacterium uses three enzymes, chitinase A1 (ChiA1), C1 (ChiC1), and D1 (ChiD1) (22). All three chitinases adopt multidomain structures. ChiA1 consists of an N-terminal catalytic domain, two FnIIIDs, and a C-terminal chitin-binding domain (Fig. 1); ChiD1 is made up of an N-terminal chitin-binding domain, an FnIIID, and a C-terminal catalytic domain; ChiC1 comprises a catalytic domain and a C-terminal portion with no apparent sequence similarity to other known proteins. Among these, ChiA1 is known as the key enzyme of chitin degradation and exhibits the highest enzymatic activity for both insoluble and soluble chitin (23).
Except for the FnIIIDs, the tertiary structures of the domains of ChiA1 have been solved by x-ray crystallography and NMR. The crystal structure of the catalytic domain reveals an (α/β)8 TIM-barrel fold that is common to class 18 glycosyl hydrolases (24). Two exposed tryptophan residues (Trp-122 and Trp-134) of the catalytic domain (CatD ChiA1) are thought to play an important role in the hydrolysis of crystalline chitin. We recently reported the solution structure of the chitin-binding domain of ChiA1 (ChBD ChiA1), which can bind to chitin substrates in an insoluble or crystalline state (25). Although the overall topology of ChBD ChiA1 is similar to that of the cellulose-binding domain from a bacterial cellulase, the location of exposed hydrophobic residues that are proposed to be important for substrate binding differs between the chitin- and cellulose-binding domains, indicating a difference in the mode of substrate binding.
The catalytic and chitin-binding domains are tethered by two FnIIIDs (each comprising 86 residues). The N-terminal FnIIID (1 FnIIID ChiA1; residues Ala-464 to Thr-549) and the other FnIIID (2 FnIIID ChiA1; residues Ala-559 to Thr-644) are linked by a short sequence (9 residues) and share 74.4% sequence identity with each other. The highest sequence identity between 2 FnIIID ChiA1 and an animal FnIIID is 34%. It has been reported that the deletion of 2 FnIIID ChiA1 has no impact on the chitin-binding activity of ChiA1 but causes a significant decrease in the colloidal chitin-hydrolyzing activity (26). The natural function of these FnIIIDs, however, remains unclear.
Although there has been a rapid increase in the amount of structural and functional information on animal FnIIIDs during the past several years, the complete lack of three-dimensional structural information for bacterial FnIIIDs has obscured the evolutionary relationship of FnIIIDs. The precise structural organization of the FnIIID is crucial to our understanding of how the domain functions, as well as to the identification of evolutionary relationships. Here we report the solution structure of 2 FnIIID ChiA1 solved by multidimensional NMR spectroscopy. To the best of our knowledge, this is the first report of an FnIIID structure from a bacterial protein. We also model the structure of 1 FnIIID ChiA1 by using a homology modeling algorithm. These structures provide additional evidence in support of the hypothesis that the fibronectin type III domains of bacteria and animals may be related by a horizontal gene transfer process.

EXPERIMENTAL PROCEDURES
Sample Preparation-Recombinant 2 FnIIID ChiA1 was obtained by expressing a His-tagged protein in Escherichia coli BL21(DE3) cells, followed by affinity purification on a Ni2+-chelating column, cleavage with Factor Xa, and final purification by gel filtration. After cleavage with Factor Xa, vector-derived His and Met residues remain at the N terminus as His-557/Met-558-2 FnIIID ChiA1. Uniform 15N-labeling was achieved by growing the bacteria in M9 minimal medium containing 15NH4Cl as the sole nitrogen source; for uniformly 15N- and 13C-labeled samples, 13C6-glucose was used as the sole carbon source. Typical yields were 2 mg of pure protein/liter of bacterial culture. Most NMR experiments were performed with 1.0-1.5 mM 2 FnIIID ChiA1 samples at 310 K and pH 6.5 (20 mM potassium phosphate, 50 mM KCl, 2 mM Pefabloc®) using H2O/D2O 9:1 (v/v) as the solvent. Homonuclear two-dimensional NOESY and TOCSY, and three-dimensional HCCH-TOCSY and 1H-13C NOESY-HSQC spectra were obtained with samples dissolved in D2O.
NMR Spectroscopy-NMR spectra were acquired with a Bruker DMX500, DRX500, or DRX800 spectrometer equipped with a pulsed-field-gradient triple-resonance probe. Assignment of the 1H, 15N, and 13C resonances of 2 FnIIID ChiA1 was obtained using two-dimensional TOCSY, NOESY, 1H-15N HSQC, 1H-13C HSQC, three-dimensional 1H-15N TOCSY-HSQC, and a series of triple-resonance experiments incorporating pulsed field gradients, water flip-back pulses, and sensitivity enhancement when amide protons were detected in the F3 dimension: CBCACONH, CBCANH, HNCO, CCONH, HCCONH, and HCCH-TOCSY spectra (27). The two-dimensional NOESY (150-ms mixing time), three-dimensional 1H-15N NOESY-HSQC (100- and 150-ms mixing times), and three-dimensional 1H-13C NOESY-HSQC (150-ms mixing time) spectra were also used to derive distance restraints for structure determination. To identify the hydrogen-bond pattern, a 3hJ(HNCO) HNCO experiment was performed (28). The data on DMX 500 and DRX 500 machines were acquired with 512* (

Structure Calculations-Initially, structure calculation and NOE peak assignment were performed in an iterative and manual manner using the program DYANA (version 1.5) (31). NOE cross-peak intensities were classified as strong, medium, or weak and assigned to restraints of 1.8-3.0, 1.8-4.0, or 1.8-5.0 Å, respectively. On the basis of the 3hJ(HNCO) HNCO spectrum, hydrogen-bond restraints were applied as 2.5-3.3 Å for N-O pairs and 1.8-2.5 Å for HN-O pairs and were used from the initial procedure. Backbone torsion angle restraints were derived from 3J(HNHα) coupling constants measured by HMQC-J (27) and from the TALOS program (32). The backbone φ angle restraints used were −65 ± 25° for 3J(HNHα) < 4.7 Hz, −120 ± 40° for 3J(HNHα) > 8.5 Hz and < 9.9 Hz, and −120 ± 20° for 3J(HNHα) > 10.0 Hz. If the predicted TALOS angle was within the range derived from 3J(HNHα), the TALOS-derived angle was used for restraints. The χ1 torsion angles of Tyr, Phe, and Trp were estimated from 3J(C'Cγ) and 3J(NCγ) coupling constants (33).
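The restraint rules stated above can be summarized as a small lookup. The following is a minimal sketch, not the DYANA or ARIA input format; the function names `noe_restraint` and `phi_restraint` are hypothetical, and coupling constants that fall outside the ranges given in the text simply yield no restraint.

```python
from typing import Optional, Tuple

# Distance bounds (in Angstrom) for the three NOE intensity classes in the text.
NOE_BOUNDS = {
    "strong": (1.8, 3.0),
    "medium": (1.8, 4.0),
    "weak": (1.8, 5.0),
}


def noe_restraint(intensity_class: str) -> Tuple[float, float]:
    """Return (lower, upper) distance bounds for an NOE intensity class."""
    return NOE_BOUNDS[intensity_class]


def phi_restraint(j_hz: float) -> Optional[Tuple[float, float]]:
    """Return (center, tolerance) in degrees for the backbone phi angle.

    Only the three ranges given in the text are covered; any other
    coupling constant returns None (no restraint).
    """
    if j_hz < 4.7:
        return (-65.0, 25.0)
    if 8.5 < j_hz < 9.9:
        return (-120.0, 40.0)
    if j_hz > 10.0:
        return (-120.0, 20.0)
    return None
```

For example, `phi_restraint(9.0)` gives a restraint of −120 ± 40°, whereas an intermediate value such as 6.0 Hz gives none.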
Final refinement employing ambiguous NOE assignments and floating chirality assignments was performed using the program ARIA (version 1.0) and a CNS package as described (34). A total of 150 structures were refined in the last (ninth) iteration, and the 30 lowest energy structures were analyzed using MOLMOL (35), AQUA, and PROCHECK-NMR software (36).
Homology Modeling and Sequence Alignments-The structure of 1 FnIIID ChiA1 was modeled using the 2 FnIIID ChiA1 structure as a template. 1 FnIIID ChiA1 shows 74.4% amino acid sequence identity with 2 FnIIID ChiA1, and there are no gaps in the alignment. The MODELLER program (37) was used for modeling, and the quality of the model was assessed using PROCHECK. Homologous bacterial sequences of 2 FnIIID ChiA1 were obtained from the SMART (smart.embl-heidelberg.de) server (38). To date (July 2001), 135 bacterial FnIIIDs from 102 proteins are registered in the SMART server. Among these, we chose only FnIIIDs from proteins whose functions are clearly identified as bacterial glycosyl hydrolases. Initially there were 83 FnIIIDs from 61 proteins. These sequences were filtered manually to retain only one FnIIID sequence in the case of a protein with multiple FnIIIDs. Species redundancies were also considered. For comparison, three FnIIIDs from B. circulans were added. Finally, 21 FnIIID sequences from 20 carbohydrases from 18 bacterial genera were used in multiple sequence alignments using ClustalW with default parameters (39). CHROMA software was used for analyzing and annotating the results from sequence alignments (40).
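Because the two ChiA1 FnIIIDs align without gaps, their sequence identity reduces to a position-by-position comparison. The sketch below illustrates that calculation; `percent_identity` is a hypothetical helper, and the toy sequences are placeholders rather than the real domain sequences. The figure of 64 identical positions is inferred here from 74.4% of 86 residues and is not stated in the text.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two ungapped, equal-length aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Toy 86-residue sequences that agree at 64 positions (placeholders only).
toy_a = "A" * 86
toy_b = "A" * 64 + "G" * 22
print(round(percent_identity(toy_a, toy_b), 1))  # prints 74.4
```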

RESULTS
Resonance Assignment, Restraints for Structure Calculation, and Structure Determination-The sequence-specific NMR backbone assignment was obtained from CBCACONH and CBCANH spectra. Except for the signals from the vector-derived N-terminal His and Met residues, which were not observed in 1H-15N HSQC, all the backbone signals were assigned (Fig. 2). Side-chain assignment was achieved using mainly CCONH, HCCONH, and HCCH-TOCSY spectra. Tyrosine, phenylalanine, and tryptophan side-chain chemical shifts were assigned from two-dimensional NOESY and TOCSY spectra. To help assignment, we used programs written in-house using the chemical shift statistics of the BioMagResBank data base (www.bmrb.wisc.edu) and the semi-automatic assignment module of NMRView. A nearly complete assignment of 2 FnIIID ChiA1 was achieved. This result has been deposited in the BioMagResBank under accession no. 5178. To extract distance restraints, we used NOESY spectra with 100- and 150-ms mixing times. No severe spin diffusion effects were found in the spectrum with a 150-ms mixing time. We also measured a 3hJ(HNCO) HNCO spectrum to obtain hydrogen-bond information directly. By comparing HNCO and 3hJ(HNCO) HNCO spectra, we identified 23 inter-strand hydrogen bonds between the β-strands (Fig. 3). Together with 117 torsion angle restraints, these hydrogen-bond restraints (23 × 2) were applied from the initial step of structure calculation. Structure calculation with DYANA was repeated by adding distance information derived from NOE cross-peaks identified manually, until an ensemble of the calculated structures gave the global fold. When the calculated structure had converged enough to identify the global fold, the ARIA procedure was employed for further refinement using ambiguous NOE cross-peaks. This procedure is essentially the same as that described previously (34).
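Each detected hydrogen bond contributes two distance restraints, one for the N-O pair and one for the HN-O pair, which is why 23 bonds give 46 restraints. A minimal sketch of that bookkeeping follows; the tuple layout and the name `hbond_restraints` are assumptions made for illustration, and the residue pairs in the usage lines are hypothetical placeholders, not the bonds shown in Fig. 3.

```python
from typing import List, Tuple

# Bounds (in Angstrom) stated in the text for the two restraints per bond.
N_O_BOUNDS = (2.5, 3.3)
HN_O_BOUNDS = (1.8, 2.5)

Restraint = Tuple[int, str, int, str, float, float]


def hbond_restraints(pairs: List[Tuple[int, int]]) -> List[Restraint]:
    """Expand (donor_residue, acceptor_residue) pairs into distance restraints."""
    restraints: List[Restraint] = []
    for donor, acceptor in pairs:
        # One restraint between the donor amide nitrogen and the acceptor oxygen.
        restraints.append((donor, "N", acceptor, "O", *N_O_BOUNDS))
        # One restraint between the donor amide proton and the acceptor oxygen.
        restraints.append((donor, "HN", acceptor, "O", *HN_O_BOUNDS))
    return restraints


# Hypothetical placeholder pairs, used only to show the 23 x 2 count.
placeholder_pairs = [(560 + i, 580 + i) for i in range(23)]
all_restraints = hbond_restraints(placeholder_pairs)
```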
The structure of 2 FnIIID ChiA1 was determined from the distance and torsional angle restraints listed in Table I.
Sequence Alignments of Bacterial FnIIIDs-Of the 61 FnIIID-containing bacterial carbohydrases that were initially retrieved from the SMART data base, 21 are from the genus Streptomyces and 16 are from the genus Bacillus. For a comparison of FnIIIDs across genera, we selected only one FnIIID from each genus. As the sequences of FnIIIDs from the same enzyme tend to be more similar to each other than to FnIIIDs from other proteins, we also selected only one FnIIID sequence per enzyme. If all the FnIIID sequences were used in the alignment, then the apparent similarity between them and the proportion of those from Gram-positive bacteria increased. All data concerning acronyms, enzymatic function, organism name, Gram staining, and order/total number are listed in Table II (Fig. 6). The FnIIID from Yersinia enterocolitica (O68975) exhibited the lowest sequence identity (32.6%) to 2 FnIIID ChiA1. The lengths of the aligned sequences are nearly the same except for those of Erwinia chrysanthemi (PEHX_ERWCH) and O68975, which contain ~20 more amino acids in their C'E loops than the others. A comparison of bacterial FnIIID sequences with animal FnIIID sequences indicates that the C-terminal 20 residues are more conserved in the bacterial sequences.
Structure Description-The structure of 2 FnIIID ChiA1 is a canonical β-sandwich structure containing two antiparallel β-sheets that are packed face to face (Fig. 4B). One sheet is composed of three β-strands (A, B, and E), whereas the other is composed of four β-strands (C, C', F, and G). Strand G is divided into two sections, G1 and G2, as in animal FnIIIDs. Three loops in the direction of the N terminus (loops BC, C'E, and FG) and three in the direction of the C terminus (loops AB, CC', and EF) connect the seven β-strands (Fig. 4B).
The β-sandwich scaffold of 2 FnIIID ChiA1 is stabilized by an extensive hydrogen-bond network between the β-strands and by a hydrophobic core formed by inward-facing residues from the β-sheets. It is well known that animal FnIIIDs contain several β-bulge structures on the edge strands A, C', and G. In 2 FnIIID ChiA1, three β-bulge structures, at residues Asn-565/Leu-566 at the beginning of strand A, Thr-569/Ala-570 in the middle of strand A, and Ala-601/Thr-602 within strand C', could be unambiguously identified by the direct detection of hydrogen bonds through scalar couplings. This method naturally revealed the hydrogen-bond network that defines the strand topology, as shown in Fig. 3, demonstrating its utility. The hydrogen-bond network shows that strand G is divided into two segments, G1 and G2, as is commonly seen in animal FnIIIDs.
The residues that are conserved in bacterial FnIIIDs and form the hydrophobic core are well defined in the final structures (Fig. 5). The following residues are totally buried in the core, with a solvent-accessible surface area of less than 7%: Pro-563, Ile-576, Leu-578, Trp-580, Ser-583, Tyr-592, Val-594, Ala-609, Ile-611, Phe-622, Val-624, Ala-626, Ser-637, and Val-642. The hydrophobic core is made up of two clusters in which three aromatic residues play a central role. One of the clusters contains Trp-580 and Tyr-592 surrounded by Pro-563, Ser-583, Val-594, Val-624, and Ala-626. These aromatic residues are nearly completely conserved, even in animal FnIIIDs, suggesting that this cluster is integral for maintaining the FnIIID fold. The other cluster is made up of Phe-622 surrounded by Ile-576, Leu-578, Ile-611, and Val-642. The hydrophobic properties of these residues are also highly conserved across bacterial and animal FnIIIDs (Figs. 5 and 6).
For animal proteins, some FnIIIDs display a variety of binding modes with other proteins using combinations of the loop regions. For example, human fibronectin binds to integrin through the RGD site on the FG loop and the PXSRN site on the C'E loop, which form interfacial surfaces (4,9). The importance of the loop sequence for molecular recognition is also demonstrated by an artificially engineered FnIIID with altered sequences on the BC and FG loops that binds ubiquitin with high affinity (41). In bacterial FnIIIDs, the AB, EF, and FG loops are relatively well conserved in length and sequence. In contrast, the BC, CC', and C'E loops are variable (Fig. 6).

In addition to the loop regions, some exposed residues in the β-sheets play a role in molecular recognition in some proteins. In human growth factor and human tissue factor receptor, the association of successive domains in the FnIIID-FnIIID segment creates charged surfaces around the domain boundaries, which serve as binding sites for their ligands (2,5). In contrast, the surface of 2 FnIIID ChiA1 is mostly covered by noncharged residues, as shown in Fig. 7A. 2 FnIIID ChiA1 has only four negatively charged (Asp-585, Asp-593, Asp-617, and Asp-628) and three positively charged (Lys-625, Lys-627, and Lys-643) residues on the surface, which are sparsely distributed and do not form a noticeable charged patch.
To analyze the charge status of the 1 FnIIID ChiA1-2 FnIIID ChiA1 module, we modeled the structure of 1 FnIIID ChiA1. As there are no gaps in the sequence alignment and a high sequence identity (74.4%) between 2 FnIIID ChiA1 and 1 FnIIID ChiA1, homology modeling of the 1 FnIIID ChiA1 structure was quite straightforward. The r.m.s.d. between the modeled 1 FnIIID ChiA1 and 2 FnIIID ChiA1 is 1.173 Å over the Cα coordinates of Ala-559 to Thr-644 in 2 FnIIID ChiA1. Contacts between the hydrophobic core-forming residues are nearly identical between the two FnIIIDs. All the backbone angles of the modeled structure, except for those of glycines, are in the allowed regions of the Ramachandran plot.
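For reference, an r.m.s.d. over Cα coordinates of the kind quoted above is the square root of the mean squared distance between equivalent atoms. The sketch below assumes that the two coordinate sets are already optimally superimposed and listed in the same residue order; it does not perform the superposition itself, and `ca_rmsd` is a hypothetical name rather than a function of MODELLER or MOLMOL.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(coords_a: Sequence[Coord], coords_b: Sequence[Coord]) -> float:
    """Root mean square deviation between two pre-superimposed coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between equivalent C-alpha atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Applied to the 86 equivalent Cα atoms of the two domains, a function of this form would return a single value in Å comparable to the figure reported in the text.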
The electrostatic potential map on the surface is drawn in Fig. 7B. The modeled structure of 1 FnIIID ChiA1 reveals that charged groups are sparsely distributed on the protein surface with no noticeable charged patches, as seen for 2 FnIIID ChiA1 . Modeling of the 1 FnIIID ChiA1 -2 FnIIID ChiA1 module using the coordinates of 1 FnIIID ChiA1 and 2 FnIIID ChiA1 at a variety of inter-domain orientations suggests that, unlike human tissue factor and human growth factor receptor, the two domains are unlikely to produce a cluster of charged side chains at the domain boundary (data not shown).
Low Sequence Complexity-A noteworthy feature of 2 FnIIID ChiA1 is its unusual amino acid composition. The sequence of this domain is rich in amino acids with short side chains. It has 20 threonine, 16 alanine, and 10 serine residues, and these three types of amino acid make up over 50% of the residues of this domain. Owing to the biased amino acid composition, a BLAST sequence homology search run without a filtering option retrieves, with high probability, proteins with low complexity such as antifreeze proteins (threonine- and alanine-rich) or mucin (threonine- and serine-rich). Although proteins with low sequence complexity are more likely to generate relatively extended structures (42), 2 FnIIID ChiA1 adopts a globular and compact fold. The molecular mass of 2 FnIIID ChiA1 (Ala-559 to Thr-644) is 8423.1 daltons, giving an average mass per residue of 97.9 daltons, which is lower than 99% of the protein sequences contained in SWISS-PROT (43). This high content of small side chains probably caused the relatively small difference between backbone and heavy atom r.m.s.d. values in the final 30 NMR structures. A high content of small side chains is generally found in bacterial FnIIIDs, but not in animal FnIIIDs.
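The composition figures above follow from simple arithmetic on the values given in the text, as the short check below illustrates. The variable names are illustrative only; all numbers are taken from the paragraph above.

```python
# Domain boundaries from the text: Ala-559 to Thr-644, inclusive.
DOMAIN_LENGTH = 644 - 559 + 1

# Residue counts stated in the text.
small_counts = {"Thr": 20, "Ala": 16, "Ser": 10}

# Fraction of the domain made up of Thr, Ala, and Ser.
small_fraction = sum(small_counts.values()) / DOMAIN_LENGTH

# Average mass per residue from the stated molecular mass (daltons).
mass_per_residue = 8423.1 / DOMAIN_LENGTH

print(DOMAIN_LENGTH)                     # prints 86
print(round(100 * small_fraction, 1))    # prints 53.5
print(round(mass_per_residue, 1))        # prints 97.9
```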
Most of the threonine residues are located on the surface of 2 FnIIID ChiA1 (Fig. 7C) and, thus, have little effect on its global fold. Most threonines are not highly conserved in bacterial FnIIIDs; instead, small residues (Ala, Cys, Ser, Thr, Asp, Asn, Val, Gly, and Pro) occur with a probability greater than 80% at the positions corresponding to threonines in 2 FnIIID ChiA1. These residues presumably contribute to the solvent-accessible surface of each molecule and, thus, may be important functionally.

DISCUSSION
The Residues under Evolutionary Constraints Are Similar between the Bacterial and Animal FnIIIDs-The NMR solution structure of 2 FnIIID ChiA1 and the multiple sequence alignment of bacterial FnIIIDs show that these domains are surprisingly similar to each other despite the broad and sporadic distribution of the bacteria containing them. Residues that play important roles in scaffold formation are totally conserved. Interestingly, the properties of amino acids that are presumably under weak evolutionary pressure, such as residues on loops or exposed on β-sheets, are also preserved, as shown in Figs. 5 and 6. Although we cannot exclude the possibility that the threonines on the surface of 2 FnIIID ChiA1 have specialized functions, such as those of antifreeze proteins (45,46), it seems more reasonable to think that these domains evolved recently from a domain that was rich in amino acids with small side chains and that these amino acids have not been substituted much, considering the relatively high content of light amino acids in 2 FnIIID ChiA1 compared with that in other bacterial FnIIIDs.
So where did the bacterium acquire FnIIID initially? Because of the high sequence similarity between FnIIIDs of bacteria and animals, it was suggested that the bacterial FnIIID was transferred across phyla horizontally, and this hypothesis has been widely accepted through phylogenetic analysis (17,21). Our structural information for 2 FnIIID ChiA1 may increase the evidence in support of this concept. There is good correlation in hydrophobic core-forming residues between bacterial and animal FnIIIDs, and structural details such as ␤-bulges at the edge strands are also common in both bacterial and animal domains. To the best of our knowledge, it is very rare to find bacterial and vertebrate proteins that share such common features.
The principle that protein evolution is determined mainly by constraints on activity, specificity, folding, and stability is generally accepted (47,48). The key residues that play roles in preserving the nature of the protein (packing, hydrogen bonding, or unusual dihedral angles) are inclined to be strongly conserved in property, if not in identity, and can be used to measure evolutionary distance. From this point of view, if bacterial FnIIIDs correlate with animal FnIIIDs because of horizontal DNA acquisition, the residues under strong evolutionary pressure will be more highly conserved between closer relatives.
As a BLAST search indicates that the sequence of the FnIIID from titin proteins, which are intracellular members of the FnIIID-containing family, is most similar to that of 2 FnIIID ChiA1 with an amino acid identity of 34%, we use a titin FnIIID as an example. Fortunately, the tertiary structure of a titin module from human cardiac muscle (FnIIID 1BPV; Protein Data Bank code, 1BPV) has been reported (10). FnIIID 1BPV, which shares a sequence identity of 32% with 2 FnIIID ChiA1 and shows a Cα r.m.s.d. of 1.9 Å over 81 corresponding residues, reveals a very similar hydrophobic-core packing. Of the 14 residues of 2 FnIIID ChiA1 with a solvent-accessible surface area of less than 7%, 11 have counterparts in FnIIID 1BPV with the same hydrophobic feature, including 8 identical residues: Pro-563, Leu-578, Trp-580, Tyr-592, Val-594, Phe-622, Ala-626, and Ser-637 in 2 FnIIID ChiA1 numbering (Fig. 8).
Our argument for a close relationship between FnIIIDs from bacteria and animals is reinforced by the discovery of FnIIIDs in other kingdoms. Powerful searching tools, such as the hidden Markov model approach, have recently revealed new FnIIIDs from yeast and plants (49,50), despite low sequence similarities with bacterial FnIIIDs (~15% identity with the sequence of 2 FnIIID ChiA1). Of these, the crystal structure of the

FIG. 6. Multiple sequence alignment of bacterial FnIIIDs. Amino acid residues with an 80% consensus are colored. Classification of the consensus amino acid type is according to that used in the program CHROMA as follows: + is applied for positively charged residues (His, Lys, and Arg), − for negatively charged residues (Asp and Glu), a for aromatic residues (Phe, His, Trp, and Tyr), h for hydrophobic residues (Ala, Cys, Phe, His, Ile, Leu, Met, Val, Trp, and Tyr), l for aliphatic residues (Ile, Leu, and Val), * for alcohol residues (Ser and Thr), p for polar residues (Asp, Glu, His, Lys, Asn, Gln, Arg, Ser, and Thr), s for small residues (Ala, Cys, Ser, Thr, Asp, Asn, Val, Gly, and Pro), t for tiny residues (Ala, Gly, and Ser), and uppercase letters for consensus amino acids. The lengths of the considered sequences are given in parentheses on the left, and the orders and total numbers of FnIIIDs in enzymes are in parentheses on the right. The first and last sequence numbers are indicated. 2 FnIIID ChiA1 and 1 FnIIID ChiA1 are abbreviated as 2Fn3 and 1Fn3, respectively. Secondary structure elements of 2 FnIIID ChiA1 are indicated above the alignments. Sequence alignment was carried out by ClustalW (39) and annotated using CHROMA (40).
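The residue classes listed in the Fig. 6 legend lend themselves to a simple per-column consensus test at the 80% threshold. The sketch below is an illustration only and is not the CHROMA implementation: the class memberships are taken from the legend, but the rule for choosing among several qualifying classes (smallest class first) and the name `column_consensus` are assumptions. The class symbols use ASCII "-" for negatively charged residues.

```python
# Residue classes as listed in the Fig. 6 legend (one-letter codes).
CLASSES = {
    "+": set("HKR"),
    "-": set("DE"),
    "a": set("FHWY"),
    "h": set("ACFHILMVWY"),
    "l": set("ILV"),
    "*": set("ST"),
    "p": set("DEHKNQRST"),
    "s": set("ACSTDNVGP"),
    "t": set("AGS"),
}

# Assumed priority: try the most specific (smallest) classes first.
CLASS_ORDER = sorted(CLASSES, key=lambda symbol: len(CLASSES[symbol]))


def column_consensus(column: str, threshold: float = 0.8) -> str:
    """Return a consensus symbol for one alignment column, or '.' if none."""
    if not column:
        raise ValueError("column must not be empty")
    n = len(column)
    # A single amino acid reaching the threshold is reported in uppercase.
    for residue in set(column):
        if residue.isalpha() and column.count(residue) / n >= threshold:
            return residue.upper()
    # Otherwise report the first class whose members reach the threshold.
    for symbol in CLASS_ORDER:
        members = CLASSES[symbol]
        if sum(r in members for r in column) / n >= threshold:
            return symbol
    return "."
```

Under these assumptions, a column such as "ILVILVILVI" is reported as aliphatic (l), whereas a column of ten different residue types yields no consensus.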
In addition to the apparent similarity in their overall structure, NMR parameters indicate that there is striking similarity in the arrangement of the core-forming residues of 2 FnIIID ChiA1 and FnIIID 1BPV. The chemical shift of FnIIID 1BPV is a good indicator of this structural resemblance. Whereas carbon chemical shifts reflect secondary structure, proton chemical shifts are sensitive to tertiary environments. Trp-580 and Tyr-592, which are the central residues of the FnIIID scaffold and highly conserved, make unique side-chain interactions with each other. The indole nitrogen of Trp-580 forms a stacking interaction with the aromatic ring of Tyr-592. This interaction causes an unusual upfield shift of Trp-580 Hε1; the chemical shift is 5.80 ppm, which is the most upfield-shifted value for a tryptophan Hε1 in the BioMagResBank data base (Fig. 2). Intriguingly, the same proton of FnIIID 1BPV exhibits a very similar value of 5.81 ppm (BioMagResBank no. 4295) (10,52), although the chemical shift is highly sensitive to the relative orientation of the tryptophan and tyrosine residues. This result indicates that the electromagnetic environments, and thus the relative orientation, of the residues in the central hydrophobic core of 2 FnIIID ChiA1 and FnIIID 1BPV are highly similar.
These observations indicate that, in terms of sequence and tertiary structure, 2 FnIIID ChiA1 is less similar to plant FnIIID 4KBP than to animal FnIIID 1BPV. This finding may imply that bacterial 2 FnIIID ChiA1 shares a relationship with animal FnIIIDs that is closer than would be expected from the evolutionary distance between animals and bacteria.
The Structure of 2 FnIIID ChiA1 Shows Different Features from Other β-Sandwich Domains of Bacterial Carbohydrases-It has been reported that the FnIIIDs of chitinase A1 from B. circulans WL-12 are not directly involved in chitin binding. The truncated forms of chitinase A1 from B. circulans WL-12, lacking either 2 FnIIID ChiA1 or both 1 FnIIID ChiA1 and 2 FnIIID ChiA1, showed nearly the same binding affinity for chitin as the full-length enzyme. By contrast, a significant decrease in chitin-hydrolyzing activity was observed for the truncated forms, in proportion to the number of FnIIIDs missing (26).
A structure resembling that of FnIIID has been found in another bacterial carbohydrase, chitinase A from Serratia marcescens (53). This enzyme consists of three domains: an N-terminal domain (ChiN), a catalytic domain, and a small (α+β) domain. The sequence of the catalytic domain is conserved between this enzyme and chitinase A1 from B. circulans, and their tertiary structures are similar (r.m.s.d. between corresponding Cα atoms, 1.25 Å), suggesting that they share an identical catalytic mechanism. ChiN adopts a global fold similar to 2 FnIIID ChiA1 but with an additional β-α-β element between the A and B strands that interacts with the other domain (Fig. 9, B and C). Although a structure-based sequence alignment shows that ChiN has a sequence identity of 19.8% with 2 FnIIID ChiA1, the arrangement of hydrophobic-core-forming residues in ChiN is different from those in 2 FnIIID ChiA1 and other FnIIIDs. Moreover, important differences are found in the surface residues of these two domains. ChiN has adjacently arranged tryptophans (Trp-33 and Trp-69) exposed on a continuous surface with the conserved aromatic residues of the catalytic domain (Trp-245 and Phe-232). These residues play important roles in guiding a chitin chain into the catalytic site (54). In contrast, 2 FnIIID ChiA1 has no exposed tryptophans. The only aromatic residue with a solvent-accessible surface area of more than 10% is Tyr-620 (13.3%), a residue that is highly conserved throughout animal proteins.
Unlike the bacterial FnIIIDs, which are broadly distributed across both Gram-positive and Gram-negative bacteria, the ChiN domain has been found only in the chitinases of Gram-negative bacteria. Interestingly, all the ChiN domains are located at the N termini of these enzymes. In contrast, the FnIIIDs of bacteria, like animal FnIIIDs, are located at a variety of positions within the proteins. This observation suggests that the FnIIID-containing proteins have arisen through domain shuffling. Recently, another chitinase (ChiC) was identified from S. marcescens that has three domains including an FnIIID (55). Although its FnIIID exhibits a high sequence similarity to 1 FnIIID ChiA1 and 2 FnIIID ChiA1, the sequence of the catalytic domain is different from that of chitinase A1 from B. circulans. As above, this observation implies that these FnIIIDs have been inserted into the polypeptides by domain shuffling.

FIG. 7. Surface diagram. Electrostatic potentials on the solvent-accessible surface of 2 FnIIID ChiA1 (A) and 1 FnIIID ChiA1 (B) are shown. The potentials are colored from red (negative charge) to blue (positive charge). The right-hand image was generated by a 180° rotation of the left-hand molecule about the y-axis. C, surface diagram of the 20 threonine residues of 2 FnIIID ChiA1. For clarity, a 60° rotation about the vertical axis was applied to the image in A. The figures were prepared using the program GRASP (60).
Several other β-sandwich architectures besides ChiN and FnIIID have been found in bacterial carbohydrases, such as domain N of α-amylase II (56), the C-terminal starch-binding domain of β-amylase (57), the cellulose-binding domain (58), and the xylan-binding domain (59). The secondary structure topologies of these modules are different from that of FnIIID, although their tertiary structures show limited similarity to that of 2 FnIIID ChiA1, with DALI Z scores ranging from 2.6 to 5.4. These domains are capable of binding to carbohydrates and commonly possess surface-exposed aromatic residues that are thought to make contacts with substrate sugar chains. Only the structure of 2 FnIIID ChiA1 does not exhibit such features.
The high degree of conservation of the global structures among the bacterial FnIIID-containing carbohydrases and the lack of known function of the domain and of surface features reminiscent of specific functions in 1 FnIIID ChiA1 and 2 FnIIID ChiA1 tempt us to postulate that FnIIID has the role of a spacer in bacterial carbohydrases. To degrade insoluble substrate efficiently, catalytic and binding domains must adopt various relative positions. The fact that most FnIIIDs are located between the catalytic and binding domains may support this hypothesis. Moreover, mechanical elasticity, which is an intrinsic property of FnIIID, may make the FnIIID the best candidate for this type of role.