Crystal structure of Grimontia hollisae collagenase provides insights into its novel substrate specificity toward collagen

Collagenase from the gram-negative bacterium Grimontia hollisae strain 1706B (Ghcol) degrades collagen more efficiently even than clostridial collagenase, the most widely used industrial collagenase. However, the structural determinants facilitating this efficiency are unclear. Here, we report the crystal structures of ligand-free and Gly-Pro-hydroxyproline (Hyp)-complexed Ghcol at 2.2 and 2.4 Å resolution, respectively. These structures revealed that the activator and peptidase domains in Ghcol form a saddle-shaped structure with one zinc ion and four calcium ions. In addition, the activator domain comprises two homologous subdomains, and zinc-bound water was observed in the ligand-free Ghcol. In the ligand-complexed Ghcol, we found two Gly-Pro-Hyp molecules bound at each of three sites, namely the active site and two surfaces on the duplicate subdomains of the activator domain facing the active site, and the nucleophilic water is replaced by the carboxyl oxygen of Hyp at the P1 position. Furthermore, all Gly-Pro-Hyp molecules bound to Ghcol have almost the same conformation as the Pro-Pro-Gly motif in the model collagen (Pro-Pro-Gly)10, suggesting that these three sites contribute to the unwinding of the collagen triple helix. A comparison of activities revealed that Ghcol exhibits broader substrate specificity than clostridial collagenase at the P2 and P2′ positions, which may be attributed to the larger space available for substrate binding at the S2 and S2′ sites in Ghcol. Analysis of variants of three active-site Tyr residues revealed that mutation of Tyr564 affected catalysis, whereas mutation of Tyr476 or Tyr555 affected substrate recognition. These results provide insights into the substrate specificity and mechanism of G. hollisae collagenase.

Collagen is the most abundant protein in mammals and has a triple-helical structure, with Gly-Pro-hydroxyproline (Hyp) tripeptide as the basic unit. Collagenase (Enzyme Commission number: 3.4.24.3) cleaves the triple-helical region of collagen under physiological conditions, catalyzing the hydrolysis of peptide bonds with glycine residues at the P1 position (1).
To expand the industrial utility of collagenase, a better understanding of the mechanism by which collagenase cleaves the triple-helical region of collagen is required. However, only limited information is available regarding the crystal structure of clostridial collagenase (8,9) and Vibrio collagenase VhaC (24). Here, we report the crystal structures of ligand-free Ghcol at 2.2 Å resolution and Ghcol complexed with Gly-Pro-Hyp at 2.4 Å resolution. Furthermore, we compared the activities and substrate specificities of Ghcol and clostridial collagenase and examined the role of three Tyr residues in the active site of Ghcol on its catalytic activity and substrate specificity.

Determination of Ghcol structure
In our previous study, a Ghcol protein (Ala88-Gln767) with the 30-amino acid Sec signal peptide (Met1-Ala30) added at the N terminus for extracellular secretion was expressed using the B. chosinensis expression system (Fig. S2A) and purified from the culture supernatant (23). The purified protein preparation exhibited 62- and 74-kDa bands on SDS-PAGE under reducing conditions, suggesting that the C-terminal region (12 kDa) of the 74-kDa Ghcol was degraded during the purification steps (23). To obtain a homogeneous preparation, we expressed the collagenase module (Ala88-Thr646) of Ghcol with the Sec signal peptide using the B. chosinensis expression system (Fig. S2B) and purified it from the supernatant. This purified Ghcol preparation was homogeneous, exhibiting a single 62-kDa band on reducing SDS-PAGE (Fig. S3). Crystals of ligand-free Ghcol were obtained (Fig. S4). Crystals of Ghcol complexed with Gly-Pro-Hyp were prepared by soaking the crystals of ligand-free Ghcol in a reservoir solution containing Gly-Pro-Hyp. Table 1 summarizes the data collection and structure statistics. The space groups of the ligand-free and ligand-complexed Ghcol structures are C2 and P2₁, respectively. Both crystals contain two molecules (chains A and B) in the asymmetric unit. The structures of ligand-free and ligand-complexed Ghcol (Ala88-Gly622) were refined at 2.2 and 2.4 Å resolution with R/Rfree of 0.186/0.227 and 0.206/0.255, respectively.

Overall structure of Ghcol
Figure 1 shows the overall structure of the collagenase module of Ghcol. The structure is divided into the activator domain (Ala88-Tyr355), linker (Ala356-Gly365), and peptidase domain (Phe366-Gly622). The peptidase domain is further divided into two subdomains, namely the upper half (C1: Phe366-Val515) and lower half (C2: Val516-Gly622) (Fig. 1A). Figure 1B shows the topology of the activator and peptidase domains. The activator domain contains 21 α-helices. The α-helices 5-9 (Act1) and 10-19 (Act2) exhibit similar topologies. In the peptidase domain, the upper-half subdomain contains four α-helices and six β-strands, including the zinc-binding motif H492EYVH496, whereas the lower-half subdomain contains eight α-helices and no β-strands. Figure 1C shows the crystal structure of the collagenase module of Gly-Pro-Hyp-bound Ghcol. The activator and peptidase domains exhibit a saddle-shaped structure with one zinc and four calcium ions. Unexpectedly, Ghcol binds two peptides at each of three binding sites, which are located in the active site (a) and in the activator subdomains Act1 (b) and Act2 (c). The sites are separated by a distance of 15 to 27 Å. The two peptides at each of the three binding sites extend in the same direction.
A comparison of the tertiary structures of Act1 and Act2 (Fig. 2A) revealed that they are similar, with an RMSD of 3.3 Å for 152 pairs of C-alpha atoms. However, Act1 and Act2 do not share significant sequence homology (Fig. 2B). We speculate that, during evolution, one of the two subdomains was generated from the other by gene duplication and that the amino acid sequences have since changed extensively without changing the tertiary structure.
The superposition of the two molecules in the asymmetric unit of ligand-free Ghcol and of Gly-Pro-Hyp-bound Ghcol gives RMSD values of 0.38 and 0.31 Å for 535 and 533 pairs of C-alpha atoms, respectively. The RMSD decreases to 0.16 to 0.21 Å when the individual domains are compared (Table 2). The RMSD of each domain between the ligand-free and Gly-Pro-Hyp-bound enzymes is also low (0.16-0.23 Å, Table 2), suggesting that almost no conformational change occurred on binding of Gly-Pro-Hyp. The structure figures in this article were prepared using chain A for both the ligand-free and Gly-Pro-Hyp-bound Ghcol.
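For readers unfamiliar with the RMSD values quoted above, the following minimal sketch shows how an RMSD over paired C-alpha atoms can be computed once two structures have already been superimposed. It is illustrative only: the function name `rmsd` and the toy coordinates are assumptions, the superposition step itself is not shown, and this is not the software used in this study.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (same unit as the input) between two lists of (x, y, z) points paired by index.

    Assumes the two coordinate sets are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates in Å, not taken from the deposited structures.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.4), (7.6, 0.0, 0.0)]
print(round(rmsd(chain_a, chain_b), 3))
```

Because the deviation is averaged over all atom pairs, a low RMSD over several hundred C-alpha pairs indicates that the two backbones are nearly identical.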

Figure 1. Activator domain (Ala88-Tyr355, silver, gold, and light blue), linker (Ala356-Gly365, green), upper-half peptidase subdomain (C1: Phe366-Val515, magenta), and lower-half peptidase subdomain (C2: Val516-Gly622, blue) are shown. The numbers indicate amino acid residue numbers. A, domain organization. B, topology diagram. The α-helix is shown as a column. The β-strand is shown as an arrow. The catalytic zinc ion is shown as a yellow star. The amino acid residue to which a calcium ion (Ca1-4) binds is shown as a red cross. C, crystallographic structure of Gly-Pro-Hyp-complexed Ghcol. The catalytic zinc ion is shown as a yellow sphere. The calcium ions are shown as pink or red spheres. The bound peptides are shown as sticks, with green for a in the active site, orange for b in the activator domain 1 (Act1), and cyan for c in the activator domain 2 (Act2). Ghcol, collagenase from Grimontia hollisae strain 1706B.

Structures of Gly-Pro-Hyp binding sites of Ghcol
Figure 3, A-C shows the binding sites a, b, and c, respectively, in more detail with the omit 2fo-fc map and the fo-fc map with the peptide model. No peaks appeared in the fo-fc map, indicating that these models are appropriate. In Figure 3A, the two peptides in the active site are clearly separated, suggesting that this structure might reflect the enzyme-product complex after cleavage of collagen. In Figure 3, B and C, the two peptides at the activator domain partially overlap; the O atom of Hyp in the first peptide and the N atom of Gly in the second peptide are located at almost the same position. This might be because these two atoms occupy multiple positions and only their average coordinates were obtained. It is also possible that a six-residue peptide, present as a contaminant or synthesized by the reverse reaction during soaking, bound preferentially to the binding site. We also modeled the six bound residues of the two Gly-Pro-Hyp peptides in the reverse direction (Fig. 3, D and E). Several peaks appeared in the fo-fc map, indicating that these models are inappropriate. Figure 3F shows the Ramachandran plot. The conformations of the Gly-Pro-Hyp peptides bound to Ghcol are almost the same as those of the Pro-Pro-Gly repeats in the model collagen (Pro-Pro-Gly)10. We therefore propose that these three sites contribute to unwinding of the triple helix of collagen.
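A Ramachandran plot compares backbone torsion angles, each of which is a dihedral angle defined by four consecutive backbone atoms. The sketch below shows one common way to compute such a dihedral from four Cartesian points. The function name `dihedral` and the test points are hypothetical; in this study the plot regions were generated by COOT, so this is an illustration of the quantity being plotted, not of the authors' procedure.

```python
import math

def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees (-180 to 180) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Hypothetical points: the central bond runs along z, the outer atoms are 90 degrees apart.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

For a phi angle the four atoms would be C(i-1), N(i), CA(i), C(i), and for psi they would be N(i), CA(i), C(i), N(i+1).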
Crystallographic structures of the collagenase modules of clostridial collagenase isoforms, ColG, ColH, and ColT, have
Figure 2. Comparison of the activator domains 1 and 2 (Act1 and Act2). A, crystallographic structure. Gly-Pro-Hyp-bound Act1 (Thr125-Asn223) and Act2 (Gly224-Asp315) are shown in light orange and light blue, respectively, after superposition. Side chains of the residues (Arg167, Tyr175, Glu215, Asp221, Glu308, and Arg311) with hydrogen bond and C-C contact are shown in blue stick. The number indicates the distance (Å) between the atoms. B, structure sequence alignment after superposition by COOT. The RMSD for 152 C-alpha pairs is calculated to be 3.3 Å. Red box indicates the residues with hydrogen bond and C-C contact. Yellow box indicates the residues with only C-C contact. The helix region is drawn in gray color.
Figure 3 (caption fragment). ... Fig. 1C) are modeled in a reverse direction and refined. There are several fo-fc map peaks. The omit map is shown in gray color. F, Ramachandran plot. The backbone torsional angles for the collagen triple helix model of (Pro-Pro-Gly)10 (PDB ID: 1K6F) are also plotted (blue). The angles for Gly, Pro, and Hyp are shown with triangles, squares, and circles, respectively. The favored (thin red) and the allowed (thin green) regions are for proline residues generated by COOT. Ghcol, collagenase from Grimontia hollisae strain 1706B; PDB, Protein Data Bank.

Structures of calcium-binding sites of Ghcol
Ghcol has four calcium ions, one near the active site (Ca3 in Fig. 1C) and three distant from the active site (Ca1, Ca2, and Ca4 in Fig. 1C). In clostridial collagenase, one calcium ion was observed near the active site in ColH and ColT, whereas no calcium ions were observed in ColG (8,9). Fig. S6, A-D shows the calcium-binding sites of Ghcol for Ca1 in the activator domain, for Ca2 in the upper-half subdomain of the peptidase domain, for Ca3 near the active site, and for Ca4 in the lower-half subdomain of the peptidase domain, respectively. Although no homology was observed among these four calcium-binding sites, all except the Ca2 site occur at the C-terminal end of an α-helix. Fig. S6E compares the Ca3-binding site of Ghcol with the corresponding sites of ColH, ColT, and Vibrio collagenase VhaC. All these sites are similar, containing one conserved Glu, two conserved Gly, and one conserved Arg residue, together with two water molecules.
Clostridial collagenases consist of a collagenase module and a C-terminal segment. In ColH, the C-terminal segment comprises two polycystic kidney disease-like domains and a collagen-binding domain. Ohbayashi et al. (7) demonstrated that calcium ion plays an important role in the stability of full-length ColH expressed in Escherichia coli. They also conducted a small-angle X-ray scattering analysis, which revealed that the full-length ColH adopted a tapered shape with a swollen head and an elongated overall structure under calcium-chelated conditions (25). Ghcol consists of a collagenase module and a bacterial prepeptidase C-terminal segment. The role of calcium ions in Ghcol activity, stability, and structure will be investigated in future studies.
Table 3 and Figure 3, A-C show the hydrogen bonds and C-C contacts between Ghcol and the Gly-Pro-Hyp peptides. Notably, the Hyp side chains form more hydrogen bonds than those of Pro in all three binding sites. Figure 4 shows the active-site structure of Ghcol. His492, His496, and Glu520 coordinate the zinc ion in the Ghcol active site. Zinc-bound water is present in the ligand-free Ghcol structure, whereas in the ligand-bound Ghcol, this water is replaced by the carboxyl oxygen of the Hyp at the P1 position. Figure 4 also shows the Ghcol active-site residues thought to be important for catalysis. Three Tyr residues, Tyr476, Tyr555, and Tyr564, are located in the active site. The OH groups of Tyr555 and Tyr564 protrude toward the zinc ion. The OH groups of Tyr476 and Tyr480 form water-mediated hydrogen bonds with OXT of Hyp at P3'. Tyr555 forms a hydrogen bond with the OE1 atom of Glu520 in both the ligand-free and ligand-bound Ghcol (2.8 Å), and Tyr564 forms a hydrogen bond with the O atom of Hyp in the ligand-bound Ghcol. A previous sequence comparison of Ghcol with ColH, ColG, and ColT suggested that the conserved Tyr568 residue may play an important role in catalysis in Ghcol (21). However, this is unlikely because Tyr568 is distant from the active site (not shown).
Table 3. Hydrogen bonds and C-C contacts between Ghcol and Gly-Pro-Hyp

Active-site structures of Ghcol
Crystallographic analysis of clostridial collagenase has shown that the C1 and C2 subdomains, which constitute the active site, contract to different extents (ColG > ColT > ColH) upon binding of a peptidic inhibitor (9). However, as shown in Figure 4, the binding of tripeptides does not significantly change the structure of the Ghcol active site. This may be because the crystal of Ghcol complexed with Gly-Pro-Hyp was prepared by soaking Gly-Pro-Hyp into the crystal of ligand-free Ghcol, in which conformational changes are restricted by crystal packing.
The crystal structure of Vibrio collagenase VhaC has been recently reported (24). Similar to Ghcol and clostridial collagenase, Vibrio harveyi collagenase (VhaC) contains an activator domain, linker, peptidase domain, and one zinc ion, exhibiting a saddle-shaped structure. The overall structures of Ghcol and VhaC are strikingly similar (Fig. S7A) with RMSD values of 0.35-1.38 Å (Table 2), which are smaller than those between Ghcol and ColG, ColH, or ColT (1.56-2.38 Å). The superposition of the activator domains of Ghcol, ColG, and VhaC revealed that a rotation of around 40° is required to fit the peptidase domain of ColG to that of Ghcol or VhaC (Fig. S7A). These results strongly suggest that Ghcol and VhaC may exhibit similar catalytic mechanisms. Fig. S7, B and C shows the rigid-body rotation of the C1 and C2 subdomains in the peptidase domain. The rotation angle between Ghcol and VhaC and that between Ghcol and ColG are similar (10.3° and 13.3°, respectively), suggesting that the peptidase domains are well conserved in Ghcol, ColG, and VhaC.
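As background to the rigid-body rotation angles quoted above, the sketch below shows how a rotation angle can be recovered from a 3x3 rotation matrix through its trace, using angle = arccos((trace - 1) / 2). How the angles in Fig. S7 were actually obtained is not described in this section, so the function names `rotation_angle_deg` and `rot_z` and the example matrix are assumptions made purely for illustration.

```python
import math

def rotation_angle_deg(matrix):
    """Rotation angle in degrees of a proper 3x3 rotation matrix (nested lists)."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rot_z(angle_deg):
    """Hypothetical helper: rotation about the z axis by angle_deg."""
    t = math.radians(angle_deg)
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

# A matrix built for a 10.3 degree rotation gives that angle back.
print(round(rotation_angle_deg(rot_z(10.3)), 1))
```

The trace formula returns only the magnitude of the rotation; the rotation axis would have to be extracted separately.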

Insight into the mechanism of collagenase to cleave collagen
Collagenase is thought to locally unwind the triple-helical structure of collagen before hydrolyzing the peptide bonds (26,27). Based on the saddle-shaped structure of clostridial collagenase, Eckhard et al. (8) proposed a chew-and-digest mechanism for its catalytic action. In this mechanism, the triple helical collagen was suggested to be unwound first on interacting with the peptidase domain in the open conformational state of collagenase, followed by interaction with both the activator and peptidase domains in the closed state (8). On the other hand, Wang et al. (24) demonstrated that the binding of the collagen triple helix occurs only to the activator domain by isothermal titration calorimetry analysis. They concluded that the collagen triple helix first binds to the activator domain before unwinding and cleavage.
In the present study, Ghcol was observed to bind two Gly-Pro-Hyp molecules at each of three sites: the active site, Act1, and Act2. All the bound tripeptides exhibited the same conformation as the Pro-Pro-Gly units in collagen (Fig. 3F). The binding site of Act1 is 11 Å apart from Act2 and 23 Å apart from the active site. The distance between Act2 and the active site is 22 Å (Fig. 1C). Fitting the collagen triple helix (Protein Data Bank [PDB] ID: 1K6F) to the active-site peptide resulted in collision with the amino acid residues, suggesting that direct binding of the collagen triple helix to the active site of Ghcol is difficult. In contrast, the superposition of the collagen triple helix model (PDB ID: 1K6F) to the Gly-Pro-Hyp binding sites in Act1 and Act2 (Fig. 5) revealed the possible binding of the two collagen triple helices to the activator domains. The distance between the two collagen triple helices is 13 Å, which is close to the distance between the two collagen triple helices observed in the crystal structure of collagen (PDB ID: 1K6F). This suggests that the activator domain unwinds the collagen fibril to the triple helix state. The unwinding of the triple helix may occur via a conformational change mediated by the open-close motion of the domain as proposed by Eckhard et al. (8) and Wang et al. (24).
Figure 4 (caption fragment). ... and lower-half [C2] subdomain, respectively) and Gly-Pro-Hyp-complexed Ghcol (dark pink and dark cyan for C1 and C2 subdomain, respectively). The catalytic zinc ion and the oxygen of water near the zinc ion are shown as a yellow (orange for ligand-free) and cyan sphere, respectively. The peptide is colored in yellow, and PGE (triethylene glycol) found in the ligand-free Ghcol is colored in green. Ghcol, collagenase from Grimontia hollisae strain 1706B.
Unlike bacterial collagenases, the mechanism of mammalian collagenase to cleave collagen has been extensively studied. Commercially available ColH and ColG were used to compare the activity of clostridial collagenase with that of Ghcol. Both proteins migrated as a 120-kDa band on SDS-PAGE under reducing conditions (Fig. S3), suggesting that they contained the C-terminal polycystic kidney disease-like domain and collagen-binding domain, unlike Ghcol.
First, we analyzed the ability of these three collagenases to hydrolyze FITC-collagen. As shown in Figure 6A, ColG exhibited a specific activity comparable to that of Ghcol, whereas the specific activity of ColH was only 10% that of Ghcol. In contrast to the FITC-collagen result, both ColH and ColG exhibited gelatin-hydrolyzing activity, and ColG appeared more active than Ghcol (Fig. 6B). Next, we analyzed the hydrolytic activities toward the fluorogenic peptide substrate (7-Methoxycoumarin-4-yl)acetyl-Lys-Pro-Leu-Gly-Leu-[N3-(2,4-dinitrophenyl)-2,3-diaminopropionyl]-Ala-Arg (MOCAc-KPLGL(Dpa)-AR) (Fig. S9A). Once the reaction started, the fluorescence intensity at 400 nm (FI400) of the reaction solution increased with time (Fig. S10A). As shown in Figure 6C, the specific activities of ColH and ColG were less than 1% of that of Ghcol. As shown in Fig. S9A, MOCAc-KPLGL(Dpa)-AR has bulky residues at P3 and P3'. This result suggests that Ghcol has a large space for substrate binding in the active site. Finally, we analyzed the ability of the collagenases to hydrolyze N-[3-(2-furyl)acryloyl]-Leu-Gly-Pro-Ala-OH (FALGPA) (Fig. S9B). In this assay, the rate of decrease in absorbance at 322 nm of the reaction solution corresponded to the initial velocity (vo) and was proportional to the enzyme concentration (Fig. S10B). As shown in Figure 6D, the specific activities of ColH and ColG were 70% and 20% of that of Ghcol, respectively. In FITC-collagen digestion, ColG exhibited higher activity than ColH, whereas on FALGPA, ColG exhibited lower activity than ColH. This may be because the selective loop of ColH covers the active site more substantially than that of ColG (8,9).
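The FALGPA assay above takes the rate of absorbance decrease at 322 nm as the initial velocity. One simple way to estimate such a rate from a time course is a least-squares line through the early, linear part of the trace, as sketched below. The function name `slope` and every data value are hypothetical; the fitting procedure actually used in this study is not stated in this section.

```python
def slope(times, signal):
    """Least-squares slope of signal versus time (signal units per time unit)."""
    if len(times) != len(signal) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    return sxy / sxx

# Hypothetical absorbance readings at 322 nm over the first two minutes.
times_s = [0, 30, 60, 90, 120]
a322 = [0.500, 0.485, 0.470, 0.455, 0.440]

# Absorbance falls as FALGPA is hydrolyzed, so the rate is the negated slope.
rate = -slope(times_s, a322)
print(f"{rate:.6f} absorbance units per second")
```

Dividing rates obtained at equal enzyme concentrations would give relative activities of the kind reported as percentages of the Ghcol value.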
We compared the collagenolytic activities of Ghcol and clostridial collagenase by hydrolyzing collagen with Ghcol or Liberase-C, which is a mixture of ColH and ColG, and analyzing the reaction products by gel-filtration HPLC (Fig. 7, A and B). The reaction products at 1 h exhibited two peaks with retention times of 21 and 24 min. The former peak corresponded to peptides containing more than six amino acid residues, whereas the latter peak corresponded to tripeptides. With increasing reaction time (1-48 h), the former peak area decreased, and the latter peak area increased (72-93% for Ghcol and 37-69% for Liberase-C), indicating that collagen digestion continued. The former peak hardly appeared in the Ghcol reaction at 20 to 48 h, whereas it clearly appeared in the Liberase-C reaction, suggesting that Ghcol degrades collagen more efficiently than Liberase-C. N-terminal amino acid sequence analysis of the reaction products at 20 h revealed that the peak corresponding to Glu appeared in the second cycle for Ghcol (Fig. 7C) but did not appear for Liberase-C (Fig. 7D). At the X position of the Gly-X-Y repeat in collagen, Pro is the most abundant residue, Ala the second most abundant, and Glu the third most abundant (30). Van Wart and Steinbrink (30) reported that clostridial collagenase has a preference for Pro and Ala, but not Glu, at the P2 and P2′ sites. Eckhard et al. (31) reported that ColH, ColG, and ColT do not favor substrates that contain Asp, Glu, Lys, or Arg at the P2 or P2′ sites. In contrast, our results suggest that Ghcol cleaves collagen even when Glu residues occupy the P2 and P2′ sites. When the Liberase-C degradation products in the former peak were fractionated by gel filtration chromatography and further purified by reverse-phase chromatography, a nonapeptide containing Glu, Gly-Gln-Arg-Gly-Glu-Arg-Gly-Phe-Hyp (bovine Col α(I) chain precursor 964-972), was identified. Therefore, we next hydrolyzed Gly-Pro-Hyp-Gly-Pro-Hyp (Fig. 7E) and Gly-Glu-Arg-Gly-Phe-Hyp (Fig. 7F). With increasing reaction time, the Gly-Pro-Hyp concentration increased for both Ghcol and Liberase-C (Fig. 7E), whereas the Gly-Glu-Arg concentration increased only for Ghcol (Fig. 7F). These results suggest that Ghcol, unlike clostridial collagenase, accepts substrates that contain Glu at the X position of collagen, which may explain why Ghcol degrades collagen more efficiently than clostridial collagenase.
To explore the mechanism of the differences observed in Figures 6 and 7 between Ghcol and ColG, we compared the surfaces of Gly-Glu-Hyp-bound Ghcol and ColG (Fig. 8). The former structure was made by replacing Gly-Pro-Hyp with Gly-Glu-Hyp. The latter structure was made by placing Gly-Glu-Hyp on the active site of ColG after fitting the peptidase domain of ColG to Ghcol. Ghcol has enough space to accommodate the side chain of Glu at the P2 and P2′ positions, whereas in ColG the Glu side chain collides with the side chains of Trp539 and Phe515. In addition, compared with ColG, the active site of Ghcol is more hydrophobic. These findings suggest that Ghcol might exhibit broad specificity for P2 and P2′ residues in the substrate. This might explain why Ghcol degrades collagen more efficiently than clostridial collagenase.

Activities of Ghcol variants toward collagen, gelatin, and synthetic peptides
Glu493 in the zinc-binding motif H492EYVH496 of Ghcol has been suggested to be the catalytic residue responsible for the acidic pKa (pKe1) (21,23). In this study, Tyr476, Tyr555, and Tyr564, which are located in the active site, are suggested to be important for catalysis (Fig. 4). To examine the roles of these residues in more detail, we expressed Ala88-Gln767 of WT Ghcol and four variants, Y476A, Y555A, Y564A, and E493A, with an N-terminal Sec signal peptide using the B. chosinensis expression system (Fig. S2) and purified them from the culture supernatant. The purified preparations exhibited 62- and 74-kDa bands on SDS-PAGE under reducing conditions (Fig. S8). Figure 9, A and B shows the hydrolytic activities toward FITC-collagen and gelatin, respectively. Y476A and Y555A had activity similar to that of WT, Y564A exhibited reduced activity, and E493A lacked activity. Figure 9, C and D shows the activity on MOCAc-KPLGL(Dpa)-AR and FALGPA, respectively. Y555A and Y476A showed reduced activity compared with WT, whereas Y564A and E493A lacked activity. In other words, mutation of Tyr564 affected catalysis rather than substrate recognition, whereas mutation of Tyr476 or Tyr555 affected substrate recognition rather than catalysis. This might be because Tyr564 is closer to the zinc ion than Tyr476 and Tyr555, whereas Tyr476 and Tyr555 are closer to Gly-Pro-Hyp than Tyr564 (Fig. 4). These results suggest that Glu493 is indispensable for catalysis and that Tyr476, Tyr555, and Tyr564 contribute to catalysis and substrate recognition to varying degrees.

Conclusion
The crystal structures of Ghcol revealed the following features. First, the activator and peptidase domains exhibit a saddle-shaped structure with one zinc ion and four calcium ions, which is similar to clostridial collagenase and strikingly similar to Vibrio collagenase VhaC. Second, Ghcol binds two peptides at each of the three sites: the active site and the two similar sites (Act1 and Act2) in the activator domain. Third, the activator domain contains two repeated subdomains to which the collagen triple helix can bind. Finally, Ghcol has a large space for substrate binding at the S2 and S2′ sites, which explains its broad specificity for P2 and P2′ residues in the substrate. These findings are important for elucidating the mechanism of triple-helical collagen digestion by bacterial collagenases. The results from this study also explain the high catalytic activity and substrate specificity of Ghcol, encouraging its application in industry.

Protein expression and purification
Nucleic acid and protein sequences of Ghcol (Fig. S1) were obtained from the DNA Data Bank of Japan database (AB600550). The WT Ghcol (Ala88-Thr646) was expressed in B. chosinensis cells transformed with the expression plasmid (Fig. S2A) and purified from the supernatants as described previously (19). The expression plasmid for the WT Ghcol (Ala88-Gln767) (Fig. S2B) was previously described (23), from which the expression plasmids for Ghcol variants were constructed by the QuikChange method using the primers listed in Table S1. The WT Ghcol and variants were expressed in B. chosinensis cells and purified from the supernatants as described previously (23).
Figure 8 (caption fragment). A and B, the surface is colored in pink and cyan for the peptidase upper-half (C1) and lower-half (C2) subdomain, respectively, except for acidic (red) and basic (blue) amino acids. C and D, the electrostatic surface potential density is colored in a gradient from red (−10 kT/e) to blue (10 kT/e). Ghcol, collagenase from Grimontia hollisae strain 1706B.

Enzyme assay
ColH and ColG were purchased from Meiji Seika Pharma. The concentrations of ColH and ColG were determined using Protein Assay CBB Solution (Nacalai Tesque, Inc) with bovine serum albumin (Nacalai Tesque, Inc) as a standard. Clostridium histolyticum collagenase, Liberase-C, was purchased from Roche Diagnostics. The concentration of Liberase-C was determined by the denoted weight. The kcat value of Ghcol for type I collagen is 1.3 times higher than that of Liberase-C (22). Based on the kcat value and molecular weights of both enzymes, enzyme/substrate ratios of 1% and 2.5% were used for Ghcol and Liberase-C, respectively.
Collagen and FITC-labeled type I collagen (FITC-collagen) were prepared as described previously (19). The collagen hydrolysis assay was carried out using a modified version of a previously described method (20). Briefly, collagenases (6.0 μg/ml for Ghcol or 15.0 μg/ml for Liberase-C) were mixed with 50 mM Tris-HCl (pH 7.5) containing 0.6 mg/ml bovine type I collagen, 200 mM NaCl, and 5 mM CaCl2, and incubated at 30 °C for the time intervals shown in Figure 7, E and F. After heat shock to stop the enzymatic reaction, the collagenase digests were separated using gel filtration. Gel filtration analysis of the reaction products was carried out as follows: column, Superdex Peptide 10/30 HR (GE Healthcare); solvent, 0.1 M ammonium bicarbonate, 20% v/v acetonitrile; detector, absorbance at 220 nm. The FITC-collagen hydrolysis assay was carried out as described previously (19,21,23).
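The enzyme and substrate concentrations given above can be checked against the stated enzyme/substrate ratios of 1% and 2.5% with a short calculation. The sketch below assumes that the ratios are mass ratios (w/w); the function name `enzyme_substrate_percent` is hypothetical.

```python
def enzyme_substrate_percent(enzyme_ug_per_ml, substrate_mg_per_ml):
    """Enzyme/substrate mass ratio as a percentage (assumes a w/w ratio)."""
    substrate_ug_per_ml = substrate_mg_per_ml * 1000.0
    return 100.0 * enzyme_ug_per_ml / substrate_ug_per_ml

# Concentrations taken from the assay description; collagen is 0.6 mg/ml.
ghcol_ratio = enzyme_substrate_percent(6.0, 0.6)
liberase_ratio = enzyme_substrate_percent(15.0, 0.6)
print(round(ghcol_ratio, 2), round(liberase_ratio, 2))
```

Under the w/w assumption, 6.0 μg/ml and 15.0 μg/ml of enzyme against 0.6 mg/ml collagen correspond to 1% and 2.5%, matching the ratios stated for Ghcol and Liberase-C.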

Amino acid sequencing
N-terminal amino acid sequence analysis was performed as described previously (39). Briefly, collagenase digests after 20 h incubation were separated by gel filtration under aforementioned conditions, and tripeptide-containing fractions were collected. N-terminal sequence of the collected fractions was analyzed by a Procise 494 protein sequencer (Applied Biosystems) in pulsed-liquid mode.

Data availability
The atomic coordinates and structure factors reported in this study were deposited in the PDB under accession code 7WSS for ligand-free Ghcol and 7XEB for Gly-Pro-Hyp-bound Ghcol.
Supporting information-This article contains supporting information.