Conformational requirements of collagenous peptides for recognition by the chaperone protein HSP47.

The collagen binding chaperone HSP47 interacts with procollagen in the endoplasmic reticulum and plays a crucial role in the biosynthesis of collagen. We recently demonstrated that typical collagen model peptides, (Pro-Pro-Gly)(n), possess sufficient structural information for interaction with HSP47 (Koide, T., Asada, S., and Nagata, K. (1999) J. Biol. Chem. 274, 34523-34526). Here we show that binding of (Gly-Pro-Pro)(n) peptides to HSP47 can be detected using the two-hybrid system in yeast if a trimerizing domain is fused to the C termini of the peptides. Some peptides interacted with HSP47 at a lowered assay temperature of 24°C but not at 30°C, indicating the importance of conformational change of the substrate peptides. To analyze the spectrum of HSP47 substrate sequences, we performed two-hybrid screening of collagen-like peptides in designed random peptide libraries using HSP47 as a bait. In selected peptides, the enrichment ratio calculated for each amino acid residue correlated strongly with the contribution of the residue to triple-helix stability independently determined using synthetic collagen model peptides. Taken together, our results suggest that HSP47 preferentially recognizes collagenous Gly-X-Y repeats in triple-helical conformation. We also demonstrated that screening of combinatorial peptide libraries is a powerful strategy for determining conformational requirements as well as for elucidating binding motifs in primary structure.

HSP47 is an endoplasmic reticulum (ER) resident stress protein, which is thought to function as a collagen-specific molecular chaperone. This protein associates with procollagen during its folding and/or post-translational modification in the ER (1-3). Recent studies have revealed that HSP47 plays a critical role in collagen biosynthesis; hsp47 null mice show abnormal collagen synthesis and die before E11.5. Because HSP47 binds in vitro to various types of collagen (at least types I-V; Ref. 4), as well as to collagen-like proteins such as C1q, it appears that HSP47 may function through specific binding to the helix-forming portions of procollagen. However, the function of HSP47 at the molecular level remains unclear. In a previous report, we showed that typical collagen model peptides ((Pro-Pro-Gly) n, n ≥ 7) were recognized by HSP47 in an in vitro binding assay (5). The strength of this interaction increased with increasing length of the model peptides, and the interaction was negatively regulated by prolyl 4-hydroxylation at the second Pro residues of the triplets.
In immature procollagen α-chains (preproto α-chains), Pro is the most common amino acid residue in the X and Y positions of Gly-X-Y triplets, but a variety of residues other than Pro are also found in both positions (6). The longest uninterrupted series of (Gly-Pro-Pro) n repeats in a preprotocollagen is the five-repeat series (n = 5) found in the human, bovine, chick, rat, and mouse α1(I) chain. The longest series of (Pro-Pro-Gly) n repeats is likewise five repeats, found in the human α2(I) chain. The initial aims of the present study were to elucidate amino acid preferences in HSP47 substrate peptides and to try to identify a putative consensus sequence for HSP47 binding using the yeast two-hybrid system to screen collagenous sequences.
We developed a method of detecting specific interaction between HSP47 and collagen model peptides using the yeast two-hybrid system. Using this method, we analyzed the effects of model (Gly-Pro-Pro) n substrate length and assay temperature on interaction with HSP47. We also investigated the effect of amino acid replacement in the (Gly-Pro-Pro) n peptides as a function of assay temperature. Finally, we constructed two peptide libraries containing diverse collagen-like sequences and screened these libraries for HSP47 binding sequences using the yeast two-hybrid system. The data obtained in this study highlighted the conformational basis of HSP47 substrate preference rather than identifying a consensus sequence for HSP47 binding.

EXPERIMENTAL PROCEDURES
General-Restriction enzymes, Taq polymerase (Ex Taq), DNA ligase, and other DNA-modifying enzymes were purchased from Takara Shuzo Co. Ltd. (Kusatsu, Japan). DNA fragments were synthesized by Hokkaido System Science (Sapporo, Japan). Chemicals were purchased from either Nacalai Tesque (Kyoto, Japan) or Wako Pure Chemical Co. (Osaka, Japan). DNA sequencing was carried out on an ABI PRISM 377A sequencer (Perkin-Elmer).
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) AB044560.
¶ Present address: Dept. of Biological Science and Technology, Faculty of Engineering, The University of Tokushima, Tokushima 770-8506, Japan.
Plasmid Constructs-The identities and orientations of the inserts of all plasmid constructs, except for those in random peptide libraries, were confirmed by sequencing.
To construct pYL97 encoding the GAL4-activation domain (AD) fused with the K100-C1q-like domain, the DNA fragment encoding the C1q-like domain of the K100 gene (GenBank™ accession number AB044560) was amplified by PCR with the primers YOR68-6 (5′-TTA GGA TCC ACG CGG CCG GGG CTA TC-3′) and YOR68-7R (5′-GGG GAA TTC TCA GTC AGC ATA AAT AAT AAA-3′). The resulting 0.5-kb fragment was digested with BamHI and EcoRI and subcloned into the BamHI and EcoRI sites of pACT2 (CLONTECH) to yield pYL97.
Plasmids encoding GAL4AD-(Gly-Pro-Pro) n -K100 C1q-like domain were constructed as follows. A 0.5-kb DNA fragment encoding the K100 C1q-like domain was amplified by PCR with the primers Eco-pYL68 (5′-CAG GAA TTC CGC GGC CGG GGC TA-3′) and pYL68-Xho (5′-TTC CTC GAG TAT CAG TCA GCA TAA ATA ATA-3′), digested with EcoRI and XhoI, and subcloned into the EcoRI and XhoI sites of pGADGH to create the plasmid pGAD-C1q. Synthetic DNA fragments encoding (Gly-Pro-Pro) n (n = 6-9; Table I, entries 6-9) were amplified as above and subcloned into the BamHI and EcoRI sites of pGAD-C1q. Plasmids encoding GAP1, GAP2, GPA1, and GPA2 peptides (see Fig. 3) were constructed by the same procedure using the corresponding synthetic inserts (Table I, entries 10-13). To generate pYL103 for bacterial expression of a GST fusion protein of the K100 C1q-like domain, pYL97 was digested with BamHI and EcoRI, and the resulting 0.5-kb fragment was subcloned into the BamHI and EcoRI sites of pGEX-3X (Amersham Pharmacia Biotech) to yield pYL103.
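As a quick consistency check on the cloning design, the primer sequences quoted above can be scanned for the recognition sites of the enzymes used to digest each PCR product. The following is only an illustrative sketch: the recognition sequences (BamHI GGATCC, EcoRI GAATTC, XhoI CTCGAG) are standard, but the helper name `find_sites` and the dictionary layout are assumptions introduced for this example and are not part of the original protocol.

```python
# Recognition sequences of the restriction enzymes named in the text.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC", "XhoI": "CTCGAG"}

# Primer sequences as quoted in the text (5' -> 3'); spaces are stripped below.
PRIMERS = {
    "YOR68-6": "TTA GGA TCC ACG CGG CCG GGG CTA TC",
    "YOR68-7R": "GGG GAA TTC TCA GTC AGC ATA AAT AAT AAA",
    "Eco-pYL68": "CAG GAA TTC CGC GGC CGG GGC TA",
    "pYL68-Xho": "TTC CTC GAG TAT CAG TCA GCA TAA ATA ATA",
}


def find_sites(primer):
    """Return {enzyme: 0-based start index} for each site present in the primer.

    Hypothetical helper for illustration; only the first occurrence is reported.
    """
    seq = primer.replace(" ", "").upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for name, primer in PRIMERS.items():
    print(name, find_sites(primer))
```

Each primer carries exactly one of the three sites, matching the enzyme pair used for the corresponding subcloning step.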
The inserts for the X and Y libraries (Table I, entries 14 and 15) were chemically synthesized as single-stranded DNA fragments. To minimize bias in the amino acid distribution, the optimized mixtures of nucleotide derivatives developed by Cwirla et al. (8) were used at the degenerate positions. The libraries were constructed by a procedure similar to the construction of the plasmids encoding the GAL4AD-(Gly-Pro-Pro) n -K100 C1q-like domain.
Yeast Two-hybrid Binding Assays-Two-hybrid binding assays to detect interaction between HSP47 in the binding hybrid and peptides in the activation hybrids were performed according to the instructions for the MATCHMAKER two-hybrid kit (CLONTECH). A human lamin C fragment in pAS2-1 (pLAM5Ј-1, CLONTECH) was used as a negative control.
Library Screening-Two-hybrid screening of random collagenous peptide libraries was performed at a culture temperature of 30°C according to the instructions for the MATCHMAKER two-hybrid kit (CLONTECH), using HSP47 (pYL94) as a bait. Initial screening was carried out using histidine auxotrophic assays on agar plates, and the resulting positive clones were further screened using β-galactosidase assays on nitrocellulose membranes.
Western Blotting of Yeast Cell Lysates-Yeast cells harvested from 5 ml of liquid culture were frozen in liquid nitrogen and thawed in 100 μl of 10 mM Tris·HCl (pH 7.5) containing 1 mM phenylmethanesulfonyl fluoride and 0.2% Nonidet P-40. The cells were lysed by vortexing with glass beads (Sigma) at 4°C for 30 min. After centrifugation, the supernatant was collected, and the protein concentration was determined using a protein assay kit (Bio-Rad). Lysate proteins (10 μg) were separated by SDS-PAGE and transferred to nitrocellulose membranes. To detect the activation hybrid proteins, anti-GAL4AD monoclonal antibody (CLONTECH) was used as a primary antibody, and immunoreactive bands were visualized using an ECL kit (Amersham Pharmacia Biotech).
Expression, Purification, and Chemical Cross-linking of Recombinant K100 C1q-like Domain-Escherichia coli JM109 cells carrying pYL103 were grown in 5 ml of LB medium containing 50 μg/ml ampicillin at 37°C. The expression of the recombinant protein was induced by adding isopropyl-β-D-thiogalactopyranoside (0.5 mM), and the culture was continued for 1 h. Cell lysates were prepared as described elsewhere (9). GST-K100 C1q-like domain fusion protein in the lysate was adsorbed onto 0.3 ml of glutathione-Sepharose 4B (Amersham Pharmacia Biotech), and the beads were washed four times with 10 mM Hepes·Na (pH 7.5), 3.7 mM EDTA, 0.4 M NaCl. The fusion protein on the beads was cleaved by overnight treatment with 2 units of Factor Xa (Amersham Pharmacia Biotech) in 50 mM Hepes·Na (pH 8.2), 150 mM NaCl containing 1 mM CaCl2 at room temperature. The supernatant containing a recombinant K100 C1q-like domain was collected.
For chemical cross-linking, ice-cold solutions of the recombinant K100 C1q-like domain (~100 μg/ml), GST, ovalbumin, or bovine serum albumin (100 μg/ml each) in 50 mM Hepes·Na (pH 7.5), 150 mM NaCl were mixed with glutaraldehyde (final concentration 0.09%) and incubated at 37°C for 3 min. The reaction was terminated by adding 2× Laemmli's SDS sample buffer, and products were analyzed by SDS-PAGE.

RESULTS AND DISCUSSION
Detection of the Binding of (Gly-Pro-Pro) n Peptides to HSP47 in the Yeast Two-hybrid System-In an earlier report (5), we detected the specific binding of recombinant GST·HSP47 fusion protein to immobilized synthetic (Pro-Pro-Gly) n peptides when n was not less than 7 and showed that prolyl 4-hydroxylation (which converts Pro residues to 4-hydroxyprolyl (Hyp) residues) at the second Pro residue of each triplet has a negative effect on the interaction. As yeast cytosol is unlikely to contain prolyl 4-hydroxylase activity, a similar binding assay was performed using the yeast two-hybrid system. A plasmid encoding mouse HSP47 fused to the GAL4 DNA binding domain (GAL4BD) was transfected into yeast cells harboring HIS3 and LacZ genes in GAL4-driven reporter constructs. Synthetic DNA fragments containing various numbers of Gly-Pro-Pro repeats were fused to the gene encoding the GAL4 activation domain (GAL4AD), and these plasmids were also introduced into the yeast cells. The binding of the hybrid proteins expressed in the yeast cells was assessed by growth on histidine-depleted agar plates and by β-galactosidase assay on nitrocellulose membranes. In this system, no interaction was detected between HSP47 and peptides containing 5-10 repeats of (Gly-Pro-Pro) (Fig. 1A), in contrast to the results of the in vitro solid-phase binding assay using synthetic (Pro-Pro-Gly) n peptides (5).
FIG. 1. Interaction of HSP47 with (Gly-Pro-Pro) n peptides with or without a C-terminal K100 C1q-like domain. A, interaction between full-length HSP47 and (Gly-Pro-Pro) n constructs with or without a C-terminal K100 C1q-like domain. Yeast cells harboring the expression plasmid encoding the HSP47·GAL4BD fusion protein were further transfected with the plasmid encoding the activation hybrid indicated. Two-hybrid interaction was measured at 30°C by both growth on histidine-lacking agar plates and the following β-galactosidase assay on nitrocellulose membranes. B, yeast cells harboring the expression plasmid encoding HSP47·GAL4BD and that encoding the activation hybrid indicated were grown in liquid medium and harvested. After cell lysis using glass beads and Nonidet P-40-containing buffer, total yeast proteins (10 μg) containing the activation hybrids indicated were separated on 10% SDS-PAGE with (odd lanes) or without (even lanes) prior heat treatment (100°C, 5 min). Activation hybrid proteins were visualized by Western blotting using anti-GAL4AD antibody.

In parallel with the experiments above, one of the authors (T. Yorihuzi) cloned a gene encoding a novel C1q-like protein, tentatively named K100 protein, from an E17 mouse embryo cDNA library by two-hybrid screening using HSP47 as a bait. The open reading frame of the gene encodes an N-terminal signal sequence and 17 repeats of Gly-X-Y followed by a C-terminal globular domain. The globular domain is equivalent to the so-called C1q module (10, 11) and shows 28 and 34% amino acid identity with the corresponding domains of mouse C1q-B chain and type X collagen, respectively (Footnote 4). Interaction between HSP47 and K100 protein was detected in the two-hybrid system only when the C1q-like domain was present at the C terminus of the Gly-X-Y repeat sequence of K100, although the C1q-like domain itself did not interact with HSP47 (data not shown).
We therefore fused the C1q-like domain to the C termini of the (Gly-Pro-Pro) n peptides in the activation hybrids and tested for binding to HSP47. Specific interaction was detected when the number of triplet repeats fused to the C1q-like domain was greater than 7 (Fig. 1A). This result demonstrates that the interaction between HSP47 and (Gly-Pro-Pro) n peptides can be detected in the yeast two-hybrid system. In the presence of the K100 C1q-like domain, the strength of the interaction depended on the length of the peptides in a manner consistent with our previous observations based on the in vitro binding assay (5).
Next, we wished to clarify the role of the K100 C1q-like domain in HSP47-peptide interactions in the two-hybrid system. To determine whether the K100 C1q-like domain simply stabilized the activation hybrids by preventing enzymatic degradation in the cells, we examined the cellular protein levels of the activation hybrids by Western blotting using anti-GAL4AD antibody. The protein levels of the activation hybrids remained relatively constant, regardless of the presence or absence of the K100 C1q-like domain (Fig. 1B). The C1q-like domain of type X collagen is reported to form a very stable trimer that remains intact even in SDS-containing buffers (12-14). We looked for the formation of such a stable trimer by the K100 C1q-like domain using unboiled SDS-PAGE samples of the yeast lysate but did not detect any bands corresponding to the expected molecular mass of a trimer (Fig. 1B).
We further investigated the possible oligomer-forming properties of the K100 C1q-like domain using a more sensitive method. The recombinant K100 C1q-like domain was expressed in E. coli cells, purified, and cross-linked in solution using glutaraldehyde, and the products of the cross-linking reaction were analyzed by SDS-PAGE. The major cross-linked products migrated at positions corresponding to dimer and trimer (Fig. 2). Under the same cross-linking conditions, GST (control for dimeric proteins; Ref. 15) migrated as a dimer, whereas no oligomer formation was evident in reactions containing ovalbumin or bovine serum albumin (controls for monomeric proteins) (Fig. 2A). However, the majority of recombinant K100 C1q-like domain eluted in the monomer fraction on high pressure liquid chromatography gel filtration analysis at room temperature (data not shown). These results indicate that the K100 C1q-like domain possesses an intrinsic homotrimer-forming propensity, but the trimer that it forms is not particularly stable. The C1q-like domain fused to the C termini of the (Gly-Pro-Pro) n peptides may, therefore, have moderately stabilized trimers of the activation hybrids in the yeast cells and shifted the melting temperatures of the peptides up to the window of temperatures at which two-hybrid assay can be performed.
Effect of Assay Temperature and Amino Acid Replacement on the Interaction of (Gly-Pro-Pro) n Peptides with HSP47 in the Yeast Two-hybrid System-The thermal stability of collagenous triple-helices is known to depend on the lengths of the peptides; longer peptides show higher melting temperatures than shorter ones consisting of different numbers of the same triplet units (16). The result shown in Fig. 1 led us to speculate that a trimeric (or triple-helical) conformation of the (Gly-Pro-Pro) n peptides may be responsible for interaction with HSP47. If this were true, then peptides with fewer than eight repeats of (Gly-Pro-Pro) would be expected to bind to HSP47 at lower temperatures. We therefore carried out two-hybrid binding assays at 24°C rather than 30°C. When the two-hybrid assay was performed at 24°C, we detected binding of (Gly-Pro-Pro) 7 to HSP47 (Fig. 3A) even though this peptide did not appear to bind to HSP47 at 30°C (Fig. 1A).
In preprotocollagen, the native substrate of HSP47, the X and Y positions of the Gly-X-Y triplets are often occupied by amino acid residues other than proline. In addition, Pro to Ala substitutions in collagen model peptides are known to decrease the thermal stability of the triple-helical conformation (17). To investigate structure-activity relationships in HSP47 substrate peptides, Pro residues at either the X or the Y position of (Gly-Pro-Pro) 8 were replaced with Ala residues. A single substitution of Ala for Pro at the middle X position of the parent peptide abolished HSP47 binding activity at 30°C (Fig. 3B,  GAP1). In contrast, a single substitution at the Y position did not affect the binding (GPA1). This difference indicates that Pro residues in the X and Y positions do not contribute equally to the interaction with HSP47. Double substitution of Pro residues at either X or Y positions abolished the interaction with HSP47 at both 24 and 30°C (GAP2 and GPA2). The binding of the Ala-substituted peptides was also dependent on temperature; when the assay was performed at 24°C, GAP1 displayed HSP47 binding activity. These results imply that not only the trimeric structure but also the triple-helical conformation of the substrate peptides is responsible for their interaction with HSP47.
FIG. 2. Trimer-forming property of recombinant K100 C1q-like domain. A, recombinant K100 C1q-like domain was cross-linked using 0.09% glutaraldehyde and analyzed by SDS-PAGE on 15% gel followed by silver staining. GST was used as a control for dimeric protein. Ovalbumin and bovine serum albumin were used as controls for monomeric proteins. Molecular sizes are shown in kDa. The monomer, dimer, and trimer of the recombinant K100 C1q-like domain are indicated by asterisks. B, untreated and cross-linked recombinant K100 C1q-like domain was analyzed by SDS-PAGE on 12% gels followed by Western blotting using rabbit antiserum raised against recombinant K100 C1q-like domain.
4 T. Yorihuzi and K. Nagata, manuscript in preparation.

Design and Characterization of the Collagenous Peptide Libraries-The screening of combinatorial peptide libraries is a powerful strategy for identifying sequences that interact with a specific protein of interest. This strategy has been successful in identifying binding motifs using synthetic peptide libraries (18-20), the phage-display system (8), and the two-hybrid system in yeast cells (21). We have applied such a combinatorial library technique to yeast two-hybrid screening.
To determine the binding preferences of HSP47 for amino acid residues at the X and Y positions of the substrate collagen-like peptides, we designed random peptide libraries for two-hybrid selection. Because (Gly-Pro-Pro) 8 combined with a C-terminal K100 C1q-like domain was the minimal peptide required for interaction with HSP47 under the standard assay conditions at 30°C (Fig. 1A), the random peptides in the activation hybrid library were fixed to 24-mers (encoding 8 triplets). The Gly residues at every third position were also fixed, because the Gly residues are conserved in all types of collagen. If all of the X and Y residues of (Gly-X-Y) 8 are randomized, the theoretical diversity is $20^{16} \approx 6.6 \times 10^{20}$. This value is far beyond the practical limit of diversity that can be screened by colony selection. We therefore constructed two independent peptide libraries, an X library and a Y library, in which six of the eight Pro residues at either X or Y positions were randomized (Fig. 4). The DNA fragments encoding the randomized peptides were chemically synthesized, amplified by PCR, and inserted in frame between the sequences encoding GAL4AD and the K100 C1q-like domain (Fig. 4). The theoretical diversity of each library is $20^{6} = 6.4 \times 10^{7}$.
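The two diversity figures quoted above are simple counts of sequence combinations. As a minimal sketch, assuming that all 20 amino acids are equally admissible at each randomized position (the function name `library_diversity` is introduced here for illustration only), they can be reproduced as follows:

```python
def library_diversity(randomized_positions, alphabet_size=20):
    """Number of distinct peptides when every randomized position may hold
    any of `alphabet_size` residues (hypothetical helper for illustration)."""
    return alphabet_size ** randomized_positions


# All 16 X and Y positions of (Gly-X-Y)8 randomized.
full_diversity = library_diversity(16)

# Six randomized positions, as in the X library or the Y library.
single_library_diversity = library_diversity(6)

print(f"{full_diversity:.1e}")
print(f"{single_library_diversity:.1e}")
```

Restricting randomization to six positions reduces the search space by a factor of 20^10, which is what brings the library within reach of colony selection.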
We next characterized the diversity of each of the peptide libraries expressed in yeast cells by sequence analysis of the transfected plasmid DNA. Of 200 clones chosen at random from the X library, 121 (60.5%) contained uncorrupted inserts encoding 24-mer peptides, and 80 (40%) of 200 randomly chosen clones from the Y library contained normal inserts. The random nature of the amino acid residues present in the degenerate positions was also confirmed by sequence analysis. The frequency of occurrence of each amino acid residue in the degenerate positions of the 200 randomly chosen peptides was taken as the basis for the calculation of enrichment ratios after two-hybrid selection.
Selection of HSP47 Binding Peptides from the X and Y Libraries-Using HSP47 as a bait, $1 \times 10^{7}$ clones of the X library were screened. This two-hybrid screening yielded no positive clones in either the histidine auxotrophic selection or the β-galactosidase assay. The probability that six degenerate X positions are all occupied by Pro residues is $20^{-6}$ ($\approx 1.6 \times 10^{-8}$), and the probability that any five X positions are occupied by Pro residues is $\approx 1.8 \times 10^{-6}$ ($= 20^{-6} \times 19 \times 6$). Because the net clone number screened was estimated to be $6.05 \times 10^{6}$ (= 60.5% of $1 \times 10^{7}$), this result indicates that none of the Pro residues at the X positions can be replaced with any other amino acid residues in the (Gly-Pro-Pro) 8 peptide under these assay conditions. Similar selection from $7 \times 10^{5}$ clones of the Y library yielded 88 clones that showed specific interaction with HSP47. The different contributions of the X and Y positions to interaction with HSP47 were also demonstrated using (Gly-Pro-Pro) n peptides with Ala substitutions (Fig. 3B, compare GAP1 to GPA1). Residues in the X positions are more accessible to the solvent in the triple helix (22), and (Gly-Pro-Y) n sequences show higher melting temperatures of the triple helix than the corresponding (Gly-X-Pro) n peptides (23). Thus, residues at the X and Y positions in the Gly-X-Y repeats appear to contribute differently to either the triple helix stability or the formation of the HSP47 binding surface. Because GAP1 interacted with HSP47 at 24°C (Fig. 3B), most of the peptides in the X library appear to have failed to form triple-helical structures under the screening conditions at 30°C.
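The probabilities in this argument follow from the stated expressions and can be evaluated directly. The sketch below assumes, as an idealization, that each of the 20 amino acids is equally likely at every degenerate position (the nucleotide mixtures actually used may deviate from this); the variable names are illustrative and are not taken from the original work.

```python
from fractions import Fraction

P_PRO = Fraction(1, 20)   # assumed probability of Pro at one degenerate position
N_SITES = 6               # degenerate X positions in the library
NET_CLONES = 6_050_000    # 60.5% of the 1e7 clones screened

# All six degenerate positions occupied by Pro: 20^-6.
p_all_pro = P_PRO ** N_SITES

# Exactly five positions occupied by Pro: 20^-6 * 19 * 6
# (6 choices for the non-Pro position, 19 choices for its residue).
p_five_pro = p_all_pro * 19 * N_SITES

print(f"P(six Pro)  = {float(p_all_pro):.2e}")
print(f"P(five Pro) = {float(p_five_pro):.2e}")
print(f"expected clones with six Pro:  {float(p_all_pro * NET_CLONES):.2f}")
print(f"expected clones with five Pro: {float(p_five_pro * NET_CLONES):.1f}")
```

Multiplying each probability by the net clone number gives the number of such clones expected in the screened population under the uniform-frequency assumption.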
To exclude any possible experimental bias that might have affected the selection, we compared the amounts of the activation hybrids expressed in the HSP47-selected yeast clones with those in randomly chosen negative clones. Protein levels estimated by immunoblotting using anti-GAL4AD antibody were not significantly different between the clones examined, regardless of the HSP47 binding activity of the peptides (Fig. 5). We therefore conclude that the result of two-hybrid selection may be attributed directly to the HSP47 binding property of the substrate peptides.
All of the peptide sequences deduced from the DNA sequences of the selected clones are shown in Fig. 6. We could not identify any characteristic order in the primary amino acid sequences of the individual peptides. Most of the amino acid residues were almost uniformly distributed throughout the randomized Y positions of HSP47 binding peptides. Arginine residues, which have been reported to possess a triple-helix stabilizing effect comparable to that of Hyp residues (24), were most strongly enriched (up to 6-fold) in the HSP47 binding peptides relative to basal values taken from 200 randomly chosen peptides. Proline residues were also highly enriched (about 5-fold). On the other hand, Asp, Phe, Gly, Asn, and Trp residues were apparently excluded (less than 20% of the basal frequency, Fig. 7).
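The enrichment ratio used here compares how often a residue appears at the randomized positions of the selected peptides with how often it appears in the unselected reference sample. A minimal sketch of that calculation follows; the function name and the toy residue strings are hypothetical placeholders chosen for illustration and are not data from this study.

```python
from collections import Counter


def enrichment_ratios(selected, background):
    """Return {residue: frequency in `selected` / frequency in `background`}.

    Both arguments are strings of one-letter residue codes pooled from the
    randomized positions. Only residues present in the background are
    reported, because the ratio is undefined otherwise.
    """
    sel_counts = Counter(selected)
    bg_counts = Counter(background)
    n_sel, n_bg = len(selected), len(background)
    return {
        aa: (sel_counts[aa] * n_bg) / (bg_counts[aa] * n_sel)
        for aa in bg_counts
    }


# Toy placeholder input (NOT data from the study): a uniform background.
background = "RPADGS"
selected = "RRRPPA"

ratios = enrichment_ratios(selected, background)
for aa in sorted(ratios):
    print(aa, ratios[aa])
```

A ratio above 1 marks a residue that is over-represented after selection, and a ratio near 0 marks one that is effectively excluded.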
FIG. 3. Effect of assay temperature on the interaction of collagenous peptides with HSP47. A, interaction between HSP47 and (Gly-Pro-Pro) n peptides of different lengths. Interaction in the yeast two-hybrid system was detected by both histidine auxotrophy and β-galactosidase activity. B, interaction between HSP47 and (Gly-Pro-Pro) 8 constructs in which prolyl residues at the X or Y positions were replaced with Ala residues, as indicated. Two-hybrid assays were performed as in A.

Brodsky and co-workers (6, 17, 24-26) have studied the contribution of various amino acid residues to the stability of triple-helical structures using sets of designed collagen models such as acetyl-(Gly-Pro-Hyp) 3 -Gly-X-Y-(Gly-Pro-Hyp) 4 -Gly-Gly-amide. Surprisingly, the enrichment ratios for amino acid residues at the Y position of our model peptides correlated strongly with the thermal stabilities of corresponding collagen model peptides reported in the literature (Fig. 8). This correlation clearly demonstrates the importance of triple-helical structure in substrate recognition by HSP47. Selection by HSP47 binding from the Y library appears to be attributable to the triple-helix-forming propensities of the substrate peptides.
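One common way to quantify a relationship of this kind is the Pearson correlation coefficient between the per-residue enrichment ratios and the corresponding melting temperatures. The text does not state which statistic, if any, was computed, so the following is only a sketch of one possible analysis; the function name and the numbers are placeholders and do not reproduce the values plotted in Fig. 8.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder numbers only (NOT measured Tm values or enrichment ratios).
tm_values = [1.0, 2.0, 3.0, 4.0]
enrichment = [2.0, 4.0, 6.0, 8.0]

r = pearson_r(tm_values, enrichment)
print(r)
```

With real data, one pair of values per amino acid residue would be supplied in place of the placeholder lists.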
Concluding Remarks-In this study, we have focused on the molecular determinants of HSP47 substrate recognition and established a yeast two-hybrid assay system to study the interaction between HSP47 and substrate peptides. Using this assay, we have obtained the following evidence that HSP47 preferentially recognizes triple-helical peptides as binding substrates: 1) a trimerizing domain, such as the K100 C1q-like domain, fused to the peptides was required for detectable interaction (Figs. 1 and 2); 2) some peptides were bound by HSP47 at 24°C but not at 30°C, suggesting that the melting temperature of the triple helix may be an important factor in binding (Fig. 3); and 3) amino acid residues that stabilize the triple helix, such as Arg and Pro, were enriched in the group of peptides selected from the Y library, whereas residues that destabilize it were depleted (Figs. 6-8).
Although the precise function of HSP47 remains unclear, these studies of substrate recognition lead us to some working hypotheses. In the previous paper, we showed that HSP47 prefers Pro residues at the Y positions rather than Hyp in the context of the (Pro-Y-Gly) n sequence (5). Combined with the results shown in this paper, it seems that HSP47 retains in the ER triple-helical procollagen molecules that contain less prolyl 4-hydroxylated portions. Prolyl 4-hydroxylase is also reported to bind to a less prolyl 4-hydroxylated form of single chain procollagen (27). These mechanisms may be expected to ensure the quality of procollagen molecules to be secreted, although it is not known whether HSP47 can exert a quality control mechanism alone or in cooperation with prolyl 4-hydroxylase. It is not clear whether HSP47 exclusively binds to triple-helical portions containing Gly-Pro-Pro triplets or whether other triplets such as Gly-Pro-Arg can be accommodated in the HSP47-binding sites, because all peptides designed for two-hybrid screening contain known HSP47 binding sequences such as Gly-Pro-Pro-Gly and Gly-Pro-Pro at both ends of the randomized sites (Fig. 4). If sequences other than (Gly-Pro-Pro) n can form HSP47-binding sites, our finding that HSP47 prefers triple-helical substrates suggests another role for HSP47: it might facilitate procollagen folding in the ER by stabilizing partially folded triple-helical intermediates that would otherwise be unstable at body temperature. Further biochemical and physicochemical studies using individual model peptides should clarify the molecular function of HSP47 as a collagen-specific chaperone.
Although we initially undertook two-hybrid screening of the peptide libraries with a view to identifying HSP47 binding motifs in the primary structure, no such motifs were apparent from our results. Instead, we have obtained information regarding the preferred secondary structure of HSP47 substrates (Figs. 6-8). It is of note that this is the first case, to our knowledge, in which biological selection from combinatorial libraries has provided information about intermolecular conformation. We believe that a similar two-hybrid method will be a powerful tool to elucidate the substrate conformation of other collagen-binding proteins.

FIG. 5. Amounts of the activation hybrids expressed in the yeast cells. Yeast lysates (10 μg) were separated by SDS-PAGE on 10% gels, and the activation hybrids were detected by Western blotting using anti-GAL4AD antibody. Untransfected CG1945 lysate was used as a negative control.
FIG. 6. Sequences of HSP47 binding peptides selected from the Y library. Amino acid residues at Y1-Y6 positions are shown in bold. Arg and Pro residues, which predominated in these positions, are highlighted in blue and red, respectively. Amino acid residues that could not be identified by DNA sequencing are denoted as "?". Clone 88 encoded a truncated peptide.
... Fig. 7 are plotted against the Tm values of the corresponding model peptides, Ac-(Gly-Pro-Hyp) 4 -(Gly-Pro-Y)-(Gly-Pro-Hyp) 3 -Gly-Gly-amide. Tm values were taken from the literature (6, 17, 24-26). Amino acid residues are shown using the single-letter code.