Human cystatin D. cDNA cloning, characterization of the Escherichia coli expressed inhibitor, and identification of the native protein in saliva.

A cDNA coding for cystatin D, a human member of the cystatin protein family, has been cloned after specific amplification of reverse-transcribed parotid gland RNA. After replacing the segment encoding the putative 20-residue signal peptide with one encoding the Escherichia coli OmpA leader sequence, the cDNA was expressed in E. coli. The isolated recombinant protein exhibited Ki values of 1.2 nM and > 1 microM for papain and cathepsin B, respectively. An antiserum raised against recombinant cystatin D recognized a protein in human saliva with electrophoretical mobility identical to that of the recombinant protein. Immunoenzymatic analysis revealed that this cysteine proteinase inhibitor is present in human saliva and tears at concentrations of 3.8 and 0.5 mg/liter, respectively, while it was not detected in seminal plasma, blood plasma, milk, or cerebrospinal fluid. Cystatin D purified from human saliva by immunosorption displayed a heterogeneous N-terminal end, with sequences starting at residues 5, 7, 9, and 11 of the predicted N-terminal portion of the mature protein. On the basis of structural and functional properties, cystatin D represents a novel cysteine proteinase inhibitor possibly playing a protective role against proteinases present in the oral cavity.

The cystatin superfamily has been subdivided into families I, II, and III (also called the stefin, cystatin, and kininogen families, respectively), each with members differing from those of the other families in structural organization and biological distribution (Barrett et al., 1986; Rawlings and Barrett, 1990b). The family I cystatins A and B are small proteins consisting of single polypeptide chains of about 100 amino acid residues without disulfide bridges. The family II cystatins consist of polypeptide chains of approximately 120 amino acid residues with two intrachain disulfide bonds. Finally, the family III cystatins, the kininogens, display a higher degree of structural complexity characterized by the presence of three family II cystatin-like domains, each with two disulfide bridges at positions homologous to those in family II cystatins (Muller-Esterl et al., 1986). Family I and II cystatins are mainly present intracellularly and in secretory fluids (Abrahamson et al., 1986), whereas kininogens are highly concentrated in blood plasma (Adam et al., 1985; Abrahamson et al., 1986).
Human family II cystatins are encoded by genes in a multigene family, probably comprising seven different members (Al-Hashimi et al.). Considering that recent nucleotide sequence analyses have revealed that two of them are pseudogenes (Saitoh et al., 1991), five family II cystatins may be present in human tissues and body fluids. To date, four of these have been isolated and characterized: cystatins C, S, SN, and SA. Cystatin C is the one most widely distributed in the body, with particularly high concentrations in seminal plasma and cerebrospinal fluid (Lofberg and Grubb, 1979; Abrahamson et al., 1986). A point mutation in the gene coding for cystatin C causes a hereditary form of amyloid angiopathy (Palsdottir et al., 1988; Levy et al., 1989; Abrahamson et al., 1992). In afflicted patients, the cystatin C variant is deposited in cerebral arteries, resulting in brain hemorrhage, dementia, and early death (Lofberg et al., 1987). The three other human family II cystatins characterized to date, cystatins S, SN, and SA, have been isolated from saliva (Isemura et al., 1984, 1986, 1987, 1991; Al-Hashimi et al., 1988; Ramasubbu et al., 1992), but they are also present in other secretory body fluids like tears and seminal plasma (Abrahamson et al., 1986; Sabatini et al., 1989).
We have recently cloned a new gene from a human genomic library, the structure of which is very similar to those of previously known family II cystatin genes (Freije et al., 1991). This gene contains coding information for a pre-protein of 142 amino acid residues, of which the first 20 probably constitute a signal peptide. The deduced polypeptide chain sequence of the mature protein, called cystatin D, displays regions homologous to the three regions forming the inhibitory center of other cystatins. Cystatin D would thus correspond to an as yet unidentified fifth human family II cystatin which, according to Northern blot studies (Freije et al., 1991), should be present in human saliva. In this work we report the cloning and expression in Escherichia coli of a cDNA coding for the putative cystatin D. We also show that the recombinant protein displays inhibitory activity against cysteine proteinases. Finally, we report the purification and characterization of cystatin D from human saliva.

EXPERIMENTAL PROCEDURES
Materials-Specimens of human parotid gland were obtained at autopsies performed within 15 h after death, frozen in liquid nitrogen, and stored at -70 °C until used. The RNA PCR kit used for the reverse transcription of total RNA and cDNA amplification was from Perkin-Elmer Cetus. Restriction endonucleases and other reagents used for molecular cloning were purchased from Boehringer Mannheim (Mannheim, Germany). Oligonucleotides were synthesized by the phosphoramidite method in an Applied Biosystems DNA synthesizer, model 381A, and purified by polyacrylamide gel electrophoresis according to standard procedures (Maniatis et al., 1982). Double-stranded DNA probes were radiolabeled with [32P]dCTP (3000 Ci/mmol) using a commercial random-priming kit from Boehringer Mannheim. Affinity purified preparations of papain (EC 3.4.22.2) and human cathepsin B (EC 3.4.22.1) were a gift from Dr. I. Bjork, Uppsala, Sweden, and purchased from Calbiochem (San Diego, CA), respectively. The enzyme substrates Bz-Arg-pNA and Z-Phe-Arg-NHMec were obtained from Bachem (Bubendorf, Switzerland). Recombinant cystatin C was prepared as described earlier (Abrahamson et al., 1988). Reagents for amino acid sequencing were from Applied Biosystems (Foster City, CA).
Cloning of Cystatin D cDNA-cDNA synthesis was carried out using 1-µg aliquots of total RNA isolated from human parotid gland and the RNA PCR kit from Perkin-Elmer Cetus. RNA was isolated by guanidinium thiocyanate-phenol-chloroform extraction, following the procedure described by Chomczynski and Sacchi (1987). The reverse transcription reaction was performed for 1 h at 42 °C with random hexamers as primers, and the resulting material was then used in a PCR with the oligonucleotides cysd-a (5'-GAACATGATGTGGCCCATGCAC) and cysd-b (5'-GGTGGTCAGTGTGACAGGCCTT) (25 pmol/reaction), deduced from the 5'- and 3'-ends of the cystatin D open reading frame, respectively. PCR was performed with the use of a Techne Thermal Cycler PHC-3 (New Brunswick Scientific, New Brunswick, NJ) in a volume of 100 µl overlaid with 100 µl of mineral oil to prevent evaporation. The reaction mixture was heated at 95 °C for 2 min and then 40 cycles were run under the following conditions: 95 °C, 1 min; 59 °C, 1 min; 72 °C, 2 min; in the last cycle the polymerization was carried out at 72 °C for 10 min. The final product was treated with T4 polynucleotide kinase, purified by electrophoresis in a 2% agarose gel, and subcloned into the SmaI site of pEMBL19. DNA from six independent clones (named pCysD1 to pCysD6) was obtained by the alkaline lysis procedure and sequenced by the dideoxy chain termination method (Sanger et al., 1977) using the Sequenase Version 2.0 kit (U. S. Biochemical Corp.). All nucleotides in the coding sequence were identified in both strands.

The abbreviations used are: PCR, polymerase chain reaction; PAGE, polyacrylamide gel electrophoresis; bp, base pair(s); CAPS, 3-(cyclohexylamino)-1-propanesulfonic acid; Bz, benzoyl; Z, benzyloxycarbonyl; pNA, p-nitroanilide; NHMec, 7-(4-methyl)coumarylamide.
Construction of an E. coli Expression Vector-Plasmid pHD389 is a derivative of the cystatin C expression plasmid pHD313 (Abrahamson et al., 1988; Dalbøge et al., 1989b). pHD313 was digested with NarI, the ends were filled in using the Klenow fragment of E. coli DNA polymerase, and the plasmid was then digested with EcoRI. The resulting linearized plasmid, devoid of a NarI/EcoRI fragment downstream from the 3'-end of the cystatin C gene insert, was isolated. An EcoRI/NaeI fragment containing the polylinker region of pUC18 followed by the transcription terminator from phage fd was isolated from pHD162 SP9b (Dalbøge et al., 1989a) and ligated to the above plasmid fragment, resulting in plasmid pHD388, in which the cystatin C gene is followed by the pUC18 polylinker and the fd transcription terminator. pHD388 was digested with EcoRI, the ends were filled in using the Klenow fragment of E. coli DNA polymerase, and the plasmid was then digested with ClaI to remove the cystatin C gene including the OmpA signal peptide encoding fragment. A 100-bp ClaI/PvuII fragment containing the OmpA signal peptide encoding sequence was isolated from pHD282-2 (Dalbøge et al., 1989b) and ligated to the plasmid fragment from pHD388, resulting in pHD389. The resulting construct thus contained the phage λ cI temperature-sensitive repressor gene, the phage λ PR promoter, an optimized ribosome binding site, the OmpA signal peptide encoding sequence, a pUC18 polylinker, and the phage fd transcription terminator.
Expression of the Cystatin D Gene in E. coli-An additional oligonucleotide was synthesized which contained a recognition site for NarI (Cys-Nar: 5'-GCGGCGCCCGGGAGTGCCTCG). This primer was used to amplify the coding sequence of cystatin D, excluding the putative signal sequence, using the oligonucleotide cysd-b as downstream primer and 25 ng of the plasmid pCysD1 as template. Cycling conditions were as above, except that only 30 cycles were carried out and the annealing temperature was 55 °C in the first 5 cycles and 60 °C for the remaining 25 cycles. The final product was digested with NarI and StuI and the resulting DNA fragment was ligated between the NarI and the SmaI sites of the vector pHD389. The resulting plasmid, called pCysD2-1, was transformed into E. coli strain MC1061 (Casadaban and Cohen, 1980). Bacteria containing the expression plasmids pHD389 or pCysD2-1 were induced to produce recombinant protein as previously described for cystatin C production (Abrahamson et al., 1988; Dalbøge et al., 1989b). To extract the recombinant protein from the periplasm, the induced bacteria were collected by centrifugation and resuspended in one-fifth of the original volume in 0.2 M Tris buffer, pH 9.0, with 20% sucrose and 0.1 M EDTA. After centrifugation, bacteria were resuspended again in one-twentieth of the original volume in 10 mM Tris buffer, pH 9.0, by vigorous agitation for 10 min. Cell debris was removed by centrifugation at 20,000 × g for 15 min and the supernatant was stored at 4 °C until used.
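The cloning strategy above relies on a NarI site carried by the Cys-Nar primer and a StuI site carried by cysd-b. As a minimal sketch, the primer sequences quoted in the text can be scanned for these sites. The recognition sequences (NarI GGCGCC, StuI AGGCCT, SmaI CCCGGG) are standard values and are not stated in the text; the function name `find_sites` is purely illustrative, and the primer strings assume that the hyphens in the printed sequences are line-break artifacts.

```python
# Recognition sequences (5'->3'); all three are palindromic, so scanning one
# strand is sufficient. These values are assumptions taken from standard
# enzyme references, not from the paper itself.
SITES = {"NarI": "GGCGCC", "StuI": "AGGCCT", "SmaI": "CCCGGG"}

# Primer sequences as quoted in the text.
PRIMERS = {
    "Cys-Nar": "GCGGCGCCCGGGAGTGCCTCG",
    "cysd-b": "GGTGGTCAGTGTGACAGGCCTT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site present in seq."""
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Under these assumptions the scan locates the NarI site in Cys-Nar and the StuI site in cysd-b, consistent with the NarI/StuI digestion described; StuI leaves a blunt end, which is compatible with ligation into the blunt SmaI site of the vector.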
Isolation of Recombinant Cystatin D-Periplasmic extracts were concentrated 5-fold and applied to a Mono Q HR 5/5 column (Pharmacia LKB, Uppsala, Sweden) equilibrated in 20 mM ethanolamine, pH 9.0. The elution was carried out with a gradient of the same buffer containing 1.0 M NaCl at a flow rate of 0.5 ml/min. Fractions containing recombinant cystatin D were pooled and gel chromatographed on a Superdex 75 HR 10/30 column (Pharmacia LKB) in 0.1 M ammonium acetate at a flow rate of 0.2 ml/min. Fractions of 0.2 ml were collected and those containing pure recombinant cystatin D, as judged by SDS-polyacrylamide gel electrophoresis, were pooled and lyophilized. All purification steps were performed at 4 °C.
Amino Acid Sequence Analysis-Isolated recombinant cystatin D (0.5 nmol) or protein bands blotted onto polyvinylidene difluoride membranes (Trans-Blot, Bio-Rad) were subjected to N-terminal sequence analysis by automatic Edman degradation using an Applied Biosystems 477A Sequencer in the presence of Polybrene (Hewick et al., 1981; Matsudaira, 1987). The resulting phenylthiohydantoin derivatives were identified and quantified using an on-line phenylthiohydantoin analyzer (model 120A) and the standard elution program of Applied Biosystems.
Determination of Cysteine Proteinase Inhibitory Activity-Concentrations of inhibitorily active recombinant cystatin D in samples were determined by titration of papain, which previously had been active site-titrated with E-64 (Barrett et al., 1982). For determination of equilibrium constants for dissociation (Ki) of cystatin D complexes with papain and human cathepsin B, continuous rate assays with 10 µM Z-Phe-Arg-NHMec as substrate in 100 mM sodium phosphate buffer were employed (Nicklin and Barrett, 1984). The buffer contained 1 mM dithiothreitol and 2 mM EDTA and was adjusted to pH 6.5 for papain assays and to pH 6.0 for cathepsin B assays. Cathepsin B was preincubated for 20 min in assay buffer at room temperature before use. The enzyme concentrations in the assays were 0.05-0.25 nM. The highest cystatin D concentration tried in cathepsin B assays was 100 nM. The inhibitor concentrations giving informative inhibition, i.e. resulting in a new steady-state rate within 1 h after addition of inhibitor, were 20-50 nM in the papain assays. Substrate hydrolysis at 37 °C was monitored in a Perkin-Elmer Cetus LS50 fluorometer at excitation and emission wavelengths of 360 and 460 nm, respectively. Steady-state velocities before (v0) and after (vi) addition of inhibitor were found with the aid of linear regression using FLUSYS (Rawlings and Barrett, 1990a) ... (Henderson, 1972). Km values for hydrolysis of Z-Phe-Arg-NHMec under the assay conditions, of 60 µM for papain (Hall et al., 1992) and 250 µM for cathepsin B (Kirschke and Barrett, 1987) ...

Antiserum Production-Polyclonal antisera against recombinant cystatin D were produced by injecting 0.2 mg of purified recombinant cystatin D emulsified in complete Freund's adjuvant subcutaneously into each of two rabbits. Three weeks later identical injections were given. The rabbits were bled 6 weeks after the first injection and then every third week. The IgG fraction of the antiserum was isolated by chromatography on a protein A-Sepharose column (Pharmacia LKB).
Immunochemical Analyses-Proteins separated by SDS-PAGE (Laemmli, 1970) were electrophoretically transferred to nitrocellulose membranes (Millipore, Bedford, MA) at 20 mA for 2 h in a Bio-Rad Trans-Blot apparatus with a buffer containing 10 mM CAPS, 4 mM NaOH, and 20% (v/v) methanol. After transfer, filters were blocked in phosphate-buffered saline containing 10% bovine serum albumin for 16 h and then incubated with anti-cystatin D antiserum (diluted 1:100 in phosphate-buffered saline with 10% bovine serum albumin) for 1 h. After washing the membranes five times with phosphate-buffered saline containing 0.1% Tween 20, they were incubated with 125I-protein A (ICN Radiochemicals) for 30 min and then washed again with phosphate-buffered saline with 0.1% Tween 20. Immunoreactive bands were detected by autoradiography using a Kodak X-Omat film and an exposure time of 8 h. Classical immunoelectrophoresis employing 1% (w/v) agarose gels in 75 mM barbital buffer, pH 8.6, was performed according to Scheidegger (1955).
The cystatin D concentration in different body fluids was measured by enzyme-amplified single radial immunodiffusion (Lofberg and Grubb, 1979). Gels of 1% (w/v) agarose containing 10% (w/v) dextran T10 and 0.6% (v/v) of the rabbit antiserum against recombinant cystatin D were used. The precipitates were visualized by carbazole staining after incubation with horseradish peroxidase-labeled swine antibodies against rabbit immunoglobulins (Dako Inc., Copenhagen, Denmark). A solution of isolated recombinant cystatin D was used to produce the standard curves of the procedure. The protein concentration of the recombinant cystatin D solution was determined by quantitative amino acid analysis. Dilution analysis established the sensitivity of the immunochemical quantitation to 0.1 mg of cystatin D/liter.
Purification and Characterization of Cystatin D from Human Saliva-Two hundred ml of saliva was collected directly into a flask containing benzamidinium chloride, EDTA, Tris, and sodium azide to final concentrations of at least 0.006, 0.03, 0.05, and 0.015 M, respectively. The saliva was then frozen, thawed, and centrifuged at 5,000 × g for 15 min. The supernatant was mixed with an equal volume of 0.1 M Tris buffer, pH 7.4, containing 0.5 M NaCl, 5 mM benzamidinium chloride, 10 mM EDTA, and 0.015 M sodium azide. The resulting solution was applied to an immunoaffinity column, prepared by coupling of the IgG fraction of a rabbit antiserum raised against recombinant cystatin D to CNBr-Sepharose (Pharmacia LKB) as described by the manufacturer. After extensive washing of the column with a 0.1 M Tris buffer, pH 7.4, containing 0.5 M NaCl, 5 mM benzamidinium chloride, 10 mM EDTA, and 0.015 M sodium azide, a 0.2 M glycine buffer, pH 2.2, with 0.5 M NaCl, 5 mM benzamidinium chloride, 10 mM EDTA, and 0.015 M sodium azide was used to elute immunosorbed proteins. Agarose gel electrophoresis (Jeppsson et al., 1979) and immunoelectrophoresis using the antiserum against recombinant cystatin D were used to monitor the column effluent. The protein-containing acid effluent was immediately neutralized by addition of 2 M Tris buffer, pH 8.6, and then concentrated to about 100 µl by pressure ultrafiltration using Diaflo YM2 and Centricon 3 membranes (Amicon Corp.). Agarose gel electrophoresis at pH 8.6 of this concentrated solution revealed two protein bands after fixation and staining of the gel. The agarose gel electrophoresis was therefore repeated, but the fixation and staining procedure was replaced by blotting of the separated proteins onto a polyvinylidene difluoride membrane followed by N-terminal sequencing of the transferred proteins (Matsudaira, 1987; Olafsson et al., 1990).

RESULTS
Isolation and Sequence of a Human Cystatin D cDNA Clone-Based on the DNA sequence of the cystatin D gene, two specific primers (cysd-a and cysd-b), corresponding to the sense strand upstream of the assumed ATG start site and to the antisense strand downstream of the stop codon, were synthesized and used to amplify single-stranded cDNA derived from human parotid gland RNA. We chose this tissue due to positive Northern blot hybridization with a cystatin D-specific probe (Freije et al., 1991). A PCR product compatible with the size predicted from the three putative exons in the gene sequence (465 bp) was obtained. Digestion of the 465-bp product with PstI, which has a unique recognition site in the putative exon 1 of the gene, produced two fragments of the correct predicted lengths (data not shown).
The PCR product was subcloned into pEMBL19 and six independent clones were subjected to nucleotide sequence analysis. The nucleotide sequence of the cystatin D cDNA and the corresponding amino acid sequence are shown in Fig. 1 ... (..., 1991). This difference at the DNA level results in an amino acid change (arginine specified by the cDNA sequence and cysteine by the genomic one). Hybridization studies with allele-specific oligonucleotides on a large number of genomic DNA samples have demonstrated that this change corresponds to a polymorphism in the population (Balbin et al., 1993).

Expression of Cystatin D in E. coli-In order to elucidate whether the cloned cDNA sequence codes for a biologically active cysteine proteinase inhibitor, an attempt was made to express the cDNA in E. coli. To produce the putative mature protein in the periplasm of E. coli, the previously proposed signal peptide of cystatin D was replaced by the E. coli OmpA signal peptide. For this purpose, the sequence coding for the putative mature protein was amplified using a pair of oligonucleotides corresponding to the 3'- and 5'-ends of the sequence (Fig. 2). Due to the design of the 5'-primer (Cys-Nar), the amplified fragment contained the coding information for 2 additional amino acids (Ala-Pro) at the N-terminal end (Fig. 2).
The DNA fragment thus obtained was inserted in the polylinker region of the expression vector pHD389. The resulting plasmid, called pCysD2-1, contained the coding sequence for the proposed mature cystatin D in-frame with the one coding for the signal peptide of OmpA.
The original vector (pHD389) as well as the recombinant plasmid (pCysD2-1) were transformed into E. coli MC1061 and the transformed bacteria were induced to produce the recombinant protein. The periplasmic contents were extracted by an osmotic shock procedure (Dalbøge et al., 1989b) and the protein composition was analyzed by SDS-PAGE. As can be seen in Fig. 3, the periplasmic extract of the bacteria transformed with the recombinant plasmid contained a protein of about 17 kDa, which was not present in the control extracts. The yield of recombinant protein was approximately 100 mg/liter culture.

FIG. 1. Human cystatin D cDNA. The sequence of the PCR-amplified cDNA is shown, with the amino acid sequence of the translation product below the nucleotide sequence. The residues which differ in the previously described genomic sequence are presented between brackets. The sequences of oligonucleotides cysd-a and cysd-b (complementary strand), used for the amplification, are underlined.

To confirm that the 17-kDa polypeptide produced in E. coli was cystatin D, it was isolated from bacterial extracts using the strategy described under "Experimental Procedures." Thus, extracts containing about 20 mg of total protein were chromatographed on a Mono Q column and the chromatogram obtained is shown in Fig. 4. As judged by SDS-PAGE, the putative cystatin D did not bind to the anion-exchange column, in contrast to most other proteins of the extract (Fig. 4, inset). The non-binding material was then fractionated by size-exclusion fast protein liquid chromatography and the eluate analyzed by 280-nm absorption and SDS-PAGE (Fig. 5). The main peak of the chromatogram corresponded to an isolated, single polypeptide chain protein, with a molecular mass of about 17 kDa (Fig. 5, inset). Automatic Edman degradation of the polypeptide chain identified a single sequence, confirming the homogeneity of the protein preparation. The sequence obtained (Ala-Pro-Gly-Ser-Ala-Ser-Ala-Gln-Ser-Arg-Thr-Leu-Ala-Gly-Gly-Ile-His-Ala-Thr-Asp-Leu-Asn-Asp-Lys-Ser-Val-Gln-) corresponded exactly to the one deduced for cystatin D from the cDNA sequence, with the exception of the first 2 residues (Ala-Pro) introduced in the recombinant cystatin D sequence as a result of the cloning procedure. In addition, the sequence obtained proved that the OmpA signal peptide had been properly processed and cleaved at the expected Ala-Ala bond during the secretion process. The slight difference between the molecular mass of cystatin D deduced from SDS-PAGE mobility (17 kDa) and that calculated from the amino acid sequence (about 14 kDa) is probably due to abnormal electrophoretical behavior of this low molecular mass protein, similar to what has been observed for cystatin C using the same buffer system (Abrahamson et al., 1988).
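The agreement between the Edman sequence and the cloning design can be checked with a short sketch. It translates the Cys-Nar primer and compares the result, and the predicted mature sequence quoted in this paper, with the reported N-terminal sequence. The reading frame (starting at the fourth nucleotide of the primer) is an assumption chosen because it reproduces the Ala-Pro extension described in the text; the codon table is a deliberately partial excerpt of the standard genetic code covering only the codons needed here, and all names are illustrative.

```python
# Partial standard genetic code: only the codons that occur in the primer.
CODON_TABLE = {"GCG": "A", "GCC": "A", "CCC": "P", "GGG": "G", "AGT": "S", "TCG": "S"}

# Three-letter to one-letter codes for the residues in the reported sequence.
THREE_TO_ONE = {
    "Ala": "A", "Pro": "P", "Gly": "G", "Ser": "S", "Gln": "Q", "Arg": "R",
    "Thr": "T", "Leu": "L", "Ile": "I", "His": "H", "Asp": "D", "Asn": "N",
    "Lys": "K", "Val": "V",
}

CYS_NAR = "GCGGCGCCCGGGAGTGCCTCG"  # primer sequence as quoted in the text
EDMAN = ("Ala-Pro-Gly-Ser-Ala-Ser-Ala-Gln-Ser-Arg-Thr-Leu-Ala-Gly-Gly-Ile-His-"
         "Ala-Thr-Asp-Leu-Asn-Asp-Lys-Ser-Val-Gln")  # reported N-terminal sequence
PREDICTED_MATURE = "GSASAQSRTLAGGIHATDLNDKSVQ"  # predicted mature N terminus


def translate(dna):
    """Translate complete codons of dna using the partial table above."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def to_one_letter(three_letter_seq):
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split("-"))


# Assumed frame: skip the first three nucleotides of the primer.
primer_peptide = translate(CYS_NAR[3:])
edman = to_one_letter(EDMAN)

if __name__ == "__main__":
    print("primer-encoded peptide:", primer_peptide)
    print("Edman starts with primer peptide:", edman.startswith(primer_peptide))
    print("Edman minus Ala-Pro equals prediction:", edman[2:] == PREDICTED_MATURE)
```

In this frame the primer encodes Ala-Pro followed by the first four residues of the predicted mature protein, and removing the two extra residues from the Edman sequence leaves the predicted mature N terminus.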

Inhibitory Activity of Recombinant Cystatin D-The cysteine proteinase inhibitory activity of recombinant cystatin D was investigated by attempts to determine the equilibrium constants for dissociation (Ki) for its interactions with the model cysteine proteinases papain and cathepsin B. The cystatin D interaction with papain was slow, and steady-state enzyme activity rates after addition of inhibitor could only be readily observed at inhibitor concentrations ≥20 nM. The inhibition obtained was compatible with a Ki value for papain inhibition of 1.2 nM (S.E. = 0.09, n = 4). Significant inhibition of human cathepsin B activity could not be demonstrated even with a cystatin D concentration of 100 nM. Since a Ki value of 1 µM should have resulted in 30% reduction of enzyme activity under the conditions of the experiment, the Ki value for cathepsin B inhibition was concluded to be >1 µM.
Demonstration of Cystatin D in Human Saliva-A rabbit antiserum raised against isolated recombinant cystatin D was used in an attempt to demonstrate the presence of cystatin D in human saliva. The specificity of the antiserum was first tested by classical immunoelectrophoresis with human serum, purified preparations of the human cystatins A, B, C, S, SA, and SN, as well as recombinant cystatin D as antigens. Only recombinant cystatin D produced a precipitate. When fresh saliva was used as antigen in the procedure, a single immunoprecipitation arc was produced, in the same electrophoretic position as that produced by recombinant cystatin D. The proteins of fresh human saliva were also separated by SDS-PAGE, transferred to a nitrocellulose filter, and tested for reactivity with the antiserum raised against recombinant cystatin D. The antiserum against cystatin D was used to construct an enzyme-amplified single radial immunodiffusion procedure to allow quantitation of cystatin D in various body fluids. Ten samples each of saliva, tears, seminal plasma, and blood plasma, as well as four samples each of milk and cerebrospinal fluid, all from healthy individuals, were investigated. The cystatin D concentration in seminal plasma, blood plasma, milk, and cerebrospinal fluid was below the sensitivity limit of the procedure (0.1 mg/liter). The saliva samples had the highest cystatin D concentration (mean: 3.8 mg/liter, i.e. approximately 275 nM; range: 1.6-5.1 mg/liter). Tears displayed low, but detectable, cystatin D levels (mean: 0.5 mg/liter; range: less than 0.1 to 1.5 mg/liter).
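The conversion from 3.8 mg/liter to approximately 275 nM can be reproduced with a short sketch. The molar mass of 13,800 g/mol is an assumption: the text only states "about 14 kDa", and this value is simply the one that reproduces the quoted molar concentration. The function name is illustrative.

```python
# Assumed molar mass of mature cystatin D in g/mol (the text says "about 14 kDa").
ASSUMED_MOLAR_MASS = 13_800.0


def mg_per_liter_to_nm(conc_mg_per_l, molar_mass=ASSUMED_MOLAR_MASS):
    """Convert a mass concentration (mg/liter) to a molar concentration (nM)."""
    # mg/liter divided by g/mol gives mmol/liter; 1 mmol/liter = 1e6 nM.
    return conc_mg_per_l / molar_mass * 1e6


if __name__ == "__main__":
    for fluid, mean in (("saliva", 3.8), ("tears", 0.5)):
        print(f"{fluid}: {mean} mg/liter = {mg_per_liter_to_nm(mean):.0f} nM")
```

With the same assumed molar mass, the mean tear concentration of 0.5 mg/liter corresponds to a few tens of nanomolar, roughly an order of magnitude below the salivary level.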
Purification and Partial Characterization of Cystatin D from Saliva-In an attempt to isolate human cystatin D from saliva, the IgG fraction of the antiserum raised against recombinant cystatin D was coupled to a CNBr-activated Sepharose column. Fresh saliva containing a mixture of proteinase inhibitors was applied to the column. After washing of the column with proteinase inhibitor-containing neutral buffer, the bound material was eluted with glycine buffer, pH 2.2, containing the same proteinase inhibitors.
Immunoelectrophoresis was used to monitor the column effluent and revealed that only the acid effluent contained cystatin D immunoreactivity. Agarose gel electrophoresis of the neutralized and concentrated acid effluent displayed two protein bands (A and B in Fig. 7). Band B had the same electrophoretic mobility as purified recombinant cystatin D and band A a slightly more anodal mobility. The proteins of a second aliquot of the concentrated effluent were separated by agarose gel electrophoresis, blotted onto polyvinylidene difluoride membranes, and subjected to automated Edman degradation. Two major sequences were obtained for the constituents of both band A and band B. All four agreed with segments of the N-terminal portion of the protein specified by the cystatin D cDNA (Fig. 8), starting at residues 5, 7, 9, and 11 of the predicted N-terminal portion of the mature protein (Freije et al., 1991).
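The four ragged N termini can be visualized with a minimal sketch that slices the predicted mature N-terminal sequence quoted in this paper at the reported start residues. The 1-based numbering follows the text; the window length of seven residues is an arbitrary choice for display, and the helper name is illustrative.

```python
# Predicted N-terminal portion of mature cystatin D, as quoted in the paper.
PREDICTED_MATURE = "GSASAQSRTLAGGIHATDLNDKSVQ"

# Start residues (1-based) reported for the salivary forms.
REPORTED_STARTS = (5, 7, 9, 11)


def n_terminal_window(sequence, start_residue, length=7):
    """Return `length` residues beginning at the 1-based position start_residue."""
    begin = start_residue - 1  # convert to 0-based indexing
    return sequence[begin:begin + length]


if __name__ == "__main__":
    for start in REPORTED_STARTS:
        removed = start - 1
        window = n_terminal_window(PREDICTED_MATURE, start)
        print(f"start at residue {start:>2} ({removed} residues removed): {window}...")
```

The longest form, starting at residue 5, lacks the first four residues of the predicted mature protein, which is the 4-residue shortfall considered in the Discussion.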

DISCUSSION
The present work was undertaken after the cloning and characterization of a newly detected human gene (CST5) seemingly coding for an additional member of the cystatin family of proteinase inhibitors, tentatively called cystatin D (Freije et al., 1991). Since no protein with the structural characteristics corresponding to this gene had been described, we tried to demonstrate that the protein encoded by the CST5 gene is a biologically active cysteine proteinase inhibitor. Our strategy was to isolate a cystatin D cDNA and then express it in a bacterial host. Since previous studies indicated that the CST5 gene is expressed in the parotid gland (Freije et al., 1991), the required cDNA was obtained by reverse transcription and PCR amplification of RNA from a sample of this tissue. Nucleotide sequence analysis of several cDNA clones confirmed the overall structural data derived from the sequence of the CST5 gene, including the location of all intron-exon junctions which had been proposed for the gene. Interestingly, comparison of the nucleotide sequence of the cystatin D cDNA with the previously determined genomic sequence revealed a single nucleotide difference resulting in a Cys/Arg variation in the amino acid sequence of the putative mature protein. The finding of both variants of the CST5 gene in the general population (Balbin et al., 1993) clearly indicates that this variation represents a common polymorphism and not cloning artifacts in the cDNA or genomic clones.

The cystatin D gene was originally called CST4 when it was cloned and sequenced by Freije et al. (1991), but it has been renamed CST5 since the cystatin S gene was also called CST4 when it was simultaneously cloned (Saitoh et al., 1991). The designation CST5 for the human cystatin D gene will be officially used by the Human Gene Mapping Nomenclature Committee (P. McAlpine, personal communication).
After characterization, the cystatin D cDNA was placed under the control of a heat-inducible promoter and expressed in E. coli using a system very similar to the one previously used for the production of cystatin C with full biological activity (Abrahamson et al., 1988; Dalbøge et al., 1989b). The recombinant cystatin D was secreted to the periplasmic space, indicating that the product had been properly directed to this bacterial compartment by the OmpA signal peptide fused to the putative N-terminal sequence of the mature protein. The recombinant protein was purified by a simple two-step chromatographic procedure which resulted in the isolation of large amounts of protein suitable for structural and functional analysis. The N-terminal sequence of the purified protein confirmed that the signal peptide had been efficiently processed at the expected position and that the protein product had the amino acid sequence predicted from the nucleotide sequence, including the 2 additional residues introduced in the N-terminal end during the cloning process. Functional analysis performed with the recombinant protein demonstrated that cystatin D is a functional cysteine proteinase inhibitor, although the equilibrium constant for dissociation of its complex with papain (Ki = 1.2 nM) was much higher than that of the cystatin C-papain complex (Kd = 11 fM; Lindahl et al., 1992). It is unlikely that this difference is due to the presence of 2 extra N-terminal residues (Ala-Pro) in recombinant cystatin D, since N-terminally extended Ala-Met-Glu-Ala-Glu-cystatin C displays virtually the same inhibitory properties as native cystatin C of human origin (Abrahamson et al., 1989). It is also noteworthy that recombinant cystatin D did not show any significant inhibitory activity against cathepsin B, suggesting that it may have a much more restricted inhibitory spectrum than other cystatins characterized to date.

FIG. 8. Identified sequences for human salivary cystatin D. The sequences determined by automated Edman degradation of the two protein bands shown in Fig. 7 are given in the standard one-letter code. Predicted mature cystatin D sequence: GSASAQSRTLAGGIHATDLNDKSVQ ...
The availability of isolated recombinant cystatin D allowed the production of an antiserum suitable for testing the presence in human fluids of proteins related or identical to the recombinant protein. As anticipated from the expression of the CST5 gene in parotid gland (Freije et al., 1991), immunoblotting of saliva proteins separated by SDS-PAGE revealed the presence in human saliva of a protein with the expected size and immunoreactivity. It was possible to isolate the protein from saliva by the use of immunosorption but the isolated protein displayed a ragged N terminus, although maximal precautions were taken to prevent proteolytic degradation during the saliva collection and cystatin D isolation.
A similar situation has been described for the isolation of cystatins S, SN, and SA from saliva (Hawke et al., 1987; Saitoh et al., 1988; Al-Hashimi et al., 1988; Isemura et al., 1991). The fact that the longest variant of cystatin D we could demonstrate in saliva is 4 residues shorter than the sequence predicted for the mature protein by using von Heijne's algorithm (von Heijne, 1985; Freije et al., 1991) is probably due to proteolytic degradation, although it cannot be excluded that this cystatin D variant displays the authentic N-terminal sequence of the native protein. However, all sequence data obtained agree with the sequence for cystatin D deduced from the CST5 gene structure and therefore provide conclusive evidence that the cystatin D gene is normally expressed in at least some human tissues.
The quantitative analysis of the cystatin D concentration in several biological fluids showed a unique distribution for this cystatin, since it could only be demonstrated in saliva and tears but not in milk, seminal plasma, blood plasma, or cerebrospinal fluid. The mean saliva cystatin D concentration, approximately 275 nM, is slightly higher than the mean saliva level of cystatin C (Abrahamson et al., 1986). This concentration is sufficiently high to cause essentially complete inhibition of papain in saliva, since our data show that the equilibrium constant for dissociation of cystatin D-papain complexes is more than 100 times lower (1.2 nM), which means that the enzyme-inhibitor equilibrium will be shifted almost totally towards the complexed state. This theoretical consideration indicates that cystatin D might have a physiological role in saliva as an inhibitor of cysteine proteinases with papain-like properties.
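The equilibrium argument can be illustrated with a minimal sketch. It assumes simple reversible 1:1 binding, inhibitor in large excess over enzyme (so that the free inhibitor concentration equals the total), and no competing substrate; these simplifications are assumptions made here for illustration and are not stated in the text. The function name is illustrative.

```python
def fraction_complexed(inhibitor_nm, ki_nm):
    """Fraction of enzyme bound to inhibitor at equilibrium, E + I <-> EI.

    Assumes [I]free ~ [I]total (inhibitor in large excess) and no substrate.
    """
    return inhibitor_nm / (inhibitor_nm + ki_nm)


SALIVA_CYSTATIN_D_NM = 275.0  # mean salivary concentration from the text
KI_PAPAIN_NM = 1.2            # Ki for papain from the text

if __name__ == "__main__":
    ratio = SALIVA_CYSTATIN_D_NM / KI_PAPAIN_NM
    bound = fraction_complexed(SALIVA_CYSTATIN_D_NM, KI_PAPAIN_NM)
    print(f"[I]/Ki = {ratio:.0f}")
    print(f"fraction of papain complexed = {bound:.3f}")
```

Under these assumptions the inhibitor concentration exceeds Ki by more than two orders of magnitude, and well over 99% of a papain-like enzyme would be in the complexed state.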

In this fluid, the inhibitor could play a protective role against the potentially harmful effects of proteinases of bacterial, fungal, viral, or cellular origin present in the oral cavity. Therefore, cystatin D may be considered as an additional component of the nonimmune protective system in this cavity, with an interesting parallelism to histatins, a family of histidine-rich proteins found in human parotid secretion and with fungistatic and antibacterial properties (Oppenheim et ... Sabatini et al., 1989). By contrast, the other known family II cystatins, including the so-called salivary cystatins S, SN, and SA, have a wider distribution in the body and may play different physiological roles in those biological fluids and secretions in which they are present. The ability to obtain large amounts of functional recombinant cystatin D from engineered E. coli, as described here, will be helpful to further clarify the physiological role of this protein and its relationship to other cystatins, with special reference to the occurrence of distinctive specificities against the different cysteine proteinases. Finally, the availability of cDNA and genomic clones for cystatin D will facilitate comparative studies to elucidate the mechanisms involved in the differential expression of the genes encoding all members of this family of proteinase inhibitors.