Molecular cloning of chick lysyl hydroxylase. Little homology in primary structure to the two types of subunit of prolyl 4-hydroxylase.

Lysyl hydroxylase (EC 1.14.11.4), an alpha 2 dimer, catalyzes the formation of hydroxylysine in collagens by the hydroxylation of lysine residues in X-Lys-Gly sequences. We report here on the isolation of cDNA clones coding for the enzyme from a chick embryo lambda gt11 library. Several overlapping clones covering all the coding sequences of the 4-kilobase mRNA and virtually all the noncoding sequences were characterized. These clones encode a polypeptide of 710 amino acid residues and a signal peptide of 20 amino acids. The polypeptide has four potential attachment sites for asparagine-linked oligosaccharides and 9 cysteine residues, at least one of which is likely to be involved in the binding of the Fe2+ atom to a catalytic site. A surprising finding was that no significant homology was found between the primary structures of lysyl hydroxylase and prolyl 4-hydroxylase in spite of the marked similarities in kinetic properties between these two enzymes. A computer-assisted comparison indicated only an 18% identity between lysyl hydroxylase and the alpha-subunit of prolyl 4-hydroxylase and a 19% identity between lysyl hydroxylase and the beta-subunit of prolyl 4-hydroxylase. Visual inspection of the most homologous areas nevertheless indicated the presence of several regions of 20-40 amino acids in which the identity between lysyl hydroxylase and one of the prolyl 4-hydroxylase subunits exceeded 30% or similarity exceeded 40%. Southern blot analyses of chick genomic DNA indicated the presence of only one gene coding for lysyl hydroxylase.

* This work was supported by grants from the Research Councils for Medicine and the Natural Sciences with the Academy of Finland. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) 505746.

Lysyl hydroxylase (procollagen-lysine,2-oxoglutarate 5-dioxygenase, EC 1.14.11.4) catalyzes the formation of hydroxylysine in collagens and other proteins with collagen-like amino acid sequences by the hydroxylation of lysine residues in X-Lys-Gly sequences (for reviews, see Refs. 1 and 2). The hydroxylysine residues formed in the reaction have two important functions. Their hydroxy groups serve as attachment sites for carbohydrate units in the form of either the monosaccharide galactose or the disaccharide glucosylgalactose, and they are essential for the stability of intermolecular collagen cross-links. The importance of hydroxylysine is clearly demonstrated by the profound changes in the mechanical properties of certain tissues that are seen in patients with the type VI variant of the Ehlers-Danlos syndrome, a heritable connective tissue disorder with a deficiency in lysyl hydroxylase activity (see Refs. 3 and 4).

Lysyl hydroxylase has been purified to homogeneity from chick embryos (5) and human placental tissues (6) and shown to be a dimer (α2) consisting of only one type of subunit with a molecular weight of about 85,000 (5-7). This enzyme is very similar to prolyl 4-hydroxylase (EC 1.14.11.2) in its catalytic properties. Both enzymes act on non-hydroxylated collagens and collagen-like polypeptides, and both enzymes require Fe2+, 2-oxoglutarate, O2, and ascorbate (1,2,8). The kinetic constants of the enzymes for their cosubstrates and competitive inhibitors are likewise very similar, and the enzymes appear to have identical reaction mechanisms (1,2,8). Nevertheless, prolyl 4-hydroxylase differs from lysyl hydroxylase, being a tetramer (α2β2) that consists of two different types of monomer with molecular weights of about 64,000 (α-subunit) and 60,000 (β-subunit) (8). Complete cDNA-derived amino acid sequences have recently been determined for both the β-subunit (9-11) and the α-subunit (12,13) of human and chick prolyl 4-hydroxylase, whereas no amino acid sequence data have been available for lysyl hydroxylase from any source.
In order to obtain information about the structure, biosynthesis, and regulation of lysyl hydroxylase and create probes for studying lysyl hydroxylase deficiency in the type VI variant of the Ehlers-Danlos syndrome, we isolated cDNA clones for chick lysyl hydroxylase and determined a complete cDNA-derived amino acid sequence for it. Common amino acid sequences could be expected within the family of collagen hydroxylases due to the marked similarities in their kinetic properties, but no definitive sequence homology was found between the primary structures of lysyl hydroxylase and the two types of subunit of prolyl 4-hydroxylase.

Purification of Lysyl Hydroxylase and Preparation of Peptides of the Enzyme-Lysyl hydroxylase was isolated from 15-day-old whole chick embryos. The purification steps consisted of affinity chromatography on concanavalin A-Sepharose, affinity chromatography on collagen linked to agarose, and chromatography on hydroxyapatite columns (5-7).

The purified lysyl hydroxylase was fractionated by SDS-PAGE,¹ stained with Coomassie Brilliant Blue, cut from the gel, and then eluted electrophoretically from the gel into a dialysis tube in 0.1 M NH4HCO3 buffer, pH 8.5. After lyophilization, the lysyl hydroxylase (15-50 μg) was dissolved in 1 ml of 0.1 M NH4HCO3 and digested with 2.4 units of trypsin (Worthington, 240 units/mg) or with 0.5 units of α-chymotrypsin (Sigma, 46 units/mg) at 37 °C for 18 h. The peptides were purified by a Waters 650 (Millipore) advanced protein purification system using a reverse phase column (PepRPC™ HR 5/5, Pharmacia LKB Biotechnology Inc.). Elution (0.7 ml/min) was achieved using a linear gradient of acetonitrile (0-60% in 30 min) in 0.1% trifluoroacetic acid, and peptides were detected at 215 nm. Peptides obtained from lysyl hydroxylase by CNBr digestion (14) were isolated using a linear gradient of acetonitrile as above and a ProRPC™ HR 5/2 (Pharmacia) reverse phase column for purification.

Amino Acid Sequence Determination-Amino-terminal sequences of lysyl hydroxylase and peptides derived from the intact protein were determined by automated Edman degradation with an Applied Biosystems model 477A on-line 120A liquid-pulsed sequenator using narrow bore reverse phase analysis. For determination of the NH2-terminal sequences of various peptides obtained by the digestion of lysyl hydroxylase with trypsin, chymotrypsin, or CNBr, the purified peptides were applied directly to a Polybrene/sodium chloride-treated glass fiber filter and degraded in the sequenator (15). The NH2-terminal sequence of lysyl hydroxylase was determined by fractionation of 30 μg of the enzyme protein on a 10% SDS-PAGE and electrophoretic transfer onto polyvinylidene difluoride membranes using 10 mM CAPS, 10% methanol (pH 9.0) as transfer buffer. The transfer was carried out for 1 h at 100 V using a Bio-Rad Transblot apparatus, and afterward proteins were visualized by staining with heparin/toluidine (16). The band corresponding to lysyl hydroxylase was cut out, destained with 8% acetic acid in 50% methanol, and washed with water before sequencing.

¹ The abbreviations used are: SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; CAPS, 3-(cyclohexylamino)propanesulfonic acid; nt, nucleotide; kb, kilobase pair.
Immunological Studies-Polyclonal antibodies were prepared against a synthetic peptide of 15 amino acids derived from the nucleotide sequence of a lysyl hydroxylase cDNA clone (Fig. 3). The peptide was chemically synthesized by the solid phase procedure of Merrifield (17) by means of an automatic Applied Biosystems model 430A peptide synthesizer at the Department of Biochemistry, University of Oulu. The synthesized peptide was cleaved from the resin by treatment with trifluoromethanesulfonic acid and purified further by reverse phase chromatography using a linear acetonitrile gradient (0-60%) as the eluant. Approximately 2.5 mg of the synthetic peptide was conjugated to 12 mg of hemocyanin with glutaraldehyde (18), and the peptide solution was divided into seven aliquots. For primary immunization, one of the aliquots was emulsified in Freund's complete adjuvant and injected intradermally into a rabbit. The others were emulsified in Freund's incomplete adjuvant and injected into the same rabbit after 1 week and at weekly intervals thereafter.
The purified or partially purified lysyl hydroxylase was electrophoresed on a 10% SDS-PAGE and blotted onto polyvinylidene difluoride membrane as described elsewhere (19). The blot was cut into strips, which were incubated overnight with antibodies, and antibody binding was detected as described for the library screening (below).
Interaction of the antipeptide antibodies with lysyl hydroxylase was also tested by immunoprecipitation. A suspension of 100 μl of protein A-Sepharose (1:1 (v/v) suspension of protein A-Sepharose gel in 0.2 M NaCl, 0.1 M glycine, 20 mM Tris-HCl, pH 7.5 (4 °C)) was incubated with immune or nonimmune serum at 4 °C for 4 h with gentle agitation. The gel was separated by centrifugation at 5000 × g for 10 min, washed with the above buffer, and then incubated with partially purified lysyl hydroxylase at 4 °C overnight. The gel was separated by centrifugation and the supernatant used for the assay of unbound lysyl hydroxylase activity.
Lysyl hydroxylase activity was assayed by measuring the formation of radioactive hydroxylysine in [3H]lysine-labeled non-hydroxylated procollagen substrate (20). To test the effect of the antipeptide antiserum on lysyl hydroxylase activity, varying amounts of immune and nonimmune serum were added to the enzyme reaction.
Isolation of cDNA Clones-A chick embryo λgt11 cDNA expression library (Clontech) was screened with polyclonal antibodies against purified chick embryo lysyl hydroxylase (21). For this purpose IgG was purified from rabbit antiserum by affinity chromatography on a protein A-Sepharose column (Pharmacia) as suggested by the manufacturer. Antibodies binding to Escherichia coli proteins were removed by immunoadsorption as described elsewhere (22). The library was plated on a lawn of E. coli Y1090 at a density of 1 × 10' phages/150-mm plate, and expression of the λgt11 fusion proteins was induced by nitrocellulose filters (Schleicher & Schuell) soaked in 10 mM isopropyl β-D-thiogalactoside (23). The filters were washed once in 10 mM Tris-HCl, pH 7.4, 0.9% (w/v) NaCl, 3% (w/v) bovine serum albumin and then incubated in the same solution with 4.1 μg/ml purified IgG overnight at 4 °C. The recombinants reacting with the antibody were detected using horseradish peroxidase-conjugated goat anti-rabbit IgG (Bio-Rad) as a second antibody and 4-chloro-1-naphthol as the peroxidase substrate. Purified chicken lysyl hydroxylase at concentrations of 1, 5, and 10 ng was spotted onto nitrocellulose filters as a positive control.
In order to exclude false immunopositive phages, the clones isolated by immunological screening were hybridized with oligonucleotide probes. For this purpose four oligonucleotides were designed from the amino acid sequences of fragments of chick lysyl hydroxylase according to the criteria suggested by Lathe (24). The oligonucleotides were synthesized with an Applied Biosystems 380B oligonucleotide synthesizer at the Department of Biochemistry, University of Oulu, and labeled with [γ-32P]ATP using T4 polynucleotide kinase. The λ-DNAs were isolated from the immunopositive phage recombinants, electrophoresed on a 0.8% agarose gel, and transferred to a nitrocellulose filter. The filter-bound λ-DNAs were hybridized with 10 ng/ml of each 32P-labeled oligonucleotide in 35% (v/v) formamide, 6 × SSC (1 × SSC = 0.15 M NaCl, 0.015 M sodium citrate, pH 6.8), 1% (w/v) bovine serum albumin, 1% (w/v) Ficoll, 1% (w/v) polyvinylpyrrolidone, 0.25 mg of denatured salmon sperm DNA/ml, and 0.1% (w/v) SDS at 37 °C for 20 h. The filters were washed twice in 6 × SSC and 0.5% (w/v) SDS at 55 °C for 15 min. Autoradiographs were prepared using Kodak XAR-5 film. Insert DNA was isolated from the immunopositive clone hybridizing with oligonucleotides and from all the other subsequently identified positive clones and ligated into the EcoRI site of the plasmids pBR322 or pUC19. To obtain additional cDNA clones, the chick embryo λgt11 library was screened with the 32P-labeled nick-translated insert LHC-24 under stringent conditions.

DNA Sequencing and Sequence Analysis-Nucleotide sequences were obtained from the plasmid clones by the dideoxynucleotide chain termination method (25) and by using T7 DNA polymerase (26). The sequencing primers were either vector-specific primers or specific 17-mer oligonucleotides. The sequence was determined for both strands, and most nucleotides were sequenced several times in different, overlapping clones. Sequences fully covering the internal EcoRI site of cDNA were obtained by sequencing this area from the DNA of phage recombinants. GC-rich regions of plasmid recombinants were also sequenced with dITP and 7-deaza-dGTP in place of dGTP in order to eliminate occasional compression artifacts encountered in the sequence.
Sequence data were analyzed by the IBI DNA and protein sequence analysis system. Homology comparisons with the National Biomedical Research Foundation Protein Data Bank and GenBank sequences were performed using Microgenie sequence software (Beckman) and comparisons with the PIR/FASTA and SWISSPROT (release March 1990) sequences by means of the GCG sequence analysis software package (27). Detailed dot matrix comparisons between lysyl hydroxylase and prolyl 4-hydroxylase protein sequences were carried out using the IBI program, and homologous regions were further compared visually. In the visual comparison of protein sequences, similar amino acids were grouped as follows: Pro, Gly; Ser, Thr; Lys, Arg; Glu, Gln, Asn, Asp; Phe, Trp, His, Tyr; Ala, Ile, Val, Leu, Met, Cys (28).
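The visual comparison described above can be illustrated with a short script. This is only a sketch, not the IBI or GCG procedure: it scores two pre-aligned sequences using the similarity groups listed in the text, and it assumes that "similarity" counts identical residues as well as residues from the same group, and that gap columns stay in the denominator. The function names and the toy sequences are hypothetical.

```python
# Similarity groups as listed in the text (after Argos, Ref. 28), one-letter codes.
SIMILARITY_GROUPS = [
    {"P", "G"},
    {"S", "T"},
    {"K", "R"},
    {"E", "Q", "N", "D"},
    {"F", "W", "H", "Y"},
    {"A", "I", "V", "L", "M", "C"},
]


def same_group(a, b):
    """True if both residues belong to one of the similarity groups."""
    return any(a in group and b in group for group in SIMILARITY_GROUPS)


def score_alignment(seq_a, seq_b, gap="-"):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Assumptions: the sequences are already aligned (equal length), gap columns
    count toward the length but never match, and similarity includes identity.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if gap in (a, b):
            continue
        if a == b:
            identical += 1
            similar += 1
        elif same_group(a, b):
            similar += 1
    n = len(seq_a)
    return 100 * identical / n, 100 * similar / n


# Toy aligned segments (not taken from the paper).
identity, similarity = score_alignment("PSKEFA-LGT", "GTKQWACLGS")
print(identity, similarity)
```

Under a different convention, for example excluding gap columns from the denominator, the percentages would come out higher, so the convention should be stated alongside any reported value.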
Northern and Southern Blot Analysis-Poly(A)+ RNA was isolated from locally established cultured chick embryo fibroblasts using an oligo(dT)-cellulose affinity column (29). Poly(A)+ RNA was fractionated electrophoretically on a 0.7% agarose gel containing 2 M formaldehyde, transferred to a nitrocellulose filter, and hybridized (26).
High molecular weight genomic DNA was isolated (26) from cultured chick embryo fibroblasts and digested completely with either BamHI, EcoRI, or HindIII. The digested DNAs were fractionated electrophoretically on a 0.8% agarose gel, and the DNA was transferred to a nitrocellulose filter and hybridized with 32P-labeled nick-translated cDNA probes.

Partial Amino Acid Sequences of Chick Lysyl Hydroxylase-Amino acid sequences were determined for the amino-terminal end of chick lysyl hydroxylase and six peptide fragments of the enzyme obtained by digestion with either CNBr, trypsin, or chymotrypsin (Table I). These sequences ranged from 5 to 19 amino acids, and they all could subsequently be identified in the cDNA-derived amino acid sequence of lysyl hydroxylase (Table I and below). Four of these peptides were used to design synthetic oligonucleotide probes on the basis of a knowledge of codon usage in the deduction of probe sequences from amino acid sequence data (24,30). A 57-nt probe, AAGCAGTCCAAGCAGCTGCTGGTGCTGCTGGTGCTGACAGTGGCCACCAAGCAGTTC, was based on the sequence of the CNBr peptide (Table I); a 33-nt probe, CTGGTGGAGATGCCCACCCCCGATGTGTACTGG, on that of the tryptic peptide Try 1; a 21-nt probe, TGGTTCCCCATCTTCACAGAC, on that of the chymotryptic peptide Chy 1; and a 21-nt probe, GACTTCCAGCATGAGAAGCTG, on that of the chymotryptic peptide Chy 2.
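As a consistency check on the four probe sequences quoted above, written here without line breaks, the following sketch confirms their stated lengths and back-translates them with the standard genetic code. It assumes the probes are given in the sense orientation and in frame from their first nucleotide; Table I is not reproduced here, so the resulting peptides are not compared against it. The dictionary keys and function name are illustrative.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Probe sequences as quoted in the text.
PROBES = {
    "CNBr": "AAGCAGTCCAAGCAGCTGCTGGTGCTGCTGGTGCTGACAGTGGCCACCAAGCAGTTC",
    "Try 1": "CTGGTGGAGATGCCCACCCCCGATGTGTACTGG",
    "Chy 1": "TGGTTCCCCATCTTCACAGAC",
    "Chy 2": "GACTTCCAGCATGAGAAGCTG",
}


def translate(dna):
    """Translate an in-frame sense-strand DNA sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


for name, seq in PROBES.items():
    print(f"{name}: {len(seq)} nt -> {translate(seq)}")
```

The lengths come out as 57, 33, 21, and 21 nt, matching the text, and none of the four reading frames contains a stop codon.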
Isolation and Sequencing of cDNA Clones-A chick embryo λgt11 cDNA expression library was screened with purified antibodies against chick embryo lysyl hydroxylase, and 18 positive plaques were identified among 2 × 10⁵ recombinants. To establish the identity of the positive recombinants, all the positive clones were screened with the four oligonucleotides prepared according to the above peptide sequences. One recombinant, LHC-24, gave a positive signal when hybridized with the 33- and 21-mer oligonucleotides derived from Try 1 and Chy 1 (Table I). This clone was isolated and sequenced (Fig. 1). It covers 2152 nt and encodes the amino acid sequences of peptides Try 1 and Chy 1 (Table I). [...] 2900 nt and the shortest an insert of 900 nt (Fig. 1). The cDNA-derived amino acid sequence obtained from these clones accommodates the amino-terminal sequence and all six peptide sequences obtained by amino acid sequencing (Table I).
Immunological Studies-To establish further the identity of the protein coded by the mRNA complementary to the clones isolated here, an antiserum was prepared in a rabbit against a synthetic peptide derived from the cDNA sequences. This synthetic peptide was 15 amino acid residues in length and corresponded to residues 579-593 in the hydrophilic area of the COOH-terminal domain (Fig. 3). The antibodies were found in immunoblotting to stain a protein band corresponding to chick lysyl hydroxylase and an additional band corresponding to a polypeptide with a molecular weight of about 61,000, which probably represents a degradation product of the enzyme (Fig. 2A). In agreement with previous data (7) the band corresponding to lysyl hydroxylase was always broad, which is due to the heterogeneity of glycosylation of the enzyme. A faint staining was also obtained in immunoblotting experiments with human lysyl hydroxylase (not shown).
The antibodies to the synthetic peptide inhibited the activity of chick lysyl hydroxylase when added to the enzyme activity incubation mixture, but complete inhibition was not achieved (Fig. 2B). The antibodies also precipitated the enzyme activity when coupled to protein A-Sepharose (Fig. 2C). The immunological results thus indicate that the amino acid sequence obtained from the cDNA clones does indeed represent the sequence of lysyl hydroxylase and not the sequence of a contaminating protein that may have been copurified with the enzyme.
Nucleotide and Derived Amino Acid Sequences of the cDNAs-The cDNA clones encode a 730-amino acid polypeptide (Fig. 3). A 20-amino acid residue hydrophobic sequence (presumably the signal peptide) beginning with methionine precedes the NH2-terminal end of lysyl hydroxylase as obtained by protein sequencing (Fig. 3 and Table I). The molecular weight of the polypeptide, excluding the signal peptide, is 77,880. The cDNA-derived amino acid sequence contains four sequences, -Asn-Ile-Ser-, -Asn-Lys-Ser-, -Asn-Tyr-Thr-, and -Asn-Cys-Ser- (Fig. 3), which may serve as attachment sites for asparagine-linked oligosaccharides. The hydrophilicity/hydrophobicity plot indicates that the polypeptide is mainly hydrophilic but also contains several hydrophobic regions (Fig. 4) [...] the luminal or transmembrane proteins of the endoplasmic reticulum (see "Discussion"). The cDNA clones also cover 117 nt of the 5'-untranslated sequences and 1627 nt of the 3'-untranslated sequences (Fig. 3). The 3'-untranslated sequence contains a slightly atypical polyadenylation signal, ATTAAA, 1561 nt downstream of the translation stop codon TAG. This polyadenylation signal, which is used in 12% of the genes (31), is accompanied 17 nt downstream by a poly(A) tail of 45 nt. Polymorphic sites were found in positions 1350 and 2554, where G found in some clones is replaced by A and T, respectively, in others.
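The four potential glycosylation sites listed above all fit the general Asn-X-Ser/Thr pattern. A minimal sketch of how such sites can be located in a protein sequence follows. The exclusion of proline at the X position is a commonly used convention and is an assumption here, not something stated in the text; the function name and the toy sequence are hypothetical, since the full sequence of Fig. 3 is not reproduced.

```python
import re

# Asn-X-Ser/Thr with X != Pro (assumed convention); the lookahead lets
# overlapping candidate sites be reported independently.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein):
    """Return (1-based position of Asn, tripeptide) for each potential site."""
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(protein)]


# Toy sequence containing the four tripeptides named in the text plus one
# Asn-Pro-Thr, which the assumed rule rejects.
toy = "MANISGGNKSPLNYTAANCSKNPT"
print(find_sequons(toy))
```

A match only marks a potential attachment site; whether a given asparagine is actually glycosylated has to be established experimentally.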
No homologies were found when the nucleotide and predicted amino acid sequences were compared with the GenBank nucleotide and National Biomedical Research Foundation protein data bank sequences, except for short, probably nonsignificant homologies with some vertebrate and viral proteins (not shown). Surprisingly, no significant overall homologies were found with the amino acid sequences of the two types of subunit of chick prolyl 4-hydroxylase. A computer-assisted [...] important α-subunit of prolyl 4-hydroxylase (8,32) are shown in Fig. 5. It should be stressed, however, that no data are currently available to indicate whether any or all of these homologies represent mere chance rather than true evolutionarily or functionally significant homologies.

Fig. 5 (legend, fragment). Homologous amino acids are grouped as suggested by Argos (28). The numbers indicate the first amino acid residue of each region compared between chick lysyl hydroxylase (Fig. 3) and the α-subunit of chick and human prolyl 4-hydroxylase (12,13). The human prolyl 4-hydroxylase α-subunit sequences are shown only where they differ from the chick α-subunit sequences. Dashes are introduced into the sequences to obtain maximal alignment between the regions compared.
Northern and Southern Blot Analysis-The mRNA hybridizing with the clones coding for chick lysyl hydroxylase is about 4000 nt long (Fig. 6A), and the cDNA clones thus cover almost the whole mRNA (Fig. 3). The hybridization signal of lysyl hydroxylase mRNA could easily be detected in cultured embryonic chick tendon fibroblasts actively synthesizing type I collagen. This suggests that the mRNA is present in significant quantities in such a situation.
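The statement that the cDNA clones cover almost the whole mRNA can be checked with simple arithmetic on the lengths reported in the sequence description (117 nt of 5'-untranslated sequence, a 730-residue open reading frame, and 1627 nt of 3'-untranslated sequence). The sketch below is illustrative only; whether the stop codon is counted within the 1627 nt is not stated, so both cases are computed, and the constant and function names are hypothetical.

```python
FIVE_PRIME_UTR_NT = 117      # from the text
THREE_PRIME_UTR_NT = 1627    # from the text
CODING_RESIDUES = 730        # 710 residues plus the 20-residue signal peptide
STOP_CODON_NT = 3
MRNA_ESTIMATE_NT = 4000      # approximate size from the Northern blot


def cdna_length(stop_counted_in_utr):
    """Total cDNA length in nt under either convention for the stop codon."""
    total = FIVE_PRIME_UTR_NT + CODING_RESIDUES * 3 + THREE_PRIME_UTR_NT
    if not stop_counted_in_utr:
        total += STOP_CODON_NT
    return total


for flag in (True, False):
    length = cdna_length(flag)
    print(flag, length, round(length / MRNA_ESTIMATE_NT, 3))
```

Either way the total is a little over 3900 nt, roughly 98% of the approximately 4000-nt estimate, which is itself only as precise as the gel sizing.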
In order to obtain information on the gene(s) coding for chick lysyl hydroxylase, genomic DNA from cultured embryonic chick tendon fibroblasts was digested with either BamHI, EcoRI, or HindIII, and the digestion products analyzed by Southern blot hybridization. A 512-nt EcoRI-BamHI fragment, a 385-nt BamHI-EcoRI fragment of clone LHC-8 (Fig. 1), and a 613-nt AvaII-AvaII fragment of clone LHC-24, covering nucleotides 178-690, 691-1076, and 2937-3550, respectively, were used as probes. Single fragments of 4.6, 4.8, and 7.2 kb were detected in the BamHI digest of genomic DNA with the 512-, 385-, and 613-nt probes, respectively (Fig. 6, B-D). Single fragments of 7 and 3.5 kb were seen in the HindIII digest of genomic DNA with the 385- and 613-nt probes, respectively, whereas four bands of 6.5, 2.5, 0.6, and 0.3 kb were obtained with the 512-nt probe (not shown). These patterns suggest that a single gene codes for chick lysyl hydroxylase. In additional experiments, genomic chick DNA was digested as above and analyzed by Southern blot hybridization with cDNA clones LHC-35 and LHC-24, covering the whole 4-kb cDNA, as a probe. Four to five bands were detected with each enzyme, their sums being 31, 30, and 26 kb in the BamHI, EcoRI, and HindIII digests, respectively. Thus the total size of the chick lysyl hydroxylase gene is about 30 kb, provided that no long intron regions remained undetected in the hybridizations.
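The probe sizes and the gene-size estimate in this paragraph follow from straightforward arithmetic, sketched below. The probe lengths are taken as the difference between the quoted end and start coordinates, which is the convention that reproduces the stated sizes. Averaging the three digest sums is an assumption made for illustration; the text does not say how "about 30 kb" was derived.

```python
# Probe coordinates (nucleotide positions in the cDNA) as quoted in the text.
PROBE_COORDS = {
    "512-nt probe": (178, 690),
    "385-nt probe": (691, 1076),
    "613-nt probe": (2937, 3550),
}

# Sums of hybridizing genomic fragments per digest, in kb, from the text.
DIGEST_SUMS_KB = {"BamHI": 31, "EcoRI": 30, "HindIII": 26}


def probe_length(start, end):
    """Length as end minus start, matching the sizes quoted in the text."""
    return end - start


probe_lengths = {name: probe_length(*c) for name, c in PROBE_COORDS.items()}
mean_gene_size_kb = sum(DIGEST_SUMS_KB.values()) / len(DIGEST_SUMS_KB)

print(probe_lengths)
print(mean_gene_size_kb)
```

The mean of the three sums is 29 kb, in line with the stated estimate; as the text notes, any intron region containing no probe-hybridizing fragment would make the true gene larger.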

DISCUSSION
The data reported here indicate that the chick lysyl hydroxylase subunit consists of 710 amino acid residues and a signal peptide of 20 amino acids. The polypeptide contains four potential attachment sites for asparagine-linked oligosaccharides, a finding that agrees with the previously established glycoprotein nature of the enzyme and the heterogeneity in the extent of glycosylation (7). The calculated molecular weight of 77,880 is in excellent agreement with the value of 78,000 determined for the enzyme subunit after removal of the oligosaccharide units by treatment with endoglycosidase H (7). The polypeptide contains 9 cysteine residues, at least one of which is likely to be involved in the binding of the Fe2+ atom to a catalytic site (1,33).
Previous research has established that prolyl 4-hydroxylase and lysyl hydroxylase are located within the cisternae of the rough endoplasmic reticulum; prolyl 4-hydroxylase is soluble, whereas lysyl hydroxylase appears to be membrane-bound (1). The hydrophilicity/hydrophobicity plot demonstrates that lysyl hydroxylase contains several hydrophobic regions (Fig. 4) and is more hydrophobic than the α- and β-subunits of prolyl 4-hydroxylase (9, 12). Nevertheless, the polypeptide does not seem to have any typical amino-terminal or carboxyl-terminal transmembrane domain, and the membrane anchorage may thus be due to other types of interaction. Several other membrane proteins of the endoplasmic reticulum also have no transmembrane domain, but the mechanisms by which these proteins are associated with the membrane are currently unknown (34,35).
The soluble luminal proteins of the endoplasmic reticulum contain the carboxyl-terminal tetrapeptide sequence -Lys-Asp-Glu-Leu or its closely related variant, which appears to be both necessary and sufficient for the retention of a polypeptide within the lumen of the endoplasmic reticulum (35). This carboxyl-terminal retention signal is also found in the β-subunit (9-11) but not in the α-subunit (12, 13) of prolyl 4-hydroxylase, which suggests that one function of the β-subunit in the prolyl 4-hydroxylase tetramer is to retain the enzyme within the lumen of the endoplasmic reticulum (8,12,32). Several families of transmembrane proteins of the endoplasmic reticulum appear to have another retention motif in their cytoplasmically exposed tails, this motif consisting of two lysines positioned 3 and 4 or 5 residues from the carboxyl terminus (36). The present data indicate that lysyl hydroxylase contains neither the retention signal of the soluble endoplasmic reticulum luminal proteins nor the retention signal of endoplasmic reticulum transmembrane proteins. This agrees with data suggesting that the enzyme is membrane-bound (see Ref. 1) but does not contain any COOH-terminal transmembrane domain.
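The two retention signals described above can be expressed as simple tests on the carboxyl terminus of a sequence. The sketch below is illustrative and makes two assumptions: the carboxyl-terminal residue is counted as position 1, and only -Lys-Asp-Glu-Leu itself is checked by default, because the text does not spell out the "closely related variant" (the `variants` parameter is a hypothetical hook for supplying others). The toy sequences are not from the paper.

```python
def has_kdel_signal(protein, variants=("KDEL",)):
    """True if the sequence ends in one of the given C-terminal tetrapeptides."""
    return protein.endswith(tuple(variants))


def has_dilysine_signal(protein):
    """True if Lys occupies position 3 and position 4 or 5 from the C terminus.

    The C-terminal residue is taken as position 1 (assumed convention), so the
    accepted endings are -K-K-X-X and -K-X-K-X-X.
    """
    if len(protein) < 5:
        return False
    return protein[-3] == "K" and (protein[-4] == "K" or protein[-5] == "K")


# Toy C-terminal examples.
for tail in ("MAAAKDEL", "MAAAKKTN", "MAAKAKTN", "MAAAAAAA"):
    print(tail, has_kdel_signal(tail), has_dilysine_signal(tail))
```

A sequence for which both functions return False corresponds to the situation reported here for lysyl hydroxylase, which carries neither signal.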
A surprising finding was that no significant homology was found between the primary structures of lysyl hydroxylase and the two types of subunit of prolyl 4-hydroxylase, in spite of the marked similarities between the kinetic properties of these two enzymes (see Introduction). The β-subunit of prolyl 4-hydroxylase has recently been shown to be a highly unusual multifunctional polypeptide, identical to the enzyme protein disulfide isomerase (9, 37) and a major cellular thyroid hormone binding protein (38,39), and highly similar to a glycosylation site binding protein of oligosaccharyl transferase (40). The α-subunits of prolyl 4-hydroxylase probably contribute most parts of the catalytic sites of the vertebrate enzyme tetramer (see Refs. 8 and 32), and early evolutionary forms of prolyl 4-hydroxylase present in unicellular and multicellular green algae appear to consist of only one type of monomer, which is antigenically related to the α-subunit of the vertebrate enzyme (41). Homologies would therefore seem especially likely between lysyl hydroxylase and the α-subunit of prolyl 4-hydroxylase. Seven sequences with some degree of such homology are shown in Fig. 5, but as yet no data are available to indicate whether any of these homologies are of functional or evolutionary significance.
The availability of cDNA clones for chick lysyl hydroxylase will provide tools for studying the mechanisms involved in regulating the synthesis of the enzyme (see Ref. 2), and the clones have already made it possible to isolate cDNA clones for human lysyl hydroxylase. The human cDNA clones in turn will make it possible to initiate detailed investigations into the mutations leading to the deficiency in lysyl hydroxylase activity in patients with the type VI variant of the Ehlers-Danlos syndrome (see Refs. 3 and 4).