Primary Structure of a Putative Receptor for a Ligand of the Insulin Family*

Nucleotide sequence analysis of human and guinea pig genomic DNA encoding a new member of the insulin receptor (IR) family revealed that the predicted primary structure of this IR-related protein is as similar to the IR and insulin-like growth factor (IGF) I receptor as the IR and IGF-IR are to each other. The conservation of this IR-related sequence among mammals and with the IR and IGF-IR suggests that this IR-related protein is a novel receptor for insulin, IGF-I, IGF-II, or an as yet unidentified peptide hormone or growth factor belonging to the insulin family.

Although insulin and one structurally related insulin-like growth factor (IGF),¹ IGF-I, are known to be primary regulators of metabolism and growth, the role of IGF-II remains unclear (1-3). Precise determination of the physiological functions of each of these peptides is complicated, however, by the ability of three known receptors to bind these peptides with overlapping specificities. Two of these receptors, the insulin receptor (IR) and the IGF-I or type I IGF receptor, belong to a family that also includes several more distantly related tyrosine kinases: the protein products of ltk and of the proto-oncogenes c-ros, met, and trk (4, 5). The IR and IGF-IR exhibit a high degree of overall similarity (6-8); extracellular α-subunits containing the ligand-binding region are disulfide-bonded to β-subunits, which span the membrane and contain a cytoplasmic tyrosine kinase activated by ligand binding. Surprisingly, the IGF-II or type II IGF receptor lacks any homology to the IR and IGF-IR as well as any known mechanism of signal transduction (9, 10). This IGF-IIR is the same molecule as the cation-independent mannose 6-phosphate (Man-6-P) receptor involved in lysosomal enzyme targeting (11) and shares antigenic properties with an IGF-binding protein present in serum (9). This IGF-IIR is unlikely to mediate IGF-II action, at least in the chicken; the avian Man-6-P receptor lacks the high affinity binding site for IGF-II (12). IGF-II also binds with high affinity to the receptor encoded by the human IGF-IR cDNA, although in most reports IGF-II has been shown to bind the type I IGFR with lower affinity than IGF-I (13). These results raise the possibility that there is another receptor in the IR family whose binding specificity differs from those of IR and IGF-IR (3, 13). Previous reports of anomalous binding of insulin and the IGFs have also been explained by hypothesizing the existence of additional receptors belonging to the IR family (14-19). In this study, we have defined the primary sequence of a new receptor structurally so similar to the IR and IGF-IR that it suggests this insulin receptor-related receptor (IRR) mediates the action of a ligand(s) that is identical or very similar to insulin, IGF-I, or IGF-II.

* This work was supported in part by Medical Research Council of Canada Grant MA8786. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank(TM)/EMBL Data Bank with accession number(s) J05046 and J05047.
‡ Funded by grants from the Quebec Diabetes Association and Canadian Diabetes Association.
¹ The abbreviations used are: IGF, insulin-like growth factor; IGF-IR, insulin-like growth factor I receptor; IGF-IIR, insulin-like growth factor II receptor; IR, insulin receptor; IRR, insulin receptor-related receptor; kb, kilobase(s); Man-6-P, cation-independent mannose 6-phosphate.
Library Screening, Cloning, and Sequencing. Part of the human IR gene was isolated by screening a library of genomic DNA (24) using synthetic oligonucleotides based on the human IR cDNA sequence (241-260 and 261-280 base pairs (6)). DNA encoding part of exon 2 from the human IR (an AluI/SstI fragment encoding L23-L125) was used as a probe to isolate IRR genomic DNA from a library of guinea pig genomic DNA. A guinea pig IRR DNA probe encoding part of exon 2 (PstI fragment encoding L29-Q133, Fig. 3) was used to isolate a phage containing all of the human IRR gene except exon 1 from a λCharon 4A library (24). Insert DNAs from both human and guinea pig genes were subcloned into vectors pEMBL18 or pEMBL19 (25); deletions were created by restriction enzyme digestion; and single-stranded DNA was sequenced by the dideoxy method using Klenow (26) and resequenced using modified T7 DNA polymerase and dITP to resolve ambiguities (27).

RESULTS AND DISCUSSION
To determine if IR DNA could be used to identify previously unknown receptors of the IR family, Southern blots of mammalian genomic DNA were probed with DNA encoding part of the extracellular ligand-binding region of the human IR. DNA from exon 2, which encodes an N-terminal region of the α-subunit (28), hybridized at reduced stringency to multiple fragments in guinea pig, human, and rat genomic DNA (Fig. 1A). The single fragment in each species that encodes the IR was identified by washing at 60 °C to eliminate hybridization to less related sequences (data not shown). The size estimated for this EcoRI fragment in human DNA (≈18 kb, Fig. 1A) is similar to that of an EcoRI fragment containing exon 2 of the human IR isolated by Seino et al. (28). One other hybridizing fragment in each genome (Fig. 1A) probably encodes the IGF-IR since there is 75% identity between DNA encoding the human IR and IGF-IR in the region used as probe (6-8). At least one hybridizing fragment in each genome, however, could not be attributed to these known genes.
Since analysis of genomic DNA by Southern blotting suggested that IR DNA could identify novel, homologous DNA sequences, we used the human IR DNA as probe to isolate IR-related sequences from a guinea pig genomic library. The region of this guinea pig DNA that was homologous to the human IR probe selectively detected EcoRI fragments in guinea pig, human, and rat genomic DNAs (Fig. 1B) that had hybridized only weakly to the human IR probe (Fig. 1A). We used this IRR DNA from guinea pig to isolate the analogous human IRR gene. The novel identity of this IRR DNA was confirmed by localizing the human gene to chromosome 1.² In contrast, genes encoding other members of the IR family are known to be located on human chromosomes 6, 7, 15, and 19 (c-ros, met, IGF-IR, and IR, respectively (8, 29-31)).
Nucleotide sequence analysis of the guinea pig and human genomic DNA encoding the IRR revealed an overall gene organization similar to that of the human IR. The guinea pig IRR gene consists of 22 exons (Fig. 2). All of these exons, with the exception of exon 1, which encodes the signal peptide and the first three amino acids of the α-subunit, were also present in our human IRR isolate (Fig. 2). This intron/exon organization of both IRR genes is identical to that of the human IR gene (28). Indeed, of the 42 identifiable intron/exon boundaries in the IRR and IR genes, 36 are at analogous positions and the remaining 6 accommodate deletions or insertions of no more than 5 amino acids. Although the relative sizes of introns within each gene are similar, the IRR genes in both species are approximately 10-fold smaller (≈15 kb in the guinea pig; ≈13 kb without the signal exon in the human) than the human IR gene (>120 kb (28)). The identical exon organization of both IRR genes (Fig. 2) and the human IR gene (28) revealed that the IRR gene does not exhibit the common pseudogene structure arising from reverse transcription of a processed RNA (32). Although we have not yet detected IRR transcripts using Northern analysis of poly(A) RNA from multiple tissues and cell lines including placenta, liver, muscle, and brain, the absence of in-frame stop codons in all human and guinea pig IRR exons, even though the amino acid sequence of the IRR has diverged significantly from those of the IR and IGF-IR (Fig. 3), also indicates that the IRR gene is not a pseudogene. Alignment of the predicted amino acid sequences of the human and guinea pig IRR with those of the human IR and IGF-IR revealed colinear organization of these preproreceptors over their entire length (Fig. 3). The processing of the IRR to its mature form, therefore, is likely similar to that of the IR and IGF-IR.
² P. Shier, H. F. Willard, and V. M. Watt, unpublished data.
A 26-residue sequence beginning at a consensus initiator methionine (GCCATGG compared to consensus ACCATGG (33)) of guinea pig prepro-IRR (Fig. 3) exhibits structural characteristics typical of cleavable signal peptides (34). A putative cleavage site (Arg-His-Arg-Arg720) in the IRR exhibits the basic character of the cleavage recognition sites (Arg-Lys-Arg-Arg) present at analogous positions in the human IR and IGF-IR (Fig. 3). This cleavage of the IRR precursor, which contains 1271 amino acids with a calculated Mr ≈ 141,000, would result in an α-subunit of 716 amino acids with Mr ≈ 79,000 and a β-subunit of 551 amino acids with Mr ≈ 61,000. The predicted subunit sizes of the IR and IGF-IR are similar: 84 and 80 kDa for the α-subunits; 70 and 71 kDa for the β-subunits, respectively (6, 8). Within the extracellular regions, 8 of 10 potential N-linked glycosylation sites conserved between the guinea pig and human IRR are also present in both the human IR and IGF-IR (Fig. 3). The IRR may therefore exhibit a high level of glycosylation similar to that of the IR and IGF-IR (6-8).
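The subunit sizes quoted above can be checked with simple residue bookkeeping. The sketch below is only an illustration and not the authors' calculation: the residue counts are taken from the text, whereas the average residue mass of 110 Da is a common rule of thumb assumed here, and the function names are hypothetical.

```python
# Rough consistency check of the subunit sizes quoted in the text.
# ASSUMPTION: 110 Da per residue is a rule-of-thumb average, not a value
# from the paper; glycosylation is ignored.
AVG_RESIDUE_MASS_DA = 110

PRECURSOR_LEN = 1271  # residues in the IRR precursor (from the text)
ALPHA_LEN = 716       # residues in the alpha-subunit (from the text)
BETA_LEN = 551        # residues in the beta-subunit (from the text)


def unassigned_residues(precursor: int, alpha: int, beta: int) -> int:
    """Residues of the precursor that belong to neither subunit."""
    return precursor - alpha - beta


def approx_mass_da(n_residues: int) -> int:
    """Approximate polypeptide mass in daltons (hypothetical helper)."""
    return n_residues * AVG_RESIDUE_MASS_DA


if __name__ == "__main__":
    print("unassigned residues:",
          unassigned_residues(PRECURSOR_LEN, ALPHA_LEN, BETA_LEN))
    for name, length in [("precursor", PRECURSOR_LEN),
                         ("alpha", ALPHA_LEN),
                         ("beta", BETA_LEN)]:
        print(name, approx_mass_da(length))
```

Under these assumptions, the four residues left over equal the length of the tetrabasic Arg-His-Arg-Arg site, and the estimated subunit masses fall close to the quoted values of about 79,000 and 61,000.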
As in the IR and IGF-IR, the α- and β-subunits of the IRR are also likely to be disulfide-linked, forming a biologically active α2β2 heterotetrameric receptor complex. The Cys residues suggested to be involved in this linkage of α- and β-subunits of the IR and IGF-IR (8) are also present in the IRR (between Cys'j31 and (Fig. 3)). These α/β-subunit heterotetramers could anchor in the membrane via a single 22-amino acid hydrophobic sequence (Val896 to Phe917) of the β-subunit. Ligand binding to the extracellular region of the IRR may involve the Cys-rich region of the α-subunit (24 Cys between Cys150 and Cys302) since the corresponding Cys-rich region of the IR has been implicated in insulin binding (35). Characteristics of ligand binding and/or signal transduction may also be determined by a small region encoded by exon 11 immediately N-terminal to the putative α/β cleavage site of the IRR (Fig. 3). This exon is differentially spliced from human IR RNA in specific tissues (36) and is absent from the placental human IGF-IR sequence (8). Signal transduction following ligand binding would likely lead to activation of the tyrosine kinase present in the cytoplasmic region of the β-subunit. Structural features characteristic of tyrosine kinases are located between amino acid residues Leug5' and T h P 4 and include a potential ATP binding site (Gly-X-Gly-X-X-Gly965 and Lys987) as well as all other residues conserved among kinases (4). Tyrosine specificity of this putative kinase is supported by the presence in the IRR of the two protein tyrosine kinase consensus sequences (Asp1oSg to to G~u "~~ (4)). Within the tyrosine kinase domains, the four Cys residues and six of seven Tyr residues conserved between the IR and IGF-IR are also present in the IRR. Among these are the autophosphorylation sites implicated in the regulation of IR kinase activity (Tyr1162,1163 in human IR (37)). The structure of the physiologically active conformation of the cytoplasmic kinase domain may be influenced by the C-terminal "tail" (38, 39). This C-terminal tail of the IRR (57 or 60 residues) is shorter than it is in either the IR (98 residues) or IGF-IR (107 residues), although its size in the IRR is more similar to those of other members of the IR family such as met and human c-ros (59 and 51 residues, respectively (4)). Characteristics initially described for the human IR C terminus, hydrophilicity and enrichment of Pro and Gly residues (6), are evident in the IRR: of 57 residues in this region of the human IRR, 31 are hydrophilic and 13 are either Pro or Gly.

FIG. 3. Predicted amino acid sequences of human and guinea pig IRR aligned with those of closely related members of the IR family. Open boxes indicate residues of human IRR (hIRR) identical to those of human IR (hIR (6)), human IGF-IR (hIGF-IR (8)), or human c-ros protein (hc-ros (40)). Only differences between guinea pig IRR (gpIRR) and human IRR are shown. Numbers indicate amino acid residues of each proreceptor and for human IRR were assigned assuming that the unidentified initial amino acids of the α-subunit are identical in number to those in the guinea pig IRR. Cys residues are indicated with black boxes; Tyr residues in the cytoplasmic region are bold. The signal peptide and transmembrane regions are underlined with solid lines. Consensus N-glycosylation sites (NXS/T) of human IRR are overlined with solid lines; consensus sequences for tyrosine kinase activity with dashed lines. Intron positions are indicated by triangles; the potential ATP binding site (GXGXXG and K) by asterisks.

FIG. 4. The scheme of the predicted preproreceptor is described in Fig. 2. Vertical numbers indicate the positions of the initial amino acids in each region or the terminal residue of the IRR precursor. Horizontal numbers indicate percent identity between corresponding domains. h, human; gp, guinea pig; -, not available; ND, not determined.

The amino acid sequence of the IRR exhibits very high levels of identity with those of both the IR and IGF-IR (Fig. 4). Indeed, similarity is as high as that between the IR and IGF-IR (Fig. 4). Among these three receptors, the cytoplasmic region of the β-subunit, which defines the enzymatic domain for tyrosine-specific kinase activity, exhibits the highest degree of identity, ≥79% (Fig. 4). In contrast, the next most related member of the IR family, the oncoprotein c-ros, exhibits only 50% identity with either the human IRR or IR in this region. Among the IRR, IR, and IGF-IR, the regions between the kinase and transmembrane domains and the regions flanking the Cys-rich domain are somewhat less similar (52-65% identity). Although overall similarity of the Cys-rich domains is more limited (only 46-51% identity), there is remarkable conservation of Cys residues (>88%). Surrounding the α/β cleavage site, identity also is lower (41-46%) and includes conserved Cys residues. The most divergent region in this family of receptor molecules is at the cytoplasmic C terminus (19-45% identity) where even between guinea pig and human IRR there is only 68% identity (Fig. 4). Even greater species variation in the C terminus region has previously been reported for the chicken and human c-ros gene products (40).
As expected, amino acid similarity between signal domains or transmembrane domains is limited to their overall hydrophobic character.
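The percent identities quoted above are per-domain comparisons of aligned sequences. The text does not state how gapped positions were treated, so the following is a minimal sketch under an assumed convention: identity is counted only over alignment columns where neither sequence has a gap. The function names and the short toy sequences are hypothetical and are not taken from Fig. 3.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identical residues over columns where neither sequence is gapped.

    ASSUMPTION: gapped columns are excluded from the denominator; the paper
    does not say which convention was used.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def residue_conservation(seq_a: str, seq_b: str, residue: str = "C") -> float:
    """Percent of `residue` positions in seq_a that hold the same residue in seq_b."""
    positions = [i for i, a in enumerate(seq_a) if a == residue]
    if not positions:
        raise ValueError("residue not present in first sequence")
    kept = sum(1 for i in positions if seq_b[i] == residue)
    return 100.0 * kept / len(positions)


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical): Cys positions are kept while
    # several other positions differ.
    a = "CKAC-GHC"
    b = "CRTCSGQC"
    print(round(percent_identity(a, b), 1))
    print(residue_conservation(a, b, "C"))
```

Reporting the two measures separately reflects the observation for the Cys-rich domain, where overall identity is limited but the Cys residues themselves are largely conserved.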
This conservation of the IRR among mammals and with the IR and IGF-IR suggests that the IRR gene encodes a new receptor of the IR family. Indeed, the high degree of identity between the IRR and the insulin and IGF-I receptors in the extracellular ligand-binding region suggests that the IRR is a novel receptor for insulin, IGF-I, IGF-II, or an unknown structurally similar ligand. Although the primary structures of receptors for insulin and IGF-I have been defined (6-8), the IRR may be an additional receptor that exhibits unique binding characteristics for insulin or IGF-I. Alternatively, the IRR may mediate the biological action of IGF-II while the known IGF-IIR/Man-6-P receptor acts only as a binding protein without the ability to function in signal transduction. This DNA encoding the IRR may thus facilitate determination of the physiological role of IGF-II.