The Basolateral Targeting Signal in the Cytoplasmic Domain of Glycoprotein G from Vesicular Stomatitis Virus Resembles a Variety of Intracellular Targeting Motifs Related by Primary Sequence but Having Diverse Targeting Activities*

Using systematic site-directed mutagenesis, the basolateral targeting signal in the cytoplasmic domain of glycoprotein G from vesicular stomatitis virus (VSV G) has been localized to an 11-amino acid sequence, which contains two essential residues and a third that makes a minor contribution. A tyrosine at position 19 of the 29-residue carboxyl-terminal cytoplasmic tail is the most important residue and cannot be replaced by other aromatic amino acids, while an isoleucine at position 22, 3 residues carboxyl-terminal to this tyrosine, is also critical but can be replaced by other aliphatic residues. Additionally, an arginine at position 16 makes a minor contribution. Therefore the crucial elements of this targeting signal can be represented by the sequence Y-X-X-aliphatic. While earlier investigation has suggested similarity between basolateral targeting and internalization signals, alignment of this sequence with other cytoplasmic targeting signals suggests the existence of a broad class of homologous targeting motifs that direct protein delivery to a variety of cellular locations. This in turn suggests the existence of a family of homologous receptors, distributed throughout the cell, which differ in their affinity for subsets of these targeting sequences.
... receptor; HA+8, a mutant influenza hemagglutinin with an 8-amino acid extension; TfR, transferrin receptor; glyc. 106, a mutant glycophorin; EGFR, epidermal growth factor receptor.

The plasma membrane of a polarized epithelial cell is typically divided into two domains with distinct compositions (1). Understanding the establishment of the non-uniform distribution of proteins in epithelial cell membranes is an active area of research (1-3). In the Madin-Darby canine kidney (MDCK) cell line, apical and basolateral proteins apparently co-migrate through the exocytic pathway until they reach the TGN, where they are "sorted" into separate exocytic vesicles destined for either the apical or the basolateral domain of the cell surface (4, 5). Recent experiments have suggested that short amino acid sequences in the cytoplasmic domains of transmembrane proteins function as basolateral targeting signals (6-12). Most of these signals resemble cytoplasmic internalization signals in that they contain important aromatic residues.
Occasionally basolateral targeting signals are found in the same region of the primary sequence as internalization signals; however, in each case where signals have been more carefully investigated, basal targeting and internalization signals have proven to be distinct (8, 9, 13-15). A number of internalization signals have been analyzed in detail, leading to several proposals for consensus internalization motifs (16-19). However, in the case of basolateral targeting, only two such signals, one in the polymeric immunoglobulin receptor (pIg receptor) and one in lysosomal acid phosphatase, have been extensively analyzed, and they appear to share no sequence similarity (9, 20). Before any consensus can be proposed for a basal targeting motif, a larger number of these signals needs to be precisely defined.
* This research was supported in part by Grant GM37547 and Training Grant T32GM08203 (to D. C. T.) from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The abbreviations used are: MDCK, Madin-Darby canine kidney; TGN, trans-Golgi network; VSV G, glycoprotein G from vesicular stomatitis virus; HA, hemagglutinin; DMEM, Dulbecco's modified Eagle's medium; PCR, polymerase chain reaction.
Several earlier experiments suggested that the glycoprotein G from vesicular stomatitis virus (VSV G) might contain cytoplasmic basolateral targeting information (21, 22). The existence of a dominant basolateral targeting signal in the cytoplasmic domain of this protein was conclusively established when the tail sequence was shown to be sufficient to confer basolateral targeting upon a normally apical protein (10). The targeting information was shown to be completely dependent upon a unique tyrosine at position 19 of the 29-residue carboxyl-terminal tail (10). Additionally, high internalization rates measured for VSV G in other early experiments were interpreted to mean that this cytoplasmic sequence also contained a coated pit localization signal (23, 24), which was postulated to involve the sequence YTDI (19). However, a complication of these experiments was that the internalization rate of G was measured after fusion of virus with the cell membrane, perhaps allowing aggregated patches of viral G protein to yield an artificially high internalization rate. Subsequent experiments, finding a much lower internalization rate for VSV G (10, 25), as well as a negligible effect of mutation of the tyrosine on the internalization rate (10), suggested that this protein did not contain a coated pit localization signal at all. Therefore, the identification of basal targeting information in the cytoplasmic tail of VSV G provided an opportunity for a detailed analysis of a basal targeting sequence, as well as the potential to identify characteristics that distinguish these signals from apparently similar internalization signals.
Mutational analysis performed in a chimera containing the cytoplasmic domain of VSV G and the lumenal and transmembrane domains of the normally apical influenza hemagglutinin (HA) has allowed identification and characterization of a sequence which directs basal targeting of this chimera. The signal identified shows homology with many cytoplasmic targeting signals that are able to direct targeting to a variety of cellular locations in addition to the basal membrane and coated pits.

MATERIALS AND METHODS
Construction of Mutants-The gene for the G protein investigated in this work was originally derived from the San Juan strain of the Indiana serotype of VSV (26). Construction of cDNAs encoding the HHG chimera and the HAtail- construct has been previously reported (10, 27).
The entire coding sequence of these genes was subcloned into the expression vector pCB6 (7) under the control of the cytomegalovirus immediate-early promoter. The HHG Y19S mutant was constructed using the M13-based mutagenesis protocol of Kunkel (28), modified slightly to use two primers according to Zoller and Smith (29). All remaining mutants were constructed by one of several PCR-based mutagenesis techniques (30-32) using either an HHG or HHG R26t gene (where t = stop) as a PCR template and Vent polymerase (New England Biolabs) as the PCR enzyme. For the truncated constructs, codons following the last amino acid of each new construct were mutated to encode stop codons. Constructs containing internal deletions were made using mutagenic deoxyoligonucleotides with homology to the template on either side of the sequence to be deleted and lacking the sequence of the intended deletion. Point mutants were made with deoxyoligonucleotides designed to change the coding sequence of each codon individually, and in the case of the degenerate mutagenesis at positions Tyr-19 and Ile-22, a mixture of oligonucleotides was used to produce a variety of amino acids at each position. For each of these mutants, a 500-base pair restriction fragment generated from digestion of the mutagenesis product was subcloned into the expression vector pCB6 containing the remainder of the gene. Mutations were identified by dideoxy sequencing of the region of the genes encoding the entire HHG cytoplasmic domain. Additionally, for seven of the mutants generated by PCR techniques, the complete 500-base pair subcloned fragment was sequenced to check for second site mutations; as none were found, second site mutations were not considered to be a problem with the techniques used. All sequencing was performed according to the protocol supplied with the Sequenase kit version 2.0 (U. S. Biochemical Corp.).
During the time course of targeting experiments, a similar proportion of each mutant (and wild-type) protein was processed to its fully glycosylated form and then delivered to the cell surface, indicating that the mutations did not inhibit folding, glycosylation, or exocytic delivery. Additionally, all proteins constructed could be cleaved by trypsin into two stable proteolytic fragments, suggesting that they formed stable HA trimers with no overall structural perturbation of the external domain. (Processing and exocytosis of R16t was somewhat inhibited; however, no essential conclusions were drawn from experiments on this protein.)
Preparation and Characterization of MDCK Cell Lines-Transfection, selection, and maintenance of MDCK cells expressing the mutant proteins were as previously described (7). The cell populations transfected were derived from a clonal MDCK cell line selected for its high degree of polarity as judged by the methionine uptake assay (7). All experiments were performed on uncloned populations of transfectants to minimize the effect of clonal variation. Control experiments on HHG and HAtail- clonal cell lines verified that the polarity of the constructs in the uncloned populations agreed closely with the polarity of the constructs in individual clones derived from these populations. The ability of the transfected populations to form polarized monolayers was established by measuring the ability of the cells to sort the methionine transporter to the basolateral surface. The methionine uptake assay was performed on at least one transfected population for every mutant, and in each case at least 92% of the methionine uptake occurred through the basolateral surface. In no case did transfection of a particular mutant cause loss of polarity of the cell population as measured by this assay.
Probably due to different transfection efficiencies, the expression level of different transfected populations varied over a 5-fold range, with most populations having an expression level in the middle of this range and varying only about 2-fold. The expression level of all HHG mutants in which basal targeting was impaired either fell into the middle range similar to HHG or was lower. No correlation between protein polarity and expression level was observed.
Assay of Polarity of Surface Delivery and Quantitation-To prepare for experiments, cells were plated and grown for 5 days on Costar Transwell filters (Costar Corp., Boston, MA) as previously described (7). Confluent monolayers were treated with 10 mM sodium butyrate (Sigma) for 14-15 h to increase the expression level of the transfected genes. Controls showed that this treatment did not affect the polarity of methionine uptake or the polarity of the wild-type HHG construct in these cells. For each experiment, three different monolayers of cells were washed several times with phosphate-buffered saline (138 mM NaCl, 2.7 mM KCl, 8.1 mM Na2HPO4, 1.2 mM KH2PO4, pH 7.4) containing 1 mM Mg2+ and 0.1 mM Ca2+, incubated for 30-45 min in Dulbecco's modified Eagle's medium (DMEM, Life Technologies, Inc.) lacking methionine and cysteine (Met-/Cys-), and then incubated with 140 µl of Met-/Cys- medium containing 1.5-3 mCi/ml Tran35S-label (ICN, Irvine, CA) on the basolateral surface for 30 min to metabolically label newly synthesized protein. The cells were incubated in normal DMEM for 1 h to allow newly synthesized protein to proceed to the surface. During this incubation, 10 µg/ml trypsin (Sigma) was included in the apical or the basal medium of two different monolayers and 20 µg/ml soybean trypsin inhibitor (Sigma) was included on the opposing side. The third monolayer was incubated in DMEM alone. After 1 h at 37 °C, all monolayers were incubated for an additional 10 to 15 min with DMEM containing 100 µg/ml soybean trypsin inhibitor to inhibit trypsin on both cell surfaces and any in endosomes. This medium was removed and replaced with ice-cold DMEM also containing 100 µg/ml soybean trypsin inhibitor, and the cells were rocked gently on ice for 30-45 min. The cells were lysed in ice-cold lysis buffer (50 mM Tris-HCl, pH 8.0, 1% Nonidet P-40, 0.1% SDS, 5 µg/ml aprotinin; Sigma) containing 100 µg/ml soybean trypsin inhibitor.
Cells were scraped into the lysate, which was centrifuged for 30 min at 12,000 × g. One-half to two-thirds of the supernatant was transferred to another set of tubes containing an equal volume of NETgel (50 mM Tris-HCl, pH 8.0, 1 mM EDTA, 150 mM NaCl, 0.25% gelatin, 0.05% Nonidet P-40, 0.01% NaN3) + 3% bovine serum albumin, 100 µl of a 10% solution of protein A-Sepharose (Pharmacia Biotech), and 0.5 µl of anti-HA rabbit serum, which will bind to the external domain of HA present in all constructs. After allowing antibody and protein A binding to occur during gentle rocking either overnight at 4 °C or 4 h at room temperature, the Sepharose was washed and proteins were eluted and analyzed by SDS-polyacrylamide gel electrophoresis as previously described (10). Gels were exposed to phosphor image screens to quantify the bands of the uncleaved proteins and their two cleavage products, HA1 and HA2 (model 400E PhosphorImager, Molecular Dynamics, Sunnyvale, CA). An autoradiograph of a sample gel can be found in Ref. 10. The relative amount of each chimera delivered to the basal as opposed to the apical surface during the 1-h incubation was calculated as follows. The percentage of cleaved protein recovered as HA1 and HA2 when trypsin was present in either the apical or the basal medium of two separate monolayers was first corrected for the level of endogenous cleavage measured in the third sample incubated without trypsin. This correction assumed that endogenous cleavage did not occur preferentially at either the apical or basolateral surface. The amount of trypsin-specific cleavage at the basolateral surface was then expressed as a percentage of total (apical + basal) trypsin-specific cleavage. Therefore the numbers reported represent the percent of newly exocytosed protein that was initially delivered to the basolateral domain of the cell surface.
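The calculation described above can be illustrated with a minimal sketch. The function name, argument names, and the example numbers are hypothetical and are not taken from this work; the sketch assumes that the endogenous cleavage measured in the trypsin-free monolayer is simply subtracted from each trypsin-treated sample before the ratio is taken.

```python
def percent_basal(apical_cleaved, basal_cleaved, endogenous_cleaved):
    """Percent of trypsin-specific cleavage that occurred at the basal surface.

    Each argument is the percentage of immunoprecipitated protein recovered
    as HA1 + HA2 in one monolayer: trypsin in the apical medium, trypsin in
    the basal medium, and no trypsin (endogenous cleavage), respectively.
    """
    # Correct both trypsin-treated samples for endogenous cleavage.
    apical_specific = apical_cleaved - endogenous_cleaved
    basal_specific = basal_cleaved - endogenous_cleaved
    total_specific = apical_specific + basal_specific
    if total_specific <= 0:
        raise ValueError("no trypsin-specific cleavage detected")
    # Express basal cleavage as a percentage of total (apical + basal).
    return 100.0 * basal_specific / total_specific


# Hypothetical readings, for illustration only (not experimental data).
example = percent_basal(apical_cleaved=12.0, basal_cleaved=52.0,
                        endogenous_cleaved=2.0)
```

With these illustrative inputs the corrected values are 10 (apical) and 50 (basal), so five-sixths of the trypsin-specific cleavage is assigned to the basal surface. How a corrected value below zero should be handled is not specified in the text, so the sketch leaves it untouched.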
Unless otherwise specified, reagents were purchased from Sigma.

The Basolateral Targeting Information in the Cytoplasmic Domain of VSV G Is Completely Contained within an 11-Amino Acid Sequence Spanning Tail Residues Lys-14 through Met-24-Mutational analysis of wild-type VSV G to define the basolateral targeting signal in the cytoplasmic domain of this protein was complicated by the possibility, supported by earlier experiments, that the lumenal domain of this protein contained redundant basolateral targeting information (10, 33, 34).
Therefore, mutagenesis was performed on a chimera, HHG, containing the cytoplasmic domain of VSV G and the lumenal and transmembrane domains of the normally apical influenza HA. An added advantage of using this chimera was the availability of an assay that uses trypsin cleavage to monitor surface arrival of the protein. This assay, used in the past to monitor the surface arrival of a variety of HA constructs (7, 10), has proven to be more sensitive and more reproducible than techniques that require saturable, tight binding of a reagent such as an antibody or biotin to monitor protein distribution (35).
To define the region of the G tail containing the basolateral targeting information, the HHG gene was mutated to encode a series of truncations and internal deletions in the G cytoplasmic domain and the mutated genes were subcloned into the expression vector pCB6 under the control of the cytomegalovirus promoter. This DNA was transfected into MDCK cells and neomycin-resistant transfectants were selected using G418. To monitor the exocytic targeting of the resulting proteins in the transfected MDCK cells, the cells were cultured on filters where they form polarized monolayers in which the apical and basolateral surface domains are separated by tight junctions.
FIG. 1. ... deleted. The number reported as "percent basal" is the fraction of newly exocytosed protein delivered directly to the basal surface during a 1-h ... or equal to 3. The last column contains either n or actual experimental values when n = 2. The sequence located beneath the series of constructs illustrates the region determined by this analysis to encode basal targeting information.
The exocytic proteins were metabolically labeled with 35S by a brief incubation of the cells in Tran35S-label. The cells were then incubated at 37 °C with trypsin in the media to allow the labeled protein to proceed through the exocytic pathway until it reached the cell surface and encountered the protease. In the presence of trypsin, HAtail- and chimeras of HA containing foreign cytoplasmic domains were cleaved into two tryptic fragments that remained membrane-bound. After this incubation, the trypsin was inhibited, the cells were lysed, and the HHG construct present in the lysates was immunoprecipitated using antibodies to the external domain of HA. The precipitated proteins were viewed by SDS-polyacrylamide gel electrophoresis and autoradiography, and the polarity of surface delivery of each protein was determined by comparing the amount of tryptic fragments produced when trypsin was present in the basal medium of one monolayer to the amount produced when trypsin was present in the apical medium of a separate monolayer. All numbers reported in the following figures and text represent the percentage of exocytosed protein delivered directly to the basolateral domain of the cell surface during a 1-h incubation, calculated as described under "Materials and Methods." Whenever a decrease in basal targeting was observed for a particular construct, a corresponding increase in apical targeting was also observed, indicating that loss of basal delivery was due to missorting and not loss of protein. Fig. 1 contains an illustration of the various truncations and deletions made in the G cytoplasmic domain of the HHG chimera, as well as the polarity measured for each resulting construct. For convenience, only the 29-amino acid carboxyl-terminal tail of G is shown next to the last 3 residues of the HA transmembrane domain.
The first two proteins shown are controls, which confirm previous observations (10) that the HA lacking any cytoplasmic sequence (HAtail-) is delivered largely to the apical surface (only 28% basal) and that the addition of the G tail to this protein creates a chimera (HHG) delivered almost exclusively (98%) to the basolateral surface. The polarity of the next two proteins, R26t and M24t, indicates that deletion of the last 5 residues of this tail has little or no effect on the targeting information encoded there. Deletion of the last 7 residues in the I22t protein has a significant but small effect, suggesting that the most critical elements of the signal have been retained, although residues 23 and 24 are apparently needed for maximal signal function. The polarity of R16t indicates that deletion of the last 13 residues, including tyrosine 19, which was demonstrated to be essential to this targeting signal in wild-type VSV G, yields a protein that is more apical than basal, suggesting that critical elements of the basal targeting sequence have been deleted. The membrane-distal border of the targeting signal can therefore be placed at residue Met-24, with the last 2 residues, positions 23 and 24, making a minor but significant contribution. The remaining three constructs illustrated in this figure contain internal sequence deletions. The tight basal polarity of Δ1-8 and Δ6-13,26t indicates that residues 1-13 do not contain critical elements of the targeting sequence; however, the loss of basal targeting seen when the entire sequence is deleted in the Δ1-13 construct suggests that some minimal separation between the cytoplasmic signal and the transmembrane domain is required. The membrane-proximal border of this signal can therefore be placed conservatively at residue Lys-14, as no further deletions on the amino-terminal side of the tyrosine were tested.
The 11-amino acid region of the G tail that has been defined to contain basal targeting information has been illustrated beneath the series of constructs in Fig. 1.
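The relationship between the truncation names and the number of carboxyl-terminal residues removed can be checked with a short sketch. It assumes the naming convention stated under "Materials and Methods" (t = stop, placed after the named residue) and the 29-residue tail length; the function and variable names are hypothetical.

```python
import re

TAIL_LENGTH = 29  # residues in the VSV G cytoplasmic tail, as stated in the text


def residues_deleted(construct, tail_length=TAIL_LENGTH):
    """Number of carboxyl-terminal tail residues removed in a truncation.

    A name such as "R26t" is read as: Arg-26 is the last residue kept and a
    stop codon (t) follows it.
    """
    match = re.fullmatch(r"([A-Z])(\d+)t", construct)
    if match is None:
        raise ValueError(f"not a truncation name: {construct!r}")
    last_kept = int(match.group(2))
    if not 1 <= last_kept <= tail_length:
        raise ValueError(f"position {last_kept} lies outside the tail")
    return tail_length - last_kept


# Truncation constructs named in the text.
deleted = {name: residues_deleted(name)
           for name in ("R26t", "M24t", "I22t", "R16t")}
```

Under this convention M24t, I22t, and R16t lack the last 5, 7, and 13 residues, matching the numbers quoted above, and R26t lacks the last 3.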

FIG. 2. The y axis indicates the percent of newly exocytosed protein delivered directly to the basal surface during a 1-h incubation. Gray horizontal bars indicate the values determined for the positive control R26t (n = 5) and the negative control HAtail- (n = 6), and the width of these bars is ±1 standard deviation from the mean value. For each mutant illustrated across the x axis, the ends of the error bars represent values for n = 2 and ±1 standard deviation for n ≥ 3. A, alanine scan of G tail residues 14-24. Each individual amino acid throughout the Lys-14-Met-24 region was individually changed to an alanine in R26t. This sequence is listed along the x axis, where the amino acid designation indicates which residue was changed to an alanine in each construct. Number of trials for each mutant are as follows: K14A, n = 2; K15A, n = 2; R16A, n = 4; Q17A, n = 2; I18A, n = 2; Y19A, n = 2; T20A, n = 2; D21A, n = 3; I22A, n = 2; E23A, n = 2; M24A, n = 2. B, triple alanine point mutants in R26t background. In the KKR protein, Lys-14, Lys-15, and Arg-16 were all changed to alanine (n = 2). In the TDE protein, Thr-20, Asp-21, and Glu-23 were all changed to alanine (n = 3).

FIG. 3. Degenerate mutagenesis at residues Tyr-19 and Ile-22. The results of this mutagenesis are presented as described for Fig. 2. Again all mutants shown were constructed in the background of R26t. A, mutagenesis at Tyr-19. Amino acids illustrated across the x axis represent residues to which Tyr-19 was changed in each construct. For each protein n = 2. B, mutagenesis at Ile-22. Amino acids illustrated across the x axis represent residues to which Ile-22 was changed in each construct. Number of trials for each mutant are: I22I (wild-type), n = 5; I22L, n = 4; I22V, n = 2; I22F, n = 6; I22Y, n = 2; I22C, n = 2; I22G, n = 2; I22P, n = 3; I22S, n = 4; I22A, n = 2; I22H, n = 2; I22R, n = 2; I22N, ...

An Alanine Scan of Residues 14-24 Indicates That Tyr-19 and Ile-22 Are the Most Critical Residues in the Basal Targeting Signal, with Arg-16 Playing a Minor Role-After identification of the region of the tail containing basal targeting information, a set of point mutants was constructed to determine which amino acids within this 11-residue sequence contributed to the targeting signal. This series of mutations, as well as those to be described subsequently, was made in the background of R26t. To determine which residues played a positive role in targeting, each of the 11 residues between Lys-14 and Met-24 was individually changed to alanine and the polarity of surface arrival of each mutant protein was measured as described above. As illustrated in Fig. 2A, only 3 of the residues in this sequence seem to have a significant effect on the basal targeting activity, as measured by the "alanine scan." Tyr-19 and Ile-22 both appear to be essential residues in this signal, as mutation of either one alone reduced the basal targeting of the protein to the background level of HAtail-. Arg-16 also makes a small but significant contribution to the basal targeting activity. Additionally, 3 of the residues neighboring the critical tyrosine and isoleucine, Thr-20, Asp-21, and Glu-23, were all changed to alanine in the same protein to determine what contribution the overall physical properties of this region might make to the targeting activity of the tyrosine-isoleucine pair. These effects might not be detected by the mutation of individual residues to alanine. However, as illustrated in Fig. 2B, mutation of all 3 residues at once had no effect on the targeting activity, suggesting that these residues do not make any positive contribution to basal targeting.
Finally, the Y19A and I22A mutations were also made in the full-length HHG protein, which is 3 residues longer than the R26t construct, to compare the effect of these mutations in the two different backgrounds. While the Y19A mutation had the same effect on basal targeting activity in HHG and R26t, the I22A mutation had a less severe effect in the full-length protein; R26t I22A was only 29% basal (Fig. 2A), while HHG I22A was 86 ± 1% basal (data not shown).
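The design of the alanine scan can be written out as a minimal sketch that enumerates the single-residue substitutions across Lys-14 through Met-24. The 11-residue sequence is assembled from the residue names given in the legend to Fig. 2; the function and variable names are hypothetical, and the sketch covers only the naming of the mutants, not the oligonucleotide design.

```python
SIGNAL_REGION = "KKRQIYTDIEM"  # tail residues 14-24, from the Fig. 2 legend
FIRST_POSITION = 14            # tail position of the first residue above


def alanine_scan(sequence, first_position):
    """Return (mutant name, mutated sequence) for each single Ala substitution."""
    mutants = []
    for offset, residue in enumerate(sequence):
        if residue == "A":
            continue  # an alanine cannot be scanned to alanine
        position = first_position + offset
        mutated = sequence[:offset] + "A" + sequence[offset + 1:]
        mutants.append((f"{residue}{position}A", mutated))
    return mutants


scan = alanine_scan(SIGNAL_REGION, FIRST_POSITION)
names = [name for name, _ in scan]
```

The resulting names run from K14A through M24A, the same 11 constructs listed in the legend to Fig. 2A.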

There Is a Strict Requirement for Tyrosine at Position 19, while the Isoleucine at Position 22 Can Be Replaced by Other Aliphatic Residues-To investigate the requirements of the signal for specific amino acid side chains at the two critical positions identified by the alanine scan, a variety of amino acids were tested for their ability to function at the positions of Tyr-19 and Ile-22, again in the R26t construct. The polarity of surface arrival of the resulting mutant proteins is illustrated in Fig. 3. As shown in Fig. 3A, we did not find another amino acid that could substitute for tyrosine at position 19. The low basal targeting activity of the serine 19 and valine 19 mutants rules out a simple requirement for a hydroxyl group or a large hydrophobic side chain. Additionally, the low targeting activity of the phenylalanine 19 mutant indicates that the tyrosine in this signal is essential, as it cannot be replaced even by a closely related aromatic residue. As reported for I22A above, each of these mutations at Tyr-19 was also tested in the full-length HHG protein. Y19A, Y19S, and Y19V had identical effects in R26t and HHG. Y19F had a slightly more severe effect in the R26t protein than it did in full-length HHG; however, the difference was small and may not be significant (data not shown).
Because of the apparent lack of sequence conservation around aromatic residues in cytoplasmic targeting signals, a wide variety of amino acids were tested for their ability to function at the position of isoleucine 22. However, as illustrated in Fig. 3B, in the basal targeting signal of VSV G there is a strong preference for an extended aliphatic side chain at this position, as only valine and leucine retain wild-type basal targeting activity. Phenylalanine, tyrosine, and cysteine function at a suboptimal level, while small, polar, or charged residues function poorly or not at all.

The Positively Charged Character of the Sequence Lys-14-Arg-16 Does Not Make a Significant Contribution to the Basal Targeting Activity Encoded in the G Tail-The presence of 3 contiguous positively charged residues, Lys-14, Lys-15, and Arg-16, located within the 11-residue sequence containing targeting information suggested the possibility that the highly charged character of this region might contribute to the basal targeting signal. Therefore all three of these residues were changed to alanine at once in the R26t protein. However, as illustrated in Fig. 2B, the effect of this triple mutation on basal targeting was very similar to the effect of mutating Arg-16 alone (see Fig. 2A), suggesting that the polarity of the triple mutant was largely due to the loss of Arg-16 and not to diminished positive charge in this region. This KKR triple mutant was also constructed in the full-length HHG protein and, as reported for mutations made at the position of tyrosine 19, had essentially the same effect in the R26t construct and the full-length HHG. One additional mutant in the background of R26t, R16L, gave an interesting result. The substitution of a leucine for the arginine creates a protein that is significantly more basal (86 ± 2% basal, data not shown) than one containing an alanine at this position (73% basal, Fig. 2A), and in fact the leucine seems to be almost as functional as the wild-type arginine (93% basal, Fig. 2).

The Basolateral Targeting Signal in the VSV G Cytoplasmic Tail Fits the General Motif of Y-X-X-aliphatic-Using deletion analysis, the basal targeting information encoded in the cytoplasmic domain of VSV G has now been shown to span residues Lys-14-Met-24, provided that a certain minimal separation between the beginning of the signal and the transmembrane domain is maintained. The last two residues of this region, 23 and 24, can be changed to alanine without affecting basal targeting. Deletion of these two residues by truncation down to residue 22, however, does have a minor effect on targeting, suggesting that a certain minimal sequence length is required for optimal signal function. Point mutants throughout the entire Lys-14-Met-24 sequence in the R26t protein suggest that critical residues of the signal include Tyr-19 and Ile-22, with Arg-16 making a minor but significant contribution. Additionally, the tyrosine is completely essential and cannot be replaced, while the isoleucine can be replaced equally well by either valine or leucine. Therefore the critical part of this signal seems to be represented by the sequence Y-X-X-aliphatic. The involvement of the tyrosine is fully consistent with earlier data, which indicated that this tyrosine is absolutely required for the tail of VSV G to retain its basal targeting activity in the wild-type VSV G protein (10).
Although the tyrosine and the isoleucine appear to be the most critical elements of the signal, Arg-16 does make a definite contribution as judged by the slight decrease in basal polarity of the R16A mutant. Interestingly, while the presence of an alanine significantly impairs basal targeting, a leucine is almost as functional as the wild-type arginine, suggesting that the function of the arginine is not solely to provide a positive charge, but that the size, shape, or hydrophobicity of the side chain is important. The possibility that the contribution of the positive charge is minor is supported by the observation that removal of all three positive charges upstream of the tyrosine in the KKR mutant has a relatively small effect on basal targeting. The critical elements of the basal targeting signal in the cytoplasmic domain of G defined by this work are summarized in Fig. 4.
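The Y-X-X-aliphatic representation can be expressed as a simple pattern search, sketched below. The sketch assumes that "aliphatic" means isoleucine, leucine, or valine, the three residues that gave wild-type targeting at position 22; the names ALIPHATIC, MOTIF, and find_motif are hypothetical. A sequence match of this kind does not predict targeting activity, since the contribution of Arg-16 and the partial function of other residues at position 22 are ignored.

```python
import re

ALIPHATIC = "ILV"  # assumption: residues with wild-type activity at position 22
# The lookahead lets overlapping candidate motifs all be reported.
MOTIF = re.compile(rf"(?=(Y..[{ALIPHATIC}]))")


def find_motif(sequence, first_position=1):
    """Return (position of the tyrosine, matched tetrapeptide) for each hit."""
    return [(first_position + m.start(), m.group(1))
            for m in MOTIF.finditer(sequence)]


# Residues 14-24 of the G tail and three of the point mutants discussed above.
wild_type = find_motif("KKRQIYTDIEM", first_position=14)
y19f = find_motif("KKRQIFTDIEM", first_position=14)
i22l = find_motif("KKRQIYTDLEM", first_position=14)
i22a = find_motif("KKRQIYTDAEM", first_position=14)
```

The wild-type sequence and I22L each give a single hit with the tyrosine at position 19, while Y19F and I22A give none, in line with the qualitative requirements described for these two positions.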
Although most of this mutational analysis was done in the HHG R26t construct, a selected set of point mutations, including KKR, I22A, and each Tyr-19 mutant, were also tested in the full-length HHG protein, which is 3 residues longer. Only the I22A mutant gave a significantly different result in the two backgrounds. It is possible that the longer tail is structurally more stable and allows the hydrophobic character of alanine to substitute for isoleucine. This substitution may also be tolerated in the full-length protein due to the presence of additional minor interactions between the three extra residues and the signal "receptor."

An Alignment of This Targeting Signal with Other Cytoplasmic Targeting Signals Suggests the Existence of a Broad Class of Targeting Motifs-As illustrated in Table I, the basolateral targeting signal in the cytoplasmic tail of VSV G has similarity to a large number of cytoplasmic targeting determinants with the more general motif of aromatic-X-X-large hydrophobic. In addition to basal targeting and coated pit localization, some of these sequences have been shown to be necessary for targeting to the lysosome, the endoplasmic reticulum, and the TGN. However, in at least four cases where sequences encode more than one type of targeting information, the overlapping signals can actually be distinguished by mutagenesis (see Table II). For example, mutation of certain residues in the aligned region of lysosomal acid phosphatase caused a differential effect on basal targeting and internalization, indicating that there are two distinct signals in this sequence (9). The basal targeting and internalization signals in lgp120 can be distinguished from the lysosomal targeting signal by mutation of the neighboring glycine (6), and the internalization signal in TGN 38 can be distinguished from the TGN targeting information by mutation of an arginine (36). We have now made a similar observation for the overlapping basolateral targeting and internalization signals in the cytoplasmic domain of the influenza hemagglutinin mutant HA Y543,F546.
One explanation for overlapping, but distinct, targeting motifs is that different signals share a common structure and rely on individual amino acid side chains to encode specificity. However, results for VSV G indicated that this is not necessarily the case for all basal targeting and internalization motifs. Mutation of Ile-22 to phenylalanine creates the sequence of an optimal internalization signal in the G tail based on the alignment in Table I, as well as on previous proposals for internalization signal consensus sequences (17-19, 37). However, neither this nor any of nine other point mutations tested caused more than a 2-fold increase in the normally low internalization rate of HHG (data not shown). This suggests that the structure of the G tail required for basolateral targeting is distinct from structures that mediate internalization.
There are several basal targeting sequences and many internalization signals that do not display the motif shown in Table I. Two basal targeting signals in the cytoplasmic domain of the low density lipoprotein receptor (13) and one in the polyimmunoglobulin receptor (20) each contain an important aromatic amino acid but cannot be aligned with the sequences illustrated. Additionally, there are a number of internalization signals that have an essential aromatic residue but have a polar or charged residue in place of the second large hydrophobic amino acid, as well as a subset of internalization signals that have similarity to the N-P-X-Y internalization signal of the low density lipoprotein receptor (38, 39). However, in vitro competition experiments have suggested that, in the case of the internalization signals, this last class of signal may compete with those presented in Table I for binding to the same cytoplasmic receptor molecules (40-42). Additionally, recent analysis of the internalization signal in the influenza hemagglutinin mutant HA Y543 (18) suggests that degenerate internalization signals can actually function quite well. Finally, there is evidence that multiple internalization signals in the same protein function at different efficiencies and act cooperatively (14, 38, 39). Therefore it seems likely that the receptors for cytoplasmic internalization signals are actually capable of binding a much more degenerate set of sequences than Table I would indicate and that other targeting receptors may share this ability to recognize a diverse set of primary sequences.
There are two distinct, although non-exclusive, mechanisms by which degenerate signals might be recognized and then bound by cytoplasmic receptors. In one mechanism, a signal would be recognized by its inherent structure and, in another, some feature of a flexible sequence of amino acids would initiate binding by a receptor and be productively bound only if it could acquire a particular "induced" structure upon binding. In the "inherent structure" mechanism, signal degeneracy would result from different primary sequences which form similar secondary structures, while in the "induced structure" mechanism, signal degeneracy would result from the tolerance of a variety of non-interfering amino acid side chains in the signal binding domain of a receptor molecule. Current data are insufficient to distinguish between these two mechanisms.

An alignment of the basolateral targeting signal in VSV G with other similar cytoplasmic targeting determinants
The sequences illustrated, each oriented amino- to carboxyl-terminal, represent the minimum sequence determined to be necessary for the type of targeting in each category (with the exception of RM, explained below). Shaded residues indicate those that have been determined to be critical for targeting activity by mutagenesis. In group 1, shaded residues are those determined to be involved in basolateral targeting; in group 2, for lysosomal targeting; in group 3, for the other targeting activity indicated; in group 4, for endocytosis. The outlined columns are to emphasize the residues apparently conserved throughout the various motifs and are not based on mutational analysis of the sequences. The references for the work on each signal are listed in the final column. Abbreviations are as follows: Gwt, vesicular stomatitis virus glycoprotein G; FcR, Fc receptor; hNGFR, a mutant human nerve growth factor receptor; ASGPR H1, the H1 subunit of the human asialoglycoprotein receptor; HA Y543,F546, a mutant influenza hemagglutinin; LAP, lysosomal acid phosphatase; lgp120, lysosomal glycoprotein 120 from rat (a member of the LAMP 1 family of lysosomal membrane proteins); LAMP 2, the human member of the LAMP 2 family (the canine LAMP 2 is known to be basal); CD3 γ and δ, two subunits of the T cell receptor complex; CD3 ε, a subunit of the T cell receptor; GLUT 4, glucose transporter 4; TGN 38/41, two similar proteins localized to the trans-Golgi network; FcRIII γ, the γ subunit of the type III Fc receptor; CI MPR, cation-independent mannose-6-phosphate receptor; CD MPR, cation-dependent mannose-6-phosphate receptor; RM, rat macrophage asialoglycoprotein receptor (reported data indicate only that the tyrosine is necessary for endocytosis, not what other portion of the cytoplasmic domain is required); pIgR, polymeric immunoglobulin receptor; HA+8, a mutant influenza hemagglutinin with an 8-amino acid extension; TfR, transferrin receptor; glyc. 106, a mutant glycophorin; EGFR, epidermal growth factor receptor.

Table II (column layout lost; surviving entries only). Basolateral targeting signals: Yes, lys, no effect, inhibited (6); Yes, TGN, no effect, inhibited (36); Yes, no effect (M. Roth, in prep.).

Does a Class of Related Cytoplasmic Adapter Molecules Mediate a Wide Variety of Protein Trafficking Events?-After the initial identification of the first cytoplasmic internalization motifs, it was proposed that cytoplasmic proteins, later called adapters, were binding to these sequences to promote the clustering of proteins in coated pits (43-45). It now seems likely that there are a number of adapter molecules, functioning at various stages of the exocytic and endocytic pathway, that recognize a variety of targeting motifs and whose function is to sort the cargo proteins into the proper trafficking vesicles at the proper time. To visualize how small sequence differences between motifs could generate specificity for different adapters, an analogy may be made to the binding of different classes of peptides by different major histocompatibility complex alleles, where specificity "pockets" in the otherwise similar binding grooves discriminate between different peptides by their ability to accommodate particular amino acid side chains (46).
In the case of the internalization signals, there is substantial evidence that implicates the AP2 "adaptins" of the plasma membrane clathrin coat as playing the role of the cytoplasmic receptor/adapter (40-42, 47, 48). The possibility that AP2 is the sorting adapter at the plasma membrane responsible for recruiting proteins into coated pits raises the possibility that the related AP1 adaptins of the Golgi clathrin coat might play a role in the recognition of the basolateral targeting motifs in the TGN. However, the AP1 adaptins do not appear to be able to bind cytoplasmic sequences that encode basal targeting information (40, 47). Given the relatedness of the sequences of a number of sorting signals, there must be families of related sorting adapters that are currently unidentified. To understand the molecular basis for intracellular protein traffic, it will be necessary to identify, purify, and clone a variety of these cytoplasmic targeting molecules.