Human E apoprotein heterogeneity. Cysteine-arginine interchanges in the amino acid sequence of the apo-E isoforms.

It has been postulated that 3 alleles at a single gene locus are responsible for the 3 major isoforms of the human E apoprotein (E-4, E-3, and E-2) and that the minor isoforms arise by post-translational glycosylation of the major isoforms (Zannis, V. I., and Breslow, J. L. (1981) Biochemistry 20, 1033-1041). We have found that the heterogeneity of the 3 major isoforms of the human E apoprotein is due to differences in primary structure and involves cysteine-arginine interchanges. Based on amino acid analyses, the E isoforms differ in the number of cysteine residues per mol of protein, i.e. E-4 has no cysteine, E-3 has 1 cysteine residue, and E-2 has 2 cysteine residues. The sites of substitution have been identified in 2 cyanogen bromide fragments and occur at positions 4 and 33 in the 17- and 89-residue fragments, respectively. Based on sequence and amino acid analysis, the E-4, E-3, and E-2 isoforms contain arginine/arginine, cysteine/arginine, and cysteine/cysteine at these sites, respectively. These cysteine-arginine interchanges are sufficient to account for the known charge differences between the E-4, E-3, and E-2 isoforms. Furthermore, the cysteine differences in the E isoforms have been confirmed in a large number of subjects using a rapid screening procedure involving charge modification of the cysteine residues of the E isoforms with cysteamine. This method provides a sensitive measurement of the cysteine content of the isoforms and, as predicted from the cysteine contents, the E-4, E-3, and E-2 isoforms are shifted 0, 1, and 2 positive charge units, respectively, on isoelectric focusing gels after cysteamine modification. These cysteine-arginine interchanges in the human E isoforms may directly affect the metabolic activity of the various isoforms and have profound metabolic consequences.

= 35,000-39,000 apoprotein is implicated in cholesterol transport in the plasma (11), is one of the proteins responsible for determining lipoprotein interaction with the apo-B, E receptors of fibroblasts (12, 13), and appears to be one of the key proteins mediating lipoprotein recognition and uptake by the liver (14-16). The heterogeneity of human apo-E has been recognized since its initial description, and, depending on the individual, 3 or 4 immunochemically related bands can be demonstrated by isoelectric focusing (17, 18). These bands are commonly designated as the E-1, E-2, E-3, and E-4 isoforms (E-4 being the most basic), and they focus with isoelectric points between pH 5.4 and pH 6.1. The major isoforms (E-2 versus E-3 versus E-4) differ from one another by a single unit of charge (19, 20). Only 1 pathological disorder, type III hyperlipoproteinemia (primary dysbetalipoproteinemia), is associated with a specific apo-E isoform pattern. Characteristically, the isoform patterns of affected subjects lack the E-4 band and either lack or exhibit marked deficiencies of the E-3 isoform (17, 18).
From two-dimensional electrophoretic studies of apo-E, Zannis and Breslow (19, 20) have suggested that human apo-E heterogeneity results from a combination of 2 independent alterations in the E apoprotein: (a) heritable alterations in the protein, resulting from 3 independent alleles acting at a single gene locus, and (b) a post-translational glycosylation of the major isoforms with the addition of 1 or more negatively charged sialic acid residues. They have proposed that each allele is responsible for one of the major isoforms, i.e. E-4, E-3, and E-2, and that in a given individual, the minor isoforms, including the E-1 isoform, arise as a result of post-translational glycosylation of one of the major isoforms. Thus, this model predicts 3 heterozygous and 3 homozygous conditions. These have now been identified in apo-E screening studies (19, 20). The isoform patterns associated with the heterozygous and homozygous states have been designated as α and β patterns, respectively, and have been subdivided according to the major isoforms present. The β-II, β-III, and β-IV designations refer to subjects that are homozygous for the E-4, E-3, and E-2 isoforms, respectively. Similarly, the designations α-II, α-III, and α-IV refer to heterozygous patterns resulting from the isoform pairs E-4/E-3, E-3/E-2, and E-4/E-2, respectively. In addition, Zannis and Breslow (19) have presented evidence that the minor isoforms present in an individual's apo-E pattern appear to be sialylated products arising from a post-translational modification of the single major isoform in the case of homozygotes or of the 2 major isoforms in the case of heterozygotes. It is the purpose of this paper to establish that a major difference among the 3 major isoforms is amino acid substitutions and that these substitutions can account for the isoform pattern seen by isoelectric focusing.
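The counting behind the 3-allele model can be illustrated with a short sketch. The Python below is not part of the original study; it simply enumerates the unordered allele pairs at a single locus and attaches the pattern designations quoted above. The names `ALLELES`, `PATTERNS`, `genotypes`, and `classify` are hypothetical and chosen for illustration only.

```python
from itertools import combinations_with_replacement

# The 3 alleles proposed by Zannis and Breslow, one per major isoform.
ALLELES = ["E-4", "E-3", "E-2"]

# Pattern designations as quoted in the text (beta = homozygous, alpha = heterozygous).
PATTERNS = {
    ("E-4", "E-4"): "β-II",
    ("E-3", "E-3"): "β-III",
    ("E-2", "E-2"): "β-IV",
    ("E-4", "E-3"): "α-II",
    ("E-3", "E-2"): "α-III",
    ("E-4", "E-2"): "α-IV",
}


def genotypes(alleles=ALLELES):
    """Return every unordered pair of alleles at a single locus."""
    return list(combinations_with_replacement(alleles, 2))


def classify(pair):
    """Label an allele pair as homozygous or heterozygous."""
    return "homozygous" if pair[0] == pair[1] else "heterozygous"


if __name__ == "__main__":
    for pair in genotypes():
        print("/".join(pair), classify(pair), PATTERNS[pair])
```

Three alleles give 3 + 3 = 6 unordered pairs, which is the count of homozygous and heterozygous conditions the model predicts.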

RESULTS
Based on the Zannis and Breslow data (20), which indicate that the biosynthesis of apo-E is under the control of 3 different alleles at a single gene locus, one might anticipate finding amino acid sequence differences among the 3 major isoforms of apo-E. To test this possibility, we compared the apo-E from an individual homozygous for E-3 with the apo-E from an individual homozygous for E-2 (a subject with type III hyperlipoproteinemia).
The one-dimensional isoelectric focusing gels of the apo-E from these subjects are shown in Fig. 1 and are positioned horizontally to allow direct comparison with the two-dimensional gels. As shown by the one-dimensional gel, the normal pattern contained the E-3 isoform as the major isoform (greater than 50% as determined by densitometric scanning). The two-dimensional pattern of this apo-E was identical with the β-III pattern described by Zannis and Breslow (19, 20). Thus, this subject was characterized as having the homozygous E-3 apo-E genotype. The one-dimensional isoelectric focusing pattern of the apo-E from the type III hyperlipoproteinemic subject revealed that the E-2 isoform was the major isoform present (Fig. 1). The two-dimensional gel indicated a β-IV pattern and confirmed that the subject was homozygous for the E-2 genotype.
The E-2 and E-3 isoforms were isolated by preparative isoelectric focusing from the E apoprotein of the subjects homozygous for E-2 and E-3, respectively. As shown in Table I, the amino acid compositions of the E-3 and E-2 isoforms were similar except for differences in 2 amino acid residues. When the results from multiple analyses of the E-2 and E-3 isoforms were examined closely, it was observed that the cysteine and arginine contents of these 2 isoforms differed significantly (greater than 1 standard deviation). Furthermore, when the compositional data of each isoform were expressed in terms of number of residues per mol, assuming 294 residues/mol, it was observed that the E-2 isoform contained 1 more residue of cysteine (a total of 2 cysteine residues) and 1 fewer residue of arginine than did the E-3 isoform (Table I). These results suggested that cysteine was substituted for arginine in the E-2 isoform. The existence of 2 cysteine residues in the E-2 isoform, in contrast with a single cysteine residue in the E-3 isoform, was confirmed by amino acid analysis of the E-2 isoform from 2 additional subjects homozygous for E-2. These 2 E-2 isoforms had cysteine contents of 2.1 residues (0.72 mol %) and 2.0 residues (0.68 mol %), respectively. To extend these observations, we prepared the cyanogen bromide fragments and characterized the cysteine-containing peptides. These studies were performed on the whole apo-E preparations from the E-2 and E-3 homozygotes, not the isolated isoforms. To justify this approach, it was important to demonstrate that the minor forms of the E isoforms did not have a different protein structure. Previously, we have presented data which indicate that there was a single polypeptide chain present in the E apoprotein from normal and type III subjects, as determined by NH2-terminal sequence analysis (21). The minor bands observed in the subjects homozygous for E-2 or E-3 were very likely glycosylated forms of the major isoform, as predicted by the observations of Zannis and Breslow (19, 20). Furthermore, we determined that the cysteine content of the E-1- and E-2-migrating bands obtained from the E-3 homozygote was 1.2 and 1.1 cysteine residues/mol, respectively, similar to the cysteine content of the E-3 isoform (1 cysteine residue/mol). Similarly, the E-1-migrating band from the E-2 homozygote had a cysteine content of 1.8 cysteine residues/mol, essentially the same as the cysteine content of the E-2 isoform (2 cysteine residues/mol). For these reasons, it was reasonable to proceed with studies of the unfractionated apo-E from subjects homozygous for the E-2 and E-3 isoforms.

E-3 isoform isolated from an individual homozygous for E-3 (14 determinations). This subject had an elevated triglyceride level and was classified as a type V hyperlipoproteinemic. E-2 isoform isolated from an individual homozygous for E-2 (17 determinations). The diagnosis of type III hyperlipoproteinemia was clearly established in this subject.
Assumes 294 amino acid residues/mol, not including tryptophan, so that 0.34 mol % = 1 residue.
Standard deviation does not overlap between E-2 and E-3 for these amino acids.
Cyanogen bromide digestion and isolation of the fragments from the apo-E of the E-2 homozygous subject having type III hyperlipoproteinemia revealed that 2 of the fragments each contained 1 residue of cysteine. The smaller of the cysteine-containing fragments consisted of 17 residues, while the larger fragment contained 89 residues. The amino acid analyses of these 2 peptides are shown in Table II. On the other hand, when the same 2 cyanogen bromide fragments were obtained from the apo-E of the subject homozygous for E-3, cysteine was absent (0.2 residue) in the large fragment, but a single residue was detected in the smaller peptide (Table II).
These observations were confirmed by complete sequence analysis of the 17-residue fragments and partial sequence analysis of the 89-residue fragments from the E-2 and E-3 apo-E. As shown in Table III, the 2 cysteine residues from the homozygous E-2 apo-E occurred at cycle 4 in the small fragment and at cycle 33 in the large cyanogen bromide fragment. Sequence determination of the large fragment from the E-3 apo-E indicated that the partial sequence was identical with that of the corresponding E-2 apo-E fragment except for cycle 33. Substituted for cysteine at this position in the E-3 apo-E was an arginine residue (Table III). The amino acid sequences of the small fragment from both preparations were identical, and both E-2 and E-3 apo-E contained a cysteine residue at cycle 4 (Table III).
The difficulties associated with carrying out sequence studies on a large number of apo-E preparations necessitated developing a relatively rapid screening method to extend our observation that E-2 and E-3 differed in cysteine content. It was essential that this screening method be capable of distinguishing between 1 and 2 residues of cysteine/mol of apo-E. To this end, we investigated the use of chemical modifications with specific cysteine reagents that would alter the charge of the protein. Cysteamine (β-mercaptoethylamine) proved most effective. By introducing an amino group on the cysteine

Amino acid analysis of cysteine-containing cyanogen bromide fragments from the E apoprotein of an E-3 homozygote and an E-2 homozygote
Column headings: Large fragment, Small fragment.
Expressed as residues per mole, assuming Ala = 1.0 (2 determinations).
Converted to homoserine and homoserine lactone, which were present in all of these hydrolysates.
Determined as cysteic acid.

E-3 GLU-ASP-VAL-CYS-GLY-ARG-LEU-VAL-GLN-TYR-ARG-GLY-GLU-VAL-GLN-
residue through disulfide bond formation, the net charge of the protein was increased by 1 positive charge/cysteine residue. Furthermore, these charge modifications were easily monitored by isoelectric focusing. The results of cysteamine treatment of the d < 1.02 lipoproteins from the previously discussed E-2 and E-3 homozygous subjects are presented in Fig. 2. As expected, the apo-E isoforms from the E-2 homozygote were shifted by 2 positive charge units. In a similar manner, the E-3 homozygous apo-E was shifted 1 positive charge unit. As shown in Fig. 2, the E isoforms from both subjects then focused in identical positions. Since the cysteine content derived by sequence analysis in these E-2 and E-3 homozygotes was equivalent to that derived by cysteamine treatment, the screening method was validated. The focusing positions of the noncysteine-containing C apoproteins were unaffected by cysteamine treatment, and they served as a reference point for comparing different samples. The d < 1.02 lipoproteins from 8 additional E-2 homozygous type III subjects (including those whose E-2 isoforms showed 2 cysteine residues by amino acid analysis) and from 26 E-3 homozygotes have been examined by the cysteamine treatment method. In all cases, the expected charge shifts occurred, i.e. the homozygous E-2 pattern shifted 2 positive charge units and the homozygous E-3 pattern shifted 1 positive charge unit (Fig. 3).
During our screening studies, we found 1 subject who was homozygous for the E-4 isoform. Cysteamine treatment of the d < 1.02 lipoproteins from this subject did not affect the isoelectric focusing position of the apo-E (Fig. 3). We have also investigated the effect of cysteamine treatment on 11 subjects heterozygous for the E-3 and E-4 isoforms. The heterozygous state was characterized by approximately equal intensities of the E-3 and E-4 isoforms on one-dimensional gels (Fig. 3). As with the E-4 homozygote, the mobility of the E-4 band in heterozygous subjects was not affected by cysteamine treatment. However, the E-3 band was shifted to the E-4 position, and the treated pattern then resembled that of the homozygous E-4 subject (Fig. 3). These results suggested that the E-4 isoform lacked cysteine. To verify this, we isolated the E-4 isoform from the apo-E of a heterozygous E-4/E-3 subject by preparative isoelectric focusing and determined the cysteine content by amino acid analysis. As shown in Table IV, the E-4 isoform did not contain significant amounts of cysteic acid after performic acid oxidation. With the exception of arginine and cysteine, the values for all amino acid residues of the isolated E-4 isoform were similar (±1 S.D.) to those of the E-3 isoform. It appeared that the E-4 isoform contained an additional arginine residue when compared with the E-3 isoform shown in Table I.
Glutathione (γ-glutamylcysteinylglycine) also caused charge shifts of the E isoforms. However, the extent of the reaction was not as complete as with cysteamine, particularly with the homozygous E-2 apo-E. In contrast with cysteamine, glutathione introduced 1 negative charge/cysteine residue and shifted the E patterns in the negative direction. In parallel with the cysteamine modifications, E-3 was shifted 1 charge unit, E-2 was shifted 2 charge units, and E-4 was not affected (data not shown). Both the cysteamine and glutathione charge modifications were reversible by treatment with β-mercaptoethanol, which restored the apo-E isoforms to their original focusing positions (data not shown).
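The residue counts used in these composition comparisons come from mol % values under the stated assumption of 294 residues/mol (so that 0.34 mol % corresponds to 1 residue). As a minimal sketch of that arithmetic, and not a reproduction of the authors' own calculations, the Python below converts between the two units and checks the cysteine values reported for the 2 additional E-2 preparations. The function names are hypothetical.

```python
# Assumption taken from the paper: 294 amino acid residues/mol, tryptophan excluded.
RESIDUES_PER_MOL = 294


def mol_percent_to_residues(mol_percent, total=RESIDUES_PER_MOL):
    """Convert a composition in mol % to residues per mol of protein."""
    return mol_percent / 100.0 * total


def residues_to_mol_percent(residues, total=RESIDUES_PER_MOL):
    """Convert a residue count per mol of protein to mol %."""
    return residues / total * 100.0


if __name__ == "__main__":
    # Cysteine contents reported for the 2 additional E-2 homozygotes.
    for mol_percent in (0.72, 0.68):
        residues = mol_percent_to_residues(mol_percent)
        print(f"{mol_percent:.2f} mol % -> {residues:.1f} residues/mol")
    # One residue expressed in mol %.
    print(f"1 residue -> {residues_to_mol_percent(1):.2f} mol %")
```

Under this assumption, 0.72 and 0.68 mol % round to 2.1 and 2.0 residues, matching the values quoted for the E-2 isoforms.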
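The charge-shift logic of the two reagents can be summarized in a short sketch. The Python below is illustrative only: it assumes complete modification of every cysteine (which the text notes was not achieved with glutathione), takes the cysteine counts 0, 1, and 2 for E-4, E-3, and E-2, and expresses focusing positions in charge units relative to untreated E-4. All names are hypothetical.

```python
# Cysteine residues per mol for each major isoform, as reported in this study.
CYSTEINES = {"E-4": 0, "E-3": 1, "E-2": 2}

# Charge introduced per modified cysteine residue.
REAGENT_CHARGE = {"cysteamine": +1, "glutathione": -1}

# Untreated positions in charge units relative to E-4 (the most basic isoform);
# adjacent major isoforms differ by a single unit of charge.
BASE_POSITION = {"E-4": 0, "E-3": -1, "E-2": -2}


def charge_shift(isoform, reagent):
    """Predicted shift in charge units, assuming every cysteine is modified."""
    return CYSTEINES[isoform] * REAGENT_CHARGE[reagent]


def position_after(isoform, reagent):
    """Predicted focusing position relative to untreated E-4 after treatment."""
    return BASE_POSITION[isoform] + charge_shift(isoform, reagent)


if __name__ == "__main__":
    for reagent in REAGENT_CHARGE:
        for isoform in CYSTEINES:
            print(reagent, isoform, charge_shift(isoform, reagent),
                  position_after(isoform, reagent))
```

With cysteamine, all three isoforms are predicted to focus at the E-4 position, consistent with the observation that treated E-2 and E-3 patterns focused in identical positions and that the E-3 band of heterozygotes moved to the E-4 position.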

DISCUSSION
In this study, we have examined the nature of the difference between the 3 major isoforms (E-4, E-3, and E-2) of the human E apoprotein. It has been shown that the major isoforms differ in primary structure and differ specifically in the content of cysteine and arginine. It appears that the E-4 isoform lacks cysteine, whereas the E-3 and E-2 isoforms contain 1 and 2 residues of cysteine, respectively. Partial sequence analysis of the E-2 and E-3 apo-E establishes the sites of amino acid substitution. A 17-residue cyanogen bromide fragment from both E-2 and E-3 contains a cysteine residue at position 4. In addition, the sequence of this fragment is identical for both E-2 and E-3. Partial sequence analysis of an 89-residue cyanogen bromide fragment reveals a cysteine residue at position 33 in the E-2 apo-E. Position 33 in this fragment from the E-3 apo-E does not reveal cysteine but instead the amino acid arginine.
It is noteworthy that the substitution of arginine for cysteine results in the addition of 1 positive charge to the net charge of E-3 relative to E-2 apo-E. Moreover, since the E-3 and E-2 isoforms are known to differ by a single charge (19, 20), this single substitution is sufficient to account for the charge difference between E-3 and E-2. Furthermore, amino acid analyses reveal that the E-4 isoform does not contain cysteine. Although the sequence data are not presently available for the E-4 isoform, it appears from comparing the amino acid composition of E-4 with that of the E-3 isoform that an arginyl residue may be substituted for cysteine. Such a substitution is sufficient to account for the charge difference between E-4 and E-3. Although the total sequence comparison between E-3 and E-2 apo-E is only 90% complete, results to date indicate that the cysteine-arginine substitution is the only difference between the 2 forms. Therefore, it is probable (but not yet proven) that these cysteine-arginine interchanges are the only differences in the polypeptide chains of E-4, E-3, and E-2.
These observations are consistent with the genetic data obtained by Zannis and Breslow (19,20) which indicate that 3 independent alleles are involved. The differences in the amino acid composition of the 3 apo-E types establish that there is genetic influence at the level of the gene encoding for apo-E. The calculated gene frequency of the 3 apo-E alleles indicates that the E-3 allele is by far the most common (20). This suggests that the E-3 isoform is the parent type of apo-E. Of the 6 codons specifying arginine, 2 differ from the cysteine codons by a single base. Thus, a single base change (point mutation) at 1 of 2 sites in the E-3 gene could account for the E-4 and E-2 apo-E. In the case of E-4, there could be a point mutation resulting in an arginine substitution for the single cysteine residue of E-3. In the case of E-2, the mutation would involve a different location and could result in the substitution of a cysteine residue for an arginine. This then yields 3 isoforms, E-2, E-3, and E-4, having Cys/Cys, Cys/Arg, and Arg/Arg, respectively, at the 2 sites of mutation.
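The codon argument can be checked directly against the standard genetic code. The following Python sketch, which is an illustration and not part of the original analysis, lists the 6 arginine codons and 2 cysteine codons (written as RNA) and finds the arginine/cysteine pairs that differ by a single base. The function names are hypothetical.

```python
# Standard genetic code, RNA alphabet.
ARG_CODONS = ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"]
CYS_CODONS = ["UGU", "UGC"]


def hamming(a, b):
    """Number of positions at which two equal-length codons differ."""
    return sum(x != y for x, y in zip(a, b))


def single_base_neighbors(arg_codons=ARG_CODONS, cys_codons=CYS_CODONS):
    """Return (arginine codon, cysteine codon) pairs that differ by one base."""
    return [
        (arg, cys)
        for arg in arg_codons
        for cys in cys_codons
        if hamming(arg, cys) == 1
    ]


if __name__ == "__main__":
    for arg, cys in single_base_neighbors():
        print(f"{arg} <-> {cys}")
```

Only CGU and CGC qualify, each differing from a cysteine codon (UGU and UGC, respectively) at the first base, which agrees with the statement that 2 of the 6 arginine codons are one point mutation away from cysteine.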
The amino acid compositions of the E-4, E-3, and E-2 isoforms of human apo-E have been reported to be similar to each other (22) and to contain approximately the same amount of cysteine (22, 23). Those results are at variance with the data presented here. One possible explanation involves the different methods used for isolating the isoforms and the possibility of contamination by cysteine-containing polypeptides. Havel et al. (22) and Utermann et al. (23) determined the cysteine content of apo-E isoforms prepared directly from very low density lipoproteins by isoelectric focusing. In our case, the whole apo-E was first isolated from the d < 1.02 lipoproteins by gel permeation chromatography, followed by isoform isolation by preparative isoelectric focusing. Use of the two-step isolation method gave very reproducible amino acid analyses from preparation to preparation. We also performed direct isolations of the isoforms from VLDL and observed that the amino acid composition of a given isoform was not as consistent as with the two-step isolation. The most notable variations occurred with aspartic acid, glutamic acid, isoleucine, phenylalanine, histidine, arginine, and cysteine. The variation appeared to be more extreme in subjects with low concentrations of apo-E in their d < 1.02 lipoproteins, possibly indicating a larger relative contribution by some contaminating material.
From the standpoint of metabolic function of the E apoprotein, these substitutions may have a profound effect. It has been suggested that the hyperlipidemia in type III (E-2 homozygous) subjects arises as a result of a defective remnant removal mechanism (24). The E-2 apo-E of type III hyperlipoproteinemic subjects has been postulated to be responsible for the defect since the E-2 isoform from type III subjects is not taken up by perfused rat livers (24) and since the type III apo-E exhibits a reduced fractional catabolic rate when injected into both normal and type III subjects (25). It is reasonable to speculate that the substitution of cysteine for arginine in the E-2 apo-E may be a contributing factor. Arginyl residues are involved in the interaction of E-containing lipoproteins with both peripheral and liver cell receptors (26), and the substitution of cysteine for arginine may directly or indirectly perturb the recognition and binding of E-containing lipoproteins by liver receptors. This hypothesis is currently under investigation in our laboratory.
Sequence of the small CNBr fragment from the E apoprotein