Positions in human serum albumin which involve the indole binding site. Sequence of 107-residue fragment.

The first 107 residues of Fragment C of human serum albumin have been sequenced and two positions at which affinity labels block the indole site determined. Histidine 23 is the position of blockage by bromoacetyl-L-tryptophan and lysine 67 is the position of blockage by 5-dimethylaminonaphthalene-1-sulfonyl chloride and probably pyridoxal-5'-phosphate. The presence of an indole ligand at the binding site markedly reduces incorporation of the label into the above lysyl residue, and in the case of 5-dimethylaminonaphthalene-1-sulfonyl chloride, increases incorporation into three other positions, lysine residues 13, 39, and 84. It is concluded that binding of the indole ligand on the site brings about conformational changes in the albumin structure exposing new reactive positions for 5-dimethylaminonaphthalene-1-sulfonyl chloride. There is a large accumulation of basic and hydrophobic residues and no glycine, serine, threonine, valine, aspartate, or cysteine residues in the sequence 10 to 43. Lysine 71 has been identified by amino acid analyses and sequence studies as the position acetylated by acetylsalicylic acid (Hawkins, D. R., Pinckard, N., Crawford, C. P., and Farr, R. S. J. Clin. Invest. (1969) 48, 536), establishing the structural relationships of two major ligand binding sites on albumin. The lone tryptophan is at position 86. Evidence indicates that within residues 1 to 86 of Fragment C and within residues of the A-Phe fragment (Mr equals approximately 10,000), the latter known to be adjacent to Fragment C in the whole albumin structure, exists the major binding sites of all ligands for human serum albumin.


SUMMARY
The first 107 residues of Fragment C of human serum albumin have been sequenced and two positions at which affinity labels block the indole site determined.
Histidine 23 is the position of blockage by bromoacetyl-L-tryptophan and lysine 67 is the position of blockage by 5-dimethylaminonaphthalene-1-sulfonyl chloride and probably pyridoxal-5'-phosphate. The presence of an indole ligand at the binding site markedly reduces incorporation of the label into the above lysyl residue, and in the case of 5-dimethylaminonaphthalene-1-sulfonyl chloride, increases incorporation into three other positions, lysine residues 13, 39, and 84. It is concluded that binding of the indole ligand on the site brings about conformational changes in the albumin structure exposing new reactive positions for 5-dimethylaminonaphthalene-1-sulfonyl chloride.
There is a large accumulation of basic and hydrophobic residues and no glycine, serine, threonine, valine, aspartate, or cysteine residues in the sequence 10 to 43. Lysine 71 has been identified by amino acid analyses and sequence studies as the position acetylated by acetylsalicylic acid (Hawkins, D. R., Pinckard, N., Crawford, C. P., and Farr, R. S. J. Clin. Invest. (1969) 48, 536), establishing the structural relationships of two major ligand binding sites on albumin. The lone tryptophan is at position 86. Evidence indicates that within residues 1 to 86 of Fragment C and within residues of the A-Phe fragment (Mr = ~10,000), the latter known to be adjacent to Fragment C in the whole albumin structure, exists the major binding sites of all ligands for human serum albumin.
* This investigation was supported by National Science Foundation Research Grant GB-7224 and by a grant from the James H. Cummings Foundation.
An initial report was presented at the 57th Annual Meeting of the Federation of American Societies for Experimental Biology, April, 1973. Part of the data are taken from a thesis submitted by K. K. Gambhir for his Ph.D. degree.
‡ Present address, Department of Obstetrics, Gynecology and Medicine, Howard University, Washington, D. C. 20001.
§ From whom reprints should be requested.
It was found earlier that reacting human serum albumin with the affinity labels bromoacetyl-L-tryptophan, dansyl chloride, or pyridoxal 5'-phosphate led to a blockage of the binding of acetyl-L-tryptophan (1). It was further found that the greater part of each label was incorporated into Fragment C (Mr = ~18,000). Thus, at conditions of 1:1 molar ratio of labeling agent to albumin, 65 to 95% of the label was incorporated into Fragment C, accounting for essentially an equivalent stoichiometric blocking of the binding site when bromoacetyl-L-tryptophan was used as a reagent, and 35 to 50% of the equivalent stoichiometric blocking when dansyl chloride or pyridoxal 5'-phosphate were used as the labeling reagents. If labeling was conducted in the presence of indolepropionate, a ligand strongly bound by albumin, the indole site was largely protected. Bromoacetyl-L-tryptophan reacted with imidazole groups, dansyl chloride reacted with ε-amino lysyl(s) and tyrosyl-OH group(s), and pyridoxal 5'-phosphate reacted with ε-amino lysyl group(s).
The indole binding site on albumin is unusual in several aspects. First, the above described labeling agents which block the indole site all have diverse structural features. The site, thus, is apparently readily adaptable to different types of ligands. A further example of this adaptability is shown with thyroxine, a compound much different from the indole compounds in shape and structural groups, yet, as shown by Tritsch (2), binding competitively with L-tryptophan.
Another unusual feature of the site is that fatty acids at low concentrations inhibit the indole site in the alkaline pH region, but have little effect on indole binding in the neutral and acid pH regions (3). Fatty acid inhibition at low concentration is not directly competitive with indole binding. In addition to providing a broad accommodation to different types of ligand, there is also evidence that occupation of the site by an indole compound provides considerable protection to the albumin. Thus, the reaction of trypsin with albumin is much retarded when albumin is associated with L-tryptophan (4). Also, acetyl-L-tryptophan has, for many years, been known to stabilize albumin against thermal denaturation.
Finally, it has recently been postu-
phenylalanine, lysine, glutamate, serine; and valine, alanine, lysine, threonine, glutamine. The standards, applied at 2.5 nmol, were alternated with unknowns in applications across the plate. Repetitive aliquots (~35 µl) of unknowns were applied at the same spot until the fluorescent intensity reached approximately that of the standards. The amount applied was recorded. Eight unknowns were usually run on each gel plate. Ascending development was carried out in Solvent V (heptane 58 ml, propionic acid 17 ml, and ethylene chloride 25 ml) of Jeppsson and Sjoquist (11). The dried chromatograms were first viewed under ultraviolet light to identify as many zones as possible, then sprayed with a 1-butanol solution containing 0.5 g/100 ml of ninhydrin, followed by heating of the plates in an oven at 110° for 10 min. This latter treatment assisted further identification of some of the zones by color differences (12). Overlaps were easily followed by this method. Sequencing experiments were terminated when the background became too high to recognize the unknown zones. When leucine, isoleucine, and sometimes valine and proline were indicated, their identification was confirmed by gas chromatography using the method of Pisano et al. (13) as indicated in the Sequencer Operator Manual. PTH derivatives of arginine, histidine, and cysteic acid, and sometimes lysine (when 4-sulfophenylisothiocyanate was used), were usually identified by amino acid analysis after hydrolysis with 58% HI in sealed, evacuated tubes at 140° for 24 hours. In some instances, arginine was identi-

RESULTS
Peptide Separations-The elution profile of trypsin-digested, maleylated carboxymethylated Fragment C is shown in Fig. 1. Zones a, b, and c were found to contain single peptides with NH2-terminal residues Phe, Tyr, and Cys. Amino acid analyses are given in Table I. Zone b appeared to be a mixture of two fragments; however, fractions taken from the leading and trailing shoulders had identical NH2-terminal groups, and upon hydrolysis, had identical amino acid contents. The reason for the evident splitting of this zone was not clear. The relative amounts in the split zones also varied; in some instances, the leading peak was much reduced. Zone d was freeze-dried, demaleylated, and chromatographed on the peptide resin (Fig. 2). Amino acid analyses of Zones d1, d2, and d3 are reported in Table I. The other zones in this elution contained multiple peptides at low concentrations and were not further considered.
The elution profile of trypsin-digested, maleylated S-β-(4-pyridyl)ethylated Fragment C is given in Fig. 3. When compared with Fig. 1, a new zone, Zone e, appeared between Zones c and d. This zone was further purified by concentration and passing a second time over Sephadex G-100, or by demaleylating and chromatographing on the peptide resin (Fig. 4). Comparison of the amino acid composition of the peptides in Zones b and e from the S-β-(4-pyridyl)ethylated Fragment C digest with the peptide from Zone b of the S-carboxymethylated Fragment C digest indicated that the latter was equal to the sum of the former two (Table II). Splitting the peptide b at its internal arginine apparently is inhibited by S-carboxymethylation at cysteinyl residues. Splitting readily occurred with S-β-(4-pyridyl)ethylated and performic acid-oxidized Fragment C preparations.
The fragments isolated accounted for all of the eight arginines in Fragment C. Peptide a, which contained a homoserine residue, and was, thus, the COOH-terminal peptide, contained an internal arginine which did not cleave with trypsin under any conditions attempted.
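The bookkeeping behind "accounted for all of the arginines" can be sketched in a few lines of Python. This is a minimal illustration only, assuming that maleylation blocks the lysyl side chains so that trypsin cleaves only on the COOH-terminal side of arginine. The function name `cleave_after_arginine`, the `resistant` parameter, and the demo sequence are hypothetical; the demo string is not the Fragment C sequence.

```python
def cleave_after_arginine(sequence, resistant=()):
    """Split a one-letter sequence on the COOH-terminal side of each arginine (R).

    `resistant` holds 1-based positions of arginines that are left uncleaved,
    mimicking an internal arginine that resists trypsin.
    Returns a list of (first_residue, last_residue, peptide) tuples.
    """
    peptides = []
    start = 0  # 0-based index where the current peptide begins
    for index, residue in enumerate(sequence):
        position = index + 1  # 1-based residue number
        is_cleavable = residue == "R" and position not in resistant
        if is_cleavable and position < len(sequence):
            peptides.append((start + 1, position, sequence[start:position]))
            start = position
    # The remaining stretch is the COOH-terminal peptide.
    peptides.append((start + 1, len(sequence), sequence[start:]))
    return peptides


# Hypothetical sequence for illustration only.
demo = "AKRFYKRCLRGM"
for first, last, peptide in cleave_after_arginine(demo):
    print(first, last, peptide)
```

With n cleavable arginines the digest yields n + 1 peptides, and marking one arginine as resistant (for example `resistant={7}` in the demo) merges the two peptides that flank it, which is the pattern described for the internal arginine of peptide a.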
The zone of the void volume from the elution of chymotrypsin-digested, maleylated, carboxymethylated Fragment C (column conditions the same as described for Fig. 1) was demaleylated and chromatographed on the peptide resin (column and gradient conditions the same as described for Fig. 4). The amino acid composition and eluting fractions for some of the peptides isolated are given in Table III. The amino acid composition and eluting fractions for two peptides isolated from an α-protease digestion of peptide b using the same peptide resin conditions are reported in Table IV.
Sequences-The results of sequencing experiments are given in Fig. 5. Intact Fragment C (500 nmol) was subjected to 42 steps of automated Edman degradation. Several of the mixed residues were later obtained from chymotryptic or α-protease-cleaved peptides (Tables III and IV). Thus, residue 19 was determined to be isoleucine from the composition of peptide 30; residue 30 was determined to be glutamate from the composition of peptide 23; and residue 36 was determined to be lysine by isolation of peptide 31. Automatic sequencing of peptide b (300 nmol) provided residues 38 to 63, and an overlap with the sequence obtained with the intact Fragment C. No PTH product was obtained for residue 54. From consideration of the amino acid composition in peptide b, serine was assigned to this position. There was a large drop in the yield at aspartate 63. α-Protease digestion of the b fragment produced a peptide with the composition aspartate, glycine, serine, alanine, glutamate, lysine, and arginine; and another with the amino acid composition glutamate, lysine, and arginine (Table IV). Neither of these peptides would sequence by the Edman procedure, presumably because blocking of the NH2-terminal residues had occurred in their isolation. Carboxypeptidase B reacted with b, producing arginine. Carboxypeptidase A subsequently released glutamine. A mixture of carboxypeptidases A and B released arginine, glutamine, lysine, and alanine with a slow release of serine. On this basis, the order of assigned residues 64 to 69 seems reasonably correct. No peptide was isolated to establish evidence of overlap between residues 69 and 70. However, all other overlaps of arginine-cleaved peptides were accounted for, leading one to conclude that this assignment is correct. Peptides e, d1, and d2 were sequenced manually. Negative results were obtained for residue 74, and serine was assigned by difference. A chymotryptic peptide containing glycine, glutamate, arginine, alanine, and phenylalanine was isolated (Table III), which, by manual sequence, was shown to be that of residues 79 to 83, confirming the overlap of this region. The sequence of N-bromosuccinimide-reacted Fragment C (500 nmol) provided an overlap for peptides d1 and d2 as well as for peptide a. The amino acid analysis of peptide d3 is consistent with that of residues 22 to 37. Peptide a (500 nmol) sequenced poorly. The yield after the first residue dropped markedly, presumably due to the presence of proline at the second position. Table V provides a summary of residue identification.
Location of Affinity Labels in Fragment C-Combined tryptic and chymotryptic digestion of acetyl-L-tryptophan-albumin (labeled at 1:1 molar ratio of bromoacetyl-L-tryptophan to albumin) produced peptide 1 (Fig. 6) with the composition on hydrolysis of arginine, 3-carboxymethylhistidine, proline, and tyrosine. When compared with the total 3H label in albumin, 60% or more could be accounted for in this peptide zone. No carboxymethylhistidine (or Nε-carboxymethyllysine for that matter) was found on hydrolysis of any of the other zones. Upon sequencing, the following residues were obtained for positions 1 to 4: Arg-X-Pro-Tyr, where X marks the second position.
The acetyltryptophan-labeled histidine was assigned to the second residue. On comparison with the sequence in Fig. 4, this residue was found to be histidine 23. This position was further confirmed when the label, after maleylation and tryptic digestion of Fragment C, was found to be located in Zone d. The composition of the peptides in the latter permits the label to be present only at residue 23. Chromatography of the tryptic digest of performic acid-oxidized, maleylated, dansyl-Fragment C (albumin dansylated at a molar ratio of 1:1) and a similar treatment of pyridoxal-P-Fragment C (albumin reacted with pyridoxal 5'-phosphate at a molar ratio of 1:1) produced the results in Fig. 7. These labels were clearly shown to be concentrated in peptide b. The dansyl content was estimated as 0.35 mol/mol of peptide b and the pyridoxal 5'-phosphate content as 0.25 mol/mol of peptide b. Amino acid analyses demonstrated that the labels were present only in Nε-lysyl derivatives.
Reduced S-β-(4-pyridyl)ethylated Fragment C from albumin after labeling with dansyl chloride in the presence of indolepropionate was also maleylated and reacted with trypsin. The profile obtained was similar to that in Fig. 3. In this instance, the b zone was clearly much less labeled than when labeling was conducted in the absence of indolepropionate (for comparative purposes these data are also indicated in Fig. 7). Furthermore, the label increased considerably in Zones c and d when indolepropionate was present, but was unchanged in Zone a.
When albumin was labeled with pyridoxal 5'-phosphate in the presence of indolepropionate, the amount of label incorporated into Fragment b was approximately one-half that incorporated in the absence of indolepropionate. The behavior was similar to dansyl chloride in this respect. From analogy with the dansyl-labeling studies, and from the fact that pyridoxal 5'-phosphate was earlier found to react with a lysyl residue in albumin which inhibited acetyl-L-tryptophan binding, it is most likely that pyridoxal 5'-phosphate also reacts at lysine 67 in the same manner as dansyl chloride.

The specific residues to which the dansyl labels were attached were identified when the labeling was conducted both in the presence and absence of indolepropionate. Trypsin digestion of peptide b (labeled with dansyl chloride 1:1 in the absence of indolepropionate) followed by descending chromatography in 1-butanol/glacial acetic acid/water, 4/1/5, v/v, for 44 hours produced one strong fluorescent zone (~15 cm from origin) which on amino acid analysis was found to contain residues 59 to 69. This identified the major labeling position as lysine 67. Other dansylated zones on the chromatogram were of much lower intensity. In a similar experiment with peptide b isolated from albumin labeled with dansyl chloride in the presence of indolepropionate, two dansyl zones of moderate intensity were present, one at lysine 67, identified as described above, and another which could not be identified by elution from the paper chromatogram. α-Protease digestion of peptide b followed by resin column chromatography provided three zones containing dansyl residues. On amino acid analysis, they corresponded to residues 39 to 40, 63 to 68, and 63 to 67. Both lysine 39 and 67 were, therefore, dansylated when labeling was conducted in the presence of indolepropionate. Furthermore, lysine 39 was found to be dansylated only in the presence of indolepropionate. In view of the difficulties in elution of dansyl zones from the chromatogram, no quantitation was attempted. However, one should note that the total dansyl label in peptide b was approximately twice as high when labeled in the absence of indolepropionate, compared to when indolepropionate was present. Labeling at residue 67, therefore, must be much reduced by the presence of indolepropionate occupying the indole binding site.
Trypsin digestion of peptide c followed by chromatography and electrophoresis (pH 6.0) provided separation of a large peptide containing residues 1 to 14 which strongly fluoresced. Since trypsin would not be expected to cleave at Nε-dansyl-lysyl residues, the labeling position in this peptide was assigned to lysine 13. When labeling was conducted in the presence of indolepropionate, labeling again was found in residue 13 which, on the basis of increased dansylation indicated in peptide c under these dansylation conditions (Fig. 7), led to the conclusion that labeling at lysine 13 was increased approximately 2-fold by the presence of indolepropionate at the binding site.
Zone d obtained from Fragment C isolated from albumin dansylated in the presence of indolepropionate was also analyzed for peptides with dansyl labels. Two dansyl zones, one very strong, and one of moderate fluorescent strength, appeared on a chromatogram developed 24 hours in the 1-butanol/glacial acetic acid/water solvent. The strongly fluorescent zone contained residues 82 to 90; the other weaker zone had an amino acid composition which did not correspond to any known peptide or amino acid sequence found in Fragment C (in nmol, lysine 6.6, proline 13.4, alanine 22, and phenylalanine 12). The source of this peptide was not further investigated at this time.

DISCUSSION
The indole site is inhibited by labeling at two positions in Fragment C, histidine 23 and lysine 67. The peptide chain undoubtedly folds back on itself in this region in development of the binding site. It had earlier been found that a tyrosyl OH group in another major albumin fragment (Fragment A, Mr = ~34,000) was also involved in the indole binding site. More recent evidence indicates that this label is located in A-Phe, a subfragment of Fragment A with 92 residues. Thus, these three positions on albumin, although much removed from each other in the peptide sequence, have been identified as within or in close proximity of the indole binding site. One perhaps cannot rule out an effect induced by labeling at some distance from the site. Since, however, blockage at the site had earlier been shown to be noncompetitive (only n, the amount of free site, was affected, not the binding constant itself), a distance effect such as may occur in allosteric regulation seems unlikely.
The position of residue 23 in Fragment C is adjacent to 2 arginine residues, in support of other evidence that arginine is involved in major anionic binding sites on serum albumin (17-20). There is a clustering of types of amino acids in regions of Fragment C. Thus, 5 out of 6 tyrosines, 5 out of 8 phenylalanines, 4 leucines, and 1 isoleucine fall within residues 10 to 43. Serine, glycine, threonine, aspartate, valine, and cysteine are missing in this region. There are 2 proline residues near residue 23, suggesting the absence of helical structure in this part of the site. In the sequence 10 to 43, there are 8 basic residues and only 2 acidic residues. In this segment, amino acids predominate with hydrophobic and basic side chains, which, if proper conformation exists, should provide good binding sites for anions with hydrophobic components. The latter is a property albumin strongly possesses.
Skatole was found to bind to albumin only in the presence of anions which also bind (21). Presumably, the high positively charged density of the site must be neutralized by an anion such as chloride or thiocyanate before it can accept the hydrophobic compound, skatole. The 2 positively charged arginines adjacent to the site, which would strongly favor anionic binding, are consistent with this concept. Interestingly, acetyl-L-tryptophan, which has its own anionic charge, binds strongest to albumin in the absence of chloride or thiocyanate. One is led to speculate that there may be carboxyl groups on the A-Phe fragment (the latter we know is adjacent to the indole site in Fragment C) which bind to the positively charged groups in the 10 to 43 residue region of Fragment C, furnishing the forces for forming the site between the A-Phe and C fragments. In the affinity labeling experiments, no reactive lysine groups were evident in the A-Phe fragment; rather, they were overwhelmingly present in the C fragment. The C fragment, on the other hand, contained a number of tyrosines which, however, were generally unreactive with dansyl chloride. The indole site is destroyed in the N-F transition (22), concurrent with the titration of a large number of carboxyl groups. Removal of the charges on Fragment A-Phe would, thus, remove the electrostatic attractions holding it to Fragment C.
In the vicinity of residue 67, no such dominance of hydrophobic or basic residues exists. There are 6 acidic and 6 basic residues, 1 phenylalanine, 6 leucines, and no tyrosine, isoleucine, or valine between residues 43 and 81. This area has a number of amino acids with hydrophilic side chains. In addition to the acidic and basic amino acids, there are 3 serines, 2 glycines, 3 half-cystines, and 6 alanines. Unless the folding of the chain fortuitously positions certain residues close to each other, this region would not appear capable of providing hydrophobic forces for binding ligands. This region probably makes up a third side of the binding site, providing mostly positions for accommodating the anionic or hydrophilic segments of the ligands. Tryptophan is located at position 86. Others have noted that it is near major binding sites on albumin. There is no evidence of involvement of residues 87 to 159 of Fragment C in the indole site. Under all conditions studied, this part of Fragment C was labeled only weakly with dansyl chloride and pyridoxal phosphate and not at all with bromoacetyl-L-tryptophan.
Lysine 71 is most likely the residue at which the acetyl group derived from acetylsalicylic acid is attached. Hawkins et al. (23, 24) isolated a fragment with the acetyl group attached, containing 2 leucines, 2 lysines, serine, glutamate, and alanine. In the absence of the acetyl group, this fragment was cleaved into two subfragments, with amino acid compositions leucine, lysine; and serine, glutamate, alanine, leucine, and lysine. The leucine-lysine peptide is apparently the only one in the tryptic digest of albumin. Except for cysteine, which, it appears, was not assayed for by the authors, these residues correspond with our own residues 70 to 76.
The presence of an indole ligand on the indole site makes available other ε-amino groups in the general site region for reaction with dansyl chloride. This is seen at lysines 13, 39, and 84 (Fig. 7), where a large increase in labeling occurred when indolepropionate was present. This latter is also consistent with the previous evidence that the binding of indolepropionate enhanced association of L-tryptophan at secondary sites (25). Presumably, indole ligands binding at the primary site induce a rearrangement of the albumin structure exposing hydrophobic and positively charged groups for the accommodation of further ligands. A very small entropy change was found on the association of skatole, a predominantly hydrophobic molecule, with albumin (26); to explain such a small value, one must allow for an alteration in the structure of albumin, which would counter the entropy change caused by the removal of skatole from the aqueous solution. This thermodynamically anticipated structural change would also be consistent with the exposure of additional hydrophobic areas, presumably to become secondary binding sites for further ligands.
Finally, evidence suggests that the region consisting of residues 1 to 86 of the C fragment and certain residues of the A-Phe fragment is the region for all major binding sites of ligands on albumin. The indole binding site is in this region, and it is seen to accommodate a variety of ligands. In unreported experiments, Mr. Douglas Karrel found that the primary binding site of bilirubin was most likely in the A-Phe fragment, probably adjacent to the C fragment. Salicylic acid binding, as described above, is taken to involve lysine 71 of the C fragment. Farr's group (27) showed that the binding of 3-acetamido-2,4,6-triiodobenzoate (acetrizoate) was enhanced when the ε-amino group present at residue 71 was acetylated. This latter observation implied that this region of the peptide involves the association site of acetrizoate. Chignell and Starkweather (28) found that the binding of phenylbutazone was similarly enhanced on the acetylation of albumin with acetylsalicylic acid. Furthermore, Pinckard et al. (24) found that a number of drugs, dyes, and other compounds, including bilirubin, inhibited acetrizoate binding. Tryptophan did not inhibit acetrizoate binding, nor did bilirubin when added to albumin at a 1:1 molar ratio inhibit tryptophan binding. In some unreported experiments, the primary site for fatty acid binding is indicated to be in the A-Phe fragment. The clustering of major binding sites in this limited area on albumin where interactions between associations are evident may be important in physiological regulation, such as, for example, plasma fatty acid effects on tryptophan binding leading to changes in serotonin levels in the brain (5).
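The residue tallies quoted in the Discussion (for example, basic versus acidic residues within residues 10 to 43, or the residue types absent from that stretch) are simple window counts over the sequence. The Python sketch below shows one way such a tally could be reproduced. It is illustrative only: the function names, the class groupings in `CLASSES` (including the choice to count histidine as basic), and the demo string are assumptions, and the demo string is not the Fragment C sequence, which is given in Fig. 5 and is not reproduced here.

```python
from collections import Counter

# Assumed groupings in one-letter code; the paper does not list its own classes.
CLASSES = {
    "basic": "KRH",
    "acidic": "DE",
    "hydrophobic": "FYLIW",
}


def window_counts(sequence, first, last):
    """Count each residue in the 1-based, inclusive window first..last."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("window lies outside the sequence")
    return Counter(sequence[first - 1:last])


def class_totals(counts, classes=CLASSES):
    """Sum the per-residue counts into the residue classes defined above."""
    return {name: sum(counts[r] for r in members) for name, members in classes.items()}


def absent_residues(counts, residues):
    """Return, sorted, the residues from `residues` that never occur in the window."""
    return sorted(r for r in residues if counts[r] == 0)


# Hypothetical 12-residue sequence for illustration only.
demo = "GSKRYFLKHEAD"
counts = window_counts(demo, 3, 9)       # residues 3 to 9: KRYFLKH
print(class_totals(counts))
print(absent_residues(counts, "GSTVDC"))
```

Applied to the real sequence with the window set to residues 10 to 43, the same three calls would give the class totals and the list of missing residue types; whether they match the published figures depends on the class definitions, which are assumed here.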