A 113-Amino Acid Fragment of CD4 Produced in Escherichia coli Blocks Human Immunodeficiency Virus-induced Cell Fusion*

A gene encoding a 113-amino acid, NH2-terminal fragment of CD4, rsT4.113, was constructed and expressed in Escherichia coli under the control of the tryptophan operon promoter. Following induction, rsT4.113 is produced at 5-10% of total E. coli protein, and it is found in inclusion bodies. The protein is purified in two steps under denaturing and reducing conditions. Solubilized rsT4.113 is first purified on a column of Q-Sepharose to remove low molecular weight contaminants and then purified to greater than 95% homogeneity by gel filtration. Renaturation of rsT4.113 is achieved at approximately 20% yield by dilution and dialysis. High performance liquid chromatography analysis of renatured rsT4.113 reveals a less than 15% contaminant of reduced protein. Purified and renatured rsT4.113 contains epitopes for both OKT4a and Leu3a, anti-CD4 monoclonal antibodies which block CD4-gp120 association, but lacks measurable affinity toward a nonblocking anti-CD4 monoclonal antibody, OKT4. By comparison to a longer form (375 amino acids) of recombinant soluble T4 produced in mammalian cells that contains the entire extracellular domain, rsT4.113 has a comparable affinity for binding to OKT4a and Leu3a in a radioimmunoassay. Analysis of antiviral activity of rsT4.113 demonstrates that the E. coli-derived protein inhibits human immunodeficiency virus-induced syncytium formation with an IC50 of 5-10 micrograms/ml. These data demonstrate that the human immunodeficiency virus-binding domain of CD4 is localized within the NH2-terminal 113 amino acids of CD4 and is contained within a structure homologous to the kappa variable-like domain of immunoglobulins.

* This work was supported in part by National Institute of Allergy and Infectious Diseases Grant AI25662-01 (to R.A.F. and R.T.S.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
To whom correspondence and reprint requests should be addressed.
The abbreviations used are: HIV, human immunodeficiency virus; rsT4, recombinant soluble T4; rsT4.113, a 113-amino acid fragment of rsT4 produced in E. coli; rsT4.2 and rsT4.3, 372- and 375-amino acid fragments, respectively, of rsT4 expressed in Chinese hamster ovary cells; HPLC, high performance liquid chromatography; SDS, sodium dodecyl sulfate; PAGE, polyacrylamide gel electrophoresis; PBS, phosphate-buffered saline; PTH, phenylthiohydantoin.
... acquired immunodeficiency syndrome. The structure of CD4 deduced from cloning the human cDNA (6) contains 435 amino acids. Amino-terminal sequence analysis of CD4 from a human macrophage cell line (U937) revealed that the mature NH2 terminus is at position 3 of the inferred sequence, and that Asn-3 is, in fact, a lysine (7). The structural characteristics of the polypeptide have been proposed to include an NH2-terminal extracellular domain of 370 amino acids, a short transmembrane segment marked by the predominance of lipophilic amino acids, and a short cytoplasmic domain of 38 amino acids. The extracellular domain of CD4 contains 6 half-cysteinyl residues in a 1-2, 3-4, and 5-6 disulfide pattern (8) inferred by homology to the sheep CD4 primary structure (9) and two potential sites of N-linked glycosidic attachment. The pattern of disulfide cross-linking indicates the potential for domain structure in the extracellular polypeptide segment, and this is further substantiated by sequence homology between the NH2-terminal 100 amino acids of CD4 and the κ light chain regions of immunoglobulins (6). A more detailed consideration of this homology (8) suggests the presence of four V-like domains in the extracellular segment of CD4 and, thus, the evolutionary derivation of this protein from polyimmunoglobulin receptor structures.
Recently, we and others have demonstrated that recombinant soluble forms of CD4 block HIV replication and HIV-dependent cell fusion in vitro (7, 10-12). This finding pertains to forms of CD4 truncated at the NH2-terminal side of the transmembrane segment and produced from mammalian cell culture. These recombinant soluble T4 (rsT4) molecules have also been shown to bind the major HIV envelope glycoprotein, gp120 (10-13). Indeed, the antiviral activity of rsT4 can be attributed to competitive binding with cell-surface CD4, the HIV receptor.
Several forms of soluble CD4 have been constructed which serve to define the minimal structure required in CD4 for interaction with gp120. In fact, soluble CD4 derivatives containing only the two NH2-terminal immunoglobulin-like domains (14, 15) have been shown to maintain antiviral and gp120-binding properties. We report here a 113-amino acid fragment of CD4 (rsT4.113) and its expression, purification, and renaturation from Escherichia coli. This truncated CD4 corresponds to the NH2-terminal immunoglobulin domain of CD4 and is likely to be homologous to the structure (16, 17) of the variable portion of a κ-type light chain.
Construction of pBG211.11-A plasmid coding for the NH2-terminal 113 amino acids of recombinant CD4 was constructed in four steps beginning with the previously described mammalian cell expression plasmid, called pBG380 (7), which contains the gene coding for recombinant soluble T4 that is truncated after the codon for amino acid 377 (isoleucine). In the first step, the cDNA in pBG380 coding for the CD4 signal sequence, i.e. amino acids -22 to -1, was deleted by oligonucleotide-directed mutagenesis to create pBG380delta.s~. This step introduced a ClaI restriction site followed in sequence by: 1) a 10-base pair spacer; 2) a methionine codon; 3) CD4 cDNA coding for amino acids 1-374; 4) a translational stop signal; and 5) a XhoII restriction site. The ClaI-XhoII fragment isolated from pBG380delta.s~ was inserted into an E. coli plasmid in front of the tryptophan operon promoter (18) to yield pBG196.10. In the third step, a second oligonucleotide-directed mutagenesis was used to insert three tandem translational stop codons following the sequence coding for amino acids -23 through +113 in pBG380, to yield pBG394. The final step to construct the E. coli expression plasmid containing the rsT4.113 gene was an assembly of three fragments. The first fragment, including the vector sequences, was produced by restricting pBG196.10 with HindIII and ClaI to remove the coding sequence from amino acid 61 through 374 and including vector sequence following the 3' end of the rsT4 gene. The second fragment, a HindIII-BglII segment including the codons for CD4 amino acids 61 through 113 immediately followed by a triplet of stop codons in tandem, was isolated from pBG394. The third fragment for the assembly step was a BamHI-ClaI fragment containing a bacteriophage T4 transcriptional termination signal. The ligation of these three fragments produced pBG211.11 (Fig. 1A).
Production of rsT4.113 in E. coli-A lon htpR mutant of E. coli (19), transformed with pBG211.11, was diluted into 40 ml of minimal medium plus tryptophan (100 µg/ml) at 30 °C and shaken for 4.5 h. One-ml aliquots were removed after this dilution into induction medium at 0, 2, and 4 h and after growth overnight. Aliquots were centrifuged, and cell pellets were subjected to lysis by boiling in Laemmli gel loading buffer as previously described (20). After centrifugation to remove cell debris, samples were subjected to SDS-PAGE, followed by either protein blot analysis with a rabbit polyclonal anti-peptide serum as probe (7) or by Coomassie Blue protein staining (Fig. 1B).
Purification and Refolding-Wet cells (14 g) from a 4-liter shake-flask fermentation were suspended in 100 ml of a 20 mM Tris, pH 7.5, buffer containing 20 µg/ml DNase, 20 µg/ml RNase, and 1 mM phenylmethylsulfonyl fluoride. The suspension was twice passed through a French press at 1,000 p.s.i., then centrifuged at 18,000 × g for 15 min at 4 °C. The resulting pellet was solubilized in 20 ml of a 20 mM Tris, pH 7.5, buffer containing 7 M urea and 10 mM 2-mercaptoethanol. The suspension was subjected to ultracentrifugation at 85,000 × g for 90 min at 4 °C. The supernatant was diluted by the addition of 80 ml of 20 mM Tris, pH 7.5, 7 M urea, 10 mM 2-mercaptoethanol, and 40 ml of the sample was applied to a column (3 × 4 cm) of Q-Sepharose fast-flow (Sigma) equilibrated in the same buffer. The column was developed with a gradient in 400 ml total volume of increasing NaCl from 0 to 0.3 M in the same Tris/urea/2-mercaptoethanol buffer. Column fractions were monitored for absorbance at 280 nm and for protein by SDS-PAGE (15% acrylamide). A pool was prepared containing 20 mg of protein in 50 ml and concentrated to 10 ml in an Amicon stirred-cell ultrafiltration unit using a PM-30 membrane. A 5.0-ml portion of the concentrate was applied to a column (1.5 × 95 cm) of S-300 (Sigma) equilibrated and developed in the same Tris/urea/2-mercaptoethanol buffer. Column fractions were monitored for absorbance at 280 nm and for protein by SDS-PAGE. A pool containing rsT4.113 (approximately 4 mg) in 15 ml was thus prepared.
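The bookkeeping implied by the volumes and protein amounts above can be sketched as follows. This is only an illustration: the class `Pool` and all variable names are hypothetical, the concentration step is assumed to proceed without loss of protein, and the projection to the whole extract assumes that yield scales linearly with the fraction of material processed. The gel filtration recovery computed here refers to total protein loaded (including contaminants that the column removes), not to rsT4.113 specifically.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A protein pool described by total protein (mg) and volume (ml)."""
    protein_mg: float
    volume_ml: float

    @property
    def conc_mg_per_ml(self) -> float:
        return self.protein_mg / self.volume_ml


# Values stated in the text.
q_sepharose_pool = Pool(protein_mg=20.0, volume_ml=50.0)
s300_pool = Pool(protein_mg=4.0, volume_ml=15.0)

# ASSUMPTION: ultrafiltration from 50 ml to 10 ml loses no protein.
concentrate = Pool(protein_mg=q_sepharose_pool.protein_mg, volume_ml=10.0)

# 5.0 ml of the 10-ml concentrate was applied to the S-300 column.
loaded_on_s300_mg = concentrate.conc_mg_per_ml * 5.0

# Total-protein recovery across gel filtration (not rsT4.113-specific).
s300_recovery = s300_pool.protein_mg / loaded_on_s300_mg

# Fraction of the solubilized extract carried through both columns:
# 40 of 100 ml onto Q-Sepharose, then 5 of 10 ml onto S-300.
fraction_processed = (40.0 / 100.0) * (5.0 / 10.0)

# ASSUMPTION: linear scaling if the entire extract had been processed.
projected_total_mg = s300_pool.protein_mg / fraction_processed

print(f"Q-Sepharose pool: {q_sepharose_pool.conc_mg_per_ml:.2f} mg/ml")
print(f"Loaded on S-300:  {loaded_on_s300_mg:.1f} mg")
print(f"S-300 recovery:   {100 * s300_recovery:.0f}% of protein loaded")
print(f"S-300 pool:       {s300_pool.conc_mg_per_ml:.2f} mg/ml")
print(f"Projected yield:  {projected_total_mg:.0f} mg per 14 g wet cells")
```

Under these assumptions only one-fifth of the solubilized extract is carried through to the final pool, which should be kept in mind when relating the 4 mg recovered to the 14 g of wet cells.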
Refolding of rsT4.113 (at a concentration of 0.5 mg/ml) was achieved by stepwise dialysis against: 500 volumes of 3 M urea, 20 ... 20% solvent B (0.085% trifluoroacetic acid/70% acetonitrile) and developed with a linear gradient of increasing acetonitrile concentration from 20 to 80% solvent B over 45 min at a flow rate of 0.5 ml/min.
Radioimmunoassay and Epitope Analysis-Radioimmunoassay and epitope analysis of rsT4.113 were performed in 96-well microtiter plates coated overnight at 4 °C with 50 µl/well of goat anti-mouse IgG (HyClone, Logan, UT). Wells were rinsed, dried, and blocked by addition of a PBS solution containing 5% fetal calf serum (100 µl/well) for 1 h at room temperature. The microtiter wells were rinsed again with water, dried, and treated with 50 µl of an antibody solution (OKT4, OKT4a, or Leu3a) at 2 µg/ml in PBS, for 2 h at room temperature. Plates were washed with a PBS/0.05% Tween-20 solution and then water, and dried. Protein samples (50 µl) containing 25,000-30,000 cpm of 125I-labeled rsT4.3 were added to microtiter wells thus prepared. Following a 1-h incubation, plates were washed with the PBS/Tween-20 solution and then water, and counted for radioactivity. Protein samples ranged in concentration from 0.4 to 25 µg/ml.
HIV Fusion Assay-Antiviral activity of rsT4.113 was measured in a cell fusion assay performed essentially as described by Walker et al. (21). H9 cells (5 X lo'), chronically infected with human T-lymphotropic virus-IIIB, were preincubated with rsT4.113 in a range of concentrations from 5 to 50 µg/ml in 150 µl of medium in 96-well plates. Following a 1-h incubation at 37 °C in 5% CO2, 1.5 X lo' C8166 cells (22) were added to the wells in 50-µl aliquots. Plates were further incubated for 2 h at 37 °C in 5% CO2, after which the number of syncytia in 50-µl aliquots was counted microscopically. Syncytia were defined as cells containing a ballooning cytoplasm greater than three cell diameters. All samples were counted twice, and the results were averaged. To prevent bias during counting, samples were coded. Preincubation of H9 cells with OKT4a or rsT4.3 (7), both at concentrations of 25 µg/ml, served as positive controls. Negative controls included H9 and C8166 cells alone.
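The reduction of duplicate syncytium counts to percent inhibition, and the estimation of an IC50 from a dilution series, might be sketched as follows. The function names are hypothetical, the log-linear interpolation between bracketing concentrations is one common convention and is not stated to be the method used here, and every count in the example is invented purely for illustration (chosen only to be compatible with the concentration range of the assay); none of these numbers are experimental data.

```python
import math
from statistics import mean


def percent_inhibition(treated_mean: float, control_mean: float) -> float:
    """Percent reduction in syncytium count relative to the untreated control."""
    if control_mean <= 0:
        raise ValueError("control count must be positive")
    return 100.0 * (1.0 - treated_mean / control_mean)


def interpolate_ic50(concs, inhibitions):
    """Log-linear interpolation of the concentration giving 50% inhibition.

    `concs` must be positive and ascending. Returns None if 50% inhibition
    is not bracketed by two adjacent points.
    """
    points = list(zip(concs, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    return None


# HYPOTHETICAL duplicate counts (syncytia per 50-µl aliquot); not real data.
control_counts = (40, 44)
treated_counts = {      # concentration in µg/ml -> duplicate counts
    5.0: (28, 30),
    12.5: (4, 5),
    25.0: (1, 1),
    50.0: (0, 0),
}

control_mean = mean(control_counts)
concs = sorted(treated_counts)
inhibitions = [percent_inhibition(mean(treated_counts[c]), control_mean) for c in concs]
ic50 = interpolate_ic50(concs, inhibitions)

for c, inh in zip(concs, inhibitions):
    print(f"{c:5.1f} µg/ml: {inh:5.1f}% inhibition")
print("interpolated IC50 (µg/ml):", None if ic50 is None else round(ic50, 1))
```

Averaging the two counts before normalizing mirrors the duplicate counting described above; interpolating on a logarithmic concentration axis is a design choice suited to serial dilutions.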
Analytical Techniques-Automated Edman degradation of rsT4.113 was performed using an ABI 470A gas-phase protein sequencer equipped with a 900A data system. Phenylthiohydantoin amino acids were analyzed on-line using an ABI 120A PTH-analyzer equipped with a PTH-C18 column (2.1 × 220 mm). Protein (10 µg) for sequence analysis was applied to SDS-PAGE (15% acrylamide) and electroblotted on Immobilon (Millipore) membrane as described by Matsudaira (23). Amino acid analysis of protein samples was performed by hydrolysis of protein in 6 N HCl, in vacuo, for 24 h at 110 °C. Hydrolysates were applied to a Beckman 6300 Analyzer equipped with postcolumn detection by ninhydrin. Western blot analysis of SDS-PAGE gels was performed by standard techniques using mouse polyclonal antisera raised against a synthetic peptide segment of CD4 (residues 44 to 63).

RESULTS AND DISCUSSION
Several lines of evidence demonstrate that CD4 is the receptor for HIV. First, monoclonal antibodies specific for CD4, e.g. OKT4a and Leu3a, block infectivity and replication of HIV in cell culture (2,3). Second, viral infectivity is associated with a loss of T4+-T cells (1). Moreover, infection of T4+ cells with HIV leads to fusion of infected cells with uninfected T4+-T cells (2). Finally, a direct physical association (Kd = 4 X lo-' M) has been measured between CD4 and gp120/160, the envelope glycoprotein of HIV (12,24). Nevertheless, structural determinants which participate in the CD4-gp120 interaction are only partly understood. The interaction is believed to involve contributions from carbohydrate moieties of the envelope glycoprotein (25) and to be independent of transmembrane and cytoplasmic domains of CD4 (7, 10-12, 14, 15). Accordingly, binding of gp120/160 to CD4 does not require the presence of a cell surface nor of additional cell surface proteins. Indeed, soluble forms of CD4 have comparable affinities for binding to the HIV envelope glycoprotein as those determined for recombinant cell surface CD4 (12).
In order to evaluate further the structural determinants in CD4 responsible for interaction with HIV, we have prepared several recombinant soluble forms of the receptor truncated from the COOH terminus to 180, 154, and 130 amino acids in length. These derivatives were produced in mammalian cell culture and were found to contain the epitope for OKT4a. Based upon these data, we pursued construction of a 113-amino acid form of CD4 and, given the presence of only a single disulfide bridge, expression of the construct in a bacterial host.
Purification of rsT4.113 from E. coli was performed under denaturing and reducing conditions. Given in Fig. 2a is a chromatogram showing purification of rsT4.113 by ion exchange chromatography. rsT4.113 is found to elute early in the NaCl gradient and to be well resolved from low molecular weight contaminants. Accordingly, gel filtration chromatography was used to separate rsT4.113 from high molecular weight contaminants. The rsT4.113-containing pool from Q-Sepharose chromatography was applied to a column of S-300 (Fig. 2b), and this allowed for final purification of the protein to near homogeneity. Shown in Fig. 2c is an SDS-PAGE analysis depicting the purification of rsT4.113 throughout centrifugation and chromatography steps.
Refolding of purified rsT4.113 was achieved by dilution and dialysis steps to transfer protein to nondenaturing and oxidized conditions. HPLC was used to follow the course of the refolding. As shown in Fig. 3a, protein in 7 M urea, 10 mM 2-mercaptoethanol eluted from the HPLC column at 49% acetonitrile in the gradient. As subsequent steps transferred the protein to 1 M urea, 0.1 M ammonium acetate (Fig. 3b) and phosphate-buffered saline (Fig. 3c), an increasing percentage of rsT4.113 was found to elute earlier in the HPLC gradient, specifically at 47% acetonitrile. The identity of the earlier eluting peak as oxidized product was verified by reduction of rsT4.113 in nonchaotropic solutions and application of sample thus treated to HPLC under the same conditions (results not shown). The elution of oxidized rsT4.113 prior to reduced protein on HPLC suggests that formation of the single disulfide bridge decreases relative hydrophobicity of the protein (26). Spectral analysis of rsT4.113 was performed throughout the course of refolding in order to monitor the relative yield of soluble protein in the procedure. The refolding method described above allows approximately 20% recovery of rsT4.113; HPLC analysis indicates an approximately 15% contaminant of reduced protein in the preparation (Fig. 3c).
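To relate the acetonitrile percentages quoted above to positions in the gradient, the following sketch converts an acetonitrile concentration into percent solvent B and then into nominal time on the 20 to 80% solvent B, 45-min linear gradient. It rests on assumptions not confirmed by the text: that solvent A contains no acetonitrile, and that column dwell volume and retention delay can be ignored, so the times are nominal gradient positions rather than observed retention times. The function names are hypothetical.

```python
def percent_b_for_acetonitrile(acn_percent: float,
                               acn_in_b: float = 70.0,
                               acn_in_a: float = 0.0) -> float:
    """Percent solvent B giving the requested acetonitrile concentration.

    ASSUMPTION: solvent A contains no acetonitrile (acn_in_a = 0).
    Solvent B contains 70% acetonitrile, as stated in the text.
    """
    return 100.0 * (acn_percent - acn_in_a) / (acn_in_b - acn_in_a)


def gradient_time_min(percent_b: float,
                      start_b: float = 20.0,
                      end_b: float = 80.0,
                      duration_min: float = 45.0) -> float:
    """Nominal time at which a linear gradient reaches `percent_b`."""
    if not start_b <= percent_b <= end_b:
        raise ValueError("composition lies outside the gradient")
    return duration_min * (percent_b - start_b) / (end_b - start_b)


FLOW_ML_PER_MIN = 0.5  # flow rate stated in the text

reduced_t = gradient_time_min(percent_b_for_acetonitrile(49.0))
oxidized_t = gradient_time_min(percent_b_for_acetonitrile(47.0))
separation_min = reduced_t - oxidized_t

print(f"reduced form (49% acetonitrile):  {reduced_t:.1f} min")
print(f"oxidized form (47% acetonitrile): {oxidized_t:.1f} min")
print(f"nominal separation: {separation_min:.1f} min, "
      f"{separation_min * FLOW_ML_PER_MIN:.1f} ml")
```

Under these assumptions the two forms are separated by roughly 2 min of gradient, which is consistent with their being resolved as distinct peaks on the chromatograms.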
Automated Edman degradation of rsT4.113 reveals that the purified protein contains stoichiometric quantities of NH2-methionine, the translation initiation methionine of rsT4.113 in pBG211.11. The percent recovery of phenylthiohydantoinyl-methionine at the first cycle of the degradative chemistry was consistent with routine initial yields obtained in the automated Edman degradation. Thus, we can exclude the possibilities that a significant percentage of the rsT4.113 lacks the initiation methionine or that sequence analysis was impaired by the presence of glutamine at the first cycle of the degradative chemistry. Sequence analysis was performed for 40 cycles, and no evidence of lysine carbamylation was observed. Amino acid analysis displays a close correlation between actual and theoretical values for amino acids, thus indicating the absence of significant proteolytic degradation in the product.
Examination of epitopes in rsT4.113 was performed using radioimmunoassays. The ability of renatured rsT4.113 to compete with 125I-labeled rsT4.3 (a 375-amino acid form of rsT4) for binding to anti-CD4 monoclonal antibodies was measured. The results (Fig. 4) show that rsT4.113 contains the epitope for OKT4a but not OKT4. Because only OKT4a blocks infectivity of HIV in vitro (2, 3) and the binding of both CD4 (27) and rsT4.3 to gp120/160, this analysis suggests that a significant portion of the structural elements required for interaction with HIV are contained within the NH2-terminal 113 residues of CD4. Quantitatively, however, comparison to unlabeled rsT4.3 shows that the E. coli-derived protein exhibits a molar affinity decreased by a factor of 3. In contrast to binding by either OKT4a or Leu3a (data not shown), rsT4.113 exhibits no measurable competition with radiolabeled rsT4.3 for binding to an OKT4 solid phase. The absence of OKT4 binding to other derivatives of CD4 shorter than 180 residues mentioned above (15) suggests that the OKT4 epitope is localized within the COOH-terminal half of the CD4 polypeptide.
Consistent with monoclonal antibody epitope mapping, rsT4.113 produced in E. coli effects a dose-dependent inhibition of HIV-dependent syncytium formation in an in vitro assay performed essentially as described previously (7, 21). Syncytium formation is blocked to 90% of control at concentrations of rsT4.113 at 12.5 µg/ml with an IC50 of 5-10 µg/ml (Fig. 5). Similar analysis with longer forms of rsT4 reveals complete inhibition at concentrations of 12.5 µg/ml. Thus, whereas rsT4.113 is an effective agent toward neutralization of HIV-dependent cell fusion in vitro, its molar specific inhibitory activity is decreased by a factor of 3. It is as yet undetermined whether or not this decreased potency is due to incomplete renaturation of the E. coli-derived protein, the presence of three additional amino acids at the NH2 terminus of rsT4.113 (Met-Gln-Gly), but lacking in rsT4.2 or rsT4.3 as produced in mammalian cells, or the absence of additional structure in rsT4.113 required for high affinity binding to HIV.
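Because the potencies above are quoted in mass concentration while the comparison is made on a molar basis, a rough mass-to-molar conversion may be helpful. The sketch below assumes an average residue mass of 110 Da (a common rule of thumb, not a value from this work), ignores glycosylation of the mammalian cell-derived protein, and uses the nominal chain lengths of 113 and 375 residues; the resulting molar masses and concentrations are therefore approximations only, and the calculation is not presented as the one used to derive the factor of 3 quoted above. The function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # ASSUMPTION: rule-of-thumb average residue mass


def approx_mass_da(n_residues: int) -> float:
    """Approximate polypeptide molar mass (g/mol) from chain length alone."""
    return n_residues * AVG_RESIDUE_MASS_DA


def ug_per_ml_to_micromolar(conc_ug_per_ml: float, mass_da: float) -> float:
    """Convert µg/ml to µM.

    1 µg/ml equals 1 mg/liter; dividing by g/mol gives mmol/liter,
    and multiplying by 1000 gives µmol/liter.
    """
    return conc_ug_per_ml / mass_da * 1000.0


mass_113 = approx_mass_da(113)  # rsT4.113, nominal length
mass_375 = approx_mass_da(375)  # rsT4.3, carbohydrate ignored

ic50_low_um = ug_per_ml_to_micromolar(5.0, mass_113)
ic50_high_um = ug_per_ml_to_micromolar(10.0, mass_113)
print(f"rsT4.113 IC50 of 5-10 µg/ml is about "
      f"{ic50_low_um:.1f}-{ic50_high_um:.1f} µM")

# Molar concentrations of the two proteins at the same mass concentration.
um_113 = ug_per_ml_to_micromolar(12.5, mass_113)
um_375 = ug_per_ml_to_micromolar(12.5, mass_375)
print(f"at 12.5 µg/ml: rsT4.113 about {um_113:.1f} µM, "
      f"rsT4.3 about {um_375:.1f} µM (ratio {um_113 / um_375:.1f})")
```

Under these assumptions, equal mass concentrations correspond to roughly a 3-fold higher molar concentration of the shorter fragment, which illustrates why a comparison at equal µg/ml understates the difference in molar potency.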
As the NH2-terminal 113 amino acids of CD4 can be renatured from E. coli inclusion bodies to yield a polypeptide with anti-HIV activity, it is likely that this segment of structure comprises a functional domain. Therefore, the homology of this segment of CD4 to κ V regions of immunoglobulin light chain can be re-evaluated in light of structural conservation of a single globular fold. Based upon the three-dimensional structure of the variable portion of the Bence-Jones protein REI (16, 17), alignment of amino acid sequences follows from topological conservation of Cys-18 and Cys-86 of CD4 with Cys-23 and Cys-88 of V-κ with minimal requirement for insertions and deletions (Fig. 6). This alignment yields conservation of lipophilic amino acid side chains at all but 6 out of 17 positions that are considered of importance in formation of the REI hydrophobic core (16, 17). In particular, Trp-35 of REI is aligned with Trp-30 of CD4 and Phe-73 is aligned with a Trp. As represented in Fig. 6, gross structural characteristics of CD4 as applied to the globular fold of V-κ include a shortened strand A, a shortened turn linking strands C and D, and extensions, perhaps as loops, in turns linking strands G and H and strands H and I. The structural alignment of CD4 to V-κ would suggest that a core globular fold is maintained and that divergence occurs in extension or abrogation of turns linking nine antiparallel β-strands. Although speculative in nature, this approximation of structure in the HIV-binding domain of CD4 provides a framework for identification of molecular detail in the CD4-gp120/160 interaction. The antiviral activity of rsT4.113 and preservation of OKT4a/Leu3a epitopes therein demonstrate conservation of HIV-binding function within the first of four putative immunoglobulin-like domains in CD4. Other recent studies (28, 29) defining monoclonal antibody and gp120 binding epitopes in CD4 using saturation mutagenesis and mouse-human T4 chimeras support the conclusions established using rsT4.113. Nevertheless, studies by Landau et al. (29) and Clayton et al. (31) suggest the importance of immunoglobulin-domain II determinants for maximal gp120-binding affinity by CD4 mutants.
While considerable work is still required to understand fully the interaction of CD4 with HIV, the results presented herein may facilitate rational design of antiviral drugs based on the affinity of a specific domain of CD4 for the major envelope glycoprotein of HIV.