Cloning and Characterization of the 5' Transcriptional Regulatory Region of the Human Intercellular Adhesion Molecule 1 Gene

Cell-cell adhesion is critical in the generation of immunologic responses and is dependent upon expression of a variety of cell surface receptors. While intercellular adhesion molecule 1 (ICAM-1), a specific receptor for lymphocyte function-associated antigen 1, is constitutively expressed by some cell types, its de novo or increased expression by various cells has been associated with the initiation of inflammatory responses and appears to be transcriptionally regulated. The 5' region of the human ICAM-1 gene has been cloned and both structurally and functionally analyzed. A 17.3-kilobase genomic clone containing three exonal regions encoding the N-terminal third of the ICAM-1 protein was isolated. A 2.05-kilobase subclone, containing the 5'-most exon, was utilized to determine an interferon-γ-induced transcription initiation site via primer extension and S1 nuclease protection assays. Analysis of the 5'-flanking region revealed consensus sequences for appropriately located basal promoter elements, as well as numerous potential cis-acting enhancer elements. When subcloned into a reporter gene construct, the putative promoter subregion functioned as a potent promoter. However, in accord with biologically observed expression of ICAM-1 in specific cell types, when additional 5'-flanking sequences were included in reporter gene constructs, tissue-appropriate repression of transcription was observed.

associated antigen 1 (8), has been shown to be critical for a number of adhesion events among leukocytes and between leukocytes and other cell types (7,9,10). ICAM-1 is constitutively expressed on hematopoietic cells, vascular endothelium, fibroblasts, and certain epithelial cells (1). Its de novo expression by tissues involved in inflammatory responses has been demonstrated in numerous pathologic and experimental situations (11-14), and in vitro it can also be induced on many cell types by various proinflammatory cytokines, including interferon-γ (IFN-γ), tumor necrosis factor-α, and interleukin 1 (1,15,16), as well as by phorbol esters (2,17,18). Cytokines may increase base-line expression of ICAM-1, e.g. in vascular endothelium, or lead to ICAM-1 expression on cells that normally do not express this molecule, e.g. in keratinocytes (10). De novo and upregulated expression of ICAM-1 at sites of inflammation may enhance the recruitment and localization of leukocytes as well as permit their proper interaction with cells expressing targeted antigens.
In addition to its physiologic role of contributing to cell-cell adhesion in immune responses, the ICAM-1 protein has also been subverted by several pathologic processes. ICAM-1 functions as the receptor for the major group of rhinoviruses (19,20), and a soluble form of ICAM-1 has specifically inhibited rhinovirus infection (21). Additionally, ICAM-1 serves as a sequestration antigen for Plasmodium falciparum-infected red blood cells (22), and its expression is also thought to correlate positively with the metastatic potential of malignant melanoma (23).
Thus, the regulated expression of ICAM-1 by various tissues and cell types appears to play a critical role in numerous physiologic and pathologic processes. Since ICAM-1 expression appears to be transcriptionally regulated (6), elucidation of the molecular mechanisms involved in its induction by various physiologic stimuli and pathologic conditions requires identification and analysis of the transcriptional regulatory regions of the ICAM-1 gene. Furthermore, a cell type, such as human keratinocytes (HK), which does not constitutively express ICAM-1 but which can be rapidly induced to do so by exposure to specific signals, provides an optimal in vitro model for analyzing the structural and functional organization of the transcriptional regulatory apparatus of the ICAM-1 gene. In this article, we describe the structure and organization of the 5' region of the human ICAM-1 gene, including demonstration of a transcription initiation site utilized in epithelial cells upon stimulation with IFN-γ. We also demonstrate that the 5'-flanking region of this gene contains a potent functional promoter as well as consensus sequences for enhancer elements which may be biologically important in regulating its expression. Additionally, genomic regions immediately flanking the functional promoter subregion confer cell type-appropriate repression of reporter gene expression when transfected into epithelial cells. The concordance between expression levels of the endogenous ICAM-1 gene in a specific cell type and expression of ICAM-1-based reporter gene constructs transfected into these cells thus demonstrates the biologic relevance of the identified transcriptional regulatory regions.

MATERIALS AND METHODS
Cells-HK were purchased from Clonetics (San Diego, CA) and grown without feeder cells in a serum-free, low calcium medium (KGM, Clonetics). Cells were used for experiments after the second passage, at 80-90% confluence. A431, a human epidermoid squamous cell carcinoma line (24), was obtained from American Type Culture Collection (Rockville, MD). B6 is a murine tyrosine kinase-negative fibroblast cell line previously utilized in transfection studies (25). A431 and B6 cells were cultured in Dulbecco's modified Eagle's medium (4.5 g/liter glucose) (GIBCO) supplemented with 3 mM L-glutamine, 10% heat-inactivated fetal bovine serum, 100 units/ml penicillin, 0.25 mg/ml amphotericin B, and 100 μg/ml streptomycin (all from GIBCO). All cells were grown at 37 °C in humidified 5% CO2. In some experiments, HK and A431 cells were exposed to recombinant human IFN-γ (Amgen Biologicals, Thousand Oaks, CA) at 250 units/ml for 24 h prior to preparation of RNA.
Genomic Library Screening-A commercially prepared human lymphocyte genomic library (Stratagene, La Jolla, CA) was screened with a human ICAM-1 cDNA (26). This cDNA contains bases 46-1891 of the cDNA published by Staunton et al. (6), which will be used as the reference for cDNA nucleotide positions in this report. The cDNA insert was subcloned from the original vector πH3M into pGEM3Z (Promega, Madison, WI) as previously described (27). The genomic library was prepared by partial SauIIIa digestion of circulating lymphocyte DNA with subsequent cloning of size-fractionated genomic fragments into the BamHI sites of the substitution vector bacteriophage λDASH (Stratagene). A 315-nucleotide XbaI/HincII 5' fragment of the ICAM-1 cDNA was ³²P-labeled by random primer extension (specific activity of 10⁸-10⁹ cpm/μg) (28) and used to screen duplicate filters of 9 × 10⁵ recombinant phage plaques (28), giving a 99% probability for the presence of one 15-kb insert with relevant genomic material (29). Hybridizing clones were plaque purified in subsequent screening rounds. Single clone phage DNA was prepared from lysates (30) using polyethylene glycol precipitation, sodium dodecyl sulfate/EDTA phage lysis, and DNA extraction/precipitation (28). The purified DNA was then used for subsequent analytical and cloning procedures.
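The 99% figure quoted above can be checked with the standard library-representation formula, N = ln(1 - P) / ln(1 - f), where f is the fraction of the genome carried by a single insert. The following is a minimal sketch only: the function name and the haploid genome size of 3 × 10⁹ bp are assumptions that do not appear in the text, and reference 29 may formulate the calculation differently.

```python
import math


def clones_needed(probability: float, insert_bp: float, genome_bp: float) -> float:
    """Number of independent recombinants needed so that any given sequence
    is represented at least once with the stated probability.

    Uses N = ln(1 - P) / ln(1 - f), with f = insert_bp / genome_bp.
    """
    fraction = insert_bp / genome_bp
    return math.log(1.0 - probability) / math.log(1.0 - fraction)


# Assumed haploid human genome size (not stated in the text).
ASSUMED_GENOME_BP = 3_000_000_000

n = clones_needed(probability=0.99, insert_bp=15_000, genome_bp=ASSUMED_GENOME_BP)
print(f"plaques needed for 99% coverage: {n:.2e}")
```

Under these assumptions the estimate falls just above 9 × 10⁵ plaques, the same order as the number of plaques screened.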
Restriction Enzyme Mapping and Southern Blot Analysis-Purified phage DNA was digested with various combinations of restriction endonucleases, the digests were analyzed by horizontal agarose gel electrophoresis, and a restriction map was created. The digests were blotted to and immobilized on Nytran nylon membranes (Schleicher & Schuell), and Southern analysis (31) was carried out with two ³²P-labeled fragments of the human ICAM-1 cDNA. A 315-nucleotide 5' and a 1.53-kb 3' fragment were generated from the cDNA by simultaneous digestion with XbaI, which released the cDNA from its cloning site in pGEM3Z, and HincII, which cuts once within the cDNA. Hybridization was carried out for 20 h at 42 °C in 50% formamide, 2.5 × Denhardt's solution (1 × = 0.02% bovine serum albumin, 0.02% polyvinylpyrrolidone, 0.02% Ficoll), 5 × SSC (1 × = 0.15 M NaCl, 0.015 M sodium citrate, pH 7.0), 0.1% sodium dodecyl sulfate, and 100 μg/ml tRNA. Autoradiography, using XAR-2 film and an intensifying screen at -70 °C, was performed after increasingly stringent washes.
DNA Sequencing-For DNA sequence analysis, EcoRI fragments of the recombinant phage clone were subcloned into pGEM3Z in a shotgun cloning procedure (30); selected inserts were sequenced directly from denatured double-stranded constructs. Sequencing was carried out by the dideoxynucleotide chain termination method of Sanger et al. (32) using the Sequenase 2.0 reaction kit (U. S. Biochemical). Primers utilized were either complementary to the T7 or SP6 sites flanking the multiple cloning site of the vector (Promega) or were commercially synthesized oligodeoxyribonucleotides complementary to experimentally established sequences. The sequence of pGλ6-2.05, the EcoRI subclone containing the 5'-most coding region of the ICAM-1 gene, was screened for promoter/enhancer consensus sequences using the sequence analysis program MacVector (IBI, New Haven, CT).
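A consensus-sequence screen of this kind can be illustrated with simple pattern matching. The sketch below is not the MacVector procedure: the three patterns are simplified textbook consensus motifs chosen for illustration, the function names are hypothetical, and the toy sequence is invented and is not the ICAM-1 sequence. Positions are reported relative to a cap site designated +1, with no position 0.

```python
import re

# Simplified consensus patterns (assumptions for illustration only).
MOTIFS = {
    "TATA box": r"TATA[AT]A",
    "CAAT box": r"CCAAT",
    "Sp1 site": r"GGGCGG",
}


def to_relative(index: int, cap_index: int) -> int:
    """Convert a 0-based string index to cap-relative numbering (+1 = cap, no 0)."""
    offset = index - cap_index
    return offset + 1 if offset >= 0 else offset


def scan(sequence: str, cap_index: int):
    """Return sorted (motif name, start, end) hits in cap-relative coordinates."""
    seq = sequence.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?=({pattern}))", seq):
            start, end = match.start(1), match.end(1) - 1
            hits.append((name, to_relative(start, cap_index), to_relative(end, cap_index)))
    return sorted(hits, key=lambda hit: hit[1])


# Invented toy sequence with the cap site at 0-based index 46.
TOY = "CCAATGC" + "GGGCGG" + "CTC" + "TATAAA" + "C" * 24 + "GCTCAG"

for name, start, end in scan(TOY, cap_index=46):
    print(f"{name}: {start} to {end}")
```

In this toy case the TATA-like motif lies at -30 to -25, the spacing generally expected between a TATA box and a cap site.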
RNA Preparation-HK and A431 cells were harvested at 80-90% confluence by exposing them to 0.05% trypsin, 0.53 mM EDTA (GIBCO) for 5-10 min at 37 °C followed by two washes with phosphate-buffered saline. Total cellular RNA was prepared by guanidinium isothiocyanate lysis and CsCl gradient ultracentrifugation (28). Because the ICAM-1 transcript is absent or extremely low in untreated HK and A431 cells, respectively, and because it is relatively abundant in both HK and A431 cells that have been treated with IFN-γ (27), total cellular RNA was utilized in both primer extension reactions and S1 nuclease protection assays at 10 and 25 μg/reaction, respectively.
Primer Extension Reaction-Primer extension reactions for the determination of the extent of the 5'-untranslated region (5'-UTR) of the ICAM-1 mRNA were performed as described (33), using synthetic oligodeoxyribonucleotides complementary to portions of the ICAM-1 mRNA. The 40-mer 5' GCT GGG AGC CAT AGC GAG GCT GAG GTT GCA ACT CTG AGT A 3' was designated PE1; complementarity of its 5' end begins 12 nucleotides downstream of the translation start site (AUG) of the ICAM-1 mRNA, and extends 28 nucleotides upstream into the 5'-untranslated region (UTR) of the mRNA. The 40-mer 5' CTG GGA ACA GAG CCC CGA GCA GGA CCA GGA GTG CGG GCA G 3', which is complementary to nucleotides fully within the coding region of the ICAM-1 mRNA (5' end located 67 nucleotides downstream of the AUG), was designated PE2. The primers were ³²P-end-labeled using T4 polynucleotide kinase (28) and hybridized with total cellular RNA from IFN-γ-treated or untreated A431 cells and HK. Incubation was for 20 h at 50 °C in a solution containing 1 M NaCl, 0.5 M HEPES, pH 7.5, 1 mM EDTA, pH 8.0. RNA and primer were then coprecipitated and resuspended in 50 mM Tris, 5 mM MgCl2, 50 mM KCl, 0.005% gelatin, 5 mM dithiothreitol, 4 mM dNTP, pH 8.3. 40 units of avian myeloblastosis virus reverse transcriptase (Boehringer Mannheim, West Germany) were added to a final reaction volume of 25 μl and the primer extension reaction was carried out for 90 min at 42 °C. The reaction was terminated with 1 μl of 0.5 M EDTA, pH 8.0, and reaction products were precipitated, resuspended in denaturing buffer, heat denatured, and fractionated by denaturing acrylamide gel electrophoresis. The size of the primer extension products was determined by counting the number of nucleotides in a concurrently run sequencing ladder. This sequencing reaction was performed on pGλ6-2.05 using primers whose 5' ends were identical to those of PE1 and PE2, allowing identification of specific bases which correspond to the cap site. After fractionation, the gel was dried and autoradiography performed as described above.
S1 Nuclease Protection Assay-A single-stranded 5' end-labeled DNA probe complementary to the ICAM-1 mRNA was generated from the genomic subclone pGλ6-2.05 as described (33). This 187-nucleotide single-stranded probe extends from 12 nucleotides downstream (5' end) to 175 nucleotides upstream (3' end) of the translation start site of the ICAM-1 mRNA and was calculated to extend at least 100 nucleotides 5' of the candidate transcription start sites. The probe was hybridized with total cellular RNA of either IFN-γ-treated or untreated A431 cells and HK for 20 h in a solution of 80% formamide, 0.4 M NaCl, 0.04 M PIPES, pH 6.4, 1 mM EDTA, pH 8.0. The hybridization mix was subsequently exposed to 300-1000 units/ml S1 nuclease for various times at 37 °C in 0.28 M NaCl, 0.05 M sodium acetate, pH 4.5, 4.5 mM ZnSO4, 20 μg/ml denatured salmon sperm DNA. The reaction was terminated by adding 1/4 volume 4 M ammonium acetate, 20 mM EDTA, pH 8.0, 40 μg/ml tRNA, and the reaction products were precipitated, resuspended, and analyzed by denaturing acrylamide gel electrophoresis and autoradiography as described.
Production of Reporter Gene Constructs Using Various Portions of the ICAM-1 5'-Flanking Region and Chloramphenicol Acetyltransferase (CAT) Vectors-CAT reporter gene constructs were designed and engineered to contain various portions of the ICAM-1 5'-flanking region contained within pGλ6-2.05 and were produced in vectors which either provided or lacked an SV40 transcription enhancer region (Fig. 4A). pHS-CAT-B was produced in the following manner. A 282-nucleotide HindIII/SstI fragment extending 277 base pairs 5' of the identified transcription start site (see below, "Results") was isolated. This fragment, designated HS, also contains by sequence analysis the putative functional TATA box, two potential Sp1-binding sites, and three potential CAAT boxes (see "Results" for nucleotide number locations).
The promoterless CAT expression vector pCAT-Basic (Promega) was digested with HindIII/AccI, each cutting single sites (HindIII 5' of AccI) in the multiple cloning site located just 5' of the CAT coding region. The complementary HindIII ends of vector and HS were ligated, ensuring the same 5'-3' orientation for both the ICAM-1 HS fragment and the CAT coding region. The recessed termini of the AccI and SstI ends of vector and insert, respectively, were then blunt-ended with the Klenow fragment of Escherichia coli DNA polymerase I, and the plasmid was then recircularized by a blunt-end ligation and designated pHS-CAT-B. pBS-CAT-P and pBS-CAT-C were designed to contain upstream of the CAT coding region the 1165-nucleotide BglII/SstI (BS) fragment of pGλ6-2.05, which fully encompasses the HS fragment at its 3' end and extends 5' over an additional 883 nucleotides of the ICAM-1 5'-flanking region. Another group of constructs, designated pBH-CAT-P and pBH-CAT-C, contain only the 883-base pair BglII/HindIII (BH) fragment of the ICAM-1 5'-flanking region, and thus do not encompass any of the HS fragment. The BH fragment does contain by sequence analysis an alternative transcription initiation site with appropriately located TATA and CAAT boxes. Because of the availability of appropriate, unique restriction sites, these BS and BH fragments were cloned into either pCAT-Promoter or pCAT-Control (Promega) after removing the SV40 promoter region from each parent plasmid, rendering pCAT-Promoter equivalent to pCAT-Basic (i.e. lacking either SV40 promoter or enhancer), and rendering pCAT-Control promoterless but retaining the SV40 transcription enhancer downstream of the CAT coding region. Successful construction of each plasmid was confirmed by restriction enzyme analysis.
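The relationship among the three inserts can be checked with simple coordinate arithmetic. The sketch below assumes that HS spans nucleotides -277 to +5 (as given under "Results") and that BH abuts HS directly on its 5' side; the BH and BS coordinates are therefore derived values rather than coordinates stated in the text, and the helper name is hypothetical. Numbering has no position 0.

```python
def fragment_length(start: int, end: int) -> int:
    """Length of a fragment numbered relative to +1, where position 0 does not exist."""
    span = end - start + 1
    # A fragment that crosses the -1/+1 boundary would otherwise count a
    # nonexistent position 0.
    return span - 1 if start < 0 < end else span


HS = (-277, 5)                     # HindIII/SstI fragment, from the text
BH_LENGTH = 883                    # BglII/HindIII fragment length, from the text

# Derived (assumed) coordinates: BH ends immediately 5' of HS.
bh_end = HS[0] - 1
bh_start = bh_end - BH_LENGTH + 1
BH = (bh_start, bh_end)
BS = (bh_start, HS[1])             # BglII/SstI fragment = BH + HS

for name, (start, end) in {"HS": HS, "BH": BH, "BS": BS}.items():
    print(f"{name}: {start} to {end}, {fragment_length(start, end)} nt")
```

With these assumptions the lengths come out as 282, 883, and 1165 nucleotides, in agreement with the fragment sizes given above.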
Transient Expression of CAT Vectors-Subconfluent murine B6 fibroblast and A431 cell cultures (10⁶ cells/100-mm tissue culture dish) were transfected by the calcium phosphate precipitation technique (34) with 15 μg of plasmid DNA of either pHS-CAT-B, pBH-CAT-P, pBH-CAT-C, pBS-CAT-P, pBS-CAT-C, pCAT-Basic, or pCAT-Control. In comparative expression experiments, cells were also cotransfected with 1 μg of pCH110 (Pharmacia LKB Biotechnology Inc.), an SV40-driven plasmid expressing E. coli β-galactosidase, expression of which was utilized to normalize lysate volumes for differences in transfection efficiency among various plates and constructs, as previously described (28). pCAT-Control contains the SV40 promoter and enhancer sequences and was thus utilized as a positive control for CAT expression. Transfection with pCAT-Basic, containing no endogenous promoter or enhancer, served as a negative control for CAT expression. After exposure to precipitated plasmids for 16 h, cells were washed, and medium replenished. After 48 h at 37 °C, 5% CO2, cells were harvested and lysates prepared as described (35). Cell lysates were assayed for CAT enzyme expression as described, using ¹⁴C-labeled chloramphenicol as substrate. Acetylated derivatives of chloramphenicol were resolved by thin layer chromatography, detected by autoradiography, and percentage conversion of substrate to acetylated forms determined after scintillation counting of appropriate areas of the chromatograph. Transfections and CAT assays for each plasmid were performed in triplicate.
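The quantitation described above reduces to two small calculations: percentage conversion of substrate from scintillation counts, and scaling of lysate volume by β-galactosidase activity. The following is a minimal sketch under stated assumptions; the function names, the inverse-proportional volume scaling, and all numerical values are hypothetical illustrations, not data or procedures taken from this study.

```python
from statistics import mean


def percent_conversion(acetylated_cpm: float, unacetylated_cpm: float) -> float:
    """Percentage of chloramphenicol converted to acetylated forms."""
    total = acetylated_cpm + unacetylated_cpm
    if total <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * acetylated_cpm / total


def normalized_lysate_volume(reference_volume_ul: float,
                             reference_bgal: float,
                             sample_bgal: float) -> float:
    """Lysate volume giving the same beta-galactosidase activity as the reference.

    Assumes volume is scaled inversely with beta-galactosidase activity.
    """
    if sample_bgal <= 0:
        raise ValueError("beta-galactosidase activity must be positive")
    return reference_volume_ul * reference_bgal / sample_bgal


# Hypothetical triplicate counts (acetylated cpm, unacetylated cpm) for one construct.
triplicate = [(2400, 7600), (2500, 7500), (2600, 7400)]
conversions = [percent_conversion(a, u) for a, u in triplicate]
print(f"mean conversion: {mean(conversions):.1f}%")

# A plate with twice the reference beta-galactosidase activity needs half the lysate.
print(normalized_lysate_volume(50.0, reference_bgal=1.0, sample_bgal=2.0))
```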

RESULTS
Isolation and Structural Analysis of Genomic DNA-The screening of the human lymphocyte genomic library with the 315-nucleotide 5' fragment of a human ICAM-1 cDNA resulted in the isolation of six hybridizing clones, which were plaque purified and found to be identical upon restriction enzyme mapping and Southern analysis. One clone, designated λ6, is shown (Fig. 1). Within the exonal sequences of pGλ6-2.05 are contained the first ATG codon with reference to the described cDNAs (6,26) and 67 nucleotides of ICAM-1 protein coding region (Fig. 2). Furthermore, all the previously determined nucleotide sequence upstream from the translation start site, as documented in the published cDNAs, is found immediately 5' of the ATG codon. At the 3' border of the exon, a consensus sequence for a splice donor site has been identified.
The 263 nucleotides of coding region contained in pGλ6-5.4 are identical in sequence to cDNA nucleotides 125-388 and are flanked at their 5' and 3' ends by consensus sequences for splice acceptor and donor sites, respectively (data not shown), indicating that this region corresponds to a complete exon. The exonal sequences in the insert of subclone pGλ6-0.85 are identical to cDNA nucleotides 389-592 and are bordered, at their 5' end, by a consensus splice acceptor site. The 3' end of this fragment has no associated splice donor site but contains a terminal SauIIIa site. However, since the 3' end of this 0.85-kb subclone corresponds precisely to the 3' end of the entire λ6 genomic fragment, and since the genomic library from which λ6 was cloned was prepared by a partial SauIIIa digest of human genomic DNA, it is most likely that this SauIIIa site is the product of a preparative enzymatic restriction digest of the genomic DNA within this exon rather than the formal 3' end of this exon. Thus, λ6 contains the first 533 nucleotides of ICAM-1 protein coding region organized over two complete exons and one partial exon, and all the 5'-UTR nucleotides of the ICAM-1 cDNA published to date.

Fig. 2. ... containing the first exon and candidate promoter region. Numbering was determined in primer extension and S1 nuclease protection assays (see below), with the deduced transcription start site (bent arrow) designated as +1. The ICAM-1 first exon protein coding region extends from nucleotides +40 to +107 and is accompanied by the deduced amino acid sequence. Bold underline, putative TATA box; normal underline, putative Sp1-binding sites.
Within λ6, the correspondence of exonal regions to various protein domains of the final gene product is strikingly similar to that previously described for other members of the immunoglobulin superfamily (5).
The 67 nucleotides of protein coding region contained within the first exon correspond to the first 22 and 1/3 codons for the 27-residue signal peptide of the ICAM-1 protein (6). The remainder of the signal peptide is encoded by the second exon, which also contains 82 and 2/3 of the 88 codons which determine the N-terminal-most immunoglobulin-like domain (6). Since pGλ6-2.05 contains the 5'-most coding region that is represented in the cDNA, it was considered likely to contain candidate 5'-flanking transcriptional regulatory regions and was sequenced in its entirety (Fig. 2). Within the 2045 nucleotides of genomic material are contained the 67 nucleotides of protein coding region (see above), 1392 nucleotides of 5'-flanking region, and 586 nucleotides of the first intron. Consensus sequences for promoter elements, e.g. TATA and CAAT boxes and Sp1-binding sites, are found within the 5'-flanking region appropriately located with respect to the identified cap site (see below). Additionally, consensus sequences for a number of enhancer elements can be found in both the 5'-flanking and first intronal regions of pGλ6-2.05. These include two potential interferon-stimulated response elements at nucleotides -1307 to -1294 and +465 to +478 (36); two potential glucocorticoid receptor elements at -1328 to -1314 and +658 to +642 (37); a potential NF-κB site at -528 to -519 (38); two potential AP2 sites at -917 to -910 and -506 to -499 (39); and several potential AP1 sites (40).
Characterization of the 5'-UTR and Cap Site-Primer extension was performed using reverse transcriptase after end-labeled, complementary oligodeoxyribonucleotides had been hybridized to total cellular RNA of either untreated or IFN-γ-treated HK or A431 cells (Fig. 3A). No specific extension products were observed if total cellular RNA of untreated HK or A431 cells was used as template. The extension of PE1 produced specific bands of 51 and 52 nucleotides when reacted with RNA from IFN-γ-treated cells. Since the 5' end of PE1 is complementary to the base located 12 nucleotides downstream of the translation start site AUG, these bands correspond to a 5'-UTR of 39 or 40 nucleotides in length. PE2, whose 5' end is complementary to the base 67 nucleotides downstream of the AUG, produced bands of 106 and 107 nucleotides (Fig. 3B). Thus, extension studies utilizing nonoverlapping primers confirm a 5'-UTR of 39-40 nucleotides for ICAM-1 transcripts generated under these experimental conditions, that is, by HK and A431 cells after IFN-γ treatment.
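The arithmetic behind this conclusion is a subtraction of the coding nucleotides covered by each primer from the length of its extension product. A minimal sketch follows; the function and variable names are illustrative only, while the product lengths and primer offsets are those given in the text.

```python
def utr_length(product_nt: int, primer_offset_nt: int) -> int:
    """5'-UTR length implied by a primer extension product.

    product_nt       -- length of the extension product
    primer_offset_nt -- number of nucleotides from the A of the AUG to the
                        base complementary to the primer's 5' end, inclusive
    """
    return product_nt - primer_offset_nt


# Primer 5'-end positions downstream of the AUG, and observed product lengths.
PRIMERS = {
    "PE1": {"offset": 12, "products": (51, 52)},
    "PE2": {"offset": 67, "products": (106, 107)},
}

for name, data in PRIMERS.items():
    utrs = [utr_length(p, data["offset"]) for p in data["products"]]
    print(f"{name}: 5'-UTR of {utrs[0]} or {utrs[1]} nucleotides")
```

Both nonoverlapping primers give 39 or 40 nucleotides, which is the agreement the text relies on.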
To demonstrate that the full 5'-UTR was contained within the identified putative first exon, S1 nuclease protection assays were performed. A single-stranded probe of 187 nucleotides, complementary to the 5' region of the ICAM-1 mRNA, was generated from the appropriate strand of pGλ6-2.05. The 5' end of this probe, like the primer PE1, is complementary to the base located 12 nucleotides downstream of the ICAM-1 AUG. 5' end-radiolabeled probe was allowed to hybridize to total cellular RNA from IFN-γ-treated A431 cells, and S1 nuclease was subsequently added. S1 nuclease completely degraded single-stranded probe hybridized to irrelevant tRNA, but when hybridized with the A431 RNA, produced protected fragments of probe slightly greater than 50 nucleotides in length (Fig. 3C), consistent with a 5'-UTR of 39-40 nucleotides and in agreement with the primer extension data.

Fig. 3. ... Specific extension products are marked by the arrow. B, autoradiography of a primer extension reaction with primer PE2. A sequencing reaction was run concurrently for the determination of the size of the extension products. The sequencing reaction was performed on pGλ6-2.05 using a primer whose 5' end coincides with that of PE2. Thus, the precise candidate nucleotides corresponding to the cap site for ICAM-1, as well as the length of the primer extension products, can be determined. Specific extension products are marked by the arrow. C, autoradiography of an S1 nuclease protection assay. A single-stranded, 5' end-labeled DNA probe complementary to the ICAM-1 mRNA extending 12 nucleotides downstream (5' end of probe) to 175 nucleotides upstream (3' end of probe) of the translation start site of the ICAM-1 mRNA was hybridized to total cellular RNA of IFN-γ-treated A431 cells and to tRNA and then either exposed (+) or not exposed (-) to S1 nuclease as described under "Materials and Methods." Bands corresponding to specific protected probe fragments are marked by the arrow. Length of protected probe fragments was determined by concurrently electrophoresed sequencing reactions and radiolabeled nucleotide length markers.
Omission of S1 nuclease from the reaction resulted in the preservation of the full-length, single-stranded probe. In additional experiments using the same probe, protected fragments identical in size to those seen with A431 RNA were observed using RNA from IFN-γ-treated HK. RNA from untreated HK and untreated A431 cells did not protect the probe from S1 digestion (data not shown).
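The expected size of the protected fragment follows from the probe geometry. The sketch below assumes that the mRNA protects the probe from its labeled 5' end (12 nucleotides downstream of the AUG) up to the cap site and that everything 5' of the cap site is digested; the names are illustrative, while the lengths are those given in the text.

```python
PROBE_DOWNSTREAM_NT = 12   # probe nucleotides within the coding region
PROBE_UPSTREAM_NT = 175    # probe nucleotides upstream of the AUG
PROBE_LENGTH = PROBE_DOWNSTREAM_NT + PROBE_UPSTREAM_NT


def protected_length(utr_nt: int) -> int:
    """Length of probe protected by an mRNA with a 5'-UTR of utr_nt nucleotides."""
    if utr_nt > PROBE_UPSTREAM_NT:
        # The transcript would extend beyond the probe: full-length protection.
        return PROBE_LENGTH
    return PROBE_DOWNSTREAM_NT + utr_nt


print(f"full-length probe: {PROBE_LENGTH} nt")
for utr in (39, 40):
    print(f"5'-UTR of {utr} nt -> protected fragment of {protected_length(utr)} nt")
```

Fragments of 51 and 52 nucleotides are what a 39-40 nucleotide 5'-UTR predicts, matching the observed bands of slightly more than 50 nucleotides.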
The primer extension and S1 nuclease protection experiments utilizing IFN-γ-induced transcripts identify a cap site within the genomic fragment of pGλ6-2.05 that is 39-40 nucleotides 5' of the ICAM-1 AUG (Fig. 2). Sequence analysis of the region immediately upstream from this cap site reveals a TATA box (41) within a GC-rich region at nucleotides -30 to -25 (+1 designated as the guanosine 40 nucleotides 5' of the ICAM-1 AUG). Additionally, both consensus Sp1-binding sites (-196 to -191 and -59 to -54 (37)) and potentially utilized CAAT boxes (-262 to -254 and -154 to -146 (42) and -53 to -48 (43)) can be identified within the first 300 nucleotides of the 5'-flanking region identified by these studies.
The 5'-Flanking Region of the ICAM-1 Gene Contains a Potent Functional Promoter and an Upstream Region Which Confers Tissue-appropriate Repression of Transcription-Whether the candidate promoter region surrounding the identified transcription initiation site could drive expression of a heterologous reporter gene was first determined. A 282-nucleotide HindIII/SstI (HS) fragment of pGλ6-2.05 (nucleotides -277 to +5 of Fig. 2), containing the identified cap site and candidate basal promoter elements described above, was inserted immediately upstream of the CAT reporter gene contained in the promoterless expression vector, pCAT-Basic (Fig. 4A). This inserted HS fragment functioned as a potent promoter of CAT expression when pHS-CAT-B was transiently transfected into both murine B6 fibroblasts and the squamous cell carcinoma line, A431 (Fig. 4B). In multiple transfection analyses, the relative expression of CAT enzyme using pHS-CAT-B was consistently as high as or higher than that observed with the positive control SV40 promoter/enhancer construct, pCAT-Control. The fact that expression of this hybrid construct was high in A431 cells, which normally express no or very low levels of ICAM-1 (27), raised the possibility that in engineering pHS-CAT-B, a potent ICAM-1 promoter had been divorced from flanking, constitutively utilized repressor elements. Therefore, CAT reporter constructs were produced using additional portions of the ICAM-1 5'-flanking region, and these constructs (Fig. 4A) were utilized in transient transfections to determine: 1) whether sequences immediately 5' of the HS fragment might contain regions that physiologically repress ICAM-1 gene transcription in the proper cellular context (construct pBS-CAT-P); and 2) whether, if repression of CAT expression in A431 cells could be demonstrated with the inclusion of additional 5'-flanking regions, this repression could be overridden by adding an SV40 enhancer to the repressed construct (construct pBS-CAT-C).
Furthermore, analysis of the ICAM-1 genomic region immediately 5' of the HS fragment of pGλ6-2.05 revealed not only consensus sequences for multiple potential cis-acting regulatory elements (see above), but also additional candidate cap sites with appropriately located basal promoter elements (potential cap site, nucleotides -284 to -278; potential TATA box, -313 to -307, Fig. 2). For this reason, and because previously defined ICAM-1 cDNA sequences indicated the existence and utilization of alternative transcription initiation sites in the expression of the ICAM-1 gene (see below, "Discussion"), reporter constructs were also devised to assess the ability of the region immediately flanking the HS fragment to function as a promoter, both without and with the assistance of an SV40 enhancer (Fig. 4A, constructs pBH-CAT-P and pBH-CAT-C, respectively). Results of CAT expression utilizing these additional constructs (Fig. 4C) [...] nucleotide upstream BH region displays no promoter function, either without (lane 4) or with (lane 8) the addition of the SV40 enhancer. These BH constructs also show no promoter activity when transfected into murine B6 fibroblasts (data not shown). These data demonstrate that while the ICAM-1 5'-flanking region contains a potent promoter region, it also contains physiologically relevant repressor elements that are both functional and capable of being overridden. Thus, in A431 cells, in which ICAM-1 gene expression is constitutively very low but can be markedly upregulated upon exposure to specific signals, concordance between the biology of endogenous ICAM-1 gene expression levels and transient transfection expression assays is achieved when larger, inherently more physiologic portions of the ICAM-1 gene 5'-flanking region are utilized in the hybrid expression constructs.

DISCUSSION
We have characterized the genomic structure of the human ICAM-1 5'-transcriptional regulatory region and have identified a potent, functional promoter within this region that is flanked by a functional repressor. Structural analysis has also revealed the exonal organization of that portion of the gene which encodes the N terminus of the ICAM-1 protein, specifically the signal peptide and the distal immunoglobulin-like domain, which contains the primary binding site for both lymphocyte function-associated antigen 1 and the major group of rhinoviruses (21). The protein coding region of the first exon contains 67 of 81 nucleotides (22 and 1/3 of 27 codons) that encode the cDNA-deduced signal peptide (6), and the second exon contains the remaining 14 nucleotides encoding the signal peptide and 247 of 264 nucleotides (82 and 2/3 of 88 codons) that encode the N-terminal immunoglobulin-like domain (6). Virtually identical exonal organization of signal peptide and distal immunoglobulin-like domains has been reported for other members of the immunoglobulin superfamily, e.g. for the variable (V) regions of immunoglobulin H, κ, and λ chain genes (44), as well as for the V-regions of TCR α, β, and γ chain genes (45). The three EcoRI fragments of λ6 that contain hybridizing, exonal material have also been detected in genomic Southern analysis (23). The exonal information of the ICAM-1 gene appears to be dispersed over a larger region of genomic DNA than predicted (6). Apart from λ6, we have also isolated another genomic recombinant phage clone that contains an additional 300 nucleotides of ICAM-1 protein coding sequence. Together, these two recombinant phage clones represent approximately 30 kb of ICAM-1 genomic sequence, but contain only 52.3% of the ICAM-1 protein coding region, indicating that the ICAM-1 gene is considerably larger than previously thought.
Utilizing RNA isolated from two epithelial cell types (HK and A431 cells) after IFN-γ treatment, we have identified a cap site that is appropriately located with respect to a consensus TATA box and predicts a 5'-UTR of 39-40 nucleotides. The 5' extent of the ICAM-1 cDNA isolated by Staunton et al. (6) extends an additional 17 nucleotides upstream from our experimentally determined cap site (5' end of the Staunton et al. (6) cDNA at nucleotide -17, Fig. 2), and predicts a 5'-UTR of at least 57 nucleotides. Since this cDNA sequence was obtained from a phorbol myristate acetate-stimulated HL60 cDNA library, the inclusion of these additional 17 nucleotides in the 5'-UTR may have resulted from an mRNA generated using an alternate transcription start site located slightly upstream in the genomic sequence. Since this 17-nucleotide region is identical in sequence and location in both the cDNA and the λ6 subclone, it is highly unlikely that this discrepancy can be explained by alternative splicing or the existence of additional intron-separated ICAM-1 exonal regions further upstream. Indeed, probing of Northern blots which contain ICAM-1 transcripts with fragments of λ6 representing an additional 4.5 kb of upstream genomic sequences has failed to detect any hybridizing signals. From the genomic sequence reported here, and if the Staunton et al. [...] alternate transcription initiation site would be located only 12 nucleotides downstream from the identified TATA box, with no alternative, appropriately distanced (25-30 nucleotides) (46) TATA consensus sequence within the genomic clone. Repeated S1 nuclease protection assays utilizing RNA isolated from IFN-γ-treated epithelial cells have failed to detect protected fragments which would correspond to a 5'-UTR of 57 nucleotides in length. However, in some primer extension studies, we have detected minor specific bands corresponding to a 5'-UTR of precisely this length (data not shown).
Thus, while a predominant ICAM-1 transcript induced by IFN-γ in epithelial cells appears to utilize a transcription initiation site located 39-40 nucleotides 5' of the translation start site, there are indications that an alternate cap site which produces a transcript with a slightly longer 5'-UTR may also be present. However, additional upstream consensus cap sites with appropriately located promoter elements (contained within the BH fragment, Fig. 4) failed to demonstrate any functional promoter activity in transient expression assays of reporter gene plasmids, even with the addition of an SV40 enhancer. Nevertheless, these putative alternative transcription initiation sites (located 17 nucleotides apart) cannot explain why multiple ICAM-1 mRNA species ranging from 1.9 to 3.3 kb are usually detected in Northern blot analysis (6, 23, 26). These considerable mRNA length differences may reflect post-transcriptional modifications of the ICAM-1 mRNA. There is little indication, however, that alternative splicing affecting either the 5'-UTR or coding region could contribute to these multiple RNA species, since the isolated cDNAs have identical protein coding sequences and, at least with regard to the 5' extent of the mRNA, the cDNA sequences are completely contained within the characterized genomic DNA. Splicing events exclusively affecting the 3'-UTR of ICAM-1 mRNA have not been ruled out.
When inserted into a heterologous reporter gene construct, the immediate 5'-flanking region of the ICAM-1 gene demonstrates strong promoter activity when transfected into murine B6 fibroblasts and A431 cells. The chloramphenicol conversion rates for the ICAM-1 pHs-CAT-B construct are uniformly equivalent to or better than those seen with the positive control SV40 promoter/enhancer CAT plasmid, pCAT-Control. Interestingly, this ICAM-1 promoter construct was also expressed at high levels in A431 cells, a cell line that shows no or only low constitutive expression of ICAM-1 under standard culture conditions (27). While these differences in expression might have been attributable to the absence of physiologic regulatory mechanisms in transient transfection expression systems, transfection experiments utilizing the additional ICAM-1-based reporter constructs pBS-CAT-P and pBS-CAT-C demonstrate that this discrepancy is explained by the dissociation of the basal ICAM-1 promoter from neighboring regulatory DNA sequences. These flanking sequences clearly contain cis-acting regions that, in the appropriate cellular context, physiologically repress ICAM-1 gene transcription. The absence of these presumptive repressor elements, as in the pHs-CAT-B construct, thus allows strong promoter activity even without the addition of normally required inducers of expression, such as IFN-γ. Indeed, this observed constitutive repressor activity can be completely overcome with the addition of an SV40 enhancer to the appropriate reporter construct. Furthermore, preliminary experiments indicate that this constitutive repressor activity observed with the pBS-CAT-P construct can also be overridden by treatment of transiently transfected A431 cells with IFN-γ, a signal which rapidly induces endogenous ICAM-1 gene expression in this and numerous other cell types.
It is therefore of interest that within the 5'-flanking region of the human ICAM-1 gene are contained potential interferon-responsive elements, glucocorticoid receptor-binding sites, an NF-κB consensus element, and AP1 and AP2 sites, regions which may be involved in the regulated expression of this gene. The exact biologic roles played by these potential elements as well as other regions involved in the constitutive and tissue-specific regulation of ICAM-1 gene expression are currently under investigation.