Histone H1 Binds to the Putative Nuclear Factor I Recognition Sequence in the Mouse α2(I) Collagen Promoter*

It has previously been demonstrated that nuclear factor I (NF I) or a related protein binds to a region between -315 and -295 from the start of transcription in the mouse α2(I) collagen gene promoter. In the present work we have purified this factor to homogeneity from rat liver. DNA sequence-specific proteins were isolated from nuclear extracts using heparin-agarose affinity chromatography and two successive chromatographies on a recognition site affinity matrix. Approximately 160 µg of the DNA binding proteins was obtained from 100 g of rat liver. More than 1700-fold purification over the nuclear extract and 58% recovery of the DNA binding activity were achieved. The purified preparation contained five to six protein components ranging in molecular weight from 30,000 to 35,000, as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. It was demonstrated using DNase I footprint analysis that the factor binds to the putative NF I binding site in the mouse α2(I) collagen promoter. It has a dissociation constant of 7 nM for a short DNA fragment containing this binding site, while a constant of 0.45 nM was obtained for a similar-sized fragment containing the NF I consensus binding sequence. The purified factor is identical to histone H1 in several respects. They share similar amino acid compositions and they give similar V8-protease and N-bromosuccinimide peptides. In addition, antibodies raised to bovine histone H1 recognize the purified factor and interfere with its binding to DNA. Methylation interference and preparative gel shift assays show that the histone H1 in the preparation of the α2(I) collagen promoter binding factor binds to the specific sequence. It is thus evident from the present results that histone H1 binds to the NF I recognition sequence in the mouse α2(I) collagen promoter.

Type I collagen, which is one of the main secretory proteins synthesized by fibroblasts (1), confers tensile strength to various kinds of connective tissue. Its molecule is a trimer of two α1(I) chains and one α2(I) chain, coiled into a triple-helical structure. Interestingly, the rate of type I collagen synthesis is regulated, e.g. during development of the mouse embryo, and the protein can be detected in utero after day 8 (2, 3). This induction is essential for development. Inactivation of the α1(I) collagen gene, brought about by retroviral insertion of foreign DNA into the first intron of the gene in the mouse oocyte, is lethal (4, 5). Major changes in the rate of synthesis of type I collagen also take place in various pathological states. A drastic decrease is observed in fibroblasts after viral transformation, for example, due to a reduced rate of transcription of the corresponding genes (6-8).
Genes for type I collagen polypeptide chains have been characterized from several species. The α1(I) and α2(I) collagen genes generally consist of 51 or 52 exons separated by an equal number of introns (9, 10). A single copy per haploid genome of the α1(I) gene is located in the long arm of human chromosome 17 (11), and that of the α2(I) gene in chromosome 7 (12). The length of the α1(I) collagen gene is only one-half that of the α2(I) gene, since they span 18 and 38 kilobases, respectively (13). In most situations, they are transcribed in the ratio of 2:1, which confers the same ratio on the steady-state levels of the corresponding mRNAs (14). The α1(I) and α2(I) collagen mRNAs are equally translated and have identical half-lives (14).
Chimeric plasmids containing part of the mouse α2(I) collagen gene promoter and a bacterial chloramphenicol acetyltransferase gene have been generated (15). Transient transfection experiments with these plasmids (15) and studies on transgenic mice (16) have revealed that the part of the promoter that ranges from -2000 to +54 from the start of transcription is sufficient for expression and, together with an enhancer element in the first intron (17), confers the tissue specificity necessary for the expression of the α2(I) collagen gene.
Several nuclear protein factors that recognize specific DNA sequences in the mouse α2(I) collagen promoter have been discovered, and two of these have been characterized in more detail. One factor binds to a region between -70 and -96 that contains a CCAAT sequence on the sense strand between -80 and -84 (18), and the other binds to a region between -295 and -315 from the start of transcription (19). It has been demonstrated by deletion analysis that the latter area is also important for efficient expression of the α2(I) collagen gene (15). Less than 10% of the activity observed with the authentic promoter is obtained when a mutant having a deletion from -350 to -290 is used in transient transfection experiments. Furthermore, in the light of similarities in the recognition sequences and of competition studies, it is probable that this factor is either NF I or a related protein (20).
In order to elucidate the basic mechanisms which regulate the expression of type I collagen genes, we have investigated the nuclear protein factors which bind to specific DNA sequences in the α2(I) collagen promoter. We report here on the isolation of proteins from rat liver which bind to the putative NF I binding site.

Characterization of the α2(I) Collagen Promoter Binding Proteins-Proteins which recognize the putative NF I binding site in the mouse α2(I) collagen promoter were isolated from rat liver using heparin-agarose and two consecutive DNA recognition site affinity chromatographies. The affinity purified material was subjected to electrophoresis on a SDS-polyacrylamide gel. Silver staining revealed the presence of at least five or six protein bands, ranging in molecular weight from 30,000 to 35,000 (majority of the staining at 33,000). The preparation was more than 90% pure, as judged by polyacrylamide gel electrophoresis. The DNA recognition sequence of the purified factor has a partial dyad of symmetry (21, 22). This implies that proteins which bind to that sequence may be dimeric in structure, like many prokaryotic DNA binding proteins. Consistent with this, the purified DNA binding factor behaves in solution mainly like a protein of molecular weight ~60,000. This is demonstrated by gel filtration of the factor on a Sephacryl S-300 matrix followed by analysis of the protein by UV spectrophotometry and DNA binding activity by the nitrocellulose filter binding assay. Most of the protein and DNA binding activity co-eluted with the bovine serum albumin standard as a distinct peak (Fig. 2). The protein nevertheless has a tendency to aggregate, and occasionally forms with an even higher molecular weight were observed.
The broad protein band ranging in molecular weight from 30,000 to 35,000 was cut out from the acrylamide gel, electroeluted from the gel and subjected to sequencing of the amino terminus and to amino acid analysis. Despite 500 pmol of protein being subjected to automated gas-phase sequencing, no sequence was obtained. It is thus probable that the amino terminus of the DNA binding factor is blocked. Its amino acid composition nevertheless turned out to be characteristic (Table I), the most conspicuous feature being an abundance of lysine, alanine, and proline residues, while only trace amounts of aromatic amino acids and no methionine were observed.
DNA Binding Characteristics of the Purified Protein-The purified factor recognizes the putative NF I binding site in the mouse α2(I) collagen promoter, as shown by DNase I footprint analysis, where the autoradiogram pointed to an area of protection from DNase I digestion and some sites of enhanced sensitivity (Fig. 3, lanes 2 and 3). The area protected by the isolated factor spanned exactly the same sequence that had previously been observed with crude nuclear extracts from NIH-3T3 cells (19), containing the sequence 5'-TCGCN4GCCAA-3' on the anti-sense strand.
Portions of this paper (including "Materials and Methods," part of "Results," additional references, Table I (Suppl.), and Figs. 1-5) are presented in miniprint at the end of this paper. Miniprint is easily read with the aid of a standard magnifying glass. Full size photocopies are included in the microfilm edition of the Journal that is available from Waverly Press.
FIG. 1. ... α2(I) collagen promoter binding factor. Panel A demonstrates the purity of the factor. After purification using heparin-agarose chromatography and two successive chromatographies on the recognition site affinity matrix, the factor was subjected to electrophoresis on a SDS-polyacrylamide gel (12%). Lane 1 contains 1 µg of protein and lane 3, 5 µg. Lane 2 contains only the sample buffer, thus demonstrating that the bands at molecular weight about 55,000-66,000 originate from the sample buffer. Panel B demonstrates the similarity of the purified factor to bovine histone H1. The two protein preparations were subjected to electrophoresis on a glycerol-SDS-polyacrylamide gel (15%). Lane 2 contains 2 µg of histone H1 (from Boehringer Mannheim) and lane 3 the α2(I) collagen promoter binding factor. The bands were visualized using silver staining. The molecular weight standards (lanes 1 and 4 in panel B) from top to bottom are phosphorylase b, bovine serum albumin, ovalbumin, carbonic anhydrase, soybean trypsin inhibitor, and lysozyme. The major components of the purified factor (molecular weights 30,000-35,000 and 20,500) are marked by the arrows. BPB, bromphenol blue.
It is evident that the factor isolated here binds both to the sequence represented in the mouse α2(I) collagen promoter and to the consensus sequence for NF I binding sites (Fig. 4a), although it seems to have a much higher affinity for the latter. Dissociation constants for various binding sequences were determined using the nitrocellulose filter binding assay and Scatchard plot analysis (Fig. 4b).
A dissociation constant of 7 nM was obtained both for the PvuII-PvuII fragment from plasmid pBSla and for the BglII-TaqI fragment from plasmid pAZ1009. Both of these represent the mouse α2(I) collagen binding site. A dissociation constant of 0.45 nM was obtained for the PvuII-PvuII fragment from plasmid pNFlb, which is the same size as the corresponding fragment from plasmid pBSla and contains the consensus sequence for NF I binding sites. No binding was observed to the PvuII-PvuII fragments from plasmids pUC13 and pUC19, which are the parent plasmids of pNFlb and pBSla, respectively.
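The Scatchard treatment behind these dissociation constants can be illustrated with a minimal Python sketch. It assumes a single class of independent binding sites, for which bound/free = (Bmax - bound)/Kd, so that a straight-line fit of bound/free against bound gives Kd from the slope and Bmax from the x-intercept. The binding data are synthetic, generated from the 0.45 nM constant quoted above and a hypothetical Bmax; the function names, the free concentrations and the Bmax value are illustrative assumptions and are not taken from the original methods.

```python
# Scatchard analysis sketch: estimate Kd and Bmax from binding data.
# The data are SYNTHETIC (generated from an assumed Kd and Bmax);
# they are not measurements from the paper.

def simulate_binding(free, kd, bmax):
    """Bound ligand for one class of independent sites (simple isotherm)."""
    return [bmax * f / (kd + f) for f in free]

def scatchard_fit(bound, free):
    """Least-squares line through the points (bound, bound/free).

    For one class of sites, bound/free = (Bmax - bound) / Kd,
    so slope = -1/Kd and the x-intercept equals Bmax.
    """
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = intercept * kd  # x-intercept, i.e. -intercept / slope
    return kd, bmax

KD_CONSENSUS_NM = 0.45   # NF I consensus site, value quoted in the text
KD_COLLAGEN_NM = 7.0     # collagen promoter site, value quoted in the text
ASSUMED_BMAX_NM = 0.1    # hypothetical, chosen only for illustration

free_nm = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]  # hypothetical free concentrations
bound_nm = simulate_binding(free_nm, KD_CONSENSUS_NM, ASSUMED_BMAX_NM)

kd_est, bmax_est = scatchard_fit(bound_nm, free_nm)
affinity_ratio = KD_COLLAGEN_NM / KD_CONSENSUS_NM

print(f"fitted Kd = {kd_est:.3f} nM, fitted Bmax = {bmax_est:.3f} nM")
print(f"ratio of the two reported constants = {affinity_ratio:.1f}")
```

Because the synthetic data are noise-free, the fit simply recovers the input constants; with real filter binding data the scatter around the line would indicate how well the single-site assumption holds. The ratio of the two reported constants corresponds to a roughly 16-fold difference in affinity.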
The Purified Protein Is Identical to Histone H1-The amino acid composition of the purified protein, especially its richness in lysine and alanine, suggested a possible similarity to histone H1 variants. The amino acid compositions of the two proteins were remarkably similar (Table I) ...
FIG. 2. Sephacryl S-300 chromatography of the purified factor. The purified factor (20 µg) was chromatographed on a Sephacryl S-300 matrix. The fractions were subjected to determination of DNA binding activity using the 32P-labeled PvuII-PvuII fragment from plasmid pNFlb (-O-) or to spectrophotometric determination of protein (-0-). The molecular weight standards were catalase (240,000), bovine serum albumin (BSA) (66,000), and pepsin (35,000).

TABLE I
Amino acid composition of the purified α2(I) collagen promoter binding factor and histone H1
The molecular weight 30,000-35,000 protein was eluted from a SDS-polyacrylamide gel and subjected to acid hydrolysis and determination of amino acid composition. The values are means of two determinations. ND, not determined. The number of amino acid residues per chain was calculated assuming that the protein consists of 220 amino acid residues. The composition of rat NF I was then compared with those of the chicken histone H101 (27)

Total  ~220  218  224  ~220
... species than the other members of the histone gene family (see Ref. 23). Thus the differences that were observed between a selection of previously published amino acid compositions of histone H1 and that of the protein in the present preparation may be due to variations between different species and variants.
FIG. 3. ... collagen promoter. The BglII-TaqI fragment from plasmid pAZ1009 was labeled at the BglII site using T4 polynucleotide kinase. The anti-sense strand of the α2(I) collagen promoter was thus labeled. The 5'-end-labeled fragment was then incubated in the absence (lanes 1 and 4) or presence (lanes 2 and 3) of the purified factor (250 ng), treated with DNase I, and the digestion products were analyzed on a sequencing gel. An autoradiogram of the gel is shown. The Maxam-Gilbert sequencing reactions from the same fragment were run simultaneously. The sequence TCGCN4GCCAA can be found on the opposite strand inside the protected area. Thick arrows denote protection and narrow arrows enhancement.
Each of the five or six histone H1 polypeptide chain variants encountered in all mammals has a molecular weight of about 20,500, but the protein features many posttranslational modifications such as the amino-terminal N-acetylserine (24), several poly(ADP-ribose) moieties linked to glutamates (25) and phosphates in serines, threonines, and lysines (26). These modifications are likely to increase the real molecular weight of the protein quite considerably. The NF I preparation shows amazingly similar mobility in a SDS-polyacrylamide gel to that of a bovine histone H1 preparation (Fig. 1B), to the extent that these two preparations are indistinguishable. The molecular weight 20,500 component may be a degradation product of H1. The higher molecular weight form of the α2(I) collagen promoter binding factor can be labeled by incubating nuclei in the presence of 32P-labeled NAD, thus suggesting that it contains poly(ADP-ribose) (data not shown).
FIG. 4. Scatchard plot analysis of binding of the purified factor to various DNA fragments. Panel A shows an autoradiogram of the gel retardation assay, to demonstrate the differences in binding affinities. Lanes 1 and 2 contain the PvuII-PvuII fragment from plasmid pNFlb, labeled at the 5'-ends using T4 polynucleotide kinase, and lanes 3 and 4 contain the PvuII-PvuII fragment from plasmid pBSla labeled in the same way. These fragments are of approximately the same size (330 and 320 base pairs, respectively). Lanes 5 and 6 contain the BglII-TaqI fragment from plasmid pAZ1009 labeled at the BglII site. This fragment is shorter than the others (approximately 120 base pairs). The fragments were incubated in either the absence (lanes 1, 3, and 5) or presence (lanes 2, 4, and 6) of the purified DNA binding factor (120 ng). Equimolar concentrations (0.5 nM) of the labeled fragments were used. The specific radioactivities were 760, 1420, and 640 cpm/fmol for the fragments from plasmid pNFlb, pBSla, and pAZ1009, respectively. Panel B shows a typical Scatchard plot analysis. A nitrocellulose filter binding assay was used to quantify the binding. Increasing amounts of the labeled fragment (PvuII-PvuII from plasmid pNFlb) were incubated in the presence of a fixed amount of the purified factor (1.5 ng). The amount of the labeled fragment retained by the filter is shown. The inset shows a bound versus bound/free transformation of the data. A dissociation constant of 0.43 nM can be calculated for the NF I consensus binding site. The maximum binding in the assay is 9 fmol. If it is assumed that the factor binds in the manner of a molecular weight 64,000 dimer, the amount of purified factor added to the reaction must be 0.65 ng. Thus, the preparation is at least 40% pure.
In order to acquire additional support for the identity of the two proteins, the liver nuclear factor I and bovine histone H1 preparations were digested with Staphylococcus aureus V8 protease and the peptides analyzed on a glycerol-SDS-polyacrylamide gel. Both preparations gave rise to two major groups of peptides of molecular weight about 12,000 and 14,000 (Fig. 5A) ... carboxyl terminus at about 220 (from Ref. 27). Other peptides obtained were more variable between the two preparations, apparently due to inter-species variations in glutamate residues in the amino-terminal part of the molecules (23).
Use was made of the ability of N-bromosuccinimide to destroy tyrosine residues in the polypeptide chain (28), in order to demonstrate a further similarity between the two proteins. The main bovine histone H1 variants have only a single tyrosine residue (the location of which corresponds to amino acid residue 71 in chicken H101, from Ref. 27), so that chemical degradation should give rise to peptides two-thirds and one-third of the size of the original polypeptide. Bovine histone H1 and the α2(I) collagen promoter binding factor from the rat gave rise to very similar peptide patterns, as demonstrated by analysis on a glycerol-SDS-polyacrylamide gel (Fig. 5B). The mobilities of the major peptides were in good agreement with cleavage of the molecules at a single site, as expected.
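The expected "one-third and two-thirds" fragments follow from simple arithmetic, which the following Python sketch makes explicit. It assumes a chain of 220 residues with its single tyrosine at residue 71 (the figures quoted in the text for chicken H101), a polypeptide molecular weight of 20,500, and a uniform mean residue mass; the function name and the uniform-mass simplification are illustrative assumptions, and observed gel mobilities need not match these nominal masses.

```python
# Nominal fragment sizes for cleavage of a polypeptide at a single site.
# Illustrative assumptions: 220 residues, cleavage after residue 71,
# molecular weight 20,500, and the same mean mass for every residue.

def single_site_fragments(chain_length, cleavage_residue, molecular_weight):
    """Return the N- and C-terminal fragments produced by one cleavage.

    Each fragment is a dict with its residue count, its fraction of the
    chain, and an approximate mass based on a uniform mean residue mass.
    """
    if not 0 < cleavage_residue < chain_length:
        raise ValueError("cleavage site must lie inside the chain")
    mean_residue_mass = molecular_weight / chain_length
    fragments = []
    for n_residues in (cleavage_residue, chain_length - cleavage_residue):
        fragments.append({
            "residues": n_residues,
            "fraction": n_residues / chain_length,
            "mass": n_residues * mean_residue_mass,
        })
    return fragments

n_term, c_term = single_site_fragments(220, 71, 20500)

for name, frag in (("N-terminal", n_term), ("C-terminal", c_term)):
    print(f"{name}: {frag['residues']} residues, "
          f"{frag['fraction']:.2f} of the chain, ~{frag['mass']:.0f} Da")
```

Under these assumptions the two fragments comprise 71 and 149 residues, i.e. about 0.32 and 0.68 of the chain, which is the one-third to two-thirds split referred to above.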
Immunological studies were performed to demonstrate that the molecular weight 30,000-35,000 protein in the preparation of the α2(I) collagen promoter binding factor is histone H1. Antibodies raised in rabbits to bovine histone H1 recognize both bovine and rat histone H1s that were isolated using the conventional methods and the molecular weight 30,000-35,000 bands in the preparation of the α2(I) collagen promoter binding factor (Fig. 6A). The antibodies apparently have equal affinities for histone H1 and the purified factor, as tested by using equal amounts of the proteins in dot blot immunoassay. From crude cell extracts the antibodies did not recognize any bands, and denaturation by boiling or an SDS-treatment does not affect the antigenicity (data not shown). In addition to the molecular weight 30,000 to 35,000 bands, the antibodies to histone H1 recognize in the preparation of the α2(I) collagen promoter binding factor bands of slower mobility (Fig. 6A, lane 4). Staining was observed at molecular weight 52,000-66,000, at 92,000, and at 160,000 by the globular standards. Staining of the higher molecular weight forms varies between different preparations of the factor and they probably represent covalently cross-linked polymers of histone H1. The cross-linking apparently occurs during the ...
Histone H1 Binds to the NF I Recognition Sequence-In the light of the above data, it is evident that the molecular weight 33,000 protein in the preparation of α2(I) collagen promoter binding factor is identical to histone H1. Although the protein was more than 90% pure, it is not unequivocally the case that the molecular weight 33,000 protein must be that which binds to the specific sequence. Methylation interference experiments clearly demonstrate that the proteins which cause retardation in the gel shift assay really do bind to the putative NF I recognition sequence in the α2(I) collagen promoter (Fig. 7A). In the retarded band the signals for the two adjacent G residues complementary to CC in the GCCAA sequence were clearly missing. On the other hand, the preparative gel retardation assay combined with SDS-polyacrylamide gel electrophoresis shows that the molecular weight 33,000 protein in fact causes such retardation (Fig. 7B, lane 2).
In order to demonstrate that the protein which binds to the specific DNA sequence is recognized by the histone H1 antibodies, antiserum was added during the binding incubation in the gel retardation assay. An extra band of slow mobility was observed when the factor and DNA fragments were incubated in the presence of antibodies to histone H1 and this band was not detectable when pre-immune antiserum was used (Fig. 6B, lanes 4 and 6).
FIG. 7. Proteins which migrate at molecular weight 30,000 to 35,000 in a SDS-polyacrylamide gel bind to a specific sequence in the mouse α2(I) collagen promoter. Panel A shows that methylation of guanosines in the putative NF I recognition sequence interferes with DNA binding of the protein which causes the retardation in the gel shift assay. The BglII-TaqI fragment from plasmid pAZ1009 was 5'-end-labeled at its BglII site, and the guanosine residues methylated with dimethyl sulfate under rate-limiting conditions. The methylated fragment was subsequently subjected to a gel retardation assay using the purified α2(I) collagen promoter binding factor. The retarded and unretarded fragments were eluted from the gel and subjected to chemical cleavage with piperidine, and to electrophoresis on a 6 M urea-polyacrylamide sequencing gel. The Maxam-Gilbert sequencing reactions were run simultaneously. Lane 1 shows the G reaction pattern obtained from the unretarded band and lane 2 that from the retarded one. The putative NF I recognition sequence is delineated by open triangles, and arrows denote guanosines, the methylation of which affects the binding. Panel B shows an electrophoretic analysis of the protein that causes retardation in the gel shift assay. The gel retardation assay was performed using the 5'-end-labeled BglII-TaqI fragment from plasmid pAZ1009 and the purified α2(I) collagen promoter binding factor. The retarded band was eluted from the gel and subjected to electrophoresis on a SDS-polyacrylamide (15%) gel. The staining procedure and molecular weight standards are the same as in Fig. 1. Lane 2 shows staining pattern of proteins eluted from the retarded band and lane 2 that of proteins of identical mobility from a control sample that was run simultaneously on the retardation gel in the absence of DNA. The proteins of the molecular weight 30,000-35,000 are marked by an arrow. BPB, bromphenol blue.
J. Ristiniemi ...
DISCUSSION
A group of proteins which bind to a putative NF I recognition sequence around -300 with respect to the start of transcription in the mouse α2(I) collagen promoter are isolated here. Nuclear extract from rat liver was subjected to heparin-agarose and two successive DNA recognition site affinity chromatographies (20). Purification of DNA binding activity was followed by two assays: a DNA gel retardation assay (19, 29-31) and a nitrocellulose filter binding assay (20). 160 µg of the affinity purified proteins was obtained from 100 g of rat liver. A 1700-fold purification over the rat liver nuclear extract was achieved, with a 58% recovery of the DNA binding activity.
The protein preparation obtained consisted of five to six components, ranging in molecular weight from 30,000 to 35,000. Surprisingly, the amino acid composition of the factor was very similar to that of histone H1 (see Ref. 32). The purified rat factor and bovine histone H1 had identical mobilities in a SDS-polyacrylamide gel and gave rise to both V8 protease and N-bromosuccinimide peptide maps which very closely resembled each other. Furthermore, the antibodies raised to bovine histone H1 recognized the α2(I) collagen promoter binding factor and interfered with its binding to DNA. It was demonstrated using methylation interference (19) and preparative gel shift assay (33) that the same protein from the preparation also binds to the putative NF I recognition sequence. Thus, evidently histone H1 binds to the NF I recognition sequence in the mouse α2(I) collagen promoter.
The band pattern recognized by the histone H1 antibodies in the preparation of the promoter binding factor was slightly different from that of histone H1. However, a similar pattern is obtained after chemical acetylation of histone H1 by acetic anhydride.³ It is therefore evident that the molecular weight 33,000 protein in the preparation of the α2(I) collagen promoter binding factor is histone H1.
The eukaryotic genome is packed into nucleosome structures in the chromatin consisting of various kinds of histone protein. Histone H1 binds to nucleosome linker regions, probably sealing the DNA that is wrapped two turns around the core histones (34-38). From four to six variants of histone H1 exist in each mammalian species (39-45), exhibiting tissue and developmental stage-specific expressions (36, 39, 40, 46-51). The function of histone H1 has remained unelucidated. It has been hypothesized that it packs the chromatin into inactive condensed form (52-60), thus acting as a general repressor (61). This hypothesis is supported by the finding that active chromatin contains less histone H1, or else histone H1 is only loosely bound to DNA (62, 63). In addition, some results suggest that the nucleosome structure is clearly different in actively transcribed areas (64).
The factor purified here behaved in solution like a molecular weight 60,000 protein and could thus be a dimer of two identical subunits. This is consistent with previous studies which show that NF I recognizes a sequence that displays an apparent dyad of symmetry (21, 22). Interestingly, purified histone H1 also has a tendency to form aggregates (25). It is not known what the functional complex in tissues is, but it is possible to cross-link two histone H1 molecules in vivo (28), suggesting that dimers may really exist in the nucleus.
On the other hand, no one has ever been able to show that histone H1 binds to a specific sequence in DNA. There are some hints to the effect that it could be a sequence-specific DNA binding protein, the most convincing one being the constant pattern of nucleosomes encountered in vivo (see Ref. 65), but it seems to be very difficult to reconstitute this pattern in vitro, probably because the factors determining nucleosome positioning are not known. Interestingly, we have recently demonstrated that the globular domain of the histone H1 molecule displays homology with the nucleotide binding domain of adenine nucleotide binding proteins (66), thus suggesting that the histone H1 molecule could possess domains that would confer on it sequence specificity for binding. It nevertheless still remains to be explained why no one has ever noticed this characteristic of histone H1 before. One explanation might be that the conventional purification procedures used for histone H1 (67) include denaturing perchloric acid extraction and precipitation with acidified acetone.
It has previously been demonstrated that NF I will bind to several viral DNAs such as adenovirus (68), mouse mammary tumor virus-long terminal repeat (69), BK virus (69), herpes simplex virus (70) and human cytomegalovirus DNAs (71), as well as to cellular genes such as the lysozyme (72), c-myc (73) and α2(I) collagen genes (19). Several reports have come out recently on the purification of this factor. Originally, Nagata et al. (74) reported purification of a molecular weight 47,000 protein which was thought to be related to NF I activity. Later on, Kelly and Rosenfeld (20) reported purification of a factor which ranged in molecular weight from 52,000 to 66,000, and Diffley and Stillman (75) that of a factor which migrated at about 160,000 by globular standards. Very recently, purification of a protein which had NF I-like activity and a molecular weight of from 30,000 to 35,000 was reported by Rupp and Sippel (33). The factor we have purified is probably identical to the last of these. It is difficult to explain the discrepancy between the molecular weights obtained, although the factor we have purified has a tendency to aggregate. Like those of histone H1 (25), NF I aggregates are apparently not solubilized by the SDS treatment that is usually performed prior to polyacrylamide gel electrophoresis.
It is possible that the variation in the molecular weights of the NF I preparations may at least partly be due to differences between species, since Rupp and Sippel used chicken liver, we used rat liver and the other groups used cultured human HeLa cells. There is an argument against this possibility, however, namely that the molecular weight 33,000 protein can also be obtained from cultured human skin fibroblasts.³ It is more conceivable that this protein has been conserved throughout evolution in all species as is the case with histone H1 (see Ref. 27). Even so, there is still a possibility that a single species may possess multiple proteins that have NF I-related activity, and that we have purified one member of that gene family.
The dissociation constants of the present factor for binding to the NF I site in the α2(I) promoter and to the NF I consensus sequence seem to be high, thus suggesting low affinity binding. Lower values have previously been reported (20). It should be noted, however, that the dissociation constants depend on the non-specific DNA used in the assay. Therefore, it is conceivable that NF I, i.e. histone H1, may have some affinity for the poly(dI-dC)·poly(dI-dC) used here and that this interaction obviously interferes with the sequence-specific binding.
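The dependence of a measured dissociation constant on the non-specific DNA can be made concrete with the standard relation for simple competitive binding, in which an excess of competitor at concentration [C] with dissociation constant Kc raises the apparent constant to Kd(1 + [C]/Kc). The Python sketch below applies this relation; the competitor concentration and Kc are hypothetical values chosen only to show the direction and size of the effect, and it is an assumption, not a finding of this work, that poly(dI-dC)·poly(dI-dC) behaves as a simple competitor.

```python
# Apparent vs. intrinsic dissociation constant under competitive binding.
# Relation used: Kd_apparent = Kd_true * (1 + [C] / Kc), valid when the
# competitor is in excess and binds the same site on the protein.
# The competitor concentration and Kc below are HYPOTHETICAL.

def apparent_kd(true_kd, competitor_conc, competitor_kd):
    """Apparent Kd measured in the presence of a competitor."""
    if true_kd <= 0 or competitor_kd <= 0 or competitor_conc < 0:
        raise ValueError("constants must be positive, concentration >= 0")
    return true_kd * (1.0 + competitor_conc / competitor_kd)

def true_kd_from_apparent(measured_kd, competitor_conc, competitor_kd):
    """Invert the relation to recover the competitor-free Kd."""
    if measured_kd <= 0 or competitor_kd <= 0 or competitor_conc < 0:
        raise ValueError("constants must be positive, concentration >= 0")
    return measured_kd / (1.0 + competitor_conc / competitor_kd)

MEASURED_KD_NM = 0.45        # consensus-site value quoted in the text
COMPETITOR_CONC = 10.0       # hypothetical, arbitrary units
COMPETITOR_KD = 5.0          # hypothetical, same units as the concentration

corrected = true_kd_from_apparent(MEASURED_KD_NM, COMPETITOR_CONC, COMPETITOR_KD)
print(f"measured {MEASURED_KD_NM} nM -> competitor-free {corrected:.2f} nM")
```

With these illustrative numbers the competitor inflates the apparent constant threefold, so constants measured with different non-specific DNAs, or different amounts of it, are not directly comparable.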
NF I has been assigned two different functions. Originally it was found to promote replication of adenovirus (68, 76, 77), but recently it has also been shown to enhance transcription of some eukaryotic genes directed both by viral and host cell promoters (70, 78). The herpes simplex virus-thymidine kinase gene promoter contains a canonical CCAAT element around -80 from the start of transcription which is required for optimal expression of the gene (70). It has been proposed that this CCAAT binding factor may be NF I, and likewise NF I has been shown to be able to enhance transcription of other genes containing the CCAAT element, such as the β-globin gene. On the other hand, it has also been demonstrated that the steroid-dependent transcriptional stimulation obtained under the control of mouse mammary tumor virus-long terminal repeat is abolished when an NF I binding site around -80 in the long terminal repeat is deleted (78). This may mean that an interaction between the two DNA sequence-specific transcription factors, i.e. steroid receptors and NF I, is needed for optimal stimulation of transcription (79).
Deletion analysis of the α2(I) collagen promoter turned out to be complicated (15). It may be that the region around -300 to which the present purified factor binds is important for expression of the gene (15), although this was difficult to demonstrate. Indirect evidence was obtained when the cis-element was cloned in front of a minimal SV40 promoter (80), whereupon the element stimulated expression of a chloramphenicol acetyltransferase marker gene. On the other hand, the stimulatory effect of transforming growth factor-β has been demonstrated to be mediated via the same cis-element (80), and therefore it is obvious that the binding of a protein factor to this putative NF I binding site must have an important regulatory role.
The finding that histone H1 binds to a specific sequence in DNA, which has in fact been suggested previously (81), has interesting implications and raises a number of questions. Firstly, what is the role of histone H1 in determining nucleosome positioning, and is the NF I recognition sequence a determinant there? Secondly, how do the various effectors described above act through these linker histones? These seem to be some of the basic questions to be answered when elucidating mechanisms involved in the regulation of eukaryotic gene expression.
Recently, three reports have come out concerning the cloning of a protein that binds to the NF I recognition sequence (82-84). This protein was expressed in bacteria and it evidently stimulates transcription and replication. However, this protein is highly homologous with histone H1,³ and therefore we conclude that H1 and that other protein, which should be called NF I, bind to the same operator sites on the DNA.