Structure, Chromosomal Locus, and Promoter Analysis of the Gene Encoding the Mouse Helix-Loop-Helix Factor HES-1
NEGATIVE AUTOREGULATION THROUGH THE MULTIPLE N BOX ELEMENTS*

HES-1 is a mammalian helix-loop-helix factor structurally related to the Drosophila hairy and Enhancer of split proteins. It binds preferentially to the N box (CACNAG) rather than to the E box (CANNTG) and acts as a negative regulator. In this study, we have isolated and characterized the mouse HES-1 gene. This gene consists of four exons, and the positions of introns are well conserved when compared with those of the Drosophila hairy gene, except for the third intron. Southern blot and interspecies backcross analyses suggest that the mouse HES-1 gene is a single-copy gene and is located around position 26 on chromosome 16. The transcription initiation site, determined by S1 nuclease and primer extension experiments, is located 31 nucleotides downstream of a TATA box. In the 5'-regulatory region, there are four N box sequences, and DNase I footprinting and gel mobility shift analyses show that HES-1 binds to these sequences. Transient transfection assays using C3H10T1/2 cells suggest that there are several positive regulatory regions in the HES-1 gene. However, cotransfection of the HES-1 expression vector leads to ~40-fold repression in promoter activity. Furthermore,

The HES-1 protein, which shows the highest sequence homology to the Drosophila hairy gene product, is expressed in a wide variety of tissues of both embryos and adults (Feder et al., 1993). For example, a high level of HES-1 expression is observed in the epithelial cells of embryonal and adult respiratory and digestive organs. HES-1 is also expressed at a high level in embryonal muscles as well as in the ventricular zone of the embryonal nervous system, where neural precursor cells proliferate, but the expression decreases to a very low level in adult muscles and the differentiated nervous system. Thus, HES-1 expression is developmentally controlled in a tissue-specific manner.
Whereas most HLH proteins bind to the E box (CANNTG) (Blackwell and Weintraub, 1990), HES-1 preferentially recognizes a different consensus sequence, CACNAG (called the N box), over the E box. Transcriptional analyses show that HES-1 acts as a negative regulator by two different mechanisms: repression by directly binding to the N box and prevention of other HLH activators from binding to the E box. Furthermore, HES-1 efficiently antagonizes the transcriptional activity of MyoD, a muscle determination factor with an HLH domain, and inhibits MyoD-induced myogenesis. These results indicate that HES-1 may be involved in cellular differentiation.
In this study, to extend the molecular analysis of HES-1 further, we characterized the structure and chromosomal locus of the mouse HES-1 gene. We found that the structural features, such as the exon-intron boundaries, of the mouse HES-1 gene are very similar to those of the Drosophila h gene, suggesting that the two genes originated from the same or closely related ancestral gene. Transient transfection studies showed that there are several positive regulatory elements in the 5'-region of the HES-1 gene. Further analysis of the promoter region revealed that the HES-1 protein binds to its own promoter and negatively regulates its own expression.

EXPERIMENTAL PROCEDURES
Isolation of Mouse HES-1 Gene-The mouse genomic library (Stratagene) was screened by hybridization in situ as described previously (Takahashi et al., 1992). The 1.4-kilobase EcoRI fragment of the rat HES-1 cDNA was used as a probe. Three positive clones were obtained from 1 x 10^6 plaques. All of these clones contained a 3-kilobase EcoRI fragment that hybridized positively, and this fragment was subcloned into pBluescript and subjected to sequence analysis.
Southern Blot Analysis-Mouse liver DNA digested with restriction enzymes was electrophoresed on a 0.7% agarose gel and transferred to a nylon membrane filter. The 32P-labeled 1.1-kilobase PstI-EcoRI fragment of the mouse genomic fragment containing exon 4 was hybridized to the DNA at 65 °C in a solution containing 0.75 M NaCl, 0.075 M sodium citrate, and 0.5% SDS.
Interspecies Backcross Analysis-NOD mice and Japanese wild mice (MOL-MIT) were used for interspecies backcross analyses as described previously (Watanabe et al., 1989). More than 30 markers were used to search for the locus closely linked to the HES-1 gene, and the Smst locus (located at position 19 on chromosome 16) was found to be most closely linked to the HES-1 gene.
S1 Nuclease and Primer Extension Analyses-Reactions were carried out as described previously (Kageyama et al., 1987). For S1 nuclease analysis, the 451-nucleotide SmaI-PvuII fragment labeled at the PvuII site was hybridized to the mouse embryonal poly(A) RNA (20 µg) at 42 °C in a solution containing 80% (v/v) formamide and 0.4 M NaCl and treated with 200 units of S1 nuclease. For primer extension analysis, the 173-nucleotide Sau3AI-PvuII fragment labeled at the PvuII site was hybridized under the same conditions as described for S1 nuclease analysis and subjected to the reverse transcription reaction. The S1 nuclease and primer extension products were electrophoresed on a 7 M urea, 6% polyacrylamide gel.
Transient Transfection Analysis-Reporter plasmids contained the chloramphenicol acetyltransferase (CAT) gene under the control of various lengths of the HES-1 promoter. A mutation was introduced into the N box sequences in vitro as described previously (Vandeyar et al., 1988) by using three oligonucleotides: 5'-CGTcG&CCTAGCGGCCAATG-3', 5'-CACGCGGCACCGC=GGACTGCGCCCCCCC-3', and 5'-GCAC-GCGAACGGC~TGAAACITCCCCAAAC-3'. 7 µg each of the CAT reporter plasmids was transfected into C3H10T1/2 cells by the calcium phosphate precipitation method with or without 7 µg of the HES-1 expression vector, which directed HES-1 expression from the cytomegalovirus enhancer and promoter. The total DNA amounts were adjusted to 15 µg with the control vector pSV-CMV and 1 µg of the β-galactosidase expression vector. Cells were harvested after 48 h, and CAT activities were determined as described previously.
For DNase I footprinting analysis, the 379-nucleotide SphI-AccI fragment labeled at the AccI site, which contained the 5'-region of the mouse HES-1 gene, was used as a probe. DNase I footprinting reactions were carried out as described previously. For gel mobility shift analysis, the probes containing the region from nucleotides -165 to -1 with or without a mutation in the N box elements were used. Reactions were carried out as described previously.

RESULTS
Structural Organization of Mouse HES-1 Gene-Three genomic clones containing the mouse HES-1 gene were isolated by screening 1 x 10^6 plaques of a mouse genomic library with the rat HES-1 cDNA probe. Because all of these clones contained a 3-kilobase EcoRI fragment that hybridized positively to the HES-1 cDNA probe, the nucleotide sequence of this EcoRI fragment was determined (Fig. 1). Comparison with the full-length rat HES-1 cDNA sequence revealed that the mouse HES-1 gene consisted of four exons and that the EcoRI fragment contained the whole region of the mouse HES-1 gene (Fig. 1). All introns were located within the protein-coding region, and the sequences of the exon-intron boundaries all possessed the consensus splicing signal conforming to the GT-AG rule. The deduced amino acid sequence of mouse HES-1 showed a complete match in the B-HLH domain and 99% identity in the whole region to that of rat HES-1.
Comparison between the mouse HES-1 and Drosophila hairy genes showed that they had similar genomic organization, except that the mouse HES-1 gene had a third intron (Fig. 2A). The other introns were located within the B-HLH region (see Fig. 1), and their positions were well conserved between mouse and Drosophila (Fig. 2B).
Our recent studies showed that there are at least four genes structurally related to the HES-1 gene (Akazawa et al., 1992; Sasai et al., 1992). These genes are homologous to each other in the B-HLH domain, but exhibit no significant sequence homology outside the domain, except for the carboxyl-terminal Trp-Arg-Pro-Trp sequence. To determine whether other genes highly related to HES-1 outside the B-HLH domain exist, we performed Southern blot analysis. The HES-1 probe that did not contain the B-HLH domain was hybridized to mouse genomic DNA under high stringency conditions. As shown in Fig. 3, only a single band that matched the size of the mouse HES-1 gene was detected, suggesting that there is no gene highly related to HES-1 outside the B-HLH domain.
Interspecies Backcross Analysis for Assignment of Chromosomal Locus of HES-1 Gene-To determine the chromosomal locus of the mouse HES-1 gene, we next carried out interspecies backcross analysis using two strains of mice, NOD and MOL-MIT. Southern blot analysis showed that the two strains exhibited a restriction endonuclease fragment length variation of the HES-1 gene when digested with XbaI; NOD mice gave a 5.5-kilobase band, and MOL-MIT mice a 4-kilobase band (data not shown). Therefore, high molecular weight DNAs isolated from 68 backcross progeny, (NOD x MOL-MIT)F1 x NOD, were subjected to linkage analysis by Southern blot experiments. As shown in Fig. 4, the HES-1 gene was closely linked to the somatostatin (Smst) locus on chromosome 16. Thus, the locus of the HES-1 gene on chromosome 16 was further examined by the three-point cross-test using the Smst and Ets-2 loci, which are located 19 centimorgans (cM) and 59 cM from the centromere on chromosome 16, respectively (Green, 1989). Out of 68 progeny, only five mice showed recombination between the Smst and Hes-1 loci (7.4%), 27 mice showed recombination between the Ets-2 and Hes-1 loci (39.7%), and 30 mice showed recombination between the Smst and Ets-2 loci (44.1%). These results suggest that the mouse HES-1 gene is located ~7 cM distal from the Smst locus on chromosome 16 (26 cM from the centromere) (Fig. 4).
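The three-point cross arithmetic above can be retraced with a short calculation. The following is a minimal sketch, not the authors' procedure: it takes the recombinant counts reported in the text, assumes that over short intervals 1% recombination corresponds to roughly 1 cM, and uses the illustrative helper names `recombination_percent` and `middle_locus`, which are not from the original.

```python
# Recombinant counts reported in the text (68 backcross progeny scored).
TOTAL_PROGENY = 68
recombinants = {
    ("Smst", "Hes-1"): 5,
    ("Hes-1", "Ets-2"): 27,
    ("Smst", "Ets-2"): 30,
}


def recombination_percent(count, total=TOTAL_PROGENY):
    """Recombination frequency as a percentage of the progeny scored."""
    return 100.0 * count / total


def middle_locus(pair_counts):
    """Infer the middle locus of a three-point cross.

    The pair with the most recombinants marks the two outer loci,
    so the one remaining locus must lie between them.
    """
    outer = max(pair_counts, key=pair_counts.get)
    loci = {locus for pair in pair_counts for locus in pair}
    (middle,) = loci - set(outer)
    return middle


frequencies = {
    pair: round(recombination_percent(n), 1) for pair, n in recombinants.items()
}

# Assumption: treat percent recombination as map distance in cM.
SMST_POSITION_CM = 19  # distance of Smst from the centromere (Green, 1989)
hes1_position_cm = SMST_POSITION_CM + frequencies[("Smst", "Hes-1")]

print(frequencies)
print("middle locus:", middle_locus(recombinants))
print("estimated Hes-1 position (cM):", round(hes1_position_cm, 1))
```

Under these assumptions the Smst/Ets-2 pair shows the most recombinants, which places Hes-1 between them, and adding the Smst/Hes-1 distance to the Smst position gives a value close to the 26 cM quoted in the text.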

S1 Nuclease and Primer Extension Analyses of Transcription Initiation Site-To determine the transcription initiation site, we performed S1 nuclease and primer extension analyses (Fig. 5). The SmaI-PvuII fragment labeled at the PvuII site (7 nucleotides downstream of the translation initiation site) was used for S1 nuclease analysis. This analysis exhibited a single protected band of 255 nucleotides (lane 3), indicating that the 5'-terminus of the HES-1 mRNA is 248 nucleotides upstream of the translation initiation site. Primer extension analysis using the primer labeled at the same PvuII site demonstrated two bands of ~280 and 250 nucleotides (Fig. 5, lane 1). However, because the long-exposed autoradiograph also detected the former band in the negative control lane using tRNA (data not shown), this 280-nucleotide band was probably nonspecific. The latter band, with a size similar to that of the S1 nuclease-resistant band, on the other hand, was not detected in the negative control lane and therefore seemed specific. These results suggest that HES-1 transcription starts from the site ~250 nucleotides upstream of the translation initiation codon. Although primer extension analysis showed a rather broad band, S1 nuclease analysis gave a single band, corresponding to the site 248 nucleotides upstream of the translation initiation codon. Thus, this position was most likely a major transcription initiation site and was therefore designated as nucleotide 1 (see Fig. 1).
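The position of the 5'-terminus follows from the length of the protected fragment and the position of the labeled end. A minimal sketch of that subtraction is given here; the function name `upstream_distance` is illustrative, and the calculation assumes the label offset is counted in the same way as in the text.

```python
def upstream_distance(protected_length_nt, label_offset_nt):
    """Distance of the mRNA 5'-terminus upstream of the translation start.

    protected_length_nt: length of the S1 nuclease-protected fragment.
    label_offset_nt: distance of the labeled end downstream of the
        translation initiation site.
    """
    if protected_length_nt < label_offset_nt:
        raise ValueError("protected fragment ends downstream of the translation start")
    return protected_length_nt - label_offset_nt


# Values from the text: 255-nucleotide protected band, label 7 nucleotides downstream.
print(upstream_distance(255, 7))
```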
Sequence examination of the promoter region revealed that there is a TATA motif (tatatat) at nucleotide -31, which may direct precise transcription initiation (Fig. 1, boxed). A CAAT box is also present at nucleotide -151. Another feature is that there are four N box sequences: CACAAG (opposite strand) at nucleotide -165, CACGAG at nucleotides -132 and -58, and CACCAG (opposite strand) at nucleotide +16 (see Fig. 1).
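A search of this kind, for N box consensus sequences (CACNAG) on both strands, can be sketched as follows. This is an illustrative scan rather than the method used in the study; the function names and the toy sequence are assumptions, and the reported positions are 0-based indices on the input strand, not the nucleotide numbering of Fig. 1.

```python
import re

# N box consensus CACNAG; the lookahead also reports overlapping matches.
N_BOX = re.compile(r"(?=(CAC[ACGT]AG))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_n_boxes(seq):
    """List (start, strand, hexamer) for every CACNAG site on either strand.

    start is the 0-based index, on the input strand, of the leftmost base
    covered by the site; strand is "+" for the input strand and "-" for
    the opposite strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in N_BOX.finditer(seq)]
    for m in N_BOX.finditer(reverse_complement(seq)):
        # Convert the coordinate on the opposite strand back to the input strand.
        start = len(seq) - (m.start() + 6)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical toy sequence (not the HES-1 promoter): one site on each strand.
toy = "TTCACGAGTTCTTGTGAA"
for hit in find_n_boxes(toy):
    print(hit)
```

Because CACNAG is not its own reverse complement, a site is reported on only one strand, which mirrors the "opposite strand" annotation used in the text.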

DNA Binding Analysis of HES-1 Protein with HES-1 Promoter-Because the HES-1 promoter contained the N box sequences, target elements of HES-1 itself, we next performed a DNase I footprinting experiment to determine whether HES-1 binds to its own promoter. The HES-1 protein, expressed in Escherichia coli, was purified and renatured. As shown in Fig. 6A, E. coli-expressed HES-1 clearly bound to the upstream three N box sequences present at nucleotides -165, -132, and -58 (lanes 3 and 4) and weakly bound to the most downstream N box present at nucleotide +16 (lane 6). Weak protection was also observed in the region around nucleotide -10, which did not conform to the canonical N or E box consensus sequence (lane 6). We also carried out gel mobility shift analysis to confirm that HES-1 bound to the N box elements of the HES-1 promoter. As shown in Fig. 6B, the probe containing the wild-type promoter region from nucleotides -165 to -1 that had three N box elements exhibited a strong retarded band (lane 2). This binding was specific because the retarded band was competed well when poly(dI-dC) was replaced with the wild-type fragment (lane 3). Introduction of a mutation into one or two N box elements resulted in weaker binding (lanes 4-6). Furthermore, when all three N box elements were disrupted, no significant binding was detected (lane 7). These results suggest that HES-1 may interact with its own promoter through the multiple N box elements and may be involved in the regulation of its own expression.
Transcriptional Analysis of HES-1 Promoter-We next analyzed the promoter activity of the HES-1 gene by a transient transfection method. Reporter plasmids containing the CAT gene under the control of various lengths of the HES-1 gene promoter were transfected into C3H10T1/2 cells, which express a low level of HES-1 mRNA.
As shown in Fig. 7, the CAT plasmid containing the promoter region between nucleotides -2000 and +46 exhibited a high level of transcription (lane 1). Deletion of the sequence between nucleotides -2000 and -952 showed no significant change, whereas further deletion to nucleotide -883 resulted in a ~30% reduction (lane 7). Another deletion from nucleotides -751 to -583 also led to an additional ~30% reduction in activity (lane 11). Whereas further deletion to nucleotide -195 did not reduce the expression (nucleotides -195 to +46; lane 15), the promoter containing nucleotides -195 to -1 showed an addi-
lanes 1 and 2). S1 nuclease analysis was conducted by using the 451-nucleotide SmaI-PvuII fragment labeled at the PvuII site as a probe (lanes 3 and 4). 20 µg each of poly(A) RNA isolated from mouse embryo (lanes 1 and 3) and tRNA (lanes 2 and 4) was used. The specific product is shown by an arrowhead. The marker sizes (in nucleotides) are indicated on the left.
E. coli-expressed HES-1 protein: lanes 2 and 5, no protein; lane 3, 250 ng; and lanes 4 and 6, 500
The total DNA amounts were adjusted to 15 µg with the control vector pSV-CMV and 1 µg of the β-galactosidase expression vector. The CAT activity of the reporter plasmid containing the region between nucleotides -2000 and +46 of the HES-1 gene was taken to be 100 (lane 1), and relative activities were measured. Each value of relative CAT activities is the average of at least four independent experiments.
Because HES-1 binds to the N box sequences present in the HES-1 promoter (at nucleotides -165, -132, -58, and +16), we next cotransfected the HES-1 expression vector with the CAT plasmids to examine whether HES-1 regulates its own promoter activity. HES-1 expression was directed by the cytomegalovirus promoter and enhancer. As shown in Fig. 7, introduction of the HES-1 vector led to ~20-40-fold repression of the CAT activity of each reporter plasmid (lanes 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20), except for the one that showed a background level of expression (lane 22). These results show that HES-1 can negatively regulate its own promoter activity. The promoter containing nucleotides -195 to -1 or nucleotides -165 to -1 still exhibited 30-40-fold repression by HES-1 (Fig. 7, lanes 18 and 20), suggesting that the most downstream N box (at nucleotide +16) is not required for this negative regulation. To determine if the other N boxes are necessary for the negative regulation of HES-1 expression, we made several further deletion constructs. However, the promoter activities of these constructs were too weak to assess the negative regulation through the N box (data not shown). Therefore, we next introduced a site-directed mutation into the N box of the HES-1 promoter construct containing the region from nucleotides -165 to -1, which showed weak, but still significant activity. When the mutation was introduced, the basal activity slightly increased (~1.5-2-fold; data not shown), suggesting that endogenous HES-1 or a similar factor negatively regulates the expression through the N box. As shown in Fig. 8, whereas the wild-type promoter containing the region between nucleotides -165 and -1 exhibited 30-fold repression by HES-1, introduction of a mutation into one of the N box sequences resulted in 14-fold repression. When two N box elements were mutated, only 7-fold repression was observed. Furthermore, disruption of all three N box elements resulted in only 2-fold repression by HES-1. Thus, the number of N box elements correlated well with the repressor activity exhibited by HES-1 (Fig. 8).
Furthermore, the number of N box elements and -fold repression exhibited by HES-1 also paralleled the DNA binding affinity of HES-1 for the promoter (Fig. 8). Therefore, these results strongly suggest that the N box sequences play an important role in the negative autoregulation of the HES-1 gene.
mutation (underlined, shown by X) was introduced into the three N box sequences (at nucleotides -165, -132, and -58) of the HES-1 promoter as indicated on the left. The CAT plasmid containing either the wild-type or mutated HES-1 promoter (from nucleotides -165 to -1) was transfected into C3H10T1/2 cells with or without the HES-1 expression vector. -Fold repression of the CAT activities in the presence of the HES-1 expression vector is shown. Relative DNA binding affinity determined by gel mobility shift analysis (Fig. 6B) is also indicated.

DISCUSSION
Transcriptional Control of HES-1 Gene-In this study, we analyzed the structure, chromosomal locus, and regulation of the mouse HES-1 gene. Interestingly, this gene may be negatively regulated by its own product. The three N box sequences present in the HES-1 promoter (at nucleotides -165, -132, and -58) play an important role in this negative regulation. The number of N box elements correlated well with the repressor activity of HES-1: no N box showed only 2-fold repression, one N box showed 7-fold repression, two N boxes showed 14-fold repression, and three N boxes showed 30-40-fold repression. These data suggest that in the HES-1 promoter, the multiple N box elements act synergistically rather than additively in negative regulation. Previously, we showed that HES-1 negatively regulates transcription from the chimeric promoter consisting of the β-actin promoter and six repeats of the N box sequences. However, with this artificial promoter, HES-1 exhibited only 5-fold repression. Thus, these results suggest that the relative position of the N box to the promoter and/or the distance between the multiple N box elements may be important for optimal repression by HES-1.
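One way to see why these values read as more than additive is to compare them with a simple linear expectation. The sketch below is an illustration only and not an analysis from the study: it assumes a hypothetical additive model in which every N box contributes the same gain as the first one, uses the lower bound (30-fold) for three N boxes, and ignores experimental variation.

```python
# Fold repression by HES-1, keyed by the number of intact N boxes in the
# -165 to -1 promoter (values as summarized in the text).
fold_repression = {0: 2, 1: 7, 2: 14, 3: 30}


def additive_prediction(observed, n_sites):
    """Hypothetical additive model: each site adds the gain of the first site."""
    baseline = observed[0]
    per_site_gain = observed[1] - observed[0]
    return baseline + per_site_gain * n_sites


predictions = {n: additive_prediction(fold_repression, n) for n in fold_repression}
excess = {n: fold_repression[n] - predictions[n] for n in fold_repression}

for n in sorted(fold_repression):
    print(n, "N boxes: observed", fold_repression[n],
          "| additive model", predictions[n], "| excess", excess[n])
```

Under this assumed model, the observed repression with two and three N boxes exceeds the linear expectation, which is in line with the synergistic interpretation given in the text.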
A mutation introduced into the N box sequences results in a slight increase in the basal activity of the HES-1 promoter, suggesting that an endogenous factor also negatively regulates transcription through the N box. Because mutation of the N box sequences did not decrease the transcriptional activity, these negative regulatory sequences do not seem to overlap with the positive regulatory elements. Thus, it is unlikely that the negative regulation of the HES-1 gene through the N box results from competition with specific activators for the target sites. Rather, it may involve the inhibition of other factors, such as the general transcription factors, that do not recognize the N box. Several factors that negatively regulate transcription without competing with specific activators for overlapping target sites have been characterized (Licht et al., 1990; Johnson and Krasnow, 1992). For example, the Drosophila even-skipped protein represses transcription by acting early in preinitiation complex assembly (Johnson and Krasnow, 1992). This repression requires the proline-rich repression domain of the even-skipped protein (Han and Manley, 1993). Because HES-1 also has a proline-rich region downstream of the B-HLH domain, it would be interesting to see whether the proline-rich region of HES-1 is also necessary for the negative regulation through the N box.
nucleotide positions of the mouse HES-1 gene are shown on the left. B, gel mobility shift analysis. The probes contained the fragment including the promoter region between nucleotides -165 and -1 with (underlined, shown by X) or without a mutation in the N box elements as indicated on the left and above each lane. Each probe was mixed with 100 ng of the E. coli-expressed HES-1 protein (lanes 2-7). Reactions were carried out in the presence of either 100 ng of poly(dI-dC) (lanes 2 and 4-7) or 100 ng of the wild-type (wt) competitor containing the region between nucleotides -165 and -1 (lane 3). Lane 1 shows the probe only. Relative DNA binding affinity determined by the density of the retarded band (arrowhead) is indicated under each lane.
In gel mobility shift analysis, only a single retarded band was detected even though probes containing multiple HES-1-binding sites were used (Fig. 6B). When the amount of the HES-1 protein was increased, we observed another band with a much higher molecular size (data not shown). However, this complex probably resulted from nonspecific protein aggregation rather than from binding to the multiple N box sequences. Because excess amounts of the probes were used for the binding reaction, it is possible that, at most, only one site of the probes was recognized by the HES-1 protein. Another possibility is that the complex with multiple HES-1-binding sites could be unstable and could result in the dissociation of the protein during electrophoresis. HES-1 still weakly represses transcription from the mutated HES-1 promoter that lost all of the N box sequences. Because weak binding was detected in another region of the HES-1 gene that did not conform to the canonical N or E box, HES-1 may repress transcription through this site. It is also possible that HES-1 could inactivate a transcriptional activator by forming a nonfunctional heterodimer as in the case of inhibition of E47 by HES-1. Further studies such as the characterization of positive regulatory factors will be necessary to test these possibilities.
Negative autoregulation is an interesting feature in the understanding of the regulatory mechanisms of HES-1 gene expression. HES-1 is expressed at a high level in developing nervous and muscle tissues, but its level decreases as differentiation proceeds. Forced expression of HES-1 results in the inhibition of MyoD-induced myogenesis, indicating that down-regulation of HES-1 expression is required for normal muscle development. Our data described above raise the interesting possibility that HES-1 itself could take part in this down-regulation. If this is the case, the negative autoregulation in the course of differentiation may be a unique feature in contrast to the positive autoregulation of the muscle determination factor MyoD (Thayer et al., 1989). Further studies will be necessary to determine whether the HES-1 protein is involved in the down-regulation of HES-1 expression during development.
The promoter analysis results shown above also suggest that there are several positive regulatory regions in the 5'-part of the HES-1 gene. Although these regions do not contain typical regulatory elements such as the E box, some regions have GC-rich sequences, which could be recognized by the positive regulator Sp1 (Kadonaga et al., 1986).
Recent analysis shows that HES-1 expression is up-regulated at the transcriptional level by treatment with growth factors in the absence of new protein synthesis, suggesting that the HES-1 gene is an immediate early gene (Feder et al., 1993). However, a sequence identical or similar to that of the serum response element (GATGTCC(AT)6GGACATC) (Treisman, 1986) is not present in the promoter region. Instead, there is an AP1-like site at nucleotide -498 (TGACTCC), which could be responsible for the response to growth factors (Angel et al., 1987; Lee et al., 1987).

Structure and Chromosomal Locus of Mouse HES-1 Gene-The mouse HES-1 gene consists of four exons, and the positions of the introns in the B-HLH domain are well conserved between the mouse HES-1 and Drosophila hairy genes. These results strongly suggest that the two genes originated from the same or closely related ancestral gene.
Southern blot and interspecies backcross analyses indicate that the HES-1 gene is a single-copy gene and is located ~7 cM distal from the Smst locus on chromosome 16. This assigned locus of the HES-1 gene is very close to the belly spot and tail (Bst) locus (10 cM distal from the Smst locus) (Epstein et al., 1986). This mutant has pleiotropic anomalies such as a short kinked tail, ventral spotting, and white feet. Because HES-1 is expressed in these affected regions, we investigated whether the HES-1 gene is mutated in Bst mice. Although the positions of the two loci are very close to each other, no difference between Bst mice and their normal littermates was observed on Southern blot analysis. We also sequenced the B-HLH region amplified by the polymerase chain reaction, but did not find any mutation. These results suggest that the HES-1 gene may be different from the Bst gene, although the possibility of a small mutation, such as a point mutation, in regions other than the polymerase chain reaction-amplified portion has not yet been excluded. Thus, further studies will be necessary to determine the phenotypes of the HES-1 gene mutation. These studies should also help to further the understanding of the mechanism of cellular differentiation.