cys-3, the Positive-acting Sulfur Regulatory Gene of Neurospora crassa, Encodes a Sequence-specific DNA-binding Protein

cys-3, the positive-acting master sulfur regulatory gene of Neurospora crassa, turns on the expression of an entire set of unlinked structural genes which encode sulfur-catabolic enzymes. cys-3 encodes a protein of 236 amino acid residues and contains a potential bipartite DNA-binding domain which consists of a leucine zipper and an adjacent highly basic region. Gel band mobility shift and DNA footprint experiments were used to demonstrate that the CYS3 protein, expressed in Escherichia coli, binds to three distinct sites in the 5' upstream DNA of cys-14, the structural gene for sulfate permease II. The CYS3 protein also binds to one distinct sequence element upstream of the cys-3 gene itself, which suggests an autoregulatory role for this protein.
Two mutant CYS3 proteins, altered in the basic region of the DNA-binding domain, failed to bind to either the cys-14 or the cys-3 upstream recognition elements.
The sulfur control circuit of the filamentous fungus Neurospora crassa consists of a set of unlinked structural genes which specify various enzymes involved in sulfur metabolism. Synthesis of this entire family of sulfur catabolic enzymes, which includes two sulfate permease species, a methionine-specific permease, aryl sulfatase, choline sulfatase, and an extracellular protease, occurs only when cellular levels of sulfur become limited (1-4). The expression of these sulfur catabolic enzymes is controlled by two distinct regulatory genes. One of these, designated scon (for "sulfur controller"), acts in a negative fashion; mutants of scon express the sulfur-related enzymes in a constitutive fashion (5). The other sulfur regulatory gene, cys-3, appears to act in a positive fashion to turn on the expression of the entire set of sulfur-related structural genes (5, 7). Mutants of cys-3 lack all of the sulfur-related enzymes, whereas temperature-sensitive cys-3 mutants are devoid of these enzymes at 37 °C, but have a wild-type phenotype at 25 °C. The structural genes for sulfate permease II and aryl sulfatase, cys-14 and ars, respectively, encode mRNAs whose cellular content is highly regulated by the availability of sulfur and by both the cys-3 and scon control genes (8, 9). Thus, it appears that cys-14 and ars, and presumably all of the sulfur-related structural genes, are subject to transcriptional control. cys-3, the positive-acting sulfur regulatory gene, has been postulated to specify a regulatory protein which binds at DNA recognition sequences adjacent to each of the sulfur structural genes, thereby activating their expression (8). Expression of the cys-3 regulatory gene itself was found to be regulated by the scon gene and by sulfur derepression (7). Moreover, some evidence suggested that cys-3 is also subject to autogenous control (7).
The cys-3+ regulatory gene has been cloned and its entire nucleotide sequence determined (10); cys-3 appears to encode a protein composed of 236 amino acids which has homology to histone H1, the yeast GCN4 protein, and to the FOS oncogene product. It appears that a positively charged region of the cys-3 protein, in combination with an adjacent leucine zipper element, comprises a bipartite DNA-binding domain (10). Two cys-3 mutants were shown to cause substitutions for basic amino acid residues in the charged segment.
It was of considerable importance to investigate directly the possibility that the cys-3 regulatory gene encodes a sequence-specific DNA-binding protein.
We present results of gel band mobility shift experiments and DNA footprint studies which demonstrate that the cys-3 protein binds to three sites in the 5'-flanking DNA of the cys-14 structural gene. We found that the cys-3 protein also binds to a single site in the 5' upstream DNA of the cys-3 gene itself, adding support to the possibility that cys-3 is autogenously regulated.
... transformed into E. coli hosts BL21(DE3) and BL21(DE3)pLysS for expression of the cys-3 proteins. Single colonies were used to grow overnight cultures which were used to inoculate @O-fold dilution) 250 ml of medium. These cultures were incubated at 37 °C with shaking to an optical density at 600 nm of 0.5-1.0, when the inducer isopropyl-1-thio-β-D-galactopyranoside was added to a final concentration of 1 mM. The cultures were then shaken at 37 °C for 2 h, which was optimal for expression of the cys-3 protein. Expression and recovery of protein was best in host BL21(DE3)pLysS.

MATERIALS AND METHODS
The bacterial cells were disrupted by sonication (two 30-s pulses), and cell debris was removed by centrifugation at 12,000 rpm for 10 min. Nucleic acids were precipitated from the supernatant fluid with Polymin P and then the proteins were precipitated with ammonium sulfate (60% saturation), dissolved in buffer A (10 mM Tris-HCl, pH 7.0, 1 mM EDTA, 25 mM NaCl, and 10% glycerol), and dialyzed extensively against the same buffer.
Gel Band Mobility Shift Experiments-A set of DNA fragments from the 5'-flanking region of the cys-14 gene and of the cys-3 gene were prepared and radioactively labeled with [32P]dATP by filling in with Klenow fragment of DNA polymerase (16). Two 27-mer oligonucleotides were synthesized and hybridized to form a double-stranded oligonucleotide of 25 bp with the sequence ATGTTCGCTGATGCCATTCATTGAT (and its complement), with each oligonucleotide having two unpaired bases, CG, at the 5' end.
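The design of the synthetic duplex described above can be checked with a short script. This is only an illustrative sketch: it assumes the sequence quoted in the text is the top strand written 5' to 3', and the function and variable names are hypothetical, not taken from the original work.

```python
# Illustrative check of the synthetic site 1 duplex described in the text.
# Assumption: the quoted 25-nucleotide sequence is the top strand, 5'->3'.
CORE = "ATGTTCGCTGATGCCATTCATTGAT"  # paired region given in the text
OVERHANG = "CG"                      # unpaired bases at each 5' end

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_oligos(core, overhang):
    """Return (top, bottom) single strands, each written 5'->3'."""
    top = overhang + core
    bottom = overhang + reverse_complement(core)
    return top, bottom


top, bottom = build_oligos(CORE, OVERHANG)
print("top   :", top, len(top))
print("bottom:", bottom, len(bottom))
print("paired region:", len(CORE), "bp")
```

Each strand comes out 27 nucleotides long with a 25-bp paired region, which agrees with the "two 27-mer oligonucleotides" and "25 bp" stated in the text.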
Protein (0.3-0.6 μg, prepared as described above) was incubated with the 32P-labeled DNA fragments (approximately 1 ng) for 30 min at 25 °C in a total volume of 25 μl in binding buffer (12 mM HEPES, 4 mM Tris-HCl, pH 7.9, 50 mM KCl, 1 mM EDTA, 1 mM dithiothreitol, 0.3 mg/ml bovine serum albumin, 3 μg of poly(dI-dC), and 10% glycerol). The samples were loaded onto 4% polyacrylamide vertical gels in a low ionic strength buffer (3 mM sodium acetate buffer, pH 7.9, 6.7 mM Tris-HCl, 1 mM EDTA) and run for approximately 2 h with buffer recirculation at 30 mA until the free DNA fragments were near the bottom of the gel. The gel was transferred to Whatman No. 3MM paper, and autoradiographs were prepared with Kodak XAR-5 film at -70 °C.
DNA Footprinting-DNA footprints (DNase I protection experiments) were carried out with a modification of the procedure described by Desplan et al. (17). A cys-14 or cys-3 gene 5'-flanking DNA fragment (1-5 ng), 32P-labeled at one end, was incubated with E. coli extracts containing the expressed cys-3+ protein in binding buffer in 25 μl total volume at 25 °C for 30 min. After addition of 175 μl of dilution buffer (10 mM Tris-HCl, pH 7.5, 12 mM MgCl2, 2.5 mM CaCl2, 1 mM dithiothreitol, and 10% glycerol), the samples were placed on ice. Deoxyribonuclease I (25 ng) was added to each reaction and the mixtures were incubated on ice for 5 min, when 200 μl of stopping buffer (40 mM Tris-HCl, pH 8.0, 2 mM EDTA, 0.6 M NaCl) was added. After extracting each sample with phenol and with chloroform, the DNA was ethanol-precipitated, dried, and resuspended in DNA sequencing loading buffer. The samples were run on 6% polyacrylamide sequencing gels, and autoradiographed at -70 °C with Kodak XAR-5 film.
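The volumes given for the footprinting reaction imply the approximate working concentrations during and after DNase I digestion. The sketch below works through that arithmetic under stated assumptions: the volume of the added DNase I is taken as negligible, the binding reaction is assumed to contribute no MgCl2, CaCl2, or NaCl, and all names are hypothetical.

```python
# Approximate concentrations in the footprinting reaction.
# Assumptions: DNase I volume is negligible; the 25-ul binding reaction
# contributes no MgCl2, CaCl2, or NaCl of its own.
BINDING_UL = 25.0    # binding reaction volume
DILUTION_UL = 175.0  # dilution buffer added before digestion
STOP_UL = 200.0      # stopping buffer added after digestion


def diluted(stock_conc, vol_added, vol_total):
    """Concentration of a component after dilution to vol_total."""
    return stock_conc * vol_added / vol_total


digest_ul = BINDING_UL + DILUTION_UL  # volume during DNase I digestion
final_ul = digest_ul + STOP_UL        # volume after stopping buffer

mgcl2_mM = diluted(12.0, DILUTION_UL, digest_ul)
cacl2_mM = diluted(2.5, DILUTION_UL, digest_ul)
dnase_ng_per_ul = 25.0 / digest_ul
nacl_M = diluted(0.6, STOP_UL, final_ul)

print(f"digestion volume: {digest_ul:.0f} ul")
print(f"MgCl2 during digestion: {mgcl2_mM:.2f} mM")
print(f"CaCl2 during digestion: {cacl2_mM:.2f} mM")
print(f"DNase I during digestion: {dnase_ng_per_ul:.3f} ng/ul")
print(f"NaCl after stopping buffer: {nacl_M:.2f} M")
```

Under these assumptions the digestion takes place in 200 μl, so the divalent cations are present at slightly below their dilution-buffer concentrations, and the stopping buffer is diluted 2-fold in the final 400 μl.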

RESULTS
Expression of the cys-3 Protein-The wild-type cys-3 gene and two mutant genes were cloned into the expression vector pET-3b (18) as described under "Materials and Methods," and transformed into E. coli host strains. Fig. 1 reveals that the cys-3 wild-type and cys-3 mutant genes were expressed at high levels to give proteins of the expected size, representing easily the most abundant protein in the bacterial extracts. E. coli cells which contained the expression vector lacking the cys-3 gene did not produce this protein. The identity of the second codon had an important effect upon the level of protein expression, and changing this codon to one more optimal for expression in E. coli (19) led to approximately a 10-fold increase in the level of cys-3 protein (data not shown).
Gel Band Mobility Shift Experiments-We anticipated that one or more DNA recognition sites for the cys-3 protein might be situated upstream of the cys-14 gene. Gel mobility shift experiments (17, 20) were undertaken to investigate possible DNA binding by CYS3. Representative results are presented in Fig. 2. These and additional results summarized in Fig. 3 revealed that the mobility of specific cys-14 5'-flanking DNA fragments was markedly retarded when incubated with protein extracts containing the cys-3+ protein. However, when the same DNA fragments were incubated with protein extracts from E. coli cells containing the expression vector, but lacking the cys-3 gene, no retardation occurred in their mobility. Specificity for the DNA binding was evident in that the mobility of other 5' fragments was not affected by CYS3 (Fig. 3). Extracts containing the cys-3+ protein did not alter the mobility of 5'-flanking DNA segments of the nit-3 gene, whose expression is not regulated by the cys-3 gene (not shown).
Two mutant CYS3 proteins, encoded by cys-3 mutant genes which are incapable of turning on the expression of the various sulfur-related structural genes, were tested for the ability to bind to the cys-14 DNA fragments, using the mobility shift assay. Both of these cys-3 mutant proteins were completely deficient in DNA-binding (Fig. 2).
Some experimental evidence has indicated that the cys-3 gene might be subject to autogenous regulation, suggesting that the wild-type CYS3 protein might also bind to 5'-flanking DNA of the cys-3 gene itself. Results presented in Figs. 2 and 3 reveal that the cys-3+ protein does indeed bind to specific cys-3 upstream DNA fragments, resulting in a marked change in their mobility, whereas protein from control E. coli cells did not alter the mobility of the cys-3 DNA; specificity was obvious because three fragments which share a common ...
... Fig. 4 demonstrate the presence in the cys-14 upstream DNA of three distinct binding sites for the CYS3 protein, within which both protected regions and enhanced cleavages by DNase I are obvious. These three sites occur at -0.19, -0.95, and -1.4 kb (measured from the start codon for translation). The footprints representing the first two sites are each approximately 20 nb in length, whereas that for the most distant site is twice as long (48 nb), suggesting that it might comprise two adjacent binding sites (see Fig. 5).
DNA Footprinting with cys-3 DNA-The mobility shift experiments described above indicated that the CYS3 protein binds at one or more sites located upstream of the cys-3 gene. Fig. 4 presents the results of a DNA footprint experiment which demonstrated that the CYS3 protein binds at a single region centered at -230 upstream of the cys-3 gene. Both enhanced DNase I cleavages and protected regions were evident in the footprint.
The cys-3 gene upstream DNA sequence identified by the CYS3 protein footprint lacks dyad symmetry and, surprisingly, is quite long (52 nb), similar to the third (most upstream) site identified in the cys-14-flanking DNA (Fig. 5).
Oligonucleotide Binding by the CYS3 Protein-In order to further examine sequence-specific DNA binding by the CYS3 protein, we synthesized a double-stranded 25-mer whose central 19 nucleotide bases corresponded to the recognition sequence of site 1 identified by the DNA footprint experiments with the cys-14-flanking DNA (Fig. 5). Mobility shift experiments demonstrated that this oligonucleotide was bound by the CYS3 protein (Fig. 6). Moreover, the oligonucleotide also competed strongly for CYS3 binding with the cys-14 5' DNA fragment which contains site 1. The combined results presented above have led us to conclude that DNA-binding by CYS3 at site 1 upstream of cys-14, and presumably at the other sites, is sequence-specific.
Comparison of Binding Sites-The first two sites in the cys-14 upstream DNA recognized by the CYS3 protein are approximately 20 nb in length, whereas the third site and the site upstream of the cys-3 gene are twice as long. This suggested the possibility that the longer sites might actually comprise two adjacent binding sites, which together might show a higher affinity for the CYS3 protein.
Figure legend: The leucine zipper and immediately upstream charged region of cys-3 and other proteins appear to comprise a bipartite DNA-binding domain. In cys-3, a methionine is found in place of one leucine in the zipper region. Basic amino acids in the charged region are circled. Asterisks identify the two basic amino acids which are replaced by glutamine in cys-3 mutant proteins, which are incapable of binding to cys-14 or cys-3 5' DNA recognition elements.
... mobility, whereas all of the cys-14 fragments appear to be completely free; at higher concentrations of the CYS3 protein, nearly all of the cys-3 fragments have been retarded in the gel, whereas only a limited amount of the cys-14 fragments displayed a band shift (Fig. 6). The results suggest that the cys-3 DNA fragment binds the CYS3 protein with a higher affinity than does the cys-14 fragment.

DISCUSSION
The cys-3 positive-acting control gene encodes a regulatory protein which appears to turn on the expression of various sulfur-related structural genes. The cys-3 protein contains a well defined leucine zipper and an immediately adjacent upstream basic region, which together comprise a putative bipartite DNA-binding domain (10). The cys-3 protein was highly expressed in E. coli such that it easily represented the most abundant soluble protein. As noted previously (19), the second codon was critically important in obtaining a high level of expression; approximately a 10-fold increase in expression was achieved by substituting a favorable second codon for the usual one present at that position in the cys-3 coding region.
Expression of cys-14, the structural gene which encodes sulfate permease II, is completely dependent upon a functional cys-3 gene and upon relief from sulfur catabolite repression (8). The CYS3 protein binds to three distinct recognition elements located at approximately -0.19, -0.95, and -1.4 kb in the 5' DNA upstream of the cys-14 gene. The nucleotide sequence of site 1 has a limited dyad symmetry, primarily restricted to the heptameric sequence ATGCCAT which could comprise a core binding site; it is noteworthy that enhanced DNase I cleavages are found at each end of this motif. The right half of this sequence is immediately repeated (TCAT), and also contains an enhanced cleavage site.
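The limited dyad symmetry of the ATGCCAT motif can be illustrated with a few lines of code. This is a sketch under assumptions: it uses the 25-nucleotide site 1 oligonucleotide sequence given under "Materials and Methods" as a stand-in for the footprinted region, and the helper names are hypothetical.

```python
# Illustrative look at the dyad symmetry of the site 1 core motif.
# Assumption: the synthetic 25-mer from the methods represents site 1.
SITE1_OLIGO = "ATGTTCGCTGATGCCATTCATTGAT"
CORE = "ATGCCAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(seq, motif):
    """Return 0-based start positions of every (overlapping) motif match."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq.startswith(motif, i)]


core_start = SITE1_OLIGO.find(CORE)
left_half, centre, right_half = CORE[:3], CORE[3], CORE[4:]

# The flanking triplets are reverse complements of each other, but a
# 7-base motif cannot be fully palindromic because its central base
# would have to pair with itself.
halves_symmetric = reverse_complement(left_half) == right_half
fully_palindromic = reverse_complement(CORE) == CORE

# The four bases immediately after the core motif.
following = SITE1_OLIGO[core_start + len(CORE):core_start + len(CORE) + 4]

print("core starts at index", core_start)
print("half-sites", left_half, "/", right_half, "symmetric:", halves_symmetric)
print("fully palindromic:", fully_palindromic)
print("bases after core:", following)
print("CAT positions:", find_all(SITE1_OLIGO, "CAT"))
```

The check shows that the symmetry lies in the ATG and CAT triplets flanking a central C, which is one way to read "limited dyad symmetry", and that the core is followed directly by TCAT, as described in the text.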
It is intriguing that the recognition sequences for the CYS3 protein in the three sites upstream of the cys-14 gene and the single site upstream of the cys-3 gene show only limited sequence homology. Nevertheless, these sites do possess some important common features. The cys-14 sites 1 and 2 share a common hexanucleotide sequence, TTCGCT, whereas the third cys-14 site and the single cys-3 site have a common pentanucleotide sequence, GAGAA. It may be significant that in each case one strand in these homology blocks is purine-rich, the other strand pyrimidine-rich (Fig. 5). In most cases two such related homology blocks appear to flank a binding site, and, in particular, the distal half of the cys-3 binding region contains a perfect hexameric inverted repeat (Fig. 5). The much longer protected regions found for cys-3 and site 3 of cys-14 might each contain two adjacent binding sites, and it is noteworthy that each of these has additional nucleotide blocks with similar characteristics (Fig. 5). Another obvious feature found in all of the binding sites is the presence of at least one, but usually multiple, copies of the sequence CAT (or CAAT); these repeated CAT sequences provide a limited dyad symmetry which may represent the central core of a CYS3 binding site. McKnight and his colleagues (21) have suggested that leucine zipper DNA-binding (bZIP) proteins interact with directly abutted dyad-symmetric DNA sequences.
The four CYS3 binding sites, as defined by the footprints, possess only limited sequence identity. It is well established that some regulatory proteins can recognize two or more distinct DNA sequences; e.g. the yeast HAP1 activator protein binds to two upstream activation sites which are of different sequence, both of which lack any dyad symmetry (22). Moreover, the HAP1 activator competes with a second regulatory protein, RC2, for binding to the UAS1 site of the cyc-1 gene (23). Other proteins that recognize quite distinct nucleotide sequences include the glucocorticoid receptor, C/EBP, and octamer binding protein, OBP100 (24). It has also been established that some nucleotide sequences represent recognition elements for multiple trans-acting regulatory proteins (24). It is obvious that complex interactions of multiple regulatory proteins with DNA recognition elements play an important role in controlling eukaryotic gene expression.
It is of potential interest that the DNA footprint in the cys-3 upstream DNA and the third cys-14 site are each approximately 50 nb in length, i.e. twice the length of the first two cys-14 sites, which suggests that they may actually comprise two adjacent CYS3 binding sites. The footprints for these longer sites appear to reveal two protected regions, separated by a central region with enhanced cleavages. It appears possible that two or more CYS3 protein molecules might bind to these longer upstream binding regions, perhaps even in a cooperative manner. This possibility was also suggested by the appearance of multiple shifted bands in the gel retardation assays, which may result from different numbers of CYS3 protein molecules bound to the cys-3 upstream DNA fragment. Consistent with the possibility that cys-3 possesses two adjacent binding sites, we found that the cys-3 DNA fragment has a higher affinity for the CYS3 protein than did the DNA fragment carrying the shorter cys-14 site 1. However, this result must be interpreted with caution because of the marked differences in the nucleotide sequences of the recognition elements, and much additional work will be required to determine the fine structure of these binding sites.
The positive-acting cys-3 protein appears to be a member of a family of proteins which form dimers by virtue of a leucine zipper (or coiled coil) structure (25-29). The leucine zipper element plus an immediately upstream basic charged region together comprise a bipartite DNA-binding domain (Fig. 7). In regulatory proteins of this class, it has been shown that the leucine zipper is necessary for dimer formation and that the basic region is required for sequence-specific DNA binding (26, 29, 30). Results presented here demonstrated that mutant CYS3 proteins with amino acid substitutions in their basic region were incapable of binding to either the cys-3 or the cys-14 binding sites; these same mutants are nonfunctional in gene activation. These findings imply that these specific basic amino acid residues are crucial for the ability of the cys-3 protein to bind to the distinct sequence elements which lie upstream of each of these two genes. We expect that other amino acids within this basic charged region of the CYS3 protein are also involved in DNA binding.