The CGL2612 Protein from Corynebacterium glutamicum Is a Drug Resistance-related Transcriptional Repressor

The emergence of antibiotic-resistant bacteria often causes serious clinical problems. The TetR family is one of the major transcription factor families that regulate expression of genes involved in bacterial antimicrobial resistance systems. CGL2612 protein is a transcription factor newly identified by genomic DNA analysis of Corynebacterium glutamicum, which belongs to the mycolic acid-containing Actinomycetales, a group that includes the well known pathogens Corynebacterium diphtheriae and Mycobacterium tuberculosis. Crystal structure analysis showed that the CGL2612 protein exhibits significant structural similarity to the multidrug resistance (MDR)-related transcription factor QacR from Staphylococcus aureus, despite poor amino acid sequence similarity between these proteins. Analysis of the DNA sequence bound by CGL2612 protein using the systematic evolution of ligands by exponential enrichment (SELEX) method revealed that this protein is a new member of the TetR family that regulates expression of the immediately upstream gene, cgl2611, which probably encodes a major facilitator superfamily permease. Subsequent functional analyses confirmed that CGL2612 functions as a transcriptional repressor responsible for the antimicrobial resistance system in C. glutamicum. The strategy used in the present study is one of the most convenient and powerful methods to analyze functionally unknown transcription factors, and the results obtained here will contribute to our understanding of the drug resistance mechanism not only in C. glutamicum but also in the related bacteria, C. diphtheriae and M. tuberculosis.

ous structurally dissimilar compounds. The latter type of transporter can actively export a wide range of compounds, and these are called multidrug transporters (1). Bacteria that acquire such transport systems can cancel the effects of many kinds of drugs (multidrug resistance (MDR)). MDR has been shown to be widespread among both Gram-positive and Gram-negative bacteria (3), and the emergence of such bacteria causes nosocomial infection and often has adverse effects on health. Staphylococcus aureus is one of the bacteria that often develop MDR and can cause serious clinical problems. The TetR family of transcription factors is known to regulate expression of genes involved in bacterial efflux systems against antimicrobial compounds or drugs (4). Proteins that belong to this family are known to act as transcriptional repressors; they bind constitutively to their regulatory sequence and repress expression of target genes that are related to drug detoxification or export. When the cellular level of toxic compounds increases, the repressor alters its DNA-binding ability and dissociates from the regulatory sequence, resulting in activation of antibiotic resistance of the bacterium.
Corynebacterium glutamicum is widely used for industrial production of amino acids. The genome of the C. glutamicum wild-type strain ATCC13032 has been sequenced by Kyowa Hakko and is now available in the public domain. The well known pathogenic bacteria, Corynebacterium diphtheriae and Mycobacterium tuberculosis, are phylogenetically related to C. glutamicum. Analysis of C. glutamicum (3.3 Mbp), C. diphtheriae (2.5 Mbp), and M. tuberculosis (4.4 Mbp) genomes using the GTOP protein structure prediction data base (available on the World Wide Web at spock.genes.nig.ac.jp/~genome/gtop.html) (5) indicated that these genomes encode 16, 12, and 43 proteins that possess the TetR-type DNA-binding domain, respectively, but most of their functions are still unknown. Structural and functional analyses of such novel TetR family proteins found by genomic analysis promote the elucidation of drug resistance mechanisms in these bacteria and contribute to reduction of the threat of MDR. Here, we report the crystal structure and biological function of CGL2612 protein (177 residues, 20.3 kDa) from C. glutamicum as a transcriptional repressor of a putative major facilitator superfamily (MFS) drug efflux pump responsible for increasing resistance to antibiotics in bacteria.

Protein Preparation
The full-length (amino acids 1-177) cgl2612 gene was amplified by PCR using the C. glutamicum wild-type strain ATCC13032 genome as a template DNA and was inserted into the vector pET-26b (Novagen Inc., Madison, WI) at the NdeI and XhoI restriction enzyme sites. The 3′-end primer for cloning was designed for direct attachment of a His6 tag to the C-terminal end of CGL2612 protein via a two-amino acid linker, consisting of Leu and Glu (pET/CGL2612-His). The resulting plasmid was transformed into Escherichia coli BL21-Star (DE3) host cells (Stratagene, La Jolla, CA).
Protein expression and purification of both native and Se-Met-substituted proteins for crystallization experiments were performed using methods similar to those described previously (6).

Crystallization
Crystallization experiments were performed using the same strategy as described previously (6). The best native crystal was obtained under the following conditions: 0.08 M Tris-HCl (pH 8.3), 0.16 M MgCl2, 21% polyethylene glycol 400, and 20% glycerol. The dimensions of the crystal were 0.3 × 0.3 × 0.4 mm³, and this crystal belonged to the space group P2₁2₁2₁ with cell dimensions of a = 56.9 Å, b = 67.0 Å, and c = 101.2 Å, and the asymmetric unit contained two molecules of CGL2612 protein. The crystal volume per protein mass (V_M) (7) was 2.4 Å³ Da⁻¹ with a solvent content of 48.3%.
Se-Met-substituted crystals were also obtained under conditions similar to those used for the native crystals, and the best crystal was obtained under the following conditions: 0.08 M Tris-HCl (pH 8.3), 0.16 M MgCl2, 23% polyethylene glycol 400, and 20% glycerol. The dimensions of the crystal were 0.3 × 0.3 × 0.4 mm³, and this crystal also belonged to the space group P2₁2₁2₁ with cell dimensions of a = 58.2 Å, b = 67.9 Å, and c = 105.5 Å, and the asymmetric unit contained two molecules of CGL2612 protein. The crystal volume per protein mass (V_M) (7) was 2.6 Å³ Da⁻¹ with a solvent content of 52.1%.
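The V_M and solvent content values quoted above can be checked with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes the untagged protein mass of 20.3 kDa given in the text (the His-tagged construct would be slightly heavier), four symmetry operators for P2₁2₁2₁, and the commonly used approximation that the solvent fraction equals 1 − 1.23/V_M. The function names are illustrative.

```python
def matthews_coefficient(a, b, c, n_symops, n_per_asu, mass_da):
    """Return V_M (Å^3 per Da) for an orthorhombic cell with edges a, b, c in Å."""
    cell_volume = a * b * c  # orthorhombic: all angles are 90 degrees
    return cell_volume / (n_symops * n_per_asu * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from V_M (assumes the usual 1.23 constant)."""
    return 1.0 - 1.23 / vm


MASS_DA = 20300   # 20.3 kDa, untagged mass as stated in the text (assumption)
N_SYMOPS = 4      # symmetry operators in P2(1)2(1)2(1)
N_PER_ASU = 2     # two molecules per asymmetric unit

vm_native = matthews_coefficient(56.9, 67.0, 101.2, N_SYMOPS, N_PER_ASU, MASS_DA)
vm_semet = matthews_coefficient(58.2, 67.9, 105.5, N_SYMOPS, N_PER_ASU, MASS_DA)

for label, vm in (("native", vm_native), ("Se-Met", vm_semet)):
    print(f"{label}: V_M = {vm:.2f} A^3/Da, solvent = {100 * solvent_fraction(vm):.1f}%")
```

Under these assumptions the results agree with the reported values to within rounding; small differences are expected because the exact molecular mass used by the authors is not stated.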

Data Collection and Structure Determination
Data collection statistics are listed in TABLE ONE. Native data were collected at a wavelength of 0.9000 Å at 100 K by flash cooling on beamline BL38B1, SPring-8 (Hyogo, Japan). Single anomalous diffraction data were collected at a wavelength of 0.9800 Å from a Se-Met-substituted crystal at 100 K on beamline BL41XU in SPring-8. These data were processed using the HKL2000 program (8).
The structure of CGL2612 protein was determined using the single anomalous diffraction method. Its asymmetric unit contained two molecules of the protein, and six of the eight selenium sites were found using the program SOLVE (9). Using the initial phase calculated by SOLVE, phase improvement and automated model building were performed using the RESOLVE program (10,11). The program successfully built a model except for some residues in terminal parts or loop regions. The missing residues were built manually using the graphic program O (12). The phasing statistics are summarized in TABLE ONE.
Structure refinement was performed on the native data. The orientation and the position of molecules in the native crystal were determined by a molecular replacement method using the AMoRe program (13). The monomer model built on the single anomalous diffraction data was used as a search model. Model fitting and refinement were carried out automatically using the Lafire program (see Footnote 4) with the CNS program (15). Finally, manual model checking and fitting were carried out using O. Water molecules were found using CNS, and the final model had an R-factor of 20.8% and a free R-factor of 24.1% for data between 10 and 1.9 Å resolution. The final refinement statistics are given in TABLE TWO.

Identification of the CGL2612 Binding Site
Random SELEX Selection-To generate an initial random sequence DNA library, single-stranded 62-mer oligonucleotides (5′-CGGAATTCCGGTCGACCAGAAG-N16-TATGTGCGTCTACATGGATCCTCA-3′), containing a random core of 16 nucleotides, were constructed. Both ends (the first 22 nucleotides and the last 24 nucleotides) of the oligonucleotide were primer binding sites. A double-stranded oligonucleotide DNA pool was amplified by PCR using primers (primer PRS5′ and PRS3′) complementary to the 5′- and the 3′-primer binding sites, respectively. PCR amplification was performed with ExTaq DNA polymerase (Takara Bio Inc., Otsu, Shiga, Japan). The PCR products, a random sequence DNA library, were mixed with Ni2+-nitrilotriacetic acid resin (Qiagen Inc., Valencia, CA) preadsorbed with 50 µg of purified CGL2612 protein in binding buffer (20 mM Tris-HCl, pH 7.5, 250 mM NaCl, 10 mM imidazole). The mixture was incubated at room temperature for 1 h. After nonspecifically bound DNA was washed away with binding buffer, protein-DNA complexes were eluted with elution buffer (20 mM Tris-HCl, pH 7.5, 500 mM NaCl, 500 mM imidazole). DNA fragments were extracted from the eluted fraction by phenol/chloroform extraction and ethanol precipitation. Eluted fragments were amplified by PCR using the primers PRS5′ and PRS3′, and the PCR products were used as a DNA pool for the next cycle of selection. This process was repeated for a total of eight cycles. Selected DNA fragments after these cycles were cloned into the pGEM-T Easy Vector (Promega, Madison, WI), and the DNA sequences of their random core regions were analyzed.
4 M. Yao, Y. Zhou, and I. Tanaka, manuscript in preparation.
Genomic SELEX Selection-Chromosomal DNA of the C. glutamicum wild-type strain ATCC13032 was fragmented into short fragments (~100-300 bp) by sonication (Astrason XL2020; Misonix Inc., Farmingdale, NY) on ice, and both ends of the fragments were blunted using DNA polymerase (DNA Blunting kit, Takara Bio Inc., Otsu, Shiga, Japan). The blunt-ended fragments were cloned into EcoRV-digested pBR322 (New England Biolabs Inc., Beverly, MA) using T4 DNA ligase (Takara Bio) and transformed into E. coli XL-1 blue (Stratagene, La Jolla, CA). Forty-eight of the resulting clones were analyzed for the size and sequence of their insert fragments to verify the randomness of the inserts required for use as a SELEX library template. Over 1.0 × 10⁵ clones were collected for library preparation, and our library template had sufficient variety and length to cover the total genome of C. glutamicum. The DNA library for genomic SELEX experiments was generated by PCR from the template using primers PGS5′ (5′-CTTGGTTATGCCGGTACTGC-3′) and PGS3′ (5′-GCGATGCTGTCGGAATGGAC-3′), corresponding to sequences upstream and downstream of the EcoRV site in the pBR322 vector. PCR was performed using ExTaq DNA polymerase (Takara Bio). The experimental procedures for the selection and sequencing steps were essentially the same as those of the random SELEX experiments described above.
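The sizes of the two SELEX libraries described above can be put into perspective with a rough calculation. The sketch below is illustrative only: the 200-bp mean insert length is an assumption (the midpoint of the stated ~100-300 bp range), and the uncovered fraction assumes random, Poisson-distributed sampling of the 3.3-Mbp genome, which the text does not state.

```python
import math

# Fixed primer-binding arms of the 62-mer template, as given in the text.
FIVE_PRIME_ARM = "CGGAATTCCGGTCGACCAGAAG"
THREE_PRIME_ARM = "TATGTGCGTCTACATGGATCCTCA"
CORE_LENGTH = 16

# The arms plus the random core should add up to the stated 62 nucleotides.
template_length = len(FIVE_PRIME_ARM) + CORE_LENGTH + len(THREE_PRIME_ARM)


def random_core_complexity(core_length):
    """Number of distinct sequences possible in a fully random core."""
    return 4 ** core_length


def genome_coverage(n_clones, mean_insert_bp, genome_bp):
    """Average number of times each genomic position is represented."""
    return n_clones * mean_insert_bp / genome_bp


def fraction_unsampled(coverage):
    """Expected fraction of the genome missing from the library (Poisson model)."""
    return math.exp(-coverage)


complexity = random_core_complexity(CORE_LENGTH)
coverage = genome_coverage(1.0e5, 200, 3.3e6)  # 200 bp mean insert is assumed

print(f"template length: {template_length} nt")
print(f"random core complexity: {complexity:.2e} sequences")
print(f"genomic library coverage: {coverage:.1f}x")
print(f"expected unsampled fraction: {fraction_unsampled(coverage):.4f}")
```

With these assumptions, 10⁵ clones correspond to roughly 6-fold coverage, which is in line with the statement that the library was sufficient to cover the whole genome.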

DNase I Footprint Analysis
The probes for DNase I footprinting experiments were prepared by PCR from C. glutamicum wild-type strain ATCC13032 genomic DNA using the specific primer pair PDF5′-PDF3′. These primers (PDF5′, 5′-AAGTGACCTCACAGAATCGC-3′; PDF3′, 5′-TGCGGTGTAGAGAATCGAGT-3′) were designed to anneal 110 bp upstream and downstream of positions −36 to −6 of the cgl2611 gene, respectively. Prior to PCR amplification, the 5′ termini of the primers of either the coding strand or the noncoding strand were labeled with [γ-32P]ATP (PerkinElmer Life Sciences) using T4 polynucleotide kinase (Takara Bio). Aliquots of 120 ng of labeled probes, 10 µg of salmon testis DNA (Sigma), and 150 µg of purified CGL2612 protein were mixed in 20 µl of reaction buffer (40 mM Tris-HCl, pH 7.5, 200 mM NaCl, 8 mM MgCl2, 5 mM dithiothreitol) and incubated for 20 min at 37°C. After incubation, 0.25 units of DNase I (Takara Bio) in 0.5 µl of reaction buffer were added and mixed, and digestion was allowed to continue for 1 min at 37°C. Digestion was terminated by the addition of phenol/chloroform, and reaction mixtures were extracted. DNA was precipitated with ethanol, and the resulting pellets were resuspended in 8 µl of loading dye. After heating at 90°C for 1 min, aliquots were analyzed by electrophoresis on an 8 M urea-containing 6% polyacrylamide gel. Sequence ladders were generated using a Thermo Sequenase cycle sequencing kit (Amersham Biosciences) with the same primer as was end-labeled for probe preparation.

Construction of a cgl2612 Disruptant
For disruption of the cgl2612 gene, the C. glutamicum wild-type strain ATCC31831 was used. A 2.1-kb DNA fragment encompassing the cgl2612 gene was amplified by PCR using the primers 5′-AGGAGCCACTTCTAGATCTGTCG-3′ and 5′-GGACATCACCTGCAGATGTCC-3′. The amplified DNA fragment was digested with XbaI and PstI and then ligated with the vector plasmid pK18mobsacB (16) digested with XbaI-PstI. From the resulting plasmid, a 440-bp XhoI fragment encoding an internal part of cgl2612 was removed by XhoI digestion followed by self-ligation. The constructed plasmid, named pK18Δcgl2612, was introduced directly into ATCC31831 cells by electroporation, and kanamycin-resistant transformants were selected. As pK18Δcgl2612 did not have a replication origin functional in C. glutamicum cells, only cells in which homologous recombination occurred between the chromosomal DNA and plasmid DNA could grow on kanamycin-containing plates. The kanamycin-resistant transformants thus carried the intact cgl2612 gene and the deleted cgl2612 gene in addition to the vector plasmid sequence on their chromosome. One of the kanamycin-resistant transformants was grown in L liquid medium without kanamycin for 16 h, and the cells were then spread on sucrose-containing L agar plates (20% sucrose). As cells carrying the sacB gene encoded by the vector plasmid were killed in the presence of sucrose, only those cells in which the sacB gene was excised from the chromosome by the second homologous recombination between the intact and deleted cgl2612 could grow on sucrose-containing plates. The resultant sucrose-tolerant recombinants should have the wild-type or the deleted cgl2612 gene, depending on the recombination point. Among the sucrose-tolerant recombinants, the desired deletion mutant strain was selected by PCR amplification with the primers used for cloning of the cgl2612 gene. Chromosomal deletion of cgl2612 was further confirmed by Southern hybridization (data not shown). The constructed deletion mutant strain was named D2612.
(Table footnotes, refinement statistics, fragment: ... where Fo and Fc are the observed and calculated structure factor amplitudes. b, The free R-factor value was calculated as for the R-factor, using only an unrefined subset of reflection data (10%). c, The Ramachandran plot was calculated by PROCHECK (19).)
NOVEMBER 18, 2005 • VOLUME 280 • NUMBER 46

Primer Extension Analysis
The transcriptional initiation site of the cgl2611 gene was determined by primer extension analysis. Total cellular RNA was isolated from exponentially growing cells of ATCC31831 or D2612 using an RNA isolation kit (RNeasy, Qiagen, Chatsworth, CA). Aliquots of 40 µg of total RNA were used for primer extension with Superscript II reverse transcriptase (Invitrogen) and the biotinylated oligonucleotide (5′-AGAAAGAGACCACCGCTGATAACGC-3′), which was complementary to a region within cgl2611. The primer extension products were separated on an 8 M urea-containing gel together with the sequencing ladders obtained using the same primer. Separated products were detected by chemiluminescence using a Phototope-Star detection kit (New England Biolabs Inc., Beverly, MA).
FIGURE 2 (caption fragment). ... (19)), QacR (Protein Data Bank numbers 1jts, 1jty, 1jt6, 1jtx, 1jup, and 1jum (24, 26)), and TetR (Protein Data Bank number 2tct (21)), respectively. The meanings of the marks above or below the sequence are as follows. D, residues that are involved in protein-DNA interaction in the QacR-IR1 complex structure (Protein Data Bank number 1jt0 (20)); d, putative DNA interaction residues of CGL2612 protein and other homologues suggested by sequence and tertiary structure comparisons with QacR- and TetR-DNA complex structures (20, 23); #, residues forming a ligand-binding cavity in CGL2612 protein.

Antimicrobial Susceptibility Analysis
Minimal inhibitory concentrations (MICs) were determined using a standard 2-fold serial dilution format on L agar plates. Aliquots of 10 µl of stationary culture diluted in 0.85% NaCl to 10⁶ cells/ml were spotted onto each plate. Following incubation at 30°C for 24 h, the MIC was defined as the lowest concentration that resulted in >99.9% inhibition of colony formation.
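The MIC criterion above can be expressed as a small calculation. This is a minimal sketch under stated assumptions: the colony counts and drug concentrations are hypothetical values invented for illustration (they are not data from this study), and the inoculum of 10⁴ cells per spot follows from spotting 10 µl of a 10⁶ cells/ml suspension.

```python
def inhibition(colonies, inoculum_cells):
    """Fraction of the spotted cells that failed to form colonies."""
    return 1.0 - colonies / inoculum_cells


def mic(series, inoculum_cells, threshold=0.999):
    """Lowest concentration giving more than `threshold` inhibition.

    `series` is a list of (concentration, colony_count) pairs from a
    2-fold dilution series. Returns None if no concentration qualifies.
    """
    passing = [conc for conc, colonies in series
               if inhibition(colonies, inoculum_cells) > threshold]
    return min(passing) if passing else None


# 10 µl (0.010 ml) of a 10^6 cells/ml suspension gives 10^4 cells per spot.
INOCULUM = 0.010 * 1e6

# Hypothetical colony counts (concentration in µg/ml, colonies per spot).
wild_type = [(0.5, 10000), (1, 4000), (2, 150), (4, 3), (8, 0)]
mutant = [(0.5, 10000), (1, 9000), (2, 3000), (4, 200), (8, 2)]

mic_wt = mic(wild_type, INOCULUM)
mic_mut = mic(mutant, INOCULUM)
fold_change = mic_mut / mic_wt

print(f"wild type MIC: {mic_wt}, mutant MIC: {mic_mut}, fold change: {fold_change}")
```

Because the dilution series is 2-fold, the resolution of any fold change derived this way is limited to powers of two.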
The N-terminal three helices of CGL2612 protein form a compact DNA-binding domain that contains the typical helix-turn-helix motif (α2-α3), in which the latter helix was assumed to be a DNA recognition helix. The C-terminal portion of this protein consists of six helices and was assumed to act as a regulatory domain (RD). CGL2612 subunits form stable dimers related by 2-fold symmetry through this domain. C-terminal helices α8 and α9 from each subunit form a helix bundle structure, and hydrophobic interactions among residues in this bundle mainly take part in stabilization of the dimer. The RD and N-terminal DNA-binding domain of CGL2612 are connected by the α4 helix, and these two domains are associated with each other through a common hydrophobic internal core. Residues located at the edge of the hydrophobic core formed in the DNA-binding domain (Ile9, Leu10, and Ala13 in the α1 helix and Leu22 and Leu25 in the α2 helix) also participate in hydrophobic interaction with hydrophobic residues from the RD (α4 and α6 helices). In addition to these hydrophobic interactions, a salt bridge was also observed between these domains (Arg11 in α1 and Asp62 in the α4 helix).
Comparison with Other TetR Family Proteins-A DALI structural similarity search (17) revealed that the overall structure of CGL2612 protein was similar to those of the TetR family proteins. Structural homologs of CGL2612 were EthR (Protein Data Bank number 1t56: z-score = 11.7, root mean square (r.m.s.) deviation = 3.6 Å for 162 Cα atoms; Protein Data Bank number 1u9n: z-score = 11.0, r.m.s. deviation = 4.0 Å for 160 Cα atoms) (18, 19), QacR (Protein Data Bank number 1jt0: z-score = 11.1, r.m.s. deviation = 4.0 Å for 159 Cα atoms) (20), YcdC (Protein Data Bank number 1pb6: z-score = 9.8, r.m.s. deviation = 4.0 Å for 164 Cα atoms), TetR (Protein Data Bank number 2tct: z-score = 9.6, r.m.s. deviation = 4.5 Å for 151 Cα atoms) (21), and CprB (Protein Data Bank number 1ui5: z-score = 8.9, r.m.s. deviation = 3.6 Å for 145 Cα atoms) (22). This confirmed that the CGL2612 protein also belongs to the same family of proteins. The amino acid sequences of CGL2612 and its homologous proteins from the related bacteria, C. diphtheriae and M. tuberculosis, are aligned in Fig. 2 (16-30% identity). As shown in the figure, DNA-binding domains of these proteins are well conserved and exhibit 24-61% identities in their primary structures. In particular, the residues around α3, a putative DNA recognition helix, are highly conserved. These conserved residues have already been shown to participate in the interaction with DNA molecules in DNA-protein complex structures of the other TetR family proteins, QacR (Protein Data Bank number 1jt0) (20) and TetR (Protein Data Bank number 1qpi) (23) (Fig. 2, residues marked D), and residues that may be involved in DNA-protein interactions in CGL2612 and its homologues can also be predicted (Fig. 2, residues marked d).
TetR family proteins are transcriptional repressors, and there are ligand molecules that induce dissociation of the proteins from their cognate DNA. Tertiary structures of related proteins showed that TetR family proteins commonly possess cavities in their RDs (18, 21, 22, 24, 25), and the structures of EthR-hydrophobic ligand, TetR-tetracycline-Mg2+, and complexes of QacR with different compounds demonstrated that these cavities are the inducer-binding sites for the TetR family of proteins (19, 21, 24-26). The present structure of CGL2612 protein indicated that there is also a deep cavity in its RD, and the cavity is assumed to act as an inducer-binding site in this protein. This tentative binding site has two openings to bulk solvent in subunit A; one opening is framed by the helices α4, α5, and α6, whereas the other is framed by helices α6 and α7 and the α8-α9 loop from the other subunit (Fig. 3a). Around the latter opening, the α8-α9 loop covers the entrance of the cavity and narrows its opening. Main chain trajectories of subunits A and B are slightly different from each other (r.m.s. deviation for residues 3-174 is 1.000 Å). In particular, in subunit B, the main-chain trajectory of the α8-α9 loop and α9 helix differs from that of the other subunit, and the latter opening is closed by the loop (Fig. 3b). This suggests that the corresponding region is rather flexible, and the former, larger opening is likely to work as the inducer entrance of this cavity. The interior of the inducer-binding site of CGL2612 is surrounded by residues Leu59, Trp63, and Glu66 in α4; Thr87, Leu88, Glu90, and Val92 in α5 and the α5-α6 loop; Glu96 and Leu100 in α6; Trp113, Asn117, and Ile121 in α7; Gln140, Asp144, and Phe147 in α8; and residues from the other subunit (Ile152, His153, and Asp154) (Fig. 2, residues marked with a number symbol). As shown in Fig. 2, the inducer-binding site of this protein consists mainly of hydrophobic and acidic residues, and the site exhibits a significantly acidic nature (Fig. 3). Inducer molecules are therefore likely to dock into the binding site through affinity for this biased electrostatic surface potential.
FIGURE 3 (caption fragment). ... The main entrance of the cavity is highly negatively charged (acidic). b, same representation viewed from the opposite side of CGL2612. The main-chain trajectory in the α8-α9 loop and helix α9 of subunit A is different from subunit B, and the small entrance of the subunit B cavity is closed. This suggested that the structure of the corresponding region has flexibility, and this may be required for uptake or evacuation of the inducer molecule. These figures were generated using PyMOL (Delano Scientific LLC, South San Francisco, CA).

Structural and Functional Analysis of CGL2612 Protein
Despite the similarities in their tertiary structures, TetR family proteins display poor amino acid sequence identity in their RDs, and the tertiary structures of the RDs also differ slightly from each other to accommodate their divergent inducers. Tertiary structures of TetR family proteins showed that each protein has an inducer-binding cavity of a distinct shape and size. Ligand complex structures of EthR, QacR, and TetR proteins have already revealed residues that interact with the bound ligands (19, 21, 24-26). These residues were spread over all helices framing the binding cavities and were particularly clustered around the region of α5 and α6 and the loop between these two helices (Fig. 2). In QacR and TetR proteins, it was shown that the binding of a ligand molecule induces conformational changes around this area, and the structural changes cause relocation of the DNA-binding domain to alter its DNA binding ability (20, 21, 24). Although the RDs of TetR family proteins show poor amino acid sequence similarity, primary structure alignment based on their secondary and tertiary structures revealed that the binding site-forming residues of CGL2612 and ligand-interacting residues for other homologues (marked with a number symbol and hatched in black, respectively) are located in similar positions in their primary structures, and in particular, CGL2612 and QacR exhibit rather similar propensities (Fig. 2). TetR proteins act as dimers, and structural comparison of the dimers, the biologically functional form, revealed that QacR was the best homolog of CGL2612 (QacR, Protein Data Bank number 1jt0, r.m.s. deviation = 3.8 Å for 317 Cα atoms; EthR, Protein Data Bank number 1t56, r.m.s. deviation = 5.2 Å for 309 Cα atoms). At present, the specific inducer-binding residues of CGL2612 protein are not known, and the detailed mechanism by which DNA binding ability is altered after the binding of an inducer molecule is also still unclear; nevertheless, the strong commonality observed in their structures implies that these proteins may share similar functional mechanisms. To discuss further details, ligand-complexed structure analysis of CGL2612 will be necessary.
Identification of CGL2612 Binding Site-The tertiary structure of CGL2612 showed that this protein is a member of the TetR family that is structurally quite similar to EthR and QacR. This raises the question of what types of genes are regulated by this protein. To identify the CGL2612 binding site on DNA, two different selections were examined. First, a random SELEX method was attempted, and this selection showed that the 16-bp random nucleotide core from isolated DNA fragments contained pseudopalindromic sequences (Fig. 4). The next selection, a genomic SELEX method, showed that all isolated fragments contained a common 30-bp sequence, and no other consensus sequence was isolated, suggesting that CGL2612 binds to the sequence as a unique high affinity binding site. Sequence comparison between this consensus sequence and the genomic sequence of C. glutamicum wild-type strain ATCC13032 revealed that the obtained consensus sequence corresponded to an upstream region of the cgl2611 gene (genomic map position from 2,778,931 to 2,778,960 bp; see Fig. 4). As aligned in Fig. 4, the pseudopalindromic sequences identified by random SELEX displayed significant homology with the consensus sequence obtained by genomic SELEX. These results suggested that this 30-bp region may be the CGL2612 binding site.
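A simple way to quantify how palindromic a selected core sequence is, in the sense used above, is to compare it with its own reverse complement. The sketch below is illustrative only: the scoring function is an assumption of this example rather than the analysis used in the study, and the two test sequences are hypothetical, not the sequences shown in Fig. 4.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def palindrome_score(seq):
    """Fraction of positions at which a sequence matches its reverse complement.

    A score of 1.0 is a perfect palindrome (inverted repeat with no spacer);
    values slightly below 1.0 indicate a pseudopalindrome.
    """
    rc = reverse_complement(seq)
    matches = sum(1 for x, y in zip(seq, rc) if x == y)
    return matches / len(seq)


# Hypothetical sequences for illustration only.
perfect = "TTGACATATGTCAA"
pseudo = "TTGACATATGTCGA"  # one substitution breaks two symmetric positions

print(palindrome_score(perfect))
print(palindrome_score(pseudo))
```

A single substitution lowers the score at two positions, because both the changed base and its symmetry partner no longer match.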
Subsequent DNase I footprint analysis demonstrated that CGL2612 protein protects DNA from position −34 to −3 (coding strand) and from −37 to −6 (noncoding strand) of cgl2611 (Fig. 5a). These regions overlap the presumptive CGL2612 binding site suggested by SELEX experiments, and thus these 32-bp sequences were confirmed to be the CGL2612 binding site. In view of the size of the CGL2612 protein, this binding site is too long for a single CGL2612 dimer, implying that two or more CGL2612 dimers bind to this site. This raises the new question of how CGL2612 dimers bind to this site. Our results of random SELEX implied that the center part of the CGL2612 binding site might have the highest affinity to the protein. This part is thought to be the core region of the CGL2612-DNA interaction; thus, a binding model similar to the complex structure of the related protein QacR and its binding DNA (Protein Data Bank number 1jt0) was presumed. In the case of QacR, two protein dimers bound to their binding sequence IR1, and the center region of IR1 was bound by two dimers from both sides, playing a major role in the protein-DNA interaction (20) (Fig. 5b, bottom). However, our footprint analysis provided another molecular basis for considering how the CGL2612 dimer binds to its binding site. As shown in Fig. 5a, a single digestion band was observed in both the coding and noncoding protection areas (position −21, letter C on the coding strand, and position −20, letter T on the noncoding strand, respectively). These DNase I-sensitive bases in the center part of the binding site indicate the presence of a DNase I-accessible gap here. It is expected that the CGL2612 molecule does not bind to this center part or interacts only weakly. The CGL2612 binding site is composed of an inverted repeat pair and a 4-bp spacer sequence between the repeated sequence sites (Fig. 5a).
A half-site of the inverted repeat consists of 13 bp (44 Å in length), and this provides structurally sufficient space for a CGL2612 dimer to bind (the QacR dimer and TetR dimer require 37 and 31 Å for binding, respectively (20, 23)). The normal B-DNA structure contains 10 nucleotides per turn, and due to the 4-bp spacer sequence, these half-sites are on almost the same side of the DNA chain. This suggests that two CGL2612 dimers bind to the binding site in tandem from the same side of the DNA chain (Fig. 5b, top). Such binding of two protein dimers will allow a bending motion of the DNA chain at the center spacer part, and this thermal motion of DNA is likely to permit occasional digestion at this part by DNase I. This model provides a reasonable explanation for the footprinting results. Although the structures of the binding sites of CGL2612 and QacR are somewhat similar (long binding site with inverted repeat sequences), their lengths are different from each other, and our experimental results suggested a distinctive binding mode for the CGL2612-DNA complex. In our proposed model, one CGL2612 dimer binds to a half-site of the long inverted repeat sequence. Each equivalent subunit in a dimer recognizes different DNA sequences within the half-site. To determine the details of how four CGL2612 subunits recognize the binding DNA sequence, structural analysis of the CGL2612-DNA complex will be necessary.
FIGURE 5 (caption fragment). ... Fig. 4. b, schematic diagram of a CGL2612-DNA interaction model and a QacR-IR1 complex structure (20). The half-sites of the inverted repeat and DNase I-sensitive bases on each strand are colored in red and blue, respectively. In the case of the QacR-IR1 complex, the symmetry center of IR1 was bound by two protein dimers from both sides and surely protected. On the other hand, taking into consideration the result obtained from footprint analysis, CGL2612 dimers bind to the binding site separately from each other, and a structural gap will be found in the symmetric center.
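The lengths discussed in the binding model above follow from simple arithmetic, sketched here for reference. The only value not taken from the text is the helical rise of 3.4 Å per base pair, which is the canonical figure for B-DNA and is treated as an assumption; the variable names are illustrative.

```python
RISE_PER_BP = 3.4  # Å per base pair, canonical B-DNA value (assumption)


def span_bp(start, end):
    """Number of base pairs between two positions, inclusive of both ends."""
    return abs(end - start) + 1


def length_angstrom(n_bp):
    """Approximate contour length of n_bp base pairs of B-DNA."""
    return n_bp * RISE_PER_BP


# Footprint boundaries relative to cgl2611, as reported in the text.
coding_protected = span_bp(-34, -3)
noncoding_protected = span_bp(-37, -6)

# Inverted repeat architecture: two 13-bp half-sites around a 4-bp spacer.
HALF_SITE_BP = 13
SPACER_BP = 4
site_bp = 2 * HALF_SITE_BP + SPACER_BP
half_site_length = length_angstrom(HALF_SITE_BP)

print(f"protected: {coding_protected} bp (coding), {noncoding_protected} bp (noncoding)")
print(f"inverted repeat plus spacer: {site_bp} bp")
print(f"half-site length: {half_site_length:.1f} A (QacR needs 37 A, TetR 31 A)")
```

The 30 bp obtained for two half-sites plus the spacer matches the length of the consensus sequence recovered by genomic SELEX, and a single half-site is longer than the span quoted for either a QacR or a TetR dimer.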
Functional Role of CGL2612 Protein as a Transcriptional Repressor of a Drug Efflux MFS Permease-The CGL2612 binding site determined in this study contains a putative promoter sequence, "−35" TTGTAA and "−10" TACAGT, and is located just upstream of the cgl2611 gene, which probably encodes an MFS permease (Fig. 4). Thus, it was expected that CGL2612 regulates the expression of the cgl2611 gene by binding to its promoter region. However, the putative initiation codon ATG (genomic map position 2,778,966-2,778,968) of cgl2611 is positioned very close to the putative promoter sequence, with an 8-bp spacing sequence, and moreover, no Shine-Dalgarno (SD)-like sequences were found in this region. To determine whether CGL2612 regulates expression of cgl2611, primer extension analysis of cgl2611 transcripts was carried out using the wild-type C. glutamicum strain, ATCC31831, and its cgl2612 deletion mutant derivative, D2612. D2612 was constructed by the double-crossover method as described under "Experimental Procedures." As shown in Fig. 6, a primer extension product initiated from the third G of the putative initiation codon ATG was detected in the wild-type strain ATCC31831 as well as the Δ2612 mutant strain D2612. This indicated that the real initiation codon is located further downstream. CTG (2,779,050-2,779,052) or ATG (2,779,161-2,779,163) could be the real initiation codon of cgl2611, but this remains to be determined experimentally. Primer extension analysis also showed that the amount of cgl2611 transcript in the Δ2612 mutant cells was 2.2-fold higher than that in the wild-type cells. This clearly indicated that CGL2612 functions as a transcriptional repressor of cgl2611. The two genes, cgl2611 and cgl2612, are located in tandem, and they overlap by 10 bp, suggesting that these genes form an operon (Fig. 4, top). Regulation of the cgl2611 promoter by CGL2612 protein may therefore imply autoregulation of cgl2612.
CGL2611, the product of the cgl2611 gene, is composed of 494 amino acid residues and is predicted to be a membrane protein based on its amino acid sequence. CGL2611 protein has 14 α-helical segments that traverse the cytoplasmic membrane, and a BLAST search (27) showed that this protein has notable sequence similarity with toxin efflux pumps that belong to the MFS family; MFS is one of the major families to which a number of bacterial drug efflux pumps belong (1). CGL2611 exhibits 28.7% amino acid sequence identity with the S. aureus MFS QacA multidrug efflux pump, which renders cells resistant to antimicrobial compounds, such as quaternary ammonium compounds (Qacs), via a proton motive force-dependent mechanism (28). Amino acid inspection revealed that although residue Asp323 of QacA, which is the key residue for the "high affinity" of QacA to divalent cations, is replaced by serine in CGL2611 (Ser320), other intramembrane residues, Asp34 and Arg114 in QacA, the crucial residues for drug export (28), are perfectly conserved in CGL2611 protein (Asp33 and Arg133, respectively), suggesting that CGL2611 also possesses strong potential to act as a drug exporter. Since the Δ2612 mutant strain D2612 overproduces CGL2611, it is expected that D2612 shows a drug-resistant phenotype. To estimate the biological function of CGL2611, the antibacterial susceptibilities of the wild-type strain ATCC31831 and the Δ2612 mutant strain D2612 were investigated. The sensitivities of ATCC31831 and D2612 to several antibiotics, which are structurally and functionally diverse and in common use, and to substrates expelled by other homologous anti-drug systems, EthA-EthR (ethidium bromide (Et)) and QacA-QacR (Et, benzalkonium chloride, and malachite green), were examined. As shown in TABLE THREE, D2612 showed increased resistance to Norfloxacin, a quinolone-type DNA gyrase inhibitor. The increase in Norfloxacin resistance (2-fold) correlated well with the degree of derepression of cgl2611 expression (2-fold), suggesting that CGL2611 was involved in this resistance. Sensitivity to Sparfloxacin, which is structurally related to Norfloxacin, was not changed, suggesting that CGL2611 recognizes the specific structure of Norfloxacin rather than the quinolone structure in general. Sparfloxacin has additional methyl groups and fluorine compared with Norfloxacin, and such differences may be discriminated by the protein. Our examination demonstrated that CGL2611 protein is also responsible for resistance to cationic compounds. Although D2612 showed no change in malachite green resistance, the strain displayed 2-fold and 4-fold increased tolerance for benzalkonium chloride and Et, respectively (TABLE THREE). On the other hand, QacA does not confer resistance to anionic substrates such as Norfloxacin (29). These results suggest that CGL2611 is a multidrug export protein whose spectrum of substrates is not exactly the same as that of QacA.
In this study, we identified CGL2612 protein, a hypothetical protein found by DNA analysis, as a transcriptional repressor of an upstream gene, cgl2611, encoding a probable MFS permease that has a nonnegligible effect on resistance to toxic compounds in C. glutamicum. These results will extend our understanding of the drug resistance mechanism not only in C. glutamicum but also in other related bacteria. To identify the biological functions of this CGL2612 protein, we combined tertiary structure determination and DNA-binding sequence analysis using the SELEX method. It is worth noting that this strategy, which takes advantage of extensive genome information, is one of the most convenient and powerful methods for analysis of functionally unknown transcription factors.