Rules of Expansion: an Updated Consensus Operator Site for the CopR-CopY Family of Bacterial Copper Exporter System Repressors

Many Gram-positive bacteria respond to copper stress by upregulating a copper export system controlled by a copper-sensitive repressor, CopR-CopY. The previous operator sequence for this family of proteins had been identified as TACANNTGTA. Here, using several recombinant proteins and mutations in various DNA fragments, we define those 10 bases as necessary but not sufficient for binding and in doing so, refine the cop operon operator to the 16-base sequence RNYKACANNTGTMRNY. Due to the sheer number of repressors that have been said to bind to the original 10 bases, including many antibiotic resistance repressors such as BlaI and MecI, we feel that this study highlights the need to reexamine many of these sites of the past and use added stringency for verifying operators in the future.

M etals are essential nutrients to all living organisms. Roughly 40% of proteins use metal as cofactors and structural components in a vast number of cellular processes (1). Iron, zinc, and manganese are examples of first-row divalent transition metals used by living organisms. The ability to form stable complexes plays a vital role as to how each metal is used in the organism. The stability of protein-metal complexes is generalized by the Irving-Williams series (Mn Ͻ Fe Ͻ Co Ͻ Ni Ͻ Cu Ͼ Zn) (2). In general, more stable complexes correlate to a metal's toxicity, as native metals for an active site can be displaced by another metal ion further along in the observed series. This mismetallation is present across multiple metal-binding motifs and can result in abnormal protein function (3). For most prokaryotes, however, metals such as copper, nickel, and cobalt are broadly toxic. Higher order organisms have evolved ways to regulate and transport these metals, reducing promiscuity in mature active sites. Mammalian hosts have evolved strategies to both sequester essential metals from bacteria (e.g., Fe, Mn, and Ca) and bombard them with toxic metals such as copper; a strategy called nutritional immunity (4,5). Mutations or deletions of a mammalian copper transporter ATP7A can attenuate bacterial clearance by macrophages (6,7). Thus, both within mammalian systems and on surfaces, copper is utilized as an antimicrobial (8)(9)(10)(11)(12).
Copper can catalyze Fenton-like reactive oxygen stress in bacteria, but mismetallation of iron-sulfur clusters and enzymes necessary for nucleotide and amino acid synthesis also leads to significant toxicity (13)(14)(15)(16)(17). As such, bacteria have evolved specialized import and export systems to acquire necessary metals and to adapt to this metal toxicity. The presence of these import systems within the bacteria typically indicates a nutritional requirement for the metal. Iron, for instance, is an essential metal for Streptococcus pneumoniae (the pneumococcus), a Gram-positive pathogen that causes pneumonia, meningitis, otitis media, and septicemia. In S. pneumoniae, iron has five known import systems/contributing genes, Pia, Piu, Pit, the hemin-binding system encoded by SPD_1590 (D39 strain), and SPD_1609, but no known export systems (18)(19)(20). Calcium, zinc, and manganese all have export and import systems, which suggests that there are many optima of concentrations depending on the cell requirements. However, the pneumococcus has no known import system for copper but contains a dedicated copper export system encoded by the cop operon (19,(21)(22)(23)(24). In general, copper export systems consist of an operon DNA regulator, a copper chaperone, and one or two copper exporters (21,(25)(26)(27)(28)(29)(30)(31).
Multiple studies regarding the cop operon have been performed in S. pneumoniae (14,21,30,(32)(33)(34)(35) and in other bacteria. Globally, cop operon regulators function as either repressors, as in S. pneumoniae, or activators. Although there are cop operon activators and repressors in structurally distinct groups, they all serve to protect the bacteria against copper stress by sensing copper and facilitating its export. Activators, such as CueR in Escherichia coli, sense copper and bind upstream of the cop operon to promote transcription (36). Conversely, repressor proteins release DNA upon binding copper and are found in species such as Lactococcus lactis and S. pneumoniae (CopR-CopY), Listeria monocytogenes, and Mycobacterium tuberculosis (CsoR) (19,27,30,34,35,37). The pneumococcal cop operon contains, copY as the DNA repressor, cupA as a membrane-associated copper chaperone, and copA as the copper-specific exporter (19,30,34). The CopY repressor protein has an N-terminal helix-wing-helix motif, which directs the homodimer to bind its cognate DNA (34,35). The Cu-chaperone protein CupA chelates intracellular Cu, reduces it from Cu 2ϩ to Cu 1ϩ , and finally delivers it to CopA for export (32)(33)(34)38). CupA copper chelation allows for the recycling of CopY to the operator when copper is exhausted. Mutations in the copper export protein in cop operons result in decreased bacterial virulence, highlighting the importance of copper in nutritional immunity (21,30,39,40).
Pneumococcal CopY is homologous to several known antibiotic resistance repressors, including BlaI, a Staphylococcus aureus MecI homolog that represses the gene for a ␤-lactamase (34,35,41). Like CopY, BlaI and MecI interact with a known operator sequence, TACA/TGTA, form homodimers, and are mostly helical in secondary structure (34,41). However, BlaI is regulated by proteases instead of metal interactions (42).
The CopR-CopY family of cop operon repressors have C-terminal metal-binding protein motifs, Cys-X-Cys. Each CXC motif binds copper in a 1:1 ratio. These motifs can also bind zinc with a stoichiometry of two CXC motifs to one zinc (35). Since these proteins function as dimers, the stoichiometry is one zinc per dimer, or two coppers per dimer after zinc displacement. Copper binding to CopR-CopY proteins causes a conformational change, leading to release from the operator site, while zinc binding leads to higher cop operator affinity (34,35). However, how binding metal directly leads to the conformational changes associated with DNA binding is currently unknown. Previous work in Enterococcus hirae demonstrated that the CopY protein binds to two palindromic regions (TACANNTGTA) upstream of the cop operon by using a DNasefootprinting assay (25). The cop operator sequence was then identified using surface plasmon resonance to evaluate binding of CopR-CopY to promoter regions in E. hirae, Lactococcus lactis, and Streptococcus mutans that contained TACANNTGTA (43). The oligonucleotides used in this study contained a promoter sequence beyond just the 10 bases listed above. We suspect that the 10 bases were identified as the operator sequence based on the palindromic nature of the sequence, without testing whether those bases alone were sufficient for CopY binding.
Since the CopR protein from L. lactis was found to regulate multiple genes, we sought to test potential CopY-regulated genes outside the cop operon in Streptococcus pneumoniae (44). However, we quickly found that the TACANNTGTA operator is not a reliable predictor for repressor protein binding (44). We used S. pneumoniae CopY in binding experiments with various DNA constructs to define a sufficient cop operator in vitro. Here, we present our findings and propose an update to the cop operator motif. We found that unlike the LacI repressor system, pneumococcal CopY does not demonstrate any cooperative binding, despite the operator sites being in close proximity. Furthermore, we highlight the strong conservation between bacterial cop operators and CopR-CopY constructs by demonstrating the presence and strength of crossspecies interactions. We identified bacteria with CopY homologs and aligned the upstream sequences to determine if they also contained putative cop operators and whether these operators followed the consensus motif we have proposed.

RESULTS
CopY operator homology. Early DNA-binding studies were carried out using a CopY homolog from Enterococcus hirae on the interactions with the cop operon operator (25,43,45). We observed that there are two large repeats upstream of the pneumococcal cop operon that include a 10-base sequence important for CopY binding. These motifs in S. pneumoniae differed slightly from those observed in E. hirae (Fig. 1A). Although the amino acid sequence of the E. hirae copper repressor and the upstream binding repeats are highly similar to those of S. pneumoniae (34,35) and contain the 10-base sequence, E. hirae operators upon initial observation lacked the Updating an Operator Site for Copper Exporter Systems extended regions flanking this sequence in pneumococcus (Fig. 1B) (25). A BLAST search revealed that the 61-base stretch of DNA upstream of the pneumococcal cop operon that includes the two extended 21-base repeats is highly conserved in all pneumococcal strains (46).
CopY binds to both pneumococcal cop operators. Previous studies showed that CopR-CopY specifically bound to the cop operon operator in sequence-and metaldependent manners, as disrupting the operator bases or adding copper disrupted binding, while adding manganese or iron had no detectable effect (34). These studies were conducted with only one full T/GACANNTGTA motif (here, KACANNTGTA) intact (34,35). It is not clear why these operators are duplicated or why they fail to adhere to strict palindrome sequences. We preliminarily sought to first determine if CopY bound to a DNA fragment containing both operators and if we could observe binding to one or both operators. Using a 61-base double-stranded DNA (dsDNA) fragment and recombinant CopY, we used an electrophoretic mobility shift assay (EMSA) to measure binding. As expected, CopY binds in a dose-dependent manner to the 61-base dsDNA fragment (Fig. 2). Furthermore, there were two shifts displayed on the EMSA, indicating that the second-site interaction was concentration dependent (Fig. 2). These data are in agreement with initial studies of CopR in E. hirae, which also contains two operator sites.
To quantitatively determine the affinities to the operators, we used a higher throughput system to determine affinity: a biolayer interferometry (BLI) assay using the Octet Red384 (Octet). BLI is an optical technique similar to surface plasmon resonance that uses interference of white light reflected from two surfaces to measure interactions between molecules but is performed in a 96-or 384-well plate (47). The process involves (i) a baseline step in buffer; (ii) a step for the loading of analyte A to the surface of a biosensor; (iii) a second baseline step to make sure analyte A stays on the sensor (which should appear flat); (iv) an association step of analyte B (which should appear as a shift in reflected light if binding occurs); and (v) a disassociation step (which should appear as an opposite shift in the reflected light as analyte A disassociates from analyte B). In our experiments, analyte A was the biotinylated dsDNA oligonucleotides which bind to the streptavidin biosensor, and analyte B was the recombinant protein (CopY). For viewing simplicity, we begin our plots begin at step 3. These experiments demonstrated that the two-operator-site (here, wild type) DNA had a similar affinity (dissociation constant [K d ] ϭ 28.1 nM) as that of the proximal DNA (DNA that has the distal site scrambled) (K d ϭ 25.5 nM) to CopY (Table 1; Fig. 3A and C). The distal site DNA (DNA that has the proximal site scrambled) had a slightly lower affinity (K d ϭ 55.2 nM) ( Table 1; Fig. 3B). Since the sequences are identical, we suspect this difference is because the first 5= base in that motif is linked to biotin, in turn bound to the streptavidin-coated biosensor. Thus, we would encourage including more than one base between the DNA binding sequence and protein. As expected, CopY bound DNA constructs containing intact 21-base repeat-containing cop operon operators with significantly higher affinities than for the scrambled DNA sequence (scram) (Fig. 3D) or single-stranded DNA (ssDNA) containing the wild-type operators, which both exhibited extremely weak binding (data not shown). Taken together, CopY binds both 21-base repeats containing the operator sites independently of each other with nanomolar affinity.
Defining the pneumococcal CopY operator. With a reported binding sequence of KACANNTGTA, we hypothesized that CopY may also bind similar sequences, as previously seen in Lactococcus lactis (44). Allowing for one base variation from the reported binding sequence, we found matches upstream of genes upregulated under copper stress and hypothesized that they may also be regulated by CopY (14,48). Seven potential binding sites were assessed using BLI. To our surprise, CopY did not bind to these proposed operator sites ( Fig. 4 and Table 2; see also Fig. S1 in the supplemental material). Based on these results, we suspected that the reported sequence in the literature may be necessary, but not sufficient, for CopY binding. Indeed, we found that CopY failed to bind the reported binding sequence (Fig. 5A). Taken together, we have  concluded that the reported binding operator is necessary but not sufficient for cop repressor binding (25). We next wanted to establish which bases outside the previously reported 10-base KACANNTGTA sequence were necessary for CopY binding. Beginning with the minimal 10-base sequence, constructs for BLI were designed to include or exclude identical bases from the 21-base S. pneumoniae CopY operator. The first set of sequences extended the operator five bases upstream or six bases downstream of the included 10-base sequence. The remaining positions were left scrambled. Neither of these fragments bound S. pneumoniae CopY more than the negative controls, suggesting there are essential bases on each side of the 10-base sequence (Fig. 5B). We then widened the operator construct by two or three bases on either side of the core  10-bases for a total of 14 and 16 bases of the total 21 bases. These operators were used to determine the minimum operator size for S. pneumoniae CopY. The 14-base sequence did not display binding greater than that of negative controls (K d Ͼ 3 M), but the 16-base sequence displayed approximately ϳ10-fold weaker affinity than the proposed "full-length" 21-base operator. (Fig. 5C and Table 1; see Fig. S2). Extending the sequence to 19 of the 21 bases led to comparable levels of binding to the full sequence (Fig. 5D). Based on these data, we suggest that 16 bases make up a minimal operator site (ATTGACAAATGTAGAT) recognized by CopY and that additional flanking bases increase stability. Bioinformatic characterization of the CopY-CopR operator. We performed BLASTp searches for S. pneumoniae TIGR4 CopY homologs, first excluding and then including the Streptococcus genus (46). Using a maximum target sequence number of 1,000 for each search, and then combining both lists, we found 335 candidate homologs (see Data Set S1, Tab 1). From this list, we extracted protein sequences from the unique NCBI accession numbers-many NCBI accession numbers represented several and/or identical homologs. Many of the 141 unique protein sequences belonged to species in the Streptococcus and Lactobacillus genera (Data Set S1, Tab 2). This table also included species such as the yogurt probiotic Lactobacillus acidophilus and Mycobacteroides abscessus, an emerging multidrug-resistant pathogen that causes lung, skin, and soft tissue infections (49). We used programs within the MEME suite to identify DNA motifs within 100-bases upstream of the copY homolog start codon, which would correspond to the promoter and suspected operator-containing elements (Data Set S1, Tab 3) (50,51). Similar to that of known CopR-CopY operators, the program-derived 21-base sequence contained KACANNTGTA (Fig. 6A) (52). Of these 88 sequences, 67 had two CopR-CopY cop operon operators (similar to those in Streptococcus pyogenes, E. hirae, and Enterococcus faecium), 14 had one operator (such as Enterococcus faecalis), and 7 lacked detectable sequence similarity (Fig. 6B to D; Data Set S1, Tab 4 to 6). Increasing the maximum bases upstream to 500 did not yield additional sequences (data not shown). For the genomes with two operators, we found that most of the operators were between 24 and 39 bases from each other, with the mode being 26 bases (Fig. S2). Of the species that had a single cop operon operator, the sequences were more variable KACANN YGTA (Data Set S1, Tab 5). Of these 14 sequences, 7 were on the positive strand (but six were palindromic), and 7 nonpalindromic sequences were on the negative strand (Data Set S1, Tab 5). Of the seven sequences that had no apparent operator homology, six were in the Lactobacillus genus and one was in the Macrococcus genus (Data Set S1, Tab 6). Comparing the upstream regions across these seven strains, MEME suite did not identify a consensus motif. This suggests that these seven strains lost the operator since diverging from a common ancestor.
Determining the consensus cop operon operator. The initial characterization of the CopY operator sites was performed using a few mutations in the E. hirae cop promoter (25,53). As a proof of principle, we used the native E. hirae cop distal or proximal operators and examined them for pneumococcal CopY binding ( Fig. 7A and  B). S. pneumoniae CopY bound to the proximal E. hirae operator with low nanomolar affinity (Fig. 7A). However, pneumococcal CopY did not bind to the distal E. hirae operator (Fig. 7B). This implies that for CopY from S. pneumoniae, the central "AA" motif (previously notated as "NN") cannot accommodate variations at this position (Fig. 7B).
While inspecting the alignment of bacterial species operator sites via MEME Suite ( Fig. 1A and 6A; Data Set S1, Tab 4), we identified a clear purine-N-pyrimidine pattern on each side of the previous 10-base operator. Using our sequence data of homologous cop operators, we arrived at a proposed consensus sequence of RNYKACANNTGTARNY (where "R" is purine, "Y" is pyrimidine, and "K" is either G or T) (44) for CopY family repressor operators. These data were supported by both the E. hirae cop operators having the alternate purine or pyrimidine bases at positions 1, 14, and 16 but are still functional (Fig. 1B). Thus, we focused on position 3, which was the same in the pneumococcal operator and E. hirae operators in order to test our hypothesis that conserved R-N-Y motifs flank the 10-base core at positions 1, 3, 14, and 16. Position 3 was mutated to the alternate pyrimidine (T to C) or switched from pyrimidine to purine (T to A). As expected from our model, the T to C mutation at position 3 did not significantly alter the binding affinity compared to that of the 19-base fragment, while the T to A mutation completely abolished binding (Fig. 7C and D).
Due to the palindromic nature of this operator, we decided to test the last nonpalindromic base in the now RNYKACAAATGTARNY consensus which was the "A" in position 13. With position 4 being K (G or T), we mutated the "A" at position 13 to a "C" so that would fit with the complementarity compared to position 4, termed "full palindrome." We found that pneumococcal CopY was able to bind to this full palindrome sequence at near WT levels ( Table 1). Taken together, we believe that the pneumococcal DNA binding sequence for CopY is RNYKACAAATGTMRNY, with "M" representing "C" or "A," as opposed to the previously reported KACANNTGTA (Fig. 8A).
Interspecies compatibility in vitro across CopY family proteins and operators. Thus far, we have defined the pneumococcal CopY operator to be larger and more selective than the canonical cop box motif. However, our bioinformatics data indicate that there is strong conservation across the protein family operators (Fig. 8B). We hypothesized that some degree of cross-species interactions or substitution could be possible in vitro. Therefore, we produced recombinant CopY from both E. hirae and

Updating an Operator Site for Copper Exporter Systems
Streptococcus thermophilus to further test our proposed operator consensus and to glean details on how the proteins bind to their operators. We encountered problems with initial purification of E. hirae CopY. Incubation with tobacco etch virus (TEV) protease led to proteolysis. The ideal sequence for TEV protease cleavage is ENLYFQ | GS, with the pipe character representing the hydrolyzed bond. Since this site is so specific, it is commonly used for precision proteolysis, e.g., removing a purification tag. Reviewing the amino acid sequence of E. hirae CopY, we identified a functional TEV site (ENLFSH | IC). After reviewing the literature, we introduced an E86D mutation (DNLFS HIC) (54). This point mutation was sufficient to abrogate the undesired proteolysis while retaining the negative charge at position 86.
We used BLI to measure binding between the S. pneumoniae, S. thermophilus, and E. hirae cop repressor proteins and CopY operator sites from S. pneumoniae, S. thermophilus (proximal and distal), and E. hirae (proximal and distal). These experiments comprised a small matrix of interactions between related CopY proteins and operators. The E. hirae cop repressor bound to all five operators with no detectable difference even between uncleaved E. hirae CopY and cleaved E86D protein, implying that the His tag and TEV cleavage site intact had no effect on binding this specific system. We found that S. pneumoniae CopY did not bind to the distal E. hirae operator but did bind to the E. hirae proximal operator and both distal and proximal cop operators from S. thermophilus (Tables 1 and 3). The S. thermophilus cop operon repressor bound to its own operators and the E. hirae operators (albeit weakly to the distal site) but not to the pneumococcal operator (Table 3). Taken together, the consensus sequence for these operators would be RNYKACANNTGTMRNY.

DISCUSSION
Previous binding studies characterizing pneumococcal CopY were carried out with DNA containing only one operator (21,34,35). Here, we demonstrate that CopR-CopY homologs in Streptococcus and many other genera have two operators upstream of the cop operon (Fig. 1A). Despite having two identical 21-base repeat operators within close proximity, the S. pneumoniae CopY protein does not have a significantly higher affinity for DNA with two operators present. A limitation of this study and subject to future direction is how these two operators affect gene regulation. Based on predicted  Ϫ35 and Ϫ10 sites, we suggest that the distal operator prevents sigma factors from binding at the copY Ϫ35 element and that the proximal operator occludes RNA polymerase, establishing two layers of repression for the cop operon (55). This reasoning is also consistent with the proposed hypothesis of why there are two operators in the antibiotic resistance repressor BlaI (56). It is unclear which sigma factor drives copY transcription or whether the binding between CopY, a sigma factor, and/or RNA polymerase is competitive. Another subject of future direction is why such repression through multiple operators is necessary for the cop operon especially considering that (i) cop operon upregulation is linked to increased pneumococcal survival in the host and (ii) a ΔcopY mutant has increased virulence in mice (30,57). We anticipate that further study of these systems via fitness testing and exploring the effect of having these systems on under metal-limiting conditions will yield clues as to the competitive advantages or selective pressures of the cop operon at the host-microbe interface. Given the previous 10-base consensus sequence suggested for CopR-CopY protein operators, it was plausible there were additional genes that the cop repressor could regulate, similarly to L. lactis (44). Some of these putative binding sites in S. pneumoniae corresponded to putative promoter regions of genes and operons upregulated under copper stress, suggesting that CopY was a regulator of several operons (14). However, we have empirically demonstrated that these putative sites-despite showing copperdependent regulation-did not bind CopY and thus, importantly, that the canonical 10-base cop box is not sufficient for binding (14,21). These facts ultimately led us to explore and propose an expanded CopY operator sequence (RNYKACAAATGTMRNY), adding to the previously reported KACANNTGTA (Fig. 8). Unlike in the L. lactis genome, a search using this operator motif in multiple streptococcal species for potential operators yielded no additional sites (48).
While CopY does not bind to other locations within S. pneumoniae, its aforementioned role in controlling the cop operon and the importance of that operon at the host-pathogen interface makes it of continued interest. It is still not known how CopY has higher binding to DNA with zinc bound than in the apo form and how copper changes the structure to disrupt DNA binding. This expanded operator can better inform structure/function studies regarding residue-to-nucleotide interaction in the CopR-CopY family of proteins, for which there is still no structure with metal or DNA bound.
In generating the list of homologous CopR-CopY genes and proteins, we also were able to look upstream of the gene and compile what we believe to be a consensus operator site which differs slightly from the proposed pneumococcal operator. The strong conservation between repressor and operator across species is made clear by the affinity studies with cop repressors from both E. hirae and S. thermophilus. However, this cross-species interaction was not universal-pneumococcal CopY only bound to one E. hirae operator but bound both S. thermophilus sites-the new consensus operator is more predictive of these proteins binding in this limited homology matrix. Based on the strong interactions we observed in vitro, the selectivity of repressors for operators, and a suite of predicted CopR-CopY operators, we propose a consensus operator of RNYKACANNYGTMRNY (Fig. 8). This consensus operator is consistent with the results presented by Magnani et al. using the L. lactis CopR protein ( Table 2) (44).
We propose that, moving forward, operator sequences for transcriptional regulators should be revisited and examined with a higher level of scrutiny, specifically those that are just apparent palindromic sequences. In doing so, we believe that this would reveal potential binding interactions within the organism's respective genome and with genomes of the cohabitant bacteria of their environments from which they might acquire new genes.

MATERIALS AND METHODS
Aligning and comparing CopY homologs and promoter sequences. The BLAST sequence alignment algorithm was used to align both E. hirae and TIGR4 S. pneumoniae cop operon promoter regions, the 21-base repeats upstream of the TIGR4 pneumococcal cop operon, and the promoter regions of pneumococcal species (46). A set of custom Python scripts (available from https://github.com/Van -Doorslaer/O-Brien_et_al_2019) were used to assign identified copY homologs to bacterial genomes and Updating an Operator Site for Copper Exporter Systems extract the suspected regulatory region from individual species (100 and 500 bases upstream of the start codon). Importantly, in many cases, the initial BLAST search identified CopY homologs which matched multiple species isolates/strains. In this case, the identified proteins were again compared to the NCBI database, and the homolog with the lowest E value was retained. If this approach was unsuccessful, the homolog was excluded from further analysis.
Cloning and site-directed mutagenesis. copR-copY from S. pneumoniae, S. thermophilus, and E. hirae were cloned from genomic DNA of the respective organisms and inserted into the pMCSG7 vector (58). The PrimerX tool (http://bioinformatics.org/primerx) was used to generate primers for site-directed mutagenesis of E86D in the E. hirae copR gene. Recombinant Pyrococcus furiosus (Pfu) polymerase was used to perform all molecular reactions. Correct sequences were verified by Sanger sequencing before use.
Protein purification. Recombinant CopR-CopY proteins were purified as described by Neubert et al. (34), with modifications. The pMCSG7 vector includes an N-terminal 6ϫHis tag linked to CopR-CopY via a tobacco etch virus (TEV) protease cleavage site (58). Unless otherwise specified, all steps were performed on ice or at 4°C. After initial purification using immobilized metal-affinity chromatography (IMAC) (HisTrap FF; GE Healthcare), the crude CopR-CopY sample was incubated at 23°C with a 100:1 mass ratio of recombinant TEV. The cleaved product was confirmed via SDS-PAGE. Cleaved CopR-CopY was purified with subtractive IMAC; the TEV protease contains a C-terminal His tag. The flowthrough was concentrated using concentrator with a molecular weight cutoff (MWCO) of 10 kDa (MilliporeSigma) and further purified by size exclusion chromatography (SEC) (Superdex 200; GE Healthcare) using a buffer of 20 mM Tris (pH 8), 200 mM NaCl, and 1 mM Tris(2-carboxyethyl)phosphine (TCEP). Based on the elution volume and comparison to an SEC standard (Bio-Rad), we observed that CopR-CopY proteins eluted exclusively as a dimer. Peaks containing pure CopR-CopY (as determined by SDS-PAGE) were pooled, and the concentration was determined by absorbance at 280 nm. Samples were used immediately or protected with 35% to 50% glycerol and aliquoted into thin-walled PCR tubes and flash-frozen using liquid N 2 .
Biolayer interferometry DNA/protein binding. Double-stranded DNA fragments were hybridized as described above by incubating 5= biotinylated ssDNA with an unlabeled complementary strand at 95°C for 5 to 10 min and then left to cool to room temperature (see Table S1 in the supplemental material). The biotinylated dsDNA oligonucleotides were diluted to 250 or 50 nM in the assay buffer (50 mM Tris [pH 7.4], 150 mM NaCl, 4% glycerol, 1 mM TCEP, 0.1% bovine serum albumin [BSA]). During initial optimization, we observed significant nonspecific binding of CopY to the biosensors in the absence of BSA. The inclusion of 0.1% BSA eliminated signals of nonspecific CopY binding at and above the highest concentrations used in the assay (3 M). The dissociation constant for CopY with various DNA fragments was determined using an Octet Red384 (Pall ForteBio). Streptavidin biosensors (Pall ForteBio) were hydrated at 26°C using the Sidekick shaker accessory for 10 min at 1,000 rpm. Hydrated sensors were incubated in the assay buffer to acquire a primary baseline. The sensors were then loaded with biotinylated dsDNA, followed by a secondary baseline measurement using wells with buffer solution. DNA-loaded biosensors were then moved to wells containing various CopY concentrations to measure the association and then placed back into assay buffer for dissociation recordings. All experiments were maintained at 26°C with shaking at 1,000 rpm. The optimized protocol was as follows: primary baseline, 60 s; DNA loading, 150 s; secondary baseline, 180 s; association, 180 s; and dissociation, 270 s. The methods were optimized to smooth the signal at association and minimize recording at equilibria. Additionally, proteins were tested both with and without cleaving the N-terminal His tag.
Analysis was performed using Octet software. We applied a 1:1 binding model using a global fit to biosensor replicates at each concentration of CopY. During preprocessing, an average of the secondary baseline across the various biosensors was applied, as well as Savitzky-Golay filtering to reduce noise. The data were interstep corrected using an alignment to the dissociation step. Data were modeled using combined fits of absorption rate constant (k a ) and dissociation rate (k d ) values across independent replicates. Final estimates for K d and related statistics were taken from the kinetic analysis.
Rate constants for each sample were determined using the Octet analysis software as follows. For all 1:1 stoichiometric modeling, complex formation was evaluated as pseudofirst-order kinetics. The observed rate constant (k obs ) was calculated according to the equation Y ϭ Y 0 ϩ A͑1 Ϫ e -k obs ϫt ͒. Where Y 0 is the initial binding, Y is the level of binding, t is time, and A is asymptote value at max response. Dissociation rate (k d ) was calculated according to the equation Y ϭ Y 0 ϩ Ae -k d ϫt . The calculated k obs and k d values were then used to determine k a using the equation k a ϭ k obs Ϫk d ͓CopY͔ . Finally, the dissociation constant (K d ) was determined by the identity K d ϭ k d k a .