The optimal pH of AID is skewed from that of its catalytic pocket by DNA-binding residues and surface charge

Activation-induced cytidine deaminase (AID) is a member of the apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like (APOBEC) family of cytidine deaminases. AID mutates immunoglobulin loci to initiate secondary antibody diversification. The APOBEC3 (A3) sub-branch mutates viral pathogens in the cytosol and acidic endosomal compartments. Accordingly, AID functions optimally near neutral pH, while most A3s are acid-adapted (optimal pH 5.5-6.5). To gain a structural understanding of this pH disparity, we constructed high-resolution maps of AID catalytic activity vs pH. We found that AID's optimal pH was 7.3 but that it retained most (>70%) of its activity at pH 8. Probing of ssDNA-binding residues near the catalytic pocket that are key for bending ssDNA into the pocket (e.g., R25) yielded mutants with altered pH preference, corroborating previous findings that the equivalent residue in APOBEC3G (H216) underlies its acidic pH preference. AID from bony fish exhibited a more basic optimal pH (pH 7.5-8.1), and several R25-equivalent mutants altered pH preference. Comparison of pH optima across the AID/APOBEC3 family revealed an inverse correlation between positive surface charge and overall catalysis. The paralogue with the most robust catalytic activity (APOBEC3A) has the lowest surface charge and the most acidic pH preference, while the paralogue with the most lethargic catalytic rate (AID) has the most positive surface charge and the highest optimal pH. We suggest that one possible mechanism is surface charge dictating an overall optimal pH that differs from the optimal pH of the catalytic pocket microenvironment. These findings illuminate an additional structural mechanism that regulates AID/APOBEC3 mutagenesis.

[...] measured values represented by each graphed data point. Enzyme kinetic data points were plotted according to GraphPad Prism's Michaelis-Menten equation.
For EMSA binding kinetic graphs, 2 independent enzyme preparations were tested across two parallel experiments at pH 7.5, totaling 4 values represented by each graphed data point. Using GraphPad Prism Software, data were plotted as deamination activity of AID (the percentage of the substrate which was converted to product) versus effective pH. To facilitate comparison between various AID mutants, chimeras and orthologs with different overall activity levels, maximum percentage of deamination activity was calculated by dividing each data point by the maximum absolute value for each data set. EMSA data points were plotted according to GraphPad Prism’s one site-specific binding equation. In each graph, the error bars represent standard deviations.
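The quantification steps described above (percentage of substrate converted to product, then scaling each data set to its own maximum) can be sketched as follows. This is a minimal illustration, not the authors' analysis pipeline: the function names are hypothetical, the band intensities are invented for the example, and computing percent deamination as product / (product + substrate) is an assumed reading of "the percentage of the substrate which was converted to product".

```python
from statistics import mean, stdev


def percent_deamination(product, substrate):
    """Percentage of substrate converted to product, from two band intensities.

    Assumption: conversion = product / (product + substrate).
    """
    total = product + substrate
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * product / total


def normalize_to_max(values):
    """Express each value as a percentage of the largest value in the data set."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("data set contains no positive activity")
    return [100.0 * v / peak for v in values]


def summarize(replicates):
    """Mean and sample standard deviation of replicate values at one pH point."""
    return mean(replicates), stdev(replicates)


# Hypothetical (product, substrate) band intensities at three pH points.
bands = {6.5: (20.0, 80.0), 7.3: (40.0, 60.0), 8.0: (30.0, 70.0)}

# Absolute activity (% deamination) at each pH.
activity = {ph: percent_deamination(p, s) for ph, (p, s) in bands.items()}

# Relative activity (% of the maximum in this data set), which allows
# enzymes with different overall activity levels to share one axis.
relative = dict(zip(activity, normalize_to_max(list(activity.values()))))
```

Scaling each enzyme to its own maximum discards absolute activity, which is why overall activity differences between mutants are reported separately from their pH profiles.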

Thus, outside their normal physiological role, AID/A3 expression and activity are associated with tumor initiation and propagation.
Recently, several studies have shown that the catalytic pocket of AID/APOBEC3s can undergo dynamic closure/opening, a phenomenon termed "Schrödinger's CATalytic pocket" that acts as a structurally inherent regulator of mutagenic activity (51,52,76-78). Thus, within the AID/APOBEC3 family, many of the structural features underlying their role in immunity vs cancer (substrate preference, shape, DNA binding, and dC catalysis) are largely attributed to the surface structure of the variable secondary catalytic loops.
Outside their role in immunity and cancer, other biochemical features within the family have also been attributed to the surface composition of the AID/APOBEC3s.
Several studies have examined the pH preference within the AID/APOBEC3 family. A3s were found to have a considerably lower optimal pH (pH 5.5-6.5), characterized by dramatically lower activity at the physiological pH of 7.4 (61,79-82), except for the A3F catalytic domain (A3F-CTD), which demonstrated a high catalytic rate across pH 5.5-9.5 (81). In contrast, AID has been shown to be most active near physiological pH (80). The optimal pH of A3G-CTD (pH 5.5-6.5) is influenced by the surface residue H216 on L1 (79). H216R (equivalent to R25 in AID) largely erased this dependency on acidic pH (79), while in A3F, N214H (equivalent to H216 in A3G) increased activity at pH 5.5 by ~27-fold (81). Cumulatively, this suggests that protonation of this residue (histidine pKa = 6.0) is important for binding substrate and catalysis under acidic conditions. In AID and other A3s, the equivalent residue is located at the "mouth" of the catalytic pocket and plays a pivotal role in anchoring the negatively charged backbone of ssDNA into a deamination-feasible complex (51,55,57,79-81). Similarly, A3A was also shown to have an optimal pH of 5.5, demonstrating dramatically higher binding affinity and catalysis at acidic vs physiological pH (61,80,82). It was hypothesized that the acidic optimal pH of A3A represents adaptation for enzymatic activity in acidic compartments of the cell, such as lysosomes and phagolysosomes where A3A targets HIV-1 (61).
The aforementioned studies have established that A3s are more acid-adapted than AID and suggested substrate binding as a major mechanism underlying optimal pH in the AID/APOBEC3 family. However, these studies used pH increments of 0.5-1 across activity buffers, and we sought to establish a higher-resolution map of AID's pH preference. In so doing, we pinpointed the optimal pH of AID and demonstrated that it retains most of its activity at more basic pH. DNA-binding groove mutants localized at the mouth of the catalytic pocket were found to shift the optimal pH and dampen AID's activity at basic pH. Mutation of the equivalent DNA-binding groove residues in AID orthologs produced more dramatic shifts in optimal pH. Finally, we found AID chimeras involving ssDNA-binding groove 2 and the assistant patch that shift the optimal pH and dramatically alter activity at more basic pH. This work highlights the network of structural regions that determine the optimal pH of AID, with emphasis on the DNA binding grooves. More broadly, we suggest that AID/APOBEC3s modify their optimal pH through differences in isoelectric point. These results are important for understanding the evolution of pH adaptation within the AID/APOBEC3 family and amongst evolutionarily distant AID orthologs, and for engineering acid- or base-stable AID variants suitable for use in cytidine base-editing technologies.

AID and APOBEC expression and purification:
Wildtype human and orthologous GST-AIDs as well as domain-swapped and point mutant AIDs were expressed and purified as described before (51,64,83-85). Briefly, EcoRI-flanked coding sequences were synthesized by Genescript and cloned into the pGEX-5X-3 vector (GE Healthcare, USA) to generate GST-AID expression constructs. GST-AID expression was induced in DE3 E. coli cells by the addition of 1 mM IPTG at 16°C for 16 hours. Bacterial cells were lysed using a French pressure cell press (Thermospectronic) and the expressed GST fusion protein was purified using glutathione Sepharose high performance beads (Amersham). The yield and purity of protein preparations were determined by SDS-PAGE. GST-AID preparations of ~80-90% purity were dialyzed in 100 mM NaCl, 20 mM Tris-HCl pH 7.5, and 1 mM dithiothreitol and stored at -80°C. AID mutants were generated through site-directed mutagenesis. APOBEC3 enzymes were also expressed as GST-fusion proteins, by expression of pcDNA3.1-based vectors in HEK293T cells, as previously described (86,87). Briefly, 25 × 10 cm plates, each containing 5 × 10⁵ cells, were transfected with 5 µg of expression plasmid per plate using Polyjet transfection reagent (Froggabio), incubated for 48 hours, collected, and resuspended in 500 mM NaCl, 50 mM phosphate buffer pH 8.2, 0.2 mM PMSF, and 50 µg/ml RNase A. Following lysis in a French pressure cell press (Thermospectronic), the lysate was cleared by centrifugation and the supernatant was purified using glutathione Sepharose high-performance beads (Amersham).
Two to four independent preparations of each enzyme were purified so that results could be averaged across independently purified preparations.

Alkaline cleavage deamination assay:
The standard alkaline cleavage activity assay has been described previously for AID and APOBEC3s (19,51,64,83-85,87) [...] corresponding optimal pH, using the same reaction composition as described above. Depending on the optimal activity parameters of each AID/APOBEC, reactions were incubated at its optimal temperature and time, ranging from several minutes to 4 hours, followed by heat inactivation of AID/APOBECs (85°C for 20 minutes) to ensure that all measured activity occurred at the precise pH point rather than during the subsequent uracil glycosylation step, which would slightly alter the reaction pH.
To remove AID-generated uracil from the substrate, 10 µL of UDG reaction mix, comprising 0.2 µL of UDG enzyme (NEB, USA), 2 µL of UDG buffer and 7.8 µL of milliQ water, was added to each reaction, followed by incubation at 37°C for 30 minutes. The abasic site was then cleaved by adding 2 µL of 1 M NaOH, to a final [NaOH] of 100 mM, and incubating the reactions at 96°C for 10 min. To separate the substrate from the product, reactions were electrophoresed on a 14% denaturing acrylamide gel. To visualize the result, gels were exposed to a Kodak Storage Phosphor Screen GP (Bio-Rad) and imaged using a PhosphorImager (Bio-Rad, Hercules, CA, USA).

Preparation of enzyme activity assay buffers of finely varying pH increments:
The AID activity buffers are based on the previously described standard reaction buffer for optimal AID activity in the alkaline cleavage assay, which is 100 mM Phosphate buffer, pH 7.2 (19,51,83,84,87). 100 mM Phosphate buffer with pH ranging from 5.

Data collection and quantification:
Image Lab or Quantity One Software (Bio-Rad, Hercules, CA, USA) was used to perform band densitometry of the product and substrate bands in alkaline cleavage gels and of the bound and unbound bands in EMSA gels. For pH range plots, the alkaline cleavage deamination assay was carried out in duplicate for two to four independently purified preparations of each enzyme, resulting in 4 to 8 values represented by each graphed data point. Data in pH range plots were "lightly" smoothed using GraphPad Prism's XY smoothing function with 4 neighboring data points. For the enzyme velocity kinetic plots at a given pH, two independent preparations of each AID were tested in two parallel experiments for each preparation, totaling 4 independently- [...]

[...] recurring binding modes (based on RMS) constituting a low-energy cluster, totaling 32 low-energy clusters for each AID-ssDNA complex. AID-ssDNA docked complexes were analyzed using UCSF Chimera v1.7 (https://www.cgl.ucsf.edu/chimera/) (94). All AID and AID-ssDNA complexes were visualized using PyMOL v1.7.6 (https://pymol.org/2/). The E Catalytic effective pKa, isoelectric point and surface charges of non-complexed enzymes were calculated using the PDB2PQR web server (https://server.poissonboltzmann.org/pdb2pqr) (95,96).

R25H acid-shifts the optimal pH of AID and dampens activity at basic pH
It was previously reported that the majority of active A3 paralogues (A3A, A3B, A3C, A3D, A3G and A3H) deaminate cytidine efficiently at acidic pH (pH 5.5-6.5) but quite poorly at neutral or slightly basic pH (pH 7.0-8.0). It has been suggested that this difference in enzymatic activity was due to a net positive charge at acidic pH and net negative (or near zero) charge at neutral pH ranges, thus favoring binding to the negatively charged substrate ssDNA at lower pH.
We and others have previously described the surface charge of AID at neutral pH and showed that it is highly positively charged, particularly in the ssDNA binding grooves (19,51,56,57,84). Indeed, a previous report showed that AID was active at pH 6.5-8.5 with the highest activity at pH 7.5 (80); however, this study examined activity at only 4 pH points (pH 5.5, 6.5, 7.5, and 8.5). Therefore, to establish a structural understanding of AID's pH preference, we began by ensuring that our AID and A3 enzyme preparations behave according to the pH hierarchy established by several previous reports. To this end, we compared AID, A3A and A3G (Figure 1A) and examined their enzymatic activity at several pH values in the standard alkaline cleavage assay (Figure 1B). We confirmed the previous observations that the A3s were adapted to a more acidic pH than AID, with optimal pH at ~6 and ~7.2, respectively. To ensure that AID's optimal pH did not depend on the particular substrate used, we also compared activity on three WRC motifs (TGC, TAC and AAC) and a non-WRC target (GGC) and observed the same optimal pH, suggesting that this preference is a property of AID (Figure S1A). Interestingly, we noted a narrowing of AID substrate specificity at acidic pH, in agreement with A3A's narrowed motif specificity at low pH (61).
Given the moderate degree of homology between Hs-AID and A3A/A3G (Figure 1A), we wondered about the structural basis for this difference in pH preference within the AID/APOBEC3 family. Because of the dramatic charge differences predicted at pH 7 (Figure S1B), we reasoned that there must be a network of ionizable residues that modulate substrate binding and influence catalysis. We examined surface-exposed residues in the ssDNA binding groove of AID, near the catalytic pocket. We previously presented AID-ssDNA binding simulations identifying residues predicted to bind ssDNA on the surface in deamination-feasible complexes (51). This study proposed 50 DNA binding residues localized along two predicted ssDNA binding grooves (grooves 1 and 2) intersecting at the catalytic pocket. Subsequent crystal structures of AID and AID-DNA complexes (PDB: 5W0R) confirmed that groove 1 was the dominant DNA-binding groove in AID, thereby strengthening our original predictions (57). Of these DNA-binding residues, R25 was identified as the most frequent ssDNA-binding residue, contacting ssDNA in 94% of AID-ssDNA complexes that positioned dC for catalysis. Intersecting both ssDNA binding grooves and located at the apex of the catalytic pocket, R25 was found to bind nucleobases and to electrostatically bind 2-5 phosphodiester groups of ssDNA (Figure 1C). Indeed, the R25-equivalent residue in A3G (H216) (Figure 1A) was previously identified as a key residue responsible for the acidic optimal pH of A3G, hypothesized to result from protonation of its imidazole ring contacting ssDNA (79). Comparing several A3s and AID [...]

Besides differences in overall activity, mutation of R25 also altered the pH preference, as hypothesized. Whereas Hs-AID has an optimal pH range of 7.2-7.6 with stable activity at higher pH (70% of peak activity at pH 8.0), R25H and R25N exhibit a slightly lower and an essentially unaltered optimal pH (7.1 and 7.33, respectively) (Figure 1F).
Thus, R25H lowers the optimal pH of AID and dramatically decreases activity at even slightly more basic pH (10-20% of max activity of R25H at pH 8.0-8.2, compared to 50-70% for Hs-AID), likely due to a loss of local ssDNA binding near the catalytic pocket via deprotonation of the imidazole ring. In contrast, R25N is closer to Hs-AID both in its optimal pH (7.33 vs. 7.45) and with respect to retaining much of its activity at basic pH (50-60% of max activity at pH 8.20; Figure 1F). The activity profiles obtained thus far were based on using single AID:substrate ratios across many fine pH increments as shown in Figure 1D. To validate the optimal pH results of these activity assays across pH 5.94-8.20, we carried out enzyme kinetics to compare initial deamination velocities of Hs-AID, R25H and R25N. For each of the three enzymes, we measured velocity at Hs-AID's optimal pH (pH 7.21), an acidic pH below optimal (pH 6.7), and a basic pH above optimal (pH 7.99) ( Figure 1G). The initial velocity kinetics supported the results obtained by pH range experiments, first in that all three enzymes exhibited the highest velocity at their optimal pH, and second, in that Hs-AID maintains a higher level of activity at the basic range past its optimal pH, as compared to R25 mutants. From these analyses, we make three conclusions: first, Hs-AID retains most of its activity at basic pH; second, homologous mutations of R25 can dramatically change overall activity levels of AID and its pH preference patterns; third, in line with previous finding that the R25 equivalent in A3G (H216) underlies the acidic pH preference of A3G, Hs-AID R25H also exhibits a shift to lower optimal pH with significantly lowered activity at higher basic pH ranges. Having demonstrated the role of R25 in modulating AID's pH sensitivity, we continued along the same rationale to explore the role of other ionizable residues neighboring R25 in the ssDNA binding grooves. 

E26 stabilizes the ssDNA binding grooves, while E26R base-shifts the optimal pH
Adjacent to R25, E26 has also been predicted to contact ssDNA in a large proportion of AID:ssDNA complexes (75% of complexes; 51). Unlike R25, which binds ssDNA via the phosphodiester backbone and nucleobases, E26 binds nucleobases but appears to electrostatically repel the DNA phosphodiester groups. Instead, E26 hydrogen bonds with Q14 and Y28, and electrostatically binds the ssDNA binding residues R25, R50 and K52 (R25, R50 and K52 bind ssDNA in 93%, 31% and 69% of complexes, respectively; 51; Figure 2A-B). In the case of R25H, at pH 8-8.5 a local negative charge is created proximal to the opening of the catalytic pocket (neutral H25 + negative E26), so a lower pH is required to protonate H25 and promote ssDNA binding. By the same logic, E26R creates a positively charged side-chain capable of hydrogen bonding with Q14 and Y28, but disrupts the stabilization of R25, R50 and K52, thereby hindering ssDNA binding near the catalytic pocket. We found that E26R retained catalysis near that of Hs-AID, but unexpectedly increased the optimal pH to 7.77 (Figure 2D). We suspect that the disrupted DNA binding grooves of E26R are optimized at higher pH, resulting in near-native levels of catalysis.
However, because a large portion of its side-chain is buried between L1/α1, R24 plays a more critical role in stabilizing the DNA binding grooves (Figure 2A; 51,57). We therefore speculated that mutation of R24 would result in a severe impairment of activity. Indeed, R24D resulted in a complete loss of enzymatic activity, while R24A produced a catalytically hindered mutant with no change in pH preference (Figure 2C). Like the impairment of catalysis caused by mutation of R25, R24D and R24A hampered catalysis whilst maintaining a high affinity for binding ssDNA (Figure S1D), suggesting that these mutations, like R25H, maintain the integrity of the ssDNA binding grooves but compromise deamination-specific ssDNA binding. We also probed the effect of several hyper-IgM mutations and other mutations on the pH preference of AID. Most mutants were catalytically inactive, and those that were enzymatically active showed no change in pH preference in comparison to Hs-AID (Figure S1E). We conclude that R24 and E26 stabilize deamination-feasible conformations of the ssDNA binding grooves and that E26 influences pH preference through stabilization of the electrostatic topology near the catalytic pocket.

Orthologous AIDs have variable optimal pH, largely due to differences in a single surface residue
We examined the pH preference of AID from evolutionarily distant species. We have previously demonstrated that AID from zebrafish (Danio rerio; Dr-AID) and channel catfish (Ictalurus punctatus; Ip-AID) display biochemical differences compared to Hs-AID, such as catalytic rates (Dr-AID >> Hs-AID >> Ip-AID) and optimal temperatures (fish AID are adapted to colder temperatures than mammalian AID) (83,85). We wondered if some of these structural differences also mediated differences in optimal pH. As with AID/A3s from humans, we observed the same variability of the Hs-AID R25-equivalent residue amongst orthologous AID (R, H or N), in that Dr-AID and Ol-AID both had H29 whilst Ip-AID had N28 (Figure 3A). We next [...] However, there are many sequence differences between Hs-, Dr-, Ip- and Ol-AID, notably in the secondary catalytic loops, ssDNA-binding grooves, and overall surface charge (Figure 3A and S1B), all of which could influence pH preference. Additionally, the A3s were shown to be acid-adapted (optimal pH 5.5-6.5; 61,79-81) despite varying between R, H or N at this position (Figure S1C). Therefore, it is unlikely that the optimal pH of orthologous AID is dictated solely by the identity of this residue. To test this hypothesis, we first examined the enzymatic activity of Dr-, Ol- and Ip-AID in comparison to Hs-AID at different pH (Figure 3C). In terms of optimal pH, Dr-AID and Ol-AID were quite close to Hs-AID (pH 7.6 vs 7.3) whilst Ip-AID was more base-adapted (pH 7.9). Like Hs-AID, the orthologous fish AID also exhibited a stable pH profile at higher pH. The near-optimal pH window (>90% of optimal activity) is 7.1-7.6 for Hs-AID, 7.2-7.8 for Dr-AID, 7.2-8.1 (broadest) for Ol-AID and 7.5-8.2 (most basic) for Ip-AID.
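The near-optimal pH window quoted above (the pH range over which activity stays above 90% of the peak) can be extracted from a pH-activity profile along the following lines. This is a sketch under stated assumptions: the function name and the example profile are hypothetical, and the window is reported at the resolution of the sampled pH points without interpolation, which may differ from how the published ranges were read off the smoothed curves.

```python
def near_optimal_window(ph_values, activities, threshold=0.9):
    """Return (optimal_pH, low_pH, high_pH).

    low_pH..high_pH is the contiguous run of sampled pH points around the
    optimum whose activity is at least `threshold` times the peak activity.
    """
    if len(ph_values) != len(activities) or not ph_values:
        raise ValueError("need equally sized, non-empty pH and activity lists")

    # Sort by pH so neighbours in the list are neighbours on the pH axis.
    pairs = sorted(zip(ph_values, activities))
    peak_index = max(range(len(pairs)), key=lambda i: pairs[i][1])
    cutoff = threshold * pairs[peak_index][1]

    # Walk outwards from the peak while activity stays above the cutoff.
    low = peak_index
    while low > 0 and pairs[low - 1][1] >= cutoff:
        low -= 1
    high = peak_index
    while high < len(pairs) - 1 and pairs[high + 1][1] >= cutoff:
        high += 1

    return pairs[peak_index][0], pairs[low][0], pairs[high][0]


# Hypothetical profile (percent deamination at each pH), for illustration only.
ph = [6.8, 7.0, 7.2, 7.4, 7.6, 7.8, 8.0]
act = [55.0, 80.0, 95.0, 100.0, 92.0, 80.0, 70.0]

optimum, low, high = near_optimal_window(ph, act)
```

With finer pH increments the window edges become correspondingly sharper, which is the practical benefit of the high-resolution buffer series used here.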
We next probed the effects of reciprocal mutations of this position in Dr- and Ip-AID (Figure 3D and E). Dr-AID H29R and H29N both showed a shift towards a more basic optimal pH of 8.1 in comparison to Dr-AID. On the other hand, Ip-AID N28R and N28H showed a shift towards a more acidic optimal pH of 7.1 and 7.33, respectively, compared to Ip-AID. Thus, AID from different species demonstrate differences in optimal pH, and reciprocal mutation of the Hs-AID R25-equivalent residue between Dr-AID (H29) and Ip-AID (N28) imparts the donor enzyme's optimal pH to the recipient (7.3 vs. 8.1). These results are consistent with the analyses of Hs-AID mutants, in which R25N was slightly more base-shifted in its optimal pH than R25H.
Even though the reciprocal mutation of Hs-AID's R25 equivalents in Dr-AID and Ip-AID were sufficient to impart the optimal pH of the donor, two lines of evidence suggested that other residues or regions in AID/APOBEC3s are likely involved in determination of optimal pH. First, R25N in Hs-AID did not mediate a drastic basic pH shift, as it did in the case of Dr-AID's H29N, and second, A3F with an N in the equivalent position (N214) represents an oddity within the AID/APOBEC family as indicated by its retention of catalysis across a broad pH range of 5.5-9.5 (81). Thus, we next sought to examine three different factors that could influence optimal pH: first, the optimal pH for the deamination reaction in the catalytic pocket, second, the overall surface charge and isoelectric point of each protein in its native folded state, and third, specific physical regions that when transplanted into chimeric enzymes could alter optimal pH.

The catalytic pocket microenvironment is not a determinant of optimal pH in AID/APOBEC3s
In the catalytic pocket of AID/APOBEC3s, the catalytic glutamic acid (E58 in human AID, henceforth referred to as E Catalytic) donates a proton during the cytidine deamination reaction and is then re-protonated, thus regenerating the catalytic pocket for the next deamination reaction (53). The transition between the de-protonated and re-protonated states would be quickest at pH ranges nearest to the effective pKa of E Catalytic. Unlike the intrinsic pKa, which is suitable for surface residues, the effective pKa is determined by the microenvironment of the ionizable group. For example, many enzyme active sites utilize a catalytic glutamic acid with an elevated effective pKa (e.g., intrinsic pKa = 4.5 vs. effective pKa = 6.7; 97). Thus, in the case of AID, the E Catalytic effective pKa is influenced by its solvent accessibility and the composition of the secondary catalytic pocket residues. Given the catalytic pocket homology within the AID/APOBEC3 family (Figure 1A), we wondered if the optimal pH of AID/APOBEC3s would reflect the E Catalytic effective pKa. Because of A3F's unique and wide-ranging optimal pH of 5.5-9.5, we suspect that structural features outside the scope of this study explain its pH preference, and we therefore excluded A3F from further analysis. We predicted the E Catalytic effective pKa of AID/APOBEC3 structures and found that it was well conserved within a narrow range (E Catalytic effective pKa = 2.7-3.5) (Figure S1F). The average E Catalytic effective pKa amongst the A3s was only slightly lower than that of the AID orthologs (3.0 vs. 3.5). Therefore, it appears unlikely that this factor is responsible for the differences in optimal pH. However, an interesting trend can be noted in the difference between the E Catalytic effective pKa and optimal pH.
If the most efficient de-protonation/re-protonation cycle of E Catalytic occurs at or near its effective pKa, it would stand to reason that the further the enzyme's optimal pH is from its E Catalytic effective pKa, the less optimal the de-protonation/re-protonation regeneration of E Catalytic between deamination reactions, and hence the slower the catalytic rate. In support of this notion, we observed that the difference between optimal pH and E Catalytic effective pKa was greater in AID (all orthologs) vs. APOBEC3s (Figure 4A), a trend that correlates with their respective catalytic rates. These findings suggest that E Catalytic effective pKa does not directly alter the optimal pH of AID/APOBEC3s; however, they might be suggestive of a possible mechanism used by these enzymes to limit mutagenic activity: to evolve structural features outside the catalytic pocket that position the enzyme's overall optimal pH at a greater distance from its E Catalytic effective pKa.
Thus, we sought to investigate the structural features beyond the catalytic pocket that determine overall pH optima.
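The reasoning in this section, that an optimal pH far from the E Catalytic effective pKa leaves little of the glutamate in its protonated state, can be made concrete with the Henderson-Hasselbalch relation. The sketch below is an equilibrium estimate only, not a kinetic model of the deamination cycle. The function names are hypothetical, and pairing the family-average effective pKa values quoted above (3.5 for AID orthologs, 3.0 for A3s) with the optimal pH of a single enzyme (7.3 for Hs-AID, 5.5 for A3A) is a simplifying assumption made for illustration.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch: fraction of an acidic group that is deprotonated."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def fraction_protonated(ph, pka):
    """Complementary fraction of the acidic group that still carries its proton."""
    return 1.0 - fraction_deprotonated(ph, pka)


def ph_offset(optimal_ph, effective_pka):
    """Distance between an enzyme's optimal pH and its catalytic-residue pKa."""
    return optimal_ph - effective_pka


# Values quoted in the text; the pairing is an illustrative assumption.
aid_offset = ph_offset(7.3, 3.5)   # Hs-AID optimum vs. average AID-ortholog pKa
a3a_offset = ph_offset(5.5, 3.0)   # A3A optimum vs. average A3 pKa

# Equilibrium protonated fraction of the catalytic glutamate at each optimum.
aid_protonated = fraction_protonated(7.3, 3.5)
a3a_protonated = fraction_protonated(5.5, 3.0)
protonation_ratio = a3a_protonated / aid_protonated
```

Under these assumptions the offset is larger for AID than for A3A, and the protonated fraction at A3A's optimum is about 20-fold higher than at AID's optimum, in line with the direction of the trend described for Figure 4A.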

Computed surface charge correlates with optimal pH in the AID/APOBEC3 family
To examine other possible attributes that may determine the optimal pH of AID/APOBEC3 paralogues, we probed the relationship between optimal pH and isoelectric point. A linear relationship was observed in which the APOBECs with lower isoelectric points had lower optimal pH (Figure 4B). Thus, optimal pH within the AID/APOBEC3 family tracks closely with isoelectric point.
When we examined surface charge vs. optimal pH or near-optimal pH (>90% catalysis) (Figure 4C, Figure S1B), we noted that all AID/APOBEC3s exhibited a positive surface charge. All AID orthologs had a higher positive charge (average +11) than the A3s (average +6) at their optimal pH. Consistent with its quick on/off rate of ssDNA binding, A3A exhibited the lowest positive charge (+3.5) at its optimal pH of 5.5, while its surface charge became negative near neutral pH (Figure S1B), explaining its requirement for acidic pH. This indicates that an overall positive surface charge is a key determinant of optimal pH within the AID/APOBEC3 family, likely for optimal ssDNA binding/catalysis. This idea is consistent with a previous study demonstrating that the optimal pH for A3A catalysis correlated with the optimal pH of substrate binding, namely that A3A's binding affinity for ssDNA was several-fold diminished when tested at near-neutral pH vs. its optimal pH of 5.5 (61). In addition, we note that the most catalytically robust AID/APOBEC3 paralogue (A3A) has the lowest optimal pH and the lowest surface charge at its optimal pH, while the most catalytically lethargic enzyme (AID) has the highest optimal pH and the highest surface charge at its optimal pH. Thus, we speculate that higher surface charge in the AID/APOBEC3 family contributes to catalytic lethargy through two mechanisms: first, by facilitating slower on/off binding events, and second, by resulting in an optimal pH that is further displaced from the E Catalytic effective pKa, as discussed in the preceding section.
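To illustrate how net charge and isoelectric point depend on pH, the sketch below sums Henderson-Hasselbalch charges over ionizable groups and finds the pH of zero net charge by bisection. It is not the PDB2PQR calculation used in this study: it ignores the structural microenvironment entirely and uses approximate textbook pKa values (histidine 6.0 and glutamate 4.5 as quoted in the text; the others are assumptions), tyrosine and cysteine are left out for brevity, and the residue counts describe a hypothetical protein rather than any AID/APOBEC3.

```python
# Approximate model pKa values (assumptions; not structure-based predictions).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.5, "C_term": 2.0}


def net_charge(counts, ph):
    """Net charge at a given pH from counts of ionizable groups."""
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        # Basic groups carry +1 when protonated.
        charge += counts.get(group, 0) / (1.0 + 10.0 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        # Acidic groups carry -1 when deprotonated.
        charge -= counts.get(group, 0) / (1.0 + 10.0 ** (pka - ph))
    return charge


def isoelectric_point(counts, low=0.0, high=14.0, tol=1e-4):
    """pH at which net charge is zero, found by bisection.

    Relies on net charge decreasing monotonically as pH increases.
    """
    if net_charge(counts, low) <= 0 or net_charge(counts, high) >= 0:
        raise ValueError("net charge does not change sign between low and high")
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(counts, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical composition, for illustration only.
toy = {"K": 10, "R": 12, "H": 6, "D": 10, "E": 12, "N_term": 1, "C_term": 1}

charge_acidic = net_charge(toy, 5.5)
charge_neutral = net_charge(toy, 7.4)
toy_pi = isoelectric_point(toy)
```

In this toy case most of the charge lost between pH 5.5 and 7.4 comes from histidine deprotonation, which mirrors why histidine at the R25-equivalent position is expected to make local charge, and hence ssDNA binding, sensitive to pH in this range.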

The optimal pH and base-stability of AID is dependent on the ssDNA binding grooves and assistant patch
To test whether there are distinct structural motifs in the AID structure that alter pH sensitivity, we constructed multiple chimeras incorporating regions from Dr-AID into an Hs-AID scaffold (Figure 5A and S2) and compared their optimal pH to that of Hs-AID (Figure 5). We began by targeting loop [...] binding catalytically-productive ssDNA complexes and thus unlikely to perturb optimal pH. As expected, Hs/Dr-AID-5 and Hs/Dr-AID-6 did not perturb the optimal pH (Figure 5B). We also targeted α3 (Hs/Dr-AID-3, Hs/Dr-AID-4, and Hs/Dr-AID-1), a region that composes ssDNA-binding groove 2 (19,51,56) and would likely affect optimal pH (Figure 5C). Analogous to R25H in Hs-AID, we found that Hs/Dr-AID-3 and Hs/Dr-AID-4 both acid-shifted the optimal pH slightly (optimal pH = ~7.1), while substantially reducing activity at more basic pH (30-40% and 20-30% of max activity for Hs/Dr-AID-3 and Hs/Dr-AID-4 at pH 8.0-8.2, respectively, compared to 50-70% for Hs-AID). Inspection of the Hs/Dr-AID-3 sequence (Figure S2) revealed major differences from Hs-AID, including the rearrangement of several positive/negative residues, as well as the loss and gain of histidine residues (H93Q and D96H, respectively) that may impact ssDNA binding across pH. As a follow-up, we also examined the optimal pH of L2+L4+α3 (Hs/Dr-AID-1), a chimera we previously reported to have enhanced catalytic pocket accessibility and catalytic rate (51). Surprisingly, Hs/Dr-AID-1 retained the same optimal pH and base-stability as Hs-AID, suggesting that the reduced activity at basic pH conferred by α3 was masked by L2 and L4. Although not directly involved in binding ssDNA complexes near the active site, L2 and L4 are heavily involved in the predicted dimerization interface of AID (56,98), whereby L2 and L4 from the neighboring AID form an extended ssDNA binding groove 1. Thus, chimeras involving L2 and L4 may indirectly affect pH stability through changes to ssDNA binding groove 1 of an AID dimer, or to the formation of AID dimers altogether.
Next, we examined chimeras altering the N-terminal region (N, Hs/Dr-AID-8), loop 7 (L7, Hs/Dr-AID-9) and the C-terminal region (C, Hs/Dr-AID-11; Figure 5D). The N-terminal region alters the N-termini, α1 and part of L1 (retains R25), which contains some residues involved in ssDNA binding, but acts primarily to stabilize the conformation of L1 residues involved in deamination-specific ssDNA binding (e.g., R25 and E26). L7 contains many ssDNA contact residues, most notable for its vital role in substrate specificity (75) and dC binding in the catalytic pocket amongst the AID/APOBEC family (51,52,57,63,66,73,78). The C-terminal region alters α6 and α7. α6 houses the assistant patch, which mediates structured DNA-binding in AID (57), while α7 is unique to AID in that it shares no homology with other APOBECs ( Figure 1A). Additionally, α7 was predicted to adopt several low-energy conformations relative to the core of AID, some of which contact L7 and may impact the ssDNA binding grooves (51).
Thus, we hypothesized that alteration of these regions, alone or in combination, would modulate pH sensitivity. Examining Hs/Dr-AID-8 and Hs/Dr-AID-9, we found that both base-shift the optimal pH to 7.56 while retaining a similar level of activity at more basic pH. In contrast, Hs/Dr-AID-11 is most active across a broadened pH range (optimal pH = 7.56-8.2; Figure 5D, right panel).
Given the interplay between these regions, we next examined chimeras involving N+L7, N+C and N+L7+C (Hs/Dr-AID-10, Hs/Dr-AID-12 and Hs/Dr-AID-13, respectively; Figure 5E). Hs/Dr-AID-10 and Hs/Dr-AID-12 both demonstrated a base-shifted optimal pH (7.56 and 7.77, respectively) and stability at basic pH like that of Hs/Dr-AID-8, suggesting that the N-terminal region plays a dominant role in these chimeras. Interestingly, Hs/Dr-AID-13 demonstrated an acid-shifted optimal pH of 6.99 with retention of >70% activity across a broadened pH range of 6.8-8.2 (Figure 5E, right panel). The observation that the N, L7 and C chimeras each yield a base-shifted enzyme, whereas a chimera combining all three yields an acid-shifted enzyme with activity across a broadened pH range, reinforces the notion that the pH sensitivity of AID is dependent upon a network of DNA binding residues.

Using a high-resolution map of catalysis vs. pH, we have pinpointed the optimal pH of AID and considered it in the context of its APOBEC3 paralogues. We found a correlation between optimal pH and surface charge in AID/APOBEC3s and noted that catalytic rate has an inverse relationship with both surface charge and optimal pH. The family member with the lowest catalytic rate (AID) has the most positive surface, the highest optimal pH (neutral-basic), and the largest difference between its optimal pH and the optimal pH of its catalytic pocket microenvironment (E Catalytic effective pKa). In contrast, the family member with the most robust catalytic rate (A3A) has the least positively charged surface, the lowest optimal pH (acidic), and the smallest difference between its optimal pH and the optimal pH of its catalytic pocket microenvironment (E Catalytic effective pKa).
Thus, we suggest that in AID/APOBEC3s, surface charge regulates enzyme activity through two mechanisms. First, the enzymes with the least positively charged surface allow faster on/off binding of negatively charged ssDNA. Second, surface charge modulates the magnitude of the difference between the enzyme's overall optimal pH and the optimal pH of the catalytic microenvironment (E Catalytic effective pKa): the further the enzyme's optimal pH is from the E Catalytic effective pKa, the less efficient the regeneration of E Catalytic (de-protonation/re-protonation cycle) between successive cytidine deamination reactions.

Conclusion
We previously showed that structural restriction of the catalytic pocket through spontaneous occlusion/opening (Schrödinger's CATalytic pocket) is a structurally built-in mechanism that regulates the mutagenic activities of AID/APOBEC3s. Here, our in-depth study of pH optima across AID/APOBEC3s led us to recognize yet another structurally built-in regulatory mechanism that limits the mutagenic activities of these enzymes.

Figure 4. A) Optimal pH minus E Catalytic effective pKa across APOBEC3s and orthologous AIDs. On average, APOBEC3 enzymes have a lower difference compared to orthologous AID. B) Optimal pH vs isoelectric point (R² = 0.85, p < 0.05). In general, orthologous AID have a higher isoelectric point and optimal pH as compared to APOBEC3s. C) Enzyme surface charge vs optimal pH of orthologous AID and APOBEC3s. At their respective optimal pH, AID has a higher positive surface charge as compared to APOBEC3s. The horizontal range bars indicate the pH range over which the enzyme's activity is >90% of the maximal activity measured at optimal pH. The vertical range bars indicate the charge range across this pH range (derived from Figure S1B).