Phosphatase activity tunes two-component system sensor detection threshold

Two-component systems (TCSs) are the largest family of multi-step signal transduction pathways in biology, and a major source of sensors for biotechnology. However, the input concentrations to which biosensors respond are often mismatched with application requirements. Here, we utilize a mathematical model to show that TCS detection thresholds increase with the phosphatase activity of the sensor histidine kinase. We experimentally validate this result in engineered Bacillus subtilis nitrate and E. coli aspartate TCS sensors by tuning their detection threshold up to two orders of magnitude. We go on to apply our TCS tuning method to recently described tetrathionate and thiosulfate sensors by mutating a widely conserved residue previously shown to impact phosphatase activity. Finally, we apply TCS tuning to engineer B. subtilis to sense and report a wide range of fertilizer concentrations in soil. This work will enable the engineering of tailor-made biosensors for diverse synthetic biology applications.

Supplementary Note 1. Model of the effect of phosphatase activity on detection threshold.

Derivation of [RR~P]
The Batchelor-Goulian TCS model 1 describes reactions for SK autophosphorylation/SK~P autodephosphorylation, SK~P/RR binding/unbinding, transfer of the phosphoryl group from SK~P to RR, SK/RR~P binding/unbinding, and dephosphorylation of RR~P by SK. The reactions, rate constants, and corresponding ordinary differential equations are depicted below:

Model of gene regulation
In a canonical TCS, phosphorylation induces RR dimerization and subsequent promoter binding and regulation of transcription. We modelled these processes via a classical Hill function for activatable promoters. In this function, $b$ represents leaky transcription from the promoter in the absence of activator, and $(a + b)$ is the maximal transcription from the promoter. $K$ is the equilibrium constant, and the promoter is half activated when [RR~P] equals $K$. The Hill coefficient, $n$, represents the level of cooperativity of RR~P binding.

$$P = b + a \cdot \frac{[\mathrm{RR{\sim}P}]^{n}}{K^{n} + [\mathrm{RR{\sim}P}]^{n}}$$
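As a minimal sketch of the Hill function described above, the Python snippet below evaluates promoter output for a given RR~P level. The symbol names (`b`, `a`, `K`, `n`) and all numerical defaults are illustrative assumptions, not the parameters used in this work.

```python
def promoter_output(rr_p, b=0.05, a=1.0, K=1.0, n=2.0):
    """Hill function for an activatable promoter with leaky expression.

    rr_p : concentration of phosphorylated response regulator (arbitrary units)
    b    : leaky transcription in the absence of activator (assumed value)
    a    : inducible transcription, so that a + b is the maximum (assumed value)
    K    : RR~P level giving half activation of the inducible part (assumed value)
    n    : Hill coefficient describing cooperativity (assumed value)
    """
    return b + a * rr_p ** n / (K ** n + rr_p ** n)


# Evaluate the promoter at a few RR~P levels spanning the transition.
for level in (0.0, 0.1, 1.0, 10.0, 100.0):
    print(f"[RR~P] = {level:7.2f} -> output = {promoter_output(level):.4f}")
```

With no RR~P the output equals the leak `b`, at `rr_p == K` the inducible part is half of `a`, and at saturating RR~P the output approaches `a + b`.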
We used the following parameters to simulate a leaky output promoter.

TCS simulations
To simulate the TCS transfer function, we assumed that ligand concentration was linearly related to the autokinase rate of the SK ($k_k$), as done previously 3, scaling $k_k$ between $10^{-2}$ and $10^{1}$ (Fig. 1b). To simulate the effect of phosphatase-diminishing mutations, we varied the CEp term between $10^{-2}$- and $10^{2}$-fold that of WT. Python code for the simulations is included as Supplementary Material.
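The sweep described above can be illustrated with a small, self-contained sketch. It is not the Supplementary Material code: the reaction scheme follows the verbal description of the Batchelor-Goulian model given earlier, while every rate constant, the total SK and RR amounts, the variable names, and the fixed-step RK4 integrator are assumptions chosen only so the example runs quickly.

```python
def derivatives(state, p):
    """Mass-action rates for the six species of the sketched TCS model."""
    S, Sp, R, Rp, C1, C2 = state  # SK, SK~P, RR, RR~P, SK~P.RR, SK.RR~P
    auto = p["kk"] * S - p["kmk"] * Sp          # SK <-> SK~P
    bind1 = p["kb1"] * Sp * R - p["kd1"] * C1   # SK~P + RR <-> SK~P.RR
    transfer = p["kt"] * C1                     # SK~P.RR -> SK + RR~P
    bind2 = p["kb2"] * S * Rp - p["kd2"] * C2   # SK + RR~P <-> SK.RR~P
    dephos = p["kp"] * C2                       # SK.RR~P -> SK + RR
    return (
        -auto + transfer - bind2 + dephos,  # d[SK]/dt
        auto - bind1,                       # d[SK~P]/dt
        -bind1 + dephos,                    # d[RR]/dt
        transfer - bind2,                   # d[RR~P]/dt
        bind1 - transfer,                   # d[SK~P.RR]/dt
        bind2 - dephos,                     # d[SK.RR~P]/dt
    )


def rk4_step(state, p, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * d for s, d in zip(state, k))

    k1 = derivatives(state, p)
    k2 = derivatives(shifted(k1, dt / 2), p)
    k3 = derivatives(shifted(k2, dt / 2), p)
    k4 = derivatives(shifted(k3, dt), p)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def final_rrp(kk, kp, t_end=100.0, dt=0.02):
    """Integrate from an unphosphorylated start and return free RR~P at t_end."""
    # All rate constants below are hypothetical placeholders.
    p = dict(kk=kk, kmk=0.1, kb1=1.0, kd1=1.0, kt=1.0, kb2=1.0, kd2=1.0, kp=kp)
    state = (1.0, 0.0, 10.0, 0.0, 0.0, 0.0)  # assumed totals: SK = 1, RR = 10
    for _ in range(int(t_end / dt)):
        state = rk4_step(state, p, dt)
    return state[3]


# Sweep the autokinase rate over 10^-2 ... 10^1 in half-decade steps.
kk_values = [10 ** (-2 + 0.5 * i) for i in range(7)]
wt_curve = [final_rrp(kk, kp=1.0) for kk in kk_values]
mutant_curve = [final_rrp(kk, kp=0.1) for kk in kk_values]  # 10-fold weaker phosphatase

for kk, wt, mut in zip(kk_values, wt_curve, mutant_curve):
    print(f"kk = {kk:8.3f}  RR~P (WT) = {wt:6.3f}  RR~P (low phosphatase) = {mut:6.3f}")
```

In this sketch the phosphatase-weakened variant accumulates more RR~P at a given autokinase rate, which is the qualitative behaviour that shifts the detection threshold to lower input levels.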

Relationship between K1/2 and dynamic range
We also examined the effect that changes in phosphatase activity had on the dynamic range of the TCS. We limited the effect of ligand induction on autokinase activity to the previously mentioned scaling range of $10^{-2}$ to $10^{1}$ of basal activity. The dynamic range was then calculated as the ratio of the maximal promoter activity to the basal promoter activity. We found a region of phosphatase activity in which the detection threshold could be changed without affecting the dynamic range (Supplementary Fig. 1). However, large decreases in phosphatase activity resulted in basal levels of RR~P that were high enough to activate the promoter, resulting in elevated basal transcription and decreased dynamic range. Conversely, large increases in phosphatase activity caused RR~P levels to remain low even at high induction, decreasing the maximal expression from the TCS and therefore the dynamic range.
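The three regimes described above can be illustrated with a short sketch that computes dynamic range as the ratio of induced to basal promoter output. The Hill parameters and the RR~P levels assigned to each regime are hypothetical values chosen for illustration only.

```python
def promoter_output(rr_p, b=0.05, a=1.0, K=1.0, n=2.0):
    """Leaky Hill function for an activatable promoter (assumed parameters)."""
    return b + a * rr_p ** n / (K ** n + rr_p ** n)


def dynamic_range(rr_p_basal, rr_p_induced):
    """Ratio of promoter output at maximal induction to output at basal level."""
    return promoter_output(rr_p_induced) / promoter_output(rr_p_basal)


# Hypothetical RR~P levels (basal, induced) for three phosphatase regimes.
intermediate = dynamic_range(0.01, 10.0)      # basal RR~P low, induced RR~P high
low_phosphatase = dynamic_range(1.0, 10.0)    # basal RR~P already activates the promoter
high_phosphatase = dynamic_range(0.01, 0.1)   # RR~P stays low even when induced

print(f"intermediate phosphatase: {intermediate:.2f}")
print(f"low phosphatase:          {low_phosphatase:.2f}")
print(f"high phosphatase:         {high_phosphatase:.2f}")
```

Under these assumed values the intermediate regime gives the largest ratio, while both extremes compress it, either by raising the basal output or by lowering the induced output.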

Analysis of the effect of kinase activity on detection threshold
We also examined the effect that changes in kinase activity could have on the detection threshold of a TCS. Here, rather than increasing SK autophosphorylation ($k_k$) as before (Fig. 1b; Supplementary Fig. 1), we simulated the effect of inducer by decreasing SK autodephosphorylation ($k_{-k}$) from $10^{3}$ to $10^{0}$. With this change, we found that kinase activity controlled the detection threshold to a similar degree as phosphatase activity, but in the opposite direction: decreases in kinase activity increased the detection threshold, and increases in kinase activity decreased it (Supplementary Fig. 1). This finding suggests that either phosphatase activity or kinase activity could serve as a tuning knob for detection threshold. In this work, we focus on phosphatase activity for decreasing detection threshold, since weakening phosphatase activity is easier to achieve than strengthening kinase activity.

Supplementary Note 2. Predictive model of soil nitrate concentrations.
To predict the concentration of nitrate in soil, we collected nitrate transfer functions in soil on three separate days (Supplementary Fig. 12). We observed significant day-to-day variability in the high and low values of the cultures, and therefore normalized each day's data to the high and low values for that day. Data from all days were fit with a Hill function to obtain best-fit parameters (Methods).
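The per-day normalization, together with the Hill-function inversion and 2-fold accuracy criterion that the prediction step relies on, can be sketched as follows. The Hill parameterization, function names, and all numbers are assumptions for illustration; the form actually fitted is the one given in Methods.

```python
def normalize(values, low, high):
    """Rescale one day's measurements so that low -> 0 and high -> 1."""
    return [(v - low) / (high - low) for v in values]


def hill(x, b, a, K, n):
    """Assumed Hill form: output rises from b to a + b as x increases."""
    return b + a * x ** n / (K ** n + x ** n)


def inverse_hill(y, b, a, K, n):
    """Solve hill(x) = y for x; only defined for outputs strictly between b and a + b."""
    if not b < y < a + b:
        raise ValueError("output lies outside the invertible range of the Hill function")
    return K * ((y - b) / (a + b - y)) ** (1.0 / n)


def within_twofold(predicted, reference):
    """True if the prediction is within 2-fold of the reference value."""
    return reference / 2 <= predicted <= reference * 2


# Hypothetical fit parameters and a round trip through the transfer function.
params = dict(b=0.02, a=0.95, K=2.0, n=1.5)
signal = hill(3.0, **params)
recovered = inverse_hill(signal, **params)

print(normalize([2.0, 6.0, 10.0], low=2.0, high=10.0))
print(f"signal = {signal:.4f}, recovered input = {recovered:.4f}")
print(within_twofold(recovered, 3.0))
```

The guard in `inverse_hill` reflects that measurements at or beyond the fitted low and high plateaus cannot be mapped back to a finite concentration.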
To predict the nitrate concentration in fertilized soil based on GFP measurements, we inverted the Hill function to obtain the following equation:
Fertilizer transfer functions in soil were taken and GFP levels were measured (Supplementary Fig. 12e). The parameterized inverted Hill functions of the NarX and NarX(C415R) TCSs were used to predict nitrate concentrations as shown in Fig. 6b. Data points with predicted nitrate concentrations outside of the range of the y-axis were not included in this plot. Accurate detection ranges of the two TCSs were defined as the range in which the predicted nitrate values were within 2-fold of the manufacturer-supplied nitrate value.

(a) The relationship between the detection threshold and dynamic range of the transfer functions simulated in (Fig. 1b)

Table 4) are integrated into the chromosome. The genetic systems are first constructed as linear double-stranded DNA fragments that we call Integration Modules (IMs) prior to integration (Methods). We utilize the naming convention "iABxxx" where "i" indicates that the construct is an IM, "AB" indicates the first and last initial of the individual who designed the IM, and "xxx" is a unique numerical identifier.

(a) We fused the REC domain of NarL to the DBD of YdfI to connect the E. coli NarXL TCS to transcription from the B. subtilis PydfJ115 promoter. We selected the YdfI DBD due to the high degree of homology of YdfI to NarL and because YdfI regulates a single promoter, PydfJ, in B. subtilis 4, thus reducing the likelihood of unwanted fan-out cross-regulation. PydfJ is known to be regulated only by YdfI, and thus does not naturally respond to nitrate. The host YdfHI TCS, whose natural ligand is unknown, was knocked out using the IM in Panel D to prevent its regulation of the PydfJ115 promoter. (b) An alignment of the linker region of the NarL and YdfI proteins. The residue at which the proteins were joined is shown with a triangle. Identical and similar residues are highlighted with black and gray, respectively. (c) The NarL-YdfI regulated PydfJ promoter. YdfI operator sites are shown with an arrow and the -35 and -10 sequences are highlighted with gray.

Supplementary Figure 3. Optimization of SK and RR expression levels for B. subtilis nitrate sensing TCSs.
(a) We constructed iND138 wherein NarL-YdfI (Supplementary Fig. 2) is expressed from an IPTG-inducible promoter and sfGFP is expressed from the NarL-YdfI~P activated PydfJ115 output promoter (Supplementary Fig. 2). iND138 is integrated into the amyE locus. (b) NarX, NarX(C415R), and NarX(D558V) are expressed from a xylose-inducible promoter in iND27, iND71, and iND72, respectively. Each of these IMs is integrated into the ganA locus of a strain also containing iND138, resulting in a complete TCS. (c) We induced the expression of NarL-YdfI and each NarX variant to different extents in two dimensions in the presence and absence of nitrate, measured the resulting sfGFP output, and calculated the dynamic range, defined as the ratio of sfGFP fluorescence in the presence of nitrate to that in its absence. To compare detection thresholds of the wild-type and mutant sensors (Fig. 2), we selected a single set of induction conditions that resulted in a large dynamic range for all three sensors (10 µM IPTG, 1% xylose; brown boxes).

Supplementary Figure 4. Engineering the iso-SK expression strain.
For the iso-SK experiment, we aimed to utilize the IPTG and xylose induction modules (Supplementary Fig. 3) to control NarX and NarX(C415R) expression, respectively. However, because we lacked other inducible promoter systems, we first replaced the IPTG-inducible NarL-YdfI promoter (Supplementary Fig. 3) with a constitutive version. (a) We utilized xylose-inducible NarX(C415R) and differentially applied nitrate in the media to screen five constitutive B. subtilis promoters (PliaG, PyqxD, PlepA, Pveg, and PrpsD) for the ability to drive appropriate levels of NarL-YdfI expression. Note that the translation rate of RBS3-sfGFP (Supplementary Table 2) is weaker than the translation rate of RBS4-sfGFP2 since this construct was made prior to optimization of sfgfp translation, resulting in lower GFP levels when compared to other data in this paper. (b) NarX(C415R)/NarL-YdfI performance at different SK and RR expression levels. We selected PliaG, which gave the highest fold change, for the iso-SK experiment.

Supplementary Figure 5. Performance of the iso-SK expression strain.
(a) The iso-SK strain device schematics. (b) The relationship between the detection threshold and dynamic range of the iso-SK strain at different expression levels. Points represent the best fit value of the parameter and error bars the 95% confidence interval of the fit parameter.

Supplementary Figure 6. Quantification of NarX and NarX(C415R) expression levels for the iso-SK experiment.
(a, c) To quantify the relationship between IPTG and NarX expression and xylose and NarX(C415R) expression, we fused sfGFP to the C-terminus of each SK in separate test strains.

(a) Device schematic of NarX double mutant TCS system. (b) TCS output and fold change at different IPTG and xylose concentrations. There is little observable effect of SK induction on RR function with or without nitrate, suggesting that signaling has been abolished due to the combined effects of two phosphatase-reducing mutations. Note that iND48 contains the weakly translating RBS3-sfGFP (Supplementary Table 2) as in Supplementary Fig. 4, resulting in lower GFP levels when compared to other data in this paper.

Supplementary Figure 9. The phosphatase hot spot residue is present in 64% of SKs.
(a) Development of the Hidden Markov Model (HMM) (Fig. 5a) of the conserved CA domain G2 box region containing the GXGXG motif (Methods). An alignment of 12 SKs from diverse sub-families that contain the G2 box (redrawn from Wolanin et al. 6). The GXGXG motif is underlined and the phosphatase hot spot residue within the motif is indicated (*). The bold white "G" on black background is conserved in all 12 SKs. Boxes indicate positions with greater than 70% conservation of similar residues across all 12 SKs. Bold residues are similar to 70% of other residues in the same column. (b) We identified 56,855 non-redundant SKs from genomes in the NCBI RefSeq database (Methods). Using the G2 box HMM, we determined that 38,966 of these SKs contain the G2 box region. Then, we eliminated those SKs wherein the G2 box region was outside the kinase core, which is composed of the DhP and CA domains, or that lacked glycines at either of the final two conserved GXGXG positions. This restriction yielded 36,508 SKs (64% of non-redundant SKs) that contain the phosphatase hot spot residue. (c) The distribution of amino acids found at the phosphatase hot spot residue.

Supplementary Figure 10. Detailed characterization of TtrS phosphatase hot spot mutants.
(a) Device schematic of E. coli TtrSR system used in this work. The mCherry transcriptional unit has no function in this work. (b) Tetrathionate response of TtrSR with all 20 amino acids at the TtrS phosphatase hot spot. We define functional mutants (*) as those with fold activation > 2. Circles and bars are as described in Supplementary Fig. 7. (c) Relationship between hydropathy 7 of the TtrS phosphatase hot spot residue and the tetrathionate response of the TCS. (d) Tetrathionate transfer functions of the ten L627 mutants with the largest fold activation. Data points, error bars, and model fits are as described in Supplementary Fig. 6.