Mechanical slowing-down of cytoplasmic diffusion allows in vivo counting of proteins in individual cells

Many key regulatory proteins in bacteria are present in too low numbers to be detected with conventional methods, which poses a particular challenge for single-cell analyses because such proteins can contribute greatly to phenotypic heterogeneity. Here we develop a microfluidics-based platform that enables single-molecule counting of low-abundance proteins by mechanically slowing-down their diffusion within the cytoplasm of live Escherichia coli (E. coli) cells. Our technique also allows for automated microscopy at high throughput with minimal perturbation to native physiology, as well as viable enrichment/retrieval. We illustrate the method by analysing the control of the master regulator of the E. coli stress response, RpoS, by its adapter protein, SprE (RssB). Quantification of SprE numbers shows that though SprE is necessary for RpoS degradation, it is expressed at levels as low as 3–4 molecules per average cell cycle, and fluctuations in SprE are approximately Poisson distributed during exponential phase with no sign of bursting.


Supplementary
| Distributions at OD600 ~0.15 using the growth conditions of Fig. 3d, for SprE tagged with a single mNeonGreen (SprE-mNG) vs. a tandem dimer (SprE-2xmNG). The measured distributions are shown overlaid with Poisson distributions of the same averages. SprE-2xmNG yields a ~30% higher count, as expected given the imperfect maturation yield of fluorescent proteins (the maturation efficiency of mNeonGreen appears to be approximately 80%). Both distributions, shown here without size conditioning, still closely follow Poisson.
Figure 10 | Linear dependence between total fluorescence intensity per cell and SprE concentration can allow for estimation of protein abundances at the population level for higher copy numbers, where spot overlap becomes a significant issue. The value for zero SprE copies was obtained from the WT background strain with the segmentation marker. Black circles are averages. Blue and red bars are standard deviation and standard error of the mean, respectively. The red line is a linear fit.
Figure 11 | Western blot analysis shows that C-terminally tagged SprE fusion proteins behave identically to untagged, wild-type SprE and deliver the stress-response sigma factor RpoS to the protease ClpXP for degradation. RpoS accumulates significantly in the absence of SprE. Western blot samples were prepared from E. coli cells in late exponential phase (OD600 ~0.7). The monoclonal anti-RpoS antibody cross-reacts with several non-specific bands (indicated by gray arrows on the right), including a band at around 45 kDa (indicated by white asterisk) that runs only slightly above RpoS. The molecular weight of RpoS is 38 kDa but it runs at a higher apparent molecular weight slightly below 45 kDa. The RpoS protein is clearly detectable in the wild-type strain MC4100 (lane 1) but absent in the ΔrpoS strain (lane 2). The RpoS level is strongly increased in the ΔsprE deletion strain (lane 3), confirming SprE's essential role in controlling the RpoS protein level.
The RpoS protein level is similar to the wild-type level in the SprE-3xFLAG tag (lane 4) and the SprE-Venus YFP strains (lanes 5 and 6), showing that the behavior of untagged and tagged SprE is indistinguishable. SprE-Venus is derived from SprE-Venus-FRT Kan by eliminating the Kan resistance marker with pCP20.
Normalized distribution considering only cells between 4 and 6 microns in length. This leads to higher average SprE numbers because SprE numbers scale with cell size, so longer cells tend to have more SprE molecules. In this case, the overlaid Poisson distribution closely resembles the measured distribution, which consistently exhibits a Fano factor of approximately 1. (d) Cell length distribution for all cells. Inset shows the range chosen (marked by red crosses) for the length conditioning in (c).
Figure 13 | Simulated distributions that are representative of various data points along the growth curve. Left, center and right panels correspond to OD600 = 0.04, t = 240 min; OD600 = 1.2, t = 495 min and OD600 = 1.88, t = 765 min, respectively. (a) Distributions are generated for 10,000 cells using numerical simulations (Online Methods and Supplementary Fig. 6). The black curves show the sampled Poisson distribution with the input mean number according to the experimentally detected average SprE count. The histograms for the detected number of spots according to the 250-nm resolution limit and the corresponding experimental cell geometries are shown as red curves. The blue curves represent a Poisson distribution with a mean of the detected numbers for comparison. The table summarizes the results of the simulations. (b) 1,000 images are generated for each condition using identical parameters to those of the numerical simulations (Online Methods and Supplementary Fig. 7 and 8). The simulated images were then analyzed with the spot-finding algorithm.
Figure 14 | Histograms of the average fluorescence intensity of E. coli cells with the RpoS750-Venus at an OD600 of 1.2 (red) and an OD600 of 1.9 (blue), representing the SprE minimum and early stationary phase, respectively. Essentially no cell at an OD600 of 1.2 exhibits an intensity comparable to that of the stationary-phase cells.
Figure 15 | SprE dynamics upon exit from stationary phase. Cells were grown to deep-stationary phase for 5 days in M9 + 10% LB. The cells were then diluted into fresh media and monitored over time for (a) mNeonGreen signal and (b) cell length (open shapes and shaded grey area represent averages and standard deviation, respectively). Steadiness in total intensity for SprE-mNeonGreen until ~100 min suggests that SprE is stable and is primarily diluted by cell division. By ~180 min the SprE levels return to those of the balanced-growth regime (marked by the green dashed line), as also suggested by the SprE counts (data not shown). At this point, the average cell length is also representative of balanced growth.
Figure 16 | Comparison of chemical fixation (left) and MACS (right) for the same cell culture. Half the volume of a growing culture was used for fixation and the other half for MACS. Top and bottom panels display the CFP segmentation marker and the mNG tandem dimer tagged SprE, respectively. Chemical fixation results in a significantly reduced number of observed spots even for SprE tagged with tandem mNeonGreen. Cells shown here were fixed using paraformaldehyde, following the protocol described in Kuhlman et al. 2 . Similar results were obtained using SprE-mNG as well as formaldehyde with various fixation times. Fixed cells were imaged on an agar pad, and imaging conditions were otherwise identical. Scale bar (white) is 2 μm.
We directly send the sample from the pressure tube (PT) into MACS by setting the 3W-Valve to (2). To be able to rapidly empty the PT into the waste when rinsing or priming for the next sampling, we set the 3W-Valve to (1).
The entire system is enclosed within a temperature-controlled incubator kept at 37 °C. Right panel: After each sampling, we run a cleaning routine by rinsing the PT and the MACS chip sequentially using 10% bleach, 10% ethanol, and water. The right panel shows the efficiency of the cleaning routine. When the PT is initially filled with a CFP-expressing strain, cells are only detected in the CFP channel, as expected. When we then run the cleaning routine and fill the PT with a YFP-expressing strain, cells are only detected in the YFP channel. A single representative snapshot is shown here, but essentially no CFP-expressing cells were detected in >100 frames, which indicates that there is no carryover from the previous sampling of the CFP-expressing cells. Scale bar (white) is 5 μm.
coli cell (inset) can be approximated with a 2D Gaussian (red line) with a root-mean-squared size (σ) of 1.45 pixels (~93 nm for the magnification used on our system), which yields a resolution limit 3 of rxy = 2.4σ ~ 225 nm.

Supplementary Note 1:
The simulations were performed with variables representing the confinement volume (i.e. cell geometry), diffusion coefficient of molecules, photophysical properties of the fluorescent tags, diffraction-limited imaging and EMCCD specifications. For simulating single-molecule images acquired on agar pads, the confinement volume was represented as a cylinder that has a circular cross section with a diameter equal to the average diameter of an E. coli cell. This diameter was measured using the fluorescence images of the cells expressing cytoplasmic CFP as a cell marker. In the case of MACS, the pressing causes cell flattening, which we assume results in cells with an elliptical cross section. Therefore, single-molecule images acquired with MACS were simulated with a confinement volume modeled as a cylinder with an elliptical cross section.
The major axis of the elliptical cross section was calculated from images of cells expressing cytoplasmic CFP. The short axis was estimated under the assumption that pressing does not change the circumference of the cell when the cross section transforms from circular to elliptical, which typically corresponds to an estimated ~50% decrease in cell height. Diffusive traces of single molecules were created using a random walk algorithm 4 with the molecular diffusion constant estimated from FRAP measurements (D ~0.1 µm² per s). The diffuser emits photons along the entire trajectory. The number of photons per emitter was sampled from a Poisson distribution with a mean of 1000 photons per molecule. The emitted photons fall on the EMCCD pixels with a spread given by the point-spread function of the diffraction-limited imaging optics. The point-spread function is approximated as a two-dimensional symmetric Gaussian with σ = 0.22 λ/NA (~80 nm for light with a wavelength of λ ~520 nm and an objective lens with a numerical aperture of NA = 1.45) on our setup 3 . The detection noise is modeled using a Poisson distribution. The photons were converted to electrons according to the quantum efficiency and the EM gain of the EMCCD camera. The electrons are converted to counts according to the specified inverse system gain of the EMCCD. Finally, background noise originating from dark current and cellular autofluorescence was added to each pixel. Dark current, background noise and gain noise parameters were estimated from the EMCCD specifications. Cellular autofluorescence signal counts were estimated from control cells with wild-type (untagged) SprE. Simulated images were then analyzed according to the above-mentioned spot-finding algorithm.
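The core of the image simulation (a diffusing emitter whose photons are blurred by a Gaussian PSF and binned into camera pixels) can be illustrated with a minimal Python sketch. This is not the code used in this study. The exposure time, number of random-walk steps, pixel size and image size are assumptions chosen for illustration, the photon number is held fixed instead of being Poisson-sampled, and confinement by the cell, EM gain and background noise are omitted.

```python
import math
import random


def psf_sigma_nm(wavelength_nm=520.0, numerical_aperture=1.45):
    """Gaussian PSF width from the relation sigma = 0.22 * lambda / NA."""
    return 0.22 * wavelength_nm / numerical_aperture


def simulate_single_molecule_image(
    d_um2_per_s=0.1,   # diffusion constant, as estimated from FRAP
    exposure_s=0.03,   # ASSUMED exposure time (not stated in the text)
    n_steps=100,       # ASSUMED number of random-walk steps per exposure
    n_photons=1000,    # photon budget; fixed here, Poisson-sampled in the text
    pixel_nm=64.0,     # ASSUMED pixel size in the sample plane
    n_pixels=32,       # ASSUMED image width and height in pixels
    seed=0,
):
    """Return a 2D list of photon counts for one freely diffusing emitter."""
    rng = random.Random(seed)
    sigma = psf_sigma_nm()
    # Per-axis RMS displacement of one time step, converted from um to nm.
    step_nm = math.sqrt(2.0 * d_um2_per_s * exposure_s / n_steps) * 1000.0
    x = y = 0.5 * n_pixels * pixel_nm  # start at the image center
    image = [[0] * n_pixels for _ in range(n_pixels)]
    for _step in range(n_steps):
        # Random-walk update of the emitter position.
        x += rng.gauss(0.0, step_nm)
        y += rng.gauss(0.0, step_nm)
        # Photons emitted during this step land around the current position,
        # spread by the Gaussian PSF, and are binned into pixels.
        for _photon in range(n_photons // n_steps):
            col = int((x + rng.gauss(0.0, sigma)) // pixel_nm)
            row = int((y + rng.gauss(0.0, sigma)) // pixel_nm)
            if 0 <= row < n_pixels and 0 <= col < n_pixels:
                image[row][col] += 1
    return image
```

Increasing `d_um2_per_s` in this sketch smears the photons over more pixels, which is the effect that makes fast-diffusing molecules hard to detect as distinct spots.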

Supplementary Note 2:
A previously published theoretical model predicts that the cells are critically crowded, i.e. the protein densities that are optimal for the reaction rates happen to be close to the actual protein densities in the cells 5 . Simply put, the cytoplasm may have just enough water to support diffusion 6 , perhaps to ensure high effective concentrations and thereby higher rates of bimolecular reactions; eliminating some water, ever so slightly, causes cells to transition into a regime where macromolecules barely diffuse at all. This prediction is consistent with our experimental observation that single molecules become visible when we use MACS to press on the cells and slow down the diffusion of cytoplasmic proteins.

Supplementary Note 3:
In conventional fluorescence microscopy, an individual point source like a single mNeonGreen molecule emits light at ~517 nm 7 , which is spread by diffraction, resulting in an intensity distribution on the detector of the camera chip. This intensity distribution is known as the point-spread function (PSF) and has a width that is over two orders of magnitude larger than the nanometer-sized fluorescent protein. Molecules that are in close spatial proximity cannot be individually resolved if their PSFs overlap substantially, which would result in protein undercounting. This resolution limit, based on the Rayleigh criterion, is defined as $r_{xy} = 0.61\,\lambda_{em}/\mathrm{NA}$, where λem is the emission wavelength of the fluorophore and NA is the numerical aperture of the microscope objective. For mNeonGreen and a 100× objective (NA 1.45), rxy is about 217 nm, which is in good agreement with the PSF measurement that we performed on our microscope setup (Supplementary Fig. 19). As a conservative estimate, we used rxy = 250 nm for the computer simulations.
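As a quick arithmetic check, the following Python sketch evaluates the resolution limit for mNeonGreen, assuming the standard Rayleigh form rxy = 0.61 λem/NA (the function name is illustrative). With λem = 517 nm and NA = 1.45 it gives roughly 217.5 nm, in line with the value of about 217 nm quoted above and below the 250-nm limit used for the simulated distributions in Supplementary Fig. 13.

```python
def rayleigh_limit_nm(emission_nm, numerical_aperture):
    """Rayleigh resolution limit, assuming r_xy = 0.61 * lambda_em / NA."""
    return 0.61 * emission_nm / numerical_aperture


# mNeonGreen emission (~517 nm) imaged with an NA 1.45 objective.
r_xy = rayleigh_limit_nm(517.0, 1.45)

# Margin between the theoretical limit and the 250-nm simulation limit.
margin_nm = 250.0 - r_xy
```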
In total, P molecules are drawn from a Poisson distribution with mean λ = 〈P〉 and placed into a three-dimensional cell with fixed cell length, cell width, and cell height (Supplementary Table 1 and Supplementary Fig. 6 a and b). The exact x, y, and z position of each individual molecule is determined by drawing three random numbers (x, y, z) from continuous uniform distributions. For the molecule position along the cell length coordinate, x is drawn from a continuous uniform distribution with minimum and maximum values of xa and xb (e.g. 0 and 5,500 nm). For the position along the cell width and cell height coordinates, y and z are drawn from continuous uniform distributions with minimum and maximum values of ya and yb (e.g. 0 and 2,000 nm) and of za and zb (e.g. 0 and 880 nm), respectively. The algorithm then tests whether the molecule falls within an ellipse with a semi-major axis of $a = (y_b - y_a)/2$ (e.g. 1,000 nm) and a semi-minor axis of $b = (z_b - z_a)/2$ (e.g. 440 nm) (Supplementary Fig. 6b). The condition C for this test is $C = \left(\frac{y - y_0}{a}\right)^2 + \left(\frac{z - z_0}{b}\right)^2$, where $(y_0, z_0)$ is the center of the cross section, and new random numbers are drawn for x, y, and z if $C > 1$ (i.e. the molecule falls outside the cell volume if $C > 1$). If the cross section of the cell is a circle instead of an ellipse, cell width equals cell height and C can be simplified to $C = \left[(y - y_0)^2 + (z - z_0)^2\right]/a^2$.
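The placement step amounts to rejection sampling, which can be sketched in Python as follows. This is an illustrative sketch, not the code used in this study. It assumes that the acceptance test is the standard ellipse inequality centered on the cross section; the function names are hypothetical, the default dimensions are the example values given above, and the Poisson sampler uses Knuth's method, which is adequate for small means.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson-distributed integer (Knuth's method, for small means)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def place_molecules(mean_count, length_nm=5500.0, width_nm=2000.0,
                    height_nm=880.0, seed=None):
    """Place a Poisson number of molecules uniformly in an elliptical cylinder."""
    rng = random.Random(seed)
    a = width_nm / 2.0   # semi-major axis; also the y coordinate of the center
    b = height_nm / 2.0  # semi-minor axis; also the z coordinate of the center
    n_molecules = sample_poisson(mean_count, rng)
    positions = []
    while len(positions) < n_molecules:
        x = rng.uniform(0.0, length_nm)
        y = rng.uniform(0.0, width_nm)
        z = rng.uniform(0.0, height_nm)
        # Assumed acceptance test: keep the draw only if it lies inside
        # the elliptical cross section, otherwise redraw x, y and z.
        c = ((y - a) / a) ** 2 + ((z - b) / b) ** 2
        if c <= 1.0:
            positions.append((x, y, z))
    return positions
```

For example, `place_molecules(4.0, seed=1)` returns a list of (x, y, z) tuples in nm for one simulated cell with an average of four molecules.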
The height of an average E. coli cell is larger than the depth of field (DOF) of a typical high-resolution fluorescence microscope, including our setup. Hence not all molecules are captured within a single focal plane, which could result in protein undercounting. When we press on the cells using MACS, the cell height is substantially reduced and all fluorescent molecules appear within a single focal plane. We hence limit the spatial overlap analysis to two dimensions (i.e. the cell length x vs. cell width y plane). The algorithm calculates for all molecules in a given cell the pair-wise Euclidean distances between the molecules using the following formula: $d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$ (Supplementary Fig. 6c). This distance matrix is converted into a logical matrix, which we call the cluster matrix (Supplementary Fig. 6d). Pairs of molecules that are less than rxy apart from each other are assigned the value 1 (i.e. spatial overlap) whereas molecules that are more than rxy apart are assigned a 0 (i.e. spatially resolved). Every row of the distance matrix is then assigned a flag that represents the 'status' of the molecule with values of either 1 (i.e. active) or 0 (i.e. inactive). After placement of the molecules, all rows have a status of 1 (active). The algorithm then starts with row i = 1, which corresponds to molecule i = 1, and searches for first-degree nearby neighbors (i.e. molecules that are less than rxy away from molecule i = 1). If molecule j is a first-degree nearby neighbor of molecule i = 1, then the corresponding i,j-entry in the cluster matrix would be 1 (see next paragraph). If no first-degree nearby neighbors of molecule i were found, the 'status' flag of molecule i is set to 0 (inactive) and the code then proceeds with the next row (e.g. i = 2). Rows that have a status of 0 (inactive) are skipped by the algorithm (e.g. row 6; see below). This is necessary to prevent assigning and counting the same molecule multiple times.
When the algorithm encounters a row i, e.g. i = 5, that has a first-degree nearby neighbor j, e.g. j = 6, with an i,j-entry of 1 in the cluster matrix (Supplementary Fig. 6d), the algorithm will also search for potential higher-degree nearby neighbor(s) and neighbors of neighbors to connect elongated molecule clusters. The 'status' flag of molecules that are assigned to a cluster is set to 0 (inactive). Using nested for loops, the algorithm checks for up to the (P-1)th-degree nearby neighbors (e.g. if P = 10, the code searches for the first- and potentially up to the ninth-degree nearby neighbors) given that a lower-degree nearby neighbor exists. This strategy ensures that all connected molecules are found and are correctly assigned to the respective clusters. For example, molecule i = 7 has a first-degree nearby neighbor (i.e. molecule 10) and a second-degree nearby neighbor (i.e. molecule 8, which is indirectly linked to 7 via 10). Finally, the number of clusters is determined and added to the number of molecules that are not part of a cluster to determine the total number of detected molecules. For the example from above (Supplementary Fig. 6a,b), the cell with P = 12 molecules has 7 spatially resolved molecules and two clusters with 2 and 3 molecules, respectively (Supplementary Fig. 6e). The total number of detected molecules is 9. Three molecules were missed because of the width of the PSF and the resulting spatial clustering.
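The counting procedure can be sketched in Python as follows. This is an illustrative sketch rather than the code used in this study: instead of nested for loops over neighbor degrees, it follows neighbors of neighbors with a stack-based search over the cluster matrix, which should assign molecules to the same clusters. The function name, the 250-nm default for rxy and the example coordinates are assumptions made for illustration.

```python
import math


def count_detected_spots(positions_xy, r_xy_nm=250.0):
    """Count resolvable spots: isolated molecules plus connected clusters."""
    n = len(positions_xy)
    # Cluster matrix: True where two different molecules are closer than r_xy.
    overlap = [
        [i != j and math.dist(positions_xy[i], positions_xy[j]) < r_xy_nm
         for j in range(n)]
        for i in range(n)
    ]
    active = [True] * n  # 'status' flag of each molecule
    spots = 0
    for i in range(n):
        if not active[i]:
            continue  # already assigned to a cluster
        active[i] = False
        stack = [i]
        while stack:
            # Follow first-degree and all higher-degree nearby neighbors.
            k = stack.pop()
            for j in range(n):
                if active[j] and overlap[k][j]:
                    active[j] = False
                    stack.append(j)
        spots += 1  # one spot per isolated molecule or per cluster
    return spots


# Hypothetical cell with 12 molecules (coordinates in nm): 7 isolated
# molecules, one pair, and one chain of 3 linked via the middle molecule.
example_cell = (
    [(1000.0 * k, 0.0) for k in range(7)]
    + [(0.0, 1000.0), (100.0, 1000.0)]
    + [(3000.0, 1000.0), (3200.0, 1000.0), (3400.0, 1000.0)]
)
```

For `example_cell`, `count_detected_spots(example_cell)` returns 9 (7 isolated molecules plus 2 clusters), so three of the twelve molecules are missed, mirroring the worked example above.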

Supplementary Note 4:
Construction of E. coli strains, plasmids and primers used in this study.

E. coli strain construction
All E. coli strains that were used in this study are listed in the Supplementary Information.
Strain BO56 was built by tagging the SprE gene with 2x mNeonGreen (also referred to as tandem mNeonGreen (tdNG)) fused to the AP tag. The SprE gene was modified at the endogenous chromosomal locus using the λRed method. Plasmid pDML199 was used as the PCR template for amplifying the integration cassette with 300-bp upstream and downstream homologies (to increase transformation and targeting efficiency). The 300-bp upstream and downstream homology regions were PCR amplified from genomic DNA (MC4100) using primers DML_P674_F1 and DML_P675_R1 and primers DML_P676_F2 and DML_P677_R2, respectively, and PCR stitched to the integration cassette. Correct chromosomal integration was confirmed with colony PCR using primers DHL_P80_F and DML_P682_R. The FRT-flanked Kan marker was excised using pCP20, followed by removal of the plasmid by growing the cells at the non-permissive temperature in rich media. The galK::Plac-CFP-Amp allele was then P1 transduced from Wc.
Strain DHL222 was built by integrating the RpoS750-Venus degradation reporter into the phoA locus using the λRed method following a standard protocol 8 . The PrpoS-rpoS750-Venus-T1 terminator-FRT Kan FRT integration cassette was PCR amplified from pDHL39 with primers DHL_P93_F and DHL_P94_R. The upstream flank of the integration site was PCR verified with primers DHL_P101_F and DHL_P46_R; and the downstream flank of the integration site was verified with primers DHL_P47_F and DHL_P104_R.
Strain DHL307 was built by PCR amplifying the 3xFLAG-FRT Kan FRT cassette from pDHL229 with primers DHL_P107_F and DHL_P108_R. The purified integration cassette was integrated into strain DHL193 as previously described 8 . Tagging of sprE with the 3xFLAG tag was PCR verified with primers DHL_P120_F and DHL_P121_R.
Strain DHL331 is identical to strain DHL307 except that the FRT-flanked Kan marker was removed with pCP20.
Strain DHL394 was constructed by PCR amplifying the Venus-T1 terminator-FRT Kan FRT cassette from pDHL146 with primers DHL_P168_F and DHL_P169_R and integrating the cassette into strain DHL193 as previously described 8 . The upstream flank of the integration scar was PCR verified with primers DHL_P120_F and DHL_P79_R, whereas the downstream flank of the integration scar was verified with primers DHL_P80_F and DHL_P81_R.
Strain DHL812 is identical to strain DHL399 except that the FRT-flanked Kan marker was excised using pCP20.
Strain DHL934 was built by transforming plasmid pDHL917 into MC4100.
Strain GL15 was built by P1 transducing the galK::Plac-CFP-Amp allele from Wc into strain DHL222 after the FRT-flanked Kan marker was removed from DHL222 with pCP20.
Strain GL19 was built by P1 transducing the galK::Plac-CFP-Amp allele from the Wc strain into MC4100.
Strain GL45 was built by tagging the SprE gene with mNeonGreen at the native locus using the λRed method (see above). The integration cassette was PCR amplified using plasmid pDML22 as a template and primers DHL_P168_F and DHL_P169_R. The FRT-flanked Kan marker was removed with pCP20, and then the galK::Plac-CFP-Amp allele was P1 transduced from Wc into this strain.
Strain GL64 was built by first P1 transducing the rpoS::Kan allele from JW5431 (KEIO collection) into strain GL45. Second, the FRT-flanked Kan marker was removed with pCP20, followed by P1 transduction of the galK::Plac-CFP-Amp allele. The P1 lysate was obtained from the Wc strain.

Plasmid construction
All plasmids that were used in this study are listed in Supplementary Table 3.
The plasmid construction was performed using traditional 'sticky-end' cloning or isothermal assembly 9 . Analytical restriction digests or PCR were used to verify the plasmid construction. DNA sequencing was used to verify all cloning steps that involved PCR amplification. Phusion (Finnzymes), Vent (NEB), and Q5 (NEB) polymerases were used for the PCR reactions. The restriction enzymes were purchased from NEB and used according to the instructions provided by the manufacturer.
pDHL16 was built by amplifying the full-length rpoS promoter (PrpoS) and the first 750 base pairs of the rpoS open-reading frame (rpoS750) from genomic DNA (MC4100) with primers DHL_P13_F and DHL_P14_R. The full-length rpoS promoter includes the preceding nlpD gene including the nlpD promoter. The PCR product was gel purified, digested with EcoRI and SacI, and then ligated into pUC19, which was cut with the same restriction enzymes and also gel purified.
pDHL17 was built by amplifying Venus and the T1 terminator from pPM1 (courtesy of Per Malkus) with primers DHL_P15_F and DHL_P16_R. The PCR product was digested with SacI and XmaI, gel purified, and ligated into pUC19, which was also digested with SacI and XmaI followed by gel purification.
pDHL23 was built by digesting pDHL17 with SacI and XmaI. The liberated Venus-T1 terminator fragment was then gel purified and subcloned into pDHL16, which was also digested with SacI and XmaI followed by gel purification.
pDHL39 was built by digesting pDHL23 with EcoRI and XmaI. The liberated PrpoS-rpoS750-Venus-T1 terminator fragment was then gel purified and ligated into EcoRI/XmaI-cut pDHL19.
pDHL229 was constructed by PCR amplifying the 3xFLAG tag from plasmid pSUB11 10 with primers DHL_P105_F and DHL_P106_R. The PCR fragment was then cut with SacI and XmaI, gel purified and ligated into pDHL19, which was digested with SacI and XmaI and also gel purified.
pDHL655 was built in two steps. First, mGFPmut3 was amplified from pDHL580 using primers DHL_P21_F and DHL_P315_R. The PCR product was digested with SacI and XmaI, and then gel extracted. The purified PCR product was ligated into pDHL19, which was cut with the same enzymes and also gel extracted. Second, the resulting plasmid was digested with XmaI, gel extracted, and an oligo site, which encodes the 15-amino-acid long acceptor peptide (AP) tag 11 , was inserted into the vector backbone. The oligo site was made by annealing primers DHL_P316_F and DHL_P317_R following a standard protocol (see e.g. https://www.addgene.org/plasmid-protocols/annealed-oligo-cloning/).
pDHL694 was built by PCR amplifying Dronpa from pDHL583 with primers DHL_P255_F and DHL_P348_R. The PCR product was digested with SacI and XmaI, gel purified, and ligated into pDHL655, which was also cut with the restriction enzymes SacI and XmaI and then gel purified.
pDML830 was constructed in two steps. First, Zif268 was PCR amplified from a plasmid template that encodes the Zif268 coding sequence (courtesy of the Silver lab, Harvard Medical School). The PCR product was then digested with BamHI and HindIII, gel extracted, and ligated into pPM16, which was cut with the same enzymes and also gel purified. Second, the resulting plasmid was opened with the restriction enzymes HindIII and XbaI, and gel extracted. Primers DHL397_F and DHL_398_R were used to make an oligo site that encodes the AP tag. The oligo site was then inserted into the opened vector.
pDHL917 was built by first digesting pDHL830 with BamHI and HindIII, which cuts out the DNA sequence that encodes Zif268. The vector was then gel purified. The mEos2 coding sequence was PCR amplified with primers DHL_P485_F and DHL_P486_R using pDHL844 as a template. mEos2 was then inserted into the BamHI/HindIII-digested pDHL830 vector with isothermal assembly.
pDML22 was built by PCR amplifying mNeonGreen with primers DML_P118_F and DML_P119_R from pUC57-Kan-mNeonGreen. The plasmid was a gift from the Lindquist lab (Whitehead Institute at MIT). The PCR product was inserted into vector pDHL580, which was cut with BlpI and XmaI, using isothermal assembly. pDML199 was built by PCR amplifying the first mNeonGreen from plasmid pDML48 using primers DML_P637_F and DML_P638_R and the second mNeonGreen from plasmid pDML152 using primers DML_P635_F and DML_P636_R. The second mNeonGreen is yeast codon-optimized and has hence no local sequence identity with the first mNeonGreen (though the global sequence identity is ~78%). Avoiding sequence homology is important to prevent recombination between the two mNeonGreen parts, which would reduce the tandem mNeonGreen to a single mNeonGreen. The PCR-amplified mNeonGreens were then inserted with isothermal assembly into pDHL694, which was digested with SacI and XmaI and also gel purified. Plasmids pDML48 and pDML152 were generous gifts from the Lindquist lab (Whitehead Institute at MIT, Cambridge, MA, USA).
All plasmids and the corresponding vector maps are available upon request.

Primers
The primers used in this study are listed in Supplementary Table 4. The primers were purchased from Integrated DNA Technologies, Inc. (Coralville, IA, USA).