Structured illumination multimodal 3D-resolved quantitative phase and fluorescence sub-diffraction microscopy

Sub-diffraction resolution imaging has played a pivotal role in biological research by visualizing key, but previously unresolvable, sub-cellular structures. Unfortunately, applications of far-field sub-diffraction resolution are currently divided between fluorescent and coherent-diffraction regimes, and a multimodal sub-diffraction technique that bridges this gap has not yet been demonstrated. Here we report that structured illumination (SI) allows multimodal sub-diffraction imaging of both coherent quantitative-phase (QP) and fluorescence. Due to the conventional fluorescent applications of SI, we first demonstrated the principle of SI-enabled three-dimensional (3D) QP sub-diffraction imaging with calibration microspheres. Image analysis confirmed enhanced lateral and axial resolutions over diffraction-limited QP imaging, and established striking parallels between coherent SI and conventional optical diffraction tomography. We next introduce an optical system utilizing SI to achieve 3D sub-diffraction, multimodal QP/fluorescent visualization of A549 biological cells fluorescently tagged for F-actin. Our results suggest that SI has unique utility in studying biological phenomena with significant molecular, biophysical, and biochemical components.


I. Introduction
Optical microscopy has played a crucial role in advancing frontiers of the biological sciences by allowing high-resolution, non-invasive visualization of important biological samples. Although developments and advances in optical design and manufacturing have made available high-resolution objectives with unprecedented numerical aperture (NA), microscopy faces a fundamental physical diffraction limit that can preclude visualization of important sub-cellular features in biological samples 1,2. In response, several imaging techniques have been developed which allow far-field sub-diffraction resolution imaging using a variety of unique and innovative mechanisms 3,4.
Sub-diffraction imaging techniques introduced thus far operate in two main regimes: 1) imaging via spatially-coherent diffraction, or 2) imaging via spatially-incoherent fluorescence. Synthetic aperture (SA) is a popular choice for imaging in the first regime, and operates by using oblique illuminations to spatiotemporally encode a wider frequency support into the final image than directly allowed by the microscope's physical aperture [5][6][7] . Applications of SA have resulted in both high-resolution imaging, where (Sparrow) resolutions of <100 nm have been achieved 8 , and high-throughput imaging, where gigapixel-scale images with resolutions > 5x over the diffraction limit have been obtained 9,10 .
Sub-diffraction techniques for imaging in the fluorescent regime, often referred to as "super-resolution" techniques, typically require the sample's fluorescent labels to have either photoswitching or depletion capabilities. Photoactivated localization microscopy (PALM) is a prominent example that uses photoswitchable fluorophores to localize individual emitters at sub-diffraction resolutions per acquisition, before combining them into a final super-resolved image 11. Stimulated emission depletion (STED) is another successful example that utilizes point-scanning to directly minimize the size of the scanned focal spot by saturated stimulated emission with two synchronized ultrafast laser sources 12. Such techniques have found tremendous success in biological imaging and have achieved resolutions well below 100 nm.
These two regimes operate on fundamentally different mechanisms, and thus enable distinct and complementary biological observations. Fluorescence imaging is the standard for molecular-specific, background-free, cellular imaging, and has enabled great insights into gene expression, protein interaction, cytoskeletal organization, endocytic dynamics, organelle structures, intracellular transport, cytokinesis, and general intracellular dynamics [13][14][15][16][17][18]. Coherent diffraction imaging is the technique of choice for imaging unstained and minimally prepared cellular samples with endogenous contrast to extract quantitative and biologically relevant parameters. Examples of coherent-diffraction imaging include quantitative-phase (QP) and light-scattering imaging, which can noninvasively probe structural, mechanical, biophysical, and biochemical properties of cells, and have been used for analysis of whole-cell morphology, mass, shear stiffness, refractive-index/optical-path-length (OPL), dispersion spectroscopy, and absorption/scattering [19][20][21][22][23]. Important biomedical applications even include non-invasive and quantitative measurements of hemoglobin oxygen saturation 24 and detection of different cancer types 25. In both fluorescent and coherent-diffraction regimes, biological insight is enhanced with sub-diffraction resolution. Unfortunately, because the two regimes are fundamentally different, a sub-diffraction resolution method applicable to both has been difficult to find. This poses an obstacle in microscopy, as users conventionally must choose between either unimodal sub-diffraction or multimodal diffraction-limited imaging. Such a choice can prevent a synergistic, multimodal analysis of individual sub-cellular components beyond the diffraction limit and can hinder a cohesive understanding of biological morphology and function.
In this work, we demonstrate that structured illumination (SI) microscopy is a sub-diffraction technique compatible with both coherent-diffraction and fluorescent imaging. Indeed, SI is already an established technique which is conventionally associated with fluorescent super-resolution and has become popular due to its speed, cost, and simplicity [13][14][15]. However, SI also has applications in coherent imaging and has shown promise for sub-diffraction quantitative-phase (QP) imaging of diffractive samples [26][27][28]. Because SI enables both fluorescent and coherent-diffraction sub-diffraction imaging, it offers a unique capability for biologists to conduct multimodal studies that probe relationships between the biophysical/biochemical properties of a cell and its molecular properties/processes. In the following sections, we experimentally demonstrate, for the first time to our knowledge, a single optical system that uses SI to generate 3D QP and fluorescence visualizations at sub-diffraction resolutions, with potential applications for future multimodal analysis of sub-diffraction cellular components.

II. Theoretical framework
A. Imaging transfer functions.
Diffraction theory describes the maximum lateral spatial frequencies $k_{\parallel,\mathrm{FL}}$ and $k_{\parallel,\mathrm{QP}}$ and maximum axial spatial frequencies $k_{\perp,\mathrm{FL}}$ and $k_{\perp,\mathrm{QP}}$, respectively, that can be observed through a microscope for both conventional fluorescence and QP microscopies. For the case where QP and fluorescent modalities share the same physical microscope and the light used for QP imaging serves the dual purpose of also being the excitation for fluorescent imaging, these maximum spatial frequencies are given by $k_{\parallel,\mathrm{FL}} = 2\mathrm{NA}/\lambda_{em}$, $k_{\parallel,\mathrm{QP}} = \mathrm{NA}/\lambda_{ex}$, $k_{\perp,\mathrm{FL}} = n(1 - \cos\theta)/\lambda_{em}$, and $k_{\perp,\mathrm{QP}} = n(1 - \cos\theta)/\lambda_{ex}$ 29. Here, $\lambda_{ex}$ and $\lambda_{em}$ are the excitation and emission wavelengths used for QP and fluorescent imaging, respectively, $n$ is the index of refraction of the medium, $\theta$ is the maximum half-angle of light that the detection objective supports, and $\mathrm{NA} = n\sin\theta$ is the detection objective's numerical aperture. In three-dimensional Fourier space, these spatial frequencies set the bounds of observable spatial frequency content that can be passed through the microscope (note that the spatial frequency bounds for QP and fluorescence are only applicable in the electric-field and optical intensity regimes, respectively, so direct comparisons are inappropriate). In accordance with diffraction theory, these regions of observable spatial frequencies define the systems' transfer functions (TF) and take the shape of a spherical cap, mathematically a subsection of Ewald's sphere, in QP imaging and a torus-like structure in fluorescence imaging (Fig. 1) 2,30. Note that $k_{\perp,\mathrm{QP}}$ does not equate to QP imaging's axial resolution: because QP imaging's TF has infinitesimal axial frequency support, the sample's diffracted wave vector with the $k_{\perp,\mathrm{QP}}$ axial component propagates through the whole image volume and offers little optical sectioning 31,32.
The spatial frequencies outside the QP and fluorescent TFs are typically unobservable and are the target for visualization by sub-diffraction resolution imaging. SI achieves this by aliasing information into the TF by illuminating the sample with a spatially modulated illumination pattern. Because this aliasing effect happens naturally when two spatial patterns overlap regardless of fluorescent or diffractive imaging, SI can serve as a generalized platform for multimodal sub-diffraction imaging, up to a resolution gain factor of 2. Nonlinear SI, which uses fluorescent nonlinearities to achieve resolution gains of >2, is not considered here 33. In this work, we demonstrate that SI allows 3D sub-diffraction multimodal QP and fluorescent imaging. We refer the reader to the work by Gustafsson et al. 30, which beautifully illustrates the concept behind SI for 3D fluorescent imaging; in the next section, we introduce an analogous approach for QP imaging.
Fig. 1. (a,b) Cross-sectional plot and 3D rendering demonstrate that the transfer function in QP imaging forms a spherical cap (commonly referred to as Ewald's Sphere). Dimensional parameters of this cap are determined by wavelength, microscope objective NA, and immersion oil refractive index. Autocorrelation of this cap corresponds to the transfer function for fluorescence imaging. (c,d) Cross-sectional plot and 3D render show this region to have a filled torus-like shape. Note that direct comparisons between (a) and (c) are inappropriate because spatial frequencies in QP and fluorescence imaging are measured in the electric-field and intensity regimes.

B. Principle of SI-enabled 3D QP visualization.
The SI framework for sub-diffraction resolution imaging is typically modelled mathematically as a modulation of the spatial frequencies in the sample by those in an illumination pattern, which is then imaged through a system that operates under constraints of linearity and translation invariance. The illumination's modulation of the sample's spatial frequencies results in resolvable "beat" frequencies (i.e., Moiré patterns) that allow reconstruction of sample spatial frequencies that fall outside the system's diffraction limit 34. This mathematical treatment does not make any fundamental assumptions on coherence, and is thus equally applicable to both fluorescent and coherent-diffraction imaging (as long as the complex electric-field is imaged in the coherent imaging case so that the requirement for linear and translation-invariant imaging is satisfied). Indeed, SI's ability to enable sub-diffraction resolution coherent QP-microscopy (SI-QPM) in two dimensions has been clearly demonstrated [26][27][28].
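The beat-frequency argument can be checked numerically. The following is a minimal one-dimensional sketch, not taken from the paper: the frequencies, the sample count, and the passband cutoff are hypothetical values chosen only so that the sample frequency lies outside the passband while the difference frequency lies inside it.

```python
import numpy as np

n = 1024                  # number of samples (hypothetical)
x = np.arange(n) / n      # unit-length field of view
f_sample = 150            # sample frequency, cycles per field of view (outside passband)
f_illum = 100             # illumination frequency (inside passband)
cutoff = 120              # hypothetical passband cutoff of the imaging system

sample = np.cos(2 * np.pi * f_sample * x)
illum = 1 + np.cos(2 * np.pi * f_illum * x)

def detected_peaks(signal, cutoff):
    """Return the frequencies at or below `cutoff` that carry significant energy."""
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    freqs = np.arange(len(spectrum))
    keep = (freqs <= cutoff) & (spectrum > 1e-6)
    return freqs[keep].tolist()

# Uniform illumination: the sample frequency is blocked by the passband.
uniform_peaks = detected_peaks(sample, cutoff)
# Structured illumination: a beat appears at f_sample - f_illum.
structured_peaks = detected_peaks(sample * illum, cutoff)
print(uniform_peaks, structured_peaks)
```

Under these assumptions, the only component that survives the passband with structured illumination is the difference frequency, which is the information that SI reconstruction later shifts back to its true position.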
To intuitively extend the conceptual framework towards 3D coherent imaging, we note that for coherent-diffraction imaging, SI is equivalent to multiplexed oblique illumination, where each tilted plane wave in the illumination's plane-wave decomposition contributes linearly to the electric-field at the image plane 35. It follows from Fourier theory that because on-axis plane-wave illumination results in a spherical-cap TF centered at the origin of frequency space, tilted plane-wave illumination, mathematically described with a phase ramp, displaces the TF in frequency space by an amount set by the tilt angle. This concept is identical to that ubiquitously used in optical diffraction tomography (ODT) 22,29,36.
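The link between a tilted plane wave and a displaced spectrum is the Fourier shift theorem, which can be illustrated with a short sketch. The band-limited test field, the grid size, and the helper name `tilt_to_lateral_frequency` are hypothetical and introduced only for illustration; the tilt-to-frequency conversion assumes a plane wave in a homogeneous medium of refractive index `n_medium`.

```python
import numpy as np

def tilt_to_lateral_frequency(theta_rad, wavelength, n_medium):
    """Lateral spatial frequency (cycles per unit length) of a plane wave tilted by theta."""
    return n_medium * np.sin(theta_rad) / wavelength

n = 256
x = np.arange(n)
rng = np.random.default_rng(0)

# Hypothetical band-limited complex sample field (10 nonzero frequency bins).
spectrum = np.zeros(n, dtype=complex)
spectrum[:10] = rng.normal(size=10) + 1j * rng.normal(size=10)
field = np.fft.ifft(spectrum)

# A tilted plane wave is a linear phase ramp across the field.
tilt_bins = 40
ramp = np.exp(2j * np.pi * tilt_bins * x / n)

# Multiplying by the ramp displaces the spectrum by tilt_bins.
shifted_spectrum = np.fft.fft(field * ramp)
spectrum_is_shifted = np.allclose(shifted_spectrum, np.roll(spectrum, tilt_bins))

# Tilt at the edge of a 1.0 NA aperture in water (n = 1.33), wavelength in micrometers.
edge_frequency = tilt_to_lateral_frequency(np.arcsin(1.0 / 1.33), 0.488, 1.33)
```

In this sketch the displacement equals the ramp's frequency, and a tilt at the aperture edge corresponds to a lateral frequency of NA divided by the wavelength.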
SI-QPM's main difference from conventional ODT is in implementation: because SI linearly multiplexes tilted illuminations onto the sample to achieve its structured pattern, the imaged field is a linear superposition of the sample's spatial frequencies from correspondingly shifted TFs. In the case of a sinusoidal structured pattern, phase shifting the pattern allows for an analytical solution for these spatial frequencies, which can then be digitally shifted to their correct regions in 3D frequency space 33,34. Because the coherent TF is a spherical cap with infinitesimal axial frequency support, simply illuminating with structured patterns with spatial frequencies at the maximum-allowed magnitude, as is typically done to maximize lateral resolution gain in fluorescent SI, leaves much of the axial frequency space uncovered. To thoroughly cover 3D frequency-space, the tilt angle of the illuminations must be incremented. Conventionally, ODT accomplishes this by scanning the tilt angle of the illumination beam directly with a pair of scanning mirrors. SI can achieve the same effect by illuminating with spatial sinusoids of varying spatial frequencies (Fig. 2).
Fig. 2 (caption fragment). Note that illuminating with simply one spatial frequency is not sufficient to fill out axial frequency space in coherent-diffraction imaging, as it is in fluorescent imaging. (d,e,f) Corresponding 3D renderings show 3D frequency space being filled by Ewald caps.
Coherent SI and ODT are conventionally considered separate imaging techniques, and so their connection as intuitively explained above may not be immediately obvious. For the interested reader, Supplementary Note 1 rigorously formulates the theory for SI-enabled 3D QP sub-diffraction imaging upon the fundamental principles of 3D fluorescent SI super-resolution introduced by Gustafsson et al. 30. Though this formulation was derived independently of any consideration of ODT's conventional framework, the final conclusions in Supplementary Note 1, summarized in Supplementary Figs. 2(g-j), are mathematically identical to those of ODT.

C. Reporting diffraction-limited resolution.
Reporting on an imaging system's resolution usually involves reporting some metric of "minimum-resolvable distance". We choose to report theoretical diffraction-limited resolution in terms of the Abbe limit, which gives the cycle-period of the highest spatial frequency of the sample that is transferred by the microscope onto the image plane. In our case of DC-centered, symmetric, and filled lateral TFs, diffraction-limited lateral Abbe resolutions are the reciprocal of the spatial frequency bounds described above: QP and fluorescent imaging have Abbe lateral resolutions of $d_{\parallel,\mathrm{QP}} = 1/k_{\parallel,\mathrm{QP}} = \lambda_{ex}/\mathrm{NA}$ and $d_{\parallel,\mathrm{FL}} = 1/k_{\parallel,\mathrm{FL}} = \lambda_{em}/2\mathrm{NA}$ in the electric-field and optical intensity domains, respectively. As mentioned previously, due to its infinitesimal axial TF extent, conventional QP imaging has little axial resolution 31,32. SI-enabled 3D QP, however, has an expected Abbe axial resolution in electric-field of $d_{\perp,\mathrm{QP}} = 1/k_{\perp,\mathrm{QP}} = \lambda_{ex}/[n(1 - \cos\theta)]$. Similarly, fluorescent imaging has an Abbe axial resolution of $d_{\perp,\mathrm{FL}} = 1/k_{\perp,\mathrm{FL}} = \lambda_{em}/[n(1 - \cos\theta)]$ in optical intensity.
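As a worked example, the Abbe limits can be evaluated for the objectives and wavelengths quoted in this paper. This is a minimal sketch: the function name `abbe_limits` and its argument names are hypothetical, and the calculation uses only the nominal center wavelengths, ignoring the finite filter bandwidths.

```python
import math

def abbe_limits(wavelength_nm, na, n_medium, coherent):
    """Return (lateral, axial) Abbe limits in nm.

    Lateral: wavelength/NA for coherent (QP) imaging, wavelength/(2 NA) for fluorescence.
    Axial:   wavelength / [n (1 - cos(theta))], with NA = n sin(theta).
    """
    theta = math.asin(na / n_medium)
    lateral = wavelength_nm / na if coherent else wavelength_nm / (2 * na)
    axial = wavelength_nm / (n_medium * (1 - math.cos(theta)))
    return lateral, axial

# QP imaging with the 1.0 NA water-immersion objective (n = 1.33) at 488 nm.
qp_lateral, qp_axial = abbe_limits(488, 1.0, 1.33, coherent=True)
# Fluorescence imaging with the 1.3 NA oil-immersion objective (n = 1.51) at 545 nm.
fl_lateral, fl_axial = abbe_limits(545, 1.3, 1.51, coherent=False)

print(f"QP: lateral {qp_lateral:.0f} nm, axial {qp_axial:.0f} nm")
print(f"FL: lateral {fl_lateral:.0f} nm, axial {fl_axial:.0f} nm")
```

Because the axial limit depends on the collection half-angle rather than on NA alone, the refractive index of the immersion medium must be supplied separately from the NA.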
We choose the Abbe standard for "minimum-resolvable distance" because of its straightforward applicability to both coherent and incoherent modalities. Other popular alternatives that involve basic parameters of the image's point-spread-function (FWHM, Rayleigh, Sparrow) have complex interpretations in coherent imaging 1,[37][38][39][40] . In the case of intensity-based coherent imaging, where the imaged intensity is nonlinearly related to sample structure, such interpretations may be outright misleading. Even in the case of electric-field imaging via holographic or computational methods, accurate interpretations must incorporate considerations of the system's coherence properties or the sample's phase dependences. In the case of a fully coherent system and a sample of uniform phase, we show in Supplementary Fig. 3 that the Abbe limit is a conservative metric for resolution and that other reporting schemes for resolution may be significantly more attractive -we emphasize however that such resolution "improvements" are not indicative of an actual difference in the system's PSF or imaging performance.
We also note that in the case of incoherent diffraction-based imaging, such as brightfield, darkfield, phase-contrast, or differential-interference contrast (DIC) microscopies, the lateral Abbe diffraction limit is generally considered to be $d_{\parallel,\mathrm{inc}} = \lambda/(\mathrm{NA}_{ill} + \mathrm{NA})$, where $\lambda$ is the generalized imaging wavelength and $\mathrm{NA}_{ill}$ is the numerical aperture of the illumination objective (i.e., condenser lens) that sets the angular range available for illumination 41,42. In the case in which the illumination and detection objectives have equal numerical apertures, such that $\mathrm{NA}_{ill} = \mathrm{NA}$, the resolution limit $d_{\parallel,\mathrm{inc}} = \lambda/2\mathrm{NA}$ matches the resolution limit $d_{\parallel,\mathrm{FL}}$ of fluorescent microscopy. Because typical QP microscopies use on-axis plane wave illumination, it is tempting to conclude that the resolution limit $d_{\parallel,\mathrm{QP}} = \lambda/\mathrm{NA}$ is simply a special instance of $d_{\parallel,\mathrm{inc}}$ under the constraint $\mathrm{NA}_{ill} = 0$. One could then naturally argue that it is unfair to consider $\lambda/\mathrm{NA}$ as the diffraction limit when the sample could conceivably be illuminated with the full angular range allowed by the illumination objective to directly achieve $\lambda/2\mathrm{NA}$ resolution imaging without the addition of SI. To respond to this, we assert that the general expression for $d_{\parallel,\mathrm{inc}}$ makes the key assumption of incoherent illumination, and thus is not an appropriate standard for coherent imaging. Indeed, simply illuminating the sample with coherent waves spanning the full range of illumination angles allowed by the illumination objective equates to illuminating the sample with a speckle pattern, which is unsuitable for widefield imaging. Hence, the general standard for coherent widefield imaging in the QP-imaging community is to image the sample with a flat electric-field illumination background, achieved with a single on-axis coherent beam (i.e., $\mathrm{NA}_{ill} = 0$), and consider the resulting resolution limit of $d_{\parallel,\mathrm{QP}} = \lambda/\mathrm{NA}$ as the coherent system's diffraction limit 5,9,32,42,43.

III. Methods.
A. Optical System.
The general design of our system is illustrated in Supplementary Fig. 4. As shown, our system uses single-mode broadband light (NKT Photonics, EXW-6) spectrally filtered to 488 ± 12.5 nm (Semrock, FF01-482/25), which serves the dual purpose of being the illumination and excitation for QP and fluorescent imaging, respectively. This light was collimated and passed through a 50:50 polarization beam splitter (PBS, Thorlabs, PBS251) before being incident onto an amplitude spatial light modulator (SLM, Holoeye, HED 6001). Because the SLM is nematic, and thus capable of accepting non-binary inputs, sinusoidal patterns were programmed into the SLM. In the SLM's coordinate space, the minimum spatial period of written patterns was limited to ~27 µm, corresponding to approximately 3.4 SLM pixels (SLM pixel pitch = 8.0 µm) and thus satisfying the Nyquist limit for SLM pixel sampling. The pattern written onto the SLM was passed through the first 4f system (L1 → L2) before being imaged through the second 4f system (L3 → OBJ) onto the sample. To ensure faithful generation of a sinusoidal pattern at the sample, extraneous diffraction orders resulting from the SLM's pixelation were spatially filtered out by an adjustable iris diaphragm (F, Thorlabs, ID25) placed in the Fourier plane of the first 4f system. The focal length of L3 was chosen so that the desired ±1 diffraction orders arising from a ~27 µm spatial-period sinusoidal pattern would be refocused to points near the opposite edges of the back focal plane of OBJ. Diffraction and fluorescence from the sample are collected in transmission through a detection objective (matched in NA to the imaging objective) and are magnified by the second 4f system (OBJ → L4).
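To make the SLM sampling argument concrete, the following sketch generates phase-shifted sinusoidal grayscale patterns at the minimum period described above. It is an illustration under stated assumptions, not the authors' acquisition code: the function name `sinusoid_pattern`, the 1080 x 1920 panel size, the 8-bit depth, and a linear mapping from gray level to amplitude are all assumptions.

```python
import numpy as np

def sinusoid_pattern(shape, period_px, angle_rad, phase_rad):
    """Return an 8-bit sinusoidal grayscale pattern with the period given in SLM pixels."""
    rows, cols = np.indices(shape)
    # Coordinate along the pattern's modulation direction.
    u = cols * np.cos(angle_rad) + rows * np.sin(angle_rad)
    pattern = 0.5 * (1 + np.cos(2 * np.pi * u / period_px + phase_rad))
    return np.round(255 * pattern).astype(np.uint8)

PIXEL_PITCH_UM = 8.0    # SLM pixel pitch quoted in the text
MIN_PERIOD_UM = 27.0    # minimum pattern period quoted in the text
period_px = MIN_PERIOD_UM / PIXEL_PITCH_UM   # about 3.4 pixels per period

# Nyquist sampling needs more than 2 pixels per period.
satisfies_nyquist = period_px > 2

# Five phase shifts spaced 2*pi/5 apart at one pattern orientation.
phases = [2 * np.pi * k / 5 for k in range(5)]
frames = [sinusoid_pattern((1080, 1920), period_px, 0.0, p) for p in phases]
```

Changing `angle_rad` rotates the pattern and changing `period_px` varies its spatial frequency, which is the pixel-addressable control that a fixed physical grating would not offer.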
The sample's fluorescence signal is split from the diffraction signal by a dichroic mirror (DM, Thorlabs, DMLP505) and further spectrally filtered (SF, Thorlabs, FEL500) before imaging onto the first camera (CMOS-F, Pixelink). The sample's diffraction signal, after being split from the fluorescence via the DM, was passed through a diffraction-phase setup, where a grating (DG, Edmund Optics Ronchi 60 lp/mm) was placed at the conjugate image plane to the sample to split the signal into various diffraction orders. A mask (M) was positioned at the Fourier plane of DG to physically block the -1st diffraction order while completely passing the +1st order. The mask also contained a 20 µm pinhole (PH, Edmund Optics, 52-869) to spatially filter the 0th order to generate a uniform wavefront reference beam to interfere with the +1st diffraction order at the second camera (CMOS-QP, Pixelink). Due to broadband common-path off-axis interference, high temporal phase stability and low coherent noise (noise variance of < 0.001 rad for widefield imaging) were achieved 26. Supplementary Fig. 4(c) shows in more detail how the mask was positioned relative to the diffraction orders arising from DG and the SLM.
For experiments involving calibration microspheres (Fig. 3, Supplementary Figs. 6 and 7), a 60X, 1.0 NA Nikon Physiology objective lens was used. For the experiments involving A549 cells (Fig. 4, Supplementary Figs. 8 and 9), this objective was replaced with a 40X, 1.3 NA Zeiss objective lens. The focal length of the condenser lens L4 was chosen to set the system magnification such that a coherent diffraction-limited spot was sampled by ~6x6 camera pixels (6.7 µm pixel pitch). This oversampling, in conjunction with the off-axis interference frequency set by DG (60 lp/mm), was sufficient to extract the sample's complex electric-field via standard digital off-axis holography techniques [42][43][44]. Examples of raw interferogram acquisitions alongside associated Fourier distributions are shown in Supplementary Fig. 5, to demonstrate clear isolation of the sample's complex electric-field information in Fourier space.
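The off-axis extraction step can be sketched as generic Fourier sideband filtering. The example below is an assumption-laden illustration rather than the authors' processing code: the function name `demodulate_off_axis`, the circular window, the carrier position, and the synthetic Gaussian phase object are all hypothetical.

```python
import numpy as np

def demodulate_off_axis(interferogram, carrier, radius):
    """Recover a complex field from an off-axis interferogram.

    carrier: (row, col) offset of the +1 sideband from DC, in FFT bins.
    radius:  radius of the circular sideband window, in FFT bins.
    """
    ny, nx = interferogram.shape
    spectrum = np.fft.fftshift(np.fft.fft2(interferogram))
    rows, cols = np.indices((ny, nx))
    center_r, center_c = ny // 2 + carrier[0], nx // 2 + carrier[1]
    window = (rows - center_r) ** 2 + (cols - center_c) ** 2 <= radius ** 2
    # Keep only the sideband, then move it to the origin of frequency space.
    sideband = np.roll(spectrum * window, (-carrier[0], -carrier[1]), axis=(0, 1))
    return np.fft.ifft2(np.fft.ifftshift(sideband))

# Synthetic test object: a weak Gaussian phase bump (hypothetical values).
n = 128
rows, cols = np.indices((n, n))
true_phase = 0.5 * np.exp(-((rows - 64) ** 2 + (cols - 64) ** 2) / (2 * 10.0 ** 2))
sample_field = np.exp(1j * true_phase)

carrier = (0, 32)   # off-axis carrier along columns, in cycles per frame
reference = np.exp(-2j * np.pi * carrier[1] * cols / n)
interferogram = np.abs(sample_field + reference) ** 2

recovered = demodulate_off_axis(interferogram, carrier, radius=16)
phase_error = np.max(np.abs(np.angle(recovered) - true_phase))
```

The window radius must be small enough to exclude the DC term and large enough to contain the sample's bandwidth, which is why the camera oversampling and the grating frequency are chosen together.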

B. Image Acquisition.
Custom acquisition software was written in MATLAB. Three-beam interference, with an unblocked 0th order, was used to generate periodic illumination patterns for 3D fluorescent and QP SI. Our sample was axially scanned during image acquisition at increments of 200 nm, satisfying the Nyquist requirement set by our QP and fluorescent axial diffraction limits. A total of 60 axial slices were taken for a single volume. There was no physical change in the optical system between 3D fluorescent and QP SI imaging; however, the patterned illumination procedures for the two modalities were different and so the fluorescent/QP acquisitions were not simultaneous.
For 3D fluorescent SI imaging, the maximum allowed spatial frequency was used for the illumination pattern, such that the component ±1 orders were at the edge of the illuminating objective's back focal plane. For each imaged z-plane, acquisitions were taken for five phase shifts, spaced $2\pi/5$ apart, per rotation of the illumination pattern, with two rotations, spaced $\pi/2$ radians apart. In total, this corresponded to 600 raw acquisitions to reconstruct a single volume. Camera integration time was set to 120 ms per acquisition for sufficient fluorescent SNR.
For 3D QP SI imaging, the spatial frequency magnitude of the periodic illumination pattern was incremented 10 times through the domain $[0, \mathrm{NA}/\lambda_{ex}]$, per imaged z-plane. For each spatial frequency, acquisitions were taken for five phase shifts, spaced $2\pi/5$ apart, per rotation of the illumination pattern, with six total rotations, spaced $\pi/3$ radians apart. In total, this corresponded to 18000 raw acquisitions to sub-diffraction resolve a single volume. Camera integration time was set to 15 ms per acquisition to average out the high-frequency temporally-fluctuating "flickers" inherent to our SLM device. Supplementary Fig. 5 illustrates examples of raw interferograms and associated Fourier transforms from which complex-valued sample electric-fields are reconstructed.
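The acquisition counts follow directly from the scan parameters, as the short sketch below shows. The function name `raw_acquisitions` is hypothetical, and the exposure totals count camera integration time only; SLM settling, stage motion, and camera readout are not described in the text and are ignored here.

```python
def raw_acquisitions(z_slices, phases, rotations, frequencies=1):
    """Number of raw frames needed for one reconstructed volume."""
    return z_slices * phases * rotations * frequencies

# Fluorescent SI: one spatial frequency, five phases, two rotations.
fluorescence_frames = raw_acquisitions(z_slices=60, phases=5, rotations=2)
# QP SI: ten spatial frequencies, five phases, six rotations.
qp_frames = raw_acquisitions(z_slices=60, phases=5, rotations=6, frequencies=10)

# Lower bounds on acquisition time from integration time alone (seconds).
fluorescence_exposure_s = fluorescence_frames * 0.120
qp_exposure_s = qp_frames * 0.015

print(fluorescence_frames, qp_frames)
print(fluorescence_exposure_s, qp_exposure_s)
```

The thirtyfold larger frame count for QP imaging comes from having to step the pattern's spatial frequency and use more orientations to fill axial frequency space.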
For both fluorescent and QP imaging, phase-shifting of the spatial illumination pattern allowed solving for the sub-diffraction resolution spatial frequency components via linear system inversion. These components were then translated back to their appropriate regions in frequency space 26,34 before being combined (Supplementary Note 1). Standard Wiener deconvolution methods were applied to account for the non-uniform weighting of the final computed TF. Standard whole-image adjustments (brightness, contrast) were consistently applied across image datasets, in accordance with accepted practices for image presentation 45,46 .
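The linear system inversion can be sketched as a least-squares separation of modulation orders. This example is a simplified illustration and not the reconstruction described in Supplementary Note 1: the function name `separate_orders`, the array shapes, and the random test data are hypothetical. It assumes each raw frame is a sum of unknown components weighted by exp(i m φ), with orders m = -1, 0, +1 as for coherent field imaging; three-beam fluorescence intensity data would instead use m = -2 through +2.

```python
import numpy as np

def separate_orders(frames, phases, orders):
    """Solve frames[n] = sum_m exp(1j * orders[m] * phases[n]) * component[m] for the components."""
    phases = np.asarray(phases)
    orders = np.asarray(orders)
    # Mixing matrix of shape (n_phases, n_orders).
    mixing = np.exp(1j * np.outer(phases, orders))
    stack = np.asarray(frames)
    flat = stack.reshape(len(phases), -1)
    # The pseudo-inverse gives the least-squares solution when phases outnumber orders.
    components = np.linalg.pinv(mixing) @ flat
    return components.reshape((len(orders),) + stack.shape[1:])

# Synthetic check with random complex components (hypothetical data).
rng = np.random.default_rng(1)
phases = 2 * np.pi * np.arange(5) / 5
orders = np.array([-1, 0, 1])
true_components = rng.normal(size=(3, 8, 8)) + 1j * rng.normal(size=(3, 8, 8))

mixing = np.exp(1j * np.outer(phases, orders))
frames = np.tensordot(mixing, true_components, axes=1)   # shape (5, 8, 8)

recovered = separate_orders(frames, phases, orders)
separation_ok = np.allclose(recovered, true_components)
```

Once separated, each component would be shifted in frequency space by its order's illumination frequency before the components are combined and deconvolved.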

C. Sample Preparation.
Microsphere Phantom Preparation. We prepared calibration samples of 400 nm, 520 nm, and 770 nm diameter polystyrene microspheres (Bangs Laboratories). In order to attach the beads to a surface, 10 µL of the microsphere dilutions (2 µL stock-solution / 500 µL ethanol) were placed onto #1.5 coverslips and allowed to dry. 1X phosphate buffered solution (PBS) was placed over the regions of interest and served as the appropriate immersion fluid for our 1.0 NA Nikon Physiology objective lens.
Cell Preparation. A549 lung cancer cells were cultured using Dulbecco's Modified Eagle Medium (DMEM) supplemented with 10% fetal bovine serum and 1 µL/mL pen-strep. Cells were plated at low density onto #1.5 coverslips and allowed to attach to the substrate overnight. Cells were fixed using 4% paraformaldehyde in PBS. Alexa Fluor 488 phalloidin (Life Technologies) was used to stain filamentous actin following the manufacturer's suggested protocol. Following staining, an adhesive spacer and coverslip were placed on top of the sample to ensure a uniform PBS layer above the imaging field of view. Oil of refractive index n = 1.51 was placed over the region of interest and served as the appropriate immersion fluid for our 40X 1.3 NA Zeiss objective lens.

A. 3D visualization of microspheres with SI-QPM.
The design of our SI-QPM system largely follows our original SI-DPM system, which achieved QP imaging via common-path off-axis holography 26. One notable difference in our current SI-QPM system, however, is that a spatial-light-modulator (SLM) is used instead of a physical grating to generate structured patterns at the sample. This is necessary to have pixel-addressable control over the patterns, which in turn allows tuning the patterns' translations, rotations, and spatial-frequency magnitudes without physically moving or replacing physical gratings. Our system uses broadband, single-mode illumination at $\lambda_{ex} = 488 \pm 15$ nm with an imaging objective of NA = 1 designed for imaging through immersion media with refractive index $n = 1.33$. This yields a lateral diffraction limit of $d_{\parallel,\mathrm{QP}} = \lambda_{ex}/\mathrm{NA} \approx 488$ nm and an axial diffraction limit for tomography of $d_{\perp,\mathrm{QP}} = \lambda_{ex}/[n(1 - \cos\theta)] \approx 1.0$ µm. Our specific acquisition procedures used to reconstruct a QP tomographic volume via SI are detailed in Methods.
SI's capability to enable 2D lateral QP sub-diffraction resolution [26][27][28] was verified by our updated SI-QPM system with calibration and biological samples (shown in Supplementary Figs. 6(a-d), 8(a-d), and 9(a-d)). Here, however, we highlight the importance of SI-enabled 3D QP by demonstrating how a lack of axial resolution may affect and degrade lateral visualization of structures even within the system's diffraction limit. We imaged a monolayer sample of 520 nm polystyrene microspheres (n = 1.60 at λ = 488 nm) through an immersion medium of 1X phosphate buffered solution (PBS). Figs. 3(a,b) compare the SI-enhanced and conventional widefield (WF) QP images, respectively, of a central x-y slice through an imaging volume of this sample, demonstrating superior visualization with SI enhancement. Figs. 3(c,d) show depth-slices through the location marked by the dashed yellow line in Figs. 3(a,b), respectively, and demonstrate that SI-enhancement can provide depth sectioning of the microspheres to a resolution of 1.2 µm, which matches well with our expected axial resolution. No such depth localization is apparent with the conventional WF depth-slice, where QP signal from the microspheres propagates through all depth slices. We note in Fig. 3c that, although the beads are well localized, a haze of QP signal (indicated by yellow arrows) is present; this is an artifact of the "missing-cone" problem in ODT, arising from incomplete frequency coverage, and is typically dealt with in post-processing 47. To demonstrate how SI changes the frequency content of the imaged volume, we show in Figs. 3(e,f) the axial profiles of radially-averaged 3D Fourier transforms of the SI-enhanced and WF imaging volumes, respectively. The Fourier transform of the WF imaging volume clearly shows that the imaged spatial frequencies lie on the spherical shell (Ewald cap) associated with coherent imaging. In contrast, the Fourier transform of the SI-enhanced volume depicts the distinct butterfly shape associated with ODT 22,36, which allows optical sectioning and enables 3D QP. As expected, Fig. 3e also shows twice the lateral frequency support as Fig. 3f. Figs. 3(g,h) show zooms of regions in Figs. 3(a,b) outlined in yellow to emphasize the improvement in visualization capability of SI-enhanced QP over WF. Though the microspheres are within the system's diffraction limit and are visible in the conventional WF QP image zoom in regions of sparse microsphere density (indicated with green arrows in Fig. 3h), microsphere edges show significant defocus as well as susceptibility to diffraction artifacts. In regions of high microsphere density (indicated with yellow arrows in Fig. 3h), these defocus and diffraction artifacts can effectively hinder clear visualization of individual microspheres. In the SI QP image zoom, such defocus and diffraction artifacts are effectively sectioned out and result in clear and sharp visualization of all individual microspheres. Supplementary Fig. 7 rigorously demonstrates that, even with sample features well within the diffraction limit, WF QP imaging is susceptible to diffraction artifacts from defocus; conversely, SI QP imaging clearly demonstrates its tomographic capability to strongly section out defocused artifacts.
Fig. 3 (caption fragment). [...] in (a,b)) show the microspheres depth-localized to a resolution of 1.2 µm with SI-enhancement, while no depth-localization is apparent with conventional WF. (e,f) The axial cross-sections of radially-averaged Fourier transforms show the 3D frequency content of the SI-enhanced and WF imaging volumes. The Fourier transform of the WF imaging volume clearly shows the Ewald cap associated with conventional coherent imaging. In contrast, the Fourier transform of the SI-enhanced imaging volume depicts the distinct butterfly shape associated with ODT. (g,h) Zooms of the region outlined in yellow from (a,b), respectively, are shown. The 520 nm diameter beads fall just within the bound set by the diffraction limit, and so are theoretically resolvable; however, coherent noise and out-of-focus diffraction artifacts render sections of the zoom (indicated by yellow arrows in (h)) practically irresolvable without the enhancements allowed by SI.

B. Multimodal 3D sub-diffraction resolution biological visualization.
SI 3D sub-diffraction cellular resolution has been demonstrated for fluorescent imaging 30,48; here, however, we experimentally demonstrate, for the first time to our knowledge, SI being used for 3D sub-diffraction imaging of both QP and fluorescence in a single, multimodal optical system. The technical design of this system is detailed in Methods and Supplementary Fig. 4, and mainly consists of a SI-DPM add-on module to a conventional SIM system. For this system, we used an imaging objective with NA of 1.3 to image through immersion media with refractive index $n = 1.51$. Our excitation light remained the original broadband, single-mode illumination at $\lambda_{ex} = 488 \pm 15$ nm. This yields a QP lateral diffraction limit of $d_{\parallel,\mathrm{QP}} = \lambda_{ex}/\mathrm{NA} \approx 375$ nm and a QP tomographic axial diffraction limit of $d_{\perp,\mathrm{QP}} = \lambda_{ex}/[n(1 - \cos\theta)] \approx 635$ nm. Our fluorescent filter is designed to pass fluorescent emission at $\lambda_{em} = 545 \pm 20$ nm. Thus, our expected fluorescent lateral and axial diffraction limits are $d_{\parallel,\mathrm{FL}} = \lambda_{em}/2\mathrm{NA} \approx 210$ nm and $d_{\perp,\mathrm{FL}} = \lambda_{em}/[n(1 - \cos\theta)] \approx 735$ nm, respectively. To demonstrate imaging performance in a biological sample, we fluorescently labelled A549 cells with Alexa Fluor 488 phalloidin for F-actin visualization, and imaged QP and fluorescence (Fig. 4 and Supplementary Figs. 8 and 9). While QP imaging offers endogenous mass-based contrast, it lacks sensitivity to specific organelles and cellular components. Therefore, coupling QP imaging with fluorescence allows for the simultaneous evaluation of mass and other descriptors of specific cytological components, and could be further used to delineate organelle boundaries for the determination of refractive index. For both modalities, SI imaging offered dramatic visualization enhancements when compared to conventional WF counterparts. We begin by comparing imaging performance between SI and WF QPM for an individual A549 cell (Fig. 4(a,b)). We zoom into a region above the nucleus to visualize a cluster of mass localizations. Molecular labelling would be required to truly ascertain the identity of these high phase-delay structures. However, given their small size and high phase-delay, we hypothesize that these are small lipid-based vesicles, which are known to have a high refractive index lipid bilayer, leading to the relatively high phase delay for the small object 49. A previous work utilized total internal reflection fluorescence (TIRF) microscopy to monitor the secretion of ATP-containing vesicles from the same cell line, and reported these vesicles to surround the perinuclear region 50. As seen in Figs. 4(a,b), the visible high phase-delay vesicles indeed surround the apical portion of the cell where the nucleus likely resides.
Fig. 4 (caption fragment). [...] in (a,b), which shows several high phase-delay structures clearly resolvable with (c) SI but not (d) WF QP imaging. 3D imaging capabilities between SI and WF are compared when considering defocused sample planes through the (c,e,g,i) SI and (d,f,h,j) WF image volumes. In the SI volume, the sharp QP signal from the high phase-delay structures attenuated with increasing defocus, indicating optical depth sectioning. In contrast, the WF volume showed the QP signal from the high phase-delay structures diffracting out into the defocused planes, leading to diffraction artifacts indistinguishable from in-focus QP signal. Fluorescent resolution was also enhanced when comparing (k) SI to (l) WF imaging, respectively. (m,o,q,s) Defocused planes show that SI fluorescence imaging demonstrates clear optical sectioning and shows the actin morphology undergoing clear organizational changes through different depths of the cell. In contrast, defocused planes through the (n,p,r,t) WF fluorescence volume show a strong defocused signal throughout the volume stack, which hinders visualization of important high-resolution features.
From the SI and WF zooms in Figs. 4(c,d), respectively, we see that these QP structures are beyond the diffraction limit; the factor-of-2 resolution enhancement enabled by SI, however, allows clear visualization of the individual localizations (Supplementary Fig. 9(d) quantitatively demonstrates adjacent localizations to have QP peak-to-peak distances of ~230 nm). Furthermore, Figs. 4(c-j) demonstrate that axial translation of the sample results in severe diffraction artifacts from the out-of-focus localizations in the conventional WF visualization (indicated by yellow arrows in Figs. 4(h,j)). Depending on the specific features in the sample, these artifacts may even be indistinguishable from in-focus QP signal. With SI enhancement, however, the localizations effectively disappear from out-of-focus QP images. This phenomenon is attributable to the 3D sectioning ability that SI enables in QP imaging by filling out axial frequency space.

The F-actin visualization was also drastically improved when comparing fluorescent SI super-resolution to diffraction-limited WF imaging (Figs. 4(k,l)). As can be seen, SI was required to visualize individual actin units. Previous work has demonstrated that 3-beam fluorescent SI results in 3D resolution gain as well as out-of-focus rejection via filling of the missing cone 30 . We experimentally confirm this by showing a zoom region (outlined in yellow in Figs. 4(k,l)) undergoing defocus (Figs. 4(m-t)). Increased axial visualization is apparent when comparing the defocused zooms of SI to conventional WF. Defocused signal is abundant in the WF zooms and occludes clear visualization of the important high-frequency content associated with the F-actin; conversely, the SI-enhanced zooms demonstrate clear optical sectioning, and the imaged actin morphology shows distinct changes as the cells were axially scanned through the image focus. Supplementary Fig. 9(h) quantitatively indicates the improvement in lateral resolution with an intensity profile drawn across an in-focus actin filament (Supplementary Fig. 9(g)); the width of the filament was 180 nm between the signal troughs in the SI fluorescence image, while no resolvable width could be measured in the WF image.
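The diffraction limits quoted at the start of this subsection follow from simple closed-form expressions, and the short Python sketch below evaluates them as a sanity check. It assumes the half-angle is obtained from NA = n sin(theta) and uses the centre wavelengths of the excitation and emission bands; the function name `diffraction_limits` is hypothetical. With these assumptions the lateral limits and the fluorescent axial limit reproduce the quoted values, while the QP axial limit comes out slightly larger than the quoted ≈635 nm, which may simply reflect a different effective wavelength within the ±15 nm excitation band.

```python
import math

def diffraction_limits(wavelength_nm, na, n_medium, coherent):
    """Return (lateral, axial) diffraction limits in nm.

    Lateral: lambda/NA for coherent (QP) imaging, lambda/(2 NA) for fluorescence.
    Axial:   lambda / [n (1 - cos(theta))], assuming NA = n sin(theta).
    """
    theta = math.asin(na / n_medium)  # acceptance half-angle (assumption)
    lateral = wavelength_nm / na if coherent else wavelength_nm / (2.0 * na)
    axial = wavelength_nm / (n_medium * (1.0 - math.cos(theta)))
    return lateral, axial

# Centre wavelengths from the text: 488 nm excitation, 545 nm emission.
qp_lat, qp_ax = diffraction_limits(488.0, 1.3, 1.51, coherent=True)
fl_lat, fl_ax = diffraction_limits(545.0, 1.3, 1.51, coherent=False)

print(f"QP:           lateral {qp_lat:.0f} nm, axial {qp_ax:.0f} nm")
print(f"Fluorescence: lateral {fl_lat:.0f} nm, axial {fl_ax:.0f} nm")
```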

V. Discussion.
From a theoretical perspective, this work (1) introduced the framework for SI 3D sub-diffraction QP imaging, (2) drew parallels between SI-QPM and oblique illumination microscopy, and (3) equated SI-QPM to a multiplexed form of the more established ODT. However, more fundamentally, this work conveys that SI constitutes a sub-diffraction resolution technique compatible with both fluorescence and QP.
This multimodal compatibility has direct implications that can benefit microscopy for biological research. QP imaging provides a technology with which to noninvasively analyze endogenous cellular biophysical and biochemical parameters. It has shown promise in studying whole-cell spectroscopy, morphology, mass, stiffness, and refractive-index/optical-path-length distributions [19][20][21][22][23] . Furthermore, QP is effectively free and fast: no major sample preparation procedures are necessary to get strong imaging signal, and camera integration times rarely pose a problem for longitudinal biological studies. However, QP imaging inherently has no molecular specificity, and the biological parameters extracted from QP cannot be localized to specific cellular components. This prevents analysis of localized and structure-specific QP, which can hinder studies exploring morphology, mechanics, mass, spectra, or density endogenous to individual subcellular components. Towards this end, due to its ability for molecular-specific contrast, fluorescence microscopy directly complements QP imaging. Fluorescence microscopy is the dominant imaging choice when studying interactions and dynamics of specific sub-cellular components, as is necessary in studies of gene expression, protein localization, intracellular transport, organelle dynamics, diffusion kinetics, etc. [13][14][15][16][17][18] . A single optical system that incorporates QP and fluorescent imaging can allow registered, multimodal visualization of specific, localized regions of the sample and perhaps enable molecular-specific quantitative analysis. Indeed, a few past works may have considered this potential and introduced optical systems that efficiently combined both modalities 51,52 .
Unfortunately, though such systems may represent important tools for biological analysis, they remain diffraction-limited, and a generalized scheme to achieve robust sub-diffraction resolution imaging with coherent and fluorescent contrast has remained elusive.
As shown in this work, SI represents a solution to this problem. Both coherent and fluorescent imaging have individually shown that a factor-of-2 resolution gain dramatically enhances biological visualization. Fig. 4 and Supplementary Figs. 8 and 9 demonstrate this with QP imaging by significantly improving visualization of high phase-delay structures surrounding the nuclear region of the cell. ODT also exploits this factor-of-2 resolution gain when visualizing 3D refractive-index maps at sub-diffraction resolutions 22 . In fluorescence imaging, previous applications utilizing SI-enabled resolution-doubling have included super-resolved visualization of microtubule dynamics associated with the polymerization of α-tubulin in Drosophila melanogaster S2 cells 48 . Mitochondrial dynamics in HeLa cells were also imaged, and SI-enabled super-resolution was required to visualize the fine mitochondrial features associated with the cristae, usually seen only with electron microscopes.
We envision that the biological insights gained from such resolution enhancements, which are typically obtained on separate coherent and fluorescent imaging systems, can potentially be combined within the SI framework. Such a synergy may be important to comprehensively study biological components with distinct molecular and biophysical/biochemical functions. For example, the cell cytoskeleton, known to be a network of F-actin, microtubules, and other intermediate filaments connecting most cellular structures, directly affects the mechanical properties of the cell. Previous studies have explored the distributions of mechanical stresses and displacements within the cell in response to applied loads and have modelled the cytoskeleton as a network of discrete, stressed elements 53,54 . The cytoskeleton, however, also affects various important signal transduction pathways (STP) that control how information (from either mechanical or molecular stimuli) is passed from the cell membrane to structures within the cytoplasm or nucleus. Important cytoskeletal components that enable this include molecules from the glycolytic enzyme, protein kinase, lipid kinase, hydrolase, and GTPase families 55 . Multimodal QP/fluorescent sub-diffraction imaging can offer a unique ability to probe molecular and mechanical interactions in the cytoskeleton during STP events by enabling visualization of molecular-specific mechanical forces as well as whole-cell mechanical responses to signal transduction events. Applications of this can be important to explore how mechanical or mechanotransduction events affect gene expression and protein synthesis, as well as whole-cell functions such as cell growth, differentiation, locomotion, cytokinesis, and apoptosis.

Supplementary Note 1
In this section, we rigorously adapt the existing mathematical framework for 3D super-resolution fluorescent imaging via SI to 3D sub-diffraction resolution QP imaging. Though we highlight distinct differences between the two frameworks, we strive to maintain similar notation and derivation strategies as those presented in the original work by Gustafsson et al 30 . We hope that such similarities may encourage readers to draw comparisons between the mathematical frameworks for SI-enabled sub-diffraction QP and super-resolution fluorescent 3D reconstructions. We start by briefly reviewing how spatial frequencies propagate through a coherent imaging system.

Coherent propagation of spatial frequencies through system aperture
For widefield coherent imaging, the illumination beam is typically considered to be a monochromatic plane-wave of wave-vector $\mathbf{k}_{illum} = (0,0,k)$ incident with a flat wavefront on a diffracting sample (shown below in Supplementary Figure 1(a)). Here, $k = 2\pi/\lambda$ is the wave-vector magnitude, and $\lambda$ is the illumination wavelength. The interaction of the sample with the illumination beam results in a total diffraction that can be decomposed into plane-wave components, each with a differently oriented wave-vector. Because $\lambda$ is maintained across all plane-wave components, these differently oriented wave-vectors share the common wave-vector magnitude of $k$. Thus, in 3D Fourier space, the set of all diffracted wave-vectors traces a sphere (i.e., Ewald sphere) of radius $k$ centered at the origin.
We denote $\mathbf{k}_s$ as a wave-vector from the set of plane-wave components accepted through the numerical aperture of the imaging objective. The set of all possible wave-vectors $\mathbf{k}_s$ traces a spherical cap on the Ewald sphere (illustrated below by a circular arc in Supplementary Figure 1(b)), and describes the 3D spatial frequencies of the component plane-waves that are allowed to propagate to the imaging detector. In the lightly-scattering approximation, valid for largely transparent samples, $\mathbf{k}_s \approx \mathbf{k}_{obj} + \mathbf{k}_{illum}$, where $\mathbf{k}_{obj}$ is a 3D spatial frequency inherent in the sample. Thus, we see that $\mathbf{k}_{obj}$ is simply shifted from $\mathbf{k}_s$ by the constant illumination wave-vector, $\mathbf{k}_{obj} \approx \mathbf{k}_s - \mathbf{k}_{illum}$, for the set of all detected $\mathbf{k}_s$. In Fourier space, the set of all $\mathbf{k}_{obj}$ thus traces a spherical cap coincident with the origin (shown below in Supplementary Figure 1(c) in the case of widefield imaging), and denotes the region of the sample's 3D spatial-frequency spectrum that can be imaged. Thus, this region is considered the coherent system's transfer function (TF). The aim of sub-diffraction resolution imaging is to capture more sample spatial frequencies than those directly encompassed by this TF.
Supplementary Figure 1. Illustration depicting transfer of spatial frequencies in a coherent imaging system. (a) Illustration depicting imaging setup for orthogonal, widefield, coherent illumination. (b) Fourier diagram illustrating the illumination wave-vector, the Ewald sphere (in dashed outline) for all possible diffracted spatial frequencies, as well as the set of all diffracted spatial frequencies $\mathbf{k}_s$ that can propagate to the imaging detector. (c) Fourier diagram illustrating the coherent system's transfer function as the region of the sample's 3D spatial-frequency spectrum that gives rise to the set of all detected spatial frequencies $\mathbf{k}_s$.
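The geometry described above can be sketched numerically. The following Python snippet is a minimal illustration, not the authors' code: it samples a 2D (kx, kz) cross-section of the detected spherical cap, subtracts the on-axis illumination wave-vector, and reads off the lateral and axial extent of the resulting TF support. The function name `ewald_cap`, the sampling density, and the numerical values (0.488 µm wavelength, NA 1.3, index 1.51) are illustrative assumptions; whether the wavelength should be taken in vacuum or in the medium is not specified here.

```python
import numpy as np

def ewald_cap(k, theta_max, num=201):
    """Sample a (kx, kz) cross-section of the detected cap and of the TF support."""
    theta = np.linspace(-theta_max, theta_max, num)          # accepted angles
    k_s = np.stack([k * np.sin(theta), k * np.cos(theta)], axis=1)  # detected wave-vectors
    k_illum = np.array([0.0, k])                             # on-axis plane-wave illumination
    k_obj = k_s - k_illum                                    # sample frequencies that are imaged
    return k_s, k_obj

k = 2 * np.pi / 0.488                 # rad/um, illustrative wavelength (assumption)
theta_max = np.arcsin(1.3 / 1.51)     # half-angle, assuming NA = n sin(theta)
k_s, k_obj = ewald_cap(k, theta_max)

lateral_cutoff = np.abs(k_obj[:, 0]).max()   # k sin(theta_max)
axial_extent = -k_obj[:, 1].min()            # k (1 - cos(theta_max))
print(lateral_cutoff, axial_extent)
```

The cap passes through the origin and is thin along kz, which is the usual picture of why a single widefield coherent acquisition has limited axial frequency support.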

3D image formation with axial scanning of sample
The image formation process in a coherent imaging system with complex electric-field imaging can be described with standard properties of linearity and translation-invariance. Namely, the electric-field of the system's image is a blurred version of the electric-field diffracting through the sample. Mathematically, this can be described by:

$$d(\mathbf{r}) = h(\mathbf{r}) \otimes [s(\mathbf{r}) \cdot i(\mathbf{r})] \tag{1}$$

where $\mathbf{r} = (x, y, z)$ is the 3D spatial coordinate vector, $s(\mathbf{r})$ is the sample's complex transmittance function, $i(\mathbf{r})$ is the illumination electric-field through the sample, $d(\mathbf{r})$ is the acquired image's complex electric-field (obtained in this manuscript via off-axis holography), $h(\mathbf{r})$ is the imaging system's coherent point-spread-function (PSF), and $\otimes$ is the convolution operator. We note here that $h(\mathbf{r})$ describes the system's PSF for the spatial frequencies present in the diffraction propagating from the sample that are able to arrive at the imaging detector, not simply the inherent spatial frequencies in the sample. Thus, for the purposes of this derivation, we label the Fourier transform of $h(\mathbf{r})$ as the system's propagating transfer function (pTF), such that $h(\mathbf{r}) \overset{\mathrm{FT}}{\Longleftrightarrow} H(\mathbf{k})$, where $\mathbf{k} = (k_x, k_y, k_z)$ is the 3D spatial frequency vector and $H(\mathbf{k})$ is the system pTF that encompasses the region of 3D Fourier space that describes the spatial frequencies present in the diffracted waves from the sample. Mathematically, the pTF is exactly the set of all possible detected wave-vectors $\mathbf{k}_s$ (shown above in Supplementary Figure 1(b)).
We consider the specific case where the 3D illumination electric-field can be written as a sum of discrete components, each separable into axial and lateral harmonic functions:

$$i(\mathbf{r}) = \sum_m a_m\, i_m(z)\, j_m(\mathbf{r}_{x,y}) \tag{2}$$

where $\mathbf{r}_{x,y} = (x, y)$ is the 2D lateral spatial coordinate vector, $i_m(z)$ is the $m$th axial complex harmonic function, $j_m(\mathbf{r}_{x,y})$ is the $m$th lateral complex harmonic function, and $a_m$ is an arbitrary coefficient. If this illumination function is fixed with respect to the sample during image acquisition, the expression for the 3D volumetric image can be simply written by substituting Eq. (2) into Eq. (1) and expanding the convolution integral:

$$d(\mathbf{r}) = \sum_m a_m \int h(\mathbf{r} - \mathbf{r}')\, s(\mathbf{r}')\, i_m(z')\, j_m(\mathbf{r}'_{x,y})\, \mathrm{d}^3\mathbf{r}' \tag{3}$$

where $\mathbf{r}' = (x', y', z')$ refers to the 3D spatial coordinate vector in the reference frame of the sample, $h$ is the coherent PSF, $s$ is the sample's complex transmittance, and $d$ is the image's complex electric-field. Importantly, in the case of a fixed illumination function with respect to the sample during image acquisition, Eq. (3) shows the axial harmonic function depending on only the sample's reference coordinates.
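Conversely, if the sample is axially scanned during image acquisition, the axial harmonic function instead depends on the difference between the sample's and image's coordinate frames, as first noted by Gustafsson et al 30 :

$$d(\mathbf{r}) = \sum_m a_m \int h(\mathbf{r} - \mathbf{r}')\, i_m(z - z')\, s(\mathbf{r}')\, j_m(\mathbf{r}'_{x,y})\, \mathrm{d}^3\mathbf{r}' = \sum_m a_m \big\{[h \cdot i_m] \otimes [s \cdot j_m]\big\}(\mathbf{r}) = \sum_m d_m(\mathbf{r}) \tag{4}$$

where $d_m(\mathbf{r})$ designates the $m$th term of the image field $d(\mathbf{r})$, $h$ is the coherent PSF, $s$ is the sample's complex transmittance, and $i_m$ and $j_m$ are the axial and lateral harmonic functions of the illumination, weighted by the coefficient $a_m$. Fourier transforming $d_m(\mathbf{r})$, we get:

$$D_m(\mathbf{k}) = a_m\, [H(\mathbf{k}) \otimes I_m(k_z)] \cdot [S(\mathbf{k}) \otimes J_m(\mathbf{k}_{x,y})] \tag{5}$$

where upper-case symbols denote the Fourier transforms of the corresponding lower-case functions. Because the above mathematical treatment fundamentally relies only on modelling the imaging process as a linear and translation-invariant process (a condition satisfied by coherently imaging complex electric-fields), the final imaging expression Eq. (5) is identical to that presented in Eq. (6) in Gustafsson et al 30 , where linearity and translation-invariance requirements were fulfilled by imaging intensity through a fluorescent (incoherent) system. We now apply Eq. (5) to specifically describe coherent 3D imaging under widefield and SI conditions.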
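The key consequence of axial scanning, namely that the axial harmonic multiplies the PSF and therefore shifts the pTF in Fourier space, can be checked on a toy problem. The Python sketch below is a 1D, periodic (circular-convolution) stand-in for the axial dimension only; the random arrays `h` and `s`, the grid size, and the harmonic index `q` are all illustrative assumptions rather than physical quantities. It compares a direct evaluation of the scanned-sample sum against the Fourier-domain product of a shifted transfer function with the sample spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
z = np.arange(N)

# Stand-in coherent PSF and sample along z (random complex values, illustration only).
h = rng.standard_normal(N) + 1j * rng.standard_normal(N)
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)

q = 5                                   # axial harmonic index (assumption)
i_m = np.exp(2j * np.pi * q * z / N)    # axial complex harmonic

# Direct sum with the harmonic evaluated at the coordinate difference (scanned sample):
# d(z) = sum_{z'} h(z - z') i_m(z - z') s(z')
d_direct = np.array([
    sum(h[(zz - zp) % N] * i_m[(zz - zp) % N] * s[zp] for zp in range(N))
    for zz in range(N)
])

# Fourier-domain view: FFT(h * i_m)[k] = H[k - q], i.e. the transfer function is shifted by q.
H = np.fft.fft(h)
S = np.fft.fft(s)
D = np.roll(H, q) * S
d_fourier = np.fft.ifft(D)

print(np.allclose(d_direct, d_fourier))
```

In this toy setting the lateral harmonic is omitted; in the full 3D expression it would additionally shift the sample spectrum laterally.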

3D image formation with orthogonal widefield illumination and axial scanning of sample
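In the case of coherent widefield illumination, which consists of illuminating the sample with a flat wavefront, the illumination beam can be described simply as an axially propagating plane-wave, $i(\mathbf{r}) = \exp(jkz)$. In the form of Eq. (2), this is a single $m = 0$ term with axial harmonic $\exp(jkz)$ and a constant lateral harmonic, so that Eq. (5) reduces to $D_0(\mathbf{k}) = a_0\, H(k_z - k)\, S(\mathbf{k})$, where $H(k_z - k)$ denotes the pTF $H(\mathbf{k})$ translated axially by $k$ and $S(\mathbf{k})$ is the sample's spectrum. We note now that $H(k_z - k)$ encompasses exactly the same region of Fourier space as the system TF, illustrated above in Supplementary Figure 1(c). Thus, the image spectrum $D_0(\mathbf{k})$ is shown to be exactly a low-pass filtered version of $S(\mathbf{k})$ with the system TF. Though this result is expected, it demonstrates the validity of Eq. (5) in describing the coherent image formation process. We now use Eq. (5) to mathematically describe the image formation process under structured illumination.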

3D image formation with 3-beam sinusoidal illumination and axial scanning of sample
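In this manuscript, we achieve structured illumination of the sample with three mutually coherent tilted plane-waves. Thus, the 3D illumination electric-field can be mathematically expressed as:

$$i(\mathbf{r}) = \sum_{m=0}^{2} \exp[j(\mathbf{k}_m \cdot \mathbf{r} + \phi_m)] \tag{6}$$

where $\mathbf{k}_0, \mathbf{k}_1, \mathbf{k}_2$ and $\phi_0, \phi_1, \phi_2$ are the wave-vectors and reference phases for the three component plane waves, respectively, and $j$ is the imaginary unit.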
In Supplementary Figure 2(a-c) and (d-f) below, we graphically illustrate the relationship between the axial and lateral illumination spectra $I_m(k_z)$ and $J_m(\mathbf{k}_{x,y})$ and the pTF $H(\mathbf{k})$, and the corresponding component-image spectra $D_m(\mathbf{k})$, respectively, for components $m = 0,1,2$. As is clear in the case of $m = 0$, the axial/lateral Fourier pair $\{I_0(k_z); J_0(\mathbf{k}_{x,y})\}$ is exactly equivalent to the Fourier representation of widefield illumination; thus, the sample's spatial frequency information contained in $D_0(\mathbf{k})$ is mathematically equivalent to what is imaged via widefield illumination. However, in the case of $m = 1,2$, we see that the combination of axially translated $H(\mathbf{k})$ (from convolution of $H(\mathbf{k})$ with $I_m(k_z)$) and laterally translated sample spectrum $S(\mathbf{k})$ (from convolution of $S(\mathbf{k})$ with $J_m(\mathbf{k}_{x,y})$) results in $D_m(\mathbf{k})$ imaging a region of the sample's spatial frequency spectrum that is axially and laterally offset from the region imaged with widefield imaging. To reconstruct a high-resolution image where these imaged regions of the sample's spatial frequency spectrum are appropriately synthesized, the component images $D_m(\mathbf{k})$ for $m = 0,1,2$ need to be separated and then appropriately Fourier shifted. Fourier transforming Eq. (4) shows the Fourier spectrum of a single raw image to be a linear superposition of the Fourier spectra of the individual component images:

$$D(\mathbf{k}) = \sum_{m=0}^{2} D_m(\mathbf{k}) \tag{10}$$

As in conventional fluorescent SI, disentangling the component terms in Eq. (10) simply requires acquiring multiple, say > 3, raw images $d(\mathbf{r})$ with different known values of the reference phases $\phi_m$ to form a linear system of equations. Standard linear inversion solvers can then be used to solve for the unknown $D_0(\mathbf{k})$, $D_1(\mathbf{k})$, and $D_2(\mathbf{k})$ components. Varying $\phi_m$ is accomplished by simple translation of the sinusoidal patterned structured element. Conventional optical diffraction tomography also acquires exactly these components via sequential oblique illuminations (i.e., illuminating individually with the plane-wave components composing the SI, as expressed in Eq. (6): $\exp(j\mathbf{k}_0 \cdot \mathbf{r})$, $\exp(j\mathbf{k}_1 \cdot \mathbf{r})$, and $\exp(j\mathbf{k}_2 \cdot \mathbf{r})$, respectively).
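The component-separation step described above amounts to solving a small linear system at every Fourier-space pixel. The Python sketch below illustrates one way this could be done with a least-squares solve; it is not the authors' implementation. It assumes each raw spectrum is modelled as a sum of three unknown components, each multiplied by a known phase factor, and the function name `separate_components` and the array layout are hypothetical.

```python
import numpy as np

def separate_components(raw_spectra, phases):
    """Least-squares separation of three SI components.

    raw_spectra : complex array, shape (N, ...), one Fourier spectrum per raw image
    phases      : real array, shape (N, 3), known phase of each component in each image
    Assumed model: raw_n = sum_m exp(1j * phases[n, m]) * component_m
    Returns a complex array of shape (3, ...).
    """
    raw = np.asarray(raw_spectra)
    mixing = np.exp(1j * np.asarray(phases))          # (N, 3) mixing matrix
    flat = raw.reshape(raw.shape[0], -1)              # one column per Fourier pixel
    comps, *_ = np.linalg.lstsq(mixing, flat, rcond=None)
    return comps.reshape((mixing.shape[1],) + raw.shape[1:])

# Usage with synthetic data: 5 pattern translations, side beams acquiring +/- phase (assumption).
rng = np.random.default_rng(1)
true = rng.standard_normal((3, 8, 8)) + 1j * rng.standard_normal((3, 8, 8))
steps = 2 * np.pi * np.arange(5) / 5
phases = np.stack([np.zeros(5), steps, -steps], axis=1)
raw = np.einsum("nm,mxy->nxy", np.exp(1j * phases), true)
recovered = separate_components(raw, phases)
print(np.allclose(recovered, true))
```

Using more raw images than unknowns, as the text suggests, makes the system overdetermined, which is why a least-squares solver is used here instead of a direct matrix inverse.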
Once the component terms are known, the component spectra $D_1(\mathbf{k})$ and $D_2(\mathbf{k})$ can be digitally Fourier shifted by $-\mathbf{k}_{T,3}$ and $+\mathbf{k}_{T,3}$, respectively, where $\mathbf{k}_{T,3}$ is the 3D spatial-frequency offset introduced by the structured pattern. Eq. (11) demonstrates that the total reconstructed region of the sample's spectrum is encompassed by $D_0(\mathbf{k}) + D_1(\mathbf{k} + \mathbf{k}_{T,3}) + D_2(\mathbf{k} - \mathbf{k}_{T,3})$. As is clear from Supplementary Figure 2(j) below, this region contains more of the sample's spectrum than allowed by typical widefield imaging; however, significant axial portions of the sample's spectrum remain uncovered. To image these regions, the above process is repeated for increments of the 2D spatial frequency magnitude contained in the sinusoidal patterned structured element, $|\mathbf{k}_T|$. Recall the deterministic relationship between the magnitudes of the 2D lateral spatial frequency $\mathbf{k}_T$ and the 1D axial spatial frequency $k_{T,z}$, i.e., $|\mathbf{k}_T|^2 + |k_{T,z}|^2 = k^2$. Thus, incrementing $|\mathbf{k}_T|$ will inherently axially increment the sample's region of imaged Fourier spectrum and enable solid sub-diffraction resolution imaging enhancements. The total coverage of 3D Fourier space from incremented $|\mathbf{k}_T|$ is illustrated in the main text Figure 2(c,f).
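The relationship between the lateral and axial spatial frequencies of a tilted plane wave of fixed wave-vector magnitude can be made concrete with a few numbers. The Python sketch below is illustrative only; the function name `axial_pattern_frequency` and the chosen lateral increments are assumptions, and frequencies are expressed in units of the wave-vector magnitude.

```python
import math

def axial_pattern_frequency(k, k_lateral):
    """Axial spatial frequency of a plane wave with magnitude k and lateral frequency k_lateral."""
    if abs(k_lateral) > k:
        raise ValueError("lateral frequency exceeds the wave-vector magnitude")
    return math.sqrt(k * k - k_lateral * k_lateral)

k = 1.0  # work in units of the wave-vector magnitude
for frac in (0.0, 0.25, 0.5, 0.75):
    k_lat = frac * k
    k_ax = axial_pattern_frequency(k, k_lat)
    # The axial offset relative to the on-axis beam grows as the lateral frequency increases.
    print(f"lateral {k_lat:.2f} -> axial {k_ax:.4f}, offset from on-axis beam {k - k_ax:.4f}")
```

Stepping the lateral pattern frequency therefore also steps the axial position of the imaged region, which is the mechanism the text relies on to fill in axial frequency space.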
This procedure, in turn, is repeated for incremented rotations of the pattern's lateral spatial-frequency vector $\mathbf{k}_T$ (i.e., rotating the structured element) to isotropically extend the resolution enhancements. Wiener deconvolution techniques can compensate for uneven weighting due to overlap of individual components in the final summations.
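The text does not specify which Wiener formulation is used, so the following Python sketch shows only one common form borrowed from the fluorescence SI literature: each already-shifted component is weighted by the conjugate of its (shifted) transfer function, and the sum is normalised by the total squared transfer-function magnitude plus a small regularisation constant. The function name `wiener_combine`, the parameter `w`, and the assumption that components and transfer functions are already registered on a common Fourier grid are all hypothetical.

```python
import numpy as np

def wiener_combine(components, transfer_functions, w=1e-3):
    """Combine shifted component spectra with Wiener-style weighting.

    components         : complex array, shape (M, ...), shifted component spectra
    transfer_functions : array, shape (M, ...), matching shifted transfer functions
    w                  : regularisation constant (assumption; tuned empirically)
    """
    comps = np.asarray(components)
    otfs = np.asarray(transfer_functions)
    numerator = np.sum(np.conj(otfs) * comps, axis=0)
    denominator = np.sum(np.abs(otfs) ** 2, axis=0) + w ** 2
    return numerator / denominator
```

Where several components overlap, the denominator grows with the number of contributions, so overlapping regions are not over-weighted relative to regions covered by a single component.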