Diffraction-Unlimited Imaging Based on Conventional Optical Devices

We propose a computational paradigm where off-the-shelf optical devices can be used to image objects in a scene well beyond their native optical resolution. By design, our approach is generic, does not require active illumination, and is applicable to several types of optical devices. It only requires the placement of a spatial light modulator some distance from the optical system. In this paper, we first introduce the acquisition strategy together with the reconstruction framework. We then conduct practical experiments with a webcam that confirm that this approach can image objects with substantially enhanced spatial resolution compared to the performance of the native optical device. We finally discuss potential applications, current limitations, and future research directions.


Introduction
The resolution of an imaging system, i.e., its ability to separate points that are located at small angular separations, is limited by the density of the sensors and by diffraction. Following advances in signal processing and computer science, computational approaches have been proposed to address cases where the number of sensors is the limiting factor. In particular, the compressed-sensing framework allows images to be reconstructed from substantially fewer sensors than defined by the Shannon-Nyquist sampling limit [1,2]. A practical implementation of compressed sensing is seen in the single-pixel camera, where a scene is imaged based on a single detector [3]. Applications of single-pixel imaging include microscopy [4], terahertz imaging [5], fluorescence lifetime imaging [6], time-resolved hyperspectral imaging [7], Raman imaging [8], and phase imaging [9]; see [10] for a recent review. In parallel to this trend, however, the spatial density of sensors has increased substantially, with native camera resolutions in standard mobile phones now exceeding ten megapixels. In most consumer and professional optical devices, computational methods that compensate for a lack of sensors are therefore not critical.
The resolution of an imaging device is not only limited by the density of the sensors, but also by the diffraction that occurs inside the optical system. This physical limit has been extensively addressed in fluorescence microscopy. With the advent of super-resolution techniques, different technologies in microscopy are now able to achieve diffraction-unlimited imaging, i.e., to resolve details beyond the diffraction limit of the physical system [11]. These techniques rely on complex illumination schemes (e.g., stimulated emission depletion [12]) or photocontrol of the sample (e.g., photoactivated localization microscopy [13], stochastic optical reconstruction microscopy [14]), and they are tailored to fluorescence imaging. More generic resolution-enhancement techniques have also been developed for non-fluorescent objects (e.g., digital holography [15], synthetic-aperture microscopy [16], Fourier ptychography [17]). While diffraction per se is unavoidable, these methods manage to extract visual information that is robust to diffraction. For instance, stimulated emission depletion exploits nonlinear fluorophore responses to minimize the area of illumination at the focal point, which makes it possible to resolve areas that are substantially smaller than the diffraction limit.
However, the above methods rely on active illumination schemes and require direct access to the sample or object to be imaged. While suitable for microscopy, none of them can be applied when the object is too far away to be accessed (e.g., in remote sensing or astronomy). In this case, only post-processing approaches have been implemented, to deconvolve the instrument response [18]. While deconvolution can improve the image quality when strong prior information is available (e.g., point-like objects), it leads to relatively poor results for extended objects of unknown structure [19].
In this paper, we propose a computational imaging paradigm for objects that are too far away to be illuminated or accessed, which allows them to be imaged beyond the limit of diffraction. Our paradigm involves a specific acquisition procedure that extracts visual information that is robust to diffraction. Our approach is highly flexible, in the sense that it can be applied to any conventional off-the-shelf optical device by adding a spatial light modulator (SLM) some distance from it. After acquisition of a sequence of images for different SLM patterns, the object can be reconstructed through a simple procedure. To the best of our knowledge, this is the first time that a practical set-up based on these techniques has been shown to break the diffraction limit of an off-the-shelf optical device.
In Section 2, we specify the nature of the imaging problem under consideration. In Section 3, we introduce the joint acquisition and reconstruction paradigm, discuss its intrinsic invariance to diffraction and sampling, and describe an effective implementation. In Section 4, we demonstrate the relevance of the proposed approach through a practical optical implementation that is based on a webcam, and compare the results with the conventional case where no SLM is added. In Section 5, we discuss the implications of this work, propose future lines of research, and discuss potential applications. Finally, we conclude our work in Section 6.

Conventional diffraction-limited acquisition
We consider the problem of imaging an object that emits spatially incoherent light, using a conventional diffraction-limited digital optical device. Let $\Sigma_o$ denote the object plane located at distance $z_o$ from the front principal plane of the device (see Fig. 1). We assume that the object and image planes are both perpendicular to the optical axis. The light emitted by the object is recorded by a sensor array of $P$ square pixels, which results in a digital image $\mathbf{g} \in \mathbb{R}^P$. This digital image is obtained by sampling the light intensity within the image plane $g(x)$, $x \in \Sigma_i$. Mathematically, we can model the sampling operator $\mathcal{S}$ by
$$\mathbf{g} = \mathcal{S}\{g\} = \Delta t \left[ (\varphi * g)(x_p) \right]_{1 \le p \le P}, \tag{1}$$
where $\varphi(x)$, $x \in \Sigma_i$, is an instrument response function that represents the integration effect of the sensor pixels, $\Delta t$ is the acquisition time, $\{x_p\}_{1 \le p \le P}$ are the centers of the pixels in the image plane $\Sigma_i$, and $*$ is the continuous-domain convolution. The intensity $g$, and hence the digital image $\mathbf{g}$, suffers from diffraction within the imaging device. On the basis of a centered diffraction-limited model, the diffraction can be modeled through a low-pass operator $\mathcal{D}$ [20, p. 130], such that
$$g(x) = \mathcal{D}\{f_i\}(x) = (h * f_i)(x), \tag{2}$$
where $f_i(x)$, $x \in \Sigma_i$, is a diffraction-free image, and $h(x)$, $x \in \Sigma_i$, is a point-spread function corresponding to the Fraunhofer diffraction pattern of the exit pupil of the imaging system. The image $f_i$ relates to the object-plane intensity profile $f_o$ through geometrical optics and the inverse-square law, which depend on $z_o$ and the type of optical system. More specifically,
$$f_i(x) = \alpha f_o(M_o x), \tag{3}$$
where $M_o$ is a magnification factor, and where $\alpha$ is a multiplicative constant under the approximation of small angles. Inserting Eqs. (2) and (3) into Eq. (1) yields the forward model
$$\mathbf{g} = \mathcal{S}\{\mathcal{D}\{\alpha f_o(M_o\,\bullet)\}\}, \tag{4}$$
where $\bullet$ is a dummy variable. Finally, we assume a Poissonian-Gaussian noise model [21], which yields the noisy acquisition
$$\mathbf{g}^\delta \sim \mathcal{P}(\mathbf{g} + d) + \mathcal{N}(\mu, \sigma^2_{\mathrm{read}}), \tag{5}$$
where $\mathcal{P}$ and $\mathcal{N}$ are the Poisson and Gaussian distributions, respectively, $d$ is the dark current and contributions from ambient light, $\mu$ is an offset imposed by the sensor, and $\sigma^2_{\mathrm{read}}$ models both the readout and the quantization noise.
Fig. 1. Conventional approach (black) and proposed approach (red additions). The object $f_o$ that lies on the object plane $\Sigma_o$ is imaged using a conventional optical device with a lens. The numerical image $\mathbf{g}$ is formed within the image plane $\Sigma_i$. Compared to $f_i$, the ideal geometrical image of $f_o$, the measured image $\mathbf{g}$ suffers from not only sampling, but also diffraction. To circumvent both problems, we place a spatial light modulator (SLM) in the plane $\Sigma$. From a sequence of modulated images $\{\mathbf{g}_k\}_{1 \le k \le K}$ acquired using some SLM patterns $\{q_k\}_{1 \le k \le K}$, we can recover $f$, which is the geometrical projection of $f_o$ onto $\Sigma$. The dimensions are not to scale, for clarity.
Our goal is to recover a degradation-free image from the degradation-sensitive measurements, which is particularly relevant when the sampling or the diffraction effects (or both) dramatically limit the spatial resolution of a raw image $\mathbf{g}^\delta$. Contrary to post-processing approaches that invert Eqs. (4)-(5) numerically, which is prone to artifacts due to the ill-posedness of the operators $\mathcal{S}$ and $\mathcal{D}$, we propose to alter the acquisition chain upfront, such that the degradation effects are neutralized.
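To make the roles of the operators in Eqs. (4) and (5) concrete, the following minimal sketch simulates a degraded acquisition numerically. It is an illustration under stated assumptions, not the acquisition code used in this work: the Gaussian kernel is only a stand-in for the Fraunhofer pattern h, the convolution is circular, the pixel response is a plain box integration, and all function names and parameter values are hypothetical.

```python
import numpy as np

def gaussian_psf(size, width):
    """Unit-sum Gaussian kernel, a stand-in for the Fraunhofer pattern h."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    h = np.exp(-(xx**2 + yy**2) / (2.0 * width**2))
    return h / h.sum()

def diffract(f_i, h):
    """Operator D: circular convolution of the image with the centered PSF h."""
    otf = np.fft.fft2(np.fft.ifftshift(h))
    return np.real(np.fft.ifft2(np.fft.fft2(f_i) * otf))

def sample(g, bin_size, dt):
    """Operator S: integrate the intensity over square pixels, scaled by dt."""
    rows, cols = g.shape
    blocks = g.reshape(rows // bin_size, bin_size, cols // bin_size, bin_size)
    return dt * blocks.sum(axis=(1, 3))

def add_noise(g, d, mu, sigma_read, rng):
    """Poissonian-Gaussian model: P(g + d) + N(mu, sigma_read^2)."""
    rate = np.clip(g + d, 0.0, None)  # guard against tiny negative round-off
    return rng.poisson(rate) + rng.normal(mu, sigma_read, size=g.shape)

rng = np.random.default_rng(0)
f_i = np.zeros((32, 32))
f_i[12:20, 14:18] = 100.0                      # hypothetical diffraction-free image
g = sample(diffract(f_i, gaussian_psf(32, 2.0)), bin_size=4, dt=1.0)
g_noisy = add_noise(g, d=2.0, mu=5.0, sigma_read=1.0, rng=rng)
```

In this toy setting the 32 × 32 image is reduced to 8 × 8 sensor pixels, so the fine structure is lost, while the total intensity of the noiseless image is unchanged by blurring and binning.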

Concept
We propose to make the acquisition chain described in Section 2 robust to the loss of resolution due to $\mathcal{S}$ and $\mathcal{D}$. To do so, we perform multiple acquisitions, preprocess the resulting set of modulated images, and recover the degradation-free image through a straightforward numerical inversion.
Our acquisition approach consists of measuring a set of dot products $\{v_k\}_{1 \le k \le K}$ between the image and some SLM patterns $\{q_k(x)\}_{1 \le k \le K}$, $x \in \Sigma$. This is realized through the addition of an SLM to the pre-existing acquisition set-up. The key specificity of our approach is to ensure that the SLM patterns modulate the image before degradation occurs due to $\mathcal{S}$ and $\mathcal{D}$, which ultimately makes the dot products and the corresponding numerical reconstruction robust to these effects. This is achieved by placing the SLM sufficiently far from the optical system, at a distance $z$ from it and in a plane perpendicular to the optical axis (see Fig. 1). Our overall acquisition and reconstruction approach involves three main steps, as illustrated in Fig. 2 and described below.
Fig. 2. Acquisition and reconstruction pipeline of the proposed approach. The set of digital modulated images $\{\mathbf{g}_k\}_{1 \le k \le K}$ is first acquired from the object $f_o$ using a set of SLM patterns $\{q_k\}_{1 \le k \le K}$. Then, these images are preprocessed, which leads to the diffraction-insensitive scalar measurements $\{v_k\}_{1 \le k \le K}$. Finally, the digital diffraction-free image $\mathbf{f}$ is reconstructed from these diffraction-insensitive measurements through a straightforward numerical inversion, based on the knowledge of the SLM patterns used for acquisition. The preprocessing step corresponds to Eq. (7) and the reconstruction step to Eq. (10); the superscript $\delta$ denotes noisy measurements and estimates.

Modulated image acquisition
The use of image modulation modifies the initial conventional forward model of Eq. (4). Specifically, every SLM pattern $q_k$ leads to measurement of the (digital) modulated image $\mathbf{g}_k$ given by
$$\mathbf{g}_k = \mathcal{S}\{\mathcal{D}\{q_k(M\,\bullet)\, f(M\,\bullet)\}\}, \tag{6}$$
where $f = \alpha f_o(z_o\,\bullet/z)$ is the geometrical projection of $f_o$ onto the SLM plane, and $M = M_o z/z_o$ is the corresponding magnification factor. Like $f_o$, the intensity profile $f$ is free of degradation due to sampling and diffraction. As the $q_k$ are complex-valued, whereas light intensities are positive quantities, every $\mathbf{g}_k$ is obtained in practice as a linear combination of several sub-acquisitions, as detailed in Sections 3.2 and 3.3 below. Finally, according to Eq. (5), which applies to every acquisition, we only have access to noisy versions $\mathbf{g}_k^\delta$ of the modulated images $\mathbf{g}_k$.

Preprocessing
Once the modulated images are acquired, we numerically integrate them over their field-of-view, to produce the scalar quantities
$$v_k^\delta = \mathbf{u}^\top \mathbf{g}_k^\delta, \tag{7}$$
where $\mathbf{u} = [1 \ldots 1]^\top \in \mathbb{R}^P$. While each modulated image $\mathbf{g}_k^\delta$ is altered by sampling and diffraction in Eq. (6), the scalar quantity $v_k^\delta$ is not affected by sampling or diffraction. Indeed, as shown in Appendices A.1 and A.2, it is proportional on average to the dot product of the SLM pattern and the diffraction-free image, i.e.,
$$\mathbb{E}\, v_k^\delta \propto \int_\Sigma q_k(x)\, f(x)\, \mathrm{d}x, \tag{8}$$
where $\mathbb{E}$ is the expectation. This property, which is key to our acquisition approach, exploits the energy-preserving nature of diffraction.

Reconstruction
As detailed in Appendix A.3, each measurement $v_k^\delta$ satisfies the discrete scalar-product relation $\mathbb{E}\, v_k^\delta = \mathbf{q}_k^\top \mathbf{f}$, where $\mathbf{q}_k \in \mathbb{R}^N$ and $\mathbf{f} \in \mathbb{R}^N$ are discrete versions of the SLM pattern $q_k$ and of the degradation-free image $f$, respectively. Defining $\mathbf{v}^\delta = [v_1^\delta \ldots v_K^\delta]^\top$, we have
$$\mathbb{E}\, \mathbf{v}^\delta = \mathbf{Q}\mathbf{f}, \tag{9}$$
where $\mathbf{Q} = [\mathbf{q}_1 \ldots \mathbf{q}_K]^\top \in \mathbb{R}^{K \times N}$ is the matrix containing the discrete SLM patterns. Therefore, we simply recover the diffraction-free image as
$$\mathbf{f}^\delta = \mathbf{Q}^{+} \mathbf{v}^\delta, \tag{10}$$
where $\mathbf{Q}^{+}$ is the Moore-Penrose pseudo-inverse of $\mathbf{Q}$. Here, the matrix $\mathbf{Q}$ depends only on the choice of the SLM patterns, which is discussed in Section 3.2. It is therefore easy to choose $K = N$ patterns for which $\mathbf{Q}$ is invertible; hence
$$\mathbf{f}^\delta = \mathbf{Q}^{-1} \mathbf{v}^\delta.$$
This differs significantly from methods that invert Eq. (4), where the discrete forward model depends on the physics and is typically ill-conditioned. When only a few SLM patterns are considered, i.e., $K < N$, Eq. (10) provides the minimum $\ell_2$-norm solution.
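The invariance of the integrated measurements can be checked numerically with a short sketch. It assumes real-valued random patterns, an arbitrary unit-sum blur applied by circular convolution (a stand-in for D), and box binning (a stand-in for S); the helper name `blur_and_bin` and all sizes are hypothetical choices made for illustration only.

```python
import numpy as np

def blur_and_bin(img, kernel, bin_size):
    """Energy-preserving circular blur followed by pixel binning."""
    kernel = kernel / kernel.sum()              # unit sum: total energy is preserved
    blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel)))
    r, c = blurred.shape
    return blurred.reshape(r // bin_size, bin_size, c // bin_size, bin_size).sum(axis=(1, 3))

rng = np.random.default_rng(1)
n_side = 8
N = n_side * n_side
f = rng.random((n_side, n_side))                # hypothetical degradation-free image
Q = rng.random((N, N))                          # hypothetical patterns, one per row
kernel = rng.random((n_side, n_side))           # arbitrary blur kernel

# Preprocessing: integrate each degraded, modulated image over the sensor.
v = np.array([
    blur_and_bin(q.reshape(n_side, n_side) * f, kernel, bin_size=4).sum()
    for q in Q
])

# Reconstruction with the pseudo-inverse of the pattern matrix.
f_rec = (np.linalg.pinv(Q) @ v).reshape(n_side, n_side)

# With K < N patterns, the pseudo-inverse returns the minimum l2-norm solution.
Q_few = Q[:16]
f_min_norm = np.linalg.pinv(Q_few) @ v[:16]
```

Although each modulated image is reduced to 2 × 2 sensor pixels here, the integrated values coincide with the dot products between the patterns and the undegraded image, which is why the 8 × 8 image can still be recovered in this noiseless toy case.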

SLM patterns
Our choice of SLM patterns is dictated by two requirements. First, it is crucial to maximize light throughput, so as to limit the acquisition time needed to acquire low-intensity objects with acceptable signal-to-noise ratios (SNRs). Second, the SLM patterns need to capture the information of $f$ in relatively few measurements, so as to decrease the number of measurements, and hence the time needed for an acquisition. For instance, choosing $\mathbf{Q}$ as the identity matrix corresponds to an extreme case where light is only transmitted through a single SLM pixel at a time. This complies poorly with both requirements.
In this study, we choose $\mathbf{Q}$ as the discrete Fourier basis, which transmits ∼64% of the incident light flux and is known to sparsify natural images [22,23]. For an SLM array of $N = N_1 \times N_2$ square pixels of size $\Delta \times \Delta$, this defines the discrete SLM patterns $\mathbf{q}_k = [q_k^1, \ldots, q_k^N]^\top$ as
$$q_k^n = \exp\left(-2 j \pi\, \boldsymbol{\xi}_k^\top \mathbf{n}\right), \tag{11}$$
where $j$ is the imaginary unit, $\mathbf{n}$ is the two-dimensional pixel coordinate associated with the $n$-th pixel of the SLM, and $\boldsymbol{\xi}_k$ is the two-dimensional spatial frequency of the $k$-th pattern, with
$$\boldsymbol{\xi}_k \in \left\{0, \tfrac{1}{N_1}, \ldots, \tfrac{N_1 - 1}{N_1}\right\} \times \left\{0, \tfrac{1}{N_2}, \ldots, \tfrac{N_2 - 1}{N_2}\right\}.$$
This choice implies that the measurement vector $\mathbf{v}$ is the discrete Fourier transform (DFT) of $\mathbf{f}$. Therefore, the reconstruction step of Eq. (10) simplifies to performing an inverse DFT, which also has the advantage of a fast implementation with complexity $O(N \log N)$. The implementation of the spatial patterns $q_k$ into the SLM is discussed in Section 3.3, below.
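As an illustration of why Fourier patterns reduce the reconstruction to an inverse DFT, the sketch below builds an explicit pattern matrix and compares it with a fast Fourier transform. The sign convention of the exponent and the row-major pixel ordering are assumptions chosen to match NumPy's `fft2`; the function name `fourier_patterns` is hypothetical.

```python
import numpy as np

def fourier_patterns(n1, n2):
    """Matrix whose rows are the complex DFT patterns of an n1 x n2 array.

    Row index k1 * n2 + k2 holds exp(-2j*pi*(k1*m1/n1 + k2*m2/n2)) for the
    pixel (m1, m2) stored at column m1 * n2 + m2 (row-major ordering).
    """
    rows = np.arange(n1)
    cols = np.arange(n2)
    w1 = np.exp(-2j * np.pi * np.outer(rows, rows) / n1)
    w2 = np.exp(-2j * np.pi * np.outer(cols, cols) / n2)
    return np.kron(w1, w2)

rng = np.random.default_rng(2)
f = rng.random((4, 6))                    # hypothetical image on a 4 x 6 SLM
Q = fourier_patterns(4, 6)

v = Q @ f.ravel()                         # noiseless measurements, one per pattern
f_rec = np.real(np.fft.ifft2(v.reshape(4, 6)))   # reconstruction by inverse DFT
```

The explicit matrix is only practical for small arrays; for realistic sizes one would rely on the fast transform alone, which is the source of the O(N log N) cost mentioned above.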

Differential acquisition strategy
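As our physical set-up can only implement positive-valued SLM patterns, every complex-valued SLM pattern must be split into positive real-valued patterns to be programmed into the SLM. The acquisitions that use positive patterns are then recombined to obtain a complex-valued image $\mathbf{g}_k^\delta$. Differential strategies, which find their roots in structured light microscopy [24], are common in ghost [25] and single-pixel imaging [26]. One advantage of differential acquisition is its intrinsic robustness to the additive noise caused by environmental illumination. Here, we use four positive patterns, following a splitting approach that is similar to [26]. Specifically, we define $q_{k,i}$, $1 \le i \le 4$, by
$$q_{k,1} = (\operatorname{Re} q_k)^{+}, \quad q_{k,2} = (\operatorname{Re} q_k)^{-}, \quad q_{k,3} = (\operatorname{Im} q_k)^{+}, \quad q_{k,4} = (\operatorname{Im} q_k)^{-}, \tag{12}$$
where $(\cdot)^{+} = \max(\cdot, 0)$ and $(\cdot)^{-} = \max(-\,\cdot, 0)$ are the magnitudes of the projections onto the positive and negative orthants. These positive patterns are compatible with Eq. (11) as $q_k = (q_{k,1} - q_{k,2}) + j(q_{k,3} - q_{k,4})$. Each of the positive patterns $q_{k,i}$ leads to the measuring of a distinct positive-modulated image $\mathbf{g}_{k,i}^\delta$. According to the forward and noise models of Eqs. (6) and (5), we have
$$\mathbf{g}_{k,i} = \mathcal{S}\{\mathcal{D}\{q_{k,i}(M\,\bullet)\, f(M\,\bullet)\}\} \tag{13}$$
and
$$\mathbf{g}_{k,i}^\delta \sim \mathcal{P}(\mathbf{g}_{k,i} + d_k) + \mathcal{N}(\mu, \sigma^2_{\mathrm{read}}). \tag{14}$$
In the latter, we assume that $d_k$ does not vary during the four sub-acquisitions. Finally, we repeat the measurements $L$ times and average them to decrease the noise. The modulated image $\mathbf{g}_k^\delta$ is computed as
$$\mathbf{g}_k^\delta = \frac{1}{L} \sum_{\ell=1}^{L} \left[ \left(\mathbf{g}_{k,1,\ell}^\delta - \mathbf{g}_{k,2,\ell}^\delta\right) + j \left(\mathbf{g}_{k,3,\ell}^\delta - \mathbf{g}_{k,4,\ell}^\delta\right) \right], \tag{15}$$
where $\mathbf{g}_{k,i,\ell}^\delta$ is the $\ell$-th modulated image acquired using the SLM pattern $q_{k,i}$. As shown in Appendix A.1, the modulated image $\mathbf{g}_k^\delta$ is an unbiased estimate of $\mathbf{g}_k$, which includes the cancellation of additive environmental-illumination effects. This is crucial in our setting, where the amount of light captured from an object at long distance can be very low.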
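The four-pattern splitting and its recombination can be sketched as follows. This is a one-dimensional, noiseless toy example; the function names and the constant `ambient` term (standing for dark current, ambient light, and sensor offset) are hypothetical and serve only to show that additive biases cancel in the differences.

```python
import numpy as np

def split_pattern(q):
    """Split a complex pattern into four non-negative real patterns."""
    return (
        np.maximum(q.real, 0.0),    # positive part of the real component
        np.maximum(-q.real, 0.0),   # magnitude of its negative part
        np.maximum(q.imag, 0.0),    # positive part of the imaginary component
        np.maximum(-q.imag, 0.0),   # magnitude of its negative part
    )

def recombine(m1, m2, m3, m4):
    """Differential recombination of the four sub-acquisitions."""
    return (m1 - m2) + 1j * (m3 - m4)

n = np.arange(16)
q = np.exp(-2j * np.pi * 3 * n / 16)       # one-dimensional Fourier pattern
f = np.linspace(1.0, 2.0, 16)              # hypothetical positive intensity profile
ambient = 7.5                              # constant additive bias in every sub-acquisition

measurements = [np.sum(p * f) + ambient for p in split_pattern(q)]
v = recombine(*measurements)               # equals the complex dot product of q and f
```

Because the same bias enters all four sub-acquisitions, it drops out of both differences, which is the property that the assumption of a constant d_k is meant to secure.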

Time budget and subsampling
As $f$ is real-valued, only half of its DFT coefficients are nonredundant. Recalling that the SLM is an array of $N = N_1 \times N_2$ pixels, and assuming that $N_1$ and $N_2$ are even, the number of DFT coefficients for full acquisition is $K = N/2 + 2$. Implementing the differential approach of Eq. (15), the time budget for a full acquisition is thus
$$t_{\mathrm{full}} = 4 \left(\frac{N}{2} + 2\right) L\, \Delta t. \tag{16}$$
The time budget of full acquisition can be substantial when the acquisition time per measurement is large (e.g., for low-intensity objects that require $\Delta t$ or $L$ to be large), or when the image resolution $N$ is large. One approach to decrease the time budget is to use a smaller number $L' < L$ of acquisitions per SLM pattern, which yields
$$t_{\mathrm{rep}} = 4 \left(\frac{N}{2} + 2\right) L'\, \Delta t. \tag{17}$$
Another approach is to acquire a subset of $K < N/2 + 2$ significant DFT coefficients. In this case, we consider the low-frequency diamond scheme, which yielded the best image-reconstruction results in [27]. The corresponding time budget is
$$t_{\mathrm{sub}} = 4 K L\, \Delta t. \tag{18}$$
We also propose an adaptive subsampling approach that can preserve higher-frequency coefficients, exploiting repeated measurements. This approach is described in detail in Appendix A.4. For each of the aforementioned subsampling methods, the sampling ratio $\gamma$ is defined as
$$\gamma = \frac{t}{t_{\mathrm{full}}}, \tag{19}$$
where $t$ is the time budget of the subsampled acquisition.
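As a worked example of the time budget, the sketch below evaluates the full-acquisition time under the assumption that it equals four sub-acquisitions per pattern, times the number of patterns, the number of repetitions, and the acquisition time. This form is an assumption, but it reproduces the 17 220 s quoted for the experimental parameters (N = 64 × 64, L = 50, acquisition time 42 ms). The function names are hypothetical.

```python
def num_dft_coefficients(n1, n2):
    """Number of nonredundant DFT coefficients of a real n1 x n2 image (n1, n2 even)."""
    if n1 % 2 or n2 % 2:
        raise ValueError("both dimensions are assumed to be even")
    return n1 * n2 // 2 + 2

def time_budget(num_patterns, repetitions, dt):
    """Acquisition time in seconds, with four positive sub-patterns per pattern."""
    return 4 * num_patterns * repetitions * dt

K = num_dft_coefficients(64, 64)          # 2050 coefficients for a 64 x 64 SLM area
t_full = time_budget(K, 50, 0.042)        # about 17 220 s, i.e. roughly 4.8 h

# Sampling ratio when only 512 coefficients are kept (roughly one quarter).
gamma = time_budget(512, 50, 0.042) / t_full
```

The same helper covers both subsampling routes: lowering `repetitions` corresponds to repetition subsampling, and lowering `num_patterns` to coefficient subsampling.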

Angular resolution
As this approach is robust to diffraction and sampling that occur after modulation, the maximum angular resolution $R$ (in radians; the smaller the better) only depends on the SLM. Under the approximation of small angles, and neglecting noise, we have
$$R = \frac{\Delta}{z}, \tag{20}$$
where $\Delta$ is the SLM pixel size and $z$ is the distance between the conventional optical device and the SLM. The combination of the original device and the SLM can be seen as a single new device where the maximum angular resolution is only parameterized by $\Delta$ and $z$. This is in contrast to conventional optical devices, where the maximum angular resolution is fixed and is limited by diffraction.
One way to improve the angular resolution is to decrease the SLM pixel size. The lower limit for $\Delta$ depends on the available technology (e.g., digital micromirrors, translucent or reflective liquid crystal displays). Ultimately, $\Delta$ must be larger than the wavelength used for acquisition, to avoid diffraction effects during modulation [28]. For a fixed pixel size $\Delta$, a target angular resolution $R$ can be achieved by setting the SLM distance $z$ accordingly, i.e., choosing $z \ge \Delta/R$. For instance, for a pixel size of 50 µm, setting the SLM at a distance of 10 m yields an angular resolution of $5 \times 10^{-6}$ radians, which is equivalent to approximately one arcsecond. This angular resolution is comparable to that of a regular 4-inch telescope [29]. In Table 1, we report more example values of $R$ as a function of $z$ and $\Delta$.
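The numerical example above follows from the small-angle relation between pixel size, distance, and angular resolution, taken here as R = Δ/z (consistent with the condition z ≥ Δ/R). A minimal sketch, with hypothetical function names:

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def angular_resolution(pixel_size, distance):
    """Small-angle angular resolution (radians) of an SLM pixel seen from the device."""
    return pixel_size / distance

def required_distance(pixel_size, target_resolution):
    """Smallest SLM distance that achieves a target angular resolution (radians)."""
    return pixel_size / target_resolution

R = angular_resolution(50e-6, 10.0)        # about 5e-6 rad
R_arcsec = R * ARCSEC_PER_RAD              # about 1.03 arcseconds
z_min = required_distance(50e-6, R)        # recovers the 10 m distance
```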

Experimental setup
We evaluate this approach considering a webcam (USB HD C270; Logitech) as the conventional optical device (Fig. 3(a)) and the front screen of a commercial showcase (ClearVue Lite CV101LV1) as the SLM (Fig. 3(b)). The object is placed and illuminated inside the same showcase some distance behind the screen (Fig. 3(c)). Overall, the scene setting follows the configuration of Fig. 1, where all of the distances and the effective SLM area ensure that the object is fully modulated and captured by our webcam according to the requirements of Section 3.
The webcam has a resolution of 1280 × 960 pixels, an angle of view of 60 degrees, and a focal length of 4 mm. It produces color images in compressed JPEG format that we converted to grayscale. According to the trigonometric relations between these quantities [31], the pixel size of the camera is 2.9 µm, with a corresponding angular resolution of 135 arcseconds. Importantly, this pixel size is comparable to the theoretical optical diffraction limit associated with the webcam parameters, and can thus be used as a meaningful reference to assess the resolving power of our approach. For instance, the Airy-disk diameter $D_A = 2.44 \lambda N$ obtained at average optical wavelength $\lambda = 550$ nm [32] and low focal ratio […]
Fig. 3. […] 12). The object is located inside the showcase and can be seen in this picture behind the modulated pattern.
The SLM has a liquid crystal display of 1024 × 600 pixels with $\Delta = 210$ µm and a contrast ratio of 500:1. For our experiments, we only use an effective area of $N = 64 \times 64$ pixels located at the center of the liquid crystal display, with the rest of the pixels set to block out all of the light. For conventional acquisitions, we set the effective SLM region of 64 × 64 pixels to transmit all of the light, while leaving the rest of the SLM pixels set to block out all of the light. This ensures that the same object field-of-view is acquired with and without modulation.
The object is a black capital letter "T" (width 9.5 mm, height 8 mm) on a white background, as shown in Fig. 4(a). The object is placed inside the showcase, which leads to $z = 3.80$ m and $z_o = 3.87$ m. As viewed from the webcam, the effective size of $f_o$ captured by the SLM is thus 730 × 730 arcseconds, or equivalently, 5.4 × 5.4 sensor pixels.
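The quoted figures can be reproduced with elementary trigonometry. The sketch below assumes that the 60-degree angle of view is measured along the sensor diagonal, an assumption that is not stated explicitly but that yields the 2.9 µm pixel size and the 135-arcsecond resolution; variable names are hypothetical.

```python
import math

ARCSEC_PER_DEG = 3600.0
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

# Webcam: 1280 x 960 pixels, 60-degree angle of view (assumed diagonal), 4 mm focal length.
diag_pixels = math.hypot(1280, 960)                          # 1600 pixels
sensor_diag = 2 * 4e-3 * math.tan(math.radians(60 / 2))      # about 4.6 mm
pixel_size = sensor_diag / diag_pixels                       # about 2.9e-6 m
webcam_res = 60 * ARCSEC_PER_DEG / diag_pixels               # 135 arcseconds per pixel

# SLM: 64 x 64 pixels of 210 um, seen from 3.80 m.
slm_pixel_res = 210e-6 / 3.80 * ARCSEC_PER_RAD               # about 11.4 arcseconds
field_of_view = 64 * slm_pixel_res                           # about 730 arcseconds
fov_in_webcam_pixels = field_of_view / webcam_res            # about 5.4 sensor pixels
```

Each SLM pixel thus subtends roughly 11 arcseconds, about twelve times finer than one webcam pixel, which is the margin that the proposed acquisition aims to exploit.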
In our experiments, we acquire each pattern $L = 50$ times with an acquisition time of $\Delta t = 42$ ms. The object illumination, SLM transmittivity gain, and webcam gain are set to maximize the brightness, while avoiding saturation of the webcam. We perform the acquisition in a dark room, to minimize the influence of variations in ambient light during acquisition. To ensure the correct synchronization between the acquisitions performed by the webcam and the generation of the SLM patterns, a latency of 0.5 s is added between the successive measurements. As this latency is only relevant to our particular software implementation, it is not included in our time budget.
Both the optical and the SLM devices are connected to a laptop computer (MacBook Pro; 2.4 GHz Intel Core i7; 6 GB memory).All of the acquisition and reconstruction methods are implemented in Matlab.

Proposed paradigm versus conventional acquisition
In this first experiment, we assess the proposed paradigm, and compare it with conventional acquisition where the object is acquired from the same location but without SLM modulation. Based on the practical set-up and its parameters described in Section 4.1, we acquire object coefficients and reconstruct $\mathbf{f}$ via Eq. (10). The results of this experiment are shown in Fig. 4.
In the conventional acquisition setting (Fig. 4(b)), the object appears in a very small central region of the acquired image, where the horizontal line at the top is due to light leaking from the showcase. When magnifying this central region (Fig. 4(c)), whose size is 5.4 × 5.4 webcam-sensor pixels as derived in Section 4.1, no clear object features can be identified. The image is blurred due to diffraction and instrument response. This confirms that the native resolution of the webcam is insufficient to image the object.
Fig. 4. […] Considering the responses at 10% and 90% (dashed black vertical lines) relative to the dark and bright levels (solid black horizontal lines), we obtain an average edge response (rising edge) of 2.6 pixels.
In the proposed-paradigm setting (Fig. 4(d)), the reconstructed object can be resolved and appears to be consistent with the original profile (Fig. 4(a)). The reconstruction also contains details that are significantly smaller than the pixel resolution and diffraction limit in the classical setting (Fig. 4(c)). Finally, the resolution of our approach is quantified using the edge response [33]. Accordingly, the analysis of the reconstruction (Fig. 4(e)) yields a resolution of 2.6 SLM pixels, which corresponds to an angular resolution of 30 arcseconds. As the native angular resolution of the webcam is 135 arcseconds (see Section 4.1), this is a 4.5-fold improvement. This result demonstrates that our joint acquisition and reconstruction paradigm can image objects at a resolution that significantly exceeds the native limits of the conventional device it is built from.
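For illustration, a 10%-90% edge response can be estimated from a one-dimensional rising edge profile as sketched below, and then converted to an angle with the SLM parameters of Section 4.1. The function `edge_response`, the linear interpolation, and the synthetic ramp are assumptions made for this sketch; they are not the exact procedure of [33].

```python
import numpy as np

def edge_response(profile, dark, bright, lo=0.1, hi=0.9):
    """10%-90% rise distance (in samples) of a strictly increasing edge profile."""
    profile = np.asarray(profile, dtype=float)
    normalized = (profile - dark) / (bright - dark)
    positions = np.arange(profile.size)
    # np.interp expects increasing x-coordinates, hence the monotonicity assumption.
    x_lo, x_hi = np.interp([lo, hi], normalized, positions)
    return x_hi - x_lo

ramp = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]           # synthetic linear edge
width = edge_response(ramp, dark=0.0, bright=1.0)

# Conversion of the reported 2.6 SLM pixels into an angular resolution.
ARCSEC_PER_RAD = 180.0 * 3600.0 / np.pi
slm_pixel_arcsec = 210e-6 / 3.80 * ARCSEC_PER_RAD
resolution_arcsec = 2.6 * slm_pixel_arcsec       # close to 30 arcseconds
improvement = 135.0 / resolution_arcsec          # close to the quoted 4.5-fold gain
```

The unrounded ratio is slightly above 4.5; the quoted figure corresponds to dividing 135 by the rounded value of 30 arcseconds.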
As mentioned in the previous sections, one major caveat of our approach is its acquisition time. In that regard, Eq. (16) implies that the time budget to acquire $\mathbf{f}$ in our set-up is $t_{\mathrm{full}} = 17\,220$ s (4.8 h). In the next experiment, we thus investigate how this time budget can be mitigated while maintaining acceptable reconstruction quality.

Subsampling
In this second experiment, we investigate whether subsampling can maintain reconstruction quality under acquisition-time budgets that are lower than that of Section 4.2. To do so, we compare the performance of the subsampling strategies proposed in Section 3.4. For convenience, our images are reconstructed retrospectively, based on the full set of DFT coefficients, as in [34]. Each subsampling method is evaluated in terms of the SNR, using the fully sampled result of Section 4.2 as reference.
First, we consider the case where $\gamma = 1/4$, i.e., a four-fold reduction in the acquisition time. The images reconstructed here are shown in Fig. 5. Repetition subsampling (Fig. 5(a)) yields the worst result, due to noise. The noise issue is further exacerbated in the extreme case where the image is reconstructed using no repetition (Fig. 5(e)), which illustrates the need for repeated measurements. Compared to both nonadaptive approaches (Fig. 5(a), (b)), the proposed adaptive subsampling scheme (Fig. 5(c)) yields the best result. It is also the closest to the ideal oracle subsampling (Fig. 5(d)), where the highest-energy coefficients are determined and selected a posteriori from the full acquisition ($L = 50$; $K = N/2 + 2$). Reconstruction from adaptive sampling is also less blurry than from nonadaptive sampling, which confirms the potential of adaptive approaches to better preserve high-frequency information. The sampling pattern of the adaptive scheme (Fig. 5(g)) is also close to that of the oracle (Fig. 5(h)), as opposed to the nonadaptive case (Fig. 5(f)). Figure 6 illustrates the SNR of the image reconstructed with different subsampling strategies with increasing sampling ratios. Adaptive sampling consistently outperforms its nonadaptive counterpart over a large range of sampling ratios. Its drop in performance at low sampling ratios is due to the constant time-budget overhead that is used to determine the coefficients relevant for repetition, as detailed in Appendix A.4. Overall, these results highlight the critical impact of the subsampling strategy and the potential of adaptive methods to preserve reconstruction quality with smaller time budgets.
Fig. 6. Performance of the different subsampling methods for the various time budgets. The signal-to-noise ratio (in decibels) is plotted for each method as a function of the time budget in hours, varying $\gamma$ accordingly.
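The retrospective comparison between a low-frequency diamond and an oracle selection can be sketched on a synthetic, noiseless image as follows. The function names, the diamond radius, the test image, and the SNR definition (reference energy over error energy, in decibels) are assumptions for illustration; the adaptive scheme of Appendix A.4 is not reproduced here.

```python
import numpy as np

def keep_largest(coeffs, ratio):
    """Oracle-style selection: zero all but the highest-magnitude DFT coefficients."""
    n_keep = max(1, int(round(ratio * coeffs.size)))
    threshold = np.sort(np.abs(coeffs).ravel())[-n_keep]
    return np.where(np.abs(coeffs) >= threshold, coeffs, 0)

def keep_low_frequency_diamond(coeffs, radius):
    """Nonadaptive selection: keep coefficients with |k1| + |k2| <= radius."""
    n1, n2 = coeffs.shape
    k1 = np.fft.fftfreq(n1, d=1.0 / n1)      # signed integer frequencies
    k2 = np.fft.fftfreq(n2, d=1.0 / n2)
    mask = np.abs(k1)[:, None] + np.abs(k2)[None, :] <= radius
    return np.where(mask, coeffs, 0)

def snr_db(reference, estimate):
    """Signal-to-noise ratio in decibels (one common definition)."""
    error = np.sum(np.abs(reference - estimate) ** 2)
    return 10 * np.log10(np.sum(np.abs(reference) ** 2) / error)

reference = np.zeros((16, 16))
reference[3:6, 3:13] = 1.0                   # horizontal bar of a synthetic "T"
reference[6:13, 7:9] = 1.0                   # vertical bar
coeffs = np.fft.fft2(reference)

diamond = keep_low_frequency_diamond(coeffs, radius=4)
ratio = np.count_nonzero(diamond) / coeffs.size
oracle = keep_largest(coeffs, ratio)         # same number of coefficients, chosen a posteriori

snr_diamond = snr_db(reference, np.real(np.fft.ifft2(diamond)))
snr_oracle = snr_db(reference, np.real(np.fft.ifft2(oracle)))
```

In this noiseless setting the oracle cannot do worse than the diamond for an equal number of coefficients, since by Parseval's relation it discards the least energy; with noisy measurements the ranking of strategies has to be assessed experimentally, as in Fig. 6.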

Discussion
Our experiments confirm that the proposed paradigm can be used to resolve objects beyond the native capabilities of an optical device. It is worth emphasizing that in these experiments, the classical resolution benchmark that is outperformed by this method is the pixel size of the webcam, which is comparable to an ideal diffraction limit, as mentioned in Section 4. The actual resolving power of the webcam is even lower than this benchmark due to device nonidealities, such as optical aberrations, which also account for the blur observed in Fig. 4(c).
Our paradigm potentially extends to a relatively broad class of optical devices and applications. In this regard, it is important to note that the only property of the point-spread function $h$ that is exploited to derive Eq. (8) is its preservation of the total light energy, as shown in Appendix A.2. For this reason, Eq. (8) also holds in the presence of optical aberrations, which can be seen as phase components of a generalized pupil function [20, p. 145]. Our acquisition approach would also remain valid if $h$ were not isoplanatic, or if the light emitted from the object were spatially coherent, in which case diffraction effects would become nonlinear in intensity but remain energy preserving.
While the strength of the current study is its broad applicability to a large set of optical devices, dedicated optics can also be envisaged, drawing from previous studies on single-pixel cameras, for instance. For current single-pixel cameras, one major difference is that the SLM is primarily meant to compensate for the lack of multiple sensors, and it is integrated into the device itself instead of being placed externally. The achievable resolution is thus currently determined by the diffraction limit of the device [27], as opposed to our acquisition approach where modulation from the SLM occurs before diffraction. However, our approach described in Section 3 is computationally similar to what is used in single-pixel cameras. In particular, both involve the acquisition of scalar products with the object using an SLM. In further work, the single-pixel architecture can thus be adapted to implement our approach, and its single sensor could physically replace the numerical-integration process in Eq. (7).
In terms of applications, our paradigm can be used either to better resolve objects (as in our experiments here) or to better track them. The proposed approach can also be exploited to image objects whose distance puts them out of reach, for either practical or technical reasons, e.g., at infinity. In that regime, the ability of our method to increase angular resolution, or, equivalently, to increase spatial resolution independent of the object distance becomes key, as there is no way to access, illuminate, or increase the size of the object in the image field of view. This is typically the case in astronomical imaging, in which case an additional SLM placed at sufficient distance inside or outside the Earth's atmosphere might further enhance the resolution limits of an existing ground or space-based telescope. Such a configuration could borrow from the external occulter concept [35], except that the SLM would be used not only to mask the light from unwanted sources, but also to modulate the light from the object of interest and reconstruct a more detailed image, based on our paradigm. For astronomical applications, the robustness of our paradigm to diffraction and the use of differential measurements might also prove useful, to mitigate the effects of atmospheric seeing and turbulence, depending on the relative location of the SLM. This remains the topic of further investigations.
An important limiting factor of our approach is the acquisition time required to produce a suitable reconstruction, which is caused by the very low amounts of light that can be collected by the sensor array compared to the noise level. In that regard, our no-repetition result (Fig. 5(e)) illustrates how noise can negatively affect reconstruction quality when the acquisition time is insufficient. The effect of noise is also important in the computational imaging paradigms used in microscopy [36]. In the noiseless regime, Eq. (8) implies that the imaging quality of our approach is only limited by the angular resolution determined by the SLM parameters $\Delta$ and $z$, as discussed in Section 3.5. How to extend our paradigm to higher resolutions, for instance for point-like objects [37], constitutes an open question.
In this work, we have shown how particular acquisition strategies can mitigate the issue of the acquisition time to some extent. In further work, the proposed acquisition and reconstruction methods can be adapted to decrease the time budget, to maximize the reconstruction quality, and to operate in more complex settings. For instance, satisfactory reconstructions might be obtained from fewer coefficients based on the compressed-sensing framework, assuming wavelet, total-variation, or nonlocal image priors [1,2,38]. Recent advances in deep learning for inverse problems might also be of benefit to our paradigm. Indeed, image reconstruction based on neural networks is intrinsically faster than compressed-sensing-based iterative algorithms, and provides improved reconstruction quality by learning image features during an offline training phase [39].
Adaptive-acquisition methods that are more advanced than the one proposed in Section 3.4 can also be developed to limit the acquisition time. Such methods might avoid repetitions of the same SLM pattern by producing specifically optimized patterns for each new measurement, based on information on signal and noise properties that would be gathered from all of the previous measurements. Furthermore, acquisition settings where the object is moving or changing might be handled by adapting our acquisition model and our algorithms.

Conclusion
We here propose and experimentally demonstrate a novel imaging paradigm where an optical device can be used in conjunction with an SLM to acquire and resolve remote objects with a resolution that exceeds the diffraction limit of the optical device. Our acquisition strategy can be seen as the transformation of the optical device that produces degraded measurements (due to diffraction and instrument response) into a new device that produces compressed measurements. Then, the image of the scene is reconstructed by solving a simple inverse problem. Our experiments represent the first proof-of-concept that the loss of spatial and angular resolution that is intrinsic to diffraction can be circumvented through the use of specific acquisition and reconstruction strategies.

A.1. Unbiasedness of the measurements
By linearity of the expectation, we obtain, from the differential expression of $g^{\delta}_k$ given by Eq. (15),

Under the noise model of Eq. (14), the additive biases, which include the effects of ambient light, cancel out mathematically as

Using the forward model of Eq. (13), we then obtain

which simplifies, by linearity of both $S$ and $D$, to
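The cancellation of additive biases described above can be checked numerically. The following is only a minimal sketch under stated assumptions: it supposes a two-shot differential measurement (one shot for each part of a pattern) and zero-mean Gaussian noise, neither of which is taken from Eqs. (14) and (15). The function name `differential_measurement` and all numerical values are hypothetical.

```python
import numpy as np


def differential_measurement(rng, signal_plus, signal_minus, bias, noise_std, repeats):
    """Simulate repeated differential measurements (assumed two-shot scheme).

    Each shot is modeled as signal + constant bias + zero-mean noise. The bias
    (e.g., ambient light) is assumed identical in both shots, which is an
    assumption of this sketch and not a statement about the actual setup.
    """
    g_plus = signal_plus + bias + rng.normal(0.0, noise_std, size=repeats)
    g_minus = signal_minus + bias + rng.normal(0.0, noise_std, size=repeats)
    # The common bias disappears in the difference.
    return g_plus - g_minus


rng = np.random.default_rng(0)
samples = differential_measurement(
    rng, signal_plus=5.0, signal_minus=2.0, bias=100.0, noise_std=1.0, repeats=200_000
)
# Empirical mean of the differential measurement; it should approach
# signal_plus - signal_minus regardless of the (much larger) bias.
empirical_mean = samples.mean()
```

In this toy setting the empirical mean converges to the bias-free difference of the two signals even though the bias is much larger than the signal, which is the behavior the derivation relies on.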

A.2. Robustness to diffraction and sampling
We now demonstrate that the measurements $v^{\delta}_k$ are robust to diffraction and sampling. In more detail, they provide inner products between the degradation-free image $f$ and the modulating patterns $q_k$, up to a multiplicative constant. By linearity of the expectation, we have

Therefore, as measurements are unbiased as shown in Eq. (24), we have

Substituting the sampling operator $S$ with its definition leads to

Assuming that the pixel instrument function $\varphi(x_p - x)$ indicates the location of the $p$-th pixel of the sensor, the instrument function forms a partition of unity, i.e., $\sum_p \varphi(x_p - x) = 1$, and the convolution simplifies to

In a similar fashion, the convolution kernel $h$ satisfies $\int h(x)\,\mathrm{d}x = 1$ because diffraction preserves the total light energy. Therefore, assuming that the diffracted modulated images $D\{q_k(M\cdot)\,f(M\cdot)\}$ lie within the field of view of the device, we obtain

which is independent of $S$ and $D$.
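The two properties used above (a unit-sum blur kernel and sensor pixels that form a partition of unity) can be illustrated with a short one-dimensional numerical check. This is a sketch under simplifying assumptions: the magnification $M$ and the multiplicative constant are ignored, the sensor pixels are modeled as contiguous boxes, and the kernel is an arbitrary normalized Gaussian. The function name `summed_sensor_reading` and all parameter values are hypothetical.

```python
import numpy as np


def summed_sensor_reading(f, q, h, pixel_width):
    """Blur the modulated image, integrate it over box pixels, and sum all pixels."""
    modulated = q * f
    # "full" convolution keeps the whole blurred support, i.e., the blurred
    # image is assumed to stay within the field of view.
    blurred = np.convolve(modulated, h, mode="full")
    # Zero-pad so that the signal splits into an integer number of pixels.
    pad = (-blurred.size) % pixel_width
    blurred = np.pad(blurred, (0, pad))
    # Contiguous box pixels form a partition of unity on the grid.
    pixels = blurred.reshape(-1, pixel_width).sum(axis=1)
    return float(pixels.sum())


rng = np.random.default_rng(0)
f = rng.random(256)                              # degradation-free image
q = rng.integers(0, 2, size=256).astype(float)   # binary modulation pattern
h = np.exp(-0.5 * (np.arange(-20, 21) / 6.0) ** 2)
h /= h.sum()                                     # blur kernel with unit sum

reading = summed_sensor_reading(f, q, h, pixel_width=32)
inner_product = float(q @ f)
```

Here `reading` and `inner_product` agree up to floating-point error, even though the blur is wider than several fine-grid samples and each sensor pixel spans 32 of them.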

A.3. Discretization
Each modulation pattern $q_k(x)$ is implemented using a vector $\mathbf{q}_k = [q_k^1, \ldots, q_k^N]$ that indicates the value of each of the SLM pixels. Mathematically, we have

where $b(x)$ is a square box function of size $\Delta$ that represents the shape of the SLM pixels. Therefore, Eq. (28) expands as

Defining $\mathbf{f} = [f^1, \ldots, f^N]$ as the discrete version of $f(x)$ that matches the resolution and pixel size of the SLM, i.e.,

we obtain the discrete-scalar-product relation

$$\sum_{n=1}^{N} q_k^n f^n = \mathbf{q}_k^{\top}\mathbf{f}. \qquad (32)$$
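As an illustration of this discretization, the sketch below compares a fine-grid approximation of the continuous inner product with the discrete scalar product. It assumes, for the purpose of the example only, that each $f^n$ is the integral of $f$ over the $n$-th SLM pixel and that the pattern is piecewise constant over box-shaped pixels; the names `discretize` and `continuous_inner_product` are hypothetical.

```python
import numpy as np


def discretize(f_fine, samples_per_pixel, dx):
    """Integrate the fine-grid image over each SLM pixel (assumed definition of f^n)."""
    return f_fine.reshape(-1, samples_per_pixel).sum(axis=1) * dx


def continuous_inner_product(q_vec, f_fine, samples_per_pixel, dx):
    """Riemann-sum approximation of the integral of q_k(x) f(x)."""
    # A piecewise-constant pattern: each SLM pixel value is repeated over its box.
    q_fine = np.repeat(q_vec, samples_per_pixel)
    return float(np.sum(q_fine * f_fine) * dx)


rng = np.random.default_rng(1)
samples_per_pixel = 8
dx = 0.125                                   # fine-grid step; pixel size = 8 * dx
n_pixels = 16
f_fine = rng.random(n_pixels * samples_per_pixel)
q_vec = rng.integers(0, 2, size=n_pixels).astype(float)

lhs = continuous_inner_product(q_vec, f_fine, samples_per_pixel, dx)
rhs = float(q_vec @ discretize(f_fine, samples_per_pixel, dx))
```

Both quantities coincide because the pattern is constant over each pixel, so the integral splits into one term per SLM pixel.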

A.4. Adaptive subsampling
Subsampling methods such as the low-frequency diamond scheme proposed in [27] select coefficients a priori, regardless of image properties. Therefore, such methods can miss relevant high frequencies. Inspired by [34], we propose an adaptive subsampling scheme that mitigates this problem. Specifically, we propose to first make a quick but exhaustive acquisition of all of the coefficients, and then to repeat the acquisition of the highest coefficients only, to increase their SNR.
Retaining $K$ significant coefficients and acquiring the significant coefficients $L$ times at most, we obtain the adaptive measurement vector as follows:
1. Acquire all patterns, i.e., $\{q_k\}$ for $1 \le k \le N$. The resulting measurement vector is denoted by $v^{\delta,1}$.
2. Low-pass filter the image of the first-pass measurements $v^{\delta,1}$ with a Gaussian filter of unit variance; this yields the filtered coefficients $\tilde{v}^{\delta,1}$.
3. Find the locations of the $K$ highest (absolute) values of the filtered measurement vector $\tilde{v}^{\delta,1}$. The set of indices indicating relevant coefficients is denoted by $\Omega$.
where the overhead $(2N + 8 - 4K)\,\Delta_t$ is due to the adaptive coefficient selection in Steps 1-3.

Fig. 3. Elements of the proposed experimental setup. (a) Webcam. (b) Showcase with front screen. (c) Closeup of this front screen highlighting its central area used as SLM (64 × 64 pixels) while it generates one of the light-transmittivity patterns of Eq. (12). The object is located inside the showcase and can be seen in this picture behind the modulated pattern.

Fig. 4. The proposed paradigm compared to conventional object acquisition. (a) Reference object profile. (b) Image from the conventional optical device. (c) Magnified central area of 5.4 × 5.4 pixels of (b) corresponding to the size of the object as directly viewed from the webcam. (d) Image reconstructed using our diffraction-unlimited approach, with the colorbar representing the normalized brightness; N = 64 × 64, K = 2050. (e) Vertical edge response (right) computed within the blue rectangle (left). We compute the edge response by averaging the vertical profiles across the horizontal pixels 43 to 53 (solid blue line). Considering the responses at 10% and 90% (dashed black vertical lines) relative to the dark and bright levels (solid black horizontal lines), we obtain an average edge response (rising edge) of 2.6 pixels.

4. Acquire the relevant patterns, i.e., $\{q_k\}_{k \in \Omega}$, $L - 1$ more times. The resulting measurement vectors are denoted by $v^{\delta,\ell}$, $2 \le \ell \le L$, where the coefficients that are not acquired are set to zero. The first-pass acquisition is cleaned by setting $v^{\delta,1}_k$ to zero for all $k \notin \Omega$.
5. Average the measurement vectors to get the adaptive measurement vector $v^{\delta}$. Mathematically, $v^{\delta} = \frac{1}{L}\sum_{\ell=1}^{L} v^{\delta,\ell}$.
The low-pass filtering in Step 2 and the elimination of nonrelevant coefficients in Step 4 both aim to reduce the effect of noise on the initial single pass. The time budget associated with this adaptive scheme is

$$t_{\mathrm{sub}} = (4KL + 2N + 8 - 4K)\,\Delta_t, \qquad K < N/2 + 2.$$
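Steps 1 to 5 and the associated time budget can be sketched as follows. This is an illustrative outline and not the code used in the experiments: the callback `acquire`, which is assumed to return one noisy measurement per requested pattern index, the row-major arrangement of the coefficients as a square image, the truncation of the Gaussian filter at four standard deviations, and the synthetic test signal are all assumptions of this example.

```python
import numpy as np


def _smooth_1d(signal, kernel):
    """Convolve and crop so that the output has the same length as the input."""
    radius = kernel.size // 2
    return np.convolve(signal, kernel, mode="full")[radius:radius + signal.size]


def gaussian_smooth(image, sigma=1.0):
    """Separable Gaussian low-pass filter (unit variance by default)."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()
    rows = np.apply_along_axis(_smooth_1d, 1, image, kernel)
    return np.apply_along_axis(_smooth_1d, 0, rows, kernel)


def adaptive_measurements(acquire, side, K, L):
    """Return the averaged measurement vector and the selected index set."""
    N = side * side
    if not (1 <= K <= N and L >= 1):
        raise ValueError("expected 1 <= K <= N and L >= 1")
    passes = np.zeros((L, N))
    # Step 1: exhaustive first pass over all patterns.
    passes[0] = acquire(np.arange(N))
    # Step 2: low-pass filter the first pass, viewed as a side x side image.
    filtered = gaussian_smooth(passes[0].reshape(side, side)).ravel()
    # Step 3: indices of the K largest filtered coefficients in absolute value.
    omega = np.argsort(np.abs(filtered))[-K:]
    # Step 4: re-acquire the selected patterns L - 1 more times; other entries
    # stay at zero, and the first pass is zeroed outside the selected set.
    for ell in range(1, L):
        passes[ell, omega] = acquire(omega)
    keep = np.zeros(N, dtype=bool)
    keep[omega] = True
    passes[0, ~keep] = 0.0
    # Step 5: average over the L passes.
    return passes.mean(axis=0), omega


def time_budget(N, K, L, dt):
    """Time budget of the adaptive scheme in the range K < N/2 + 2 stated above."""
    if not K < N / 2 + 2:
        raise ValueError("formula is only stated for K < N/2 + 2")
    return (4 * K * L + 2 * N + 8 - 4 * K) * dt


# Synthetic example: a few strong coefficients observed with unit-variance noise.
rng = np.random.default_rng(0)
side = 16
truth = np.zeros(side * side)
truth[:20] = 10.0


def noisy_acquire(indices):
    return truth[indices] + rng.normal(0.0, 1.0, size=len(indices))


v_adaptive, omega = adaptive_measurements(noisy_acquire, side=side, K=32, L=4)
budget = time_budget(N=4096, K=512, L=4, dt=1.0)
```

With $N = 4096$, $K = 512$, and $L = 4$, the formula gives $4KL + 2N + 8 - 4K = 8192 + 8192 + 8 - 2048 = 14344$ time units, of which $2N + 8 - 4K = 6152$ correspond to the overhead of the coefficient selection.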