Compressed sensing with near-field THz radiation

RAYKO I. STANTCHEV,* DAVID B. PHILLIPS, PETER HOBSON, SAMUEL M. HORNETT, MILES J. PADGETT, AND EUAN HENDRY School of Physics, University of Exeter, Stocker Road, Exeter EX4 4QL, UK SUPA, School of Physics and Astronomy, University of Glasgow, Glasgow G12 8QQ, UK QinetiQ Limited, Cody Technology Park, Ively Road, Farnborough GU14 0LX, UK e-mail: e.hendry@exeter.ac.uk *Corresponding author: ris202@exeter.ac.uk


INTRODUCTION
Imaging using terahertz (THz) radiation (0.1 -2 THz) is appealing as many visibly opaque materials are transparent in this frequency range [1].THz imaging enables, for example, the detection of faults in space shuttle panels [2], investigations into the substructure of paintings [3] and non-destructive qualitycontrol of pharmaceutical tablets [4].The THz photon energies are non-ionizing, unlike X-rays, hence inspections will not damage sensitive electronics [5].However, THz imaging has two main outstanding challenges.Firstly, THz detector arrays are usually expensive and very narrow-band [6,7].Secondly, the relatively long wavelengths (0.15 -3 mm) impose that diffraction limited imaging will fail to see micron scale details.
To date, most sub-wavelength THz imaging techniques use near-field raster scanning techniques, where a sub-diffraction limited probe tip is raster-scanned over a surface, sampling the evanescent field at each location [8,9].These methods have achieved 10 nm resolution [9].Whilst impressive, the weak signals emanating from the probe are sensitive to detector noise, necessitating long measurement times, and complex detection schemes are required to improve acquisition rate [10].
Recently, alternative imaging techniques using single-pixel detectors have emerged [11,12].Single-element detectors are generally cheaper and more robust than detector arrays.These methods rely on spatially patterning the incident beam of radiation with a set of known patterns (described later).A set of orthogonal patterns minimizes the mean square error in this approach [13].However, to unambiguously reconstruct an Npixel image in this manner, a minimum of N measurements are normally required.To circumvent the trade-off between imaging time and resolution, a variety of compressive sampling techniques have been developed that make use of assumptions about the nature of the object to reconstruct an image from an under-sampled set of measurements [14][15][16][17].
In this contribution, we demonstrate a near-field terahertz imaging approach using an ultra-thin, silicon photo-modulator.The ultra-thin wafer allows us to access the THz evanescent fields to achieve a spatial resolution of 9 µm (l/45 at 0.75 THz), demonstrated explicitly in experiment by imaging a resolution target.We conclude by investigating two different methods of speeding up the image aquisition time by reconstructing images from under-sampled sets of measurements: adaptive sampling and compressed sensing.

METHODS
Single-pixel imaging schemes have been successfully implemented in the THz regime [18][19][20].A recently demonstrated approach to this employs dynamic spatial patterning of a THz beam using a photo-conductive modulator [21].This technique exploits the fact that when an optical pump beam is incident onto a photo-modulator, such as a silicon wafer, electron-hole pair photo-excitation increases its THz conductivity [22], and reduces THz transmission.By shaping the optical pump beam into a binary intensity pattern, we are able to spatially control which areas of the modulator transmit THz radiation.In this way, the pattern in the optical pump beam is imprinted onto the THz beam that is transmitted through the photo-conductive modulator.Previous studies have shown that the thickness of the photo-modulator limits the achievable resolution, as this determines the amplitude of evanescent fields interacting with the object [21].
Figure 1 illustrates our imaging setup (a detailed schematic is also shown in supplementary info of ref. [21]).For generation and detection of THz radiation we use a pair of ZnTe crystals in a standard THz time domain spectrometer [23].The system is pumped by an 800 nm (90 f s) amplified Ti-Sapphire laser running at a repetition rate of 1050 Hz.This laser system also provides a photo-excitation beam which is spatially structured by a digital micromirror device.This patterned beam (fluence of ⇠ 100 µJ/cm 2 per pulse) is projected onto a silicon wafer (6 µm thick, 8000 W • cm resist) using a +5 cm lens.
In those regions of the photo-conductive modulator that are illuminated by the optical photo excitation beam, the silicon conductivity increases, rendering the material response at these locations opaque to THz radiation [22].Then a single-cylce THz pulse (see Fig. 4 of supplimentary info) arrives ⇠ 5 ps after photo-excitation.Since carrier diffusion can be neglected on these fast timescales, the pattern from the 800 nm beam is transferred to THz pulse without spatial blurring.The patterned THz field then interacts with an object placed on the exit interface of Si wafer before being collected onto a single-element detector.We record the peak amplitude of our THz pulse for our transmission signal, shown in supplementary Fig. 4. Dynamic spatial encoding of the THz beam enables an image to be reconstructed from a series of measurements with a single-element detector.We measure the total THz field transmitted through an object as it is illuminated with a sequence of spatially patterned beams.Measurement y i represents the transmission corresponding to the i th pattern.Our task is then to reconstruct the image of the unknown object from the series of correlation measurements with our known patterns.To do this the spatial structure of the set of illuminating patterns can be stored in a sampling matrix A.Here the i th row of A is an Nelement vector which is a 1D representation of the i th projected pattern.Therefore, for an N-pixel image we have measurement y i = Â N j=1 A ij x j where x j is the j th image pixel.This can be represented as a matrix equation where y is a column vector containing the measurements, and x is a column vector representing the N-pixel image of the object.The image is obtained by solving the above equation for x, which is then reshaped into a 2D array of image pixel values.As described above, A represents the basis the image is expanded in.When the image is fully sampled (i.e. with the same number of linearly independent measurements as there are pixels) A satisfies AA T = I, where I is identity matrix.In this case the solution is simply given by x = A T y.Further, it is well known that an orthogonal basis minimizes the detector noise [13].For this reason, when the number of measurements equals the number of image pixels, we use a Hadamard matrix (A = H) formed from the Paley type-I construction (see supplementary sec 9 for details).H consists of elements taking the value of 1 or -1 in equal number.Each Hadamard pattern transmits 50% of the incident beam, giving the maximal signal level at the detector.As our intensity masks cannot represent negative numbers, we use a lock-in amplifier to record the difference in transmission between a mask consisting of 1s (i.e.transmitting radiation) and 0s (i.e.blocking radiation) and its inverse.This measurement also minimizes low-frequency intensity fluctuations of our THz source (see supplementary info of ref. [21]).
If an object is undersampled, A has fewer rows than columns, and the problem is underconstrained, with infinitely many solutions that satisfy Eqn. 1. Adaptive sampling and compressive sensing are both strategies that make use of assumptions about the nature of the object (such as object sparsity when represented in a particular basis) to choose one solution that most likely represents the object.Depending upon the strength of these assumptions, and the level of noise in the measurements, these approaches have been proven to enable functioning image reconstruction from highly undersampled measurement sets [14][15][16][17].More details of the compressive approaches we use in this work are given in §4.

RESOLUTION TESTS
In near field imaging approaches, subwavelength resolution can be achieved due to the interaction of near-fields with the object.However, near-fields decay exponentially with distance.In our approach, the thickness of the modulator can therefore be expected to play an important role in determining the ultimate resolution of our THz images.To investigate this, this we image a subwavelength sized, metallic resolution target (cartwheel) through a silicon photo-modulator of varying thickness h, shown in Fig. 2.
For a relatively thick modulator (h=400 µm, of the order of the THz wavelength), we see that very few of the subwavelength features of the cartwheel are evident in the resulting image (see fig. 2a).One can understand the resulting image by considering the diffracted field expected for the object when propagated through a thickness h of silicon (refractive index = 3.44), plotted in figure 2d using scalar diffraction theory [24] (see supplementary §7 for details).While agreement is imperfect (see below), The overall trend here is clear: as the thickness of the modulator is reduced, the images sharpen.Hence, due to the increasing spatial frequency content of the cartwheel towrds its centre, we can estimate our obtained resolution by evaluating the minimal distance for which the cartwheel arms are distinguishable.This leads to resolution estimates of 154 µm, 100 µm and 9 (±4) µm for h = 400 µm, 110 µm and 6 µm, respectively.
We note that there are two main reasons for the discrepancies between the left and right hand panels of Fig. 2. Firstly, the polarisation of the THz field is important, while our scalar diffraction calculations neglect this.This leads to the breaking of rotational symmetry in the experimental images.Indeed, the effect of polarization can be observed explicitly when we vary the orientation of certain objects (see supplementary §6).Secondly, we must note that the optical pump light has a finite penetration depth in silicon (11 µm for 800 nm [25]), which will influence the diffracted field.This has two effects firstly the modulation efficiency will decrease due to transmitted pump intensity being wasted.Conversely this finite penetration depth actually decreases the effective thickness of the modulator increasing resolution.We discuss these effects and the optimal selection of the modulator thickness in the supplimentary information.Further investigation with direct gap modulators is therefore required in order to push the THz image resolution lower than 9 µm using this method.

MEASUREMENT REDUCTION
As with all near field imaging techniques, small signals go hand in hand with long measurement times.However, as our imaging approach does not rely on raster scanning, we can reduce acquisition time by reducing the total number of independent measurements.It should be noted that this is the first time undersampling with near field radiation has been performed.In this section, we investigate two strategies to reconstruct images using undersampled sets of measurements: adaptive sampling and compressed sensing.
Using adaptive sampling, we first measure a low resolution image and then sample regions of interest with progressively higher resolution.In short, identification of coarse edges from this initial low resolution image determines where to sample with higher resolution, thus reducing the total number of measurements that are made [15].Edge identification is achieved via a single-tier 2D Haar wavelet decomposition of the low resolution image.The Haar wavelet transform is a hierarchical structure that highlights the presence of edges at progressivley finer scales: edge features yield large wavelet coefficients, while more uniform areas yield low wavelet coefficients [26].In our experiment, after each higher resolution resampling phase, edge detection is performed on the new image, and the process is repeated until the required resolution is reached.The algorithm is described in detail in supplementary § 9.
Compressed sensing is an alternative non-adaptive approach which draws on assumptions that we can make about the object to reduce the total number of measurements.This can include knowledge of the basis in which the representation of the image is sparse [14], or the basis in which the total variance or curvature of the object is expected to be low.Such assumptions typically hold for a wide variety of natural images, and compressed sensing theory shows how it is then possible to recover an image using fewer measurements than the number of pixels in the image (i.e.A in eqn. 1 has fewer rows than columns) [14].In this work we make the assumption that the total curvature of the image will be low.We sample the object using a set of random binary patterns, which are a chosen due to their high degree of incoherence with respect to a wide range of basis representations that are anticipated to be sparse.Non adaptive compressed sensing has the advantage that masks can be designed (and loaded onto a modulator) ahead of time rather than in response to measurement, as is the case for adaptive sampling.
We also note that in both our adaptive and compressive sampling strategies, we have applied regularization to combat noise in our measurements.This essentially allows the reconstruction algorithm to permit solutions that deviate from the measurements by an amount based on the estimated noise level, while seeking to minimize the level of curvature in the reconstruction.More details are given in Supplementary Information, alongside unregularized images.
Figure 3 compares reconstructed images of a transmissive object depicting two of Maxwell's equations as the number of measurements for both adaptive imaging and compressive sensing are reduced.The main features of the object are clearly recovered even when the number of samples is reduced to 35% of the Nyquist limit.These images also display directional contrast modulation due to polarisation effects (see supplementary §6).Comparing the two methods, we observe that adaptive sampling marginally out performs compressed sensing, yielding higher contrast features.Note that our aim is to demonstrate the feasibility of these techniques rather than to optimise the sampling and reconstruction algorithms to suit the object under test and measurement conditions.We anticipate improvements in image quality should one inject more prior knowledge.

CONCLUSIONS
In summary, we have demonstrated a sub-wavelength THz imaging technique that is compatible with adaptive and compressive sensing algorithms.We photoexcite a 6µm-thick silicon wafer with a sequence of optical patterns to dynamically spatially modulate an incident THz beam.By placing an object at the exit interface of our wafer, we demonstrate imaging at

A FEW EXPERIMENTAL DETAILS
For generation and detection of THz radiation we use a typical THz time domain spectrometer [1]s.The spectrometer is powered by 90 f s, 800 nm pulses emanating from a Ti:Sapphire regen amplified laser system running at a 1 kHz repetition rate.This train of pulses is split into three beams: generation, detection and modulation.The first beam generates our THz radiation via optical rectification in a ZnTe crystal [2]s.We use 90 off-axis parabolic mirrors to collect and focus the THz onto a sample and then again onto a detector.The second detection beam measures our THz waveform via electro-optic sampling in another ZnTe crystal [3]s.In short, by temporally overlapping the picosecond THz and femtosecond optical pulses one can discretely sample the temporal THz profile.In Fig. 4a we show our THz transient and detection pulse envelope.For all results in the main paper we measured the peak of our THz pulse as our detector readout, shown my black arrow in Fig. 4a.This gives a spectrally averaged measurement.The inset shows the frequency spectrum of our pulse.Our central wavelength is 400 µm.The third beam is used to spatially modulate a THz pulse.This is achieved by first patterning the optical pump pulse, via a digital micromirror device, followed by a lens, in order to project a binary intensity pattern onto a silicon wafer.The photo-excitation of charge carriers increases the plasma frequency of the material to a region well above THz frequencies, thus rendering the material a conductor [4]s.Further, by temporal synchronization of all the pulses, we pass the THz a few picoseconds after photoexcitation.This minimizes carrier diffusion and hence transfers the optical pump pattern to the THz pulse without spatial blurring.Further details can be found in supplementary information of ref. [5]s.

A. Optimal Thickness of the Modulator
The optimal thickness of the modulator depends on two major factors the modulation efficiency of the photo-modulator and the effective minimum size of the pattern at the sample plane.The refractive index of the photo-excited silicon, which determines the modulation efficiency, depends on the absorbtion of pump photons into the modulator.Figure 5a.shows the absorbtion as a function of modulator thickness calculated using the transfer matrix method [6]s assuming a refractive index at 800nm of 3.6941+0.0065435i.Figure 5b.shows the corresponding refractive index (atw = 1THz) calculated using a Drude model where w p = q n electron e 2 e 0 m ? ,m ?= 0.26 ⇤ m 0 is the electron mass and g = 3THz is the scattering rate.The number of electrons is calculated from the absorbtion model via where f is the incident fluence, A is the absorption, t is the thickness of the layer and E pump is the energy of an 800nm photon.
We divide the photomodulator into 40 layers with refractive index as shown in 5b and calculate the THz transmission at 1THz for modulators with increasing numbers of layers included.The results are shown in 6 where it can be seen that the THz transmission is independent of thickness above ⇡ 5µm.
The other consideration is the effect of diffraction away from our photoexcited aperture, assuming a size for the aperture of a = 6µm (the size of the image of the DMD mirrors assuming perfect imaging) the field in the sample plane x 0 can be written as where t is the thickness of the modulator, x is distance in the aperture plane and k = 2pl THz n.The FWHM of the field in the sample plane is characteristic of the resolution.Note we use field here as this is the measured quantity in experiment.
The FWHM is shown in figure 6 it should be noted that the size of the diffracted aperture for our choice of modulator is ⇡ 20µm a factor of two larger the achieved imaging resolution of 9µm.This can be explained by consideration of the thickness of the photo-modulated region as seen in figure 5 where for the thinnest samples the photo-modulation extends all the way through the sample giving an effective thickness much smaller than calculated.

POLARIZATION
In Fig. 2c we observe an apparently lower resolution (nominally 51 µm as opposed to 9 µm) depending on which cartwheel arms one evaluates.This effect arises from polarization boundary conditions at the interface of a conductor.Namely, the electricfield component parallel to the to interface of a conductor must vanish.Thus, if our incident THz field is polarized parallel to two metallic edges which are separated by less than half wavelength in the material, one can expect a reduced transmission.This leads to a superficial increase in the observed resolution for regions where this boundary condition is satisfied.This polarization effect can also be seen when imaging the del operator in the Maxwell equations of Figure 3.With two perpendicular polarizations in Figs.2e and 2f: the sides of the triangle which are perpendicular to the polarization show the largest transmission.

SCALAR DIFFRACTION THEORY
The Rayleigh resolution criterion is defined as when the first diffraction minimum of a source coincides with the maximum of another source.We use this to estimate the resolution of our imaging system.Using scalar diffraction theory [7]s, we calculate the diffraction pattern, at the exit interfaces of our silicon photomodulators, for two parallel slits with variable separation.In ref. [7]s, by assuming that all scatterers, sources and diffracting apertures are located in negative z-space, Kowarz obtains a solution to the 2D Helmholtz equation for positive z-space.His solution for the electric field U(x, z) is the sum of two parts, a homogeneous propagating field U h (x, z) and an evanescent field U i (x, z): where u x is the free space wavenumber, k is the free space wavenumber in x and A(u x ) is a spectral amplitude function that is the Fourier transform of the scatterer's field distribution in the plane z = 0, ie.
Now we can obtain intensity distribution, defined as for any field distribution |U(x, 0)|.Furthermore, in experiment we have multiple frequencies.To to account for this, we sum the diffracted fields for all our frequencies, where each frequency component has an input amplitude given by our pulse spectrum (fig.4) and a silicon equivalent wavelength.Finally, we extend this theory to two dimensions by considering the 2D Fourier transform of eq.7 and adding an extra integral over u y 1 in eqs. 5 & 6.

ORIGIN OF VERTICAL LINES IN FIG 2A
We begin with an empirical observation of how to eradicate the vertical lines: moving the silicon wafer out our lens focus increases the amplitude of the vertical lines.This effect is shown for a 6µm thick Si photomodulator in Fig. 6.Here, strong vertical lines appear as one of moves out of focus.The lower panels in fig.6 show the first 350 measurements of the corresponding image above.It can be seen that the out of focus image contains periodically occurring masks with abnormally high values.We now proceed to explain these observations.Consider imaging a perfectly homogeneous field with Hadamard multiplexing 2 .The first mask has values of +1 everywhere, ie.every pixel is turned on.Therefore this mask will measure to the total field transmitted through our object.However, every other mask will contain equal numbers of +1s as -1s 3 .For a homogeneous field, these masks are expected to have a detector readout of zero.However, imperfect projection using our +5cm imaging lens will lead to some +ve and ve mask pairs which are not perfect inverses of each other.As such, these mask pairs will give rise to a reading on the detector.The strength of the signal for any particular mask pair will depend on their degree of cancellation: the further one lies from the image plane of the lens, the less perfect will be the cancellation.Hence, this explains why the image artefacts that appear in the out of focus image have the largest values.Finally, we believe these signal artefacts occur in a periodic manner due to the Paley type-I Hadamard sampling matrix, namely that it is constructed from a cyclical permutation. 1 Note the square root terms become q 1 u 2 x u 2 y and there is an extra e ikuy y term. 2 The same method that was used to obtain the images in Figs.2b-f 3 As mentioned in the main text, to achieve this desired encoding scheme we use a Lock-In amplifier to record the difference in THz transmissions of a mask and its complimentary inverse.

ADAPTIVE SAMPLING METHODS
We follow an adaptive approach similar to that described in [15].As described in the main text, we first measure a low resolution image (I i ) consisting of N i ⇥ N i pixels, where i is the tier number.i = 1 in the first case.This image is measured using a full set of Hadamard projections (Sylvester construction), and the reconstruction is the weighted sum of the Hadamard patterns, each weighted by its measured level of correlation with the object.Next we perform a single tier Haar Wavelet transform on I 1 .In general the single tier Haar wavelet transform T i is calculated as follows, where T 0 i is the partial Haar transform calculated as an intermediate step.
Here x and y are Cartesian coordinates defining pixel locations.
Eqns. 8 to 11 essentially calculate the sum and differences between adjacent rows and columns of pixels.T i consist of 4 quadrants (here referred to as Q1-Q4).Q1 is a downscaled version of the original image, with the linear resolution reduced by a factor of two.Q2-Q4 represent images of the object at the same resolution as Q1, with vertical, horizontal and diagonal edges highlighted respectively, at the scale of the pixels sizes in the original low resolution image.We now use the identification of edges highlighted in T i to guide where to make further measurements at higher resolution.We form image I edge , which combines the highlighted edge information present in images Q2-Q4.I edge is formed by calculating the pixel-by-pixel quadrature sum of Q2, Q3 and Q4 to create the image: We now create an image mask by binarising I edge based on either a threshold value or by setting a required percantage of pixels to image at higher resolution.This mask denotes the regions with the highest wavelet coefficients and so defines where the object will be sampled at the next phase.The next tier of imaging is performed by making a series of Hadamard projection measurements using a fully sampled set of patterns confined to the regions defined by the mask.Here the commonly used Sylvester Hadamard construction is no longer optimal, as the number of patterns (and therefore pixels) in the Sylvester Hadamard sets are confined to 2 k , where k is a nonnegative integer, while the number of pixels within the next phase is unlikely to equal 2 k .Therefore we now use the Paley type-I Hadamard construction, which is of more flexible scale, since it can possess a number of patterns equal to p + 1, where p is a Prime number that is congruent to 3 (mod 4).Therefore we create the smallest Paley type-I Hadamard matrix which can be used to critically or slightly oversample the target area of the object defined by the mask, at twice the resolution of the initial low resolution initial image I 1 .The fully sampled higher resolution image of the masked regions is once again reconstructed from a weighted sum of the Paley Type-I Hadamard patterns, each weighted by its measured correlation with the object.
Finally, image I 1 is upscaled by a factor of 2, and updated to I 2 by either replacing those parts of I 1 with the higher resolution information, or to make the best use of all measurements, we can combine both low and high resolution measurements by representing all our measurements as a matrix equation as shown in Eqn. 1, and solving for the image.In the second case, the newly imaged areas are now oversampled, and so a least squares fit provides a level of noise suppression.In the regions that have not been resampled at higher resolution we keep our initial low resolution measurements with uniform intensity across each larger scale pixel.
Once I 2 is obtained, it is used as the input for the single tier Haar wavelet transform and the process is repeated at increasing resolutions.Evidently, as the algorithm progresses, the selection of new regions to image at higher resolutions is likely to be made inside regions that were imaged in the previous phase.This represents the truncation of the Wavelet tree, which is following the assumption that high values of wavelet coefficients at coarse scales are highly correlated with high values at finer scales.This is a reasonable assumption which can be understood by considering that sharp edges are represented by Fourier components across a wide range of frequencies.Figure 10 (left hand side) shows the resulting image reconstructions prior to regularisation.More details of the regularization procedure are given below.

COMPRESSIVE SENSING METHODS
As mentioned in the main paper, if we have performed M measurements for an N-pixel image, then A is an M by N matrix, y is an M sized vector of our measurements and x is our N-pixel image that is to be reconstructed.Unfortunately, A is no longer invertible and there exist an infinite number of solutions that satisfy these constraints.Obtaining the physically relevant solution is therefore the challenge in this type of reconstruction.

A. L 1 minimization
In addition to the compressive reconstruction presented in Fig. 3 (right hand side) in the main paper, in this section we also show a reconstruction from the same measurements but this time based on L 1 minimization.This form of reconstruction relies on the assumption that we know in advance that our solution will be sparse when transformed into a known basis representation (for example the Fourier transform or Wavelet transform of many natural images is sparse).Therefore we search for a solution that satisfys such a sparsity criterion.A vector is considered ksparse if it has k non-zero components.The number of non-zero components are measured with a L 0 norm, defined as: However, minimizing the L 0 norm is an NP-hard problem [8]s.
Fortunately it has been shown that a successful approach [14] is to find a solution that minimizes the L1-norm, which is given by the sum over the all of the components in x.This is an attractive option as minimization of the L1-norm is not an NP-hard problem and so is considerably less computationally intensive than finding the L 0 norm.Therefore when representing x in the basis in which we assume it will be sparse, the problem becomes; min ||x|| 1 subject to y = Ax.( Figure 10 (right-hand side) gives an example of this approach.It was obtained by solving Eq. ( 14) via the L1-magic package [9]s, under the assumption that the solution would be sparse in the Fourier basis.

B. Minimum curvature regularization
Regularization is a general term to describe the process of solving an under-constrained problem by injecting additional information or assumptions.The adaptive and compressed sensing images shown in Fig. 3 were all reconstructed using a regularizer that sought to minimize the curvature in the image (i.e. the second spatial derivative of the image pixel intensities), while simultaneously permitting the solution to deviate from that arising from the measurements alone by an amount based on the estimated levels of noise.Analagously to above, the problem can be expressed as where the C is a cost function given by C is the sum of two components: the first term (c 2 /N) specifies the ratio of the size of the discrepancy between the measured signals (y m ) and the predicted signals in the current reconstruction estimate (y p ), and the estimated level of noise in the measurements (s).Ensuring this is less than 1 stops the image estimate from diverging too far from the measurements.Here c 2 is given by where N is the total number of measurements and s is the estimate of the standard deviation of these noise in the measurements.
The second term in Eqn.16 (lR(p)) specifies the level of total curvature squared in the reconstruction estimate.Taking the square of the image curvature better ensures a differentiable cost function suited to gradient descent based optimisation.Here R(p) is given by where n is the total number of pixels in the reconstruction, and p j is the intensity of pixel of index j.l is an autoselected scalar that weights the importance of the first and second terms in Eqn.16. l is itself optimised to search for the solution for which c 2 /N = 1.Our regularizer searches the parameter space using a gradient descent algorithm interjected with random moves to attempt to climb out of local minima.The choice between total variation minimization or total curvature minimization is dependent upon which assumption best suits our expectation of the qualities of the object (i.e.do we expect the image of the object to have a low total variation or a low total curvature).In our present case, observing our binary objects, a low total variation in the image, or a low total curvature are both good assumptions about what the object is like, and so it is not critical which assumption we use.

SNR OF RECONSTRUCTED IMAGES
In correlation based imaging techniques, such as both of the methods we demonstrate in this work, there are two main sources of noise which govern what types of pattern should be used: illumination fluctuations and detector noise.We have previously performed a comparison of the imaging performance of our system using a raster scan set of patterns vs a fully sampled Hadamard set, showing that we are in the detector noise dominated regime (see ref [21]).If the system is dominated by detector noise, as in our case, then it is advantageous to send as much light as possible to the detector for each measurement.Therefore we use patterns which are each 50% transmissive (such as the Hadamard basis).However, there is no free lunch and so this choice of patterns also comes with a different drawback -the reconstruction is now more sensitive to illumination fluctuations than a simple raster scan.We also note that in our reconstruction we combat this noise using a regularization algorithm to improve the quality of our images, as described in the supplementary information.
There are a variety of methods to calculate SNR for images, depending upon the type of scene, the level of the background, and whether one has knowledge of the object or scene that is being imaged for comparison.Here the objects are known to be binary and the background of the reconstructions is near zero.
Therefore we follow the method used in [10]s and choose two regions of the image, one representing an area we know to be a feature, and one representing an area we know to be background.We now calculate SNR according to: where µ is the mean pixel intensity, and s is the standard deviation of the pixel intensity within the selected regions.

Fig. 1 .
Fig.1.Illustration of imaging setup: using a digital micromirror device and a lens, a pump pulse is spatially structured and projected onto a silicon wafer.This spatially modulates a coincident THz pulse.This THz pulse then passes through an object and is measured on a single-element THz detector.Inset is an optical image of a resolution test target (cartwheel) manufactured from gold on a 6µm thick silicon wafer.

Fig. 2 .
Fig. 2. a, b, c: THz images of the cartwheel shown in inset of Fig. 1 taken through 400, 110, 6µm thick silicon wafers respectively.A close-up annotated version of part c can be found in the supplementary information.Note, the cartwheels in a, b have diameters larger than the field of view.The origin of the vertical lines in a is discussed in supplementary info §8.d, e, f: Calculated diffracted fields of a cartwheel as propagated through 400, 110, 6µm thick silicon, respectively.THz polarization is horizontal in experiment.

Fig. 3 .
Fig. 3. Compressive 72 ⇥ 72 THz images of r • E = 0 and r • B = 0 (see supplementary info for optical image) with decreasing number of measurements, where the top, middle and bottom rows respectively use 75%, 50% and 35% measurements as the number of pixels.Note, a full set of measurements would take a total of 5 hours.Left column: adaptive sampling.Right column: compressed sensing.

Fig. 4 .
Fig. 4. Blue: THz transient detected by our system.Measurement arrow indicates the temporal point we sample for our individual mask measurements.Red: envelope of the optical probe pulse used to detect the THz transient.Inset shows the Fourier spectrum of our pulse, central frequency is 0.75THz and full-width-half-maximum is 1.4THz.

Fig. 5 .
Fig. 5. a. Absorbtion as a function of distance into the modulator.In order to minimize Fabry Perot interference fringes the average absorbtion over the range (780-830nm) is calculated.b.The refractive index as a function of depth into the photomodulator calculated using a Drude model the oscillations come from residual Fabry Perot effects.

Fig. 6 .
Fig. 6.Comparison of the two most important quantities in determining the optimal size of the photo-modulator.(blue) THz transmission through photo-excited silicon showing that below 10 µm ⇡ the photo-modulator begins to block less THz.

Fig. 7 .Fig. 8 .
Fig. 7. Annotated graph of part of fig.2c.The electric field is polarized horizontally.There are two resolvable distances depending on which cartwheel arms one evaluates.These have both been indicated on the graph.

Fig. 9 .
Fig. 9. Left to right; THz images of the cartwheel as the silicon wafer is moved out of the focus of our projection lens.Top row are the images with the corresponding Hadamard signals below.Semi-focus (Out of focus) is defined as when the object is moved by ⇠30(60)µm out of the focal plane.

Fig. 10 .
Fig. 10.Compressive 72 ⇥ 72 THz images of r • E = 0 and r • B = 0 with decreasing number of measurements, where the top, middle and bottom rows respectively use 75%, 50% and 35% measurements as the number of pixels.Left column: adaptive sampling.Right column: compressed sensing.
Figure ??shows the location of these regions on each image.

Fig. 11 .
Fig. 11.Optical image of the object (Maxwell's equations) imaged in fig. 3. Note, this was taken in a transmission microscope hence red is the ultra-thin silicon and black is the gold.