Generalized sampling using a compound-eye imaging system for multi-dimensional object acquisition

In this paper, we propose generalized sampling approaches for measuring a multi-dimensional object using a compact compound-eye imaging system called thin observation module by bound optics (TOMBO). This paper shows the proposed system model, physical examples, and simulations to verify TOMBO imaging using generalized sampling. In the system, an object is sheared and multiplied by a weight distribution with physical coding, and the coded optical signal is integrated onto a detector array. A numerical estimation algorithm employing a sparsity constraint is used for object reconstruction.


Introduction
A compound-eye imaging system is a promising computational imaging modality. Compound-eye optics have enabled light-field acquisition [1] and device compactness [2,3]. Thin observation module by bound optics (TOMBO) is a representative example of a compound-eye imaging system [4].
An advantage of compound-eye imaging systems is that they permit diverse data acquisition schemes. Different lenslets may create different encodings. For example, time detection based on the encoding concept has been proposed [5], and range detection in [6] can be considered as a system based on the concept. These compact systems reconstruct a three-dimensional object from a two-dimensional measurement whose size is the same as that of an axial plane of the object.
This paper proposes generalized sampling approaches for multi-dimensional object acquisition using TOMBO. In the proposed system, an object is acquired with coding and multiplexing in a two-dimensional snapshot. In particular, the coding schemes in [5,6] are extended for multi-dimensional data acquisition of various objects.
There are multiple choices of coding schemes for multi-dimensional object acquisition, such as coded aperture imaging and multi-shot imaging. These schemes differ in their design constraints. This paper considers hardware compactness and single-shot object acquisition capability as the critical design constraints. As a large body of literature indicates, TOMBO is one technique that can be implemented as a compact system meeting these design constraints. This motivates us to investigate its potential as a compressive imaging technique.
In this paper, the mathematical model of the proposed system and examples of the coding schemes for spectral and polarization imaging techniques are presented. Simulation results of the proposed system are shown. The implementations are inspired by [7,8,9]. The previously presented systems have a tradeoff between the spatial and axial resolutions. For example, in [7,8], the number of spectral or polarization channels is roughly proportional to the number of lenses, and increasing the number of lenses reduces the spatial resolution. The approaches proposed in this paper may compensate for the tradeoff by leveraging compressive sampling [10].
A constrained optimization technique that enforces sparsity of the object estimate in some basis is used for reconstruction. The reconstruction method is inspired by compressive sampling [10]. In compressive sampling, the system should satisfy the assumptions stated in section 2 for accurate reconstruction. The proposed system is compared to a theoretical baseline sensing system based on a Gaussian random sensing matrix. Several systems based on sparse reconstruction have been demonstrated and have shown promising results [11,12,13].
Section 2 provides a brief background on TOMBO and compressive sampling. Section 3 describes a general model for multi-dimensional TOMBO imaging. Section 4 presents examples of coding schemes. Simulation results are given in section 5.

TOMBO
In a simplified conceptual model, TOMBO consists of lenslets and a detector array as shown in Fig. 1. An imaging structure associated with a lenslet is called a unit [4]. Each unit produces a low-resolution (LR) image on the detector array.
When the number of units is $N_u \times N_u$ in a square arrangement, the focal length and the diameter of the lenslet need to be $N_u$ times smaller than those of the corresponding conventional full-aperture system to obtain the same field of view. This results in an LR image whose size is $N_u$ times smaller than that of an image produced by the full-aperture system. The thickness and depth-of-field of TOMBO are $N_u$ times shorter and $N_u^2$ times longer, respectively. This allows for compact hardware with a large depth-of-field. Objects are often assumed to be located within the depth-of-field, and the lenslets are assumed to be aberration-free [4,14]. These assumptions are made throughout this paper, unless otherwise stated.

Compressive sampling
The proposed system model in this paper forms an underdetermined linear system of equations as described in section 3. Compressive sampling (CS) is a theoretical framework for solving an underdetermined system [10,15]. The reconstruction method in this paper is inspired by CS.
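A linear system model can be written as
$$\mathbf{g} = \boldsymbol{\Phi}\mathbf{f} = \boldsymbol{\Phi}\boldsymbol{\Psi}\boldsymbol{\beta} = \boldsymbol{\Theta}\boldsymbol{\beta}, \tag{1}$$
where $\mathbf{g} \in \mathbb{R}^{N_g \times 1}$, $\boldsymbol{\Phi} \in \mathbb{R}^{N_g \times N_f}$, $\mathbf{f} \in \mathbb{R}^{N_f \times 1}$, $\boldsymbol{\Psi} \in \mathbb{R}^{N_f \times N_f}$, and $\boldsymbol{\beta} \in \mathbb{R}^{N_f \times 1}$ are a measurement vector, a sensing matrix, an object vector, a basis matrix, and a transform coefficient vector, respectively. $\mathbb{R}^{N_i \times N_j}$ denotes an $N_i \times N_j$ matrix of real numbers. We consider the underdetermined case where $N_g < N_f$.
The number of measurement components required for accurate reconstruction is given in [15] by an expression involving a constant $c$. According to CS theory [15], if $\boldsymbol{\Theta}$ satisfies the restricted isometry property (RIP, Eq. (2)), that number of measurements is sufficient with high probability to accurately estimate $\boldsymbol{\beta}$. An accurate estimate of the $s$ nonzero coefficients in $\boldsymbol{\beta}$ can be obtained by solving
$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \|\boldsymbol{\beta}\|_1 \quad \text{subject to} \quad \mathbf{g} = \boldsymbol{\Theta}\boldsymbol{\beta}, \tag{5}$$
where $\| \cdot \|_1$ denotes the $\ell_1$ norm.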
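As a small illustration of sparse recovery from an underdetermined system, the sketch below runs plain iterative shrinkage/thresholding (ISTA) on the unconstrained form $\min_{\beta} \frac{1}{2}\|g - \Theta\beta\|_2^2 + \tau\|\beta\|_1$. It is a stand-in for illustration only and is not the solver used in this paper. The Gaussian test matrix, the problem sizes, the value of `tau`, and the function names are all assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(theta, g, tau=1e-2, n_iter=5000):
    """Minimize 0.5 * ||g - theta @ beta||^2 + tau * ||beta||_1 with ISTA."""
    # Step size 1/L, where L is the squared spectral norm of theta.
    step = 1.0 / np.linalg.norm(theta, 2) ** 2
    beta = np.zeros(theta.shape[1])
    for _ in range(n_iter):
        grad = theta.T @ (theta @ beta - g)
        beta = soft_threshold(beta - step * grad, step * tau)
    return beta

# Hypothetical test problem: N_g = 64 measurements, N_f = 128 unknowns, s = 5.
rng = np.random.default_rng(0)
n_g, n_f, s = 64, 128, 5
theta = rng.standard_normal((n_g, n_f)) / np.sqrt(n_g)
beta_true = np.zeros(n_f)
support = rng.choice(n_f, size=s, replace=False)
beta_true[support] = rng.choice([-1.0, 1.0], size=s)
g = theta @ beta_true  # noiseless measurement

beta_hat = ista(theta, g)
rel_err = np.linalg.norm(beta_hat - beta_true) / np.linalg.norm(beta_true)
print(f"relative reconstruction error: {rel_err:.3f}")
```

The $\ell_1$ penalty biases the estimate slightly toward zero, so a small nonzero error is expected even without noise.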

A mathematical model for the proposed acquisition schemes
Let $F(x, y, z_0, \cdots, z_{N_n-1})$ denote a continuous density function representing a multi-dimensional object. $x$ and $y$ represent spatial dimensions, and $z_0, \cdots, z_{N_n-1}$ represent the other dimensions, which depend on the application. $x = 0$ and $y = 0$ are defined as the center of a detector array. For simplicity, the $y$ dimension is omitted. Extending to higher dimensions may be readily achieved with small modifications of the model.

Continuous model
In the proposed system, a multi-dimensional object is integrated onto detectors with one of two coding schemes, as demonstrated in Fig. 2. In one of the coded integrations, inspired by [6], an object is sheared by an optical element, and the sheared optical signal is integrated onto a detector array. In the shear transformation, each axial plane in an object is shifted along the $x$ axis as shown in Fig. 2(a). In [6], the shift corresponds to a parallax. In another coded integration, inspired by [5], an object is multiplied with a weight distribution, and the weighted optical signal is integrated onto a detector array as shown in Fig. 2(b). The weight distribution is a continuous function of $z$. In [5], the weight distribution corresponds to an exposure time.
The two schemes are referred to as sheared integration (SI) and weighted integration (WI), respectively.
We denote integrated data associated with the $u$-th unit as $G_u(v)$, where $v$ denotes the spatial dimension in a unit as shown in Fig. 1. In Eq. (6), $S_{n,u}(z_n)$ and $W_{n,u}(z_n)$ denote a shift in SI and a weight distribution in WI of the $z_n$ dimension in the $u$-th unit, respectively. $L_u$ is the center position of the $u$-th lenslet on the $v$ axis as shown in Fig. 1. For simplicity, $N_n = 1$ is assumed and the subscript $n$ is omitted, which reduces Eq. (6) to Eq. (7).

Discretization model
Eq. (7) is discretized into Eq. (8) using notation similar to that in [17], where a tilde indicates discrete data. $\tilde{G}$ is intermediate data before sampling by the detectors. $l$, $m$, and $i$ are integer variables of the $x$, $z$, and $v$ axes in the discretization model, respectively. $\Delta_x$ and $\Delta_z$ are the pixel pitches along the $x$ and $z$ axes in the discrete object. $\tilde{G}_u(i)$ is sampled by detectors. In the sampling step, $N_v$, $j$, $\Delta_v$, and $D_u$ are the number of detectors in a unit, an index for the detectors in a unit, the pixel pitch of the detectors, and the center of the center detector on the $v$ axis in the $u$-th unit, respectively. The measurement data $G$ can then be written using the floor function $\lfloor \cdot \rfloor$.
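The two codings can be sketched on a discrete grid as follows. This is a simplified illustration, not the discretization model of this paper: it assumes integer pixel shifts, circular boundary handling through `np.roll`, no lenslet offset, and no detector downsampling. The function names and the toy object are hypothetical.

```python
import numpy as np

def sheared_integration(obj, shifts):
    """SI sketch: shift axial plane obj[:, m] along x by shifts[m] pixels, then sum over m."""
    n_x, n_z = obj.shape
    out = np.zeros(n_x)
    for m in range(n_z):
        # Circular shift is an assumed boundary treatment.
        out += np.roll(obj[:, m], shifts[m])
    return out

def weighted_integration(obj, weights):
    """WI sketch: multiply axial plane obj[:, m] by weights[m], then sum over m."""
    return obj @ np.asarray(weights, dtype=float)

# Toy object: 8 spatial pixels, 3 axial planes, one point per plane at x = 2.
obj = np.zeros((8, 3))
obj[2, 0] = 1.0
obj[2, 1] = 2.0
obj[2, 2] = 4.0

si = sheared_integration(obj, shifts=[-1, 0, 1])
wi = weighted_integration(obj, weights=[1, 0, 1])
print(si)  # the three planes land on different pixels
print(wi)  # planes 0 and 2 add up on the same pixel, plane 1 is blocked
```

In the SI output the axial index is encoded as a position, while in the WI output it is encoded as the presence or absence of a plane in each unit.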

System matrix
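We assume that $\Delta_x = \Delta_v / N_u$ and $N_x = N_v N_u$, which are both typical assumptions in TOMBO imaging [4,5,6,14]. Thus, the numbers of elements in the measurement data and the object are $N_g = N_v N_u$ and $N_f = N_x N_z$, respectively, where $N_z$ is the number of axial planes. As indicated in Eq. (8), the $m$-th axial plane in $\tilde{F}$ is shifted by $\tilde{S}_u(m)$ and multiplied with $\tilde{W}_u(m)$. The matrix $\mathbf{C}_{m,u} \in \mathbb{R}^{N_x \times N_x}$ denotes this coding operation for the $m$-th axial plane in $\tilde{F}$ in the $u$-th unit, and $\mathbf{C}_{m,u}(p, q)$ is the $(p, q)$-th element in $\mathbf{C}_{m,u}$. $\mathbf{C}_u \in \mathbb{R}^{N_f \times N_f}$ represents the coding operation implemented by the TOMBO system on the object $\mathbf{f}$ ($\tilde{F}$ in vector form) in the $u$-th unit and is written as the block-diagonal matrix
$$\mathbf{C}_u = \mathrm{diag}\left(\mathbf{C}_{0,u}, \mathbf{C}_{1,u}, \cdots, \mathbf{C}_{N_z-1,u}\right).$$
The matrix $\mathbf{Q} \in \mathbb{R}^{N_x \times N_f}$, which sums all of the axial layers, is defined by
$$\mathbf{Q} = \begin{bmatrix} \mathbf{I} & \mathbf{I} & \cdots & \mathbf{I} \end{bmatrix},$$
where $\mathbf{I}$ denotes an $N_x \times N_x$ identity matrix.
The downsampling matrix $\mathbf{T} \in \mathbb{R}^{N_v \times N_x}$ can be defined by
$$\mathbf{T} = \begin{bmatrix} \mathbf{1}^T & \mathbf{0}^T & \cdots & \mathbf{0}^T \\ \mathbf{0}^T & \mathbf{1}^T & \cdots & \mathbf{0}^T \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0}^T & \mathbf{0}^T & \cdots & \mathbf{1}^T \end{bmatrix},$$
where $\mathbf{1}$ and $\mathbf{0}$ denote an $N_u \times 1$ vector whose elements are all 1 and an $N_u \times 1$ vector whose elements are all 0, respectively. A superscript $T$ indicates the transpose of a matrix. Therefore, the measurement data $G$ on the $u$-th unit is $\mathbf{T}\mathbf{Q}\mathbf{C}_u\mathbf{f}$: first, the object is coded in each unit by $\mathbf{C}_u$; second, the coded data is integrated onto a detector array by $\mathbf{Q}$; and last, the integrated data is downsampled by the detectors through $\mathbf{T}$. The sensing matrix $\boldsymbol{\Phi} \in \mathbb{R}^{N_g \times N_f}$ is expressed by
$$\boldsymbol{\Phi} = \begin{bmatrix} \mathbf{T}\mathbf{Q}\mathbf{C}_0 \\ \mathbf{T}\mathbf{Q}\mathbf{C}_1 \\ \vdots \\ \mathbf{T}\mathbf{Q}\mathbf{C}_{N_u-1} \end{bmatrix}.$$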
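The following sketch assembles a small one-dimensional sensing matrix from the three operators described above: per-unit coding, summation over axial planes, and detector downsampling. Several choices are assumptions made only for illustration: the object vector is ordered plane by plane, shifts are integer and circular, the lenslet offset is ignored, and all function names are hypothetical.

```python
import numpy as np

def downsampling_matrix(n_v, n_u):
    """T (n_v x n_v*n_u): each detector sums n_u adjacent fine-grid pixels."""
    return np.kron(np.eye(n_v), np.ones((1, n_u)))

def summation_matrix(n_x, n_z):
    """Q (n_x x n_x*n_z): sums the n_z axial planes, i.e. [I I ... I]."""
    return np.kron(np.ones((1, n_z)), np.eye(n_x))

def coding_matrix(n_x, shifts, weights):
    """C_u (n_f x n_f): block-diagonal, one weighted circular-shift block per axial plane."""
    n_z = len(shifts)
    c_u = np.zeros((n_x * n_z, n_x * n_z))
    for m in range(n_z):
        # Rolling the identity along axis 0 gives a matrix that shifts a vector by shifts[m].
        block = weights[m] * np.roll(np.eye(n_x), shifts[m], axis=0)
        c_u[m * n_x:(m + 1) * n_x, m * n_x:(m + 1) * n_x] = block
    return c_u

def sensing_matrix(n_v, n_u, shifts_per_unit, weights_per_unit):
    """Stack T @ Q @ C_u over all units u."""
    n_x = n_v * n_u
    n_z = len(shifts_per_unit[0])
    t = downsampling_matrix(n_v, n_u)
    q = summation_matrix(n_x, n_z)
    rows = [t @ q @ coding_matrix(n_x, s, w)
            for s, w in zip(shifts_per_unit, weights_per_unit)]
    return np.vstack(rows)

# Two units, two detectors per unit, two axial planes.
phi = sensing_matrix(
    n_v=2, n_u=2,
    shifts_per_unit=[[0, 1], [0, -1]],
    weights_per_unit=[[1.0, 1.0], [1.0, 0.0]],
)
print(phi.shape)  # (N_g, N_f) = (4, 8)
```

Building the matrix explicitly is only practical for small examples. For realistic sizes, the same three steps would more likely be applied as operators on arrays.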

Implementation of proposed acquisition schemes
The proposed coding schemes can be implemented for a wide array of practical applications. Each application would rely on some physical optical elements to implement the coding scheme expressed by Eqs. (6) or (7). In this section, we present examples of the coding scheme for spectral imaging and polarization imaging. Using similar schemes, physical coding strategies for range, time, spectrum, polarization, large dynamic range, and wide field-of-view may be available.
Physical codings for spectral imaging using SI and WI are illustrated in Fig. 3. SI for spectral imaging can be implemented by using dispersive elements (e.g., prisms). The elements in each unit have different dispersion directions as shown in Fig. 3(a). The dispersion results in different shifts for each spectral slice. In Eq. (7), $z$ represents the wavelength. The shift corresponds to $S_u(z)$.
WI for spectral imaging may be implemented with multi-band pass filters placed above or below the lenslet as shown in Fig. 3(b). Each of the filters has different pass-bands. Pass-bands and stop-bands are represented with $W_u(z) = 1$ and $W_u(z) = 0$ in Eq. (7), respectively. A stack of bandstop filters or a patch of bandpass filters may be used to substitute for the multi-band pass filter.
Figure 4 shows a conceptual diagram for polarization imaging with the proposed codings. SI for polarization imaging may be performed with birefringent linear polarizers [18]. The elements split an incident ray into two polarized rays. Hence, an image at each polarization angle is shifted. Each unit has a different shift for each polarization angle as shown in Fig. 4(a). Here, $z$ represents a linear polarization angle. The shift corresponds to $S_u(z)$ in Eq. (7).
WI for polarization imaging may be performed with polarization plates. Polarization plates with different linear polarization angles are placed above or below the lenslet as shown in Fig. 4(b). The weight distribution is expressed as $W_u(z) = \cos^2(P_u - z)$ [18], where $P_u$ is the polarization angle in the $u$-th unit. A patch of polarization plates, where each plate has a different polarization angle, allows flexibility in the design of a weight distribution.
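As a quick numerical illustration of the polarization weight distribution $W_u(z) = \cos^2(P_u - z)$, the sketch below tabulates the weights for four units. The plate angles and the sampled object angles are assumed values chosen for the example, and the function name is hypothetical.

```python
import numpy as np

def polarizer_weights(p_u_deg, z_deg):
    """Return W[u, k] = cos^2(P_u - z_k) for plate angles P_u and object angles z_k (degrees)."""
    p = np.deg2rad(np.asarray(p_u_deg, dtype=float))[:, None]
    z = np.deg2rad(np.asarray(z_deg, dtype=float))[None, :]
    return np.cos(p - z) ** 2

# Assumed example: four units with plates at 0, 45, 90, and 135 degrees.
weights = polarizer_weights([0, 45, 90, 135], [0, 45, 90, 135])
print(np.round(weights, 3))
```

Each row is the weight distribution of one unit over the sampled polarization angles. Unlike the binary spectral filters, these weights vary smoothly with angle.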

Simulation of the proposed concept
The concept of multi-dimensional TOMBO imaging was verified through application-independent simulations. These general simulations could readily be modified for a specific application like those mentioned in the previous section.
A method called the two-step iterative shrinkage/thresholding algorithm (TwIST) [19] was used for reconstruction. TwIST is an iterative convex optimization algorithm that uses two previous estimates to improve convergence properties for the problem described by Eq. (5). For simplicity, a shift in SI was assumed as $S_u(z) = (A_u z + B_u)\Delta_x$. $A_u$ and $B_u$ are a gradient and a bias, respectively, of the shear transformation in the $u$-th unit, defined as $A_u = (-2u/(N_u - 1) + 1)A_0$ and $B_u = -A_u N_z \Delta_z / 2$. For example, $A_0 = 1.0$ with $N_u = 3$ gives $A_0 = 1.0$, $A_1 = 0.0$, and $A_2 = -1.0$. A shift at the center axial plane, where $z = N_z \Delta_z / 2$, is $S_u(z) = 0.0$. $A_u$ and $B_u$ of the $y$ axis are the same as those of the $x$ axis. A weight distribution in WI was assumed to be a binary pattern. In the $m$-th axial plane, $h$ units were set as $W_u(m\Delta_z) = 1$ in Eq. (7). The $h$ units were randomly chosen, while the other $N_u^2 - h$ units were set as $W_u(m\Delta_z) = 0$. In this case, the maximum number of separable axial planes is fixed at ${}_{N_u^2}C_h$. A lenslet's position $L_u$ in Eq. (8) was randomly set in each unit. The range was $[-\Delta_v/2, \Delta_v/2]$, where $\Delta_v$ is the pixel pitch of the detectors. The position of the center detector in a unit is $D_u = 0$.
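The shear parameters and binary weight patterns described above can be generated as in the sketch below. The function names, the unit pixel pitch `delta_z = 1.0`, and the random seed are assumptions for illustration. The shear formula needs at least two units per axis.

```python
import math
import numpy as np

def shear_parameters(n_u, a_0, n_z, delta_z=1.0):
    """Gradient A_u and bias B_u of the shear in each unit u = 0, ..., n_u - 1 (n_u >= 2)."""
    u = np.arange(n_u)
    a = (-2.0 * u / (n_u - 1) + 1.0) * a_0
    b = -a * n_z * delta_z / 2.0
    return a, b

def binary_weights(n_units, n_z, h, rng):
    """For each axial plane, set h randomly chosen units to 1 and the others to 0."""
    w = np.zeros((n_units, n_z))
    for m in range(n_z):
        chosen = rng.choice(n_units, size=h, replace=False)
        w[chosen, m] = 1.0
    return w

# Shear example from the text: A_0 = 1.0 with N_u = 3.
a, b = shear_parameters(n_u=3, a_0=1.0, n_z=4)
print(a)  # gradients for u = 0, 1, 2

# Weight example: 2 x 2 units (4 in total), h = 3 units open per plane.
rng = np.random.default_rng(0)
w = binary_weights(n_units=4, n_z=8, h=3, rng=rng)

# Number of distinct on/off patterns over the units, C(N_u^2, h).
n_patterns = math.comb(4, 3)
print(n_patterns)
```

Because patterns are drawn independently for each plane, two planes can receive the same pattern when the number of planes approaches the number of distinct patterns. Those planes would then not be separable by WI alone.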
Figure 5 shows a simulation of four-dimensional data acquisition using the two TOMBO coding schemes. An object whose size is $128 \times 128 \times 4 \times 2$ and measurement data whose size is $128 \times 128$ are shown in Figs. 5(a) and 5(b). The compression ratio is 8, which is calculated as $N_f / N_g$, where $N_f$ and $N_g$ are the numbers of elements in an object and measurement data, respectively. In Fig. 5, the object and the simulation results are reshaped to $128 \times 128 \times 8$ for display. SI with $A_0 = 1.0$ and WI with $h = 3$ were used for the $z_0$ and $z_1$ axes, respectively. The measurement signal-to-noise ratio (SNR) in the presence of additive white Gaussian noise and the number of units were 30 dB and $2 \times 2$, respectively. The object estimate sparsity in gradients was enforced using the total variation (TV) [20]. Two-dimensional TV was applied independently for each axial plane as the sum of $|\nabla[\cdot]_{l_x,l_y}|$ over all $l_x$ and $l_y$, where $l_x$ and $l_y$ are indices for the $x$ and $y$ axes in a discrete object, $\nabla[\cdot]_{l_x,l_y}$ is a two-dimensional gradient vector for the $x$ and $y$ directions, and $|\cdot|$ denotes the magnitude of the gradient vector. The object consists of multiple Shepp-Logan phantoms, which are sparse in the two-dimensional TV domain. The total number of non-zero gradient values was $s = 3242$. The reconstruction results with TwIST and the Richardson-Lucy method (RL) [21,22] are compared in Figs. 5(c) and 5(d). Their peak signal-to-noise ratios (PSNR) were 32.1 dB and 19.4 dB, respectively. The PSNR is found by computing $20 \log_{10}(\mathrm{MAX}/\sqrt{\mathrm{MSE}})$, where MAX and MSE represent the maximum of the signal values and the mean squared error between two signals, respectively [23].
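The two quantities used in this simulation, the per-plane total variation and the PSNR, can be computed as in the sketch below. Two details are assumptions rather than statements of this paper: gradients are forward differences with a zero difference at the far boundary, and MAX is taken from the reference signal. The function names are hypothetical.

```python
import numpy as np

def total_variation_2d(plane):
    """Isotropic TV of one axial plane: sum of gradient magnitudes (forward differences)."""
    dx = np.diff(plane, axis=0, append=plane[-1:, :])
    dy = np.diff(plane, axis=1, append=plane[:, -1:])
    return float(np.sum(np.sqrt(dx ** 2 + dy ** 2)))

def total_variation_per_plane(obj):
    """Apply the 2-D TV independently to each axial plane obj[:, :, m] and add the results."""
    return sum(total_variation_2d(obj[:, :, m]) for m in range(obj.shape[2]))

def psnr(reference, estimate):
    """PSNR in dB: 20 * log10(MAX / sqrt(MSE)), with MAX taken from the reference."""
    mse = np.mean((reference - estimate) ** 2)
    return float(20.0 * np.log10(np.max(reference) / np.sqrt(mse)))

# A vertical step edge in a 4 x 4 plane crosses four rows, so its TV is 4.
plane = np.zeros((4, 4))
plane[:, 2:] = 1.0
tv_step = total_variation_2d(plane)

# A constant error of 0.1 on a signal with maximum 1 gives 20 dB.
reference = np.ones((4, 4))
psnr_db = psnr(reference, reference + 0.1)
print(tv_step, round(psnr_db, 2))
```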
CS object reconstruction accuracy may be estimated using a correlation between columns of $\boldsymbol{\Theta}$, which is the product of a sensing matrix $\boldsymbol{\Phi}$ and a basis matrix $\boldsymbol{\Psi}$ in Eq. (1) [24]. When the correlation between two columns in $\boldsymbol{\Theta}$ is high, it is difficult to resolve the two components in the transform coefficient vector $\boldsymbol{\beta}$ corresponding to those columns. Therefore, the reconstruction accuracy depends not only on $\boldsymbol{\Phi}$ but also on $\boldsymbol{\Psi}$.
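One simple way to quantify the column correlation discussed above is the largest absolute normalized inner product between distinct columns. The sketch below computes that quantity. The choice of this particular measure, the function name, and the test matrices are assumptions for illustration, not the analysis performed in this paper.

```python
import numpy as np

def max_column_correlation(theta):
    """Largest absolute normalized inner product between two distinct columns of theta."""
    norms = np.linalg.norm(theta, axis=0)
    # Leave all-zero columns unscaled to avoid division by zero.
    unit = theta / np.where(norms > 0, norms, 1.0)
    gram = np.abs(unit.T @ unit)
    np.fill_diagonal(gram, 0.0)  # ignore each column's correlation with itself
    return float(gram.max())

rng = np.random.default_rng(0)
gaussian = rng.standard_normal((64, 128))

# Two parallel columns cannot be told apart from the measurements.
degenerate = np.array([[1.0, 2.0, 0.0],
                       [0.0, 0.0, 1.0]])

print(max_column_correlation(gaussian))    # well below 1 for a random matrix
print(max_column_correlation(degenerate))  # equal to 1: columns 0 and 1 are parallel
```

A value close to 1 indicates that at least two coefficients produce nearly identical measurements and are therefore hard to resolve.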
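When a two-dimensional basis is used for each axial plane as in the previous simulation, the reconstruction accuracy along the axial direction in an object estimate may be roughly predicted based on the correlation between columns of $\boldsymbol{\Phi}$ corresponding to two axial planes.
From Eqs. (1), (13), (14), and the assumption of a two-dimensional basis, the sensing matrix, the basis matrix, and $\boldsymbol{\Theta}$ can be rewritten as
$$\boldsymbol{\Phi} = \begin{bmatrix} \boldsymbol{\phi}_0 & \boldsymbol{\phi}_1 & \cdots & \boldsymbol{\phi}_{N_z-1} \end{bmatrix},$$
$$\boldsymbol{\Psi} = \begin{bmatrix} \boldsymbol{\psi} & \mathbf{O} & \cdots & \mathbf{O} \\ \mathbf{O} & \boldsymbol{\psi} & \cdots & \mathbf{O} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{O} & \mathbf{O} & \cdots & \boldsymbol{\psi} \end{bmatrix},$$
$$\boldsymbol{\Theta} = \begin{bmatrix} \boldsymbol{\phi}_0\boldsymbol{\psi} & \boldsymbol{\phi}_1\boldsymbol{\psi} & \cdots & \boldsymbol{\phi}_{N_z-1}\boldsymbol{\psi} \end{bmatrix},$$
respectively, where $\boldsymbol{\phi}_m$ is the block of columns of $\boldsymbol{\Phi}$ corresponding to the $m$-th axial plane, and $\boldsymbol{\psi} \in \mathbb{R}^{N_x \times N_x}$ and $\mathbf{O} \in \mathbb{R}^{N_x \times N_x}$ are a two-dimensional basis matrix for each axial plane and an $N_x \times N_x$ zero matrix. If the correlation between a column in $\boldsymbol{\phi}_m$ and that on another axial plane is high, then the corresponding correlation between a column in $\boldsymbol{\phi}_m\boldsymbol{\psi}$ and that on another axial plane may be high. In this case, it is difficult to resolve the axial planes.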
When $|A_0|$ is small or $h$ is large, the correlation between a column in $\boldsymbol{\phi}_m$ and that on another axial plane is high. For example, Fig. 5(e) shows a reconstruction result where SI with $A_0 = 0.2$ was used for the $z_0$ axis. The reconstruction accuracy along the axial direction with $A_0 = 0.2$ was lower than that with $A_0 = 1.0$.
Figure 6 shows another simulation of five-dimensional data acquisition with the discrete wavelet transform (DWT). The sizes of the object in Fig. 6(a) and the measurement data in Fig. 6(b) were $128 \times 128 \times 2 \times 2 \times 2$ and $128 \times 128$. The compression ratio is 8. Two-dimensional DWT was applied for each axial plane. The object consists of multiple natural images where the small coefficients in two-dimensional DWT were truncated. In the two-dimensional DWT domain, the total number of non-zero DWT coefficients across all the planes was $s = 2000$. SI with $A_0 = 3.0$, WI with $h = 12$, and WI with $h = 12$ were used for the $z_0$, $z_1$, and $z_2$ axes, respectively. The measurement SNR and the number of the units were 30 dB and $4 \times 4$, respectively. The reconstruction results with TwIST and RL are compared in Figs. 6(c) and 6(d). Their PSNRs were 24.5 dB and 15.4 dB, respectively. Figure 6(e) shows a reconstruction result where WI with $h = 15$ was used for the $z_1$ and $z_2$ axes. The reconstruction accuracy along the axial direction with $h = 15$ was lower than that with $h = 12$.
Figure 7 illustrates the sensitivity of the reconstructions to noise as a curve relating the measurement SNR to the reconstruction PSNR. The performance is also compared to that of an ideal Gaussian random compressive sensing matrix, which is known to require an (optimally) small number of measurements to satisfy the RIP compared to what the proposed systems would require. Because the proposed systems usually have a worse RIP, meaning that more measurements are required to obtain higher reconstruction accuracy, they show lower reconstruction accuracy than the Gaussian random matrix. However, such random sensing matrices would in general be very difficult to implement physically with current technology. In addition, it is not clear how such random sensing systems could provide the physical compactness and snapshot acquisition functionality that are benefits of the proposed approach.

Conclusions
We proposed a generalized sampling approach for multi-dimensional object acquisition using TOMBO.The sampling uses multi-dimensional sheared and weighted integration in each unit.The mathematical model and some examples of the proposed measurement approach were presented.The simulation demonstrated reconstruction of an object with the number of elements totaling eight times that of the measurement data.A method inspired by compressive sampling was used in the reconstruction.These schemes enable us to acquire a multi-dimensional object with a single two-dimensional measurement by a compact imaging system.Also, these schemes extend abilities of compound-eye imaging systems to various applications.A useful avenue for future work is to analyze theoretical properties of the proposed systems.It would be interesting to see how many more measurements would be required in general for the proposed systems to produce a certain accuracy, which is related to the validity of the sparsity assumption in the proposed systems.Also, it would be very useful to find a more efficient sparsity transformation that provides a better RIP and a better sparse representation of the objects of interest.Furthermore, we plan to investigate other coding schemes that may provide a better RIP overall to better exploit the sparsity assumption.

Fig. 1. Cross-section view of TOMBO. $v$, $O_u$, and $L_u$ are the spatial dimension, the center position, and the position of a lenslet in the $u$-th unit, respectively.

Fig. 4. Top views of TOMBO for polarization imaging with (a) SI and (b) WI. Arrows, dots, circles, and shaded areas indicate directions of polarization, centers of shifted images, lenslets, and polarization plates, respectively.