CMOS Approach to Compressed-domain Image Acquisition

A hardware implementation of a real-time compressed-domain image acquisition system is demonstrated. The system performs front-end computational imaging, whereby the inner product between an image and an arbitrarily specified mask is implemented in silicon. The acquisition system is based on an intelligent readout integrated circuit (iROIC) that is capable of providing independent bias voltages to individual detectors, which enables implementation of spatial multiplication with any prescribed mask through a bias-controlled response-modulation mechanism. The modulated pixels are summed in the image grabber to generate the compressed samples, namely aperture-coded coefficients, of an image. A rigorous bias-selection algorithm, which exploits the bias-dependent nature of the imager's responsivity, is presented for the readout circuit. The functionality of the hardware is demonstrated in transform-coding compressed image acquisition, silicon-level compressive sampling, in-pixel nonuniformity correction, and hardware-level implementation of region-based enhancement.


Introduction
Dramatic advances in the field of computational and medical imaging over the past decades have enabled many critical applications such as night vision, medical diagnosis, quality control, and remote sensing [1][2][3][4][5]. The increasing demand for image quality and fidelity requires a higher pixel count and a sophisticated post-processing mechanism to efficiently store, transmit, and analyze the resulting volume of data [6][7][8][9]. There is an inherent trade-off between the generation of big data by such imaging systems and the efficient extraction of useful information within real-time constraints, which limits the efficacy of such sensors in real-time decision-making systems [10,11]. The traditional imaging system is burdened by the acquisition, transmission, and storage of excess data that bear redundant information for the given application of interest [12][13][14][15][16]. Storing or transmitting the extra information requires high bandwidth and extra power. Similarly, post-processing imposes extra latency and requires additional power, which is troublesome for many low-power, real-time applications and portable devices [17].
There is a need to address this problem by intelligently acquiring a limited but most important set of data and processing the abstracted information. This, in turn, requires the ability to perform computations at the pixel level, within the readout integrated circuit, at the front end of the imager [18].
In pursuit of efficient computational imaging hardware that addresses memory efficiency, low power consumption, and minimal latency requirements, we demonstrate CMOS-based imaging hardware [19] that supports compression at acquisition time [20], inside the pixel. Figure 1(a) shows a block diagram of the long-established typical imaging system, and our alternative approach, compressed-domain imaging, is shown in Fig. 1(b). The proposed approach integrates the post-processing into the acquisition, which results in lower latency and reduced power consumption. In Section 2, we discuss some background and prior work in the area of sensor-level compression. Section 3 covers our proposed compressed-domain imaging hardware and the photodetector embedded in the design. The experimental setup is explained in Section 4. Different applications, including nonuniformity correction and compressive sensing, are discussed in Sections 6 and 7, and the experimental results are also presented. Finally, we outline conclusions and future work in Section 9.

Background and previous work
For a typical image sensor, imaging involves reading out the values sampled at different pixels [21], whereas in the case of compressed-domain hardware, a set of gain matrices is loaded to the pixel array, and the image sensor's output is a linear combination of the projections of the object's reflectance function onto the gain matrices [16,22]. In the following paragraphs, we compare a few other works that have been devoted to the problem of online compression and hardware-domain sensing based on matrix projection.
One of the earliest reported hardware implementations of compressive sensing is based on a single-pixel camera [23]. Single-pixel imaging utilizes a digital micromirror (DMM) [24] to project the incident light coming from the object onto the digital masks. The photodetector samples the integrated light coming from the sample, which is modulated by the DMM. This method is usually used for far-infrared imaging, where an array of low-cost, small photodetectors is not feasible. The DMM degrades the sensitivity of the imager, and the alignment of the different components limits the scaling of this method.
Optical-domain compressive sensing based on coded apertures is demonstrated in [25]. A random phase mask injects the measurement matrices, and the modulated intensities at different pixels are sampled using a low-resolution imager. This technique suffers from the noise added by the optical masks, and the complexity of the alignment setup is a major challenge.
A CMOS imager is demonstrated in [26] that utilizes a flip-flop-based shift register distributed over the pixel array to hold the random digital patterns. The shift register selectively disconnects the pixels from the readout and thereby implements the measurement matrices. That hardware offers multiplication only by a binary value, which limits the compressive-sensing algorithm to binary projection matrices composed only of ones and zeros. Furthermore, there is no control over the bias voltage of the detectors, so many features offered by modulation at the detector level are not supported. Finally, because the unitcell does not support integration, that hardware cannot work with detectors of lower quantum efficiency.
Figure 2 presents our proposed monolithic CMOS image sensor, which can run as a stand-alone image sensor and is able to perform spatiotemporal region-of-interest enhancement. The hardware is also capable of generating already-compressed images as well as canceling the nonuniformity that results from process variation or from other sources, such as a voltage drop across the image sensor chip. The main contribution of this hardware is the introduction of control over a per-pixel modulation factor by controlling the photodetector's responsivity, which is depicted as a controllable gain symbol in the pixels. The capacitor represents the analog memory that is embedded to store and hold the bias information for individual pixels. The AND gate selectively enables different pixels to load the bias voltage to the active pixel, and this selection occurs at the same time that the pixel is being read out; therefore, no delay penalty is associated with the new design. While the integrated voltage is sampled onto the sample-and-hold (S&H) capacitor, voltage Vref is used as a global reference voltage for all of the preamplifiers. This prevents the bias voltage from showing up in the readout and makes the readout value meaningful. During the readout, the bias information loaded to different pixels can differ from pixel to pixel and also from the bias that was loaded to the same pixel in the previous frame. This is what we refer to as the spatiotemporal independence of the pixel biasing scheme.
The proposed hardware has the unique feature of performing application-specific transform coding based on a specialized set of bias masks. These sets of bias masks are dictated by a rigorous bias-selection algorithm and are then stored in the memory of the device. The incoming image data are projected onto the designated masks to generate the code words used for image reconstruction. Most importantly, our proposed bias-selection algorithm, which has not been reported in the literature, considers the responsivity of the device, resulting in remarkably less reconstruction error. We discuss the detailed implementation of the iROIC in the next section.

Design of the pixel
Implementation of a compressed-domain imaging system requires a means to project the object's reflectance function onto the gain matrices. We have approached this problem by embedding fine control over the operating voltage of each individual pixel's detector. The current hardware is designed with an array of n+/nwell/psub detectors that is laid out along with the rest of the readout integrated circuit in silicon. The fill factor of the detector is 8.4%. A cross section of the photodetector is shown in Fig. 3(a).
[Fig. 3 caption, partially recovered: ... (c) the same measured results scaled to one. In this experiment, a green LED is used as the illumination source, and the dimension of the photodetector is 100 µm × 100 µm.]
The graph in Fig. 3(b) shows the measured photocurrent of the n+/nwell/psub photodetector at six different illumination levels and in the dark. A green LED is used as the light source in this experiment, and the intensity is modulated by controlling the injection current. The LED is placed almost 40 cm away from the detector, meaning that the intensity of illumination at the scale of the detector and the optical power meter can be approximated as uniform. The illumination intensity is measured simultaneously, and the reported optical power is scaled to the area of the detector. As seen in Fig. 3(b), because the photoresponse is a function of both the bias voltage and the intensity of the light, one can load the projection matrix to the pixel array and acquire the image while the pixels operate at different response-modulation factors. The graph in Fig. 3(c) shows the same measured data normalized to one. Because the normalized curves approximately overlap, we can state that modulating the bias voltage scales the measured photocurrent. This leads to many applications, which are discussed in the following sections.
Table 1 briefly compares different possible configurations for the preamplifier stage of the unitcell. Because the capacitive trans-impedance amplifier (CTIA) provides the best performance in terms of precise control over the detector's bias voltage, and also provides high injection efficiency, a large voltage swing, and good charge storage, we have selected this configuration as the basis for the preamplifier. In the proposed unitcell, the conventional CTIA configuration is augmented with the ability to control each individual pixel's bias voltage. Here, we briefly explain the process that is followed to operate the compressed-domain imaging. The readout mechanism is also demonstrated in Fig. 4(d):
1. The bias control circuit is composed of an analog switch, SWBias, that is enabled when the row-select and column-select signals address the pixel; the analog memory is then loaded with the bias voltage.
2. During the integration, the bias is held at the analog memory. Both the SWBias and SWRef switches are off for the entire integration time to protect the CBias capacitor from charging.
3. At the end of the integration, the SWRef switch is enabled to set the same reference voltage for all of the pixels and to make the sample's value meaningful.
To provide a high voltage swing range, the chip was fabricated by Taiwan Semiconductor Manufacturing Co. (TSMC) in the CL035HV process technology node, a standard CMOS process that supports four metal layers, two poly layers, and two different voltage domains. The metal layers serve as the interconnection between various devices; the poly layers are used for the gates of the transistors and to form the inter-poly capacitors. The high-voltage domain is used for the unitcells to support a high voltage swing; the low-voltage domain is employed for the row-select and column-select circuitry, resulting in higher integration and lower power consumption. The minimum feature size of the devices at the CL035HV process node is 1.5 µm for the transistors in the high-voltage domain and 0.35 µm for those in the low-voltage domain. A major challenge in the design of this circuit was the trade-off between the number of functionalities and the pixel area. To comply with the pitch of standard focal plane arrays (FPAs), we decided to restrict the unitcell to 30 µm × 30 µm. The area constraint forced us to keep all the switches at the minimum size supported by the technology node. This minimum feature size of 1.5 µm is still large enough to neglect leakage currents, which are dominant mainly in submicron or deep-submicron devices. All of the switches are based on a single NMOS transistor. The rest of the area was equally divided between the capacitors to achieve the highest possible resolution for the output image data. In total, the unitcell is composed of seven transistors for the dual-stage differential amplifier and eight transistors for the rest of the unitcell circuitry. The unitcell also includes four capacitors that serve as the compensation, integration, sample-and-hold, and bias-voltage holder capacitors.
To build a model of the response-modulation function of the imager, the response of the system to a uniform level of illumination is measured at different bias voltages. The normalized photoresponse of the imager is shown in Fig. 5. In the error-bar graph, the mean and standard deviation are computed over all the pixels in the entire 96 × 96 frame, and each measurement was repeated 10 times to reduce random noise. The mean value and the standard deviation shown in this figure are employed as the basis for bias selection in a real-time system. The curve implies that the system responds to the bias voltage in a semi-linear fashion as long as the detector's bias voltage is limited to ∼[+0.4, +3.5]. Although we have considered n+/nwell/psub photodetectors as a means to exploit compressed-domain image acquisition, the circuit would work with any detector whose nominal operating voltage and current fit the specification of the designed readout integrated circuit. Additionally, we have embedded extra knobs, such as the bias current of the preamplifier, the integration time, and the readout clock speed, that are set from outside the chip. These knobs can be employed to optimize the operating point of the system.
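As a minimal sketch of how the mean response curve could serve as a lookup for bias selection, the snippet below inverts a tabulated curve by piecewise-linear interpolation. The table values, the names `BIAS_V` and `MEAN_RESPONSE`, and the function `bias_for_modulation` are hypothetical and chosen only for illustration; they are not measured data from the chip, and the lookup ignores the per-bias standard deviation.

```python
from bisect import bisect_left

# Hypothetical calibration table (illustrative values, not measured data):
# bias voltage versus mean normalized photoresponse, monotonically increasing.
BIAS_V = [0.4, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
MEAN_RESPONSE = [0.10, 0.27, 0.42, 0.56, 0.71, 0.86, 1.00]


def bias_for_modulation(target, bias=BIAS_V, response=MEAN_RESPONSE):
    """Invert the mean response curve by piecewise-linear interpolation."""
    # Clamp targets that fall outside the calibrated range.
    if target <= response[0]:
        return bias[0]
    if target >= response[-1]:
        return bias[-1]
    hi = bisect_left(response, target)  # first index with response[hi] >= target
    lo = hi - 1
    frac = (target - response[lo]) / (response[hi] - response[lo])
    return bias[lo] + frac * (bias[hi] - bias[lo])


if __name__ == "__main__":
    for factor in (0.05, 0.56, 0.635, 1.2):
        print(factor, "->", round(bias_for_modulation(factor), 3), "V")
```

Targets outside the calibrated range are clamped to the end points, which mirrors the restriction of the bias to the semi-linear operating region.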

Experimental setup
In the implemented hardware, the timing signals and the analog bias for the photodetectors are generated using a Raspberry Pi board (RPB). The main reason for choosing the RPB as the main controller is its extended support for on-board memory in the form of a micro-SD card. Typical FPGAs do not support high-volume storage, which makes storing the massive bias information challenging. A DAC converts the digital bias values to analog and then feeds them to the iROIC. The output video signal is sampled using an ADC chip, which is driven by the RPB. The sampled data are sent to a remote computer for online monitoring and are also stored in the local memory of the controller to be processed later. The RPB acts as a stand-alone controller for the iROIC and handles all image acquisition details. A custom PCB is designed to host the test chip, to interface with the RPB, and to deliver high signal integrity. The RPB is controlled from a desktop over LAN, and test vectors are loaded using standard Linux commands such as rsync, ssh, and scp. A block diagram of the experimental setup is shown in Fig. 6(b). The control over the bias information of every pixel's detector and the flexibility offered by the experimental setup have enabled many different applications, which are explained in the following sections.

Nonuniformity correction
The pixels are designed to maximize the sensitivity to the photoresponse. However, the overall performance of the sensor is limited by noise, which comes from many different sources and contributes to the output signal. Random noise is a temporal variation in the signal that changes over time, from frame to frame. This type of noise, which is hard to predict, has a statistical distribution and can be reduced statistically by means of averaging [27,28].
On the other hand, pattern noise is the spatial variation in the photoresponse of different pixels while they are exposed to uniform illumination. This type of noise is fixed over time and cannot be reduced by averaging.
Pattern noise stems from variations in the growth or fabrication of the photodetectors. Differences in the driving and sampling circuitry or variations in power distribution also result in deviations in responsivity in the form of pattern noise [29].
Pattern noise is composed of fixed pattern noise (FPN) [30,31] and photo response nonuniformity (PRNU) components [32,33]. The FPN is measured in the absence of illumination and is a result of variations in growth, detector dimension, doping concentrations, fabrication defects, characteristics of transistors (VT, gm, W, L, etc.) [34,35], or nonuniformity in the distribution of power [36]. Additionally, at high-speed readout, the differences between the resistance and capacitance seen at the output of different unitcells can also cause nonuniformity. The second component of pattern noise, PRNU, is a function of illumination and varies with the dimension of the photodetector, the doping concentration, and the color of the light incident on the detector [37].
Nonuniformity correction is an important topic under investigation and deals with processing inconsistencies that lead to unfavorable pattern noise. Independent of the source of the nonuniformity, it can be corrected using single-point calibration, two-point calibration [38,39], or scene-based nonuniformity correction [40].
Because pattern noise does not change with time, it can be canceled by proper biasing of the circuit. We have used a two-point nonuniformity correction to calibrate the responsivity of the image sensor. Two different uniform illuminations are used as the calibration points; as a result, an offset and a gain are calculated for each pixel and are employed to correct the photoresponses read from that pixel. This method has the extra benefit that if the nonuniformity grows with temperature, it will offer a better correction. The mathematical formulation of the correction algorithm we used is given below [41]. The linear model of the imaging device is estimated by:
$$Y_{ij}(k) = a_{ij}\,X_{ij}(k) + b_{ij}, \tag{1}$$
where $X_{ij}(k)$ is the actual object's reflection function, which is incident to the image sensor, and the observed pixel value is given by $Y_{ij}(k)$. Variable k is the frame index, and the gain and offset of the (i, j)th detector are denoted by $a_{ij}$ and $b_{ij}$, respectively. Here, nonuniformity correction is carried out by means of a linear transformation of the observed pixel values $Y_{ij}(k)$. The goal is to provide an estimate of the true intensity $X_{ij}(k)$ so that all of the detectors appear to be performing uniformly. The correction is given by:
$$\hat{X}_{ij}(k) = g_{ij}\,Y_{ij}(k) + o_{ij}, \tag{2}$$
where $g_{ij}$ and $o_{ij}$ are the gain and offset of the linear correction model of the (i, j)th detector.
After we estimate the parameters $a_{ij}$ and $b_{ij}$, or $g_{ij}$ and $o_{ij}$, the NUC can be achieved as per Eq. (2), for which we computed the corrected bias to be applied from the responsivity graph. In Fig. 7(a), we demonstrate an image of a white paper, which is taken at uniform biasing for all of the pixels. Although the bias information is uniform, the pixels' response across the image varies because of the nonuniform illumination, weakly sensitive pixels, and other sources of fixed pattern noise. The histogram of the uniformly biased image is flat: for the given nonuniform illumination, the camera produces an image with a wide range of pixel intensity levels, whereas our NUC method results in a narrow histogram, as shown in Fig. 7(d). The point here is that the hardware is able to cancel the integrated nonuniformity that originates in the pixels, in the ROIC, and also in the illumination.
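The two-point calibration described above can be sketched in a few lines. This is an illustrative software version under stated assumptions: the target levels are taken to be the array-wide mean responses at the two calibration points, the detector gains `A` and offsets `B` are made-up values, and the correction is applied numerically to the readout rather than through the detector bias as done on the chip. All function names are hypothetical.

```python
def two_point_calibration(low_frame, high_frame):
    """Per-pixel gain and offset maps from two uniform-illumination frames."""
    count = len(low_frame) * len(low_frame[0])
    # Target levels: the array-wide mean response at each calibration point.
    mean_low = sum(map(sum, low_frame)) / count
    mean_high = sum(map(sum, high_frame)) / count
    gains, offsets = [], []
    for row_low, row_high in zip(low_frame, high_frame):
        # A dead pixel (high == low) would divide by zero; not handled here.
        gain_row = [(mean_high - mean_low) / (hi - lo)
                    for lo, hi in zip(row_low, row_high)]
        offset_row = [mean_low - g * lo for g, lo in zip(gain_row, row_low)]
        gains.append(gain_row)
        offsets.append(offset_row)
    return gains, offsets


def apply_correction(frame, gains, offsets):
    """Linear correction: corrected = gain * observed + offset, per pixel."""
    return [[g * y + o for y, g, o in zip(row, g_row, o_row)]
            for row, g_row, o_row in zip(frame, gains, offsets)]


def observe(intensity, a, b):
    """Simulated sensor: every pixel sees the same intensity (linear model)."""
    return [[a_ij * intensity + b_ij for a_ij, b_ij in zip(a_row, b_row)]
            for a_row, b_row in zip(a, b)]


# Hypothetical 2 x 2 detector gains and offsets (illustrative values).
A = [[0.9, 1.1], [1.0, 1.2]]
B = [[5.0, 2.0], [8.0, 1.0]]
GAINS, OFFSETS = two_point_calibration(observe(10.0, A, B), observe(50.0, A, B))
RAW = observe(30.0, A, B)
CORRECTED = apply_correction(RAW, GAINS, OFFSETS)
```

For a purely linear pixel model, any uniform scene between the two calibration levels is mapped to a single value across the array, which corresponds to the narrowing of the histogram reported for the corrected image.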
The nonuniformity correction also aided in the fine-tuning of the responsivity curves. Because the responsivity is based on the calibration of pixels under different bias and lighting conditions, enabling nonuniformity correction before this calibration process yielded a uniform responsivity across all of the pixels and made it less sensitive to noise. This also guaranteed that the SNR of the responsivity is above a certain threshold, which enabled the bias-selection technique to achieve superior performance, as discussed in the results.

Compressed-domain image acquisition
The most important application of the chip is compressed-domain imaging. The compression is achieved in hardware by projecting the image onto a set of basis masks implemented in the detectors' biases. We have considered two different in-hardware compression modalities: in-pixel discrete-cosine-transform (DCT) based compressed-domain image acquisition and a compressive sensing framework [42,43].
To implement the compression modalities in hardware, we need to adapt the compressive masks as per device responsivity so that we ensure the mask coefficients are exactly achievable as modulation factors at the pixels.

Discrete cosine transform
In this part, we present the mathematical formulation for compression and reconstruction of the image using the DCT. To realize any sort of transform coding on the computational imaging hardware, one needs to be able to project the acquired image onto the designated mask, where the transform coefficients are realized at each of the pixels as multiplication factors. Considering R is the responsivity of the image sensor, which is a function of the object's reflectance function I and the detector's bias voltage V, then:
$$R = g(I, V), \tag{3}$$
where g is some nonlinear function of I and V. Here, if I is the object reflectance function in the spatial domain, then its frequency-domain transform is given by:
$$F(u,v) = C(u)\,C(v)\sum_{m=0}^{N-1}\sum_{n=0}^{M-1} I(m,n)\,\cos\!\left[\frac{(2m+1)u\pi}{2N}\right]\cos\!\left[\frac{(2n+1)v\pi}{2M}\right]. \tag{4}$$
The inverse of the DCT transform function is defined as:
$$I(m,n) = \sum_{u=0}^{N-1}\sum_{v=0}^{M-1} C(u)\,C(v)\,F(u,v)\,\cos\!\left[\frac{(2m+1)u\pi}{2N}\right]\cos\!\left[\frac{(2n+1)v\pi}{2M}\right]. \tag{5}$$
In order to implement the computationally intensive DCT in hardware, we have reordered Eq. (4) and decoupled the bias (mask) matrices from the image sensor responses, as shown in the equation below:
$$F(u,v) = C(u)\,C(v)\sum_{m=0}^{N-1}\sum_{n=0}^{M-1} I(m,n)\,\mathrm{Mask}_{uv}(m,n), \tag{6}$$
where:
$$C(u) = \begin{cases}\sqrt{1/N}, & u = 0,\\ \sqrt{2/N}, & u = 1, \ldots, N-1,\end{cases} \tag{7}$$
with C(v) defined analogously using M, for u = 0, 1, …, N − 1 and v = 0, 1, …, M − 1.
In the above equation, $\mathrm{Mask}_{uv}(m, n)$ is the mask set that is to be loaded to the image sensor as the bias information. If we assume N equals M, the total number of masks for exact reconstruction would be N × N. The mask matrices can be represented as
$$\mathrm{Mask}_{uv}(m,n) = \cos\!\left[\frac{(2m+1)u\pi}{2N}\right]\cos\!\left[\frac{(2n+1)v\pi}{2M}\right]. \tag{8}$$
In the calculation of the mask matrices, because C(u) and C(v) are not functions of m and n, they are treated as constants and are not included in Eq. (8). Because all of the coefficients are limited to the same range of [−1, +1], we can efficiently use the limited dynamic range of the analog memory to store the bias voltage; otherwise, the DCT coefficients would need a greater number of bits to deliver the same SNR.
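The decoupling of the masks from the constants C(u) and C(v) can be illustrated with a short sketch. It assumes the standard orthonormal DCT-II normalization for C(u) and C(v), which is an assumption and may differ from the constants used on the hardware; the function names are hypothetical. Each coefficient is obtained as a single inner product of the image with a mask whose entries lie in [−1, +1], and the constants are applied afterward.

```python
import math


def dct_mask(u, v, n_rows, n_cols):
    """Mask_uv(m, n): product of two cosines, without the C(u)C(v) constants."""
    return [[math.cos((2 * m + 1) * u * math.pi / (2 * n_rows))
             * math.cos((2 * n + 1) * v * math.pi / (2 * n_cols))
             for n in range(n_cols)] for m in range(n_rows)]


def c(k, size):
    """Orthonormal DCT-II scaling constant (assumed normalization)."""
    return math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)


def project(image, mask):
    """Inner product of image and mask, i.e. the sum of modulated pixels."""
    return sum(p * w for row_p, row_w in zip(image, mask)
               for p, w in zip(row_p, row_w))


def dct2(image):
    """Forward transform: one mask projection per coefficient."""
    n_rows, n_cols = len(image), len(image[0])
    return [[c(u, n_rows) * c(v, n_cols)
             * project(image, dct_mask(u, v, n_rows, n_cols))
             for v in range(n_cols)] for u in range(n_rows)]


def idct2(coeffs):
    """Inverse transform: weighted sum of the same masks."""
    n_rows, n_cols = len(coeffs), len(coeffs[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            weight = c(u, n_rows) * c(v, n_cols) * coeffs[u][v]
            mask = dct_mask(u, v, n_rows, n_cols)
            for m in range(n_rows):
                for n in range(n_cols):
                    out[m][n] += weight * mask[m][n]
    return out


IMAGE = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0], [13.0, 14.0, 15.0, 16.0]]
COEFFS = dct2(IMAGE)
RECON = idct2(COEFFS)
```

In this noise-free setting the round trip reproduces the image to floating-point precision; the mask for u = v = 0 is all ones, so its projection is simply the sum of the pixel values.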
The discussion above holds as long as the system is noise free; however, the system's response-modulation function shown in Fig. 5 calls for a more intelligent bias-selection algorithm. Because of the device's limited dynamic range and the noise behavior of the system, a bias-selection algorithm is required. This algorithm prescribes the optimal bias for each pixel, which minimizes the effect of noise. Also, a linear transformation is used to map all coefficients onto the implementable dynamic range. The next section is devoted to the mathematical model of the device response and the bias-selection algorithm.

Bias selection algorithm
In this section, we describe a novel bias-selection algorithm based on the MMSE approach, which addresses the issue of image reconstruction when noise comes into play in the responsivity of the device. When the bias corresponding to a basis coefficient is computed without considering the effect of noise in the responsivity of the device, we call it a naïve technique. This term is used frequently in the rest of the paper to refer to such cases.
The projection and reconstruction are exact as long as the device behaves deterministically for the applied mask. However, the complexity rises when its behavior is random and there is a finite uncertainty in its response. In this case, the common reconstruction method does not lead to exact recovery, because it is difficult to find a unique bias that achieves the designated gain factor. Next, we discuss a technique that enables us to optimally choose the bias for a given mask coefficient.
To begin describing the bias-selection method, as shown in Fig. 8, we consider a set of basis masks, $\{B^{(k)}\}$, indexed by k, each of which is to be implemented by a 2D array of biases to be determined later. Each of these masks consists of a 2D array of coefficients, given by $\{\{b^{(k)}_{ij}\}\}_{i,j=1}^{M}$. The objective is to map each of these $b^{(k)}_{ij}$ coefficients into achievable responsivity values by applying an appropriate bias drawn from the responsivity function $\tilde{R}(v)$. Here, $\tilde{R}(v)$ is the noisy responsivity of the device as a function of the applied bias. This bias assignment is carried out according to the optimization criterion stated in Eq. (14). For an imaging system of resolution N = 96 × 96 pixels, the image captured by the system, the matrix of DCT coefficients, and the k-th ideal DCT mask are represented by the matrices I, Y, and $B^{(k)}$, with entries $I_{ij}$, $Y_k$, and $b^{(k)}_{ij}$, respectively.
The k-th practical mask based on the noisy responsivity is $\tilde{B}^{(k)}$, with entries $\tilde{b}^{(k)}_{ij}(v)$, and
$$\tilde{R}(v) = R(v) + n(v), \tag{9}$$
where R(v) is the implementable k-th mask based on the ideal responsivity when the system is noise free, and n(v) is a Gaussian noise term with bias-dependent variance $\sigma_v^2$. The expressions for computing the individual DCT coefficients corresponding to the noisy responsivity mask and the corresponding error are given by
$$\tilde{Y}_k = \sum_{i}\sum_{j} I_{ij}\,\tilde{b}^{(k)}_{ij}(v), \tag{10}$$
$$e_k = Y_k - \tilde{Y}_k, \tag{11}$$
where the k-th DCT coefficient corresponding to the ideal mask is denoted by
$$Y_k = \sum_{i}\sum_{j} I_{ij}\,b^{(k)}_{ij}. \tag{12}$$
For a specific pixel at position (i, j), if b is the mask coefficient to be achieved and $\tilde{r}(v) = r(v) + n(v)$ is the realizable coefficient from the responsivity, then the objective function for bias selection for that specific pixel is given by
$$f(v) = \big(b - \tilde{r}(v)\big)^2, \tag{13}$$
and the optimization problem is given by
$$\underset{v}{\text{minimize}}\;\; E\big(f(v)\big) \quad \text{subject to} \quad E\big(n(v)\big) = 0, \tag{14}$$
where E(f(v)) stands for the expected value of the entity f(v), which is a function of v.
Equivalently, because the noise has zero mean and variance $\sigma_v^2$, the problem can be reformulated as
$$\underset{v}{\text{minimize}}\;\; \big(b - r(v)\big)^2 + \sigma_v^2;$$
then, to find the optimum $v_{opt}$, we differentiate the objective with respect to v such that $\frac{d}{dv}E\big(f(v)\big)\big|_{v = v_{opt}} = 0$, and thus we obtain
$$r(v_{opt}) = b - \frac{1}{2\,r'(v_{opt})}\,\frac{d\sigma_v^2}{dv}\bigg|_{v = v_{opt}},$$
where $r(v_{opt})$ corresponds to the optimal realizable gain coefficient for a given ideal value of mask coefficient b, and $v_{opt}$ stands for the optimal bias to be applied to realize the gain $r(v_{opt})$. The above expressions give the optimal bias-selection rule, which determines the gain coefficient to be implemented at the pixel in order to realize the mask coefficient b.
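The difference between the naïve and the MMSE selection can be sketched on a discrete grid of candidate biases. The sketch assumes zero-mean noise, so that the expected squared error at each bias is the squared distance between the target coefficient and the mean response plus the response variance. The calibration arrays and function names are hypothetical, and the grid search stands in for the analytical optimum.

```python
# Hypothetical calibration data (illustrative only): candidate bias voltages,
# the mean response r(v), and the response standard deviation sigma(v).
BIAS_V = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
MEAN_R = [0.10, 0.30, 0.50, 0.62, 0.80, 1.00]
SIGMA = [0.02, 0.03, 0.20, 0.04, 0.05, 0.06]


def naive_bias(b):
    """Bias whose mean response is closest to b, ignoring noise."""
    best = min(range(len(BIAS_V)), key=lambda k: (b - MEAN_R[k]) ** 2)
    return BIAS_V[best]


def mmse_bias(b):
    """Bias minimizing the expected squared error (b - r(v))^2 + sigma(v)^2."""
    best = min(range(len(BIAS_V)),
               key=lambda k: (b - MEAN_R[k]) ** 2 + SIGMA[k] ** 2)
    return BIAS_V[best]


if __name__ == "__main__":
    target = 0.52
    print("naive:", naive_bias(target), "V, MMSE:", mmse_bias(target), "V")
```

With these made-up numbers, the bias whose mean response is nearest to the target happens to be a noisy operating point, so the MMSE rule accepts a slightly larger mean mismatch at a neighboring bias in exchange for a much smaller variance.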
The bias-selection algorithm works well as long as the variance of the noise in the responsivity lies within some limit and the lighting condition does not change drastically. This is because the responsivity may change under different lighting conditions, in which case the biases stored in memory no longer satisfy the objective.

Conditioning the masks for mapping the bias into device dynamic range
For an image $\{\{I_{ij}\}\}_{i,j=1}^{M}$ and basis masks given by $B^{(k)} = \{\{b^{(k)}_{ij}\}\}_{i,j=1}^{M}$, the DCT coefficient for the ideal case is obtained as
$$Y_k = \sum_{i=1}^{M}\sum_{j=1}^{M} I_{ij}\,b^{(k)}_{ij}.$$
However, due to the device's limited operating dynamic range and memory, the mask coefficients need to be conditioned so that they are realizable as per the device responsivity. Once the projection is obtained, an equivalent transform needs to be applied to retrieve the actual DCT coefficients. Consider a linear transformation given by r = mb + c, where m is the gain, c is the offset, and r is the entity equivalent to b in the transform domain. This transformation is applied identically to all of the basis coefficients to accommodate all of them within the working dynamic range of the device responsivity.
Then, for each projection coefficient $Y'_k = \sum_{i}\sum_{j} I_{ij}\,(m\,b^{(k)}_{ij} + c)$ obtained with the conditioned mask, we can condition as follows to retrieve the actual projection coefficient:
$$Y_k = \frac{1}{m}\Big(Y'_k - c\sum_{i=1}^{M}\sum_{j=1}^{M} I_{ij}\Big).$$
This conditioning is responsible for mapping the target mask coefficients into the realizable region, the distribution of which is shown in Figs. 9(a) and 9(c) for the naïve and MMSE methods, respectively. Also, as observed from the distribution of bias in Figs. 9(b) and 9(d), the MMSE method spreads out the bias to ensure that quantization effects on implementation are minimized. Because the MMSE method considers the effects of noise when the bias is prescribed for a given mask, variance is added to the realizable mask coefficients, which leads to their spread compared with coefficients designed without considering the effect of noise.
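The conditioning step can be sketched as an affine map of the mask coefficients followed by its inversion on the measured projection. The realizable range [0.2, 0.9], the image, and the mask below are hypothetical values, and the sketch assumes that the sum of the pixel values is obtained from one additional projection onto a uniform mask; that extra measurement is an assumption of this example.

```python
def affine_map(b_min, b_max, r_min, r_max):
    """Gain m and offset c so that r = m * b + c maps [b_min, b_max] onto [r_min, r_max]."""
    m = (r_max - r_min) / (b_max - b_min)
    c = r_min - m * b_min
    return m, c


def project(image, mask):
    """Inner product of image and mask (sum of modulated pixels)."""
    return sum(p * w for row_p, row_w in zip(image, mask)
               for p, w in zip(row_p, row_w))


def recover_coefficient(y_conditioned, pixel_sum, m, c):
    """Undo the affine conditioning: Y = (Y' - c * sum(I)) / m."""
    return (y_conditioned - c * pixel_sum) / m


# Hypothetical image, ideal mask in [-1, +1], and realizable range [0.2, 0.9].
IMAGE = [[3.0, 1.0], [4.0, 2.0]]
MASK = [[1.0, -1.0], [0.5, -0.5]]
M_GAIN, C_OFFSET = affine_map(-1.0, 1.0, 0.2, 0.9)

CONDITIONED = [[M_GAIN * b + C_OFFSET for b in row] for row in MASK]
UNIFORM_LEVEL = 0.9  # assumed uniform mask used to measure the pixel sum
UNIFORM = [[UNIFORM_LEVEL] * 2 for _ in range(2)]

PIXEL_SUM = project(IMAGE, UNIFORM) / UNIFORM_LEVEL
Y_IDEAL = project(IMAGE, MASK)
Y_RECOVERED = recover_coefficient(project(IMAGE, CONDITIONED), PIXEL_SUM,
                                  M_GAIN, C_OFFSET)
```

In the absence of noise the recovered coefficient equals the ideal projection, which shows that the affine conditioning itself introduces no error.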

DCT-based image compression
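Once the optimal masks and gain are designed with the aid of the bias-selection algorithm, the biases are applied to the hardware, which, in turn, realizes the desired coefficients as modulation factors at each pixel. Finally, the DCT coefficient corresponding to each mask is obtained by
$$Y_k = \sum_{i=1}^{96}\sum_{j=1}^{96} I_{ij}\,\tilde{b}^{(k)}_{ij}(v_{opt}), \tag{19}$$
where $\tilde{b}^{(k)}_{ij}(v_{opt})$ is the coefficient of the k-th mask realized at pixel (i, j) with the selected bias $v_{opt}$.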

DCT-based image reconstruction
Image reconstruction is achieved by simply applying the linear combination of the masks onto which the image was projected. The reconstruction is given by the following equation:
$$\hat{I} = \sum_{k} Y_k\,B^{(k)}, \tag{20}$$
where $Y_k$ is the measured projection coefficient and $B^{(k)}$ is the corresponding basis mask.
Following the discussion above, we performed DCT-based image compression optimally on the hardware. However, some error still exists in the projection coefficients and propagates during the reconstruction; it is mainly due to the limited dynamic range of the pixels and the different random, uncharacterized noise sources present in the hardware.

Compressive sensing implementation
The second type of in-pixel compressed-domain acquisition we have explored is compressive sensing (CS). In DCT transform coding, the gain vectors vary continuously, which leads to maximal exploitation of the device dynamic range; the CS implementation, in contrast, reduces the complexity by making use of only zeros and ones, which makes the system more resilient to noise. Here, we present some background regarding CS and the implementation methodology on the proposed hardware.
CS is based on the principle of achieving a larger and more efficient compression, provided that the desired data are sparse in some basis. Sparsity is the primary condition here, which leads to efficient reconstruction of the data if they are sampled in a proper domain. We consider the input image as a discrete-time column vector $x \in \mathbb{R}^{P}$ with elements x[n], where n = 1, 2, …, P and P = 96 × 96. Then, x can be represented as a linear combination of elements from an orthonormal basis $\{\phi_i\}_{i=1}^{P}$ and coefficients $s_i$. Here,
$$x = \sum_{i=1}^{P} s_i\,\phi_i = \phi s.$$
We assume that s is sparse with K nonzero coefficients. Now, by selecting an efficient binary random sensing matrix ψ, we can represent the reduced data set as y = ψx, where ψ is a binary matrix of size M × P and M ≪ P.
In this way, the dimension of the data set is reduced from P to M. However, the size M also needs to be properly determined for stable reconstruction. The standard expression for computing M is given as
$$M \geq cK\log\!\left(\frac{P}{K}\right),$$
where c is a constant. Here, the matrix ψ is composed of M basis functions in P dimensions onto which the data x are projected, i.e., $\psi = [\psi_1 \,|\, \psi_2 \,|\, \ldots \,|\, \psi_M]^{T}$, where $\psi_1$ is of size P × 1. The matrix was designed with the restricted isometry property (RIP) [44] given below:
$$(1 - \sigma_K)\,\|s\|_2^2 \;\leq\; \|\psi\phi s\|_2^2 \;\leq\; (1 + \sigma_K)\,\|s\|_2^2,$$
where $\sigma_K \in [0, 1)$. Moreover, each $\psi_i$ is converted to an equivalent 2D data set and then implemented on the hardware as a measurement mask. Because this mask is composed of binary elements, it is easier to achieve the projections, as the detector is switched on or off depending on the bias applied for the acquisition. After we have obtained the coefficients from the projection of the image onto the reduced basis, the challenging problem is to reconstruct the image from its dimensionally reduced format. Specifically, we seek to reconstruct the image vector x by using only the M measurements in the vector y, the random measurement matrix ψ, and the orthonormal basis ϕ. Equivalently, we can reconstruct the sparse coefficient vector s. The estimate is given by the ℓ1 minimization criterion, which uses a convex relaxation of the ℓ0 norm, given as
$$\hat{s} = \arg\min \|s'\|_1 \quad \text{such that} \quad \psi\phi s' = y, \tag{23}$$
and $\hat{x} = \phi\hat{s}$. The reconstruction was performed with the aid of the ℓ1-magic algorithm, where the same random basis that was used for the projection during the hardware implementation was used for the reconstruction [45].
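The measurement side of the CS scheme can be sketched as follows: each row of the binary sensing matrix is reshaped into a 2D on/off mask, and each measurement is the sum of the pixels that are switched on. The constant c = 1, the use of the natural logarithm, the sparsity level, and all function names are assumptions made for illustration; the ℓ1 reconstruction step is not shown.

```python
import math
import random


def required_measurements(num_pixels, sparsity, c=1.0):
    """Smallest integer M with M >= c * K * log(P / K) (natural log assumed)."""
    return math.ceil(c * sparsity * math.log(num_pixels / sparsity))


def binary_masks(num_masks, side, seed=0):
    """Random 0/1 masks, each already reshaped to a side x side 2D array."""
    rng = random.Random(seed)
    return [[[rng.randint(0, 1) for _ in range(side)] for _ in range(side)]
            for _ in range(num_masks)]


def measure(image, masks):
    """One compressed sample per mask: the sum of the pixels switched on."""
    return [sum(p * w for row_p, row_w in zip(image, mask)
                for p, w in zip(row_p, row_w))
            for mask in masks]


if __name__ == "__main__":
    # Hypothetical sparsity level for a 96 x 96 image.
    print("measurements needed:", required_measurements(96 * 96, 100))
    masks = binary_masks(num_masks=5, side=8, seed=1)
    image = [[float(r + c) for c in range(8)] for r in range(8)]
    print("compressed samples:", measure(image, masks))
```

Because every mask entry is either zero or one, each pixel only needs two bias levels, which is the property the text credits for the robustness of the CS mode to responsivity noise.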

Performance comparison between naïve DCT, LMS DCT, and CS reconstruction
For a prescribed response-modulation factor, mandated by the DCT masks, for example, we analytically calculated the required voltage using the bias-selection algorithm discussed in Section 6.2. Note that without such a statistical calculation of the voltage, the implementation of the modulation level would be inexact and would result in errors in the image reconstruction. Figure 10 shows reconstructed images for the different compression methods with different numbers of projection coefficients taken into account. The importance of the statistical calculation of the voltages is evidenced by the noise in the images reconstructed with the naïve approach, which uses bias voltages calculated without considering the uncertainty in the ROIC's implementation of the masks, as shown in Fig. 10(a). In contrast, the reconstruction based on the bias-selection algorithm achieves a better result, as seen in Fig. 10(b). In addition, the CS reconstruction, shown in Fig. 10(c), outperforms the DCT-based approach. In the given results, the naïve reconstruction fails to retrieve both the details of the image and its contrast levels because of the noise. The MMSE-based results, however, show better contrast and reproduce most of the details of the original image. Note that CS gives an almost exact reconstruction when a sufficient number of coefficients is used. This is because CS exploits randomness as a tool to extract information with fewer coefficients, so the uncertainty in responsivity affects it less than it affects the DCT approach, which relies on the exact implementation of the masks. Also, for the DCT transform, the reconstruction is a linear combination of the projection coefficients with the corresponding basis masks, so any error in a projection is propagated into the reconstruction. CS reconstruction uses ℓ1 minimization-based optimization, which tends to keep the reconstruction noise as low as possible. Hence, the CS-based reconstruction is more tolerant of uncertainty in the electronic mask implementation because of its robust ℓ1 optimization, whereas the DCT approach uses an ℓ2 optimization, which is known for its inferior performance compared to ℓ1 optimization.

For the MMSE and naïve methods, although reconstructions that use a higher percentage of coefficients look visually better than binary CS reconstructions that use a lower percentage, the individual pixel values deviate from those of the original image, whereas the deviation is smaller for CS. This is because the correlation between the pixels retains the image structure, which looks better to the viewer. Thus, the pixel correlation for naïve and MMSE reconstructions with a larger number of projection coefficients is higher than that for binary CS with a smaller number of projection coefficients. However, because the coefficients are more sensitive to noise for the MMSE and naïve methods than for CS, the reconstruction error is higher. In this context, considering more coefficients in the reconstruction leads to more propagation of projection error. This error is smaller for MMSE than for naïve.
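The propagation of projection error in a linear (DCT-type) reconstruction can be made explicit with a short derivation. It is a sketch under stated assumptions rather than a result from the measurements: the basis masks b_k are taken to be orthonormal, S denotes the set of coefficients kept, c_k is the ideal projection coefficient, and e_k is the error introduced when the hardware realizes mask k. The symbols b_k, c_k, e_k, and S are introduced here for illustration only.

```latex
\begin{align*}
\hat{x} &= \sum_{k \in S} (c_k + e_k)\, b_k, \qquad c_k = \langle x, b_k \rangle, \\
x - \hat{x} &= \sum_{k \notin S} c_k\, b_k \;-\; \sum_{k \in S} e_k\, b_k, \\
\lVert x - \hat{x} \rVert_2^2 &= \sum_{k \notin S} c_k^2 \;+\; \sum_{k \in S} e_k^2 .
\end{align*}
```

The cross terms vanish because the b_k are orthonormal. The first sum (truncation error) shrinks as more coefficients are kept, while the second sum (propagated projection error) grows with the number of coefficients kept, which is consistent with the observation that using more coefficients propagates more projection error.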
The analog image sensor has a limited memory, which forces the device to operate over a limited dynamic range; this constrains the device to rely on small, block-sized transform coding instead of a large kernel mask. The reason is that, for a large block size, the mask coefficients are far more numerous and more closely spaced in value. This gives rise to a quantization issue, as most of the neighboring coefficient values are rounded to their nearest realizable coefficients. As a result, the realized mask loses its orthonormal property, and the implemented mask is no longer equivalent to the targeted mask, leading to reconstruction errors.
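The loss of orthonormality under quantization can be checked numerically. The sketch below is illustrative only: it assumes an orthonormal 1D DCT-II matrix for an 8-point block and a hypothetical uniform quantizer with a step of 0.1, which stands in for the finite set of realizable mask coefficients and is not the hardware's actual resolution.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix: row u holds the u-th 1D basis function."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def quantize(matrix, step):
    """Round every coefficient to the nearest multiple of step (assumed uniform levels)."""
    return [[round(v / step) * step for v in row] for row in matrix]

def orthonormality_error(matrix):
    """Largest absolute entry of (A A^T - I); zero for an orthonormal matrix."""
    n = len(matrix)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            dot = sum(p * q for p, q in zip(matrix[a], matrix[b]))
            worst = max(worst, abs(dot - (1.0 if a == b else 0.0)))
    return worst

ideal = dct_matrix(8)
realized = quantize(ideal, step=0.1)        # hypothetical 0.1 coefficient resolution
err_ideal = orthonormality_error(ideal)     # numerically zero
err_realized = orthonormality_error(realized)
```

With these assumed values, the DC row alone is rounded from about 0.354 to 0.4 per entry, so its squared norm becomes 1.28 instead of 1, and the realized mask set is no longer orthonormal.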
7. Functioning as a stand-alone camera
Depending on the modulation scheme applied to the chip, different applications can be delivered. In the simplest scenario, if all of the pixels are biased with the same voltage, the iROIC camera can be used as a stand-alone camera. In this mode of operation, Vbias remains constant, and as a result, the same modulation factor is used for all pixels.
The extra benefit of this hardware over the conventional CTIA is that, in stand-alone mode, because the reference voltage for the readout differs from the detector's bias voltage, a Vref − Vbias offset is applied to the measured values, which means that a level shifter is embedded in every pixel. This is beneficial if there is a constant offset at the output of the imager. Figure 11 shows four images taken by the iROIC camera in stand-alone mode.

Region of interest (ROI) enhancement
The support for continuous spatiotemporal control over the bias voltage applied to each photodetector enables ROI enhancement by selectively modulating the responsivity of the detectors located in the region of interest. Several applications benefit from this capability; some are briefly discussed below:
• It helps enhance the contrast of an image over a given region where the contrast is originally poor because of the limited dynamic range of the sensor. It also addresses the challenge of finding an optimum bias for a high-contrast image in which one part is saturated and another part is at the noise level. A smart selection of bias voltages forces all pixels to operate in the linear region.
• It allows different resolutions to be achieved for different regions of a given image by using sub-masks corresponding to low-pass and high-pass responses. This is useful in surveillance and medical applications, where the user may be interested in a specific region and may want to ignore the information in the rest of the image.
• Spectral selectivity in different areas of the image is another application of the hardware; however, it requires photodetectors that support multispectral tunability.
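The idea of region-dependent biasing can be sketched as the construction of a bias matrix. This is a minimal illustration, not the authors' control software: the 96 × 96 array size is borrowed from the image size used earlier, and the voltages, the ROI coordinates, and the function names `uniform_bias` and `apply_roi_bias` are hypothetical.

```python
ROWS, COLS = 96, 96        # assumed array size, matching the 96 x 96 image in the text

def uniform_bias(v_bias, rows=ROWS, cols=COLS):
    """Bias matrix for stand-alone operation: every pixel gets the same voltage."""
    return [[v_bias] * cols for _ in range(rows)]

def apply_roi_bias(bias, top, left, height, width, v_roi):
    """Return a copy of the bias matrix with v_roi applied inside the rectangle."""
    out = [row[:] for row in bias]
    for r in range(top, top + height):
        out[r][left:left + width] = [v_roi] * width
    return out

# Hypothetical voltages: 1.0 V everywhere, 1.5 V inside a 16 x 24 region of interest.
bias = uniform_bias(1.0)
roi_bias = apply_roi_bias(bias, top=20, left=30, height=16, width=24, v_roi=1.5)
```

In practice the ROI voltage would be chosen from the measured response-modulation curve so that the pixels in the region operate in their linear range.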

Figure 12(a) shows the original image of the white matter, which we used as the input of iROIC in the image segmentation experiment. Figure 12(b) depicts the white matter image taken with iROIC when a uniform bias is applied to all of the pixels, and Figs. 12(c)-12(f) present the same scene, except that a different bias is applied to a selected area, which is referred to as the region of interest.
Fig. 12 a) Original white matter image used for imaging. b) Image taken using iROIC with uniform biasing for all of the pixels, where some of the pixels are saturated due to the high intensity. In c), d), e), and f), the same scene is imaged using proper biasing for the different areas that normally are at the noise floor of the imager.

Conclusions
A monolithic implementation of compressed-domain image acquisition is presented, where all of the computations are performed at acquisition time within the analog ROIC circuit. The detector bias is the knob we employed to control the modulation factor of each individual pixel. The reported hardware outputs a reduced set of compression coefficients of an image, thereby avoiding the generation of big data. A flexible image retrieval setup enables fine control over the matrix that is projected onto the image.
The enhanced acquisition technique, which utilizes a statistical detector biasing scheme, supports many different applications, such as in-place nonuniformity correction, sensor-level region-of-interest enhancement, transform coding embedded in the ROIC, and compressive sampling, all of which are demonstrated by selecting proper biasing matrices. Additionally, for the case of transform coding, an intelligent bias-selection algorithm is proposed, and its result is compared against that of the naïve method.
The motive of the current extension is to efficiently acquire data and reconstruct it with fewer projection coefficients, which is highly desired for multispectral imaging. This reduces acquisition time and instrument complexity. Here we deploy a CS-based compression technique in which an entire signal can be reconstructed from a sparse data set, and a proper basis pursuit algorithm is used to reconstruct the multispectral image from the reduced data set.

Fig. 1 a) A system-level block diagram of a conventional imaging system, which includes image acquisition, storage, and post-processing stages.b) Block diagram of the intelligent readout integrated circuit we propose for on-chip image acquisition and compression.

Fig. 2 Block diagram of the individual-pixel bias-tunable readout integrated circuit, with the CTIA-based unitcell shown in the expanded view. The extra circuitry added to the CTIA-based unitcell enables setting independent bias voltages for each individual pixel while the previously integrated voltage is being read out.

Fig. 3 a) A cross section of the n+/nwell/psub photodetector used in this chip, b) the measured photoresponse of the n+/nwell/psub photodetector as a function of the applied bias voltage at different illumination levels, and c) the same measured results scaled to one. In this experiment, a green LED is used as the illumination source, and the dimension of the photodetector is 100 µm × 100 µm.

Figure 4(a) depicts the detailed block diagram of the unitcell of iROIC. Figure 4(b) shows the video switches and the active load for the source follower at the output of the unitcells. The ROIC peripherals are shown in Fig. 4(c). In the proposed unitcell, the conventional CTIA configuration is augmented with the ability to control each individual pixel's bias voltage. Here, we briefly explain the process that is followed to operate the compressed-domain imaging. The readout mechanism is also demonstrated in Fig. 4(d):

Fig. 4 a) Switch-level implementation of the iROIC unitcell. The unitcell includes 15 transistors and three capacitors. b) The video switches, c) the row/column select peripherals, and d) a sample timing diagram of a single unitcell.

Fig. 5 Demonstration of the normalized modulation function of the system at a uniform illumination level. The graph reflects the system's response to the modulation of the detector's bias.

Fig. 6 a) A microphotograph of the fabricated ROIC, the row and column select, and the test devices. The unitcell is shown in the expanded view. b) A block diagram of the experimental setup, which includes a Raspberry Pi board as the main controller of the system, a DAC to set the bias voltage of the detectors, and an ADC to grab the readout of the imager. All communication between the controller and a remote machine is over SSH.

Figure 7(b), on the other hand, shows another image under the same illumination condition with a bias matrix that is optimized for the NUC technique discussed above. Gain and offset are calculated per pixel according to Eqs. (1) and (2) and are embedded in the bias applied to each pixel using the RPB board. The 3D intensity level shown in Figs. 7(a) and 7(b) reflects a Gaussian distribution that is flat, with a large variance, in part (a) due to the presence of nonuniformity, whereas the variance is minimal in part (b) due to the correction.

Fig. 7 a) The result of imaging a white paper with uniform biasing, while the illumination is not uniform. Defects and other sources of nonuniformity also contribute to the variation across the image. The stack of three graphs shows (I) the camera output image, (II) the illumination contour, and (III) a 3D view of the intensities. b) Another white paper imaged under the same illumination condition using the implemented nonuniformity correction. The graph has the same scale as part (a), and the legend in the middle is for part (II). c) and d) show the histograms of the measured results of parts (a) and (b), respectively.

( 4 ),
where i and j are integers in the range [0, N − 1], which are used to address different pixels, and C(u) and C(v) are defined in the following equation: (5)

Fig. 8 Acquisition and compression processes, which include mapping k mask matrices to their corresponding bias voltages. The mapping is based on the system's response-modulation function shown in Fig. 5. Then the bias matrices, which are stored in the Raspberry Pi memory, are loaded to the imager and projected onto the object's reflectance function. The resultant dot product is optionally summed up in the hardware, and the k resulting coefficients are sent to the remote computer for reconstruction.

Fig. 9 a) Distribution of 8 × 8 block-based DCT mask coefficients for naïve method, b) distribution of bias for naïve method, c) distribution of 8 × 8 block-based DCT mask coefficients for MMSE method, and d) distribution of bias for MMSE method.

Fig. 10 The resulting images reconstructed using a) naïve DCT, b) minimum-mean-square-error-based DCT, c) compressive sensing, and d) ideal DCT. e) The performance of the different methods is compared in terms of the mean square error between the reconstructed image and the original image.

Fig. 11 Four images that are taken using iROIC camera in normal mode.a) phantom, b) a cell, c) some rice grains, and d) UNM logo.

Table 1. Comparison between different preamplifier configurations used in an imager. Due to the need for good bias control, high injection efficiency, and sufficient charge storage, we have selected the CTIA configuration for iROIC.
Structure | Injection efficiency | Detector bias | Power dissipation | Pixel area | Charge storage