Thresholding schemes for visible light communications with CMOS camera using entropy-based algorithms

Recent visible light communication (VLC) studies have mainly used positive-intrinsic-negative (PIN) photodiodes and avalanche photodiodes (APDs) as receivers. VLC using an embedded complementary-metal-oxide-semiconductor (CMOS) camera is attractive. Using the rolling shutter effect of a CMOS camera can increase the VLC data rate, and different techniques have been proposed for improving the demodulation of the rolling shutter pattern. Important steps in demodulating the rolling shutter pattern are smoothing and the application of efficient thresholding to distinguish the data logic. Here, we propose and demonstrate for the first time two entropy thresholding algorithms: maximum entropy thresholding and minimum cross entropy thresholding. An experimental evaluation comparing their bit-error-rate (BER) performance and efficiency is also performed.


Introduction
In the near future, 4G wireless systems such as Long Term Evolution (LTE) and LTE-Advanced may not satisfy the bandwidth demand; hence network operators are actively planning radio-frequency (RF) spectrum re-arrangement (reusing the spectrum of the 2G wireless systems) and exploring the possibility of using other frequency bands, such as 60 GHz and 100 GHz [1-3]. One interesting proposal is to use the visible spectrum for future 5G wireless communication [4]. Visible light communication (VLC) can offer high-speed, high-density, directional and secure wireless communication [5][6][7][8][9]. One advantage of VLC compared with traditional free space optics (FSO) communication is that it can provide lighting and communication simultaneously. Hence, VLC can be easily deployed in existing light emitting diode (LED) lamps with little extra cost. VLC can be implemented using the existing lighting infrastructure; however, recent VLC studies mainly used positive-intrinsic-negative (PIN) photodiodes and avalanche photodiodes (APDs) as receivers [5][6][7][8][9]. The deployment cost can be significantly reduced if the embedded complementary-metal-oxide-semiconductor (CMOS) cameras in vehicles and smart-phones can be used as the VLC receivers (Rxs). However, as the frame rate of these CMOS cameras is only 30 or 60 frames per second (fps), using these embedded cameras for VLC is very challenging. Recently, a camera-based VLC system for vehicle-to-vehicle communication was demonstrated [10]; however, the data rate was only 150 bit/s (3 x 50 bit/s) using red-green-blue (RGB) LEDs. The rolling shutter effect of the CMOS camera [11] can be used to increase the VLC data rate. Camera-based VLC at > 1 kbit/s can be achieved by using different techniques for improving the demodulation of the rolling shutter pattern, such as blooming mitigation and extinction ratio (ER) enhancement [12,13].
Several thresholding schemes have been proposed, such as third-order polynomial fitting [14]; however, it is not accurate enough for fast-changing data, particularly at high data rates (i.e. the low pixel-per-bit case). In this work, we propose and demonstrate for the first time the use of entropy to facilitate VLC rolling shutter thresholding. Two entropy thresholding algorithms, maximum entropy thresholding and minimum cross entropy thresholding, are discussed. An experimental evaluation comparing their bit-error-rate (BER) performance and efficiency is also performed. A data rate of 5760 bit/s can be achieved.

Experiment and algorithms
The experimental setup of the camera-based VLC system is shown in Fig. 1. The data packet is first constructed using Matlab on a computer. The structure of the data packet used in the camera-based VLC system is discussed later. The data packet is applied to a phosphor white-light LED (Cree XR-E) through an arbitrary waveform generator (AWG, Tektronix AFG3252C) with an analog bandwidth of 240 MHz and a sampling rate of 2 GSample/s. The emitted VLC signal is then received by a smart-phone (Apple iPhone 6) equipped with a CMOS camera with a resolution of about 1920 x 1080 pixels. In this proof-of-concept demonstration, since only a single LED is used, the transmission distance is about 10 cm and the illuminance at the CMOS camera is about 750 lux. The received signal performance depends on several factors, such as the signal-to-noise ratio (SNR) (i.e. the illuminance), the uniformity of the light spot received by the camera, and the camera sensitivity. We believe that the transmission distance can be further extended by using a higher-power or more uniform light source, or a proper lens. As the CMOS camera is operated in rolling shutter mode, each row of pixels is activated sequentially, instead of all rows being activated at the same time. Hence, a rolling shutter pattern with bright and dark stripes is recorded, as shown in Fig. 1 inset (i). Each measurement sample is recorded for 2 minutes. This 2-minute video of the rolling shutter pattern is then transferred to a Matlab program for offline demodulation. According to previous studies [12], there is a processing latency between consecutive recorded frames in the CMOS camera. The measured processing latency can be up to 40% of the frame period (1/frame rate); hence a special VLC packet structure and arrangement are needed to guarantee that a complete VLC packet is recorded in each frame. Here, each VLC packet is transmitted three times sequentially, and each packet consists of a 12-bit header, a 32- to 96-bit payload, 1 start bit and 1 stop bit. On-off keying (OOK) is used in the VLC packet.
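The packet arrangement described above can be illustrated with a minimal Python sketch. The text gives only the field lengths and the threefold repetition; the header bit pattern, the start and stop bit values, and the field order used below are hypothetical placeholders chosen for illustration.

```python
import numpy as np

# Hypothetical values: the text specifies only the field lengths.
HEADER = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0], dtype=np.uint8)  # 12 bits
START_BIT = 1
STOP_BIT = 0


def build_packet(payload_bits):
    """Assemble one OOK packet: header, start bit, payload, stop bit (assumed order)."""
    payload = np.asarray(payload_bits, dtype=np.uint8)
    packet = np.concatenate([HEADER, [START_BIT], payload, [STOP_BIT]])
    return packet.astype(np.uint8)


def build_burst(payload_bits, repeats=3):
    """Repeat the packet so that a complete copy can fall inside one recorded frame."""
    return np.tile(build_packet(payload_bits), repeats)


payload = np.random.default_rng(0).integers(0, 2, size=32)
burst = build_burst(payload)
```

With a 32-bit payload, one packet in this sketch is 46 bits long and the repeated burst is 138 bits long.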
The demodulation of the rolling shutter pattern includes grayscale conversion, column pixel selection, signal smoothing and thresholding. The first procedure, grayscale conversion, converts the color rolling shutter pattern into grayscale values, so that grayscale values of 0 and 255 stand for all-dark and all-bright respectively. In the second procedure, a column of pixels is selected to avoid "blooming" (pixel saturation caused by the strong LED light) according to the algorithm described in ref [12]. After this, a column matrix of 1080 elements (representing the 1080 pixels) storing the grayscale values can be plotted as the grayscale value pattern, shown as the blue curve in Fig. 1 inset (ii). The signal smoothing process can then be performed to improve the BER. In the smoothing process, a second order polynomial fitting (SOPF) is applied to the grayscale value pattern (the red curve in Fig. 1 inset (ii)); then the grayscale values above the SOPF curve are set equal to the value of the SOPF curve.
Then, another SOPF is applied (the green curve in Fig. 1 inset (ii)), and the grayscale values below this second SOPF curve are set equal to zero. After this procedure, the ER of the grayscale value pattern can be significantly increased. Finally, the thresholding is applied to distinguish logic 1 and logic 0 in the grayscale value pattern. Here, we propose and demonstrate using entropy for the VLC rolling shutter thresholding. Entropy is a concept that comes from the second law of thermodynamics and measures the spontaneous dispersal of energy. It was later introduced to communications theory as a measure of the efficiency of data transmission. Here, the proposed scheme exploits the entropy of the grayscale level distribution for effective thresholding. Two schemes, maximum entropy and minimum cross entropy, are evaluated and compared experimentally.
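The two-pass SOPF smoothing described above can be sketched in Python with NumPy as follows. This is a minimal sketch rather than the authors' Matlab code: the function name `sopf_smooth` is hypothetical, and it assumes that the first pass clips values lying above the fitted curve and the second pass zeroes values lying below the new fitted curve.

```python
import numpy as np


def sopf_smooth(gray):
    """Two-pass second order polynomial fitting (SOPF) smoothing of a pixel column."""
    g = np.asarray(gray, dtype=float)
    x = np.arange(g.size)

    # Pass 1: fit a second order polynomial and clip values above the curve
    # down to the curve (assumed direction of the comparison).
    first_fit = np.polyval(np.polyfit(x, g, 2), x)
    g = np.minimum(g, first_fit)

    # Pass 2: fit again on the clipped pattern and set values below the
    # new curve to zero, which raises the extinction ratio.
    second_fit = np.polyval(np.polyfit(x, g, 2), x)
    return np.where(g < second_fit, 0.0, g)


# Synthetic 1080-pixel column with 18-pixel-wide bright and dark stripes.
x = np.arange(1080)
column = np.where((x // 18) % 2 == 0, 200.0, 60.0)
smoothed = sopf_smooth(column)
```

In this synthetic example the dark stripes are pushed to zero while the bright stripes are capped near the first fitted curve, so the separation between the two logic levels becomes easier to threshold.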
In the maximum entropy VLC rolling shutter thresholding, our proposed idea is inspired by ref [15], which divides an image into background and foreground classes. Each class has its own entropy, and when the sum of the two class entropies reaches its maximum, the threshold of the image is optimal. In our VLC rolling shutter thresholding, the whole data packet, including the header and payload, is first divided into sections. Assume the threshold value of the grayscale value pattern is t, with 0 < t < 255. The grayscale value pattern can then be divided into two groups: one group has elements with grayscale values not exceeding t (called the foreground), and the other group has elements with grayscale values larger than t (called the background). The probability distribution of the grayscale levels in the foreground is expressed in Eq. (1),
$$\frac{p_0}{P_B},\ \frac{p_1}{P_B},\ \ldots,\ \frac{p_t}{P_B} \tag{1}$$
where t is the threshold, $p_i$ is the probability of pixels with grayscale value i, and $P_B$ is the probability of a gray level less than or equal to the threshold, as given in Eq. (2).
$$P_B = \sum_{i=0}^{t} p_i \tag{2}$$
The entropy of the foreground is written in Eq. (3),
$$H_F = -\sum_{i=0}^{t} \frac{p_i}{P_B} \log\left(\frac{p_i}{P_B}\right) \tag{3}$$
and the entropy of the background is written in Eq. (4).
$$H_B = -\sum_{i=t+1}^{255} \frac{p_i}{1-P_B} \log\left(\frac{p_i}{1-P_B}\right) \tag{4}$$
After sweeping t from 0 to 255, the threshold T is selected such that the total entropy, $H_F + H_B$, is maximized.
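The maximum entropy search described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' Matlab implementation: the function names are hypothetical, the natural logarithm is assumed, and thresholds that leave one class empty are skipped. The helper `threshold_by_section` shows one possible way of applying the threshold separately to each section of the pattern.

```python
import numpy as np


def max_entropy_threshold(gray, levels=256):
    """Return the threshold t that maximizes H_F + H_B for grayscale values."""
    g = np.clip(np.rint(np.asarray(gray, dtype=float)), 0, levels - 1).astype(int)
    counts = np.bincount(g.ravel(), minlength=levels)
    total = counts.sum()
    p = counts / total                      # p_i, probability of gray level i
    cum_counts = np.cumsum(counts)

    best_t, best_h = 0, -np.inf
    for t in range(levels):
        n_low = cum_counts[t]
        if n_low == 0 or n_low == total:
            continue                        # one class would be empty
        P_B = n_low / total                 # probability of gray level <= t
        low = p[: t + 1] / P_B              # normalized distribution, levels 0..t
        high = p[t + 1:] / (1.0 - P_B)      # normalized distribution, levels t+1..255
        h_f = -np.sum(low[low > 0] * np.log(low[low > 0]))
        h_b = -np.sum(high[high > 0] * np.log(high[high > 0]))
        if h_f + h_b > best_h:
            best_t, best_h = t, h_f + h_b
    return best_t


def threshold_by_section(gray, n_sections=12, threshold_fn=max_entropy_threshold):
    """Split the pattern into sections and binarize each with its own threshold."""
    parts = np.array_split(np.asarray(gray, dtype=float), n_sections)
    return np.concatenate(
        [(part > threshold_fn(part)).astype(np.uint8) for part in parts]
    )
```

Computing a separate threshold for each section lets the decision level follow slow brightness variations along the pixel column, at the cost of one histogram search per section.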
In the minimum cross entropy VLC rolling shutter thresholding, we apply the Kullback-Leibler divergence to determine the difference between two probability distributions [16]. By minimizing the cross entropy, the optimum threshold value can be obtained. As in the maximum entropy scheme, the whole data packet is divided into sections. Here, we define the foreground (grayscale values ≥ t) and the background (grayscale values < t) for the original input grayscale pattern and the output binary pattern. Let the grayscale distributions of the foreground and background of the original pattern be q(g) and q'(g), and those of the output binary pattern be p(g) and p'(g). Then, we can calculate the cross entropy of the foreground and background as shown in Eq. (5).
After sweeping t from 0 to 255, the threshold T is obtained such that the total cross entropy, $H_F + H_B$, is minimized.
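A minimum cross entropy search can be sketched in Python as follows. Eq. (5) is not reproduced in the text, so this sketch does not claim to match it: it assumes a commonly used formulation (often attributed to Li and Lee) in which the binary output pattern is represented by the mean gray level of each class, and the summed Kullback-Leibler-type divergence between the original levels and the class means is minimized. The function names are hypothetical.

```python
import numpy as np


def _divergence(levels, counts, mean):
    """Sum of g * h(g) * log(g / mean) over occupied, non-zero gray levels."""
    mask = (counts > 0) & (levels > 0)
    if mean <= 0 or not mask.any():
        return 0.0
    return float(np.sum(levels[mask] * counts[mask] * np.log(levels[mask] / mean)))


def min_cross_entropy_threshold(gray, levels=256):
    """Return t minimizing the summed divergence of background (< t) and foreground (>= t)."""
    g = np.clip(np.rint(np.asarray(gray, dtype=float)), 0, levels - 1).astype(int)
    hist = np.bincount(g.ravel(), minlength=levels).astype(float)
    idx = np.arange(levels, dtype=float)

    best_t, best_d = 0, np.inf
    for t in range(1, levels):
        n_bg, n_fg = hist[:t].sum(), hist[t:].sum()
        if n_bg == 0 or n_fg == 0:
            continue                                    # one class would be empty
        mean_bg = (idx[:t] * hist[:t]).sum() / n_bg     # level used for binary "0"
        mean_fg = (idx[t:] * hist[t:]).sum() / n_fg     # level used for binary "1"
        d = _divergence(idx[:t], hist[:t], mean_bg) + _divergence(idx[t:], hist[t:], mean_fg)
        if d < best_d:
            best_t, best_d = t, d
    return best_t
```

Compared with the maximum entropy sketch, each candidate threshold here requires the class means of the input pattern as well as the divergence sums, which is consistent with the longer processing time reported for this scheme.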

Results and discussions
After applying the thresholding schemes discussed above, the data logic can be determined; hence the BER can be obtained. In the VLC packet, 32-bit, 48-bit, 64-bit, and 96-bit payloads are compared, while 12 bits are used for the header. These correspond to net data rates of 1920 bit/s (32 bit/frame x 60 frame/s), 2880 bit/s, 3840 bit/s and 5760 bit/s respectively. For each thresholding scheme, we have experimentally evaluated the number of divisions and found that 12 divisions are optimum. Figures 2(a)-2(d) show the grayscale value patterns with 32-bit and 96-bit payloads, respectively, when the maximum entropy thresholding is applied. Figures 2(a) and 2(c) show the grayscale value patterns without the smoothing process described in Section 2. From Fig. 2(a), we can observe that at the low data rate, the thresholding process can easily define the grayscale pattern logic whether or not the smoothing is applied. However, from Fig. 2(c), there is a low ER, particularly in the fast-changing data pattern, and some fast-changing patterns cannot be defined correctly by the maximum entropy thresholding. The smoothing process is then applied, as shown in Fig. 2(d). It is worth noting that the performance is influenced by the incident angle of the visible light signal on the camera. We can apply the region-grow algorithm [17] to track the light source; the detailed analysis is reported in [17].
Finally, the BER performances of the maximum entropy thresholding and the minimum cross entropy thresholding, with and without the smoothing process, are shown in Fig. 4(a). It is observed that the smoothing is an important step for improving the performance. We also compare the BER and processing time of our proposed entropy thresholding schemes with other thresholding schemes reported in [14], such as the quick adaptive, polynomial and iterative thresholding schemes, as shown in Figs. 4(b) and 4(c). As shown in Fig. 4(b), the quick adaptive thresholding shows the best BER performance, while the other thresholding schemes perform similarly. However, as shown in Fig. 4(c), the processing time of the quick adaptive scheme is the longest. The offline processing is performed on a personal computer with an Intel i5-4210U processor @ 1.7 GHz, 4 GB RAM, and Matlab 2007. Figure 4(d) shows the net data rates achieved for the different payload sizes, showing that 5760 bit/s can be achieved in the 96-bit payload case (96 bit/frame x 60 frame/s).
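The net data rates quoted above follow directly from the payload size and the 60 frame/s camera rate, as this short Python check shows. It counts payload bits only and assumes one decoded packet per frame, as implied by the "bit/frame x frame/s" expressions in the text.

```python
FRAME_RATE_FPS = 60                 # camera frame rate used in the experiment
PAYLOAD_BITS = [32, 48, 64, 96]     # payload sizes compared in the text

# Net data rate = payload bits per frame x frames per second.
net_rates = {bits: bits * FRAME_RATE_FPS for bits in PAYLOAD_BITS}

for bits, rate in net_rates.items():
    print(f"{bits}-bit payload: {rate} bit/s")
```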

Summary
In this work, two entropy thresholding algorithms, (i) maximum entropy thresholding and (ii) minimum cross entropy thresholding, were discussed and compared. Experimental BER and processing time evaluations were performed. The BER performance of both thresholding schemes was nearly the same. However, the processing time required by the minimum cross entropy scheme was longer than that of the maximum entropy scheme, owing to the relatively heavy processing needed to compute both the input and output grayscale value patterns. A net data rate of 5760 bit/s can be achieved when using the 96-bit payload.

Fig. 2. Experimental grayscale value pattern with maximum entropy thresholding for 32-bit data payload (a) before and (b) after smoothing; and for 96-bit data payload (c) before and (d) after smoothing.

Fig. 3. Experimental grayscale value pattern with minimum cross entropy thresholding for 32-bit data payload (a) before and (b) after smoothing; and for 96-bit data payload (c) before and (d) after smoothing.

Figures 3(a)-3(d) show the grayscale value patterns with 32-bit and 96-bit payloads, respectively, when the minimum cross entropy thresholding is applied. Figures 3(a) and 3(c) show the grayscale value patterns without the smoothing process. Similar to the maximum entropy thresholding scheme, in the low data rate case the smoothing process may not be necessary. However, in the high data rate case without smoothing, some fast-changing patterns cannot be defined correctly, as shown in Fig. 3(c); this issue can be mitigated when the smoothing is applied, as shown in Fig. 3(d).

Fig. 4. (a), (b) BER performance and (c) processing time of the different thresholding schemes; (d) the corresponding net data rates.