Maximum likelihood estimation of vehicle position for outdoor image sensor-based visible light positioning system

Abstract. Image sensor-based visible light positioning can be applied not only to indoor environments but also to outdoor environments. To determine the performance bounds of the positioning accuracy from the view of statistical optimization for an outdoor image sensor-based visible light positioning system, we analyze and derive the maximum likelihood estimation and corresponding Cramér–Rao lower bounds of vehicle position, under the condition that the observation values of the light-emitting diode (LED) imaging points are affected by white Gaussian noise. For typical parameters of an LED traffic light and in-vehicle camera image sensor, simulation results show that accurate estimates are available, with positioning error generally less than 0.1 m at a communication distance of 30 m between the LED array transmitter and the camera receiver. With the communication distance being constant, the positioning accuracy depends on the number of LEDs used, the focal length of the lens, the pixel size, and the frame rate of the camera receiver.

Xiang Zhao and Jiming Lin

Introduction
In recent years, with the rapid development of solid-state lighting technology, white light-emitting diodes (LEDs) have been widely used for lighting, display, and the transmission and/or reception of data [1,2]. Compared with incandescent and fluorescent lights, white LEDs are characterized by long life expectancy, high energy efficiency, and low cost, and they can be modulated at a relatively high speed that is undetectable to the human eye. To date, a considerable amount of white LED-based research has been carried out, and it falls into two categories: visible light communication (VLC) and visible light positioning (VLP).
For VLC, the intrinsic features of white LEDs make them suitable for high-speed communication. First, VLC is based on the lighting function of white LEDs. To ensure sufficient light intensity, illumination levels of 400 to 1000 lux [3] are often required. Therefore, the signal-to-noise ratio is high enough for VLC. Second, the radiation spectrum of white LEDs spans from 400 to 800 THz; thus, high channel capacity is achievable in accordance with the Shannon formula. At present, most research on high-speed VLC is confined to the indoor environment and aims mainly to improve the modulation bandwidth of LEDs [4], develop improved modulation technology [5], and design multiplexing schemes [6]. The highest data rate reported so far is for a wavelength division multiplexing (WDM) VLC system [7], where carrierless amplitude and phase modulation technology and adaptive equalization technology are jointly used to achieve a data rate of 4.5 Gbps in the laboratory.
VLP [8-10] can be divided, according to the optical reception device used at the receiver, into photodiode (PD)-based VLP and image sensor (IS)-based VLP. Since the PD is sensitive to the direction of the light beam, a PD-based VLP system can fail if the PD is flipped over or moved out of the range covered by the LED; thus, it has limited mobility and is only suitable for slow motion or quasistatic conditions. In addition, PDs cannot be used outdoors under direct solar radiation. This is because PDs can only detect the optical power of incoming light, and since direct solar radiation is usually strong, the PD, which has a limited response, is saturated by the intense optical power. Therefore, most of the PD-based VLP systems proposed are for indoor positioning [11-17].
For the IS-based VLP [9, 18-20], an IS is used as the optical reception device. The IS can detect not only the intensity but also the angle of arrival (AOA) of incoming light. The IS consists of many pixels, and different light sources can be spatially separated using their imaging points on the IS. Here, the light sources include various LED sources (such as an indoor LED dome light, an outdoor LED traffic light, or the LED brake light or headlight of a vehicle) and noise sources (such as the Sun and other ambient lights). By differentiating the imaging positions of the light sources, LED sources can be distinguished from multiple noise sources by a simple feature matching algorithm. Consequently, the IS-based VLP is available not only for indoor but also for outdoor environments. Furthermore, combined with image processing and digital signal processing technology, the IS-based VLP can be utilized for driving safety applications such as collision warning and avoidance, lane change assistance, pedestrian detection, and adaptive cruise control.
To date, the published papers on the IS-based VLP have mostly focused on applied research, such as indoor navigation systems [21], outdoor intelligent traffic systems [22-25], and various location-based services [26]. These studies have shown that accurate localization is achievable; however, little has been published about the analytical performance bounds of the positioning accuracy from the view of statistical optimization. Determining the positioning accuracy will allow the parameters governing IS-based VLP systems to be optimized.
The contributions of this paper are as follows. First, we analyze and derive the maximum likelihood estimation (MLE) and corresponding Cramér-Rao lower bounds (CRLB) for a typical outdoor IS-based VLP system, assuming a white Gaussian model for the system noise. Second, we analyze the effect of the system parameters on the CRLB. When a camera IS is used as the receiver, several types of noise are generated by the IS. When shot noise is dominant, the system noise variance is influenced by many factors, such as the total received optical power, the pixel size, the focal length, and the frame rate of the camera receiver. Because the derived CRLB is proportional to the system noise variance, we analyze in detail the system noise and the parameters affecting its variance.
The rest of this paper is organized as follows. In Sec. 2, an outdoor IS-based VLP system model is introduced, where the transmitter is the LED array of a traffic light, and the camera receiver is assumed to be mounted on the dashboard of a vehicle. In Sec. 3, the MLE of the vehicle position is derived under the condition that the observation values of the LEDs' imaging points are affected by white Gaussian noise. The performance analysis is completed in Sec. 4, where the CRLB is deduced and the parameters affecting the CRLB are analyzed in detail. In Sec. 5, simulation results are given for a typical outdoor scenario. Conclusions are drawn in Sec. 6.
Notations: The operators $\{\cdot\}^T$, $E[\cdot]$, and $\mathrm{var}(\cdot)$ denote the transpose of a matrix, the expectation, and the variance of a random variable or matrix, respectively.

System Model
For the outdoor IS-based VLP system, as shown in Fig. 1, the transmitter may be the LED array of the traffic light in a city crossing, and the receiver may be a camera IS mounted on the dashboard of a vehicle. The signal from the LED and the image of the LED on the camera receiver are jointly used to determine the location of the receiver, which is assumed to be the vehicle position.
In Fig. 1, there are three coordinate systems: the three-dimensional (3-D) world coordinate system, the 3-D camera coordinate system, and the two-dimensional (2-D) image plane coordinate system. Any LED $P_i$ ($i = 1, 2, \ldots, N$) from the LED array transmitter is imaged into an imaging point $p_i$ ($i = 1, 2, \ldots, N$) in the image plane through the center of the lens. It is assumed that LED $P_i$ is located at $P_i = (X_i, Y_i, Z_i)^T$ in the 3-D world coordinate system and is known a priori. The imaging point of LED $P_i$ is $p_i = (x_i, y_i)^T$ in the 2-D image plane coordinate system, which can be measured via image processing and signal processing technologies. However, the measurement value of the imaging point is often influenced by noise. When shot noise is the dominant noise source, system noise can be viewed as white Gaussian noise [27,28]. Hence, our goal is to estimate the location of the camera receiver under white Gaussian noise, to obtain the MLE, and finally to derive the CRLB.
For any LED $P_i$ from the LED array transmitter, it satisfies

$$P_i = R\,P_{ci} + O_s, \qquad i = 1, 2, \ldots, N, \tag{1}$$

where $P_{ci} = (X_{ci}, Y_{ci}, Z_{ci})^T$, $i = 1, 2, \ldots, N$, is the coordinate of LED $P_i$ in the camera coordinate system. The camera coordinate system has its origin at the center of the camera receiver, the direction of $Z_c$ is perpendicular to the 2-D image plane, and the $Z_c$ axis is usually called the optical axis. $R$ is the rotation matrix of the camera receiver from the camera coordinate system to the world coordinate system, which is a $3 \times 3$ orthogonal matrix. $O_s = (X_s, Y_s, Z_s)^T$ is the world coordinate of the center of the camera receiver, which is assumed to be the vehicle position since the camera receiver is fixed on the vehicle, presumably on the dashboard. The rotating process of the camera receiver from the camera coordinate system to the world coordinate system is shown in Fig. 2. The rotation angles $\phi$ and $\omega$ can be read directly from the inclination sensor attached to the camera receiver; however, the azimuth angle $\kappa$ has to be calculated. For simplicity, in this paper we assume that the vehicle is running on a reasonably flat terrain plane without azimuth rotation; that is to say, the orientation of the camera coordinate system is the same as that of the world coordinate system, so that the rotation matrix from the camera coordinate system to the world coordinate system can be expressed as $R = E$, where $E$ denotes the identity matrix.
The goal is to estimate the vehicle position, which is assumed to be the world coordinates $O_s$ of the center of the camera receiver, under the condition that the system is affected by white Gaussian noise. The system parameters are listed in Table 1.
In addition, since the camera receiver is fixed on the dashboard of a vehicle, the height $Z_s$ of the camera receiver is known a priori; the distance between the traffic light and the camera receiver along the direction of the optical axis is then $h = Z_i - Z_s$. In the 3-D camera coordinate system, the relationship between $P_{ci} = (X_{ci}, Y_{ci}, Z_{ci})^T$ and $p_i = (x_i, y_i)^T$, $i = 1, 2, \ldots, N$, can be described, with the focal length of the lens being $f$, as

$$x_i = f\,\frac{X_{ci}}{Z_{ci}}, \qquad y_i = f\,\frac{Y_{ci}}{Z_{ci}}. \tag{3}$$

Rearranging Eqs. (1) and (3) with $R = E$ and $Z_{ci} = h$, we get the mathematical relationship between the LEDs $\{(X_i, Y_i)\}_{i=1}^{N}$ and the measurement values $\{(\tilde{x}_i, \tilde{y}_i)\}_{i=1}^{N}$ of their imaging points, which can be written as

$$\tilde{x}_i = \frac{f}{h}(X_i - X_s) + n_{xi}, \qquad \tilde{y}_i = \frac{f}{h}(Y_i - Y_s) + n_{yi}, \qquad i = 1, 2, \ldots, N, \tag{4}$$

where the measurement noises $\{n_{xi}\}_{i=1}^{N}$ and $\{n_{yi}\}_{i=1}^{N}$ are independent white Gaussian noise in the $x$ and $y$ directions of the 2-D IS plane, with the same mean 0 and variance $\sigma^2$.
Our goal is to estimate the parameter vector $r = (X_s, Y_s)^T$ of the vehicle position, derive its MLE, and finally obtain the CRLB for white Gaussian noise.
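The measurement model described above can be illustrated with a short Python sketch. It assumes the pinhole relation with a positive sign, $\tilde{x}_i = f(X_i - X_s)/h + n_{xi}$ (and likewise for $y$), and $R = E$; the function name `image_points` and all numerical values are hypothetical and are not taken from the paper's tables.

```python
import random


def image_points(leds_xy, vehicle_xy, f, h, sigma=0.0, rng=None):
    """Return image-plane points (x_i, y_i) for each LED.

    Assumed model (sign convention is an assumption):
        x_i = f * (X_i - X_s) / h + n_xi,  y_i = f * (Y_i - Y_s) / h + n_yi
    with n_xi, n_yi independent zero-mean Gaussian noise of std `sigma`.
    """
    rng = rng or random.Random()
    X_s, Y_s = vehicle_xy
    scale = f / h
    return [
        (scale * (X_i - X_s) + rng.gauss(0.0, sigma),
         scale * (Y_i - Y_s) + rng.gauss(0.0, sigma))
        for X_i, Y_i in leds_xy
    ]


# Hypothetical geometry: four LEDs, 35 mm lens, 30 m optical-axis distance.
leds = [(1.9, 30.7), (2.1, 30.7), (1.9, 30.9), (2.1, 30.9)]
vehicle = (2.02, 30.8)
clean = image_points(leds, vehicle, f=0.035, h=30.0)
noisy = image_points(leds, vehicle, f=0.035, h=30.0, sigma=1e-5,
                     rng=random.Random(1))
```

Passing a seeded `random.Random` instance keeps the noise realization reproducible, which is convenient when comparing estimators on the same data.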

Maximum Likelihood Estimation
Based on the measurement values $\{\tilde{x}_i\}_{i=1}^{N}$ and $\{\tilde{y}_i\}_{i=1}^{N}$ and the LED coordinates $\{X_i\}_{i=1}^{N}$ and $\{Y_i\}_{i=1}^{N}$, the log-likelihood function of the parameter vector $r = (X_s, Y_s)^T$ of the vehicle position is given as

$$\ln L(X_s, Y_s) = -N \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} \left\{ \left[\tilde{x}_i - \frac{f}{h}(X_i - X_s)\right]^2 + \left[\tilde{y}_i - \frac{f}{h}(Y_i - Y_s)\right]^2 \right\}. \tag{5}$$

Differentiating the log-likelihood function with respect to $X_s$ gives

$$\frac{\partial \ln L(X_s, Y_s)}{\partial X_s} = -\frac{f}{h\sigma^2} \sum_{i=1}^{N} \left[\tilde{x}_i - \frac{f}{h}(X_i - X_s)\right]. \tag{6}$$

Let $\partial \ln L(X_s, Y_s)/\partial X_s = 0$; then the MLE of the position parameter $X_s$ is given as

$$\hat{X}_s = \frac{1}{N}\sum_{i=1}^{N} X_i - \frac{h}{f}\,\frac{1}{N}\sum_{i=1}^{N} \tilde{x}_i. \tag{7}$$

Similarly, differentiating the log-likelihood function with respect to $Y_s$ gives

$$\frac{\partial \ln L(X_s, Y_s)}{\partial Y_s} = -\frac{f}{h\sigma^2} \sum_{i=1}^{N} \left[\tilde{y}_i - \frac{f}{h}(Y_i - Y_s)\right]. \tag{8}$$

Let $\partial \ln L(X_s, Y_s)/\partial Y_s = 0$; then the MLE of the position parameter $Y_s$ is given as

$$\hat{Y}_s = \frac{1}{N}\sum_{i=1}^{N} Y_i - \frac{h}{f}\,\frac{1}{N}\sum_{i=1}^{N} \tilde{y}_i. \tag{9}$$

Fig. 2 Rotating process of the camera receiver from the camera coordinate system to the world coordinate system.
Consequently, the MLE of the vehicle position can be obtained by finding the means of the measurement values $\{\tilde{x}_i\}_{i=1}^{N}$ and $\{\tilde{y}_i\}_{i=1}^{N}$ and the means of the LED coordinates $\{X_i\}_{i=1}^{N}$ and $\{Y_i\}_{i=1}^{N}$. Figure 3 shows the estimated values of $X_s$ and $Y_s$ when $\sigma$ is $10^{-3}$. It can be seen that the estimates fluctuate around the true value ($X_s = 2.02$ m and $Y_s = 30.8$ m); this is because each run of the program is independent and therefore sees a different noise realization.
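As a minimal sketch of the closed-form estimator described above, the following Python code computes the position estimate from the means of the LED coordinates and of the measured image points. It assumes the measurement model $\tilde{x}_i = f(X_i - X_s)/h + n_{xi}$ (positive sign convention, $R = E$); the function name `mle_position` and the numerical values are hypothetical.

```python
import random
from statistics import fmean


def mle_position(leds_xy, image_xy, f, h):
    """Estimate (X_s, Y_s) as mean LED coordinate minus (h/f) * mean image coordinate."""
    X_bar = fmean(X for X, _ in leds_xy)
    Y_bar = fmean(Y for _, Y in leds_xy)
    x_bar = fmean(x for x, _ in image_xy)
    y_bar = fmean(y for _, y in image_xy)
    return X_bar - (h / f) * x_bar, Y_bar - (h / f) * y_bar


# Hypothetical setup: four LEDs, 35 mm lens, 30 m distance, 10 um noise std.
f, h, sigma = 0.035, 30.0, 1e-5
leds = [(1.9, 30.7), (2.1, 30.7), (1.9, 30.9), (2.1, 30.9)]
true_pos = (2.02, 30.8)

rng = random.Random(0)
measured = [
    (f * (X - true_pos[0]) / h + rng.gauss(0.0, sigma),
     f * (Y - true_pos[1]) / h + rng.gauss(0.0, sigma))
    for X, Y in leds
]
estimate = mle_position(leds, measured, f, h)
```

Because the estimator is linear in the measurements, image-plane noise is amplified by the factor $h/f$, which is why longer distances and shorter focal lengths degrade the estimate.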

Performance Analysis
The CRLB gives a lower bound on the variance attainable by any unbiased estimator. To better illustrate the performance of an estimation method, its error can be compared with the CRLB. The regularity condition of the CRLB [29] holds for the given estimation problem since Eqs. (6) and (8) are finite, and the expected value of Eqs. (6) and (8) is 0.
The CRLB of the vector parameter $r = (X_s, Y_s)^T$ can be obtained in three steps. First, from Eqs. (6) and (8) we get the second-order derivatives of the log-likelihood function with respect to $X_s$ and $Y_s$, respectively. Second, taking the negative expectations of the second-order derivatives yields

$$\begin{cases} -E\!\left[\dfrac{\partial^2 \ln L(X_s, Y_s)}{\partial X_s^2}\right] = \dfrac{N f^2}{h^2 \sigma^2}, \\[2ex] -E\!\left[\dfrac{\partial^2 \ln L(X_s, Y_s)}{\partial Y_s^2}\right] = \dfrac{N f^2}{h^2 \sigma^2}, \\[2ex] -E\!\left[\dfrac{\partial^2 \ln L(X_s, Y_s)}{\partial X_s\,\partial Y_s}\right] = 0. \end{cases} \tag{11}$$

The $2 \times 2$ Fisher information matrix $I(r)$ is written as

$$I(r) = \begin{bmatrix} \dfrac{N f^2}{h^2 \sigma^2} & 0 \\ 0 & \dfrac{N f^2}{h^2 \sigma^2} \end{bmatrix}. \tag{12}$$

Finally, the CRLB of the vector parameter $r = (X_s, Y_s)^T$ can be derived by taking the $[i, i]$th element of the inverse of $I(r)$, $i = 1, 2$, from Ref. 29. The inverse of the $2 \times 2$ Fisher information matrix $I(r)$ is expressed as

$$I^{-1}(r) = \begin{bmatrix} \dfrac{h^2 \sigma^2}{N f^2} & 0 \\ 0 & \dfrac{h^2 \sigma^2}{N f^2} \end{bmatrix}. \tag{13}$$

Consequently, the CRLB of the vehicle position for white Gaussian noise is given as

$$\mathrm{var}(\hat{X}_s) \ge \frac{h^2 \sigma^2}{N f^2}, \qquad \mathrm{var}(\hat{Y}_s) \ge \frac{h^2 \sigma^2}{N f^2}. \tag{14}$$

Figure 4 shows the performance comparison of the CRLB and the mean square positioning error (MSPE) for $\sigma^2 \in [-20, 60]$ dB. The MSPE is defined as $E[(\hat{X}_s - X_s)^2 + (\hat{Y}_s - Y_s)^2]$. Note that the decibel scale is employed on both axes to facilitate the presentation [30]. It can be seen that the MSPE is proportional to $\sigma^2$ and essentially coincides with the CRLB.

From Eq. (14), we know that the CRLB is proportional to the noise variance $\sigma^2$ when the number of LEDs used $N$, the focal length of the camera receiver $f$, and the distance $h$ are known. However, when a camera IS is used as the receiver for an outdoor IS-based VLP system, several types of noise are generated by the IS. When shot noise is dominant, the system noise variance is influenced by many factors, such as the total received optical power, the pixel size, the focal length, and the frame rate of the camera receiver.
In the following, we analyze in detail the system noise in the IS-based VLP system and the parameters affecting the system noise variance.
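The comparison between the MSPE and the CRLB discussed above can be reproduced in outline with a small Monte Carlo experiment. The sketch below assumes the measurement model $\tilde{x}_i = f(X_i - X_s)/h + n_{xi}$ (positive sign convention, $R = E$), for which the per-axis bound is $h^2\sigma^2/(N f^2)$, so that the bound on the MSPE is taken as the sum of the two per-axis bounds, $2h^2\sigma^2/(N f^2)$. Function names and numerical values are hypothetical and are not the paper's simulation settings.

```python
import random
from statistics import fmean


def crlb_position(f, h, sigma, n_leds):
    """Sum of the per-axis bounds h^2 sigma^2 / (N f^2) for X_s and Y_s."""
    return 2.0 * (h * sigma) ** 2 / (n_leds * f ** 2)


def simulate_mspe(leds_xy, true_pos, f, h, sigma, trials, seed=0):
    """Monte Carlo estimate of E[(X_hat - X_s)^2 + (Y_hat - Y_s)^2]."""
    rng = random.Random(seed)
    X_s, Y_s = true_pos
    X_bar = fmean(X for X, _ in leds_xy)
    Y_bar = fmean(Y for _, Y in leds_xy)
    sq_err = 0.0
    for _ in range(trials):
        # Draw a fresh noise realization for every LED image point.
        x_bar = fmean(f * (X - X_s) / h + rng.gauss(0.0, sigma) for X, _ in leds_xy)
        y_bar = fmean(f * (Y - Y_s) / h + rng.gauss(0.0, sigma) for _, Y in leds_xy)
        X_hat = X_bar - (h / f) * x_bar
        Y_hat = Y_bar - (h / f) * y_bar
        sq_err += (X_hat - X_s) ** 2 + (Y_hat - Y_s) ** 2
    return sq_err / trials


# Hypothetical setup: four LEDs, 35 mm lens, 30 m distance, 10 um noise std.
leds = [(1.9, 30.7), (2.1, 30.7), (1.9, 30.9), (2.1, 30.9)]
mspe = simulate_mspe(leds, (2.02, 30.8), f=0.035, h=30.0, sigma=1e-5, trials=20000)
bound = crlb_position(f=0.035, h=30.0, sigma=1e-5, n_leds=len(leds))
ratio = mspe / bound  # expected to be close to 1 under this model
```

Under this linear Gaussian model the estimator is unbiased and efficient, so the simulated MSPE should approach the bound as the number of trials grows.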

System Noise
There are two basic types of noise generated by the IS: pattern noise (PN) and random noise (RN). PN is spatially distributed, can be directly observed by the human eye, and does not vary from frame to frame. The effect of PN on image quality is far greater than that of RN, but it can be effectively suppressed or eliminated through correlated double sampling or flat field correction. Hence, the effect of PN is not considered in this paper.
The quantized values of RN vary with each frame of the image, and RN obeys a statistical distribution. One typical RN is shot noise, which is generated by the random variation of photoinduced charge carriers with incoming light in the semiconductor of the camera receiver. When the number of photoinduced charge carriers is large enough, shot noise is Gaussian distributed and white. In the IS-based VLP system, shot noise is mainly made up of three parts: quantum noise generated from the observation point of the image corresponding to each LED, quantum noise coming from the interference of other LEDs, and ambient light noise from fluorescent or incandescent lights, the sun, and so on. Since the IS has the ability to spatially separate sources, the imaging points of discrete LEDs on the camera IS receiver can be resolved; that is, noise from the interference of other LEDs is so small that it can be classified as ambient light noise. Hence, when shot noise is dominant, the system noise variance can be expressed as in Eq. (15), where $q$ is the electronic charge, $\rho$ is the conversion coefficient from the optical to the electrical domain and is often assumed to be 0.4 mA/mW, $P_r$ is the total received optical power of the camera receiver, $P_n$ is the power of ambient light noise per unit area, $A_{\mathrm{total}}$ is the total detecting area of the IS, $A$ is the effective detecting area corresponding to a single LED, $I_2$ is the noise bandwidth factor (often $I_2 = 0.562$), and $R_b$ is the data transmission rate. Because the frame rate is equal to the sampling rate of the camera, if Nyquist sampling is used, the frame rate of the camera should be at least twice the data transmission rate.

Parameters Affecting System Noise Variance
In this paper, the channel scenario shown in Fig. 5(a) is used for the outdoor IS-based VLP system, where the LED array transmitter is placed on horizontal ground and the camera receiver is fixed on the dashboard of a vehicle, with the center of the LED array transmitter on the optical axis of the camera receiver.

Total received optical power
If $N$ LEDs are used to locate an IS receiver, the total received optical power $P_r$ of the IS is $P_r = \sum_{i=1}^{N} H_i(0) P_t$ when each LED transmits constant optical power $P_t$ over its line of sight (LOS) channel. A lateral view of the transmitter-receiver channel is shown in Fig. 5(b). For the $i$th channel, $i = 1, 2, \ldots, N$, $H_i(0)$ is the direct current (DC) gain, $m$ is the order of Lambertian emission (generally $m = 1$), $\phi_i$ is the angle of irradiance, $\varphi_i$ is the angle of incidence with $0 \le \varphi_i \le \varphi_C$, and $\varphi_C$ is the field of view (FOV) of the IS receiver. $D_i$ is the propagation distance from each LED transmitter to the camera receiver. For a communication distance $h$ between the LED transmitter and the camera receiver, if $\phi_i = \varphi_i$, then $\cos(\phi_i) = h/D_i$, and the total received optical power can be expressed as in Eq. (16), where $C(\varphi) = \sum_{i=1}^{N} [(m + 1)\cos^{m+3}(\varphi_i)/2\pi]$, which depends on the incidence angles. It is assumed that all incidence angles of the LOS links are within the FOV of the receiver.
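The geometry factor $C(\varphi)$ defined above is straightforward to evaluate numerically. The sketch below computes it from the lateral offset of each LED from the optical axis, using $\cos(\varphi_i) = h/D_i$. The function name `geometry_factor`, the parameterization by lateral offsets, and the rule that LEDs outside the FOV contribute nothing are assumptions made for illustration.

```python
import math


def geometry_factor(lateral_offsets_m, h, m=1, fov_rad=math.pi / 2):
    """C(phi) = sum_i (m + 1) * cos(phi_i)**(m + 3) / (2 * pi).

    lateral_offsets_m: non-negative distance of each LED from the optical
                       axis [m] (hypothetical parameterization)
    h:                 distance along the optical axis [m]
    m:                 Lambertian order
    fov_rad:           receiver field of view; LEDs beyond it are skipped
                       (assumption)
    """
    total = 0.0
    for d in lateral_offsets_m:
        phi = math.atan2(d, h)  # incidence angle, so cos(phi) = h / D_i
        if phi <= fov_rad:
            total += (m + 1) * math.cos(phi) ** (m + 3) / (2.0 * math.pi)
    return total


# Hypothetical comparison at h = 30 m: on-axis LEDs versus LEDs 5 m off-axis.
on_axis = geometry_factor([0.0, 0.0, 0.0, 0.0], h=30.0)
off_axis = geometry_factor([5.0, 5.0, 5.0, 5.0], h=30.0)
```

For small LED arrays at long range the incidence angles are close to zero, so $C(\varphi)$ is nearly $N(m+1)/2\pi$ and grows linearly with the number of LEDs.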

Effective area for detecting
It is necessary to calculate the image size corresponding to one LED when a camera IS is used as the receiver. The imaging process of a single LED through the lens onto the camera IS is shown in Fig. 6. According to Newton's formula, if the diameters of the LED and the corresponding image are $L$ and $l$, respectively, for a focal length of $f$ and a distance of $h$ between the LED and the lens, then these parameters satisfy $l = fL/h$. The distance at which the LED generates an image that falls into exactly one pixel is referred to as the critical distance $d_c$. If $h \ge d_c$, the image of the LED falls into only one pixel; then the effective area for detecting is $A = w^2$, where $w$ is the width of a pixel.
If $h < d_c$, the image of the LED falls into several pixels; then $A = l^2 = (fL/h)^2$.
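The two cases for the effective detecting area can be sketched as follows. The critical distance is taken here as $d_c = fL/w$, which follows from setting the image diameter $l = fL/h$ equal to the pixel width $w$; the function names and the example LED diameter are hypothetical.

```python
def critical_distance(L, f, w):
    """Distance at which the LED image diameter l = f * L / h equals one pixel width w."""
    return f * L / w


def effective_area(L, f, h, w):
    """Effective detecting area A for a single LED image on the sensor.

    L: LED diameter [m], f: focal length [m],
    h: LED-to-lens distance [m], w: pixel width [m]
    """
    if h >= critical_distance(L, f, w):
        return w ** 2          # image fits inside a single pixel
    l = f * L / h              # image diameter from l = f * L / h
    return l ** 2              # image spreads over several pixels


# Hypothetical example: 6 mm LED, 35 mm lens, 10 um pixels.
d_c = critical_distance(L=0.006, f=0.035, w=10e-6)
area_far = effective_area(L=0.006, f=0.035, h=30.0, w=10e-6)
area_near = effective_area(L=0.006, f=0.035, h=15.0, w=10e-6)
```

With these example values the critical distance is about 21 m, so the LED occupies a single pixel at 30 m but spreads over a larger area at 15 m.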

Numerical Results
Simulation experiments are performed in a channel scenario, as shown in Fig. 5(a), where the LED array transmitter is placed on horizontal ground and the camera receiver is fixed on the dashboard of a vehicle, with the center of the LED array transmitter in the optical axis of the camera receiver. The communication distance between the LED array transmitter and the camera receiver is changed from 15 to 60 m, every 5 m on a static condition. The white LEDs are used for the LED array transmitter, and a Photron IDP-Express R2000 is used for the camera IS receiver. The parameters are listed in Tables 2 and 3, respectively. In the following, we will present simulation results for the CRLB for the positioning system described in the previous section for a range of parameters, such as the communication distance, the pixel size, and the focal length and frame rate of the camera receiver.
First, we study the influence of the communication distance on CRLB. Figure 7 shows the CRLB versus the communication distance between the LED transmitter and camera receiver, from 15 to 60 m, with a step size of 15 m on a static condition. The positioning accuracy decreases with increasing communication distance. When the communication distance between the LED array transmitter and the camera receiver is 60 m, the CRLB of the vehicle position is about 0.35 m. However, when the distance is shortened to 15 m, the CRLB of the vehicle position is less than 0.05 m.
Second, we study the influence of pixel width on the CRLB. Figure 8 plots the CRLB versus the number of LEDs, which shows that the positioning error decreases as the number of LEDs increases. We vary the pixel width from 25 to 10 μm. The CRLB drops with decreasing pixel width. When four LEDs are used in the outdoor IS-based VLP system at a communication distance of 30 m between the LED array transmitter and the camera receiver, the CRLB of the vehicle position is less than 0.1 m.
Next, we study the impact of focal length on the CRLB. In Fig. 9, the CRLB is plotted as a function of the number of LEDs used. We vary the focal length from 20 to 35 mm. The CRLB falls with increasing focal length. This figure again shows that low values of the CRLB are achievable for typical camera IS parameters. For four LEDs used in the outdoor IS-based VLP system at a communication distance of 30 m between the LED array transmitter and the camera receiver, the CRLB of the vehicle position is less than 0.1 m.

Fig. 6 Imaging process of a single LED on the camera IS.
Fig. 7 Influence of communication distance on CRLB with the distance between the LED array transmitter and the camera receiver from 15 to 60 m, every 15 m on a static condition. The camera receiver has a focal length of 35 mm, a pixel width of 10 μm, and a frame rate of 1000 fps.

Finally, we investigate how the CRLB behaves as we vary the frame rate of the camera receiver. In Fig. 10, the CRLB is plotted versus the number of LEDs for various frame rates. It shows that, for a given number of LEDs, the CRLB drops as the frame rate is reduced. For four LEDs used in the outdoor IS-based VLP system at a communication distance of 30 m between the LED array transmitter and the camera receiver, the CRLB of the position of the camera receiver for a frame rate of 1000 fps is about 0.5 m. This falls to only about 0.05 m when the frame rate is decreased to 30 fps. Therefore, the positioning accuracy improves as the frame rate is reduced. However, a lower frame rate (which is equal to the sampling rate of the camera IS) directly limits the achievable data rate. This is the reason why high-speed cameras are usually utilized for VLC, while medium- and low-speed cameras are used for VLP.

Conclusion
For a typical outdoor scenario, theoretical limits on the location accuracy of an in-vehicle camera receiver are calculated by deriving the CRLB. Under the condition that the observation values of the LED imaging points are affected by white Gaussian noise, the MLE for the vehicle position is first calculated, and then the CRLB is derived. For typical parameters of a white LED array and an in-vehicle camera IS, simulation results show that accurate location estimation is achievable, with the positioning error usually on the order of centimeters for a communication distance of 30 m between the LED array transmitter and the camera receiver. For a constant communication distance, the positioning accuracy depends on the number of LEDs used, the focal length of the lens, and the pixel size and frame rate of the camera receiver. The determination of the CRLB provides a theoretical basis of statistical analysis for the optimization of outdoor IS-based VLP systems.

Fig. 9 Influence of focal length on CRLB with the focal length from 20 to 35 mm for a step size of 5 mm. The communication distance between the LED array transmitter and the camera receiver is 30 m, and the camera receiver has a pixel width of 10 μm and a frame rate of 1000 fps.
Fig. 10 Influence of frame rate on CRLB with the frame rate from 30 to 1000 fps. The communication distance between the LED array transmitter and the camera receiver is 30 m, and the camera receiver has a focal length of 35 mm and a pixel width of 10 μm.