A Novel Received Signal Strength Assisted Perspective-three-Point Algorithm for Indoor Visible Light Positioning

In this paper, a received signal strength assisted Perspective-three-Point positioning algorithm (R-P3P) is proposed for visible light positioning (VLP) systems. The basic idea of R-P3P is to jointly exploit visual and signal-strength information to estimate the receiver position using only 3 LEDs, regardless of the LEDs' orientations. R-P3P first utilizes the visual information captured by the camera to estimate the incidence angles of the visible light signals. Then, R-P3P calculates the candidate distances between the LEDs and the receiver based on the law of cosines and Wu-Ritt's zero decomposition method. Based on the incidence angles, the candidate distances and the physical characteristics of the LEDs, R-P3P selects the exact distances from all the candidates. Finally, the linear least square (LLS) method is employed to estimate the position of the receiver. Owing to the combination of visual and strength information, R-P3P achieves high accuracy using 3 LEDs regardless of the LEDs' orientations. Simulation results show that R-P3P achieves a positioning accuracy within 10 cm over 70% of the indoor area with low complexity, regardless of the LEDs' orientations.


Introduction
Indoor positioning has attracted increasing attention recently due to its numerous applications, including indoor navigation, robot movement control and advertisements in shopping malls. In this research field, visible light positioning (VLP) is one of the most promising technologies due to its high accuracy and low cost [1,2]. Visible light possesses strong directionality and low multipath interference, and thus VLP can achieve highly accurate positioning [2]. Besides, VLP utilizes light-emitting diodes (LEDs) as transmitters. Benefiting from the increasing market share of LEDs, VLP incurs relatively low infrastructure cost [2].
VLP typically employs photodiodes (PDs) or cameras as the receiver. Positioning algorithms using PDs include proximity [3], fingerprinting [4], time of arrival (TOA) [5], angle of arrival (AOA) [6] and received signal strength (RSS) [7,8]. Positioning algorithms using cameras are termed image sensing [9]. Proximity is the simplest technique, but it only provides a coarse location based on the received signal from a single LED with a unique identification code. Fingerprinting algorithms achieve enhanced accuracy at the high cost of building and updating a database. TOA and AOA algorithms require complicated hardware implementations. In contrast, RSS and image sensing algorithms are the most widely used methods due to their high accuracy and moderate cost [1]. Nowadays, both PDs and cameras are essential parts of smartphones, meaning that RSS and image sensing algorithms can be easily implemented on such popular devices [1].
However, the RSS and image sensing algorithms also have their own inherent limitations. In particular, RSS algorithms determine the position of the receiver based on the power of the

System Model
The system diagram is illustrated in Fig. 1. Four coordinate systems are utilized for positioning: the pixel coordinate system (PCS) $o_p\text{-}u_p v_p$ on the image plane, the image coordinate system (ICS) $o_i\text{-}x_i y_i$ on the image plane, the camera coordinate system (CCS) $o_c\text{-}x_c y_c z_c$ and the world coordinate system (WCS) $o_w\text{-}x_w y_w z_w$. As shown in Fig. 1, different colors represent different coordinate systems. In the PCS, ICS and CCS, the axes $u_p$, $x_i$ and $x_c$ are parallel to each other and, similarly, $v_p$, $y_i$ and $y_c$ are also parallel to each other. Besides, $o_p$ is at the upper-left corner of the image plane and $o_i$ is at the center of the image plane. In addition, $o_i$ is termed the principal point, whose pixel coordinate is $(u_0, v_0)$, whereas $o_c$ is termed the camera optical center. Both $o_i$ and $o_c$ lie on the optical axis. The distance between $o_c$ and $o_i$ is the focal length $f$, and thus the z-coordinate of the image plane in the CCS is $z_c = f$.
In the proposed positioning system, 3 LEDs mounted on the ceiling serve as the transmitters. The receiver is composed of a PD and a standard pinhole camera placed close to each other. As shown in Fig. 1, $\mathbf{n}^w_{\mathrm{LED},i}$ denotes the unknown unit normal vector of the $i$th LED in the WCS, and $\mathbf{s}^w_i = (x^w_i, y^w_i, z^w_i)$ is the coordinate of the $i$th LED in the WCS, which is assumed to be known at the transmitter and can be obtained by the receiver through visible light communications (VLC). In contrast, $\mathbf{r}^w = (x^w_r, y^w_r, z^w_r)$ is the world coordinate of the receiver to be positioned. In addition, $\varphi_i$ and $\psi_i$ are the irradiance angle and the incidence angle of the visible light, respectively. Furthermore, $\mathbf{w}^c_i$ and $\mathbf{d}^w_i$ denote the vectors from the receiver to the $i$th LED in the CCS and the WCS, respectively.
LEDs with a Lambertian radiation pattern are considered. The line-of-sight (LoS) link is the dominant component of the optical channel, and thus this work only considers the LoS channel for simplicity [16]. The channel direct current (DC) gain between the $i$th LED and the PD is given by [17]

$$H_i = \begin{cases} \dfrac{(m+1)A}{2\pi d_i^2}\cos^m(\varphi_i)\,T_s(\psi_i)\,g(\psi_i)\cos(\psi_i), & 0 \le \psi_i \le \Psi_c,\\[4pt] 0, & \psi_i > \Psi_c, \end{cases} \tag{1}$$

where $m$ is the Lambertian order of the LED, given by $m = -\ln 2 / \ln\cos\Phi_{1/2}$, where $\Phi_{1/2}$ denotes the semi-angle of the LED. In addition, $d_i = \|\mathbf{d}^w_i\| = \|\mathbf{s}^w_i - \mathbf{r}^w\|$, where $\|\cdot\|$ denotes the Euclidean norm of a vector, $A$ is the physical area of the detector at the PD, $T_s(\psi_i)$ is the gain of the optical filter, and

$$g(\psi_i) = \begin{cases} \dfrac{n^2}{\sin^2\Psi_c}, & 0 \le \psi_i \le \Psi_c,\\[4pt] 0, & \psi_i > \Psi_c, \end{cases}$$

is the gain of the optical concentrator, where $n$ is the refractive index of the optical concentrator and $\Psi_c$ is the field of view (FoV) of the PD. The received optical power from the $i$th LED can be expressed as

$$P_{r,i} = P_t H_i, \tag{2}$$

where $P_t$ denotes the optical power of the LEDs. The corresponding electrical signal-to-noise ratio is $\mathrm{SNR}_i = C H_i^2$ with $C = P_t^2 R_p^2 / \sigma^2_{\mathrm{noise},i}$, where $R_p$ is the efficiency of the optical-to-electrical conversion and $\sigma^2_{\mathrm{noise},i}$ is the total noise variance.
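As a concrete illustration, the Lambertian LoS gain above can be sketched in a few lines of Python. The numeric values here (semi-angle, detector area, filter gain, refractive index, FoV) are illustrative placeholders, not the paper's simulation parameters.

```python
import math

def lambertian_order(semi_angle_rad):
    # m = -ln(2) / ln(cos(Phi_1/2)); equals 1 for a 60-degree semi-angle
    return -math.log(2.0) / math.log(math.cos(semi_angle_rad))

def dc_gain(d, phi, psi, semi_angle, area, Ts, n_ref, fov):
    # Channel DC gain H_i of the LoS link between one LED and the PD
    if psi < 0 or psi > fov:
        return 0.0  # incidence angle outside the FoV of the PD
    m = lambertian_order(semi_angle)
    g = n_ref**2 / math.sin(fov)**2  # optical concentrator gain
    return ((m + 1) * area / (2 * math.pi * d**2)
            * math.cos(phi)**m * Ts * g * math.cos(psi))

# Example: LED 2.5 m away, 20-degree irradiance angle, 15-degree incidence angle
H = dc_gain(2.5, math.radians(20), math.radians(15),
            semi_angle=math.radians(60), area=1e-4, Ts=1.0,
            n_ref=1.5, fov=math.radians(70))
```

The received optical power from that LED is then simply `P_t * H`, and the gain drops to zero as soon as the incidence angle leaves the PD's field of view.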

Received Signal Strength Assisted Perspective-three-Point Algorithm (R-P3P)
In this section, a novel visible light positioning algorithm, termed R-P3P, is proposed. R-P3P mainly consists of three steps. In the first step, the incidence angles are estimated from the visual information captured by the camera based on single-view geometry. Then, the candidate distances between the LEDs and the receiver are obtained based on the law of cosines and Wu-Ritt's zero decomposition method [18]. Next, based on the candidate distances, the incidence angles and the semi-angles of the LEDs, the irradiance angles are calculated utilizing the RSS received by the PD, and the exact distances between the LEDs and the receiver are thereby selected. Finally, based on these distances, the position of the receiver is estimated by the LLS algorithm.

Incidence Angle Estimation
In the pinhole camera, the pixel coordinate of the projection of the $i$th LED is denoted by $\mathbf{s}^p_i = (u^p_i, v^p_i)$, and this coordinate can be obtained by the camera through image processing [9]. Based on single-view geometry, the $i$th LED, its projection onto the image plane and $o_c$ lie on the same straight line. Therefore, the camera coordinate of the projection of the $i$th LED can be expressed as

$$\mathbf{s}^c_i = \left(\frac{(u^p_i - u_0)f}{f_u},\ \frac{(v^p_i - v_0)f}{f_v},\ f\right), \tag{3}$$

where

$$\mathbf{K} = \begin{bmatrix} f_u & 0 & u_0 \\ 0 & f_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}$$

is the intrinsic parameter matrix of the camera, which can be calibrated in advance [15]. Besides, $f_u = f/d_x$ and $f_v = f/d_y$ denote the focal ratios along the $u$ and $v$ axes in pixels, respectively. In addition, $d_x$ and $d_y$ are the physical sizes of each pixel in the $x$ and $y$ directions on the image plane, respectively.
In the CCS, the vector from $o_c$ toward the $i$th LED, $\mathbf{w}^c_i$, can be expressed as

$$\mathbf{w}^c_i = \mathbf{s}^c_i - \mathbf{o}_c, \tag{4}$$

where $\mathbf{o}_c = (0, 0, 0)$ is the origin of the camera coordinate system. The estimated incidence angle of the $i$th LED can be calculated as

$$\psi_{i,\mathrm{est}} = \arccos\left(\frac{(\mathbf{n}^c_{\mathrm{cam}})^T \mathbf{w}^c_i}{\|\mathbf{w}^c_i\|}\right), \tag{5}$$

where $\mathbf{n}^c_{\mathrm{cam}} = (0, 0, 1)$ is the unit normal vector of the camera in the CCS, which is known at the receiver side, and $(\cdot)^T$ denotes the transposition of matrices. Since the absolute value of $\psi_{i,\mathrm{est}}$ remains the same in different coordinate systems, the estimated incidence angles in the WCS are also given by (5). In this way, R-P3P obtains the incidence angles regardless of the receiver orientation.

Figure 2 shows the geometric relations among the LEDs and the camera. As shown in Fig. 2, $T_i$ $(i \in \{1,2,3\})$ is the $i$th LED and $o_c$ is the camera optical center. The distance between $T_i$ and $T_j$, $d_{ij}$ $(i,j \in \{1,2,3\},\ i \ne j)$, is known in advance. Besides, $\mathbf{w}^c_i$ $(i \in \{1,2,3\})$, which can be calculated by (4), are the vectors from the receiver to $T_i$ in the CCS. Furthermore, $\alpha_{ij}$ $(i,j \in \{1,2,3\},\ i \ne j)$ is the angle between $\mathbf{w}^w_i$ and $\mathbf{w}^w_j$, i.e., $\alpha_{ij} = \angle T_i o_c T_j$, which can be calculated as

$$\alpha_{ij} = \arccos\left(\frac{(\mathbf{w}^c_i)^T \mathbf{w}^c_j}{\|\mathbf{w}^c_i\|\,\|\mathbf{w}^c_j\|}\right). \tag{6}$$
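The back-projection and angle computations above can be sketched as follows. The intrinsics ($f_u = f_v = 800$, principal point $(320, 240)$) match the simulation setup later in the paper; the pixel coordinates fed in below are made-up examples.

```python
import math

FU, FV, U0, V0 = 800.0, 800.0, 320.0, 240.0  # assumed camera intrinsics

def backproject(u, v):
    # Direction from the optical center o_c toward the LED, in the CCS.
    # The common scale factor f cancels in all angle computations.
    return ((u - U0) / FU, (v - V0) / FV, 1.0)

def norm(w):
    return math.sqrt(sum(c * c for c in w))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def incidence_angle(w):
    # Angle between w_i^c and the camera normal n_cam^c = (0, 0, 1), as in (5)
    return math.acos(w[2] / norm(w))

def alpha(wi, wj):
    # Angle T_i-o_c-T_j between two back-projected directions (alpha_ij)
    return math.acos(dot(wi, wj) / (norm(wi) * norm(wj)))

w1 = backproject(320.0, 240.0)   # LED imaged at the principal point
w2 = backproject(1120.0, 240.0)  # LED imaged 800 px to the right
```

An LED imaged at the principal point has zero incidence angle, while one imaged 800 px (one focal ratio) away is seen at 45 degrees, so `incidence_angle(w2)` and `alpha(w1, w2)` both equal pi/4.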

Distance Estimation
We define $\triangle T_i o_c T_j$ as the triangle with vertices $T_i$, $o_c$ and $T_j$. According to the law of cosines, in the triangle $\triangle T_i o_c T_j$ we have

$$d_i^2 + d_j^2 - 2 d_i d_j \cos\alpha_{ij} = d_{ij}^2,\quad i,j \in \{1,2,3\},\ i \ne j. \tag{7}$$

To simplify (7), let

$$x = \frac{d_1}{d_3},\quad y = \frac{d_2}{d_3},\quad v = \frac{d_{12}^2}{d_3^2}, \tag{8}$$

and

$$a = \frac{d_{23}^2}{d_{12}^2},\quad b = \frac{d_{13}^2}{d_{12}^2}. \tag{9}$$

Since $d_3 \ne 0$, we can obtain the following equation system, which is equivalent to (7):

$$\begin{cases} x^2 + y^2 - 2xy\cos\alpha_{12} - v = 0,\\ y^2 - 2y\cos\alpha_{23} + 1 - av = 0,\\ x^2 - 2x\cos\alpha_{13} + 1 - bv = 0. \end{cases} \tag{10}$$

From the first equation of (10),

$$v = x^2 + y^2 - 2xy\cos\alpha_{12}, \tag{11}$$

and by eliminating $v$ from (10) we have

$$\begin{cases} y^2 - 2y\cos\alpha_{23} + 1 - a\left(x^2 + y^2 - 2xy\cos\alpha_{12}\right) = 0,\\ x^2 - 2x\cos\alpha_{13} + 1 - b\left(x^2 + y^2 - 2xy\cos\alpha_{12}\right) = 0. \end{cases} \tag{12}$$

Following the same method as in [18], $d_i$ $(i \in \{1,2,3\})$ can be obtained by solving (12) based on Wu-Ritt's zero decomposition method [19] as

$$d_3 = \frac{d_{12}}{\sqrt{v}},\quad d_1 = x d_3,\quad d_2 = y d_3. \tag{13}$$

As in [18], there are up to four groups of $d_i$ $(i \in \{1,2,3\})$. Typical P3P methods require a fourth beacon to identify the right solution of $d_i$ [14,15,18]. In contrast, we obtain the right solution based on the RSS captured by the PD, as described in the next subsection.
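The change of variables above can be sanity-checked numerically: for a synthetic scene with known LED and receiver positions (the coordinates below are placeholders), the true distance ratios must satisfy the normalized law-of-cosines system exactly. This is a consistency sketch, not the Wu-Ritt zero-decomposition solver itself.

```python
import math

# Placeholder geometry: three ceiling LEDs and one receiver position
T = [(0.0, 0.0, 3.0), (2.0, 0.0, 3.0), (0.0, 2.0, 3.0)]
r = (0.5, 0.5, 0.7)

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def norm(a): return math.sqrt(sum(x * x for x in a))
def dot(a, b): return sum(x * y for x, y in zip(a, b))

w = [sub(t, r) for t in T]        # vectors from the receiver to the LEDs
d = [norm(v) for v in w]          # true distances d_1, d_2, d_3

def cos_alpha(i, j):              # cosine of the angle T_i-o_c-T_j (0-indexed)
    return dot(w[i], w[j]) / (d[i] * d[j])

d12 = norm(sub(T[0], T[1]))
d13 = norm(sub(T[0], T[2]))
d23 = norm(sub(T[1], T[2]))

# Change of variables: ratios relative to d_3, plus the known side ratios
x, y, v = d[0] / d[2], d[1] / d[2], d12**2 / d[2]**2
a, b = d23**2 / d12**2, d13**2 / d12**2

# The three normalized law-of-cosines equations; residuals vanish
# when evaluated at the true distances
eq1 = x**2 + y**2 - 2 * x * y * cos_alpha(0, 1) - v
eq2 = y**2 - 2 * y * cos_alpha(1, 2) + 1 - a * v
eq3 = x**2 - 2 * x * cos_alpha(0, 2) + 1 - b * v
```

In the algorithm the roles are reversed: the angles come from the camera and the system is solved for the unknown ratios, yielding up to four candidate distance groups.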

Irradiance Angle Estimation
According to (2), the RSS captured by the PD from the $i$th LED can be expressed as

$$P_{r,i} = \frac{C_0 \cos^m(\varphi_i)\cos(\psi_i)}{d_i^2},\quad C_0 = \frac{(m+1)A\,T_s(\psi_i)\,g(\psi_i)\,P_t}{2\pi}. \tag{14}$$

Since the distance between the PD and the camera, $d_{PC}$, is much smaller than the distances between the LEDs and the receiver, we omit $d_{PC}$ in the algorithm; the effect of $d_{PC}$ on the performance of R-P3P will be evaluated in the simulations. Therefore, with the incidence angle estimated by (5), we can obtain the irradiance angle $\varphi_i$ $(i \in \{1,2,3\})$ as

$$\varphi_i = \arccos\left(\left(\frac{P_{r,i}\, d_i^2}{C_0 \cos(\psi_{i,\mathrm{est}})}\right)^{1/m}\right). \tag{15}$$

With the four groups of $d_i$ $(i \in \{1,2,3\})$ obtained by (13), we can obtain four groups of $\varphi_i$ $(i \in \{1,2,3\})$. Fortunately, the semi-angles of the LEDs, $\Phi_{1/2}$, are known in advance. This means that the right solution of $\varphi_i$ $(i \in \{1,2,3\})$ has to comply with the following constraints:

$$0 \le \varphi_i \le \Phi_{1/2},\quad i \in \{1,2,3\}. \tag{16}$$

We can select $\varphi_i$ $(i \in \{1,2,3\})$ by (16). However, considering the effects of noise and $d_{PC}$, there may be no group of $\varphi_i$ $(i \in \{1,2,3\})$ complying with (16), or there may be more than one group complying with (16). In the former case, we relax the tolerance of (16) in steps of 5% until one group of $\varphi_i$ $(i \in \{1,2,3\})$ is found. In the latter case, we choose the final $\varphi_i$ $(i \in \{1,2,3\})$ randomly from all the groups that comply with (16). These measures inevitably introduce positioning errors. Fortunately, the probability of these cases is very low, and thus the accuracy of R-P3P is almost the same as that of the typical PnP method that requires 4 LEDs, as will be shown in the simulations. Based on the estimated $\varphi_i$ $(i \in \{1,2,3\})$, $d_i$ $(i \in \{1,2,3\})$ can be uniquely determined. In this way, we can estimate the distances between the LEDs and the receiver using only three LEDs.
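The candidate-selection idea can be sketched as follows: invert the RSS model for the irradiance angle under each candidate distance, and reject candidates for which the inversion is infeasible (cosine argument above 1, or angle beyond the semi-angle). The lumped constant `C`, the Lambertian order `M = 1` (60-degree semi-angle) and all distances/angles are placeholder values, not the paper's parameters.

```python
import math

M = 1.0                         # Lambertian order (60-degree semi-angle)
C = 1.0                         # lumped RSS-model constant (placeholder)
SEMI_ANGLE = math.radians(60)

def rss(d, phi, psi):
    # Forward RSS model: P_r = C * cos^m(phi) * cos(psi) / d^2
    return C * math.cos(phi)**M * math.cos(psi) / d**2

def irradiance_angle(p_r, d, psi):
    # Invert the RSS model for phi; return None when the candidate
    # distance d is inconsistent with the measured power
    arg = (p_r * d**2 / (C * math.cos(psi)))**(1.0 / M)
    if arg > 1.0:
        return None
    return math.acos(arg)

# True configuration (placeholders): d = 2.5 m, phi = 20 deg, psi = 15 deg
p_meas = rss(2.5, math.radians(20), math.radians(15))

phi_true = irradiance_angle(p_meas, 2.5, math.radians(15))   # true distance
phi_wrong = irradiance_angle(p_meas, 4.0, math.radians(15))  # wrong candidate
```

Here the true candidate distance recovers the 20-degree irradiance angle (well within the semi-angle), while the wrong candidate makes the cosine argument exceed 1 and is rejected.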

Position Estimation By Linear Least Square Algorithm
The distances between the LEDs and the receiver obtained in Section 3.3 satisfy

$$\left(x^w_i - x^w_r\right)^2 + \left(y^w_i - y^w_r\right)^2 + \left(z^w_i - z^w_r\right)^2 = d_i^2,\quad i \in \{1,2,3\}. \tag{17}$$

In practice, LEDs are usually deployed at the same height (i.e., $z^w_1 = z^w_2 = z^w_3$), and hence R-P3P can estimate the 2D position of the receiver $(x^w_r, y^w_r)$ with the following standard LLS estimator:

$$\hat{\mathbf{X}} = \left(\mathbf{A}^T\mathbf{A}\right)^{-1}\mathbf{A}^T\mathbf{b}, \tag{18}$$

where $\hat{\mathbf{X}} = \left(x^w_{r,\mathrm{est}},\ y^w_{r,\mathrm{est}}\right)^T$,

$$\mathbf{A} = 2\begin{bmatrix} x^w_2 - x^w_1 & y^w_2 - y^w_1 \\ x^w_3 - x^w_1 & y^w_3 - y^w_1 \end{bmatrix},\quad \mathbf{b} = \begin{bmatrix} d_1^2 - d_2^2 + \left(x^w_2\right)^2 - \left(x^w_1\right)^2 + \left(y^w_2\right)^2 - \left(y^w_1\right)^2 \\ d_1^2 - d_3^2 + \left(x^w_3\right)^2 - \left(x^w_1\right)^2 + \left(y^w_3\right)^2 - \left(y^w_1\right)^2 \end{bmatrix}. \tag{19}$$

Since $z^w_1 = z^w_2 = z^w_3$, the z-coordinate of the receiver can be calculated by substituting (18) into the first equation of (17), which yields

$$z^w_{r,\mathrm{est}} = h - \Delta, \tag{20}$$

where $h = z^w_1$ and

$$\Delta = \sqrt{d_1^2 - \left(x^w_1 - x^w_{r,\mathrm{est}}\right)^2 - \left(y^w_1 - y^w_{r,\mathrm{est}}\right)^2}. \tag{21}$$

Since (17) is quadratic in $z^w_r$, two candidate z-coordinates are obtained. However, the ambiguous solution, $z^w_{r,\mathrm{est}} = h + \Delta$, can be easily eliminated, as it implies that the receiver is above the ceiling. Therefore, R-P3P can determine the 3D position of the receiver, $\mathbf{r}^w_{\mathrm{est}} = \left(x^w_{r,\mathrm{est}}, y^w_{r,\mathrm{est}}, z^w_{r,\mathrm{est}}\right)$, with only 3 LEDs via the LLS method.
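The position-estimation step can be sketched with a noise-free synthetic example: subtracting the sphere equation of LED 1 from those of LEDs 2 and 3 cancels the z terms (equal LED heights), leaving two linear equations in $(x, y)$; the z-coordinate then follows from the first sphere equation, keeping the below-ceiling root. All coordinates are placeholders, and the 2x2 solve stands in for the general LLS estimator.

```python
import math

# Placeholder scene: LEDs at equal height h = 3 m, receiver at (1.2, 0.8, 0.9)
LEDS = [(0.0, 0.0, 3.0), (4.0, 0.0, 3.0), (0.0, 4.0, 3.0)]
r_true = (1.2, 0.8, 0.9)
d = [math.dist(t, r_true) for t in LEDS]  # distances assumed already estimated

(x1, y1, _), (x2, y2, _), (x3, y3, _) = LEDS
# Linearized system: subtract sphere equation 1 from equations 2 and 3
A = [[2 * (x2 - x1), 2 * (y2 - y1)],
     [2 * (x3 - x1), 2 * (y3 - y1)]]
b = [d[0]**2 - d[1]**2 + x2**2 - x1**2 + y2**2 - y1**2,
     d[0]**2 - d[2]**2 + x3**2 - x1**2 + y3**2 - y1**2]

# Solve the 2x2 system by Cramer's rule
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
x_est = (b[0] * A[1][1] - b[1] * A[0][1]) / det
y_est = (A[0][0] * b[1] - A[1][0] * b[0]) / det

# Recover z from the first sphere equation, keeping the below-ceiling root
h = LEDS[0][2]
delta = math.sqrt(d[0]**2 - (x1 - x_est)**2 - (y1 - y_est)**2)
z_est = h - delta
```

With exact distances the estimate reproduces the true position; with noisy distances the same construction is solved in the least-squares sense.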

SIMULATION RESULTS AND ANALYSES
As R-P3P simultaneously utilizes visual and strength information, a typical PnP algorithm [18] and CA-RSSR [8] are adopted as the baseline schemes in this section. The PnP algorithm utilizes visual information only, while CA-RSSR exploits both visual and strength information. The system parameters are listed in Table 1. Visible light signals are assumed to be modulated by on-off keying (OOK). All statistical results are averaged over $10^5$ independent runs. For each simulation run, the receiver position is selected randomly in the room. To reduce the error caused by channel noise, the received optical power is calculated as the average of 1000 measurements [7]. The pinhole camera is calibrated and has a principal point $(u_0, v_0) = (320, 240)$ and a focal ratio $f_u = f_v = 800$. The image noise is modeled as white Gaussian noise with zero mean and a standard deviation of 2 pixels [20]. Since the image noise affects the pixel coordinates of the LEDs' projections on the image plane, the pixel coordinate is obtained by processing 10 images for the same position.
We evaluate the performance of R-P3P in terms of its coverage, accuracy and computational cost in the 3D-positioning case. We define the coverage ratio (CR) of the positioning algorithms as

$$\mathrm{CR} = \frac{A_{\mathrm{effective}}}{A_{\mathrm{total}}}, \tag{22}$$

where $A_{\mathrm{effective}}$ is the indoor area in which the algorithm is feasible and $A_{\mathrm{total}}$ is the entire indoor area. Besides, the positioning error (PE) is used to quantify the accuracy, defined as

$$\mathrm{PE} = \left\|\mathbf{r}^w_{\mathrm{true}} - \mathbf{r}^w_{\mathrm{est}}\right\|, \tag{23}$$

where $\mathbf{r}^w_{\mathrm{true}} = \left(x^w_{r,\mathrm{true}}, y^w_{r,\mathrm{true}}, z^w_{r,\mathrm{true}}\right)$ and $\mathbf{r}^w_{\mathrm{est}} = \left(x^w_{r,\mathrm{est}}, y^w_{r,\mathrm{est}}, z^w_{r,\mathrm{est}}\right)$ are the world coordinates of the actual and estimated positions of the receiver, respectively. Furthermore, we use the execution time to evaluate the computational cost.

Table 2 lists the number of LEDs required for 3D positioning by R-P3P, CA-RSSR and the PnP algorithm; R-P3P requires the fewest LEDs. Figure 3 compares the coverage ratio (CR) of the three algorithms with the FoV, $\Psi_c$, varying from 0° to 80°. The LEDs tilt with an angle $\theta = 0°$, $\theta = 10°$ and $\theta = 30°$ in Fig. 3(a), Fig. 3(b) and Fig. 3(c), respectively. The positioning samples are chosen along the length, width and height of the room, with a five-centimeter separation from each other. An SNR of 13.6 dB is assumed according to the reliable communication requirement of OOK modulation [16]. As shown in Fig. 3, R-P3P achieves the highest CR for all $\Psi_c$ regardless of $\theta$. It performs consistently well from $\Psi_c = 20°$ to $\Psi_c = 80°$, with the CR exceeding 90% for $\theta = 0°$ and $\theta = 10°$, and exceeding 70% for $\theta = 30°$. The CR of R-P3P is more than 2%, 3% and 5% higher than that of the PnP algorithm for $\theta = 0°$, $\theta = 10°$ and $\theta = 30°$, respectively. Meanwhile, the CR of R-P3P is more than 8%, 10% and 18% higher than that of CA-RSSR for $\theta = 0°$, $\theta = 10°$ and $\theta = 30°$, respectively. As the tilt angle of the LEDs increases, the CR of all three algorithms decreases, while the CR advantage of R-P3P over the other two algorithms grows. Besides, the CR of R-P3P exceeds 40% for all three values of $\theta$ at $\Psi_c = 10°$.
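The two evaluation metrics are straightforward to compute from a batch of simulation runs; the sketch below uses made-up per-run errors and counts, not the paper's results.

```python
import math

def positioning_error(r_true, r_est):
    # PE: Euclidean distance between the true and estimated 3D positions
    return math.dist(r_true, r_est)

def coverage_ratio(n_feasible, n_total):
    # CR as in (22), with areas approximated by sample counts
    return n_feasible / n_total

def percentile(values, q):
    # Nearest-rank percentile, as used for "80th percentile accuracy"
    s = sorted(values)
    k = max(0, math.ceil(q / 100.0 * len(s)) - 1)
    return s[k]

# Placeholder batch of per-run positioning errors (metres)
pes = [0.02, 0.03, 0.04, 0.05, 0.06, 0.04, 0.03, 0.05, 0.07, 0.02]
```

For instance, `percentile(pes, 80)` reports the error bound met by 80% of the runs, which is how the CDF-based accuracy figures in the next subsection are read.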
In contrast, the PnP algorithm and CA-RSSR are barely feasible for $\Psi_c = 10°$. In addition, the CR of the three algorithms decreases slightly for large FoVs, since the power of the shot noise increases [21].

Accuracy Performance Of R-P3P
In this subsection, we evaluate the accuracy of R-P3P with respect to the LEDs' orientation, the image noise and the distance between the camera and the PD on the receiver.

1) Effect of the LED orientation
We first evaluate the effect of the LEDs' orientation on the 3D-positioning accuracy of R-P3P, CA-RSSR and the PnP algorithm. CA-RSSR requires the LEDs to face vertically downwards, which may be challenging to satisfy in practice. Therefore, two cases are considered for CA-RSSR: the ideal case, where the LEDs face vertically downwards, and the practical case, where the LEDs tilt with a random angle perturbation $\theta \le 5°$. In contrast, R-P3P and the PnP algorithm can operate in both cases, and thus only the practical case is considered for them. The accuracy is represented by the cumulative distribution function (CDF) of the PEs. As shown in Fig. 4, R-P3P achieves an 80th percentile accuracy of about 5 cm, which is almost the same as that of the PnP algorithm. This implies that the probability of the situations in which more than one group of $\varphi_i$ $(i \in \{1,2,3\})$ complies with (16), or no group complies with (16), is very low. Therefore, although (16) is not strict in theory, the accuracy of R-P3P is close to that of the PnP algorithm while using fewer LEDs. Besides, CA-RSSR achieves an 80th percentile accuracy of about 10 cm in the ideal case. However, the practical case of CA-RSSR exhibits a significant accuracy decline compared with the ideal case. Thus, even a slight perturbation of the LEDs' orientation can significantly impair the accuracy of CA-RSSR.
Then, we evaluate the 3D-positioning accuracy of R-P3P with varying tilt angles of the LEDs. The performance is represented by the CDF of the PEs, given $\theta = 0°, 10°, 20°, 30°, 40°$ and $60°$. As shown in Fig. 5, R-P3P achieves 80th percentile accuracies of less than 5 cm for all $\theta$. Therefore, R-P3P can be widely utilized in scenarios where the LEDs have arbitrary orientations. Besides, the accuracy of R-P3P improves slightly as the tilt angle of the LEDs increases, since the irradiance angles decrease, which in turn increases the received signal power.
2) Effect of the image noise
Since R-P3P also exploits visual information, we next evaluate the effect of the image noise on the 3D-positioning accuracy of R-P3P, CA-RSSR and the PnP algorithm for the case where the LEDs tilt with a random angle perturbation $\theta \le 5°$. The image noise is modeled as white Gaussian noise with zero mean and a standard deviation ranging from 0 to 4 pixels [20]. The mean PEs under the image noise are shown in Fig. 6. The accuracy of R-P3P is close to that of the PnP algorithm and much better than that of CA-RSSR. For R-P3P, the mean PE increases from 3 cm to 10 cm as the image noise increases; for the PnP algorithm, it increases from 0 to 9 cm. In contrast, for CA-RSSR, the mean PE remains at about 72 cm.
3) Effect of the distance between the PD and the camera
Since R-P3P exploits the PD and the camera simultaneously, we then evaluate the effect of the distance between the PD and the camera, $d_{PC}$, on the accuracy of R-P3P. We compare the 3D-positioning performance of CA-RSSR and R-P3P with varying $d_{PC}$ for the case where the LEDs tilt with a random angle perturbation $\theta \le 5°$. The performance is represented by the CDF of the PEs with $d_{PC} = 0$ cm, 1 cm, 3 cm, 6 cm and 10 cm. In particular, $d_{PC} = 0$ cm corresponds to the ideal case in which the PD and the camera overlap. As shown in Fig. 7, R-P3P achieves better performance than CA-RSSR. Specifically, R-P3P achieves an 80th percentile accuracy of about 5 cm regardless of $d_{PC}$, whereas CA-RSSR only achieves a 40th percentile accuracy of about 30 cm for all $d_{PC}$. As can be observed from Fig. 7, $d_{PC}$ has little effect on the positioning accuracy of R-P3P, which means that R-P3P can be widely used on devices with various $d_{PC}$.

Computational Cost
In this subsection, we compare the execution time of R-P3P, CA-RSSR and the PnP algorithm for 3D positioning to evaluate the computational cost [15,22]. For a fair comparison, all algorithms are implemented in Matlab on a 1.6 GHz 4-core laptop. The experiment consists of $10^5$ runs, and the results are shown in Fig. 8. Since R-P3P estimates the position of the receiver by the LLS method, its computational cost is lower than that of CA-RSSR, and its execution time is shorter than 0.001 s for almost 100% of the $10^5$ runs. Considering a typical indoor walking speed of 1.3 m/s, the execution delay of R-P3P only causes a 0.2 cm positioning error, which is acceptable for most applications. Besides, the execution time of the PnP algorithm is over 0.002 s for about 90% of the $10^5$ runs, which means the computational cost of R-P3P is less than 50% of that of the PnP algorithm.

CONCLUSION
We proposed a novel indoor positioning algorithm named R-P3P that simultaneously utilizes visual and strength information. By jointly exploiting visual and strength information, R-P3P mitigates the limitation on the LEDs' orientation. Besides, R-P3P achieves better accuracy than CA-RSSR with low complexity due to the use of the LLS method, and it requires fewer LEDs than the PnP algorithm. Simulation results indicate that R-P3P achieves a positioning accuracy within 10 cm over 70% of the indoor area with low complexity, regardless of the LEDs' orientations. Therefore, R-P3P is a promising indoor VLP approach that can be widely used in scenarios where the LEDs have arbitrary orientations. In the future, we will implement R-P3P experimentally and evaluate it on a dedicated test bed, which will be meaningful for future indoor positioning applications.