Eye Gaze Based 3D Triangulation for Robotic Bionic Eyes

Three-dimensional (3D) triangulation based on active binocular vision has a growing number of applications in computer vision and robotics. An active binocular vision system with non-fixed cameras needs to calibrate the stereo extrinsic parameters online to perform 3D triangulation, and the accuracy of the stereo extrinsic parameters and of the disparity has a significant impact on 3D triangulation precision. To reduce this impact, we propose a novel eye gaze based 3D triangulation method that does not use the stereo extrinsic parameters directly. Instead, we drive both cameras to gaze at a 3D spatial point P through visual servoing, so that the image of P stays at the principal point of each camera. We can then obtain the 3D coordinates of P through the intersection of the optical axes of the two cameras. We have performed experiments on our robotic bionic eyes to compare with a previous disparity based approach, the integrated two-pose calibration (ITPC) method. The experiments show that our method achieves results comparable with ITPC.


Introduction
Active binocular vision is a binocular vision system that can actively change its own view direction. It is beneficial for multiple applications, such as manipulation [1], three-dimensional (3D) reconstruction [2], navigation [3], and 3D mapping [4]. 3D coordinate estimation based on active binocular vision has therefore attracted extensive research interest.
Active binocular vision systems can be divided into two categories: the first category has fixed cameras and the second category has non-fixed cameras. The vision systems of ASIMO [5], HRP-3 [6], HRP-4 [7], PR2 [8], ATLAS [9], and Walkman [10] belong to the first category. However, this category cannot perceive objects that are very close to the cameras. The second category, which is more similar to the human vision system, clearly improves flexibility and extends the field of view. The vision systems of the ESCHeR head [11], Yorick head [12], MAC-EYE robot eyes [13], iCub head [14], ARMAR-III head [15], Flobi head [16], Zhang XL's robot eyes [17], ARMAR-4 head [18], and Muecas [19] belong to the second category.
Figure 1. In our proposed method, we use eye gazing, i.e., visual servoing, to actively adjust the view direction of each camera to keep the 2D image of a three-dimensional (3D) point P at the principal point. In other words, P is located at the fixation point. 3D triangulation can then be performed through the intersection of the two cameras' optical axes.
The main contribution of this paper is as follows: by keeping the target point on the optical axis of each camera through eye gazing, we can perform 3D triangulation by calculating the intersection point of the two optical axes, instead of solving the projection equations of the two cameras, which rely on disparity and stereo extrinsic parameters.

System Configuration
The anthropomorphic robotic bionic eyes we designed for human-like active stereo perception (as shown in Figure 2) are described in detail in [24]. This is a revised version with seven degrees of freedom (DOFs). It has two eyes and one neck. Each eye has two DOFs: tilt and pan. The neck has three DOFs. The joints of the neck and eyes are serially connected; that is, the links are connected serially by the neck and eye joints and form ordered open chains, in which each joint is the load of the previous one. As shown in Figure 3, our robotic bionic eyes contain two serial open chains: frame 0 → 1 → 2 → 3 → 4 → 5 → 6 and frame 0 → 1 → 2 → 3 → 7 → 8 → 9. Each eye is fitted with a USB camera with a resolution of 640 × 480 @ 120 fps. A common trigger signal is connected to both cameras in order to realize hardware synchronization. We define the coordinate systems based on the standard D-H convention. Figure 3 shows the frame definitions, and Table 1 shows the D-H parameters of the robotic bionic eyes. The transformation matrix i−1T_i between Frame i and Frame i−1 can be calculated from the D-H parameters [42].
Table 1. The D-H parameters of the robotic bionic eyes. (Here d_i, θ_i, a_i, α_i are the link offset, joint angle, link length, and link twist, respectively. The i-th joint offset is the value of θ_i in the initial state. L_1 is 64.27 mm, L_2 is 11.00 mm, L_3l is 44.80 mm, L_3r is 47.20 mm, L_4 is 13.80 mm, and L_5 is 30.33 mm.)
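The standard D-H transform mentioned above can be sketched in code. The following is a minimal illustration, not the authors' implementation, of building i−1T_i from the parameters (d, θ, a, α) and chaining the transforms along one open chain; the function names are our own.

```python
import numpy as np

def dh_transform(d, theta, a, alpha):
    """Standard D-H homogeneous transform from Frame i-1 to Frame i."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

def chain(params):
    """Multiply successive D-H transforms, e.g. frame 0 -> 1 -> ... -> n."""
    T = np.eye(4)
    for d, theta, a, alpha in params:
        T = T @ dh_transform(d, theta, a, alpha)
    return T
```

Filling `params` with one row of Table 1 per joint (with the joint offsets added to the joint angles) yields the pose of each eye frame w.r.t. the base frame.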

Eye Gaze Based 3D Triangulation Method
The main idea of the eye gaze [43][44][45][46] based 3D triangulation method is to control the view direction of both cameras so that a 3D spatial point P lies on the optical axes of both cameras. Because we can obtain the pose of each camera through forward kinematics, we can calculate the exact 3D coordinates of P by computing the intersection point of the two optical axes. We also know that, once P lies on the optical axis of a camera, the image of P lies at the principal point of the 2D image. Accordingly, we can perform image based visual servoing to change the poses of both cameras so as to keep the images of P at the principal points of the two cameras; consequently, P lies on both optical axes.
In this section, we first introduce the visual servoing based approach for eye gazing, and then we show how the intersection point is calculated.

Visual Servoing Based Eye Gazing
We use image based visual servoing to control the pose of each camera. We first derive the relationship between the velocity of the image feature points and the joint velocities. We then present a simple visual servoing control law that generates commands to drive the feature point error to zero.

Relation Matrix
Here, we use a relation matrix to relate the velocity of the image feature points to the joint velocity. The relation matrix can be decomposed into the product of an interaction matrix and a Jacobian matrix: the interaction matrix relates the velocity of an image feature point to the velocity of the camera, and the Jacobian matrix relates the velocity of the camera to the joint velocity. Let ṗ = [u̇ v̇]^T be the velocity of an image feature point and V_n the velocity of the camera; then ṗ = J_image V_n, where J_image is the interaction matrix. For a point feature in pixel coordinates (with u, v measured relative to the principal point), it takes the standard form
J_image = [ −f_u/Z_c   0   u/Z_c   u·v/f_v   −(f_u² + u²)/f_u   f_u·v/f_v
            0   −f_v/Z_c   v/Z_c   (f_v² + v²)/f_v   −u·v/f_u   −f_v·u/f_u ],
where f_u and f_v are the focal lengths in the X and Y directions of the camera, respectively, and Z_c is the depth of the spatial point P in the camera frame.
The relationship between V_n and the joint velocity θ̇ can be described as V_n = J_joint θ̇, where J_joint is the Jacobian matrix w.r.t. the camera frame. It is defined as J_joint = diag(R, R) J_θ, where R is the rotation matrix of the base frame w.r.t. the camera frame, and J_θ = [J_θ1 J_θ2 . . . J_θn] is the Jacobian matrix w.r.t. the base frame. Each column J_θi (i = 1, 2, . . . , n) can be calculated as
J_θi = [ Z_i × (O_n − O_i) ; Z_i ],
where Z_i and O_i are the direction vector of the Z axis and the origin of Frame O_i−x_i y_i z_i w.r.t. the base frame, and O_n is the origin of the camera frame. Accordingly, the relationship between the velocity of an image feature point ṗ = [u̇ v̇]^T and the joint velocity θ̇ can be formulated as ṗ = J_image J_joint θ̇.
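As an illustration (our own sketch, not the paper's code), the column formula J_θi = [Z_i × (O_n − O_i); Z_i] for revolute joints can be implemented directly:

```python
import numpy as np

def jacobian_column(z_i, o_i, o_n):
    """Column of the base-frame Jacobian for a revolute joint:
    linear part Z_i x (O_n - O_i), angular part Z_i."""
    z_i, o_i, o_n = map(np.asarray, (z_i, o_i, o_n))
    return np.concatenate([np.cross(z_i, o_n - o_i), z_i])

def base_jacobian(axes, origins, o_n):
    """Stack one column per joint: J_theta = [J_1 ... J_n]."""
    return np.column_stack([jacobian_column(z, o, o_n)
                            for z, o in zip(axes, origins)])
```

For example, a single joint rotating about the base Z axis, with the camera origin at (1, 0, 0), yields the column [0, 1, 0, 0, 0, 1]^T: the point moves along Y while the frame spins about Z.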

Image Based Visual Servoing Control Law
We use a simple image based visual servoing control law to control the motion of the camera so as to reduce the error between the expected image point p* = [u* v*]^T and the actual image point p = [u v]^T of the 3D spatial point P, where [u* v*]^T is set to the principal point [u_0 v_0]^T, which can be obtained from the intrinsic parameters, and [u v]^T is the actual pixel coordinate of the spatial point P in the camera.
The control law takes the form θ̇ = J⁺ K e, where J⁺ is the pseudo-inverse of the relation matrix, K = diag{k_1, k_2} is the gain matrix, k_1 and k_2 are the gains used to regulate the pixel errors in the X and Y directions to zero independently [47], and e = p* − p = [u* − u, v* − v]^T is the error between p* and p. From Equation (7), we can infer that the error decays to zero exponentially, with K as the exponential coefficient.
The above equation can be rewritten for the left eye and the right eye, respectively. For the left eye, we can derive the corresponding control law, where p_l = [u_l v_l]^T and p*_l = [u*_l v*_l]^T are the actual and expected pixel coordinates of the spatial point in the left camera, respectively, and θ̇_l = [θ̇_1 θ̇_2 θ̇_3 θ̇_4 θ̇_5]^T is the joint velocity.
We fix the neck joints and establish the base frame O_N−x_N y_N z_N at link 3, so the joint velocity vector reduces to θ̇_l = [0 0 0 θ̇_4 θ̇_5]^T. From Equations (10) and (11), the velocities of the left eye joints can be calculated from the pixel error, where J_l ij is the element in the i-th row and j-th column of the matrix J_l. Similarly, the velocities of the right eye joints can be calculated. In each loop of the visual servoing, we send the generated eye joint velocity commands to drive the robotic bionic eyes toward the view direction that reduces the error between p* and p in both cameras.
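A minimal sketch of one servo loop iteration, under the assumption that, once the neck is fixed, the relation between the two eye joint velocities and the pixel velocity reduces to a 2 × 2 matrix (the names `J_eye` and `eye_joint_velocities` are ours, not the paper's):

```python
import numpy as np

def eye_joint_velocities(J_eye, K, p_star, p):
    """One IBVS iteration: solve J_eye @ theta_dot = K @ e for the
    tilt/pan joint velocities, with e = p* - p the pixel error."""
    e = np.asarray(p_star, float) - np.asarray(p, float)
    return np.linalg.solve(J_eye, K @ e)

# Example with an illustrative diagonal relation matrix and gains k1 = k2 = 0.5:
J_eye = np.diag([2.0, 2.0])
K = np.diag([0.5, 0.5])
cmd = eye_joint_velocities(J_eye, K, p_star=[520.5, 430.5], p=[516.5, 434.5])
```

In a real loop, `J_eye` would be recomputed each iteration from the current image coordinates, depth estimate, and joint feedback, and `cmd` would be sent to the joint controllers.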

3D Triangulation by Calculation of Intersection Point
After visual servoing, the point P lies on the optical axis of each camera, and theoretically the intersection point of the two optical axes is the representation of P. We can calculate the intersection point geometrically once the equations of the two optical axes are obtained. Because each optical axis is the Z axis of the corresponding camera frame, we can derive the descriptions of the two optical axes through forward kinematics. In real situations, the two optical lines may not intersect each other, so we instead use the midpoint of the common perpendicular line segment of the two skew optical axes as the representation of the intersection point.

Calculation of the Coordinates of the Optical Axes
First of all, we derive the coordinates of the two optical axes L_l and L_r. The optical axis of each camera is the Z axis of the camera frame expressed w.r.t. the neck frame N, so we need to derive the transformation matrices of the two camera frames w.r.t. frame N through forward kinematics, i.e., the pose of the left camera NT_Cl and the pose of the right camera NT_Cr.
Here, NT_Cl = NT_6 · 6T_Cl, where 6T_Cl contains the head-eye parameters of the left camera, which can be calibrated using the method in [40], and NT_6 is the transformation matrix of the left eye frame w.r.t. the neck frame N, obtained from the D-H parameters of the robotic bionic eyes. The same holds for the right camera.
Similarly, NT_Cr = NT_9 · 9T_Cr, where 9T_Cr contains the head-eye parameters of the right camera, and NT_9 is the transformation matrix of the right eye frame w.r.t. the neck frame N, obtained in the same way. Accordingly, we can perform real-time calculation of L_l and L_r using Equations (14) and (16).
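Given a camera pose as a 4 × 4 homogeneous matrix, the optical axis is the line through the camera origin along the camera Z axis. A small sketch (our own helper, assuming column-vector convention):

```python
import numpy as np

def optical_axis(T_cam):
    """Return (origin, direction) of the optical axis in the frame in
    which T_cam is expressed: the translation column and the third
    column of the rotation block (the camera Z axis)."""
    T = np.asarray(T_cam, float)
    origin = T[:3, 3]
    direction = T[:3, 2]
    return origin, direction / np.linalg.norm(direction)
```

Applying this to NT_Cl and NT_Cr gives the two lines L_l and L_r in frame N.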

Calculation of the Intersection Point
We are going to calculate the intersection point of the optical lines L_l and L_r. Here, L_l and L_r denote the ideal optical axes. Due to measurement errors, the actual optical axes will in general not intersect each other; that is, L_l and L_r may not lie in the same plane. We therefore use the midpoint of their common perpendicular line segment P_1P_2 as the representation of P, as shown in Figure 4. In order to obtain P, we need to calculate P_1 and P_2. Here, P_1 is the intersection of the line L_r and the plane derived from L_l and P_1P_2, and P_2 is the intersection of the line L_l and the plane derived from L_r and P_1P_2.
First, we calculate the plane derived from L_l and P_1P_2. To do so, we introduce two points C and D on line L_r, with coordinates CrC = (0, 0, 10, 1)^T and CrD = (0, 0, 20, 1)^T in the right camera frame; their coordinates in frame N are NC = NT_Cr · CrC and ND = NT_Cr · CrD. We likewise introduce two points A and B on line L_l. In frame N (in the following statements, all quantities are represented in frame N), L_r is the line through C and D, and L_l is the line through A and B. We use n_Ll, n_Lr, n_P1P2, and n_P1P2Ll to denote the directions of L_l, L_r, P_1P_2 and the normal of the plane P_1P_2L_l, respectively. Accordingly,
n_P1P2 = n_Ll × n_Lr,
n_P1P2Ll = n_Ll × n_P1P2,
and the plane P_1P_2L_l can be written as
n_P1P2Ll · (X − A) = 0.
From Equations (18) and (21), we can obtain the intersection point P_1 between the plane P_1P_2L_l and the line L_r.
Similarly, we can obtain P_2 as the intersection of the line L_l and the plane P_1P_2L_r. Additionally, the fixation point P can be obtained as the midpoint P = (P_1 + P_2)/2. Hence, the eye gaze based 3D triangulation Algorithm 1 can be summarized as follows:
Visual Servoing Based Eye Gazing:
• Gaze the 3D point P at the principal point in the left and right cameras simultaneously through visual servoing based eye gazing. In each loop of the visual servoing:
• Calculate the joint velocity commands of the robotic bionic eyes using Equations (12) and (13) according to the pixel deviations in the left eye and right eye, respectively.
• Send the joint velocity commands to the robotic bionic eyes and get the feedback of the joint angles.

3D Triangulation by Calculation of Intersection Point:
• Calibrate the head-eye parameters 6T_Cl and 9T_Cr using the method in [40].
• If visual servoing based eye gazing is successful:
• Calculate the transformation matrices NT_6 and NT_9 using the forward kinematics with joint position feedback.
• Obtain the optical axes of the left and right cameras using Equations (14) and (16).
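The geometric core of the triangulation step, the midpoint of the common perpendicular of two skew lines, can be sketched as follows. This is the standard closest-points computation, not the authors' implementation:

```python
import numpy as np

def fixation_point(a, u, b, v):
    """Midpoint of the common perpendicular of two (possibly skew)
    lines: L_l through point a with direction u, L_r through point b
    with direction v."""
    a, u, b, v = (np.asarray(x, float) for x in (a, u, b, v))
    w0 = a - b
    A, B, C = u @ u, u @ v, v @ v
    D, E = u @ w0, v @ w0
    denom = A * C - B * B          # zero only for parallel lines
    s = (B * E - C * D) / denom    # parameter of the foot on L_l
    t = (A * E - B * D) / denom    # parameter of the foot on L_r
    p1 = a + s * u                 # foot of the perpendicular on L_l
    p2 = b + t * v                 # foot of the perpendicular on L_r
    return 0.5 * (p1 + p2)
```

If the two optical axes happen to intersect, the two feet coincide and the midpoint equals the true intersection point, so the same routine covers both cases.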

Propagation of Uncertainty
Uncertainty [48][49][50] is a quantification of the doubt about the validity of measurement results. The uncertainties of P using our method and the ITPC method are mainly propagated from the image detection uncertainty. For our method, the image uncertainty is sometimes so tiny that it produces no motion of the optical axes, because the joint feedback precision (±0.04°) is not as high as that of the image coordinates. In that case we cannot measure the uncertainty directly, since the coordinates of P do not vary. To solve this problem, we instead use the law of propagation of uncertainty [51,52] to calculate the theoretical uncertainties of the point P obtained from the two methods, based on the image detection uncertainty.

Image Detection Uncertainty
The image detection uncertainty is represented by u_u and u_v. Following the standard Type A evaluation, they can be calculated as
u_u = sqrt( Σ_{i=1..n} (u_i − ū)² / (n(n − 1)) ),
u_v = sqrt( Σ_{i=1..n} (v_i − v̄)² / (n(n − 1)) ),
where u_i, v_i are the i-th (i = 1, 2, . . . , n) independent observations of the image coordinates u, v, respectively, and ū, v̄ are the averages of the n observations of u and v, respectively.
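The standard (Type A) uncertainty of the mean of n repeated observations can be computed with a few lines (our own sketch):

```python
import math

def type_a_uncertainty(samples):
    """Standard uncertainty of the mean of n independent repeated
    observations: sqrt(sum((x_i - mean)^2) / (n * (n - 1)))."""
    n = len(samples)
    mean = sum(samples) / n
    return math.sqrt(sum((x - mean) ** 2 for x in samples) / (n * (n - 1)))
```

Applied separately to the u and v pixel observations of P, this gives u_u and u_v for each camera.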

Uncertainty of P Using Our Method
If eye gazing converges at time t, the image point of P coincides with the principal point in both cameras. After time t, the eye joint angles θ_i (i = 4, 5, 7, 8) vary with the variation of the image coordinates of P in both cameras through eye gazing. For the left eye, the joint angles θ_4, θ_5 can be formulated as functions of the image coordinates (Equations (27) and (28)), where u_l, v_l are the image coordinates of P in the left camera, u_0l, v_0l are the principal point of the left camera, and f_ul, f_vl are the focal lengths in the X and Y directions of the left camera, respectively. The uncertainties of the eye joint angles θ_i (i = 4, 5) can be calculated by propagating the uncertainties of the image of P in the left camera through the partial derivatives of Equations (27) and (28), where u_ul, u_vl are obtained using Equations (25) and (26). We use the averages v̄_l, ū_l in place of v_l and u_l in ∂θ_4/∂v_l and ∂θ_5/∂u_l. The uncertainty propagation for θ_7, θ_8 is similar to that for θ_4, θ_5. From the previous section, the 3D point P can be triangulated by Equations (22)–(24). The uncertainties of P can then be calculated by propagating the uncertainties of the eye joint angles θ_i (i = 4, 5, 7, 8) through the partial derivatives of Equation (24). The uncertainties of P = (x_p, y_p, z_p)^T using our proposed method follow accordingly. In the above equations, we use θ̄_i^fb, the average of the n independent repeated observations of the joint angle feedback.
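Since Equations (27) and (28) are not reproduced here, the following sketch assumes a hypothetical pinhole-style relation θ = atan((pixel − principal)/focal) as a stand-in for the paper's joint angle formulas, and shows the first-order propagation of a pixel uncertainty to a joint angle uncertainty:

```python
import math

def propagate_to_joint(pixel, principal, focal, u_pixel):
    """First-order uncertainty propagation u_theta = |d theta/d pixel| * u_pixel,
    assuming theta = atan((pixel - principal) / focal) (an assumed stand-in
    for the paper's Equations (27)-(28), not the exact formulas)."""
    x = (pixel - principal) / focal
    dtheta_dpixel = (1.0 / focal) / (1.0 + x * x)  # derivative of atan
    return abs(dtheta_dpixel) * u_pixel
```

For example, near the principal point with a 500-pixel focal length, a 0.05-pixel detection uncertainty maps to a joint angle uncertainty of about 1e-4 rad.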

Uncertainty of P Using ITPC Method
In the ITPC method, the eye joint angles θ_i (i = 4, 5, 7, 8) do not change after time t. The stereo extrinsic parameters CrT_Cl can be calculated using the eye joint angles θ_i (i = 4, 5, 7, 8), where θ_i = θ_i^fb and θ_i^fb is the i-th joint angle feedback at time t. P = (x_p, y_p, z_p) can then be calculated using the intrinsic parameters of both cameras, the stereo extrinsic parameters CrT_Cl, and the image coordinates of P in both cameras. The uncertainties of P using the ITPC method can be calculated by propagating the uncertainties of the image coordinates in both cameras through partial derivatives, where u_ul, u_ur can be calculated using Equation (25), and u_vl, u_vr using Equation (26).

Experiments and Results
To evaluate the eye gaze based 3D triangulation method, we first compare the triangulation performance of our method with the ITPC method in simulated and physical experiments. We then perform precision experiments comparing with the ZED Mini, a stereo system with fixed cameras. Finally, we run experiments to obtain the time response of our method.

Comparison Experiments with the ITPC Method
We compare our proposed method with the ITPC method in both simulated and physical experiments. Static target points as well as moving target points are triangulated to assess the performance of the two methods.

Simulated Experiments
We perform two comparison experiments in the simulation environment: the first is triangulation of static target points, and the second is triangulation of a moving target point.

Experimental Setup
In simulated experiments, we use GAZEBO [53] in order to simulate the kinematics of the robotic bionic eyes with the same mechanical model as the physical robotic bionic eyes. We also use the same joint controller under the ROS framework to send joint velocity commands to GAZEBO and get the joint angle feedback as well as image feedback from both eyes.
The image size of the left and right cameras is 1040 × 860 pixels. The intrinsic parameters obtained using the Bouguet toolbox [54] are shown in Table 2. The checkerboard corner points are detected with sub-pixel position accuracy. We choose the point at the upper right corner as the spatial point P. The expected pixel coordinates of point P in both cameras are fixed at the principal point (520.50, 430.50). Figure 5 shows the images of P in the left and right cameras.

Triangulation of Static Target Points
In this experiment, we obtain 21 different static target points by placing the simulated checkerboard at 21 different positions. At each position, we recorded the estimated 3D coordinates w.r.t. the base frame N using our proposed method and the ITPC method, respectively. The ground truth can be obtained from GAZEBO (as shown in Table 3).
Error Analysis: we calculated the absolute errors of the mean of the 3D coordinates w.r.t. the ground truth using our proposed method and the ITPC method, respectively (as shown in Figure 6). The absolute errors of both methods in the simulated experiments stem mainly from the errors of the D-H parameters, the intrinsic parameters, and the head-eye parameters. The absolute errors in the X axis of both methods are between 0.73 mm and 3.96 mm, as shown in Figure 6. The absolute errors in the Y axis of both methods are between 1.14 mm and 6.18 mm. The minimum absolute errors in the Z axis of the two methods are only 0.02 mm and 0.04 mm, respectively, when the Z ground truth is 1100 mm. The maximum absolute errors in the Z axis of both methods reach only 0.5% of the ground truth depth. In conclusion, the absolute errors of our proposed method are very close to those of the ITPC method in the X, Y, and Z axes.
Uncertainty Analysis: the uncertainties of our proposed method (shown in Figure 7) were calculated using Equations (31)–(33), and those of the ITPC method (shown in Figure 7) using Equations (34)–(36). At each position, the number n of independent repeated observations is 1150. For both the left and right cameras, the image detection uncertainties in the X direction are between 0.006 pixels and 0.057 pixels, as are those in the Y direction. As shown in Figure 7, the absolute uncertainties of our proposed method are very close to those of the ITPC method in the X, Y, and Z axes. The absolute uncertainties in the X axis of both methods are between 0.009 mm and 0.130 mm, those in the Y axis between 0.028 mm and 0.188 mm, and those in the Z axis between 0.088 mm and 4.545 mm.

Triangulation of a Moving Target Point
We want to verify the effectiveness of our proposed method in the case that the target point is moving.
We placed a checkerboard at 1000 mm on the Z axis w.r.t the base frame N. The trajectory of the target point on the X and Y axes w.r.t frame N is x = r * sin(t/10000) and y = −100 + r * cos(t/10000), respectively, where r = 100 mm. We recorded the 3D coordinates of the moving target points estimated by our proposed method and ITPC method, respectively.
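For reference, the ground-truth trajectory above can be sampled as follows (a plotting-free sketch with our own function name; t is in the same time units as the original expressions):

```python
import math

def target_position(t, r=100.0):
    """Ground-truth target point on the circle used in the moving-target
    experiment: x = r*sin(t/10000), y = -100 + r*cos(t/10000), z = 1000 (mm)."""
    return (r * math.sin(t / 10000.0),
            -100.0 + r * math.cos(t / 10000.0),
            1000.0)
```

Every sample lies at a distance r from the circle center (0, −100, 1000), which is a convenient sanity check when comparing against the triangulated estimates.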
Error Analysis: we calculated the absolute errors of our proposed method and the ITPC method w.r.t. the ground truth, respectively. As shown in Figure 8, the mean absolute errors in the X axis of the two methods are 1.76 mm and 1.61 mm, respectively; in the Y axis, 2.03 mm and 2.17 mm; and in the Z axis, 1.69 mm and 1.71 mm, which reach only 0.17% of the ground truth depth. The mean absolute error in the X axis of our proposed method is larger than that of the ITPC method, while the mean absolute errors in the Y and Z axes are smaller.
Uncertainty Analysis: we calculated the standard deviation of the coordinates in the Z axis as the uncertainty of our proposed method and of the ITPC method. In these experiments, the number n is 3780. The absolute uncertainties in the Z axis of our proposed method and the ITPC method are 0.30 mm and 0.98 mm, respectively; that of our proposed method is smaller.

Physical Experiments
The physical experiments are performed on the real robotic bionic eyes. Triangulation on static and moving target points is performed. Comparisons are carried out between our proposed method and the ITPC method.

Experimental Setup
In the physical experiments, the image size of the left and right cameras is 640 × 480 pixels. The intrinsic parameters of both cameras are shown in Table 4. The AprilTag [55] information is obtained through the ViSP [56] library. We choose the center (point P) of the AprilTag as the fixation point. The expected images of point P in the left and right cameras are set to the principal points (362.94, 222.53) and (388.09, 220.82), respectively. Figure 9 shows the images of P in the left and right cameras.

Triangulation of Static Target Points
We placed the AprilTag at 21 different positions (as shown in Table 5). At each position, the estimated 3D coordinates using our proposed method and the ITPC method were recorded, respectively.
Error Analysis: we calculated the absolute errors of the mean of the estimated 3D coordinates w.r.t. the ground truth using our proposed method and the ITPC method, respectively (as shown in Figure 10). The absolute errors of both methods in the physical experiments stem mainly from errors in the forward kinematics, the intrinsic parameters, the head-eye parameters, and the joint offsets. As shown in Figure 10, our proposed method is closer to the ground truth than the ITPC method in the X and Z axes at most of the 21 positions, especially when the Z ground truth is larger than 2000 mm. The absolute error in the Y axis of our proposed method is between 0.55 mm and 12.28 mm, and that of the ITPC method is between 1.52 mm and 12.88 mm. The minimum absolute errors in the Z axis of our proposed method and the ITPC method are 1.42 mm and 2.32 mm, which reach 0.23% and 0.38% of the ground truth depth, respectively. The maximum absolute errors in the Z axis of our proposed method and the ITPC method are 124.49 mm and 174.64 mm, which reach 4.97% and 6.98% of the ground truth depth, respectively. Our proposed method obtains smaller mean absolute errors in the X, Y, and Z axes.
Uncertainty Analysis: the uncertainties of our proposed method (shown in Figure 11) were calculated using Equations (31)–(33), and those of the ITPC method (shown in Figure 11) using Equations (34)–(36). At each position, n = 1200. For the left camera, the image detection uncertainties in the X direction are between 0.027 pixels and 0.089 pixels, and in the Y direction between 0.039 pixels and 0.138 pixels. For the right camera, the image detection uncertainties in the X direction are between 0.016 pixels and 0.067 pixels.
The image detection uncertainties in the Y direction are between 0.046 pixels and 0.138 pixels. As shown in Figure 11, the absolute uncertainties in the X axis of both methods are between 0.119 mm and 3.905 mm, those in the Y axis between 0.091 mm and 0.640 mm, and those in the Z axis between 0.268 mm and 7.975 mm. In conclusion, the absolute uncertainties of our proposed method are very close to those of the ITPC method in the X, Y, and Z axes.

Triangulation of a Moving Target Point
Figure 12 shows the trajectory of the ground truth and of the estimated P using our proposed method and the ITPC method.
Error Analysis: we calculated the absolute errors of our proposed method and the ITPC method w.r.t. the ground truth, respectively (shown in Figure 13). The mean absolute errors in the X axis of our proposed method and the ITPC method are 11.81 mm and 9.53 mm; in the Y axis, 14.89 mm and 16.59 mm; and in the Z axis, 23.74 mm and 22.48 mm. The mean absolute errors in the X and Z axes of the proposed method are larger than those of the ITPC method, while the mean absolute error in the Y axis is smaller.

Comparison Experiments with Zed Mini
We compare the triangulation performance of our proposed method and the ITPC method with the ZED Mini, a stereo system with fixed cameras (shown in Figure 2).

Experimental Setup
The image size of the left and right cameras of the ZED Mini is 1280 × 720 pixels. The intrinsic parameters, including the focal length and principal point, of both cameras are shown in Table 6.
The maximum horizontal and vertical fields of view of the ZED Mini are 90° and 60°, respectively. We placed the AprilTag at eight different positions to obtain 8 different spatial points. The estimated 3D coordinates using the ZED Mini are shown in Table 7.
Error Analysis: we calculated the absolute errors of our proposed method and the ITPC method w.r.t. the ZED Mini, respectively. As shown in Figure 14, the absolute errors of our proposed method are very close to those of the ITPC method in the X and Y axes, and our proposed method obtains a smaller absolute error in the Z axis at most of the eight positions. The absolute errors in the X axis of both methods are between 25.17 mm and 68.63 mm, and those in the Y axis between 35.26 mm and 55.18 mm. The minimum absolute errors in the Z axis of our proposed method and the ITPC method are 0.44 mm and 2.10 mm, and the maximum are 86.40 mm and 92.90 mm, respectively. No camera movement exists in the ZED Mini; our proposed method and the ITPC method obtain slightly larger errors compared with the ZED Mini mainly because of eye motion.

Time Response Experiments

Simulated Experiments
We perform experiments to evaluate how k_1 and k_2 in the constant gain matrix K affect the time needed for eye gazing. We initially placed the simulated checkerboard to set the target point at (−0.00, −100.00, 1000). A step signal with an amplitude of 100 mm in the Y direction was applied to the target point. We adjusted k_1 and k_2 from 0.5 to 4.0; the system overshot when k_1 or k_2 was larger than 4.0. As shown in Figure 15, increasing k_1 and k_2 moderately can shorten the time needed for eye gazing.

Physical Experiments
In the physical experiments, we placed the AprilTag to set the target point at (−0.00, −145.00, 1500). We set the robotic bionic eyes to the initial state and set k_1 = 4.0, k_2 = 4.0 in K. Through eye gazing, it takes 650 ms to move the image of the target point P from (385.98, 196.60) to (362.94, 222.53) in the left image frame, and from (361.09, 191.82) to (388.09, 220.82) in the right image frame.

Conclusions
In this paper, we have proposed an eye gaze based 3D triangulation method for our designed robotic bionic eyes. Eye gaze is realized through image based visual servoing in order to keep the two optical axes passing through the target point P. We can then obtain the 3D coordinates of P through the intersection of the two optical axes. The optical axes of both cameras are derived from forward kinematics with head-eye parameters calibrated beforehand. In real applications, the two optical axes may not intersect each other due to visual servoing errors and model errors, so we use the midpoint of the common perpendicular line segment of the two skew optical axes as the representation of the intersection point P.
From the simulated and physical experiments, we can see that the proposed method achieves results comparable with the ITPC method in terms of absolute errors and propagated uncertainties, and our proposed method obtains smaller mean absolute errors in the triangulation of static target points in the physical experiments. Our proposed method and the ITPC method show larger errors w.r.t. conventional fixed-camera stereo systems such as the ZED Mini due to model errors introduced by manufacturing, including link length error, coaxiality error, and error due to link stiffness. The experiments also show that, at the beginning of the visual servoing process, our method needs several hundred milliseconds to locate a target point at its optical centers. Selecting k_1 and k_2 in the gain matrix K using, for example, fuzzy PID could be a potential solution to minimize this initial processing time.
Although our method yields only a small improvement in triangulation precision compared with the ITPC method, it is a new bionic approach to triangulation using eye gazing through image based visual servoing. Our system has a much larger field of view than a traditional stereo pair, such as the ZED Mini, and it does not rely on stereo extrinsic parameters directly. Another advantage is a higher tolerance to image detection error. In the future, we plan to reduce the model error and joint offset error introduced by manufacturing to obtain precision comparable with a fixed-camera stereo pair.

Conflicts of Interest:
The authors declare no conflict of interest.