ESTIMATION OF SPATIAL COORDINATES OF 3D OBJECTS BY STEREOSCOPIC SCANNING

This article describes a mathematical model of a simple camera based on perspective projection. It then describes two basic techniques for stereoscopic scanning of a 3D visual scene. The first is a stereoscopic system with parallel optical axes; the second is a stereoscopic system in which the optical axes of the cameras cross at a defined point. From these mathematical constructions, equations for estimating the spatial coordinates of 3D objects in a 3D visual scene are derived. A simulation is then performed for different parameters of the individual stereoscopic camera systems. Based on the simulation results, the properties of both stereoscopic camera systems are analysed.


INTRODUCTION
Stereoscopic scanning of 3D visual scenes is based on the properties of the human visual system, which is in principle a stereoscopic system. The human brain evaluates this projection as a 3D visual perception. Currently, recording and reconstructing a 3D visual scene requires a record of depth in addition to the 2D projection. There are several ways to obtain this information. For example, one method is based on laser scanning, and another uses a projector that projects a known pattern onto the 3D visual scene and the objects in it. The depth coordinates can then be determined from the deformation of the pattern [1]. In this article we introduce a method of depth estimation based on stereoscopic scanning of a 3D visual scene. Spatial coordinate estimation is necessary in robotics, where a robot must be able to determine its position in space independently. The estimation can also be used in 3D modelling of objects. Stereoscopic estimation of spatial coordinates has also found application in the automotive industry, where the car automatically evaluates a potentially dangerous situation on the road and warns the driver.
The analysed stereoscopic system for estimating spatial coordinates is applicable in situations where active scanning of the 3D visual scene by sonar, radar or the techniques mentioned above is not possible. A further advantage is that the depth map need not be stored, since it can be computed directly from the stereo images. In 3D visual scene reconstruction it is also unnecessary to capture textures with another camera system, because one of the stereo images can be used as the texture.
The following sections describe the basic mathematical model of the camera system, called perspective projection. Based on this model, two basic arrangements of the stereoscopic camera system are described. At the end of the article, these stereoscopic systems are compared with respect to the accuracy of spatial coordinate estimation.

PERSPECTIVE PROJECTION
Fig. 1 shows a simplified arrangement of the coordinate systems of the camera and the 3D visual scene (3DVS). A real camera can be described by the mathematical model of perspective projection. In this projection [2], all points of the 3DVS are projected onto the image plane through the focal point of the system.

Fig. 1 Simplified model of the camera
Before the projection of a 3DVS point into the image plane is calculated, the coordinate system of the 3DVS must be transformed into the coordinate system of the camera [3,4]. In this transformation, d is the distance between the origins of the two coordinate systems. The perspective projection of a 3DVS point with coordinates (x, y, z) into the image point with coordinates (i, j) is defined in terms of f_x = m_x f and f_y = m_y f, which represent the camera focal length f linearly scaled by the factors m_x and m_y. The point with coordinates (i_0, j_0) is the principal point of the image.
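The projection described above can be sketched in a few lines of Python. This is a minimal illustration of the usual pinhole form, not necessarily the exact equations of the article: it assumes the point is already expressed in camera coordinates with z measured along the optical axis, and the function name, argument order and sign conventions are hypothetical.

```python
def project_point(x, y, z, f, m_x, m_y, i_0, j_0):
    """Project a point given in camera coordinates onto the image plane.

    Assumption: standard pinhole model, i = i_0 + f_x * x / z and
    j = j_0 + f_y * y / z, with z > 0 in front of the camera.
    """
    if z <= 0:
        raise ValueError("the point must lie in front of the camera (z > 0)")
    f_x = m_x * f  # focal length scaled in the horizontal direction
    f_y = m_y * f  # focal length scaled in the vertical direction
    i = i_0 + f_x * x / z
    j = j_0 + f_y * y / z
    return i, j
```

For example, with f = 2, m_x = m_y = 10 and the principal point at (256, 256), the camera-space point (1, 2, 4) maps to the image point (261, 266).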

STEREOSCOPIC SCANNING
Stereoscopic scanning is one technique for estimating depth coordinates. There are two configurations. In the first, the optical axes of the cameras of the stereoscopic camera system (SCS) are parallel. This system is called a coaxial SCS or parallel stereoscopic camera system (PSCS). In the second, the optical axes of the cameras cross at a defined point. This system is called an abaxial SCS or toed-in stereoscopic camera system (TSCS) [5,6].

Parallel stereoscopic camera system
As shown in Fig. 2, the PSCS consists of two identical cameras separated horizontally by the distance a. The center of the system lies at the midpoint of this distance. When the 3DVS is scanned, this point lies on a common straight line with the origin of the 3DVS coordinate system. This straight line is perpendicular to the common plane formed by the x and y axes of the camera coordinate systems.
Fig. 2 The camera arrangement of the parallel stereoscopic camera system
The left (KL) and right (KR) cameras can be replaced with their perspective projection models, and the projection equations can then be written for camera KL and, similarly, for camera KR. The depth coordinate r can be obtained by solving the system of equations (5) and (7).
Substituting h into equation (7) gives equation (8). Expressing the unknown r from equation (8) and substituting the horizontal disparity D [7] gives the final equation (10) for the depth coordinate. The horizontal disparity is the difference between the horizontal coordinates of the projections of the same point in the left and right images. As is clear from equation (10), the depth coordinate grows as the disparity grows. Equations for estimating the horizontal and vertical coordinates can be derived similarly. Equation (10) also shows that estimating the depth coordinate requires knowledge of the distance between the SCS and the 3DVS and of the baseline separation a of the cameras.
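The relation between disparity and depth for the parallel arrangement can be illustrated with a short Python sketch. It rests on assumptions that are not taken from the article: the standard parallel-stereo relation z = a f_x / D for the distance from the cameras, a disparity defined as D = i_L - i_R, scene axes aligned with the camera axes, and a scene depth r = d - z measured from the scene origin toward the cameras (which makes r grow with D, as stated above). The function name and argument order are hypothetical.

```python
def reconstruct_point_parallel(i_left, j_left, i_right, a, f_x, f_y, d,
                               i_0=0.0, j_0=0.0):
    """Estimate scene coordinates (h, v, r) from a parallel stereo pair.

    Assumptions: cameras at -a/2 and +a/2 on the horizontal axis,
    identical intrinsics, disparity D = i_left - i_right > 0.
    """
    disparity = i_left - i_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the SCS")
    z = a * f_x / disparity                 # distance from the camera baseline
    h = z * (i_left - i_0) / f_x - a / 2.0  # undo the left camera's horizontal shift
    v = z * (j_left - j_0) / f_y
    r = d - z                               # assumed sign convention for scene depth
    return h, v, r
```

With a = 120, f_x = f_y = 500 and d = 1622 (all in points), a point at h = 100, v = 50, r = 500 lies 1122 points from the cameras, and the sketch recovers it from its two projections.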

Toed-in stereoscopic camera system
Unlike the previous system, in this system the optical axes of the cameras are not parallel but cross at a defined point [9]. As is clear from Fig. 3, the cameras are rotated, and the optical axes form the angle 2β at their crossing point. The distance zp between the origin of the SCS and the intersection point of the optical axes is called the zero parallax. It is determined by the baseline separation a of the cameras and the rotation angle β of the cameras. The coordinate system of the 3DVS is again transformed into the coordinate systems of the cameras, and after perspective projection the points of the 3DVS are projected into the individual images.
A disadvantage of this method is the difficulty of solving the system of equations (16) and (17) for the depth coordinate. Another disadvantage is that vertical disparity arises in addition to horizontal disparity [8]. The vertical disparity is the difference between the vertical coordinates of the projections of the same 3DVS point in the left and right images. It increases as the coordinates increase in the horizontal direction, which causes an error in the estimation of the spatial coordinates.
The depth coordinate can be estimated by equation (18). As is clear from equation (18), if the angle β equals zero, the equation reduces to equation (10). This confirms that the PSCS is only a special case of the TSCS. Equations for estimating the horizontal and vertical coordinates can be derived as well. Equation (18) also shows that estimating the depth coordinate requires knowledge of the distance between the SCS and the 3DVS, the baseline separation a of the cameras and the angle β. If zp = d, it can be shown that the horizontal disparity is zero for points at the distance d from the SCS.
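The geometry of the toed-in arrangement can be explored with the following Python sketch. It is an illustration under stated assumptions rather than a reproduction of equations (16) to (18): the cameras sit at -a/2 and +a/2, each is rotated by β about its vertical axis toward the center line, points are given in a frame centred on the SCS with Z pointing toward the scene, and the zero parallax follows from the triangle formed by half the baseline and the optical axis, zp = a / (2 tan β). All function names are hypothetical.

```python
import math

def zero_parallax_distance(a, beta):
    """Distance from the SCS origin to the crossing point of the optical axes."""
    if beta <= 0:
        raise ValueError("beta must be positive; for beta = 0 the axes never cross")
    return a / (2.0 * math.tan(beta))

def project_toed_in(point, a, beta, f_x, f_y, i_0=0.0, j_0=0.0):
    """Return ((i_L, j_L), (i_R, j_R)) for a point (X, Y, Z) in the SCS frame."""
    X, Y, Z = point
    s, c = math.sin(beta), math.cos(beta)
    # Left camera: shift by +a/2, then rotate by beta toward the center line.
    x_l = (X + a / 2.0) * c - Z * s
    z_l = (X + a / 2.0) * s + Z * c
    # Right camera: shift by -a/2, then rotate by beta in the opposite sense.
    x_r = (X - a / 2.0) * c + Z * s
    z_r = -(X - a / 2.0) * s + Z * c
    left = (i_0 + f_x * x_l / z_l, j_0 + f_y * Y / z_l)
    right = (i_0 + f_x * x_r / z_r, j_0 + f_y * Y / z_r)
    return left, right
```

In this sketch the crossing point (0, 0, zp) projects to the principal point in both images, so its horizontal disparity is zero. A point shifted horizontally and vertically from the center has different distances along the two optical axes, so its vertical image coordinates differ, and the difference grows with the horizontal shift. Setting β = 0 removes the vertical disparity and leaves the horizontal disparity a f_x / Z of the parallel system.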

EXPERIMENTAL RESULTS
Fig. 4 shows the 3DVS with three pyramids. Their arrangement is the same for the experiments with both SCS. The main vertices of the pyramids have depths of 500, 400 and 250 points, and the other vertices have zero depth in the common coordinate system of the 3DVS. The accuracy of depth estimation is expressed as the signal-to-noise ratio (SNR), which relates the mean square value of the depth coordinates of the pyramid vertices to the mean square value of the error between the real and estimated depth coordinates of these vertices. The accuracy of estimation of the other coordinates can be evaluated similarly. From the equations for estimating the horizontal and vertical coordinates it is clear that they, like the equation for depth estimation, depend only on the raster coordinates of the point in the left and right images. The disparity also depends on these coordinates, and the other variables are known. Therefore, evaluating the depth accuracy is sufficient for our experiments. The accuracy is evaluated for all vertices of the pyramids. Subsequently, the influence of the camera baseline separation and of the distance between the SCS and the 3DVS on the accuracy of depth estimation is evaluated.
ISSN 1335-8243 (print) © 2014 FEI TUKE ISSN 1338-3957 (online), www.aei.tuke.sk
In our experiments the scaling factors of the cameras in the horizontal and vertical directions are assumed to be equal, m_x = m_y = 10, and zp = d is assumed as well. In general, the coordinate system of the 3DVS can be given in scene units [su]. For simplicity, all variables are given in points, where one point represents 0.25 mm. Distances are converted between points and millimeters by equation (24), in which m_p is the point scale. The image has a resolution of 512x512 points. Tables 1-4 list the parameters of the stereoscopic camera systems for which the accuracy of estimation is evaluated. The distance between the SCS and the 3DVS is evaluated for the values 1622, 2433, 3244 and 4055 points.
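The unit conversion used in the experiments can be sketched as follows. The sketch assumes that equation (24) is a plain linear scaling with one point equal to 0.25 mm, as stated above; the constant and function names are hypothetical.

```python
MM_PER_POINT = 0.25  # one point represents 0.25 mm (value stated in the text)

def mm_to_points(length_mm):
    """Convert a length in millimeters to points."""
    return length_mm / MM_PER_POINT

def points_to_mm(length_points):
    """Convert a length in points to millimeters."""
    return length_points * MM_PER_POINT
```

Under this assumption the scene distances 1622, 2433, 3244 and 4055 points correspond to 405.5, 608.25, 811 and 1013.75 mm, and a length of 15 mm corresponds to 60 points.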
The influence of the baseline separation on the accuracy of estimation is evaluated for baselines from the interval <15, 55> mm with a step of 5 mm. These values are converted into points by equation (24). Table 1 also shows the mean square of the depth and the estimation errors. Table 5 shows the dependence of the mean value of SNR, taken over all baseline values, on the distance between the SCS and the 3DVS. From the comparison of the graphs in Fig. 6, 7 and 8 it is clear that the TSCS achieves better results than the PSCS. The graph in Fig. 8 and Table 5 also show that the accuracy of depth estimation decreases in both systems as the distance between the SCS and the 3DVS grows.
As mentioned above, in the TSCS vertical disparity arises in addition to horizontal disparity. This causes an inaccurate estimation of the depth coordinate of points shifted in the horizontal direction from the center of the 3DVS coordinate system. In the PSCS, errors of this kind do not occur. Vertical disparity causes curvature in the depth coordinates of the reconstructed 3DVS, as shown in Fig. 9. Fig. 9.a shows an orthogonal grid placed over the plane hv, and Fig. 9.b shows its reconstruction by the TSCS. The points of the 3DVS shifted in the horizontal direction from the center of the 3DVS coordinate system lie below the ideal depth plane. That is why the objects of the 3DVS should be placed in the area around the center of the 3DVS coordinate system.
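The SNR measure used throughout this section can be sketched in Python. The sketch assumes the conventional decibel form, ten times the base-10 logarithm of the ratio between the mean square depth and the mean square estimation error; the exact expression used in the article may differ, and the function name is hypothetical.

```python
import math

def snr_db(true_depths, estimated_depths):
    """SNR in dB between real and estimated depth coordinates of the vertices."""
    if len(true_depths) != len(estimated_depths) or not true_depths:
        raise ValueError("both sequences must be non-empty and of equal length")
    n = len(true_depths)
    mean_square_depth = sum(r * r for r in true_depths) / n
    mean_square_error = sum(
        (r - e) ** 2 for r, e in zip(true_depths, estimated_depths)
    ) / n
    if mean_square_error == 0:
        return math.inf  # perfect estimate
    return 10.0 * math.log10(mean_square_depth / mean_square_error)
```

A higher value means a smaller estimation error relative to the depth range of the scene, so a drop in SNR with growing distance corresponds to a loss of depth accuracy.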

CONCLUSION
In this article we mathematically modelled the camera and, on this basis, described two stereoscopic systems with different configurations. The first configuration is a system of two cameras with parallel optical axes, and the second is a system in which the optical axes of the cameras cross at a defined point. We showed that the PSCS is only a special case of the TSCS. By simulating both systems we created images for varying distances between the SCS and the 3DVS and for varying baseline separations of the cameras. From these images we estimated the depth coordinates of the simulated 3DVS with three pyramids. The comparison of the results shows that both systems achieve comparable results for depth estimation and reconstruction of the 3DVS. The main aim of the article was to describe the two basic stereoscopic systems mathematically and to analyse their properties on the basis of the experimental results. For real cameras, parameters such as the focal length, the scaling factors in the horizontal and vertical directions, or the distance between the SCS and the 3DVS are not always known. These intrinsic and extrinsic parameters can be obtained by calibration. Real cameras also introduce additional distortion, which can be compensated by correction methods.

Fig. 3 The camera arrangement of the toed-in stereoscopic camera system

Fig. 4 The 3D visual scene with three pyramids
The anaglyph [10] acquired by the PSCS is shown in Fig. 5.a and the anaglyph acquired by the TSCS is shown in Fig. 5.b.

Fig. 6 Graphs of the dependence of SNR on the given baseline separation of the cameras for the PSCS

Fig. 9 a) The 2D orthogonal grid b) Reconstruction of the 2D orthogonal grid

Table 1 Dependence of SNR on the given baseline separation of the cameras for the distance 1622 points

Table 3 Dependence of SNR on the given baseline separation of the cameras for the distance 3244 points

Table 5 Dependence of mean value of SNR given by the