Three-Dimensional Environment Modeling Based on Structure from Motion with Point and Line Features by Using Omnidirectional Camera

A three-dimensional map is useful for autonomous robot navigation (path planning, self-localization and object recognition). In an unknown environment, robots should measure their surroundings and construct maps by themselves. Three-dimensional measurement using image data makes it possible to construct an environment map (Davison, 2003). However, many images are needed if we use a conventional camera with a limited field of view (Ishiguro et al., 1992). An omnidirectional camera is therefore suitable for wide-ranging measurement, because it has a panoramic field of view (Fig. 1). Many researchers have shown that an omnidirectional camera is effective for measurement and recognition in an environment (Bunschoten & Krose, 2003; Geyer & Daniilidis, 2003; Gluckman & Nayar, 1998).


Introduction
A three-dimensional map is useful for autonomous robot navigation (path planning, self-localization and object recognition). In an unknown environment, robots should measure their surroundings and construct maps by themselves. Three-dimensional measurement using image data makes it possible to construct an environment map (Davison, 2003). However, many images are needed if we use a conventional camera with a limited field of view (Ishiguro et al., 1992). An omnidirectional camera is therefore suitable for wide-ranging measurement, because it has a panoramic field of view (Fig. 1). Many researchers have shown that an omnidirectional camera is effective for measurement and recognition in an environment (Bunschoten & Krose, 2003; Geyer & Daniilidis, 2003; Gluckman & Nayar, 1998).
Our proposed method is based on structure from motion. Previous methods based on structure from motion often use feature points to estimate camera movement and measure the environment (Rachmielowski et al., 2006; Kawanishi et al., 2009). However, many non-textured objects may exist in the surroundings of mobile robots. It is hard to extract a sufficient number of feature points from non-textured objects. Therefore, in an environment containing non-textured objects, it is difficult to construct a map by using feature points only.

Line features should therefore be utilized for environment measurement, because non-textured objects often have straight-lines. Previous works using lines include a method for precise camera movement estimation using a stereo camera (Chandraker et al., 2009), a method for building reconstruction using orthogonal lines (Schindler, 2006), and others (Bartoli & Sturm, 2005; Smith et al., 2006; Mariottini & Prattichizzo, 2007). However, the line detection in these methods has prerequisites. A method must obtain a vanishing point (Schindler, 2006) or a pair of end points of the straight-line (Smith et al., 2006). Some previous line detection methods work only for a conventional camera (Chandraker et al., 2009). Alternatively, some previous methods obtain line correspondences by hand (Bartoli & Sturm, 2005; Mariottini & Prattichizzo, 2007).
We propose a method for straight-line extraction and tracking on distorted omnidirectional images. The method requires neither a vanishing point nor the end points of straight-lines. These straight-lines are regarded as infinite lines in the measurement process (Spacek, 1986). Therefore, the proposed method can measure straight-lines even if a part of the line is covered by obstacles during its tracking.
Our proposed method measures feature points together with straight-lines. If only straight-lines are used for camera movement estimation, a non-linear problem must be solved. However, camera movement can be estimated easily by a linear solution with point correspondences. Moreover, although few straight-lines may be extracted from textured objects, many feature points will be extracted from them. Therefore, we can measure the environment densely by using both feature points and straight-lines.
The process of our proposed method is described below (Fig. 2). First, feature points and straight-lines are extracted and tracked along an acquired omnidirectional image sequence. Camera movement is estimated by point-based structure from motion. The estimated camera movement is used as an initial value for line-based measurement. The proposed line-based measurement is divided into two phases. In the first phase, camera rotation and line directions are optimized. Line correspondence makes it possible to estimate camera rotation independently of camera translation (Spacek, 1986). Camera rotation can be estimated by using 3-D line directions. In the second phase, camera translation and 3-D line locations are optimized. The optimization is based on bundle adjustment (Triggs et al., 1999). Some measurement results have low accuracy. The proposed method rejects such results. Measurement results of feature points and straight-lines are integrated. Triangular meshes are generated from the integrated measurement data. By texture mapping onto these meshes, a three-dimensional environment model is constructed.

Coordinate system of omnidirectional camera
The coordinate system of the omnidirectional camera is shown in Fig. 3. A ray heading to image coordinates $(u, v)$ from the camera lens is reflected on a hyperboloid mirror. In this paper, the reflected vector is called a ray vector. The extension lines of all ray vectors intersect at the focal point of the hyperboloid mirror. The ray vector $\mathbf{r}$ is calculated by the following equations.
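As an illustration of how a ray vector can be obtained from image coordinates, the sketch below assumes the commonly used hyperboloid mirror model $(X^2+Y^2)/\alpha^2 - Z^2/\beta^2 = -1$, with the lens center at one focal point, the ray origin at the other, and $\gamma = \sqrt{\alpha^2+\beta^2}$. The function name, the argument names and this specific model are assumptions made for illustration and may differ from the calibrated model used in this chapter.

```python
import math

def ray_vector(u_img, v_img, cx, cy, px, py, f, alpha, beta):
    """Unit ray vector from the mirror focal point for pixel (u_img, v_img).

    Assumed model: hyperboloid (X^2+Y^2)/alpha^2 - Z^2/beta^2 = -1,
    lens center at (0, 0, -gamma), mirror focal point at (0, 0, +gamma).
    """
    # Metric image-plane coordinates relative to the image center.
    u = (u_img - cx) * px
    v = (v_img - cy) * py
    gamma = math.sqrt(alpha ** 2 + beta ** 2)
    rho2 = u * u + v * v
    # Scale s places the point lens_center + s*(u, v, f) on the mirror surface.
    s = (alpha ** 2 * (f * gamma + beta * math.sqrt(rho2 + f * f))
         / (alpha ** 2 * f * f - beta ** 2 * rho2))
    # Vector from the mirror focal point to that mirror point.
    r = (s * u, s * v, s * f - 2.0 * gamma)
    norm = math.sqrt(sum(c * c for c in r))
    return tuple(c / norm for c in r)
```

Under this assumed model the denominator stays positive only for pixels inside the mirror's field of view, so a practical implementation would reject pixels outside the projected mirror circle before calling the function.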

Straight-line tracking
Straight-lines are extracted from a distorted omnidirectional image. The proposed method obtains edge points by the Canny edge detector (Canny, 1986). An example of edge point detection is shown in Fig. 5 (b). Corner points are then detected from the two eigenvalues of the Hessian of the image, where $I_x$ and $I_y$ are derivatives of image $I$. An edge point which has a large ratio of eigenvalues is regarded as a point located on a line. In the proposed method, if the ratio is smaller than 10, the edge point is rejected as a corner point. This process provides separated edge segments. A least-squares plane is calculated from the ray vectors of the edge points which belong to an edge segment. If the edge segment constitutes a straight-line, these ray vectors are located on a plane (Fig. 6). Therefore, an edge segment which has a small least-squares error is regarded as a straight-line. The proposed method is able to extract straight-lines even if an edge segment looks like a curved line in a distorted omnidirectional image.
Fig. 6. The relationship between a straight-line and a ray vector.
The maximum number of edge points which satisfy equation (4) is calculated by using RANSAC (Fischler & Bolles, 1981).
where $l_{th}$ is a threshold, $\mathbf{r}_{i,j}$ is a ray vector heading to an edge point $j$ included in an edge segment $i$, and $\mathbf{n}_i$ is the normal vector of the least-squares plane calculated from the edge segment $i$. If over half of the edge points of the edge segment $i$ satisfy equation (4), the edge segment is determined to be a straight-line. The threshold $l_{th}$ is calculated by the following equation.
where $r_m$ is the radius of the projected mirror circumference in an omnidirectional image. The threshold $l_{th}$ allows an angle error within $1/r_m$ [rad]. This means that the angle error between a ray vector and a straight-line is within 1 pixel.
Straight-lines are tracked along an omnidirectional image sequence. The proposed method extracts points at constant intervals on a straight-line detected in the current frame (Fig. 7 (a) and (b)). These points are tracked to the next frame by the KLT tracker (Fig. 7 (d)). Edge segments are detected in the next frame (Fig. 7 (c)). The edge point closest to each tracked point is selected as a corresponding edge point (Fig. 7 (e)). The edge segment which has the maximum number of corresponding edge points is regarded as the corresponding edge segment (Fig. 7 (f)). If an edge segment corresponds to several lines, the line which has the larger number of corresponding edge points is selected. Matching point search on a line suffers from the aperture problem (Nakayama, 1988). However, it is not difficult for the proposed method to obtain corresponding lines, because it does not require point-to-point matching. By continuing the above processes, straight-lines are tracked along the omnidirectional image sequence. An example of line tracking is shown in Fig. 8.
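The straight-line test described earlier in this section (a least-squares plane fitted to the ray vectors of an edge segment, followed by an inlier count) can be sketched as follows. This is a simplified illustration: it uses a single least-squares fit rather than RANSAC, and it assumes that the inlier test takes the form $|\mathbf{r}_{i,j} \cdot \mathbf{n}_i| < \sin(1/r_m)$, which matches the stated angular tolerance of $1/r_m$ [rad] but is not confirmed by the text. The function names are hypothetical.

```python
import numpy as np

def fit_plane_normal(rays):
    """Unit normal of the least-squares plane through the origin for (N, 3) rays."""
    rays = np.asarray(rays, dtype=float)
    # The normal minimizing sum((r . n)^2) is the eigenvector of R^T R
    # that belongs to the smallest eigenvalue (eigh sorts ascending).
    _, vecs = np.linalg.eigh(rays.T @ rays)
    return vecs[:, 0]

def is_straight_line(rays, mirror_radius_px, min_inlier_ratio=0.5):
    """Return (decision, normal) for one edge segment given its unit ray vectors."""
    rays = np.asarray(rays, dtype=float)
    n = fit_plane_normal(rays)
    # Assumed threshold: angular deviation from the plane below 1/r_m rad.
    l_th = np.sin(1.0 / mirror_radius_px)
    inliers = np.abs(rays @ n) < l_th
    # "Over half the edge points" must satisfy the test.
    return bool(inliers.mean() > min_inlier_ratio), n
```

Because the plane passes through the mirror focal point, an edge segment that looks curved in the distorted image still passes this test when it comes from a straight 3-D line.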

Point-based measurement
Camera movement is estimated by a point-based method (Kawanishi et al., 2009). The method is based on the eight-point algorithm (Hartley, 1997). Here, $n$ is the number of feature points. $\mathbf{e}$ is calculated as the eigenvector of the smallest eigenvalue of $U^T U$. The camera movement estimated in this process is used as an initial value for line-based measurement. However, not all feature points tracked in the image sequence correspond satisfactorily, due to image noise and other factors. Mistracked feature points should be rejected. The proposed method rejects these points as outliers by using the RANSAC algorithm (Fischler & Bolles, 1981).
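The linear estimation step can be sketched as below. It assumes the epipolar constraint is written as $\mathbf{r}'^T E\, \mathbf{r} = 0$ for corresponding ray vectors $\mathbf{r}$ and $\mathbf{r}'$, and that each row of $U$ holds the nine products of their components. The function name and the row ordering are illustrative assumptions, and the RANSAC outlier rejection is omitted.

```python
import numpy as np

def essential_from_rays(rays1, rays2):
    """Estimate E (up to scale and sign) from >= 8 corresponding ray vectors.

    Assumed constraint: rays2[i]^T @ E @ rays1[i] = 0 for every pair i.
    """
    rays1 = np.asarray(rays1, dtype=float)
    rays2 = np.asarray(rays2, dtype=float)
    # Each row of U multiplies the row-major vectorization e of E.
    U = np.array([np.kron(r2, r1) for r1, r2 in zip(rays1, rays2)])
    # e minimizes ||U e||^2 subject to ||e|| = 1: the eigenvector of
    # U^T U with the smallest eigenvalue (eigh sorts ascending).
    _, vecs = np.linalg.eigh(U.T @ U)
    return vecs[:, 0].reshape(3, 3)
```

In a full pipeline this function would typically be called inside a RANSAC loop on minimal subsets, and the rotation and translation would then be extracted from the estimated matrix.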

Line-based measurement
The estimated camera movement is optimized by using straight-lines. A straight-line is represented as an infinite line by using its direction vector $\mathbf{d}^w$ and location vector $\mathbf{l}^w$ ($\mathbf{l}^w + k\mathbf{d}^w$, where $k$ is a factor). The superscript $w$ means that the vector is in the world coordinate system. As a prerequisite for line-based measurement, three or more images and three or more pairs of corresponding lines (with at least one line not parallel to the others) are needed. In the first step, camera rotation and line directions are estimated. This step is independent of the estimation of camera translation and line locations. In the next step, camera translation and line locations are optimized by a method based on bundle adjustment (Triggs et al., 1999). In these phases, initial values of the 3-D line directions and locations are required. These initial values are calculated from line correspondences and the initial camera movements.

Camera rotation and 3-D line direction optimization
Our proposed method calculates a normal vector $\mathbf{n}_i^c$ of the least-squares plane calculated from an edge segment $i$ in Section 3.2. The superscript $c$ means that the vector is in the camera coordinate system at camera position $c$. Camera rotation depends on 3-D line direction vector

Camera translation and 3-D line location optimization
Camera translation vector $\mathbf{t}_c^w$ and 3-D line location $\mathbf{l}_i^w$ are optimized by bundle adjustment (Triggs et al., 1999). The method estimates camera movements by minimizing reprojection errors. The reprojection error of a straight-line is calculated as an angle error between two vectors on a plane which is orthogonal to the line direction. The sum of reprojection errors of straight-lines $E_t$ is calculated by the following equation. The relationship between these vectors is shown in Fig. 10.
Fig. 10. Relationship between camera translation vector and 3-D line location vector.

 
The sum of reprojection errors of straight-lines $E_t$ is minimized by a convergent calculation based on the Levenberg-Marquardt method. In these two optimization steps, lines which have a large error are rejected as outliers by the RANSAC algorithm.
In the proposed method, 3-D lines are represented as uniformly spaced points $\mathbf{e}_{i,n}^w$. The 3-D coordinates of these points are calculated by the following equation.
$$\mathbf{e}_{i,n}^w = \mathbf{l}_i^w + hn\,\mathbf{d}_i^w,$$
where $h$ is a uniform distance and $n$ is an integer. The 3-D coordinates of $\mathbf{e}_{i,n}^w$ are reprojected to the image sequence. When the 2-D coordinates of the reprojected point are close enough to the corresponding edge segment, the point is added to the measurement data. By using the estimated camera movement, the 3-D coordinates of feature points which have the minimal reprojection error are calculated and integrated with the straight-line measurement data.
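A minimal sketch of the point sampling step follows, assuming the points are generated as $\mathbf{l}_i^w + hn\,\mathbf{d}_i^w$ for consecutive integers $n$. The function name and the explicit index range are hypothetical, and the reprojection check against the corresponding edge segment is omitted.

```python
def sample_line_points(l_w, d_w, h, n_min, n_max):
    """Return the points l_w + h*n*d_w for every integer n in [n_min, n_max]."""
    return [
        tuple(l + h * n * d for l, d in zip(l_w, d_w))
        for n in range(n_min, n_max + 1)
    ]

# Example: five points spaced 0.5 apart along a vertical line.
points = sample_line_points((1.0, 2.0, 3.0), (0.0, 0.0, 1.0), 0.5, -2, 2)
```

Since the line is treated as infinite, the index range only bounds the candidates; the reprojection test described above decides which of them are kept.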

Result qualification
Measurement data which have low accuracy should be rejected before 3-D model construction.
The measurement accuracy of a feature point is evaluated by the following equations.

Model construction
Triangular meshes are generated from the integrated measurement data by using 3-D Delaunay triangulation. However, Delaunay triangulation can generate a triangular mesh which contradicts the physical shape, because it does not consider the shape of the measured object. Therefore, we apply the triangular optimization method (Nakatsuji et al., 2005) to the triangular mesh (Fig. 11). The method adapts the triangular mesh to the physical shape by detecting texture distortion. By texture mapping onto these meshes, a 3-D environment model is constructed.
Table 1. Evaluation result of vertical line measurements.

Experiments
A measurement result is shown in Fig. 13. A measurement result which has a larger Z-coordinate value is displayed in red, and a smaller one is displayed in blue. Angle and depth errors were calculated for the evaluation of measurement accuracy (Table 1). The angle error of the calculated line directions has a standard deviation of 1.2 degrees. Its maximum error is 1.9 degrees. The angle error between two flat walls estimated from the measurement data is within 1.5 degrees. The depth error between an estimated flat wall and the reconstructed lines has a standard deviation of 2.3 mm. Its maximum error is 7.5 mm. This experiment shows that our proposed method has sufficient accuracy to accomplish static obstacle avoidance and self-localization.
Next, we experimented in an environment including non-textured objects as shown in Fig. 14.
We also experimented in an outdoor environment including textured objects (trees and so on) and non-textured objects (buildings and so on) as shown in Fig. 17. We used 240 omnidirectional images. An integrated measurement result is shown in Fig. 18. As one of the textured objects, the shape of the ground surface is measured by point-based measurement. As non-textured objects, the shape of the building is measured by line-based measurement.

Conclusions
We proposed an environment modeling method based on structure from motion that uses both feature points and straight-lines acquired with an omnidirectional camera. Experimental results showed that our proposed method is effective in environments including both textured and non-textured objects.
As future work, the precision of edge tracking needs to be improved. Moreover, we should evaluate the difference in camera movement estimation accuracy between point-based measurement and edge-based measurement. Further, edge position correlation should be used to increase measurement stability.

Fig. 1. Omnidirectional camera equipped with a hyperboloid mirror. The left figure shows an acquired image.
center coordinates of the omnidirectional image, $p_x$ and $p_y$ are the pixel sizes, $f$ is the focal length, and $\alpha$, $\beta$ and $\gamma$ are hyperboloid parameters. These parameters are calibrated in advance.

Fig. 3. The coordinate system of the omnidirectional camera. Ray vector $\mathbf{r}$ is defined as a unit vector which starts from the focal point of the hyperboloid mirror.

Fig. 5. Edge segment extraction. (a) Input image. (b) Detected Canny edge points. (c) Edge segments are separated by rejecting corner points.
To separate each straight-line, corner points are rejected as shown in Fig. 5 (c). Corner points are detected by using the two eigenvalues of the Hessian of the image. The Hessian matrix is calculated by the following equation.
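The corner rejection rule can be illustrated with the sketch below. It assumes that the matrix is built from sums of products of the first derivatives $I_x$ and $I_y$ over a small window around the edge point, which is one common form and is not confirmed by the text. The function name and the window handling are hypothetical; the ratio threshold of 10 is the value given in this chapter.

```python
import numpy as np

def is_line_point(Ix, Iy, min_ratio=10.0):
    """Keep an edge point when the eigenvalue ratio of the local matrix is large.

    Ix, Iy: image derivatives sampled in a window around the edge point.
    Assumed matrix: [[sum Ix^2, sum Ix*Iy], [sum Ix*Iy, sum Iy^2]].
    """
    Ix = np.asarray(Ix, dtype=float).ravel()
    Iy = np.asarray(Iy, dtype=float).ravel()
    M = np.array([[Ix @ Ix, Ix @ Iy],
                  [Ix @ Iy, Iy @ Iy]])
    lam = np.linalg.eigvalsh(M)  # ascending: lam[0] <= lam[1]
    if lam[0] <= 1e-12 * max(lam[1], 1.0):
        # One dominant gradient direction (ratio is effectively infinite),
        # unless the window has no gradient at all.
        return bool(lam[1] > 0.0)
    # A ratio below the threshold marks the point as a corner.
    return bool(lam[1] / lam[0] >= min_ratio)
```

Points for which the function returns False would be removed from the edge map, which splits connected edges into separate segments at corners.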

Fig. 7. Searching for a corresponding edge segment in the next frame. (a) Straight-line extracted in the current frame. (b) Points extracted at constant intervals on the line. (c) Edge segments in the next frame. (d) Points (b) are tracked between the current frame and the next frame. (e) Corresponding edge points. (f) Corresponding edge segment.

The sum of errors $E_R$ between camera rotation matrix $R_c^w$ and 3-D line direction $\mathbf{d}_i^w$ is calculated as shown in the following equation. $R_c^w$ is a rotation matrix from the world coordinate system to camera coordinate system $c$. Here, $\mathbf{d}_i^w$ and $\mathbf{n}_i^c$ are unit vectors. The relationship between a direction vector and a normal vector is shown in Fig. 9. Camera rotations and line directions are optimized by minimizing $E_R$. The Levenberg-Marquardt method is used for the minimization.
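One plausible form of this error, given here only as a hedged sketch, uses the fact that a 3-D line lies in the plane spanned by its ray vectors, so its direction rotated into the camera frame should be orthogonal to the plane normal. The code assumes $E_R = \sum_c \sum_i (\mathbf{n}_i^c \cdot R_c^w \mathbf{d}_i^w)^2$; this expression and the function name are assumptions, not the chapter's equation.

```python
import numpy as np

def rotation_error(rotations, directions, normals):
    """Sum of squared dot products between plane normals and rotated directions.

    rotations:  list of 3x3 world-to-camera rotation matrices, one per camera.
    directions: (L, 3) unit line directions in world coordinates.
    normals:    list of (L, 3) unit plane normals, one array per camera.
    """
    directions = np.asarray(directions, dtype=float)
    total = 0.0
    for R, n_c in zip(rotations, normals):
        d_cam = directions @ np.asarray(R, dtype=float).T  # rows are R @ d_i
        total += float(np.sum(np.sum(np.asarray(n_c) * d_cam, axis=1) ** 2))
    return total
```

A residual of this kind does not involve camera translation or line location, which is consistent with rotation being estimated independently in the first phase; it could be handed to any Levenberg-Marquardt solver.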

Fig. 9. Relationship between a direction vector of straight-line and a normal vector of a least square plane.
$\hat{\mathbf{l}}_i^c$ is a vector which connects the camera position $c$ and the 3-D line location with the shortest distance. $\hat{\mathbf{l}}_i^c$ is calculated by the following equation.
where $B_{i,c}$ is a factor which shows a location on the 3-D line. $B_{i,c}$ satisfies the following equation.
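The shortest connecting vector can be computed as in the sketch below, which assumes the line is parameterized as $\mathbf{l}_i^w + B\,\mathbf{d}_i^w$ and that the factor is fixed by requiring the connecting vector to be orthogonal to the line direction. The function name is hypothetical, and the result is expressed along world axes; rotating it into the camera coordinate system is omitted.

```python
import numpy as np

def shortest_vector_to_line(l_w, d_w, t_w):
    """Vector from camera position t_w to the nearest point of the line l_w + B*d_w.

    Returns (l_hat, B), with l_hat expressed along world axes.
    """
    l_w, d_w, t_w = (np.asarray(v, dtype=float) for v in (l_w, d_w, t_w))
    # Orthogonality condition (l_w + B*d_w - t_w) . d_w = 0 solved for B.
    B = np.dot(t_w - l_w, d_w) / np.dot(d_w, d_w)
    l_hat = l_w + B * d_w - t_w
    return l_hat, B

# Example: line along x at height 5, camera at (3, 1, 0).
l_hat, B = shortest_vector_to_line([0.0, 0.0, 5.0], [1.0, 0.0, 0.0], [3.0, 1.0, 0.0])
```

In this example the factor is 3 and the connecting vector is (0, -1, 5), which is orthogonal to the line direction as required.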

Fig. 11. Triangular mesh generation and its optimization. (a) Triangular meshes generated by Delaunay triangulation. (b) Optimized triangular meshes.
The measurement accuracy of a straight-line is also evaluated by equation (14). In this evaluation, $\mathbf{p}_{i,m}$ is the middle point of the line connecting two vectors $\mathbf{g}_i^{c_1}$ and $\mathbf{g}_i^{c_2}$ at the shortest

First, the accuracy of line-based measurement is evaluated. The measurement objects are lengthwise-lines on a flat wall shown in Fig. 12. The reason for including crosswise-lines is that the proposed method needs lines having different directions. The moving distance of the camera was about 2 m. The number of input images is 72. The input image size is 2496 × 1664 pixels.

Fig. 12. Measurement objects placed on flat walls. Vertical lines are measured for accuracy evaluation. Level lines are set for camera movement estimation.

Fig. 13. Measurement result for accuracy evaluation. Vertical lines are measured precisely. Measurement results of level lines are removed as low-accuracy data.
In the non-textured environment of Fig. 14, we used 84 omnidirectional images. Measurement results of feature points and straight-lines are shown in Fig. 15. The blue marks in the figure show the camera trajectory. Although the feature point measurement results are sparse, straight-lines can be measured densely. This experimental result shows that our proposed method is effective for a non-textured environment.
(a) Non-textured environment. (b) Input image.

Fig. 14. Non-textured environment. We cannot get enough feature points in the environment because there are few features.
The modeling result is shown in Fig. 16. Images from viewpoints which are different from the camera observation points can be acquired. A model constructed from feature point measurement data covers only a small part of this environment (Fig. 16 (a) and (c)). Meanwhile, edge measurement data makes it possible to construct a non-textured environment model (Fig. 16 (b) and (d)).
An essential matrix $E$ is calculated from the ray vectors of corresponding feature points. The essential matrix $E$ and the ray vectors satisfy the following equation.
$e_{ab}$ is the row $a$ and column $b$ element of the essential matrix $E$. The matrix $E$ is obtained by solving simultaneous equations for more than eight pairs of corresponding ray vectors.
$$\min J = \|U\mathbf{e}\|^2,$$