A review on egomotion by means of differential epipolar geometry applied to the movement of a mobile robot
Introduction
Camera calibration is the first step in computational computer vision. The use of precisely calibrated cameras makes it possible to measure distances in a metric world from their projections on the image plane. The camera model is a mathematical description of the geometric relationship between 3D entities and their 2D projections on the image plane. It consists of a set of intrinsic parameters, which describe the internal geometry and optics of the camera, and a set of extrinsic parameters, which describe the position and orientation of the camera in the scene. Perspective cameras can be represented by several models depending on the desired level of accuracy.
Given a 3D point in metric coordinates with respect to the world coordinate system {W}, its projection in pixels with respect to the image coordinate system {I} can be computed through a set of linear (and sometimes non-linear) equations. This set of equations encapsulates several transformations, which can be broken down into four steps (see Fig. 1).
- 1.
First, the coordinates of the 3D point in the world coordinate system are transformed into the camera coordinate system by an Euclidean transformation (a rotation and a translation).
- 2.
Next, the point is projected onto the image plane by a projective transformation, yielding its ideal (undistorted) image point in metric coordinates.
- 3.
The third step models lens distortion, which displaces the ideal projection on the image plane; the ideal image point is mapped to the real (distorted) projection.
- 4.
Finally, the last step transforms the point from the metric coordinate system of the camera into the image coordinate system of the computer, in pixels.
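The four steps above can be sketched in a few lines of code. The following is a minimal sketch in Python/NumPy, assuming a pinhole model with a single first-order radial distortion coefficient; all parameter names are illustrative and not the article's notation:

```python
import numpy as np

def project_point(Pw, R, t, f, k1, u0, v0, ku, kv):
    """Project a 3D world point into pixel coordinates in four steps.
    Pw: 3D point in world coordinates; R, t: extrinsic parameters;
    f, k1, u0, v0, ku, kv: illustrative intrinsic parameters."""
    # 1. World -> camera coordinates (Euclidean transformation).
    Pc = R @ Pw + t
    # 2. Perspective projection onto the image plane (metric units).
    x, y = f * Pc[0] / Pc[2], f * Pc[1] / Pc[2]
    # 3. Radial lens distortion (first-order model).
    r2 = x * x + y * y
    xd, yd = x * (1 + k1 * r2), y * (1 + k1 * r2)
    # 4. Metric image coordinates -> pixel coordinates.
    u, v = u0 + ku * xd, v0 + kv * yd
    return np.array([u, v])
```

With the distortion coefficient set to zero, the model reduces to the linear pinhole camera of step 2.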
Small variations in the definition of the geometric transformations used imply different camera models, resulting in different calibration techniques. For instance, the technique proposed by Hall [1] in 1982 is based on an implicit linear camera calibration: computing the 3×4 transformation matrix which relates 3D object points with their 2D image projections. The later work of Faugeras [2], proposed in 1986, extracted the physical parameters of the camera from this sort of transformation, and some years later it was adapted to include radial lens distortion [3]. Two other interesting contributions are the widely used method proposed by Tsai [4], which is based on a two-step technique modelling only radial lens distortion, and the complete model of Weng [5], proposed in 1992, which includes three different types of lens distortion. Research efforts continue on new camera models which improve both the accuracy of the computed optical ray and the extraction of the camera parameters that best model reality. For additional details concerning camera calibration methods, the reader is referred to the recent calibration survey [6].
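An implicit linear calibration in the spirit of Hall [1] can be sketched as a direct linear transformation: stacking two equations per correspondence and taking the null space of the resulting system recovers the 3×4 matrix up to scale. The following is a minimal Python/NumPy sketch, without the data normalization a robust implementation would add; the function name is illustrative:

```python
import numpy as np

def estimate_projection_matrix(X, x):
    """Recover the 3x4 matrix M such that x ~ M X (homogeneous) from
    n >= 6 world/image correspondences.
    X: (n, 3) world points, x: (n, 2) pixel points."""
    n = X.shape[0]
    A = np.zeros((2 * n, 12))
    for i in range(n):
        Xh = np.append(X[i], 1.0)      # homogeneous world point
        u, v = x[i]
        A[2 * i, 0:4] = Xh             # u-equation
        A[2 * i, 8:12] = -u * Xh
        A[2 * i + 1, 4:8] = Xh         # v-equation
        A[2 * i + 1, 8:12] = -v * Xh
    # Solution: right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)
```

With six or more non-coplanar points the matrix is determined up to scale; decomposing it into physical parameters is the explicit approach taken by Faugeras [2].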
In the binocular case (that is, two views from a stereoscopic system or two different views from a single moving camera), another interesting relationship is defined by the so-called epipolar geometry. This information is contained in the fundamental matrix, which includes the intrinsic parameters of both cameras and the position and orientation of one camera with respect to the other. The fundamental matrix can be used to simplify the matching process between viewpoints and to obtain the camera parameters in active systems, where optical and geometrical characteristics may change dynamically depending on the imaged scene. In this case, the camera parameters can be extracted using Kruppa's equations [7]. Moreover, the epipolar geometry can be considered from both a continuous and a discrete point of view.
Probably the most well-known viewpoint is the discrete epipolar constraint formulated by Longuet–Higgins [8], Huang [9] and Faugeras [10]. In this case the relative 3D displacement between the two views is recovered through the epipolar constraint from a set of correspondences in both image planes. Given an object point M with respect to one of the two camera coordinate systems and its 2D projections m and m′ on both image planes (in metric coordinates), the three points define a plane Π which intersects both image planes at the epipolar lines l and l′, respectively, as shown in Fig. 2. Note that the same plane Π can be computed using both focal points C and C′ and a single 2D projection, which is the principle used to reduce the correspondence problem to a single search along the epipolar line. Moreover, the intersection of all the epipolar lines defines an epipole on each image plane, which can also be obtained by intersecting the line through both focal points C and C′ with each image plane. All the epipolar geometry is contained in the so-called fundamental matrix F [8], as shown in Eq. (1):

m′ᵀ F m = 0,   (1)

where the fundamental matrix depends on the intrinsic parameters A and A′ of both cameras and the rigid transformation (R, t) between them:

F = A′⁻ᵀ [t]× R A⁻¹.   (2)
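For concreteness, the epipolar constraint can be checked numerically by building the fundamental matrix from known calibration and motion. The following Python/NumPy sketch uses one common sign and ordering convention (conventions vary between texts):

```python
import numpy as np

def skew(t):
    """Antisymmetric matrix such that skew(t) @ x == np.cross(t, x)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_from_motion(K1, K2, R, t):
    """F = K2^-T [t]x R K1^-1 for a second camera at pose (R, t) so
    that a point X in camera-1 coordinates is R @ X + t in camera 2."""
    return np.linalg.inv(K2).T @ skew(t) @ R @ np.linalg.inv(K1)
```

Any pair of corresponding homogeneous projections m and m′ then satisfies m′ᵀ F m = 0, and F m gives the coefficients of the epipolar line on which m′ must lie.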
When the intrinsic camera parameters are known, it is possible to simplify F by normalizing the projections, q = A⁻¹ m and q′ = A′⁻¹ m′, obtaining

q′ᵀ E q = 0,   (3)

where E = [t]× R. The matrix E is called essential [9].
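When the intrinsics are known, the essential matrix can be recovered from the fundamental matrix by undoing the effect of the intrinsic parameters. A Python/NumPy sketch follows; since F is only ever estimated up to scale, the relation E = A′ᵀ F A also holds only up to scale:

```python
import numpy as np

def essential_from_fundamental(F, A1, A2):
    """Undo the intrinsic parameters: E = A2^T F A1, equal (up to
    scale) to [t]x R for the rigid motion between the two views."""
    return A2.T @ F @ A1
```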
Many papers describe different methods to estimate the fundamental matrix [11], [12], [13], [14].
The differential case is the infinitesimal version of the discrete case, in which both views are always given by a single moving camera. If the velocity of the camera is low enough and the frame rate is very high, the relative displacement between two consecutive images becomes very small. The 2D displacement of image points can then be obtained from the image sequence using the optical flow. In this case, the 3D camera motion is described by a rigid motion composed of a rotation matrix R(t) and a translation vector T(t), as in Eq. (4) (Fig. 3):

P(t) = R(t) P(0) + T(t),   (4)

where, differentiating,

Ṗ(t) = Ṙ(t) P(0) + Ṫ(t).   (5)

Then, replacing the parameter P(0) by Rᵀ(t)(P(t) − T(t)) in Eq. (5), the following equation is obtained:

Ṗ = ω̂ P + v,  with ω̂ = Ṙ Rᵀ and v = Ṫ − ω̂ T,   (6)

which leads to the following differential epipolar constraint:

q̇ᵀ v̂ q + qᵀ ω̂ v̂ q = 0.   (7)

Here ω is the angular velocity of the camera, v is its linear velocity, and ω̂ and v̂ denote the antisymmetric matrices that implement the cross product with ω and v. By projecting P and Ṗ onto the image plane, the point q in camera coordinates and its corresponding optical flow q̇ are obtained.
For a complete demonstration, the reader is directed to Chapter 15 of Haralick's book [15], where the movement of a rigid body relative to a camera is explained. In our case the demonstration is used to describe the movement of a camera relative to a static object, in which only the sign of the obtained velocities differs from the previous one. Nevertheless, Eq. (7) can also be derived in different ways, as explained by Viéville [16] and Brooks [17]. Another equivalent form of Eq. (7) is shown in Eq. (8); in this case, since the matrix s is symmetric, the number of unknowns it contributes is reduced to six:

q̇ᵀ v̂ q + qᵀ s q = 0,   (8)

where

s = ½ (ω̂ v̂ + v̂ ω̂).   (9)
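The differential epipolar constraint can be verified numerically: generating a scene point, a velocity pair (v, ω) and the induced optical flow, the residual vanishes. The following is a Python/NumPy sketch assuming the convention Ṗ = ω̂P + v for the motion of scene points in the camera frame; with noisy flow, this residual becomes the quantity the linear methods minimize:

```python
import numpy as np

def skew(w):
    """Antisymmetric matrix such that skew(w) @ x == np.cross(w, x)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def differential_epipolar_residual(q, qdot, v, w):
    """Residual of the differential epipolar constraint
    qdot^T [v]x q + q^T [w]x [v]x q, zero for consistent data.
    q = (x, y, 1) is a normalized image point, qdot its optical flow."""
    V = skew(v)
    return qdot @ V @ q + q @ skew(w) @ V @ q
```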
The existence of two forms indicates that a redundancy exists in Eq. (7) (for a demonstration see Viéville [16], Brooks [17] and Ma [18]). Several books describe optical flow, such as Trucco and Verri [19], and the article published by Barron et al. [20] gives a state of the art in optical flow estimation.
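As a pointer to how the flow itself can be obtained, a single translational flow vector can be estimated in the least-squares spirit of Lucas–Kanade. This is a minimal Python/NumPy sketch over a whole patch; production flow estimators, as surveyed by Barron et al. [20], add windowing, pyramids and robust weighting:

```python
import numpy as np

def lucas_kanade_flow(I1, I2):
    """Estimate a single translational flow vector (dx, dy) by least
    squares over the whole patch from two grayscale frames, using the
    brightness-constancy equation Ix*dx + Iy*dy + It = 0."""
    Iy, Ix = np.gradient(I1)      # spatial gradients (axis 0 is y)
    It = I2 - I1                  # temporal derivative
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow
```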
When comparing the discrete and differential methods, the discrete epipolar equation incorporates a single matrix (the essential matrix), whereas the differential epipolar equation incorporates two matrices. These matrices encode information about the linear and angular velocities of the camera [15].
Approaches to motion estimation can be classified into discrete and differential methods depending on whether they use a set of point correspondences or optical flow. Another possible classification takes into account the estimation techniques used for motion recovery (linear or non-linear techniques). In Table 1, the algorithms are summarized and classified in terms of their nature (discrete and differential case), and estimation method (linear and non-linear technique).
This article analyzes several algorithms for camera motion estimation based on differential image motion. The surveyed methods have been implemented and compared with one another, and experimental results are given. Moreover, this article analyzes the adaptation of general methods for free 3D movement to planar motion, which corresponds to the common case of a robot moving on a plane, with the aim of studying how much accuracy improves by constraining the camera movement. The article focuses on linear techniques, as the motion has to be recovered in real time.
This article is structured as follows. Section 2 describes twelve algorithms for 3D motion estimation based on optical flow. Section 3 focuses on the estimation of planar motion by constraining the free movement described in the previous section. Section 4 deals with the experimental results obtained. The article ends with conclusions.
Overview of 3D motion estimation
In this section, we detail some methods used to recover all six motion parameters (6 DOF) from optical flow, providing insight into the complexity of the problem. The surveyed methods are classified according to whether or not they are based on the differential epipolar constraint.
Adaptation to a mobile robot
The aim of this work is to estimate the motion of a mobile robot. Because the permitted movements of the robot are limited, some modifications can be introduced in the differential epipolar equation by applying new constraints. With these modifications the number of potential solutions is reduced, so the obtained results improve considerably.
Our robot (see Fig. 5) is constrained to only two independent movements: a translation along one axis and a rotation around a perpendicular axis.
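Under such a 2-DOF constraint the differential epipolar equation collapses to a single unknown per point (up to the unrecoverable translation scale). The following Python/NumPy sketch makes the illustrative choice of translation along the optical axis and rotation about the vertical axis; the actual axes depend on how the camera is mounted on the robot:

```python
import numpy as np

def skew(w):
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

# Rotation / translation axes (an illustrative choice, not the
# article's specific setup).
S2, S3 = skew([0.0, 1.0, 0.0]), skew([0.0, 0.0, 1.0])

def estimate_planar_rotation(qs, qdots):
    """Least-squares estimate of the rotational velocity w_y from the
    differential epipolar constraint with v along the optical axis and
    w about the vertical axis. Each row of qs is q = (x, y, 1); qdots
    holds the corresponding optical flow vectors."""
    a = np.array([q @ S2 @ S3 @ q for q in qs])               # w_y coefficient
    b = np.array([-qd @ S3 @ q for q, qd in zip(qs, qdots)])  # right-hand side
    return float(np.dot(a, b) / np.dot(a, a))
```

Dividing the constraint by the (nonzero) translational speed removes the scale, which is why only the rotational velocity is recovered here.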
Experimental results
All the methods surveyed have been programmed and tested under the same image-noise conditions with the aim of giving an exhaustive comparison of most 6-DOF motion estimation methods. Section 4.1 compares the twelve surveyed methods of 3D motion estimation, Section 4.2 deals with the six proposed adaptations to 2-DOF mobile robot movement estimation, and Section 4.3 shows results on real image sequences.
Conclusions
This article presents an up-to-date classification of the methods and techniques used to estimate the movement of a single camera. Several motion recovery methods are surveyed, and experimental results are given on synthetic data considering both Gaussian noise and outliers.
The general methods for estimating a 6-DOF movement have been adapted to the common case of a mobile robot moving on a plane, obtaining better results and stability even under severe noise conditions.
Acknowledgements
We greatly appreciate Dr. Tina Y. Tian, Dr. Carlo Tomasi and Dr. David J. Heeger, who implemented the methods explained in Section 2.2 that have been compared with the rest of the methods explained in this article, especially Dr. Heeger, who gave us insightful information and the source code of these methods.
About the Author—XAVIER ARMANGUÉ received the B.S. degree in Computer Science from the University of Girona in 1999 before joining the Computer Vision and Robotics Group. At present he is involved in the study of stereovision systems for mobile robotics and is working toward his Ph.D. in the Computer Vision and Robotics Group at the University of Girona and in the Institute of Systems and Robotics at the University of Coimbra.
References (48)
- et al., A robust-coded pattern projection for dynamic 3D scene measurement, Pattern Recognition Lett. (1998)
- et al., A comparative review of camera calibrating methods with accuracy evaluation, Pattern Recognition (2002)
- et al., Overall view regarding fundamental matrix estimation, Image Vision Comput. (2003)
- et al., The first order expansion of motion equations in the uncalibrated case, Comput. Vision Graphics Image Process.: Image Understanding (1996)
- et al., A simplified linear optic flow-motion algorithm, Comput. Vision Graphics Image Process. (1988)
- Determining the instantaneous direction of motion from optical flow generated by a curvilinearly moving observer, Comput. Graphics Image Process. (1981)
- et al., Passive navigation, Comput. Vision Graphics Image Process. (1983)
- et al., Measuring curved surfaces for robot vision, Comput. J. (1982)
- O.D. Faugeras, G. Toscani, The calibration problem for stereo, in: Proceedings of the IEEE Conference on Computer...
- A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses, IEEE J. Robotics Automat. (1987)
- Camera calibration with distortion models and accuracy evaluation, IEEE Trans. Pattern Anal. Mach. Intell.
- Kruppa's equations derived from the fundamental matrix, IEEE Trans. Pattern Anal. Mach. Intell.
- A computer algorithm for reconstructing a scene from two projections, Nature
- Some properties of the E matrix in two-view motion estimation, IEEE Trans. Pattern Anal. Mach. Intell.
- Three-Dimensional Computer Vision
- Determining the epipolar geometry and its uncertainty: a review, Int. J. Comput. Vision
- The fundamental matrix: theory, algorithms, and stability analysis, Int. J. Comput. Vision
- The development and comparison of robust methods for estimating the fundamental matrix, Int. J. Comput. Vision
- Determining the ego-motion of an uncalibrated camera from instantaneous optical flow, J. Opt. Soc. Am.
- Linear differential algorithm for motion recovery: a geometric approach, Int. J. Comput. Vision
- Introductory Techniques for 3D Computer Vision
About the Author—HELDER ARAÚJO is currently Associate Professor at the Department of Electrical and Computer Engineering of the University of Coimbra. He is Deputy Director of the Institute for Systems and Robotics, Coimbra. His main research interests are computer vision and mobile robotics. He has been working in vision and robotics for the last 13 years.
About the Author—JOAQUIM SALVI graduated in Computer Science from the Polytechnic University of Catalonia in 1993. He joined the Computer Vision and Robotics Group at the University of Girona, where he received the M.S. degree in Computer Science in July 1996 and the Ph.D. in Industrial Engineering in January 1998. He received the best thesis award in Industrial Engineering of the University of Girona. At present he is an associate professor in the Electronics, Computer Engineering and Automation Department of the University of Girona. His current interests are in the field of computer vision and mobile robotics, focusing on structured light, stereovision and camera calibration.
1. This research is partly supported by Spanish project CICYT-TAP99-0443-C05-01.