Abstract.
Registering a mission video sequence to a reference image without any metadata, such as camera location, viewing angles, and reference digital elevation models (DEMs), remains a challenging problem. This paper presents a layer-based approach to registering a video sequence to a reference image of a 3D scene containing multiple layers. First, robust layers are extracted from the mission video sequence and a mosaic is generated for each layer, during which the relative transformation parameters between consecutive frames are estimated. We then formulate image registration as a region-partitioning problem: the overlapping regions between two images are partitioned into supporting and nonsupporting (outlier) regions, and the corresponding motion parameters are determined for the supporting regions. In this approach, we first estimate a set of sparse, robust correspondences between the first frame and the reference image. Starting from the corresponding seed patches, the aligned areas are expanded to cover the complete overlapping area of each layer using a graph-cut algorithm with level sets, which registers the first frame to the reference image. Next, using the transformation parameters estimated from the mosaic, we initially align the remaining frames of the video to the reference image. Finally, using the same partitioning framework, the registration is refined by adjusting the aligned areas and removing outliers. Experiments on several examples show that the approach is effective and robust.
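The initial alignment step described in the abstract (carrying the first-frame registration over to the remaining frames through the frame-to-frame transformations from the mosaic) can be illustrated with a minimal sketch. It assumes, purely for illustration, that each layer's motion is a 3x3 planar homography and that each consecutive transformation maps the later frame into the earlier one; the abstract does not state the motion model, and the function names `chain_to_reference`, `apply_homography`, and `translation` are hypothetical rather than taken from the paper.

```python
import numpy as np


def chain_to_reference(h_first_to_ref, h_next_to_prev):
    """Compose per-frame transforms into the reference image.

    h_first_to_ref: 3x3 matrix mapping frame 1 into the reference image
                    (assumed to come from the first-frame registration).
    h_next_to_prev: list of 3x3 matrices; entry i maps frame i+2 into
                    frame i+1 (assumed to come from the layer mosaic).
    Returns a list with one 3x3 matrix per frame, frame 1 first.
    """
    frame_to_ref = [np.asarray(h_first_to_ref, dtype=float)]
    for h in h_next_to_prev:
        # frame k -> frame k-1 -> ... -> frame 1 -> reference
        frame_to_ref.append(frame_to_ref[-1] @ np.asarray(h, dtype=float))
    return frame_to_ref


def apply_homography(h, points):
    """Map an (N, 2) array of pixel coordinates through a 3x3 homography."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(h, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]


def translation(tx, ty):
    """Hypothetical helper: a pure-translation homography for the demo."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])


if __name__ == "__main__":
    first_to_ref = translation(10, 5)
    steps = [translation(2, 0), translation(3, -1)]
    to_ref = chain_to_reference(first_to_ref, steps)
    # Where the origin of frame 3 lands in the reference image.
    print(apply_homography(to_ref[2], [[0.0, 0.0]]))
```

Under these assumptions the composed matrices only provide the starting alignment; the refinement stage described in the abstract, which adjusts the aligned areas and removes outliers, is not modeled here.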
Additional information
Received: 16 September 2004, Accepted: 23 September 2004, Published online: 19 January 2005
About this article
Cite this article
Xiao, J., Shah, M. Layer-based video registration. Machine Vision and Applications 16, 75–84 (2005). https://doi.org/10.1007/s00138-004-0162-5
DOI: https://doi.org/10.1007/s00138-004-0162-5