ABSTRACT
Video matting is the process of pulling a high-quality alpha matte and foreground from a video sequence. Current techniques require either a known background (e.g., a blue screen) or extensive user interaction (e.g., to specify known foreground and background elements). The matting problem is generally under-constrained, since not enough information has been collected at capture time. We propose a novel, fully autonomous method for pulling a matte using multiple synchronized video streams that share a point of view but differ in their plane of focus. The solution is obtained by directly minimizing the error in filter-based image formation equations, which are over-constrained by our rich data stream. Our system solves the fully dynamic video matting problem without user assistance: both the foreground and background may be high frequency and have dynamic content, the foreground may resemble the background, and the scene is lit by natural (as opposed to polarized or collimated) illumination.
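The abstract's image-formation model is built on the standard compositing equation, I = αF + (1 − α)B, which the multi-focus streams over-constrain. As a minimal, hypothetical sketch (not the authors' actual solver, which minimizes error over defocused observations), the special case where F and B are known per pixel admits a closed-form matte:

```python
import numpy as np

# Hypothetical 1-D "scanline" illustration of the matting equation
#   I = alpha * F + (1 - alpha) * B
# In the paper, multiple focus-bracketed streams over-constrain this
# system; here we use the simpler known-F/known-B case for clarity.

rng = np.random.default_rng(0)
F = rng.random(8)                        # foreground intensities (grayscale)
B = rng.random(8)                        # background intensities
alpha_true = np.linspace(0.0, 1.0, 8)    # ground-truth matte

# Observed composite image
I = alpha_true * F + (1 - alpha_true) * B

# Per-pixel closed-form estimate: alpha = (I - B) / (F - B)
alpha_est = (I - B) / (F - B)
print(np.allclose(alpha_est, alpha_true))  # → True
```

In the actual defocus setting, F and B are unknown and each camera observes one layer through a defocus filter, so α, F, and B are recovered jointly by minimizing the residual of the filter-based formation equations rather than by this direct division.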