Sampling Minimal Subsets with Large Spans for Robust Estimation

International Journal of Computer Vision

Abstract

When sampling minimal subsets for robust parameter estimation, it is commonly known that obtaining an all-inlier minimal subset is not sufficient; the points therein should also have a large spatial extent. This paper investigates a theoretical basis behind this principle, based on a little-known result which expresses the least squares estimate as a weighted linear combination of all possible minimal subset estimates. It turns out that the weight of a minimal subset estimate is directly related to the span of the associated points. We then derive an analogous result for total least squares which, unlike ordinary least squares, corrects for errors in both dependent and independent variables. We establish the relevance of our result to computer vision by relating total least squares to geometric estimation techniques. As practical contributions, we explain why naive distance-based sampling fails as a strategy to maximise the span of the all-inlier minimal subsets produced. In addition, we propose a novel method which, unlike previous methods, can consciously target all-inlier minimal subsets with large spans.
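
To make the statement concrete, the following is a minimal numerical sketch of the weighted-combination result referred to above (cf. Subrahmanyam 1972): the ordinary least squares estimate equals the average of all minimal-subset estimates, each weighted by the squared determinant, i.e., the span, of its sub-design-matrix. This is our own illustration, not code accompanying the paper, and all variable names are assumptions.

```python
# Numerical sketch (illustrative, not the paper's code): OLS as a weighted
# combination of minimal-subset estimates, weights = squared subset spans.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, m = 8, 2                      # n points, m parameters (minimal subset size)
X = rng.normal(size=(n, m))      # design matrix
y = rng.normal(size=n)           # observations

# Direct OLS estimate.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Weighted combination over all size-m subsets (Cauchy-Binet argument).
num = np.zeros(m)
den = 0.0
for idx in itertools.combinations(range(n), m):
    X_I, y_I = X[list(idx)], y[list(idx)]
    w = np.linalg.det(X_I) ** 2          # weight = squared span of the subset
    if w > 1e-12:                        # skip (near-)degenerate subsets
        num += w * np.linalg.solve(X_I, y_I)
        den += w
beta_mix = num / den

print(np.allclose(beta_ols, beta_mix))   # True (up to numerical precision)
```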

Notes

  1. Actually 7 matches are sufficient since \(\mathbf F \) is rank deficient by one, but estimation from 7 matches is more complicated. In any case the rank constraint can be imposed post-estimation from 8 matches (Hartley 1997); see the sketch after these notes.

  2. http://cms.brookes.ac.uk/research/visiongroup/

  3. Note that unlike Algorithms 1 and 2, Algorithms 3 and 4 update the sampling distribution (Step 5) according to the data sampled so far into the minimal subset. This can also be done for Algorithms 1 and 2, e.g., by recentring (54) and (55) on the datum last sampled. However, our experiments suggest that this produces worse performance.
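
As an aside to Note 1, the rank-2 correction mentioned there is commonly performed by truncating the smallest singular value of the linearly estimated fundamental matrix (Hartley 1997). The sketch below is our own illustration under that assumption; `F_est` is a hypothetical stand-in for an 8-point estimate.

```python
# Illustrative sketch (not the paper's code): enforce the rank-2 constraint on
# a fundamental matrix estimated linearly from 8 matches, by zeroing the
# smallest singular value (Hartley 1997).
import numpy as np

def enforce_rank2(F):
    """Return the closest (in Frobenius norm) rank-2 matrix to F."""
    U, s, Vt = np.linalg.svd(F)
    s[-1] = 0.0                      # drop the smallest singular value
    return U @ np.diag(s) @ Vt

F_est = np.random.default_rng(1).normal(size=(3, 3))   # stand-in for an 8-point estimate
F_rank2 = enforce_rank2(F_est)
print(np.linalg.matrix_rank(F_rank2))                  # 2
```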

References

  • Chin, T. J., Yu, J., & Suter, D. (2010). Accelerated hypothesis generation for multi-structure robust fitting. In European Conference on Computer Vision (ECCV).

  • Chin, T. J., Yu, J., & Suter, D. (2012). Accelerated hypothesis generation for multi-structure data via preference analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(4), 625–638.

  • Chum, O., & Matas, J. (2005). Matching with PROSAC—Progressive sample consensus. In Computer Vision and Pattern Recognition (CVPR).

  • Chum, O., & Matas, J. (2010). Planar affine rectification from change of scale. In Asian Conference on Computer Vision (ACCV).

  • Chum, O., Matas, J., & Kittler, J. (2003). Locally optimized RANSAC. In Deutsche Arbeitsgemeinschaft für Mustererkennung (DAGM).

  • Chum, O., Matas, J., & Obdržálek, S. (2004). Enhancing RANSAC by generalized model optimization. In Asian Conference on Computer Vision (ACCV).

  • Chum, O., Werner, T., & Matas, J. (2005). Two-view geometry estimation unaffected by a dominant plane. In Computer Vision and Pattern Recognition (CVPR).

  • de Groen, P. (1996). An introduction to total least squares. Nieuw Archief voor Wiskunde, 4(14), 237–253.

  • Fischler, M. A., & Bolles, R. C. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24, 381–395.

  • Frahm, J. M., & Pollefeys, M. (2006). RANSAC for (quasi-)degenerate data (QDEGSAC). In Computer Vision and Pattern Recognition (CVPR).

  • Golub, G. H., Hoffman, A., & Stewart, G. W. (1987). A generalization of the Eckart-Young-Mirsky matrix approximation theorem. Linear Algebra and its Applications, 88–89, 317–327.

  • Golub, G. H., & van Loan, C. F. (1980). An analysis of the total least squares problem. SIAM Journal on Numerical Analysis, 17(6), 883–893.

  • Goshen, L., & Shimshoni, I. (2008). Balanced exploration and exploitation model search for efficient epipolar geometry estimation. IEEE Transactions on Pattern Analysis and Machine Intelligence.

  • Harker, M., & O’Leary, P. (2006). Direct estimation of homogeneous vectors: An ill-solved problem in computer vision. In Indian Conference on Computer Vision, Graphics and Image Processing.

  • Hartley, R. (1997). In defense of the eight-point algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(6), 580–593.

  • Hartley, R., & Zisserman, A. (2004). Multiple View Geometry (2nd ed.). Cambridge: Cambridge University Press.

  • Hoerl, A. E., & Kennard, R. W. (1980). A note on least squares estimates. Communications in Statistics: Simulation and Computation, 9(3), 315–317.

  • Jacobi, C. G. J. (1841). De formatione et proprietatibus determinantium. Journal für die reine und angewandte Mathematik, 22, 285–318.

  • Kahl, F., & Hartley, R. (2008). Multiple-view geometry under the \(l_\infty \)-norm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(9), 1603–1617.

  • Kahl, F., & Henrion, D. (2005). Globally optimal estimates for geometric reconstruction problems. In International Conference on Computer Vision (ICCV).

  • Kanazawa, Y., & Kawakami, H. (2004). Detection of planar regions with uncalibrated stereo using distributions of feature points. In British Machine Vision Conference (BMVC).

  • Kemp, C., & Drummond, T. (2005). Dynamic measurement clustering to aid real time tracking. In International Conference on Computer Vision (ICCV).

  • Kukush, A., Markovsky, I., & van Huffel, S. (2002). Consistent fundamental matrix estimation in a quadratic measurement error model arising in motion analysis. Computational Statistics and Data Analysis, 41(1), 3–18.

  • Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110.

  • Meer, P. (2004). Robust techniques for computer vision. In G. Medioni & S. B. Kang (Eds.), Emerging topics in computer vision. Prentice Hall.

  • Mikolajczyk, K., & Schmid, C. (2005). A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1615–1630.

  • Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., et al. (2005). A comparison of affine region detectors. International Journal of Computer Vision, 65(1), 43–72.

  • Mühlich, M., & Mester, R. (1998). The role of total least squares in motion analysis. In European Conference on Computer Vision (ECCV).

  • Myatt, D. R., Torr, P. H. S., Nasuto, S. J., Bishop, J. M., & Craddock, R. (2002). NAPSAC: high noise, high dimensional robust estimation—It’s in the bag. In British Machine Vision Conference (BMVC).

  • Olsson, C., Eriksson, A., & Hartley, R. (2010). Outlier removal using duality. In Computer Vision and Pattern Recognition (CVPR).

  • Pham, T. T., Chin, T. J., Yu, J., & Suter, D. (2012). The random cluster model for robust geometric fitting. In Computer Vision and Pattern Recognition (CVPR).

  • Rousseeuw, P. J., & Leroy, A. M. (1987). Robust regression and outlier detection. New York: Wiley.

  • Scherer-Negenborn, N., & Schaefer, R. (2010). Model fitting with sufficient random sample coverage. International Journal of Computer Vision, 89, 120–128.

  • Stigler, S. M. (2000). The history of statistics: The measurement of uncertainty before 1900 (8th edn., Chap. 1). The Belknap Press of Harvard University Press.

  • Subrahmanyam, M. (1972). A property of simple least squares estimates. Sankhya B, 34, 3.

  • Tordoff, B. J., & Murray, D. W. (2005). Guided-MLESAC: Faster image transform estimation by using matching priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10), 1523–1535.

  • van Huffel, S., & Vandewalle, J. (1989). Algebraic connections between the least squares and total least squares problem. Numerische Mathematik, 55, 431–449.

  • van Huffel, S., & Vandewalle, J. (1991). The total least squares problem: Computational aspects and analysis. Philadelphia, PA: SIAM Publications.

  • Vedaldi, A., & Fulkerson, B. (2008). VLFeat: An open and portable library of computer vision algorithms. http://www.vlfeat.org/

  • Wong, H. S., Chin, T. J., Yu, J., & Suter, D. (2011). Dynamic and hierarchical multi-structure geometric model fitting. In International Conference on Computer Vision (ICCV).

  • Zhang, Z. (1997). Parameter estimation techniques: A tutorial with application to conic fitting. Image and Vision Computing, 15(1), 59–76.

Author information

Corresponding author

Correspondence to Tat-Jun Chin.

Appendices

Appendix 1: Proof of Proposition 2 for TLS with non-minimal subsets

From Sect. 2.1, the weight of a non-minimal subset \(\nu \) is proportional to \(|\mathbf{X }(\nu )^T \mathbf{X }(\nu )|\) which, from (36), is equal to

$$\begin{aligned} |\mathbf{X }(\nu )^T \mathbf{X }(\nu )|&= |\mathbf{V }\mathbf{S }^T_m\mathbf{U }_m(\nu )^T\mathbf{U }_m(\nu ) \mathbf{S }_m \mathbf{V }^T |\end{aligned}$$
(65)
$$\begin{aligned}&= |\mathbf{U }_m(\nu )^T \mathbf{U }_m(\nu )| |\mathbf{S }_m \mathbf{V }^T|^2. \end{aligned}$$
(66)

Similarly,

$$\begin{aligned} |\mathbf{Z }(\nu )^T \mathbf{Z }(\nu )| = |\mathbf{U }_m(\nu )^T \mathbf{U }_m(\nu )| |\tilde{\mathbf{S }}_m \mathbf{V }^T|^2. \end{aligned}$$
(67)

Therefore, \(|\mathbf{X }(\nu )^T \mathbf{X }(\nu )| = \alpha |\mathbf{Z }(\nu )^T \mathbf{Z }(\nu )|\) where \(\alpha \) is a constant which does not depend on \(\nu \). This proves that

$$\begin{aligned}&|\mathbf{X }(\nu _1)^T \mathbf{X }(\nu _1)| > |\mathbf{X }(\nu _2)^T \mathbf{X }(\nu _2)|\end{aligned}$$
(68)
$$\begin{aligned}&\implies |\mathbf{Z }(\nu _1)^T \mathbf{Z }(\nu _1)| > |\mathbf{Z }(\nu _2)^T \mathbf{Z }(\nu _2)|. \end{aligned}$$
(69)
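
A small numerical check of this proportionality may be helpful. The sketch below is our own illustration with assumed variable names: it builds \(\mathbf{X }\) and \(\mathbf{Z }\) sharing the left singular vectors \(\mathbf{U }_m\), as in (65)–(67), and confirms that the ratio of subset determinants does not depend on \(\nu \).

```python
# Numerical sanity check (illustrative, not from the paper) of Appendix 1:
# if X = U_m S_m V^T and Z = U_m S~_m V^T share the left singular vectors U_m,
# then |X(v)^T X(v)| / |Z(v)^T Z(v)| is the same constant for every subset v.
import itertools
import numpy as np

rng = np.random.default_rng(2)
n, m = 7, 3
X = rng.normal(size=(n, m))
U_m, S_m, Vt = np.linalg.svd(X, full_matrices=False)
S_tilde = S_m + rng.uniform(0.1, 1.0, size=m)        # perturbed singular values
Z = U_m @ np.diag(S_tilde) @ Vt

ratios = []
for v in itertools.combinations(range(n), m + 1):    # non-minimal subsets of size m+1
    Xv, Zv = X[list(v)], Z[list(v)]
    ratios.append(np.linalg.det(Xv.T @ Xv) / np.linalg.det(Zv.T @ Zv))
print(np.allclose(ratios, ratios[0]))                # True: ratio is subset-independent
```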

Appendix 2: Proof of Propositions 1 and 2 for mixed OLS-TLS

We aim to prove that Proposition 1 holds for the mixed OLS-TLS problem (Sect. 3.3), i.e., the solution to the OLS problem

$$\begin{aligned} \underset{{\varvec{\beta }}}{\mathrm{arg~min }}~\Vert \mathbf{y }- \hat{\mathbf{y }} \Vert ^2~~~\mathrm{s.t.}~~~\mathbf{Z }{\varvec{\beta }}= \hat{\mathbf{y }} \end{aligned}$$
(70)

coincides with the mixed OLS-TLS estimate

$$\begin{aligned} \breve{{\varvec{\beta }}} = \left( \mathbf{X }^T\mathbf{X }- \sigma _{m_2 + 1}^2 \mathbf L \right) ^{-1} \mathbf{X }^T \mathbf{y }= (\mathbf{X }^T \mathbf{Z })^{-1} \mathbf{X }^T \mathbf{y }\end{aligned}$$
(71)

where

$$\begin{aligned} \mathbf{Z }:= \mathbf{X }- \sigma _{m_2 + 1}^2(\mathbf{X }^T)^\dagger \mathbf L . \end{aligned}$$
(72)

See Sect. 3.3 for the definition of the other symbols involved.

Let \(\tilde{{\varvec{\beta }}}\) be the solution to (70). Then

$$\begin{aligned} \tilde{{\varvec{\beta }}} = (\mathbf{Z }^T\mathbf{Z })^{-1} \mathbf{Z }^T \mathbf{y }\end{aligned}$$
(73)

which, following the proof of Proposition 1, can be rearranged to become

$$\begin{aligned} \mathbf{X }^T\mathbf{Z }\tilde{{\varvec{\beta }}}&= \mathbf{X }^T\mathbf{y }+ \sigma _{m_2+1}^2 \mathbf L ^T (\mathbf{X }^T)^{\dagger T}(\mathbf{Z }\tilde{{\varvec{\beta }}} - \mathbf{y }). \end{aligned}$$
(74)

As shown in the proof of Proposition 1, the column spans of \(\mathbf{Z }\) and \((\mathbf{X }^T)^\dagger \) are equal. Since the vector \((\mathbf{Z }\tilde{{\varvec{\beta }}} - \mathbf{y })\) is orthogonal to \(\mathcal R (\mathbf{Z })\), it is also orthogonal to \(\mathcal R ((\mathbf{X }^T)^\dagger )\), and thus the second term on the RHS of (74) vanishes, yielding

$$\begin{aligned} \mathbf{X }^T\mathbf{Z }\tilde{{\varvec{\beta }}} = \mathbf{X }^T\mathbf{y }. \end{aligned}$$
(75)

Comparing (75) with (71) proves \(\tilde{{\varvec{\beta }}} = \breve{{\varvec{\beta }}}\), i.e., the mixed OLS-TLS estimate \(\breve{{\varvec{\beta }}}\) coincides with the solution of the OLS problem (70).
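
The algebraic mechanism used here can be verified numerically: for any \(\mathbf{Z }\) whose column span equals that of \(\mathbf{X }\), as holds for (72) by the argument above, the OLS solution on \(\mathbf{Z }\) coincides with \((\mathbf{X }^T \mathbf{Z })^{-1}\mathbf{X }^T \mathbf{y }\). The sketch below is our own illustration with assumed names, not code from the paper.

```python
# Numerical check (illustrative) of the key step in the proof: if Z has the
# same column span as X, the OLS solution of Z*beta ~ y coincides with
# (X^T Z)^{-1} X^T y, because the OLS residual is orthogonal to R(Z) = R(X).
import numpy as np

rng = np.random.default_rng(3)
n, m = 10, 4
X = rng.normal(size=(n, m))
U_m, _, _ = np.linalg.svd(X, full_matrices=False)
Z = U_m @ rng.normal(size=(m, m))          # full-rank mix of columns: R(Z) = R(X)
y = rng.normal(size=n)

beta_tilde, *_ = np.linalg.lstsq(Z, y, rcond=None)    # solves the OLS problem (70)
beta_breve = np.linalg.solve(X.T @ Z, X.T @ y)        # right-hand side of (71)
print(np.allclose(beta_tilde, beta_breve))            # True
```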

To prove that Proposition 2 also holds for mixed OLS-TLS, it is sufficient to show that, given a non-minimal data subset \(\nu \) of size \(m+i \le n\),

$$\begin{aligned} |\mathbf{X }(\nu )^T \mathbf{X }(\nu )| \propto |\mathbf{Z }(\nu )^T \mathbf{Z }(\nu )|. \end{aligned}$$
(76)

To begin, let \(\mathbf{X }= \mathbf{U }\mathbf{S }\mathbf{V }^T\) be the SVD of \(\mathbf{X }\), so that \((\mathbf{X }^T)^\dagger = \mathbf{U }(\mathbf{S }^\dagger )^T\mathbf{V }^T\). Also, since \(n > m\),

$$\begin{aligned} \mathbf{X }= \mathbf{U }_m \mathbf{S }_m \mathbf{V }^T~~~~(\mathbf{X }^T)^\dagger = \mathbf{U }_m \mathbf{S }^{-1}_m \mathbf{V }^T. \end{aligned}$$
(77)

Then, from (72)

$$\begin{aligned} \mathbf{Z }&= \mathbf{U }_m ( \mathbf{S }_m \mathbf{V }^T - \sigma ^2_{m_2+1} \mathbf{S }_m^{-1}\mathbf{V }^T \mathbf L ) = \mathbf{U }_m \varvec{\varGamma } \end{aligned}$$
(78)

where we define the square matrix

$$\begin{aligned} \varvec{\varGamma } := ( \mathbf{S }_m \mathbf{V }^T - \sigma ^2_{m_2+1} \mathbf{S }_m^{-1}\mathbf{V }^T \mathbf L ). \end{aligned}$$
(79)

Also, observe that

$$\begin{aligned} \mathbf{Z }(\nu )&= \mathbf{U }_m(\nu ) \varvec{\varGamma }. \end{aligned}$$
(80)

From (66), we have the determinant identity

$$\begin{aligned} |\mathbf{X }(\nu )^T \mathbf{X }(\nu )|&= |\mathbf{U }_m(\nu )^T \mathbf{U }_m(\nu )| |\mathbf{S }_m \mathbf{V }^T|^2. \end{aligned}$$
(81)

The determinant \(|\mathbf{Z }(\nu )^T \mathbf{Z }(\nu )|\) is then

$$\begin{aligned} |\mathbf{Z }(\nu )^T \mathbf{Z }(\nu )|&= | \mathbf{U }_m(\nu )^T \mathbf{U }_m(\nu ) | |\varvec{\varGamma }|^2 \end{aligned}$$
(82)

which implies \(|\mathbf{X }(\nu )^T \mathbf{X }(\nu )| \propto |\mathbf{Z }(\nu )^T \mathbf{Z }(\nu )|\). Note that this result also holds for minimal subsets by setting \(i = 0\).

About this article

Cite this article

Tran, Q.H., Chin, TJ., Chojnacki, W. et al. Sampling Minimal Subsets with Large Spans for Robust Estimation. Int J Comput Vis 106, 93–112 (2014). https://doi.org/10.1007/s11263-013-0643-y
