Measuring and Predicting Object Importance

International Journal of Computer Vision

Abstract

How important is a particular object in a photograph of a complex scene? We propose a definition of importance and present two methods for measuring object importance from human observers. Using this ground truth, we fit a function for predicting the importance of each object directly from a segmented image; our function combines a large number of object-related and image-related features. We validate our importance predictions on 2,841 objects and find that the most important objects may be identified automatically. We find that object position and size are particularly informative, while a popular measure of saliency is not.

Author information

Corresponding author

Correspondence to Merrielle Spain.

Additional information

This material is based upon work supported under a National Science Foundation Graduate Research Fellowship, Office of Naval Research grant N00014-06-1-0734, and National Institutes of Health grant R01 DA022777.

About this article

Cite this article

Spain, M., Perona, P. Measuring and Predicting Object Importance. Int J Comput Vis 91, 59–76 (2011). https://doi.org/10.1007/s11263-010-0376-0

  • DOI: https://doi.org/10.1007/s11263-010-0376-0
