Spatial Query Estimation without the Local Uniformity Assumption

GeoInformatica

Abstract

Existing estimation approaches for spatial databases often rely on the assumption that the data distribution in a small region is uniform, which seldom holds in practice. Moreover, their applicability is limited to specific estimation tasks under a particular distance metric. This paper develops the Power-method, a comprehensive technique applicable to a wide range of query optimization problems under both the L∞ and L2 metrics. The Power-method eliminates the local uniformity assumption and is therefore accurate even for datasets where existing approaches fail. Furthermore, it performs estimation by evaluating only one simple formula, with minimal computational overhead. Extensive experiments confirm that the Power-method outperforms previous techniques in terms of accuracy and applicability to various optimization scenarios.
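To make the local uniformity assumption concrete, the following minimal sketch shows a generic grid-histogram estimator for range-count queries, the kind of approach the abstract contrasts with. It is not the Power-method, whose formula is not given on this page. The function names (`build_grid`, `estimate_count`, `exact_count`), the unit-square data space, and the sample data are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi)


def build_grid(points: List[Point], cells: int) -> List[List[int]]:
    """Count points per cell of a cells x cells grid over the unit square."""
    grid = [[0] * cells for _ in range(cells)]
    for x, y in points:
        i = min(int(x * cells), cells - 1)
        j = min(int(y * cells), cells - 1)
        grid[i][j] += 1
    return grid


def estimate_count(grid: List[List[int]], q: Rect) -> float:
    """Estimate the number of points in q, assuming points are spread
    uniformly inside every cell (the local uniformity assumption)."""
    cells = len(grid)
    side = 1.0 / cells
    total = 0.0
    for i in range(cells):
        for j in range(cells):
            # Extent of the overlap between q and cell (i, j) on each axis.
            dx = min(q[2], (i + 1) * side) - max(q[0], i * side)
            dy = min(q[3], (j + 1) * side) - max(q[1], j * side)
            if dx > 0 and dy > 0:
                # Scale the cell count by the fraction of the cell covered.
                total += grid[i][j] * (dx * dy) / (side * side)
    return total


def exact_count(points: List[Point], q: Rect) -> int:
    """Count the points that actually fall inside q (boundaries included)."""
    return sum(1 for x, y in points if q[0] <= x <= q[2] and q[1] <= y <= q[3])


# Hypothetical skewed data: 100 points packed into one corner of a cell.
points = [(0.01 + k * 0.0001, 0.01) for k in range(100)]
grid = build_grid(points, cells=2)
query = (0.25, 0.25, 0.5, 0.5)

print(estimate_count(grid, query), exact_count(points, query))
```

On this deliberately skewed sample, the estimator reports 25 points for a query region that contains none, because it spreads the 100 clustered points evenly over their cell. Errors of this kind are what the abstract attributes to the local uniformity assumption.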

Author information

Corresponding author

Correspondence to Yufei Tao.

About this article

Cite this article

Tao, Y., Faloutsos, C. & Papadias, D. Spatial Query Estimation without the Local Uniformity Assumption. Geoinformatica 10, 261–293 (2006). https://doi.org/10.1007/s10707-006-9828-7

  • DOI: https://doi.org/10.1007/s10707-006-9828-7
