ABSTRACT
Several recent research efforts have addressed answering skyline queries efficiently over large datasets. However, this work lacks methods for computing these queries over uncertain data in which uncertain values are represented as ranges. In this paper, we define skyline queries over continuous uncertain data and propose a novel, efficient framework to answer them. Query answers are probabilistic: each object is associated with a probability of being a query answer. Typically, users specify a probability threshold that each returned object must exceed, and a tolerance value that defines the allowed error margin in the probability calculation in order to reduce computational overhead. Our framework employs an efficient two-phase query processing algorithm.
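The query semantics described above can be illustrated with a small sketch. It is not the paper's two-phase algorithm, which the abstract does not detail; it swaps in plain Monte Carlo sampling. The sketch rests on several assumptions: each attribute is uniformly distributed over its range, smaller values are preferred, and the tolerance is mapped to a sample count through a Hoeffding bound. All function names and the `delta` parameter are hypothetical.

```python
import math
import random


def dominates(a, b):
    """True if point a dominates b: no worse everywhere, strictly better somewhere.

    Assumption: smaller values are preferred in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def samples_for_tolerance(tolerance, delta=0.05):
    """Sample count from the Hoeffding bound n >= ln(2/delta) / (2 * tolerance^2).

    Assumption: delta (the chance of exceeding the tolerance) is our own
    illustrative parameter; the abstract only mentions a tolerance value.
    """
    return math.ceil(math.log(2.0 / delta) / (2.0 * tolerance ** 2))


def skyline_probabilities(objects, tolerance=0.05, seed=0):
    """Estimate, for each object, the probability that it is a skyline point.

    Each object is a list of (low, high) ranges, one per attribute.
    Assumption: values are uniform and independent within their ranges.
    """
    rng = random.Random(seed)
    n_samples = samples_for_tolerance(tolerance)
    counts = [0] * len(objects)
    for _ in range(n_samples):
        # Draw one concrete "possible world": a point for every object.
        world = [tuple(rng.uniform(lo, hi) for lo, hi in obj) for obj in objects]
        for i, p in enumerate(world):
            if not any(dominates(q, p) for j, q in enumerate(world) if j != i):
                counts[i] += 1
    return [c / n_samples for c in counts]


def threshold_skyline(objects, threshold, tolerance=0.05, seed=0):
    """Return (index, probability) pairs whose estimate exceeds the threshold."""
    probs = skyline_probabilities(objects, tolerance, seed)
    return [(i, p) for i, p in enumerate(probs) if p > threshold]


# Hypothetical data: three objects with two range-valued attributes each.
objects = [
    [(0.0, 1.0), (0.0, 1.0)],  # A: always dominates B, never dominated
    [(2.0, 3.0), (2.0, 3.0)],  # B: always dominated by A
    [(0.0, 1.0), (4.0, 5.0)],  # C: dominated by A only when A's first value is smaller
]
answers = threshold_skyline(objects, threshold=0.3)
```

In this toy input, A is a skyline point in every sampled world and B in none, while C's estimate is roughly one half, so a threshold of 0.3 returns A and C. A looser tolerance lowers the sample count and hence the cost, which is the trade-off the abstract attributes to the tolerance value.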
Index Terms
- Skyline query processing for uncertain data