Abstract
We extend the OLAP data model to represent data ambiguity, specifically imprecision and uncertainty, and introduce an allocation-based approach to the semantics of aggregation queries over such data. We identify three natural query properties and use them to shed light on alternative query semantics. While there is much work on representing and querying ambiguous data, to our knowledge this is the first paper to handle both imprecision and uncertainty in an OLAP setting.
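As a rough illustration of what an allocation-based semantics can look like, the sketch below spreads an imprecise fact over the cells it might belong to and weights its measure accordingly when answering a SUM query. The toy facts, the uniform weighting, and the names `uniform_allocation` and `allocated_sum` are assumptions made for this example; they are not taken from the abstract and should not be read as the paper's definitive method.

```python
# Hypothetical toy data: each fact carries a measure and the set of cells
# (here, cities) it may belong to. A precise fact names exactly one cell.
facts = [
    {"id": "f1", "cells": ["Madison"], "sales": 100.0},
    {"id": "f2", "cells": ["Chicago"], "sales": 60.0},
    # Imprecise fact: only the state is known, so two cities are possible.
    {"id": "f3", "cells": ["Madison", "Milwaukee"], "sales": 40.0},
]


def uniform_allocation(fact):
    """Assumed policy: spread a fact evenly over its candidate cells.

    The returned weights sum to 1 for every fact.
    """
    weight = 1.0 / len(fact["cells"])
    return {cell: weight for cell in fact["cells"]}


def allocated_sum(facts, query_cells, allocate=uniform_allocation):
    """SUM over the query region, weighting each fact by its allocated share."""
    total = 0.0
    for fact in facts:
        for cell, weight in allocate(fact).items():
            if cell in query_cells:
                total += weight * fact["sales"]
    return total


madison_total = allocated_sum(facts, {"Madison"})
wisconsin_total = allocated_sum(facts, {"Madison", "Milwaukee"})
```

Under this assumed policy, a query that covers every cell an imprecise fact could fall in counts that fact in full, while a query that covers only some of those cells counts it fractionally.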
About this article
Cite this article
Burdick, D., Deshpande, P.M., Jayram, T.S. et al. OLAP over uncertain and imprecise data. The VLDB Journal 16, 123–144 (2007). https://doi.org/10.1007/s00778-006-0033-y
DOI: https://doi.org/10.1007/s00778-006-0033-y