Abstract
The data vault model natively supports data and schema evolution, so it is often adopted to create operational data stores. However, it is poorly suited to direct OLAP querying. In this paper we propose an approach called Starry Vault for finding a multidimensional structure in data vaults. Starry Vault builds on the specific features of the data vault model to automate multidimensional modeling, and uses approximate functional dependencies to discover from the data the information needed to infer the structure of multidimensional hierarchies. Manual intervention by the user is limited to some editing of the resulting multidimensional schemata, which makes the overall process simple and quick enough to be compatible with the situational analysis needs of a data scientist.
This work was partly supported by the EU-funded project TOREADOR (contract n. H2020-688797).
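To make the notion of an approximate functional dependency in the abstract concrete, the following is a minimal sketch and not the procedure used by Starry Vault, which the abstract does not detail. It assumes the error measure usually called g3 in the dependency discovery literature: the smallest fraction of rows that must be removed for a dependency to hold exactly. The function names, the 0.05 threshold, and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Fraction of rows to remove so that lhs -> rhs holds exactly (g3 measure).

    rows: list of dicts, lhs: list of attribute names, rhs: one attribute name.
    All names here are illustrative assumptions, not taken from the paper.
    """
    if not rows:
        return 0.0
    # Count, for each combination of lhs values, how often each rhs value occurs.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key][row[rhs]] += 1
    # In each group, keep only the rows carrying the most frequent rhs value.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1 - kept / len(rows)


def holds_approximately(rows, lhs, rhs, threshold=0.05):
    """True if lhs -> rhs holds within the (hypothetical) error threshold."""
    return g3_error(rows, lhs, rhs) <= threshold


# Hypothetical sample data: one noisy row breaks city -> country.
rows = [
    {"city": "Bologna", "country": "Italy"},
    {"city": "Bologna", "country": "Italy"},
    {"city": "Paris", "country": "France"},
    {"city": "Paris", "country": "USA"},
]

print(g3_error(rows, ["city"], "country"))
print(g3_error(rows, ["country"], "city"))
```

Under these assumptions, a dependency such as city -> country that holds within the threshold would suggest that country can be placed above city in a hierarchy, whereas a high error suggests the two attributes are not in a many-to-one relationship.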
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Golfarelli, M., Graziani, S., Rizzi, S. (2016). Starry Vault: Automating Multidimensional Modeling from Data Vaults. In: Pokorný, J., Ivanović, M., Thalheim, B., Šaloun, P. (eds) Advances in Databases and Information Systems. ADBIS 2016. Lecture Notes in Computer Science, vol 9809. Springer, Cham. https://doi.org/10.1007/978-3-319-44039-2_10
DOI: https://doi.org/10.1007/978-3-319-44039-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44038-5
Online ISBN: 978-3-319-44039-2
eBook Packages: Computer Science, Computer Science (R0)