Starry Vault: Automating Multidimensional Modeling from Data Vaults

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9809)

Abstract

The data vault model natively supports data and schema evolution, so it is often adopted to create operational data stores. However, it is poorly suited to direct OLAP querying. In this paper we propose an approach, called Starry Vault, for finding a multidimensional structure in data vaults. Starry Vault builds on the specific features of the data vault model to automate multidimensional modeling, and uses approximate functional dependencies to discover from the data the information needed to infer the structure of multidimensional hierarchies. Manual intervention by the user is limited to some editing of the resulting multidimensional schemata, which makes the overall process simple and quick enough to be compatible with the situational analysis needs of a data scientist.
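
To make the notion of an approximate functional dependency concrete, the sketch below computes the g3 error of a candidate dependency X -> Y, that is, the smallest fraction of rows that would have to be removed for the dependency to hold exactly. This is a generic illustration and not the procedure used by Starry Vault, whose details the abstract does not give. The function names, the sample rows, and the 0.15 threshold are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def g3_error(rows, lhs, rhs):
    """Smallest fraction of rows to drop so that lhs -> rhs holds exactly."""
    if not rows:
        return 0.0
    # For each distinct lhs value, count how often each rhs value occurs.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1
    # Within each group we can keep only the rows with the most frequent rhs.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1.0 - kept / len(rows)


def holds_approximately(rows, lhs, rhs, threshold):
    """True if lhs -> rhs is violated by at most `threshold` of the rows."""
    return g3_error(rows, lhs, rhs) <= threshold


# Hypothetical sample: ten rows, one of them with a misspelled region.
pairs = (
    [("Bologna", "Emilia-Romagna")] * 3
    + [("Bologna", "Emilia Romagna")]  # dirty value
    + [("Rimini", "Emilia-Romagna")] * 2
    + [("Parma", "Emilia-Romagna")] * 2
    + [("Milano", "Lombardia")] * 2
)
rows = [{"city": city, "region": region} for city, region in pairs]

THRESHOLD = 0.15  # assumed tolerance, chosen only for this example

city_to_region = holds_approximately(rows, ["city"], "region", THRESHOLD)
region_to_city = holds_approximately(rows, ["region"], "city", THRESHOLD)
print("city -> region:", city_to_region)
print("region -> city:", region_to_city)
```

In this sample, city -> region is violated by one row in ten, so it holds under the assumed tolerance, while region -> city does not. An asymmetry of this kind is what suggests that city rolls up to region in a hierarchy, even when the data contain a few errors.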

This work was partly supported by the EU-funded project TOREADOR (contract n. H2020-688797).

Author information

Corresponding author

Correspondence to Matteo Golfarelli.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Golfarelli, M., Graziani, S., Rizzi, S. (2016). Starry Vault: Automating Multidimensional Modeling from Data Vaults. In: Pokorný, J., Ivanović, M., Thalheim, B., Šaloun, P. (eds) Advances in Databases and Information Systems. ADBIS 2016. Lecture Notes in Computer Science, vol 9809. Springer, Cham. https://doi.org/10.1007/978-3-319-44039-2_10

  • DOI: https://doi.org/10.1007/978-3-319-44039-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44038-5

  • Online ISBN: 978-3-319-44039-2

  • eBook Packages: Computer Science (R0)
