ABSTRACT
Clio, the IBM Research system for expressing declarative schema mappings, has progressed in the past few years from a research prototype into the basis of some of IBM's mapping technology. Clio provides a declarative way of specifying schema mappings between XML schemas, relational schemas, or a combination of the two. Mappings are compiled into an abstract query graph representation that captures their transformation semantics. The query graph can then be serialized into different query languages, depending on the kinds of schemas and systems involved in the mapping. Clio currently produces XQuery, XSLT, SQL, and SQL/XML queries. In this paper, we revisit the architecture and algorithms behind Clio. We then discuss some implementation issues, the optimizations needed for scalability, and general lessons learned on the road toward an industrial-strength tool.
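The pipeline described above (declarative mapping, intermediate query graph, serialization into a target query language) can be illustrated with a minimal sketch. This is not Clio's data model or algorithm: the class names `Correspondence`, `Mapping`, and `QueryGraph`, the functions `compile_mapping`, `to_sql`, and `to_xquery`, and the `/table/row` XML layout are all hypothetical, chosen only to show how one intermediate representation might feed several serializers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Correspondence:
    """Hypothetical value correspondence: one source column feeds one target column."""
    source_column: str
    target_column: str


@dataclass(frozen=True)
class Mapping:
    """Hypothetical declarative mapping between a single source and a single target table."""
    source_table: str
    target_table: str
    correspondences: Tuple[Correspondence, ...]


@dataclass(frozen=True)
class QueryGraph:
    """Hypothetical intermediate form, independent of any query language."""
    source: str
    target: str
    flows: Tuple[Tuple[str, str], ...]  # (source_column, target_column) pairs


def compile_mapping(mapping: Mapping) -> QueryGraph:
    """Compile the declarative mapping into the language-neutral graph."""
    flows = tuple((c.source_column, c.target_column) for c in mapping.correspondences)
    return QueryGraph(mapping.source_table, mapping.target_table, flows)


def to_sql(graph: QueryGraph) -> str:
    """Serialize the graph as a SQL INSERT ... SELECT statement."""
    target_cols = ", ".join(t for _, t in graph.flows)
    select_cols = ", ".join(f"s.{s}" for s, _ in graph.flows)
    return (
        f"INSERT INTO {graph.target} ({target_cols}) "
        f"SELECT {select_cols} FROM {graph.source} s"
    )


def to_xquery(graph: QueryGraph) -> str:
    """Serialize the same graph as an XQuery FLWOR expression.

    Assumes, for illustration only, that source rows live under /<table>/row.
    """
    fields = "".join(f"<{t}>{{ $s/{s}/text() }}</{t}>" for s, t in graph.flows)
    return f"for $s in /{graph.source}/row return <{graph.target}>{fields}</{graph.target}>"


if __name__ == "__main__":
    mapping = Mapping(
        "emp",
        "employee",
        (Correspondence("ename", "name"), Correspondence("sal", "salary")),
    )
    graph = compile_mapping(mapping)
    print(to_sql(graph))
    print(to_xquery(graph))
```

The point of the design is that the mapping is compiled once, and each target language only needs its own serializer over the shared graph.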
Clio grows up: from research prototype to industrial tool