Abstract
This chapter presents a multidisciplinary and interdisciplinary synthesis of ideas about the definition and theoretical conceptualization of provenance, drawing from disciplines such as archival science, law, computer science, library and information science, and visual analytics. Through the lens of these distinct domains, the chapter explores the different purposes served by provenance; the various ways that diverse fields capture, represent and use provenance information; provenance standards and specifications; and a range of open research challenges relating to theorizing about provenance and to capturing, representing and using provenance information in increasingly distributed, heterogeneous information ecosystems that combine machine and human intelligence. From this blending of perspectives on provenance from different disciplines and ‘interdisciplines’, a rich picture emerges of provenance as a dynamic construct and an evolving focus of research.
Membership in the imProvenance Group is fluid, but the core group of individuals who contributed to the development of this synthesis comprises Lucie Burgess, Adrian Cunningham, Ken Cavelier, David Dubin, Luciana Duranti, Paolo Missier, Bertram Ludäscher, Corinne Rogers, Joe Tennis, Ken Thibodeau, Margaret Varga and Ashley Wheat.
Notes
- 1.
More specifically, Moreau’s analysis revealed a growing trend of research activity related to provenance, with about half the papers concerning provenance published since 2008. He conjectured that the development of the Grid as a technology for running scientific applications and the UK e-science program have been two significant external triggering factors that have caused increasing numbers of researchers to focus on the provenance problem.
- 2.
Appendix A of this volume provides a complete list of workshop participants.
- 3.
Mixed-initiative systems can be described as systems that augment human cognitive capabilities by developing machines capable of offloading human thought processes and actively supporting individuals in pursuing their goals. In this sense, the focus is less on creating artificial intelligence (AI) than on augmenting human intelligence (IA).
- 4.
There is some debate about whether Terry Cook was referring to creators (agents) as well as to functions in his reference to relationships. In his 2010 plenary address to the International Council on Archives, Cook stated his view of the concept of provenance: “provenance is the concept of linking records or archives, or group or series of archives, to their creator, whether an individual or organization. The value of provenance is that it allows archivists and researchers to understand a record and its content in terms of who made it, where, when, how, and why, and what changes have taken place with the record over time, and why.” From this, Duranti infers that Cook appeared to equate the term “provenance” with “context”, but that his “who made it, where, when, how, and why” definitely refers to persons, not functions. Lemieux (2014), on the other hand, finds evidence that Cook was also referring to functions. She cites his 1992 article “Mind over Matter”, in which he writes, in reference to his provenance-based macro-appraisal theory, “Turning to the second part of the model, the citizen-state interaction reflects a convergence of three factors: the programme (function), the agency (structure), and the citizen.”
- 5.
When we trace the history of a concept through revisions of indexing languages, we are studying the concept’s ontogeny. Ontogeny is a term borrowed from biology.
- 6.
- 7.
- 8.
The content of Wikidata is available under a free license, exported using standard formats, and can be interlinked to other open data sets on the linked data web (http://www.wikidata.org/wiki/Wikidata:Main_Page).
- 9.
397 is between 396 Women’s Treatment and Position and 398 Folklore, Proverbs (s.l.).
- 10.
NARA. Online Public Access. http://www.archives.gov/research/search/.
- 11.
- 12.
- 13.
- 14.
References
Papritz, J.: Archivwissenschaft. 4 vols. Archivschule Marburg. Institut für Archivwissenschaft, Marburg (1976)
Moreau, L.: The foundations for provenance on the web. Found. Trends Web Sci. 2(2–3), 99–241 (2010)
Yeo, G.: Trust and context in cyberspace. Arch. Rec. 34(2), 214–234 (2012)
Duranti, L.: The odyssey of records managers. In: Burke, F.G., Nesmith, T. (eds.) Canadian Archival Studies and the Rediscovery of Provenance, pp. 29–60. Scarecrow Press, Metuchen (1993)
Jones, T.G., Burgess, L., Jefferies, N., Ranganathan, A., Rumsey, S.: Contextual and provenance metadata in the Oxford University Research Archive (ORA). In: Metadata and Semantics Research, pp. 274–285. Springer International Publishing, Berlin (2015)
Cohen, F.: Digital forensics and electronic discovery. http://all.net (c. 2013)
Socha, G., Gelbmann, T.: Electronic discovery reference model. http://www.edrm.net/resources/edrm-stages-explained (2016)
Tennis, J.T.: A Kaleidoscope perspective: change in the semantics and structure of facets and isolates in Analytico-Synthetic classification. SRELS J. Inf. Manage. 50(6), 789–794 (2013)
Bearman, D.A., Lytle, R.H.: The power of the principle of provenance. Archivaria. 1(21), (1985)
Schon, D.A., DeSanctis, V.: The reflective practitioner: how professionals think in action. J. Contin. High. Educ. 34, (1986)
Varga, M., Varga, C.: Visual analytics – data, analytical and reasoning provenance. Springer Nature (2016). This volume
Duranti, L.: The concept of appraisal and archival theory. Am. Arch. 57, 328–344 (1994)
Lemieux, V.L.: Applying Mintzberg’s theories on organizational configuration to archival appraisal. Archivaria. 1(46), (1998)
Cook, T.: Archival science and postmodernism: new formulations for old concepts. Arch. Sci. 1(1), 3–24 (2001)
Pearce-Moses, R., Baty, L.A.: A Glossary of Archival and Records Terminology. Society of American Archivists, Chicago (2005)
International Council on Archives. International Standard Archival Description (General). ICA, Paris (1994)
International Council on Archives. ISAAR (CPF) International Standard Archival Authority Record for Corporate Bodies, Persons and Families, 2nd edn. ICA, Paris (2004)
International Council on Archives. Committee on Best Practices and Standards: Progress Report for Revising and Harmonising ICA Descriptive Standards. ICA, Paris (2012)
Duchein, M.: Theoretical principles and practical problems of respect des fonds in archival science. Archivaria 1(16), 64–82 (1983)
Horsman, P.: The last dance of the phoenix or the de-discovery of the archival fonds. Archivaria 1(54), 1–23 (2002)
Gilliland-Swetland, A.J.: Enduring Paradigm, New Opportunities: The Value of the Archival Perspective in the Digital Environment. Council on Library and Information Resources, Washington (2000)
Cook, T.: Mind over matter: towards a new theory of archival appraisal. In: Craig, B. (ed.) The Archival Imagination: Essays in Honour of Hugh Taylor, pp. 38–70. Association of Canadian Archivists, Ottawa (1992)
Abukhanfusa, K., Sydbeck, J. (eds.): The Principle of Provenance. Report from the First Stockholm Conference on Archival Theory and the Principle of Provenance. Swedish National Archives, Stockholm (1994)
Douglas, J.: Origins: evolving ideas about the principle of provenance. In: Eastwood, T., MacNeil, H. (eds.) Currents of Archival Thinking, pp. 23–43. Libraries Unlimited, Santa Barbara (2010)
Scott, P.: The record group concept: a case for abandonment. Am. Arch. 29, 493–504 (1966)
Canadian Committee on Archival Description. Rules for Archival Description. Bureau of Canadian Archivists, Ottawa (1990)
Barr, D.: The fonds concept in the working group on archival descriptive standards report. Archivaria. 1(25), 163–169 (Winter 1987–88)
Millar, L.: The death of the fonds and the resurrection of provenance: archival context in space and time. Archivaria 1(53), 1–15 (2002)
Nesmith, T.: The concept of societal provenance and records of nineteenth-century Aboriginal-European Relations in Western Canada: implications for archival theory and practice. Arch. Sci. 6(3–4), 351–360 (2006)
Nesmith, T.: Reopening archives: bringing new contextualities into archival theory and practice. Archivaria 60(60), 259–274 (2006)
Lemieux, V.L.: Toward a ‘Third Order’ archival interface: research notes on some theoretical and practical implications of visual explorations in the Canadian context of financial electronic records. Archivaria 1(78), 53–93 (2014)
EDM Council. Financial Industry Business Ontology. http://www.edmcouncil.org/financialbusiness (2012–2016)
World Wide Web Consortium. PROV-O: The PROV Ontology. https://www.w3.org/TR/prov-dictionary/ (2013)
DCMI. DCMI Specifications. http://dublincore.org/specifications/ (1995–2016)
Buneman, P., Khanna, S., Wang-Chiew, T.: Why and where: a characterization of data provenance. In: Database Theory—ICDT, pp. 316–330. Springer, Berlin, Heidelberg (2001)
Cheney, J., Chiticariu, L., Tan, W.-C.: Provenance in Databases: Why, How, and Where. Now Publishers Inc., Breda (2009)
Green, T.J., Karvounarakis, G., Ives, Z.G., Tannen, V.: Update exchange with mappings and provenance. In: Proceedings of the 33rd International Conference on Very Large Data Bases, pp. 675–686. VLDB Endowment, Almaden (2007)
Davidson, S.B., Boulakia, S.C., Eyal, A., Ludäscher, B., McPhillips, T.M., Bowers, S., Freire, J.: Provenance in scientific workflow systems. IEEE Data Eng. Bull. 30(4), 44–50 (2007)
Amsterdamer, Y., Davidson, S.B., Deutch, D., Milo, T., Stoyanovich, J., Tannen, V.: Putting lipstick on pig: enabling database-style workflow provenance. Proc. VLDB Endowment 5(4), 346–357 (2011)
Thomas, J.J., Cook, K.A.: Illuminating the Path: The Research and Development Agenda for Visual Analytics. National Visualization and Analytics Centre, Richland, WA (2005)
Jankun-Kelly, T.J.: The Case for Visual Analysis Provenance Cases, Workshop on Analytic Provenance: Process + Interaction + Insight. CHI (2011)
Keim, D.A., Kohlhammer, J., Ellis, G., Mansmann, F.: Mastering the Information Age – Solving Problems with Visual Analytics. Florian Mansmann (2010)
Xu, K.: InterPARES Trust Interdisciplinary Workshop on Provenance Participant’s Statement. Unpublished document (May, 2015)
Pirolli, P., Card, S.K.: The sensemaking process and leverage points for analyst technology as identified through cognitive task analysis. Proc. Int. Conf. Intell. Anal. 5, 2–4 (2005)
Klein, G., Moon, B., Hoffman, R.R.: Making sense of sensemaking 1: alternative perspectives. IEEE Intell. Syst. 4, 70–73 (2006)
Hutchins, E.: Cognition in the Wild. MIT Press, Cambridge (1995)
Hollan, J., Hutchins, E., Kirsh, D.: Distributed cognition: toward a new foundation for human-computer interaction research. ACM Trans. Comput. Hum. Interact. (TOCHI) 7(2), 174–196 (2000)
Roberts, J.C., Keim, D., Hanratty, T., Rowlingson, R.R., Walker, R., Hall, M., Jacobson, Z., Lavigne, V., Rooney, C., Varga, M.: From Ill-defined problems to informed decisions. In: EuroVis Workshop on Visual Analytics. Eurographics Association, Geneva (2014)
Factor, M., Henis, E., Naor, D., Rabinovici-Cohen, S., Reshef, P., Ronen, S., Michetti, G., Guercio, M.: Authenticity and provenance in long term digital preservation: modeling and implementation in preservation aware storage. In: Workshop on the Theory and Practice of Provenance. ACM SIGMOD 38(2), 57–60 (2009)
Gillean, D., Leveillé, V., Rogers, C.: Records in the Cloud–A metadata framework for cloud service providers. In: Proceedings of the International Conference on Cloud Security Management: ICCSM, p. 166. Academic Conferences Limited, Curtis Farm (2013)
Lemieux, V.L., Rogers, C., Thibodeau, K.: InterPARES Trust (international multidisciplinary research into issues of trust in digital objects in online environments) Metadata: Authenticity and Provenance in the Cloud. NATO Specialist Meeting IST-13: Distributed Data Analytics for Combating Weapons of Mass Destruction, Lorton, VA, 15–17 October 2014
Sedona Conference: Best Practices Recommendations & Principles for Addressing Electronic Document Production. The Sedona Conference, Sedona (2007)
Missier, P., Ludäscher, B., Dey, S., Wang, M., McPhillips, T., Bowers, S., et al.: Golden trail: retrieving the data history that matters from a comprehensive provenance repository. Int. J. Digit. Curation. 7(1) (2012)
Open Data Charter.net: Open Data Charter. http://opendatacharter.net/who-we-are/ (c. 2015)
McKinsey Global Institute. Open Data: Unlocking Innovation and Performance with Liquid Information. McKinsey & Co., London (2013)
European Union. Data Portal. http://www.europeandataportal.eu (2016)
Open Government Partnership. About. http://www.opengovpartnership.org/about (2016)
Ballard, M.: Poor data quality hindering government open data programme. Computer Weekly (28 August 2014)
Dasu, T., Johnson, T.: Exploratory Data Mining and Data Cleaning, vol. 479. Wiley, New York (2003)
Anderson, S.R., Allen, R.B.: Envisioning the archival commons. Am. Arch. 72(2), 383–400 (2009)
Oomen, J., Aroyo, L.: Crowdsourcing in the cultural heritage domain: opportunities and challenges. In: Proceedings of the 5th International Conference on Communities and Technologies, pp. 138–149. ACM, New York (2011)
Eveleigh, A.: Crowding out the archivist? Locating crowdsourcing within the broader landscape of participatory archives. In: Ridge, M. (ed.) Crowdsourcing our Cultural Heritage, pp. 211–212. Ashgate Publishing, Farnham (2014)
Dewey, M.: Decimal Classification and Relative Index for Libraries, Clippings, Notes, etc, 8th edn. Forest Press, Tionesta (1913)
Trickett, S.B., Trafton, J.G., Saner, L., Schunn, C.D.: I don’t know what’s going on there: the use of spatial transformations to deal with and resolve uncertainty in complex visualizations. In: Lovett, M.C., Shah, P. (eds.) Thinking with Data, pp. 65–86. Lawrence Erlbaum Associates, Mahwah (2007)
Flood, M.D., Lemieux, V.L., Varga, M., Wong, B.L.W.: The application of visual analytics to financial stability monitoring. J. Financ. Stability (2016)
Watts, K.A.: Proposing a place for politics in arbitrary and capricious review. Yale Law J. 119, 2–85 (2009)
Kelly, J.E.: Welcome to the era of cognitive systems. http://asmarterplanet.com/blog/2012/05/welcome-to-theera-of-cognitive-systems.html (May 10, 2012)
Computing Research Association. Grand Research Challenges in Information Systems. CRA, Washington (2002)
Cavelier, K.: InterPARES Trust Interdisciplinary Workshop on Provenance Participant’s Statement. Unpublished document (May, 2015)
MacNeil, H.: Trusting description: authenticity, accountability, and archival description standards. J. Arch. Organ. 7(3), 89–107 (2009)
Bearman, D.: Description standards: a framework for action. Am. Arch. 52(4), 514–519 (1989)
GBIF (Global Biodiversity Information Facility). What is GBIF. http://www.gbif.org/what-is-gbif (2016)
Missier, P., Dey, S., Belhajjame, K., Cuevas-Vicenttín, V., Ludäscher, B.: D-PROV: extending the PROV provenance model with workflow structure. In: Proceedings of the 5th USENIX Workshop on the Theory and Practice of Provenance (TaPP 13). USENIX Association, Berkeley (2013)
Murta, L., Braganholo, V., Chirigati, F., Koop, D., Freire, J.: noWorkflow: capturing and analyzing provenance of scripts. In: Provenance and Annotation of Data and Processes, pp. 71–83. Springer International Publishing, Berlin (2014)
Lerner, B., Boose, E.: RDataTracker: collecting provenance in an interactive scripting environment. In: Proceedings of the 6th USENIX Workshop on the Theory and Practice of Provenance (TaPP 2014). USENIX, Berkeley (2014)
Hertzum, M., Hansen, K.D., Andersen, H.H.K.: Scrutinising usability evaluation: does thinking aloud affect behavior and mental workload? Behav. Inf. Technol. 28(2), 165–181 (2009)
Gotz, D., Zhou, M.X.: Characterizing users’ visual analytic activity for insight provenance. Inf. Vis. 8(1), 42–55 (2009)
Shrinivasan, Y.B., van Wijk, J.J.: Supporting exploration awareness in information visualization. IEEE Comput. Graph. Appl. 29(5), 24–33 (2009)
Pike, W.A., May, R., Baddeley, B., Riensche, R., Bruce, J., Younkin, K.: Scalable visual reasoning: supporting collaboration through distributed analysis. In: International Symposium on Collaborative Technologies and Systems, pp. 24–32. IEEE Press, New York (2007)
Walker, R., Slingsby, A., Dykes, J., Xu, K., Wood, J., Nguyen, P.H., Stephens, D., Wong, B.L., Zheng, Y.: An extensible framework for provenance in human terrain visual analytics. IEEE Trans. Vis. Comput. Graph. 19(12), 2139–2148 (2013)
Nguyen, P.H., Xu, K., Walker, R., Wong, B.L.W.: SchemaLine: timeline visualization for sensemaking. In: Proceedings of the 18th International Conference on Information Visualization (IV), pp. 225–233. IEEE Press, New York (2014)
Gotz, D., Wen, Z.: Behavior-driven visualization recommendation. In: Proceedings of the 14th International Conference on Intelligent User Interfaces, pp. 315–324. ACM, New York (2009)
Bavoil, L., Callahan, S.P., Crossno, P.J., Freire, J., Scheidegger, C.E., Silva, C.T., Vo, H.T.: VisTrails: enabling interactive multiple-view visualizations. In: Proceedings of IEEE Information Visualization 05, pp. 135–142. IEEE Press, New York (2005)
Dunne, C., Henry Riche, N., Lee, B., Metoyer, R., Robertson, G.: GraphTrail: analyzing large multivariate, heterogeneous networks while supporting exploration history. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 1663–1672. ACM, New York (2012)
Lemieux, V.L., Dang, T.: Building accountability for decision-making into cognitive systems. In: Advances in Information Systems and Technologies, pp. 575–586. Springer, Berlin, Heidelberg (2013)
Missier, P., Belhajjame, K., Cheney, J.: The W3C PROV family of specifications for modelling provenance metadata. In: Proceedings of the 16th International Conference on Extending Database Technology, pp. 773–776. ACM, New York (2013)
Moreau, L., Hartig, O., Simmhan, Y., Myers, J., Lebo, T., Belhajjame, K., Miles, S., Soiland-Reyes, S.: PROV-AQ: provenance access and query. http://www.w3.org/TR/prov-aq (2012)
Gil, Y., Miles, S., Belhajjame, K., Deus, H., Garijo, D., Klyne, G., Missier, P., Soiland-Reyes, S., Zednik, S.: PROV model primer. https://www.w3.org/TR/prov-primer/ (2012)
Lebo, T., Sahoo, S., McGuinness, D., Belhajjame, K., Cheney, J., Corsar, D., Garijo, D., Soiland-Reyes, S., Zednik, S., Zhao, J.: PROV-O: The PROV Ontology. W3C Recommendation (2013)
Firth, H., Missier, P.: ProvGen: generating synthetic PROV graphs with predictable structure. In: Provenance and Annotation of Data and Processes, pp. 16–27. Springer International Publishing, Berlin (2014)
Groth, P., Moreau, L.: PROV-Overview. An overview of the PROV Family of Documents. https://www.w3.org/TR/2013/NOTE-prov-overview-20130430/ (2013)
The Bodleian Library. CAMELOT: A contextual data model for the Bodleian digital library. http://camelot-dev.bodleian.ox.ac.uk (2016)
PREMIS Editorial Committee. Data dictionary for preservation metadata, Version 3.0. OCLC, Washington (2015)
ISO/IEC: ISO 14721: 2012. Space data and information transfer systems – Open archival information system (OAIS) – Reference model. ISO, Geneva (2012)
PREMIS Editorial Committee. PREMIS OWL Ontology 2.2 now available. https://www.loc.gov/standards/premis/ontology-announcement.html (2013)
Guercio, M.: PREMIS and the long-term preservation of complex digital archives: Lessons learned and critical issues from the CASPAR Research. Round Table on PREMIS – Preservation Metadata: Implementation Strategies, Rome Italy (2009)
McKemmish, S., Acland, G., Ward, N., Reed, B.: Describing records in context in the continuum: the Australian Recordkeeping Metadata Schema. Archivaria 1(48), 3–37 (1999)
ISO/IEC: ISO 23081: 2006. Information and Documentation – Records Management Processes – Metadata for Records – Part I: Principles. ISO, Geneva (2006)
ISO/IEC: ISO 15489: 2001. Information and Documentation – Records Management – Part I: General. ISO, Geneva (2001)
ISO/IEC: ISO 23081: 2009. Information and Documentation – Records Management Processes – Metadata for Records – Part 2: Conceptual and Implementation Issues. ISO, Geneva (2009)
ISO/IEC: ISO 23081. Information and Documentation – Records Management Processes – Metadata for Records – Part 3: Self-Assessment Method. ISO, Geneva (2011)
Duranti, L.: The long-term preservation of accurate and authentic digital data: the INTERPARES project. Data Sci. J. 4, 106–118 (2005)
InterPARES 2 Terminology Database. http://www.interpares.org/ip2/ip2_terminology_db.cfm (2016)
Xie, S.L.: Preserving digital records: InterPARES findings and developments. In: Lemieux, V.L. (ed.) Financial Analysis and Risk Management, pp. 187–206. Springer, Berlin, Heidelberg (2013)
InterPARES. Chain of preservation model. http://www.interpares.org/ip2/ip2_models.cfm# (2007)
International Council on Archives. ISAF: International Standard for Activities-Functions of Corporate Bodies. ICA, Paris (2006)
Mitchell, C. (ed.): Trusted Computing. Institution of Electrical Engineers, New York (2005)
Xu, K., Attfield, S., Jankun-Kelly, T.J., Wheat, A., Nguyen, P.H., Selvaraj, N.: Analytic provenance for sensemaking: a research agenda. IEEE Comput. Graph. Appl. 35(3), 56–64 (2015)
Dou, W., Jeong, D.H., Stukes, F., Ribarsky, W., Lipford, H.R., Chang, R.: Recovering reasoning processes from user interactions. IEEE Comput. Graph. Appl. 3, 52–61 (2009)
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Lemieux, V.L., the imProvenance Group. (2016). Provenance: Past, Present and Future in Interdisciplinary and Multidisciplinary Perspective. In: Lemieux, V. (eds) Building Trust in Information. Springer Proceedings in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-319-40226-0_1
DOI: https://doi.org/10.1007/978-3-319-40226-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-40225-3
Online ISBN: 978-3-319-40226-0
eBook Packages: Economics and Finance, Economics and Finance (R0)