Elsevier

Journal of Web Semantics

Volume 20, May 2013, Pages 18-34

How ontologies are made: Studying the hidden social dynamics behind collaborative ontology engineering projects

https://doi.org/10.1016/j.websem.2013.04.001

Abstract

Traditionally, evaluation methods in the field of semantic technologies have focused on the end result of ontology engineering efforts, mainly on evaluating ontologies and their corresponding qualities and characteristics. This focus has led to the development of a whole arsenal of ontology-evaluation techniques that investigate the quality of ontologies as a product. In this paper, we aim to shed light on the process of ontology construction by introducing and applying a set of measures to analyze hidden social dynamics. We argue that, especially for ontologies that are constructed collaboratively, understanding the social processes that have led to their construction is critical not only for understanding but also for evaluating these ontologies. With the work presented in this paper, we aim to expose the texture of collaborative ontology engineering processes that is otherwise left invisible. Using historical change-log data, we unveil qualitative differences and commonalities between different collaborative ontology engineering projects. Explaining and understanding these differences will help us to better comprehend the role and importance of social factors in collaborative ontology engineering projects. We hope that our analysis will spur a new line of evaluation techniques that view ontologies not as the static result of deliberations among domain experts, but as the outcome of a dynamic, collaborative and iterative process that needs to be understood, evaluated and managed in itself. We believe that advances in this direction would help our community to expand the existing arsenal of ontology evaluation techniques towards more holistic approaches.

Introduction

Today, large-scale ontologies in fields such as biomedicine are developed collaboratively by a large set of distributed users, using tools such as collaborative Protégé [1], [2] that provide structured logs of changes to the ontology. Evaluating the outcome of such collaborative ontology engineering efforts is a problem of pressing practical and theoretical relevance. For managers and quality assurance personnel, understanding the quality of collaboratively constructed ontologies, and how they have been constructed, is key. For developers of tools for collaborative ontology construction, understanding these processes will help improve the tools and make them fit the process that is already taking place more naturally. For researchers, collaborative ontology engineering projects with large numbers of users involved add a new social layer and additional complexity to an already complex theoretical problem. Therefore, we need new methods and techniques to analyze and further investigate the social dynamics of collaborative ontology engineering efforts.

Traditionally, evaluation methods in the field of semantic technologies have focused on the end result of ontology engineering efforts, mainly on evaluating ontologies and their corresponding qualities and characteristics. This focus has led to the development of a useful arsenal of ontology-evaluation techniques that study and investigate the quality of ontologies as a product [3]. However, ontology evaluation remains a wide open problem, and we need new techniques, especially for ontologies that are constructed collaboratively. For example, evaluating an ontology that has been constructed by hundreds of users without understanding who these users are, what they have contributed, where they have disagreed with one another, or how they have participated would paint a very narrow picture of the ontology under investigation. We argue that understanding the usually hidden social dynamics that have led to the construction of ontologies has the potential to create new insights and opportunities for ontology evaluation.

Our main objective in this paper is to study the social fabric of collaborative ontology engineering projects empirically, as a prerequisite for devising future evaluation methods that investigate the social processes behind such projects. Our high-level hypothesis is that quantitative analysis of ontology change data can provide qualitative insights into characteristics of collaborative ontology construction processes.

Our work is inspired by researchers who investigate the social dynamics behind collaborative construction processes in a range of different domains, including open source software and collaborative authoring systems such as Wikipedia. We leverage and adapt work from these areas whenever possible in order to study and explore social dynamics in the context of collaborative ontology construction. For example, Suh et al. [4] analyzed the influence of a set of different factors on collaboration between Wikipedia editors. Voss [5] analyzed different attributes of Wikipedia articles and users, such as the number of edits contributed by each user or the number of distinct users that worked on each article. Blumenstock [6] and Wilkinson and Huberman [7], on the other hand, found, among other things, that the number of changes performed on a Wikipedia article correlates with its quality. Stamelos et al. [8] studied the quality of code in open source software development projects by counting specific attributes of the committed source code and comparing them against industry standards.
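The per-user and per-article counts mentioned above reduce to simple tallies over an edit log. The following is a minimal sketch, assuming a hypothetical log of (user, article) pairs; the record format, the sample data, and the function names are illustrative assumptions and are not taken from the cited studies.

```python
from collections import Counter, defaultdict

# Hypothetical edit log: one (user, article) pair per edit.
edit_log = [
    ("alice", "Asthma"),
    ("bob", "Asthma"),
    ("alice", "Asthma"),
    ("carol", "Diabetes"),
    ("alice", "Diabetes"),
]


def edits_per_user(log):
    """Count how many edits each user contributed."""
    return Counter(user for user, _ in log)


def distinct_users_per_article(log):
    """Count how many distinct users edited each article."""
    users = defaultdict(set)
    for user, article in log:
        users[article].add(user)
    return {article: len(names) for article, names in users.items()}
```

Calling `edits_per_user(edit_log)` gives the edit count for each user, and `distinct_users_per_article(edit_log)` gives the number of distinct contributors for each article.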

Research questions: Using historical data from five different collaboratively constructed ontologies in the field of biomedicine and a sample of Wikipedia articles as a control, we aim to study the following research questions:

  1. Dynamic aspects (Section 4.1):
     (a) How does activity in the system evolve over time?
     (b) How are changes to the ontology distributed across concepts?
     (c) How does activity in ontology engineering projects differ from activity in other collaborative authoring systems such as Wikipedia?

  2. Social aspects (Section 4.2):
     (a) Is collaboration actually happening or do users work independently?
     (b) How is the work distributed among users?
     (c) How does collaboration in ontology engineering differ from collaboration in other systems such as Wikipedia?

  3. Lexical aspects (Section 4.3):
     (a) Is the vocabulary in the ontology stabilizing or does it continue to change/grow?
     (b) Are the concepts in the ontology lexically stabilizing or do they continue to change?

  4. Behavioral aspects (Section 4.4):
     (a) Are collaborative ontologies constructed in a top-down or a bottom-up manner?
     (b) Are collaborative authoring systems such as Wikipedia constructed similarly (i.e. top-down or bottom-up) to collaboratively engineered ontologies?
     (c) How do contributors allocate activity on different abstraction levels in different ontologies?

In order to explore these questions, we introduce a set of practical measures and apply them to the structured change-logs of five different collaborative ontology construction efforts to assess their efficacy. While our results indicate that these measures provide a useful approach to answering questions like the ones above, we expect future work to discover other, potentially more useful, measures to characterize social dynamics in collaborative ontology engineering projects.
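To make the idea of measures over a structured change-log concrete, the sketch below shows how three of the questions above (activity over time, distribution of changes across concepts, distribution of work among users) could be approached with simple counts. It is an illustration under stated assumptions, not the set of measures used in the paper: the `Change` record, its fields, the sample data, and the function names are all hypothetical.

```python
from collections import Counter
from datetime import date
from typing import NamedTuple


class Change(NamedTuple):
    """Hypothetical change-log record (fields are assumptions)."""
    day: date      # when the change was made
    user: str      # who made the change
    concept: str   # which concept was changed


def changes_per_month(log):
    """Activity over time: number of changes per (year, month)."""
    return Counter((c.day.year, c.day.month) for c in log)


def changes_per_concept(log):
    """Distribution of changes across concepts."""
    return Counter(c.concept for c in log)


def top_user_share(log, k=1):
    """Fraction of all changes made by the k most active users."""
    counts = Counter(c.user for c in log)
    if not counts:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / len(log)


# Illustrative data only; not taken from any real project.
sample_log = [
    Change(date(2010, 1, 5), "alice", "Asthma"),
    Change(date(2010, 1, 9), "alice", "Asthma"),
    Change(date(2010, 2, 1), "bob", "Asthma"),
    Change(date(2010, 2, 3), "alice", "Diabetes"),
]
```

A value of `top_user_share` close to 1 for small `k` would suggest that a few users carry most of the work, whereas lower values would suggest a more even distribution. The same tallies can be computed for a Wikipedia edit history to allow a side-by-side comparison.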

Contributions: To the best of our knowledge, the work presented in this paper represents the most fine-grained study of social dynamics in very large collaborative ontology engineering projects to date. We develop and apply quantitative metrics that help answer qualitative questions related to dynamic, social, lexical, and behavioral aspects of collaborative ontology engineering processes. Our results show that (i) there are qualitative differences between different collaborative ontology engineering projects that demand explanations in terms of organizing and managing quality in such projects and (ii) there are also interesting commonalities that set collaborative ontology engineering projects apart from other collaborative authoring projects such as Wikipedia. Our findings suggest that collaborative ontology engineering represents a novel and interesting phenomenon with unique characteristics that warrant more research in this direction.

The paper is structured as follows: In Section 2, we review related work. In Section 3, we introduce the data sets used in this study and provide descriptive statistics. We proceed with presenting the results from our comparative study of change logs in Section 4. In Section 5, we discuss our results and interpret our findings. We conclude the paper with a summary of our findings and their implications in Section 6.

Related work

For the research presented in this paper, we consider work from the following domains to be of relevance: ontology evaluation; collaborative ontology engineering; collaborative authoring systems.

Material and methods

In the following study, we use two main types of data for our analysis: First, we use a set of biomedical ontologies that are being developed collaboratively in Protégé (and its derivatives) and a set of articles from Wikipedia describing biomedical terms as a control (Section 3.1); and second, we use the structured logs of changes that reflect collaborative development of these resources (Section 3.2).

Results

In the following, we present results from our empirical investigations on dynamic, social, lexical and behavioral aspects of collaborative ontology engineering processes.

Summary and discussion

In this paper, we present an analysis of quantitative data that characterizes collaborative development of several large biomedical ontologies. The analysis of this quantitative data enabled us to gain qualitative insight into dynamic, social, lexical, and behavioral aspects of the process of ontology engineering itself. We summarize these insights in the rest of this section by revisiting the set of our initial research questions:

Conclusions

This work exposes the hidden social dynamics behind collaborative ontology engineering projects. The main results of this paper are twofold: (i) On a theoretical level, our work makes an argument for expanding the existing arsenal of ontology evaluation techniques with new techniques that analyze the social dynamics behind collaborative ontology engineering projects. (ii) On an empirical level, our work conducts a broad investigation of five real-world collaborative ontology engineering

Acknowledgments

We want to thank the World Health Organization for providing us with change tracking data for ICD-11 and ICTM as well as for answering our questions to help validate our results.

References (49)

  • A. Maedche et al., Measuring similarity between ontologies

  • R. Porzel, R. Malaka, A task-based approach for ontology evaluation, in: Proceedings of the 16th European Conference on...
  • C. Brewster, H. Alani, S. Dasmahapatra, Y. Wilks, Data driven ontology evaluation, in: Proceedings of the International...
  • P. Mika et al., Ontologies are us: a unified model of social networks and semantics

  • P. Haase, G. Qi, An analysis of approaches to resolving inconsistencies in dl-based ontologies, in: Proceedings of...
  • J. Lam, Methods for resolving inconsistencies in ontologies, Ph.D. Thesis, University of Aberdeen,...
  • M. Sabou, J. Gracia, S. Angeletou, M. d’Aquin, E. Motta, Evaluating the semantic web: a task-based approach, in:...
  • L. Obrst et al., The evaluation of ontologies
  • M. Krötzsch et al., Semantic MediaWiki

  • S. Auer et al., OntoWiki – a tool for social, semantic collaboration

  • C. Ghidini et al., MoKi: the enterprise modelling wiki

  • V. Zacharias, S. Braun, SOBOLEO—social bookmarking and lightweight ontology engineering, in: Workshop on Social and...
  • T. Schandl et al., Poolparty: SKOS thesaurus management utilizing linked data, Semant. Web: Res. Appl. (2010)
  • T. Tudorache et al., Will Semantic Web technologies work for the development of ICD-11?

  • Cited by (24)

    • Analyzing user interactions with biomedical ontologies: A visual perspective

      2018, Journal of Web Semantics
      Citation Excerpt :

      These studies have used the data provided by logs of user activity in collaborative ontology development tools. Strohmaier et al. [12] conducted an empirical investigation using user activity logs to measure the impact of collaboration on ontology-engineering projects. The authors developed several new metrics to quantify different aspects of the hidden social dynamics that take place in these collaborative ontology-engineering projects from the biomedical domain.

    • How to apply Markov chains for modeling sequential edit patterns in collaborative ontology-engineering projects

      2015, International Journal of Human Computer Studies
      Citation Excerpt :

      The authors applied it to the analysis of the ICD-11 project. Strohmaier et al. (2013) investigated the hidden social dynamics that take place in collaborative ontology-engineering projects from the biomedical domain and provided new metrics to quantify various aspects of the collaborative engineering processes. Falconer et al. (2011) investigated the change-logs of collaborative ontology-engineering projects, showing that contributors exhibit specific roles, which can be used to group and classify these users, when contributing to the ontology.

    • Discovering Beaten Paths in Collaborative Ontology-Engineering Projects using Markov Chains

      2014, Journal of Biomedical Informatics
      Citation Excerpt :

      In contrast to Mikroyannidi et al., our analysis focuses on the detection of sequential patterns in interaction data rather than content. Strohmaier et al. [23] investigated the hidden social dynamics that take place in collaborative ontology-engineering projects from the biomedical domain and provides new metrics to quantify various aspects of the collaborative engineering processes. Wang et al. [24] have used association-rule mining to analyze user editing patterns in collaborative ontology-engineering projects.
