DOI: 10.1145/3487553.3524925
research-article

A Map of Science in Wikipedia

Published: 16 August 2022

ABSTRACT

In recent decades, the rapid growth of Internet adoption has created opportunities for convenient and inexpensive access to scientific information. Wikipedia, one of the largest encyclopedias worldwide, has become a reference point in this respect and has attracted widespread attention from scholars. However, a clear understanding of the scientific sources underpinning Wikipedia’s contents remains elusive. In this work, we rely on an open dataset of citations from Wikipedia to map the relationship between Wikipedia articles and scientific journal articles. We find that most journal articles cited from Wikipedia belong to STEM fields, in particular biology and medicine (47.6% of citations; 46.1% of cited articles). Furthermore, Wikipedia’s biographies play an important role in connecting STEM fields with the humanities, especially history. These results contribute to our understanding of Wikipedia’s reliance on scientific sources and of its role as a knowledge broker to the public.

Published in

WWW '22: Companion Proceedings of the Web Conference 2022
April 2022
1338 pages
ISBN: 9781450391306
DOI: 10.1145/3487553

Copyright © 2022 ACM

Publisher: Association for Computing Machinery, New York, NY, United States

Qualifiers: Research; Refereed limited

Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%
