ABSTRACT
In recent decades, the rapid growth of Internet adoption has created opportunities for convenient and inexpensive access to scientific information. Wikipedia, one of the largest encyclopedias worldwide, has become a reference in this respect and has attracted widespread attention from scholars. However, a clear understanding of the scientific sources underpinning Wikipedia’s contents remains elusive. In this work, we rely on an open dataset of citations from Wikipedia to map the relationship between Wikipedia articles and scientific journal articles. We find that most journal articles cited from Wikipedia belong to STEM fields, in particular biology and medicine (47.6% of citations; 46.1% of cited articles). Furthermore, Wikipedia’s biographies play an important role in connecting STEM fields with the humanities, especially history. These results contribute to our understanding of Wikipedia’s reliance on scientific sources and of its role as a knowledge broker to the public.