Abstract
The aim of this systematic review and meta-analysis is to determine the frequency of plagiarism in scientific papers, as estimated from publications that use text-matching software to identify plagiarism. For this purpose, a literature search of 39 bibliographic databases was conducted and a total of 10,005 articles were identified. Ten articles met the criteria for inclusion in the meta-analysis; together they checked for plagiarism in 6,459 already published articles or manuscripts submitted to journals or conferences. All ten assessed plagiarism in a two-step process: first identifying textual similarity with text-matching software, and second inspecting the detected similarity through human verification. The results revealed that 18% (95% CI: 12–25%) of articles contained instances of plagiarism. Subgroup analyses were conducted to explain the large variance in the results. The following factors were tested: the number of plagiarism criteria implemented during the human verification process, sample size, the country where the study was conducted, the scientific discipline of the analyzed papers, and the publication status of the analyzed papers. Plagiarism rates were higher in studies with a smaller sample size (N < 500) and in studies that used a larger number of criteria to identify plagiarism (4 or 5 criteria). In conclusion, text-matching software is effective in providing evidence for plagiarism; however, it captures only textually based cases of plagiarism, and the reliability of software results depends on additional human verification.
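To illustrate how a pooled frequency with a confidence interval of this kind can be obtained, the following is a minimal sketch of a random-effects meta-analysis of proportions. It assumes a logit transformation and the DerSimonian-Laird estimator of between-study variance; the abstract does not state which estimator was used, so this is one common choice and not necessarily the author's method. The function name `pooled_proportion` and the counts are hypothetical and are not the data from the ten included studies.

```python
import math


def pooled_proportion(events, totals):
    """Random-effects (DerSimonian-Laird) pooled proportion on the logit scale.

    Assumes 0 < events[i] < totals[i] for every study.
    Returns (estimate, ci_low, ci_high) as proportions.
    """
    # Logit-transformed proportions and their approximate variances.
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights, pooled logit, and its standard error.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    # Back-transform the estimate and the 95% CI bounds to proportions.
    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)


# Hypothetical counts for illustration only (plagiarized papers, papers checked).
events = [12, 40, 9, 75]
totals = [100, 150, 60, 900]
est, ci_low, ci_high = pooled_proportion(events, totals)
print(f"pooled = {est:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Pooling on the logit scale keeps the back-transformed interval inside 0 to 1, and the between-study variance widens the interval when study results disagree, which is the situation the subgroup analyses are meant to explain.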
Data availability
All relevant data are within the manuscript and its Supporting Information files.
Code availability
Not applicable.
Acknowledgements
The author would like to thank Daniele Fanelli for generously sharing his knowledge of conducting systematic reviews and meta-analyses. Completion of the project would not have been possible without the help of my colleagues Maja Miloš, who helped with the data collection process, and Matea Butković, who proofread the manuscript. Finally, thanks go to the anonymous reviewers, whose insightful comments and constructive criticism substantially improved the manuscript.
Funding
No funding was received for conducting this study.
Ethics declarations
Conflict of interest
The author has no financial or proprietary interests in any material discussed in this article.
Cite this article
Pupovac, V. The frequency of plagiarism identified by text-matching software in scientific articles: a systematic review and meta-analysis. Scientometrics 126, 8981–9003 (2021). https://doi.org/10.1007/s11192-021-04140-5
DOI: https://doi.org/10.1007/s11192-021-04140-5