The frequency of plagiarism identified by text-matching software in scientific articles: a systematic review and meta-analysis

Scientometrics

Abstract

The aim of this systematic review and meta-analysis is to determine the frequency of plagiarism in scientific papers, estimated from publications that use text-matching software to identify plagiarism. For this purpose, a literature search of 39 bibliographic databases was conducted and a total of 10,005 articles were identified. Ten articles met the criteria for inclusion in the meta-analysis; together they checked for plagiarism in 6459 already published articles or manuscripts submitted to journals or conferences. All ten assessed plagiarism in a two-step process: first identifying textual similarity with text-matching software, then inspecting the detected similarity through human verification. The results revealed that 18% (95% CI: 12–25%) of articles contained instances of plagiarism. Subgroup analyses were conducted to explain the large variance in the results. The following factors were tested: the number of plagiarism criteria implemented during the human verification process, sample size, the country where the study was conducted, the scientific discipline of the analyzed papers, and the publication status of the analyzed papers. Plagiarism rates were higher across studies with a smaller sample size (N < 500) or a larger number of plagiarism criteria used to identify plagiarism (4 or 5 criteria). In conclusion, text-matching software is effective in providing evidence for plagiarism; however, this includes only textually based cases of plagiarism, and the reliability of software results depends on additional human verification.
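As a rough illustration of how a pooled proportion with a 95% confidence interval can be obtained from several studies, the sketch below applies a DerSimonian-Laird random-effects model to logit-transformed proportions. This is an assumption made for illustration only: the abstract does not state which pooling model or transformation was used, the function name `pooled_proportion` is hypothetical, and the study counts in the usage lines are invented and are not data from the review.

```python
import math


def pooled_proportion(events, totals):
    """Random-effects (DerSimonian-Laird) pooled proportion on the logit scale.

    events[i] is the number of flagged papers in study i, totals[i] its sample
    size. Assumes 0 < events[i] < totals[i] for every study (no continuity
    correction is applied in this sketch).
    """
    # Per-study logit proportions and their approximate sampling variances.
    y = [math.log(e / (n - e)) for e, n in zip(events, totals)]
    v = [1 / e + 1 / (n - e) for e, n in zip(events, totals)]

    # Fixed-effect (inverse-variance) estimate and Cochran's Q.
    w = [1 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird estimate of the between-study variance tau^2.
    k = len(y)
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights, pooled logit, and a normal-approximation 95% CI.
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)


# Invented counts for three hypothetical studies (NOT data from the review).
estimate, lower, upper = pooled_proportion([12, 40, 9], [100, 150, 90])
print(f"pooled: {estimate:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Larger between-study variance widens the interval, which is one reason subgroup analyses such as those described above are used to look for sources of heterogeneity.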

Data availability

All relevant data are within the manuscript and its Supporting Information files.

Code availability

Not applicable.

Acknowledgements

The author would like to thank Daniele Fanelli for generously sharing knowledge of conducting systematic reviews and meta-analyses. Completion of the project would not have been possible without the help of my colleagues Maja Miloš, who helped with the data collection process, and Matea Butković, who proofread the manuscript. Finally, thanks to the anonymous reviewers, whose insightful comments and constructive criticism substantially improved the manuscript.

Funding

No funding was received for conducting this study.

Author information

Corresponding author

Correspondence to Vanja Pupovac.

Ethics declarations

Conflict of interest

The authors have no financial or proprietary interests in any material discussed in this article.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 282 KB)

Supplementary file2 (PDF 316 KB)

About this article

Cite this article

Pupovac, V. The frequency of plagiarism identified by text-matching software in scientific articles: a systematic review and meta-analysis. Scientometrics 126, 8981–9003 (2021). https://doi.org/10.1007/s11192-021-04140-5

  • DOI: https://doi.org/10.1007/s11192-021-04140-5
