Abstract
Plagiarism has been among the top forms of academic misconduct. Detective, reactive and proactive measures are taken to mitigate plagiarism in scholarly works. Text-matching tools play a significant role in the detection of plagiarism. Many studies have tested the performance of text-matching tools in detecting plagiarism from various perspectives. However, no study has addressed the performance of such tools in ideographic languages, particularly Japanese. Considering the sharp increase in the number of academic Japanese texts and plagiarism incidents in the Japanese context, it is essential to explore to what extent text-matching tools catch similarities in Japanese texts and respond to the needs of Japanese users. Within this scope, this study set out to explore the coverage and usability performance of text-matching tools in the Japanese language. We tested the coverage performance of 10 text-matching tools with five types of intentionally plagiarized documents. We also tested usability performance via a feature checklist. The results suggest that the tools generally perform better on usability than on coverage. Most tools have minimal coverage performance in the Japanese language. We conclude with takeaways for vendors, policymakers and educators.
Data availability
Data and materials generated, used, and analyzed in this study are publicly available in the European Network for Academic Integrity (ENAI) repository. https://www.academicintegrity.eu/wp/testing-of-support-tools-for-plagiarism-detection-working-group/
Notes
In this manuscript, the word "tool" refers to any service, system, machine, website or downloadable program related to text matching; these are grouped under one word for consistency only.
In this manuscript, "vendor" refers not only to a commercial organization that provides a platform for detecting similarities in texts but also to a stakeholder that contributes to educational and research activities on plagiarism and academic misconduct.
Acknowledgements
We gratefully thank the ENAI TeSToP Working Group members for inspiring this work with the original TeSToP and allowing us to use the method; Debora Weber-Wulff for enabling us to use the methodology she developed, and for sharing her time to discuss encoding issues in Japanese from the software viewpoint; and Tomáš Foltýnek on behalf of the ENAI for letting us use their official e-mail account to communicate with the tools. We would also like to express our sincere thanks to Graham Lee for proofreading the manuscript.
Author information
Contributions
In light of the contribution types listed below, the authors’ contributions are as follows.
Work/contribution types:
1. managing the project
2. creating/establishing the evaluation framework
3. idea support
4. theoretical contribution and/or guidance
5. secretariat/communicating with tools
6. data input
7. data interpretation
8. designing/developing images & graphs & tables
9. chapter/section writing (if yes, which part)
10. improving the language
11. mentorship
TÖ: 1, 2, 3, 5, 6, 7, 8, 9 (entire manuscript), 11
İS: 5, 6, 7, 8
ÖÇ: 2, 3, 4, 7, 9 (“previous test” section), 10
SR: 2, 3, 4, 7, 10, 11
SÇA: 6, 7, 9 (partially contributed to the “previous test” section)
DD: 3, 7, 11
Ethics declarations
Competing interests
Several authors of this article are involved in organizing the European Network for Academic Integrity annual conferences. The Academic Integrity Ph.D. Summer Schools receive funding from text-matching software vendors. This did not influence our research in any phase.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Appendix 1
1.1 Details of sources used
ID | Type of text | Source of text | Link of text (if available), details (if any) | Date |
---|---|---|---|---|
A1 | Original text | Wikipedia | https://ja.wikipedia.org/wiki/%E6%97%A5%E6%9C%AC%E3%81%AE%E9%AB%98%E9%BD%A2%E5%8C%96 | 17.01.2022 |
A2 | Paraphrased text (automatically) | Web tool | | 17.01.2022 |
A3 | Paraphrased text (manually) (changed kana systems) | Manual | Applied by Senem Çente Akkan | 17.01.2022 |
A4 | Japanese-English Translation | Google Translate | | 17.01.2022 |
A5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 17.01.2022 |
B1 | Original paper (online & open access database) | J-Stage | https://www.jstage.jst.go.jp/article/jtje/21/0/21_3/_article/-char/ja Özşen, T. (2019). 「日本学研究における日本語学習の意味と課題」[The meaning and issues of Japanese Learning in Japanology Studies], 『専門日本語教育研究』[Journal of Technical Japanese Education], No.21:3–9, ISSN: 1345–1995 | 18.01.2022 |
B2 | Paraphrased text (automatically) | Web tool | | 18.01.2022 |
B3 | Paraphrased text (manually) (changed kana systems) | Manual | Checked by Senem Çente Akkan | 18.01.2022 |
B4 | Japanese-English Translation | Google Translate | | 18.01.2022 |
B5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 18.01.2022 |
C1 | Original paper (Non-online, unpublished on the internet) | Book chapter | Özşen, T. (2015) 「生活構造論的視点からトルコの農村を読み直す」(Translation: Rereading the Turkish Rural Community from the viewpoint of Life Structure)、 in Tokuno S., Makino A., Matsumoto T. (eds)、『暮らしの視点からの地方再生—地域と生活の社会学』(Sociology of Community and Life) Kyushu University Press; 139–162 | 18.01.2022 |
C2 | Paraphrased text (automatically) | Web tool | | 28.03.2022 |
C3 | Paraphrased text (manually) (changed kana systems) | Manual | Checked by Senem Çente Akkan | 18.01.2022 |
C4 | Japanese-English Translation | Google Translate | | 18.01.2022 |
C5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 28.03.2022 |
D1 | Multi-source text (Wikipedia, government white papers, OA journal paper) | Wikipedia, Japan Foundation webpage, CiNii for journal paper | Government white paper (Japan Foundation): https://www.jpf.go.jp/j/project/japanese/survey/result/dl/survey2018/text.pdf; online OA journal paper: https://ci.nii.ac.jp/naid/110009687716. The 1st paragraph is from Wikipedia, the 2nd and 3rd paragraphs are from the Japan Foundation, and the last paragraph is from the journal paper | 18.01.2022 |
D2 | Paraphrased text (automatically) | Web tool | | 19.01.2022 |
D3 | Paraphrased text (manually) (changed kana systems) | Manual | Checked by Senem Çente Akkan | 18.01.2022 |
D4 | Japanese-English Translation | Google Translate | | 18.01.2022 |
D5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 18.01.2022 |
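The type-3 documents in the table above were paraphrased by changing kana systems. As a rough illustration of why such a change can defeat exact text matching, and of how a matcher might fold the two kana scripts together before comparing, consider the following minimal Python sketch. It is not taken from this study or from any of the tested tools: the function names, the sample strings, and the choice of a character-bigram Jaccard score are all assumptions made for illustration only.

```python
import unicodedata

# Assumption for illustration: katakana U+30A1..U+30F6 sit exactly 0x60
# code points above the corresponding hiragana U+3041..U+3096.
KANA_OFFSET = 0x60


def fold_kana(text: str) -> str:
    """Return text with katakana mapped to hiragana (hypothetical helper).

    NFKC normalization is applied first so that half-width katakana
    are converted to their full-width forms before folding.
    """
    text = unicodedata.normalize("NFKC", text)
    folded = []
    for ch in text:
        cp = ord(ch)
        if 0x30A1 <= cp <= 0x30F6:
            folded.append(chr(cp - KANA_OFFSET))
        else:
            folded.append(ch)
    return "".join(folded)


def bigrams(text: str) -> set:
    """Set of overlapping two-character substrings."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the character-bigram sets of a and b."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0
    return len(x & y) / len(x | y)


if __name__ == "__main__":
    original = "にほんのこうれいか"   # made-up sample written in hiragana
    disguised = "ニホンノコウレイカ"  # same reading written in katakana
    print(jaccard(original, disguised))                        # no shared bigrams
    print(jaccard(fold_kana(original), fold_kana(disguised)))  # identical after folding
```

In this toy case the raw strings share no bigrams, whereas the folded strings are identical. Real tools would also have to deal with kanji-to-kana rewriting, which this sketch does not attempt.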
Appendix 2
2.1 Main contact URLs for the 10 text-matching tools evaluated in this paper
chiyo-co |
CopyContentDetector |
Docol©c |
Dupli Checker |
OXSICO |
Plagiarism Checker.co |
Plagiarism Checker X |
Plagiarism Detector.net |
SmallSEOTools |
StrikePlagiarism.com |
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Özşen, T., Saka, İ., Çelik, Ö. et al. Testing of support tools to detect plagiarism in academic Japanese texts. Educ Inf Technol 28, 13287–13321 (2023). https://doi.org/10.1007/s10639-023-11718-4
DOI: https://doi.org/10.1007/s10639-023-11718-4