Testing of support tools to detect plagiarism in academic Japanese texts

Education and Information Technologies

Abstract

Plagiarism has been among the top forms of academic misconduct. Detective, reactive and proactive measures are taken to mitigate plagiarism in scholarly works. Text-matching tools play a significant role in the detection of plagiarism. Many studies have tested the performance of text-matching tools in detecting plagiarism from various perspectives. However, no study has addressed the performance of such tools in ideographic languages, particularly Japanese. Considering the sharp increase in the number of academic Japanese texts and of plagiarism incidents in the Japanese context, it is essential to explore to what extent text-matching tools catch similarities in Japanese texts and respond to the needs of Japanese users. Within this scope, this study set out to explore the coverage and usability performance of text-matching tools in the Japanese language. We tested the coverage performance of 10 text-matching tools with five types of intentionally plagiarized documents. We also tested usability performance via a feature checklist. The results suggest that the tools generally perform better on usability than on coverage. Most tools have minimal coverage performance in the Japanese language. Finally, we provide takeaways for vendors, policymakers and educators.

Data availability

Data and materials generated, used, and analyzed in this study are publicly available in the European Network for Academic Integrity (ENAI) repository. https://www.academicintegrity.eu/wp/testing-of-support-tools-for-plagiarism-detection-working-group/

Notes

  1. In this manuscript, the word "tool" refers to any service, system, machine, website or downloadable program related to text matching; these are grouped under one word for consistency only.

  2. In this manuscript, "vendor" refers not only to a commercial organization that provides a platform to detect similarities in texts but also to a stakeholder that contributes to educational and research activities on plagiarism and academic misconduct.

Acknowledgements

We gratefully thank the ENAI TeSToP Working Group members for inspiring this work with the original TeSToP and allowing us to use the method; Debora Weber-Wulff for enabling us to use the methodology she developed, and for sharing her time to discuss encoding issues in Japanese from the software viewpoint; and Tomáš Foltýnek on behalf of the ENAI for letting us use their official e-mail account to communicate with the tools. We also would like to express our sincere thanks to Graham Lee for proofreading the manuscript.

Author information

Contributions

In light of the workload types given below, the authors’ contributions are as follows.

Work/contribution types:

1. managing the project

2. creating/establishing the evaluation framework

3. idea support

4. theoretical contribution and/or guidance

5. secretariat/communicating with tools

6. data input

7. data interpretation

8. designing/developing images & graphs & tables

9. chapter/section writing (if yes, which part)

10. improving the language

11. mentorship

: 1, 2, 3, 5, 6, 7, 8, 9 (entire manuscript), 11

İS: 5, 6, 7, 8

ÖÇ: 2, 3, 4, 7, 9 (section of “previous test”), 10

SR: 2, 3, 4, 7, 10, 11

SÇA: 6, 7, 9 (partially contributed to “previous test” section)

DD: 3, 7, 11

Corresponding author

Correspondence to Tolga Özşen.

Ethics declarations

Competing interests

Several authors of this article are involved in organizing the European Network for Academic Integrity annual conferences. The Academic Integrity Ph.D. Summer Schools receive funding from text-matching software vendors. This did not influence our research in any phase.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix 1

1.1 Details of sources used

 

| ID | Type of text | Source of text | Link of text (if available), details (if any) | Date |
|---|---|---|---|---|
| A1 | Original text | Wikipedia | https://ja.wikipedia.org/wiki/%E6%97%A5%E6%9C%AC%E3%81%AE%E9%AB%98%E9%BD%A2%E5%8C%96 | 17.01.2022 |
| A2 | Paraphrased text (automatically) | Web tool | https://www.paraphraser.io/ja/paraphrasing-tool | 17.01.2022 |
| A3 | Paraphrased text (manually) (changed kana systems) | Manual | Applied by Senem Çente Akkan | 17.01.2022 |
| A4 | Japanese-English Translation | Google translate | https://translate.google.com/ | 17.01.2022 |
| A5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 17.01.2022 |
| B1 | Original paper (online & open access database) | J-Stage | https://www.jstage.jst.go.jp/article/jtje/21/0/21_3/_article/-char/ja ; Özşen, T. (2019). 「日本学研究における日本語学習の意味と課題」[The meaning and issues of Japanese Learning in Japanology Studies], 『専門日本語教育研究』[Journal of Technical Japanese Education], No.21:3–9, ISSN: 1345–1995 | 18.01.2022 |
| B2 | Paraphrased text (automatically) | Web tool | https://www.paraphraser.io/ja/paraphrasing-tool | 18.01.2022 |
| B3 | Paraphrased text (manually) (changed kana systems) | Manual | Checked by Senem Çente Akkan | 18.01.2022 |
| B4 | Japanese-English Translation | Google translate | https://translate.google.com/ | 18.01.2022 |
| B5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 18.01.2022 |
| C1 | Original paper (Non-online, unpublished on the internet) | Book chapter | Özşen, T. (2015) 「生活構造論的視点からトルコの農村を読み直す」(Translation: Rereading the Turkish Rural Community from the viewpoint of Life Structure)、 in Tokuno S., Makino A., Matsumoto T. (eds)、『暮らしの視点からの地方再生—地域と生活の社会学』(Sociology of Community and Life) Kyushu University Press; 139–162 | 18.01.2022 |
| C2 | Paraphrased text (automatically) | Web tool | https://www.paraphraser.io/ja/paraphrasing-tool | 28.03.2022 |
| C3 | Paraphrased text (manually) (changed kana systems) | Manual | Checked by Senem Çente Akkan | 18.01.2022 |
| C4 | Japanese-English Translation | Google translate | https://translate.google.com/ | 18.01.2022 |
| C5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 28.03.2022 |
| D1 | Multi-source text (Wikipedia, government white papers, OA journal paper) | Wikipedia, Japan Foundation webpage, CiNii for journal paper | Government White paper (Japan Foundation): https://www.jpf.go.jp/j/project/japanese/survey/result/dl/survey2018/text.pdf ; Wikipedia: https://ja.wikipedia.org/wiki/%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%95%99%E8%82%B2#%E6%97%A5%E6%9C%AC%E8%AA%9E%E6%95%99%E8%82%B2%E3%81%AE%E6%AD%B4%E5%8F%B2 ; Online OA Journal Paper: https://ci.nii.ac.jp/naid/110009687716 ; 1st paragraph from Wikipedia, 2nd and 3rd paragraphs are from Japan Foundation, the last paragraph is from the Journal paper | 18.01.2022 |
| D2 | Paraphrased text (automatically) | Web tool | https://www.paraphraser.io/ja/paraphrasing-tool | 19.01.2022 |
| D3 | Paraphrased text (manually) (changed kana systems) | Manual | Checked by Senem Çente Akkan | 18.01.2022 |
| D4 | Japanese-English Translation | Google translate | https://translate.google.com/ | 18.01.2022 |
| D5 | Disguising techniques | Manual | Numbering styles, OCR, punctuation, white characters, etc. are applied | 18.01.2022 |
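
The "changed kana systems" entries above (A3, B3, C3, D3) were produced manually in the study. As a rough illustration only of what such a change does to a text, the following minimal Python sketch swaps hiragana for katakana using the fixed code point offset between the two Unicode blocks. The function names and the sample string are hypothetical and are not taken from the study or from any of the tested tools; the sketch also makes no claim about how any tool handles kana internally.

```python
# Illustrative sketch only: function names and the sample text are assumptions.
# Hiragana U+3041..U+3096 and katakana U+30A1..U+30F6 are laid out in parallel,
# so each hiragana character sits 0x60 code points below its katakana twin.
HIRAGANA_FIRST, HIRAGANA_LAST = 0x3041, 0x3096
KATAKANA_FIRST, KATAKANA_LAST = 0x30A1, 0x30F6
KANA_OFFSET = 0x60


def hiragana_to_katakana(text: str) -> str:
    """Replace every hiragana character with its katakana counterpart."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET)
        if HIRAGANA_FIRST <= ord(ch) <= HIRAGANA_LAST
        else ch
        for ch in text
    )


def katakana_to_hiragana(text: str) -> str:
    """Replace every katakana character with its hiragana counterpart."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET)
        if KATAKANA_FIRST <= ord(ch) <= KATAKANA_LAST
        else ch
        for ch in text
    )


original = "日本のこうれいか"              # hypothetical sample, kanji plus hiragana
disguised = hiragana_to_katakana(original)  # same reading, different script

# An exact string comparison no longer matches after the script change,
# while folding both sides to one kana system restores the match.
exact_match = original == disguised
normalized_match = katakana_to_hiragana(original) == katakana_to_hiragana(disguised)
```

Kanji and other characters outside the two kana ranges pass through unchanged, so the disguised text reads the same aloud but differs character by character from the source.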

Appendix 2

2.1 Main contact URLs for the 10 text-matching tools evaluated in this paper

chiyo-co

CopyContentDetector

Docol©c

Dupli Checker

OXSICO

Plagiarism Checker.co

Plagiarism Checker X

Plagiarism Detector.net

SmallSEOTools

StrikePlagiarism.com

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Özşen, T., Saka, İ., Çelik, Ö. et al. Testing of support tools to detect plagiarism in academic Japanese texts. Educ Inf Technol 28, 13287–13321 (2023). https://doi.org/10.1007/s10639-023-11718-4

  • DOI: https://doi.org/10.1007/s10639-023-11718-4
