Neural spelling correction: translating incorrect sentences to correct sentences for multimedia

Park, Chanjun; Kim, Kuekyeng; Yang, YeongWook; Kang, Minho; Lim, Heuiseok

doi:10.1007/s11042-020-09148-2


Published: 27 June 2020

Volume 80, pages 34591–34608, (2021)

Multimedia Tools and Applications

Chanjun Park¹, Kuekyeng Kim¹, YeongWook Yang², Minho Kang³ & Heuiseok Lim¹


Abstract

The aim of a spelling correction task is to detect spelling errors and automatically correct them. In this paper, we approach Korean spelling correction from a machine translation perspective, which allows it to overcome the limitations of cost, time, and data. Based on a sequence-to-sequence model, our approach takes an error-filled sentence as the source sentence and its correct counterpart as the target sentence, thus "translating" the erroneous sentence into a correct one. We also propose three new data generation methods that allow multiple spelling correction parallel corpora to be created from a single monolingual corpus. Additionally, we found that applying the copy mechanism not only resolves the problem of overcorrection but also improves performance. We evaluate our model on two aspects: performance comparison with other models and evaluation of overcorrection. The results show that the proposed model outperforms even other systems currently in commercial use.
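The idea of deriving a parallel corpus from a single monolingual corpus can be illustrated with a minimal sketch. The abstract does not describe the three data generation methods, so the random character-level noise below (delete, duplicate, swap) is a stand-in chosen purely for illustration, not the authors' procedure. The function names `corrupt` and `build_parallel_corpus`, the `rate` parameter, and the English example sentences are all hypothetical.

```python
import random


def corrupt(sentence: str, rate: float, rng: random.Random) -> str:
    """Return a noisy copy of `sentence` using random character edits.

    Assumption: simple delete/duplicate/swap noise stands in for real
    spelling errors; spaces are never chosen as the edited character.
    """
    chars = list(sentence)
    out = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        if ch != " " and rng.random() < rate:
            op = rng.choice(["delete", "duplicate", "swap"])
            if op == "delete":
                i += 1
                continue
            if op == "duplicate":
                out.extend([ch, ch])
                i += 1
                continue
            if op == "swap" and i + 1 < len(chars):
                # Transpose this character with the next one.
                out.extend([chars[i + 1], ch])
                i += 2
                continue
        # No edit applied (or a swap was requested on the last character).
        out.append(ch)
        i += 1
    return "".join(out)


def build_parallel_corpus(sentences, rate=0.1, seed=0):
    """Pair each clean sentence (target) with a corrupted copy (source)."""
    rng = random.Random(seed)
    return [(corrupt(s, rate, rng), s) for s in sentences]


if __name__ == "__main__":
    corpus = ["the quick brown fox", "spelling correction as translation"]
    for source, target in build_parallel_corpus(corpus, rate=0.2, seed=42):
        print(f"{source!r} -> {target!r}")
```

Each resulting pair has the noisy sentence on the source side and the original sentence on the target side, which is the format a sequence-to-sequence model would be trained on under the translation framing described above.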



Acknowledgements

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2020-2018-0-01405) supervised by the IITP (Institute for Information & Communications Technology Planning & Evaluation), and by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. NRF-2017M3C4A7068189). I am very grateful to my friend Yejin Jang for helping me correct the English.

Author information

Authors and Affiliations

311 Aegineung Student Center, College of Informatics, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841, Korea
Chanjun Park, Kuekyeng Kim & Heuiseok Lim
Institute of Education, University of Tartu, Tartu, 50103, Estonia
YeongWook Yang
LLsoLLu, 5, Mabang-ro 10-gil, Seocho-gu, Seoul, Republic of Korea
Minho Kang


Corresponding author

Correspondence to Heuiseok Lim.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Chanjun Park and Kuekyeng Kim contributed equally to this work.


About this article

Cite this article

Park, C., Kim, K., Yang, Y. et al. Neural spelling correction: translating incorrect sentences to correct sentences for multimedia. Multimed Tools Appl 80, 34591–34608 (2021). https://doi.org/10.1007/s11042-020-09148-2


Received: 25 February 2020
Revised: 22 April 2020
Accepted: 27 May 2020
Published: 27 June 2020
Issue Date: November 2021
DOI: https://doi.org/10.1007/s11042-020-09148-2
