Abstract
Research aimed at correcting words in text has focused on three progressively more difficult problems: (1) nonword error detection; (2) isolated-word error correction; and (3) context-dependent word correction. In response to the first problem, efficient pattern-matching and n-gram analysis techniques have been developed for detecting strings that do not appear in a given word list. In response to the second problem, a variety of general and application-specific spelling correction techniques have been developed, some of them based on detailed studies of spelling error patterns. In response to the third problem, a few experiments using natural-language-processing tools or statistical language models have been carried out. This article surveys documented findings on spelling error patterns, describes various nonword detection and isolated-word error correction techniques, reviews the state of the art of context-dependent word correction techniques, and discusses research issues related to all three areas of automatic error correction in text.
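The first two problems can be illustrated with a minimal sketch. It is not taken from the article: the function names, the toy lexicon, and the choice of a Damerau-Levenshtein-style distance (insertions, deletions, substitutions, and adjacent transpositions) are assumptions made only for illustration. Nonword detection is shown as plain word-list lookup, and isolated-word correction as ranking lexicon entries by edit distance.

```python
def edit_distance(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, substitutions,
    and transpositions of adjacent characters (illustrative choice)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # adjacent transposition, e.g. "teh" -> "the"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def detect_nonwords(tokens, lexicon):
    """Problem 1: flag tokens that do not appear in the word list."""
    return [t for t in tokens if t.lower() not in lexicon]


def suggest(nonword, lexicon, max_distance=1):
    """Problem 2: propose lexicon entries within max_distance edits."""
    scored = sorted((edit_distance(nonword.lower(), w), w) for w in lexicon)
    return [w for dist, w in scored if dist <= max_distance]


# Hypothetical toy lexicon and input, for illustration only.
LEXICON = {"the", "form", "from", "spelling", "error"}
TOKENS = ["teh", "spelling", "eror", "form"]

flagged = detect_nonwords(TOKENS, LEXICON)            # ["teh", "eror"]
fixes = {w: suggest(w, LEXICON) for w in flagged}     # {"teh": ["the"], "eror": ["error"]}
```

In this sketch "form" passes the lookup even if the writer meant "from". Real-word errors of that kind are what the third problem, context-dependent word correction, has to address, and a word-list check alone cannot catch them.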
- ABNEY, S. 1990. Rapid incremental parsing with repair. In Proceedings of the 6th New OED Conference: Electronic Text Research (Waterloo, Ontario, Oct. 1990).]]Google Scholar
- AHO, A.V. 1990. Algorithms for finding patterns in strings. In Handbook of Theoretical Computer Science, J. Van Leeuwen, Ed. Elsevier Science Publishers, B. V., Amsterdam.]] Google ScholarDigital Library
- AHO, A. V., AND CORASICK, M.J. 1975. Fast pattern matching: An aid to bibliographic search. Commun. ACM 18, 6 (June), 333-340.]] Google ScholarDigital Library
- AHO, A. V., AND PETERSON, T.G. 1972. A minimum distance error-correcting parser for context free languages. SIAM J. Comput. 1, 4 (Dec.), 305-312.]]Google ScholarCross Ref
- ALBERGA, C.N. 1967. String similarity and misspellings. Commun. ACM 10, 302 313.]] Google ScholarDigital Library
- ALLEN, R. B., AND KAMM, C.h. 1990. A recurrent neural network for word identification from continuous phoneme strings. In Advances in Neural Information Processing Systems, vol. 3. R. P. Lippmann, J. E. Moody, D. S. Touretzky, Ed. Morgan Kaufmann Publishers, San Mateo, Calif.]] Google ScholarDigital Library
- ALM, N., ARNOTT, J. L., AND NEWELL, A.F. 1992. Prediction and conversational momentum in an augmentative communication system. Cornman. ACM 35, 5 (May), 46 56.]] Google ScholarDigital Library
- ANGELL, R. C., FREUND, G. E., AND WILLETT, P. 1983. Automatic spelling correction using a trigram similarity measure. Inf. Process. Manage. 19,255 261.]]Google ScholarCross Ref
- ATWELL, E., AND ELLIOTT, S. 1987. Dealing with ill-formed English text (Chapter 10). In The Computational Analysis of English: A Corpus- Based Approach. R. Garside, G. Leach, G. Sampson, Ed. Longman, Inc. New York.]]Google Scholar
- BAHL, L. R., BROWN, P. F., DESOUZA, P. V., AND MERCER, R.L. 1989. A tree-based statistical language model for natural language speech recognition. IEEE Trans. Acoust. Speech Stg. Process. 37, 7, (July), 1001-1008.]]Google Scholar
- BAHL, L. R., JELINEK~ F., AND MERCER, R.L. 1983. A maximum likelihood approach to continuous speech recognition. IEEE Trans. Patt. Anal. Machine Intell. PAMI-5, 2 (Mar.), 179 190.]]Google Scholar
- BENTLEY, J. 1985. A spelling checker. Commun. ACM 28, 5 (May), 456-462.]] Google ScholarDigital Library
- BICKEL, M.A. 1987. Automatic correction to misspelled names: A fourth-generation language approach. Commun. ACM 30, 3 (Mar.), 224-228.]] Google ScholarDigital Library
- BLAIR, C. R. 1960. A program for correcting spelling errors. Inf. Contr. 3, 60 67.]]Google ScholarCross Ref
- BLEDSOE, W. W., AND BROWMNG, I. 1959. Pattern recognition and reading by machine. In Proceedings of the Eastern Joint Computer Conference, vol. 16, 225-232.]]Google ScholarDigital Library
- BOCAST, A. K. 1991. Method and apparatus for reconstructing a token from a Token Fragment. U.S. Patent Number 5,008,818, Design Services Group, Inc. McLean, Va.]]Google Scholar
- BOIWE, R. H. 1981. Directory assistance revisited. AT & T Bell Labs Tech. Mem. June 12, 1981.]]Google Scholar
- BROWN, P. F., DELLA PIETRA, V. J., DESOUZA, P. V., AND MERCER, R. L. 1990a. Class-Based n- Gram Models of Natural Language.]]Google Scholar
- BROWN. P., Cecum, J., DELLA PIETRA, S., DELLA PIETRA, V., JELINEK, F., MERCER, R., AND ROOSIN, P. 1990b. A statistical approach to machine translation. Con*put. Ling. 16, (June), 79-85.]] Google ScholarDigital Library
- BROWN, P., DELLA PIETRA, S., DELLA PIETRA, V., AND MERCER, R. 1991. Word sense disambigaation using statistical methods. In Proeeedtngs of the 29th Annual Meeting of the Association for Computational Linguistics (Berkeley, Calif., June), ACL, 264 270.]] Google ScholarDigital Library
- BURR, D. J. 1983. Designing a handwriting reader. IEEE Trans. Patt. Anal. Machine Intell. PAMI-5, 5 (Sept.), 554 559.]]Google ScholarDigital Library
- BURR, D. J. 1987. Experiments with a connactionist text reader. In IEEE International Conference on Neural Networks (San Diego, Calif., June). IEEE, New York, IV:717-724.]]Google Scholar
- CARBERRY, S. 1984. Understanding pragmatically ill-formed input. In Proceedings of the lOth International Conference on Computational Linguistics. ACL, 100-206.]] Google ScholarDigital Library
- CARBONELL, J. G., AND HAYES, P.J. 1983. Recovery strategies for parsing extragrammatical language. Amer. J. Comput. Ltng. 9, 3-4 (July-Dec.), 123 146.]] Google ScholarDigital Library
- CARTER, D.M. 1992. Lattice-based word identification in CLARE. In Proceedings of the 30th Annual Meeting of the Association for Computational Linguistics (Newark, Del., June 28-July 2). ACL, 159-166.]] Google ScholarDigital Library
- CHERKASSKY, V., AND VASSILAS, N. 1989a. Backpropagation networks for spelling correction. Neural Net. 1, 3 (July), 166-173.]]Google Scholar
- CHERKASSKY, V., AND VASSILAS, N. 1989b. Performance of back-propagation networks for associative database retrieval. Int. J. Comput. Neural Net.]]Google Scholar
- CHERKASSKY, V., RAO, M., AND WECHSLER, H. 1990. Fault-tolerant database retrieval using distributed associative memories. Inf. Sci. 46, 135-168.]] Google ScholarDigital Library
- CHERKASSKY, V.. FASSETT, K., AND VASSILAS, N. 1991. Linear algebra approach to neural associative memories and noise performance of neural classifiers. IEEE Trans. Comput. 40, 12, 1429-1435.]] Google ScholarDigital Library
- CHERKASSKY, V., VASSILAS, N., BRODT, G. L., AND WECHSLER, H. 1992. Conventional and associative memory approaches to automatic spelling checking. Eng. Appl. Artif. Intell. 5, 3.]]Google ScholarCross Ref
- CHERRY, L., AND MACDONALD, N. 1983. The Writer's Workbench software Byte, (Oct.), 241 248.]]Google Scholar
- CHOUEKA, Y. 1988. Looking fbr needles in a haystack. In Proceedtngs of RIAO, 609 623]]Google Scholar
- CHURCH, K.W. 1988. A stochastic parts program and noun phrase parser for unrestricted text. In Proceedings of the 2nd Applted Natural Language Processing Conference (Austin, Tex, Feb.). ACL, 136 143.]] Google ScholarDigital Library
- CHURCH, K. W., ANO GALE, W.A. 1991a. Probability scoring for spelling correction. Stat. Camput. 1, 93 103.]]Google ScholarCross Ref
- CHURCH, K. W., AND GAbE, W. A. 1991b. Enhanced Good-Turmg and cat-cal Two new methods for esmnating probabilities of English bigrams. Comput. Speech Lung. 1991.]] Google ScholarDigital Library
- COHEN, G. 1980. Reading and searching for spelling errors. In Cognitive Processes in Spelhng. Uta Frith, Ed. Academic Press, London.]]Google Scholar
- COtLER, C. H., CHURCH, K. W., AND LIBERMAN, M. Y. 1990. Morphology and rhyming: Two powerful alternatives to letter-to-sound rules for speech synthesis. In Proceedings of the Conference on Speech Synthesis. European Speech Communication Association.]]Google Scholar
- CONTANT, C., AND BRUNELLE, E. 1992 Exploratexte: Un analyseur a l'affut des erreurs grammaticales. In Actes du colloque lexiquesgrammatres compares, Universite du Quebec a Montreal. In French.]]Google Scholar
- CUSHMAN, W. H., OJHA, P. S., AND DANIEl, S, C. M. 1990. Usable OCR: What are the mlmmum requirements. In CH1-90 Conference Proceedrags, Special Issue o/ the ACM SIGCHI Bulletin (Seattle, Wash., Apr 1-5.) ACM, New York, 145-151.]] Google ScholarDigital Library
- DAHL, P, AND CHERKASSKY, V. 1990. Combined encoding in associative spelling checkers. Umv. of Minnesota EE Dept. Tech. Rep.]]Google Scholar
- DAMERAV, F.J. 1990. Evaluating computer-generated domain-oriented vocabularies. Inf Process. Manage. 26, 6, 791-801.]] Google ScholarDigital Library
- DAMERAU, F.J. 1964. A technique for computer detection and correction of spelling errors. Cornmun ACM 7, 3 (Mar.), 171-176.]] Google ScholarDigital Library
- DAMERAU, F. J., AND MAYS, E. 1989. An examinatmn oi undetected typing errors. Inf. Process. Manage. 25, 6, 659 664.]] Google ScholarDigital Library
- DAVIDSON, L 1962. Retrieval of misspelled names in an airline passenger record system. Commun. ACM 5, 169 171.]] Google ScholarDigital Library
- DEERWESTER, S., DUMAIS, S. T., FURNAS, G. W., LANDAUER, T K., AND HARSHMAN, R. 1990. Indexing by Latent Semantic Analysis. JASIS 41, 6, 391-407.]]Google ScholarCross Ref
- DEFFNER, R., EDER, K, AND GEiGER, I-I. 1990a. Word recognition as a first step towards natural langlmge processing with artificial neural nets. In Proceedings of KONNAI-90.]] Google ScholarDigital Library
- DEFFNER, R., GEIGER, H., KAHLER, R., KREMPL, T., AND BRAUER, W. 1990b. Recognizing words with connectionist architectures. In Proceedings of INNC-90-Parts (Paris, France, July), 196.]]Google Scholar
- DEHEER, T. 1982. The application of the concept of homeosemy to natural language information retrieval. Inf. Process. Manage. 18, 229-236.]]Google ScholarCross Ref
- DELOCttE, G., AND DEmLh F. 1980. Order information redundancy of verbal codes in French and English' Neurolinguistic implications. J. Verbal Learn. Verbal Behav. 19, 525-530.]]Google ScholarCross Ref
- DEMASCO, P. W., AND McCoY, K.F. 1992. Generating text from compressed input: An intelhgent interface for people with severe motor impairments. Commun. ACM 35, 5 (May), 68-78.]] Google ScholarDigital Library
- DEROUAULT, A.-M., AND MERIALDO, B. 1984a. Language modeling at the syntactic level. In Proceedmgs of the 7th International Conference on Pattern Recognition (Montreal, Canada, July 30-Aug. 2), 1373-1375.]]Google Scholar
- DEROUAULT, A.-M, AND MER~ALDO, B. 1984b. TASF: A stenotypy-to-French transcription system. In Proceedings of the 7th International Conference on Pattern Recogn~tton (Montreal, Canada, July 30-Aug. 2), 866-868.]]Google Scholar
- DUNLAVEY, M. R 1981. On spelling correction and beyond. Cammun. ACM 24, 9 (Sept.), 608.]] Google ScholarDigital Library
- DURHAM, I., LAMB, D A, AND SAXE, J B. 1983. Spelling correction in user interfaces. Commun. ACM 26, 10 (Oct.), 764 773.]] Google ScholarDigital Library
- EASTMAN, C. M., AND MCLEAN, D. S. 1981. On the need for parsing ill-ibrmed input. Amer. J Comput. Ling. 7.4, 257.]] Google ScholarDigital Library
- ELLIOTT, R. J. 1988. Annotating spelling list worda with a~fixation classes. AT & T Bell Labs Int. Mem. Dec. 14.]]Google Scholar
- ELLIS, A. W. 1979 Slips of the pen. Vis. Lang. 13, 265-282.]]Google Scholar
- ELLIS, A. W. 1982. Spelling and writing (and reading and speaking). In Normahty and Pathology m Cognttwe Functwns, A. W Elhs, Ed. Academic Press, London.]]Google Scholar
- FA$$, n., AND WILKS, Y. 1983. Preference semantics, fil-formedness, and metaphor Amer J. Comput. Ling. 9.3 4 (July-Dec), 178 189.]] Google ScholarDigital Library
- F~NK, P. K., AND BIERMANN, A.W. 1986. The correction of ill-formed input using history-based expectation with applications to speech understanding. Comput. Ling. 12, i (Jan.-Mar.), 13-36.]] Google ScholarDigital Library
- FORNEY, G. D., JR. 1973. The Viterbi algorithm. Prec. IEEE 61, 3 (Mar.), 268-278.]]Google ScholarCross Ref
- Fox, E. A., CHEN, Q. F., AND HEATH, L.S. 1992. A faster algorithm for constructing minimal perfect hash functions. In Proceedings of the 15th Annual International SIGIR Meeting, SI- GIR'92 (Denmark, June). ACM, New York~ 266-273.]] Google ScholarDigital Library
- FREDKIN, E. 1960. Trie memory. Commun ACM 3, 9, (Sept.), 490-500.]] Google ScholarDigital Library
- FROMKIN, V., ED. 1980. Errors in Linguistic Performance: Shps of the Tongue, Ear, Pen and Hand. Academic Press, New York, 1980.]]Google Scholar
- GALE, W. A., AND CHURCH, K.W. 1990. Estimation procedures for language context: Poor estimates are worse than none. In Proceedings of Compstat-90 (Dubrovnik, Yugoslavia). Springer-Verlag, New York, 69-74.]]Google ScholarCross Ref
- GALLANT, S. I. 1991. A practical approach for representing context and for performing word sense disambiguation using neural networks. Neural Comput. 3, 293-309.]]Google ScholarCross Ref
- GARRETT, M. 1982. Production of speech: Observations from normal and pathological language use. In Normality and Pathology ~n Cognttive Functmns, A. W. Ellis, Ed. Academic Press, London.]]Google Scholar
- GARSIDE, R., LEACH, G., AND SAMPSON, G. 1987. The Computatwnal Analysis of English: A Corpus-Based Approach. Longman, Inc., New York.]]Google Scholar
- GENTNER, D. R., GRUDIN, J., LAROCHELLE, S., NOR- MAN, D. A., AND RUMELHART, D. E. 1983. Studies of typing from the LNR typing research group. In Cognitive Aspects of Skilled Typewriting, W. E. Cooper, Ed. Springer- Verlag, New York.]]Google Scholar
- GERSHO, M., AND REITER, R. 1990. Information retrieval using self-organizing and heteroassociative supmwised neural networks. In Procee&ngs oflJCNN (San Diego, Calif. June).]]Google Scholar
- GOOD, I.J. 1953. The population frequencies of species and the estimation of population parameters Biometrika 40, 3 and 4 (Dec.), 129-264.]]Google Scholar
- GORIN, R. E. 1971. SPELL: A spelling checking and correction program. Online documentation for the DEC-10 computer.]]Google Scholar
- GOSHTASBY, A., AND EHRICH, R.W. 1988. Contextual word recognition using probabilistic relaxation labeling. Patt. Recog. 21, 5, 455-462.]] Google ScholarDigital Library
- GRANGER, R.H. 1983. The NOMAD system: Expectation-based detection and correction of errors during understanding of syntactically and semantically ill-formed text. Amer. J. Comput. Ling. 9, 3-4 (July-Dec.), 188-196.]] Google ScholarDigital Library
- GRUDIN, J. 1983. Error patterns in skilled and novice transcription typing. In Cognitive Aspects of Skilled Typewriting, W. E. Copper, Ed. Springer-Verlag, New York.]]Google Scholar
- GRUHIN. J. 1981. The organization of serial order in typing. Ph.D. dissertation Univ. of California, ~an Diego.]]Google Scholar
- HALL, P. A. V., ANn DOWLING, G. R. 1980. Approximate string matching. ACM Comput. Surv. 12, 4 (Dec.), 17 38.]] Google ScholarDigital Library
- HANSON, S. J., AND KEGL, J. 1987. PARSNIP: A connectionist network that natural language grammar from exposure to natural language sentences. In Proceedings of the Cognitive Science Conference.]]Google Scholar
- HANSON, A. R., RISEMAN, E. M., AND FISHER, E., 1976. Context in word recognition. Part. Recog. 8, 35-45.]]Google ScholarCross Ref
- HARMON, L. D. 1972.Automatic recognition of print and script. Proc. IEEE 60, (Oct.), 1165 1176.]]Google ScholarCross Ref
- HAWLEY, M.J. 1982. Interactive spelling correction in Unix: The METRIC Library. AT &T Bell Labs Tech. Mem., August 31.]]Google Scholar
- HEIDORN, G.E. 1982. Experience with an easily computed metric for ranking alternative parses. In Proceedings of the 20th Annual Meeting of the Associatzon for Computational Linguistics (Toronto, Canada). ACL, 82-84.]] Google ScholarDigital Library
- HEIDORN, G. E., JENSEN, K., MILLER, L. A., BYRD, R. J., AND CHODOROW, M.S. 1982. The EPIS- TLE text-critiquing system. IBM Syst. J. 21, 3,305-326.]]Google ScholarDigital Library
- HENSELER, J., SCHOLTES, J. C., AND VERDOEST, C. R. J. 1987. The design of a parallel knowledge-based optical character recognition system. Master of Science Theses, Dept. of Mathematics and Informatics, Delft Univ. of Technology.]]Google Scholar
- HINDLE, D. 1983. User manual for Fidditch, a deterministic parser. Tech. Mere. 7590 142, Naval Research Lab.]]Google Scholar
- Ho, T. K., HULL, J. J., AND SRIHARI, S. N. 1991. Word recognition with multi-level contextual knowledge. In Proceedings of IDCAR-91 (St. Malo, France), 905-915.]]Google Scholar
- HOTOPF, N. 1980. Slips of the pen. In Cognitive Processes in Spelling, Uta Frith, Ed. Academic Press, London.]]Google Scholar
- HULL, J.J. 1987. Hypothesis testing in a computational theory of visual word recognition. In Proceedings of AAAI-87, 6th National Conference on Artificial Intelligence. vol. 2 (Seattle, Wash., July 13 17). AAAI, 718 722.]]Google Scholar
- HULL, J. J., AND SRIHARI, S. N. 1982. Experiments in text recognition with binary n-gram and Viterbi algorithms. IEEE Trans. Patt. Anal. Machine Intell. PAMI-4, 5 (Sept.), 520 530.]]Google ScholarDigital Library
- JELINEK, F., MERIALDO, B., ROUKOS, S., AND STRAUSS, M. 1991. A dynamic language model for speech recognition. In Proceedings of the DARPA Speech and Natural Language Workshop (Feb. 19-22), 293-295.]] Google ScholarDigital Library
- JENSEN, K., HEIDORN, G. E., MILLER, L. A., AND RAVIN, Y. 1983. Parse fitting and prose fixM ing: Getting a hold on ill-formedness. Amer. J. Comput. Ling. 9, 3-4 (July-Dec.), 147 160.]] Google ScholarDigital Library
- JOHNSTON, J. C., AND MCCLELLAND, J. L. 1980. Experimental tests of a hierarchical model of word identification. J. Verbal Learn. Verbal Behav. 19, 503-524.]]Google ScholarCross Ref
- JONES, M. A., STORY, G. A., AND BALLARD, B. W. 1991. Integrating multiple knowledge sources in a Bayesian OCR post-processor. In Proceedtngs of IDCAR-91 (St Malo, France), 925-933.]]Google Scholar
- JOSHI, A.K. 1985. How much context-sensitivity is necessary for characterizing structural descriptions-Tree Adjoining Grammars In Natural Language Processing Theoretzcal, Computatzonal and Pwcholog~cal Perspectives, D. Dowty, L. Karttunen, A. Zwicky, Ed. Cambridge University Press, New York.]]Google Scholar
- KAHaN, S, PAVLIDiS, T., AND BAIRD. H. S. 1987. On the recognition of characters of any font size IEEE Trans Patt. Anal. Machine Intell. PAMI-9, 9, 274-287]] Google ScholarDigital Library
- KASHYAP, R. L, AND OOMMEN, B. J. 1981 An effective algorithm for string correction using generalized edit distances. Inf Sci 23, 123-142.]]Google ScholarCross Ref
- KASHYAP, R. L., AND OOMMEN, B.J.1984. Spelling correction using probabilistic methods. Part Recog. Lett. 2, 3 (Mar.), 147 154.]]Google ScholarDigital Library
- KEELER, J., AND RUMELHART, D.E. 1992. A selforganizing mtegreted segmentation and recognition neural net. In Advances ~n Neural ln/~rmation Proccsszng Systems, vol. 4. J. E. Moody, S. J. Hanson, R. P. Lippmann, Ed. Morgan Kaufmann, San Mateo, Calif., 496-503.]]Google Scholar
- KEMPEN, G., AND VOSSE, T. 1990. A languagesensitive text editor for Dutch. In Proceedings of the Computers and Writing 111 Conference (Edinburgh, Scotland, Apr )]]Google Scholar
- KERNIGHAN, M.D. 1991. Specialized spelling correction for a TDD system AT & T Bell Labs Tech. Mere., August. 30.]]Google Scholar
- KERNIGHAN, M. D., AND GALE, W.A. 1991. Varmtions on channel-frequency spelling correction in Spamsh. AT&T Bell Labs Tech. Mem., September.]]Google Scholar
- KERNIGHAN, M. D., CHURCH, K. W., AND GALE, W. A. 1990. A spelling correction program based on a noisy channel model. In Proceedings of COL- ING-90, The 13th International Conference on Computational Linguistics, vol. 2 (Helsinki, Finland). Hans Kar}gren, Ed. 205-210.]] Google ScholarDigital Library
- KNUTH, D. E. 1973. The Art of Programming. Vol. 3, Sorting and Searching. Addison-Wesley, Reading, Mass.]] Google ScholarDigital Library
- KOHONEN, T. 1980. Content Addre.ssable Memortes Springer-Verlag, New York.]] Google ScholarDigital Library
- KOHONEN, T. 1988. Self-Orgamzation arid Assoctative Memory. Springer-Verlag, New York.]] Google ScholarDigital Library
- KUCERA, H., AND FRANCIS, W.N. 1967. Computational Analysis of Present-Day American Engltsh Brown University Press, Providence, R.I.]]Google Scholar
- KUKICH, K. 1988a. Variatmns on a back-propagation name recognition net. In Proceedings of the Advanced Technology Conference, vol 2 (May 3-5). U.S. Postal Service, Washington D.C., 722-735.]]Google Scholar
- KUKICH, K. 1988b. Back-propagation topologies for sequence generation. In Proceedings o/ the IEEE International Conference on Neural Networks, vol. 1 (San Diego, Calif., July 24 27). IEEE, New York, 301-308.]]Google ScholarCross Ref
- KUKICH, K. 1990 A comparison of some novel and traditional lexical distance metrics for spelling correction. In Proceectzngs of INNC- 90-Paris (Paris, France, July), 309-313.]]Google Scholar
- KUK~CH, K. 1992. Spelling correction for the telecommunications network for the deaf. Commun ACM 35, 5 (May), 80 90.]] Google ScholarDigital Library
- LANDAUER, T. K, AND STREETER~ L. A. 1973. Structural differences between common and rare words. J. Verbal Learn. Verbal Behav. 12, 119-131.]]Google ScholarCross Ref
- LEE, Y.-H., EVENS, M., MICfiAEL, J. A., AND ROVlCK, A.A. 1990. Spelling Correction for an intelligent tutoring system. Tech. Rep., Dept. of Computer Science, Illinois Inst. of Technology, Chicago]]Google Scholar
- TEIN, V I. 1966. Binary codes capable of correcting deletions, insertions and reversals. Sov. Phys. Dokl. 10, (Feb), 707-710.]]Google Scholar
- AN, M. Y., AND WALKER. D.E. 1989. ACL Data Collectmn mitmtlve: First release. Fznite String 15, 4 (Dec.), 46-47.]]Google Scholar
- CE, R., AND WAGNER, R. 1975. An extension of the string-to-string correction problem. J. ACM 22, 2 (Apr.), 177-183.]] Google ScholarDigital Library
- O., BURGES, C. J. C, LECuN, Y, AND DENKER, J.S. 1992. Multi-digit recogmtion using a space displacement neural network. In Advances in Neural Information Processzng Systems, vol. 4, J. E Moody, S. J. Hanson, R. P. Lippnmnn, Ed. Morgan Kaufmann, San Mateo, Calif, 488-495.]]Google Scholar
- E., DAMERAU, F. J., AND MERCER, R L 1991. Context based spelling correction. Inf. Process. Manage. 27, 5. 517-522.]] Google ScholarDigital Library
- J. L., AND RUMELHART. D.E. 1981 An interactive activation model of context effects in letter perception. Psychol. Rev. 88, 5 (Sept.), 375 407.]]Google Scholar
- K.F. 1989 Generating context-sensitive responses to object-related misconceptions. Artif. Intell. 41, 157-195]] Google ScholarDigital Library
- Y, M. D 1992. Development of a spelling li~t. IEEE Trans_ Comrnun. COM-30, i (Jan.), 91 99.]]Google Scholar
- L.G. 1988. Cn yur cmputr reed ths. In Proceedinss of the 2nd Applzed Natural Language Processing Conference (Austin, Tex, Feb.). ACL, 93-100.]] Google ScholarDigital Library
- S., HAYES, P. J., AND FAIN J. 1985. Controlling search in fiemble parsing. In Proceedings of the Internatzonal Jmnt Conference on Artificml Intelhgence. Morgan Kaufman, San Marco, Calif., 786-787.]]Google Scholar
- R. 1987. Spelhng checkers, spelling correctors, and the misspellings of poor spellers. Inf. Process. Manage. 23, 5, 495-505.]] Google ScholarDigital Library
- R. 1986. A partial-dictionary of English in computer-usable form. Lit. Ling. Comput. 1, 4, 214 215.]] Google ScholarDigital Library
- R. 1985. A collection of computer-readable corpora of English spelling errors. Cog. Neuropsychol. 2, 3,275-279.]]Google ScholarCross Ref
- AND FRAENKEL, A. S. 1982a. Retrieval in an environment of faulty texts or faulty queries. In Proceedings of the 2nd International Conference on Improving Database Usabihty and Responsiveness (Jerusalem), P. Scheuerman, Ed. Academic Press, New York, 405-425.]]Google Scholar
- AND FRAENKEL, A. S. 1982b. A hash code method for detecting and correcting spelling errors. Commun. ACM 25, 12 (Dec.), 935 938.]] Google ScholarDigital Library
- H.L. 1970. Spelling correction in systems programs. Commun. ACM 13, 2 (Feb.), 90-94.]] Google ScholarDigital Library
- R., AND CHERRY, L.L. 1975. Computer detection of typographical errors. IEEE Trans. Profess. Commun. PC-18, 1, 54-63.]]Google Scholar
- E., JR., AND THARP, A.L. 1977. Correcting human error in alphanumeric terminal input. Inf. Process. Manage. 13, 329-337.]]Google ScholarCross Ref
- ER, G. L. 1966. Introduction to Dynamic Programming. Wiley, New York.]]Google Scholar
- J., PHILLIPS, V. L., AND DUMAIS, S. T. 1992. Retrieving imperfectly recognized handwritten notes. Behav. Inf. Teeh.]]Google Scholar
- M. K., AND RUSSELL, R. C. 1918. U.S. Patent Numbers, 1,261,167 (1918) and 1,435,663 (1922). U.S. Patent Office, Washington, D.C.]]Google Scholar
- T., TANAKA, E., AND KASAI, T. 1976. A method of correction of garbled words based on the Levenshtein metric. IEEE Trans. Comput. 25, 172-177.]]Google Scholar
- T., MACHI, F., EVANS, B., AND TOM, J. 1988. Computational techniques for improved name search. In Proceedings of the 2nd Annual Applied Natural Language Conference (Austin, Tex, Feb.). ACL, 203-210.]] Google ScholarDigital Library
- E, K., CHIGNELL, M., KHOSHAFIAN, S., AND WONG, H. 1990. Intelligent databases. A/ Expert, (Mar.), 38 47.]]Google Scholar
- ON, J. L. 1980. Computer programs for detecting and correcting spelling errors. Commun. ACM 23, 12, (Dec.), 676-684.]] Google ScholarDigital Library
- PETERSON, J.L. 1986. A note on undetected typing errors. Commun. ACM 29, 7 (July), 633-637.]] Google ScholarDigital Library
- POLLOCK, J. J., AND ZAMORA, A. 1983. Collection and characterization of spelling errors in scientific and scholarly text. J. Amer. Soc. Inf. Sci. 34, 1, 51 58.]]Google ScholarCross Ref
- POLLOCK, J. J., AND ZAMO~, A. 1984. Automatic spelling correction in scientific and scholarly text. Commun. ACM 27, 4 (Apr.), 358-368.]] Google ScholarDigital Library
- RAMSaAW, L. A. 1989. Pragmatic knowledge for resolving ill-formedness. Tech. Rep. No. 89-18, BBN, Cambridge, Mass.]]Google Scholar
- RHYNE, J. R., AND WOLF, C. G. 1991. Paperlike user interfaces. RC 17271 (#76097), IBM Research Division, T. J. Watson Research Center, Yorktown Heights, N.Y.]]Google Scholar
- RHYNE, J. R., AND WOLF, C. G. 1993. Recognition-based user interfaces. In Advances m Human-Computer Interaction, vol. 4, H. R. Hartson and D. Hix, Ed. Ablex, Norwood, N.J.]]Google Scholar
- RICHARDSON, S. D., AND BRADEN-HARDER, L. C. 1988. The experience of developing a largerscale natural language text processing system: CRITIQUE. In Proceedings of the 2nd Annual Applied Natural Language Conference, (Austin, Tex. Feb.). ACL, 195-202.]] Google ScholarDigital Library
- E. M., AND HANSON, A.R. 1974. A contextual postprocessing system for error correction using binary n-grams. IEEE Trans. Cornput. C-23, (May), 480-493.]]Google Scholar
- ROBERTSON, A. M., AND WILLETT, P. 1992. Searching for historical word-forms in a database of 17th-century English text using spelling-correction methods. In Proceedings of the 15th Annual International SIGIR Meeting, SIGIR'92 (Denmark, June). ACM, New York, 256-265.]] Google ScholarDigital Library
- ROSENFELD, A., HUMMEL, R. A., AND ZUCKER, S. W. 1976. Scene labeling by relaxation operations. IEEE Trans. Syst. Man Cybernet. SMC-6, 6, 420-433.]]Google ScholarCross Ref
- RUMELHART, D. E., AND MCCLELLAND, J.L. 1982. An interactive activation model of context effects in letter perception. Psychol. Rev. 89, 1, 60-94.]]Google ScholarCross Ref
- RUMELHART, D. E., HINTON, G. E., AND WILLIAMS, R. J. 1986. Learning internal representations by error propagation. In Parallel Distnbuted Processing: Explorations in the Microstructure of Cognition, D. E. Rumelhart and J. L. McClelland, Ed. Bradford Books/MIT Press.]] Google ScholarDigital Library
- SALTON, G. 1989. Automatic text transformations. In Automatic Text Processing: The Transformahon, Analysis and Retrieval of Information by Computer. Addison-Wesley, Reading, Mass.]] Google ScholarDigital Library
- SAMPSON, G. 1989. How fully does a machineusable dictionary cover English text. Lit. Ling. Comput. 4, 1, 29-35.]]Google ScholarCross Ref
- SANKOFF, D., AND KRUSKAL, J. B. 1983. Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, Reading, Mass.]]Google Scholar
- SANTOS, P. J., BALTZER, A. J., BADRE, A. N., HENNE- MAN. R. L.. AND MILLER. M. S. 1992. On handwriting recognition system performance: Some experimental results. In Proceedings of the Human Factors Soctety 36th Annual Meeting (Atlanta, Ga., Oct. 12-16). Human Factors Society.]]Google ScholarCross Ref
- SCHANK, R. C., LEBOWITZ, M., AND BIRNBAUM, L. 1980. An integrated understander. Am. J. Comput. Ltng. 6, 1, 13 30.]] Google ScholarDigital Library
- SHELL, B. A. 1978. Median split trees. A fast look-up technique for frequently occurring keys. Commun. ACM 21, 11 (Nov.), 947-958]] Google ScholarDigital Library
- SH~NOHAL, R, AND TOUSSAINT, G. T 1979a Experiments in text recognition with the modified Viterbi algorithm. IEEE Trans Patt. Anal. Machine Intell. PAMI-1, 4 (Apr), 184 193.]]Google Scholar
- SHiNGHAL, R., AND TOUSSAINT, G.T. 1979b. A bottom-up and top-down approach to using context in text recognition. Dzt. J. Man-Machine Stud. 11,201 212.]]Google ScholarCross Ref
- SIDOROV, A.A. 1979. Analysis of word similarity on spelling correction systems. Program. Cornput. Softw 5, 274 277.]]Google Scholar
- SINHA, R. M. K., AND PRASADA, B. 1988. Visual text recognition through contextual processing. Port. Recog. 21, 5, 463 479.]] Google ScholarDigital Library
- SITAR, E.J. 1961. Machine recognition of cursive script: The use of context for error detection and correction. Bell Labs Tech. Mem.]]Google Scholar
- SLEATOR, D. a., AND TEMPERLY, a. 1992. ParsLng Enghsh with a Link Grammar. Source code via internet host: spade.pc.cs.cmu.edu:/usr/ sleator/pubhc. Carnegie-Mellon Univ., Pittsburgh, Pa.]]Google Scholar
- SMADJA, F. 1991a From n-grams to collocations: An evaluation of XTRACT. In Proceedzngs of the 29th Ahnual Meetzng of the Assoczatlon for Computational Linguistics (Berkeley, Calif., June). ACL, 279 284.]] Google ScholarDigital Library
- SMaDJA, F. 1991b. Extracting collocations from text. An apphcation: Text Generation. Ph.D. dissertation, Columbia Umv., New York.]] Google ScholarDigital Library
- SMADJA, F., AND McKEOWN, K. 1990. Automatically extracting and representing collocations for language generation. In Proceedings of the 28th Annual Meeting of the Association for Computational LlnguLetics, (Pittsburgh, Pa., June). ACL, 252-259.]] Google ScholarDigital Library
- SPENKE, M., BEILKEN, C., MATTERN, F., MEVENKAMP, M., AND H. M. 1984. A language independent error recovery method for LL(1) parsers. Softw. Pract. Exp. 14, 11.]]Google ScholarCross Ref
- SRItlARI, S., El). 1984. Computer Text Recognitzon and Error Correctwn. IEEE Computer Society Press, Plscataway, N.J]]Google Scholar
- SRIHARI, S. N., HULL, J. J., AND CHOUDHARI. R. 1983. Integrating diverse knowledge sources in text recognition. ACM Trans. Office Inf. Syst. 1, i (Jan.), 68-87.]] Google ScholarDigital Library
- SuRL L. Z. 1991. Language transfer: A foundation for correcting the written English of ASL signers. Tech. Rep. No. 91-19, Dept. of Computer and Information Sciences, Univ. of Delaware, Newark, Del.]]Google Scholar
- SuRL L. Z., AND McCoY, K. F. 1991. Language transfer in deaf writing: A correction methodology for an instructional system. Tech. Rep. No. 91-20, Dept. of Computer and Information Sciences, Univ. of Delaware, Newark, De}.]]Google Scholar
- TAYLOR, W D. 1981. GROPE--A spelling error correction tool. AT & T Bell Labs Tech. Mere.]]Google Scholar
- TENCZAR, P., AND GOLDEN, W. 1972. CERL Report X-35. Computer-Based Educatmn Research Lab., Umv of Ilhnois, Urbana, Ill.]]Google Scholar
- THOMPSON, B. H. 1980. Linguistic analysis of natural language communication with computers. In Proceedings of the 8th International Conference on Computational Linguistics (Tokyo, Japan), 190-201.
- TOUSSAINT, G. T. 1978. The use of context in pattern recognition. Patt. Recog. 10, 189-204.
- TRAWICK, D. J. 1983. Robust sentence analysis and habitability. Ph.D. dissertation, California Inst. of Technology, Pasadena, Calif.
- TROY, P. L. 1990. Combining probabilistic sources with lexical distance measures for spelling correction. Bellcore Tech. Memo., Bellcore, Morristown, N.J.
- TSAO, Y. C. 1990. A lexical study of sentences typed by hearing-impaired TDD users. In Proceedings of the 13th International Symposium on Human Factors in Telecommunications (Turin, Italy, Sept.), 197-201.
- TURBA, T.N. 1981. Checking for spelling and typographical errors in computer-based text. SIGPLAN-SIGOA Newslett. (June), 51-60.
- ULLMANN, J.R. 1977. A binary n-gram technique for automatic correction of substitution, deletion, insertion and reversal errors in words. Comput. J. 20, 141-147.
- VAN BERKEL, B., AND DE SMEDT, K. 1988. Triphone analysis: A combined method for the correction of orthographical and typographical errors. In Proceedings of the 2nd Applied Natural Language Processing Conference (Austin, Tex., Feb.). Association for Computational Linguistics (ACL).
- VERONIS, J. 1988a. Computerized correction of phonographic errors. Comput. Hum. 22, 43-56.
- VERONIS, J. 1988b. Morphosyntactic correction in natural language interfaces. In Proceedings of the 12th International Conference on Computational Linguistics (Budapest, Hungary), 708-713.
- VOSSE, T. 1992. Detecting and correcting morpho-syntactic errors in real texts. In Proceedings of the 3rd Conference on Applied Natural Language Processing (Trento, Italy, Mar. 31-Apr. 3). ACL, 111-118.
- WAGNER, R.A. 1974. Order-n correction for regular languages. Commun. ACM 17, 5 (May), 265-268.
- WAGNER, R. A., AND FISCHER, M. J. 1974. The string-to-string correction problem. J. ACM 21, 1 (Jan.), 168-178.
- WALKER, D. E. 1991. The ecology of language. In Proceedings of the International Workshop on Electronic Dictionaries (Feb.). Japan Electronic Dictionary Research Institute, Tokyo, 10-22.
- WALKER, D. E., AND AMSLER, R.A. 1986. The use of machine-readable dictionaries in sublanguage analysis. In Analyzing Language in Restricted Domains: Sublanguage Description and Processing. Lawrence Erlbaum, Hillsdale, N.J., 69-83.
- WALTZ, D. L. 1978. An English language question answering system for a large relational database. Commun. ACM 21, 7, 526-539.
- Webster's New World Misspeller's Dictionary. Simon and Schuster, New York.
- WEISCHEDEL, R. M., AND SONDHEIMER, N.K. 1983. Meta-rules as a basis for processing ill-formed input. Amer. J. Comput. Ling. 9, 3-4 (July-Dec.), 161-177.
- WING, A. M., AND BADDELEY, A.D. 1980. Spelling errors in handwriting: A corpus and distributional analysis. In Cognitive Processes in Spelling, U. Frith, Ed. Academic Press, London.
- WONG, C. K., AND CHANDRA, A.K. 1976. Bounds for the string editing problem. J. ACM 23, 1 (Nov.), 13-16.
- WRIGHT, A. G., AND NEWELL, A. F. 1991. Computer help for poor spellers. Brit. J. Educ. Tech. 22, 2 (Feb.), 146-148.
- YANNAKOUDAKIS, E. J., AND FAWTHROP, D. 1983a. An intelligent spelling correcter. Inf. Process. Manage. 19, 12, 101-108.
- YANNAKOUDAKIS, E. J., AND FAWTHROP, D. 1983b. The rules of spelling errors. Inf. Process. Manage. 19, 2, 87-99.
- YOUNG, C. W., EASTMAN, C. M., AND OAKMAN, R. L. 1991. An analysis of ill-formed input in natural language queries to document retrieval systems. Inf. Process. Manage. 27, 6, 615-622.
- ZAMORA, E. M., POLLOCK, J. J., AND ZAMORA, A. 1981. The use of trigram analysis for spelling error detection. Inf. Process. Manage. 17, 6, 305-316.
- ZIPF, G. K. 1935. The Psycho-Biology of Language. Houghton Mifflin, Boston.