Word sense disambiguation methods

Turdakov, D. Yu.

doi:10.1134/S0361768810060010

Published: 25 November 2010

Programming and Computer Software, volume 36, pages 309–326 (2010)

Abstract

Word sense disambiguation is one of the key tasks of text processing. It consists in determining the senses of words or compound terms according to the context in which they are used. The word sense disambiguation problem originated in the 1950s as a subtask of machine translation. Since then, a great number of methods for solving it have been developed; however, none of them can be regarded as perfect. This paper surveys the best-known studies in this field.

Author information

Authors and Affiliations

Institute of System Programming, Russian Academy of Sciences, ul. Solzhenitsyna 25, Moscow, 109004, Russia
D. Yu. Turdakov

Corresponding author

Correspondence to D. Yu. Turdakov.

About this article

Cite this article

Turdakov, D.Y. Word sense disambiguation methods. Program Comput Soft 36, 309–326 (2010). https://doi.org/10.1134/S0361768810060010

Received: 15 April 2010
Published: 25 November 2010
Issue Date: November 2010
DOI: https://doi.org/10.1134/S0361768810060010
