Hebrew Computational Linguistics: Past and Future

Artificial Intelligence Review

Abstract

This paper reviews the current state of the art in Natural Language Processing for Hebrew, both theoretical and practical. The Hebrew language, like other Semitic languages, poses special challenges for developers of programs for natural language processing: the writing system, rich morphology, unique word formation process of roots and patterns, lack of linguistic corpora that document language usage, all contribute to making computational approaches to Hebrew challenging. The paper briefly reviews the field of computational linguistics and the problems it addresses, describes the special difficulties inherent to Hebrew (as well as to other Semitic languages), surveys a wide variety of past and ongoing works and attempts to characterize future needs and possible solutions.

About this article

Cite this article

Wintner, S. Hebrew Computational Linguistics: Past and Future. Artificial Intelligence Review 21, 113–138 (2004). https://doi.org/10.1023/B:AIRE.0000020865.73561.bc

  • DOI: https://doi.org/10.1023/B:AIRE.0000020865.73561.bc