Abstract
In this paper we propose an explicit computer model for learning natural language syntax based on Angluin's (1982) efficient induction algorithms, using a complete corpus of grammatical example sentences. We use these results to show how inductive inference methods may be applied to learn substantial, coherent subparts of at least one natural language, English, that are not susceptible to the kinds of learning envisioned in linguistic theory. As two concrete case studies, we show how to learn English auxiliary verb sequences (such as could be taking, will have been taking) and the sequences of articles and adjectives that appear before noun phrases (such as the very old big deer). Both systems can be acquired in a computationally feasible amount of time using either positive examples alone or, in an incremental mode, implicit negative examples (examples outside a finite corpus are treated as negative examples). As far as we know, this is the first computer procedure that learns a full-scale range of noun subclasses and noun phrase structure. The generalizations and the time required for acquisition match our knowledge of child language acquisition for these two cases. More importantly, these results show that just where linguistic theories admit to highly irregular subportions, we can apply efficient automata-theoretic learning algorithms. Since the algorithm works only for fragments of language syntax, we do not believe that it suffices for all of language acquisition. Rather, we would claim that language acquisition is nonuniform and susceptible to a variety of acquisition strategies; this algorithm may be one of these.
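The abstract does not spell out the induction procedure, so the following is only an illustrative sketch of the simplest member of the family it builds on: inference of a zero-reversible automaton (the k = 0 case of Angluin, 1982) from positive examples. The function names `infer_zero_reversible` and `accepts`, the whitespace tokenization, and the three training sequences are assumptions made for this example, not the authors' implementation or corpus.

```python
def infer_zero_reversible(sentences):
    """Infer a zero-reversible automaton from a non-empty list of positive examples.

    Returns (start, final, delta), where delta maps (state, token) -> state.
    """
    samples = [tuple(s.split()) for s in sentences]
    # Prefix tree acceptor: every prefix of every sample is a state.
    states = {s[:i] for s in samples for i in range(len(s) + 1)}
    edges = {(s[:i], s[i], s[:i + 1]) for s in samples for i in range(len(s))}

    # Union-find over states; merged states share one representative.
    parent = {q: q for q in states}

    def find(q):
        while parent[q] != q:
            parent[q] = parent[parent[q]]
            q = parent[q]
        return q

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
        return True

    # Step 1: a zero-reversible automaton has a single accepting state.
    for sample in samples[1:]:
        union(samples[0], sample)

    # Step 2: merge until the automaton is deterministic in both directions.
    changed = True
    while changed:
        changed = False
        out = {}   # (source block, token) -> some target state
        into = {}  # (token, target block) -> some source state
        for src, tok, dst in edges:
            # Forward determinism: one target per (source, token).
            if union(out.setdefault((find(src), tok), dst), dst):
                changed = True
            # Reverse determinism: one source per (token, target).
            if union(into.setdefault((tok, find(dst)), src), src):
                changed = True

    delta = {(find(s), t): find(d) for s, t, d in edges}
    return find(()), find(samples[0]), delta


def accepts(automaton, sentence):
    """Check whether the inferred automaton accepts a token sequence."""
    start, final, delta = automaton
    state = start
    for tok in sentence.split():
        state = delta.get((state, tok))
        if state is None:
            return False
    return state == final


# Hypothetical training data: three auxiliary sequences, positive examples only.
automaton = infer_zero_reversible(["could take", "will take", "could be taking"])
print(accepts(automaton, "will be taking"))  # True: generalized, never observed
print(accepts(automaton, "could taking"))    # False
```

In this toy run, "could" and "will" fall into the same state because both can be followed by "take" at the end of a sequence, so the unseen sequence "will be taking" is accepted while "could taking" is not. This is the kind of generalization from positive data that the abstract describes, shown here on a much smaller scale.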
References
Akmajian, A., Steele, S., & Wasow, T. (1979). The category AUX in universal grammar. Linguistic Inquiry, 10, 1–64.
Angluin, D. (1977). Inductive inference of formal languages from positive data. Information and Control, 45, 117–135.
Angluin, D. (1982). Inference of reversible languages. Journal of the Association for Computing Machinery, 29, 741–765.
Berwick, R. (1982). Locality principles and the acquisition of syntactic knowledge. Doctoral dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA.
Berwick, R. (1985). The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.
Brown, R. (1973). A first language. Cambridge, MA: Harvard University Press.
Fu, K., & Booth, T. (1975). Grammatical inference: Introduction and survey. IEEE Transactions on Systems, Man, and Cybernetics, 5, 95–111.
Gleitman, L., & Wanner, E. (1982). Language acquisition: The state of the state of the art. In E. Wanner & L. Gleitman (Eds.), Language acquisition: The state of the art. New York: Cambridge University Press.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447–474.
Gold, E. M. (1978). Complexity of automaton identification from given data. Information and Control, 37, 302–320.
Gonzalez, R. C., & Thomason, M. G. (1978). Syntactic pattern recognition. Reading, MA: Addison-Wesley.
Jackendoff, R. (1977). X̄ syntax: A study in phrase structure. Cambridge, MA: MIT Press.
Langley, P. (1982). Language acquisition through error recovery. Cognition and Brain Theory, 3, 211–255.
Lightfoot, D. (1982). The language lottery. Cambridge, MA: MIT Press.
MacWhinney, B. (1982). Basic processes in syntactic acquisition. In S. A. Kuczaj II (Ed.), Language development: Vol. 1. Syntax and semantics. Hillsdale, NJ: Lawrence Erlbaum.
Mitchell, T. M. (1978). Version spaces: An approach to concept learning. Doctoral dissertation, Department of Electrical Engineering, Stanford University, Stanford, CA.
Olivier, D. (1968). Stochastic grammars and language acquisition mechanisms. Doctoral dissertation, Department of Psychology and Social Relations, Harvard University, Cambridge, MA.
Osherson, D., Stob, M., & Weinstein, S. (1985). Systems that learn. Cambridge, MA: MIT Press.
Pinker, S. (1984). Language learnability and language development. Cambridge, MA: Harvard University Press.
Wexler, K., & Culicover, P. (1982). Formal principles of language acquisition. Cambridge, MA: MIT Press.
Wolff, J. G. (1978). Grammar discovery as data compression. In Proceedings of the AISB/GI Conference on Artificial Intelligence (pp. 375–379). Hamburg, West Germany.
Wolff, J. G. (1982). Language acquisition, data compression, and generalization. Language and Communication, 2, 57–89.
Cite this article
Berwick, R.C., Pilato, S. Learning syntax by automata induction. Mach Learn 2, 9–38 (1987). https://doi.org/10.1007/BF00058753