DOI: 10.3115/1119212.1119217

An architecture for word learning using bidirectional multimodal structural alignment

Published: 31 May 2003

ABSTRACT

Learning of new words is assisted by contextual information. This context can come in several forms, including observations in non-linguistic semantic domains, as well as the linguistic context in which the new word was presented. We outline a general architecture for word learning, in which structural alignment coordinates this contextual information in order to restrict the possible interpretations of unknown words. We identify spatial relations as an applicable semantic domain, and describe a system-in-progress for implementing the general architecture using video sequences as our non-linguistic input. For example, when the complete system is presented with "The bird dove to the rock," with a video sequence of a bird flying from a tree to a rock, and with the meanings for all the words except the preposition "to," the system will register the unknown "to" with the corresponding aspect of the bird's trajectory.
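To make the alignment idea concrete, the toy Python sketch below is a hypothetical illustration, not the system described in the paper. It assumes a Jackendoff-style conceptual structure has already been recovered from the video and that known words already map to concept labels; it shows only one direction of the alignment, binding the single unknown word to the structural slot of the scene that no known word accounts for. The Concept class, the bind_unknown routine, and the treatment of determiners as semantically empty are illustrative assumptions.

# Hypothetical sketch only -- not the paper's implementation.

from dataclasses import dataclass

@dataclass(frozen=True)
class Concept:
    label: str            # e.g. "GO", "TO", "BIRD", "ROCK"
    args: tuple = ()      # nested Concept arguments

def nodes(c):
    """Enumerate every node of a conceptual structure, depth first."""
    yield c
    for a in c.args:
        yield from nodes(a)

def bind_unknown(words, lexicon, scene):
    """Guess the meaning of a single unknown word by structural alignment:
    whatever part of the scene structure is not accounted for by the known
    words is offered as the interpretation of the unknown word."""
    scene_labels = [n.label for n in nodes(scene)]
    covered = {lexicon[w] for w in words if w in lexicon}
    leftovers = [lab for lab in scene_labels if lab not in covered]
    unknowns = [w for w in words if w not in lexicon]
    if len(unknowns) == 1 and len(leftovers) == 1:
        return {unknowns[0]: leftovers[0]}
    return {}

# "The bird dove to the rock" paired with a bird-to-rock trajectory.
scene = Concept("GO", (Concept("BIRD"),
                       Concept("TO", (Concept("ROCK"),))))
lexicon = {"the": "DET",   # determiners treated as semantically empty here
           "bird": "BIRD", "dove": "GO", "rock": "ROCK"}
words = ["the", "bird", "dove", "to", "the", "rock"]

print(bind_unknown(words, lexicon, scene))    # -> {'to': 'TO'}

The actual architecture operates over richer conceptual structures and aligns bidirectionally, so that linguistic structure can also constrain the interpretation of the visual input rather than only the reverse.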


Published in

HLT-NAACL-LWM '04: Proceedings of the HLT-NAACL 2003 workshop on Learning word meaning from non-linguistic data - Volume 6
May 2003, 101 pages
Publisher: Association for Computational Linguistics, United States
Overall acceptance rate: 240 of 768 submissions, 31%