Designing Annotation Schemes: From Theory to Model

Handbook of Linguistic Annotation

Abstract

In this chapter, we describe the method and process of transforming the theoretical formulations of a linguistic phenomenon, based on empirical observations, into a model that can be used to develop a language annotation specification. We outline this procedure in general terms, and then examine the steps in detail through a specific example. We look at how this methodology has been implemented in the creation of TimeML (and ISO-TimeML), a broad-based standard for annotating temporal information in natural language texts. Because of the scope of this effort and the richness of the theoretical work in the area, the development of TimeML illustrates very clearly the methodology of the early stages of the MATTER annotation cycle, where initial models and schemas cycle through progressively more mature versions of the resulting specification. Furthermore, the subsequent effort to convert TimeML into an ISO-compliant standard, ISO-TimeML, demonstrates the utility of the CASCADES model in distinguishing between the concrete syntax of the schema and the abstract syntax of the model behind it.

Notes

  1. See [6, 7] for the formal semantics of abstract annotation structures.

  2. See [44] for a review of tense and aspect in the context of temporal reasoning and event semantics.

  3. We adopt ISO 8601, an ISO standard that is useful for normalizing times, as part of ISO-TimeML. See Sect. 5.4 below.

  4. The working group came to refer to Setzer’s work as STAG (Sheffield Temporal Annotation Guidelines).

  5. The relType value ‘IDENTITY’ is actually not part of Allen’s calculus, but was used for event coreference.

  6. This is because all temporal relations in ISO-TimeML are binary.

References

  1. Allen, J.: Towards a general theory of action and time. Artif. Intell. 23, 123–154 (1984)

  2. Baker, C., Fillmore, C., Cronin, B.: The structure of the FrameNet database. Int. J. Lexicogr. 16(3), 281–296 (2003)

  3. Beavers, J.: Scalar complexity and the structure of events. In: Dölling, J., Heyde-Zybatow, T., Schäfer, M. (eds.) Event Structures in Linguistic Form and Interpretation, pp. 245–265. Mouton de Gruyter, Berlin (2008)

  4. Boguraev, B., Pustejovsky, J., Ando, R., Verhagen, M.: TimeBank evolution as a community resource for TimeML parsing. Lang. Resour. Eval. 41(1), 91–115 (2007)

  5. Bunt, H.C.: Mass Terms and Model-Theoretic Semantics. Cambridge University Press, Cambridge (1985)

  6. Bunt, H.: Semantic annotations as complementary to underspecified semantic representations. In: Proceedings of the Eighth International Conference on Computational Semantics, pp. 33–44. Association for Computational Linguistics (2009)

  7. Bunt, H.: Introducing abstract syntax + semantics in semantic annotation, and its consequences for the annotation of time and events. In: Lee, E., Yoong, A. (eds.) Recent Trends in Language and Knowledge Processing, pp. 157–204. Hankukmunhwasa, Seoul (2011)

  8. Bunt, H.: On the principles of interoperable semantic annotation. In: Proceedings of the 11th Joint ACL-ISO Workshop on Interoperable Semantic Annotation, pp. 1–13 (2015)

  9. Bunt, H., Pustejovsky, J.: Annotating temporal and event quantification. In: Proceedings of 5th ISA Workshop (2010)

  10. Bunt, H., Fang, A.C., Ide, N., Webster, J.: A methodology for designing semantic annotation languages exploiting syntactic-semantic isomorphisms (2010)

  11. Bunt, H., Alexandersson, J., Choe, J.W., Fang, A.C., Hasida, K., Petukhova, V., Popescu-Belis, A., Traum, D.R.: ISO 24617-2: a semantically-based standard for dialogue annotation. In: LREC, pp. 430–437. Citeseer (2012)

  12. Carlson, G.N.: Reference to kinds in English. Ph.D. thesis, Linguistics Department, University of Massachusetts, Amherst, Massachusetts (1977)

  13. Chierchia, G.: Structured meanings, thematic roles and control. In: Properties, Types and Meaning, pp. 131–166. Springer, Berlin (1989)

  14. Chinchor, N., Robinson, P.: MUC-7 named entity task definition. In: Proceedings of the 7th Conference on Message Understanding, p. 29 (1997)

  15. Comrie, B.: Tense. Cambridge University Press, Cambridge (1985)

  16. Davidson, D.: The logical form of action sentences. In: Rescher, N. (ed.) The Logic of Decision and Action, pp. 81–95. Pittsburgh Press, Pittsburgh (1967)

  17. Diesing, M.: Bare plural subjects and the derivation of logical representations. Linguist. Inq. 23, 353–380 (1992)

  18. Dowty, D.R.: Word Meaning and Montague Grammar: The Semantics of Verbs and Times in Generative Semantics and in Montague’s PTQ, vol. 7. Springer Science & Business Media, Berlin (1979)

  19. Dowty, D.: On the semantic content of the notion of thematic role. Prop. Types Mean. 2, 69–130 (1989)

  20. Dowty, D.: Thematic proto-roles and argument selection. Language 67, 547–619 (1991)

  21. Grimshaw, J.: Argument Structure. MIT Press, Cambridge (1990)

  22. Hay, J., Kennedy, C., Levin, B.: Scalar structure underlies telicity in ‘degree achievements’. In: Matthews, T., Strolovitch, D. (eds.) Proceedings of Semantics and Linguistic Theory IX, pp. 127–144. Cornell University, Ithaca (1999)

  23. Higginbotham, J.: On semantics. Linguist. Inq. 16, 547–593 (1985)

  24. Hobbs, J.R., Pustejovsky, J.: Annotating and reasoning about time and events. In: Doherty, P., McCarthy, J., Williams, M.A. (eds.) Working Papers of the 2003 AAAI Spring Symposium on Logical Formalization of Commonsense Reasoning, pp. 74–82. AAAI Press, Menlo Park (2003)

  25. Hobbs, J.R., Pan, F.: An ontology of time for the semantic web. ACM Trans. Asian Lang. Inf. Process. (TALIP) 3(1), 66–85 (2004)

  26. Ide, N., Romary, L.: Outline of the international standard linguistic annotation framework. In: Proceedings of the ACL 2003 Workshop on Linguistic Annotation: Getting the Model Right, vol. 19, pp. 1–5. Association for Computational Linguistics (2003)

  27. Ide, N., Romary, L.: International standard for a linguistic annotation framework. Nat. Lang. Eng. 10(3–4), 211–225 (2004)

  28. Ide, N., Suderman, K.: The linguistic annotation framework: a standard for annotation interchange and merging. Lang. Resour. Eval. 48(3), 395–418 (2014)

  29. Jäger, G.: Topic-comment structure and the contrast between stage level and individual level predicates. J. Semant. 18(2), 83–126 (2001)

  30. Kamp, H., Reyle, U.: From Discourse to Logic; Introduction to the Model-theoretic Semantics of Natural Language. Springer, Berlin (1993)

  31. Karttunen, L.: Implicative verbs. Language 47, 340–358 (1971)

  32. Karttunen, L.: Some observations on factivity. Res. Lang. Soc. Interact. 4(1), 55–69 (1971)

  33. Karttunen, L., Zaenen, A.: Veridicity. In: Dagstuhl Seminar Proceedings. Schloss Dagstuhl-Leibniz-Zentrum für Informatik (2005)

  34. Katz, G.: Towards a denotational semantics for TimeML. In: Annotating, Extracting and Reasoning about Time and Events, pp. 88–106. Springer, Berlin (2007)

  35. Kennedy, C., Levin, B.: Measure of change: the adjectival core of degree achievements. In: Adjectives and Adverbs: Syntax, Semantics and Discourse, pp. 156–182. Oxford University Press, Oxford (2008)

  36. Kiparsky, P., Kiparsky, C.: Fact. In: Progress in Linguistics, pp. 143–173. Mouton, The Hague (1971)

  37. Kipper, K.: VerbNet: a broad-coverage, comprehensive verb lexicon. Ph.D. dissertation, University of Pennsylvania, PA (2005). http://repository.upenn.edu/dissertations/AAI3179808/

  38. Krifka, M.: Thematic relations as links between nominal reference and temporal constitution. Lex. Matters 2953, 30–52 (1992)

  39. Krifka, M.: The origins of telicity. In: Rothstein, S. (ed.) Events and Grammar. Kluwer, Dordrecht (1998)

  40. Levin, B., Rappaport Hovav, M.: Argument Realization. Cambridge University Press, Cambridge (2005)

  41. Lin, J.W.: Time in a language without tense: the case of Chinese. J. Semant. 23(1), 1–53 (2006)

  42. Mani, I., Wilson, G.: Robust temporal processing of news. In: Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL2000), pp. 69–76. New Brunswick (2000)

  43. Mani, I., Wilson, G., Sundheim, B., Ferro, L.: Guidelines for annotating temporal information. In: Proceedings of HLT 2001, First International Conference on Human Language Technology Research (2001)

  44. Mani, I., Pustejovsky, J., Gaizauskas, R.: The Language of Time: A Reader. Oxford University Press, Oxford (2005)

  45. Mani, I., Verhagen, M., Wellner, B., Lee, C.M., Pustejovsky, J.: Machine learning of temporal relations. In: Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pp. 753–760. Association for Computational Linguistics, Sydney (2006). http://www.aclweb.org/anthology/P/P06/P06-1095

  46. Mani, I., Wellner, B., Verhagen, M., Pustejovsky, J.: Three approaches to learning TLINKs in TimeML. Technical Report CS-07-268, Brandeis University, Waltham, United States (2007)

  47. Manna, Z., Pnueli, A.: Temporal Verification of Reactive Systems: Safety. Springer, Berlin (1995)

  48. McCarthy, J.: Situations, actions, and causal laws. Technical Report, DTIC Document (1963)

  49. McCarthy, J., Hayes, P.J.: Some philosophical problems from the standpoint of artificial intelligence. In: Readings in Artificial Intelligence, pp. 431–450 (1969)

  50. Moens, M., Steedman, M.: Temporal ontology and temporal reference. Comput. Linguist. 14(2), 15–28 (1988)

  51. MUC: Proceedings of the Sixth Message Understanding Conference (MUC-6). Morgan Kaufmann, California (1995)

  52. Naumann, R.: Aspects of changes: a dynamic event semantics. J. Semant. 18(1), 27–81 (2001)

  53. Palmer, M., Gildea, D., Kingsbury, P.: The proposition bank: an annotated corpus of semantic roles. Comput. Linguist. 31(1), 71–106 (2005)

  54. Parsons, T.: Events in the Semantics of English, vol. 5. MIT Press, Cambridge (1990)

  55. Partee, B.H.: Some structural analogies between tenses and pronouns in English. J. Philos. 70(18), 601–609 (1973)

  56. Pratt-Hartmann, I.: From TimeML to Interval temporal logic. In: Proceedings of the Seventh International Workshop on Computational Semantics, pp. 166–180 (2007)

  57. Prior, A.N.: Past, Present and Future, vol. 154. Clarendon Press, Oxford (1967)

  58. Pustejovsky, J.: The geometry of events. In: Tenny, C. (ed.) Studies in Generative Approaches to Aspect. Lexicon Project Working Papers vol. 24. MIT, Cambridge (1988)

  59. Pustejovsky, J.: The syntax of event structure. Cognition 41(1), 47–81 (1991)

  60. Pustejovsky, J., Stubbs, A.: Natural Language Annotation for Machine Learning. O’Reilly Media, Inc., USA (2012)

  61. Pustejovsky, J., Castano, J., Ingria, R., Saurí, R., Gaizauskas, R., Setzer, A., Katz, G.: TimeML: robust specification of event and temporal expressions in text. In: IWCS-5, Fifth International Workshop on Computational Semantics (2003). http://www.timeml.org

  62. Pustejovsky, J., Hanks, P., Saurí, R., See, A., Gaizauskas, R., Setzer, A., Radev, D., Sundheim, B., Day, D., Ferro, L., Lazo, M.: The TimeBank corpus. In: Proceedings of Corpus Linguistics, pp. 647–656 (2003)

  63. Pustejovsky, J., Ingria, B., Saurí, R., Castano, J., Littman, J., Gaizauskas, R., Setzer, A., Katz, G., Mani, I.: The specification language TimeML. In: The Language of Time: A Reader. Oxford University Press, Oxford (2004)

  64. Pustejovsky, J., Knippen, R., Littman, J., Saurí, R.: Temporal and event information in natural language text. Lang. Resour. Eval. 39, 123–164 (2005)

  65. Pustejovsky, J., Littman, J., Saurí, R.: Arguments in TimeML: events and entities. In: Katz, F., Pustejovsky, J., Schilder, F. (eds.) Annotating, Extracting and Reasoning about Time and Events, vol. 4795, pp. 107–126. Springer, Berlin (2007)

  66. Pustejovsky, J., Lee, K., Bunt, H., Romary, L.: ISO-TimeML: an international standard for semantic annotation. In: LREC (2010)

  67. Rappaport Hovav, M., Levin, B.: An event structure account of English resultatives. Language 77(4), 766–797 (2001)

  68. Ruppenhofer, J., Ellsworth, M., Petruck, M., Johnson, C., Scheffczyk, J.: FrameNet II: Extended Theory and Practice (2006). http://framenet.icsi.berkeley.edu/framenet

  69. Setzer, A.: Temporal information in newswire articles: an annotation scheme and corpus study. Ph.D. thesis, University of Sheffield, UK (2001)

  70. Sundheim, B.M.: Overview of results of the MUC-6 evaluation. In: Proceedings of a Workshop on Held at Vienna, Virginia: May 6–8, 1996, TIPSTER ’96, pp. 423–442. Association for Computational Linguistics (1996)

  71. Tenny, C.: The aspectual interface hypothesis. 31. Lexicon Project, Center for Cognitive Science, MIT (1989)

  72. Van Lambalgen, M., Hamm, F.: The Proper Treatment of Events, vol. 6. Wiley, New York (2008)

  73. Vendler, Z.: Verbs and times. Philos. Rev. 66, 143–160 (1957)

  74. Wiebe, J., Wilson, T., Cardie, C.: Annotating expressions of opinions and emotions in language. Lang. Resour. Eval. 39(2–3), 165–210 (2005)

  75. Wilson, G., Mani, I., Sundheim, B., Ferro, L.: A multilingual approach to annotating and extracting temporal information. In: Proceedings of the Workshop on Temporal and Spatial Information Processing. vol. 13, p. 12. Association for Computational Linguistics (2001)

Acknowledgements

We would like to express our thanks to the many people involved in the development of TimeML and ISO-TimeML. In particular, we would like to thank Kiyong Lee, Jessica Moszkowicz, Roser Saurí, Marc Verhagen, Bran Boguraev, Bob Knippen, Inderjeet Mani, Graham Katz, Rob Gaizauskas, Andrea Setzer, Jerry Hobbs, Ian Pratt-Hartmann, Drago Radev, Tommaso Caselli, and members of the ISO community, including Nancy Ide, Alex Chengyu Fang, Rainer Osswald, Haihua Pan, Yuzhen Cui, Manigo Kit, and Amanda Schiffrin.

Author information

Corresponding authors

Correspondence to James Pustejovsky or Annie Zaenen.

Copyright information

© 2017 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Pustejovsky, J., Bunt, H., Zaenen, A. (2017). Designing Annotation Schemes: From Theory to Model. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_2

  • DOI: https://doi.org/10.1007/978-94-024-0881-2_2

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-024-0879-9

  • Online ISBN: 978-94-024-0881-2

  • eBook Packages: Social Sciences (R0)
