
Acquiring verb classes through bottom-up semantic verb clustering

Accepted version
Peer-reviewed

Type

Conference Object

Authors

McCarthy, D 
Vulić, I 
Korhonen, A 

Abstract

In this paper, we present the first analysis of bottom-up manual semantic clustering of verbs in three languages: English, Polish and Croatian. Verb classes that include syntactic and semantic information have been shown to support many NLP tasks by allowing abstraction from individual words, thereby alleviating data sparseness. However, such classifications are still non-existent or limited for most languages. While a range of automatic verb classification approaches has been proposed, high-quality resources and gold standards are needed for evaluation and to improve the performance of NLP systems. We investigate whether semantic verb classes in three different languages can be reliably obtained from native speakers without linguistics training. The analysis of inter-annotator agreement shows an encouraging degree of overlap in the classifications produced for each language individually, as well as across all three languages. Comparative examination of the resulting classifications provides insights into cross-linguistic semantic commonalities and patterns of ambiguity.

Keywords

verb classes, semantic clustering, multilingual NLP

Journal Title

LREC 2018 - 11th International Conference on Language Resources and Evaluation

Conference Name

Language Resources and Evaluation Conference

Sponsorship

European Research Council (648909)