Neural semantic role labeling with more or less supervision
Abstract
In recent years, thanks to the relative maturity of neural network models, the task of
automatically identifying and labeling semantic roles has been the focus of renewed interest. These models have the capacity to learn continuous representations
automatically and thereby forgo the need for extensive feature engineering. Semantic
role labeling (SRL) has generally been recognized as a core task in natural language
processing (NLP) and has been shown to benefit a range of NLP applications such as
machine translation, information extraction and summarization.
Recent SRL systems have usually been trained on datasets whose semantic role
annotations have been produced on top of treebanked corpora. This reflects the
intimate relationship between syntactic information and semantic roles. In order to
effectively incorporate syntactic information into neural network models, we train the
semantic role labeler jointly with two auxiliary tasks: predicting the dependency label
of a word, and determining whether there exists an arc linking it to the predicate. The
auxiliary tasks provide syntactic information that is specific to SRL and can be learnt
from training data (dependency annotations). This liberates our SRL system from the
dependence on external parsers, whose output is often noisy (e.g., on out-of-domain data or infrequent constructions).
Supervised neural SRL models are data-driven: their effectiveness depends on the availability of sufficient annotated data. The reliance on high-quality annotations, however, hinders the development of SRL systems in low-resource scenarios (e.g., rare languages
or domains). To reduce the annotation effort involved, we make semi-supervised learning for SRL as simple as possible. More specifically, we propose an end-to-end SRL model and demonstrate that it can effectively leverage unlabeled data
within the cross-view training modeling paradigm. Our semantic role labeler is jointly
trained with auxiliary tasks that support SRL. Consequently, our system can be applied directly to plain text and is essentially self-sufficient.
For truly low-resource languages, even semi-supervised learning is not an option, as SRL annotations are available for only a handful of the world's languages. To build a competitive semantic role labeler for such languages, we resort to cross-lingual semantic role labeling, which transfers supervision from a source language to low-resource target languages. The
backbone of our model is an LSTM-based semantic role labeler jointly trained with a
semantic role compressor and multilingual word embeddings. The compressor collects useful information from the output of the semantic role labeler and compresses it into fixed-size cross-lingual representations. In contrast to earlier efforts, which relied on automatic alignments to transfer annotations, our model operates in a shared multilingual embedding space. For the target language, moreover, it thus affords direct supervision
for the prediction of semantic roles. For model evaluation, we have also contributed
two quality-controlled datasets, which we hope will be useful for the development of
cross-lingual models.