Neural semantic role labeling with more or less supervision
Abstract
In recent years, thanks to the relative maturity of neural network models, the task of
automatically identifying and labeling semantic roles has been the focus of renewed interest. These models have the capacity to learn continuous representations
automatically and thereby forgo the need for extensive feature engineering. Semantic
role labeling (SRL) has generally been recognized as a core task in natural language
processing (NLP) and has been shown to benefit a range of NLP applications such as
machine translation, information extraction and summarization.
Recent SRL systems have usually been trained on datasets whose semantic role
annotations have been produced on top of treebanked corpora. This reflects the
intimate relationship between syntactic information and semantic roles. In order to
effectively incorporate syntactic information into neural network models, we train the
semantic role labeler jointly with two auxiliary tasks: predicting the dependency label
of a word, and determining whether there exists an arc linking it to the predicate. The
auxiliary tasks provide syntactic information that is specific to SRL and can be learnt
from training data (dependency annotations). This liberates our SRL system from the
dependence on external parsers, whose output is often noisy (e.g., on out-of-domain data or infrequent constructions).
Supervised neural SRL models are data-driven: their effectiveness depends on the availability of sufficient annotated data. The reliance on high-quality annotations, however, hinders the development of SRL systems in low-resource scenarios (e.g., rare languages
or domains). To reduce the annotation effort involved, we make semi-supervised learning for SRL as simple as possible. More specifically, we propose an end-to-end SRL model and demonstrate that it can effectively leverage unlabeled data
within the cross-view training modeling paradigm. Our semantic role labeler is jointly
trained with auxiliary tasks that support SRL. Consequently, our system can be applied directly to plain text and is essentially self-sufficient.
For truly low-resource languages, even semi-supervised learning is not an option, as SRL annotations are available for only a handful of the world's languages. To build a competitive semantic role labeler for such languages, we resort to cross-lingual semantic role labeling, which transfers supervision from a source language to low-resource target languages. The
backbone of our model is an LSTM-based semantic role labeler jointly trained with a
semantic role compressor and multilingual word embeddings. The compressor collects useful information from the output of the semantic role labeler and compresses it into fixed-size cross-lingual representations. In contrast to earlier efforts, which relied on automatic alignments to transfer annotations, our model operates in a shared multilingual embedding space. For the target language, moreover, it thus affords direct supervision
for the prediction of semantic roles. For model evaluation, we have also contributed
two quality-controlled datasets, which we hope will be useful for the development of
cross-lingual models.