ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Learning of text-level discourse parsing

Gregor Weiss (2019) Learning of text-level discourse parsing. PhD thesis.

Full text: PDF, 691 kB

    Abstract

    Understanding the sense of discourse relations that hold between segments of text is essential to truly comprehend any natural language text. Several automated approaches have been proposed, but all rely on external resources and linguistic feature engineering, and their processing pipelines are built from substantially different models. Instead of designing a system for a specific language and task, we pursue a language-independent approach to sense classification of shallow discourse relations. In this dissertation we first present our focused recurrent neural networks (focused RNNs) layer, the first multi-dimensional RNN-attention mechanism for constructing sentence/argument embeddings. It consists of a filtering RNN with a gating mechanism that enables downstream RNNs to focus on different aspects of each argument of a discourse relation and project it into several embedding subspaces. On top of this mechanism we build the FR system, a novel method for sense classification of shallow discourse relations. In contrast to existing systems, the FR system consists of a single end-to-end trainable model that handles all types and specific situations of discourse relations, requires no feature engineering or external resources, can be used almost out-of-the-box on any language or set of sense labels, and can be applied to word-level or character-level representations. We evaluate the FR system using the official datasets and methodology of the CoNLL 2016 Shared Task. It falls only slightly behind state-of-the-art performance on English, and it outperforms other systems without a focused RNNs layer by 8% on the Chinese dataset. We then perform a detailed analysis on both languages.
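
    The following minimal sketch (Python/PyTorch; the class and parameter names are illustrative assumptions, not taken from the thesis) outlines how a focused RNNs layer of the kind described above could be structured: a filtering RNN emits one gate per focus at every time step, and each downstream RNN reads its gated copy of the argument and produces one embedding subspace.

    import torch
    import torch.nn as nn

    class FocusedRNNs(nn.Module):
        """Sketch of a focused RNNs layer (hypothetical names and sizes)."""
        def __init__(self, emb_dim, hidden_dim, num_focuses):
            super().__init__()
            # filtering RNN emits one gate value per focus at each time step
            self.filter_rnn = nn.GRU(emb_dim, num_focuses, batch_first=True)
            # one downstream RNN per focus, each yielding one embedding subspace
            self.focus_rnns = nn.ModuleList(
                [nn.GRU(emb_dim, hidden_dim, batch_first=True)
                 for _ in range(num_focuses)]
            )

        def forward(self, x):
            # x: (batch, seq_len, emb_dim) -- word- or character-level embeddings
            gates, _ = self.filter_rnn(x)              # (batch, seq_len, num_focuses)
            gates = torch.sigmoid(gates)
            subspaces = []
            for k, rnn in enumerate(self.focus_rnns):
                gated = x * gates[..., k:k + 1]        # emphasize one aspect of the argument
                _, h = rnn(gated)                      # final hidden state of this focus
                subspaces.append(h[-1])                # (batch, hidden_dim)
            # concatenating all subspaces yields the argument embedding
            return torch.cat(subspaces, dim=-1)        # (batch, num_focuses * hidden_dim)

    In the FR system described in the abstract, one such argument embedding would be computed per argument of a discourse relation and fed to a classifier over the sense labels; the exact layer sizes and classifier are choices of the thesis, not shown here.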

    Item Type: Thesis (PhD thesis)
    Keywords: natural language processing, shallow discourse relations, recurrent neural networks, attention mechanisms, language-independent, no external resources
    Number of Pages: 104
    Language of Content: English
    Mentor / Comentors: prof. dr. Marko Bajec (ID 245), Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1538240707)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 4430
    Date Deposited: 17 May 2019 09:34
    Last Modified: 15 Jul 2019 13:01
    URI: http://eprints.fri.uni-lj.si/id/eprint/4430
