ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Part of speech tagging of Slovene language using deep neural networks

Primož Belej (2018) Part of speech tagging of Slovene language using deep neural networks. MSc thesis.

    The thesis addresses part-of-speech tagging of the Slovene language. Part-of-speech tagging assigns each word of a natural-language sentence a tag that encodes its part of speech and its morphological properties. Unlike typical solutions, which process input sentences as sequences of words, our solution uses a character-level representation of words. The tagger is implemented with convolutional and recurrent neural networks. Whereas common approaches treat the problem as multi-class classification, we formulate it as multi-label classification. To further improve the results, we build an ensemble of three part-of-speech taggers. In a comparison with existing solutions, the proposed tagger achieves the best results.
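The difference between the multi-class and the multi-label formulation, and the idea of combining three taggers, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the reduced attribute inventory, the function names, the averaging rule, and the 0.5 threshold are hypothetical and are not taken from the thesis, which may encode tags and combine models differently.

```python
from typing import Dict, List

# Hypothetical, heavily reduced attribute inventory (NOT the tagset of the thesis).
ATTRIBUTES: Dict[str, List[str]] = {
    "pos": ["Noun", "Verb", "Adjective"],
    "gender": ["Masculine", "Feminine", "Neuter"],
    "number": ["Singular", "Dual", "Plural"],
}

# Multi-label view: one output label per (attribute, value) pair,
# instead of one class per complete tag as in multi-class tagging.
LABELS: List[str] = [
    f"{attr}={val}" for attr, vals in ATTRIBUTES.items() for val in vals
]


def encode_multi_label(tag: Dict[str, str]) -> List[int]:
    """Turn a tag such as {"pos": "Noun", ...} into a binary target vector."""
    active = {f"{attr}={val}" for attr, val in tag.items()}
    return [1 if label in active else 0 for label in LABELS]


def ensemble_average(predictions: List[List[float]]) -> List[float]:
    """Average per-label probabilities over several taggers (assumed rule)."""
    n = len(predictions)
    return [sum(column) / n for column in zip(*predictions)]


def decode(probabilities: List[float], threshold: float = 0.5) -> List[str]:
    """Keep every label whose probability reaches the (assumed) threshold."""
    return [label for label, p in zip(LABELS, probabilities) if p >= threshold]


# Made-up per-label probabilities from three hypothetical taggers for one word.
tagger_outputs = [
    [0.9, 0.1, 0.0, 0.2, 0.8, 0.0, 0.7, 0.2, 0.1],
    [0.8, 0.1, 0.1, 0.1, 0.9, 0.0, 0.4, 0.5, 0.1],
    [0.7, 0.2, 0.1, 0.3, 0.6, 0.1, 0.6, 0.3, 0.1],
]

target = encode_multi_label(
    {"pos": "Noun", "gender": "Feminine", "number": "Singular"}
)
combined = decode(ensemble_average(tagger_outputs))
```

In this toy input the second tagger alone would pick `number=Dual`, while the averaged ensemble recovers `number=Singular`, which is the kind of error an ensemble is meant to smooth out.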

    Item Type: Thesis (MSc thesis)
    Keywords: machine learning, part-of-speech tagging, deep learning, convolutional neural networks, recurrent neural networks, ensemble classifiers
    Number of Pages: 50
    Language of Content: Slovenian
    Mentor / Comentors:
        Name and Surname: prof. dr. Marko Robnik Šikonja; ID: 276; Function: Mentor
        Name and Surname: dr. Simon Krek; Function: Comentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1538047171)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 4313
    Date Deposited: 09 Nov 2018 14:28
    Last Modified: 23 Nov 2018 12:45
    URI: http://eprints.fri.uni-lj.si/id/eprint/4313
