ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Cross–lingual mappings of contextual word embedding ELMo

Ljupche Milosheski (2019) Cross–lingual mappings of contextual word embedding ELMo. EngD thesis.

    Abstract

    To work with textual data, machine learning algorithms, and neural networks in particular, require word embeddings: vector representations of words in a high-dimensional space. Some languages have only a small amount of available resources. Cross-lingual embeddings make it possible to exploit knowledge from well-resourced languages for under-resourced ones by aligning the embeddings of one language with the vector space of another. Existing methods for aligning embeddings are intended for context-independent embeddings, where every word has a single representation. We propose a method, based on a dictionary and a parallel corpus, that aligns contextual embeddings, which capture more information about the context in which words appear. The proposed method requires a small amount of bilingual data, which is available for many language pairs. We empirically show that the proposed method outperforms a baseline obtained by aligning context-independent embeddings.

    Item Type: Thesis (EngD thesis)
    Keywords: cross-lingual word embeddings, contextual word embeddings, vector word embeddings, word translation, parallel corpus, vector space mappings, singular value decomposition
    Number of Pages: 42
    Language of Content: English
    Mentor / Comentors: prof. dr. Marko Robnik Šikonja (ID: 276, Function: Mentor)
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1538288835)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 4444
    Date Deposited: 23 Jul 2019 15:04
    Last Modified: 08 Aug 2019 10:50
    URI: http://eprints.fri.uni-lj.si/id/eprint/4444
