ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Statistical machine translation from English to Slovene using Moses system

Sašo Kuntarič (2016) Statistical machine translation from English to Slovene using Moses system. EngD thesis.


    The aim of the thesis is to customise the Moses system for statistical machine translation from English to Slovenian. Machine translation is a field of computational linguistics that explores the use of software to translate text from one language to another. Factorised statistical translation is an extension of statistical machine translation in which linguistic tags are added at the word level: each word becomes a vector of factors in an attempt to improve translation quality. For the open-source machine translation system Moses, we created multiple factorised language and translation models from a language corpus containing IT-related texts. We translated two different IT-related documents: the first was marketing-oriented with a complex structure, while the second was technical and straightforward. We used two methods to compare the generated translations with two independent human translations and a translation produced by the Google Translate service. In the first comparison we used the BLEU algorithm; in the second, the translations were scored by human reviewers, whose subjective judgement is very important in the translation field. Finally, we calculated the inter-rater agreement and analysed the results. We found that our models were more suitable for technical texts, whereas switching to factorised models had a greater effect on complex texts.
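
    To illustrate the BLEU comparison mentioned in the abstract, the following is a minimal sketch of an unsmoothed sentence-level BLEU score with clipped n-gram precision and a brevity penalty. It is not the evaluation script used in the thesis. The function names `ngrams` and `bleu`, the whitespace tokenisation, and the example sentences are assumptions made for illustration only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU in [0, 1] (hypothetical helper).

    Assumes plain whitespace tokenisation, which is a simplification.
    """
    cand = candidate.split()
    refs = [r.split() for r in references]
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngrams(ref, n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if clipped == 0 or total == 0:
            # Without smoothing, one zero precision makes the whole score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty uses the reference length closest to the candidate length.
    ref_len = min((len(r) for r in refs), key=lambda l: (abs(l - len(cand)), l))
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    reference = "the system translates technical text well"
    print(bleu("the system translates technical text well", [reference]))
    print(bleu("the system translates technical", [reference]))
```

    Passing several references, for example two human translations, lets a candidate n-gram match any of them, which is why the clipping step takes the maximum count over all references.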

    Item Type: Thesis (EngD thesis)
    Keywords: statistical machine translation, factorised machine translation, Moses system, language corpus, language model, translation model, BLEU, human evaluation, Google Translate
    Number of Pages: 49
    Language of Content: Slovenian
    Mentor / Comentors:
        Name and Surname                      ID    Function
        izr. prof. dr. Marko Robnik Šikonja   276   Mentor
        doc. dr. Simon Krek                         Comentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537090499)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3456
    Date Deposited: 30 Aug 2016 11:43
    Last Modified: 09 Sep 2016 10:37
    URI: http://eprints.fri.uni-lj.si/id/eprint/3456
