ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Correcting comma placement in Slovene language with LanguageTool

Karin Piškur (2015) Correcting comma placement in Slovene language with LanguageTool. EngD thesis.


    The aim of this thesis is to add comma usage rules to the LanguageTool program. Using the Lektor corpus, we examined which comma rules cause the most problems in written Slovene. Based on these results, we analyzed the rules for comma usage before the conjunctions »and«, »or« and »that« as given in Slovenian Orthography 2001. We then tried to implement comma placement rules for the open-source program LanguageTool, which can be used as a stand-alone desktop application, through a web interface, or within the open-source office suites LibreOffice and OpenOffice. Some of the rules were implemented successfully. Implementing all of them would require a part-of-speech tagger, which LanguageTool does not yet include for Slovene. We evaluated the rules in terms of their accuracy, applicability and user experience.

    Item Type: Thesis (EngD thesis)
    Keywords: LanguageTool, Slovenian language, comma, punctuation mark, Lektor corpus
    Number of Pages: 50
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname | ID | Function
    izr. prof. dr. Marko Robnik Šikonja | 276 | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536464067)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3053
    Date Deposited: 08 Sep 2015 13:24
    Last Modified: 15 Sep 2015 10:53
    URI: http://eprints.fri.uni-lj.si/id/eprint/3053
