ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Efficient full-text search in general-purpose database systems

Blaž Štempelj (2016) Efficient full-text search in general-purpose database systems. EngD thesis.


    The goal of this thesis is to review and evaluate the options that database management systems offer for working with natural-language texts. In the first part we describe the Slovenian corpora ccKres and ccGigafida, the database structure of MariaDB, PostgreSQL and MongoDB, and their use of full-text indexes. Support for the Slovenian language is still limited: MariaDB supports only stop words, while MongoDB does not support even those. With some additional work, PostgreSQL allows us to define custom configurations that enable the use of lexemes and give more finely tuned results. In the second part of the thesis we test the performance of each DBMS on collocation searches. The results are presented in tables and graphs, and they show that MongoDB is the best choice for collocation.

    Item Type: Thesis (EngD thesis)
    Keywords: MySQL, MariaDB, PostgreSQL, MongoDB, full-text search
    Number of Pages: 85
    Language of Content: Slovenian
    Mentor / Co-mentors:
    Name and Surname | ID | Function
    Assoc. Prof. Dr. Matjaž Kukar | 267 | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536906691)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3323
    Date Deposited: 14 Apr 2016 16:23
    Last Modified: 06 May 2016 13:44
    URI: http://eprints.fri.uni-lj.si/id/eprint/3323
