Andrej Jančič (2016) Query Optimization in ElasticSearch. EngD thesis.
Abstract
In this graduation thesis, I present the Elasticsearch database, its origins, and its position on the software market in terms of use cases. I give a short overview of real-world examples of use and briefly examine trends in its popularity compared with related products and the market as a whole. Based on my own experience, the literature, the official documentation, and the experience of other users, I examine cases that caused the database to operate problematically. For each case, I assess whether solving the problem with automatic query optimisation is possible and advisable. I also examine each case historically, since Elasticsearch has changed significantly in the last year. I conclude that automatic query optimisation is not advisable, since the Elasticsearch developers have solved most of the cases through architectural changes, internal optimisation, and a change to the query language that removes ambiguity from how users express queries. I establish that the most important feature of a well-functioning cluster is properly sized shards, a property which cannot easily be changed. Determining the right size therefore requires experimental planning, which I also describe. Beyond optimal shard size, I describe several bad practices that can bring down a cluster and that users should therefore know about.