Primož Kariž (2015) Searching nearest neighbours in high dimensional spaces. MSc thesis.
Abstract
Nearest-neighbour search is used in many different problems, so it is important to be able to find nearest neighbours quickly. When searching in high-dimensional spaces we have to settle for approximate nearest neighbours, because no fast exact methods exist. In this master's thesis we describe some well-known exact and approximate methods for nearest-neighbour search. The exact methods described are the R-tree, R*-tree, KD-tree, M-tree, PM-tree and ball-tree; the approximate ones are the RKD-tree, LSH, hierarchical k-means and boundary forest. We implemented some of them ourselves and took the others from existing libraries. We present and analyse the search results in terms of the methods' speed, precision and memory requirements. We developed a Python library that includes the described methods and provides a simple and consistent API. The library can also automatically select the most suitable algorithm for a given dataset, using two decision trees built from an analysis of the results.
Item Type: | Thesis (MSc thesis) |
Keywords: | algorithms, data structures, nearest neighbours search, approximate nearest neighbours, high-dimensional space, R-tree, R*-tree, M-tree, PM-tree, ball-tree, KD-tree, RKD-tree, LSH, hierarchical k-means |
Number of Pages: | 126 |
Language of Content: | Slovenian |
Mentor / Comentors: | Name and Surname: Assoc. Prof. Dr. Marko Robnik Šikonja; ID: 276; Function: Mentor |
Link to COBISS: | http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536276675) |
Institution: | University of Ljubljana |
Department: | Faculty of Computer and Information Science |
Item ID: | 2974 |
Date Deposited: | 02 Apr 2015 17:39 |
Last Modified: | 16 Apr 2015 11:18 |
URI: | http://eprints.fri.uni-lj.si/id/eprint/2974 |