Vid Kocijan (2017) Vertex embeddings in linear space complexity. Prešeren awards for students.
Abstract
In order to predict the behaviour of networks with machine-learning algorithms, the vector representation of nodes in a low dimensional vector space is required. The current state-of-the-art algorithm for the calculation of node embeddings in vector space is Node2vec. Node2vec samples the network through the 2nd order random walks. Unfortunately, Node2vec has a high memory complexity due to the preprocessed probability-distribution tables. Due to high memory complexity, an average user is unable to use it for larger networks. In this thesis, we present a heuristic approach to the random walk simulation. The heuristic approach replaces probability tables with binary trees and guarantees linear time and space complexity, while retaining the quality of computed features. The heuristic approach requires from 6 up to 40 times less memory than Node2vec on tested datasets.
Actions (login required)