Niko Colnerič (2013) Construction and exploration of networks of scientific papers. EngD thesis.
Abstract
We present an innovative approach to reviewing related scientific articles. Rather than showing articles in a list, we show related articles in a network, where edge width represents the semantic similarity of the connected articles. We define the similarity between two articles as the cosine similarity between the vectors that represent them. The vectors are obtained by a TF-IDF transformation of the article abstracts, which are lemmatized beforehand. Information about articles is gathered from the online repositories PubMed and CiteULike, and we use caching to speed up access. We also present a method for removing less important edges and the techniques used to visualize these networks with the D3 library. Our application is publicly accessible at http://butler.fri.uni-lj.si/articles.
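The similarity measure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the tokenizer, the TF-IDF weighting variant (relative term frequency times log(N / document frequency)), the function names, and the three sample abstracts are all assumptions made for illustration, and the lemmatization step is skipped.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase word tokens. The thesis lemmatizes abstracts first;
    # that step is omitted in this sketch.
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    # Assumed weighting: relative term frequency * log(N / document frequency).
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    # Cosine of the angle between two sparse vectors stored as dicts.
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical abstracts, invented for this example.
abstracts = [
    "gene expression in cancer cells",
    "cancer cells and gene regulation",
    "network visualization with force layout",
]
vectors = tfidf_vectors(abstracts)

# Edge weights for every pair of articles; a network view could map
# these weights to edge widths.
edges = {
    (i, j): cosine_similarity(vectors[i], vectors[j])
    for i in range(len(vectors))
    for j in range(i + 1, len(vectors))
}
```

In this sketch the first two abstracts share several terms and therefore get a positive edge weight, while the third shares none with the others and gets a weight of zero. Pairs with low weights are the natural candidates for edge removal, although the pruning method used in the thesis is not specified in the abstract.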