ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Network structural properties and their application to missing property prediction

Klemen Simonič (2013) Network structural properties and their application to missing property prediction. EngD thesis.


    Abstract

    The volume of available structured data is increasing, particularly in the form of Linked Data, where relationships between individual pieces of data are encoded by a graph-like structure. Despite the growing scale of the data, the use and applicability of these resources are currently limited by mistakes and omissions in the linking data. In this diploma thesis, we look at the problem of predicting potential instance properties (types of relations). Given a specific query node in our multigraph dataset, can we correctly rank possibly omitted properties? We propose a method that leverages properties from similar nodes in our dataset. To find similar nodes, we define various network structural properties, which induce dissimilarities between nodes. These structural properties are based on either local or global processing of the underlying network. Because their complexity varies widely, special treatment is needed when dealing with networks containing hundreds of millions of nodes and edges. In our tool LODminer, we use weighted averages of property frequency vectors over a set of similar nodes to determine the most likely missing instance property. We investigate the performance of different dissimilarities and compare them to several other methods on three large-scale datasets, two based on DBpedia and one based on Freebase.

    Mathematics Subject Classification [MSC2010]: 68R10 [Graph theory], 68T30 [Knowledge representation], 05C82 [Small world graphs, complex networks], 91D30 [Social networks].

    CCS Categories and Subject Descriptors [1998 system]: G.2.2 [Graph Theory], I.2.4 [Knowledge Representation Formalisms and Methods]: Semantic networks, H.2 [Database Management]: Database Applications – Data Mining.
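
    The ranking step described in the abstract (a weighted average of property frequency vectors over a set of similar nodes) can be illustrated with a minimal sketch. This is not LODminer's code: the function names, the toy data, the choice of k nearest nodes, and the weight 1 / (1 + dissimilarity) are all assumptions made for illustration, and a plain Jaccard distance on property sets stands in for the network structural dissimilarities defined in the thesis.

```python
from collections import Counter


def jaccard_dissimilarity(a, b):
    """Stand-in dissimilarity (an assumption): Jaccard distance between
    the sets of properties two nodes carry."""
    union = set(a) | set(b)
    if not union:
        return 0.0
    return 1.0 - len(set(a) & set(b)) / len(union)


def rank_missing_properties(query, node_props, k=2):
    """Rank properties absent from `query` by the weighted average of the
    property frequency vectors of its k least dissimilar nodes."""
    def dist(n):
        return jaccard_dissimilarity(node_props[query], node_props[n])

    others = [n for n in node_props if n != query]
    neighbours = sorted(others, key=dist)[:k]

    scores = Counter()
    total_weight = 0.0
    for n in neighbours:
        weight = 1.0 / (1.0 + dist(n))  # hypothetical weighting scheme
        total_weight += weight
        for prop, freq in node_props[n].items():
            scores[prop] += weight * freq

    known = node_props[query]
    candidates = [(p, s / total_weight) for p, s in scores.items()
                  if p not in known]
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(candidates, key=lambda ps: (-ps[1], ps[0]))


# Toy multigraph summary: node -> frequency of each property (edge label).
node_props = {
    "Ljubljana": Counter({"country": 1, "population": 1}),
    "Maribor": Counter({"country": 1, "population": 1, "mayor": 1}),
    "Celje": Counter({"country": 1, "mayor": 1, "area": 1}),
    "Sava": Counter({"length": 1, "mouth": 1}),
}

ranked = rank_missing_properties("Ljubljana", node_props, k=2)
for prop, score in ranked:
    print(f"{prop}: {score:.3f}")
```

    In this toy example the two nodes closest to the query both carry "mayor", so it is ranked above "area", which only the more distant neighbour carries. Swapping in a different dissimilarity only requires replacing jaccard_dissimilarity.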

    Item Type: Thesis (EngD thesis)
    Keywords: Linked Data, Graph Mining, Network, Structural Properties, Missing Properties, Prediction.
    Number of Pages: 55
    Language of Content: English
    Mentor / Comentors:
    Name and Surname                ID      Function
    prof. dr. Vladimir Batagelj     1093    Mentor
    znan. sod. dr. Primož Škraba            Comentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=9896020)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 2035
    Date Deposited: 06 May 2013 15:46
    Last Modified: 05 Jun 2013 14:43
    URI: http://eprints.fri.uni-lj.si/id/eprint/2035
