ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Data schemes integration with algorithms for data summarization via archetypal analysis

Anton Zvonko Gazvoda (2014) Data schemes integration with algorithms for data summarization via archetypal analysis. MSc thesis.

    Abstract

    Schema mapping discovery is a key activity in the data-level integration process and forms the basis for proper data transformation. For this purpose, we introduce a novel instance-based schema matching method that uses archetypal analysis to generate a data summary for each schema element. Summary approximations are represented by convex hulls. We define several approaches for transforming data to a vector space, as well as several summary-similarity metrics. Two algorithms were developed to determine simple and complex matches. We evaluated the method on test data with known correct mappings between schemas and compared it with the COMA CE schema matcher. The method achieved a sensitivity of 91%, specificity of 75%, accuracy of 87% and precision of 91%. Compared with COMA CE, our method performs on average 20% better.
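    The four evaluation measures named in the abstract have standard definitions in terms of a confusion matrix over candidate matches. The following is a minimal sketch of those definitions only, not the thesis's evaluation code; the names `MatchCounts`, `ratio` and `evaluate` and the example counts are hypothetical and are not taken from the thesis.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Confusion-matrix counts for proposed schema element matches (hypothetical container)."""
    tp: int  # proposed matches that are correct
    fp: int  # proposed matches that are incorrect
    tn: int  # non-matches correctly left unmatched
    fn: int  # correct matches that were missed


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def evaluate(c: MatchCounts) -> dict:
    """Compute the standard measures from confusion-matrix counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),  # share of true matches found
        "specificity": ratio(c.tn, c.tn + c.fp),  # share of non-matches rejected
        "accuracy": ratio(c.tp + c.tn, total),    # share of all decisions that are correct
        "precision": ratio(c.tp, c.tp + c.fp),    # share of proposed matches that are correct
    }


if __name__ == "__main__":
    # Illustrative counts only; these are NOT results from the thesis.
    example = MatchCounts(tp=8, fp=2, tn=6, fn=2)
    for name, value in evaluate(example).items():
        print(f"{name}: {value:.2f}")
```

    Returning NaN for an empty denominator is one possible design choice; it keeps a degenerate test set (for example, one with no true matches) from raising an exception while still flagging the measure as undefined.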

    Item Type: Thesis (MSc thesis)
    Keywords: instance based schema matching, schema mapping, archetypal analysis, convex hull, data summary
    Number of Pages: 91
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname | ID | Function
    red. prof. dr. Matjaž Branko Jurič | | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536018115 )
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 2730
    Date Deposited: 19 Sep 2014 15:45
    Last Modified: 06 Nov 2014 13:54
    URI: http://eprints.fri.uni-lj.si/id/eprint/2730
