ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Behaviour of FreeViz algorithm in a high-dimensional space

Matija Polajnar (2009) Behaviour of FreeViz algorithm in a high-dimensional space. EngD thesis.

    Abstract

    FreeViz is a data mining method for local optimization of linear projections. In this thesis we examine the method's performance in the field of genetics. Genetic datasets usually contain many more attributes than instances; we first use linear algebra to predict the method's problems on such data. We then look for properties of datasets that influence the performance of FreeViz. The goal of the analysed method is to find good (informative) visualizations, so we estimate its performance by measuring the quality of a k-nearest neighbours (kNN) classifier on the projections it yields. The results confirm FreeViz's poor performance on genetic data, although it proved successful on one of the datasets used. In pursuit of dataset properties that influence the method's performance, we generated synthetic datasets. The results show that the ratio between attribute count and instance count has negligible influence. On the other hand, the quality of FreeViz projections is degraded when most of the attributes are redundant and improved when there are mutually correlated attributes. We also observed the paths that attribute projections trace during optimization, but found no rule to distinguish redundant attributes from the rest. When there is a large number of instances, FreeViz yields a projection that maps redundant attributes closer to the origin. That is not the case when there are more attributes than instances. However, in that case not even a nomogram for a naive Bayesian classifier can distinguish between informative and redundant attributes.
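
    The evaluation idea described in the abstract (score a 2D linear projection by how well a kNN classifier performs on the projected points) can be illustrated with a minimal sketch. This is not the thesis code and it does not implement FreeViz's optimization: the projection anchors are supplied by hand, and the function names (`make_dataset`, `project`, `knn_loo_accuracy`), the Gaussian data generator, the treatment of redundant attributes as pure noise, and the leave-one-out scoring are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def make_dataset(n_instances, n_informative, n_redundant, seed=0):
    """Hypothetical two-class generator: informative attributes are shifted
    by class; redundant attributes are modelled here as class-independent noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_instances):
        label = i % 2
        shift = 2.0 if label else -2.0
        row = [rng.gauss(shift, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_redundant)]
        X.append(row)
        y.append(label)
    return X, y


def project(X, anchors):
    """Linear projection to 2D: each attribute has a 2D anchor, and an
    instance maps to the sum of its attribute values times those anchors."""
    return [
        (
            sum(v * a[0] for v, a in zip(row, anchors)),
            sum(v * a[1] for v, a in zip(row, anchors)),
        )
        for row in X
    ]


def knn_loo_accuracy(points, labels, k=5):
    """Leave-one-out accuracy of a kNN classifier on the projected points."""
    correct = 0
    for i, p in enumerate(points):
        neighbours = sorted(
            (math.dist(p, q), labels[j])
            for j, q in enumerate(points)
            if j != i
        )
        votes = Counter(label for _, label in neighbours[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(points)


if __name__ == "__main__":
    # More attributes than instances, most of them redundant (noise).
    X, y = make_dataset(n_instances=40, n_informative=3, n_redundant=200)
    rng = random.Random(1)
    # Random anchors stand in for an unoptimized projection.
    anchors = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in X[0]]
    print("kNN accuracy on random projection:",
          knn_loo_accuracy(project(X, anchors), y))
```

    Under these assumptions, comparing the score for different anchor placements (for example, redundant attributes placed at the origin versus at random positions) gives a rough feel for why a projection that pulls redundant attributes toward the origin is more informative.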

    Item Type: Thesis (EngD thesis)
    Keywords: visualization, linear projection, FreeViz, genetics, redundancy, correlation, attribute importance
    Number of Pages: 45
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname | ID | Function
    doc. dr. Janez Demšar | 257 | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=7296340)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 905
    Date Deposited: 09 Sep 2009 15:54
    Last Modified: 13 Aug 2011 00:35
    URI: http://eprints.fri.uni-lj.si/id/eprint/905
