ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Attribute scoring based on performance of a learning algorithm on samples of attribute space

Gregor Weiss (2011) Attribute scoring based on performance of a learning algorithm on samples of attribute space. EngD thesis.

    In machine learning and knowledge discovery in databases, attributes (features) play a central role, so it is reasonable to question their quality and importance for a given problem. Because this is a difficult problem in general, the thesis focuses on developing a new method for estimating attribute importance. The method samples the attribute space, evaluates the performance of machine learning algorithms on those samples, and infers the importance of individual attributes from the resulting scores. More specifically, different combinations of attributes are first chosen and smaller data sets containing only those attributes are prepared; on each of them, a testing procedure with sampling estimates the performance of an arbitrarily chosen learning algorithm. The performance estimates are then statistically processed for each attribute according to its presence in the combinations and combined by a given formula into a final score for each attribute. To determine how well different variants of the new method work, an appropriate experimental methodology and many diverse data sets were prepared. Some of the successful variants were also tested in more detail to reinforce the conclusion that certain variants of the new method are statistically significantly better than conventional, widely used methods for this problem; unfortunately, an improved version of the best of those conventional methods still seems to perform better. The thesis concludes with a discussion of the results and various ideas for further work, improvements and applications of the method.
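
    The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation. The 1-nearest-neighbour learner, the 5-fold testing procedure, the toy data, and in particular the scoring formula (mean performance of subsets containing an attribute minus mean performance of subsets without it) are assumptions made for illustration; the abstract does not specify them. All function names are hypothetical.

```python
import random
from statistics import mean


def make_toy_data(n=120, n_attrs=4, seed=1):
    """Synthetic two-class data (an assumption): only attribute 0 is informative."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [rng.gauss(0, 1) for _ in range(n_attrs)]
        x[0] += 3 * label  # shift attribute 0 by class; the rest stay pure noise
        data.append((x, label))
    return data


def knn1_accuracy(train, test, attrs):
    """Accuracy of a 1-nearest-neighbour classifier that sees only `attrs`."""
    correct = 0
    for x, y in test:
        nearest = min(
            train,
            key=lambda row: sum((row[0][a] - x[a]) ** 2 for a in attrs),
        )
        correct += nearest[1] == y
    return correct / len(test)


def cross_validated_accuracy(data, attrs, folds=5):
    """Testing procedure (assumed here to be k-fold cross-validation)."""
    scores = []
    for k in range(folds):
        test = data[k::folds]
        train = [row for i, row in enumerate(data) if i % folds != k]
        scores.append(knn1_accuracy(train, test, attrs))
    return mean(scores)


def score_attributes(data, n_attrs, n_samples=60, subset_size=2, seed=0):
    """Sample attribute subsets, evaluate the learner, aggregate per attribute."""
    rng = random.Random(seed)
    with_attr = [[] for _ in range(n_attrs)]
    without_attr = [[] for _ in range(n_attrs)]
    for _ in range(n_samples):
        attrs = rng.sample(range(n_attrs), subset_size)
        perf = cross_validated_accuracy(data, attrs)
        for a in range(n_attrs):
            (with_attr[a] if a in attrs else without_attr[a]).append(perf)
    # Assumed scoring formula: mean performance with the attribute present
    # minus mean performance with it absent (0.0 if either group is empty).
    return [
        mean(w) - mean(wo) if w and wo else 0.0
        for w, wo in zip(with_attr, without_attr)
    ]


if __name__ == "__main__":
    scores = score_attributes(make_toy_data(), n_attrs=4)
    for index, score in enumerate(scores):
        print(f"attribute {index}: {score:+.3f}")
```

    Because the learner is only called through `cross_validated_accuracy`, any other learning algorithm or performance measure could be substituted there without changing the sampling and aggregation steps.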

    Item Type: Thesis (EngD thesis)
    Keywords: supervised machine learning, estimating attribute importance, sampling of attribute space, generalized wrapper method on arbitrary learning algorithms
    Number of Pages: 78
    Language of Content: Slovenian
    Mentor / Comentors:
        Name and Surname: prof. dr. Blaž Zupan
        ID: 106
        Function: Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=00008300628)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 1319
    Date Deposited: 29 Mar 2011 14:07
    Last Modified: 13 Aug 2011 00:38
    URI: http://eprints.fri.uni-lj.si/id/eprint/1319
