ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Reliability estimation for prediction of effects of small molecules

Rok Močnik (2011) Reliability estimation for prediction of effects of small molecules. EngD thesis.


    Abstract

    Today, in all areas of human activity, we gather enormous amounts of data, more than ever before. This data hides important information and knowledge, but its volume overwhelms human capabilities, so we develop computer systems to help us analyze it. Much of the essential research in this field is done within machine learning, a subfield of artificial intelligence. Machine learning often deals with predicting the classes of examples described by attribute values. The success of predictions is usually estimated over a whole test set. In this thesis we ask whether, for a specific example, we can estimate if the prediction for that example is accurate or not. To address this problem, we implemented a set of already known methods for reliability estimation of specific examples. The methods were implemented within Orange, a Python-based data mining suite, and were tested on data about quantitative structure-activity relationships. Among all the methods tested, the best-performing was the approach that selects the technique for reliability estimation based on internal cross-validation. This method has a flaw, however: on bigger datasets it is computationally very demanding. In search of an efficient solution we proposed a new method for reliability estimation that works only in association with random forests. The method uses the variance within the random forest's prediction for a specific example as the reliability estimate. This approach is quick, because it adds nothing to the random forest except the calculation of the variance. It also shows good results on bigger datasets.
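
The variance-based idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation and does not use Orange. It assumes numeric predictions and that "variance inside a prediction" means the variance of the individual trees' outputs for one example. The function name `forest_prediction_with_variance` and the per-tree numbers are hypothetical.

```python
from statistics import fmean, pvariance


def forest_prediction_with_variance(tree_predictions):
    """Return (mean prediction, variance across trees) for one example.

    tree_predictions holds one numeric output per tree in the forest.
    A low variance means the trees agree, which this sketch treats as a
    sign of a more reliable prediction (an assumption, see lead-in).
    """
    if not tree_predictions:
        raise ValueError("need at least one tree prediction")
    return fmean(tree_predictions), pvariance(tree_predictions)


# Hypothetical per-tree outputs for two examples (made-up numbers).
agreeing = [2.0, 2.0, 2.0, 2.0]
disagreeing = [0.0, 4.0, 0.0, 4.0]

mean_a, var_a = forest_prediction_with_variance(agreeing)
mean_d, var_d = forest_prediction_with_variance(disagreeing)
```

Both examples receive the same averaged prediction, but the second has a much larger variance across trees, so under this reading it would be flagged as the less reliable one. The only extra work beyond the forest's own prediction is the variance calculation.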

    Item Type: Thesis (EngD thesis)
    Keywords: machine learning, reliability estimation, quantitative structure-activity relationship, random forest
    Number of Pages: 42
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname        ID     Function
    prof. dr. Blaž Zupan    106    Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=00008608340)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 1502
    Date Deposited: 15 Sep 2011 10:40
    Last Modified: 20 Sep 2011 13:15
    URI: http://eprints.fri.uni-lj.si/id/eprint/1502
