ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Decision Tree Ensemble Selection

Darjan Oblak (2016) Decision Tree Ensemble Selection. EngD thesis.



    Ensemble models are well known in machine learning for their accuracy. Their main strength, convergence of accuracy towards an asymptotic upper limit as the number of member models increases, is partly offset by their large size. Existing studies show that the number of models in an ensemble can be reduced after training without hurting, and sometimes even while improving, the ensemble's accuracy. The thesis introduces two new approaches to ensemble selection that use the so-called out-of-bag set, that is, the training examples left out of the bootstrap sample on which a given model was trained. Using such a selection set matters for small training sets, where no data should be held out from learning if the ensemble is to keep high generalization accuracy. Both methods are evaluated on 34 datasets for bagging, random forests and extremely randomized trees (extra trees). Some of the comparisons show that the selected ensemble outperforms the base ensemble method by a statistically significant margin. The others confirm that the methods can reduce ensemble size while, on average, maintaining accuracy.
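To make the idea of selecting on the out-of-bag set concrete, here is a minimal sketch. It is not one of the two methods proposed in the thesis, which this record does not describe. It assumes a generic greedy forward selection, uses decision stumps in place of full trees, and runs on synthetic data. All function names (`fit_stump`, `bagging_with_oob`, `oob_accuracy`, `greedy_select`) are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y, idx):
    """Fit a one-feature threshold classifier (a depth-1 tree) on rows idx."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({X[i][f] for i in idx}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if X[i][f] <= t else right) != y[i] for i in idx)
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    _, f, t, left, right = best
    return lambda row: left if row[f] <= t else right

def bagging_with_oob(X, y, n_models, rng):
    """Train stumps on bootstrap samples; remember each model's out-of-bag rows."""
    n = len(X)
    models = []
    for _ in range(n_models):
        sample = [rng.randrange(n) for _ in range(n)]
        oob = set(range(n)) - set(sample)
        models.append((fit_stump(X, y, sample), oob))
    return models

def oob_accuracy(models, chosen, X, y):
    """Majority-vote accuracy where a row is voted on only by chosen models
    that did not train on it. Assumption of this sketch: rows that no chosen
    model can vote on count as errors."""
    correct = 0
    for i in range(len(X)):
        votes = Counter(models[m][0](X[i]) for m in chosen if i in models[m][1])
        if votes and votes.most_common(1)[0][0] == y[i]:
            correct += 1
    return correct / len(X)

def greedy_select(models, X, y):
    """Add the model that most improves out-of-bag accuracy; stop when none does."""
    chosen, best_acc = [], 0.0
    remaining = set(range(len(models)))
    while remaining:
        scored = [(oob_accuracy(models, chosen + [m], X, y), m) for m in sorted(remaining)]
        acc, cand = max(scored, key=lambda pair: pair[0])
        if acc <= best_acc:
            break
        chosen.append(cand)
        remaining.remove(cand)
        best_acc = acc
    return chosen, best_acc

# Synthetic two-feature data with roughly 10% label noise (illustrative only).
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(60)]
y = [int(a + b > 1.0) if rng.random() > 0.1 else int(a + b <= 1.0) for a, b in X]

models = bagging_with_oob(X, y, n_models=15, rng=rng)
chosen, best_acc = greedy_select(models, X, y)
print(f"kept {len(chosen)} of {len(models)} models, out-of-bag accuracy {best_acc:.3f}")
```

Because each row is scored only by models that never saw it during training, no separate validation split is needed, which is the property the abstract highlights for small training sets.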

    Item Type: Thesis (EngD thesis)
    Keywords: ensemble models, decision trees, ensemble selection, ensemble pruning, ensemble thinning, bagging, random forest, extremely randomized trees
    Number of Pages: 63
    Language of Content: Slovenian
    Mentor / Co-mentors:
        Name and Surname: Assoc. Prof. Dr. Janez Demšar
        ID: 257
        Function: Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537131971)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3574
    Date Deposited: 13 Sep 2016 14:43
    Last Modified: 22 Sep 2016 10:41
    URI: http://eprints.fri.uni-lj.si/id/eprint/3574
