Darjan Oblak (2016) Decision Tree Ensemble Selection. EngD thesis.
Abstract
Ensemble models are well known in machine learning for their accuracy. Their main strength, convergence towards an asymptotic upper limit as the number of internal models grows, is partly offset by their large size. Existing studies show that the number of models in an ensemble can be reduced after training without hurting, and sometimes even improving, its accuracy. This thesis introduces two new approaches to ensemble selection that use the so-called out-of-bag set. Such a selection set is important with small training sets, where no data should be held out from training if the ensemble is to retain high generalization accuracy. Both methods are evaluated on 34 datasets for bagging, random forests and extremely randomized trees. Some of the comparisons show that the selection model outperforms the base ensemble method in a statistically significant manner; the others confirm that the methods can reduce ensemble size while on average maintaining accuracy.
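The abstract's key idea, using each model's out-of-bag examples as a selection set so that no training data has to be held out, can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the thesis code: it bags simple threshold "stumps" on a 1-D dataset and then greedily keeps only the models that improve out-of-bag majority-vote accuracy.

```python
# Hypothetical sketch of out-of-bag ensemble selection (not the thesis code):
# bag threshold stumps on a toy 1-D dataset, then greedily keep only the
# stumps that improve out-of-bag accuracy.
import random

random.seed(0)

# Toy 1-D dataset: class 1 for x >= 5, class 0 otherwise, with one noisy label.
X = list(range(20))
y = [1 if x >= 5 else 0 for x in X]
y[3] = 1  # label noise

def fit_stump(xs, ys):
    """Pick the threshold with the best training accuracy."""
    best_t, best_acc = 0, -1.0
    for t in xs:
        acc = sum((x >= t) == bool(c) for x, c in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def predict(t, x):
    return 1 if x >= t else 0

# Bagging: each stump is trained on a bootstrap sample; the indices it
# did NOT see form its out-of-bag (OOB) set.
n_models = 15
n = len(X)
stumps, oob_sets = [], []
for _ in range(n_models):
    idx = [random.randrange(n) for _ in range(n)]
    stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    oob_sets.append(set(range(n)) - set(idx))

def oob_accuracy(selected):
    """Majority-vote accuracy where each point is judged only by the
    selected stumps that did not see it during training."""
    correct = total = 0
    for i in range(n):
        votes = [predict(stumps[m], X[i]) for m in selected if i in oob_sets[m]]
        if not votes:
            continue
        total += 1
        pred = 1 if 2 * sum(votes) >= len(votes) else 0
        correct += (pred == y[i])
    return correct / total if total else 0.0

# Greedy forward selection driven by the OOB estimate: stop as soon as
# no remaining stump strictly improves OOB accuracy.
selected, remaining = [], list(range(n_models))
while remaining:
    best_m, best_acc = None, oob_accuracy(selected)
    for m in remaining:
        acc = oob_accuracy(selected + [m])
        if acc > best_acc:
            best_m, best_acc = m, acc
    if best_m is None:
        break
    selected.append(best_m)
    remaining.remove(best_m)

print(len(selected), "of", n_models, "stumps kept;",
      "OOB accuracy:", round(oob_accuracy(selected), 3))
```

The pruned ensemble is typically much smaller than the full bag while its out-of-bag accuracy is at least as good, which mirrors the size-versus-accuracy trade-off the abstract describes; the thesis methods themselves, and the decision-tree learners they prune, are of course more elaborate than this stump example.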
Item Type: | Thesis (EngD thesis) |
Keywords: | ensemble models, decision trees, ensemble selection, ensemble pruning, ensemble thinning, bagging, random forest, extremely randomized trees |
Number of Pages: | 63 |
Language of Content: | Slovenian |
Mentor / Comentors: | izr. prof. dr. Janez Demšar (ID 257), Mentor |
Link to COBISS: | http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537131971) |
Institution: | University of Ljubljana |
Department: | Faculty of Computer and Information Science |
Item ID: | 3574 |
Date Deposited: | 13 Sep 2016 14:43 |
Last Modified: | 22 Sep 2016 10:41 |
URI: | http://eprints.fri.uni-lj.si/id/eprint/3574 |