Darjan Oblak (2016) Decision Tree Ensemble Selection. EngD thesis.
Abstract
Ensemble models are well known in machine learning for their accuracy. Their main strength, convergence towards an asymptotic upper limit as the number of internal models grows, is partly offset by their large size. Existing studies show that the number of models in an ensemble can be reduced after training without hurting, and sometimes even improving, its accuracy. This thesis introduces two new approaches to ensemble selection that use the so-called out-of-bag set. Such a selection set is important with small training sets, where no data should be held out from training if the ensemble is to retain high generalization accuracy. Both methods are evaluated on 34 datasets for bagging, random forests and extremely randomized trees. Some of the comparisons show that the selection model outperforms the base ensemble method in a statistically significant manner; the others confirm that the methods can reduce ensemble size while on average maintaining accuracy.
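The abstract's key idea, using each model's out-of-bag examples as a selection set so that no training data has to be held out, can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the thesis code: it bags simple threshold "stumps" on a 1-D dataset and then greedily keeps only the models that improve out-of-bag majority-vote accuracy.

```python
# Hypothetical sketch of out-of-bag ensemble selection (not the thesis code):
# bag threshold stumps on a toy 1-D dataset, then greedily keep only the
# stumps that improve out-of-bag accuracy.
import random

random.seed(0)

# Toy 1-D dataset: class 1 for x >= 5, class 0 otherwise, with one noisy label.
X = list(range(20))
y = [1 if x >= 5 else 0 for x in X]
y[3] = 1  # label noise

def fit_stump(xs, ys):
    """Pick the threshold with the best training accuracy."""
    best_t, best_acc = 0, -1.0
    for t in xs:
        acc = sum((x >= t) == bool(c) for x, c in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def predict(t, x):
    return 1 if x >= t else 0

# Bagging: each stump is trained on a bootstrap sample; the indices it
# did NOT see form its out-of-bag (OOB) set.
n_models = 15
n = len(X)
stumps, oob_sets = [], []
for _ in range(n_models):
    idx = [random.randrange(n) for _ in range(n)]
    stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    oob_sets.append(set(range(n)) - set(idx))

def oob_accuracy(selected):
    """Majority-vote accuracy where each point is judged only by the
    selected stumps that did not see it during training."""
    correct = total = 0
    for i in range(n):
        votes = [predict(stumps[m], X[i]) for m in selected if i in oob_sets[m]]
        if not votes:
            continue
        total += 1
        pred = 1 if 2 * sum(votes) >= len(votes) else 0
        correct += (pred == y[i])
    return correct / total if total else 0.0

# Greedy forward selection driven by the OOB estimate: stop as soon as
# no remaining stump strictly improves OOB accuracy.
selected, remaining = [], list(range(n_models))
while remaining:
    best_m, best_acc = None, oob_accuracy(selected)
    for m in remaining:
        acc = oob_accuracy(selected + [m])
        if acc > best_acc:
            best_m, best_acc = m, acc
    if best_m is None:
        break
    selected.append(best_m)
    remaining.remove(best_m)

print(len(selected), "of", n_models, "stumps kept;",
      "OOB accuracy:", round(oob_accuracy(selected), 3))
```

The pruned ensemble is typically much smaller than the full bag while its out-of-bag accuracy is at least as good, which mirrors the size-versus-accuracy trade-off the abstract describes; the thesis methods themselves, and the decision-tree learners they prune, are of course more elaborate than this stump example.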
Item Type: | Thesis (EngD thesis) |
Keywords: | ensemble models, decision trees, ensemble selection, ensemble pruning, ensemble thinning, bagging, random forest, extremely randomized trees |
Number of Pages: | 63 |
Language of Content: | Slovenian |
Mentor / Comentors: | izr. prof. dr. Janez Demšar (ID 257), Mentor |
Link to COBISS: | http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537131971) |
Institution: | University of Ljubljana |
Department: | Faculty of Computer and Information Science |
Item ID: | 3574 |
Date Deposited: | 13 Sep 2016 14:43 |
Last Modified: | 22 Sep 2016 10:41 |
URI: | http://eprints.fri.uni-lj.si/id/eprint/3574 |