Blaž Sovdat (2013) Algorithms for incremental learning of decision trees from time-changing data streams. EngD thesis.
Abstract
This work is a detailed presentation of the main ideas behind state-of-the-art algorithms for online learning of decision trees and other models from time-changing data streams. We begin by proving Hoeffding's inequality, which we then use to derive a general method for scaling up machine learning algorithms. We apply this method to scale up the classical decision tree learning algorithm and prove theoretical guarantees for one of the learners. We implement the scaled-up decision tree learners and, after giving a rough description of the implementation, illustrate their usage on a simple bootstrapped dataset. We then turn to methods for assessing the performance of stream learning algorithms and for comparing two algorithms on a single data stream. We later apply these methods in experiments on a real-world electricity-demand scenario using New York State electricity data, demonstrating the usage of both our implementation and the aforementioned evaluation methods. The original contributions are simple formulas and algorithms for computing entropy and the Gini index on time-changing data streams.
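As a rough illustration of the kind of computation the abstract refers to, the sketch below shows the Hoeffding bound in the form commonly used by Hoeffding-tree learners, together with entropy and Gini index computed from running class counts. This is an assumption-laden sketch, not the thesis's implementation or its original formulas: the function names (`hoeffding_bound`, `should_split`, `gini`, `entropy`) and the count-based update are hypothetical choices made for this example.

```python
import math
from collections import Counter


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability at least 1 - delta, the mean
    of n independent observations of a variable with the given range lies
    within epsilon of its true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, epsilon: float) -> bool:
    """Split once the observed advantage of the best attribute over the
    runner-up exceeds the Hoeffding bound (hypothetical helper)."""
    return best_gain - second_gain > epsilon


def gini(counts: Counter) -> float:
    """Gini index of the class distribution described by running counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def entropy(counts: Counter) -> float:
    """Shannon entropy (in bits) of the class distribution described by counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


# Usage: update the counts one example at a time, as on a stream.
counts = Counter()
for label in ["a", "a", "b", "b"]:
    counts[label] += 1

eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=1000)
```

The bound shrinks as more examples arrive, so a learner that keeps only per-class counts can defer a split decision until the observed difference between candidate attributes is statistically reliable.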
Item Type: | Thesis (EngD thesis) |
Keywords: | online learning, machine learning, decision trees, time-changing data streams, Hoeffding inequality, stream learning algorithm evaluation |
Number of Pages: | 83 |
Language of Content: | Slovenian |
Mentor / Comentors: | doc. dr. Zoran Bosnić (ID: 3826), Mentor |
Link to COBISS: | http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=10132820) |
Institution: | University of Ljubljana |
Department: | Faculty of Computer and Information Science |
Item ID: | 2150 |
Date Deposited: | 13 Sep 2013 16:21 |
Last Modified: | 24 Sep 2013 10:40 |
URI: | http://eprints.fri.uni-lj.si/id/eprint/2150 |