ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Multitask learning in classification and regression

Gregor Čepin (2016) Multitask learning in classification and regression. MSc thesis.

Full text: PDF (454 kB), restricted to repository staff only

    Abstract

    Multitask learning is an approach to machine learning in which an algorithm learns to solve multiple related problems. It tries to find one common model instead of building several separate models. Such a model is usually smaller than the separate models combined, easier to understand, and less likely to overfit the training data. In the prediction stage, the algorithm predicts values for several problems at the same time. The problems learned together must be related, so that learning one problem can improve learning of the others. Currently, this approach is used with tree models for either multiple classification tasks or multiple regression tasks. In this work we extend the approach to mixed classification and regression tasks. During tree construction, regression and classification use different attribute selection methods. The scores they return are not directly comparable, so in our scenario we rank the attributes for each task and choose the attribute with the best overall rank. We implement a multitask regression and classification tree, multitask bagging, and a multitask random forest, all based on attribute rankings. We compare these algorithms with their single-task variants, with a regular multitask tree, and with a multitask neural network. We also propose a task relatedness measure based on attribute rankings. With it, we can find related tasks in a dataset and use them together in a multitask approach. On one dataset, the implemented multitask random forest performs statistically significantly better than the single-task version. On some datasets, the implemented algorithms perform worse than their single-task versions.
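
The rank-based attribute choice described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the example scores, the use of a plain sum of ranks as the "total", and the averaging of tied ranks are all assumptions made here for illustration. The only idea taken from the abstract is that per-task scores are converted to ranks before they are combined, because classification and regression scores are on different scales.

```python
from typing import Dict, List


def rank_scores(scores: List[float]) -> List[float]:
    """Return 1-based ranks, where rank 1 is the highest score.

    Tied scores share the average of the ranks they span
    (an assumption; the abstract does not say how ties are handled).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def select_attribute(task_scores: Dict[str, List[float]]) -> int:
    """Return the index of the attribute with the lowest summed rank.

    task_scores maps a (hypothetical) task name to one quality score per
    attribute, higher meaning better. Summing ranks is an assumed way of
    choosing the attribute that is "best ranked in total".
    """
    n_attributes = len(next(iter(task_scores.values())))
    totals = [0.0] * n_attributes
    for scores in task_scores.values():
        for i, rank in enumerate(rank_scores(scores)):
            totals[i] += rank
    return min(range(n_attributes), key=lambda i: totals[i])


# Hypothetical scores for three attributes on two tasks.
example = {
    "classification_task": [0.30, 0.10, 0.25],  # e.g. an impurity-based gain
    "regression_task": [1.0, 9.0, 12.0],        # e.g. a variance reduction
}
best = select_attribute(example)
```

In this example the first attribute is ranked best for the classification task but worst for the regression task, so the third attribute, ranked second and first, has the lowest rank sum and is the one selected. Comparing the raw scores directly would instead be dominated by the regression task, whose values are much larger.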

    Item Type: Thesis (MSc thesis)
    Keywords: machine learning, decision tree, multitask tree, random forest, bagging, classification, regression, ranking
    Number of Pages: 57
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname                          ID     Function
    Assoc. Prof. Dr. Marko Robnik Šikonja     276    Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536881603)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3301
    Date Deposited: 24 Mar 2016 12:43
    Last Modified: 21 Apr 2016 08:23
    URI: http://eprints.fri.uni-lj.si/id/eprint/3301
