ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Clustering with Argument-Based Machine Learning

Šaponja Peter (2015) Clustering with Argument-Based Machine Learning. MSc thesis.


    The need to improve data clustering methods called for more interaction with domain experts, which led to the development of algorithms known as constrained clustering. These algorithms use domain knowledge in the form of positive must-link and negative cannot-link constraints to improve the quality of the detected groups. One of the most overlooked issues in this field is the effectiveness of constraint elicitation. Although constraint elicitation can be tedious, it can have a significant impact on the quality of clustering. In this thesis we designed and developed a method named Argument-based k-means (AB k-means), which is intended to make clustering more efficient and is based on the paradigm of argument-based machine learning (ABML). The knowledge refinement loop enables the domain expert to articulate their domain knowledge by giving arguments for automatically chosen problematic cases, while the method uses counterexamples to highlight any shortcomings in the expert's arguments. We adapted the knowledge refinement loop to the needs of clustering by exposing badly and well clustered cases when eliciting constraints, which are crucial for improving the clustering. At the same time, the obtained constraints lead to clusters that are consistent with the expert's knowledge of the chosen domain. To make the new method easier to use, we also developed an interactive application. The effectiveness of our approach was tested empirically on three experimental domains, where it compared favourably with an ordinary constrained clustering algorithm.
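
    To illustrate what must-link and cannot-link constraints mean in practice, the following is a minimal sketch of a constraint-aware assignment step in the style of generic constrained k-means (COP-KMeans). It is not the AB k-means method from the thesis, and it does not model the knowledge refinement loop. All names (`constrained_assign`, `violates`, `squared_distance`) are hypothetical, and the centroids are assumed to be given.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]
Pair = Tuple[int, int]  # indices of two points tied by a constraint


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def violates(index: int, cluster: int, assignment: Dict[int, int],
             must_link: List[Pair], cannot_link: List[Pair]) -> bool:
    """Return True if putting point `index` into `cluster` breaks a constraint
    with respect to the points that have already been assigned."""
    for a, b in must_link:
        if index in (a, b):
            other = b if index == a else a
            # A must-link partner already sits in a different cluster.
            if other in assignment and assignment[other] != cluster:
                return True
    for a, b in cannot_link:
        if index in (a, b):
            other = b if index == a else a
            # A cannot-link partner already sits in this cluster.
            if assignment.get(other) == cluster:
                return True
    return False


def constrained_assign(points: List[Point], centroids: List[Point],
                       must_link: List[Pair],
                       cannot_link: List[Pair]) -> List[int]:
    """Assign each point to the nearest centroid that violates no constraint.

    Raises ValueError when some point has no feasible cluster.
    """
    assignment: Dict[int, int] = {}
    for i, p in enumerate(points):
        # Try clusters from nearest to farthest.
        ranked = sorted(range(len(centroids)),
                        key=lambda c: squared_distance(p, centroids[c]))
        for c in ranked:
            if not violates(i, c, assignment, must_link, cannot_link):
                assignment[i] = c
                break
        else:
            raise ValueError(f"no feasible cluster for point {i}")
    return [assignment[i] for i in range(len(points))]
```

    In this sketch the constraints only affect the assignment step; a full constrained k-means would alternate it with recomputing the centroids. The result depends on the order in which points are processed, which is one reason the choice of constraints matters.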

    Item Type: Thesis (MSc thesis)
    Keywords: semi-supervised learning, clustering, k-means, constrained clustering, argument-based machine learning, knowledge refinement loop, constraint elicitation, argument-based k-means
    Number of Pages: 75
    Language of Content: Slovenian
    Mentor / Comentors:
        Name and Surname | ID | Function
        doc. dr. Matej Guid | 937 | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536580803)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3173
    Date Deposited: 02 Oct 2015 18:27
    Last Modified: 20 Oct 2015 15:37
    URI: http://eprints.fri.uni-lj.si/id/eprint/3173
