Domen Perc (2011) Implementation and experimental analysis of consensus clustering. EngD thesis.
Abstract
Consensus clustering is a machine learning technique for class discovery and clustering validation. The method combines various clustering algorithms with different resampling techniques. It is based on multiple runs of a clustering algorithm on resampled data. The data gathered in these runs is used both to cluster the data and to produce a visual representation of the clustering, which helps us understand the results. In this thesis we compare consensus clustering with standard clustering algorithms to identify the advantages of the technique. We implemented consensus clustering in the Python programming language using Orange, an open-source data mining and machine learning suite. We tested the implementation on data sets from the machine learning repository of the University of California, Irvine. The experimental results showed some improvements over standard techniques, especially in terms of clustering consistency. Changes in overall performance were smaller. If standard techniques have difficulty clustering a specific data set, consensus clustering will not perform much better either.
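The repeated resample-and-cluster procedure described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it does not use Orange, it substitutes a plain k-means written in pure Python for the clustering algorithm, and the names `kmeans_labels` and `consensus_matrix`, the parameters `runs` and `fraction`, and the toy data are all assumptions made for illustration. The consensus value for a pair of items is taken here as the fraction of runs in which both were sampled and ended up in the same cluster.

```python
import random


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's k-means (illustrative stand-in); one label per point."""
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def consensus_matrix(points, k, runs=50, fraction=0.8, seed=0):
    """Fraction of co-sampled runs in which items i and j share a cluster."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]  # times i and j were co-clustered
    sampled = [[0] * n for _ in range(n)]   # times i and j were co-sampled
    size = max(k, int(fraction * n))
    for _ in range(runs):
        idx = rng.sample(range(n), size)    # subsample without replacement
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                if labels[a] == labels[b]:
                    together[i][j] += 1
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy data: two well-separated one-dimensional groups.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (10.0,), (10.1,), (10.2,), (10.3,)]
matrix = consensus_matrix(points, k=2)
print(matrix[0][1], matrix[0][4])
```

For well-separated groups like these, entries for pairs within a group should be close to 1 and entries for pairs across groups close to 0. A matrix of this kind can be shown as a heat map for the visual representation the abstract mentions, and values far from both 0 and 1 indicate items whose assignment is inconsistent across runs.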