Tadej Magajna (2016) Feature Selection for Multilayer Clustering. MSc thesis.
Abstract
We present an overview of feature selection for multi-layer clustering and explain the concepts of multi-view learning and redescription mining. We propose a new clustering method that uses predictor explanations to provide multiple explanations for each resulting cluster. These explanations serve as an interpretable definition of the groups and can help in understanding connections between features from different views. A test on our synthetic data set shows that the proposed multi-view feature selection method, mvReliefF, handles multi-view data well. On a data set from the UCI repository, we compare our method with published results. On a joined ADNI Alzheimer's disease data set, we use predictor explanations to explain the obtained clusters separately with clinical features and with biological features. The explanations serve as interpretable cluster definitions and help in understanding the connections between clinical and biological features. Neurological analysis suggests that the obtained clusters and the connections between features from the different views are meaningful.