Lan Umek and Blaz Zupan (2011) Subgroup discovery in data sets with multi-dimensional responses. Intelligent Data Analysis, 15 (4). pp. 533-549.
Most current subgroup discovery approaches aim to find subsets of attribute-value data with an unusual distribution of a single output variable. Real-life problems, however, are often described by richer, multi-dimensional outcomes. The discovery task in such domains is to find subsets of data instances that have similar outcome descriptions and are separable from the rest of the instances in the input space. We have developed a technique that directly addresses this problem. It uses agglomerative clustering to find subgroup candidates in the space of output attributes, and predictive modeling to score and describe these candidates in the input attribute space. Experiments with the proposed method on a set of synthetic data sets and on a real social survey data set demonstrate its ability to discover relevant and interesting subgroups in data with multi-dimensional responses.
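The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the linkage, the predictive model, or the scoring measure, so the average linkage, the single-attribute threshold rule, the accuracy score, the toy data, and all function names (`agglomerate`, `best_stump`, `discover`) are assumptions made for illustration only.

```python
from itertools import combinations
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def avg_dist(c1, c2, Y):
    # Average pairwise distance between two clusters (average linkage; an assumption).
    return sum(euclid(Y[i], Y[j]) for i in c1 for j in c2) / (len(c1) * len(c2))

def agglomerate(Y):
    """Bottom-up clustering in the OUTPUT space.
    Returns every merged cluster as a sorted tuple of instance indices."""
    clusters = [(i,) for i in range(len(Y))]
    candidates = []
    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]], Y))
        merged = tuple(sorted(clusters[i] + clusters[j]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        candidates.append(merged)
    return candidates

def best_stump(X, members):
    """Score and describe a candidate in the INPUT space with the best
    single-attribute threshold rule (a stand-in for a predictive model).
    Returns (accuracy, attribute index, threshold, direction)."""
    inside = set(members)
    labels = [i in inside for i in range(len(X))]
    best = (0.0, None, None, None)
    for a in range(len(X[0])):
        values = sorted({row[a] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for direction in ("<=", ">"):
                pred = [(row[a] <= t) == (direction == "<=") for row in X]
                acc = sum(p == l for p, l in zip(pred, labels)) / len(X)
                if acc > best[0]:
                    best = (acc, a, t, direction)
    return best

def discover(X, Y, min_size=2):
    """Cluster the outputs, then keep and rank the candidates by how well
    they can be separated from the rest using the inputs."""
    n = len(X)
    scored = [(m, best_stump(X, m)) for m in agglomerate(Y)
              if min_size <= len(m) < n]  # skip tiny clusters and the full data set
    return sorted(scored, key=lambda s: -s[1][0])

# Hypothetical toy data: two input attributes, two output attributes.
X = [[1.0, 5.0], [1.2, 3.0], [0.9, 4.0], [3.0, 5.1], [3.2, 2.9], [2.9, 4.2]]
Y = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [1.0, 1.1], [1.1, 1.0], [1.0, 1.0]]

subgroups = discover(X, Y, min_size=3)
for members, (acc, attr, thr, direction) in subgroups:
    print(f"instances {members}: x[{attr}] {direction} {thr:.2f} (accuracy {acc:.2f})")
```

In this sketch the threshold rule serves both purposes named in the abstract: its accuracy scores the candidate, and the rule itself describes the subgroup in terms of the input attributes. A practical version would likely replace it with a cross-validated classifier.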
|Item Type: ||Article|
|Keywords: ||subgroup discovery, multi-dimensional responses, clustering, machine learning, data mining|
|Institution: ||University of Ljubljana|
|Department: ||Faculty of Computer and Information Science|
|Divisions: ||Faculty of Computer and Information Science > Bioinformatics Laboratory|
|Item ID: ||1484|
|Date Deposited: ||14 Aug 2011 09:48|
|Last Modified: ||21 Aug 2011 13:10|