Anže Vavpetič (2011) The Use of Ontologies as Background Knowledge in Data Mining. EngD thesis.
The thesis describes the development of g-SEGS, an application for subgroup discovery that supports the use of ontologies as background knowledge. The application can use ontological concepts as terms in rules that describe subgroups of examples. The system generalizes an existing system, SEGS, which was used successfully in the field of genomics but cannot be applied to other fields. g-SEGS was implemented as a web service and can thus be imported into applications that support web services. The thesis also presents the use of g-SEGS as a widget in the Orange data mining environment for visual programming and its extension Orange4WS; an additional, easy-to-use user interface was implemented for this purpose. Next, the thesis describes how to formulate the problem of using ontologies as background knowledge for subgroup discovery in the inductive logic programming system Aleph. Both approaches were experimentally evaluated on a toy domain and on two real-life biological domains. Lastly, the thesis provides some ideas for future work.