Uroš Kosič (2012) Random forest based similarity measure. EngD thesis.
Abstract
We evaluate a similarity measure based on random forests. The existing similarity measure classifies examples with the trees in the forest and is based only on the co-occurrence of instances in the leaves. The proposed measure also takes the nodes on the path to the leaf into account. We present results of clustering and outlier detection on several real data sets. With different clustering algorithms, the existing similarity measure works better than the extended one. In outlier detection, we get better results with the extended measure. Because of the evaluation method used, these results cannot be generalized to other approaches to outlier detection.
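The difference between the two measures can be illustrated with a minimal sketch. The abstract does not give the exact formula of the extended measure, so the path-aware variant below is only an assumption: for each tree it takes the length of the shared path prefix divided by the length of the longer path. The function names `leaf_similarity` and `path_similarity`, and the node numbering in the example, are hypothetical.

```python
from typing import Sequence


def leaf_similarity(leaves_a: Sequence[int], leaves_b: Sequence[int]) -> float:
    """Leaf-based measure: fraction of trees in which both examples
    end up in the same leaf. Each sequence holds one leaf id per tree."""
    if len(leaves_a) != len(leaves_b) or len(leaves_a) == 0:
        raise ValueError("need one leaf id per tree for both examples")
    same = sum(1 for a, b in zip(leaves_a, leaves_b) if a == b)
    return same / len(leaves_a)


def path_similarity(
    paths_a: Sequence[Sequence[int]], paths_b: Sequence[Sequence[int]]
) -> float:
    """Assumed path-aware variant (not necessarily the thesis formula):
    per tree, the number of nodes shared from the root down, divided by
    the length of the longer path, averaged over all trees."""
    if len(paths_a) != len(paths_b) or len(paths_a) == 0:
        raise ValueError("need one root-to-leaf path per tree for both examples")
    total = 0.0
    for path_a, path_b in zip(paths_a, paths_b):
        if not path_a or not path_b:
            raise ValueError("every path must contain at least the root node")
        shared = 0
        for node_a, node_b in zip(path_a, path_b):
            if node_a != node_b:
                break  # the two examples diverge at this split
            shared += 1
        total += shared / max(len(path_a), len(path_b))
    return total / len(paths_a)


# Two trees; node ids are made up for illustration.
# In tree 1 the examples split at the last node, in tree 2 they share a leaf.
paths_x = [[0, 1, 3], [0, 2, 5]]
paths_y = [[0, 1, 4], [0, 2, 5]]
leaves_x = [path[-1] for path in paths_x]
leaves_y = [path[-1] for path in paths_y]

leaf_score = leaf_similarity(leaves_x, leaves_y)
path_score = path_similarity(paths_x, paths_y)
```

In this toy case the leaf-based score counts only the one tree where the leaves match, while the path-aware score also credits the two nodes shared in the first tree, so examples that separate late in a tree are treated as more similar than examples that separate at the root.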