Mark Lemut (2018) Integrating a data balancing widget into Orange machine learning suite. EngD thesis.
Abstract
Statistical tests can sometimes be performed on unbalanced data sets, even though they are most reliably performed on balanced data sets. There are not a lot of programs that can balance a data set. The assignments goal was to make the algorithm that was developed by Aleš Smodiš more accessible for an audience, that works with data and does not know how to program. Because of that, the algorithm was implemented into a free open source program called Orange. Orange is a simple program and offersa lot of options for data analysis, visualization, and for working with data in general. It offers the use of machine learning for classification and regression.
Actions (login required)