Aleš Mrak (2012) Python data mining environments. EngD thesis.
Abstract
In this thesis we compare data mining systems that provide an interface in the Python programming language. Many open-source data mining systems and libraries have implemented software interfaces for Python. Their developers chose Python because it is fast, supports object-oriented programming, allows other software libraries to be integrated, and is available on all major operating systems (Windows, Linux/Unix, OS/2, Mac, etc.). Our analysis covers the seven most used systems (Elefant, MDP, OpenCVLibrary, Orange, Pybrain, Pyml and Shogun). It examines the following properties of the systems: data formats, interfaces (GUI and API), multitasking, database support, response times, and other aspects such as installation, documentation and user support. From this analysis we also identify the common shortcomings of the analyzed libraries and give some recommendations to developers.