Miha Mlakar (2010) Accuracy of cancer diagnosis models inferred by machine learning from gene expression data sets. EngD thesis.
Abstract
Using machine learning on gene expression data, we can try to predict whether a tissue sample is benign or malignant. We evaluated several machine learning techniques on data obtained from the public database Gene Expression Omnibus. The algorithms were tested on multiple data sets to obtain more reliable results. The methods were scored using the AUC measure (area under the ROC curve) and compared statistically in a critical distance graph. The results were somewhat surprising: we expected support vector machines to perform best, but random forests did. The standard deviation was relatively high, so the ordering of the methods could differ on other data.
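The evaluation described above (AUC per data set, then a rank-based comparison across data sets) can be sketched roughly as follows. This is a minimal illustration and not the thesis code. It assumes the critical distance comes from a Nemenyi-style post-hoc test on average ranks, which the abstract does not state. The function names are hypothetical, and `q_alpha` must be supplied by the caller from a published table of critical values.

```python
import math


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_ranks(auc_table):
    """auc_table holds one row per data set and one AUC per method.
    Rank 1 is the best AUC in a row; tied methods share the mean rank."""
    k = len(auc_table[0])
    totals = [0.0] * k
    for row in auc_table:
        for j, value in enumerate(row):
            better = sum(1 for other in row if other > value)
            equal = sum(1 for other in row if other == value)
            totals[j] += better + (equal + 1) / 2
    return [t / len(auc_table) for t in totals]


def critical_distance(k, n, q_alpha):
    """Assumed Nemenyi-style critical distance for k methods on n data sets.
    Two methods differ significantly if their average ranks differ by more."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n))
```

With made-up numbers, `average_ranks([[0.9, 0.8, 0.7], [0.6, 0.9, 0.7]])` gives one average rank per method. A high standard deviation across data sets, as reported in the abstract, would show up here as ranks that change considerably from row to row.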