Marko Žerjal (2009) Predicting the amount of rainfall with propositional machine learning. EngD thesis.
Abstract
In this diploma thesis we tried to produce rain forecasts from data obtained by combining numerical meteorological measurements with information extracted from radar images of rain activity. We combined the two groups of data by partitioning the radar images into equally sized cells and assigning each cell a numerical value based on the prevailing degree of rainfall inside it. This puts the radar data into an attribute format suitable for the models used to make the forecasts. The combined data set was then split into a learning set and a testing set, and we tried to make the most accurate forecasts possible using selected attribute-based machine learning methods: decision trees, k-nearest neighbors (KNN), the naive Bayesian classifier and regression trees. The best results were obtained with decision trees and the KNN algorithm. Regression trees were compared with the other methods by discretizing their predictions, which showed that they were less accurate than decision trees and the KNN algorithm. The naive Bayesian classifier also proved less suitable for this type of data. Besides the existing machine learning methods, we also developed a set of methods that divide the area covered by the radar images into regions, which lets us take into account the local weather characteristics of different parts of Slovenia. We assess the results of the regional division as good, but many possible improvements remain to be explored in future work.
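The cell-based encoding of radar images described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `cell_features`, the integer rainfall levels, the 2x2 cell size, the reading of "prevailing" as the most frequent level in a cell, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import Counter


def cell_features(image, cell_h, cell_w):
    """Split a 2D grid of rainfall levels into equal cells and
    return the prevailing (most frequent) level of each cell,
    scanning cells row by row.

    Hypothetical helper: the thesis does not specify this interface.
    """
    rows, cols = len(image), len(image[0])
    if rows % cell_h or cols % cell_w:
        raise ValueError("image dimensions must be multiples of the cell size")

    features = []
    for top in range(0, rows, cell_h):
        for left in range(0, cols, cell_w):
            # Collect every pixel level that falls inside this cell.
            levels = [image[r][c]
                      for r in range(top, top + cell_h)
                      for c in range(left, left + cell_w)]
            counts = Counter(levels)
            best = max(counts.values())
            # Assumption: ties are resolved toward the heavier rainfall level.
            features.append(max(lvl for lvl, n in counts.items() if n == best))
    return features


# Toy 4x4 "radar image" with rainfall levels 0 (none) to 3 (heavy).
radar = [
    [0, 0, 1, 2],
    [0, 1, 2, 2],
    [3, 3, 0, 0],
    [3, 1, 0, 1],
]
features = cell_features(radar, 2, 2)
print(features)
```

Each value in the resulting list could then serve as one attribute alongside the numerical meteorological measurements for the same time step, which is the kind of flat attribute table that decision trees, KNN and the other listed methods expect.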