Kristjan Pičulin (2018) Discovery and analysis of advertisements from textual data. EngD thesis.
Abstract
For my thesis I built a program that recognizes whether a web article is a paid advertisement or a genuine news article, and I analyzed the results the program produced. I examined why articles are classified the way they are, why some articles are misclassified, and which factors affect how the program recognizes articles. I was especially interested in finding a way to separate news articles from advertisements. The program was written in the Python programming language, using libraries such as PyQt, sklearn and similar. The program largely worked the way I intended, and the analysis revealed many interesting things about articles and advertisements.