ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Learning from textual data streams for detecting email spam

Jernej Porenta (2016) Learning from textual data streams for detecting email spam. MSc thesis.

    Abstract

    This master's thesis introduces a method for detecting email spam by recasting the problem as incremental learning from time series. Common spam detection systems mainly use supervised learning methods (naive Bayes classifier, decision trees), whereas this thesis performs the classification with data stream mining methods. For the learning sets, we chose attributes that contain no personal data and whose use does not require the consent of the sender or the recipient (the attributes come from the envelope part of the email). We treated the sequence of email messages as a textual data stream and applied algorithms for learning from data streams (VFDT, cVFDT) to it. The results were compared with those of traditional spam detection methods. They show that the traditional methods achieve higher accuracy than the algorithms for learning from data streams, which are therefore not suitable for detecting email spam.
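
    For readers unfamiliar with VFDT: it is a Hoeffding-tree learner, which splits a leaf once the observed advantage of the best attribute over the runner-up exceeds the Hoeffding bound. The sketch below only illustrates that general split test. It is not taken from the thesis; the function names and the example numbers (gains, delta, sample counts) are assumptions chosen for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of the mean of n samples."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: split when the gap between the two best attributes
    is larger than the Hoeffding bound for the examples seen so far."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Illustrative values only: for a two-class problem (spam / not spam) scored
# with information gain, the range of the gain is log2(2) = 1.
few_examples = should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=100)
many_examples = should_split(0.30, 0.15, value_range=1.0, delta=1e-7, n=1000)
```

    With these assumed numbers the same gap in gain does not justify a split after 100 messages but does after 1000, because the bound shrinks as more of the stream is observed.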

    Item Type: Thesis (MSc thesis)
    Keywords: email, machine learning, stream mining
    Number of Pages: 52
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname                  ID      Function
    Assoc. Prof. Dr. Zoran Bosnić     3826    Mentor
    Asst. Prof. Dr. Mojca Ciglarič    256     Comentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537120195)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3535
    Date Deposited: 08 Sep 2016 16:22
    Last Modified: 20 Sep 2016 09:52
    URI: http://eprints.fri.uni-lj.si/id/eprint/3535