ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Deep Models for Classification of Biomedical Documents

Tomislav Slijepčević (2018) Deep Models for Classification of Biomedical Documents. MSc thesis.

    Abstract

    In this master's thesis, we developed a model that represents life-science texts in a vector form suitable for machine learning. Our corpus consisted of abstracts from the MEDLINE collection, which are labeled with annotations from the MeSH ontology. The model uses a deep neural network to predict MeSH annotations from a text. As the vector representation of a text, we used the activations of the network's penultimate layer, which has 1000 neurons. We compared the model to multinomial logistic regression that predicts MeSH annotations from vector representations of texts obtained with doc2vec. In predicting MeSH annotations on the test dataset, our model achieved higher accuracy. In addition, the vector representations obtained with our model yielded better point-based visualizations with the t-SNE method than those obtained with doc2vec.
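
The idea of reusing the penultimate layer as a document vector can be illustrated with a minimal sketch. Only the width of 1000 neurons comes from the abstract; the single hidden layer, the ReLU and sigmoid activations, the random untrained weights, and the toy sizes `n_features` and `n_labels` are assumptions made for illustration and are not the thesis architecture.

```python
import math
import random

EMBED_DIM = 1000  # penultimate layer width, as stated in the abstract


def init_layer(n_in, n_out, rng):
    """Return (weights, biases); weights[j] holds the inputs to output unit j."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine transform of vector x by one fully connected layer."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(z):
    return [max(0.0, v) for v in z]


def sigmoid(z):
    return [1.0 / (1.0 + math.exp(-v)) for v in z]


def forward(x, hidden_layer, output_layer):
    """Return (embedding, label_probabilities) for one input vector x."""
    # Penultimate layer: these activations serve as the document vector.
    embedding = relu(dense(x, hidden_layer))
    # Output layer: one independent probability per label (assumed multi-label setup).
    probs = sigmoid(dense(embedding, output_layer))
    return embedding, probs


rng = random.Random(0)
n_features, n_labels = 30, 5  # hypothetical toy sizes, not from the thesis
hidden_layer = init_layer(n_features, EMBED_DIM, rng)
output_layer = init_layer(EMBED_DIM, n_labels, rng)

doc = [rng.random() for _ in range(n_features)]  # stand-in for a real text encoding
embedding, probs = forward(doc, hidden_layer, output_layer)
print(len(embedding), len(probs))
```

In a trained network, `embedding` would be the vector passed on to downstream tasks such as visualization, while `probs` would be used only for predicting the annotations.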

    Item Type: Thesis (MSc thesis)
    Keywords: biomedical literature, vector representation of text, deep learning, prediction of MeSH terms
    Number of Pages: 54
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname | ID | Function
    prof. dr. Blaž Zupan | 106 | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537792707)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 4080
    Date Deposited: 26 Feb 2018 16:52
    Last Modified: 22 May 2018 12:40
    URI: http://eprints.fri.uni-lj.si/id/eprint/4080
