ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Visualization of text documents based on conceptual spaces

Kaja Vidmar (2010) Visualization of text documents based on conceptual spaces. EngD thesis.


    Abstract

    In this thesis I present an approach based on conceptual spaces for the visualization of text corpora. The thesis is divided into two parts: the first gives an overview of methods for text corpus analysis, and the second presents several ways of visualizing the results. Because the amount of electronic data keeps growing, there is a need to analyze and organize it automatically into groups that are not known in advance. Several algorithms that make this possible are presented later in the thesis: latent semantic analysis, probabilistic latent semantic analysis, and latent Dirichlet allocation. The goal is to discover previously unknown topics that arise in a text corpus. The corpus is analyzed with the selected algorithm and then presented in a conceptual space. A conceptual space represents information by geometric structures: the semantics of words is represented by points, and the relations between them by regions. This suggests that word semantics is generated from concepts, which are represented as regions in the conceptual space. To visualize the conceptual space of a text corpus, I chose a three-dimensional representation with self-organizing maps and a two-dimensional representation with a Voronoi diagram. Both representations allow spatial interaction, which makes the conceptual space easier to imagine.
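
The idea of concepts as regions around points can be illustrated with a nearest-prototype partition, which is the partition a Voronoi diagram describes. The following is a minimal Python sketch, not the method used in the thesis: the topic names, the 2D coordinates, and the helper names `voronoi_cell` and `label_grid` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical 2D positions of concept prototypes (illustrative values only,
# not taken from the thesis).
PROTOTYPES = {
    "sports": (0.0, 0.0),
    "politics": (4.0, 0.0),
    "science": (2.0, 3.0),
}


def voronoi_cell(point, prototypes=PROTOTYPES):
    """Return the name of the prototype closest to `point`.

    A point lies in the Voronoi cell of a prototype exactly when that
    prototype is its nearest one (Euclidean distance).
    """
    return min(prototypes, key=lambda name: math.dist(point, prototypes[name]))


def label_grid(xs, ys, prototypes=PROTOTYPES):
    """Label every (x, y) grid point with the cell it falls into."""
    return [[voronoi_cell((x, y), prototypes) for x in xs] for y in ys]


if __name__ == "__main__":
    # A coarse text rendering of the regions, one row per y value.
    for row in label_grid(xs=[0, 1, 2, 3, 4], ys=[3, 2, 1, 0]):
        print(" ".join(name[:3] for name in row))
```

In this sketch a word or document placed at some point would be assigned to the concept whose region contains it; a real pipeline would first obtain the coordinates from one of the analysis methods named in the abstract.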

    Item Type: Thesis (EngD thesis)
    Keywords: latent semantic analysis, probabilistic latent semantic analysis, latent Dirichlet allocation, conceptual space, word semantics, visualization, self-organizing map, Voronoi diagram
    Number of Pages: 62
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname          ID     Function
    doc. dr. Matija Marolt    271    Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=00008137044)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 1249
    Date Deposited: 16 Dec 2010 12:01
    Last Modified: 13 Aug 2011 00:38
    URI: http://eprints.fri.uni-lj.si/id/eprint/1249
