ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Text Generation using Hidden Markov Model

Miha Filej (2016) Text Generation using Hidden Markov Model. EngD thesis.



    Natural language generation (NLG) is the task of producing text that feels natural to the reader. The goal of this diploma thesis is to study to what extent natural language generation can be achieved using statistical models, specifically hidden Markov models. The thesis covers the probability and information theory needed to define hidden Markov models and describes how such models can be used to generate text. Available tools for working with hidden Markov models are reviewed, compared, and assessed for their suitability for generating text. A library for hidden Markov models is implemented in Elixir. Two of the reviewed tools and the implemented library are used to generate text from a corpus of written Slovenian. A criterion for comparing generated texts is chosen and used to compare the models with one another and the generated texts with the corpus.
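
As a rough illustration of how a hidden Markov model can generate text, the following minimal sketch samples a hidden state sequence and emits one word per state. It is not taken from the thesis: the state names, vocabulary, and probabilities are hypothetical, and the parameters are fixed by hand rather than estimated from a corpus (for example with the Baum-Welch algorithm).

```python
import random

# Hypothetical toy model: the states, words, and probabilities below are
# illustrative assumptions, not values from the thesis.
INITIAL = {"NOUN": 0.8, "VERB": 0.2}
TRANSITION = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.9, "VERB": 0.1},
}
EMISSION = {
    "NOUN": {"model": 0.5, "text": 0.3, "corpus": 0.2},
    "VERB": {"generates": 0.6, "describes": 0.4},
}


def sample(distribution, rng):
    """Draw one outcome from a {outcome: probability} mapping."""
    outcomes = list(distribution)
    weights = [distribution[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]


def generate(length, seed=None):
    """Return a list of (hidden_state, word) pairs of the given length."""
    rng = random.Random(seed)
    state = sample(INITIAL, rng)
    pairs = []
    for _ in range(length):
        # Emit a word from the current state, then move to the next state.
        pairs.append((state, sample(EMISSION[state], rng)))
        state = sample(TRANSITION[state], rng)
    return pairs


if __name__ == "__main__":
    print(" ".join(word for _, word in generate(6, seed=0)))
```

Only the emitted words would be shown to a reader; the state sequence stays hidden, which is what distinguishes this from a plain Markov chain over words.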

    Item Type: Thesis (EngD thesis)
    Keywords: natural language generation, hidden Markov models, Baum-Welch algorithm, Forward-Backward algorithm, expectation–maximization algorithm, Elixir, Erlang/OTP
    Number of Pages: 65
    Language of Content: Slovenian
    Mentor / Comentors: doc. dr. Andrej Brodnik (ID: 5540, Function: Mentor)
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537138883)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3606
    Date Deposited: 16 Sep 2016 14:21
    Last Modified: 26 Sep 2016 09:02
    URI: http://eprints.fri.uni-lj.si/id/eprint/3606
