Miha Filej (2016) Text Generation using Hidden Markov Model. EngD thesis.
Abstract
Natural language generation (NLG) is the task of producing text that feels natural to the reader. The goal of this diploma thesis is to study to what extent natural language generation can be achieved using statistical models, specifically hidden Markov models. The thesis covers the probability and information theory needed to define hidden Markov models and describes how such models can be used to generate text. Available tools for working with hidden Markov models are reviewed, compared, and assessed for their suitability for text generation. A library for hidden Markov models is implemented in Elixir. Two of the reviewed tools and the implemented library are used to generate text from a corpus of written Slovenian. A criterion for comparing generated texts is chosen and used to compare the models as well as to compare the generated texts to the corpus.
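For illustration, the sketch below shows the basic idea of sampling text from a hidden Markov model in Elixir, the language used for the implemented library. It is a minimal, hypothetical example: the states, transition and emission probabilities, and vocabulary are placeholders, and the code is not the thesis's actual library API. Hidden states evolve according to a transition matrix, and each state emits one observable word per step.

```elixir
defmodule HMMSketch do
  # Hypothetical model parameters: two hidden states, their transition
  # probabilities, and per-state emission probabilities over a tiny vocabulary.
  @states [:s1, :s2]
  @transitions %{s1: %{s1: 0.7, s2: 0.3}, s2: %{s1: 0.4, s2: 0.6}}
  @emissions %{s1: %{"danes" => 0.5, "je" => 0.5}, s2: %{"lep" => 0.5, "dan" => 0.5}}

  # Draw one key from a map of probabilities using a uniform random number.
  defp sample(dist) do
    r = :rand.uniform()

    Enum.reduce_while(dist, r, fn {key, p}, acc ->
      if acc <= p, do: {:halt, key}, else: {:cont, acc - p}
    end)
  end

  # Generate `n` words by walking the hidden states and emitting one word per step.
  def generate(n, state \\ :s1, words \\ [])
  def generate(0, _state, words), do: Enum.reverse(words)

  def generate(n, state, words) do
    word = sample(@emissions[state])
    next = sample(@transitions[state])
    generate(n - 1, next, [word | words])
  end
end

# Example usage: HMMSketch.generate(6) |> Enum.join(" ")
```

In an actual text-generation setting, the transition and emission probabilities would not be hand-written as above but estimated from a corpus, for example with the Baum-Welch (expectation-maximization) algorithm mentioned in the keywords.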
Item Type: | Thesis (EngD thesis) |
Keywords: | natural language generation, hidden Markov models, Baum-Welch algorithm, Forward-Backward algorithm, expectation–maximization algorithm, Elixir, Erlang/OTP |
Number of Pages: | 65 |
Language of Content: | Slovenian |
Mentor / Comentors: | doc. dr. Andrej Brodnik (ID 5540), Mentor |
Link to COBISS: | http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537138883) |
Institution: | University of Ljubljana |
Department: | Faculty of Computer and Information Science |
Item ID: | 3606 |
Date Deposited: | 16 Sep 2016 14:21 |
Last Modified: | 26 Sep 2016 09:02 |
URI: | http://eprints.fri.uni-lj.si/id/eprint/3606 |