Miha Filej (2016) Text Generation using Hidden Markov Model. EngD thesis.
Abstract
Natural language generation (NLG) is the task of producing text that feels natural to the reader. The goal of this diploma thesis is to study to what extent natural language generation can be achieved using statistical models, specifically hidden Markov models. The thesis covers the probability and information theory needed to define hidden Markov models and describes how such models can be used to generate text. Available tools for working with hidden Markov models are reviewed, compared, and assessed for their suitability for text generation. A library for hidden Markov models is implemented in Elixir. Two of the reviewed tools and the implemented library are used to generate text from a corpus of written Slovenian. A criterion for comparing generated texts is chosen and used to compare the models as well as to compare the generated texts to the corpus.
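For illustration, the sketch below shows the basic idea of sampling text from a hidden Markov model in Elixir, the language used for the implemented library. It is a minimal, hypothetical example: the states, transition and emission probabilities, and vocabulary are placeholders, and the code is not the thesis's actual library API. Hidden states evolve according to a transition matrix, and each state emits one observable word per step.

```elixir
defmodule HMMSketch do
  # Hypothetical model parameters: two hidden states, their transition
  # probabilities, and per-state emission probabilities over a tiny vocabulary.
  @states [:s1, :s2]
  @transitions %{s1: %{s1: 0.7, s2: 0.3}, s2: %{s1: 0.4, s2: 0.6}}
  @emissions %{s1: %{"danes" => 0.5, "je" => 0.5}, s2: %{"lep" => 0.5, "dan" => 0.5}}

  # Draw one key from a map of probabilities using a uniform random number.
  defp sample(dist) do
    r = :rand.uniform()

    Enum.reduce_while(dist, r, fn {key, p}, acc ->
      if acc <= p, do: {:halt, key}, else: {:cont, acc - p}
    end)
  end

  # Generate `n` words by walking the hidden states and emitting one word per step.
  def generate(n, state \\ :s1, words \\ [])
  def generate(0, _state, words), do: Enum.reverse(words)

  def generate(n, state, words) do
    word = sample(@emissions[state])
    next = sample(@transitions[state])
    generate(n - 1, next, [word | words])
  end
end

# Example usage: HMMSketch.generate(6) |> Enum.join(" ")
```

In an actual text-generation setting, the transition and emission probabilities would not be hand-written as above but estimated from a corpus, for example with the Baum-Welch (expectation-maximization) algorithm mentioned in the keywords.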
Item Type: | Thesis (EngD thesis) |
Keywords: | natural language generation, hidden Markov models, Baum-Welch algorithm, Forward-Backward algorithm, expectation–maximization algorithm, Elixir, Erlang/OTP |
Number of Pages: | 65 |
Language of Content: | Slovenian |
Mentor / Comentors: | doc. dr. Andrej Brodnik (ID 5540), Mentor |
Link to COBISS: | http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537138883) |
Institution: | University of Ljubljana |
Department: | Faculty of Computer and Information Science |
Item ID: | 3606 |
Date Deposited: | 16 Sep 2016 14:21 |
Last Modified: | 26 Sep 2016 09:02 |
URI: | http://eprints.fri.uni-lj.si/id/eprint/3606 |