Katja Cetinski (2010) Voice controlled television set using system Sphinx-4. EngD thesis.
Abstract
The main purpose of this diploma thesis was to develop a system for voice control of a television set. I used the Sphinx-4 system to recognize television commands. Because free Slovenian speech recordings are hard to obtain, I created my own speech database. I developed the application in the NetBeans IDE; it is written in the Java programming language. The application recognizes a voice command and sends the corresponding data to an Arduino development board. Based on the received data, the Arduino sends the appropriate IR signal to the television. The thesis is divided into two parts. The first part is mainly theoretical. It introduces the basics of speech recognition and of a general recognition system, followed by the theoretical basis of Hidden Markov Models, one of the principles used in speech recognition systems. It then presents the Sphinx-4 system and SphinxTrain, the latter of which is used to train the acoustic model. The first part ends with a description of the Arduino development board. Together, these chapters cover the basics needed to understand the whole system. The second part describes the development of the speech-controlled television system and includes testing and a conclusion. Some shorter parts of the source code are included in the text of the thesis; the rest can be found on the CD that accompanies it.