Nejc Bernot (2012) Voice control of the ZPlet interpreter by using the Sphinx-4 system. EngD thesis.
Abstract
For humans, speech is the most natural form of communication with their environment. For this reason, research into systems that would enable computers to recognize speech began as early as the 1940s. The purpose of this diploma thesis is to present an overview of the current state of computer speech recognition, the Java speech recognition system Sphinx-4, and the integration of this system with the ZPlet interpreter, which can be used for playing games written according to the Z-machine standard. The introduction contains an overview of the basic structure of speech, a description of the speech recognition process, a list of the different types of speech recognition systems and a description of the basic advantages of using speech recognition. The following chapter is dedicated to the theory behind hidden Markov models, which form the core of most speech recognition systems currently in use. The third and fourth chapters contain an overview of the structure and operation of the Sphinx-4 system and the ZPlet interpreter. The last few chapters describe the process of integrating Sphinx-4 and ZPlet, the steps needed to run any Z-machine program on the integrated system, and the results of practical tests of the accuracy of Sphinx-4's speech recognition.