Grega Kres (2011) Computer vision based sign language recognition. EngD thesis.
This thesis focuses on methods for recognizing signs of the American manual alphabet in still images and video. The American manual alphabet was chosen because material is readily available and because of the properties of the alphabet itself: it is single-handed and few of its signs require movement. The methods are divided into two chapters. The first chapter describes methods based on feature extraction. The image is first segmented using color filters, then features are extracted and converted into numerical form, and finally classification is performed. Classification is based on nearest neighbor search, which requires a metric to be defined so that the distance to neighboring examples can be calculated. The metric used is detailed in this chapter as well. The second chapter describes methods based on template matching. Unlike the methods in the previous chapter, templates are not represented in numerical form but as binary images. A set of templates is constructed from a group of training images. An input image is then compared to every template in the set and the best match is returned. We have developed three variants of the algorithm, each with different classification accuracy and speed. We then test the speed and accuracy of the classification on both still images and video. Testing on video raises certain problems, especially when the video is captured in real time with a webcam. Because such capture is generally of lower quality, noise is introduced into the image, which severely affects classification accuracy. We explore several methods that help avoid this issue. In the final chapter we propose improvements that could be the focus of further research.
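The two classification steps described above can be illustrated with a minimal sketch. It is not the implementation from the thesis. The Euclidean metric, the pixel-agreement score, and all function names (`euclidean`, `nearest_neighbor`, `pixel_agreement`, `best_template`) are assumptions made for illustration, since the abstract names neither the metric nor the template comparison measure.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(features, examples, metric=euclidean):
    """Return the label of the training example closest to `features`.

    `examples` is a list of (feature_vector, label) pairs.
    """
    closest = min(examples, key=lambda example: metric(features, example[0]))
    return closest[1]


def pixel_agreement(image, template):
    """Fraction of pixels that are equal in two same-size binary images (assumed score)."""
    total = sum(len(row) for row in image)
    same = sum(
        p == q
        for image_row, template_row in zip(image, template)
        for p, q in zip(image_row, template_row)
    )
    return same / total


def best_template(image, templates):
    """Compare `image` to every template and return the label of the best match.

    `templates` maps a label to a binary image given as a list of rows of 0/1.
    """
    return max(templates, key=lambda label: pixel_agreement(image, templates[label]))
```

In both functions every stored example or template is visited once per input image, so classification time grows linearly with the size of the training set. That is one reason variants of such algorithms can trade accuracy against speed.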