
Project Overview

To recognize verbal as well as gestural commands, similar approaches were used for each modality. It is well known that speech recognition depends on measuring and classifying the spectral and temporal features of the acoustic signal. It follows naturally that gesture recognition can also be performed by classifying features of gestures. Which gesture features are necessary to recognize a gesture uniquely is still an open question. Since the gestures for this project are made up of repeated hand motions and specific hand shapes, features that exploit these traits were chosen.

Hidden Markov Models (HMMs) provide a powerful tool for recognizing specific sequences of temporal and spectral features. Thus, I used Entropic Research's HMM Tool Kit (HTK) version 2.0 to create HMMs for recognition. Because each gesture feature sequence is modeled with an HMM, gesture recognition is treated the same way as isolated word speech recognition. Similarly, the speech recognition system is an isolated word recognizer.

In combined sequences where both auditory and visual information are present, the two modalities are assumed to be independent: the maximum likelihood probabilities from the gesture and speech recognizers are multiplied for each type of command (i.e. $P(LEFT \,gesture) \cdot P(LEFT\, speech),\, P(UP \,gesture) \cdot P(UP\, speech)$, etc.), and the command with the maximum product is chosen.
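The combination rule for sequences containing both speech and gesture can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used in the project: the function name `fuse_commands`, the command labels, and the probability values are all hypothetical, and the scores are taken as log-likelihoods (as HMM recognizers usually report them), so the product of probabilities becomes a sum of logs.

```python
import math


def fuse_commands(gesture_loglik, speech_loglik):
    """Pick the command maximizing P(cmd gesture) * P(cmd speech).

    Both arguments map a command label to the log-likelihood reported by
    that modality's recognizer (hypothetical interface, for illustration).
    """
    # Only commands scored by both recognizers can be combined.
    shared = set(gesture_loglik) & set(speech_loglik)
    if not shared:
        raise ValueError("no command was scored by both recognizers")
    # Independence assumption: multiplying probabilities is the same as
    # adding log-likelihoods, which also avoids numerical underflow.
    combined = {cmd: gesture_loglik[cmd] + speech_loglik[cmd] for cmd in shared}
    best = max(combined, key=combined.get)
    return best, combined


# Made-up scores: the gesture recognizer favors LEFT, the speech one UP.
gesture = {"LEFT": math.log(0.6), "UP": math.log(0.3), "DOWN": math.log(0.1)}
speech = {"LEFT": math.log(0.2), "UP": math.log(0.7), "DOWN": math.log(0.1)}

best, scores = fuse_commands(gesture, speech)
print(best)  # UP, since 0.3 * 0.7 = 0.21 exceeds 0.6 * 0.2 = 0.12
```

In this made-up case neither modality alone decides the outcome: the gesture recognizer prefers LEFT, but the stronger speech evidence for UP gives UP the larger product.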



 
Greg Berry
9/15/1997