pose for VR interaction. Other important poses from the above applications that either form part of a dynamic gesture or are used as-is include: Open Hand, Fist, Pinch, and Thumbs-Up.
2.2.4.2 Other Games and Simulations
The Kinect was also used to control an avatar in the Second Life virtual world [47].
Users are able to control the camera by pretending to hold an imaginary window pane in their hands. This means that both hands are roughly the same distance in front of the user and are in a grasping pose as though they were holding the imaginary pane. Moving the pane to the left or right pans the camera left or right respectively, while pushing the pane forward or backwards zooms the camera in or out respectively. The camera is rotated clockwise by moving the left hand forward and the right hand back, which is the motion the user makes to rotate the imaginary pane.
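As a rough illustration of this mapping, the sketch below converts two tracked hand positions into camera pan, zoom, and rotation deltas. The coordinate frame, gain constants, and function name are assumptions made for illustration; they are not details reported in [47].

```python
# Illustrative sketch of the "window pane" camera mapping described above.
# Coordinate frame assumption: x = right, y = up, z = away from the user.
# Gains are placeholders, not values taken from the cited system.

PAN_GAIN = 1.0      # camera pan per metre of lateral pane movement
ZOOM_GAIN = 1.0     # camera zoom per metre of forward/backward movement
ROTATE_GAIN = 1.0   # camera roll per metre of depth difference between hands

def pane_camera_update(left_hand, right_hand, prev_centre):
    """Map two grasping-hand positions (x, y, z tuples) to camera deltas.

    Returns (pan, zoom, rotate, new_centre).
    """
    # The "pane" is tracked by the midpoint of the two hands.
    centre = tuple((l + r) / 2.0 for l, r in zip(left_hand, right_hand))

    # Moving the pane left/right pans the camera; pushing it forward/back zooms.
    pan = PAN_GAIN * (centre[0] - prev_centre[0])
    zoom = ZOOM_GAIN * (centre[2] - prev_centre[2])

    # Left hand forward and right hand back rotates the camera clockwise.
    rotate = ROTATE_GAIN * (left_hand[2] - right_hand[2])

    return pan, zoom, rotate, centre
```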
– Deictic: Gestures that involve pointing to identify an object or location within the application domain, such as pointing to an object.
– Semaphores: Gestures that could be either static or dynamic and have a meaning to be communicated to the application. Most gesture recognition research focuses on this gesture style.
– Gesticulation: Gestures that are used naturally during conversations.
– Manipulation: Gestures that have a strong relationship between how the hand moves and how a manipulated object moves, such as grabbing and moving an object.
– Sign language: Gestures similar to semaphoric gestures, with the difference being that they are based on a spoken language.
• Application Domain: The application in which the gestures are applied, such as desktop use or virtual reality.
• Enabling Technologies: The technologies used in order to capture the gestural data from the user.
• System Response: The means by which the system responds to a gesture, for example visual, audio, or through commands to the CPU.
This taxonomy is broad, but lacks detail in separating hand shapes from one another.
In more recent research, Vafaei argues that previous taxonomies, such as the one by Karam and Schraefel [24], are too broad and do not capture specific dimensions, such as the physical form of the hand [60]. He claims that the older taxonomies address gestures for human communication rather than gestural interaction with computers. Vafaei proposed a taxonomy by adjusting and combining dimensions used in the taxonomies of Wobbrock et al. [67] and Ruiz et al. [50]. The categories defined in the taxonomy include: Nature, Form, Binding, Temporal, Context, Dimensionality, Complexity, Body Part, Handedness, Hand Shape, and Range of Motion.
A dimension of note is the Hand Shape dimension, which has assignable values such as Flat, Open, Bent, and Curved. Vafaei notes that, at the time of his investigation, hand shape had not been used as a dimension in existing taxonomies. A user elicitation study was performed to determine the common hand shapes that users make in gestural interaction. Figure 2-2 lists the common hand shapes discovered.
Figure 2-2: Common hand shapes extracted from the user elicitation study by Vafaei [60].
While the hand shapes listed by Vafaei provide insight into what could be contained
in a comprehensive list of poses, there are no further sub-categories of the Hand Shape dimension. This makes it difficult to create and verify the comprehensiveness of a set of hand poses, as ideally one would want to ensure that the set of hand poses covers all sub-categories of the Hand Shape dimension.
Mo devised a means to notate hand poses by proposing a notation language named GeLex [43]. In GeLex, each finger was described by a Finger Pose (Figure 2-3), and each relationship between two fingers was described by a Finger Inter-relation (Figure 2-4).
Figure 2-3: Finger poses defined in GeLex [43]. The poses illustrated in the right-hand figure, described from top-left to bottom-right: Point, BendHalf, Bend, CloseHalf, and Close.
Figure 2-4: Finger Inter-relations defined in GeLex [43]. From left to right: Group, Separate, Cross, and Touch.
From these definitions of finger poses and inter-relations, an encoding technique was devised to describe a single hand pose using a series of integers. Each hand pose was described using five integers describing the pose of each finger, followed by four integers describing the relationship between the thumb and each of the other fingers,
followed by three integers describing the relationship between adjacent non-thumb fingers. Therefore, a single hand pose is described by a twelve-dimensional vector of integers.
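As a sketch of this encoding, the snippet below builds the twelve-integer vector from the finger pose and inter-relation states shown in Figures 2-3 and 2-4. The specific integer values assigned to each state, the field ordering, and the helper names are illustrative assumptions rather than the exact GeLex encoding.

```python
from enum import IntEnum

class FingerPose(IntEnum):
    # Finger pose states from Figure 2-3 (integer values are illustrative).
    POINT = 0
    BEND_HALF = 1
    BEND = 2
    CLOSE_HALF = 3
    CLOSE = 4

class InterRelation(IntEnum):
    # Finger inter-relation states from Figure 2-4 (integer values are illustrative).
    GROUP = 0
    SEPARATE = 1
    CROSS = 2
    TOUCH = 3

def encode_hand_pose(finger_poses, thumb_relations, adjacent_relations):
    """Encode a hand pose as a twelve-dimensional vector of integers:
    five finger poses, four thumb-to-finger inter-relations, and three
    inter-relations between adjacent non-thumb fingers."""
    assert len(finger_poses) == 5
    assert len(thumb_relations) == 4
    assert len(adjacent_relations) == 3
    return [int(v) for v in (*finger_poses, *thumb_relations, *adjacent_relations)]

# Example: a hand with all fingers extended and separated.
open_hand = encode_hand_pose(
    [FingerPose.POINT] * 5,
    [InterRelation.SEPARATE] * 4,
    [InterRelation.SEPARATE] * 3,
)
```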
GeLex separates hand poses very well into intuitive categories, and could provide a solid foundation for a set of representative gestures.
Since many recent studies in hand gesture recognition involve user-elicitation, Choi et al. set the focus of their study on developing a taxonomy that allows researchers to notate these gestures systematically [8]. Figure 2-5 depicts how they separated gestures into categories.
Figure 2-5: The taxonomy developed by Choi et al. [8]
The Hand Shape category is analogous to hand pose, and unlike most previous taxonomies, the Hand Shape category was further separated into sub-categories. Choi et al. based their categorization of Hand Shape on GeLex by Mo [43], and divided Hand Shape into Finger Poses and Finger Inter-relations. They simplified the Finger Poses and expanded on the Finger Inter-relations in GeLex. Figure 2-6 illustrates the Finger Poses and Finger Inter-relations proposed by Choi et al.
Figure 2-6: a) Finger Pose states. b) Finger Inter-relation states. Extracted from Choi et al. [8]
Another category of note in their taxonomy is the Hand Orientation category, which is further divided into Palm Orientation and Fist Face Orientation. Palm Orientation is the direction of the palm normal, while Fist Face Orientation is the direction the knuckles would point in if a fist were to be made. Both of these sub-categories can be assigned one of six states: forwards, backwards, left, right, up, or down.
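A minimal sketch of this orientation scheme is given below. Only the six directional states and the two sub-categories come from Choi et al.'s description; the class and field names are hypothetical.

```python
from enum import Enum

class Direction(Enum):
    # The six assignable states described for both orientation sub-categories.
    FORWARD = "forward"
    BACKWARD = "backward"
    LEFT = "left"
    RIGHT = "right"
    UP = "up"
    DOWN = "down"

class HandOrientation:
    """Hand Orientation in the sense of Choi et al.: the direction of the
    palm normal plus the direction the knuckles would face in a fist."""
    def __init__(self, palm: Direction, fist_face: Direction):
        self.palm = palm            # Palm Orientation
        self.fist_face = fist_face  # Fist Face Orientation

# Example: a hand held palm-down with the knuckles facing forward.
example = HandOrientation(palm=Direction.DOWN, fist_face=Direction.FORWARD)
```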
Older taxonomies are very broad, and such general categorization is not suitable for the purposes of this research. When evaluating the performance of a camera-based system, changes in hand shape and orientation have a more direct impact on recognition performance than gesture meanings and styles do. Of the taxonomies reviewed, Choi et al.'s taxonomy expands on GeLex and provides an in-depth means of separating gestures by hand shape and orientation. This provides a strong basis for creating a comprehensive pose set, as evaluating the set simply involves ensuring that each sub-category of Choi et al.'s Hand Shape and Hand Orientation categories is represented in the set. The construction of this pose set and a more in-depth view of Choi et al.'s notation method can be seen in Section 3.1.