Keywords: Kinect; gesture command; human joint; human action imitation; humanoid robot

I. INTRODUCTION

With the fast development of human-computer interaction (HCI), speech recognition and text recognition techniques have matured and are widely used in many areas of daily life. Gesture recognition using active human gestures is a newer type of HCI technique, and the development of the Kinect sensor produced by Microsoft has sped up work on gesture recognition [1]. Integrating the Microsoft Kinect sensor with a robot to create an effective interface between the human user and the robot machine for performing specific functions has been an interesting technical issue in recent years. In the work of [2], the Microsoft Kinect is applied to recognize different body gestures and to generate an interaction interface between the body gesture module and the humanoid robot Nao made by the Aldebaran company. In addition, a control system for a manipulator robot using the Microsoft Kinect and a proportional-derivative control algorithm is proposed in [3]; the study of [3] is thus a combination of a robot arm control system and the Microsoft Kinect. Integrating the Kinect, gesture recognition systems, and mobile devices for the application of interactive discussion can be seen in the work of [4].

Different from those integration studies of the Kinect and the robot, this paper develops an HCI system in which Kinect-based gesture command control is performed for human action imitation by a humanoid robot. Figure 1 depicts the practical scenario of human action learning by the humanoid robot with a Kinect gesture capture sensor. The humanoid robot performs the active gesture given by the test user. The user's active gesture, viewed as the operation command for controlling the robot, is recognized by three different recognition schemes: the dynamic time warping (DTW) [5], hidden Markov model (HMM) [6], and principal component analysis (PCA)-based eigenspace [7] methods.

II. GESTURE COMMAND RECOGNITION

This work employs three recognition schemes, the DTW [5], HMM [6], and PCA-based eigenspace [7] approaches, for performing gesture recognition and then controlling the action of the humanoid robot according to the recognized command. This section briefly introduces these three recognition schemes.

A. DTW Recognition

Dynamic time warping [5] belongs to the class of dynamic programming techniques. This dynamic programming algorithm performs template-matching calculations by a time-warping technique [5]. The time warping of DTW essentially searches for an optimal alignment path between the test data and a reference template. When performing DTW time-warping calculations, the similarity between the test data and the reference template is derived; a high distortion between the two denotes a low similarity.

In this work of Kinect-based gesture command recognition for humanoid robot imitation, fourteen human gesture commands are designed. Training data for these fourteen gestures are collected and then used to establish the DTW reference template database. When a test user performs a gesture recognition test in the test phase, each template in the DTW reference template database is compared with the test user's active gesture by time warping.
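As a sketch of how this template comparison can work, the following minimal Python example warps a test sequence of per-frame feature vectors against each reference template and picks the least-distorted one. This is an illustrative sketch, not the authors' implementation; the feature representation and gesture labels are assumptions.

```python
import math

def dtw_distance(test, template):
    """Classic dynamic time warping distance between two sequences of
    feature vectors (e.g. flattened 3-D joint coordinates per frame)."""
    n, m = len(test), len(template)
    INF = float("inf")
    # D[i][j] = minimal accumulated distortion aligning test[:i] with template[:j]
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = math.dist(test[i - 1], template[j - 1])  # local distortion
            D[i][j] = cost + min(D[i - 1][j],      # stretch the template
                                 D[i][j - 1],      # stretch the test data
                                 D[i - 1][j - 1])  # advance both
    return D[n][m]

def classify_gesture(test, template_db):
    """Return the gesture label whose reference template yields the
    lowest warped distortion (i.e. the highest similarity)."""
    return min(template_db,
               key=lambda label: dtw_distance(test, template_db[label]))
```

Because a high accumulated distortion denotes a low similarity, the recognition decision is simply the template with the minimal DTW distance.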
B. HMM Recognition

The hidden Markov model [6] has been widely used in the field of pattern recognition, covering speech recognition, text recognition, and, in this study, human gesture recognition. As with the above-mentioned DTW recognition scheme, the HMM recognition task also includes training and test stages. Different from the template matching between two templates in DTW, the HMM approach belongs to the category of model-based techniques; therefore, in the training stage, HMM models that contain statistical information about the training patterns are built. In fact, HMM was adopted early in speech recognition, where the HMM probability model is employed to describe the pronunciation characteristics of the speaker's uttered speech signals [8]. In this study of gesture recognition, the HMM probability model is used to describe the movement characteristics of the actor's operative gestures, especially the gestures captured by the Kinect sensing camera.

In total, fourteen HMM gesture classification models are established in this study. Each of the fourteen HMM gesture models is used to calculate a likelihood for the test actor's gesture in the test stage. The likelihood derived between each trained HMM state sequence model and the input test gesture of a test actor is used to evaluate the recognition result of the test gesture; the label of the trained HMM state sequence model with the highest likelihood is taken as the recognition outcome. In this study of Kinect-based gesture command recognition for humanoid robot imitation, left-to-right state transitions are adopted in both HMM training and HMM testing.
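The likelihood-based decision described above can be sketched with a minimal discrete-observation forward algorithm. The toy two-state models, observation symbols, and labels below are illustrative assumptions, not the trained fourteen-class models of this study; a left-to-right topology is expressed by zero probabilities on backward transitions.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under one HMM,
    computed with the forward algorithm. `start` is the initial state
    distribution, `trans` the state transition matrix (left-to-right:
    entries below the diagonal are zero), `emit` the emission matrix."""
    n = len(start)
    # alpha[s] = joint probability of the observations so far and state s
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [emit[s][o] * sum(alpha[p] * trans[p][s] for p in range(n))
                 for s in range(n)]
    total = sum(alpha)
    return math.log(total) if total > 0.0 else float("-inf")

def recognize_gesture(obs, models):
    """Return the label of the trained HMM giving the highest likelihood,
    mirroring the maximum-likelihood decision rule of the text."""
    return max(models, key=lambda lbl: forward_log_likelihood(obs, *models[lbl]))
```

In practice each gesture class would have its own trained state sequence model, and the test observation sequence would come from quantized Kinect joint features rather than toy symbols.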
C. PCA-based Recognition

The principal component analysis technique has been widely used in image processing [7], and the eigenspace method employing the PCA technique has been proved effective in pattern recognition. This paper also explores the effectiveness of the PCA-based eigenspace approach for achieving humanoid robot imitation by Kinect-based gesture command recognition.

In this study, the PCA-based eigenspace approach for gesture recognition employs the classical principal component analysis technique, and it includes two main stages: PCA operations on the features of human activities, and the establishment of the eigenspace representing all collected gesture information. As mentioned, fourteen human gesture commands are designed, and training data for these fourteen gestures are collected and then used to establish the eigenspace. When a test user performs a gesture recognition test in the test phase, the recognition decision process locates the position of the test gesture data in the eigenspace and then finds the best-matched gesture category among the fourteen gesture classes as the recognition outcome.

III. HUMANOID ROBOT IMITATIONS OF HUMAN ACTIVE GESTURES

Gesture recognition using the DTW, HMM, and PCA-based eigenspace methods is integrated into the overall human-machine interactive system as the command mechanism for controlling the action of a humanoid robot. Figure 2 depicts the joint distributions of the adopted humanoid robot (the left side of Fig. 2) and the Kinect-captured human skeleton (the right side of Fig. 2). As can be seen in Fig. 2, the robot adopted to imitate the human gestures is the Bioloid humanoid robot. The Bioloid humanoid robot is produced by the South Korean company Robotis and is composed of components and modular servomechanisms (called artificial joint motors) that can be arranged according to the requirements of the user [9]. In this work, there are fourteen gesture commands in total, and therefore fourteen different setting configurations for the fourteen corresponding robot actions are made.

Fig. 2. Bioloid humanoid robot joints (left) and Kinect-captured human skeleton joints (right).

The number of joints in the Kinect-captured human skeleton is 20, which differs from the number of artificial joint motors in the Bioloid humanoid robot: the Bioloid humanoid robot has 18 modular servomechanisms, each of which represents a corresponding artificial joint motor. In this study, the gesture operated by the test user is recognized, and the three-dimensional positions of a series of joint sets, each set containing the 20 joints of the Kinect-captured human skeleton, are determined. The setting configuration for the corresponding robot action is then made according to all the derived position information of the joint sets in the Kinect-captured human skeleton and the real gesture action of the human actor.
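A minimal sketch of the kind of mapping described above, from three captured 3-D joint positions to a discrete servo goal position, is given below. The 300-degree span and 1024-step resolution are assumptions in the style of the Bioloid's modular joint motors, not values stated in this paper, and the joint triple is illustrative.

```python
import math

def joint_angle(a, b, c):
    """Interior angle in degrees at joint b formed by the 3-D points
    a-b-c, e.g. shoulder-elbow-wrist from the 20-joint Kinect skeleton."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    cos = max(-1.0, min(1.0, dot / (n1 * n2)))  # clamp rounding error
    return math.degrees(math.acos(cos))

def angle_to_servo(angle_deg, span_deg=300.0, resolution=1024):
    """Map a joint angle onto a discrete servo goal position in
    0..resolution-1 over span_deg (assumed servo characteristics)."""
    angle_deg = max(0.0, min(span_deg, angle_deg))
    return round(angle_deg / span_deg * (resolution - 1))
```

Since the robot has only 18 artificial joint motors for 20 skeleton joints, a real configuration would apply such a conversion only to the subset of human joints that have a corresponding servomechanism.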
humanoid robot imitation action. As mentioned, the robot used in this work is the Type-A Bioloid humanoid robot, which has 18 artificial joints. This number of 18 joints is smaller than the 20 Kinect-captured skeleton joints, and therefore, in some situations, the gesture command is correctly recognized but the Bioloid humanoid robot performs an imperfect imitation action owing to the restricted joint number and the specific joint distribution. Fourteen human active gestures are devised in this work, and not all of them are well imitated by the Bioloid humanoid robot even when the correct gesture command is received. Figure 4 depicts the Bioloid humanoid robot action of lifting the left foot with both hands held, with the best imitation owing to a perfect match between the human joints and the robot joints; in this category of active gestures, the degree of joint matching between the Bioloid humanoid robot and the Kinect-captured human skeleton is high, and therefore the robot imitation action is satisfactory. The humanoid robot action of putting both hands on the hips with the left foot lifted to the left side, with a passable imitation, is shown in Fig. 5; the robot imitation action is not ideal but still acceptable owing to a reasonably proper match between the human joints and the robot joints. Figure 6 shows the humanoid robot action of handing the phone using the right hand, with a dissatisfactory imitation owing to a bad match between the human joints and the robot joints.

Fig. 4. The humanoid robot action of lifting the left foot with both hands held, with the best imitation due to a perfect match between the human joints and the robot joints.

Fig. 5. The humanoid robot action of putting both hands on the hips with the left foot lifted to the left side, with a passable imitation due to an acceptable match between the human joints and the robot joints.

Fig. 6. The humanoid robot action of handing the phone using the right hand, with a dissatisfactory imitation due to a bad match between the human joints and the robot joints.

V. CONCLUSIONS

In this paper, the popular Microsoft Kinect sensor and a humanoid robot are properly integrated for humanoid robot action imitation applications. The humanoid robot with artificial joint servomechanisms can imitate the human's specific active gestures according to the gesture command made by the test user. The actor's active gesture captured by the Kinect platform is viewed as the control command for humanoid robot imitation. The DTW, HMM, and eigenspace recognition schemes are employed for recognizing the gesture control command in this work. Experiments show that the presented Kinect-based gesture command control method is effective and efficient for humanoid robot action imitation.

ACKNOWLEDGMENT

This research is partially supported by the Ministry of Science and Technology (MOST) in Taiwan under Grant MOST 103-2218-E-150-004.

REFERENCES

[1] Z. Zhang, "Microsoft Kinect sensor and its effect," IEEE Multimedia, vol. 19, no. 2, pp. 4-10, 2012.

[2] L. Cheng, Q. Sun, H. Su, Y. Cong, and S. Zhao, "Design and implementation of human-robot interactive demonstration system based on Kinect," Proc. 24th Control and Decision Conference (CCDC), 2012, pp. 971-975.

[3] R. Afthoni, A. Rizal, and E. Susanto, "Proportional derivative control based robot arm system using Microsoft Kinect," Proc. IEEE International Conference on Robotics, Biomimetics, and Intelligent Computational Systems (ROBIONETICS), 2013, pp. 24-29.

[4] V. Tam and L.-S. Li, "Integrating the Kinect camera, gesture recognition and mobile devices for interactive discussion," Proc. IEEE International Conference on Teaching, Assessment and Learning for Engineering (TALE), 2012, pp. H4C-11-H4C-13.

[5] H. Sakoe and S. Chiba, "Dynamic programming algorithm optimization for spoken word recognition," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 26, no. 1, pp. 43-49, 1978.

[6] L. R. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.

[7] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.

[8] I. J. Ding, "Speech recognition using variable-length frame overlaps by intelligent fuzzy control," Journal of Intelligent and Fuzzy Systems, vol. 25, no. 1, pp. 49-56, 2013.

[9] J.-K. Han and I.-Y. Ha, "Educational robotic construction kit: Bioloid," Proc. 17th World Congress of the International Federation of Automatic Control (IFAC), 2008, pp. 3035-3036.