RESEARCH ARTICLE
Abstract
The objective of this study was to assess the suitability of the Microsoft Kinect depth camera as a tool in segment scanning,
segment tracking and player tracking. A mannequin was scanned with the Kinect and a laser scanner. The geometries
were truncated to create torso segments and compared. Separate shoulder abduction (−100° to 50°) and flexion motions
(0° to 100°) were recorded by the Kinect (using free and commercial software) and a Motion Analysis Corporation (MAC)
system. Segment angles were compared. A participant's centre of mass (COM) was tracked over a 6 × 3 m floor area using
the Kinect and a MAC system and compared. Mean errors with uncertainty of the mass, COM position and principal
moments of inertia were −1.9 ± 1.6%, 0.5 ± 0.4% and 3 ± 2.6%, respectively. The commercial software gave the highest
accuracy, in which the maximum and root mean square errors (RMSEs) were 13.85° and 7.59° in abduction and 21.57° and
12.00° in flexion. RMSEs in X, Y and Z COM positions were 0.12, 0.14 and 0.08 m, respectively, although vertical position
(Y) was subject to a large systematic bias of 405 mm. The Kinect's low cost and depth camera are an advantage for sports
biomechanics and motion analysis. Although segment tracking accuracy is low, the Kinect could potentially be used in
coaching and education for all three application areas in this study.
Introduction
The Kinect is a motion sensing device for use in
home entertainment, which captures separate colour
and depth data at 30 Hz, at a resolution of 640 ×
480 pixels. The technology achieves this through the
use of two cameras (colour and monochrome) and an
infra-red (IR) projector. A pattern generated by the
projector is imaged by the monochrome camera.
This projected pattern is distorted (from a calibrated
datum) by objects in the scene. The disparities and
deformation are measured and translated into depth
information. The method is thought to be related to
structured light techniques (Scharstein & Szeliski,
2003), although specific details of the methods have
not been published.
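As an illustration of how a measured disparity can be translated into depth, the sketch below applies the generic triangulation relation Z = f·b/d used in stereo and structured light systems. It is not the Kinect's own (unpublished) method; the function name and the focal length and baseline values are assumptions chosen only for the example.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return triangulated depth in metres from a disparity in pixels.

    Generic relation Z = f * b / d for a projector-camera (or stereo) pair.
    This is an illustrative model, not the Kinect's published algorithm.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical values for illustration only (not Kinect specifications).
FOCAL_LENGTH_PX = 580.0  # assumed focal length in pixels
BASELINE_M = 0.075       # assumed projector-camera separation in metres

for disparity in (10.0, 20.0, 40.0):
    depth = depth_from_disparity(disparity, FOCAL_LENGTH_PX, BASELINE_M)
    print(f"disparity {disparity:5.1f} px -> depth {depth:.3f} m")
```

Because depth is inversely proportional to disparity in this model, a fixed disparity error produces a larger depth error for distant objects than for near ones.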
Cameras capable of measuring depth are not
new; they already exist in the form of laser-based
time-of-flight cameras, structured light systems
and camera-based triangulation systems. Depth
Correspondence: S. Choppin, Sheffield Hallam University, Centre for Sports Engineering Research, Sheffield, UK. E-mail: s.choppin@shu.ac.uk
© 2013 Taylor & Francis
Figure 1. The 3D meshes of the scanned torso segment. The left-hand mesh shows the geometry obtained by the Kinect, and the
right-hand mesh shows the geometry obtained by the ModelMaker
laser scanner.
Segment tracking
The position and orientation of body segments are
important when assessing performance, injury risk
and joint loading. Methods of obtaining joint
information range from simple, single-camera two-dimensional
methods to sophisticated methods using
multi-camera calibrated volumes or spatially sensitive sensors.
Of the three available Kinect drivers, only
OpenNI/NITE (www.openni.org) and Kinect for
Windows have segment tracking capabilities, and the two
approach the problem in significantly different ways.
The Primesense (NITE) software registers an initial
pose, from which a skeleton tracking algorithm locks onto
the participant, allowing tracking in subsequent
frames. Microsoft invested considerable resources in
developing a method that works differently,
requiring only a single frame to capture
body pose. This was achieved using machine learning
techniques with a large data-set of real and
synthesised body position data (Shotton et al.,
2013). It is also important to note that, at the time
of testing, the OpenNI tracking algorithm gave joint
positions and segment orientations, whereas Microsoft's gave only joint positions.
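When a tracker returns only joint positions, a segment angle can still be estimated from the vector joining two adjacent joints. The following is a minimal sketch under stated assumptions: the function name, the joint coordinates and the choice of a downward vertical reference along the Y axis are hypothetical and are not taken from either tracking package.

```python
import math


def segment_angle_deg(proximal, distal, reference=(0.0, -1.0, 0.0)):
    """Angle in degrees between the proximal->distal segment and a reference.

    `proximal` and `distal` are (x, y, z) joint positions. The default
    reference points straight down, assuming Y is the vertical axis.
    """
    segment = [d - p for p, d in zip(proximal, distal)]
    dot = sum(s * r for s, r in zip(segment, reference))
    norm_segment = math.sqrt(sum(s * s for s in segment))
    norm_reference = math.sqrt(sum(r * r for r in reference))
    if norm_segment == 0.0 or norm_reference == 0.0:
        raise ValueError("segment and reference must have non-zero length")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_segment * norm_reference)))
    return math.degrees(math.acos(cos_angle))


# Hypothetical joint positions in metres (shoulder and elbow).
shoulder = (0.0, 1.4, 0.0)
elbow_hanging = (0.0, 1.1, 0.0)     # arm by the side
elbow_horizontal = (0.3, 1.4, 0.0)  # arm raised to horizontal

print(segment_angle_deg(shoulder, elbow_hanging))
print(segment_angle_deg(shoulder, elbow_horizontal))
```

An angle computed this way is unsigned and does not distinguish flexion from abduction; separating the two would require projecting the segment vector onto the relevant anatomical plane first.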
The objectives of a biomechanics analysis are far
removed from those of the typical mass consumer.
Calibration poses can be tolerated and real-time
processing can be sacrificed if accuracy of tracking is
increased. IPI Soft (www.ipisoft.com), a commercial
motion capture package, records the colour and depth
streams from the Kinect and analyses them post-capture. Tracking is not real-time but the complexity
of the skeleton is increased (e.g. shoulder and feet
segments are included), making this a popular choice
for users in the computer animation community. The
software has also recently added support for dual
Kinect recording, with the claim of increased
accuracy due to a more complete point cloud.
To assess the accuracy of skeleton tracking
methods, the freely available NITE algorithms were
compared with IPI Soft. A 12-camera Motion
Analysis Corporation (MAC) system was used to
Figure 2. Object segmentation from images is easier and more reliable when depth information is used instead of colour. The image on the left
was taken with the standard colour camera (image converted to grey scale) in the Kinect. The image on the right was constructed from the
depth information returned by the Kinect. The plots below each image show the intensity of the values taken along each image in the position
of the white line.
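The depth-based segmentation illustrated in Figure 2 can be sketched as a simple threshold on depth values. This is a toy example rather than the procedure used in the study: the function name, the small depth frame, the depth band and the convention that 0 marks a missing reading are all assumptions.

```python
def segment_by_depth(depth_mm, near_mm, far_mm):
    """Return a boolean mask of pixels whose depth lies in [near_mm, far_mm]."""
    return [[near_mm <= d <= far_mm for d in row] for row in depth_mm]


# Hypothetical 4 x 4 depth frame in millimetres: a near object (about 1.5 m)
# in front of a far wall (3 m). A value of 0 is assumed to mean "no reading".
frame = [
    [3000, 3000, 3000, 3000],
    [3000, 1500, 1520, 3000],
    [3000, 1490, 1510, 3000],
    [0,    3000, 3000, 3000],
]

mask = segment_by_depth(frame, near_mm=1000, far_mm=2000)
foreground_pixels = sum(cell for row in mask for cell in row)

for row in mask:
    print("".join("#" if cell else "." for cell in row))
print("foreground pixels:", foreground_pixels)
```

The object separates cleanly here because its depth differs from the background by more than a metre, whereas a threshold on colour intensity depends on the contrast between object and background.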
Segment tracking
Figures 3 and 4 show movement traces for the flexion
and abduction movements for the IPI Soft and NITE
tracking algorithms. For segment angles captured
Discussion
This paper has explored the viability of the Kinect
for use in three distinct sports analysis themes:
Figure 3. A comparison between MAC and Kinect shoulder flexion segment angles. IPI Soft segment tracking is shown on the left and NITE
on the right.
Figure 4. A comparison between MAC and Kinect shoulder abduction segment angles. IPI Soft segment tracking is shown on the left and
NITE on the right.
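Agreement between two angle traces such as those in Figures 3 and 4 can be summarised by the RMSE and the maximum absolute error. The sketch below assumes the two traces are already time-aligned and sampled at the same instants; the function names and the sample values are hypothetical and are not data from this study.

```python
import math


def rmse(reference, measured):
    """Root mean square error between two equally long sequences."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(m - r) ** 2 for r, m in zip(reference, measured)]
    return math.sqrt(sum(squared) / len(squared))


def max_abs_error(reference, measured):
    """Largest absolute difference between two equally long sequences."""
    if len(reference) != len(measured) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    return max(abs(m - r) for r, m in zip(reference, measured))


# Hypothetical segment angles in degrees (reference system vs depth camera).
reference_angles = [0.0, 10.0, 20.0, 30.0]
measured_angles = [1.0, 12.0, 19.0, 33.0]

print("RMSE:", rmse(reference_angles, measured_angles))
print("max error:", max_abs_error(reference_angles, measured_angles))
```

The RMSE weights large deviations more heavily than small ones, so reporting it alongside the maximum error gives a fuller picture than either value alone.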
                        a (°)   CI (°)              CI
IPI Soft
  Flexion       0.984   4.41    3.17/5.59    1.06   1.02/1.09
  Abduction     0.998   10.7    11.2/10.2    1.12   1.11/1.14
NITE
  Flexion       0.993   3.49    3.30/3.68    1.45   1.42/1.49
  Abduction     0.988   8.61    6.55/10.6    1.26   1.22/1.30
Figure 5. A comparison between the MAC system and Kinect in recording participant COM over a 6 × 3 m area. The left plot shows
position in the X, Y and Z directions from top to bottom. The right plot shows participant movement as a projection onto the plane of the floor
as captured by the Kinect and MAC systems, shown in a 1:1 aspect.
      A (mm)   CI (mm)               CI
X     −74.7    −100/49.4     1.00    0.997/1.01
Y     405      366/440       0.587   0.548/0.629
Z     −161     −154/−167     0.885   0.875/0.896
resolutions, larger capture volumes, more sophisticated tracking techniques and increased sampling
rates. This will only increase the suitability of depth
cameras for sporting and coaching applications.
There is a need to develop specific software for users
in the sport and coaching community. We hope to
address this need in future research by releasing
software applications under a free licence; visit
www.depthbiomechanics.co.uk for more information.
Conclusions
The low cost and automatic tracking capabilities of
depth cameras (such as the Kinect) make them
potentially revolutionary for sports biomechanics
and motion analysis. Accuracy is currently not high
enough for some applications, but there is potential
for its use in coaching and education domains. In
order for the Kinect and future (more advanced)
depth cameras to benefit the sports analysis and
biomechanics community, there is a need for the
development of effective software and future studies
exploring their accuracy in specific domains.
References
Daffertshofer, A., Lamoth, C. J. C., Meijer, O. G., & Beek, P. J.
(2004). PCA in studying coordination and variability: A
tutorial. Clinical Biomechanics, 19, 415–428.
Dempster, W. T. (1955). Space requirements of the seated
operator. WADC technical report 55-159 (WADC-55-159,
AD-087-892). Retrieved from
http://www.mendeley.com/research/space-requirements-of-the-seated-operator/
Durkin, J. L., & Dowling, J. J. (2006). Body segment parameter
estimation of the human lower leg using an elliptical model with
validation from DEXA. Annals of Biomedical Engineering, 34(9),
1483–1493.