
CHI 2009 ~ Techniques for Mobile Interaction April 7th, 2009 ~ Boston, MA, USA

Graspables: Grasp-Recognition as a User Interface


Brandon Taylor
MIT Media Lab
20 Ames St. E15-346
Cambridge, MA 02139 USA
bttaylor@media.mit.edu

V. Michael Bove, Jr.
MIT Media Lab
20 Ames St. E15-368B
Cambridge, MA 02139 USA
vmb@media.mit.edu
ABSTRACT
The Graspables project is an exploration of how measuring the way people hold and manipulate objects can be used as a user interface. As computational ability continues to be implemented in more and more objects and devices, new interaction methods need to be developed. The Graspables System is embodied by a physical set of sensors combined with pattern recognition software that can determine how users hold a device. The Graspables System has been implemented in two prototypes, the Bar of Soap and the Ball of Soap. Applications developed for these prototypes demonstrate the effectiveness of grasp-recognition as an interface in multiple scenarios.

Author Keywords
Grasp, User Interface.

ACM Classification Keywords
H.1.2: Models and Principles: User/Machine Systems. H.5.2: Information Interfaces and Presentation: User Interfaces. K.8.0: Personal Computing: General. J.7: Computer Applications: Computers in Other Systems.

INTRODUCTION
The Graspables are devices developed to explore how measuring the way people grasp objects can enhance user interfaces. This paper first explains the rationale and inspiration behind using grasp-recognition as an interface, then provides a detailed description of how the sensors and software were implemented in the two Graspables prototypes. Next follows a discussion of the applications that have been developed to explore and demonstrate the capabilities of grasp-recognition. Lastly, we discuss our experiences with grasp-recognition, including methods for evaluation and improvement.

The origin of the Graspables can be traced back to a high-level discussion of ways to improve multi-function handheld devices. It was suggested that an ideal multi-function device would need to be capable of two things: it would need to automatically infer what users want to do with it, and it would need to be able to alter its affordances accordingly. When it wasn’t being used, the device would simply appear to be an undifferentiated block, like a bar of soap.

While the Graspables may not completely fulfill this vision, the idea of creating devices that implicitly understand users’ intentions, without the need for menus and direct commands, was the launching point for the project. As the project evolved, emphasis shifted away from multi-function handhelds to exploring how basic manipulations of objects can contain useful information. In other words, what can you learn from the way people hold an object? Can you distinguish whether a user wants to make a phone call or just look up a contact by the way they hold their phone? Can a golf club predict a slice if it is gripped improperly?

In pursuing these questions, the Graspables were constrained by the desire for a system that could be realistically implemented in existing objects and devices. This led us to shy away from approaches that require elaborate sensing environments or expensive input devices. The hope was that the right combination of sensors and software could give objects an enhanced understanding of their users’ actions without limiting portability or affordability.

Another key aspect of the research was the focus placed on the objects themselves. Instead of focusing on just creating a new interface method or a specific type of controller, we were very interested in understanding and exploring how people interact with a variety of different objects. Our view was that understanding how people grasp and interact with a coffee cup is potentially just as valuable as understanding how they interact with electronics. Thus, we wanted a system that could be implemented into arbitrary geometries.

Background
Nearly twenty years ago, Mark Weiser coined the term Ubiquitous Computing to describe the idea of a vast network of computing devices interacting unobtrusively to enhance productivity. While the proliferation and dispersion of computational power has certainly occurred, it has not yet “vanish[ed] into the background” [17].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CHI 2009, April 4–9, 2009, Boston, MA, USA.
Copyright 2009 ACM 978-1-60558-246-7/09/04…$5.00.


Projects like the Graspables fit into the realm of Ubiquitous Computing by trying to expand the ways in which computers are controlled. By developing grasp-recognition as a user interface, it is hoped that users can be presented with a more natural method of interacting with devices. Instead of seeing a device and trying to imagine how its menu system and buttons are mapped, grasp-recognition can leverage users’ intuitions about how devices should be used for certain functions.

Studies have demonstrated that certain computerized tasks can be more easily accomplished when properly modeled by physical devices [5]. Early work on the concept of Graspable User Interfaces suggested that by “facilitating 2-handed interactions, spatial caching, and parallel position and orientation control” physical computing could provide a richer interface than virtual, graphics-based interfaces [4].

A grasp-recognition based interface would, by virtue of its nature, capitalize on these advantages. Rather than creating controls through arbitrary key-mappings, the physical nature of the Graspables provides suggestive physical affordances and passive haptic feedback by approximating and representing real-world objects.

Motivation
Portable device interfaces provide a distinct challenge for designers. For more complex portable systems, there is a natural desire to mimic the operational semantics of the computer as much as possible. People are accustomed to the window metaphor of most desktop computer GUIs, so it makes sense to leverage this knowledge to some extent.

A common approach in portable devices is to imitate the clicking and dragging functions of a mouse with a touchscreen. Full keyboards are often implemented either in the form of physical buttons or virtual ones. Both approaches have drawbacks. Physical buttons are necessarily small and always present, even when an application only needs a subset of the keys. Virtual buttons, on the other hand, provide no tactile feedback, which can render them unusable for certain groups of users.

These issues have led researchers to explore other interaction methods that may prove more appropriate for handheld devices. A common example is the use of accelerometers in many cameras and phones for switching between portrait and landscape views. Another approach is to capitalize on the inherent mobility of handheld devices by exploring gestures as an interaction method. Studies have explored using gestures for tasks ranging from the detection of common usage modes [9] to the mapping of functions to relative body positions [1]. While it is hard to predict which new interfaces will catch on, successes like the iPhone’s Multi-Touch display provide encouragement for continuing research.

In implementing the Graspables System in the Bar of Soap and the Ball of Soap, we were interested in how the devices’ geometries impact what objects they can easily represent.

The most common input devices, such as mice, keyboards and even video game controllers, generally sacrifice representation in favor of more robust, general controls. Over time, these systems develop semantics of their own (think how similar most video game controllers are, or how people expect to be able to click icons in graphical interfaces) and people hardly even think about the control metaphors. However, there are exceptions.

Tablet and stylus systems exist to better bridge the gap between writing or drawing and computers. Video games can use specialized peripherals such as steering wheels or guns. These examples highlight the importance of objects for some tasks. While there is likely no way to completely avoid the tradeoff between robust and representative controls, it is certainly worth exploring how new interfaces can create more literal interactions and of what value these may be.

RELATED WORK
The Huggable is a robotic Teddy Bear being designed by the Personal Robots group at the MIT Media Lab to provide therapeutic interactions similar to those of companion animals. The Huggable is being developed with the goal of “properly detecting the affective content of touch” [15]. Towards this end, the Huggable is equipped with an array of sensors that detect the proximity of a human hand, measure the force of contact and track changes in temperature. The data from these sensors is then processed to distinguish interactions such as tickling, poking or slapping [14].

From a technical perspective, the goals of the Huggable are very similar to those of the Graspables System. Both seek to identify and understand the ways users manipulate an object. In many ways, the Huggable could be viewed as a sophisticated example of a grasp-recognition system. That said, there are obvious differences between the Graspables System described in this paper and the hardware/software system of the Huggable. The sensing hardware of the Huggable, for example, relies on dense arrays of Quantum Tunneling Composite (QTC) force sensors and broad electric field sensors for touch sensing, whereas the Graspables are implemented with a dense set of capacitive sensors. Additionally, whereas the Huggable is intimately connected to the Teddy Bear form factor, our work demonstrates a system that can readily be adapted into various geometries for different uses.

The Tango is a whole-hand interface designed by the Multisensory Computation Laboratory at Rutgers for the manipulation of virtual 3D objects [11]. The device is a hand-sized spherical object with a 3-axis accelerometer and an 8x32 capacitive sensing grid housed in a compressible dielectric material. The Tango is calibrated to detect


variations in pressure, from which a simplified hand model can be estimated.

The Tango uses spherical harmonics to create a rotationally invariant map of pressures [8]. These pressure maps can then be reduced using principal component analysis and classified using K-nearest neighbors. A 3D virtual environment in which the Tango was used to manipulate virtual objects was also developed.

While the Tango clearly shares certain objectives with the Graspables, there are significant differences in their respective implementations. First, the grid structure of the capacitive sensors and the classification software of the Tango would not directly translate to other device geometries, severely limiting the number of objects it could represent. Also, since the Tango is actually attempting to infer general hand poses, it requires additional constraints, such as single-hand use. In the end, while the sensing techniques and software analysis provide interesting references, the goals of the Tango require a significantly different approach than those of the Graspables.

When development began on the first version of the Bar of Soap, a similar study was being conducted by the Samsung Advanced Institute of Technology (SAIT) [2,7]. After receiving encouraging results from an initial study in which painted gloves were used to create image maps of grasp patterns, a prototype device was built for real-time grasp detection. The SAIT device contained a 3-axis accelerometer and 64 capacitive sensors. A user study was performed to try to classify 8 different use modes with the device.

The results from the SAIT study match up well with those of the initial Bar of Soap study [16], correctly classifying 75% to 90% of grasps across multiple users. The SAIT device uses non-binary capacitive sensors, a different sensor layout on a device of different physical dimensions, a unique set of use modes and different classification techniques from the Bar of Soap. In addition to these differences, our research goes beyond static grip recognition to explore how changing grasps and gestures can enhance interactions. We also look beyond common handheld electronics to see how grasp recognition can impact interactions with physical objects and virtual environments.

DESIGN
The Graspables System is a hardware and software platform capable of detecting how a user is manipulating a device. The system needs to be flexible enough to accommodate distinct sensor layouts for objects with different physical geometries. It is also important that the system be able to process and transmit data in real-time.

The Bar of Soap
The Bar of Soap was designed to explore how grasp-recognition could be of use in a variety of modern handheld devices. It also served as a test bed for the Graspables System’s hardware.

The Bar of Soap prototype, shown in Figure 1, is an 11.5x7.6x3.3 cm rectangular box containing a 3-axis accelerometer and 72 capacitive sensors. The capacitive sensors are controlled by three Qprox QT60248 chips, which treat each one as a discrete, binary sensor. An Atmel Atmega644 microcontroller in the device samples these sensors and can communicate results to a PC via Bluetooth. Low-power cholesteric LCD screens on the two largest faces can provide user feedback. Transparent capacitive sensors were developed and placed over both screens. This allowed the display surfaces to also function as sensing surfaces, which in turn allowed the Bar of Soap to better emulate functional devices with interactive touchscreens. Having screens on both sides preserved the symmetry of the device and allowed the Bar of Soap to function as a generic rectangular sensing device with two customizable faces.

The transparent sensors were created by placing thin film coated with Indium Tin Oxide (ITO) on opposite sides of a clear piece of acrylic. ITO is a transparent conductive material that fills the role of the interdigitated copper traces of the other capacitive sensors. We tested ITO sensors in a variety of patterns and sizes before choosing strips of approximately 5 mm width. This design has the advantage of being relatively simple to construct while providing sensitivity to finger touches comparable to that of the copper traces used elsewhere. Settings in the QT60248 chip were able to amplify the responses of sensors located further along the ITO to compensate for its resistance. Response was further improved by lining the edges of the device with a grounded copper strip.

Figure 1. The Bar of Soap

The Ball of Soap
As the Bar of Soap evolved from an exploration of ways to improve multi-function handhelds into a more general platform to explore grasp-recognition, we began to consider


the limitations of its physical form. While a small rectangular box provides an adequate representation of many handheld electronics, it has inherent limitations.

In order to explore interactions with different physical forms, the Ball of Soap was developed. Since a truly spherical object would create difficulties in laying out the capacitive sensors used in the Bar of Soap, we built the Ball of Soap as a small rhombicosidodecahedron. This 62-sided Archimedean solid provides flat faces near the quarter-square-inch size and surface density of the Bar of Soap’s sensors when the overall diameter approaches three inches.

Figure 2. The unassembled Ball of Soap

The surface structure of the Ball of Soap prevented the simple printing of the interdigitated copper traces used as sensors on the Bar of Soap. Instead, adhesive copper pads were cut and attached to the faces, with wires running to circuit boards inside the ball. We explored using a variety of trace shapes for each of the different face geometries, but found that using only the smallest, triangular arrangement provided a more consistent response across the capacitive sensors.

As can be seen in Figure 2, the small rhombicosidodecahedron shape also allowed the Ball of Soap to be separated into three sections for easier assembly. The two end pieces are identical, and each contains a Qprox chip that controls the 23 capacitive sensors on its surface. The center piece has 16 faces, 15 of which have capacitive sensors and one of which houses the power button and programming interface. Inside the Ball, attached to the center piece, is the main circuit board with the microcontroller, accelerometer, battery and Bluetooth chip. A grounding bracelet can be attached to improve sensor response.

APPLICATIONS
In the process of developing the Graspables System, applications were always a consideration. While it is hoped that the grasp-recognition technique is general enough to be applied to many other scenarios, specific objectives strongly influenced the design of the prototypes. This section will discuss the applications that have been developed for the Graspables implementations.

Multi-Function Handheld
As handheld electronics have become more powerful, it has become possible to add more functionality to individual devices. Whereas phones used to be just phones, most modern cell phones now take pictures, play music, browse the internet and more. One difficulty for designers has been how to lay out the appropriate affordances for these various functions given the small size of the devices. We noticed that many of the functions now provided by multi-function handhelds have been adopted from devices that people are accustomed to holding and operating in distinct ways. The Bar of Soap as a multi-function handheld application is designed to capitalize on people’s previous experiences by inferring how they want to use the device based on how they are holding it.

This application provides a very self-contained demonstration of the potential of grasp-recognition. The device passively senses its orientation and the position of a user’s hands, and then displays an interface corresponding to the most likely functionality mode. For demonstration purposes the sampling and classification routine is performed every three seconds, but it could easily be triggered by some sort of gesture. Figure 3 shows the Bar of Soap’s multi-function handheld mode switching application in use.

Figure 3. The Bar of Soap being held in a ‘phone’ grasp

Currently, the application switches between five functionality modes: camera, gamepad, phone, personal data assistant (PDA), and remote control. These modes were chosen to represent common, pre-existing handheld devices that were assumed to have relatively distinct interaction methods. The data used to train the Bar of Soap to distinguish the different modes was gathered from users who were given a functionality mode (camera, gamepad, phone, PDA or remote) and asked to hold the device however they felt was appropriate given the mode.
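The sense-classify-display loop just described is simple; the following is a minimal Python sketch, with hypothetical callback names (`read_sensors`, `show_interface`) and a stub standing in for the trained classifier, not code from the actual Bar of Soap firmware:

```python
import time

MODES = ["camera", "gamepad", "phone", "PDA", "remote"]

def classify_grasp(cap_states, accel):
    """Stand-in for the trained classifier: maps 72 binary capacitive
    readings plus a 3-axis accelerometer sample to the most likely
    functionality mode. A real implementation would evaluate the
    per-mode discriminants and return the argmax; this stub simply
    defaults to 'phone'."""
    return "phone"

def mode_switching_loop(read_sensors, show_interface, period_s=3.0):
    """Every period_s seconds (three in the demo), sample the sensors
    and display the interface for the most likely mode, redrawing
    only when the inferred mode changes."""
    current = None
    while True:
        cap_states, accel = read_sensors()   # 72 booleans + (x, y, z)
        mode = classify_grasp(cap_states, accel)
        if mode != current:
            show_interface(mode)
            current = mode
        time.sleep(period_s)
```

As the paper notes, the periodic trigger could equally be replaced by a gesture-driven one.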


Rubik’s Cube
Increasingly, gesture recognition using accelerometers and other sensors is being incorporated into handheld device interfaces [13,6]. To demonstrate that the Graspables are not just limited to distinguishing a small number of static grasps, we designed a virtual Rubik’s cube application. This application makes use of gesture recognition to control a virtual object using a tangible real-world interface.

Figure 4. The Bar of Soap controlling a virtual Rubik’s cube

This application, shown in Figure 4, exists as a Matlab script that streams raw sensor data from the Bar of Soap via Bluetooth. A graphical version of a Rubik’s cube is displayed on screen and is mapped to the Bar of Soap’s orientation as determined by the accelerometer data. Rotating the ends of the virtual Rubik’s cube is accomplished by sliding a finger across the sensors that are most closely mapped to that end on the Bar of Soap.

The Rubik’s cube application demonstrates how grasp-recognition can be used to provide a more tangible interface to virtual environments. By mapping the orientation of the virtual object to the real-world device, changing viewpoints is incredibly intuitive. Selecting different virtual objects can be as simple as picking up a different grasp-recognition equipped device. Lastly, this application shows how the sensors used for distinguishing static grasps can also be used to recognize dynamic manipulations and gestures.

Pitch Selection
In baseball, subtle differences in the way the pitcher grips the ball have a profound effect on the outcome of the pitch. Thrown correctly, a slider can become nearly unhittable as it darts away from the batter at the last second. However, the slightest error in its delivery can see the pitch landing hundreds of feet in the wrong direction.

Given the importance of fine finger manipulations to pitching, this seems an ideal scenario for the Graspables System. A baseball that can detect how it’s being held could be extremely useful in training players to throw certain pitches or diagnose potential delivery issues. On the other hand, baseball video games could use such a device to provide a method of pitch selection that is more realistic and engaging than pushing a button on a controller.

While individual grasps may vary slightly from pitcher to pitcher, in general the outcome (pitch type) is mapped to a certain grip relative to the baseball’s seams. Given these previously defined grasps, training data was acquired by having a single user appropriately hold the Ball of Soap for a set of pitch types. For this application, the skin of a baseball was wrapped around the Ball of Soap. Due to the four-way symmetry of a baseball, each individual pitch type can be thrown with the ball in four unique absolute orientations. Since the sensors on the Ball of Soap are aligned to the absolute orientation of the Ball rather than relative to the baseball’s seams, training data was collected for each pitch held in each of the four different absolute orientations.

Figure 5. The Ball of Soap with a baseball cover as a pitch selector

The pitch selection application, shown in Figure 5, operates as a Matlab script. Upon activation, Matlab opens a serial port for communication with the Ball of Soap and a screen presents the user with a pitcher ready to throw. The user then grips the Ball appropriately to throw a fastball, a curveball, or a splitter and makes a throwing gesture. The acceleration of the Ball triggers the classification routine, which in turn triggers an animation taken from Nintendo’s Mario Super Sluggers video game to display the selected pitch.

DATA ANALYSIS
In addition to building the Graspables hardware, we also had to develop methods for making sense of the data it produced. A significant amount of work went into creating appropriate data features and classification methods for each of the applications.


Multi-Function Handheld Data
Each time the Bar of Soap is sampled, data are gathered from 75 independent sensors. In order to create a more manageable feature space, we needed a way to process these data before applying classification techniques.

While we were testing the first version of the Bar of Soap (which had no capacitive sensors on one of the largest faces), we developed a Matlab script to visualize the data. This visualization tool, shown in Figure 6, had checkboxes laid out to represent the capacitive buttons, a 3D plot of the accelerometer readings and an icon display that could either present a sample’s class label or the result of a classifier.

Figure 6. A visualization of the data of a ‘phone’ grasp. This visualization layout was designed for the Bar of Soap V1, which did not have sensors on the front face.

Using this visualization tool, we noticed that the capacitive sensors had a strong tendency to be activated in regional groups. Additionally, it became clear that treating the sensors as having an absolute location reduced the accuracy of classifiers. To account for these facts, we began reducing the dataset by counting the number of activated capacitive sensors along each face (splitting the largest two faces into two halves) and orienting these groups according to the accelerometer readings. This method outperforms datasets reduced using Principal Component Analysis and Fisher Linear Discriminants, and automatically adjusts when the device is rotated or flipped.

Rubik’s Cube Data
In order to control the virtual Rubik’s cube, data is rapidly sampled via Bluetooth. The accelerometer data is smoothed and mapped to the orientation of the virtual cube. Additionally, sliding a finger across different faces of the Bar of Soap triggers a rotation of the corresponding part of the Rubik’s cube. While it might be possible to detect sliding gestures in a simpler manner, we wanted a method that would generalize to more complex gestures. Thus, we chose to implement hidden Markov models [12] to detect the sliding gestures.

In order to train the hidden Markov models, data was collected and labeled as a single user slid his finger over the face of the Bar of Soap. The sliding gesture was recorded in both directions along each edge of the Bar of Soap and along the outermost rows and columns of sensors on the largest two faces. While the sliding gesture was being recorded on a specific side, no particular attention was paid to how the user was holding the Bar of Soap. This ensured that data about manipulations that were not sliding gestures was also recorded.

The sliding gestures were modeled using a left-right hidden Markov model. The states represented the position of the finger as it activates either a single capacitive sensor or two as it slides between them. The number of states in the model depended upon the number of capacitive sensors on the side being modeled. In addition to the left-right HMMs modeling the sliding gestures, ergodic models exist to model general, non-sliding interactions.

These models are trained using the raw sensor data as observation sequences. The sliding models are trained using the corresponding sliding gestures. The general ergodic model is trained using the data from sliding gestures that do not correspond to the modeled area. For example, the data set that represents a sliding gesture along the short edge of the Bar of Soap is used to train the ergodic model of the long side.

The time sequences of activated capacitive sensors are then broken up into observation sequences corresponding to the different gesture models. The trained models are used to calculate the probability of observing such a sequence. If a sequence has a higher probability of being observed given one of the sliding gesture models, a sliding event is triggered.

If the sliding gesture is performed on either of the largest faces, the rotation will occur in the direction of the sliding gesture and on the corresponding end of the virtual cube. To rotate an end along the axis that is mapped perpendicular to the front and back faces, the user simply places their hand over one of the large faces and slides their finger along one of the edge faces in the direction of desired rotation. The virtual cube will interpret the covered side as the stationary side of the cube and rotate the opposite side.

Pitch Selection Data
As with the Bar of Soap multi-function handheld application, the capacitive sensors are grouped in order to reduce the size of the feature space. Instead of grouping by sides, as on the Bar of Soap, capacitive buttons are grouped around the 12 pentagonal faces of the small rhombicosidodecahedron. Each pentagonal face is surrounded by five square faces, each shared with a single other pentagonal face, and five triangular faces, each shared with two other pentagonal faces. These faces are weighted inversely to the number of groups they inhabit: an active pentagonal face receives a weight of 6, a square 3, and a triangle 2. Thus each of the twelve groups creates a feature with a value between 0 and 31 depending on the number of activated faces.
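The grouping scheme just described can be sketched as follows. This is a hypothetical Python fragment (the face ids and the group-membership table would come from the actual sensor layout), not code from the Graspables System itself:

```python
# Weights are inverse to group membership: a pentagon belongs to 1
# group (weight 6), a square to 2 (weight 3), a triangle to 3
# (weight 2). A fully active group gives the maximum feature value:
# 6 + 5*3 + 5*2 = 31.
WEIGHTS = {"pentagon": 6, "square": 3, "triangle": 2}

def group_features(active_faces, groups):
    """active_faces: set of face ids currently touched.
    groups: one entry per pentagonal group (twelve on the Ball of
    Soap), each a list of (face_id, shape) pairs covering the
    pentagon, its five squares and its five triangles.
    Returns one feature in the range 0-31 per group."""
    features = []
    for members in groups:
        total = sum(WEIGHTS[shape]
                    for face_id, shape in members
                    if face_id in active_faces)
        features.append(total)
    return features
```

Each feature thus runs from 0 (no face in the group touched) to 31 (the pentagon and all ten surrounding faces touched).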

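Returning to the sliding-gesture detection described under Rubik’s Cube Data: a sliding event fires only when some left-right model explains the observed sensor sequence better than the general ergodic model. A minimal sketch of that comparison, using a generic discrete-HMM forward algorithm with toy model parameters (not the models trained for the Bar of Soap):

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Forward algorithm for a discrete HMM.
    obs: sequence of observation symbols (ints)
    start[i]: initial probability of state i
    trans[i][j]: transition probability from state i to state j
    emit[i][o]: probability that state i emits symbol o
    Returns log P(obs | model)."""
    alpha = [s * emit[i][obs[0]] for i, s in enumerate(start)]
    for o in obs[1:]:
        alpha = [emit[j][o] * sum(alpha[i] * trans[i][j]
                                  for i in range(len(alpha)))
                 for j in range(len(alpha))]
    total = sum(alpha)
    return math.log(total) if total > 0 else float("-inf")

def detect_slide(obs, slide_models, general_model):
    """Trigger a sliding event only when some sliding model explains
    the observation sequence better than the general ergodic model.
    Each model is a (start, trans, emit) tuple."""
    general = log_likelihood(obs, *general_model)
    best = max(range(len(slide_models)),
               key=lambda k: log_likelihood(obs, *slide_models[k]))
    if log_likelihood(obs, *slide_models[best]) > general:
        return best        # index of the recognized sliding gesture
    return None            # treat as a general, non-sliding interaction
```

With a toy two-state left-right model, a monotone progression of activated sensors scores higher than the uniform ergodic model, while back-and-forth noise does not.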

The pitch recognition application for the Ball of Soap repeated this process with each user until we had data
operates very similarly to the multi-function mode samples from each of the five tested functionality modes.
switching application for the Bar of Soap. The capacitive For the single user part we collected 39 sample grasps in
sensors are grouped and processed as discussed above, then each functionality mode for a total of 395 grasps. For the
Bayesian discriminants are calculated. Each of the four multiple users part, each of the 13 users provided three
pitch orientations is treated as a separate class. Thus, for N sample grasps per mode to create a matching data set.
pitch types, 4xN determinants are calculated.
Single User Test Multi-User Test
The classification routine is triggered by a throwing
gesture. Thus, though the sensors are continually sampled, Templates 82.2% 75.4%
the discriminant functions are not calculated until the
Nnets 92.5% 79%
accelerometer values surpass a threshold. When the
threshold is crossed, the discriminants are calculated and KNN 95% 75.8%
the most likely pitch is selected.
Parzen 95.4% 72.3%
EVALUATION
One of the greatest difficulties our research faced was Bayes 95% 79%
finding an effective method for evaluating the Graspables.

On a general level, this is a difficulty facing any new method of interface. If it really is novel, then task completion comparisons with existing interfaces overlook new possibilities offered by the interface. On a more specific level, grasp-recognition is difficult to evaluate because of the lack of any ground truth. Even in situations where a grasp is universally recognized, such as the relations of fingers to baseball seams when throwing a four-seam fastball, the exact grip will vary from person to person based on hand size. The problem becomes even more challenging when dealing with less defined grasps. After all, who's to decide what the proper way to grasp a phone is?

This isn't to say that no evaluation can be done. Obviously, for grasp-recognition to be of any value, grasps must at least have enough meaning to be remembered and repeated by individuals. We conducted a two-part user study to examine, first, how reliably our system could recognize grasps and, second, how grasps associated with various devices vary across a population.

This study was conducted using the first version of the Bar of Soap's Multi-Function Handheld application. The procedure was the same for both parts of the study, the only difference being that the first part was conducted with a single user, whereas the second part was conducted with a total of thirteen individuals.

For this study, users were seated with the Bar of Soap in front of them on a table. They were told that they would be given a specific functionality mode (camera, gamepad, PDA, phone or remote control) and that they should then pick up the device and interact with it however they saw fit until instructed to set it back down.

After giving these instructions, we would begin recording data from the Bar of Soap and then verbally indicate what functionality mode the device should be treated as. Once the user had established a relatively stable pose with the device, we would label and save the data sample and have the user place the device back on the table. We then repeated this process until the desired number of samples of each mode had been collected.

For this study we tested a wide range of classifiers including Templates, Neural Networks, Bayesian Classification, k-Nearest Neighbors, Parzen Windows and General Linear Discriminants [3]. To test the reliability of the system, we used the single user dataset. Each classifier was trained using a randomly selected set of 29 sample grasps from each mode, then tested on the remaining 10 samples. This process was repeated 10 times using different training sets. To see how consistently grasps are associated with devices across multiple users, we trained the classifiers using the entire single user dataset and tested them with the multiple user dataset. The recognition rates for both studies are shown in Table 1.

Classifier    Single User    Multiple Users
GLD           87.5%          70.3%

Table 1. Recognition rates for different classification techniques for both single and multiple user datasets (only the General Linear Discriminant row is reproduced here).

We were pleased to find that multiple classification techniques were able to correctly identify a single user's grasps with over 90% accuracy. This led us to conclude that our sensor design was adequate for grasp-recognition. The relatively high rates of recognition across multiple users further encouraged us that, even without prompting or demonstration, people do have similar models of how to grasp things. In the end, we chose to implement the Bayesian classifier due to its high performance and minimal computational complexity.

DISCUSSION
While we were generally pleased with the results of our user study, there are a few caveats that must be kept in mind. Our study group varied in hand size and handedness; however, all the participants were fairly homogeneous in age (18 to 36) and familiar with electronic devices. Additionally, the tested functionality modes were selected based on the assumption that they would have distinct grasps associated with them. A large source of error in the multi-user study was when this assumption failed.
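The evaluation protocol described above (29 training and 10 test grasps per mode, repeated over 10 random splits) can be sketched as follows. This is a minimal illustration rather than the project's actual code: the grasp samples are synthetic stand-ins for the Bar of Soap's sensor readings, and the classifier is a Gaussian naive Bayes, one common form of the Bayesian classification the study tested.

```python
import numpy as np

MODES = ["camera", "gamepad", "PDA", "phone", "remote"]
DIM = 16        # stand-in dimensionality for the device's sensor feature vector
PER_MODE = 39   # 29 training + 10 test samples per mode, as in the study

rng = np.random.default_rng(0)

# Synthetic grasp data: one Gaussian cluster of sensor vectors per mode.
centers = rng.normal(0.0, 3.0, size=(len(MODES), DIM))
X = np.vstack([c + rng.normal(0.0, 1.0, size=(PER_MODE, DIM)) for c in centers])
y = np.repeat(np.arange(len(MODES)), PER_MODE)

def train_nb(X, y):
    """Fit per-class means and variances for a Gaussian naive Bayes model."""
    classes = np.unique(y)
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    var = np.array([X[y == c].var(axis=0) + 1e-6 for c in classes])
    return mu, var

def predict_nb(model, X):
    """Assign each sample to the class with highest log-likelihood (uniform priors)."""
    mu, var = model
    ll = -0.5 * (((X[:, None, :] - mu) ** 2) / var
                 + np.log(2 * np.pi * var)).sum(axis=2)
    return ll.argmax(axis=1)

# Repeated random splits: 29 training / 10 test grasps per mode, 10 repetitions.
accuracies = []
for rep in range(10):
    train_idx, test_idx = [], []
    for c in range(len(MODES)):
        idx = rng.permutation(np.where(y == c)[0])
        train_idx.extend(idx[:29])
        test_idx.extend(idx[29:])
    model = train_nb(X[train_idx], y[train_idx])
    pred = predict_nb(model, X[test_idx])
    accuracies.append((pred == y[test_idx]).mean())

print(f"mean accuracy over 10 splits: {np.mean(accuracies):.3f}")
```

A closed-form Gaussian model like this needs only per-class means and variances at run time, which is consistent with the paper's stated reason for choosing a Bayesian classifier: high recognition performance at minimal computational cost on a handheld device.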
For example, when users held the phone as though they were looking up a number rather than up to their ear as though speaking on it, the grasp was often classified as a PDA.

Another problem the Graspables encountered is that the system tends to show a significant bias regarding hand size. While this can be overcome by explicitly training the classifiers for different users or by implementing a learning algorithm, it would be desirable to have a more universal response.

There is also the common question of how many different grasps the system is capable of recognizing. Unfortunately, we do not have a simple answer. From our experience, the limiting factor is not the sensing system as much as the application. Theoretically, the binary capacitive sensors on the Bar of Soap can distinguish 2^72 different grasps in a number of different orientations. While the anatomy of the hand would make that infeasible, the fact is there are not many applications where shifting a finger a fraction of an inch creates a meaningful difference in the user's mind. The Ball of Soap's pitch selection is an example where minor changes in grip could have a meaningful impact. However, even in that case it would be difficult to draw distinctions between minor grip changes without extensively measuring and studying live pitchers and ball trajectories.

While the limited number of meaningful ways to hold a rectangular box may make the Graspables System seem like overkill, we feel that our sensor density has other benefits. As the Rubik's cube application demonstrates, a grasp-recognition system can be used for more than just recognizing distinct grasps. Virtual environments could make use of more precise mappings of hand position. Another possibility is a Twister-like finger dexterity game that could make use of far more grasp combinations than would naturally be used.

Aside from the study used to test the accuracy of the grasp-recognition, many users have informally interacted with the Graspables. It is interesting how quickly users respond to the idea of grasp-recognition. For the Bar of Soap as a mode-switching handheld in particular, users are quick to adjust how they are holding the device and seem to enjoy trying to figure out how the grasps have been trained into the device. Similarly, many users of the Ball of Soap as a pitch selector begin experimenting with various grasps even before the trained pitch grips are explained. Whether this enthusiasm would transfer to real implementations of grasp-recognition is questionable, but it does seem to indicate that making better use of people's senses of touch and proprioception could provide better interfaces.

FUTURE WORK
The prototypes discussed in this paper represent only a small fraction of the potential implementations of the Graspables System. Another implementation that was discussed and would be worth developing is a stylus prototype. In the graphic arts world alone, pencils, paintbrushes, erasers and wands could all be represented by different ways of grasping a stylus.

As more of a departure from the work in this paper, implementing the Graspables System in existing devices would be interesting. Implementing the handheld mode switching demonstrated by the Bar of Soap in a fully functional handheld would be worth studying. Questions about when to trigger the classification algorithms, what error rates would be acceptable to users, and the general effectiveness of the natural mode switching would be better explored by longer studies with fully functional devices.

Applying what has been learned from the existing applications to other scenarios also has potential. Can the Graspables System be used as a safety check to ensure that power tools are being operated properly? What could be gained by expanding the scale of the system from handheld objects to whole body-sized arrays?

There is also room to perform further tests to improve the reliability and robustness of the system. Optimizing sensor densities could be valuable. Exploring how environmental factors such as humidity impact the capacitive sensors could improve system reliability. Exploring additional inputs such as pressure sensors could be beneficial. Lastly, the software and classifiers could always benefit from more training data.

CONCLUSIONS
The Graspables demonstrate how grasp-recognition can provide a unique and intuitive user interface. We presented a design rationale, our system design, a variety of application scenarios, and a discussion of our experiences with the system. The system we developed can be implemented in multiple geometries to provide a better representation of different objects in virtual environments. It can also provide devices with additional awareness of users' intentions via their manipulations. As mobile devices continue to grow in power and popularity, new interaction methods will need to be developed to accommodate them. We feel that grasp-recognition has the potential to provide significant enhancements to current interfaces.

ACKNOWLEDGEMENTS
We thank Jeevan Kalanithi, Daniel Smalley and Matt Adcock for their work on the classification techniques. Thanks also to Quinn Smithwick, James Barabas and Ana Luisa Santos for their assistance. This work was supported by the CELab, Digital Life, and Things That Think consortia at the MIT Media Lab.

REFERENCES
1. Ängeslevä, J., Oakley, I. and Hughes, S. Body Mnemonics: Portable Device Interaction Design Concept. In Proc. of Info. Vis. (2003).
2. Chang, W., Kim, K.E., Lee, H., Cho, J.K., Soh, B.S., Shim, J.H., Yang, G., Cho, S. and Park, J. Recognition
of Grip-Patterns by Using Capacitive Touch Sensors. IEEE ISIE 2006, 4 (2006), 2936-2941.
3. Duda, R., Hart, P. and Stork, D. Pattern Classification. John Wiley & Sons, Inc., 2001.
4. Fitzmaurice, G.W., Ishii, H. and Buxton, W. Bricks: Laying the Foundations for Graspable User Interfaces. CHI 1995: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems. ACM Press (1995), 442-449.
5. Fitzmaurice, G.W. and Buxton, W. An Empirical Evaluation of Graspable User Interfaces: Towards Specialized, Space-Multiplexed Input. CHI 1997: Proc. of the SIGCHI Conf. on Human Factors in Computing Systems. ACM Press (1997), 43-50.
6. Harrison, B.L., Fishkin, K.P., Gujar, A., Mochon, C. and Want, R. Squeeze Me, Hold Me, Tilt Me! In Proc. of the SIGCHI Conf. on Human Factors in Computing Systems. ACM Press (1998), 17-24.
7. Kim, K.E., Chang, W., Cho, S., Shim, J., Lee, H., Park, J., Lee, Y. and Kim, S. Hand Grip Pattern Recognition for Mobile User Interfaces. In Proc. of AAAI 2006. (2006), 1789-1794.
8. Kry, P.G. and Pai, D.K. Grasp Recognition and Manipulation with the Tango. Int. Symposium on Experimental Robotics. (2006), 551-559.
9. Mäntylä, V.M., Mäntyjärvi, J., Seppänen, T. and Tuulari, E. Hand Gesture Recognition of a Mobile Device User. IEEE Int. Conf. on Multimedia and Expo. 1 (2000), 281-284.
10. MacKenzie, C.L. and Iberall, T. The Grasping Hand. Elsevier, 1994.
11. Pai, D.K., VanDerLoo, E.W., Sadhukhan, S. and Kry, P.G. The Tango: A Tangible Tangoreceptive Whole-Hand Human Interface. In Proc. of World Haptics. (2005), 141-147.
12. Rabiner, L.R. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. In Proc. of the IEEE. Vol. 77, no. 2 (1989), 257-286.
13. Rekimoto, J. Tilting Operations for Small Screen Interfaces. In Proc. of 9th ACM Symposium on User Interface Software and Technology. (1996), 167-168.
14. Stiehl, W.D. and Breazeal, C. Affective Touch for Robotic Companions. At 1st Int. Conf. on Affective Computing and Intelligent Interaction. (2005).
15. Stiehl, W.D., Lieberman, J., Breazeal, C., Basel, L., Lalla, L. and Wolf, M. Design of a Therapeutic Robotic Companion for Relational, Affective Touch. 2005 IEEE Int. Workshop on Robots and Human Interactive Communication. (2005), 408-415.
16. Taylor, B.T. and Bove, V.M. The Bar of Soap: A Grasp Recognition System Implemented in a Multi-Functional Handheld Device. CHI 2008: Extended Abstracts on Human Factors in Computing Systems. (2008), 3459-3464.
17. Weiser, M. The Computer for the 21st Century. Scientific American. Vol. 265, no. 3, Sept. 1991, 94-104.