

A 2D-3D Integrated Interface for Mobile Robot Control Using Omnidirectional Images and 3D Geometric Models

Kensaku Saitoh
Grad. School of Information Science and Technology, Osaka University
saitoh@lab.ime.cmc.osaka-u.ac.jp

Takashi Machida
Cybermedia Center / Grad. School of Information Science and Technology, Osaka University
machida@ime.cmc.osaka-u.ac.jp

Kiyoshi Kiyokawa
Cybermedia Center / Grad. School of Information Science and Technology, Osaka University
kiyo@ime.cmc.osaka-u.ac.jp

Haruo Takemura
Cybermedia Center / Grad. School of Information Science and Technology, Osaka University
takemura@ime.cmc.osaka-u.ac.jp
ABSTRACT
This paper proposes a novel visualization and interaction technique for remote surveillance using both 2D and 3D scene data acquired by a mobile robot equipped with an omnidirectional camera and an omnidirectional laser range sensor. In a normal situation, telepresence with an egocentric view is provided using high-resolution omnidirectional live video on a hemispherical screen. As depth information of the remote environment is acquired, additional 3D information, such as the passable area and the roughness of the terrain, can be overlaid onto the 2D video image in a manner of video see-through augmented reality. A few functions for interacting with the 3D environment through the 2D live video are provided, such as path-drawing and path-preview. The path-drawing function allows the operator to plan a robot's path by simply specifying 3D points on the path on screen. The path-preview function provides a realistic image sequence seen from the planned path using a texture-mapped 3D geometric model in a manner of virtualized reality. In addition, a miniaturized 3D model is overlaid on the screen, providing an exocentric view, which is a common technique in virtual reality. In this way, our technique allows an operator to recognize the remote place and navigate the robot intuitively by seamlessly using a variety of mixed reality techniques on a spectrum of Milgram's real-virtual continuum.
CR Categories: H.5.2 [Information Systems]: Information Interfaces and Presentation - User Interface; I.3.7 [Computing Methodologies]: Computer Graphics - Three-Dimensional Graphics and Realism; I.4.8 [Computing Methodologies]: Image Processing - Scene Analysis

Keywords: remote robot control, omnidirectional image, 3D geometric model
1 INTRODUCTION
A remote control mobile robot system is useful for various types of remote surveillance in unknown places, such as disaster investigation and planetary exploration. Providing sufficient information about the remote environment to the operator is a key factor in such a mobile robot control system. For example, a video camera provides an egocentric view of the remote scene, while a Global Positioning System (GPS) and a gyro sensor provide the position and orientation of the robot. There have been many studies to improve the operator's sense of telepresence and to provide rich information about the remote environment. For example, Nagahara et al. [1] used an omnidirectional camera and proposed a nonlinear field-of-view (FOV) transformation method to present a super-wide FOV beyond the display's visual angle. They showed that the peripheral visual information improves the efficiency of remote robot operation.

Figure 1: Screenshot of the proposed interface.
However, it is often difficult to understand the remote situation from a 2D image even when a wide FOV is provided. Some studies have employed a range sensor to acquire and visualize 3D geometric information of the remote environment. 3D information is useful for perceiving the accurate geographical situation, such as the distance to an obstacle and the 3D structure of the environment. Keskinpala et al. [2] designed a PDA-based interface that has three screen modes: one mode provides a camera image only, another provides a top view with scanned range data, and the third provides range data overlaid on the camera image. Although helpful, this interface requires frequent mode switching. Sugimoto et al. [3] proposed a technique to provide a virtual exocentric view by rendering a wireframe robot properly overlaid on past images. However, this technique does not allow an arbitrary viewpoint. Ricks et al. [4] used a 3D model of the remote scene rendered from a tethered perspective above and behind the robot. However, in their system, 2D video images and the 3D model are not integrated but merely shown together.
This paper proposes a novel visualization and interaction technique for remote surveillance by integrating 2D and 3D omnidirectional scene data (see Figure 1). Our technique allows an operator to recognize the remote place and navigate the robot intuitively by seamlessly using a variety of mixed reality techniques on a spectrum of Milgram's real-virtual continuum [5]. Normally, an egocentric view using high-resolution omnidirectional live video is presented on a hemispherical display in a manner of telepresence. Additional 3D information, such as the passable area and the roughness of the terrain, can be overlaid onto the live video in a manner of video see-through augmented reality (AR). The path-drawing function allows the operator to plan a robot's path by simply specifying points on the screen. The path-preview function provides a realistic image sequence seen from the planned path using a texture-mapped 3D model in a manner of virtualized reality. In addition, a miniaturized 3D model is overlaid on the screen, providing an exocentric view as a World-in-Miniature (WIM), a common technique in virtual reality [6].

Figure 2: Overview of the proposed system.
2 SYSTEM OVERVIEW
Figure 2 illustrates an overview of the proposed system, and Figure 3 shows the mobile robot. A custom-made omnidirectional camera (HOV: Hyper Omni Vision) [7], consisting of a hyperboloidal mirror and a high-resolution camera (Point Grey Research Scorpion), and a laser range sensor (SICK LMS200) are mounted on a turn stage (Chuo Precision Industrial ARS-136-HP and QT-CD1). The omnidirectional image (800 × 600 pixels at 15 Hz, see Figure 4 (left)) is appropriately warped into a panoramic image or a standard perspective view in real time.
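As an illustration of this warping step, the sketch below unwarps a donut-shaped omnidirectional image into a panoramic strip by simple polar resampling. It assumes a linear radius-to-elevation mapping; the actual HyperOmni Vision projection is governed by the hyperboloidal mirror geometry [7], and the image center and annulus radii used here are hypothetical parameters.

```python
import numpy as np

def unwarp_to_panorama(omni, center, r_in, r_out, out_w=1440, out_h=240):
    """Resample a donut-shaped omnidirectional image into a panoramic strip.

    omni   : HxWx3 uint8 image from the omnidirectional camera
    center : (cx, cy) pixel coordinates of the mirror center (assumed known)
    r_in, r_out : inner/outer radii of the useful mirror annulus, in pixels
    Assumes a simple linear radius-to-row mapping; a metrically correct
    unwarp would use the hyperboloidal mirror model instead.
    """
    cx, cy = center
    theta = np.linspace(0.0, 2.0 * np.pi, out_w, endpoint=False)  # one column per azimuth
    radius = np.linspace(r_out, r_in, out_h)                      # top row = outer rim
    rr, tt = np.meshgrid(radius, theta, indexing="ij")
    src_x = np.clip(cx + rr * np.cos(tt), 0, omni.shape[1] - 1).astype(int)
    src_y = np.clip(cy + rr * np.sin(tt), 0, omni.shape[0] - 1).astype(int)
    return omni[src_y, src_x]                                     # nearest-neighbour lookup
```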
Omnidirectional range data, on the other hand, are acquired by horizontally rotating the line range sensor. The range data are converted into a 3D point cloud, which is then turned into a polygonal mesh model (see Figure 4 (right)). Since it takes about 18 seconds to measure an omnidirectional range scan, 3D geometric models are not built in real time but sporadically, at discrete places, whenever the operator issues the command.
An electric wheelchair (WACOGIKEN Emu-S) works as a sensor dolly, carrying the above two devices and two laptop PCs (Toshiba Libretto U100). The wheelchair has two driving wheels at the front and two sub wheels at the back. The remote operator can control the wheelchair by sending commands through RS-232C.
The wheelchair estimates its own position and orientation in two steps. First, an initial estimate is made from the internal sensors, i.e., odometry and an orientation sensor (InterSense InertiaCube2). Second, whenever new omnidirectional range data are acquired, the self-tracking accuracy is improved by ICP registration [8] between the new data and the previously acquired data. In our experience, the self-tracking error is about 1% of the distance traveled in indoor environments and less than 10% in outdoor environments.
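As a rough sketch of this second step, the following code shows a common point-to-point ICP loop seeded with the odometry pose, using an SVD-based rigid fit and a k-d tree for correspondences. It is a simplified illustration under our own choices (SciPy's cKDTree, the function names below), not the registration pipeline actually running on the robot.

```python
import numpy as np
from scipy.spatial import cKDTree

def best_fit_transform(src, dst):
    """Least-squares rigid transform (R, t) mapping src points onto dst points."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                 # guard against reflections
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    return R, mu_d - R @ mu_s

def icp(new_scan, reference, R0, t0, iters=30, tol=1e-5):
    """Refine an odometry pose (R0, t0) by registering new_scan to reference."""
    tree = cKDTree(reference)
    R, t = R0, t0
    prev_err = np.inf
    for _ in range(iters):
        moved = new_scan @ R.T + t
        dist, idx = tree.query(moved)        # nearest reference point per sample
        dR, dt = best_fit_transform(moved, reference[idx])
        R, t = dR @ R, dR @ t + dt           # compose the incremental update
        if abs(prev_err - dist.mean()) < tol:
            break
        prev_err = dist.mean()
    return R, t
```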
The operator views the remote environment through a hemispherical dome display system (Matsushita Electric Works CyberDome, see Figure 5 (left)) [9] with a total delay of about 400 ms over a wireless LAN (IEEE 802.11g, 54 Mbps). CyberDome provides a wide FOV of 140 by 90 degrees from a standard sitting position. The operator sends commands to the robot using a joystick and a throttle (see Figure 5 (right)).
Figure 3: Robot system.
Figure 4: An omnidirectional image (left) and a 3D model (right).
3 2D-3D INTEGRATED INTERFACE
This section describes the set of interaction techniques in the proposed system. In our techniques, 3D scene data is integrated into the 2D live images in a manner of omnidirectional video see-through AR. Note that calibration between the omnidirectional camera and the range sensor is required to integrate 3D information into the 2D image. For this purpose, we calculate a 3D transformation matrix using the least-squares method, in advance, from a set of corresponding feature points sampled from an omnidirectional image and the corresponding 3D model.
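A minimal sketch of this offline step, assuming the sampled correspondences are available as matched 3D points in the range-sensor frame and the camera frame: a 3x4 transformation matrix is fitted in the least-squares sense with np.linalg.lstsq. The actual calibration may instead constrain the result to a rigid transform, and the variable names here are illustrative.

```python
import numpy as np

def fit_transform_lstsq(pts_range, pts_camera):
    """Fit a 3x4 matrix M so that M @ [x y z 1]^T maps range-sensor points
    onto the corresponding camera-frame points, in the least-squares sense.

    pts_range, pts_camera : (N, 3) arrays of matched feature points, N >= 4.
    """
    n = pts_range.shape[0]
    homog = np.hstack([pts_range, np.ones((n, 1))])        # (N, 4) homogeneous points
    # Solve homog @ M.T ~= pts_camera for M.T, one output coordinate at a time.
    M_t, *_ = np.linalg.lstsq(homog, pts_camera, rcond=None)
    return M_t.T                                           # (3, 4)

def to_camera_frame(M, pts_range):
    """Apply the fitted transform to range-sensor points."""
    homog = np.hstack([pts_range, np.ones((pts_range.shape[0], 1))])
    return homog @ M.T
```

Once such a transform is known, any point of the 3D model can be expressed in the camera frame and then projected through the omnidirectional camera model for overlay on the live image.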
3.1 Visualization
Figure 1 shows a screenshot of the remote control system. The dome screen is divided into three areas. A perspective image area shows a 140-degree FOV front image at its actual angular size. As depth information is known, virtual objects such as a 3D pointer can be correctly overlaid onto the perspective image. Figure 6 shows the passable area and the roughness of the terrain overlaid onto the live video; these were computed automatically by spatial frequency analysis of the 3D data. A panoramic image area at the top shows the remaining rear and side views at a smaller visual angle. These two areas together provide an omnidirectional live video image. Thirdly, a texture-mapped miniaturized 3D model (WIM) can optionally be overlaid on these images. As the robot moves, the WIM is automatically translated and rotated so that the CG counterpart of the robot stays in the middle, heading upward. These three types of images provide both egocentric and exocentric views of the remote environment at the same time, helping the operator perceive the situation intuitively without switching screen modes.
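To illustrate the kind of terrain analysis behind Figure 6, the sketch below grades the point cloud on a horizontal grid, using the local height variation within each cell as a simple roughness proxy and thresholding it into passable and impassable cells. The paper describes the analysis as spatial frequency analysis of the 3D data; the grid resolution and thresholds here are hypothetical stand-ins.

```python
import numpy as np

def grade_terrain(points, cell=0.25, max_step=0.05, max_rough=0.02):
    """Label grid cells of a 3D point cloud as passable (True) or not (False).

    points    : (N, 3) array, z up, in the robot's coordinate frame (metres)
    cell      : grid resolution in metres
    max_step  : maximum allowed height range within a cell
    max_rough : maximum allowed height standard deviation within a cell
    """
    ij = np.floor(points[:, :2] / cell).astype(int)
    # Sort points by cell index so each cell's heights form a contiguous block.
    order = np.lexsort((ij[:, 1], ij[:, 0]))
    ij, z = ij[order], points[order, 2]
    splits = np.flatnonzero(np.any(np.diff(ij, axis=0) != 0, axis=1)) + 1
    labels = {}
    for idx, heights in zip(np.split(ij, splits), np.split(z, splits)):
        step = heights.max() - heights.min()
        rough = heights.std()
        labels[tuple(idx[0])] = (step <= max_step) and (rough <= max_rough)
    return labels
```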
3.2 Interaction
The interactions provided by the system are described below.
Figure 5: Operation system; a dome screen (left) and a control device (right).
Figure 6: Analytic overlay.
Figure 7: Display control.
Figure 8: Robot control.
3.2.1 Display Control
The perspective and panoramic images, as well as the WIM model, can be rotated horizontally together using the rudder of the joystick (see Figure 7 (middle)). The WIM model can also be scaled and rotated vertically with the joystick buttons (see Figure 7 (right)). Four semi-transparent blue triangles are displayed in the middle of the screen while the operator controls the images. These operations have no effect on the actual robot's orientation; that is, the operator can check the surrounding situation without turning the robot.
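A minimal sketch of this decoupling, assuming the renderer accepts a yaw angle for the virtual camera: the rudder only changes a view offset, so the displayed heading can differ from the robot's actual heading and no turn command is ever issued.

```python
import math

class ViewController:
    """Keeps a view-yaw offset that rotates the displayed images and the WIM
    together without sending any turn command to the robot (illustrative only)."""

    def __init__(self):
        self.view_offset = 0.0          # radians, relative to the robot heading

    def on_rudder(self, rudder, dt, rate=math.radians(45)):
        # rudder in [-1, 1]; deflecting it pans the view only.
        self.view_offset += rudder * rate * dt

    def display_yaw(self, robot_yaw):
        # Yaw actually used to render the perspective/panoramic images and WIM.
        return (robot_yaw + self.view_offset) % (2 * math.pi)
```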
3.2.2 Robot Control
The operator can move the robot by deflecting the joystick in one of four directions, and the robot moves forward or backward or turns to the left or right accordingly. The moving velocity is controlled by the throttle. Four semi-transparent yellow triangles are displayed in the middle of the screen while the operator controls the robot, and the current moving direction is indicated by a large highlighted triangle (see Figure 8).
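The mapping from joystick deflection and throttle to motion commands might look like the following sketch. The four-direction scheme follows the description above, but the dead zone and the returned command labels are our own placeholders; the actual command protocol sent over RS-232C is not described here.

```python
def joystick_to_command(jx, jy, throttle, deadzone=0.3):
    """Convert joystick deflection (jx, jy in [-1, 1]) and throttle ([0, 1])
    into a discrete motion command for the robot (illustrative mapping only).

    Returns (direction, speed) where direction is one of
    'forward', 'backward', 'left', 'right' or None inside the dead zone.
    """
    if max(abs(jx), abs(jy)) < deadzone:
        return None, 0.0
    # Pick the dominant axis: the interface uses four discrete directions.
    if abs(jy) >= abs(jx):
        direction = 'forward' if jy > 0 else 'backward'
    else:
        direction = 'right' if jx > 0 else 'left'
    return direction, throttle
```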
Figure 9: Path drawing.
3.2.3 Range Sensor Control
When the operator presses a dedicated button on the control device, the turntable on the robot starts to rotate and omnidirectional range data are transferred to the operator. While range data are being acquired, the operator cannot move the robot. After the data transmission has finished, a new 3D model is built and shown on the screen.
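The conversion from the rotating line scans to an omnidirectional point cloud can be sketched as follows: each scan provides ranges and in-plane beam angles, and the turntable angle rotates that scan plane about the vertical axis. The sensor mounting offsets are ignored here and the angle conventions are assumptions.

```python
import numpy as np

def scans_to_point_cloud(scans, turn_angles):
    """Build a 3D point cloud from line-scanner sweeps taken on a turntable.

    scans       : list of (ranges, beam_angles) pairs, one per turntable step;
                  ranges and beam_angles are 1D arrays (metres, radians)
    turn_angles : turntable angle (radians) for each scan
    Assumes the scan plane is vertical and rotates about the z axis, with no
    offset between the scanner and the turntable axis (a simplification).
    """
    points = []
    for (ranges, beams), phi in zip(scans, turn_angles):
        # Points in the scan plane: x along the viewing direction, z up.
        x = ranges * np.cos(beams)
        z = ranges * np.sin(beams)
        # Rotate the scan plane around the vertical axis by the turntable angle.
        points.append(np.column_stack([x * np.cos(phi), x * np.sin(phi), z]))
    return np.vstack(points)
```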
3.2.4 Path Drawing
Path drawing is an intuitive function for planning a robot path on screen [10]. First, the operator selects 2D points on the screen with a cone-shaped green cursor, at the positions he/she wants the robot to pass through. As depth information is known, the selected points have 3D coordinates, yielding a 3D B-spline curve. The cursor turns red when the position is too low or too high for the robot to pass through. A blue cone with an ID number is shown at each selected point, and the blue cones (selected points) can be moved or deleted freely. The color of the curve indicates its curvature. The positions of the selected points in the robot's coordinate system are displayed on screen in real time. Figure 9 shows screenshots of path drawing.
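A sketch of the curve-fitting step, assuming SciPy's B-spline routines: the selected 3D points are interpolated by a cubic B-spline, and the curvature used to color the curve is computed from its first and second derivatives. The sampling density and the use of splprep are our own choices, not necessarily those of the original implementation.

```python
import numpy as np
from scipy.interpolate import splprep, splev

def path_spline(waypoints, samples=200):
    """Interpolate selected 3D waypoints with a cubic B-spline and return the
    sampled curve together with its curvature (used for coloring the path).

    waypoints : (N, 3) array of points picked on screen (N >= 4 for a cubic)
    """
    tck, _ = splprep(waypoints.T, s=0, k=3)          # exact interpolation
    u = np.linspace(0.0, 1.0, samples)
    curve = np.stack(splev(u, tck), axis=1)          # (samples, 3)
    d1 = np.stack(splev(u, tck, der=1), axis=1)
    d2 = np.stack(splev(u, tck, der=2), axis=1)
    # Curvature of a space curve: |r' x r''| / |r'|^3.
    curvature = (np.linalg.norm(np.cross(d1, d2), axis=1)
                 / np.maximum(np.linalg.norm(d1, axis=1) ** 3, 1e-9))
    return curve, curvature
```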
3.2.5 Path Preview
Path preview is a function for examining a realistic virtual view from the planned path. When path preview starts, the live video shown in the perspective image area gradually becomes transparent, and a texture-mapped 3D model rendered from the same viewpoint gradually appears. In this way, the real view is seamlessly changed to a virtual view to maintain the operator's sense of immersion. After this transition, the viewpoint and the viewing direction are changed, either manually or automatically, along the planned path until the operator quits the preview mode.
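The transition itself can be sketched as a per-frame cross-fade between the live perspective image and a rendering of the textured model from the same viewpoint, after which the virtual camera flies along the planned path. The render_model callable and the fade duration below are placeholders for whatever renderer and timing the system actually uses.

```python
import numpy as np

def preview_frames(live_image, render_model, path_points, fade_frames=30):
    """Yield preview frames: first cross-fade from the live view to the virtual
    view at the current pose, then move the virtual camera along the path.

    live_image   : HxWx3 float array, the current perspective live video frame
    render_model : callable(eye, target) -> HxWx3 float array rendered from the
                   textured 3D model (placeholder for the actual renderer)
    path_points  : (N, 3) sampled points of the planned path, N >= 2
    """
    start_view = render_model(path_points[0], path_points[1])
    for i in range(fade_frames + 1):
        alpha = i / fade_frames
        # Live video fades out while the virtual rendering fades in.
        yield (1.0 - alpha) * live_image + alpha * start_view
    for eye, target in zip(path_points[:-1], path_points[1:]):
        # Look along the path: eye at the current sample, aimed at the next one.
        yield render_model(eye, target)
```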
If the operator is satisfied with the preview result, he/she can command the robot to actually follow the path. The robot then controls its driving wheels automatically to move along the planned path and stops at the end. Figure 10 shows screenshots of previewed images (left column) and the corresponding real images (right column); the previewed images approximate the real images well.

Figure 10: Previewed view (left column) and real view (right column).
4 EXPERIMENT
We investigated the usefulness of the panoramic image and the WIM in an experiment using a virtual scene representing a remote environment. Nine subjects in their early twenties took part in two tasks under four visualization conditions. The first task (operation task) was to move the robot from a start point to a goal as fast as possible. The second task (search task) was to collect as many items in the environment as possible within a limited time. The four conditions consisted of the combinations of with and without the panoramic image and the WIM. Each subject performed the two tasks under the four conditions (eight trials in total) in a randomized order. After the experiment, the subjects were asked to rate the operability and searchability of each visualization condition on a scale of 1 (worst) to 5 (best).
Figure 11: Subjective evaluation.

Figure 11 shows the average scores of the questionnaire. A one-way ANOVA confirmed that the panoramic image improved searchability but not operability. In the operation task, the rear and side images were rarely used because moving forward was mostly sufficient to complete the task. In the search task, on the other hand, rear and side information was highly necessary for efficient searching and for backing out of dead ends. As a drawback of the panoramic image, however, some subjects found it difficult to estimate the distance to obstacles in the rear and side directions. A one-way ANOVA also confirmed that the WIM improved both operability and searchability. The 3D map was clearly useful in both tasks for grasping the robot's position, the distance to obstacles, and the robot's progress through the environment. The experimental results show that our visualization technique is more useful, in both robot operation and information gathering, than traditional visualization techniques that show forward images only.
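For reference, the kind of test reported here can be run with SciPy's one-way ANOVA. The score arrays below are hypothetical placeholders for the collected questionnaire ratings, not the actual experimental data.

```python
from scipy.stats import f_oneway

# Hypothetical 1-5 searchability ratings, grouped by whether the panoramic
# image was shown (the real questionnaire data are not reproduced here).
with_panorama = [4, 5, 4, 4, 5, 3, 4, 5, 4]
without_panorama = [2, 3, 3, 2, 4, 2, 3, 3, 2]

f_stat, p_value = f_oneway(with_panorama, without_panorama)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")   # significant if p < 0.05
```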
5 CONCLUSION
In this paper, we have proposed a 2D-3D integrated interface for mobile robot control. Omnidirectional images provide real-time visual information about the remote environment, while a miniaturized 3D geometric model intuitively shows the robot's position and orientation and the 3D structure of the remote environment. With depth information of the remote environment, a variety of 3D visualization and interaction techniques become available on the 2D live video image in a manner of omnidirectional video see-through AR. Future work includes improving the 3D model quality and conducting rigorous user studies to validate the effectiveness of the proposed remote control system.
REFERENCES
[1] H. Nagahara, Y. Yagi, and M. Yachida. Super Wide View Tele-operation System. In Proc. IEEE Int. Conf. on Multisensor Fusion and Integration for Intelligent Systems (MFI 2003), pp. 149-154, 2003.
[2] H. K. Keskinpala, J. A. Adams, and K. Kawamura. PDA-Based Human-Robotic Interface. In Proc. IEEE Conf. on Systems, Man, and Cybernetics (SMC), 2003.
[3] M. Sugimoto, G. Kagotani, H. Nii, N. Shiroma, M. Inami, and F. Matsuno. Time Follower's Vision: A Teleoperation Interface with Past Images. IEEE Computer Graphics and Applications, Vol. 25, No. 1, pp. 54-63, 2005.
[4] B. Ricks, C. W. Nielsen, and M. A. Goodrich. Ecological Displays for Robot Interaction: A New Perspective. In Proc. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2004.
[5] P. Milgram and F. Kishino. A Taxonomy of Mixed Reality Visual Displays. IEICE Trans. on Information and Systems, Vol. E77-D, No. 12, pp. 1321-1329, 1994.
[6] R. Stoakley, M. J. Conway, and R. Pausch. Virtual Reality on a WIM: Interactive Worlds in Miniature. In Proc. SIGCHI '95, pp. 265-272, 1995.
[7] K. Yamazawa, Y. Yagi, and M. Yachida. Omnidirectional Image Sensor - Hyper Omni Vision -. In Proc. Int. Conf. on Automation Technology, 1994.
[8] P. J. Besl and N. D. McKay. A Method for Registration of 3-D Shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 14, No. 2, pp. 239-256, 1992.
[9] N. Shibano, P. V. Hareesh, M. Kashiwagi, K. Sawada, and H. Takemura. Development of VR Experiencing System with Hemi-Spherical Immersive Projection Display. In Proc. Int. Display Research Conf./Int. Display Workshops (Asia Display/IDW), 2001.
[10] T. Igarashi, R. Kadobayashi, K. Mase, and H. Tanaka. Path Drawing for 3D Walkthrough. In Proc. 11th Annual ACM Symposium on User Interface Software and Technology (UIST '98), pp. 173-174, 1998.
