
Enabling Beyond-Surface Interactions for Interactive Surface with an Invisible Projection

Li-Wei Chan, Hsiang-Tao Wu, Hui-Shan Kao, Ju-Chun Ko, Home-Ru Lin, Mike Y. Chen, Jane Hsu, Yi-Ping Hung
Department of Computer Science and Information Engineering, and Graduate Institute of Networking and Multimedia, National Taiwan University
{mikechen, yjhsu, hung}@csie.ntu.edu.tw

ABSTRACT

This paper presents a programmable infrared (IR) technique that utilizes invisible, programmable markers to support interaction beyond the surface of a diffused-illumination (DI) multi-touch system. We combine an IR projector and a standard color projector to simultaneously project visible content and invisible markers. Mobile devices outfitted with IR cameras can compute their 3D positions based on the markers perceived. Markers are selectively turned off to support multi-touch and direct on-surface tangible input. The proposed techniques enable a collaborative multi-display, multi-touch tabletop system. We also present three interactive tools: i-m-View, i-m-Lamp, and i-m-Flashlight, which consist of a mobile tablet and projectors that users can freely interact with beyond the main display surface. Early user feedback shows that these interactive devices, combined with a large interactive display, allow more intuitive navigation and are reportedly enjoyable to use.
Keywords: Invisible marker, infrared projection, beyond-surface, multi-display, multi-resolution, multi-touch, pico-projector, tabletop

ACM Classification: H5.2 [Information interfaces and presentation]: User Interfaces - Graphical user interfaces.

General terms: Design, Human Factors

Figure 1: Enabling above-surface interaction with an invisible projection. On the tabletop, the i-m-Lamp device allows the integral projection of fine-detail content with the tabletop system.

INTRODUCTION

We present a programmable infrared (IR) technique that enables mobile devices to interact with display surfaces while supporting multi-touch and direct on-surface tangible input. Current interactive surfaces are designed to sense 2D interactions such as finger touches. The ability to interact beyond the surface makes it possible to support cooperative multi-display interaction. For example, an architect may examine a 2D blueprint of a building shown on the display while inspecting 3D views of the same structure by moving a mobile display above the display surface. Surface computers like tabletops [29][10][6], however, do not easily extend to the presentation of 2D images with 3D information without the addition of GUI buttons or gestures.

Our vision is to enable a spatially-aware multi-display and multi-touch tabletop system. By knowing the 6-DoF positions of the mobile display devices with respect to the tabletop display, users are able to view additional information with mobile displays, such as private information or augmented reality views. Furthermore, with the miniaturization of projectors, our approach makes it possible to support multiple mobile projections of high-resolution images onto the shared tabletop display. For example, a set of users, each with a mobile projector, can project and overlay their availability onto a calendar. In a personal working desk scenario, the mobile projector can be embedded in a desk lamp and used in conjunction with a rear-projection multi-touch tabletop (Fig. 1).

Recent approaches to localizing 3D camera poses require the use of visible markers [7], which often interfere with the subject content, limiting their usefulness and applicability. Some marker-less tracking methods have been introduced which track natural features [4][24][1]. These methods are useful when the target objects have rigid and unchanging features, but they suffer from lower reliability than their marker-based counterparts. In this work, we propose a new interactive tabletop prototype that overcomes this limitation by using invisible markers. We combine an infrared (IR) projector and a standard color projector to simultaneously project visible content with invisible markers. Embedded IR cameras can localize their positions in six degrees of freedom simply by capturing the projected content, enabling mobile display devices to cooperate with a shared tabletop system.

The programmable IR projector serves two purposes. First, it projects special marker patterns that enable mobile IR cameras to estimate their positions in six degrees of freedom. In addition, we adapt the marker sizes according to the 3D positions of the IR cameras, allowing robust localization regardless of proximity to the tabletop surface. Second, the IR projector selectively projects uniform white regions on the tabletop for performing multi-touch detection and recognizing tangible objects placed on the surface. The proposed techniques allow users to perform both on-surface and above-surface interaction.
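To make the localization step concrete, the following sketch (ours, not the authors' implementation) shows how a 6-DoF pose could be recovered with OpenCV once the corners of a detected marker are matched to their known positions on the table plane. The intrinsic matrix and point coordinates are illustrative values, not from the paper.

```python
import cv2
import numpy as np

K = np.array([[700.0, 0.0, 320.0],
              [0.0, 700.0, 240.0],
              [0.0, 0.0, 1.0]])           # assumed pre-calibrated intrinsics
dist = np.zeros(5)                        # assume lens distortion corrected

# Known table-plane coordinates (meters, z = 0) of one marker's corners...
object_pts = np.array([[0.00, 0.00, 0.0],
                       [0.05, 0.00, 0.0],
                       [0.05, 0.05, 0.0],
                       [0.00, 0.05, 0.0]], dtype=np.float32)
# ...and their detected pixel locations in the mobile IR camera image.
image_pts = np.array([[310.0, 228.0], [352.0, 230.0],
                      [350.0, 272.0], [308.0, 270.0]], dtype=np.float32)

ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist)
R, _ = cv2.Rodrigues(rvec)                # 3x3 rotation matrix
cam_pos = -R.T @ tvec                     # camera position in table coordinates
print("camera position (table coords):", cam_pos.ravel())
```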
RELATED WORK

Figure 2: The hardware architecture of the system prototype.

Several areas of research are closely related to the present paper; we review each in turn.

Beyond-Surface Interaction

Several interactive surface systems have been able to provide interaction beyond the display surface. SecondLight [14] and UlteriorScape [15] allow cameras and projectors to see through the display surface. Their systems can turn translucent sheets above the surface into mobile displays via special projection mechanisms, so users do not need to carry additional devices, and the mobile sheets of known size above the surface can be tracked in six degrees of freedom by cameras. Hilliges et al. [11] extend SecondLight to support pick-up gestures from above the surface. The active lens in metaDesk [28] tracks the pose of an LCD screen using a magnetic-field position tracker, and renders the screen as an interactive 3D viewer of the tabletop display. LightSense [19] estimates the position of a cell phone by tracking the LED light behind the phone using a camera mounted on the tabletop system. LUMAR [21] extends LightSense by using the LED light for 2D tracking and the camera on the cell phone to recognize printed tags for 3D pose estimation. Benko et al. [3] combine tracked, head-worn displays with a multi-touch tabletop, providing users with tabletop interaction that includes speech, touch, and 3D hand gestures. Song et al. present two works [25][26] that embed projectors into a digital pen and a computer mouse, respectively, enabling dynamic augmented visual overlay on physical paper.
Embedding Invisible Information

Several research efforts have embedded imperceptible information directly in visible content via synchronized systems. In [8][5], coded patterns are temporally integrated into the displayed image and are only perceptible by the synchronized cameras. In [30], a private image is embedded in a public display and is only visible to users with polarization-based shutter glasses. In [23], two individual channels are delivered by a shared display by alternating synchronized shutter glasses. Using this technique, multiple channels can be integrally displayed by separating the spectrum. Prototypes that use an IR projector have been proposed to project IR and visible light simultaneously. Implementing a programmable IR projector, Lee et al. [17] built a projector capable of displaying both infrared and visible light images. With IR projection, location tracking of mobile objects can be greatly simplified.
DESIGN OF OUR TABLETOP SYSTEM

To demonstrate the feasibility of the proposed technique, we have developed a new tabletop system that supports above-surface interaction while preserving the ability to sense touches as well as tangible inputs on the surface. Our tabletop system is based on a diffused-illumination (DI) setup. As shown in Fig. 2, the IR and visible light projections are reflected by the mirrors and simultaneously projected onto the tabletop surface. Two IR cameras are installed under the tabletop for detecting finger touches. In this section, we describe the hardware prototype design of our tabletop system, the interaction techniques applied to integrate the above-surface and on-surface interactions, and some considerations learned while implementing the system.
Invisible Light Projector


The IR projector allows us to display invisible content for use in 3D localization. To the best of our knowledge, however, IR projectors are not commercially available. Hence, to produce such a projector, we converted a standard DLP projector to meet our needs. The basic idea was to replace the IR-cut filter in the DLP projector with an IR-pass filter. However, as this process can easily cause permanent damage to the projector due to the heat transmitted from the bulb, we provide here the detailed conversion procedure.

Figure 3: Two common relative placements of the diffuser and touch-glass layers for a DI tabletop system. In our work, we placed the diffuser layer on top of the touch-glass layer in order to obtain the best quality for projections both from above and beneath the tabletop.

Figure 4: The IR spots in the camera view due to the touch-glass layer. Two camera views are blended to obtain a spot-free image for later processing.

First, we remove the IR-cut filter from the DLP projector. This filter is usually installed in front of the light bulb to block infrared rays and reduce heat. Second, in the position of the IR-cut filter, we add a cold mirror, which reflects the visible light spectrum while efficiently transmitting IR wavelengths. It is important that the cold mirror be heatproof, as it directly faces the light bulb. Our cold mirror has an average transmission of 93% between 750 and 1100 nm; blocking wavelengths outside this range helps protect the projector components from overheating. We used a wide-angle projector, a BenQ MP515s, which allowed us to perform the modification within several minutes. In practice, it is also possible to modify a DLP projector to project both IR and visible content simultaneously: in [17], Lee demonstrated one feasible approach to building such a projector by adding to the projector body an additional DMD responsible for delivering IR content.
Diffuser and Glass Surfaces

Figure 5: Adapting marker size according to the observing position of the camera. The markers selectively split and merge when the camera observes the surface (a) at a distance and (b) close up, respectively.

The surface of a tabletop system is usually composed of a diffuser layer and a touch-glass layer, with the diffuser placed directly above or beneath the touch glass. Because the touch-glass layer is a reflective surface, the relative placement of these two layers leads to distinct problems in touch detection and in screen projection from above the surface. In the following, we discuss the advantages and disadvantages of each placement.

With the touch glass above the diffuser layer, the touch glass reflects the visible light of projections from above the surface. As shown in Fig. 3, a pico-projector projects onto the surface from above the table. The reflections not only degrade the luminance of the resulting projection, but may also undesirably shine on observers around the table. Moreover, since a pico-projector is typically power-saving and low-luminance, it is better to preserve as much of the quality of its projection as possible.

With the diffuser layer above the touch-glass layer, we obtain the best quality for the pico-projector and for the surrounding observers. However, the touch glass causes reflections of the visible projection and of IR rays from beneath the table. The luminance degradation due to reflection of the projection is acceptable, because the visible light projector provides luminance far greater than required. The reflections of IR rays, however, cause an IR spot region in each camera view, resulting in dead zones in the image processing. In our implementation, we placed the diffuser layer on top of the touch-glass layer in order to obtain the best quality projections both from above and beneath the table.

The problem of IR spots can be solved by using two IR cameras, installed at two corners of the table, both capturing the entire tabletop. Because the two cameras observe the table surface from different angles, the IR spot appears in a disjoint region of the tabletop in each camera view (Fig. 4). By masking out the IR spot regions and blending the two views together, we can remove the IR spot effect.
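A minimal sketch of this blending step, assuming both camera views have already been warped into common table coordinates and the spot masks were recorded once at setup time; the function and its inputs are our illustration, not the paper's code.

```python
import numpy as np

def blend_spot_free(view_a, view_b, spot_mask_a):
    """Combine two registered tabletop camera views into one spot-free image.

    view_a, view_b : grayscale IR views warped to common table coordinates
    spot_mask_a    : 8-bit mask, nonzero inside camera A's IR spot, recorded
                     at setup (assumed not to overlap camera B's spot)
    """
    out = view_a.copy()
    # Inside camera A's dead zone, substitute camera B's pixels, which are
    # valid there because the two spots fall on disjoint table regions.
    out[spot_mask_a > 0] = view_b[spot_mask_a > 0]
    return out
```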
Marker Split and Merge


Many augmented reality (AR) systems built on marker-based methods use printed markers with static IDs and sizes. Printed markers cannot adapt to the 3D positions of the cameras: as the cameras may observe the markers close up or at a distance, detection often becomes unstable due to insufficient resolution, or fails outright when markers fall outside the camera view. The main advantage of using an IR projector is the ability to program the projection content. The basic idea is to adapt the marker size to the observing positions of the cameras, so that the cameras see markers of optimal size during the interaction. The technique of adapting marker size was also used in [9] to track camera pose.

Figure 6: The image processing flow of the proposed method.

The marker sizes are prepared in advance and assigned to corresponding coordinates on the surface. Initially, we project the first level of markers across the projected surface, which allows the cameras to observe the markers at a distance. During operation, each camera continuously estimates its 3D position according to the markers perceived, and transmits the result to the system. According to the position of the camera, the system rearranges the layout of IR markers. For example, if the system observes a large marker in the camera view that might soon lose tracking, it splits the marker into smaller ones, making multiple markers visible in the camera view. Conversely, markers are consolidated into a larger one if the markers perceived are too small (Fig. 5). In our implementation, each camera tries to find at least four markers at appropriate resolution in its view. If this requirement cannot be met, the camera requests that the system split markers whose areas in the camera view exceed a preset percentage of the camera resolution. In each frame, the system aggregates the requests from the cameras beyond the surface and rearranges the layout of markers accordingly. To support multiple overlapping cameras, we prioritize cameras that are close to the tabletop over those that are farther away: in the overlapping regions, we display smaller IR markers to support the closer cameras, while the farther cameras see larger markers outside the overlapping regions. To improve marker detection, we apply an adaptive thresholding method to avoid the adverse effects of the non-uniform distribution of the IR projection on marker extraction. Although this vision-based approach yields accurate measurements of camera positions from the markers, the estimates are prone to jitter if used directly in applications. To remove the jitter, we further apply Kalman filtering [16] to smooth the resulting estimates.
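The split/merge policy described above might look roughly like the following sketch; the thresholds and the layout.split/layout.merge operations are hypothetical stand-ins for the paper's layout manager, not its actual interface.

```python
MIN_VISIBLE = 4     # each camera wants at least this many markers in view
SPLIT_AREA = 0.20   # split markers covering >20% of the camera image (assumed)
MERGE_AREA = 0.002  # merge markers covering <0.2%, too small to decode (assumed)

def update_layout(markers_in_view, layout):
    """markers_in_view: list of (marker_id, area_fraction) seen by one camera.
    layout: object exposing split(id) and merge(id) operations (hypothetical).
    """
    if len(markers_in_view) < MIN_VISIBLE:
        # Too few markers visible: split any marker that dominates the view
        # into smaller children so several markers become visible.
        for mid, area in markers_in_view:
            if area > SPLIT_AREA:
                layout.split(mid)
    else:
        # Enough markers: consolidate ones that have become too small.
        for mid, area in markers_in_view:
            if area < MERGE_AREA:
                layout.merge(mid)
```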

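The Kalman smoothing step can be sketched as a minimal constant-velocity filter along one axis (run one per coordinate, or extend the state for the full pose); the noise levels here are illustrative, not taken from the paper.

```python
import numpy as np

F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition: pos += vel
H = np.array([[1.0, 0.0]])               # we observe position only
Q = np.diag([1e-4, 1e-3])                # process noise (assumed)
R = np.array([[1e-2]])                   # measurement noise, i.e. jitter (assumed)

x = np.zeros(2)                          # state: [position, velocity]
P = np.eye(2)                            # state covariance

def kalman_step(z):
    """Feed one noisy position measurement z; return the smoothed position."""
    global x, P
    x = F @ x                                        # predict state
    P = F @ P @ F.T + Q                              # predict covariance
    y = z - H @ x                                    # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                   # Kalman gain
    x = x + (K @ y).ravel()                          # update state
    P = (np.eye(2) - K @ H) @ P                      # update covariance
    return x[0]
```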
Cooperating With Multi-Touch

A common implementation of finger-touch detection on DI tabletop systems includes foreground extraction by simple background subtraction, in conjunction with finger touch and tangible object recognition. This process, however, cannot be applied directly in our case: to support above-surface interaction, our IR projector constantly displays marker patterns that change over time, as described in the previous section. The background of our table surface in the IR camera views is therefore not constant.

To enable above-surface interaction while preserving on-surface object detection, we strategically lay out the markers in the IR projection. The idea is as follows. Since we know the current layout of markers, we can extract the foreground by simulating the background image at each frame. During the next frame, according to the foreground areas, we project uniform white regions enclosing the foregrounds to further inspect meaningful events such as finger touches and markers on tangible objects. Note that the two steps run in parallel for each detection frame, so we incur only a one-frame delay in response to user interactions. To avoid detection loss, especially with moving fingers, the layout manager also takes into consideration the predicted finger positions obtained from Kalman filtering.

Fig. 6 shows the process flow of our table system. To facilitate the following descriptions, we refer to the stitched view as the view stitched from the two IR cameras. We also assume the homographic transformations among the stitched view, the IR projection, and the table surface coordinates are the identity; that is, the three coordinate systems are aligned. In summary, at the start of each frame, the layout manager organizes the IR projection by considering together the requests from the mobile devices beyond the surface, the foreground objects on the surface, and the predicted finger and object positions of the current frame from Kalman filtering. In the following, we describe the implementation of each component in detail.
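Two building blocks of this flow, expressed concretely: foreground extraction against the simulated background (detailed in the next section) and ROI computation around the extracted foregrounds. This is our sketch assuming OpenCV; the threshold and padding values are illustrative.

```python
import cv2

def extract_foreground(stitched, background, thresh=24):
    """Foreground = pixels sufficiently brighter than the simulated background."""
    diff = cv2.subtract(stitched, background)   # saturating uint8 subtraction
    _, fg = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    return fg

def foreground_rois(fg_mask, pad=8):
    """Padded bounding boxes around connected foreground components; these
    ROIs are fed to the layout manager for the next frame's white regions."""
    contours, _ = cv2.findContours(fg_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    rois = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        rois.append((x - pad, y - pad, w + 2 * pad, h + 2 * pad))
    return rois
```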
Background Simulation

For a DI multi-touch system, analysis of foreground objects is based on increases in intensity in the observed IR images. Background subtraction is critical, as it directly affects the performance of on-surface object detection. However, modeling the background is not trivial in our case, because the layout of IR markers changes adaptively over time. Our approach is to dynamically simulate the background. At the start of each frame, the layout manager organizes the marker layout and generates the layout image for IR projection. According to the marker layout, we can simulate the background image that would be observed in the stitched view when no object is present on the tabletop. By subtracting the simulated background from the observed stitched view, the foregrounds are extracted.

In an initial method, the tabletop projects full black and full white screens, and the two corresponding stitched views are retained. (By black and white screens, we refer to the intensities of the black and white pixels used in the markers.) To simulate the background, we retrieve the intensity value from the two reference images according to the marker layout. This method is easily implemented, as we simply use the two images as an intensity mapping reference. However, we found the accuracy of the result to be less than ideal, because each pixel's intensity is affected by neighboring pixels due to the scattering property of the diffuser. Consequently, the simulated background of this naive method exhibits intensity values in the white/black regions of the markers that are inconsistent with the observed intensity values in the stitched view. To combat the effect of scattering, we propose recording the intensity response in the stitched view for each marker individually. When simulating the background, we can then simply accumulate the intensity responses of the markers according to the marker layout. The steps are detailed below and shown in Fig. 7.
Marker-Level Simulation

Figure 7: The process of background simulation. Steps 1 and 2 are executed in the offline phase to collect the intensity responses of projected markers. Step 3 is run during the online phase to generate the simulated background of a given marker layout.

First, we keep a stitched-view image, named the base view, captured while projecting a full white screen with the IR projector. This base view records the intensity profile when no marker is displayed. Second, for a marker at a particular position, we capture a stitched-view image while projecting only that marker. Note that, since the marker itself is black, the projected marker appears as a black marker against the base view. By subtracting this image from the base view, we obtain the intensity response of the marker. Because scattering only affects a limited range, the intensity response image can be further reduced to a small patch based on a preset extent of influence. This step is repeated for each marker; because the positions of the markers in table coordinates are predefined, each marker corresponds to exactly one patch. Third, to simulate the background image, we collect the patches corresponding to the markers of the current layout. The simulated background is then obtained by subtracting all the patches, in place, from the base view image.
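A sketch of step 3, the online accumulation, assuming the offline phase has stored each marker's response patch together with its top-left position in the stitched view; the data layout is our assumption.

```python
import numpy as np

def simulate_background(base_view, patches, layout):
    """base_view : stitched view captured with a full-white IR projection
    patches   : dict marker_id -> (y, x, response), where `response` is the
                per-pixel intensity drop recorded when only that marker was
                projected (offline steps 1 and 2)
    layout    : iterable of marker ids currently displayed
    """
    bg = base_view.astype(np.float32).copy()
    for mid in layout:
        y, x, response = patches[mid]
        h, w = response.shape
        bg[y:y + h, x:x + w] -= response   # subtract each marker's response in place
    return np.clip(bg, 0, 255).astype(np.uint8)
```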
Marker Turnoff

After the background subtraction, we extract the foregrounds with simple thresholding. A region of interest (ROI) enclosing the foregrounds is fed to the layout manager and is considered in the next frame to further inspect the foregrounds within it. This is accomplished by projecting uniform white regions over the ROI in the IR projection: markers in the layout that overlap with the ROI regions are turned off by projecting white squares in their place. For recognizing the foregrounds, we apply multi-touch detection modified from Touchlib, and recognize the markers attached to the bottoms of tangible objects with a method modified from ARToolKitPlus. Kalman filtering is applied to improve finger association and to smooth the detection results. Furthermore, to avoid detection loss when fingers and objects move, we use Kalman filtering to predict the positions of the foreground objects in a future frame, and feed these predictions to the layout manager, requesting marker turnoff there as well.
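The overlap test that drives marker turnoff could be as simple as the following sketch, with rectangles expressed as (x, y, w, h) in table-surface coordinates; the formulation is ours.

```python
def overlaps(a, b):
    """Axis-aligned rectangle intersection test; rects are (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def markers_to_turn_off(marker_rects, rois, predicted_rois):
    """Return ids of markers to replace with white squares next frame.

    marker_rects   : dict marker_id -> rect of that marker in the layout
    rois           : ROIs around current foregrounds
    predicted_rois : ROIs around Kalman-predicted future positions
    """
    regions = list(rois) + list(predicted_rois)
    return [mid for mid, rect in marker_rects.items()
            if any(overlaps(rect, r) for r in regions)]
```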
Enhancing Intensity of Black Regions in Markers


Finger detection in a DI setup is based on the IR reflection caused by finger touches. If the IR rays are too weak, the reflection is difficult to recognize. In our table system, the projected IR markers are used not only for computing above-surface positions, but also for illuminating the surface to enable on-surface foreground detection. Projecting fully black pixels in the markers, however, contributes insufficient IR illumination, making the black-pixel regions unsuitable for foreground detection. To mitigate this problem, we increased the intensity of the black pixels in the markers. Fig. 8 shows three different intensity levels of black pixels, from 64 to 192. In our implementation, we display the black pixels at an intensity level of 192, which allows clear IR reflection from objects on the surface while providing sufficient dynamic range for 3D localization of IR cameras above the surface.


Figure 8: Three different intensity levels, from 64 to 192, of the black pixels in the markers. The top row shows the stitched views captured from beneath the surface, used for on-surface detection. The bottom row shows the images from the IR camera above the surface, used for computing positions.

Figure 9: The process of our semi-automatic system calibration, where the user needs only to specify the four points of the tabletop display corners in the visible light projection image.

Software Synchronization

In our implementation, the cameras and projectors are independent systems that are not synchronized in capturing and displaying, so it is often the case that, after the projection content is updated, the cameras still capture the previous content. As the IR cameras used are modified Logitech webcams, hardware synchronization with a projector is not supported. We therefore applied a software solution to the synchronization problem: we keep the simulated backgrounds of several past frames in a buffer. On background subtraction, we push the newly simulated background into the buffer. All backgrounds in the buffer are candidates for subtraction; we pick the one with the least error against the real stitched view, and discard all backgrounds in the buffer older than the chosen one. This measure ensures that the subtraction step always uses the correct background.
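A sketch of this buffering scheme; the buffer depth and the error metric (mean absolute difference) are our assumptions.

```python
import numpy as np
from collections import deque

class BackgroundBuffer:
    """Keep recent simulated backgrounds; pick the one that best matches the
    real stitched view, compensating for camera/projector lag."""

    def __init__(self, depth=4):
        self.buf = deque(maxlen=depth)

    def push(self, simulated_bg):
        self.buf.append(simulated_bg)

    def best_match(self, stitched_view):
        # Mean absolute difference against each candidate background.
        errors = [np.mean(np.abs(stitched_view.astype(np.int16)
                                 - bg.astype(np.int16))) for bg in self.buf]
        i = int(np.argmin(errors))
        # Discard everything older than the chosen background.
        for _ in range(i):
            self.buf.popleft()
        return self.buf[0]
```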
Calibration

In a DI tabletop system, the coordinate mappings among the visible light projection, the IR camera images, and the tabletop surface are usually computed using homographic transformations. Computing a homography matrix requires at least four pairs of corresponding points between a pair of coordinate systems, and the point pairs are often manually specified by the user. As multiple projectors and cameras are incorporated to construct a large tabletop system, the calibration process typically becomes increasingly time-consuming, as well as inaccurate due to error accumulation. With the IR projection, the calibration process can be greatly simplified: only four points need to be manually specified, and the estimates can achieve sub-pixel accuracy because the remaining steps are handled automatically using computer-vision techniques. We use an additional camera, called the IR-color-camera, which can see both the IR and visible light ranges. This camera is used only during the calibration process, as a bridge to obtain the transformation between the visible light and IR projections.

Fig. 9 shows the calibration procedure. First, the user specifies, with a computer mouse, four points corresponding to the tabletop display corners in the visible light projection image. The visible light projection is then associated with the table surface coordinates. Second, the visible projector displays a grid of markers; the IR-color-camera recognizes the markers and computes the transformation between its camera view and the visible projection. Third, the same process is applied to the IR projector to find the transformation between the IR-color-camera view and the IR projection. At this point, the IR projector is also associated with the table surface coordinates. Lastly, the IR projector projects a grid of markers that the two IR cameras use to compute their transformations from the markers detected.

Manually specifying point pairs among multiple devices can easily produce inconsistent coordinate transformations. Our approach requires only four points to be specified, ensuring that all transformations are consistent with respect to the user input. In addition, the process can be applied, without additional user input, to systems that use more projectors and IR cameras than our implementation.
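In terms of homographies, the calibration chain composes as in the sketch below; the matrix names and the example point values are ours, not the paper's.

```python
import cv2
import numpy as np

# Each homography is estimated with cv2.findHomography from point pairs:
#   H_vis_from_table : table surface   -> visible projection (4 clicked points)
#   H_cam_from_vis   : visible proj.   -> IR-color-camera view (marker grid)
#   H_cam_from_ir    : IR projection   -> IR-color-camera view (marker grid)

def compose_ir_to_table(H_vis_from_table, H_cam_from_vis, H_cam_from_ir):
    """Chain IR projection -> camera view -> visible projection -> table."""
    H_vis_from_ir = np.linalg.inv(H_cam_from_vis) @ H_cam_from_ir
    H_table_from_ir = np.linalg.inv(H_vis_from_table) @ H_vis_from_ir
    return H_table_from_ir

# Estimating one link from detected marker point pairs (illustrative values):
pts_ir_proj = np.array([[0, 0], [100, 0], [100, 100], [0, 100]], np.float32)
pts_cam = np.array([[12, 8], [118, 15], [110, 121], [6, 112]], np.float32)
H_cam_from_ir, _ = cv2.findHomography(pts_ir_proj, pts_cam)
```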

INTERACTION DESIGN

Based on the proposed techniques, we can compute the positions of the IR cameras in six degrees of freedom from the invisible markers they perceive. Equipped with IR cameras, mobile devices can thus interact beyond the display surface. In the following, we present three example applications that take advantage of the proposed techniques to provide intuitive and natural manipulation.

i-m-Lamp: a desk lamp for tabletops

Writing and reading are the most common tasks at a working desk, so using a tabletop display system as a working desktop requires supporting such short-distance interaction. Given that normal reading distance is around 13 inches [18], humans can perceive visual information at a much higher resolution than that provided by the 56-inch, 1920 x 1080 pixel tabletop displays currently on the market. To deal with this problem, several researchers have developed multi-resolution display systems. One category of such systems [22][20] uses mobile screens in conjunction with the tabletop as the interface for presenting high-resolution content. The advantage of this setup is that users can easily manipulate the high-resolution regions by physically moving a tablet computer; however, the resulting multi-resolution imaging is not seamless due to the borders of the tablet computers. Another category [2][13][12][25] introduces the high-resolution display region via an additional projector. Working with a pan-tilt unit enables the high-resolution projection to be moved around the tabletop display. Seamless presentation is achieved, although users have to learn new gestures to control the high-resolution projection.

Figure 10: The i-m-Lamp prototype device composed of a pico-projector and an IR camera.

Figure 11: The i-m-Flashlight device, composed of a pico-projector with an IR camera, is a mobile version of i-m-Lamp. A circular mask is applied to its projection.

We propose i-m-Lamp, a pico-projector attached to an IR camera, mimicking the use of a desk lamp on the tabletop system. i-m-Lamp provides a multi-resolution solution that delivers fine details of the content directly below the lamp, while the overall view of the content is given by the tabletop system. Operating the i-m-Lamp is very natural: users simply manipulate the lamp to move the high-resolution projection anywhere on the tabletop, without memorizing any gestures. i-m-Lamp is thus well suited to a personal tabletop system used as a working desk.

To enable seamless i-m-Lamp projection on the tabletop, we need the transformation between the IR camera and the pico-projector. The intrinsic parameters of the IR camera are already given. We further compute the intrinsic parameters of the pico-projector, and the extrinsic parameters of the pico-projector with respect to the IR camera. These parameters allow the pico-projection to be positioned with respect to the IR camera coordinates; note that they need to be computed only once, in an offline setup. After obtaining the transformation, we can translate the projection coordinate system of the pico-projector into the IR camera coordinate system. In each frame, the embedded camera computes its position with respect to the table surface coordinates, and the content of the pico-projection is then integrated with the tabletop display accordingly.

Avoidance of double projection

As the i-m-Lamp and the tabletop system both project on the table surface, the overlapping region of the two projections would cause a blur artifact. To avoid this artifact, the tabletop projection inside the overlapped region is masked. We can also define a custom mask to form an arbitrary projection shape for the i-m-Lamp.
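For the seamless integration, the per-frame warp can be sketched as a homography chain restricted to the table plane. The full camera/projector relation is a 3D rigid transform plus intrinsics; collapsing it to plane-to-plane homographies is our simplification, and the names are ours.

```python
import cv2
import numpy as np

def render_lamp_frame(content, H_table_from_cam, H_cam_from_pico, size):
    """Warp tabletop content into the pico-projector frame so the projection
    lands registered on the table.

    content          : high-resolution image in table-surface coordinates
    H_table_from_cam : from the per-frame 6-DoF localization of the IR camera
    H_cam_from_pico  : from the one-time camera/projector calibration
    size             : (width, height) of the pico-projector image
    """
    # pico pixel -> IR camera view -> table coordinates
    H_table_from_pico = H_table_from_cam @ H_cam_from_pico
    # warpPerspective needs the map from source (table) to destination (pico),
    # i.e. the inverse of the chain above.
    H_pico_from_table = np.linalg.inv(H_table_from_pico)
    return cv2.warpPerspective(content, H_pico_from_table, size)
```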

Figure 12: The user points out a region of interest with the i-m-Flashlight, where fine-detail information is presented.

i-m-Flashlight: a mobile exploration tool

i-m-Flashlight is a mobile version of i-m-Lamp, designed as an intuitive interface for information exploration. Operating i-m-Flashlight imitates the use of a handheld flashlight: users inspect the fine details of a region of interest by simply pointing the i-m-Flashlight toward that location. In comparison with i-m-Lamp, i-m-Flashlight facilitates short-term exploration activities in which users quickly explore multiple regions of the projection display. In one of our applications, i-m-Flashlight is used as an interface for appreciating rich details in digital paintings: users can step back to view the overall painting, and also explore fine details with the i-m-Flashlight.
Projection as a region pointer

i-m-Flashlight naturally extends the pico-projector with pointing ability, and using it as a region indicator facilitates interaction among multiple users. For example, in the map application shown in Fig. 12, i-m-Flashlight is used as a pointer to highlight the touring information of a certain region so that other users can easily follow the discussion. With a button on the i-m-Flashlight, users can drag the content around, passing the high-resolution focus region to nearby users.

Focus problem

We used 3M mPro120 LCD-based pico-projectors for both the i-m-Lamp and i-m-Flashlight devices. The lens focus of these pico-projectors, however, must be manually adjusted to obtain a sharp projection, and this manual process limits the feasibility of mobile projection applications. Fortunately, the issue can be resolved by replacing the projector with a laser-based one, such as the Microvision ShowWX pico-projector, which uses laser-scanning technology rather than a focusing lens and therefore provides an image that is always in focus.

Figure 13: The i-m-View prototype device, with an IR camera attached to the back of the tablet computer.

Figure 14: The table boundary is drawn in the i-m-View to reduce the sense of isolation between the users and the table; the content outside the boundary is purposely darkened to enhance the presence of the table while using the i-m-View.

i-m-View: an intuitive 3D viewer

i-m-View is a tablet computer with an attached IR camera, serving as an intuitive viewer for exploring 3D geographical information. The concept was proposed and implemented in metaDesk [28], where a magnetic-field position sensor is used to track an arm-mounted monitor in six degrees of freedom; further discussion of interaction with such viewers can be found in [27]. In this work, the proposed techniques easily realize this concept for multiple users. We have used the i-m-View to explore 3D buildings above the 2D map shown on the table system.

In addition, we found that the i-m-View can be immersive, isolating the user from other users around the table and from the table system itself. In the map application, the i-m-View provides a full-screen 3D perspective view of the 2D map. This viewing experience, however, easily isolates the user from the real world: when focused on the i-m-View, the user is often unable to remain aware of what is happening on and around the table system.

Introducing the reality

To alleviate this sense of isolation, we reintroduce the boundary of the table surface into the perspective view. Fig. 14 shows the perspective view with the table boundary attached. The boundary can be computed whenever the i-m-View observes at least one marker shown on the table system.
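A sketch of the boundary computation: with the tablet's pose recovered from at least one observed marker (e.g., via solvePnP as earlier), the known table corners can be projected into the tablet view. The table dimensions and intrinsics here are illustrative.

```python
import cv2
import numpy as np

def table_boundary_in_view(rvec, tvec, K, table_w=1.2, table_h=0.9):
    """Project the four table corners (known table coordinates, meters) into
    the tablet camera view; rvec/tvec come from marker-based localization."""
    corners = np.array([[0, 0, 0],
                        [table_w, 0, 0],
                        [table_w, table_h, 0],
                        [0, table_h, 0]], dtype=np.float32)
    pts, _ = cv2.projectPoints(corners, rvec, tvec, K, np.zeros(5))
    return pts.reshape(-1, 2)   # polygon to overlay; darken content outside it
```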
Implementation

Our prototype system is composed of a table system and several mobile devices, the i-m-Lamp, i-m-Flashlight, and i-m-View, connected with the table. The table system prototype is implemented on a standard desktop computer with an Intel Core i7 2.4 GHz processor and 4 GB RAM. In the current implementation, the i-m-Lamp and i-m-Flashlight both run on the table system, while the i-m-View devices run on standalone tablet computers. Two graphics cards in the table system provide four display output channels: two drive the color and IR projectors of the table system, and the other two drive the pico-projectors of the i-m-Lamp and i-m-Flashlight devices. Four IR cameras, two for the tabletop running at 640 x 480 pixels and two for the i-m-Lamp and i-m-Flashlight running at 320 x 240 pixels, are connected with the table system as well.

The detection software for the table system is implemented in C++ and DirectX, and incorporates multithreading and GPU acceleration, allowing detection performance of 30 frames per second. For the map application, the display contents for the table, i-m-Lamp, and i-m-Flashlight are generated using Flash, with perspective warping applied according to their positions relative to the table system. As the warp operation is very slow in Flash, we directly access the memory of the Flash application and perform the warping on the GPU using DirectX, achieving real-time warping. The i-m-View obtains the 3D geographical view of the 2D map in correct perspective by directly manipulating the Google Earth application. In practice, the mobile devices can run standalone and communicate wirelessly with the table system via socket connections, as in the current implementation of the i-m-View devices. We run the i-m-Lamp and i-m-Flashlight on the same computer as the table system in order to reduce hardware costs and minimize the content transmission load over the network.
EARLY USER RESPONSE


In order to better understand the strengths and limitations of each device, we conducted early user testing. We asked a group of five users to explore our map application. Users were told to freely use the three devices, taking turns with each. The task was to navigate the famous buildings and annotated photos shown on the map. Each photo in the table projection is presented as a pin on the map; with i-m-Lamp and i-m-Flashlight, users can see the photo beneath the pin. Therefore, to browse through photos, users have to drag the map, or move the i-m-Lamp or i-m-Flashlight to highlight the pins. i-m-View was used to see the perspective view of the 2D map. This early-stage study focused on the qualitative aspects of the user experience, and users were encouraged to think aloud throughout. In the following, we discuss the observations collected via user feedback.
Dead reckoning

One phenomenon quickly became a recurring theme during close inspection of 3D buildings with the i-m-View: users could see only the bottom part of a building and wanted to lift the i-m-View upward to see the upper parts of the same structure. In doing so, however, all the invisible markers shown on the table surface left the field of view of the i-m-View, which eventually became lost in the space above the table system. i-m-Lamp and i-m-Flashlight do not suffer from this issue, as these two devices remain aimed at the markers while in operation. We also found that some users flipped the i-m-View on end, hoping to obtain a portrait view of the map scene that would let them more easily explore tall buildings; this is an operational mode the system does not currently support. To support it, we plan to attach an orientation sensor to the i-m-View in future work. The other problem, the i-m-View becoming lost when no marker enters its field of view, can likewise be handled by continuously updating the orientation of the i-m-View from the orientation sensor data; once the i-m-View recognizes new markers, its position and orientation are corrected via interpolation over several frames. All users reported mostly positive feedback regarding the i-m-Lamp. The focus problem was less of an issue for the i-m-Lamp than for the i-m-Flashlight, because the i-m-Lamp, in regular use, stays at an appropriate distance from the table surface and can hence present a sharp projection. i-m-Flashlight suffers from a severe focus problem, as users tend to move it around quickly and use it over a wide range of distances from the table surface.
More about operation

Although the functions of the i-m-Lamp and i-m-Flashlight are similar, they naturally facilitate different kinds of usage scenarios. For example, the i-m-Lamp, being based on the metaphor of a desk lamp, should rarely be moved while in use. Most users, however, wanted the ability to move the i-m-Lamp around on the table to check information at different locations, even though they reported tending to remain fixed at a certain location more often with the i-m-Lamp than with the i-m-Flashlight: the i-m-Lamp is usually moved to a new location and then used to explore nearby information. With the i-m-Flashlight, users frequently fly through the space above the table to freely explore information on the map. i-m-Lamp is thus ideal for long-term tasks on a projection working desk, while i-m-Flashlight is ideal for short-term tasks such as quickly checking information at multiple locations.

The three interactive devices require different styles of user interaction. As i-m-Lamp is designed to stand on the table, it allows users to operate on the map with both hands. Operating the i-m-Flashlight, users usually hold it in one hand while the other hand is free for other operations. For the i-m-View, users need two hands to hold the device steadily.

While using the i-m-Flashlight, some users reported that they would like to drag the map directly with the i-m-Flashlight, using it as a remote mouse. We believe this would be a good feature for the i-m-Flashlight, especially when passing highlighted information to other users: dragged content passed to other users would always be highlighted and in fine detail. We plan to implement this additional functionality in future work. While using the i-m-View, users have to move their positions to explore different views of a building, because by default we lock the perspective view to the real spatial relationship computed. Three users reported that they would like to scale and rotate the building on the i-m-View directly with their fingers. Locking the i-m-View to its real position relative to the table allows users to efficiently access buildings of interest on the map; however, when targeting a certain building, a simple gesture or button to unlock the association would help facilitate navigation. In addition, users reported that moving the i-m-View made them easily become unaware of the real scene, namely the table, because they became too immersed in the i-m-View. This and other problems found in using the i-m-View will be addressed in future work.
CONCLUSION


This paper presents a new interactive tabletop system that enables interaction beyond the surface using invisible markers, while supporting multi-touch and tangible input directly on the surface. Mobile devices outfitted with IR cameras can compute their 3D positions based on the markers perceived, and markers are selectively turned off to support multi-touch and direct on-surface tangible input. To demonstrate the feasibility of the proposed techniques, we developed three interactive devices, i-m-Lamp, i-m-Flashlight, and i-m-View, which work closely with a shared tabletop display. In an early user evaluation, users reported mostly positive feedback regarding interactions with the multi-display, multi-touch tabletop system. Several problems uncovered by user feedback will be addressed in future work. When using the i-m-View to explore 3D perspectives of 2D content shown on the tabletop, users reported a sense of isolation from the tabletop system and from the other users; we will develop interfaces that facilitate communication among i-m-View users and tabletop users. The proposed interactive devices open new opportunities for multi-user interaction, and we will explore interactions that facilitate group discussion on the cooperative multi-display, multi-touch tabletop system.
ACKNOWLEDGMENTS

This work was supported in part by the Excellent Research Projects of National Taiwan University under grant 98R006204, and by grants from Intel Corporation.
REFERENCES

1. Michael Aron, Gilles Simon, and Marie-Odile Berger. Handling uncertain sensor data in vision-based camera tracking. In Proc. ISMAR '04, 58-67.
2. Patrick Baudisch, Nathaniel Good, Victoria Bellotti, and Pamela Schraedley. Keeping things in context: a comparative evaluation of focus plus context screens, overviews, and zooming. In Proc. CHI '02, 259-266.

3. Hrvoje Benko, Edward W. Ishak, and Steven Feiner. Collaborative mixed reality visualization of an archaeological excavation. In Proc. ISMAR '04, 132-140.
4. Kar Wee Chia, Adrian David Cheok, and Simon J. D. Prince. Online 6 DOF augmented reality registration from natural features. In Proc. ISMAR '02, 305-313.
5. Daniel Cotting, Martin Naef, Markus Gross, and Henry Fuchs. Embedding imperceptible patterns into projected images for simultaneous acquisition and display. In Proc. ISMAR '04, 100-109.
6. Paul Dietz and Darren Leigh. DiamondTouch: a multi-user touch technology. In Proc. UIST '01, 219-226.
7. Mark Fiala. Automatic projector calibration using self-identifying patterns. In Workshops of CVPR '05, 113.
8. Anselm Grundhöfer, Manja Seeger, Ferry Hantsch, and Oliver Bimber. Dynamic adaptation of projected imperceptible codes. In Proc. ISMAR '07, 1-10.
9. Anselm Grundhöfer, Manja Seeger, Ferry Hantsch, and Oliver Bimber. Dynamic adaptation of projected imperceptible codes. In ISMAR '07, 1-10.
10. Jefferson Y. Han. Low-cost multi-touch sensing through frustrated total internal reflection. In Proc. UIST '05, 115-118.
11. Otmar Hilliges, Shahram Izadi, Andrew D. Wilson, Steve Hodges, Armando Garcia-Mendoza, and Andreas Butz. Interactions in the air: adding further depth to interactive tabletops. In Proc. UIST '09, 139-148.
12. Chuan-Heng Hsiao, Li-Wei Chan, Ting-Ting Hu, Mon-Chu Chen, Jane Hsu, and Yi-Ping Hung. To move or not to move: a comparison between steerable versus fixed focus region paradigms in multi-resolution tabletop display systems. In Proc. CHI '09, 153-162.
13. Ting-Ting Hu, Yi-Wei Chia, Li-Wei Chan, Yi-Ping Hung, and Jane Hsu. i-m-Top: An interactive multi-resolution tabletop system accommodating to multi-resolution human vision. In Proc. Tabletop '08, 177-180.
14. Shahram Izadi, Steve Hodges, Stuart Taylor, Dan Rosenfeld, Nicolas Villar, Alex Butler, and Jonathan Westhues. Going beyond the display: a surface technology with an electronically switchable diffuser. In Proc. UIST '08, 269-278.
15. Yasuaki Kakehi and Takeshi Naemura. UlteriorScape: optical superimposing on view-dependent tabletop display and its applications. In ACM SIGGRAPH 2008 New Tech Demos.
16. Rudolph Emil Kalman. A new approach to linear filtering and prediction problems. Transactions of the ASME - Journal of Basic Engineering, 82 (Series D), 35-45.

17. Johnny Lee, Scott Hudson, and Paul Dietz. Hybrid infrared and visible light projection for location tracking. In Proc. UIST '07, 57-60.
18. Gordon E. Legge, Gary S. Rubin, Denis G. Pelli, and Mary M. Schleske. Psychophysics of reading II: Low vision. Vision Research, 25(2):253-266, 1985.
19. Alex Olwal. LightSense: enabling spatially aware handheld interaction devices. In Proc. ISMAR '06, 119-122.
20. Alex Olwal and Steven Feiner. Spatially aware handhelds for high-precision tangible interaction with large displays. In Proc. TEI '09, 181-188.
21. Alex Olwal and Anders Henrysson. LUMAR: A hybrid spatial display system for 2D and 3D handheld augmented reality. In Proc. International Conference on Artificial Reality and Telexistence, 63-70.
22. Johan Sanneblad and Lars Erik Holmquist. Ubiquitous graphics: combining hand-held and wall-size displays to interact with large images. In Proc. AVI '06, 373-377.
23. Garth B. D. Shoemaker and Kori M. Inkpen. Single display privacyware: augmenting public displays with private information. In Proc. CHI '01, 522-529.
24. Gilles Simon and Marie-Odile Berger. Reconstructing while registering: A novel approach for markerless augmented reality. In Proc. ISMAR '02, 285.
25. Hyunyoung Song, Tovi Grossman, George Fitzmaurice, François Guimbretière, Azam Khan, Ramtin Attar, and Gordon Kurtenbach. PenLight: combining a mobile projector and a digital pen for dynamic visual overlay. In Proc. CHI '09, 143-152.
26. Hyunyoung Song, François Guimbretière, Tovi Grossman, and George Fitzmaurice. MouseLight: bimanual interactions on digital paper using a pen and a spatially-aware mobile projector. In Proc. CHI '10, 2451-2460.
27. Michael Tsang, George W. Fitzmaurice, Gordon Kurtenbach, Azam Khan, and Bill Buxton. Boom Chameleon: simultaneous capture of 3D viewpoint, voice and gesture annotations on a spatially-aware display. In Proc. UIST '02, 111-120.
28. Brygg Ullmer and Hiroshi Ishii. The metaDESK: models and prototypes for tangible user interfaces. In Proc. UIST '97, 223-232.
29. Andrew D. Wilson. PlayAnywhere: a compact interactive tabletop projection-vision system. In Proc. UIST '05, 83-92.
30. W. S. Yerazunis and M. Carbone. Privacy-enhanced displays by time-masking images. In Proc. OzCHI '01.
