
A TECHNICAL SEMINAR REPORT ON

TELE-IMMERSION

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

ABSTRACT

Tele-immersion aims to enable users at geographically distributed sites to collaborate in real time in a shared simulated environment as if they were in the same physical room. It is intended for use in areas such as 3-D CAD design, entertainment (e.g. games), remote learning and training, and 3-D motion capture. We define tele-immersion as the sense of shared presence with distant individuals and their environments that feels substantially as if they were in one's own local space. One of the first visitors to our tele-immersion system remarked, "It's as if someone took a chain saw and cut a hole in the wall [and I see the next room]." This kind of tele-immersion differs significantly from conventional video teleconferencing in that the user's view of the remote environment changes dynamically as he moves his head.

Tele-immersion is a technology, to be implemented with Internet2, that will enable users in different geographic locations to come together in a simulated environment and interact. Users will feel as if they are actually looking at, talking to, and meeting with each other face-to-face in the same room. This is achieved using computers that recognize the presence and movements of individuals and objects, track those individuals and images, and reconstruct them onto a single stereo-immersive surface. Other approaches are geared toward the exploration of abstract data; our vision, instead, is of a realistic distributed extension of our own physical space, which presents challenges in environment sampling, transmission, reconstruction, presentation, and user interaction. Approaches that concentrate on realistic rendering of participants in a shared teleconference do not employ the extensive local-environment acquisition necessary to sustain a seamless blending of the real and synthetic locales. Tele-immersion presents the greatest technological challenge for Internet2.


INDEX

1. Introduction
   1.1 Early Developments
2. Requirements for Immersive Tele-Conference Systems
3. How Tele-Immersion Works
4. Tele-Cubicles
   4.1 How a Tele-Cubicle Works
   4.2 Next-Generation Tele-Cubicle Systems
5. Shared Virtual Table Concept
   5.1 Architecture of the System
   5.2 Foreground/Background Segmentation
   5.3 Disparity Estimation and Depth Analysis
   5.4 Head Tracking
   5.5 View Synthesis and Virtual Scene Composition
6. Collaboration with I2 & IPPM
7. Conclusion & Future Applications
8. References


LIST OF FIGURES

1. Figure 3.1 Tele-Immersion implementation
2. Figure 4.1 Tele-cubicles
3. Figure 4.2 Working of a tele-cubicle
4. Figure 4.3 Experimental set-up of NTII at UNC [7]
5. Figure 5.1 A: Setup for a 3-party conference. B: VIRTUE setup
6. Figure 5.2 Rendering of the virtual 3-D conference scene
7. Figure 5.3 Architecture of the 3-D video conference system
8. Figure 5.4 Left: Original frame. Middle: Segmented foreground without shadow detection. Right: With shadow detection
9. Figure 5.5 Images 1 and 2: Rectified frames from the left and right cameras. Images 3 and 4: Disparity fields with occluded areas. Image 5: Disparity map after post-processing
10. Figure 5.6 Composed scene containing computer graphics and video objects


CHAPTER-1 INTRODUCTION
According to Jason Leigh, the term tele-immersion was first used as the title of a workshop bringing together researchers in distributed computing, collaboration, virtual reality, and networking. According to Watsen and Zyda, it enables interaction between geographically remote participants within a shared, three-dimensional space. In the past, people could only dream of communicating in this way over long distances, but advances in telecommunication, together with advances in media techniques, have made it possible. Still, making participants collaborate in a real-time world remained a struggle, as did efforts to have users share the same physical space during their meetings and conferences. The National Tele-Immersion Initiative (NTII) team leads the way in making these things possible: they are working on projects that let users share the same physical space in real time, as if they were sitting in front of each other in the same room. In this regard Advanced Network & Services played a vital role in bringing the experts in this field together. The team is led by Jaron Lanier, one of the pioneers of Virtual Reality in the 1980s (which, as he puts it, makes the brain anticipate a virtual world instead of the physical one). The National Tele-Immersion team started its work in the middle of 1997; the collaborating schools were Brown University (Providence), the Naval Postgraduate School (Monterey), the University of North Carolina at Chapel Hill, and the University of Pennsylvania (Philadelphia).

1.1 EARLY DEVELOPMENTS


At the start, the team's main aim was the ultimate synthesis of media technologies for the scanning and tracking of three-dimensional environments, based on vision-based three-dimensional reconstruction and aided by new advances in fields such as media technologies, networking, and robotics. In May 2000 the team's hectic efforts met with success in the first demonstration of three years of work: the National Tele-Immersion Initiative team, led by virtual-reality pioneer Jaron Lanier, realized what at one stage had been mere imagination. This effort led to thinking that could change the way we communicate over long distances: people could feel themselves submerged together in the same physical space.

The experiment was conducted at Chapel Hill, led by UNC computer scientists Henry Fuchs and Greg Welch. It linked UNC Chapel Hill, the University of Pennsylvania in Philadelphia, and Advanced Network & Services in New York. Researchers at each site could feel themselves present in the offices of colleagues hundreds of miles away. The test apparatus consisted of two large walls, projection cameras, and head-tracking gear. One screen was to Welch's left and the other to his right. Through the left wall Welch could see his colleagues in Philadelphia, and through the other those in New York. He could lean in and out and the images changed accordingly: when he leaned forward the images grew larger, and they became smaller when he moved back. At each target site digital cameras captured the images and laser rangefinders gathered information about the positions of objects. Computers then converted these into three-dimensional information, which was transmitted to Chapel Hill via Internet2, where other computers reconstructed the images and displayed them on the screens.

At first glance tele-immersion may seem to be just another kind of Virtual Reality, but Jaron Lanier takes a different view. According to him, virtual reality allows people to move around in a preprogrammed representation of a 3-D environment, whereas tele-immersion is more like photography. "It's measuring the real world and conveying the results to the sensory system," he says.


CHAPTER-2 REQUIREMENTS FOR IMMERSIVE TELECONFERENCE SYSTEMS


To meet the requirements of immersion, it is absolutely necessary to use a large display that covers almost the whole viewing angle of the visual system. In addition, the large display has to be integrated into the usual workspace of an office or a meeting room. Thus, the most practicable solution is a desktop-like arrangement with large flat screens, such as plasma displays with a diagonal of 50 inches or more. Starting from such a desktop-like system and taking into account results from intensive human-factors research, further requirements on the presentation of the scene can be formulated as follows:

- Conferees are seamlessly integrated into the scene and displayed with at least head, shoulders, torso, and arms in natural life-size.
- All visual parameters of the scene and of the different sources have to be harmonized.
- The perspective of the scene is permanently adapted to the current viewpoint of the conferee in front of the display (head-motion parallax; look-behind effect).
- Eye contact between two partners talking to each other has to be provided.
- Gaze from one conferee to another has to be reproduced well enough that everybody can recognize who is looking at whom (e.g. who is seeking eye contact).
- The voice of a conferee must come from the same direction as his position on the screen.


CHAPTER-3 HOW TELE-IMMERSION WORKS

Figure 3.1. Tele-Immersion implementation

The figure above illustrates the tele-immersion implementation. Two partners separated by a thousand miles collaborate with each other. A "sea of cameras" provides views of the users and their surroundings, and mounted virtual mirrors show each user how his surroundings appear to the other. At each instant the cameras generate images that are sorted into overlapping subsets of three (trios). The depth maps generated from the trios are then combined into a single viewpoint for that moment.
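To make the trio idea concrete, the sketch below groups camera frames into overlapping trios, estimates a disparity (depth) map per trio, and fuses the maps. The actual NTII trinocular matcher is not published here, so OpenCV's StereoSGBM on the outer pair of each trio is a stand-in assumption, as is the requirement that all frames be rectified to a common reference view:

```python
# Hedged sketch: per-trio depth estimation and fusion for a "sea of cameras".
# Assumptions: frames are grayscale and rectified to one common reference view.
import cv2
import numpy as np

def trio_disparity(trio, matcher):
    left, _middle, right = trio        # a real matcher would also use the middle view
    d = matcher.compute(left, right).astype(np.float32) / 16.0  # SGBM is fixed-point
    d[d <= 0] = np.nan                 # mark unmatched pixels as invalid
    return d

def fused_depth(frames):
    matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
    trios = [frames[i:i + 3] for i in range(len(frames) - 2)]   # overlapping trios
    maps = [trio_disparity(t, matcher) for t in trios]
    # Fuse the per-trio maps into one combined map for this instant;
    # pixels invalid in every trio stay NaN.
    return np.nanmedian(np.stack(maps), axis=0)
```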


CHAPTER-4 TELE-CUBICLES
A tele-cubicle is an office that can appear to become one quadrant of a larger shared virtual office space. The initial sites were UIC, UNC, and USC, as well as one in the New York area. The main idea behind this work came directly from the Tele-Immersion meeting of July 21, 1997 at the Advanced Network office. At that meeting each participating university (UIC, NPS, UNC, Columbia, and USC) brought its own cubicle designs, which together immersed both the user and the desk. One of the striking results of the meeting was the discovery of what future immersive interfaces would look like, and of what was needed at the time to turn this seemingly impossible task into reality.

4.1 HOW A TELE-CUBICLE WORKS

Figure 4.1. Tele-cubicles

The apparatus consists of:



- a desk surface (stereo-immersive desk)
- two wall surfaces
- two oblique front stereo projection sources (might be integrated with projectors)

As illustrated in Fig. 4.1, the three display surfaces meet in a corner to form a desk. At present four tele-cubicles can be joined to form a large virtually shared space. During this linkage the walls appear to become transparent passages to the other cubicles, and the desk surfaces join to form a large table in the middle. Objects at each site can be shared for viewing across the common desk, and through the walls the colleagues at the other end, and their environments, can be seen.

Figure 4.2. Working of a tele-cubicle

Fig. 4.2 shows how participants far apart share the same physical space through the common immersive stereo desk: each can see the other's environment, and virtual objects placed in it, across walls that look like transparent glass when the cubicles are connected. The virtual world thus extends through the desktop. The short-term solution at the time was to pre-scan the remote environment; the obvious eventual goal was to have the environment scanned automatically.

In the early years the task faced limitations because the partner universities did not share the same techniques for presenting themselves to the others. Modules such as Sketch, Body Electric, and Alice were the results of the first year of development, but they met with little success because the technology to integrate them was not yet available. The efforts in this direction initiated a project called Office of the Future, in which the ideas discussed at the July 1997 meeting came together. The approach was to use advanced computer-vision techniques to capture the visible objects in the office, such as furniture and people. The captured images were then reconstructed and transmitted over the network to the remote site for display.

4.2 NEXT-GENERATION TELE-CUBICLE SYSTEMS


An attractive shared virtual table environment (SVTE) approach known from the past is that of tele-cubicles. A common feature of these proposals is a special system set-up in which the participants are situated symmetrically around the shared table, each conferee appearing on a screen of his own (see Fig. 4.3). Note that the symmetric geometry of this set-up guarantees eye contact, gaze awareness, and gesture reproduction. Thus, everybody in the session can observe, under the correct perspective, who is talking to whom or who is pointing at what. For example, if the person in front of the terminal in Fig. 4.3 talks to the one on the left while making a gesture in the direction of the one on the right, this third person can easily recognize that the two others are talking about him.


Figure 4.3. Experimental set-up of NTII at UNC [7]

However, while the tele-cubicle concept seems to hold merit, it has several severe disadvantages and unsolved problems. First of all, the specifically arranged display surfaces appear as 'windows' into the offices of the other conferees, resulting in a restricted mediation of social and physical presence. Furthermore, being ideally suited to a fixed number of participants (e.g. three in the set-up of Fig. 4.3) and limited to single-user terminals, the tele-cubicle concept does not scale well: any addition of further terminals requires a physical re-arrangement of displays and cameras simply to adjust the geometry of the SVTE set-up to the new situation. Finally, it is difficult to merge the tele-cubicle concept with the philosophy of shared virtual working spaces. Although the National Tele-Immersion Initiative (NTII) has demonstrated an integration of tele-collaboration tools into its experimental tele-cubicle set-up of Fig. 4.3, joint interaction is limited to two participants; shared workspaces with more than two partners are hard to achieve because of the physical separation of the tele-cubicle windows.


CHAPTER-5 SHARED VIRTUAL TABLE CONCEPT


The basic idea of the shared virtual table concept is to place 3-D video reproductions of a given number of participants at predefined positions in a shared virtual environment. For this purpose, the conferees are captured at each terminal by a multiple-camera set-up, as shown in Fig. 5.1, and the desired 3-D video representation of the local person is extracted from the images.

Figure 5.1 A: Setup for a 3-party conference. B: VIRTUE setup.


Then the 3-D video objects of all conferees are grouped virtually around the shared table. Ideally this is done in an isotropic manner in order to obtain social and geometric symmetry. Hence, in the case of a 3-party conference the participants form an equilateral triangle; in the case of four parties, a square; and so on. Following such generic composition rules and knowing the number of participants, the same SVTE can be built at each terminal from previously loaded scene descriptions and the 3-D video streams, as sketched below.
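A minimal sketch of this composition rule, assuming a unit table radius and a simple 2-D ground-plane coordinate system (both illustrative choices, not values from the report):

```python
# Place N conferees' 3-D video objects at the corners of a regular N-gon
# around the shared virtual table: a triangle for 3 parties, a square for 4.
import math

def seat_positions(num_conferees: int, radius: float = 1.0):
    """Return (x, y) ground-plane seats, isotropically spaced around the table."""
    return [(radius * math.cos(2 * math.pi * k / num_conferees),
             radius * math.sin(2 * math.pi * k / num_conferees))
            for k in range(num_conferees)]

print(seat_positions(3))   # equilateral triangle
print(seat_positions(4))   # square
```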

Figure 5.2. Rendering of the virtual 3-D conference scene

Based on this generic scene composition, individual views of the virtual conference environment can be rendered using a virtual camera, as illustrated in Fig. 5.2. Locally, the position of the virtual camera has to move in step with the current position of the conferee's head, which is continuously registered by a head tracker. Thus, provided the geometrical parameters of the multi-view capture device, the virtual scene, and the virtual camera are well fitted to each other, all conferees are guaranteed to see the scene under the correct perspective view, even while changing their own viewing position. This geometrical coincidence provides all the desired attributes mentioned in the introduction: eye contact, gaze awareness, gesture reproduction, a natural conference situation, and a high degree of realism. In addition, the support of head-motion parallax allows the conferees to change their viewing position in order to watch the scene from another perspective, to look behind objects, or to look at a previously occluded object.
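The coupling between tracker and virtual camera can be sketched as follows. Placing the camera at the tracked eye position and aiming it at the table centre is a simplification (a full system would use an off-axis projection matched to the physical screen), and all coordinates are illustrative:

```python
# Head-coupled perspective: re-render the scene from the tracked eye position.
import numpy as np

def look_at(eye, target, up=(0.0, 1.0, 0.0)):
    """Right-handed view matrix placing the virtual camera at `eye`."""
    eye, target, up = (np.asarray(v, dtype=float) for v in (eye, target, up))
    f = target - eye; f /= np.linalg.norm(f)        # forward
    s = np.cross(f, up); s /= np.linalg.norm(s)     # right
    u = np.cross(s, f)                              # true up
    view = np.eye(4)
    view[0, :3], view[1, :3], view[2, :3] = s, u, -f
    view[:3, 3] = -view[:3, :3] @ eye
    return view

# Each tracker update yields a new eye position; re-rendering from it is what
# produces head-motion parallax and the look-behind effect.
head = np.array([0.1, 0.0, 0.6])                    # tracked eye position (assumed)
view_matrix = look_at(head, target=[0.0, 0.0, 0.0])
```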


5.1 Architecture of the System

Fig. 5.3 outlines the architecture of the 3-D video conferencing system. After multi-view capture, the video frames are segmented to separate the person's silhouette from the background. As a result, the conferees are represented by arbitrarily shaped video objects, which can be integrated seamlessly into virtual environments. To extract depth information, disparity estimation is performed on the rectified video objects, resulting in dense disparity maps. Both video objects and disparity maps are efficiently encoded using MPEG-4.

Figure 5.3. Architecture of the 3-D video conference system

The system concept takes advantage of several particular features of the MPEG-4 multimedia standard. MPEG-4 allows the encoding of arbitrarily shaped video objects and provides auxiliary alpha planes to transmit additional pixel information associated with the color data. These additional planes can be used for joint transmission of disparity maps and video objects. After encoding, the packets are streamed to the other participating terminals via RTP. Simultaneously, each terminal receives video streams from the other conference partners and decodes them with multiple MPEG-4 video decoders. The shaped video objects and audio data are synchronized to each other and integrated into the SVTE scene, which is represented by the MPEG-4 scene description language BIFS.
Finally, an MPEG-4 compositor is used for rendering. This compositor is able to handle user events (head-tracker input or interaction with scene content) as well as scene updates sent by a server. The MPEG-4 video objects are three-dimensionally warped using image-based rendering techniques before they are integrated into the scene. Depending on the current input from the head tracker, the correct perspective view is calculated and the adapted view is inserted into the BIFS scene as a 2-D video.

5.2 Foreground/Background Segmentation

In the finally rendered scene the participants appear as arbitrarily shaped video objects, seamlessly integrated into the virtual environment. This requires segmentation of the moving person from the background, which is assumed to remain fairly static. Initially, the background is captured; a change-detection scheme then compares this reference image with the current video data and provides a segmentation mask. The reference image is permanently updated to cope with slight changes of illumination or scene content. This baseline algorithm has been improved in speed and quality to meet the real-time constraints of full CCIR-601 resolution at video frame rate. Its performance has been further improved by adding a fast and efficient shadow-detection tool. This is particularly important for robust segmentation, because shadows on the table usually cannot be avoided, even under optimal illumination conditions. The effect of shadow detection on the segmentation result is shown in Fig. 5.4.

Figure 5.4. Left: Original frame. Middle: Segmented foreground without shadow detection. Right: With shadow detection.
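A hedged sketch of this step: OpenCV's adaptive MOG2 background model stands in for the report's change-detection scheme (the actual algorithm is not public here). Like the described scheme, it maintains a continuously updated reference background and labels shadow pixels, which are then removed from the foreground mask:

```python
import cv2

# Adaptive background model; detectShadows=True marks shadow pixels as 127.
subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

def segment_conferee(frame):
    mask = subtractor.apply(frame)    # 255 = foreground, 127 = shadow, 0 = background
    mask[mask == 127] = 0             # discard shadows (cf. Fig. 5.4, right)
    mask = cv2.medianBlur(mask, 5)    # light clean-up of the silhouette
    # The masked frame is the arbitrarily shaped video object to be encoded.
    return cv2.bitwise_and(frame, frame, mask=mask)
```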


5.3 Disparity Estimation and Depth Analysis

To extract the required 3-D representation, the depth of the captured video object is analyzed by disparity matching on the basis of the rectified images. We have developed a new hybrid block- and pixel-recursive approach that is able to compute the disparity fields in real time on a state-of-the-art PC. Apart from a considerable reduction of computational load, the algorithm yields spatio-temporally consistent depth maps, which is particularly important for view synthesis, since temporal inconsistencies may cause annoying artifacts in the synthesized views. In order to deal with occlusions caused by hands, arms, or the head, several post-processing techniques are applied. In a first step, critical regions are detected by a consistency check between the disparity fields of a left-to-right and a right-to-left match. Unreliable disparity values in occluded areas are filled efficiently by suitable extrapolation techniques. The segmentation of the hands and arms, which cause severe depth discontinuities, is further refined by tracking and segmenting the hands using motion and skin-color information. Fig. 5.5 shows two rectified images, the disparity fields with occluded areas, and the final disparity map after post-processing.
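The post-processing chain can be sketched as follows. The hybrid block/pixel-recursive matcher itself is not reproduced here, so OpenCV's StereoSGBM is a stand-in assumption; the left-right consistency check and the extrapolation-based hole filling follow the description above:

```python
import cv2
import numpy as np

matcher = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)

def consistent_disparity(left, right, max_diff=1.0):
    """Left-to-right disparity with occluded/unreliable regions detected and filled."""
    d_lr = matcher.compute(left, right).astype(np.float32) / 16.0   # SGBM is fixed-point
    # Right-to-left match via horizontal flipping (a common trick).
    d_rl = matcher.compute(cv2.flip(right, 1), cv2.flip(left, 1)).astype(np.float32) / 16.0
    d_rl = cv2.flip(d_rl, 1)
    h, w = d_lr.shape
    xs = np.tile(np.arange(w), (h, 1))
    x_r = np.clip((xs - d_lr).astype(int), 0, w - 1)   # matched column in the right view
    lr_ok = np.abs(d_lr - d_rl[np.arange(h)[:, None], x_r]) <= max_diff
    d_lr[~lr_ok | (d_lr <= 0)] = np.nan                # mark occluded / unreliable pixels
    for row in d_lr:                                   # fill holes by extrapolation
        good = ~np.isnan(row)
        if good.any():
            row[~good] = np.interp(np.flatnonzero(~good), np.flatnonzero(good), row[good])
    return d_lr
```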

Figure 5.5. Images 1 and 2: Rectified frames from the left and right cameras. Images 3 and 4: Disparity fields with occluded areas. Image 5: Disparity map after post-processing.

5.4 Head Tracking

The perspective of the scene presented on the display depends on the viewer's position. This requires an accurate estimate of the viewer's position in 3-D space, which is provided by the head-tracking module. The chosen approach is based on a skin-color segmentation technique combined with a facial-feature tracker that searches for the eye positions.
Owing to the restriction to video-conferencing scenarios, several assumptions can be made to facilitate tracking of the viewer's head. The 2-D positions of the eyes obtained in two images can then be used for an accurate calculation of the 3-D head position.
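A minimal sketch of that last calculation, assuming the two cameras have been calibrated beforehand; the projection matrices and pixel coordinates below are illustrative values, not ones from the report:

```python
import cv2
import numpy as np

def head_position_3d(eye_cam1, eye_cam2, P1, P2):
    """Triangulate the 3-D head position from one 2-D eye point per camera."""
    p1 = np.asarray(eye_cam1, dtype=float).reshape(2, 1)
    p2 = np.asarray(eye_cam2, dtype=float).reshape(2, 1)
    X = cv2.triangulatePoints(P1, P2, p1, p2)   # homogeneous 4x1 result
    return (X[:3] / X[3]).ravel()               # Euclidean (x, y, z)

# Illustrative calibration: identical intrinsics, cameras 20 cm apart.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.2], [0.0], [0.0]])])

print(head_position_3d((320.0, 240.0), (160.0, 240.0), P1, P2))  # ~ (0, 0, 1)
```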

5.5 View Synthesis and Virtual Scene Composition

The calculated head position and the analyzed depth of the transmitted video objects provide sufficient information to synthesize novel virtual views of the remote conferees. In order to meet real-time constraints and to handle occlusions efficiently, a new view-synthesis algorithm was developed in the VIRTUE project that takes these issues into account.

Figure 5.6. Composed scene containing computer graphics and video objects

At the end of the processing chain, the virtual conference scene has to be composed and displayed on the screen, as depicted in Fig. 5.6. The virtual scene is represented by a number of polygons in 3-D space and encoded in the BIFS scene description language of MPEG-4. In this polygon-based scene representation the participants are substituted by 2-D rectangles positioned around the virtual table. In this sense, the synthesized novel views of the remote participants are treated as textures and are transferred directly to the graphics card.
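Since the VIRTUE algorithm itself is not reproduced here, the sketch below shows the underlying image-based rendering idea in its simplest form: each pixel of a conferee's video object is shifted horizontally in proportion to its disparity, scaled by how far the virtual viewpoint has moved from the capture position, with nearer pixels painted last so they occlude farther ones:

```python
import numpy as np

def warp_view(color, disparity, alpha):
    """Forward-warp a video object; alpha = 0 reproduces the captured view."""
    h, w = disparity.shape
    d = np.nan_to_num(disparity, nan=0.0)          # treat disparity holes as background
    ys, xs = np.mgrid[0:h, 0:w]
    xt = np.round(xs - alpha * d).astype(int)      # disparity-proportional column shift
    valid = (xt >= 0) & (xt < w)
    order = np.argsort(d[valid], kind="stable")    # paint far pixels first
    ys_v = ys[valid][order]
    out = np.zeros_like(color)
    out[ys_v, xt[valid][order]] = color[ys_v, xs[valid][order]]
    return out
```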

CHAPTER-6 COLLABORATION WITH I2 & IPPM


To cope with problems such as communication speed and better transmission of data over the network, the Tele-Immersion team collaborated with Internet2 and the IP Performance Metrics (IPPM) effort. The main problem, obviously, was that today's Internet is not fast enough, especially when a huge bulk of data about people and their environments has to be transmitted: the experiment conducted at Chapel Hill used 60 megabits per second, while good-quality tele-immersion requires about 1.2 gigabits per second. To make this possible, 160 US universities and other institutes started a research project intended to provide high reliability while keeping propagation delay and queuing as low as possible. This could lead to revolutionary Internet applications and ensure rapid transfer of new services to the ever-growing network. All members of the research collaborate on:

- Partnerships
- Initiatives
- Applications
- Engineering
- Middleware
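The 1.2 Gbit/s figure is plausible as roughly the raw rate of a handful of uncompressed colour-plus-depth streams; the parameters below are illustrative assumptions for a back-of-the-envelope check, not measured NTII values:

```python
# Rough bandwidth estimate for uncompressed colour + depth video streams.
streams = 5                         # assumed number of simultaneous views
width, height, fps = 720, 576, 25   # CCIR-601-like capture (assumption)
bits_per_pixel = 16 + 8             # 16-bit colour + 8-bit depth (assumption)

raw = streams * width * height * fps * bits_per_pixel
print(f"raw rate: {raw / 1e9:.2f} Gbit/s")    # ~1.24 Gbit/s, near the quoted 1.2
print("Chapel Hill demo link: 0.06 Gbit/s")   # the 60 Mbit/s actually used
```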

Abilene serves as the backbone of this research, with the initiative providing a separate network capability. The aim was to upgrade the cross-country backbone from 2.5 gigabits per second to 10 gigabits per second, with a target of 100 megabits per second end-to-end.


CHAPTER-7 CONCLUSION & FUTURE APPLICATIONS


All this relies on advances in emerging technologies, most heavily on the ability of the Internet to ship data across different networks without delay. In this regard Internet2 is the key, and the two projects are going hand in hand. According to Defiant, one of the researchers on the Tele-Immersion team, such technology would enable researchers to collaborate in fields such as architecture, medicine, astrophysics, and aeroplane design. The beauty of it is that it allows widely separated people to share a complex virtual experience. "You might be testing a vehicle," says Defiant. "You want to smash it into the wall at 40 miles per hour and put your head by the cylinder block. Say there's a guy from Sweden and you have to prove to him that it doesn't move by 3 centimeters or more. That kind of stuff works." In the years to come this will be one of the major developments. You could visit each other's environments; the one thing still far from being achieved is physical contact between the individuals at each end.

So it can be summarized as:

- Collaboration at geographically distributed sites in real time
- Synthesis of networking and media technologies
- Full integration of Virtual Reality into the workflow


CHAPTER-8 REFERENCES
1. Herman Towles, "National Tele-Immersion Initiative: Sharing Presence with Distant Colleagues", presentation at the Internet2 Spring Meeting, Washington DC, March 7-9, 2001.
2. Amela Sadagic, "National Tele-Immersion Initiative: Towards Compelling Tele-Immersive Collaborative Environments", presentation at Medicine Meets Virtual Reality 2001, Newport Beach, CA, January 2001.
3. Jaron Lanier, "The National Tele-Immersion Initiative", presentation given for the US Army, The Pentagon.
4. Amela Sadagic, "National Tele-Immersion Initiative: Visions and Challenges", invited talk at the Second Workshop on VR in Sweden: Future Directions in Virtual Environments (FDIVE II), 11-12 November 1999.
5. Brown University and NPS (joint presentation), "SOFT - Software Framework for Tele-Immersion" and "SOFT Networking and Scalable Virtual Environments".
6. Ben Teitelbaum, "Internet2 QBone: Building a Testbed for IP Differentiated Services".

LINKS
http://www.advanced.org/teleimmersion.html
http://www.cs.unc.edu/%7Eraskar/Office/
http://www.evl.uic.edu/aej/papers/7ss/7ss.html
http://www.cs.brown.edu/research/graphics/
http://www.cs.unc.edu/Research/stc/

http://www.cs.brown.edu/research/graphics/research/telei/
http://www.cis.upenn.edu/~sequence/teleim1.html
http://www.internet2.edu/
