Team Members
Abstract --- Stereo vision is an important research topic in the field of image processing and is well suited to acquiring 3-dimensional information about real space. Stereo vision is the process of constructing a 3D model of a scene by processing two 2D images of that scene. The main problem is the construction of a disparity map, which describes which points in the two images correspond to the same point in the 3D scene.

In a stereo system, the 3D real-world position is derived by translating coordinates between the cameras and the world. Using stereo vision therefore requires a system that provides a kinematically precise translation between camera and world coordinates, despite the intricacy and difficulty of building one. In this paper, we propose an approach to developing a system that can easily obtain 3D information by directly computing position from the translation of coordinates and from the disparity in a stereo pair of images, which is used to find the reference depth of objects. The algorithm is based on correlation and uses the epipolar constraint.

Keywords --- Stereo Vision, Epipolar Geometry, Camera Calibration, Rectification, Disparity

I. INTRODUCTION

Making a machine see objects is one of the most important tasks in artificial intelligence, manufacturing automation, and robotics.

Computer vision is one of the fastest growing areas within computer science. Aided by rapid recent progress in hardware and software design, computer vision projects are making use of vast increases in processing and memory capacities to enhance their performance. In order for computers to effectively process, segment and analyze visual input from their environments, the system is often required to obtain data about the surrounding world in a format that can easily be related to the actual environment in which the system finds itself. For many vision systems this could be a 3-dimensional representation of the real world. For humans this is a task that we achieve quite naturally from an early age, and it soon becomes second nature for us to judge distance, perspective and space accurately. However, when human vision systems are analyzed, it becomes apparent that the brain uses a multitude of techniques to give us a sense of the three-dimensional world in which we live.

A computer vision system can obtain depth data from a scene using a number of different techniques. Three-dimensional scene data can be obtained from sources including object shading, motion parallax, structured light or laser range finders. However, perhaps the most obvious technique is stereo vision. In a system analogous to a pair of human eyes, the input to two cameras observing the same scene can be analyzed, and the differences between the two images used to compute object depth and hence a model of the scene that the system is viewing. The uses of a robust implementation of such a system are many and potentially include applications in areas such as space flight, face recognition, immersive video conferencing and industrial inspection, to name just a few [7].

Stereo vision, which is inspired by the human visual process, computes the disparity between corresponding points in images captured by multiple cameras in order to measure distance. In robots, visual interpretation is an extremely challenging problem. Robot vision technology is needed for stable walking, object recognition and movement to a target spot.

Stereo vision of three-dimensional space would give a robot powerful artificial intelligence. Stereo Vision is an
important research topic in the field of image processing. The problem is to compute a 3D model of a scene from two (or possibly more) 2D images. Each pixel in these images is represented by a number corresponding to a grey level. In order to construct a 3D model of a scene, a disparity map is needed. Such a map describes, for each point in the source image, where to find the best corresponding point in the target image. From a disparity map and the calibration information of the cameras that generated the two images, a 3D model of the scene can be constructed.

Computational stereo vision involves inferring the 3-dimensional depth of a scene from the small differences between multiple views of the same scene. In its simplest formulation, computational stereo vision uses only two views of the same scene. This is also the most difficult problem in stereo vision, since the amount of available data is small relative to multi-view stereo vision and hence the ambiguity of the problem is high. With stereo vision, the disparity or image distance d between matching points XL and XR in the left and right images is inversely proportional to the depth z, with the baseline distance b and focal length f as the constant factors. The concept of stereo vision itself is easily grasped, but the stereo matching problem of finding the correct match is much more ambiguous.

Real-time stereo vision has been widely used in intelligent robot navigation, smart human-computer interaction, intelligent surveillance, etc. The estimation of the disparity between two images of the same scene is a long-standing issue for the machine vision community [1]. Stereoscopic vision is based on the principle, first utilized by nature, that two spatially differentiated views of the same scene provide enough information to perceive the depth of the portrayed objects. Thus, the importance of stereo correspondence is apparent in the fields of machine vision, virtual reality, robot navigation, simultaneous localization and mapping [2], [3], depth measurement and 3-D environment reconstruction. The two alternatives for estimating disparity are either to align the stereo camera rig precisely and then perform the required rectification (leading to simple scan line searches), or to use an arbitrary stereo camera setup and avoid any rectification (searching throughout blocks). Accurately aligned stereo devices are very expensive.

... parameters to be determined. Techniques for camera calibration loosely fall into three categories: linear, non-linear and two-step. Linear techniques assume a simple pinhole camera model and do not account for lens distortion effects, which turn out to be “significant in most off-the-shelf charge coupled devices”. In non-linear methods a relationship between parameters is established and then an iterative solution is calculated through minimization. The parameters to be calibrated are classified in two groups:

1) Internal (or intrinsic) parameters: Internal geometric and optical characteristics of the lenses and the imaging device.
2) External (or extrinsic) parameters: Position and orientation of the camera in a world reference system [7].

Calibration is important for accuracy in 3D reconstruction. In particular, it is a critical task for stereo vision analysis. Stereo cameras are usually calibrated by calibrating each camera independently and then applying a geometric transformation of the external parameters to find the geometry of the stereo setting [6].

Methods for estimating camera parameters typically rely on targeted test-fields and correspondences between targets and their images on one or more frames. For multi-image configurations, precise 3D test-fields can be replaced by simple 2D patterns, typically of a chess-board type, with unknown exterior orientation (Fiala and Shu, 2005). A further advantage of such patterns is their high contrast and regularity. Several freely available algorithms exist for estimating interior and exterior orientation parameters based on chess-board patterns imaged from different points of view. Among the functional tools presented in this context, Bouguet's camera calibration, implemented in C++ and included in the Open Source Computer Vision (OpenCV) library distributed by Intel, is probably the best known. The initialization step includes manual pointing of the four chessboard corners on all images and knowledge of the number of nodes per row and column. Node locations can thus be first approximated and then identified with sub-pixel accuracy by a point operator [6].
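To make the role of the two parameter groups concrete, the following is a minimal sketch, not the calibration procedure of this paper. It assumes a distortion-free pinhole model and an ideal rectified pair with identical intrinsics. The function names and all numeric values are hypothetical and chosen only for illustration. It projects a world point with intrinsic parameters (f, cx, cy) and extrinsic parameters (R, t), and shows that the resulting disparity is consistent with the inverse relation between d and z stated in the introduction, z = f b / d.

```python
# Illustrative sketch only: names and values are assumptions, not the paper's code.

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def project(point_world, f, cx, cy, R, t):
    """Project a 3D world point to pixel coordinates (pinhole, no lens distortion).

    Intrinsic parameters: focal length f in pixels, principal point (cx, cy).
    Extrinsic parameters: rotation R (3x3 nested lists) and translation t,
    mapping world coordinates to the camera frame as X_cam = R * X_world + t.
    """
    x_cam = [sum(R[i][j] * point_world[j] for j in range(3)) + t[i]
             for i in range(3)]
    X, Y, Z = x_cam
    if Z <= 0:
        raise ValueError("point is not in front of the camera")
    return (f * X / Z + cx, f * Y / Z + cy)


def rectified_disparity(point_world, f, cx, cy, baseline):
    """Disparity d = xL - xR for an ideal rectified pair.

    The left camera sits at the world origin; the right camera is shifted by
    `baseline` along the x axis, so its translation is t = (-baseline, 0, 0).
    """
    left = project(point_world, f, cx, cy, IDENTITY, [0.0, 0.0, 0.0])
    right = project(point_world, f, cx, cy, IDENTITY, [-baseline, 0.0, 0.0])
    return left[0] - right[0]


def depth_from_disparity(d, f, baseline):
    """Invert the relation d = f * b / z to recover depth z."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return f * baseline / d


if __name__ == "__main__":
    # Hypothetical setup: f = 700 px, principal point (320, 240), b = 0.1 m.
    point = [0.5, 0.2, 2.0]
    d = rectified_disparity(point, 700.0, 320.0, 240.0, 0.1)
    print("disparity (px):", d)
    print("recovered depth (m):", depth_from_disparity(d, 700.0, 0.1))
```

In a real system the values of f, cx, cy, R and t would come from the calibration step described above rather than being fixed by hand, and lens distortion would have to be corrected before this simple model applies.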
III. IMPLEMENTATION
Fig: 2
D. Disparity
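As a rough illustration of correlation-based matching under the epipolar constraint, as mentioned in the abstract, the sketch below searches for the best match of a left-image window along the same scan line of the right image. It assumes an already rectified pair, so corresponding points lie on the same row. Normalized cross-correlation is used as one common choice of correlation score, and the function names, window size and search range are hypothetical; the paper's actual matching procedure may differ.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity windows."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # a flat window carries no texture to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def scanline_disparity(left_row, right_row, x, half_window, max_disparity):
    """Estimate the disparity of pixel x of the left row.

    Because the pair is assumed rectified, the search is restricted to the
    same row of the right image (the epipolar line), at positions x - d.
    """
    if x - half_window < 0 or x + half_window >= len(left_row):
        raise ValueError("window does not fit inside the row")
    ref = left_row[x - half_window:x + half_window + 1]
    best_d, best_score = 0, -2.0  # NCC scores lie in [-1, 1]
    for d in range(max_disparity + 1):
        xr = x - d
        if xr - half_window < 0:
            break  # candidate window would leave the image
        cand = right_row[xr - half_window:xr + half_window + 1]
        score = ncc(ref, cand)
        if score > best_score:
            best_d, best_score = d, score
    return best_d


if __name__ == "__main__":
    # Synthetic rows: the bright feature appears 3 pixels further left
    # in the right image than in the left image.
    left = [10, 10, 10, 10, 10, 10, 20, 80, 200, 80, 20, 10, 10, 10, 10, 10]
    right = [10, 10, 10, 20, 80, 200, 80, 20, 10, 10, 10, 10, 10, 10, 10, 10]
    print(scanline_disparity(left, right, x=8, half_window=2, max_disparity=6))
```

Repeating this search for every pixel of every row yields a dense disparity map. Practical implementations add safeguards such as left-right consistency checks, which are omitted here for brevity.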