Aggarwal Panoramic Stereo Videos CVPR 2016 Paper

Panoramic Stereo Videos with a Single Camera
Rajat Aggarwal∗ Amrisha Vohra∗ Anoop M. Namboodiri

Kohli Center on Intelligent Systems,
International Institute of Information Technology- Hyderabad, India.
{rajat.aggarwal@research, amrisha.vohra@research, anoop@}.iiit.ac.in
Abstract

We present a practical solution for generating 360° stereo panoramic videos using a single camera. Current approaches either use a moving camera that captures multiple images of a scene, which are then stitched together to form the final panorama, or use multiple cameras that are synchronized. A moving camera limits the solution to static scenes, while multi-camera solutions require dedicated calibrated setups. Our approach improves upon the existing solutions in two significant ways: It solves the problem using a single camera, thus minimizing the calibration problem and providing us the ability to convert any digital camera into a panoramic stereo capture device. It captures all the light rays required for stereo panoramas in a single frame using a compact custom designed mirror, thus making the design practical to manufacture and easier to use. We analyze several properties of the design as well as present panoramic stereo and depth estimation results.

Figure 1: (a) 3D printed model of the proposed coffee-filter mirror. (b) Image of the proposed mirror placed in a scene (POVRay). (c) Stereoscopic panorama (red-cyan anaglyph) recovered from the image in (b).
∗ Equal Contribution

1. Introduction

The ability to capture and transmit high quality omnidirectional stereo videos enables several applications such as virtual tourism, remote navigation, and immersive entertainment. However, accurate omnidirectional stereo requires the capture of a four-dimensional light-field. Peleg et al. [17] showed that if we assume the two eyes are restricted to move along a horizontal circle (panoramic stereo), one can create independent 2D panoramic images for each eye that closely reproduce the stereo views for a human. This result theoretically enabled the capture of omnidirectional stereo videos. However, practical systems for capturing stereo panoramic videos are just emerging [1].

In recent years, display devices for stereo panoramas have become ubiquitous with virtual reality (VR) headsets that use smartphone displays, such as Google Cardboard and Samsung GearVR. Stereo panoramic videos allow one to stand at any location in the world and look around as if they were present in the real environment. However, the capture of stereo panoramas currently requires a camera that moves along a circle or a complex synchronized multi-camera setup. We aim to develop a simple solution that can capture 360° stereo panoramas using an existing digital camera, thus making the creation process of immersive VR content accessible to a much larger population. The primary challenge here is to capture all the light rays corresponding to two sets of cameras (left and right eye views) arranged along the same circle on a single sensor without causing blind spots or occlusions in the optical system. We show for the first time that such an optical system is possible and practical using a set of custom designed reflective surfaces that we refer to as the Coffee Filter Mirror (see Fig. 1). We derive the surface equations of the mirror petals and discuss the optimization of its parameters for maximizing the visual quality of the captured images.

As mentioned before, most of the existing approaches involve either multiple cameras to capture various perspective directions [12, 1, 5], or a single camera that rotates along a horizontal circle to acquire the images [17, 19, 20]. The problem with moving or multiple cameras is that they make the calibration and camera positioning difficult, thus making the system bulky and delicate to use. They also give rise to visible artifacts like motion parallax, visible seams, synchronization errors and mis-alignments [13]. Moving cameras also limit their use to static scenes and require extensive post processing to get the left and right panoramas. In comparison, our approach has the following advantages:

1. Simplicity of Data Acquisition: In multi-camera systems like Google Jump [1] or the system proposed by Amini et al. [5], all the cameras need to be synchronized using an electronic system to ensure that the images are captured at the same time. The use of a single camera eliminates the need for synchronization and reduces the size of the device, making it easy to use and handle. Data is acquired in the form of a regular image or video and may be stored in standard formats.
2. Ease of Calibration and Post Processing: Our approach solves the problem without any moving parts, thus simplifying the calibration process. As we explain later, simple binary patterns can be used to calibrate the relative configuration of the camera and the mirror. This also simplifies the post-acquisition de-warping process to obtain the left and right panoramas. The whole process of data acquisition and post processing can easily be done on a smartphone, making it a panoramic stereo video capture and display device.
3. Adaptability to Various Applications: Our custom designed mirror can be easily used as an attachment to any consumer camera to convert it into a stereo panoramic capture device. The size of the mirror can be adapted according to the application and the field of view can be controlled.
4. Complete omni-directionality: Our design can be extended using a concave lens to capture mono images of the top or bottom region. We can thus generate a complete 360° x 270° view of the world captured using a simple setup.

We describe in detail the design of the proposed catadioptric omni-stereo system. We also show that our system works efficiently for acquiring both 3D images and videos and for both static and dynamic scenes. We discuss the optimization of the design parameters and present the reconstructed stereo panoramas.

2. Related Work

The last two decades have seen significant advances towards achieving omnistereo imaging. The first set of approaches generates a pair of panoramic images with a vertical offset. While such a geometry results in simple epipolar lines leading to fast disparity estimates, these images are not suitable for human stereo perception as our brain expects a horizontal disparity. Gluckman et al. [12] proposed the use of two omnidirectional cameras, each consisting of a parabolic mirror, telecentric optics and a conventional camera, that were aligned vertically along the same axis. Kawanishi et al. [15] proposed a more complex setup that replaces each omnidirectional camera in the above device with 6 cameras and a hexagonal mirror. While this increases the resolution of the omnidirectional images, the disparity remains vertical as before. Lin et al. [16] created omnidirectional stereo using two cameras and a conical mirror, which limits the vertical FOV. Yi and Ahuja [23] improved on the above designs by capturing stereo omnidirectional panoramas with a single camera. Their system uses a mirror-lens combination to create two light paths that reflect off different positions in the mirror, forming a stereo pair with a vertical disparity.

Acquiring a horizontal disparity panoramic stereo requires one to record a 3D light-field that is the equivalent of all cameras with centers along a horizontal circle. Peleg et al. [17] showed that this may be compressed to two 360° panoramas, one for each eye, without affecting the visual perception. They proposed the use of a single camera that rotates in a circular trajectory to acquire the 3D light-field. Strips of images are then extracted from the left and right ends of the frames and stitched to create the right and left eye panoramas respectively. While this method works quite efficiently for static scenes, it is affected by artifacts like visible seams and vertical parallax for moving objects or uneven camera motions. These errors cause major mis-perceptions when viewed in 3D as explained by Held and Banks [13]. Couture et al. [8, 9] proposed the use of a rotating stereo camera pair and stitching complete frames instead of small strips, which can generate panoramic stereo video textures. The use of rotating cameras limits the system to small repetitive motion and suffers from perceptual artifacts like visible seams and vertical parallax.

Richardt et al. [20] proposed a flow based blending approach that helps reduce these visual artifacts and works well for images captured using hand held cameras. However, the use of SFM makes this approach computationally expensive when applied to high resolution images.

Peleg et al. [17, 19] also explored creating panoramic images using a perspective camera and a spiral mirror. They could capture the 360° panorama using three such setups, each acquiring 132°. They also proposed the use of a Fresnel lens cylinder around an omnidirectional camera for each eye. Both systems were complex and hard to manufacture and use because of their large size and delicate structure.

To reduce visual artifacts and capture dynamic scenes, synchronized multi-camera setups were recently proposed.
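The strip-based construction attributed to Peleg et al. [17] in this section can be illustrated with a small sketch. It is not the authors' or Peleg et al.'s implementation: the function name `strip_panoramas`, the parameters `strip_width` and `offset`, and the nested-list image representation are all hypothetical, and the sketch assumes frames taken at evenly spaced rotation angles, with no blending or alignment between strips.

```python
def strip_panoramas(frames, strip_width=4, offset=20):
    """Build left/right eye panoramas from frames of a rotating camera.

    frames: list of images, each image a list of rows, each row a list of pixels
            (hypothetical representation chosen to keep the sketch dependency-free).
    strip_width: number of columns taken from each frame (assumed parameter).
    offset: distance in columns of each strip from the image centre (assumed parameter).
    """
    height = len(frames[0])
    width = len(frames[0][0])
    center = width // 2
    left_start = center - offset - strip_width   # strip on the left side of the frame
    right_start = center + offset                # strip on the right side of the frame
    if left_start < 0 or right_start + strip_width > width:
        raise ValueError("strips do not fit inside the frame")

    left_eye = [[] for _ in range(height)]
    right_eye = [[] for _ in range(height)]
    for frame in frames:
        for y, row in enumerate(frame):
            # As stated in the text: left strips build the right eye panorama,
            # right strips build the left eye panorama.
            right_eye[y].extend(row[left_start:left_start + strip_width])
            left_eye[y].extend(row[right_start:right_start + strip_width])
    return left_eye, right_eye


if __name__ == "__main__":
    # Synthetic frames: each pixel records (frame index, column index).
    frames = [[[(i, x) for x in range(64)] for _ in range(5)] for i in range(6)]
    left, right = strip_panoramas(frames)
    print(len(left), len(left[0]))
```

Because every frame contributes one strip per eye, any scene motion between frames ends up in neighbouring strips, which is consistent with the seam and parallax artifacts the text reports for moving objects.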
Table 1: Omnistereo device/approach comparison based on device setup complexity, type of scenes that the camera can handle, perceived artifacts in the omnistereo experience, and resolution of the omnistereo panoramas.

Principle | Camera name | Device setup | Type of scenes | Perceived artifacts | Image resolution
Single camera based | Omnistereo [17] | Rotating | Static | Mis-alignments, stitching artifacts and vertical parallax | Medium
Single camera based | Megastereo [20] | Rotating | Static | No artifacts but extensive run time for SFM | High
Single camera based | Coffee-filter (proposed) | Fixed | Dynamic | No stitching artifacts or horizontal disparity errors; very few vertical disparity errors | Medium
Multiple camera based, needs camera synchronization | Omnistereo, Fresnel lens solution [17] | 2 fixed cameras | Dynamic | Chromatic aberrations, difficult to manufacture and use, self-occlusions | Low
Multiple camera based, needs camera synchronization | Panoramic Stereo Video Textures [8, 9] | 2 rotating cameras | Dynamic textures | Visible seams and vertical parallax | High
Multiple camera based, needs camera synchronization | Google Jump [1] | 16 fixed cameras | Dynamic | No stitching artifacts but setup is expensive and bulky | High
Multiple camera based, needs camera synchronization | Omnipolar [6] | 6 fixed cameras | Dynamic | Self-occlusions, setup is expensive and bulky | High
Multiple camera based, needs camera synchronization | Hexagonal Pyramidal Mirrors [15] | 12 fixed cameras | Dynamic | Vertical disparity errors | High

Couture and Roy [6] have proposed a setup which uses 6 cameras with fisheye lenses. Large occlusions are visible in the produced omnistereo panoramas due to the huge curvature of the lenses. Also, to reduce depth distortions, the number of cameras needs to be increased, which in turn makes the system bulkier. Tanaka and Tachi [22] also proposed a method to capture omnistereo video sequences. Their rotating optics system consisted of prism sheets, circular or linear polarizing films, and a hyperboloidal mirror. The latest and most effective solution is a simple extension of the idea by Peleg et al. [17], where the rotating camera is replaced by 16 static cameras along a circle. This virtual reality camera, called 'Google Jump' [1], was introduced at the Google I/O conference. All the cameras are synchronized to take frame-aligned videos, which are then stitched to form a complete 360° video. Although several systems have been proposed in the past, most of them either suffer from limited FOV, or are limited by their size, calibration and alignment issues and cost of manufacturing. Our goal is to reduce this to a simple catadioptric system using a custom designed mirror and a single camera. We also provide a theoretical foundation of the design choices and demonstrate results for both static and dynamic scenes. Table 1 shows a more detailed comparison of our device with other omnistereo devices/approaches.

3. Design Overview

3D information can be extracted from two images taken by two cameras horizontally displaced by a baseline. For a complete 360° view, the two cameras, analogous to the two eyes, can move around the center of a circle called the viewing circle, as shown in Fig. 2a. The diameter of the viewing circle is equal to the baseline. For each viewpoint, the set of tangential rays in the clockwise direction accounts for the left eye views, and the set of tangential rays in the anti-clockwise direction accounts for the right eye views. To accurately capture stereo information, the camera should be able to capture all the rays tangential to the viewing circle. For this purpose, we propose a special mirror design for omnistereo viewing, called the coffee filter mirror, owing to the similarity in shape. Using the image captured by this mirror we generate panoramas for left and right eye views with appropriate disparity. The disparity in these images is used to perceive depth when seen in 3D using a VR headset such as Google Cardboard [2].

Figure 2: (a) The scene point P viewed by L and R eyes forming a viewing circle with diameter equal to baseline. (b) Arrangement of mirrors for capturing right and left views. Tangential rays are captured by placing a mirror normal to the viewing circle.

Consider the arrangement of two flat mirrors $P_1P_2$ and $P_2P_3$ as shown in Fig. 3b. $PB$ is the tangent to the viewing circle $V$ and represents the direction of the rays that are required to be collected for the right eye view. Using a mirror $P_1P_2$, normal to the rays falling in the direction $PB$,
the incident rays can be reflected to the camera placed at the center of the design. $P_1P_2$ provides a horizontal field of view to the eye and is analogous to the strip width in the image-based approaches as mentioned in [17]. Each such mirror is referred to as a face. Multiple faces, when arranged in the configuration shown in Fig. 2b, capture all the tangential rays required for constructing the panorama for a single eye. These faces are arranged at equal angular separation such that $P_i$ (even $i$) lies on the circle $C_{max}$ and $P_j$ (odd $j$) lies on the circle $C_{min}$. Rays that are falling in the direction of $PB$ correspond to the view of the world as seen from the right eye, as shown in Fig. 3b.

Our design is motivated by the idea of capturing both eyes' views in a single device for omnistereo imaging. Between two consecutive faces that are catering to the same eye view, we introduce a similar face that captures the second eye view. We combine the two arrangements of Fig. 2b in a single design, as shown in Fig. 3b, such that rays for both eyes' views are reflected to a single camera. The combination of one left and one right face is referred to as a petal in the rest of the paper. A different number of petals can be used depending upon the application requirements. Fig. 3a shows a general horizontal cross-section of the proposed device to explain the structure and the nomenclature that will be used further in the paper.

Figure 3: (a) A general horizontal cross-section of the proposed coffee filter mirror. (b) Combination of the two arrangements as shown in Fig. 2b to capture both eye views in a single design.

4. Design Details

In order to provide a better 3D experience to the user, it is important that the horizontal FOV of each face of the mirror is sufficient to avoid any stitching artifacts and mis-alignments. Also, the vertical FOV should cover the appropriate height of the world and the resolution must be uniform across all the regions of the captured scene. In this section, we explain how these factors affect our choice of design parameters.

4.1. Horizontal Field of View

Horizontal field of view refers to the amount of the scene captured by a face in the horizontal direction. In our design, this is directly dependent on the number of petals, say $n$. To cover the complete 360° view, each face should cover at least $2\pi/n$ FOV. As shown in Fig. 4a, face $P_1P_2$, which is a flat mirror, views only the region parallel to its length. The best arrangement of the next face for the same eye view, $P_3P_4$, would be such that the FOVs of the two mirrors cover consecutive areas of the scene. However, this is only possible if the mirrors are arranged linearly. FOVs can not intersect at any point if flat mirrors are arranged in a circular manner, as shown in Fig. 4a. Hence, certain areas of the world will be left unseen by the camera and will be missing from the captured images.

Figure 4: (a) Flat mirrors have limited FOV, causing blind spots. (b) Curved mirrors increase the FOV and the overlap between consecutive faces of an eye.

The field of view of each face is increased by opting for a horizontal curvature for the face as shown in Fig. 4b. The center of each curved face lies on the tangent to the viewing circle. Increasing the curvature of the faces increases the overlap between the FOVs of two consecutive faces of the same eye view. This overlap is advantageous while de-warping the captured image into the panoramas, as it resolves the problem of missing regions. However, the increase in curvature also increases the inter-reflections between the neighboring faces.

4.2. Inter reflections

Due to the curvature of the faces in the proposed design, the FOVs of the adjacent faces aligned at an obtuse angle $\beta$ overlap. As shown in Fig. 5, a ray originating from the camera that strikes the face $P_2P_3$ may get reflected multiple times because of the FOV overlap between faces $P_2P_3$ and $P_3P_4$. These inter-reflections will cause mis-captured information at the camera sensor, thus decreasing the resolution of the final de-warped panoramas. Hence, an optimal amount of overlap is kept such that inter-reflections are kept to a minimum, while also solving the problem of missing regions as explained in Section 4.1.

4.3. Vertical Field of View

In the past, several catadioptric systems using various combinations of flat and curved mirrors have been used to increase the amount of the scene captured by the camera in the vertical direction. For better quality and perception
of the de-warped images, it is important that the resolution is uniform both horizontally and vertically. In a flat mirror, vertical FOV is equal to the height of the mirror and hence the resolution remains constant. Our proposed design is such that at each horizontal cross section the faces can be confined within a circle. The radii of these circles increase from 0 to $R_{max}$. As we need uniform resolution, we see that a parabolic curvature in the vertical direction is more desirable as compared to a hyperbola or straight line (see Fig. 6)¹. There have been several approaches proposed to attain uniform resolution on some planes or uniform angular resolutions [7, 11, 14]. However, their work focuses mainly on creating sensors with uniform resolution along the radial line, whereas our mirror is designed to capture stereoscopic views. Therefore, to simplify the setup and the dewarping process, at present, we seek optimality in terms of the size, complexity, and minimizing inter-reflections. Creating a device that can optimally capture stereoscopic views with uniform radial resolution is an aspect that needs to be explored in future.

¹ Please see Supplementary Material for more mathematical details.

Figure 5: Inter-reflections cause wrong world points to be captured due to FOV overlap between each pair of left and right eye view mirrors.

Figure 6: Uniformity of vertical resolution in terms of differential angle of the incident rays (in degrees) along the radial length for a cone, parabola and hyperbola.

Figure 7: (a), (b) Geometry of the petal surface used to obtain optimal design parameters.

5. The Mirror Surface

We now derive the equation of the mirror surface. Multiple factors can be varied to make the device adaptive to specific applications. We derive the expressions for only one petal $APB$ as shown in Fig. 7a, and the same expressions hold for all $n$ petals rotated by $2\pi/n$. Circular surfaces $AP$ and $PB$ are used to capture the right and left eye view respectively. Let us consider the angle between the chords of these two faces as $\beta$. Each petal subtends an angle $\theta$ at the center, where $\theta = 2\pi/n$. Hence, we get $n$ views each for left and right eye. The design of the mirror is symmetrical, and all the petals are of the same size and dimensions. The distance between the two extreme points of a mirror surface, i.e. $AP$ as shown in Fig. 7a, is referred to as the petal length, denoted by $l$. Each petal, say $P_i$, where $i = 1$ to $n$, is bounded on the outside by a circle $C_{max}$ with radius $R_{max}$, and inside by a circle $C_{min}$ with radius $R_{min}$. $V$ is the viewing circle with radius equal to $b$. From Fig. 7a, $OA = R_{min}$ and $OP = R_{max}$. From $\triangle OAP$ and $\triangle OBP$, by the sine rule we get the relations

\[ \frac{l}{\sin(\theta/2)} = \frac{R_{max}}{\sin\left(\pi - \frac{\theta+\beta}{2}\right)} = \frac{R_{min}}{\sin(\beta/2)} \]

Note that $\angle APO = \angle BPO = \beta/2$ and $\angle AOP = \angle POB = \theta/2$, because each face is symmetrical and oriented at equal separation. $\therefore \angle OAP = \pi - \theta/2 - \beta/2$. $LD$ is the perpendicular bisector of the chord $AP$ and is tangent to the viewing circle $V$. $OD = b$ is the radius of the viewing circle. In $\triangle OCD$ and $\triangle CLP$, we get $LP = CP \cos(\beta/2)$, which implies $CP = LP \sec(\beta/2) = \frac{l}{2} \sec(\beta/2)$.

In $\triangle PLC$, $\angle LCP = \pi/2 - \beta/2$. $\angle OCD = \angle LCP$, being vertically opposite angles. Hence, $\angle COD = \pi/2 - \angle OCD = \beta/2$. Also, $OC + CP = R_{max}$, which gives

\[ OC = R_{max} - \frac{l}{2} \sec(\beta/2) \]

In $\triangle OCD$, $OD/OC = \cos(\beta/2)$, which gives $OC = b \sec(\beta/2)$. Using this we get $R_{max} - \frac{l}{2} \sec(\beta/2) = b \sec(\beta/2)$ and $R_{max} = \left(b + \frac{l}{2}\right) \sec(\beta/2)$. Combining all these, we get:

\[ R_{max} = \frac{2b \sin\left(\frac{\theta+\beta}{2}\right)}{\sin\left(\frac{\theta+2\beta}{2}\right)} \tag{1} \]

\[ R_{min} = R_{max} \frac{\sin(\beta/2)}{\sin\left(\frac{\theta+\beta}{2}\right)} \tag{2} \]

\[ l = R_{max} \frac{\sin(\theta/2)}{\sin\left(\frac{\theta+\beta}{2}\right)} \tag{3} \]

5.1. Optimizing the design parameters

In our proposed design, disparity and mirror size can be altered depending upon the application requirement. The size of the mirror is proportional to $R_{max}$. In order to have a compact mirror design that generates human perceivable stereo panoramas, the design parameters need to be optimized. The minimum value of the outer radius of the coffee
filter mirror, i.e. $R_{max}$, is dependent upon $\beta$. At petal angle $\beta_{opt} = \frac{\pi-\theta}{2}$, $R_{max}$ is minimum, and hence we get the minimum size of the device.

In Fig. 7a, let $\angle PBE$ be $\alpha$, the angle between two petals. In $\triangle OBP$, $\angle OPB = \beta/2$, $\angle POB = \theta/2$ and $\angle OBP = \pi - \frac{\theta+\beta}{2}$. Therefore, $\angle PBF = \pi - \left(\pi - \frac{\theta+\beta}{2}\right) = \frac{\theta+\beta}{2}$. $\angle PBE = 2\angle PBF$, which means $\alpha = \theta + \beta$. Consider Fig. 7b where $O'$ is the center of curvature of the face $PB$. $PO'$ and $O'B$ are the radii of curvature, i.e. $r_c$, and $\angle PO'B = 2\gamma$ is the angle subtended by each face at the center of curvature. In $\triangle PO'B$, $\angle A = \pi - (\theta + \beta)$, which implies $\gamma = \pi/2 - \angle A = (\theta + \beta) - \pi/2$. In order to have a small mirror size, $\gamma_{opt} = (\theta + \beta_{opt}) - \pi/2$. Therefore, the optimal horizontal angular field of view is $\gamma_{opt} = \theta/2$ and is independent of the obtuse angle $\angle PBE$ between nearby faces.

$O'C$ is the perpendicular bisector of $PB$, so $CB = l/2$. In $\triangle O'CB$, $\frac{l}{2 r_c} = \sin\gamma$. The radius of curvature $r_c$ can be optimized by using the optimal value of $\gamma$. Therefore, $r_c = \frac{l}{2 \sin(\theta/2)}$ is the optimal radius of curvature. It is to be noted that these centers of curvature lie on a circle.

To avoid wastage of pixels due to inter-reflections, as explained in Section 4.2, it is important to collect the maximum scene information in the captured image. Each face covers $2\pi/n$ angular FOV, thus a total of $n$ such faces for each view covers the complete $2\pi$ FOV. For no missing regions, the FOVs of two faces for the same eye view should cover consecutive areas of the scene. This is achieved by aligning one face in the direction of $O'P$ and the next face for the same eye view in the direction $BE$. Hence the obtuse angle between the two faces $PB$ and $BE$ is $\frac{\pi+\theta}{2}$. The amount of inter-reflections depends upon the angle between two consecutive petals, $\alpha$, which depends upon the sampling angle of the mirror $2\pi/n$. Ideally, the amount of inter-reflections reduces down to zero when the FOVs of two consecutive faces do not intersect at all. However, this way, some of the scene regions will be left uncovered in the FOV of some faces and hence not imaged at all. In order to account for these inter-reflections, we introduce a small angle $\delta$ such that the angle of curvature becomes $2\gamma + \delta$. This makes sure some overlap is there, so that some redundant information is captured, which can be used while dewarping. However, the value of $\delta$ is kept sufficiently low, such that inter-reflections are also reduced to a huge extent.

5.2. Resultant Mirror Surface

In this section, we obtain the surface equations of the proposed coffee filter mirror in terms of polar coordinates $\phi$ and $r$. As explained earlier, the surface of the coffee filter mirror is such that each face is paraboloidal vertically and at each horizontal cross section the faces can be confined within a circle. The radii of these circles increase from 0 to $R_{max}$. Let us consider the central axis of the mirror to be the $z$ axis. Then the surface equation can be written as a function of the $x$ and $y$ axes:

\[ z = f(x, y) = m_\phi (x^2 + y^2), \tag{4} \]

where $m_\phi$ is the slope of the parabola for a given $\phi$. Let $x^2 + y^2 = r^2$, where $r$ is the radial distance in the XY plane and $\phi$ is the angle of the radial line, then:

\[ z = m_\phi r^2 \tag{5} \]

Eqn 5 represents the petal surface of our custom designed mirror centered around the origin. Consider the uppermost and widest cross section of the mirror at $z = z_{max}$, such that $z_{max} = m_\phi r_1^2$, i.e. $m_\phi = z_{max}/r_1^2$. Let $(x_c, y_c)$ be the center of the circle of curvature of a face of a petal and $(x_d, y_d)$ be the point which lies on the curvature, with $r_1^2 = k^2 r^2$ such that $x_d = kx$ and $y_d = ky$. Let $r_c$ be the radius of the circle of curvature for a face. Combining this with Eqn 5, we get $m_\phi = \frac{z_{max}}{k^2 r^2}$, which implies $z = \frac{z_{max}}{k^2}$. It is to be noted that $x_c^2 + y_c^2 = d_c^2$. Calculating the distance between the center of the curvature and the point on the curvature we have:

\[ (x_d - x_c)^2 + (y_d - y_c)^2 = (kx - x_c)^2 + (ky - y_c)^2 = r_c^2 \tag{6} \]

\[ \implies k = \frac{(x x_c + y y_c) + \sqrt{(x x_c + y y_c)^2 - r^2 (d_c^2 - r_c^2)}}{r^2} \]

Since $m_\phi = z_{max}/(k^2 r^2)$,

\[ m_\phi = z_{max} \left( \frac{r}{(x x_c + y y_c) + \sqrt{(x x_c + y y_c)^2 - r^2 (d_c^2 - r_c^2)}} \right)^2 \]

Also, it can be derived from Fig. 8 that $(x_c, y_c) = (x_d, y_d) + r_c \left(\cos(\theta_1 + \theta/2 + \beta/2 + \theta_2), \sin(\theta_1 + \theta/2 + \beta/2 + \theta_2)\right)$, where $\theta_2 = \tan^{-1}\left(\frac{l}{2 r_c}\right)$. From this and Eqn 5 we get,

\[ z = z_{max} \left( \frac{r^2}{(x x_c + y y_c) + \sqrt{(x x_c + y y_c)^2 - r^2 (d_c^2 - r_c^2)}} \right)^2 \tag{7} \]

Figure 8: Parameters of the mirror petal.

Therefore, Eqn 7 gives the equation of the paraboloidal surface of the mirror. Note that the slope $m_\phi$ at every point is a function of $r$. Any incident ray $\tilde{I}$ coming from the world point falls on the mirror surface at the point $P(r, \phi)$ and is reflected in the direction $\tilde{R} = 2(\tilde{I} \cdot \hat{n}(r, \phi))\,\hat{n}(r, \phi) - \tilde{I}$, where $n(r, \phi)$ is the direction of the normal vector at $P$. $n$ is the
normal vector to the tangent plane containing the tangents in both the horizontal and the vertical direction. For a horizontal plane, the direction of normal vector $n_h$ at point $(r, \phi)$ is given by $\begin{bmatrix} y_c - r\sin\phi & r\cos\phi - x_c & 0 \end{bmatrix}^T$ where $(x_c, y_c)$ is the center of curvature. Similarly, the normal vector $n_v$ in the vertical direction is obtained by $\begin{bmatrix} \frac{dx}{dr} & \frac{dy}{dr} & \frac{dz}{dr} \end{bmatrix}^T$, which is obtained as $\begin{bmatrix} \cos\phi & \sin\phi & 2 m_\phi r \end{bmatrix}^T$. Hence, the normal vector to the tangent plane at point $(r, \phi)$ is obtained by $n = \hat{n}_h \times \hat{n}_v$.

5.3. Calibration and De-warping

Images captured by the camera using the coffee filter mirror are of the form shown in Fig. 1b. Captured images vary with the orientation and viewing angle of the camera. However, for stereo vision to be perceivable, the camera's viewing axis must be aligned with the central axis of the device. To calibrate our device, we use a generic non-parametric camera calibration method as proposed by Posdamer and Altschuler [18]. We project structured light binary patterns onto a display surface and project both normal and inverse binary sequence patterns. The obtained calibration images together are used to compute a mapping from 3D world coordinates to 2D image coordinates, which is used for de-warping the captured scene image into left and right eye panoramas.

6. Epipolar Geometry and Stereo

To find the center of projections of the proposed coffee-filter mirror, we find the trajectory of the viewpoints by finding the intersection of the reflected rays for every pair of adjacent points. Let a point lying on a radial line with a single $\phi_j$ be denoted by $p_{i,\phi_j}$. Then the reflected rays for each pair of adjacent points $p_{i,\phi_j}$ and $p_{i+1,\phi_j}$ intersect at $c_{i,\phi_j}$. The set of all such points forms a line as shown in Fig. 10a, $L_{\phi_j} = \{c_{i,\phi_j}\}\ \forall i$. We take the average of these intersection points, $c'_{\phi_j}$, to find the average viewing circle for the complete 360° view. The locus of $c'_{\phi_j}$ over $\phi$ for a single face is a straight line parallel to the central axis of the camera. Let $c''_k$ be the average center for a single face. The locus of $c''_k$ for all $n$ faces is the viewing circle, as shown in Fig. 10b. In order to have a mirror with the diameter of the viewing circle equal to the baseline of the human eye, the diameter of the uppermost cross-section must be twice the human baseline. With the change of diameter of the cross-section from top to bottom, both linear and angular disparities change. However, for the panorama to be perceivable, linear disparities are more important than the angular disparities. The change in linear disparities is avoided by interpolation during dewarping. The combination of mirrors and a conventional camera behaves as a non-central catadioptric camera. For details on such classification, readers are directed to [21].

Let us consider a point in the 3D world defined by $(X, Y, Z)$ which is imaged by a mirror surface at point $P(r, \phi)$; then the ray coming from $X$ to $P$ is viewed by some other mirror surface at location $P'(r', \phi')$. The set of such points forms an epipolar curve for the point $P$. Each point is then transformed into the corresponding image coordinate using the calibration explained in the previous section. The epipolar curve for a point in the left face is found by minimizing the distance between the reflected rays from a point $P$ in a left face to every other point $P'$ in its right face. Thus, for each $\phi$ in the mirror surface, we find the $r_\phi$ which intersects the reflected ray from point $P$ such that,

\[ \frac{\left|[\mathbf{PP'}, \mathbf{I}_{r,\phi}, \mathbf{I}_{r',\phi'}]\right|}{\left|\mathbf{I}_{r,\phi} \times \mathbf{I}_{r',\phi'}\right|} = 0 \tag{8} \]

where $\mathbf{I}_{r,\phi}$ represents the direction of the reflected ray from the mirror surface. Since the design behaves as a non-central camera, every point has different epipolar constraints. We calculate stereo disparity between the left and right views by finding the correspondences along these epipolar curves.

7. Results and Discussions

In order to test our design, we have modeled the coffee filter mirror using POVRay [4], a freely available ray tracing software tool that accurately simulates imaging by tracing rays through a given scene. We have used two 3D scene datasets [10, 3] to demonstrate how the proposed mirror is used to create stereo panoramas (red-cyan anaglyph) as shown in Fig. 1c and Fig. 9. Our mirror can also be used to capture dynamic scenes and create 3D stereo videos. As shown in the results, the proposed coffee filter mirror is able to capture 103° FOV in the vertical direction and 360° in the horizontal direction. The image of the simulated scene is captured using the simulated proposed setup and a virtual camera at a resolution of 5000 x 5000 and is dewarped into left and right panoramas of resolution 512 x 8192.

Fig. 11a shows a stereo depth map of the POVRay scene 'average office' [10] computed using the coffee filter mirror. The qualitative comparison of the depth map with the ground truth is shown in Fig. 11b. Fig. 12a and Fig. 12b show left and right panoramas respectively, created using 12 petals of the initial manufactured prototype of the coffee-filter mirror. Additional results including anaglyph images, videos of dynamic scenes and stereo depth maps may be found at the project website².

We have created a physical prototype of the proposed mirror (Fig. 13(b)) with 24 petals, capturing 48 different left or right eye views. Each face subtends an angle $\theta = 15°$ at the center of the coffee filter mirror. The total height of the mirror $z_{max}$ is kept as 5 cm and the radius of the hole $r_{min}$ = 2.7 cm. We have kept $b$ = 6.5 cm, which is equal to the average value for the human baseline. This optimal petal angle $\beta_{opt}$

² http://cvit.iiit.ac.in/research/projects/panoStereo/
Figure 9: Red-Cyan anaglyph panorama obtained by using the proposed set up using POVRay dataset [3].
6 4
ABCDE
5
2
4
3 0
Z
Y
E
D
C
B
2 A
−2
1 (a)
0 −4
0 2 4 6 8 10 −4 −2 0 2 4
X X
(a) (b)
Figure 10: (a) Lines in black represents the locus of the intersec-
tion of consecutive rays for different radial lines lying on a single (b)
face (b) Red and blue curves represent the caustic curve for left Figure 12: Left and right panoramas extracted from 12 petals of
and right faces respectively. an initial prototype of the mirror shown in Fig13(b).
ters between faces that causes the artifacts in Fig. 9. How-

ever, the ratio of the vertical shift to the horizontal shift is
(a) very small and does not create any visible artifacts beyond
a depth of d. The value of d turns out to be 70 cm when the
rendering cylinder is kept at 200 cm, for images captured
at 5000 × 5000 resolution. These could also be corrected
like other approaches by depth estimation at the dewarping
(b)
Figure 11: (a) Comparison of reconstructed depth as obtained us- stage.
ing the proposed set up with the ground truth depth map.
comes out to be 82.5◦ , the optimal values of Rmax = 9.77

cm, Rmin = 8.571 cm and l = 1.696 cm. To mitigate the
effects of inter-reflections in the adjacent faces, as explained
in Section 4.2, we introduced a small angular overlap be-
tween two adjacent petals as δ = 2◦ . This means each petal
captures with a redundancy of 1/15◦ , since θ = 15◦ . The (a) (b)
resulting panoramas are shown in Fig. 12. Degradations in Figure 13: (a) Mockup of proposed device attached to a consumer
quality of images are due to imperfections of the mirror sur- cellphone camera (b) Set up created using initial manufactured
face that was created by SLA printing. prototype of coffee filter mirror.
In our design, the horizontal shift in camera centers be-

tween faces is very small (≈ 2mm) reducing the blind spot 8. Conclusions
to within 5 cm of the device. The shift decreases along
the height of the mirror. To simplify the post-capture setup, We have proposed a simple practical solution to captur-
ing 360◦ stereo panoramas using a single digital camera for
we avoided depth computation step at dewarping. This re-
immersive human experience. As the resolution of sensors
tains a small disparity difference from the top of the image increase, the quality of the panoramas also increase. We de-
to its bottom. This difference is visible on close inspection rived the optimal parameters of the design and experimental
and may be corrected with depth estimation. As our goal results show that we can avoid most visual artifacts in the
was human perceivable stereo and the artifacts are imper- panoramas. While designed with human consumption in
ceivable while using stereo glasses or HMD, we avoided mind, the stereo pairs could also be used for depth estima-
this step. There are radial and vertical shifts in camera cen- tion.
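The viewpoint construction of Section 6 (intersecting the reflected rays of adjacent mirror points along one radial line, then averaging the intersections) can be illustrated with a minimal sketch. It assumes the reflected rays are already available as 2D (origin, direction) pairs in the plane of one radial line; the function names and the synthetic rays below are hypothetical and are not taken from the paper's implementation.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Ray = Tuple[Point, Point]  # (origin on the mirror, direction of the reflected ray)


def intersect(r1: Ray, r2: Ray, eps: float = 1e-12) -> Optional[Point]:
    """Intersect two rays treated as infinite lines; None if (near) parallel.

    The parameter t may be negative, i.e. the intersection can be a virtual
    viewpoint lying behind the mirror surface.
    """
    (p1, d1), (p2, d2) = r1, r2
    cross = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(cross) < eps:
        return None
    t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / cross
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


def adjacent_intersections(rays: List[Ray]) -> List[Point]:
    """Points c_i for one radial line: intersections of rays i and i+1."""
    points = []
    for r1, r2 in zip(rays, rays[1:]):
        c = intersect(r1, r2)
        if c is not None:
            points.append(c)
    return points


def mean_point(points: List[Point]) -> Point:
    """Average of the intersection points (c' for one radial line)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


# Synthetic example (assumption): three rays that all pass through (1, 2).
rays = [((3.0, 0.0), (-2.0, 2.0)),
        ((4.0, 1.0), (-3.0, 1.0)),
        ((5.0, 3.0), (-4.0, -1.0))]
centres = adjacent_intersections(rays)
c_avg = mean_point(centres)
```

Repeating this for every radial line of a face and averaging again would give the per-face center c″_k; the distance of those centers from the camera axis then gives the radius of the viewing circle.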
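As a quick check on the figures reported in the results, the short sketch below derives the angular pitch of a petal and the angular sampling of the dewarped panoramas. It assumes uniform angular sampling and two faces (one left, one right) per petal. The petal count comes from the physical prototype, while the panorama size and vertical FOV come from the simulated experiment, so the two are combined here only for illustration. The function name is hypothetical.

```python
def sampling_summary(petals: int = 24,
                     pano_width: int = 8192,
                     pano_height: int = 512,
                     vertical_fov_deg: float = 103.0) -> dict:
    """Derived quantities under a uniform angular sampling assumption."""
    return {
        # Angle subtended at the mirror center by one petal (theta in the text).
        "petal_pitch_deg": 360.0 / petals,
        # Two faces per petal: one left-eye view and one right-eye view.
        "views": 2 * petals,
        # Degrees of azimuth covered by one panorama column.
        "deg_per_px_horizontal": 360.0 / pano_width,
        # Degrees of elevation covered by one panorama row.
        "deg_per_px_vertical": vertical_fov_deg / pano_height,
    }


summary = sampling_summary()
```

With the defaults, the petal pitch matches θ = 15° and the view count matches the 48 views stated for the prototype.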
References

[1] Google Jump, https://www.google.com/get/cardboard/jump/.
[2] Google Cardboard, https://www.google.com/get/cardboard/.
[3] http://www.ignorancia.org/en/index.php?page=Childhood.
[4] POVRay, http://www.povray.org/.
[5] A. S. Amini, M. Varshosaz, and M. Saadatseresht. Evaluating a new stereo panorama system based on stereo cameras. International Journal of Scientific Research in Inventions and New Ideas, 2, 2014.
[6] V. Chapdelaine-Couture and S. Roy. The omnipolar camera: A new approach to stereo immersive capture. In Computational Photography, IEEE International Conference on, 2013.
[7] T. L. Conroy and J. B. Moore. Resolution invariant surfaces for panoramic vision systems. In Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, volume 1, pages 392–397. IEEE, 1999.
[8] V. Couture, M. S. Langer, and S. Roy. Panoramic stereo video textures. In Computer Vision, IEEE International Conference on, 2011.
[9] V. C. Couture, M. S. Langer, and S. Roy. Omnistereo video textures without ghosting. In 3D Vision, International Conference on, 2013.
[10] F. Devernay and S. Pujades. Focus mismatch detection in stereoscopic content. In IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics, 2012.
[11] S. Gächter, T. Pajdla, and B. Micusik. Mirror design for an omnidirectional camera with a space variant imager. In IEEE Workshop on Omnidirectional Vision Applied to Robotic Orientation and Nondestructive Testing, pages 99–105, 2001.
[12] J. Gluckman, S. K. Nayar, and K. J. Thoresz. Real-time omnidirectional and panoramic stereo. In Proceedings of Image Understanding Workshop, 1998.
[13] R. T. Held and M. S. Banks. Misperceptions in stereoscopic displays: a vision science perspective. In Proceedings of the 5th symposium on Applied perception in graphics and visualization, 2008.
[14] R. A. Hicks and R. K. Perline. Equi-areal catadioptric sensors. In null, page 13. IEEE, 2002.
[15] T. Kawanishi, K. Yamazawa, H. Iwasa, H. Takemura, and N. Yokoya. Generation of high-resolution stereo panoramic images by omnidirectional imaging sensor using hexagonal pyramidal mirrors. In Pattern Recognition. Fourteenth International Conference on, 1998.
[16] S. Lin and R. Bajcsy. High resolution catadioptric omnidirectional stereo sensor for robot vision. In Robotics and Automation. IEEE International Conference on, 2003.
[17] S. Peleg, M. Ben-Ezra, and Y. Pritch. Omnistereo: Panoramic stereo imaging. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 23:279–290, 2001.
[18] J. Posdamer and M. Altschuler. Surface measurement by space-encoded projected beam systems. Computer graphics and image processing, 18:1–17, 1982.
[19] Y. Pritch, M. Ben-Ezra, and S. Peleg. Optics for omnistereo imaging. In Foundations of Image Understanding, pages 447–467. 2001.
[20] C. Richardt, Y. Pritch, H. Zimmer, and A. Sorkine-Hornung. Megastereo: Constructing high-resolution stereo panoramas. In Computer Vision and Pattern Recognition, IEEE Conference on, 2013.
[21] P. Sturm, S. Ramalingam, and S. Lodha. On calibration, structure from motion and multi-view geometry for generic camera models. In Imaging Beyond the Pinhole Camera. 2006.
[22] K. Tanaka and S. Tachi. Tornado: Omnistereo video imaging with rotating optics. Visualization and Computer Graphics, IEEE Transactions on, 11(6):614–625, 2005.
[23] S. Yi and N. Ahuja. An omnidirectional stereo vision system using a single camera. In Pattern Recognition. Eighteenth International Conference on, 2006.
