K.V.Shailesh, 200401120
Prof. Suman Mitra
Evaluation Committee no: 2
Abstract - This project aims to develop a method for
automatically generating a 3D model from a single
uncalibrated image. We do not aim to identify the various
objects and shapes in an image but to coarsely segregate the
image into 3 main categories, namely sky, vertical and
horizontal. The next step is to make the computer learn the
perspective in the image and generate a 3D interpretation
of the single input image. Our major contributions are
segmenting the image into the 3 categories mentioned above
and developing algorithms for machine learning. This
methodology would ultimately aid the development of
virtual walkthroughs and 3D models without requiring the
user to go into the technical details.
Index Terms - Image processing, image segmentation,
perspective viewing, projective geometry, texturing.
I. INTRODUCTION
Increasing interest in the entertainment and communication
design industry has led to significant developments in the field
of 3D graphics. As a result, we are able to enjoy
virtual reality gaming with stunning graphics, high definition
animation films and interactive virtual tours. A subset of this
boom is 3D modeling based on image-based rendering, which
uses photographs to generate 3D models
that are then used in developing gaming and virtual tour
environments. However, the development of these
environments has remained strictly the domain of
professionals and often requires a great amount of technical
expertise, multiple photographs shot using calibrated cameras
and special equipment.
This project allows us to bypass all the technical
complexity mentioned above by embedding it in an
automated framework. Applications of this project range
from the development of personalized virtual tours and
walkthroughs to first-person shooter (FPS) game
development. The application takes as input any image with
considerable depth of field and automatically generates a 3D
VRML model as the output, allowing us to walk through it or
integrate it with any other application.
II. PROBLEM STATEMENT
The aim is to develop a methodology for automatically generating 3D
models in VRML from single images. To achieve this, the
main tasks would be to develop image segmentation
V. APPROACH
Before we begin, we make certain basic assumptions about our set
of input images and draw some conclusions from
observation and from some basic image processing
operations:
I. Assumptions
1. Every image has a sky in the upper part.
2. Every image has at least one vanishing point, inside or
outside the image boundaries.
3. Every image has sufficient depth of view.
4. Every image in the input data set has straight lines.
II. Conclusions
1. The sky has an almost uniform gradient (observed by edge
detection).
2. Vertical structures have vertical lines (slope = 90°).
3. Vertical lines (slope = 90°) only shorten with distance; their
inclination remains unaltered (based on statistical
analysis using the linear Hough transform).
To address the above problem, we have divided the whole
process into 7 steps, as shown in Figure 1.
Step 1. Detecting an Edge
Edges characterize boundaries, so edge detection is a problem of
fundamental importance in image processing. Edges in images
are areas with strong intensity contrasts: a jump in intensity
from one pixel to the next. Applying edge detection to an image
significantly reduces the amount of data and filters out useless
information, while preserving the important structural
properties of the image. According to [6], different edge
detection methods may be grouped into two categories,
FIGURE 1
THE STEPS INVOLVED IN THE DESIGN OF THE ALGORITHM
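To make the intensity-jump idea of Step 1 concrete, the following minimal sketch marks a pixel as an edge when its gradient magnitude exceeds a threshold. The use of the Sobel operator, the function name `sobel_edges` and the threshold are assumptions made for illustration; the surviving text does not state which detector was used in the project.

```python
def sobel_edges(image, threshold):
    """Return a binary edge map for a grayscale image given as a list of rows.

    Illustrative sketch only: border pixels are left as non-edges.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal intensity change (Sobel x kernel).
            gx = (image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1])
            # Vertical intensity change (Sobel y kernel).
            gy = (image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                  - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1])
            magnitude = (gx * gx + gy * gy) ** 0.5
            if magnitude >= threshold:
                edges[y][x] = 1
    return edges


# A dark region next to a bright region: the jump lies between columns 2 and 3.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edge_map = sobel_edges(image, threshold=200)
```

Only the pixels on either side of the intensity jump are kept, which is how edge detection reduces the data to the structural outline of the scene.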
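$$y = \left(-\frac{\cos\theta}{\sin\theta}\right) x + \frac{r}{\sin\theta}$$
which can be rearranged to
$$r = x\cos\theta + y\sin\theta \qquad (2)$$
It is therefore possible to associate with each line of the
image a pair $(r, \theta)$, which is unique if $\theta \in [0, \pi]$ and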
$$(X_f, Y_f) = \frac{\sum_{i=1}^{l(dp)} dp(x_i, y_i)}{l(dp)} \qquad (3)$$
FIGURE 2
OUTPUT OF LINE DETECTION USING HOUGH TRANSFORM [8]
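The normal form in equation (2) suggests a simple voting scheme: every edge pixel votes for all (r, θ) pairs that satisfy r = x cos θ + y sin θ, and the most-voted pairs correspond to lines. The sketch below is a minimal pure-Python version under assumed parameters (180 angle bins, unit r resolution); the function names `hough_lines` and `strongest_line` are hypothetical and not taken from the project.

```python
import math


def hough_lines(edge_points, num_thetas=180, r_step=1.0):
    """Accumulate votes in (r, theta) space for a list of (x, y) edge pixels."""
    accumulator = {}
    for x, y in edge_points:
        for t in range(num_thetas):
            theta = math.pi * t / num_thetas          # theta sampled in [0, pi)
            r = x * math.cos(theta) + y * math.sin(theta)
            key = (round(r / r_step), t)              # quantised (r, theta) bin
            accumulator[key] = accumulator.get(key, 0) + 1
    return accumulator


def strongest_line(accumulator, num_thetas=180, r_step=1.0):
    """Return (r, theta, votes) for the bin with the most votes."""
    (r_index, t), votes = max(accumulator.items(), key=lambda item: item[1])
    return r_index * r_step, math.pi * t / num_thetas, votes


# Fifty edge pixels on the vertical image line x = 5.
points = [(5, y) for y in range(50)]
r, theta, votes = strongest_line(hough_lines(points))
```

In this parameterisation a vertical image line has θ = 0, since θ is the angle of the line's normal rather than of the line itself.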
FIGURE 3
VANISHING POINT APPROXIMATION. LEFT-CONSIDERING VERTICAL LINES.
RIGHT-IGNORING VERTICAL LINES. RED CROSSES REPRESENT INTERSECTION
POINTS AND GREEN DOT REPRESENTS VANISHING POINT
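Equation (3) and Figure 3 together suggest estimating the vanishing point as the mean of the pairwise intersection points of the detected lines, with vertical lines left out. The sketch below follows that reading, taking dp as the list of intersection points and l(dp) as its length; this interpretation, the 5° tolerance and the function names are assumptions, since the definitions did not survive in the text.

```python
import math
from itertools import combinations


def intersection(line_a, line_b, eps=1e-9):
    """Intersect two lines given as (r, theta); return None if near-parallel."""
    r1, t1 = line_a
    r2, t2 = line_b
    det = math.sin(t2 - t1)
    if abs(det) < eps:
        return None
    x = (r1 * math.sin(t2) - r2 * math.sin(t1)) / det
    y = (r2 * math.cos(t1) - r1 * math.cos(t2)) / det
    return x, y


def vanishing_point(lines, vertical_tolerance=math.radians(5)):
    """Average the intersections of all non-vertical line pairs."""
    # In (r, theta) form a vertical image line has theta close to 0 or pi.
    slanted = [(r, t) for r, t in lines
               if vertical_tolerance < t % math.pi < math.pi - vertical_tolerance]
    points = []
    for line_a, line_b in combinations(slanted, 2):
        point = intersection(line_a, line_b)
        if point is not None:
            points.append(point)
    if not points:
        return None
    mean_x = sum(p[0] for p in points) / len(points)
    mean_y = sum(p[1] for p in points) / len(points)
    return mean_x, mean_y


# Three lines through (10, 20) plus one vertical line x = 3 that is ignored.
angles = [math.radians(a) for a in (30, 60, 120)]
lines = [(10 * math.cos(t) + 20 * math.sin(t), t) for t in angles]
lines.append((3.0, 0.0))
vp = vanishing_point(lines)
```

Dropping the vertical line matters here: its intersections with the slanted lines would pull the mean away from the common point.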
TABLE I
PROS AND CONS OF THE 3 ALGORITHMS TESTED
1. Vertical Traversal
+ Faster: 1.135675 secs for a 478x434 px image.
2. Region Growing
+ As all the edge pixels for every pixel are considered, it is more
versatile compared to 1.
- Slower: 14.071094 secs for a 478x434 px image.
- Obstacles dividing the sky into 2 disjoint regions cannot be
overcome, e.g. an overhead transmission line.
3. K-Means clustering using L*a*b* colour space
+ As it is colour-based segmentation, it is more efficient in
removing sky (blue) and clouds (white/grey).
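The third algorithm in Table I groups pixels by colour. As a rough illustration, the sketch below runs plain k-means on colour vectors that are assumed to be already converted to the (a*, b*) channels of L*a*b*; the conversion itself is not shown. The function name `kmeans`, the initial centres and the sample values are hypothetical and are not measurements from the project.

```python
def kmeans(points, initial_centres, max_iterations=20):
    """Cluster colour vectors; return (labels, centres)."""
    centres = [tuple(float(v) for v in c) for c in initial_centres]
    labels = [0] * len(points)
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centre.
        labels = [
            min(range(len(centres)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centres[k])))
            for point in points
        ]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for k, old_centre in enumerate(centres):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append(
                    tuple(sum(vals) / len(members) for vals in zip(*members)))
            else:
                new_centres.append(old_centre)   # keep an empty cluster in place
        if new_centres == centres:
            break                                # converged
        centres = new_centres
    return labels, centres


# Made-up (a*, b*) values: three bluish "sky" pixels and three "ground" pixels.
pixels = [(-5, -30), (-4, -28), (-6, -32), (10, 25), (12, 27), (11, 23)]
labels, centres = kmeans(pixels, initial_centres=[(0, -10), (0, 10)])
```

Pixels that fall in the cluster with the bluish centre would then be treated as sky and masked out.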
FIGURE 5
RED REGION REPRESENTS THE MASK TO SELECT THE GROUND. DETECTED
VERTICAL LINE SEGMENTS ARE SHOWN IN BLUE
FIGURE 4
THE IMAGE ON THE LEFT IS THE ORIGINAL AND THE RIGHT SIDE SHOWS THE OUTPUT OF
THE MODIFIED IMPLEMENTATION OF SKY SEGMENTATION. NOTE THAT THE
OBSTACLES IN THE SKY HAVE BEEN HANDLED EFFECTIVELY.
FIGURE 6
PROJECTION OF THE BLUE RECTANGLE ON THE PLANE Z=1
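The projection named in Figure 6 can be sketched as a central projection through the origin onto the plane z = 1, which maps (x, y, z) to (x/z, y/z). Placing the camera centre at the origin, the function name `project_to_plane_z1` and the rectangle coordinates are all assumptions for illustration; the figure's actual setup is not recoverable from the caption.

```python
def project_to_plane_z1(point):
    """Project a 3D point through the origin onto the plane z = 1."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (x / z, y / z)


# A hypothetical ground rectangle: near edge at depth 2, far edge at depth 4.
rectangle = [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0),
             (1.0, -1.0, 4.0), (-1.0, -1.0, 4.0)]
projected = [project_to_plane_z1(p) for p in rectangle]
```

The far edge projects to half the width of the near edge, so the rectangle appears as a trapezoid converging toward the vanishing point.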
FIGURE 7
SNAPSHOTS OF THE OUTPUTS GENERATED FOR 2 IMAGES
VI. ACKNOWLEDGEMENT
I convey my sincere thanks to Prof. Suman Mitra for
providing his valuable and timely guidance.
REFERENCES
[1] Hoiem, D., Efros, A. A., and Hebert, M., Automatic Photo Pop-up. In
ACM SIGGRAPH, 2005.
[2]
[3]
[4]
[5] Horry, Y., Anjyo, K.-I., and Arai, K., Tour into the picture: using a
spidery mesh interface to make animation from a single image. In ACM
SIGGRAPH, 225-232, 1997.
[6] www.pages.drexel.edu/~weg22/edge.html
[7] Duda, R. O., and Hart, P. E., Use of the Hough
Transformation to Detect Lines and Curves in Pictures. In Comm.
ACM, vol. 15, no. 1, 11-15, January 1972.
[8]
[9]