
Image De-fencing

M.TECH FIRST YEAR THESIS PRELIMINARY REPORT


SIGNAL PROCESSING

Submitted in partial fulfillment of


the requirements for the award of M.Tech Degree in
Electronics and Communication Engineering (Signal Processing)
of the University of Kerala

Submitted by
Jerry Korulla George
Second Semester
M.Tech, Signal Processing

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING


COLLEGE OF ENGINEERING
TRIVANDRUM
2014

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING


COLLEGE OF ENGINEERING
TRIVANDRUM

CERTIFICATE
This is to certify that this report entitled Image De-fencing is a bonafide record of
the work done by Jerry Korulla George, under our guidance, towards partial fulfillment
of the requirements for the award of the Master of Technology Degree in Electronics and
Communication Engineering (Signal Processing) of the University of Kerala during the year 2014.

Dr. Sreeni K G
Asst. Professor, Dept. of ECE
College of Engineering, Trivandrum.
(Guide)

Dr. Vrinda V Nair


Professor, Dept. of ECE
College of Engineering, Trivandrum.
(Head of the Department)

Dr. Jiji C V
Professor, Dept. of ECE
College of Engineering, Trivandrum.
(PG Coordinator)

ACKNOWLEDGEMENTS
I would like to express my sincere gratitude and heartfelt indebtedness to my professors
in the Department of Electronics and Communication Engineering for their valuable guidance
and encouragement in pursuing this thesis work.
I would like to express my sincere gratitude to Dr. Vrinda V. Nair, Head of the Department,
and Dr. Jiji C. V., P.G. Coordinator, Electronics and Communication Engineering, for their
valuable guidance and support during the thesis work.
I would like to convey my special gratitude to Dr. Sreeni K G, Assistant Professor,
Department of Electronics and Communication Engineering, for his guidance and valuable suggestions.
I also acknowledge the other members of the faculty in the Department of Electronics and
Communication Engineering and all my friends for their wholehearted cooperation and
encouragement.
Above all, I am thankful to the Almighty.

Jerry Korulla George

Contents

1 INTRODUCTION
2 LITERATURE SURVEY
  2.1 Fence Detection and Removal
  2.2 Image In-painting
  2.3 Results
3 WORK DONE
4 WORK PLAN
  4.1 September
  4.2 October
  4.3 November
  4.4 December

List of Figures
1.1 Fences in images.
2.1 Results of Liu et al. [1].
2.2 Results of Park et al. [2].

CHAPTER 1
INTRODUCTION
A picture is worth a thousand words: a complex idea can be conveyed with just
a single still image. Invented in the early decades of the 19th century, photography by
means of the camera seemed able to capture more detail and information than traditional
media such as painting and sculpture. Color photography was invented in the mid-19th
century. In 1981, Sony introduced the first consumer camera to use a charge-coupled
device for imaging, eliminating the need for film. This marked the beginning of a new era
in photography: digital photography. The introduction of digital photography
made photography simpler and cheaper for the common man.
Ever since its invention, the development of digital image capture technology has been
relentless. There have been tremendous advancements in the hardware capabilities of
digital cameras, such as the resolution of image sensors, low-light performance, and
zooming capability, as well as in the software side of digital photography. The
introduction of techniques such as red-eye correction and blur removal has helped make
digital photographs much better than their analog counterparts. As these advancements
slowly filter down to amateur and enthusiast consumer products, more and more people
have the chance to take better digital photos at lower cost. Some of the broad areas of
photography are wildlife photography, landscape photography, and urban photography.
Tourists and amateur photographers are often hindered in capturing their cherished
images by a fence or other occlusion that limits accessibility to the scene of interest.
Typical examples of such situations are illustrated in Figure 1.1. The picture of the
leopard would have looked much better had there been no fence in front of it, and it
would likewise be desirable to eliminate the window frames from the airport floor scene.
However, such obstructions cannot always be avoided, owing to growing security concerns
at public places. Hence a need exists for a tool that can post-process such fenced
images to produce a de-fenced image.
Figure 1.1: Fences in images.

The above-mentioned de-fencing problem provides a good area of research in image
processing. This work attempts to identify and solve the challenges in image de-fencing.
The fundamental steps that have been identified in image de-fencing are:
1. Automatic detection and removal of the fence/occlusions in an input image.
2. Filling in the gap created by the removal of the fence/occlusions, utilizing
information from the input image.

CHAPTER 2
LITERATURE SURVEY
The novel idea of image de-fencing was introduced by Liu et al. [1]. Park et
al. [2] suggested ways of improving the quality of de-fenced images using
additional techniques such as multi-view in-painting. However, the techniques involved
in the process, such as lattice detection and image completion, remain active areas of
research.

2.1 Fence Detection and Removal

The identification of regular and near-regular patterns in images has a long history of
research. Liu et al. [1] used the method of Hays et al. [3], an iterative algorithm that
tries to find the most regular lattice for a given image by assigning neighbor
relationships among a set of interest points, and then using the strongest cluster of
repeated elements to propose new, visually similar interest points. The neighbor
relationships are assigned so that neighbors have maximum visual similarity; more
importantly, higher-order constraints promote geometric consistency between pairs of
assignments. Finding the optimal assignment under these second-order constraints is
NP-hard, so a spectral approximation method [4] was used.
No specific restrictions on the type of visual or geometric deformation present in a
near-regular texture are imposed in [3], but with increasing irregularity it becomes more
difficult to find a reasonable lattice. When a see-through regular structure is overlaid
onto an irregular background, finding a lattice is especially challenging: if the
regularity is too subtle or the irregularity too dominant, the algorithm will not find
potential texels. To alleviate this, the threshold for visual similarity used in [3] for
the proposal of texels was lowered. Since the near-regular structures in the test cases
tend to have relatively small geometric deformations, the algorithm could still find the
correct lattice even with the larger number of falsely proposed texels that a less
conservative texel proposal method might produce.
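As a rough illustration of the spectral approximation used for this second-order assignment problem, the sketch below builds a pairwise affinity matrix over candidate assignments, extracts its principal eigenvector by power iteration, and greedily binarizes it, broadly in the spirit of [4]. The affinity and conflict inputs, and all names, are illustrative assumptions rather than the exact formulation of [3] or [4].

```python
import numpy as np

def spectral_matching(affinity, conflicts):
    """Sketch of a spectral relaxation for second-order assignment.

    affinity  : (n, n) symmetric, non-negative pairwise-consistency scores
                between candidate assignments.
    conflicts : (n, n) boolean matrix; True where two assignments may not
                both be kept (e.g. they claim the same interest point).
    Returns a boolean mask over the n candidate assignments.
    """
    n = affinity.shape[0]
    # Principal eigenvector via power iteration; its entries act as
    # confidences that each candidate assignment belongs to the solution.
    x = np.ones(n) / np.sqrt(n)
    for _ in range(200):
        x = affinity @ x
        x /= np.linalg.norm(x) + 1e-12

    # Greedy binarization: accept the most confident assignment, then drop
    # everything that conflicts with it, and repeat.
    keep = np.zeros(n, dtype=bool)
    conf = x.copy()
    while conf.max() > 1e-9:
        a = int(np.argmax(conf))
        keep[a] = True
        conf[conflicts[a]] = 0.0
        conf[a] = 0.0
    return keep
```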
The final lattice is a connected, space-covering mesh of quadrilaterals in which the
repeated elements contained in each quadrilateral (the texels) are maximally similar.
The lattice does not convey which parts of each texel belong to the regular structure
and which to the irregular background. However, it does imply a dense correspondence
between all texels, which makes it possible to discover the spatially distinct regular
and irregular subregions of the texels that correspond to foreground and background,
respectively.
Liu et al. [1] put all the texels into correspondence by computing, for each texel, a
homography that brings its corners into alignment with the average-shaped texel.
After aligning all the texels, the standard deviation of each pixel is computed through
this stack of texels. Liu et al. [1] could have classified background versus foreground
simply by thresholding this variance among corresponded pixels, but a more accurate
classification is achieved when the color information in each texel is considered in
addition to these aggregate statistics. Each pixel's color is coupled with the standard
deviation of each color channel at its offset in the aligned texel, giving as many
6-dimensional feature vectors as there are pixels within the lattice. A constant relative
weighting is used between the standard deviation and color features, with the standard
deviation weighted more heavily, and different values of k are used for different
examples. The feature vectors are clustered with k-means, and the cluster whose centroid
has the lowest variance is assigned to the foreground, with the rest treated as
background. From this classification a mask is constructed in image space which labels
each pixel as foreground, background, or unknown.
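A minimal sketch of this texel-stack classification is given below, assuming the texels have already been warped to the average shape and stacked into an array. Using the per-offset mean color, a fixed k = 2, and the particular weighting factor are simplifying assumptions for illustration, not the exact choices of [1].

```python
import numpy as np
from sklearn.cluster import KMeans

def classify_texel_pixels(texel_stack, std_weight=3.0, k=2):
    """Label aligned texel pixels as foreground (regular) or background.

    texel_stack : (n_texels, H, W, 3) float array of texels already warped
                  to the average texel shape.
    Returns an (H, W) boolean mask, True for the foreground (fence) layer.
    """
    # Per-pixel, per-channel standard deviation through the stack: low values
    # indicate the repeating (foreground) structure, high values the changing
    # background seen through it.
    std = texel_stack.std(axis=0)                  # (H, W, 3)
    mean_color = texel_stack.mean(axis=0)          # (H, W, 3)

    # 6-D feature per pixel offset: mean color plus more heavily weighted std.
    feats = np.concatenate([mean_color, std_weight * std], axis=-1)
    H, W, _ = std.shape
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(feats.reshape(-1, 6))

    # The cluster whose pixels are most stable through the stack is taken as
    # the foreground lattice structure.
    cluster_var = [std.reshape(-1, 3)[labels == c].mean() for c in range(k)]
    fg_cluster = int(np.argmin(cluster_var))
    return (labels == fg_cluster).reshape(H, W)
```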
Park et al. [2] used a basic lattice detection algorithm similar to [5]. The procedure is
divided into two phases, where the first phase proposes one (t1, t2) vector pair and one
texture element, or texel. 2D lattice theory states that every 2D repeating pattern can
then be reconstructed by translating this texel along the t1 and t2 directions. During
phase one, the method detects KLT corner features, extracts texture around the detected
corners, and selects the largest group of similar features in terms of normalized
correlation similarity. It then proposes the most consistent (t1, t2)-vector pair through
an iterative process of randomly selecting three points to form a (t1, t2) pivot for
RANSAC and searching for the pivot with the maximum number of inliers.
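The phase-one RANSAC search can be sketched as follows: given the 2D locations of the features in the selected appearance cluster, it repeatedly picks three points as an origin plus two translation vectors and keeps the (t1, t2) pair that explains the most points as integer lattice combinations. The tolerance, iteration count, and function name are assumptions for illustration.

```python
import numpy as np

def propose_t1_t2(points, iters=2000, tol=5.0, seed=None):
    """RANSAC-style search for the (t1, t2) translation pair of a lattice.

    points : (N, 2) array of feature locations from one appearance cluster.
    Returns (t1, t2, inlier_count).
    """
    rng = np.random.default_rng(seed)
    best = (None, None, 0)
    for _ in range(iters):
        o, a, b = points[rng.choice(len(points), 3, replace=False)]
        t1, t2 = a - o, b - o
        det = t1[0] * t2[1] - t1[1] * t2[0]
        # Skip degenerate pivots with (nearly) collinear translation vectors.
        if abs(det) < 1e-6:
            continue
        # Express every point in lattice coordinates relative to the origin o
        # and count points that fall near integer (i, j) lattice positions.
        A = np.stack([t1, t2], axis=1)                 # 2x2 basis, columns t1, t2
        coords = np.linalg.solve(A, (points - o).T).T  # fractional (i, j)
        err = np.linalg.norm((coords - np.round(coords)) @ A.T, axis=1)
        inliers = int(np.sum(err < tol))
        if inliers > best[2]:
            best = (t1, t2, inliers)
    return best
```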
In phase two, each lattice point is tracked under a 2D Markov Random Field (MRF)
formulation with compatibility functions built from the proposed (t1, t2)-vector pair and
texel. The lattice grows outwards from the initial texel locations, using the (t1,
t2)-vector pair to detect additional lattice points. Tracking is initiated by predicting
lattice points from the proposed (t1, t2) vector pair under the MRF formulation. The
inferred locations are then examined: if the image likelihood at a location is high, that
location becomes part of the lattice. For robustness, however, the method avoids setting
a hard threshold and instead uses the region-of-dominance idea introduced in [6]. This is
particularly important since there is no prior information about how many points to
expect in a given image; if the threshold for detecting lattice points is set too high,
the recall rate suffers.
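The region-of-dominance idea of [6] can be illustrated by ranking candidate correlation peaks by the distance to the nearest stronger peak instead of by a hard score threshold. The routine below is a generic sketch of that ranking, not the exact procedure embedded in the MRF inference of [2].

```python
import numpy as np

def region_of_dominance_ranking(peaks, scores):
    """Rank candidate peaks by their region of dominance.

    peaks  : (N, 2) array of candidate lattice-point locations.
    scores : (N,) array of their correlation responses.
    Returns peak indices sorted by dominance radius (largest first), where the
    dominance radius of a peak is its distance to the nearest stronger peak.
    """
    peaks = np.asarray(peaks, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = len(peaks)
    radius = np.full(n, np.inf)
    for i in range(n):
        stronger = scores > scores[i]
        if np.any(stronger):
            d = np.linalg.norm(peaks[stronger] - peaks[i], axis=1)
            radius[i] = d.min()
    # The globally strongest peak keeps an infinite radius and ranks first.
    return np.argsort(-radius)
```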
Since the performance of lattice detection plays an essential role in this application,
Park et al. [2] introduced a better decision system that uses online classification and
combined the lattice detection procedure with foreground/background segmentation. In
addition, they segment out the foreground layer during the detection procedure and build
a mask that removes the noisy regions of each texel, preventing background irregularities
from distracting and misguiding the inference procedure. Since evaluating a noisy image
likelihood could misdirect the inference of new texel locations and result in inaccurate
lattice detection, Park et al. [2] evaluated the image likelihood of each texel by
normalized cross-correlation using only the foreground mask.
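A masked normalized cross-correlation of the kind described here might look like the sketch below, where the comparison is restricted to foreground pixels so that background clutter inside the texel window does not corrupt the likelihood; the function is illustrative and not taken from [2].

```python
import numpy as np

def masked_ncc(patch, template, mask):
    """Normalized cross-correlation between two texel windows, evaluated only
    over the pixels where mask is True (the foreground fence layer).

    patch, template : (H, W) grayscale arrays of identical shape.
    mask            : (H, W) boolean foreground mask.
    """
    if not np.any(mask):
        return 0.0
    a = patch[mask].astype(float)
    b = template[mask].astype(float)
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < 1e-12:
        return 0.0
    return float(np.dot(a, b) / denom)
```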

2.2 Image In-painting

A plausible background can be estimated by applying texture-based in-painting to all
pixels that have been labeled as foreground. Liu et al. [1] used the method of
Criminisi et al. [7], which modifies Efros and Leung [8] by changing the order in which
pixels are synthesized so as to encourage continuous, linear structures. The modified
synthesis order profoundly improves in-painting results even for regions that are
relatively thin. Patch-based image completion methods [9] are less appropriate for this
in-painting task because the target regions may be only a single patch wide, which
obviates the need for the sophisticated patch placement strategies explored in [9];
moreover, the source regions offer few complete patches to draw from. At the other end of
the in-painting spectrum, diffusion-based methods also work poorly: the target regions
are wide enough that diffusion leaves obvious blurring.
A mask which appears to cover a foreground object perfectly can produce surprisingly bad
in-painting results due to a number of factors: the foreground objects are often not well
focused because scenes often have considerable depth to them, sharpening halos introduced
in post-processing or in the camera itself extend beyond the foreground object, and
compression artifacts also reveal the presence of an object beyond its boundary. All of
these factors can leave obvious bands where a foreground object is removed. In order to
remove all traces of a foreground object, the mask is dilated considerably before
in-painting is applied.
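The dilate-then-fill step could be prototyped with OpenCV roughly as below. Note that cv2.inpaint (Telea) is only a convenient stand-in used here for brevity; [1] relies on the exemplar-based method of [7], and the mask dilation radius is an assumed parameter.

```python
import cv2
import numpy as np

def defence_with_dilated_mask(image, fence_mask, dilate_px=4):
    """Dilate the detected fence mask and fill the hole it leaves.

    image      : HxWx3 uint8 BGR image.
    fence_mask : HxW mask, nonzero where the fence was detected.
    dilate_px  : how far to grow the mask to swallow halos, defocus and
                 compression artifacts around the fence.
    """
    mask = (np.asarray(fence_mask) > 0).astype(np.uint8) * 255
    kernel = cv2.getStructuringElement(
        cv2.MORPH_ELLIPSE, (2 * dilate_px + 1, 2 * dilate_px + 1))
    dilated = cv2.dilate(mask, kernel)

    # Stand-in fill; an exemplar-based method such as Criminisi et al. [7]
    # would be needed for the texture-preserving results reported in [1].
    return cv2.inpaint(image, dilated, inpaintRadius=5,
                       flags=cv2.INPAINT_TELEA)
```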
The in-painting task here is especially challenging compared to the settings of previous
in-painting work such as [7]. Liu et al. [1] typically had a large percentage of the
image to fill in, from 18 to 53 percent after dilation, and the boundary between the
source regions and the masked-out regions had a very large perimeter. These factors
conspire to leave few complete samples of source texture with which to perform the
in-painting, a problem rarely encountered in previous in-painting applications, where
large regions of source texture with simple topology were available.
One of the most challenging problems in in-painting is the scarcity of source samples
[1]. Park et al. [2] tried to overcome this in two ways. The first approach was to try to
see the occluded object in another view. Liu et al. [1] reported that the overall
occupation of the foreground fence layer in their data set ranged from 18% to 53%.
However, even a small offset of the camera can reveal pixel values behind the foreground
layer, since objects behind the layer experience less parallax than the foreground.
Moving objects will also reveal parts of themselves through multiple frames, even to a
stationary camera. Since these offsets are small in video, object alignment can be
approximated as a 2D translation. Park et al. [2] used the information from multiple
views to aid the in-painting process by minimizing the number of pixel values that need
to be inferred.
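Under this 2D-translation assumption, pixels revealed in a second frame can be copied into the occluded regions before any synthesis is attempted. The sketch below uses OpenCV's phase correlation to estimate the shift; all parameters, names, and the single-shift model are illustrative assumptions rather than the exact multi-view procedure of [2].

```python
import cv2
import numpy as np

def fill_from_second_view(ref, ref_mask, other, other_mask):
    """Fill fence pixels of `ref` with background revealed in `other`.

    ref, other           : HxW grayscale frames of the same scene.
    ref_mask, other_mask : HxW boolean fence masks for each frame.
    Returns (partially filled image, mask of pixels still needing in-painting).
    """
    # Global 2D translation between the frames (background parallax assumed
    # small enough to be approximated by a single shift). The sign convention
    # of the returned shift should be verified for the OpenCV build in use.
    (dx, dy), _ = cv2.phaseCorrelate(ref.astype(np.float32),
                                     other.astype(np.float32))

    # Warp the second frame (and its fence mask) into the reference frame.
    M = np.float32([[1, 0, -dx], [0, 1, -dy]])
    h, w = ref.shape
    warped = cv2.warpAffine(other, M, (w, h))
    warped_mask = cv2.warpAffine(other_mask.astype(np.uint8), M, (w, h)) > 0

    # Copy pixels that are occluded in the reference but visible (not fence)
    # in the warped second view; whatever remains goes to in-painting.
    fillable = ref_mask & ~warped_mask
    out = ref.copy()
    out[fillable] = warped[fillable]
    return out, ref_mask & warped_mask
```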
The second approach dealt with the situation after multi-view in-painting, or where no
additional views are available. For gaps that still remain, Park et al. [2] adopted an
exemplar-based in-painting algorithm [10], [7] as the base tool. In addition, they tried
to overcome the scarcity of candidate patches by simulating bilateral symmetry patterns
from the source image. Since reflection symmetry often exists in man-made environments
and in nature, simulating these patterns from the source image often recovered occluded
regions reliably and efficiently.
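One simple way to approximate this symmetry-based augmentation is to add mirrored copies of the source image to the pool from which exemplar patches are drawn; the sketch below only builds that augmented candidate set and leaves the exemplar search to whichever in-painting routine is used. It is an illustrative simplification of the symmetry simulation described in [2].

```python
import numpy as np

def augmented_patch_sources(image, source_mask):
    """Build an augmented set of (image, valid-pixel mask) pairs for exemplar
    search, simulating reflection symmetry from the source image.

    image       : HxWx3 array of the fenced photograph.
    source_mask : HxW boolean array, True where pixels are known background
                  (i.e. usable as exemplar source).
    """
    sources = [(image, source_mask)]
    # Horizontal and vertical mirror images provide extra, plausibly symmetric
    # candidate patches when the scene contains reflection symmetry.
    sources.append((image[:, ::-1], source_mask[:, ::-1]))
    sources.append((image[::-1, :], source_mask[::-1, :]))
    return sources
```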

2.3 Results

Figures 2.1 and 2.2 illustrate the results of Liu et al. [1] and Park et al. [2],
respectively.

Figure 2.1: Results of Liu et al. [1].

Figure 2.2: Results of Park et al. [2].


The images clearly show that the results of Park et al. [2] are better than those of
Liu et al. [1] for images with a larger percentage of occlusion.

CHAPTER 3
WORK DONE
The preliminary task in the thesis work was the collection of relevant literature. A
number of journal and conference publications related to the topic were collected, and
each of the papers was read and the necessary information extracted.

CHAPTER 4
WORK PLAN
The work plan for the four months from September to December is as follows:

4.1 September

Extensive literature survey of related publications on regular pattern detection and image in-painting.

4.2 October

Implement the algorithm in MATLAB and C.

4.3 November

Develop an Android app to perform automatic image de-fencing.

4.4 December

Study the scope of improving the algorithm.

REFERENCES
[1] Yanxi Liu, Belkina T,Hays J.H, Lublinerman R, Image De-fencing,IEEE Conference on Computer Vision and Pattern Recognition, 2008.
[2] Park, M.,Brocklehurst, K., Collins, R. T., and Liu, Y. Image de-fencing revisited.
Computer Vision 2010. Springer Berlin Heidelberg, 2011. 422-434.
[3] J. Hays, M. Leordeanu, A. Efros, and Y. Liu. Discovering texture regularity as
a higher-order correspondence problem. In European Conference on Computer
Vision, 2006.
[4] M. Leordeanu and M. Hebert. A spectral technique for correspondence problems
using pairwise constraints. In IEEE International Conference on Computer Vision
(ICCV), 2005.
[5] Park, M., Collins, R.T., Liu, Y.: Deformed Lattice Discovery via Efficient Mean
Shift Belief Propagation. In: 10th European Conference on Computer Vision,
Marsellie, France (2008)
[6] Liu, Y., Collins, R.T., Tsin, Y.: A computational model for periodic pattern perception based on frieze and wallpaper groups. Pattern Analysis and Machine Intelligence, IEEE Transactions on 26 (2004) 354-371
[7] A. Criminisi, P. Prez, and K. Toyama. Region filling and object removal by
exemplar-based inpainting. In Proc. IEEE Computer Vision and Pattern Recognition (CVPR), 2003.
[8] A. A. Efros and T. K. Leung. Texture synthesis by non-parametric sampling. In
IEEE International Conference on Computer Vision (ICCV), pages 1033-1038,
1999.
[9] J. Sun, L. Yuan, J. Jia, and H.-Y. Shum. Image completion with structure propagation. ACM Transactions on Graphics (SIGGRAPH), 24(3): 861-868, 2005.


[10] A. Criminisi, P. Pérez, and K. Toyama. Object removal by exemplar-based inpainting. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2003, pp. 721-728.

