
CS Club Ekaterinburg on 17-18 February 2011

What Computer Vision with OpenCV can and cannot do

http://pixdaus.com/pics/1244922167d3Z4fjf.jpg

Denis S. Perevalov

www.uralvision-en.blogspot.com perevalovds@gmail.com
Ural Federal University / Institute of Mathematics and Mechanics UB RAS
Contents
Introduction
1. What is computer vision
2. Cameras for computer vision
3. Introduction to OpenCV
4. OpenCV integration into multimedia projects
5. Possibilities and limitations of simple computer vision tasks
6. Possibilities and limitations of complex computer vision tasks
7. New applications of computer vision
Conclusion
Introduction

- The lecture intent


- The lecture is for…

Back to Contents
The lecture intent
This lecture is about:
- Computer vision,
- OpenCV library,
- The possibilities and limitations of computer vision that arise when solving applied image-analysis problems.

Therefore:
The lecture intent
We will be interested in:

1) Algorithms that solve image-analysis problems in (almost) real time, i.e. the processing time of one frame should not exceed 1-10 s.

2) Observations about the applicability of such algorithms.

We will not be interested in:

1) Accelerating algorithms on the GPU.

2) Neural networks and artificial intelligence.


The lecture is for…

- For those who are interested in computer vision and want to learn more about its current capabilities and new ways to apply it.
The lecture is for…

- For those who have no experience with OpenCV yet, but wish to get it as soon as possible.
The lecture is for…

- For those who work seriously on computer vision and want to learn more about the bottlenecks and problems that may occur when using the best (to date) computer vision algorithms.
1. What is
computer vision
- Definition
- First characteristic of computer vision tasks
- Second characteristic of computer vision tasks
- Examples of computer vision tasks
- An example of a task that is NOT computer vision

Back to Contents
Definition
(From Wikipedia)

Computer vision is the theory and technology of creating machines that can see.

http://the-gadgeteer.com/wp-content/uploads/2009/12/
mr-robot-head-game.jpg
Definition
As a scientific discipline, computer vision relates to the theory
and technology of creating artificial systems that receive
information from images. ...

As a technological discipline, computer vision seeks to apply


theories and models of computer vision to create computer
vision systems. ...

http://www.spectec.net.nz/pictures/cctv%20pic.jpg
Definition
Computer vision can also be described as a complement to (but not necessarily the opposite of) biological vision.

Biological vision studies the visual perception of humans and various animals, producing models of such systems in terms of physiological processes. Computer vision, on the other hand, studies and describes computer vision systems implemented in hardware or software. Interdisciplinary exchange between biological and computer vision has been very productive for both scientific fields.

http://sobiratelzvezd.ru/wallpapers/wikimedia_23.jpg
Definition
The topics of computer vision include:

- Action reproduction,

- Event detection,

- Tracking,

- Pattern recognition,

- Image restoration.
First characteristic of computer vision tasks

The input data are a two-dimensional array of data, i.e. an "image".

NOTE
The data can also be:
- video, that is, a sequence of images,
- 3D data: point clouds from 3D scanners or other devices.
Sample Image

Ordinary light, radio waves, ultrasound are all sources of images:

1. Color images of the visible spectrum


2. Infrared images
3. Ultrasound images
4. Radar images
5. Images with depth data
Sample Image

1. Color images of the visible spectrum

http://rkc.oblcit.ru/system/files/images/%D0%9F%D1%80%D0%B8%D1%80%D0%BE%D0%B4%D0%B013.preview.jpg
http://imaging.geocomm.com/gallery/san_francisco_IOD032102.jpg
Sample Image
2. Infrared images

http://lh6.ggpht.com/_Wy2U3qKMO8k/SSyB6BTdg8I/AAAAAAAACd8/Iai_3QZIjrI/Australia+5+dollars+B+se.jpg
http://i367.photobucket.com/albums/oo117/syquest/acrylic_no_filter.jpg
Sample Image
3. Ultrasound images
Image with side-scan sonar:

http://ess.ru/publications/2_2003/sedov/ris6.jpg
Sample Image
4. Radar images

Snapshot of the radar:

http://cdn.wn.com/pd/b1/3a/abd9ebc81d9a3be0ba7c4a3dfc28_grande.jpg
Sample Image

5. Images with depth data

http://opencv.willowgarage.com/documentation/c/_images/disparity.png
Video http://www.youtube.com/watch?v=pk_cQVjqFZ4
First characteristic of computer vision tasks

The input data are a two-dimensional array of data, i.e. an "image".

But two-dimensional arrays of data are used not only in computer vision:
Disciplines dealing with 2D-data
Second characteristic of computer vision tasks

The goal of processing is the extraction and use of color and geometric structures in the image.

http://www.tyvek.ru/construction/images/structure.jpg
Disciplines dealing with 2D-images

1. Signal and image processing
Low-level data processing, usually without a detailed study of image content.
Objectives: restoration, noise removal, data compression, quality improvement (sharpness, contrast, ...).

2. Computer vision
Mid-level data analysis: separating objects in the image and measuring their parameters.

3. Pattern recognition
High-level data analysis: determining the type of an object. The input data usually must be presented as a set of features. The features are often computed using 1 and 2.
Examples of computer vision tasks

Segmentation: partitioning the image into regions that are "homogeneous" in some sense.
Examples of computer vision tasks

Detection of objects of interest in the image, and calculation of their size and other characteristics.

http://armi.kaist.ac.kr/korean/UserFiles/File/MMPC.JPG
Examples of computer vision tasks

Tracking: following objects of interest over a sequence of frames.

http://www.merl.com/projects/images/particle.jpg
Examples of computer vision tasks
Virtual reality gloves: recognition of the colors and patterns on the hands; an MIT project, prototype.

Video http://www.csail.mit.edu/videoarchive/research/gv/hand-tracking
Examples of computer vision tasks
Detection of markers (for use in marker-based augmented reality).

http://www.edhv.nl/edhv/wp-content/uploads/2009/12/aug_Picture-10_no-border-450x337.jpg
http://jamiedubs.com/fuckflickr/data/web/ar-marker-BchThin_0036.png
An example of a task that is NOT computer vision
Finding a path through a maze.
(Although the input data is an image, the task is not to find objects in it but to solve a combinatorial path-finding problem.)

http://www.promrds.com/chapter9/Images/NewMaze.gif
2. Cameras
for computer vision
- Key features
- Examples of good cameras

Back to Contents
Key features

Different real-time processing tasks require different cameras.

Their main features are:

1. Resolution

2. The number of frames per second

3. Type of data obtained

4. Way to transfer data into the computer


Resolution
This is the image size in pixels, obtained from the camera.

320 x 240   - accuracy when observing an object 1 m in size: 3.13 mm; size of 30 frames: 6.6 MB
640 x 480   - accuracy when observing an object 1 m in size: 1.56 mm; size of 30 frames: 26.4 MB
1280 x 1024 - accuracy when observing an object 1 m in size: 0.97 mm; size of 30 frames: 112.5 MB

http://www.mtlru.com/images/klubnik1.jpg
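Where these numbers come from (a small sketch, not part of the original slides): the per-pixel accuracy is the object size divided by the number of pixels spanning it, and the raw data size assumes uncompressed 3-byte RGB pixels.

#include <cstdio>

// Raw size of 'frames' uncompressed RGB frames, in MB (3 bytes per pixel)
static double framesMB(int w, int h, int frames)
{
    return double(w) * h * 3 * frames / (1024.0 * 1024.0);
}

int main()
{
    std::printf("320 x 240  : %.1f MB\n", framesMB(320, 240, 30));   // ~6.6 MB
    std::printf("640 x 480  : %.1f MB\n", framesMB(640, 480, 30));   // ~26.4 MB
    std::printf("1280 x 1024: %.1f MB\n", framesMB(1280, 1024, 30)); // ~112.5 MB

    // Accuracy: a 1 m object spanning N pixels gives 1000 / N mm per pixel,
    // e.g. 1000 / 320 = 3.13 mm, 1000 / 640 = 1.56 mm, 1000 / 1024 = 0.97 mm.
    return 0;
}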
The number of frames per second
This is the number of images obtained from the camera per second.

30 fps  - time between frames: 33 ms
60 fps  - time between frames: 16 ms
150 fps - time between frames: ~6 ms; can even be used for a musical instrument:
http://www.youtube.com/watch?v=7iEvQIvbn8o
Type of data obtained
What data we get from the camera for processing.

- Color or grayscale image of the visible spectrum
- Infrared image (using invisible infrared illumination, such a camera can see in a dark room)
- Color image + depth (information about the distance to objects)
Way to transfer data into the computer

- Analog cameras
- Webcams (USB cameras)
- FireWire cameras (IEEE-1394)
- Network (IP) cameras
- "Smart" cameras
Analog

Historically the first to appear;
the signal is transmitted as an analog (TV-format) signal.

(+) Transmits data over long distances, albeit with interference (100 m)
(+) Easy to install, small size

(-) Feeding the signal into a computer requires a special capture card or TV tuner, which usually consumes a lot of computing resources.
(-) "Interlacing" makes it very difficult to analyze the image if there is movement
(in fact, 2 half-frames arrive, each 50 times/sec).
Webcams (USB-camera)

Appeared around 2000;
transmit data over the USB protocol,
uncompressed or compressed as JPEG.

(+) Easy to connect to a computer and to software
(+) Cheap, widely available

(-) Overhead: decoding JPEG requires computing resources.
(-) The cheapest models usually have poor optics and sensors (noisy images).
(-) Because of USB bandwidth limitations, no more than 2 cameras can be connected to a single USB hub, and a PC usually has only 2-3 USB hubs.
Firewire-camera (IEEE-1394)

Cameras that transmit the signal
over the FireWire protocol,
usually in a dust- and moisture-proof case;
usually these are cameras for industrial applications.

(+) Transfer of uncompressed video of excellent quality at high speed
(+) Several cameras can be connected
(+) Tend to have excellent optics

(-) High price
(-) Require power, which is sometimes difficult to provide when connecting to laptops
Network (IP-camera)

Cameras that transmit data over a
network (wired or wireless) channel.
They are now rapidly gaining
popularity in all areas.

(+) Easy connection to a PC
(+) Easy installation
(+) Data can be transferred over an unlimited distance, which allows building a network of cameras covering a building or an area, attached to an airship, etc.
(+) Control: the camera can be rotated and the zoom adjusted

(-) May have problems with response speed
(-) Still a relatively high price
(-) Not yet portable (2011)
"Smart" cameras (Smart cameras)

Cameras with a computer
located inside the case.
These cameras are fully functional
vision systems that transmit
the detected objects, etc., as output
over various protocols.

(+) Compact.
(+) Scalability: it is easy to build a network of such cameras.

(-) Often require adaptation of existing projects.
(-) Low-cost models are rather slow, so they only cope well with relatively simple image-analysis tasks.
Separate type: Infrared Camera

Constructed from ordinary cameras
by adding an infrared filter
and, often, an infrared illuminator.

(+) IR rays are almost invisible to humans (in the dark they can be seen as a faint red glow), so they are often used to simplify the analysis of objects in the field of view.

(-) Specialized infrared cameras suitable for machine vision are not a mass product, so they usually have to be ordered.
Examples of good cameras
Sony PS3 Eye

320 x 240: 150 FPS


640 x 480: 60 FPS

Data Types:
visible light
IR (requires removing the IR filter)

Price: $ 50.

USB, CCD
Examples of good cameras
Point Grey Flea3
648 x 488: 120 FPS

Data Type:
- Visible light,
- IR (?)

Price: $ 600.

Model FL3-FW-03S1C-C
IEEE 1394b, CCD
Examples of good cameras
Microsoft Kinect
640 x 480: 30 FPS

Data Type:
visible light + depth

Price: $ 150.

(Depth: stereo vision using an infrared laser illuminator,
which is why it does not work in sunlight)
USB, CMOS
Examples of good cameras
Point Grey BumbleBee2
640 x 480: 48 FPS

Data Type:
visible light + depth

Price: $ 2000.

(Depth - stereo vision with two cameras)


IEEE 1394b, CCD
3. Introduction to OpenCV

- What is OpenCV
- The first project on OpenCV
- Mat class
- Image processing functions

Back to Contents
What is OpenCV

"Open Computer Vision Library"

An open library with a set of functions for image processing, analysis and recognition, written in C/C++.
What is OpenCV

2000 - first alpha version, supported by Intel, C interface

2006 - version 1.0

2008 - support from Willow Garage (a robotics lab)

2009 - version 2.0, C++ classes

2010 - version 2.2, GPU support implemented


The first project on OpenCV
1. Creating a Project
We assume that Microsoft Visual C++ 2008 Express Edition
and OpenCV 2.1 are already installed.

1. Run VS2008

2. Create a console project


File - New - Project - Win32 Console Application,
in the Name enter Project1, click OK.

3. Set up the paths

Alt+F7 opens the project properties.
Configuration Properties - C/C++ - General - Additional Include Directories:
set the value "C:\Program Files\OpenCV2.1\include\opencv";

Linker - General - Additional Library Directories: set the value
C:\Program Files\OpenCV2.1\lib\

Linker - Input - Additional Dependencies:
cv210.lib cvaux210.lib cxcore210.lib cxts210.lib highgui210.lib for Release,
cv210d.lib cvaux210d.lib cxcore210d.lib cxts210.lib highgui210d.lib for Debug
The first project on OpenCV
2. Reading an image and displaying it on screen
1. Prepare the input data:
download http://www.fitseniors.org/wp-content/uploads/2008/04/green_apple.jpg
and save it as C:\green_apple.jpg

2. Write in Project1.cpp:

#include "stdafx.h"
#include "cv.h"
#include "highgui.h"
using namespace cv;

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg"); // Load image from disk
    imshow("image", image);                    // Show image
    waitKey(0);                                // Wait for a keystroke
    return 0;
}

3. Press F7 to compile and F5 to run.

The program will show the image in a window and will exit when any key is pressed.
The first project on OpenCV
3. Linear operations on images

Replace the text of main from the previous example with:

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // image1 equals 0.3 * image, pixel by pixel
    Mat image1 = 0.3 * image;
    imshow("image", image);
    imshow("image1", image1);
    waitKey(0);
    return 0;
}
The first project on OpenCV
4. Working with rectangular subimages

Replace the text of main from the previous example with:

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // Cut out a part of the picture
    Rect rect = Rect(100, 100, 200, 200);  // Rectangle
    Mat image3;
    image(rect).copyTo(image3);            // Copy that part of the image
    imshow("image3", image3);

    // Change the part of the picture inside the original image
    image(rect) *= 2;
    imshow("image changed", image);

    waitKey(0);
    return 0;
}
Mat class
Mat is the base class for storing images in OpenCV.
Mat class
Single- and multi-channel images

An image is a matrix of pixels.
Each pixel can store some data.
If a pixel stores vector data, the dimension of that vector is the number of image channels.

A 1-channel image is also called grayscale.

3-channel images typically consist of three components (Red, Green, Blue).

OpenCV can also work with 2- and 4-channel images.


Mat class
Creating images
1) An empty image without a specific type:

Mat imageEmpty;

2) An image of w x h pixels with values 0..255
(8U means "unsigned 8 bit", C1 means "one channel"):

int w = 150; int h = 100;
Mat imageGray(cv::Size(w, h), CV_8UC1);
Mat class
Creating images

3) A 1-channel image with floating-point values
(32F means "float 32 bit"):

Mat imageFloat(cv::Size(w, h), CV_32FC1);

4) A 3-channel image with values 0..255 in each channel:

Mat imageRGB(cv::Size(w, h), CV_8UC3);


Mat class
Memory management
1. Memory for an image is allocated and freed automatically.

That is, OpenCV itself creates an image of the required size and type if that image is an output parameter of a function:

Mat imageFloat;
imageGray.convertTo(imageFloat, CV_32FC1, 1.0 / 255.0);

Here OpenCV allocates imageFloat itself.

Importantly, if your image already has the right size and type, no memory allocation is performed.

2. The assignment operator does not copy the data (as std::vector does), nor does it simply copy pointers; it uses a reference-counting mechanism.
Mat class
Memory management
The reference-counting mechanism (like shared_ptr in C++; in Java all object references work this way) works like this:
{
    Mat A(cv::Size(100, 100), CV_8UC1);
    // Memory for the image is allocated, and it is remembered
    // that this memory is used by a single image.
    {
        Mat B = A;
        // No memory is allocated here; the data in B simply
        // points to the same area of memory.
        // Therefore, if we change B, A changes as well.
        // The reference count of the image increased and became equal to 2.
    }
    // Here B went out of scope, the reference count decreased
    // and became equal to 1.
}
// Here A went out of scope, the reference count became equal to 0,
// and the memory allocated for it is automatically freed.
Mat class
Memory management

Since the operation
Mat B = A;
does not copy image A into B, to create a copy of an image for subsequent independent use you must use the explicit commands copyTo and clone:

image1.copyTo(image2);
image3 = image1.clone();
Mat class
Memory management

Summary:
1) The assignment Mat B = A; is very fast; it does not copy the data but adjusts pointers to it in a special way. This allows you to pass Mat into functions directly, without pointers or references, and it will not cause unwanted copying of Mat onto the stack (as it would with std::vector).

Although, of course, passing const Mat& is still faster.

2) To copy images, use the explicit commands copyTo and clone.


Mat class
Per-pixel access to images
OpenCV has several ways of per-pixel access to images. They vary in the degree of safety (type checking and bounds checking), speed and convenience.

Wherever possible, try to avoid direct access to pixels and use OpenCV functions instead, since they usually work faster and the code is more understandable.
Mat class
Per-pixel access to images
One way to access the pixels of an image whose type is known is the at method. For single-channel images with values 0...255 it is:

// Get a value
int value = imageGray.at<uchar>(y, x);

// Set a value
imageGray.at<uchar>(y, x) = 100;

Note that x and y in the call are swapped (row index first, then column).
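A small illustrative sketch (not from the original slides; the image path is an assumption): inverting a grayscale image pixel by pixel with at<uchar>. In practice, the single OpenCV expression in the comment does the same thing faster.

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat gray = imread("C:\\green_apple.jpg", 0);  // 0 = load as grayscale
    for (int y = 0; y < gray.rows; y++)
        for (int x = 0; x < gray.cols; x++)
            gray.at<uchar>(y, x) = 255 - gray.at<uchar>(y, x);  // invert the pixel
    // The same result, usually faster: gray = 255 - gray;
    imshow("inverted", gray);
    waitKey(0);
    return 0;
}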


Mat class
Conversion types
Note
When displaying floating-point images on screen, OpenCV assumes that their values lie in [0, 1]. Therefore, when converting an 8-bit image to a float image, apply a scaling transformation: multiplication by 1.0 / 255.0.

To convert images between different bit depths (float and unsigned char), the class member convertTo is used.
Its second argument is the type of the output image.

imageGray.convertTo(imageFloat, CV_32FC1, 1.0 / 255.0);

The number of channels of the input and output must match!


Mat class
Conversion types
To convert between different color spaces, the function cvtColor is used. If necessary, it can change the number of channels.

For example, conversion of a 3-channel RGB image to grayscale:

cvtColor(inputRGB, outputGray, CV_BGR2GRAY);

And back:
cvtColor(inputGray, outputRGB, CV_GRAY2BGR);
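A short end-to-end sketch (not from the original slides; the image path is an assumption) combining cvtColor and convertTo: load a color image, convert it to grayscale, then to float scaled into [0, 1] so that imshow displays it correctly.

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");         // 3-channel BGR image
    Mat gray, grayFloat;
    cvtColor(image, gray, CV_BGR2GRAY);                // 3 channels -> 1 channel
    gray.convertTo(grayFloat, CV_32FC1, 1.0 / 255.0);  // 8 bit -> float in [0, 1]
    imshow("gray", gray);
    imshow("grayFloat", grayFloat);                    // displayed assuming values in [0, 1]
    waitKey(0);
    return 0;
}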
Mat class
Splitting into channels
The function split divides a multi-channel image into separate channels.
The function merge stitches single-channel images together into one multi-channel image.

void split(const Mat& mtx,        // source color image
           vector<Mat>& mv);      // resulting set of 1-channel images

void merge(const vector<Mat>& mv, // source set of 1-channel images
           Mat& dst);             // resulting color image

Most often they are used to process each color channel separately, as well as for various manipulations of the channels.
Mat class
Splitting into channels
int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // Split the original image into three channels:
    // channels[0], channels[1], channels[2]
    vector<Mat> channels;
    split(image, channels);

    // Show the channels in separate windows.
    // Note that the red channel is 2, not 0.
    imshow("Red", channels[2]);
    imshow("Green", channels[1]);
    imshow("Blue", channels[0]);
    waitKey(0);
    return 0;
}
Image processing functions
Smoothing

Original image | The image smoothed with an 11 x 11 box filter

The function GaussianBlur smooths an image with a Gaussian filter.

Most often, smoothing is applied to remove small noise from the image before further analysis. It is done with a filter of small size.

http://www.innocentenglish.com/funny-pics/best-pics/stairs-sidewalk-art.jpg
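A minimal usage sketch of GaussianBlur (not from the original slides; the file path and kernel size are assumptions):

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");
    Mat smoothed;
    // Gaussian kernel of size 11 x 11; sigma = 0 means it is derived from the kernel size
    GaussianBlur(image, smoothed, Size(11, 11), 0);
    imshow("original", image);
    imshow("smoothed", smoothed);
    waitKey(0);
    return 0;
}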
Image processing functions
Threshold

The function threshold performs threshold processing of an image.

Most often it is used to select the pixels of objects of interest in the image.

http://www.svi.nl/wikiimg/SeedAndThreshold_02.png
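A minimal usage sketch of threshold (not from the original slides; the file path and the threshold value 128 are assumptions):

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat gray = imread("C:\\green_apple.jpg", 0);  // load as grayscale
    Mat binary;
    // Pixels brighter than 128 become 255, the rest become 0
    threshold(gray, binary, 128, 255, THRESH_BINARY);
    imshow("binary", binary);
    waitKey(0);
    return 0;
}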
Image processing functions
Filling regions

The function floodFill fills a region starting from a pixel (x, y), with specified stopping boundaries, using 4- or 8-connectivity of pixels.

Important: it modifies the original image, since that is what it fills.

Most often it is used to extract the regions found by threshold processing, for subsequent analysis.

http://upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Square_4_connectivity.svg/300px-Square_4_connectivity.svg.png
http://tunginobi.spheredev.org/images/flood_fill_ss_01.png
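A minimal usage sketch of floodFill (not from the original slides; the seed point and tolerances are assumptions). Note that it modifies the image it is given:

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");
    // Fill the region around pixel (100, 100) with green; a neighbouring pixel
    // joins the region if its color differs by at most 20 per channel.
    floodFill(image, Point(100, 100), Scalar(0, 255, 0),
              0, Scalar(20, 20, 20), Scalar(20, 20, 20));
    imshow("filled", image);
    waitKey(0);
    return 0;
}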
Image processing functions
Extracting contours

The contour of an object is a line representing the edge of the object's shape.
Contour points are emphasized by the Sobel operator; contour curves by the Canny detector.

Applications
1. Recognition. The type of the observed object can often be determined from its contour.

2. Measurement. Using the contour, one can accurately estimate the size, rotation and location of an object.

http://howto.nicubunu.ro/gears/gears_16.png
http://cvpr.uni-muenster.de/research/rack/index.html
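A minimal sketch (not from the original slides; the file path and Canny thresholds are assumptions) that finds edges with Canny, then extracts and draws the contours:

#include "cv.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");
    Mat gray, edges;
    cvtColor(image, gray, CV_BGR2GRAY);
    Canny(gray, edges, 50, 150);                 // edge detection

    // findContours modifies its input, so it works on the edge map
    vector<vector<Point> > contours;
    findContours(edges, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    // Draw all found contours in green on top of the original image
    drawContours(image, contours, -1, Scalar(0, 255, 0), 2);
    imshow("contours", image);
    waitKey(0);
    return 0;
}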
4. OpenCV integration into
multimedia projects

- Low-level Libraries
- Middle-level Platforms
- High-level Environments

Back to Contents
Low-level Libraries

- OpenCV - processing, analysis and recognition of images
- OpenGL (Open Graphics Library) - high-speed graphics
- OpenCL (Open Computing Language) - parallelization and speed-up of calculations, in particular by means of the GPU
- OpenAL (Open Audio Library) - sound
- Box2D - 2D physics engine
- Bullet - 3D physics engine
- Web server
- Video 1, Video 2, and so on ...
Middle-level Platforms
These are platforms for "creative coding"; they include a large set of functions and libraries integrated for convenient programming.

- Processing - language: Java. For computer vision Java is slow.
- openFrameworks - language: C/C++.
- Cinder - language: C/C++. Appeared recently, gaining popularity.

Video 1, Video 2, Video 3
High-level Environments
"Visual programming" environments, which allow projects to be implemented without actual programming. Importantly, they can be extended with plugins made using the low-level libraries.

- Max/MSP/Jitter - focused on audio.
- VVVV - focused on visual effects.
- Unity3D - focused on high-quality 3D.
5. Possibilities and limitations
of simple computer vision tasks
- The principal possibilities
- Source of problems
- "The Problem of Boundaries"
- "The Problem of Texture"
- Segmentation
- Optical flow
- Applications of Optical Flow
- Methods for calculating the optical flow
- Optical flow problems

Back to Contents
The principal possibilities

In principle, computer vision can be used to measure any parameters of physical processes, as long as they are expressed in mechanical motion, change of shape or change of color.

http://people.rit.edu/andpph/photofile-misc/strobe-motion-ta-08.jpg
Source of problems
The main source of algorithmic problems in image analysis is the fact that there is no simple relationship

pixel values <-> objects in the scene

For simple cases, where such a connection exists, computer vision algorithms work very well :)

http://zone.ni.com/cms/images/devzone/epd/GeometricMatchingScreenshot.JPG
"The problem of boundaries"
Closeness of pixel colors does not mean that the pixels belong to the same object.

Similarly, a strong difference between the colors of adjacent pixels does not mean that the pixels belong to different objects.
"The problem of boundaries"

How to separate the shadows from the trees?

http://fineartamerica.com/images-medium/shadows-in-the-forest-gary-bydlo.jpg
"The problem of boundaries"
To overcome this problem, one needs to build algorithms that obtain and use contextual information about the arrangement of objects in the scene.
"The problem of boundaries"

How to find the fish?


http://flogiston.ru/img/invisible_flounder_fish.jpg
"The problem of texture"
Objects carry textures that do not allow us to treat them as uniformly colored, yet these textures are difficult to model and describe.
"The problem of texture"

How to differentiate the zebras?


http://dangerouswildlife.com/images/zebra-herd.jpg
"The problem of boundaries" and "the problem of texture" manifest themselves most clearly in the segmentation problem.
Segmentation

Segmentation: partitioning the image into regions that are "homogeneous" in some sense.

The purpose of segmentation is to build a "simple" description of the original image that can then be used for further analysis of the image.
Segmentation
There are many segmentation methods, which use a variety of ideas:

- "region growing" based on brightness,

- "snake" methods: iterative movement of curves,

- construction of boundaries (for example, the "watershed" method),

- searching for a partition into regions that minimizes an "entropy",

- the use of a priori information about the shape of the regions,

- the use of multiple scales to refine the boundaries.


Segmentation
One of the best algorithms is the GrabCut method.
(Although it works quite slowly.)

The method is based on an approximate construction of the minimum cut of a special graph built on the image pixels.
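A minimal sketch of calling OpenCV's grabCut (not from the original slides; the file path and the rectangle are assumptions). The rectangle plays the role of the initial markup: everything outside it is treated as definite background.

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");

    // Rough rectangle around the object of interest (assumed coordinates)
    Rect rect(50, 50, 300, 300);

    Mat mask, bgdModel, fgdModel;
    grabCut(image, mask, rect, bgdModel, fgdModel, 5, GC_INIT_WITH_RECT);

    // Keep only the pixels labelled as (probable) foreground
    Mat foreground(image.size(), CV_8UC3, Scalar(0, 0, 0));
    image.copyTo(foreground, (mask == GC_FGD) | (mask == GC_PR_FGD));

    imshow("foreground", foreground);
    waitKey(0);
    return 0;
}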
Segmentation
Idea:
construct a weighted graph G = {V, E}:
the image pixels correspond to the vertices V,
and the geometric, brightness and texture proximity of two pixels i, j corresponds to the weight S_ij of the edge between them
(S stands for "similarity": the proximity of the pixels).

http://www.cis.upenn.edu/~jshi/GraphTutorial/


Segmentation
Then the problem of segmentation into two regions can be formulated as the problem of partitioning the set of vertices into two parts A and B. Such a partition is called a cut.

cut(A, B) = sum over i in A, j in B of S_ij  is the value of the cut.

The minimum cut (i.e., the cut with the minimum value) is declared the solution of the segmentation problem.
Segmentation
By complementing the minimum-cut method with manual initial marking of the regions, one can get very good results (although not fully automatically):

James Malcolm, Yogesh Rathi, Allen Tannenbaum


A Graph Cut Approach to Image Segmentation in Tensor Space
http://jgmalcolm.com/pubs/malcolm_tc.pdf
Segmentation

Adding the idea of multiresolution analysis
(a fully automatic result):

Eitan Sharon, Segmentation by Weighted Aggregation, CVPR'04
http://www.cis.upenn.edu/~jshi/GraphTutorial/Tutorial-ImageSegmentationGraph-cut4-Sharon.pdf
Segmentation

Eitan Sharon, Segmentation by Weighted Aggregation, CVPR'04


http://www.cis.upenn.edu/~jshi/GraphTutorial/Tutorial-ImageSegmentationGraph-cut4-Sharon.pdf
Segmentation

Eitan Sharon, Segmentation by Weighted Aggregation, CVPR'04


http://www.cis.upenn.edu/~jshi/GraphTutorial/Tutorial-ImageSegmentationGraph-cut4-Sharon.pdf
Optical Flow
Optical flow is the vector field of the apparent motion of objects, surfaces and edges in a visual scene, caused by the relative motion between the observer and the scene.
Optical Flow
Usually one considers the optical flow that arises between two frames of a video.

For each pixel (x, y), the optical flow is a vector (f(x, y), g(x, y)) characterizing the shift:
Optical Flow

http://www.ultimategraphics.co.jp/jp/images/stories/ultimate/BCC/optical.jpg
Application of optical flow

1. To determine the direction in which objects are moving in the frame. Video

2. For segmentation of moving regions in the frame for further analysis.

3. To reconstruct the shape of a three-dimensional object near which the camera moves.

4. As an auxiliary method for increasing the stability of object-detection algorithms when the objects are not detected in every frame.
For example, for the problems of detecting faces, markers, etc.
Application of optical flow
Increasing the stability of face detection.

Video. Green circle: the result of detector 1;
purple rectangles: the combined results.
Methods of calculating the optical flow

(I) Block ("naive") methods
For each point, the shift that minimizes the difference over a local window is searched for.

(II) Differential methods (the most used)
Based on estimating the derivatives with respect to x, y, t:
1. Lucas-Kanade: very fast.
2. Farneback: good enough, but slow.
3. CLG: high quality, but not yet implemented in OpenCV.
4. Pyramidal Lucas-Kanade: computed only at "points of interest" (see the sketch after this list).
5. Horn-Schunck: not very resistant to noise.

(III) Methods based on discrete optimization (computationally intensive)
The solution is constructed using min-cut / max-flow, linear programming and belief propagation.
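The sketch below (not from the original slides; frame file names are assumptions) shows sparse pyramidal Lucas-Kanade in OpenCV: points of interest are found with goodFeaturesToTrack and tracked into the next frame with calcOpticalFlowPyrLK.

#include "cv.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    Mat prev = imread("C:\\frame1.png", 0);  // two consecutive grayscale frames
    Mat next = imread("C:\\frame2.png", 0);

    // Points of interest in the first frame
    vector<Point2f> prevPts, nextPts;
    goodFeaturesToTrack(prev, prevPts, 200, 0.01, 10);

    // Pyramidal Lucas-Kanade: where did those points move to in the second frame?
    vector<uchar> status;
    vector<float> err;
    calcOpticalFlowPyrLK(prev, next, prevPts, nextPts, status, err);

    // Draw the displacement vectors
    Mat vis;
    cvtColor(prev, vis, CV_GRAY2BGR);
    for (size_t i = 0; i < prevPts.size(); i++)
        if (status[i])
            line(vis, prevPts[i], nextPts[i], Scalar(0, 255, 0));
    imshow("sparse flow", vis);
    waitKey(0);
    return 0;
}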
Methods of calculating the optical flow
Today OpenCV implements several algorithms; the best of them is Farneback's.
(Gunnar Farnebäck, "Two-Frame Motion Estimation Based on Polynomial Expansion", Proceedings of the 13th Scandinavian Conference on Image Analysis, pages 363-370, June-July 2003)
Methods of calculating the optical flow
The idea is to approximate the brightness of the pixels in the neighborhood of each pixel by a quadratic function, in both frames.
From the coefficients of these polynomials the shift can be computed, and it is declared the value of the optical flow at the given pixel.
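A minimal sketch of dense Farneback flow in OpenCV (not from the original slides; frame file names and parameter values are assumptions):

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat prev = imread("C:\\frame1.png", 0);  // two consecutive grayscale frames
    Mat next = imread("C:\\frame2.png", 0);

    // Dense optical flow: a 2-channel float image holding (dx, dy) for every pixel
    Mat flow;
    calcOpticalFlowFarneback(prev, next, flow,
                             0.5,     // pyramid scale
                             3,       // pyramid levels
                             15,      // averaging window size
                             3,       // iterations per level
                             5, 1.2,  // polynomial neighborhood size and sigma
                             0);      // flags

    // Visualize the flow as sparse line segments on the first frame
    Mat vis;
    cvtColor(prev, vis, CV_GRAY2BGR);
    for (int y = 0; y < flow.rows; y += 10)
        for (int x = 0; x < flow.cols; x += 10)
        {
            Point2f f = flow.at<Point2f>(y, x);
            line(vis, Point(x, y), Point(cvRound(x + f.x), cvRound(y + f.y)),
                 Scalar(0, 255, 0));
        }
    imshow("dense flow", vis);
    waitKey(0);
    return 0;
}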
Optical flow problems

The optical flow and the actual motion field may not coincide, and can even be perpendicular to each other:
Optical flow problems
The aperture problem: an ambiguity in determining motion caused by considering the motion only locally (without analyzing the edges of the object).

It is particularly pronounced
- in weakly textured scenes,
- in scenes with repetitive textures such as stripes and chessboards.

http://pages.slc.edu/~ebj/sight_mind/motion/Nakayama/aperture_problem.GIF


Homework 1 of 2
To get credit for these lectures "automatically":

Create an image with objects made of black stripes and checkered patterns on a white background, then a second image where those objects are shifted by a distance greater than the stripe width. Then compute and display the resulting optical flow.

Send the results to perevalovds@gmail.com


6. Possibilities and limitations
of complex computer vision tasks
- Face detection – Viola-Jones algorithm
- Pedestrian detection – HOG algorithm

Back to Contents
Face detection – Viola-Jones algorithm

The Viola-Jones algorithm is currently the baseline method for detecting frontal faces in a frame.
Face detection – Viola-Jones algorithm
The algorithm uses a set of “Haar-like features”.

http://www710.univ-lyon1.fr/~bouakaz/OpenCV-0.9.5/docs/ref/pics/haarfeatures.png
http://alliance.seas.upenn.edu/~cis520/wiki/uploads/Lectures/viola_jones_first2_small.png
Face detection – Viola-Jones algorithm

At the training stage, boosting is used to construct a set of classifiers from the redundant feature set.
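A minimal sketch of face detection with OpenCV's pretrained cascade (not from the original slides; the image path is an assumption, and haarcascade_frontalface_alt.xml is one of the cascades shipped in OpenCV's data folder):

#include "cv.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    CascadeClassifier cascade;
    cascade.load("haarcascade_frontalface_alt.xml");  // pretrained frontal-face cascade

    Mat image = imread("C:\\photo.jpg");
    Mat gray;
    cvtColor(image, gray, CV_BGR2GRAY);
    equalizeHist(gray, gray);                         // improves detection stability

    vector<Rect> faces;
    cascade.detectMultiScale(gray, faces, 1.1, 3, 0, Size(30, 30));

    for (size_t i = 0; i < faces.size(); i++)
        rectangle(image, faces[i].tl(), faces[i].br(), Scalar(0, 255, 0), 2);

    imshow("faces", image);
    waitKey(0);
    return 0;
}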
Face detection – Viola-Jones algorithm

- Works well for frontal faces.

Problems
- For faces in profile it does not work, because of hairstyles.
- For the whole person, or the upper body, I was not able to achieve detection.

Apparently this is because the algorithm can be trained well to recognize internal contours, which are virtually unchanging, but it does not cope very well with changing external contours.
Face detection – Viola-Jones algorithm

Application: searching for objects in fenced grass.


Face detection – Viola-Jones algorithm
Application: searching for objects in fenced grass.

- miss rate: 0.158 (158 among 1000 positive examples);
- false alarm rate: 0.049 (40 among 1000 positive examples and 58 among 1000 negative examples);
- thus, the correct detection rate was 84.2% and the false alarm rate 4.9%.
Pedestrian detection - HOG algorithm

HOG = Histogram of Oriented Gradients.


Pedestrian detection - HOG algorithm
How it works: the image is divided into cells, in which gradient directions are computed and accumulated into histograms. The resulting descriptor vector is used for pattern recognition (with an SVM).

http://ericbenhaim.free.fr/images/hog_process.png
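A minimal sketch of pedestrian detection with OpenCV's built-in HOG people detector (not from the original slides; the image path is an assumption, and in newer OpenCV versions HOGDescriptor lives in the objdetect module):

#include "cv.h"
#include "cvaux.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    Mat image = imread("C:\\street.jpg");

    // HOG descriptor with the default pretrained people detector (a linear SVM)
    HOGDescriptor hog;
    hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());

    vector<Rect> people;
    hog.detectMultiScale(image, people);

    for (size_t i = 0; i < people.size(); i++)
        rectangle(image, people[i].tl(), people[i].br(), Scalar(0, 255, 0), 2);

    imshow("pedestrians", image);
    waitKey(0);
    return 0;
}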
Pedestrian detection - HOG algorithm

The algorithm can reliably detect cars, motorcycles and bicycles.

Demo:

Video http://www.youtube.com/watch?v=BbL2wWy8KUM

Problems:

Judging by the demo video, the algorithm does not work well with people in skirts, or it was trained to recognize people from a different viewpoint.
7. New applications of
computer vision
- Interactive multimedia systems
- 3D-illusion
- Projection mapping
- Dynamic projection mapping

Back to Contents
Interactive multimedia system

Funky Forest
(2007,
T. Watson)

Video http://zanyparade.com/v8/projects.php?id=12
Interactive multimedia system

Body paint

Video http://www.youtube.com/watch?v=3T5uhe3KU6s
Interactive multimedia system
Projection onto the hands
of spectators

Yoko Ishii and Hiroshi Homura, It's fire, you can touch it, 2007.
Interactive multimedia system
Floor Games

Video: outdoor ping-pong championship, Chrustalnaya 2011.
The championship was held during the conference "Modern Problems of Mathematics" (42nd National Youth School-Conference), "Chrustalnaya" hotel, 2 February 2011.
www.playbat.net

Video
3d-illusion
This illusion is the perception of a volumetric body on a (flat) surface, achieved by accurately simulating the geometry and shading of the body as seen from the point where the viewer stands.

http://justinmaier.com/wp-content/uploads/2006/05/ATT5082002.jpg
3d-illusion
Creating the illusion of 3D by tracking the head and eyes. Now built into portable gaming consoles that have a camera.

Video http://www.youtube.com/watch?v=o5tlIIOXMs4
Projection mapping
Projection mapping is video projection not onto special screens but onto other objects, in order to "animate" them.

Video http://www.youtube.com/watch?v=BLNqZ1Nbo7Q
Projection mapping
Today, mapping onto buildings is popular
(architectural projection mapping).

Video http://www.youtube.com/watch?v=BGXcfvWhdDQ
Dynamic projection mapping

The idea: use markerless AR tracking techniques to synchronize a moving object with the image from the projector.
Dynamic projection mapping

More radical: track the movement of multiple objects
(bouncing balls, falling sheets of paper)
and project onto them.
Homework 2 of 2
To get credit for these lectures "automatically":
Implement tracking (position detection) of a falling ball that then bounces and jumps.

Shoot a video showing the ball, with the ball position found by the computer drawn on top. Put the video on YouTube and send the link to perevalovds@gmail.com.
Outside view and graphics: Physical
computing

Water musical instrument


sound waves start

Aleatoric water musical instrument


http://www.youtube.com/watch?v=CZ_KijiwQHE
Video
Conclusion

- Literature
- Friendly lectures and seminars
- Partners
- Contacts

Back to Contents
Literature
Computer vision
1. R. Gonzalez, R. Woods. Digital Image Processing.
2. L. Shapiro, G. Stockman. Computer Vision.

OpenCV

1. Documentation OpenCV C++:


http://opencv.willowgarage.com/documentation/cpp/index.html

2. G. Bradski, A. Kaehler. Learning OpenCV: Computer Vision with the OpenCV Library
(unfortunately, it covers the C API of OpenCV, not the C++ one).

3. My lectures on OpenCV for C++ (Fall 2010)


http://uralvision.blogspot.com/2010/12/opencv-2010.html
Friendly lectures and seminars

A course on openFrameworks with elements of OpenCV,
MatMech, Ural State University, spring semester 2011.

The course program and the lessons will be available at
www.uralvision.blogspot.com
Friendly lectures and seminars
Ural Federal University - College of Arts and culture
Ekaterinburg branch of the National Centre for Contemporary Art
UB RAS

The program "Art, Science, Technology"

art.usu.ru/index.php/confer/229
www.art.usu.ru,www.ncca.ru
Friendly lectures and seminars
The program "Art, Science, Technology"
March 2, 18:30
The Body as Interface: Wearable Computing

Everyone is invited, with a 10-15 minute presentation or without one.

Please send your wish to speak and the topic of your talk to Ksenia Fedorova,
ksenfedorova@gmail.com
The seminar will be held at the USU Library (Lenin St. 51, 4th floor, room 413a)
Details - http://www.scribd.com/doc/49078187/Body-As-Interface
Friendly lectures and seminars
The program "Art, Science, Technology"
April 21-22
2nd International Workshop
"Theory and Practice of Media Art"

Chair of the organizing committee: Ksenia Fedorova, curator of the Ekaterinburg branch of the NCCA,
ksenfedorova@gmail.com
Information about the workshop will be published on ncca.ru, art.usu.ru, usu.ru and
www.uralvision.blogspot.com

http://www.ago.net/assets/images/assets/past_exhibitions/2007/steinkamp.jpg
Partners
www.playbat.net (interactive systems)

LLC "Business Frame" (CV consulting) www.bframe.ru

LLC "5th Dimension" (interactive systems)

gorodaonline.com (a network of information and business portals)

Animation studio "Mult-On" www.mult-on.ru

Animation studio "Animatech" www.anima-tech.ru

Contacts

Denis Perevalov
perevalovds@gmail.com

This lecture is published at


www.uralvision-en.blogspot.com
www.uralvision.blogspot.com (Russian)
