
CS Club Ekaterinburg on 17-18 February 2011

What Computer Vision with OpenCV can and cannot do

http://pixdaus.com/pics/1244922167d3Z4fjf.jpg

Denis S. Perevalov

www.uralvision-en.blogspot.com perevalovds@gmail.com
Ural Federal University / Institute of Mathematics and Mechanics UB RAS
Contents
Introduction
1. What is computer vision
2. Cameras for computer vision
3. Introduction to OpenCV
4. OpenCV integration into multimedia projects
5. Possibilities and limitations of simple computer vision tasks
6. Possibilities and limitations of complex computer vision tasks
7. New applications of computer vision
Conclusion
Introduction

- The lecture intent


- The lecture is for…

Back to Contents
The lecture intent
This lecture is about:
- Computer vision,
- OpenCV library,
- The possibilities and limitations of computer vision that arise when solving applied image-analysis problems.

Therefore:
The lecture intent
We will be interested in:

1) Algorithms that solve image-analysis problems in (almost) real time, i.e. the processing time of one frame should not exceed 1-10 s.

2) Observations about the applicability of such algorithms.

We will not be interested in:

1) Accelerating algorithms on the GPU.

2) Neural networks and artificial intelligence.


The lecture is for…

- For those who are interested in computer vision and want to learn more about its current capabilities and new ways to apply it.
The lecture is for…

- For those who have no experience with OpenCV yet, but wish to get it as soon as possible.
The lecture is for…

- For those who work seriously on computer vision and want to learn more about the bottlenecks and problems that may occur when using the best (to date) computer vision algorithms.
1. What is
computer vision
- Definition
- First characteristic of computer vision tasks
- Second characteristic of computer vision tasks
- Examples of computer vision tasks
- An example of a task that is NOT computer vision

Back to Contents
Definition
(From Wikipedia)

Computer vision is the theory and technology of creating machines that can see.

http://the-gadgeteer.com/wp-content/uploads/2009/12/
mr-robot-head-game.jpg
Definition
As a scientific discipline, computer vision relates to the theory
and technology of creating artificial systems that receive
information from images. ...

As a technological discipline, computer vision seeks to apply


theories and models of computer vision to create computer
vision systems. ...

http://www.spectec.net.nz/pictures/cctv%20pic.jpg
Definition
Computer vision can also be described as a complement to (but not necessarily the opposite of) biological vision.

Biological vision studies the visual perception of humans and various animals, producing models of such systems in terms of physiological processes. Computer vision, on the other hand, studies and describes computer vision systems implemented in hardware or software. Interdisciplinary exchange between biological and computer vision has been very productive for both scientific fields.

http://sobiratelzvezd.ru/wallpapers/wikimedia_23.jpg
Definition
The topics of computer vision include:

- Action reproduction,

- Event detection,

- Tracking,

- Pattern recognition,

- Image restoration.
First characteristic of computer vision tasks

The input data are a two-dimensional array of data, i.e. an "image".

NOTE
The data can also be:
- video, that is, a sequence of images,
- 3D data: point clouds from 3D scanners or other devices.
Sample Image

Ordinary light, radio waves, ultrasound are all sources of images:

1. Color images of the visible spectrum


2. Infrared images
3. Ultrasound images
4. Radar images
5. Images with depth data
Sample Image

1. Color images of the visible spectrum

http://rkc.oblcit.ru/system/files/images/%D0%9F%D1%80%D0%B8%D1%80%D0%BE%D0%B4%D0%B013.preview.jpg
http://imaging.geocomm.com/gallery/san_francisco_IOD032102.jpg
Sample Image
2. Infrared images

http://lh6.ggpht.com/_Wy2U3qKMO8k/SSyB6BTdg8I/AAAAAAAACd8/Iai_3QZIjrI/Australia+5+dollars+B+se.jpg
http://i367.photobucket.com/albums/oo117/syquest/acrylic_no_filter.jpg
Sample Image
3. Ultrasound images
Image with side-scan sonar:

http://ess.ru/publications/2_2003/sedov/ris6.jpg
Sample Image
4. Radar images

Snapshot of the radar:

http://cdn.wn.com/pd/b1/3a/abd9ebc81d9a3be0ba7c4a3dfc28_grande.jpg
Sample Image

5. Images with depth data

http://opencv.willowgarage.com/documentation/c/_images/disparity.png
Video http://www.youtube.com/watch?v=pk_cQVjqFZ4
First characteristic of computer vision tasks

The input data are a two-dimensional array of data, i.e. an "image".

But two-dimensional arrays of data are used not only in computer vision:
Disciplines dealing with 2D-data
Second characteristic of computer vision tasks

The goal of processing is the extraction and use of color and geometric structures in the image.

http://www.tyvek.ru/construction/images/structure.jpg
Disciplines dealing with 2D-images

1. Signal and image processing
Low-level data processing, usually without a detailed study of image content.
Objectives: restoration, noise removal, data compression, quality improvement (sharpness, contrast, ...).

2. Computer vision
Mid-level data analysis: separating objects in the image and measuring their parameters.

3. Pattern recognition
High-level data analysis: determining the type of an object. The input data usually must be presented as a set of features. The features are often computed using 1 and 2.
Examples of computer vision tasks

Segmentation: partitioning the image into regions that are "homogeneous" in some sense.
Examples of computer vision tasks

Detection of objects of interest in the image, and calculation of their size and other characteristics.

http://armi.kaist.ac.kr/korean/UserFiles/File/MMPC.JPG
Examples of computer vision tasks

Tracking: following objects of interest over a sequence of frames.

http://www.merl.com/projects/images/particle.jpg
Examples of computer vision tasks
Virtual reality gloves: recognition of the colors and patterns on the hands; an MIT project, prototype.

Video http://www.csail.mit.edu/videoarchive/research/gv/hand-tracking
Examples of computer vision tasks
Detection of markers (for use in marker-based augmented reality).

http://www.edhv.nl/edhv/wp-content/uploads/2009/12/aug_Picture-10_no-border-450x337.jpg
http://jamiedubs.com/fuckflickr/data/web/ar-marker-BchThin_0036.png
An example of a task that is NOT computer vision
Finding a path through a maze.
(Although the input data is an image, the task is not to find objects in it but to solve a combinatorial path-finding problem.)

http://www.promrds.com/chapter9/Images/NewMaze.gif
2. Cameras
for computer vision
- Key features
- Examples of good cameras

Back to Contents
Key features

Different real-time processing tasks require different cameras.

Their main features are:

1. Resolution

2. The number of frames per second

3. Type of data obtained

4. Way to transfer data into the computer


Resolution
This is the image size in pixels, obtained from the camera.

320 x 240   - accuracy when observing an object 1 m in size: 3.13 mm; size of 30 frames: 6.6 MB
640 x 480   - accuracy when observing an object 1 m in size: 1.56 mm; size of 30 frames: 26.4 MB
1280 x 1024 - accuracy when observing an object 1 m in size: 0.97 mm; size of 30 frames: 112.5 MB

http://www.mtlru.com/images/klubnik1.jpg
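Where these numbers come from (a small sketch, not part of the original slides): the per-pixel accuracy is the object size divided by the number of pixels spanning it, and the raw data size assumes uncompressed 3-byte RGB pixels.

#include <cstdio>

// Raw size of 'frames' uncompressed RGB frames, in MB (3 bytes per pixel)
static double framesMB(int w, int h, int frames)
{
    return double(w) * h * 3 * frames / (1024.0 * 1024.0);
}

int main()
{
    std::printf("320 x 240  : %.1f MB\n", framesMB(320, 240, 30));   // ~6.6 MB
    std::printf("640 x 480  : %.1f MB\n", framesMB(640, 480, 30));   // ~26.4 MB
    std::printf("1280 x 1024: %.1f MB\n", framesMB(1280, 1024, 30)); // ~112.5 MB

    // Accuracy: a 1 m object spanning N pixels gives 1000 / N mm per pixel,
    // e.g. 1000 / 320 = 3.13 mm, 1000 / 640 = 1.56 mm, 1000 / 1024 = 0.97 mm.
    return 0;
}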
The number of frames per second
This is the number of images obtained from the camera per second.

30 fps  - time between frames: 33 ms
60 fps  - time between frames: 16 ms
150 fps - time between frames: ~6 ms; can even be used for a musical instrument:
http://www.youtube.com/watch?v=7iEvQIvbn8o
Type of data obtained
What data we get from the camera for processing.

- Color or grayscale image of the visible spectrum
- Infrared image (using invisible infrared illumination, such a camera can see in a dark room)
- Color image + depth (information about the distance to objects)
Way to transfer data into the computer

- Analog cameras
- Webcams (USB cameras)
- FireWire cameras (IEEE-1394)
- Network (IP) cameras
- "Smart" cameras
Analog

Historically the first to appear;
the signal is transmitted as an analog (TV-format) signal.

(+) Transmits data over long distances, albeit with interference (100 m)
(+) Easy to install, small size

(-) Feeding the signal into a computer requires a special capture card or TV tuner, which usually consumes a lot of computing resources.
(-) "Interlacing" makes it very difficult to analyze the image if there is movement
(in fact, 2 half-frames arrive, each 50 times/sec).
Webcams (USB-camera)

Appeared around 2000;
transmit data over the USB protocol,
uncompressed or compressed as JPEG.

(+) Easy to connect to a computer and to software
(+) Cheap, widely available

(-) Overhead: decoding JPEG requires computing resources.
(-) The cheapest models usually have poor optics and sensors (noisy images).
(-) Because of USB bandwidth limitations, no more than 2 cameras can be connected to a single USB hub, and a PC usually has only 2-3 USB hubs.
Firewire-camera (IEEE-1394)

Cameras that transmit the signal
over the FireWire protocol,
usually in a dust- and moisture-proof case;
usually these are cameras for industrial applications.

(+) Transfer of uncompressed video of excellent quality at high speed
(+) Several cameras can be connected
(+) Tend to have excellent optics

(-) High price
(-) Require power, which is sometimes difficult to provide when connecting to laptops
Network (IP-camera)

Cameras that transmit data over a
network (wired or wireless) channel.
They are now rapidly gaining
popularity in all areas.

(+) Easy connection to a PC
(+) Easy installation
(+) Data can be transferred over an unlimited distance, which allows building a network of cameras covering a building or an area, attached to an airship, etc.
(+) Control: the camera can be rotated and the zoom adjusted

(-) May have problems with response speed
(-) Still a relatively high price
(-) Not yet portable (2011)
"Smart" cameras (Smart cameras)

Cameras with a computer
located inside the case.
These cameras are fully functional
vision systems that transmit
the detected objects, etc., as output
over various protocols.

(+) Compact.
(+) Scalability: it is easy to build a network of such cameras.

(-) Often require adaptation of existing projects.
(-) Low-cost models are rather slow, so they only cope well with relatively simple image-analysis tasks.
Separate type: Infrared Camera

Constructed from ordinary cameras
by adding an infrared filter
and, often, an infrared illuminator.

(+) IR rays are almost invisible to humans (in the dark they can be seen as a faint red glow), so they are often used to simplify the analysis of objects in the field of view.

(-) Specialized infrared cameras suitable for machine vision are not a mass product, so they usually have to be ordered.
Examples of good cameras
Sony PS3 Eye

320 x 240: 150 FPS


640 x 480: 60 FPS

Data Types:
visible light
IR (requires removing the IR filter)

Price: $ 50.

USB, CCD
Examples of good cameras
Point Grey Flea3
648 x 488: 120 FPS

Data Type:
- Visible light,
- IR (?)

Price: $ 600.

Model FL3-FW-03S1C-C
IEEE 1394b, CCD
Examples of good cameras
Microsoft Kinect
640 x 480: 30 FPS

Data Type:
visible light + depth

Price: $ 150.

(Depth: stereo vision using an infrared laser illuminator,
which is why it does not work in sunlight)
USB, CMOS
Examples of good cameras
Point Grey BumbleBee2
640 x 480: 48 FPS

Data Type:
visible light + depth

Price: $ 2000.

(Depth - stereo vision with two cameras)


IEEE 1394b, CCD
3. Introduction to OpenCV

- What is OpenCV
- The first project on OpenCV
- Mat class
- Image processing functions

Back to Contents
What is OpenCV

"Open Computer Vision Library"

An open library with a set of functions for image processing, analysis and recognition, written in C/C++.
What is OpenCV

2000 - first alpha version, supported by Intel, C interface

2006 - version 1.0

2008 - support from Willow Garage (a robotics lab)

2009 - version 2.0, C++ classes

2010 - version 2.2, GPU support implemented


The first project on OpenCV
1. Creating a Project
We assume that Microsoft Visual C++ 2008 Express Edition
and OpenCV 2.1 are already installed.

1. Run VS2008

2. Create a console project


File - New - Project - Win32 Console Application,
in the Name enter Project1, click OK.

3. Set up the paths

Alt+F7 opens the project properties.
Configuration Properties - C/C++ - General - Additional Include Directories:
set the value "C:\Program Files\OpenCV2.1\include\opencv";

Linker - General - Additional Library Directories: set the value
C:\Program Files\OpenCV2.1\lib\

Linker - Input - Additional Dependencies:
cv210.lib cvaux210.lib cxcore210.lib cxts210.lib highgui210.lib for Release,
cv210d.lib cvaux210d.lib cxcore210d.lib cxts210.lib highgui210d.lib for Debug
The first project on OpenCV
2. Reading an image and displaying it on screen
1. Prepare the input data:
download http://www.fitseniors.org/wp-content/uploads/2008/04/green_apple.jpg
and save it as C:\green_apple.jpg

2. Write in Project1.cpp:

#include "stdafx.h"
#include "cv.h"
#include "highgui.h"
using namespace cv;

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg"); // Load image from disk
    imshow("image", image);                    // Show image
    waitKey(0);                                // Wait for a keystroke
    return 0;
}

3. Press F7 to compile and F5 to run.

The program will show the image in a window and will exit when any key is pressed.
The first project on OpenCV
3. Linear operations on images

Replace the text of main from the previous example with:

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // image1 equals 0.3 * image, pixel by pixel
    Mat image1 = 0.3 * image;
    imshow("image", image);
    imshow("image1", image1);
    waitKey(0);
    return 0;
}
The first project on OpenCV
4. Working with rectangular subimages

Replace the text of main from the previous example with:

int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // Cut out a part of the picture
    Rect rect = Rect(100, 100, 200, 200);  // Rectangle
    Mat image3;
    image(rect).copyTo(image3);            // Copy that part of the image
    imshow("image3", image3);

    // Change the part of the picture inside the original image
    image(rect) *= 2;
    imshow("image changed", image);

    waitKey(0);
    return 0;
}
Mat class
Mat is the base class for storing images in OpenCV.
Mat class
Single- and multi-channel images

An image is a matrix of pixels.
Each pixel can store some data.
If a pixel stores vector data, the dimension of that vector is the number of image channels.

A 1-channel image is also called grayscale.

3-channel images typically consist of three components (Red, Green, Blue).

OpenCV can also work with 2- and 4-channel images.


Mat class
Creating images
1) An empty image without a specific type:

Mat imageEmpty;

2) An image of w x h pixels with values 0..255
(8U means "unsigned 8 bit", C1 means "one channel"):

int w = 150; int h = 100;
Mat imageGray(cv::Size(w, h), CV_8UC1);
Mat class
Creating images

3) A 1-channel image with floating-point values
(32F means "float 32 bit"):

Mat imageFloat(cv::Size(w, h), CV_32FC1);

4) A 3-channel image with values 0..255 in each channel:

Mat imageRGB(cv::Size(w, h), CV_8UC3);


Mat class
Memory management
1. Memory for an image is allocated and freed automatically.

That is, OpenCV itself creates an image of the required size and type if that image is an output parameter of a function:

Mat imageFloat;
imageGray.convertTo(imageFloat, CV_32FC1, 1.0 / 255.0);

Here OpenCV allocates imageFloat itself.

Importantly, if your image already has the right size and type, no memory allocation is performed.

2. The assignment operator does not copy the data (as std::vector does), nor does it simply copy pointers; it uses a reference-counting mechanism.
Mat class
Memory management
The reference-counting mechanism (like shared_ptr in C++; in Java all object references work this way) works like this:
{
    Mat A(cv::Size(100, 100), CV_8UC1);
    // Memory for the image is allocated, and it is remembered
    // that this memory is used by a single image.
    {
        Mat B = A;
        // No memory is allocated here; the data in B simply
        // points to the same area of memory.
        // Therefore, if we change B, A changes as well.
        // The reference count of the image increased and became equal to 2.
    }
    // Here B went out of scope, the reference count decreased
    // and became equal to 1.
}
// Here A went out of scope, the reference count became equal to 0,
// and the memory allocated for it is automatically freed.
Mat class
Memory management

Since the operation
Mat B = A;
does not copy image A into B, to create a copy of an image for subsequent independent use you must use the explicit commands copyTo and clone:

image1.copyTo(image2);
image3 = image1.clone();
Mat class
Memory management

Summary:
1) The assignment Mat B = A; is very fast; it does not copy the data but adjusts pointers to it in a special way. This allows you to pass Mat into functions directly, without pointers or references, and it will not cause unwanted copying of Mat onto the stack (as it would with std::vector).

Although, of course, passing const Mat& is still faster.

2) To copy images, use the explicit commands copyTo and clone.


Mat class
Per-pixel access to images
OpenCV has several ways of per-pixel access to images. They vary in the degree of safety (type checking and bounds checking), speed and convenience.

Wherever possible, try to avoid direct access to pixels and use OpenCV functions instead, since they usually work faster and the code is more understandable.
Mat class
Per-pixel access to images
One way to access the pixels of an image whose type is known is the at method. For single-channel images with values 0...255 it is:

// Get a value
int value = imageGray.at<uchar>(y, x);

// Set a value
imageGray.at<uchar>(y, x) = 100;

Note that x and y in the call are swapped (row index first, then column).
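A small illustrative sketch (not from the original slides; the image path is an assumption): inverting a grayscale image pixel by pixel with at<uchar>. In practice, the single OpenCV expression in the comment does the same thing faster.

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat gray = imread("C:\\green_apple.jpg", 0);  // 0 = load as grayscale
    for (int y = 0; y < gray.rows; y++)
        for (int x = 0; x < gray.cols; x++)
            gray.at<uchar>(y, x) = 255 - gray.at<uchar>(y, x);  // invert the pixel
    // The same result, usually faster: gray = 255 - gray;
    imshow("inverted", gray);
    waitKey(0);
    return 0;
}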


Mat class
Conversion types
Note
When displaying floating-point images on screen, OpenCV assumes that their values lie in [0, 1]. Therefore, when converting an 8-bit image to a float image, apply a scaling transformation: multiplication by 1.0 / 255.0.

To convert images between different bit depths (float and unsigned char), the class member convertTo is used.
Its second argument is the type of the output image.

imageGray.convertTo(imageFloat, CV_32FC1, 1.0 / 255.0);

The number of channels of the input and output must match!


Mat class
Conversion types
To convert between different color spaces, the function cvtColor is used. If necessary, it can change the number of channels.

For example, conversion of a 3-channel RGB image to grayscale:

cvtColor(inputRGB, outputGray, CV_BGR2GRAY);

And back:
cvtColor(inputGray, outputRGB, CV_GRAY2BGR);
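A short end-to-end sketch (not from the original slides; the image path is an assumption) combining cvtColor and convertTo: load a color image, convert it to grayscale, then to float scaled into [0, 1] so that imshow displays it correctly.

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");         // 3-channel BGR image
    Mat gray, grayFloat;
    cvtColor(image, gray, CV_BGR2GRAY);                // 3 channels -> 1 channel
    gray.convertTo(grayFloat, CV_32FC1, 1.0 / 255.0);  // 8 bit -> float in [0, 1]
    imshow("gray", gray);
    imshow("grayFloat", grayFloat);                    // displayed assuming values in [0, 1]
    waitKey(0);
    return 0;
}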
Mat class
Splitting into channels
The function split divides a multi-channel image into separate channels.
The function merge stitches single-channel images together into one multi-channel image.

void split(const Mat& mtx,        // source color image
           vector<Mat>& mv);      // resulting set of 1-channel images

void merge(const vector<Mat>& mv, // source set of 1-channel images
           Mat& dst);             // resulting color image

Most often they are used to process each color channel separately, as well as for various manipulations of the channels.
Mat class
Splitting into channels
int main(int argc, const char** argv)
{
    Mat image = imread("C:\\green_apple.jpg");

    // Split the original image into three channels:
    // channels[0], channels[1], channels[2]
    vector<Mat> channels;
    split(image, channels);

    // Show the channels in separate windows.
    // Note that the red channel is 2, not 0.
    imshow("Red", channels[2]);
    imshow("Green", channels[1]);
    imshow("Blue", channels[0]);
    waitKey(0);
    return 0;
}
Image processing functions
Smoothing

Original image | The image smoothed with an 11 x 11 box filter

The function GaussianBlur smooths an image with a Gaussian filter.

Most often, smoothing is applied to remove small noise from the image before further analysis. It is done with a filter of small size.

http://www.innocentenglish.com/funny-pics/best-pics/stairs-sidewalk-art.jpg
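A minimal usage sketch of GaussianBlur (not from the original slides; the file path and kernel size are assumptions):

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");
    Mat smoothed;
    // Gaussian kernel of size 11 x 11; sigma = 0 means it is derived from the kernel size
    GaussianBlur(image, smoothed, Size(11, 11), 0);
    imshow("original", image);
    imshow("smoothed", smoothed);
    waitKey(0);
    return 0;
}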
Image processing functions
Threshold

The function threshold performs threshold processing of an image.

Most often it is used to select the pixels of objects of interest in the image.

http://www.svi.nl/wikiimg/SeedAndThreshold_02.png
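A minimal usage sketch of threshold (not from the original slides; the file path and the threshold value 128 are assumptions):

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat gray = imread("C:\\green_apple.jpg", 0);  // load as grayscale
    Mat binary;
    // Pixels brighter than 128 become 255, the rest become 0
    threshold(gray, binary, 128, 255, THRESH_BINARY);
    imshow("binary", binary);
    waitKey(0);
    return 0;
}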
Image processing functions
Filling regions

The function floodFill fills a region starting from a pixel (x, y), with specified stopping boundaries, using 4- or 8-connectivity of pixels.

Important: it modifies the original image, since that is what it fills.

Most often it is used to extract the regions found by threshold processing, for subsequent analysis.

http://upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Square_4_connectivity.svg/300px-Square_4_connectivity.svg.png
http://tunginobi.spheredev.org/images/flood_fill_ss_01.png
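A minimal usage sketch of floodFill (not from the original slides; the seed point and tolerances are assumptions). Note that it modifies the image it is given:

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");
    // Fill the region around pixel (100, 100) with green; a neighbouring pixel
    // joins the region if its color differs by at most 20 per channel.
    floodFill(image, Point(100, 100), Scalar(0, 255, 0),
              0, Scalar(20, 20, 20), Scalar(20, 20, 20));
    imshow("filled", image);
    waitKey(0);
    return 0;
}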
Image processing functions
Extracting contours

The contour of an object is a line representing the edge of the object's shape.
Contour points are emphasized by the Sobel operator; contour curves by the Canny detector.

Applications
1. Recognition. The type of the observed object can often be determined from its contour.

2. Measurement. Using the contour, one can accurately estimate the size, rotation and location of an object.

http://howto.nicubunu.ro/gears/gears_16.png
http://cvpr.uni-muenster.de/research/rack/index.html
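A minimal sketch (not from the original slides; the file path and Canny thresholds are assumptions) that finds edges with Canny, then extracts and draws the contours:

#include "cv.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");
    Mat gray, edges;
    cvtColor(image, gray, CV_BGR2GRAY);
    Canny(gray, edges, 50, 150);                 // edge detection

    // findContours modifies its input, so it works on the edge map
    vector<vector<Point> > contours;
    findContours(edges, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);

    // Draw all found contours in green on top of the original image
    drawContours(image, contours, -1, Scalar(0, 255, 0), 2);
    imshow("contours", image);
    waitKey(0);
    return 0;
}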
4. OpenCV integration into
multimedia projects

- Low-level Libraries
- Middle-level Platforms
- High-level Environments

Back to Contents
Low-level Libraries

- OpenCV - processing, analysis and recognition of images
- OpenGL (Open Graphics Library) - high-speed graphics
- OpenCL (Open Computing Language) - parallelization and speed-up of calculations, in particular by means of the GPU
- OpenAL (Open Audio Library) - sound
- Box2D - 2D physics engine
- Bullet - 3D physics engine
- Web server
- Video 1, Video 2, and so on ...
Middle-level Platforms
These are platforms for "creative coding"; they include a large set of functions and libraries integrated for convenient programming.

- Processing - language: Java. For computer vision Java is slow.
- openFrameworks - language: C/C++.
- Cinder - language: C/C++. Appeared recently, gaining popularity.

Video 1, Video 2, Video 3
High-level Environments
"Visual programming" environments, which allow projects to be implemented without actual programming. Importantly, they can be extended with plugins made using the low-level libraries.

- Max/MSP/Jitter - focused on audio.
- VVVV - focused on visual effects.
- Unity3D - focused on high-quality 3D.
5. Possibilities and limitations
of simple computer vision tasks
- The principal possibilities
- Source of problems
- "The Problem of Boundaries"
- "The Problem of Texture"
- Segmentation
- Optical flow
- Applications of Optical Flow
- Methods for calculating the optical flow
- Optical flow problems

Back to Contents
The principal possibilities

In principle, computer vision can be used to measure any parameters of physical processes, as long as they are expressed in mechanical motion, change of shape or change of color.

http://people.rit.edu/andpph/photofile-misc/strobe-motion-ta-08.jpg
Source of problems
The main source of algorithmic problems in image analysis is the fact that there is no simple relationship

pixel values <-> objects in the scene

For simple cases, where such a connection exists, computer vision algorithms work very well :)

http://zone.ni.com/cms/images/devzone/epd/GeometricMatchingScreenshot.JPG
"The problem of boundaries"
Closeness of pixel colors does not mean that the pixels belong to the same object.

Similarly, a strong difference between the colors of adjacent pixels does not mean that the pixels belong to different objects.
"The problem of boundaries"

How to separate the shadows from the trees?

http://fineartamerica.com/images-medium/shadows-in-the-forest-gary-bydlo.jpg
"The problem of boundaries"
To overcome this problem, one needs to build algorithms that obtain and use contextual information about the arrangement of objects in the scene.
"The problem of boundaries"

How to find the fish?


http://flogiston.ru/img/invisible_flounder_fish.jpg
"The problem of texture"
Objects carry textures that do not allow us to treat them as uniformly colored, yet these textures are difficult to model and describe.
"The problem of texture"

How to differentiate the zebras?


http://dangerouswildlife.com/images/zebra-herd.jpg
"The problem of boundaries" and "the problem of texture" manifest themselves most clearly in the segmentation problem.
Segmentation

Segmentation: partitioning the image into regions that are "homogeneous" in some sense.

The purpose of segmentation is to build a "simple" description of the original image that can then be used for further analysis of the image.
Segmentation
There are many segmentation methods, which use a variety of ideas:

- "region growing" based on brightness,

- "snake" methods: iterative movement of curves,

- construction of boundaries (for example, the "watershed" method),

- searching for a partition into regions that minimizes an "entropy",

- the use of a priori information about the shape of the regions,

- the use of multiple scales to refine the boundaries.


Segmentation
One of the best algorithms is the GrabCut method.
(Although it works quite slowly.)

The method is based on an approximate construction of the minimum cut of a special graph built on the image pixels.
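A minimal sketch of calling OpenCV's grabCut (not from the original slides; the file path and the rectangle are assumptions). The rectangle plays the role of the initial markup: everything outside it is treated as definite background.

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat image = imread("C:\\green_apple.jpg");

    // Rough rectangle around the object of interest (assumed coordinates)
    Rect rect(50, 50, 300, 300);

    Mat mask, bgdModel, fgdModel;
    grabCut(image, mask, rect, bgdModel, fgdModel, 5, GC_INIT_WITH_RECT);

    // Keep only the pixels labelled as (probable) foreground
    Mat foreground(image.size(), CV_8UC3, Scalar(0, 0, 0));
    image.copyTo(foreground, (mask == GC_FGD) | (mask == GC_PR_FGD));

    imshow("foreground", foreground);
    waitKey(0);
    return 0;
}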
Segmentation
Idea:
construct a weighted graph G = {V, E}:
the image pixels correspond to the vertices V,
and the geometric, brightness and texture proximity of two pixels i, j corresponds to the weight S_ij of the edge between them
(S stands for "similarity": the proximity of the pixels).

http://www.cis.upenn.edu/~jshi/GraphTutorial/


Segmentation
Then the problem of segmentation into two regions can be formulated as the problem of partitioning the set of vertices into two parts A and B. Such a partition is called a cut.

cut(A, B) = sum over i in A, j in B of S_ij  is the value of the cut.

The minimum cut (i.e., the cut with the minimum value) is declared the solution of the segmentation problem.
Segmentation
By complementing the minimum-cut method with manual initial marking of the regions, one can get very good results (although not fully automatically):

James Malcolm, Yogesh Rathi, Allen Tannenbaum


A Graph Cut Approach to Image Segmentation in Tensor Space
http://jgmalcolm.com/pubs/malcolm_tc.pdf
Segmentation

Adding the idea of multiresolution analysis
(a fully automatic result):

Eitan Sharon, Segmentation by Weighted Aggregation, CVPR'04
http://www.cis.upenn.edu/~jshi/GraphTutorial/Tutorial-ImageSegmentationGraph-cut4-Sharon.pdf
Segmentation

Eitan Sharon, Segmentation by Weighted Aggregation, CVPR'04


http://www.cis.upenn.edu/~jshi/GraphTutorial/Tutorial-ImageSegmentationGraph-cut4-Sharon.pdf
Segmentation

Eitan Sharon, Segmentation by Weighted Aggregation, CVPR'04


http://www.cis.upenn.edu/~jshi/GraphTutorial/Tutorial-ImageSegmentationGraph-cut4-Sharon.pdf
Optical Flow
Optical flow is the vector field of the apparent motion of objects, surfaces and edges in a visual scene, caused by the relative motion between the observer and the scene.
Optical Flow
Usually one considers the optical flow that arises between two frames of a video.

For each pixel (x, y), the optical flow is a vector (f(x, y), g(x, y)) characterizing the shift:
Optical Flow

http://www.ultimategraphics.co.jp/jp/images/stories/ultimate/BCC/optical.jpg
Application of optical flow

1. To determine the direction in which objects are moving in the frame. Video

2. For segmentation of moving regions in the frame for further analysis.

3. To reconstruct the shape of a three-dimensional object near which the camera moves.

4. As an auxiliary method for increasing the stability of object-detection algorithms when the objects are not detected in every frame.
For example, for the problems of detecting faces, markers, etc.
Application of optical flow
Increasing the stability of face detection.

Video. Green circle: the result of detector 1;
purple rectangles: the combined results.
Methods of calculating the optical flow

(I) Block ("naive") methods
For each point, the shift that minimizes the difference over a local window is searched for.

(II) Differential methods (the most used)
Based on estimating the derivatives with respect to x, y, t:
1. Lucas-Kanade: very fast.
2. Farneback: good enough, but slow.
3. CLG: high quality, but not yet implemented in OpenCV.
4. Pyramidal Lucas-Kanade: computed only at "points of interest" (see the sketch after this list).
5. Horn-Schunck: not very resistant to noise.

(III) Methods based on discrete optimization (computationally intensive)
The solution is constructed using min-cut / max-flow, linear programming and belief propagation.
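The sketch below (not from the original slides; frame file names are assumptions) shows sparse pyramidal Lucas-Kanade in OpenCV: points of interest are found with goodFeaturesToTrack and tracked into the next frame with calcOpticalFlowPyrLK.

#include "cv.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    Mat prev = imread("C:\\frame1.png", 0);  // two consecutive grayscale frames
    Mat next = imread("C:\\frame2.png", 0);

    // Points of interest in the first frame
    vector<Point2f> prevPts, nextPts;
    goodFeaturesToTrack(prev, prevPts, 200, 0.01, 10);

    // Pyramidal Lucas-Kanade: where did those points move to in the second frame?
    vector<uchar> status;
    vector<float> err;
    calcOpticalFlowPyrLK(prev, next, prevPts, nextPts, status, err);

    // Draw the displacement vectors
    Mat vis;
    cvtColor(prev, vis, CV_GRAY2BGR);
    for (size_t i = 0; i < prevPts.size(); i++)
        if (status[i])
            line(vis, prevPts[i], nextPts[i], Scalar(0, 255, 0));
    imshow("sparse flow", vis);
    waitKey(0);
    return 0;
}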
Methods of calculating the optical flow
Today OpenCV implements several algorithms; the best of them is Farneback's.
(Gunnar Farnebäck, "Two-Frame Motion Estimation Based on Polynomial Expansion", Proceedings of the 13th Scandinavian Conference on Image Analysis, pages 363-370, June-July 2003)
Methods of calculating the optical flow
The idea is to approximate the brightness of the pixels in the neighborhood of each pixel by a quadratic function, in both frames.
From the coefficients of these polynomials the shift can be computed, and it is declared the value of the optical flow at the given pixel.
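A minimal sketch of dense Farneback flow in OpenCV (not from the original slides; frame file names and parameter values are assumptions):

#include "cv.h"
#include "highgui.h"
using namespace cv;

int main()
{
    Mat prev = imread("C:\\frame1.png", 0);  // two consecutive grayscale frames
    Mat next = imread("C:\\frame2.png", 0);

    // Dense optical flow: a 2-channel float image holding (dx, dy) for every pixel
    Mat flow;
    calcOpticalFlowFarneback(prev, next, flow,
                             0.5,     // pyramid scale
                             3,       // pyramid levels
                             15,      // averaging window size
                             3,       // iterations per level
                             5, 1.2,  // polynomial neighborhood size and sigma
                             0);      // flags

    // Visualize the flow as sparse line segments on the first frame
    Mat vis;
    cvtColor(prev, vis, CV_GRAY2BGR);
    for (int y = 0; y < flow.rows; y += 10)
        for (int x = 0; x < flow.cols; x += 10)
        {
            Point2f f = flow.at<Point2f>(y, x);
            line(vis, Point(x, y), Point(cvRound(x + f.x), cvRound(y + f.y)),
                 Scalar(0, 255, 0));
        }
    imshow("dense flow", vis);
    waitKey(0);
    return 0;
}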
Optical flow problems

The optical flow and the actual motion field may not coincide, and can even be perpendicular to each other:
Optical flow problems
The aperture problem: an ambiguity in determining motion caused by considering the motion only locally (without analyzing the edges of the object).

It is particularly pronounced
- in weakly textured scenes,
- in scenes with repetitive textures such as stripes and chessboards.

http://pages.slc.edu/~ebj/sight_mind/motion/Nakayama/aperture_problem.GIF


Homework 1 of 2
To get credit for these lectures "automatically":

Create an image with objects made of black stripes and checkered patterns on a white background, then a second image where those objects are shifted by a distance greater than the stripe width. Then compute and display the resulting optical flow.

Send the results to perevalovds@gmail.com


6. Possibilities and limitations
of complex computer vision tasks
- Face detection – Viola-Jones algorithm
- Pedestrian detection – HOG algorithm

Back to Contents
Face detection – Viola-Jones algorithm

The Viola-Jones algorithm is currently the baseline method for detecting frontal faces in a frame.
Face detection – Viola-Jones algorithm
The algorithm uses a set of “Haar-like features”.

http://www710.univ-lyon1.fr/~bouakaz/OpenCV-0.9.5/docs/ref/pics/haarfeatures.png
http://alliance.seas.upenn.edu/~cis520/wiki/uploads/Lectures/viola_jones_first2_small.png
Face detection – Viola-Jones algorithm

At the training stage, boosting is used to construct a set of classifiers from the redundant feature set.
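A minimal sketch of face detection with OpenCV's pretrained cascade (not from the original slides; the image path is an assumption, and haarcascade_frontalface_alt.xml is one of the cascades shipped in OpenCV's data folder):

#include "cv.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    CascadeClassifier cascade;
    cascade.load("haarcascade_frontalface_alt.xml");  // pretrained frontal-face cascade

    Mat image = imread("C:\\photo.jpg");
    Mat gray;
    cvtColor(image, gray, CV_BGR2GRAY);
    equalizeHist(gray, gray);                         // improves detection stability

    vector<Rect> faces;
    cascade.detectMultiScale(gray, faces, 1.1, 3, 0, Size(30, 30));

    for (size_t i = 0; i < faces.size(); i++)
        rectangle(image, faces[i].tl(), faces[i].br(), Scalar(0, 255, 0), 2);

    imshow("faces", image);
    waitKey(0);
    return 0;
}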
Face detection – Viola-Jones algorithm

- Works well for frontal faces.

Problems
- For faces in profile it does not work, because of hairstyles.
- For the whole person, or the upper body, I was not able to achieve detection.

Apparently this is because the algorithm can be trained well to recognize internal contours, which are virtually unchanging, but it does not cope very well with changing external contours.
Face detection – Viola-Jones algorithm

Application: searching for objects in fenced grass.


Face detection – Viola-Jones algorithm
Application: searching for objects in fenced grass.

- miss rate: 0.158 (158 among 1000 positive examples);
- false alarm rate: 0.049 (40 among 1000 positive examples and 58 among 1000 negative examples);
- thus, the correct detection rate was 84.2% and the false alarm rate 4.9%.
Pedestrian detection - HOG algorithm

HOG = Histogram of Oriented Gradients.


Pedestrian detection - HOG algorithm
How it works: the image is divided into cells, in which gradient directions are computed and accumulated into histograms. The resulting descriptor vector is used for pattern recognition (with an SVM).

http://ericbenhaim.free.fr/images/hog_process.png
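A minimal sketch of pedestrian detection with OpenCV's built-in HOG people detector (not from the original slides; the image path is an assumption, and in newer OpenCV versions HOGDescriptor lives in the objdetect module):

#include "cv.h"
#include "cvaux.h"
#include "highgui.h"
#include <vector>
using namespace cv;
using namespace std;

int main()
{
    Mat image = imread("C:\\street.jpg");

    // HOG descriptor with the default pretrained people detector (a linear SVM)
    HOGDescriptor hog;
    hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());

    vector<Rect> people;
    hog.detectMultiScale(image, people);

    for (size_t i = 0; i < people.size(); i++)
        rectangle(image, people[i].tl(), people[i].br(), Scalar(0, 255, 0), 2);

    imshow("pedestrians", image);
    waitKey(0);
    return 0;
}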
Pedestrian detection - HOG algorithm

The algorithm can reliably detect cars, motorcycles and bicycles.

Demo:

Video http://www.youtube.com/watch?v=BbL2wWy8KUM

Problems:

Judging by the demo video, the algorithm does not work well with people in skirts, or it was trained to recognize people from a different viewpoint.
7. New applications of
computer vision
- Interactive multimedia systems
- 3D-illusion
- Projection mapping
- Dynamic projection mapping

Back to Contents
Interactive multimedia system

Funky Forest
(2007,
T. Watson)

Video http://zanyparade.com/v8/projects.php?id=12
Interactive multimedia system

Body paint

Video http://www.youtube.com/watch?v=3T5uhe3KU6s
Interactive multimedia system
Projection onto the hands
of spectators

Yoko Ishii and Hiroshi Homura, It's fire, you can touch it, 2007.
Interactive multimedia system
Floor Games

Video: outdoor ping-pong championship, Chrustalnaya 2011.
The championship was held during the conference "Modern Problems of Mathematics" (42nd National Youth School-Conference), "Chrustalnaya" hotel, 2 February 2011.
www.playbat.net

Video
3d-illusion
This illusion is the perception of a volumetric body on a (flat) surface, achieved by accurately simulating the geometry and shading of the body as seen from the point where the viewer stands.

http://justinmaier.com/wp-content/uploads/2006/05/ATT5082002.jpg
3d-illusion
Creating the illusion of 3D by tracking the head and eyes. Now built into portable gaming consoles that have a camera.

Video http://www.youtube.com/watch?v=o5tlIIOXMs4
Projection mapping
Projection mapping is video projection not onto special screens but onto other objects, in order to "animate" them.

Video http://www.youtube.com/watch?v=BLNqZ1Nbo7Q
Projection mapping
Today, mapping onto buildings is popular
(architectural projection mapping).

Video http://www.youtube.com/watch?v=BGXcfvWhdDQ
Dynamic projection mapping

The idea: use markerless AR tracking techniques to synchronize a moving object with the image from the projector.
Dynamic projection mapping

More radical: track the movement of multiple objects
(bouncing balls, falling sheets of paper)
and project onto them.
Homework 2 of 2
To get credit for these lectures "automatically":
Implement tracking (position detection) of a falling ball that then bounces and jumps.

Shoot a video showing the ball, with the ball position found by the computer drawn on top. Put the video on YouTube and send the link to perevalovds@gmail.com.
Outside view and graphics: Physical
computing

Water musical instrument


sound waves start

Aleatoric water musical instrument


http://www.youtube.com/watch?v=CZ_KijiwQHE
Video
Conclusion

- Literature
- Friendly lectures and seminars
- Partners
- Contacts

Back to Contents
Literature
Computer vision
1. R. Gonzalez, R. Woods. Digital Image Processing.
2. L. Shapiro, G. Stockman. Computer Vision.

OpenCV

1. Documentation OpenCV C++:


http://opencv.willowgarage.com/documentation/cpp/index.html

2. G. Bradski, A. Kaehler. Learning OpenCV: Computer Vision with the OpenCV Library
(unfortunately, it covers the C API of OpenCV, not the C++ one).

3. My lectures on OpenCV for C++ (Fall 2010)


http://uralvision.blogspot.com/2010/12/opencv-2010.html
Friendly lectures and seminars

A course on openFrameworks with elements of OpenCV,
MatMech, Ural State University, spring semester 2011.

The course program and the lessons will be available at
www.uralvision.blogspot.com
Friendly lectures and seminars
Ural Federal University - College of Arts and culture
Ekaterinburg branch of the National Centre for Contemporary Art
UB RAS

The program "Art, Science, Technology"

art.usu.ru/index.php/confer/229
www.art.usu.ru,www.ncca.ru
Friendly lectures and seminars
The program "Art, Science, Technology"
March 2, 18:30
The Body as Interface: Wearable Computing

Everyone is invited, with a 10-15 minute presentation or without one.

Please send your wish to speak and the topic of your talk to Ksenia Fedorova,
ksenfedorova@gmail.com
The seminar will be held at the USU Library (Lenin St. 51, 4th floor, room 413a)
Details - http://www.scribd.com/doc/49078187/Body-As-Interface
Friendly lectures and seminars
The program "Art, Science, Technology"
April 21-22
2nd International Workshop
"Theory and Practice of Media Art"

Chair of the organizing committee: Ksenia Fedorova, curator of the Ekaterinburg branch of the NCCA,
ksenfedorova@gmail.com
Information about the workshop will be published on ncca.ru, art.usu.ru, usu.ru and
www.uralvision.blogspot.com

http://www.ago.net/assets/images/assets/past_exhibitions/2007/steinkamp.jpg
Partners
www.playbat.net (interactive systems)

LLC "Business Frame" (CV consulting) www.bframe.ru

LLC "5th Dimension" (interactive systems)

gorodaonline.com (a network of information and business portals)

Animation studio "Mult-On" www.mult-on.ru

Animation studio "Animatech" www.anima-tech.ru

Contacts

Denis Perevalov
perevalovds@gmail.com

This lecture is published at


www.uralvision-en.blogspot.com
www.uralvision.blogspot.com (Russian)
