by
Arcellana, Anthony A.
Ching, Warren S.
Guevara, Ram Christopher M.
Santos, Marvin S.
So, Jonathan N.
October 2006
Chapter 3 Theoretical Considerations
3.1 Image
A digital image is a set of digital values called pixels, a term derived from "picture element". The imaged scene is illuminated by some light, which is partly reflected and partly absorbed by it. Part of the reflected light reaches the sensor used to image the scene and is responsible for the value recorded for the specific pixel. The pixels are stored in computer memory as integers (Petrou, M., et al., 1999). The number of horizontal and vertical samples determines the spatial resolution of the image, and these values are often transmitted or stored in a compressed form. The number of bits b needed to store an N x N image with m bits per pixel is

b = N x N x m

That is why we often try to reduce m and N without significant loss in the quality of the image, because together they determine the storage size. Digital images can be created in a variety of ways with input devices like digital cameras and scanners.
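The storage formula b = N x N x m can be checked with a short calculation (a sketch; the 512 x 512, 8-bit example values are illustrative, not taken from the text):

```python
def storage_bits(n, m):
    """Bits needed to store an N x N image with m bits per pixel: b = N * N * m."""
    return n * n * m

bits = storage_bits(512, 8)
print(bits, "bits =", bits // 8, "bytes")  # 2097152 bits = 262144 bytes
```

Halving either N or m cuts the storage requirement accordingly, which is why both are targets for reduction.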
There are many kinds of digital images, such as binary, grayscale, and color.
These digital images can be classified according to the number and nature of the
values of a pixel. Binary images are images that have been quantized to two
values, usually denoted 0 and 1, but often with pixel values 0 and 255,
representing black and white. A grayscale image is an image in which the value of
each pixel is a single sample. Images of this sort are typically composed of shades
of gray, although in principle the samples could be displayed as shades of any color, or even coded
with various colors for different intensities. An example of this kind of image is in figure
3.1. The original image (leftmost) is the letter a, a grayscale image with
intensities from 0 to 255; the center image is a zoomed-in version that
reveals the individual pixels of the letter a; the rightmost image shows the normalized
numerical values of each pixel. For this example, the coding used is that 1 (255) is white and 0 is black.
Figure 3.1
3.1.3 Color
A color image is a digital image that includes color information for each
pixel, typically stored as integer triplets or as three separate raster maps, one for each
channel. One of the most popular color models is the RGB model, in which the colors red, green, and blue
serve as primaries (Morris, T., 2004). Almost any color can be made to match a linear
combination of the three:

C = rR + gG + bB

Today there are many RGB standards in use. Some of these are ISO RGB, sRGB,
ROMM RGB, and NTSC RGB (Buckley, R., et al., 1999).
Figure 3.2
RGB Colorspace
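The two storage layouts mentioned earlier, integer triplets per pixel versus three separate raster maps, can be illustrated with plain Python lists (a sketch; the 2 x 2 image values are made up):

```python
# A 2 x 2 color image stored as (R, G, B) integer triplets, one per pixel.
triplet_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def split_channels(img):
    """Convert triplet storage into three separate raster maps, one per channel."""
    return [[[px[c] for px in row] for row in img] for c in range(3)]

r, g, b = split_channels(triplet_image)
print(r)  # [[255, 0], [0, 255]]
```

Either layout carries the same information; the channel-map form is convenient when an operation touches only one primary.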
3.1.4 Resolution
Resolution is sometimes identified by the width and height of the image as well as
the total number of pixels in the image. For example, an image that is 2048 pixels
wide and 1536 pixels high (2048 x 1536) contains 2048 x 1536 = 3,145,728 pixels (or
3.1 megapixels). The resolution of an image also expresses how much detail we can see relative to its
physical dimensions; the most often used measurement is ppi, pixels per inch.
When we scale an image, resampling algorithms try to reconstruct the original continuous
image from the samples that represent it. The spatial continuity of the image is approximated by the
spacing of the samples in the sample grid.
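One simple resampling scheme is nearest-neighbor interpolation, which fills each output pixel with the closest sample in the original grid (a minimal sketch; the text does not commit to a particular resampling algorithm):

```python
def resample_nearest(img, new_w, new_h):
    """Scale a 2D image (list of rows) to new_w x new_h by picking,
    for each output pixel, the nearest sample in the original grid."""
    old_h, old_w = len(img), len(img[0])
    return [
        [img[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]

small = [[0, 255],
         [255, 0]]
print(resample_nearest(small, 4, 4))
```

Upscaling a 2 x 2 checker pattern to 4 x 4 simply duplicates each sample into a 2 x 2 block, which is why nearest-neighbor scaling looks blocky.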
3.2.1 PC Camera
A PC camera, or webcam, is a digital camera widely used for video conferencing via the Internet. Acquired images
from this device can be uploaded to a web server, making them accessible through
the World Wide Web, instant messaging, or a PC video-calling application. Web
cameras typically include a lens, an image sensor, and some support electronics.
The image sensor can be a CMOS or a CCD, the former being dominant in low-
cost cameras. Typically, consumer webcams offer a resolution in the VGA region
at a rate of around 25 frames per second. Various lenses are also available, the
most common being a plastic lens that can be screwed in and out to manually control the
camera focus. Support electronics read the image from the sensor and transmit it to the host computer.
3.2.2 Projector
Two display technologies dominate projectors: DLP (Digital Light Processing) and LCD (Liquid Crystal Display). This refers to the internal mechanism the projector uses to create the image.
3.2.2.1 DLP
DLP projectors use a Digital Micromirror Device (DMD) chip to recreate the source material. Originally developed by Texas Instruments, DLP projection creates a color image in one of
two manners: with a single-chip DLP projector or with a three-chip
projector. On a single DMD chip, colors are generated by placing a color wheel between the lamp and
the DMD chip. Basically, a color wheel is divided into four
sectors: red, green, blue, and an additional clear section to boost brightness. The
latter is usually omitted since it reduces color saturation. The DMD
chip is synchronized with the rotating color wheel, so when a certain color
section of the color wheel is in front of the lamp, that color is displayed on the
DMD. In a three-chip DLP projector, a prism is used to split the light from
the lamp; each primary color of light is routed to its own DMD chip, then recombined
and directed out through the lens. Three-chip DLP is referred to in the market as
DLP2.
3.2.2.2 LCD
LCD projectors contain three separate LCD glass panels, one each for the red,
green, and blue components of the image signal being sent to the projector.
As the light passes through the LCD panels, individual pixels can be opened to
allow light to pass or closed to block the light. This activity modulates the light
and produces the image that is projected onto the screen (Projectorpoint).
Keystone distortion appears when the projector is placed at an angle to the
screen, or when the projection screen has an angled surface. The resulting image
is trapezoidal rather than rectangular, and projectors correct for it either optically or digitally. Optical keystone correction works by
modifying the light-path through the lens; the correction is done after the light
has been reflected off the image panels in the projector. Digital keystone
correction adjusts the image proportions by shrinking the image at the edge
furthest away from the screen before the projector generates it (HTRgroup). Projectors commonly offer up
to 35 degrees of vertical keystone correction, and some even offer both vertical
and horizontal correction.
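Digital keystone correction, shrinking the image at the edge furthest from the screen, can be sketched as a per-row horizontal shrink (a toy version; the linear shrink schedule, the nearest-neighbor sampling, and the black (0) side padding are all assumptions, not the method any particular projector uses):

```python
def keystone_precorrect(img, top_scale):
    """Shrink each row horizontally, from full width at the bottom row to
    top_scale * width at the top row, padding the sides with 0 (black)."""
    h, w = len(img), len(img[0])
    out = []
    for y, row in enumerate(img):
        # Scale factor runs linearly from top_scale at y = 0 to 1.0 at y = h - 1.
        s = top_scale + (1.0 - top_scale) * (y / (h - 1)) if h > 1 else 1.0
        new_w = max(1, round(w * s))
        shrunk = [row[x * w // new_w] for x in range(new_w)]
        pad = w - new_w
        out.append([0] * (pad // 2) + shrunk + [0] * (pad - pad // 2))
    return out

img = [[1] * 8 for _ in range(4)]       # a solid 8 x 4 test frame
for row in keystone_precorrect(img, 0.5):
    print(row)
```

Projected through an upward-tilted lens, the narrowed top edge is stretched back out by the projection geometry, so the on-screen image ends up roughly rectangular.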
Acquired images undergo signal processing techniques that manipulate the images to the users'
desire. These techniques either enhance wanted parts of an image or suppress unwanted ones. Preprocessing is also performed for
data reduction and to make the analysis easier (Umbaugh, 2005).
3.3.2 Thresholding
Thresholding reduces grayscale images to two values and is the simplest way to do image segmentation. One of
the two values marks an "object pixel" and the other a "background pixel". A pixel will
be marked as an object pixel when its value is greater than the threshold value, and as a
background pixel otherwise. Usually, an object pixel is given a value of '1' while a
background pixel is given '0'. For a threshold θ, the output image g is obtained from the input image f by

g(i, j) = 0 if f(i, j) ≤ θ
g(i, j) = 1 otherwise
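The thresholding rule above translates directly into code (a sketch on a made-up 3 x 3 grayscale patch):

```python
def threshold(img, theta):
    """g(i, j) = 0 if f(i, j) <= theta, 1 otherwise."""
    return [[0 if v <= theta else 1 for v in row] for row in img]

f = [[ 10,  40, 200],
     [ 35, 180, 220],
     [ 20,  30, 240]]
print(threshold(f, 100))  # [[0, 0, 1], [0, 1, 1], [0, 0, 1]]
```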
The main parameter in thresholding is the selection of the correct value for
the threshold. There are many ways to acquire the threshold value; the
simplest is to choose the mean or median pixel
value. This is effective provided that the object pixels are brighter than the
background, and also brighter than the average. The next approach uses a histogram
to record the frequency of occurrence of the pixel values and takes the valley point between the peaks
as the threshold. The histogram approach assumes that there is
some average value for the background and object pixels, but that the actual pixel
values have some variation around these average values. A more effective way to select the threshold is the iterative method.
There are two ways to perform the iterative method. The first
method incrementally searches through the histogram for a threshold. Starting
at the lower end of the histogram, the average of the gray values less than the
suggested threshold is computed and labeled L, and likewise the average of the
gray values greater than the suggested threshold is labeled G. The average of L and
G is then computed. If this average equals the suggested threshold, the threshold is accepted; otherwise, the suggested threshold is incremented and the search continues. The second method begins with a suggested threshold equal to the average of the
image's four corner pixels. The next steps are similar to the first method;
the only difference lies in the updating of the suggested threshold, which in this method is
set to the average of the values of L and G. (Umbaugh, 2005)
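The second iterative scheme, repeatedly updating the suggestion to the average of L and G, can be sketched as follows (a simplification: it works on the raw pixel values instead of a histogram, and the starting value 50 stands in for the four-corner average):

```python
def iterative_threshold(pixels, t0):
    """Refine threshold t: L = mean of values <= t, G = mean of values > t;
    the new suggestion is (L + G) / 2, and iteration stops at a fixed point."""
    t = t0
    while True:
        low  = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        l = sum(low) / len(low) if low else t
        g = sum(high) / len(high) if high else t
        new_t = (l + g) / 2
        if abs(new_t - t) < 1e-6:
            return t
        t = new_t

print(iterative_threshold([10, 12, 14, 200, 202, 204], 50))  # 107.0
```

On this bimodal sample the threshold settles midway between the two cluster means (12 and 202), which is exactly the fixed point the method seeks.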
Edges are boundaries between parts of objects. Among the template matching operators
are the Prewitt, Kirsch, and Robinson operators; the differential gradient operators include the Roberts
and Sobel operators. Both template matching and differential gradient methods estimate
local intensity gradients with the help of suitable convolution masks (Davies,
2005).
g = (gx^2 + gy^2)^(1/2)
θ = tan^(-1)(gy / gx)
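With gx and gy estimated by the Sobel masks, the magnitude and direction formulas can be evaluated at a single pixel (a sketch; the 3 x 3 step-edge patch is synthetic):

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def convolve_at(img, mask, i, j):
    """Apply a 3 x 3 mask centred on pixel (i, j)."""
    return sum(img[i + a - 1][j + b - 1] * mask[a][b]
               for a in range(3) for b in range(3))

# Vertical step edge: dark on the left, bright on the right.
patch = [[0, 0, 255],
         [0, 0, 255],
         [0, 0, 255]]
gx = convolve_at(patch, SOBEL_X, 1, 1)
gy = convolve_at(patch, SOBEL_Y, 1, 1)
g = (gx ** 2 + gy ** 2) ** 0.5
theta = math.atan2(gy, gx)   # atan2 handles the gx = 0 case of tan^-1(gy/gx)
print(gx, gy, g, theta)      # 1020 0 1020.0 0.0
```

The gradient points horizontally (θ = 0), across the vertical edge, as expected.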
Moving objects can be detected by image differencing. Image differencing over successive pairs of frames should reveal the
differing pixels, which should be composed of the moving object. However, certain pixels
lying along edges parallel to the direction of motion give no sign of motion (Davies, E., 2005). Image differencing also
suffers from noise. It is prone to errors due to subtle
changes in illumination, which can be caused by environmental changes and by the
digitization process of the camera, wherein internal noise causes subtle changes between
successive frames.
where S(x, y) is the sum of the individual pixel intensities at point (x, y) over the accumulated frames, and
Sq(x, y) is the sum of the squares of those intensities. From these sums a mean m(x, y) and standard deviation σ(x, y) can be maintained for every point, and a pixel p(x, y) is flagged as belonging to a moving object when

| m(x, y) − p(x, y) | > cσ(x, y)

for some constant c.
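The |m − p| > cσ test, with m and σ derived from the running sums S and Sq, can be sketched for a single pixel position (the five-frame history and the choice c = 3 are made-up illustration values):

```python
import math

def changed(history, p, c):
    """Flag pixel value p as changed when |m - p| > c * sigma, with m and
    sigma computed from the running sums S (intensities) and Sq (squares)."""
    n = len(history)
    s = sum(history)                    # S(x, y)
    sq = sum(v * v for v in history)    # Sq(x, y)
    m = s / n
    sigma = math.sqrt(max(sq / n - m * m, 0.0))
    return abs(m - p) > c * sigma

history = [100, 102, 98, 101, 99]       # stable background at one pixel
print(changed(history, 100, 3))  # False: typical background value
print(changed(history, 180, 3))  # True: likely part of a moving object
```

Because the test is scaled by σ, a noisy pixel needs a larger deviation before it is flagged, which is what makes this more robust than plain frame differencing.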
Segmentation divides an image into regions according to a given criterion. Regions may also be defined as groups of pixels
having both a border and a particular shape such as a circle, ellipse, or polygon. Image
segmentation is a very important tool in many image processing and computer vision
applications, as it is necessary before any processing can be done at a higher level than that of the pixel. Most
segmentation techniques rest on two basic concepts: the measure of homogeneity of regions within themselves and
the measure of contrast with the objects on their border. Image segmentation techniques
can be divided into three main categories: (1) region growing and shrinking, (2) clustering, and (3) boundary detection.
The region growing and shrinking methods use the row- and column-based image space. A seed pixel is
selected and the adjacent pixels which satisfy the homogeneity property are added.
This process outputs a single connected region in the image. To fully partition
the image into N regions, seed points must be selected in each region and grown in turn. One way to choose seed points is
manually selecting points within the objects of interest; this process of
selecting seed points ensures that the resulting objects meet the needs of
the user. Alternatively, an intensity maximum is often used as a seed point.
Once a seed point (x, y) is identified, the neighbors of that point (x+1, y),
(x−1, y), (x, y+1) and (x, y−1) are examined to see which belong in the region.
All pixels whose color is within the radius Rmax of the mean region color cr are
part of the region; these points are added to the region and their
neighbors are next to be considered. As the region grows, the list of adjacent
pixels also grows. The region stops growing when none of the remaining neighboring pixels satisfy the membership criterion.
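The 4-neighbor growth just described can be sketched as a breadth-first traversal (a simplification: grayscale values and a fixed tolerance around the seed value stand in for the mean-color/Rmax test, and the sample image is made up):

```python
from collections import deque

def region_grow(img, seed, tol):
    """Grow a region from seed (x, y): accept 4-neighbors whose value is
    within tol of the seed value, stopping when no neighbor qualifies."""
    h, w = len(img), len(img[0])
    sx, sy = seed
    target = img[sy][sx]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        x, y = frontier.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < w and 0 <= ny < h and (nx, ny) not in region
                    and abs(img[ny][nx] - target) <= tol):
                region.add((nx, ny))
                frontier.append((nx, ny))
    return region

img = [[10, 12, 200],
       [11, 13, 210],
       [90, 95, 205]]
print(sorted(region_grow(img, (0, 0), 5)))
```

Starting from the dark corner, only the four pixels near intensity 10 are absorbed; the bright and mid-gray pixels fail the homogeneity test and bound the region.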
In clustering techniques, image elements are placed into groups based on some measure of
similarity within the group. The major difference between the clustering technique and the
region growing technique is that domains other than the row- and column-based (x, y)
image space (the spatial domain) may be considered as the primary domain
for clustering. Other domains include color spaces, histogram spaces, or complex
feature spaces.
Clustering partitions the chosen domain (the space of interest). The simplest method is to divide the space into
regions by selecting the center or median along each dimension and splitting it
there; this is what the center and median segmentation algorithms do. The
method is only effective if the space and the entire algorithm are
designed intelligently, because the center or median split alone may not find good
clusters.
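Applied to a one-dimensional space of pixel intensities, the median split reduces to a few lines (a sketch; a real implementation would repeat this per dimension of whichever feature space was chosen):

```python
def median_split(values):
    """Split a 1-D feature space into two clusters at its median value."""
    s = sorted(values)
    median = s[len(s) // 2]
    low  = [v for v in values if v <  median]
    high = [v for v in values if v >= median]
    return low, high

pixels = [12, 10, 14, 200, 210, 205]
low, high = median_split(pixels)
print(low, high)   # [12, 10, 14] [200, 210, 205]
```

Here the split happens to separate the two intensity clusters cleanly, but on less well-separated data a single median split can cut straight through a cluster, which is the weakness noted above.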
Boundary detection finds the boundaries between objects, thus indirectly defining the objects. The process starts by marking points
that may be part of an edge. These points are then merged into line segments,
and the line segments are then merged into object boundaries. Edge detectors are
used to mark points of rapid change, thus indicating the possibility of an edge.
After the detection of edges, the next step is to threshold the results. One
method is to consider the histogram of the edge detection results, looking for the
best valley manually. This edge-detection threshold method works best with a bimodal histogram.
3.6 OpenCV
Intel developed an open source computer vision library named OpenCV, which is free for use by
developers, governments, and camera vendors, as reflected in the license. The OpenCV Library
is a collection of algorithms and sample code for various computer vision problems.
The library is cross-platform and runs on both Windows and Linux operating
systems. It focuses mainly on real-time image processing, with applications in areas
such as the human-computer interface, robotics, monitoring, biometrics, and security, by providing a free and open
infrastructure where the distributed efforts of the vision community can be consolidated. The library supplies
image and pattern analysis functions that are optimized for Intel®
architectures. Intel's goal is to foster an open source vision community that will make better use of up-to-date opportunities to
apply computer vision in the growing PC environment. The Library is open and builds on the Intel Image Processing
Library (IPL), extending its functionality toward higher-level functions that perform image processing and computer vision operations.
OpenCV can automatically benefit from using IPP on platforms like IA32, IA64
and StrongARM.
A number of fundamental and helper data types are introduced by the library. The fundamental data types include array-like types such as IplImage (images) and CvMat (matrices).
The development environment provides tools for creating and debugging C++ code. It possesses features like syntax highlighting,
auto-completion, and debugging functions. The compile and build system offers
precompiled header files, "minimal rebuild" functionality, and incremental linking; these
features significantly shorten the turnaround time to edit, compile, and link a program.
Available libraries include the Microsoft Foundation Class (MFC) libraries and standard libraries such as the
Standard C++ Library and the C Run-Time Library (CRT), which has been updated to address
security issues. A new library, the C++ Support Library, is designed to simplify programs that target the common language runtime.
The Microsoft .NET Framework is a software component that can be added to the Microsoft Windows operating system. It provides pre-coded solutions that yield
applications that are easier to build, manage, and integrate with other networked systems
(MSDN).
The Windows API is designed for use by C/C++ programs and is the most direct
way for software applications to interact with a Windows system (MSDN). API calls underlie
the appearance and behavior of every Windows function, from the look of the desktop
to the memory allocation for new processes, and every user action triggers several more API calls.
The APIs can be found in the DLLs (Dynamic Link Libraries) of the Windows system, the shared
library concept in the Microsoft OS (Wikipedia). These Win32 APIs can be split into
three parts: User32.dll, which handles the user interface; Kernel32.dll, which handles file
operations and memory management; and Gdi32.dll, which handles graphics (Nair, 2002).
References
Buckley, R., et al. (1999). Standard RGB color spaces. In the IS&T/SID Seventh Color
Imaging Conference: Color Science, Systems and Applications. Scottsdale,
Arizona.
DLP and LCD Projector Technology Explained. (n.d.). Retrieved June 2, 2006, from
http://www.projectorpoint.co.uk/projectorLCDvsDLP.htm.
Home Theater Research Group. (n.d.). Keystone Correction. Retrieved September 24, 2006
from http://htrgroup.com/?tab=projector-docs&section=keystone
Intel (2001). Open source computer vision library reference manual. Retrieved September
22, 2006 from http://developer.intel.com
Kolas, O. (2005). Image Processing with gluas: introduction to pixel molding. Retrieved
September 24, 2006 from http://pippin.gimp.org/image_processing/chap_dir.html
Microbus (2003). Image, resolution, size and compression. Retrieved September 23, 2006
from http://www.microscope-microscope.org/imaging/image-resolution.htm
Nair. S. (2002). Working with Win32 API in .NET. Retrieved September 24, 2006 from
http://www.c-sharpcorner.com/Code/2002/Nov/win32api.asp
Petrou, M., and Bosdogianni, P. (1999). Image Processing: The Fundamentals. John Wiley
& Sons, Ltd: New York.
Projector People. Projector Keystone Correction. Retrieved September 24, 2006 from
http://www.projectorpeople.com/tutorials/keystone-correction.asp
Sangwine, S. (1998). The colour image processing handbook. Chapman and Hall:
London
Shapiro, L. and Stockman, G. (2001). Computer Vision. Prentice Hall. Upper Saddle
River, New Jersey.
Umbaugh, S. (2005). Computer Imaging: Digital Image Analysis and Processing. CRC
Press: Boca Raton, Florida.